calling conventions / parameter passing in assembler?
virtuPIC
Posts: 193
Okay, the Propeller lacks even decent subroutine calls... However, assuming I am using the CALL or JMPRET instructions: are there any conventions for
- how to build a call stack that can even handle recursion,
- how to pass parameters (i.e. at which addresses),
- how to pass return values?
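For readers new to PASM, here is a minimal Python model (not PASM itself) of why the questions above are real problems. On the Propeller, CALL is a JMPRET that writes the return address into the source field of the routine's own RET instruction, so each routine has exactly one return slot. The class names `Sub` and `StackedSub` and the addresses are made up for illustration. The second class shows one possible software stack, assumed here rather than taken from any existing convention.

```python
class Sub:
    """Models a PASM subroutine: its RET instruction holds one return address."""

    def __init__(self):
        self.ret_field = None  # source field of the routine's RET instruction

    def call(self, return_addr):
        # JMPRET overwrites the return slot with the caller's address.
        self.ret_field = return_addr

    def ret(self):
        # RET jumps to whatever address was written last.
        return self.ret_field


class StackedSub(Sub):
    """Hypothetical variant that saves the old return slot on a software stack."""

    def __init__(self, stack):
        super().__init__()
        self.stack = stack

    def call(self, return_addr):
        self.stack.append(self.ret_field)  # push the previous return address
        super().call(return_addr)

    def ret(self):
        addr = self.ret_field
        self.ret_field = self.stack.pop()  # restore the outer return address
        return addr


def nested_returns(sub):
    """Call sub from 0x010, re-enter it from 0x055, then return twice."""
    sub.call(0x010)
    sub.call(0x055)
    return [sub.ret(), sub.ret()]


plain = nested_returns(Sub())             # outer return address is lost
stacked = nested_returns(StackedSub([]))  # outer return address survives
```

With the plain model both returns go to 0x055, which is why a re-entered routine can never get back to its first caller without some explicit save and restore.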
If there are no such conventions yet, I volunteer to define some, either on my own or leading a working group. This might also be a program item for the Propeller West Coast Expo. I am qualified for this since I have a strong background in compiler construction: I know about programming languages and programming concepts, and I am even a backend expert.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Airspace V - international hangar flying!
www.airspace-v.com/ggadgets for tools & toys
Comments
Code sharing isn't really done at the level of the Prop assembly code nor are libraries used (there's no "include" provision in the compiler). The basic unit of code sharing is the Spin object. More useful would be some guidelines for new objects. For example, there are a variety of display objects for different display devices (TV, VGA, LCD, serial, etc.) Some of them have a set of basic output editing routines (like for decimal, hexadecimal, and binary number formats and zero-terminated strings). Some of them don't have these or have a subset. A guideline document that summarizes what others have done and recommends a standard set of output editing routines would be useful for anyone implementing a new device driver. Others may bring the earlier I/O drivers up-to-date by adding any missing routines or making a modified version of the original that conforms to whatever common ground we agree on.
A bit of a warning ... There's a natural flow to the creation of standards. If you attempt to create standards for something that's not mature enough, the standards will be ignored because they don't match what experience shows is a "best" way to do things or development will be arrested because you've codified something that doesn't really work well and developers may be locked into a standard that doesn't work. If you try to create a standard too late in the evolution of something, there will be too many ad-hoc solutions that are well established to make a meaningful standard (other than to codify what people actually do).
This may be a problem that does not need solving.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
For me, the past is not over yet.
Those share very easily.
Smaller assembly language routines typically use a few COG longs, or a HUB memory location or two. Those are easily shared also. That's a straightforward cut 'n paste.
Libraries in native PASM don't work like you would expect on other CPUs. I've seen people load up a COG with functions, then use a block of RAM to communicate with it. That COG is always running, ready to do stuff. I've also seen people fire off a COG to do something, then have it terminate. That has a COG startup cost though.
There is also the C environment from ImageCraft. I remember a deal going on right now too. C + a demo board for a good price, or something!
Richard can chime in :)
Anyway, since that's C, you can just do what you normally would!
The mode of operation there is different still. Basically, a small software supervisor kernel runs on a COG. It fetches assembly language instructions from the HUB, one at a time, executes them, then continues on. Performance is about 1/5 of native PASM, but it's a more traditional environment / program work flow.
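To make the fetch-and-execute idea concrete, here is a rough Python model of such a kernel loop. It is only a sketch: the tuple instruction format, the three opcodes, and the name `run_kernel` are invented for illustration and do not match the real LMM kernel or PASM encoding.

```python
def run_kernel(hub_program, registers):
    """Fetch one instruction at a time from 'hub' memory and execute it."""
    pc = 0
    while True:
        op, dst, src = hub_program[pc]  # fetch the next instruction from the hub
        pc += 1
        if op == "halt":
            break
        elif op == "mov":
            registers[dst] = src  # src is treated as an immediate value
        elif op == "add":
            registers[dst] = (registers[dst] + src) & 0xFFFFFFFF
        elif op == "djnz":
            # Decrement dst and jump to hub address src while it is non-zero.
            registers[dst] = (registers[dst] - 1) & 0xFFFFFFFF
            if registers[dst] != 0:
                pc = src
        else:
            raise ValueError("unknown opcode: " + op)
    return registers


program = [
    ("mov", "count", 5),
    ("mov", "total", 0),
    ("add", "total", 3),   # hub address 2: loop body
    ("djnz", "count", 2),  # loop back to address 2 until count reaches 0
    ("halt", None, None),
]
result = run_kernel(program, {})
```

Every instruction costs a hub fetch plus the kernel's own loop overhead, which is where the slowdown relative to native PASM comes from.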
Really, the design of the Propeller makes some new things possible that often marginalize the need for the things you are talking about in your post. Code sharing with SPIN and PASM is pretty easy. The structures are not all that complex, and driver type code is very atomic, as it's targeted for a COG. Mostly possible to just run it from SPIN, then do your thing in your program.
Edit: Heater and Mike beat me to the punch!!
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Propeller Wiki: Share the coolness!
Chat in real time with other Propellerheads on IRC #propeller @ freenode.net
Safety Tip: Life is as good as YOU think it is!
OK you got me.
Edit: Could have sworn I was replying to mpark, or did I imagine his post?
Edit again: So mpark, in case you did not get me: there is no recursion in the SPIN interpreter, even if there is in SPIN itself. It's rather like a CPU emulator, where the emulated CPU supports a stack but the emulator itself need not have one.
Post Edited (heater) : 2/2/2009 6:25:45 PM GMT
Carl Jacobs (JDForth) has a recursive PASM Fibonacci program.
Ha yes, now we are talking, those Forth guys can't do anything without a stack.
I believe that PASM is probably the first language that I've used that doesn't have any internal notion whatsoever of a stack. Spin, of course, has got a stack which allows the definition of recursive functions quite simply.
heater is quite right in saying that we Forth guys can't live without a stack, but neither can Spin guys, C guys, Pascal guys, most assembly language guys, or most Basic guys. In fact, finding a language that is *not* stack based is the real challenge. It's just that Forth is one of the few languages where the stack is so explicitly exposed.
I recall early last year stumbling over a bug in an AVR-based C project that I was working on. The bug was caused by the fact that the floating point string printing routine was using a LOT more of the stack than I was expecting (or had allocated)! The routine required something on the order of 100 bytes on the stack! I was shocked, to say the least.
In 20-odd years of programming I think I've used a recursive algorithm to solve only ONE problem: directory tree navigation, using recursion to step down into the various sub-directories. I could imagine a memory allocation algorithm being recursive, or a database indexing algorithm. But, as mentioned by Mike and heater, this level of coding may well be outside the scope of programming on the Propeller.
Regards,
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Carl Jacobs
JDForth - Forth to Spin Compiler http://www.jacobsdesign.com.au/software/jdforth/jdforth.php
Includes: FAT16 support for SD cards. Bit-bash Serial at 2M baud. 32-bit floating point maths. Fib(28) in 0.86 seconds. ~3x faster than spin, ~40% larger than spin.
Hmm...I now feel the urge to define a simple C or Pascal like language that has no need of a stack and generates PASM code.
So:
No recursion.
No operator precedence or bracketed expressions.
Operators evaluated in the order they appear.
No local variables, at least not on a stack.
What else do we have to ditch? We already don't have interrupts to worry about :)
I'd want a full set of operators to cover all PASM operations, like SPIN does.
Given that the call hierarchy would then be a simple tree structure, it must be possible for the compiler to pass parameters around through global variables and work out when they can be reused in different parts of the program. Same goes for local variables.
As a start the compiler would know that if no one higher up the call tree than a given function uses a particular PASM register then it can be used in that function as a local or to pass a parameter. Even if some other function in another branch of the tree reuses that register.
In this way quite compact and fast code could be generated.
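Here is a small Python sketch of how such a compiler might lay out registers once recursion is ruled out. It is only one possible approach (static overlay of locals along the call tree), and the function names, register counts, and the `allocate` helper are all invented for illustration.

```python
def allocate(calls, sizes, root):
    """Give each function a base register offset; sibling branches may overlap.

    calls: {function: [callees]}, sizes: {function: registers it needs}.
    """
    base = {name: 0 for name in sizes}

    def visit(fn, path):
        if fn in path:
            raise ValueError("recursion is not allowed: " + fn)
        for callee in calls.get(fn, ()):
            # A callee must sit above every caller that can be live when it runs.
            base[callee] = max(base[callee], base[fn] + sizes[fn])
            visit(callee, path | {fn})

    visit(root, frozenset())
    return base


calls = {"main": ["serial_tx", "blink"], "serial_tx": ["delay"], "blink": ["delay"]}
sizes = {"main": 2, "serial_tx": 3, "blink": 1, "delay": 1}

layout = allocate(calls, sizes, "main")
registers_needed = max(layout[f] + sizes[f] for f in sizes)
```

In this example `serial_tx` and `blink` share the same registers because they are never active at the same time, so the program needs 6 registers rather than the 7 that separate allocation would take.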
Would anyone have a use for such a thing? Or would it be just another fun toy for me?
PL360 is a high-level assembler / low-level compiler that was developed for the IBM 360 long ago. It had a "full set" of operators much like Spin and evaluated them left to right without parentheses. It allowed subscript brackets, but these were for use with index registers. I'm not sure what should be done to handle self-modifying code for indexing. Square brackets with a number in them, like [1], could be a sort of local variable referring to a source or destination, as in "counter := counter + [1] - [2]", and somewhere else you'd have "[1] := @table + 1". This 2nd statement would generate "MOVS local001,#table" then "ADD local001,#1". The 1st statement would generate "local001 ADD counter,0-0" and "local002 SUB counter,0-0". This notation wouldn't handle self-modification of immediate source values, but I'm sure someone could come up with something that makes sense.
PL360 had nice flow of control statements like IF / THEN, GOTO, and WHILE / DO. It even had a FOR / DO statement which could be optimized when the start and end values are both constants (to use DJNZ). These mostly took care of generating labels and jumps. In this case, a Prop version might test the carry and zero flags and might test for zero or non-zero.
Arrays? Who said anything about arrays? Let's just have pointers. After all arrays are just syntactic sugar for offsetting from a pointer.
Either way, in all instructions the src and dest fields are pointers (as long as there is no #), so all we have to do is jam the calculated index values into them with MOVS and MOVD. The tricky part might be optimizing that so that it does not happen too often and, as you say, for loops should add/subtract to the source and dest fields as they go around rather than calculating a new base + offset each time. That might be easier if the index variable in a for loop is only in scope during the loop; then you can:
1. Calculate its initial value at the start of the loop from base + offset and MOVS/MOVD that value into all instructions in the loop that use it.
2. Calculate the for loop step size.
3. After each iteration add/subtract that step size from all the instructions src/dest fields that use the index. There may be more than one.
May help if it is not allowed to write to the index during the loop. If you allow it then there may be a whole bunch of instructions within the loop that need modifying accordingly, yuck.
For random array accesses here and there in a program we need to MOVS/D the base address into an instruction and then add the index each time.
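The loop scheme above can be modelled in a few lines of Python. This is a simulation of the idea only, not generated PASM: the dictionary standing in for an instruction word, the `sum_table` name, and the addresses are assumptions made for the example.

```python
def sum_table(cog, table_base, length, step=1):
    """Sum 'length' longs by patching the source field of one ADD instruction."""
    add_instr = {"op": "add", "dst": "total", "src": 0}  # written as 0-0 in source

    # Step 1: MOVS the starting address into the instruction before the loop.
    add_instr["src"] = table_base & 0x1FF  # the source field is only 9 bits wide

    total = 0
    for _ in range(length):  # stands in for a DJNZ-controlled loop
        total += cog[add_instr["src"]]  # execute the patched ADD
        # Step 3: bump the source field instead of recomputing base + offset.
        add_instr["src"] = (add_instr["src"] + step) & 0x1FF
    return total


cog = [0] * 512          # one cog's worth of longs
cog[4:8] = [10, 20, 30, 40]
table_total = sum_table(cog, table_base=4, length=4)
```

Only one instruction is patched here; if several instructions in the loop body used the index, each of them would need the same add applied per iteration, which is the cost mentioned above.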
Damn, now I'm going to be up all night thinking about this.
Did I mention, the multiply and divide operators will only work for constant powers of 2, for efficiency's sake :)
Edit: I just noticed that you didn't actually use the word "array" I just imagined it.
Post Edited (heater) : 2/3/2009 4:16:06 AM GMT
Mike, you are correct that currently program memory is very small. Half a KWords minus special registers minus data memory. That's even less than the PIC12F508 has, and that one only has a two-level stack. However, memory size might increase dramatically. The keyword here is virtual memory. We could see the cog memory as a first-level cache. There is this other thread about connecting a large DRAM to the prop. I stopped following it as it went into philosophical discussion. However, I thought about how to do it, and I am pretty sure that you can connect a DRAM without additional hardware and that access will be faster than to Propeller main RAM.
Yes, I started assembler programming (hobby and business) on the 6502 (Commodore C64). We didn't have calling conventions, parameters, or local variables on this machine. This caused a few tricky, time-eating bugs, since from time to time even a single programmer forgets which addresses or symbols are already in use. This was different when I upgraded to the 68000 (Atari ST). Here we had indirect addressing with displacement, also on the stack pointer, which allowed for simple implementation of parameters and local variables. And by the way, there have been C compilers since the early 70s. A C compiler needs these features. C was developed on the PDP-11, which had 64 KB of memory. Consider a Propeller a PDP-11 on a chip.
There is a duality between data structures and programming concepts.
- Single data item vs. single instruction.
- Set of data items vs. instruction sequence
- Array of data items vs. loop
- Tree or graph data structure vs. recursion
Well, I can imagine that you don't handle trees or arrays in embedded programming that frequently. And if you do, you can simulate recursion by programming a loop and handling the stack explicitly. As a side note, there is also a relation to grammar levels. Parsing a context-free grammar like that for an arithmetic expression results in a syntax tree, which you can most easily traverse by recursion. And this concept is important for more than user interfaces. And a second side note: the data structure need not be in main memory completely. This makes the small-memory argument less valid.
I am still thinking that we need something like calling conventions. Maybe not really rules but recommendations, kind of a style guide. Implementing quasi-local variables by prepending function names to variable names seems insufficient and is also a waste of precious memory.
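The point about simulating recursion with a loop and an explicit stack can be sketched in Python as follows. The tuple-based syntax tree and the `evaluate` function are invented for this example; a real implementation on the Propeller would keep both stacks in cog or hub memory.

```python
def evaluate(tree):
    """Evaluate an arithmetic syntax tree without any recursive calls.

    A tree is either an int or a tuple (operator, left, right).
    """
    work = [("visit", tree)]  # explicit stack of pending actions
    values = []               # explicit stack of intermediate results
    while work:
        action, node = work.pop()
        if action == "visit":
            if isinstance(node, int):
                values.append(node)
            else:
                op, left, right = node
                # Pushed in reverse order: left is visited first, op applied last.
                work.append(("apply", op))
                work.append(("visit", right))
                work.append(("visit", left))
        else:
            right_value = values.pop()
            left_value = values.pop()
            if node == "+":
                values.append(left_value + right_value)
            elif node == "-":
                values.append(left_value - right_value)
            elif node == "*":
                values.append(left_value * right_value)
            else:
                raise ValueError("unknown operator: " + node)
    return values[0]


answer = evaluate(("+", 1, ("*", 2, 3)))  # the tree for 1 + 2 * 3
```

The depth of the two lists is bounded by the depth of the tree, so the memory cost is the same as with real recursion; it is just made visible.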
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Paul Baker
Thank you for the hint to Bill Henning's LMM. I'll have a look into that. Recently I found papers (and patents) on code compression. Might be interesting to combine these two.
PS: If there is interest, I intend to document the modules in a CMap, and I will also try to implement the mechanism in Hanno's communication process. As far as I know, the variables used to communicate with ViewPort have to be consecutive, and I cannot arrange my variables that way in my application.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
cmapspublic3.ihmc.us:80/servlet/SBReadResourceServlet?rid=1181572927203_421963583_5511&partName=htmltext
Cog memory size is not going to increase. The memory size is wired into the instructions (9-bit address fields). You cannot connect additional memory to the Propeller and have it function like the on-chip memory. The data paths do not exist. The Propeller is nothing like a PDP-11. For one, it's a RISC architecture. Self-modifying code has to be used for any kind of structure access to cog memory. Packing and unpacking of partial word data is expensive. Any kind of indirect reference to cog memory is expensive.
Just one remark regarding your last message, Mike: I've read a number of PIC datasheets and haven't seen any interrupts for PIC10F / PIC12F. You don't need the hardware stack for subroutine calls. You can program your own call stack.
The focus of the Propeller is on SPIN, and that is well defined. With the introduction of virtual machines like LMM, this may change; LMM already has a small ABI of its own. I suggest you look into LMM.
Post Edited (Andrew E Mileski) : 2/3/2009 7:04:47 PM GMT
There is not much demand for PASM calling conventions, I guess, because we don't have any high-level languages generating raw PASM to mix and match together. Also, as stated, the lack of a stack and the confined space make the overheads too burdensome.
You are right about the equivalent parameter/result exchange amongst processes executing in parallel. As far as I can tell most Propeller driver objects, serial, TV, VGA, whatever come with a set of interface routines in SPIN. This is fine as it is really simple to use SPIN as glue to assemble a system out of those drivers.
BUT what if you don't want SPIN? What if you want to go straight from your PASM code in one COG to, say, the FullDuplexSerial object's PASM? Or you want to go from ImageCraft C to a PASM driver from Obex? Then you find the interfaces are not defined. It's time to read the code and probably hack it around to fit. Not good.
As far as I can tell there is a common pattern of sharing a data block containing a "command" and the parameters/results. I have not looked at so many objects.
Perhaps there is scope to define a "convention" in that area.
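As a starting point for discussion, here is a Python sketch of that command-block pattern. The three-long layout, the command codes, and the function names are assumptions for illustration and are not taken from any particular object. The driver cog is simulated by calling `driver_poll` in the wait loop, where real hardware would have it running in parallel.

```python
CMD, ARG, RESULT = 0, 1, 2  # hypothetical layout of the shared hub block
CMD_IDLE, CMD_SQUARE, CMD_NEGATE = 0, 1, 2


def driver_poll(mailbox):
    """One pass of the driver's loop: serve a pending command, if any."""
    cmd = mailbox[CMD]
    if cmd == CMD_IDLE:
        return False
    if cmd == CMD_SQUARE:
        mailbox[RESULT] = mailbox[ARG] * mailbox[ARG]
    elif cmd == CMD_NEGATE:
        mailbox[RESULT] = -mailbox[ARG]
    else:
        mailbox[RESULT] = 0  # unknown command: report a neutral result
    mailbox[CMD] = CMD_IDLE  # written last, so it doubles as the "done" flag
    return True


def request(mailbox, cmd, arg):
    """Caller side: fill in the parameter, post the command, wait for completion."""
    mailbox[ARG] = arg
    mailbox[CMD] = cmd  # written last, so the driver never sees a half-filled block
    while mailbox[CMD] != CMD_IDLE:
        driver_poll(mailbox)  # stands in for the driver cog running in parallel
    return mailbox[RESULT]


mailbox = [CMD_IDLE, 0, 0]
squared = request(mailbox, CMD_SQUARE, 7)
negated = request(mailbox, CMD_NEGATE, 7)
```

A convention would mostly need to pin down the things this sketch simply assumes: the order of the longs, which value means idle, and who clears the command field.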
Don't worry, we read old papers from time to time; sometimes we even understand them!
heater, I look forward to your pasm compiler implementation.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
--Steve
Interesting how one's experiences affect one's prejudices. I wonder whether Crays have stacks? Probably not, because Seymour Cray started out at Control Data.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
-- Carl, nn5i@arrl.net
Post Edited (Carl Hayes) : 2/4/2009 4:31:34 AM GMT
Post Edited (Carl Hayes) : 2/4/2009 4:03:28 AM GMT
-Phil
That sounds like a challenge! Given that my compiler designing/writing skills are up to the level of Jack Crenshaw's "Let's Build A Compiler" (just) and the language features proposed here are fairly minimal, this might even happen.
I do have a version of a Crenshaw TINY like language that compiles to LMM PASM so there is a little head start.
Might take a while.....