Propeller Tool & BST & HomeSpun extension request - means faster objects, and f
Bill Henning
Posts: 6,445
Hi guys,
I have a moderately simple request for Spin compilers - it does not need any new byte codes.
Currently, adding a Spin method to send another message to Largos requires an average of 6 longs. There are currently around 90 different messages - that's 540 longs for messaging overhead. What I propose below would cut that overhead to perhaps 1 long, probably 0 longs. I came up with this as I am currently working on the graphics engine for Morpheus, which will later become the standard Largos engine. While benchmarking calling the wrapper functions, it has become evident that there is a significant overhead to sending messages.
Let's say I have a function called "Plot", which sends a message by writing to "cmd, arg1, arg2, arg3" all of which are longs.
PUB Plot(x,y,c)
  repeat while cmd!=0 ' wait until mailbox is not busy, could use lockset(n) instead
  arg1:=x
  arg2:=y
  arg3:=c
  cmd:= GR#PLOT
I set the command last, because that is what triggers the action on the mailbox contents.
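To make the "command last" ordering concrete, here is a small Python model of the mailbox handshake. It is only a sketch: the value of GR_PLOT, the list-based mailbox, and the service() stand-in for the driver cog are all assumptions made for illustration, not part of Largos.

```python
# Model of a 4-long mailbox: [cmd, arg1, arg2, arg3]. All names are hypothetical.
GR_PLOT = 1  # assumed value standing in for GR#PLOT

CMD, ARG1, ARG2, ARG3 = range(4)


def plot(mailbox, x, y, c):
    """Mirror of the Spin Plot wrapper: arguments first, command last."""
    if mailbox[CMD] != 0:
        raise RuntimeError("mailbox busy")  # Spin would wait in 'repeat while'
    mailbox[ARG1] = x
    mailbox[ARG2] = y
    mailbox[ARG3] = c
    mailbox[CMD] = GR_PLOT  # this write is what the service cog reacts to


def service(mailbox, plotted):
    """Stand-in for the driver cog: act on a pending command, then free the mailbox."""
    if mailbox[CMD] == GR_PLOT:
        plotted.append((mailbox[ARG1], mailbox[ARG2], mailbox[ARG3]))
    mailbox[CMD] = 0


mailbox = [0, 0, 0, 0]
plotted = []
plot(mailbox, 10, 20, 3)
service(mailbox, plotted)
```

If the command were written first, the service side could see a non-zero cmd while the arguments were still stale, which is why the wrapper writes it last.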
What I would like instead is to be able to declare messages, in the following format:
MSG(GR#PLOT) Plot(x,y,c)@mailbox:@returns
- the GR#PLOT would be a 32-bit constant, and it is enough to be able to specify constants for the message specifier.
- @mailbox would be the address we want to put the message specifier into, after popping the arguments off the stack into the longs following @mailbox
- @returns would be the location to write the result to
i.e. the mailbox would end up laid out like this, with the message specifier written last:
long[@mailbox] := GR#PLOT
long[@mailbox+4] := @returns
long[@mailbox+8] := c
long[@mailbox+12] := y
long[@mailbox+16] := x
This method would support a variable number of arguments!
The convention should also be:
- the generated code would wait until mailbox was 0 before writing the message, perhaps also specifying a LOCK number somewhere.
- if there is a return value, it should be written to @returns
This would speed up messaging from Spin immensely, not just for Largos, not just for my drivers, but for anyone's code that sends arguments to a cog process - like the floating point library, the serial library etc.
The generic format would be:
MSG(const{,locknum}) msg_name(arg1,..,argN)@mailboxaddr{:@retvaladdr}
where anything in {}'s is optional.
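As a rough illustration of what a compiler might emit for such a declaration, the Python sketch below lists the mailbox writes in the order they could be performed. The function name expand_msg is made up, and where the arguments start when the optional return address is omitted (+4 here) is my assumption; the post does not say.

```python
def expand_msg(const, args, has_return=True):
    """Return (offset, value) writes in the order generated code might perform them.

    Layout follows the listing in the post: specifier at +0, return address
    at +4, then the arguments as they pop off the stack (last argument first).
    """
    writes = []
    offset = 4
    if has_return:
        writes.append((offset, "@returns"))
        offset += 4
    for arg in reversed(args):      # popping yields the last argument first
        writes.append((offset, arg))
        offset += 4
    writes.append((0, const))       # specifier last: it triggers the service cog
    return writes


writes = expand_msg("GR#PLOT", ["x", "y", "c"])
for offset, value in writes:
    print(f"long[@mailbox+{offset}] := {value}")
```

Because the expansion is driven by the argument list, the same routine handles any number of arguments, which is the point of the proposal.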
Ideally I'd like all three tools I mentioned to add this support - currently I am using BST under Ubuntu and I love it!
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Please use mikronauts _at_ gmail _dot_ com to contact me off-forum, my PM is almost totally full
Morpheus & Mem+dual Prop SBC w/ 512KB kit $119.95, 2MB memory IO board kit $89.95, both kits $189.95
www.mikronauts.com - my site 6.250MHz custom Crystals for running Propellers at 100MHz
Las - Large model assembler for the Propeller Largos - a feature full nano operating system for the Propeller
Comments
Functionally what you want is something like:
This could be done very easily with a simple substitution macro facility, but it's not clear it would save you much over the calls.
1 long is 4 spin bytecodes. You consume more than that simply doing the repeat while loop.
As Mike pointed out, have you thought about generic wrappers? Also, Spin parameters are pushed onto the local stack contiguously in the order they are declared, so something like this might work:
Or something to that effect.
Remember, any method with parameters has the overhead of pushing them all onto the stack pre-call also.
I think I'm really missing what you are trying to achieve here.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
lt's not particularly silly, is it?
Basically I am trying to accomplish three things:
1) not have to write a separate stub function, with assignments, for every message
2) save some execution time
3) allow for a variable number of arguments (think printf)
Consider: (from Largos) message wrappers for clients
pub putc(ch)
  send1(cogid,_putc,ch)
pub puthex(n,d)
  send2(cogid,_puthex,n,d)
...
PUB send1(mailbox,cmd,a1)
  send3(mailbox,cmd,a1,0,0)
PUB send2(mailbox,cmd,a1,a2)
  send3(mailbox,cmd,a1,a2,0)
Using a MSG declaration I suggested:
MSG(_putc) putc(c):mailbox
MSG(_puthex) puthex(n,d):mailbox
The difference is:
- MSG declarations do not generate any code at time of declaration, only time of invocation (basically a different flavor of macro)
- no PUB declaration taking code space in the MESG object
- no actual call - the MSG just tells the compiler how to write the constant and the arguments to mailbox by popping them off the stack
- time saved: the assignments in the pub functions, which now pop the arguments off the stack and write to the variables... hmm, I should look at the generated code
An optimizing Spin compiler would allow us to do something like
INLINE PUB putc(c)
  send1(mailbox,_putc,c)
True, each call would be a bit bigger, but most programs won't use most of the stubs, and each stub currently takes 6 longs.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Post Edited (Bill Henning) : 8/19/2009 11:54:34 PM GMT
I think you are going to trade off code size for execution speed.
Each time you use the inline, you are going to be including the while loop and a stack of read/write code. At least with method calls you only include the code once, even though you still have the overhead of pushing all the parameters on the stack in the first place. The inline would be faster as it removes the overhead of the method call, but if you were to use it more than once in your code you've just blown away your size advantage.
With regards to pure code size, bst has unused object elimination. You could put 400 stubs in a sub-object and only the ones that were actually called would be compiled into the resulting code.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
And I love the listings file!
What I was trying to get at was decreased code space, increased execution speed, and variable number of arguments (I'm not asking for much <grin>)
Looks like my initial thoughts on the matter won't do it.
What would be nice would be a mechanism something like polymorphism, but more efficient.
What I am looking for is a mechanism to have several function declarations map onto just one message sending function, with a variable number of parameters - without having to manually stuff an array whose address I can pass.
I will think more on how this may be accomplished, and will post more later in this thread :)
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
No time just yet but keep it in mind.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Links to other interesting threads:
· Home of the MultiBladeProps: TriBladeProp, RamBlade, TwinBlade, SixBlade, website
· Single Board Computer: 3 Propeller ICs and a TriBladeProp board (ZiCog Z80 Emulator)
· Prop Tools under Development or Completed (Index)
· Emulators: Micros eg Altair, and Terminals eg VT100 (Index) ZiCog (Z80), MoCog (6809)
· Search the Propeller forums (uses advanced Google search)
My cruising website is: www.bluemagic.biz  MultiBladeProp is: www.bluemagic.biz/cluso.htm
I was looking into it more, and the best option would be a new Spin byte code... call it MESG for now
assuming the arguments were pushed on the stack left-to-right
MESG(msg_num) msg_name(arg1...argN)
would push
msg_num
arg1
...
argN
So the opcode would be:
MESG n, addr - four-byte opcode, n = number of arguments on the stack, addr = address of message box
It pops long arguments off the stack (which brings them in right to left), and stores them starting at
addr+4*n
and working backwards to addr, so:
MESG 0, addr
would only pop the message code, and store it at addr
MESG 3, addr
would pop three arguments first into addr+12, addr+8, addr+4, then the message code into addr
and so on!
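The pop-and-store behaviour described above can be modelled in a few lines of Python. This is a sketch only: the dict standing in for hub memory, the mailbox address 0x7000, and the message code 0x21 are arbitrary choices for illustration.

```python
def mesg(stack, memory, n, addr):
    """Pop n arguments, then the message code, storing from addr+4*n down to addr."""
    for slot in range(n, -1, -1):          # slots n, n-1, ..., 0
        memory[addr + 4 * slot] = stack.pop()


MAILBOX = 0x7000                           # hypothetical mailbox address
stack = []
memory = {}
for value in (0x21, 100, 200, 7):          # msg_num first, then arg1..arg3
    stack.append(value)
mesg(stack, memory, 3, MAILBOX)
```

With three arguments, the last argument lands at addr+12 and the message code is the final write at addr, so the service cog only sees the command once everything else is in place.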
Add it to the MSG syntax I proposed at the top of this thread, and all of a sudden Largos messages are >2x as fast in Spin, and so is sending messages to current driver cogs in mailboxes.
It also allows something neat:
MESG(_printf) printf(handle,formatstr,arg1,arg2,...)
Now would that not be wonderful?
Another useful opcode would be:
CALLV n, addr
for function calls with a variable number of arguments
printf in Spin... sscanf in Spin...
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
1. Push data (variable length) onto the stack.
2. The other removes(pops)? or copies? (variable length) data from the stack and stores in memory locations backwards?
1. I presume you are using the existing bytecodes to do this with just an extension to the compiler?
2. So how do you maintain the stack length? and keep it consistent?
or do you need 2 extra spin bytecodes?
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Actually, what you really need is a reverse longmove, where the source address is incremented and the dest address is decremented. That way you can fill the buffer with the cmd being the last thing written. If you were to implement the command Bill is talking about, the compiler would need to do something like this:
Push Cmd
Push A
Push B
Push C
push count(4)
push buffer address
Opcode
The interpreter does something like this :
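One way the interpreter side of that sequence might look, modelled in Python with plain pops rather than an address-based move. The function name and the dict used for hub memory are assumptions; the only thing taken from the push sequence above is that the buffer address is on top of the stack, the count is below it, and the count covers the command as well as the arguments.

```python
def mesg_stacked(stack, memory):
    """Handle the opcode when count and buffer address arrive on the stack."""
    addr = stack.pop()    # buffer address was pushed last
    count = stack.pop()   # number of longs to move, command included
    for slot in range(count - 1, -1, -1):
        memory[addr + 4 * slot] = stack.pop()   # command is popped and written last


BUF = 0x7100                                    # hypothetical buffer address
stack = ["Cmd", "A", "B", "C", 4, BUF]
memory = {}
mesg_stacked(stack, memory)
```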
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
It relies on the fact that the address of the first parameter is the address of an array on the stack that includes all the parameters and local variables in order of appearance.
-Phil
I have not implemented any of this, I am kicking ideas around at this time.
The two new opcodes would make it fairly easy.
The MSG opcode would be followed by three bytes - one is the number of arguments on the stack, the other two are the address of the mailbox.
For the variable argument function call, it would still be followed by three bytes - one is the number of arguments on the stack, and the other two are the offset to the method (address of function).
If done this way, the implementation of the MSG opcode would know how many longs to pop from the stack into the mailbox (whose address is part of the new opcode) - and the last one popped would go right into the mailbox address and trigger the service cog. By having the number of arguments and method address as part of the instruction it would save two instruction decodes and some stack manipulation.
Similarly the CALLV would know how many arguments are on the stack - perhaps the vararg functions would have to be written something like:
pub vararg printf(args)
that way they would know how many arguments are in the argument vector
and an example would be:
printf(string("number %d string %s hex %x\n"),15,string("hello world"),addr)
then the implementation of printf would get:
args[0]=3
args[1]=@string("number %d string %s hex %x\n")
args[2]=15
args[3]=addr
I realize this could be done by making assignments like above, but it is MUCH more readable if supported by the compiler.
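For anyone who wants to play with the idea, here is a Python model of a count-prefixed argument vector. It is a sketch under assumptions: callv and this printf are stand-ins I made up, and I have taken the count in args[0] to include the format string, which the post does not spell out.

```python
def callv(func, *values):
    """Model of a CALLV-style call: hand the callee one vector, count first."""
    args = [len(values), *values]
    return func(args)


def printf(args):
    """Vararg callee: args[0] is the count, args[1] the format, the rest its values."""
    count = args[0]
    fmt = args[1]
    rest = args[2:count + 1]
    return fmt % tuple(rest)


result = callv(printf, "number %d hex %x", 15, 255)
```

The callee never needs a fixed parameter list; it reads the count and walks the vector, which is what a printf or sscanf in Spin would need.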
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Really the only difference is instead of pushing the count and the buffer address I would like them as part of the command in order to save two opcode decodes and two pushes/pops!
[opcode:8][count:8][address:16]
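Packing and unpacking that four-byte layout is straightforward; the Python sketch below uses the standard struct module. The opcode value 0x3F is just a placeholder and the little-endian byte order of the address field is my assumption, since neither is fixed anywhere in this thread.

```python
import struct

MESG_OPCODE = 0x3F   # placeholder value, not an assigned bytecode
LAYOUT = "<BBH"      # opcode:8, count:8, address:16 (little-endian assumed)


def encode(count, address, opcode=MESG_OPCODE):
    """Pack one instruction into its four bytes."""
    return struct.pack(LAYOUT, opcode, count, address)


def decode(code):
    """Unpack four bytes into (opcode, count, address)."""
    return struct.unpack(LAYOUT, code)


instruction = encode(3, 0x7FF0)
```

Keeping count and address inline means the interpreter fetches them with the opcode instead of decoding two extra push instructions first.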
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
I'll take a look at it - looks promising - however it would be slower than what I am suggesting due to more stacking and an extra function call - but may allow me to make my messaging object a bit faster than it is now
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
While it may be a bit slower than what you are suggesting, it does not involve a new RAM-based interpreter or changes to the compiler, so it retains Parallax compatibility (which is good).
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
From what I am understanding, the above "push count(4)" would have to be either part of the cmd (as Bill said, and would obviously be a better approach) or it would need to immediately follow the cmd - otherwise how would the interpreter know how many parameters were on the stack in order to locate the count parameter?
One bit is missing. IIRC the interpreter bytecodes are basically all used except for $3F, so effectively we are using another byte in all of this. And how do we know how to pop all items from the stack in order to preserve its integrity (I am missing something here)??
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
I was thinking if we were to get into "extending" the interpreter like this we'd use $3F as an extended bytecode with another code following to identify the actual extended bytecode (buying us an extra $FF instructions to implement if required). As you can see above, the interpreter knows what to do with the stack in this case.
I don't really like it, but if you were to implement paging or sections of LMM in the interpreter it's certainly achievable.
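The $3F escape idea amounts to a two-level decode. A minimal Python sketch of it follows; the handler tables and their contents are invented for illustration, and only the use of $3F as the escape byte comes from the discussion above.

```python
EXTENDED = 0x3F   # the one bytecode said to be unused, used here as an escape


def step(code, pc, primary, extended):
    """Decode one instruction; return (handler, next_pc)."""
    op = code[pc]
    if op == EXTENDED:
        sub = code[pc + 1]            # second byte selects the extended instruction
        return extended[sub], pc + 2
    return primary[op], pc + 1


primary = {0x01: "push"}                      # hypothetical existing bytecode
extended = {0x00: "mesg", 0x01: "callv"}      # hypothetical extended bytecodes
code = bytes([0x01, 0x3F, 0x01])

first, pc = step(code, 0, primary, extended)
second, pc = step(code, pc, primary, extended)
```

The cost is one extra byte and one extra fetch per extended instruction, which is the "another byte" mentioned earlier in the thread.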
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔