Here is Chip's new SQRT that I incorporated into my Interpreter.
'Chip's smaller version: masksqrt = $40000000 gets used directly and rotated all the way back to orig. value
math_F8         mov     x,#0                    'reset root
msqr            or      x,masksqrt              'set trial bit
                cmpsub  y,x             wc      'subtract root from input if fits
                sumnc   x,masksqrt              'cancel trial bit, set root bit if fit
                shr     x,#1                    'shift root down
                ror     masksqrt,#2     wc      'shift mask down (wraps on last iteration)
        if_nc   jmp     #msqr                   'loop until mask restored
                jmp     #push
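For anyone who wants to step through the loop without a Propeller, here is a minimal Python sketch of it. This is not Chip's code: the function name `pasm_sqrt` and the hand-written flag handling are my own modeling assumptions, with `cmpsub`, `sumnc` and `ror ... wc` imitated one line at a time.

```python
MASK32 = 0xFFFFFFFF

def pasm_sqrt(y):
    """Model of the msqr loop: integer square root of a 32-bit unsigned value."""
    x = 0                       # mov x,#0
    mask = 0x40000000           # masksqrt
    while True:
        x |= mask               # or x,masksqrt      (set trial bit)
        c = y >= x              # cmpsub y,x wc      (C set when the subtract happens)
        if c:
            y -= x
        # sumnc x,masksqrt: add the mask if C is set, subtract it if C is clear
        x = (x + mask if c else x - mask) & MASK32
        x >>= 1                 # shr x,#1
        carry = mask & 1        # ror ... wc copies the old bit 0 into C
        mask = ((mask >> 2) | (mask << 30)) & MASK32   # ror masksqrt,#2
        if carry:               # if_nc jmp #msqr falls through once C is set
            return x

print(pasm_sqrt(1000000))
```

The mask visits bit 30, 28, ... 0, so the loop runs 16 times, and the final rotate both restores the mask and raises C to end the loop.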
The Rom Interpreter was compiled and listed with homespun.
Dave said...
I have a question about one line of code in the Spin interpreter. A few lines after the "push" label, and just before the "jmp #loop" instruction there is a test instruction that writes the zero flag. However, the code at the top of the loop ignores the zero flag. Do you know the reason for the test instruction? It seems to be a wasted instruction.
Remember the PUSH is a subroutine, so the jmp #loop will not always jump to loop. The C flag is required to be set if returning from a mathops routine.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔ Links to other interesting threads:
I have a new update to the LMM Interpreter. I added the FCALL and FCACHE pseudo-ops, and I optimized the LMM PASM replacements for the sqrt, strsize and strcomp instructions. I used Chip's faster sqrt code, and the LMM PASM version runs almost as fast as the original PASM version in the Spin interpreter. The strsize and strcomp routines actually run faster than the original versions in the Spin interpreter for string sizes greater than 25 characters. This is because I was able to reduce the original 64-cycle loop to a 16-cycle loop for strsize, and a 32-cycle loop for strcomp. I use the FCACHE pseudo-op to run these loops at full speed.
I use the coginit/lock instruction space in the Spin interpreter for the FCACHE area. This provides either a 16-instruction cache or it can be used for 16 registers, or any combination the programmer chooses. This works well for small loops that loop many times. Also, once a loop is loaded in the cache it can be called directly without having to reload it.
The attached demo program measures the speed of the sqrt, strsize and strcomp routines before and after loading the LMM PASM interpreter. It also compares three routines written in Spin, LMM PASM and LMM PASM+FCACHE. The routines implement a simple loop that adds the numbers from 1 to N. There is a dramatic difference in speed between the three versions.
Dave, this is great. I hope you don't mind: I took your interpreter, made 2 changes, and I am now using it for my inline asm i2c drivers.
The 2 changes are:
1. I put the indirect call in. I used the 3 res longs at the end of the fcache; you didn't seem to be using them, so I put the indirect logic there and also saved 1 long from your fcall logic.
2. I changed the 3c entry point to pop just x rather than x,y. I felt that the 2nd parameter depends on what you are calling, so it shouldn't be done in the lmm code itself, e.g. my i2c code has either 4 or 5 parameters.
I have attached my modified lmm_inter.spin if you are interested.
Can we add the MIT license at the end, so we can post it and variants to the OBEX?
Tim, the icall looks like a good addition. If I understand correctly, the value after the "jmp #icall" could be interchanged with the value in t1, since the final address is the sum of these two. This would allow storing the call table address after the "jmp #icall" and putting the index in t1, or vice-versa.
We could also perform an ijmp by jumping to the address after icall. This would be useful if you didn't want to change the return address register.
I like the idea of pushing only the code address. I agree that it is better for the user code to handle pushing and popping the parameters.
I'll change it to handle all the cogs. I'll clean up the code a bit and put it in the OBEX when I have a chance.
The way I use icall, the address of the start of the jump table is in t1 and the long after the jmp #icall is the index into the jump table.
It saves the return address, then adds the index to t1 and jumps to the address stored at that location.
But putting the start address of the jump table after the jmp #icall and the index in t1 should work as well.
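Since the lookup address is just a sum, the two conventions can be checked with a small Python model. This is only a sketch: the `icall` function, the hub addresses and the table contents are made up for illustration, and it assumes the index is a byte offset (entry number times 4).

```python
def icall(hub, t1, inline_value):
    """Return the call target: the long stored in hub memory at t1 + inline_value."""
    return hub[t1 + inline_value]

TABLE = 0x1000                      # hypothetical hub address of the jump table
hub = {TABLE + 4 * i: target for i, target in enumerate((0x2000, 0x2040, 0x2080))}

index = 2                           # zero-based entry number
offset = index * 4                  # each table entry is one long (4 bytes)

# table base in t1, offset in the long after the jmp #icall
a = icall(hub, TABLE, offset)
# offset in t1, table base in the long after the jmp #icall
b = icall(hub, offset, TABLE)
print(hex(a), hex(b))
```

Both calls read the same table entry, which is why either value can live in t1.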
I like the idea of adding a ijmp entry point as well.
The other thing I have done is to re-order the file so the patching code and the code that is permanently loaded in cogs are together at the end of the DAT section. This makes it easier to reuse that hub space once all the cogs using lmm are loaded. Once I load the lmm into the cog, I reuse that cog for buffer space.
Using lmm for i2c drivers is working very well; I am now getting about a 5x speed-up for device access over straight Spin code. With the core lmm i2c routines available, it is costing me ~30-40 longs per device to add the lmm version.
I2C is a great application for LMM PASM. It's the type of code that runs only when the main program needs to access an I2C device. Of course, there are applications that periodically sample devices while the main thread runs, but even those drivers might work well with Spin mixed in.
Another thing that should work well is floating point. This would be good for applications that need to do a little bit of floating point quickly, and it would be a waste to use a full cog to support it.
Tim, that's a good suggestion about re-using the PASM code after the cog is loaded. I'll move the loader and cog PASM code together at the end of the DAT section.
I'm a bit concerned about putting an MIT license on Chip's code. The LMM interpreter contains bits and pieces of the Spin interpreter, and it uses Chip's improved sqrt routine. Does anybody know Chip's view on this? Chip, could you comment on this?
Dave: You may only put an MIT license on your modifications. You will need to ask Chip's permission before you publish any of his code in the OBEX. See the top of my Faster Spin for a statement. However, if all you are doing (which is what I believe you have done) is adding a routine to modify the existing Interpreter on the fly, so to speak, then I am sure you can publish this on the OBEX. As for the SQRT, while I cannot speak for Chip, he did publish the code for us on the forum and I am sure you could use it with an acknowledgement, just as I have done. If you want to publish any of the Interpreter code, you must ask Chip's permission (BTW I have and he was fine) - just PM him.
Tim, thanks for fixing the bugs and reordering the code. I guess I wasn't very careful in converting those parts of the Spin interpreter into LMM PASM. We need to test this with non-LMM applications to make sure nothing else got broken. I'll test it with various apps that I've developed.
I removed the test for the cogid and the cog_loaded flag in the start method. The programmer will have to ensure that he calls the start method once at the beginning on each cog that he wants to run LMM. In theory, the start method could be called more than once, and it will just rewrite the same information.
I also replaced the run method with run0, run1 and run2, which are used for zero, one or two parameters. I've ported the basic floating point functions to LMM PASM, and have attached a test program. LMM PASM runs about 3.3 to 4 times faster than Spin, except for divide, which is 13 times faster. The Spin FDiv is really slow. The PASM routines are about 2.2 to 2.5 times faster than the LMM PASM routines.
I PM'ed Chip about using the SQRT and Spin interpreter pieces of code. I'm still waiting to hear back from him. In the meantime I plan on writing a brief document on how to program in LMM PASM.
My floating point test program is attached. It uses your latest version of the LMM interpreter with the small changes in the start and run methods that I mentioned.
I got the OK from Chip to add the MIT license and post our code to the OBEX. I wrote up a brief description and programmer's guide and attached it to this post. It probably has a few typos and errors, so please look it over and post your corrections and suggestions. I hope to get this into the OBEX in the next few days.
I'm thinking of renaming the x, y, a, t1, etc. Spin registers to reg0, reg1, ... reg7. I would renumber the existing reg0, reg1, ... registers to reg8, reg9, etc. The original Spin interpreter names for the first 8 registers are not important, and may be confusing. Does that sound OK?
Bill, I did use the FCACHE for the FMul and FDiv routines. They would have been much slower without it. I originally planned on using the *FILL and *MOVE area of the Spin interpreter, but I decided to use the coginit and lock area instead. I didn't want to move the *FILL and *MOVE code to LMM PASM and slow it down. There is also a constant defined in this code that can't be moved. I may consider it for a later rev of the LMM interpreter.
BTW, we need a shorter term to refer to Spin + LMM PASM. I was thinking of using SpinLP. Does that sound OK? Any other suggestions for a shorter term?
I suspect that due to their looped nature, the *FILL and *COPY routines would not slow down significantly if FCACHED - and having more space for an LMM FCACHE is a good thing :) but as you say, it can be used for a later version.
As far as a name goes... how about:
Spin+
or
LMMSpin
?
Dave Hein said...
I got the OK from Chip to add the MIT license and post our code to the OBEX. I wrote up a brief description and programmers guide and attached it to this post. It probably has a few typos and errors, so please look it over and post your corrections and suggestions. I hope to get this into the OBEX in the next few days.
I'm thinking of renaming the x, y, a, t1, etc. Spin registers to reg0, reg1, ... reg7. I would renumber the existing reg0, reg1, ... registers to reg8, reg9, etc. The original Spin interpreter names for the first 8 registers are not important, and may be confusing. Does that sound OK?
Bill, I did use the FCACHE for the FMul and FDiv routines. They would have been much slower without it. I originally planned on using the *FILL and *MOVE area of the Spin interpreter, but I decided to use the coginit and lock area instead. I didn't want to move the *FILL and *MOVE code to LMM PASM and slow it down. There is also a constant defined in this code that can't be moved. I may consider it for a later rev of the LMM interpreter.
BTW, we need a shorter term to refer to Spin + LMM PASM. I was thinking of using SpinLP. Does that sound OK? Any other suggestions for a shorter term?
Renaming the registers is a good idea. You probably want to move t1 to another register (perhaps reg7 (op2) or reg23 (was reg15)). The problem with reg23 is that icall/ijmp then doesn't work with fcache code, so probably reg7.
Doc looks good, a couple of comments:
typo - FCALL section 'caledl'
fcache - you might want to make it clearer that you need an org cache_addr in the lmm pasm code and why
You might want to make a comment about waitcnt - to be careful about the delay number since instructions take '32' clocks rather than 4.
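To put numbers on the waitcnt caution, here is a rough Python model. It is a sketch under stated assumptions: it takes the 32-clock figure from the comment above at face value, ignores hub-window alignment and the execution time of waitcnt itself, and `waitcnt_stall` is a made-up helper name.

```python
CNT_WRAP = 1 << 32          # the system counter is 32 bits wide

def waitcnt_stall(delay, instr_between, clocks_per_instr):
    """Clocks that waitcnt blocks for, given the delay added to a cnt snapshot.

    instr_between is the number of instructions executed between taking the
    snapshot and reaching the waitcnt. If the target has already passed,
    the wait lasts until the counter wraps around to it.
    """
    elapsed = instr_between * clocks_per_instr
    return (delay - elapsed) % CNT_WRAP

native = waitcnt_stall(delay=100, instr_between=4, clocks_per_instr=4)
lmm = waitcnt_stall(delay=100, instr_between=4, clocks_per_instr=32)
print(native, lmm)
```

With the same delay value, the native case waits a few dozen clocks, while the LMM case has already missed the target and stalls for almost a full counter wrap (close to a minute at 80 MHz).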
SpinLMM seems like a fine name. Postfixing LMM means it's an add-on rather than the basic architecture. Of course one could call it SpinOlay! to show the functional origin of LMM and the bravado of the performance enhancements.
I converted this to PDF. If you give permission, I can post it in my post,
so you can attach it in the OBEX file.
Regards
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔ Nothing is impossible, there are only different degrees of difficulty. For every stupid question there is at least one intelligent answer. Don't guess - ask instead. If you don't ask you won't know. If your gonna construct something, make it as simple as possible yet as versatile as posible.
jazzed said...
SpinLMM seems like a fine name. Postfixing LMM means it's an add-on rather than the basic architecture. Of course one could call it SpinOlay! to show the functional origin of LMM and the bravado of the performance enhancements.
I like SpinLMM. I'll change the name in the document and add the other changes as well. Sapieha, I will post a new version of the document soon. I would appreciate it if you could convert the new version into a PDF. I will be posting it later today after I complete a few more gardening tasks for my wife.
Dave: Great news Chip gave you the OK for the MIT license. I did expect he would.
This is great work. I think a caveat should be added to the writeup that it is not for the beginner. BTW homespun should also work (it has @@@ too).
SpinLMM sounds great.
BTW I would prefer the registers remain the same. I know I can change my Faster version but it would be easier to maintain with the same names. Then if later someone else tackles the interpreter before discovering your code, it can be added easier. No point in changing names just for the sake of it.
I found what I think is a better way to name the registers from org 0... (the way it overlays the org 0 initialisation code for later use as registers)
'-----------------------------------------------------------------------------------------------------------------
'The next 8 registers overlay the Interpreter initialisation code which follows
                org     0
x               res     1                       'these 8 occupy the entry-code space
y               res     1
a               res     1
t1              res     1
t2              res     1
op              res     1
op2             res     1
adr             res     1
'-----------------------------------------------------------------------------------------------------------------
'The next 8 instructions re-used as registers (defined above) once this initialisation code has executed
                org     0
Interpreter     mov     x,#$1F0-pbase           'entry, load initial parameters
                mov     y,par
:loop           add     y,#2
:par            rdword  pbase,y
                add     :par,#$100              'inc d lsb
                add     :par,#$100
                djnz    x,#:loop
                cogid   id                      'set id
'-----------------------------------------------------------------------------------------------------------------
I do not like the format for the ICALL as it is not obvious and prone to errors
              mov t1, #8           ' Index the third long in the jump table
What this means is
              mov t1, #(3*4)-4
but it would be better to call it entry "2", counting from base "0", so
              mov t1, #(2*4)       'index to jump_table[2]
or
              mov t1,#2*nlong      'where nlong equ 4
We could also use
              mov t1, #(@@@jump_table3 - @@@jump_table)
Cluso99, I use icall in a different way. Here's some of my code:
lmmgetdata sub lmm#dcurr,#4
rdlong lmm#t1,lmm#dcurr 'i2ctable
jmp #lmm#icall 'i2c setup
long I2CLMM#SETUP
sub lmm#dcurr,#4
rdlong lmm#reg11,lmm#dcurr '@lx
mov lmm#reg8, #HMC5843_STATUS 'status register
jmp #lmm#icall
long I2CLMM#READREGHEADER
shl lmm#reg7, #16
mov lmm#reg12, lmm#reg7
mov lmm#reg7, #1 'read with NAK
jmp #lmm#icall
long I2CLMM#READ 'status
jmp #lmm#icall
long I2CLMM#STOP 'i2c stop
cmp lmm#reg12, #0 wz
or lmm#reg12, lmm#reg4
cmp lmm#reg4, #5 wc 'if < 5 then return
if_c_or_nz wrlong lmm#reg12,lmm#dcurr 'push status
if_c_or_nz add lmm#dcurr,#4
if_c_or_nz jmp #lmm#loop 'back to interpreter
jmp #lmm#icall
long I2CLMM#START 'i2c start
mov lmm#reg7, #0 'init ack status
mov lmm#reg4, lmm#a 'i2c address with read bit set
or lmm#reg4, #1
jmp #lmm#icall
long I2CLMM#WRITE
shl lmm#reg7, #8 wz
or lmm#reg12, lmm#reg7
mov lmm#reg7, #%100000 'read with ACK, NAK 6th read
jmp #lmm#icall
long I2CLMM#READWORDR
mov lmm#reg8, lmm#reg10
jmp #lmm#icall
long I2CLMM#READWORDR
mov lmm#reg9, lmm#reg10
jmp #lmm#icall
long I2CLMM#READWORDR
if_z wrlong lmm#reg8, lmm#reg11 'write results out to hub addresses
add lmm#reg11, #4
if_z wrlong lmm#reg9, lmm#reg11
add lmm#reg11, #4
if_z wrlong lmm#reg10, lmm#reg11
mov lmm#x, lmm#reg12 'push status and acks
jmp #lmm#fretx 'return back to interpreter after pushing x
I load t1 at the start of the routine. The address of the jump table is passed in as a parameter (i2ctable) and I put that in t1 once for the routine - I don't reload it during the routine. The value following the jmp #icall is a constant.
I've added the various suggestions to the document and the SpinLMM interpreter. I have attached a preliminary version of the code I will be posting to the OBEX. Please look it over and provide corrections that should be made to the files.
Sapieha, could you please convert this version of the SpinLMM.html to a PDF file? Please change the version number from 0.9 to 1.0 before converting.
Cluso, I renamed the registers as others had suggested. I think it is less confusing for programmers to use reg0 through reg7 instead of x, y, a, etc. I didn't fully understand your concern about this. The values of x, y, a, etc. are still defined within SpinLMM.spin, so I hope this meets your needs. BTW, I finally looked at the original thread where you and Hippy discussed improving the Spin interpreter, and I saw that LMM was discussed quite a bit back then.
                mov     callstack, pcurr        ' on entry to lmmrun, copy pcurr - which points beyond spin stack - to callstack
' change FCALL as follows:
FCALL           add     callstack,#4
                wrlong  lmm_pc,callstack
                rdlong  lmm_pc,lmm_pc
                jmp     #next
' change FRET as follows:
FRET            rdlong  lmm_pc,callstack
                sub     callstack,#4
                jmp     #next
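A toy Python model of that call stack shows nested calls unwinding in reverse order. This is a sketch only: the class name `LmmModel` and all addresses are invented, hub memory is a plain dict, and the model saves lmm_pc exactly as the snippet does, so it says nothing about whether the real interpreter then steps past the inline address long.

```python
class LmmModel:
    """Toy model of the FCALL/FRET stack handling in the snippet above."""

    def __init__(self, hub, pcurr):
        self.hub = hub
        self.callstack = pcurr                  # mov    callstack, pcurr
        self.lmm_pc = 0

    def fcall(self):
        self.callstack += 4                     # add    callstack,#4
        self.hub[self.callstack] = self.lmm_pc  # wrlong lmm_pc,callstack
        self.lmm_pc = self.hub[self.lmm_pc]     # rdlong lmm_pc,lmm_pc

    def fret(self):
        self.lmm_pc = self.hub[self.callstack]  # rdlong lmm_pc,callstack
        self.callstack -= 4                     # sub    callstack,#4

hub = {0x100: 0x200, 0x204: 0x300}   # inline call targets at made-up addresses
m = LmmModel(hub, pcurr=0x800)
m.lmm_pc = 0x100
m.fcall()                            # outer call, now running at 0x200
m.lmm_pc = 0x204                     # pretend execution reached a nested FCALL
m.fcall()                            # inner call, now running at 0x300
m.fret()
inner_return = m.lmm_pc
m.fret()
outer_return = m.lmm_pc
print(hex(inner_return), hex(outer_return), hex(m.callstack))
```

After both returns the stack pointer is back where it started, which is the property a nested FCALL scheme needs.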
Also, may I suggest renaming FRETX to FRET_REG0 - it may be more readable to those just starting out as it makes it explicit that REG0 is being returned.
Sapieha, the PDF looks good. I might make a few changes to the HTML to make the text line up better in the PDF. I'll let you know.
Bill, I like the idea of jumping by adding or subtracting values to/from the program counter. It is useful for relocatable code, and it's an easy way to loop back a few instructions or to skip forward a few instructions. Immediate jumps could be as large as +/- 510. Longer relative jumps could be done with an index register. I might add a "Tips and Tricks" section that describes this technique and a few others.
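The offsets for these relative jumps can be worked out with a few lines of Python. This is a sketch: it uses the `(instr_back+1)<<2` and `(instr_fwd)<<2` forms from Bill's post, assumes the LMM program counter has already advanced one long past the add/sub when it executes, and relies on the PASM immediate field being 9 bits wide. The helper names are made up.

```python
IMM_MAX = 0x1FF     # a PASM immediate source operand is 9 bits wide

def back_offset(instr_back):
    """Immediate for 'sub pc,#...' to land instr_back instructions before the sub."""
    return (instr_back + 1) << 2

def fwd_offset(instr_fwd):
    """Immediate for 'add pc,#...' to skip instr_fwd instructions after the add."""
    return instr_fwd << 2

def fits_immediate(value):
    return 0 <= value <= IMM_MAX

def simulate(pc_of_jump, delta):
    """New pc, given that pc already points one long past the jump instruction."""
    return pc_of_jump + 4 + delta

start = 0x400
print(hex(simulate(start, -back_offset(3))), hex(simulate(start, fwd_offset(2))))
print(fits_immediate(fwd_offset(127)), fits_immediate(fwd_offset(128)))
```

Because offsets are whole longs, the largest immediate that is actually usable is 508 bytes, i.e. 127 instructions forward or 126 back.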
SpinLMM does not actually contain a return instruction that is used with FCALL. A return is performed by doing a "mov lmm_pc, lmm_ret" instruction. FRET and FRETX are used to return back to the Spin interpreter. Maybe these should be named FEXIT and FEXIT_REG0, or FSPIN and FSPIN_REG0.
I originally thought that FCALL should put the return address on the stack. However, I like the idea of using a dedicated register, which could be pushed to the stack if needed.
SpinLMM has a very small set of pseudo-ops. I want to minimize the effect on Spin execution speed. However, I might add a few more pseudo-ops later on.
I posted SpinLMM to the OBEX. It is located at http://obex.parallax.com/objects/635/ . I decided to keep the code the same as version 0.9, except I bumped the version number to 1.0. Thanks to everyone for their suggestions and improvements.
Brad, it would be nice to add macro capability to BST as Cluso suggested. I have some macro expansion code that I developed for CSPIN if you're interested. It would have to be modified to preserve newlines and indentation for Spin.
Comments
For reference again, here is the Faster Spin Interpreter thread. There is also a PASM trace of the Interpreter starting here too. http://forums.parallax.com/showthread.php?p=731577
Here is the PASM & SPIN Debugger thread, which also has some info on the Interpreter intermingled, including the listing of the ROM Interpreter: http://forums.parallax.com/showthread.php?p=748420
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Links to other interesting threads:
- Home of the MultiBladeProps: TriBlade, RamBlade, SixBlade, website
- Single Board Computer: 3 Propeller ICs and a TriBladeProp board (ZiCog Z80 Emulator)
- Prop Tools under Development or Completed (Index)
- Emulators: CPUs Z80 etc; Micros Altair etc; Terminals VT100 etc; (Index) ZiCog (Z80), MoCog (6809)
- Prop OS: SphinxOS, PropDos, PropCmd; Search the Propeller forums (uses advanced Google search)
My cruising website is: www.bluemagic.biz  MultiBlade Props: www.cluso.bluemagic.biz
Dave
Post Edited (Dave Hein) : 7/1/2010 8:13:45 PM GMT
Thanks,
Dave
Post Edited (Timmoore) : 7/2/2010 5:45:11 PM GMT
Dave
1. I added the # to the fjmp; I don't think it will work without this.
2. coginit used popayx but you only popped 2 parameters rather than 3.
I have attached my version, which has the indirect call and jmp and the re-arranged code to get a block to be reused, as well as these fixes.
Tim
Dave
Consider throwing out *FILL and *COPY, and providing LMM versions which use FCACHE - thus providing a larger FCACHE area.
For the loops in FMUL/FDIV, consider FCACHEing the loops.
Regards,
Bill
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
www.mikronauts.com E-mail: mikronauts _at_ gmail _dot_ com
My products: Morpheus / Mem+ / PropCade / FlexMem / VMCOG / Propteus / Proteus / SerPlug
and 6.250MHz Crystals to run Propellers at 100MHz & 5.0" OEM TFT VGA LCD modules
Las - Large model assembler Largos - upcoming nano operating system
Oh, I just found that I can't attach HTML files. Here's a link to the lmm.html document -- http://home.swbell.net/davehein/lmm.html .
Dave
Cheers :),
--Steve
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Propeller Pages: Propeller JVM
Sapieha
Thanks,
Dave
No problem - as soon as you say it is OK
Regards
Sapieha
Dave
As promised.
Regards
Sapieha
Post Edited (Sapieha) : 7/6/2010 9:58:09 PM GMT
please note, conditional branching is possible:
sub pc,#(instr_back+1)<<2
and
add pc,#(instr_fwd)<<2
for example:
Also, you could maintain an FCALL stack:
Dave
Dave
HMC5843 tri-axis compass
ITG-3200 tri-axis gyro
ADXL345 tri-axis accelerometer
BMP085 pressure sensor
BLINKM RGB led
edit: fixed link
Post Edited (Timmoore) : 7/9/2010 12:00:13 AM GMT