Jazzed,
I updated the jvmVirtualPeripheral by moving the pin conversion code to the
VP initialization routines. This reduces the VP code itself and saves cycles.
I also moved the timer latch code into the timer VP code. The latch method
now only sets a byte to the passed bank value, and then waits for
the timer VP code to clear that byte. At that point the current timer count
is latched and valid.
I think it is sufficient for working VP code to have the PRI methods of
jvmVirtualPeripheral converted to pasm. The VP code only depends on
the 8.68 usec tick, which is generated in method isr. And since isr runs
in its own cog, the jvm should work, although slower than wanted,
while the jvm is not converted to pasm.
I'm currently working on the second part of the byte codes. I notice that the carry variable in jvmData is set depending on the result of some math calculations, but its state is never checked anywhere, or have I just not found it yet?
Your Wiki addition "Fast-Track for PropJavelin" is great! But perhaps you could add a bit at the start about what PropJavelin is and how it works. Or at least a link to the pertinent "high-level" description for this application in the "JVM for Prop" thread (that is getting very deep indeed!)
It will help Newbies to this Topic understand just what this is about.
I added a definition for PropJavelin under Propeller Lingo but never thought of that for the actual pages. It's a good point though as I often find things through Google, get to a page and it's not clear exactly what the page relates to.
I've added short explanations, and if anyone feels anything is missing ( there or elsewhere ), don't be afraid to dive in and start adding. No need to register to edit pages. No need to know Wiki markup codes; just type in what's useful and someone will pretty it up later.
Hey hippy - I'll 2nd Drone there; excellent wiki page.
Hmm "PropJavelin" - bit of a mouthful, how about "JProp" ?
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Cheers,
Simon
www.norfolkhelicopterclub.co.uk
You'll always have as many take-offs as landings, the trick is to be sure you can take-off again ;-) BTW: I type as I'm thinking, so please don't take any offense at my writing style
"PropJavelin" ... I thought Peter had called it that, so i just used it, but now think I'm mistaken.
I'm not that keen on it either, doesn't really roll off the tongue.
"JavaProp" has a ring to it, but I suspect Java(TM) brings issues, hence why we have "Javelin Stamp" not "Java Stamp" already.
"JavelinProp" is perhaps better ? Whatever people want, I'll update all the pages. I think it's only courtesy though to give Peter final say or some veto on choices.
@Hippy,
You're right about the IP issues. JavelinProp represents the idea well ... one wonders if Parallax will let its lawyer out of the closet though :)
@Peter,
I'm following your PUB ISR model to build and test ASM functions you have listed as private in the (ever changing :) jvmVirtualPeripheral.spin file.
I'm finding it hard to make timing with registers on these "short" or 2 byte word address boundaries ... doing an endian swap and rdbyte/wrbyte with the timer is a 5us proposition (4us if you only care if timer needs reset). I've implemented a word wide routine that takes 2.5us, but that gives little margin for anything else considering the other vp activity that needs to happen. Using long access, this time could be shorter, but that would require changing some addresses ... one could presume that with a special propeller core library, such issues would be less troublesome :) and other optimizations could be made.
Assuming the byte-wise Timer VP code in Spin translates into rdbyte and wrbyte instructions,
it would take at least 15*16cycles = 240 cycles where
it could take 6*16cycles = 96 cycles if nmTimer and nmLatchTimer
were long aligned. Luckily only the Timer has multibyte properties,
all the other VP only have byte parameters as far as the VP code is concerned.
The Timer VP code could be written with word accesses instead,
which reduces timing to 9*16cycles = 144 cycles.
The nmTimer and nmLatchTimer registers are stored as Little Endian, just as prop longs.
Now if the properties were long aligned, the code could become a single long read,
increment and write, which reduces timing to 6*16cycles = 96 cycles.
I think we can make this happen by long aligning the vpRambank and adding 2 dummy bytes
at its beginning. nmTimer1 then starts at offset 4 instead of 2. This can be accounted for
in the readRegister and writeRegister methods and address calculations in VP code.
regards peter
Post Edited (Peter Verkaik) : 3/3/2008 6:05:50 AM GMT
I made vpRambank long aligned and so there are two dummy bytes at the start and
two dummy bytes at the end but it makes the 32bit timerregisters long aligned,
so the timer VP code is reduced to incrementing one long. All the parameters for
other VP code are bytes and so their addresses simply increase by 2.
The readRegister and writeRegister are updated to account for the additional two bytes.
Attached are the updated jvmData and jvmVirtualPeripheral.
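To make the padding idea concrete, here is a minimal Python model of it. It is only a sketch: the class name PaddedRamBank and its methods are made up for illustration, the real readRegister and writeRegister live in the Spin sources, and the model assumes the bank itself starts on a long boundary.

```python
PAD_BYTES = 2      # dummy bytes in front of the bank, as described in the post
NM_TIMER1 = 0x02   # Javelin register offset of the timer low byte (from the post)


class PaddedRamBank:
    """Hypothetical model of a long-aligned vpRambank with 2 dummy start bytes."""

    def __init__(self, size):
        # two dummy bytes at the start and two at the end
        self._ram = bytearray(PAD_BYTES + size + PAD_BYTES)

    def physical(self, offset):
        """Translate a Javelin register offset into an index in the padded bank."""
        return offset + PAD_BYTES

    def read_register(self, offset):
        return self._ram[self.physical(offset)]

    def write_register(self, offset, value):
        self._ram[self.physical(offset)] = value & 0xFF

    def read_long(self, offset):
        """Read 4 bytes as one little-endian long; only allowed on a long boundary."""
        addr = self.physical(offset)
        if addr % 4:
            raise ValueError("register is not long aligned")
        return int.from_bytes(self._ram[addr:addr + 4], "little")


if __name__ == "__main__":
    bank = PaddedRamBank(256)
    for i, b in enumerate((0x78, 0x56, 0x34, 0x12)):
        bank.write_register(NM_TIMER1 + i, b)
    print(hex(bank.read_long(NM_TIMER1)))
```

The Java side keeps using the original offsets; only the two accessor methods know about the extra 2 bytes.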
I'm posting a demo package that snapshots where I am on the VP COG stuff.
It is not complete, but it does timer/pwm, captures the design requirements,
thinking on further implementation, and generally allows a preview of the
direction I'm going. After more quality time, I can integrate it with JVM.
Jazzed,
Thanks for that first pasm attempt. I took a close look at it and made some changes.
First I defined vpRambank long aligned with 2 dummy startbytes, so nmTimer1 becomes
long aligned. Then I changed the vpAsmTimer routine. You falsely assumed nmTimer1
was stored using Big Endian, but it is stored in Little Endian.
So vpAsmTimer becomes really simple:
vpAsmTimer   'only called when vpType is TIMER
             'note that nmTimer1 is long aligned because vpRambank is long aligned with 2 dummy startbytes
              mov     var1, vpAsmBank           'target bank
              add     var1, #def#nmTimer1       ' + target offset
              rdlong  var2, var1                ' read timer count
              add     var2, #1                  'increment timer
              wrlong  var2, var1                'write new timer count
vpAsmTimer_ret       ret
Note that I also removed the test at the start, because that test is already done in doVPuser.
Now I will look at vpAsmPwm. First thing I noticed is that it won't work if 2 or more PWM's are
installed due to missing variables for multiple PWM's. That's why we need to store running
variables like count and state in unused vpRambank registers that are specific to a PWM instance.
Jazzed,
I played around with the pwm code. By rearranging the highTime and lowTime registers,
we can use rdword and wrword, thereby minimizing main ram accesses.
I have come up with:
vpAsmPwm      'only called when vpType is PWM
              mov     var1, vpAsmBank           'target bank
              add     var1, #def#nmPWMPin       '+ target offset
              rdbyte  var2, var1                ' save pin
              mov     var1, vpAsmBank           'target bank
              add     var1, #def#nmPWMState     '+ target offset
              rdword  var3, var1                ' get state+count
              sub     var3, #256                ' decrement count
              cmp     var3, #2 wc               ' compare with 2 (wc needed, otherwise C is not updated)
       if_nc  jmp     #vpAsmPwmT0               'if > 1 do not load new count and state
              xor     var3, #1                  'invert state
              mov     var1, var3                'get state
              add     var1, var3                'double it
              and     var1, #$FE                'extract state
              add     var1, vpAsmBank           '+ target bank
              add     var1, #def#nmPWMhighLSB   '+ target offset
              rdword  var4, var1                'new count (with lowbyte 0)
              or      var3, var4                'set new count
vpAsmPwmT0
              mov     var1, vpAsmBank
              add     var1, #def#nmPWMState
              wrword  var3, var1                ' remember state+count (value first, then hub address)
              ' pin is already set to output during VP install
              mov     var4, #1                  ' create pinmask in var4
              shl     var4, var2
vpAsmPwm_St   'if state is zero, do zero part, else do one part
              and     var3, #1
              tjz     var3, #vpAsmPwm_Do0
              ' do one
              or      outa, var4                ' or to set
              jmp     #vpAsmPwm_ret
vpAsmPwm_Do0  ' do zero
              andn    outa, var4                ' and with inverted bits
vpAsmPwm_ret  ret
regards peter
Post Edited (Peter Verkaik) : 3/4/2008 11:05:35 AM GMT
@ Peter, Jazzed : I know this is a distraction from the coding you're doing, but is there a list / synopsis of what isn't fully implemented yet across the entire PropJavelin firmware ?
Obviously Virtual Peripherals are still being worked on and the entire PropJavelin conversion to PASM/LMM is incomplete, but I've lost track of what else doesn't work yet.
That's all that needs completion: VP's and converting JVM to PASM (the jvm engine at least),
and some native functions.
Anything regarding the language and source level debugger is operational.
regards peter
Post Edited (Peter Verkaik) : 3/4/2008 12:44:03 PM GMT
@Peter,
Every time you use any variant of rd*/wr* it costs about 0.3us.
Using two rdword costs almost twice as much as using one rdlong.
Having a few instructions before rd*/wr* is free.
Added: 0.3us is minimum on almost back to back transactions.
I understand that, that's why we must move the unused registers around
so we can use longs if possible. We cannot change the defined register
offsets, only the unused register offsets. For the pwm it may be better
to read the parameters lowTime, highTime, count, state in a single rdlong.
But after that the value must be copied and shifted because either
the lowTime or the highTime may become the new count value.
And then still a wrword is necessary.
With four main memory accesses worst case is 4*22 cycles = 88 cycles.
Each VP routine may have up to 694/6 = 115 cycles on average,
important is that the total is less than 694 (694 cycles @80Mhz = 8.68usec).
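For anyone who wants to replay the budget numbers above, here is a small Python sketch. It only redoes the arithmetic quoted in this thread (80 MHz clock, 8.68 usec tick, 22 cycles worst case per hub access, 0.3 us per rd*/wr*); the function names are mine and nothing here is measured on hardware.

```python
CLOCK_HZ = 80_000_000      # 80 MHz system clock, as quoted in the thread
TICK_NS = 8_680            # 8.68 usec VP tick
HUB_WORST_CYCLES = 22      # worst-case hub access figure used in the thread


def ns_to_cycles(nanoseconds, clock_hz=CLOCK_HZ):
    """Whole clock cycles that fit into the given time (integer math, rounds down)."""
    return clock_hz * nanoseconds // 1_000_000_000


def average_budget(total_cycles, vp_count):
    """Average cycles available per VP routine when vp_count routines share a tick."""
    return total_cycles // vp_count


def hub_cost(accesses, cycles_each=HUB_WORST_CYCLES):
    """Worst-case cycles spent in hub accesses alone."""
    return accesses * cycles_each


def fits_in_tick(routine_cycles, tick_cycles):
    """True if the summed routine times stay below the tick budget."""
    return sum(routine_cycles) < tick_cycles


if __name__ == "__main__":
    tick = ns_to_cycles(TICK_NS)
    print(tick, average_budget(tick, 6), hub_cost(4), ns_to_cycles(300))
```

The important check is the last one: the sum over all installed VPs has to stay under the tick, the per-routine average is only a guide.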
vpAsmTimer 'only called when vpType is TIMER
'note that nmTimer1 is long aligned because vpRambank is long aligned with 2 dummy startbytes
mov var1, vpAsmBank 'target bank
add var1, #def#nmTimer1 ' + target offset
rdlong var2, var1 ' read timer count
add var2, #1 'increment timer
wrlong var2, var1 'write new timer count
vpAsmTimer_ret ret
Note that I also removed the test at the start, because that test is already done in doVPuser.
Yes, the test is not necessary. What happened to the latch ?
Now I will look at vpAsmPwm. First thing I noticed is that it won't work if 2 or more PWM's are
installed due to missing variables for multiple PWM's. That's why we need to store running
variables like count and state in unused vpRambank registers that are specific to a PWM instance.
I was so focused on meeting timing, that I forgot about multiple PWM :)
With four main memory accesses worst case is 4*22 cycles = 88 cycles.
Each VP routine may have up to 694/6 = 115 cycles on average,
important is that the total is less than 694 (694 cycles @80Mhz = 8.68usec).
The Timer latch method now simply reads nmTimer1 as long, so there is no danger
nmTimer1 is incremented by the VP cog because of hub mechanism.
Once read, the value is written to nmLatchTimer1 (which is never touched from VP cog).
I also have doubts that it will all fit in a single cog, especially when uarts are added.
So let's forget about uarts right now and focus on timer, adc, dac and pwm.
That should be doable using one cog.
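Why reading nmTimer1 as one long matters for the latch can be shown with a small Python model. This is a sketch under assumptions: the bytearray stands in for hub RAM, and the tick_after parameter is a made-up knob that lets the "VP cog" increment the timer in the middle of a byte-wise read.

```python
def increment_long(ram, base):
    """What the timer VP does each tick: add 1 to a little-endian 32-bit value."""
    value = (int.from_bytes(ram[base:base + 4], "little") + 1) & 0xFFFFFFFF
    ram[base:base + 4] = value.to_bytes(4, "little")


def read_bytewise(ram, base, tick_after=None):
    """Read 4 bytes LSB first; the timer ticks once after tick_after bytes were read."""
    value = 0
    for i in range(4):
        if i == tick_after:
            increment_long(ram, base)
        value |= ram[base + i] << (8 * i)
    return value


def latch(ram, timer_at, latch_at):
    """Copy the timer in one go (models a single rdlong followed by a wrlong)."""
    snapshot = bytes(ram[timer_at:timer_at + 4])
    ram[latch_at:latch_at + 4] = snapshot
    return int.from_bytes(snapshot, "little")


if __name__ == "__main__":
    ram = bytearray(8)
    ram[0] = 0xFF
    print(hex(read_bytewise(ram, 0, tick_after=1)))  # torn value
    print(hex(latch(ram, 0, 4)))                     # consistent snapshot
```

With the timer at $FF and a tick between the first and second byte, the byte-wise read mixes old and new bytes, while the single long copy always sees one consistent count.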
Thomas (member name "kaio") started working on a pasm variation of the jem bytecode engine.
It bears some similarity to work you had previously done ....
This VP stuff could certainly use an experienced hand :)
The minimum time required for 6 timers for example in the current loop/if-else-if
call infrastructure using the minimalist RMW routine for all calls would be 9 to 10us.
Over budget :(
Converting the if-else-if statement in vpDoUser to jump table may help timing.
Looking at the scope: having just one rd* transaction in the timer routine rather
than rd*, modify, wr* actually increases its run-time.
The only way I can see having one cog do all this in time is to define the rambank
in the VP asm section and have the native driver access it accordingly. This way,
asm can do mov & math on variables without costly rd*/wr* transactions.
Having the rambank defined in the cog space rather than hub space would also
eliminate any need for changing anything at all in the standard stamp/core/*.java .
Peter,
You should reconsider the design to allow storing the vpRambank in the VP module.
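The jump table idea mentioned above, as a rough Python sketch. The type codes and handler names are hypothetical and not taken from jvmVirtualPeripheral; the point is only that dispatch becomes one indexed lookup instead of a chain of compares.

```python
# Hypothetical VP type codes, chosen for illustration only.
VP_NONE, VP_TIMER, VP_PWM = 0, 1, 2


def vp_none(bank):
    return ("none", bank)


def vp_timer(bank):
    return ("timer", bank)


def vp_pwm(bank):
    return ("pwm", bank)


def dispatch_if_chain(vp_type, bank):
    """The if-else-if form: later types pay for every failed compare before them."""
    if vp_type == VP_TIMER:
        return vp_timer(bank)
    elif vp_type == VP_PWM:
        return vp_pwm(bank)
    return vp_none(bank)


# The table form: the type code is the index, so every type costs the same.
HANDLERS = [vp_none, vp_timer, vp_pwm]


def dispatch_table(vp_type, bank):
    return HANDLERS[vp_type](bank)


if __name__ == "__main__":
    print(dispatch_table(VP_PWM, 3))
```

Both forms give the same result; the table just makes the cost independent of where a type sits in the chain.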
I'm still trying to get my head around Virtual Peripherals but I think I'm getting there.
It looks to me that the core problem is trying to use data structures suited for the SX48 which aren't suited to the Propeller. The solution to me is to put all VP handling in the Cog along with new VP data structures and redesign how VP works for within the Cog ignoring how the SX48 does it.
Provide a single rdlong interface for the Cog which runs continually until the 8.68us time expires which is used to get data from the foreground into the VP data structures and put data out. The foreground native methods are then responsible for converting between the SX48 data structures and the Cog data structures.
The Cog VP needs to be extremely lightweight; "In <N> ticks, do <something>" where <something> would be set or clear a pin for PWM or Serial TX. Serial RX shouldn't be too much more complicated. Rather than have a single handler per VP-type, extend VP-type to also include state, thus a PWM VP is either in an "In <N> ticks, set pin <P>", or "In <N> ticks, Clear Pin <P>". Serial TX is pretty much the same, except a state which does nothing until some data to send and a bit count which decrements to set it back to idle state. And so on.
I don't know enough about sigma-delta ADC to say how that would be done. To me it's the most complicated, the rest seem easy from a 'within the Cog' perspective.
The above is thinking out loud rather than 'the solution' but I think it's going to need a complete break from how the SX48 does it; turn it round from how do we implement what's needed to duplicate the SX48 way of things to let's produce a VP solution then work out how to fit the rest of what we have with that.
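A rough Python sketch of the "In <N> ticks, do <something>" idea, to show how little the Cog side would have to do per tick. Everything here is an assumption for illustration: the VpSlot fields, the "set"/"clear" action names and the outa variable standing in for the pin register.

```python
from dataclasses import dataclass


@dataclass
class VpSlot:
    """One pending action: in ticks_left ticks, set or clear pin."""
    ticks_left: int
    pin: int
    action: str       # "set" or "clear"; the state is part of the VP entry
    high_ticks: int   # how long the pin stays set
    low_ticks: int    # how long the pin stays cleared


def tick(slots, outa):
    """Run once per 8.68 us tick; returns the new pin register value."""
    for slot in slots:
        slot.ticks_left -= 1
        if slot.ticks_left > 0:
            continue
        if slot.action == "set":
            outa |= 1 << slot.pin
            slot.action, slot.ticks_left = "clear", slot.high_ticks
        else:
            outa &= ~(1 << slot.pin)
            slot.action, slot.ticks_left = "set", slot.low_ticks
    return outa


if __name__ == "__main__":
    slots = [VpSlot(ticks_left=1, pin=3, action="set", high_ticks=2, low_ticks=1)]
    outa = 0
    for _ in range(4):
        outa = tick(slots, outa)
        print(bin(outa))
```

A PWM is then just a slot that flips between the two actions; Serial TX would add an idle action and a bit counter along the same lines.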
Not really self-modifying beyond having to overcome lack of indirect addressing in PASM. I left the cycling through the VP's to try and keep the idea clean. It's actually not that much different to how it is in Spin looking back on it - just optimising to keep execution time down and have data ready to use when it executes.
Taking that to extremes, to speed up execution, one idea would be to embed the byte data for each VP as actual code ...
Here is an update of my work on the translation to PASM. It now uses two cogs for byte code handling. The second cog (jvmBytecodeHandler2) currently contains only the arithmetic functions and is not tested yet. The first cog (jvmBytecodeHandler1) is tested up to Jem_IFGE, but Jem_IFGE and Jem_IFLE don't work properly at all.
You can see in jvmBytecodeHandler1 how I want to implement the handling of native functions. The handling can be shared among other cogs, since there is no space to do it all in one of the byte code handlers. I have implemented the isCarry function in jvmBytecodeHandler2 because the arithmetic functions are there too.
@Peter
Can you please run some small Java tests with this PASM version? I don't have the time for that, and I have trouble saving the JVM to my EEPROM.
Tomorrow I will continue testing, and I think I can finish the test of jvmBytecodeHandler1. Then we could run Java programs that use methods.
Comments
I have been renaming xyz_org or core_org folder in this case prior to updating
Thanks.
ron
I updated the jvmVirtualPeripheral by moving the pin conversion code to the
VP initialization routines. This reduces the VP code itself and saves cycles.
I also moved the timer latch code into the timer VP code. The latch method
now only sets a byte to the passed bank value, and then waits for
the timer VP code to clear that byte. At that point the current timer count
is latched and valid.
I think it is sufficient for working VP code to have the PRI methods of
jvmVirtualPeripheral converted to pasm. The VP code only depends on
the 8.68 usec tick, which is generated in method isr. And since isr runs
in its own cog, the jvm should work, although slower than wanted,
while the jvm is not converted to pasm.
regards peter
I'm currently working on the second part of the byte codes. I notice that the carry variable in jvmData is set depending on the result of some math calculations, but its state is never checked anywhere, or have I just not found it yet?
Thomas
(native function 14).
regards peter
Your Wiki addition "Fast-Track for PropJavelin" is great! But perhaps you could add a bit at the start about what PropJavelin is and how it works. Or at least a link to the pertinent "high-level" description for this application in the "JVM for Prop" thread (that is getting very deep indeed!)
It will help Newbies to this Topic understand just what this is about.
Best Regards, David
I've added short explanations, and if anyone feels anything is missing ( there or elsewhere ), don't be afraid to dive in and start adding. No need to register to edit pages. No need to know Wiki markup codes; just type in what's useful and someone will pretty it up later.
propeller.wikispaces.com/Programming+in+Java
As PropJavelin has progressed so well, a lot on that page could be moved to a sub-page so it's primarily for what we do have, not what we might have.
Hmm "PropJavelin" - bit of a mouthful, how about "JProp" ?
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
jazzed ... about living in http://en.wikipedia.org/wiki/Silicon_Valley
Traffic is slow at times, but Parallax orders always get here fast 8)
I'm not that keen on it either, doesn't really roll off the tongue.
"JavaProp" has a ring to it, but I suspect Java(TM) brings issues, hence why we have "Javelin Stamp" not "Java Stamp" already.
"JavelinProp" is perhaps better ? Whatever people want, I'll update all the pages. I think it's only courtesy though to give Peter final say or some veto on choices.
@Hippy,
You're right about the IP issues. JavelinProp represents the idea well ... one wonders if Parallax will let its lawyer out of the closet though :)
@Peter,
I'm following your PUB ISR model to build and test ASM functions you have listed as private in the (ever changing :) jvmVirtualPeripheral.spin file.
I'm finding it hard to make timing with registers on these "short" or 2 byte word address boundaries ... doing an endian swap and rdbyte/wrbyte with the timer is a 5us proposition (4us if you only care if timer needs reset). I've implemented a word wide routine that takes 2.5us, but that gives little margin for anything else considering the other vp activity that needs to happen. Using long access, this time could be shorter, but that would require changing some addresses ... one could presume that with a special propeller core library, such issues would be less troublesome :) and other optimizations could be made.
as would propJavelin.
I have been reading up on the hub timing and this is a real spoiler.
Here is the Timer VP code in spin:
  t32.byte[0] := byte[vpRambank + vpBank + nmTimer1]
  t32.byte[1] := byte[vpRambank + vpBank + nmTimer2]
  t32.byte[2] := byte[vpRambank + vpBank + nmTimer3]
  t32.byte[3] := byte[vpRambank + vpBank + nmTimer4]
  t32++
  byte[vpRambank + vpBank + nmTimer1] := t32.byte[0]
  byte[vpRambank + vpBank + nmTimer2] := t32.byte[1]
  byte[vpRambank + vpBank + nmTimer3] := t32.byte[2]
  byte[vpRambank + vpBank + nmTimer4] := t32.byte[3]
  if tLatch <> 0
    byte[vpRambank + tLatch + nmLatchTimer1] := t32.byte[0]
    byte[vpRambank + tLatch + nmLatchTimer2] := t32.byte[1]
    byte[vpRambank + tLatch + nmLatchTimer3] := t32.byte[2]
    byte[vpRambank + tLatch + nmLatchTimer4] := t32.byte[3]
    tLatch := 0
Assuming this translates into rdbyte and wrbyte instructions,
it would take at least 15*16cycles = 240 cycles where
it could take 6*16cycles = 96 cycles if nmTimer and nmLatchTimer
were long aligned. Luckily only the Timer has multibyte properties,
all the other VP only have byte parameters as far as the VP code is concerned.
The Timer VP code could be written as
  t32.word[0] := word[vpRambank + vpBank + nmTimer1]
  t32.word[1] := word[vpRambank + vpBank + nmTimer3]
  t32++
  word[vpRambank + vpBank + nmTimer1] := t32.word[0]
  word[vpRambank + vpBank + nmTimer3] := t32.word[1]
  if tLatch <> 0
    word[vpRambank + tLatch + nmLatchTimer1] := t32.word[0]
    word[vpRambank + tLatch + nmLatchTimer3] := t32.word[1]
    tLatch := 0
which reduces timing to 9*16cycles = 144 cycles.
The nmTimer and nmLatchTimer registers are stored as Little Endian, just as prop longs.
Now if the properties were long aligned the code could become
  t32 := long[vpRambank + vpBank + nmTimer1]
  t32++
  long[vpRambank + vpBank + nmTimer1] := t32
  if tLatch <> 0
    long[vpRambank + tLatch + nmLatchTimer1] := t32
    tLatch := 0
which reduces timing to 6*16cycles = 96 cycles.
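The three Spin variants above all compute the same thing, which a few lines of Python can confirm. This is only a model: the bytearray stands in for vpRambank, the helper names are mine, and the access counts in the table are simply the figures quoted in the post (15, 9 and 6 accesses at 16 cycles each).

```python
CYCLES_PER_ACCESS = 16
ACCESS_COUNTS = {"byte": 15, "word": 9, "long": 6}   # figures quoted in the post


def read_timer_bytewise(ram, base):
    """Four byte reads, least significant byte first (little endian)."""
    return (ram[base]
            | ram[base + 1] << 8
            | ram[base + 2] << 16
            | ram[base + 3] << 24)


def read_timer_long(ram, base):
    """One 32-bit little-endian read, the way a Propeller long is laid out."""
    return int.from_bytes(ram[base:base + 4], "little")


def increment_timer_long(ram, base):
    """Read, increment (with 32-bit wrap) and write back as one long."""
    value = (read_timer_long(ram, base) + 1) & 0xFFFFFFFF
    ram[base:base + 4] = value.to_bytes(4, "little")
    return value


if __name__ == "__main__":
    ram = bytearray([0xFF, 0xFF, 0x00, 0x00])
    print(hex(read_timer_bytewise(ram, 0)), hex(read_timer_long(ram, 0)))
    print(hex(increment_timer_long(ram, 0)))
    for width, count in ACCESS_COUNTS.items():
        print(width, count * CYCLES_PER_ACCESS, "cycles")
```

Because the registers are already little endian, the long access needs no byte swapping at all; the only requirement is the alignment.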
I think we can make this happen by long aligning the vpRambank and adding 2 dummy bytes
at its beginning. nmTimer1 then starts at offset 4 instead of 2. This can be accounted for
in the readRegister and writeRegister methods and address calculations in VP code.
regards peter
Post Edited (Peter Verkaik) : 3/3/2008 6:05:50 AM GMT
I made vpRambank long aligned and so there are two dummy bytes at the start and
two dummy bytes at the end but it makes the 32bit timerregisters long aligned,
so the timer VP code is reduced to incrementing one long. All the parameters for
other VP code are bytes and so their addresses simply increase by 2.
The readRegister and writeRegister are updated to account for the additional two bytes.
Attached are the updated jvmData and jvmVirtualPeripheral.
regards peter
I'm posting a demo package that snapshots where I am on the VP COG stuff.
It is not complete, but it does timer/pwm, captures the design requirements,
thinking on further implementation, and generally allows a preview of the
direction I'm going. After more quality time, I can integrate it with JVM.
Cheers.
Thanks for that first pasm attempt. I took a close look at it and made some changes.
First I defined vpRambank long aligned with 2 dummy startbytes, so nmTimer1 becomes
long aligned. Then I changed the vpAsmTimer routine. You falsely assumed nmTimer1
was stored using Big Endian, but it is stored in Little Endian.
So vpAsmTimer becomes really simple:
vpAsmTimer   'only called when vpType is TIMER
             'note that nmTimer1 is long aligned because vpRambank is long aligned with 2 dummy startbytes
              mov     var1, vpAsmBank           'target bank
              add     var1, #def#nmTimer1       ' + target offset
              rdlong  var2, var1                ' read timer count
              add     var2, #1                  'increment timer
              wrlong  var2, var1                'write new timer count
vpAsmTimer_ret       ret
Note that I also removed the test at the start, because that test is already done in doVPuser.
Now I will look at vpAsmPwm. First thing I noticed is that it won't work if 2 or more PWM's are
installed due to missing variables for multiple PWM's. That's why we need to store running
variables like count and state in unused vpRambank registers that are specific to a PWM instance.
regards peter
I played around with the pwm code. By rearranging the highTime and lowTime registers,
we can use rdword and wrword, thereby minimizing main ram accesses.
I have come up with:
vpAsmPwm      'only called when vpType is PWM
              mov     var1, vpAsmBank           'target bank
              add     var1, #def#nmPWMPin       '+ target offset
              rdbyte  var2, var1                ' save pin
              mov     var1, vpAsmBank           'target bank
              add     var1, #def#nmPWMState     '+ target offset
              rdword  var3, var1                ' get state+count
              sub     var3, #256                ' decrement count
              cmp     var3, #2 wc               ' compare with 2 (wc needed, otherwise C is not updated)
       if_nc  jmp     #vpAsmPwmT0               'if > 1 do not load new count and state
              xor     var3, #1                  'invert state
              mov     var1, var3                'get state
              add     var1, var3                'double it
              and     var1, #$FE                'extract state
              add     var1, vpAsmBank           '+ target bank
              add     var1, #def#nmPWMhighLSB   '+ target offset
              rdword  var4, var1                'new count (with lowbyte 0)
              or      var3, var4                'set new count
vpAsmPwmT0
              mov     var1, vpAsmBank
              add     var1, #def#nmPWMState
              wrword  var3, var1                ' remember state+count (value first, then hub address)
              ' pin is already set to output during VP install
              mov     var4, #1                  ' create pinmask in var4
              shl     var4, var2
vpAsmPwm_St   'if state is zero, do zero part, else do one part
              and     var3, #1
              tjz     var3, #vpAsmPwm_Do0
              ' do one
              or      outa, var4                ' or to set
              jmp     #vpAsmPwm_ret
vpAsmPwm_Do0  ' do zero
              andn    outa, var4                ' and with inverted bits
vpAsmPwm_ret  ret
regards peter
Post Edited (Peter Verkaik) : 3/4/2008 11:05:35 AM GMT
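@ Peter, Jazzed : I know this is a distraction from the coding you're doing, but is there a list / synopsis of what isn't fully implemented yet across the entire PropJavelin firmware ?
Obviously Virtual Peripherals are still being worked on and the entire PropJavelin conversion to PASM/LMM is incomplete, but I've lost track of what else doesn't work yet.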
That's all that needs completion: VP's and converting JVM to PASM (the jvm engine at least),
and some native functions.
Anything regarding the language and source level debugger is operational.
regards peter
Post Edited (Peter Verkaik) : 3/4/2008 12:44:03 PM GMT
Every time you use any variant of rd*/wr* it costs about 0.3us.
Using two rdword costs almost twice as much as using one rdlong.
Having a few instructions before rd*/wr* is free.
Added: 0.3us is minimum on almost back to back transactions.
Post Edited (jazzed) : 3/4/2008 2:29:37 PM GMT
I understand that, that's why we must move the unused registers around
so we can use longs if possible. We cannot change the defined register
offsets, only the unused register offsets. For the pwm it may be better
to read the parameters lowTime, highTime, count, state in a single rdlong.
But after that the value must be copied and shifted because either
the lowTime or the highTime may become the new count value.
And then still a wrword is necessary.
With four main memory accesses worst case is 4*22 cycles = 88 cycles.
Each VP routine may have up to 694/6 = 115 cycles on average,
important is that the total is less than 694 (694 cycles @80Mhz = 8.68usec).
regards peter
Yes, the test is not necessary. What happened to the latch ?
I was so focused on meeting timing, that I forgot about multiple PWM :)
This appears impossible to do in one cog.
Post Edited (jazzed) : 3/4/2008 2:45:37 PM GMT
The Timer latch method now simply reads nmTimer1 as long, so there is no danger
nmTimer1 is incremented by the VP cog because of hub mechanism.
Once read, the value is written to nmLatchTimer1 (which is never touched from VP cog).
I also have doubts that it will all fit in a single cog, especially when uarts are added.
So let's forget about uarts right now and focus on timer, adc, dac and pwm.
That should be doable using one cog.
regards peter
Thomas (member name "kaio") started working on a pasm variation of the jem bytecode engine.
It bears some similarity to work you had previously done ....
This VP stuff could certainly use an experienced hand :)
I rewrote the pwm using rdlong. I count a total of 157 cycles.
Who can optimize this? I am sure it can be.
{
· 'file stamp\core\PWM.java (offsets within a VP bank)
· nmPWMPort···· = $0E· '// 0 for PortA, 1 for PortB (reserved: 2 for PortC, 3 for PortD)
· nmPWMPin····· = $0F· '// bitmask (1 of 8 set)
· 'not defined in java sources but we can use unused registers
· 'Peter's defines
· 'read parameters as long
· nmPWMState··· = $06· 'state=0 or state=1 (long aligned)
· nmPWMCount··· = $07· 'subtract 256 from long to decrement count
· nmPWMhighTime = $08
· nmPWMlowTime· = $09
· nmPWMhighZero = $0A 'not used
· nmPWMlowZero· = $0B 'not used
}
vpAsmPwm      'only called when vpType is PWM
              mov     var1, vpAsmBank           '4   target bank
              add     var1, #def#nmPWMPin       '4   + target offset
              rdbyte  var2, var1                '22  save pin
              mov     var1, vpAsmBank           '4   target bank
              add     var1, #def#nmPWMState     '4   + target offset
                                                '1
              rdlong  var3, var1                '7   get state+count+highTime+lowTime
              sub     var3, #256                '4   decrement count
              mov     var4, var3                '4
              shr     var4, #8                  '4
              and     var4, #$FF                '4
              tjnz    var4, #vpAsmPwmT1         '4   if count > 0 do not load new count and state
              xor     var3, #1                  '4   invert state
              mov     var4, var3                '4
              shr     var4, #16                 '4   get highTime in lowbyte
              mov     var1, var3                '4
              and     var1, #1                  '4
              tjnz    var1, #vpAsmPwmT0         '4   new state is 1: keep highTime
              shr     var4, #8                  '4   get lowTime in lowbyte
vpAsmPwmT0
              and     var4, #$FF                '4   extract new count
              shl     var4, #8                  '4   move into place
              or      var3, var4                '4   set new count
vpAsmPwmT1
              mov     var1, vpAsmBank           '4
              add     var1, #def#nmPWMState     '4
              ' pin is already set to output
              mov     var4, #1                  '4   create pinmask in var4
              shl     var4, var2                '4
                                                '4
              wrword  var3, var1                '7   remember state+count (value first, then hub address; do not touch highTime+lowTime)
              'if state is zero, do zero part, else do one part
              and     var3, #1                  '4   state?
              tjz     var3, #vpAsmPwm_Do0       '4
              ' do one
              or      outa, var4                '4   or to set
              jmp     #vpAsmPwm_ret             '4
vpAsmPwm_Do0  ' do zero
              andn    outa, var4                '4   and with inverted bits
vpAsmPwm_ret  ret                               '4
                                                'total worst case is 157
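To make the packed-long bookkeeping easier to follow, here is a small host-side model in Python. It is not PASM and not part of the JVM sources. It assumes the layout the listing implies (state in byte 0, count in byte 1, highTime in byte 2, lowTime in byte 3) and that only the low word is written back. The function names are made up for illustration.

```python
def pack(state, count, high_time, low_time):
    """Build the 32-bit value that one rdlong of the PWM registers would return (assumed layout)."""
    return (low_time << 24) | (high_time << 16) | (count << 8) | state


def pwm_tick(packed):
    """Advance the PWM VP by one tick and return the updated packed long."""
    work = (packed - 0x100) & 0xFFFFFFFF      # decrement count (byte 1)
    if ((work >> 8) & 0xFF) == 0:             # count expired
        work ^= 1                             # invert state
        # new state 1 reloads from highTime, new state 0 from lowTime
        reload = (work >> 16) if (work & 1) else (work >> 24)
        work |= (reload & 0xFF) << 8
    # only the low word (state+count) is written back; times stay untouched
    return (packed & 0xFFFF0000) | (work & 0xFFFF)


def simulate(packed, ticks):
    """Return the pin level (the state bit) after each of `ticks` ticks."""
    levels = []
    for _ in range(ticks):
        packed = pwm_tick(packed)
        levels.append(packed & 1)
    return levels
```

With highTime = 3 and lowTime = 5, and starting from state 0 with count 1, the model gives three high ticks followed by five low ticks, repeating.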
regards peter
call infrastructure using the minimalist RMW routine for all calls would be 9 to 10us.
Over budget :(
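For a sense of scale, here is a rough budget check in Python. It assumes an 80 MHz system clock, which the thread does not state, and it takes the 157-clock worst case from the PWM listing and the six VP slots mentioned elsewhere in this thread. It only covers the case where every slot runs the PWM worst case, so it is not the same calculation as the 9 to 10us figure.

```python
TICK_US = 8.68           # VP tick length from the thread
CLOCK_MHZ = 80           # assumption: 80 MHz system clock
PWM_WORST_CLOCKS = 157   # worst case quoted in the vpAsmPwm listing
VP_SLOTS = 6             # slot count mentioned in the thread

budget_clocks = int(TICK_US * CLOCK_MHZ)        # clocks available per tick
needed_clocks = PWM_WORST_CLOCKS * VP_SLOTS     # all slots running PWM worst case
needed_us = needed_clocks / CLOCK_MHZ

print(f"budget {budget_clocks} clocks, needed {needed_clocks} clocks ({needed_us:.2f} us)")
```

Under those assumptions the six slots alone need more clocks than one tick provides, before any call overhead is counted.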
Converting the if-else-if statement in vpDoUser to a jump table may help timing.
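As a rough illustration of the jump-table idea, here is a Python sketch of table dispatch. The type codes and handler names are hypothetical; only the name vpDoUser comes from the existing code. The point is that the cost of reaching a handler no longer depends on how far down the if-else-if chain its type sits.

```python
# Hypothetical VP type codes, used only for this sketch.
VP_NONE, VP_PWM, VP_TIMER, VP_UART = range(4)


def do_none(bank):
    return "idle"


def do_pwm(bank):
    return "pwm"


def do_timer(bank):
    return "timer"


def do_uart(bank):
    return "uart"


# The table index is the VP type, so dispatch is one lookup plus one call.
HANDLERS = [do_none, do_pwm, do_timer, do_uart]


def vp_do_user(vp_type, bank):
    """Dispatch to the handler for vp_type without walking a chain of comparisons."""
    return HANDLERS[vp_type](bank)
```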
Looking at the scope: having just one rd* transaction in the timer routine rather
than rd*, modify, wr* actually increases its run-time.
The only way I can see having one cog do all this in time is to define the rambank
in the VP asm section and have the native driver access it accordingly. This way,
asm can do mov & math on variables without costly rd*/wr* transactions.
Having the rambank defined in the cog space rather than hub space would also
eliminate any need for changing anything at all in the standard stamp/core/*.java files.
Peter,
You should reconsider the design to allow storing the vpRambank in the VP module.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
jazzed ... about living in http://en.wikipedia.org/wiki/Silicon_Valley
Traffic is slow at times, but Parallax orders always get here fast 8)
It looks to me as if the core problem is trying to use data structures designed for the SX48, which aren't suited to the Propeller. The solution to me is to put all VP handling in the Cog along with new VP data structures, and to redesign how VP works within the Cog, ignoring how the SX48 does it.
Provide a single rdlong interface for the Cog, which runs continually until the 8.68us time expires, and which is used to get data from the foreground into the VP data structures and put data out. The foreground native methods are then responsible for converting between the SX48 data structures and the Cog data structures.
The Cog VP needs to be extremely lightweight: "In <N> ticks, do <something>", where <something> would be set or clear a pin for PWM or Serial TX. Serial RX shouldn't be too much more complicated. Rather than have a single handler per VP-type, extend VP-type to also include state; thus a PWM VP is either in an "In <N> ticks, set pin <P>" state or an "In <N> ticks, clear pin <P>" state. Serial TX is pretty much the same, plus an idle state which does nothing until there is some data to send, and a bit count which decrements to set it back to the idle state. And so on.
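A minimal Python sketch of that "In <N> ticks, do <something>" idea for the PWM case, where the VP-type and the state are folded into one opcode. The Slot fields, opcode names and the outa integer standing in for the output register are all assumptions made for this sketch; nothing here is taken from the existing VP code.

```python
from dataclasses import dataclass

# Hypothetical opcodes: VP type and state combined into one value.
PWM_SET, PWM_CLEAR = "pwm_set", "pwm_clear"


@dataclass
class Slot:
    op: str          # what to do when the countdown reaches zero
    ticks: int       # ticks remaining until the action fires
    pin: int         # pin number
    high_time: int   # ticks to stay high
    low_time: int    # ticks to stay low


def step(slot, outa):
    """Run one tick for one slot and return the updated output register value."""
    slot.ticks -= 1
    if slot.ticks > 0:
        return outa                       # nothing due yet
    if slot.op == PWM_SET:
        outa |= 1 << slot.pin             # set pin, then schedule the clear
        slot.op, slot.ticks = PWM_CLEAR, slot.high_time
    elif slot.op == PWM_CLEAR:
        outa &= ~(1 << slot.pin)          # clear pin, then schedule the set
        slot.op, slot.ticks = PWM_SET, slot.low_time
    return outa


def run(slot, ticks):
    """Return the level of the slot's pin after each tick."""
    outa, levels = 0, []
    for _ in range(ticks):
        outa = step(slot, outa)
        levels.append((outa >> slot.pin) & 1)
    return levels
```

Each tick costs a decrement and a test for most slots, and the per-type work only happens when a countdown expires.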
I don't know enough about sigma-delta ADC to say how that would be done. To me it's the most complicated, the rest seem easy from a 'within the Cog' perspective.
The above is thinking out loud rather than 'the solution', but I think it's going to need a complete break from how the SX48 does it. Turn it round: instead of asking how we implement what's needed to duplicate the SX48 way of things, let's produce a VP solution and then work out how to fit the rest of what we have around it.
how this loops across 6 vpslots.
regards peter
taking that to extremes, to speed up execution, one idea would be to embed the byte data for each VP as actual code ...
Here is an update of my work translating to PASM. It now uses two cogs for byte code handling. The second cog (jvmBytecodeHandler2) currently contains only the arithmetic functions and is not tested yet. The first cog (jvmBytecodeHandler1) is tested up to Jem_IFGE, but Jem_IFGE and Jem_IFLE do not work properly at all.
You can see in jvmBytecodeHandler1 how I want to implement the handling of native functions. The handling can be shared among other cogs, since there is not enough space to do it all in one of the byte code handlers. I have implemented the isCarry function in jvmBytecodeHandler2 because the arithmetic functions are there as well.
@Peter
Can you please run some small Java tests with this PASM version? I don't have the time for that, and I have trouble saving the JVM to my EEPROM.
Tomorrow I will continue testing, and I think I can finish testing jvmBytecodeHandler1. Then we could run Java programs that use methods.
Thomas