Hi. I have a UART receive tested as working at 1200, 2400, 9600, and 19.2K baud.
Rates like 600 and 4800 probably work too, but I have not tested them.
Higher bauds are more problematic with this current loop polling design;
receive works for 99% of incoming bytes on 38.4K and 57.6K.
The only ways I can think of to get higher receive rates working 100% of
the time are 1) use a dedicated cog for each UART RX VP (simple, but resource
limited), 2) increase the sample rate (moderate to impossible difficulty), or
3) use an extra cog for gathering multiplexed bitstreams based on start-bit
detection and per-channel baud rate expectation.
Interrupts would be great here; alas, there is little point bemoaning their absence.
Jazzed,
With a tick of 8.68usec and a divisor of 2 for 57600 baud, and 3 for 38400 baud,
receiving at these baudrates is also problematic on the javelin. For reliable receive
the divisor should be >= 4, which equals 28800 baud or lower.
For transmit, there is no problem using 38400 and 57600 baud.
Higher baud rates should be considered an enhancement, so they are probably
best treated separately, using separate cogs as you suggested.
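The divisor arithmetic above can be checked with a short sketch. It assumes only the 8.68 usec tick and the divisor values quoted in the post; the function names are made up for illustration and are not part of the JVM sources.

```python
# Sketch: bit rate obtained from a fixed sample tick and an integer divisor.
# TICK_SECONDS and MIN_RELIABLE_DIVISOR come from the post; the helper names
# are hypothetical.
TICK_SECONDS = 8.68e-6      # one VP tick
MIN_RELIABLE_DIVISOR = 4    # "for reliable receive the divisor should be >= 4"


def baud_for_divisor(divisor: int) -> float:
    """Return the bit rate when one bit lasts `divisor` ticks."""
    return 1.0 / (TICK_SECONDS * divisor)


def receive_is_reliable(divisor: int) -> bool:
    """A bit sampled on fewer than 4 ticks leaves little margin for receive."""
    return divisor >= MIN_RELIABLE_DIVISOR


if __name__ == "__main__":
    for divisor in (2, 3, 4, 6, 12):
        rate = round(baud_for_divisor(divisor))
        print(divisor, rate, "ok" if receive_is_reliable(divisor) else "marginal")
```

The computed rates land within a few baud of the nominal 57600, 38400 and 28800 figures, because 8.68 usec is itself a rounded value.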
I streamlined my UART transmit VP code. I now count about 170 cycles, and that includes
the buffer handling. Note that this is not really tested because my current setup
does not include an RS232 level shifter.
If you look at the nm_core_Uart_txInit() and nm_core_Uart_rxInit() methods,
you will notice I changed some parameters from their intended use as indicated
by their names. (I store the buffer address directly rather than the object reference,
I use little-endian storage format, and I store totalbits instead of just stopbits.)
All this leads to less VP code.
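To make the parameter changes concrete, here is a rough sketch of what "little-endian storage" and "totalbits" could mean. The 16-bit address width and the formula start + data + stop are assumptions made for illustration; the actual layout used by nm_core_Uart_txInit() and nm_core_Uart_rxInit() is not shown in this thread.

```python
import struct


def pack_address_le(address: int) -> bytes:
    """Store a 16-bit buffer address low byte first (assumed width)."""
    return struct.pack("<H", address)


def total_bits(data_bits: int, stop_bits: int, start_bits: int = 1) -> int:
    """Assumed meaning of 'totalbits': every bit time in one UART frame."""
    return start_bits + data_bits + stop_bits


if __name__ == "__main__":
    print(pack_address_le(0x1234).hex())   # low byte comes first
    print(total_bits(data_bits=8, stop_bits=1))
```

Precomputing values like these on the Java/Spin side means the VP loop does not have to do the byte swapping or the addition itself, which fits the remark that it leads to less VP code.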
I have been thinking of removing the buffer handling from the VP code to put it
in its own cog. We probably need that buffer handling code also for enhanced VP code,
like I2C slave or whatever. It also will make the uart VP code smaller.
Any comments on this idea?
Not sure I understand the need to move buffer handling or how it will be handled in a separate cog. I'll wait for an example.
If we can use 3 cogs to approximate the Javelin VP's with its known deficiencies (having some delays between 38k4 and 57k6 streams fixes most issues), that would leave one cog for a "dasilva special" which could be developed as the framework for new VP's, or better variants of existing ones that could easily replace what we have. With Prop-II this would be more attractive, allowing the user to define whatever they want.
I'm addressing some issues with my current design and will post when possible.
Peter,
Here is the latest jvmVpCore that includes UartRx and a version of your ADC routine. The UartRx does not have a backpressure mechanism yet; I'll defer that for a while. I added the flow control stuff to UartTx. This file supports two VP's per cog/object with bytes and time to spare. I'll do some integration later.
Jazzed, Thomas,
Attached is my latest version with a separate cog for the VP code buffer handling.
I count a maximum total of 684 cycles so it is a tight fit (8.68usec equals 694 cycles).
This maximum occurs when using 3 uarts in one cog.
The buffer handling is in jvmVpQueueu and is in Spin, so it probably limits data throughput.
I have not really tested this yet because of a missing RS232 level shifter chip. I have
these on order, so I should be able to test it fully in a few days.
The buffer handling code is started in a new cog from jvmVirtualPeripheral.
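The "tight fit" can be sketched as a cycle budget. This assumes an 80 MHz system clock (consistent with 694 cycles in 8.68 usec); the helper names are hypothetical.

```python
# Sketch of the per-tick cycle budget, assuming an 80 MHz system clock.
CLOCK_HZ = 80_000_000
TICK_SECONDS = 8.68e-6


def cycles_per_tick(clock_hz: int = CLOCK_HZ, tick_seconds: float = TICK_SECONDS) -> int:
    """Whole clock cycles available in one VP tick."""
    return int(clock_hz * tick_seconds)


def headroom(used_cycles: int) -> int:
    """Cycles left over after the VP code has run (negative means overrun)."""
    return cycles_per_tick() - used_cycles


if __name__ == "__main__":
    print(cycles_per_tick())   # budget per tick
    print(headroom(684))       # worst case quoted for 3 uarts in one cog
```

With the quoted worst case of 684 cycles, only about 10 cycles remain per tick, so any extra instruction in the loop risks missing the tick.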
Hi Peter.
I tried your code. I could not get past initialization on any code using your VP.
I've been testing my design, and timer, DAC, PWM, and ADC function for 6 VPs.
ADC reading is backwards. UARTs are some trouble. I can see bytes coming
into the RX buffer, but Java doesn't get them. I see Java trying to write to the TX buffer,
but the bytes do not get to the location specified. This behaviour has me baffled.
Will look at it all again tomorrow when I'm fresh.
Ok Peter.
Attached is jvm with mostly working VP. The writeObject native function needed to add the JavaProg base address to the buffer so that VPs can share Java buffers. There are still some issues (see list below), but I wanted to publish what I have.
Caveat Emptor:
1. Important!!! It is very likely that you need to change the jvmVpCore.spin _xinfreq to 10_000_000 for the Spin Stamp; otherwise, this VP will fail. I have not implemented a full strategy for dealing with the different base _xinfreq.
2. Uart RX buffer overflow handling has an issue near head/tail 255. Receiving the same character repeatedly is OK, but a string like "Hello World" will fail after a while.
3. Number of Uart stop bits is not yet implemented.
4. ADC returns a value inversely proportional to the voltage being measured.
Other than the caveats, you should be able to start and run 6 VPs with this package.
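For readers wondering what an issue "near head/tail 255" looks like, below is a minimal sketch of a 256-byte ring buffer with 8-bit head and tail indices. It is not the jvmVpCore code; the class and method names are invented, and it only illustrates the wraparound that caveat 2 refers to. Repeating one character can hide an indexing mistake at the wrap point, while a varied string such as "Hello World" exposes it.

```python
class ByteRing:
    """Hypothetical 256-byte ring buffer; indices wrap from 255 back to 0."""

    SIZE = 256

    def __init__(self) -> None:
        self.buf = bytearray(self.SIZE)
        self.head = 0  # index of the next byte to write
        self.tail = 0  # index of the next byte to read

    def put(self, byte: int) -> bool:
        """Store one byte; return False (drop it) when the buffer is full."""
        next_head = (self.head + 1) & 0xFF   # wrap 255 -> 0
        if next_head == self.tail:           # one slot is kept empty
            return False
        self.buf[self.head] = byte
        self.head = next_head
        return True

    def get(self):
        """Return the oldest byte, or None when the buffer is empty."""
        if self.tail == self.head:
            return None
        byte = self.buf[self.tail]
        self.tail = (self.tail + 1) & 0xFF   # wrap 255 -> 0
        return byte


if __name__ == "__main__":
    ring = ByteRing()
    received = bytearray()
    for _ in range(100):                     # crosses the 255 boundary several times
        for b in b"Hello World":
            ring.put(b)
        while True:
            b = ring.get()
            if b is None:
                break
            received.append(b)
    print(received == b"Hello World" * 100)
```

Keeping one slot empty is just one common way to tell "full" from "empty"; a real driver would more likely assert flow control than drop the byte.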
Here is a jvm with VP that fixes the Uart RX overflow issue. I had to move flow control and tail maintenance into the ASM driver (code space is so tight it squeaks when compiling now). I'll look at the _xinfreq time dependency next. Let me know how important data bits and stop bits are for the Uart.
@Thomas, do you mind if I look at your asm bytecode interpreter? Watching the output of Uart receive characters snail across the screen is amusing and pitiful ... don't know whether to laugh or cry.
Added: what I mean is ... Spin is horribly slow; getting a working asm bytecode interpreter going would be excellent, and you have it mostly ready. Guess I'll poke around.
Ok, this will be my last update for VPs for a while unless there is a bug fix or other maintenance issue.
In this package, the VP timer is selected based on the clock configuration. Only 2 have been tested:
pll8x - 10MHz and pll16x - 5MHz. The top file is now jvmMain5MHz and can be changed to a 10MHz
variant easily and saved as you like. Documentation in jvmVpCore is updated to reflect the design.
It is entirely possible to use jvmVpCore.spin and jvmDefines.spin independent of the JVM.
Also, it may be possible to use up to N (8 minus the main program) instances of the object to get
more VP's (14 total?) running at once.
Attachment updated. I had commented out ADC code during last development. Back in now.
I'm back with good news. I got the PASM version working. The last release did also, but there were some other bytecodes used which were not implemented. I'm wondering why bytecodes for arrays are used even when no array variable is used. Btw, I have implemented handling of the required bytecodes and now it is running.
I have tried to implement multiply and divide, but the version which I found in the forum was not working well. I did not invest too much time in this problem, because someone has such routines working and is willing to contribute them to this project. So far, multiply and divide should be used only for positive values.
The implementation of bytecode handling is not complete yet, but nearly. It seems that not all code for the bytecode handlers can be placed in these two cogs. Currently only 26 longs are left in bytehandler 2 and 4 longs in bytehandler 1. Since the mainloop must also be translated to PASM to achieve full speed, I have thought about the further work, which could look like the cog usage table listed further below.
I also did some performance measurements to compare with the Spin version. The average time of bytecode handling in Spin is 385 µs and the current PASM version needs only 12 µs. But the current PASM version cannot achieve that speed yet, since the mainloop is located in Spin. The mainloop overhead is at minimum 152 µs when using doStep instead of doBytecode (without Javelin IDE support) or 240 µs with Javelin IDE support.
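The numbers above can be put side by side in a short sketch. It assumes the Spin mainloop overhead simply adds to the PASM handler time for each bytecode, which the post does not state explicitly; the constant and function names are made up.

```python
# Timings quoted in the post, in microseconds.
SPIN_HANDLER_US = 385
PASM_HANDLER_US = 12
MAINLOOP_US_DOSTEP = 152   # without Javelin IDE support
MAINLOOP_US_IDE = 240      # with Javelin IDE support


def speedup(old_us: float, new_us: float) -> float:
    """How many times faster the new handler is."""
    return old_us / new_us


def per_bytecode_us(handler_us: float, mainloop_us: float) -> float:
    """Assumed total per bytecode: handler time plus Spin mainloop overhead."""
    return handler_us + mainloop_us


if __name__ == "__main__":
    print(round(speedup(SPIN_HANDLER_US, PASM_HANDLER_US)))
    print(per_bytecode_us(PASM_HANDLER_US, MAINLOOP_US_DOSTEP))
    print(per_bytecode_us(PASM_HANDLER_US, MAINLOOP_US_IDE))
```

Under that assumption the handler itself is roughly 32 times faster, but while the mainloop stays in Spin the overhead dominates each step, which is why moving the mainloop to PASM matters.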
Good work so far Thomas and thank you. Yes, much is required to get "Hello World" printed. The arraylist is used for making a copy from the string table, and ldc is also needed as you obviously know by now. Having 6 VP with my working implementation will require 3 cogs; doing 6 VP in two cogs is unlikely. Cog space will end up tight in any event. I assume your 12us measurement was worst case for pure asm version ... about 32 times faster than current spin implementation would be sweet. Still much to do ....
Peter Verkaik said...
...
Using the regular commands will copy the bytecodes into propeller eeprom.
At the moment the JVM provides a 16KByte memoryspace for the bytecodes.
(The javelin has 32KByte memoryspace).
There are still more than 1000 longs free so that space could be cranked up to 20KByte.
I hope that once a lot of the Spin code is converted to PASM we can
crank it up to 24KByte.
regards peter
Is it possible to take advantage of the 64Kb EEPROM on the ProtoBoard?
Not directly. It is however possible when adding new native functions, to
include eeprom read/write functions that access the upper 32KB eeprom.
The same applies to other propeller resources like the counters, all these
resources can be made accessible via native functions.
On using 64K eeprom: If we use a small LMM kernel for debug and other support, we could load & run the comm module straight from eeprom. Even the serial console support code can be run as LMM, but would likely need to be cached. The JVM, Native Handlers, & VPs would remain resident as propeller asm code. Of course the asm JVM needs to be "finished" ....
On finishing the asm JVM: I've spent some time working on asm versions of jvmComm and jvmData spin methods. The comm.doDebug and doBytecode functions will almost consume an entire cog. We need a reasonable strategy to finish the asm design. Having the comm features and data methods like jd.init decoupled from the JVM core is likely to be necessary.
I've looked at providing a "spin-services" module that would combine all of the non-jvm engine code so that ASM could invoke these functions by command and block on their completion for debug mode. In non-debug mode, these functions would not be used and the asm JVM would be free to run at speed. Perhaps the "spin-services" module can use LMM to load and run the debug stuff straight from eeprom in an LMM asm form later.
I am curious where this project currently stands. I didn't read all 13 pages of posts, but I see you guys have something working. Can you give a summary for those that are only slightly familiar with Java? Thank you.
All is working, except for the PASM virtual peripheral code. Kaoi (Thomas)
and Jazzed worked on that and had something going. The problem still is
how to put 6 VP's in at most 2 cogs, preferably one.
I just finished an SX assembly source which shows how the Javelin
(which uses an SX48) could have handled this. I attached it for reference.
You can open it with any ASCII editor.
My attempt shows the sx can handle it in about 200 clockcycles
which comes down to about 180 instructions.
Since a cog can hold 496 instructions I still think it must be possible
to do better than just 2 VP's per cog.
Perhaps this sx source will make it easier to convert to pasm.
Aside from the VP's, the overall speed appears slower than the Javelin,
due to Spin. Perhaps the new C compiler can improve that by converting
the Spin parts to C.
Edit: storage requirement for the SX VP code inside the interrupt routine is 237 words, which is
a lot less than 496 longs.
regards peter
Post Edited (Peter Verkaik) : 7/10/2008 2:15:41 PM GMT
Peter, the last VP code I posted works fine except for the ADC values being 'inverted'. The interface is different from what you expected, but it works. I was considering rewriting parts of the PASM with regard to control parameters, but don't have time. Thomas's last PASM should be fairly easy to integrate, but it takes work. A C code version would never fit in 32K.
Thank you for the responses. Let's take another step back. For someone to use what you all have created, what needs to be installed on the Prop? How are the Java programs written and transferred to the Prop?
Peter,
Excellent work on the port of Java to the Propeller - this is greatly appreciated.
This is my first post on this forum. I started with the Javelin a couple of years ago, worked with the BASIC Stamp, and got the 3 variants of the Propeller recently (mostly as interfaces to multiplexed display boards). As I primarily develop in Java professionally, I kind of missed the 1.2 JRE that was on the Javelin when working with the Propeller. I have downloaded your binaries/source today and will try everything out.
Thank you very much for your hard work and the work of all your fellow committers.
Comments
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
jazzed ... about living in http://en.wikipedia.org/wiki/Silicon_Valley
Traffic is slow at times, but Parallax orders always get here fast 8)
Post Edited (jazzed) : 3/29/2008 1:59:44 AM GMT
Post Edited (jazzed) : 3/29/2008 3:42:04 AM GMT
Proposed cog usage for the PASM version (from Thomas's post above):
COG  Usage
---  -----
1    mainloop, serial driver, bytecode handler 3, native function handling
2    bytecode handler 1
3    bytecode handler 2
4    virtual peripheral 1
5    virtual peripheral 2
Thomas
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Timothy D. Swieter, E.I.
www.brilldea.com - Prop Blade, LED Painter, RGB LEDs, uOLED-IOC
www.sxmicro.com - a blog exploring the SX micro
www.tdswieter.com
http://propeller.wikispaces.com/Fast-Track+for+PropJavelin
regards peter
·· /michael
Please disregard this post, I realized my error. I was using the wrong program!