Multi-core RISCV does sound interesting. I don't know enough about this to see why there's a problem doing it. You are talking about using two P2 cogs to run riscv emulation, right? Hard for me to imagine how they could stomp on each other...
I guess the other question is if the riscv toolchain supports multi-core compilation. I'd suspect it does these days, but the version we are using here seems to be old...
Having trouble figuring out how to change the default P2 Clock Freq...
I see lots of code for setting the clock the old way, using hubset.
But, can't find the place the clock is actually set to 160 MHz.
@Rayman said:
Multi-core RISCV does sound interesting. I don't know enough about this to see why there's a problem doing it. You are talking about using two P2 cogs to run riscv emulation, right? Hard for me to imagine how they could stomp on each other...
When the instructions are translated from RISC-V to P2 the result is stored in cache in HUB memory (and run from there). I guess if each COG had its own completely separate cache that would work OK, but it would be far more efficient for them to share a cache, which requires synchronization. Again, not conceptually difficult, just slightly fiddly, because ideally we'd use a 2 level cache (with the code being cached in HUB but actually executing from LUT so other COGs can't change it while we're running it).
There is a mode right now where everything is only cached in LUT, and that would be very easy to get working with multiple COGs, but it's kind of slow so I'm not sure it's worth it.
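To make the shared-cache idea a bit more concrete, here is a rough host-side sketch in Python. It is only a model of the two-level scheme described above, not how riscvp2 is actually written: the class names, the `translate` callback, and the use of a Python lock in place of whatever the P2 would use are all assumptions made for illustration.

```python
import threading


class SharedTranslationCache:
    """Stand-in for a HUB-resident cache shared by several cogs (hypothetical)."""

    def __init__(self, translate):
        self._translate = translate      # callback: pc -> iterable of translated ops
        self._lock = threading.Lock()    # models the synchronization the post mentions
        self._blocks = {}                # pc -> immutable translated block
        self.translations = 0            # how many times we really translated

    def fetch(self, pc):
        # Only one "cog" at a time may look up or fill the shared cache.
        with self._lock:
            block = self._blocks.get(pc)
            if block is None:
                block = tuple(self._translate(pc))
                self._blocks[pc] = block
                self.translations += 1
            return block


class Cog:
    """One emulated core with a private copy of the blocks it runs (the LUT role)."""

    def __init__(self, shared):
        self._shared = shared
        self._private = {}

    def run_block(self, pc):
        block = self._private.get(pc)
        if block is None:
            # Copy from the shared level so nobody can change it under us.
            block = self._shared.fetch(pc)
            self._private[pc] = block
        return block
```

With two `Cog` objects sharing one `SharedTranslationCache`, a block is translated once and then served to both; the private copy is what each one would actually execute from.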
I guess the other question is if the riscv toolchain supports multi-core compilation. I'd suspect it does these days, but the version we are using here seems to be old...
I don't think the compiler cares, it's only the runtime (C library and such) and the one we're using does support multi-threading.
Having trouble figuring out how to change the default P2 Clock Freq...
I see lots of code for setting the clock the old way, using hubset.
But, can't find the place the clock is actually set to 160 MHz.
It's in the serial initialization code in the riscvp2 translator, in jit/util_serial.spin2. I think you can also patch the clock frequency and mode into the binary at offsets $14 and $18 (the usual spot for everybody except Chip...).
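A small script along these lines could do the patching. This is a sketch, assuming the frequency lives at $14 and the mode at $18 (in that order) as 32-bit little-endian values; the function names are made up here, and you would want to check the layout against your own binary before trusting it.

```python
import struct

FREQ_OFFSET = 0x14  # assumed: clock frequency in Hz, 32-bit little-endian
MODE_OFFSET = 0x18  # assumed: clock mode word, 32-bit little-endian


def patch_clock(image: bytes, freq_hz: int, clkmode: int) -> bytes:
    """Return a copy of image with the clock frequency and mode overwritten."""
    if len(image) < MODE_OFFSET + 4:
        raise ValueError("image too short to hold the clock fields")
    patched = bytearray(image)
    struct.pack_into("<I", patched, FREQ_OFFSET, freq_hz)
    struct.pack_into("<I", patched, MODE_OFFSET, clkmode)
    return bytes(patched)


def read_clock(image: bytes):
    """Return (freq_hz, clkmode) as currently stored in the image."""
    return struct.unpack_from("<II", image, FREQ_OFFSET)
```

Typical use would be to read the whole binary with `open(path, "rb").read()`, call `patch_clock`, and write the result to a new file so the original stays untouched.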
Thanks @ersmith
I did notice the LUT cache and thought that was the default. Sounds like it is not, ok.
I found the code you pointed to in util_serial.spin2:
' set clock to specified speed
' pa == clock mode
' pb == clock frequency
ser_clkset
It'd be convenient if there was a way to call ser_clkset from the C++ code, but I imagine it is not possible.
So, seems like basically need to recompile the riscvp2 toolchain if want a different P2 clock frequency.
Not a big deal, I'm thinking about changing it to 300 MHz so that can change upy video to 1080p...
Attempted to compile Tensorflow Lite Micro hello example with this modified makefile...
It downloaded a toolchain, that I replaced with the P2 one.
It compiled a lot of files, but failed in the end....
@rayman: that's good news. I certainly intended that the riscvp2 change should work with any toolchain that used GNU style linker scripts, but it's always good to see that it actually did work out in practice. Thanks for letting us know!
Ok, it's just weird... To make it work, first have makefile rigged for rv32imafc_zicsr version.
But, that will give an error eventually about not being able to link soft floats with floats or something like that.
Then, change the makefile for the rv32imc version and it works.
But, did it really work? Made a binary that seems to work...
But, maybe I haven’t tried anything floating point yet. Probably just go back to original. Doesn’t really matter to me, was just playing with it…
I think you should be able to use -march=rv32imac_zicsr (what you had first, but without the f) for riscvp2. The docs seem to say that any combination of letters is accepted, but I haven't tried that.
Interesting. I would have tried that but a subfolder with that name doesn't exist for some reason. But, maybe can just use the rv32imac subfolder? Can try that...
Toyed with riscv versions today... 10.2 and 13.2 seem to work in that they can compile and run hello.c. Micropython can compile, but doesn't output anything in the terminal.
Getting that working is over my head. Tried some things but nothing works...
@Rayman said:
Toyed with riscv versions today... 10.2 and 13.2 seem to work in that they can compile and run hello.c. Micropython can compile, but doesn't output anything in the terminal.
Getting that working is over my head. Tried some things but nothing works...
You just need persistence and fix one thing at a time until you run out of things to fix. Easier said than done sometimes, I know.
@ersmith Think I'm seeing that when you build riscvp2, it generates the loading script "riscvp2.ld" that you invoke when calling gcc with the -T option. This installs the P2 JIT engine that is compiled into rvp2.o.
So, to change the default clock frequency, you can just build several versions of rvp2.o, one for each frequency that you want?
(so don't have to make riscvp2 every time you want use a different clock frequency?)
@Rayman said:
@ersmith Think I'm seeing that when you build riscvp2, it generates the loading script "riscvp2.ld" that you invoke when calling gcc with the -T option. This installs the P2 JIT engine that is compiled into rvp2.o.
So, to change the default clock frequency, you can just build several versions of rvp2.o, one for each frequency that you want?
(so don't have to make riscvp2 every time you want use a different clock frequency?)
Yes, I think you could do that (although you'd also want a different loader script like "riscvp2_300Mhz.ld" that selects the appropriate rvp2.o).
But why not just change the clock frequency at run time in your code? Or alternatively, write a script to patch the default frequency into the binary?
@ersmith Let's say we did get tensorflow lite or some other complex and/or single purpose code compiled with this.
Is there a way I could execute it in a second cog using the LUT cache version of ld?
Would it be as simple as coginit() with a pointer to the binary loaded into hub ram?
@Rayman said:
@ersmith Let's say we did get tensorflow lite or some other complex and/or single purpose code compiled with this.
Is there a way I could execute it in a second cog using the LUT cache version of ld?
Would it be as simple as coginit() with a pointer to the binary loaded into hub ram?
I honestly don't know. In theory that sounds like it should work, but it's never been tested and there may well be some gotcha we haven't thought of yet.
@ersmith Trying to use spin2cpp, like it appears you have done in micropython to create C code that can then be compiled by this.
When tried just now, get:
c:\Propeller2\Micropython\SimplestSerial\simplestserial.spin2:44: error: pinf is not a function
c:\Propeller2\Micropython\SimplestSerial\simplestserial.spin2:47: error: pinl is not a function
c:\Propeller2\Micropython\SimplestSerial\simplestserial.spin2:50: error: pinf is not a function
c:\Propeller2\Micropython\SimplestSerial\simplestserial.spin2:53: error: pinl is not a function
c:\Propeller2\Micropython\SimplestSerial\simplestserial.spin2:82: error: pinr is not a function
c:\Propeller2\Micropython\SimplestSerial\simplestserial.spin2:90: error: pinr is not a function
c:\Propeller2\Micropython\SimplestSerial\simplestserial.spin2:92: error: rdpin is not a function
Think I can get around this, but was a little surprised... How can all the stuff in the vga and usb code work but not this?
Did I do something wrong? Or, did you have to get around this somehow too?
@Rayman said:
@ersmith Trying to use spin2cpp, like it appears you have done in micropython to create C code that can then be compiled by this.
When tried just now, get:
c:\Propeller2\Micropython\SimplestSerial\simplestserial.spin2:44: error: pinf is not a function
c:\Propeller2\Micropython\SimplestSerial\simplestserial.spin2:47: error: pinl is not a function
c:\Propeller2\Micropython\SimplestSerial\simplestserial.spin2:50: error: pinf is not a function
c:\Propeller2\Micropython\SimplestSerial\simplestserial.spin2:53: error: pinl is not a function
c:\Propeller2\Micropython\SimplestSerial\simplestserial.spin2:82: error: pinr is not a function
c:\Propeller2\Micropython\SimplestSerial\simplestserial.spin2:90: error: pinr is not a function
c:\Propeller2\Micropython\SimplestSerial\simplestserial.spin2:92: error: rdpin is not a function
Think I can get around this, but was a little surprised... How can all the stuff in the vga and usb code work but not this?
Did I do something wrong? Or, did you have to get around this somehow too?
spin2cpp has bit-rotted in the 5 years since I first did micropython; the P2 version of it doesn't get tested much any more (since flexspin just compiles C code, there's not nearly as much need for spin2cpp). I've fixed those bugs in github (the names of some of the built in functions changed over the years).
@Rayman said:
I think rdpin() might be broke...
Testing with micropython, maybe should try something simpler...
You may need to recompile everything. Make sure you're using a consistent set of propeller2.h and riscvp2; the definition of rdpin() changed slightly recently and the two need to be in sync everywhere.
@ersmith this gets a bit over my head... There's a propeller2.h in /opt/riscv/riscv-none-embed/include that is very different from the one in flexprop.
But, I guess somehow the makefile in riscvp2 build folder creates that h file from the flexprop one?
If that's true, then issue may be have new flexprop and old riscvp2?
Remember anything about how rdpin has changed that might be clue as to how to fix it?
For now, created a custom CSR to do rdpin on the pin I need...
'RJA
pin_read_csr
rdpin pb, #29 'RJA hard coding to P29
ret
@Rayman said:
@ersmith this gets a bit over my head... There's a propeller2.h in /opt/riscv/riscv-none-embed/include that is very different from the one in flexprop.
But, I guess somehow the makefile in riscvp2 build folder creates that h file from the flexprop one?
No, the two propeller2.h files are not related (except that they're supposed to define the same services). The riscvp2 propeller2.h describes how to perform common operations like _pinf and _rdpin with the riscvp2 compiler. The flexspin one tells how to do the same things with the flexspin compiler. The idea is that user code that does #include <propeller2.h> should be able to work the same on both compilers, even though "under the hood" the implementations are very different.
Remember anything about how rdpin has changed that might be clue as to how to fix it?
The rdpin binary encoding has changed around a few times, and the implementation in riscvptrace needs to match what the C code expects from propeller2.h, i.e. if you update riscvp2 and get a new riscvptrace binary, you'll probably need to recompile any files that use propeller2.h. I hope rdpin has finally settled down (I just found and fixed another bug) so that this won't happen again. (The root cause is that I originally defined rdpin to be like wrpin, an S format instruction, but since it actually modifies its output it should have been an I format instruction. The first time this got changed the header and JIT got out of sync and broke; I reverted that, then went back and re-did the change, but messed up that change too.)
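For anyone wondering why the S versus I format matters: in the standard RISC-V base encoding, an I format word has a destination register field (rd) in bits 11:7, while an S format word uses those same bits for the low part of the immediate and has no rd at all. The Python sketch below packs both formats; it uses the ordinary `addi` and `sw` opcodes purely as known examples and does not show riscvp2's actual rdpin encoding, which is not given in this thread.

```python
def encode_i(opcode, rd, funct3, rs1, imm):
    """I format: imm[11:0] | rs1 | funct3 | rd | opcode."""
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


def encode_s(opcode, funct3, rs1, rs2, imm):
    """S format: imm[11:5] | rs2 | rs1 | funct3 | imm[4:0] | opcode (no rd field)."""
    imm &= 0xFFF
    return (((imm >> 5) << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | ((imm & 0x1F) << 7) | opcode)


def bits_11_7(word):
    """Bits 11:7 of a word: rd in I format, imm[4:0] in S format."""
    return (word >> 7) & 0x1F


addi_x1_x0_5 = encode_i(0x13, rd=1, funct3=0, rs1=0, imm=5)    # addi x1, x0, 5
sw_x2_8_x1 = encode_s(0x23, funct3=2, rs1=1, rs2=2, imm=8)     # sw x2, 8(x1)
```

So if the header builds the word one way and the JIT decodes it the other way, bits 11:7 get read as a register on one side and as part of an immediate on the other, which fits the "out of sync" breakage described above.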
I saw you just updated the trace file in the github riscvp2 repo.
Did windiff on that and found the fix you mentioned.
Added that in and seems to work.
I'm attempting to modify util_serial.spin2 to bump clock to 297 MHz. Be interesting to see what breaks
I could dig up @ozpropdev method for calculating mode, but using the attached .spin2 file is easier.
Have to run this in FlexProp though as PropTool gives different answers for reasons I sort of understand, but don't want to think about...
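For reference, the mode word can also be worked out off-chip. The Python sketch below is not @ozpropdev's method or the attached .spin2 file; it assumes the HUBSET clock layout as I understand it from the P2 documentation (%0000_000E_DDDD_DDMM_MMMM_MMMM_PPPP_CCSS), a 20 MHz crystal, and CC=%10 / SS=%11 as used in the common 160 MHz setting. It only looks for an exact match and does not check VCO limits, so treat the result as a starting point.

```python
def pack_clkmode(divd, mult, postdiv, cc=0b10, ss=0b11):
    """Pack PLL settings into a P2 clock mode word (layout is an assumption)."""
    if not 1 <= divd <= 64:
        raise ValueError("crystal divider must be 1..64")
    if not 1 <= mult <= 1024:
        raise ValueError("VCO multiplier must be 1..1024")
    if postdiv == 1:
        pppp = 0b1111                      # %1111 selects VCO / 1
    elif postdiv % 2 == 0 and 2 <= postdiv <= 30:
        pppp = postdiv // 2 - 1            # otherwise VCO / ((PPPP + 1) * 2)
    else:
        raise ValueError("post divider must be 1 or an even value 2..30")
    return ((1 << 24) | ((divd - 1) << 18) | ((mult - 1) << 8)
            | (pppp << 4) | (cc << 2) | ss)


def find_clkmode(target_hz, xtal_hz=20_000_000):
    """Return a mode word hitting target_hz exactly, or None if there is none."""
    for postdiv in (1, *range(2, 31, 2)):
        for divd in range(1, 65):
            mult, rem = divmod(target_hz * postdiv * divd, xtal_hz)
            if rem == 0 and 1 <= mult <= 1024:
                return pack_clkmode(divd, mult, postdiv)
    return None


mode_297 = find_clkmode(297_000_000)       # divides the crystal by 20, multiplies by 297
```

The search prefers the smallest post divider and then the smallest crystal divider; another tool may pick a different divider and multiplier pair for the same frequency and so produce a different mode value.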
@ersmith Been wondering about something... In your Github readme files you seem to be doing things as if logged into root. I seem to have to preface a lot of things with "sudo". Do you really run as logged into root all the time? Or, is there something I'm missing?
Comments
Attempted to compile Tensorflow Lite Micro hello example with this modified makefile...
It downloaded a toolchain, that I replaced with the P2 one.
It compiled a lot of files, but failed in the end....
The error was like this one: https://github.com/tensorflow/tflite-micro/issues/2444
So, seems there's some work to be done to actually make it work...
I think this works with the newer riscv compiler toolchain from here:
https://github.com/xpack-dev-tools/riscv-none-elf-gcc-xpack
Guess one change is that the prefix changed from riscv-none-embed to riscv-none-elf
Used the attached makefile.
Was able to compile hello.c, but probably needs more testing...
Built micropython using new toolchain with this makefile. Appears to work...
@rayman: that's good news. I certainly intended that the riscvp2 change should work with any toolchain that used GNU style linker scripts, but it's always good to see that it actually did work out in practice. Thanks for letting us know!
I might have been wrong about micropython compiling. I think some things were already compiled so it didn't actually recompile them.
After erasing the build folder, now get this:
main.c: Assembler messages:
main.c:61: Error: unrecognized opcode `csrr a5,0xC00', extension `zicsr' required
main.c:61: Error: unrecognized opcode `csrw 0xBC1,a5', extension `zicsr' required
main.c:55: Error: unrecognized opcode `csrr a3,0xC00', extension `zicsr' required
main.c:55: Error: unrecognized opcode `csrw 0xBC1,a3', extension `zicsr' required
main.c:58: Error: unrecognized opcode `csrr a3,0xC00', extension `zicsr' required
main.c:58: Error: unrecognized opcode `csrw 0xBC1,a3', extension `zicsr' required
make: *** [../../py/mkrules.mk:47: build/main.o] Error 1
Did notice versions that include zicsr, guess try one of those...
Ok, it's just weird... To make it work, first have makefile rigged for rv32imafc_zicsr version.
But, that will give an error eventually about not being able to link soft floats with floats or something like that.
Then, change the makefile for the rv32imc version and it works.
But, did it really work? Made a binary that seems to work...
But, maybe I haven’t tried anything floating point yet. Probably just go back to original. Doesn’t really matter to me, was just playing with it…
I think you should be able to use -march=rv32imac_zicsr (what you had first, but without the f) for riscvp2. The docs seem to say that any combination of letters is accepted, but I haven't tried that.
Interesting. I would have tried that but a subfolder with that name doesn't exist for some reason. But, maybe can just use the rv32imac subfolder? Can try that...
Toyed with riscv versions today... 10.2 and 13.2 seem to work in that they can compile and run hello.c. Micropython can compile, but doesn't output anything in the terminal.
Getting that working is over my head. Tried some things but nothing works...
You just need persistence and fix one thing at a time until you run out of things to fix. Easier said than done sometimes, I know.
I'd love to be able to use TF Lite on the P2! I'll try to get some time during the summer to look into this. but I don't even know where to start.
@ersmith Think I'm seeing that when you build riscvp2, it generates the loading script "riscvp2.ld" that you invoke when calling gcc with the -T option. This installs the P2 JIT engine that is compiled into rvp2.o.
So, to change the default clock frequency, you can just build several versions of rvp2.o, one for each frequency that you want?
(so don't have to make riscvp2 every time you want use a different clock frequency?)
Yes, I think you could do that (although you'd also want a different loader script like "riscvp2_300Mhz.ld" that selects the appropriate rvp2.o).
But why not just change the clock frequency at run time in your code? Or alternatively, write a script to patch the default frequency into the binary?
Think the vga driver would break if clkfreq changed at runtime
You could always change the clkfreq before you initialized the vga. In micropython I guess that would be at the start of mp_hal_io_init (uart_core.c).
Think it’s baked into the pixel clock freq, but I’ll look again
@ersmith Let's say we did get tensorflow lite or some other complex and/or single purpose code compiled with this.
Is there a way I could execute it in a second cog using the LUT cache version of ld?
Would it be as simple as coginit() with a pointer to the binary loaded into hub ram?
I honestly don't know. In theory that sounds like it should work, but it's never been tested and there may well be some gotcha we haven't thought of yet.
@ersmith Trying to use spin2cpp, like it appears you have done in micropython to create C code that can then be compiled by this.
When tried just now, get:
Think I can get around this, but was a little surprised... How can all the stuff in the vga and usb code work but not this?
Did I do something wrong? Or, did you have to get around this somehow too?
I think rdpin() might be broke...
Testing with micropython, maybe should try something simpler...
Will try to see if can use free CSR to do it...
spin2cpp has bit-rotted in the 5 years since I first did micropython; the P2 version of it doesn't get tested much any more (since flexspin just compiles C code, there's not nearly as much need for spin2cpp). I've fixed those bugs in github (the names of some of the built in functions changed over the years).
You may need to recompile everything. Make sure you're using a consistent set of propeller2.h and riscvp2; the definition of rdpin() changed slightly recently and the two need to be in sync everywhere.
@ersmith this gets a bit over my head... There's a propeller2.h in /opt/riscv/riscv-none-embed/include that is very different from the one in flexprop.
But, I guess somehow the makefile in riscvp2 build folder creates that h file from the flexprop one?
If that's true, then issue may be have new flexprop and old riscvp2?
Remember anything about how rdpin has changed that might be clue as to how to fix it?
For now, created a custom CSR to do rdpin on the pin I need...
No, the two propeller2.h files are not related (except that they're supposed to define the same services). The riscvp2 propeller2.h describes how to perform common operations like _pinf and _rdpin with the riscvp2 compiler. The flexspin one tells how to do the same things with the flexspin compiler. The idea is that user code that does #include <propeller2.h> should be able to work the same on both compilers, even though "under the hood" the implementations are very different.
The rdpin binary encoding has changed around a few times, and the implementation in riscvptrace needs to match what the C code expects from propeller2.h, i.e. if you update riscvp2 and get a new riscvptrace binary, you'll probably need to recompile any files that use propeller2.h. I hope rdpin has finally settled down (I just found and fixed another bug) so that this won't happen again. (The root cause is that I originally defined rdpin to be like wrpin, an S format instruction, but since it actually modifies its output it should have been an I format instruction. The first time this got changed the header and JIT got out of sync and broke; I reverted that, then went back and re-did the change, but messed up that change too.)
Thanks @ersmith ! Think it's all fixed now.