Comments
It worked for me on Windows (under Cygwin). Thank you again for YAAT (yet another amazing tool).
I may have to try this out too.
Seems like it might be the only way to get complex C++ programs to run on P2 at the moment.
Great! Thanks for the feedback, I'm glad you were able to get it working.
Regards,
Eric
I noticed this RISCV fpga project that uses a bootloader to let it work with the Arduino IDE:
http://www.nxlab.fer.hr/fpgarduino/
Out of curiosity, would that be possible with this?
The riscvp2 compiler approach could work with any RISC-V toolchain, including the one you linked to. Getting the Arduino IDE to work with it is a bigger problem: you'd need to write an Arduino-compatible loader for the P2, and port a bunch of libraries. In theory it should all be quite do-able, but someone would have to spend some time to do it.
The most obvious requirement to run the Linux kernel is XMM mode. The kernel and initrd use 2.9MB, far more than the P2's 512kB of hub memory. A less obvious requirement, but perhaps an even more important one, is interrupts. I haven't really thought about the details of that yet. I still think that XMM for riscvp2 might be a useful project.
Linux kernel load/store operation count
Type         Load   Store
int8         2604    1764
int16         869     413
int32       74265   54634
int32(sp)   34141   34403
This is based on the instructions found in a disassembly of the kernel binary and may not match actual runtime operation.
The JIT compiler won't be doing instruction reads and data operations on the external memory at the same time. The JIT engine maintains a cache of the compiled instructions in hub RAM or the LUT. Fetching the raw RISC-V instructions would still benefit from a simple prefetch scheme, since the next long requested is almost always the next one in memory.
Most of the data read/write operations appear to be saving and restoring registers on the stack. Quite annoyingly, the stack operations run in descending address order. I don't think PSRAM has a reverse burst mode, and neither does the P2 hub. Eric is well aware of this and put a lot of effort into the OPTIMIZE_SETQ_RDLONG feature. I think this could be reworked to combine stack operations into a single PSRAM burst. Writing a simple cache controller to speed up the stack operations shouldn't be too hard.
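To make the burst idea concrete, here is a minimal host-side sketch in C of a write coalescer that gathers word stores issued in descending address order and emits them as one ascending burst. It is not the riscvp2 or OPTIMIZE_SETQ_RDLONG code. The names `coalescer`, `coalescer_store`, `ext_burst_write` and the `BURST_MAX` limit are made up for illustration, and the external RAM is just an array.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BURST_MAX 16            /* hypothetical maximum words per burst */
#define EXT_WORDS 1024          /* size of the simulated external RAM */

static uint32_t ext_ram[EXT_WORDS];   /* stands in for PSRAM */
static unsigned burst_count;          /* bursts issued so far */

/* Simulated burst write: one transaction, ascending addresses only. */
static void ext_burst_write(uint32_t addr, const uint32_t *src, unsigned n)
{
    assert(addr % 4 == 0 && addr / 4 + n <= EXT_WORDS);
    memcpy(&ext_ram[addr / 4], src, n * sizeof(uint32_t));
    burst_count++;
}

typedef struct {
    uint32_t base;              /* lowest byte address currently buffered */
    unsigned n;                 /* number of buffered words */
    uint32_t w[BURST_MAX];      /* buffered words; w[0] belongs at base */
} coalescer;

/* Write out whatever is buffered as a single ascending burst. */
static void coalescer_flush(coalescer *c)
{
    if (c->n > 0)
        ext_burst_write(c->base, c->w, c->n);
    c->n = 0;
}

/* Buffer a word store. Stores adjacent to the buffered run (below or
 * above it) extend the run; anything else flushes and starts a new run.
 * A real version would also have to flush before a load that touches
 * the buffered range. */
static void coalescer_store(coalescer *c, uint32_t addr, uint32_t value)
{
    if (c->n > 0 && c->n < BURST_MAX) {
        if (addr + 4 == c->base) {              /* just below: prepend */
            memmove(&c->w[1], &c->w[0], c->n * sizeof(uint32_t));
            c->w[0] = value;
            c->base = addr;
            c->n++;
            return;
        }
        if (addr == c->base + 4 * c->n) {       /* just above: append */
            c->w[c->n++] = value;
            return;
        }
    }
    coalescer_flush(c);
    c->base = addr;
    c->w[0] = value;
    c->n = 1;
}
```

With this, a prologue that stores to 28(sp), 24(sp) and 20(sp) in that order ends up as one three-word burst starting at the lowest address, instead of three separate writes.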
Have you tried uClinux? It's a bit old by now, but most likely MUCH easier to get up and running. Another nice-looking small Unix variant is fuzix, which seems much more promising to get running.
I started looking into fuzix when work started to slow down, but that lasted only a week (my main curse since 2020: working 10+ hours a day, as I'm WFH in Europe and the projects are all in Asia, so I don't really feel like doing anything after work hours).
Having a code-only XMM mode for riscvp2 (that is, execute in place out of flash, or something similar) would be straightforward -- we'd just fill the instruction cache from the external memory. Storing data in external memory would be quite a bit more effort, and significantly complicate memory accesses.
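As a rough illustration of what "fill the instruction cache from the external memory" could look like, here is a small host-side C model of a direct-mapped line cache with next-line prefetch. It is a sketch under assumptions, not how riscvp2 is structured. The line size, line count, the `fetch_insn` name and the array standing in for flash are all invented for the example, and it assumes 4-byte aligned instruction fetches.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LINE_WORDS 8      /* hypothetical line size, in 32-bit words */
#define NUM_LINES  4      /* hypothetical number of cache lines */
#define XMEM_WORDS 256    /* size of the simulated external memory */

static uint32_t xmem[XMEM_WORDS];   /* stands in for flash or PSRAM */
static unsigned xmem_bursts;        /* burst reads issued so far */

typedef struct {
    int valid;
    uint32_t tag;                   /* line number held in this slot */
    uint32_t data[LINE_WORDS];
} icache_line;

static icache_line icache[NUM_LINES];

/* Is the given external-memory line currently cached? */
static int line_present(uint32_t line_no)
{
    const icache_line *l = &icache[line_no % NUM_LINES];
    return l->valid && l->tag == line_no;
}

/* Read one whole line from external memory as a single burst. */
static void fill_line(uint32_t line_no)
{
    icache_line *l = &icache[line_no % NUM_LINES];
    assert((line_no + 1) * LINE_WORDS <= XMEM_WORDS);
    memcpy(l->data, &xmem[line_no * LINE_WORDS], sizeof l->data);
    l->tag = line_no;
    l->valid = 1;
    xmem_bursts++;
}

/* Fetch the instruction word at byte address pc. On a miss, fill the
 * line and also prefetch the following one, since code is mostly read
 * sequentially. */
static uint32_t fetch_insn(uint32_t pc)
{
    uint32_t word = pc / 4;
    uint32_t line_no = word / LINE_WORDS;

    assert(pc % 4 == 0 && word < XMEM_WORDS);
    if (!line_present(line_no)) {
        fill_line(line_no);
        if ((line_no + 2) * LINE_WORDS <= XMEM_WORDS &&
            !line_present(line_no + 1))
            fill_line(line_no + 1);
    }
    return icache[line_no % NUM_LINES].data[word % LINE_WORDS];
}
```

Only instruction fetches go through this path, which is why the code-only case stays simple: nothing is ever written back, so there is no dirty state to track.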
I wonder if a better approach might be to port a smaller Unix than Linux. The original Unix ran on a PDP-11 with less memory than a P2, IIRC, so something like https://github.com/robertlipe/riscv7 might not require any external memory at all.
@ersmith Finally got around to trying this...
The instructions are pretty good and I downloaded the Windows release.
Added the bin folder to the path and tried this:
E:\Parallax\Prop2\RiscV\riscvp2-master\riscvp2-master>riscv-none-embed-gcc -T riscvp2.ld -Os -o hello.elf hello.c
It seems like it's doing something, but still no output after 5 minutes.
Assuming that means something is broke?
Looks like it worked eventually.
Was able to load the P2 with the .elf and have terminal output and blinking LEDs on Eval board.
Funny that it's so slow on Windows; on Linux, at least, it runs quickly (like, a second or so). Is your Windows machine low on memory?
@ersmith Don't think so. Good to know it's fast on Linux though. ~10 min. to compile hello world is not viable...
Tried it on another PC and it was instantaneous....
Very strange... Think need to reboot the other one....
Other PC still slow after reboot. Very strange.
Wondering if WSL2 is interfering somehow or something...
Try turning off real-time protection in Windows "Virus and Threat Protection" settings. It slows down gcc dramatically. One might begin to think it was deliberate!
I turn it off whenever I have to recompile Catalina.
Ross.
So true. I use MinGW to compile my CNC software. We had to go back to an old release because programs compiled with the latest version frequently trigger virus warnings on customers' PCs. And it's not just a warning at installation time that you could simply ignore and click away. Some Windows versions are so aggressive that they delete our executable without asking the user!
@RossH That does appear to be the issue, thanks!
Now have to check that other machine... Not sure why it'd be turned off there...
So strange... Other machine has real time protection turned on, yet doesn't have this issue...
BTW: Turned real time back on slow machine and it's slow again, so that is definitely a switch.
Adding RiscV tools folder as an exclusion doesn't help. Nor does running CMD prompt as administrator.
But, adding exclusion for entire RiscV folder that includes the hello.c file does work. Found a solution!
The best Antivirus program for Windows is Linux.
You can have both at the same time with WSL2
@ersmith Think I'm seeing that in order to compile a .c file, it has to be in a subfolder inside the riscvp2 folder, is this right?
Trying now with Ubuntu... Added the "bin" folder to the path, but that's it...
It complains about not being able to find cc1 otherwise...
@ersmith The custom assembly for rdpin() doesn't seem to work... Getting illegal instruction messages. I think it must work somehow though.
What am I doing wrong here?
I'm actually amazed that this thing still works after all this time. Congratulations on getting it going.
I think it's enough to make sure the gcc program is in the PATH environment variable. I haven't really tried it much on Windows though, so I may be misremembering.
What message are you getting? It compiles OK for me on Linux. I'm afraid I don't have a Windows setup right now to test on.
@ersmith Switched to Ubuntu after I couldn't get it working in Windows.
Does the hello_rdpin.c work for you?
For me, it compiles, but rdpin gives an illegal instruction error message.
I think the code that handles the custom opcodes for things like rdpin is in riscvtrace_p2.spin.
But, this file isn't in the P2 riscv distribution, but is in the micropython code.
Also, it's probably compiled previously by spin2cpp and then included somewhere?
Ah, it wasn't clear to me that you were getting the error at run time rather than compile time. I tried and I'm getting the same error. It looks like rdpin and related instructions just plain don't work, although I don't understand why (the code for them is there, and there's a restriction that you can't use a pin offset but you're not doing that).
All of this code, unfortunately, has heavily bit-rotted. I have updated the riscvp2 code in https://github.com/totalspectrum/riscvp2 so it will compile with the latest flexspin (6.9.3). Unfortunately there is a bug in the flexspin preprocessor between 6.3? and 6.9.2 that prevents riscvtrace_p2.spin from compiling, and even before that the use of alignl was deprecated in .spin files, so we have to change the extension to .spin2.
@ersmith awesome thanks!
This path looks to be best for compiling complex C++
Glad you still support it
BTW: was curious.. could execution speed be increased by using two cogs?
Maybe one for jit and one for execution?
I wouldn't say I "support" it, but I'll try to fix obvious bugs. On that note, it looks like at some point I updated the encoding of the rdpin instruction in propeller2.h but forgot to change the code in riscvtrace. I think that's fixed (just now); at least it no longer throws an illegal instruction. But I'm not sure it's 100% working.
Maybe? It would require some kind of speculative compilation or branch prediction (so the JIT can be compiling the next code while the current code is executing). I suspect it would be more trouble than it's worth. What would be nice would be to be able to run multiple copies of the RISC-V emulation in parallel (i.e. to emulate a multi-core RISC-V). I've started and stopped working on that multiple times over the years, but never finished it. It's not hard in principle, but requires a lot of finicky revision to the JIT cache code to make sure the COGs don't step on each other.