A survey of C compilers available for P2

ersmith Posts: 3,387
edited 2019-04-08 - 19:48:48 in Propeller 2
There is no complete C solution available yet for the P2, but there are a number of incomplete/in progress ones. This is an overview of what's available as of the date of writing (2019-04-07).

P2GCC
URL:               https://github.com/davehein/p2gcc
Type:              Native code
Standard Library:  incomplete
P2 support:        full (inline and out-of-line assembly)
C support:         full C1x
C++ support:       not yet, should be easy to add

Probably the most popular C solution for P2 right now. Works by converting the P1 output of PropGCC to P2 instructions. The standard library has not been fully ported yet, and the default front end doesn't support C++ (but since PropGCC does have C++ this should be easy to change).

FASTSPIN
URL:               https://github.com/totalspectrum/spin2gui
Type:              Native code
Standard Library:  incomplete
P2 support:        full (inline and out-of-line assembly)
C support:         eventually C99, but incomplete at present
C++ support:       no

Homegrown C (and Spin and BASIC) compiler for P2. The C support is still incomplete and buggy. Optimizer is not up to GCC standards on generic code, but does have P2 specific optimizations. Noteworthy for its ability to seamlessly interoperate with Spin and BASIC functions.

RISC-V JIT
URL:               https://github.com/totalspectrum/riscvemu
Type:              JIT compiler
Standard Library:  complete, but awkward to use
P2 support:        partial, via custom RISC-V instructions
C support:         full C1x
C++ support:       yes

Can support any RISC-V compiler, but I've been using GCC. Works by converting RISC-V instructions to P2 instructions at run time. Many P2 specific features (including COGINIT of arbitrary P2 code) are supported via custom RISC-V instructions. A full newlib standard library is available, but is a bit awkward to use (there's no package giving P2 board support). Documentation is sparse.


ZOG for P2
URL:               https://github.com/totalspectrum/zog
Type:              XBYTE interpreter or JIT compiler
Standard Library:  complete
P2 support:        limited (memory mapped registers)
C support:         full C1x
C++ support:       yes

GCC compiler to ZPU bytecode, which is then interpreted on the P2. A full newlib standard library is available. P2 specific registers like OUTA are accessible, but there isn't any other P2 support yet (the original ZOG for P1 did have COGINIT, so this should be easy to add).
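
To make "memory mapped registers" concrete: in that style, a register such as OUTA appears to C code as an ordinary address reached through a volatile pointer. The following is only a minimal sketch and assumes nothing about the real ZOG memory map. The variable `fake_outa` stands in for the hardware register so the code can be tried on a PC, and the `OUTA` macro and `set_pin` helper are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the hardware register so the sketch runs on a PC.
   On a real port this would be a fixed address taken from the port's
   documentation; the actual ZOG address is NOT assumed here. */
volatile uint32_t fake_outa;

/* Hypothetical accessor: going through a volatile pointer makes the
   compiler perform every read and write instead of caching the value. */
#define OUTA (*(volatile uint32_t *)&fake_outa)

/* Hypothetical helper: set or clear one bit of the output register. */
void set_pin(unsigned pin, int high)
{
    assert(pin < 32);
    if (high)
        OUTA |= (UINT32_C(1) << pin);
    else
        OUTA &= ~(UINT32_C(1) << pin);
}
```

Calling `set_pin(5, 1)` on a cleared register leaves `OUTA` equal to 0x20. Anything beyond plain register access (COGINIT, for example) cannot be expressed this way, which is the limitation described above.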

CATALINA
URL:               https://sourceforge.net/projects/catalina-c/
Type:              Native code or CMM interpreter
Standard Library:  not released yet
P2 support:        yes, via out of line assembly
C support:         full C89, some C99 features
C++ support:       no

The new kid on the block. P2 version not yet released, but sounds very promising. Based on the LCC compiler, so code is probably not as fast as GCC.

Comments

  • Here are some benchmark results for the C compilers discussed above. riscvjit and riscvemu are the results for RISC-V GCC with the JIT and plain interpreters, respectively. Similarly zog2_xbyte and zog2_jit are for the XBYTE and JIT interpreters; the ZPU JIT interpreter is still very early and has minimal optimizations.

    XXTEA

    This is the xxtea decryption benchmark.
    Speed is cycles to decrypt the sample (so lower is better).
    Size is the size of the btea function.
    The GCC results all use -Os; fastspin uses -O2.
                 Speed           Size
    p2gcc:       15562 cycles     868
    fastspin:    15530 cycles     764
    riscvjit:    44596 cycles     520
    riscvemu:   273232 cycles     520
    zog2_xbyte: 160072 cycles     369
    zog2_jit:   154576 cycles     369
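
For readers who have not seen it, the "Size" column above measures the compiled `btea` function. The widely published reference XXTEA routine by Wheeler and Needham looks roughly like the sketch below. Whether the benchmark source matches it line for line is an assumption; the `words` variable and the `assert` are additions made here for clarity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DELTA 0x9e3779b9u
/* Mixing function from the published reference code; it uses the
   local variables y, z, sum, p, e and key of btea(). */
#define MX (((z >> 5 ^ y << 2) + (y >> 3 ^ z << 4)) ^ \
            ((sum ^ y) + (key[(p & 3) ^ e] ^ z)))

/* n > 1 encodes n words in place; n < -1 decodes -n words in place. */
void btea(uint32_t *v, int n, const uint32_t key[4])
{
    uint32_t y, z, sum;
    unsigned p, rounds, e, words;

    assert(v != NULL && key != NULL);
    if (n > 1) {                        /* encode */
        words = (unsigned)n;
        rounds = 6 + 52 / words;
        sum = 0;
        z = v[words - 1];
        do {
            sum += DELTA;
            e = (sum >> 2) & 3;
            for (p = 0; p < words - 1; p++) {
                y = v[p + 1];
                z = v[p] += MX;
            }
            y = v[0];
            z = v[words - 1] += MX;     /* here p == words - 1 */
        } while (--rounds);
    } else if (n < -1) {                /* decode */
        words = (unsigned)(-n);
        rounds = 6 + 52 / words;
        sum = rounds * DELTA;
        y = v[0];
        do {
            e = (sum >> 2) & 3;
            for (p = words - 1; p > 0; p--) {
                z = v[p - 1];
                y = v[p] -= MX;
            }
            z = v[words - 1];
            y = v[0] -= MX;             /* here p == 0 */
            sum -= DELTA;
        } while (--rounds);
    }
}
```

The inner loops are all shifts, XORs and adds on 32-bit words, so the benchmark mostly exercises register allocation and loop overhead rather than library code.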
    

    FFTBENCH

    Time to complete in microseconds at 80 MHz (so lower is better).
    Size was difficult to compute, because different compilers inlined
    different functions, so I have left it out.
                  Speed  
    p2gcc:       48674 us
    fastspin:    46516 us (*)
    riscvjit:    43679 us
    riscvemu:   334388 us
    zog2_xbyte: 250154 us
    zog2_jit:   208991 us
    
    (*) fastspin printed incorrect output on this run. It has produced correct fftbench output before, so it's probably an uninitialized variable or similar bug, but take the result with a grain of salt
    

    DHRYSTONE

    This is the Dhrystone 1.1 benchmark.
    Speed is dhrystones/second with an 80 MHz system clock. Higher is better.
    Size gives the total size in bytes of the Proc_n and Func_n functions.

    For this benchmark I've tried some different optimization options to see what difference they make.
                    Speed    Size
    p2gcc:           9719     956
    fastspin:    n/a (unable to compile)
    riscvjit  -O3:  11221     852
    riscvjit  -Os:   7036     684
    riscvemu  -O3:   2100     852
    zog2_xbyte -O3:  2548     684
    zog2_jit   -O3:  2353     684
    zog2_xbyte -Os:  2283     604
    zog2_jit   -Os:  1284     604
    

    Overall the native compilers (fastspin and p2gcc) are the fastest and produce pretty similar results. riscvjit has more variable results (probably depending on exactly where things end up in its code cache), sometimes matching the native compilers for speed but sometimes a bit slower. The emulators are quite a bit further back. The ZPU xbyte interpreter is noticeably faster than the RISC-V interpreter (which does not use xbyte, because RISC-V instructions are 32 bits long). The ZPU JIT compiler is pretty much the same speed as xbyte, and with some optimization work could probably surpass it.
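
The three tables above use different units: cycles for XXTEA, microseconds at 80 MHz for FFTBENCH, and Dhrystones per second at 80 MHz for DHRYSTONE. A small helper can put them all in cycles for comparison. This is just a sketch; the function names are hypothetical and only the figures quoted in the tables are used.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a time in microseconds to clock cycles.
   The sketch only handles clocks that are a whole number of MHz. */
uint64_t us_to_cycles(uint64_t us, uint64_t clock_hz)
{
    assert(clock_hz % 1000000u == 0);
    return us * (clock_hz / 1000000u);
}

/* Convert "runs per second" to cycles per run (rounds down). */
uint64_t cycles_per_run(uint64_t runs_per_second, uint64_t clock_hz)
{
    assert(runs_per_second > 0);
    return clock_hz / runs_per_second;
}
```

For example, p2gcc's 48674 us FFTBENCH time is 48674 * 80 = 3,893,920 cycles, and its 9719 Dhrystones/second works out to about 8231 cycles per Dhrystone iteration.
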
  • I'm hoping the current state improves. Think I'll wait a bit before trying any of these.
    Really want to get MPEG and PNG decoding (and encoding) going. I'm sure there are C examples of this out there...
    Prop Info and Apps: http://www.rayslogic.com/
  • Eric: Do you still intend to complete your C compiler for fastspin?
  • jmg Posts: 13,917
    edited 2019-04-07 - 19:30:44
    ersmith wrote: »
    FASTSPIN
    URL:               https://github.com/totalspectrum/spin2gui
    Type:              Native code
    Standard Library:  incomplete
    P2 support:        full (inline and out-of-line assembly)
    C support:         eventually C99, but incomplete at present
    C++ support:       no
    

    Homegrown C (and Spin and BASIC) compiler for P2. The C support is still incomplete and buggy. Optimizer is not up to GCC standards on generic code, but does have P2 specific optimizations. Noteworthy for its ability to seamlessly interoperate with Spin and BASIC functions.
    I would add
    * Supports clean in-line assembly
    to the fastspin feature list.

    On a P2, I rank that as a vitally important feature, as being able to intermix assembler and any HLL, easily, is a key differentiation.
    GCC scores a clear 'F' there.

    The numbers you quote also suggest those P2 specific optimizations already have fastspin in front of gcc in some cases.


  • jmg Posts: 13,917
    ersmith wrote: »
    RISC-V JIT
    URL:               https://github.com/totalspectrum/riscv
    Type:              JIT compiler
    Standard Library:  complete, but awkward to use
    P2 support:        partial, via custom RISC-V instructions
    C support:         full C1x
    C++ support:       yes
    

    Can support any RISC-V compiler, but I've been using GCC. Works by converting RISC-V instructions to P2 instructions at run time. Many P2 specific features (including COGINIT of arbitrary P2 code) are supported via custom RISC-V instructions. A full newlib standard library is available, but is a bit awkward to use (there's no package giving P2 board support). Documentation is sparse.

    Your JIT effort is proving remarkably fruitful.
    The speed is quite good, and it is what allows Python to start to have a pulse on P2 already.

    Have you tried Blockly-Generated-C into any of your C flows yet ? How close is that to being supported ?

  • Has anyone heard from Parallax on the PropGCC front? Are they waiting until the respin now?
    David
    PropWare: C++ HAL (Hardware Abstraction Layer) for PropGCC; Robust build system using CMake; Integrated Simple Library, libpropeller, and libPropelleruino (Arduino port); Instructions for Eclipse and JetBrain's CLion; Example projects; Doxygen documentation
    CI Server: https://ci.zemon.name?guest=1
  • DavidZemon wrote: »
    Has anyone heard from Parallax on the PropGCC front? Are they waiting until the respin now?
    I keep asking but I haven't gotten an answer yet.

  • David Betz wrote: »
    Eric: Do you still intend to complete your C compiler for fastspin?

    Yes, I'd like to. I've had some other projects distracting me, but eventually I do plan to come back to fastspin/C.

  • jmg wrote: »
    Have you tried Blockly-Generated-C into any of your C flows yet ? How close is that to being supported ?

    No. Personally I never use Blockly, and I don't think there's much overlap between Blockly users and P2-ES users. I think most developers who have engineering samples are interested in a different class of tool.

    Whenever Blockly produces standard C code any of the GCC compilers will certainly handle it. The Propeller specific code that Blockly produces is another matter, but porting it will "just" be a matter of fixing up the libraries to work on P2. Which probably isn't much fun, so it may be difficult to find a volunteer for that.

  • ersmith,
    thank you for your comprehensive and concise overview of C compilers.
    This really helps non-specialist observers like me understand the state-of-play.
    What I really cannot understand is why there is so much duplicated effort.
    I understand competition is good, but it would seem that some sort of merging or selection process would lead to better outcomes for all.
  • A blockly clone that runs native on P2 is possible.
    There could be a P2 stick that plugs in to HDMI port on monitor. Sorta like Fire TV Stick...
    But, with keyboard and mouse... Complete programming environment...
  • jmg Posts: 13,917
    macrobeak wrote: »
    ersmith,
    thank you for your comprehensive and concise overview of C compilers.
    This really helps non-specialist observers like me understand the state-of-play.
    What I really cannot understand is why there is so much duplicated effort.
    I understand competition is good, but it would seem that some sort of merging or selection process would lead to better outcomes for all.

    You are right about benefits of merging, but there is not as much duplicated effort as may first appear.

    ZOG for P2 & RISC-V JIT are MPU emulators, and look like ersmith's own ports, and the latter one allows P2 to run Python.

    P2-GCC is also not quite native P2, as it currently says "It uses the PropGCC compiler to generate a P1 .s assembly file. The assembly file is converted to a P2 .spin2 file using the utility s2pasm. The P2 assembly file is assembled into an object file using p2asm. p2link is then used to link the object file with other object files and libraries to produce an executable binary file."

    CATALINA is being ported from a P1 code generator, so that work is to preserve the effort that went into P1 version.

    If you are developing a C compiler, it makes sense to be able to run 'second opinion' type tests, and those other ports allow that.
  • macrobeak wrote: »
    What I really cannot understand is why there is so much duplicated effort.
    I understand competition is good, but it would seem that some sort of merging or selection process would lead to better outcomes for all.

    Before any merging/selection process can take place there have to be options to select from. So part of this whole process is people trying out different approaches to see what works. That's natural, and it may seem confusing at first, but without experimentation and variety you can never really make progress.

    High quality compilers like GCC and LLVM are big, complicated projects. They're difficult to come up to speed on, and not necessarily much fun to port to a new architecture. To work around this there are different approaches:

    (1) Use a simpler compiler. That was @RossH's approach with porting LCC to first P1 and now P2.

    (2) Start with a different compiler for the architecture, and add C support to it. That's what I'm trying to do with fastspin. Under the hood C, BASIC, and Spin are not really much different; most of the differences are in syntax. (The few semantic differences can be a pain too, of course, but they're rare.)

    (3) Find a way to adapt a compiler for a different architecture to the P2. Rather than trying to port a whole compiler and toolchain, they just implement a translator for another architecture (a much simpler process) and then all the tools for that architecture become available for P2. The ZPU effort does this by interpreting the instructions for another machine. The RISC-V JIT approach translates from RISC-V to P2 instructions at run-time. The p2gcc approach translates from P1 to P2 at compile time.
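
To illustrate the "interpreting the instructions for another machine" idea from approach (3), here is a toy fetch/decode/execute loop for a made-up stack machine. It is only a sketch of the general technique: the opcodes are invented for this example and are not the real ZPU encoding, and the real interpreters are written in P2 assembly rather than C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Invented opcodes for a toy stack machine (NOT the real ZPU encoding). */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Interpret `len` bytes of bytecode and return the top of stack at HALT. */
int32_t run(const uint8_t *code, size_t len)
{
    int32_t stack[16];
    size_t sp = 0;              /* number of values on the stack */
    size_t pc = 0;              /* index of the next byte to fetch */

    while (pc < len) {
        switch (code[pc++]) {   /* fetch and decode */
        case OP_PUSH:           /* operand: one immediate byte */
            assert(pc < len && sp < 16);
            stack[sp++] = code[pc++];
            break;
        case OP_ADD:
            assert(sp >= 2);
            stack[sp - 2] += stack[sp - 1];
            sp--;
            break;
        case OP_MUL:
            assert(sp >= 2);
            stack[sp - 2] *= stack[sp - 1];
            sp--;
            break;
        case OP_HALT:
            assert(sp >= 1);
            return stack[sp - 1];
        default:
            assert(!"unknown opcode");
        }
    }
    assert(!"ran off the end of the program");
    return 0;
}
```

The program `PUSH 2, PUSH 3, ADD, PUSH 4, MUL, HALT` returns 20. A JIT takes the other route: instead of decoding each instruction every time it executes, it translates a block of foreign instructions into native ones once and then reuses the translated block.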

  • David Betz Posts: 13,480
    edited 2019-04-08 - 13:13:17
    It seems like the big question for Parallax is whether they feel they need C++ support. They can likely get that with p2gcc or with a new port of GCC or LLVM to P2 but I don't think that either fastspin or Catalina are likely to provide C++ support in the near future.
  • David Betz wrote: »
    It seems like the big question for Parallax is whether they feel they need C++ support. They can likely get that with p2gcc or with a new port of GCC or LLVM to P2 but I don't think that either fastspin or Catalina are likely to provide C++ support in the near future.

    I don't think it's a question for Parallax, it's a question for us. Unless and until Parallax provides some direction and support, the C effort on P2 is going to have to be community driven, which means we get to make the decisions. If Parallax has some particular needs or desires then they'll have to make those known, but right now Parallax does not give the impression that they are interested in P2 tools.

    As far as C++ support is concerned, it should be very straightforward to add it to p2gcc -- it's just a matter of changing the p2gcc compiler driver to recognize C++ extensions (.cpp, .cxx, .c++... any others?) and pass the appropriate flags along to PropGCC compiler to compile these. For simple C++ programs there aren't any flags necessary, but for more complicated ones we may need to link with libstdc++ and/or some other libraries.

    Of course the ZPU and RISC-V compilers already support C++ as is. In fact I'd suggest that the RISC-V path is a good one for C++ support on P2; we'd get a mature compiler (GCC 8.x) with all the latest C++ features, and the RISC-V JIT framework has custom instructions that map to many of the P2 instructions, so writing an equivalent to <propeller.h> is very feasible (in fact the lib/riscv.h file in the git repo is a start on this).
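
The driver change described above (recognizing C++ file extensions) could look something like the following sketch. It is not taken from the p2gcc sources; `is_cxx_source` is a hypothetical helper, and the extension list is the one mentioned above plus `.cc` and `.C`, which GCC itself also treats as C++.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `path` ends in a C++ source
   extension, 0 otherwise. The comparison is case-sensitive, so ".C"
   counts as C++ while ".c" does not. */
int is_cxx_source(const char *path)
{
    static const char *const exts[] = { ".cpp", ".cxx", ".c++", ".cc", ".C" };
    const char *dot;
    size_t i;

    assert(path != NULL);
    dot = strrchr(path, '.');
    /* No dot at all, or the last dot belongs to a directory name. */
    if (dot == NULL || strchr(dot, '/') != NULL)
        return 0;
    for (i = 0; i < sizeof exts / sizeof exts[0]; i++) {
        if (strcmp(dot, exts[i]) == 0)
            return 1;
    }
    return 0;
}
```

A driver would call this on each input file and, for matches, invoke the C++ front end instead of the C one; the flags to pass along are not shown because they depend on the PropGCC installation.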
  • ersmith wrote: »
    In fact I'd suggest that the RISC-V path is a good one for C++ support on P2; we'd get a mature compiler (GCC 8.x) with all the latest C++ features, and the RISC-V JIT framework has custom instructions that map to many of the P2 instructions, so writing an equivalent to <propeller.h> is very feasible (in fact the lib/riscv.h file in the git repo is a start on this).
    Are you suggesting this as a long-term solution or just as something to use until a native compiler is available? I wonder what potential commercial users would think of a toolchain that targets another architecture with an emulator on the P2. I realize that it would run pretty fast with a JIT compiler but would it be seen as a viable solution by potential customers?
  • Blockly for the Propeller generates C source. So that requires Parallax to want C at least. http://learn.parallax.com/tutorials/language/blocklyprop/simple-blocklyprop-programs-propeller-boards/learning-program
    "We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
    By doing that, we can more accurately measure their mass, and determine whether
    scientists have systematically been underestimating how much matter they contain."
  • I found GCC C is generally too bloated for the Propeller1. I expect GCC C++ to be the same for the Prop2.
  • evanh wrote: »
    Blockly for the Propeller generates C source. So that requires Parallax to want C at least. http://learn.parallax.com/tutorials/language/blocklyprop/simple-blocklyprop-programs-propeller-boards/learning-program
    The BlocklyProp C code generator also generates calls to some Simple Libraries functions so those would have to be ported for it to work with any of the P2 C compiler solutions.
  • David Betz wrote: »
    The BlocklyProp C code generator also generates calls to some Simple Libraries functions so those would have to be ported for it to work with any of the P2 C compiler solutions.
    Parallax will have an interest in that coming to fruition eventually.
  • evanh Posts: 7,880
    edited 2019-04-08 - 16:18:26
    Other than Chip, maybe they're holding off starting any work until the final silicon is delivered to minimise confusion over differences with the P2ES chip. Once August rolls around no-one will be talking P2ES any more.
  • evanh wrote: »
    Other than Chip, maybe they're holding off starting any work until the final silicon is delivered to minimise confusion over differences with the P2ES chip. Once August rolls around no-one will be talking P2ES any more.
    Are there any changes in the Rev2 silicon that are likely to affect compiler code generation? Even if there are, I guess compiler developers could switch to using the FPGA until the Rev2 chips come out.
  • Nothing huge in terms of tools. There's more additions than changes. https://forums.parallax.com/discussion/169282/list-of-changes-in-next-p2-silicon/p1
  • evanh Posts: 7,880
    edited 2019-04-08 - 16:41:33
    #18 was the only entry in the list not done in the end I believe.

    #17 wasn't confirmed either but it would have been done along with the PLL tweaks (which aren't in the list) I'd say.

    Here's confirmation of #16 plus the late fixes - https://forums.parallax.com/discussion/169695/new-fpga-files-for-next-silicon-version-5th-final-release-contains-new-rom/p1
    New ROM with updated SD booter and TAQOZ.

    Extra register on each IN signal from pins to guard against metastability.

    Fixes r/w glitch during LUT sharing.
    Fixes JMP-event-within-REP bug.
    'GETCT reg WC' doesn't change C.
  • ersmith Posts: 3,387
    edited 2019-04-08 - 17:49:57
    David Betz wrote: »
    ersmith wrote: »
    In fact I'd suggest that the RISC-V path is a good one for C++ support on P2; we'd get a mature compiler (GCC 8.x) with all the latest C++ features, and the RISC-V JIT framework has custom instructions that map to many of the P2 instructions, so writing an equivalent to <propeller.h> is very feasible (in fact the lib/riscv.h file in the git repo is a start on this).
    Are you suggesting this as a long-term solution or just as something to use until a native compiler is available?
    It could be a viable long-term solution, I'm not sure. It might be very viable indeed if the P3 ends up using the RISC-V instruction set architecture in some form. And I think that would be a very good idea; inventing new instruction sets is an expensive business, both in terms of money and time. It requires writing all the tools from scratch, and as we've seen so far on P2 that's been difficult, at least compared to how simple things would be if P2 had used a well known ISA instead.

    Indeed, I see no particular reason to expect that a "truly" native C++ compiler will ever become available. p2gcc is pretty close, but it still targets another architecture (P1). And unless PropGCC gets updated it will continue to fall further and further behind current C++ standards, which are changing rapidly.
    I wonder what potential commercial users would think of a toolchain that targets another architecture with an emulator on the P2. I realize that it would run pretty fast with a JIT compiler but would it be seen as a viable solution by potential customers?

    Well, I expect commercial users would generally be pragmatic: they'll want something that works and lets them get a product to market quickly. Struggling with custom tools for a unique ISA is not something most commercial users are going to be keen on.

    Every C compiler for the P1 involved an interpreter of some kind, either a minimal LMM one or a more complicated CMM one. Are people put off by Spin because it targets a virtual architecture? It seems like the sticking point isn't the interpreter, it's the fact that it interprets a well known 3rd party instruction set. But I'd argue that's a *good* thing. The Spin bytecode instructions are still not particularly well known or documented, and the only compilers that target it are for the Spin language (AFAIK; were there ever any others?) RISC-V, in contrast, is a well documented standard with multiple implementations available. The P2 RISC-V interpreter would be just another RV32IM implementation, with an option to run special custom microcode (PASM) on the other CPUs. That special microcode, and the P2 custom RISC-V instructions for accessing smart pins, are the "special sauce" that will make P2 unique and interesting.

    Now, perhaps RISC-V isn't the best final base architecture. Maybe something like Web Assembly or JVM would be better. I suspect in terms of performance though RISC-V, with its register based architecture, is a better fit to the P2 than most other virtual machines are.
  • Well, if commercial customers would accept an emulated RISC-V instruction set then that is probably the best way to go. As you say, the toolchain is current and probably well maintained. Also, I agree that it would be nice if P3 used the RISC-V instruction set. Not sure if that is likely to happen though.
  • @ersmith , the github link to the riscv jit doesn't work. is it marked as private? typo?
  • ersmith wrote: »
    Indeed, I see no particular reason to expect that a "truly" native C++ compiler will ever become available. p2gcc is pretty close, but it still targets another architecture (P1). And unless PropGCC gets updated it will continue to fall further and further behind current C++ standards, which are changing rapidly.
    Eric, I find that comment a bit depressing, especially coming from you. It does surprise me a bit that Parallax hasn't even begun to develop a plan to implement a C/C++ compiler for the P2. However, I still believe that any minute now Parallax will come to the realization that they'll need development tools for the P2, and it would be beneficial to have a C++ compiler for it.


  • We need a Visual Studio plugin too...
  • jmg Posts: 13,917
    ersmith wrote: »
    Well, I expect commercial users would generally be pragmatic: they'll want something that works and lets them get a product to market quickly. Struggling with custom tools for a unique ISA is not something most commercial users are going to be keen on.

    Every C compiler for the P1 involved an interpreter of some kind, either a minimal LMM one or a more complicated CMM one. Are people put off by Spin because it targets a virtual architecture? It seems like the sticking point isn't the interpreter, it's the fact that it interprets a well known 3rd party instruction set. But I'd argue that's a *good* thing. The Spin bytecode instructions are still not particularly well known or documented, and the only compilers that target it are for the Spin language (AFAIK; were there ever any others?) RISC-V, in contrast, is a well documented standard with multiple implementations available. The P2 RISC-V interpreter would be just another RV32IM implementation, with an option to run special custom microcode (PASM) on the other CPUs. That special microcode, and the P2 custom RISC-V instructions for accessing smart pins, are the "special sauce" that will make P2 unique and interesting.

    Now, perhaps RISC-V isn't the best final base architecture. Maybe something like Web Assembly or JVM would be better. I suspect in terms of performance though RISC-V, with its register based architecture, is a better fit to the P2 than most other virtual machines are.

    I think that is viable, but the issues I see around CPU emulation are less to do with language, and more related to the usage details.
    I think this emulation path means listing files and map files show RISC-V based memory maps, not native P2 memory maps.

    * How will Debug experience work, can users BREAK and STEP in their source code ?
    * Can users in-line P2 Assembler, in this flow, like they can on fastspin-C ?
    * Can users manage multiple COGs and shared memory, in a clean manner ?