Yet another C compiler

MikeChristle Posts: 31
edited June 2016 in Propeller 1
Greetings

I recently did some research on compilers for the Propeller and found a cornucopia of tools for every conceivable language. It appears that the world does not need another compiler for the Propeller. Having said that, I couldn't help myself. A few years ago I took a class on compiler design and got hooked.

My compiler, named PropC, is a command line compiler. It accepts a language that is not standard C, but provides access to all of the Propeller's features. It compiles to Propeller assembly code. I also added an assembler so the programs can be used by SimpleIDE and Propeller Tool.

I just uploaded the entire project, including source, to a SourceForge project, sourceforge.net/projects/propc2asm/. The file PropC_UM.pdf will explain all the details.

I wrote this thing for the pure joy of learning something new. Enjoy.
Mike

Comments

  • MikeChristle,

    Wow! How awesome is that.

    Normally I'm of the opinion that the last thing the world needs is YAFL. I'll let you work out that acronym. We have so many languages today and new ones keep popping up all the time.

    On the other hand, all respect to anyone with the skill who puts in the effort to create a compiler. I know nothing of compiler construction, but years ago I read Jack Crenshaw's series of articles, "Let's Build a Compiler". From that I got a glimmer of an understanding of how complex language design can be, never mind writing an actual compiler. It did enable me to write a compiler for a simple C/Pascal-like language that at least supported signed and unsigned byte, word and long variables. No strings, structures or arrays though. It generated very inefficient code for x86 and then the Propeller.

    You are right, this whole language design / compiler writing thing is fascinating.

    I don't have time to look over your code or try it out but that's a nice user manual.

    Does this create in-cog code, or can it generate code for execution from HUB as well?

    A couple of thoughts...

    It would be great to have your project in a git repo on GitHub or Bitbucket. Git is much nicer than anything else. GitHub and Bitbucket make it much easier to download the source and report any issues.

    C# is a bit of a show stopper for me. It can be used on Linux but it's a pain. If it were C++ then we could run it anywhere. And we could compile it to JavaScript and have an in-browser IDE. We have already demoed such an in-browser development environment with the open-source Spin compiler translated to JavaScript. It's far from a finished project though.

    You might want to think of a nicer name for this :)

    Anyway, love to try this out some time.



  • jmg Posts: 10,617
    My compiler, named PropC, is a command line compiler. It accepts a language that is not standard C, but provides access to all of the Propeller's features. It compiles to Propeller assembly code. I also added an assembler so the programs can be used by SimpleIDE and Propeller Tool.

    This looks pretty cool.

    I'd suggest avoiding 'C', perhaps PropC- / PropCminus / PropCm or Prop2C- or...
    - there are already GCC efforts, which are great, but GCC has become something like the Windows of the Compiler world.
    Widely used, but large and slow, with a lot of inertia.
    By avoiding the exact C name, you avoid confusing users, plus give yourself a little more freedom to do those extras you have nicely done.

    Other suggestions :
    * Add in-line ASM, if not already there.
    The simpler namespace you have should make this easy?
    * Add conditional compile controls, to allow various version builds etc
    Predefined should be 'PropC-'
    * Add Block comments. Not functionally a deal breaker, so can go down the to-do list :)

    Personal preferences and trends:
    With languages these days, I like to see the type size visible, e.g. uint32 etc.
    There are just so many INTs out there, no one has any idea of their size :)
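A rough illustration of that point, using the standard C fixed-width names from `<stdint.h>` rather than anything PropC actually provides. The helper `pack_long` is made up for the example:

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>   /* uint8_t, uint32_t */

/* The width of plain int is implementation-defined (at least 16 bits).
 * A fixed-width name states the size right where the variable is declared. */
static_assert(sizeof(uint32_t) * CHAR_BIT == 32, "uint32_t is exactly 32 bits");

/* Hypothetical helper: pack four bytes into one 32-bit long,
 * least significant byte first. */
uint32_t pack_long(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3)
{
    return (uint32_t)b0
         | ((uint32_t)b1 << 8)
         | ((uint32_t)b2 << 16)
         | ((uint32_t)b3 << 24);
}
```

With `int` in place of `uint32_t`, the shifts by 16 and 24 would not be safe on a target where `int` is 16 bits wide, and nothing in the source would warn the reader.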

    The tag GLOBAL may make more sense than HUB, but currently the Prop2 docs are all about HUB, so I would suggest giving GBYTE/GWORD/GLONG clearer HUB-associated names instead, like:
    HUB_AryU8, HUB_AryU16, HUB_AryU32

    If Chip relabels HUB, then you could change your name.


  • Oh boy,

    If we are talking language design issues then:

    1. Don't include inline ASM. That makes the whole thing a lot more complicated. It also means you can't compile the same code down to x86 for testing on Windows, Linux, whatever.

    2. I hate compiler controls "#ifdef" and all that. That uglifies the source code and adds complexity. I'm all for no preprocessor macros and all that junk.

    3. Not sure I'd miss block comments.

    4. I get your point about "uint32" and such. It's just that I find such warts uglify the source. Similarly for the complexity of "HUB_AryU32" and such.

    5. I'd make it compulsory to use braces "{ ... }" around all blocks, "if", "while" etc. Even if the conditionally executed code is only one statement. I like consistency in the language, and those one-liners have traditionally been a source of bugs. See Apple's recent famous SSL bugs.

    6. Underscores should not be allowed in variable names. That really does uglify the source as well as making the names longer than they need be.

    7. Many other language niggles I can't recall just now... :)

    Actually, I'd like to see a compiler enforce style. You know, braces in the right place, 4 space indentation, no tabs, no assignments in conditional statements and so on.

    I agree about staying away from "C" in the name. If it's not actually standard C call it something else to avoid confusion.

    Depends who the target users may be. If it's expert programmers, throw in all the conditionals and macros and inline asm you like. Of course they won't use it because they already have C.

    If it's the non-expert, beginner or casual programmer then KISS rules.

    Or, hey, it's your language, best to make it the way you like it!
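To make point 5 concrete, here is a minimal sketch in plain C (not PropC) of the kind of slip compulsory braces guard against. It is loosely modeled on the shape of Apple's "goto fail" bug, where a duplicated line after an unbraced `if` caused a later check to be skipped. The function names and the two "checks" are invented for illustration:

```c
#include <assert.h>

/* Both functions return 0 for "verified" and nonzero for "failed".
 * The arguments stand in for the results of two hypothetical checks. */

/* Unbraced version with an accidentally duplicated line. */
int verify_unbraced(int check1_err, int check2_err)
{
    int err = check1_err;
    if (err != 0)
        goto fail;
        goto fail;          /* indented like the if body, but always runs */

    err = check2_err;       /* never reached, so check 2 is skipped */
    if (err != 0)
        goto fail;

fail:
    return err;
}

/* Same logic with braces: a stray duplicate would sit either inside the
 * block (harmless) or visibly outside it. */
int verify_braced(int check1_err, int check2_err)
{
    int err = check1_err;
    if (err != 0) {
        goto fail;
    }

    err = check2_err;
    if (err != 0) {
        goto fail;
    }

fail:
    return err;
}

/* Usage: check 1 passes (0) but check 2 fails (1). */
void demo_braces(void)
{
    assert(verify_unbraced(0, 1) == 0);  /* wrongly reports success */
    assert(verify_braced(0, 1) == 1);    /* reports the failure */
}
```

Some compilers can warn about the misleading indentation in the first function, but a language that requires braces rules the construct out entirely.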
  • jmg Posts: 10,617
    edited June 2016
    Heater. wrote: »
    Oh boy,

    If we are talking language design issues then:

    1. Don't include inline ASM. That makes the whole thing a lot more complicated. It also means you can't compile the same code down to x86 for testing on Windows, Linux, whatever.
    Did you miss the custom features the OP added? They mean you already cannot do that.
    Other tools have inline ASM, and I'd rank that quite high on any small microcontroller.
    Heater. wrote: »
    2. I hate compiler controls "#ifdef" and all that. That uglifies the source code and adds complexity. I'm all for no preprocessor macros and all that junk.
    Mostly I agree, but your 1. above, nicely shows why this feature is needed. (used sparingly, of course)
    Heater. wrote: »
    3. Not sure I'd miss block comments.
    That's why they are lower in the list.
    They are useful, if you want to paste in other boiler plate info, and also good to quickly remove blocks of code.
    Plus, the code you are porting, likely has them too.
    Best to reduce the changes, where it is easy to do.
    Heater. wrote: »
    4. I get your point about "uint32" and such. It's just that I find such warts uglify the source. Similarly for the complexity of "HUB_AryU32" and such.
    Uglify is entirely in the eye of the beholder, but how exactly DO you guess what size INT is when you pick up new code for a new MCU?
    I prefer the code to be self explanatory, & quickly scanned, I'm less worried about 'complexity' & perceptions.
    Heater. wrote: »
    5. I'd make it compulsory to use braces "{ ... }" around all blocks, "if", "while" etc. Even if the conditionally executed code is only one statement. I like consistency in the language, and those one-liners have traditionally been a source of bugs. See Apple's recent famous SSL bugs.
    This idea has merit, and e.g. Modula-2 enforces that style by inferring the { and requiring END for }.
    Perhaps a nag warning, so users can compile less formal code.
    Heater. wrote: »
    6. Underscores should not be allowed in variable names. That really does uglify the source as well as making the names longer than they need be.
    Hehe, more 'eye of the beholder' stuff.
    I use them a lot, and find the code is quite the opposite of uglified; the human eye can read it faster.
    Heater. wrote: »
    Actually, I'd like to see a compiler enforce style. You know, braces in the right place, 4 space indentation, no tabs, no assignments in conditional statements and so on.
    Maybe, but is that really the place of a command line tool?
    I could understand doing most of that in an Editor/IDE, where there was an automated 'enforce style' button.

  • jmg Posts: 10,617
    jmg wrote: »
    My compiler, named PropC, is a command line compiler. It accepts a language that is not standard C, but provides access to all of the Propeller's features. It compiles to Propeller assembly code. I also added an assembler so the programs can be used by SimpleIDE and Propeller Tool.
    * Add in-line ASM, if not already there.
    The simpler namespace you have should make this easy?

    I found the thread by ersmith, where he has the nice spin -> p2asm tool

    http://forums.parallax.com/discussion/164187/fastspin-compiler-for-p2/p1

    This uses a flat namespace and supports in-line asm this way, so you could copy that?

    One example he gave:
    PUB foo(seed, u32a, u32b) | u32c, u32x, u32y, low, high
        u32c := seed - 256
        '' want:u32x := u32a * u32b / u32c
        ''      u32y := u32a * u32b // u32c
        asm
          qmul u32a, u32b
          getqx low
          getqy high
          setq  high
          qdiv  low, u32c
          getqx u32x
          getqy u32y
        endasm
        return u32x + u32y
    
    which produces:
    _foo
            sub     arg1, #256
            qmul    arg2, arg3
            getqx   _foo_low
            getqy   _foo_high
            setq    _foo_high
            qdiv    _foo_low, arg1
            getqx   result1
            getqy   _foo_u32y
            add     result1, _foo_u32y
    _foo_ret
            reta
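For readers who don't know the P2 CORDIC instructions, the arithmetic the asm block is after can be sketched in plain C: a full 64-bit product followed by a divide that yields both quotient and remainder. This is only my reading of the "want:" comments in the Spin source. The struct and function names are invented, and how the hardware behaves when the quotient does not fit in 32 bits is not modeled here:

```c
#include <assert.h>
#include <stdint.h>

/* Quotient and remainder of a 64-bit value divided by a 32-bit value. */
typedef struct {
    uint32_t quotient;
    uint32_t remainder;
} muldiv_result;

/* Computes (a * b) / c and (a * b) % c with a 64-bit intermediate product. */
muldiv_result muldiv_u32(uint32_t a, uint32_t b, uint32_t c)
{
    muldiv_result r;
    uint64_t product;

    assert(c != 0);                         /* division by zero is not handled here */
    product = (uint64_t)a * b;              /* full 64-bit product, cannot overflow */
    r.quotient  = (uint32_t)(product / c);  /* truncated if it needs more than 32 bits */
    r.remainder = (uint32_t)(product % c);
    return r;
}

/* Mirrors the Spin method: c = seed - 256, result = quotient + remainder. */
uint32_t foo_sketch(uint32_t seed, uint32_t u32a, uint32_t u32b)
{
    muldiv_result r = muldiv_u32(u32a, u32b, seed - 256u);
    return r.quotient + r.remainder;
}
```

For example, `muldiv_u32(7, 5, 4)` gives quotient 8 and remainder 3, since 35 = 8 * 4 + 3.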
    
  • Mike,
    Fantastic that you have done this all on your own without hardly posting on the forum!

    You will get lots of why did you do it this way, why didn't you do that. It's your version after all.
    I will take a look at the docs a little later as I have some things to do just now. Perhaps a snippet of the docs posted here might help others see what you have done without downloading everything.

    Looking forward to trying what you have done, and what you can do with it :)
  • jmg Posts: 10,617
    Cluso99 wrote: »
    Perhaps a snippet of the docs posted here might help others see what you have done without downloading everything.

    There is a separate PDF that is only 137k and gives quite a good overview and examples:

    https://sourceforge.net/projects/propc2asm/files/Doc/PropC_UM.pdf/download
  • Good job, Mike!

    Sounds like a fun time. I look forward to having the silicon done so I can program, too.
  • To start with I would like to apologize for posting this in the wrong category. It belongs in the Propeller 1 category. To be clear, PropC targets P1, I know nothing about P2.

    Secondly, I would like to thank everyone for all of the comments and ideas. Frankly I don't intend to change anything. PropC does exactly what I want it to do. It is what it is. Now that it is published I intend to wash my hands and move on to my next project.

    BTW for my next project I will be using PropC a lot. I am building an arbitrary waveform generator so I can learn about the math behind musical chords. Woo Hoo.

    Mike
  • ...I am building an arbitrary waveform generator so I can learn about the math behind musical chords. Woo Hoo.

    Mike

    Well, you could certainly benefit from using the Prop2 for that. It has LOG/EXP and SIN/COS/ROTATE/VECTOR functions in the CORDIC. It has excellent DACs, too.
  • Cluso99 wrote:
    Fantastic that you have done this all on your own without hardly posting on the forum!
    I totally agree! And that's the best way to maintain a cohesive vision for how it ought to look/work without a lot of extraneous "input" noise.

    -Phil
  • Publison Posts: 9,772
    edited June 2016
    Moved to Prop1 from Prop2 by request of the OP.
  • Dave Hein Posts: 5,307
    edited June 2016
    To start with I would like to apologize for posting this in the wrong category. It belongs in the Propeller 1 category. To be clear, PropC targets P1, I know nothing about P2.
    I was wondering about that. The code that was being generated didn't look quite right for the P2. I played around with the executable, and it looks very good. I noticed that it generates COG code. Have you considered adding an LMM mode so you can execute code from hub RAM also?

    I also dabbled in writing a C compiler, which was targeted for the P2. The compiler is called taz, and it's written in C. I'm not a C# programmer, but I looked at your code, and it looks very well organized -- much cleaner than my code. I also like your documentation. It's very helpful in understanding your code.


  • jmg Posts: 10,617
    Dave Hein wrote: »
    I also dabbled in writing a C compiler, which was targeted for the P2. The compiler is called taz, and it's written in C. I'm not a C# programmer, but I looked at your code, and it looks very well organized -- much cleaner than my code. I also like your documentation. It's very helpful in understanding your code.
    Any chance you could take your code, and Mike's, and create a 'PropC-minus' for both P1 and P2?

  • The code generator for taz, and presumably for PropC as well, could be modified to generate code for virtually any platform. So it should be doable to generate code for both P1 and P2. Since taz is written in C and PropC is written in C# it would be difficult to merge the two. However, it might be feasible to convert the code generators from each compiler to work with the other compiler. I haven't looked that closely at PropC, so I don't know whether that is a practical approach or not.
  • Dave Hein wrote: »
    The code generator for taz, and presumably for PropC as well, could be modified to generate code for virtually any platform.

    Not quite. My priorities for PropC language design were to allow access to all of the features of Propeller assembly and to maximize run-time performance, not compatibility. To do this I went thru the Propeller users manual, chapter 3, and added syntax to PropC for each feature, locks, waits, etc.

    P2 has features not present on the P1. For example, auto increment of pointers on main memory access. If I just updated the code generator I would lose this feature and take a large performance hit. To support the P2 I would start by updating the BNF to match the P2 feature set. Then these updates would ripple thru the entire project. This would not be a complete re-write, but it's much bigger than a tweek to the last stage. (Why is twerk in the dictionary but not tweek???)

    Mike
  • (Why is twerk in the dictionary but not tweek???)

    Because it's spelled tweak, not tweek. Sometimes correct spelling is important.
  • Dave Hein wrote: »
    The code generator for taz, and presumably for PropC as well, could be modified to generate code for virtually any platform. So it should be doable to generate code for both P1 and P2. Since taz is written in C and PropC is written in C# it would be difficult to merge the two. However, it might be feasible to convert the code generators from each compiler to work with the other compiler. I haven't looked that closely at PropC, so I don't know whether that is a practical approach or not.

    For P2, how about the following:
    Install SimpleIDE with GCC and a modified P2 propeller.h for the P2 register names and addresses.
    Run the GCC cc1.exe with the -E option to run the C preprocessor only, with output to a new file. Found (on my system) in directory:

    C:\Program Files (x86)\SimpleIDE\propeller-gcc\libexec\gcc\propeller-elf\4.6.1\

    Run the C program through Dave Hein's (i.e., your) c2spin program, thread here:
    forums.parallax.com/discussion/119342/cspin-a-c-to-spin-converter/p1

    Run the converted spin program through ersmith's Spin Convert, spincvt, which can convert Spin to Pasm or C for P2, but we want P2 Pasm.

    forums.parallax.com/discussion/163635/interactive-spin-to-pasm-or-c-converter/p1

    Voila, C to Pasm on P2 via Spin!
  • jmg Posts: 10,617
    Not quite. My priorities for PropC language design were to allow access to all of the features of Propeller assembly and to maximize run-time performance, not compatibility. To do this I went thru the Propeller users manual, chapter 3, and added syntax to PropC for each feature, locks, waits, etc.
    That makes sense.
    P2 has features not present on the P1. For example, auto increment of pointers on main memory access. If I just updated the code generator I would lose this feature and take a large performance hit.

    I'm not quite following this, as compilers will optimize by context.
    Yes, that is more than a direct code generator, but looking at 8 bit MCUs for example, you can see code like

    HiByte = Integer >> 8;

    Here, no one expects the 8-bit compiler to actually load 16 bits and shift right 8 times, but they would expect
    Integer = Integer >> 6;
    to load 16b and shift 6 times.

    Likewise
    Boolean = ! Boolean; uses a CPL BIT opcode, where that exists.

    In your case an array access with Ptr++ would be expected to use the auto-inc opcode in P2, but a separate INC in P1.
    The source is identical.
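A small sketch of those two cases in plain C (not PropC, which has its own syntax). The function names are mine, and the opcode sequences in the comments are only a rough idea of what a back end might choose on each chip:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The source says "shift right by 8"; a compiler for an 8-bit MCU can
 * simply read the upper byte instead of shifting. */
uint8_t high_byte(uint16_t value)
{
    return (uint8_t)(value >> 8);
}

/* The source says "read, then advance the pointer".
 * A back end is free to pick the opcodes, roughly:
 *   P1:  rdlong tmp, ptr      followed by   add ptr, #4
 *   P2:  rdlong tmp, ptra++   (auto-increment form)
 * The register names are made up for this comment. */
uint32_t sum_longs(const uint32_t *ptr, size_t count)
{
    uint32_t total = 0;

    assert(ptr != NULL || count == 0);
    while (count > 0) {
        total += *ptr++;    /* same source line on either target */
        count--;
    }
    return total;
}
```

Either way the choice is made from context in the code generator, so the language itself would not need a P2-specific construct for this particular feature.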

  • jmg wrote: »
    In your case an array access with Ptr++, would be expected to use the auto-inc opcode in P2, but a separate INC in P1. The source is identical.

    PropC does not support ++ --.
  • Would someone please post a link to the best docs on P2 assembly. Thank you.
    Mike
  • jmg Posts: 10,617
    I think you want this

    https://docs.google.com/document/d/1O27nO2tMjBTvUNblFFRtEp5DHvcvsSGXcPvH9FaJ9u8/pub

    which is linked from post #1 in this thread, presently at 9b, with 9c? due very soon.
    http://forums.parallax.com/discussion/162298

    Some elements need expanding in the docs, as this example shows:

    http://forums.parallax.com/discussion/164507/cmpx-cmpsx-z-flag-issue