Yet another C compiler
MikeChristle
Posts: 31
Greetings
I recently did some research on compilers for the Propeller and found a cornucopia of tools for every conceivable language. It appears that the world does not need another compiler for the Propeller. Having said that, I couldn't help myself. A few years ago I took a class on compiler design and got hooked.
My compiler, named PropC, is a command-line compiler. It accepts a language that is not standard C but provides access to all of the Propeller's features. It compiles to Propeller assembly code. I also added an assembler so the programs can be used by SimpleIDE and Propeller Tool.
I just uploaded the entire project, including source, to a SourceForge project, sourceforge.net/projects/propc2asm/. The file PropC_UM.pdf will explain all the details.
I wrote this thing for the pure joy of learning something new. Enjoy.
Mike
Comments
Wow! How awesome is that.
Normally I'm of the opinion that the last thing the world needs is YAFL. I'll let you work out that acronym. We have so many languages today and new ones keep popping up all the time.
On the other hand, all respect to anyone with the skill who puts in the effort to create a compiler. I know nothing of compiler construction, but years ago I read Jack Crenshaw's series of articles, "Let's Build a Compiler". From that I got a glimmer of an understanding of how complex language design can be, never mind writing an actual compiler. It did enable me to write a compiler for a simple C/Pascal-like language that at least supported signed and unsigned byte, word and long variables. No strings, structures or arrays though. It generated very inefficient code for x86 and then the Propeller.
You are right, this whole language design / compiler writing thing is fascinating.
I don't have time to look over your code or try it out but that's a nice user manual.
Does this create in-cog code or can it generate code for execution from HUB as well ?
A couple of thoughts...
It would be great to have your project in a git repo on github or bitbucket. Git is much nicer than anything else. github and bitbucket make it much easier to download the source and report any issues.
C# is a bit of a show stopper for me. It can be used on Linux but it's a pain. If it were C++ then we could run it anywhere. And we could compile it to Javascript and have an in browser IDE. We have already demoed such an in-browser development environment with the OpenSource Spin compiler translated to Javascript. It's far from a finished project though.
You might want to think of a nicer name for this.
Anyway, love to try this out some time.
This looks pretty cool.
I'd suggest avoiding 'C', perhaps PropC- / PropCminus / PropCm or Prop2C- or...
- there are already GCC efforts, which are great, but GCC has become something like the Windows of the Compiler world.
Widely used, but large and slow, with a lot of inertia.
By avoiding the exact C name, you avoid confusing users, plus give yourself a little more freedom to do those extras you have nicely done.
Other suggestions :
* Add in-line ASM, if not already there.
The simpler name-space you have, should make this easy ?
* Add conditional compile controls, to allow various version builds etc
Predefined should be 'PropC-'
* Add Block comments. Not functionally a deal breaker, so can go down the to-do list
Personal preferences and trends:
With languages these days, I like to see the type-size visible. eg uint32 etc
There are just so many INTs out there, no one has any idea of their size
The tag GLOBAL may make more sense than HUB, but currently the Prop2 docs are all about HUB, so I would suggest that GBYTE/GWORD/GLONG instead get clearer HUB-associated names, like:
HUB_AryU8, HUB_AryU16, HUB_AryU32
If Chip relabels HUB, then you could change your name.
If we are talking language design issues then:
1. Don't include inline ASM. That makes the whole thing a lot more complicated. It also means you can't compile the same code down to x86 for testing on Windows, Linux, whatever.
2. I hate compiler controls "#ifdef" and all that. That uglifies the source code and adds complexity. I'm all for no preprocessor macros and all that junk.
3. Not sure I'd miss block comments.
4. I get your point about "uint32" and such. It's just that I find such warts uglify the source. Similarly for the complexity of "HUB_AryU32" and such.
5. I'd make it compulsory to use braces "{ ... }" around all blocks, "if", "while" etc., even if the conditionally executed code is only one statement. I like consistency in the language, and those one-liners have traditionally been a source of bugs. See Apple's recent famous SSL bugs.
6. Underscores should not be allowed in variable names. That really does uglify the source as well as making the names longer than they need be.
7. Many other language niggles I can't recall just now...
Actually, I'd like to see a compiler enforce style. You know, braces in the right place, 4 space indentation, no tabs, no assignments in conditional statements and so on.
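The one-liner hazard from point 5 can be sketched in standard C. This is a simplified illustration of the pattern behind the "goto fail" bug, not Apple's actual code; check(), verify_unbraced() and verify_braced() are hypothetical names made up for the example.

```c
/* Hypothetical check: returns 0 when x is acceptable, -1 otherwise. */
static int check(int x)
{
    return (x > 0) ? 0 : -1;
}

/* Without braces: the duplicated "goto fail" is indented as if it
 * belonged to the if, but it always executes. The second check is
 * skipped and err is still 0, so a bad b is reported as success. */
int verify_unbraced(int a, int b)
{
    int err = check(a);
    if (err != 0)
        goto fail;
        goto fail;              /* duplicated line: runs unconditionally */
    err = check(b);
    if (err != 0)
        goto fail;
fail:
    return err;
}

/* With compulsory braces, a stray duplicate line would sit visibly
 * outside the block, and both checks are performed as intended. */
int verify_braced(int a, int b)
{
    int err = check(a);
    if (err != 0) {
        goto fail;
    }
    err = check(b);
    if (err != 0) {
        goto fail;
    }
fail:
    return err;
}
```

Calling both with a good first value and a bad second value shows the difference: the unbraced version returns 0 (success), the braced one returns -1.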
I agree about staying away from "C" in the name. If it's not actually standard C call it something else to avoid confusion.
Depends who the target users may be. If it's expert programmers, throw in all the conditionals and macros and inline asm you like. Of course they won't use it because they already have C.
If it's the non-expert, beginner or casual programmer then KISS rules.
Or, hey, it's your language, best to make it the way you like it!
Other tools have inline ASM, and I'd rank that quite high on any small microcontroller.
Mostly I agree, but your 1. above, nicely shows why this feature is needed. (used sparingly, of course)
That's why they are lower in the list.
They are useful, if you want to paste in other boiler plate info, and also good to quickly remove blocks of code.
Plus, the code you are porting, likely has them too.
Best to reduce the changes, where it is easy to do.
Uglify is entirely in the eye of the beholder, but how exactly DO you guess what size INT is when you pick up new code for a new MCU?
I prefer the code to be self explanatory, & quickly scanned, I'm less worried about 'complexity' & perceptions.
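To make the type-size point concrete, here is a minimal sketch in standard C (not PropC). The helper names int_bits() and add_u32() are invented for the illustration. The width of plain int depends on the compiler and target, while a name like uint32_t carries its size with it.

```c
#include <limits.h>
#include <stdint.h>

/* Width in bits of plain "int" on whatever target compiles this.
 * The C standard only guarantees at least 16 bits. */
unsigned int_bits(void)
{
    return (unsigned)(sizeof(int) * CHAR_BIT);
}

/* With a fixed-width type the size is visible in the source:
 * this addition wraps modulo 2^32 on every conforming target. */
uint32_t add_u32(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b);
}
```

A reader porting this code does not have to look up the target's data model to know that add_u32(0xFFFFFFFF, 1) wraps to 0, whereas the result of int_bits() has to be checked per compiler.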
This idea has merit, and eg Modula-2 enforces that style by inferring the { and requiring END for }.
Perhaps a nag warning, so users can compile less formal code.
Hehe, more 'eye of the beholder' stuff.
I use them a lot, and find the code is quite the opposite of uglified; the human eye can read it faster.
Maybe, but is that really the place of a command line tool ?
I could understand doing most of that in an Editor/IDE, where there was an automated 'enforce style' button.
I found the thread by ersmith, where he has the nice spin -> p2asm tool
http://forums.parallax.com/discussion/164187/fastspin-compiler-for-p2/p1
This uses a flat Name-space, and in-line asm this way, so you could copy that ?
one example he gave which produces
Fantastic that you have done this all on your own while hardly posting on the forum!
You will get lots of why did you do it this way, why didn't you do that. It's your version after all.
I will take a look at the docs a little later as I have some things to do just now. Perhaps a snippet of the docs posted here might help others see what you have done without downloading everything.
Looking forward to trying what you have done, and what you can do with it
There is a separate pdf that is only 137k, gives a quite good overview and examples
https://sourceforge.net/projects/propc2asm/files/Doc/PropC_UM.pdf/download
Sounds like a fun time. I look forward to having the silicon done so I can program, too.
Secondly, I would like to thank everyone for all of the comments and ideas. Frankly I don't intend to change anything. PropC does exactly what I want it to do. It is what it is. Now that it is published I intend to wash my hands and move on to my next project.
BTW for my next project I will be using PropC a lot. I am building an arbitrary waveform generator so I can learn about the math behind musical chords. Woo Hoo.
Mike
Well, you could certainly benefit from using the Prop2 for that. It has LOG/EXP and SIN/COS/ROTATE/VECTOR functions in the CORDIC. It has excellent DACs, too.
-Phil
I also dabbled in writing a C compiler, which was targeted for the P2. The compiler is called taz, and it's written in C. I'm not a C# programmer, but I looked at your code, and it looks very well organized -- much cleaner than my code. I also like your documentation. It's very helpful in understanding your code.
Not quite. My priorities for PropC language design were to allow access to all of the features of Propeller assembly and to maximize run-time performance, not compatibility. To do this I went thru the Propeller users manual, chapter 3, and added syntax to PropC for each feature, locks, waits, etc.
P2 has features not present on the P1. For example, auto increment of pointers on main memory access. If I just updated the code generator I would lose this feature and take a large performance hit. To support the P2 I would start by updating the BNF to match the P2 feature set. Then these updates would ripple thru the entire project. This would not be a complete re-write, but it's much bigger than a tweek to the last stage. (Why is twerk in the dictionary but not tweek???)
Mike
Because it's spelled tweak, not tweek. Sometimes correct spelling is important.
For P2, how about the following:
Install SimpleIDE with GCC and a modified P2 propeller.h for the P2 register names and addresses.
Run the GCC cc1.exe with the -E option to run the C preprocessor only, with output to a new file. On my system it is found in this directory:
C:\Program Files (x86)\SimpleIDE\propeller-gcc\libexec\gcc\propeller-elf\4.6.1\
Run the C program through Dave Hein's (ie, your) cspin program, thread here:
forums.parallax.com/discussion/119342/cspin-a-c-to-spin-converter/p1
Run the converted spin program through ersmith's Spin Convert, spincvt, which can convert Spin to Pasm or C for P2, but we want P2 Pasm.
forums.parallax.com/discussion/163635/interactive-spin-to-pasm-or-c-converter/p1
Voila, C to Pasm on P2 via Spin!
I'm not quite following this, as compilers will optimize by context.
Yes, that is more than a direct code generator, but looking at 8 bit MCUs for example, you can see code like
HiByte = Integer >> 8;
here, no one expects the 8b compiler to actually load 16b, and shift right 8 times, but they would expect
Integer = Integer >> 6;
to load 16b and shift 6 times.
Likewise
Boolean = ! Boolean; uses a CPL BIT opcode, where that exists.
In your case an array access with Ptr++, would be expected to use the auto-inc opcode in P2, but a separate INC in P1.
The source is identical.
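The statements above can be collected into a small standard C sketch (plain C, not PropC). The function names are made up for the illustration, and the comments describe what a compiler might reasonably emit, not what any particular compiler is guaranteed to do.

```c
#include <stddef.h>
#include <stdint.h>

/* On an 8-bit MCU a compiler can simply read the upper byte here
 * instead of loading 16 bits and shifting right eight times. */
uint8_t hi_byte(uint16_t value)
{
    return (uint8_t)(value >> 8);
}

/* Here there is no shortcut: a 16-bit load and a shift by 6 is the
 * expected result. */
uint16_t shift6(uint16_t value)
{
    return (uint16_t)(value >> 6);
}

/* The *ptr++ access is the kind of pattern that could map to an
 * auto-increment memory access on P2 and to a read followed by a
 * separate add on P1, with no change to the source. */
uint32_t sum_longs(const uint32_t *ptr, size_t count)
{
    uint32_t total = 0;
    while (count-- > 0) {
        total += *ptr++;
    }
    return total;
}
```

The results are the same on every target; only the instruction selection in the back end would differ.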
PropC does not support ++ --.
Mike
https://docs.google.com/document/d/1O27nO2tMjBTvUNblFFRtEp5DHvcvsSGXcPvH9FaJ9u8/pub
which is linked from post #1 in this thread, presently at 9b, with 9c? due very soon.
http://forums.parallax.com/discussion/162298
Some elements need expanding in the DOCs as this example shows
http://forums.parallax.com/discussion/164507/cmpx-cmpsx-z-flag-issue