The performance of the current Spin VM could be improved by almost a factor of 2 if we had more cog memory to work with. With only 496 longs many of the Spin opcodes have to be implemented in common code with lots of conditionally executed instructions. The Spin VM is actually a good choice for implementing C. It's a stack-based VM, which works well with RPN processing and function calls. A C-to-Spin-bytecode compiler would also interoperate with existing Spin code, which is a definite advantage.
For me, the ideal C compiler would be capable of generating PASM, LMM PASM and Spin bytecodes. I think the different targets would be implemented at the RTL (Register Transfer Language) level. It would probably be good to integrate the Spin compiler into the GCC tool chain as well, but it's not necessary. This would be done at the front end, where Spin would be converted to Gimple. Spin is a much simpler language than C, so it shouldn't be too hard to convert Spin to Gimple.
I hope that the IDE and compiler are done as separate efforts, where the compiler can be used independent of the IDE. The job of the compiler is well defined, whereas the IDE can be open ended. Hopefully, the compiler could be implemented more quickly that way.
Spin bytecodes are more compact than LMM PASM. However, there's no reason that portions of a Spin program couldn't be implemented in LMM PASM, or even PASM. The Spin interpreter could include an LMM interpreter as well, or the LMM portion of the code could be executed in a separate cog.
I read the thread fairly quickly this morning. I'm finding it more necessary than ever to efficiently focus my efforts in the right places lately.
What we're interested in considering is a cross-platform set of tools with the possibility of supporting different languages. Whether or not GCC/GNU with Eclipse is practical we are not sure at this point - our starting point is a simple discussion among the interested folks who may contribute, develop or even review.
Therefore, if you replied to me via e-mail and expressed your interest then I am including you in a Webinar invitation for Tuesday, May 17th at 8:00 am Pacific. Jazzed will open and provide a 10-15 minute overview to us, then Chip will talk about how Prop 2 architecture could be used with GCC/GNU. After that we'll just have an open discussion and I'll expect to conclude no later than 10 am.
You should receive your webinar invitation from me this morning, before noon Pacific.
The GoToWebinar tool is a nice way to share screens, share audio and to manage a group beyond a few people. Please don't get the idea it's a one-way communication thread; anybody can talk or share a screen. It's the closest format to an actual face-to-face meeting, which I'm sure we'd all prefer given the choice. My role will primarily be as the webinar organizer, which puts me in the position of muting/un-muting participants, passing screen control, etc.
The initial number of participants appears to be about 8-12 developers, but [should we proceed] I expect our actual development team to be between 3-5 people. Their work would be documented on our forums or whatever open-source tool development locations are deemed appropriate.
Also, I want to thank all of you for your support. The community on these forums is truly unique and beneficial. I couldn't imagine this business without all of you.
Ken Gracey
I'll be in the Rocklin area on the 17th.
Roy, I'll drop you an invitation via e-mail. You can participate over the internet connection in our office should you desire. - Ken
Thanks for the clarification, Ken. I'm looking forward to the discussion!
Ross.
Is this meeting to discuss GCC/compilers or Eclipse or both? It almost seems like it would be better to have two separate meetings since I suspect different people will be interested in the compiler and GUI aspects of this project.
Ken,
Can you please make sure the webinar is captured and get a copy emailed to me? I will be without internet access that day.
I've been prepping for UPEW, and I just read over the whole thread... here are my thoughts:
1) going for the GCC tool chain buys us a lot of marketing advantages, but it is also bloated compared to SDCC, LLVM, etc. The odds are running the whole GCC tool chain on a Prop2 is going to be a LOT slower than LCC, SDCC or LLVM based compilers.
2) regardless of tool chains, I can see two ideal targets, and one less ideal
A) LMM, improved for the Prop2 - It should be around 20MIPS without even using FCACHE!
B) CVM - custom byte code designed for C compiler support
C) Spin byte codes - not sure how good a fit for accessing C structures and stack frames
Now with regards to the code generator - regardless of compiler platform:
A) LMM will have decent speed right off the bat; should be faster than code compiled for an Arduino as even a simplistic LMM will attain ~20MIPS - but those are 32 bit MIPS!
If the code generator is well documented and easy to modify, it would also be possible to use FCACHE to generate code approaching the theoretical max of 160MIPS by generating FCACHE blocks for inner loops, and using FCACHE to load floating point and string functions on the fly. As one simple example, strcmp() would be hub access limited as FCACHED code!
Using FCACHE, 100MIPS average should be achievable, however this requires careful thought on the design of the kernel, and really needs about 128 longs in the LMM kernel cog to really shine. This also allows for in-line almost full speed PASM code - as inline code would be loaded into the FCACHE buffer for execution.
Close to native MIPS was the reason I had an FCACHE primitive in the initial LMM kernel when I announced LMM - and if the code generator is nice and well documented, I will help generate better code.
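To make the strcmp() example concrete: the whole routine is just a short loop, so an FCACHE-aware code generator could drop something like the following into the FCACHE buffer (shown here in plain C; the real FCACHEd version would be a handful of PASM instructions, so the figures above are estimates, not measurements), and the only remaining cost would be the hub accesses for the string bytes.

```c
/* Illustrative only: a strcmp()-style loop small enough to live
 * entirely in an FCACHE buffer.  Once the code is in cog RAM, the
 * loop is limited by hub reads of *a and *b, not by code fetches. */
int str_compare(const char *a, const char *b)
{
    while (*a && *a == *b) {
        a++;
        b++;
    }
    return (unsigned char)*a - (unsigned char)*b;
}
```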
I would suggest starting with the following design goals (this is how I would do it if I were attacking this alone)
First code generator:
i) small, tight LMM2 kernel, keep as much space in the cog free as possible
ii) at least 128 longs kept free as an FCACHE buffer
iii) hand-optimized FCACHE code for common lib functions (str*, float*)
iv) demand-load str* and float* from a standard FCACHELIB for them
Second code generator, after first one is solid and debugged
v) serious LMM optimizer, identifies small loops and converts them to in-line FCACHED code
vi) a limited number of 'register' variables mapped directly to cog locations (outside of the 128-long buffer), used for the variables of FCACHED small loops
Now before you guys say that it is better to have many primitives in the kernel, remember that a tight FCACHELOAD loop can use READQLONG and DJNZ to load the FCACHE area at 2 cycles per long, so any "primitive" that loops will run almost exactly as fast FCACHED as it would if permanently resident in the kernel!
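As a rough host-side model of that demand-load idea (this is C standing in for what would really be a few PASM instructions built around READQLONG and DJNZ; the names and sizes here are made up for illustration):

```c
#include <stdint.h>
#include <string.h>

#define FCACHE_LONGS 128                     /* the reserved buffer, per ii) above */

extern uint32_t hub_ram_longs[];             /* stand-in for hub memory            */
static uint32_t fcache[FCACHE_LONGS];        /* stand-in for the cog's FCACHE area */
static uint32_t resident = 0xFFFFFFFFu;      /* hub address of the loaded routine  */

/* Demand-load one library routine (str*, float*, ...) into the FCACHE
 * buffer.  If the routine is already resident, the copy is skipped -
 * which is why a looping "primitive" costs about the same FCACHEd as
 * it would if it were permanently resident in the kernel. */
void fcache_load(uint32_t hub_addr, uint32_t nlongs)
{
    if (hub_addr != resident) {
        memcpy(fcache, &hub_ram_longs[hub_addr / 4], nlongs * sizeof(uint32_t));
        resident = hub_addr;
    }
}
```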
Now looking at "CVM" - a custom C VM for Prop2
CVM could be 2-3 times faster than a Spin VM, because it could use an FCACHE-like mechanism to demand load "slow" "CISC" opcodes, and be optimized like crazy for the common op codes. It would also be *MUCH* tighter code-size wise than LMM2. It is an interesting option, however it would still be (at a guess) 4-10 times slower than an FCACHE-using LMM2 version.
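A sketch of what that split between resident and demand-loaded opcodes might look like (the opcode names and encodings below are invented purely for illustration; this is not a proposal for the actual CVM instruction set):

```c
#include <stdint.h>

/* Hypothetical CVM opcodes - invented for this sketch only. */
enum { OP_PUSHI, OP_ADD, OP_LOADB, OP_HALT, OP_SLOW_BASE = 0x80 };

static int32_t stack[64];

void cvm_run(const uint8_t *bytecode, const uint8_t *data)
{
    const uint8_t *pc = bytecode;
    int sp = 0;

    for (;;) {
        uint8_t op = *pc++;
        if (op >= OP_SLOW_BASE) {
            /* Rare "CISC" opcode: this is where an FCACHE-like loader
             * would pull the handler into cog RAM before running it,
             * e.g. run_overlay(op, &sp, &pc);  (hypothetical)        */
            continue;
        }
        switch (op) {                 /* common opcodes stay resident */
        case OP_PUSHI: stack[sp++] = (int8_t)*pc++;          break;
        case OP_ADD:   sp--; stack[sp - 1] += stack[sp];     break;
        case OP_LOADB: stack[sp - 1] = data[stack[sp - 1]];  break;
        case OP_HALT:  return;
        }
    }
}
```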
As for using the Spin VM - I am concerned about efficient access to global and local data structures, I suspect it would be painfully slow. This is where CVM could shine, with opcodes specifically made for referencing structure members.
Long term, I think the best approach would be supporting both a "CVM" and "LMM2" kernels, with heavy use of FCACHE for LMM2.
Short term, a simple lean / mean LMM2, with demand loaded str*, mem* and float* library calls, would give decent performance that an optimizer that groks FCACHE could later greatly improve on.
Please note that both LMM2 and CVM would allow for special debug kernels, which would make it easy to have very full-featured debuggers.
1) going for the GCC tool chain buys us a lot of marketing advantages, but it is also bloated compared to SDCC, LLVM, etc. The odds are running the whole GCC tool chain on a Prop2 is going to be a LOT slower than LCC, SDCC or LLVM based compilers.
I may be wrong about this but I don't think it's anyone's intention to run the tool chain itself on the Propeller 2, just to generate code for the Propeller 2.
Should the concept of threading be addressed; because it would be useful and because it would be a good way to show off having multiple COGs?
C.W.
The reason I did not address threading yet is that I do not have definitive documentation on how Chip is implementing it; also, LMM2 would allow time-slicing a cog to add C-threads-style threading even without hardware threading support.
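For what it's worth, the time-slicing idea can be pictured as the LMM fetch loop simply keeping one PC and SP per thread and rotating to the next thread every N instructions. The structure and the fetch_long()/execute() stubs below are invented for the sketch; they are not Chip's threading mechanism.

```c
#include <stdint.h>

#define NTHREADS 2
#define SLICE    16                 /* LMM instructions per time slice */

struct lmm_thread {
    uint32_t pc;                    /* hub address of next instruction */
    uint32_t sp;                    /* this thread's stack pointer     */
};

static struct lmm_thread threads[NTHREADS];

/* Stand-ins for the real kernel's fetch and execute steps. */
extern uint32_t fetch_long(uint32_t addr);
extern void     execute(uint32_t instr, struct lmm_thread *t);

void lmm_timeslice_loop(void)
{
    int cur = 0;
    for (;;) {
        struct lmm_thread *t = &threads[cur];
        for (int i = 0; i < SLICE; i++) {     /* run one slice */
            uint32_t instr = fetch_long(t->pc);
            t->pc += 4;
            execute(instr, t);
        }
        cur = (cur + 1) % NTHREADS;           /* round-robin switch */
    }
}
```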
David:
I personally would love to run the toolchain on Prop2!
I see no reason it would not - I used to run Minix, and later Coherent, on a 4 MHz 80386 with 4 MB of RAM... a Prop2 will definitely out-muscle that, as it will be more than 20x faster with LMM2 running on a single cog!
The LMM2 kernel should deal with a simple, clean 32-bit address space; that way it will be relatively easy to support all the different XMM interfaces that are bound to crop up with the Prop2.
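In other words, the kernel's memory read boils down to one address test, with everything above hub RAM handed to whatever XMM driver is present. The size constant and xmm_read_long() hook below are placeholders, not a defined interface:

```c
#include <stdint.h>
#include <string.h>

#define HUB_BYTES 0x00040000u                    /* placeholder hub size         */

extern uint8_t  hub_ram[];                       /* stand-in for hub memory      */
extern uint32_t xmm_read_long(uint32_t addr);    /* hypothetical XMM driver hook */

/* One flat 32-bit address space: a single compare decides whether a
 * fetch is served from hub RAM or forwarded to the external memory. */
uint32_t kernel_read_long(uint32_t addr)
{
    if (addr < HUB_BYTES) {
        uint32_t v;
        memcpy(&v, &hub_ram[addr], sizeof v);
        return v;
    }
    return xmm_read_long(addr);
}
```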
I'm not saying it can't be done. I just don't think it will be of much interest to the commercial market that Parallax intends to target. It seems like more of a hobbyist feature.
1) going for the GCC tool chain buys us a lot of marketing advantages, but it is also bloated compared to SDCC, LLVM, etc.
Marketing advantages are certainly in line with sales improvements
GNU/GCC is bloated compared to smaller compilers like LCC that do not attempt to do ANSI C99 and be compatible with 3 other languages: Objective-C, Compiled Java, and Fortran (Apple lovers may recognize one out of that list).
I'm curious though, what is it about LLVM that would be less bloated?
Isn't LLVM just a machine independent replacement for the more tedious GCC backend?
An LLVM IL interpreter might be just as hairy as a CIL VM.
Guess I should do more reading.
The Spin VM can handle struct accesses with the same efficiency as array and indexed pointer accesses. C structs are really the same thing as arrays, but with variable-size elements. The struct elements can be accessed as byte[structptr + byte_offset], word[structptr + byte_offset] and long[structptr + byte_offset], which is the way I do it in CSPIN. It would be even more efficient to use indexed accesses, such as byte[structptr][byte_offset], word[structptr][word_offset] and long[structptr][long_offset]. In this case the word and long offsets are one-half and one-fourth the byte offsets.
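Put in C terms, a member access is just a constant byte offset from the struct's base pointer, which is exactly what the byte[]/word[]/long[] forms above express (the struct here is only an example):

```c
#include <stddef.h>
#include <stdint.h>

struct item {                 /* example struct only              */
    uint8_t  flags;           /* byte offset 0                    */
    uint16_t count;           /* byte offset 2 (after padding)    */
    uint32_t total;           /* byte offset 4                    */
};

uint32_t read_total(const struct item *p)
{
    /* The generated access is effectively "load long at base + offset",
     * i.e. the Spin-style long[structptr + byte_offset], or with indexed
     * addressing long[structptr][byte_offset / 4].                      */
    return *(const uint32_t *)((const uint8_t *)p + offsetof(struct item, total));
}
```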
I think the Spin VM for Prop 2 should be implemented as a combined Spin and LMM interpreter, similar to how it's done in SpinLMM. In SpinLMM, some of the longer multi-cycle Spin instructions are implemented as FCACHE routines, such as STRSIZE. STRSIZE actually runs faster in SpinLMM than it does in the standard Spin VM because it was optimized as a tighter loop that doesn't have to also implement STRCOMP.
Bill, the CVM that you propose could be an enhanced Spin VM. I don't understand your concern about accessing global and struct elements. This should be just as efficient in the Spin VM. One problem with the Spin VM is that everything must go through the stack. This will be more efficient on Prop 2 because the stack pointer will be able to auto-increment. The Spin VM could be enhanced by adding some register-based instructions that don't go through the stack. Maybe that would provide what you need for the CVM.
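To make the stack-traffic point concrete, here is the same statement, x = a + b, in a purely stack-based encoding versus a register-augmented one. Both opcode sets are invented for the comparison; neither is the actual Spin bytecode.

```c
/* Hypothetical opcodes, for illustration only. */
enum vm_op {
    PUSH_VAR,   /* push a local variable onto the stack         */
    ADD_STACK,  /* pop two values, push their sum               */
    POP_VAR,    /* pop the top of stack into a local variable   */
    ADD_REG     /* register-style op: dest = src1 + src2        */
};

enum { VAR_A, VAR_B, VAR_X };    /* variable slot numbers (made up) */

/* Stack-only form of x = a + b: four instructions, all of which
 * go through the stack. */
static const unsigned char stack_form[] = {
    PUSH_VAR, VAR_A,  PUSH_VAR, VAR_B,  ADD_STACK,  POP_VAR, VAR_X
};

/* Register-augmented form: one instruction, no stack traffic. */
static const unsigned char register_form[] = {
    ADD_REG, VAR_X, VAR_A, VAR_B
};
```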
Here is a thread discussing LLVM briefly in the Propeller forum.
Thanks for thinking out loud about GCC. It makes my presentation job easier.
I'm not saying that an LLVM approach would not be appropriate but I think we need to do it for the right reasons. As far as I know, an LLVM-based tool chain is no faster than a GCC-based one. In fact, the LLVM web page says this about C/C++ support for LLVM:
"LLVM currently has full support for C and C++ source languages. These are available through a special version of GCC that LLVM calls the C Front End."
So, LLVM uses GCC as a front end. That suggests to me that it won't be significantly faster than native GCC. There is a faster compiler but I don't think it claims to generate better code. It mostly talks about using it to detect syntax errors.
I don't think running it natively should be high priority, but I for one would really like it - as would many others, I am sure.
Thanks for the additional info on LLVM.
Jazzed:
I was wrong about LLVM. LCC and SDCC are definitely leaner/meaner than GCC though
Dave Hein:
"It would be even more efficient to use indexed accesses, such as byte[structptr][byte_offset], word[structptr][word_offset] and long[structptr][long_offset]. In this case the word and long offsets are one-half and one-fourth the byte offsets."
Is what I was thinking about for a hypothetical CVM :-) as my main concern was speed; I did not want to have to emit separate byte codes to do the structptr+offset computation separately, but would want single ops as you show above.
I am NOT dead set against using the Spin byte codes, I just have a deep conviction that it would be possible to make a VM that would run C code more efficiently... and also run Spin more efficiently. I really want structs in Spin.
I don't think running it natively should be high priority, but I for one would really like it - as would many others, I am sure.
Sorry, I didn't mean to imply that running native was not worth considering. I meant to say that it may not be that high a priority and that we shouldn't trade off good code generation for good native performance. The priority should be choosing the solution that will produce the best Propeller code that can be achieved in the target development time-frame. If it happens that we can also run it natively that's great too.
I am dead set against using the Spin byte codes, I just have a deep conviction that it would be possible to make VM that would run C code more efficiently... and also run Spin more efficiently. I really want struct's in Spin.
Maybe we should wait until we hear from Chip. It is my understanding that Parallax has not said that the Spin VM for Propeller 2 will be binary compatible with Propeller 1 Spin images. I'm sure Chip will improve the Spin VM for Propeller 2 and the new version could be an acceptable target for GCC. This is especially likely to be true if Chip can be convinced to take C into account when designing the Propeller 2 Spin VM. We might want to encourage Parallax to document the Spin VM instruction set and binary file format this time around as well.
agreed 100% ... frankly I want the code generator documented so well that it will be possible to target multiple VMs.
I meant to write "I am NOT dead set against"... LOL
Totally agree about documenting the SPIN instruction set - to the same level as the PASM instruction set!
The process of writing a GCC backend is already documented with hundreds of pages of text. Whether you or I can understand it even after reading that text is another story. :-)
I suspect LLVM is quite well documented as well if we should decide to go that way. Still, writing a code generator for any compiler is not a trivial task. Ask Ross about that!
After Ross made Catalina, I did take a peek at the LCC backend, and it was comprehensible... after a lot of reading.
I know from personal experience that writing an optimizing LMM code generator is ... umm ... non-trivial. To say the least.
Running the C language compiler tool chain on the Prop - I really don't think that's the way to attract the big time commercial market that Parallax Semiconductor is wanting to address.
C compiled to Spin byte codes - No, again I maintain that would be some kind of novelty joke for the serious C heads who expect their code compiled to lean and mean native instructions.
The LLVM compiler uses GCC as a front end - Yes, true. But for how long? LLVM also has the Clang front end coming along. Preparing for the future might be appropriate. There is a reason the tide is turning that way.
GCC is bloated - So what? Professional devs use it all the time for programming everything from AVRs upwards. They are not going to worry about a hundred megabyte download. As long as it is an "apt-get install propeller-gcc" away:)
Spin - I love it but I feel that it should be left out of this picture totally.
I'm rooting for an enhanced, backward-compatible Spin myself. C is not the answer for everyone -- and most definitely not for me.
-Phil
True, but it is necessary for companies that have $1B+ in quarterly revenue.
Just saying ... Not trying to incite a language war on this professional forum.