Is an onboard compiler possible?
Oldbitcollector (Jeff)
Posts: 8,091
I'm curious if some of the expert level programmers here
feel as if an on-board compiler might be possible for the
propeller?
Something that could read a .spin text file and generate
a binary? Do we have enough access to the source to
create such a tool?
OBC
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
New to the Propeller?
Getting started with the Protoboard? - Propeller Cookbook 1.4
Updates to the Cookbook are now posted to: Propeller.warrantyvoid.us
Got an SD card? - PropDOS
Need a part? Got spare electronics? - The Electronics Exchange
Comments
They started out with a DEC PDP-7 minicomputer with 8K of 18-bit words and started to make big progress when they got a PDP-11 with 24K bytes of RAM.
So I think with an SD card we are in with a chance.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
For me, the past is not over yet.
That was a fine article. If they did it, why can't we? It all depends on what you want. Full-blown C is not possible, but some cut-down version may well be, as they explain.
My crazy idea is to write a modified Spin interpreter that, instead of reading from hub RAM, reads directly from the SD card. That would allow a rather large file to be interpreted, but of course it would be rather slow.
But for now, what about a Prop assembler that runs on the Prop? That must be doable; after all, old CP/M systems could rebuild themselves in 24K bytes. Keep it simple: limit all symbols to, say, 12 characters so we don't need to worry about strings. String comparison becomes a compare of three longs. Whatever tricks it takes. The Prop instructions are very regular, so this helps a lot.
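A rough sketch of the 12-character symbol trick, written in Python only for illustration (on the Prop it would be three long compares in PASM or Spin). The names pack_symbol and symbols_equal are made up here, and zero padding with little-endian packing is an assumption, not something the post specifies.

```python
import struct
from typing import Tuple

MAX_SYMBOL_LEN = 12  # limit suggested in the post


def pack_symbol(name: str) -> Tuple[int, int, int]:
    """Pad a symbol to 12 bytes and view it as three 32-bit longs."""
    raw = name.encode("ascii")
    if len(raw) > MAX_SYMBOL_LEN:
        raise ValueError(f"symbol longer than {MAX_SYMBOL_LEN} characters: {name!r}")
    padded = raw.ljust(MAX_SYMBOL_LEN, b"\x00")  # zero padding is an assumption
    return struct.unpack("<3I", padded)          # three little-endian longs


def symbols_equal(a: Tuple[int, int, int], b: Tuple[int, int, int]) -> bool:
    """String comparison reduced to three integer compares."""
    return a[0] == b[0] and a[1] == b[1] and a[2] == b[2]


# A symbol table can then be keyed on the packed form directly.
symbol_table = {pack_symbol("loop"): 0x010, pack_symbol("count"): 0x1F0}
```

Because every entry has the same fixed size, the table needs no string storage or length bytes, which is most of the appeal on a machine with 32K of hub RAM.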
Of course the first thing needed is an editor. Forget about free-form text editing. Make it line and field based: label, condition, instruction, source, destination, effects. Possibly comments hanging off the end. In fact the editor could pretty much "assemble" straight into memory as you type.
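To make the field-based idea concrete, here is a minimal sketch in Python of a source line held as separate fields rather than free text. SourceLine and render are hypothetical names, and the listing format (destination before source, apostrophe for comments) follows ordinary Propeller assembly layout rather than anything the post prescribes.

```python
from dataclasses import dataclass


@dataclass
class SourceLine:
    """One editor line stored as fields, so no free-form parser is needed."""
    label: str = ""
    condition: str = ""
    instruction: str = ""
    destination: str = ""
    source: str = ""
    effects: str = ""
    comment: str = ""

    def render(self) -> str:
        """Rebuild a readable listing line from the stored fields."""
        operands = ", ".join(p for p in (self.destination, self.source) if p)
        parts = [self.label, self.condition, self.instruction, operands, self.effects]
        text = " ".join(p for p in parts if p)
        if self.comment:
            text += " ' " + self.comment
        return text


line = SourceLine(label="loop", condition="if_z", instruction="add",
                  destination="count", source="#1", effects="wz")
print(line.render())
```

Since each field is validated as it is entered, the assembler proper only has to look up the mnemonic and symbols and pack the bits.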
Looking up a bit, BASIC must be possible. Sinclair had a BASIC running on the ZX81 in 8K of ROM with at most 16K of RAM. It had the interesting idea that, for example, hitting G on the keyboard in programming mode produced GOTO in the BASIC. So it tokenized as you typed!
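A small Python sketch of tokenizing at the keyboard. Only the G to GOTO pairing comes from the post; the other keys and all of the token byte values are invented for illustration and are not the real ZX81 codes.

```python
# Hypothetical key-to-keyword table; only G -> GOTO is taken from the post.
KEYWORDS = {"G": "GOTO", "P": "PRINT", "L": "LET"}

# Invented one-byte tokens starting at 0x80 (not the ZX81 character codes).
TOKENS = {word: 0x80 + i for i, word in enumerate(sorted(KEYWORDS.values()))}
KEYWORD_FOR_TOKEN = {token: word for word, token in TOKENS.items()}


def tokenize_line(keys: str) -> bytes:
    """The first key selects a keyword token; the rest is stored as ASCII."""
    if not keys:
        return b""
    first, rest = keys[0].upper(), keys[1:]
    if first not in KEYWORDS:
        raise ValueError(f"no keyword on key {first!r}")
    return bytes([TOKENS[KEYWORDS[first]]]) + rest.strip().encode("ascii")


def list_line(stored: bytes) -> str:
    """Expand the stored token back into text for listing."""
    if not stored:
        return ""
    word = KEYWORD_FOR_TOKEN[stored[0]]
    return f"{word} {stored[1:].decode('ascii')}".rstrip()


stored = tokenize_line("G100")
print(len(stored), list_line(stored))
```

The stored line is shorter than its listing, and the interpreter never has to scan for keywords at run time because the editor already did it.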
Just thinking.....
Don't forget femtoBasic and TinyBasic (on Hydra CD).
Forth systems have the compiler on board (try PropellerForth, only 10 kilobytes), and the size varies depending on what other features are needed. PF uses simple serial port I/O for the console and has no VGA or SD-card modules built in, nor any editor of its own. It has nice words for EEPROM access, so it would be possible to expand it with an editor and assembler.
I made a small assembler for PF, but have had no time to debug it fully. Nor have I heard much interest in it here...
Two others and I implemented a JVM running Javelin code.
The Javelin runs its JVM and virtual peripheral code + native functions
in 4K 12-bit words on an SX48 (3K for the JVM, 1K for VP and native functions).
Implementing the JVM in Spin took 12K, and setting aside 16K to hold
the Java program binary image only left 4K of main RAM.
Even worse was the implementation of the VP code: only 2 virtual
peripherals could be implemented in 1 cog. The Javelin requires 6 VPs (3 cogs).
I think the fact that flags are included in every assembly word
leads to a large waste of memory. Many assembly instructions do not
need to test flags, nor do they need to set flags, so these bits
in an instruction are wasted. I believe if the cogs were 2K addressable
with simple bytes they could execute much larger programs
(1984 instructions versus 496 now).
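For reference, the arithmetic behind those figures can be sketched in Python. The field layout below (6-bit opcode, 4 effect bits, 4 condition bits, 9-bit destination, 9-bit source) is how I recall the Propeller instruction word from the manual, so treat it as an assumption and check it there; encode and decode are illustrative helpers, not part of any tool.

```python
# Assumed field widths of a 32-bit Propeller instruction, most significant first.
FIELDS = [("opcode", 6), ("zcri", 4), ("cond", 4), ("dest", 9), ("src", 9)]

COG_USABLE_LONGS = 496   # figure quoted in the post
BYTES_PER_LONG = 4
COG_BYTES = COG_USABLE_LONGS * BYTES_PER_LONG  # same RAM counted in bytes


def encode(**values: int) -> int:
    """Pack named field values into one 32-bit word."""
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def decode(word: int) -> dict:
    """Split a 32-bit word back into its named fields."""
    fields, shift = {}, 32
    for name, width in FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields


EFFECT_AND_COND_BITS = 4 + 4  # bits present in every word, used or not
print(COG_BYTES, EFFECT_AND_COND_BITS / 32)
```

So 496 longs is 1984 bytes, and the 1984-instruction figure holds only if every instruction fits in a single byte. Under the assumed layout the effect and condition fields take a quarter of each word.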
regards peter
This is the perennial argument between RISC and CISC instruction sets. RISC machines have to have relatively long words with all the fields present in every instruction. They're quite wasteful of memory bandwidth for instruction fetching. They're also very fast at instruction decoding and the instruction fetch and decoding hardware is very simple. CISC machines have relatively short words with fields present when needed. Instructions are variable length or of different formats depending on function. Use of memory for instructions is much more efficient. Instruction fetch and decoding hardware is much more complex and can be much slower (or much more complex or both).
I can't speak for Chip, but I suspect that the use of a pretty pure RISC architecture was the only way to fit 8 processors on a reasonably small die.
Post Edited (Mike Green) : 7/14/2008 3:16:42 PM GMT
Frustrating thing is that for the lack of another 40K of RAM I could have the Prop host an 8080 C compiler (From CP/M). Waiting for Prop II.....
the core code of the jvm from spin to pasm, but that too appears to require 4 cogs
or so which does not make it easier. Perhaps for propII.
regards peter
If code space for the JVM is the problem, as I assume from your desire to spread it over many cogs, why not make use of the Large Memory Model (LMM) idea instead? That is what I finally resorted to for the less-used parts of the 8080 emulator. It was either that or emulate some instructions with code in another COG, which seemed messy to me and wasteful of COGs.
What would be cool is if you could use an LMM virtual machine compatible with that of ImageCraft's C compiler (or the same one, if they don't mind). Then we could integrate C and Java code, or have Java with native C methods.
Meanwhile someone else might want to compile Spin to the same LMM virtual machine and ....
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Paul Baker
Propeller Applications Engineer
Parallax, Inc.
The spin version could be a little faster if all the current calls were embedded in a case statement, but it would be pretty hard to follow. Spin method calls are pretty expensive @ ~3.3us with no parameters to ~5us with one.
LMM may be possible since it would eliminate the need to reproduce certain accessor functions in each cog running the JVM. There is a certain amount of IDE overhead required, and that would have to run in a second LMM. Code produced by the C LMM at (~3x size of spin) would be too big. Still one could try an "asm" LMM.
I've tried implementing the IDE stuff in one cog, but it's too big. Also, one might be able to squeeze 3 VPs into one cog with a little rework. The margin would be tight and everything has to be done in 6.86us - 2 VPs fit comfortably now.
For the PropII (if it's ever released):
Seems JVM is mostly off topic in this thread though.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Post Edited (jazzed) : 7/15/2008 2:50:07 AM GMT
Certainly possible but quite a lot of effort I would expect. Chip has it planned for the Prop II ( but, at last reports, likely not to be in the initial release ).
The early micros usually had tokenisers and languages which could be easily 'compiled' but were quite simplistic and restrictive in nature. How long before BASICs got away from just 26 variables, A-Z? A lot of checking was also done at run time rather than compile time; if a subroutine wasn't there, the program would stop when trying to find it. Those languages didn't have what people expect and demand these days. Spin isn't overly complicated but it's not simplistic either.
Mike's idea of splitting a compiler into parts or passes is a very good idea and probably the only sensible option. There are implementation issues and resource constraints so the design gets more complicated than a PC-based compiler where one can casually ask for as much Ram as one needs and usually get it. Symbol tables probably have to be compact rather than just ASCIIZ strings, and even each pass may need its own overlays. All easy enough to handle but it starts to add up to more effort. Chip obviously believes the Prop II will have enough resources to do it.
It should be possible to do. I'm afraid I forget who is working on turning Chip's ROM interpreter into LMM code, but once that's done it should be fairly easy to change reads of Hub into calls which get the next byte/word/long from SD card. Even executing LMM direct from boot EEPROM has its advantages, maximising RAM for data storage. Once the SD/EEPROM executing interpreter is loaded and running, almost the entire 32KB ( 256KB on Prop II ) is available for data use.
As Ale notes, add caching and it hopefully won't be that slow. Extra Cogs can be used for caching and pipelining. The LMM code being executed doesn't have to be 32-bit PASM; it can be something else. As Peter comments, 32-bit PASM, while fast, isn't compact. The downside of something more compact is an extra overhead in expansion and execution time. A Thumb-style LMM could deliver reasonable compactness with minimal expansion overhead.
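A minimal Python sketch of the "replace Hub reads with cached SD reads" idea. CachedReader, the read_sector callback, the 512-byte sector size and the four-sector least-recently-used cache are all assumptions chosen for illustration; a real version would live in a cog and talk to an SD driver.

```python
from collections import OrderedDict

SECTOR_SIZE = 512  # assumed sector size for the backing store


class CachedReader:
    """Serve byte and long fetches from a few cached sectors."""

    def __init__(self, read_sector, max_sectors=4):
        self._read_sector = read_sector   # hypothetical driver callback
        self._cache = OrderedDict()       # sector number -> sector bytes
        self._max = max_sectors
        self.misses = 0

    def read_byte(self, address: int) -> int:
        sector, offset = divmod(address, SECTOR_SIZE)
        if sector in self._cache:
            self._cache.move_to_end(sector)      # mark as most recently used
        else:
            self.misses += 1                     # slow path: go to the card
            self._cache[sector] = self._read_sector(sector)
            if len(self._cache) > self._max:
                self._cache.popitem(last=False)  # evict least recently used
        return self._cache[sector][offset]

    def read_long(self, address: int) -> int:
        """Assemble a little-endian 32-bit value from four byte fetches."""
        data = bytes(self.read_byte(address + i) for i in range(4))
        return int.from_bytes(data, "little")


# Stand-in for an SD card: a 2048-byte image held in memory.
image = bytes(range(256)) * 8
reader = CachedReader(lambda n: image[n * SECTOR_SIZE:(n + 1) * SECTOR_SIZE])
print(reader.read_long(0), reader.misses)
```

Straight-line code and small loops mostly hit the cache, so the slow card access is paid once per sector rather than once per fetch. A long jump outside the cached sectors costs a full sector read.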
Which then brings us to the tools needed to implement a compiler, get it working and debugged. That's most likely going to require quite a good degree of experience and knowledge in a number of areas, some of it quite specialised.
There's enough information available to create a self-hosted compiler. It's really a question of finding the people who can understand that information and have the skills and interest and will power needed to implement such a thing.
Two factors are likely to limit the number of people who might step forward: a lack of belief that the effort is worthwhile, and the fact that, as Chip is planning to do this anyway, it could ultimately be wasted effort.
Post Edited (hippy) : 7/15/2008 2:00:35 PM GMT
If so then things like FemtoBasic, Forth and so on may fit the bill. The main problem I've run into in this area is that the code for the compiler, tokeniser or interpreter takes up so much Hub space that there's too little left for useful end-user programs. To regain space means removing functionality which in turn reduces the attractiveness. Adding BIOS, multi hardware support, hardware pinout portability and all manner of useful things improves usability but reduces the ability to use that usability.
With interpreters which can load and/or execute code direct from SD or Eeprom that may not be so much of a problem. I think most of these problems will go away when the Prop II finally materialises. Although the tools will naturally expand to add more functionality there should still be plenty of free resources for end-user use.