A simpler approach to a Prop-based compiler?

Oldbitcollector (Jeff) · 2009-03-26 02:25

Please poke some holes in this theory before I commit to this. :)

Jeff Martin's last two webinars have really inspired me to pay more attention to the
actual code being output by the Propeller Tool, and I've been thinking that
perhaps a true Propeller-based binary compiler might not be as difficult as some of
us have thought if we took the approach of "eating the elephant one bite at a time."

How about, instead of trying to tackle the whole enchilada, including various
objects, etc. (I suspect we would quickly run out of room without a multi-pass setup),
we establish some initial limitations, like tv_text for video out and Combokeyboard
for input? We could simply stream the bytes from a pre-established binary when required.

Perhaps I don't truly understand the full nature of the problems involved here, but
I'm willing to bet that we could have a running demonstration that could handle
conversion of PRINT "HELLO WORLD" into binary by the end of the week.
Heck, it could even be a type of BASIC to Spin binary. <shudder>

One command at a time and build from there. Before I start choking on bytecode,
does this sound like sound reasoning or does it look like this could derail very quickly
due to unforeseen issues?

OBC

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
New to the Propeller?

Visit the: The Propeller Pages @ Warranty Void.

Post Edited (Oldbitcollector) : 3/26/2009 2:30:21 AM GMT

Cluso99 · 2009-03-26 03:24

Let me think about this one. I would like to see a Basic (perhaps your FemtoBasic - not looked at it yet).

Maybe we could write a pasm compiler first, then spin, then all the frills. This could be a group effort as it breaks down into sections easily.

The other alternative is to see if we can get an emulator to run bstc or homespun compilers. Not sure if this is more complicated than it's worth.
<Postedit> Or get an emulator to run PropTool or Propellent ??

Anyway - we do have to get a spin/pasm compiler running on the prop one way or another.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Links to other interesting threads:

· Home of the MultiBladeProps: TriBladeProp, SixBladeProp, website (Multiple propeller pcbs)
· Single Board Computer: 3 Propeller ICs and a TriBladeProp board (ZiCog Z80 Emulator)
· Prop Tools under Development or Completed (Index)
· Emulators: Micros eg Altair, and Terminals eg VT100 (Index)
· Search the Propeller forums (via Google)
My cruising website is: www.bluemagic.biz · MultiBladeProp is: www.bluemagic.biz/cluso.htm

Post Edited (Cluso99) : 3/26/2009 3:29:55 AM GMT

Bill Henning · 2009-03-26 03:34

A cleanly-designed multi-pass compiler is probably the way to go if we want it to fit in 32k of ram.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
www.mikronauts.com - a new blog about microcontrollers

Mike Green · 2009-03-26 04:03

I've posted this stuff before, but I offer it again for any ideas it may provide.

Meta2 is a simple recursive descent compiler compiler now rewritten in Spin. It can compile itself into Spin code. It reads a source file from an SD card, copies it into a 32K area of EEPROM, compiles the source file into another 32K area, then copies that resulting source code to another SD card file. A third 32K area is used for the dictionary. Obviously, this requires a 128K EEPROM or several smaller EEPROMs on I/O pins 28/29. Bean's Prop Dongle fortunately comes this way and the Hydra as well.

Ouroboros is the beginning of a PBasic compiler (to LMM object code) that works the same way. The expression parser and declaration parser mostly work and it will generate code for some of the expression operators as well as do storage allocation. I stopped working on it as it ran out of memory and I had to split it up into phases and ran out of time to work on it. Feel free to use anything that looks like it might be helpful.

heater · 2009-03-26 07:10

Why not give yourselves some room? My suggestion:

1. Do this on the TriBlade platform (or similar) to give yourself some room: external RAM and a file system.
2. Use ImageCraft C as the implementation language. Makes life easier, you can develop/test it on Windows or Linux for example.
3. Have the compiler as XMM, i.e. executing from external RAM.

I have a noddy compiler that compiles a version of Jack Crenshaw's TINY language into LMM. The compiler is written in C and could no doubt be compiled by ImageCraft C to run from external RAM on the Prop (XMM). The size of the compiler executable is only about 44K.

My noddy compiler is very crude and produces very inefficient LMM. I stopped work on it a long while ago when I got as far as being able to pass parameters into functions (There is no way to return values yet!). It would require an assembler running on the Prop for its LMM output to be useful on the Prop (I was using PropAsm on Linux). I'm looking forward to getting this thing compiling on the TriBlade.

Now I'm sure if I can do this with my almost zero compiler expertise, others could do a thousand times better. They just might not want to bother with the challenge of squeezing it into 32K HUB RAM.

Attached are example inputs and outputs of my humble effort. They are both plain text files.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
For me, the past is not over yet.

MagIO2 · 2009-03-26 07:24

Why don't you do it the way it was done on the C64? There you did not have huge source files. When you entered a line it was immediately parsed and converted to bytecode (they called them tokens in those days). When you listed a program, it simply replaced the tokens with the strings again. The token itself worked like a pointer into a jump table. So, when executing a program, the interpreter simply read the next token, looked up the jump address and ran that code.
For the Propeller I'd simply run as many COGs as needed to have the PASM code for all the commands at hand. A management COG takes care of the program flow and puts the next command into a global buffer used for parameter and return-value handling. Each COG reads the token and checks whether it has something to do. If so, it runs the corresponding PASM.
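
As a rough illustration of the tokenise-on-entry and jump-table idea described above, here is a minimal Python sketch. The token values, the keyword list and the handler names are all invented for the example (they are not C64 or Propeller values), and it handles ASCII text only. On a Propeller the handlers would be PASM routines rather than Python functions.

```python
# Hypothetical one-byte token values; a real BASIC would define its own table.
KEYWORDS = {"PRINT": 0x80, "GOTO": 0x81, "END": 0x82}
TOKEN_TO_WORD = {tok: word for word, tok in KEYWORDS.items()}


def tokenize(line):
    """Replace each keyword outside a string literal with its one-byte token."""
    out = bytearray()
    i = 0
    in_string = False
    while i < len(line):
        ch = line[i]
        if ch == '"':
            in_string = not in_string
        if not in_string:
            for word, tok in KEYWORDS.items():
                if line.startswith(word, i):
                    out.append(tok)
                    i += len(word)
                    break
            else:
                # No keyword starts here: copy the character unchanged.
                out.append(ord(ch))
                i += 1
        else:
            out.append(ord(ch))
            i += 1
    return bytes(out)


def detokenize(data):
    """LIST: expand tokens back into keyword text."""
    return "".join(TOKEN_TO_WORD.get(b, chr(b)) for b in data)


def run(program):
    """Dispatch on the first byte of each tokenized line via a jump table."""
    output = []

    def do_print(arg):
        output.append(arg.strip().strip('"'))

    def do_end(arg):
        return True  # signal the loop to stop

    handlers = {0x80: do_print, 0x82: do_end}  # the "jump table"
    for line in program:
        if handlers[line[0]](line[1:].decode("ascii")):
            break
    return output


program = [tokenize('PRINT "HELLO WORLD"'), tokenize("END")]
print(detokenize(program[0]))
print(run(program))
```

Because each line is stored already tokenized, the full source text never has to sit in RAM, which is the attraction for a 32K machine.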

By the way ... is there maybe an SRAM with a serial interface? I'd really prefer to use RAM instead of EEPROM or Flash for this kind of compile job. When you develop software you can easily rack up so many write cycles that the EEPROM/Flash will fail soon.

Oldbitcollector (Jeff) · 2009-03-26 14:47

Reading the threads this morning, I see I'm not the only one with this concept near my forehead.

There is an SPI RAM object (obex.parallax.com/objects/346/)
that could prove useful. I've worked with it just a little myself.

The idea of turning the code into bytecode makes perfect sense. As you said, this is
the way it's been done historically due to system limitations. I'm going to start
playing with some of these ideas this weekend and see what I can come up with.

I'm going to scale things back even further and work toward a simple INA/OUTA
translation directly to binary and see what I come up with. Perhaps I should do
this to EEPROM. IIRC, there is an EEPROM launcher already written.

OBC

stevenmess2004 · 2009-03-27 06:25

I put a fair bit of thought into this a while ago and figured out the following.

It's going to need a multi-pass compiler for several reasons.
1. Variables can be anywhere in spin which means they can be declared after they are used which would be a pain to handle in a single pass compiler
2. Memory required would probably be too big (unless you use external memory)

To make things simple you could start at the bytecode end and work back to spin. So the things that would need doing are:
1. Take some kind of 'spin assembly' and turn it into spin bytecode (shouldn't be hard, except that there isn't any 'spin assembly' defined).
2. Take some spin code and turn it into 'spin assembly' (only within one method; this gets a bit harder because it involves a lot more parsing and needs to convert from infix to postfix).
3. Take some spin code and divide it up into all the different methods, var sections and sub-objects, allocate var locations and a bunch of other stuff. Then there is an assembler for the asm section.

It should be possible to fit all of that in the available memory, but everything will have to be read and written a line at a time to and from an sd card or some other type of memory.
Maybe I should start on a 'spin assembly' assembler...
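
The infix-to-postfix step mentioned in item 2 can be sketched with the classic shunting-yard algorithm. This is only a minimal Python illustration: the four-operator precedence table is a simplification chosen for the example and is not Spin's real operator set or precedence, and there is no error handling.

```python
import re

# Simplified precedence table (an assumption for this sketch, not Spin's).
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def to_postfix(expr):
    """Convert an infix expression string to a list of postfix tokens."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[-+*/()]", expr)
    output, stack = [], []
    for tok in tokens:
        if tok in PRECEDENCE:
            # Pop operators of equal or higher precedence (left-associative).
            while (stack and stack[-1] != "("
                   and PRECEDENCE[stack[-1]] >= PRECEDENCE[tok]):
                output.append(stack.pop())
            stack.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()  # discard the "("
        else:
            output.append(tok)  # identifier or number
    while stack:
        output.append(stack.pop())
    return output


print(to_postfix("cnt + 80 * (a - 2)"))
# ['cnt', '80', 'a', '2', '-', '*', '+']
```

Postfix order is convenient here because a stack-based interpreter can execute it directly: push operands as they appear, then apply each operator to the top of the stack.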

Jack Crenshaw · 2009-03-27 13:58

I'm confused -- which seems to be my default state in this forum.

Seems to me the discussion revolves around four decisions, three of which are binary:

1) develop on host or target Prop?
2) bytecode or native code?
3) single-pass or multipass?
4) Spin, assembler, C, or some other language?

Re the first, I simply don't understand why there's a decision. You've got this lovely host PC, with gigabytes of RAM and gigaHz of speed, and never more than a USB cable away. Only a few years ago, we would have _KILLED_ for such an environment. Why the obsession with running the compiler inside the Prop? Is it simply to be able to say that you did it, or is there some more practical reason?

Decision #2 is part of the speed/code-size tradeoff. Using bytecodes gives smaller "object" files, but slower execution. And vice versa. You can do it either way, and it doesn't (or shouldn't) impact the other choices. Although, some of the combinations don't make much sense, like assembling from native code to bytecodes.

Seems to me, the single-pass vs. multi-pass issue is a non-issue. It's an implementation detail. Yes, high-order languages must deal with forward references. What else is new? It's been that way since time immemorial. Every assembler since SAP has had to deal with this.

Many compiler writers choose to let their compilers generate assembly code. That's the way I did it in my "Let's Build..." series. So did Ron Cain and Jim Hendrix of small C. So do the Unix/Linux C compilers. The rationale is simple enough: The assembler already has code to manage a symbol table and deal with forward references. So let it do its job -- no need to duplicate the effort. OTOH, if you want fast compile speed, generate the native code directly, like Leor Zolman did. Again, an implementation detail.
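
To make the forward-reference point concrete, here is a minimal two-pass assembler sketch in Python. The toy instruction set (`jmp`, `nop`, `halt`), the one-address-per-instruction layout and the `label:` syntax are assumptions made up for the example; they are not PASM or Spin bytecode.

```python
def assemble(lines):
    """Two-pass assembly of a toy language with forward label references."""
    # Pass 1: walk the source once, recording the address of every label.
    symbols, address = {}, 0
    for line in lines:
        line = line.strip()
        if line.endswith(":"):
            symbols[line[:-1]] = address
        elif line:
            address += 1  # every instruction occupies one slot in this toy

    # Pass 2: emit (opcode, operand) pairs; labels are now all known,
    # so a reference to a later label resolves with a plain lookup.
    code = []
    for line in lines:
        line = line.strip()
        if not line or line.endswith(":"):
            continue
        op, _, arg = line.partition(" ")
        code.append((op, symbols[arg.strip()] if arg else None))
    return code


print(assemble(["jmp done", "nop", "done:", "halt"]))
# [('jmp', 2), ('nop', None), ('halt', None)]
```

Only the symbol table has to stay in memory between the passes; the source can be re-read from storage for the second pass.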

That leaves only the choice of language, doesn't it?

I get the feeling I'm overlooking something important. Is the issue that you don't know what bytecodes the Spin compiler uses? If Parallax published its bytecode rules -- if there were a programming manual for the Spin interpreter -- would that solve the problems? Then you could use whatever programming language you wanted, including COBOL.

If that's the problem, has anyone asked Parallax to do that? Is it a matter of proprietary info? If so, I can't imagine why. Seems to me, Parallax would have everything to gain, and nothing to lose, by having other vendors writing compilers that target the Prop.

Or perhaps the issue is, Prop assembler isn't really a stand-alone tool? Does it require the Spin interpreter to remain in place? Maybe the Spin interpreter can't be bypassed? If _THAT'S_ the issue, then the question is: How do we do an end run around the interpreter?

I'd appreciate someone clearing up the fog.

Jack

Oldbitcollector (Jeff) · 2009-03-27 14:15

Jack Crenshaw said...

Re the first, I simply don't understand why there's a decision. You've got this lovely host PC, with gigabytes of RAM and gigaHz of speed, and never more than a USB cable away. Only a few years ago, we would have _KILLED_ for such an environment. Why the obsession with running the compiler inside the Prop? Is it simply to be able to say that you did it, or is there some more practical reason?

My own attitude here is to have the option. Yes, it will be truly difficult to duplicate the many
features we take advantage of from the environment, but the idea of being able to grab my
Propeller, plug in a keyboard, LCD, and battery and program anywhere is compelling.

Chip has an on-board editor in mind for PropII. His motives for this are a little different, as
Parallax will be able to develop an editor which is guaranteed to run without the constant
support headaches that come from Windows applications.

OBC

Oldbitcollector (Jeff) · 2009-03-27 14:20

stevenmess2004 said...

1. Variables can be anywhere in spin which means they can be declared after they are used which would be a pain to handle in a single pass compiler

This is one of the "holes" that I thought might get in the way. The only way I see around this, (maybe) in a single pass is to limit the pre-compiled command structure. (I haven't thrown enough code at this to prove to myself it won't work this way.)

OBC

BradC · 2009-03-27 14:20

Jack Crenshaw said...
Is the issue that you don't know what bytecodes the Spin compiler uses? If Parallax published its bytecode rules -- if there were a programming manual for the Spin interpreter -- would that solve the problems?

Parallax have not "documented" the bytecode as such, however they published the full source code for the spin interpreter. This is well and truly sufficient documentation to allow the writing of a third party compiler.

Jack Crenshaw said...

Or perhaps the issue is, Prop assembler isn't really a stand-alone tool? Does it require the Spin interpreter to remain in place? Maybe the Spin interpreter can't be bypassed? If _THAT'S_ the issue, then the question is: How do we do an end run around the interpreter?

Nope, it's not that either. It's incredibly easy to run any code and not have to run the spin interpreter.

From where I am standing, what I see is an effort to try and be able to use the propeller as a completely standalone "PC" and be able to write, edit and compile native code from the chip as a standalone entity itself. I won't say it can't be done - of course it can, and I won't say it won't be done.. I can even answer the obvious question myself - "Because it's there and I can".

Everyone has an itch to scratch. Mine was to be able to write propeller code in Linux. Someone else wants to be able to write propeller code without a PC. Mind you, I used to write Apple ][ code with a 6502 opcode table, pen, paper and HP-41 calculator. Assembler? Never heard of it.

Guys, as a starting point.. and it can be done *easily*.. try "assembling" spin code by hand and make it blink an LED.
As an "easy" entry, you *could* use bst[c] and the bytecode() operator to allow the compiler to handle the object table and init setup for you...

.. something like this.. hand assembled and not tested, but probably pretty close. Flash an LED on pin one at 0.5 Hz

CON
  _clkmode      = xtal1 + pll16x
  _xinfreq      = 5_000_000

PUB ABCD
  Bytecode($36,$3D,$D6,$1C { dira[ 1]~~
 }, $36, $36, $3D, $D4, $4B { outa[ 1] ^= 1
 }, $3F, $91, $3B, $04, $C4, $B4, $00, $EC, $23 { waitcnt (cnt+80_000_000)
 }, $04, $70 { jmp -16
 }, $32) ' end

<edit> edited to add spaces in the square braces.. grumble..
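
Two of the operands in the listing above can be checked by hand, which is a useful sanity test before writing any assembler. The Python sketch below assumes, going only by the comments in the listing, that the four bytes after $3B are a constant stored most significant byte first, and that the byte after $04 is a 7-bit two's-complement branch offset. Both readings are inferences from this one untested example, not a statement of the full bytecode format.

```python
def decode_long_constant(data):
    """Interpret four bytes as an unsigned value, most significant byte first."""
    return int.from_bytes(data, "big")


def decode_short_branch(offset_byte):
    """Interpret the low 7 bits of a byte as a two's-complement offset."""
    value = offset_byte & 0x7F
    return value - 0x80 if value & 0x40 else value


# Bytes following $3B in the waitcnt line of the listing.
print(decode_long_constant(bytes([0x04, 0xC4, 0xB4, 0x00])))  # 80000000
# Byte following $04 in the jmp line of the listing.
print(decode_short_branch(0x70))  # -16
```

Both results agree with the comments: 80_000_000 clocks is one second at the 80 MHz set by the CON block, so the pin toggles once a second and the LED completes a full on/off cycle every two seconds, i.e. 0.5 Hz. The -16 also matches the byte count from the end of the jmp back to the start of the outa sequence (5 + 9 + 2 bytes).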

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Cardinal Fang! Fetch the comfy chair.

Post Edited (BradC) : 3/27/2009 2:25:31 PM GMT

Mike Green · 2009-03-27 14:23

Last question first ... There was a Prop assembler written in Java and ImageCraft's C system includes an assembler and linker. Although the Propeller's bootloader starts up the Spin interpreter by default, you can use a simple 16 byte (hidden if you want) preamble to your assembly code to start up the assembly code in a cog replacing the running Spin interpreter.

1) In one sense you're absolutely right, but there are some cases where it might be nice to be able to maintain a Prop-based device just using the facilities provided by that device. You could argue that a "netbook" level laptop could be carried around and plugged in to do the same thing easier and more efficiently, but having a self contained device is still a reasonable choice. There have also been some proposals for a very low cost, self contained educational device. Admittedly, the processor is a small part of the cost of the whole thing, but the Propeller has the advantage of very low power requirements. This could also be done with little memory beyond an SD card.

2) Well said

3) A native solution on the Prop really requires multiple phases just from a memory standpoint.

4) The Spin interpreter code is now public knowledge and the byte codes, while not well documented (in terms of detail) are written down. Other Spin interpreters have been written (for demonstration purposes, as debugging tools, or as experimental toolsets).

heater · 2009-03-27 14:58

We are a couple of days away from having the ZiCog Z80 emulator running CP/M on Cluso's TriBlade Propeller board. So we will have:

1. The use of 64K of RAM. More if we implement a bank switched version of CP/M
2. A file system.
3. The ability to edit software with WordStar and other editors.
4. The ability to compile high level languages, on the Prop. C, Pascal, Fortran, PL/M etc, etc
5. The ability to run our compiled programs on the Prop under the emulation.

So here we have a stand alone Propeller hosted software development environment. What's missing?

1. An assembler for PASM.
2. A compiler for Spin.
3. A means of launching our Spin/PASM creations from CP/M

Has anyone out there created a PASM assembler in C? If not, it might have to be done. We could then assemble PASM on the Prop. I had a go at writing a PASM assembler, but being crazy I was writing it in Ada using object oriented programming and trying to handle all the SPIN constructs VAR, CON, and operators for expressions. It got very complicated and huge and I have abandoned it. I'm sure a much simpler assembler could be created that would run in 64K.

A compiler for Spin, I have no idea about that.

Launching from CP/M to our newly created programs may be a case of just replacing CP/M by rebooting into the new application. Or giving CP/M access to write new code into the boot EEPROM or passing the code over to one of the other COGs on the board.

Post Edited (heater) : 3/27/2009 3:03:36 PM GMT

Cluso99 · 2009-03-27 15:31

Just to add to heater's points..
1. 64KB of RAM plus a RAM disk of 960KB or any variation totalling 1MB, plus microSD.
6. OBC has PropDOS and an Editor (jointly with ?) and SD (equiv to microSD)

hippy · 2009-03-27 18:13

Jack Crenshaw said...
Many compiler writers choose to let their compilers generate assembly code ... The rationale is simple enough: The assembler already has code to manage a symbol table and deal with forward references. So let it do its job -- no need to duplicate the effort. OTOH, if you want fast compile speed, generate the native code directly, like Leor Zolman did. Again, an implementation detail.

For Spin, the bytecode is the native assembler and there's a fairly straightforward mapping of Spin to bytecode, so code generation shouldn't be an overly onerous task, leaving only parsing the source as the requirement there.

Unfortunately there isn't a bytecode assembler available but that's equally pretty easy to create.

Split into two parts it makes things much easier. As I've said before, my opinion of the best approach is to get it working on a PC first which has all the advantages that brings, while coding in a manner which makes it reasonably easy to port to Spin later.

stevenmess2004 · 2009-03-28 01:06

Oldbitcollector said...

stevenmess2004 said...

1. Variables can be anywhere in spin which means they can be declared after they are used which would be a pain to handle in a single pass compiler

This is one of the "holes" that I thought might get in the way. The only way I see around this, (maybe) in a single pass is to limit the pre-compiled command structure. (I haven't thrown enough code at this to prove to myself it won't work this way.)

OBC

It's actually pretty easy to get around the variable problem if you add a couple of simple rules to the spin syntax. Things like all VARS must be in one block at the start of the program and preferably in the correct order (i.e. long, word, byte). Method calls could be a problem though.
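
For comparison, this is roughly the work a compiler has to do if the long, word, byte ordering is not imposed on the programmer: group the declarations by size and hand out offsets. A minimal Python sketch follows; the `(type, name, count)` tuple format and the `allocate` function are hypothetical and exist only for this example.

```python
SIZES = {"long": 4, "word": 2, "byte": 1}


def allocate(declarations):
    """Assign byte offsets to variables, longs first, then words, then bytes.

    declarations: list of (type, name, count) tuples in source order.
    Returns ({name: offset}, total_bytes).
    """
    offsets, offset = {}, 0
    for size_name in ("long", "word", "byte"):
        for typ, name, count in declarations:
            if typ == size_name:
                offsets[name] = offset
                offset += SIZES[typ] * count
    return offsets, offset


decls = [("byte", "flag", 1), ("long", "total", 1), ("word", "buf", 3)]
print(allocate(decls))
# ({'total': 0, 'buf': 4, 'flag': 10}, 11)
```

Placing the largest type first keeps every variable naturally aligned without padding. If the source is required to declare variables in that order already, offsets can be assigned as each declaration is read, which is what makes the proposed rule attractive for a single pass.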

hippy · 2009-03-28 13:06

Spin feels complicated because there is a lot of forward referencing allowed, however there's no reason that an initial pass of the compiler cannot determine all constants, objects, methods, variables and PASM blocks. That would only need simple syntax checking that each line of source was correct and not have to worry about indentation or semantics.

That then has an advantage over the traditional way of linearly / recursively parsing and processing source code -

Along with the symbol table created you can store pointers into the source for each symbol definition ( eg, where each method starts and ends ). Then simply sort the symbol table as necessary, run through the symbol table, find the relevant source code block and compile those as ( hopefully ) small chunks.
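
A first pass of that kind might look something like the Python sketch below. It only recognises the six Spin block keywords at the start of an unindented line and records where each block begins and ends; it ignores comments, does no syntax checking, and the tuple layout is an assumption made for the example.

```python
import re

BLOCK_KEYWORDS = ("CON", "VAR", "OBJ", "PUB", "PRI", "DAT")


def index_blocks(source):
    """Return (kind, name, first_line, last_line) per block, 0-based inclusive."""
    lines = source.splitlines()
    blocks = []
    for number, line in enumerate(lines):
        words = line.split()
        # A block starts at an unindented line whose first word is a keyword.
        if words and not line[0].isspace() and words[0].upper() in BLOCK_KEYWORDS:
            if blocks:
                blocks[-1][3] = number - 1  # close the previous block
            kind = words[0].upper()
            name = ""
            if kind in ("PUB", "PRI") and len(words) > 1:
                match = re.match(r"\w+", words[1])
                name = match.group(0) if match else ""
            blocks.append([kind, name, number, len(lines) - 1])
    return [tuple(block) for block in blocks]


source = """CON
  _xinfreq = 5_000_000
VAR
  long total
PUB Main | i
  total := 1
PRI Helper(a)
  return a
"""
for entry in index_blocks(source):
    print(entry)
```

With an index like this, a later pass can seek straight to the lines of one method and compile just that chunk, so only the index and the current chunk need to be in memory at once.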

That would suit a modularised multi-pass compiler which I believe is better than a monolithic all-in-one compiler which is battling against the memory constraints of the Propeller.

My opinion is that any self-hosted compiler / IDE has to work within the 32KB hub constraints of a Propeller. Needing the addition of SD Card is one thing, but requiring specialised hardware or needing XMM etc pushes up costs and limits the market for participation and interest. By all means allow them and cater for them, but design so they are not required unless it's just not possible without.

Modularisation also suits programmer involvement as overall it's quite a big task, but each module becomes significantly simpler; a ( what would be second-pass ) compilation of a single module with a pre-built symbol table available is far easier in itself than something larger.

True, the downside is a slower compilation with a lot of to-and-froing, and most likely some extra steps to tie everything together, but the priority has to be designing something which can be built and delivered, secondly something which works, then finally optimisation of that.
