OpenSpin Spin/PASM compiler in C/C++

RossH · 2012-01-29 17:19

Batang wrote: »

Firstly great stuff Roy.

As for the rest, who gives a rats about how many compilers/assemblers there are and who is supplying them as long as they work as advertised and are suitable for the task at hand and are free to download and use.

Cheers.

Hi Batang,

Well, for a start, the people who have to write them for you probably "give a rats".

Ross.

Batang · 2012-01-30 08:11

RossH wrote: »

Well, for a start the people who have to write them for you probably "give a rats"

Ross.

It would seem to me that the only people who "have to write them" would be those who are doing it for commercial reasons or, in the case of Parallax, doing it for product support.

And those who write them who are not mentioned above are perhaps doing them for their own altruistic reasons.

In either case, why should there be a concern about how many different compilers etc. are available?

Ideally these tools should be provided as company product support that evolve as required and based on customer feedback.

However, if others, like yourself, wish to provide tools that offer new or additional features, great.

Cheers

Keith Leinenbach · 2012-01-30 10:22

note: I'm a newbie here. So I hope you don't mind me commenting.

I know this is so three days ago, but I just thought I'd chime in on some of the discussion about changes to the SPIN language. Changes to languages, even minor ones, should be approached very seriously, not just added in as you go. The implications of changes like this are huge. Even changing from case-insensitive to case-sensitive is really big. Compiler options are rarely considered by most developers and can be devastating if they are too far-reaching. It's scary to me when several lines of code can be interpreted radically differently depending on a list of options in a dialog somewhere or some strings of definitions at the top of a file.

If SPIN is changing, please take it as a separate issue, not part of a compiler build. Take time and serious discussion. And remember that simplicity can be a good thing.

Keith

Rayman · 2012-01-30 10:51

Amen to that. I hope we get a compiler that will do the same job that the Spin Tool does now.

Phil Pilgrim (PhiPi) · 2012-01-30 10:55

The issue of case sensitivity is a "sensitive" one, to be sure. It even manifests itself in the current, unmodified, version of Spin, when ported to an OS whose filenames are case-sensitive, such as Linux. Consider the following:

OBJ

  sio : "fullduplexserial"

In Windows, this is no big deal, because "fullduplexserial" and "FullDuplexSerial" refer to the same file. But in Linux, this would likely cause a compile error, since fullduplexserial.spin does not exist. It's my belief that this should also cause an error in Windows, even though "fullduplexserial" and "FullDuplexSerial" cannot refer to two different files there. This is what I call "enforced case coherence", and I believe that it should also apply to variables and other user-defined symbols in Spin. I do not, however, believe that things like myvar and MyVar should ever refer to two different entities in Spin -- even as an option -- since that would really turn the language on its head. But referring to the same variable in two different ways like that should cause a compile error. Such enforcement can only lead to cleaner, less ambiguous, and more readable code.
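A minimal sketch of what such "enforced case coherence" could look like for user-defined symbols, written in Python for brevity. The function name `check_case_coherence` and the simplified identifier pattern are assumptions made for illustration; a real Spin compiler would work from its own symbol table and would skip comments and strings.

```python
import re

# Simplified identifier pattern; assumed for illustration only.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def check_case_coherence(source):
    """Return (line_no, used, defined) for every symbol whose spelling
    differs in case from its first occurrence in the source."""
    first_seen = {}   # lowercased name -> spelling at first occurrence
    errors = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in IDENT.finditer(line):
            name = match.group(0)
            defined = first_seen.setdefault(name.lower(), name)
            if name != defined:
                errors.append((line_no, name, defined))
    return errors

example = "VAR\n  long MyVar\nPUB main\n  myvar := 1\n  MyVar := 2\n"
errors = check_case_coherence(example)
```

Here `myvar` on line 4 is reported against the `MyVar` spelling from the VAR section, while both names still refer to the same variable.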

Now, having said that, there is no reason that a good IDE or pre-compile phase could not clean up a user's code on the fly to ensure case coherence. It would change the case of every symbol in the program to agree with that of its original definition. QuickBASIC did this on slow DOS machines, so it should not be a big deal to accomplish on more modern PCs.

-Phil

tdlivings · 2012-01-30 13:06

Phil
Having the compiler change the case of characters in the source from what I typed to what it thinks they should be does not seem like a good idea to me. To know to change OBJ: fullduplexserial to OBJ: FullDuplexSerial it would have to know the character case of the file on my hard drive. It seems like I would be fighting with Big Brother, the compiler's idea of what I should have typed.

It seems to me like something the editor might do, knowing what you typed at the top of a large file and offering a hint when you type the same thing with a different case farther down in the file. Or, even better, since I do not want to be nagged too much while typing, how about the editor keeps track of variable names and has a hot key to pop up a selection list you can pick from? I am always forgetting what I called something way up at the top of the file.

Tom

Phil Pilgrim (PhiPi) · 2012-01-30 13:35

tdlivings wrote:

Having the compiler change the case of characters in the source from what I typed to what it thinks they should be does not seem like a good idea to me.

It doesn't seem like a good idea to me either, and it's not what I suggested. But if you type MyVar in the VAR section and use myvar someplace else in the same program, the compiler should flag it as an error. Now, given a compiler that demands case coherence, a user-friendly IDE could offer, as an optional service, to correct your instances of myvar to agree with what you typed in the VAR, CON, OBJ, or DAT sections.

tdlivings wrote:

To know to change OBJ: fullduplexserial to OBJ: FullDuplexSerial it would have to know the character case of the file on my hard drive.

Again, this is not a change I would expect either the IDE or the compiler to make. But I would expect the compiler to flag an error if it cannot find a file with a matching name and case, regardless of whether you are using Windows or some OS where it matters. It's important that the behavior of the compiler be consistent across all platforms.
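As a rough sketch of how a compiler could get the same answer on every platform: compare the requested name against the directory listing instead of relying on the file system's own matching. The helper name `resolve_object` and the error wording are hypothetical, not taken from any existing compiler.

```python
import os
import tempfile

def resolve_object(directory, name):
    """Look up name + ".spin" in `directory` with exact-case matching.

    Returns (path, None) on success, or (None, message) on failure.
    Comparing against os.listdir() output, rather than just opening the
    file, gives the same result on case-insensitive file systems.
    """
    wanted = name + ".spin"
    entries = os.listdir(directory)
    if wanted in entries:
        return os.path.join(directory, wanted), None
    # Offer a hint when only the case differs.
    for entry in entries:
        if entry.lower() == wanted.lower():
            return None, f'"{wanted}" not found; did you mean "{entry}"?'
    return None, f'"{wanted}" not found'

# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "FullDuplexSerial.spin"), "w").close()
    ok = resolve_object(tmp, "FullDuplexSerial")
    bad = resolve_object(tmp, "fullduplexserial")
```

The second lookup fails on Windows and Linux alike, and the message points at the spelling that does exist.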

Nothing I've proposed is big-brotherish in the least. My suggestions are only meant to reward consistency and organization and to help make programs portable across multiple platforms and installations.

-Phil

pedward · 2012-01-30 13:58

Point of note, if you are on an operating system that enforces case coherence in filenames, how could the compiler guess that you want FullDuplexSerial when you tell it fullduplexserial? You can squash case, but you can't go the other way.

IIRC, BST already enforces case coherence for object filenames. The fact that Windows doesn't is actually an interoperability problem that Windows has (e.g. I have directories called Pictures and pictures, and Windows can't access both because the case is squashed).

Daniel Harris · 2012-02-10 17:10

So, where do we stand on this? Are we still vetting the compiler to ensure proper operation?

Batang · 2012-02-11 00:37

I am putting V31 through various tests.

Roy Eltham · 2012-02-11 01:08

Daniel,
Several issues were found and fixed over the last few weeks. Things seem pretty stable right now. My understanding is that Jeff Martin is still busy with other stuff, but once he frees up he is going to be working on test code that will help validate the compiler results for both success and error conditions. I'm under the impression that I'll be helping him with that.

This weekend I was planning on doing the '-c' feature that the propgcc guys want.

Roy

rosco_pc · 2012-02-16 22:38

Just tried out the compiler (r31) under Linux and found that it breaks on the following piece of code (coming from F32):

DAT
        long                1e+38, 1e+37, 1e+36, 1e+35, 1e+34, 1e+33, 1e+32, 1e+31
        ....

with error code:

FloatString.spin(503:30) : error : Expected "," or end of line

This works with both homespun and bstc; I cannot try the Propeller Tool at the moment.

Roy Eltham · 2012-02-16 23:21

rosco_pc,
Thanks for reporting this. I've added it to the issue tracker and will tackle it soon.

Cluso99 · 2012-02-16 23:35

Roy: I presume that as yet it does not do conditional code (as in the #define and #ifdef used in bst)?

ZiCog would be a great test but it is too dependent on the #defines.

If and when you get to this, beware that homespun and bst treat these differently. bst inherits the #defines from higher level source files whereas AFAIK homespun does not.

tdlivings · 2012-02-19 16:57

Roy
I have been playing with your spin.exe command line program and found that s2_test.spin, which is part of the s2 software, crashes the compiler. I cannot tell where, as it just crashes with the Windows message that the program has quit, offering to close it or have Windows look for a solution. I am running from a Windows command window.
Additionally, I have been playing with Delphi, running your spin.exe using CreateProcess and capturing the output back into a Memo. That did do a compile (i.e. the command line was spin.exe s2_test.spin), but when I did compiledocs (spin.exe -d s2_test.spin) I got the crash.

Tom

tdlivings · 2012-02-19 19:39

Roy
I just retried s2_test.spin with my program, which I said did not crash if I selected no options, and this time it crashes.
My program is not important, however; it crashes during compile using a command window under Win 7.

Tom

jmg · 2012-02-20 00:18

Keith Leinenbach wrote: »

...
If SPIN is changing, please take it as a separate issue, not part of a compiler build. Take time and serious discussion. And remember that simplicity can be a good thing.

Certainly any language should firstly be backward compatible.

Where conflicts arise, with new features, that is usually handled with a Compiler parser Switch.

Users then have full control, and the ideal is to have that switch 'live', so a whole project does not need to be one way, or the other.

On a PC it is now rare to find compilers/code that are case-ad-hoc, but on embedded systems this is less rare, because ASM files tend to not be case sensitive.
- So if you really were relying on Case to make unique, (not good practise), and you (eg) drop to ASM to optimise something, you will get burnt. Some Asm can even stumble on Name Alias across modules.

ruiz · 2012-02-24 09:55

Last December I came across the Propeller chip and thought it was interesting. A bit later I came across the propeller based retro computer board 'Hive' and decided to build one: it would be a nice base to do a PDP-11 emulator and make the Hive board run 1970's research unix. Porting one of Fabrice Bellard's C compilers was a relatively quick affair but taught me one thing: debugging tools for PASM/Spin are hard to find and often below par or lack an integrated tool chain. The cluso/jazzed debuggers seem to be okay, but are struggling for compiler support to work; Spinsim seems a good simulator.

What was needed in my opinion was a Spin/PASM compiler that would accept a debug flag and do all the fiddly bits to debug a program. I figured that with Spin being like 'B', with some Pascal blended in and a touch of Python, such a compiler would not be difficult. A BCPL to o-code compiler is about 5000 lines, as is a Pascal to p-code compiler. With the Sphinx compiler available as a guide, I figured I could pump out such a compiler in 10 evenings.

Two days later Roy announced his compiler. After thinking about it for 2 days, I figured there were many reasons for still pushing ahead and so I did. It took about 20 evenings, as I had overestimated how well I understood the PNUT engine, and the expression syntax is somewhat unusual/undocumented. It did come out at about 5000 lines though, and is written in simple C, so that it can compile with Catalina or other, simpler propeller C compilers.

The source is public domain and attached; I will set up a repository shortly. My first goal is to polish the code and to add the debugging features; my second goal is to add dead code elimination.

@michael park / sphinx:
- thanks for releasing sphinx - it has saved a ton of reverse engineering work on spin & pnut!
- there is a bug where 'quit' generates a 'jmp' rather than a 'jnz'
- currently I don't see a simple, elegant way to do dead code elimination in a two-pass architecture, but I dislike the idea of going three-pass; penny for your thoughts...

@david hein / spinsim
- thanks for providing spinsim, it has been a great help in developing and debugging hsc!
- there is a bug in spinsim where the 'spr' opcode does not increase the register # by 16
- related, there is a bug where pnut register 0-15 are not handled properly (great loophole for monkey patching pnut on a real propeller, by the way)

@brad / bst
- for example "|<23" is accepted as void expression and leaves garbage on the stack
- a := @a? is accepted and generates strange byte code

@parallax
- what is the intended meaning of spin expression '++a+++b'? It could be '+ + a++ + b' or '++a + ++b'. Or is it intended to be illegal?
- do I now get to be a 'star contributor' :-))

@cluso/jazzed
- your debuggers seem to be a good base to work from. Wouldn't it be nice if one could just type 'hsc -g myprog.spin' and then 'db myprog.binary' and end up in a gdb-like debugging session, switching between spin and pasm, and between cogs as the debugging process may require?
- your help would be gratefully accepted!

@phipi
- if you want to experiment with syntax & semantic variations for object import, have a look at obj.c: it is short and accessible and should offer a good base for experimentation

Regards,

Paul

Build notes:
- to build hsc just type "gcc -o hsc *.c"
- the code is 32/64 bit clean
- when compiled with "-m32 -Os" options, it is a 40kb binary on x86
- it was developed on osx, but should work on linux and windows
- just noticed that the clock mode setting code is not correct (see end of link.c); anyone got time to write a patch?

Circuitsoft · 2012-02-24 16:14

Wow... Now that is a first post. Welcome to the forums! Can't wait to see what your compiler can do.

pedward · 2012-02-24 16:40

ruiz wrote: »

@parallax
- what is the intended meaning of spin expression '++a+++b'? It could be '+ + a++ + b' or '++a + ++b'. Or is it intended to be illegal?

Operator precedence should determine how greedy the tokenizer is. If ++ is higher precedence than +, it should glob the ++ first, then the +. The ++a++ would pre and post increment a, then add b. IIRC, the tokenizer in the C compiler is based on a table of constants and I think the precedence is set by the order.
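For comparison, many lexers settle this with a "maximal munch" rule: at each position, take the longest operator that matches. The sketch below is in Python and assumes a tiny operator table containing only `++` and `+`; it is not the Spin tokenizer, just an illustration of how a greedy scan would split the expression.

```python
# Hypothetical operator table, longest first so the greedy match wins.
OPERATORS = ["++", "+"]

def tokenize(text):
    """Split `text` into identifiers and operators, longest match first."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
            continue
        if ch.isalpha() or ch == "_":
            # Consume a whole identifier.
            j = i
            while j < len(text) and (text[j].isalnum() or text[j] == "_"):
                j += 1
            tokens.append(text[i:j])
            i = j
            continue
        for op in OPERATORS:
            if text.startswith(op, i):
                tokens.append(op)
                i += len(op)
                break
        else:
            raise ValueError(f"unexpected character {ch!r} at offset {i}")
    return tokens

tokens = tokenize("++a+++b")
```

Under this rule the token stream is `++`, `a`, `++`, `+`, `b`, which matches the "pre and post increment a, then add b" reading. Whether that sequence is then legal is a question for the parser, not the tokenizer.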

jazzed · 2012-02-24 18:23

ruiz wrote: »

Last December I came across the Propeller chip and thought it was interesting. A bit later I came across the propeller based retro computer board 'Hive' and decided to build one: it would be a nice base to do a PDP-11 emulator and make the Hive board run 1970's research unix. Porting one of Fabrice Bellard's C compilers was a relatively quick affair but taught me one thing: debugging tools for PASM/Spin are hard to find and often below par or lack an integrated tool chain. The cluso/jazzed debuggers seem to be okay, but are struggling for compiler support to work; Spinsim seems a good simulator.

....

Fantastic work so far! I was able to unzip/compile the package on Linux, but Windows doesn't like con.c for some reason. I wish I had more time to dig into this now.

Thanks, and welcome to Propeller land.
--Steve

Cluso99 · 2012-02-24 18:37

ruiz (Paul): Welcome to this fabulous forum. Thanks for such an explosion!

The spin interpreter can be patched and has been. There is a faster version that I worked on (see my links to tools in my signature). In that thread is some more documents on how the interpreter works. Dave Hein also worked on extending the interpreter.

My problem with the debugger I did was that I could not control (or get to properly) the source files. I did play with attempting to display from the listing produced by (homespun I think, but bst also produces a listing). I wrote the debugger so that I could test my version of the interpreter.

As far as the gdb debugger is concerned, I know nothing about this unfortunately, so I cannot help here. I am also a novice at C. But I am willing to help where I can.

I love the idea that Catalina will probably compile your compiler. This would make a nice advance upon Michael Park's sphinx work. That elusive decent Propeller OS is getting much closer now.

jazzed · 2012-02-24 19:09

The SPUD debugger can easily look up BSTC list file lines generated with the -ls option, so I guess that would be one output target of a spin compiler. SPUD gets lost in the call stack sometimes though and some operations are not supported (random?).

Rayman · 2012-02-25 06:58

ruiz, thanks for posting this, it looks great.
Maybe you can start a new thread?

I had trouble with con.zip too... Had to use 7-zip to extract it and then it came out as "_con.zip". Maybe the underscore is what messes up Windows...

Dave Hein · 2012-02-25 08:38

ruiz wrote: »

@david hein / spinsim
- thanks for providing spinsim, it has been a great help in developing and debugging hsc!
- there is a bug in spinsim where the 'spr' opcode does not increase the register # by 16
- related, there is a bug where pnut register 0-15 are not handled properly (great loophole for monkey patching pnut on a real propeller, by the way)

Good catch finding the bug in the SPR command in spinsim. I'll fix that when I have a chance. This bug is in the interpreter that executes bytecodes directly, but the PASM version of the interpreter will execute correctly. You can run that by specifying the -b option, but you probably already know that.

I'm not sure what you mean by a bug in pnut for registers 0-15. Are you referring to the register instructions that access cog locations $1e0 to $1ff? Normally, these are used to access the special registers, such as OUTA, INA, DIRA and CNT. The runner code uses the feature of accessing cog locations $1e0 to $1ef to store some data temporarily. However, it does provide an opening for patching the interpreter. I used this feature to patch an LMM interpreter into the Spin interpreter for the SpinLMM object.

ruiz · 2012-02-25 11:10

I had trouble with con.zip too... Had to use 7-zip to extract it and then it came out as "_con.zip". Maybe the underscore is what messes up Windows...

Please find con.c attached. It is strange, I zipped the source files with the standard osx zip, and it unzips okay on winxp. I guess we can always trust Apple and MS to make a mess of interop.

ruiz · 2012-02-25 11:44

I used this feature to patch an LMM interpreter into the Spin interpreter for the SpinLMM object.

Ah, I had not come across SpinLMM yet -- sounds like something I should look up. I figure it is dead simple to have the compiler start its objects at hubram 0x200 instead of 0x10 and use the space below 0x200 for debugging, for instance monkey patching the pnut engine into a debug version.

Are you referring to the register instructions that access cog locations $1e0 to $1ff

Yes, I thought it would be a good improvement if ExecuteRegisterOp mapped $1ff to spinvars->dcurr, etc. You're right, -b solves it too. I may need that feature as part of the test harness/suite for hsc (see last dash):

I'm not sure what Roy's plan is for a test harness, but my plan is as follows:
- have a bunch of files with small test case spin files
- each file has the 'right' result in a comment, including compiler rejection
- compile each test with hsc, compiler exits with an error code if there is a syntax etc. error
- run output file against spinsim, test return value against expected result
- use access to dcurr to test for push/pop balance
The test harness would be a simple shell script iterating through the test files, using e.g. grep or sed to extract the 'right' result.
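The plan above could be sketched roughly as follows, here in Python rather than shell. The `' EXPECT:` comment convention and the `compile_fn` / `run_fn` callables are assumptions made for illustration: they stand in for invoking the compiler and spinsim, whose actual command lines are not assumed here. The dcurr push/pop balance check is left out.

```python
import re

# Hypothetical convention: a Spin comment such as   ' EXPECT: 42
# or   ' EXPECT: compile-error   somewhere in the test file.
EXPECT = re.compile(r"^\s*'\s*EXPECT:\s*(\S+)\s*$", re.MULTILINE)

def expected_result(source):
    """Extract the expected outcome recorded in a test file's comment."""
    match = EXPECT.search(source)
    if match is None:
        raise ValueError("test file has no EXPECT comment")
    return match.group(1)

def run_test(source, compile_fn, run_fn):
    """Return True if the test behaves as its EXPECT comment says.

    compile_fn(source) -> binary, or None if the compiler rejected it.
    run_fn(binary)     -> the program's return value.
    Both are stand-ins for running the compiler and the simulator.
    """
    expected = expected_result(source)
    binary = compile_fn(source)
    if binary is None:
        return expected == "compile-error"
    if expected == "compile-error":
        return False   # compiler accepted code it should have rejected
    return str(run_fn(binary)) == expected

good = "PUB main\n  return 42\n' EXPECT: 42\n"
bad = "' EXPECT: compile-error\nPUB main\n  +++\n"
passed_good = run_test(good, lambda src: b"binary", lambda binary: 42)
passed_bad = run_test(bad, lambda src: None, lambda binary: 0)
```

Keeping the expected result inside each test file means the harness itself stays a plain loop over files, which is the point of the plan.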

I figure a few hundred tests are sufficient for confidently refactoring and extending hsc. I guess the Parallax aim of doing a java certification style test suite will easily take a thousand or more tests -- half a man-year work.

ruiz · 2012-02-25 12:29

jazzed wrote: »

The SPUD debugger can easily look up BSTC list file lines generated with the -ls option, so I guess that would be one output target of a spin compiler. SPUD gets lost in the call stack sometimes though and some operations are not supported (random?).

Getting debug info from parsing a listing file is the world upside down and I had to write hsc to get around that issue: a proper compiler should generate usable debug info, period.

Let's forego smart data for a bit, and look at what the debugger needs:
- Find the right source line from a pcurr value. The compiler can generate a 32K array with the line info for each possible value. If the file cannot be found at the path given, the debugger simply gives up
- Find the value of a symbolic name, given pcurr, dbase, vbase and pbase. Imagine that the compiler generates a list of all names in its symbol table, including local symbols. A name can appear more than once, but each name has a range for the first and last value of pcurr where it is valid. Each symbol has an associated offset versus dbase (local vars), vbase ('global' vars) or pbase (dat vars and labels)
- Backtrace the call stack. Dcurr always points to the start of the current stack frame and this has the previous values of pcurr, dbase, vbase and pbase.
Of course, one would use a smarter encoding in a real compiler/debugger combo.
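A rough sketch of the first two lookups, in Python. All names here (`Symbol`, `DebugInfo`, the field names) are hypothetical, and a dict stands in for the 32K array; this is not hsc's actual debug format, only an illustration of the tables described above.

```python
from dataclasses import dataclass

@dataclass
class Symbol:
    name: str
    first_pc: int   # first pcurr value where the name is valid
    last_pc: int    # last pcurr value where the name is valid
    base: str       # "dbase", "vbase" or "pbase"
    offset: int     # offset relative to that base register

class DebugInfo:
    def __init__(self, line_for_pc, symbols):
        self.line_for_pc = line_for_pc   # dict: pcurr -> (file, line)
        self.symbols = symbols           # list of Symbol

    def source_line(self, pcurr):
        """Return (file, line) for a bytecode address, or None."""
        return self.line_for_pc.get(pcurr)

    def address_of(self, name, pcurr, bases):
        """Resolve `name` at `pcurr`; `bases` maps base names to values.

        Names are compared case-insensitively, as Spin symbols are.
        """
        for sym in self.symbols:
            if (sym.name.lower() == name.lower()
                    and sym.first_pc <= pcurr <= sym.last_pc):
                return bases[sym.base] + sym.offset
        return None

# Made-up example data: two locals named "i" in different methods,
# plus one object-level variable.
info = DebugInfo(
    {0x0020: ("demo.spin", 7), 0x0024: ("demo.spin", 8)},
    [Symbol("i", 0x0020, 0x0030, "dbase", 4),
     Symbol("i", 0x0040, 0x0050, "dbase", 8),
     Symbol("total", 0x0000, 0x7FFF, "vbase", 0)],
)
bases = {"dbase": 0x1000, "vbase": 0x0800, "pbase": 0x0010}
addr_first_i = info.address_of("i", 0x0024, bases)
addr_second_i = info.address_of("i", 0x0044, bases)
```

The pcurr range on each symbol is what lets the same name resolve to different locations depending on where execution currently is.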

The only tougher thing -- given a file with proper debug info -- would be finding the value of the counter in REPEAT <expr> statements, as that would require the compiler to keep track of the scratchpad stack to locate the counter location. As it is anonymous, I'm not sure how the programmer would reference it, perhaps by using the keyword COUNTER and the line number of the repeat statement that defines it.

What am I missing?

jazzed · 2012-02-25 13:03

I like pcurr to line number array and other translations mentioned. Repeat did cause me lots of pain especially at the end of the program.

It has been mentioned, in a way, that if you want to make this a group project, it should have its own thread. I concur. I'll contribute what I can find time to do. I'm generally overwhelmed right now though.

Thanks,
--Steve

Rayman · 2012-02-25 13:28

ruiz, I was able to build the executable in Windows 7 using Visual Studio 2010 (I think there's a free version of this from Microsoft that can also compile it).
For some random Spin file, the binary comes out almost the same size as the one from the Spin Tool, only 4 bytes off.
But the initialization bytes (the first few bytes) are different. That shouldn't be, should it?
