OpenSpin Spin/PASM compiler in C/C++

jazzed · 2012-01-26 13:09

pedward wrote: »

Did it look to anyone else like the title of this thread is an open source SPASM compiler?

We've had SPASM before. Not sure just now who wrote it.

pedward wrote: »

The only way to make the old SPIN compiler grok the new format is using 2 methods: ....

It's not bad. This approach was used with the Basic Stamp in PBASIC. Was it {BS2}? There will probably be lots of new features in a few years.

pedward wrote: »

...but it would be in Parallax's best interests to remain the corporate steward of the compiler and language and to be responsive to best-practiced changes.

From what I've seen Parallax is definitely interested in controlling their own destiny.

Phil Pilgrim (PhiPi) · 2012-01-26 13:11

Roy Eltham wrote:

1. I am inclined to use '/' ONLY as a directory delimiter for OBJ references. I do not like the idea of using :: for it at all, sorry. I also did not intend to allow for other OS specific elements in there either (so no "C:"). I did plan to support using '.' and '..' to mean current directory and parent directory and '/' to mean the root. As far as I know those all work the same across linux, mac, and windows.

I really don't like that approach at all. Sorry. It's a mistake, IMO, to allow any absolute directory references in the OBJ section, or the implied references "." and "..". This should not be part of the Spin language. Any absolute directory references should appear in the PATH environment and there only. My only concession would be a #PATH pragma that prepends a directory or list of directories to the current search path. I think that allowing specific references to OS file structures in the OBJ section violates a very necessary level of abstraction that should be maintained for the object library.

-Phil

pedward · 2012-01-26 13:26

Phil is correct in that the path should be specified by an external directive. This is how it's been done on Unix for ages, the LIBRARY env variable IIRC.

The object name in quotes is a file, but there should be a level of abstraction to that. The :: was adopted by PERL and makes sense, Java uses the period in a most confusing manner. The slash isn't nearly as elegant as ::, but it's intuitive to most because of the URL/URI specification.

If you really wanted to get pedantic, you could use URLs to specify OBJects.

--Perry

Roy Eltham · 2012-01-26 13:36

Phil,
I'm kind of confused by your seeming contradiction. You want to be able to specify subdirectories in OBJ references in front of the spin file names in order to be able to organize your files on disk, but then you don't want to do it in any way that would be natural for most people familiar with directories on disk. The stuff between the quotes in an OBJ reference is a filename (optionally without the .spin extension, which gets added on by the wrapper if it's missing). Allowing that filename to include directories in front of it doesn't feel like it's changing the language to me at all. The compiler itself doesn't do any file handling. It expects the wrapper (or prop tool) to do it all for it. It just parses the quoted bit and puts it into an array of filenames, then the wrapper loads each of those files and recursively calls the compiler.

In fact, I wouldn't be surprised if my existing wrapper code just worked if you did this (and the file was in the directory in question):
OBJ
pst : "somedir/Parallax Serial Terminal"

If it doesn't it would likely be a trivial change to the wrapper code only to make it work.

Anyway, you seem to be holding some special meaning or requirement around that quoted filename string that I'm not seeing.
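
The division of labor described in this post (the compiler collects the quoted names, the wrapper adds a missing .spin extension, loads each file, and recurses) could be sketched roughly as follows. This is Python for illustration only and is not OpenSpin's wrapper code; collect_obj_names, with_spin_extension, compile_tree and the in-memory files dictionary are all hypothetical names, and the parsing is deliberately simplified.

```python
import re

BLOCK_KEYWORDS = {"CON", "VAR", "OBJ", "PUB", "PRI", "DAT"}
# Matches a declaration such as:  pst : "Parallax Serial Terminal"
OBJ_DECLARATION = re.compile(r'^\s*\w+(?:\[[^\]]*\])?\s*:\s*"([^"]+)"')


def with_spin_extension(name):
    """Add .spin when the quoted name does not already end with it."""
    return name if name.lower().endswith(".spin") else name + ".spin"


def collect_obj_names(source):
    """Stand-in for the compiler: return the quoted names found in OBJ blocks.

    Simplification: a declaration on the same line as the OBJ keyword is ignored.
    """
    names, in_obj = [], False
    for line in source.splitlines():
        words = line.split()
        if words and words[0].upper() in BLOCK_KEYWORDS:
            in_obj = words[0].upper() == "OBJ"
            continue
        if in_obj:
            match = OBJ_DECLARATION.match(line)
            if match:
                names.append(match.group(1))
    return names


def compile_tree(name, files, loaded=None):
    """Stand-in for the wrapper: load a file, then recurse into its objects.

    `files` maps filename -> source text and replaces real disk access here.
    Returns the filenames in the order they were loaded.
    """
    loaded = [] if loaded is None else loaded
    filename = with_spin_extension(name)
    if filename in loaded:
        return loaded
    loaded.append(filename)
    for child in collect_obj_names(files[filename]):
        compile_tree(child, files, loaded)
    return loaded
```

Because the quoted string is passed through unchanged, a name like "somedir/Parallax Serial Terminal" would simply become a different key (or path) for the wrapper to load; the compiler stand-in never needs to know.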

pedward · 2012-01-26 13:45

I wanted to use this when I wrote the Arduino_light object because I wanted different bits of the API to be in different files. It didn't work because : is reserved on WinDuws.

Eg:

bits: "Arduino::Bits"
math: "Arduino::Math"

It just seems more elegant because the inference is that they are subclasses of the Arduino object. We know SPIN doesn't do subclassing, but it conveys a proper meaning to those who use other languages that do.

Roy Eltham · 2012-01-26 13:52

I forgot about one bit of code in there that doesn't allow certain characters in those quoted filenames, so it won't work. If I changed that code to allow / through then it would work.

Phil Pilgrim (PhiPi) · 2012-01-26 14:30

Roy,

I probably didn't explain it very well. In Perl, for example, a module that's named "Tk" refers to a file somewhere named "Tk.pm". Spin works the same way. For example, "FullDuplexSerial" is an object name, not a filename, the latter being "FullDuplexSerial.spin". That object name potentially refers to a multitude of different files, all named "FullDuplexSerial.spin". The one that's actually used would depend on the value of the PATH environment. In the Propeller tool, the path implicitly gives preference to an object that's open in the editor, followed by one that exists in the top-level object's directory, followed by one that exists in the Propeller library.

In Perl, for example, the path defaults (in Windows) to /perl/site/lib/, followed by /perl/lib/. Each of those directories contains a number of ".pm" files, along with many subdirectories, sub-subdirectories, etc., which have the same names as the divisions assigned to the CPAN (like OBEX) module repository. In Perl, when I say

use Tk::Button

for example, the Button.pm module is picked up from the /perl/site/lib/Tk directory. But, suppose I were testing my own version of Button.pm. Without changing the use statement referenced above, I could temporarily prepend my own directory to the current path via a use lib pragma at the head of my program, and the path specified there would be searched first for any referenced modules.

This is much cleaner than specifying the path in the original use statement and it makes the code more portable. Consider, if you will, the confusion that would reign if a Linux-based Spin developer were writing an object for the OBEX and was allowed to say:

OBJ

  sio : "/usr/bin/spin/lib/IO/FullDuplexSerial"

His object would not be portable to Windows. But if he could write instead:

OBJ

  sio : "IO::FullDuplexSerial"

his code would not only be portable to any other OS that used the same basic library hierarchy, but an end-user could substitute his own favorite FullDuplexSerial, without changing the source code, by modifying his local PATH variable.

This is what I mean by a very desirable level of abstraction, one necessary step away from the local file structure. The "::" notation not only ensures that users won't confuse the OBJ specs with references to a particular file, but it also shows no favoritism towards one OS's notation or another. Remember that Spin is a computer language that could conceivably be compiled on an iPhone, under Android, or who-knows-what-other environment. We shouldn't saddle the language itself with notational baggage accumulated over years of UNIX experience if we can avoid it.

Now, that covers the Spin language itself, of which the OBJ section is an integral part. That doesn't mean that we can't admit installation-dependent pragmas, such as #INCLUDE, or #LIB which do refer to actual file or directory names. But the use of such pragmas (pragmata?) should be discouraged in objects submitted to the OBEX, since they are non-portable.

I guess the gist of my comments is to maintain a clearly-defined wall of separation between pure language stuff and installation-dependent stuff. Absolute directory references and the use of "/", "\", ".", or ".." are installation matters and should be dealt with outside of the OBJ section.

-Phil

Dave Hein · 2012-01-26 14:35

My preference would be to use '/' for the path separator. The '::' separator just doesn't seem right to me.

EDIT: Absolute paths should be allowed, but they should be avoided. It's better to specify absolute library paths with a -L option.

Phil Pilgrim (PhiPi) · 2012-01-26 14:47

Dave Hein wrote:

The '::' separator just doesn't seem right to me.

That's probably because you've been tainted by years of POSIX exposure.

Remember that :: is not meant to be a subdirectory separator but an object "subclass" separator. In most installations, it will correspond one-to-one to a subdirectory structure, but that needn't be a necessity of the language per se.

-Phil

Dave Hein · 2012-01-26 14:51

Phil Pilgrim (PhiPi) wrote: »

That's probably because you've been tainted by years of POSIX exposure.

I think we are talking about a directory path here, so it is natural to use either / or \ for the separator. I prefer /, but allowing both / or \ is also acceptable. Every OS I've used since the beginning of time used one of those characters as the separator.

Rayman · 2012-01-26 14:52

Personally, I'd be happy with some C++ code that reproduced exactly the Prop Tool result...
I'm worried that adding in a bunch of extra stuff will make it harder for me to strip out the core...

jazzed · 2012-01-26 14:53

Phil's example is library::module which is not unreasonable.
There would be nothing wrong with library/module though.
Perl has some neat ideas, but it also has some horrible ones too.

I agree that using absolute paths is wrong especially for sharing code.

Phil Pilgrim (PhiPi) · 2012-01-26 14:57

Dave Hein wrote:

I think we are talking about a directory path here, ...

No, we're not. That's my point. A directory path leads you to one particular file. The :: specifies a sublibrary and may correspond to one of any number of directories, depending on an installation-dependent path setting.

-Phil

Phil Pilgrim (PhiPi) · 2012-01-26 15:01

jazzed wrote:

Perl has some neat ideas, but it also has some horrible ones too.

I pick only the best ones as examples.

-Phil

Circuitsoft · 2012-01-26 15:04

Phil Pilgrim (PhiPi) wrote: »

A directory path leads you to one particular file.

Not necessarily. A relative path is tried against every root location in the library path.

If I write

OBJ
   sio : "serial/FullDuplex"

then I would expect to search in the libraries for a category/directory (not module) called serial, and for it to have an entry called FullDuplex.

One convention that I do like from C is:

#include "file_to_check_local_dir_first.h"
#include <file_to_check_include_path_first.h>

While the <fname> is slightly awkward, the fact that you can easily specify whether to look first in the immediate directory and second in the include path is nice.
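
A minimal sketch of that lookup order, in Python for illustration. It follows GCC's documented behavior, where the quoted form tries the including file's directory first and then the include path, while the angle-bracket form searches the include path only. The function names and the exists parameter are hypothetical.

```python
import os


def include_search_order(current_dir, include_path, quoted):
    """Return the directories to try, in order, for an #include-style lookup."""
    if quoted:
        # "name": the including file's directory first, then the include path.
        return [current_dir] + list(include_path)
    # <name>: the include path only.
    return list(include_path)


def find_include(name, current_dir, include_path, quoted, exists=os.path.isfile):
    """Return the first existing candidate path, or None if nothing matches."""
    for directory in include_search_order(current_dir, include_path, quoted):
        candidate = os.path.join(directory, name)
        if exists(candidate):
            return candidate
    return None
```

The same two-form idea could be carried over to OBJ references, but whether Spin should distinguish "local first" from "library first" is exactly what is being debated in this thread.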

Roy Eltham · 2012-01-26 15:09

Phil,
Thanks for your explanation! I believe I understand better now what you are asking for, and I actually think I like it. I would tend to think of this more as an alias. In your example 'IO' would be an alias for a given directory path.
I think the declarations of those aliases should not be inside of the Spin Source though. Instead they should be handled by the wrapper. So in the case of a command line tool you would pass in the aliases as options (-a IO=<directory path>), and in an IDE you would have some gui config dialog or whatever. This keeps the OS specific stuff out of the spin compiler entirely. Is that acceptable?

Rayman,
I'm not sure why you would want to "strip out the core" because the goal is for all of this to become the core that is used for all tools, including parallax's official ones. It will be able to compile existing spin code without modification.

Phil Pilgrim (PhiPi) · 2012-01-26 15:36

Roy Eltham wrote:

I think the declarations of those aliases should not be inside of the Spin Source though. Instead they should be handled by the wrapper.

Nooooo! Please, no! They're not aliases in that sense, but the names of sub-libraries. You could have the following library structure probably, but not necessarily, organized as subdirectories:

lib
  IO::
    Serial::
      FullDuplexSerial
      SimpleSerial
    SPI::
      SimpleSPI
    I2C::
      ComplicatedI2C
      SimpleI2C
  Math::
    Float::
      Float32
      Float32Full
    Conversion::
      SimpleNumbers

A fully-specified object name in the OBJ section must include the names of the sub-libraries of which it's a part. The person writing the code needs to be able to specify which library a particular object resides in. For example, there could easily be two objects of the same name, but in different sub-libraries, viz: Math::Float::Trig and Math::Integer::Trig.

As the OBEX grows ever larger, it will become an organizational nightmare to shovel everything into one library, without any further subdivisions. (Actually, it's already gotten to that point.) We have a chance, now that extensions are being considered for the compiler, to help effect a similar structural change in the OBEX. If we don't do it now, we will regret it later.

All that said, there's nothing wrong with having a command line switch or GUI config, to prepend to the default PATH (or replace it). But that's not the appropriate place to specify sub-libraries.

-Phil

pedward · 2012-01-26 15:37

Roy, you have understood correctly. In practice, PERL maps the namespace to a directory location and the nodes delimited by :: are directories or files.

That said, the notion of archiving SPIN files into a ZIP and distributing an archive of Gold Standard, etc, objects in that manner would be handy.

You would say, look in directory X, directory Y, directory Z, and also look in ZIP file FOO.ZIP for the objects.

The implementation is abstracted from the code.

EDIT: after reading Phil's post, I thought I'd be more clear.

The syntax would be:

OBJ
uartA: "IO::Serial::FullDuplexSerial"
math: "Math::Float::Float32"

How the "wrapper" maps that is entirely device dependent (to the device the compiler is running on). For simplicity PERL simply maps to directories, but there are other notions that are just as valid.
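
As one illustration of "other notions that are just as valid", here is a rough Python sketch in which the same "::" name is looked up first in plain directories and then inside a ZIP archive. Everything here is an assumption made for the example: the function names, the layout inside the archive (forward-slash paths ending in .spin), and the UTF-8 encoding.

```python
import os
import zipfile


def object_relpath(name):
    """'IO::Serial::FullDuplexSerial' -> 'IO/Serial/FullDuplexSerial.spin'"""
    return "/".join(name.split("::")) + ".spin"


def load_object(name, locations):
    """Return the source text of the first match, or None.

    Each entry in `locations` is either a directory or the path of a ZIP
    archive; entries are tried in the order given.
    """
    rel = object_relpath(name)
    for location in locations:
        if os.path.isdir(location):
            candidate = os.path.join(location, *rel.split("/"))
            if os.path.isfile(candidate):
                with open(candidate, encoding="utf-8") as handle:
                    return handle.read()
        elif zipfile.is_zipfile(location):
            with zipfile.ZipFile(location) as archive:
                # ZIP member names always use forward slashes.
                if rel in archive.namelist():
                    return archive.read(rel).decode("utf-8")
    return None
```

The OBJ section stays the same in both cases; only the list of locations handed to the wrapper changes.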

Rayman · 2012-01-26 15:42

Roy, since we can't have the original assembly, I'd really like just the basic C++ version that does exactly the same thing...
I'm not really that interested in these other features and I'm worried that adding extra stuff may cause unpredictable results...
I hope you at least make the extra stuff optional...

yeti · 2012-01-26 16:00

pedward wrote: »

Phil is correct in that the path should be specified by an external directive. This is how it's been done on Unix for ages, the LIBRARY env variable IIRC.

Please explain...
I'm using unix for >> 2 decades and don't know "the LIBRARY env variable"...
Please enlighten me!

Sapieha · 2012-01-26 16:10

Hi Ken.

Sorry but I have another opinion on Posting x86 Code.

Many of us understand it very well and that code can help us find BUGS that still can be in Roy's translated code.

No offense to Roy for his work, but it is always better to have more eyes that can look and find problems.

Ken Gracey wrote: »

What Roy said.

The objective was to produce the C/C++ compiler, to validate it as bug-free, and to have it be the tool from which we make the various extensions and modifications identified above. The x86 assembly source requires some support and explanation from Chip, which Roy was able to obtain after several visits and discussions (and a year of work). We don't have a formal testing process in place for Roy's translation yet, but Jeff will start testing after we get the Eclipse GUI closer to release for the PropGCC beta.

I'm not about to burden Chip with the request, let alone expose him to the follow-up questions (even though Rayman wouldn't bother him, others will). Right now Chip's on a productive path with Propeller 2 and one of my jobs is to give him the environment to finish the project.

Ken Gracey

P.S. to Roy: welcome to the Star Contributors section of our web site!

Roy Eltham · 2012-01-26 16:11

Phil,
Maybe alias was a bad word to use. I used alias, because what you describe is sort of similar to namespace aliases in C++ (although those don't map to a path on disk at all). I think I still got what you want for the most part.

The part of my message that you quoted and then said Nooo! to doesn't quite fit with what you say after the Nooo!. The part I want clarified is where the "mapping" of a library name to an actual OS path exists. I don't think that belongs in the spin source file. I think it is handled by the wrapper in the sense that you define them in some way for the wrapper and then it handles loading the appropriate actual file from disk for the given declaration in the source (e.g. myObj: "Math::Float::Float32" would end up loading from /library/math/MyFloatVariant/Float32.spin based on you having configured the wrapper (via gui or command line options or a config file) for that path mapping. It could also end up loading /library/math/float/Float32.spin given a different path mapping).
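
The mapping idea in the paragraph above can be sketched as follows, in Python for illustration. The map_object function and the mappings dictionary are hypothetical; how such a table would actually be configured (GUI, command line option, config file) is left open, as in the post. POSIX-style paths are used so the result matches the example paths above.

```python
import posixpath


def map_object(name, mappings):
    """Return a file path for `name`, or None when no mapping applies.

    `mappings` maps a sub-library name such as "Math::Float" to a directory.
    The longest matching sub-library prefix wins; the parts after the prefix
    become subdirectories and, finally, the .spin file name.
    """
    parts = name.split("::")
    for count in range(len(parts) - 1, 0, -1):
        library = "::".join(parts[:count])
        if library in mappings:
            return posixpath.join(mappings[library], *parts[count:]) + ".spin"
    return None
```

With {"Math::Float": "/library/math/MyFloatVariant"} the name "Math::Float::Float32" maps to /library/math/MyFloatVariant/Float32.spin, and with {"Math::Float": "/library/math/float"} it maps to /library/math/float/Float32.spin, without any change to the Spin source.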

Rayman,
Much of the "extra stuff" will be optional in the sense that if you don't use it, it won't ever be invoked. In fact most of what we have been discussing at length here involves changes to the wrapper more so than the compiler itself. Preprocessing and file handling are all outside of the compiler core (which is in the PropellerCompiler folder in the google code repository). The main concern I have is that if you stay with "only what the x86 code did" in your tool(s) then your tool(s) will eventually not be compatible with all code made in the official parallax tools which are going to include the new features. This will be even more of a problem when we start adding in the Prop2 stuff.

pedward · 2012-01-26 16:42

yeti wrote: »

Please explain...
I'm using unix for >> 2 decades and don't know "the LIBRARY env variable"...
Please enlighten me!

LIBRARY_PATH is the env variable used by the compiler (GCC). The analogs for the pre-processor are CPATH and C_INCLUDE_PATH.

Phil Pilgrim (PhiPi) · 2012-01-26 16:56

Roy,

I see what you're getting at. You could, as an extra bonus feature, create an alias that maps the Math::Float library to Math::MyFloatVariant, but you would still have to specify "Math::Float::Float32" in the OBJ section for the match to Math::Float to be identified and the mapping to take place. That could be a useful feature on top of what I was suggesting.

In my simpler scenario, you would have a default PATH environment that might look like "c:\Program Files\Parallax\Propeller\lib", to which I might prepend "c:\jobs\propeller\mylib", so that it would be searched first, before the default library is searched. So if I had a file named "c:\jobs\propeller\mylib\IO\Serial\FullDuplexSerial.spin", it would get picked over "c:\Program Files\Parallax\Propeller\lib\IO\Serial\FullDuplexSerial.spin" when I specify "IO::Serial::FullDuplexSerial" in my OBJ section.

-Phil

Roy Eltham · 2012-01-26 17:50

Phil,
Ding! I finally properly get what you were asking for there. All that it would take is allowing the ':' character to be in filename strings in spin, and then in the wrapper code converting the '::' notation in filename strings into the native directory delimiter of the OS we're running on. The existing wrapper code already tries each path you specify with the -I command line option, in order, in front of the filename string to find/load the file. So it would work like you described.
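
A minimal sketch of that behavior, in Python for illustration rather than the wrapper's actual C++: the "::" notation is converted to the native directory delimiter, and each search path is tried in order. The function names and the exists parameter are hypothetical.

```python
import os


def object_to_relative_path(name, sep=os.sep):
    """'IO::Serial::FullDuplexSerial' -> 'IO<sep>Serial<sep>FullDuplexSerial.spin'"""
    return sep.join(name.split("::")) + ".spin"


def resolve_object(name, search_paths, exists=os.path.isfile):
    """Try each search path in order; return the first file that exists."""
    relative = object_to_relative_path(name)
    for directory in search_paths:
        candidate = os.path.join(directory, relative)
        if exists(candidate):
            return candidate
    return None
```

Prepending a private directory to search_paths makes a local FullDuplexSerial.spin win over the one in the default library, which is the scenario Phil described.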

I was imagining the variants existing in the same directory branch as the "originals" just in new folders. Then using one include path and any defined aliases you could redirect Math::Float to Math::MyFloatVariant. The spin code would say Math::Float, but the wrapper would load Math::MyFloatVariant because of the alias. That seems overly complicated and unnecessary. Your version is simpler.

Cluso99 · 2012-01-26 18:02

Roy Eltham wrote: »

A few quick things (I'll reply in more detail when I am not at work):

1. I am inclined to use '/' ONLY as a directory delimiter for OBJ references. I do not like the idea of using :: for it at all, sorry. I also did not intend to allow for other OS specific elements in there either (so no "C:"). I did plan to support using '.' and '..' to mean current directory and parent directory and '/' to mean the root. As far as I know those all work the same across linux, mac, and windows.

Agreed

2. My feeling is that #includes can go anywhere in the code and include any kind of file containing any kind of data, it's just part of the preprocessor.

Agreed - we have been after this for sooooo loooong

3. It's my hope that once this is solid, I'll work with Jeff Martin at parallax to make it so that the Propeller Tool uses this instead of the x86 code it uses now. This would be an interim step to bring up the existing tool to the current Spin compiler while a new cross platform tool is worked on. It should be pretty easy and quick for us to do it, once we are ready.

I think this would be a great step in the right direction. To avoid problems, call the new PropTool v2.x so this allows perhaps a later rev of the existing PropTool without problems. Allow both versions to coexist so we can check outputs if a problem/bug is found.

4. I'm really not a fan of the "with" stuff at all. I feel that the minor amount of typing it saves comes with a strong downside of code obfuscation. (e.g. when I see #CR I expect to find CR defined in the current file, I think it's horrible to have to find and navigate indirections that could be significantly distant from the instance I am looking at).

I agree, the extra typing is worth the cleaner code readability.

Way to go Roy !

RossH · 2012-01-26 18:35

Roy Eltham wrote: »

I'll start with a list here of a few things I was hoping to do...

1. Full preprocessor pass

Yay!

Roy Eltham wrote: »

2. Make OBJ references work like #include, allowing full/partial paths.

Yay!

Roy Eltham wrote: »

3. Dead code elimination (for spin, not pasm).

Yay!

Roy Eltham wrote: »

4. Case sensitivity

Won't this break almost all existing Spin programs?

Roy Eltham wrote: »

5. Add the '<exp> ? <true> : <false>' (ternary operation). This is a Chip request, that I really want too.

Yes, but in the true tradition of Spin, be sure to make the semantics just very slightly different to those of C!

Roy Eltham wrote: »

6. Allow () for parameter-less functions.

Optional, I presume. But why would you want this?

Can I add some more ...

Arbitrary length identifiers.
Arbitrary size symbol tables.
Arbitrarily large segment sizes.
Compiler listing outputs.
Symbol table outputs (e.g. symbol type and address).

Ross.

Roy Eltham · 2012-01-26 18:53

RossH,
The case sensitivity thing would be an option, you would turn on if you wanted it for your code. It would probably have to be a special CON entry like _CLKMODE, _STACK, _FREE, etc.

For the Ternary operation, since ? is already used for random number, I'll have to use some alternate form. Perhaps '<exp> ?? <true> : <false>' or '<exp> ?: <true>, <false>' ?

The () on parameter-less functions would be optional (that's why I said it as allow). I think it's more clearly readable as a function call versus a variable if it has the () on it.

RE: Your Requests:

My symbol table is actually a couple hash tables (one for predefined and one for compile time defined), and doesn't have a limit to number of entries other than memory.

I did keep the symbol length limit intact at 30(32 really internally), simply because a lot of the x86 code referred to it, and during translation it was easier to keep around. I was planning to propose changing that with Chip and Jeff. It would be easy in the C/C++ code.
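
For readers who want a picture of the arrangement described in the two paragraphs above, here is a small Python sketch with two hash tables and a length limit. It is not the compiler's data structure: the SymbolTable class, its method names, and the choice to check the predefined table first are assumptions made for the example. Names are folded to upper case because Spin identifiers are not case sensitive.

```python
MAX_SYMBOL_LENGTH = 30  # the limit mentioned in the post


class SymbolTable:
    """Two dictionaries: predefined symbols and symbols defined while compiling."""

    def __init__(self, predefined):
        self._predefined = {key.upper(): value for key, value in predefined.items()}
        self._defined = {}

    def define(self, name, value):
        """Add a compile-time symbol, enforcing the length limit."""
        if len(name) > MAX_SYMBOL_LENGTH:
            raise ValueError("symbol exceeds %d characters" % MAX_SYMBOL_LENGTH)
        key = name.upper()
        if key in self._predefined or key in self._defined:
            raise KeyError("symbol already defined: " + name)
        self._defined[key] = value

    def lookup(self, name):
        """Return the symbol's value, or None if it is unknown."""
        key = name.upper()
        if key in self._predefined:
            return self._predefined[key]
        return self._defined.get(key)
```

Since both tables are dictionaries, the number of entries is bounded only by memory, and lifting the length limit would amount to changing one constant in this sketch.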

The compiler already does have a symbol listing output, that is never really exposed to the user directly, but is used by the prop tool internally. My wrapper has an option (-v) to dump that stuff out to stdout along with some other info and the hex dump of the binary image.

Not sure what you mean precisely by segment sizes in Spin/PASM. Do you just mean allowing for PASM bigger than 496 longs, and spin larger than 32k?

By compiler listing outputs, do you mean dumping some form of human readable version of the spin bytecode?

Roy

John Abshier · 2012-01-26 19:32

Remember that Spin has two groups of users, experts (most of posters to this thread) and readers of What's a Microcontroller-Prop Edition. Don't make Spin so great that casual/new users get locked out.
John Abshier

pedward · 2012-01-26 20:19

I have a simple request in the next SPIN, I want inline assembler. C can do inline assembler, and SPIN could do it if the interpreter helped. I don't want huge chunks of inline, but there are many operations that are stupid simple easier in PASM than in SPIN.
