New BASIC compiler for Prop1 and Prop2
ersmith
Edit August 2019: It's been almost a year since I started the project, and I think BASIC support in fastspin is now very mature. We have some solid features like:
- Broadly FreeBASIC / MS BASIC compatible (including support for really old programs that use line numbers and gosub)
- A simple preprocessor that allows #define, #ifdef / #else / #endif, for conditional compilation and simple macro substitution
- Inline assembly inside functions and subroutines, or in the main program
- Support for floats, strings, integers, pointers, arrays, and user defined structures
- Produce optimized Propeller executables for both P1 and P2
- The same compiler supports PASM, Spin, BASIC, and C, so functions written in any of those languages can call each other.
The early part of this thread has some thrashing around about the language design; you can ignore that (most of it is obsolete). In the end I decided to make the strings garbage collected, and this vastly simplified things. File I/O is done with traditional BASIC "open" and "print #n, x" style statements.
I've attached the current PDF documentation to this message so you can see what the language looks like and what features it has.
Rather than trying to keep the originally attached .zip here up to date, I'll just add pointers. Note that "spin2gui" can be used for BASIC development as well as Spin, and works for both P1 and P2:
spin2gui: https://github.com/totalspectrum/spin2gui/releases
fastspin: https://github.com/totalspectrum/spin2cpp/releases
spin2gui contains fastspin, proploader, loadp2, and a simple editor, so it has everything you need to try things out on Windows. Linux/Mac users can build fastspin and spin2gui themselves from the source code. I develop on Linux, so that should definitely work, and I think there are some Mac users as well.
Edit: The work in progress compiler is attached here. To call it, use a command line:
fastspin myprog.bas
which will produce myprog.binary. I've left the rest of the message the same, but some of it is obsolete... see the thread for discussions on how the language has evolved.
I'm working on a BASIC compiler for Prop2 (which will incidentally support Prop1 too, since it's based on fastspin, which handles both). It'll be similar to PropBasic in that it will compile to COG or LMM code, but it's not a PropBasic replacement -- the intention is to make a more Microsoft-like syntax, rather than PropBasic's PBASIC syntax.
What features would you like to see in a BASIC compiler for Prop1 (and/or Prop2)? I've got the following things planned:
(1) Support for using Spin objects (so easy access to existing objects)
(2) Floating point and string support built in. Types are either inferred from the name ("a$" is a string, "a" is an integer) or explicitly declared in a DIM statement.
(3) Syntax that's a subset of FreeBasic.
(4) Optimized PASM code output
(5) Can directly build binaries (no need for bstc or any other Spin compiler)
The string support is probably going to be the hardest part, since BASIC traditionally has pretty powerful string handling, much more so than Spin or C. At present I'm thinking of limiting strings to 255 characters in length to simplify some of the code. Is that too restrictive?
How important are multi-dimensional arrays? At present fastspin just supports one-dimensional arrays, so there's some work to do to add multiple dimensions, but it is a traditional BASIC feature.
It's still quite a ways away from production, although the curious can get it from the spin2cpp GitHub repository (you'll have to build it yourself from the "basic" branch; again, it's nowhere near ready for production, so I won't be releasing binaries for a while yet). Today I got the first programs running. Here's a sample of something that works:
''
'' import the Spin FullDuplexSerial object
''
class fullduplex using "FullDuplexSerial.spin"

'' create a full duplex serial object
dim ser as fullduplex

'' start the serial
ser.start(31, 30, 0, 115_200)

rem note that as usual in basic, the variable i is
rem declared automatically
for i = 1 to 10
  ser.dec(i)
  newline()
next i

do
  rem loop forever
loop

sub newline()
  ser.tx(13)
  ser.tx(10)
end sub
Comments
Sounds good, I do like the 'more FreeBasic compatible' idea.
I recently added a patch to PropBasic to allow PoseidonFB IDE (see below image) to call PropBASIC, and pick up the error messages for error-line-highlight.
Was quite easy to do, just a minor shuffle of the error report line, to be FreeBASIC cloned.
That allows you to quickly leverage the various FreeBASIC IDEs - I was using FBide, but PoseidonFB seems to be gaining traction. There is also FBedit.
Will it include the same conditional preprocessor FreeBASIC does, so you can have one source for both - allows PCs to do testing of functions / code.
What about Asm..End Asm to allow in-line assembler ie same syntax as FreeBASIC ?
Could it give a slightly better 'not supported' type message on places where FreeBASIC is a superset ? (rather than a syntax error)
Hmm, on small MCUs, 255 is probably tolerable, so for a P1 that's likely ok. However, P2 is not so small anymore... - could this be some system option ?
- eg smaller strings for more compact code (eg P1), but larger ones allowed for P2 ?
Yes, that's nice to have, but does not need to be in the first release.
Modified PropBASIC called from PoseidonFB IDE, with error parsing shown : (/FB command line option now included in latest PropBASIC release, to reformat error reports)
Well, dynamic strings would pretty much require garbage collection, which seems like a lot of trouble. I was hoping to get away with putting the maximum length of the string in the upper bits of the pointer. Then we'd be able to translate code like "A$ = B$ + C$" into something like: On the P2 we'd actually have 12 bits for the length, so strings could go up to 4095 long, but 255 is more "traditional" and also would allow for an alternate implementation where the first byte of the string array held the length.
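The packed-pointer idea can be sketched on a PC, in C rather than BASIC. This is only an illustration under stated assumptions, not code from fastspin: hub RAM is modelled as a byte array, a "pointer" is an offset into it, and the names `strref`, `make_strref`, and `str_concat` are made up for the example. The top 8 bits hold the buffer's maximum length, and a statement like "A$ = B$ + C$" becomes a bounded concatenation into A$'s buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: hub RAM is a byte array, a "pointer" is an offset. */
#define HUB_SIZE 4096u
static uint8_t hub[HUB_SIZE];

/* Maximum length in the top 8 bits, hub address in the low 24 bits. */
typedef uint32_t strref;

strref make_strref(uint32_t addr, uint32_t maxlen)
{
    /* Each buffer needs maxlen characters plus a terminating NUL. */
    assert(maxlen <= 255u && addr + maxlen + 1u <= HUB_SIZE);
    return (maxlen << 24) | addr;
}

uint32_t str_maxlen(strref s) { return s >> 24; }
char *str_chars(strref s) { return (char *)&hub[s & 0x00FFFFFFu]; }

/* dst = b + c, silently truncated to dst's capacity. */
void str_concat(strref dst, strref b, strref c)
{
    char tmp[256];                       /* scratch, so dst may alias b or c */
    size_t cap = str_maxlen(dst);
    size_t n = 0;
    const char *src[2] = { str_chars(b), str_chars(c) };

    for (int i = 0; i < 2; i++)
        for (const char *p = src[i]; *p != '\0' && n < cap; p++)
            tmp[n++] = *p;
    tmp[n] = '\0';
    memcpy(str_chars(dst), tmp, n + 1);
}
```

Because the capacity travels with the reference, the callee can truncate without any extra argument, which is the main attraction of the scheme; the cost is that every string has a fixed buffer reserved up front.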
Initially it will have the same preprocessor as fastspin / openspin, which is pretty basic but does support simple #define, #ifdef / #else / #endif.
I'd like to do this. It's using the same engine as fastspin, so in principle supporting inline assembly is not a problem. Actually parsing the assembly will be a bit of a pain since in the Spin case I was able to re-use the DAT section parsing, whereas for BASIC I'll have to re-implement that. So maybe not the first release? We'll have to see.
Hmmm. That's also a good idea, and I think it could be done in at least some cases, but probably not for all of them.
True, but temporaries could be created on the stack, which will get freed when the function returns, so no need for garbage collection. Figuring out the size of temporaries could be a little tricky... I guess we could use the max of the lengths of any strings involved. Or maybe we just disallow any expressions that are too complicated to easily be converted? The only string returning operators I was planning to support initially were concatenation ("+") and the substring functions (LEFT$, MID$, RIGHT$). Hmmm, but for user defined functions I guess things get complicated. Dynamic strings would definitely make things easier on the compiler, but I worry about the size and space implications, not to mention the need for a garbage collector.
But finding all the references to the strings requires looking through stack, heap, and registers, doesn't it?
Maybe reference counted strings would be the way to go. Since the string doesn't contain any pointers it can't have loops, so at least in theory it should be do-able. I still kind of like the static allocation idea though because then you know how much space your program will need. Maybe a compromise where most strings are statically allocated but you can define a size of a string heap for temporaries?
One thought I had was allowing a PRINT WITH to specify a method to use for printing characters. The parameter would be an object and method that takes a single integer parameter and outputs it. Something like:
This would print to whatever device was opened as #1. This could be a file or I guess it could be a TV or VGA driver. I think you opened files like this:
-Phil
Yes, in the test example above, I simply swapped in the 'Compiler path' from the default ..fbc.exe to a path\to\CallPropBasic.BAT and that file could include a download if no errors line.
Because fbc.exe is a standalone compiler most IDEs should be able to manage this.
That sounds good enough
The FB help says
https://www.freebasic.net/wiki/wikka.php?wakka=CatPgDddefines
which gives examples of
that means a #ifdef __FB_VERSION__ test should be able to switch between platforms. Shame fastbasic truncates to FB ?
It would be nice to have, as it also allows users to learn PASM, without having to learn all of PASM. If the listings out have ASM/source included as comments, that also helps them learn & paste/modify.
That's easy to read, but print #n, may be easier to test on PC hosts for example.
eg I'm thinking here about using FB debuggers, to test Data/String/Flow code, complete with VAR watch etc, rather like a simulator, until the code is shaken down enough to try on a real chip.
In the above example, the serial print would either go to a PC serial port, or a Prop port, with a conditional variant of the Open Com line.
Thinking about how someone might use PC COM ports to test, and also use Screen output for other messages, I ran some tests in FreeBasic (1.05).
These were revealing, as not all means of screen write are identical, but they do show you could flip between COM ports and Cons, Scrn targets for useful debug & development.
"CON" behaves slightly differently.
FreeBASIC test code and results captured.
The above I was able to compile and debug (Step/watch) using this combination BAT file
"C:\FreeBASIC\FreeBASIC_1.05.0_Win32\fbc.exe" -g -v JTest.bas
"C:\FreeBASIC\fbdebugger292\fbdbg 32\fbdebugger.exe" C:\FreeBASIC\COM_tests\JTest.exe
but the 64b versions of those have some issues...
WORDS or VGA WORDS or LCD WORDS or 5 SERIAL WORDS to output serial on pin 5 or FILE> WORDS where output writes to the open file in the currently selected channel etc. The standard console is reset back with CON.
Dynamic strings are another thing, I'd be interested to see what you end up doing in this regard
I fully support this project. The more options the better.
I will be watching this thread to see how it works out.
Good luck,
Bean
Don't worry yet though. At the moment it can't even blink an LED!
I think David has convinced me to go with reference counted dynamic strings. I had really hoped to avoid that, but the use case of: requires that functions be able to return temporary strings, which in turn means they can't always be allocated on the stack. On the other hand I think that's the only case where dynamic heap allocation is absolutely required. If we disallow it then implementing strings becomes vastly simpler. But again, the language wouldn't be "traditional" BASIC, nor as user friendly.
Actually implementing the reference counting pretty much requires being able to run some code whenever a string is created, copied (decrementing the old reference, incrementing the new), or destroyed. The last one is the biggest rub, since it requires running destructors on objects going out of scope, e.g. temporary variables inside a function before the function returns. Which starts taking us into serious object oriented programming territory, but does open up some interesting other possibilities.
The other alternative, a mark-and-sweep type garbage collector, might be simpler if the references are always held in HUB memory. But if we allow references in COG memory then it becomes impossible, because one COG can't read other COGs memory to see if there are references there. Maybe an extra layer of indirection, like a handle table, might avoid that problem.
Lots to think about, anyway. Thanks everyone for your feedback.
Eric
Thanks Bean! I definitely don't think of this compiler as a replacement for PropBasic, but rather as a parallel approach. I like what you did with PropBasic -- it's an elegant compiler, and great for people coming from the Basic Stamp to the Prop. I never did any Stamp programming though, so my mind is kind of stuck on old school 80's BASIC with a touch of Microsoft's later changes.
Eric
However, I can't see why your output device couldn't just be revectorable and Basic only has to point the output vector to VGA and so all output is sent to that "device". I try to make all my devices handle streams rather than painfully calling methods, so rather than public vga.cr or vga.cls there is only vga.out which handles the stream of characters and controls etc which calls the internal private methods for vga.cr and so on (but normally vga.out is not called directly). The stream doesn't always have to be 8-bit although that is more convenient for printing. EMIT in Tachyon doesn't really do anything except determine what to pass the data to which by default is the serial console.
Example: Imagine that you only have serial output but on the other end of that serial output are smarts that can redirect the stream to many different devices, including files. Your code just needs to send (or call) that magic word to switch the output device.
PropBasic has its roots in SX/B for the SX chip.
If I was to re-write it today I would do things very differently.
I would start with a good expression evaluation procedure.
Yeah, I'm thinking I might get some good PropBasic enhancement ideas from your compiler too.
I already like "PRINT WITH" I would probably use it something like:
Bean
In the approach I mentioned above each string has a use count, which gets incremented when it is copied and decremented when the copy is no longer in use. When the count reaches 0 the string is freed. This approach works well for static data like strings; it can fail for data structures that can involve loops (because then you can get a loop in which each element points to another, so the use counts are nonzero, but the loop as a whole is no longer reachable).
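A minimal sketch of the use-count scheme, in C on a PC. The type `rcstr` and the functions `rc_new`, `rc_retain`, `rc_release`, and `rc_assign` are hypothetical names for this example, and `malloc`/`free` stand in for whatever heap the compiler's runtime would actually provide.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical heap string: a use count followed by the characters. */
typedef struct {
    int  refs;
    char text[];          /* NUL-terminated (C99 flexible array member) */
} rcstr;

int live_strings = 0;     /* bookkeeping for the example only */

rcstr *rc_new(const char *s)
{
    size_t n = strlen(s) + 1;
    rcstr *p = malloc(sizeof *p + n);
    if (p == NULL)
        return NULL;
    p->refs = 1;          /* the creator holds the first reference */
    memcpy(p->text, s, n);
    live_strings++;
    return p;
}

rcstr *rc_retain(rcstr *p)
{
    if (p != NULL)
        p->refs++;
    return p;
}

void rc_release(rcstr *p)
{
    if (p != NULL && --p->refs == 0) {
        free(p);          /* count reached 0: nobody can reach it any more */
        live_strings--;
    }
}

/* "dst = src": retain the new value before releasing the old one, so that
   assigning a variable to itself cannot free the string it keeps. */
void rc_assign(rcstr **dst, rcstr *src)
{
    rc_retain(src);
    rc_release(*dst);
    *dst = src;
}
```

The compiler would have to emit the equivalent of `rc_assign` for every string assignment and `rc_release` for every string variable that goes out of scope, which is where most of the implementation effort would go.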
Another approach, mark-and-sweep, is a bit different. Basically when you run out of memory you walk through the heap and mark all objects as "free", then run through the stack and global variables and any time you see a reference to an object you re-mark it as "used". At the end you should have an accurate tally of used and free memory (anything no longer referenced will be marked "free") and you can then re-allocate from the new free space.
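The same steps can be sketched in C. This is an illustration only: the fixed pool of slots, the `slot` type, and the `gc_alloc` / `gc_collect` names are assumptions made for the example, and the caller is assumed to be able to list every root (each global or stack location that may hold a string reference). Since strings contain no pointers, marking a root's target is enough and no recursive traversal is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HEAP_SLOTS 8

/* Hypothetical heap: a fixed pool of string slots. */
typedef struct {
    bool allocated;       /* handed out by gc_alloc */
    bool used;            /* set during collection if a root refers to it */
    char text[32];
} slot;

slot heap[HEAP_SLOTS];

slot *gc_alloc(void)
{
    for (int i = 0; i < HEAP_SLOTS; i++) {
        if (!heap[i].allocated) {
            heap[i].allocated = true;
            heap[i].text[0] = '\0';
            return &heap[i];
        }
    }
    return NULL;          /* out of memory: collect, then retry */
}

/* Returns the number of slots reclaimed. */
int gc_collect(slot **roots, size_t nroots)
{
    int freed = 0;

    for (int i = 0; i < HEAP_SLOTS; i++)      /* 1. mark everything "free" */
        heap[i].used = false;

    for (size_t r = 0; r < nroots; r++)       /* 2. re-mark what roots see */
        if (roots[r] != NULL)
            roots[r]->used = true;

    for (int i = 0; i < HEAP_SLOTS; i++) {    /* 3. reclaim the rest */
        if (heap[i].allocated && !heap[i].used) {
            heap[i].allocated = false;
            freed++;
        }
    }
    return freed;
}
```

The hard part on the Propeller is the root list itself: as noted earlier in the thread, a reference held only in another COG's memory cannot be seen by the collector.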
Well, that's basically what PRINT WITH was intended to do -- it sets the vector used internally by PRINT. So basically PRINT WITH redirects the output to the new stream. I illustrated it with some methods, but any kind of function could be used as the argument for PRINT WITH. Internally PRINT would do all the formatting and then when it wants to output characters it would call the vector. I guess I didn't explain it very well.
I guess we could rename PRINT WITH as OPEN and change around the syntax a bit to allow multiple vectors. Then we could do: (so plain PRINT is an alias for PRINT #0)
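As a rough illustration of the vector idea (written in C rather than BASIC, and not taken from fastspin), each channel number can map to a function that emits one character; PRINT does the formatting and then pushes characters through the vector. The names `open_channel`, `print_str`, `print_int`, and `capture_putc` are hypothetical, and `capture_putc` merely stands in for a serial or VGA driver's character output routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_CHANNELS 4

/* An output vector: a function that emits one character. */
typedef void (*putc_fn)(int ch);

putc_fn channels[MAX_CHANNELS];   /* channel 0 plays the role of plain PRINT */

/* Bind a character sink to channel n; returns 0 on success, -1 on error. */
int open_channel(int n, putc_fn fn)
{
    if (n < 0 || n >= MAX_CHANNELS || fn == NULL)
        return -1;
    channels[n] = fn;
    return 0;
}

/* Send a string through the channel's vector; unopened channels drop it. */
void print_str(int n, const char *s)
{
    if (n < 0 || n >= MAX_CHANNELS || channels[n] == NULL)
        return;
    while (*s != '\0')
        channels[n]((unsigned char)*s++);
}

/* "PRINT #n, value": format first, then emit through the vector. */
void print_int(int n, int value)
{
    char buf[16];
    snprintf(buf, sizeof buf, "%d", value);
    print_str(n, buf);
}

/* Example sink that records output in a buffer. */
char captured[64];
size_t captured_len = 0;

void capture_putc(int ch)
{
    if (captured_len + 1 < sizeof captured) {
        captured[captured_len++] = (char)ch;
        captured[captured_len] = '\0';
    }
}
```

With this shape the formatting code never needs to know what device it is talking to, so adding a new output device only means supplying one more single-character function.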
I like the approach of keeping this broadly compile compatible with FreeBASIC, as that means users can instantly use any existing FreeBASIC IDE / Compilers / Debuggers as development
eg Here I tested named ports, and this code compiles and runs in FreeBASIC.
Needs the prefix #, in Print, (not mandatory in Open..)
but does allow PRINT #vgafunc, Print,list which is easier to maintain and follow, but keeps compatible with FreeBASIC.
This variant of the same code uses FreeFile(), and it also compiles and runs - I'm less sure if FreeFile() is needed in a Prop Basic ?