New BASIC compiler for Prop1 and Prop2 — Parallax Forums


ersmithersmith Posts: 6,068
edited 2019-08-14 16:14 in General Discussion
Edit August 2019: It's been almost a year since I started the project, and I think BASIC support in fastspin is now very mature. We have some solid features like:
- Broadly FreeBASIC / MS BASIC compatible (including support for really old programs that use line numbers and gosub)
- A simple preprocessor that allows #define, #ifdef / #else / #endif, for conditional compilation and simple macro substitution
- Inline assembly inside functions and subroutines, or in the main program
- Support for floats, strings, integers, pointers, arrays, and user defined structures
- Produce optimized Propeller executables for both P1 and P2
- The same compiler supports PASM, Spin, BASIC, and C, so functions written in any of those languages can call each other.

The early part of this thread has some thrashing around about the language design; you can ignore that (most of it is obsolete). In the end I decided to make the strings garbage collected, and this vastly simplified things. File I/O is done with traditional BASIC "open" and "print #n, x" style statements.

I've attached the current PDF documentation to this message so you can see what the language looks like and what features it has.

Rather than trying to keep the originally attached .zip here up to date, I'll just add pointers. Note that "spin2gui" can be used for BASIC development as well as Spin, and works for both P1 and P2:

spin2gui: https://github.com/totalspectrum/spin2gui/releases
fastspin: https://github.com/totalspectrum/spin2cpp/releases

spin2gui contains fastspin, proploader, loadp2, and a simple editor, so it has everything you need to try things out on Windows. Linux/Mac users can build fastspin and spin2gui themselves from the source code. I develop on Linux, so that should definitely work, and I think there are some Mac users as well.

Edit: The work in progress compiler is attached here. To call it, use a command line:
  fastspin myprog.bas
which will produce myprog.binary. I've left the rest of the message the same, but some of it is obsolete... see the thread for discussions on how the language has evolved.

I'm working on a BASIC compiler for Prop2 (which will incidentally support Prop1 too, since it's based on fastspin which handles both). It'll be similar to PropBasic in that it will compile to COG or LMM code, but it's not a PropBasic replacement -- the intention is to make a more Microsoft like syntax, rather than PropBasic's pbasic syntax.

What features would you like to see in a BASIC compiler for Prop1 (and/or Prop2)? I've got the following things planned:

(1) Support for using Spin objects (so easy access to existing objects)
(2) Floating point and string support built in. Types are either inferred from the name ("a$" is a string, "a" is an integer) or explicitly declared in a DIM statement.
(3) Syntax that's a subset of FreeBasic.
(4) Optimized PASM code output
(5) Can directly build binaries (no need for bstc or any other Spin compiler)

The string support is probably going to be the hardest part, since BASIC traditionally has pretty powerful string handling, much more so than Spin or C. At present I'm thinking of limiting strings to 255 characters in length to simplify some of the code. Is that too restrictive?

How important are multi-dimensional arrays? At present fastspin just supports one-dimensional arrays, so there's some work to do to add multiple dimensions, but it is a traditional BASIC feature.

It's still quite a ways away from production, although the curious can get it from the spin2cpp GitHub repository (you'll have to build it yourself from the "basic" branch; again, it's nowhere near ready for production, so I won't be releasing binaries for a while yet). Today I got the first programs running. Here's a sample of something that works:
''
'' import the Spin FullDuplexSerial object
''
class fullduplex using "FullDuplexSerial.spin"

'' create a full duplex serial object
dim ser as fullduplex

'' start the serial
ser.start(31, 30, 0, 115_200)

rem note that as usual in basic, the variable i is
rem declared automatically

for i = 1 to 10
  ser.dec(i)
  newline()
next i

do
  rem loop forever
loop

sub newline()
  ser.tx(13)
  ser.tx(10)
end sub



Comments

  • Cool! Are you going to have a heap for dynamic strings?
  • jmgjmg Posts: 15,175
    edited 2018-09-03 23:19
    ersmith wrote: »
    I'm working on a BASIC compiler for Prop2 (which will incidentally support Prop1 too, since it's based on fastspin which handles both). It'll be similar to PropBasic in that it will compile to COG or LMM code, but it's not a PropBasic replacement -- the intention is to make a more Microsoft like syntax, rather than PropBasic's pbasic syntax.

    What features would you like to see in a BASIC compiler for Prop1 (and/or Prop2)? I've got the following things planned:

    (1) Support for using Spin objects (so easy access to existing objects)
    (2) Floating point and string support built in. Types are either inferred from the name ("a$" is a string, "a" is an integer) or explicitly declared in a DIM statement.
    (3) Syntax that's a subset of FreeBasic.
    (4) Optimized PASM code output
    (5) Can directly build binaries (no need for bstc or any other Spin compiler)

    Sounds good, I do like the 'more FreeBasic compatible' idea.

    I recently added a patch to PropBasic to allow PoseidonFB IDE (see below image) to call PropBASIC, and pick up the error messages for error-line-highlight.
    Was quite easy to do, just a minor shuffle of the error report line, to be FreeBASIC cloned.
    That allows you to quickly leverage the various FreeBASIC IDEs - I was using FBide, but PoseidonFB seems to be gaining traction. There is also FBedit.


    Will it include the same conditional preprocessor FreeBASIC does, so you can have one source for both - allows PCs to do testing of functions / code.

    What about Asm..End Asm to allow in-line assembler ie same syntax as FreeBASIC ?

    Could it give a slightly better 'not supported' type message on places where FreeBASIC is a superset ? (rather than a syntax error)

    ersmith wrote: »
    The string support is probably going to be the hardest part, since BASIC traditionally has pretty powerful string handling, much more so than Spin or C. At present I'm thinking of limiting strings to 255 characters in length to simplify some of the code. Is that too restrictive?
    Hmm, on small MCUs, 255 is probably tolerable, so for a P1 that's likely ok. However, P2 is not so small anymore... - could this be some system option ?
    - eg smaller strings for more compact code (eg P1), but larger ones allowed for P2 ?

    ersmith wrote: »
    How important are multi-dimensional arrays? At present fastspin just supports one-dimensional arrays, so there's some work to do to add multiple dimensions, but it is a traditional BASIC feature.
    Yes, that's nice to have, but does not need to be in the first release.

    Modified PropBASIC called from PoseidonFB IDE, with error parsing shown : (/FB command line option now included in latest PropBASIC release, to reformat error reports)
    PropBasic_PoseidonFB_IDE.PNG
  • David Betz wrote: »
    Cool! Are you going to have a heap for dynamic strings?

    Well, dynamic strings would pretty much require garbage collection, which seems like a lot of trouble. I was hoping to get away with putting the maximum length of the string in the upper bits of the pointer. Then we'd be able to translate code like "A$ = B$ + C$" into something like:
       alen = ((unsigned)a) >> 24;
       blen = ((unsigned)b) >> 24;
       strncpy(a, b, min(alen, blen));
       clen = ((unsigned)c) >> 24;
       strncat(a, c, min(alen, clen));
    
    On the P2 we'd actually have 12 bits for the length, so strings could go up to 4095 long, but 255 is more "traditional" and also would allow for an alternate implementation where the first byte of the string array held the length.
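A minimal sketch of the tagged-pointer idea above, written in C for a host machine. The `hub` array and the names `make_str`, `str_maxlen`, `str_data`, and `str_concat` are hypothetical, and a 24-bit offset into `hub` stands in for a real Propeller address. Unlike the fragment above, it limits the second copy to the room left in A$ after B$ has been copied. It does not try to handle A$ aliasing B$ or C$.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simulated hub RAM (assumption): a "pointer" is a 24-bit offset into it. */
static char hub[1024];

/* Pack a maximum length (0..255) into the top 8 bits of a 32-bit handle.
   The buffer behind the handle needs maxlen + 1 bytes for the terminator. */
static uint32_t make_str(uint32_t offset, uint32_t maxlen) {
    assert(maxlen <= 255u);
    assert(offset + maxlen < sizeof hub);
    return (maxlen << 24) | (offset & 0x00FFFFFFu);
}

static uint32_t str_maxlen(uint32_t s) { return s >> 24; }
static char *str_data(uint32_t s) { return &hub[s & 0x00FFFFFFu]; }

/* A$ = B$ + C$, truncated so A$ never exceeds its declared maximum. */
static void str_concat(uint32_t a, uint32_t b, uint32_t c) {
    size_t room = str_maxlen(a);
    char *dst = str_data(a);
    const char *bs = str_data(b);
    const char *cs = str_data(c);

    size_t n = strlen(bs);          /* part of B$ that fits */
    if (n > room) n = room;
    memmove(dst, bs, n);
    room -= n;

    size_t m = strlen(cs);          /* part of C$ that fits in what is left */
    if (m > room) m = room;
    memmove(dst + n, cs, m);
    dst[n + m] = '\0';
}
```

With A$ declared as 8 characters, concatenating "hello, " and "world" leaves A$ holding the first 8 characters of the result rather than overrunning the buffer.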
  • ersmith wrote: »
    David Betz wrote: »
    Cool! Are you going to have a heap for dynamic strings?

    Well, dynamic strings would pretty much require garbage collection, which seems like a lot of trouble. I was hoping to get away with putting the maximum length of the string in the upper bits of the pointer. Then we'd be able to translate code like "A$ = B$ + C$" into something like:
       alen = ((unsigned)a) >> 24;
       blen = ((unsigned)b) >> 24;
       strncpy(a, b, min(alen, blen));
       clen = ((unsigned)c) >> 24;
       strncat(a, c, min(alen, clen));
    
    On the P2 we'd actually have 12 bits for the length, so strings could go up to 4095 long, but 255 is more "traditional" and also would allow for an alternate implementation where the first byte of the string array held the length.
    But if you have more complicated string expressions with nested calls to string functions you will probably need a way to create temporary strings on the fly.

  • jmg wrote: »
    I recently added a patch to PropBasic to allow PoseidonFB IDE (see below image) to call PropBASIC, and pick up the error messages for error-line-highlight.
    Was quite easy to do, just a minor shuffle of the error report line, to be FreeBASIC cloned.
    That sounds like a good idea. Do the IDEs have an easy way to change compiler and the way the output is run? It'd be nice to be able to call fastbasic (or whatever it ends up being called) and run the compiled binary directly from the IDE.
    Will it include the same conditional preprocessor FreeBASIC does, so you can have one source for both - allows PCs to do testing of functions / code.

    Initially it will have the same preprocessor as fastspin / openspin, which is pretty basic but does support simple #define, #ifdef / #else / #endif.
    What about Asm..End Asm to allow in-line assembler ie same syntax as FreeBASIC ?
    I'd like to do this. It's using the same engine as fastspin, so in principle supporting inline assembly is not a problem. Actually parsing the assembly will be a bit of a pain since in the Spin case I was able to re-use the DAT section parsing, whereas for BASIC I'll have to re-implement that. So maybe not the first release? We'll have to see.
    Could it give a slightly better 'not supported' type message on places where FreeBASIC is a superset ? (rather than a syntax error)
    Hmmm. That's also a good idea, and I think it could be done in at least some cases, but probably not for all of them.
  • David Betz wrote: »
    ersmith wrote: »
    David Betz wrote: »
    Cool! Are you going to have a heap for dynamic strings?

    Well, dynamic strings would pretty much require garbage collection, which seems like a lot of trouble. I was hoping to get away with putting the maximum length of the string in the upper bits of the pointer.
    But if you have more complicated string expressions with nested calls to string functions you will probably need a way to create temporary strings on the fly.

    True, but temporaries could be created on the stack, which will get freed when the function returns, so no need for garbage collection. Figuring out the size of temporaries could be a little tricky... I guess we could use the max of the lengths of any strings involved. Or maybe we just disallow any expressions that are too complicated to easily be converted? The only string returning operators I was planning to support initially were concatenation ("+") and the substring functions (LEFT$, MID$, RIGHT$). Hmmm, but for user defined functions I guess things get complicated. Dynamic strings would definitely make things easier on the compiler, but I worry about the size and space implications, not to mention the need for a garbage collector.

  • ersmith wrote: »
    David Betz wrote: »
    ersmith wrote: »
    David Betz wrote: »
    Cool! Are you going to have a heap for dynamic strings?

    Well, dynamic strings would pretty much require garbage collection, which seems like a lot of trouble. I was hoping to get away with putting the maximum length of the string in the upper bits of the pointer.
    But if you have more complicated string expressions with nested calls to string functions you will probably need a way to create temporary strings on the fly.

    True, but temporaries could be created on the stack, which will get freed when the function returns, so no need for garbage collection. Figuring out the size of temporaries could be a little tricky... I guess we could use the max of the lengths of any strings involved. Or maybe we just disallow any expressions that are too complicated to easily be converted? The only string returning operators I was planning to support initially were concatenation ("+") and the substring functions (LEFT$, MID$, RIGHT$). Hmmm, but for user defined functions I guess things get complicated. Dynamic strings would definitely make things easier on the compiler, but I worry about the size and space implications, not to mention the need for a garbage collector.
    At least a string garbage collector doesn't have to worry about recursively looking for references to heap objects. A string is just characters and can't refer to another object. It's simpler than general-purpose garbage collection.

  • David Betz wrote: »
    At least a string garbage collector doesn't have to worry about recursively looking for references to heap objects. A string is just characters and can't refer to another object. It's simpler than general-purpose garbage collection.

    But finding all the references to the strings requires looking through stack, heap, and registers, doesn't it?

    Maybe reference counted strings would be the way to go. Since the string doesn't contain any pointers it can't have loops, so at least in theory it should be do-able. I still kind of like the static allocation idea though because then you know how much space your program will need. Maybe a compromise where most strings are statically allocated but you can define a size of a string heap for temporaries?

  • Another question I had for everyone was how to handle PRINT. On the Prop I guess we can default to printing at some fixed baud rate over the standard serial pins, but for many purposes people will want to change this. How do other MCU BASICs handle this?

    One thought I had was allowing a PRINT WITH to specify a method to use for printing characters. The parameter would be an object and method that takes a single integer parameter and outputs it. Something like:
    class fdserial input "FullDuplexSerial.spin"
    class vga input "VGA.spin"
    
    dim ser as fdserial
    dim screen as vga
    
    print with ser.tx
    print "This message will go to serial"
    print "So will this one"
    
    print with screen.putchar
    print "This message and all subsequent ones will go to VGA"
    
  • Some old versions of BASIC used an ugly syntax like this:
    print #1, x
    

    This would print to whatever device was opened as #1. This could be a file or I guess it could be a TV or VGA driver. I think you opened files like this:
    open "foo.txt" as file 1
    
  • ersmith wrote: »
    David Betz wrote: »
    At least a string garbage collector doesn't have to worry about recursively looking for references to heap objects. A string is just characters and can't refer to another object. It's simpler than general-purpose garbage collection.

    But finding all the references to the strings requires looking through stack, heap, and registers, doesn't it?

    Maybe reference counted strings would be the way to go. Since the string doesn't contain any pointers it can't have loops, so at least in theory it should be do-able. I still kind of like the static allocation idea though because then you know how much space your program will need. Maybe a compromise where most strings are statically allocated but you can define a size of a string heap for temporaries?
    Yup, you still have to find all string references. I realized that the really old versions of BASIC had it easy because they only really had global variables so you could always find all references.

  • I think the Propeller is supremely suited to doing garbage collection. After all, it can be accomplished in a separate cog, with minimal effect on determinism -- save some locking and unlocking as things get moved around in hub RAM.

    -Phil
  • jmgjmg Posts: 15,175
    ersmith wrote: »
    jmg wrote: »
    I recently added a patch to PropBasic to allow PoseidonFB IDE (see below image) to call PropBASIC, and pick up the error messages for error-line-highlight.
    Was quite easy to do, just a minor shuffle of the error report line, to be FreeBASIC cloned.
    That sounds like a good idea. Do the IDEs have an easy way to change compiler and the way the output is run? It'd be nice to be able to call fastbasic (or whatever it ends up being called) and run the compiled binary directly from the IDE.

    Yes, in the test example above, I simply swapped the 'Compiler path' from the default ..fbc.exe to a path\to\CallPropBasic.BAT, and that file could include a download-if-no-errors line.
    Because fbc.exe is a standalone compiler most IDEs should be able to manage this.
    ersmith wrote: »
    Will it include the same conditional preprocessor FreeBASIC does, so you can have one source for both - allows PCs to do testing of functions / code.

    Initially it will have the same preprocessor as fastspin / openspin, which is pretty basic but does support simple #define, #ifdef / #else / #endif.

    That sounds good enough
    The FB help says
    https://www.freebasic.net/wiki/wikka.php?wakka=CatPgDddefines

    which gives examples of
    #if __FB_VERSION__ < "0.18" 
    #error  Please compile With FB version 0.18 Or above 
    #endif
    
    Print __FB_SIGNATURE__
    yields
    FreeBASIC 0.21.1
    

    that means a #ifdef __FB_VERSION__ test should be able to switch between platforms. Shame fastbasic truncates to FB ?
    ersmith wrote: »
    What about Asm..End Asm to allow in-line assembler ie same syntax as FreeBASIC ?
    I'd like to do this. It's using the same engine as fastspin, so in principle supporting inline assembly is not a problem. Actually parsing the assembly will be a bit of a pain since in the Spin case I was able to re-use the DAT section parsing, whereas for BASIC I'll have to re-implement that. So maybe not the first release? We'll have to see.
    It would be nice to have, as it also allows users to learn PASM, without having to learn all of PASM. If the output listings include the ASM/source as comments, that also helps them learn & paste/modify.

  • jmgjmg Posts: 15,175
    ersmith wrote: »
    Another question I had for everyone was how to handle PRINT. On the Prop I guess we can default to printing at some fixed baud rate over the standard serial pins, but for many purposes people will want to change this. How do other MCU BASICs handle this?
    FreeBASIC codes like this
    If Open Com ("COM16:115200,n,8,1,cs0,rs,ds0,cd0,bin,op2000,TB32000,RB32000" For Binary As #1) <> 0 Then  'generic baud 
      n = Err()
      Print "unable to open serial port, (press any key) Error Code: ";n
      Sleep
      End
    else
      n = Err()
      Print "No error on Open Com, Error Code: "; n  'always 0 ?
    End If
    
    Print "Sending command: AT+CrLf"
    Print #1, "AT" + Chr(13, 10);
    

    ersmith wrote: »
    One thought I had was allowing a PRINT WITH to specify a method to use for printing characters. The parameter would be an object and method that takes a single integer parameter and outputs it. Something like:
    class fdserial input "FullDuplexSerial.spin"
    class vga input "VGA.spin"
    
    dim ser as fdserial
    dim screen as vga
    
    print with ser.tx
    print "This message will go to serial"
    print "So will this one"
    
    print with screen.putchar
    print "This message and all subsequent ones will go to VGA"
    

    That's easy to read, but print #n, may be easier to test on PC hosts for example.
    eg I'm thinking here about using FB debuggers, to test Data/String/Flow code, complete with VAR watch etc, rather like a simulator, until the code is shaken down enough to try on a real chip.

    In the above example, the serial print would either go to a PC serial port, or a Prop port, with a conditional variant of the Open Com line.
  • jmgjmg Posts: 15,175
    edited 2018-09-04 04:29
    ersmith wrote: »
    Another question I had for everyone was how to handle PRINT. On the Prop I guess we can default to printing at some fixed baud rate over the standard serial pins, but for many purposes people will want to change this. How do other MCU BASICs handle this?

    Thinking about how someone might use PC COM ports to test, and also use Screen output for other messages, I ran some tests in FreeBasic (1.05).
    These were revealing, as not all means of screen write are identical, but they do show you could flip between COM ports and Cons, Scrn targets for useful debug & development.
    "CON" behaves slightly differently.

    FreeBASIC test code and results captured.
    REM Blank line added 
    Dim a as Integer
    Dim ern as Integer
    Dim s as String
    Dim L as Long
    
    print "1 hello out there, Started JTest"
    
    open Cons for output as #1    ' Works 
    open "CON" for output as #2   ' Works but delays until close 
    open Scrn for output as #3    ' Works 
    ern=err
    print "err=",ern
    #define TestPrint
    #IFDEF TestPrint
     print #1," 2 Opened Cons as #1 ";
     print #2," 3 print #2 statement";      ' No new line  
     print #3," 4 print #3 statement";      ' No new line  
     sleep 2000
     print #1," 5 second half Cons ";        ' No new line
     print " 6 hello out there "
    #ELSE  'Test put
     put #1, ," 2 Opened Cons as #1 "
     put #2, ," 3 put #2 statement"         ' No new line  
     put #3, ," 4 put #3 statement"         ' No new line  
     sleep 2000
     put #1, ," 5 second half Cons "         ' No new line
     print " 6 hello out there "
    #ENDIF
    
    close #1   ' Cons  - put/prints do not need flush
    close #2   ' "CON" - This seems to flush "CON" prints 
    close #3   ' Scrn  - put/prints do not need flush
    print " 7 HELLO out there "
    
    a=2
    s="23"
    L = a + Val(s)
    
    if L > 3 then 
     L = 3
    Else
     L = 1234
    EndIf 
    Print
    print "8 hello out there      888888 "+chr(13);  ' Cr returns to SOL, so NEXT line over-writes
    print "9 HELLO OUT       999999"+chr(13);        ' Cr returns to SOL, so NEXT line over-writes
    Print "10 press any key AA"
    Sleep
    ' Test capture put ::  - #1, #3 immediate, but #2 has delay effect until close #2
    
    '1 hello out there, Started JTest
    'err=           0
    ' 2 Opened Cons as #1  4 put #3 statement 5 second half Cons  6 hello out there
    ' 3 put #2 statement 7 HELLO out there
    '
    '10 press any key AA9999988888
    
    ' Test Capture TestPrint - same as above.
    
    '1 hello out there, Started JTest
    'err=           0
    ' 2 Opened Cons as #1  4 print #3 statement 5 second half Cons  6 hello out there
    ' 3 print #2 statement 7 HELLO out there
    '
    '10 press any key AA9999988888
    
    

    The above I was able to compile and debug (Step/watch) using this combination BAT file
    "C:\FreeBASIC\FreeBASIC_1.05.0_Win32\fbc.exe" -g -v JTest.bas
    "C:\FreeBASIC\fbdebugger292\fbdbg 32\fbdebugger.exe" C:\FreeBASIC\COM_tests\JTest.exe

    but the 64b versions of those have some issues...
  • Peter JakackiPeter Jakacki Posts: 10,193
    edited 2018-09-04 09:15
    I've only been lightly perusing this thread out of interest but on the question of output devices I prefer to select a device for output and then whatever is sent goes to that output device. No need to have PRINT WITH etc. For instance in Tachyon I can simply redirect output from the serial console to say the VGA device and all output switches to that device. There is no need to hard code anything for a particular device and even system output is redirected too totally transparently. Same goes for input and files so all these work the same:
    WORDS or VGA WORDS or LCD WORDS or 5 SERIAL WORDS to output serial on pin 5 or FILE> WORDS where output writes to the open file in the currently selected channel etc. The standard console is reset back with CON.

    Dynamic strings are another thing, I'd be interested to see what you end up doing in this regard :)
  • BeanBean Posts: 8,129
    ersmith,
    I fully support this project. The more options the better.
    I will be watching this thread to see how it works out.
    Good luck,
    Bean
  • Bean wrote: »
    ersmith,
    I fully support this project. The more options the better.
    I will be watching this thread to see how it works out.
    Good luck,
    Bean
    You guys are all going to be blown away once I roll out my new AdvSys2 language as a general-purpose Propeller programming language! :smile:
    Don't worry yet though. At the moment it can't even blink an LED!
  • I've only been lightly perusing this thread out of interest but on the question of output devices I prefer to select a device for output and then whatever is sent goes to that output device. No need to have PRINT WITH etc.
    WORDS or VGA WORDS or LCD WORDS or 5 SERIAL WORDS to output serial on pin 5 or FILE> WORDS where output writes to the open file in the currently selected channel etc. The standard console is reset back with CON.
    FORTH has the advantage there of being a functional language, so composing words gives you a very powerful way to modify things. I guess the closest we'd get to that in BASIC would be to do away with PRINT and just implement a PRINT method on device objects, so something like:
       ser.print("hello, terminal")
       vga.print("hello, screen")
    
    Which is a fine way to do things, but not quite as aligned with "traditional" BASIC as I was hoping for.
    Dynamic strings are another thing, I'd be interested to see what you end up doing in this regard :)

    I think David has convinced me to go with reference counted dynamic strings. I had really hoped to avoid that, but the use case of:
    function greet$(name$)
      return "hello, " + name$
    end function
    
    requires that functions be able to return temporary strings, which in turn means they can't always be allocated on the stack :(. On the other hand I think that's the only case where dynamic heap allocation is absolutely required. If we disallow it then implementing strings becomes vastly simpler. But again, the language wouldn't be "traditional" BASIC, nor as user friendly.

    Actually implementing the reference counting pretty much requires being able to run some code whenever a string is created, copied (decrementing the old reference, incrementing the new), or destroyed. The last one is the biggest rub, since it requires running destructors on objects going out of scope, e.g. temporary variables inside a function before the function returns. Which starts taking us into serious object oriented programming territory, but does open up some interesting other possibilities.

    The other alternative, a mark-and-sweep type garbage collector, might be simpler if the references are always held in HUB memory. But if we allow references in COG memory then it becomes impossible, because one COG can't read other COGs memory to see if there are references there. Maybe an extra layer of indirection, like a handle table, might avoid that problem.

    Lots to think about, anyway. Thanks everyone for your feedback.

    Eric
  • Bean wrote: »
    ersmith,
    I fully support this project. The more options the better.
    I will be watching this thread to see how it works out.
    Good luck,
    Bean

    Thanks Bean! I definitely don't think of this compiler as a replacement for PropBasic, but rather as a parallel approach. I like what you did with PropBasic -- it's an elegant compiler, and great for people coming from the Basic Stamp to the Prop. I never did any Stamp programming though, so my mind is kind of stuck on old school 80's BASIC with a touch of Microsoft's later changes.

    Eric
  • Peter JakackiPeter Jakacki Posts: 10,193
    edited 2018-09-04 12:34
    I have wondered about whether to bother or not with dynamic strings before but for me they are used a bit differently although having capable string functions probably really requires it. There are always all those equivalent functions to Basic for manipulating strings and one needs to allocate space, maintain that space as long as it's needed, and know when it's not. I've never given it much thought until now though. But I do know that a heap is not boundless so therefore whatever limit is determined basically means that as long as that limit had not been exceeded then leave it as it is. So when it comes time to make room my first thought is to have a fixed table of string pointers and attributes including last accessed time where my 32-bit ms runtime count would be used as the stamp and the oldest culled first. The new string could use that area that is released with the remainder moved up/down if necessary and the pointer table updated. Perhaps you guys are familiar with more sophisticated methods anyway and maybe I'm way off but that's just my thinking at the moment that I thought I'd share. (I will try this out and see how dumb or good it is I guess)

    However, I can't see why your output device couldn't just be revectorable and Basic only has to point the output vector to VGA and so all output is sent to that "device". I try to make all my devices handle streams rather than painfully calling methods, so rather than public vga.cr or vga.cls there is only vga.out which handles the stream of characters and controls etc which calls the internal private methods for vga.cr and so on (but normally vga.out is not called directly). The stream doesn't always have to be 8-bit although that is more convenient for printing. EMIT in Tachyon doesn't really do anything except determine what to pass the data to which by default is the serial console.

    Example: Imagine that you only have serial output but on the other end of that serial output are smarts that can redirect the stream to many different devices, including files. Your code just needs to send (or call) that magic word to switch the output device.
  • BeanBean Posts: 8,129
    edited 2018-09-04 12:54
    ersmith wrote: »
    Bean wrote: »
    ersmith,
    I fully support this project. The more options the better.
    I will be watching this thread to see how it works out.
    Good luck,
    Bean

    Thanks Bean! I definitely don't think of this compiler as a replacement for PropBasic, but rather as a parallel approach. I like what you did with PropBasic -- it's an elegant compiler, and great for people coming from the Basic Stamp to the Prop. I never did any Stamp programming though, so my mind is kind of stuck on old school 80's BASIC with a touch of Microsoft's later changes.

    Eric

    PropBasic has its roots in SX/B for the SX chip.
    If I was to re-write it today I would do things very differently.
    I would start with a good expression evaluation procedure.

    Yeah, I'm thinking I might get some good PropBasic enhancement ideas from your compiler too.

    I already like "PRINT WITH" I would probably use it something like:
    OutChar SUB 1 ' Character output
    
    Start:
      PRINT WITH OutChar, "Hello World!"
    
    END
    
    SUB OutChar
      SEROUT SOutPin, Baud, __PARAM1
    ENDSUB
    

    Bean
  • ersmithersmith Posts: 6,068
    edited 2018-09-04 13:45
    So when it comes time to make room my first thought is to have a fixed table of string pointers and attributes including last accessed time where my 32-bit ms runtime count would be used as the stamp and the oldest culled first. The new string could use that area that is released with the remainder moved up/down if necessary and the pointer table updated.
    That seems a bit dangerous -- it would usually work, but there isn't any guarantee that the oldest string is never going to be accessed again, so one could certainly see scenarios where it would overwrite a string that gets used later.

    In the approach I mentioned above each string has a use count, which gets incremented when it is copied and decremented when the copy is no longer in use. When the count reaches 0 the string is freed. This approach works well for static data like strings; it can fail for data structures that can involve loops (because then you can get a loop in which each element points to another, so the use counts are nonzero, but the loop as a whole is no longer reachable).
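A minimal C sketch of the use-count idea described above. The names (`rc_string`, `rc_new`, `rc_retain`, `rc_release`, `rc_live_strings`) are hypothetical and chosen only for illustration; nothing here is taken from the compiler's actual runtime.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference-counted string (illustration only). */
typedef struct rc_string {
    int   refs;   /* number of live references to this string */
    char *text;   /* heap copy of the characters, NUL-terminated */
} rc_string;

static int rc_live_strings = 0;  /* demo counter: strings not yet freed */

/* Create a string with a use count of 1. Returns NULL if out of memory. */
rc_string *rc_new(const char *s)
{
    size_t n = strlen(s) + 1;
    rc_string *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->text = malloc(n);
    if (p->text == NULL) {
        free(p);
        return NULL;
    }
    memcpy(p->text, s, n);
    p->refs = 1;
    rc_live_strings++;
    return p;
}

/* Called whenever the string is copied to another variable. */
rc_string *rc_retain(rc_string *p)
{
    assert(p != NULL && p->refs > 0);
    p->refs++;
    return p;
}

/* Called when a copy goes out of use; frees the string at count 0. */
void rc_release(rc_string *p)
{
    assert(p != NULL && p->refs > 0);
    if (--p->refs == 0) {
        free(p->text);
        free(p);
        rc_live_strings--;
    }
}
```

Because a string holds no references to other objects, a cycle cannot form, which is why plain counting is sufficient in this case.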

    Another approach, mark-and-sweep, is a bit different. Basically when you run out of memory you walk through the heap and mark all objects as "free", then run through the stack and global variables and any time you see a reference to an object you re-mark it as "used". At the end you should have an accurate tally of used and free memory (anything no longer referenced will be marked "free") and you can then re-allocate from the new free space.
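The mark-and-sweep pass described above could look roughly like this in C. This is a sketch under stated assumptions: a small fixed pool of string slots stands in for the heap, an array of pointers stands in for "the stack and global variables", and objects never point at other objects (so no recursive marking is needed). All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SLOTS 4

/* One heap object: a fixed-size string slot (illustration only). */
typedef struct {
    int  used;
    char text[16];
} slot;

static slot pool[POOL_SLOTS];

/* Hand out the first slot that is not marked used, or NULL if full. */
slot *gc_alloc(void)
{
    for (size_t i = 0; i < POOL_SLOTS; i++) {
        if (!pool[i].used) {
            pool[i].used = 1;
            return &pool[i];
        }
    }
    return NULL;
}

/* Mark everything free, then re-mark whatever the roots still reference.
 * roots[] plays the role of the stack and global variables.
 * Returns the number of slots that ended up free. */
size_t gc_collect(slot **roots, size_t nroots)
{
    size_t i, free_count = 0;

    assert(roots != NULL || nroots == 0);

    for (i = 0; i < POOL_SLOTS; i++)      /* step 1: mark all as free */
        pool[i].used = 0;

    for (i = 0; i < nroots; i++)          /* step 2: re-mark reachable */
        if (roots[i] != NULL)
            roots[i]->used = 1;

    for (i = 0; i < POOL_SLOTS; i++)      /* step 3: tally free space */
        if (!pool[i].used)
            free_count++;

    return free_count;
}
```

An allocator would call `gc_collect` only when `gc_alloc` returns NULL, then retry the allocation.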
    However, I can't see why your output device couldn't just be revectorable and Basic only has to point the output vector to VGA and so all output is sent to that "device". I try to make all my devices handle streams rather than painfully calling methods, so rather than public vga.cr or vga.cls there is only vga.out which handles the stream of characters and controls etc which calls the internal private methods for vga.cr and so on (but normally vga.out is not called directly).
    Well, that's basically what PRINT WITH was intended to do -- it sets the vector used internally by PRINT. So basically PRINT WITH redirects the output to the new stream. I illustrated it with some methods, but any kind of function could be used as the argument for PRINT WITH. Internally PRINT would do all the formatting and then when it wants to output characters it would call the vector. I guess I didn't explain it very well.

    I guess we could rename PRINT WITH as OPEN and change around the syntax a bit to allow multiple vectors. Then we could do:
      OPEN #0, vgafunc  ' set default PRINT function
      OPEN #1, serfunc   ' set alternate PRINT function #1
      PRINT "hello, vga"
      PRINT #1, "hello serial"
      PRINT "hello again, vga"
      '' now revector using LCD instead of VGA
      OPEN #0, lcdfunc
      PRINT "now print is going to LCD instead of VGA"
      PRINT #1, "print #1 still goes to serial"
    
    (so plain PRINT is an alias for PRINT #0)
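One way to picture the revectoring is a small table of character-output function pointers indexed by channel number. The C sketch below is only an illustration of that idea, not the compiler's implementation; `open_channel`, `print_on`, `print`, and the two capture "devices" are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHANNELS 4

/* A device is just a function that accepts one character. */
typedef void (*putc_fn)(int ch);

static putc_fn channels[MAX_CHANNELS];

/* Rough analogue of OPEN #n, func: point channel n at a new device. */
void open_channel(int n, putc_fn fn)
{
    assert(n >= 0 && n < MAX_CHANNELS);
    channels[n] = fn;
}

/* Rough analogue of PRINT #n, s: formatting would happen here, and the
 * resulting characters are pushed through the channel's vector. */
void print_on(int n, const char *s)
{
    assert(n >= 0 && n < MAX_CHANNELS);
    putc_fn out = channels[n];
    if (out == NULL)
        return;                 /* channel not opened: drop the output */
    while (*s)
        out((unsigned char)*s++);
    out('\n');
}

/* Plain PRINT is an alias for PRINT #0. */
void print(const char *s)
{
    print_on(0, s);
}

/* Two stand-in devices that capture output into buffers. */
static char   vga_buf[64], ser_buf[64];
static size_t vga_len, ser_len;

void vgafunc(int ch)
{
    if (vga_len < sizeof vga_buf - 1)
        vga_buf[vga_len++] = (char)ch;
}

void serfunc(int ch)
{
    if (ser_len < sizeof ser_buf - 1)
        ser_buf[ser_len++] = (char)ch;
}
```

Calling `open_channel(0, serfunc)` later on redirects every subsequent plain `print` without touching the code that does the printing.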
  • You may want to use handles for strings so you can compact string space. If a program does a lot of string manipulation the heap could get pretty fragmented.
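To illustrate the handle suggestion: if programs hold an index into a handle table rather than a raw pointer, the string bytes can be slid together to close gaps and only the table needs updating. The C sketch below assumes a tiny bump-allocated heap; `h_new`, `h_free`, `h_text`, and `h_compact` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HEAP_BYTES  64
#define MAX_HANDLES 8

static char   heap[HEAP_BYTES];
static size_t heap_top;                 /* next unused byte in heap[] */

static struct {
    size_t off;                         /* where the string starts */
    size_t len;                         /* bytes used, including NUL */
    int    live;
} handles[MAX_HANDLES];

/* Store a copy of s; returns a handle index, or -1 if there is no room. */
int h_new(const char *s)
{
    size_t n = strlen(s) + 1;
    if (heap_top + n > HEAP_BYTES)
        return -1;
    for (int h = 0; h < MAX_HANDLES; h++) {
        if (!handles[h].live) {
            memcpy(heap + heap_top, s, n);
            handles[h].off  = heap_top;
            handles[h].len  = n;
            handles[h].live = 1;
            heap_top += n;
            return h;
        }
    }
    return -1;
}

void h_free(int h)
{
    assert(h >= 0 && h < MAX_HANDLES && handles[h].live);
    handles[h].live = 0;
}

/* Always go through the handle; the address may change on compaction. */
const char *h_text(int h)
{
    assert(h >= 0 && h < MAX_HANDLES && handles[h].live);
    return heap + handles[h].off;
}

/* Slide live strings toward the start of the heap, lowest offset first,
 * and update each handle to its new location. */
void h_compact(void)
{
    size_t dst = 0;   /* next free byte in the compacted heap */
    size_t src = 0;   /* old offsets below this are already handled */

    for (;;) {
        int next = -1;
        for (int i = 0; i < MAX_HANDLES; i++)
            if (handles[i].live && handles[i].off >= src &&
                (next < 0 || handles[i].off < handles[next].off))
                next = i;
        if (next < 0)
            break;
        src = handles[next].off + handles[next].len;
        memmove(heap + dst, heap + handles[next].off, handles[next].len);
        handles[next].off = dst;
        dst += handles[next].len;
    }
    heap_top = dst;
}
```

The cost is one extra indirection on every string access, in exchange for being able to recover fragmented space.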
  • I'm still trying to figure out if it's possible to avoid heap allocation entirely. The main problem with stack based allocation is returning stack objects. But that problem goes away if we pass a pointer to the destination object, and use that for return values. That is, a function like:
    function greet$(name$)
      return "hello " + name$
    end function
    
    gets translated internally into:
    sub greet_$(byref retval$, name$)
      retval$ = "hello " + name$
    end sub
    
    Then in
      a$ = greet$(u$)
    
    we end up doing the C equivalent of:
       strncpy(a.ptr, "hello ", a.size);
       strncat(a.ptr, u.ptr, a.size);
    
    There will still be times when we'll have to allocate temporaries on the stack, e.g. in:
    PRINT greet$(firstname$ + lastname$)
    
    which will end up doing something like:
       struct string temp1;
       temp1.size = firstname.size + lastname.size;
       temp1.ptr = alloca(temp1.size);
       strncpy(temp1.ptr, firstname.ptr, temp1.size);
       strncat(temp1.ptr, lastname.ptr, temp1.size);
       struct string temp2;
       temp2.size = min(temp1.size, DEFAULT_STRING_LEN);
       temp2.ptr = alloca(temp2.size);
       greet_(temp2, temp1); // result will be copied to temp2.ptr
       print(temp2);
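For reference, a compilable C sketch of the "pass a pointer to the destination" idea, assuming each string carries a buffer pointer and a capacity. The type `bstring` and the helpers `bstr_assign`, `bstr_append`, and `greet` are hypothetical. Note that the third argument of `strncat` is the maximum number of characters to append, not the total buffer size, so the space already used has to be subtracted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-capacity string: caller owns the buffer. */
typedef struct {
    char  *ptr;    /* buffer, always NUL-terminated */
    size_t size;   /* capacity in bytes, including the terminator */
} bstring;

/* dst = src, truncating if src does not fit. */
void bstr_assign(bstring *dst, const char *src)
{
    assert(dst != NULL && dst->ptr != NULL && dst->size > 0);
    dst->ptr[0] = '\0';
    strncat(dst->ptr, src, dst->size - 1);
}

/* dst = dst + src, truncating if the result does not fit. */
void bstr_append(bstring *dst, const char *src)
{
    assert(dst != NULL && dst->ptr != NULL && dst->size > 0);
    size_t used = strlen(dst->ptr);
    assert(used < dst->size);
    strncat(dst->ptr, src, dst->size - 1 - used);
}

/* The translated form of greet$: the result is written into a
 * destination supplied by the caller, so nothing is heap-allocated. */
void greet(bstring *retval, const bstring *name)
{
    bstr_assign(retval, "hello ");
    bstr_append(retval, name->ptr);
}
```

With a 16-byte destination and the name "world", `greet` leaves "hello world" in the caller's buffer; a destination that is too small gets a truncated but still terminated result.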
    
  • ersmith wrote: »
    I'm still trying to figure out if it's possible to avoid heap allocation entirely. The main problem with stack based allocation is returning stack objects. But that problem goes away if we pass a pointer to the destination object, and use that for return values. That is, a function like:
    function greet$(name$)
      return "hello " + name$
    end function
    
    gets translated internally into:
    sub greet_$(byref retval$, name$)
      retval$ = "hello " + name$
    end sub
    
    Then in
      a$ = greet$(u$)
    
    we end up doing the C equivalent of:
       strncpy(a.ptr, "hello ", a.size);
       strncat(a.ptr, u.ptr, a.size);
    
    There will still be times when we'll have to allocate temporaries on the stack, e.g. in:
    PRINT greet$(firstname$ + lastname$)
    
    which will end up doing something like:
       struct string temp1;
       temp1.size = firstname.size + lastname.size;
       temp1.ptr = alloca(temp1.size);
       strncpy(temp1.ptr, firstname.ptr, temp1.size);
       strncat(temp1.ptr, lastname.ptr, temp1.size);
       struct string temp2;
       temp2.size = min(temp1.size, DEFAULT_STRING_LEN);
       temp2.ptr = alloca(temp2.size);
       greet_(temp2, temp1); // result will be copied to temp2.ptr
       print(temp2);
    
    Clever. I tried to figure out how to do that with xbasic but didn't come up with that idea. Maybe I should go back and try it.

  • Very excited about this :cool:
  • jmg Posts: 15,175
    edited 2018-09-04 20:30
    ersmith wrote: »
    Well, that's basically what PRINT WITH was intended to do -- it sets the vector used internally by PRINT. So basically PRINT WITH redirects the output to the new stream. I illustrated it with some methods, but any kind of function could be used as the argument for PRINT WITH. Internally PRINT would do all the formatting and then when it wants to output characters it would call the vector. I guess I didn't explain it very well.

    I guess we could rename PRINT WITH as OPEN and change around the syntax a bit to allow multiple vectors. Then we could do:
      OPEN #0, vgafunc  ' set default PRINT function
      OPEN #1, serfunc   ' set alternate PRINT function #1
      PRINT "hello, vga"
      PRINT #1, "hello serial"
      PRINT "hello again, vga"
      '' now revector using LCD instead of VGA
      OPEN #0, lcdfunc
      PRINT "now print is going to LCD instead of VGA"
      PRINT #1, "print #1 still goes to serial"
    
    (so plain PRINT is an alias for PRINT #0)

    I like the approach of keeping this broadly compile-compatible with FreeBASIC, as that means users can instantly use any existing FreeBASIC IDEs / compilers / debuggers for development.

    E.g., here I tested named ports, and this code compiles and runs in FreeBASIC.
    It needs the # prefix in PRINT (it is not mandatory in OPEN),
    but it does allow PRINT #vgafunc, print-list, which is easier to maintain and follow while staying compatible with FreeBASIC.
    ' Blank line added Derived from JTest, this adds Const names for ports
    Dim a as Integer
    Dim ern as Integer
    Dim s as String
    Dim L as Long
    Const AS Long cCons = 1, cCON = 2, cScrn = 3
    
    print "1 hello out there, Started JTestC"
    ' Example FB COM port syntax
    'If Open Com ("COM16:115200,n,8,1,cs0,rs,ds0,cd0,bin,op2000,TB32000,RB32000" For Binary As #1) <> 0 Then  'generic baud 
    ' The common typo/error case of a duplicate open (same port #) might be trappable/reportable at compile time? In FreeBASIC it compiles OK, but gives err=1 at run time.
    
    ' In FreeBASIC Cons,"CON" and Scrn are reserved keywords, and all can allow redirect #FileNum to screen
    print "cCons",cCons
    open Cons for output as cCons  ' Works  # here looks optional
    
    print "cCON",cCON
    open "CON" for output as cCON  ' Works but delays until close 
    
    print "cScrn",cScrn
    open Scrn for output as cScrn    ' Works 
    
    ern=err
    print "err=",ern    ' invalid cScrn = 2 gives err=1, valid 3,5 etc gives err 0
    #define TestPrint
    #IFDEF TestPrint
     print #cCons," 2 Opened Cons as ",cCons;   '# here is needed, to separate port# from a print param.
     print #cCON," 3 print as",cCON;        ' No new line  
     print #cScrn," 4 print as",cScrn;      ' No new line  
     sleep 2000
     print #cCons," 5 second half Cons ";        ' No new line
     print " 6 hello out there "
    #ELSE  'Test put
     put #cCons, ," 2 Opened cCons as ",cCons
     put #cCON, ," 3 put cCON statement"         ' No new line  
     put #cScrn, ," 4 put cScrn statement"         ' No new line  
     sleep 2000
     put #cCons, ," 5 second half Cons "         ' No new line
     print " 6 hello out there "
    #ENDIF
    
    close #cCons   ' Cons  - put/prints do not need flush
    close #cCON    ' "CON" - This seems to flush "CON" prints 
    close #cScrn   ' Scrn  - put/prints do not need flush
    print " 7 HELLO out there "
    
    a=2
    s="23"
    L = a + Val(s)
    
    if L > 3 then 
     L = 3
    Else
     L = 1234
    EndIf 
    Print
    print "8 hello out there      888888 "+chr(13);  ' Cr returns to SOL, so NEXT line over-writes
    print "9 HELLO OUT       999999"+chr(13);        ' Cr returns to SOL, so NEXT line over-writes
    Print "10 press any key AA"
    Sleep    'Wait for key
    
    ' Test captures ::  - #1, #3 immediate, but "CON" has delay effect until close #2
    
    ' Test Capture TestPrint - same delay effect on "CON" as above.
    
    '1 hello out there, Started JTestC
    'cCons          1
    'cCON           2
    'cScrn          3
    'err=           0
    ' 2 Opened Cons as            1 4 print as    3 5 second half Cons  6 hello out there
    ' 3 print as    2 7 HELLO out there
    '
    '10 press any key AA9999988888
    

    This variant of the same code uses FreeFile(), and it also compiles and runs. I'm less sure whether FreeFile() is needed in a Prop BASIC.
    REM Blank line added Derived from JTest, this adds names for ports
    Dim a as Integer
    Dim ern as Integer
    Dim s as String
    Dim L as Long
    Dim fCons As Long,fCON As Long, fScrn as Long
    
    print "1 hello out there, Started JTestN"
    ' Example FB COM port syntax
    'If Open Com ("COM16:115200,n,8,1,cs0,rs,ds0,cd0,bin,op2000,TB32000,RB32000" For Binary As #1) <> 0 Then  'generic baud 
    
    fCons = FreeFile
    print "fCons",fCons
    open Cons for output as fCons  ' Works 
    
    fCON = FreeFile
    print "fCON",fCON
    open "CON" for output as fCON  ' Works but delays until close 
    
    fScrn = FreeFile
    print "fScrn",fScrn
    open Scrn for output as fScrn    ' Works 
    
    ern=err
    print "err=",ern
    #define TestPrint
    #IFDEF TestPrint
     print #fCons," 2 Opened Cons as ",fCons;
     print #fCON," 3 print #2 statement as",fCON;        ' No new line  
     print #fScrn," 4 print #3 statement as",fScrn;      ' No new line  
     sleep 2000
     print #fCons," 5 second half Cons ";        ' No new line
     print " 6 hello out there "
    #ELSE  'Test put
     put #fCons, ," 2 Opened Cons as #1 "
     put #fCON, ," 3 put #2 statement"         ' No new line  
     put #fScrn, ," 4 put #3 statement"         ' No new line  
     sleep 2000
     put #fCons, ," 5 second half Cons "         ' No new line
     print " 6 hello out there "
    #ENDIF
    
    close #fCons   ' Cons  - put/prints do not need flush
    close #fCON    ' "CON" - This seems to flush "CON" prints 
    close #fScrn   ' Scrn  - put/prints do not need flush
    print " 7 HELLO out there "
    
    a=2
    s="23"
    L = a + Val(s)
    
    if L > 3 then 
     L = 3
    Else
     L = 1234
    EndIf 
    Print
    print "8 hello out there      888888 "+chr(13);  ' Cr returns to SOL, so NEXT line over-writes
    print "9 HELLO OUT       999999"+chr(13);        ' Cr returns to SOL, so NEXT line over-writes
    Print "10 press any key AA"
    Sleep    'Wait for key
    
    ' Test captures ::  - #1, #3 immediate, but #2 has delay effect until close #2
    
    ' Test Capture TestPrint - same as above.
    
    '1 hello out there, Started JTest
    'fCons          1
    'fCON           2
    'fScrn          3
    'err=           0
    ' 2 Opened Cons as            1 4 print #3 statement as     3 5 second half Cons  6 hello out there
    ' 3 print #2 statement as     2 7 HELLO out there
    '
    '10 press any key AA9999988888
    
    
    
  • jmg Posts: 15,175
    Here is an example of how a compatible #IFDEF, plus compile compatibility, could support dual-platform development and simulation/debug via FreeBASIC IDEs/debuggers:
    #IFDEF BlockComment
    Comment section 
    we can 
    type 
    anything inside BlockComment build from # IFDEF
    
    #ifdef __FB_VERSION__
    If Open Com ("COM16:115200,n,8,1,cs0,rs,ds0,cd0,bin,op2000,TB32000,RB32000" For Binary As cSerialA) <> 0 Then  'generic baud 
      n = Err()
      Print "unable to open serial port, (press any key) Error Code: ";n
      Sleep
      End
    else
      n = Err()
      Print "No error on Open Com, Error Code: "; n  'always 0
    End If
    open Cons for output as cCons  ' screen reporting, or can be another serial open...
    
    #else  'This is PropHW path - whatever Serial open is used in Prop...
      Open PropCom (31,32,115200,n,8,1) For Binary As cSerialA
      Open PropCom (1,2,115200,n,8,1) For Binary As cCons
    #endif
    
    
    Print #cSerialA,"SSS"  ' works in either platform
    Print #cCons,"CCC"     ' works in either platform
    
    #ENDIF
    
  • Cluso99 Posts: 18,069
    Excellent news. I've always wanted to see a more general Basic on P1. Having a compatible P1 and P2 Basic would be nice.