junkbasic
David Betz
in Propeller 2
I know "junkbasic" isn't a very encouraging name but I've written so many BASIC interpreters that I've ended up abandoning for various reasons that I thought maybe putting the word "junk" in the name might make this the first one that doesn't actually get junked.
Anyway, my intent with this project is to first create an interactive BASIC for the P2 that compiles to byte codes that are then interpreted by a fast XBYTE VM. As a second step I'd like to make the compiler generate PASM code to improve performance.
So I'm sort of in both the language design and language implementation phase. I can start implementation early because I can "borrow" code from all of the other BASIC interpreters I've written in the past which in turn borrowed code from XLISP, Bob, AdvSys, and countless other languages I've created over the years.
However, I still have some language design problems to solve before I can actually run any programs under junkbasic. The one I'm working on now has to do with function/sub calls vs. array references. The ebasic3 interpreter that I wrote used parens to indicate function or subroutine calls and square brackets for array references. This is like C but not really the way BASIC usually works. BASIC typically uses parens for both operations. So my problem is that when I compile something like "foo(1, 2)", I might not know whether "foo" is a function, a subroutine, or an array. I could solve this by not allowing any forward references to arrays and then just assuming that the expression is a function call if the symbol "foo" is as yet undefined, but I'm not sure if that is the way other BASICs work.
So, my question is: Do implementations of BASIC allow forward references to arrays? I know I have to allow forward references to functions because someone might want to define a pair of mutually recursive functions and the only other way to handle that would be with a second compiler pass. I'd like to avoid that because it requires that I have the entire program available and an interactive BASIC might just get one piece of the program at a time as typed by the user.
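To make the single-pass idea concrete, here is a minimal Python sketch of the rule described above: an undefined name followed by parens is assumed to be a function call and recorded for later patching, while arrays have to be declared before their first use. Everything in it is hypothetical and for illustration only (the `Compiler` class, its method names, and the tuple "opcodes" are assumptions, not junkbasic's actual code).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Symbol:
    name: str
    kind: str                       # "array", "function", or "forward"
    address: Optional[int] = None   # code index of a defined function
    fixups: List[int] = field(default_factory=list)  # calls awaiting an address


class Compiler:
    """Hypothetical single-pass resolver for name(args) expressions."""

    def __init__(self):
        self.symbols = {}
        self.code = []

    def declare_array(self, name):
        # Arrays may not be forward referenced: a name that was already
        # used with parens has been assumed to be a function.
        sym = self.symbols.get(name)
        if sym is not None and sym.kind == "forward":
            raise SyntaxError(f"'{name}' was already used as a function")
        if sym is not None:
            raise SyntaxError(f"'{name}' is already defined")
        self.symbols[name] = Symbol(name, "array")

    def define_function(self, name):
        address = len(self.code)
        sym = self.symbols.get(name)
        if sym is None:
            self.symbols[name] = Symbol(name, "function", address)
        elif sym.kind == "forward":
            # Patch every call that was compiled before the definition.
            for index in sym.fixups:
                self.code[index] = ("CALL", address)
            sym.kind, sym.address, sym.fixups = "function", address, []
        else:
            raise SyntaxError(f"'{name}' is already defined")
        self.code.append(("ENTER", name))  # stand-in for the function body

    def compile_paren_expr(self, name, argc):
        sym = self.symbols.get(name)
        if sym is None:
            # Undefined name: assume a forward reference to a function.
            sym = self.symbols[name] = Symbol(name, "forward")
        if sym.kind == "array":
            self.code.append(("INDEX", name, argc))
        elif sym.kind == "function":
            self.code.append(("CALL", sym.address))
        else:
            sym.fixups.append(len(self.code))
            self.code.append(("CALL", None))  # address filled in later

    def unresolved(self):
        # Names that were called but never defined.
        return sorted(n for n, s in self.symbols.items() if s.kind == "forward")


# Usage: two mutually recursive functions plus a declared array.
c = Compiler()
c.declare_array("a")
c.define_function("is_even")
c.compile_paren_expr("is_odd", 1)   # forward reference, patched below
c.define_function("is_odd")
c.compile_paren_expr("is_even", 1)
c.compile_paren_expr("a", 2)
```

With this scheme no second pass is needed; an interactive session would only have to check `unresolved()` before running, and report any name that was called but never defined.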
Any language design people out there who might have suggestions about this?
Comments
Actually, I think brackets are the standard mathematical notation for array references, no? I wonder why BASIC used parens? Were square brackets a late addition to the character set?
More broadly, it's great to see more development happening on P2.
-Phil
The "corrupted heap" message is probably due to memory not being initialized to 0 (the -O2 compiler leaves off static data). I think if you give the -ZERO option to loadp2 it should fix that.
Yes, there's some code in the garbage collector that does. It's a bug, I haven't figured out how to fix it yet though.
No, unfortunately not.
riscvp2 would probably produce a smaller binary (using compressed RISC-V instructions). But personally I'd probably wait and deal with the size issue after everything is working.