PNut/Spin2 Latest Version (v44 - Data Structures Added, New Methods for Memory Testing/Manipulation)

macca · 2023-11-13 17:57

@ersmith said:
There are several nice suggestions for dealing with the specific issue of the new bytes, words, longs keywords, but honestly I think these miss the forest for the trees. Sooner or later we're going to need a way to introduce new keywords into the Spin2 language without breaking existing code. So how about a version identifier for objects? Something like {v42} as the very first line of an object to indicate that it requires version 42 of the language (i.e. the language as implemented by PNut version 42)?

It would be good if the compiler adapted itself to the required version, like the C++ standard:
If no version specified: compile with the latest features enabled (and break code).
Version specified: disable all features added after that version.

Of course this would still break all existing code, because none of it will have the version specified in the source (maybe a command-line option too).

Personally, I'm starting to like the idea of definition precedence so as not to break existing code. I did a quick experiment by defining a bytes method and it seems to work well. I need to work more on this.

ersmith · 2023-11-13 20:10

@macca : Ah, I see, you're proposing that any new keyword should be "soft", so that a user definition (as a variable or method) takes precedence over the new built in meaning? That actually sounds like a good idea, although it does create a difference between the old and new keywords (new ones can be used as variable names, old ones cannot). But that's probably a small price to pay for backwards compatibility. @cgracey , is this something you could implement in PNut?
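
A minimal Python sketch of that "soft keyword" lookup, assuming the compiler checks user symbols before the new built-ins. The names HARD_KEYWORDS, SOFT_KEYWORDS, declare and resolve are made up for illustration and are not taken from PNut or any other Spin2 compiler; the keyword sets are only samples.

```python
# Hypothetical sketch of "soft" keyword resolution; not PNut's actual code.
HARD_KEYWORDS = {"byte", "word", "long", "repeat", "string"}   # old, always reserved (sample)
SOFT_KEYWORDS = {"bytes", "words", "longs", "lstring"}          # new, can be shadowed


def declare(user_symbols, name, kind):
    """Record a user variable or method; old keywords stay off limits."""
    key = name.lower()  # Spin2 symbols are case-insensitive
    if key in HARD_KEYWORDS:
        raise ValueError(f"'{name}' is a reserved word")
    user_symbols[key] = kind


def resolve(user_symbols, name):
    """Hard keywords always win; otherwise user definitions beat soft keywords."""
    key = name.lower()
    if key in HARD_KEYWORDS:
        return "keyword"
    if key in user_symbols:
        return user_symbols[key]
    if key in SOFT_KEYWORDS:
        return "keyword"
    return "undefined"


symbols = {}
declare(symbols, "bytes", "variable")      # old code that uses 'bytes' as a variable
print(resolve(symbols, "bytes"))           # user definition takes precedence
print(resolve(symbols, "WORDS"))           # not redefined, so still the built-in
```

This also shows the asymmetry mentioned above: `bytes` can be redeclared, while `byte` cannot.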

cgracey · 2023-11-14 04:51

@ersmith said:
This is a nice feature, but it's another breaking change... the word bytes in particular may well be used as a variable in some existing code. This kind of breakage of old code is precisely what programmers get frustrated with on "big" systems like Windows and Linux. With more and more objects being added to the OBEX, it'd be nice to find some way to ensure that those objects will work with future compilers. Some proposals:

(1) We could start all new keywords with % (so %BYTES, %WORDS, and so on). Right now a letter is not legal after a percent sign, so this shouldn't break anything, and provides a huge namespace for expansion.

(2) We could require a special comment like {$v42} at the start of any program that uses the new keywords. To make it easier for tools (like vscode) this comment should be the first thing in the file and the syntax should be as simple as possible. If the comment is missing, or if the version given in the comment is less than the version needed for the keyword, the keyword is just a regular identifier. This could get slightly messy with objects (the compiler would have to keep track of the version of each object separately) but would provide a clean way to upgrade the language without breaking existing OBEX objects.

I had started out with BYTE(), WORD(), and LONG() methods, which didn't expand the keywords, but Stephen Moraco asked me to change them to something different, like BYTES/WORDS/LONGS because his VS Code syntax highlighter was going to be difficult to change to accommodate BYTE/WORD/LONG as methods. I would rather use BYTE/WORD/LONG as the method names, but this is why they are what they are.

I like the idea of %KEYWORD to stop this madness.

cgracey · 2023-11-14 05:01

It would be nice if there was some magical way to completely separate the compiler's keywords from the user's symbol names. Typing % in front of many keywords would be a pain and it would look ugly. This is where some special text editor might be nice that could hide the % and use different coloring, instead. Then, there's still the pain of typing the keyword, since % is a shifted key. And, there's the long-term problem of the editor becoming like Word, where there's hidden junk that you can't see or easily get rid of, but is sure to hose things up. That's the worst.

I am reluctant to start up {$v42} because that will begin complicating the compiler, going forward. It would be like dragging a bag that gets heavier and heavier.

rogloh · 2023-11-14 07:01

This is where having P1 SPIN fixed in ROM was so much simpler. There was only ever one version. I don't like the idea of % for lots of new keywords as it'll be hard to know whether to add it or not, but if that's where things are headed we might need something like that. It still doesn't solve the versioning issue just the namespace clash. You might be able to add a number on the end of the new keyword, but it still could potentially clash with user symbols. Like BYTES2 etc where 2 is the minimum tool major release number needed for supporting that new feature. Ugly as well...

VonSzarvas · 2023-11-14 07:29

For this specific case, is the keyword "array" available, and compatible with VSCode ?

Byte could be assumed as default, and overrides allow other types.

address := array($01, $02, $03)

or

address := array($01, long $02030405, $06)

VonSzarvas · 2023-11-14 07:34

or another upside-down thought... change the new array thing into an assignment that follows existing language rules.

ie.

BYTE myvalues[10]

myvalues := $01020304,$05,$06,07,$0809,$0A

address := @myvalues

I'm thinking that the assignment doesn't need to use the byte,word,long keywords as the comma-delim'd values would just be parsed by the compiler as bytes into the buffer.
Having the pre-assigned buffer might also be handy to prevent weird memory overflow issues in some future compilers or large code.

VonSzarvas · 2023-11-14 07:58

Regarding adding a keyword prefix:

I like the idea from @ersmith and the simplicity would have worked well if followed in the beginning (albeit finding a symbol that is common and easy to press on both USA and international keyboards).

However to introduce now would be mucky I feel, for the reasons @rogloh mentions and more. I'm not against doing something to mark keywords moving forward, but not sure the best solution has come to us yet. It's good to start the conversation and think about it though - must be more variations on the theme to consider that might work well retrospectively and for the future.

In WordPress (well, PHP actually) they use the \ prefix to force the PHP compiler to look in the global space for a function. Handy to override functions locally, but might also be a way to optionally prefix functions using an existing "norm". Maybe that's also something to think about: instead of adding new functions to SPIN2, just lock in what we have now and provide overrides for people to copy/paste in new future functionality.

function getrnd () 
{
    return \getrnd() & $FF; // return truncated random value
}

This lock-in and expand concept might also take the form of a SPIN2+ language extension which could be defined at the top of a SPIN source file, and thus allowing users to add new compiler functions in a user-controlled way, rather than changing the existing compiler and breaking old projects. ie. Chip can continue improving and adding new SPIN2 functions, and release them as add-on module/s.

define SPIN2EXTRA_ARRAYS

macca · 2023-11-14 08:04

@VonSzarvas said:
I'm thinking that the assignment doesn't need to use the byte,word,long keywords as the comma-delim'd values would just be parsed by the compiler as bytes into the buffer.

I guess the current syntax allows defining structure-like buffers with different-sized elements.
Look for example at the USB control setup header use case:

bmRequestType (byte)
bRequest (byte)
wValue (word)
wIndex (word)
wLength (word)

could be initialized with

ptr := bytes($00, $00, word $0000, word $0000, word $0000)

or also

ptr := words(byte $00, byte $00, $0000, $0000, $0000)
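
As a rough illustration of the layout being described, here is a Python sketch that packs a list with a default element size and per-value overrides. It assumes little-endian byte order (as on the P2) and chooses to raise an error when a value does not fit; `pack_values` and its tuple convention are invented for this example and say nothing about how PNut actually emits the data.

```python
# Sketch only: models bytes(...)/words(...) lists with BYTE/WORD/LONG overrides.
SIZES = {"byte": 1, "word": 2, "long": 4}


def pack_values(default_size, items):
    """Pack items little-endian; an item is either an int or (size_name, int)."""
    out = bytearray()
    for item in items:
        size_name, value = item if isinstance(item, tuple) else (default_size, item)
        width = SIZES[size_name]
        if not 0 <= value < (1 << (8 * width)):
            raise ValueError(f"${value:X} does not fit in a {size_name}")
        out += value.to_bytes(width, "little")
    return bytes(out)


# USB SETUP header: two bytes followed by three words (8 bytes total).
as_bytes = pack_values("byte", [0x00, 0x00, ("word", 0), ("word", 0), ("word", 0)])
as_words = pack_values("word", [("byte", 0x00), ("byte", 0x00), 0, 0, 0])
print(len(as_bytes), as_bytes == as_words)
```

Both forms describe the same eight-byte buffer; only the default element size differs.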

BYTE myvalues[10]
myvalues := $01020304,$05,$06,07,$0809,$0A
address := @myvalues
I'm thinking that the assignment doesn't need to use the byte,word,long keywords as the comma-delim'd values would just be parsed by the compiler as bytes into the buffer.

In the above use case you can't define a word with a value less than 256.

In addition, that syntax makes me believe that all values are downsized to bytes (so $01020304 becomes just $04) with a warning message, or an error is thrown in case the value exceeds the expected size.

ManAtWork · 2023-11-14 08:35

How about adding a "dummy object" that has to be included for the new keywords to become available? Like printf is only available in C if you #include <stdio.h>. That doesn't break existing code that doesn't include the dummy object, it doesn't need ugly characters and can be easily added to the syntax highlighting.

rogloh · 2023-11-14 09:14

@ManAtWork said:
How about adding a "dummy object" that has to be included for the new keywords to become available? Like printf is only available in C if you #include <stdio.h>. That doesn't break existing code that doesn't include the dummy object, it doesn't need ugly characters and can be easily added to the syntax highlighting.

Not such a bad idea, especially if the name of the dummy object can be used to indicate what minimum version of SPIN is required and used by the software. As new capabilities arise in the interpreter the code just includes a different version number of the dummy object that it uses. There will need to be a table published in the SPIN documentation that shows which is the minimum version of the object required for a given new API. Of course this won't be so simple for syntax highlighting to figure out and SPIN parsers need to look out for a special object name now and also essentially ignore it in the build process. But from a user's point of view wouldn't be too difficult to use at least. You'd need to have any nested objects maintain their own Spin versioning, so older files would still work, and allow files with no Spin version object to default to existing APIs. Maybe a CON could also be used instead of an OBJ if that's easier to handle too.

OBJ
_ver : "1.1"  ' no real file gets loaded for this version if the object name is "_ver", it's a dummy.  Maybe for compactness this dummy object could even be named "_".  Pretty unlikely existing object names would use that object identifier.  You could use your own version numbering scheme in the string for the object filename.

PUB main() | a
  a:=BYTES(1,2,3)

or

CON
_ = $1_1  ' spin2 version 1.1 API's are used (or whatever numbering scheme is appropriate).  Useful if the code itself also wants to know the constant value somewhere.

PUB main() | a
  a:=BYTES(1,2,3)

VonSzarvas · 2023-11-14 10:33

@ManAtWork said:
How about adding a "dummy object" that has to be included for the new keywords to become available? Like printf is only available in C if you #include <stdio.h>. That doesn't break existing code that doesn't include the dummy object, it doesn't need ugly characters and can be easily added to the syntax highlighting.

+1, Agreed, and exactly as I tried to explain 2 posts above

This lock-in and expand concept might also take the form of a SPIN2+ language extension which could be defined at the top of a SPIN source file, and thus allowing users to add new compiler functions in a user-controlled way, rather than changing the existing compiler and breaking old projects. ie. Chip can continue improving and adding new SPIN2 functions, and release them as add-on module/s.
define SPIN2EXTRA_ARRAYS

Ariba · 2023-11-14 10:38

@cgracey said:
I posted a new v42 at the top of this thread.

v42 - Added LSTRING()/BYTE()/WORD()/LONG() methods.
  DoSETUP(bytes($21,$09,$00,$02,$00,$00,$01,$00)) 'turn on LEDs
Now, byte/word/long arrays can be conveniently expressed in Spin2, right where you use them, so there's no need to make a DAT reference for simple arrays. The array is placed right in the compiled Spin2 code and the method returns the address of the array. It's like STRING(), but for data of any word size. Size overrides are allowed on data, too (BYTE/WORD/LONG).

LSTRING() is like STRING(), but places a length byte at the start of the string and the string can contain zeros.

if it is like STRING() why do you not implement it as an extension of STRING():

   DoSETUP(string(byte $21,$09,$00, word $0002, $0100, byte $00))   'turn on LEDs

so it's a string of data instead of a string of characters.
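
A quick Python check of this idea, assuming the size keyword stays in effect for the following values until another one appears (byte being the default) and that data is stored little-endian. `pack_sticky` is a hypothetical helper, not a Spin2 or PNut function.

```python
# Sketch: a size keyword applies to every value after it until the next keyword.
SIZES = {"byte": 1, "word": 2, "long": 4}


def pack_sticky(tokens):
    """tokens mixes size names ('byte'/'word'/'long') and integer values."""
    width = SIZES["byte"]          # assumed default
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, str):
            width = SIZES[tok.lower()]
        else:
            out += tok.to_bytes(width, "little")
    return bytes(out)


mixed = pack_sticky(["byte", 0x21, 0x09, 0x00, "word", 0x0002, 0x0100, "byte", 0x00])
flat = bytes([0x21, 0x09, 0x00, 0x02, 0x00, 0x00, 0x01, 0x00])
print(mixed == flat)   # compares against the eight bytes of the bytes(...) form
```

Under those assumptions the string(...) line produces the same eight bytes as the bytes(...) call quoted above.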

Andy

Rayman · 2023-11-14 11:51

String is a good idea... I think byte is the default there...

VonSzarvas · 2023-11-14 12:31

@cgracey said:
I had started out with BYTE(), WORD(), and LONG() methods, which didn't expand the keywords, but Stephen Moraco asked me to change them to something different, like BYTES/WORDS/LONGS because his VS Code syntax highlighter was going to be difficult to change to accommodate BYTE/WORD/LONG as methods. I would rather use BYTE/WORD/LONG as the method names, but this is why they are what they are.

I wonder if re-visiting the VS Code issue might be called for under the circumstances?
- perhaps awkward to implement BYTE/WORD/LONG as methods, but could that ultimately be simpler than the gymnastics and possible lack of clarity in working around it?

ersmith · 2023-11-14 13:47

Just to make it clear this isn't a theoretical issue: the following items from the OBEX (at least) are broken when "bytes" is made a keyword: GraphicsDisplayDriverArch, isp_pca9685_16ch_pwm_servo, and esp32. Not a huge number of objects, but if every compiler release breaks 3 or 4 objects, how long until the OBEX is completely unusable?
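
For anyone who wants to repeat this kind of check locally, a rough Python sketch follows. It only strips `'` line comments and `{...}` block comments on a best-effort basis, ignores string literals and nested comments, and treats any whole-word match as a potential clash. `NEW_KEYWORDS` and `find_keyword_uses` are names chosen for the example.

```python
import re

# Assumed list of the newly reserved words under discussion.
NEW_KEYWORDS = ("bytes", "words", "longs")


def find_keyword_uses(source, keywords=NEW_KEYWORDS):
    """Return (line_number, keyword, line_text) for each whole-word match."""
    # Blank out {...} block comments but keep newlines so line numbers survive.
    source = re.sub(r"\{.*?\}", lambda m: re.sub(r"[^\n]", " ", m.group()),
                    source, flags=re.S)
    pattern = re.compile(r"\b(" + "|".join(keywords) + r")\b", re.I)
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        code = line.split("'", 1)[0]          # drop ' line comments
        for match in pattern.finditer(code):
            hits.append((number, match.group(1).lower(), line.strip()))
    return hits


sample = """VAR
  long bytes            ' count of bytes received
PUB main() | n
  n := bytes + 1
"""
for hit in find_keyword_uses(sample):
    print(hit)
```

A hit only means the word appears in code; whether it actually breaks depends on how the compiler treats the name.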

VonSzarvas · 2023-11-14 14:40

@ersmith said:
Just to make it clear this isn't a theoretical issue: the following items from the OBEX (at least) are broken when "bytes" is made a keyword: GraphicsDisplayDriverArch, isp_pca9685_16ch_pwm_servo, and esp32. Not a huge number of objects, but if every compiler release breaks 3 or 4 objects, how long until the OBEX is completely unusable?

Good info - YES I agree 100%...

Some "IMO" thoughts... Adding breaking keywords at this stage is a no-no. More important to allocate some resource toward the VSCode issue, rather than create future enforced workarounds (and likely user frustration, losses, etc, and all that bad stuff).

Problem is clear though- the VSCode thing (ie. allowing use of byte(), word(), long() methods) might only solve this latest keyword addition. The next will come for sure!

At this stage the best theoretical solution I've seen here seems to be having the #define statement (or similar) to enable any future new "extended" compiler features, and so leave the core SPIN2 alone. OR use some sort of prefix for extended commands.

SPIN2 has officially launched already, and time for tweaking has gone! As has the time of causing pain for @ersmith, @"Stephen Moraco", @macca, and anyone else involved in keeping the tools working and docs up-to-date!

Electrodude · 2023-11-14 15:16

@cgracey said:
I had started out with BYTE(), WORD(), and LONG() methods, which didn't expand the keywords, but Stephen Moraco asked me to change them to something different, like BYTES/WORDS/LONGS because his VS Code syntax highlighter was going to be difficult to change to accommodate BYTE/WORD/LONG as methods. I would rather use BYTE/WORD/LONG as the method names, but this is why they are what they are.

@"Stephen Moraco" must already have to deal with Spin/PASM highlighting conflicts for keywords like and and or that are both PASM instructions and Spin operators. Is this any different?

macca · 2023-11-14 16:12

@VonSzarvas said:

@ersmith said:
Just to make it clear this isn't a theoretical issue: the following items from the OBEX (at least) are broken when "bytes" is made a keyword: GraphicsDisplayDriverArch, isp_pca9685_16ch_pwm_servo, and esp32. Not a huge number of objects, but if every compiler release breaks 3 or 4 objects, how long until the OBEX is completely unusable?

The referenced objects have other problems, I think... isp_jm_i2c_singleton.spin2 has missing variables and doesn't compile, esp32_core.spin2 is missing dat_fullduplexserial and doesn't compile, and the only thing I can compile is Graphics.spin2 from GraphicsDisplayDriverArch, which, with the changes I was testing today, doesn't seem to have problems at all with the bytes keyword. So not a problem for me.

Anyway....

At this stage the best theoretical solution I've seen here seems to be having the #define statement (or similar) to enable any future new "extended" compiler features, and so leave the core SPIN2 alone. OR use some sort of prefix for extended commands.

SPIN2 has officially launched already, and time for tweaking has gone! As has the time of causing pain for @ersmith, @"Stephen Moraco", @macca, and anyone else involved in keeping the tools working and docs up-to-date!

Honestly, I don't have problems with the language additions. Yes, breaking existing code is a problem, and that's why I always try to make the compiler backward compatible, even if this may cause some confusion. Just for info, I had jm_fullduplexserial defining the fdec method for a long time, even after implementing the fdec debug keyword. I think it can also use field as a variable without problems, and now I can use bytes/words/longs as a variable, a method or the system function without problems (and in some cases at the same time). That's the advantage of having a self-built tool that can be adapted to the language's evolution. I think I would not have had any problems even using byte/word/long as originally intended.

I also did a small enhancement to the function: using bytes/words/longs instead of byte/word/long to override a value size also sets the default for subsequent values. May be useless, but it can be done.

What really is a pain to me is that these discussions happen after the fact. Now that I have implemented these new functions, if in the end Chip changes the syntax, I have to do that again!
Please discuss the additions before actually implementing them, if you want to discuss.

Wuerfel_21 · 2023-11-14 16:19

@cgracey said:
I had started out with BYTE(), WORD(), and LONG() methods, which didn't expand the keywords, but Stephen Moraco asked me to change them to something different, like BYTES/WORDS/LONGS because his VS Code syntax highlighter was going to be difficult to change to accommodate BYTE/WORD/LONG as methods. I would rather use BYTE/WORD/LONG as the method names, but this is why they are what they are.

Making the language worse because the tooling sucks and making the tooling worse because the language sucks and making the language worse because... cool way to start a downward spiral.

I know Stephen has it in him to make the syntax parsing work for any given language feature. I think not breaking compatibility for everyone is more important than making one guy's life slightly easier.

@macca said:
Please discuss the additions before actually implementing them, if you want to discuss.

This.

cgracey · 2023-11-14 16:25

Within the compiler, I could gate new keywords by simply not allowing them into the symbol table without a go-ahead, which could be a version number.

CON v41

or

CON vLATEST

VonSzarvas · 2023-11-14 16:34

Should the directive be more verbose?

CON COMPILER_VERSION_xyz

cgracey · 2023-11-14 16:37

@VonSzarvas said:
Should the directive be more verbose?

CON COMPILER_VERSION_xyz

The trouble is it would have to be added at the beginning of every little program. It's nice if it's small.

Instead of CON, we could make a new VER block which recognizes names outside of the normal namespace, to avoid conflicts. In fact, it could just use numbers with 0 being the latest.

VER 42

VER 0
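
One way to picture the gating, as a Python sketch: each new keyword carries the version that introduced it, and only those at or below the requested version enter the symbol table. The contents of `KEYWORD_SINCE`, the `LATEST` value, and the choice to treat a missing VER as a fixed baseline are all assumptions for illustration (the thread leaves the default open).

```python
# Sketch of version-gated keywords; table contents are illustrative only.
KEYWORD_SINCE = {"lstring": 42, "bytes": 42, "words": 42, "longs": 42}
LATEST = 42      # placeholder for "newest version this compiler knows"
BASELINE = 41    # assumed behaviour when a file has no VER at all


def enabled_keywords(ver=None):
    """Return the gated keywords visible to a file that asked for `ver`."""
    if ver is None:
        ver = BASELINE
    elif ver == 0:              # VER 0 means "latest" in the proposal
        ver = LATEST
    if ver > LATEST:
        raise ValueError(f"source needs v{ver}, compiler only supports v{LATEST}")
    return {kw for kw, since in KEYWORD_SINCE.items() if since <= ver}


print(sorted(enabled_keywords(42)))    # new keywords available
print(sorted(enabled_keywords(0)))     # same set, via "latest"
print(sorted(enabled_keywords()))      # old file: 'bytes' is just an identifier
```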

ke4pjw · 2023-11-14 17:28

Having a VER block could also (possibly?) help with changes in how directives behave. ie. The REPEAT loop enumeration change that happened a while back. Though, I bet that would make Chip and Eric's jobs more complex.

ersmith · 2023-11-14 18:00

@cgracey said:

@VonSzarvas said:
Should the directive be more verbose?

CON COMPILER_VERSION_xyz

The trouble is it would have to be added at the beginning of every little program. It's nice if it's small.

Instead of CON, we could make a new VER block which recognizes names outside of the normal namespace, to avoid conflicts. In fact, it could just use numbers with 0 being the latest.

VER 42

VER 0

I like this. I think it would be wise to require that VER (if present) be the very first block in the program. In fact it would be kind of nice if it had to go on the first line, because then tools like editors could find it easily without having to parse much.
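
If VER had to sit on the first line, a tool could pick it up with a single regular expression instead of a real parser. A Python sketch, assuming the form `VER <number>` with an optional trailing `'` comment; `read_ver` is a hypothetical helper.

```python
import re

# Assumed syntax: "VER <decimal>" alone on the first line, optional ' comment.
VER_LINE = re.compile(r"^\s*VER\s+(\d+)\s*(?:'.*)?$", re.IGNORECASE)


def read_ver(source):
    """Return the requested version as an int, or None if line 1 has no VER."""
    first_line = source.split("\n", 1)[0].rstrip("\r")
    match = VER_LINE.match(first_line)
    return int(match.group(1)) if match else None


print(read_ver("VER 42\nPUB main()\n"))              # 42
print(read_ver("CON\n  _clkfreq = 200_000_000\n"))   # None
```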

Wuerfel_21 · 2023-11-14 18:22

The VER block seems like a good idea, though I'd prefer that in cases like this where the feature can be easily tweaked such that it doesn't conflict with previously valid programs, we just do that instead.

JonnyMac · 2023-11-14 19:21

If the VER block is implemented, what will be the mechanism to deal with the version in user code (e.g., changes to repeat?)

macca · 2023-11-14 19:33

@ke4pjw said:
Having a VER block could also (possibly?) help with changes in how directives behave. ie. The REPEAT loop enumeration change that happened a while back. Though, I bet that would make Chip and Eric's jobs more complex.

If I'm not wrong, the repeat behaviour is implemented in the runtime interpreter, the compiler should have a copy of the old interpreter (or be able to patch it) and upload the appropriate version.
While this may be "trivial", I remember that some time ago the introduction of a new keyword (I don't remember which, maybe field) caused a renumbering of all bytecodes (existing binaries were not compatible with the new interpreter). In such a case, the compiler not only needs the old interpreter but also has to generate completely different bytecode. This would be very bad!

macca · 2023-11-15 07:54

I have released version 0.33.0 of Spin Tools IDE with the new functions implemented and the changes needed to make them backward compatible with existing sources.
Let me know how it works.

cgracey · 2023-11-15 13:01

@macca said:

@ke4pjw said:
Having a VER block could also (possibly?) help with changes in how directives behave. ie. The REPEAT loop enumeration change that happened a while back. Though, I bet that would make Chip and Eric's jobs more complex.

If I'm not wrong, the repeat behaviour is implemented in the runtime interpreter, the compiler should have a copy of the old interpreter (or be able to patch it) and upload the appropriate version.
While this may be "trivial", I remember that some time ago the introduction of a new keyword (I don't remember which, maybe field) caused a renumbering of all bytecodes (existing binaries were not compatible with the new interpreter). In such a case, the compiler not only needs the old interpreter but also has to generate completely different bytecode. This would be very bad!

We almost need to have each version of PNut available and have the source code name which version it needs.

But, what about cases where people are using objects compiled under different compiler versions? This becomes impossible to solve!

Maybe we could take a stealthy approach, where if we see a keyword being used as a method, variable, or constant name, that keyword is removed from the symbol table and reentered as the type the user has declared?
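
In rough Python terms, the stealthy approach might look like the sketch below: the table starts with the built-in keywords, and a user declaration of the same name replaces the entry. Which keywords are allowed to be replaced (`REPLACEABLE`) is an assumption here, as are all the function names.

```python
# Sketch: a user declaration evicts a (new) keyword from the symbol table.
REPLACEABLE = {"bytes", "words", "longs", "lstring"}   # assumed: only new keywords


def new_symbol_table():
    """Start with every built-in entered as a keyword (small sample set)."""
    table = {name: "keyword" for name in REPLACEABLE}
    table.update({name: "keyword" for name in ("byte", "word", "long")})
    return table


def declare(table, name, kind):
    """Enter a user symbol, replacing a replaceable keyword if it is in the way."""
    key = name.lower()
    existing = table.get(key)
    if existing == "keyword" and key not in REPLACEABLE:
        raise ValueError(f"'{name}' is a reserved word")
    if existing not in (None, "keyword"):
        raise ValueError(f"'{name}' is already defined as a {existing}")
    table[key] = kind            # keyword entry (if any) is overwritten


table = new_symbol_table()
declare(table, "bytes", "variable")
print(table["bytes"], table["words"])   # variable keyword
```

Since a method can be called before its definition appears in the file, something like this would probably have to run as a pass over all declarations before any code is compiled.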
