Second-level P(2)ASM namespacing
Imagine yourself in this place... you're writing a Spin file that contains multiple different PASM programs (either because it is a PASM-only program or it just ended up like that for other reasons):
```
DAT ' Program A
              org 0
a_entry
              mov temp,#0
              mov outa,#15
.loop
              add temp,#1
              mov outa,temp
              jmp #.loop

temp          res 1
              fit 496

DAT ' Program B
              org 0
b_entry
.loop
              rdlong temp,ptrb[0]
              qdiv temp,#3
              getqx temp
              wrlong temp,ptrb[1]
              jmp #.loop

temp          res 1
              fit 496
```
and oops, both of them need a `temp` symbol...
Currently the solution to this is either to prefix every symbol with its relevant scope, i.e. `a_temp` and `b_temp`, which is annoying and error-prone (check out MisoYume, which ends up having a `ppr_tmp1`, `ppc_tmp1` and `ppm_tmp1` just in the PPU render->composite->math pipeline), or to put the code in separate files (not possible with PASM-only programs, annoying if you do need to share some code/data between cogs).
There really ought to be a higher-level namespace system that allows each sub-program to have its own cog symbols. These of course need to be accessible from elsewhere. Flexspin already allows accessing local symbols with a colon (i.e. `a_entry:loop`), so that idea could be extended to namespaces like `a:entry` or even `a:entry:loop`.
Points worth discussing:
- How to define a namespace? (Note here that namespaces are not singular: you may have multiple discontiguous sections of code that share the same register layout and thus the same namespace)
- Compatibility with the existing language implies a "default" namespace, whose members are all globally exposed (as is currently the case for all non-local labels)
- Syntactical specifics
Tagging @cgracey @ersmith @macca for discussion
Comments
[...]
With relatively simple programs, one solution may be to always use local variables (prefixed with a dot) within a high-level procedure. In your example, if you change temp to .temp I think you get what you want, but this is a really simple example and may not be applicable in all situations. The most obvious disadvantage is that you can't share variables: locals go out of scope when a non-dot-prefixed label is encountered.
In the past I was thinking about a "DAT scoped label", prefixed with, I don't know, a colon or underscore, or some other combination: a label that is local to the DAT section, so you can have different DAT sections with the same label names. Traditional assemblers have directives for labels, like ".local name" or ".global name" (I can't remember exactly, it's been a long time since my last assembler programming...), which may be a way to define label scopes without using cumbersome prefixes.
One way that is relatively quick to implement is to have PASM-only objects and reference public PASM labels with the same syntax used for Spin methods (objname.label). This may be consistent with the Spin object concept.
In my compiler I can have OBJs in PASM-only programs, it just doesn't export the labels.
Exactly, works for the simple example, not viable for large programs where it actually matters.
I feel like that defeats the point of PASM-only mode (manual memory layout)
Maybe it will be a bit more complicated, but it is a way to split the source. Even with all the help from the editor, a big source file is always a nightmare to handle (and you have to copy/paste the same modules into other projects).
What about DAT section names, like PUB/PRI method names?
The section name acts as the namespace: labels are automatically prepended with the section name, and anything outside the section must use the name.label syntax. Inside the DAT section nothing changes.
Unnamed DATs work unchanged (no namespace prepended, or an empty namespace).
This may have the advantage of requiring just an #include statement to split the source into multiple reusable source modules.
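To make the proposed lookup rule concrete, here is a minimal Python sketch of a label table that behaves as described above. It is only an illustration, not how Spin Tools or any other compiler implements it: the `SymbolTable` class, its method names, and the rule that a section's own labels take precedence over global ones are all assumptions made for this example.

```python
class SymbolTable:
    """Toy label table: each named DAT section gets its own namespace."""

    def __init__(self):
        self.labels = {}  # fully qualified name -> address

    def define(self, section, label, address):
        # Labels in an unnamed DAT (section == "") stay global, as today.
        # Sections sharing a name share a namespace, so a duplicate label
        # across two "a" sections is still an error.
        name = f"{section}.{label}" if section else label
        if name in self.labels:
            raise ValueError(f"duplicate label: {name}")
        self.labels[name] = address

    def resolve(self, section, ref):
        # Assumption: inside a named section, its own labels win.
        qualified = f"{section}.{ref}" if section else ref
        if qualified in self.labels:
            return self.labels[qualified]
        # Otherwise take the reference as written (global or name.label).
        if ref in self.labels:
            return self.labels[ref]
        raise KeyError(f"unknown label: {ref}")


table = SymbolTable()
table.define("a", "entry", 0)
table.define("a", "temp", 5)
table.define("b", "entry", 0)
table.define("b", "temp", 6)
table.define("", "shared", 0x100)
```

With this table, `temp` resolves to a different register depending on the section it is used in, while code outside a section has to write `a.temp` or `b.temp`.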
On the surface I rather like that approach.
It also seems like this would not impact any existing code (considering existing code wouldn't have an alias/name after the DAT statement in any compilers?).
Something like that would work, yea. Though it's not 100% compatible, as it's valid to start writing DAT statements immediately after the section label (you can also do this with CON, OBJ and VAR). Not sure how many people actually do this.
I've often seen it used to set the org, less often (if ever) for instructions.
Compatibility may be increased with some rules, e.g. a section name must be a single word that is not an instruction or statement, with nothing after it except comments.
FYI, I have experimented a bit with the DAT names, this is the result:
The listing produced by the code above (the org values are to check that it picks the right label address):
Looks good to me.
Also, I forgot that I have a bunch of tests with instructions on the same DAT line, and they are all passing, so the backward compatibility seems good.
I think PNut will take the name as a label, then choke because of the duplicated names, but if using PNut the source should not have duplicated labels anyway.
I have to fix some details, but I think it will debut in the next Spin Tools release.
looks good
Will need to look into making my code compile with spintools or getting it added to flexspin...
How about adding a marker to make the name explicit? Something like:
That way there's no ambiguity.
Probably a good idea. The double colon seems a bit out of place, not a token that exists anywhere else.
Maybe
PASM lines have a well-defined format:
[label] [condition] instruction [parameters] [effect]
A simple check for condition and instruction will solve the backward compatibility.
The only ambiguity is when there is only a label:
DAT label
How many people are using that? I hope nobody...
But why should we just "hope"? This is a new feature, so there's no reason not to introduce some kind of unique syntax marker to make sure conflict is impossible. This should also make it easier for tools like VS Code to parse.
I suggested `DAT::name` in analogy to C++'s namespaces. Ada preferred `DAT (name)`. We could also do `DAT #name` or `DAT ^name` or any single character, really.
PNut uses keyword gating to enable new keywords; you could do the same to enable DAT names and ensure 100% compatibility.
It's not only the compatibility that I'm trying to improve, but also the ease of parsing. Spin2 already has all kinds of exceptions and ambiguities that a parser has to work around. I'd rather not add more. Putting in a marker of some kind removes the ambiguity completely. What is your objection to this? I realize you've probably already implemented it without the marker, and you'd rather not do more work, but my preference is to get things right before making it a cross-compiler standard.
You could always leave the markerless version in Spin Tools IDE as an extension, if you'd like (with the tiny risk of backwards incompatibility).
Ultimately I guess it will be up to @cgracey and @"Stephen Moraco" how and whether to put this in the "official" compiler, so I'd be interested in their thoughts.
Honestly, I don't understand what the difficulties are in parsing a label... I mean, I guess you already parse the DAT line for instructions. So get the next token: if it is a label (not a condition nor an instruction), it may be a label or the namespace. Get the next token: if it is a condition or an instruction, then proceed to parameters and effects (not a namespace); otherwise the first token is the namespace (and throw an error if there are more keywords). This is needed only for the DAT line itself.
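The token check described in this post can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not code from Spin Tools: `classify_dat_line` is a hypothetical helper, the `CONDITIONS` and `INSTRUCTIONS` sets are tiny stand-ins for a compiler's full keyword tables, and comment handling is simplified to cutting the line at the first `'`.

```python
# Tiny illustrative subsets; a real compiler would use its full tables.
CONDITIONS = {"if_z", "if_nz", "if_c", "if_nc", "_ret_"}
INSTRUCTIONS = {"org", "orgh", "res", "fit", "long", "word", "byte",
                "mov", "add", "jmp", "rdlong", "wrlong"}


def is_op(token):
    """True if the token is a known condition or instruction keyword."""
    return token.lower() in CONDITIONS or token.lower() in INSTRUCTIONS


def classify_dat_line(line):
    """Classify a line that starts with DAT.

    Returns ("namespace", name) for a named section, or ("code", tokens)
    when the rest of the line is ordinary PASM (possibly empty).
    """
    code = line.split("'", 1)[0]  # simplification: drop a trailing ' comment
    tokens = code.split()
    if not tokens or tokens[0].lower() != "dat":
        raise ValueError("not a DAT line")
    rest = tokens[1:]
    if not rest or is_op(rest[0]):
        return ("code", rest)         # plain DAT, or DAT followed by a statement
    if len(rest) == 1:
        return ("namespace", rest[0])  # lone identifier: the ambiguous case
    if is_op(rest[1]):
        return ("code", rest)         # label followed by condition/instruction
    raise ValueError("unexpected tokens after section name")
```

Note that a lone identifier after `DAT` is always read as a section name here, which is exactly the one case where existing code that puts a bare label on the DAT line would change meaning.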
I found the compiler part more difficult to implement.
Having already implemented it is one reason, but to me the DAT name is simple and clean, with 99.9% backward compatibility without the need for keyword gating, and it has some consistency with PUB/PRI methods (not that it really matters). I don't like prefixes or other cumbersome specifiers at all, as they end up not being consistent with each other.
I would like to implement PASM-only objects for this; however, it is more difficult than expected (well, some aspects of it, for example taking the address of a label, because child objects are compiled after the main object, at least in my compiler).
If you want easy parsing, we may simply add a new PASM keyword/directive, something like `name` or `namesp` (a short name for consistent formatting) to define the namespace.