Addressing and labeling syntax

Seairth · 2015-10-14 11:59

(as of the 2015-10-13 PNUT release)

So. Because the addressing and labeling syntax has been a bit of a moving target, I wanted to try to document this thoroughly. Yes, some of this is obvious, but I wanted a "big picture" coverage of the topic. So here it is...

Labels:

Labels are either globally-scoped or locally-scoped.
A globally-scoped label must begin with an underscore or letter (a-z, A-Z). All other characters must be an underscore, letter (a-z, A-Z) or number (0-9).
A locally-scoped label must begin with a period, followed by an underscore or letter (a-z, A-Z). All other characters must be an underscore, letter (a-z, A-Z) or number (0-9).
Each local scope begins immediately after each global label and ends immediately before the next global label.
All labels must be unique within the scope they belong to.
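
The naming and scoping rules above can be sketched as a small checker. This is only an illustration in Python, not anything PNUT does internally; the names `LABEL_RE`, `classify_label` and `find_duplicates` are made up for the example. It assumes that single-character labels such as `x` are legal (the wording above implies it) and that names are compared exactly as written, which the list does not state.

```python
import re

# Optional leading period, then a letter or underscore, then any number of
# letters, digits, or underscores. Hypothetical name, not part of PNUT.
LABEL_RE = re.compile(r"(\.?)[_A-Za-z][_A-Za-z0-9]*")


def classify_label(name):
    """Return 'global', 'local', or None if the name is not a valid label."""
    m = LABEL_RE.fullmatch(name)
    if m is None:
        return None
    return "local" if m.group(1) else "global"


def find_duplicates(names):
    """Return labels that repeat within their scope.

    `names` is the list of label names in source order. Global labels share
    one scope; local labels are scoped to the most recent global label.
    Invalid names are skipped.
    """
    globals_seen = set()
    locals_seen = set()
    duplicates = []
    for name in names:
        kind = classify_label(name)
        if kind == "global":
            if name in globals_seen:
                duplicates.append(name)
            globals_seen.add(name)
            locals_seen = set()  # a new local scope begins after each global label
        elif kind == "local":
            if name in locals_seen:
                duplicates.append(name)
            locals_seen.add(name)
    return duplicates
```

With this model, `.loop` may appear once under `main` and again under `next` without conflict, while two `.loop` labels under the same global label are reported.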

Label values:

Labels defined in an ORGH section resolve to a hub address or offset (in bytes), regardless of whether the label is referenced in an ORGH or ORG section.
Labels defined in an ORG section resolve to a cog address or offset (in longs), regardless of whether the label is referenced in an ORGH or ORG section.
When the effective hub address or offset is needed for a label that is defined in an ORG section, the label may be preceded by an "@" to force resolution to a hub address or offset.
Though it is possible to apply the "@" to labels defined in ORGH sections, it has no effect.
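
As a rough model of the four rules above (a Python sketch, not PNUT's implementation), a label can be thought of as carrying both a hub byte address and, for ORG labels, a cog long address. The `Label` class, its field names and `resolve` are hypothetical, as are the example addresses.

```python
from dataclasses import dataclass


@dataclass
class Label:
    section: str       # "ORG" or "ORGH": the section the label was defined in
    hub_addr: int      # byte address of the label in the hub image
    cog_addr: int = 0  # long address within the cog; only meaningful for ORG labels


def resolve(label, at=False):
    """Value of `label`, or of `@label` when at=True.

    Where the reference appears (ORG or ORGH) plays no part, so it is not
    even a parameter here.
    """
    if label.section == "ORGH":
        return label.hub_addr  # "@" is accepted but changes nothing
    if at:
        return label.hub_addr  # "@" forces the hub address of an ORG label
    return label.cog_addr
```

For example, an ORG label sitting at cog long $10 and hub byte $40 gives `resolve(loop)` = $10 and `resolve(loop, at=True)` = $40, while an ORGH label returns its hub address either way.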

For expressions:

Expressions can contain numbers, labels, and nested expressions. The simplest expression is either a single number or label.
An expression that begins with # or ## is known as an "immediate" value.
For branching instructions, immediate values can be either "absolute" or "relative", depending on context.
For non-branching instructions, immediate values are always "absolute".
"Absolute immediate" interpretation can be forced by using "#\" or "##\".
There is no operator for forcing a "relative immediate" interpretation.
# means 9-bit (short-form) or 20-bit (long-form) immediate value:
- For short-form branch instructions, this is a 9-bit relative immediate.
- For long-form branch instructions that change execution mode (cog <-> hub), this is a 20-bit absolute immediate.
- For long-form branch instructions that do not change execution mode, this is a 20-bit relative immediate.
- For all other instructions, this is a 9-bit absolute immediate.
- In circumstances where an absolute immediate must be forced, the expression is prefaced with "#\".
## means 32-bit immediate value:
- An implicit AUGx will precede the instruction containing the expression.
- The lower 9 bits will be encoded in the instruction and the upper 23 bits will be encoded in the AUGx.
- For short-form branch instructions, this is a 20-bit relative immediate. The upper 12 bits are ignored.
- For non-branch instructions, this is a 32-bit absolute immediate.
- This is meaningless for long-form branch instructions. PNUT throws an error.
For BYTE/WORD/LONG, the expression is encoded as raw data. If the expression begins with # or ##, PNUT throws an error.
For all other expressions that do not begin with # or ##, the expression is encoded as a register address and must be between $000 and $1FF.
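
The # and ## cases above amount to a lookup table, which can be sketched in Python as follows. This is only a restatement of the list, not PNUT's code: `RULES`, `immediate_kind` and `split_augmented` are hypothetical names, and the instruction categories are just labels for the four cases. The forced-absolute forms "#\" and "##\" are left out because the list does not spell out their widths.

```python
# (prefix, instruction category) -> (bit width, interpretation)
RULES = {
    ("#", "short_branch"): (9, "relative"),
    ("#", "long_branch_mode_change"): (20, "absolute"),
    ("#", "long_branch_same_mode"): (20, "relative"),
    ("#", "other"): (9, "absolute"),
    ("##", "short_branch"): (20, "relative"),  # upper 12 bits are ignored
    ("##", "other"): (32, "absolute"),
}


def immediate_kind(prefix, instr):
    """Look up how an immediate is interpreted; raise where PNUT would error."""
    try:
        return RULES[(prefix, instr)]
    except KeyError:
        raise ValueError(f"{prefix} is not valid for {instr}") from None


def split_augmented(value):
    """Split a 32-bit ## value into (upper 23 bits for AUGx, lower 9 bits)."""
    value &= 0xFFFFFFFF
    return value >> 9, value & 0x1FF
```

For instance, `split_augmented(0x12345678)` yields `(0x91A2B, 0x078)`: the first part would travel in the implicit AUGx and the second in the instruction itself.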

Does this look right to everyone else? Anything to add, change, or remove?

(edit: clarified expression branching vs. non-branching immediate values, as suggested by @Electrodude.)
(edit: clarified label naming and fixed typo, as pointed out by @mindrobots and @potatohead.)
(edit: Updated "@" usage to better align with comments by @cgracey.)

Electrodude · 2015-10-14 12:50

You could clarify that only immediate values that are jump targets can be relative.

"add x, #y" adds the absolute cogram address of y, not the distance from the pc to y, into x. "add x, #\y" would throw an error.

rjo__ · 2015-10-14 12:58

I think Chip can just plug it in to his docs.

Nice work.

Dave Hein · 2015-10-14 12:59

One thing that concerns me about @label is that for P2 this represents the absolute hub RAM address, whereas for P1 it represents an object offset. In BST there is the @@@ operator that represents the absolute hub RAM address. We haven't addressed P2 Spin yet, but I wonder how object offsets will be represented in Spin. Or maybe P2 Spin will not use object offsets?

Seairth · 2015-10-14 13:50

Electrodude wrote: »

You could clarify that only immediate values that are jump targets can be relative.

"add x, #y" adds the absolute cogram address of y, not the distance from the pc to y, into x. "add x, #\y" would throw an error.

This is detailed in the sub-lists. However, I agree that it could be made more explicit. So I've updated it (see bullets 3 and 4 of "For expressions...").

potatohead · 2015-10-14 13:56

This is really great work.

mindrobots · 2015-10-14 14:02

Looks good!

Grep anyone? [.]?[_a-zA-Z][_a-zA-Z0-9]+

I'm not sure this is a generic enough explanation for anyone that doesn't grok grep. Words are always nice:

"Global labels begin with an underscore or any upper or lowercase letter followed by any number of underscores, letters or numbers. Local labels must begin with a ".", this can then be followed by any combination of letters, numbers or underscores."

I think you dropped some words in the last bullet:

"For all other expressions that do not between with # or ##, the expression is encoded as a register address and must be between $000 and $1FF."

That do not WHAT between???? The curious want to know!

potatohead · 2015-10-14 14:03

"between" = "begin"

Seairth · 2015-10-14 14:26

mindrobots wrote: »

Looks good!

Grep anyone? [.]?[_a-zA-Z][_a-zA-Z0-9]+

I'm not sure this is a generic enough explanation for anyone that doesn't grok grep.

Updated as words instead.

potatohead wrote: »

"between" = "begin"

Huh. Even after proofreading it, I still read it as "begin". Fixed.

potatohead · 2015-10-14 15:10

Me too. I didn't catch it until this thread. We've both got awesome "error correcting" parsers, it seems.

Heater. · 2015-10-14 15:52

Clear as mud.

When did things get so complicated in the Propeller ?

David Betz · 2015-10-14 15:58

Heater. wrote: »

Clear as mud.

When did things get so complicated in the Propeller ?

It's my fault. I asked for hub exec. :-(

Seairth · 2015-10-14 16:02

Heater. wrote: »

Clear as mud.

When did things get so complicated in the Propeller ?

Basically, when hub exec was added (and, to a lesser extent, LUT exec). Without that, we would not need any long-form branches or short-form branches with AUGx. And all short-form branches would always be relative.

Dave Hein · 2015-10-14 16:08

Dave Hein wrote: »

One thing that concerns me about @label is that for P2 this represents the absolute hub RAM address, whereas for P1 it represents an object offset. In BST there is the @@@ operator that represents the absolute hub RAM address. We haven't addressed P2 Spin yet, but I wonder how object offsets will be represented in Spin. Or maybe P2 Spin will not use object offsets?

I mentioned the issue of absolute addresses versus object offsets on two different threads, but nobody commented on it. Anybody care to venture a guess on how the @label in DAT sections will work with objects in P2 Spin?

Seairth · 2015-10-14 16:11

Dave Hein wrote: »

Dave Hein wrote: »

One thing that concerns me about @label is that for P2 this represents the absolute hub RAM address, whereas for P1 it represents an object offset. In BST there is the @@@ operator that represents the absolute hub RAM address. We haven't addressed P2 Spin yet, but I wonder how object offsets will be represented in Spin. Or maybe P2 Spin will not use object offsets?

I mentioned the issue of absolute addresses versus object offsets on two different threads, but nobody commented on it. Anybody care to venture a guess on how the @label in DAT sections will work with objects in P2 Spin?

Nope. Not until P2 Spin development has actually started.

Heater. · 2015-10-14 16:15

David Betz,

It's my fault. I asked for hub exec. :-(

You are forgiven. HUB exec is neat.

I hope this is not all so hard to write as it is to read about...

potatohead · 2015-10-14 16:27

Start in COG mode. It's still easy.

mindrobots · 2015-10-14 16:53

The change Chip made to have COG0 start in COG exec on reset is brilliant! It's really just a P1 with more (and more powerful) COGs as far as a PASM programmer is concerned. Until you take it into HUB exec, you can pretty much ignore the feature! When ready, just wade out into the waters slowly.

Seairth · 2015-10-14 18:12

I have added the above information to P2 Assembler Instruction Set - Oct 2015.

cgracey · 2015-10-14 18:45

When labels declared under ORGH are referenced, they return hub address values.

When labels declared under ORG are referenced, they return cog address values.

@label is only needed when you want to know the hub address of a label declared under ORG.

@label is superfluous when used on labels that were declared under ORGH.

cgracey · 2015-10-14 18:49

This is all pretty simple. We just need to be sure we are presenting it all optimally, through assembler syntax and documentation.

David Betz · 2015-10-14 19:00

mindrobots wrote: »

The change Chip made to have COG0 start in COG exec on reset is brilliant! It's really just a P1 with more (and more powerful) COGs as far as a PASM programmer is concerned. Until you take it into HUB exec, you can pretty much ignore the feature! When ready, just wade out into the waters slowly.

I still think it would be better to introduce hub exec before diving into COG exec. How much have we struggled in the past to get our code to fit into 2K? No need to do that with hub exec. Then you dive into the deep end of COG exec and loading code into COG memory and possibly splitting it into overlays when you really need the performance that buys you.

Dave Hein · 2015-10-14 19:14

cgracey wrote: »

@label is superfluous when used on labels that were declared under ORGH.

In the following code I use "@label+Q" to get absolute hub addresses. On P1 the value of Q is 16 for the top object offset. On P2 I set Q to zero. So for P2 should I use "label" instead of "@label" for labels defined under ORGH?

If "@label" is the same as "label" on P2, then I wonder how we will get the object offset in Spin 2. Or maybe Spin 2 won't use object offsets.

DAT
                        orgh    0
                        
'*******************************************************************************
' pfth cog code
'*******************************************************************************
                        org     0                
forth
parm                    mov     parm, parval
parm1                   rdlong  pc, parm
parm2                   add     parm, #4
parm3                   rdlong  stackptr, parm
parm4                   add     parm, #4
temp                    rdlong  stackptr0, parm
temp1                   add     parm, #4
temp2                   rdlong  returnptr, parm
temp3                   add     parm, #4
temp4                   rdlong  returnptr0, parm
                        setb    outb, #tx_pin
                        setb    dirb, #tx_pin
                        jmp     #innerloop              ' Begin execution

...

'*******************************************************************************
' Addresses of variables in the dictionary, and the hex table
'*******************************************************************************
a_hexstr                long      @hexstr+Q
a_last                  long      @last+Q
a_state                 long      @state+Q
a_dp                    long      @dp+Q
a_tib                   long      @tib+Q
a_verbose               long      @verbose+Q
a_inputidx              long      @greaterin+Q
a_inputlen              long      @poundtib+Q

'*******************************************************************************
' The data and return stack pointers, and their base addresses
'*******************************************************************************
stackptr                long      0
stackptr0               long      0
returnptr               long      0
returnptr0              long      0

'*******************************************************************************
' The input file pointer used during initialization
'*******************************************************************************
infileptr               long      @infile+Q

'*******************************************************************************
' Constants
'*******************************************************************************
minus4                  long      -4
parval                  long      @pfthconfig+Q

                        fit       $1f0

                        orgh      $800
'*******************************************************************************
' Pfth configuration structure
'*******************************************************************************
pfthconfig              long      @xboot_1+Q            ' Initial word to execute
                        long      @stack+Q+16           ' Starting stack pointer
                        long      @stack+Q+16           ' Empty stack pointer value
                        long      @retstk+Q             ' Starting return pointer
                        long      @retstk+Q             ' Empty return pointer value
stack                   long      0[100]                ' Data stack
retstk                  long      0[100]                ' Return stack

'*******************************************************************************
' Input buffer and hex table
'*******************************************************************************
hexstr                  byte      "0123456789abcdef"
inputbuf                byte      0[200]

...

              
'*******************************************************************************
' The boot interpreter follows below.
'*******************************************************************************
              ' : xboot ( This word runs a simple interpreter )
xboot_L       word      @compcomma_L+Q
              byte      FLAG_DEF, 5, "xboot", 0, 0, 0
              long
xboot_X       word      execlistfunc, 0

              ' 20 word 0 pick c@ _jz _xboot2 ( Get word, refill if empty )
xboot_1       word      @_lit_X+Q, $20, @word_X+Q, @_lit_X+Q, 0, @pick_X+Q, @cfetch_X+Q, @_jz_X+Q, @xboot_2+Q
              
              ' find 0 pick _jz _xboot3 ( Find word, get number if not found )
              word      @find_X+Q, @_lit_X+Q, 0, @pick_X+Q, @_jz_X+Q, @xboot_3+Q

              'word      @_lit_x, 2, @dotx_x
              
              ' state @ = _jz _xboot4 ( Go execute if not compile mode or immediate )
              word      @state_X+Q, @fetch_X+Q, @equal_X+Q, @_jz_X+Q, @xboot_4+Q

              'word      @_lit_x, 3, @dotx_x
              
              ' compile, _jmp _xboot1 ( Otherwise, compile and loop again )
              word       @compcomma_X+Q, @_jmp_X+Q, @xboot_1+Q
              
              ' execute _jmp _xboot1 ( Execute and loop again )
xboot_4       word      @execute_X+Q, @_jmp_X+Q, @xboot_1+Q

              ' drop count _gethex ( Get number )
xboot_3       word      @drop_X+Q, @count_X+Q, @_gethex_X+Q

              ' state @ _jz _xboot1 ( Loop again if not compile mode )
              word      @state_X+Q, @fetch_X+Q, @_jz_X+Q, @xboot_1+Q
              
              ' ['] _lit , , _jmp _xboot1 ( Otherwise, compile number and loop again )
              word       @_lit_X+Q, @_lit_X+Q, @compcomma_X+Q, @compcomma_X+Q, @_jmp_X+Q, @xboot_1+Q

              ' drop refill _jmp _xboot1 ( Refill and loop again )
xboot_2       word      @drop_X+Q, @refill_X+Q, @_lit_X+Q, 13, @emit_X+Q, @_jmp_X+Q, @xboot_1+Q, 0, 0
              long

switch_L      word      @xboot_L+Q
              byte      FLAG_DEF, 6, "switch", 0, 0
              long
switch_X      word      execlistfunc, 0
              word      @_lit_X+Q, @getchar_X+Q, @_lit_X+Q, @key_B+Q, @store_X+Q, 0
              long

_last         long

_loop_L       word      @switch_L+Q
              byte      FLAG_CORE, 5, "_loop", 0, 0, 0
              long
_loop_X       word      _loopfunc, 0

_here         long

Dave Hein · 2015-10-14 19:20

Actually, eliminating object offsets should simplify things. However, we'll lose the ability to easily relocate Spin code if that is done.

jmg · 2015-10-14 20:58

Dave Hein wrote: »

One thing that concerns me about @label is that for P2 this represents the absolute hub RAM address, whereas for P1 it represents an object offset.

To me, it is preferable if the assembler manages the details, and unusual tag characters are avoided. P2 ASM is hard enough to read without extra distractions & default or most common code should always be the cleanest.
The # prefix on a label is one example, and restricts the ability of the ASM to check scope.
Once you tag immediate on anything, that becomes just a number.

cgracey wrote: »

When labels declared under ORGH are referenced, they return hub address values.

When labels declared under ORG are referenced, they return cog address values.

@label is only needed when you want to know the hub address of a label declared under ORG.

@label is superfluous when used on labels that were declared under ORGH.

So this is a very rare usage ?
Can you give an example of when someone would want to declare a hub address under ORG ?

I expected that ORGH set a Hub segment and ORG sets COG segment, and the Assembler would know which label was within each.

Because code now has two possible locations, it will help if users can hop between segments in a relocatable manner.
Some functions may be coded to be COG based and others nearby may be HUB.

An example could be INIT code, where you do not mind if that is a little slower, and can stay in HUB, but faster 'inner loops' may be very much COG focused.

cgracey · 2015-10-14 21:26

jmg wrote: »

Dave Hein wrote: »

One thing that concerns me about @label is that for P2 this represents the absolute hub RAM address, whereas for P1 it represents an object offset.

To me, it is preferable if the assembler manages the details, and unusual tag characters are avoided. P2 ASM is hard enough to read without extra distractions & default or most common code should always be the cleanest.
The # prefix on a label is one example, and restricts the ability of the ASM to check scope.
Once you tag immediate on anything, that becomes just a number.

cgracey wrote: »

When labels declared under ORGH are referenced, they return hub address values.

When labels declared under ORG are referenced, they return cog address values.

@label is only needed when you want to know the hub address of a label declared under ORG.

@label is superfluous when used on labels that were declared under ORGH.

So this is a very rare usage ?
Can you give an example of when someone would want to declare a hub address under ORG ?

I expected that ORGH set a Hub segment and ORG sets COG segment, and the Assembler would know which label was within each.

Because code now has two possible locations, it will help if users can hop between segments in a relocatable manner.
Some functions may be coded to be COG based and others nearby may be HUB.

An example could be INIT code, where you do not mind if that is a little slower, and can stay in HUB, but faster 'inner loops' may be very much COG focused.

Code assembles into hub space sequentially, regardless of whether it's under ORGs or ORGHs. The only sort-of exception is when ORGH is followed by a value, in which case $00's are filled to that point.
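
A minimal Python sketch of that layout rule, for illustration only: the `layout` function and the tuple format are hypothetical, section sizes are given in bytes, and raising an error when an ORGH address lies behind the current position is an assumption, since the behaviour in that case is not described here.

```python
def layout(sections):
    """Place sections into the hub image in source order.

    `sections` is a list of (kind, origin, size_bytes) tuples, where kind is
    "org" or "orgh". For "orgh", origin is a hub address or None; for "org",
    origin is the cog address, which does not affect hub placement.
    Returns (list of hub start addresses, total image size in bytes).
    """
    cursor = 0
    starts = []
    for kind, origin, size in sections:
        if kind == "orgh" and origin is not None:
            if origin < cursor:
                # Assumption: moving backwards is treated as an error.
                raise ValueError("ORGH address is behind the current hub position")
            cursor = origin  # the gap up to this point is filled with $00 bytes
        starts.append(cursor)
        cursor += size
    return starts, cursor
```

For a file shaped like the pfth listing earlier in the thread, with made-up sizes, `layout([("orgh", 0, 0), ("org", 0, 0x100), ("orgh", 0x800, 0x40)])` places the cog image at hub $000 and the ORGH data at $800, with $100 to $7FF zero-filled, for a total of $840 bytes.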

I was thinking that it would be really neat to have the development tool always be able to give you a graphical representation of memory usage by ORG/ORGH type, maybe showing 1 pixel per byte or long. Then, you would always know at a glance how things were laid out. This could be expanded in spin to incorporate Spin and VAR sections. It could really clear the air. The most important thing is to give the programmer a clear understanding, so that he can program confidently. Makes things go way faster.

cgracey · 2015-10-14 21:30

...So this is a very rare usage ?
Can you give an example of when someone would want to declare a hub address under ORG ?

I don't see why anyone would want to do that. In cases of ORG (cog) code, though, you might need the hub address, which is what @ is for.

cgracey · 2015-10-14 21:52

From Spin, '@label' would always return the hub address.

jmg · 2015-10-14 21:56

cgracey wrote: »

Code assembles into hub space sequentially, regardless of whether it's under ORGs or ORGHs. The only sort-of exception is when ORGH is followed by a value, in which case $00's are filled to that point.

Sounds like it needs some form of segment support added ?

cgracey wrote: »

I was thinking that it would be really neat to have the development tool always be able to give you a graphical representation of memory usage by ORG/ORGH type, maybe showing 1 pixel per byte or long. Then, you would always know at a glance how things were laid out. This could be expanded in spin to incorporate Spin and VAR sections. It could really clear the air. The most important thing is to give the programmer a clear understanding, so that he can program confidently. Makes things go way faster.

Good MAP files can do that, and I prefer ASCII files over GUIs as ASCII files can be saved, parsed, and run through version-checkers....

mindrobots · 2015-10-14 22:10

cgracey wrote: »

...So this is a very rare usage ?
Can you give an example of when someone would want to declare a hub address under ORG ?

I don't see why anyone would want to do that. In cases of ORG (cog) code, though, you might need the hub address, which is what @ is for.

Initialize data in a COG image in hubram before you do COGINIT; tweak an instruction (pre-modify self modifying code); probably more, we're a resourceful bunch at times! Hub exec will challenge us to try the unlikely.

Dave Hein · 2015-10-14 22:19

Dave Hein wrote: »

So for P2 should I use "label" instead of "@label" for labels defined under ORGH?

I didn't see an answer to my question, so I'll ask it again. Should I use "label" instead of "@label" for labels defined under ORGH?

cgracey wrote: »

From Spin, '@label' would always return the hub address.

Yes, that is true when using "@label" within a Spin method. But on the P1, "@label" used in the DAT section provides the object offset and not the hub address. It seems that I must not be explaining this very well, so I'll just quit commenting on it until the issue comes up later on when P2 Spin is implemented.
