CALLD PA/PB/PTRA/PTRB,#\A syntax question

Bob Drury · 2021-10-06 00:30

CALLD PA/PB/PTRA/PTRB,#\A Call to A by writing {C, Z, 10'b0, PC[19:0]} to
PA/PB/PTRA/PTRB (per W). If R = 1 then PC += A, else PC = A. "\" forces R = 0. (406)

I looked through the documentation but couldn't really find anything discussing CALLD. I tried using RET with
CALLD (nope) and found the following:

$1F6 RAM / PA CALLD-imm return, CALLPA parameter, or LOC address
$1F7 RAM / PB CALLD-imm return, CALLPB parameter, or LOC address
RESI3 = CALLD $1F0,$1F1 WCZ
RESI2 = CALLD $1F2,$1F3 WCZ
RESI1 = CALLD $1F4,$1F5 WCZ
RESI0 = CALLD INA,INB WCZ
RETI3 = CALLD INB,$1F1 WCZ
RETI2 = CALLD INB,$1F3 WCZ
RETI1 = CALLD INB,$1F5 WCZ
RETI0 = CALLD INB,INB WCZ

Can someone point me in the right direction? I have no idea what this is.
Regards and Thanks
Bob (WRD)

AJL · 2021-10-06 02:12

As CALLD PA/PB/PTRA/PTRB doesn't write the return address to the stack, a RET won't work.

As odd as it may seem, the way to return from this is a CALLD D with D set to either a scratch register, or a read only register like INB, and S set to the relevant register (PA/PB/PTRA/PTRB) for the CALLD you are returning from.

This will fetch the return address from that register, restore the C & Z flags that were stored before the call, and discard the 'return address' from this CALLD.

This might help to explain the difference between the RESIx and RETIx: In RESIx the address of the next instruction in the ISR is stored in the relevant register ($1F0, $1F2, $1F4, or the shadow register behind INA), while in RETIx that address is thrown away by 'writing' it to the read-only INB register.
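
A minimal sketch of that return, with the flags made visible. The labels caller, done and sub are made up for illustration and this has not been run on hardware; it relies only on the behaviour described in this thread (link written as {C, Z, 10'b0, PC[19:0]}, flags restored under WCZ).

```pasm2
DAT
        org     0
caller  modcz   _set,_clr   wcz     'C = 1, Z = 0 before the call
        calld   pa,#\sub            'PA = {C, Z, 10'b0, address of done}, then jump to sub
done    jmp     #done               'resumes here with C = 1, Z = 0 restored

sub     modcz   _clr,_set   wcz     'subroutine body clobbers both flags
        calld   inb,pa      wcz     'PC = PA[19:0], C = PA[31], Z = PA[30];
                                    'the link this CALLD produces is "written" to read-only INB and lost
```

Dropping the WCZ on the last line should still return, just without restoring the flags.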

evanh · 2021-10-06 02:51

Yeah, it's not a regular "call" because it doesn't use any stack. It is an odd-ball - the naming of the instruction changed a couple of times before CALLD was settled on. It was named JMPRET in the Prop1. LINK, JMPL and BL (branch and link) are some other names used in other CPUs. Others have a specific Link Register but Propellers just use any general register to store the link address.

evanh · 2021-10-06 02:55

The list you've posted is some special uses of CALLD. In the case of interrupts, entering and exiting the Interrupt Service Routines are entirely dealt with through some crafted cases of CALLD.

evanh · 2021-10-06 03:11

Btw, there are two versions of that instruction. The one in the topic title contains an immediate 20-bit branch address. As such it doesn't have S or D fields, so it only has four link registers to choose from. The other version has both S and D fields and so can use any general register to place the link address in.
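
A rough side-by-side of the two versions, as a sketch only. The labels start, stop, sub_a, sub_b and the register name link are hypothetical, and the fragment is untested. It assumes the "#\" prefix selects the 20-bit immediate form and that an immediate S in the D/S form is a short relative branch.

```pasm2
DAT
        org     0
start   calld   pa,#\sub_a          '20-bit immediate form: link must go to PA, PB, PTRA or PTRB
        calld   link,#sub_b         'D/S form: link can go to any cog register
stop    jmp     #stop

sub_a   nop                         'placeholder body
        calld   inb,pa              'return through PA (no WCZ, so the flags are left alone)

sub_b   nop                         'placeholder body
        calld   inb,link            'return through the ordinary register

link    long    0                   'holds the link for sub_b; never executed as code
```

Both returns use the D/S form with a register in S, since that is the form that branches to an address held in a register.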

Bob Drury · 2021-10-06 03:33

evanh
Do you have any simple examples of both versions? I checked the documentation again but can't seem to find anything.
I never would have guessed to use CALLD D as a return. I would assume that, since there is no info, the use of this call must be lower in priority. I'm just trying to have some sort
of complete notes on code instructions. Details of RESIx and RETIx can be left for another day.
Regards and Thanks
Bob (WRD)

evanh · 2021-10-06 04:03

Technically, it doesn't ever "return". The CALLD doesn't perform any stacking itself. It's really a jump, that copies the address of where it branched from - the link. It's up to the software as to how that might be used further along.

Compilers will use it when optimising for what's known as leaf routines. Such routines don't call further compiled routines. As such these routines often don't need certain "stack frame" baggage that is normally provided by the compiler. Once the stack use drops down to zero, the last thing left is the caller return address. Well, that too can be eliminated from the stack with the use of a couple of link instructions replacing the usual call and return.

evanh · 2021-10-06 04:16

The Prop1 will have many examples. It didn't have a stack, nor a CALL instruction.

evanh · 2021-10-06 04:46

Here's a Prop1 example where two routines, Receive and Transmit, are cooperatively time slicing by using JMPRET. The RESIx aliases of CALLD serve a similar job, albeit interrupt based, in the Prop2.

'
' Receive
'
receive                 jmpret  rxcode,txcode         'run a chunk of transmit code, then return

                        test    rxtxmode,#%001  wz    'wait for start bit on rx pin
                        test    rxmask,ina      wc
        if_z_eq_c       jmp     #receive

                        mov     rxbits,#9             'ready to receive byte
                        mov     rxcnt,bitticks
                        shr     rxcnt,#1
                        add     rxcnt,cnt                          

:bit                    add     rxcnt,bitticks        'ready next bit period

:wait                   jmpret  rxcode,txcode         'run a chunk of transmit code, then return

                        mov     t1,rxcnt              'check if bit receive period done
                        sub     t1,cnt
                        cmps    t1,#0           wc
        if_nc           jmp     #:wait

                        test    rxmask,ina      wc    'receive bit on rx pin
                        rcr     rxdata,#1
                        djnz    rxbits,#:bit

                        shr     rxdata,#32-9          'justify and trim received byte
                        and     rxdata,#$FF
                        test    rxtxmode,#%001  wz    'if rx inverted, invert byte
        if_nz           xor     rxdata,#$FF

                        rdlong  t2,par                'save received byte and inc head
                        add     t2,rxbuff
                        wrbyte  rxdata,t2
                        sub     t2,rxbuff
                        add     t2,#1
                        and     t2,#$0F
                        wrlong  t2,par

                        jmp     #receive              'byte done, receive next byte
'
' Transmit
'
transmit                jmpret  txcode,rxcode         'run a chunk of receive code, then return

                        mov     t1,par                'check for head <> tail
                        add     t1,#2 << 2
                        rdlong  t2,t1
                        add     t1,#1 << 2
                        rdlong  t3,t1
                        cmp     t2,t3           wz
        if_z            jmp     #transmit

                        add     t3,txbuff             'get byte and inc tail
                        rdbyte  txdata,t3
                        sub     t3,txbuff
                        add     t3,#1
                        and     t3,#$0F
                        wrlong  t3,t1

                        or      txdata,#$100          'ready byte to transmit
                        shl     txdata,#2
                        or      txdata,#1
                        mov     txbits,#11
                        mov     txcnt,cnt

:bit                    test    rxtxmode,#%100  wz    'output bit on tx pin according to mode
                        test    rxtxmode,#%010  wc
        if_z_and_c      xor     txdata,#1
                        shr     txdata,#1       wc
        if_z            muxc    outa,txmask        
        if_nz           muxnc   dira,txmask
                        add     txcnt,bitticks        'ready next cnt

:wait                   jmpret  txcode,rxcode         'run a chunk of receive code, then return

                        mov     t1,txcnt              'check if bit transmit period done
                        sub     t1,cnt
                        cmps    t1,#0           wc
        if_nc           jmp     #:wait

                        djnz    txbits,#:bit          'another bit to transmit?

                        jmp     #transmit             'byte done, transmit next byte
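
For comparison, a much smaller Prop2 sketch of the same ping-pong idea using the D/S form of CALLD in place of JMPRET. This is not a port of the driver above. The names entry, task_a, task_b, ta and tb are made up, the pin numbers 56 and 57 are an assumption (pick any free pins), and the fragment is untested.

```pasm2
CON
        LED_A = 56                  'assumed pin numbers, purely for illustration
        LED_B = 57

DAT
        org     0
entry   mov     tb,#task_b          'task B has not run yet, so it resumes at its first instruction

task_a  calld   ta,tb               'save where A resumes in ta, jump to where B left off
        drvnot  #LED_A              'A's chunk of work
        jmp     #task_a

task_b  calld   tb,ta               'save where B resumes in tb, jump to where A left off
        drvnot  #LED_B              'B's chunk of work
        jmp     #task_b

ta      res     1                   'link register for task A
tb      res     1                   'link register for task B
```

Each CALLD stores its own resume point and jumps to the other task's stored resume point, so the two loops take turns without any stack.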

AJL · 2021-10-06 04:46

Yes, evanh responded initially with a description of the other CALLD type (with D & S fields) that behaves much like the JMPRET of the original Propeller (P1).

D is the RetInstAddr and S is the DestAddr

The version you first asked about uses a 20-bit immediate allowing branching to anywhere, while storing the return address in one of those four registers. To return from the 20-bit immediate version requires use of the D/S version.

evanh · 2021-10-06 05:05

Just looking at the Prop1 datasheet, I note it does list both a CALL and a RET. But they are the same opcode as JMPRET. And so is the JMP instruction. So JMP, RET and CALL are all aliases of JMPRET. But there is still no stack. So the CALL and RET are using what would be equivalent to a 1-level deep stack. EDIT: Oh, now I remember, CALL relied on self-modification to run-time insert the return address into the RET instruction. It had its caveats.

Ariba · 2021-10-06 12:10

A simple example:

CON
   _clkfreq = 160_000_000

   LED  =  60

PUB main()
  waitms(100)
  coginit(16, @code, 0)
  repeat

DAT
        org   0
code    calld pa,#toggle        'jump to toggle and save return addr into pa
        waitx ##40_000_000
        jmp   #code

toggle  drvnot #LED
        jmp    pa               'return (jump indirect thru pa)

You just get the return address in a register instead of pushed onto the stack. So you can handle the return address by yourself. This subroutine would also work:

toggle  push   pa
        drvnot #LED
        ret               'return

Andy

Bob Drury · 2021-10-06 17:45

Thanks. In the first form, CALLD PA/PB/PTRA/PTRB,#\A, the key is to remember that PA/PB/PTRA/PTRB receives the address of the CALLD instruction + 1, i.e. the address of the instruction following the CALLD. I take the "(per W)" term to mean exactly that. Is there some background to where "(per W)" comes from? What does it mean? I realize
what it is doing, but why would I know that "(per W)" means this? Is there some sort of legend of terms? Or is this a common term I am just not aware of?
Regards
Bob (WRD)

evanh · 2021-10-06 18:10

Dunno why W as a name. I'd say it's mainly to be a unique name so as not to be confused with any other lettering in the instruction sheet. That group of instructions with 20-bit immediate addresses is distinct from most because their encodings don't fit the regular OPCODE+D+S encoding. And the two AUGx's are different again.

EDIT: The lettering used in the instruction sheet evolved as Chip added more and people made suggestions. If you look at the EEEE conditional execution field, the letter used for the Prop1 was CCCC instead.

Ariba · 2021-10-06 18:11

I don't know where you have found this description, but I think there must be an instruction bit encoding with W bits beside it.
From the instructions.txt:

CCCC 11100ww Rnn nnnnnnnnn nnnnnnnnn        CALLD   reg,#abs/#rel

The two w bits define to which register the return address gets written.

Andy

evanh · 2021-10-06 18:19

Andy,
Where did you find that? It must be very old, it has the CCCC I was just referring to.

Rayman · 2021-10-06 18:29

Is it right that the JMPRET way doesn't restore the CZ flags, but that way @AJL described in post #2 does?

evanh · 2021-10-06 18:41

When you say JMPRET way, do you mean on the Prop1 itself? Looking at the Prop1 datasheet ... sounds right, looks like C is untouched and Z normally always reset.

Whereas the Prop2 CALLD options are just like RET. Flags are restored from the link register according to WC/WZ/WCZ options.

Ariba · 2021-10-06 22:23

@evanh said:
Andy,
Where did you find that? It must be very old, it has the CCCC I was just referring to.

I found it on my harddisk ;-)

I see in the newer versions it changed a bit:

EEEE 11100WW RAA AAAAAAAAA AAAAAAAAA        CALLD   reg,#abs/#rel
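
As a worked example of that bit layout, here is how a call to absolute address $400 with the link in PA would break down. Two things are assumed here rather than taken from this thread: that WW = %00 selects PA (with PB, PTRA, PTRB following as %01, %10, %11), and that EEEE = %1111 is the unconditional (always execute) condition.

```pasm2
'       EEEE 11100WW R AAAAAAAAAAAAAAAAAAAA
'       1111 1110000 0 00000000010000000000     = $FE00_0400
        calld   pa,#\$400           'R = 0 because "\" forces an absolute address
```

Under those assumptions the "(per W)" in the instruction description just means "whichever of the four registers the WW field selects".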

Rayman · 2021-10-07 15:00

I kind of wish CALLD were CALLS... Wouldn't that make more sense as you are calling the subroutine at address S?

Wuerfel_21 · 2021-10-07 15:22

@Rayman said:
I kind of wish CALLD were CALLS... Wouldn't that make more sense as you are calling the subroutine at address S?

But that's what all the CALL*s do. Only CALLD and CALLPA/CALLPB even have a D register

evanh · 2021-10-07 16:12

BL is too terse for my liking. JMPL looks fine although could be mistaken for meaning some kind of long jump. LINK is certainly distinctive but also not obvious what a link might be. But that isn't hard to get over so I guess I favour LINK.

EDIT: JMPRET wasn't so bad I didn't think. JMPLK wouldn't be bad either.

Rayman · 2021-10-07 17:45

Ok, so this really is the equivalent of the P1 JMPRET, except that it now has the option to restore CZ flags, right?

Wuerfel_21 · 2021-10-07 17:53

Not quite P1 JMPRET, because that is able to patch a JMP in place, whereas CALLD overwrites the entire long with return address + CZ

evanh · 2021-10-07 17:58

It could do that because the code address range was only 9 bits on the Prop1.

Wuerfel_21 · 2021-10-07 18:03

You could totally patch a 20 bit address into a JMP

evanh · 2021-10-07 18:12

Good point, but it would be rough in comparison ... with relative encodings ... and loss of flags ... and hubexec ...
