Looking for a way to save AUGS/AUGD/SETQ state - Parallax Forums

rogloh Posts: 6,216
edited 2026-01-28 06:23 in PASM2/Spin2 (P2)

Hi all,
I am currently looking for a good way to determine the state of any pending AUGS/AUGD/SETQ instructions at an arbitrary stopping point in executing P2 code. I am working on an external memory scheme that fragments code into blocks of 128 or 256 bytes, so execution can stop at any instruction. I will essentially replay these 3 instructions prior to resuming in order to reload this state, but it would be nice not to have to search for them manually and instead just read out the current AUGD/AUGS/SETQ state so the values can be restored, e.g. for CORDIC use or for loading extended constant values > 511. It may not necessarily need to be used for block transfers, although I might not want to rule that out either. I don't expect I will need to worry about SETQ2 or ALTxx in my application, as high level code like "C" won't use those unless somehow tied to the optimizer. I'm also assuming that, once consumed, AUGD/AUGS will always default back to zero (hopefully I'm not wrong in my assumption).

I think I can probably extract AUGS and "Q" values with these snippets:

mov augsval, #0  ' with an AUGS pending, #0 picks up the pending 23-bit value in bits 31..9
shr augsval, #9  ' shift value down by 9 bits
setnib augsval, #$F, #6 ' set bits 27..24 to $F to create AUGS instruction to be executed later
...
augsval long 0 ' this instruction will be executed to restore any pending "AUGS" value

mov qval, #0
muxq qval, ##-1  ' extract current Q value
...
setq qval  ' restores Q value
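To see why the two snippets above recover the pending state, here is a small host-side model in Python (not PASM2, and not code from the thread). It assumes the AUGS/AUGD layout as I read it from the P2 instruction table (condition nibble, then %1111x, then a 23-bit payload) and the documented MUXQ behaviour. The helper names `augmented_immediate`, `muxq` and `make_aug` are made up for illustration.

```python
MASK32 = 0xFFFFFFFF

def augmented_immediate(aug23: int, imm9: int) -> int:
    """Value an instruction sees when its 9-bit immediate is extended
    by a pending AUGS/AUGD payload (payload lands in bits 31..9)."""
    return (((aug23 & 0x7FFFFF) << 9) | (imm9 & 0x1FF)) & MASK32

def muxq(d: int, s: int, q: int) -> int:
    """Model of MUXQ: for every 1 bit in Q, copy that bit of S into D."""
    return ((d & ~q) | (s & q)) & MASK32

def make_aug(aug23: int, dest: bool = False, cond: int = 0xF) -> int:
    """Build an AUGS (dest=False) or AUGD (dest=True) instruction long.
    Assumed layout: EEEE 1111x nnn..n, with x=0 for AUGS and x=1 for AUGD."""
    opcode = 0b11111 if dest else 0b11110
    return ((cond & 0xF) << 28) | (opcode << 23) | (aug23 & 0x7FFFFF)

pending = 0x123456                        # hypothetical pending AUGS payload
seen = augmented_immediate(pending, 0)    # models: mov augsval, #0
recovered = seen >> 9                     # models: shr augsval, #9
augs_word = make_aug(recovered)           # instruction long to replay later

q = 0xDEADBEEF                            # hypothetical Q register contents
qval = muxq(0, MASK32, q)                 # models: mov qval,#0 / muxq qval,##-1

print(hex(augs_word), hex(qval))
```

Note that `make_aug` fills in the condition nibble explicitly (here %1111, i.e. always execute), which the three-instruction PASM2 snippet does not do by itself.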

But not sure if AUGD can similarly be done as well, maybe with ALTxx somehow although I might have to hunt for this instruction specifically. AUGD/AUGS both have one nibble as $F so it's not too hard to identify but I'll have to go read more longs from HUB for this.

Any other ideas? I guess it's possible to just use LUT RAM if I keep some free space for this. HUB RAM is similarly possible but burns quite a lot more clocks, still probably faster than direct instruction matching.

wrlut #0, lutstorage
rdlut augdval, lutstorage
shr augdval, #9  ' shift value down by 9 bits
bith augdval, #23+(4<<6) ' create AUGD instruction to be executed later
...
augdval long 0 ' this instruction will be executed to restore any pending "AUGD" value

UPDATE: Damn, I just realized that any conditional flags might also need to be copied too which complicates this scheme further.... so I might still have to resort to brute force lookup of prior 3 instructions.
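For comparison, the brute-force lookup of the prior 3 instructions can be modelled on the host like this. It is only a sketch in Python with hypothetical names (`classify`, `prior_prefixes`). The AUGS/AUGD/SETQ/SETQ2 bit patterns are my reading of the P2 instruction table and should be checked against it. It reports each match's condition nibble but does not work out whether that condition actually passed at run time, which is exactly the complication mentioned in the UPDATE.

```python
from typing import List, Optional, Tuple

def classify(word: int) -> Optional[str]:
    """Name the prefix instruction encoded in one long, or None."""
    top5 = (word >> 23) & 0x1F            # bits 27..23
    if top5 == 0b11110:
        return "AUGS"
    if top5 == 0b11111:
        return "AUGD"
    # Assumed layout: EEEE 1101011 00L DDDDDDDDD 00010100x (x=0 SETQ, x=1 SETQ2)
    if (word >> 21) & 0x7F == 0b1101011 and (word >> 19) & 0x3 == 0:
        low9 = word & 0x1FF
        if low9 == 0b000101000:
            return "SETQ"
        if low9 == 0b000101001:
            return "SETQ2"
    return None

def prior_prefixes(words: List[int], stop_index: int,
                   lookback: int = 3) -> List[Tuple[int, str, int]]:
    """Scan the longs just before stop_index and return
    (index, name, condition nibble) for every prefix instruction found."""
    found = []
    for i in range(max(0, stop_index - lookback), stop_index):
        kind = classify(words[i])
        if kind is not None:
            found.append((i, kind, (words[i] >> 28) & 0xF))
    return found

# Hypothetical block: an ordinary instruction, an AUGS, a SETQ, then the stop point.
block = [0xF6040001, 0xFF123456, 0xFD640A28, 0x00000000]
print(prior_prefixes(block, stop_index=3))
```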

I also wondered if there was a way to somehow use forced interrupts on block boundaries, in case these could automatically restore the pending AUGD/AUGS/SETQ state, but after reading the documentation it looks like they wait for those instructions to complete first before triggering:

"When an interrupt event occurs, certain conditions must be met during execution before the interrupt branch can happen:

  • ALTxx / CRCNIB / SCA / SCAS / GETCT+WC / GETXACC / SETQ / SETQ2 / XORO32 / XBYTE must not be executing
  • AUGS must not be executing or waiting for a S/# instruction
  • AUGD must not be executing or waiting for a D/# instruction
  • REP must not be executing or active
  • STALLI must not be executing or active
  • The cog must not be stalled in any WAITx instruction"

Comments

  • evanh Posts: 17,049

    @rogloh said:
    ... but after reading the documentation it looks like they wait for those instructions to complete first before triggering:

    "When an interrupt event occurs, certain conditions must be met during execution before the interrupt branch can happen:

    • ALTxx / CRCNIB / SCA / SCAS / GETCT+WC / GETXACC / SETQ / SETQ2 / XORO32 / XBYTE must not be executing
    • AUGS must not be executing or waiting for a S/# instruction
    • AUGD must not be executing or waiting for a D/# instruction
    • REP must not be executing or active
    • STALLI must not be executing or active
    • The cog must not be stalled in any WAITx instruction"

    Yeah, I was going to start a lecture on this until I got to the bottom of your posting. Consequently, restoring of Q is only needed if you've overwritten it in your ISR.

    The biggest issue arises with the Cordic. If its pipeline is being utilised for faster compute then that routine needs to hold off interrupts for the duration of the pipelining. Which can be a substantial number of clocks.

  • ersmith Posts: 6,238

    @rogloh Could we back up and look at this on a higher level? You mentioned using C code, if so then maybe we could modify the compiler or alternatively add a postprocess to the compiler output so that the problematic instructions do not end a block (inserting NOPs if necessary). A postprocessor would have the advantage that it could cope with user assembly too, including inline assembly in the C code.

  • evanh Posts: 17,049
    edited 2026-01-28 22:57

    [Oh, oops, I hadn't understood Roger's intent at all.]

    The IRQ-blocking approach is a problem because then the instructions will start to read past the loaded code block.

    However, as Eric points out, the correct solution to this is to just not assemble such instructions over the block boundary.

  • evanh Posts: 17,049
    edited 2026-01-29 00:53

    You're probably going to want fixed block-granular boundaries too, with things like branches and tight loops kept from sitting across a block-end boundary as much as possible. It's an enhanced version of the above instruction bumping that also assembles extra NOPs to push a branch target into the next block.

    EDIT: On the other hand, would there be a feature for copying multiple contiguous blocks so they sit physically adjacent in hubRAM? If so, then there are options for grouping blocks. This would avoid needing spacing NOPs in every block.
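The postprocessing idea from the comments (never let AUGS/AUGD/SETQ end a block, inserting NOPs where needed) could be sketched roughly as follows. This is a simplified Python model, not an actual tool: `pad_blocks` and `is_prefix` are hypothetical names, the encodings are my reading of the P2 instruction table, and it treats the first non-prefix instruction after a run of prefixes as their consumer. It also works on raw instruction longs, so it ignores the branch-target fix-ups a real postprocessor would need after shifting code.

```python
from typing import List

NOP = 0x00000000   # P2 NOP is an all-zero long

def is_prefix(word: int) -> bool:
    """True for AUGS, AUGD, SETQ or SETQ2 (assumed encodings)."""
    if (word >> 23) & 0x1F in (0b11110, 0b11111):          # AUGS / AUGD
        return True
    return ((word >> 21) & 0x7F == 0b1101011               # SETQ / SETQ2
            and (word >> 19) & 0x3 == 0
            and (word & 0x1FF) in (0b000101000, 0b000101001))

def pad_blocks(words: List[int], block_longs: int = 32) -> List[int]:
    """Insert NOPs so a run of prefixes and the instruction after it
    always land inside the same block (32 longs = 128 bytes)."""
    out: List[int] = []
    i = 0
    while i < len(words):
        j = i
        while j < len(words) and is_prefix(words[j]):
            j += 1
        group = words[i:j + 1]            # prefixes plus their consumer
        room = block_longs - (len(out) % block_longs)
        if len(group) > 1 and len(group) > room:
            out.extend([NOP] * room)      # bump the group into the next block
        out.extend(group)
        i += len(group)
    return out

# Tiny demonstration with 4-long blocks: the AUGS would otherwise be
# the last long of the first block, split from the instruction it extends.
other = 0xF6040001                        # an ordinary (non-prefix) instruction
augs = 0xFF123456
padded = pad_blocks([other, other, other, augs, other], block_longs=4)
print([hex(w) for w in padded])
```

With fixed block sizes, the cost is at most a few NOPs per block; grouping adjacent blocks, as raised in the last comment, would reduce how often the padding is needed.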
