LLVM Backend for Propeller 2

rogloh · 2026-05-04 09:55

@Rayman said:
This sounds great @rogloh

Putting your version on GitHub sounds like a great idea.
Guess that would be a fork.

But @n_ermosh did post a few months ago… Maybe he would take a pull request, if we can figure that out…

Yeah was waiting to see if he's lurking about. Might have to figure out GitHub again - been ages since I pushed stuff there.

BTW I am now adding AUGD and AUGS constant value printing (where consumed) so we can see the full 32-bit values after augmenting. We can't show the full AUG values in the AUGD/AUGS statements themselves because the instruction that consumes the value follows later and has not been processed yet. I also still want to show the original unaugmented value so this code can be re-assembled if you generate it without the addresses, which is why the final augmented value is shown only after the comment starts. Hopefully we can ultimately look up any matching symbol names at these address values too. That would help to figure out some data variable accesses when memory is read or written there.

      9c: 80 03 00 ff                           augs    #$70000 >> 9
      a0: 00 f0 07 f6                           mov     ptra, #0  ' ##$70000 
      a4: 0c 03 00 ff                           augs    #$61800 >> 9
      a8: 34 a1 07 f6                           mov     r0, #$134  ' ##$61934 
      ac: d0 a5 1b f2                           cmp     r2, r0 wcz
      b0: 04 00 90 cd           if_c            jmp     #$4     ' <$b8>
      b4: 0c 00 90 fd                           jmp     #$c     ' <$c4>
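
To make the AUGS arithmetic in the listing above concrete, here is a minimal Python sketch (not the actual P2LLVM code). It assumes the usual P2 layout where AUGS carries a 23-bit value in bits 22..0 and the consuming instruction holds the low 9 bits in its S field; the helper names `le_word` and `augmented_s` are made up for illustration.

```python
AUG_MASK = 0x7FFFFF   # assumption: AUGS/AUGD carry a 23-bit value in bits 22..0
IMM9_MASK = 0x1FF     # assumption: #S occupies bits 8..0 of the consuming instruction


def le_word(hex_bytes: str) -> int:
    """Convert listing bytes such as '0c 03 00 ff' (little-endian) into a 32-bit word."""
    return int.from_bytes(bytes.fromhex(hex_bytes), "little")


def augmented_s(aug_word: int, next_word: int) -> int:
    """Combine the AUGS value (upper 23 bits) with the next instruction's 9-bit #S."""
    return ((aug_word & AUG_MASK) << 9) | (next_word & IMM9_MASK)


# augs #$61800 >> 9 followed by mov r0, #$134
print(hex(augmented_s(le_word("0c 03 00 ff"), le_word("34 a1 07 f6"))))  # 0x61934
# augs #$70000 >> 9 followed by mov ptra, #0
print(hex(augmented_s(le_word("80 03 00 ff"), le_word("00 f0 07 f6"))))  # 0x70000
```

Both results match the ##$61934 and ##$70000 values printed in the comments of the listing.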

Update: there is just one more missing item that the decoding of the AUGS/AUGD stuff will also help with: the special augmented and unaugmented hub memory accesses using PTRx pointer index operations like mov r0, ptra[##$2A332] and mov r0, ptra[123]

Right now these are not yet handled at all, either with augmented constants or without. I should add these which I think will complete the P2LLVM parser to accept all standard P2 instructions and aliases, although compile time expressions for calculating constants may still be limited - I've not looked into that.

The special SPIN2 language control directives like ORGH, FIT, RES etc won't be implemented though, as this assembler already uses its own (primarily GNU GAS compatible) directives.

iseries · 2026-05-05 10:16

@rogloh,
To put your changes up and make a pull request would mean you have to fork a copy, do a pull request against that copy, do the submodule updates, copy the changes in, do a push. That would make a version that we could pull down and test.

Mike

rogloh · 2026-05-05 11:13

@iseries said:
@rogloh,
To put your changes up and make a pull request would mean you have to fork a copy, do a pull request against that copy, do the submodule updates, copy the changes in, do a push. That would make a version that we could pull down and test.

Mike

Thanks @iseries for the refresher. I'll probably do exactly that in the end. I just want to complete the last part to parse which is the pointer indexing stuff. Then I should try the pull request. One good thing is that nothing seems to have changed since I dragged my tree down in the first place so no additional merging needs to be done at this stage.

Rayman · 2026-05-05 19:19

@rogloh glad you're thinking about putting it on GitHub later…

Would be nice to have build instructions for Linux if possible. Guess my problem might be that Linux people are so used to building stuff they don’t need much help whereas I’m mostly lost…

Also, don’t know if it can work here but some GitHub stuff appears to be automagically compiled by them if set up (?)

Rayman · 2026-05-06 22:40

@rogloh Suppose if spin2cpp still works, then can add spin2 drivers to c/c++ code right?

rogloh · 2026-05-07 01:29

@Rayman said:
@rogloh Suppose if spin2cpp still works, then can add spin2 drivers to c/c++ code right?

Potentially, although you'd need to worry about memory layout and startup code etc. Not sure they'll be fully compatible at that level without changes in how you structure and launch the bundled application code. But it's a compelling solution if we could use both toolchains together, particularly for something like MicroPython which could make use of other SPIN2 based objects converted into C code.

rogloh · 2026-05-07 14:38

Got a bit further on this and tested all the opcodes for the new P2 instructions I'd recently added to P2LLVM by comparing all the different possible {#}D/{#}S forms with the binary output that flexspin generates. This is the list:

JMPREL LOC CALLD TJS TJNS TJV REP TEST TESTN CALLB BITxxx DIRxxx OUTxxx FLTxxx DRVxxx TESTP TESTPN TESTB TESTBN ALTR ALTB ALTI SETR SETS SETD COGATN XZERO CRCBIT CRCNIB WAITxxx SKIP SKIPF EXECF PUSH POP ADDPIX MULPIX BLNPIX MIXPIX ALLOWI STALLI TRGINT1 TRGINT2 TRGINT3 NIXINT1 NIXINT2 NIXINT3 ALTSN ALTGN ALTSB ALTGB ALTSW ALTGW SETLUTS SETCY SETCI SETCQ SETCFRQ SETCMOD SETPIV SETPIX SETPAT SETDACS GETXACC INCMOD DECMOD FBLOCK SETSCP SETINT1 SETINT2 SETINT3 GETPTR GETSCP CMPM CMPSUB RFVAR RFVARS SCA SCAS MUXC MUXNC MUXZ MUXNZ NEGC NEGNC NEGZ NEGNZ RCZR RCZL JINT JCT1 JCT2 JCT3 JSE1 JSE2 JSE3 JSE4 JPAT JFBW JXMT JXFI JXRO JXRL JATN JQMT JNINT JNCT1 JNCT2 JNCT3 JNSE1 JNSE2 JNSE3 JNSE4 JNPAT JNFBW JNXMT JNXFI JNXRO JNXRL JNATN JNQMT CALLPA CALLPB

Found a typo and fixed the SETDACS opcode before it became buried and would cause problems later. But one error wasn't too bad given this was a manual process typing it all in.

I did find a couple of things with the P2LLVM output that worked differently when I assembled the same file using flexspin and compared the outputs.

1) Relative branch distances are taken directly from the immediate value entered for P2LLVM, while flexspin wants a symbol or target address in order to compute its relative branch amount. If you enter an immediate, P2LLVM just uses it as the branch offset, so it won't encode the same as flexspin, which takes the number provided, works out the distance from the current address to that target address and encodes that value (unless the \ prefix is added, in which case it uses the address directly). It might be good to bring the P2LLVM parser in line with how flexspin behaves by default, at least when labels are not used as branch targets, or perhaps flexspin is doing it wrong.

2) Some aliases are being incorrectly matched when immediates are being used. For example:
NOT D,#2 can incorrectly be matched as NOT D wc (alias form) because the #2 immediate S-operand in the NOT D, #S form is the same value as the encoded WC flag. It looks like effects flag operands need to have their own type and not get encoded as immediates in order for the aliases to work in all situations. For now, until that is resolved (which needs additional effort), I'll just disable the affected aliases. I think it's this list of aliases: NOT ABS NEG NEGC NEGNC NEGZ NEGNZ TEST ENCOD ONES
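
To illustrate item 1, here is a small Python sketch of the two interpretations as I've described them. It is only a model, not code from either assembler: the function names are made up, and it assumes hub code where the relative offset is counted in bytes from the address of the next instruction (which is what the jmp #$4 at $b0 landing on $b8 in the earlier listing suggests).

```python
def encode_p2llvm(imm: int, pc: int):
    """P2LLVM as described above: the immediate is used as-is as the relative offset."""
    return ("relative", imm)


def encode_flexspin(imm: int, pc: int, backslash: bool = False):
    """flexspin as described above: the immediate is a target address.

    With the \\ prefix the address is used directly; otherwise the distance
    from the next instruction (pc + 4, an assumption for hub code) is encoded.
    """
    if backslash:
        return ("absolute", imm)
    return ("relative", imm - (pc + 4))


def resolved_target(encoding, pc: int) -> int:
    """Where the branch actually lands for a given (mode, value) encoding."""
    mode, value = encoding
    return value if mode == "absolute" else pc + 4 + value


pc = 0xB0
print(hex(resolved_target(encode_flexspin(0xB8, pc), pc)))  # 0xb8
print(hex(resolved_target(encode_p2llvm(0xB8, pc), pc)))    # 0x16c
```

So the same source line jmp #$b8 placed at $b0 ends up at $b8 under the flexspin reading but at $16c when the number is taken as the offset itself.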

One other thing that I found is that the percentage symbol does indeed let you enter binary constants, as it should, after I'd enabled Motorola mode in the lexing. The $ prefix was working for hex but % was not working for binary when I tested it earlier. It turns out this was just because inline assembly also uses % as an escape character. To get the real % you need to use %%, although this is only needed for the inline assembler. The regular assembler that works with separate .s files only needs a single % for binary number encoding (or can also use the 0b prefix, similar to 0x for hex). You can see both the % and $ prefixes now working in the captured log below. Unfortunately I don't think we can readily make the %% prefix represent two-bit quaternary numbers (base 4), as this is buried deep in the common lexing code used by all microprocessor targets, which I certainly don't want to mess with. But as it's rarely used and we can easily convert those values into binary instead, it shouldn't be a major deal to not have that feature implemented in P2LLVM.

❯ cat test.s
.text
  mov r0, #%10101
  mov r0, #$110
❯ cl -c -o test.o test.s
❯ objd -d --print-imm-hex test.o

test.o: file format elf32-p2

Disassembly of section .text:

00000000 <.text>:
       0: 15 a0 07 f6                       mov r0, #$15 
       4: 10 a1 07 f6                       mov r0, #$110 
❯
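
As a cross-check of the prefixes discussed above, and of the quaternary-to-binary conversion workaround, here is a rough Python sketch. It has nothing to do with the real LLVM lexer; the function names are invented, and the %% handling follows the Spin2 convention of base 4 digits, which P2LLVM itself does not accept.

```python
def parse_p2_literal(text: str) -> int:
    """Parse $hex, %binary, %%quaternary, 0x/0b prefixed or plain decimal text."""
    t = text.strip()
    if t.startswith("%%"):
        return int(t[2:], 4)
    if t.startswith("%"):
        return int(t[1:], 2)
    if t.startswith("$"):
        return int(t[1:], 16)
    if t.lower().startswith("0x"):
        return int(t[2:], 16)
    if t.lower().startswith("0b"):
        return int(t[2:], 2)
    return int(t, 10)


def quaternary_to_binary(text: str) -> str:
    """Rewrite a %% (base 4) literal as an equivalent % (binary) literal."""
    digits = text.strip()[2:]
    width = 2 * len(digits)  # each quaternary digit is exactly two bits
    return "%" + format(int(digits, 4), f"0{width}b")


print(hex(parse_p2_literal("%10101")))   # 0x15
print(hex(parse_p2_literal("$110")))     # 0x110
print(quaternary_to_binary("%%3210"))    # %11100100
```

The first two results agree with the #$15 and #$110 immediates in the disassembly above.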

Rayman · 2026-05-07 15:16

@rogloh Is the test suite of any use?
https://github.com/ne75/p2llvm/tree/master/test/src

rogloh · 2026-05-07 23:42

@Rayman said:
@rogloh Is the test suite of any use?
https://github.com/ne75/p2llvm/tree/master/test/src

It most likely is useful, but last time I wanted to use it I think I had issues with it requiring some extra Python software modules that apparently collided with my Mac's own versions when I tried to install them. So I stopped there.

rogloh · 2026-05-08 14:43

It's pretty amazing what can be done with this LLVM toolchain if you mess about with it enough. By fighting/arguing/persevering with a Google AI for several hours I was finally able to get the disassembly listing to produce the labels needed for branch targets. OMG that was a whole process in itself, but it finally appears to be working.

In the listing it is now making labels for each branch target in HUBRAM and referenced address in the code and automatically printing them after the tjz and jmp instructions. I've still got my own comment string being printed alongside that shows the target address and was used to help validate things but that is sort of redundant now. Relative branch addressing seems correct too.

0001beb4 <.LBB32_5>:
   1beb4: d5 a1 c3 fa                       rdbyte  r0, r5
   1beb8: 01 a8 07 f1                       add r4, #1
   1bebc: c4 38 c0 fd                       calla   #\unichar_isalpha
   1bec0: 01 de 97 fb                       tjz r31, #1 <.LBB32_7> ' <$1bec8>
   1bec4: 18 00 90 fd                       jmp #24 <.LBB32_6> ' <$1bee0>

0001bec8 <.LBB32_7>:
   1bec8: 01 a4 07 f1                       add r2, #1  
   1becc: 3f a1 07 fb                       rdlong  r0, ptra[-1]
   1bed0: d0 a5 1b f2                       cmp r2, r0 wcz
   1bed4: d4 ab 03 f6                       mov r5, r4
   1bed8: d8 ff 9f cd       if_c            jmp #-40 <.LBB32_5> ' <$1beb4>
   1bedc: 2c 00 90 fd                       jmp #44 <.LBB32_9> ' <$1bf0c>

0001bee0 <.LBB32_6>:
   1bee0: d5 a1 c3 fa                       rdbyte  r0, r5
   1bee4: d8 39 c0 fd                       calla   #\unichar_isupper
   1bee8: 01 de 9f fb                       tjnz    r31, #1 <.LBB32_8> ' <$1bef0>
   1beec: 28 00 90 fd                       jmp #40 <.LBB32_11> ' <$1bf18>

0001bef0 <.LBB32_8>:
   1bef0: 01 a4 07 f1                       add r2, #1 
   1bef4: 3f a1 07 fb                       rdlong  r0, ptra[-1]
   1bef8: d0 a5 1b f2                       cmp r2, r0 wcz
   1befc: d3 ad 03 f6                       mov r6, r3
   1bf00: d4 ab 03 f6                       mov r5, r4
   1bf04: ac ff 9f cd       if_c            jmp #-84 <.LBB32_5> ' <$1beb4>
   1bf08: 08 00 90 fd                       jmp #8 <.LBB32_10> ' <$1bf14>

0001bf0c <.LBB32_9>:
   1bf0c: 01 ac 07 f5                       and r6, #1  
   1bf10: 01 ac 97 fb                       tjz r6, #1 <.LBB32_11> ' <$1bf18>
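
The relative targets in the listing above can be double-checked straight from the instruction bytes. The Python sketch below is only a sanity check, not part of the toolchain: it assumes JMP #rel holds a signed 20-bit byte offset in bits 19..0, that TJZ/TJNZ hold a signed 9-bit offset counted in longs in bits 8..0, and that both are relative to the address of the next instruction. The helper names are hypothetical.

```python
def le_word(hex_bytes: str) -> int:
    """Convert listing bytes such as 'd8 ff 9f cd' (little-endian) into a 32-bit word."""
    return int.from_bytes(bytes.fromhex(hex_bytes), "little")


def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def jmp_target(addr: int, word: int) -> int:
    """JMP #rel: signed 20-bit byte offset from the next instruction (assumed)."""
    return addr + 4 + sign_extend(word & 0xFFFFF, 20)


def tjx_target(addr: int, word: int) -> int:
    """TJZ/TJNZ #rel: signed 9-bit offset in longs from the next instruction (assumed)."""
    return addr + 4 + 4 * sign_extend(word & 0x1FF, 9)


print(hex(jmp_target(0x1BED8, le_word("d8 ff 9f cd"))))  # 0x1beb4
print(hex(tjx_target(0x1BEC0, le_word("01 de 97 fb"))))  # 0x1bec8
```

These agree with the <$1beb4> and <$1bec8> comments printed next to the corresponding instructions.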

I was also hoping to enable the --visualize-jumps ASCII art feature, which would look something like the listing below, but it apparently came in with a later version of the toolchain and is not supported by P2LLVM, which is based on LLVM 14.

    2a5c: 04 de 97 fb    tjz   r31, <bb_mp_print_str_1>  --+
    2a60: d3 a1 03 fb    rdlong  r0, r3                    |
    2a64: 04 a6 07 f1    add     r3, #4                    |
    2a68: d3 a7 03 fb    rdlong  r3, r3                    |
    2a6c: 2e a6 63 fd    calla   r3                        |
    2a70:                <bb_mp_print_str_1>:            <-+
    2a70: d2 df 03 f6    mov     r31, r2

ps: I still don't like the pattern above where a TJZ skips over the following JMP and otherwise falls through into it. If the target is close enough it should be able to use the reversed test condition and branch with a single instruction, continuing straight through otherwise, which avoids a branch penalty in one of the cases. That's another optimization that needs to be added in time. There are plenty of others too, including a conditional branch without the prior compare operation to set flags, by instead setting Z directly off the memory reads or ALU operations. That also avoids TJZ since you can do if_z JMP xxx and purely use absolute addressing (for this external memory scheme I have in mind).
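
A rough Python sketch of the kind of peephole I mean is below. It is purely illustrative and has nothing to do with how an LLVM pass would really be written: the program is a toy list of tuples, the function name is made up, only an unconditional jmp is folded, and the -256..255 range check assumes the TJZ/TJNZ offset is a signed 9-bit count of longs.

```python
INVERTED = {"tjz": "tjnz", "tjnz": "tjz"}


def fold_test_and_jump(prog):
    """prog items are ('label', name) or (mnemonic, reg, target) tuples.

    Rewrites 'tjz r, L1 / jmp L2 / L1:' into 'tjnz r, L2 / L1:' when L2 is in range.
    """
    # Map each label to the index (in longs) of the instruction that follows it.
    addr, labels = 0, {}
    for item in prog:
        if item[0] == "label":
            labels[item[1]] = addr
        else:
            addr += 1

    out, i, pc = [], 0, 0
    while i < len(prog):
        item = prog[i]
        nxt = prog[i + 1] if i + 1 < len(prog) else None
        after = prog[i + 2] if i + 2 < len(prog) else None
        if (item[0] in INVERTED and nxt is not None and nxt[0] == "jmp"
                and after == ("label", item[2]) and nxt[2] in labels):
            # Offset in longs from the instruction after the test-and-jump.
            # Uses the original layout, which is conservative since folding
            # only ever shortens distances.
            offset = labels[nxt[2]] - (pc + 1)
            if -256 <= offset <= 255:
                out.append((INVERTED[item[0]], item[1], nxt[2]))
                i += 2   # consume both the test-and-jump and the jmp
                pc += 2
                continue
        out.append(item)
        if item[0] != "label":
            pc += 1
        i += 1
    return out


prog = [
    ("calla", None, "unichar_isalpha"),
    ("tjz", "r31", ".LBB32_7"),
    ("jmp", None, ".LBB32_6"),
    ("label", ".LBB32_7"),
    ("add", "r2", "#1"),
    ("label", ".LBB32_6"),
    ("rdbyte", "r0", "r5"),
]
for item in fold_test_and_jump(prog):
    print(item)
```

On this cut-down version of the .LBB32_5 block, the tjz/jmp pair becomes a single tjnz r31 to .LBB32_6 and execution falls straight through into .LBB32_7 otherwise.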
