LLVM Backend for Propeller 2 - Page 14 — Parallax Forums

LLVM Backend for Propeller 2


Comments

  • rogloh Posts: 6,357
    edited 2026-04-27 11:08

    @evanh said:
    The last one could then be:

      augd    #$3a201a8 >> 9
      wrlong  #$3a201a8 & $1ff, ptra[-4]
    

    Ideally it would be that, which saves 3 instructions and 6 clocks compared with what was actually compiled in the code above. That adds up when function calls set up lots of arguments, including immediates. I think it is done the current way so that the stack is adjusted atomically, in one go, once the call is ready, in case earlier arguments and locals referenced off the SP are still needed while the new arguments are being built. But it's very inefficient in code size IMO, and this is where having a BP could come in handy.
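
    The split in the quoted pair can be checked with a few lines of C. This is only a sketch of the arithmetic: AUGD carries the upper 23 bits of the 32-bit immediate and the following instruction's own 9-bit field carries the rest. The function names are made up for the example.

```c
#include <stdint.h>

/* Upper 23 bits of a 32-bit immediate: the value an AUGD prefix carries. */
uint32_t augd_part(uint32_t imm)
{
    return imm >> 9;
}

/* Lower 9 bits: what fits in the instruction's own immediate field. */
uint32_t low9_part(uint32_t imm)
{
    return imm & 0x1ffu;
}

/* Recombine the two parts into the original 32-bit immediate. */
uint32_t rejoin(uint32_t aug, uint32_t low9)
{
    return (aug << 9) | (low9 & 0x1ffu);
}
```

    For the constant in the example, augd_part(0x3a201a8) gives 0x1d100 and low9_part(0x3a201a8) gives 0x1a8.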

  • rogloh Posts: 6,357

    I sort of wonder whether the called function's prologue and epilogue would be best done in COGRAM/LUTRAM, to reduce the code size overhead. We could easily pass the number of registers to save in PA with something like this:

    called function prologue:
    callpa #5, enter_handler

    called function epilogue:
    callpa #5, leave_handler

    Then in COG/LUT the enter_handler could block write the registers to be saved. The problem is gaps in the set of registers to be saved/restored. But it could simply be brute forced and save them anyway, as that costs only a single clock per extra register saved, rather than dealing with masks and looking for ranges to save etc.
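
    To put a number on the brute-force idea, here is a rough C sketch. It assumes (purely for illustration, this is not something the current backend does) that the registers a function needs saved are given as a bit mask, with bit 0 being the first saveable register, and that the count passed via CALLPA covers everything from that first register up to the highest one used. Both function names are hypothetical.

```c
#include <stdint.h>

/* Count for the hypothetical enter_handler: registers 0..highest used. */
unsigned span_count(uint32_t used_mask)
{
    unsigned n = 0;
    while (used_mask != 0) {   /* position of the highest set bit, plus one */
        n++;
        used_mask >>= 1;
    }
    return n;
}

/* Registers saved needlessly because they sit in gaps of the mask. */
unsigned wasted_saves(uint32_t used_mask)
{
    unsigned used = 0;
    for (uint32_t m = used_mask; m != 0; m >>= 1)
        used += m & 1u;        /* count the registers actually needed */
    return span_count(used_mask) - used;
}
```

    With registers 0, 1 and 4 in use (mask 0x13) the count passed would be 5, of which 2 saves are wasted, so about two extra clocks at one clock per extra register.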

    The leave_handler could potentially also pop the return address from the stack and return to the caller directly, bypassing the need for a RETA at the end of the function.

    Just some musings on this...

  • evanh Posts: 17,196

    @rogloh said:
    Just some musings on this...

    I've used the forum as a notebook at times. Both for laying out the reasons/steps and for later reference.

  • rogloh Posts: 6,357
    edited 2026-04-27 14:05

    Just added a bunch of the previously unused P2 instructions to the P2LLVM code in P2InstrInfo.td. A few instructions, commented out at the bottom, are still TBD as they are a little more involved. I've also added a couple of extra instruction formats to P2InstrFormats.td, such as P2InstNOARGS for the few instructions which take no arguments, as the code doesn't seem to have that yet. If anyone needs these for their own use, they are copied here for now, but beware that they are not all tested, so any bit errors below could assemble to bad code.

    /*--------------------------------*/
    /* extra P2 instructions          */
    /*--------------------------------*/
    
    defm ADDPIX     : P2InstIDSm<0b101001000, "addpix\t">;
    defm MULPIX     : P2InstIDSm<0b101001001, "mulpix\t">;
    defm BLNPIX     : P2InstIDSm<0b101001010, "blnpix\t">;
    defm MIXPIX     : P2InstIDSm<0b101001011, "mixpix\t">;
    
    def ALLOWI      : P2InstNOARGS<0b1101011000000100000000100100, (outs), (ins), "allowi\t">;
    def STALLI      : P2InstNOARGS<0b1101011000000100001000100100, (outs), (ins), "stalli\t">;
    def TRGINT1     : P2InstNOARGS<0b1101011000000100010000100100, (outs), (ins), "trgint1\t">;
    def TRGINT2     : P2InstNOARGS<0b1101011000000100011000100100, (outs), (ins), "trgint2\t">;
    def TRGINT3     : P2InstNOARGS<0b1101011000000100100000100100, (outs), (ins), "trgint3\t">;
    def NIXINT1     : P2InstNOARGS<0b1101011000000100101000100100, (outs), (ins), "nixint1\t">;
    def NIXINT2     : P2InstNOARGS<0b1101011000000100110000100100, (outs), (ins), "nixint2\t">;
    def NIXINT3     : P2InstNOARGS<0b1101011000000100111000100100, (outs), (ins), "nixint3\t">;
    
    defm ALTSN      : P2InstIDSm<0b100101010, "altsn\t">;
    defm ALTGN      : P2InstIDSm<0b100101011, "altgn\t">;
    defm ALTSB      : P2InstIDSm<0b100101100, "altsb\t">;
    defm ALTGB      : P2InstIDSm<0b100101101, "altgb\t">;
    defm ALTSW      : P2InstIDSm<0b100101110, "altsw\t">;
    defm ALTGW      : P2InstIDSm<0b100101111, "altgw\t">;
    
    defm SETLUTS    : P2InstLDm<0b1101011, 0b000110111, "setluts\t">;
    defm SETCY      : P2InstLDm<0b1101011, 0b000111000, "setcy\t">;
    defm SETCI      : P2InstLDm<0b1101011, 0b000111001, "setci\t">;
    defm SETCQ      : P2InstLDm<0b1101011, 0b000111010, "setcq\t">;
    defm SETCFRQ    : P2InstLDm<0b1101011, 0b000111011, "setcfrq\t">;
    defm SETCMOD    : P2InstLDm<0b1101011, 0b000111100, "setcmod\t">;
    defm SETPIV     : P2InstLDm<0b1101011, 0b000111101, "setpiv\t">;
    defm SETPIX     : P2InstLDm<0b1101011, 0b000111110, "setpix\t">;
    
    defm SETPAT     : P2InstLIDSm<0b10111111,"setpat\t">;
    defm SETDACS    : P2InstLDm<0b1101011, 0b000011100, "setdacs\t">;
    def GETXACC     : P2InstD<0b1101011, 0b000011110, (outs P2GPR:$d), (ins), "getxacc\t$d">;
    
    defm INCMOD     : P2InstCZIDSm<0b0111000, "incmod\t">;
    defm DECMOD     : P2InstCZIDSm<0b0111001, "decmod\t">;
    
    defm FBLOCK     : P2InstLIDSm<0b11001001, "fblock\t">;
    defm SETSCP     : P2InstLDm<0b1101011, 0b001110000, "setscp\t">;
    defm SETINT1    : P2InstLDm<0b1101011, 0b000100101, "setint1\t">;
    defm SETINT2    : P2InstLDm<0b1101011, 0b000100110, "setint2\t">;
    defm SETINT3    : P2InstLDm<0b1101011, 0b000100111, "setint3\t">;
    
    def GETPTR      : P2InstD<0b1101011, 0b000110100, (outs P2GPR:$d), (ins), "getptr\t$d">;
    def GETSCP      : P2InstD<0b1101011, 0b001110001, (outs P2GPR:$d), (ins), "getscp\t$d">;
    
    defm CMPM       : P2InstCZIDSm<0b0010101, "cmpm\t">;
    defm CMPSUB     : P2InstCZIDSm<0b0010111, "cmpsub\t">;
    
    def RCZR        : P2InstCZD<0b1101011, 0b001101010, (outs P2GPR:$d), (ins), "rczr\t$d">;
    def RCZL        : P2InstCZD<0b1101011, 0b001101011, (outs P2GPR:$d), (ins), "rczl\t$d">;
    
    def RFVAR       : P2InstCZD<0b1101011, 0b000010011, (outs P2GPR:$d), (ins), "rfvar\t$d">;
    def RFVARS      : P2InstCZD<0b1101011, 0b000010100, (outs P2GPR:$d), (ins), "rfvars\t$d">;
    
    defm SCA         : P2InstZIDSm<0b10100010, (ins), "sca\t">;
    defm SCAS        : P2InstZIDSm<0b10100011, (ins), "scas\t">;
    
    defm MUXC       : P2InstCZIDSm<0b0101100, "muxc\t">;
    defm MUXNC      : P2InstCZIDSm<0b0101101, "muxnc\t">;
    defm MUXZ       : P2InstCZIDSm<0b0101110, "muxz\t">;
    defm MUXNZ      : P2InstCZIDSm<0b0101111, "muxnz\t">;
    
    defm NEGC       : P2InstCZIDSm<0b0110100, "negc\t">;
    defm NEGNC      : P2InstCZIDSm<0b0110101, "negnc\t">;
    defm NEGZ       : P2InstCZIDSm<0b0110110, "negz\t">;
    defm NEGNZ      : P2InstCZIDSm<0b0110111, "negnz\t">;
    
    defm JINT       : P2InstISm<0b101111001, 0b000000000, "jint\t">;
    defm JCT1       : P2InstISm<0b101111001, 0b000000001, "jct1\t">;
    defm JCT2       : P2InstISm<0b101111001, 0b000000010, "jct2\t">;
    defm JCT3       : P2InstISm<0b101111001, 0b000000011, "jct3\t">;
    defm JSE1       : P2InstISm<0b101111001, 0b000000100, "jse1\t">;
    defm JSE2       : P2InstISm<0b101111001, 0b000000101, "jse2\t">;
    defm JSE3       : P2InstISm<0b101111001, 0b000000110, "jse3\t">;
    defm JSE4       : P2InstISm<0b101111001, 0b000000111, "jse4\t">;
    defm JPAT       : P2InstISm<0b101111001, 0b000001000, "jpat\t">;
    defm JFBW       : P2InstISm<0b101111001, 0b000001001, "jfbw\t">;
    defm JXMT       : P2InstISm<0b101111001, 0b000001010, "jxmt\t">;
    defm JXFI       : P2InstISm<0b101111001, 0b000001011, "jxfi\t">;
    defm JXRO       : P2InstISm<0b101111001, 0b000001100, "jxro\t">;
    defm JXRL       : P2InstISm<0b101111001, 0b000001101, "jxrl\t">;
    defm JATN       : P2InstISm<0b101111001, 0b000001110, "jatn\t">;
    defm JQMT       : P2InstISm<0b101111001, 0b000001111, "jqmt\t">;
    defm JNINT      : P2InstISm<0b101111001, 0b000010000, "jnint\t">;
    defm JNCT1      : P2InstISm<0b101111001, 0b000010001, "jnct1\t">;
    defm JNCT2      : P2InstISm<0b101111001, 0b000010010, "jnct2\t">;
    defm JNCT3      : P2InstISm<0b101111001, 0b000010011, "jnct3\t">;
    defm JNSE1      : P2InstISm<0b101111001, 0b000010100, "jnse1\t">;
    defm JNSE2      : P2InstISm<0b101111001, 0b000010101, "jnse2\t">;
    defm JNSE3      : P2InstISm<0b101111001, 0b000010110, "jnse3\t">;
    defm JNSE4      : P2InstISm<0b101111001, 0b000010111, "jnse4\t">;
    defm JNPAT      : P2InstISm<0b101111001, 0b000011000, "jnpat\t">;
    defm JNFBW      : P2InstISm<0b101111001, 0b000011001, "jnfbw\t">;
    defm JNXMT      : P2InstISm<0b101111001, 0b000011010, "jnxmt\t">;
    defm JNXFI      : P2InstISm<0b101111001, 0b000011011, "jnxfi\t">;
    defm JNXRO      : P2InstISm<0b101111001, 0b000011100, "jnxro\t">;
    defm JNXRL      : P2InstISm<0b101111001, 0b000011101, "jnxrl\t">;
    defm JNATN      : P2InstISm<0b101111001, 0b000011110, "jnatn\t">;
    defm JNQMT      : P2InstISm<0b101111001, 0b000011111, "jnqmt\t">;
    
    defm TESTN      : P2InstCZIDSm<0b0111111, "testn\t">;
    defm WMLONG     : P2InstIDSm<0b101001111, "wmlong\t">;
    
    // TODO:
    // BITC BITH BITL BITNC BITNOT BITNZ BITRND BITZ 
    // CALLB CALLD CALLPA CALLPB 
    
    // DIRC DIRNC DIRNZ DIRRND DIRZ
    // DRVC DRVNC DRVNZ DRVRND DRVZ 
    // FLTC FLTNC FLTNZ FLTRND FLTZ 
    // OUTC OUTNC OUTNZ OUTRND OUTZ 
    
    // TJF TJNF TJNS TJS TJV 
    
    // aliases RESI0 RESI1 RESI2 RESI3 RETI0 RETI1 RETI2 RETI3 XSTOP
    

    Here are the new classes for P2InstrFormats.td (one for no arguments, one for the special MODCZ arguments):

    class P2InstNOARGS<bits<28> op, dag outs, dag inputs, string asmstr> : P2Inst<21, outs, !con(inputs, (ins P2Cond:$cc)), !strconcat("$cc\t", asmstr)> {
        bits<4> cc;
    
        let Inst{31-28} = cc;
        let Inst{27-0} = op;
    
        let TSFlags{5} = 0;
        let TSFlags{6} = 0;
        let TSFlags{7} = 0;
    
        let TSFlags{10-8} = 0; // s is always operand 0
        let TSFlags{13-11} = 0; // d is always operand 0
        let TSFlags{16-14} = 0; // n is always operand 0
    }
    
    class P2InstCZ4C4Z<bits<7> op, bits<9> s, dag outs, dag inputs, string asmstr> :
        P2Inst<20, outs, !con(inputs, (ins P2Cond:$cc, P2Effect:$cz)), !strconcat("$cc\t", asmstr, " $cz")> {
    
        bits<20> n20;
        bits<4> cc;
        bits<2> cz;
        bits<4> cccc;
        bits<4> zzzz;
    
        let Inst{31-28} = cc;
        let Inst{27-21} = op;
        let Inst{20-19} = cz;
        let Inst{18-17} = 0b10;
        let Inst{16-13} = cccc;
        let Inst{12-9} = zzzz;
        let Inst{8-0} = s;
    
        let TSFlags{5} = 0;
        let TSFlags{6} = 0;
        let TSFlags{7} = 0;
    
        let TSFlags{10-8} = 0; // s is always operand 0
        let TSFlags{13-11} = 0; // d is always operand 0
        let TSFlags{16-14} = 0; // n is always operand 0
    }
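
    The bit layout of the two classes can be mirrored in plain C to sanity check an encoding by hand. This is a sketch only: the field positions are copied from the Inst{...} assignments above, and the function names are invented for the example.

```c
#include <stdint.h>

/* P2InstNOARGS: cc in bits 31-28, fixed 28-bit pattern in bits 27-0. */
uint32_t pack_noargs(uint32_t cc, uint32_t op28)
{
    return ((cc & 0xfu) << 28) | (op28 & 0x0fffffffu);
}

/* P2InstCZ4C4Z: same field positions as the class definition. */
uint32_t pack_cz4c4z(uint32_t cc, uint32_t op7, uint32_t cz,
                     uint32_t cccc, uint32_t zzzz, uint32_t s9)
{
    return ((cc   & 0xfu)  << 28) |
           ((op7  & 0x7fu) << 21) |
           ((cz   & 0x3u)  << 19) |
           (0x2u           << 17) |   /* Inst{18-17} = 0b10 */
           ((cccc & 0xfu)  << 13) |
           ((zzzz & 0xfu)  <<  9) |
           (s9 & 0x1ffu);
}
```

    With cc = 0b1111 and the ALLOWI pattern 0b1101011000000100000000100100 (0xD604024), pack_noargs gives 0xFD604024, i.e. the bytes 24 40 60 fd in little-endian order. For the second class, op = 0b1101011 and s = 0b001101111 are the MODCZ values from the P2 instruction table (not stated in this thread), so treat that part as an assumption.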
    
    
    
    
  • Rayman Posts: 16,345

    @rogloh Nice! It is just inline assembly where this could go wrong, right? Think so.

  • rogloh Posts: 6,357
    edited 2026-04-27 14:27

    @Rayman said:
    @rogloh Nice! It is just inline assembly where this could go wrong, right? Think so.

    Yes. C doesn't use these, as there is no pattern defined that uses these instructions. The assembler and disassembler should be able to take these and work with these formats. I have changed the tab spacing a little, which works with the rest of my code changes but may upset the output format slightly for these in isolation in the objdump until the remainder of the code changes I am working on get applied. That's cosmetic only. Also, the code above has been fixed for some typos. It's now building ok on my Mac.

    Update: I just tried a few of these new instructions for sanity (ignore the modcz stuff which I'm still messing with):

        asm volatile (  
                  //"modc    _nc_and_z wc\n"
                  //"modz    _nc_and_nz wz\n"
                  //"modcz   $3,$5 wcz\n"
    
                    "addpix  r0, r2\n"
                    "mulpix  r0, r2\n"
                    "mulpix  r0, r2\n"
                    "allowi\n"
                    "stalli\n"
                    "trgint1\n"
                    "setluts #4\n"
                    "setpat r3,r5\n"
                    "fblock r3,#4\n"
                    "cmpm r3, #4\n"
                    "rczl r3 wc\n"
                    "sca r3,r19\n"
                    "negc r3, #22 wc\n"
                    "jint #3\n"
                    "testn r3,r4\n"
    ...
    

    and I get this disassembly:

       331a0: d2 a1 43 fa                       addpix  r0, r2
       331a4: d2 a1 4b fa                       mulpix  r0, r2
       331a8: d2 a1 4b fa                       mulpix  r0, r2
       331ac: 24 40 60 fd                       allowi  
       331b0: 24 42 60 fd                       stalli  
       331b4: 24 44 60 fd                       trgint1 
       331b8: 37 08 64 fd                       setluts #4
       331bc: d5 a7 f3 fb                       setpat  r3, r5
       331c0: 04 a6 97 fc                       fblock  r3, #4
       331c4: 04 a6 a7 f2                       cmpm    r3, #4
       331c8: 6b a6 73 fd                       rczl    r3 wc
       331cc: e3 a7 23 fa                       sca r3, r19 
       331d0: 16 a6 97 f6                       negc    r3, #$16 wc
       331d4: 03 00 cc fb                       jint    #3
       331d8: d4 a7 e3 f7                       testn   r3, r4
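
    A couple of these words can be re-derived by hand as a cross-check. The sketch below assumes the usual P2 two-operand layout (cc in bits 31-28, a 9-bit opcode in 27-19, the I bit at 18, D in 17-9, S in 8-0) and reads the register addresses $1d0 for r0 and $1d2 for r2 back from the listing itself, so those are inferred rather than taken from the backend source. The function names are made up.

```c
#include <stdint.h>

/* Assumed layout: cc[31:28] op9[27:19] I[18] D[17:9] S[8:0]. */
uint32_t pack_ids(uint32_t cc, uint32_t op9, uint32_t imm,
                  uint32_t d9, uint32_t s9)
{
    return ((cc  & 0xfu)   << 28) |
           ((op9 & 0x1ffu) << 19) |
           ((imm & 0x1u)   << 18) |
           ((d9  & 0x1ffu) <<  9) |
           (s9 & 0x1ffu);
}

/* Byte i (0 = first printed) of a word as objdump lists it, little-endian. */
uint8_t listing_byte(uint32_t word, unsigned i)
{
    return (uint8_t)(word >> (8u * i));
}
```

    pack_ids(0xf, 0b101001000, 0, 0x1d0, 0x1d2) gives 0xFA43A1D2, which lists as d2 a1 43 fa (addpix r0, r2), and pack_ids(0xf, 0b101111001, 1, 0, 3) gives 0xFBCC0003, which lists as 03 00 cc fb (jint #3).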
    