Propeller II update - BLOG

jmg · 2014-03-13 13:41

jmg wrote: »

Instinct says a pick-pair opcode should also be useful for Quadrature Encoders, but I'm not seeing an elegant outcome yet...

Found a solution that is state-based (and also a good argument for making a macro assembler standard).

The idle state is 2 lines (test, and jump to same) and
the change action is 3 lines (test, inc/dec, jump to new),
approx. 28 lines of code for each Quad block.
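
As a rough illustration of the state-based idea, here is a Python model (not Propeller code). Each of the four A/B pin combinations is its own state: while the pins are unchanged the decoder stays put, and on a change it increments or decrements and moves to the new state. The transition order, the count direction, and the names `FORWARD`, `STEP` and `decode` are assumptions made for this sketch.

```python
# Hypothetical model of a state-based quadrature decoder.
# A state is the last sampled pin pair, encoded as (A << 1) | B.
# Assumed "forward" Gray-code sequence: 00 -> 01 -> 11 -> 10 -> 00.
FORWARD = [0b00, 0b01, 0b11, 0b10]

# STEP[(old, new)] is +1 or -1. Pairs that are absent are illegal double steps.
STEP = {}
for i, state in enumerate(FORWARD):
    nxt = FORWARD[(i + 1) % 4]
    STEP[(state, nxt)] = +1   # one step forward
    STEP[(nxt, state)] = -1   # one step backward


def decode(samples, start=0b00):
    """Return the net position after feeding 2-bit pin samples in order."""
    state = start
    position = 0
    for sample in samples:
        if sample == state:
            continue                               # idle: test, stay in same state
        position += STEP.get((state, sample), 0)   # change: inc/dec (illegal jumps ignored)
        state = sample                             # jump to the new state
    return position
```

One full forward cycle, `decode([0b01, 0b11, 0b10, 0b00])`, counts four steps; the same cycle in reverse counts minus four.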

If Chip has the index choice, then that 28 lines may be mostly share-able across 3 Quad instances.

Could do 3 threads of Quad, leaving 4th for serious lifting.

At a USB compatible, leisurely 48MHz, main thread is (say) ~30 mips and each quad can run at ~6 mips.
I'm not sure how jumps and threads play together in the fine print, but that's roughly 3 x 1MHz quad-capable + Main.
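
The arithmetic behind those figures, as a back-of-envelope sketch in Python. It assumes the cog issues one instruction per clock shared between the threads, and a worst case of about six instructions per decoded edge; both are assumptions inferred from the numbers above, not documented chip behaviour.

```python
CLOCK_MHZ = 48        # USB-friendly clock from the post
MAIN_MIPS = 30        # share given to the main thread (figure from the post)
QUAD_THREADS = 3      # three quadrature threads

# Assumption: one instruction per clock, split across the threads.
quad_mips = (CLOCK_MHZ - MAIN_MIPS) / QUAD_THREADS

# Assumption: roughly six instructions executed per encoder edge, worst case.
INSTRUCTIONS_PER_EDGE = 6
max_edge_rate_mhz = quad_mips / INSTRUCTIONS_PER_EDGE
```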

Hard not to be impressed

Cluso99 · 2014-03-13 14:41

Bill Henning wrote: »
Looks great!

Allows trivial decoding of differential digital input pair...
         PICKZC ina,#16 ' 16&17 are differential +/- pair
if_00    jmp    #S0       ' SE0 state
if_01    RCL   data,#1   ' received a 1
if_10    SHL   data,#1   ' received a 0
if_11    jmp    #S1

This would be better in this case (but not useful for USB because you also need the previous state of the pair)

         PICKZC ina,#16 ' 16&17 are differential +/- pair
if_diff  RCR   data,#1   ' save received bit (LSB first)
if_same  jmp    #S0S1   ' jmp if SE0 or SE1
.....

S0S1 if_11  jmp   #S1
S0...

S1...

There are a lot of times where I have needed to decode pairs of bits and needed the full 4 cases. This instruction will help a lot.
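
To make the flow of the second snippet concrete, here is a small Python model of it. The mapping of pins to flags is an assumption (C takes the lower bit of the pair, Z the upper one), since the thread does not spell it out; `pickzc`, `rcr` and `receive` are names made up for this sketch, and the register is 8 bits wide here only to keep the example readable.

```python
def pickzc(value, bit):
    """Model of the proposed PICKZC: copy two adjacent bits of value into (Z, C).

    Assumption: C takes bit `bit` and Z takes bit `bit + 1`.
    """
    c = (value >> bit) & 1
    z = (value >> (bit + 1)) & 1
    return z, c


def rcr(data, carry, width=8):
    """Rotate-carry-right by one place: the carry enters at the top bit."""
    return (data >> 1) | (carry << (width - 1))


def receive(pin_samples, bit=16, width=8):
    """Shift in bits while the pair differs; stop when both lines are equal."""
    data = 0
    for ina in pin_samples:
        z, c = pickzc(ina, bit)
        if z == c:                    # if_same: SE0 or SE1, leave the bit loop
            return data, ("SE1" if c else "SE0")
        data = rcr(data, c, width)    # if_diff: save received bit, LSB first
    return data, None
```

Feeding eight differential samples followed by both lines low returns the assembled byte together with `"SE0"`.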

BTW I think SETZC is still a better name than PICKZC. If it is not too much trouble, I suggest the compiler permit SETZC <reg> as SETZC <reg>,#0.

Bill Henning · 2014-03-13 14:49

Thanks, that looks way better.

Yep, I forgot about USB having to remember the previous bit.

Cluso99 wrote: »
This would be better in this case (but not useful for USB because you also need the previous state of the pair)
         PICKZC ina,#16 ' 16&17 are differential +/- pair
if_diff  RCR   data,#1   ' save received bit (LSB first)
if_same  jmp    #S0S1   ' jmp if SE0 or SE1
.....

S0S1 if_11  jmp   #S1
S0...

S1...
There are a lot of times where I have needed to decode pairs of bits and needed the full 4 cases. This instruction will help a lot.

BTW I think SETZC is still a better name than PICKZC. If it is not too much trouble, I suggest the compiler permit SETZC <reg> as SETZC <reg>,#0.

Cluso99 · 2014-03-13 14:58

Bill, Did you notice the subtle change of RCL to RCR?
Because the LSB is received first (I realised this error in my USB code a day or two ago).
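
A quick Python check of the bit-order point, using an 8-bit register for readability: when the least significant bit arrives first, shifting right (as RCR does) rebuilds the byte, while shifting left (as RCL does) leaves it bit-reversed. The function name `shift_in` and the test value are illustrative only.

```python
def shift_in(bits, direction, width=8):
    """Shift received bits into a register of the given width.

    direction "right" models RCR (new bit enters at the top),
    direction "left" models RCL (new bit enters at the bottom).
    """
    mask = (1 << width) - 1
    data = 0
    for b in bits:
        if direction == "right":
            data = (data >> 1) | (b << (width - 1))
        else:
            data = ((data << 1) | b) & mask
    return data


byte = 0x16
lsb_first = [(byte >> i) & 1 for i in range(8)]   # order on the wire, LSB first

right = shift_in(lsb_first, "right")   # reproduces the original byte
left = shift_in(lsb_first, "left")     # bit-reversed byte
```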

Bill Henning · 2014-03-13 15:07

Yep. I was trying for a small, simple generic piece of sample code... yours is simpler and has the right bit-order for USB.

I am thrilled at the SETZC / PICKZC instruction... it will make a lot of decoding easier!

Cluso99 wrote: »

Bill, Did you notice the subtle change of RCL to RCR?
Because the LSB is received first (I realised this error in my USB code a day or two ago).

Cluso99 · 2014-03-13 15:32

Bill Henning wrote: »

Yep. I was trying for a small, simple generic piece of sample code... yours is simpler and has the right bit-order for USB

I am thrilled at the SETZC / PICKZC instruction... it will make a lot of decoding easier!

Yes. SETZC / PICKZC and RESD / RESULT are both simple & elegant, and extremely useful.

I really love the extra extension of RESD / RESULT to be used with the first following executed instruction, permitting a set of alternate conditional instructions. It's the little gems like these that continue to boost the performance for those of us who wish to spend the time.

Sapieha · 2014-03-13 15:36

They are very useful for PLC programming.

Cluso99 wrote: »

Yes. SETZC / PICKZC and RESD / RESULT are both simple & elegant, and extremely useful.

I really love the extra extension of RESD / RESULT to be used with the first following executed instruction, permitting a set of alternate conditional instructions. It's the little gems like these that continue to boost the performance for those of us who wish to spend the time.

cgracey · 2014-03-13 19:16

jmg wrote: »

You have to love tool flows, that are somehow not quite the same across families...
Google does not find much, might be time to contact Altera to find the magic preserve button ?

Contact Altera? They've made that an exercise in futility.

They've got lots of online help, but I can never find anything I need. One other thing I have never been able to find is any kind of benchmark comparisons between device families. If they don't want to "talk" about something, you will find no mention of the matter, anywhere. It all seems very controlled.

Cluso99 · 2014-03-13 22:04

Here is an updated Instruction Summary with...
* Column for WZ & WC now has it for each op
* adds new instructions REPD & PICKZC (note PICKZC may not be correct opcode bits)
* removes instruction SETZC
Tip to view without wrap - reduce text size (In IE Ctrl-Scroll)

Propeller II Instructions as of 2014/03/12 (+ REPD, PICKZC)
----------------------------------------------------------------------------------------------------------------------------
ZCxS Opcode  ZC I Cond  Dest       Source     Instr00 01      10      11        Operand(s)                       Flags
----------------------------------------------------------------------------------------------------------------------------
ZCWS 00000ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  RDBYTE  RDBYTEC RDWORD  RDWORDC   D,S/PTRA/PTRB                    ZC ZC ZC ZC 
ZCWS 00001ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  RDLONG  RDLONGC RDAUX   RDAUXR    D,S/PTRA/PTRB                    ZC ZC ZC ZC 
ZCMS 00010ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  ISOB    NOTB    CLRB    SETB      D,S/#                            ZC ZC ZC ZC 
ZCMS 00011ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  SETBC   SETBNC  SETBZ   SETBNZ    D,S/#                            ZC ZC ZC ZC 
ZCMS 00100ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  ANDN    AND     OR      XOR       D,S/#                            ZC ZC ZC ZC 
ZCMS 00101ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  MUXC    MUXNC   MUXZ    MUXNZ     D,S/#                            ZC ZC ZC ZC 
ZCMS 00110ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  ROR     ROL     SHR     SHL       D,S/#                            ZC ZC ZC ZC 
ZCMS 00111ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  RCR     RCL     SAR     REV       D,S/#                            ZC ZC ZC ZC 
ZCWS 01000ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  MOV     NOT     ABS     NEG       D,S/#                            ZC ZC ZC ZC 
ZCWS 01001ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  NEGC    NEGNC   NEGZ    NEGNZ     D,S/#                            ZC ZC ZC ZC 
ZCMS 01010ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  ADD     SUB     ADDX    SUBX      D,S/#                            ZC ZC ZC ZC 
ZCMS 01011ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  ADDS    SUBS    ADDSX   SUBSX     D,S/#                            ZC ZC ZC ZC 
ZCMS 01100ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  SUMC    SUMNC   SUMZ    SUMNZ     D,S/#                            ZC ZC ZC ZC 
ZCMS 01101ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  MIN     MAX     MINS    MAXS      D,S/#                            ZC ZC ZC ZC 
ZCMS 01110ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  ADDABS  SUBABS  INCMOD  DECMOD    D,S/#                            ZC ZC ZC ZC 
ZCMS 01111ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  CMPSUB  SUBR    MUL     SCL       D,S/#                            ZC ZC ZC ZC 
ZCWS 10000ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  DECOD2  DECOD3  DECOD4  DECOD5    D,S/#                            ZC ZC ZC ZC 
----------------------------------------------------------------------------------------------------------------------------
Z-WS 1000100 Zf I CCCC  DDDDDDDDD  SSSSSSSSS  ENCOD   BLMASK                    D,S/#                            Z- Z-       
Z-WS 1000101 Zf I CCCC  DDDDDDDDD  SSSSSSSSS  ONECNT  ZERCNT                    D,S/#                            Z- Z-       
-CWS 1000110 fC I CCCC  DDDDDDDDD  SSSSSSSSS  INCPAT          DECPAT            D,S/#                            -C    --    
--WS 1000111 ff I CCCC  DDDDDDDDD  SSSSSSSSS  SPLITB  MERGEB  SPLITW  MERGEW    D,S/#                            -- -- -- -- 
--MS 10010nn nf I CCCC  DDDDDDDDD  SSSSSSSSS  GETNIB  SETNIB                    D,S/#,#0..7                      -- --       
--MS 1001100 nf I CCCC  DDDDDDDDD  SSSSSSSSS  GETWORD SETWORD                   D,S/#,#0..1                      -- --       
--MS 1001101 ff I CCCC  DDDDDDDDD  SSSSSSSSS  SETWRDS ROLNIB  ROLBYTE ROLWORD   D,S/#                            -- -- -- -- 
--MS 1001110 ff I CCCC  DDDDDDDDD  SSSSSSSSS  SETS    SETD    SETCOND SETINST   D,S/#                            -- -- -- -- 
--WS 1001111 ff I CCCC  DDDDDDDDD  SSSSSSSSS  <empty> THALT                     D,S/#                            -- --       
--MS 101000n nf I CCCC  DDDDDDDDD  SSSSSSSSS  GETBYTE SETBYTE                   D,S/#,#0..3                      -- --       
--WS 1010010 ff I CCCC  DDDDDDDDD  SSSSSSSSS  SETBYTS MOVBYTS PACKRGB UNPKRGB   D,S/#                            -- -- -- -- 
--MS 1010011 ff I CCCC  DDDDDDDDD  SSSSSSSSS  ADDPIX  MULPIX  BLNPIX  MIXPIX    D,S/#                            -- -- -- -- 
ZCMS 1010100 ZC I CCCC  DDDDDDDDD  SSSSSSSSS  JMPSW                             D,S/@                            ZC          
ZCMS 1010101 ZC I CCCC  DDDDDDDDD  SSSSSSSSS  JMPSWD                            D,S/@                            ZC          
--MS 1010110 ff I CCCC  DDDDDDDDD  SSSSSSSSS  DJZ     DJZD    DJNZ    DJNZD     D,S/@                            -- -- -- -- 
--RS 1010111 ff I CCCC  DDDDDDDDD  SSSSSSSSS  JZ      JZD     JNZ     JNZD      D,S/@                            -- -- -- -- 
ZCRS 10110ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  TESTB   TESTN   TEST    CMP       D,S/#                            ZC ZC ZC ZC 
ZCRS 10111ff ZC I CCCC  DDDDDDDDD  SSSSSSSSS  CMPX    CMPS    CMPSX   CMPR      D,S/#                            ZC ZC ZC ZC 
-CRS 11000fn nC I CCCC  DDDDDDDDD  SSSSSSSSS  WAITPEQ WAITPNE                   D,S/#,#0..3                      -C -C       
---S 110010f nn I CCCC  nnnnnnnnn  SSSSSSSSS  WAITVID WAITVID                   #0..$DFF,S/#                     -- --       
--LS 110011f fL I CCCC  DDDDDDDDD  SSSSSSSSS  WRBYTE  WRWORD  WRLONG  WRWIDE    D/#,S/PTRA/PTRB                  -- -- -- -- 
--LS 110100f fL I CCCC  DDDDDDDDD  SSSSSSSSS  WRAUX   WRAUXR  SETACCA SETACCB   D/#,S/#0..$FF/PTRX/PTRY          -- -- -- -- 
--LS 110101f fL I CCCC  DDDDDDDDD  SSSSSSSSS  MACA    MACB    MUL32   MUL32U    D/#,S/#                          -- -- -- -- 
--LS 110110f fL I CCCC  DDDDDDDDD  SSSSSSSSS  DIV32   DIV32U  DIV64   DIV64U    D/#,S/#                          -- -- -- -- 
--LS 110111f fL I CCCC  DDDDDDDDD  SSSSSSSSS  SQRT64  QSINCOS QARCTAN QROTATE   D/#,S/#                          -- -- -- -- 
--LS 111000n nL I CCCC  DDDDDDDDD  SSSSSSSSS  CFGPINS cfgpins cfgpins SETMAP    D/#,S/#,#0..2                    -- -- -- -- 
--LS 111001f fL I CCCC  DDDDDDDDD  SSSSSSSSS  SETSERA SETSERB SETCTRS SETWAVS   D/#,S/#                          -- -- -- -- 
--LS 111010f fL I CCCC  DDDDDDDDD  SSSSSSSSS  SETFRQS SETPHSS ADDPHSS SUBPHSS   D/#,S/#                          -- -- -- -- 
--LS 111011f fL I CCCC  DDDDDDDDD  SSSSSSSSS  SETXFR  SETMIX  COGRUN  COGRUNX   D/#,S/#                          -- -- -- -- 
--LS 111100f fL I CCCC  DDDDDDDDD  SSSSSSSSS  FRAC    <empty> <empty> <empty>   D/#,S/#                          -- -- -- -- 
--LS 111101f fL I CCCC  DDDDDDDDD  SSSSSSSSS  JP      JPD     JNP     JNPD      D/#,S/@                          -- -- -- -- 
--WS 1111100 ff I CCCC  DDDDDDDDD  SSSSSSSSS  LOCBASE LOCBYTE LOCWORD LOCLONG   D,S/@                            -- -- -- -- 
----------------------------------------------------------------------------------------------------------------------------
--W- 1111101 00 f CCCC  DDDDDDDDD  sssssssss  LOCINST JMPLIST                   D,@s                             -- --       
---- 1111101 01 0 BBAA  ddddddddd  sssssssss  FIXINDA                           #d,#s / FIXINDB #d,#s / FIXINDS  --          
---- 1111101 01 1 nnnn  nnnnnnnnn  nnniiiiii  REPS                              #1..$10000,#1..64                --          
---- 1111101 1f n nnnn  nnnnnnnnn  nnnnnnnnn  AUGS    AUGD                      #23bits                          -- --       
----------------------------------------------------------------------------------------------------------------------------
---- 1111110 00 0 CCCC  ffnnnnnnn  nnnnnnnnn  LOCPTRA LOCPTRA LOCPTRB LOCPTRB   #abs                             -- -- -- -- 
---- 1111110 00 1 CCCC  ffnnnnnnn  nnnnnnnnn  JMP     JMP     JMPD    JMPD      #abs                             -- -- -- -- 
---- 1111110 01 0 CCCC  ffnnnnnnn  nnnnnnnnn  LINK    LINK    LINKD   LINKD     {0} #abs                         -- -- -- -- 
---- 1111110 01 1 CCCC  ffnnnnnnn  nnnnnnnnn  CALL    CALL    CALLD   CALLD     #abs                             -- -- -- -- 
---- 1111110 10 0 CCCC  ffnnnnnnn  nnnnnnnnn  CALLA   CALLA   CALLAD  CALLAD    #abs                             -- -- -- -- 
---- 1111110 10 1 CCCC  ffnnnnnnn  nnnnnnnnn  CALLB   CALLB   CALLBD  CALLBD    #abs                             -- -- -- -- 
---- 1111110 11 0 CCCC  ffnnnnnnn  nnnnnnnnn  CALLX   CALLX   CALLXD  CALLXD    #abs                             -- -- -- -- 
---- 1111110 11 1 CCCC  ffnnnnnnn  nnnnnnnnn  CALLY   CALLY   CALLYD  CALLYD    #abs                             -- -- -- -- 
----------------------------------------------------------------------------------------------------------------------------
ZCW- 1111111 ZC 0 CCCC  DDDDDDDDD  0000000ff  COGID   TASKID  LOCKNEW GETLFSR   D                                ZC ZC ZC ZC 
ZCW- 1111111 ZC 0 CCCC  DDDDDDDDD  0000001ff  GETCNT  GETCNTX GETACAL GETACAH   D                                ZC ZC ZC ZC 
ZCW- 1111111 ZC 0 CCCC  DDDDDDDDD  0000010ff  GETACBL GETACBH GETPTRA GETPTRB   D                                ZC ZC ZC ZC 
ZCW- 1111111 ZC 0 CCCC  DDDDDDDDD  0000011ff  GETPTRX GETPTRY SERINA  SERINB    D                                ZC ZC ZC ZC 
ZCW- 1111111 ZC 0 CCCC  DDDDDDDDD  0000100ff  GETMULL GETMULH GETDIVQ GETDIVR   D                                ZC ZC ZC ZC 
ZCW- 1111111 ZC 0 CCCC  DDDDDDDDD  0000101ff  GETSQRT GETQX   GETQY   GETQZ     D                                ZC ZC ZC ZC 
ZCW- 1111111 ZC 0 CCCC  DDDDDDDDD  0000110ff  GETPHSA GETPHZA GETCOSA GETSINA   D                                ZC ZC ZC ZC 
ZCW- 1111111 ZC 0 CCCC  DDDDDDDDD  0000111ff  GETPHSB GETPHZB GETCOSB GETSINB   D                                ZC ZC ZC ZC 
ZCM- 1111111 ZC 0 CCCC  DDDDDDDDD  0001000ff  PUSHZC  POPZC   SUBCNT  GETPIX    D                                ZC ZC ZC ZC 
ZCM- 1111111 ZC 0 CCCC  DDDDDDDDD  0001001ff  BINBCD  BCDBIN  BINGRY  GRYBIN    D                                ZC ZC ZC ZC 
ZCM- 1111111 ZC 0 CCCC  DDDDDDDDD  0001010ff  ESWAP4  ESWAP8  SEUSSF  SEUSSR    D                                ZC ZC ZC ZC 
ZCM- 1111111 ZC 0 CCCC  DDDDDDDDD  0001011ff  INCD    DECD    INCDS   DECDS     D                                ZC ZC ZC ZC 
ZCW- 1111111 ZC 0 CCCC  DDDDDDDDD  0001100ff  POPT0   POPT1   POPT2   POPT3     D                                ZC ZC ZC ZC 
ZCW- 1111111 ZC 0 CCCC  DDDDDDDDD  0001101ff  POP     <empty> <empty> <empty>   D                                ZC ZC ZC ZC 
----------------------------------------------------------------------------------------------------------------------------
--L- 1111111 00 L CCCC  DDDDDDDDD  001iiiiii  REPD                              D/#1..512,#1..64                 --          
----------------------------------------------------------------------------------------------------------------------------
--L- 1111111 00 L CCCC  DDDDDDDDD  0100000ff  CLKSET  COGSTOP LOCKSET LOCKCLR   D/#                              -- -- -C -C 
--L- 1111111 00 L CCCC  DDDDDDDDD  0100001ff  LOCKRET RDWIDE  RDWIDEC RDWIDEQ   D/#                              -- -- -- -- 
ZCL- 1111111 ZC L CCCC  DDDDDDDDD  0100010ff  GETP    GETNP   SEROUTA SEROUTB   D/#                              ZC ZC -C -C 
-CL- 1111111 0C L CCCC  DDDDDDDDD  0100011ff  CMPCNT  WAITPX  WAITPR  WAITPF    D/#                              -C -C -C -C 
--L- 1111111 00 L CCCC  DDDDDDDDD  0100100ff  PUSH    <empty> SETXCH  SETTASK   D/#                              -- ZC -- -- 
--L- 1111111 00 L CCCC  DDDDDDDDD  0100101ff  SETRACE SARACCA SARACCB SARACCS   D/#                              -- -- -- -- 
--L- 1111111 00 L CCCC  DDDDDDDDD  0100110ff  SETPTRA SETPTRB ADDPTRA ADDPTRB   D/#                              -- -- -- -- 
--L- 1111111 00 L CCCC  DDDDDDDDD  0100111ff  SUBPTRA SUBPTRB SETWIDE SETWIDZ   D/#                              -- -- -- -- 
--L- 1111111 00 L CCCC  DDDDDDDDD  0101000ff  SETPTRX SETPTRY ADDPTRX ADDPTRY   D/#                              -- -- -- -- 
--L- 1111111 00 L CCCC  DDDDDDDDD  0101001ff  SUBPTRX SUBPTRY PASSCNT WAIT      D/#                              -- -- -- -- 
--L- 1111111 00 L CCCC  DDDDDDDDD  0101010ff  OFFP    NOTP    CLRP    SETP      D/#                              -- -- -- -- 
--L- 1111111 00 L CCCC  DDDDDDDDD  0101011ff  SETPC   SETPNC  SETPZ   SETPNZ    D/#                              -- -- -- -- 
--L- 1111111 00 L CCCC  DDDDDDDDD  0101100ff  DIV64D  SQRT32  QLOG    QEXP      D/#                              -- -- -- -- 
--L- 1111111 00 L CCCC  DDDDDDDDD  0101101ff  SETQI   SETQZ   CFGDACS SETDACS   D/#                              -- -- -- -- 
--L- 1111111 00 L CCCC  DDDDDDDDD  0101110ff  CFGDAC0 CFGDAC1 CFGDAC2 CFGDAC3   D/#                              -- -- -- -- 
--L- 1111111 00 L CCCC  DDDDDDDDD  0101111ff  SETDAC0 SETDAC1 SETDAC2 SETDAC3   D/#                              -- -- -- -- 
--L- 1111111 00 L CCCC  DDDDDDDDD  0110000ff  SETCTRA SETWAVA SETFRQA SETPHSA   D/#                              -- -- -- -- 
--L- 1111111 00 L CCCC  DDDDDDDDD  0110001ff  ADDPHSA SUBPHSA SETVID  SETVIDY   D/#                              -- -- -- -- 
--L- 1111111 00 L CCCC  DDDDDDDDD  0110010ff  SETCTRB SETWAVB SETFRQB SETPHSB   D/#                              -- -- -- -- 
--L- 1111111 00 L CCCC  DDDDDDDDD  0110011ff  ADDPHSB SUBPHSB SETVIDI SETVIDQ   D/#                              -- -- -- -- 
--L- 1111111 00 L CCCC  DDDDDDDDD  0110100ff  SETPIX  SETPIXZ SETPIXU SETPIXV   D/#                              -- -- -- -- 
--L- 1111111 00 L CCCC  DDDDDDDDD  0110101ff  SETPIXA SETPIXR SETPIXG SETPIXB   D/#                              -- -- -- -- 
--L- 1111111 00 L CCCC  DDDDDDDDD  0110110ff  SETPORA SETPORB SETPORC SETPORD   D/#                              -- -- -- -- 
--L- 1111111 00 L CCCC  DDDDDDDDD  0110111ff  RDWIDEA RDWIDEB WRWIDEA WRWIDEB   D/#1..512                        -- -- -- -- 
--L- 1111111 00 L CCCC  DDDDDDDDD  0111000ff  JMPT0   JMPT1   JMPT2   JMPT3     D/#                              -- -- -- -- 
--L- 1111111 00 L CCCC  DDDDDDDDD  0111001ff  PUSHT0  PUSHT1  PUSHT2  PUSHT3    D/#                              -- -- -- -- 
--L- 1111111 ZC L CCCC  DDDDDDDDD  0111010ff  COGNEW  COGNEWX RESD    <empty>   {D} D/#                          ZC ZC -- -- 
----------------------------------------------------------------------------------------------------------------------------
--R- 1111111 ZC x CCCC  DDDDDDDDD  1000000ff  LOCPTRA LOCPTRB JMP     JMPD      D                                ZC ZC ZC ZC 
--R- 1111111 ZC x CCCC  DDDDDDDDD  1000001ff  LINK    LINKD   CALL    CALLD     {0} D                            ZC ZC ZC ZC 
--R- 1111111 ZC x CCCC  DDDDDDDDD  1000010ff  CALLA   CALLAD  CALLB   CALLBD    D                                ZC ZC ZC ZC 
--R- 1111111 ZC x CCCC  DDDDDDDDD  1000011ff  CALLX   CALLXD  CALLY   CALLYD    D                                ZC ZC ZC ZC 
--R- 1111111 00 x CCCC  DDDDDDDDD  1000100ff  LODINDA LODINDB <empty> <empty>   D                                -- -- -- -- 
----------------------------------------------------------------------------------------------------------------------------
ZC-- 1111111 ZC x CCCC  xxxxxxxxx  1100000ff  RETA    RETAD   RETB    RETBD                                      ZC ZC ZC ZC 
ZC-- 1111111 ZC x CCCC  xxxxxxxxx  1100001ff  RETX    RETXD   RETY    RETYD                                      ZC ZC ZC ZC 
ZC-- 1111111 ZC x CCCC  xxxxxxxxx  1100010ff  RET     RETD    POLCTRA POLCTRB                                    ZC ZC ZC ZC 
ZC-- 1111111 ZC x CCCC  xxxxxxxxx  1100011ff  POLVID  CAPCTRA CAPCTRB CAPCTRS                                    ZC -- -- -- 
---- 1111111 00 x CCCC  xxxxxxxxx  1100100ff  SETPIXW CLRACCA CLRACCB CLRACCS                                    -- -- -- -- 
ZC-- 1111111 ZC x CCCC  xxxxxxxxx  1100101ff  CHKPTRX CHKPTRY SYNCTRA SYNCTRB                                    ZC ZC -- -- 
---- 1111111 00 x CCCC  xxxxxxxxx  1100110ff  DCACHEX ICACHEX ICACHEP ICACHEN                                    -- -- -- -- 
---- 1111111 00 x 0000  xxxxxxxxx  1100111ff  TLOCK   TFREE   LOADT3  SAVET3                                     -- -- -- -- 
----------------------------------------------------------------------------------------------------------------------------
ZCL- 1111111 ZC L CCCC  DDDDDDDDD  1111iiiii  PICKZC                            D/#,#0-31                        ZC          
                                              Note: PICKZC may not be correct opcode in S[8:5].
----------------------------------------------------------------------------------------------------------------------------

InstructionSummary_20140312b.spin

cgracey · 2014-03-14 00:43

I have a question for you guys.

You know we have this new RESD D/# instruction that lets you redirect the next result to another register.

What about expanding this capability a little (rdr = task's redirection register):

RESD	D/#	'set rdr to D/#, redirect next write to [rdr]
RESDP	D/#	'set rdr to D/#, redirect next write to [rdr++]
RESDN	D/#	'set rdr to D/#, redirect next write to [rdr--]

RESDX	D/#	'set rdr to D/#, redirect all subsequent writes to [rdr]   until RESDOFF
RESDPX	D/#	'set rdr to D/#, redirect all subsequent writes to [rdr++] until RESDOFF
RESDNX	D/#	'set rdr to D/#, redirect all subsequent writes to [rdr--] until RESDOFF

RESDOFF D/#	'set rdr to D/#, cancel write redirection


RESD		'		 redirect next write to [rdr]
RESDP		'		 redirect next write to [rdr++]
RESDN		'		 redirect next write to [rdr--]

RESDX		'		 redirect all subsequent writes to [rdr]   until RESDOFF
RESDPX		'		 redirect all subsequent writes to [rdr++] until RESDOFF
RESDNX		'		 redirect all subsequent writes to [rdr--] until RESDOFF

RESDOFF		'		 cancel write redirection

This scheme takes two more state bits and an incrementer/decrementer on the task's redirection register.
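
One way to picture the proposal is a toy Python model of a single task's redirection state. Everything here is illustrative: the class and method names are invented, the register file is tiny, and the wrap-around on the index exists only to keep the model from running off the end of its list.

```python
class RedirectModel:
    """Toy model of the proposed per-task write redirection."""

    def __init__(self, size=16):
        self.regs = [0] * size
        self.rdr = 0          # the task's redirection register
        self.step = 0         # 0, +1 or -1, applied after each redirected write
        self.active = False   # redirection currently armed
        self.sticky = False   # True for the ...X forms (stay armed until RESDOFF)

    def resd(self, step=0, sticky=False, d=None):
        """RESD / RESDP / RESDN and their X forms.

        Passing d loads rdr as well; omitting d models the operand-less form.
        """
        if d is not None:
            self.rdr = d
        self.step, self.sticky, self.active = step, sticky, True

    def resdoff(self, d=None):
        """RESDOFF: optionally load rdr, then cancel redirection."""
        if d is not None:
            self.rdr = d
        self.active = False

    def write(self, dest, value):
        """Result write of any instruction: goes to [rdr] while redirection is armed."""
        if self.active:
            self.regs[self.rdr] = value
            self.rdr = (self.rdr + self.step) % len(self.regs)  # step after the write
            self.active = self.sticky    # one-shot forms disarm themselves
        else:
            self.regs[dest] = value
```

For example, `resd(step=+1, d=4)` followed by two writes sends only the first one to register 4 and leaves `rdr` at 5, whereas the sticky form keeps redirecting until `resdoff()` is called.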

Cluso99 · 2014-03-14 00:57

cgracey wrote: »

I have a question for you guys.

You know we have this new RESD D/# instruction that lets you redirect the next result to another register.

What about expanding this capability a little (rdr = task's redirection register):

RESD    D/#    'set rdr to D/#, redirect next write to [rdr]
RESDP    D/#    'set rdr to D/#, redirect next write to [rdr++]
RESDN    D/#    'set rdr to D/#, redirect next write to [rdr--]

RESDX    D/#    'set rdr to D/#, redirect all subsequent writes to [rdr]   until RESDOFF
RESDPX    D/#    'set rdr to D/#, redirect all subsequent writes to [rdr++] until RESDOFF
RESDNX    D/#    'set rdr to D/#, redirect all subsequent writes to [rdr--] until RESDOFF

RESDOFF D/#    'set rdr to D/#, cancel write redirection


RESD        '         redirect next write to [rdr]
RESDP        '         redirect next write to [rdr++]
RESDN        '         redirect next write to [rdr--]

RESDX        '         redirect all subsequent writes to [rdr]   until RESDOFF
RESDPX        '         redirect all subsequent writes to [rdr++] until RESDOFF
RESDNX        '         redirect all subsequent writes to [rdr--] until RESDOFF

RESDOFF        '         cancel write redirection

This scheme takes two more state bits and an incrementer/decrementer on the task's redirection register.

WOW! That's some scheme Chip.

So each task has its own task register, and it can auto increment/decrement each time it gets used. That is now a really powerful instruction.

Rather than the RESDP & RESDN in pnut, perhaps they could be RESD #/D++ or RESD #/D-- to indicate increment or decrement, instead of revealing there are separate opcodes used.

rogloh · 2014-03-14 00:59

cgracey wrote: »

I have a question for you guys.

You know we have this new RESD D/# instruction that lets you redirect the next result to another register.

What about expanding this capability a little (rdr = task's redirection register):

RESD	D/#	'set rdr to D/#, redirect next write to [rdr]
RESDP	D/#	'set rdr to D/#, redirect next write to [rdr++]
RESDN	D/#	'set rdr to D/#, redirect next write to [rdr--]

RESDX	D/#	'set rdr to D/#, redirect all subsequent writes to [rdr]   until RESDOFF
RESDPX	D/#	'set rdr to D/#, redirect all subsequent writes to [rdr++] until RESDOFF
RESDNX	D/#	'set rdr to D/#, redirect all subsequent writes to [rdr--] until RESDOFF

RESDOFF D/#	'set rdr to D/#, cancel write redirection


RESD		'		 redirect next write to [rdr]
RESDP		'		 redirect next write to [rdr++]
RESDN		'		 redirect next write to [rdr--]

RESDX		'		 redirect all subsequent writes to [rdr]   until RESDOFF
RESDPX		'		 redirect all subsequent writes to [rdr++] until RESDOFF
RESDNX		'		 redirect all subsequent writes to [rdr--] until RESDOFF

RESDOFF		'		 cancel write redirection

This scheme takes two more state bits and an incrementer/decrementer on the task's redirection register.

I don't know about this one. I'm just trying to imagine when we would redirect a whole lot of sequential instructions to a register (or a register block), so far all I could dream up was maybe inside some tight REP loop doing a form of SIMD or something. We have INDA++ already which could help do the writing to a block. So the only real benefit is in not destroying both the operands to the opcodes.

Cluso99 · 2014-03-14 01:04

Chip, wouldn't post increment/decrement be better?

Might the instructions RESULT / RESULTX / RESULTN (result = result modify off = never) be better, and the ++ or -- on the #/D operand as pre or post (only 1) ?

rogloh · 2014-03-14 01:06

Example of a Single Instruction Multiple Data (SIMD) type operation on adding 100 numbers was what came to mind.

REPS #100,#2
    RESDPX C
op: ADD A,B
    INCDS op
    RESDOFF C

Result C vector = A vector + B vector

EDIT: Actually that probably cannot work because there is not enough time in the self modifying code loop to work right, so perhaps it could be this instead:

REPS #100,#1
    RESDPX C
op: ADD INDA++,INDB++
    RESDOFF C

Actually this is where the incrementing/decrementing RESD stuff became useful because INDA/INDB are already consumed in the loop itself. So yes it definitely could be of some benefit here.

We do 100 adds in 100 cycles above. To do this same operation otherwise takes a lot of instructions if you can't destroy the D operand.

REP #100, #4
      NOP
      MOV TEMP, INDA++
      ADD TEMP, INDB++
copy: MOV DEST, TEMP
      INCD copy

So we get a 4x speedup above!
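
The comparison can be sketched in Python by counting instructions per element. This assumes one cycle per instruction and ignores loop setup; the function names are made up for the illustration.

```python
def vector_add_redirected(a, b):
    """One ADD per element; the result write is redirected to C with post-increment."""
    c = []
    instructions = 0
    for x, y in zip(a, b):
        c.append(x + y)      # ADD INDA++,INDB++ with the write landing at [rdr++]
        instructions += 1
    return c, instructions


def vector_add_with_temp(a, b):
    """MOV, ADD, MOV, INCD per element, so neither source vector is overwritten."""
    c = []
    instructions = 0
    for x, y in zip(a, b):
        temp = x             # MOV TEMP, INDA++
        temp += y            # ADD TEMP, INDB++
        c.append(temp)       # MOV DEST, TEMP
        instructions += 4    # the three above plus INCD copy
    return c, instructions
```

For 100 elements the first version counts 100 instructions and the second 400, which is where the 4x figure comes from.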

Cluso99 · 2014-03-14 01:08

rogloh,
I could see potential in using these instead of INDA/INDB (saved for other uses), as a block move between threads, as a way of using banking (instead of remapping). I am sure there are other uses - we just have to see.

ozpropdev · 2014-03-14 01:19

Chip,
I see how RESD,RESDP and RESDN could be useful in multi-tasking as we only have 1 set of INDA/B registers.

Question: The SETMAP instruction now has a #S value. I recall you mentioning somewhere about selecting a
register block using any register rather than INDB or task number? Also can each hw task select a separate block?

cgracey · 2014-03-14 01:19

rogloh wrote: »

I don't know about this one. I'm just trying to imagine when we would redirect a whole lot of sequential instructions to a register (or a register block), so far all I could dream up was maybe inside some tight REP loop doing a form of SIMD or something. We have INDA++ already which could help do the writing to a block. So the only real benefit is in not destroying both the operands to the opcodes.

INDA/INDB are not per-task, but global to the cog, so this would provide some new capability to all tasks.

cgracey · 2014-03-14 01:23

Cluso99 wrote: »

Chip, wouldn't post increment/decrement be better?

Might the instructions RESULT / RESULTX / RESULTN (result = result modify off = never) be better, and the ++ or -- on the #/D operand as pre or post (only 1) ?

This scheme only does post-inc/dec on the redirect register.

We might need better names, for sure.

Also, these instructions have operand-less versions which don't write rdr, but set a redirect mode.

rogloh · 2014-03-14 01:23

That's true. I do like it now, see my potential SIMD application of it above.

cgracey · 2014-03-14 01:25

rogloh wrote: »
Example of a Single Instruction Multiple Data (SIMD) type operation on adding 100 numbers was what came to mind.
REPS #100,#2
    RESDPX C
op: ADD A,B
    INCDS op
    RESDOFF C
Result C vector = A vector + B vector

EDIT: Actually that probably cannot work because there is not enough time in the self modifying code loop to work right, so perhaps it could be this instead:
REPS #100,#1
    RESDPX C
op: ADD INDA++,INDB++
    RESDOFF C
Actually this is where the incrementing/decrementing RESD stuff became useful because INDA/INDB are already consumed in the loop itself. So yes it definitely could be of some benefit here.

We do 100 adds in 100 cycles above. To do this same operation otherwise takes a lot of instructions if you can't destroy the D operand.
REP #100, #4
      NOP
      MOV TEMP, INDA++
      ADD TEMP, INDB++
copy: MOV DEST, TEMP
      INCD copy
So we get a 4x speedup above!

There are some cases where there's a definite benefit. This is practically a freebie, so I say put it in the toolbox.

cgracey · 2014-03-14 01:28

ozpropdev wrote: »

Chip,
I see how RESD,RESDP and RESDN could be useful in multi-tasking as we only have 1 set of INDA/B registers.

Question: The SETMAP instruction now has a #S value. I recall you mentioning somewhere about selecting a
register block using any register rather than INDB or task number? Also can each hw task select a separate block?

Register remapping is a cog-wide function that you'd probably want to set up once, unless, as with preemptive multi-threading, you'd like to switch it around under software control. That S/# in SETMAP is there to set a static remap value, rather than using only the current task ID or INDA/INDB for instantaneous remap.

Ramon · 2014-03-14 02:19

cgracey wrote: »
RESDP	D/#	'set rdr to D/#, redirect next write to [rdr++]
RESDN	D/#	'set rdr to D/#, redirect next write to [rdr--]
This scheme takes two more state bits and an incrementer/decrementer on the task's redirection register.

Is the destination only COG RAM registers, or can it be HUB RAM too?

Do you think that it could be possible (or useful) to add an OFFSET to those instructions?

For example (instruction for fixed 8 bytes offset) :

RESDP8 D/# 'set rdr to D/#, redirect next write to [rdr+8]
RESDN8 D/# 'set rdr to D/#, redirect next write to [rdr-8]

With an offset of 8, maybe some application could use eight synchronized cogs to get the fastest data transfer possible to fill HUB RAM (or use HUB RAM to synchronously send data using pin transfer (XFR)).

I remember that high speed video on P1 was implemented using several synchronized cogs (is this right?).

Maybe some high speed video application could benefit. Or projects that make use of fast pipelined DAC/ADC.

Another example: with a +3 offset, three cogs can send R, G, B data independently to a DAC (EDIT: "to three DACs") using pin transfer instructions. And hopefully with a 3x speed improvement over a single COG handling three colors.

What do you think?

cgracey · 2014-03-14 02:38

Ramon wrote: »

Destination is only COG RAM registers? or can be HUB RAM too?

Do you think that it could be possible (or useful) to add OFFSET to those instructions?

For example (instruction for fixed 8 bytes offset) :

RESDP8 D/# 'set rdr to D/#, redirect next write to [rdr+8]
RESDN8 D/# 'set rdr to D/#, redirect next write to [rdr-8]

With an offset of 8, maybe some application could use eight synchronized cogs to get the fastest possible data transfer to fill HUB RAM (or use HUB RAM to synchronously send data using pin transfer (XFR)).

I remember that high speed video on P1 was implemented using several synchronized cogs (is this right?).

Maybe some high speed video application could benefit. Or projects that make use of fast pipelined DAC/ADC.

Another example: with a +3 offset, three cogs can send R, G, B data independently to a DAC (EDIT: "to three DACs") using pin transfer instructions. And hopefully with a 3x speed improvement over a single COG handling three colors.

What do you think?

I don't see a need for an offset because you set whatever address you want and it just goes from there.

This is something that the programmer is going to have to put in his head and then recognize when an opportunity to use it arises.
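
A minimal sketch of the point above, written as a Python model rather than Propeller code: a fixed-offset variant may not be needed if software issues a fresh RESD with a computed address before each redirected write. The function name `strided_writes`, the 512-register cog size, and the wrap-around at the end of the register file are assumptions for illustration, not documented behavior.

```python
def strided_writes(values, base, stride, size=512):
    """Model strided result writes using only a plain 'RESD addr' each time.

    Assumptions (not from the thread): 'size' registers in the cog,
    and addresses wrap modulo 'size'.
    """
    regs = [0] * size
    addr = base
    for v in values:
        rdr = addr % size   # RESD addr: set the redirection register
        regs[rdr] = v       # the next instruction's result lands at [rdr]
        addr += stride      # software, not hardware, computes the next address
    return regs


regs = strided_writes([10, 20, 30], base=16, stride=8)
print(regs[16], regs[24], regs[32])
```

The cost of doing it this way is one extra instruction per redirected write, which is the trade-off against a dedicated offset instruction.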

rjo__ · 2014-03-14 03:54

I can think of all kinds of image processing that would benefit from this. Imagine the task of comparing two images, from known optics, which are similar except for rotation and magnification (cyclopean vision), where your calculation yields a change in the position of the camera through repeated CORDIC and math functions. These tools seem to fit the bill nicely.
Rich

Cluso99 · 2014-03-14 04:14

A lot of these extra instructions will not be used by beginners.

That's fine. When they graduate, or look at someone else's code, they will find these gems being used.

Probably the instruction set should be split into two, for normal and advanced users.

rjo__ · 2014-03-14 04:39

Cluso99,

In the past what happened was the documentation seemed to be purposefully targeted at various classes of users at various times in the device's life cycle… first being experts, followed by advanced, beginners and intermediate. Where most Prop1 users will probably want to start is with P1 compatible coding… and then adding in advanced programming as needed.

In the mass adoption phase, the biggest problem seems to be directing users to appropriate documentation, based upon their individual skill set. Parallax does a good job of this, mostly through the forums. Even advanced users are going to struggle with some of the refinements that have occurred.

Integrating advanced help utilities into the various tools seems essential to me.

Rich

I didn't mean to go off-topic, just couldn't help myself :)

cgracey · 2014-03-14 05:57

I just finished testing the newly expanded RESD functions. I'll get back on the documentation now for the next FPGA release.

cgracey · 2014-03-14 06:07

Ramon wrote: »

Destination is only COG RAM registers? or can be HUB RAM too?

It's just for cog RAM registers - a little trick to swap out result register addresses.

Bill Henning · 2014-03-14 07:19

Nice addition Chip!

I can see all sorts of potential uses for this in graphics, string operations and even cryptography.

Sorry, cannot resist...

I wonder if there would be uses for the inverse?

SRCD D/#
SRCDP D/#
SRCDN D/#
SRCDOFF D/#

cgracey wrote: »

I have a question for you guys.

You know we have this new RESD D/# instruction that lets you redirect the next result to another register.

What about expanding this capability a little (rdr = task's redirection register):

RESD	D/#	'set rdr to D/#, redirect next write to [rdr]
RESDP	D/#	'set rdr to D/#, redirect next write to [rdr++]
RESDN	D/#	'set rdr to D/#, redirect next write to [rdr--]

RESDX	D/#	'set rdr to D/#, redirect all subsequent writes to [rdr]   until RESDOFF
RESDPX	D/#	'set rdr to D/#, redirect all subsequent writes to [rdr++] until RESDOFF
RESDNX	D/#	'set rdr to D/#, redirect all subsequent writes to [rdr--] until RESDOFF

RESDOFF D/#	'set rdr to D/#, cancel write redirection


RESD		'		 redirect next write to [rdr]
RESDP		'		 redirect next write to [rdr++]
RESDN		'		 redirect next write to [rdr--]

RESDX		'		 redirect all subsequent writes to [rdr]   until RESDOFF
RESDPX		'		 redirect all subsequent writes to [rdr++] until RESDOFF
RESDNX		'		 redirect all subsequent writes to [rdr--] until RESDOFF

RESDOFF		'		 cancel write redirection

This scheme takes two more state bits and an incrementer/decrementer on the task's redirection register.
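
The listing above can be read as a small state machine: a redirection register (rdr), a step of 0, +1 or -1, and a flag for "next write only" versus "until RESDOFF". The following Python model is a sketch of that reading, not the hardware implementation. The class and method names are hypothetical, and the 512-register size and modulo wrap-around are assumptions.

```python
class RedirectModel:
    """Behavioral sketch of the proposed RESD family (names are hypothetical)."""

    def __init__(self, size=512):       # assumed cog register count
        self.regs = [0] * size
        self.rdr = 0                     # task's redirection register
        self.step = 0                    # 0 = RESD, +1 = RESDP, -1 = RESDN
        self.armed = False               # is a redirect pending?
        self.sticky = False              # True for the ...X variants

    def res(self, addr=None, step=0, sticky=False):
        """RESD/RESDP/RESDN and their X forms; addr=None keeps the current rdr."""
        if addr is not None:
            self.rdr = addr % len(self.regs)
        self.step, self.sticky, self.armed = step, sticky, True

    def resdoff(self, addr=None):
        """RESDOFF: optionally set rdr, then cancel write redirection."""
        if addr is not None:
            self.rdr = addr % len(self.regs)
        self.armed = self.sticky = False

    def write(self, dest, value):
        """Result write of an ordinary instruction whose D field is 'dest'."""
        if self.armed:
            target = self.rdr
            self.rdr = (self.rdr + self.step) % len(self.regs)  # assumed wrap
            if not self.sticky:
                self.armed = False       # one-shot forms affect only one write
        else:
            target = dest
        self.regs[target] = value
        return target


m = RedirectModel()
m.res(addr=100, step=+1)                 # RESDP #100
print(m.write(5, 11), m.write(5, 22))    # first write redirected, second is not
```

In this model the "two more state bits" correspond roughly to the step selection and the sticky flag, though how the hardware actually encodes them is not stated in the thread.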

ctwardell · 2014-03-14 08:04

Bill Henning wrote: »

Sorry, cannot resist...

I wonder if there would be uses for the inverse?

SRCD D/#
SRCDP D/#
SRCDN D/#
SRCDOFF D/#

What would these do?

C.W.
