Any recommendations for transposing arrays? — Parallax Forums

jrullan Posts: 56
edited 2020-12-02 15:52 in Propeller 2
I'm working on porting an old project to the P2, and it relies on transposing an array of longs (32-bit integers) into an array of bytes (I know it's odd). See below:
The resulting byte array order could be either as shown or with b00 as the MSB.

[Image: 7Uy0r4Z.jpg]

I wonder if there are any instructions in the PASM instruction set that facilitate doing this in the fewest cycles possible.

Comments

  • Can't think of any that help super much with this particular task beyond what's obvious (TESTB/BIT*).

    There's RCZR/RCZL that can shift two bits into the status flags at once, I guess.
  • Rayman Posts: 11,835
    Yeah, RCZR is probably the way to go...

    But maybe you could send the 8 longs to 8 smart pins and then shift them out while you read them in?
    Might require jumpers to 8 other pins...
  • @Wuerfel_21 thanks for your input. I'll check out those instructions.
  • TonyB_ Posts: 1,624
    edited 2020-12-02 00:45
    Could a combination of SPLITx, MERGEx, MOVBYTS and MUXQ beat the 384 shifts involving RCZx?

    See post by rogloh below.
  • JonnyMac Posts: 7,412
    edited 2020-12-01 21:44
    I'm not as clever with PASM as you guys, but I thought I'd try this as a lunch-time exercise. This simple code seems to work.
    pub transpose(p_longs, p_bytes) | lidx, lwork, bidx, bwork 
    
      org
    rd_hub          setq      #8-1                                  ' read 8 longs from hub      
                    rdlong    old, p_longs
    
                    mov       lidx, #0                              ' long index (0..7) / bit #
    
    .loop1          alts      lidx, #old                            ' get long from old table
                    mov       lwork, 0-0
    
                    mov       bidx, #0                              ' bit index (0..31) / byte #
    
    .loop2          testb     lwork, bidx                   wc      ' get lwork.[bidx] into C
                    altgb     bidx, #new                            ' get byte from new table     
                    getbyte   bwork                             
                    bitc      bwork, lidx                           ' update
                    altsb     bidx, #new                            ' put it back
                    setbyte   bwork
    
                    incmod    bidx, #31                     wc      ' next bit of current long
      if_nc         jmp       #.loop2
    
                    incmod    lidx, #7                      wc      ' next long
      if_nc         jmp       #.loop1
    
    wr_hub          setq      #8-1                                  ' write 8 longs (32 bytes) to hub
                    wrlong    new, p_bytes
                    
                    ret
    
    old             long      0[8]                                  ' treat as  8 longs
    new             long      0[8]                                  ' treat as 32 bytes
    
      end
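
A minimal Python sketch of the bit-by-bit idea in the routine above, for checking the expected result off-chip. It only models the data movement (TESTB reads one bit, BITC writes it into the destination byte); the function name `transpose_bitwise` is an illustrative choice, not something from the thread.

```python
def transpose_bitwise(longs):
    """Bit `bidx` of long `lidx` becomes bit `lidx` of output byte `bidx`."""
    assert len(longs) == 8
    out = [0] * 32                                   # the 'new' table, viewed as 32 bytes
    for lidx in range(8):                            # long index (0..7)
        lwork = longs[lidx] & 0xFFFFFFFF
        for bidx in range(32):                       # bit index (0..31)
            c = (lwork >> bidx) & 1                  # models TESTB lwork, bidx WC
            bwork = out[bidx]                        # models ALTGB / GETBYTE
            bwork = (bwork & ~(1 << lidx)) | (c << lidx)   # models BITC bwork, lidx
            out[bidx] = bwork                        # models ALTSB / SETBYTE
    return out


if __name__ == "__main__":
    # Only bit 0 of long 0 is set, so only bit 0 of byte 0 should be set.
    print(transpose_bitwise([1, 0, 0, 0, 0, 0, 0, 0])[:4])
```

Each of the 256 bits is visited once, which is why this approach is simple but slow compared with moving whole nibbles at a time.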
    
  • rogloh Posts: 2,957
    edited 2020-12-01 22:14
    @jrullan I think the method below should get it down to 18 clocks per 32 bits processed, excluding input/output overheads.

    Approach:
    Read in your 8 longs to obtain source data first to 8 consecutive registers.
    Then do 8 x ROLNIBs into an accumulator, followed by a SPLITB then the long is complete.
    Repeat this 8 times in total for all data, advancing where you source the nibble from in the ROLNIBs and using different output register accumulators.
    Write back the 8 longs of accumulated data as a burst.

    An example code snippet is below. If you can't afford them all unrolled in COGRAM space, consider HUB-exec which should be almost as fast because of no branching.
    setq #8-1
    rdlong data, input_addr ' read source data from hub (assumed)
    
    rolnib acc+0, data+7,#0
    rolnib acc+0, data+6,#0
    rolnib acc+0, data+5,#0
    rolnib acc+0, data+4,#0
    rolnib acc+0, data+3,#0
    rolnib acc+0, data+2,#0
    rolnib acc+0, data+1,#0
    rolnib acc+0, data+0,#0
    splitb acc+0
    
    rolnib acc+1, data+7,#1
    rolnib acc+1, data+6,#1
    rolnib acc+1, data+5,#1
    rolnib acc+1, data+4,#1
    rolnib acc+1, data+3,#1
    rolnib acc+1, data+2,#1
    rolnib acc+1, data+1,#1
    rolnib acc+1, data+0,#1
    splitb acc+1
    
    ... (continue this pattern 6 more times)
    
    setq #8-1
    wrlong acc, output_addr ' write back data to hub if required
    ...
    
    data long 0[8] ' 8 longs for input data
    acc long 0[8] ' 8 longs for output data
    input_addr long 0
    output_addr long 0
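
The ROLNIB/SPLITB approach above can be sketched in Python to check the bit movement before trying it on hardware. The `rolnib` and `splitb` functions below follow the one-line descriptions quoted later in this thread; the helper names and the little-endian byte view of the output longs are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF


def rolnib(d, s, n):
    """Model of ROLNIB D,S,#N: D = {D[27:0], S.NIBBLE[N]}."""
    return ((d << 4) & MASK32) | ((s >> (4 * n)) & 0xF)


def splitb(d):
    """Model of SPLITB D: source bit 4*j+k moves to result bit 8*k+j."""
    r = 0
    for j in range(8):          # nibble index within d
        for k in range(4):      # bit index within that nibble
            r |= ((d >> (4 * j + k)) & 1) << (8 * k + j)
    return r


def transpose_rolnib(data):
    """Eight groups of 8 x ROLNIB + 1 x SPLITB, one output long per group."""
    acc = [0] * 8
    for n in range(8):                      # nibble position taken from every source long
        for i in range(7, -1, -1):          # data+7 first, data+0 last
            acc[n] = rolnib(acc[n], data[i], n)
        acc[n] = splitb(acc[n])
    return acc


def longs_to_bytes(longs):
    """View longs as bytes, low byte first (assumed hub byte order)."""
    return [(v >> (8 * b)) & 0xFF for v in longs for b in range(4)]


if __name__ == "__main__":
    data = [0x00000001, 0, 0, 0, 0, 0, 0, 0x80000000]
    out = longs_to_bytes(transpose_rolnib(data))
    print(hex(out[0]), hex(out[31]))
```

Each group of nine operations fills four output bytes at once, which is where the saving over bit-by-bit copying comes from.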
    
    
  • Wow, this is why this forum is so great. I'll definitely try out your recommendations. Honestly, I didn't expect anyone to try it out, so I'm very grateful for your curiosity and generosity. @JonnyMac @rogloh @TonyB_ @Rayman etc...
  • jrullan Posts: 56
    edited 2020-12-02 15:44
    rogloh wrote: »
    @jrullan I think this method below should get it down to 18 clocks per 32 bits processed excluding input/output overheads

    Approach:
    Read in your 8 longs to obtain source data first to 8 consecutive registers.
    Then do 8 x ROLNIBs into an accumulator, followed by a SPLITB then the long is complete.
    Repeat this 8 times in total for all data, advancing where you source the nibble from in the ROLNIBs and using different output register accumulators.
    Write back the 8 longs of accumulated data as a burst.

    An example code snippet is below. If you can't afford them all unrolled in COGRAM space, consider HUB-exec which should be almost as fast because of no branching.
    ...
    rolnib acc+0, data+7,#0
    rolnib acc+0, data+6,#0
    rolnib acc+0, data+5,#0
    rolnib acc+0, data+4,#0
    rolnib acc+0, data+3,#0
    rolnib acc+0, data+2,#0
    rolnib acc+0, data+1,#0
    rolnib acc+0, data+0,#0
    splitb acc+0
    
    rolnib acc+1, data+7,#1
    rolnib acc+1, data+6,#1
    rolnib acc+1, data+5,#1
    rolnib acc+1, data+4,#1
    rolnib acc+1, data+3,#1
    rolnib acc+1, data+2,#1
    rolnib acc+1, data+1,#1
    rolnib acc+1, data+0,#1
    splitb acc+1
    ...
    

    @rogloh,

    from the P2ASM spreadsheet I see that the instruction rolnib has this description:
    ROLNIB  D,{#}S,#N     Rotate-left nibble N of S into D. D = {D[27:0], S.NIBBLE[N]}.
    

    My question is: when it refers to S.NIBBLE[N], does the index N refer to the index of a nibble within the long, or to a nibble starting at bit N?

    I can see how the first 9 instructions work, but the second group of instructions doesn't seem to achieve the same result unless N is the bit index within the long instead of the nibble index.

    [Image: XbOoI7x.jpg]
  • S.NIBBLE[N] is nibble N of S (source). Both S and D are longs. It might be less confusing if you rename rogloh's DATA to S and ACC to D to match the spreadsheet.
  • jrullan Posts: 56
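
To make the nibble-index reading concrete, here is a small Python model of ROLNIB based on the spreadsheet description quoted above. The test value `0x76543210` is chosen so that nibble N holds the digit N; it is an illustrative value, not data from the thread.

```python
MASK32 = 0xFFFFFFFF


def rolnib(d, s, n):
    """D = {D[27:0], S.NIBBLE[N]}: N selects nibble N, i.e. bits 4N+3..4N of S."""
    return ((d << 4) & MASK32) | ((s >> (4 * n)) & 0xF)


if __name__ == "__main__":
    s = 0x76543210                            # nibble N of s holds the value N
    print(hex(rolnib(0, s, 1)))               # takes bits 7..4 of s, not bits 4..1
    print(hex(rolnib(0xABCDEF01, s, 5)))      # D shifts up one nibble, nibble 5 enters at the bottom
```

With this reading, `#1` in the second group of ROLNIBs selects bits 7..4 of each source long, which is what the second output long needs.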
    edited 2020-12-02 16:33
    In this thread I found this comment from Chip:
    cgracey wrote: »
    ...
    ROLNIB takes nibble 0..7 from S and puts it into the lower nibble of D, while shifting D up by one nibble. It's a way to rotate a whole nibble into D.

    Another approach, as suggested by @Wuerfel_21 (and I think it is also @JonnyMac's approach), could be to use the testb/bitc instructions:

    [Image: jqEEphl.jpg]

    This would take 32 clocks per D byte. Unrolled, it would take 1024 clocks to transpose the 8 longs.
  • rogloh Posts: 2,957
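
The clock estimate above can be reproduced with a few lines of arithmetic. This sketch assumes 2 clocks per instruction, which is what the figures in this thread imply, and one TESTB plus one BITC per bit when fully unrolled.

```python
CLOCKS_PER_INSTRUCTION = 2      # assumption, consistent with the thread's figures
BITS_PER_BYTE = 8
OUTPUT_BYTES = 32               # 8 longs x 32 bits = 256 bits = 32 bytes

instructions_per_bit = 2        # one TESTB plus one BITC
clocks_per_byte = BITS_PER_BYTE * instructions_per_bit * CLOCKS_PER_INSTRUCTION
total_clocks = clocks_per_byte * OUTPUT_BYTES

if __name__ == "__main__":
    print(clocks_per_byte, total_clocks)    # 32 1024
```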
    edited 2020-12-03 13:30
    @jrullan,

    [TypoEdit]ROLNIB does take nibble #N from the source (not bit #N). What is the pattern that you wish to follow after the first 8 bits?

    I have assumed you are translating more bits from the original 8x32 bit longs to generate output data in this sequence:
    1st byte: b7-0, b6-0, b5-0, b4-0, b3-0, b2-0, b1-0, b0-0
    2nd byte: b7-1, b6-1, b5-1, b4-1, b3-1, b2-1, b1-1, b0-1
    3rd byte: b7-2, b6-2, b5-2, b4-2, b3-2, b2-2, b1-2, b0-2
    4th byte: b7-3, b6-3, b5-3, b4-3, b3-3, b2-3, b1-3, b0-3
    
    then
    5th byte: b7-4, b6-4, b5-4, b4-4, b3-4, b2-4, b1-4, b0-4
    6th byte: b7-5, b6-5, b5-5, b4-5, b3-5, b2-5, b1-5, b0-5
    7th byte: b7-6, b6-6, b5-6, b4-6, b3-6, b2-6, b1-6, b0-6
    8th byte: b7-7, b6-7, b5-7, b4-7, b3-7, b2-7, b1-7, b0-7
    
    then
    9th byte: b7-8, b6-8, b5-8, b4-8, b3-8, b2-8, b1-8, b0-8
    10th byte: b7-9, b6-9, b5-9, b4-9, b3-9, b2-9, b1-9, b0-9
    11th byte: b7-10, b6-10, b5-10, b4-10, b3-10, b2-10, b1-10, b0-10
    12th byte: b7-11, b6-11, b5-11, b4-11, b3-11, b2-11, b1-11, b0-11
    
    etc

    If that is not the pattern you were after, what is the rest of the output byte pattern after the first byte? Your original post does not show all the details.
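
The output sequence listed above can be written as a short reference function, useful as a "known good" answer when testing faster versions. The notation `bJ-M` is read here as bit M of source long J, which is an interpretation of the listing above; both function names are illustrative.

```python
def byte_pattern(m):
    """Label the bits of output byte m, MSB first, in the bJ-M notation."""
    return ", ".join(f"b{j}-{m}" for j in range(7, -1, -1))


def transpose_reference(longs):
    """Output byte m collects bit m of every long; long j lands in bit j."""
    return [sum(((longs[j] >> m) & 1) << j for j in range(8)) for m in range(32)]


if __name__ == "__main__":
    for m in range(4):
        print(f"byte {m}: {byte_pattern(m)}")
    print(transpose_reference([0xFFFFFFFF, 0, 0, 0, 0, 0, 0, 0])[:4])
```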
  • AJL Posts: 369
    edited 2020-12-03 04:41
    @jrullan

    In @rogloh's approach, the 9 instructions produce four bytes of output at a time, which is why only 8 sequences of 9 instructions are needed.
    31	30	29	28	27	26	25	24	23	22	21	20	19	18	17	16	15	14	13	12	11	10	9	8	7	6	5	4	3	2	1	0				
    																												b7-3	b7-2	b7-1	b7-0		rolnib D+0, S+7, #0		
    																								b7-3	b7-2	b7-1	b7-0	b6-3	b6-2	b6-1	b6-0		rolnib D+0, S+6, #0		
    																				b7-3	b7-2	b7-1	b7-0	b6-3	b6-2	b6-1	b6-0	b5-3	b5-2	b5-1	b5-0		rolnib D+0, S+5, #0		
    																b7-3	b7-2	b7-1	b7-0	b6-3	b6-2	b6-1	b6-0	b5-3	b5-2	b5-1	b5-0	b4-3	b4-2	b4-1	b4-0		rolnib D+0, S+4, #0		
    												b7-3	b7-2	b7-1	b7-0	b6-3	b6-2	b6-1	b6-0	b5-3	b5-2	b5-1	b5-0	b4-3	b4-2	b4-1	b4-0	b3-3	b3-2	b3-1	b3-0		rolnib D+0, S+3, #0		
    								b7-3	b7-2	b7-1	b7-0	b6-3	b6-2	b6-1	b6-0	b5-3	b5-2	b5-1	b5-0	b4-3	b4-2	b4-1	b4-0	b3-3	b3-2	b3-1	b3-0	b2-3	b2-2	b2-1	b2-0		rolnib D+0, S+2, #0		
    				b7-3	b7-2	b7-1	b7-0	b6-3	b6-2	b6-1	b6-0	b5-3	b5-2	b5-1	b5-0	b4-3	b4-2	b4-1	b4-0	b3-3	b3-2	b3-1	b3-0	b2-3	b2-2	b2-1	b2-0	b1-3	b1-2	b1-1	b1-0		rolnib D+0, S+1, #0		
    b7-3	b7-2	b7-1	b7-0	b6-3	b6-2	b6-1	b6-0	b5-3	b5-2	b5-1	b5-0	b4-3	b4-2	b4-1	b4-0	b3-3	b3-2	b3-1	b3-0	b2-3	b2-2	b2-1	b2-0	b1-3	b1-2	b1-1	b1-0	b0-3	b0-2	b0-1	b0-0		rolnib D+0, S+0, #0		
    																																			
    b7-3	b6-3	b5-3	b4-3	b3-3	b2-3	b1-3	b0-3	b7-2	b6-2	b5-2	b4-2	b3-2	b2-2	b1-2	b0-2	b7-1	b6-1	b5-1	b4-1	b3-1	b2-1	b1-1	b0-1	b7-0	b6-0	b5-0	b4-0	b3-0	b2-0	b1-0	b0-0		splitb D+0		contains D[3..0]
    																																			
    																												b7-7	b7-6	b7-5	b7-4		rolnib D+1, S+7, #1		
    																								b7-7	b7-6	b7-5	b7-4	b6-7	b6-6	b6-5	b6-4		rolnib D+1, S+6, #1		
    																				b7-7	b7-6	b7-5	b7-4	b6-7	b6-6	b6-5	b6-4	b5-7	b5-6	b5-5	b5-4		rolnib D+1, S+5, #1		
    																b7-7	b7-6	b7-5	b7-4	b6-7	b6-6	b6-5	b6-4	b5-7	b5-6	b5-5	b5-4	b4-7	b4-6	b4-5	b4-4		rolnib D+1, S+4, #1		
    												b7-7	b7-6	b7-5	b7-4	b6-7	b6-6	b6-5	b6-4	b5-7	b5-6	b5-5	b5-4	b4-7	b4-6	b4-5	b4-4	b3-7	b3-6	b3-5	b3-4		rolnib D+1, S+3, #1		
    								b7-7	b7-6	b7-5	b7-4	b6-7	b6-6	b6-5	b6-4	b5-7	b5-6	b5-5	b5-4	b4-7	b4-6	b4-5	b4-4	b3-7	b3-6	b3-5	b3-4	b2-7	b2-6	b2-5	b2-4		rolnib D+1, S+2, #1		
    				b7-7	b7-6	b7-5	b7-4	b6-7	b6-6	b6-5	b6-4	b5-7	b5-6	b5-5	b5-4	b4-7	b4-6	b4-5	b4-4	b3-7	b3-6	b3-5	b3-4	b2-7	b2-6	b2-5	b2-4	b1-7	b1-6	b1-5	b1-4		rolnib D+1, S+1, #1		
    b7-7	b7-6	b7-5	b7-4	b6-7	b6-6	b6-5	b6-4	b5-7	b5-6	b5-5	b5-4	b4-7	b4-6	b4-5	b4-4	b3-7	b3-6	b3-5	b3-4	b2-7	b2-6	b2-5	b2-4	b1-7	b1-6	b1-5	b1-4	b0-7	b0-6	b0-5	b0-4		rolnib D+1, S+0, #1		
    																																			
    b7-7	b6-7	b5-7	b4-7	b3-7	b2-7	b1-7	b0-7	b7-6	b6-6	b5-6	b4-6	b3-6	b2-6	b1-6	b0-6	b7-5	b6-5	b5-5	b4-5	b3-5	b2-5	b1-5	b0-5	b7-4	b6-4	b5-4	b4-4	b3-4	b2-4	b1-4	b0-4		splitb D+1		contains D[7..4]
    

    The code is already unrolled, taking 76 longs for the code and a further 18 longs of working memory. Ignoring transfers to and from hub RAM, it takes 72 instructions (144 clocks) to transpose the array of bits.
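
The size and timing figures above follow from simple counting. The sketch below assumes 2 clocks per instruction (implied by the 72 instructions / 144 clocks figure) and counts the SETQ/RDLONG and SETQ/WRLONG pairs as the four extra code longs.

```python
CLOCKS_PER_INSTRUCTION = 2            # assumption, implied by 72 instructions = 144 clocks

groups = 8                            # one group per output long
instructions_per_group = 8 + 1        # 8 x ROLNIB + 1 x SPLITB
transpose_instructions = groups * instructions_per_group
hub_transfer_instructions = 2 + 2     # SETQ + RDLONG, SETQ + WRLONG

code_longs = transpose_instructions + hub_transfer_instructions
working_longs = 8 + 8 + 2             # data[8], acc[8], input_addr, output_addr
transpose_clocks = transpose_instructions * CLOCKS_PER_INSTRUCTION

if __name__ == "__main__":
    print(code_longs, working_longs, transpose_instructions, transpose_clocks)
```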
  • rogloh wrote: »
    @jrullan,

    ROLNIB does take nibble #N from the source (not bit #N). What is the pattern that you wish to follow after the first 8 bits?

    I have assumed you are translating more bits from the original 8x32 bit longs to generate output data in this sequence:
    1st byte: b7-0, b6-0, b5-0, b4-0, b3-0, b2-0, b1-0, b0-0
    2nd byte: b7-1, b6-1, b5-1, b4-1, b3-1, b2-1, b1-1, b0-1
    3rd byte: b7-2, b6-2, b5-2, b4-2, b3-2, b2-2, b1-2, b0-2
    4th byte: b7-3, b6-3, b5-3, b4-3, b3-3, b2-3, b1-3, b0-3
    
    then
    5th byte: b7-4, b6-4, b5-4, b4-4, b3-4, b2-4, b1-4, b0-4
    6th byte: b7-5, b6-5, b5-5, b4-5, b3-5, b2-5, b1-5, b0-5
    7th byte: b7-6, b6-6, b5-6, b4-6, b3-6, b2-6, b1-6, b0-6
    8th byte: b7-7, b6-7, b5-7, b4-7, b3-7, b2-7, b1-7, b0-7
    
    then
    9th byte: b7-8, b6-8, b5-8, b4-8, b3-8, b2-8, b1-8, b0-8
    10th byte: b7-9, b6-9, b5-9, b4-9, b3-9, b2-9, b1-9, b0-9
    11th byte: b7-10, b6-10, b5-10, b4-10, b3-10, b2-10, b1-10, b0-10
    12th byte: b7-11, b6-11, b5-11, b4-11, b3-11, b2-11, b1-11, b0-11
    
    etc

    If that is not the pattern you were after, what is the rest of the output byte pattern after the first byte? Your original post does not show all the details.

    @rogloh and @AJL ,

    Yes that's exactly what I'm looking for!!!

    I think that what I'm struggling with is understanding how splitb works. I could follow the description up to the first 8 bits:
    SPLITB  D - Split every 4th bit of D into bytes. D = {D[31], D[27], D[23], D[19], ...D[12], D[8], D[4], D[0]}.
    

    Do you know if this is documented better in another place?

    In any case this definitely looks like the best approach! :smile:
    Wow! Thank you for your time and kindness. I definitely need to test this directly on the P2.
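
For readers who want to see the SPLITB mapping as code, here is a minimal Python model derived from the one-line description quoted above: bit k of every nibble is gathered into byte k. The closed form "source bit 4j+k goes to result bit 8k+j" is an interpretation of that description and agrees with the table in AJL's post.

```python
def splitb(d):
    """Model of SPLITB: D = {D[31], D[27], ..., D[3], D[30], ..., D[4], D[0]}."""
    r = 0
    for j in range(8):          # nibble index (0..7)
        for k in range(4):      # bit index within the nibble (0..3)
            r |= ((d >> (4 * j + k)) & 1) << (8 * k + j)
    return r


if __name__ == "__main__":
    # Follow single bits through the transform.
    for src_bit in (0, 1, 4, 27, 31):
        dst_bit = splitb(1 << src_bit).bit_length() - 1
        print(f"D[{src_bit}] -> result bit {dst_bit}")
```

After eight ROLNIBs each nibble of D comes from a different source long, so gathering bit k of every nibble into one byte is exactly the per-byte transposition being asked for.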
  • If it helps, the transform that SPLITB performs is shown diagrammatically here ...

    http://forums.parallax.com/discussion/comment/1479691/#Comment_1479691
  • rogloh wrote: »
    If it helps, the transform that SPLITB performs is shown diagrammatically here ...

    http://forums.parallax.com/discussion/comment/1479691/#Comment_1479691

    Wow! I am a visual person myself, that definitely helps!!! Thank you again @rogloh. Can't wait to see the final Propeller 2 manual.
  • TonyB_ Posts: 1,624
    edited 2020-12-03 14:10
    Here's some code that is half the speed of the optimal solution by rogloh but a third the size.
    	setq	#8-1
    	rdlong	s,s_addr	'read original array from hub to cog RAM
    
    	mov	ptra,LUT_addr	'set LUT address
    	rep	@.end,#8	'repeat code block up to .end 8 times
    	rolnib	d,s+7,#0
    	rolnib	d,s+6,#0
    	rolnib	d,s+5,#0
    	rolnib	d,s+4,#0
    	rolnib	d,s+3,#0
    	rolnib	d,s+2,#0
    	rolnib	d,s+1,#0
    	rolnib	d,s+0,#0
    	splitb	d		'transposed data
    	ror	s+7,#4
    	ror	s+6,#4
    	ror	s+5,#4
    	ror	s+4,#4
    	ror	s+3,#4
    	ror	s+2,#4
    	ror	s+1,#4
    	ror	s+0,#4
    	wrlut	d,ptra++	'write data to LUT addr ptra, then inc ptra
    .end
    	setq2	#8-1
    	wrlong	LUT_addr,d_addr	'write transposed array from LUT RAM to hub
    
    s		long	0[8]	'original array
    d		long	0	'transposed data long
    s_addr		long	0	'hub RAM addr of original array
    d_addr		long	0	'hub RAM addr of transposed array
    LUT_addr	long	0	'LUT RAM addr of transposed array
    

    An idea for a future P2+ or P3: ALTGN applies to multiple consecutive GETNIB/ROLNIB instructions if D[31]=1. Similarly for other ALTSx & ALTGx. The eight ROR instructions above could be replaced by one ALTGN and one nibble increment.
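
The looped variant above can be checked with the same kind of Python model. This sketch assumes the ROLNIB/SPLITB behaviour described earlier in the thread and treats the LUT as a plain list; it also confirms that eight 4-bit rotations leave the source array unchanged at the end.

```python
MASK32 = 0xFFFFFFFF


def rolnib(d, s, n):
    """D = {D[27:0], S.NIBBLE[N]}."""
    return ((d << 4) & MASK32) | ((s >> (4 * n)) & 0xF)


def splitb(d):
    """Source bit 4*j+k moves to result bit 8*k+j."""
    r = 0
    for j in range(8):
        for k in range(4):
            r |= ((d >> (4 * j + k)) & 1) << (8 * k + j)
    return r


def ror(x, n):
    """32-bit rotate right."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def transpose_looped(src):
    """Always take nibble 0, then rotate every source long right by 4."""
    s = list(src)
    lut = []                                  # stands in for LUT RAM
    d = 0
    for _ in range(8):                        # models the REP block
        for i in range(7, -1, -1):
            d = rolnib(d, s[i], 0)
        d = splitb(d)
        s = [ror(v, 4) for v in s]
        lut.append(d)                         # models WRLUT d, ptra++
    return lut, s


if __name__ == "__main__":
    lut, s_after = transpose_looped([0x00000010] + [0] * 7)
    print([hex(v) for v in lut[:2]], s_after[0] == 0x00000010)
```

Because the nibble index is fixed at 0, one copy of the nine-instruction group serves all eight output longs, at the cost of the extra rotates.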
  • Rayman Posts: 11,835
    Wow, the rolnib with splitb is genius. I sometimes wondered if there was any practical use for things like "splitb"... Guess there is.

    What was SEUSSF supposed to be for? That's a strange looking one...
  • JonnyMac Posts: 7,412
    edited 2020-12-03 17:01
    During the P2 call yesterday, Chip said it was for simple obfuscation -- probably why it's named after Dr. Seuss. I tested it with this:
    pub obfuscate(value, i) : result
    
      org
                    mov     result, value
                    rep     #1, i                           ' valid range is 1..31
                     seussf  result 
      end
    
    
    pub recover(value, i) : result
    
      org
                    mov     result, value
                    neg     i, i
                    add     i, #32
                    rep     #1, i
                     seussf  result 
      end
    
  • jrullan Posts: 56
    edited 2020-12-03 19:58
    Rayman wrote: »
    Wow, the rolnib with splitb is genius. I sometimes wondered if there was any practical use for things like "splitb"... Guess there is.

    You are right @Rayman, it is genius.

    Thank you all for your input. Here I am attaching an example for testing it out. Might be useful to someone out there.
    {
      Long-to-byte Transposition Example
    
      Test Code
    
      First of all the recognitions: http://forums.parallax.com/discussion/172489/any-recommendations-for-transposing-arrays
      @rogloh, @jonnymac, @Wuerfel_21, @Rayman, @TonyB_, @AJL
    }
    CON
      DELAY = 125
    
    VAR
      long leds
      long number
      long src[8]
      byte dest[32]
      byte  i
    
    PUB public_method_name()
      src[0] := %00010000_00000000_01000000_00000001
      src[1] := %00101000_00000000_10100000_00000010
      src[2] := %01000100_00000001_00010000_00000100
      src[3] := %10000010_00000010_00001000_00001000
      src[4] := %00000001_00000100_00000100_00010000
      src[5] := %00000000_10001000_00000010_00100000
      src[6] := %00000000_01010000_00000001_01000000
      src[7] := %00000000_00100000_00000000_10000000
    
      leds := 0 addpins 7
      dira.[leds]~~
    
      transpose(@src,@dest)
    
      repeat
        repeat i FROM 0 TO 31
          on(dest[i])
          waitms(DELAY)
          off()
          waitms(DELAY)
    
    
    PRI on(cnt)
      ORG
        MOV OUTA,cnt
      END
    
    PRI off()
      ORG
        MOV OUTA,#0
      END
    
    PRI transpose(srcAddress,destAddress)
    
      ORG
        SETQ  #8-1
        RDLONG s_array,srcAddress
    
        ROLNIB d_array+0, s_array+7, #0
        ROLNIB d_array+0, s_array+6, #0
        ROLNIB d_array+0, s_array+5, #0
        ROLNIB d_array+0, s_array+4, #0
        ROLNIB d_array+0, s_array+3, #0
        ROLNIB d_array+0, s_array+2, #0
        ROLNIB d_array+0, s_array+1, #0
        ROLNIB d_array+0, s_array+0, #0
        SPLITB d_array+0
    
        ROLNIB d_array+1, s_array+7, #1
        ROLNIB d_array+1, s_array+6, #1
        ROLNIB d_array+1, s_array+5, #1
        ROLNIB d_array+1, s_array+4, #1
        ROLNIB d_array+1, s_array+3, #1
        ROLNIB d_array+1, s_array+2, #1
        ROLNIB d_array+1, s_array+1, #1
        ROLNIB d_array+1, s_array+0, #1
        SPLITB d_array+1
    
        ROLNIB d_array+2, s_array+7, #2
        ROLNIB d_array+2, s_array+6, #2
        ROLNIB d_array+2, s_array+5, #2
        ROLNIB d_array+2, s_array+4, #2
        ROLNIB d_array+2, s_array+3, #2
        ROLNIB d_array+2, s_array+2, #2
        ROLNIB d_array+2, s_array+1, #2
        ROLNIB d_array+2, s_array+0, #2
        SPLITB d_array+2
    
        ROLNIB d_array+3, s_array+7, #3
        ROLNIB d_array+3, s_array+6, #3
        ROLNIB d_array+3, s_array+5, #3
        ROLNIB d_array+3, s_array+4, #3
        ROLNIB d_array+3, s_array+3, #3
        ROLNIB d_array+3, s_array+2, #3
        ROLNIB d_array+3, s_array+1, #3
        ROLNIB d_array+3, s_array+0, #3
        SPLITB d_array+3
    
        ROLNIB d_array+4, s_array+7, #4
        ROLNIB d_array+4, s_array+6, #4
        ROLNIB d_array+4, s_array+5, #4
        ROLNIB d_array+4, s_array+4, #4
        ROLNIB d_array+4, s_array+3, #4
        ROLNIB d_array+4, s_array+2, #4
        ROLNIB d_array+4, s_array+1, #4
        ROLNIB d_array+4, s_array+0, #4
        SPLITB d_array+4
    
        ROLNIB d_array+5, s_array+7, #5
        ROLNIB d_array+5, s_array+6, #5
        ROLNIB d_array+5, s_array+5, #5
        ROLNIB d_array+5, s_array+4, #5
        ROLNIB d_array+5, s_array+3, #5
        ROLNIB d_array+5, s_array+2, #5
        ROLNIB d_array+5, s_array+1, #5
        ROLNIB d_array+5, s_array+0, #5
        SPLITB d_array+5
    
        ROLNIB d_array+6, s_array+7, #6
        ROLNIB d_array+6, s_array+6, #6
        ROLNIB d_array+6, s_array+5, #6
        ROLNIB d_array+6, s_array+4, #6
        ROLNIB d_array+6, s_array+3, #6
        ROLNIB d_array+6, s_array+2, #6
        ROLNIB d_array+6, s_array+1, #6
        ROLNIB d_array+6, s_array+0, #6
        SPLITB d_array+6
    
        ROLNIB d_array+7, s_array+7, #7
        ROLNIB d_array+7, s_array+6, #7
        ROLNIB d_array+7, s_array+5, #7
        ROLNIB d_array+7, s_array+4, #7
        ROLNIB d_array+7, s_array+3, #7
        ROLNIB d_array+7, s_array+2, #7
        ROLNIB d_array+7, s_array+1, #7
        ROLNIB d_array+7, s_array+0, #7
        SPLITB d_array+7
    
    
        SETQ #8-1
        WRLONG d_array,destAddress
    
      d_array long 0[8]
      s_array long 0[8]
      END
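
As a cross-check of the test program above, the same `src` patterns can be run through a Python model of the ROLNIB/SPLITB sequence. The model assumes the instruction behaviour described earlier in the thread and that `dest` is read low byte first. With these patterns, the first eight output bytes should each have a single bit set, stepping from bit 0 to bit 7.

```python
MASK32 = 0xFFFFFFFF

SRC = [
    0b00010000_00000000_01000000_00000001,
    0b00101000_00000000_10100000_00000010,
    0b01000100_00000001_00010000_00000100,
    0b10000010_00000010_00001000_00001000,
    0b00000001_00000100_00000100_00010000,
    0b00000000_10001000_00000010_00100000,
    0b00000000_01010000_00000001_01000000,
    0b00000000_00100000_00000000_10000000,
]


def rolnib(d, s, n):
    """D = {D[27:0], S.NIBBLE[N]}."""
    return ((d << 4) & MASK32) | ((s >> (4 * n)) & 0xF)


def splitb(d):
    """Source bit 4*j+k moves to result bit 8*k+j."""
    r = 0
    for j in range(8):
        for k in range(4):
            r |= ((d >> (4 * j + k)) & 1) << (8 * k + j)
    return r


def transpose(src):
    """Return the 32 dest bytes produced by the unrolled ROLNIB/SPLITB groups."""
    dest = []
    for n in range(8):
        d = 0
        for i in range(7, -1, -1):
            d = rolnib(d, src[i], n)
        d = splitb(d)
        dest.extend((d >> (8 * b)) & 0xFF for b in range(4))   # low byte first
    return dest


if __name__ == "__main__":
    for i, b in enumerate(transpose(SRC)):
        print(f"dest[{i:2}] = %{b:08b}")
```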
    
  • JonnyMac Posts: 7,412
    edited 2020-12-03 19:35
    I would be inclined to force a return at the end so that the arrays are not treated like instructions when the transposition is done.
        _RET_    WRLONG d_array,destAddress
    
  • JonnyMac wrote: »
    I would be inclined to force a return at the end so that the arrays are not treated like instructions when the transposition is done.
        _RET_    WRLONG d_array,destAddress
    

    Oh, ok. Thanks. Will test it later tonight.
  • JonnyMac wrote: »
    During the P2 call yesterday, Chip said it was for simple obfuscation -- probably why it's named after Dr. Seuss. I tested it with this:
    pub obfuscate(value, i) : result
    
      org
                    mov     result, value
                    rep     #1, i                           ' valid range is 1..31
                     seussf  result 
      end
    
    
    pub recover(value, i) : result
    
      org
                    mov     result, value
                    neg     i, i
                    add     i, #32
                    rep     #1, i
                     seussf  result 
      end
    

    When the SEUSSF/SEUSSR subject was mentioned during yesterday's meeting, I tried to link the following 2015 thread in the live chat session, but then my entire place went dark-mode due to a sudden city-wide power outage, and I lost my connection for 30 minutes or so.

    On return, I forgot to repeat the post. Corrected now...

    https://forums.parallax.com/discussion/162238/what-are-your-expectations-regarding-seussf-and-seussr

    We certainly owe Chip, Seairth and Heater a good use-case for those two instructions. :smile: