Any recommendations for transposing arrays?

jrullan · 2020-12-01 20:04

I'm working on porting an old project to the P2, and it relies on transposing an array of longs (32-bit integers) into an array of bytes (I know it's odd). See below:
The resulting byte array order could be either as shown or having b00 to be the msb.

I wonder if there are any instructions in the PASM instruction set that facilitate doing this in the fewest cycles possible.

Wuerfel_21 · 2020-12-01 20:22

Can't think of any that help super much with this particular task beyond what's obvious (TESTB/BIT*).

There's RCZR/RCZL that can shift two bits into the status flags at once, I guess.

Rayman · 2020-12-01 20:31

Yeah, RCZR is probably the way to go...

But maybe you could send the 8 longs to 8 smartpins and then shift them out while you read them in?
Might require jumpers to 8 other pins...

jrullan · 2020-12-01 20:31

@Wuerfel_21 thanks for your input. I'll check out those instructions.

TonyB_ · 2020-12-01 20:46

Could a combination of SPLITx, MERGEx, MOVBYTS and MUXQ beat the 384 shifts involving RCZx?

See post by rogloh below.

JonnyMac · 2020-12-01 21:39

I'm not as clever with PASM as you guys, but I thought I'd try this as a lunch-time exercise. This simple code seems to work.

pub transpose(p_longs, p_bytes) | lidx, lwork, bidx, bwork 

  org
rd_hub          setq      #8-1                                  ' read 8 longs from hub      
                rdlong    old, p_longs

                mov       lidx, #0                              ' long index (0..7) / bit #

.loop1          alts      lidx, #old                            ' get long from old table
                mov       lwork, 0-0

                mov       bidx, #0                              ' bit index (0..31) / byte #

.loop2          testb     lwork, bidx                   wc      ' get lwork.[bidx] into C
                altgb     bidx, #new                            ' get byte from new table     
                getbyte   bwork                             
                bitc      bwork, lidx                           ' update
                altsb     bidx, #new                            ' put it back
                setbyte   bwork

                incmod    bidx, #31                     wc      ' next bit of current long
  if_nc         jmp       #.loop2

                incmod    lidx, #7                      wc      ' next long
  if_nc         jmp       #.loop1

wr_hub          setq      #8-1                                  ' write 8 longs (32 bytes) to hub
                wrlong    new, p_bytes
                
                ret

old             long      0[8]                                  ' treat as  8 longs
new             long      0[8]                                  ' treat as 32 bytes

  end
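
For checking the result on a PC, the loop above can be modelled bit for bit in Python. This is only a sketch of the same TESTB/BITC idea, not P2 code: the function name `transpose_bitwise` is made up for illustration, and it assumes the 32 output bytes are wanted in hub order.

```python
# Host-side model only (hypothetical helper name); not P2 code.
def transpose_bitwise(longs):
    """Bit bidx of long lidx becomes bit lidx of byte bidx."""
    assert len(longs) == 8
    new = bytearray(32)                          # 32 output bytes
    for lidx in range(8):                        # long index (0..7) = destination bit number
        lwork = longs[lidx] & 0xFFFFFFFF
        for bidx in range(32):                   # bit index (0..31) = destination byte number
            c = (lwork >> bidx) & 1              # like TESTB lwork, bidx WC
            if c:
                new[bidx] |= 1 << lidx           # like BITC bwork, lidx with C = 1
            else:
                new[bidx] &= ~(1 << lidx) & 0xFF # like BITC bwork, lidx with C = 0
    return bytes(new)

# Only bit 5 of long 3 is set, so only bit 3 of byte 5 should be set.
example = [0] * 8
example[3] = 1 << 5
result = transpose_bitwise(example)
print(result[5])   # 8
```

Because every destination bit is written explicitly, the output table does not need to be cleared first, which matches the PASM loop.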

rogloh · 2020-12-01 21:49

@jrullan I think this method below should get it down to 18 clocks per 32 bits processed excluding input/output overheads

Approach:
Read in your 8 longs to obtain source data first to 8 consecutive registers.
Then do 8 x ROLNIBs into an accumulator, followed by a SPLITB then the long is complete.
Repeat this 8 times in total for all data, advancing where you source the nibble from in the ROLNIBs and using different output register accumulators.
Write back the 8 longs of accumulated data as a burst.

An example code snippet is below. If you can't afford them all unrolled in COGRAM space, consider HUB-exec which should be almost as fast because of no branching.

setq #8-1
rdlong data, input_addr ' read source data from hub (assumed)

rolnib acc+0, data+7,#0
rolnib acc+0, data+6,#0
rolnib acc+0, data+5,#0
rolnib acc+0, data+4,#0
rolnib acc+0, data+3,#0
rolnib acc+0, data+2,#0
rolnib acc+0, data+1,#0
rolnib acc+0, data+0,#0
splitb acc+0

rolnib acc+1, data+7,#1
rolnib acc+1, data+6,#1
rolnib acc+1, data+5,#1
rolnib acc+1, data+4,#1
rolnib acc+1, data+3,#1
rolnib acc+1, data+2,#1
rolnib acc+1, data+1,#1
rolnib acc+1, data+0,#1
splitb acc+1

... (continue this pattern 6 more times)

setq #8-1
wrlong acc, output_addr ' write back data to hub if required
...

data long 0[8] ' 8 longs for input data
acc long 0[8] ' 8 longs for output data
input_addr long 0
output_addr long 0
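
A quick way to convince yourself that the ROLNIB/SPLITB sequence really transposes the bits is to model it on a PC. The sketch below is Python, not P2 code. The helper names are made up for illustration, and the ROLNIB and SPLITB behaviour is modelled from the instruction descriptions (D = {D[27:0], S.NIBBLE[N]} and "split every 4th bit of D into bytes"), so treat it as an assumption-laden check rather than a reference.

```python
# Host-side model only (hypothetical helper names); not P2 code.
MASK32 = 0xFFFFFFFF

def rolnib(d, s, n):
    """Shift d left one nibble and insert nibble n of s at the bottom."""
    return ((d << 4) | ((s >> (4 * n)) & 0xF)) & MASK32

def splitb(d):
    """Source bit 4*j + k moves to result bit 8*k + j (j = 0..7, k = 0..3)."""
    out = 0
    for j in range(8):
        for k in range(4):
            out |= ((d >> (4 * j + k)) & 1) << (8 * k + j)
    return out

def transpose_rolnib(data):
    """8 ROLNIBs plus 1 SPLITB per output long, as in the unrolled code."""
    acc = [0] * 8
    for n in range(8):                     # nibble number = output long number
        for j in range(7, -1, -1):         # data+7 first, data+0 last
            acc[n] = rolnib(acc[n], data[j], n)
        acc[n] = splitb(acc[n])
    return acc

def to_hub_bytes(acc):
    """Byte order a burst WRLONG would produce in hub RAM (little-endian)."""
    return b"".join(a.to_bytes(4, "little") for a in acc)

data = [0] * 8
data[3] = 1 << 5                           # only bit 5 of long 3 set
out = to_hub_bytes(transpose_rolnib(data))
print(out[5])                              # 8: bit 3 of output byte 5
```

Each group of nine instructions yields four output bytes, which is why eight groups cover all 32 bytes.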

jrullan · 2020-12-01 23:51

Wow, this is why this forum is so great. I'll definitely try out your recommendations. Honestly I didn't expect anyone to try it out, so I'm very grateful for your curiosity and generosity. @JonnyMac @rogloh @TonyB_ @Rayman etc...

jrullan · 2020-12-02 13:23

rogloh wrote: »
@jrullan I think this method below should get it down to 18 clocks per 32 bits processed excluding input/output overheads

Approach:
Read in your 8 longs to obtain source data first to 8 consecutive registers.
Then do 8 x ROLNIBs into an accumulator, followed by a SPLITB then the long is complete.
Repeat this 8 times in total for all data, advancing where you source the nibble from in the ROLNIBs and using different output register accumulators.
Write back the 8 longs of accumulated data as a burst.

An example code snippet is below. If you can't afford them all unrolled in COGRAM space, consider HUB-exec which should be almost as fast because of no branching.
...
rolnib acc+0, data+7,#0
rolnib acc+0, data+6,#0
rolnib acc+0, data+5,#0
rolnib acc+0, data+4,#0
rolnib acc+0, data+3,#0
rolnib acc+0, data+2,#0
rolnib acc+0, data+1,#0
rolnib acc+0, data+0,#0
splitb acc+0

rolnib acc+1, data+7,#1
rolnib acc+1, data+6,#1
rolnib acc+1, data+5,#1
rolnib acc+1, data+4,#1
rolnib acc+1, data+3,#1
rolnib acc+1, data+2,#1
rolnib acc+1, data+1,#1
rolnib acc+1, data+0,#1
splitb acc+1
...

@rogloh,

from the P2ASM spreadsheet I see that the instruction rolnib has this description:

ROLNIB  D,{#}S,#N     Rotate-left nibble N of S into D. D = {D[27:0], S.NIBBLE[N]}.

I guess my question is: when it refers to S.NIBBLE[N], does the index N refer to the index of the nibble within the long, or to a nibble starting at bit N?

I can see how the first 9 instructions work, but the second group of instructions doesn't seem to achieve the same result unless N is the bit index within the long instead of the nibble index.

TonyB_ · 2020-12-02 15:07

S.NIBBLE[N] is nibble N of S (source). Both S and D are longs. It might be less confusing if you rename rogloh's DATA to S and ACC to D to match the spreadsheet.
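
To make the indexing concrete, here is a tiny Python model of that description (a sketch for a PC, not P2 code; the function name is just for illustration). Nibble N is taken to be bits 4*N+3 down to 4*N of S.

```python
# Host-side model only; follows D = {D[27:0], S.NIBBLE[N]}.
def rolnib(d, s, n):
    nibble = (s >> (4 * n)) & 0xF            # nibble n = bits 4n+3 .. 4n of s
    return ((d << 4) | nibble) & 0xFFFFFFFF  # old top nibble of d is shifted out

s = 0x12345678
print(hex(rolnib(0, s, 0)))            # 0x8  (nibble 0 = bits 3..0)
print(hex(rolnib(0, s, 1)))            # 0x7  (nibble 1 = bits 7..4)
print(hex(rolnib(0xABCDEF01, s, 7)))   # 0xbcdef011
```

If N were a bit offset instead, the second call would pick up bits 4..1 of S and give 0xc rather than 0x7.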

jrullan · 2020-12-02 15:40

In this thread I found this comment from Chip:

cgracey wrote: »

...
ROLNIB takes nibble 0..7 from S and puts it into the lower nibble of D, while shifting D up by one nibble. It's a way to rotate a whole nibble into D.

Another approach, as suggested by @Wuerfel_21 (and I think it is also @JonnyMac's approach), could be to use the testb/bitc instructions:

These would take 32 clocks per D byte (8 TESTB/BITC pairs at 2 clocks per instruction); unrolled, it would take 32 x 32 = 1024 clocks for the 8 longs to be transposed.

rogloh · 2020-12-03 02:11

@jrullan,

ROLNIB does take nibble #N from the source (not bit #N). What is the pattern that you wish to follow after the first 8 bits?

I have assumed you are translating more bits from the original 8x32 bit longs to generate output data in this sequence:

1st byte: b7-0, b6-0, b5-0, b4-0, b3-0, b2-0, b1-0, b0-0
2nd byte: b7-1, b6-1, b5-1, b4-1, b3-1, b2-1, b1-1, b0-1
3rd byte: b7-2, b6-2, b5-2, b4-2, b3-2, b2-2, b1-2, b0-2
4th byte: b7-3, b6-3, b5-3, b4-3, b3-3, b2-3, b1-3, b0-3

then

5th byte: b7-4, b6-4, b5-4, b4-4, b3-4, b2-4, b1-4, b0-4
6th byte: b7-5, b6-5, b5-5, b4-5, b3-5, b2-5, b1-5, b0-5
7th byte: b7-6, b6-6, b5-6, b4-6, b3-6, b2-6, b1-6, b0-6
8th byte: b7-7, b6-7, b5-7, b4-7, b3-7, b2-7, b1-7, b0-7

then

9th byte: b7-8, b6-8, b5-8, b4-8, b3-8, b2-8, b1-8, b0-8
10th byte: b7-9, b6-9, b5-9, b4-9, b3-9, b2-9, b1-9, b0-9
11th byte: b7-10, b6-10, b5-10, b4-10, b3-10, b2-10, b1-10, b0-10
12th byte: b7-11, b6-11, b5-11, b4-11, b3-11, b2-11, b1-11, b0-11

etc

If that is not the pattern you were after, what is the rest of the output byte pattern after the first byte? Your original post does not show all the details.
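
The assumed sequence can also be written as a slow but obvious reference in Python (a PC-side sketch, not P2 code; the function names are made up for illustration). It treats the first bit listed for each byte as the msb.

```python
# Host-side reference only (hypothetical helper names); not P2 code.
def transpose_reference(longs):
    """Output byte n holds bit n of each long: long 7 in the msb, long 0 in the lsb."""
    out = bytearray(32)
    for n in range(32):                  # output byte number = source bit number
        for j in range(8):               # source long number = output bit number
            out[n] |= ((longs[j] >> n) & 1) << j
    return bytes(out)

def describe(n):
    """Label the bits of output byte n, msb first, in the bJ-N notation used above."""
    return ", ".join(f"b{j}-{n}" for j in range(7, -1, -1))

print("1st byte:", describe(0))    # 1st byte: b7-0, b6-0, b5-0, b4-0, b3-0, b2-0, b1-0, b0-0
print("12th byte:", describe(11))
```

A reference like this is handy for comparing against whatever the faster routine writes back to hub RAM.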

AJL · 2020-12-03 04:40

@jrullan

In @rogloh's approach, each group of 9 instructions produces four bytes of output at a time, which is why only 8 sequences of 9 instructions are needed.

31	30	29	28	27	26	25	24	23	22	21	20	19	18	17	16	15	14	13	12	11	10	9	8	7	6	5	4	3	2	1	0				
																												b7-3	b7-2	b7-1	b7-0		rolnib D+0, S+7, #0		
																								b7-3	b7-2	b7-1	b7-0	b6-3	b6-2	b6-1	b6-0		rolnib D+0, S+6, #0		
																				b7-3	b7-2	b7-1	b7-0	b6-3	b6-2	b6-1	b6-0	b5-3	b5-2	b5-1	b5-0		rolnib D+0, S+5, #0		
																b7-3	b7-2	b7-1	b7-0	b6-3	b6-2	b6-1	b6-0	b5-3	b5-2	b5-1	b5-0	b4-3	b4-2	b4-1	b4-0		rolnib D+0, S+4, #0		
												b7-3	b7-2	b7-1	b7-0	b6-3	b6-2	b6-1	b6-0	b5-3	b5-2	b5-1	b5-0	b4-3	b4-2	b4-1	b4-0	b3-3	b3-2	b3-1	b3-0		rolnib D+0, S+3, #0		
								b7-3	b7-2	b7-1	b7-0	b6-3	b6-2	b6-1	b6-0	b5-3	b5-2	b5-1	b5-0	b4-3	b4-2	b4-1	b4-0	b3-3	b3-2	b3-1	b3-0	b2-3	b2-2	b2-1	b2-0		rolnib D+0, S+2, #0		
				b7-3	b7-2	b7-1	b7-0	b6-3	b6-2	b6-1	b6-0	b5-3	b5-2	b5-1	b5-0	b4-3	b4-2	b4-1	b4-0	b3-3	b3-2	b3-1	b3-0	b2-3	b2-2	b2-1	b2-0	b1-3	b1-2	b1-1	b1-0		rolnib D+0, S+1, #0		
b7-3	b7-2	b7-1	b7-0	b6-3	b6-2	b6-1	b6-0	b5-3	b5-2	b5-1	b5-0	b4-3	b4-2	b4-1	b4-0	b3-3	b3-2	b3-1	b3-0	b2-3	b2-2	b2-1	b2-0	b1-3	b1-2	b1-1	b1-0	b0-3	b0-2	b0-1	b0-0		rolnib D+0, S+0, #0		
																																			
b7-3	b6-3	b5-3	b4-3	b3-3	b2-3	b1-3	b0-3	b7-2	b6-2	b5-2	b4-2	b3-2	b2-2	b1-2	b0-2	b7-1	b6-1	b5-1	b4-1	b3-1	b2-1	b1-1	b0-1	b7-0	b6-0	b5-0	b4-0	b3-0	b2-0	b1-0	b0-0		splitb D+0		contains D[3..0]
																																			
																												b7-7	b7-6	b7-5	b7-4		rolnib D+1, S+7, #1		
																								b7-7	b7-6	b7-5	b7-4	b6-7	b6-6	b6-5	b6-4		rolnib D+1, S+6, #1		
																				b7-7	b7-6	b7-5	b7-4	b6-7	b6-6	b6-5	b6-4	b5-7	b5-6	b5-5	b5-4		rolnib D+1, S+5, #1		
																b7-7	b7-6	b7-5	b7-4	b6-7	b6-6	b6-5	b6-4	b5-7	b5-6	b5-5	b5-4	b4-7	b4-6	b4-5	b4-4		rolnib D+1, S+4, #1		
												b7-7	b7-6	b7-5	b7-4	b6-7	b6-6	b6-5	b6-4	b5-7	b5-6	b5-5	b5-4	b4-7	b4-6	b4-5	b4-4	b3-7	b3-6	b3-5	b3-4		rolnib D+1, S+3, #1		
								b7-7	b7-6	b7-5	b7-4	b6-7	b6-6	b6-5	b6-4	b5-7	b5-6	b5-5	b5-4	b4-7	b4-6	b4-5	b4-4	b3-7	b3-6	b3-5	b3-4	b2-7	b2-6	b2-5	b2-4		rolnib D+1, S+2, #1		
				b7-7	b7-6	b7-5	b7-4	b6-7	b6-6	b6-5	b6-4	b5-7	b5-6	b5-5	b5-4	b4-7	b4-6	b4-5	b4-4	b3-7	b3-6	b3-5	b3-4	b2-7	b2-6	b2-5	b2-4	b1-7	b1-6	b1-5	b1-4		rolnib D+1, S+1, #1		
b7-7	b7-6	b7-5	b7-4	b6-7	b6-6	b6-5	b6-4	b5-7	b5-6	b5-5	b5-4	b4-7	b4-6	b4-5	b4-4	b3-7	b3-6	b3-5	b3-4	b2-7	b2-6	b2-5	b2-4	b1-7	b1-6	b1-5	b1-4	b0-7	b0-6	b0-5	b0-4		rolnib D+1, S+0, #1		
																																			
b7-7	b6-7	b5-7	b4-7	b3-7	b2-7	b1-7	b0-7	b7-6	b6-6	b5-6	b4-6	b3-6	b2-6	b1-6	b0-6	b7-5	b6-5	b5-5	b4-5	b3-5	b2-5	b1-5	b0-5	b7-4	b6-4	b5-4	b4-4	b3-4	b2-4	b1-4	b0-4		splitb D+1		contains D[7..4]

The code is already unrolled, taking 76 longs for the code and a further 18 longs of working memory. Ignoring transfers to and from hubram it takes 72 instructions (144 clocks) to transpose the array of bits.
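
The arithmetic behind those figures can be spelled out as below (Python used purely as a calculator). The breakdown is an inferred reading of the numbers: it assumes 2 clocks per instruction, four hub-transfer instructions (SETQ + RDLONG and SETQ + WRLONG), and working memory made up of the two 8-long arrays plus the two address longs.

```python
# Worked arithmetic only; the breakdown below is an assumption, not a measurement.
CLOCKS_PER_INSTRUCTION = 2            # assumed: cog-exec ALU instructions, no stalls

groups = 8                            # one group per output long
instr_per_group = 8 + 1               # 8 ROLNIB + 1 SPLITB
transpose_instr = groups * instr_per_group
transpose_clocks = transpose_instr * CLOCKS_PER_INSTRUCTION

hub_instr = 4                         # SETQ + RDLONG, SETQ + WRLONG
code_longs = transpose_instr + hub_instr
working_longs = 8 + 8 + 2             # data[8], acc[8], input_addr, output_addr

print(transpose_instr, transpose_clocks)   # 72 144
print(code_longs, working_longs)           # 76 18
```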

jrullan · 2020-12-03 07:06

rogloh wrote: »
@jrullan,

ROLNIB does take nibble #N from the source (not bit #N). What is the pattern that you wish to follow after the first 8 bits?

I have assumed you are translating more bits from the original 8x32 bit longs to generate output data in this sequence:
1st byte: b7-0, b6-0, b5-0, b4-0, b3-0, b2-0, b1-0, b0-0
2nd byte: b7-1, b6-1, b5-1, b4-1, b3-1, b2-1, b1-1, b0-1
3rd byte: b7-2, b6-2, b5-2, b4-2, b3-2, b2-2, b1-2, b0-2
4th byte: b7-3, b6-3, b5-3, b4-3, b3-3, b2-3, b1-3, b0-3
then
5th byte: b7-4, b6-4, b5-4, b4-4, b3-4, b2-4, b1-4, b0-4
6th byte: b7-5, b6-5, b5-5, b4-5, b3-5, b2-5, b1-5, b0-5
7th byte: b7-6, b6-6, b5-6, b4-6, b3-6, b2-6, b1-6, b0-6
8th byte: b7-7, b6-7, b5-7, b4-7, b3-7, b2-7, b1-7, b0-7
then
9th byte: b7-8, b6-8, b5-8, b4-8, b3-8, b2-8, b1-8, b0-8
10th byte: b7-9, b6-9, b5-9, b4-9, b3-9, b2-9, b1-9, b0-9
11th byte: b7-10, b6-10, b5-10, b4-10, b3-10, b2-10, b1-10, b0-10
12th byte: b7-11, b6-11, b5-11, b4-11, b3-11, b2-11, b1-11, b0-11
etc

If that is not the pattern you were after, what is the rest of the output byte pattern after the first byte? Your original post does not show all the details.

@rogloh and @AJL ,

Yes that's exactly what I'm looking for!!!

I think that what I'm struggling with is understanding how splitb works. I could follow the description up to the first 8 bits:

SPLITB  D - Split every 4th bit of D into bytes. D = {D[31], D[27], D[23], D[19], ...D[12], D[8], D[4], D[0]}.

Do you know if this is documented better in another place?

In any case this definitely looks like the best approach!

Wow! Thank you for your time and kindness. I definitely need to test this directly on the P2.

rogloh · 2020-12-03 13:33

If it helps, the transform that SPLITB performs is shown diagrammatically here ...

http://forums.parallax.com/discussion/comment/1479691/#Comment_1479691
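
For those who prefer code to diagrams, the same mapping can be sketched as a small Python model (for a PC, not P2 code; the function name is just for illustration). It assumes the full pattern continues the way the spreadsheet description starts: bit k of nibble j ends up as bit j of byte k.

```python
# Host-side model only; mapping inferred from the SPLITB description.
def splitb(d):
    """Source bit 4*j + k goes to result bit 8*k + j."""
    out = 0
    for j in range(8):          # which nibble the bit comes from (0..7)
        for k in range(4):      # position of the bit inside that nibble (0..3)
            out |= ((d >> (4 * j + k)) & 1) << (8 * k + j)
    return out

print(hex(splitb(0x11111111)))  # 0xff        bit 0 of every nibble -> byte 0
print(hex(splitb(0x88888888)))  # 0xff000000  bit 3 of every nibble -> byte 3
print(hex(splitb(0x0000000F)))  # 0x1010101   nibble 0 -> bit 0 of each byte
```

So after eight ROLNIBs have packed one nibble from each source long into D, SPLITB gathers the four bit positions into four separate bytes.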

jrullan · 2020-12-03 13:40

rogloh wrote: »

If it helps, the transform that SPLITB performs is shown diagrammatically here ...

http://forums.parallax.com/discussion/comment/1479691/#Comment_1479691

Wow! I am a visual person myself, that definitely helps!!! Thank you again @rogloh. Can't wait to see the final Propeller 2 manual.

TonyB_ · 2020-12-03 13:56

Here's some code that is half the speed of the optimal solution by rogloh but a third the size.

	setq	#8-1
	rdlong	s,s_addr	'read original array from hub to cog RAM

	mov	ptra,LUT_addr	'set LUT address
	rep	@.end,#8	'repeat code block up to .end 8 times
	rolnib	d,s+7,#0
	rolnib	d,s+6,#0
	rolnib	d,s+5,#0
	rolnib	d,s+4,#0
	rolnib	d,s+3,#0
	rolnib	d,s+2,#0
	rolnib	d,s+1,#0
	rolnib	d,s+0,#0
	splitb	d		'transposed data
	ror	s+7,#4
	ror	s+6,#4
	ror	s+5,#4
	ror	s+4,#4
	ror	s+3,#4
	ror	s+2,#4
	ror	s+1,#4
	ror	s+0,#4
	wrlut	d,ptra++	'write data to LUT addr ptra, then inc ptra
.end
	setq2	#8-1
	wrlong	LUT_addr,d_addr	'write transposed array from LUT RAM to hub

s		long	0[8]	'original array
d		long	0	'transposed data long
s_addr		long	0	'hub RAM addr of original array
d_addr		long	0	'hub RAM addr of transposed array
LUT_addr	long	0	'LUT RAM addr of transposed array
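
The looped variant can be checked on a PC with a Python model like the one below. It is a sketch, not P2 code: the helper names are made up, the instruction behaviour is modelled from the spreadsheet descriptions, and a plain list stands in for the LUT.

```python
# Host-side model only (hypothetical helper names); not P2 code.
MASK32 = 0xFFFFFFFF

def rolnib(d, s, n):
    return ((d << 4) | ((s >> (4 * n)) & 0xF)) & MASK32

def splitb(d):
    out = 0
    for j in range(8):
        for k in range(4):
            out |= ((d >> (4 * j + k)) & 1) << (8 * k + j)
    return out

def ror(value, bits):
    return ((value >> bits) | (value << (32 - bits))) & MASK32

def transpose_looped(s):
    s = list(s)                              # work on a copy of the source longs
    lut = []                                 # stands in for LUT RAM
    d = 0
    for _ in range(8):                       # the REP block runs 8 times
        for j in range(7, -1, -1):
            d = rolnib(d, s[j], 0)           # always nibble 0; old bits of d shift out
        d = splitb(d)
        s = [ror(x, 4) for x in s]           # bring the next nibble down
        lut.append(d)                        # like WRLUT d, ptra++
    return lut, s

src = [1 << (4 * i) for i in range(8)]       # long i has only bit 4*i set
lut, restored = transpose_looped(src)
print(restored == src)                       # True: eight 4-bit rotations restore each long
print([hex(x) for x in lut])
```

The eight RORs per pass are what cost the extra time, but they also leave the source array unchanged at the end.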

An idea for a future P2+ or P3: ALTGN applies to multiple consecutive GETNIB/ROLNIB instructions if D[31]=1. Similarly for other ALTSx & ALTGx. The eight ROR instructions above could be replaced by one ALTGN and one nibble increment.

Rayman · 2020-12-03 16:09

Wow, the rolnib with splitb is genius. I sometimes wondered if there was any practical use for things like "splitb"... Guess there is.

What was SEUSSF supposed to be for? That's a strange looking one...

JonnyMac · 2020-12-03 17:00

During the P2 call yesterday, Chip said it was for simple obfuscation -- probably why it's named after Dr. Seuss. I tested it with this:

pub obfuscate(value, i) : result

  org
                mov     result, value
                rep     #1, i                           ' valid range is 1..31
                 seussf  result 
  end


pub recover(value, i) : result

  org
                mov     result, value
                neg     i, i
                add     i, #32
                rep     #1, i
                 seussf  result 
  end

jrullan · 2020-12-03 18:01

Rayman wrote: »

Wow, the rolnib with splitb is genius. I sometimes wondered if there was any practical use for things like "splitb"... Guess there is.

You are right @Rayman, it is genius.

Thank you all for your input. Here I am attaching an example for testing it out. Might be useful to someone out there.

{
  Long-to-byte Transposition Example

  Test Code

  First of all, the acknowledgements: http://forums.parallax.com/discussion/172489/any-recommendations-for-transposing-arrays
  @rogloh, @jonnymac, @Wuerfel_21, @Rayman, @TonyB_, @AJL
}
CON
  DELAY = 125

VAR
  long leds
  long number
  long src[8]
  byte dest[32]
  byte  i

PUB public_method_name()
  src[0] := %00010000_00000000_01000000_00000001
  src[1] := %00101000_00000000_10100000_00000010
  src[2] := %01000100_00000001_00010000_00000100
  src[3] := %10000010_00000010_00001000_00001000
  src[4] := %00000001_00000100_00000100_00010000
  src[5] := %00000000_10001000_00000010_00100000
  src[6] := %00000000_01010000_00000001_01000000
  src[7] := %00000000_00100000_00000000_10000000

  leds := 0 addpins 7
  dira.[leds]~~

  transpose(@src,@dest)

  repeat
    repeat i FROM 0 TO 31
      on(dest[i])
      waitms(DELAY)
      off()
      waitms(DELAY)


PRI on(cnt)
  ORG
    MOV OUTA,cnt
  END

PRI off()
  ORG
    MOV OUTA,#0
  END

PRI transpose(srcAddress,destAddress)

  ORG
    SETQ  #8-1
    RDLONG s_array,srcAddress

    ROLNIB d_array+0, s_array+7, #0
    ROLNIB d_array+0, s_array+6, #0
    ROLNIB d_array+0, s_array+5, #0
    ROLNIB d_array+0, s_array+4, #0
    ROLNIB d_array+0, s_array+3, #0
    ROLNIB d_array+0, s_array+2, #0
    ROLNIB d_array+0, s_array+1, #0
    ROLNIB d_array+0, s_array+0, #0
    SPLITB d_array+0

    ROLNIB d_array+1, s_array+7, #1
    ROLNIB d_array+1, s_array+6, #1
    ROLNIB d_array+1, s_array+5, #1
    ROLNIB d_array+1, s_array+4, #1
    ROLNIB d_array+1, s_array+3, #1
    ROLNIB d_array+1, s_array+2, #1
    ROLNIB d_array+1, s_array+1, #1
    ROLNIB d_array+1, s_array+0, #1
    SPLITB d_array+1

    ROLNIB d_array+2, s_array+7, #2
    ROLNIB d_array+2, s_array+6, #2
    ROLNIB d_array+2, s_array+5, #2
    ROLNIB d_array+2, s_array+4, #2
    ROLNIB d_array+2, s_array+3, #2
    ROLNIB d_array+2, s_array+2, #2
    ROLNIB d_array+2, s_array+1, #2
    ROLNIB d_array+2, s_array+0, #2
    SPLITB d_array+2

    ROLNIB d_array+3, s_array+7, #3
    ROLNIB d_array+3, s_array+6, #3
    ROLNIB d_array+3, s_array+5, #3
    ROLNIB d_array+3, s_array+4, #3
    ROLNIB d_array+3, s_array+3, #3
    ROLNIB d_array+3, s_array+2, #3
    ROLNIB d_array+3, s_array+1, #3
    ROLNIB d_array+3, s_array+0, #3
    SPLITB d_array+3

    ROLNIB d_array+4, s_array+7, #4
    ROLNIB d_array+4, s_array+6, #4
    ROLNIB d_array+4, s_array+5, #4
    ROLNIB d_array+4, s_array+4, #4
    ROLNIB d_array+4, s_array+3, #4
    ROLNIB d_array+4, s_array+2, #4
    ROLNIB d_array+4, s_array+1, #4
    ROLNIB d_array+4, s_array+0, #4
    SPLITB d_array+4

    ROLNIB d_array+5, s_array+7, #5
    ROLNIB d_array+5, s_array+6, #5
    ROLNIB d_array+5, s_array+5, #5
    ROLNIB d_array+5, s_array+4, #5
    ROLNIB d_array+5, s_array+3, #5
    ROLNIB d_array+5, s_array+2, #5
    ROLNIB d_array+5, s_array+1, #5
    ROLNIB d_array+5, s_array+0, #5
    SPLITB d_array+5

    ROLNIB d_array+6, s_array+7, #6
    ROLNIB d_array+6, s_array+6, #6
    ROLNIB d_array+6, s_array+5, #6
    ROLNIB d_array+6, s_array+4, #6
    ROLNIB d_array+6, s_array+3, #6
    ROLNIB d_array+6, s_array+2, #6
    ROLNIB d_array+6, s_array+1, #6
    ROLNIB d_array+6, s_array+0, #6
    SPLITB d_array+6

    ROLNIB d_array+7, s_array+7, #7
    ROLNIB d_array+7, s_array+6, #7
    ROLNIB d_array+7, s_array+5, #7
    ROLNIB d_array+7, s_array+4, #7
    ROLNIB d_array+7, s_array+3, #7
    ROLNIB d_array+7, s_array+2, #7
    ROLNIB d_array+7, s_array+1, #7
    ROLNIB d_array+7, s_array+0, #7
    SPLITB d_array+7


    SETQ #8-1
    WRLONG d_array,destAddress

  d_array long 0[8]
  s_array long 0[8]
  END

JonnyMac · 2020-12-03 18:48

I would be inclined to force a return at the end so that the arrays are not treated like instructions when the transposition is done.

    _RET_    WRLONG d_array,destAddress

jrullan · 2020-12-03 19:40

JonnyMac wrote: »
I would be inclined to force a return at the end so that the arrays are not treated like instructions when the transposition is done.
    _RET_    WRLONG d_array,destAddress

Oh, ok. Thanks. Will test it later tonight.

Yanomani · 2020-12-03 20:40

JonnyMac wrote: »

During the P2 call yesterday, Chip said it was for simple obfuscation -- probably why it's named after Dr. Seuss. I tested it with this:

pub obfuscate(value, i) : result

  org
                mov     result, value
                rep     #1, i                           ' valid range is 1..31
                 seussf  result 
  end


pub recover(value, i) : result

  org
                mov     result, value
                neg     i, i
                add     i, #32
                rep     #1, i
                 seussf  result 
  end

When the SEUSSF/SEUSSR subject was mentioned during yesterday's meeting, I tried to link the following 2015 thread in the live chat session, but then my entire place went dark-mode due to a sudden city-wide power outage, and I lost my connection for ~30 minutes or so.

On return, I forgot to repeat the post. Corrected now...

https://forums.parallax.com/discussion/162238/what-are-your-expectations-regarding-seussf-and-seussr

We certainly owe Chip, Seairth and Heater a good use-case for those two instructions.
