Any recommendations for transposing arrays?
jrullan
Posts: 168
I'm working on porting an old project to the P2, and it relies on transposing an array of longs (32-bit integers) into an array of bytes (I know it's odd). See below:
The resulting byte array order could be either as shown or having b00 to be the msb.
I wonder if there are any instructions in the PASM instruction set that would help do this in the fewest cycles possible.
Comments
There's RCZR/RCZL that can shift two bits into the status flags at once, I guess.
But maybe you could send the 8 longs to 8 smartpins and then shift them out while you read them in?
Might require jumpers to 8 other pins...
See post by rogloh below.
Approach:
First read your 8 source longs into 8 consecutive registers.
Then do 8 ROLNIBs into an accumulator, followed by a SPLITB, and one output long is complete.
Repeat this 8 times in total to cover all the data, advancing which nibble the ROLNIBs take from the source each time and using a different output accumulator register for each pass.
Write back the 8 longs of accumulated data as a burst.
An example code snippet is below. If you can't afford to have it all unrolled in COG RAM, consider hub-exec, which should be almost as fast because there is no branching.
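As a way to check the idea off-chip, here is a small Python model of the steps above (not the PASM2 code itself). It assumes the instruction semantics as described in the P2 instruction spreadsheet: ROLNIB gives D = {D[27:0], S.NIBBLE[N]}, and SPLITB gathers every 4th bit of its source into the bytes of the result. The function names are hypothetical, and the choice of rolling long 0 in first (so it lands in the MSB of each output byte) is an assumption; reversing the ROLNIB order would flip the bit order within each byte.

```python
MASK32 = 0xFFFFFFFF

def rolnib(d, s, n):
    """Model of ROLNIB D,S,#N: D = {D[27:0], S.NIBBLE[N]}."""
    nibble = (s >> (4 * n)) & 0xF
    return ((d << 4) | nibble) & MASK32

def splitb(s):
    """Model of SPLITB: byte m of the result collects bit m of every nibble of s."""
    d = 0
    for nib in range(8):              # nibble index within s
        for m in range(4):            # bit index within that nibble
            bit = (s >> (4 * nib + m)) & 1
            d |= bit << (8 * m + nib)
    return d

def transpose_8x32(longs):
    """8 passes of (8 x ROLNIB + 1 x SPLITB), one output long per pass."""
    assert len(longs) == 8
    out = []
    for k in range(8):                # source nibble index for this pass
        acc = 0
        for src in longs:             # long 0 is rolled in first
            acc = rolnib(acc, src, k)
        out.append(splitb(acc))
    return out

def to_bytes(out_longs):
    """View the output longs as little-endian bytes, as they would sit in hub RAM."""
    return b"".join(x.to_bytes(4, "little") for x in out_longs)

def reference(longs):
    """Bit-by-bit check: byte j holds bit j of every long, long 0 in the MSB."""
    result = []
    for j in range(32):
        byte = 0
        for i, src in enumerate(longs):
            byte |= ((src >> j) & 1) << (7 - i)
        result.append(byte)
    return bytes(result)

sample = [0x12345678, 0xDEADBEEF, 0x00000001, 0x80000000,
          0xFFFFFFFF, 0x0F0F0F0F, 0xA5A5A5A5, 0x00000000]
assert to_bytes(transpose_8x32(sample)) == reference(sample)
```

Each pass handles one nibble position of all 8 longs, which is why a single SPLITB yields four finished output bytes at once.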
@rogloh,
from the P2ASM spreadsheet I see that the instruction rolnib has this description:
I guess my question is: when it refers to S.NIBBLE[N], does the index N refer to the index of the nibble within the long, or to a nibble starting at bit N?
I can see how the first 9 instructions work, but the second group of instructions doesn't seem to achieve the same result unless N is the bit index within the long instead of the nibble index.
Another approach, suggested by @Wuerfel_21 and which I think is also @JonnyMac's approach, would be to use the testb/bitc instructions:
This would take 32 clocks per D byte. Unrolled, it would take 1024 clocks to transpose the 8 longs.
ROLNIB does take nibble #N from the source (not bit #N). What is the pattern that you wish to follow after the first 8 bits?
I have assumed you are translating more bits from the original 8x32 bit longs to generate output data in this sequence:
then then etc
If that is not the pattern you were after, what is the rest of the output byte pattern after the first byte? Your original post does not show all the details.
In @rogloh's approach, the 9 instructions produce four bytes of output at a time, which is why only 8 sequences of 9 instructions are needed.
The code is already unrolled, taking 76 longs for the code and a further 18 longs of working memory. Ignoring transfers to and from hub RAM, it takes 72 instructions (144 clocks) to transpose the array of bits.
@rogloh and @AJL ,
Yes that's exactly what I'm looking for!!!
I think that what I'm struggling with is understanding how splitb works. I could follow the description up to the first 8 bits:
Do you know if this is documented better in another place?
In any case this definitely looks like the best approach!
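For anyone else puzzling over SPLITB, one way to see the bit shuffle is to print which source bit lands in each destination bit. The Python sketch below assumes the spreadsheet definition D = {S[31], S[27], S[23], S[19], ... S[12], S[8], S[4], S[0]}; the helper names are hypothetical.

```python
def splitb(s):
    """Model of SPLITB: source bit 4*nib + m moves to destination bit 8*m + nib."""
    d = 0
    for i in range(32):
        nib, m = divmod(i, 4)         # nibble index, bit index within the nibble
        d |= ((s >> i) & 1) << (8 * m + nib)
    return d

def source_bit_for(dest_bit):
    """Which source bit ends up at `dest_bit` after SPLITB."""
    byte, pos = divmod(dest_bit, 8)   # destination byte, bit position within it
    return 4 * pos + byte

# Print the map MSB first, one destination byte per row.
for byte in range(3, -1, -1):
    sources = [source_bit_for(8 * byte + pos) for pos in range(7, -1, -1)]
    print(f"D byte {byte}: " + " ".join(f"S[{b}]" for b in sources))
```

Read row by row, destination byte m is just bit m of each of the eight source nibbles, with the top nibble supplying the byte's MSB. That is what turns eight stacked nibbles into four transposed bytes.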
Wow! Thank you for your time and kindness. I definitely need to test this directly on the P2.
http://forums.parallax.com/discussion/comment/1479691/#Comment_1479691
Wow! I am a visual person myself, that definitely helps!!! Thank you again @rogloh. Can't wait to see the final Propeller 2 manual.
An idea for a future P2+ or P3: ALTGN applies to multiple consecutive GETNIB/ROLNIB instructions if D[31]=1. Similarly for other ALTSx & ALTGx. The eight ROR instructions above could be replaced by one ALTGN and one nibble increment.
What was SEUSSF supposed to be for? That's a strange looking one...
You are right @Rayman, it is genius.
Thank you all for your input. Here I am attaching an example for testing it out. Might be useful to someone out there.
Oh, ok. Thanks. Will test it later tonight.
When the SEUSSF/SEUSSR subject was mentioned during yesterday's meeting, I tried to link the following 2015 thread into the live chat session, but then my entire place went dark-mode due to a sudden city-wide power outage, and I lost my connection for about 30 minutes.
On return, I forgot to repeat the post. Corrected now...
https://forums.parallax.com/discussion/162238/what-are-your-expectations-regarding-seussf-and-seussr
We certainly owe Chip, Seairth and Heater a good use-case for those two instructions.