Performance boost from spin to asm?

__red__ · 2011-02-15 11:12

Admittedly, I'm three days into actually using my Propeller, so please forgive the greenness of my question:

repeat
  outa[SHLD]~                  'Load 'em up!
  outa[CLK]~~                  'Get clock in correct position for...
  outa[SHLD]~~                 'Clock transitions now shift data.
  v1 := v1 << 1 + ina[SER]     'First bit is already in position.
  repeat (32)
    outa[CLK]~~                'Transit up and shift
    v1 := v1 << 1 + ina[SER]   'Read bit
    outa[CLK]~                 'Drop the clock in preparation for next cycle
  v2 := v1

Given the simplicity of the above code, would rewriting it in asm gain me any significant performance increase? It currently takes 3.1ms to execute the outer loop. Since my application requires me to read 260 keys instead of just 32, I need to be able to poll a fair number of them on one cog before deploying to more.

Thanks,

Red

Mike Green · 2011-02-15 11:31

I'm guessing, but it looks like you're trying to read a chain of 74HC165s.

Yes, you'll get a tremendous speed increase by redoing this in assembly language. There are some very fast techniques for doing this using a cog counter to generate the clock, but a straightforward inner loop will take about 5 instructions to shift in a bit from the 74HC165, then another few instructions to handle each long that results and gets stored in main (hub) memory as well as managing the memory addresses and counts. You're talking about roughly 10us per long (32 bits).
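As a rough sanity check on that figure, the estimate can be worked through under two assumptions that are not stated in the thread: an 80 MHz system clock, and 4 clocks for each ordinary PASM instruction.

```latex
\begin{align*}
t_{\text{instr}} &= \frac{4\ \text{clocks}}{80\ \text{MHz}} = 50\ \text{ns} \\
t_{\text{bit}}   &\approx 5 \times t_{\text{instr}} = 250\ \text{ns} \\
t_{\text{long}}  &\approx 32 \times t_{\text{bit}} = 8\ \mu\text{s}
\end{align*}
```

The shifting alone comes to about 8us per long under those assumptions, so the extra instructions for the hub write and the address and count bookkeeping plausibly bring the total to roughly 10us.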

__red__ · 2011-02-15 14:48

Mike,

Thank you for the reply and yes, you're right - I am reading from a chain of 74HC165s.

I've seen code to do the clock like you said but I'm fearful that my reading may lose sync with the counter so I've avoided that method.

Looks like I have some reading to do wrt asm. 3.1ms to 10us would be well worth the time invested in learning asm.

Thanks,

Red

Mike Green · 2011-02-15 16:06

Look at some of the Object Exchange objects that manage SPI devices using assembly language. Look at the low level SPI read routine in the Winbond Flash Memory I/O object in the Object Exchange.

__red__ · 2011-06-02 08:01

Mike Green wrote: »

Look at some of the Object Exchange objects that manage SPI devices using assembly language. Look at the low level SPI read routine in the Winbond Flash Memory I/O object in the Object Exchange.

So, I've written the following which appears to work very well in Gear. Looking forward to testing it on real hardware when I get home tonight:

DAT
              org 0
entrypoint    or DIRA, DIRMASK                  ' Set pins to stun
:outerloop    andn OUTA, LOADPIN                ' Set LD to low
              andn OUTA, CLKPIN                 ' Set clockpin low
              or OUTA, LOADPIN                  ' Set Load high
              mov bitcntr, #32                  ' We're reading 32 bits...
:loop         andn OUTA, CLKPIN                 ' Set CLK low (yes, this is redundant on the first pass)
              ror INA, #28 NR,WC                ' C should contain value of SER (pin 27)
              rcl KEYSTATE, #1                  ' C should now be pushed into LSB of keystate
              or OUTA, CLKPIN                   ' Set CLK high
              djnz bitcntr, #:loop              ' Around the loop we go
              jmp #:outerloop                   ' ... and back for another snapshot

DIRMASK       long      %00000110_00000000_00000000_01111111
LOADPIN       long      %00000010_00000000_00000000_00000000
CLKPIN        long      %00000100_00000000_00000000_00000000  
SERPIN        long      %00001000_00000000_00000000_00000000
BITCNTR       long      $0
KEYSTATE      long      $0

Probably not the perfect code (like I should be using a counter instead of bit-banging myself) but I'm happy with the start that I've made :-)

Thanks again,

Red

Mike Green · 2011-06-02 08:06

In the future, please do not use the [ QUOTE ] tags for including code, particularly Propeller code. Use [ CODE ] tags instead for small code fragments or use the Attachment Manager (use the Go Advanced button) when posting a reply with code included.

__red__ · 2011-06-02 08:26

I was actually re-formatting it within a code block while you were posting.

Done!

Of course, now through my browsing I'm reading the "Tricks and Traps" pdf and it looks like

ror INA, #28 NR,WC

may be invalid, even with the NR flag.

edit:

That Tricks and Traps document saved my life again it seems:

"This is because, no matter how many positions are shifted, the carry (if written) always gets the initial value of bit 0 for right shifts/rotates or bit 31 for left shifts/rotates"
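Given that rule, the ror line puts bit 0 of its operand into C rather than pin 27. One common PASM idiom for sampling a single pin into C is test with a one-bit mask and INA in the source field: with WC, test sets C to the parity of (mask AND INA), which for a one-bit mask is simply that pin's state. Below is a minimal sketch of the loop above with only that line swapped. It reuses the pin masks from the post; the start method is a hypothetical launcher added so the file compiles on its own, and none of this has been tested on hardware.

```spin
PUB start
  cognew(@entrypoint, 0)                        ' Hypothetical launcher: run the PASM below in a new cog

DAT
              org 0
entrypoint    or DIRA, DIRMASK                  ' Make the output pins outputs
:outerloop    andn OUTA, LOADPIN                ' Set LD low (load the shift registers)
              andn OUTA, CLKPIN                 ' Set CLK low
              or OUTA, LOADPIN                  ' Set LD high (clock edges now shift data)
              mov BITCNTR, #32                  ' We're reading 32 bits...
:loop         andn OUTA, CLKPIN                 ' Set CLK low
              test SERPIN, INA wc               ' C := state of the SER pin (parity of a one-bit mask AND INA)
              rcl KEYSTATE, #1                  ' Shift C into the LSB of KEYSTATE
              or OUTA, CLKPIN                   ' Set CLK high (shift the next bit out)
              djnz BITCNTR, #:loop              ' Around the loop we go
              jmp #:outerloop                   ' ... and back for another snapshot

DIRMASK       long      %00000110_00000000_00000000_01111111
LOADPIN       long      %00000010_00000000_00000000_00000000
CLKPIN        long      %00000100_00000000_00000000_00000000
SERPIN        long      %00001000_00000000_00000000_00000000
BITCNTR       long      $0
KEYSTATE      long      $0
```

As in the original, KEYSTATE stays in cog RAM; a real driver would still need a wrlong to hand each snapshot to hub memory.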

JasonDorie · 2011-06-02 17:46

I think I read at one point that there is a very approximate 300:1 ratio of time spent executing an operation in SPIN vs executing that same operation in PASM. I realize that the two are very different, but SPIN has a significant amount of overhead because it is interpreted, so unless you're doing something that is essentially atomic in SPIN (like a multiply), it'll always be an order of magnitude faster in PASM.

Given that your 3.1ms loop becomes 10us, or 310x faster, that seems about right.
