Fast byte swapping?

kwagner · 2022-05-31 17:48

I'm dealing with network packets and have to swap a lot of bytes.
I thought the REV instruction might help, but on the P2 it reverses bits, not bytes, which isn't what I need. I know that inlining the operations would remove the function call overhead, but without something more succinct it would make a mess, given how many times I call it.

PUB FlipWord(bytein):byteout|a,b,c,d
  a := (bytein>>24)&$FF
  b := (bytein>>16)&$FF
  c := (bytein>>8)&$FF
  d := (bytein)&$FF
  byteout := (d<<24)|(c<<16)|(b<<8)|a
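
To make the bits-versus-bytes distinction concrete, here is a small host-side sketch in C (not P2 code). It contrasts a full 32-bit bit reversal, which is what the post says REV does, with the byte reversal that FlipWord computes. The names bit_reverse32 and byte_reverse32 are made up for this illustration.

```c
#include <stdint.h>

/* Reverse all 32 bits: bit 0 swaps with bit 31, bit 1 with bit 30, and so on.
   This models the bit reversal the post attributes to REV. */
uint32_t bit_reverse32(uint32_t x) {
    uint32_t out = 0;
    for (int i = 0; i < 32; i++) {
        out = (out << 1) | (x & 1u);  /* move the lowest remaining bit into the result */
        x >>= 1;
    }
    return out;
}

/* Reverse the order of the four bytes, leaving the bits inside each byte alone.
   This is the operation needed for network byte order conversion. */
uint32_t byte_reverse32(uint32_t x) {
    return ((x & 0x000000FFu) << 24) |
           ((x & 0x0000FF00u) << 8)  |
           ((x >> 8) & 0x0000FF00u)  |
           (x >> 24);
}
```

For the input $AABBCCDD, byte reversal gives $DDCCBBAA, while bit reversal gives $BB33DD55, because each byte is also mirrored internally.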

Wuerfel_21 · 2022-05-31 19:11

There isn't any sort of spin intrinsic for it, but the movbyts assembly instruction is exactly what you need:

PUB FlipWord(bytein):r
  org
    movbyts bytein,#%%0123
  end
  return bytein

If you're using flexspin, you can use asm/endasm instead of org/end, and the compiler will then be able to inline the function.
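
For anyone wondering what the %%0123 selector means, here is a host-side C model of the byte selection (a sketch, not P2 code). It assumes, from my reading of the P2 instruction description, that each 2-bit field of the selector names the source byte for one result byte, with bits [1:0] feeding byte 0 and bits [7:6] feeding byte 3. The function name movbyts_model is made up for this illustration.

```c
#include <stdint.h>

/* Model of a byte-select move (assumed MOVBYTS semantics, see lead-in):
   result byte i is taken from source byte number sel[2*i+1 : 2*i]. */
uint32_t movbyts_model(uint32_t d, unsigned sel) {
    uint32_t out = 0;
    for (int i = 0; i < 4; i++) {
        unsigned src = (sel >> (2 * i)) & 3u;      /* which source byte to fetch */
        uint32_t byte = (d >> (8 * src)) & 0xFFu;  /* fetch it */
        out |= byte << (8 * i);                    /* place it in result byte i */
    }
    return out;
}
```

The base-4 literal %%0123 is the bit pattern %00_01_10_11, or 0x1B in C, so movbyts_model(0xAABBCCDD, 0x1B) yields 0xDDCCBBAA. Under the same model, %%3210 (0xE4) leaves the value unchanged and %%2301 (0xB1) swaps the bytes within each 16-bit half.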

Electrodude · 2022-06-02 03:34

Here are the pure Spin versions I've been using in my network stack, but @Wuerfel_21's inline ASM version is better, and I'll rewrite my two functions to use movbyts instead.

PUB bswap2(w)
  return w.byte[0] << 8 | w.byte[1]

PUB bswap4(l)
  return l.byte[0] << 24 | l.byte[1] << 16 | l.byte[2] << 8 | l.byte[3]

JonnyMac · 2022-06-02 13:57

For the P2 you need to declare a return variable.

I frequently do timing tests on little bits of code, and decided to compare inline PASM against pure Spin in two forms.

pub swap_v1(val) : result  

  org
                mov     result, val
                movbyts result, #%%0123
  end


pub swap_v2(val) : result

  result := val.byte[0] << 24 | val.byte[1] << 16 | val.byte[2] << 8 | val.byte[3]


pub swap_v3(val) : result

  result.byte[3] := val.byte[0]
  result.byte[2] := val.byte[1]
  result.byte[1] := val.byte[2]
  result.byte[0] := val.byte[3]

The results:

Ariba · 2022-06-02 17:26

It's quite different in FlexSpin. There, PASM with org-end is the slowest, because the two instructions must first be copied to cog RAM and then executed (as in Chip's Spin2). With asm-endasm, however, the compiler generates the PASM code truly inline and executes it with hubexec.
Results:

v1  (org-end)      98
v1  (asm-endasm)    6
v2                 58
v3                 50

Andy

kwagner · 2022-06-03 12:10

Thanks all, really informative!

HydraHacker · 2022-06-25 02:30

Hello JonnyMac and kwagner,
I have posted my spin2 method to reverse the order of bytes in a long value.

HydraHacker

  _clkfreq = 180_000_000
  msk1     = $00_ff_00_ff
  msk2     = $ff_00_ff_00

PUB main() 
   revBytes($AABBCCDD)

PUB revBytes(v) : r
   r := v ror 8 & msk2 | v ror 24 & msk1
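
To show why the rotate-and-mask expression works, here is a host-side C model of it (a sketch, not P2 code). It assumes that in Spin2 `ror` binds tighter than `&`, and `&` tighter than `|`, so the expression groups as (v ror 8 & msk2) | (v ror 24 & msk1). The names ror32 and rev_bytes_model are made up for this illustration.

```c
#include <stdint.h>

/* Rotate a 32-bit value right by n bits. */
static uint32_t ror32(uint32_t x, unsigned n) {
    n &= 31u;
    return n ? (x >> n) | (x << (32u - n)) : x;
}

/* Model of: r := v ror 8 & msk2 | v ror 24 & msk1 */
uint32_t rev_bytes_model(uint32_t v) {
    /* Rotating right by 8 moves byte 0 to position 3 and byte 2 to position 1;
       the mask keeps only those two positions. */
    uint32_t a = ror32(v, 8) & 0xFF00FF00u;
    /* Rotating right by 24 (left by 8) moves byte 3 to position 0 and
       byte 1 to position 2; the mask keeps only those two positions. */
    uint32_t b = ror32(v, 24) & 0x00FF00FFu;
    return a | b;
}
```

For v = $AABBCCDD the two halves are $DD00BB00 and $00CC00AA, which OR together to $DDCCBBAA.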
