Fast byte swapping? — Parallax Forums

I'm dealing with network packets and have to swap a lot of bytes.
I thought the REV instruction might help, but on the P2 it reverses bits, not bytes, which isn't what I need. I know inlining the code would remove the function call overhead, but without something more succinct that would make a mess, given how many times I'm calling it.

PUB FlipWord(bytein):byteout|a,b,c,d
  a := (bytein>>24)&$FF
  b := (bytein>>16)&$FF
  c := (bytein>>8)&$FF
  d := (bytein)&$FF
  byteout := (d<<24)|(c<<16)|(b<<8)|a

Comments

  • There isn't any sort of spin intrinsic for it, but the movbyts assembly instruction is exactly what you need:

    PUB FlipWord(bytein):r
    org
        movbyts bytein,#%%0123
    end
    return bytein
    

    If you're using flexspin, you can use asm/endasm instead of org/end, and the compiler will then be able to inline the function.
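
    A minimal sketch of that asm/endasm form, assuming flexspin (the method name bswap4_inline is only illustrative and does not appear elsewhere in the thread). The comments spell out how the %%0123 selector maps bytes:

    ```spin2
    PUB bswap4_inline(val) : result
      ' %%0123 is a base-4 literal, i.e. %00_01_10_11.
      ' Each 2-bit field picks the source byte for one destination byte,
      ' starting with destination byte 3:
      '   byte 3 <- byte 0, byte 2 <- byte 1, byte 1 <- byte 2, byte 0 <- byte 3
      result := val
      asm
        movbyts result, #%%0123
      endasm
    ```

    Copying val into result first leaves the parameter untouched, the same approach as a plain mov followed by movbyts.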

  • Here are the pure Spin versions I've been using in my network stack, but @Wuerfel_21's inline ASM version is better, and I'll rewrite my two functions to use movbyts instead.

    PUB bswap2(w)
      return w.byte[0] << 8 | w.byte[1]
    
    PUB bswap4(l)
      return l.byte[0] << 24 | l.byte[1] << 16 | l.byte[2] << 8 | l.byte[3]
    
  • JonnyMac Posts: 8,097

    For the P2 you need to declare a return variable.

    I frequently do timing tests on little bits of code, and decided to compare inline PASM against pure Spin in two forms.

    pub swap_v1(val) : result  
    
      org
                    mov     result, val
                    movbyts result, #%%0123
      end
    
    
    pub swap_v2(val) : result
    
      result := val.byte[0] << 24 | val.byte[1] << 16 | val.byte[2] << 8 | val.byte[3]
    
    
    pub swap_v3(val) : result
    
      result.byte[3] := val.byte[0]
      result.byte[2] := val.byte[1]
      result.byte[1] := val.byte[2]
      result.byte[0] := val.byte[3]
    

    The results:

  • Ariba Posts: 2,588

    It's quite different in FlexSpin. PASM with org - end is the slowest here, because the 2 instructions must be copied to cog RAM and then executed (as in Chip's Spin2). But with ASM - ENDASM the compiler generates the PASM code truly inline and executes it with hubexec.
    Results:

    v1  (org-end)      98
    v1  (asm-endasm)    6
    v2                 58
    v3                 50
    

    Andy

  • Thanks all, really informative!

  • Hello JonnyMac and kwagner,
    Here is my Spin2 method to reverse the order of bytes in a long value.

    HydraHacker

      _clkfreq = 180_000_000
      msk1     = $00_ff_00_ff
      msk2     = $ff_00_ff_00
    
    PUB main() 
       revBytes($AABBCCDD)
    
    PUB revBytes(v) : r
       r := v ror 8 & msk2 | v ror 24 & msk1
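
    Tracing the example value through that expression, with the grouping written out explicitly. This is only a sketch: revBytesExplicit is an illustrative name, and the parentheses assume Spin2 precedence, where rotates bind tighter than & and & binds tighter than |.

    ```spin2
    PUB revBytesExplicit(v) : r
      ' For v = $AABBCCDD:
      '   v ror 8             = $DDAABBCC
      '   ... & $FF_00_FF_00  = $DD00BB00   (result bytes 3 and 1)
      '   v ror 24            = $BBCCDDAA
      '   ... & $00_FF_00_FF  = $00CC00AA   (result bytes 2 and 0)
      '   OR of the two       = $DDCCBBAA
      r := ((v ror 8) & $FF_00_FF_00) | ((v ror 24) & $00_FF_00_FF)
    ```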
    