CRC8 — Parallax Forums

CRC8

JonnyMac Posts: 9,182
edited 2020-03-08 15:28 in Propeller 2
While browsing through the instruction set I happened upon CRCBIT -- this looked interesting, so I thought I should try it.

I started with a Spin1 routine by Micah Dowty that is part of my 1-Wire object. This works without modification in Spin2.
pub crc8x(p_src, n) : crc | b

'' Returns CRC8 of n bytes at p_src
'' -- implementation by Micah Dowty

  repeat n
    b := byte[p_src++]
    repeat 8
      if (crc ^ b) & 1
        crc := (crc >> 1) ^ $8C
      else
        crc >>= 1
      b >>= 1
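For anyone who wants to cross-check results on a PC, the same loop ports almost line for line to C. This is only a sketch and not part of the original object; the name `crc8x_c` is made up here. It assumes the CRC starts at 0 (Spin result variables start at zero) and uses the same constant, $8C, which is the bit-reversed form of the Dallas/Maxim 1-Wire polynomial x^8 + x^5 + x^4 + 1.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC8, data taken LSB first, reflected polynomial $8C.
   Follows the Spin loop step for step. */
static uint8_t crc8x_c(const uint8_t *src, size_t n)
{
    uint8_t crc = 0;                    /* Spin result variables start at 0 */

    while (n--) {
        uint8_t b = *src++;             /* b := byte[p_src++] */
        for (int i = 0; i < 8; i++) {
            if ((crc ^ b) & 1)
                crc = (uint8_t)((crc >> 1) ^ 0x8C);
            else
                crc >>= 1;
            b >>= 1;
        }
    }
    return crc;
}
```

Feeding it the nine ASCII bytes of "123456789" should return $A1, the commonly published check value for the Dallas/Maxim CRC-8.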
I translated Micah's Spin1 routine to inline PASM2.
pub crc8x_pasm(p_src, n) : crc | b, t1

'' Returns CRC8 of n bytes at p_src
'' -- conversion to PASM2 by JonnyMac

  org
.loop                   rdbyte  b, p_src
                        add     p_src, #1
                        rep     @.done, #8
                        mov     t1, crc
                        xor     t1, b
                        testb   t1, #0                  wc
                        shr     crc, #1
        if_c            xor     crc, #$8C
                        shr     b, #1
.done
                        djnz    n, #.loop
  end
Of course, there was a big speed improvement. BTW, I'm just coming up to speed on PASM2. In the likely event that this routine can be made more efficient, please show me how. Finally, I updated the code with CRCBIT.
Edit: This code has been updated with a suggestion by @AJL (see below). It cuts about 300 ticks out of the test timing.
pub crc8x_pasm2(p_src, n) : crc | b, t1

'' Returns CRC8 of n bytes at p_src

  org
.loop                   rdbyte  b, p_src
                        add     p_src, #1
                        rep     @.done, #8
                        shr     b, #1                 wc
                        crcbit  crc, #$8C
.done
                        djnz    n, #.loop
  end
Final results running a string of 19 characters through each method:
Test1: CRC is $82 in 63536 ticks.
Test2: CRC is $82 in 3360 ticks.
Test3: CRC is $82 in 2128 ticks.
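To see why two instructions can stand in for the longer loop body, it may help to model CRCBIT in C. The sketch below assumes the documented behavior of CRCBIT D,S: if (C XOR D[0]) is set, D becomes (D >> 1) XOR S, otherwise D becomes D >> 1. The function names are invented for illustration. SHR b, #1 WC drops the next data bit (LSB first) into C, and CRCBIT folds that bit into the CRC.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of CRCBIT D,S. The argument c stands in for the carry flag,
   which holds the next data bit. */
static uint32_t crcbit(uint32_t d, uint32_t s, uint32_t c)
{
    return ((c ^ d) & 1u) ? (d >> 1) ^ s : (d >> 1);
}

/* Same structure as crc8x_pasm2: per byte, 8 x (SHR b,#1 WC + CRCBIT). */
static uint8_t crc8_crcbit(const uint8_t *src, size_t n)
{
    uint32_t crc = 0;

    while (n--) {
        uint32_t b = *src++;            /* rdbyte b, p_src / add p_src, #1 */
        for (int i = 0; i < 8; i++) {   /* rep @.done, #8 */
            uint32_t c = b & 1u;        /* shr b, #1 wc : C = bit shifted out */
            b >>= 1;
            crc = crcbit(crc, 0x8C, c); /* crcbit crc, #$8C */
        }
    }
    return (uint8_t)crc;
}
```

Under those assumptions the model returns $A1 for the ASCII string "123456789", the same check value the bit-by-bit routine produces.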

Comments

  • I'd call a 26x speed improvement a winner!
  • A small improvement:
    pub crc8x_pasm2(p_src, n) : crc | b, t1
    
    '' Returns CRC8 of n bytes at p_src
    '' -- implementation by JonnyMac
    
      org
    .loop                   rdlong  b, p_src
                            add     p_src, #1
                            rep     @.done, #8
    '                        testb   b, #0                   wc
                            shr b, #1                        wc
                            crcbit  crc, #$8C
    '                        shr     b, #1
    .done
                            djnz    n, #.loop
      end
    
  • JonnyMac Posts: 9,182
    edited 2020-03-07 20:51
    Thank you! That's one of those things that is so simple I looked right past it. With this particular test string the updated code is almost 30x faster than the original Spin routine.
  • Rayman Posts: 14,789
    Wonder how FastSpin would do...
  • cgracey Posts: 14,232
    There's a CRCNIB instruction, too, that can process four bits at a time.
  • garryj Posts: 337
    edited 2020-03-08 00:00
    Here is a code block from my USB host code that calculates a 5-bit CRC on the 11-bit frame number sent every 1 ms at full speed. It uses a combination of CRCNIB and CRCBIT:
                    mov     utx, #OUT_SOP
                    call    #isr_utx_sof                    ' Send sync byte
                    mov     icrc, #$1f                      ' Prime the CRC5 pump
                    mov     sof_pkt, frame                  ' CRC5 calculation done on the 11-bit frame number value
                    rev     sof_pkt                         ' Input data reflected
                    mov     utx, #PID_SOF
                    call    #utx_byte                       ' Send token PID byte
                    setq    sof_pkt                         ' CRCNIB setup for data bits 0..7
                    crcnib  icrc, #USB5_POLY
                    crcnib  icrc, #USB5_POLY                ' Data bits 0..7 calculated
                    getbyte utx, frame, #0                  ' Send the low byte of the frame number
                    call    #utx_byte
                    shl     sof_pkt, #8                     ' Shift out processed bits to set up CRCBIT * 3
                    rep     #2, #3                          ' Three data bits left to process
                    shl     sof_pkt, #1             wc
                    crcbit  icrc, #USB5_POLY                ' Data bits 8..10 calculated
                    xor     icrc, #$1f                      ' Final XOR value
                    getbyte utx, frame, #1                  ' Send remaining frame number bits
                    shl     icrc, #3                        ' Merge CRC to bits 7..3 of the final token byte
                    or      utx, icrc
                    call    #utx_byte                       ' Last start-of-frame byte is on the wire
    
    Doh! Edit: Note that USB puts bits on the wire LSB first.
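
    A rough C model of the CRC5 steps in the block above, for checking values off-chip. Several things here are assumptions: the value of USB5_POLY is not shown in the post, so $14 (x^5 + x^2 + 1, bit-reversed for a right-shifting CRC) is a guess; CRCNIB is modeled as four CRCBIT steps fed from Q[31] downward; and all function names are invented for this sketch.

    ```c
    #include <stdint.h>

    #define USB5_POLY 0x14u  /* assumption: x^5 + x^2 + 1, bit-reversed */

    static uint32_t rev32(uint32_t v)            /* REV: bit 0 -> bit 31 */
    {
        uint32_t r = 0;
        for (int i = 0; i < 32; i++) { r = (r << 1) | (v & 1u); v >>= 1; }
        return r;
    }

    static uint32_t crcbit(uint32_t d, uint32_t s, uint32_t c)
    {
        return ((c ^ d) & 1u) ? (d >> 1) ^ s : (d >> 1);
    }

    /* Assumed CRCNIB: four CRCBIT steps, Q[31] first, Q shifted left by 4. */
    static uint32_t crcnib(uint32_t d, uint32_t s, uint32_t *q)
    {
        for (int i = 0; i < 4; i++) {
            uint32_t c = *q >> 31;
            *q <<= 1;
            d = crcbit(d, s, c);
        }
        return d;
    }

    /* Mirrors the posted sequence: 2 x CRCNIB for bits 0..7, 3 x CRCBIT for 8..10. */
    static uint8_t sof_crc5(uint32_t frame)
    {
        uint32_t icrc = 0x1F;                    /* mov icrc, #$1f */
        uint32_t pkt = rev32(frame);             /* rev sof_pkt */
        uint32_t q = pkt;                        /* setq sof_pkt */

        icrc = crcnib(icrc, USB5_POLY, &q);
        icrc = crcnib(icrc, USB5_POLY, &q);
        pkt <<= 8;                               /* shl sof_pkt, #8 */
        for (int i = 0; i < 3; i++) {            /* rep #2, #3 */
            uint32_t c = pkt >> 31;              /* shl sof_pkt, #1 wc */
            pkt <<= 1;
            icrc = crcbit(icrc, USB5_POLY, c);
        }
        return (uint8_t)(icrc ^ 0x1F);           /* final XOR */
    }

    /* Reference: plain bit-serial CRC5 over the 11 frame bits, LSB first. */
    static uint8_t crc5_serial(uint32_t frame)
    {
        uint32_t crc = 0x1F;
        for (int i = 0; i < 11; i++)
            crc = crcbit(crc, USB5_POLY, (frame >> i) & 1u);
        return (uint8_t)(crc ^ 0x1F);
    }
    ```

    With these assumptions, frame number 0 gives a CRC5 of $02, and the mixed CRCNIB/CRCBIT path agrees with the bit-serial reference for every 11-bit frame number.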
  • JonnyMac Posts: 9,182
    edited 2020-03-08 15:29
    Thank you for that, Gary. Here's an update that uses CRCNIB and knocks another 600 ticks off the test timing for the string I'm using.
    pub crc8x_pasm3(p_src, n) : crc | b
    
    '' Returns CRC8 of n bytes at p_src
    
      org
    .loop                   rdbyte  b, p_src
                            add     p_src, #1
                            rev     b
                            setq    b
                            crcnib  crc, #$8C
                            crcnib  crc, #$8C
                            djnz    n, #.loop
      end
    
    Test1: CRC is $82 in 63736 ticks.
    Test2: CRC is $82 in 3336 ticks.
    Test3: CRC is $82 in 2104 ticks.
    Test4: CRC is $82 in 1504 ticks.
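
    A rough C model of the CRCNIB loop above, for anyone checking results on a PC. It assumes CRCNIB behaves like four CRCBIT steps fed from Q[31] downward, with Q shifted left by four afterward; that ordering is why REV is needed to move the byte's bit 0 up to bit 31. The helper names are invented for this sketch.

    ```c
    #include <stddef.h>
    #include <stdint.h>

    /* REV: reverse all 32 bits, so bit 0 ends up at bit 31. */
    static uint32_t rev32(uint32_t v)
    {
        uint32_t r = 0;
        for (int i = 0; i < 32; i++) {
            r = (r << 1) | (v & 1u);
            v >>= 1;
        }
        return r;
    }

    /* Assumed CRCNIB D,S: four CRCBIT steps, Q[31] first, Q shifted left by 4. */
    static uint32_t crcnib(uint32_t d, uint32_t s, uint32_t *q)
    {
        for (int i = 0; i < 4; i++) {
            uint32_t c = *q >> 31;
            *q <<= 1;
            d = ((c ^ d) & 1u) ? (d >> 1) ^ s : (d >> 1);
        }
        return d;
    }

    /* Same structure as crc8x_pasm3: rdbyte, rev, setq, crcnib, crcnib. */
    static uint8_t crc8_crcnib(const uint8_t *src, size_t n)
    {
        uint32_t crc = 0;

        while (n--) {
            uint32_t q = rev32(*src++);     /* rdbyte + rev + setq */
            crc = crcnib(crc, 0x8C, &q);    /* data bits 0..3 */
            crc = crcnib(crc, 0x8C, &q);    /* data bits 4..7 */
        }
        return (uint8_t)crc;
    }
    ```

    Under those assumptions the model returns $A1 for the ASCII string "123456789", matching the bit-at-a-time versions.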
    
  • Cluso99 Posts: 18,069
    IIRC there is a post in the Tricks and Traps thread that links to the thread discussing the CRC instructions and their use in the various CRC16 routines used in communications.
  • I didn't see it. Still, until a "Propeller 2 Cookbook" can be produced to help those coming from other devices, I don't think a bit of redundancy is a problem, especially since I started with Spin, which may help those not yet comfortable with PASM move toward it. My favorite feature of Spin2 is inline PASM -- I will be converting lots of utility routines like this for P2 projects.
  • JonnyMac,
    yeah we need a cookbook with good explanations and diagrams/graphs/whatever, especially for the smart pin functionality. I've read the current doc on them a bit and I still don't really understand most of it well enough to use.
    I'm sure if I reread them a bunch and did a bunch of trial and error experimenting with a scope I could figure them out, but it sure will be nice when we have better docs.
  • Cluso99 Posts: 18,069
    Here is a link to the discussion from the time the CRCBIT instruction was added to the P2.
    https://forums.parallax.com/discussion/162298/prop2-fpga-files-updated-2-june-2018-final-version-32i/p110
  • JonnyMac Posts: 9,182
    edited 2020-03-08 03:39
    Roy Eltham wrote: »
    JonnyMac,
    yeah we need a cookbook with good explanations and diagrams/graphs/whatever, especially for the smart pin functionality. I've read the current doc on them a bit and I still don't really understand most of it well enough to use.
    I'm sure if I reread them a bunch and did a bunch of trial and error experimenting with a scope I could figure them out, but it sure will be nice when we have better docs.
    Better docs will come. In the meantime, I'm experimenting with P1 -> P2 code conversions to teach myself and help others outside the Propeller world get going. Every time I'm at a tech con and tell people about the Propeller, they get very excited, yet few make the move because there is such a dearth of practical information that enthusiasts and engineers can use to get up and running very quickly.
  • Perhaps I'm not familiar with the newer syntax, but you are reading a long and only CRCing a byte.

    Shouldn't you CRCNIB 8 times instead of 2?

    I see you're calling rdlong but processing n bytes, so shouldn't it really be a rdbyte?
  • we need a cookbook with good explanations and diagrams/graphs/whatever, especially for the smart pin functionality.

    And a glossary.


    Bill M.
  • JonnyMac,
    Yeah I know they will, Parallax has always been pretty good with docs for their stuff.

    I've been porting the Simple Libraries stuff over to FlexC and making them work on P2, and I'm stalled currently on the last few bits of libsimpletools that use the P1 counter modules in various ways (pwm, rctime, etc.). I could just bit bang some of them, but they should use the smartpins.
    The existing docs are pretty terse on each mode, and the vocabulary doesn't match up with the P1 counter module docs very well. Tomorrow I am going to get set up with my scope on the P2 and do a bunch of trial and error testing...
  • pedward wrote: »
    Perhaps I'm not familiar with the newer syntax, but you are reading a long and only CRCing a byte.

    Shouldn't you CRCNIB 8 times instead of 2?

    I see you're calling rdlong but processing n bytes, so shouldn't it really be a rdbyte?

    Yes, I should have used rdbyte, and I have gone back and fixed that in my examples. Thank you for the catch. I couldn't find it in the docs, but I believe I read somewhere that rdlong in the P2 works on any boundary, which explains why the code worked.
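
    A small C model of why the rdlong version still produced the right CRC. It assumes, as recalled above, that RDLONG accepts any byte address, and that hub longs are little-endian, so the addressed byte lands in bits 7..0 of the result. The function names are invented for this sketch. Note that a long read near the end of a buffer also fetches up to three bytes past it, which is harmless here but is one more reason to prefer rdbyte.

    ```c
    #include <stddef.h>
    #include <stdint.h>

    /* Model of a little-endian long read at any byte address. */
    static uint32_t rdlong_model(const uint8_t *hub, size_t addr)
    {
        return (uint32_t)hub[addr]
             | (uint32_t)hub[addr + 1] << 8
             | (uint32_t)hub[addr + 2] << 16
             | (uint32_t)hub[addr + 3] << 24;
    }

    /* The 8-pass loop only ever shifts out bits 0..7 of the value it read;
       this collects the bits the loop would actually see. */
    static uint8_t low_byte_seen(uint32_t b)
    {
        uint8_t out = 0;
        for (int i = 0; i < 8; i++) {
            out |= (uint8_t)((b & 1u) << i);
            b >>= 1;
        }
        return out;
    }
    ```

    For every address, odd or even, the bits the loop consumes are exactly the byte at that address; the upper 24 bits of the long are never shifted out.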
  • Roy Eltham wrote: »
    JonnyMac,
    Yeah I know they will, Parallax has always been pretty good with docs for their stuff.

    I've been porting the Simple Libraries stuff over to FlexC and making them work on P2, and I'm stalled currently on the last few bits of libsimpletools that use the P1 counter modules in various ways (pwm, rctime, etc.). I could just bit bang some of them, but they should use the smartpins.
    The existing docs are pretty terse on each mode, and the vocabulary doesn't match up with the P1 counter module docs very well. Tomorrow I am going to get set up with my scope on the P2 and do a bunch of trial and error testing...

    I'm struggling with smart pins, too, and like you, will have my 'scope out today looking at signals. I'm working to port my personal library of Spin1 objects because I have a few clients interested in transitioning to the P2. When they're ready, I want to be as well.
  • ErNa Posts: 1,752
    I'm sure that once Peter's P2D2 is available together with WIZNET, there will be a web server online that lets us search and find all the information...
  • Rayman wrote: »
    Wonder how FastSpin would do...

    I'm very curious about that too, because the original Spin1 routine is exactly the kind of bit manipulation that fastspin was designed to optimize. I can't test now, but my guess is that it'll be in the same ballpark as Jon's first PASM routine when compiled with -O2. Probably a bit slower.

    Jon, any chance you could build this with fastspin?
  • JonnyMac Posts: 9,182
    edited 2020-03-08 23:26
    ersmith wrote: »
    Rayman wrote: »
    Wonder how FastSpin would do...

    I'm very curious about that too, because the original Spin1 routine is exactly the kind of bit manipulation that fastspin was designed to optimize. I can't test now, but my guess is that it'll be in the same ballpark as Jon's first PASM routine when compiled with -O2. Probably a bit slower.

    Jon, any chance you could build this with fastspin?

    With full optimization on, it comes to 3326 ticks, 10 fewer than my hand-assembled code above. I think the difference can be accounted for by the Spin overhead for calling and returning from the routine.

    Code is attached so you can check that I did things correctly (I'm not a regular user of FastSpin).
  • A few minutes later...
    I was curious about the Spin overhead so I "removed" it by moving the start/stop timing points just outside the inline assembly in my original program. The result was 2488 ticks -- this is more in line with what I expected after looking at the code generated by the FastSpin compiler (it's doing more than I am).
  • JonnyMac,
    FYI, after a couple of hours of experimenting with various settings, I think I have a reasonable handle on some of the smart pin modes (00100 to 01001): pulse/cycle, transition, NCOs, and PWMs.
    I will work up some diagrams and whatnot and share them in a new thread. I'm using FlexC for my test/sample code, but it should be trivial to convert to Spin2, since it's mostly just wrpin/wxpin/wypin intrinsics that map to the similar Spin2 commands.

    For example, this snippet does a 42 ms pulse out on pin 1:
      // code assumes 200 MHz sysclock
      int mode = 0x00000048;   // pulse/cycle mode
      int x = (0 << 16) | 200; // 0 for compare (means always true/high), 200 clocks per time unit (1 us at 200 MHz sysclock)
      int y = 42000;           // pulse for 42000 time units (42 ms at 200 MHz sysclock)
    
      int pin = 1;
    
      _dirl(pin);         // dir low (smart pin held in reset)
      _wrpin(pin, mode);  // set mode
      _wxpin(pin, x);     // set x
      _dirh(pin);         // dir high (smart pin allowed to run)
      
      _wypin(pin, y);     // set y, in pulse mode this triggers the pulse to start and last for (y * x[15:0]) sysclocks
    
    This is just a super basic example, but still useful. Once the pin is set up in this mode, you can just write new values to Y to trigger new pulses.
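
    The timing arithmetic behind that snippet can be checked with a few lines of plain C. This is only a sketch with made-up helper names; it takes the rule from the comment above, that a pulse lasts y * x[15:0] system clocks, and converts that to microseconds for a given clock frequency.

    ```c
    #include <stdint.h>

    /* Pulse length in system clocks: y time units of x[15:0] clocks each. */
    static uint64_t pulse_sysclocks(uint32_t x, uint32_t y)
    {
        return (uint64_t)y * (x & 0xFFFFu);
    }

    /* Convert a clock count to whole microseconds at sysclk_hz. */
    static uint64_t clocks_to_us(uint64_t clocks, uint64_t sysclk_hz)
    {
        return clocks * 1000000u / sysclk_hz;
    }
    ```

    With x = 200 and y = 42000 this comes to 8,400,000 clocks, which is 42,000 us (42 ms) at 200 MHz. At a different system clock, x would need to change to keep the 1 us time unit.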
