CRC8

JonnyMac · 2020-03-07 19:02

While browsing through the instruction set I happened upon CRCBIT -- this looked interesting, so I thought I should try it.

I started with a Spin1 routine by Micah Dowty that is part of my 1-Wire object. This works without modification in Spin2.

pub crc8x(p_src, n) : crc | b

'' Returns CRC8 of n bytes at p_src
'' -- implementation by Micah Dowty

  repeat n
    b := byte[p_src++]
    repeat 8
      if (crc ^ b) & 1
        crc := (crc >> 1) ^ $8C
      else
        crc >>= 1
      b >>= 1
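
For checking results off-chip, here is a minimal Python sketch of the same bitwise loop. It is a host-side model, not part of the original post, and the name `crc8x_ref` is hypothetical. With polynomial $8C and a starting value of 0 this should match the Dallas/Maxim 1-Wire CRC8, so appending the CRC byte to a message should bring the running CRC back to zero.

```python
def crc8x_ref(data: bytes) -> int:
    """Bitwise CRC8 (reflected polynomial $8C, initial value 0), LSB first."""
    crc = 0
    for b in data:
        for _ in range(8):
            if (crc ^ b) & 1:
                crc = (crc >> 1) ^ 0x8C
            else:
                crc >>= 1
            b >>= 1
    return crc


if __name__ == "__main__":
    msg = b"123456789"
    crc = crc8x_ref(msg)
    print(f"CRC8 of {msg!r} is ${crc:02X}")
    # Appending the CRC to the message should yield a final CRC of 0.
    print(crc8x_ref(msg + bytes([crc])) == 0)
```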

I translated Micah's Spin1 routine to inline PASM2.

pub crc8x_pasm(p_src, n) : crc | b, t1

'' Returns CRC8 of n bytes at p_src
'' -- conversion to PASM2 by JonnyMac

  org
.loop                   rdbyte  b, p_src
                        add     p_src, #1
                        rep     @.done, #8
                        mov     t1, crc
                        xor     t1, b
                        testb   t1, #0                  wc
                        shr     crc, #1
        if_c            xor     crc, #$8C
                        shr     b, #1
.done
                        djnz    n, #.loop
  end

Of course, there was a big speed improvement. BTW, I'm just coming up to speed on PASM2. In the likely event that this routine can be made more efficient, please show me how. Finally, I updated the code with CRCBIT.
Edit: This code has been updated with a suggestion by @AJL (see below). It cuts about 300 ticks out of the test timing.

pub crc8x_pasm2(p_src, n) : crc | b, t1

'' Returns CRC8 of n bytes at p_src

  org
.loop                   rdbyte  b, p_src
                        add     p_src, #1
                        rep     @.done, #8
                        shr     b, #1                 wc
                        crcbit  crc, #$8C
.done
                        djnz    n, #.loop
  end
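
To see why the shortened loop is equivalent to the longer one, here is a small Python model of the two instructions inside the REP block. It assumes CRCBIT behaves as the P2 instruction summary describes: if C XOR D[0] is set, D = (D >> 1) XOR S, otherwise D = D >> 1. The function names are hypothetical, and this is a host-side sketch rather than P2 code.

```python
def crcbit(d: int, c: int, s: int) -> int:
    """Model of CRCBIT D,S: one CRC step using the carry flag as the data bit."""
    if (c ^ d) & 1:
        return (d >> 1) ^ s
    return d >> 1


def crc8x_crcbit(data: bytes, poly: int = 0x8C) -> int:
    """Model of the rdbyte / shr wc / crcbit loop."""
    crc = 0
    for b in data:
        for _ in range(8):              # rep @.done, #8
            c = b & 1                   # shr b, #1 wc -> C = old bit 0 of b
            b >>= 1
            crc = crcbit(crc, c, poly)  # crcbit crc, #$8C
    return crc
```

Because SHR with WC leaves the bit it shifted out in C, the separate TESTB is not needed. That is the saving the edit note refers to.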

Final results running a string of 19 characters through each method:

Test1: CRC is $82 in 63536 ticks.
Test2: CRC is $82 in 3360 ticks.
Test3: CRC is $82 in 2128 ticks.

JRoark · 2020-03-07 20:00

I'd call a 26x speed improvement a winner!

AJL · 2020-03-07 20:38

A small improvement:

pub crc8x_pasm2(p_src, n) : crc | b, t1

'' Returns CRC8 of n bytes at p_src
'' -- implementation by JonnyMac

  org
.loop                   rdlong  b, p_src
                        add     p_src, #1
                        rep     @.done, #8
'                        testb   b, #0                   wc
                        shr     b, #1                   wc
                        crcbit  crc, #$8C
'                        shr     b, #1
.done
                        djnz    n, #.loop
  end

JonnyMac · 2020-03-07 20:49

Thank you! That's one of those things that is so simple I looked right past it. With this particular test string the updated code is almost 30x faster than the original Spin routine.

Rayman · 2020-03-07 21:07

Wonder how FastSpin would do...

cgracey · 2020-03-07 21:26

There's a CRCNIB instruction, too, that can process four bits at a time.

JonnyMac · 2020-03-07 22:27

deleted

garryj · 2020-03-07 23:33

Here is a code block from my USB host code that calculates a 5-bit CRC on the 11-bit data for the full-speed 1ms frame#. It uses a combination of crcnib and crcbit:

                mov     utx, #OUT_SOP
                call    #isr_utx_sof                    ' Send sync byte
                mov     icrc, #$1f                      ' Prime the CRC5 pump
                mov     sof_pkt, frame                  ' CRC5 calculation done on the 11-bit frame number value
                rev     sof_pkt                         ' Input data reflected
                mov     utx, #PID_SOF
                call    #utx_byte                       ' Send token PID byte
                setq    sof_pkt                         ' CRCNIB setup for data bits 0..7
                crcnib  icrc, #USB5_POLY
                crcnib  icrc, #USB5_POLY                ' Data bits 0..7 calculated
                getbyte utx, frame, #0                  ' Send the low byte of the frame number
                call    #utx_byte
                shl     sof_pkt, #8                     ' Shift out processed bits to set up CRCBIT * 3
                rep     #2, #3                          ' Three data bits left to process
                shl     sof_pkt, #1             wc
                crcbit  icrc, #USB5_POLY                ' Data bits 8..10 calculated
                xor     icrc, #$1f                      ' Final XOR value
                getbyte utx, frame, #1                  ' Send remaining frame number bits
                shl     icrc, #3                        ' Merge CRC to bits 7..3 of the final token byte
                or      utx, icrc
                call    #utx_byte                       ' Last start-of-frame byte is on the wire

Doh! Edit: Note that USB puts bits on the wire LSB first.
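
A rough Python model of the CRC5 part of that block may help when checking the arithmetic off-chip. The value of USB5_POLY is not shown in the post, so $14 (x^5 + x^2 + 1 in reflected, right-shifting form) is an assumption here, and `crc_step`, `sof_crc5` and `sof_bytes` are hypothetical names. The two CRCNIBs and three CRCBITs are collapsed into a single 11-step loop, since each CRCNIB is described as four CRCBIT steps.

```python
USB5_POLY = 0x14  # assumed: x^5 + x^2 + 1 in reflected (right-shifting) form


def crc_step(crc: int, bit: int, poly: int) -> int:
    """One CRCBIT-style step."""
    if (crc ^ bit) & 1:
        return (crc >> 1) ^ poly
    return crc >> 1


def sof_crc5(frame: int) -> int:
    """CRC5 over the 11-bit frame number, bit 0 first."""
    crc = 0x1F                          # prime the CRC5 pump
    for i in range(11):                 # 2 x CRCNIB (8 bits) + 3 x CRCBIT
        crc = crc_step(crc, (frame >> i) & 1, USB5_POLY)
    return crc ^ 0x1F                   # final XOR value


def sof_bytes(frame: int) -> bytes:
    """The two bytes that follow the SOF PID: frame[7:0], then CRC5 and frame[10:8]."""
    frame &= 0x7FF
    low = frame & 0xFF
    high = (frame >> 8) | (sof_crc5(frame) << 3)
    return bytes([low, high])
```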

JonnyMac · 2020-03-07 23:43

Thank you for that, Gary. Here's an update that uses CRCNIB and knocks another 600 ticks off the test timing for the string I'm using:

pub crc8x_pasm3(p_src, n) : crc | b

'' Returns CRC8 of n bytes at p_src

  org
.loop                   rdbyte  b, p_src
                        add     p_src, #1
                        rev     b
                        setq    b
                        crcnib  crc, #$8C
                        crcnib  crc, #$8C
                        djnz    n, #.loop
  end
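
The REV / SETQ / CRCNIB sequence can be modeled on a host as a sanity check. This sketch assumes CRCNIB acts like four CRCBIT steps fed from Q[31:28], followed by Q << 4, which is how the P2 instruction summary describes it. The helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def rev32(x: int) -> int:
    """Model of REV: reverse the order of all 32 bits."""
    return int(f"{x & MASK32:032b}"[::-1], 2)


def crcnib(d: int, q: int, s: int) -> tuple:
    """Model of CRCNIB: four CRC steps fed from Q[31:28], then Q <<= 4."""
    for i in range(4):
        bit = (q >> (31 - i)) & 1
        d = (d >> 1) ^ s if (bit ^ d) & 1 else d >> 1
    return d, (q << 4) & MASK32


def crc8x_crcnib(data: bytes, poly: int = 0x8C) -> int:
    crc = 0
    for b in data:
        q = rev32(b)                    # rev b / setq b: byte bit 0 is now Q[31]
        crc, q = crcnib(crc, q, poly)   # byte bits 0..3
        crc, q = crcnib(crc, q, poly)   # byte bits 4..7
    return crc
```

The REV is what keeps the bit order the same as in the SHR-based loops: CRCNIB consumes Q from the top, so the byte's bit 0 has to be moved to bit 31 first.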

Test1: CRC is $82 in 63736 ticks.
Test2: CRC is $82 in 3336 ticks.
Test3: CRC is $82 in 2104 ticks.
Test4: CRC is $82 in 1504 ticks.
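
For a quick comparison, the tick counts above can be turned into speedup ratios and a per-byte cost. The mapping of Test1..Test4 to the Spin2, PASM2, CRCBIT and CRCNIB routines is assumed from the order in which they were posted.

```python
# Tick counts reported above for the 19-character test string.
ticks = {
    "Test1 (Spin2)": 63736,
    "Test2 (PASM2 translation)": 3336,
    "Test3 (CRCBIT)": 2104,
    "Test4 (CRCNIB)": 1504,
}
N_BYTES = 19

baseline = ticks["Test1 (Spin2)"]
speedup = {name: baseline / t for name, t in ticks.items()}
per_byte = {name: t / N_BYTES for name, t in ticks.items()}

for name in ticks:
    print(f"{name:28s} {speedup[name]:5.1f}x  {per_byte[name]:7.1f} ticks/byte")
```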

Cluso99 · 2020-03-07 23:58

IIRC there is a post in the Tricks and Traps that links to the thread that discusses the CRC instructions and their use for the various CRC16 routines used in communications.

JonnyMac · 2020-03-08 00:05

I didn't see it. Still, until a "Propeller 2 Cookbook" can be produced to help those coming from other devices, I don't think a bit of redundancy is a problem, especially since I started with Spin, which may help those not yet comfortable with PASM move toward it. My favorite feature of Spin2 is inline PASM -- I will be converting lots of utility routines like this for P2 projects.

Roy Eltham · 2020-03-08 00:29

JonnyMac,
yeah we need a cookbook with good explanations and diagrams/graphs/whatever, especially for the smart pin functionality. I've read the current doc on them a bit and I still don't really understand most of it well enough to use.
I'm sure if I reread them a bunch and did a bunch of trial and error experimenting with a scope I could figure them out, but it sure will be nice when we have better docs.

Cluso99 · 2020-03-08 01:10

Here is a link that discusses the time the CRCBIT instruction was added to the P2.
https://forums.parallax.com/discussion/162298/prop2-fpga-files-updated-2-june-2018-final-version-32i/p110

JonnyMac · 2020-03-08 03:37

Roy Eltham wrote: »

JonnyMac,
yeah we need a cookbook with good explanations and diagrams/graphs/whatever, especially for the smart pin functionality. I've read the current doc on them a bit and I still don't really understand most of it well enough to use.
I'm sure if I reread them a bunch and did a bunch of trial and error experimenting with a scope I could figure them out, but it sure will be nice when we have better docs.

Better docs will come. In the meantime, I'm experimenting with P1 -> P2 code conversions to teach myself and help others outside the Propeller world get going. Every time I'm at a tech con and tell people about the Propeller, they get very excited, yet few make the move because there is such a dearth of practical information that enthusiasts and engineers can use to get up and running very quickly.

pedward · 2020-03-08 06:14

Perhaps I'm not familiar with the newer syntax, but you are reading a long and only CRCing a byte.

Shouldn't you CRCNIB 8 times instead of 2?

I see you're calling rdlong but processing n bytes, so shouldn't it really be a rdbyte?

Capt. Quirk · 2020-03-08 07:00

we need a cookbook with good explanations and diagrams/graphs/whatever, especially for the smart pin functionality.

And a glossary.

Bill M.

Roy Eltham · 2020-03-08 07:30

JonnyMac,
Yeah I know they will, Parallax has always been pretty good with docs for their stuff.

I've been porting the Simple Libraries stuff over to FlexC and making them work on P2, and I'm stalled currently on the last few bits of libsimpletools that use the P1 counter modules in various ways (pwm, rctime, etc.). I could just bit bang some of them, but they should use the smartpins.
The existing docs are pretty terse on each mode, and the vocabulary doesn't match up with the P1 counter module docs very well. Tomorrow I am going to get set up with my scope on the P2 and do a bunch of trial and error testing...

JonnyMac · 2020-03-08 15:37

pedward wrote: »

Perhaps I'm not familiar with the newer syntax, but you are reading a long and only CRCing a byte.

Shouldn't you CRCNIB 8 times instead of 2?

I see you're calling rdlong but processing n bytes, so shouldn't it really be a rdbyte?

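Yes, I should have used rdbyte, and I have gone back and fixed that in my examples. Thank you for the catch. I couldn't find it in the docs, but I believe I read somewhere that rdlong on the P2 works on any byte boundary. Since hub memory is little-endian, the byte at p_src lands in the low eight bits of b, and the loop only ever feeds eight bits into the CRC, which explains why the code worked.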
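
A small host-side sketch of that reasoning, assuming rdlong returns four consecutive hub bytes as a little-endian long from any byte address. The `hub` buffer and the function names are hypothetical. Both readers produce the same CRC because only the low eight bits of each value are shifted through.

```python
def crc8_bits(crc: int, b: int, poly: int = 0x8C) -> int:
    """Shift the low eight bits of b through the CRC, as the 8-pass loop does."""
    for _ in range(8):
        crc = (crc >> 1) ^ poly if (crc ^ b) & 1 else crc >> 1
        b >>= 1
    return crc


def crc8_rdbyte(hub: bytes, n: int) -> int:
    crc = 0
    for addr in range(n):
        crc = crc8_bits(crc, hub[addr])                   # rdbyte b, p_src
    return crc


def crc8_rdlong(hub: bytes, n: int) -> int:
    crc = 0
    for addr in range(n):
        b = int.from_bytes(hub[addr:addr + 4], "little")  # rdlong b, p_src
        crc = crc8_bits(crc, b)                           # upper 24 bits never used
    return crc
```

The rdlong version still reads up to three bytes past the end of the data, which is harmless here but is one more reason to prefer rdbyte.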

JonnyMac · 2020-03-08 16:49

Roy Eltham wrote: »

JonnyMac,
Yeah I know they will, Parallax has always been pretty good with docs for their stuff.

I've been porting the Simple Libraries stuff over to FlexC and making them work on P2, and I'm stalled currently on the last few bits of libsimpletools that use the P1 counter modules in various ways (pwm, rctime, etc.). I could just bit bang some of them, but they should use the smartpins.
The existing docs are pretty terse on each mode, and the vocabulary doesn't match up with the P1 counter module docs very well. Tomorrow I am going to get set up with my scope on the P2 and do a bunch of trial and error testing...

I'm struggling with smart pins, too, and like you, will have my 'scope out today looking at signals. I'm working to port my personal library of Spin1 objects because I have a few clients interested in transitioning to the P2. When they're ready, I want to be as well.

ErNa · 2020-03-08 20:01

I'm sure that once Peter's P2D2 is available together with WIZnet, there will be a web server online that lets us search and find all the information...

ersmith · 2020-03-08 22:54

Rayman wrote: »

Wonder how FastSpin would do...

I'm very curious about that too, because the original Spin1 routine is exactly the kind of bit manipulation that fastspin was designed to optimize. I can't test now, but my guess is that it'll be in the same ball park as Jon's first Pasm routine when compiled with -O2. Probably a bit slower.

Jon, any chance you could build this with fastspin?

JonnyMac · 2020-03-08 23:24

ersmith wrote: »

Rayman wrote: »

Wonder how FastSpin would do...

I'm very curious about that too, because the original Spin1 routine is exactly the kind of bit manipulation that fastspin was designed to optimize. I can't test now, but my guess is that it'll be in the same ball park as Jon's first Pasm routine when compiled with -O2. Probably a bit slower.

Jon, any chance you could build this with fastspin?

With full optimization on it comes to 3326 ticks, 10 shorter than my hand-assembled code above. I think the difference can be accounted for by the Spin overhead for calling and returning from the routine.

Code is attached so you can check that I did things correctly (I'm not a regular user of FastSpin).

JonnyMac · 2020-03-08 23:37

A few minutes later...
I was curious about the Spin overhead so I "removed" it by moving the start/stop timing points just outside the inline assembly in my original program. The result was 2488 ticks -- this is more in line with what I expected after looking at the code generated by the FastSpin compiler (it's doing more than I am).

Roy Eltham · 2020-03-09 02:05

JonnyMac,
FYI, after a couple of hours of experimenting with various settings, I think I have a reasonable handle on some of the smart pin modes (00100 to 01001): pulse/cycle, transition, NCOs, and PWMs.
I will work up some diagrams and whatnot and share in a new thread. I'm using FlexC for my test/sample code, but it should be trivial to convert to Spin2, since it's mostly just wrpin/wxpin/wypin intrinsics that map to the similar Spin2 commands.

For example, this snippet does a 42ms pulse out on pin 1:

  // code assumes 200 MHz sysclock
  int mode = 0x00000048;   // pulse/cycle mode
  int x = (0 << 16) | 200; // 0 for compare (means always true/high), 200 clocks per time unit (1 us at 200 MHz sysclock)
  int y = 42000;           // pulse for 42000 time units (42 ms at 200 MHz clock)

  int pin = 1;

  _dirl(pin);         // dir low (smart pin held in reset)
  _wrpin(pin, mode);  // set mode
  _wxpin(pin, x);     // set x
  _dirh(pin);         // dir high (smart pin allowed to run)
  
  _wypin(pin, y);     // set y, in pulse mode this triggers the pulse to start and last for (y * x[15:0]) sysclocks

This is just a super basic example, but still useful. Once the pin is set up in this mode, you can just write new values to Y to trigger new pulses.
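
The timing arithmetic in that snippet can be checked with a few lines of Python. This is a sketch under stated assumptions: the pulse lasts y * x[15:0] system clocks as the comment in the snippet says, and the low WRPIN bits are laid out as %TT_MMMMM_0 (my reading of the smart pin docs). The function names are hypothetical.

```python
SYSCLK_HZ = 200_000_000  # assumed system clock, as in the snippet above


def pulse_seconds(x: int, y: int, sysclk_hz: int = SYSCLK_HZ) -> float:
    """Pulse length in pulse/cycle mode: y base periods of x[15:0] clocks each."""
    base_period = x & 0xFFFF
    return y * base_period / sysclk_hz


def decode_mode(mode: int) -> dict:
    """Split out the low WRPIN fields (assumed layout: %..._TT_MMMMM_0)."""
    return {"smart_mode": (mode >> 1) & 0x1F, "tt": (mode >> 6) & 0x3}


if __name__ == "__main__":
    x = (0 << 16) | 200
    print(f"pulse = {pulse_seconds(x, 42000) * 1000:.1f} ms")
    print(decode_mode(0x48))
```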
