CRC8

While browsing through the instruction set I happened upon CRCBIT -- this looked interesting, so I thought I should try it.
I started with a Spin1 routine by Micah Dowty that is part of my 1-Wire object. This works without modification in Spin2.
pub crc8x(p_src, n) : crc | b

'' Returns CRC8 of n bytes at p_src
'' -- implementation by Micah Dowty

  repeat n
    b := byte[p_src++]
    repeat 8
      if (crc ^ b) & 1
        crc := (crc >> 1) ^ $8C
      else
        crc >>= 1
      b >>= 1
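For anyone who wants to check results off-chip, here is a minimal C sketch of the same bit-at-a-time algorithm. The function name crc8_bitwise is my own; the polynomial ($8C, processed LSB first) and the zero starting value are taken from the Spin routine above.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC8, LSB first, reflected polynomial 0x8C, initial value 0.
   Mirrors the Spin routine: one conditional shift/XOR per data bit. */
uint8_t crc8_bitwise(const uint8_t *src, size_t n)
{
    uint8_t crc = 0;
    while (n--) {
        uint8_t b = *src++;
        for (int i = 0; i < 8; i++) {
            if ((crc ^ b) & 1)
                crc = (uint8_t)((crc >> 1) ^ 0x8C);
            else
                crc >>= 1;
            b >>= 1;                /* next data bit moves into bit 0 */
        }
    }
    return crc;
}
```

A handy sanity check for this kind of CRC: append the computed CRC byte to the message and run the routine over the whole thing; the result should be zero.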
I translated Micah's Spin1 routine to inline PASM2.
pub crc8x_pasm(p_src, n) : crc | b, t1

'' Returns CRC8 of n bytes at p_src
'' -- conversion to PASM2 by JonnyMac

  org
.loop     rdbyte    b, p_src
          add       p_src, #1
          rep       @.done, #8
           mov      t1, crc
           xor      t1, b
           testb    t1, #0              wc
           shr      crc, #1
    if_c   xor      crc, #$8C
           shr      b, #1
.done
          djnz      n, #.loop
  end
Of course, there was a big speed improvement. BTW, I'm just coming up to speed on PASM2. In the likely event that this routine can be made more efficient, please show me how.
Finally, I updated the code with CRCBIT.
Edit: This code has been updated with a suggestion by @AJL (see below). It cuts about 300 ticks out of the test timing.
pub crc8x_pasm2(p_src, n) : crc | b, t1

'' Returns CRC8 of n bytes at p_src

  org
.loop     rdbyte    b, p_src
          add       p_src, #1
          rep       @.done, #8
           shr      b, #1               wc
           crcbit   crc, #$8C
.done
          djnz      n, #.loop
  end
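To make it clearer what the two-instruction loop body is doing, here is a small C model of CRCBIT as I understand it: if (C XOR D[0]) is set, then D = (D >> 1) XOR S, otherwise D = D >> 1. The names crcbit_model and crc8_crcbit are made up for this sketch, and the model is only my reading of the instruction, not something lifted from official material.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of one CRCBIT step: c plays the role of the C flag (next data bit). */
static uint32_t crcbit_model(uint32_t d, unsigned c, uint32_t poly)
{
    unsigned feedback = (c ^ d) & 1u;   /* C XOR D[0] */
    d >>= 1;
    return feedback ? (d ^ poly) : d;
}

/* CRC8 built from the model, following the PASM2 loop step by step. */
uint8_t crc8_crcbit(const uint8_t *src, size_t n)
{
    uint32_t crc = 0;
    while (n--) {
        uint32_t b = *src++;                    /* rdbyte b, p_src */
        for (int i = 0; i < 8; i++) {           /* rep @.done, #8  */
            unsigned c = b & 1u;                /* shr b, #1 wc    */
            b >>= 1;
            crc = crcbit_model(crc, c, 0x8C);   /* crcbit crc, #$8C */
        }
    }
    return (uint8_t)crc;
}
```

The SHR with WC does double duty: it moves the current data bit into C and lines up the next one, which is why the separate TESTB is no longer needed.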
Final results running a string of 19 characters through each method:
Test1: CRC is $82 in 63536 ticks.
Test2: CRC is $82 in 3360 ticks.
Test3: CRC is $82 in 2128 ticks.
Comments
pub crc8x_pasm2(p_src, n) : crc | b, t1

'' Returns CRC8 of n bytes at p_src
'' -- implementation by JonnyMac

  org
.loop     rdlong    b, p_src
          add       p_src, #1
          rep       @.done, #8
'          testb    b, #0               wc
           shr      b, #1               wc
           crcbit   crc, #$8C
'          shr      b, #1
.done
          djnz      n, #.loop
  end
                mov     utx, #OUT_SOP
                call    #isr_utx_sof                    ' Send sync byte
                mov     icrc, #$1f                      ' Prime the CRC5 pump
                mov     sof_pkt, frame                  ' CRC5 calculation done on the 11-bit frame number value
                rev     sof_pkt                         ' Input data reflected
                mov     utx, #PID_SOF
                call    #utx_byte                       ' Send token PID byte
                setq    sof_pkt                         ' CRCNIB setup for data bits 0..7
                crcnib  icrc, #USB5_POLY
                crcnib  icrc, #USB5_POLY                ' Data bits 0..7 calculated
                getbyte utx, frame, #0                  ' Send the low byte of the frame number
                call    #utx_byte
                shl     sof_pkt, #8                     ' Shift out processed bits to set up CRCBIT * 3
                rep     #2, #3                          ' Three data bits left to process
                shl     sof_pkt, #1             wc
                crcbit  icrc, #USB5_POLY                ' Data bits 8..10 calculated
                xor     icrc, #$1f                      ' Final XOR value
                getbyte utx, frame, #1                  ' Send remaining frame number bits
                shl     icrc, #3                        ' Merge CRC to bits 7..3 of the final token byte
                or      utx, icrc
                call    #utx_byte                       ' Last start-of-frame byte is on the wire
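For reference, the CRC5 arithmetic in that listing can be sketched in plain C. This is only a rough host-side model: the listing does not show the value of USB5_POLY, so the $14 used here (x^5 + x^2 + 1, bit-reversed) is an assumption, and the function names are hypothetical.

```c
#include <stdint.h>

#define USB5_POLY_REFLECTED 0x14u   /* assumed value, not shown in the listing */

/* CRC5 over the low 11 bits of a frame number, bit 0 first. */
uint8_t usb_crc5_11(uint16_t data11)
{
    uint8_t crc = 0x1F;                         /* prime value, as in the listing */
    for (int i = 0; i < 11; i++) {
        unsigned bit = (data11 >> i) & 1u;
        if ((crc ^ bit) & 1u)
            crc = (uint8_t)((crc >> 1) ^ USB5_POLY_REFLECTED);
        else
            crc >>= 1;
    }
    return (uint8_t)(crc ^ 0x1F);               /* final XOR, as in the listing */
}

/* The two bytes sent after the PID: frame bits 7..0, then frame bits 10..8
   in bits 2..0 with the CRC merged into bits 7..3. */
uint16_t usb_sof_token(uint16_t frame)
{
    frame &= 0x7FF;
    return (uint16_t)(frame | ((uint16_t)usb_crc5_11(frame) << 11));
}
```

The low byte of usb_sof_token() corresponds to the first getbyte/utx_byte pair in the listing and the high byte to the last one.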
Doh! Edit: Note that USB puts bits on the wire LSB first.
pub crc8x_pasm3(p_src, n) : crc | b

'' Returns CRC8 of n bytes at p_src

  org
.loop     rdbyte    b, p_src
          add       p_src, #1
          rev       b
          setq      b
          crcnib    crc, #$8C
          crcnib    crc, #$8C
          djnz      n, #.loop
  end
Test1: CRC is $82 in 63736 ticks.
Test2: CRC is $82 in 3336 ticks.
Test3: CRC is $82 in 2104 ticks.
Test4: CRC is $82 in 1504 ticks.
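Here is a rough C model of the REV / SETQ / CRCNIB sequence, in case it helps to see why two CRCNIBs are enough for a byte. It assumes each CRCNIB acts like four CRCBIT steps fed from Q[31] downward and then leaves Q shifted left by 4; the matching $82 result above suggests that is the order. All names are made up for the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse the bit order of a 32-bit value, like REV. */
static uint32_t rev32(uint32_t v)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; i++) {
        r = (r << 1) | (v & 1u);
        v >>= 1;
    }
    return r;
}

/* Model of CRCNIB: four CRCBIT steps fed from the top of q, q shifts left. */
static uint32_t crcnib_model(uint32_t d, uint32_t *q, uint32_t poly)
{
    for (int i = 0; i < 4; i++) {
        unsigned c = (*q >> 31) & 1u;       /* next data bit comes from Q[31] */
        unsigned feedback = (c ^ d) & 1u;
        *q <<= 1;
        d >>= 1;
        if (feedback)
            d ^= poly;
    }
    return d;
}

uint8_t crc8_crcnib(const uint8_t *src, size_t n)
{
    uint32_t crc = 0;
    while (n--) {
        uint32_t q = rev32(*src++);         /* rev b, setq b: data bit 0 lands in Q[31] */
        crc = crcnib_model(crc, &q, 0x8C);  /* data bits 0..3 */
        crc = crcnib_model(crc, &q, 0x8C);  /* data bits 4..7 */
    }
    return (uint8_t)crc;
}
```

Each CRCNIB consumes four bits, so two of them cover one byte; eight would be the count for a full 32-bit long.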
yeah we need a cookbook with good explanations and diagrams/graphs/whatever, especially for the smart pin functionality. I've read the current doc on them a bit and I still don't really understand most of it well enough to use.
I'm sure if I reread them a bunch and did a bunch of trial and error experimenting with a scope I could figure them out, but it sure will be nice when we have better docs.
https://forums.parallax.com/discussion/162298/prop2-fpga-files-updated-2-june-2018-final-version-32i/p110
Shouldn't you CRCNIB 8 times instead of 2?
I see you're calling rdlong but processing n bytes, so shouldn't it really be a rdbyte?
And a glossary.
Bill M.
Yeah I know they will, Parallax has always been pretty good with docs for their stuff.
I've been porting the Simple Libraries stuff over to FlexC and making them work on P2, and I'm stalled currently on the last few bits of libsimpletools that use the P1 counter modules in various ways (pwm, rctime, etc.). I could just bit bang some of them, but they should use the smartpins.
The existing docs are pretty terse on each mode, and the vocabulary doesn't match up with the P1 counter module docs very well. Tomorrow I am going to get set up with my scope on the P2 and do a bunch of trial and error testing...
Yes, I should have used rdbyte, and I have gone back and fixed that in my examples. Thank you for the catch. I couldn't find it in the docs, but I believe I read somewhere that rdlong in the P2 works on any boundary, which explains why the code worked.
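A quick way to see why the rdlong version still produced the right CRC: the P2 is little-endian, so a long read starting at p_src has the byte at p_src in its low 8 bits, and the loop only ever shifts out those 8 bits. The helper below is a hypothetical host-side stand-in for an unaligned long read, not P2 code.

```c
#include <stdint.h>

/* Assemble a little-endian 32-bit value from four bytes at any address. */
uint32_t read_long_le(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

For a buffer holding $11, $22, $33, $44, $55, reading at offset 1 gives $55443322, and masking with $FF leaves $22, the same byte rdbyte would have fetched. The upper three bytes are simply ignored, though the last few reads do reach up to three bytes past the end of the data.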
I'm struggling with smart pins, too, and like you, will have my 'scope out today looking at signals. I'm working to port my personal library of Spin1 objects because I have a few clients interested in transitioning to the P2. When they're ready, I want to be as well.
I'm very curious about that too, because the original Spin1 routine is exactly the kind of bit manipulation that fastspin was designed to optimize. I can't test now, but my guess is that it'll be in the same ball park as Jon's first Pasm routine when compiled with -O2. Probably a bit slower.
Jon, any chance you could build this with fastspin?
With full optimization on it comes to 3326 ticks, 10 shorter than my hand-assembled code above. I think the difference can be accounted for by the Spin overhead for calling and returning from the routine.
Code is attached so you can check that I did things correctly (I'm not a regular user of FastSpin).
I was curious about the Spin overhead so I "removed" it by moving the start/stop timing points just outside the inline assembly in my original program. The result was 2488 ticks -- this is more in line with what I expected after looking at the code generated by the FastSpin compiler (it's doing more than I am).
FYI, after a couple of hours of experimenting with various settings, I think I have a reasonable handle on some of the smart pin modes (00100 to 01001): pulse/cycle, transition, NCOs, and PWMs.
I will work up some diagrams and whatnot and share in a new thread. I'm using FlexC for my test/sample code, but it should be trivial to convert to Spin2, since it's mostly just wrpin/wxpin/wypin intrinsics that map to the similar Spin2 commands.
For example, this snippet does a 42ms pulse out on pin 1:
// code assumes 200Mhz sysclock
int mode = 0x00000048;     // pulse/cycle mode
int x = (0 << 16) | 200;   // 0 for compare (means always true/high), 200 clocks per time unit (1us at 200Mhz sysclock)
int y = 42000;             // pulse for 42000 time units (42ms at 200Mhz clock)
int pin = 1;

_dirl(pin);                // dir low (smart pin held in reset)
_wrpin(pin, mode);         // set mode
_wxpin(pin, x);            // set x
_dirh(pin);                // dir high (smart pin allowed to run)
_wypin(pin, y);            // set y, in pulse mode this triggers the pulse to start and last for (y * x[15:0]) sysclocks
This is just a super basic example, but still useful. Once the pin is set up in this mode, you can just write new values to Y to trigger new pulses.
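The X and Y numbers in that snippet follow from simple arithmetic, which can be sketched as a pair of plain C helpers. These are hypothetical (not part of FlexC) and only encode what the comments above say: X[15:0] is the number of clocks per time unit, the compare value sits above it in bits 31..16, and the pulse lasts y * x[15:0] sysclocks.

```c
#include <stdint.h>

/* X value: compare field in bits 31..16, clocks per time unit in bits 15..0.
   clocks per unit = sysclk_hz / unit_hz, and must fit in 16 bits. */
uint32_t pulse_x(uint32_t sysclk_hz, uint32_t unit_hz, uint16_t compare)
{
    uint32_t clocks_per_unit = sysclk_hz / unit_hz;
    return ((uint32_t)compare << 16) | (clocks_per_unit & 0xFFFFu);
}

/* Pulse length in system clocks: y * x[15:0]. */
uint64_t pulse_sysclocks(uint32_t x, uint32_t y)
{
    return (uint64_t)y * (x & 0xFFFFu);
}
```

With a 200 MHz sysclock and a 1 us time unit, pulse_x() gives 200, and 42000 units then come to 8,400,000 sysclocks, which is the 42 ms from the example.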