I'm not to smart when it come to smart pins (Serial SBUS Driver)

evanh · 2021-03-13 04:50

Jon,
Here's an optimisation for the latter part - reduces 13 longs down to 5 longs. It does expect "i" changed from an index with ALTGB into a direct pointer to the specific register:

        ' test flags
        testb   i, #7           wc      ' ch17 is digital
        bitc    tmp, #0+(10<<5)             ' if NC then 0, if C then 2047
        testb   i, #6           wc      ' ch18 is digital
        bitc    tmp, #16+(10<<5)            ' if NC then 0, if C then 2047
        wrlong  tmp, p_ch

To keep it as an index would need one more instruction:

        ' test flags
        getbyte i, #.sbus_raw + (23 >> 2), #23 & 3  ' packet byte 23
        testb   i, #7           wc      ' ch17 is digital
        bitc    tmp, #0+(10<<5)             ' if NC then 0, if C then 2047
        testb   i, #6           wc      ' ch18 is digital
        bitc    tmp, #16+(10<<5)            ' if NC then 0, if C then 2047
        wrlong  tmp, p_ch

PS: Neither solution is tested.

Regarding the bit-order reversal, that seems overly complicated. The byte receive routine should be able format each received byte correctly before placing in the array. Most likely this is a case of using SHR instead of SHL when receiving the serial bits.

evanh · 2021-03-13 05:01

Heh, the first one fails for sure. It needs the ch17/ch18 TESTB bit constants corrected for longword addressing. The bit offsets will change when the bit order is changed anyway.

JonnyMac · 2021-03-13 05:20

Thanks for the tips, Evan. I used the first suggestion and found it worked, even shaving of another 0.3 microseconds. I posted the latest version above.

Regarding the bit-order reversal, that seems overly complicated.

I'm using a smart pin UART so there is no choice in the matter. If I was bit-banging the UART, it could be done in that process (as Jason does in his P1 code).

JonnyMac · 2021-03-13 05:33

@evanh said:
Heh, the first one fails for sure. It needs the ch17/ch18 TESTB bit constants corrected for longword addressing. The bit offsets will change when the bit order is changed anyway.

This is what I put into the code and I have verified that it works. I'm testing with a fixed set of data and have manually changed the flag bits and seen the corresponding change in the channel values array. It works.

                mov       tmp, #0                               ' clear ch17/ch18 workspace
                testb     v, #7                         wc      ' ch17 is digital
                bitc      tmp, #00 + (10 << 5)                  ' 0 if nc, 2047 if c
                testb     v, #6                         wc      ' ch18 is digital
                bitc      tmp, #16 + (10 << 5)                  ' 0 if nc, 2047 if c    
                wrlong    tmp, ptrb

TBH, I don't know why bitc works the way it does in this situation, but I'm very tired. I will figure it out in the morning.
Figured it out; (10 << 5) is creating a group of 11 bits that gets filled with C. Okay, I can sleep now.

evanh · 2021-03-13 05:55

Ah, yes, I see the MOV is needed. I keep forgetting the unspecified bits are not cleared.

The reason why I'm saying the first example won't work was because it wasn't meant to be preceded by a GETBYTE at all. It was supposed to operate on longwords, and could but needs more tuning.

You can shave off one more program longword by ditching the final ALTGB and hard coding the final GETBYTE as per my second example.

TonyB_ · 2021-03-13 12:22

@JonnyMac
Happy to help. The code you have now is as I thought with one extra long saved at the end by Evan's bitc that sets a contiguous group of bits to c. Only four bits are available with a 9-bit immediate for group size - 1, enough in this case but not always so.

Another long could be saved if the position of flags in the array is fixed and if I understood the algorithm (I just had a quick look at about 1 am) I could say whether any further improvement could be made. Are you sure that bits in each byte cannot be reversed before storing in the cog array?

iseries · 2021-03-13 13:27

25 bytes of data are sent then a 10 to 20 millisecond pause before the next 25 bytes are sent. So there is this amount of time to process the data which is easy enough to do even on a P1 which is where this code was first used.

I use a lock while the data is being updated because I zero the channel value just prior to it being updated so I don't want the data to be used until I know it's valid. I could use a different variable to collect the value and then just assign it to the channel but it's nice to have all the values updated at the same time.

Mike

evanh · 2021-03-13 14:16

@TonyB_ said:
... Are you sure that bits in each byte cannot be reversed before storing in the cog array?

I have my doubts there needs to be any reversing. RS232 is lsb first and the smartpins only understand lsb-first. So unless the data is explicitly mangled by the sender, there shouldn't be any reordering needed. Just needs the SHR rxdata, #32-8 to move the byte to the least 8-bits is all.

TonyB_ · 2021-03-13 14:21

@evanh said:

@TonyB_ said:
... Are you sure that bits in each byte cannot be reversed before storing in the cog array?

I have my doubts there needs to be any reversing. RS232 is lsb first and the smartpins only understand lsb-first. So unless the data is explicitly mangled by the sender, there shouldn't be any reordering needed. Just needs the SHR rxdata, #32-8 to move the byte to the least 8-bits is all.

If needed, reversing the bits of each byte in a long is trivial:

    movbyts D,#%%0123
    rev     D

JonnyMac · 2021-03-13 16:30

@TonyB_ said:
If needed, reversing the bits of each byte in a long is trivial:
  movbyts D,#%%0123
  rev     D

That works with single bytes. I tend to use this strategy which always works, even when the value is greater than 8 bits.

                rev       chval                                 ' flip back
                rol       chval, #11                            ' restore LSB position

Some of this may not make sense because you're not seeing the whole picture. In case you want to play and make more suggestions (which I always appreciate), here is a data set that I captured from an RC receiver.

dat  

  SBUS          byte    $0F, $03, $04, $E0, $3E, $00, $08, $0F, $78, $C0, $E3, $E1, $00
                byte    $7C, $38, $00, $01, $08, $40, $00, $02, $10, $80, $00, $00

Byte 0 is the header ($0F)
Bytes 1-22 contain packed channel values (16, each is 11 bits)
Byte 23 is the flags
-- bit0 is ch17
-- bit1 is ch18
-- bit2 is loss-of-signal
-- bit3 indicates failsafe mode (tends to be throttle channel to 0)
Byte 24 is the footer ($00)

Again, these values are from the receiver, hence are in "normal" order. In Jason Dorie's P1 S.BUS receiver he deliberately receives the bits in reverse order, I believe in support of his unpacking algorithm. If the bytes in the packet above can be quickly unpacked without having them reverse first, this is good.

These are the channel values I get from the code I posted, as well as from a Spin translation of the algorithm used in many C versions of S.BUS receiver/decoders.

I am displaying the device channel numbers, not the array index.

This is the unpacking code found in most C demos for S.BUS.

  channels[0]  = ((sbusData[1]|sbusData[2]<< 8) & 0x07FF);
  channels[1]  = ((sbusData[2]>>3|sbusData[3]<<5) & 0x07FF);
  channels[2]  = ((sbusData[3]>>6|sbusData[4]<<2|sbusData[5]<<10) & 0x07FF);
  channels[3]  = ((sbusData[5]>>1|sbusData[6]<<7) & 0x07FF);
  channels[4]  = ((sbusData[6]>>4|sbusData[7]<<4) & 0x07FF);
  channels[5]  = ((sbusData[7]>>7|sbusData[8]<<1|sbusData[9]<<9) & 0x07FF);
  channels[6]  = ((sbusData[9]>>2|sbusData[10]<<6) & 0x07FF);
  channels[7]  = ((sbusData[10]>>5|sbusData[11]<<3) & 0x07FF); 
  channels[8]  = ((sbusData[12]|sbusData[13]<< 8) & 0x07FF);
  channels[9]  = ((sbusData[13]>>3|sbusData[14]<<5) & 0x07FF);
  channels[10] = ((sbusData[14]>>6|sbusData[15]<<2|sbusData[16]<<10) & 0x07FF);
  channels[11] = ((sbusData[16]>>1|sbusData[17]<<7) & 0x07FF);
  channels[12] = ((sbusData[17]>>4|sbusData[18]<<4) & 0x07FF);
  channels[13] = ((sbusData[18]>>7|sbusData[19]<<1|sbusData[20]<<9) & 0x07FF);
  channels[14] = ((sbusData[20]>>2|sbusData[21]<<6) & 0x07FF);
  channels[15] = ((sbusData[21]>>5|sbusData[22]<<3) & 0x07FF);

My gut tells me that Jason's idea to reverse the bytes and unpack them the way he does is going to be more efficient that a PASM conversion of this.

JonnyMac · 2021-03-13 16:59

@evanh said:
I have my doubts there needs to be any reversing. RS232 is lsb first and the smartpins only understand lsb-first. So unless the data is explicitly mangled by the sender, there shouldn't be any reordering needed. Just needs the SHR rxdata, #32-8 to move the byte to the least 8-bits is all.

You may have a different thought after seeing what the C programmers are doing to unpack the channel data, but I reserve the right to be wrong , and would love to see your approach.

Here's how I am receiving a packet. I'm using a 250us (> 2 bytes @ 100Kbaud, 8E2, inverted) period to locate the idle time between packets.

sbus_rx         fltl      rxd                                   ' reset uart, release smart pin

.wquiet         mov       x, #250                               ' wait for low between packets
.wq1            waitx     us1
                testp     rxd                           wc
    if_c        jmp       #.wquiet                              ' if activity, restart wait
                djnz      x, #.wq1

                drvl      rxd                                   ' re-enable smart pin
                mov       i, #0                                 ' reset packet index

.get_sbus       testp     rxd                           wc      ' anything waiting?
    if_nc       jmp       #.get_sbus                            '  not yet

                rdpin     b, rxd                                ' read new byte
                shr       b, #23                                ' align lsb
                and       b, #$FF                               ' strip parity bit

.flip           rev       b                                     ' reverse bits
                rol       b, #8

                altsb     i, #.sbus_raw                         ' write to cog array
                setbyte   b

                incmod    i, #24                        wc      ' update array index
                tjnz      i, #.get_sbus                         '  go again if not done

I used this code to capture the packet from my receiver (see previous post), but I wrote it to a hub array instead of a cog array. The smart pin is configured by the Spin2 interface code.

JonnyMac · 2021-03-13 19:35

@TonyB_ and @evanh

Well, you were right, reversing the bytes is not necessary for speedy decompression. This is what I came up with. It shaves off a couple hundredths microseconds and may be easier to understand (eye of the beholder, I guess).

Thanks, again, for the feedback and suggestions. I'm not yet a strong PASM programmer -- but am working my way there.

Edit: Restored original code -- uses shl to move new byte into correct position and then or to incorporate into working value.

                ' unpack s.bus bytes to channel values

unpack          mov       ptrb, p_ch                            ' point to hub array
                mov       ch, #0                                ' set hub index     
                mov       i, #1                                 ' s.bus data index (cog)
                mov       tmp, #0                               ' workspace
                mov       bc, #0                                ' bits in tmp                    

.get_byte       altgb     i, #sbusraw                           ' get s.bus byte from cog array
                getbyte   v                                     ' get next byte
                add       i, #1                                 ' bump byte array index
                shl       v, bc                                 ' move into position
                or        tmp, v                                ' add to work byte       
                add       bc, #8                                ' update bit count
                cmpsub    bc, #11                       wc      ' enough bits for channel value?
    if_nc       jmp       #.get_byte

                mov       out, tmp                              ' make a copy
                and       out, ##$07FF                          ' clean-up
                wrword    out, ptrb++                           ' write to hub

                shr       tmp, #11                              ' remove last value

                incmod    ch, #15                       wc      ' update hub index
                tjnz      ch, #.get_byte                        ' keep going if more channels

                ' get digital channels from flags byte

                altgb     i, #sbusraw                           ' get flags byte
                getbyte   v

                mov       tmp, #0                               ' clear ch17/ch18 workspace
                testb     v, #0                         wc      ' ch17 is digital
                bitc      tmp, #00 + (10 << 5)                  ' 0 if nc, 2047 if c
                testb     v, #1                         wc      ' ch18 is digital
                bitc      tmp, #16 + (10 << 5)                  ' 0 if nc, 2047 if c
                wrlong    tmp, ptrb

JonnyMac · 2021-03-13 20:27

@iseries

After re-writing the S.BUS unpacking code to work with standard (non-reversed) bytes, I translated it to Spin2.

pub unpack_sbus() | i, j, workspace, bitcount

  i := 1
  longfill(@j, 0, 3)

  repeat while (j < 16)
    repeat while (bitcount < 11)
      workspace |= (SBus[i++] << bitcount)
      bitcount += 8
    channel[j++] := workspace & $07FF
    workspace >>= 11
    bitcount -= 11

  channel[16] := (SBus[23] & %0001) ? 2047 : 0
  channel[17] := (SBus[23] & %0010) ? 2047 : 0

I tested this and it works. Now, I'm not much of a C programmer, but I made this attempt at a translation. In your S.BUS library you are doing a bit-by-bit unpacking which is probably consuming more time than necessary. As we say here in Hollywood... for your consideration:

void unpack_sbus()
{
  uint8_t  i = 1;                                               \\ s.bus index
  uint8_t  j = 0;                                               \\ channel value index
  uint8_t  bitcount = 0;
  uint32_t workspace = 0;

  while (j < 16) {
    while (bitcount < 11) {                                     \\ build channel value
      workspace |= (ss[i++] << bitcount);                       \\ get byte, add to workspace
      bitcount += 8;                                            \\ update bitcount
    }
    channel[j++] = uint16_t(workspace & 0x07FF);                \\ write to channel
    workspace >>= 11;                                           \\ remove old value
    bitcount -= 11;                                             \\ update bitcount
  }

  channel[16] = (ss[23] & 0x01) ? 2047 : 0;                     \\ extract digital channels
  channel[17] = (ss[23] & 0x02) ? 2047 : 0; 
}

The C version has not been tested.

TonyB_ · 2021-03-13 23:41

@JonnyMac said:
@TonyB_ and @evanh

Well, you were right, reversing the bytes is not necessary for speedy decompression. This is what I came up with. It shaves off a couple hundredths microseconds and may be easier to understand (eye of the beholder, I guess).

Thanks, again, for the feedback and suggestions. I'm not yet a strong PASM programmer -- but am working my way there.

@JonnyMac
Thanks for the full explanation about S-BUS channel data. The rolbyte optimization is missing in your latest code.

I've written an unpacker which is about twice the size but very fast, less than a microsecond at 200MHz. I could post it tomorrow after checking (but not testing).

JonnyMac · 2021-03-14 00:08

The rolbyte optimization is missing in your latest code.

The use of rolbyte only applies when the bits are flipped; in standard order we have to shift the new byte into position and then add it.

As the routine stands, it takes about 3.7us -- less than half a bit period. According to information I've read, S.BUS packets come every 14ms (standard speed) or 7ms (high speed). Even in high speed mode there is 4ms to unpack and update the channel values. That is to say that I'm going to live with 3-4 microseconds and not have a worry.

evanh · 2021-03-14 00:53

Looking good Jon.

I like what you've done with bit-bashing around the smartpin's enable/disable. I'd never thought that would even work without first dropping back to smartpin off (M == 0). EDIT: Which means I've got an error in the I/O block diagram, because I explicitly label the IN switch as controlled by M == 0.

evanh · 2021-03-14 02:10

Real-time packet validation could be added to the receive routine. Invalid parity, header and footer could all trigger a restart of the wait-for-next-packet.

Also, the deglitcher could be employed to help too. The normal method in UARTs is 16 samples over a full bit period and majority vote. But since the deglitcher is unanimous vote we will have to restrict the window. I'd aim for 5 samples over quarter bit period. eg:

        hubset( 1<<30 | 3<<7 | 2<<5 | 4 )       'set filt3 to 5 samples at interval of 16 sysclocks/sample (window = 80 sysclocks)
        pinstart( rxd, P_ASYNC_RX | P_FILT3_AB | P_SYNC_IO | P_SCHMITT_A, bitperiod, 0 )   ' use filt3 for serial receive

evanh · 2021-03-14 03:39

@evanh said:
I like what you've done with bit-bashing around the smartpin's enable/disable. I'd never thought that would even work without first dropping back to smartpin off (M == 0). EDIT: Which means I've got an error in the I/O block diagram, because I explicitly label the IN switch as controlled by M == 0.

Ha, I was right already. I've only just now tested it, having assumed before. M == 0 is the switch after all. It did make the most sense.

So, Jon, the FLTL/DRVL pair aren't working the way you wanted in the "sbus_rx" routine. The TESTP rxd will never set C.

EDIT: It is flushing any excess from the smartpin though. That's likely a good thing.

JonnyMac · 2021-03-14 05:32

I just spoke with Chip; you're right. Not a problem for my S.BUS object as I can define the S.BUS mode as a constant.

JonnyMac · 2021-03-14 18:51

@iseries, @TonyB_ , @evanh

I feel like the first cut of my P2 S.BUS receiver and demo is ready to make public. Before I send to Parallax for inclusion in their library, I'd appreciate any additional feedback on the finished object.

I've tested with a SkyFly FS-IA10B set to S.BUS mode which puts out 130 packets/second (fast mode). Everything seems to be working smoothly. I've tested in Propeller Tool and with FlexGUI (set T_TYPE to T_ANSI) so it should be okay with any OS.

Thanks, again, for comments and feedback. My friend Rick does effects for the new Star Wars TV shows (Mandalorian, Book of Boba Fett, Kenobi) and this is a tool he needs for puppeteering live animatronics using the Propeller (their shop uses Arduinos for this now, but -- like me -- he doesn't care for the Arduino).

evanh · 2021-03-15 00:10

Oh, that's so easy in spin how the DAT section pasm variables can be used in spin too. Well, I'm guessing not as a variable though, more like a initial object create parameter I presume. My spin knowledge is near non-existent.

evanh · 2021-03-15 01:09

Here's the deglitcher additions (untested of course): Two lines inserted between pinstart() and coginit(), and the pin mode config word.

  pinstart(rxpin, M_SBUS, x, 0)                                 ' start smart pin rx for S.BUS

  x = (encod(x +/ 20) - 16) #> 0                                ' x = priority encoded (bit period / 20)
  hubset(1<<30 | 3<<7 | 2<<5 | x)                               ' set filt3 to 5 samples at interval of serial-bit-period/20

  cog := coginit(COGEXEC_NEW, @entry, @rxp) + 1                 ' start uart manager cog

M_SBUS          long      P_ASYNC_RX | P_INVERT_IN | P_FILT3_AB | P_SYNC_IO | P_SCHMITT_A            ' smart pin rx uart, inverted and deglitched

evanh · 2021-03-15 02:06

No need to explicitly strip the parity bit, the SETBYTE does that naturally.

And here's the parity check that goes straight after the SETBYTE:

        ones    t1, t1                  ' parity check
        testb   t1, #0      wz          ' Z set when odd count (parity)
    if_z    jmp #sbus_rx                ' if odd (not even), wait for next packet

evanh · 2021-03-15 02:57

Actually, to do the deglitch fully, the same should be applied to the activity waiting too:

sbus_rx         fltl      rxd                                   ' reset uart, release smart pin
                wrpin     P_SBUS, rxd                           ' restore tri-state mode

P_SBUS          long      P_FILT3_AB | P_SYNC_IO | P_SCHMITT_A                                  ' deglitched only

evanh · 2021-03-15 03:12

And code size shrink for the verify routine:

verify
        getbyte   t1, #sbusraw+(24>>2), #(24&3)     ' footer @ index 24, expecting $00
        rolbyte   t1, #sbusraw, #0          ' header @ index 0, expecting $0f
        cmp       t1, #$0f      wcz
    if_ne   jmp       #sbus_rx              ' if no match, wait for next packet

TonyB_ · 2021-03-15 18:10

@evanh said:
And code size shrink for the verify routine:

verify
      getbyte   t1, #sbusraw+(24>>2), #(24&3)     ' footer @ index 24, expecting $00
      rolbyte   t1, #sbusraw, #0          ' header @ index 0, expecting $0f
      cmp       t1, #$0f      wcz
  if_ne   jmp       #sbus_rx              ' if no match, wait for next packet

No # before sbusraw. Also bad practice I think to use wcz if c not needed.

More code shrinking:

'               altgb     icog, #sbusraw                        ' get flags byte
'to
'               jmp       #sbus_rx                              ' back to top
'replaced by

                mov       chwork, #0                            ' clear ch17/ch18 workspace
                testb     sbusraw+23/4, #0+(23&3)*8     wc      ' ch17 is digital
                bitc      chwork, #00 + (10 << 5)               ' 0 if nc, 2047 if c
                testb     sbusraw+23/4, #1+(23&3)*8     wc      ' ch18 is digital
                bitc      chwork, #16 + (10 << 5)               ' 0 if nc, 2047 if c
                wrlong    chwork, ptrb

                ' check loss-of-signal and failsafe flags

chk_los         testb     sbusraw+23/4, #2+(23&3)*8     wc
                subx      t2,t2                                 ' 0/-1 if nc/c
                wrlong    t2, ptra[3]                           ' write to hub

chk_failsafe    testb     sbusraw+23/4, #3+(23&3)*8     wc
                subx      t2,t2                                 ' 0/-1 if nc/c
                wrlong    t2, ptra[4]                           ' write to hub

                rdlong    t2, ptra[5]                           ' update packet count
                add       t2, #1
                wrlong    t2, ptra[5]

                wrlong    #0, ptra[6]

                jmp       #sbus_rx                              ' back to top

evanh · 2021-03-16 04:05

Bugger, you're right. And I've led you up the garden path too. Certainly can't go embedding maths in an assembly line for variables/registers. That only works for constants/immediates.

Both our code shrinks are broken.

JonnyMac · 2021-03-16 04:56

No worries, guys, I appreciate the feedback. I tend to focus on writing obvious code that new programmers can follow. I re-implemented the parity check. If there is a parity error, or header/footer error the lost frame flag is set.

Another reason for simplicity is back-porting it to the P1 while trying to make things as similar as is possible.

The exercise has been really good, and I appreciate the suggestions.

TonyB_ · 2021-03-16 10:43

@evanh said:
Bugger, you're right. And I've led you up the garden path too. Certainly can't go embedding maths in an assembly line for variables/registers. That only works for constants/immediates.

Both our code shrinks are broken.

PM sent.

TonyB_ · 2021-03-16 11:43

@evanh said:
Bugger, you're right. And I've led you up the garden path too. Certainly can't go embedding maths in an assembly line for variables/registers. That only works for constants/immediates.

Both our code shrinks are broken.

I think my code will work. The only problem with your code was the # before sbusraw. FlexSpin can handle register arithmetic and assembles the following code correctly.

verify
        getbyte t1, sbusraw+(24>>2), #(24&3)    ' footer @ index 24, expecting $00
        rolbyte t1, sbusraw, #0                 ' header @ index 0, expecting $0f
        cmp     t1, #$0f            wz
 if_ne  jmp     #sbus_rx                        ' if no match, wait for next packet

I'm not to smart when it come to smart pins (Serial SBUS Driver)

Comments