Shop OBEX P1 Docs P2 Docs Learn Events
I'm not to smart when it come to smart pins (Serial SBUS Driver) - Page 2 — Parallax Forums

I'm not to smart when it come to smart pins (Serial SBUS Driver)

2

Comments

  • evanhevanh Posts: 16,040
    edited 2021-03-13 05:17

    Jon,
    Here's an optimisation for the latter part - reduces 13 longs down to 5 longs. It does expect "i" changed from an index with ALTGB into a direct pointer to the specific register:

            ' test flags
            testb   i, #7           wc      ' ch17 is digital
            bitc    tmp, #0+(10<<5)             ' if NC then 0, if C then 2047
            testb   i, #6           wc      ' ch18 is digital
            bitc    tmp, #16+(10<<5)            ' if NC then 0, if C then 2047
            wrlong  tmp, p_ch
    

    To keep it as an index would need one more instruction:

            ' test flags
            getbyte i, #.sbus_raw + (23 >> 2), #23 & 3  ' packet byte 23
            testb   i, #7           wc      ' ch17 is digital
            bitc    tmp, #0+(10<<5)             ' if NC then 0, if C then 2047
            testb   i, #6           wc      ' ch18 is digital
            bitc    tmp, #16+(10<<5)            ' if NC then 0, if C then 2047
            wrlong  tmp, p_ch
    

    PS: Neither solution is tested.

    Regarding the bit-order reversal, that seems overly complicated. The byte receive routine should be able format each received byte correctly before placing in the array. Most likely this is a case of using SHR instead of SHL when receiving the serial bits.

  • evanhevanh Posts: 16,040
    edited 2021-03-13 05:08

    Heh, the first one fails for sure. It needs the ch17/ch18 TESTB bit constants corrected for longword addressing. The bit offsets will change when the bit order is changed anyway.

  • Thanks for the tips, Evan. I used the first suggestion and found it worked, even shaving of another 0.3 microseconds. I posted the latest version above.

    Regarding the bit-order reversal, that seems overly complicated.

    I'm using a smart pin UART so there is no choice in the matter. If I was bit-banging the UART, it could be done in that process (as Jason does in his P1 code).

  • JonnyMacJonnyMac Posts: 9,160
    edited 2021-03-13 05:48

    @evanh said:
    Heh, the first one fails for sure. It needs the ch17/ch18 TESTB bit constants corrected for longword addressing. The bit offsets will change when the bit order is changed anyway.

    This is what I put into the code and I have verified that it works. I'm testing with a fixed set of data and have manually changed the flag bits and seen the corresponding change in the channel values array. It works.

                    mov       tmp, #0                               ' clear ch17/ch18 workspace
                    testb     v, #7                         wc      ' ch17 is digital
                    bitc      tmp, #00 + (10 << 5)                  ' 0 if nc, 2047 if c
                    testb     v, #6                         wc      ' ch18 is digital
                    bitc      tmp, #16 + (10 << 5)                  ' 0 if nc, 2047 if c    
                    wrlong    tmp, ptrb
    

    TBH, I don't know why bitc works the way it does in this situation, but I'm very tired. I will figure it out in the morning.
    Figured it out; (10 << 5) is creating a group of 11 bits that gets filled with C. Okay, I can sleep now. :)

  • evanhevanh Posts: 16,040
    edited 2021-03-13 06:06

    Ah, yes, I see the MOV is needed. I keep forgetting the unspecified bits are not cleared.

    The reason why I'm saying the first example won't work was because it wasn't meant to be preceded by a GETBYTE at all. It was supposed to operate on longwords, and could but needs more tuning.

    You can shave off one more program longword by ditching the final ALTGB and hard coding the final GETBYTE as per my second example.

  • @JonnyMac
    Happy to help. The code you have now is as I thought with one extra long saved at the end by Evan's bitc that sets a contiguous group of bits to c. Only four bits are available with a 9-bit immediate for group size - 1, enough in this case but not always so.

    Another long could be saved if the position of flags in the array is fixed and if I understood the algorithm (I just had a quick look at about 1 am) I could say whether any further improvement could be made. Are you sure that bits in each byte cannot be reversed before storing in the cog array?

  • 25 bytes of data are sent then a 10 to 20 millisecond pause before the next 25 bytes are sent. So there is this amount of time to process the data which is easy enough to do even on a P1 which is where this code was first used.

    I use a lock while the data is being updated because I zero the channel value just prior to it being updated so I don't want the data to be used until I know it's valid. I could use a different variable to collect the value and then just assign it to the channel but it's nice to have all the values updated at the same time.

    Mike

  • evanhevanh Posts: 16,040

    @TonyB_ said:
    ... Are you sure that bits in each byte cannot be reversed before storing in the cog array?

    I have my doubts there needs to be any reversing. RS232 is lsb first and the smartpins only understand lsb-first. So unless the data is explicitly mangled by the sender, there shouldn't be any reordering needed. Just needs the SHR rxdata, #32-8 to move the byte to the least 8-bits is all.

  • TonyB_TonyB_ Posts: 2,196
    edited 2021-03-13 14:24

    @evanh said:

    @TonyB_ said:
    ... Are you sure that bits in each byte cannot be reversed before storing in the cog array?

    I have my doubts there needs to be any reversing. RS232 is lsb first and the smartpins only understand lsb-first. So unless the data is explicitly mangled by the sender, there shouldn't be any reordering needed. Just needs the SHR rxdata, #32-8 to move the byte to the least 8-bits is all.

    If needed, reversing the bits of each byte in a long is trivial:

        movbyts D,#%%0123
        rev     D
    
  • JonnyMacJonnyMac Posts: 9,160
    edited 2021-03-13 18:40

    @TonyB_ said:
    If needed, reversing the bits of each byte in a long is trivial:

      movbyts D,#%%0123
      rev     D
    

    That works with single bytes. I tend to use this strategy which always works, even when the value is greater than 8 bits.

                    rev       chval                                 ' flip back
                    rol       chval, #11                            ' restore LSB position
    

    Some of this may not make sense because you're not seeing the whole picture. In case you want to play and make more suggestions (which I always appreciate), here is a data set that I captured from an RC receiver.

    dat  
    
      SBUS          byte    $0F, $03, $04, $E0, $3E, $00, $08, $0F, $78, $C0, $E3, $E1, $00
                    byte    $7C, $38, $00, $01, $08, $40, $00, $02, $10, $80, $00, $00
    

    Byte 0 is the header ($0F)
    Bytes 1-22 contain packed channel values (16, each is 11 bits)
    Byte 23 is the flags
    -- bit0 is ch17
    -- bit1 is ch18
    -- bit2 is loss-of-signal
    -- bit3 indicates failsafe mode (tends to be throttle channel to 0)
    Byte 24 is the footer ($00)

    Again, these values are from the receiver, hence are in "normal" order. In Jason Dorie's P1 S.BUS receiver he deliberately receives the bits in reverse order, I believe in support of his unpacking algorithm. If the bytes in the packet above can be quickly unpacked without having them reverse first, this is good.

    These are the channel values I get from the code I posted, as well as from a Spin translation of the algorithm used in many C versions of S.BUS receiver/decoders.

     1  1027
     2  1024
     3   251
     4  1024
     5   240
     6   240
     7   240
     8  1807
     9  1024
    10  1807
    11  1024
    12  1024
    13  1024
    14  1024
    15  1024
    16  1024
    17     0
    18     0
    

    I am displaying the device channel numbers, not the array index.

    This is the unpacking code found in most C demos for S.BUS.

      channels[0]  = ((sbusData[1]|sbusData[2]<< 8) & 0x07FF);
      channels[1]  = ((sbusData[2]>>3|sbusData[3]<<5) & 0x07FF);
      channels[2]  = ((sbusData[3]>>6|sbusData[4]<<2|sbusData[5]<<10) & 0x07FF);
      channels[3]  = ((sbusData[5]>>1|sbusData[6]<<7) & 0x07FF);
      channels[4]  = ((sbusData[6]>>4|sbusData[7]<<4) & 0x07FF);
      channels[5]  = ((sbusData[7]>>7|sbusData[8]<<1|sbusData[9]<<9) & 0x07FF);
      channels[6]  = ((sbusData[9]>>2|sbusData[10]<<6) & 0x07FF);
      channels[7]  = ((sbusData[10]>>5|sbusData[11]<<3) & 0x07FF); 
      channels[8]  = ((sbusData[12]|sbusData[13]<< 8) & 0x07FF);
      channels[9]  = ((sbusData[13]>>3|sbusData[14]<<5) & 0x07FF);
      channels[10] = ((sbusData[14]>>6|sbusData[15]<<2|sbusData[16]<<10) & 0x07FF);
      channels[11] = ((sbusData[16]>>1|sbusData[17]<<7) & 0x07FF);
      channels[12] = ((sbusData[17]>>4|sbusData[18]<<4) & 0x07FF);
      channels[13] = ((sbusData[18]>>7|sbusData[19]<<1|sbusData[20]<<9) & 0x07FF);
      channels[14] = ((sbusData[20]>>2|sbusData[21]<<6) & 0x07FF);
      channels[15] = ((sbusData[21]>>5|sbusData[22]<<3) & 0x07FF);
    

    My gut tells me that Jason's idea to reverse the bytes and unpack them the way he does is going to be more efficient that a PASM conversion of this.

  • JonnyMacJonnyMac Posts: 9,160
    edited 2021-03-13 17:01

    @evanh said:
    I have my doubts there needs to be any reversing. RS232 is lsb first and the smartpins only understand lsb-first. So unless the data is explicitly mangled by the sender, there shouldn't be any reordering needed. Just needs the SHR rxdata, #32-8 to move the byte to the least 8-bits is all.

    You may have a different thought after seeing what the C programmers are doing to unpack the channel data, but I reserve the right to be wrong , and would love to see your approach.

    Here's how I am receiving a packet. I'm using a 250us (> 2 bytes @ 100Kbaud, 8E2, inverted) period to locate the idle time between packets.

    sbus_rx         fltl      rxd                                   ' reset uart, release smart pin
    
    .wquiet         mov       x, #250                               ' wait for low between packets
    .wq1            waitx     us1
                    testp     rxd                           wc
        if_c        jmp       #.wquiet                              ' if activity, restart wait
                    djnz      x, #.wq1
    
                    drvl      rxd                                   ' re-enable smart pin
                    mov       i, #0                                 ' reset packet index
    
    .get_sbus       testp     rxd                           wc      ' anything waiting?
        if_nc       jmp       #.get_sbus                            '  not yet
    
                    rdpin     b, rxd                                ' read new byte
                    shr       b, #23                                ' align lsb
                    and       b, #$FF                               ' strip parity bit
    
    .flip           rev       b                                     ' reverse bits
                    rol       b, #8
    
                    altsb     i, #.sbus_raw                         ' write to cog array
                    setbyte   b
    
                    incmod    i, #24                        wc      ' update array index
                    tjnz      i, #.get_sbus                         '  go again if not done
    

    I used this code to capture the packet from my receiver (see previous post), but I wrote it to a hub array instead of a cog array. The smart pin is configured by the Spin2 interface code.

  • JonnyMacJonnyMac Posts: 9,160
    edited 2021-03-14 03:58

    @TonyB_ and @evanh

    Well, you were right, reversing the bytes is not necessary for speedy decompression. This is what I came up with. It shaves off a couple hundredths microseconds and may be easier to understand (eye of the beholder, I guess).

    Thanks, again, for the feedback and suggestions. I'm not yet a strong PASM programmer -- but am working my way there.

    Edit: Restored original code -- uses shl to move new byte into correct position and then or to incorporate into working value.

                    ' unpack s.bus bytes to channel values
    
    unpack          mov       ptrb, p_ch                            ' point to hub array
                    mov       ch, #0                                ' set hub index     
                    mov       i, #1                                 ' s.bus data index (cog)
                    mov       tmp, #0                               ' workspace
                    mov       bc, #0                                ' bits in tmp                    
    
    .get_byte       altgb     i, #sbusraw                           ' get s.bus byte from cog array
                    getbyte   v                                     ' get next byte
                    add       i, #1                                 ' bump byte array index
                    shl       v, bc                                 ' move into position
                    or        tmp, v                                ' add to work byte       
                    add       bc, #8                                ' update bit count
                    cmpsub    bc, #11                       wc      ' enough bits for channel value?
        if_nc       jmp       #.get_byte
    
                    mov       out, tmp                              ' make a copy
                    and       out, ##$07FF                          ' clean-up
                    wrword    out, ptrb++                           ' write to hub
    
                    shr       tmp, #11                              ' remove last value
    
                    incmod    ch, #15                       wc      ' update hub index
                    tjnz      ch, #.get_byte                        ' keep going if more channels
    
                    ' get digital channels from flags byte
    
                    altgb     i, #sbusraw                           ' get flags byte
                    getbyte   v
    
                    mov       tmp, #0                               ' clear ch17/ch18 workspace
                    testb     v, #0                         wc      ' ch17 is digital
                    bitc      tmp, #00 + (10 << 5)                  ' 0 if nc, 2047 if c
                    testb     v, #1                         wc      ' ch18 is digital
                    bitc      tmp, #16 + (10 << 5)                  ' 0 if nc, 2047 if c
                    wrlong    tmp, ptrb
    
  • JonnyMacJonnyMac Posts: 9,160
    edited 2021-03-13 22:10

    @iseries

    After re-writing the S.BUS unpacking code to work with standard (non-reversed) bytes, I translated it to Spin2.

    pub unpack_sbus() | i, j, workspace, bitcount
    
      i := 1
      longfill(@j, 0, 3)
    
      repeat while (j < 16)
        repeat while (bitcount < 11)
          workspace |= (SBus[i++] << bitcount)
          bitcount += 8
        channel[j++] := workspace & $07FF
        workspace >>= 11
        bitcount -= 11
    
      channel[16] := (SBus[23] & %0001) ? 2047 : 0
      channel[17] := (SBus[23] & %0010) ? 2047 : 0 
    

    I tested this and it works. Now, I'm not much of a C programmer, but I made this attempt at a translation. In your S.BUS library you are doing a bit-by-bit unpacking which is probably consuming more time than necessary. As we say here in Hollywood... for your consideration:

    void unpack_sbus()
    {
      uint8_t  i = 1;                                               \\ s.bus index
      uint8_t  j = 0;                                               \\ channel value index
      uint8_t  bitcount = 0;
      uint32_t workspace = 0;
    
      while (j < 16) {
        while (bitcount < 11) {                                     \\ build channel value
          workspace |= (ss[i++] << bitcount);                       \\ get byte, add to workspace
          bitcount += 8;                                            \\ update bitcount
        }
        channel[j++] = uint16_t(workspace & 0x07FF);                \\ write to channel
        workspace >>= 11;                                           \\ remove old value
        bitcount -= 11;                                             \\ update bitcount
      }
    
      channel[16] = (ss[23] & 0x01) ? 2047 : 0;                     \\ extract digital channels
      channel[17] = (ss[23] & 0x02) ? 2047 : 0; 
    }
    

    The C version has not been tested.

  • @JonnyMac said:
    @TonyB_ and @evanh

    Well, you were right, reversing the bytes is not necessary for speedy decompression. This is what I came up with. It shaves off a couple hundredths microseconds and may be easier to understand (eye of the beholder, I guess).

    Thanks, again, for the feedback and suggestions. I'm not yet a strong PASM programmer -- but am working my way there.

    @JonnyMac
    Thanks for the full explanation about S-BUS channel data. The rolbyte optimization is missing in your latest code.

    I've written an unpacker which is about twice the size but very fast, less than a microsecond at 200MHz. I could post it tomorrow after checking (but not testing).

  • JonnyMacJonnyMac Posts: 9,160
    edited 2021-03-14 04:01

    The rolbyte optimization is missing in your latest code.

    The use of rolbyte only applies when the bits are flipped; in standard order we have to shift the new byte into position and then add it.

    As the routine stands, it takes about 3.7us -- less than half a bit period. According to information I've read, S.BUS packets come every 14ms (standard speed) or 7ms (high speed). Even in high speed mode there is 4ms to unpack and update the channel values. That is to say that I'm going to live with 3-4 microseconds and not have a worry.

  • evanhevanh Posts: 16,040
    edited 2021-03-14 00:55

    Looking good Jon.

    I like what you've done with bit-bashing around the smartpin's enable/disable. I'd never thought that would even work without first dropping back to smartpin off (M == 0). EDIT: Which means I've got an error in the I/O block diagram, because I explicitly label the IN switch as controlled by M == 0.

  • evanhevanh Posts: 16,040
    edited 2021-03-14 02:20

    Real-time packet validation could be added to the receive routine. Invalid parity, header and footer could all trigger a restart of the wait-for-next-packet.

    Also, the deglitcher could be employed to help too. The normal method in UARTs is 16 samples over a full bit period and majority vote. But since the deglitcher is unanimous vote we will have to restrict the window. I'd aim for 5 samples over quarter bit period. eg:

            hubset( 1<<30 | 3<<7 | 2<<5 | 4 )       'set filt3 to 5 samples at interval of 16 sysclocks/sample (window = 80 sysclocks)
            pinstart( rxd, P_ASYNC_RX | P_FILT3_AB | P_SYNC_IO | P_SCHMITT_A, bitperiod, 0 )   ' use filt3 for serial receive
    
  • evanhevanh Posts: 16,040
    edited 2021-03-14 03:45

    @evanh said:
    I like what you've done with bit-bashing around the smartpin's enable/disable. I'd never thought that would even work without first dropping back to smartpin off (M == 0). EDIT: Which means I've got an error in the I/O block diagram, because I explicitly label the IN switch as controlled by M == 0.

    Ha, I was right already. I've only just now tested it, having assumed before. M == 0 is the switch after all. It did make the most sense.

    So, Jon, the FLTL/DRVL pair aren't working the way you wanted in the "sbus_rx" routine. The TESTP rxd will never set C.

    EDIT: It is flushing any excess from the smartpin though. That's likely a good thing.

  • I just spoke with Chip; you're right. Not a problem for my S.BUS object as I can define the S.BUS mode as a constant.

  • @iseries, @TonyB_ , @evanh

    I feel like the first cut of my P2 S.BUS receiver and demo is ready to make public. Before I send to Parallax for inclusion in their library, I'd appreciate any additional feedback on the finished object.

    I've tested with a SkyFly FS-IA10B set to S.BUS mode which puts out 130 packets/second (fast mode). Everything seems to be working smoothly. I've tested in Propeller Tool and with FlexGUI (set T_TYPE to T_ANSI) so it should be okay with any OS.

    Thanks, again, for comments and feedback. My friend Rick does effects for the new Star Wars TV shows (Mandalorian, Book of Boba Fett, Kenobi) and this is a tool he needs for puppeteering live animatronics using the Propeller (their shop uses Arduinos for this now, but -- like me -- he doesn't care for the Arduino).

  • evanhevanh Posts: 16,040
    edited 2021-03-15 00:11

    Oh, that's so easy in spin how the DAT section pasm variables can be used in spin too. Well, I'm guessing not as a variable though, more like a initial object create parameter I presume. My spin knowledge is near non-existent. :)

  • evanhevanh Posts: 16,040

    Here's the deglitcher additions (untested of course): Two lines inserted between pinstart() and coginit(), and the pin mode config word.

      pinstart(rxpin, M_SBUS, x, 0)                                 ' start smart pin rx for S.BUS
    
      x = (encod(x +/ 20) - 16) #> 0                                ' x = priority encoded (bit period / 20)
      hubset(1<<30 | 3<<7 | 2<<5 | x)                               ' set filt3 to 5 samples at interval of serial-bit-period/20
    
      cog := coginit(COGEXEC_NEW, @entry, @rxp) + 1                 ' start uart manager cog
    
    M_SBUS          long      P_ASYNC_RX | P_INVERT_IN | P_FILT3_AB | P_SYNC_IO | P_SCHMITT_A            ' smart pin rx uart, inverted and deglitched
    
  • evanhevanh Posts: 16,040
    edited 2021-03-15 02:33

    No need to explicitly strip the parity bit, the SETBYTE does that naturally.

    And here's the parity check that goes straight after the SETBYTE:

            ones    t1, t1                  ' parity check
            testb   t1, #0      wz          ' Z set when odd count (parity)
        if_z    jmp #sbus_rx                ' if odd (not even), wait for next packet
    
  • evanhevanh Posts: 16,040

    Actually, to do the deglitch fully, the same should be applied to the activity waiting too:

    sbus_rx         fltl      rxd                                   ' reset uart, release smart pin
                    wrpin     P_SBUS, rxd                           ' restore tri-state mode
    
    P_SBUS          long      P_FILT3_AB | P_SYNC_IO | P_SCHMITT_A                                  ' deglitched only
    
  • evanhevanh Posts: 16,040

    And code size shrink for the verify routine:

    verify
            getbyte   t1, #sbusraw+(24>>2), #(24&3)     ' footer @ index 24, expecting $00
            rolbyte   t1, #sbusraw, #0          ' header @ index 0, expecting $0f
            cmp       t1, #$0f      wcz
        if_ne   jmp       #sbus_rx              ' if no match, wait for next packet
    
  • TonyB_TonyB_ Posts: 2,196
    edited 2021-03-16 10:26

    @evanh said:
    And code size shrink for the verify routine:

    verify
          getbyte   t1, #sbusraw+(24>>2), #(24&3)     ' footer @ index 24, expecting $00
          rolbyte   t1, #sbusraw, #0          ' header @ index 0, expecting $0f
          cmp       t1, #$0f      wcz
      if_ne   jmp       #sbus_rx              ' if no match, wait for next packet
    

    No # before sbusraw. Also bad practice I think to use wcz if c not needed.

    More code shrinking:

    '               altgb     icog, #sbusraw                        ' get flags byte
    'to
    '               jmp       #sbus_rx                              ' back to top
    'replaced by
    
                    mov       chwork, #0                            ' clear ch17/ch18 workspace
                    testb     sbusraw+23/4, #0+(23&3)*8     wc      ' ch17 is digital
                    bitc      chwork, #00 + (10 << 5)               ' 0 if nc, 2047 if c
                    testb     sbusraw+23/4, #1+(23&3)*8     wc      ' ch18 is digital
                    bitc      chwork, #16 + (10 << 5)               ' 0 if nc, 2047 if c
                    wrlong    chwork, ptrb
    
                    ' check loss-of-signal and failsafe flags
    
    chk_los         testb     sbusraw+23/4, #2+(23&3)*8     wc
                    subx      t2,t2                                 ' 0/-1 if nc/c
                    wrlong    t2, ptra[3]                           ' write to hub
    
    chk_failsafe    testb     sbusraw+23/4, #3+(23&3)*8     wc
                    subx      t2,t2                                 ' 0/-1 if nc/c
                    wrlong    t2, ptra[4]                           ' write to hub
    
                    rdlong    t2, ptra[5]                           ' update packet count
                    add       t2, #1
                    wrlong    t2, ptra[5]
    
                    wrlong    #0, ptra[6]
    
                    jmp       #sbus_rx                              ' back to top
    
  • evanhevanh Posts: 16,040
    edited 2021-03-16 04:07

    Bugger, you're right. And I've led you up the garden path too. Certainly can't go embedding maths in an assembly line for variables/registers. That only works for constants/immediates.

    Both our code shrinks are broken.

  • No worries, guys, I appreciate the feedback. I tend to focus on writing obvious code that new programmers can follow. I re-implemented the parity check. If there is a parity error, or header/footer error the lost frame flag is set.

    Another reason for simplicity is back-porting it to the P1 while trying to make things as similar as is possible.

    The exercise has been really good, and I appreciate the suggestions.

  • @evanh said:
    Bugger, you're right. And I've led you up the garden path too. Certainly can't go embedding maths in an assembly line for variables/registers. That only works for constants/immediates.

    Both our code shrinks are broken.

    PM sent.

  • TonyB_TonyB_ Posts: 2,196
    edited 2021-03-16 11:45

    @evanh said:
    Bugger, you're right. And I've led you up the garden path too. Certainly can't go embedding maths in an assembly line for variables/registers. That only works for constants/immediates.

    Both our code shrinks are broken.

    I think my code will work. The only problem with your code was the # before sbusraw. FlexSpin can handle register arithmetic and assembles the following code correctly.

    verify
            getbyte t1, sbusraw+(24>>2), #(24&3)    ' footer @ index 24, expecting $00
            rolbyte t1, sbusraw, #0                 ' header @ index 0, expecting $0f
            cmp     t1, #$0f            wz
     if_ne  jmp     #sbus_rx                        ' if no match, wait for next packet
    
Sign In or Register to comment.