Manipulating Bits in a LONG — Parallax Forums

JonnyMac Posts: 9,107
edited 2013-08-17 19:29 in Propeller 1
For an RGB LED driver I'd like to keep the colors in the high-level interface as $RR_GG_BB but the device itself wants the order as $GG_RR_BB. In the world of Captain Obvioso (that would be me), I'm doing this:
' adjust colorbits
                        ' -- starts as $RR_GG_BB
                        ' -- ends as $GG_RR_BB

                        mov     t1, colorbits                   ' make copies
                        mov     t2, colorbits
                        and     t1, RED_BITS                    ' isolate colors
                        and     t2, GRN_BITS
                        and     colorbits, BLU_BITS
                        shr     t1, #8                          ' reposition red
                        shl     t2, #8                          ' reposition green
                        or      colorbits, t2                   ' reconstruct
                        or      colorbits, t1

                        ' shift out bits here
                        
                        
RED_BITS                long    $FF_00_00
GRN_BITS                long    $00_FF_00
BLU_BITS                long    $00_00_FF


I know there are some with wild-eyed assembly tricks for bit manipulations -- and I'm always game to learn. Timing is not a problem here, but faster tends to be better.
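
For anyone who wants to sanity-check the mask-and-shift sequence off-chip, here is a minimal Python sketch of the same steps. The function name `rgb_to_grb` is made up for illustration; the masks mirror RED_BITS, GRN_BITS and BLU_BITS above.

```python
RED_BITS = 0xFF0000
GRN_BITS = 0x00FF00
BLU_BITS = 0x0000FF

def rgb_to_grb(colorbits):
    """Model of the PASM sequence: $RR_GG_BB in, $GG_RR_BB out."""
    t1 = colorbits & RED_BITS      # isolate red
    t2 = colorbits & GRN_BITS      # isolate green
    blue = colorbits & BLU_BITS    # isolate blue
    t1 >>= 8                       # reposition red one byte down
    t2 <<= 8                       # reposition green one byte up
    return blue | t2 | t1          # reconstruct

print(hex(rgb_to_grb(0x112233)))   # 0x221133
```

The later replies can be checked against this function as a reference result.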
Comments

  • tonyp12 Posts: 1,951
    edited 2013-07-16 15:01
    The movi trick. It works on up to 31-bit data, but not on 32-bit data unless you handle the fact that the last movi will trash bit22.
    step1: get the long into position so that the lower byte is correct, using a shift or rol
    step2: get the byte you want to merge into bits 8..1; bit 0 is a don't-care
    step3: use movi
    step4: rol the long 8 bits
    step5: repeat from step2 one more time for 24-bit data

    If you could adjust the shiftout so that it does a 24-bit, msb-first left shift, you could make the last rol colorbits,#16 and delete the and.
                            mov     t1, colorbits                   ' 00_RR_GG_BB   copy
                            shr     colorbits,#8                    ' 00_00_RR_GG   step1
                            ror     t1,#15                          ' gG_BB_00_RRg  step2
                            movi    colorbits,t1                    ' RRg_00_RR_GG  step3
                            rol     colorbits,#8                    ' g00_RR_GG_RR  step4
                            ror     t1,#16                          ' 00_RR_GG_BB0  step2 (needs #16, not #8, to put BB in bits 8..1)
                            movi    colorbits,t1                    ' BB0_RR_GG_RR  step3
                            rol     colorbits,#8                    ' 0RR_GG_RR_BB  step4
                            and     colorbits,_$FFFFFF              ' only needed if shiftout doesn't ignore the upper byte
    
                            ' shift out bits here
    
  • kwinn Posts: 8,697
    edited 2013-07-16 15:10
    You're probably thinking of how data in memory can be swapped without using temporary locations by using XOR instructions. XOR a, b then XOR b, a followed by XOR a, b will swap the data in those two locations. What you have looks like it will be as fast as it gets.
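
A minimal Python sketch of the three-XOR swap described above, following the PASM convention that the first operand is the destination. The function name `xor_swap` is only for illustration.

```python
def xor_swap(a, b):
    """Swap two values with three XORs and no temporary."""
    a ^= b    # XOR a, b : a = a0 ^ b0
    b ^= a    # XOR b, a : b = b0 ^ a0 ^ b0 = a0
    a ^= b    # XOR a, b : a = a0 ^ b0 ^ a0 = b0
    return a, b

print([hex(v) for v in xor_swap(0x12, 0x34)])   # ['0x34', '0x12']
```

One caveat: if both operands are the same register, the first XOR clears it, so the trick only works on two distinct locations.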
  • skylight Posts: 1,915
    edited 2013-07-16 15:23
    I'm probably being daft here, usually am :smile: but what about keeping the code straightforward and rewiring by swapping the red and green outputs? ie wire green output to red and vice versa.
  • kwinn Posts: 8,697
    edited 2013-07-16 16:15
    skylight wrote: »
    I'm probably being daft here, usually am :smile: but what about keeping the code straightforward and rewiring by swapping the red and green outputs? ie wire green output to red and vice versa.

    It's a lot simpler to write the 9 instructions to shift the data around than it would be to change a board layout or rewire a board/connector.
  • JonnyMac Posts: 9,107
    edited 2013-07-16 16:44
    skylight wrote: »
    I'm probably being daft here, usually am but what about keeping the code straightforward and rewiring by swapping the red and green outputs? ie wire green output to red and vice versa.

    That is not an option. The device in question is a WS2812 RGB 5050 LED (with built-in driver) and it wants the data as $GG_RR_BB. Why? I have no idea. As I try to keep my code obvious I want to pass colors as $RR_GG_BB. In the end, the easy code works, the driver works, and I'm going to put a fork in it.
  • jazzed Posts: 11,803
    edited 2013-07-16 17:12
    Essentially the same as tonyp's except no need for mask.
                            ' adjust colorbits
                            ' -- starts as $RR_GG_BB
                            ' -- ends as $GG_RR_BB
    
                            ' dots are 0's ...
                            mov    t1, colorbits  ' ........_rrrrrrrr_gggggggg_bbbbbbbb
                            ror    colorbits, #17 ' rggggggg_gbbbbbbb_b......._.rrrrrrr
                            ror    t1, #16        ' gggggggg_bbbbbbbb_........_rrrrrrrr
                            movi   colorbits, t1  ' .rrrrrrr_rbbbbbbb_b......._.rrrrrrr
                            shr    colorbits, #8  ' ........_.rrrrrrr_rbbbbbbb_b.......
                            shr    t1, #24        ' ........_........_........_gggggggg
                            movi   colorbits, t1  ' .ggggggg_grrrrrrr_rbbbbbbb_b.......
                            shr    colorbits,#7   ' ........_gggggggg_rrrrrrrr_bbbbbbbb
    
    Alternative method with a HUB long variable.
                            mov    addr, someaddr
                            rdlong colorbits, addr ' ........_rrrrrrrr_gggggggg_bbbbbbbb
                            add    addr, #1        ' next byte 
                            ror    colorbits, #16  ' gggggggg_bbbbbbbb_........_rrrrrrrr
                            wrbyte colorbits, addr ' write RR to addr + 1
                            add    addr, #1        ' next byte 
                            rol    colorbits, #8   ' bbbbbbbb_........_rrrrrrrr_gggggggg
                            wrbyte colorbits, addr ' write GG to addr + 2
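
The movi version above can be traced on a PC with a small Python model. This is a sketch only: `ror` and `movi` are hand-written stand-ins for the PASM instructions (movi copies the low 9 bits of the source into bits 31..23 of the destination), and `rgb_to_grb_movi` is a made-up name.

```python
MASK32 = 0xFFFFFFFF

def ror(value, n):
    """Rotate a 32-bit value right by n bits."""
    n &= 31
    return ((value >> n) | (value << (32 - n))) & MASK32

def movi(dest, src):
    """Model of MOVI: src[8..0] replaces dest[31..23]; dest[22..0] is kept."""
    return (dest & 0x007FFFFF) | ((src & 0x1FF) << 23)

def rgb_to_grb_movi(colorbits):
    t1 = colorbits                    # mov  t1, colorbits
    colorbits = ror(colorbits, 17)    # ror  colorbits, #17
    t1 = ror(t1, 16)                  # ror  t1, #16
    colorbits = movi(colorbits, t1)   # movi colorbits, t1  (red into the top)
    colorbits >>= 8                   # shr  colorbits, #8
    t1 >>= 24                         # shr  t1, #24
    colorbits = movi(colorbits, t1)   # movi colorbits, t1  (green into the top)
    colorbits >>= 7                   # shr  colorbits, #7
    return colorbits

print(hex(rgb_to_grb_movi(0x112233)))   # 0x221133
```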
    
  • Cluso99 Posts: 18,069
    edited 2013-07-16 17:53
    Here is another solution... (8 instructions and 1 extra temp register)
                                    ' x = 00000000 rrrrrrrr gggggggg bbbbbbbb
            mov   t, x              ' t = 00000000 rrrrrrrr gggggggg bbbbbbbb
            shl   x, #24            ' x = bbbbbbbb 00000000 00000000 00000000
            ror   t, #16            ' t = gggggggg bbbbbbbb 00000000 rrrrrrrr
            movs  x, t              ' x = bbbbbbbb 00000000 00000000 rrrrrrrr
            ror   x, #8             ' x = rrrrrrrr bbbbbbbb 00000000 00000000
            shr   t, #24            ' t = 00000000 00000000 00000000 gggggggg
            or    x, t              ' x = rrrrrrrr bbbbbbbb 00000000 gggggggg
            ror   x, #16            ' x = 00000000 gggggggg rrrrrrrr bbbbbbbb
    x       long  0                 ' orig & final value
    t       long  0                 ' temporary reg
    
    
  • tonyp12 Posts: 1,951
    edited 2013-07-16 18:10
    If you assume the long is stored in Hub in the first place, the first two instructions are needed anyway, so it will add 8 longs.
    mov    t1,par
    rdbyte colorbits,t1  'get blue, gives you free "AND $FF"
    add    t1,#1
    rdbyte t2,t1         'get green
    shl    t2,#16
    or     colorbits,t2  'merge in green
    add    t1,#1
    rdbyte t2,t1         'get red
    shl    t2,#8
    or     colorbits,t2  'merge in red
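
The byte-at-a-time read works because the Propeller stores longs little-endian, so $00_RR_GG_BB sits in hub RAM as BB, GG, RR, 00 at increasing addresses. A rough Python illustration, using `struct` to stand in for hub memory (the function name is hypothetical):

```python
import struct

def grb_from_hub_bytes(rgb_long):
    """Rebuild $GG_RR_BB from the little-endian bytes of $RR_GG_BB."""
    hub = struct.pack("<I", rgb_long)   # bytes at addr+0..addr+3: BB, GG, RR, 00
    colorbits = hub[0]                  # rdbyte: blue, upper bits already zero
    colorbits |= hub[1] << 16           # green moves to bits 23..16
    colorbits |= hub[2] << 8            # red moves to bits 15..8
    return colorbits

print(hex(grb_from_hub_bytes(0x112233)))   # 0x221133
```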
    
  • Beau Schwabe Posts: 6,566
    edited 2013-07-16 21:07
    I almost feel as if this should be one of Phil's Golf Challenges... :-)

    Here is a method that uses 7 instructions
                  mov       t1,   Data            '00_RR_GG_BB    copy
                  shl       t1,   #8              'RR_GG_BB_00    shift t1 left one byte 
                  xor       t1,   Data            'RR_^^_^^_BB    xor t1 and data
                  and       t1,   Mask            '00_^^_00_00    Mask off only what we are interested in
                  xor       Data, t1              '00_GG_GG_BB    GG moves into RR
                  shr       t1,   #8              '00_00_^^_00    shift t1 right one byte
                  xor       Data, t1              '00_GG_RR_BB    RR moves into GG's place
    

    "Mask" is preset to = $00_FF_00_00

    "Data" is both your input and output
  • Cluso99 Posts: 18,069
    edited 2013-07-16 23:42
    Nice work Beau!
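
Beau's 7-instruction method is a masked XOR swap: it computes RR xor GG once, then XORs that difference into both byte positions. A small Python sketch of the same steps, with `MASK` as in his post and `swap_r_g` as an illustrative name:

```python
MASK = 0x00FF0000

def swap_r_g(data):
    """Swap the RR and GG bytes of $00_RR_GG_BB using XOR."""
    t1 = (data << 8) & 0xFFFFFFFF   # shl t1, #8    : RR_GG_BB_00
    t1 ^= data                      # xor t1, Data  : red byte slot holds GG^RR
    t1 &= MASK                      # and t1, Mask  : keep only GG^RR
    data ^= t1                      # xor Data, t1  : RR ^ (GG^RR) = GG
    t1 >>= 8                        # shr t1, #8    : move GG^RR under green
    data ^= t1                      # xor Data, t1  : GG ^ (GG^RR) = RR
    return data

print(hex(swap_r_g(0x112233)))   # 0x221133
```

Applying the function twice returns the original value, which is a handy self-check.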
  • Rayman Posts: 14,662
    edited 2013-07-17 07:37
    Was just looking at the datasheet for that device... I think maybe I'd do this in the shift out driver... Maybe create a shift out function that shifts out one byte. Then, just do a preshift before each call to that function...
  • JonnyMac Posts: 9,107
    edited 2013-07-17 14:15
    There's enough timing tolerance in the spec that you could probably get away with that. I rearrange the bits and then send the entire 24-bit packet. I've attached my "Captain Obvioso" code for the driver (will appear in my N&V column).

    [Edit] Updated driver in post #25. Will move to ObEx shortly.
  • RS_Jim Posts: 1,766
    edited 2013-07-18 07:27
    Johnny,
    I believe the color sequence comes from the video industry where they use a lot of color subtraction and it is from the green channel.
    Jim
    JonnyMac wrote: »
    That is not an option. The device in question is a WS2812 RGB 5050 LED (with built-in driver) and it wants the data as $GG_RR_BB. Why? I have no idea. As I try to keep my code obvious I want to pass colors as $RR_GG_BB. In the end, the easy code works, the driver works, and I'm going to put a fork in it.
  • kwinn Posts: 8,697
    edited 2013-07-18 13:37
    Rayman wrote: »
    Was just looking at the datasheet for that device... I think maybe I'd do this in the shift out driver... Maybe create a shift out function that shifts out one byte. Then, just do a preshift before each call to that function...

    Now that is the perfect answer when the bits are shifted out on a single pin, and it only adds 2 or 3 shift instructions to the driver.
  • Tubular Posts: 4,703
    edited 2013-07-18 14:10
    Hi Jonny

    Good to see you're onto these. The prop is well suited to driving a bunch of these (several "universes")

    I've attached some working code based on Gavin T Garner's obex code. The GRB vs RGB swap is a bit of a pain, I did it in the shift out driver. That way we can just change the driver according to the exact led
    'new G-R swap required for Neopixel.   Delete this block for other drivers that use RGB format
                  mov       GRBvalue, RGBvalue  'make copy to work on - swap RGB to GRB format for NeoPixels
                  mov       BLUvalue, RGBvalue
                  and       BLUvalue,#255
                  shl       GRBvalue,#16        'GB00
                  and       GRBvalue,mask1      'G000
                  and       RGBvalue,mask3      '0R00
                  or        GRBvalue,RGBvalue   'GR00
                  shr       GRBvalue,#8         '0GR0
                  or        GRBvalue,BLUvalue   '0GRB
    

    Btw my kids have had an absolute ball programming these, choosing the color sequences of a little "train" that runs along the strip, as well as how many leds in the "train" etc. Part of the fun is the flexible strip itself, that can be bent around the place to run up walls, loop the loop, weave between toys, etc. It has been good practice spelling colors (including "Chartreuse") and they asked me to add "brown" and "black" to the range of colors (I think we also added maroon and crimson too)

    One of the next projects we're going to try is a light version of a slot car set, with a red, green and blue "car", that can overtake each other, (you'll get yellow while red is passing green, for instance). Need to 3d print some hand throttles!

    Will post some pics/vids of proto board soon

    BTW the B version of the WS2812 (WS2812B) is extremely easy to solder, because it has 4 pins rather than 6 (one pin in each corner, spread relatively far apart). It's more like a 0.1" pitch. I have 1000 on order, hope to see them soon. I can send you some when they arrive if you like (and anyone else that wants a few samples)

    regards
    Lachlan

    EDIT: Beware; I messed around with some of the color constants in the attached code - turned down the brightness by editing the constants, most are at "half brightness"
  • JonnyMac Posts: 9,107
    edited 2013-07-18 14:50
    Lachlan,

    Thanks for the tip on the WS2812B -- I have a couple projects that they will be great for. I've already created a component and pattern for DipTrace. And, yes, I would love samples. Thanks!

    Will have a look at the adjusted constants. As I admit in the code, I liberated those from another program. Will do likewise with yours! :)
  • Cluso99 Posts: 18,069
    edited 2013-07-18 18:48
    The WS2812B specs look great! I will email you Lachlan.
  • tonyp12 Posts: 1,951
    edited 2013-07-18 19:51
    I think I got everyone beat: 2 extra longs in the shift routine vs the 12 longs used now.
    00 rr gg bb  'starting out with
    bb 00 rr gg  'ror 8
    gg bb 00 rr  'rol nbits (24)
    bb 00 rr gg  'after 8 rol colorbits, #1
    rr gg bb 00  'rol nbits (16)
    gg bb 00 rr  'after 8 rol colorbits, #1
    bb 00 rr gg  'rol nbits (8)
    
    shiftout                ror     colorbits, #8                   ' to offset the rol 24 below
                            mov     nbits,#24                       ' shift 24 bits (3 x 8) 
                            
    :loop                   test    nbits,#7                wz      ' at 24, 16 or 8?
            if_z            rol     colorbits,nbits 
                            rol     colorbits, #1           wc      ' msb --> C 
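
To see why the extra rotate inside the shift loop reorders the bytes, here is a hedged Python model of the loop. It collects the carry bit produced by each `rol colorbits, #1 wc` (C takes the old bit 31) and returns the 24 bits in the order they would go out on the wire. The helper names are invented for this sketch.

```python
MASK32 = 0xFFFFFFFF

def rol(value, n):
    """Rotate a 32-bit value left by n bits (only the low 5 bits of n count)."""
    n &= 31
    return ((value << n) | (value >> (32 - n))) & MASK32

def shiftout_bits(colorbits):
    """Return the 24 bits sent for a $RR_GG_BB input, msb first."""
    colorbits = rol(colorbits, 24)            # ror colorbits, #8
    sent = 0
    nbits = 24
    while nbits:                              # djnz nbits, #:loop
        if (nbits & 7) == 0:                  # test nbits, #7 wz
            colorbits = rol(colorbits, nbits) # if_z rol colorbits, nbits
        colorbits = rol(colorbits, 1)         # rol colorbits, #1 wc
        carry = colorbits & 1                 # old bit 31 is now bit 0 (and in C)
        sent = (sent << 1) | carry
        nbits -= 1
    return sent

print(hex(shiftout_bits(0x112233)))   # 0x221133
```

The input stays in $RR_GG_BB form, and the stream on the wire comes out as GG, RR, BB.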
    
  • JonnyMac Posts: 9,107
    edited 2013-07-18 20:37
    Tony,

    That looked intriguing so I plugged it into my code and.... no dice. It seems like it should work, but it doesn't.
  • tonyp12 Posts: 1,951
    edited 2013-07-18 20:49
    You have to change the RCL to a ROL, as you have no need to rotate C into the value anyway.

    ROL D,S Rotate D left by S: C= D[31]
    RCL D,S Rotate carry left into D by S: C= D[31]
    :loop                   rcl     colorbits, #1           wc      ' before: C --> lsb, result msb --> C
    :loop                   rol     colorbits, #1           wc      ' after: result msb --> C
    
    I updated the code on page 1.
  • JonnyMac Posts: 9,107
    edited 2013-07-18 21:31
    That worked. I'm going to go ahead and leave it in the code and document it in my column -- you'll get full credit for the trick, of course!
  • RobertDyer Posts: 23
    edited 2013-07-19 16:44
    Glad these posts are here. I'm struggling to write my first PASM code trying to drive a stick of 8 WS2812s from Adafruit. The code appears to be working correctly as I step through the debugger, but doesn't manipulate the output pin when I run it at full-speed. Hopefully I'll find my problem after sifting through some of the code here.

    However, that's not what I'm writing about. Just wanted to kick in a correction for the timing on the chips. All the different timing numbers for the high and low times for the high and low bits made me suspicious. It also looked weird to me that the 0-bit and 1-bit total times were different. So I checked with Adafruit and they agreed that the datasheet probably "lost something in translation". The telling numbers are the "800kHz operation frequency" and the "1.25uS data transfer time" (which happen to match).

    They suggest keeping the 1.25uS bit-time for both 0 and 1, but keeping closer to a 1/3-2/3 and 2/3-1/3 timing ratios. So more like .4uS high with .85uS low for a 0-bit, and .8uS high with .45uS low for a 1-bit. Obviously you guys are successfully using the times in the errant datasheet, but maybe this'll help reliability???

    Robert
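
The arithmetic behind those suggested numbers is easy to check. The sketch below works in nanoseconds and converts to system clock ticks, assuming an 80 MHz clock (an assumption, since the thread does not state the clock speed). The constant names are illustrative.

```python
CLKFREQ = 80_000_000          # assumption: 80 MHz system clock

def ticks(ns):
    """Convert nanoseconds to whole system clock ticks."""
    return ns * CLKFREQ // 1_000_000_000

BIT_NS = 1250                         # 1.25 us total bit time
BIT0_HI_NS, BIT0_LO_NS = 400, 850     # suggested 0-bit: about 1/3 high, 2/3 low
BIT1_HI_NS, BIT1_LO_NS = 800, 450     # suggested 1-bit: about 2/3 high, 1/3 low

# Both bit shapes add up to the same 1.25 us period, which is 800 kHz.
assert BIT0_HI_NS + BIT0_LO_NS == BIT_NS
assert BIT1_HI_NS + BIT1_LO_NS == BIT_NS
assert 1_000_000_000 // BIT_NS == 800_000

for name, ns in [("bit0hi", BIT0_HI_NS), ("bit0lo", BIT0_LO_NS),
                 ("bit1hi", BIT1_HI_NS), ("bit1lo", BIT1_LO_NS)]:
    print(name, ticks(ns))            # 32, 68, 64 and 36 ticks at 80 MHz
```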
  • tonyp12 Posts: 1,951
    edited 2013-07-19 17:08
    I also think it's some type of misprint.
    It would simplify the code a little if you only need to supply two longs for timing, one for the total bit length of 1.25us and one for 0.35us.
    The code below keeps the rising edge at the same interval, independent of code length.
    shiftout                ror     colorbits, #8                   ' to offset the rol 24 below
                            mov     nbits,#24                       ' shift 24 bits (3 x 8) 
                            mov     bittimer,_125us                 ' total bittime  
                            add     bittimer,cnt                    ' init the first one
    
    :loop                   test    nbits,#7                wz      ' at 24, 16 or 8?
                   if_z     rol     colorbits,nbits                 ' shuffle rrggbb to ggrrbb
                            rol     colorbits, #1           wc      ' msb --> C
                            mov     cnt,_35us                       ' sync bit timer for a 0
                   if_c     add     cnt,_35us                       ' double it for a 1
                            add     cnt,cnt
                            or      outa, txmask                    ' tx line 1
                            waitcnt cnt,#0
                            andn    outa, txmask                    ' tx line 0 
                            waitcnt bittimer,_125us                 ' total bit time
                            djnz    nbits, #:loop                   ' next bit 
    
  • JonnyMac Posts: 9,107
    edited 2013-07-19 18:33
    Robert,

    My final code does in fact use values right out of the WS2812 data sheet. I tried other values but it stops the driver from working correctly. This could be due to 3.3v from the Propeller going into the module (technically, 3.3v is below the Vih spec for the chip). I've attached my final driver in the event it might be useful to you.

    As you can see, I've got three Adafruit NeoPixels on an Activity Board and they're doing what I want them to do.
  • Tubular Posts: 4,703
    edited 2013-07-19 19:46
    Hi Robert,

    I think your suspicions are right about the datasheet. However we don't know how they decode the pulses - there's a good chance it involves some kind of voltage averaging (comparing the data in waveform against its own longer term average), and it may not even involve any type of clock (it's a self-clocking signal, after all).

    I'm in the process of testing the WS2812B one parameter at a time ("WS2812B, the missing manual"). I've been looking at the V-I characteristics and thermal performance first (had Adafruit's 40 neopixel shield up at 108 C so far). Next up I want to look at switching thresholds on the Din, and then I'll get onto timing. The prop is plenty fast enough to explore all the timing limits (I'm sure we can make a multi string controller in one cog, in fact).

    As John says we all have slightly different code working fine, so happy to look at yours and offer suggestions if you get really stuck. Otherwise keep going!
    Here's a vid of a propeller proto board driving a few forms http://youtu.be/z6vhASDaWIw

    Let me get to those timing tests. I changed the way the "timing ticks" are calculated from Gavin's original way. I have been able to successfully run faster and slower than recommended, but I didn't record the limits. I think from memory 0.2us and 0.6us worked, but I'll revisit that soon and document it.
  • RobertDyer Posts: 23
    edited 2013-07-22 14:54
    Hey guys,

    Still plugging away at my coding problems. Want to make sure I exhaust my efforts before sending it to you guys. My problem may be that I don't understand how the WS2812s actually work. For synchronous timing purposes I thought I needed to read the cnt register once at the beginning of sending ALL the bits for ALL the LEDs. But if I understand some of the code I'm looking at, that's not what everyone is doing. I see multiple captures of the cnt register in the loops. Where am I going wrong in my thinking?

    Also, how do I insert a snippet of code into a forum post? I tried to "cut and paste" both out of notepad and the Propeller tool, and also tried typing it directly into the editing window, but every time I preview the post, the column spacing is gone whether I use tabs or spaces. How do you guys do it?

    Thanks. Robert
  • tonyp12 Posts: 1,951
    edited 2013-07-22 15:25
    Use the code button in the post editor.
    The WS2812 has different high times for a 0 bit and a 1 bit, and (supposedly) also needs a different total length depending on whether the bit was a 0 or a 1.
    That is why you need multiple waitcnt instructions with different values.

    What you have to learn about waitcnt is that it waits for cnt to equal the target, and then it also adds a value to the target, ready for the next wait (if that is not needed, use #0).
    Using cnt on the destination side (add cnt,cnt) is just a way to use the free shadow register; the real cnt always has to be on the source side.
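
A toy Python model may help show why `waitcnt target, delta` keeps edges evenly spaced even when the code between waits varies in length. It is a simplification: it ignores the time waitcnt itself takes and assumes the target is never missed. The class and function names are invented, and the 100-tick period is just an example value.

```python
MASK32 = 0xFFFFFFFF

class SystemCounter:
    """Toy stand-in for the cog's view of the free-running cnt register."""
    def __init__(self, start=0):
        self.cnt = start & MASK32

    def run(self, code_ticks):
        """Let code_ticks clock ticks pass while other instructions execute."""
        self.cnt = (self.cnt + code_ticks) & MASK32

    def waitcnt(self, target, delta):
        """Wait until cnt == target, then return the next target (target + delta)."""
        self.cnt = target & MASK32
        return (target + delta) & MASK32

def rising_edges(code_times, period=100):
    counter = SystemCounter()
    bittimer = (counter.cnt + period) & MASK32        # mov bittimer, period / add bittimer, cnt
    edges = []
    for code_ticks in code_times:
        counter.run(code_ticks)                       # loop body of varying length
        bittimer = counter.waitcnt(bittimer, period)  # waitcnt bittimer, period
        edges.append(counter.cnt)                     # the next bit starts here
    return edges

print(rising_edges([20, 35, 50]))   # [100, 200, 300]
```

On real hardware, if the loop body ever runs longer than the period, the target has already passed and waitcnt stalls until cnt wraps all the way around, which this model does not show.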
  • RobertDyer Posts: 23
    edited 2013-07-22 15:58
    Sorry Tony, I wasn't clear with my question. I understand about using waitcnt with a future value in the target register, and then adding values to the target register for the next waitcnt instruction. (I use the label CntSave to save cnt as my target register, others use bittimer or counter as their labels).

    But what I noticed is that there are multiple times when people move cnt into their target register DURING the bit-streams for each LED. Unless you allow for it, doesn't reloading the target register with cnt (anywhere in the bit-stream other than the very beginning) throw off the synchronization of the subsequent bits for the LED? Also, doesn't the WS2812 require that there be no extra time between the 24-bits of each LED?

    Robert
  • JonnyMac Posts: 9,107
    edited 2013-07-22 17:38
    Robert,

    Below you'll find the latest code from my driver. This uses Tony's register manipulation trick so that the $RR_GG_BB value passed to the routine gets transmitted as $GG_RR_BB.

    The high side pulse timing is most critical (it establishes the bit value), so I load the timer, take the TX line high, then add in the cnt register; in my mind this helps the code keep the pulse timing as close as possible. Sure there's a little slop on the low side, but this is well within the tolerances of the spec.

    shiftout                ror     colorbits, #8                   ' {1} to offset the rol 24 below
                            mov     nbits, #24                      ' shift 24 bits (3 x 8) 
                            
    :loop                   test    nbits, #%111            wz      ' {2} nbits at 24, 16 or 8?
            if_z            rol     colorbits, nbits                ' if yes, modify colorbits 
                            rol     colorbits, #1           wc      ' msb --> C
            if_c            mov     bittimer, bit1hi                ' set bit timing  
            if_nc           mov     bittimer, bit0hi                
                            or      outa, txmask                    ' tx line 1  
                            add     bittimer, cnt                   ' sync bit timer  
            if_c            waitcnt bittimer, bit1lo                
            if_nc           waitcnt bittimer, bit0lo 
                            andn    outa, txmask                    ' tx line 0             
                            waitcnt bittimer, #0                    ' hold while low
                            djnz    nbits, #:loop                   ' next bit
    
  • tonyp12 Posts: 1,951
    edited 2013-07-22 18:36
    > doesn't reloading the target register with cnt (anywhere in the bit-stream other than the very beginning) throw off the synchronization
    Yes it would, as the length of the code would add around 0.6us; in my code in post #24 I do use a fixed-interval timer.
    The length of the bit is more important, as the LED will sync itself to the rising edge of each bit it gets.