Best practices assembly code for bit operations

Cncjerry · 2012-06-09 13:19

Can somebody point me to what would be considered best practice for the equivalents of the PIC assembly instructions for setting a bit, clearing a bit, and testing a bit in 32-bit registers? I currently initialize a register with a single bit and shift it around, then set or clear with OR, AND, ANDN, or XOR as needed.

Jerry

Mike Green · 2012-06-09 13:44

For PIC instructions, you'll have to consult the PIC manual. For the Propeller, it depends on whether the bit you're interested in is in the low-order 9 bits of the 32-bit long or somewhere else in the value. When it's in the low-order 9 bits, you normally use an immediate value as a bit mask with the OR, ANDN, XOR, and TEST instructions (like: TEST temp,#$100). If you're using the C or Z flags to hold values, the MUXxx instructions are also useful for setting one or more bits. If the bits you're interested in are elsewhere in the long, it's best to define a constant long value (like: value long $1000) and refer to that in your OR, ANDN, XOR, or TEST instructions (like: TEST temp,value).

tonyp12 · 2012-06-09 13:49

OR = set a bit
AND = clear a bit with the zeros in the mask (source field)
But to make it easy, use ANDN to clear a bit with 1s in the mask, as the N inverts the mask.

And then there is MUXC/MUXZ:
they set the bits to the value of C or Z (i.e. 0 or 1), and the mask (source field) tells which bits of the destination to affect.
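
As a rough illustration of the MUX behaviour, here is a small Python model (not Propeller code). The names `mux` and `mux_n` are hypothetical; `flag` stands in for the C or Z flag.

```python
MASK32 = 0xFFFFFFFF   # registers are 32 bits wide

def mux(dest, mask, flag):
    """Model of MUXC/MUXZ dest, mask: copy flag into every dest bit selected by mask."""
    if flag:
        return (dest | mask) & MASK32    # flag = 1: selected bits are set
    return dest & ~mask & MASK32         # flag = 0: selected bits are cleared

def mux_n(dest, mask, flag):
    """Model of MUXNC/MUXNZ: same as mux, but with the flag inverted."""
    return mux(dest, mask, not flag)
```

Bits outside the mask are left alone, which is why these instructions are handy for driving one pin in outa without disturbing the others.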

Cncjerry · 2012-06-09 14:07

So basically I think I understand it then. On the PIC you can test (and set or clear) a specific bit of a register using an immediate value.

Next question: if I have a 32-bit register that I need to shift out on a specific pin, with a clock on another pin, is there an easy way to shift it out nondestructively (to the other bits in the outa register)? There must be serial clock code around, I guess. But I was thinking the cleanest way would be to save outa and dira, flip all of dira to inputs except for the two (clock and data) pins, then load outa with the data, OR in the clock, etc. I would then reload outa and dira.

tonyp12 · 2012-06-09 14:30

Here is a non-destructive version: it shifts the 24-bit test mask instead of the data, MSB first in this case.

DAT
              org

asm_entry     mov       dira,#7                         'make pins 0, 1 and 2 outputs

              mov       outa,#0                         'set all pins to off
              mov       asm_c,cnt                       'prepare for WAITCNT loop
              add       asm_c,asm_cycles
:loop         waitcnt   asm_c,asm_cycles                'wait for next CNT value

              movd      :loop2,serialpnt                'set spot in serial anim table
              add       serialpnt,#1                    'add one long for next time
              cmp       serialpnt,#serialpnt wz         'did it reach the end?
        if_z  mov       serialpnt,#serial               'reset serial pointer
              mov       asm_cnt,bit_test         


:loop2        test      0-0,asm_cnt wz                  '0-0 selfmod code, test serial
              muxnz     outa,#SER                       'answer in z, use it to set Serial pin
              or        outa,#CLK                       'turn pin on  (clock) 
              andn      outa,#CLK                       'turn pin off (clock)
              sar       asm_cnt,#1 wz                   'shift right
        if_nz jmp       #:loop2                         'if not zero, jmp 
              
              or        outa,#LAT                       'turn pin on  (latch) 
              andn      outa,#LAT                       'turn pin off (latch)
              cmp       asm_cycles,asm_cycles2 wz
        if_z  jmp       #:loop        
              sub       asm_cycles,asm_cycles3     
              jmp       #:loop                          'wait for next sample

              

asm_cycles    long      $01FFFF                         'delay timer
asm_cycles2   long      $004FFF                         'minimum delay
asm_cycles3   long      $1000                           'subtract to delay
serial        long      H+J+L+%10
              long      A1+A2+D1+D2+J+M+%01
              long      B+C+D1+D2+E+F+%10<<8
              long      E+F+J+M+%01<<8
              long      B+C+E+F+K+M+%10<<16
              long      H+J+K+M+%01<<16
serialpnt     long      serial                          'always have this line just after serial anim

font          long      A1+A2+B+C+D1+D2+E+F+J+M         '0
              long      B+C                             '1
              long      A1+A2+B+D1+D2+E+G1+G2           '2
              ' and so on
              
bit_test      long      %1<<24                          '24 serial to parallel bits to do.
asm_c         res       1                               'uninitialized variables follow emitted data
asm_cnt       res       1
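
The walking-mask idea in the inner loop can be checked on a PC with a small Python model. This is only a sketch: the pin masks passed as `ser` and `clk` are hypothetical values, and the "receiver" is modelled by recording the data pin while the clock is high.

```python
def shift_out_walking_mask(data, start_mask, outa, ser, clk):
    """Model of the :loop2 loop. Returns (bits_sent, final_outa); data is never modified."""
    sent = []
    mask = start_mask
    while mask:                                    # loop ends once the mask bit is shifted out
        z = (data & mask) == 0                     # test  data,mask wz
        outa = (outa & ~ser) if z else (outa | ser)  # muxnz outa,#SER
        outa |= clk                                # or    outa,#CLK
        sent.append(1 if outa & ser else 0)        # receiver samples while clock is high
        outa &= ~clk                               # andn  outa,#CLK
        mask >>= 1                                 # sar   mask,#1 wz
    return sent, outa
```

A start mask of `1 << (n - 1)` gives exactly n passes, so `1 << 23` clocks out 24 bits, while `1 << 24` makes 25 passes. With a 24-bit shift-register chain the extra leading bit would presumably just fall off the far end.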

Duane Degn · 2012-06-09 14:34

This code shifts bits out from "oByte" and in to "iByte".
"bits" has to be set to 8 beforehand. I think this would work with other sizes of data if the value of "bits" were adjusted.

I doubt any of this code is original to me.

'------------------------------------------------------------------------------------------------------------------------------
'' value of oByte should be set before calling this subroutine
'' returns value of iByte
DAT shiftIO             mov     bitCountdown, bits
                        
                        ror     oByte, bits
:Bit                    shl     oByte, #1 wc         ' shift off the bit to send
                        muxc    outa, mosiMask        
                        or      outa, clockMask
                        test    misoMask, ina wc
                        rcl     iByte, #1            
                        andn    outa, clockMask
                        djnz    bitCountdown, #:Bit
                     
                        
shiftIO_ret             ret

It is destructive to oByte but not the other bits in the outa register.
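
For anyone who wants to trace the routine by hand, here is a rough Python model of the loop. It is a sketch only: `miso_bits` is a hypothetical list standing in for what the device drives on the input pin, one entry per clock, and `bits` is assumed to be between 1 and 31.

```python
MASK32 = 0xFFFFFFFF   # registers are 32 bits wide

def ror32(value, count):
    """Rotate a 32-bit value right by count (1..31) bits, like ROR."""
    return ((value >> count) | (value << (32 - count))) & MASK32

def shift_io(o_byte, miso_bits, bits=8):
    """Model of shiftIO. Returns (mosi_bits, i_byte)."""
    o = ror32(o_byte, bits)              # ror  oByte, bits  (data moves to the top bits)
    i_byte = 0
    mosi_bits = []
    for k in range(bits):
        carry = (o >> 31) & 1            # shl  oByte, #1 wc  (C = old bit 31)
        o = (o << 1) & MASK32
        mosi_bits.append(carry)          # muxc outa, mosiMask
        carry = miso_bits[k]             # test misoMask, ina wc (sampled while clock is high)
        i_byte = ((i_byte << 1) | carry) & MASK32   # rcl iByte, #1
    return mosi_bits, i_byte
```

Sending $A5 puts 1,0,1,0,0,1,0,1 on the output pin, MSB first, and the received bits are assembled MSB first in the same way. The model starts with iByte at zero, which the listing above does not do explicitly.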

Edit: I hadn't seen Tony's post.

Duane Degn · 2012-06-09 14:47

@Tony,

I think your code could be improved (sped up) with some "djnz" statements.

Your "asm_cnt" could be loaded with "#8" and, instead of shifting and then testing, a single "djnz" would take care of catching the end of your countdown.

tonyp12 · 2012-06-09 15:04

As I have to shift the mask anyway, I use it as a 24-value counter.
When it has no value left, i.e. the bit has been shifted out, it should no longer jump back to loop2.
That saves me a long (mov counter,#24) before the loop starts, and speed-wise the difference is negligible.

sar asm_cnt,#1 wz 'shift right
if_nz jmp #:loop2

Duane Degn · 2012-06-09 15:18

tonyp12 wrote: »

As I have to shift the mask anyway

Thanks for explaining, Tony. I just think the "djnz" statement is pretty cool in that it can subtract and jump in four clock cycles (8 cycles when not jumping), so I try to use it whenever I can.

My not understanding your code made me think you could take advantage of it too. I'll need to study your code a bit more and see if I can better understand it.

Cncjerry · 2012-06-09 15:41

Very helpful all around.

Considering I am not using SPI, I2C or any serial other than USB, are there any other alternatives for serial data and clock that would further simplify the code (unused Prop pins?)? The serial shift timing (clock low time) of the downstream device is 7 ns, so the 48-bit tuning word can go out at something greater than 1 MHz.
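
As a back-of-the-envelope check on the speed, the bit rate of a software loop can be estimated in Python. This is a sketch under stated assumptions: an 80 MHz system clock, 4 clocks per instruction, and a six-instruction bit loop like the inner loop posted earlier in this thread. The function names are made up for illustration.

```python
def bit_time_ns(instructions_per_bit, clkfreq_hz=80_000_000, clocks_per_instruction=4):
    """Time for one pass of the bit loop, in nanoseconds (assumed clock figures)."""
    clocks = instructions_per_bit * clocks_per_instruction
    return clocks * 1_000_000_000 / clkfreq_hz

def word_time_us(bits, instructions_per_bit, clkfreq_hz=80_000_000):
    """Time to shift out a whole word, in microseconds."""
    return bits * bit_time_ns(instructions_per_bit, clkfreq_hz) / 1000

per_bit = bit_time_ns(6)          # six instructions per bit
clock_high = bit_time_ns(1)       # clock stays high for one instruction
word = word_time_us(48, 6)        # the 48-bit tuning word
print(per_bit, clock_high, word)
```

Under those assumptions each bit takes 300 ns (roughly 3.3 Mbit/s) and the clock pulse is 50 ns wide, which is comfortably longer than a 7 ns minimum.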

pjv · 2012-06-09 17:40

Hi Cncjerry;

I think the following will work as a 32-bit shiftin/shiftout (MSb first) with a single register (called Shifter) preloaded with your value.
The InMask bit must have an IN direction, and the OutMask bit must have an OUT direction.

              
Start        mov     LoopCtr,#32             'number of bits to shift
Loop         and     InMask,ina      wc,nr   'get the (single) input bit into the carry
             rcl     Shifter,#1      wc      'pull the carry into bit0 and put bit31 into the carry
             muxc    outa,OutMask            'output the carry to the output port
             or      outa,ClockMask          'set the clock bit high
             andn    outa,ClockMask          'set the clock bit low
             djnz    LoopCtr,#Loop           'continue for 32 passes

ClockMask    long      1                       'portbit for clock
InMask       long      2                       'portbit for input
OutMask      long      4                       'portbit for output
LoopCtr      long      0                       'local count variable
Shifter      long      0                       'shift register

If the single instruction clock is too fast, then you can separate the clockset and clockclear instructions by locating them elsewhere inside the loop.
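
To see how the single register serves as both the send and the receive buffer, here is a rough Python model of the loop (a sketch, not Propeller code). `in_bits` is a hypothetical list of the values seen on the input pin, one per pass, and the helpers `to_bits` and `from_bits` exist only for the example.

```python
MASK32 = 0xFFFFFFFF   # registers are 32 bits wide

def to_bits(value, width=32):
    """Split a value into a list of bits, MSb first."""
    return [(value >> (width - 1 - k)) & 1 for k in range(width)]

def from_bits(bits):
    """Assemble a list of bits (MSb first) back into an integer."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value

def shift_in_out(shifter, in_bits):
    """Model of the loop. Returns (out_bits, final_shifter)."""
    out_bits = []
    for in_bit in in_bits:
        carry = in_bit                               # and  InMask,ina wc,nr
        msb = (shifter >> 31) & 1
        shifter = ((shifter << 1) | carry) & MASK32  # rcl  Shifter,#1 wc
        carry = msb                                  # C now holds the old bit31
        out_bits.append(carry)                       # muxc outa,OutMask
    return out_bits, shifter
```

After 32 passes the original contents of Shifter have gone out MSb first, and Shifter holds the 32 bits that came in.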

Cheers,

Peter (pjv)

Cncjerry · 2012-06-09 22:49

Peter - nice and neat code. Thanks

Duane Degn · 2012-06-10 08:27

Peter,

That's some nice code.

FYI, the "test" statement is an "and" without writing the result, so your:

and  InMask, ina wc, nr

could be written:

test InMask, ina wc

I mainly say this in case others here are confused about why one of us uses test and one uses and (as I would likely have been a few days ago).

I think there may be some devices that cannot be read from and written to during the same phase of the clock pulse. I noticed that I set the output pin prior to the rising clock edge and read from the device after the clock goes high, but after checking the datasheet (for the Nordic nRF24L01+) I see that I could also read prior to the rising clock edge.

I can't think of any SPI devices off-hand where your code couldn't be used, but I wouldn't be surprised if they exist.

I think you just made it possible to speed up several of my drivers, thank you.

Mike Green · 2012-06-10 08:49

There are faster techniques, but they're harder to understand. One such technique uses one of the cog counters to generate the clock while the other counter is used as a shift register to place the data on an I/O pin. This requires careful synchronization of the instructions being executed with the actions of the counters. Some of the high speed SD card routines use this technique. Another scheme uses the video generator to do the shifting. Since the video generator can operate in VGA mode where an arbitrary 8-bit (or fewer) value gets produced on a group of I/O pins at a user-set clock rate and this can all be driven by a series of 2-bit pixels selecting one of four 8-bit values to output, these can represent the two states of a clock value plus the two states of a data value. Three pixels can contain the data value, hold time, and clock pulse values. This requires some setup, but can deliver 10+ bits of serial data per WAITVID instruction every 75ns.
