Byte Array In Cog — Parallax Forums

Byte Array In Cog

JonnyMac Posts: 9,191
edited 2012-12-29 20:36 in Propeller 1
I have a PASM driver that I'd like to add a buffer to. In short, I'd like to create a block of longs but access it as an array of bytes. I'm sure I could work out the read/write access code myself, but on the chance someone else has tackled this task, I'd love to see it.

Comments

  • Mike G Posts: 2,702
    edited 2012-12-27 17:46
    I built some simple PASM string parsers a while back for the Spinneret. The idea was to build a static parser in a COG as a learning exercise. The workspace array buffers data for integer-to-string and string-to-integer conversions.

    http://code.google.com/p/spinneret-web-server/source/browse/trunk/MultiSocketServer_MikeG/StringMethods.spin

    Click the "View raw file" link in the File Info box on the right side of the page.
  • Bill Henning Posts: 6,445
    edited 2012-12-27 18:52
    Hi Jon,

    While this is possible, I recommend storing the bytes in the hub - as packing/unpacking bytes would be more complicated.

    Using the hub would also be faster...
    JonnyMac wrote: »
    I have a PASM driver that I'd like to add a buffer to. In short, I'd like to create a block of longs but access it as an array of bytes. I'm sure I could work out the read/write access code myself, but on the chance someone else has tackled this task, I'd love to see it.
  • JonnyMac Posts: 9,191
    edited 2012-12-27 20:24
    Bill Henning wrote: »
    Using the hub would also be faster...

    I think very highly of your opinion, Bill, so I'm going to go with the hub.


    @Mike: I'm still going to have a look at your code as I think the exercise may be worthwhile, anyway; even if I don't use it on this particular project.
  • JonnyMac Posts: 9,191
    edited 2012-12-27 22:35
    In the end, the code was actually pretty easy (btw, I've made no attempt to optimize). That said, I have to test it now. In the event you have a moment to look over my logic, your feedback is appreciated.

    Read routine
    ' Read byte from cog array
    ' -- "cbidx" is byte index to read from, 0 to 63
    ' -- "bval" is byte read
    
    read                    mov     t1, #cbuf                       ' point to cbuf[0]         
                            mov     t2, cbidx                       ' make copy of byte index  
                            shr     t2, #2                          ' convert to long index    
                            add     t1, t2                          ' convert to array address 
                            movs    :rdcell, t1                     ' fix mov instruction      
                            nop                                                                
    :rdcell                 mov     bval, 0-0                       ' read long holding byte   
                            mov     t2, cbidx                       ' make copy of byte index                       
                            and     t2, #%11                        ' convert to byte pos in long   
                            shl     t2, #3                          ' convert to bit count   
                            shr     bval, t2                        ' move target to lsb       
                            and     bval, #$FF                      ' clean up  
    read_ret                ret
    


    Write routine
    ' Write byte to cog array
    ' -- "cbidx" is byte index to write to, 0 to 63
    ' -- "bval" is byte written
    
    write                   mov     t1, #cbuf                       ' point to cbuf[0]
                            mov     t2, cbidx                       ' make copy of byte index
                            shr     t2, #2                          ' convert to long index
                            add     t1, t2                          ' convert to array address
                            movs    :rdcell, t1                     ' fix mov instruction
                            nop
    :rdcell                 mov     t3, 0-0                         ' read long holding byte
                            mov     t2, cbidx                       ' make copy of byte index                       
                            and     t2, #%11                        ' convert to byte pos in long   
                            shl     t2, #3                          ' convert to bit count   
                            mov     t4, #$FF                        ' create bit mask
                            shl     t4, t2                          ' align with target byte
                            andn    t3, t4                          ' clear old byte
                            and     bval, #$FF                      ' clean up
                            shl     bval, t2                        ' adjust target to byte position
                            or      bval, t3                        ' put into long cell
                            movd    :wrcell, t1                     ' fix move instruction
                            nop
    :wrcell                 mov     0-0, bval                       ' update array
    write_ret               ret
    


    Once it's fully tested I'll do speed tests to compare using a byte array in the hub.
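
    A minimal usage sketch for the two routines above, untested like the originals: it stores one byte and reads it back. The `demo` label, the test values, and the `res` declarations are assumptions for illustration (16 longs matches the stated index range of 0 to 63); they are not from the post.

    ```spin
    demo                    mov     cbidx, #5                       ' byte 5 = byte 1 of cbuf[1]
                            mov     bval, #$A5                      ' value to store
                            call    #write                          ' on return bval holds the whole merged long
                            call    #read                           ' cbidx is unchanged; bval should be $A5 again

    ' ... read and write routines as posted above go here ...

    cbuf                    res     16                              ' assumed size: 16 longs = 64 bytes
    cbidx                   res     1
    bval                    res     1
    t1                      res     1
    t2                      res     1
    t3                      res     1
    t4                      res     1
    ```

    Because write shifts bval into position and ORs the rest of the long into it, a caller that still needs the byte after the call has to keep its own copy.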
  • StefanL38 Posts: 2,292
    edited 2012-12-28 01:18
    @Bill: I'm no expert on PASM, and with my quarter-knowledge I would have thought PASM is faster, because hub RAM access needs 7-22 clock cycles while each PASM instruction needs only 4 clock cycles.

    Does this mean that extracting a byte from a long in cog RAM needs more clock cycles than retrieving the byte from hub RAM, because it takes more instructions?

    best regards
    Stefan
  • JonnyMac Posts: 9,191
    edited 2012-12-28 01:33
    As you can see in my first cut of the code, it does take several instructions to extract a byte from a long in the cog ram, and even more if you want to write a byte. The _only_ reason to do this, I think, is to protect the array from the user by keeping it out of the hub.

    Even if I want it as a PASM subroutine, reading a byte from hub ram is far easier than extracting from a cog long as I did above.
    read                    mov     t1, bufpntr                     ' point to hub buf[0]
                            add     t1, bidx                        ' point to hub buf[bidx]
                            rdbyte  bval, t1                        ' bval := buf[bidx]
    read_ret                ret
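
    For comparison, the matching hub write is just as short. This is a sketch under the same assumptions as the read (bufpntr holds the hub address of buf[0], bidx the byte index); the routine name is illustrative and the code is untested.

    ```spin
    write                   mov     t1, bufpntr                     ' point to hub buf[0]
                            add     t1, bidx                        ' point to hub buf[bidx]
                            wrbyte  bval, t1                        ' buf[bidx] := bval (low 8 bits only)
    write_ret               ret
    ```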
    
  • Rayman Posts: 14,826
    edited 2012-12-28 05:09
    Bill made an interesting point... using rdbyte could be faster.

    Still, I think there must be some cases where having the bytes in the COG would be faster...
    I do have some code where I had an array of words starting at cog address 0.
    Starting from address 0 saves an instruction or two when accessing...

    But, I didn't try to have 2 words in each long, so it wasn't compacted.
    So, if you can't have one long for every byte in your array, then Bill is probably right.
  • Bill Henning Posts: 6,445
    edited 2012-12-28 09:30
    Hi Jon,

    Your code is absolutely correct - and faster than using cog longs :)

    A small modification makes it potentially faster for subsequent reads:
    read                    mov     t1, bufpntr                     ' point to hub buf[0]
                            add     t1, bidx                        ' point to hub buf[bidx]
    read_next               rdbyte  bval, t1                        ' bval := buf[bidx]
                            add     t1, #1                          ' point to next byte for read_next
    read_next_ret
    read_ret                ret
    

    This way, you call "read" to get the first byte, and "read_next" to get following bytes.
    JonnyMac wrote: »
    As you can see in my first cut of the code, it does take several instructions to extract a byte from a long in the cog ram, and even more if you want to write a byte. The _only_ reason to do this, I think, is to protect the array from the user by keeping it out of the hub.

    Even if I want it as a PASM subroutine, reading a byte from hub ram is far easier than extracting from a cog long as I did above.
    read                    mov     t1, bufpntr                     ' point to hub buf[0]
                            add     t1, bidx                        ' point to hub buf[bidx]
                            rdbyte  bval, t1                        ' bval := buf[bidx]
    read_ret                ret
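
    A sketch of how the read / read_next pair from this post might be used, here summing eight consecutive bytes. The names sum8, total and count are hypothetical, and it assumes nothing else touches t1 between the calls. Untested.

    ```spin
    sum8                    mov     bidx, #0                        ' start at buf[0]
                            call    #read                           ' bval := buf[0]; t1 now points at buf[1]
                            mov     total, bval                     ' start the running sum
                            mov     count, #7                       ' seven more bytes to fetch
    :next                   call    #read_next                      ' bval := next byte; t1 advances by one
                            add     total, bval                     ' accumulate
                            djnz    count, #:next                   ' loop until all eight bytes are summed
    ```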
    
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2012-12-28 10:02
    Jon,

    Here's a mod that tightens up your original read code a bit:
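    rd_byte       mov       t1,index                ' copy byte index
                  ror       t1,#2                   ' long offset in low bits, byte position in bits 31..30
                  add       t1,#byte_buf            ' low 9 bits = cog address of target long
                  movs      load_long,t1            ' patch source field of load_long
                  rol       t1,#5                   ' bits 4..0 = byte position * 8 (shift count)
    load_long     mov       value,0-0               ' read long holding byte
                  shr       value,t1                ' move target byte to lsb (only t1[4..0] is used)
                  and       value,#$ff              ' mask off the other bytes
    rd_byte_ret   ret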
    
    byte_buf      res       32
    t1            res       1
    index         res       1
    value         res       1
    

    BTW, if index can be volatile, you can just use it instead of t1 to eliminate one more instruction.

    -Phil
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2012-12-28 11:11
    Here's some more compact code that does both the read and write (untested):
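    rd_byte       call      #access                 ' rotate target byte into bits 7..0
                  and       rd_value,#$ff           ' keep only the target byte
    rd_byte_ret   ret
    
    wr_byte       call      #access                 ' rotate target byte into bits 7..0
                  andn      rd_value,#$ff           ' clear old byte
                  or        rd_value,wr_value       ' insert new byte (wr_value must be 0..$ff)
                  movd      store_long,load_long    ' reuse long address from load_long's source field
                  rol       rd_value,t1             ' rotate long back to its original alignment
    store_long    mov       0-0,rd_value            ' update array
    wr_byte_ret   ret
    
    access        mov       t1,index                ' copy byte index
                  ror       t1,#2                   ' long offset in low bits, byte position in bits 31..30
                  add       t1,#byte_buf            ' low 9 bits = cog address of target long
                  movs      load_long,t1            ' patch source field of load_long
                  rol       t1,#5                   ' bits 4..0 = byte position * 8 (rotate count)
    load_long     mov       rd_value,0-0            ' read long holding byte
                  ror       rd_value,t1             ' rotate target byte down to bits 7..0
    access_ret    ret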
    
    byte_buf      res       32
    t1            res       1
    index         res       1
    rd_value      res       1
    wr_value      res       1
    

    -Phil
  • Bill Henning Posts: 6,445
    edited 2012-12-28 12:48
    Hi Stefan,

    Exactly, it is because it takes more instructions. To be faster, you'd have to be able to read a byte (or write one) from/to a long in the cog in less than 20-22 cycles (fewer than 5 regular instructions).
    StefanL38 wrote: »
    @Bill: I'm no expert on PASM, and with my quarter-knowledge I would have thought PASM is faster, because hub RAM access needs 7-22 clock cycles while each PASM instruction needs only 4 clock cycles.

    Does this mean that extracting a byte from a long in cog RAM needs more clock cycles than retrieving the byte from hub RAM, because it takes more instructions?

    best regards
    Stefan
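
    To put rough numbers on this, here is a count of the instructions in the read routines posted in this thread. It assumes each regular instruction (including nop and ret) costs 4 clocks and takes the hub access time h as the 7 to 22 clocks quoted above; the call itself adds the same 4 clocks to every variant and is left out. These are instruction counts, not measurements.

    ```latex
    \begin{aligned}
    T_{\text{cog, JonnyMac read}} &= 13 \times 4 = 52 \\
    T_{\text{cog, PhiPi read}}    &= 9 \times 4 = 36 \\
    T_{\text{hub read}}           &= 3 \times 4 + h = 12 + h,
      \qquad 7 \le h \le 22 \;\Rightarrow\; 19 \le T_{\text{hub read}} \le 34
    \end{aligned}
    ```

    On these assumptions even the slowest hub read (34 clocks) comes in just under the tightest cog version in the thread (36 clocks).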
  • StephenMoore Posts: 188
    edited 2012-12-29 19:57
    loop          call      #rd_byte
                  call      #wr_byte
                  jmp       #loop
    
    
    rd_byte       call      #access
                  and       rd_value,#$ff
    rd_byte_ret   ret
    
    
    wr_byte       call      #access
                  andn      rd_value,#$ff
                  or        rd_value,wr_value
                  movd      store_long,load_long
                  rol       rd_value,t1
    store_long    mov       0-0,rd_value
    wr_byte_ret   ret
    
    
    access        mov       t1,index
                  ror       t1,#2
                  add       t1,#byte_buf
                  movs      load_long,t1
                  rol       t1,#5    
    load_long     mov       rd_value,0-0
                  ror       rd_value,t1
    access_ret    ret
    
    
    byte_buf      res       32
    t1            res       1
    index         res       1
    rd_value      res       1
    wr_value      res       1
    

    I am trying to understand the tricks used in this bit of code and have a couple of questions:

    1) I understand < movs load_long, t1 > but I do not understand < movd store_long, load_long >. How does the index address get properly loaded for the write function?

    2) I tried to use PASD to look at register values during these code moves but got stymied when looking at the self-modifying code, because PASD does not break out the 9-bit literal values very readily. Are there any suggestions for how to better use the debugger to interpret self-modified instructions?

    This has been a good tutorial for me.

    Regards,

    sm
  • kuroneko Posts: 3,623
    edited 2012-12-29 20:24
    StephenMoore wrote: »
    1) I understand < movs load_long, t1 > but I do not understand < movd store_long, load_long >. How does the index address get properly loaded for the write function?
    t1 - being the index address - will be stored in load_long[8..0]. To avoid re-calculation the final write simply borrows the stored value (it knows it has been in access before) from that location and uses movd to store load_long[8..0] (== t1initial) in store_long[17..9].
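
    An annotated sketch of what the two patching instructions do, using the Propeller 1 instruction layout (bits 17..9 are the destination field, bits 8..0 the source field). It reuses the labels from the code under discussion and adds nothing beyond the comments.

    ```spin
                  movs      load_long,t1            ' load_long[8..0]   := t1[8..0]        (address of the target long)
                  movd      store_long,load_long    ' store_long[17..9] := load_long[8..0] (same address, reused)

    load_long     mov       rd_value,0-0            ' after movs: source field holds the long's address
    store_long    mov       0-0,rd_value            ' after movd: destination field holds the same address
    ```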
  • StephenMoore Posts: 188
    edited 2012-12-29 20:36
    OK, that makes sense. My understanding is this: with either movs or movd, only the lowest 9 bits of the value register get loaded, into the 9-bit source field (movs) or destination field (movd) of the target instruction.

    Thank you.

    sm