Understanding Assembler - Page 2 — Parallax Forums

Understanding Assembler


Comments

  • KaioKaio Posts: 266
    edited 2007-06-18 18:51
    Jan,

    1) If the methods never call each other, I suggest using variant c). You then need only one variable for all subroutines.

    The difference between b) and c) is that the res keyword may only be used at the end of your code, after all long, word or byte declarations. Otherwise your code will not work properly, and the compiler does not warn you about this.

    So you can only use b) to declare a local variable.

    2) Your variables x1, x2 and so on are temporary variables, so you should name them tmp1 and so on; x is more commonly used for coordinates. That makes the code more readable for others.

    The main goal should be to minimize your code. So when you don't need separate variables for different routines, try to share the temporary variables. I would recommend writing a comment on each routine that describes its usage of variables.
    janb said...

    Does it make sense to declare 2 local variables of the same name ':x1' in both subroutines
    to save one register ?
    I don't know what you mean. The Propeller doesn't have separate registers; every long in Cog memory is used as a register or variable. So if you declare 2 (local) variables, they take 2 longs of Cog memory.


    @Mike
    janb said...

    Assume I have 3 subroutines in ASM, called...

    Thomas
  • janbjanb Posts: 74
    edited 2007-06-24 03:49
    Hi,
    could you help me to understand why the following code works correctly:
    after certain pins reach desired state it reads 2 bytes from upper pins 16-31 and stores the value in
    3 consecutive local variables o21,o22,o23
    This is what I want.
    However ...
    '-------------------------
    CaptureFrame_slow
            waitpeq   frameState, frameMask   'wait for new frame
            waitpeq   lineState, lineMask                 'wait for start condition      
            mov n,#3
    :newPix  waitpeq   pixState, pixMask                 'wait for start condition
            mov       x, ina                  'store pins state
            waitpeq   zero, pixMask
            shr x,#16
    :save   mov o21,x
            add  :save,d_inc  'increment destination in instruction above  
            djnz      n, #:newPix                 'go for next transition 
    d_inc  LONG  $0000_0200      
    CaptureFrame_slow_ret ret
    

    ....
    

    o20      long 0
    o21      long 0
    o22      long 0
    o23      long 0
    


    the full code is much longer (~200 lines). If I move the definition of
    d_inc  LONG  $0000_0200
    way back to the end of the ASM code (I wanted to reuse this constant in many places to save COG memory)
    the code seems not to increment the address in the ':save' instruction. It overwrites the value of the variable o21 3 times. Why is that?
    Thanks
    Jan
  • Mike GreenMike Green Posts: 23,101
    edited 2007-06-24 04:37
    janb,
    One obvious mistake is that you've put d_inc (a constant) between two instructions where it will be executed as an instruction. Move it at least to after the CaptureFrame_slow_ret. It shouldn't really matter though since this value would be executed as a NOP. Also, do you reinitialize the instruction at :save?

    What you've posted (at least the store part) ought to work. I don't see anything else that's obvious. I'd have to see the whole thing to understand why moving the constant causes a problem. If you have some RES statements before the constant, that would throw things off.
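For readers following the self-modifying store: the constant $0000_0200 is 1 << 9, and the destination field of a Propeller instruction sits in bits 17..9, so each add moves the destination one cog address further. Below is a minimal Python sketch of that arithmetic only. The instruction word is a made-up value with just the destination and source fields filled in (not a real assembled MOV), and the helper names are hypothetical.

```python
D_INC = 0x0000_0200  # 1 << 9: one step in the destination field


def dest_field(instr):
    """Return the 9-bit destination field (bits 17..9) of a 32-bit word."""
    return (instr >> 9) & 0x1FF


def src_field(instr):
    """Return the 9-bit source field (bits 8..0) of a 32-bit word."""
    return instr & 0x1FF


def make_word(dest, src):
    """Build a word with only the destination and source fields set (illustrative)."""
    return ((dest & 0x1FF) << 9) | (src & 0x1FF)


# Pretend o21 lives at cog address 0x100 and x at 0x020 (assumed addresses).
instr = make_word(dest=0x100, src=0x020)

dests = []
for _ in range(3):
    dests.append(dest_field(instr))          # where this pass would store
    instr = (instr + D_INC) & 0xFFFF_FFFF    # the "add :save, d_inc" step

print([hex(d) for d in dests])
```

If the word fetched for d_inc does not actually contain $0000_0200 (for example because the label resolves to a different cog address than the data was loaded to), the destination never advances and every pass stores to the same location, which matches the symptom described above.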
  • janbjanb Posts: 74
    edited 2007-06-24 05:04
    Hi Mike,
    the full OBJ code is attached. Unfortunately it is not short, nor can one run it - more objects are needed.

    The problematic subroutine is
    CaptureFrame_slow
            waitpeq   frameState, frameMask   'wait new frame
            waitpeq   lineState, lineMask  'wait for new line 
            mov n,lineLen
    :newPix waitpeq   pixState, pixMask 'wait for new pixel 
            mov x, ina   'acquire new value
            and x,dataBusMask
            shr x,dataBusPin0
    :save2   mov CogArr,x
            add  :save2,d_inc  'increment destination in instruction above
            waitpeq   zero, pixMask
            djnz      n, #:newPix                 'go for next transition 
    CaptureFrame_slow_ret ret
    d_inc  LONG  $0000_0200    
    

    I did move d_inc below the subroutine, as you suggested.
    The code as is works.
    But if I move this one line to the end, the line
    add  :save2,d_inc  
    

    does nothing and data piles up on the first address of 'CogArr'.
    If you have time, could you advise me whether the order of local/global variables is optimal?
    I started with your earlier advice to declare local temporary variables in every subroutine, but this reduced the available space for my big local array - so now there are a few working variables declared at the end. E.g. ':n' and 'n' .

    Also the lines
            mov x, ina   'acquire new value
            and x,dataBusMask
            shr x,dataBusPin0
    
    

    are meant to acquire one value from a bus made of K pins starting from pin0 - is it optimal?
    Thanks
    Jan
  • Mike GreenMike Green Posts: 23,101
    edited 2007-06-24 05:42
    1) You do have variables at the end in the pattern "long ... long ... res ... long ... long" and this is not legal. The RES statements must come last (although a FIT statement can follow it). No LONG statements or instructions may follow a RES in that DAT section. There are some exceptions to that rule, but that's a longer discussion.

    2) You're using :d_inc for many of the copies of $0000_0200. That won't work if you want to use a single copy of the constant. Put the "d_inc LONG $0000_0200" at the end (before the RES) and change all the ":d_inc" to just "d_inc".
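To illustrate why a LONG after a RES goes wrong, here is a small Python model, under the assumption that the assembler advances its cog address for RES but emits no data into the image that gets loaded into the cog. The function and the statement tuples are hypothetical and only mimic the address bookkeeping, not the real tool.

```python
def assemble(statements):
    """Toy model of address assignment.

    statements: list of (label, kind, value), kind is 'long' or 'res'.
    Returns (label -> address the assembler assumes, list of emitted longs).
    """
    symbols, image, addr = {}, [], 0
    for label, kind, value in statements:
        symbols[label] = addr
        if kind == "long":
            image.append(value)  # data is emitted into the loaded image
            addr += 1
        else:                    # 'res': reserve `value` longs, emit nothing
            addr += value
    return symbols, image


# Pattern "long ... res ... long": the constant follows a RES.
bad = [("a", "long", 11), ("buf", "res", 4), ("d_inc", "long", 0x200)]
bad_symbols, bad_image = assemble(bad)
bad_loaded_at = bad_image.index(0x200)   # where the value really ends up

# Constant placed before the RES.
good = [("a", "long", 11), ("d_inc", "long", 0x200), ("buf", "res", 4)]
good_symbols, good_image = assemble(good)
good_loaded_at = good_image.index(0x200)

print(bad_symbols["d_inc"], bad_loaded_at)    # label and data disagree
print(good_symbols["d_inc"], good_loaded_at)  # label and data agree
```

In the first layout every instruction that names d_inc reads a cog location that never received the constant, so the add changes nothing useful.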
  • deSilvadeSilva Posts: 2,967
    edited 2007-06-24 12:00
    I have these comments wrt assembly style:

    - When you modify an instruction, it is good style to clearly mark the modified part as 0 (or #0 if appropriate) - some even use 0-0.
    So you will always be reminded that you have to initialize it; though there will be rare cases when code is performed only once and a static preset will do....

    - There is no such concept as "local variables" in a COG, just "local names". These NAMES are valid between two non-local names only! This can lead to dangerously inserting variables (LONG) within the instruction flow...
    The idea behind this practice - to save space - however is absolutely wrong. The idea to protect local static variables by hiding their names on the other hand is good and valid. Because of the JMPRET conventions this can only be accomplished using an additional "jump-over".

    - Saving memory (i.e. registers!!) by overlaying local variables (i.e. using registers in the old fashioned way) is a tricky practice needing much discipline, experience and a good naming practice! Nested routines will always play tricks with your best intentions :)
  • CardboardGuruCardboardGuru Posts: 443
    edited 2007-06-24 14:17
    deSilva said...
    - Saving memory (i.e. registers!!) by overlaying local variables (i.e. using registers in the old fashioned way) is a tricky practice needing much discipline, experience and a good naming practice! Nested routines will always play tricks with your best intentions :)

    It's easy to run out of space on a cog. The two approaches I've used up to now to reuse variable memory have been:

    The register approach.
    r0 RES 1
    r1 RES 1
    r2 RES 2
    


    Then taking pains to work out where each can be reused.

    And the multiple labels on shared memory approach.
    foo
    bar res 1
    x
    trumpet res 1
    y
    trombone res 1
    


    This makes for more clarity in reading the code, but far more difficulty in working out which locations can be reused.

    But it occurs to me now that this is purely a matter of scope, which a high level language will deal with by means of having local variables on the stack, and safely reusing memory that way. So how about using a naming convention to indicate scope. e.g. When writing a TV driver, you might have the following structure:

    scanline loop
        frame loop
            set background
        process sprites
            do sprite
    
    



    So how about naming vars starting with the scope.

    Scanline_count
    ScanlineFrame_count
    ScanlineFrame_foo
    ScanlineSprites_count
    ScanlineSpritesSprite_bar
    



    So the rule for whether you can share a memory location is to compare the scope part of each varable name, and you can reuse if it is different. Different, not just shorter.

    Scanline_count res 1
    
    ScanlineFrame_count
    ScanlineSprites_count res 1
    
    ScanlineFrame_foo
    ScanlineSpritesSprite_bar res 1
    



    What do you think? It won't appeal to those who like their assembly code terse, that's for sure. And scope names would need to be short to fit in the 30 char identifier limit. I think it might be a workable solution to the problem. But I'd have to try it out for real to get a feel for how useful it is.

    Post Edited (CardboardGuru) : 6/24/2007 2:24:24 PM GMT
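The sharing rule above ("different, not just shorter") can be read as: two variables may share a location only if neither scope is a prefix of the other. A rough Python sketch of that check follows, assuming names of the form Scope_variable where the scope is a run of capitalized words; the function names are hypothetical, and a name without a scope prefix is treated as file scope and never shared.

```python
import re


def scope_parts(name):
    """Split 'ScanlineSpritesSprite_bar' into ['Scanline', 'Sprites', 'Sprite'].

    A name without an underscore has no scope prefix (file scope): returns [].
    """
    scope = name.split("_", 1)[0] if "_" in name else ""
    return re.findall(r"[A-Z][a-z0-9]*", scope)


def can_share(a, b):
    """True if the two variables may occupy the same cog location.

    They may share only when the scopes differ somewhere within their common
    length, i.e. neither scope is a prefix of (or equal to) the other.
    """
    pa, pb = scope_parts(a), scope_parts(b)
    n = min(len(pa), len(pb))
    return pa[:n] != pb[:n]


print(can_share("ScanlineFrame_count", "ScanlineSprites_count"))
print(can_share("ScanlineFrame_foo", "ScanlineSpritesSprite_bar"))
print(can_share("Scanline_count", "ScanlineFrame_count"))
```

The check only encodes the naming convention; it still relies on the programmer having assigned scopes that reflect the real call structure.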
  • deSilvadeSilva Posts: 2,967
    edited 2007-06-24 14:56
    I think this will not work so easily :)

    (1) There can be variables in a routine being used statically, i.e. some "local memory"; they are kind of global and must be left alone under all circumstances (thus the need for a special naming convention)
    (2) In most cases a "static call tree" can be constructed, i.e. a simple graph showing who calls whom. As recursion is not straightforward with the Propeller, it is generally assumed that this is always possible, but it is not!
    Consider a routine A calling B, depending on a flag - then, using the same global flag, B calls C.
    Now with the flag being unset, A could call C and C then B without violation of any law except the law of good programming practice :)

    Thus it is necessary for any rules to have an unambiguous static call tree in the first place!

    Under those circumstances, all routines can use any variables not in their calling path; this is similar to what CardboardGuru has in mind

    (3) Alas, issues already start earlier, within longer routines, e.g. using "I" and "J" as loop indexes.... Note however that there is no help for this even in the highest-level languages (except those of the functional kind!).

    I sometimes used extensive documentation of the kind:
    "Using: R1, R2, R3"
    and
    "Free: R3"

    which could help, when being painstakingly updated with each change...
  • CardboardGuruCardboardGuru Posts: 443
    edited 2007-06-24 15:28
    deSilva said...
    (1) There can be variables in a routine being used statically, i.e. some "local memory"; they are kind of global and must be left alone under all circumstances (thus the need for a special naming convention)

    Then they'd have larger scope. Such as the Scanline_count variable I gave. Some variables will have file scope - they'd not have any scope prefix, and couldn't be reused for anything else.
    deSilva said...
    (2) In most cases a "static call tree" can be constructed, i.e. a simple graph showing who calls whom. As recursion is not straightforward with the Propeller, it is generally assumed that this is always possible, but it is not!
    Consider a routine A calling B, depending on a flag - then, using the same global flag, B calls C.
    Now with the flag being unset, A could call C and C then B without violation of any law except the law of good programming practice :)
    Thus it is necessary for any rules to have an unambiguous static call tree in the first place!

    True. It wouldn't work automatically, you'd have to be intelligent about what scope you gave to a variable. It's just a notational way of expressing what you know about how long a variable's value remains used. Where there is doubt, you'd give the widest scope to a variable. Perhaps file scope - which means may not be reused for anything.

    It wouldn't be a replacement for the dynamic scoping that comes with a stack. But then I haven't yet seen any prop code complicated enough that it would need dynamic scoping.
  • janbjanb Posts: 74
    edited 2007-06-25 03:08
    Thanks Guys!
    this one will be a simple one: how do I multiply two values in assembler?
    Assume registers x & y hold two arbitrary (not too large) integers.
    How do I get their product into register z?
    Do I need to make a loop and add X to register Z Y times, like in kindergarten?
    Jan
  • Mike GreenMike Green Posts: 23,101
    edited 2007-06-25 03:17
    There was a long posting by Chip soon after the Propeller was released containing, among other things, multiplication, division, and square root routines. I don't remember the link to it, but I've attached a copy I had on hand. It was under the subject "Propeller Guts".
  • janbjanb Posts: 74
    edited 2007-06-25 03:32
    I see,
    so it is like in kindergarten, except that instead of adding Y times you shift & sum for the powers of 2 present in Y.
    Very tricky!
    Thanks
    Jan
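The shift-and-sum idea can be sketched in a few lines of Python. This is a generic illustration of the technique for unsigned operands, not a transcription of the routine in the attached "Propeller Guts" document; the function name and the 16-bit default width are assumptions.

```python
def shift_add_multiply(x, y, bits=16):
    """Multiply two unsigned integers of at most `bits` bits by shift-and-add.

    For every set bit k of y, x shifted left by k is added to the product,
    so the loop runs `bits` times regardless of the size of y.
    """
    product = 0
    for _ in range(bits):
        if y & 1:          # current lowest bit of y is set
            product += x   # add the correspondingly shifted x
        x <<= 1            # next power of two
        y >>= 1            # move on to the next bit of y
    return product


print(shift_add_multiply(123, 45))
```

The loop count depends on the operand width rather than on the value of Y, which is why it beats adding X to Z Y times for anything but tiny multipliers.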
  • Paul BakerPaul Baker Posts: 6,351
    edited 2007-06-25 04:50
    The next chip will have a single-cycle multiply instruction.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Paul Baker
    Propeller Applications Engineer

    Parallax, Inc.
  • janbjanb Posts: 74
    edited 2007-06-25 13:19
    Hi Paul,
    Good to hear.
    I did count - this 'mul' emulation costs ~55 instructions (~250 clock ticks) - much more than the 1 instruction for an 'add'.
    So I'm forced to rethink my ASM algo.
    An obvious question: when? Will it still be a Propeller or something else/different?
    Thanks for the feedback
    jan
  • Paul BakerPaul Baker Posts: 6,351
    edited 2007-06-25 18:03
    No answer to when. It will be code compatible with the current Propeller, but there will be enough enhancements that there won't be absolute compatibility.

  • deSilvadeSilva Posts: 2,967
    edited 2007-06-25 18:55
    Ha, ha!
  • Paul BakerPaul Baker Posts: 6,351
    edited 2007-06-25 20:41
    deSilva said...
    Ha, ha!
    ? explain

  • deSilvadeSilva Posts: 2,967
    edited 2007-06-25 20:47
    Sorry :)
    Well, I never expected the Propeller II to be fully bit-code compatible - you squeezed out too many bit fields in the Propeller I :)
    And - of course - marketing would not allow you to point out any timings :)
  • Graham StablerGraham Stabler Posts: 2,510
    edited 2007-06-25 20:48
    yeah really funny
  • janbjanb Posts: 74
    edited 2007-06-27 01:31
    Hi Guys,
    I'm stuck. The following code works properly:
    '---------------------------
    ClearArray
            mov x,#12
            movd :self,#CogArr 'reset address of Cog array
            mov i,bufLen ' clear only used portion of the buffer
    :self   mov 0-0,x 
            add  :self,d_inc  'increment destination in instruction above
            djnz  i, #:self 'continue till end of line  
            'mov n,#22
    ClearArray_ret  ret
    

    It fills the internal COG array 'CogArr' of length 'bufLen' with the value 12.

    But if I enable one more line in this routine
    mov n,#22
    which changes some other register, it causes this routine to write zeros instead of 12 to my array.
    I do not get this entanglement.
    The full code is attached.
    Please advise
    Jan
  • janbjanb Posts: 74
    edited 2007-07-04 16:46
    Hi,
    with time and patience I solved my previous problem: the ASM code was writing over itself.
    Perhaps you could help me with a new problem?

    I'd like to measure exactly (with 1 prop clock tick accuracy) the time of some pin going from high to low. The time interval is of the order of ~20 prop ticks.
    The following code is only approximate: according to the manual, waitpeq takes 5+ clock ticks - it is not fully deterministic.
            waitpeq   zero, pixMask   
            mov T1,cnt
            waitpeq   pixMask, pixMask   
            waitpeq   zero, pixMask   
            mov T2,cnt
    zero long 0 
    


    Since I want to measure deltaT only once, would it be possible to use a prop counter instead?

            waitpeq   zero, pixMask   
            '??? activate ctra, frqa,dira 
               
            'wait fixed amount of time, slightly longer than expected measurement time
    

           'extract content of ctra to get T2-T1
    

    Can it be done? Would it be more accurate, provided there are enough clock ticks to initialize the counter while the pin selected by pixMask is low?

    Any suggestions? Perhaps there is somewhere a code I could look it up?
    Thanks
    Jan

  • Mike GreenMike Green Posts: 23,101
    edited 2007-07-04 17:03
    The easiest thing is to set up the counter ahead of time to use mode %01100 which counts clock cycles whenever the selected pin is zero. Your program would do a "waitpne zero,pixMask". At that point, phsa or phsb would contain the number of clock cycles the pin was low or it would contain zero (or whatever its previous value was). If T2-T1 is zero, there was no pulse. If T2-T1 is greater than zero, there was a pulse and you have its width in 12.5ns clock cycles (with an 80MHz system clock).
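As a rough picture of what the counter does in that mode, the Python sketch below accumulates FRQ into PHS on every system clock where the sampled pin is low. It is a simplified model that ignores the counter's internal input sampling delay; the function name and the sample list are made up for illustration.

```python
def count_low_clocks(pin_samples, frq=1):
    """Model of a 'count while pin is low' counter.

    pin_samples: pin level (0 or 1) at each system clock.
    Returns the final PHS value, starting from 0.
    """
    phs = 0
    for level in pin_samples:
        if level == 0:
            phs += frq
    return phs


CLOCK_NS = 1e9 / 80_000_000          # 12.5 ns per clock at 80 MHz

# Assumed waveform: high for 5 clocks, low for 20, high again for 5.
samples = [1] * 5 + [0] * 20 + [1] * 5
ticks = count_low_clocks(samples)
print(ticks, ticks * CLOCK_NS)       # 20 250.0
```

Because the counter runs independently of the instruction stream, the cog only needs to read PHS after the pulse has ended, so the instruction timing no longer limits the resolution.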
  • janbjanb Posts: 74
    edited 2007-07-04 17:26
    Hi,
    thanks a lot. So my code should look like:
    mov ctra,#%01100_0000_0000_0010 ' count when pin#1 is LOW
    mov frqa,#1 ' increment cog counterA by 1 per clock tick (at 80 MHz)
    waitpeq   pixMask, pixMask ' wait until counter stops counting 1st time
    mov T1,phsa ' save counter value
    waitpeq   zero, pixMask ' give it a chance to count again
    waitpeq   pixMask, pixMask ' wait until it stops counting 2nd time
    mov T2,phsa
    sub T2,T1 ' this is the time between 2 consecutive HIGHs of pin#1
    

    I'll give it a try. I appreciate the immediate feedback, it is a lot of fun to work in such an environment
    Jan

       
    

  • janbjanb Posts: 74
    edited 2007-07-04 20:57
    Hi,
    Thanks again.
    The code below does work! It measures separately the duration of the negative and positive state of pin#5 and stores 4 counter values in variables t0,...,t3.

    Now I want to do more complex stuff:
    Q1: to measure time for a coincidence state of 2 pins I should use mode %10001 or %10010, ...etc, right?
    Q2: what is the difference between modes %01100 and %10101 ?

    Thanks
    Jan

    :modeNEG long %01100 <<26         
    :modePOS long %01000 <<26
    :mask    long 1<<5
    :pin     long 5
            mov frqb,#1 ' increment cog counterB by 1 per clock tick (at 80 MHz)
    

            'negative state duration
            mov x,:modeNEG
            add x,:pin
            mov ctrb,x ' count while pin 5 is LOW
            waitpeq   :mask, :mask ' wait until counter stops counting 1st time
            mov t0,phsb ' save counter value
            waitpeq   zero, :mask ' give it a chance to count again
            waitpeq   :mask, :mask ' wait until it stops counting 2nd time
            mov t1,phsb

            'positive state duration
            mov x,:modePOS
            add x,:pin
            mov ctrb,x ' count while pin 5 is HIGH
            waitpeq   zero, :mask ' wait until counter stops counting 1st time
            mov t2,phsb ' save counter value

            waitpeq   :mask, :mask ' give it a chance to count again
            waitpeq   zero, :mask ' wait until it stops counting 2nd time
            mov t3,phsb
    
  • Mike GreenMike Green Posts: 23,101
    edited 2007-07-04 22:32
    Jan,
    Yes, to measure a logical coincidence use mode %10001 or %10010.

    I believe modes %01100 and %10101 are functionally equivalent. I suspect the logic was easier this way than trying to add some additional conditions for the duplicate mode values.
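For reference, the "mode << 26 plus pin number" composition used in Jan's code follows the CTRA/CTRB layout: the mode sits in bits 30..26, BPIN in bits 14..9 and APIN in bits 5..0. A small Python sketch of that packing follows; the helper name is hypothetical, and the PLL divider field is left at zero because it is not used by these modes.

```python
def ctr_value(mode, apin, bpin=0):
    """Pack a counter control value: mode in bits 30..26, BPIN 14..9, APIN 5..0."""
    assert 0 <= mode < 32 and 0 <= apin < 64 and 0 <= bpin < 64
    return (mode << 26) | (bpin << 9) | apin


NEG_DETECT = 0b01100                  # count while APIN is low
value = ctr_value(NEG_DETECT, apin=5)
print(f"{value:#010x}")               # 0x30000005

# A 9-bit immediate cannot hold this value, so it has to come from a long.
fits_immediate = value <= 0x1FF
print(fits_immediate)                 # False
```

This is why the working version loads the mode from a LONG constant and adds the pin number, rather than using a single immediate operand.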
  • Paul BakerPaul Baker Posts: 6,351
    edited 2007-07-05 15:58
    Hi Jan,
    To clarify a point, waitpeq is deterministic with respect to the pin state. The reason it is listed as 5+ is it takes 4 clocks to process the instruction, plus however many cycles of compare necessary to achieve the wait state. If the value is true at the beginning it will take 5 cycles since only one compare cycle occurs. For situations where more than one compare cycle occurs, the next instruction begins execution on the next clock cycle after a comparison evaluates true.

  • janbjanb Posts: 74
    edited 2007-07-05 16:18
    Hi,
    thanks for the explanation. I'm more interested in the case when a cog stops at waitpeq and then waits for a given pin state - in my example a pin goes high at the beginning of a new frame from a CMOS camera, and shortly after that data transmission starts.
    How much time will pass between the frame pin (driven by the camera) going high and execution of the next ASM instruction in the cog after waitpeq?

    I need to know this, since the 1st image data in the frame will show up e.g. 1 usec after the frame pin goes high and pixel data will change every 50 ns (i.e. one ASM instruction time). I need to know very precisely which pixel I'm reading and which I'm skipping
    Thanks
    Jan
  • Paul BakerPaul Baker Posts: 6,351
    edited 2007-07-05 17:19
    The next instruction will begin executing on the next clock cycle. Since each clock cycle is 12.5 ns at 80MHz, the first stage of the next instruction will begin executing between 0 and 12.5ns after the line goes high, since that is the resolution of the comparator.
