waitpxx detailed timing? — Parallax Forums

ags Posts: 386
edited 2014-04-21 11:18 in Propeller 1
What is the timing (in clock cycles) between a pin being detected high (or low), "releasing" the waitpxx instruction, and the start of the next instruction?

From other discussions, I believe that waitpxx requires a minimum of 6 clocks. If it is like waitcnt, the comparison happens during the 4th clock (unclear what happens during the 5th) and results are written in the 6th. So for:

waitcnt value, delay

cnt is sampled during the 3rd clock, or "e" clock. (This uses a modified version of the SDeR terminology, fetchSource/fetchDestination/execute/writeResult; for waitxx instructions it becomes SDwm.R, or fetchSource/fetchDestination/wait/match/nothing/writeResult.)

With waitpxx, is INA sampled during the 3rd clock? If so, and steps similar to waitcnt occur (fetchSource/fetchDestination/sampleINA/match/nothing/writeResult) that would mean that the following instruction begins 3 clocks after the src operand matches INA.

Is this correct? I searched for waitpne and waitne but found nothing about this. Thanks.

Comments

  • kuroneko Posts: 3,623
    edited 2014-04-07 16:39
    As stated in docn.errata the 4th cycle is the first match opportunity. See attached sample for more details. HTH
  • Phil Pilgrim (PhiPi) Posts: 23,049
    edited 2014-04-07 19:45
    I think what matters most here is how soon after the wait event occurs can an instruction effect change. So I wrote the following test program that outputs a PLL signal on a pin that's asynchronous to the system clock, does a wait, and outputs a pulse on another pin:
    CON
    
      _clkmode      = xtal1 + pll16x
      _xinfreq      = 5_000_000
    
    PUB  start
    
      cognew(@tester, 0)
    
    DAT
    
                  org       0
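    tester        mov       ctra,ctra0              ' counter A: PLL single-ended mode, output on pin 0
                  mov       frqa,frqa0              ' set the NCO rate that feeds the PLL
                  mov       dira,#3                 ' pins 0 (PLL signal) and 1 (pulse) are outputs
    :loop         waitpne   one,#1                  ' wait until pin 0 is low
                  waitpeq   one,#1                  ' wait until pin 0 goes high
                  or        outa,#2                 ' pulse pin 1: drive it high...
                  andn      outa,#2                 ' ...then low again
                  jmp       #:loop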
    
    
    ctra0         long      %00010 << 26 |  %011 << 23      ' PLL single-ended mode, PLLDIV = VCO/16, APIN = 0
    frqa0         long      $0fcf_a5a5                      ' NCO rate that feeds the PLL
    one           long      1                               ' mask for pin 0
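
A rough check of the test signal's frequency, for anyone who wants to confirm that it is asynchronous to the system clock. This is a sketch in Python, not part of the original post. It assumes the usual Propeller 1 counter behavior in PLL mode: the NCO rate is clkfreq × FRQA / 2^32, the PLL multiplies that by 16, and PLLDIV = %011 divides the VCO output by 16. The function and constant names are made up for illustration.

```python
# Assumed values, taken from the CON and DAT sections of the test program.
CLKFREQ_HZ = 5_000_000 * 16      # xtal1 + pll16x with a 5 MHz crystal = 80 MHz
FRQA = 0x0FCF_A5A5               # frqa0
PLLDIV = 0b011                   # bits 25..23 of ctra0


def pll_output_hz(clkfreq_hz: int, frqa: int, plldiv: int) -> float:
    """Estimate the counter PLL output frequency (assumed P1 counter model)."""
    nco_hz = clkfreq_hz * frqa / 2**32   # rate at which PHSA bit 31 toggles through a full cycle
    vco_hz = nco_hz * 16                 # the counter PLL multiplies its input by 16
    return vco_hz / 2 ** (7 - plldiv)    # PLLDIV %011 selects VCO/16


if __name__ == "__main__":
    f_out = pll_output_hz(CLKFREQ_HZ, FRQA, PLLDIV)
    print(f"PLL output: {f_out / 1e6:.3f} MHz")
    print(f"Period: {CLKFREQ_HZ / f_out:.2f} system clocks")
```

With these values the output comes to about 4.94 MHz, a period of roughly 16.2 system clocks. Because that is not a whole number of clocks, the edges drift against the system clock, which is what spreads the measured delay over a one-clock window.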
    

    Here's what the scope output looks like:

    attachment.php?attachmentid=108029&d=1396927096

    As expected, the uncertainty between the arrival time of the leading edge and recognition by the waitpeq is 12.5 ns. The amount of time it takes between that arrival and the output to the pin to register ranges between 56.25 ns and 68.75 ns (i.e. between 4.5 and 5.5 system clocks). See correction below.

    -Phil
  • kuroneko Posts: 3,623
    edited 2014-04-07 19:59
    @PhiPi: Am I correct in assuming that the yellow trace represents outa[1]? If so, why do you measure outa[0] from the falling edge (it's a waitpeq after all, waiting for high)?
  • Phil Pilgrim (PhiPi) Posts: 23,049
    edited 2014-04-07 20:17
    kuroneko,

    Oh, crap! I grabbed the wrong edge. Yellow is outa[0]; blue, outa[1]. Here's the correct display:

    attachment.php?attachmentid=108028&d=1396927003

    The delay ranges from 6.75 to 7.75 system clocks.

    Thanks for the catch!

    -Phil
  • Tracy Allen Posts: 6,566
    edited 2014-04-08 23:14
    Interesting approach, Phil. Had to try it. Here it is on the LeCroy with width statistics. Result confirmed: 84.00 to 96.91 ns, compared to 84.38 to 96.875 ns for exactly 6.75 to 7.75 clocks. The average is close to 7.25 clocks. Trace A is the difference of (1)-(2). Trace C is the histogram of widths.

    D021.png
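
For reference, the clock counts quoted in this thread can be converted to nanoseconds with a few lines of arithmetic. The sketch below is in Python and is not from any of the posters. It assumes the 80 MHz system clock implied by the test program's settings (5 MHz crystal, pll16x), and the helper name is hypothetical.

```python
# Assumption: 5 MHz crystal with the 16x PLL, as in the test program's CON block.
XIN_HZ = 5_000_000
PLL_MULT = 16
CLKFREQ_HZ = XIN_HZ * PLL_MULT   # 80 MHz system clock


def clocks_to_ns(clocks: float) -> float:
    """Convert a number of system clocks to nanoseconds."""
    return clocks * 1e9 / CLKFREQ_HZ


if __name__ == "__main__":
    # One clock, then the minimum, average and maximum delays reported above.
    for clocks in (1, 6.75, 7.25, 7.75):
        print(f"{clocks:5.2f} clocks = {clocks_to_ns(clocks):7.3f} ns")
```

One clock works out to 12.5 ns, so the 6.75 to 7.75 clock window corresponds to 84.375 to 96.875 ns.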
  • kuroneko Posts: 3,623
    edited 2014-04-09 06:02
    @PhiPi: can you please run the test for pins 7/8 again? Thanks.
  • Mark_T Posts: 1,981
    edited 2014-04-09 06:03
    This begs the question of what's the fastest input pin square wave that a stream of waitpeq/waitpne instructions
    can synchronize to? Harder to measure if you don't allow other instructions between them, but you could
    measure the speed of a loop like
    :loop
                           waitpeq  mask, mask
                           waitpne  mask, mask
                           waitpeq  mask, mask
                           waitpne  mask, mask
                           waitpeq  mask, mask
                           waitpne  mask, mask
                           waitpeq  mask, mask
                           waitpne  mask, mask
                           xor      OUTA, pmask  ' output the input divided by 8 (or 10 when close to limit)
                           jmp     #:loop
    
    And see the frequency beyond which it suddenly slows down as synchrony is lost. For 6 cycles it should
    be at clkfreq/12, but will be sensitive to duty cycle if edges are subject to unequal delays.
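
The clkfreq/12 estimate above can be put into numbers. This is a back-of-the-envelope sketch in Python, not a measurement. It assumes an 80 MHz system clock and takes the 6-clock minimum per waitpxx from earlier in the thread; it ignores the extra clocks spent in the xor and jmp at the end of the loop. The names are hypothetical.

```python
CLKFREQ_HZ = 80_000_000   # assumed: 5 MHz crystal with pll16x
WAIT_MIN_CLOCKS = 6       # minimum waitpxx duration quoted in this thread
WAIT_PAIRS_PER_LOOP = 4   # waitpeq/waitpne pairs in the loop above


def max_input_hz(clkfreq_hz: float, wait_clocks: int = WAIT_MIN_CLOCKS) -> float:
    """Upper bound on the input frequency: one waitpeq plus one waitpne per input period."""
    return clkfreq_hz / (2 * wait_clocks)


def divided_output_hz(input_hz: float, pairs: int = WAIT_PAIRS_PER_LOOP) -> float:
    """Output frequency while in sync: the pin toggles once per loop pass."""
    return input_hz / (2 * pairs)


if __name__ == "__main__":
    f_max = max_input_hz(CLKFREQ_HZ)
    print(f"Estimated input limit: {f_max / 1e6:.3f} MHz")
    print(f"Output at that limit:  {divided_output_hz(f_max) / 1e3:.1f} kHz")
```

At 80 MHz the estimate is about 6.67 MHz for the input, with the toggled output at one eighth of the input frequency for as long as the loop stays in step.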
  • Phil Pilgrim (PhiPi) Posts: 23,049
    edited 2014-04-09 14:10
    I tested pin 7 vs. pin 8. The results are about the same (6.75 - 7.75 system clocks):

    attachment.php?attachmentid=108075&d=1397077810

    Here's the revised program. You can change the constants to test any pair of pins with this one:
    CON
    
      _clkmode      = xtal1 + pll16x
      _xinfreq      = 5_000_000
    
      PLL_PIN       = 7
      OUT_PIN       = 8
    
    PUB  start
    
      cognew(@tester, 0)
    
    DAT
    
                  org       0
    tester        mov       ctra,ctra0
                  mov       frqa,frqa0
                  mov       dira,dira0
    :loop         waitpne   test_ptn,test_ptn
                  waitpeq   test_ptn,test_ptn
                  or        outa,outa0
                  andn      outa,outa0
                  jmp       #:loop
    
    
    ctra0         long      %00010 << 26 |  %011 << 23 | PLL_PIN
    frqa0         long      $0fcf_a5a5
    dira0         long      1 << OUT_PIN | 1 << PLL_PIN
    outa0         long      1 << OUT_PIN
    test_ptn      long      1 << PLL_PIN
    

    -Phil
  • kuroneko Posts: 3,623
    edited 2014-04-09 18:46
    I tested pin 7 vs. pin 8. The results are about the same (6.75 - 7.75 system clocks) ...
    Thanks, I was just wondering about the pad delay effect. Seems it's negligible in this context.
  • ags Posts: 386
    edited 2014-04-10 17:29
    Although I framed my original question as "how does waitpxx work at the individual clock-cycle level?", the resulting answer is actually more useful and, my constant curiosity about how things work under the covers aside, gives me the information I need. The bottom line is that, depending on when the input edge occurs relative to the Prop's internal clock edge, the next instruction will begin no fewer than 2 and no more than 4 clock cycles later.

    Back to my "how does it work?" quest, I suppose this means that INA is sampled/latched in the third clock cycle ("e" using the "SDeR" terminology) as is the case with most (all?) other instructions, and there are three more clocks that pass before waitpxx completes. This also matches what I've seen stating that waitpxx takes 6+ clocks.

    Still on the "what is happening?" path, if live registers (e.g. INA) are latched in "e", and the actual comparison is done in the next clock ("e+1", or "R" for "normal instructions") and the last clock is used to write the result (if specified), then what is happening in the clock cycle in between ("e+2")? Is it the same as is happening with waitcnt (whatever that is)?
  • kuroneko Posts: 3,623
    edited 2014-04-21 11:18
    The waitxxx instructions perform an add underneath (except waitpne, which does dst += src + 1).