I can't believe I found a HW BUG in the propeller — Parallax Forums

I can't believe I found a HW BUG in the propeller

Alex.Stanfield Posts: 198
edited 2012-10-12 11:12 in Propeller 1
Now that I have your attention :tongue: I found a really strange behavior when using MAX/S and MIN/S with FRQA. Attached is a demo program you can compile and run, along with the resulting waveforms, which I sampled with a logic analyzer.

MAXS_MINS PASM bug.spin

To summarize the problem:

If you use MAX/S or MIN/S directly with FRQA in the destination field, the counter hardware appears to be loaded with the source field, even though FRQA still reads back correctly. It's an internal HW problem: your program still believes the counter is running with the value in FRQA, BUT in reality your pin is running at the max (or min) limit you specified.

As you can see in the program, you can easily circumvent the problem by reading the value and storing it again (or, even better, by reading FRQA into a working register, applying MAXS there, and writing the result back to FRQA).
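
A minimal sketch of the two workarounds just described, written as a complete Spin/PASM program. It assumes an NCO on P0 and an 80 MHz clock; the labels FRQ_min, tmp and ctra0 and the two frequency words are placeholders for illustration, not the contents of the attached demo. The behavior noted in the comments is what is reported in this thread, not something taken from the manual.

```spin
CON
  _clkmode      = xtal1 + pll16x
  _xinfreq      = 5_000_000

PUB main
  cognew(@entry, 0)

DAT
              org     0

entry         mov     dira, #1                ' P0 as output (assumed pin)
              mov     ctra, ctra0             ' NCO mode, output on P0
              mov     frqa, FRQ_min           ' start at the low frequency

              ' Workaround A: follow the MAXS with an unconditional write.
              maxs    frqa, FRQ_max           ' FRQA still reads FRQ_min, but the
                                              ' counter is reported to get FRQ_max
              add     frqa, #0                ' rewrite FRQA so the counter matches it

              ' Workaround B: never let MAXS touch FRQA at all.
              mov     tmp, frqa               ' read FRQA into a working register
              maxs    tmp, FRQ_max            ' clamp in ordinary cog RAM
              mov     frqa, tmp               ' unconditional write back

              jmp     #$                      ' idle; the counter keeps running

ctra0         long    %00100 << 26            ' NCO, APIN = 0
FRQ_max       long    53000                   ' placeholder upper limit
FRQ_min       long    5300                    ' placeholder lower limit
tmp           res     1
```

With workaround A the counter still runs from the wrong value between the MAXS and the ADD, so workaround B is the one to prefer when a glitch on the pin matters.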

Here you see what happens to the output waveform on a MAXS FRQA, FRQ_max:

maxs_bug-1.jpg


After waiting just one second, this is what happens on ADD FRQA, #0 (which simply rewrites the same value). The correct value was stored in FRQA all along, but only now does the output pin get the correct value.
maxs_bug-2.png


I would recommend studying this behavior in detail and modifying the Propeller manual to alert users to it.

Regards
Alex

Comments

  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2012-10-05 22:47
    Whenever anyone says they've discovered a Propeller hardware bug after six years on the market, I have to roll my eyes, purse my lips, and utter words of pure skepticism. But I've reduced your program to a bare minimum, and it exhibits that same curious behavior:
    CON
       
      _clkmode      = xtal1 + pll16x
      _xinfreq      = 5_000_000
      
    PUB main
    
      cognew(@entry, 0)
    
    DAT
                  org 0
    
    entry         mov       time,one_second
                  add       time,cnt
                  waitcnt   time,one_second         'Wait 1 sec before starting.
                  mov       dira,#1                 'Set bit 0 to output.
                  mov       ctra,ctra0              'Set ctra to NCO; output on P0.
                  mov       frqa,fmin
                  waitcnt   time,one_second         'Pause 1 sec.
                  maxs      frqa,fmax
                  waitcnt   time,one_second         'Pause 1 sec.
                  add       frqa,#0                 'Set HW output correctly again.
                  jmp       #$
                  
    one_second    long      80_000_000              'cnt cycles for 1 sec.
    ctra0         long      %00100<<26              'NCO to generate step pulses.
    fmax          long      53000                   'Working parameter: fmax for job.
    fmin          long      5300                    'Working parameter: fmin for job. 
    time          res       1
    

    I have to say that something odd is happening, and it probably has to do with frqa's shadow register. But nothing has ever been documented to say that frqa cannot be used with impunity as the destination of a read-modify-write instruction. It should behave normally under these circumstances.

    Now, the following program works as you would expect:
    CON
       
      _clkmode      = xtal1 + pll16x
      _xinfreq      = 5_000_000
      
    PUB main
    
      cognew(@entry, 0)
    
    DAT
                  org 0
    
    entry         mov       time,one_second
                  add       time,cnt
                  waitcnt   time,one_second         'Wait 1 sec before starting.
                  mov       dira,#1                 'Set bit 0 to output.
                  mov       ctra,ctra0              'Set ctra to NCO; output on P0.
                  mov       acc,fmin
                  mov       frqa,acc
                  waitcnt   time,one_second         'Pause 1 sec.
                  maxs      acc,fmax
                  mov       frqa,acc
                  waitcnt   time,one_second         'Pause 1 sec.
                  add       acc,#0
                  mov       frqa,acc                'Set HW output correctly again.
                  jmp       #$
                  
    one_second    long      80_000_000              'cnt cycles for 1 sec.
    ctra0         long      %00100<<26              'NCO to generate step pulses.
    fmax          long      53000                   'Working parameter: fmax for job.
    fmin          long      5300                    'Working parameter: fmin for job. 
    time          res       1
    acc           res       1
    

    Dunno what to think. It definitely seems to be an anomaly.

    Hopefully Kuroneko has awakened and tuned in. He's the de facto custodian of weird Propeller behavior. Maybe he can shed some light.

    -Phil
  • kuroneko Posts: 3,623
    edited 2012-10-06 01:37
    Odd behaviour indeed! That said, max & Co are odd in themselves. Anyway, after max is done, the value of frqa is correct whether sampled as src or dst, which suggests separate h/w for the actual counter register. Also, if you look at the flag behaviour for Z, it's in fact src == 0, not result == 0. So this means the Z flag logic is fed from the source bus rather than destination/result. The counter register seems to get its value from the same feed, which would explain the observed behaviour (as it's clearly loaded from src in this case).

    Quick tests show the same behaviour for other h/w backed registers, e.g. max phsa, one_second leaves the shadow intact (small enough) but the counter register is preset with 80M. This may come in handy, who knows?
    PUB null
    
      cognew(@entry, 0)
    
    DAT             org     0
    
    entry           max     dira, mask
    
                    max     frqa, frqa_
                    max     ctra, ctra_             ' video PLL, clock exposed (16)
    
                    max     frqb, #255
                    max     ctrb, ctrb_             ' NCO (17)
    
                    max     vcfg, vcfg_
                    max     vscl, vscl_             ' video (18..23)
                    
                    mov     temp, dira
                    or      temp, frqa
                    or      temp, frqb
                    or      temp, ctra
                    or      temp, ctrb
                    or      temp, vcfg
                    or      temp, vscl wz
            if_nz   hubop   $, #%10000_000          ' reset if non-zero
    
                    waitpeq $, #%%210
                    
    mask            long    $00FF0000
    frqa_           long    $10000000               ' 5MHz
    ctra_           long    %0_00010_000 << 23 | 16
    ctrb_           long    %0_00100_000 << 23 | 17
    vcfg_           long    %0_01_1_00_000 << 23 | 2 << 9 | $FC
    vscl_           long    255 << 12 | 4095
                    
    temp            res     1
    
                    fit
    
    DAT
    
    This is effective for all SPRs (actually I haven't bothered checking the r/o section and vscl). The former is ... r/o and the latter not useful for code. Which means that under certain circumstances 15 SPRs can be used for code/data while their connected h/w still does its job.
  • Clive Wakeham Posts: 152
    edited 2012-10-06 02:38
    Thought the correct term was FEATURE not BUG.
  • prof_braino Posts: 4,313
    edited 2012-10-07 07:32
    Thought the correct term was FEATURE not BUG.

    It's only a BUG if it contradicts the stated requirements. If there is no description, it is simply NOT DEFINED or in this case NOT YET DEFINED.

    So, to clarify, (I only check that the docs line up, I don't necessarily understand all the details), this is NOT a bug, since nothing defines what FRQA+MAX does concerning the shadow register(?) [Please help me state the issue clearly, I just got up]

    And the observed behavior is....something to the effect that it does its job while the cog can do other stuff? [Ok, I'll get some coffee...]

    My aim here is to turn it from an undocumented anomaly into a documented behavior. Then somebody else can show us how to take advantage of this function and/or how to avoid the pitfalls of its presence.
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2012-10-07 10:26
    Whether it's a bug or a feature depends entirely upon the intent of the designer. Since we can't get inside Chip's head, we have to rely upon the documentation for clues. The manual is pretty clear in its implication that frqx is read-modify-write safe. In any event, it would be hard to imagine the exhibited behavior being intentional.

    -Phil
  • StefanL38 Posts: 2,292
    edited 2012-10-07 12:08
    First of all:
    compared with any other MCU, the Propeller chip is to me still No. 1, even in the category "free of bugs".

    So I would like to ask Chip Gracey directly. Chip, are you still reading the headlines of the threads?
    If you are busy re-re-checking the circuitry of the Propeller 2 chip, please stop checking for a moment, read this thread,
    and then check whether the P2 chip has the same feature.

    best regards
    Stefan
  • jazzed Posts: 11,803
    edited 2012-10-07 12:24
    10 reasons for bugs:

    10) Not using pest control
    9) Being unnecessarily exposed to risks
    8) Making solutions overly complex
    7) Faults in steps or logic
    6) Not verifying function
    5) Expecting something to "just work"
    4) Not meeting natural expectations
    3) Not communicating how something should work
    2) Failure of user to find/comprehend published material
    1) Not getting everyone's approval on how something should work in the first place
  • prof_braino Posts: 4,313
    edited 2012-10-08 08:46
    I try to avoid contradicting PhiPi, as usually I can't understand what he says on a technical level. But in regard to requirements, I must beg to differ, as this is my area, and there is a larger precedent supporting a position other than the one stated by our mentor.

    IT IS COMPLETELY IRRELEVANT what's in the engineer's head or what the "intent" was if it is not written down and communicated to us, the folks that need the information. The only thing that matters is what does it CLAIM TO DO, and what does it ACTUALLY DO. Claims of functionality and actual functionality are the only things we can check, and confirm or deny. All else is undefined. These claims are recorded in the documentation. Documented claims are the only thing that exist, and are the only thing we can measure against. Any "claim" that is not documented cannot be claimed to exist, and for all practical purposes does not exist.

    Does the document specifically state that the behavior for this specific function is other than observed? Is this function so obscure that it is undefined and that is OK? We cannot expect anyone to see the future and guess every possible weird thing that may ever come to pass. Some things are just left "undefined" until such a time as they raise their head and demand attention. This may be one of those, since nobody noticed it before in all these years.

    If it is inconsistent with the other special register functions, can it be stated in the docs that this one reads the input bus instead of the output bus for setting this bit? (or whatever it's doing, you pretty much lost me at FRQA). If the documentation can be made to describe this as the intended (or accepted) behavior, it cannot be considered a bug. It may be weird, but if it's in the docs and we know about it, we are not set up to get burned.

    Even if this contradicts the stated behavior of the other special registers, the docs can be made to state that this function has an exception, and then it can no longer be called a bug.

    This is a kind of cool discussion. I want to see if we establish the intended function, the actual function, and accurate documentation of the analysis.
  • Alex.Stanfield Posts: 198
    edited 2012-10-08 12:34
    Does the document specifically state that the behavior for this specific function is other than observed?
    Is this function so obscure that it is undefined and that is OK?
    No and No. It's an indirect bug:
    - FRQA sets the output frequency of the counter in the sample program above (NCO = numerically controlled oscillator).
    - MAX/S or MIN/S limit the maximum or minimum value of a register (any register that can be used in read/modify/write instructions). These instructions do to FRQA what's supposed to happen.
    - However, after these instructions FRQA does NOT reflect the output frequency of the counter. FRQA has the value that you expect; it's the output of the counter that is wrong.
    If it is inconsistent with the other special register functions, can it be stated in the docs that this one reads the input bus instead of the output bus for setting this bit? (or whatever it's doing, you pretty much lost me at FRQA). If the documentation can be made to describe this as the intended (or accepted) behavior, it cannot be considered a bug. It may be weird, but if it's in the docs and we know about it, we are not set up to get burned.

    Absolutely, that's what I stated when opening this thread. Let's get the manual to reflect this as a warning so others don't stumble on this behavior. For now it's a BUG; once we change the manual, you'll be warned and it's no longer a bug.
    I would suggest modifying both the special registers section and the MAX/MIN instruction entries.

    Alex
  • cgracey Posts: 14,253
    edited 2012-10-10 09:40
    I looked into this and discovered the problem. The instructions MINS/MAXS/MIN/MAX/CMPSUB can all cancel writes to cog RAM, but that 'cancel' signal was not incorporated into the I/O register write signals. I see from the hardware-description source code that I had labelled the write-cancel signal as late-arriving, so it was not suitable for using as a clock-gating signal for the I/O register flipflops, though it could be used to control cog RAM writes via the RAM's write-enable signal, which gets captured at the clock edge. So, all MINS/MAXS/MIN/MAX/CMPSUB instructions working on I/O registers WILL write the otherwise-conditional result to I/O flops, though the shadow registers behind them will be written as expected (causing a disparity between the two). I probably thought at design time that nobody would ever use MINS/MAXS/MIN/MAX/CMPSUB on I/O registers. Sorry about this. The solution is to not use MINS/MAXS/MIN/MAX/CMPSUB on I/O registers.
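
    A minimal sketch of what that rule looks like in practice, offered as an illustration rather than as Chip's code. It keeps the master copy of the frequency word in an ordinary cog register (acc), applies MINS/MAXS only to that register, and updates FRQA with a plain MOV, which is an unconditional write. The ramp itself, the set_frqa helper, and the values of fstart, fstep and step_time are hypothetical; P0, NCO mode and the 80 MHz clock follow Phil's reduction above.

    ```spin
    CON
      _clkmode      = xtal1 + pll16x
      _xinfreq      = 5_000_000

    PUB main
      cognew(@entry, 0)

    DAT
                  org     0

    entry         mov     dira, #1                ' P0 as output
                  mov     ctra, ctra0             ' NCO mode, output on P0
                  mov     acc, fstart             ' master copy lives in cog RAM
                  mov     frqa, acc
                  mov     time, cnt
                  add     time, step_time

    loop          waitcnt time, step_time         ' one ramp step per interval
                  add     acc, fstep              ' raise the frequency word
                  call    #set_frqa               ' clamp it, then update the counter
                  jmp     #loop

    ' Hypothetical helper: clamp acc to fmin..fmax in cog RAM, then write FRQA.
    set_frqa      mins    acc, fmin               ' conditional write hits RAM only
                  maxs    acc, fmax
                  mov     frqa, acc               ' unconditional write to the I/O register
    set_frqa_ret  ret

    step_time     long    8_000_000               ' 100 ms at 80 MHz
    ctra0         long    %00100 << 26            ' NCO, APIN = 0
    fmin          long    5300
    fmax          long    53000
    fstart        long    5300                    ' hypothetical start value
    fstep         long    1000                    ' hypothetical ramp increment
    time          res     1
    acc           res     1
    ```

    Because FRQA is only ever the destination of MOV, the shadow register and the I/O flops are always written together, so the disparity described above cannot arise. The same pattern should apply to any other I/O register that would otherwise be the destination of MINS/MAXS/MIN/MAX/CMPSUB.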
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2012-10-10 10:06
    Thanks, Chip, for the informative feedback -- and another peek into the Prop's internal workings! Now, since it's a documented fact, all we need to do is find ways to exploit this "new feature!" :)

    I assume this will not be the case with the Prop II, in part, since the SFRs are hidden from R/M/W instructions; correct?

    -Phil
  • Heater. Posts: 21,230
    edited 2012-10-10 11:54
    Feature or bug?

    Is it a "feature" that will be carried to the Prop II or is it a "bug" that will be fixed in Prop II?
  • Alex.Stanfield Posts: 198
    edited 2012-10-10 17:53
    Chip, thanks a lot for the information and detailed explanation.

    Given the time on the market with no reports of this, it's obviously not a big issue, but will someone at Parallax clarify this in the manual so the next lunatic who tries this is warned?
    Thanks in advance.
    Alex
  • David Carrier Posts: 294
    edited 2012-10-10 18:37
    Alex,
    I created a task in our issue tracker to add the information in the next revision of the Propeller Manual.

    — David Carrier
    Parallax Inc.
  • Alex.Stanfield Posts: 198
    edited 2012-10-10 19:26
    Alex,
    I created a task in our issue tracker to add the information in the next revision of the Propeller Manual.

    — David Carrier
    Parallax Inc.

    Thanks a lot David! I appreciate that.

    As we are accustomed to around here, Parallax stands out in support for its users, great work!

    I'm glad I chose the Propeller for my projects.

    Alex
  • prof_braino Posts: 4,313
    edited 2012-10-10 20:18
    Heater. wrote: »
    Feature or bug?

    In this case it's "just the documented way it works" in the Prop 1 (once the doc is updated).

    Prop 2 will likely be determined by observing "late arriving", or not, on that hardware.

    This is kind of cool that an anomaly can go undetected so long, then be detected and verified by the community, and acknowledged and addressed by the company so quickly. What's that, a 5-day turnaround? Brilliant!

    Since no one has encountered this before, "The solution is to not use MINS/MAXS/MIN/MAX/CMPSUB on I/O registers. " seems an acceptable resolution.
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2012-10-10 20:28
    Prop 2 will likely be determined by observing "late arriving", or not, on that hardware.
    AFAIK, it's a non-issue, since the SFRs on the Prop 2 are not read/modify/write accessible.
    "The solution is to not use MINS/MAXS/MIN/MAX/CMPSUB on I/O registers. " seems an acceptable resolution.
    "Find a way to exploit the newly-documented behavior," is even better! :)

    -Phil
  • SRLM Posts: 5,045
    edited 2012-10-10 20:46
    Heater. wrote: »
    Is it a "feature" that will be carried to the Prop II or is it a "bug" that will be fixed in Prop II?

    Would it even be possible to make such a change (if it is needed) without having to restart the whole routing process (weeks!) and pushing back the timeline even further?
  • cgracey Posts: 14,253
    edited 2012-10-10 21:41
    Heater. wrote: »
    Feature or bug?

    Is it a "feature" that will be carried to the Prop II or is it a "bug" that will be fixed in Prop II?

    The Prop II doesn't use write cancellation. It just outputs the correct result, in any case. So, this won't be an issue on Prop II. There could be other issues, though.

    When I was looking through the Prop I hardware-description code today, I was amazed at how small it was compared to the Prop II code. It's about 1/9 the size, in a more verbose language, making it probably 1/13th the complexity of Prop II. I've written many test programs for the Prop II that I run and verify at each hardware-description iteration, so I'm planning for the best outcome. My main fear is that some catastrophic bug might exist because of some mistake made ABOVE the Verilog description, relating to our own circuitry. Hopefully, the chip will just work the first time we build it.
  • mindrobots Posts: 6,506
    edited 2012-10-11 02:51
    Let's all hope and pray for silicon success!!

    {...crosses finger and toes, lights candles, faces Rocklin and chants calming mantra "hmmmm, lots of speed, lots of pins, hmmmmm"...)
  • jazzed Posts: 11,803
    edited 2012-10-12 11:12
    mindrobots wrote: »
    {...crosses finger and toes, lights candles, faces Rocklin and chants calming mantra "hmmmm, lots of speed, lots of pins, hmmmmm"...)

    Three times a day? :)

    Yes, best of luck to Parallax.