Locks with PASM


Comments

  • If only two cogs are contending for a lock then only a short time would be needed between a LOCKCLR and LOCKSET. If the other cog wants the lock it will be sitting in a tight loop doing a LOCKSET. So the cog that is clearing the lock just needs to wait slightly longer than the loop time. Of course, the loop time in Spin will be substantially longer than a loop in PASM.

    It gets more complicated when three or more cogs are actively contending for the lock. In that case a random wait time should fix the problem. Or lock requests could be queued up in a FIFO. The write to the FIFO would need to be protected with a lock. However, it seems like some wait time would still be needed before a cog could add a new request to the FIFO.
  • Show me the code!

    Surely if only two COGs are in the game, then when one does a LOCKCLR it has to wait a whole HUB cycle before it can do a LOCKSET. Isn't that enough time for the other COG to always be able to get in?



  • heater wrote:
    Isn't that enough time for the other COG to always be able to get in?
    Maybe not if the other cog happens to be executing a jmp when its turn comes around.

    -Phil
    “Perfection is achieved not when there is nothing more to add, but when there is nothing left to take away.” -Antoine de Saint-Exupery
  • Interesting aspect @Phil, let's think about it.

    The shortest PASM loop to wait for a lock is one Hub-op and one conditional jmp.

    So a PASM COG can try for the lock on every HUB cycle it can access. If more than one PASM COG is waiting, no deadlock is possible: after a lock is released, all the other waiting PASM COGs get at least one chance at it before the first one comes around again.

    Not so with SPIN COGs. A 'repeat until lockset' in SPIN will not be able to catch every HUB-cycle.
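
    A minimal sketch of the two wait loops being compared, for illustration only. The method name `acquire_spin`, its parameter `id`, and the labels `acquire` and `lockid` are hypothetical and do not come from the thread:

    ```spin
    PUB acquire_spin(id)
      ' lockset returns the lock's previous state, so loop until it was clear.
      ' Every pass goes through the Spin interpreter, so this loop cannot
      ' try again on every hub cycle.
      repeat until not lockset(id)

    DAT
                  org       0
    ' PASM: one hub op plus one conditional jmp.
    acquire       lockset   lockid wc         ' C = 1 if the lock was already set
            if_c  jmp       #acquire          ' still taken: try again at the next hub turn
                  ' ...critical section goes here...
                  lockclr   lockid            ' release the lock
                  jmp       #acquire

    lockid        long      0                 ' assumed lock ID
    ```

    The Spin version is written as `repeat until not lockset(id)` because lockset reports whether the lock was already set.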

    Mike


    I am just another Code Monkey.

    A determined coder can write COBOL programs in any language. -- Author unknown.

    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this post are to be interpreted as described in RFC 2119.
  • Phil Pilgrim (PhiPi) (edited September 28)
    heater wrote:
    Can anyone produce a simple program where:

    a) One or more COGS repeatedly acquire and release a lock.

    b) They manage to block the progress of another COG trying to acquire the same lock.

    c) Each COG flashes an LED to indicate its progress.

    d) All COGS running PASM of course.

    Trying, but so far unsuccessful -- even with four cogs vying normally, plus one "lock hog."

    -Phil
  • It's certainly possible to create a PASM program that would prevent a Spin cog from getting the lock. The PASM cog would just need a set/clr cycle that matched the number of cycles in the Spin 'repeat until lockset' loop. Would this happen normally? Probably not, but it is possible. I think in general multiple cogs will not get locked out. Some cogs may require more attempts to acquire the lock, but it is unlikely that they would never get it.
  • Dave,
    It's certainly possible to create a PASM program that would prevent a Spin cog from getting the lock
    OK, just for you, the challenge is relaxed. Your COG blocking demonstration may have one or more of the COGs running Spin.

    Mind you, it's the same challenge, seeing as Spin is PASM under the hood.
  • Dave Hein wrote:
    It's certainly possible to create a PASM program that would prevent a Spin cog from getting the lock. The PASM cog would just need a set/clr cycle that matched the number of cycles in the Spin 'repeat until lockset' loop. Would this happen normally? Probably not, but it is possible. I think in general multiple cogs will not get locked out. Some cogs may require more attempts to acquire the lock, but it is unlikely that they would never get it.

    The issue I had was with a cog running Spin that would never obtain the lock. There were 3 other cogs, all running PASM, that were accessing the same lock. It was very, very, sporadic, maybe happening in 1 out of 10 executions of 10 minute runs each.

    But happen it did. And it was always the cog running Spin that would hang. I do believe that there was an inadvertent timing synchronization between the Spin routine and at least one of the PASM routines.

    Adding the differing wait times to each cog has fixed the problem (it has never recurred).








    Tulsa, OK

    My OBEX objects:
    AGEL: Another Google Earth Logger
    DHT11 Sensor

    I didn't do it... and I promise not to do it again!
  • Phil Pilgrim (PhiPi) (edited September 28)
    Finally got a cog to hog a lock, effectively preventing three others from obtaining it. Here's code that does not hog the lock:
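    CON
    
      _clkmode      = xtal1 + pll16x
      _xinfreq      = 5_000_000
    
    PUB start
    
      cognew(@locktest, 4)    'Scope channel 1.
      cognew(@locktest, 8)    'Scope channel 2.
      cognew(@locktest, 16)   'Scope channel 3.
      cognew(@locktest, 32)   'Scope channel 4.
    
    DAT
    
                  org       0
    locktest      mov       dira,par          'par holds this cog's pin mask; make that pin an output.
    
    lockget       lockset   zero wc           'Try to take lock 0; C = 1 if it was already set.
                  nop                         'Deliberate delay between the lockset and the jmp.
                  nop
                  nop
    '              nop
            if_c  jmp       #lockget          'Lock was taken: try again.
                  or        outa,par          'Got the lock: pulse the pin.
                  andn      outa,par
                  lockclr   zero              'Release the lock.
                  jmp       #lockget
    
    zero          long      0                 'Lock ID 0 (no locknew checkout in this test).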
    

    Here's the scope output:

    no_lock_hog.gif

    If I uncomment the nop, this is what I get:

    lock_hog.gif

    What seems to be happening is that in channels 2 through 4, the jmp is synchronized to the hub cycle, so they miss every opportunity to obtain a lock when their turn comes around. The code is admittedly pathological, but it does demonstrate the possibility of a cog not being able to obtain a lock.

    I haven't yet been able to duplicate this behavior without instructions between the lockset and the jmp. I had also said that it takes four cogs with the same code for any one of them to get locked out. My mistake: it locks out even with two.

    -Phil
  • Phil Pilgrim (PhiPi) (edited September 29)
    It occurs to me that if every cog that uses a lock follows the lockset immediately with the if_c jmp back to it, all cogs will get an equal shot at obtaining the lock. The reason is that each cog will be able to access the hub again on its very next turn, as Heater rightly pointed out. In my examples above, I put a delay between the lockset and the jmp, and this is what caused the problem. What this tells me is that any kind of random hold-off will create the problem rather than fix it.

    In Spin, the situation is different because, as was pointed out, you don't get to do a lockset on every hub cycle. So a Spin cog might still get locked out. Given more instruction space in the Spin interpreter, a lockwait function implemented with the tight PASM loop would have taken care of the problem.

    -Phil
  • Wow, Phil, you did it. And so easily as it turns out. A nice demonstration.

    Amazing that after all these years there are things to be discovered about the Prop.

    It's the "lock bomb" !


  • Thanks, Phil, for your time and effort researching this issue.

    One thing that your test code doesn't do is keep the lock for any significant time. The issue I had was with locking messages that required decoding. So each cog has to obtain the lock, retrieve the message, decode it, determine if it is the intended recipient, and potentially process it - all before releasing the lock.

    I believe that the issue in my case was that one of the PASM cogs spent just enough time doing the above - on a message that wasn't intended for it - and keeping the other cogs from obtaining the lock. This, of course, meant that the intended recipient was never able to retrieve and process (and reset) its message.

    Especially since the intended recipient was the Spin cog, the cog always having the issue.

    It would have been nice to use a separate lock for each recipient. Lack of available cog space precludes this.

    I think that my "random waits" code fixes the issue because each cog waits a different amount of time before retrying.

    Of course, it's almost impossible to prove that it will always work.


  • Locks should only be held for as short a time as possible, if only because keeping them longer degrades system performance.

    Lock, read, unlock, process...., lock, write, unlock, repeat.
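
    That sequence could look roughly like the following in PASM. This is only a sketch under stated assumptions: par is assumed to hold the hub address of one shared long (passed as the second argument to cognew), the `add` is a stand-in for real processing, and the labels `worker`, `writeback`, `lockid` and `msg` are hypothetical:

    ```spin
    DAT
                  org       0
    worker        lockset   lockid wc         ' tight loop: C = 1 if the lock was already set
            if_c  jmp       #worker
                  rdlong    msg,par           ' copy the shared long while holding the lock
                  lockclr   lockid            ' release before doing any slow work

                  add       msg,#1            ' stand-in for processing the private copy

    writeback     lockset   lockid wc         ' take the lock again just for the write
            if_c  jmp       #writeback
                  wrlong    msg,par           ' store the result back to the shared long
                  lockclr   lockid
                  jmp       #worker           ' repeat

    lockid        long      0                 ' assumed lock ID
    msg           res       1                 ' private copy of the shared message
    ```

    Note that the read and the write are separate critical sections here, so another cog may change the shared long in between; that only suits data where this is acceptable.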
  • wmosscrop wrote:
    I think that my "random waits" code fixes the issue because each cog waits a different amount of time before retrying.
    Based upon my experiments, you should always retry as soon as possible, IOW in a tight loop. Waiting only increases the chance that you will not obtain the lock at all.

    Here's an example:
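    CON
    
      _clkmode      = xtal1 + pll16x
      _xinfreq      = 5_000_000
    
    PUB start
    
      cognew(@locktest, 4)
      cognew(@locktest, 8)
      cognew(@locktest, 16)
      cognew(@locktest, 32)
    
    DAT
    
                  org       0
    locktest      mov       dira,par          'par holds this cog's pin mask; make that pin an output.
    
    lockget       lockset   zero wc           'Try to take lock 0; C = 1 if it was already set.
            if_nc jmp       #:doit            'Got the lock.
                  nop                         'Hold-off before retrying (ten nops).
                  nop
                  nop
                  nop
                  nop
                  nop
                  nop
                  nop
                  nop
                  nop
                  jmp       #lockget
    
    :doit         or        outa,par          'Pin high while the lock is held.
                  mov       t0,cnt            'Note the time the lock was obtained.
    
    :waitlp       mov       t1,cnt
                  sub       t1,t0             't1 = clocks elapsed since t0.
                  sub       t1,keep wc        'C = 1 while elapsed < keep.
            if_c  jmp       #:waitlp
    
                  andn      outa,par          'Pin low.
                  lockclr   zero              'Release the lock.
                  jmp       #lockget
    
    zero          long      0                 'Lock ID 0.
    keep          long      100000            'Clocks to hold the lock.
    t0            res       1
    t1            res       1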
    

    In this example, each cog keeps the lock for 100,000 clocks. The first cog gets the lock once, and the fourth one hogs it thereafter. If I remove the nops, every cog gets its turn.

    -Phil
  • Interesting insights Phil. I wonder if lockwait was ever under consideration.

    Have I got this right in the current logic? The combination of lockset wc and if_c jmp #$-1 will always hit every hub rotation, and even adding one nop or one other 4-cycle instruction will be okay. But two or more instructions between the lockset and the jmp #$-1 will mean it misses one or more hub rotations. You can design, or inadvertently cause, another cog to sync in hub phase so that it always releases its lock and picks it up again during that missed rotation: a hog-cog. Definitely a reason to keep that loop tight.
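
    The arithmetic behind that question can be annotated on the loops themselves. This is a sketch that assumes the usual Propeller 1 figures (a hub turn for each cog every 16 clocks, 8 clocks for a hub instruction that is already synchronized, 4 clocks each for nop and jmp); the labels `onefill`, `twofill` and `lockid` are hypothetical:

    ```spin
    DAT
                  org       0
    ' One filler: 8 + 4 + 4 = 16 clocks per pass, so every hub rotation is still used.
    onefill       lockset   lockid wc         ' 8 clocks when already synchronized
                  nop                         ' 4 clocks
            if_c  jmp       #onefill          ' 4 clocks
                  lockclr   lockid            ' got it; release and go on to the second loop

    ' Two fillers: 8 + 4 + 4 + 4 = 20 clocks per pass, so the next lockset stalls
    ' until the rotation at 32 clocks and every other hub rotation is skipped.
    twofill       lockset   lockid wc
                  nop
                  nop
            if_c  jmp       #twofill
                  lockclr   lockid
                  jmp       #onefill

    lockid        long      0                 ' assumed lock ID
    ```

    With no filler at all the pass is 8 + 4 = 12 clocks, and the next lockset simply waits 4 clocks for its hub turn.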

    One could use a djnz value,#$-1 instead of a straight jmp in order to make a failsafe for a locked up situation for whatever cause, but I haven't ever before thought that something like that might be necessary. Seems an unlikely bump in the night. Code that has random hub accesses or decision trees will tend to escape (?!) quickly, eventually.
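
    A minimal sketch of that djnz failsafe, assuming a retry budget of 1,000 attempts and a placeholder recovery path; the labels `entry`, `wait`, `gaveup`, `maxtries` and `tries` are hypothetical:

    ```spin
    DAT
                  org       0
    entry         mov       tries,maxtries    ' reload the retry budget
    wait          lockset   lockid wc         ' C = 1 if the lock was already set
            if_c  djnz      tries,#wait       ' taken: retry until the budget runs out
            if_c  jmp       #gaveup           ' budget exhausted and the lock is still not ours
                  ' ...lock acquired: critical section goes here...
                  lockclr   lockid
                  jmp       #entry

    gaveup        jmp       #entry            ' placeholder recovery: just start over

    lockid        long      0                 ' assumed lock ID
    maxtries      long      1_000             ' assumed retry budget
    tries         res       1
    ```

    Because djnz without wc leaves the C flag alone, the flag still reflects the last lockset when the count reaches zero, which is what lets the second if_c tell a timeout from a successful acquire.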



