strange lock-up "cycle"

gio_rome · 2014-03-18 05:02

Regardless of the code I run, I'm experiencing a strange "cycle" on my Propeller.

Symptoms: every 50-60 seconds the variable values and/or the LCD output lock up. This makes the code malfunction, since its stability relies on the continuity of the values.

Example: I display values on an LCD screen. To try to avoid this strange lock-up I perform an early check, but it's ineffective:

PUB OUT2LCD | clock, i, a_, b_, c_                    ' method I launch within a cog
    clock:=cnt                          ' initialize chronometer
    repeat 
          a_:=var1 '(calculated live in another cog)
          b_:=var2 '(calculated live in another cog)
          c_:=var3 '(calculated live in another cog)
          IF (a_==0) AND (b_==0) AND (c_==0)
              'beep(1,1)
              NEXT
          if (cnt > (clock+clkfreq*0.9)) ' circa every second, print the result, without waiting!
            Clear_LCD                   ' launch clean method
            ...do you thing...like
            BS2.LCD_POS(1,11)
            BS2.LCD_DEC(a_)
            ....
            clock:=cnt                             ' reset chronometer

What do I experience with this code? Every once in a while (circa every 50-60 seconds) the display prints all 0s for each variable.
The other thing I experience is that the calculations I perform with the same var1 (2 and 3) come out wrong whenever this strange "lock-up" happens.

Any idea?

Thanks

G.

StefanL38 · 2014-03-18 05:16

Hi,

every 50-60 seconds sounds like a missed match of a waitcnt-command.

When you execute a waitcnt command, the cog is stopped until the free-running system counter, which never stops, matches your target value.
The system counter counts up completely independently of anything else. Now if your match value is too close to the current value,
cnt has already counted past your match value by the time waitcnt executes. In that case the system counter has to count up to its maximum, roll over to zero and count up to your match value
again. At 80 MHz, counting through 2^32 values takes around 53 seconds.
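
The arithmetic behind that figure can be checked with a small model. This is only a sketch in Python (not Spin), assuming an 80 MHz clock; the helper names `rollover_seconds` and `missed_wait_seconds` are made up for illustration.

```python
CLKFREQ = 80_000_000      # assumed 80 MHz system clock, as in the post
COUNTER_RANGE = 2 ** 32   # the system counter is 32 bits wide


def rollover_seconds(clkfreq=CLKFREQ):
    """Seconds the counter needs to run through all 2^32 values once."""
    return COUNTER_RANGE / clkfreq


def missed_wait_seconds(target, now, clkfreq=CLKFREQ):
    """Seconds until the counter next equals `target`, starting from `now`.

    If `target` is already behind `now`, the counter has to wrap around first.
    """
    return ((target - now) % COUNTER_RANGE) / clkfreq


print(round(rollover_seconds(), 2))                            # 53.69
print(missed_wait_seconds(target=80_000_000, now=0))           # 1.0
print(round(missed_wait_seconds(target=1_000, now=1_001), 2))  # 53.69
```

The last line is the missed match: the target is a single tick behind the counter, so the wait lasts almost a full rollover.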

best regards
Stefan

Ariba · 2014-03-18 06:11

if (cnt > (clock+clkfreq*0.9)) ' circa every second..

You can't multiply by 0.9 without using a floating-point object. Either just wait a full second by eliminating the * 0.9, or multiply and divide with integers:

if (cnt > (clock+clkfreq*9/10)) ' circa every second,

Andy

kuroneko · 2014-03-18 06:31

And never compare cnt against anything else directly in SPIN (it's signed arithmetic), see e.g. the rxtime method in the FullDuplexSerial object for an example.

gio_rome · 2014-03-18 06:42

thank you for the remark. that was very naive of me.
unfortunately I've changed the code but the lock up stays

Ariba wrote: »
You can't multiply by 0.9 without using a floating-point object. Either just wait a full second by eliminating the * 0.9, or multiply and divide with integers:
if (cnt > (clock+clkfreq*9/10)) ' circa every second,
Andy

gio_rome · 2014-03-18 06:47

Hi Stefan,

you seem to have hit the problem.

Unfortunately I cannot understand you very well, so I wonder if you could elaborate more on this.
What exactly am I looking for in the code? In some cog I use the waitcnt.
And it occurs to me now that in the Main I cognew some methods, but at the end I have a "repeat" to keep the code running, in which I perform tasks with
if (cnt > (clock+clkfreq)) (it's actually the out2lcd from before; I don't launch it in a cog, but at the end of the Main)

Is that the right approach?

StefanL38 wrote: »

Hi,

every 50-60 seconds sounds like a missed match of a waitcnt-command.

When you execute a waitcnt command, the cog is stopped until the free-running system counter, which never stops, matches your target value.
The system counter counts up completely independently of anything else. Now if your match value is too close to the current value,
cnt has already counted past your match value by the time waitcnt executes. In that case the system counter has to count up to its maximum, roll over to zero and count up to your match value
again. At 80 MHz, counting through 2^32 values takes around 53 seconds.

best regards
Stefan

Ariba · 2014-03-18 06:49

Thanks Kuroneko, I only saw the float and hadn't checked what the comparison does...

@gio_rome

Try it like that:

if (cnt-clock > clkfreq*9/10) ' circa every second,

Andy

gio_rome · 2014-03-18 07:13

thank you Ariba,

it's not that.

I've tried commenting out that whole portion of the code. It "locks" anyway. It's not that.
It must be some other part of the code.

Tell me now, because I don't know: is the use of cnt in different cogs subject to some kind of restriction? Is there anything in particular I must take care to avoid? Because I seem to be doing it, whatever that is :-/

thank you again

gio_rome · 2014-03-18 07:18

Wait a sec...maybe I know what's wrong, although I don't know /what/'s wrong with it.

I do this in a cog:

     cnt_st:=cnt  
     REPEAT WHILE cnt <= (clkfreq*2/10+cnt_st)    
                    ....things...

The point of this is to make the "things" last exactly that amount of time. Could this be what "skips" and locks up the whole thing?

Ariba · 2014-03-18 07:31

It has nothing to do with the use in different cogs.
The problem is that CNT is a 32-bit counter value that rolls over after circa 54 seconds (2^32 / 80'000'000). Spin handles 32-bit values as signed, so if you read CNT in Spin you see an increasing positive value for 27 seconds and then a negative value (when bit 31 is set) for 27 seconds. So you cannot compare that to an always-positive time value.
If you first calculate CNT-CLOCK you always get a positive result (as long as fewer than 2^31 ticks, about 27 seconds, have elapsed), because both are 32-bit numbers and the Carry gets ignored. I know it's tricky and not easy to understand without some Assembler knowledge.
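
To make the sign flip concrete, here is a rough model of the two comparisons. It is a sketch in Python rather than Spin, assuming an 80 MHz clock; `to_signed32`, `naive_elapsed` and `diff_elapsed` are illustrative names, not part of any Propeller object.

```python
CLKFREQ = 80_000_000   # assumed 80 MHz system clock
MASK32 = 0xFFFF_FFFF


def to_signed32(value):
    """Wrap an integer to 32 bits and read it as a signed value, as Spin does."""
    value &= MASK32
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


def naive_elapsed(cnt, clock, ticks):
    """Models `cnt > clock + ticks` evaluated on signed 32-bit values."""
    return to_signed32(cnt) > to_signed32(clock + ticks)


def diff_elapsed(cnt, clock, ticks):
    """Models `cnt - clock > ticks`: subtract first, then compare."""
    return to_signed32(cnt - clock) > ticks


ticks = CLKFREQ * 9 // 10             # 0.9 s worth of ticks
clock = 0x7FFF_FFF0 - ticks           # the target lands just below the sign flip
cnt = (clock + ticks + 100) & MASK32  # counter is 100 ticks past the target

print(to_signed32(cnt))                  # -2147483564 (bit 31 is set)
print(naive_elapsed(cnt, clock, ticks))  # False: the timeout goes unnoticed
print(diff_elapsed(cnt, clock, ticks))   # True
```

In this case the direct comparison stays false until the counter reads positive again, which is the kind of stall described in this thread.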

Andy

Lawson · 2014-03-18 07:35

gio_rome wrote: »
Wait a sec...maybe I know what's wrong, although I don't know /what/'s wrong with it.

I do this in a cog:
     cnt_st:=cnt  
     REPEAT WHILE cnt <= (clkfreq*2/10+cnt_st)    
                    ....things...
The point of this is to make the "things" last exactly that amount of time. Could this be what "skips" and locks up the whole thing?

Yes.

That line of code suffers from the same bug that Ariba and Kuroneko pointed out. Every time CNT rolls over, direct comparisons of a variable to CNT fail. Instead, always calculate the *difference* between CNT and some reference, then compare that, i.e. (cnt - ref) > one_second. This works because of the properties of fixed-precision math.

Marty

gio_rome · 2014-03-19 01:45

thank you all.

I now use

REPEAT WHILE (clkfreq*2/10+cnt_st-cnt) >0

which doesn't seem to "lock".

what do you think?

G.

Mark_T · 2014-03-19 09:12

Exactly, that code computes a difference, and differences don't roll over....

With computer integers you cannot assume that a+b > c implies a > c - b
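
A small counterexample, sketched in Python as a stand-in for Spin's signed 32-bit math. The names `sum_form`, `moved_form` and `still_running` are hypothetical, and the 16,000,000-tick interval assumes clkfreq*2/10 at 80 MHz.

```python
MASK32 = 0xFFFF_FFFF


def to_signed32(value):
    """Wrap an integer to 32 bits and read it as a signed value."""
    value &= MASK32
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


def sum_form(a, b, c):
    """a + b > c, with the sum wrapped to signed 32 bits."""
    return to_signed32(a + b) > to_signed32(c)


def moved_form(a, b, c):
    """a > c - b, with the difference wrapped to signed 32 bits."""
    return to_signed32(a) > to_signed32(c - b)


def still_running(cnt_st, cnt, ticks):
    """Models the loop condition (ticks + cnt_st - cnt) > 0 from the post above."""
    return to_signed32(ticks + cnt_st - cnt) > 0


# Same three numbers, two "equivalent" inequalities, different answers.
a, b, c = 0, 16_000_000, 0x8000_000A
print(sum_form(a, b, c), moved_form(a, b, c))        # True False

# The difference form keeps working across the sign flip.
ticks = 16_000_000
cnt_st = 0x7FFF_FF00
print(still_running(cnt_st, cnt_st + 1_000, ticks))  # True: time remains
print(still_running(cnt_st, cnt_st + ticks, ticks))  # False: interval is over
```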
