Propeller Chip - Apparent Cog Instability

Paul Voss · 2008-02-18 01:47

I am using a Propeller chip to control small research balloons in the Arctic next month - until recently, everything was going great - the parallel processors are a dream to work with.

However, I have found a very strange instability that I am worried could be a flaw in the chip (hopefully I've just made an idiotic mistake and someone will straighten me out).

I reduced the offending code to a short and simple program (attached). The problem is very fussy, depending on the exact timing of two coginits - change one little thing and the problem goes away. However, as written, the problem reliably occurs on both raw and board-integrated Propsticks with differing power supplies and external connections. Although the symptom is garbled text on HyperTerminal, all the serial code can be removed and the problem still persists - in this case, the LED (if enabled) will flash about 9-15 times and then go out - the main and LED cogs both lock up, so it appears to be a cog interaction. Also note that the debug cog is not specified here and could be stopped by a subsequent coginit - in other tests, I have specified the debug cog and got the same lockup problem.

If some of the smart people on this forum could take a look at the attached code excerpt, I would greatly appreciate it! The code is a bit unusual due to the complexity of the parent program it came from. Note that I am not looking for just a fix (there are many simple changes that miraculously fix the problem) - rather, I need to understand what is going on so that I don't fly unstable code. The balloons need to ship very soon - this problem was an unfortunate last-minute surprise.

Thanks

Paul

Graham Stabler · 2008-02-18 02:21

The c := cnt should come immediately before the waitcnt, or else use waitcnt(clkfreq*2+cnt). As it is, the wait depends on how long the other instructions take to complete. In other words, if you expected a delay of 2 seconds, you get 2 seconds minus the time taken by the other commands.

Might not be the problem but it's true :)

Graham

Mike Green · 2008-02-18 03:39

I would echo Graham's concern. In your MAIN method, you save the system clock, then perform several complex operations and expect to be able to do a WAITCNT for a time 2 seconds later (which works) or 1 second later (which doesn't). You didn't say, but, if you miss the 1 second absolute time mark, your program will wait for several minutes until the absolute time wraps around in the 32 bit counter. That's quite possible.

Another thing is that, by reinitializing the LED and GPS cogs, you're stopping them at an arbitrary place, then restarting them. Also, a COGINIT will take several milliseconds to perform at 10MHz.
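
For reference, a minimal sketch of the deadline-style WAITCNT idiom being discussed, assuming the 10 MHz clock settings from the posted program (the method name and loop body are placeholders, not Paul's code). The counter is read once and each new deadline is derived from the previous one, so the time spent on other work does not stretch the period. The hazard remains: if the work ever overruns the deadline, WAITCNT blocks until the 32-bit counter wraps, which at 10 MHz is 4_294_967_296 / 10_000_000, or roughly 429 seconds.

```spin
CON
  _clkmode = xtal1 + pll2x
  _xinfreq = 5_000_000          ' 10 MHz system clock

PUB Main | t
  t := cnt                      ' read the system counter exactly once
  repeat
    t += clkfreq * 2            ' next deadline: 2 s after the previous one
    waitcnt(t)                  ' blocks until cnt equals t
    DoWork                      ' must finish well inside the 2 s period

PRI DoWork                      ' placeholder for the periodic work
  dira[16] := 1
  !outa[16]
```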

mirror · 2008-02-18 04:11

Mike,
according to Paul the problem occurs at 2 seconds, but not at 1 second - which is opposite to your understanding - and somewhat changes the timing issues.

Paul,
I had a similar garbling of serial characters a while ago (8 to 10 months). It seemed to be dependent on how the code was laid out. Swapping lines of code and changing delays made the problem change - also I only seemed to have a problem when sending non-printable characters.
The horrible answer is that as I wrote more code - a whole lot more - the problem mysteriously disappeared. I never did find out what the problem was, but it hasn't bugged me since. I wasn't starting/stopping/interrupting the cogs like you are, but I did have 7 of the cogs occupied.
I wish I could shed more light. I'm reasonably sure it's not a problem with the chip as when the problem occurred I'd only been using the chip for a very short amount of time (<1 month).

Just one thing out of curiosity, is this what you mean by your case statement?

if (N == i) or (N == 1) 
  coginit(2,GPS_COGCODE,@GpsStack) 'COMMENT OUT COGINIT (NOT I,1) AND CODE RUNS 
elseif (N == i+1) or (N == 2) 
  ' DO NOTHING

Paul Voss · 2008-02-18 05:00

Thanks for the quick response. I'll take them in order..

First, on the c:=cnt being reversed - I agree it is awkward; however, it was a deliberate reversal due to the multiple possible threads after c is initialized - there is simply no place to put the waitcnt at the end of the (real) program that works for all scenarios. Putting the waitcnt at the very beginning works well, with the one exception that I need to be careful on the first pass through (before the c:=cnt line is executed).

Second, on starting and stopping the cogs at arbitrary places - I thought this was ok - it's only flashing an LED and all pins revert to 0 when the cog is stopped (LED off). Perhaps not the ideal way to do it, but it seems it should not be causing the lockup I am seeing.

Mike - I think you and I may have had the same issues. The "mysteriously disappear" thing would normally work - however, with the flight issues, I need to be 100% certain what is going on. The code I posted seems very simple and should not be causing lockup problems. Hopefully, someone will prove me wrong - show me my error. This is what I am hoping for. And yes, the logic in your if-then statement is what is in the case I believe.

Thanks again - please let me know if you have further thoughts - this remains a deep (and urgent) mystery.

Phil Pilgrim (PhiPi) · 2008-02-18 05:05

Paul Voss said...
... and all pins revert to 0 when the cog is stopped (led off).

'Probably not relevant, but in need of correction just the same: all pins revert to inputs when the cog is stopped.

-Phil

Mike Green · 2008-02-18 05:18

Paul,
It's impossible to be 100% sure with your program as written. You've written it with the potential for a fault ... missing a WAITCNT time. You need to do a timing analysis of the delays introduced by the COGINITs and the debug output or you need to rewrite the code to be independent of the delays introduced by them. Unfortunately, there's no execution time chart for the Spin operators to make it easy. You'll need to do some testing to determine the actual time involved. I'm a firm believer in designing programs to do what they need to do rather than relying on testing. You can't always do that or do it completely, but, to the extent you can do it, it improves your program's reliability and your faith in it.
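
One way to gather such timing numbers is to bracket the operation with reads of CNT. The sketch below is only an illustration: Worker, WorkStack and the stack size are hypothetical, and COGNEW is used in place of a fixed-cog COGINIT so the measurement does not disturb a running cog. It measures only the time the calling cog spends in the launch; the new cog may take a little longer before it actually begins executing.

```spin
CON
  _clkmode = xtal1 + pll2x
  _xinfreq = 5_000_000          ' 10 MHz system clock, as in the posted program

OBJ
  dbug : "FullDuplexSerial"

VAR
  long  WorkStack[32]           ' hypothetical stack size for this sketch

PUB Main | t0, ticks
  dbug.start(31, 30, %0000, 19200)
  t0 := cnt                     ' timestamp before the launch
  cognew(Worker, @WorkStack)    ' operation being timed
  ticks := cnt - t0             ' elapsed system-clock ticks (wrap-safe)
  dbug.dec(ticks / (clkfreq / 1_000_000))  ' report microseconds
  dbug.tx(13)
  repeat                        ' keep the main cog alive

PRI Worker                      ' hypothetical worker that just idles
  repeat
    waitcnt(clkfreq + cnt)
```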

Paul Voss · 2008-02-18 05:28

Hi Phil,

I agree waitcnt timing is critical. In this case though, the delay of the coginits is a few tens of milliseconds vs the 2 seconds of the waitcnt - so it's not even close to causing a problem in this instance. Also, the symptom I see (printing random characters continuously to HyperTerminal, or all the cogs freezing forever) is completely different from what would be caused by a hanging waitcnt. Thanks for your reply.

Paul

Phil Pilgrim (PhiPi) · 2008-02-18 05:55

Paul,

I'm not a huge fan of using COGINIT in Spin. In my mind, it's too dangerous, presupposes too much, and ought to be banned from the language. My recommendation: just don't use it! Here's a version of your code that uses COGNEW and COGSTOP and seems not to suffer the hangups that occur in the original code:

''PROGRAM TO DEMONSTRATE COG CONFLICTS

CON
  _clkmode = xtal1 + pll2x                               ' phase lock loop multiplier for clock
  _xinfreq = 5_000_000                                   ' base clock frequency is 5 MHz
  led      = 16

OBJ
  dbug : "FullDuplexSerial"                        'include serial object for hyperterm

VAR
  long  LedStack[500]                                   'led cog memory
  long  GpsStack[500]                                   'gps cog memory

PUB MAIN | c,i,N,iGPS,iLED

 'PROGRAM COUNTS 1,2,3,4,5,6 AND THEN SPITS GARBAGE TO SCREEN OR FREEZES (RUN SEVERAL TIMES)

  dbug.start(31,30,%0000,19200)                    'start dbug serial on its own cog

  N:=6
  c:=cnt
  iLED := -1
  iGPS := -1
  repeat
    repeat i from 1 to N
      waitcnt(clkfreq*2+c)                         'PROB OCCURS FOR 2 SEC  DELAY, BUT NOT 1 SEC??
      c := cnt                                     'WAITCNT MUST BE BEFORE C:=CNT FOR PROB TO OCCUR
      dbug.dec(i)                                  'prints loop counter
      dbug.tx(10)
      dbug.tx(13)
      case N
        i,1    :
          if (iGPS => 0)
            cogstop(iGPS)
          iGPS := cognew(GPS_COGCODE, @GpsStack)
        i+1,2  :                                   'does nothing, but the delay is critical to problem
      if (iLED => 0)
        cogstop(iLED)
      iLED := cognew(LED_COGCODE, @LedStack)

PRI GPS_COGCODE
  repeat
  'nothing needed here

PRI LED_COGCODE | c                                'FLASH LED ON PIN 27 (DISABLED - SEE BELOW)
  outa[led] := 0
  dira[led] := 1
  repeat
    c := cnt
    outa[led] := 1                                  'outa=0, led off to eliminate power issues
    waitcnt(clkfreq/10+c)
    outa[led] := 0
    waitcnt(clkfreq+c)

-Phil

deSilva · 2008-02-18 08:11

Paul, I have not looked into any of your code; others already have.....

However the things you describe GENERALLY have one cause only - stack (or other memory) overflows.
I notice in Phil's posting that you allocate 500 LONGs for them. This is curious. Either you need so much.... so are you sure you do not need even more???

On the other hand this is considerable space... Are you sure the main COG has still got enough memory? You have no _STACK safety belt instruction in it...

A second remark: It is generally a better technique to use WAITCNT as it is intended, as "waiting up to a deadline", and refer to CNT once only, rather than twist it into a "delay" instruction....

mirror · 2008-02-18 10:19

Phil,
Did you try Paul's first version of code before writing your own? Were you able to reproduce the problem, or did you just write the most likely workaround?
Why not use coginit - if it is part of the language then why not use it?


hippy · 2008-02-18 11:02

mirror said...
Why not use coginit - if it is part of the language then why not use it?

It's more error prone than CogNew, requiring that the Cog to be used isn't already in use. With
the right precautions and checks it is okay. To know which Cogs are available is not always easy
when using sub-objects, and, once working, hard-wired CogInits may cause program failure if sub-
objects are changed or more sub-objects are added, or where the code is included as a sub-
object itself.

A CogInit stops whatever may be running in that Cog even if essential to the program, and there's
no feedback on whether the Cog was previously in use or not.

I wouldn't ban CogInit but would recommend CogNew in preference unless there were compelling
reasons to use CogInit.

Post Edited (hippy) : 2/18/2008 11:10:22 AM GMT

Martin Hebel · 2008-02-18 11:08

I had tested/recommended issuing at a minimum a cogstop prior to the Coginit (re-init) to Paul, while recommending he post here also. Hippy/all, would that make it a more stable use by ensuring it is shut down first prior to another coginit? Or is there a possibility counters may still be running leading to possible instability?

-Martin

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
SelmaWare Solutions - StampPlot GUI for controllers, XBee and Propeller Application Boards

Southern Illinois University Carbondale, Electronic Systems Technologies

American Technical Educator's Assoc. Conference - April, Biloxi, MS. -- PROPELLER WORKSHOP!

hippy · 2008-02-18 11:14

@ Martin : I don't really see any advantage in a CogStop before a CogInit, and wouldn't expect that
to result in any different stability.

I personally wouldn't CogStop then CogNew/CogInit unless I had to. I prefer to get the Cogs running
at the start of the program and then control them by updating 'shared variables', but appreciate that
may not always be possible.
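
A minimal sketch of that approach, assuming an LED on pin 16 as in Phil's version (ledEnable, LedLoop and the timings are hypothetical names and values chosen for illustration). The LED cog is launched once and never restarted; the main cog switches its behavior by writing a flag in hub RAM that both cogs can see.

```spin
CON
  _clkmode = xtal1 + pll2x
  _xinfreq = 5_000_000
  LED_PIN  = 16                 ' assumed LED pin

VAR
  long  LedStack[32]            ' hypothetical stack size
  long  ledEnable               ' shared flag: nonzero = flash, zero = off

PUB Main
  ledEnable := 0
  cognew(LedLoop, @LedStack)    ' started once, never stopped or restarted
  repeat
    ledEnable := 1              ' ask the LED cog to flash
    waitcnt(clkfreq * 5 + cnt)
    ledEnable := 0              ' ask the LED cog to go dark
    waitcnt(clkfreq * 5 + cnt)

PRI LedLoop
  dira[LED_PIN] := 1
  repeat
    if ledEnable
      !outa[LED_PIN]            ' toggle while enabled
    else
      outa[LED_PIN] := 0        ' hold the LED off while disabled
    waitcnt(clkfreq / 10 + cnt)
```

The trade-off is the one mirror raises later in the thread: an idle cog that keeps polling still draws power, which a stopped cog does not.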

Phil Pilgrim (PhiPi) · 2008-02-18 17:07

Mirror,

Yes, I tried his original code and was able to reproduce the problem. I made the least modification possible to exorcize the COGINITs, and that seems to have fixed things. (I added the CON for the LED pin, only because pin 27 isn't available on my proto board.)

Just because a feature is available is no reason to use it. Many otherwise structured languages provide GOTO, too, but good coding practice disdains its use.
___

Hippy,

I agree: stopping and restarting a cog is not the best practice. If the interval between stopping and restarting were long enough, I could see it as a way to save power, but that's about it.

-Phil

deSilva · 2008-02-18 20:44

This COGNEW/COGSTOP/COGINIT discussion is not terribly helpful, and a lot of prejudices shine through.
It is most unlikely that anything discussed above is the cause for the problems.

Stack-Overflow...
DeSilva now took a deeper look at the program.. No memory usage of any kind at all...
Other funny things.. DeSilva had never seen such a CASE construct.... but there is nothing against it in the Manual... However:

i+1,2  :                                   'does nothing, but the delay is critical

No, that cannot be!!
Notice that CASE is not used very often... many programmers shy away from this construct, and there have been reports from time to time that it needs an unclear amount of STACK...

Some experimenting shows that a problem occurs ONLY with a most specific CASE match pattern...
As soon as you omit ",1" or ",2" everything works fine.
It also seems to run when ordering the values, i.e. "1,i:" and "2,i+1:"

Conclusion: There is something weird with complex match patterns, disturbing the stack. This might or might not self-repair in a normal program, but COGINIT after some case labels seems to be very susceptible to it...

Post Edited (deSilva) : 2/18/2008 8:50:42 PM GMT

mirror · 2008-02-18 22:15

deSilva, The case statement is a little strange in any case, as half the conditions will never be true. N is a constant - during program execution - with a value of 6, which will never be equal to 1 or 2. So the original case statement is logically equivalent to the following pseudocode:

case i
  5 :     ' Do nothing
  6 : CogInit(GPS)
CogInit(LED)

When I mentioned that I had a problem some months ago, there was no stopping and starting of cogs but there was a huge case statement used to parse incoming comms messages. So maybe there is something with case statements!?

Hippy, There is reason to start and stop cogs and Paul has identified the situation where it is of most benefit. He is trying to save power by running a slow clock (10 MHz) and by having the minimal number of tasks (cogs) running at a time.

Phil, I'm not convinced of the error of using coginit. Once again Paul has identified the exact situation where it's needed, for complete control of the cog allocation. Maybe it's possible that there's a problem if a cog is forcibly stopped while in the middle of a wait instruction or some other specific situation!?

What Paul has so kindly given us is a minimal piece of code which demonstrates the problem. His logic and deduction are superior to have been able to give us such a concise piece of code with which even Phil was able to reproduce the problem (thanks Phil).

Unfortunately I'm without usable hardware right at this moment. I'm interested in this thread because I've seen the instability Paul has spoken of but in a different set of circumstances. It's not the sort of thing that's going to stop me from enjoying my work with the Propeller, but if we can identify the BEWARE then we will all write better code!

Paul Voss · 2008-02-18 22:31

Thanks all for the good discussion - All your good comments have convinced me that clobbering running cogs with a coginit is not good practice (perhaps allowed, but not a good idea). I have changed my code so that all cogs stop on their own before they are ever hit with a coginit. The code appears to be 100% stable now. Thank you!!

I will also be careful using the case statement - it is efficient in my real code because there are many possible values of N and hence all checked cases do see action. Nonetheless, if I continue to have any problems, it would be simple enough to replace case with some if-then lines.

If the code behaves well over the next 24 hours, I will be able to ship the balloons. In a couple weeks, live flight data will be posted on www.science.smith.edu/cmet.

Thanks to all.

Paul

deSilva · 2008-02-18 23:01

Good to hear!
But please, mirror et al.: Listen to what I posted, not to your prejudices wrt COGINIT, which is a very fine and reliable instruction.

And yes, there is something with case match patterns

Post Edited (deSilva) : 2/18/2008 11:10:36 PM GMT

Phil Pilgrim (PhiPi) · 2008-02-18 23:09

Mirror,

'Sorry to disagree, but there's no good reason ever to have or want complete control over cog allocation. All the cogs are alike. Why insist on picking one over another? It doesn't save any time, and doing so can be a recipe for disaster, particularly when using third-party objects that spawn their own cogs. The Propeller provides a completely transparent cog allocation mechanism via COGNEW, which returns the cog number for those rare occasions when you need to know it. It's simply the right tool for the job.

Of course, the question of why Paul's original program fails is still open, and those into Propeller program pathology can perhaps dig up a cause. My approach is more like the doctor in the following conversation:

    Patient: "Doc, it hurts when I use COGINIT."
    Doctor: "Then don't use COGINIT."

-Phil

Paul Baker · 2008-02-18 23:16

While we don't specifically check for it, we strongly recommend that objects on the exchange never use COGINIT, for the very reasons Phil has said. And in general, unless you have a very specific reason for using it, it should be entirely avoided.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Paul Baker
Propeller Applications Engineer

Parallax, Inc.

deSilva · 2008-02-18 23:24

Patient: "Doc, I live under the misconception of being a dog and moving on my four limbs amazes my family."
Doctor. "Just sit up and beg."

deSilva · 2008-02-19 01:42

Sorry, I lost all of my optimism.

After some more testing it becomes curiouser and curiouser. It HAS TO DO with COGINIT, WAITCNT, but with many things more..... Maybe I shall start reading the bytecode....
However I still do not think it is an issue of COGINIT but a compiler bug... I can produce the issue with IF as well, not needing a CASE.... But there is NEVER an issue with linear code and simple loops....

I give up for the moment...

OzStamp · 2008-02-19 01:52

Hi

It would be good if the creator " herr Chip " could step in here and sniff it out..

cheers Ron

deSilva · 2008-02-19 02:27

This is the smallest piece of code I can produce the problem with. Note that when changing ANYTHING here it will work fine.
E.g. removing the unused local variable in CHECK or changing the WAIT intervals...

''PROGRAM TO DEMONSTRATE COG CONFLICTS
' changed by deSilva

CON
  _clkmode = xtal1 + pll8x                
  _xinfreq = 10_000_000     'attention: HYDRA settings                       

OBJ
  dbug : "FullDuplexSerial"                      

VAR                                                     
  long  aStack[100]
  long  bStack[100]

PUB MAIN | c,i ,N
 
  dbug.start(31,30,%0000,19200)         

  c:=cnt
  repeat
    repeat i from 1 to 2   
       waitcnt(clkfreq/2+c)                    
       c := cnt
       dbug.dec(N++)                                      
       dbug.tx(10)                                            

 ' Remove the following two lines and it works fine for a longtime...
      IF N==10    
           cognew(check,@bStack) 

      repeat 3000 ' this will give enough time to settle COGNEW
      coginit(3, check, @aStack) 
      

PRI CHECK | d
  outa[0] := 0
  dira[0] := 1
  repeat                             
    waitcnt(clkfreq/2+cnt)

Post Edited (deSilva) : 2/19/2008 2:33:20 AM GMT

mirror · 2008-02-19 02:43

deSilva said...
Sorry, I lost all of my optimism.

After some more testing it becomes curiouser and curiouser. It HAS TO DO with COGINIT, WAITCNT, but with many things more..... Maybe I shall start reading the bytecode....
However I still do not think it is an issue of COGINIT but a compiler bug... I can produce the issue with IF as well, not needing a CASE.... But there is NEVER an issue with linear code and simple loops....

I give up for the moment...

We all hope that it might be a compiler bug . . .

The other possibility is a bug in the interpreter - but I really really hope not.

Chip is pretty amazing, but to err is human. It wouldn't be the first chip with an errata, so that in itself doesn't bother me. What bothers me is that I possibly stumbled into and back out of it months ago without being able to extract a sufficiently compact piece of code to post to the forum at the time.

hippy · 2008-02-19 03:15

I cannot see anything obviously wrong with the bytecode ...

0020                         PINIT    ALIGN    SPIN 

====                                  ; PUB MAIN | c,i ,N
====                                  ;   dbug.start(31,30,%0000,19200)
====                                  ;   c:=cnt
====                                  ;   repeat
====                                  ;     repeat i from 1 to 2
====                                  ;        waitcnt(clkfreq/2+c)
====                                  ;        c := cnt
====                                  ;        dbug.dec(N++)
====                                  ;        dbug.tx(10)
====                                  ;  ' Remove the following two lines and it works fine for a longtime...
====                                  ;       IF N==10
====                                  ;            cognew(check,@bStack)
====                                  ;       repeat 3000 ' this will give enough time to settle COGNEW
====                                  ;       coginit(3, check, @aStack)

                                      ALIGN    STACK          ; For S5 

      +0000                           LONG     0              ; Unused Result Variable
      +0004                  VL1      LONG     0
      +0008                  VL2      LONG     0
      +000C                  VL3      LONG     0

                                      ALIGN    SPIN 

0020         01              S5       FRAME    CALL WITHOUT RETURN VALUE
0021         37 24                    PUSH     #$1F
0023         38 1E                    PUSH     #$1E
0025         35                       PUSH     #0
0026         39 4B 00                 PUSH     #19200
0029         06 03 01                 CALLOBJ  O11, +1
002C         3F 91                    PUSH     CNT
002E         65                       POP      VL1
002F         36              J6       PUSH     #1
0030         69                       POP      VL2
0031         35              J7       PUSH     #0
0032         C0                       PUSH     MEM[]
0033         37 00                    PUSH     #2
0035         F6                       DIV
0036         64                       PUSH     VL1
0037         EC                       ADD
0038         23                       WAITCNT
0039         3F 91                    PUSH     CNT
003B         65                       POP      VL1
003C         01                       FRAME    CALL WITHOUT RETURN VALUE
003D         6E AE                    USING    VL3 PUSH POSTINC
003F         06 03 09                 CALLOBJ  O11, +9
0042         01                       FRAME    CALL WITHOUT RETURN VALUE
0043         38 0A                    PUSH     #10
0045         06 03 07                 CALLOBJ  O11, +7
0048         6C                       PUSH     VL3
0049         38 0A                    PUSH     #10
004B         FC                       EQ
004C         0A 07                    JPF      N8
004E         37 00                    PUSH     #2
0050         CB 81 90                 PUSH     #L43
0053         15                       MARK
0054         2C                       COGISUB
0055         39 0B B8        N8       PUSH     #3000
0058         08 02                    LOOPJPF  N10
005A         09 7E           J9       LOOPRPT  J9
005C         37 00           N10      PUSH     #2
005E         43                       PUSH     #L42
005F         15                       MARK
0060         37 21                    PUSH     #3
0062         3F 8F                    PUSH     MEM+15
0064         37 61                    PUSH     #$FFFFFFFC
0066         D1                       POP      MEM[][]
0067         2C                       COGISUB
0068         36                       PUSH     #1
0069         37 00                    PUSH     #2
006B         6A 02 43                 USING    VL2 RPTINCJ J7
006E         04 FF BE                 GOTO     J6
0071         32                       RETURN

cgracey · 2008-02-19 04:00

deSilva said...
This is the smallest piece of code I can produce the problem with. Note that when changing ANYTHING here it will work fine.
E.g. removing the unused local variable in CHECK or changing the WAIT intervals...

I believe the problem is due to the (undocumented) fact that when the Spin instructions COGNEW and COGINIT are used to launch other Spin routines (as opposed to just assembly code), a special sequence of Spin bytecodes in ROM is called to build the initial stack frame for the soon-to-be-launched Spin routine. This process takes a little time. After it is completed, the actual COGINIT is executed to kick off the cog with the Spin interpreter pointed to the newly-initialized stack frame.

The blow-up occurs when this new stack frame being built is in the same area that a Spin cog is already working in. This can cause nasty problems, but may not always, making it all the more dangerous.

I'm confident that if the above example were modified so that the COGINIT used alternating stack areas (not always "@aStack"), there would be no problem, as the new stack frame being built wouldn't already be in active use.
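
A minimal sketch of the alternating-stack idea, assuming the same fixed cog number and stack sizes as deSilva's example (Relaunch, useB and the clock settings are hypothetical additions for illustration). Each relaunch builds the new stack frame in the area the running cog is not currently using. This only illustrates the theory described above; it has not been confirmed against the failing program.

```spin
CON
  _clkmode = xtal1 + pll2x
  _xinfreq = 5_000_000
  WORK_COG = 3                  ' fixed cog, as in deSilva's example

VAR
  long  aStack[100]
  long  bStack[100]
  long  useB                    ' 0 = next launch uses aStack, 1 = bStack

PUB Main
  repeat
    Relaunch
    waitcnt(clkfreq / 2 + cnt)

PRI Relaunch
  if useB
    coginit(WORK_COG, Check, @bStack)
  else
    coginit(WORK_COG, Check, @aStack)
  useB ^= 1                     ' flip so the next launch avoids the live stack

PRI Check
  repeat
    waitcnt(clkfreq / 2 + cnt)
```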

Also, a COGSTOP before the COGINIT, in this case, would solve this problem. However, it would introduce a new possible problem of allowing other cogs to grab that temporarily-stopped cog by a COGNEW of their own, before your own COGINIT would actually execute.

Perhaps this would be the simplest solution: have the Spin routine that you are referencing in the COGINIT ("check" in this case) consist of nothing but a call to another Spin routine, which would then form the loop, and never return. This would build the stack to a height that would exceed the top-most long being modified by the re-launching COGINIT.
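
A minimal sketch of that wrapper arrangement, reusing the names from deSilva's example (CheckLoop is a hypothetical helper and the clock settings are assumed). Check does nothing but call CheckLoop, so while the cog is running, its active stack sits one call frame above the longs that a relaunching COGINIT rewrites. Whether that margin is enough is the theory stated above, not something verified here.

```spin
CON
  _clkmode = xtal1 + pll2x
  _xinfreq = 5_000_000

VAR
  long  aStack[100]

PUB Main
  repeat
    coginit(3, Check, @aStack)  ' relaunch into the same stack area each time
    waitcnt(clkfreq / 2 + cnt)

PRI Check                       ' entry point named in the COGINIT
  CheckLoop                     ' only calls the real routine, never returns

PRI CheckLoop                   ' hypothetical helper holding the actual loop
  dira[0] := 1
  repeat
    waitcnt(clkfreq / 2 + cnt)
```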

Basically, relaunching Spin code into an already-being-used stack area is like playing Russian Roulette. You need to either do a COGSTOP first, use a different stack area, or know that the already-active stack is currently at a height which won't mind its bottom being modified.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔

Chip Gracey
Parallax, Inc.

Post Edited (Chip Gracey (Parallax)) : 2/19/2008 4:17:03 AM GMT

OzStamp · 2008-02-19 04:08

Thanks Chip.

So for all those accusations... there is no compiler error.. no hardware bug..

No mystery.. just plain facts.. just what was needed
Thanks Chip.. take care..

Ron Nollet Mel OZ

cgracey · 2008-02-19 04:41

OzStamp said...
Thanks Chip.

So for all those accusations... there is no compiler error.. no hardware bug..

No mystery.. just plain facts.. just what was needed
Thanks Chip.. take care..

Ron Nollet Mel OZ

Ron,

I hope this is it. DeSilva's example made it easy for me to see.

Perhaps one of you can confirm the theory. I've just got Propeller II stuff in front of me.

I could modify the compiler to generate roughly this sequence in response to a COGINIT(cognum, spinroutine, @stack):

  COGINIT(cognum, @asmloop, 0)            'sort of like COGSTOP, but keeps the cog tied up
  COGINIT(cognum, spinroutine, @stack)    'do COGINIT as usual

DAT

asmloop   jmp   #0                       'an assembly-language infinite loop

Can any of you think of any related pitfall scenarios that might still be out there? Would this compiler modification be a good idea? It would only apply to the case of COGINIT being used to launch a Spin routine, and would always burden that sequence with perhaps 10 bytes of code.

And, if any of you can provide examples of problems with CASE, I'm very interested in addressing this. I don't know of any trouble, myself, but a few of you mentioned there might be some issues.

Thanks.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔

Chip Gracey
Parallax, Inc.

Post Edited (Chip Gracey (Parallax)) : 2/19/2008 4:49:14 AM GMT

Mike Green · 2008-02-19 05:07

Chip,
Please leave the compiler as it is (in regard to this case). This is an issue for documentation. This goes under Tricks and Traps and hopefully becomes one of many examples in an "Introduction to Multiprocessing with the Propeller" tutorial.
