When cogs collide
varnon
Posts: 184
I've noticed that when using objects that launch new cogs, such as the Parallax Serial Terminal or FSRW, the order in which the cogs are launched seems to matter. Sometimes, if one object is started before another, the second object will not start properly. I really don't understand why this happens. I thought cognew looked for an unused cog and launched code there, so I don't see why there would be conflicts.
I've scraped together some methods from other projects to give you an idea of what I am talking about. Here I test the Parallax Serial Terminal, FSRW, RealRandom, and a clock method I use. I have had order issues with each one of these.
CON
  {{ Sets the clock mode }}
  _clkmode = xtal1 + pll16x
  _xinfreq = 5_000_000

OBJ
  pst     : "Parallax Serial Terminal"
  SD      : "FSRW"                        ' Uses "mb_small_spi"
  RR      : "RealRandom"
  numbers : "Simple_Numbers"

VAR
  long clock
  long clockstack[10]
  byte ClockCogID

PUB Main
  Write
  Read

PUB Write
  LaunchEverything
  SD.popen(string("Testfile.txt"),"w")
  pst.str(string("Test file created...",13,13))
  repeat 10
    SD.pputs(numbers.dec(randomrange(10,99)))
    SD.pputc(32)
    SD.pputs(numbers.dec(time(0)))
    SD.pputc(13)
  SD.pclose
  pst.str(string("Data written...",13,13))

PUB LaunchEverything
  pst.start(115_200)
  pst.str(string(13,13,"PST starting...",13,13))
  rr.start
  pst.str(string("RealRandom starting...",13,13))
  ClockCogID:=cognew(sysclock, @clockstack)
  pst.str(string("Clock starting...",13,13))
  sd.mount_explicit(0,1,2,3)
  pst.str(string("SD card mounted...",13,13))

PUB Read | x
  SD.popen(string("Testfile.txt"),"r")
  pst.str(string("Test file opened...",13,13))
  x:=SD.pgetc
  repeat until x==-1
    pst.char(x)
    x:=SD.pgetc
  pst.char(13)
  SD.pclose
  StopEverything

PUB StopEverything
  SD.unmount                              ' Unmounts the SD card.
  pst.str(string("SD card unmounted...",13,13))
  rr.stop
  pst.str(string("RealRandom stopped...",13,13))
  Cogstop(ClockCogID)                     ' Stops the clock.
  pst.str(string("Clock stopped...",13,13))
  waitcnt(clkfreq/1000*2000+cnt)
  pst.stop

PUB RandomRange(Minimum,Maximum) | value, range
  range:=||(Minimum-Maximum)              ' The range is the absolute value of the difference between the numbers.
  value:=rr.random                        ' Use RealRandom to generate a random number.
  value:=||(value//(range+1))             ' Divide the random number by range+1 and save the remainder.
  value+=Minimum                          ' The final value is the remainder plus the smaller number.
  return value                            ' Return the selected value.

PUB Time(Event)
'' Reports time since an event. If time(0) is called it returns the current time.
  return (Clock-Event)

PRI SysClock | systime
  systime:=cnt                            ' Finds current value of system counter.
  repeat
    waitcnt(systime += clkfreq/1000)      ' Waits one millisecond. A synchronized pause.
    clock++                               ' After a millisecond passes, add 1 to the clock.
To test the cog order issues I changed the order the cogs were launched in LaunchEverything. Print lines were commented out if PST was not launched yet.
I found that the following orders were successful:
pst, rr, clock, fsrw
rr, pst, clock, fsrw
rr, clock, pst, fsrw
fsrw, clock, rr, pst
fsrw, clock, pst, rr
These orders caused the program to lock up when attempting to launch one of the objects:
pst, rr, fsrw, clock
rr, clock, fsrw, pst
rr, fsrw, clock, pst
Previously I have found issues where FSRW could write files, but not read them until another cog was closed. I was not able to replicate this issue. I have also previously found that the clock may return incorrect values when the cogs are launched in certain orders, but I was not able to replicate this either.
Does anyone have any explanation for why this happens? It is easy enough to change the order of the code, but I feel like that should be unnecessary. Any thoughts?
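On the cognew question: cognew does pick a free cog, and it returns that cog's ID (0 to 7), or -1 when no cog is available. A minimal sketch for ruling out the "no free cog" case is shown here. It reuses the clock method from the test program; the variable name clockCog is only illustrative, and it is declared as a long because a byte variable cannot hold -1 (it would read back as 255).

```spin
CON
  _clkmode = xtal1 + pll16x
  _xinfreq = 5_000_000

OBJ
  pst : "Parallax Serial Terminal"

VAR
  long clock
  long clockstack[10]
  long clockCog                           ' Illustrative name; long so that -1 is preserved.

PUB Main
  pst.start(115_200)
  clockCog := cognew(SysClock, @clockstack)
  if clockCog == -1                       ' cognew returns -1 when no cog is free.
    pst.str(string("No free cog for the clock", 13))
  else
    pst.str(string("Clock running in cog "))
    pst.dec(clockCog)
    pst.char(13)

PRI SysClock | systime
  systime := cnt                          ' Current value of the system counter.
  repeat
    waitcnt(systime += clkfreq/1000)      ' Wait one millisecond.
    clock++
```

This only rules out running out of cogs. It would not catch a driver that starts in its cog and then misbehaves, which is a separate question from whether cognew succeeded.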
Comments
It may well be something else but we should check the simple explanations first.
Andy
Good point, that seems to be the case here (20Mb reader).
I'm sure an abort trap would work fine, but I don't think it would really solve the problem. I would just be presented with an abort message printed to the screen instead of a lock-up. I actually do have abort traps for the SD card being present in my actual programs.
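For readers unfamiliar with the idea, a minimal sketch of an abort trap around the mount call might look like the following. It uses the same pins as the test program and assumes that FSRW reports a failed mount by aborting with a negative code; check the FSRW source for the values it actually uses.

```spin
CON
  _clkmode = xtal1 + pll16x
  _xinfreq = 5_000_000

OBJ
  pst : "Parallax Serial Terminal"
  sd  : "FSRW"

PUB Main | err
  pst.start(115_200)
  err := \sd.mount_explicit(0, 1, 2, 3)   ' "\" traps any abort raised inside the call.
  if err < 0                              ' Assumption: failures abort with a negative code.
    pst.str(string("Mount aborted with code "))
    pst.dec(err)
    pst.char(13)
  else
    pst.str(string("SD card mounted...", 13))
```

As noted above, a trap like this only reports a failure. It does nothing if the driver hangs without ever aborting.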
I switched the SPI object to mb_safe_spi. So far things are okay. I can't remember if there was a reason I was using mb_small_spi or not. Is there any good reason not to use mb_safe_spi? I can't imagine speed will be an issue for my uses. I do some quick data notes to the SD card during the main program, but any complicated processing and reading/writing is done at the end of the program when time is not an issue.
I'll keep testing to see if I can get any other collisions.
Thanks.
There can hardly be a corner case of the Prop left unexplored by those inclined to try to break such things.
My money is on a fault in your code somewhere.
Unless, of course, this is one of the first stitches of the universe coming undone as we approach the end of time:)
Not really a very helpful response. Did you read my post? I wasn't intending to suggest a Propeller bug. (Maybe "lock-up" is a loaded phrase? Programming is at best a tertiary skill/hobby for me, so my language is not always the best.) I think you are correct. It is the code. But I only wrote a small fraction of the code in this example. Given the simplicity of my clock cog, I really doubt that it is causing an issue. Unfortunately, my PASM isn't great and I haven't had any luck looking into the other code to see what the error might be.
I haven't been able to produce any errors since I switched FSRW to safe_spi, so I think I can point in the direction of the problem. I do wish I understood it a bit better though.
If you want more details check this thread about [thread=127653]waitpxx cog synchronisation[/thread], especially the section about counter DUTY mode.
Thanks for the help. I knew you guys would have some good (and simple) idea that I missed.
(Thanks for the link too, I'll read up on that.)