FYI: Potential Gotcha with multi-core processing
Beau Schwabe
The original code was from a friend of mine asking for programming help.
The idea of the code is to sync two or more Propellers to an event initiated by a master Propeller. As it stands, the Propellers should be able to communicate with one another over asynchronous serial.
So, the flow should go something like this ...
1) Master sends a broadcast command via asynchronous serial to all other Propellers that a synchronized event is about to happen.
2) This cues the Clients to 'look' for the sync pulse, initiating the event.
3) Over the same transmission line, the master Propeller sends out a 'sync pulse' which is NOT asynchronous serial.
4) The Clients see the pulse and execute code accordingly.
This approach seems pretty straightforward, but it won't work as described. Why?
Answer)
Since the FullDuplexSerial object, like other serial objects, runs in its own cog, it's possible to hand serial data to the FullDuplexSerial object and return to the calling cog BEFORE all of that data has been sent. This is especially true if your baud rate is slow. In this scenario the Master can queue the request and then send the sync pulse while the FullDuplexSerial cog is still shifting out the serial data, thus corrupting it. As a result of the corrupted data, the Clients would never be able to determine that there was an upcoming sync event at all.
To see this visually, look at the attached image alongside the commented, corrected code.
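To put rough numbers on the problem: the message in the code is 16 bytes ("Serial Enabled" is 14 characters, plus the carriage return and the SyncBit byte). At 9600 baud with 8N1 framing each byte takes 10 bit times, so the whole message needs about 16 x 10 / 9600 = 16.7 ms on the wire, while a delay of clkfreq/5000 is only 0.2 ms, roughly two bit times. The sketch below is one way to size the wait from the byte count instead of guessing. It is only an illustration: the method name WaitForBytes, the constants, and the 1 ms margin are my own assumptions, and it assumes the serial cog sends the bytes back to back.

```spin
CON
  _clkmode = xtal1 + pll16x
  _xinfreq = 5_000_000

  BAUD           = 9600
  BITS_PER_BYTE  = 10                   ' start + 8 data + stop (8N1 framing assumed)

PUB WaitForBytes(count) | ticks
  '' Hypothetical helper: block the calling cog for at least as long as
  '' the serial cog needs to shift out `count` bytes at BAUD.
  ticks := (clkfreq / BAUD) * BITS_PER_BYTE * count   ' clock ticks for `count` frames
  ticks += clkfreq / 1000                              ' 1 ms safety margin (arbitrary choice)
  waitcnt(ticks + cnt)
```

Calling WaitForBytes(16) after the three Ser calls waits about 17.7 ms at 80 MHz. Because some bytes have already gone out by the time the wait starts, this errs on the long side, which is what you want here.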
Below is the original code sent to me trying to accomplish this task.
CON
  _clkmode = xtal1 + pll16x
  _xinfreq = 5_000_000

OBJ
  Ser : "FullDuplexSerial"      'Used for Debug output

CON
  RX      = 31
  TX      = 30
  SyncBit = 37

Pub Start
  Ser.start(RX,TX, 0, 9600)
  repeat
    Ser.str(string("Serial Enabled"))
    Ser.tx(13)
    Ser.tx(SyncBit)
    Ser.stop
    ' waitcnt(cnt+clkfreq/5000)
    'DirA[TX]~~                 'Set the direction bit to write out
    'OutA[TX]~~                 'Set the pin to High to signal all other devices that sync has occured
    ' waitcnt(cnt+clkfreq/5000)
    'DirA[TX]~                  'give up control of the pin
    Ser.Start(RX,TX,0,9600)     'restart serial
Below is virtually the same code with comments and corrections.
CON
  _clkmode = xtal1 + pll16x
  _xinfreq = 5_000_000

OBJ
  Ser : "FullDuplexSerial"      'Used for Debug output

CON
  RX      = 31
  'TX     = 30
  TX      = 7                   '<- for scope debugging only
  SyncBit = 37

Pub Start
  'Ser.start(RX,TX, 0, 9600)    '<- can't just run inverted mode, need to run open-drain
  Ser.start(RX,TX, %0100 {<-mode open drain}, 9600)
  '' mode bit 0 = invert rx
  '' mode bit 1 = invert tx
  '' mode bit 2 = open-drain/source tx
  '' mode bit 3 = ignore tx echo on rx

  OutA[TX]~                     'preset TX Low ; this way all we need to do is control direction
                                ' and it allows for open drain style control
  repeat
    Ser.str(string("Serial Enabled"))
    Ser.tx(13)
    Ser.tx(SyncBit)
    ' Ser.stop                    '<- don't do this! too much overhead in stopping/starting a new serial driver
    ' waitcnt(clkfreq/5000 + cnt) '<- You're not thinking 3-dimensionally here. The FullDuplexSerial is running
                                  ' in its own cog, and especially since the baud is relatively low, the three
                                  ' 'Ser' commands above have executed and loaded the TX buffer within the
                                  ' FullDuplexSerial before all of the data has been sent. So.... you must
                                  ' wait a longer time interval so the data has a chance to be sent.
    waitcnt(clkfreq/40 + cnt)
    DirA[TX]~~                    'Set the direction bit to write out
    'OutA[TX]~~                   'Set the pin to High to signal all other devices that sync has occurred
                                  '<- See OutA[TX]~ above ; remember the serial data is INVERTED, so this would
                                  ' need to go LOW instead.
    waitcnt(clkfreq/5000 + cnt)   '<- This delay is ok because we are out of the serial stream ; this delay
                                  ' represents the width of the pulse you want to send
    DirA[TX]~                     'give up control of the pin
    waitcnt(clkfreq/40 + cnt)     '<- This delay can be much shorter but is set to the same interval as the
                                  ' entry delay above so that you can see in the attached scope image what
                                  ' can happen with multi-processing.
    ' Ser.Start(RX,TX,0,9600)     'restart serial ; don't do that! related reason to the Ser.stop note above.
Comments
or, use a separate "sync" pin, and then the slaves can waitpeq or waitpne on it after receiving the serial command
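A minimal sketch of what the separate-pin idea might look like, for illustration only. The pin number, the method names, and the active-high polarity are all assumptions of mine and not from the original code; the pulse width reuses the clkfreq/5000 figure from the post.

```spin
CON
  SYNC_PIN = 8                            ' hypothetical dedicated sync line (assumption)

PUB SendSync
  '' Master side: drive a short high pulse on the dedicated sync line.
  outa[SYNC_PIN]~~                        ' latch high first
  dira[SYNC_PIN]~~                        ' then drive the pin
  waitcnt(clkfreq/5000 + cnt)             ' pulse width, 200 us at any clock speed
  outa[SYNC_PIN]~                         ' back to idle low

PUB WaitForSync
  '' Client side: call after the serial "sync is coming" command has been decoded.
  dira[SYNC_PIN]~                         ' make sure the pin is an input
  waitpeq(|< SYNC_PIN, |< SYNC_PIN, 0)    ' block until the pin reads high
  ' ... synchronized event code goes here ...
```

A separate pin keeps the pulse from corrupting the serial data, but the master still has to leave enough time for the command to be fully sent and decoded before pulsing; otherwise a client can reach waitpeq after the pulse has already gone by.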
"Since you are running at 9600bps, bit-bang the output (even Spin should be able to do it) - problem solved" - Right, that's what I would do, but this wasn't my project. Just helping out a friend.
This still doesn't mean that the problem completely goes away at higher baud rates, so it's something we all should be aware of.
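For reference, a bit-banged transmit along the lines suggested could look something like the sketch below. It runs in the calling cog, so the method only returns once the stop bit has gone out, which is exactly the guarantee the buffered driver does not give. This is a rough illustration under my own assumptions: open-drain signalling with an external pull-up on the line, 8N1 framing, and hypothetical names (TX_PIN, TxByte).

```spin
CON
  _clkmode = xtal1 + pll16x
  _xinfreq = 5_000_000

  TX_PIN = 30                             ' assumed transmit pin
  BAUD   = 9600

PUB TxByte(b) | bitTicks, t
  '' Send one 8N1 frame, open-drain style, from the calling cog.
  '' Returns only after the stop bit has been on the line for a full bit time.
  bitTicks := clkfreq / BAUD
  outa[TX_PIN]~                           ' output latch low; direction alone drives the line
  b := (b | $100) << 1                    ' frame: start bit (0), 8 data bits LSB first, stop bit (1)
  t := cnt
  repeat 10
    dira[TX_PIN] := (b & 1) ^ 1           ' 0 bit -> drive low, 1 bit -> release to the pull-up
    b >>= 1
    waitcnt(t += bitTicks)                ' hold each bit for one bit time
```

At 9600 baud a bit time is about 8,333 clocks at 80 MHz, which leaves Spin plenty of room for the loop overhead. After the last TxByte call returns, the line is released and free for the sync pulse.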
cognew(....
do_something
But do_something depended on a variable being set by the code running in the new cog. A short delay between cognew and do_something solved the problem, but it took a lot of time to finally realize what the problem was.
John Abshier
or perhaps specify a LOCK or hub address the caller can wait on?
-Phil
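A minimal sketch of the hub-address variant, assuming a Spin method launched with cognew. All of the names here (ready, sharedValue, Worker, DoSomething) are made up for illustration. The new cog sets a flag in hub RAM as the last step of its initialisation, and the caller polls that flag instead of relying on a fixed delay.

```spin
VAR
  long stack[32]                ' stack space for the new Spin cog
  long ready                    ' hub flag: 0 = worker still starting, non-zero = initialised
  long sharedValue              ' hypothetical variable the caller depends on

PUB Start
  ready := 0
  cognew(Worker, @stack)        ' returns immediately; Worker has not run yet
  repeat until ready            ' wait for the new cog to say it is initialised
  DoSomething

PRI Worker
  sharedValue := 42             ' the initialisation the caller depends on
  ready := 1                    ' set the flag last
  repeat                        ' keep the cog alive; its ongoing work would go here

PRI DoSomething
  result := sharedValue         ' safe: Worker has already written sharedValue
```

The same pattern would apply to the serial case if the driver exposed a hub location showing that its transmit buffer had drained.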
"Why would some one use serial to do this? Send a pulse, wait a short time and send the sync pulse." - The mechanism for intercommunication between Propellers is necessary to pass other information. Using the existing wires was an easy way to approach syncing the Propellers all together at once.
Phew, I thought you were about to announce the discovery of a horrible bug in the Propeller chip.
No, this is just a common or garden race condition that could happen just as easily with a single-core chip and some external hardware performing an action whose completion you have forgotten to wait for (or check for).
Perhaps the title of this thread should be changed.