Multiple Cogs accessing the same object

jsaddiction · 2014-01-13 04:55

Ok so I have done some searching and can not find an understandable answer to my question.

I am working on an Ethernet project using a WIZnet for QuickStart board. I am currently working on an object called "ethernet". It holds one reference to the object "W5200", which starts an SPI interface to the chip.

In my "Ethernet" object I have three methods that have the ability to access the W5200 object and communicate with the WIZnet chip.

The first one is called get_data. It is launched with the cognew command and runs in a repeat loop which "mines" data from a database using HTTP POST requests and a PHP script. This should give me quick access to a global array in which the "mined" data resides.

The second one is called get_Internet_time and runs when a parent object calls it. It returns SNTP time stamp used to sync the RTC.

The third one is called get_Sunset and also runs when a parent object calls it. It makes another http POST request to another PHP script that calculates sunset for a given day.

Here is the problem, if you haven't figured it out already. With the first method running in a new cog in a repeat loop, I was getting data collisions as multiple cogs tried to access the same SPI port at the same time. I tried adding some code to each method that checks a global variable called WIZlocked. The code is simple: repeat while WIZlocked, then set WIZlocked to true, do the task, and set WIZlocked to false when the task is done.

All of this, to me, should work, but for some reason it doesn't, and it has some weird symptoms. All of the methods work independently as long as the cognew isn't issued. During boot I would like to get all of the data so that I have something to start with, so I call both get_internet_time and get_sunset. After a short delay I launch get_data into a new cog. After that point I cannot successfully retrieve any data from the other two methods. My understanding is that, because of the repeat while WIZlocked code, a call to get_internet_time should wait for WIZlocked to become false and then set WIZlocked to true, causing get_data to wait for get_internet_time to finish. Instead, get_internet_time returns some data immediately, but the data is incorrect.

Is there a better way? Should I pass WIZlocked, or the address of WIZlocked, as a parameter to the cognew command? I'm not sure where to start with this problem. I would just leave the cognew command out, but get_data takes too long to process (250 ms).

Thanks in advance for any suggestions!

Duane Degn · 2014-01-13 08:07

A lot will depend on how the "ethernet" object handles requests for data (a link to the object would be helpful).

One way around these problems is to only have one cog access the ethernet object.

While I haven't experimented with Ethernet much, I do use a serial object in most of my programs. I generally have the other cogs set flags that the cog monitoring the serial object checks. The other cogs set these flags when they want to send data or request data.

The alternative is to use locks. Kye uses locks in several of his demo programs, and I followed his example when using them myself.

Mark_T · 2014-01-13 08:48

You can't lock without a lock, it doesn't work like that.

You have a flag, WIZlocked, so you look at it, see it's not locked, and blithely go ahead, set the flag and use the object. Bzzzzt, FAIL!

Every cog is running in parallel and all of them are looking at the flag. Between one cog reading the flag and writing it, another cog can read it too, so they all see it's free, they all set it, and they all enter the critical
sections together. You haven't protected against simultaneous access at all, and you'll never achieve this with a flag.

You need either an atomic read-modify-write primitive such as a lock or semaphore,
or a synchronization primitive that works without atomic read-modify-write such
as an event-counter. Here the former is appropriate: when you initialize the object,
use locknew to get a lock, then make all the critical sections use it to prevent multiple
access, with lockset on entry and lockclr on exit. I think this is explained in the Prop manual.
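A minimal Spin sketch of the locknew / lockset / lockclr pattern described above. The names wizLock and init are assumptions made for illustration; they are not taken from the poster's Ethernet object.

```spin
VAR
  long  wizLock                       ' lock ID (0..7) from locknew, or -1 if none was free

PUB init : ok
  wizLock := locknew                  ' check out one of the eight hub locks
  ok := wizLock <> -1                 ' true only if a lock was available

PUB get_internet_time
  repeat until not lockset(wizLock)   ' lockset returns the previous state; spin until it was clear
  ' ... critical section: talk to the W5200 and use the shared buffer ...
  lockclr(wizLock)                    ' release so the next waiting cog can proceed
```

Every method that touches the SPI bus or the shared buffer would need the same lockset / lockclr pair around it, using the same lock ID, and any early return inside the critical section has to clear the lock first.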

Mike G · 2014-01-13 15:31

jsaddiction, there's one SPI bus so you'd have to lock SPI bus access while in use as Mark T suggests. That's a pain in the butt.

I generally run a main loop that checks the status of scheduled tasks.

jsaddiction · 2014-01-14 03:16

I was playing around with locks last night and after reading the manual over and over I am apparently missing the point. Here is the latest sanitized version of my Ethernet object. Mike G, can I not use locks to "lock" an entire method from executing?

Mike G · 2014-01-14 04:21

It's much easier to create a main process that executes each network (WizNet) task.

Dealing with 3 top-level locks is complicated and difficult to troubleshoot. If you can't live without locks, implement them in the driver, SpiCounterPasm.spin.

jsaddiction · 2014-01-14 04:25

Ethernet has a parent object called light controller, which uses the three main methods to collect data from the internet. Is that what you're talking about, Mike?

jsaddiction · 2014-01-14 04:36

The top object first calls get internet time and uses that to sync the RTC. Then it calls get sunset using data in the CON block and the date from the RTC. Lastly it calls start data. Up until this point I don't see a need for any locking mechanism. However, the top object will call both sync time and get sunset when the day change is read from the RTC. If get data is running in another cog, I can see how conflicts can arise in the buffer and on the SPI bus. I thought it would be simpler to lock each of the three methods, since they deal with both conflicts.

Mike G · 2014-01-14 04:41

Think of it like this... there is one SPI bus. The bus is a physical thing with pins and wires. Only one process can use the bus at any time. So create a single process responsible for the SPI bus.

jsaddiction · 2014-01-14 04:52

So I should put a lock system into the SPICounterPasm object? If so, how would that be implemented?

Mark_T · 2014-01-14 06:26

Locks protect against multiple cogs entering a "critical section" simultaneously. You first need to identify your critical sections
(probably just the SPI method), and at the start of the critical section you busy-wait to claim the lock. Once you are through
the critical section you free the lock for the next cog. Have you read the Prop manual documentation? It explains this with
Spin examples.

Mike G · 2014-01-14 07:59

I would forget about locks and create a single process (main loop) that handles the Wiznet communication. For example, the PHP process sets a ready flag. When the main loop sees the ready flag, it executes whatever network process was requested.
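A rough Spin sketch of the single-owner idea, where one cog does all Wiznet work and other code only posts requests. Everything here is an assumption for illustration: the request codes, the ask and networkLoop names, and the stub methods standing in for the poster's real ones. It also assumes only one cog ever calls ask, which is what makes a plain flag safe in this arrangement.

```spin
CON
  REQ_NONE   = 0
  REQ_TIME   = 1
  REQ_SUNSET = 2

VAR
  long  request               ' set by the caller, cleared by the network cog
  long  reply                 ' result of the last finished request
  long  netStack[128]         ' stack size is a guess

PUB start : ok
  request := REQ_NONE
  ok := cognew(networkLoop, @netStack) + 1

PUB ask(what) : value
  request := what             ' hand the job to the network cog
  repeat while request        ' wait until it clears the request
  value := reply

PRI networkLoop
  repeat
    case request
      REQ_TIME:
        reply := get_internet_time
        request := REQ_NONE
      REQ_SUNSET:
        reply := get_sunset
        request := REQ_NONE
    get_data                  ' one polling pass, no inner repeat

PRI get_internet_time : t     ' placeholder for the poster's method
  t := 0

PRI get_sunset : t            ' placeholder for the poster's method
  t := 0

PRI get_data                  ' placeholder: a single polling pass
  return
```

Because only the network cog ever touches the W5200 and the buffer, no lock is needed. The cost is that a request waits for the current get_data pass (about 250 ms in the original post) to finish.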

jsaddiction · 2014-01-14 08:01

I have read and reread the manual and I think I have the implementation correct. The Wiznet has a 2 KB buffer for each socket, and my program has one 2 KB buffer which I load from one of the 3 socket buffers. I then perform some parsing and move to the next process. The code above shows how I do this, but it still gives me bad data. Each method works independently, but not in one neat package. That is why I thought to use a lock of some sort. I implemented my take on how the manual describes it, but still no joy.

Rforbes · 2014-01-14 08:42

@jsaddiction-

This method:

PRI recieveSNTPtime | bytesToRead
  wait(10)
  repeat
    bytesToRead := Available(SNTPSocket) 'get number of bytes to read
    if (bytesToRead < 0) 'check timeout
      return true
    if (bytesToRead == 0) 'recieving is done
      return true
    if (bytesToRead > 0) 'we got work to do
      receive(SNTPsocket,@Buff,bytesToRead)
      if Available(SNTPSocket) == -1
        return true
    bytesToRead~

doesn't appear to make sure you've got at least 56 bytes in the time server response. It just waits for at least 1 byte and then uses whatever it has to get the transmit timestamp. That would put incorrect data into your RTC if the number of bytes waiting to be received were less than a full response from the server (56 bytes at least), I think.

Mike G · 2014-01-14 15:55

jsaddiction wrote: "The wiznet has a 2kbyte buffer for each socket and my program has one 2kbyte buffer in which I load from one of the 3 socket buffers. I then preform some parsing and then move to the next process. The code above shows how I do this but it is still gives me bad data."

From the code I can see, the get_Data method is running in a cog. Once this method fills the Prop buffer, it releases the lock. Whatever process is waiting will immediately take over. It's not clear how data is processed, since the entire project was not attached. My guess is that data processing is not aligned with moving the Wiznet's buffered data to the Propeller. Again, just a guess without seeing all the code.

If I were you, I'd forget about locks and use a main loop to control buffer and SPI bus access. That's what I do and it works great.

jsaddiction · 2014-01-14 17:44

The get data method should parse and fill the setting array before the lock is cleared. After setting is updated, the buffer is free to be modified. The get switch method can be called at any time to get that particular setting. I will add some better buffer management to ensure the buffer is cleared before a method writes to it, since it mostly depends on strsize to compute buffer sizes and byte locations. Good find, Forbes. I was just thinking about how to further check the data in a buffer before declaring that a good receive has happened. I'll add those two things and see how it works. I'm currently working on dynamically loading the get data method to limit network traffic when it's not needed.

jsaddiction · 2014-01-15 09:32

OK, so I have updated the SNTP receive method to check for a full response, and it looks like this:

PRI recieveSNTPtime | bytesToRead, bytesRead

  bytesRead := 0                                 'locals are not zeroed on entry
  repeat
    bytesToRead := Available(SNTPSocket)         'get number of bytes to read

    if (bytesToRead < 0) and (bytesRead => 56)   'receiving is done
      return true

    if (bytesToRead < 0) and (bytesRead < 56)    'check for timeout
      return false

    if (bytesToRead > 0)                         'we got work to do
      bytesRead += bytesToRead                   'add-assign is += in Spin, not =+
      receive(SNTPsocket,@Buff,bytesToRead)

    bytesToRead~

This is a much more stable way of receiving the timestamps. Thanks for pointing that out. One question though: why 56 bytes? My first server listed in the DAT table only returns 56 bytes. Looking at the message formats, it seems to me that 56 bytes puts me 4 bytes into the message digest, just after the Key ID field. I don't really understand the significance, but it works.

Rforbes · 2014-01-15 10:27

Assuming you're using the W5100 indirect driver:

'' The data returned is the complete packet as provided by the W5100. This means the following:
'' data[0]..[3] is the source IP address, data[4],[5] is the source port, data[6],[7] is the payload size and data[8] starts the payload

The above is a comment from the RxUDP method.

This means the transmit timestamp will be located at bytes 48 through 55: the 8-byte header comes first, and the timestamp sits at bytes 40 through 47 of the message itself. Anything after byte 55 is either a key identifier, kiss o' death or something else. Either way, as long as you have 56 bytes you can be pretty sure that you've got the data needed to figure out what time it is.

Without ensuring you've got at least 56 bytes, you "could" throw 0's into the humantime method, and that of course would get things all wonky and cattawampus and off-kilter and such in your calculations.
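As a hedged illustration of that byte layout, here is a small Spin sketch that pulls the seconds field of the transmit timestamp out of a received buffer. It assumes the 8-byte header described in the RxUDP comment above is present (the W5200 driver may differ), and the method and constant names are made up for this example.

```spin
CON
  UDP_HEADER  = 8                ' source IP (4) + source port (2) + payload size (2)
  TX_SECONDS  = UDP_HEADER + 40  ' transmit timestamp seconds start at byte 40 of the message
  NTP_TO_UNIX = $83AA_7E80       ' seconds between 1900-01-01 and 1970-01-01

PUB transmitSeconds(bufPtr) : secs | i
  ' Assemble four big-endian bytes (buffer bytes 48..51) into one long.
  ' secs is the result variable, so it starts at 0.
  repeat i from 0 to 3
    secs := (secs << 8) | byte[bufPtr][TX_SECONDS + i]

PUB unixSeconds(bufPtr)
  ' 32-bit subtraction wraps, so the result is correct even though
  ' current NTP second counts look negative as a signed long.
  return transmitSeconds(bufPtr) - NTP_TO_UNIX
```

A call would look like unixSeconds(@Buff), made only after the receive method has confirmed at least 56 bytes. Bytes 52 through 55 hold the fractional part, which a one-second RTC sync can ignore.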

jsaddiction · 2014-01-16 13:12

OK, so here is what I did. After making the corrections Rforbes listed, I implemented locks in each of the primary methods that modify the buffer and access the W5200 SPI bus. After making those changes I can call any method while the getdata method is running. To further prevent collisions I made the get data cog load dynamically, like MikeG said. I only need to know switch positions when the lights are on. When the lights are off, I cogstop the getdata cog and start checking whether the time needs updating. That is only when the current RTC date is different from the last time I synced and it is after 3:00 in the morning, just in case daylight saving time changed that day. Once I have a successful time sync, I use the RTC date and other constants to call the getsunset method. Once all that happens, the program goes into a holding pattern, waiting for the RTC time to equal the sunset time. So far so good. Now I'm going to run it for a couple of days and see how it works. I may need to add an LCD so I can monitor variables like sunset, SNTP time and RTC time, maybe even switch positions. I appreciate everyone's help. This has been a very fun project.
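One thing to watch when combining cogstop with locks: if the worker cog is stopped while it holds the lock, the lock stays set and every other method will wait forever. A sketch of one way to avoid that is shown here, under the assumption that all W5200 access goes through a single lock. The names wizLock, dataCog, startData and stopData are illustrative, not taken from the posted object.

```spin
VAR
  long  wizLock               ' lock ID from locknew, shared by every method that touches the W5200
  long  dataCog               ' worker cog ID + 1, or 0 when the worker is not running
  long  dataStack[128]        ' stack for the Spin worker; 128 longs is a guess

PUB init : ok
  wizLock := locknew
  ok := wizLock <> -1

PUB startData : ok
  stopData                                    ' never run two workers at once
  ok := (dataCog := cognew(get_data, @dataStack) + 1)

PUB stopData
  if dataCog
    repeat until not lockset(wizLock)         ' wait until the worker is outside its critical section
    cogstop(dataCog~ - 1)                     ' stop the worker and zero dataCog
    lockclr(wizLock)                          ' release the lock we took

PRI get_data
  repeat
    repeat until not lockset(wizLock)
    ' ... poll the server and parse into the shared array ...
    lockclr(wizLock)
    waitcnt(clkfreq / 4 + cnt)                ' pause between polls; the interval is arbitrary
```

Taking the lock before cogstop guarantees the worker is not in the middle of an SPI transfer or a buffer update when it is killed.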
