Problem with memory integrity

Daniel Harris · 2012-06-06 18:41

Hi guys,

I've been spending some time trying to figure out many of the little hangups that people may run into while trying to develop C programs for the Propeller. I feel that I have run into a significant one - data integrity. I am seeing some strange behavior and I'd like to get some clarification on what I may be doing wrong.

Let me describe the situation. The program is an attempt to get data off of Parallax's 3-axis gyroscope module, the L3G4200D (Parallax part #27911), by launching a cog that is dedicated to running I2C communications with the gyro and stuffing the results into a mailbox structure. The I2C based gyro driver is a .cogc program (with its own main function). The gyro driver is launched into a new cog by the main application, which runs in LMM mode. For the most part, the main application (test.c) simply copies the raw x, y, and z values from the mailbox structure to some locally defined integer variables, which are then printed to the terminal. To be able to observe my problem, you need to hook up one of these gyroscope modules.

Take a look inside the main loop in test.c. One thing that is troubling to me is that several variables being displayed in the main loop are not changed in the loop, however their values sometimes change when printed. Specifically, the Z axis accumulator, "accZ". "accZ" is declared locally and is only accessed locally, so it shouldn't change. I believe there is also a problem with the way the mailbox is passing data from the gyro driver cog to the main application cog. There is probably some problem with the way I am declaring things or maybe with the way the optimizer is optimizing things out. I am making use of the volatile prefix, but I'm not entirely sure I am using it properly.

I have verified that the Propeller is successfully communicating with the gyro by attaching a logic analyzer and watching the transaction. Something is wrong in the way data is being passed back and forth. Attached is my Simple IDE project and all accompanying files. Any help is greatly appreciated.

Thanks,
Daniel

jazzed · 2012-06-06 21:33

Are you using a demoboard? Does it have a pull-up on the SCL pin? The I2C code is depending on pull ups for both pins.

David Betz · 2012-06-06 21:39

Daniel Harris wrote: »

Hi guys,

I've been spending some time trying to figure out many of the little hangups that people may run into while trying to develop C programs for the Propeller. I feel that I have run into a significant one - data integrity. I am seeing some strange behavior and I'd like to get some clarification on what I may be doing wrong.

Let me describe the situation. The program is an attempt to get data off of Parallax's 3-axis gyroscope module, the L3G4200D (Parallax part #27911), by launching a cog that is dedicated to running I2C communications with the gyro and stuffing the results into a mailbox structure. The I2C based gyro driver is a .cogc program (with its own main function). The gyro driver is launched into a new cog by the main application, which runs in LMM mode. For the most part, the main application (test.c) simply copies the raw x, y, and z values from the mailbox structure to some locally defined integer variables, which are then printed to the terminal. To be able to observe my problem, you need to hook up one of these gyroscope modules.

Take a look inside the main loop in test.c. One thing that is troubling to me is that several variables being displayed in the main loop are not changed in the loop, however their values sometimes change when printed. Specifically, the Z axis accumulator, "accZ". "accZ" is declared locally and is only accessed locally, so it shouldn't change. I believe there is also a problem with the way the mailbox is passing data from the gyro driver cog to the main application cog. There is probably some problem with the way I am declaring things or maybe with the way the optimizer is optimizing things out. I am making use of the volatile prefix, but I'm not entirely sure I am using it properly.

I have verified that the Propeller is successfully communicating with the gyro by attaching a logic analyzer and watching the transaction. Something is wrong in the way data is being passed back and forth. Attached is my Simple IDE project and all accompanying files. Any help is greatly appreciated.

Thanks,
Daniel

Hi Daniel,

One thing that might be causing you trouble is that your cogc driver seems to be a bit too complex. I find at least one case (your function WaitForNewData) where it is attempting to use a hub stack and if you don't provide it with stack space that could cause problems. This is one problem with COG C. It's difficult (at least for me) to write COG C code that doesn't use the stack so it will run entirely within a COG. I found that I had to continually disassemble the code using propeller-elf-objdump -d in order to check to see if any stack references were generated. I finally came to the conclusion that you're mostly okay if you make sure you don't have more than one level of function calls. In other words, the main program can call functions but those functions can't call other functions. That seemed to work for me anyway.

Why didn't you just use the i2c driver that I wrote? It seems to work okay. We're now using it for the code that allows COG images to be loaded from the upper 32k of the eeprom. It might be easier than trying to write your own driver.

Thanks,
David

pedward · 2012-06-06 23:13

Why couldn't you tell the compiler to relocate the stack to a COG location?

ersmith · 2012-06-07 03:56

I think your variable "gyro" has to be declared "volatile", since both the main thread and the cog driver can modify it. If you don't do that, the optimizer may optimize away some accesses to gyro on the assumption that it cannot have changed since the last access.

Whenever you encounter weird memory problems in a multi-cog (or multi-thread) application, look to make sure that all variables that are shared by cogs (including via pointers) are declared volatile.
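
As a minimal sketch of the volatile point, assuming a hypothetical mailbox layout (the type, field, and function names below are illustrative and are not taken from Daniel's project), a shared structure might look something like this:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mailbox shared between two cogs; names are illustrative only. */
typedef struct {
    volatile int32_t x, y, z;   /* written by the driver cog */
    volatile uint32_t ready;    /* nonzero when a new sample is available */
} mailbox_t;

/* Driver side: store the sample first, then raise the flag last. */
void mailbox_publish(mailbox_t *m, int32_t x, int32_t y, int32_t z)
{
    assert(m != 0);
    m->x = x;
    m->y = y;
    m->z = z;
    m->ready = 1;
}

/* Reader side: returns 1 and copies the sample if one is ready, else 0. */
int mailbox_take(mailbox_t *m, int32_t out[3])
{
    assert(m != 0);
    if (!m->ready)
        return 0;
    out[0] = m->x;
    out[1] = m->y;
    out[2] = m->z;
    m->ready = 0;   /* hand the mailbox back to the driver */
    return 1;
}
```

Note that volatile only stops the compiler from caching or dropping these accesses. It does not make a multi-word update atomic, which is why the sketch writes the flag last and checks it first.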

Eric

ersmith · 2012-06-07 03:57

pedward wrote: »

Why couldn't you tell the compiler to relocate the stack to a COG location?

The stack always has to be in hub RAM. The compiler can deal with individual variables in COG memory, but not arrays or other regions of memory (like a stack).

ersmith · 2012-06-07 04:44

ersmith wrote: »

I think your variable "gyro" has to be declared "volatile", since both the main thread and the cog driver can modify it.

Ah, I see you put the "volatile" in the struct definition. I guess it's more likely to be a stack problem then, as David suggested. It looks like the generated code in gyro_driver.asm is using a small amount of stack. Actually this may be partly an SIDE issue -- it looks like the cogc gyro driver was compiled with just -O rather than -Os. -O is not as aggressive in eliminating stack references as -Os. In any case you should always provide a small amount of stack for any Cog C program to allow the compiler to spill registers.

Eric

Dave Hein · 2012-06-07 04:56

If you look at the asm file you'll see that each routine saves r14 on the stack, except for the main function. I think you need 2 longs of stack space. I would suggest starting out with a few more longs of stack space, and initialize them to some unique value, such as 0xf00dfeed, and then examine the stack values after running for a while to see how many are actually used.
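
A rough sketch of that fill-and-inspect idea, assuming the stack is an array of longs that the driver uses from the high end downward (the function names here are made up for illustration):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xf00dfeedu  /* marker value suggested above */

/* Fill the whole stack area with the marker before launching the cog. */
void stack_paint(volatile uint32_t *stack, size_t n_longs)
{
    assert(stack != NULL);
    for (size_t i = 0; i < n_longs; i++)
        stack[i] = STACK_FILL;
}

/* The stack grows down from the top of the array, so untouched longs sit at
 * the low end. Count them and subtract to estimate how many longs were used. */
size_t stack_longs_used(const volatile uint32_t *stack, size_t n_longs)
{
    assert(stack != NULL);
    size_t untouched = 0;
    while (untouched < n_longs && stack[untouched] == STACK_FILL)
        untouched++;
    return n_longs - untouched;
}
```

This is only an estimate: if the driver happens to push a value equal to the marker at the deepest point of the stack, the count will come out low.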

David Betz · 2012-06-07 07:33

David Betz wrote: »

Hi Daniel,

One thing that might be causing you trouble is that your cogc driver seems to be a bit too complex. I find at least one case (your function WaitForNewData) where it is attempting to use a hub stack and if you don't provide it with stack space that could cause problems. This is one problem with COG C. It's difficult (at least for me) to write COG C code that doesn't use the stack so it will run entirely within a COG. I found that I had to continually disassemble the code using propeller-elf-objdump -d in order to check to see if any stack references were generated. I finally came to the conclusion that you're mostly okay if you make sure you don't have more than one level of function calls. In other words, the main program can call functions but those functions can't call other functions. That seemed to work for me anyway.

Why didn't you just use the i2c driver that I wrote? It seems to work okay. We're now using it for the code that allows COG images to be loaded from the upper 32k of the eeprom. It might be easier than trying to write your own driver.

Thanks,
David

I didn't have time for a complete reply last night but you might want to try something like this:

int main(void)
{
    extern unsigned int _load_start_gyro_drivercog[];

    struct {
        uint32_t stack[16];
        gyro_init_t gyro;
    } state;

    int i;
    int x, y, z;
    int offX, offY, offZ;
    int accX = 0;
    int accY = 0;
    int accZ = 0;
    uint32_t t;

    /* set some parameters up in the mailbox structure */
    state.gyro.scl = SCL;
    state.gyro.sda = SDA;

    /* launch the program defined in "gyro_driver.cogc" into a new cog */
    cognew(_load_start_gyro_drivercog, &state.gyro);

and so on...

The idea here is that you're passing the address of your gyro_init_t structure as PAR but you're also giving your driver some stack space to use. You can adjust the 16 constant in the structure to reflect the amount of stack you think you'll need.
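
To make the layout explicit, here is a small host-side sketch. It assumes a stand-in definition of gyro_init_t (the real one lives in the attached project and may differ) and a hypothetical helper that reports how many bytes of stack sit directly below the address passed as PAR:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_LONGS 16

/* Hypothetical stand-in for the project's gyro_init_t; fields are assumed. */
typedef struct {
    volatile uint32_t scl;
    volatile uint32_t sda;
    volatile int32_t x, y, z;
} gyro_init_t;

/* Stack first, mailbox second, so the stack ends where the mailbox begins. */
typedef struct {
    uint32_t stack[STACK_LONGS];
    gyro_init_t gyro;
} driver_state_t;

/* Bytes available to the driver's stack below the mailbox address (PAR). */
size_t stack_bytes_below_par(const driver_state_t *s)
{
    assert(s != NULL);
    return (size_t)((const char *)&s->gyro - (const char *)s->stack);
}
```

With 16 longs of stack this reports 64 bytes, which is the region the driver can grow into before it would run off the bottom of the structure.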

I've attached a complete copy of your test.c source modified to use this scheme. I don't have a board setup with a gyro so I haven't been able to try this though.

jazzed · 2012-06-07 07:51

Generally speaking:

Any .cogc program should not be too complicated.
All variables in a .cogc program should be global except for a few indexer variables like int x; etc....
I wish propeller-gcc could warn if stack is being used when compiling .cogc programs.
The way to detect stack usage in a .cogc program is to look for "sp" in the SIDE .asm listing.

SimpleIDE always compiles .cogc code with -Os and -mcog.

I'm thinking that SimpleIDE should always generate asm and check for "sp" references in the compile process.
Looking into it ....
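
As a sketch of what such a check could look like, the helper below tests whether one line of a listing mentions sp as a whole token. The function name is hypothetical and the listing syntax in the comments is an assumption; this is not SimpleIDE code.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Characters that can be part of a symbol or register name. */
static int is_ident_char(int c)
{
    return isalnum(c) || c == '_';
}

/* Returns 1 if the line contains "sp" as a standalone token, as in an
 * assumed listing line such as "sub sp, #4", and 0 otherwise. */
int line_mentions_sp(const char *line)
{
    assert(line != NULL);
    for (const char *p = strstr(line, "sp"); p != NULL; p = strstr(p + 1, "sp")) {
        int left_ok = (p == line) || !is_ident_char((unsigned char)p[-1]);
        int right_ok = !is_ident_char((unsigned char)p[2]);
        if (left_ok && right_ok)
            return 1;
    }
    return 0;
}
```

A check like this would also fire on a comment that happens to contain the word sp, so it is better suited to a warning than to a hard error.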

David Betz · 2012-06-07 07:55

jazzed wrote: »

Generally speaking:
Any .cogc program should not be too complicated.

All variables in a .cogc program should be global except for a few indexer variables like int x; etc....

I wish propeller-gcc could warn if stack is being used when compiling .cogc programs.

The way to detect stack usage in a .cogc program is to look for "sp" in the SIDE .asm listing.

SimpleIDE always compiles .cogc code with -Os and -mcog.

I'm thinking that SimpleIDE should always generate asm and check for "sp" references in the compile process.
Looking into it ....

I don't know if it's a good idea for SimpleIDE to reject COG C programs with stack references. It is possible to pass a stack to a COG C program and then it will work fine even if it uses the stack. You just have to remember to pass space for a stack below the value you pass as PAR when you start the COG C program with cognew or coginit.

jazzed · 2012-06-07 07:59

David Betz wrote: »

I don't know if it's a good idea for SimpleIDE to reject COG C programs with stack references. It is possible to pass a stack to a COG C program and then it will work fine even if it uses the stack. You just have to remember to pass space for a stack below the value you pass as PAR when you start the COG C program with cognew or coginit.

Warn, not reject. Seems like a gentle warning might be appreciated. SimpleIDE adds lots of value to many users - this is just one more possibility.

David Betz · 2012-06-07 08:11

jazzed wrote: »

Warn, not reject. Seems like a gentle warning might be appreciated. SimpleIDE adds lots of value to many users - this is just one more possibility.

Yes, a warning could be helpful. Even an error would be okay if it was only triggered by checking a box that says something like "Ensure no COG C stack use" or something like that. Of course, you're opening up a can of worms here. Once you have a warning or error indicating stack use you'll then have people asking you how to figure out what is causing it and how to get rid of it. :-)

mindrobots · 2012-06-07 08:22

How about an "Allow COG C Stack Use (Advanced Feature)" option, defaulted to NOT being checked?
If SimpleIDE finds an SP reference in this case, it gives you an ERROR - the error message either tells someone they aren't smart enough to be doing what they are doing, and so stops them, or lets them know that if they are smart enough, they need to check this box to allow themselves to shoot themselves in the foot.

If it is checked, it generates a warning - just so that the advanced user is aware they might be doing what they intend.

The default for "Simple" is to protect the user and make it easy for us simple folk.

David Betz · 2012-06-07 08:26

mindrobots wrote: »

How about an "Allow COG C Stack Use (Advanced Feature)" option, defaulted to NOT being checked?
If SimpleIDE finds an SP reference in this case, it gives you an ERROR - the error message either tells someone they aren't smart enough to be doing what they are doing, and so stops them, or lets them know that if they are smart enough, they need to check this box to allow themselves to shoot themselves in the foot.

If it is checked, it generates a warning - just so that the advanced user is aware they might be doing what they intend.

The default for "Simple" is to protect the user and make it easy for us simple folk.

Actually, I think you need to be smarter to write COG C code that doesn't use the stack. Using the stack is the default for pretty much all C code so it is the easiest to do providing you actually give the COG C program a stack to use.

Another problem with the check box approach is that it will be a global setting and not allow for the case where you might have several COG C programs some of which need a stack and some of which don't want to use a stack. This really needs to be a per-file setting or maybe a global setting with a per-file override.

Daniel Harris · 2012-06-07 09:29

Hey guys! Thank you!!! You have given me a great deal to think about and to try.

David Betz wrote: »

One thing that might be causing you trouble is that your cogc driver seems to be a bit too complex. I find at least one case (your function WaitForNewData) where it is attempting to use a hub stack and if you don't provide it with stack space that could cause problems

Ah, good catch. That particular function was a new addition. I had previously dumped the disassembled code to check for references to SP, and found none. On one of the Wiki pages on the Google Code site, I did see where you warned against function calls beyond a single level.

David Betz wrote: »

Why didn't you just use the i2c driver that I wrote?

When I started this driver, I don't think you had implemented the "simple I2C" portion to your driver. I thought it redundant to use an extra cog to do the I2C work, so I just rolled it into a single driver. It looks like I'm bumping up on the limit of what a Cog C driver can handle. It's good to get an idea where that line lies.

ersmith wrote: »

In any case you should always provide a small amount of stack for any Cog C program to allow the compiler to spill registers.

I need to figure out how to allocate this, then. That and figure out how to tell the compiler to use the allocated space in the Cog C program as a stack.

Is there a different mode to compile in or way to write/structure the driver to produce the same general effect - a driver that can be launched into an available cog and share resources, yet not be plagued by optimizer, stack, and complexity issues? Basically, I'm trying to get something that I can show to other people to demonstrate how to do the typical things you would do on the Propeller with Spin, except they would be doing it with C.

David Betz · 2012-06-07 09:40

Daniel Harris wrote: »

Is there a different mode to compile in or way to write/structure the driver to produce the same general effect - a driver that can be launched into an available cog and share resources, yet not be plagued by optimizer, stack, and complexity issues? Basically, I'm trying to get something that I can show to other people to demonstrate how to do the typical things you would do on the Propeller with Spin, except they would be doing it with C.

I posted an example of how to make a stack available to a COG C program in message #9 of this thread.

David Betz · 2012-06-07 10:27

Daniel Harris wrote: »

When I started this driver, I don't think you had implemented the "simple I2C" portion to your driver. I thought it redundant to use an extra cog to do the I2C work, so I just rolled it into a single driver. It looks like I'm bumping up on the limit of what a Cog C driver can handle. It's good to get an idea where that line lies.

This is an interesting statement. If it is wasteful to use a COG to provide a generic function like I2C or SPI then the idea of having a library of generic drivers will likely not be very useful. It seems that the best way to make use of the Propeller is to write highly customized drivers that incorporate basic protocol handling with higher level logic. This means custom drivers not generic ones. Maybe a driver library is a bad idea?

mindrobots · 2012-06-07 10:45

Interesting questions.

For sake of discussion, let's say it's a temperature data logger with I2C RTC, I2C Temp sensor that is going to be logging data to an I2C EEPROM.

Do I want a Cog with I2C + RTC driver (a soft RTC peripheral), another with I2C + Temp (a soft Temp peripheral) and a third Cog with I2C + EEPROM (a soft EEPROM Datalog peripheral)?

Do I want a Cog for RTC, one for Temp and one for EEPROM all talking to a generic I2C driver? Maybe if they are all on the same pins, not if they are on different pins.

Is it better to consider base protocols as library code that gets bundled into a Cog with the sensor code. Making each Cog a soft peripheral? This requires a collection of "protocol" code that can be either bundled into standard soft peripheral items in the library or bundled up with your own sensor code to create your own soft peripheral for your own library (or sharing with others).

I think the code for the basic protocol handling and then drivers for standard parallax products needs to be provided.

...but then I've been all day at making graphs and Powerpoint presentations so my brain is not reliable at this point.

pedward · 2012-06-07 11:04

Wow, so many off-topic vectors! I guess I'll pile on.

I2C is a fairly simple protocol, is there a need for a dedicated COG driver? This is the same difference in the SPIN world of a SPIN vs PASM driver. Like Simple SPI is a SPIN implementation of the SPI protocol, to save COG resources. Since all C code is compiled to ASM, there isn't the same argument for SPIN/PASM, the code is fast enough. The main argument I can think of would be a slave driver or code reuse, especially when you have multiple devices on the I2C bus that are used by different code lines.

As for COG stack, I need a technical explanation of why it's not possible, because in my mind when you choose COG memory model, it just limits the memory size that the compiler will generate code for. Why can't the compiler allocate a stack at the top of COG ram?

To answer Daniel's question, you could write your code as a thread instead of a COG. In LMM mode, threads are multiplexed as N*M, so you can launch threads on other COGs, having the advantages of the LMM memory model and dedicated COG execution. The downside is that in LMM mode, it launches the LMM bootstrap onto the COG and streams code from hub ram. This may present a performance issue for your code if it needs to be deterministic.

I've asked questions about stack use in the past, as I'm still not fully happy with that "gotcha" and the answers. First, since COGc code uses the native calling convention (if you declare a function NATIVE), it doesn't use the stack for the call/ret function. This only leaves variables (variable spillage alluded to above). In that case, you could avoid variable spillage by carefully allocating many of your variables as global (bad practice in general coding standards, but acceptable in this scenario).

The only gotcha there is making certain they are declared local to the COG, is there a compiler hint for making global variables scoped to the COG ram?

If a program file is named .cogc, wouldn't it be reasonable to force all functions declared within to be NATIVE?

In my opinion, there still needs to be some work to the compiler to make writing COGc a lot less problematic. There shouldn't be so many corner cases that you have to avoid and manually check for.

ersmith · 2012-06-07 14:59

Daniel Harris wrote: »

When I started this driver, I don't think you had implemented the "simple I2C" portion to your driver. I thought it redundant to use an extra cog to do the I2C work, so I just rolled it into a single driver. It looks like I'm bumping up on the limit of what a Cog C driver can handle. It's good to get an idea where that line lies.

I don't think you're bumping up into any limits, except the artificial limit of "what a driver can do without any stack". If you give it some stack you can run a pong game in C in one cog.

I need to figure out how to allocate this, then. That and figure out how to tell the compiler to use the allocated space in the Cog C program as a stack.

David provided a good example earlier. Basically you have a structure that contains stack space followed by mailbox, and pass a pointer to the mailbox portion of the structure when you launch the cog. The stack grows down from the mailbox/parameter pointer that's provided to the cog.

Is there a different mode to compile in or way to write/structure the driver to produce the same general effect - a driver that can be launched into an available cog and share resources, yet not be plagued by optimizer, stack, and complexity issues? Basically, I'm trying to get something that I can show to other people to demonstrate how to do the typical things you would do on the Propeller with Spin, except they would be doing it with C.

Well, in Spin you'd be doing something more like an LMM thread launch. COG C is a bit more like a "high level PASM".

Another way to reduce stack requirements would be to reduce the number of levels of subroutine calls, for example by declaring functions "inline" or even writing the code so that there's basically one big function. But to be honest, providing a small amount of stack is pretty easy to do, so I don't think it's a big deal.
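
For illustration only, here is a sketch of the "inline" approach. The helpers are hypothetical and model open-drain I2C lines as bits in a direction mask (bit set means the pin is driven low, bit clear means it is released to the pull-up); they are not taken from Daniel's or David's driver. Whether the compiler actually inlines them depends on the optimization level.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: small enough that the compiler can fold them into
 * the caller, so no extra level of subroutine call is needed. */
static inline uint32_t pin_low(uint32_t dir, uint32_t mask)
{
    return dir | mask;      /* drive the pin(s) low */
}

static inline uint32_t pin_release(uint32_t dir, uint32_t mask)
{
    return dir & ~mask;     /* let the pull-up raise the pin(s) */
}

/* Computes the direction mask after an I2C START condition. */
uint32_t i2c_start_dirs(uint32_t dir, uint32_t scl_mask, uint32_t sda_mask)
{
    assert((scl_mask & sda_mask) == 0);
    dir = pin_release(dir, scl_mask | sda_mask); /* both lines idle high */
    dir = pin_low(dir, sda_mask);                /* SDA falls while SCL is high */
    dir = pin_low(dir, scl_mask);                /* then SCL falls */
    return dir;
}
```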

Eric

ersmith · 2012-06-07 15:02

pedward wrote: »

As for COG stack, I need a technical explanation of why it's not possible, because in my mind when you choose COG memory model, it just limits the memory size that the compiler will generate code for. Why can't the compiler allocate a stack at the top of COG ram?

C pointers and stack variables are addressed by RDLONG/WRLONG. To access a variable indirectly in COG memory requires self modifying code, which the compiler doesn't support. In principle this could be done, but it's non-trivial (and would also preclude accessing hub memory at all without a lot more work).

Eric

Phil Pilgrim (PhiPi) · 2012-06-07 15:09

ersmith wrote:

... requires self modifying code, which the compiler doesn't support.

So the compiler can't produce jmprets either?

-Phil

jazzed · 2012-06-07 16:08

Phil Pilgrim (PhiPi) wrote: »

So the compiler can't produce jmprets either?

-Phil

Of course it can.

ersmith · 2012-06-07 17:52

Phil Pilgrim (PhiPi) wrote: »

So the compiler can't produce jmprets either?

As Steve said, of course it can produce jmpret. I didn't phrase my response very well, perhaps. The problem with accessing COG memory really has to do with pointer dereference. To do something like:

  a = *ptr;

where ptr is a reference to cog memory requires self modifying code like:

  mov :blah,ptr
  nop
:blah mov a,0

This is totally different from the code required to dereference a pointer to hub memory, which is:

  rdlong a, ptr

Things get even more complicated if the COG memory is to be referenced as bytes or words rather than longs -- there would have to be masks and shifts involved as well, and the whole thing will likely end up slower than just referencing data in HUB memory in the first place.
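
To give a feel for the masks and shifts involved, here is a host-side C sketch of byte access to a memory that can only be addressed in 32-bit longs. The function names are made up, and little-endian byte order within each long is assumed:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one byte out of long-addressed memory: pick the long, then shift
 * and mask to isolate the byte. */
uint8_t long_mem_read_byte(const uint32_t *mem, size_t byte_addr)
{
    assert(mem != NULL);
    uint32_t word = mem[byte_addr >> 2];
    unsigned shift = (unsigned)(byte_addr & 3u) * 8u;
    return (uint8_t)((word >> shift) & 0xffu);
}

/* Write one byte: a read-modify-write of the containing long. */
void long_mem_write_byte(uint32_t *mem, size_t byte_addr, uint8_t value)
{
    assert(mem != NULL);
    unsigned shift = (unsigned)(byte_addr & 3u) * 8u;
    uint32_t mask = (uint32_t)0xffu << shift;
    mem[byte_addr >> 2] = (mem[byte_addr >> 2] & ~mask) | ((uint32_t)value << shift);
}
```

Each access costs several operations on top of the indirect long access itself, which fits the point that it would likely end up slower than a plain hub read.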

As I mentioned, in principle it would be possible (albeit complex) to support putting all data into COG memory. But since the sequence required to access data in HUB memory is completely different, we'd end up with a driver that could not read from hub RAM at all -- which would not be very useful. There are probably ways around this; I'm not saying it would be impossible to support pointers to COG memory, just very difficult. The current scheme is a compromise which was fast to implement, does allow one to (mostly) avoid using stack, while keeping full support for hub memory.

Programming is full of trade offs, and I suspect not everyone will be happy with the trade-offs provided by PropGCC. Fortunately, the code is all open source, so perhaps some enterprising people will add missing features in the future.

Eric

Phil Pilgrim (PhiPi) · 2012-06-07 19:00

Thanks, Eric, for the thorough explanation. It seems we've come full circle, back to the question I posed last August. In light of the issues presented here, how would you address my original query?

Thanks,
-Phil

ersmith · 2012-06-08 05:24

Phil Pilgrim (PhiPi) wrote: »

Thanks, Eric, for the thorough explanation. It seems we've come full circle, back to the question I posed last August. In light of the issues presented here, how would you address my original query?

There's a whole thread addressing your original query :-). I think the short answer is: "In theory a C compiler can generate idiomatic PASM code, for some definition of idiomatic. Current C compilers for the Propeller achieve that goal sometimes, but in many cases fall short, so there is still considerable room for improvement."

ersmith · 2012-06-08 05:53

ersmith wrote: »

I think the short answer is: "In theory a C compiler can generate idiomatic PASM code, for some definition of idiomatic. Current C compilers for the Propeller achieve that goal sometimes, but in many cases fall short, so there is still considerable room for improvement."

To expand on this a little bit: was your original question really "can a C compiler produce PASM code that looks like a human wrote it?", or "can a C compiler produce PASM code that can do useful things in a single COG?". I think the latter is the more practically useful question, and directly addresses this thread. Daniel has written an I2C driver in C that runs in a COG, so clearly the answer is "yes". There are other examples, too -- I think Heater was the first to write a COG driver completely in C with his full duplex serial driver, and I know of a VGA driver as well. I'm sure there are more.

The caveat is that, as currently structured, Daniel's I2C driver requires 8 bytes or so of hub memory to be used as scratch space ("stack"). I don't think that's a huge cost, given how much easier C is to write than PASM, and in any case I don't think it would be hard to re-structure the code so it doesn't need any stack at all.

David Betz · 2012-06-08 06:16

ersmith wrote: »

To expand on this a little bit: was your original question really "can a C compiler produce PASM code that looks like a human wrote it?", or "can a C compiler produce PASM code that can do useful things in a single COG?". I think the latter is the more practically useful question, and directly addresses this thread. Daniel has written an I2C driver in C that runs in a COG, so clearly the answer is "yes". There are other examples, too -- I think Heater was the first to write a COG driver completely in C with his full duplex serial driver, and I know of a VGA driver as well. I'm sure there are more.

The caveat is that, as currently structured, Daniel's I2C driver requires 8 bytes or so of hub memory to be used as scratch space ("stack"). I don't think that's a huge cost, given how much easier C is to write than PASM, and in any case I don't think it would be hard to re-structure the code so it doesn't need any stack at all.

Actually, the i2c code in Daniel's driver came from a driver I wrote for my old PropBOE library that has since been moved into the main PropGCC library to use as part of the EEPROM COG loader functions. It runs entirely in a COG with no hub memory required other than the mailbox used to communicate with it. I've attached it to this message so you can see that it is actually very simple and was much faster to develop than a similar driver written in PASM. Of course, it doesn't make as efficient use of a COG as would be possible in PASM but it is good enough for the purpose and was easier to develop.

Edit: why does this forum software insist on renaming attachments? The file called i.c is supposed to be named i2c_driver.c.
