PropGCC / PASM question

JasonDorie · 2015-10-20 21:08

Here's a fun one. I've included a stripped down version of the project so hopefully someone more familiar with PropGCC can take a look. It appears that PropGCC is congealing a bunch of my variables into the same memory space.

My QuatIMU object declares a bunch of static variables. I've been getting some really weird results.

My code now has a main loop that prints the addresses of three different variables, like this:

      printf( "%08x\t%08x\t%08x\r", QuatIMU_GetTest(), QuatIMU_GetSensors()+1, QuatIMU_GetMatrix() );

That output looks like this:

0000107c        0000107c        00001080

Those three functions look like this:

int * QuatIMU_GetSensors(void)
{
  return &gx;
}  

float * QuatIMU_GetTest(void)
{
  return &rx;
}

float * QuatIMU_GetMatrix(void)
{
  return &m00;
}

And those variables are all declared independent of each other in a large block of statics, like this:

static int gx, gy, gz;
static int ax, ay, az;						// Sensor inputs
static int mx, my, mz;
static int alt, altRate;

// Internal orientation storage
static float qx, qy, qz, qw;				// Body orientation quaternion

  
static float m00, m01, m02;
static float m10, m11, m12;					// Body orientation as a 3x3 matrix
static float m20, m21, m22;

static float qdx, qdy, qdz, qdw;			// Incremental rotation quaternion

static float fx2, fy2, fz2;
static float fwx, fwy, fwz;					// Quaternion to matrix temp coefficients
static float fxy, fxz, fyz;

static float rx, ry, rz;					   // Float versions of rotation components
static float fax, fay, faz;					// Float version of accelerometer vector

So, why do those 3 variables that should be dozens of bytes apart end up overlapping in memory?

David? Any ideas? I've attached the SimpleIDE project and code so you can compile locally and see if you're getting the same results.

DavidZemon · 2015-10-20 21:12

Woh... very curious to try this when I get home.

JasonDorie · 2015-10-20 21:15

I've been banging my head against this for days. This is probably the single most complex piece of code in this project, so I'm not surprised that I'm having issues with it, but I am surprised I'm having *this* issue with it. It's bizarre.

DavidZemon · 2015-10-20 22:04

First problem, which you're only avoiding because you're running an evil operating system: propeller.h uses lowercase, not uppercase (line 8, Elev8-Main.c and line 6 Constants.h)

JasonDorie · 2015-10-20 22:07

Most of the build tools I use force paths to be lower case. I capitalize for readability. Sorry your OS is silly.

DavidZemon · 2015-10-20 22:22

Okay, you're simply seeing the optimizer at work. GCC find the unused variables and doesn't include them in the final executable, therefore mushing your three used instances right next to each other. If you don't want that to happen, you could use an array for your three different sensor values (which is what I would suggest - then index that array with an enum).

typedef enum {
  X,
  Y,
  Z,
  AXES
} Axis;

int g[AXES];

DavidZemon · 2015-10-20 22:24

I also learned from this example that GCC does not treat a pointer the same as an int when doing arithmetic. I was very confused that +1 gave you a word-aligned value, not an odd number. Turns out that if you cast it to an int before the addition, it works as I was previously expecting.

JasonDorie · 2015-10-20 22:54

Pointer arithmetic works on the size of the type pointed to. As for my variables, they're obviously used - they're being referenced in the quat command stream - I take the address of all of them. If you un-comment that block of stream commands, and uncomment the code that uses it, the addresses still resolve on top of each other. I can try putting them in a struct to see if it behaves, but so far this looks like a compiler bug. Taking the address of the variables is declaring their use, so &gx and &rx returning the same address is a bug to me, not an optimization.

DavidZemon · 2015-10-21 00:16

JasonDorie wrote: »

&gx and &rx returning the same address is a bug

But they're not returning the same address. &rx is 4 less than &gx, which is correct for the simple example you posted.

JasonDorie · 2015-10-21 01:03

In the post at the top of this page, where I first posted this bug, the first two values are identical : 0x107C, 0x107C, followed by 0x1080, and they refer to three different variables, all referenced by the stream of data stored in QuatUpdateCommands, so they are referenced, and they do overlap.

I can change the structure of the code to try to coax the compiler into not screwing these up, but this is a bug. Here's the output from the same code, compiled with Microsoft Visual C/C++:

00f741ec        00f74258        00f74148

DavidZemon · 2015-10-21 01:13

Are you missing the "+1" that exists in the code? That +1 is the only reason the printed addresses are the same.

JasonDorie · 2015-10-21 01:24

I stuck all the variables in a struct and now it works. Perfectly. First try. F#$* me.

DavidZemon · 2015-10-21 01:27

JasonDorie wrote: »

I stuck all the variables in a struct and now it works. Perfectly. First try. F#$* me.

Glad to hear it

JasonDorie · 2015-10-21 01:28

I compiled the *exact same* code in MSVC++ and in PropGCC. The MSVC compiler did not congeal rx and gx into the same memory address, and PropGCC did. I verified it twice.

I "fixed" the code by putting all the IMU working variables into a struct - basically stuck typedef struct { ... } IMU_VARS; around it and then declared an instance of it, and reference all the vars in that struct now instead. That works perfectly.

Tor · 2015-10-21 11:56

rx and gx aren't in the same memory address, you effectively print &(gx)+1 there. So they're sizeof (int) apart, which is fine. It's just the optimizer removing everything you didn't use, so they ended up being stored next to each other.
I get the same result on Linux with gcc (except that it's m00 that ends up adjacent to gx), when compiled with -O3 (probably also with -O2, but didn't test it). With -O0, optimizer off, all the non-used variables are also allocated storage and the result is different.
So nothing is wrong with PropGCC.

Edit: Did the same test on a Solaris 10 Sun computer, with Sun C compiler. Same result as with gcc.

-Tor

JasonDorie · 2015-10-21 18:29

I did a bunch of different tests, and the only way I could get GCC to respect the layout of the variables was to put them in a struct. I'll try to put something together tonight that clearly demonstrates that it's broken, or clearly demonstrates that I screwed up.

JasonDorie · 2015-10-21 18:57

Tor, you said "the non-used variables", which I don't get. There are no unused variables in that code. Look for an array called QuatUpdateCommands - every variable declared in that file is referenced within that array.

It's possible that when trimmed the code down to make it compile I made things worse, but even with all the code fully functional, I was getting bizarre layout and overlap of variables.

One thing I've noticed since is that GCC seems to arrange variables in the reverse order to their declaration, which I'm not used to. On the Prop, to speed up code, sometimes I block move data instead of copying variables one at a time, because it produces less instructions for Spin to interpret. In GCC, if the variable order isn't respected, that screws up. I still think what I was seeing is a distinct problem from that though.

In your version, try un-commenting this line:

  QuatIMU_AdjustStreamPointers( QuatUpdateCommands );

...and the body of this function:

void QuatIMU_AdjustStreamPointers( int * p )
{
	//while( p[0] != 0 )
	//{
	//	p[0] = (int)F32_GetCommandPtr( p[0] );				// Convert the instruction index into the address of a jump table instruction
	//	p += 4;
	//}
}

Un-commenting those will make use of the QuatUpdateCommands array where all the variables are referenced, in which case they should very certainly no longer be removable. If you're still seeing variables resolving to the same addresses at that point, there's an issue. On the other hand, maybe GCC's variable layout is such that variables are arranged to be in "first use" order. In that case, my functions asking for specific variables might bump those to the front of the list. I never asked for the memory locations of all variables - just those three, assuming that GCC's layout would be sequential by declaration. Now that I know it's not, I'll go back and look a little harder at what might be happening.

JasonDorie · 2015-10-21 19:03

It's worth mentioning I have this port about 85% done at this point. It's only slightly larger than the Spin version, and from what I can tell, significantly faster in the high-level code parts.

I have a couple minor objects to move over, but this should be ready to fly pretty soon. Thanks to all for the help.

DavidZemon · 2015-10-21 19:07

JasonDorie wrote: »
Tor, you said "the non-used variables", which I don't get. There are no unused variables in that code. Look for an array called QuatUpdateCommands - every variable declared in that file is referenced within that array.

It's possible that when trimmed the code down to make it compile I made things worse, but even with all the code fully functional, I was getting bizarre layout and overlap of variables.

One thing I've noticed since is that GCC seems to arrange variables in the reverse order to their declaration, which I'm not used to. On the Prop, to speed up code, sometimes I block move data instead of copying variables one at a time, because it produces less instructions for Spin to interpret. In GCC, if the variable order isn't respected, that screws up. I still think what I was seeing is a distinct problem from that though.

In your version, try un-commenting this line:
  QuatIMU_AdjustStreamPointers( QuatUpdateCommands );
...and the body of this function:
void QuatIMU_AdjustStreamPointers( int * p )
{
	//while( p[0] != 0 )
	//{
	//	p[0] = (int)F32_GetCommandPtr( p[0] );				// Convert the instruction index into the address of a jump table instruction
	//	p += 4;
	//}
}   
Un-commenting those will make use of the QuatUpdateCommands array where all the variables are referenced, in which case they should very certainly no longer be removable. If you're still seeing variables resolving to the same addresses at that point, there's an issue. On the other hand, maybe GCC's variable layout is such that variables are arranged to be in "first use" order. In that case, my functions asking for specific variables might bump those to the front of the list. I never asked for the memory locations of all variables - just those three, assuming that GCC's layout would be sequential by declaration. Now that I know it's not, I'll go back and look a little harder at what might be happening.

Ahhh, I missed those lines last night when I was looking for commented-out code. Yes - once those are uncommented, it should behave as you expect. But GCC is smart enough to recognize that QuatIMU_AdjustStreamPointers() contains no logic, therefore the variable passed into it is unused, therefore QuatUpdateCommands is unused, therefore, therefore therefore.... everything is unused in its current state, except for the three vars in the print command.

David Betz · 2015-10-21 19:31

JasonDorie wrote: »

I did a bunch of different tests, and the only way I could get GCC to respect the layout of the variables was to put them in a struct. I'll try to put something together tonight that clearly demonstrates that it's broken, or clearly demonstrates that I screwed up.

I don't think the C standard says anything about the layout of variables in memory unless they are collected together into a struct.

JasonDorie · 2015-10-21 20:02

I usually put things in classes, in which case the layout follows specific rules. This is a direct port of Spin, so for now I was leaving things the way they were in the original code. I think I'm going to go back and turn these all into C++ classes as a cleanup, and also to force layout.

It's possible that it's always been the reversing layout that was causing my issues, and my reducing the code just made it do different things.

ersmith · 2015-10-21 21:07

As David Zemon points out, GCC is pretty smart about figuring out when things are really unused (e.g. if a variable is referenced only inside a function that produces no result and changes no state, then that variable is effectively unused). If you remove the "static" from your variable declarations that will disable this particular optimization, since you're then saying that the variable might be used in other parts of the code that the compiler can't see right now.

And David Betz is right that there's no guarantee of where variables end up in memory if they're not in a structure.

Tor · 2015-10-22 01:45

Yep, I concur with ersmith and the two Davids - the C standard doesn't guarantee that the sequence of variable declarations matches the physical declaration, in fact I believe it's almost guaranteed not to do that, at least when optimization is on (just try running the executable through a debugger and print your variables -- most of them won't even be there, they're 'optimized out' unless compiled with -O0 -- all done inside registers or folded into expressions). And as long as the C compiler can see all of your code it can easily see if you're actually using the variables you assigned. If not, out with them. You can try with 'volatile' to leave something untouched, but that's not always going to work either. It's difficult to lie to the compiler.

But I'm not sure (I read the thread again from the start) what the problem really is? Because the compiler isn't doing anything wrong and the program should be doing exactly what you tell it to do. Is it "block moving" of data? I'm not sure how you would block-move a number of standalone variables in C without resorting to trying to do that behind the compiler's back, with some assembly code, and that would not be the right way. To block move data, put the variables into a struct and copy the struct.

-Tor

JasonDorie · 2015-10-22 01:57

I'm using memcpy to move data. To be fair, I'm used to using Spin on the prop which does no optimization whatsoever. With GCC it's probably not necessary to manually translate stuff like a bunch of sequential moves into a memcpy - the compiler should be smart enough to do that.

I'm used to working in C++ where layout is more rigid because it's inside classes. I do understand that the compiler is free to rearrange and remove things (I've written one, just not a plain C one).

DavidZemon · 2015-10-22 02:02

JasonDorie wrote: »

I'm used to working in C++ where layout is more rigid because it's inside classes. I do understand that the compiler is free to rearrange and remove things (I've written one, just not a plain C one).

If you're used to writing in C++, why are you doing it in C? I know you said you're planning to move back to C++ already, but I'm just curious what made you decide to start in C.

JasonDorie · 2015-10-22 02:34

DavidZemon wrote: »

If you're used to writing in C++, why are you doing it in C? I know you said you're planning to move back to C++ already, but I'm just curious what made you decide to start in C.

I actually tried to use C++ with the first object I converted, but it didn't work for some reason. I'm about to try it again. The object is intended to be a single instance, so I made all the members static, and it's interfacing with a COG driver object. I was worried I had hit a case no one had tried yet.

Tor · 2015-10-22 02:36

The equivalent of using C++ classes to hold data is to use structs in C. And that works fine, of course. And a struct can be copied as-is too.. struct a a1; struct a a2 = a1;
no memcpy needed. But even with structs (or classes!) you can't just do

struct a
{
    int  i1;
    char c1; 
    short s1;
    int i2;
};

memcpy (&somewhere, &a.c1, 6);

and expect it to copy c1, s1, and i2. The compiler will almost certainly insert padding to align some of the variables. Where it does that depends on the architecture, but on x86 it would align chars on 1-byte boundaries, shorts on 2-byte boundaries, and ints on 4-byte boundaries. Although on x86 you could add a __packed<something> attribute and it would really pack it. On MIPS it can pack it too, but would need to add additional code whenever the variables are to be used or there would be bus errors. I never use that packed attribute (it's GCC specific anyway).

DavidZemon · 2015-10-22 02:57

Definitely give C++ a try and look to PropWare and libpropeller for inspiration. If you have any questions about how to do something specific to the Prop (since it sounds like you already know C++ in general fairly well), I'd be happy to help.

It is odd though that you'd make the variables static, just because you only had a single instance. I haven't heard of anyone doing that before. What was your reasoning?

JasonDorie · 2015-10-22 03:00

Verified again - I tried to convert the RC Receiver object to an all-static C++ class and it doesn't function.

I've attached a sample that lets you flip between versions by commenting / uncommenting the USE_CPP define at the top. When #define USE_CPP is commented out, the code properly reads my radio control receiver and displays the expected output. When I enable USE_CPP, the code just displays -3000, meaning that the values in the Pins array are never being set to non-zero.

I'll poke at it a little bit - maybe it's that layout issue again, but it shouldn't be because these are now in a class.

JasonDorie · 2015-10-22 03:05

Tor wrote: »

The compiler will almost certainly insert padding to align some of the variables.

I know this, but when using variables that are all the same size and type this doesn't happen. In my case, all of the variables defined were the same type or size (float, long, and int = 4 bytes on the Prop).

So, if you've written:

struct {
  long a;
  long b;
  long c;
} MyStruct;

I don't see why it's unreasonable to write memcpy( targetAddr, &MyStruct.b, sizeof(long) * 2 );

PropGCC / PASM question

Comments