Shop OBEX P1 Docs P2 Docs Learn Events
PropGCC / PASM question - Page 4 — Parallax Forums

PropGCC / PASM question

1246

Comments

  • JasonDorieJasonDorie Posts: 1,930
    edited 2015-10-22 03:21
    Oh bloody hell - It's the same issue - It's reordering the variables because they're static, and therefore global.

    I modified the code to display the memory addresses of the Pins[] array followed by the PinMask value, and it spit out this:

    003688 003684

    So in memory they're in the opposite order that they were declared, even though they're in a class, most likely because they're static. I only need specific layout of these particular variables because they are being passed to a Spin-extracted COG driver with a single address parameter. Having to shove everything into a struct (inside a class) to make sure GCC won't randomize memory order seems excessive.
  • David BetzDavid Betz Posts: 14,516
    edited 2015-10-22 03:23
    JasonDorie wrote: »
    Oh bloody hell - It's the same issue - It's reordering the variables because they're static, and therefore global.

    I modified the code to display the memory addresses of the Pins[] array followed by the PinMask value, and it spit out this:

    003688 003684

    So in memory they're in the opposite order that they were declared, even though they're in a class, most likely because they're static. I only need specific layout of these particular variables because they are being passed to a Spin-extracted COG driver with a single address parameter. Having to shove everything into a struct (inside a class) to make sure GCC won't randomize memory order seems excessive.
    Maybe put them inside a static struct?
    class Receiver
    {
    public:
      static void Start(void);
      static int  Get(int _pin);
      static int  GetRC(int _pin);
      static int  Channel(int _pin);
    
    private:
      static struct {
        long Pins[8];
        long PinMask;
      } driverVars;
    };
    

  • That's what I had to do to make it work (that's why I said "putting them in a struct inside a class seems excessive").

    It works, but it makes the code more verbose, because now everywhere they're referenced I have to prefix with "driverVars." It's not a big deal, but it's a pain, and no other compiler I use does this, so it's another unexpected gotcha, that's all.
  • I feel like I should be speaking in an old man voice, waving my fists at the screen, yelling, "In my day, compilers didn't rearrange everything on you all willy-nilly!"

    I write code for a living, on modern compilers. I swear. :)
  • jmgjmg Posts: 15,182
    JasonDorie wrote: »
    ... but it's a pain, and no other compiler I use does this, so it's another unexpected gotcha, that's all.
    It does seem excessive for a MCU compiler, and hard to manage.
    Do all flavours of GCC have this issue ?
  • TorTor Posts: 2,010
    edited 2015-10-22 04:21
    I really really really don't think it's a good idea to use memcpy with a variable's address as input and giving it a 'size' value larger than the variable and assume it will nicely copy the next (in source) item as well. Just copy the variables by assignment. The compiler is smart enough to optimize that to a fast operation whenever possible, and if it's not possible (due to padding or re-arrangement) the user-trick wouldn't work either.
    If you have internal parts of a struct that you need to copy as a chunk very often, then make it a chunk:
    #include <string.h>
    
    struct chunk
    { 
      int something;
      int more;
    };
    
    struct all
    {
       char abc;
       struct chunk chunk;
       int def;
    };
    
    int main (void) {
       struct all data1;
       struct all data2;
       void *somewhere; // needs assignment    
       data1.chunk = data2.chunk; // or also:
       memcpy (somewhere, &data2.chunk, sizeof (data2.chunk));
       return (0);
    }
    
    C is a high level language, it's not assembler (although some people like to say it's 'close to assembler' - it isn't really). For a high level language you need to move with the language, not against it.

    -Tor
  • I think part of the issue for me is that I'm so used to Spin on the Prop requiring things like this to make it fast enough that while porting my Spin code it's hard to get away from that. I normally do exactly as you say and make a sub-struct, but I was trying to do a fairly direct port of the existing code with as little translation as possible, just to see if it was viable, would be small enough, fast enough, and so on.

    I figured once it worked I'd go back and clean it up, but it kept not working. Now that it *is* working, I'm cleaning up the code, moving things into classes, and so on.
  • TorTor Posts: 2,010
    edited 2015-10-22 05:33
    There's one more thing actually.. if you tell the compiler what you're trying to do, e.g.
    a = a1;
    b = b1;
    c = c1;
    d = d1;
    e = e1;
    ..
    and so on, the optimizer may well decide to store those variables together and do an internal 'memcpy'. But if you instead do
    memcpy (&a, &a1, 4*4);
    it won't see what you want to do, and cannot optimize. Worse, it may even never assign those variables (b* through e*) at all if the only way you interact with them is via intentional out-of-bounds memcpy operations.. well I guess you wouldn't actually never access any them individually, but you can be sure that if something isn't actively used for real then the optimizer will remove it. Or, if you only assign a variable by writing to it by an out-of-bounds memcpy then when you read it (as in 'for (i=0;i<1000;i++){if (a == 4) {break;}') then the optimizer will assume that it was never changed and bad things happens (that loop may not loop, for example).
  • Like I said, I've written a compiler, and I write C/C++ for a living, I'm just stuck in Spin-brain because I'm porting Spin code and trying to be overly pedantic about not doing anything OTHER than a direct port so I screw up as little as possible in the process. The Spin compiler does NO optimization whatsoever, and it's difficult for me to remember that this is no longer Spin.

    The Spin code does the memmove thing all over the place to make it fast enough - doing a per-member copy, the Spin compiler will turn into a bunch of sequential instructions. I know I don't have to worry about that here, I'm just trying to do a straight port until everything is working, then I can clean up.
  • Here's another fun one.

    This works:
    #define Set( var, bit )   { var |= 1<<bit; }
    #define Clear( var, bit ) { var &= ~(1<<bit); }
    
    void Test(void)
    {
      Set( DIRA, 29 );
      Set( DIRA, 28 );
    
      while( 1 ) {
        Clear( OUTA, 28 );
        Set(   OUTA, 28 );
      }
    }
    

    ...but this doesn't:
    static void Set( volatile unsigned int &var, int bit )   { var |= 1<<bit; }
    static void Clear( volatile unsigned int &var, int bit ) { var &= ~(1<<bit); }
    
    void Test(void)
    {
      Set( DIRA, 29 );
      Set( DIRA, 28 );
    
      while( 1 ) {
        Clear( OUTA, 28 );
        Set(   OUTA, 28 );
      }    
    }
    

    Isn't the latter version technically more correct? OUTA and DIRA are defined as volatile unsigned int, so this shouldn't this work?
  • jmg wrote: »
    JasonDorie wrote: »
    ... but it's a pain, and no other compiler I use does this, so it's another unexpected gotcha, that's all.
    It does seem excessive for a MCU compiler, and hard to manage.
    Do all flavours of GCC have this issue ?
    I'm not sure why you consider this "an issue". GCC tries hard to generate compact and fast code. I would think you'd want that most on an MCU where resources are limited. I suspect any good optimizing compiler will do the same. Simple compilers like Spin and ones I've written do a more direct translation of source to object but they also produce far less efficient code.

  • JasonDorie wrote: »
    Here's another fun one.

    This works:
    #define Set( var, bit )   { var |= 1<<bit; }
    #define Clear( var, bit ) { var &= ~(1<<bit); }
    
    void Test(void)
    {
      Set( DIRA, 29 );
      Set( DIRA, 28 );
    
      while( 1 ) {
        Clear( OUTA, 28 );
        Set(   OUTA, 28 );
      }
    }
    

    ...but this doesn't:
    static void Set( volatile unsigned int &var, int bit )   { var |= 1<<bit; }
    static void Clear( volatile unsigned int &var, int bit ) { var &= ~(1<<bit); }
    
    void Test(void)
    {
      Set( DIRA, 29 );
      Set( DIRA, 28 );
    
      while( 1 ) {
        Clear( OUTA, 28 );
        Set(   OUTA, 28 );
      }    
    }
    

    Isn't the latter version technically more correct? OUTA and DIRA are defined as volatile unsigned int, so this shouldn't this work?
    It seems like that should work as long as you compile it as C++. C doesn't have reference parameters.

  • ElectrodudeElectrodude Posts: 1,661
    edited 2015-10-22 13:03
    JasonDorie wrote: »
    Here's another fun one.

    This works:
    #define Set( var, bit )   { var |= 1<<bit; }
    #define Clear( var, bit ) { var &= ~(1<<bit); }
    
    void Test(void)
    {
      Set( DIRA, 29 );
      Set( DIRA, 28 );
    
      while( 1 ) {
        Clear( OUTA, 28 );
        Set(   OUTA, 28 );
      }
    }
    

    ...but this doesn't:
    static void Set( volatile unsigned int &var, int bit )   { var |= 1<<bit; }
    static void Clear( volatile unsigned int &var, int bit ) { var &= ~(1<<bit); }
    
    void Test(void)
    {
      Set( DIRA, 29 );
      Set( DIRA, 28 );
    
      while( 1 ) {
        Clear( OUTA, 28 );
        Set(   OUTA, 28 );
      }    
    }
    

    Isn't the latter version technically more correct? OUTA and DIRA are defined as volatile unsigned int, so this shouldn't this work?

    You do realize you're using a reference to a cogram register, right? I'm suprised the compiler doesn't complain about this!

    Why aren't OUTA and DIRA defined as volatile unsigned register int? Am I missing something?
  • JasonDorie wrote: »
    Here's another fun one.

    This works:
    #define Set( var, bit )   { var |= 1<<bit; }
    #define Clear( var, bit ) { var &= ~(1<<bit); }
    
    void Test(void)
    {
      Set( DIRA, 29 );
      Set( DIRA, 28 );
    
      while( 1 ) {
        Clear( OUTA, 28 );
        Set(   OUTA, 28 );
      }
    }
    

    ...but this doesn't:
    static void Set( volatile unsigned int &var, int bit )   { var |= 1<<bit; }
    static void Clear( volatile unsigned int &var, int bit ) { var &= ~(1<<bit); }
    
    void Test(void)
    {
      Set( DIRA, 29 );
      Set( DIRA, 28 );
    
      while( 1 ) {
        Clear( OUTA, 28 );
        Set(   OUTA, 28 );
      }    
    }
    

    Isn't the latter version technically more correct? OUTA and DIRA are defined as volatile unsigned int, so this shouldn't this work?

    You do realize you're using a reference to a cogram register, right? I'm suprised the compiler doesn't complain about this!

    Why aren't OUTA and DIRA defined as volatile unsigned register int? Am I missing something?
    Ah, good point. The are not in the hub address space so you can't access them indirectly.

  • If you need fast, easy and efficient bit-banging code, give PropWare's Pin class a shot.

    You can either drag PropWare.h, port.h and pin.h into your project, or download all the headers from here.
  • DavidZemonDavidZemon Posts: 2,973
    edited 2015-10-22 13:49
    And if what you really want is to convert Spin to C++ with no changes, and with the intention of cleaning it up later, I'd strongly encourage you to use spin2cpp. Spin2cpp was designed for exactly what you're doing. It's not necessarily pretty (though the author has been doing work to make it prettier) but it works. Then once you verified it works, you can go in and manually start cleaning things up.

    Spin2cpp releases: https://github.com/totalspectrum/spin2cpp/releases
  • I'm nearly done at this point, so hopefully it's moot. The reference to OUTA / DIRA thing tripped me up for a while. I realize they're COG registers, but I have no idea what the under-the-hood handling of them in the compiler looks like. They were declared as variables so I used them.

    Normally I'd use high(), low(), and input(), but I was porting an EEPROM object with very specific ordering of setting the state and directions of the pins. I am aware there's already an EEPROM object, but I didn't want one that takes a full cog, and again about the direct porting thing.

    I made my first test flight last night and it actually works. I have no yaw control, but that should be relatively simple to track down. Flight stability and IMU is all working, so thanks to all of you for your help.
  • DavidZemonDavidZemon Posts: 2,973
    edited 2015-10-22 16:58
    JasonDorie wrote: »
    I made my first test flight last night and it actually works. I have no yaw control, but that should be relatively simple to track down. Flight stability and IMU is all working, so thanks to all of you for your help.

    THAT is very cool! :D

    Also, PropWare provides a fast and small I2C object which does not require a dedicated cog. You may find that helpful as well. It's nearly a direct copy of libpropeller's I2C object, but allows 16-bit addresses.
  • The one I ported is "Propeller EEPROM.spin", which includes all the i2c stuff in the code. I'll probably try to compact it some now that it's working, and I still need to verify it a bunch (write across page boundaries, read it back, compare) but so far as I can tell it's working correctly.

    When you said "16 bit addresses" I got excited for a sec. One of the current bloat points is my command stream array - in Spin / PASM, all the hub addresses were 16 bit, but GCC won't link unless I make them 32 bit, which doubled the size of my array. That's taking about 5kb in a 21kb project, so it's significant. I think I know how I'm going to fix it, but it's going to be a chunk of work.
  • DavidZemonDavidZemon Posts: 2,973
    edited 2015-10-22 18:44
    JasonDorie wrote: »
    One of the current bloat points is my command stream array - in Spin / PASM, all the hub addresses were 16 bit, but GCC won't link unless I make them 32 bit, which doubled the size of my array. That's taking about 5kb in a 21kb project, so it's significant. I think I know how I'm going to fix it, but it's going to be a chunk of work.

    Perhaps this is what you already thought of, but in case not: you could use uint16_t for your array type and then cast to pointer wherever needed. But I can imagine that would be a chunk of (boring and tedious) work.
  • Doesn't work in my case - the array is built at compile time, and the linker had to resolve the addresses when linking, but it doesn't handle type conversion or even simple truncation.

    I'm going to have to create an enum of all the variables I use, put indices into the array, and convert the indices to addresses at runtime. I could just build the array at runtime, but the code to do it is 2x the size of the resulting array, which is why building it at compile time is attractive.
  • DavidZemonDavidZemon Posts: 2,973
    edited 2015-10-22 18:59
    Is the code available in VCS somewhere? It sounds very cool and I'd love to poke at it.
  • jmgjmg Posts: 15,182
    edited 2015-10-22 19:02
    David Betz wrote: »
    I'm not sure why you consider this "an issue". GCC tries hard to generate compact and fast code. I would think you'd want that most on an MCU where resources are limited. I suspect any good optimizing compiler will do the same. Simple compilers like Spin and ones I've written do a more direct translation of source to object but they also produce far less efficient code.
    I'm fine with Compilers shuffling open variables, (& especially local ones) but this is the first time I've seen a compiler shuffle inside a declared record structure.
    Those record structures may be replicating some external file storage, you really do not want a compiler making that up as it goes along.
    Records are usually order-preserved with the Pack option available.

    GCC is free to try hard to generate compact and fast code every where else.
  • jmg wrote: »
    ...but this is the first time I've seen a compiler shuffle inside a declared record structure.
    Those record structures may be replicating some external file storage, you really do not want a compiler making that up as it goes along.

    I tend to agree, which is why I was surprised by this. Loose variables, sure, move those around. Once they're in a struct, there's probably a reason for it. If I make that struct static it doesn't mean I didn't want them in that order.
  • DavidZemon wrote: »
    Is the code available in VCS somewhere? It sounds very cool and I'd love to poke at it.

    I'm doing regular commits here:
    https://github.com/parallaxinc/Flight-Controller

    The "Firmware" folder is the original Spin version, and "Firmware-C" is the C/C++ version. It compiles under SimpleIDE.

    I wasn't intending to port it to C/C++, but I saw some recent performance stats that showed cmm code being better than a 2x speedup over Spin, and with a relatively small size penalty. I'd been having some issues with the speed of the high-level Spin loop, so I did some tests of my own, and they confirmed the results.

    Right now, as long as I leave out printf and dprint, the compiled code is about 21kb, which compares favorably to the Spin version, and I should be able to trim that down even more.
  • JasonDorie wrote: »
    DavidZemon wrote: »
    Is the code available in VCS somewhere? It sounds very cool and I'd love to poke at it.

    I'm doing regular commits here:
    https://github.com/parallaxinc/Flight-Controller

    Didn't realize this was part of Parallax's official Elev8 code.
    "Firmware-C" is the C/C++ version. It compiles under SimpleIDE.

    Good! I know I'll need to convert the spin file to .S, but other that, hopefully it compiles under PropWare without issue. It'll be a good test! Wish I had the hardware to test it

    You might give printi and print a try (from simpletext.h) instead of printf if you find yourself wanting formatted printing but not having space.
  • I don't really need printf, I just use it for debug testing. The Elev8-FC application does a full display of all sensor readings, shows the IMU orientation, etc, etc. That's the primary output tool, but when I'm porting I need to look at the intermediates sometimes, and printf comes in handy then.

    I've been relatively quiet about it, but yes, this is the firmware for Parallax's new Elev8 Flight Controller. The board and the code are still works in progress, but it already flies, is quite stable, and still has two cogs and 10kb to spare.
  • JasonDorie wrote: »
    I don't really need printf, I just use it for debug testing. The Elev8-FC application does a full display of all sensor readings, shows the IMU orientation, etc, etc. That's the primary output tool, but when I'm porting I need to look at the intermediates sometimes, and printf comes in handy then.

    I've been relatively quiet about it, but yes, this is the firmware for Parallax's new Elev8 Flight Controller. The board and the code are still works in progress, but it already flies, is quite stable, and still has two cogs and 10kb to spare.
    What sensors does this new board have?

  • Gyro, accelerometer, magnetometer, and barometer for sensors. I'm working on integrating Ping))) for near-ground altitude hold. It has a self-leveling mode, in addition to "manual" flight (like the HoverFly Sport), and you can switch between modes in-flight.
  • JasonDorie wrote: »
    Gyro, accelerometer, magnetometer, and barometer for sensors. I'm working on integrating Ping))) for near-ground altitude hold. It has a self-leveling mode, in addition to "manual" flight (like the HoverFly Sport), and you can switch between modes in-flight.
    Sounds great! Congratulations!

Sign In or Register to comment.