QMP+RP2 drivers working in all XMM modes! — Parallax Forums

Rayman Posts: 14,665
edited 2013-05-14 08:59 in Propeller 1
I've been putting HUBDATA and volatile everywhere I can think of, but it still won't work...

I've whittled this test down to just the display driver (no fsrw or any sub objects).

The attached code runs fine in LMM and CMM modes with any optimization.
This was converted with spin2cpp and the original spin file is here too:
QMP_GccTest - Archive [Date 2013.05.08 Time 16.01].zip

But, it doesn't work in xmmc or xmm-single mode with Rampage2X.
In xmmc mode, it appears to try to work: it initializes the screen and then fills it with blue.
But then, it misbehaves badly and sets some random pixels on one horizontal row and then stops.
In xmm-single mode it doesn't even fill the screen.

I've tested ebasic with this setup, so I'm sure xmmc and xmm-single modes can work...

Any ideas?

Update: It now works in -xmmc mode after adding HUBDATA and volatile and HUBTEXT in all the right places :)
-xmm-single and split modes still don't work though.

Update: All modes now working! A lot of my issues were due to using a prototype version of Rampage2 with pulldown resistors instead of pullup resistors on the memory chip selects. Fortunately, the real boards have the correct pullup resistors.

Also, I replaced FSRW with stdio because I couldn't make FSRW work in xmm-single or split modes.

Comments

  • Rayman Posts: 14,665
    edited 2013-05-08 18:02
    Are there any demos that start a cog with a data array and work in xmm mode?
  • jazzed Posts: 11,803
    edited 2013-05-08 18:54
    Rayman wrote: »
    Are there any demos that start a cog with a data array and work in xmm mode?

    Yes.

    This is from the gas_toggle example. It could easily use an array instead of __builtin_alloca().
    /*
     * function to start up a new cog running the toggle
     * code (which we've placed in the .cogtoggle section)
     */
    void start_cog(void)
    {
        extern unsigned int _load_start_cogtoggle[];
    
    
        /* now start the kernel */
    #if defined(__PROPELLER_XMMC__) || defined(__PROPELLER_XMM__)
        unsigned int *buffer;
    
    
        // allocate a buffer in hub memory for the cog to start from
        buffer = __builtin_alloca(2048);
        memcpy(buffer, _load_start_cogtoggle, 2048);
        cognew(buffer, 0);
    #else
        cognew(_load_start_cogtoggle, 0);
    #endif
    }
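
A static-array variant of the snippet above might look like the following sketch. It is not taken from the gas_toggle example: `start_cog_from_array`, `cog_image`, and `COG_IMAGE_LONGS` are names invented here, and the `#else` branch is a host-side stand-in for `cognew()` so the code also compiles off-target. The buffer is tagged HUBDATA on the assumption that, in the xmm-single and xmm-split modes, ordinary data may otherwise be placed in external memory, where a starting cog cannot read it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifdef __PROPELLER__
#include <propeller.h>            /* real HUBDATA and cognew() */
#else
#define HUBDATA                   /* host build: no special section */
/* Host stand-in (assumption): pretends a cog was started and returns id 1. */
static int cognew(const void *code, void *par)
{
    (void)code;
    (void)par;
    return 1;
}
#endif

#define COG_IMAGE_LONGS 512       /* 2048 bytes, the size the example copies */

/* Static staging buffer kept in hub RAM instead of on the stack. */
static HUBDATA uint32_t cog_image[COG_IMAGE_LONGS];

/* Copy a cog image (which may live in external memory) into hub RAM,
 * then start a cog from the hub copy. Returns whatever cognew() returns,
 * or -1 if the image does not fit in the buffer. */
int start_cog_from_array(const uint32_t *image, size_t longs)
{
    if (longs > COG_IMAGE_LONGS)
        return -1;
    memcpy(cog_image, image, longs * sizeof(uint32_t));
    return cognew(cog_image, 0);
}
```

The trade-off against `__builtin_alloca()` is that the 2 KB buffer stays allocated in hub RAM for the life of the program, although it can be reused for each cog that is started.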
    
  • Rayman Posts: 14,665
    edited 2013-05-09 13:03
    Finally figured out the problem (pretty obvious now...).
    QMP uses the same data bus (P0..P7) as the Rampage2 board...

    Now, I need to figure out if it is possible to share the bus...
  • David Betz Posts: 14,516
    edited 2013-05-09 13:07
    Rayman wrote: »
    Finally figured out the problem (pretty obvious now...).
    QMP uses the same data bus (P0..P7) as the Rampage2 board...

    Now, I need to figure out if it is possible to share the bus...
    The RamPage2 cache driver lets go of the pins when it is not active so you might be able to share them with other hardware. You'll have to make sure that any code that uses that other hardware is declared as HUBTEXT so it doesn't cause any cache misses when it is run though.
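
One way to picture the sharing David describes is a display routine that drives P0..P7 only for the duration of a transfer and then returns the pins to inputs, much as the cache driver is said to do. The sketch below is an illustration under stated assumptions, not code from the QMP driver: `shared_bus_write` and `BUS_MASK` are hypothetical names, the write strobe is omitted because it depends on the hardware, and the `#else` branch replaces the pin registers with plain variables so the logic can be checked on a host.

```c
#include <stdint.h>

#ifdef __PROPELLER__
#include <propeller.h>            /* real DIRA, OUTA and HUBTEXT */
#else
#define HUBTEXT                   /* host build: no section placement */
static volatile uint32_t DIRA;    /* host stand-ins for the pin registers */
static volatile uint32_t OUTA;
#endif

#define BUS_MASK 0x000000FFu      /* P0..P7, the bus shared with the RamPage2 */

/* Kept in hub memory so that running it cannot trigger a cache miss
 * while this code is the one driving the shared pins. */
HUBTEXT void shared_bus_write(uint8_t value)
{
    OUTA = (OUTA & ~BUS_MASK) | value;   /* put the byte on P0..P7 */
    DIRA |= BUS_MASK;                    /* drive the bus */
    /* A real driver would pulse its hardware-specific write strobe here. */
    DIRA &= ~BUS_MASK;                   /* release the bus: pins back to inputs */
}
```

Bits outside `BUS_MASK` are left untouched, so other pins owned by the same cog keep their direction and output state.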
  • Rayman Posts: 14,665
    edited 2013-05-10 03:32
    Ok, I'll give it a shot. Will seem like magic if it works...

    BTW: Did you guys invent "Hubdata" and "Hubtext"? Or, did you base it on something else?


    Also: If I declare a function "Hubtext" and it calls another function, what happens? Does it remain in hub, or does that break it?


    Multithreading would be impossible with a shared memory bus, right?
  • David Betz Posts: 14,516
    edited 2013-05-10 03:43
    Rayman wrote: »
    Ok, I'll give it a shot. Will seem like magic if it works...

    BTW: Did you guys invent "Hubdata" and "Hubtext"? Or, did you base it on something else?
    They are just macros that expand into standard GCC syntax for placing code or data in a specific section. The real GCC syntax for HUBTEXT is this:
    __attribute__((section(".hubtext")))
    
    Now which would you rather type? :-)
    Rayman wrote: »
    Also: If I declare a function "Hubtext" and it calls another function, what happens? Does it remain in hub, or does that break it?
    No, only the functions you declare with HUBTEXT will be located in hub memory. That means you need to restrict that to leaf functions (functions that don't call any other functions) in practice. I've typically only placed the mailbox loop in hub memory and left most of the complex logic in external memory.

    However, it doesn't "break" anything to do that. It just doesn't achieve your goal of eliminating the possibility of cache misses.
    Rayman wrote: »
    Multithreading would be impossible with a shared memory bus, right?
    Yes, of course it is impossible. Now go do it! :-)
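
To make the macro expansion concrete, here is a minimal sketch that spells out the section attributes by hand. `MY_HUBTEXT`, `MY_HUBDATA`, `mailbox`, `mailbox_take`, and `pack_rgb565` are names made up for this illustration; real PropGCC code would normally use the HUBTEXT and HUBDATA macros instead. The `.hubdata` section name is assumed by analogy with `.hubtext`, and the pixel format is arbitrary. Both functions are leaf functions, so nothing they call can sit in external memory.

```c
#include <stdint.h>

#ifdef __PROPELLER__
/* Hand-written equivalents of the macros; ".hubdata" is an assumption
 * made by analogy with the ".hubtext" name quoted in the thread. */
#define MY_HUBTEXT __attribute__((section(".hubtext")))
#define MY_HUBDATA __attribute__((section(".hubdata")))
#else
/* Host build: leave placement to the default sections. */
#define MY_HUBTEXT
#define MY_HUBDATA
#endif

/* volatile because another cog may change it; hub-resident so that
 * reading it never goes through the external-memory cache. */
static volatile MY_HUBDATA uint32_t mailbox;

/* Leaf function: fetch and clear the pending command without calling anything. */
MY_HUBTEXT uint32_t mailbox_take(void)
{
    uint32_t cmd = mailbox;
    mailbox = 0;
    return cmd;
}

/* Leaf function: pack 8-bit R, G, B values into a 16-bit RGB565 pixel. */
MY_HUBTEXT uint32_t pack_rgb565(uint32_t r, uint32_t g, uint32_t b)
{
    return ((r & 0xF8u) << 8) | ((g & 0xFCu) << 3) | ((b & 0xF8u) >> 3);
}
```

If either function called a helper that was not also placed in hub memory, the call itself would still work, but it could cause the cache miss the placement was meant to avoid.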
  • jazzed Posts: 11,803
    edited 2013-05-10 07:25
    Rayman wrote: »
    Multithreading would be impossible with a shared memory bus, right?

    No. It's just that we have disallowed multi-threading on xmm for now mainly because of the XMM cache mechanism (full multi-cog, multi-threading is available on LMM with pthreads). David is experimenting with new cache ideas for P2 (and P1) where multiple "cogstart" functions can be run on XMM just like on LMM. A lock will be required for sharing pins between a peripheral and XMM for that case.
  • Rayman Posts: 14,665
    edited 2013-05-10 07:37
    Ok, it now works in xmmc and xmm-split modes.. Hurray!

    Doesn't work in xmm-single mode though. Isn't that strange?
  • David Betz Posts: 14,516
    edited 2013-05-10 07:49
    Rayman wrote: »
    Ok, it now works in xmmc and xmm-split modes.. Hurray!

    Doesn't work in xmm-single mode though. Isn't that strange?
    How big is the program? How big is your SRAM?
  • Rayman Posts: 14,665
    edited 2013-05-10 10:42
    program only uses 8 out of 256 kB...

    Weird thing is that it almost works... This little test code starts the LCD, then shows random lines, then shows random dots.
    In xmm-single mode, it shows the random lines fine and then reboots.
  • David Betz Posts: 14,516
    edited 2013-05-10 10:57
    Rayman wrote: »
    program only uses 8 out of 256 kB...

    Weird thing is that it almost works... This little test code starts the LCD, then shows random lines, then shows random dots.
    In xmm-single mode, it shows the random lines fine and then reboots.
    I think you forgot the attachment.

    Maybe your problem is the crappy cache driver you're using. You should get someone who knows what they're doing to write one for you! :-)
  • Rayman Posts: 14,665
    edited 2013-05-10 11:00
    cache driver must work because it runs in split mode, right?
  • David Betz Posts: 14,516
    edited 2013-05-10 11:01
    Rayman wrote: »
    cache driver must work because it runs in split mode, right?
    You would think so. Can you send me the code you're trying to run? I have one of your QMP boards so I should be able to try it.
  • Rayman Posts: 14,665
    edited 2013-05-10 11:09
    Actually, maybe split isn't really working... I just tried this with my full blown QMP Demo program.
    xmmc mode works, but single and split don't.
    Maybe this means that there's some variable that should be in hub that isn't...
    Let me try some things before bothering you more...

    BTW: I'm kinda amazed the fsrw is working this way with no changes at all.
    Just using the raw spin2cpp output and it's reading bitmap images from uSD just fine...
  • David Betz Posts: 14,516
    edited 2013-05-10 12:28
    Rayman wrote: »
    BTW: I'm kinda amazed the fsrw is working this way with no changes at all.
    Just using the raw spin2cpp output and it's reading bitmap images from uSD just fine...
    Yes, Eric's spin2cpp program is truly impressive!
  • Rayman Posts: 14,665
    edited 2013-05-10 12:33
    I've tried a few things, but still can't make single or split work right...
    Definitely a caching issue, I think because it crashes when leaving a big loop.

    I noticed something interesting though... I had forgotten that this code, in the main Spin routine,
    set the flash and sram ce pins high, to keep them from doing anything.
    But, somehow, the flash was working anyway! How can this be? I guess the flash cache driver must be running in the same cog as the lmm cog?

    I'll post a small example that doesn't work in single mode, in case somebody can see the problem...


    PS: I'm still wondering why FSRW works just fine without any volatile or hubdata/hubtext additions...
  • David Betz Posts: 14,516
    edited 2013-05-10 12:35
    Rayman wrote: »
    I've tried a few things, but still can't make single or split work right...
    Definitely a caching issue, I think because it crashes when leaving a big loop.

    I noticed something interesting though... I had forgotten that this code, in the main Spin routine,
    set the flash and sram ce pins high, to keep them from doing anything.
    But, somehow, the flash was working anyway! How can this be? I guess the flash cache driver must be running in the same cog as the lmm cog?

    I'll post a small example that doesn't work in single mode, in case somebody can see the problem...
    The cache driver runs in a different COG. I'm not sure why it would work if you've set the CS pins high in the main code.
  • Rayman Posts: 14,665
    edited 2013-05-10 12:50
    Ok, I think it's working due to a bug in Spin2Cpp...

    My Spin code had this:
    outa[9..10]~~
    dira[9..10]~~

    But, it came out of Spin2Cpp like this:
    OUTA |= (3<<10);
    DIRA |= (3<<10);
    

    Isn't that off a bit?
  • David Betz Posts: 14,516
    edited 2013-05-10 12:55
    Rayman wrote: »
    Ok, I think it's working due to a bug in Spin2Cpp...

    My Spin code had this:
    outa[9..10]~~
    dira[9..10]~~

    But, it came out of Spin2Cpp like this:
    OUTA |= (3<<10);
    DIRA |= (3<<10);
    

    Isn't that off a bit?
    Seems like it. If you take those two instructions out does xmm-single start working?
  • Rayman Posts: 14,665
    edited 2013-05-10 12:58
    unfortunately, not. These weren't in my smaller test program anyway.


    But, if I change it to this:
    OUTA |= (3<<9);
    DIRA |= (3<<9);

    I can keep xmmc from working :)
  • David Betz Posts: 14,516
    edited 2013-05-10 13:03
    Rayman wrote: »
    unfortunately, not. These weren't in my smaller test program anyway.


    But, if I change it to this:
    OUTA |= (3<<9);
    DIRA |= (3<<9);

    I can keep xmmc from working :)
    You should probably let Eric know about this spin2cpp issue.
  • jazzed Posts: 11,803
    edited 2013-05-10 13:17
    Rayman wrote: »
    ... Isn't that off a bit?
    Exactly a bit :)
  • ersmith Posts: 6,054
    edited 2013-05-10 15:04
    Rayman wrote: »
    Ok, I think it's working due to a bug in Spin2Cpp...

    My Spin code had this:
    outa[9..10]~~
    dira[9..10]~~

    But, it came out of Spin2Cpp like this:
    OUTA |= (3<<10);
    DIRA |= (3<<10);
    

    Isn't that off a bit?

    Yes, that is a spin2cpp bug. I've checked a fix in to the p2test branch of propgcc. In the meantime, you can change those expressions to
    outa[10..9]~~
    dira[10..9]~~
    
    I think it's generally safer to use the ranges that way (hi..lo), since an expression like:
    outa[0..7] := 1
    
    actually sets the low byte of outa to $80, which may not be what you intended.

    Thanks for the bug report!
    Eric
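
The range behaviour discussed above can be modelled in plain C. This is a sketch under stated assumptions rather than spin2cpp's actual output: `pin_range_mask`, `reverse_bits`, and `spin_range_assign` are hypothetical helpers, and the model assumes that the first pin named in a Spin range receives the most significant bit of the value, which is what makes `outa[0..7] := 1` produce $80.

```c
#include <stdint.h>

/* Mask covering pins first..last inclusive, given in either order.
 * For pins 9 and 10 this is 3 << 9, not 3 << 10. */
uint32_t pin_range_mask(unsigned first, unsigned last)
{
    unsigned lo = first < last ? first : last;
    unsigned hi = first < last ? last : first;
    unsigned width = hi - lo + 1;
    uint32_t field = (width >= 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return field << lo;
}

/* Reverse the low `width` bits of value. */
static uint32_t reverse_bits(uint32_t value, unsigned width)
{
    uint32_t out = 0;
    for (unsigned i = 0; i < width; i++)
        out = (out << 1) | ((value >> i) & 1u);
    return out;
}

/* Model of Spin's reg[first..last] := value, returning the new register value. */
uint32_t spin_range_assign(uint32_t reg, unsigned first, unsigned last,
                           uint32_t value)
{
    unsigned lo = first < last ? first : last;
    unsigned hi = first < last ? last : first;
    uint32_t mask = pin_range_mask(first, last);

    if (first < last)   /* written lo..hi: the bit order is reversed */
        value = reverse_bits(value, hi - lo + 1);
    return (reg & ~mask) | ((value << lo) & mask);
}
```

With these helpers, `reg | pin_range_mask(9, 10)` models `outa[9..10]~~`, and the mask is the same whichever way round the range is written; only assignments of a multi-bit value depend on the order.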
  • Rayman Posts: 14,665
    edited 2013-05-10 15:39
    Actually, this flexibility is one very nice thing in Spin.
  • Rayman Posts: 14,665
    edited 2013-05-11 08:26
    David, I was about to send you my little test program that doesn't work in xmm-single or split, but when I tried it at home, it works!

    This threw me off because I thought I had the same PropGCC installed on both computers.
    But, I think I found the difference in the rampage2x.cfg file.
    The one that works has this:
    cache-param1: 0x00070600

    And the one that doesn't has this:

    cache-param1: 0

    I remember changing these values, but I thought that was for C3F...

    Does this help figure out what's going on?
  • David Betz Posts: 14,516
    edited 2013-05-11 08:37
    Rayman wrote: »
    David, I was about to send you my little test program that doesn't work in xmm-single or split, but when I tried it at home, it works!

    This threw me off because I thought I had the same PropGCC installed on both computers.
    But, I think I found the difference in the rampage2x.cfg file.
    The one that works has this:
    cache-param1: 0x00070600

    And the one that doesn't has this:

    cache-param1: 0

    I remember changing these values, but I thought that was for C3F...

    Does this help figure out what's going on?
    cache-param1 in the "x" drivers configures the cache geometry. If you leave it at zero you get the default, which is a 4-way cache. The value you show for the working version selects a direct-mapped cache. Either should work, but I suppose there may be a bug in the n-way cache code.
  • Rayman Posts: 14,665
    edited 2013-05-11 08:45
    Maybe it doesn't mean anything after all...
    I added some more code to that small example and now it behaves as before: it runs for a while and then messes up...
  • Rayman Posts: 14,665
    edited 2013-05-11 09:08
    Here's that slightly bigger example that doesn't work in single or split mode.
    You probably don't have this hardware setup (QMP+Rampage2), but maybe you can see the problem?

    QMP_GccTest8.zip
  • David Betz Posts: 14,516
    edited 2013-05-11 09:23
    Rayman wrote: »
    Here's that slightly bigger example that doesn't work in single or split mode.
    You probably don't have this hardware setup (QMP+Rampage2), but maybe you can see the problem?

    QMP_GccTest8.zip
    What do I have to do to setup my QMP board to run this? Do I need a different RamPage2 board with a stacking connector?
  • Rayman Posts: 14,665
    edited 2013-05-11 09:40
    I guess there are a few ways to do it...

    What I did was plug QMP into the Quickstart's main connector and then plug Rampage2 into the second connector.
    But this is a little tricky because the RP2 had to go in at an angle, and it's also hard-soldered, so it can't be removed.

    Maybe it would be better to solder a female header under the Quickstart and then solder male header pins to the top of the RP2 so it can plug in underneath the Quickstart.

    Another way would be to use those stacking connectors for Quickstart and plug both boards into the main connector of Quickstart.