Inline assembly (PASM), injecting delays — Parallax Forums

ypapelis Posts: 99
edited 2012-09-13 12:19 in Propeller 1
I have been very successfully converting my PASM programs into C (love it!), but one place where I am consistently getting in trouble is finding ways to inject precise delays in the code. I am using COG mode for C, so the C output program is running in a cog by itself, although the same problem probably would occur in LMM (or other models) for loops that end up cached into COG memory.

Using waitcnt/waitcnt2 can only generate delays higher than about 7 microseconds, which does not cut it for tweaking clock pulses and clock-data pulse offsets. In PASM, for 'large' delays I could call a subroutine that loops around, and I could get a precise delay from the known number of loop iterations. For shorter delays, simply inserting 'NOP' instructions did the trick. I tried to call a delay function in cogc, but unless the function is doing something 'meaty', it gets 'optimized' away, and not knowing whether it survived (or having to look at the assembly to find out) is inconvenient. Incrementing a volatile variable appears to do the trick but is awkward. So I was wondering, is there a way to inject PASM code inside C? Some other embedded C systems provide pseudo functions that do exactly that. Any other suggestion on how to achieve instruction-level delays would also be welcome!

Comments

  • jazzed Posts: 11,803
    edited 2012-09-13 10:20
    ypapelis wrote: »
    I have been very successfully converting my PASM programs into C (love it!), but one place where I am consistently getting in trouble is finding ways to inject precise delays in the code. I am using COG mode for C, so the C output program is running in a cog by itself, although the same problem probably would occur in LMM (or other models) for loops that end up cached into COG memory.

    Using waitcnt/waitcnt2 can only generate delays higher than about 7 microseconds ....


    I've been able to make pins toggle as fast as 400ns (possibly 250ns) in COGC code using waitcnt2 (with 80MHz clock).

    Here is a COG mode example.
    /**
     * @file cogdelay.c
     *
     * This program demonstrates creating COG mode delays with one
     * statement between delays. The minimum delay is dictated
     * by how long it takes to execute instructions between delays.
     *
     * It should be noted that if the code between waits takes longer to
     * execute than wait, waitcnt2 misses its target and the code will appear
     * to freeze for about one minute (a full wraparound of the 32-bit
     * system counter) with an 80MHz clock.
     */
    #include <propeller.h>
    
    
    /**
     * COG mode cogdelay main function
     *
     * The basis of this delay demo is the propeller.h waitcnt2 macro
     * which translates to __builtin_propeller_waitcnt(count, wait).
     *
     *   #define waitcnt2(a, b) __builtin_propeller_waitcnt((a),(b))
     *
     *   Waits until the system counter reaches the value a, then
     *   returns a + b as the target for the next call.
     *   waitcnt2 Parameters:
     *      a Target value
     *      b Adjust value (added to a to give the next target)
     *
     * Demo runs in an infinite loop and allows for scope measurement.
     */
    int main(void)
    {
        unsigned int count;
        unsigned int wait;
        unsigned int pin;
        // set pin to output
        pin = 1 << 15;
        DIRA |= pin;
        
        // waits as small as 250ns are possible (at 80MHz) when building in COG mode.
        
        wait = (CLKFREQ/5000000); // 16 ticks at 80MHz: 200ns per waitcnt2, 400ns toggle period
        for(;;) {
            // initial delay just for reasonable scope triggering
            count = wait*50+CNT;
            OUTA &= ~pin;
            count = waitcnt2(count, wait);
            // now count contains the target count for the next waitcnt2 call.
            OUTA ^= pin;
            count = waitcnt2(count, wait);
            OUTA ^= pin;
            count = waitcnt2(count, wait);
            OUTA ^= pin;
            count = waitcnt2(count, wait);
            OUTA ^= pin;
            count = waitcnt2(count, wait);
            OUTA ^= pin;
            count = waitcnt2(count, wait);
            OUTA ^= pin;
            count = waitcnt2(count, wait);
            OUTA ^= pin;
            count = waitcnt2(count, wait);
            OUTA ^= pin;
            count = waitcnt2(count, wait);
            OUTA ^= pin;
            count = waitcnt2(count, wait);
            OUTA ^= pin;
            count = waitcnt2(count, wait);
            OUTA ^= pin;
            count = waitcnt2(count, wait);
            OUTA ^= pin;
            count = waitcnt2(count, wait);
            OUTA ^= pin;
            count = waitcnt2(count, wait);
        }
    }
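
As a rough cross-check of the numbers in the demo above, the tick arithmetic can be sketched on any host compiler. The helper names ns_to_ticks and ticks_to_ns are made up for this note, and clkfreq_hz stands in for CLKFREQ; nothing here comes from propeller.h.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a delay in nanoseconds to system-clock ticks, rounding down.
 * clkfreq_hz stands in for CLKFREQ (an assumption of this sketch).
 * The 64-bit intermediate keeps the multiplication from overflowing. */
static uint32_t ns_to_ticks(uint32_t clkfreq_hz, uint32_t delay_ns)
{
    assert(clkfreq_hz > 0);
    return (uint32_t)(((uint64_t)clkfreq_hz * delay_ns) / 1000000000u);
}

/* Convert a tick count back to nanoseconds, rounding down.
 * Returns 64 bits because a full 32-bit tick count at 80MHz is
 * far more than 2^32 nanoseconds. */
static uint64_t ticks_to_ns(uint32_t clkfreq_hz, uint32_t ticks)
{
    assert(clkfreq_hz > 0);
    return ((uint64_t)ticks * 1000000000u) / clkfreq_hz;
}
```

With an 80MHz clock, CLKFREQ/5000000 is 16 ticks, which ticks_to_ns reports as 200ns per waitcnt2 call, so two toggles make a 400ns period. Going the other way, ns_to_ticks(80000000, 200) gives the same 16 ticks.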
    


    Even in LMM mode, 2us delays are possible. CMM programs are slower, so the demo below won't work by default; change the assignment to wait = CLKFREQ/250000; to get a wait of about 4us (at 80MHz).
    /**
     * @file delaydemo.c
     *
     * This program demonstrates creating 2 microsecond or more delays
     * with one statement between delays. The minimum delay is dictated
     * by how long it takes to execute instructions between delays. The
     * use of fcache (FastCache) also determines how small the delays can
     * be - the wait chosen in this demo works with fcache disabled, and
     * shorter waits are possible with fcache enabled.
     *
     * It should be noted that if the code between waits takes longer to
     * execute than wait, waitcnt2 misses its target and the code will appear
     * to freeze for about one minute (a full wraparound of the 32-bit
     * system counter) with an 80MHz clock.
     */
    #include <propeller.h>
    
    
    /**
     * delaydemo main function
     * HUBTEXT forces function to be in HUB RAM even with XMM modes.
     * Not designed for COG mode.
     *
     * The basis of this delay demo is the propeller.h waitcnt2 macro
     * which translates to __builtin_propeller_waitcnt(count, wait).
     *
     *   #define waitcnt2(a, b) __builtin_propeller_waitcnt((a),(b))
     *
     *   Waits until the system counter reaches the value a, then
     *   returns a + b as the target for the next call.
     *   waitcnt2 Parameters:
     *      a Target value
     *      b Adjust value (added to a to give the next target)
     *
     * Demo runs in an infinite loop and allows for scope measurement.
     */
    HUBTEXT int main(void)
    {
        unsigned int count;
        unsigned int wait;
        unsigned int pin;
        // set pin to output
        pin = 1 << 15;
        DIRA |= pin;
        
        // 2us waits for an 80MHz clock, assuming no fcache.
        // If the code is small enough to fit in the fcache area,
        // waits as small as 500ns are possible (at 80MHz).
        wait = CLKFREQ/500000;
        for(;;) {
            // initial delay just for reasonable scope triggering
            count = wait*50+CNT;
            OUTA &= ~pin;
            count = waitcnt2(count, wait);
            // now count contains the target count for the next waitcnt2 call.
            OUTA ^= pin;
            count = waitcnt2(count, wait);
            OUTA ^= pin;
            count = waitcnt2(count, wait);
            OUTA ^= pin;
            count = waitcnt2(count, wait);
            OUTA ^= pin;
            count = waitcnt2(count, wait);
            OUTA ^= pin;
            count = waitcnt2(count, wait);
            OUTA ^= pin;
            count = waitcnt2(count, wait);
            OUTA ^= pin;
            count = waitcnt2(count, wait);
            OUTA ^= pin;
            count = waitcnt2(count, wait);
            OUTA ^= pin;
            count = waitcnt2(count, wait);
            OUTA ^= pin;
            count = waitcnt2(count, wait);
            OUTA ^= pin;
            count = waitcnt2(count, wait);
            OUTA ^= pin;
            count = waitcnt2(count, wait);
            OUTA ^= pin;
            count = waitcnt2(count, wait);
        }
    }
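
The "freeze" described in the file comments follows from waitcnt blocking until the system counter equals the target exactly. A small host-side model, with made-up helper names and no Propeller headers, shows how long a missed target blocks:

```c
#include <assert.h>
#include <stdint.h>

/* Ticks a waitcnt-style wait blocks for, given the current counter value
 * and the target. Unsigned subtraction models the 32-bit wraparound. */
static uint32_t ticks_until(uint32_t now, uint32_t target)
{
    return (uint32_t)(target - now);
}

/* Whole seconds spent blocking for the given number of ticks.
 * clkfreq_hz stands in for CLKFREQ (an assumption of this sketch). */
static uint32_t ticks_to_seconds(uint32_t ticks, uint32_t clkfreq_hz)
{
    assert(clkfreq_hz > 0);
    return ticks / clkfreq_hz;
}
```

If the counter is at 1000 and the target is 1160, ticks_until gives 160 ticks. Arriving one tick late, ticks_until(1161, 1160), gives 0xFFFFFFFF ticks, which is roughly 53 seconds at 80MHz. That is why the code between waits has to finish within wait ticks.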
    

    I'm not much of a GNU ASM user, so I don't feel comfortable answering that question without having time for more reading.
  • Dave Hein Posts: 6,347
    edited 2012-09-13 10:47
    If you want to keep a delay loop from being optimized away you could do something like this:
    #pragma GCC push_options
    #pragma GCC optimize("O1")
    void delay()
    {
        int i;
        for (i = 0; i < 100; i++);
    }
    #pragma GCC pop_options
    
    However, it looks like PropGCC doesn't like the pop_options pragma, so you would have to explicitly set the optimization level after the delay loop with something like "#pragma GCC optimize("Os")" instead of using the push/pop pragmas.
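
Another way to keep a delay loop alive, offered as a sketch rather than a tested PropGCC recipe, is to put an empty volatile asm statement in the loop body. GCC does not delete volatile asm statements, so the loop survives at any optimization level. The function name delay_loop is hypothetical, and the time per iteration still depends on the code the compiler generates, so it would need checking against the assembly listing or a scope.

```c
#include <assert.h>

/* Busy-wait loop that the optimizer cannot remove: the empty asm statement
 * is volatile, so GCC has to keep one per iteration.
 * Returns the number of iterations actually executed. */
static unsigned int delay_loop(unsigned int iterations)
{
    unsigned int i;

    for (i = 0; i < iterations; i++) {
        __asm__ volatile ("");   /* emits no instruction, only pins the loop */
    }

    /* Sanity check: the loop ran to completion and was not skipped. */
    assert(i == iterations);
    return i;
}
```

A call such as delay_loop(100) then takes the place of the pragma-protected delay() above, without changing the optimization level for the rest of the file.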
  • ersmith Posts: 6,054
    edited 2012-09-13 11:54
    ypapelis wrote: »
    Using waitcnt/waitcnt2 can only generate delays higher than about 7 microseconds
    Are you sure about this? waitcnt should compile to a waitcnt instruction, so in COG mode you should be able to get accuracies of much better than a microsecond, at least with the default 80MHz clock.
    So I was wondering, is there a way to inject PASM code inside C? Some other embedded C systems provide pseudo functions that do exactly that.
    Yes, the __asm__ statement. There's a section on it in the GCC manual, and various documentation online. To insert two nops you would do something like:
    __asm__ volatile (" nop\n nop\n");
    

    The string inside the __asm__ is passed through directly to the assembler. To insert multiple lines, use "\n" (and remember to indent at least one space so the instruction is not interpreted as a label). The "volatile" says not to optimize around the asm (not to move instructions past it, for example).

    Note that the syntax for __asm__ is GAS, not PASM. They are mostly the same; the main thing to watch out for is that GAS addresses are always byte addresses, not long addresses.

    Eric
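
Building on the __asm__ example above, short fixed delays can be wrapped in macros. This is a minimal sketch: NOP, NOP4, short_delay, SHORT_DELAY_NOPS and nop_delay_ns are hypothetical names, and the timing helper assumes 4 clock cycles per nop, which holds for the Propeller 1 executing from COG memory.

```c
#include <assert.h>

/* One no-op. The leading space keeps the assembler from reading "nop"
 * as a label, and volatile keeps the compiler from dropping or moving it. */
#define NOP()   __asm__ volatile (" nop\n")

/* Four back-to-back no-ops (hypothetical helper). */
#define NOP4()  do { NOP(); NOP(); NOP(); NOP(); } while (0)

/* Number of no-ops emitted by short_delay() below. */
#define SHORT_DELAY_NOPS 8u

/* Hypothetical helper: pad a pulse with eight no-ops. */
static void short_delay(void)
{
    NOP4();
    NOP4();
}

/* Expected delay of `nops` no-ops in nanoseconds, assuming 4 clock cycles
 * per instruction (Propeller 1, COG mode). clkfreq_hz stands in for CLKFREQ. */
static unsigned long nop_delay_ns(unsigned long clkfreq_hz, unsigned long nops)
{
    assert(clkfreq_hz > 0);
    return (unsigned long)((nops * 4ull * 1000000000ull) / clkfreq_hz);
}
```

Under those assumptions each nop is worth 50ns at 80MHz, so short_delay() pads by about 400ns, not counting any call overhead if the compiler does not inline it.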
  • dnalor Posts: 222
    edited 2012-09-13 12:19
    Dave Hein wrote: »
    If you want to keep a delay loop from being optimized away you could do something like this:
    #pragma GCC push_options
    #pragma GCC optimize("O1")
    void delay()
    {
        int i;
        for (i = 0; i < 100; i++);
    }
    #pragma GCC pop_options
    
    However, it looks like PropGCC doesn't like the pop_options pragma, so you would have to explicitly set the optimization level after the delay loop with something like "#pragma GCC optimize("Os")" instead of using the push/pop pragmas.

    For resetting to command line options you can use:
    #pragma GCC optimize(initial)