Need a short pulse Propeller GCC — Parallax Forums

DarkGFX Posts: 17
edited 2015-06-02 09:10 in Propeller 1
Hello, I have only tried some small projects with the Propeller, so I'm really new to this.

I'm trying to get a short pulse on pin 4 of the Activity Board, and it needs to be as short as possible!
I have tried the waitcnt command, but it seems to have problems getting below the microsecond range.
I need this function to be in the nanosecond range, as low as possible!

I will need to control the off time and on time of the pin, and it needs to be able to repeat X number of times.

I have configured the system clock to run at 80 MHz.


I have written this function:

void flash_enable(int flash){
    double t1, t2, diff;
    for(int i = 1; i <= flash; i++){
        high(4);
        t1 = CNT;
        __asm__ volatile (" nop\n nop\n");
        t2 = CNT;
        low(4);
        diff = t2 - t1;
        print("Diff %d\n", diff);
        pause(1);
        print("flash nr: %d\n",i);
    }
}


But there is something strange with this one! I wrote something similar earlier and got a diff of 15, which I take to be a delay of 15 * 12.5 ns = 187.5 ns, but now it's returning a really high number.
I guess it's a lot better to write the whole code in ASM, but I'm not too familiar with that language.
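
A hedged note on the "really high number": in the function above, t1, t2 and diff are declared as double but printed with %d, which expects an int, so the printed value is probably not the real difference. CNT is a 32-bit counter, so keeping the timestamps as unsigned 32-bit integers and subtracting them gives the tick count even across a counter rollover. The sketch below shows only that arithmetic and compiles on a host; `elapsed_ticks` and `ticks_to_ns` are illustrative names, not PropGCC functions, and on the Propeller the two timestamps would come from CNT.

```c
#include <assert.h>
#include <stdint.h>

/* Ticks between two 32-bit counter readings. Unsigned subtraction wraps
   modulo 2^32, so the result stays correct if the counter rolled over once. */
uint32_t elapsed_ticks(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Convert ticks to whole nanoseconds for a given clock, rounding down.
   At 80 MHz one tick is 12.5 ns. */
uint64_t ticks_to_ns(uint32_t ticks, uint32_t clkfreq_hz)
{
    assert(clkfreq_hz > 0);
    return ((uint64_t)ticks * 1000000000ULL) / clkfreq_hz;
}
```

With these, 15 ticks at 80 MHz come out as 187 ns (187.5 ns before rounding down). Any value measured this way also includes the time needed to read CNT itself.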

Comments

  • kwinn Posts: 8,697
    edited 2015-05-29 07:15
    Spin is an interpreted language so you may not be able to get as short a pulse as you want with it unless you use the counters. The fastest pulse using spin would be:

    OUTA[4]~~ 'Make P4 high
    OUTA[4]~ 'Make P4 low

    To go faster would require either a PASM program or use of the counters.
  • DarkGFX Posts: 17
    edited 2015-05-29 07:46
    I'm not basing my code on Spin, I use C.
    Your code would work, but I would have no control over the delay; I can't change it to fit different purposes.
    And that is what I want, but thanks for the reply!
  • kwinn Posts: 8,697
    edited 2015-05-29 08:09
    Sorry, should have looked at the post more carefully, and waited until after my morning coffee as well.
  • yeti Posts: 818
    edited 2015-05-29 09:07
    As expected I get different delays depending on memory model...
    (yeti@aurora:5)~/wrk/propeller/spinsim/DarkGFX$ cat main.c 
    #include <propeller.h>
    #include "conio.h" /* like FullDuplexSerial for C on SpinSim*/
    
    int main(void)
    {
            static volatile int i;
            int t=CNT;
            conio_dec(CNT-t);
            conio_out(13);
    
            t=CNT;
            __asm__ volatile (" nop\n nop\n");
            conio_dec(CNT-t);
            conio_out(13);
            cogstop(0);
    }
    (yeti@aurora:5)~/wrk/propeller/spinsim/DarkGFX$ make
    /opt/parallax/bin/propeller-elf-gcc -mcog -Os -std=c99 main.c -o main.cog.elf
    /opt/parallax/bin/propeller-load -s main.cog.elf
    /opt/parallax/bin/spinsim -p main.cog.binary 2>&1 | tee main.cog.log
    4
    12
    /opt/parallax/bin/propeller-elf-gcc -mlmm -Os -std=c99 main.c -o main.lmm.elf
    /opt/parallax/bin/propeller-load -s main.lmm.elf
    /opt/parallax/bin/spinsim -p main.lmm.binary 2>&1 | tee main.lmm.log
    16
    48
    /opt/parallax/bin/propeller-elf-gcc -mcmm -Os -std=c99 main.c -o main.cmm.elf
    /opt/parallax/bin/propeller-load -s main.cmm.elf
    /opt/parallax/bin/spinsim -p main.cmm.binary 2>&1 | tee main.cmm.log
    160
    256
    rm main.lmm.elf main.cog.elf main.cmm.elf main.lmm.binary main.cog.binary main.cmm.binary
    
    ...and 12 ticks for two NOPs and one subtraction in "-mcog" mode does look as expected...
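
One way to read these SpinSim numbers: the first value in each pair is the cost of just reading CNT twice (the baseline), and the second also includes the two NOPs, so the difference is what the NOPs themselves cost in that memory model. A small host-side sketch of that subtraction, using the figures from the log above (`net_ticks` and `slowdown_vs_cog` are illustrative helpers, not part of PropGCC):

```c
#include <assert.h>
#include <stdint.h>

/* Ticks attributable to the code under test: the measurement with the code
   minus the empty baseline measurement. */
uint32_t net_ticks(uint32_t with_code, uint32_t baseline)
{
    assert(with_code >= baseline);
    return with_code - baseline;
}

/* Slowdown relative to native COG execution, as an integer ratio. */
uint32_t slowdown_vs_cog(uint32_t model_net, uint32_t cog_net)
{
    assert(cog_net > 0);
    return model_net / cog_net;
}
```

For the log above this gives 8 ticks in -mcog (two 4-tick instructions), 32 in -mlmm and 96 in -mcmm, so for this snippet LMM is about 4x and CMM about 12x slower than native COG code.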
  • idbruce Posts: 6,197
    edited 2015-05-29 09:12
    One big problem that you will have is measuring the pulse without a scope, because your prints and assignments all take time. Unless you go to PASM, I believe this is about the fastest code you can run in C, but I could be wrong. Set the pulse_width and delay variables per your requirements to see if you can get close.
    void flash_enable(int32_t flash)
    {
    	int32_t counter;
    
    	// initialize this variable with an appropriate value
    	int32_t pulse_width;
    
    	// initialize this variable with an appropriate value
    	int32_t delay;
    
    
    	// Set up the CTRMODE of Counter A for NCO/PWM single-ended.
    	CTRA = (4 << 26) | 4;
    
    	// Set the value to be added to PHSA with every clock cycle.
    	FRQA = 1;
    
    	// Set DIRA as an output.
    	DIRA |= 1 << 4;
    
    	// Get the current System Counter value.
    	counter = CNT;
    
    	for(int32_t i = 1; i <= flash; i++)
    	{
    		// Send out a high pulse on the desired pin for the desired duration.
    		PHSA = -pulse_width;
    
    		// Wait for a specified period of time before sending another high pulse
    		waitcnt(counter += delay);
    	}
    }
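
To illustrate why writing -pulse_width to PHSA produces a pulse of that length: in NCO single-ended mode the counter adds FRQA to PHSA on every system clock and the pin follows bit 31 of PHSA. The host-side sketch below is an idealized model of that behaviour only (it ignores when exactly the write to PHSA lands); `nco_high_clocks` is an illustrative name, not a PropGCC function.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated NCO single-ended counter: every clock PHS += FRQ, and the pin
   follows bit 31 of PHS. Returns how many clocks the pin stays high after
   PHS is loaded with -pulse_width. */
uint32_t nco_high_clocks(uint32_t pulse_width, uint32_t frq)
{
    assert(frq > 0);
    uint32_t phs = (uint32_t)0 - pulse_width;   /* same bits as PHSA = -pulse_width */
    uint32_t high = 0;

    while (phs & 0x80000000u) {                 /* pin is high while bit 31 is set */
        high++;
        phs += frq;                             /* one system clock elapses */
    }
    return high;
}
```

With FRQA = 1 the pin is high for pulse_width clocks in this model, so pulse_width = 1 would be a single 12.5 ns clock at 80 MHz. After the pulse, PHSA keeps counting and bit 31 goes high again after 2^31 clocks (about 26.8 s at 80 MHz), so the next write to PHSA has to come before that.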
    
  • idbruce Posts: 6,197
    edited 2015-05-29 09:29
    In reference to my previous post, if you need to measure time without a scope, then take a time measurement before and after the loop, but leave the loop alone to get the fastest execution.
  • ersmith Posts: 6,053
    edited 2015-05-29 17:28
    For really time critical (and small) functions, you can make the speed independent of the memory model by adding __attribute__((fcache)) to the declaration of the function. This tells PropGCC to put the whole function in the fast cache area of kernel memory (i.e. COG RAM). In order for this to work the function will have to fit in fcache (so it has to be less than 512 bytes in length), and it can't have calls to other functions in it. There will be a bit of a delay the first time the function is called (while it's loaded into cache) but once the function starts timing will be fixed and the same as -mcog mode.
  • idbruce Posts: 6,197
    edited 2015-05-29 17:59
    For really time critical (and small) functions, you can make the speed independent of the memory model by adding __attribute__((fcache)) to the declaration of the function.

    Any chance I could talk you into a code example :)
  • ersmith Posts: 6,053
    edited 2015-05-29 18:44
    Here's the serial output code from the PropGCC library, which uses __attribute__((fcache)) to make sure output bits have consistent timing. (Without it the timing for the first bit is tricky.)
    /*
     * We need _serial_txbyte to always be fcached so that the timing is
     * OK.
     */
    __attribute__((fcache))
    int _serial_tx(int c, unsigned int txmask, unsigned int bitcycles)
    {
      unsigned int waitcycles;
      int i, value;
    
      /* set output */
      _OUTA |= txmask;
      _DIRA |= txmask;
    
      value = (c | 256) << 1;
      waitcycles = getcnt() + bitcycles;
      for (i = 0; i < 10; i++)
        {
          waitcycles = __builtin_propeller_waitcnt(waitcycles, bitcycles);
          if (value & 1)
            _OUTA |= txmask;
          else
            _OUTA &= ~txmask;
          value >>= 1;
        }
      return c;
    }
    
  • DavidZemon Posts: 2,973
    edited 2015-05-29 19:31
    More fcache here.
    /**
             * @brief       Shift out one word of data (FCache function)
             *
             * @param[in]   data        A fully configured, ready-to-go, data word
             * @param[in]   bits        Number of shiftable bits in the data word
             * @param[in]   bitCycles   Delay between each bit; Unit is clock cycles
             * @param[in]   txMask      Pin mask of the TX pin
             */
            inline void shift_out_data (uint32_t data, uint32_t bits, const uint32_t bitCycles, const uint32_t txMask) const {
                volatile uint32_t waitCycles = bitCycles;
                __asm__ volatile (
                        "        fcache #(ShiftOutDataEnd - ShiftOutDataStart)                     \n\t"
                        "        .compress off                                                     \n\t"
    
                        "ShiftOutDataStart:                                                        \n\t"
                        "        add %[_waitCycles], CNT                                           \n\t"
    
                        "loop%=:                                                                   \n\t"
                        "        waitcnt %[_waitCycles], %[_bitCycles]                             \n\t"
                        "        shr %[_data],#1 wc                                                \n\t"
                        "        muxc outa, %[_mask]                                               \n\t"
                        "        djnz %[_bits], #__LMM_FCACHE_START+(loop%= - ShiftOutDataStart)   \n\t"
    
                        "        jmp __LMM_RET                                                     \n\t"
                        "ShiftOutDataEnd:                                                          \n\t"
                        "        .compress default                                                 \n\t"
                        : [_data] "+r"(data),
                        [_waitCycles] "+r"(waitCycles),
                        [_bits] "+r" (bits)
                        : [_mask] "r"(txMask),
                        [_bitCycles] "r"(bitCycles));
            }
    
  • idbruce Posts: 6,197
    edited 2015-05-30 00:07
    Thanks guys.
  • DarkGFX Posts: 17
    edited 2015-06-01 00:52
    Sorry for the late reply guys!
    I had to go away for a couple of days, so I did not have time to check the replies.

    There are a lot of good suggestions here. I will try several of them today to see how fast these examples can execute a pulse.
    Note that the print commands in my code were not meant to measure exact time; I just used them as a reference value to see which solution took less time than the others I tried. I have access to an oscilloscope, so I can use that once the software-measured time difference gets low enough.

    I will also try to add the __attribute__((fcache)) to my function!

    While I try the different suggestions above, can someone who knows ASM give me a good example of this as well, with some comments so I can try to understand it?
    That would be an exercise for me, and I guess I need to use ASM to get this function reliable and really fast.

    Best regards
    Patrick

    PS: Thanks for all the help so far, really appreciate it!!!
  • DarkGFX Posts: 17
    edited 2015-06-01 01:16
    idbruce wrote: »
    One big problem that you will have is measuring the pulse without a scope, because your prints and assignments all take time. Unless you go to PASM, I believe this is about the fastest code you can run in C, but I could be wrong. Set the pulse_width and delay variables per your requirements to see if you can get close.
    void flash_enable(int32_t flash)
    {
    	int32_t counter;
    
    	// initialize this variable with an appropriate value
    	int32_t pulse_width;
    
    	// initialize this variable with an appropriate value
    	int32_t delay;
    
    
    	// Set up the CTRMODE of Counter A for NCO/PWM single-ended.
    	CTRA = (4 << 26) | 4;
    
    	// Set the value to be added to PHSA with every clock cycle.
    	FRQA = 1;
    
    	// Set DIRA as an output.
    	DIRA |= 1 << 4;
    
    	// Get the current System Counter value.
    	counter = CNT;
    
    	for(int32_t i = 1; i <= flash; i++)
    	{
    		// Send out a high pulse on the desired pin for the desired duration.
    		PHSA = -pulse_width;
    
    		// Wait for a specified period of time before sending another high pulse
    		waitcnt(counter += delay);
    	}
    }
    

    I tried running your code, reading the system clock before and after the PHSA command. The difference cannot go below 64 clock cycles, which is one of the best results I have gotten so far, but can we push it even lower?
    I know that some of the clock cycles I measured are due to the time it takes to execute the int = CNT command, but it is a good reference point.
    One problem is the wait function waitcnt: unless I used a delay of 100000 clock cycles it would not run smoothly, and this delay is too long.

    My example of your code:

    __attribute__((fcache))
    void flash_enable2(int32_t flash)
    {
    int32_t counter;

    // initialize this variable with an appropriate value
    int32_t pulse_width = 1;

    // initialize this variable with an appropriate value
    int32_t delay = 100000;


    // Set up the CTRMODE of Counter A for NCO/PWM single-ended.
    CTRA = (4 << 26) | 4;

    // Set the value to be added to PHSA with every clock cycle.
    FRQA = 1;

    // Set DIRA as an output.
    DIRA |= 1 << 4;

    // Get the current System Counter value.
    counter = CNT;

    for(int32_t i = 1; i <= flash; i++)
    {
    // Send out a high pulse on the desired pin for the desired duration.
    int T1 = CNT;
    PHSA = -pulse_width;
    int T2 = CNT - T1;
    print("diff= %d\n", T2);

    // Wait for a specified period of time before sending another high pulse
    waitcnt(counter += delay);
    }
    }
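
A hedged guess at why shorter delays do not run smoothly here: waitcnt blocks until CNT equals the target exactly, so if the loop body (in this case the print, which is slow serial output) takes longer than delay, the target is already in the past and the cog waits for CNT to wrap all the way around. As noted earlier in the thread, a function that calls other functions such as print also cannot be placed in fcache. The host-side sketch below shows a signed-difference test for a missed target; `target_missed` and `wrap_wait_ms` are illustrative names, not PropGCC functions.

```c
#include <assert.h>
#include <stdint.h>

/* True if the waitcnt target already lies behind the current counter value.
   The signed view of the 32-bit difference handles counter rollover. */
int target_missed(uint32_t now, uint32_t target)
{
    return (int32_t)(target - now) < 0;
}

/* Milliseconds until a 32-bit counter comes back around to the same value. */
uint32_t wrap_wait_ms(uint32_t clkfreq_hz)
{
    assert(clkfreq_hz > 0);
    return (uint32_t)((4294967296ULL * 1000ULL) / clkfreq_hz);
}
```

At 80 MHz a missed target costs about 53.7 seconds, which would look like the loop hanging. Taking the print out of the loop, as suggested above for the timing measurement, should allow a much smaller delay.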
  • DarkGFX Posts: 17
    edited 2015-06-01 01:33
    yeti wrote: »
    As expected I get different delays depending on memory model...
    (yeti@aurora:5)~/wrk/propeller/spinsim/DarkGFX$ cat main.c 
    #include <propeller.h>
    #include "conio.h" /* like FullDuplexSerial for C on SpinSim*/
    
    int main(void)
    {
            static volatile int i;
            int t=CNT;
            conio_dec(CNT-t);
            conio_out(13);
    
            t=CNT;
            __asm__ volatile (" nop\n nop\n");
            conio_dec(CNT-t);
            conio_out(13);
            cogstop(0);
    }
    (yeti@aurora:5)~/wrk/propeller/spinsim/DarkGFX$ make
    /opt/parallax/bin/propeller-elf-gcc -mcog -Os -std=c99 main.c -o main.cog.elf
    /opt/parallax/bin/propeller-load -s main.cog.elf
    /opt/parallax/bin/spinsim -p main.cog.binary 2>&1 | tee main.cog.log
    4
    12
    /opt/parallax/bin/propeller-elf-gcc -mlmm -Os -std=c99 main.c -o main.lmm.elf
    /opt/parallax/bin/propeller-load -s main.lmm.elf
    /opt/parallax/bin/spinsim -p main.lmm.binary 2>&1 | tee main.lmm.log
    16
    48
    /opt/parallax/bin/propeller-elf-gcc -mcmm -Os -std=c99 main.c -o main.cmm.elf
    /opt/parallax/bin/propeller-load -s main.cmm.elf
    /opt/parallax/bin/spinsim -p main.cmm.binary 2>&1 | tee main.cmm.log
    160
    256
    rm main.lmm.elf main.cog.elf main.cmm.elf main.lmm.binary main.cog.binary main.cmm.binary
    
    ...and 12 ticks for two NOPs and one subtraction in "-mcog" mode does look as expected...



    I must confess that I understood very little of this. Could you elaborate a bit more?
  • DarkGFX Posts: 17
    edited 2015-06-01 01:39
    OK guys, I was not able to get it lower than 64 ticks, so it looks like <<yeti>> is doing something right to get the function down to 12 ticks. Impressed!
    But what if we use ASM? Could we get full control of the ON_time and OFF_time of the pulse, with reliable short delays of only a couple of ticks of the system clock?
  • idbruce Posts: 6,197
    edited 2015-06-01 03:41
    DarkGFX

    I am not sure about this, but perhaps execution time could be cut a little more by changing

    From:
    for(int32_t i = 1; i <= flash; i++)
    

    To:
    for(int32_t i = 0; i < flash; i++)
    

    There are undoubtedly several examples of using NCO/PWM single-ended counters written in PASM, you just have to seek them out and perhaps modify the examples to suit your needs.

    Over the years, I have read about several people trying to reach the nanosecond range. Perhaps a Google search of the forum, using "nano" and "counter" as search terms may yield some interesting results for you.
  • DarkGFX Posts: 17
    edited 2015-06-01 05:38
    idbruce wrote: »
    DarkGFX

    I am not sure about this, but perhaps execution time could be cut a little more by changing

    From:
    for(int32_t i = 1; i <= flash; i++)
    

    To:
    for(int32_t i = 0; i < flash; i++)
    

    There are undoubtedly several examples of using NCO/PWM single-ended counters written in PASM, you just have to seek them out and perhaps modify the examples to suit your needs.

    Over the years, I have read about several people trying to reach the nanosecond range. Perhaps a Google search of the forum, using "nano" and "counter" as search terms may yield some interesting results for you.

    Thanks for the loop optimization, I will need this at a later stage in my project.
    My main concern right now is to get that first pulse as short as possible! I will have to take on the problem of how much time the loop requires at a later point.
    I have tried to find a solution by searching both the forum and Google, but have not found a good example which I can at least understand. This is why I'm reaching out to the forum.
    But I will continue trying, I will not easily give up ;).

    I have tried to read up on inline PASM, but I must admit it's a bit hard to understand.

    I was hoping that maybe a fast toggle with PASM could give me what I needed.
  • Ariba Posts: 2,690
    edited 2015-06-01 08:21
    DarkGFX wrote: »
    Thanks for the loop optimization, I will need this at a later stage in my project.
    My main concern right now is to get that first pulse as short as possible! I will have to take on the problem of how much time the loop requires at a later point.
    I have tried to find a solution by searching both the forum and Google, but have not found a good example which I can at least understand. This is why I'm reaching out to the forum.
    But I will continue trying, I will not easily give up ;).

    I have tried to read up on inline PASM, but I must admit it's a bit hard to understand.

    I was hoping that maybe a fast toggle with PASM could give me what I needed.

    It would help a lot if you gave us more info:
    - What is the application
    - Must the pulse times be variable or can they be fixed
    - if variable: at compile time or at run time?
    - must the number of pulses be variable at run time
    - and most important: what is the range of the pulse times you need and with what resolution.

    There are a number of solutions with the Propeller: C only, with PASM, with Counters, with VideoGenerator hardware ...

    For example, if you only need the shortest possible high pulse of 12.5 ns and only need to make the low time variable, then the DUTY mode of a counter is a very simple solution:
    void pulses(void)
    {
      DIRA |= 1<<4;
      PHSA = 0;
      FRQA = 0x40000000;    //3 cy LOW, 1 cy HIGH
      CTRA = (6<<26) + 4;   //DUTY mode at pin 4
      while(1) ;            //endless
    }
    // LOW time: 0x80000000 = 1 cy, 0x40000000 = 3 cy, 0x20000000 = 7 cy, 0x10000000 = 15 cy ...
    // HIGH time always 1 cy
    
    
    (This is untested and produces endless pulses)

    Andy
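
As a way to sanity-check FRQA values for the DUTY approach before putting a scope on the pin, here is an idealized host-side model: in DUTY single-ended mode the counter adds FRQA to PHSA every system clock and the pin is high only on clocks where that 32-bit addition carries. `duty_high_clocks` and `duty_period_clocks` are illustrative names, not PropGCC functions, and the model assumes PHSA starts at 0.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated DUTY single-ended counter: every clock PHS += FRQ and the pin
   is high for that clock only when the 32-bit addition carries.
   Counts high clocks over `clocks` system clocks, starting from PHS = 0. */
uint32_t duty_high_clocks(uint32_t frq, uint32_t clocks)
{
    uint32_t phs = 0;
    uint32_t high = 0;

    for (uint32_t i = 0; i < clocks; i++) {
        uint32_t next = phs + frq;      /* wraps modulo 2^32 */
        if (next < phs) {               /* carry out of bit 31 */
            high++;
        }
        phs = next;
    }
    return high;
}

/* Clocks per pulse when FRQ is a power of two: 2^32 / FRQ. */
uint32_t duty_period_clocks(uint32_t frq)
{
    assert(frq > 0);
    return (uint32_t)(4294967296ULL / frq);
}
```

In this model FRQA = 0x40000000 gives one high clock in every four (1 high, 3 low) and 0x80000000 gives one in every two, which is worth comparing on a scope with the low times quoted in the code comments above.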
  • ersmith Posts: 6,053
    edited 2015-06-01 18:06
    Here's a solution that doesn't use the counters, just bangs the bits as quickly as possible:
    #include <stdio.h>
    #include <propeller.h>
    
    const unsigned bitmask = 0x0001;
    
    __attribute__((fcache))
    unsigned shortPulse(void)
    {
        unsigned int elapsed = CNT;
        OUTA |= bitmask;
        // put any additional delays in here
        OUTA &= ~bitmask;
    
        elapsed = CNT - elapsed;
        return elapsed;
    }
    
    void main()
    {
        unsigned elapsed;
    
        DIRA = bitmask;
        OUTA &= ~bitmask;
        
        elapsed = shortPulse();
    
        printf("pulse duration=%u\n", elapsed);
    }
    
    Note that putting a printf inside the shortPulse() function will produce a warning telling you that the compiler cannot use fcache (since an fcached function cannot call other functions).

    The code prints 12 for the elapsed time, which includes 3 instructions: setting the pin high, setting it low, and the subtraction for elapsed time calculation. I think the pulse itself should be just 4 cycles, but I don't have an oscilloscope handy here to verify that.

    Eric
  • DarkGFX Posts: 17
    edited 2015-06-02 00:20
    Thank you everyone, I will try out the code above as soon as I have time today! I appreciate that you took the time!

    I will answer the questions above here:
    What I'm making requires an LED to blink extremely fast and extremely bright. To do this I'm driving the LED with around 10 times more current than it is rated for, and I'm going to use this to freeze the motion of a moving bullet. In other words, I'm building a fast LED camera flash!
    I know there are a lot of other circuits I could use to get the LED to blink, but I want to use the Propeller since it's going to do other operations as well.


    - What is the application
    The application is that, at a given time, the controller should flash an LED as fast as possible X number of times. In the prototype, the Flash() function is called when I push a button.
    I will need to experiment to find out how fast the flashes should optimally be. That's why I need the pulse to be as fast as possible and yet variable.

    - Must the pulse times be variable or can they be fixed
    The pulse times will be fixed when I find the optimal time that will achieve what I want.

    - if variable: at compile time or at run time?
    They can be variable at compile time

    - must the number of pulses be variable at run time
    NO

    - and most important: what is the range of the pulse times you need and with what resolution.
    As short and fast as possible!
  • DarkGFX Posts: 17
    edited 2015-06-02 00:38
    ersmith wrote: »
    Here's a solution that doesn't use the counters, just bangs the bits as quickly as possible:
    #include <stdio.h>
    #include <propeller.h>
    
    const unsigned bitmask = 0x0001;
    
    __attribute__((fcache))
    unsigned shortPulse(void)
    {
        unsigned int elapsed = CNT;
        OUTA |= bitmask;
        // put any additional delays in here
        OUTA &= ~bitmask;
    
        elapsed = CNT - elapsed;
        return elapsed;
    }
    
    void main()
    {
        unsigned elapsed;
    
        DIRA = bitmask;
        OUTA &= ~bitmask;
        
        elapsed = shortPulse();
    
        printf("pulse duration=%u\n", elapsed);
    }
    
    Note that putting a printf inside the shortPulse() function will produce a warning telling you that the compiler cannot use fcache (since an fcached function cannot call other functions).

    The code prints 12 for the elapsed time, which includes 3 instructions: setting the pin high, setting it low, and the subtraction for elapsed time calculation. I think the pulse itself should be just 4 cycles, but I don't have an oscilloscope handy here to verify that.

    Eric

    Hi Eric
    How do you get 12 ticks?
    I get 144 without __attribute__((fcache)), and 28 with, am I using the wrong memory model or something?
    I'm using LMM Main RAM.

    Patrick
  • ersmith Posts: 6,053
    edited 2015-06-02 02:40
    DarkGFX wrote: »
    Hi Eric
    How do you get 12 ticks?
    I get 144 without __attribute__((fcache)), and 28 with, am I using the wrong memory model or something?
    I'm using LMM Main RAM.

    What optimization flags are you using? I used -Os.

    I'm also using a recent build of the compiler (the 1.9.0_alpha series) rather than the one that comes with SimpleIDE. The SimpleIDE one is pretty old, and it may be missing some optimizations.
  • DarkGFX Posts: 17
    edited 2015-06-02 03:30
    ersmith wrote: »
    What optimization flags are you using? I used -Os.

    I'm also using a recent build of the compiler (the 1.9.0_alpha series) rather than the one that comes with SimpleIDE. The SimpleIDE one is pretty old, and it may be missing some optimizations.

    That might do the trick!
    I'm using SimpleIDE with -O2 (speed) as the optimization setting. I will search for the alpha series and try that one!
    Thanks =D
  • Domanik Posts: 233
    edited 2015-06-02 08:43
    DarkGFX wrote: »
    I'm going to use this to freeze the motion of a moving bullet. In other words, I'm building a fast LED camera flash!
    What LED are you using, and how quickly can it be turned on/off (zero light to full-on, full-on light to zero)? Bullet velocity? 750 - 4,000 ft/sec? At the velocity anticipated, how long will one pulse light up the bullet? The answers should help to nail down specs for the pulse width, duty cycle and burst duration.

    EDIT: Found some info about LED flashes used in cell phones:

    "Agilent has developed high-brightness white LED light sources, the HSMW C830/C850 Agilent Flash Modules, which are specifically designed to suit the actual requirements of camera phone camera modules. Their unique dome design concentrates the light output from the LED die to form a 60-degree Lambertian radiation pattern, maximizing the light output that falls within a typical camera’s field of view. No secondary optics are needed to redirect the light output. Both major and minor axis produce similar radiation patterns. "

    There's also a paper/pdf about the design of a circuit using these parts.
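
To put rough numbers on the pulse-width question above, the distance the bullet moves while the LED is lit is simply velocity times pulse time. A small host-side sketch of that arithmetic (`travel_um` is an illustrative name, and the velocities are the range quoted in this post, not measured values):

```c
#include <assert.h>

/* Distance in micrometres that an object moving at `velocity_ft_s`
   travels while the light is on for `pulse_ns` nanoseconds. */
double travel_um(double velocity_ft_s, double pulse_ns)
{
    assert(velocity_ft_s >= 0.0 && pulse_ns >= 0.0);
    const double metres_per_foot = 0.3048;          /* exact by definition */
    double velocity_m_s = velocity_ft_s * metres_per_foot;
    return velocity_m_s * pulse_ns * 1e-9 * 1e6;    /* metres -> micrometres */
}
```

For example, at 4,000 ft/sec a 1 µs flash lets the bullet move about 1.2 mm, while a 50 ns flash (4 clocks at 80 MHz) lets it move about 0.06 mm. Whether the LED itself can switch that fast is a separate question.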
  • DavidZemon Posts: 2,973
    edited 2015-06-02 09:10
    This guy's answer (website is offshoot of stack overflow) explains how to calculate the on-off time of an LED based on its datasheet. Interesting read :)