How to get accurate microsecond pauses?
BTL24
I am having trouble getting microsecond pauses to work with the Propeller Activity Board. I followed the library guidelines and used the set_pause_dt(CLKFREQ/1000000); command, which is supposed to switch the pause() command from 1 mS increments to 1 uS increments.
When I program pause(10), I see 40 uS on the scope. When I program pause(20), I see 50 uS on the scope.
I have tried a higher resolution with set_pause_dt(CLKFREQ/2000000); but all I see is a high output.
I have also gone the other way and programmed 10 uS increments with set_pause_dt(CLKFREQ/100000); and then programmed pause(1), but again I see 40 uS.
What am I doing wrong?
Is there a better way to get 10 uS pulses than using the pause() command?
Thanks,
BTL24 (Brian).
Comments
Any delay library is going to have some minimum time, and then some granularity.
With Prop GCC there may be other loading delays to consider as well.
Optimization levels apparently make a difference, but I still can't get to 1 uS increments. I optimized for size for the test results below. When I optimized for speed, the system just hung. Other optimization levels gave worse results.
Results:
For ptick10 (10 uS), display reports...PTick = 800 ticks against measured delay tick = 1216 ticks (or 1.52 times greater).
For ptick20 (20uS), display reports ...PTick = 1600 ticks against measured delay tick = 2016 ticks (1.26 times greater).
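The two measurements above miss the requested values by the same amount, which suggests a fixed per-call overhead rather than a scaling error. A minimal host-side sketch of that arithmetic in plain C, assuming an 80 MHz system clock (implied by 800 ticks for 10 uS); the function names are illustrative and not part of any Propeller library:

```c
#include <stdint.h>

/* Assumed system clock: 80 MHz, consistent with 800 ticks per 10 uS above. */
#define CLOCK_HZ 80000000u

/* Convert microseconds to system-clock ticks. */
uint32_t us_to_ticks(uint32_t us)
{
    return us * (CLOCK_HZ / 1000000u);
}

/* Convert system-clock ticks to nanoseconds. */
uint32_t ticks_to_ns(uint32_t ticks)
{
    return (uint32_t)((uint64_t)ticks * 1000000000u / CLOCK_HZ);
}

/* Overhead is whatever the measurement shows beyond the requested ticks. */
uint32_t overhead_ticks(uint32_t requested, uint32_t measured)
{
    return measured - requested;
}
```

Feeding in the reported numbers, overhead_ticks(us_to_ticks(10), 1216) and overhead_ticks(us_to_ticks(20), 2016) both give 416 ticks, which ticks_to_ns() turns into 5200 nS (5.2 uS). The extra time looks like a constant cost per call, not something that grows with the delay.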
In summary, I am surprised that a processor of the Prop's caliber can't get below a 20 uS pause. What am I doing wrong? Is it the overhead of C?
If this truly is the case, I will have to abandon the Prop for another processor. In my work, I have to manipulate bits and timing down to 100s of nS, sometimes smaller.
The Propeller, with a timing granularity of 12.5 ns, can definitely do what you want; but don't expect it to happen in C, Spin, or any other high-level language. For that, you will have to take the time to learn Propeller assembly (PASM). It's nothing to be afraid of, BTW: PASM is one of the friendliest, most approachable assembly languages out there and well worth the time to master.
-Phil
Phil,
Thanks for the quick response. I was just coming to that same conclusion. For the 10 uS and 20 uS critical timing I need, I thought of Spin, but I gather you think assembly would be better.
Can inline assembly be placed inside a C program, or is it best to assemble outside and then include the function later during the build?
Regards,
BTL24 (Brian)
On any uC sub-us is really going to need Assembler, and care.
Can you clarify "manipulate bits and timing"? What granularity in edges do you need, what are the min and max times between edges, and is this output only, or do you also need to capture/time arriving edges?
Yes. There are several examples. Here is one that mixes C+ASM. It could be faster without the C code.
Care to guess what it does?
Note that propeller-gcc in-line ASM is like PASM with regard to source and destination.
Well, by definition, "inline assembly" is inline and not linked in. Here's a tutorial on that:
https://sites.google.com/site/propellergcc/documentation/tutorials/inline-asm-basics
And this I2C driver uses inline assembly for the bit banging: https://github.com/libpropeller/libpropeller/blob/master/libpropeller/i2c/i2c_base.h
Possibly not with the pause() function, but the waitcnt() function can give pauses down to about 0.2us (or 0.05us if you're inside an FCACHE loop or otherwise in COG code rather than LMM mode).
For example:
Produces: 4 of the extra 5 cycles are due to the fetch of CNT after the waitcnt; I'm not sure where the other cycle came from, but it's not C overhead (you'll get the same result with PASM).
But Phil is right, PASM is a very friendly assembly language, and probably worth learning!
Eric
Agreed on assembler and care. I measured the Prop C high() and low() functions: they each take 12 uS to execute, just by themselves! That, added to the other overhead of calling functions and such, means I will never make the pulse times I need with this approach. It now makes sense what is happening in the Prop with C code. Assembly appears to be the answer.
My Goal:
What I am trying to do is create a serial stream of pulses that are high and low for precise numbers of uS. This will be used to send a signal to a string of GE Color Effects (GECE) smart pixel lights that reside on a 3-wire system: +5V, Gnd, and signal (serial stream). After an enumeration process (assigning each bulb an address), the color and intensity of light can be managed per bulb, as each is addressable.
FYI...I have done this very reliably with SX processor for years using simple HIGH and LOW commands with PAUSEUS() function. By bit banging the signal line with pulses that meet the unique format and timing, I was able to talk to the bulbs successfully. Obviously the overhead with SX is less than with Prop C code. Trying to convert code to prop.
GECE Serial Signal Format is as follows...
Serial stream consists of :
Start Pulse is high pulse for 10 uS
Address of bulb - 6 bits
Intensity - 8 bits
Blue Color - 4 bits
Green Color - 4 bits
Red Color - 4 bits
Idle time is 20 uS at a logical level of low
Special Timing/Structure of "1" and "0"...
Logical 0 is 10 uS low followed by 20 uS of high
Logical 1 is 20 uS low followed by 10 uS of high
Data is sent MSB first, LSB last
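The format listed above can be sketched as a table of level/duration segments built from a packed 26-bit frame. This is host-side C that only computes the waveform; it does not drive a pin, and the type and function names (segment_t, gece_pack, gece_segments) are made up for illustration. Timings are the ones given in this post:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Timings in microseconds, taken from the format described above. */
enum { START_HIGH_US = 10, SHORT_US = 10, LONG_US = 20, IDLE_LOW_US = 20 };
/* 6 + 8 + 4 + 4 + 4 bits; start + two segments per bit + idle. */
enum { FRAME_BITS = 26, MAX_SEGMENTS = 2 + 2 * FRAME_BITS };

typedef struct {
    uint8_t level;  /* 1 = high, 0 = low */
    uint8_t us;     /* how long to hold that level */
} segment_t;

/* Pack the fields in transmit order: address, intensity, blue, green, red. */
uint32_t gece_pack(uint8_t address, uint8_t intensity,
                   uint8_t blue, uint8_t green, uint8_t red)
{
    assert(address < 64 && blue < 16 && green < 16 && red < 16);
    return ((uint32_t)address << 20) | ((uint32_t)intensity << 12) |
           ((uint32_t)blue << 8) | ((uint32_t)green << 4) | red;
}

/* Expand a packed frame into segments, MSB first; returns the segment count. */
size_t gece_segments(uint32_t frame, segment_t out[MAX_SEGMENTS])
{
    size_t n = 0;
    out[n++] = (segment_t){1, START_HIGH_US};
    for (int bit = FRAME_BITS - 1; bit >= 0; bit--) {
        int one = (int)((frame >> bit) & 1u);
        out[n++] = (segment_t){0, one ? LONG_US : SHORT_US};  /* low part  */
        out[n++] = (segment_t){1, one ? SHORT_US : LONG_US};  /* high part */
    }
    out[n++] = (segment_t){0, IDLE_LOW_US};
    return n;
}
```

With these timings every bit takes 30 uS, so a full frame is 10 + 26 x 30 + 20 = 810 uS. A real driver would walk the segment table and hold each level for the listed time, most likely in PASM or an FCACHE loop given the overheads discussed in this thread.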
I may also have to buffer the Prop's 3 V output up to 5 V levels. I am not sure the GECE will recognize that low a level, but that is another matter that I can easily solve.
Regards,
BTL24 (Brian)
Outstanding! It creates a delay in very fine resolution. AND I get to see how to inject assembly into C..
I need to better understand fcache. I think this will be the way to speed things up. (or run in own cog).
Thanks.
Regards.
BTL24 (Brian)
Thanks for the links...
Brian
Thank you for the example. I learn the fastest with examples like yours. fcache apparently is critical to getting things moving fast with the prop.
Regards,
Brian
The unexpected 5th cycle has to come from somewhere.
What happens as you vary the delay value? (There will be some minimum where it breaks.)
Also, how does the call overhead change? User code often does the pin I/O on either side of a customdelay() call, but for best precision (least SW jitter), it is probably better to do the pin I/O inside the function, right next to the waitcnt()?
I do not quite follow this. The P1 data says MOV/ADD are 4 cycles each, and that leaves WAITCNT, which says 6+N, but it auto-adjusts once it has got going, so I can see 4 extras coming from one ASM line. I can get a minimum of 14 as 4+4+6.
Maybe WAITCNT effectively needs one clock to complete/exit? (That would make sense to me.)
- but the user asked for 80, so the number added is 80. [add one, #delay]
Missing from the cycles diagram are the exit cases, and exactly where in the 6 cycles the added wait clocks get packed.
That detail determines where cnt is when the next opcode starts, which is the info I am chasing.
The ideal code is where that 5 cycles is corrected for, inside the delay routine.
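One way to fold the measured 5-cycle excess into the delay routine is to subtract it from the requested tick count before computing the waitcnt target. The sketch below is plain C arithmetic only (no Propeller headers), so it can be checked on a PC. The function names are made up, and the 16-tick floor is an assumption taken from the "about 0.2us" LMM figure quoted earlier in the thread (0.2 uS at 80 MHz); measure your own build before relying on either constant:

```c
#include <stdint.h>

/* Excess seen in the thread's measurement: asked for 80 ticks, 5 extra. */
#define MEASURED_EXTRA_TICKS 5u
/* Assumed floor: roughly 0.2 uS at 80 MHz, the LMM minimum quoted earlier. */
#define MIN_DELAY_TICKS 16u

/* Ticks to add to CNT so the total delay lands on the wanted value. */
uint32_t corrected_delay(uint32_t wanted)
{
    uint32_t d = (wanted > MEASURED_EXTRA_TICKS)
                     ? wanted - MEASURED_EXTRA_TICKS
                     : 0u;
    /* Clamp so the target cannot end up behind the counter. */
    return (d < MIN_DELAY_TICKS) ? MIN_DELAY_TICKS : d;
}

/* Target value for waitcnt; uint32_t wraps the same way the 32-bit CNT does. */
uint32_t wait_target(uint32_t cnt_now, uint32_t wanted)
{
    return cnt_now + corrected_delay(wanted);
}
```

On the Propeller, the value from wait_target(CNT, ticks) is what would be handed to waitcnt(). The floor matters because a target that is already behind CNT makes the cog wait for the counter to wrap, which is about 53 seconds at 80 MHz.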
Yes, I prefer the format of #16, especially for examples/tutorials.
Can you elaborate? Not sure I understand the question.
Your diagram shows S D w m . R for the 6-cycle minimum of WAIT, but does not show where that stretches.
i.e., does it do [S D w m . . . . . R] or [S D w m . R R R R R ] or ?
Whether it gains 1 count on entry or on exit is less important than correcting for the +5 overall path.
Cool, so I make that, merging with your other code
Porting this to a PinAction function, like the OP seeks, and using the above example, would this be coded something like this?