A Tutorial on PWM
deSilva
I shall post this as a PDF in a few days, but I would appreciate your feedback before that step. Thank you!
The other day someone asked in a German forum how many servos can be controlled by one Prop: 10? 30? Even more?
Most people know how this is generally done. Not so deSilva, but he can learn too.
You send out a pulse between 1 and 2 ms long within a 20 ms frame. A 1 ms pulse means turn left, a 1.5 ms pulse means stay in the middle, and a 2 ms pulse means turn right. The most common resolution is 256 steps per ms, i.e. 8-bit precision, so this leaves ample time to do it in SPIN.
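To make the numbers concrete, here is a small Python sketch (not part of the Spin sources) of how a table value n in 0..255 maps to a pulse width in system clock ticks. The constants mirror _HZ, __1msTicks and __resTicks from the Step 1 listing; the function names are made up for illustration. Note that the integer division truncates 312.5 to 312 ticks per step, so the longest pulse ends slightly short of 2 ms.

```python
HZ = 80_000_000               # system clock, like _HZ in the Spin listing
TICKS_1MS = HZ // 1000        # 80_000 ticks per millisecond
TICKS_STEP = HZ // 256_000    # 312 ticks per 1/256 ms step (312.5 truncated)


def pulse_ticks(n: int) -> int:
    """Pulse width in clock ticks for a table value n in 0..255."""
    if not 0 <= n <= 255:
        raise ValueError("n must be in 0..255")
    return TICKS_1MS + n * TICKS_STEP


def ticks_to_ms(ticks: int) -> float:
    """Convert clock ticks to milliseconds at HZ."""
    return ticks * 1000 / HZ


if __name__ == "__main__":
    for n in (0, 127, 255):
        t = pulse_ticks(n)
        print(f"n={n:3d}  ticks={t:6d}  width={ticks_to_ms(t):.4f} ms")
```

With these constants the usable range is 80_000 to 159_560 ticks.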
Algorithm 1:
* Wait for the next 20 ms tick
* Set port x
* Wait for 1 + n/256 ms
* Reset port x
* Repeat this for 8 more ports (be careful not to cross the 20 ms boundary!)
This is the simple source for it:
{ 24servosA: A Tutorial in 4 Steps – Step 1
------------------------------------------
November 2007 by deSilva
v0.14: A simple SPIN Version without Timer.
Within a time window of 20 ms as many PWM channels as possible will be served;
as pulse width is 2 ms this is 9, quite independent of any performance
constraints - it is NOT necessary to run @ 80 MHz !
}
CON
  _CLKMODE = XTAL1 + PLL8X     ' values for Hydra or SpinStamp!
  _XINFREQ = 10_000_000

CON
  _HZ = 80_000_000             ' This can be set to values other than the true clock
                               ' It works as a global scaling: e.g. a value of 160_000_000
                               ' will lead to 4 ms pulses within a 40 ms period for a 80 MHz clock
  _maxServos = 9
  _pulseOffsetCompensation = 0
  ' derived constants
  __20msTicks = _HZ/1000*20
  __1msTicks  = _HZ/1000
  __resTicks  = _HZ/256_000    ' = 1/256 ms clocks

PUB demo
  servosA(@servoCtrl)

PUB start(table)
  COGNEW(servosA(table), @aStack)

PRI servosA(ctab) | port, next20ms, addr, pulseWidth
  'enable all used Ports
  REPEAT addr FROM ctab TO ctab+constant(2*(_maxServos-1)) STEP 2
    port := BYTE[addr]
    IF port =< 31
      DIRA[port] := 1
  next20ms := CNT + __1msTicks ' first tick to act on
  REPEAT
    WAITCNT(next20ms)
    REPEAT addr FROM ctab TO ctab+constant(2*(_maxServos-1)) STEP 2
      port := BYTE[addr]
      IF port =< 31
        ' the pulse width byte follows the port byte of the same entry
        pulseWidth := BYTE[addr+1] * __resTicks + __1msTicks - _pulseOffsetCompensation
        OUTA[port]:=1
        WAITCNT(CNT+pulseWidth)
        OUTA[port]:=0
    next20ms += __20msTicks

DAT
aStack    LONG                 ' reuse as stack when called with external table - align to LONG
ServoCtrl BYTE                 ' Entries for I/O number and pulse width 0..255 per ms
          BYTE  0, 255
          BYTE  1, 127
          BYTE  2, 0
          BYTE  3, 0
          BYTE  4, 255
          BYTE  5, 255
          BYTE -1, 255
          BYTE  7, 255
          BYTE 16, 0
          ' rest unused (see constant _maxServos)
          BYTE -1, 127         '9
          BYTE -1, 127
          BYTE -1, 127
          BYTE -1, 127
          BYTE -1, 127
          BYTE -1, 127
          BYTE -1, 127
          BYTE -1, 127
          BYTE -1, 127
          BYTE -1, 127
          BYTE -1, 127
          BYTE -1, 127
          BYTE -1, 127
          BYTE -1, 127
          BYTE -1, 127         '23
' The End
Not bad at all… However, there is this ugly calibration to compensate for SPIN instruction time, as it is not possible to time a signal precisely with OUTA. There is always some difference, measured in us rather than ns, between setting OUTA and reading CNT. But we can easily improve on it, and by a lot, by using a counter. The resolution now becomes higher than 10 bits!
{ 24servosB: A Tutorial in 4 Steps – Step 2
------------------------------------------
November 2007 by deSilva
v0.12: Simple SPIN version with timer
Timer A is used for a very precise pulse width
}
PRI servosB(ctab) | port, next20ms, addr
  'enable all used Ports
  REPEAT addr FROM ctab TO ctab+constant(2*(_maxServos-1)) STEP 2
    port := BYTE[addr]
    IF port =< 31
      DIRA[port] := 1
  next20ms := CNT + __1msTicks ' first tick to act on
  REPEAT
    WAITCNT(next20ms)          ' frame loop as in Step 1
    REPEAT addr FROM ctab TO ctab+constant(2*(_maxServos-1)) STEP 2
      port := BYTE[addr]
      IF port > 31
        NEXT
      ' ---- this part contains the improvement using a timer:
      ' Programming timerA for PWM-mode ("NCO")
      ' i.e. pulse = sign bit (bit 31); thus preset register with MINUS pulse width
      ' That's all, folks :)
      CTRA := 0                ' reset timer
      FRQA := 1                ' adding 1 @ system clock = 80 MHz
      PHSA := -(__1msTicks + __resTicks * BYTE[addr+1])
      CTRA := (%0_00100 << 26) + port
      ' Now there is a little bit time to do other things ....
      ' Note that it is not very critical as the pulse is reset automatically!
      WAITPEQ(0, |<port, 0)    ' ... which is equivalent to: REPEAT WHILE INA[port]
      CTRA := 0
      '----- end of improved section
    next20ms += __20msTicks
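Why the counter trick gives an exact pulse can be checked with a few lines of Python. This is only an idealised model of the NCO mode as the comments in the listing describe it (pin follows bit 31 of PHSA, PHSA grows by FRQA on every system clock); the function name is invented for this sketch and start-up latencies of the real hardware are ignored.

```python
import math

MASK32 = 0xFFFF_FFFF


def nco_high_ticks(width_ticks: int, frqa: int = 1) -> int:
    """Count the clock ticks for which bit 31 of PHSA stays set."""
    phsa = (-width_ticks) & MASK32       # PHSA := -width, as a 32-bit register
    ticks = 0
    while phsa & 0x8000_0000:            # the pin follows PHSA[31]
        phsa = (phsa + frqa) & MASK32    # PHSA += FRQA on every system clock
        ticks += 1
    return ticks


if __name__ == "__main__":
    width = 159_560                      # longest pulse of the Step 1 table
    print(nco_high_ticks(width))         # same number of ticks as requested
    # Steps per millisecond at 80 MHz, expressed in bits of resolution:
    print(round(math.log2(80_000), 1))
```

Since one tick is 12.5 ns at 80 MHz, the 1 ms control range holds 80_000 steps, which is well beyond the 10 bits mentioned above.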
The beauty of the Propeller is that you can simply multiply this by as many COGs as you can spare. So 4 COGs will give you detailed control over 36 servos. By then you already have to start using a port expander. In this case it can be a cheap 4-to-16 decoder: as we set port by port, a single COG can easily control 9 servos with 4 pins, so 4 COGs need just 16 pins in total.
That was easy so far, but as you have noticed all of our 4 COGs do nothing but wait… Can't we rather put ONE to work for all of it?
Not with SPIN… - but with machine code!
Algorithm 2:
* Wait for the next 20 ms tick
* Set ALL ports
* Wait 1ms
* Look through the pulse width table to find a port to be stopped at this time, and stop it
* Wait 1/256 ms
* Repeat the last two steps a further 255 times
(I omit the code of 24servosC here, educational as it may be, as few people read machine code… ;) )
Blazingly fast as the machine code might be, it has to work through the table within one 1/256 ms slot, which is about 4 us, and that stops it after 6 loop runs!
All right, but we are now only at clock time N + 2 ms! We still have 18 ms until the next cycle starts, so we can also repeat this algorithm around 9 times, giving us 54 servos to control.
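The capacity estimate can be replayed in Python. The slot length follows from the 1/256 ms resolution; the cost of checking one table entry is NOT given in the text, so the 52 ticks below are an assumption picked so that 6 entries fit into one slot, and the 9 passes are taken from the "around 9 times" above.

```python
HZ = 80_000_000
SLOT_TICKS = HZ // 256_000      # 312 ticks per 1/256 ms slot
TICKS_PER_ENTRY = 52            # ASSUMPTION: ticks to check one table entry
PASSES_PER_FRAME = 9            # "around 9 times" per 20 ms frame, from the text


def slot_us() -> float:
    """Length of one time slot in microseconds."""
    return SLOT_TICKS * 1_000_000 / HZ


def entries_per_slot() -> int:
    """How many table entries can be checked before the slot is over."""
    return SLOT_TICKS // TICKS_PER_ENTRY


def servos_per_cog() -> int:
    """Servos one cog can serve when the 2 ms pass is repeated."""
    return entries_per_slot() * PASSES_PER_FRAME


if __name__ == "__main__":
    print(f"slot = {slot_us():.1f} us, "
          f"{entries_per_slot()} entries per slot, "
          f"{servos_per_cog()} servos per cog")
```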
But on a second look, the signals do not look good at all. It is true that each pulse is stopped in the right 1/256 ms slot, but depending on its position in the table this can happen at the beginning or at the end of the slot, resulting in unacceptable jitter! Oh dear – what have we done wrong?
The issue is not the use of machine language, but the use of the wrong algorithm! Why always loop through the same table? Can't we find out in advance which time slot will be needed next? Of course we can, by simply sorting the table by pulse width value!
Algorithm 3:
* Sort the pulse width table
* Wait for the next 20ms tick
* Set ALL ports
* Wait for the NEXT timeslot
* Reset the corresponding ports
* Repeat this until all ports have been reset
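The steps above can be sketched in Python as an event schedule. This is a model of the idea only, not the machine code of Step 4: the function names, the dictionary input and the merging of equal pulse widths into one pin mask are choices made for this illustration.

```python
from __future__ import annotations

from itertools import groupby


def build_schedule(widths: dict[int, int]) -> list[tuple[int, int]]:
    """Return [(wait_ticks, pin_mask), ...] in firing order.

    widths maps an I/O pin number to its pulse width in clock ticks.
    Pins sharing a width are merged into one mask, so one write resets them.
    """
    ordered = sorted(widths.items(), key=lambda item: item[1])
    schedule = []
    previous = 0
    for width, group in groupby(ordered, key=lambda item: item[1]):
        mask = 0
        for pin, _ in group:
            mask |= 1 << pin
        schedule.append((width - previous, mask))   # wait relative to last event
        previous = width
    return schedule


def run_frame(widths: dict[int, int]) -> dict[int, int]:
    """Replay one frame; return the tick at which each pin goes low."""
    outa = 0
    for pin in widths:                  # set ALL ports at the 20 ms tick
        outa |= 1 << pin
    now = 0
    low_at = {}
    for wait, mask in build_schedule(widths):
        now += wait                     # wait for the NEXT time slot
        outa &= ~mask                   # reset the corresponding ports
        for pin in widths:
            if (mask >> pin) & 1:
                low_at[pin] = now
    return low_at


if __name__ == "__main__":
    table = {0: 80_000, 1: 120_000, 2: 80_000, 5: 159_560}
    print(build_schedule(table))
    print(run_frame(table) == table)
```

Because each wait is relative to the previous event, no pin depends on its position in the table any more, which is what removes the jitter of Algorithm 2.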
This idea is so fascinating that one feels inclined to implement it even in SPIN!
The main obstacle is that the required precision of 1/256 ms leaves, in the worst case, only about 4 us between two potential reset requests, which is too short for 80 MHz SPIN… One could try, though it's very tight!
But now machine code enters again at its best! After having killed four bugs in it, here comes a program even deSilva can be proud of. It will not only control 30+ servos (due to inefficiencies in the sorting the limit is 50 at the moment, but it can be boosted to 100 with a little change), but also works with a fourfold precision of 1000 steps per ms.
To give it broader application, I changed the pulse width table to longs, each entry addressed by its physical I/O port number (0..29).
The test routine is configured to dim 2 LEDs at ports 0 and 1; as the duty cycle stays below 50%, they can be connected without any resistor.
This is the astonishingly short code:
[code]
{ 24servosD: A tutorial in 4 Steps: Step 4
November 2007 by deSilva
v0.14: Another Bug Nov, 2nd
Demonstration of pulse modulation on several chan
[/code]
Comments
I quite understand that it could be more interesting to distribute the pulses within the whole 20ms...
But even 8 servos are too heavy for this simple algorithm: Beau needs 8 loop runs of 400ns each, so he cannot meet a resolution of 1us but only about 4us, which suffices as generally only 8 bits = 1ms/256 is expected
Look at it again... the resolution is 1uS by choice... it could be finer than that (a couple hundred nanoseconds of resolution if you wanted; whatever the assembly overhead would allow), but I figured that 1uS was good enough for controlling standard servos.
I stand corrected... see below.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Beau Schwabe
IC Layout Engineer
Parallax, Inc.
Post Edited (Beau Schwabe (Parallax)) : 11/4/2007 3:10:00 AM GMT
Before you reset a port you have to execute 8x 32 ticks @ 12.5 ns = 3.2us
There is no chance to do anything in between...
Ok... 3.2uS. Perhaps I should read my own code a little closer.
Actually it could be longer than that since a rdlong can take anywhere from 7 to 22 clocks.
So... 27 to 42 clocks multiplied by 8 equals 216 to 336 clocks;
add another 4 clocks for a non-jump djnz and that's...
220 clocks to 340 clocks.
This results in a best case scenario of 232 clocks... worst case scenario of 352 clocks.
Running at 80 MHz that translates to a 2.9uS (best) to 4.4uS (worst) pulse resolution.
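For anyone who wants to replay this tally, here is a short Python check. The split into 20 "other" clocks per loop is derived from 27 - 7 = 42 - 22; the last 12 clocks are only implied by the totals (232 - 220) and are not itemised in the post, so they appear here as an unexplained constant.

```python
NS_PER_CLOCK = 12.5        # one clock at 80 MHz
RDLONG_BEST = 7            # clocks, from the post
RDLONG_WORST = 22          # clocks, from the post
OTHER_PER_LOOP = 20        # 27 - 7 = 42 - 22: clocks per loop besides rdlong
LOOPS = 8                  # one loop run per servo in the group
DJNZ_FALL_THROUGH = 4      # final djnz that does not jump
UNITEMISED = 12            # 232 - 220: implied by the totals, not explained


def total_clocks(rdlong_clocks: int) -> int:
    """Clocks from the first loop run until the port is written."""
    looped = (OTHER_PER_LOOP + rdlong_clocks) * LOOPS
    return looped + DJNZ_FALL_THROUGH + UNITEMISED


def to_us(clocks: int) -> float:
    """Convert clocks to microseconds at 80 MHz."""
    return clocks * NS_PER_CLOCK / 1000


if __name__ == "__main__":
    for label, rd in (("best", RDLONG_BEST), ("worst", RDLONG_WORST)):
        clocks = total_clocks(rd)
        print(f"{label}: {clocks} clocks = {to_us(clocks)} us")
```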
Post Edited (Beau Schwabe (Parallax)) : 11/4/2007 3:10:56 AM GMT
The "uncertainty" with HUB accesses is often misunderstood.
Once you enter a simple loop (without JMPs), you are locked to the HUB wheel (nice picture!)
Each loop run will then always take exactly 16, 32, 48, ... ticks; the uncertainty occurs only once, at the very first run.
So your "best case scenario" was a nice try, but could not fool deSilva
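The locking effect is easy to reproduce with a toy model in Python. The assumptions are mine: hub windows arrive at every multiple of 16 clocks, a rdlong waits for the next window and then needs 7 clocks (which gives the 7 to 22 clocks quoted earlier in this thread), and the rest of the loop body costs 20 clocks as in the 27 to 42 clock figure.

```python
HUB_WINDOW = 16     # ASSUMPTION: this cog's hub window comes every 16 clocks
RDLONG_MIN = 7      # clocks a rdlong needs once its window has arrived


def loop_starts(start: int, other_clocks: int, runs: int) -> list:
    """Clock at which each loop run begins, for a loop with one rdlong."""
    t = start
    starts = []
    for _ in range(runs):
        starts.append(t)
        t += -t % HUB_WINDOW      # rdlong waits 0..15 clocks for its window
        t += RDLONG_MIN           # ... then takes 7 clocks
        t += other_clocks         # remaining instructions of the loop body
    return starts


def durations(start: int, other_clocks: int = 20, runs: int = 9) -> list:
    """Length of each loop run in clocks."""
    s = loop_starts(start, other_clocks, runs)
    return [b - a for a, b in zip(s, s[1:])]


if __name__ == "__main__":
    for phase in (0, 3, 15):
        print(phase, durations(phase))
```

Whatever the starting phase, only the first run varies (27 to 42 clocks here); every later run takes exactly 32 clocks, so 8 runs cost 8 x 32 x 12.5 ns = 3.2 us.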
Now you have me curious... the Servo32 Demo was the first assembly program that I wrote for the Propeller. I have modified the servo32 (see attached) by pre-loading the Pulse-Widths during the OFF-time so that the resolution should be right at 1uS with an 80 MHz clock. The timing critical path has been reduced to a SUB and an RCL (8 clocks) per servo (64 clocks total for each servo group) plus an additional 12 clocks as overhead to get the final BYTE value to the port... a total of 76 clocks... there is even a NOP instruction in there to make it an even 80 clocks, but when I scope the output, the best resolution that I can get is 5uS. What am I missing? 80 clocks at 12.5nS = 1000nS or 1uS
Edit
File Attachment removed... updated version below
Post Edited (Beau Schwabe (Parallax)) : 11/5/2007 5:34:55 AM GMT
And the SUBs will need an NR when inside a loop...
Ok... Duh... Thanks! I have the code now working at 1.05uS resolution... If I could just shave one more assembly instruction off of the code below then it would be an even 1uS resolution.
At this point though I have been staring at it too long. I welcome any suggestions.
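A quick Python check of what 1.05uS means in clocks. It assumes that every instruction in the critical path takes 4 clocks, which matches the 8 clocks quoted for a SUB plus an RCL but is not stated for the whole loop; the function name is made up.

```python
HZ = 80_000_000
CLOCKS_PER_INSTRUCTION = 4      # ASSUMPTION: plain 4-clock instructions only


def clocks_for(resolution_ns: int) -> int:
    """Clocks that fit into a resolution given in nanoseconds."""
    return resolution_ns * HZ // 1_000_000_000


measured = clocks_for(1050)     # the 1.05 uS that the code currently reaches
target = clocks_for(1000)       # the 1 uS goal
excess = (measured - target) // CLOCKS_PER_INSTRUCTION

if __name__ == "__main__":
    print(f"{measured} clocks measured, {target} wanted, "
          f"{excess} instruction(s) too many")
```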
Edit
Attachment removed... updated version below
Post Edited (Beau Schwabe (Parallax)) : 11/5/2007 6:16:47 PM GMT
CMP cnt, ServoWidthx wc
It is a pity that RCL does not behave as "expected"; otherwise the last two shifts could be combined..
BTW: Are you sure the code will still work, when CNT overflows from $FF_FF_FF_FF to $0 ?
You can't reverse the CMP operands when one of them is the cnt register. The cnt register can only be used in the Source operand.
Propeller Manual (v1.0)
Page 397
"Each of these registers can be accessed just like any other register in the Destination or Source fields of instructions, except for those that are designated "(Read-Only)." Read-only registers can only be used in the Source field of an instruction."
Yes, that's what this bit of code at the beginning of each Zone is for...
Post Edited (Beau Schwabe (Parallax)) : 11/5/2007 4:42:25 PM GMT
Solution for 1uS servo resolution!... use cmpsub rather than cmp.
Basically, for this operation cmp and cmpsub behave as opposites of one another with regard to when the C flag is set.
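The opposite behaviour of the two C flags can be modelled in Python. This sketch only reflects my reading of the instruction descriptions in the Propeller Manual (CMP sets C on an unsigned borrow, CMPSUB sets C when the subtraction is carried out); it is not taken from the Servo32 code, and the function names are invented.

```python
MASK32 = 0xFFFF_FFFF


def cmp_c(d: int, s: int) -> bool:
    """C flag after CMP D, S WC: set when D < S (unsigned)."""
    return (d & MASK32) < (s & MASK32)


def cmpsub(d: int, s: int) -> tuple:
    """Model of CMPSUB D, S WC: returns (new D, C flag).

    If D >= S (unsigned), S is subtracted from D and C is set;
    otherwise D is left unchanged and C is cleared.
    """
    d &= MASK32
    s &= MASK32
    if d >= s:
        return d - s, True
    return d, False


if __name__ == "__main__":
    for d, s in ((5, 9), (9, 9), (12, 9)):
        new_d, c_sub = cmpsub(d, s)
        print(f"D={d:2d} S={s}  cmp C={int(cmp_c(d, s))}  "
              f"cmpsub C={int(c_sub)}  D after cmpsub={new_d}")
```

Keep in mind that in this model cmpsub also changes D whenever it sets C, which a plain cmp never does.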
Edit
Attachment removed... updated version below
Post Edited (Beau Schwabe (Parallax)) : 11/6/2007 8:13:54 AM GMT
I now understand how your de-glitching works.. You have the time for NINE events within 20ms, but need only EIGHT, so you can waste one...
And yes: The Special-Register-Trap
I had another idea this morning:
Edit: Bad thinking: Beau only needs FOUR events (as 8 ports are packaged in each zone), so he has FIVE to waste!
Post Edited (deSilva) : 11/6/2007 2:17:55 PM GMT
Ok, well this has been a fun learning curve. I thrive on being challenged this way, thanks!!
The previous "Solution" did not take into account that ServoByte needed to be reset or cleared on each ZoneLoop iteration.
As a result, other zones got clobbered, and there's also a nasty bit-overlapping issue of trying to use a 9-bit variable with an 8-bit port.
Anyway, as I said I thrive on this kind of challenge, and below is a slightly different solution that meets all the required needs, and actually has one instruction to spare.
Whew!! Enjoy!
Some other parts of the program could be optimized, but I will work on that later, after I am positive that there are no more gotchas with this code.
Post Edited (Beau Schwabe (Parallax)) : 11/6/2007 8:11:51 AM GMT
Of course I still love my own solution, which effortlessly handles a resolution of 1 us down to 40 MHz