Some CORDIC documentation

Phil Pilgrim (PhiPi) · 2006-10-13 06:07

Update: Here's a corrected version of this posting, with the erroneous equation for C (i.e. C = (A + / 2) removed and replaced by correct formulae, along with a somewhat different conclusion. Rather than leaving the original posting cluttered with overstrikes and edits, I simply fixed it. It's not as proper doing it that way, but it's a lot cleaner!

I'd like to work through a possible alternative to pure polar coordinates to see if it might confer any advantages. In a way, this is a cross between polar and Cartesian coordinates. In this scheme, instead of keeping track of the angle and radius for a vector vA (call them A and rA, respectively), we'll keep track of the sine and cosine of A instead (call them sA and cA, respectively), along with rA, i.e.:

····vA = (rA, sA, cA) = (rA, sin A, cos A)

For signal processing, one will likely need the sine and cosine of the result anyway, so we might as well bring 'em along for the ride from the get-go and, with any luck, avoid having to compute any further transcendental functions.

Consider the following diagram:

···· attachment.php?attachmentid=43596

We recognize the following identities from the diagram and from trigonometry:

····rC sC = rA sA + rB sB (i.e. yC = yA + yB)
····rC cC = rA cA + rB cB (i.e. xC = xA + xB)
····rC² = rA² + rB² - 2 rA rB cos D (Law of cosines.)
····D = pi - (B - A) = pi + (A -

····sin(x + y) = sin x cos y + cos x sin y
····cos(x + y) = cos x cos y - sin x sin y

From these we can derive:

····cos D = cos[noparse][[/noparse]pi + (A - ]
··············= cos pi cos(A - - sin pi sin (A -
··············= - cos(A -
··············= - cos A cos(-B) + sin A sin(-B)
··············= - cos A cos B - sin A sin B
··············= - cA cB - sA sB

So

····rC² = rA² + rB² + 2 rA rB (cA cB + sA sB), leaving

····rC = [noparse][[/noparse]rA² + rB² + 2 rA rB (cA cB + sA sB)]^½, from which we can then compute

····sC = (rA sA + rB sB) / rC
····cC = (rA cA + rB cB) / rC

Now, to summarize what we've got: Given two vectors vA = (rA, sA, cA), and vB = (rB, sB, cB), the sum vC = vA + vB can be computed from the above formulae for rC, sC, and cC. These formulae involve no transcendental functions. They do involve a square root, however, which can be computed easily and converges quickly using Newton's method.

As an addendum, for cycling through the phases of any vector by adding an incremental angle d, first compute sd = sin d and cd = cos d, then use the identities for angle addition:

····r(A + d) = rA
····s(A + d) = sA cd + cA sd
····c(A + d) = cA cd - sA sd

The big question after all this fol de rol is whether it's any better than bopping in and out of Cartesian coordinates. I'm afraid the answer is no. Consider the following, much simpler formulae:

····rC = (xC² + yC²)^½
·········= ((xA + xB)² + (yA + yB)²)^½

····sC = yC / rC
·········= (yA + yB) / rC

····cC = xC / rC
·········= (xA + xB) / rC

Clearly, keeping the x's and y's around confers a significant computational advantage. And, truth be known, in the "modified polar" system derived above, we never really escaped the Cartesian domain, but simply disguised it in trigonometric clothing. Anyway, if was fun working through it, and I hope I didn't waste anyone's time.

-Phil

Post Edited (Phil Pilgrim (PhiPi)) : 10/14/2006 7:56:28 PM GMT

Graham Stabler · 2006-10-13 09:56

Chip, I wanted to ask something. If an "addpolar" function was created that took two vectors and produced the resultant in polar form would it really matter what was happening behind the scenes?

Say you have a vector, this was created by a vector sum and that sum was done the traditional way by decompositon using cordics, being clever you kept hold of the resultant x,y co-ordinates too. You now want to add a vector, that is just two cordics, one to find the components of the new vector and a second to find the angle and length of the resultant. The new co-ordinates of the resultant is also known.

Those two cordics could be done sequentially or put in the same loop, in the same loop would save a little time because the instructions for t1 and the loop lookup would be common (saves 120 instructions).

So the idea is to have best of both worlds in a way, fairly quick additions and instant rotations and scales. Of course there is the problem that if you have scaled it the x,y's are no longer correct so in that instance you would need to do three cordics to do an addition.

I'm struggling to work out if this is going to be of any use, if you do all your additions and scales seperately then it seems pretty good but if you process interleaves scales and additions then it might be poor, then again the CORDIC is pretty fast so three cordics per rotation/additon pair might not be so bad.

Graham

Peter Verkaik · 2006-10-13 10:19

Phil,
You assume C = (A+B)/2
but this is only true if vectors A and B have the same length.
Adding (1,0) and (0,2) yields (1,2) which does not have a pi/4 angle.
C = (A+B)/2 calculates to pi/4 for vectors along the x and y axis, regardless what vector lengths.
For arbitrary lengths, the calculation of C is more complex.

regards peter

Peter Verkaik · 2006-10-13 11:25

Graham,
If all vector parameters are kept, x, y, r and A,
then when adding 2 vectors you could use x and y to find resulting x and y, then do 1 cordic to find
resulting r and A.
When rotating a vector, simply adjust A, and do 1 cordic to find the new x and y.
When scaling a vector, simply multiply x,y and r or only multiply r and use 1 cordic to find new x and y.
This may well be the fastest solution.

regards peter

Graham Stabler · 2006-10-13 12:56

Peter Verkaik said...
Graham,
If all vector parameters are kept, x, y, r and A,
then when adding 2 vectors you could use x and y to find resulting x and y, then do 1 cordic to find
resulting r and A.
When rotating a vector, simply adjust A, and do 1 cordic to find the new x and y.
When scaling a vector, simply multiply x,y and r or only multiply r and use 1 cordic to find new x and y.
This may well be the fastest solution.
regards peter

I was assumining the vector to be added was in polar co-ordinates but there is no reason why that should not be in x,y,r,A already as you say.

Graham

Graham Stabler · 2006-10-13 12:58

Almost forgot, here is the latest and correctest version of the CORDIC simulator in excel

Phil Pilgrim (PhiPi) · 2006-10-13 15:22

Peter,

Oh cripes! That same realization got me out of bed this morning. What was I thinking?! I had blindly copied the formula from one of Tracy's postings without even examining its premises. In a way, I'm glad. That half-angle formula annoyed the heck out of me. I'll have to rework it, but I'm not at all optimistic that anything can be gained by it. I still believe, however that maintaining the sine and cosine of the angle as part of each vector is more useful than carrying around the angle itself.

-Phil

Paul Baker · 2006-10-13 17:02

Phil, I was thinking a few days ago along the same lines when I was examining the representation of an angle via an exponent to use the power as the notation used, but I didn't delve into the idea very deeply.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Paul Baker
Propeller Applications Engineer

Parallax, Inc.

Peter Verkaik · 2006-10-13 17:14

Now here is some interesting reading
http://dspace.lib.utexas.edu/bitstream/2152/32/1/arbaughj20424.pdf
Here table lookups are performed to find sin and cos for the first 1/3 of the iterations
What's better, it includes c++ sources to calculate those tables.
Check out chapters 4, 5 and 6.
regards peter

cgracey · 2006-10-13 18:14

Graham Stabler said...
Chip, I wanted to ask something. If an "addpolar" function was created that took two vectors and produced the resultant in polar form would it really matter what was happening behind the scenes?

Say you have a vector, this was created by a vector sum and that sum was done the traditional way by decompositon using cordics, being clever you kept hold of the resultant x,y co-ordinates too. You now want to add a vector, that is just two cordics, one to find the components of the new vector and a second to find the angle and length of the resultant. The new co-ordinates of the resultant is also known.

Those two cordics could be done sequentially or put in the same loop, in the same loop would save a little time because the instructions for t1 and the loop lookup would be common (saves 120 instructions).

So the idea is to have best of both worlds in a way, fairly quick additions and instant rotations and scales. Of course there is the problem that if you have scaled it the x,y's are no longer correct so in that instance you would need to do three cordics to do an addition.

I'm struggling to work out if this is going to be of any use, if you do all your additions and scales seperately then it seems pretty good but if you process interleaves scales and additions then it might be poor, then again the CORDIC is pretty fast so three cordics per rotation/additon pair might not be so bad.

Graham

Graham,

It may, indeed,·turn out that the most practical thing to do is maintain vectors in terms of (X,Y) and use CORDIC to rotate them (with built-in prescaling, to keep·usage as simple as possible). This way, vector adds are still straightforward (two ADD instructions - one for the X's and one for the Y's) and rotation is simple (either a CORDIC or hub instruction). If there was some computationally advantageous technique for adding polar vectors directly, it would be superior since a rotation would amount to a single ADD instruction (for the angle component) and the length would always be immediately accessible. Also, it would allow the programmer to think more clearly about what he's programming, as the polar domain is simpler to contemplate and allows insights and intuitive leaps which are obscured in the cartesian domain.

In a sense, yes, it doesn't matter what transpires in the background, as long as the result is achieved in a timely way, but some novel approach could greatly reduce the circuit complexity which also reduces power and silicon size/cost. That's what intrigues me. I have to believe there is some way to achieve this, even though none of the DSP vendors seem to have come up with anything. It's quite possible that they haven't bothered, since the industry seems quite sated by glorified MACs.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔

Chip Gracey
Parallax, Inc.

Paul Baker · 2006-10-14 08:37

Chip, I found the function to draw a straight horizontal line in polar space as we talked about at dinner w/ Tracy. Its the cosecant (csc) function (and the secant draws a vertical line in polar space). The distance from the origin at Θ=90 degrees is the scaling factor of the function, so 2csc(x) is a horizontal line that is 2 away from the origin at 90 degrees. I think this can be used in your coordiante rotation vector summing concept. I haven't quite worked out how the formula works exactly, but it's there and hopefully will be little more than a single trig function (maybe will require using the secant too)·and an addition. I'll look at this idea some more over the weekend.

Even if its two trig functions and an add, it should be more computationally efficient than converting to cartesian and back.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Paul Baker
Propeller Applications Engineer

Parallax, Inc.

Graham Stabler · 2006-10-14 11:24

I'm a bit scared to post this because I have probably overlooked something pretty major and I've only considered one case but I think the attached shows a way of adding two polar vectors with only two cordics and that beats normal polar-cartesian-polar by one cordic. If a cordic engine was built into the propeller then you would be cooking on gas. If the method is correct then I am afraid it is a lot cleverer than I am.

Graham

Click other link below for full sized version

attachment.php?attachmentid=43627

Post Edited (Graham Stabler) : 10/14/2006 11:28:09 AM GMT

Graham Stabler · 2006-10-14 11:34

How it works in words

The first step is to use my new cordic to find the components of vector2 (this finds both on one go) in the direction of and perpendicular to vector1. This produces a carteasean vector representing the resultant but rotated by -theta1. The second cordic converts this carteasean vector back to polar producing an angle and length. This just needs to be rotated by theta1 which in the polar space is just an addition of theta.

In a sense this method avoids one of the CORDICs that would have converted R1 to carteasean by making the conversion trivial, by simply putting vector1 on the "new" x-axis.

Graham

Peter Verkaik · 2006-10-14 14:42

Graham,
if both vectors also carried x and y, you can leave out the first cordic.
You trade memory for speed. If speed is what you're after, that's
how I would do it. The extra memory required (two 32bit words for x and y)
is worthwile I think to almost halve the time required for the addition of 2 vectors.

regards peter

Graham Stabler · 2006-10-14 16:26

I think they would be carrying the wrong x and y's. You have to find the x,y of vector two with resepect to vector one because the angle of the second vector is with respect to the x-axis.

Graham

Peter Verkaik · 2006-10-14 16:43

Graham,
What I mean is that EVERY vector should have properties x,y,r,A and that these are valid
upon entry and exit of a vector (library) function.

The relation between the properties is
r = sqrt(x*x + y*y)
A = arctan(y/x)

Having x,y,r,A instantly available also allows·tan,sin,cos to be quickly calculated
tan(A) = y/x
sin(A) = y/r
cos(A) = x/r

The add function would then do:
··add the x's and y's
··do 1 cordic to calculate·r and A for the new x and y

The rotate function would do:
· add angle to A
· do 1 cordic to calculate new x and y

The scale function would do:
· multiply r by scale factor
· do 1 cordic to find new x and y

regards peter

Graham Stabler · 2006-10-14 16:53

Sorry I thought your comment was related directly to my previous post rather than the previous discussion.

More likely your polar sums would be:

Find the missing co-ordinates of the vector to be added
Do the add or whatever
calculate the new x,y or r and A

Still two cordics.

It would depend on what you were doing of course, if you were always adding a fixed set of vectors that would be different or in some cases you might flip between systems (for long streams of additions or rotations for example).

Graham

Peter Verkaik · 2006-10-14 16:57

Graham,
You say
Find the missing co-ordinates of the vector to be added
Do the add or whatever
calculate the new x,y or r and A

The first line is not true, because the vector to add already has valid x and y in its properties.
So 1 cordic is all that is required.

regards peter

kelvin james · 2006-10-14 17:26

Not that i really want to get involved here as this is 5 miles over my head. I had previously read about Bresenhams' algorithim for plotting without using trig and uses integer only math. I don't know if this would be applicable to this situation or not, or if it could be manipulated into vector addition. Probably way off base, but thought it might be worth a look. Here is a explanation how it works.

kelvin

www.mandelbrot-dazibao.com/Bresen/Bresen.htm

Graham Stabler · 2006-10-14 17:55

Peter, where did those properties come from if not from a cordic? A vector has to start somewhere.

You choose from doing a 1 cordic on every operation or you do what I suggest and use 2 cordics for the additions but only the additions, my method cuts the time for a polar addition down by a third and if as I suspect most DSP consists of alternating scales/rotates and additions then it is no slower than storing all variables.

Graham

p.s. Very interesting Kelvin whether applicable or not!

Tracy Allen · 2006-10-14 18:14

I like where this is going with storing the intermediate results for access on the next cycle. A first step into pipelining, I guess.

Its always fun to visit Parallax, draw diagrams mentally and on napkins. And hear (literally) the progress Chip has made on the speech synth that he posted a few weeks ago. It is clear why he wants the fastest possbible CORDIC routines.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Tracy Allen
www.emesystems.com

Peter Verkaik · 2006-10-14 18:29

Graham,
All properties are set when you define/declare a vector, just as any variable.
Normally you only have properties x and y (or only r and A)
Having only x and y requires no cordic for adding but 2 cordics for rotating.
Having only r and A requires no cordic for rotation but 3 cordics for addition (or 2 in your approach).
Having all 4 properties requires 1 cordic for addition and 1 cordic for rotation.

To illustrate, using some C code
in C a vector would be a structure

struct {
· float x;
· float y;
· float r;
· float A;
} vector;

void setCartesian(vector *v, float x, float y) {
· *v.x = x;
· *v.y = y;
· convertToPolar(v); //make sure r and A are valid
}

void setPolar(vector *v, float r, float A) {
· *v.r = r;
· *v.A = A;
· convertToCartesian(v); //make sure x and y are valid
}

void add(vector *r, vector *a, vector *b) {
· *r.x·= *a.x + *b.x;
· *r.y = *a.y + *b.y;
· convertToPolar(r);
}

void rotate(vector *r, vector *v, float angle) {
· *r.A = *v.A·+ angle;
· *r.r = *v.r;
· convertToCartesian(r);
}

void convertToPolar(vector *v) {
· //cordic code to calculate r and A·using x and y
}

void convertToCartesian(vector *v) {
· //cordic code to calculate x and y·using r and A
}

regards peter

Graham Stabler · 2006-10-14 18:50

It is likely that the program rather than the programmer will define the 4 variables of the vector, this will probably entale a formula or process to get one set of variables and a cordic to get the other, then you do a rotate, scale or addition and you need to do another cordic, that's two cordics for ANY two vector process and potentially 3 depending on how the vectors are created (defined or calculated).

It would very much depend on the processing you intended to do I suspect.

Cheers,

Graham

cgracey · 2006-10-14 18:59

Peter Verkaik said...

Having only x and y requires no cordic for adding but 2 cordics for rotating.

Doesn't this just require 1 cordic? Both X and Y get rotated in a single cordic.

Having all 4 properties requires 1 cordic for addition and 1 cordic for rotation.

I think this is a really interesting idea - keep both forms around all the time. I worry that some error between the representations would build up over time, though. What do you think?

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔

Chip Gracey
Parallax, Inc.

Peter Verkaik · 2006-10-14 19:13

Graham,
"It is likely that the program rather than the programmer will define the 4 variables of the vector"
This doesn't matter as long as the library functions are used.

"that's two cordics for ANY two vector process "
That's true, but when having 4 properties, one is in the set function to initialize.
A running vector that is used throughout the program, is usually only initialized once.

Chip,
Having only x and y, to add no cordic is required since there is no need, at that time,
for the length and angle.
1 cordic for addition and 1 cordic for rotation also makes the time required
for the functions more constant I think, which may proof useful in some applications.
If you were to do only additions, and no interest in length and angle, then I would have only x and y.

regards peter

Phillip Y. · 2006-10-14 20:18

I reviewed the link for Bresenhams'·algorithm for plotting.
"These algorithms are called INCREMENTAL, because the position of the next pixel is calculated on the basis of the last plotted one".

In the CNC world this is called interpolation it can ber linear, circular, log, exponential etc, the output·is + and - pulses for·X and·Y servos.
Early Yaskawa/Ysnac CNC machines used an "Interpolation Processor" for this, in 1983·the KM37601·was sold separately by TOKO.
I have·a chip and 75 page user manual with detailed examples of the interpolation calculations/process.

The interpolator uses some control bits to select the type of interpolation, direction and some 24 bit constants for the line properties,
then it would calculate the next move based on the best tracking error·of all the next point's 8·possibilities, it include 45 degree moves.
The calculations were only add and subtract and·the control bits +1,0,-1 [noparse][[/noparse]linear 10 clocks·, other 20 clocks for each interpolation cycle].

Another closely related thing is a Velocity Profile Generator to perform acceleration and deceleration of the interpolated steps.
Profiles could be linear, exponential and perhaps Scurve for smoothest starting and stopping.
This is really just another form of interpolation·where one axis is motion and the other axis·is speed or time between steps.

I recommend that·a·FUNCTION ENGINE be made general enough to do CORDIC, Interpolation and many other functions.

Phil Pilgrim (PhiPi) · 2006-10-14 20:27

In a prior posting, which I've now corrected, I explored the possibility of keeping the length along with the sine and cosine of the angle (instead of the angle itself) of each vector with which to do all the computations. It worked without requiring any further computation of transcendental functions. But I concluded there that it conferred no advantages over a purely Cartesian approach and, in fact was little more than a Cartesian formulation in trigonometric trappings.

Also, I've recently encountered some papers about "rational trigonometry", which you can read about here: en.wikipedia.org/wiki/Rational_trigonometry. Mentioning it in this thread may be a red herring, since I doubt its applicability in signal processing apps. But who knows? Perhaps, with a little massaging, it might prove useful. Anyway, it's being touted for its simplification of trig by replacing circular functions and their associated computational difficulties with metrics called "quadrance" and "spread". It's the spread function that I find troubling for signal-processing apps, since the quadrant information gets completely buried (which, the author claims, is okay for plane geometry). Anyway, 'just thought I'd toss it into the fray...

-Phil

Peter Verkaik · 2006-10-14 21:43

Phil,
Thanks for updating your other post. It really shows you can't really do without x and y.
Having also the r and A as vector properties makes both addition and rotation require
only 1 cordic.

The document
http://dspace.lib.utexas.edu/bitstream/2152/32/1/arbaughj20424.pdf
chapter 5 discusses lookup tables to perform the first third of cordic iterations
in a single step. If I understood it correctly, multiple tables can be used
to bring the number of iterations down more.

Chip,
there is also a discussion (chapter 5 and·6) in that document on how to implement that on silicon,
where one needs a fast way to index the lookuptables.
If I find the time I might try·this out·on the javelin, as I already
have a cordic class for it. Just to see how much time can be saved by using tables.

regards peter

Graham Stabler · 2006-10-15 00:03

Peter Verkaik said...

"that's two cordics for ANY two vector process "
That's true, but when having 4 properties, one is in the set function to initialize.
A running vector that is used throughout the program, is usually only initialized once.

Imagine this

program starts
initialize vector1
initialize vector2
repeat
{
calculate vector2 (one cordic to get 4 variables)
add vector2 to vector1 (one cordic to get 4 variables)
}

In this situation you still end up with two cordics, one to redefine the second vector and one to redefine the new resultant.

Peter Verkaik said...

Chip,
Having only x and y, to add no cordic is required since there is no need, at that time,
for the length and angle.

I think chip was just pointing out that it only takes one cordic to rotate an x,y pair (you don't have to convert to polar and back)

I had a bit of a look at the pdf, it looks as though the data tables might need to be quite large?

Graham

Peter Verkaik · 2006-10-15 00:42

Graham,
using the code example functions I described

program starts
setPolar(...); //initialize vector1
setPolar(...); //initialize vector2
repeat
{
add(...); //add vector2 to vector1 (one cordic to get 4 variables)
}

There is no need to calculate vector2 again inside the repeat
as the functions make sure ALL properties are valid all the time,
so you can use them directly when needed.

About the tables,
as I understand it multiple tables can be used that
significantly reduces the total amount of storage required.
According to figure 5.6 (page 65) when using 4 tables
and 32bits coefficients, only about 10 kilobyte (rom) space
is required.

regards peter

Some CORDIC documentation

Comments