Peter, your example doesn't do anything useful it justs adds vector2 over and over. Vector 2 might need to be modified on every loop, that might mean completely reloading the x,y values (for example in a position tracking system the values might come from an accelerometer) and that means another cordic to get the new r,A. Or vice versa.
Graham,
The original question was to add 2 vectors in polar properties
and return the result in polar properties.
It turns out we still require x and y to do that.
If you keep vectors in polar·format only you would · convert vector1 to cartesian · convert vector2 to cartesian · add vectors · convert result to polar
If you keep all properties you would · add vectors
My point is, the adding itself requires less cordics when having all properties.
The initialization of a vector however also requires a cordic,
so in the end, the total number of cordics in the program
is probably only slighty less. But if you need the components (x and y)
later on, they are instantly available, as are the length and angle.
Peter Verkaik said...
Chip,
there is also a discussion (chapter 5 and·6) in that document on how to implement that on silicon,
where one needs a fast way to index the lookuptables.
If I find the time I might try·this out·on the javelin, as I already
have a cordic class for it. Just to see how much time can be saved by using tables.
regards peter
Peter (and Graham),
I got up to page 70 or so of that document and when I saw that he was proposing 2048K table entries, I quit reading, as this would not be practical in our chip.
I really appreciate all the thought you, Graham, Tracy, Paul·and the others have put into this question. You guys are much better versed in the math than I am, so I have to read what you've written very carefully to absorb it.·Paul Baker and I·had dinner with Tracy Allen last night and we were discussing something like Graham has proposed: rotating a polar pair to sum against the X axis and maybe shortening the work. I must read more of what Graham said regarding this.
The next chip will have a single-cycle signed/unsigned 16x16 multiplier built in, so I'm kind of leaning towards a simple 16-bit (X, Y, and angle) CORDIC system in the hub which would be pipelined. It could rotate (X,Y) by an angle and also rotate Y to 0 to return the length and angle. This would provide all the basic functions that we need and the pre-/post-scaling could be done externally in software using two multiply instructions. This way,·with 14 instructions in-between each CORDIC exchange, you could do all the pre- and post-work. I'm still anxious to know about any possible labor-saving devices, though. I have a hunch that there is some technique for doing this that is way out of the box and only afterwards could it be tied back to the math identities. It's just very hard to get there from here.
The thing in Jason Arbaugh's paper that really caught my attention was the Hybrid CORDIC algorithm described at the end of chapter 4. Thanks for that reference, Peter. The crux of the method is that the value of tangent(theta) approaches close to theta when theta is small., or, to say the the same thing, theta by itself becomes a better and better approximation to arctan(theta) as theta becomes small. That is what happens in CORDIC successive approximation...
arctan(1 / 2^i) => 1 / 2^i <--- where by "=>" it means within an error that decreases as i increases.
The hybrid algorithm uses the standard arctan radix up through about 1/3 of the iterations, and then (as shown in a paper by Wang et al., Arbaugh's reference 52), the algorithm stops reading from the arctan table and instead uses the value 1 / 2^i directly. That saves having to read from the table for 2/3 of the iterations, but there is more. At that point, with the residual angle now in a staight binary radix, the 1's and 0's are one and the same as the sequence of + and - rotations, so there is no longer any need for test and subtract. The algorithm just has to step through and add 1/2^i when a bit is one, and subtract that when a bit is zero. The last 2/3 could all be done in a special purpose combinatorical adder without any further looping, so it can be "parallel".
The CORDIC gain is not affected, because it has almost converged at the 1/3 point, and the angles continue to accumulate. The reference Albaugh has for the hybrid algorithm is his number 52, (Wang, Piuri and Swartzlander, "Hybrid CORDIC Algorithms" IEEE transactions on Computers, v46, pp1202-1207, 1967).
Arbaugh goes on to develop the table method to decrease the execution time for the first 1/3 of the algorithm, which is then combined with the hybrid method. I don't really have a handle on the table method, but there are significant memory requirements.
Just to make the point stronger about what is going on in the hybrid algorithm, here is a spreadsheet that shows how rapdily arctangent of 1 / (2^i) converges to 1 / (2^i). There is the difference, and also the difference normalized to the arctangent value (the percent diffeerence if you will). This convergence allows the substitution, within some small error bound.
Tracy Allen said...
The hybrid algorithm uses the standard arctan radix up through about 1/3 of the iterations, and then (as shown in a paper by Wang et al., Arbaugh's reference 52), the algorithm stops reading from the arctan table and instead uses the value 1 / 2^i directly. That saves having to read from the table for 2/3 of the iterations, but there is more. At that point, with the residual angle now in a staight binary radix, the 1's and 0's are one and the same as the sequence of + and - rotations, so there is no longer any need for test and subtract. The algorithm just has to step through and add 1/2^i when a bit is one, and subtract that when a bit is zero. The last 2/3 could all be done in a special purpose combinatorical adder without any further looping, so it can be "parallel".
The CORDIC gain is not affected, because it has almost converged at the 1/3 point, and the angles continue to accumulate. The reference Albaugh has for the hybrid algorithm is his number 56, (Wang, Piuri and Swartzlander, "Hybrid CORDIC Algorithms" IEEE transactions on Computers, v46, pp1202-1207, 1967).
Arbaugh was talking about double-precision IEEE 754 quality results, too, which have 53 bits of (including the leading 1) of mantissa. For a 16-bit system, we wouldn't even get past his 1/3 point where the adding could get paralleled. Just an observation. This is good to know, though for a high-quality double-precision CORDIC implementation that could be in software.
Thanks for explaining all that Tracy, and showing the convergence. I didn't glean nearly as much as you did from that paper.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Chip Gracey
Parallax, Inc.
Post Edited (Chip Gracey (Parallax)) : 10/15/2006 5:51:59 AM GMT
Chip, while there was mention of the 53 bit mantissa floating point, I don't see any specific reference to that specific bit depth in connection with the logic behind the hybrid algorithm. The formula relating the sufficient number of iterations using the standard arctan radix came from the formula given by Arbaugh as,
n = (N - log2(3)) / 3
where I presume N is the bit depth. The constant log2(3) = 1.58, so that would put n=14/3 = 5, rounded off. Say you play it on the safe side and take the 16 bit algorithm up to iteration 8 and switch at that point to straight 1/ (2^n) instead of the table lookup of arctan(.).
At that point the approximation of column two to column 3 is one part in 100 million absolute and one part in a million normalized to the "correct" arctan value.
That formula comes from an error analysis that is presumably contained in the paper by Wang et al., and i have not read that. But it seems to me that the argument would hold true at any bit depth. The summary implies that the margin of error in the "1/3 rule" is something like one least significant bit, something like that.
Note that theta is always at least a hair greater than arctan(theta), and that is one condition that ensures that there are no "blind spots" that the successive approximation cannot reach.
It seems that in order to exploit this halving phenomenon that occurs somewhere down the lookup table, you would not use, say, the value $20000000 as a 45-degree starting point, but rather something a little different that would resolve to simple powers of two from the bottom of the table upwards, right? If you look at my table below, you can see that at some point, the values just start halving. The trouble is, they need to follow the pattern: $80, $40, $20, ...$01 at the end, in order to lend themselves to parallelization, not these non-power-of-two values that have more than one bit set in them, causing them to not halve cleanly. This would mean that some angle scaling would always need to be performed, before and after a CORDIC transform, right? Or, some non-power-of-two number would represent 2pi.
Tracy Allen said...
Chip, while there was mention of the 53 bit mantissa floating point, I don't see any specific reference to that specific bit depth in connection with the logic behind the hybrid algorithm. The formula relating the sufficient number of iterations using the standard arctan radix came from the formula given by Arbaugh as,
n = (N - log2(3)) / 3
where I presume N is the bit depth. The constant log2(3) = 1.58, so that would put n=14/3 = 5, rounded off. Say you play it on the safe side and take the 16 bit algorithm up to iteration 8 and switch at that point to straight 1/ (2^n) instead of the table lookup of arctan(.).
At that point the approximation of column two to column 3 is one part in 100 million absolute and one part in a million normalized to the "correct" arctan value.
That formula comes from an error analysis that is presumably contained in the paper by Wang et al., and i have not read that. But it seems to me that the argument would hold true at any bit depth. The summary implies that the margin of error in the "1/3 rule" is something like one least significant bit, something like that.
Note that theta is always at least a hair greater than arctan(theta), and that is one condition that ensures that there are no "blind spots" that the successive approximation cannot reach.
Tracy, I like the fact the 1/2^i can be used directly but that is only true if your angle is represented in radians, it doesn't really matter I suppose as long as the bit of the table you do use is in radians and you are prepared to accept potential conversions I suppose, ah I think this might be what Chip is saying.
In terms of loosing the decision, on my examples they do seem to converge to adding but only for the last three steps.
Duh, now i get it. We don't actually have 1/2^i to use directly anyway so all you do it use the table of angles (as they are now) and then at some point start using the previous angle from the table divided by two as that equals the next value.
Chip,
Pages 58-61 explain how to get easy index for the tables.
The traditional arctangent radix of 2^-i was choosen to
reduce multiplications to simple shifts. (pages 58-59)
The parallel arctangent radix uses an other angle,
where the bits of the angle are grouped and used as index
for the tables. (pages 60-61). This is the tricky part.
I checked my Javelin arctangent table for use with this approach and get · private final int T = (short)20861; //16384/(pi/4) · private static int[noparse]/noparse atnTable = new int[noparse]/noparse{16384,9672,5110,2594,1302, ··········································· 652,326,163,81,41,20,10,5,3,1,0}; ································· //T*atn(1/1),T*atn(1/2),T*atn(1/4)...T*atn(1/32768) · private static int[noparse]/noparse atnTable2= new int[noparse]/noparse{20861,10430,5215,2608,1304, ············································652,326,163,81,41,20,10,5,3,1,0}; ································· //T*(1/1),T*(1/2),T*(1/4)...T*(1/32768)
And indeed, from 1/32 on (one third of the table), the values are identical.
On page 64 is a formulae to calculate storage requirements.
For n=16 bits and t=4 tables 192 bytes would be required
For n=32 bits and t=8 tables 768 bytes would be required
There is a diagram on page 65.
I realized that my simplified explanation of the 2-cordic add probably made it sound worse than it is and my last diagram didn't help much. Here is the better explanation, notice the rotate to the x-axis is really just calculating the angle between the two vectors
That was n=32 and t=4.
The diagram shows it to be almost 10000.
Calculation yields 6144 (the diagram has a log scale, so the middle between 1E+02 and 1E+04
is 1E+03 (1000).
I think we will have to wait for Chip's judgement, as cog ram is currently 2k, I'm guessing that even 500 bytes is considered quite a lot but its certainly better than 2k or 10k.
I just realized something else, if you know x,y,r,A for JUST ONE of the two vectors in the method above you can workout the vector sum with one cordic. It doesn't matter which one you just change the order of the addition. This could be VERY handy for certain things like vector intergrations, you keep all for variables of the main vector and just add a constantly rotating and scaling vector too it. WRONG!! (I thought it gave you the new x,y of the main vector but it doesn't.
Graham
Post Edited (Graham Stabler) : 10/15/2006 1:08:26 PM GMT
Graham, I think you are right, that instead of reading values from the table for i=n to 15, you take the n-1th value from the table and successivey divide it by two for the successive rotations, and add or subtract, depending on the bit in the final residual angle being 1 or 0. That is for the case of when the test is done on the accumulated angle to determine sine and cosine of theta. It is different when determining driving y to zero, where the test is done on y, not angle (determining arctangent from x and y).
Alternatively the binary value to represent pi/4 could actually be the binary for pi/4 = %1011010100000101 (16 bits). Then the 1 / 2^i values could be used directly. But there would be a scaling required to get brads and degrees, and I think the math to calculate for different quadrants might get pretty messy.
This algorithm and the Arbaugh table algorithm mesh together. The procedure Peter is talking with the tables covers the first 1/3 of the calculation, and the Wang hybrid approximation covers the final 2/3 of the way to the answer.
· That link is broken - does anyone have the .pdf saved, that they can legally make available?
Also, one quick question:·Is·the concept of spherical·trigonometry covered herein at all?· I understand that the concepts presented herein lend themselves nicely to planar geometry, however my application (great circle navigation) could surely benefit from the application of spherical trig...
Regardless, I plan on researching this to it fullest, in the interests of getting 'up to speed' as I am embarking on a journey that will either require the use of a uM-FPU, or an understanding/implementation of the concepts presented in this thread (or some darn lucky porting ;-)
In a cursory review of the content presented herein, I feel that I am walking in the land of giants, looking around in sheer awe.· I am humbled by·volume of knowledge that is seemingly offered up with no expectation of reciprocation - kudos and thanks!
There is a succinct diagram for the rotational transform in spherical coordinateds here: www.thediagram.com/4_6/moyer.html
And there is oodles of spherical trig to be found in technical astronomy books. I don't know of any CORDIC shortcuts in relation to these things.
There is a succinct diagram for the rotational transform in spherical coordinateds here: www.thediagram.com/4_6/moyer.html
And there is oodles of spherical trig to be found in technical astronomy books. I don't know of any CORDIC shortcuts in relation to these things.
Mr. Allen,
Thanks so much for taking the time to point out this resource.
In the interests of possibly helping someone else with a similar need, I have also located some pertinent information, as it relates to Great Circle navigation, here: http://williams.best.vwh.net/avform.htm
Have a wonderful day!
-t
it has algorithms for most of the cordic routines though it doesn't explain them.
Graham,
Thanks for pointing me toward the link.· I am still trying to wrap my mind around the mechanics of the implementations of the·CORDIC algorithm, thus it is actually the theory·that I am after.· However, I so appreciate your input, as I did not make that clear in my initial queries.
Regarding the theories, it is finally starting to sink in ... I think ... I am just looking for a couple more papers to help solidify what·I think is on the tip of my 'mental tongue'.
...
Also, I am trying to port the attached to a 32bit NXP MCU with some simple BASIC on it - integers only, no floats, etc.· It would be·inappropriate for me to request assistance specific to the product, however,·I will ask for assistance with understanding the attached as it relates to·the following two topics: Struggle #1:
In the attached, I am really struggling with grasping the concept of fixed point notation.· The preamble describes that the number unit is made up of 32 bits.· If I am understanding it correctly, b31 = sign bit, b30 & b29 are the integral portions, and b28 to b0 is the fraction portions.· Now, I fully get the sign and integral portions, but I am struggling with how exactly the use of the remaining 29 bits is able to accurately represent an arbitrary·fractional number, without some obnoxious scaling (with respect to a fixed precision) having to be implemented.· Does the implementation·build the fractional portion·from the inferred decimal point out to the right to b0 (i.e. of the <0·decimal portion, the LSb=b28 and the MSb=b0)?· In re-reading what I just typed, I think I have confused the matter further for myself...· Struggle #2:
Also, related to the attached, the 3rd and 4th·lines of code is causing me heartburn (Hold on, wait, don't go ... I think I can explain where you won't think me a total bumbling idiot.?. ):
Obviously, the code: #define One (010000000000L>>1) is initializing the var One with a single bit set.· What is not obvious to me is how and which bit, exactly, is being set.· The number is 10,000,000,000.· Shifting it right one bit seems to me to serve no logical purpose:
10,000,000,000·equals ·1001010100000010111110010000000000·(>32 bits and not seemingly a good pattern...)
After the shift, it is 0100101010000001011111001000000000, if I am understanding things correctly.
Shouldnt the·number be·· 00100000000000000000000000000000?· If so, what was the logic in using 10,000,000,000?
...
I think that if I can get pointed in the right direction with these two struggles, and get my mind wrapped around the whys and hows, then the 4th line (#define HalfPi (014441766521L>>1))·may make sense to me at that point...
...
As always, thank you, kind sirs!· Your assistance and graciousness to this point is, and will be, greatly appreciated!
-t
Did you also read this thread, Cordic Object, and also look at the reference there to Turner's book that has code in Matlab.
Often the core algorithm is implemented to cover the domain of convergence only, and then a wrapper is needed that will extend the domain of validity to a range appropriate to a particular application. That way you can get either speed or generality, two qualities that are often in conflict.
Comments
I don't know how to make it any clearer.
Graham
The original question was to add 2 vectors in polar properties
and return the result in polar properties.
It turns out we still require x and y to do that.
If you keep vectors in polar·format only you would
· convert vector1 to cartesian
· convert vector2 to cartesian
· add vectors
· convert result to polar
If you keep all properties you would
· add vectors
My point is, the adding itself requires less cordics when having all properties.
The initialization of a vector however also requires a cordic,
so in the end, the total number of cordics in the program
is probably only slighty less. But if you need the components (x and y)
later on, they are instantly available, as are the length and angle.
regards peter
I got up to page 70 or so of that document and when I saw that he was proposing 2048K table entries, I quit reading, as this would not be practical in our chip.
I really appreciate all the thought you, Graham, Tracy, Paul·and the others have put into this question. You guys are much better versed in the math than I am, so I have to read what you've written very carefully to absorb it.·Paul Baker and I·had dinner with Tracy Allen last night and we were discussing something like Graham has proposed: rotating a polar pair to sum against the X axis and maybe shortening the work. I must read more of what Graham said regarding this.
The next chip will have a single-cycle signed/unsigned 16x16 multiplier built in, so I'm kind of leaning towards a simple 16-bit (X, Y, and angle) CORDIC system in the hub which would be pipelined. It could rotate (X,Y) by an angle and also rotate Y to 0 to return the length and angle. This would provide all the basic functions that we need and the pre-/post-scaling could be done externally in software using two multiply instructions. This way,·with 14 instructions in-between each CORDIC exchange, you could do all the pre- and post-work. I'm still anxious to know about any possible labor-saving devices, though. I have a hunch that there is some technique for doing this that is way out of the box and only afterwards could it be tied back to the math identities. It's just very hard to get there from here.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Chip Gracey
Parallax, Inc.
arctan(1 / 2^i) => 1 / 2^i <--- where by "=>" it means within an error that decreases as i increases.
The hybrid algorithm uses the standard arctan radix up through about 1/3 of the iterations, and then (as shown in a paper by Wang et al., Arbaugh's reference 52), the algorithm stops reading from the arctan table and instead uses the value 1 / 2^i directly. That saves having to read from the table for 2/3 of the iterations, but there is more. At that point, with the residual angle now in a staight binary radix, the 1's and 0's are one and the same as the sequence of + and - rotations, so there is no longer any need for test and subtract. The algorithm just has to step through and add 1/2^i when a bit is one, and subtract that when a bit is zero. The last 2/3 could all be done in a special purpose combinatorical adder without any further looping, so it can be "parallel".
The CORDIC gain is not affected, because it has almost converged at the 1/3 point, and the angles continue to accumulate. The reference Albaugh has for the hybrid algorithm is his number 52, (Wang, Piuri and Swartzlander, "Hybrid CORDIC Algorithms" IEEE transactions on Computers, v46, pp1202-1207, 1967).
Arbaugh goes on to develop the table method to decrease the execution time for the first 1/3 of the algorithm, which is then combined with the hybrid method. I don't really have a handle on the table method, but there are significant memory requirements.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Tracy Allen
www.emesystems.com
Post Edited (Tracy Allen) : 10/15/2006 5:58:03 AM GMT
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Tracy Allen
www.emesystems.com
Post Edited (Tracy Allen) : 10/15/2006 5:54:45 AM GMT
Thanks for explaining all that Tracy, and showing the convergence. I didn't glean nearly as much as you did from that paper.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Chip Gracey
Parallax, Inc.
Post Edited (Chip Gracey (Parallax)) : 10/15/2006 5:51:59 AM GMT
n = (N - log2(3)) / 3
where I presume N is the bit depth. The constant log2(3) = 1.58, so that would put n=14/3 = 5, rounded off. Say you play it on the safe side and take the 16 bit algorithm up to iteration 8 and switch at that point to straight 1/ (2^n) instead of the table lookup of arctan(.).
From the spreadsheet,
At that point the approximation of column two to column 3 is one part in 100 million absolute and one part in a million normalized to the "correct" arctan value.
That formula comes from an error analysis that is presumably contained in the paper by Wang et al., and i have not read that. But it seems to me that the argument would hold true at any bit depth. The summary implies that the margin of error in the "1/3 rule" is something like one least significant bit, something like that.
Note that theta is always at least a hair greater than arctan(theta), and that is one condition that ensures that there are no "blind spots" that the successive approximation cannot reach.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Tracy Allen
www.emesystems.com
Post Edited (Tracy Allen) : 10/15/2006 6:36:24 AM GMT
Sorry I got some concepts twisted around.
It seems that in order to exploit this halving phenomenon that occurs somewhere down the lookup table, you would not use, say, the value $20000000 as a 45-degree starting point, but rather something a little different that would resolve to simple powers of two from the bottom of the table upwards, right? If you look at my table below, you can see that at some point, the values just start halving. The trouble is, they need to follow the pattern: $80, $40, $20, ...$01 at the end, in order to lend themselves to parallelization, not these non-power-of-two values that have more than one bit set in them, causing them to not halve cleanly. This would mean that some angle scaling would always need to be performed, before and after a CORDIC transform, right? Or, some non-power-of-two number would represent 2pi.
···32-bit······ 20-bit
···$20000000··· $20000· ;1
···$12E4051E··· $12E40· ;2
···$09FB385B··· $09FB4· ;3
···$051111D4··· $05111· ;4
···$028B0D43··· $028B1· ;5
···$0145D7E1··· $0145D· ;6
···$00A2F61E··· $00A2F· ;7
···$00517C55··· $00518· ;8
···$0028BE53··· $0028C· ;9
···$00145F2F··· $00146· ;10
···$000A2F98··· $000A3· ;11
···$000517CC··· $00051· ;12
···$00028BE6··· $00029· ;13
···$000145F3··· $00014· ;14
···$0000A2FA··· $0000A· ;15
···$0000517D··· $00005· ;16
···$000028BE··· $00003· ;17
···$0000145F··· $00001· ;18
···$00000A30··· $00001? ;19
···$00000518··· $00000· ;20
···$0000028C··· $00000· ;21
···$00000146··· $00000· ;22
···$000000A3··· $00000· ;23
···$00000051··· $00000· ;24
···$00000029··· $00000· ;25
···$00000014··· $00000· ;26
···$0000000A··· $00000· ;27
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Chip Gracey
Parallax, Inc.
In terms of loosing the decision, on my examples they do seem to converge to adding but only for the last three steps.
Graham
Pages 58-61 explain how to get easy index for the tables.
The traditional arctangent radix of 2^-i was choosen to
reduce multiplications to simple shifts. (pages 58-59)
The parallel arctangent radix uses an other angle,
where the bits of the angle are grouped and used as index
for the tables. (pages 60-61). This is the tricky part.
I checked my Javelin arctangent table for use with this approach and get
· private final int T = (short)20861; //16384/(pi/4)
· private static int[noparse]/noparse atnTable = new int[noparse]/noparse{16384,9672,5110,2594,1302,
··········································· 652,326,163,81,41,20,10,5,3,1,0};
································· //T*atn(1/1),T*atn(1/2),T*atn(1/4)...T*atn(1/32768)
· private static int[noparse]/noparse atnTable2= new int[noparse]/noparse{20861,10430,5215,2608,1304,
············································652,326,163,81,41,20,10,5,3,1,0};
································· //T*(1/1),T*(1/2),T*(1/4)...T*(1/32768)
And indeed, from 1/32 on (one third of the table), the values are identical.
regards peter
Graham
For n=16 bits and t=4 tables 192 bytes would be required
For n=32 bits and t=8 tables 768 bytes would be required
There is a diagram on page 65.
regards peter
regards peter
I realized that my simplified explanation of the 2-cordic add probably made it sound worse than it is and my last diagram didn't help much. Here is the better explanation, notice the rotate to the x-axis is really just calculating the angle between the two vectors
click on cordic2.jpg for bigger version
The diagram shows it to be almost 10000.
Calculation yields 6144 (the diagram has a log scale, so the middle between 1E+02 and 1E+04
is 1E+03 (1000).
regards peter
I just realized something else, if you know x,y,r,A for JUST ONE of the two vectors in the method above you can workout the vector sum with one cordic. It doesn't matter which one you just change the order of the addition. This could be VERY handy for certain things like vector intergrations, you keep all for variables of the main vector and just add a constantly rotating and scaling vector too it. WRONG!! (I thought it gave you the new x,y of the main vector but it doesn't.
Graham
Post Edited (Graham Stabler) : 10/15/2006 1:08:26 PM GMT
Alternatively the binary value to represent pi/4 could actually be the binary for pi/4 = %1011010100000101 (16 bits). Then the 1 / 2^i values could be used directly. But there would be a scaling required to get brads and degrees, and I think the math to calculate for different quadrants might get pretty messy.
This algorithm and the Arbaugh table algorithm mesh together. The procedure Peter is talking with the tables covers the first 1/3 of the calculation, and the Wang hybrid approximation covers the final 2/3 of the way to the answer.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Tracy Allen
www.emesystems.com
http://www.ece.utexas.edu/~adnan/comm-06/cordic.pdf
regards peter
Also, one quick question:·Is·the concept of spherical·trigonometry covered herein at all?· I understand that the concepts presented herein lend themselves nicely to planar geometry, however my application (great circle navigation) could surely benefit from the application of spherical trig...
Regardless, I plan on researching this to it fullest, in the interests of getting 'up to speed' as I am embarking on a journey that will either require the use of a uM-FPU, or an understanding/implementation of the concepts presented in this thread (or some darn lucky porting ;-)
In a cursory review of the content presented herein, I feel that I am walking in the land of giants, looking around in sheer awe.· I am humbled by·volume of knowledge that is seemingly offered up with no expectation of reciprocation - kudos and thanks!
-t
There is a succinct diagram for the rotational transform in spherical coordinateds here:
www.thediagram.com/4_6/moyer.html
And there is oodles of spherical trig to be found in technical astronomy books. I don't know of any CORDIC shortcuts in relation to these things.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Tracy Allen
www.emesystems.com
Thanks so much for taking the time to point out this resource.
In the interests of possibly helping someone else with a similar need, I have also located some pertinent information, as it relates to Great Circle navigation, here:
http://williams.best.vwh.net/avform.htm
Have a wonderful day!
-t
Post Edited (Wulffy) : 4/18/2007 3:52:03 PM GMT
Please advise.· TIA.
-t
www.voidware.com/cordic.htm
it has algorithms for most of the cordic routines though it doesn't explain them.
Thanks for pointing me toward the link.· I am still trying to wrap my mind around the mechanics of the implementations of the·CORDIC algorithm, thus it is actually the theory·that I am after.· However, I so appreciate your input, as I did not make that clear in my initial queries.
Regarding the theories, it is finally starting to sink in ... I think ... I am just looking for a couple more papers to help solidify what·I think is on the tip of my 'mental tongue'.
...
Also, I am trying to port the attached to a 32bit NXP MCU with some simple BASIC on it - integers only, no floats, etc.· It would be·inappropriate for me to request assistance specific to the product, however,·I will ask for assistance with understanding the attached as it relates to·the following two topics:
Struggle #1:
In the attached, I am really struggling with grasping the concept of fixed point notation.· The preamble describes that the number unit is made up of 32 bits.· If I am understanding it correctly, b31 = sign bit, b30 & b29 are the integral portions, and b28 to b0 is the fraction portions.· Now, I fully get the sign and integral portions, but I am struggling with how exactly the use of the remaining 29 bits is able to accurately represent an arbitrary·fractional number, without some obnoxious scaling (with respect to a fixed precision) having to be implemented.· Does the implementation·build the fractional portion·from the inferred decimal point out to the right to b0 (i.e. of the <0·decimal portion, the LSb=b28 and the MSb=b0)?· In re-reading what I just typed, I think I have confused the matter further for myself...·
Struggle #2:
Also, related to the attached, the 3rd and 4th·lines of code is causing me heartburn (Hold on, wait, don't go ... I think I can explain where you won't think me a total bumbling idiot.?. ):
Obviously, the code: #define One (010000000000L>>1) is initializing the var One with a single bit set.· What is not obvious to me is how and which bit, exactly, is being set.· The number is 10,000,000,000.· Shifting it right one bit seems to me to serve no logical purpose:
10,000,000,000·equals ·1001010100000010111110010000000000·(>32 bits and not seemingly a good pattern...)
After the shift, it is 0100101010000001011111001000000000, if I am understanding things correctly.
Shouldnt the·number be·· 00100000000000000000000000000000?· If so, what was the logic in using 10,000,000,000?
...
I think that if I can get pointed in the right direction with these two struggles, and get my mind wrapped around the whys and hows, then the 4th line (#define HalfPi (014441766521L>>1))·may make sense to me at that point...
...
As always, thank you, kind sirs!· Your assistance and graciousness to this point is, and will be, greatly appreciated!
-t
Post Edited (Wulffy) : 4/21/2007 5:07:57 AM GMT
Did you also read this thread, Cordic Object, and also look at the reference there to Turner's book that has code in Matlab.
Often the core algorithm is implemented to cover the domain of convergence only, and then a wrapper is needed that will extend the domain of validity to a range appropriate to a particular application. That way you can get either speed or generality, two qualities that are often in conflict.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Tracy Allen
www.emesystems.com