Approximating Trig Functions

Jack Crenshaw · 2009-02-19 19:16

Pi Guy's recent posts concerning the math of ping radar bring up an interesting and perennial problem: Given a range and an angle, how do you computer the equivalent vector in Cartesian coordinates? This question ultimately boils down to calculating sine and cosine from the angle. I think that's a topic worth discussing in its own right.

Now, if I'm navigating a B-2 bomber from Missouri to Tehran, I need pinpoint navigation to second-of-arc accuracy. But if I'm trying to get a model robot from the living room to the kitchen, I don't need nearly as much accuracy.

The burning questions then become: How much is enough? And how do I get it?

The standard approach, used in computers since the IBM 702, uses power series. That solution fits well with compuiters with limited memory but lots of computer time. But series get less accurate when you truncate them, which of course you always must do. So you get maximum error at the maximum angle, which is usually at 45 degrees (folding quadrants takes care of the rest).

Optimizing the power series using Chebyshev polynomials is one response to this. You get controlled error within the range, and zero error at both end points.

The other alternative is simply a table lookup, like we did in high school (remember "proportional parts"?). Linear interpolation is always good, but needs lots of tabular points. Higher-order interpolation, include spline fits, is better.Some years ago, I did the table lookup thing, by fitting both the function and its derivatives through, as I recall, third order. But all the tabular methods do require tables of some size or another.

Whether you use formulas or tables or some combination, you still must decide how much accuracy is enough. And I think that question tends to drive the implementation.

Decades ago, someone wrote a column for Dr. Dobbs, suggesting that for low-accuracy requirements, we just fit a straight line through key points. I think he used segments of 45 degrees each. The interesting thing is that, when you do this, the optimal arrangement is _NOT_ to fit tabular points, like interpolation, but rather to allow some error at both ends.

Let me explain further. If I define points at, say, 0 and 45 degrees, I get perfect accuracy at both extremes, but I get a large error at 22.5 degrees. But if I relax the requirement of perfect accuracy at the end points, I can have errors at 0, 22.5, and 45, but the maximuim error will be smaller. This is the "minimax" concept.

I think it would be great fun -- not to mention useful -- to explore how we can best generate moderate-accuracy results fast, using some minimax technique.

Anyone up for it?

Jack

Decades ago,

Phil Pilgrim (PhiPi) · 2009-02-19 19:29

Jack,

There is a table of sines in the Propeller's ROM. It has 2049 16-bit entries for the 0 - 90° interval. This is a fine enough resolution (0.0439°) to preclude the necessity for interpolaiton in many cases.

-Phil

Jack Crenshaw · 2009-02-19 19:34

I forgot to mention: Someone's bound to point out the CORDIC solution. That's really a subset of the notion of splitting up the angle range into segments. Splitting the full angle range, 0..360, into 8 parts is natural. It comes from simple trig identities. Beyond that, you can split it into smaller segments using double-angle formulas. That's a good thing, too, though it takes more time.

I'm partial to using two unequal-size segments: One 15 deg wide over the 0..15-degree range. The other 30 deg wide, covering the 15..45 range, and centered at 30 deg. Since the angle in the last case goes from -15 to +15, the same formulas work with a minimum of logic.

CORDIC is sort of like that, only you split the angle up into segments organized by bit pattern -- high bit for 45 deg, say, next high to 22.5, next to 11.25, etc. In the end you need to store fewer sine-cos values (ultimate limit = 2) but need more comp time tp dp the double-angle formula thing.

Jack

SRLM · 2009-02-19 19:56

While not exactly what you are working on, this is in a similar vein. Still haven't gotten a complete working solution yet, but soon...

http://forums.parallax.com/showthread.php?p=767169

Andrew E Mileski · 2009-02-20 00:06

I've written my own trig functions before. I'm quite proud of my arctan derivation, as I spent the most time on it. It uses a S31.32 representation, which is more accurate than an IEEE float, but with far less range. Of course it is much faster than floating point, especially without all the IEEE special cases like NaNs and such. I keep meaning to translate it into SPIN.

All the approximations are done as terms in a series, so if you know the next term is smaller than can be represented, and won't change the approximation, then no more terms are needed. This is normally done on the worst case, so that a fixed number of iterations can always be done.

I don't like using tables, but it is possible to use the table values as a starting point for better approximations, or using several table entries as control points on an easier function to calculate. Lots of math I've forgotten from university days decades ago.

Aside: One of my favorites is calculating PI using x_n+1 = x_n + cos x_n. It is an example of constantly refining an approximation. Start with say 1, and shift the result left to get PI. It converges VERY fast, so even a few terms of the cos series will do.

Post Edited (Andrew E Mileski) : 2/20/2009 12:20:54 AM GMT

Jack Crenshaw · 2009-02-20 04:39

Phil Pilgrim said...
There is a table of sines in the Propeller's ROM. It has 2049 16-bit entries for the 0 - 90° interval. This is a fine enough resolution (0.0439°) to preclude the necessity for interpolaiton in many cases.

Well, dang! There I go again, trying to solve problems that don't exist. Is there anything Chip didn't think of already?

Guess I'd better read the manual before I post any more speculations.

Jack

Beau Schwabe · 2009-02-20 04:51

Jack Crenshaw,

There is also a LOG table built-in to the Propeller ROM.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Beau Schwabe

IC Layout Engineer
Parallax, Inc.

Jack Crenshaw · 2009-02-20 04:56

Andrew E Mileski said...
I've written my own trig functions before. I'm quite proud of my arctan derivation, as I spent the most time on it. It uses a S31.32 representation, which is more accurate than an IEEE float, but with far less range. Of course it is much faster than floating point, especially without all the IEEE special cases like NaNs and such. I keep meaning to translate it into SPIN.

Andrew, before parking the atan function, take a look at the continued-fraction form. It converges MUCH faster than the power series. It's also way cool:

When you truncate the fraction at some order, you can convert it to a ratio of polynomials. See the attached formulas.

I've got LOTS more on the arctan, including minimax of errors.

Jack

Jack Crenshaw · 2009-02-20 05:01

Beau Schwabe (Parallax) said...
There is also a LOG table built-in to the Propeller ROM.

And exp(x)? And atan(x)? If yes, all problems are solvable!

Jack

Beau Schwabe · 2009-02-20 05:23

Jack Crenshaw,

Page 420 in the manual...

Log Table ($C000-$CFFF) - The log table contains data used to convert unsigned numbers into base-2 exponents.

Anti-Log Table($D000-$DFFF) - The anti-log table contains data used to convert base-2 exponents into unsigned numbers.

Sine-Table ($E000 - $F001) - The sine table provides 2,049 unsigned 16-bit sine samples spanning from 0 deg to 90 deg, inclusively (0.0439 deg resolution)

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Beau Schwabe

IC Layout Engineer
Parallax, Inc.

virtuPIC · 2009-02-20 08:05

We can also combine several different technologies. Starting from the tables, then doeing CORDIC, and finally do some interpolation with 3rd ord 4th degree splines. Just an idea so far. Haven't looked into accuracy and performance.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Airspace V - international hangar flying!
www.airspace-v.com/ggadgets for tools & toys

Tracy Allen · 2009-02-20 09:09

Hi Jack,

I have a well thumbed copy of your book and I really dig the continued fractions and all. But hey! I've always been curious why your book didn't mention CORDIC solutions. (Yes, someone was bound to mention it!). The crux of the matter for the Propeller is lack of a hardware multiplier, and too, that it needs to make very efficient use of limited code space. CORDIC does not need multiplies, only bit shifts, and with more work (a lot of it!) the same machinery can be used for a range of trig and hyperbolic functions, and it can throw in a multiply for good measure (The CORDIC gain). It converges pretty fast, one bit per iteration in the 1st and 4th quadrant, so no half angles etc are necessary, and the wrapper for 4 quadrants is simple reflection of quadrants 2 and 3 through the origin. What's not to like?

I think the Propeller version 2 will have a hardware multiplier, and there is a rumor that it will also include CORDIC hardware, maybe in the counter modules or who knows, but suffice it to say that Chip likes complex harmonies. In the meantime, his assembly language for the current Propeller includes instructions that are honed for that kind of algorithm. For example, here is a snippet from Graham Stabler's algorithm that started off that thread "Some Cordic Documentation":

sumc    cx,dx                   ' Add -dx to cx if c=1 or dx if c=0 
            sumnc   cy,dy                   ' Add -dy to cx if c=0 or dy if c=1
             'Next add or subract the specified rotation from the angle accumulator   
:lookup     sumc    ca,table                ' Add -table to ca if c=1 or table if c=0
                                                        ' Remembering that the source field of the sumc command "table"
                                                        ' is being over written to address different elements
                                                        ' of the table below 
            add     :lookup,#1              ' Increments the destination of lookup sumc command

The sumc and sumnc effect either an addtion or a subtraction based on the result of an earlier comparison (rotated under or over?), and then the another sumc grabs a value from the table of arctangents and conditonally adds or subtracts it from the accumulator that tracks of the total rotation. Then the table address is incremented, but in the uniquely Propeller way of self-modifying the lookup instruction in place. Then back for another bit. The core algorithm is short and fast.

Anyway, I don't think of CORDIC as a poor sister to polynomials or continued fractions. It may involve a table lookup and successive approximation, but it's clever stuff.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Tracy Allen
www.emesystems.com

Jack Crenshaw · 2009-02-20 14:24

The problem with CORDIC is its speed. But I agree with you: If there's no hardware multiply, CORDIC is as fast as the multiply itself.

Jack

Andrew E Mileski · 2009-02-20 17:28

Jack Crenshaw said...
Andrew, before parking the atan function, take a look at the continued-fraction form. It converges MUCH faster than the power series. It's also way cool:

When you truncate the fraction at some order, you can convert it to a ratio of polynomials. See the attached formulas.

I've got LOTS more on the arctan, including minimax of errors.

Jack

It takes 3 iterations for my arctan to converge with 32 bits to the right of the decimal, but thanks, I will look at your method.

Jack Crenshaw · 2009-02-23 03:31

Re the arctan algorithm: I've looked at all the messages in the thread, and the threads referenced by SRLM
http://forums.parallax.com/forums/default.aspx?f=25&m=310282
and Rayman
http://www.dsprelated.com/showmessage/21872/1.phpand

Looks like you guys have been working this one for awhile! ;-} I do have a few thoughts to add:

1) They all work. If you limit the range (pi/12) is best), the continued-fraction version is really, really simple.

2) In its "exact" form, the truncated CF has all its error at the extremes of the range. However, you can get DRAMATIC reductions in error by "minimaxing" the the coefficients slightly. It's analogous to using Chebyshev polynomials, but they only work for power series. I'm partial to solutions that give zero error at both extremes. That way, you get no discontinuities when you cross region boundaries.

3) Andrew's comment led me to look at iterative approaches. I came across this formula, which is truly remarkable. Given input x, compute

y = x
loop (if you're really paranoid about accuracy
y = y + (x - tan(y))/(1+x*tan(y))
end loop
atan(x) = y

There is nothing magic about the formula itself. It's the usual double-angle form. Nothing unusual about the method, either. It's the most fundamental replacement method. What's remarkable about the algorithm is just how fast it converges. Over the entire range, 0..1 (no need for segmenting) it's accurate to 0.2 deg in a _SINGLE_ iteration. After two, it has a max error of 0.003 ARC SECONDS!!! (essentially perfect). Tol'ja it's remarkable.

Even more remarkable, change the initial guess to y -= (pi/4)*x, and it's good for 25 arc-sec with one iteration, with zero error at both extremes. It don't get no better than that, Jack.

OTOH, Tracy shamed me into looking at the CORDIC approach. I don't think it works in the classical fashion. That is to say, you can't just store a table of fixed steps in the input, because its relation to the output is tangled up with that double-angle formula. However, you _CAN_ use successive approximation, just as you'd do with an A/D converter. In effect, you're asking the computer "is my current guess too high or too low?" Then you add the next least significant bit to it, or not.

The beauty is, you only need about 16 bits (180/2048 degrees). And you've already got most of them, by the initial guess of y. That leaves only a handful of bits still to determine.

This approach still doesn't do away with all divisions. You still need to divide the sine by the cosine. But except for that, you need only bit-flipping operations. The conditional add is perfect for this.

Hope y'all find all this useful.

4) Chip et al: what are the chances of an arctan table in ROM, in the next version?

Jack

Approximating Trig Functions

Comments