CORDIC vs BKM
Mark Swann
Posts: 124
All,
I have seen at least one thread discussing, in depth, the use of CORDIC algorithms in the Propeller. I have not used CORDIC algorithms, but I do have a need, so I did some research and discovered a newer, and supposedly better, BKM algorithm. Has anyone explored this algorithm? How would its implementation in the Propeller differ? Is it better? Any thoughts are appreciated.
Thanks
Comments
Graham
It claims to be simpler, but needs more table space for logarithms. The Prop has a log table. Maybe BKM can use that, instead of its own table.
Is the lack of need for a result scaling factor really an issue?
As far as scaling goes, with CORDIC you sometimes need to scale and other times not; generally, if you care about the length of the vector produced at the end, you do need to scale. Chip wrote a neat optimized multiplication routine to do this, but it still has to be done and takes time.
I'm going to have a read of that paper, it looks scary so wish me luck.
Graham
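To make the scaling point above concrete, here is a minimal Python sketch (not Propeller code) of rotation-mode CORDIC. It shows the gain K that the optimized multiply has to correct for; the iteration count of 32 is just illustrative:

```python
import math

def cordic_rotate(x, y, angle, iters=32):
    """Rotate (x, y) by `angle` radians with shift-add CORDIC.
    The raw result comes out scaled by the CORDIC gain K."""
    for i in range(iters):
        d = 1 if angle >= 0 else -1
        # Each stage is only shifts and adds in fixed-point hardware.
        x, y = x - d * y * 2.0**-i, y + d * x * 2.0**-i
        angle -= d * math.atan(2.0**-i)
    return x, y

# K = product of sqrt(1 + 2^-2i); 1/K is the ~0.607252935 factor.
K = math.prod(math.sqrt(1 + 4.0**-i) for i in range(32))

x, y = cordic_rotate(1.0, 0.0, math.pi / 6)
print(x / K, y / K)  # after scaling, ~= (cos 30deg, sin 30deg)
```

Without the final division by K (or an equivalent in-loop correction), the output vector is about 1.6468 times too long.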
I looked at the .pdf, but it overwhelmed me pretty quickly. Paul Baker stands a much better chance of figuring it out, and so do you. The big question I have is: how big are the lookup tables required? This is the make/break issue, I think.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Chip Gracey
Parallax, Inc.
Quoted from the PDF: "BKM needs the storage of (9*p)/2 constants, while CORDIC needs the storage of p constants."
Mark
The algorithm itself is based on a generalization of the series log(1 + 1/(2^k)). That is the basis of some real-valued shift-add algorithms for log and power functions; for example, he references the Briggs algorithm, which is described in more detail in chapter 20 of Arndt's book. But in this case, the decision variable is a complex number chosen from the set d: {1, 0, -1, -i, i, i-1, i+1, -1-i, -i+1}, and the values of log(1 + d/(2^k)) and the other functions of d are thus also complex-valued. That gets pretty thick, pretty fast!
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Tracy Allen
www.emesystems.com
http://en.wikipedia.org/wiki/Signed-digit_representation
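For readers who haven't met the real-valued shift-add idea mentioned above, here is a hedged Python sketch of a Briggs-style logarithm: the only stored constants are ln(1 + 2^-k), and each iteration costs one shift and one add. The table size and input range are illustrative choices, not taken from the BKM paper:

```python
import math

# Table of ln(1 + 2^-k) -- the only constants the method stores.
TABLE = [math.log(1 + 2.0**-k) for k in range(1, 33)]

def shift_add_log(t):
    """Approximate ln(t) for t in [1, 2) by greedily factoring t
    into terms of the form (1 + 2^-k), Briggs-style."""
    p, result = 1.0, 0.0
    for k in range(1, 33):
        q = p + p * 2.0**-k        # p * (1 + 2^-k): shift and add
        if q <= t:                 # keep the factor if we don't overshoot
            p = q
            result += TABLE[k - 1]
    return result

print(shift_add_log(1.5))  # ~= ln(1.5) = 0.405465...
```

BKM generalizes this by letting the per-step digit d be complex, which is why its table is several times larger.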
When you come to do the addition in CORDIC, you need to find the sign of y quite often, and with a signed-digit representation this means finding the sign of the highest non-zero digit in the number rather than just looking at the MSB as you would with normal 2's complement. Searching the number like this would remove the advantage of the carry-free addition.
The BKM algorithm seems designed to get around that problem, and since we are not using a signed-digit representation, there is no advantage as far as I can see.
Graham
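A tiny Python illustration of the sign-detection point above (the digit set {-1, 0, 1} and most-significant-first ordering are just one common convention):

```python
def sd_sign(digits):
    """Sign of a signed-digit number with digits in {-1, 0, 1},
    most significant first. Leading digits can be zero, so unlike
    2's complement we must scan for the first non-zero digit;
    that digit dominates everything below it and fixes the sign."""
    for d in digits:
        if d != 0:
            return 1 if d > 0 else -1
    return 0

print(sd_sign([0, 1, -1]))   # value 2 - 1 = 1, sign +1
print(sd_sign([0, -1, 1]))   # value -2 + 1 = -1, sign -1
```

That scan is the serial step that eats into the benefit of carry-free addition.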
I'm onto the pipelined hub CORDIC implementation now for Prop2 and I've been working on some K-factor correction, which makes it really easy to use. This BKM algorithm looks neat, but I think I'll stick with the CORDIC for now, because I understand it.
To get K correction, I'll reduce the X and Y components by some right-shifted amount within 16 of the 32 stages. I made a little SmallBASIC program to find the shift values:
Here's the result:
I'll make the last correction at stage #30 and ignore the ones below. That gets the error down to about the 10th digit. It's really nice to get CORDIC results without X and Y being amplified by 1/~0.607252935!
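As a hedged illustration of the idea (not Chip's actual SmallBASIC program or shift list, which aren't shown here), a greedy search in Python for right-shift factors (1 - 2^-s) whose product approximates 1/K:

```python
import math

# CORDIC gain after 32 stages; raw X/Y come out multiplied by K,
# i.e. by 1/~0.607252935.
K = math.prod(math.sqrt(1 + 4.0**-i) for i in range(32))

# Greedily choose right-shift amounts s so that applying
# (1 - 2^-s) at those stages cancels the gain. Each factor is
# just "x -= x >> s" in integer hardware.
target = 1.0 / K
shifts, acc = [], 1.0
for s in range(1, 31):
    if acc * (1.0 - 2.0**-s) >= target:  # don't undershoot 1/K
        acc *= 1.0 - 2.0**-s
        shifts.append(s)

print(shifts)
print(acc * K)  # ~1.0 once the factors are applied
```

Stopping the search at shift 30, as described above, leaves a residual on the order of 2^-30, consistent with roughly 10-digit accuracy.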
Internally, I think all fast processors these days use this or more complex schemes to reduce the penalty for carry propagation; the hardware can be a long way from the abstract programming model in order to eke out a bit more speed and power efficiency. Carry is a real show-stopper for 64-bit or double-float multiply if not addressed. The other approach is pipelining, as in a vector processor.