Shop OBEX P1 Docs P2 Docs Learn Events
CORDIC vs BKM — Parallax Forums

CORDIC vs BKM

Mark SwannMark Swann Posts: 124
edited 2015-04-21 09:31 in Propeller 1
All,

I have seen at least one thread discussing, in depth, the use of CORDIC algorithms in the the Propeller. I have not used CORDIC algorithms, but do have a need, so I did some research and discovered·a newer, and supposedly better, BKM algorithm. Has anyone explored this algorithm? How would its implementation in the Propeller be different? Is it better? Any thoughts are appreciated.

Thanks

Comments

  • Graham StablerGraham Stabler Posts: 2,510
    edited 2007-10-08 17:20
    Until I get back to work I can't download any of the juicy pdfs on the subject but it looks interesting. I suspect the implementation would be very similar, a loop and lookup table and some sort of condition that you assess on each loop. Cordic is actually really simple in terms of the programming, its the understanding and dealing with "simpler" issues like angle representation that can be tricky.

    Graham
  • Graham StablerGraham Stabler Posts: 2,510
    edited 2007-10-08 17:21
  • Mark SwannMark Swann Posts: 124
    edited 2007-10-08 17:45
    Graham Stabler said...
    Until I get back to work I can't download any of the juicy pdfs on the subject but it looks interesting. I suspect the implementation would be very similar, a loop and lookup table and some sort of condition that you assess on each loop. Cordic is actually really simple in terms of the programming, its the understanding and dealing with "simpler" issues like angle representation that can be tricky.

    Graham
    I did find this bit on Wikipedia:

    BKM provides a simpler method of computing some elementary functions,
    and unlike CORDIC, BKM needs no result scaling factor. The convergence
    rate of BKM is approximately one bit per iteration, like CORDIC, but
    BKM requires more precomputed table elements for the same precision
    because the table stores logarithms of complex operands.
    

    It claims to be simpler, but needs more table space for logarithms. The Prop has a log table. Maybe BKM can use that, instead of its own table.

    Is the lack of need for a result scaling factor really an issue?
  • Graham StablerGraham Stabler Posts: 2,510
    edited 2007-10-08 18:00
    I doubt you could use the inbuilt table sucessfully as they will be quite special values and if you start interpolating things will get slower and you might as well not bother. The CORDIC table is one value for each interation and you get one bit of accuracy per iteration so even at max resolution that's only 32 values so as long as it doesn't need masses of values I suspect there will be no problem.

    As far as scaling goes, in cordic you sometimes need to scale and other times not, generally if you care about the length of the vector produced at the end you do need to scale. Chip wrote a neat optimized multiplication routine to do this but it still has to be done and takes time.

    I'm going to have a read of that paper, it looks scary so wish me luck.

    Graham
  • cgraceycgracey Posts: 14,237
    edited 2007-10-09 07:39
    Graham,

    I looked at the .pdf, but it overwhelmed me pretty quickly.·Paul Baker stands a much better chance of figuring it out,·and so·do·you. The big question I have is: How big of lookup tables are required? This is the make/break issue, I think.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔


    Chip Gracey
    Parallax, Inc.
  • Mark SwannMark Swann Posts: 124
    edited 2007-10-09 15:14
    Chip Gracey (Parallax) said...

    Graham,

    I looked at the .pdf, but it overwhelmed me pretty quickly.·Paul Baker stands a much better chance of figuring it out,·and so·do·you. The big question I have is: How big of lookup tables are required? This is the make/break issue, I think.

    Chip,

    Quoted from the PDF: "CORDIC BKM needs the storage of (9*p)/2 constants, while CORDIC
    needs the storage of p constant."

    Mark
  • Tracy AllenTracy Allen Posts: 6,664
    edited 2007-10-09 16:01
    But I got stuck immediately on terminology. For example, what is a "redundant number" as opposed to one that is not? The BKM is "more suitable for computations in a redundant number system than the CORDIC...". It has to do with an understanding of subtle aspects of number systems. What is the difference between a "classical or signed-digit number system" and a "carry-save representation"? There is a lot of that.

    The algorithm itself based on generalization of the series log(1 + 1/(2^k)). That is the basis of some real valued shift-add algorithms for log and power functions, for example, he references the Briggs algorithm, which is described in more detail in chapter 20 of Arndt's book. But in this case, the decision variable is a complex number chosen from the set, d:{1, 0, -1, -i, i, i-1, i+1, -1-1, -i+1) and the values of log(1 + d/(2^k)) and the other functions of d are thus also complex valued. That gets pretty thick, pretty fast!

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Tracy Allen
    www.emesystems.com
  • Graham StablerGraham Stabler Posts: 2,510
    edited 2007-10-09 16:43
    Well it seems that an example of a redundant number system is signed digit, this allows each bit that makes up a number to be 0, 1 or -1 and that allows carry free high speed addition in hardware.

    http://en.wikipedia.org/wiki/Signed-digit_representation

    When you come to do the addition in Cordic you need to find the sign of y quite often and this means finding the sign of the highest non zero bit in the number rather than just looking at the msb as you would with normal 2's complement. This searching the number would remove the advantage of the carry free addition.

    The BKM algorithm seems to be to get around that problem and as we are not using signed digit representation there is no advantage as far as I can see.

    Graham
  • Graham StablerGraham Stabler Posts: 2,510
    edited 2007-10-09 16:53
    The question of scaling factors also seems to relate to the special number system because in Cordic the scaling factor is not a constant using that system and so you could not create optimized scaling hardware as easily.
  • cgraceycgracey Posts: 14,237
    edited 2015-04-20 22:50
    I was looking into this BKM algorithm today and there seems to be nothing new on the internet about it. Hence, this old thread popped up near the top of the search results. If there was some implementation to study, it might be usable, but I don't see any.

    I'm onto the pipelined hub CORDIC implementation now for Prop2 and I've been working on some K-factor correction, which makes it really easy to use. This BKM algorithm looks neat, but I think I'll stick with the CORDIC for now, because I understand it.

    To get K correction, I'll reduce the X and Y components by some right-shifted amounted within 16 of the 32 stages. I made a little SmallBASIC program to find the shift values:
    k = 0.6072529350088812561694
    j = 1.0
    t = 0
    for x = 0 to 39
      jj = j / pow(2,x)
      if j - jj > k then
          j = j - jj
          t++
          print x,1,j
      else
          print x,0,j
      endif
    next
    print
    print "total",t
    


    Here's the result:
    0	0	1	
    1	0	1	
    2	1	0.75	
    3	1	0.65625	
    4	1	0.615234375	
    5	0	0.615234375	
    6	0	0.615234375	
    7	1	0.610427856	
    8	1	0.608043373	
    9	0	0.608043373	
    10	1	0.60744958	
    11	0	0.60744958	
    12	1	0.607301277	
    13	0	0.607301277	
    14	1	0.60726421	
    15	0	0.60726421	
    16	1	0.607254944	
    17	0	0.607254944	
    18	0	0.607254944	
    19	1	0.607253786	
    20	1	0.607253207	
    21	0	0.607253207	
    22	1	0.607253062	
    23	1	0.60725299	
    24	1	0.607252954	
    25	1	0.607252935	
    26	0	0.607252935	
    27	0	0.607252935	
    28	0	0.607252935	
    29	0	0.607252935	
    30	0	0.607252935	
    31	1	0.607252935	
    32	1	0.607252935	
    33	0	0.607252935	
    34	1	0.607252935	
    35	0	0.607252935	
    36	1	0.607252935	
    37	0	0.607252935	
    38	0	0.607252935	
    39	0	0.607252935
    	
    total	19
    


    I'll make the last correction at #30 and ignore the ones below. That gets the error almost below 10 digits. It's really nice to get CORDIC results without X and Y being amplified by 1/~0.607252935!
  • Mark_TMark_T Posts: 1,981
    edited 2015-04-21 06:57
    Well it seems that an example of a redundant number system is signed digit, this allows each bit that makes up a number to be 0, 1 or -1 and that allows carry free high speed addition in hardware.

    Graham

    Internally I think all fast processors these days use this or more complex schemes to reduce the penalty for carry -
    the hardware can be a long way from the abstract programming model in order to eek out a bit more speed
    and power efficiency. Carry is a real show-stopper for 64 bit or double float multiply if not addressed. The other
    approach is pipelining as in a vector processor.
  • KyeKye Posts: 2,200
    edited 2015-04-21 07:36
    Design Compiler does all arithmetic using carry save logic. Basically, repeated mathematical operations are not done in binary. They are converted to binary and the carry is propagated at the very end of computation.
  • Tracy AllenTracy Allen Posts: 6,664
    edited 2015-04-21 09:31
    There's the rub. It needs a fast mechanism to convert from ordinary binary to the redundant representation and back again. Does that make sense for a single mathematical operation, say to compute one sin(x)?
Sign In or Register to comment.