Shop OBEX P1 Docs P2 Docs Learn Events
fast int sqrt for the prop - Page 2 — Parallax Forums

fast int sqrt for the prop

2»

Comments

  • VIRANDVIRAND Posts: 656
    edited 2008-08-28 00:02
    I'm surprised no one used the LOG Tables!
  • Tracy AllenTracy Allen Posts: 6,662
    edited 2008-08-28 02:06
    Chip, I'd very much like the ratio computation, right up there on a priority level with sqrt. You did that in hardware as a separate circuit?

    The way I have been doing it in pasm is the following and I wonder in the spirit of this thread if the code can be shortened.

    '  convert ratio y / x, where y < x, into a binary fraction f 
    ' enter with y, x:{y > x, x < 2^31, y <= 2^31}
    ' exit with binary fraction, f,   where f is the numerator of the approximation, y /x = f / (2^n).
    ' 
    fractyxb              ' enter here , n contains number of bits, usually will be 32 bits for frqx calc.
        mov f,#0    ' result
    :loop
        shl y,#1        ' double the value of y
        cmpsub y, x   wc,wr  ' if y>=x then y:=y-x, carry:=1, else a unchanged, carry:=0
        rcl f,#1        ' shift in carry, double the estimate of f, appoximation f/(2^n) =~ y/x
        djnz n,#:loop    ' do all bits
    fractyxb_ret    ret
    
    



    Here is the equivalent in spin, which can then be used with the ** operator when the number of bits=32 for calculating the trinary
    result = z * y / x. (result := f ** z)



    PRI fractyxzb (y, x,  b) : f
    '' calculate f = y/x * 2^nb
    '' enter with y < x, and b=number of bits in approximation 
    '' b is number of bits
    '' same as f: y/x = f / (2^nb) < y/x < (f+1)/(2^nb), 
    '' that is, f over a power of two is a close appoximation to the original fraction.
      repeat b                      
        y <<= 1
        f <<= 1
        if y => x    '  
          y -= x
          f++
    

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Tracy Allen
    www.emesystems.com
  • cgraceycgracey Posts: 14,133
    edited 2008-08-28 03:57
    Tracy,

    I think the only way that routine can get sped up is by straight-lining each iteration. In the next Propeller you could do this:

    ··········· rep···· [noparse][[/noparse]32,3]····· 'repeat 3 instructions 32 times

    ··········· nop················ 'must execute two instructions here

    ··········· nop

    ··········· shl···· x,#1······· 'begin 3-instruction block

    ··········· cmpsub· x,y····· wc

    ··········· rcl···· q,#1······· 'total cycles = 3 + 3*32 = 99



    With the divider circuit, you'd do this:

    ··········· setdivx x·········· 'set x

    ··········· setdivy y,[noparse][[/noparse]32]····· 'set y, begin 32 iterations

    ··········· (nop)·············· 'you could execute up to 31 instructions here

    ··········· getdivq q·········· 'get result (waits), total cycles = 2 + 31 + 1 = 34

    So, it's ~3 times faster, but works in the background, so in a way it only takes 3 instructions (33x faster). This is the way the CORDIC works, too.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔


    Chip Gracey
    Parallax, Inc.

    Post Edited (Chip Gracey (Parallax)) : 8/28/2008 4:03:09 AM GMT
  • AleAle Posts: 2,363
    edited 2008-08-28 08:31
    In a way how many FPUs work, I mean in the background. If we could access external memory fast enough, I mean at 160 MHz we could safely use SDRAM maybe at 66 or 80 MHz, that could easily compete with some multimedia optimized processors in the type or ARM or similar. There are many multimedia players with limited resources that decode video at 160x128 or 320x240 and 24 to 30 fps (No idea if it is doable).

    The SH-1 (and I think SH-2) have an iterative division that takes 32 or so instructions and cycles. but htis way is better because you have 31 instructions to do something.

    In a not so related note, is there any help for pointers ?, scale, auto-post or -pre decrement/increment ? (Maybe it is moot with the speed).

    What happens with consumption ? is there an estimate or something like that ?
  • Cluso99Cluso99 Posts: 18,069
    edited 2008-08-29 03:52
    Hey... Did anyone notice this...
    Chip said...

    ··········· rep···· [noparse][[/noparse]32,3]····· 'repeat 3 instructions 32 times

    WOW !!!!!
  • AleAle Posts: 2,363
    edited 2008-08-29 10:26
    I did. The newest way to do a loop without variable and without jump instruction. Pretty neat. I do not see the hour this chip shows up at my desk !!.
    Between this and the other thread I lost the count on how many improvements this prop 2 has, many more than what it was outlined in that long thread from last year We need it !. At last something we want will be created, just for us (and some other customers [noparse]:)[/noparse] ).
Sign In or Register to comment.