Any ways to speed up Float32? - Page 4 — Parallax Forums

Any ways to speed up Float32?

Comments

  • Duane Degn Posts: 10,588
    edited 2012-12-17 00:29
    SRLM wrote: »
    I'm interested :) I have a version of F32 that I'm getting ready to post that has a built in "interpreter" to do sequences of float operations at PASM speed without needing the Spin calling cog. However, it relies on the fact that all of the F32 commands are standardized, and don't have extra functionality outside of the PASM.

    Fixing the PASM part of the code is currently beyond my programming abilities. I don't think it would be hard to write this "patch" in PASM, but I don't think there is currently room left in the cog.

    Here's my latest patch.

    I added a few constants which I use in the patch code.
    CON
      ONE_ = 1.0
      NEG_ONE = -1.0
      ZERO = 0.0
      PI_OVER_TWO = pi / 2.0
      THREE_PI_OVER_TWO = (3.0 * pi) / 2.0
        
    

    And here's my latest "ATan2" method.
    PUB ATan2(a, b)
    {{
      Arc Tangent of vector a, b (in radians, no division is performed, so b==0 is legal).
      Parameters:
        a        32-bit floating point value
        b        32-bit floating point value
      Returns:   32-bit floating point value (angle in radians)
    }}
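      if b == ZERO                  ' zero denominator: answer without calling the cog
        if a == a & $7FFF_FFFF      ' sign bit clear, so a is not negative
          return PI_OVER_TWO
        else
          return THREE_PI_OVER_TWO 
      result := b & $7FFF_FFFF      ' absolute value
      if b == result                ' reduce b to its sign: 1.0 or -1.0
        b := ONE_
      else
        b := NEG_ONE 
      a := FDiv(a, result)          ' divide a by |b| so the ratio a/b is unchanged
      result := cmdATan2            ' hand the ATan2 command to the PASM cog
      f32_Cmd := @result
      repeat                        ' wait until the cog clears f32_Cmd
      while f32_Cmd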
     
    

    It checks to see if the denominator, "b", is zero; if "b" is zero, it then sets the result to either "pi/2" or "3*pi/2" depending on the numerator's sign. Non-zero denominators are set to "1" or "-1" and the numerator is divided by the absolute value of the original denominator.

    I don't know what values of "b" cause erroneous results, but I haven't seen any problems if the "b" value is limited to "-1", "1". I haven't tested zero "b" values to see if there are any "a" values that would cause a problem.

    I've attached a modified F32 with the above patch.

    For my needs, the trig functions of F32 are the most important aspect of the program. The inverse kinematics code I'm working on has a lot of ATan2 calls. I'm really hoping Jonathan or someone else who understands the code can fix this problem.
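
The patch above leans on the IEEE-754 layout: bit 31 of a 32-bit float is the sign, so masking with $7FFF_FFFF gives the absolute value. Below is a minimal C sketch of that masking for trying the idea on a PC. The helper names (float_to_bits, bits_to_float, abs_by_mask, unit_sign) are made up for illustration and are not part of F32.

```c
#include <stdint.h>
#include <string.h>

/* Copy a float's bytes into a 32-bit integer (its IEEE-754 bit pattern). */
uint32_t float_to_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Inverse of float_to_bits. */
float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Clear the sign bit, like `b & $7FFF_FFFF` in the Spin patch. */
float abs_by_mask(float f)
{
    return bits_to_float(float_to_bits(f) & 0x7FFFFFFFu);
}

/* 1.0 or -1.0 depending on the sign bit, like the ONE_ / NEG_ONE step. */
float unit_sign(float f)
{
    return (float_to_bits(f) & 0x80000000u) ? -1.0f : 1.0f;
}
```

For example, abs_by_mask(-2.5f) gives 2.5f and unit_sign(-31.0f) gives -1.0f. The bit view also suggests a corner case worth testing: negative zero has the pattern $8000_0000, so an integer comparison against 0.0 would not match it.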
  • Lawson Posts: 870
    edited 2012-12-17 15:45
    Duane: The F32 assembly code uses a cordic loop for Atan2 while Float32Full uses reverse table lookup. F32 *should* be more accurate, but it looks like something got messed up in the scaling? Note, Atan(x) is the same as Atan2(x, 1.0). It's likely only Atan(x) was tested, so this bug got missed.

    Have you run the same numbers through Excel or a programming language? It'd be nice to confirm how good Float32Full is getting the calculation.

    Lawson
  • Duane Degn Posts: 10,588
    edited 2012-12-17 15:56
    Lawson wrote: »
    Have you run the same numbers through Excel or a programming language? It'd be nice to confirm how good Float32Full is getting the calculation.

    I ran a bunch of the above numbers through Windows Calculator and they match Float32Full out to four decimal places.

    I noticed that Atan uses Atan2. That's why I decided to test using one as the denominator with ATan2.
  • lonesock Posts: 917
    edited 2012-12-20 14:19
    Hey, everybody, sorry for the lack of response.

    My brain is at capacity until a little after Christmas. I will review this stuff as soon as I have some time.

    thanks,
    Jonathan
  • Duane Degn Posts: 10,588
    edited 2013-03-21 15:12
    Jonathan,

    I hope you had a nice Christmas. Any chance you'll have time to take a look at the ATan2 bug soon?

    Thanks,

    Duane
  • Duane Degn Posts: 10,588
    edited 2013-04-22 13:11
    Hey Jonathan,

    I've seen you out and about on the forum and I'm hoping you have time to take a look at the ATan2 issue. Let me know if there's anything I can do to help (hopefully something I know how to do). I could run some test numbers through the new method, when you're ready, if you like.

    Thanks again for creating F32. I use it all the time.
  • lonesock Posts: 917
    edited 2013-04-22 14:46
    Duane, please try this. Also, please accept the first annual Patience with Lonesock Award!

    Jonathan
  • lonesock Posts: 917
    edited 2013-05-01 09:03
    Has anyone found any problems with this latest version? With the updated OBEX working now I'd like to post the latest F32.

    thanks,
    Jonathan
  • Duane Degn Posts: 10,588
    edited 2013-05-01 10:52
    Jonathan,

    Thank you very much for fixing this. Sorry for not testing earlier.

    Something strange is going on in my test program. When I use the new version of F32 it crashes, but if all the floats are done with Float32Full it doesn't crash.

    Give me a bit to pin point the code causing the crash. I'll post the test code here once I find the exact F32 call causing the problem.

    Edit: I'm starting to think the crash is caused by something unrelated to the floating point calculations. I'll post more once I'm sure.
  • Duane Degn Posts: 10,588
    edited 2013-05-01 11:26
    I don't see any problems with the new F32.

    My problem with the Prop apparently crashing was the result of my using the "DTR" check box to reset the Propeller. Once the "DTR" check box is used it stays active, so when I pressed the space bar to trigger the next set of data to display, it would check and uncheck "DTR" instead. The second space bar press would cause the Propeller to reset. I need to remember the Parallax Serial Terminal has this feature.

    Here's my test program:
    CON
    
      _CLKMODE = XTAL1 + PLL16X 
      _XINFREQ = 5_000_000
     
    OBJ
    
    
     
      F32 : "F32"
      FloatFull : "Float32Full"
      Pst : "Parallax Serial Terminal"
      FloatStr : "FloatString"
        
    PUB Setup | localIndex
                                                                                  
      Pst.Start(115200)                                                      
      F32.start                                                                  
      FloatFull.start                                                              
      FloatStr.SetPrecision(4)
                                                  
      MainLoop
       
    PUB MainLoop | index1, index2, result32
        
      'Pst.str(string(13, "Start of MainLoop Method "))
    
    
      index1 := 25.0 
      repeat
        index2 := 31.0
        repeat 10
          Pst.str(string(13, "Tan2("))
          Pst.str(FloatStr.FloatToString(index2))
          Pst.str(string(", "))
          Pst.str(FloatStr.FloatToString(index1))
          Pst.str(string(") | F32 = "))
          result32 := F32.ATan2(index2, index1)
          Pst.str(FloatStr.FloatToString(result32))
          Pst.str(string(") | Float32Full = "))
          result := FloatFull.ATan2(index2, index1)
          Pst.str(FloatStr.FloatToString(result))
          index2 := F32.FAdd(index2, 0.1)
          Pst.str(string(") | Difference = "))
          result := F32.FSub(result32, result)
          Pst.str(FloatStr.FloatToString(result))
        index1 := F32.FSub(index1, 0.1)
        Pst.charin
    

    Here's some of the output:
    Tan2(31, 25) | F32 = 0.8921) | Float32Full = 0.8921) | Difference = 0.00000155
    Tan2(31.1, 25) | F32 = 0.8937) | Float32Full = 0.8937) | Difference = 0.000001669
    Tan2(31.2, 25) | F32 = 0.8953) | Float32Full = 0.8953) | Difference = 0.000001669
    Tan2(31.3, 25) | F32 = 0.8968) | Float32Full = 0.8968) | Difference = 0.000001669
    Tan2(31.4, 25) | F32 = 0.8984) | Float32Full = 0.8984) | Difference = 0.000001729
    Tan2(31.5, 25) | F32 = 0.8999) | Float32Full = 0.8999) | Difference = 0.00000155
    Tan2(31.6, 25) | F32 = 0.9015) | Float32Full = 0.9015) | Difference = 0.000001669
    Tan2(31.7, 25) | F32 = 0.903) | Float32Full = 0.903) | Difference = 0.000001729
    Tan2(31.8, 25) | F32 = 0.9045) | Float32Full = 0.9045) | Difference = 0.000001729
    Tan2(31.9, 25) | F32 = 0.9061) | Float32Full = 0.9061) | Difference = 0.000001609
    Tan2(31, 24.9) | F32 = 0.8941) | Float32Full = 0.8941) | Difference = 0.000001609
    Tan2(31.1, 24.9) | F32 = 0.8957) | Float32Full = 0.8957) | Difference = 0.000001669
    Tan2(31.2, 24.9) | F32 = 0.8972) | Float32Full = 0.8972) | Difference = 0.000001609
    Tan2(31.3, 24.9) | F32 = 0.8988) | Float32Full = 0.8988) | Difference = 0.000001609
    Tan2(31.4, 24.9) | F32 = 0.9003) | Float32Full = 0.9003) | Difference = 0.000001729
    Tan2(31.5, 24.9) | F32 = 0.9019) | Float32Full = 0.9019) | Difference = 0.000001669
    Tan2(31.6, 24.9) | F32 = 0.9034) | Float32Full = 0.9034) | Difference = 0.00000155
    Tan2(31.7, 24.9) | F32 = 0.905) | Float32Full = 0.905) | Difference = 0.000001609
    Tan2(31.8, 24.9) | F32 = 0.9065) | Float32Full = 0.9065) | Difference = 0.000001609
    Tan2(31.9, 24.9) | F32 = 0.908) | Float32Full = 0.908) | Difference = 0.000001609
    

    I used these test numbers since they were numbers which caused problems with the version 1.5 code.

    Are there other sets of numbers anyone thinks would be worth checking?

    Thanks again Jonathan for this recent fix and for F32. It's a great object.
  • lonesock Posts: 917
    edited 2013-05-03 11:08
    Sweet! The issue was an overflow in the CORDIC routine. I had previously attempted to convert some other functions to use CORDIC, but was running into weird problems reusing the same code...might be related. If nobody has any problems, I will post this version to the OBEX, then proceed trying to CORDICify some other functions. [8^)

    thanks,
    Jonathan
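
For readers unfamiliar with CORDIC, here is a minimal vectoring-mode sketch in C. It is not the F32 PASM routine: the 16-entry table, the fixed-point scaling and the name cordic_atan2 are assumptions chosen for illustration. It shows one way such a loop can overflow. Each iteration stretches the vector, so the final x is about 1.647 times the input magnitude, and inputs that nearly fill a 32-bit register no longer fit. Whether this is the exact overflow Jonathan fixed is not stated in the thread.

```c
#include <stdint.h>

/* atan(2^-i) in radians for i = 0..15 (the usual CORDIC angle table). */
static const double ATAN_TABLE[16] = {
    0.7853981634, 0.4636476090, 0.2449786631, 0.1243549945,
    0.0624188100, 0.0312398334, 0.0156237286, 0.0078123411,
    0.0039062301, 0.0019531225, 0.0009765622, 0.0004882812,
    0.0002441406, 0.0001220703, 0.0000610352, 0.0000305176
};

/* Vectoring-mode CORDIC: rotate (x, y) toward the x axis and sum the angle.
   x and y are fixed-point integers and x must be positive.
   *x_out receives the final x, i.e. the input magnitude times the CORDIC
   gain (about 1.647), which is the quantity that can overflow. */
double cordic_atan2(int32_t y, int32_t x, int32_t *x_out)
{
    double angle = 0.0;
    for (int i = 0; i < 16; i++) {
        int32_t step = (int32_t)1 << i;
        int32_t dx = x / step;          /* x * 2^-i */
        int32_t dy = y / step;          /* y * 2^-i */
        if (y > 0) {                    /* rotate clockwise */
            x += dy;
            y -= dx;
            angle += ATAN_TABLE[i];
        } else {                        /* rotate counter-clockwise */
            x -= dy;
            y += dx;
            angle -= ATAN_TABLE[i];
        }
    }
    *x_out = x;
    return angle;
}
```

With y = 31 * 65536 and x = 25 * 65536 the sketch returns about 0.8921 radians, in line with the test output above, and the final x has grown by the CORDIC gain.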
  • Lawson Posts: 870
    edited 2013-07-24 13:14
    Just found a minor bug. F32.Fround(2048000000) returns POSX when that integer won't overflow. FloatMath gets this correct.

    wtf? Fmod(2.0, 3.0) is 1073741824 according to F32 and Float32Full. (it should be 2.0)

    Lawson

    Edit: turns out I'm wrong and F32 is right. When you display Ffloat(2.0) as a decimal number 2.0 == 1073741824. :zombie:
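
Lawson's edit can be reproduced on a PC. The C sketch below (helper names are illustrative only) copies a float's bytes into an integer, which is effectively what happens when a float result is displayed with an integer print routine.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Copy a float's bytes into a 32-bit integer (its IEEE-754 bit pattern). */
uint32_t float_to_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Print a float next to the decimal and hex forms of its bit pattern. */
void show_float_bits(float f)
{
    uint32_t u = float_to_bits(f);
    printf("%g -> %lu ($%08lX)\n", (double)f,
           (unsigned long)u, (unsigned long)u);
}
```

Calling show_float_bits(2.0f) reports 1073741824, which is $40000000, the number Lawson saw.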
  • Duane Degn Posts: 10,588
    edited 2013-10-13 12:37
    Jonathan,

    I'm hoping I can talk you into updating the OBEX version with your latest fix.

    With the recent change in the OBEX old links to it have been broken, including the link in your signature. The OBEX search appears to be just about worthless so I always just display all the objects on a single page and use the browser's search feature to find the desired object. I did this to find F32's current location.

    Again, thanks for making such a useful object.
  • lonesock Posts: 917
    edited 2013-10-16 11:09
    In theory I have now updated the OBEX, and the link in my sig...thanks!

    Jonathan
  • Magnetspin Posts: 11
    edited 2014-09-29 03:20
    Jonathan
    With f32.Asin( f32.Sin(Pi/2)) we get 6.805647e+38 instead of Pi/2.

    Best Regards
  • Heater. Posts: 21,230
    edited 2014-09-29 03:31
    Magnetspin,

    Which language is that, Spin or C?

    If it is Spin then the Pi / 2 will be done as integer arithmetic and give the wrong result.

    Surely that should be f32.Asin(f32.Sin(f32.Div(Pi, 2.0))) or whatever divide is in F32, I forget.
  • kuroneko Posts: 3,623
    edited 2014-09-29 03:38
    Use constant(PI/2.0) instead. Or - as pointed out already - runtime divide.
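
To see why the kind of divide matters, the C sketch below contrasts the two operations on the bit pattern of pi ($40490FDB). It assumes, as Heater describes, that an integer divide simply treats the float's 32 bits as an integer. The function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Copy a float's bytes into a 32-bit integer (its IEEE-754 bit pattern). */
uint32_t float_to_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Inverse of float_to_bits. */
float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Integer divide applied to the bit pattern: exponent and mantissa are
   both shifted, so the result is an unrelated, tiny float. */
float halve_as_integer(float f)
{
    return bits_to_float(float_to_bits(f) / 2u);
}

/* Floating-point divide: only the exponent field drops by one. */
float halve_as_float(float f)
{
    return f / 2.0f;
}
```

Starting from $40490FDB, the float divide gives $3FC90FDB (pi/2), while the integer divide gives $202487ED, which reads back as roughly 1.4e-19.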
  • Heater. Posts: 21,230
    edited 2014-09-29 03:56
    Whilst we are here: I was recently playing with quaternions and found this magical piece of code for computing fast inverse square roots, that is, 1 over the square root of x, or x to the power of minus one half. The trick is that it uses integer math and four floating-point multiplies.

    In C it looks like this:
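    #include <stdint.h>
    #include <string.h>

    float invSqrt(float x) {
    	float halfx = 0.5f * x;
    	float y = x;
    	int32_t i;                          /* always 32 bits, unlike long on some hosts */
    	memcpy(&i, &y, sizeof i);           /* reinterpret the float's bits as an integer */
    	i = 0x5f3759df - (i >> 1);          /* magic constant gives the first guess */
    	memcpy(&y, &i, sizeof y);           /* back to float */
    	y = y * (1.5f - (halfx * y * y));   /* one Newton-Raphson refinement */
    	return y;
    }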
    
    It occurred to me this might make a nice addition to F32 if there is any space left.


    This little code trick, first made public in the Quake III Arena source code, is described here: http://en.wikipedia.org/wiki/Fast_inverse_square_root
  • Magnetspin Posts: 11
    edited 2014-09-29 07:47
    Heater

    I was testing Jonathan's F32 object from the OBEX with these results:


    f32.sin(Constant(pi/2.0)) :1.000000e+0 :thumb:
    f32.Asin(f32.sin(Constant(pi/2.0))) :6.805647e+38 :frown:
  • Duane Degn Posts: 10,588
    edited 2014-09-29 08:38
    Magnetspin wrote: »
    Heater

    I was testing Jonathan's F32 object from the OBEX with these results:


    f32.sin(Constant(pi/2.0)) :1.000000e+0 :thumb:
    f32.Asin(f32.sin(Constant(pi/2.0))) :6.805647e+38 :frown:

    Are you using the latest version of F32?

    I can't repeat your results.

    Can you post your test code?
    result = F32.Sin(pi/2.0) = 1
    F32.ASin(1.0) = 1.571
    F32.ASin(result) = 1.571
    F32.ASin(F32.Sin(pi/2.0)) = 1.571
    

    I just added the code below to a program I'm presently working on which is using two instances of F32 (hence the "[PATH]" identifier).
      F32[PATH].Start
      FloatStr.SetPrecision(4)
    
      Com.Strs(DEBUG_COM, string("result = F32.Sin(pi/2.0) = "))
      result := F32[PATH].Sin(constant(pi/2.0))
      Com.Str(DEBUG_COM, FloatStr.FloatToString(result))
      NewLine
      Com.Str(DEBUG_COM, string("F32.ASin(1.0) = "))
      Com.Str(DEBUG_COM, FloatStr.FloatToString(F32[PATH].ASin(1.0)))
      NewLine
      Com.Str(DEBUG_COM, string("F32.ASin(result) = "))
      Com.Str(DEBUG_COM, FloatStr.FloatToString(F32[PATH].ASin(result)))
      NewLine
      Com.Str(DEBUG_COM, string("F32.ASin(F32.Sin(pi/2.0)) = "))
      Com.Str(DEBUG_COM, FloatStr.FloatToString(F32[PATH].ASin(F32[PATH].Sin(constant(pi/2.0)))))
    
  • Magnetspin Posts: 11
    edited 2014-09-30 02:53
    Hi Duane
    Thanks for the input.
    I have the latest version from the OBEX.
    I increased the precision of the output:

    F32 testing

    f32.sin(Constant(pi/2.0)) :1.000001e+0 ????
    f32.Asin(1.000001) :6.805647e+38 that is the key
    f32.Asin(1.0) :1.570796e+0

    No idea what's going on.

    I am spinning...
  • kuroneko Posts: 3,623
    edited 2014-09-30 04:13
    Magnetspin wrote: »
    I increased the precision of the output:

    F32 testing

    f32.sin(Constant(pi/2.0)) :1.000001e+0 ????
    f32.Asin(1.000001) :6.805647e+38 that is the key
    f32.Asin(1.0) :1.570796e+0
    Can you please post your complete code here (archived)?

    For me the binary pattern of sin(PI/2) says exactly 1.0 ($3F800000), not sure where you get that extra non-null digit from. As for arcsin(1.000001), what did you expect it to return?
  • Magnetspin Posts: 11
    edited 2014-09-30 04:37
    Here is the code.

    Thank you for the help.
  • kuroneko Posts: 3,623
    edited 2014-09-30 04:45
    Looks fine to me (nothing unexpected):
    F32 testing
    
     f32.sin(Constant(pi/2.0))           :1.000000e+0
     f32.Asin(1.000001)                  :6.805647e+38
     f32.Asin(1.0)                       :1.570796e+0
     f32.Asin(f32.sin(result)            :1.570796e+0
     f32.Asin(f32.sin(Constant(pi/2.0))) :1.570796e+0
     f32.Asin(1.0)                       :1.570796e+0
    
    OK, the second line is odd but the first is still plain 1.0.

    The actual result of the second line is correct, i.e. NaN which is returned as POSX. IOW FloatString somehow can't cope with it.
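
The 6.805647e+38 figure is consistent with a formatter that decodes $7FFF_FFFF as if it were an ordinary number. The C sketch below is an assumption about the failure mode, not FloatString's actual code: it treats exponent 255 like any other exponent, and it also shows the IEEE-754 test for NaN that would catch the value first. Both function names are hypothetical.

```c
#include <stdint.h>

/* Decode a bit pattern as a normal number only: no check for NaN or
   infinity (exponent 255) and no handling of denormals (exponent 0). */
double decode_ignoring_specials(uint32_t bits)
{
    uint32_t exponent = (bits >> 23) & 0xFFu;
    uint32_t mantissa = bits & 0x7FFFFFu;
    double value = 1.0 + (double)mantissa / 8388608.0;  /* implicit 1, 2^23 */
    int shift = (int)exponent - 127;                    /* remove the bias */
    while (shift > 0) { value *= 2.0; shift--; }
    while (shift < 0) { value /= 2.0; shift++; }
    return (bits >> 31) ? -value : value;
}

/* IEEE-754: exponent all ones with a non-zero mantissa is NaN. */
int is_nan_bits(uint32_t bits)
{
    return ((bits >> 23) & 0xFFu) == 0xFFu && (bits & 0x7FFFFFu) != 0;
}
```

Fed $7FFF_FFFF, the naive decoder yields about 6.8056e+38, twice the largest finite float, while is_nan_bits reports it as NaN.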
  • Duane Degn Posts: 10,588
    edited 2014-09-30 10:10
    Magnetspin wrote: »
    Heater

    I was testing Jonathan's F32 object from the OBEX with these results:


    f32.sin(Constant(pi/2.0)) :1.000000e+0 :thumb:
    f32.Asin(f32.sin(Constant(pi/2.0))) :6.805647e+38 :frown:

    Where is the code which produced this error?

    As kuroneko showed the code below produces the expected results.
      term.str( string(13,  " f32.Asin(f32.sin(Constant(pi/2.0))) :" ) )
      term.str( FS.FloatToScientific( f32.Asin(f32.sin(Constant(pi/2.0)))))
    

    Why make a false claim?

    Edit: I apologize for my rudeness.
  • kuroneko Posts: 3,623
    edited 2014-09-30 16:13
    Duane Degn wrote: »
    Why make a false claim?
    The only thing missing is the odd 1.000001, other than that we just found a bug in FloatString although it can be argued that it was never designed to handle special numbers (like NaN). So everyone please relax.

    Post #114 contains the archive showing the bug (6.8...e+38).
  • Duane Degn Posts: 10,588
    edited 2014-09-30 17:11
    kuroneko wrote: »
    The only thing missing is the odd 1.000001, other than that we just found a bug in FloatString although it can be argued that it was never designed to handle special numbers (like NaN). So everyone please relax.

    Post #114 contains the archive showing the bug (6.8...e+38).

    I for one wouldn't have worried much if the problem had been stated as Asin(1.000001) equals 6.8...e+38, since 1.000001 isn't a valid parameter to the Asin method.

    I'm actively working on code for my hexapods and a bug in the Asin method would be a serious issue (as the bug in the Atan2 method had been). I was irritated to find I had spent time investigating an issue which didn't exist.
    kuroneko wrote: »
    So everyone please relax.

    Since you requested it, I'll relax. But I'll still be a bit irritated (but not for very long, and it will be a relaxed state of irritation).

    Time for a nice relaxing dinner.
  • kuroneko Posts: 3,623
    edited 2014-09-30 17:21
    Duane Degn wrote: »
    I for one wouldn't have worried much if the problem had been stated as Asin(1.000001) equals 6.8...e+38, since 1.000001 isn't a valid parameter to the Asin method.
    To clarify, yes it is an invalid parameter which causes NaN to be returned (but displayed wrongly, 3.4e+38 is max for float). What I'm still puzzled about is how the 1.000001 was produced in the first place.
    f32.sin(Constant(pi/2.0)) :1.000001e+0 ????
  • Magnetspin Posts: 11
    edited 2014-10-06 08:17
    Ok
    The problem is solved now: Hanno found the error. His compiler used 3.141593, which equates to $40490FDC. He had to change this to $40490FDB.

    Thanks to all for the support.
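
The one-bit difference Hanno found can be checked on a PC. This C sketch (helper names are illustrative) compares the bit pattern of the six-decimal literal 3.141593 with that of a more precise pi literal, assuming the host compiler rounds float literals to nearest as IEEE-754 implementations normally do.

```c
#include <stdint.h>
#include <string.h>

/* Copy a float's bytes into a 32-bit integer (its IEEE-754 bit pattern). */
uint32_t float_to_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Distance in representable steps between two positive floats. */
int32_t ulps_apart(float a, float b)
{
    return (int32_t)(float_to_bits(a) - float_to_bits(b));
}

/* Pi rounded to six decimals, as quoted in the post above. */
static const float PI_SIX_DECIMALS = 3.141593f;

/* Pi with enough digits to round to the nearest 32-bit float. */
static const float PI_NEAREST = 3.14159265358979f;
```

PI_SIX_DECIMALS lands on $40490FDC and PI_NEAREST on $40490FDB, so the two constants are exactly one step apart, which matches the values quoted above.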
  • Hanno Posts: 1,130
    edited 2014-10-06 13:12
    Sorry, but the problem is not solved.

    ViewPort used to use the decimal representation of Pi given in the Propeller Manual which is very slightly different than the hex representation probably used by the Propeller Tool/BST. When I changed this, the result from the float object for sin(pi/2) is now the same as bst/propeller tool= 1.0

    However, the following code now produces an error on all compilers because the float object has an error in its sin/cos calculation:

    F32.start

    fs.SetPrecision(10)
    F.start

    vp.str( string(13, " f32.sin(Constant(3*pi/2.0)) :" ) )
    vp.str( FS.FloatToScientific( f32.sin(Constant(3.0*PI/2.0))))


    The output is:
    f32.sin(Constant(3*pi/2.0)) :-1.000001e+0

    As pointed out by Magnetspin, this error may be tiny, but if used in subsequent calculations can result in NAN which is a pretty big number for our Prop.
    Hanno
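
Until the sin/cos rounding is addressed in the float object, one possible workaround is to clamp a value into the range -1 to 1 before handing it to an inverse trig routine. The sketch below shows the idea in C. clamp_unit is a hypothetical helper, not an F32 method, and a Spin version would need the floating-point compare routines of whichever float object is in use.

```c
/* Clamp x into [-1.0, 1.0] so that a result which overshoots by a rounding
   step (such as -1.000001) is still a legal arcsine or arccosine argument.
   Values already inside the range are returned unchanged. */
float clamp_unit(float x)
{
    if (x > 1.0f) {
        return 1.0f;
    }
    if (x < -1.0f) {
        return -1.0f;
    }
    return x;
}
```

With this guard, clamp_unit(-1.000001f) returns exactly -1.0f, so a following arcsine call receives a valid argument. The clamp hides the small error rather than fixing it, so it is only appropriate where an overshoot of that size is known to be rounding noise.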