Thread: F32 - Concise floating point code for the Propeller

  1. #1

    Default F32 - Concise floating point code for the Propeller

    Hi, All.

    I just uploaded F32 to the OBEX (http://obex.parallax.com/objects/689/). It is basically a rewrite of Float32Full, faster and fitting into a single cog. The Spin calling functions are identical, so it should make a convenient drop-in replacement.

    Please try it out if you have any code that uses Float32 or Float32Full currently, and let me know if you have any problems, or feature requests.

    thanks,
    Jonathan
    Free time status: see my avatar [8^)
    F32 - fast & concise floating point: OBEX, Thread
    Unrelated to the prop: KISSlicer

  2. #2

    Default Re: F32 - Concise floating point code for the Propeller

    Faster AND smaller?

    Very impressive!

    Ross.
    Catalina - a FREE ANSI C compiler for the Propeller.
    Download it from http://catalina-c.sourceforge.net/

  3. #3

    Default Re: F32 - Concise floating point code for the Propeller

    Hi lonesock.


    NICE Work


    Regards
    Sapieha
    __________________________________________________ ___
    Nothing is impossible, there are only different degrees of difficulty.
    For every stupid question there is at least one intelligent answer.
    Don't guess - ask instead.
    If you don't ask you won't know.
    If you're gonna construct something, make it as simple as possible yet as versatile/usable as possible.

  4. #4

    Default Re: F32 - Concise floating point code for the Propeller

    Excellent!

  5. #5

    Default Re: F32 - Concise floating point code for the Propeller

    Thanks for the kind words. There is one final thing I need to document, and that is the domain and accuracy for each function. (Some of them still use the lookup tables in prop ROM, with linear interpolation between points, as did the original Float32Full implementation.)
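For anyone unfamiliar with table lookup plus linear interpolation, here is a minimal Python sketch of the idea. It is not the F32 code: the function name `sin_quarter`, the 16-bit scaling, and the way the table is built are assumptions for illustration. The 2049-entry size mirrors the quarter-wave sine table in the Propeller ROM, but this sketch generates its own table with `math.sin` and only handles angles from 0 to pi/2.

```python
import math

TABLE_SIZE = 2049   # samples covering 0..90 degrees inclusive (assumed layout)
SCALE = 0xFFFF      # assumed 16-bit unsigned scaling of the table entries

# Build a stand-in table; on the chip these values would simply be read from ROM.
SINE_TABLE = [round(math.sin(math.pi / 2 * i / (TABLE_SIZE - 1)) * SCALE)
              for i in range(TABLE_SIZE)]

def sin_quarter(angle_rad):
    """Approximate sin(angle) for 0 <= angle <= pi/2 by linear interpolation."""
    pos = angle_rad / (math.pi / 2) * (TABLE_SIZE - 1)  # fractional table index
    idx = min(int(pos), TABLE_SIZE - 2)                 # keep idx + 1 in range
    frac = pos - idx                                    # distance past entry idx
    lo, hi = SINE_TABLE[idx], SINE_TABLE[idx + 1]
    return (lo + (hi - lo) * frac) / SCALE
```

With a table this dense, the interpolation error is far smaller than the 16-bit quantization of the entries, so the accuracy of such a function is limited mostly by the table itself.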

    Jonathan

  6. #6

    Default Re: F32 - Concise floating point code for the Propeller

    Cool. I'll have to try these when I need more speed. Can you comment on how you achieved the improvements?

  7. #7

    Default Re: F32 - Concise floating point code for the Propeller

    Sure. I don't remember everything, but here are a few:

    * All of the Arc* functions used to be polynomial approximations, IIRC using 6 FP adds and 6 FP multiplies, plus some other preconditioning functions and a SQRT. I switched to an efficient CORDIC routine that calculates ATan2 directly (ATan2 originally computed the division, then called ATan), gaining much more speed and better accuracy as well. Now all the Arc* functions use the ATan2 implementation.

    * Both multiply and divide had some inefficiencies in their main loops, and computed more bits than necessary (fp32 only needs 24 bits of mantissa, so multiply uses 24 iterations, while divide needed 26 iterations to get the rounding right)

    * the command dispatch table no longer needs to fit inside the cog's RAM, and embeds the call command directly, so the dispatch routine is smaller too.

    * Sqrt used an iterative scheme with embedded FP multiplications...I switched to calculating it directly (with a nifty sliding window, which I have never seen before...might have to write a mini-whitepaper on it [8^)

    * the table interpolation is a bit faster now

    * the Sin and Cosine code use the faster table interpolation, and the Tangent code reuses some preliminary results (the angle scaling) from Sin when calling Cos.

    * various small tweaks.
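For readers who have not met CORDIC before, here is a minimal Python sketch of the vectoring-mode idea behind an ATan2 routine. It is not the F32 source: the function names (`cordic_atan2`, `asin_via_atan2`, `acos_via_atan2`), the 24-iteration count, and the use of Python floats are all assumptions for illustration. On the Propeller the multiplications by 2^-i would be arithmetic shifts on fixed-point values. The arc-function identities shown are the standard textbook ones and may differ from how F32 actually wires them up.

```python
import math

ITERATIONS = 24  # assumed: roughly one iteration per mantissa bit
ANGLES = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]  # small lookup table

def cordic_atan2(y, x):
    """Angle of the vector (x, y) in radians, found by rotating it onto the x axis."""
    if x == 0 and y == 0:
        return 0.0
    angle = 0.0
    # Pre-rotate by 90 degrees so the vector lies in the right half-plane.
    if x < 0:
        if y >= 0:
            x, y, angle = y, -x, math.pi / 2
        else:
            x, y, angle = -y, x, -math.pi / 2
    for i in range(ITERATIONS):
        shift = 2.0 ** -i                 # a right shift by i in fixed point
        if y > 0:                         # rotate clockwise, toward the x axis
            x, y = x + y * shift, y - x * shift
            angle += ANGLES[i]
        else:                             # rotate counter-clockwise
            x, y = x - y * shift, y + x * shift
            angle -= ANGLES[i]
    return angle

def asin_via_atan2(v):
    """asin(v) = atan2(v, sqrt(1 - v^2)) for -1 <= v <= 1."""
    return cordic_atan2(v, math.sqrt(1.0 - v * v))

def acos_via_atan2(v):
    """acos(v) = atan2(sqrt(1 - v^2), v) for -1 <= v <= 1."""
    return cordic_atan2(math.sqrt(1.0 - v * v), v)
```

Note that no division is needed anywhere: the ratio y/x is never formed, which is one reason computing ATan2 directly can beat "divide, then ATan".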

    Jonathan

  8. #8

    Default Re: F32 - Concise floating point code for the Propeller

    Quote Originally Posted by lonesock View Post
    * Sqrt used an iterative scheme with embedded FP multiplications...I switched to calculating it directly (with a nifty sliding window, which I have never seen before...might have to write a mini-whitepaper on it [8^)
    Is it anything like this?

  9. #9

    Default Re: F32 - Concise floating point code for the Propeller

    Nope! [8^)

    The calculation itself is a pretty typical bit-by-bit solution (if remainder >= ((root+delta)^2 - root^2) then you can subtract that term from the remainder, and add delta to the root). The cool part is adjusting the remainder and root values in situ to keep from overflowing, and to keep all significant bits (for integers, the sqrt of a 32-bit value is a 16-bit value, but for floating point, I have 24 significant bits in the input, and need 24 significant bits in the output).
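The bit-by-bit step described above can be sketched in Python as follows. This shows only the textbook part, using the identity (root+delta)^2 - root^2 = 2*root*delta + delta^2. The in-situ sliding-window adjustment is not described in this thread, so it is not reproduced here; the function name `isqrt_bitwise` and the `result_bits` parameter are assumptions for illustration.

```python
def isqrt_bitwise(value, result_bits=16):
    """Return (root, remainder) with root = floor(sqrt(value)).

    value must be below 2 ** (2 * result_bits).
    """
    root = 0
    remainder = value
    for bit in range(result_bits - 1, -1, -1):
        delta = 1 << bit
        # (root + delta)^2 - root^2 == (2*root + delta) * delta
        term = ((root << 1) + delta) << bit
        if remainder >= term:
            remainder -= term   # accept this bit: take the term out of the remainder
            root += delta       # and add delta to the root
    return root, remainder
```

Python integers never overflow, so this model can get 24 result bits just by passing a wider input, for example `isqrt_bitwise(1 << 46, 24)`. A 32-bit machine has no such luxury, which is presumably where the in-situ adjustment earns its keep.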

    Jonathan

  10. #10

    Default Re: F32 - Concise floating point code for the Propeller

    Just noticed this - thanks Jonathan

  11. #11

    Default Re: F32 - Concise floating point code for the Propeller

    Very cool stuff, thanks a lot.

  12. #12

    Default Re: F32 - Concise floating point code for the Propeller

    Quote Originally Posted by lonesock View Post
    ...with a nifty sliding window, which I have never seen before...might have to write a mini-whitepaper on it ....
    *lonesock is now a rockstar among propellerheads*

    Please post the whitepaper, and join the Fourier for dummies thread, we need your help!

  13. #13

    Thumbs up Re: F32 - Concise floating point code for the Propeller

    Thanks Jonathan.

    Your previous F32 was already a great help in comparison to FloatMath. I use FDiv a lot. In one of my functions, the old calculation took 140528 cycles, and when I replaced it with yours, it dropped to 7376 cycles.

    I appreciate your support on this.
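To put those two cycle counts in perspective, here is a small Python calculation. The 80 MHz system clock is an assumption (a common Propeller setup); the post does not state the clock speed, so only the ratio is independent of that choice.

```python
OLD_CYCLES = 140528        # cycle count reported for the old calculation
NEW_CYCLES = 7376          # cycle count reported after switching
CLOCK_HZ = 80_000_000      # assumed system clock, not stated in the post

speedup = OLD_CYCLES / NEW_CYCLES
old_us = OLD_CYCLES / CLOCK_HZ * 1e6
new_us = NEW_CYCLES / CLOCK_HZ * 1e6

print(f"speedup: {speedup:.2f}x")
print(f"old: {old_us:.1f} us, new: {new_us:.1f} us at {CLOCK_HZ / 1e6:.0f} MHz")
```

That works out to roughly a 19x speedup for that function.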

  14. #14

    Default Re: F32 - Concise floating point code for the Propeller

    Thanks for the kind words, everyone. I'm glad / I hope it's useful. [8^)

    Jonathan

  15. #15

    Default Re: F32 - Concise floating point code for the Propeller

    This object is VERY fascinating and the timing is really good for what I am working on.

    Unfortunately, I am not sure if I can use it for what I want to do. As most of what I am doing is speed sensitive, I am coding as much as I can in PASM. The reality, though, is that I don't know PASM well enough to understand these math routines. So, I am wondering if these routines can be called from a PASM routine running in another COG? If it is possible, how is this done?

    Thanks

    Chris

  16. #16

    Default Re: F32 - Concise floating point code for the Propeller

    Hi, Chris.

    Great question. I updated the F32 object to more easily support calling from your own PASM cog, and updated the demo to show off a simple incrementing application. Btw, to get maximum speed out of your cog, I recommend setting up and starting the float operation, then doing some other stuff, then waiting to get the result back just before you need it. In the demo code I just loop until the data is ready, but that's not an overly efficient use of your cog's time [8^)
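The "start it, do other stuff, then collect the result" pattern can be modelled in Python with a thread standing in for the F32 cog. This is only a sketch of the timing idea under stated assumptions: `Mailbox`, `math_cog`, `start_add`, and `wait_result` are hypothetical names, the command is a simple flag rather than a hub address, and plain Python floats replace the encoded longs.

```python
import threading
import time

class Mailbox:
    """Stand-in for the shared hub RAM longs (hypothetical model)."""
    def __init__(self):
        self.cmd = 0                     # 0 = idle, non-zero = request pending
        self.vector = [0.0, 0.0, 0.0]    # [result, a, b]

def math_cog(mailbox, stop):
    """Plays the role of the math cog: poll, compute, then clear the command."""
    while not stop.is_set():
        if mailbox.cmd != 0:
            mailbox.vector[0] = mailbox.vector[1] + mailbox.vector[2]
            mailbox.cmd = 0              # signals "result is ready"
        else:
            time.sleep(0)                # yield to other threads

def start_add(mailbox, a, b):
    """Post the operands and kick off the operation without waiting."""
    mailbox.vector[1] = a
    mailbox.vector[2] = b
    mailbox.cmd = 1

def wait_result(mailbox):
    """Block until the command is cleared, then fetch the result."""
    while mailbox.cmd != 0:
        time.sleep(0)
    return mailbox.vector[0]

def demo():
    mailbox, stop = Mailbox(), threading.Event()
    cog = threading.Thread(target=math_cog, args=(mailbox, stop), daemon=True)
    cog.start()
    start_add(mailbox, 1.5, 2.25)        # start the float operation
    other_work = sum(range(1000))        # useful work overlapped with it
    result = wait_result(mailbox)        # collect only when actually needed
    stop.set()
    cog.join()
    return result, other_work
```

Calling `demo()` returns the sum together with the overlapped work, so the waiting loop only burns time if the caller reaches it before the math cog has finished.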

    Jonathan

  17. #17

    Default Re: F32 - Concise floating point code for the Propeller

    Hi Jonathan,

    Thanks for responding. I was hoping it would be possible to call these routines from PASM in another COG.

    Can you provide some sample code showing how this is done? Nothing complicated, just an example of how I would multiply or divide two numbers from within a PASM COG.

    Thanks

    Chris

  18. #18

    Default Re: F32 - Concise floating point code for the Propeller

    Sure! This is what was added to the file "demo_F32.spin" in the latest version of the F32 object in the OBEX:
    (of course it is showing Add instead of Mul or Div, but you get the picture)

    Code:
    {{
    
      This portion of the demo shows the calling convention used by PASM.
      F32 expects a vector of 3 sequential longs in Hub RAM, call them:
          "result a b"
      F32 also has a pointer long "f32_Cmd".  The calling goes like this:
    
      result := the dispatch call command (e.g. cmdFAdd)
      a := the first floating point encoded parameter
      b := the second floating point encoded parameter
    
      Now the vector is initialized, you set "f32_Cmd" to the address of
      the start of the vector:
    
      f32_Cmd := @result
    
      Just wait until f32_Cmd equals 0, and by then the F32 object wrote
      the result of the floating point operation into "result" (the
      head of the input vector).      
    }}
    PUB demo_F32_pasm | timeout
      ' The PASM calling cog needs to know 2 base addresses
      F32_call_vector[0] := f32.Cmd_ptr
      F32_call_vector[1] := f32.Call_ptr
      ' start up the demo cog
      cognew( @F32_pasm_eg, @F32_call_vector )
    
      ' and just print some stuff
      timeout := cnt
      repeat
        ' print it out
        term.str( fs.FloatToString( F32_call_vector[2] ) )
        term.tx( 13 )
        ' wait
        waitcnt( timeout += clkfreq )   
      
    DAT     ' this is the F32 call vector (3 longs)
            ' Note: all 3 are initialized to 0.0 (the Spin compiler can do fp32 constants)
    F32_call_vector         long    0.0[3]
    
    ORG 0
    F32_pasm_eg             ' read my pointer values in
                            mov     t1, par
                            rdlong  cmd_ptr, t1
                            add     t1, #4
                            rdlong  call_ptr, t1
    
                            ' initialize vector[1] (1st parameter) to the increment (1.0e6)
                            wrlong  increment, t1
    
    demo_loop               ' load the dispatch call into vector[0]
                            mov     t1, #f32#offAdd
                            add     t1, call_ptr
                            rdlong  t1, t1
                            wrlong  t1, par
    
                            ' call the F32 routine by setting the command pointer to non-0
                            mov     t1, par
                            wrlong  t1, cmd_ptr
    
                            ' now wait till it's done!
    :waiting_loop           rdlong  t1, cmd_ptr     wz
                  if_nz     jmp     #:waiting_loop
    
                            ' Done!  vector[0] = vector[1] + vector[2]
                            rdlong  t1, par
    
                            ' update my 2nd parameter, and do this all over again
                            mov     t2, par
                            add     t2, #8
                            wrlong  t1, t2
    
                            jmp     #demo_loop
    
    increment     long      1.0e6                        
    cmd_ptr       res       1
    call_ptr      res       1
    t1            res       1
    t2            res       1
    Jonathan
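Since the call vector holds "floating point encoded" longs, it may help to see what those bit patterns look like. The Python sketch below packs values as IEEE-754 single precision, which is the fp32 format the thread refers to. The helper names and the command value `0x1234` are hypothetical placeholders; the real dispatch command is read from the F32 object as shown in the demo above.

```python
import struct

def float_to_long(value):
    """Return the 32-bit IEEE-754 single-precision bit pattern of value."""
    return struct.unpack("<I", struct.pack("<f", value))[0]

def long_to_float(bits):
    """Decode a 32-bit pattern back into a Python float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def build_call_vector(command, a, b):
    """Lay out the three sequential longs: [command/result, a, b]."""
    return [command & 0xFFFFFFFF, float_to_long(a), float_to_long(b)]

# Example with a placeholder command value (hypothetical, not a real F32 opcode)
vector = build_call_vector(0x1234, 1.0e6, 0.0)
print([hex(word) for word in vector])
```

This is also why the DAT section can initialize the vector with `0.0`: the encoding of 0.0 is an all-zero long.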

  19. #19

    Default Re: F32 - Concise floating point code for the Propeller

    Cool!!
    Now I get another cog!

  20. #20

    Default Re: F32 - Concise floating point code for the Propeller

    Jonathan,

    I think I got it now.

    Thanks much!

    Chris
