Usefulness of Floating Point Math Co-Processors — Parallax Forums

Prophead100 Posts: 192
edited 2012-08-18 16:26 in Propeller 1
I have some floating-point-heavy math code on the Prop that uses a lot of trigonometric functions such as tan, atan, sin, asin, cos, and acos, plus some basic multiplication/division and addition/subtraction. It works OK with the current OBEX objects, but I would like to substantially speed up some 32-bit math in SPIN and perhaps upgrade to 64-bit math in SPIN or GCC if possible so I can squeeze in more performance.

Has anyone had experience using the uM-FPU32 or uM-FPU64 coprocessors to speed up the calculations? Do these coprocessors, when used with the Prop, make up for the overhead of communicating with an external chip?
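
One way to frame the overhead question is as a simple time budget: offloading only pays off if the transfer time plus the coprocessor's own compute time is less than the time the Prop would spend doing the math in software. A minimal sketch in C of that comparison follows. The function names and parameters are hypothetical, and no timing figures for the uM-FPU chips or the OBEX objects are assumed; measured values would have to be supplied.

```c
#include <stdbool.h>

/* Total time for one offloaded calculation, in microseconds:
   send the operands, let the coprocessor compute, read the results back.
   All four inputs are measurements the user must supply. */
double offload_time_us(double bytes_out, double bytes_in,
                       double us_per_byte, double remote_compute_us)
{
    return (bytes_out + bytes_in) * us_per_byte + remote_compute_us;
}

/* True when the offloaded path beats computing locally in software. */
bool offload_is_faster(double local_compute_us, double offload_us)
{
    return offload_us < local_compute_us;
}
```

The model suggests that the more work is done per byte transferred, the better the external chip looks, since the transfer cost is paid once per calculation rather than once per operation.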

Comments

  • Duane C. Johnson Posts: 955
    edited 2012-08-16 15:14
    Hi Prophead100;

    How much precision do you require?

    And what is the application?

    Duane J
  • Duane Degn Posts: 10,588
    edited 2012-08-16 15:40
    I've used the uM-FPU32 chip a bit. I haven't done side by side speed comparisons, but my gut tells me using the F32 object would be faster than using the uM-FPU32 chip.

    I was pleasantly surprised that I was able to compute the IK positions of all 18 servos at 50 Hz for my hexapod using F32.

    If you need 64 bit math, then I think the uM-FPU64 chip would be worth using.

    I think cessnapilot wrote objects for both the FPU chips.
  • Prophead100 Posts: 192
    edited 2012-08-16 16:03
    The application is a mobile implementation of a tracking solar panel using my SPIN Solar Object, which is 32-bit right now. It runs fast enough to calculate and provide coordinates to control a two-axis solar panel or heliostat using a low-precision (+/- 0.5 degree) formula, but the floating point math uses a couple of cogs and has a noticeable execution time (some observable fraction of a second). With faster 32-bit floating point math and fewer cogs, I may have space to add MPPT power control (also floating point) and some closed-loop sensors for refined tracking (and power management options), or perhaps other functions run by the same board. I could also add more code to refine the heliostat formula.

    I am also working on using a higher-precision formula (+/- 0.003 degree) for a concentrating solar collector or solar observatory, which requires 64-bit math. With help from Jazzed and Heater, some C code from NREL ran well on the Prop using GCC, but it took a couple hundred seconds with a basic configuration, which is far slower than the speed needed (72 ms) to track the sun in real time at that level of detail.
  • jmg Posts: 15,173
    edited 2012-08-16 16:13
    With most maths routines, you can trade off speed with precision, so you may find a 40 bit real, for example, adds precision, but not at such an impact on speed as 64 bit would.

    You could also use a Fixed-real approach, where a slower task does the precision calcs, and a faster one just interpolates between what are very short segments.
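
    A minimal sketch in C of the interpolation idea, reading "Fixed-real" as fixed-point (an assumption): a slow task computes precise angles at the ends of a short segment, and a fast task fills in between using only integer math. The function name, the microdegree and millisecond units, and the clamping behaviour are all hypothetical choices made for illustration.

    ```c
    #include <stdint.h>

    /* Angle at time t_ms, linearly interpolated between two precisely
       computed samples (t0_ms, a0_udeg) and (t1_ms, a1_udeg), t0_ms < t1_ms.
       Angles are in microdegrees, times in milliseconds.
       Outside the segment the nearest endpoint is returned. */
    int32_t interp_udeg(int32_t t0_ms, int32_t a0_udeg,
                        int32_t t1_ms, int32_t a1_udeg, int32_t t_ms)
    {
        if (t_ms <= t0_ms) return a0_udeg;
        if (t_ms >= t1_ms) return a1_udeg;
        /* 64-bit intermediate so the product cannot overflow. */
        int64_t span = (int64_t)a1_udeg - a0_udeg;
        int64_t step = span * (t_ms - t0_ms) / (t1_ms - t0_ms);
        return (int32_t)(a0_udeg + step);
    }
    ```

    The sketch does not handle an azimuth that wraps through 360 degrees inside a segment; that case would need separate treatment.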
  • Tracy Allen Posts: 6,664
    edited 2012-08-16 16:25
    If each calculation involves pouring variables into the hopper and turning the crank on a complicated equation in many terms, well, you can program the whole thing into the µFPU, using its nice IDE. The communication reduces to transferring the data and retrieving the result(s), between which the µFPU is quite fast with the calculation itself. There are extras, such as hookup for a GPS and automatic parsing into variables, which might help with your mobile app. Also it has analog to digital converters for possible use with the sensors.
  • Mark_T Posts: 1,981
    edited 2012-08-16 16:29
    Surely a fraction of a second of latency is good enough for tracking the sun in the sky? Or is this on the move?

    When I looked at the specs of these co-processors I wasn't impressed - I think they only served to reduce the ROM footprint on the microcontroller! Certainly not in the class of a 387 I believe. If they didn't have a gazillion pins an ARM FP vector co-processor would be handy (single cycle for single-precision on all but divide/sqroot, 2-cycle for double-precision...)

    The uM-FPU V3.1 does have 2 analog inputs with 12 bit A/D converter I note...
  • rod1963 Posts: 752
    edited 2012-08-16 17:02
    Too bad someone didn't adapt the STM32F4 as a math-coprocessor. It has DSP and floating point hardware. Use the Prop as a front-end and the F4 as the back end compute engine.
  • SRLM Posts: 5,045
    edited 2012-08-16 18:34
    I'm able to get over 2 kHz on this block of floating point math operations (with a standard Propeller setup) with my version of F32:
    t_5 = 2 * offset
    const_2_pi = 2 * pi
    
    c = (K_Q * diameter) / K_T
    t_1 = M_z / (4*c)
    t_2 = M_y / t_5
    t_3 = M_x / t_5
    t_4 = F_z / 4
    
    F_1 = (t_4 + (t_1 - t_2)) #> 0
    F_2 = (t_4 - (t_1 + t_3)) #> 0
    F_3 = (t_4 + (t_1 + t_2)) #> 0
    F_4 = (t_4 + (t_3 - t_1)) #> 0
    
    
    t_1 = const_2_pi / (diameter * diameter)
    t_2 = rho * K_T
    
    omega_d_1 = t_1 * ((F_1 / t_2) sqrt 0)
    omega_d_2 = t_1 * ((F_2 / t_2) sqrt 0)
    omega_d_3 = t_1 * ((F_3 / t_2) sqrt 0)
    omega_d_4 = t_1 * ((F_4 / t_2) sqrt 0)
    
    n_d_1 = omega_d_1 / const_2_pi
    n_d_2 = omega_d_2 / const_2_pi
    n_d_3 = omega_d_3 / const_2_pi
    n_d_4 = omega_d_4 / const_2_pi
    

    Maximum Time: 0.417ms
    Minimum Time: 0.377ms
    Average Time: 0.414ms

    Note that all of the operations are actually performed; i.e., there is no optimization of "2 * pi": it is included in the calculations.
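
    For readers unfamiliar with the notation, the same block can be read as ordinary single-precision C. This is a sketch under assumptions: "#>" is taken to be a lower clamp (LimitMin) and "x sqrt 0" to be a square root with an unused second operand. The struct, the function name compute_n_d, and the loop over the four outputs are hypothetical and are not SRLM's code.

    ```c
    #include <math.h>

    /* Inputs to the calculation; field names follow the post. */
    typedef struct {
        float offset, diameter, K_Q, K_T, rho;
        float M_x, M_y, M_z, F_z;
    } calc_in_t;

    /* Clamp from below, standing in for the "#>" (LimitMin) operator. */
    static float limit_min(float x, float lo) { return x > lo ? x : lo; }

    /* Fills n_d[0..3] with n_d_1..n_d_4 from the post's expressions. */
    void compute_n_d(const calc_in_t *in, float n_d[4])
    {
        const float pi = 3.14159265f;
        float t_5 = 2.0f * in->offset;
        float const_2_pi = 2.0f * pi;

        float c   = (in->K_Q * in->diameter) / in->K_T;
        float t_1 = in->M_z / (4.0f * c);
        float t_2 = in->M_y / t_5;
        float t_3 = in->M_x / t_5;
        float t_4 = in->F_z / 4.0f;

        float F[4];
        F[0] = limit_min(t_4 + (t_1 - t_2), 0.0f);
        F[1] = limit_min(t_4 - (t_1 + t_3), 0.0f);
        F[2] = limit_min(t_4 + (t_1 + t_2), 0.0f);
        F[3] = limit_min(t_4 + (t_3 - t_1), 0.0f);

        /* t_1 and t_2 are reused, as in the original block. */
        t_1 = const_2_pi / (in->diameter * in->diameter);
        t_2 = in->rho * in->K_T;

        for (int i = 0; i < 4; i++) {
            float omega_d = t_1 * sqrtf(F[i] / t_2);
            n_d[i] = omega_d / const_2_pi;
        }
    }
    ```

    Counting by hand, the original block comes to roughly 40 floating point operations per pass, including four square roots, which gives some context for the timings quoted above.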

    This uses one cog to compute without any oversight, and is 32 bit. It supports most of the functionality of F32, but not all:
    "*":"Mul", "/":"Div", "+":"Add", "-":"Sub", "sqrt":"Sqr", \
    "#>":"LimitMin", "<#":"LimitMax", "arc_t2":"ATan2", "arc_c":"ACos", \
    "arc_s":"ASin", "sin":"Sin", "cos":"Cos", "tan":"Tan", "~":"PID", "||":"TruncRound", \
    "ffloat":"Float"

    If needed, you could substitute the routines that you do need (exp, ...) for ones in the list above that you do not need.
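
    The operator list above amounts to a lookup from expression token to routine name. A minimal sketch in C of such a table follows, using exactly the pairs listed in the post. The type and function names are hypothetical, and this is not how SRLM's tool is necessarily implemented.

    ```c
    #include <stddef.h>
    #include <string.h>

    /* One entry: the token used in an expression and the routine it maps to. */
    typedef struct {
        const char *op;
        const char *routine;
    } op_map_t;

    static const op_map_t OP_MAP[] = {
        {"*", "Mul"}, {"/", "Div"}, {"+", "Add"}, {"-", "Sub"},
        {"sqrt", "Sqr"}, {"#>", "LimitMin"}, {"<#", "LimitMax"},
        {"arc_t2", "ATan2"}, {"arc_c", "ACos"}, {"arc_s", "ASin"},
        {"sin", "Sin"}, {"cos", "Cos"}, {"tan", "Tan"},
        {"~", "PID"}, {"||", "TruncRound"}, {"ffloat", "Float"},
    };

    /* Returns the routine name for a token, or NULL if it is not supported. */
    const char *routine_for(const char *op)
    {
        for (size_t i = 0; i < sizeof OP_MAP / sizeof OP_MAP[0]; i++) {
            if (strcmp(OP_MAP[i].op, op) == 0) {
                return OP_MAP[i].routine;
            }
        }
        return NULL;
    }
    ```

    In a table like this, swapping in a routine such as exp would mean replacing one of the existing entries rather than growing the table, which matches the substitution described above.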

    Anyway, I was planning on posting the source code for this on the forums sometime. If it looks useful to you then I can do it this weekend.
  • SRLM Posts: 5,045
    edited 2012-08-16 18:35
    Could you post the math routine that you are doing, or a link to the formulas?
  • Prophead100 Posts: 192
    edited 2012-08-18 16:26
    Thanks for all the input
    Tracy Allen wrote: "If each calculation involves pouring variables into the hopper and turning the crank on a complicated equation in many terms, well, you can program the whole thing into the µFPU, using its nice IDE. The communication reduces to transferring the data and retrieving the result(s), between which the µFPU is quite fast with the calculation itself. There are extras, such as hookup for a GPS and automatic parsing into variables, which might help with your mobile app. Also it has analog to digital converters for possible use with the sensors."

    The ability to program a set of functions into the uFPU may be the most effective way, short of rewriting a lot of code to optimize it. The GPS parsing would definitely simplify the time and mobile location part of the equation. Perhaps I could program it to simply output the solar variables, as something that sits between the GPS and the Prop, for the higher resolution/speed applications where the cost of the chip would be worth it. My only apprehension is learning another programming dialect. For routine lower-precision solar tracking, the Prop alone could do it at a lower cost. Thanks

    SRLM - The code I am trying to optimize is the solar object ( http://obex.parallax.com/objects/807/ ) in SPIN and some NREL code in C++ (Post 15 of this thread --> http://forums.parallax.com/showthread.php?141011-Newbie-C-Syntax-Question).

    Mark T. - The latency issue comes in (a) when higher-resolution tracking is needed for a high-ratio solar concentrator or solar telescope, (b) when more time is needed in the code for other operations such as process control of the mechanics, or (c) when mobile, as you suggest.