SETMULA confusion in multi-threading. — Parallax Forums

ozpropdev Posts: 2,792
edited 2013-11-06 22:27 in Propeller 2
Maybe I should have titled this post "Observations of a multi-tasker Part 2"

Firstly, apologies to Chip and the GETMULL instruction.
In my post on WAITVID glitches, I accused GETMULL of not operating correctly in multi-threading mode.
I accused it of introducing an unexpected pipeline stall.
Chip tested this, found it to be OK in his current FPGA core, and thought it might have been fixed between releases.
That made me think that it was more likely that I had missed something else.

Here's the code in question:
    SETMULA reg1
    SETMULB reg2
    MOV     MX,#7
    DJNZ    MX,#$       'delay loop added to fix problem
    GETMULL reg3

'alternate code that works too

    SETMULA reg1
    SETMULB reg2
    GETMULL reg3
    NOP
    NOP
    NOP                 '3 NOPs added after GETMULL have the same effect


The second alternative alerted me to the fact that the extra NOPs and/or loops are offsetting my time slots to
a "sweet spot" that my code is happier with. Further investigation suggested that the SETMULA instruction is the
culprit. The documentation states that this instruction takes 16 cycles, which implies that I am missing a complete sweep
of my schedule as defined by SETTASK. As I am using the MATH stuff a lot, the cumulative error in my time slots has
a major effect.

The GETMULL instruction has already been fixed so that it no longer stalls the pipeline in multi-threading mode, which leads me to ask:

Can SETMULA and SETDIVA be fixed too? :)


Cheers
Brian

Edit: "New candidates for polling in multi-threading" was the original title.

Edit 2: "SETMULA takes 16 clock cycles" was incorrect. (Read from obsolete documentation! Oops.)
Now chasing time slot "sweet spot" issue. Eagerly waiting for new FPGA with SETRACE function :)

Comments

  • Cluso99 Posts: 18,069
    edited 2013-11-06 18:01
    You can change the title by editing your post in advanced.
  • ozpropdev Posts: 2,792
    edited 2013-11-06 19:10
    Thanks Ray

    I tried to choose a title that reflected the main reason for the post.
    I went around in circles for a while, and I did have a few alternatives in mind, such as:

    "Adventures in multi-threading"
    "Pipeline stall when using MUL and DIV hardware."

    or

    "Don't PANIC everyone, but I think I've observed another potential gotcha."
    "How coffee consumption increases proportionally to intensive system analysis."
    "Sorry Chip, I have some more suggestions for instruction changes."

    Hmm, maybe I'll leave it for the moment. :)

    Cheers
    Brian
  • Seairth Posts: 2,474
    edited 2013-11-06 20:43
    Are you sure you've got that correct? What I'm seeing is that SETMULA and SETMULB take one cycle. The multiplier will take 17 cycles to complete, which would mean that the GETMULL would stall if called before the multiplier finishes. Of course, depending on your task scheduling, the code above may have enough cycles used by other tasks that you are meeting the 17 clock-cycle requirement before calling GETMULL.

    Also, I'm guessing that SETMULx would stall like GETMULL if the multiplier was still busy. Though, I'd imagine that the only reason you'd be in such a situation is if you were using the multiplier from two tasks, or have a bug.

    And, finally, I don't understand how the NOPs help. GETMULL is not a delayed instruction, is it? You know, the new SETRACE instruction would make all of this clear really fast!
  • ozpropdev Posts: 2,792
    edited 2013-11-06 21:00
    From the unofficial docs.
    SETMULA D/#n

    Setup long A to be multiplied by long B given the value in register “D (0-511)” or number “n (0-511)”.

    Will take 16 cycles

    SETDIVA is also shown in the same docs as taking 16 cycles.
    Are there some other docs I've missed somewhere?
  • ozpropdev Posts: 2,792
    edited 2013-11-06 21:10
    Seairth wrote: »
    And, finally, I don't understand how the NOPs help. GETMULL is not a delayed instruction, is it?

    No, it's not delayed, and the 3 NOPs are just a coincidence.
    They seem to "align" the time slots to a happy place.
    The MUL instructions are only used in 1 task.
  • ozpropdev Posts: 2,792
    edited 2013-11-06 21:19
    Looking at Chip's latest instruction set, it appears that SETMULA/B is now obsolete.

    Now there are MUL, MUL32 and MUL32U instead. Timing unknown?
  • ozpropdev Posts: 2,792
    edited 2013-11-06 21:33
    Ok.
    Just measured SETMULA and it took 1 cycle.
    Bring on SETTRACE!
  • cgracey Posts: 14,151
    edited 2013-11-06 21:53
    SETMULA and other instructions which start the multiplier, divider, square rooter, and CORDIC all take 1 clock. It's the GETxxxx instructions that will wait or loop until the state machine completes its current operation. I hope to have a new FPGA file out soon. Thanks for your patience, Everyone. Things ARE coming together very nicely.
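
A rough way to picture the timing described in this thread is the toy model below. It is only a sketch, not a cycle-accurate model of the core: it assumes the start instruction costs 1 clock, takes the 17-clock multiplier latency from the figure quoted earlier in the thread, and assumes the task owns one time slot in every `slot_period` clocks with each instruction using exactly one slot. The names `getmull_stall`, `clocks_between` and `slot_period` are made up for illustration and are not part of any Parallax tool.

```python
# Toy timing model only; NOT a cycle-accurate simulation of the FPGA core.
# Assumption: the multiplier needs 17 clocks after it is started
# (figure quoted in this thread, not taken from official documentation).
MULTIPLIER_LATENCY = 17


def clocks_between(instructions_between, slot_period):
    """Clocks from the instruction that starts the multiply to GETMULL.

    Assumes the task gets one slot every `slot_period` clocks and that
    every instruction in between uses exactly one slot.
    """
    return (instructions_between + 1) * slot_period


def getmull_stall(clocks_since_start, latency=MULTIPLIER_LATENCY):
    """Clocks GETMULL would still have to wait for the result."""
    return max(0, latency - clocks_since_start)


if __name__ == "__main__":
    # Compare a single task with a task that owns 1 slot in 4.
    for period in (1, 4):
        for gap in (0, 3, 4, 8):
            elapsed = clocks_between(gap, period)
            print(f"slot_period={period} instructions_between={gap} "
                  f"elapsed={elapsed} stall={getmull_stall(elapsed)}")
```

Under these assumptions, a task that owns only one slot in four needs far fewer filler instructions before GETMULL than a single task does, because the slots used by the other tasks already cover most of the multiplier's latency.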
  • Cluso99 Posts: 18,069
    edited 2013-11-06 22:22
    cgracey wrote: »
    ... I hope to have a new FPGA file out soon. Thanks for your patience, Everyone. Things ARE coming together very nicely.
    Great news Chip! We are happily awaiting this huge improvement :)
  • ozpropdev Posts: 2,792
    edited 2013-11-06 22:27
    Excellent!
    Thanks Chip :)