SETMULA confusion in multi-threading.
ozpropdev
Maybe I should have titled this post "Observations of a multi-tasker Part 2"
Firstly, apologies to Chip and the GETMULL instruction.
In my post on WAITVID glitches, I accused GETMULL of not operating correctly in multi-threading mode.
I accused it of introducing an unexpected pipeline stall.
Chip tested this and found it to be OK in his current FPGA core, and thought it might have been fixed between releases.
That made me think that it was more likely that I had missed something else.
Here's the code in question:

        SETMULA reg1
        SETMULB reg2
        MOV     MX,#7
        DJNZ    MX,#$           'loop added to fix problem
        GETMULL reg3

'alternate code that works too
        SETMULA reg1
        SETMULB reg2
        GETMULL reg3
        NOP
        NOP
        NOP                     '3 NOPs added has same effect
The second version made me realise that the extra NOPs and/or loops are shifting my time slots to
a "sweet spot" that my code is happier with. Further investigation suggested that the SETMULA instruction is the
culprit. The documentation states that this instruction takes 16 cycles, which would mean I am missing a complete sweep
of my schedule as defined by SETTASK. As I am using the MATH stuff a lot, the cumulative error in my time slots has
a major effect.
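To illustrate what I mean by the extra instructions shifting the time slots, here is a rough toy model in Python. It is only a sketch and not a description of the real hardware: the slot pattern, the function names and the "one instruction per owned slot, no stalls" rule are all my assumptions.

```python
# Toy model only: a hypothetical 16-entry slot pattern with four tasks
# sharing the slots equally. Real SETTASK patterns may differ.
SLOTS = [0, 1, 2, 3] * 4


def issue_cycles(slots, task, count):
    """Clock cycles at which `task` issues its first `count` instructions,
    assuming one instruction per owned slot and no pipeline stalls."""
    if task not in slots:
        raise ValueError("task owns no time slots")
    cycles = []
    cycle = 0
    while len(cycles) < count:
        if slots[cycle % len(slots)] == task:
            cycles.append(cycle)
        cycle += 1
    return cycles


def phase_after(slots, task, n_instructions):
    """Slot index (cycle modulo pattern length) at which the instruction
    that follows `n_instructions` earlier instructions is issued."""
    return issue_cycles(slots, task, n_instructions + 1)[-1] % len(slots)


# Code after SETMULA/SETMULB/GETMULL (3 instructions) versus the same
# code after 3 additional NOPs (6 instructions).
without_nops = phase_after(SLOTS, 0, 3)
with_nops = phase_after(SLOTS, 0, 6)
```

In this model the three extra NOPs move the following instruction to a different position in the 16-slot rotation, which is the kind of realignment I suspect is hiding the real problem.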
The GETMULL instruction has already been fixed so that it no longer stalls the pipeline in multi-threading mode, which leads me to ask:
Can SETMULA and SETDIVA be fixed too?
Cheers
Brian
Edit: "New candidates for polling in multi-threading" was title.
Edit 2: "SETMULA takes 16 clock cycles" was incorrect. (Read from obsolete documentation! Oops.)
Now chasing the time slot "sweet spot" issue. Eagerly waiting for the new FPGA with the SETRACE function.
Comments
I tried to choose a title that reflected the main reason for the post.
I went around in circles for a while, I did have a few alternatives in mind such as:
"Adventures in multi-threading"
"Pipeline stall when using MUL and DIV hardware."
or
"Don't PANIC everyone, but I think I've observed another potential gotcha."
"How coffee consumption increases proportionally to intensive system analysis."
"Sorry Chip, I have some more suggestions for instruction changes."
Hmm, maybe I'll leave it for the moment.
Cheers
Brian
Also, I'm guessing that SETMULx would stall like GETMULL if the multiplier was still busy. Though, I'd imagine that the only reason you'd be in such a situation is if you were using the multiplier from two tasks, or have a bug.
And, finally, I don't understand how the NOPs help. GETMULL is not a delayed instruction, is it? You know, the new SETRACE instruction would make all of this clear really fast!
SETDIVA is also shown in the same docs as taking 16 cycles.
Are there some other docs I've missed somewhere?
No, it's not delayed, and the 3 NOPs are just a coincidence.
They seem to "align" the time slots to a happy place.
The MUL instructions are only used in 1 task.
Now there are MUL, MUL32 and MUL32U instead. Is their timing unknown?
Just measured SETMULA and it took 1 cycle.
Bring on SETTRACE!
Thanks Chip