Cordic pipeline test
ozpropdev
Posts: 2,792
in Propeller 2
Hi All
I've just been looking at the cordic again and tried a simple test.
It appears that the cordic can only deal with one operation per cog at a time.
My code is split into two tests.
    qmul    m1,m2       'first operation
    getqx   m3
    getqy   m4
    qmul    m1a,m2a     'second operation
    getqx   m3a
    getqy   m4a
and then
    qmul    m1,m2       'first operation
    waitx   wclk
    qmul    m1a,m2a     'second operation
    getqx   m3
    getqy   m4
    getqx   m3a
    getqy   m4a
Notice that the second variant uses waitx to pause a bit before sending another operation.
Here's the results
    Cordic pipeline test - Ozpropdev 2015
    Reference values
    1st result = $31F46C22
    2nd result = $13F9AC78
    Waitx value ?#28
    Test results
    1st result = $31F46C22
    2nd result = $31F46C22
    Waitx value ?#30
    Test results
    1st result = $13F9AC78
    2nd result = $13F9AC78
    Waitx value ?
A value of 28 gives the first answer and a value of 30 gives the second answer.
Included is my full test code (Support code is a work in progress)
Comments
I will review the GETQX/GETQY behavior.
In a new test I was able to start 3 math ops on cordic and get the right results.
Works great but some care is needed to synchronize the retrieval window to get correct results.
What happens if the GETQX is late by one clock?
Oz, try logging GETCNT in the test with the adjustable WAIT. I think you'll find there are extra delays occurring, i.e. an effective timeout due to having missed a result.
Do the upper and lower limits imply that a too-late read misses a value?
What happens if an interrupt fired in this time ?
Can this be changed to block instead?
The way I understand it, GETQX is waiting for an edge condition (event) but also has a timeout so as to prevent a hang. The timeout will need to be extended a little, methinks, to eliminate the need for WAITX #10.
I can't really see the point of a WAITC unless you plan to attempt hubram ops after submitting an operation and before reading the result, which seems a little crazy. Maybe it could instead be an assembler macro that just does WAITX #x for the appropriate value of x, maybe taking a relative pointer to where the corresponding Qxxx instruction is so the assembler can calculate how many ticks it should wait.
PS: There may also be potential missed loops of Hub timing adding an extra 16 clocks. Not sure if HubRAM timings apply to the CORDIC though.
Then how did ozpropdev get GETQx to return the same thing twice by making the delay too short?
When a CORDIC command is executed, the "got" flag is cleared.
When the results come back, the "got" flag is set.
Once the "got" flag is set, GETQX/GETQY release with the results.
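The three rules above can be played out in a small Python model. This is only a sketch under assumptions: `LATENCY` is a made-up tick count, not the real CORDIC latency, every instruction is treated as taking one tick, and the names `run`, `command` and `getq` are hypothetical. It is not the actual hardware logic, but it does show how a single shared "got" flag could hand back the same result twice.

```python
# Toy model of ONE shared "got" flag. All timing values are hypothetical.
LATENCY = 10  # assumed pipeline latency in ticks (not the real figure)

def run(gap):
    """Issue op 1, wait `gap` ticks, issue op 2, then read the result twice."""
    got = False
    result = None      # result register seen by GETQX
    arrivals = {}      # tick -> value coming back from the pipeline
    reads = []
    tick = 0

    def step():
        nonlocal tick, got, result
        tick += 1
        if tick in arrivals:
            result = arrivals.pop(tick)
            got = True             # results came back: flag is set

    def command(value):
        nonlocal got
        got = False                # a CORDIC command clears the flag
        arrivals[tick + LATENCY] = value
        step()

    def getq():
        while not got:             # GETQX releases only once the flag is set
            step()
        reads.append(result)       # reading does NOT clear the flag
        step()

    command("first")
    for _ in range(gap):           # stands in for WAITX
        step()
    command("second")
    getq()
    getq()
    return reads
```

In this model `run(2)` returns `['first', 'first']` and `run(12)` returns `['second', 'second']`. With a short gap the first result lands after the second command has cleared the flag, so both reads see it. With a long gap the second command clears the flag after the first result has landed, so both reads wait for the second result. That has the same shape as the 28 versus 30 behaviour reported in the test, though the real cause may differ in detail.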
How should this work for overlapped commands? It's a bit of a brain bender.
I think "got" needs to be cleared on GETQX/GETQY. In fact, GETQX/GETQY would each need their own "got" flag. They'd get set on results arriving and cleared on results being read by GETQX/GETQY. Is that the recipe we need? I think they'd also need to be cleared on a CORDIC command.
Is this what we need? :
X and Y "got" flags cleared on CORDIC command
X and Y "got" flags set on results arrival
X and Y "got" flags individually cleared on GETQX and GETQY
That's simple to do, anyway. We just need to nail down how the flags should behave. Would that behavior give us what we want?
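To check whether those rules give the wanted behaviour, here is a rough Python model of separate X and Y "got" flags. It is a sketch, not the Verilog: `LATENCY`, `LIMIT`, the one-tick-per-instruction timing and the `Cordic` class are all assumptions made for illustration. The two commands are spaced a few ticks apart so that each X/Y pair is read before the next pair arrives.

```python
LATENCY = 10   # assumed pipeline latency in ticks (hypothetical)
LIMIT = 100    # model gives up here; real hardware would keep waiting

class Cordic:
    """Toy model: per-register 'got' flags, cleared on command and on read."""

    def __init__(self):
        self.tick = 0
        self.x = self.y = None
        self.got_x = self.got_y = False
        self.arrivals = {}                       # tick -> (x, y)

    def step(self):
        self.tick += 1
        if self.tick in self.arrivals:
            self.x, self.y = self.arrivals.pop(self.tick)
            self.got_x = self.got_y = True       # set on results arrival

    def idle(self, ticks):
        for _ in range(ticks):
            self.step()

    def command(self, x, y):
        self.got_x = self.got_y = False          # cleared on CORDIC command
        self.arrivals[self.tick + LATENCY] = (x, y)
        self.step()

    def _get(self, reg):
        flag = "got_" + reg
        while not getattr(self, flag):
            if self.tick >= LIMIT:
                return "stall"                   # no result will ever arrive
            self.step()
        setattr(self, flag, False)               # cleared by its own GETQx only
        value = getattr(self, reg)
        self.step()
        return value

    def getqx(self):
        return self._get("x")

    def getqy(self):
        return self._get("y")

def demo():
    c = Cordic()
    c.command("x1", "y1")
    c.idle(3)                                    # spacing between commands
    c.command("x2", "y2")
    reads = [c.getqx(), c.getqy(), c.getqx(), c.getqy()]
    extra = c.getqx()                            # a second GETQX with no new command
    return reads, extra
```

In this model `demo()` returns `(['x1', 'y1', 'x2', 'y2'], 'stall')`: the overlapped results come out in order without any WAITX, and a further GETQX with no new command has nothing to release on.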
ie an interrupt may delay the read of queued results.
-or, it may be simpler to queue at the cordic front end, if the true fifo like operation needs too much extra logic.
It does need to avoid timing changes changing the nett results.
If you're not overlapping commands, it should work fine, as is.
I just compiled with the new rules I listed above and it seems to work as expected. You get to do GETQX/GETQY only once after a CORDIC command now. This should allow you to overlap commands without any WAITX's. The next FPGA release will have this behavior.
Lol, here comes the scourge that is interrupts - and introduced after the CORDIC too. The pipeline is already there but yeah, an additional FIFO for each Cog's results may be the only reliable solution.
And this isn't even directly addressing the current issue.
I'm not seeing why a FIFO is needed.
You give a command, and when it's done you can read the results. How would interrupts destroy this? Are you assuming the interrupt code is going to use the CORDIC, too? That would not work if mainline code was already using it.
Without bold print rules, overlapping CORDIC commands and interrupts will be mixed together.
Ah, I see. You won't be able to get your results out in time before they get overwritten. Then, when you go to get the second results, you lock up because there's no new data. Better not do overlapping CORDIC commands if you've got interrupts occurring. That's not a hardship, really. People just have to be told. Putting a FIFO in there to deal with that would be extreme overkill. Wait... it would only have to have one level to overcome the problem. Maybe later we'll deal with this, if it seems necessary.
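The overwrite-then-lock-up sequence described above can be sketched in a few lines of Python. This assumes a single result register with a flag that is set on arrival and cleared on read. The function name `getqx_sequence` and all tick values are hypothetical, and `None` stands in for a GETQX that would never release.

```python
def getqx_sequence(arrivals, start, reads=2):
    """Model back-to-back GETQX reads against a single result register.

    arrivals: list of (tick, value) pairs sorted by tick.
    start:    tick at which the first GETQX executes.
    """
    out = []
    t = start
    consumed = -1   # tick of the newest arrival already read or overwritten
    for _ in range(reads):
        # Results that landed by tick t overwrite each other; the latest survives.
        landed = [(tk, v) for tk, v in arrivals if consumed < tk <= t]
        if landed:
            tk, v = landed[-1]
        else:
            pending = [(tk, v) for tk, v in arrivals if tk > t]
            if not pending:
                out.append(None)   # no new data coming: the read would hang
                break
            tk, v = pending[0]     # wait for the next result to arrive
            t = tk
        out.append(v)
        consumed = tk
        t += 1
    return out
```

With results arriving at ticks 10 and 14, `getqx_sequence([(10, "first"), (14, "second")], 5)` returns `['first', 'second']`, while a read delayed to tick 40 (say, by an interrupt) returns `['second', None]`: the first result is gone and the second read has nothing left to wait for.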
(is the special case of same-clock read and next-arrive handled ok ?)
Not mixing queued CORDIC and interrupts is tolerable, as that is a common rule already - most MCU floating point libraries are not reentrant.
That said, using IRQ blocking is a workaround, and probably the one that'll be most used.
Not on current code, given ozpropdev's examples.
There, a same-as-before result occurs.
I'm not sure what Chip's upcoming changes will do to the examples.
From above:
"In fact, GETQX/GETQY would each need their own "got" flag." ...
"I just compiled with the new rules I listed above and it seems to work as expected. You get to do GETQX/GETQY only once after a CORDIC command now. This should allow you to overlap commands without any WAITX's. The next FPGA release will have this behavior. "
I think the rare but possible case where a new result arrives on the same clock edge as the read of the previous one may need special care.