Cordic pipeline test — Parallax Forums


Hi All
I've just been looking at the cordic again and tried a simple test.
It appears that the cordic can only deal with one operation per cog at a time.
My code is split into two tests.
		qmul	m1,m2			'first operation
		getqx	m3
		getqy	m4
		qmul	m1a,m2a			'second operation
		getqx	m3a
		getqy	m4a

and then
		qmul	m1,m2			'first operation
		waitx 	wclk		
		qmul	m1a,m2a			'second operation
		getqx	m3
		getqy	m4
		getqx	m3a
		getqy	m4a

Notice that the second variant uses waitx to pause a bit before sending another operation.
Here are the results
 Cordic pipeline test - Ozpropdev 2015
 Reference values
 1st result = $31F46C22
 2nd result = $13F9AC78
 Waitx value ?#28
 Test results
 1st result = $31F46C22
 2nd result = $31F46C22
 Waitx value ?#30
 Test results
 1st result = $13F9AC78
 2nd result = $13F9AC78
 Waitx value ?

A value of 28 gives the first answer and a value of 30 gives the second answer.
Included is my full test code (Support code is a work in progress)



Comments

  • cgracey Posts: 14,209
    edited 2015-10-24 15:06
    Try getting rid of the WAITX. There's an opportunity every 16 clocks to start another CORDIC operation.

    I will review the GETQX/GETQY behavior.
  • Thanks Chip.
    In a new test I was able to start 3 math ops on cordic and get the right results.
    Works great but some care is needed to synchronize the retrieval window to get correct results.
    		qmul	m1,m2			'first operation
    		qmul	m1a,m2a			'second operation
    		qmul	m1b,m2b			'third operation
    		getqx	m3			'first result
    		getqy	m4
    		waitx 	#25			'10-25 works		
    		getqx	m3a			'second result
    		getqy	m4a
    		waitx 	#25		
    		getqx	m3b			'third result
    		getqy	m4b
    


  • evanh Posts: 16,041
    edited 2015-10-25 03:43
    Can up to seven instructions fill between each QMUL without missing a timing slot? And presumably a similar number of instructions replace the WAITs as well?

    What happens if the GETQX is late by one clock?
  • evanh Posts: 16,041
    Ah, forget the GETQX question. I can see from reading Oz's results at least part of the answer is likely there already.

    Oz, try logging GETCNT in the test with the adjustable WAIT. I think you'll find there are extra delays occurring, i.e. an effective timeout due to having missed a result.
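
    A minimal sketch of that measurement, based on the second test variant from the first post. The operand form of GETCNT and the scratch registers t0 and t1 are assumptions added for illustration; this has not been run on the FPGA image.

    ```
    		qmul	m1,m2			'first operation
    		waitx	wclk			'adjustable delay under test
    		qmul	m1a,m2a			'second operation
    		getcnt	t0			'timestamp before the reads (assumed form: getcnt D)
    		getqx	m3
    		getqy	m4
    		getqx	m3a
    		getqy	m4a
    		getcnt	t1			'timestamp after the reads
    		sub	t1,t0			't1 = clocks elapsed across the four reads

    t0		long	0			'hypothetical scratch registers
    t1		long	0
    ```

    Comparing the elapsed count for a working WAITX value against a failing one should show whether a GETQx is stalling longer than expected. The count includes the execution time of the read instructions themselves, so only the difference between runs is meaningful.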

  • jmg Posts: 15,175
    ozpropdev wrote: »
    waitx #25 '10-25 works
    ....
    waitx #25
    Interesting, is that 10~25 range valid for both waits ?
    Do the upper and lower limits imply that a too-late read missed a value ?
    What happens if an interrupt fired in this time ?


  • If you read the result too early, do you get the previous result or what?

    Can this be changed to block instead?
  • Here are the results for too early (9) and too late (26)
     Cordic pipeline test - Ozpropdev 2015
     Reference values
     1st result = $31F46C22
     2nd result = $13F9AC78
     3rd result = $041EB7B8
     Waitx value ?9
     Test results
     1st result = $31F46C22
     2nd result = $31F46C22
     3rd result = $13F9AC78
     Waitx value ?#26
     Test results
     1st result = $31F46C22
     2nd result = $041EB7B8
     3rd result = $041EB7B8
     Waitx value ?
    
    
  • evanh Posts: 16,041
    edited 2015-10-25 05:00
    Oh, don't look too good does it. I think I understand though. Try GETCNT to see if there is a timeout occurring. I bet there is when the WAIT is too large.

    The way I'm understanding is the GETQX is waiting for an edge condition (event) but also has a timeout so as to prevent a hang. The timeout will need extended a little, me thinks, to eliminate the need for WAITX #10.
  • jmg Posts: 15,175
    Interesting, if that is as-intended behaviour, maybe a WAITC (wait Cordic) is needed, which would act like a FIFO read and tolerate interrupts ?
  • Electrodude Posts: 1,660
    edited 2015-10-25 05:24
    So getqx and getqy return whatever answer is currently waiting for your cog to read, and it gets overwritten when it gets overwritten and doesn't get cleared when you read it?

    I can't really see the point of a WAITC unless you plan to attempt hubram ops after submitting an operation and before reading the result, which seems a little crazy. Maybe it could instead be an assembler macro that just does WAITX #x for the appropriate value of x, maybe taking a relative pointer to where the corresponding Qxxx instruction is so the assembler can calculate how many ticks it should wait.
  • evanh Posts: 16,041
    So getqx and getqy return whatever answer is currently stored for your cog, and it gets overwritten when it gets overwritten?
    No, the GETQx instructions do wait for the next result to drop off the CORDIC pipe. Adding a WAITC won't help.

  • evanh Posts: 16,041
    edited 2015-10-25 05:32
    But if a result doesn't turn up in time then you do get whatever last came out. That's my understanding.

    PS: There may also be potential missed loops of Hub timing adding an extra 16 clocks. Not sure if HubRAM timings apply to the CORDIC though.
  • evanh wrote: »
    So getqx and getqy return whatever answer is currently stored for your cog, and it gets overwritten when it gets overwritten?
    No, the GETQx instructions do wait for the next result to drop off the CORDIC pipe. Adding a WAITC won't help.

    Then how did ozpropdev get GETQx to return the same thing twice by making the delay too short?

  • evanh Posts: 16,041
    That's where the length of the hard-coded GETQx timeout is important. If it's too short then the buff of WAITX #10 helps it out. I'm speculating of course.
  • jmg Posts: 15,175
    Then how did ozpropdev get GETQx to return the same thing twice by making the delay too short?
    Yes, Looks like this is not quite working as expected....

  • cgracey Posts: 14,209
    The way it works now is like this:

    When a CORDIC command is executed, the "got" flag is cleared.
    When the results come back, the "got" flag is set.
    Once the "got" flag is set, GETQX/GETQY release with the results.

    How should this work for overlapped commands? It's a bit of a brain bender.

    I think "got" needs to be cleared on GETQX/GETQY. In fact, GETQX/GETQY would each need their own "got" flag. They'd get set on results arriving and cleared on results being read by GETQX/GETQY. Is that the recipe we need? I think they'd also need to be cleared on a CORDIC command.

    Is this what we need? :

    X and Y "got" flags cleared on CORDIC command
    X and Y "got" flags set on results arrival
    X and Y "got" flags individually cleared on GETQX and GETQY

    That's simple to do, anyway. We just need to nail down how the flags should behave. Would that behavior give us what we want?
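
    As a reading aid, the proposed rules can be traced against the simple non-overlapped case from the first post. This is only an annotated sketch of the behavior described in this post, not something verified on hardware; the register names are the ones from the original test.

    ```
    		qmul	m1,m2			'CORDIC command: X and Y "got" flags cleared
    						'...results arrive: X and Y "got" flags set
    		getqx	m3			'releases once X "got" is set, then clears X "got"
    		getqy	m4			'releases once Y "got" is set, then clears Y "got"
    ```

    Under these rules a repeated GETQX with no new result pending would find its flag already cleared, so it would wait rather than return the old value again.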
  • jmg Posts: 15,175
    I think it needs a genuine pipeline alongside the ripple-thru flags, just like a FIFO.
    ie an interrupt may delay the read of queued results.
    -or, it may be simpler to queue at the cordic front end, if the true fifo like operation needs too much extra logic.
    It does need to avoid timing changes changing the nett results.
  • cgracey Posts: 14,209
    jmg wrote: »
    I think it needs a genuine pipeline alongside the ripple-thru flags, just like a FIFO.
    ie an interrupt may delay the read of queued results.
    -or, it may be simpler to queue at the cordic front end, if the true fifo like operation needs too much extra logic.
    It does need to avoid timing changes changing the nett results.

    If you're not overlapping commands, it should work fine, as is.

    I just compiled with the new rules I listed above and it seems to work as expected. You get to do GETQX/GETQY only once after a CORDIC command now. This should allow you to overlap commands without any WAITX's. The next FPGA release will have this behavior.
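
    A sketch of what the overlapped three-operation test from earlier in the thread might look like with the WAITX instructions removed. It assumes the revised flag behavior described in this post and has not been tested against the new FPGA image; the registers are the ones used in the earlier test.

    ```
    		qmul	m1,m2			'first operation
    		qmul	m1a,m2a			'second operation
    		qmul	m1b,m2b			'third operation
    		getqx	m3			'first result, waits for the X flag
    		getqy	m4
    		getqx	m3a			'second result, no WAITX needed if the flags behave as described
    		getqy	m4a
    		getqx	m3b			'third result
    		getqy	m4b
    ```

    Each GETQX/GETQY pair is expected to block until its own result arrives, so the reads must still be issued in the same order as the commands and before a later result overwrites an unread one.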
  • evanh Posts: 16,041
    edited 2015-10-25 09:40
    jmg wrote: »
    I think it needs a genuine pipeline alongside the ripple-thru flags, just like a FIFO.
    ie an interrupt may delay the read of queued results.
    -or, it may be simpler to queue at the cordic front end, if the true fifo like operation needs too much extra logic.
    It does need to avoid timing changes changing the nett results.

    Lol, here comes the scourge that is interrupts - and introduced after the CORDIC too. The pipeline is already there but yeah, an additional FIFO for each Cog's results may be the only reliable solution.

    And this isn't even directly addressing the current issue.

  • cgracey Posts: 14,209
    evanh wrote: »
    jmg wrote: »
    I think it needs a genuine pipeline alongside the ripple-thru flags, just like a FIFO.
    ie an interrupt may delay the read of queued results.
    -or, it may be simpler to queue at the cordic front end, if the true fifo like operation needs too much extra logic.
    It does need to avoid timing changes changing the nett results.

    Lol, here comes the scourge that is interrupts - and introduced after the CORDIC too. The pipeline is already there but yeah, an additional FIFO for each Cog's results may be the only reliable solution.

    And this isn't even directly addressing the current issue.

    I'm not seeing why a FIFO is needed.

    You give a command, and when it's done you can read the results. How would interrupts destroy this? Are you assuming the interrupt code is going to use the CORDIC, too? That would not work if mainline code was already using it.
  • evanh Posts: 16,041
    cgracey wrote: »
    If you're not overlapping commands, it should work fine, as is.

    Without bold print rules, overlapping CORDIC commands and interrupts will be mixed together. :(
  • cgracey Posts: 14,209
    edited 2015-10-25 09:55
    evanh wrote: »
    cgracey wrote: »
    If you're not overlapping commands, it should work fine, as is.

    Without bold print rules, overlapping CORDIC commands and interrupts will be mixed together. :(

    Ah, I see. You won't be able to get your results out in time before they get overwritten. Then, when you go to get the second results, you lock up because there's no new data. Better not do overlapping CORDIC commands if you've got interrupts occurring. That's not a hardship, really. People just have to be told. Putting a FIFO in there to deal with that would be extreme overkill. Wait... it would only have to have one level to overcome the problem. Maybe later we'll deal with this, if it seems necessary.
  • jmg Posts: 15,175
    edited 2015-10-25 18:34
    cgracey wrote: »
    Ah, I see. You won't be able to get your results out in time before they get overwritten. Then, when you go to get the second results, you lock up because there's no new data. Better not do overlapping CORDIC commands if you've got interrupts occurring. That's not a hardship, really. People just have to be told. ...
    Most important is not having read-delays affect the result delivery, which I think you have fixed.
    (is the special case of same-clock read and next-arrive handled ok ?)

    Not mixing queued CORDIC and interrupts is tolerable, as that is a common rule already - most MCU floating point libraries are not reentrant.


  • evanh Posts: 16,041
    edited 2015-10-25 21:37
    I wouldn't put this in the same category for one reason: The whole Cog becomes banned, not just the ISR.

    That said, using IRQ blocking is a workaround, and probably the one that'll be most used.
  • evanh Posts: 16,041
    On the current problem though, I'm a bit uneasy that there is potential for total lock-up on a GETQx instruction. If I'm reading Chip right then an incorrectly placed GETQx can wait forever.
  • jmg Posts: 15,175
    evanh wrote: »
    On the current problem though, I'm a bit uneasy that there is potential for total lock-up on a GETQx instruction. If I'm reading Chip right then an incorrectly placed GETQx can wait forever.

    Not on current code, given ozpropdev's examples.
    There, a same-as-before result occurs.

    I'm not sure what Chip's upcoming changes will do to the examples.

  • evanh Posts: 16,041
    I have a separate question too: What are the timings for Cog access, namely the GETQx instructions, to the CORDIC?
  • evanh Posts: 16,041
    And does GETQX or GETQY reset the result flag? Does it matter what the instruction order is? What if the next result arrives when only half the result has been read?
  • jmg Posts: 15,175
    edited 2015-10-25 23:19
    evanh wrote: »
    And does GETQX or GETQY reset the result flag? Does it matter what the instruction order is? What if the next result arrives when only half the result has been read?
    I think Chip has now expanded to a flag each ?
    From above:
    "In fact, GETQX/GETQY would each need their own "got" flag." ...

    "I just compiled with the new rules I listed above and it seems to work as expected. You get to do GETQX/GETQY only once after a CORDIC command now. This should allow you to overlap commands without any WAITX's. The next FPGA release will have this behavior. "


    I think the rare but possible case of new-result arrives with read-previous on the same clock edge, may need special care.
  • cgracey Posts: 14,209
    I've got the read/new result overlap case covered. I expect to have new FPGA files out today. Just need to compile for all 4 boards.