64 / 32 => 32 Unsigned Divide
Phil Pilgrim (PhiPi)
Posts: 23,514
I needed a division routine that divides a 64-bit unsigned dividend by a 32-bit unsigned divisor to return a 32-bit quotient and 32-bit remainder, leaving the original carry bit unchanged. The only limitation would have to be that the high long of the dividend be less than the divisor (i.e. no overflow). I couldn't find a routine like this in the forum or in the object exchange (although I'm sure that deSilva, who has a memory like a steel trap, can point to one immediately). Here's the code I came up with:
It's pretty straightforward, except for the inconvenient fact that cmpsub doesn't treat the carry flag as a most-significant bit when doing its comparison; so that condition has to be tested separately. (The consequence of this behavior for division routines that use cmpsub exclusively is that the high bit of the divisor must be zero.) The original carry bit simply propagates through rb and back out again, so is preserved.
Exclusive of the call and return, this routine takes 8.2µs to execute with an 80MHz clock.
-Phil
'Divide ra:rb by rx, leaving quotient in rb, remainder in ra, and rx unchanged. 'Precondition: ra < rx. divabx [b]mov[/b] cctr,#32 'Initialize loop counter. :loop [b]rcl[/b] rb,#1 [b]wc[/b] 'Rotate quotient bit in from carry, [b]rcl[/b] ra,#1 [b]wc[/b] ' and rotate dividend out to carry [b]if_c[/b] [b]sub[/b] ra,rx 'rx < carry:ra if carry set, so just subtract, leaving carry set. [b]if_nc[/b] [b]cmpsub[/b] ra,rx [b]wc[/b] 'Otherwise, use cmpsub to do the dirty work. [b]djnz[/b] cctr,#:loop 'Back for more. [b]rcl[/b] rb,#1 [b]wc[/b] 'Rotate last quotient bit into place, restoring original carry. divabx_ret [b]ret[/b]
It's pretty straightforward, except for the inconvenient fact that cmpsub doesn't treat the carry flag as a most-significant bit when doing its comparison; so that condition has to be tested separately. (The consequence of this behavior for division routines that use cmpsub exclusively is that the high bit of the divisor must be zero.) The original carry bit simply propagates through rb and back out again, so is preserved.
Exclusive of the call and return, this routine takes 8.2µs to execute with an 80MHz clock.
-Phil
Comments
Which is essentially the same.
Note that they have not explicitely stated as Phil did, that "....the high long of the dividend be less than the divisor."
Now if we imagine we had only 16 bits words at our disposal how would that routine had to be modified?
We see immediately that we will already run into trouble with the first instruction (SHL #15). So, we first reduce our requirements to dividing 63 bit numbers by 31 bit numbers only Tis is not so grave, as signed numbers are just of that kind...
Now directly adapting it to 63/31 bit division would yield:
It is obvious now, that we can save an instruction by CAREFULLY exchanging CMPSUB and the RCLs
This looks like a straightforward transformation... It comes out very nicely withthe preserved carry flag. I think the special consideration Phil took for the CMPSUB overflow has to do with his strive for 64/32 rather than 63/31 division. I THINK it works, but I am not absolutely sure
Post Edited (deSilva) : 1/27/2008 12:17:10 AM GMT