Please help with 64 bit math (pasm) again?
average joe
Posts: 795
Hi guys, I've been using my 64 bit math object for a while and it's working just fine. But now I needed a new function (a 64 bit average over 2^n samples) and decided to clean the object up a bit. I've finished both tasks but haven't tested either. Looking at the code, I think it could still be improved. First, the hub access at the entry (once the pointers and function are loaded in Spin) could probably be optimized; I'm just not seeing how right now. I don't think the write-back can be improved, even though it has the equivalent of 6 no-ops. Those aren't big improvements anyway.
The big change in semantics is that instead of passing 2 pointers to 64 bit numbers, I'm now passing a pointer to a buffer of 64 bit numbers (read-only in PASM) and the size of the buffer. I have a couple of other functions to implement at some point (e.g. calculating a checksum on a byte buffer) that will use the same entry method (pointer to buffer and size).
DAT
'' mailboxes
results         long    0 [2]
function        long    0                       ' local copy of function
a1ptr           long    0                       ' local copy of argument1 pointer
a2ptr           long    0                       ' local copy of argument2 pointer

PUB Average(SamplePtr, Number)
{{
//////////////////////////////////////////////////////////////////////////////////
// SamplePtr is pointer first sample of buffer                                  //
// Number is power of 2 number of samples to average (2^n) = samples            //
// result of addition is stored in arg1prt longs                                //
//////////////////////////////////////////////////////////////////////////////////
}}
'' arg1_l1 - address of 2 long object local result buffer, pre-loaded with the address of sample buffer
'' arg1_l2 - could carry ?? on call
'' arg2_11 - Sample buffer size expressed as 2^N
'' ?result? is arg2_l2, write back address of results in ?pasm?
  results := SamplePtr
  DoMath(@results, @Number, "v")
  return @results

PUB Divide(arg1ptr, arg2ptr) '' Perform 64 bit divide
{{
//////////////////////////////////////////////////////////////////////////////////
// arg1ptr is pointer to least significant long of dividend                     //
// arg2ptr is pointer to divisor                                                //
// result of division is stored in arg1ptr longs                                //
// remainder is stored in arg2ptr long                                          //
//////////////////////////////////////////////////////////////////////////////////
}}
  DoMath(arg1ptr, arg2ptr, "d")

PRI DoMath(arg1ptr, arg2ptr, func) '' do the math operation
  a1ptr := arg1ptr                              ' move pointers to mailboxes
  a2ptr := arg2ptr                              ' move pointers to mailboxes
  function := func                              ' move pointers to mailboxes
  repeat while function                         ' wait for asm cog to finish
What I am really trying to wrap my head around is doing the power of 2 divide the RIGHT way, using the flags. What I have so far:
ave             ' calculate number of samples
                mov     t3, arg2_l1             ' copy arg2 long 1 into average counts_power of two
                mov     t1, #1                  '
                shl     t1, t3                  ' shift t6 by count, 2^N
                sub     t1, #1                  ' sub 1 from total, first value preloaded
                ' now deal with addresses
                mov     t4, arg1_l1             ' copy address of buffer to var
                mov     t5, t4                  ' and save a copy in work_add
                ' load the first sample
                rdlong  arg1_l1, t5             ' get lsl of argument 1
                add     t5, #4                  ' add long offset to work pointer
                rdlong  arg1_l2, t5             ' get msl of argument 1
                add     t5, #4                  ' add long offset to work pointer
ave_add         ' add remaining samples
                call    #arg2_fill              ' put next sample into arg2 longs
                call    #addi                   ' 64 bit add,
                sub     t1, #1                  ' sub 1 from count
                tjnz    t1, #ave_add            ' until we have added all samples
'' power of 2 divide
                'prepare mask
                mov     t6, #0                  ' move 0 into t6
                mov     t1, t3                  ' move power of two into t1
pmask           shl     t6, #1                  ' shift t6 left by 1
                and     t6, #1                  ' and 1 to t6
                djnz    t1, #pmask              ' loop
                ' do division on arg1 longs, shift right by N
                shr     arg1_l1, t3             ' truncate lsb
                mov     t2, arg1_l2             ' copy msl
                and     t2, t6                  ' mask off overflow
                mov     t1, #32                 ' prepare inverse of power of 2
                sub     t1, t3                  ' sub power of 2 from 32
                shl     t2, t1                  ' shift overflow left by inverse power of 2
                and     arg1_l1, t2             ' and overflow from long 2 to long 1
                shr     arg1_l2, t3             ' truncate lsb
ave_ret         ret
Seems a bit hefty for pasm?
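For comparison, here is a minimal sketch of what a flag-based version might look like: shift the 64 bit value right one bit per pass, letting the C flag carry bit 0 of the high long into bit 31 of the low long. It reuses arg1_l1, arg1_l2, t1 and t3 from the code above; the shr64 label is made up for illustration, and the routine is untested and treats the value as unsigned.

```
' Hypothetical sketch, not from the original object: 64 bit shift right by N.
' Assumes N (0..63) is in t3, low long in arg1_l1, high long in arg1_l2.
shr64           mov     t1, t3          wz      ' t1 := N, Z is set when N = 0
        if_z    jmp     #shr64_ret              ' nothing to shift, and djnz would wrap on 0
:bit            shr     arg1_l2, #1     wc      ' high long right by 1, old bit 0 goes to C
                rcr     arg1_l1, #1             ' low long right by 1, C comes in at bit 31
                djnz    t1, #:bit               ' repeat N times
shr64_ret       ret
```

This costs three instructions per bit, so it trades a little time for not needing a mask or the 32 - N shift at all. If the sum is meant to stay signed, swapping the shr on the high long for sar should preserve the sign bit, since sar with wc also puts the old bit 0 into C.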
The FULL code for easy preview
{{
//////////////////////////////////////////////////////////////////////////////////
// 64 bit signed addition and subtraction methods                               //
// Author    : Joe Heinz                                                        //
// Copyright : Joe Heinz - 2013                                                 //
// Grateful acknowledgement to kuroneko (parallax forum)                        //
//////////////////////////////////////////////////////////////////////////////////
}}
DAT
'' mailboxes
results         long    0 [2]
function        long    0                       ' local copy of function
a1ptr           long    0                       ' local copy of argument1 pointer
a2ptr           long    0                       ' local copy of argument2 pointer
cog             byte    0                       ' cog PASM object is running in

PUB NULL ' not a top level object

PUB Start'(controlPin) '' Start math object in new cog
{{
//////////////////////////////////////////////////////////////////////////////////
// controlPin is pin used to indicate there is a math operation to do           //
// returns cog ASM engine was run in or false if no cog available               //
//////////////////////////////////////////////////////////////////////////////////
}}
  cog := cognew(@entry, @function) +1           ' start asm cog
  return cog                                    ' return cog

PUB Stop '' Stop Math Cog
  if cog
    cogstop(cog~~ - 1)

PUB Average(SamplePtr, Number)
{{
//////////////////////////////////////////////////////////////////////////////////
// SamplePtr is pointer first sample of buffer                                  //
// Number is power of 2 number of samples to average (2^n) = samples            //
// result of addition is stored in arg1prt longs                                //
//////////////////////////////////////////////////////////////////////////////////
}}
'' arg1_l1 - address of 2 long object local result buffer, pre-loaded with the address of sample buffer
'' arg1_l2 - could carry ?? on call
'' arg2_11 - Sample buffer size expressed as 2^N
'' ?result? is arg2_l2, write back address of results in ?pasm?
  results := SamplePtr
  DoMath(@results, @Number, "v")
  return @results

PUB Divide(arg1ptr, arg2ptr) '' Perform 64 bit divide
{{
//////////////////////////////////////////////////////////////////////////////////
// arg1ptr is pointer to least significant long of dividend                     //
// arg2ptr is pointer to divisor                                                //
// result of division is stored in arg1ptr longs                                //
// remainder is stored in arg2ptr long                                          //
//////////////////////////////////////////////////////////////////////////////////
}}
  DoMath(arg1ptr, arg2ptr, "d")

PUB Multiply(arg1ptr, arg2ptr) '' Perform 64 bit multiply
{{
//////////////////////////////////////////////////////////////////////////////////
// arg1ptr is pointer to least significant long of multiplicand                 //
// arg2ptr is pointer to multiplier                                             //
// result of multiplication is stored in arg1ptr longs                          //
//////////////////////////////////////////////////////////////////////////////////
}}
  DoMath(arg1ptr, arg2ptr, "m")

PUB Addition(arg1ptr, arg2ptr) '' Perform 64 bit add
{{
//////////////////////////////////////////////////////////////////////////////////
// arg1ptr is pointer to least significant long of argument1                    //
// arg2ptr is pointer to least significant long of argument2                    //
// result of addition is stored in arg1ptr longs                                //
//////////////////////////////////////////////////////////////////////////////////
}}
  DoMath(arg1ptr, arg2ptr, "a")

PUB Subtract(arg1ptr, arg2ptr) '' Perform 64 bit subtract
{{
//////////////////////////////////////////////////////////////////////////////////
// arg1ptr is pointer to least significant long of host time, minuend           //
// arg2ptr is pointer to least significant long of argument2, subtractend       //
// result of subtraction is stored in arg1ptr longs                             //
//////////////////////////////////////////////////////////////////////////////////
}}
  DoMath(arg1ptr, arg2ptr, "s")

PRI DoMath(arg1ptr, arg2ptr, func) '' do the math operation
  a1ptr := arg1ptr                              ' move pointers to mailboxes
  a2ptr := arg2ptr                              ' move pointers to mailboxes
  function := func                              ' move pointers to mailboxes
  repeat while function                         ' wait for asm cog to finish

DAT
                org
entry           mov     fn_ptr, par             ' get first fn_ptr of passing method,
loop            rdlong  funct, fn_ptr           ' read the function type *add or subtract*
                cmp     funct, #0       wz      ' if zero
        if_z    jmp     #loop                   ' try again
                ' now get pointers to arg1 and arg2
                add     fn_ptr, #4              ' add long offset to hub pointer
                rdlong  arg1_ptr, fn_ptr        ' get pointer to argument 1
                add     fn_ptr, #4              ' add long offset to hub pointer
                rdlong  arg2_ptr, fn_ptr        ' get pointer to argument 2
                sub     fn_ptr, #8              ' subtract 2 long offsets, point back at function
                ' get 2 longs from arg1 pointer
                rdlong  arg1_l1, arg1_ptr       ' get lsl of argument 1
                add     arg1_ptr, #4            ' add long offset to arg1 pointer
                rdlong  arg1_l2, arg1_ptr       ' get msl of argument 1
                ' get 2 longs from arg2 pointer
                mov     t5, arg2_ptr
                call    #arg2_fill
{
                rdlong  arg2_l1, arg2_ptr       ' get lsl of argument 2
                add     arg2_ptr, #4            ' add long offset to arg1 pointer
                rdlong  arg2_l2, arg2_ptr       ' get msl of argument 2
}
                ' do function
                cmp     funct, #"a"     wz
        if_z    call    #addi
                cmp     funct, #"s"     wz
        if_z    call    #subt
                cmp     funct, #"m"     wz
        if_z    call    #mult
                cmp     funct, #"d"     wz
        if_z    call    #div
                cmp     funct, #"v"     wz
        if_z    call    #ave
                ' write back 2 longs from arg1 pointer
write           wrlong  arg1_l2, arg1_ptr       ' write most significant long of result
                sub     arg1_ptr, #4            ' subtract long offset from argument1 pointer
                wrlong  arg1_l1, arg1_ptr       ' write least significant long of result
                ' write back 2 longs from arg2 pointer
                wrlong  arg2_l1, arg2_ptr       ' write most significant long of result
                add     arg2_ptr, #4            ' add long offset from argument2 pointer
                                                ' this differs since we read in sub
                wrlong  arg2_l2, arg2_ptr       ' write least significant long of result
                ' zero out function
                wrlong  zero, fn_ptr            ' clear function to return to caller
                jmp     #loop                   ' back to loop start

arg2_fill       rdlong  arg2_l1, t5             ' get lsl of argument 2
                add     t5, #4                  ' add long offset to work pointer
                rdlong  arg2_l2, t5             ' get msl of argument 2
                add     t5, #4                  ' add long offset to work pointer
arg2_fill_ret   ret

ave             ' calculate number of samples
                mov     t3, arg2_l1             ' copy arg2 long 1 into average counts_power of two
                mov     t1, #1                  '
                shl     t1, t3                  ' shift t6 by count, 2^N
                sub     t1, #1                  ' sub 1 from total, first value preloaded
                ' now deal with addresses
                mov     t4, arg1_l1             ' copy address of buffer to var
                mov     t5, t4                  ' and save a copy in work_add
                ' load the first sample
                rdlong  arg1_l1, t5             ' get lsl of argument 1
                add     t5, #4                  ' add long offset to work pointer
                rdlong  arg1_l2, t5             ' get msl of argument 1
                add     t5, #4                  ' add long offset to work pointer
ave_add         ' add remaining samples
                call    #arg2_fill              ' put next sample into arg2 longs
                call    #addi                   ' 64 bit add,
                sub     t1, #1                  ' sub 1 from count
                tjnz    t1, #ave_add            ' until we have added all samples
'' power of 2 divide
                'prepare mask
                mov     t6, #0                  ' move 0 into t6
                mov     t1, t3                  ' move power of two into t1
pmask           shl     t6, #1                  ' shift t6 left by 1
                and     t6, #1                  ' and 1 to t6
                djnz    t1, #pmask              ' loop
                ' do division on arg1 longs, shift right by N
                shr     arg1_l1, t3             ' truncate lsb
                mov     t2, arg1_l2             ' copy msl
                and     t2, t6                  ' mask off overflow
                mov     t1, #32                 ' prepare inverse of power of 2
                sub     t1, t3                  ' sub power of 2 from 32
                shl     t2, t1                  ' shift overflow left by inverse power of 2
                and     arg1_l1, t2             ' and overflow from long 2 to long 1
                shr     arg1_l2, t3             ' truncate lsb
ave_ret         ret

div             mov     t5, #0                  ' prepare MSB place
findm           rol     arg2_l1, #1     wc      ' rotate divisor left 1 place and check for overflow
        if_nc   add     t5, #1                  ' if no overflow, add 1 to MSB place
        if_nc   jmp     #findm                  ' repeat loop until overflow
                ror     arg2_l1, #1             ' rotate back one
                mov     t6, #1                  ' prepare t6
                mov     t1, t5                  ' move msb shift factor to t1 for itteration count
                shl     t6, t5                  ' move t6 by msb shift factor
div2            cmpsub  arg1_l2, arg2_l1 wc     ' subtract divsor from dividend and watch for subtraction
                muxc    t4, t6                  ' mux c flag to t4 at t6 place
                cmpsub  t1, #1          wz      ' subtract 1 from shift factor watching for last division
        if_nz   shr     t6, #1                  ' shift t6 right by 1
        if_nz   shr     arg2_l1, #1             ' shift divisor right by 1
        if_nz   jmp     #div2                   ' and do loop again
                shr     t6, #1                  ' shift t6 right by 1
                shr     arg2_l1, #1             ' shift divisor right by 1
                cmpsub  arg1_l2, arg2_l1 wc     ' subtract divsor from dividend and watch for subtraction
                muxc    t4, t6                  ' mux c flag to t4 at t6 place
                sub     t1, #1          wz      ' subtract 1 from shift factor watching for last division
                mov     t1, #1                  ' move 1 to t1 for muxc
                shl     t6, #31                 ' move t6 to MSB
                mov     t2, #32                 ' move 32 to t2 for itteration count
div1            shl     arg1_l2, #1             ' make room for next bit from LSL
                shl     arg1_l1, #1     wc      ' shift next bit from LSL watching for carry
                muxc    arg1_l2, t1             ' mux carry into MSL
                cmpsub  arg1_l2, arg2_l1 wc     ' subtract divisor from MSL, watch for subtraction
                muxc    t3, t6                  ' mux c flag to t3
                sub     t2, #1          wz      ' subtract 1 from itteration count
        if_nz   shr     t6, #1                  ' move t6 to next place
        if_nz   jmp     #div1                   ' do loop again
                mov     arg2_l1, arg1_l2        ' move remainder in arg2_l1
                mov     arg2_l2, #0             ' move zero to arg2_l2
                mov     arg1_l2, t4             ' move t4 to arg1_l2
                mov     arg1_l1, t3             ' move t3 to arg1_l1
div_ret         ret

mult            mov     t1, arg2_l1             ' get multiplier from arg2_l1 and place in multiplier
                mov     arg2_l1, arg1_l1        ' else copy longs for repeated addidion
                mov     arg2_l2, arg1_l2        ' copy the next long
                sub     t1, #1                  ' subtract one from multiplier
:loop           call    #addi                   ' jmp to add
                djnz    t1, #:loop              ' and loop
mult_ret        ret

addi            add     arg1_l1, arg2_l1 wc     ' do first addition and write c flag
                addsx   arg1_l2, arg2_l2        ' do second addition
addi_ret        ret

subt            sub     arg1_l1, arg2_l1 wc     ' do first subtraction and write c flag
                subsx   arg1_l2, arg2_l2        ' do second subtraction
subt_ret        ret

zero            long    0

'3 longs, pointers to hub memory
fn_ptr          res
arg1_ptr        res
arg2_ptr        res
' 5 longs, function and 4 long opperands
funct           res
arg1_l1         res
arg1_l2         res
arg2_l1         res
arg2_l2         res
' temp variables
t1              res
t2              res
t3              res
t4              res
t5              res
t6              res

{{
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                  TERMS OF USE: MIT License                                                   │
├──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation    │
│files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy,    │
│modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software│
│is furnished to do so, subject to the following conditions:                                                                   │
│                                                                                                                              │
│The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.│
│                                                                                                                              │
│THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE          │
│WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR         │
│COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,   │
│ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.                         │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
}}
Thanks as always guys!
Comments
But the thing I'm wondering is... If I do something like this: Since I'm passing the address of results and the address of Number, couldn't I do the result assignment in PASM? Just write to @Number + 4. That's assuming the Spin result is the next variable, can someone confirm? And if a method has locally declared longs, that would change, right? I guess I could start probing addresses in ViewPort, but I'm sure someone knows this off the top of their head.
My mistake, result is first, i.e. result followed by method parameters followed by local variables.
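One way to check that ordering without a debugger would be a small probe method like the sketch below. FrameProbe and its names are hypothetical, not part of the math object, and the comments simply restate the layout described above rather than anything measured.

```
PUB FrameProbe(a, b) | x
'' Hypothetical probe method, only for checking the frame layout.
'' Assumes the order described above: result, then parameters, then locals.
  x := @b - @result        ' byte distance from result to the second parameter
  return x                 ' would be 8 under that layout (result, a, b, x)
```

If that holds, then for Average(SamplePtr, Number) the result long would sit at @Number - 8 rather than @Number + 4, and @Number + 4 would land on the first local variable, if the method declares any.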
I'm pretty sure it's result, then parameters, then locals.
example function: BST compiler listing: