64 bit math

average joe · 2013-07-18 11:45

After some work I'm ready to release the first version of my 64 bit (unsigned) math object. It allows adding or subtracting 64 bit numbers, as well as multiplying or dividing a 64 bit number by a 32 bit number. It's far from perfect: there's no checking for overflows or any other safeguards. This first version starts a cog, does the math, then stops the cog. As time permits, I will create a second version that keeps the cog running.

groggory · 2013-07-18 12:13

average joe wrote: »
I'm working on the first part of my 64 bit code and have a quick question.
I need to subtract one 64 bit, unsigned number from another 64 bit, unsigned number. This needs to return a 64 bit signed result (an offset). This offset will later be added back to an unsigned 64 bit number to return a relative, unsigned 64 bit number. So my thinking is to do this:
sub   Xlow,  Ylow wc
subx Xhigh, Yhigh 
The original numbers are not expected to exceed 63 bits, so am I doing things right so far? I'll be able to provide more details later, just looking for a quick sanity check.

I'm sure I'll have more questions as I go.

I think step one should be to use 64 bit signed numbers (63 bit + sign) for everything. Here's a for instance on why.

a - b = c
c + d = e

if: b > a then c < 0

if d < c*-1 then e < 0

for example.

a = 0 , b = 10 , d = 0
a - b = 0 - 10 = c = -10
c + d = -10 + 0 = e = -10

in summary: a = 0, b = 10 , c = -10 , d = 0 , e = -10
a, b, and d are fine as unsigned numbers, but c and e would need to be signed numbers.

....

In turn, your life is easier if you use signed math for everything. Otherwise you need to spend more time being careful that numbers never leave the positive number interval.
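
The sub/subx pairing from the quoted question and the a = 0, b = 10 example can be checked with a small model. This is only a Python sketch of the arithmetic, not Propeller code: the names `sub64` and `to_signed64` are made up for illustration, and the borrow variable stands in for the C flag that `sub ... wc` would leave for `subx`.

```python
MASK32 = 0xFFFFFFFF

def sub64(x_lo, x_hi, y_lo, y_hi):
    """Subtract (y_hi:y_lo) from (x_hi:x_lo) one 32-bit half at a time."""
    diff_lo = x_lo - y_lo
    borrow = 1 if diff_lo < 0 else 0      # models the C flag after `sub ... wc`
    diff_hi = x_hi - y_hi - borrow        # models `subx` consuming that flag
    return diff_lo & MASK32, diff_hi & MASK32

def to_signed64(lo, hi):
    """Interpret the two halves as one 64-bit two's complement number."""
    value = (hi << 32) | lo
    return value - (1 << 64) if value & (1 << 63) else value

# a = 0, b = 10 from the example above: c = a - b
c_lo, c_hi = sub64(0, 0, 10, 0)
print(hex(c_lo), hex(c_hi), to_signed64(c_lo, c_hi))
```

Read as unsigned, the result is a huge number; read as signed, it is -10, which is why treating everything as 63 bit + sign keeps the later addition simple.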

average joe · 2013-07-18 12:39

That's what I was thinking. So far, here's what I have. I have interleaved the actual subtraction with hub ops to decrease the number of cycles used. I will be making an addition engine that will run very similarly. I'm starting and stopping the cog since these will not be used that often and I'm trying to save battery.

PUB Subtract(hostPtr, nodePtr)   | cog

  cog := cognew(@address, @hostPtr) +1
  repeat while nodePtr
  cogstop (cog -1)
  

DAT
        org
        
address       nop   
host_ptr      mov       address, par
node_ptr      rdlong    host_ptr, address
host_long1    add       address, #4
host_long2    rdlong    node_ptr, address
node_long1    rdlong    host_ptr, host_long1
node_long2    add       host_ptr, #4
              rdlong    host_ptr, host_long2
              rdlong    node_ptr, node_long1
              add       node_ptr, #4
              subs      host_long1, node_long1  wc       
              rdlong    node_ptr, node_long2
              sub       node_ptr, #4
              subsx     host_long2, node_long2
              wrlong    host_long2, host_ptr
              sub       host_ptr, #4
              wrlong    host_long1, host_ptr
              wrlong    node_ptr, #0

Any thoughts would be helpful!

average joe · 2013-07-18 22:19

I have made some modifications to the assembly engine for my maths. I THINK it's getting close but is still untested (and un-commented). Here's what I've come up with. Once again, any thoughts are greatly appreciated!

{{
//////////////////////////////////////////////////////////////////////////////////
//      64 bit signed addition and subtraction methods                          //
//      Author : Joe Heinz                                                      //
//      Copyright : Joe Heinz                                                   //
//                                                                              //
//      data structure as follows :                                             //
//      VAR  long  hostLong1, hostLong2, nodeLong1, nodeLong2                   //
//////////////////////////////////////////////////////////////////////////////////
}}
PUB Subtraction(hostPtr, nodePtr)      '' Perform 64 bit subtraction
{{
//////////////////////////////////////////////////////////////////////////////////
//  hostPtr is pointer to least significant long of host time, minuend          //  
//  nodePtr is pointer to least significant long of node time, subtrahend       //
//  result of subtraction is stored in host longs                               //
//  returns cog ASM engine was run in or false if no cog available              //
//////////////////////////////////////////////////////////////////////////////////
}}

  result := DoMath(hostPtr, nodePtr, "s")
  
PUB Addition(hostPtr, nodePtr)         '' Perform 64 bit addition
{{
//////////////////////////////////////////////////////////////////////////////////
//  hostPtr is pointer to least significant long of host time                   //
//  nodePtr is pointer to least significant long of node time                   //
//  result of addition is stored in host longs                                  //
//  returns cog ASM engine was run in or false if no cog available              //
//////////////////////////////////////////////////////////////////////////////////
}}

  result := DoMath(hostPtr, nodePtr, "a")
  
PRI DoMath(hPtr, nPtr, function) : cog                  ''
                                                                                                
  cog := cognew(@address, @hPtr) +1                     ' start asm cog
  repeat while nPtr                                     ' wait for asm cog to finish
  cogstop (cog -1)                                      ' stop cog
  

DAT
        org
        
address       nop   
host_ptr      mov       address, par                    ' get first address of passing method
node_ptr      rdword    host_ptr, address               ' read the address to get pointer to Least Significant long of host time 
host_long1    add       address, #2                     ' add a word offset to address
host_long2    rdword    node_ptr, address               ' read the address to get pointer to Least Significant long of node time
funct         add       address, #2                     ' add a word offset to address                                     
              rdbyte    funct, address                  ' read the function type *add or subtract*
              cmp       funct, sub_funct        wz      ' if subtract function, set z flag
              rdlong    host_ptr, host_long1            ' get least significant long of host time
node_long1    add       host_ptr, #4                    ' add long offset to host pointer
node_long2    rdlong    host_ptr, host_long2            ' get most significant long of host time
              rdlong    node_ptr, node_long1            ' get least significant long of node time
              add       node_ptr, #4                    ' add long offset to node pointer                                             
if_z          subs      host_long1, node_long1  wc      ' do first subtraction and write c flag
if_nz         adds      host_long1, node_long1  wc      ' do first addition and write c flag
              rdlong    node_ptr, node_long2            ' get most significant long of node time
              sub       node_ptr, #4                    ' subtract long offset from node pointer
if_z          subsx     host_long2, node_long2          ' do second subtraction                                   
if_nz         addsx     host_long2, node_long2          ' do second addition
              wrlong    host_long2, host_ptr            ' write most significant long of result
              sub       host_ptr, #4                    ' subtract long offset from host pointer
              wrlong    host_long1, host_ptr            ' write least significant long of result
              wrlong    node_ptr, #0                    ' clear node pointer to stop cog and return to caller

sub_funct     long "s"

*edit*
Just updated code with fully commented version. Still looking for any advice!

kuroneko · 2013-07-18 23:26

average joe wrote: »

Just updated code with fully commented version. Still looking for any advice!

I guess you'll figure out your hub access swap-around soon enough (addr vs data). Also, you do have the cog space so just use a res array for the data. This manual overlay is really bad style if you ask me. And your PASM code should either stop itself or wait in a controlled way until stopped externally. Letting it run off into the wild may have too many side-effects.

average joe · 2013-07-19 01:06

What's wrong with the manual overlay method? I will probably convert to res array, just wondering... Very good point about having the cog stop itself. I'll take care of that right now.
Also, fixed the address / data swap. Thanks for pointing that out!

*edit*
Code with suggested improvements :

{{
//////////////////////////////////////////////////////////////////////////////////
//      64 bit signed addition and subtraction methods                          //
//      Author : Joe Heinz                                                      //
//      Copyright : Joe Heinz - 2013                                            //
//                                                                              //
//      data structure as follows :                                             //
//      VAR  long  hostLong1, hostLong2, nodeLong1, nodeLong2                   //
//////////////////////////////////////////////////////////////////////////////////
}}
PUB Subtraction(hostPtr, nodePtr)      '' Perform 64 bit subtraction
{{
//////////////////////////////////////////////////////////////////////////////////
//  hostPtr is pointer to least significant long of host time, minuend          //  
//  nodePtr is pointer to least significant long of node time, subtrahend       //
//  result of subtraction is stored in host longs                               //
//  returns cog ASM engine was run in or false if no cog available              //
//////////////////////////////////////////////////////////////////////////////////
}}

  result := DoMath(hostPtr, nodePtr, "s")
  
PUB Addition(hostPtr, nodePtr)         '' Perform 64 bit addition
{{
//////////////////////////////////////////////////////////////////////////////////
//  hostPtr is pointer to least significant long of host time                   //
//  nodePtr is pointer to least significant long of node time                   //
//  result of addition is stored in host longs                                  //
//  returns cog ASM engine was run in or false if no cog available              //
//////////////////////////////////////////////////////////////////////////////////
}}

  result := DoMath(hostPtr, nodePtr, "a")
  
PRI DoMath(hPtr, nPtr, function) : cog                  ''
                                                                                                
  cog := cognew(@entry, @hPtr) +1                       ' start asm cog
  repeat while nPtr                                     ' wait for asm cog to finish
  

DAT
        org
        
entry         cogid     p_cog
              mov       address, par                    ' get first address of passing method
              rdword    host_ptr, address               ' read the address to get pointer to Least Significant long of host time 
              add       address, #2                     ' add a word offset to address
              rdword    node_ptr, address               ' read the address to get pointer to Least Significant long of node time
              add       address, #2                     ' add a word offset to address                                     
              rdbyte    funct, address                  ' read the function type *add or subtract*
              cmp       funct, sub_funct        wz      ' if subtract function, set z flag
              rdlong    host_long1, host_ptr            ' get least significant long of host time
              add       host_ptr, #4                    ' add long offset to host pointer
              rdlong    host_long2, host_ptr            ' get most significant long of host time
              rdlong    node_long1, node_ptr            ' get least significant long of node time
              add       node_ptr, #4                    ' add long offset to node pointer                                             
if_z          subs      host_long1, node_long1  wc      ' do first subtraction and write c flag
if_nz         adds      host_long1, node_long1  wc      ' do first addition and write c flag
              rdlong    node_long2, node_ptr            ' get most significant long of node time
              sub       node_ptr, #4                    ' subtract long offset from node pointer
if_z          subsx     host_long2, node_long2          ' do second subtraction                                   
if_nz         addsx     host_long2, node_long2          ' do second addition
              wrlong    host_long2, host_ptr            ' write most significant long of result
              sub       host_ptr, #4                    ' subtract long offset from host pointer
              wrlong    host_long1, host_ptr            ' write least significant long of result
              wrlong    zero, node_ptr                   ' clear node pointer to stop cog and return to caller
              cogstop   p_cog

sub_funct     long "s"
zero          long 0
             
address       res
host_ptr      res
node_ptr      res
host_long1    res
host_long2    res
node_long1    res
node_long2    res
funct         res
p_cog         res

kuroneko · 2013-07-19 01:18

average joe wrote: »

What's wrong with the manual overlay method?

Let's just say it's harder to maintain when you change/restructure your code. You always have to keep an eye on what code location is still required etc. I find it annoying having data labels mixed with code. I don't mind the occasional nop location referenced by $+n though. A res array doesn't use up space and keeps all variables separated from the code.

Anyway, if that's your style, stick with it.

Just noticed, function parameters are still longs, so even if you only read words you'll have to advance by 4 (not 2, @nPtr == @hPtr + 4).
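
To make the stride point concrete, here is a small Python model of how the three DoMath parameters would sit in hub memory. It is a sketch under assumptions: the hub is modelled as a little-endian byte array, the address 16 and the pointer values are invented, and `read_params` is a hypothetical helper, not part of any Propeller tool.

```python
import struct

def read_params(hub, par, stride):
    """Model rdword/rdword/rdbyte on three parameters, advancing by `stride`."""
    h_ptr = struct.unpack_from("<H", hub, par)[0]               # rdword at par
    n_ptr = struct.unpack_from("<H", hub, par + stride)[0]      # rdword after one advance
    function = hub[par + 2 * stride]                            # rdbyte after two advances
    return h_ptr, n_ptr, function

# hPtr, nPtr and function are longs, so they occupy 12 consecutive bytes.
hub = bytearray(64)
struct.pack_into("<3I", hub, 16, 0x1234, 0x5678, ord("s"))

print(read_params(hub, 16, 4))   # stride 4 lands on each parameter
print(read_params(hub, 16, 2))   # stride 2 reads the upper half of hPtr instead of nPtr
```

With a stride of 2 the second read returns the (zero) upper word of hPtr and the third read returns the low byte of nPtr, so both the node pointer and the function code come out wrong.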

average joe · 2013-07-19 01:42

Ahhh, I see where you are coming from. It's just a habit from constantly running out of cogram. Very good point.

Re: long address. That's a great point! When I started writing, I had word variables to store the pointers in. Then I decided that passing the DoMath variables would be better, and never went back and changed the advance.

{{
//////////////////////////////////////////////////////////////////////////////////
//      64 bit signed addition and subtraction methods                          //
//      Author : Joe Heinz                                                      //
//      Copyright : Joe Heinz - 2013                                            //
//      Grateful acknowledgement to kuroneko (parallax forum                    //
//      data structure as follows :                                             //
//      VAR  long  hostLong1, hostLong2, nodeLong1, nodeLong2                   //
//////////////////////////////////////////////////////////////////////////////////
}}
PUB Subtraction(hostPtr, nodePtr)      '' Perform 64 bit subtraction
{{
//////////////////////////////////////////////////////////////////////////////////
//  hostPtr is pointer to least significant long of host time, minuend          //  
//  nodePtr is pointer to least significant long of node time, subtrahend       //
//  result of subtraction is stored in host longs                               //
//  returns cog ASM engine was run in or false if no cog available              //
//////////////////////////////////////////////////////////////////////////////////
}}

  result := DoMath(hostPtr, nodePtr, "s")
  
PUB Addition(hostPtr, nodePtr)         '' Perform 64 bit addition
{{
//////////////////////////////////////////////////////////////////////////////////
//  hostPtr is pointer to least significant long of host time                   //
//  nodePtr is pointer to least significant long of node time                   //
//  result of addition is stored in host longs                                  //
//  returns cog ASM engine was run in or false if no cog available              //
//////////////////////////////////////////////////////////////////////////////////
}}

  result := DoMath(hostPtr, nodePtr, "a")
  
PRI DoMath(hPtr, nPtr, function) : cog                  ''
                                                                                                
  cog := cognew(@entry, @hPtr) +1                       ' start asm cog
  repeat while nPtr                                     ' wait for asm cog to finish
  

DAT
        org
        
entry         cogid     p_cog
              mov       address, par                    ' get first address of passing method
              rdword    host_ptr, address               ' read the address to get pointer to Least Significant long of host time 
              add       address, #2                     ' add a word offset to address
              rdword    node_ptr, address               ' read the address to get pointer to Least Significant long of node time
              add       address, #2                     ' add a word offset to address                                     
              rdbyte    funct, address                  ' read the function type *add or subtract*
              cmp       funct, sub_funct        wz      ' if subtract function, set z flag
              rdlong    host_long1, host_ptr            ' get least significant long of host time
              add       host_ptr, #4                    ' add long offset to host pointer
              rdlong    host_long2, host_ptr            ' get most significant long of host time
              rdlong    node_long1, node_ptr            ' get least significant long of node time
              add       node_ptr, #4                    ' add long offset to node pointer                                             
if_z          subs      host_long1, node_long1  wc      ' do first subtraction and write c flag
if_nz         adds      host_long1, node_long1  wc      ' do first addition and write c flag
              rdlong    node_long2, node_ptr            ' get most significant long of node time
              sub       node_ptr, #4                    ' subtract long offset from node pointer
if_z          subsx     host_long2, node_long2          ' do second subtraction                                   
if_nz         addsx     host_long2, node_long2          ' do second addition
              wrlong    host_long2, host_ptr            ' write most significant long of result
              sub       host_ptr, #4                    ' subtract long offset from host pointer
              wrlong    host_long1, host_ptr            ' write least significant long of result
              wrlong    zero, node_ptr                   ' clear node pointer to stop cog and return to caller
              cogstop   p_cog

sub_funct     long "s"
zero          long 0
             
address       res
host_ptr      res
node_ptr      res
host_long1    res
host_long2    res
node_long1    res
node_long2    res
funct         res
p_cog         res

kuroneko · 2013-07-19 01:58

When you - finally - change the advance you should also update the end indication. You check for nPtr being 0 but never write to it (you write to long[nPtr] instead). You advanced address to read the function code so you might as well use that with a wrbyte. Otherwise you'll have to get back to address -4 etc.
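
The difference between writing to nPtr and writing to long[nPtr] can be seen in a toy hub image. This is a Python sketch only: the addresses (16 for the parameters, 32 and 40 for the data) and the helper names are invented for illustration, and a long write stands in for the suggested wrbyte.

```python
import struct

PAR = 16  # hypothetical hub address of hPtr; nPtr follows at +4, function at +8

def rdlong(hub, address):
    return struct.unpack_from("<I", hub, address)[0]

def wrlong(hub, value, address):
    struct.pack_into("<I", hub, address, value)

def make_hub():
    """Three long parameters followed by the host and node values."""
    hub = bytearray(64)
    struct.pack_into("<3I", hub, PAR, 32, 40, ord("s"))  # hPtr, nPtr, function
    struct.pack_into("<2I", hub, 32, 500, 0)             # host value (low, high)
    struct.pack_into("<2I", hub, 40, 123, 0)             # node value (low, high)
    return hub

# Variant A: `wrlong zero, node_ptr` writes to long[nPtr], i.e. the node data.
hub_a = make_hub()
wrlong(hub_a, 0, rdlong(hub_a, PAR + 4))

# Variant B: clear the parameter that the Spin loop actually polls.
hub_b = make_hub()
wrlong(hub_b, 0, PAR + 8)

print(rdlong(hub_a, PAR + 4), rdlong(hub_a, 40))   # nPtr unchanged, node data wiped
print(rdlong(hub_b, PAR + 8), rdlong(hub_b, 40))   # flag cleared, node data intact
```

In variant A the `repeat while nPtr` loop would never see a zero, and the node value is destroyed as a side effect; variant B signals completion without touching the data.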

average joe · 2013-07-19 02:08

Ahh, that would have been an issue as well. Need to be more careful of these things!

{{
//////////////////////////////////////////////////////////////////////////////////
//      64 bit signed addition and subtraction methods                          //
//      Author : Joe Heinz                                                      //
//      Copyright : Joe Heinz - 2013                                            //
//      Grateful acknowledgement to kuroneko (parallax forum)                   //
//      data structure as follows :                                             //
//      VAR  long  hostLong1, hostLong2, nodeLong1, nodeLong2                   //
//////////////////////////////////////////////////////////////////////////////////
}}
PUB Subtraction(hostPtr, nodePtr)      '' Perform 64 bit subtraction
{{
//////////////////////////////////////////////////////////////////////////////////
//  hostPtr is pointer to least significant long of host time, minuend          //  
//  nodePtr is pointer to least significant long of node time, subtrahend       //
//  result of subtraction is stored in host longs                               //
//  returns cog ASM engine was run in or false if no cog available              //
//////////////////////////////////////////////////////////////////////////////////
}}

  result := DoMath(hostPtr, nodePtr, "s")
  
PUB Addition(hostPtr, nodePtr)         '' Perform 64 bit addition
{{
//////////////////////////////////////////////////////////////////////////////////
//  hostPtr is pointer to least significant long of host time                   //
//  nodePtr is pointer to least significant long of node time                   //
//  result of addition is stored in host longs                                  //
//  returns cog ASM engine was run in or false if no cog available              //
//////////////////////////////////////////////////////////////////////////////////
}}

  result := DoMath(hostPtr, nodePtr, "a")
  
PRI DoMath(hPtr, nPtr, function) : cog                  ''
                                                                                                
  cog := cognew(@entry, @hPtr) +1                       ' start asm cog
  repeat while nPtr                                     ' wait for asm cog to finish
  

DAT
        org
        
entry         cogid     p_cog
              mov       address, par                    ' get first address of passing method
              rdword    host_ptr, address               ' read the address to get pointer to Least Significant long of host time 
              add       address, #4                     ' add a long offset to address (parameters are longs)
              rdword    node_ptr, address               ' read the address to get pointer to Least Significant long of node time
              add       address, #4                     ' add a long offset to address (parameters are longs)
              rdbyte    funct, address                  ' read the function type *add or subtract*
              cmp       funct, sub_funct        wz      ' if subtract function, set z flag
              rdlong    host_long1, host_ptr            ' get least significant long of host time
              add       host_ptr, #4                    ' add long offset to host pointer
              rdlong    host_long2, host_ptr            ' get most significant long of host time
              rdlong    node_long1, node_ptr            ' get least significant long of node time
              add       node_ptr, #4                    ' add long offset to node pointer                                             
if_z          subs      host_long1, node_long1  wc      ' do first subtraction and write c flag
if_nz         adds      host_long1, node_long1  wc      ' do first addition and write c flag
              rdlong    node_long2, node_ptr            ' get most significant long of node time
if_z          subsx     host_long2, node_long2          ' do second subtraction                                   
if_nz         addsx     host_long2, node_long2          ' do second addition
              wrlong    host_long2, host_ptr            ' write most significant long of result
              sub       host_ptr, #4                    ' subtract long offset from host pointer
              wrlong    host_long1, host_ptr            ' write least significant long of result
              sub       address, #4                     ' subtract long offset from address, points to nPtr
              wrlong    zero, address                   ' clear node pointer to return to caller
              cogstop   p_cog

sub_funct     long "s"
zero          long 0
             
address       res
host_ptr      res
node_ptr      res
host_long1    res
host_long2    res
node_long1    res
node_long2    res
funct         res
p_cog         res

kuroneko · 2013-07-19 02:14

Just tested this bit and it does in fact work:

PRI DoMath(hPtr, nPtr, function) : cog                  ''
                                                                                                
  cog := cognew(@entry, @hPtr) +1                       ' start asm cog
  repeat while function                                 ' wait for asm cog to finish

DAT             org     0
        
entry           mov     address, par                    ' get first address of passing method
                rdword  host_ptr, address               ' read the address to get pointer to Least Significant long of host time 
                add     address, #4                     ' add a long offset to address (parameters are longs)
                rdword  node_ptr, address               ' read the address to get pointer to Least Significant long of node time
                add     address, #4                     ' add a long offset to address (parameters are longs)
                rdbyte  funct, address                  ' read the function type *add or subtract*

                cmp     funct, #"s"             wz      ' if subtract function, set z flag
              
                rdlong  host_long1, host_ptr            ' get least significant long of host time
                add     host_ptr, #4                    ' add long offset to host pointer
                rdlong  node_long1, node_ptr            ' get least significant long of node time
                add     node_ptr, #4                    ' add long offset to node pointer                                             
                rdlong  node_long2, node_ptr            ' get most significant long of node time

        if_z    sub     host_long1, node_long1  wc      ' do first subtraction and write c flag
        if_nz   add     host_long1, node_long1  wc      ' do first addition and write c flag

                rdlong  host_long2, host_ptr            ' get most significant long of host time

        if_z    subsx   host_long2, node_long2          ' do second subtraction                                   
        if_nz   addsx   host_long2, node_long2          ' do second addition

                wrlong  host_long2, host_ptr            ' write most significant long of result
                sub     host_ptr, #4                    ' subtract long offset from host pointer
                wrlong  host_long1, host_ptr            ' write least significant long of result

                wrlong  zero, address                   ' clear function parameter to return to caller

                cogid   p_cog
                cogstop p_cog

p_cog           res     alias
address         res

host_ptr        res
node_ptr        res

host_long1      res
host_long2      res
node_long1      res
node_long2      res

funct           res

tail            fit

CON
  zero  = $1F0
  alias = 0
                
DAT

Note that the math is still slightly off (somehow the connection isn't made to the high part).
The lower 32 bits should be manipulated with add/sub instead of adds/subs.
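
Why the low half needs add/sub can be shown with a model of the carry flag. This Python sketch assumes the documented Propeller 1 behaviour that `add ... wc` sets C on an unsigned carry while `adds ... wc` sets C on a signed overflow; the function names are made up for the example.

```python
MASK32 = 0xFFFFFFFF

def signed32(value):
    """Reinterpret a 32-bit pattern as a two's complement number."""
    return value - (1 << 32) if value & 0x80000000 else value

def add_wc(d, s):
    """Model `add d, s wc`: C is the unsigned carry out of bit 31."""
    total = d + s
    return total & MASK32, total >> 32

def adds_wc(d, s):
    """Model `adds d, s wc`: C is the signed overflow flag."""
    total = signed32(d) + signed32(s)
    overflow = 1 if total < -(1 << 31) or total > (1 << 31) - 1 else 0
    return total & MASK32, overflow

def addsx(d, s, c):
    """Model `addsx d, s`: add the high halves plus the incoming C."""
    return (signed32(d) + signed32(s) + c) & MASK32

def add64(low_op, x_lo, x_hi, y_lo, y_hi):
    lo, c = low_op(x_lo, y_lo)
    return lo, addsx(x_hi, y_hi, c)

# $0000_0000_FFFF_FFFF + 1 should give $0000_0001_0000_0000
print(add64(add_wc, 0xFFFFFFFF, 0, 1, 0))    # carry reaches the high long
print(add64(adds_wc, 0xFFFFFFFF, 0, 1, 0))   # -1 + 1 has no signed overflow, carry is lost
```

The signed variant also fails the other way: $7FFF_FFFF + 1 in the low long reports an overflow and pushes a spurious 1 into the high long.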

average joe · 2013-07-19 03:02

Okay, that makes sense. Sign is not evaluated until the second add/sub. Thanks so much for your help! Now time to get the 64 bit counter working...

average joe · 2013-07-19 03:21

Here's the clock source. I think this one's right, although I could be wrong. I had a feeling that the maths code had problems but 18 + hour days are starting to take their toll.

CON
{{
//        64 bit clock source       //
//        Author : Joe Heinz        //
//   Copyright : Joe Heinz - 2013   //
}}

_ctr1_apin  = 0                                         ' pin to use as output from counter A to counter B
_sample_pin = 1                                         ' pin to sample both counters phase 
_pll_mode   = %111                                      ' pll divisor = 1:1

VAR  long long1, long2                                  ' two longs to store 64 bit variable
     byte cog                                           ' byte to store cog running timer in
PUB NULL                        '' not a top level object

PUB Start                       '' Start timing cog
  long1 := |< _sample_pin                               ' move sample pin mask into long1 (waitpeq/waitpne expect a mask)
  cog := cognew(@entry, @long1) +1                      ' start assembly cog
  outa[_sample_pin] := 0                                ' set samplePin to LOW         
  dira[_sample_pin] := 1                                ' make samplePin an output
  result := cog
  
PUB Stop                        '' Stop timing cog
   if cog                                               ' if cog started
     cogstop(cog~ - 1)                                  ' stop cog and clear cog variable

PUB GetSample                   '' Get a sample time
  outa[_sample_pin] := 1                                ' make samplePin HIGH
  if long1 > $FFFF_FFFC                                 ' if near overflow at sample
    long2 -= 1                                          ' subtract 1 from high 32 bits
  outa[_sample_pin] := 0                                ' make samplePin LOW
  result := @long1                                      ' return address of longa
         
DAT
        org

entry   rdlong trigger, par
        mov   dira, diraval                             'set APIN to output
        mov   frqa, #1                                  'set counter to increment 1 each cycle
        mov   frqb, #1                                  'set counter to increment 1 each cycle
                                
        mov   wr2add, par                               ' move par address to wr2add
        add   wr2add, #4                                ' add long offset
        mov   ctrb, ctrbmode                            ' establish counter B mode and APIN
        mov   ctra, ctramode                            ' establish counter A mode and APIN
                                                                   
:loop   waitpeq zero, trigger                           ' wait for trigger pin to be idle         4 +
        waitpne zero, trigger                           ' wait for trigger pin to be active       4 +
        mov   long_1, phsa                              ' move phsa to long1                      4
        mov   long_2, phsb                              ' move phsb to long2                      4
        wrlong  long_1,par                              ' write long1 to longa                    8 + 
        wrlong  long_2, wr2add                          ' write long2 to longb                    16
        jmp   :loop                                     ' repeat                                  4
                                                        ' min 48 clock cycles, 
        
diraval       long      |< _ctr1_apin  
ctramode      long      (%00010 << 26) + ( _pll_mode << 23) + _ctr1_apin        ' PLL mode - single ended, APIN is input for counter B
ctrbmode      long      (%01110 << 26) + _ctr1_apin                             ' neg edge triggered, APIN is output from counter A
zero          long      0

trigger       res                                                              ' pin to trigger sample
long_1        res
long_2        res
wr2add        res

*edit*
Found a couple errors already

kuroneko · 2013-07-19 04:33

For comparison, a rather old 64bit timer implementation. Don't get distracted by its overlay nature.

The 64bit value is emitted to the serial console after each key press.

prof_braino · 2013-07-19 07:28

Kuroneko is probably one of your best resources, but here is another tidbit that might help.

This logger uses 64 bit math.

http://code.google.com/p/propforth/wiki/Logger1Simple

The source code is included in the comments. The stuff that gets loaded is the same, but in base 64 encoding to load faster. Skip past that and the forth source to the assembler code.

average joe · 2013-07-23 21:55

Well I thought I would be okay with addition and subtraction but now it comes to mind that I will also need multiplication and division. I have the multiplication pretty well tied up but division is starting to get to me. I only need to multiply and divide by 100, but the multiplicand and dividend will be 63 bit numbers. Okay, so more like 50 - 56 bit numbers.... My first thought was to do division in spin but I'm drawing a blank. Any thoughts?
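
Since the factor here is only 100, one way to stay in Spin is to work through the 64 bit value 16 bits at a time, so that every intermediate result fits in a positive signed long. The following is a minimal sketch under that assumption, not a tested object. The method names MulSmall and DivSmall are made up for illustration, and both assume the value is stored low long first and that the factor or divisor is below 32768.

```spin
PUB MulSmall(ptr, factor) | i, cur, carry
'' Multiply the unsigned 64 bit value at ptr (low long first) by factor, in place.
'' Assumes 0 =< factor < 32768 so each partial product stays a positive long.
'' Any carry out of the top word is discarded (no overflow check).
  carry := 0
  repeat i from 0 to 3                      ' four 16 bit words, least significant first
    cur := word[ptr][i] * factor + carry
    word[ptr][i] := cur                     ' the word store keeps only the low 16 bits
    carry := cur >> 16                      ' the rest carries into the next word

PUB DivSmall(ptr, divisor) : remainder | i, cur
'' Divide the unsigned 64 bit value at ptr by divisor, in place; returns the remainder.
'' Assumes 0 < divisor < 32768.
  remainder := 0
  repeat i from 3 to 0                      ' most significant word first
    cur := (remainder << 16) | word[ptr][i] ' bring the previous remainder down
    word[ptr][i] := cur / divisor           ' quotient digit always fits in 16 bits
    remainder := cur // divisor
```

With the data structure from this thread, usage would look like MulSmall(@hostLong1, 100) and r := DivSmall(@hostLong1, 100). It is schoolbook long multiplication and division in base 65536, so it costs only four passes per call.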

kuroneko · 2013-07-23 22:12

Have a look at http://propeller.wikispaces.com/MATH. While it doesn't specifically show 64bit integer math, it does do double precision floating point. There is also the umath object but this only produces a 32bit result.

average joe · 2013-07-25 13:05

I think I have it working. Could really use a couple more sets of eyes on it though!

{{
//////////////////////////////////////////////////////////////////////////////////
//      64 bit signed addition and subtraction methods                          //
//      Author : Joe Heinz                                                      //
//      Copyright : Joe Heinz - 2013                                            //
//      Grateful acknowledgement to kuroneko (parallax forum)                   //
//      data structure as follows :                                             //
//      VAR  long  hostLong1, hostLong2, nodeLong1, nodeLong2                   //
//////////////////////////////////////////////////////////////////////////////////
}}
PUB Addition(hostPtr, nodePtr)         '' Perform 64 bit addition
{{
//////////////////////////////////////////////////////////////////////////////////
//  hostPtr is pointer to least significant long of host time                   //  
//  nodePtr is pointer to least significant long of node time                   //
//  result of addition is stored in host longs                                  //
//  returns cog ASM engine was run in or false if no cog available              //
//////////////////////////////////////////////////////////////////////////////////
}}

  result := DoMath(hostPtr, nodePtr, "a")

PUB Subtraction(hostPtr, nodePtr)      '' Perform 64 bit subtraction
{{
//////////////////////////////////////////////////////////////////////////////////
//  hostPtr is pointer to least significant long of host time, minuend          //  
//  nodePtr is pointer to least significant long of node time, subtrahend       //
//  result of subtraction is stored in host longs                               //
//  returns cog ASM engine was run in or false if no cog available              //
//////////////////////////////////////////////////////////////////////////////////
}}

  result := DoMath(hostPtr, nodePtr, "s")

PUB multiply(hostPtr, nodePtr)      '' Perform 64 bit by 32 bit multiplication
{{
//////////////////////////////////////////////////////////////////////////////////
//  hostPtr is pointer to least significant long of multiplicand                //  
//  nodePtr is pointer to multiplier                                            //
//  result of multiplication is stored in host longs                            //
//  returns cog ASM engine was run in or false if no cog available              //
//////////////////////////////////////////////////////////////////////////////////
}}

  result := DoMath(hostPtr, nodePtr, "m")  

PUB divide(hostPtr, nodePtr)      '' Perform 64 bit by 32 bit division
{{
//////////////////////////////////////////////////////////////////////////////////
//  hostPtr is pointer to least significant long of dividend                    //  
//  nodePtr is pointer to divisor                                               //
//  result of division is stored in host longs                                  //
//  remainder is stored in nodelong1                                            //
//  returns cog ASM engine was run in or false if no cog available              //
//////////////////////////////////////////////////////////////////////////////////
}}

  result := DoMath(hostPtr, nodePtr, "d")    
  
PRI DoMath(hPtr, nPtr, function) : cog                  ''
                                                                                                
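  cog := cognew(@entry, @hPtr) + 1                      ' start asm cog; cog is 0 (false) if none was free
  if cog                                                ' only wait when a cog actually started
    repeat while function                               ' asm cog clears function when it has finished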
  

DAT
        org
        
entry             cogid     p_cog                           ' get running cog number                                                         8+  1*2   1
                  mov       address, par                    ' get first address of passing method, hPtr                                      4    3
                  rdword    host_ptr, address               ' read the address to get pointer to Least Significant long of host time         8   1*2   2
                  add       address, #4                     ' add a long offset to address, points to nPtr                                   4    3
                  rdword    node_ptr, address               ' read the address to get pointer to Least Significant long of node time         8   1*2   3
                  add       address, #4                     ' add a long offset to address, points to function                               4    3
                  rdbyte    funct, address                  ' read the function type *add or subtract*                                       8   1*2   4                                 
                  rdlong    host_long1, host_ptr            ' get least significant long of host time                                        8   1*2   5
                  add       host_ptr, #4                    ' add long offset to host pointer                                                4    3
                  rdlong    host_long2, host_ptr            ' get most significant long of host time                                         8   1*2   6
                  rdlong    node_long1, node_ptr            ' get least significant long of node time                                        8   1*2   7
                  add       node_ptr, #4                    ' add long offset to node pointer                                                4    3
                  rdlong    node_long2, node_ptr            ' get most significant long of node time
                  
                  cmp       funct, #"a"   wz
      if_z        call      #addi   
           
                  cmp       funct, #"s"   wz
      if_z        call      #subt
      
                  cmp       funct, #"m"   wz
      if_z        call      #mult            
                  
                  cmp       funct, #"d"   wz
      if_z        call      #div      
                  

write             wrlong    host_long2, host_ptr            ' write most significant long of result                                          8   1*2   9
                  sub       host_ptr, #4                    ' subtract long offset from host pointer                                         4    3
                  wrlong    host_long1, host_ptr            ' write least significant long of result                                         8   1*2   10
                  wrlong    node_long2, node_ptr
                  sub       node_ptr, #4
                  wrlong    node_long1, node_ptr
                  wrlong    zero, address                   ' clear function to return to caller                                         8   1*2   11
                  cogstop   p_cog                           ' stop the cog                                                                   8   1*2   12







div 
                  cmp   node_long1, zero    wz          ' check to see if divisor is zero
          if_z    mov   host_long1, zero                ' and zero out result
          if_z    mov   host_long2, zero                ' undefined
          if_z    jmp   write                           ' write to buffers
           
                  ' find most significant bit shift offset of divisor < mplace = 0 to 31 >             
                  mov   mplace, #0                      ' prepare MSB place
findm             rol   node_long1, #1            wc    ' rotate divisor left 1 place and check for overflow   
          if_nc   add   mplace, #1                      ' if no overflow, add 1 to MSB place
          if_nc   jmp   #findm                          ' repeat loop until overflow
                  ror   node_long1, #1                  ' rotate back one 
                  
                  mov   mask, #1                        ' prepare mask
                  mov   tmp, mplace                     ' move msb shift factor to tmp for iteration count
                  shl   mask, mplace                    ' move mask by msb shift factor
                  
div2              cmpsub host_long2, node_long1   wc    ' subtract divisor from dividend and watch for subtraction
                  muxc  d2, mask                        ' mux c flag to d2 at mask place
                  cmpsub tmp, #1                  wz    ' subtract 1 from shift factor watching for last division
          if_nz   shr   mask, #1                        ' shift mask right by 1
          if_nz   shr   node_long1, #1                  ' shift divisor right by 1
          if_nz   jmp   #div2                           ' and do loop again
                 
                  shr   mask, #1                        ' shift mask right by 1
                  shr   node_long1, #1                  ' shift divisor right by 1
                                     
                  cmpsub host_long2, node_long1   wc    ' subtract divisor from dividend and watch for subtraction
                  muxc  d2, mask                        ' mux c flag to d2 at mask place
                  sub tmp, #1                     wz    ' subtract 1 from shift factor watching for last division

                  mov   tmp, #1                         ' move 1 to tmp for muxc
                  shl   mask, #31                       ' move mask to MSB
                  mov   temp, #32                       ' move 32 to temp for iteration count
                  
div1              shl   host_long2, #1                  ' make room for next bit from LSL
                  shl   host_long1, #1            wc    ' shift next bit from LSL watching for carry
                  muxc  host_long2, tmp                 ' mux carry into MSL
                  cmpsub host_long2, node_long1   wc    ' subtract divisor from MSL, watch for subtraction
                  muxc  d1, mask                        ' mux c flag to d1  
                  sub   temp, #1                  wz    ' subtract 1 from iteration count
          if_nz   shr   mask, #1                        ' move mask to next place
          if_nz   jmp   #div1                           ' do loop again
          
                  mov   node_long1, host_long2          ' move remainder into node_long1
                  mov   node_long2, #0
                  mov   host_long2,  d2                 ' move d2 to host_long2    
                  mov   host_long1,  d1                 ' move d1 to host_long1
    
div_ret           ret



mult              mov       multiplier, node_long1          ' get multiplier from node_long1 and place in multiplier
                  cmp       multiplier, zero          wz    ' check for multiply by 0
if_z              mov       host_long1, zero                ' if multiply by zero just zero host_long1
if_z              mov       host_long2, zero                ' and zero host_long2
if_z              jmp       write                           ' jump to write         
                  mov       node_long1, host_long1          ' else copy longs for repeated addition
                  mov       node_long2, host_long2          ' copy the next long
                  sub       multiplier, #1
                  
:loop             call      #addi
                  djnz      multiplier, #:loop
mult_ret          ret                  


subt              sub       host_long1, node_long1  wc      ' do first subtraction and write c flag                                          4    4
                  subsx     host_long2, node_long2          ' do second subtraction  
subt_ret          ret
            

addi              add       host_long1, node_long1  wc      ' do first addition and write c flag
                  addsx     host_long2, node_long2          ' do second addition
addi_ret          ret
             

sub_funct     long "s"                                                                                               '' 192 clock cycles min ~ 127 us
zero          long 0
             
address       res
host_ptr      res
node_ptr      res
host_long1    res
host_long2    res
node_long1    res
node_long2    res
funct         res
p_cog         res 
multiplier    res

mask          res
tmp           res
temp          res
mplace        res
d2            res 
d1            res

Not sure if anyone else will find this helpful but once it's done I will be releasing under MIT license.
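
As a point of comparison for the div routine above, here is a minimal sketch of the usual shift-and-subtract (restoring) division, which produces one quotient bit per pass and needs no separate d1/d2 accumulators. It is not the routine posted here, only an alternative under stated assumptions: the divisor in node_long1 is nonzero and below $8000_0000, and d_rem and d_count are hypothetical scratch registers that would have to be added to the res block.

```spin
' Sketch: unsigned 64 bit / 32 bit division, quotient replaces the dividend.
' Assumes node_long1 (divisor) is nonzero and below $8000_0000.
div             mov     d_rem, #0                       ' running remainder
                mov     d_count, #64                    ' one pass per dividend bit
:loop           shl     host_long1, #1          wc      ' shift the 64 bit dividend left...
                rcl     host_long2, #1          wc      ' ...so its top bit drops into C
                rcl     d_rem, #1                       ' and enters the remainder
                cmpsub  d_rem, node_long1       wc      ' subtract divisor if it fits (C = 1)
        if_c    or      host_long1, #1                  ' record the quotient bit in the freed LSB
                djnz    d_count, #:loop
                mov     node_long1, d_rem               ' remainder, as the original returns it
                mov     node_long2, #0
div_ret         ret

d_rem           res     1                               ' hypothetical scratch registers
d_count         res     1
```

The quotient bits reuse the space vacated as the dividend shifts left, so after 64 passes host_long2:host_long1 holds the quotient. The divisor limit keeps the remainder within 32 bits after each shift. A zero divisor would still need its own check, as in the original.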

Duane Degn · 2013-07-25 13:40

average joe wrote: »

Not sure if anyone else will find this helpful but once it's done I will be releasing under MIT license.

This is very interesting to me. I wonder why you have it launch the cog with each call instead of having a normal "Start" method to launch the cog? I don't remember the exact time, but launching a cog takes a bit of time and if one had a lot of math to perform, the current arrangement sure seems like it would slow the process down significantly.

I was envisioning something like F32 that launched a cog and waited for a command.

Did you not want to tie the cog up permanently?
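
The "launch once and wait for a command" arrangement could look roughly like the sketch below. It is an illustration only, not F32's actual interface: the mailbox variables (command, hostAddr, nodeAddr), the method names and the register names are all assumptions, and only add and subtract are shown.

```spin
VAR
  long  command, hostAddr, nodeAddr         ' mailbox shared with the cog (keep this order)

PUB Start : okay
'' Launch the math cog once. Returns cog number + 1, or 0 if none was free.
  command := 0
  okay := cognew(@entry, @command) + 1

PUB Add64(hostPtr, nodePtr)
  Request("a", hostPtr, nodePtr)

PUB Sub64(hostPtr, nodePtr)
  Request("s", hostPtr, nodePtr)

PRI Request(op, hostPtr, nodePtr)
  hostAddr := hostPtr
  nodeAddr := nodePtr
  command := op                             ' written last: this releases the cog
  repeat while command                      ' cog clears command once the result is stored

DAT
                org     0
entry           rdlong  op_code, par            wz      ' poll the command long
        if_z    jmp     #entry                          ' nothing to do yet
                mov     t_ptr, par
                add     t_ptr, #4
                rdlong  h_ptr, t_ptr                    ' address of host low long
                add     t_ptr, #4
                rdlong  n_ptr, t_ptr                    ' address of node low long
                rdlong  h_lo, h_ptr
                add     h_ptr, #4
                rdlong  h_hi, h_ptr
                rdlong  n_lo, n_ptr
                add     n_ptr, #4
                rdlong  n_hi, n_ptr
                cmp     op_code, #"a"           wz
        if_z    add     h_lo, n_lo              wc      ' wc only, so Z survives for the next line
        if_z    addx    h_hi, n_hi
                cmp     op_code, #"s"           wz
        if_z    sub     h_lo, n_lo              wc
        if_z    subx    h_hi, n_hi
                wrlong  h_hi, h_ptr                     ' h_ptr still points at the high long
                sub     h_ptr, #4
                wrlong  h_lo, h_ptr
                wrlong  zero, par                       ' signal completion to the caller
                jmp     #entry

zero            long    0
op_code         res     1
t_ptr           res     1
h_ptr           res     1
n_ptr           res     1
h_lo            res     1
h_hi            res     1
n_lo            res     1
n_hi            res     1
```

The trade-off is the one raised in this post: each call avoids the cog start-up time, but the cog polls the hub continuously, and Request would wait forever if Start had not succeeded first.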

average joe · 2013-07-25 14:09

I'm actually going to have 2 versions. One that's launched on a per-call basis and one that just stays running. I'm only occasionally doing the math and device is battery powered so trying to keep the current consumption down. That's the reasoning behind that.

Still doing some sanity checks and it seems to work okay. I do need to put in the comments that mul-div only use 32 bit multiplier / divisor...

Duane Degn · 2013-07-25 15:02

average joe wrote: »

I'm actually going to have 2 versions. One that's launched on a per-call basis and one that just stays running. I'm only occasionally doing the math and device is battery powered so trying to keep the current consumption down. That's the reasoning behind that.

Still doing some sanity checks and it seems to work okay. I do need to put in the comments that mul-div only use 32 bit multiplier / divisor...

Sounds good.

I haven't run into the need for 64-bit math myself but I'm pretty sure it's just a matter of time until I'll need/want it.

Thanks for posting your code.

Lawson · 2013-07-25 15:45

Fyi. A few of the forum gurus figured out how to do 64-bit addition and subtraction in spin. The "50 bit maths" thread has it all.
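
For readers without that thread to hand, here is one possible way to do it in plain Spin. This sketch is not taken from the "50 bit maths" thread. It relies on the fact that Spin's comparisons are signed, so both operands are XORed with NEGX ($8000_0000) to get an unsigned comparison for detecting the carry or borrow. The method names are made up, and the values are assumed to be stored low long first.

```spin
PUB Add64(aPtr, bPtr) | lo, carry
'' a := a + b, where aPtr and bPtr point at the low long of each 64 bit value.
  lo := long[aPtr] + long[bPtr]
  carry := ((lo ^ NEGX) < (long[aPtr] ^ NEGX)) & 1      ' low sum wrapped if it is below a (unsigned)
  long[aPtr][1] += long[bPtr][1] + carry
  long[aPtr] := lo

PUB Sub64(aPtr, bPtr) | borrow
'' a := a - b, same layout as Add64.
  borrow := ((long[aPtr] ^ NEGX) < (long[bPtr] ^ NEGX)) & 1   ' borrow if a_lo is below b_lo (unsigned)
  long[aPtr][1] -= long[bPtr][1] + borrow
  long[aPtr] -= long[bPtr]
```

A true comparison yields -1 in Spin, so masking with 1 turns it into a carry of exactly one.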

Lawson

kuroneko · 2013-07-25 22:22

average joe wrote: »

I think I have it working. Could really use a couple more sets of eyes on it though!

Are you sure you want to keep your multiply like that? I'm not saying it isn't working but it may take an awful long time. There is also at least one wrong jump in there (missing #).

average joe · 2013-07-26 00:10

kuroneko wrote: »

Are you sure you want to keep your multiply like that? I'm not saying it isn't working but it may take an awful long time. There is also at least one wrong jump in there (missing #).

Honestly, no I don't want to keep it that way. Sadly I don't have much time to continue working on this. I have a TON of code to write and my deadline is quickly approaching. It will work for now since 99 iterations won't take that long and it's not called very often. I admit this object still could use quite a bit of work but it will have to wait.

groggory · 2013-07-26 12:06

I know this is already solved, but for binary multiplication and division you could always code it essentially the way you would do it by hand:

http://academic.evergreen.edu/projects/biophysics/technotes/misc/bin_math.htm

For multiplying two 64 bit numbers, that would be 64 iterations of 64 comparisons, then 64 numbers to add together. This would leave you with a 128 bit result. It would be pretty straightforward to program and a bit costly to run (~4100 simple math operations plus the adds), but not nearly as costly as adding the multiplicand the multiplier number of times (up to ~2^64 add operations).

Nice job Joe on tackling this problem. Maybe this can be revisited in the future to expand and optimize a 64 bit math library.
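
The by-hand method described above maps onto PASM fairly directly for the 64 bit by 32 bit case used in this object. The sketch below is an assumption-laden illustration, not a drop-in replacement: it reuses the original names host_long1, host_long2, node_long1 and multiplier, adds two hypothetical registers p_lo and p_hi, and keeps only the low 64 bits of the product (no overflow check).

```spin
' Sketch: shift-and-add multiply, 64 bit multiplicand by 32 bit multiplier.
mult            mov     multiplier, node_long1          ' 32 bit multiplier, as in the original
                mov     p_lo, #0                        ' clear the 64 bit product
                mov     p_hi, #0
:loop           shr     multiplier, #1          wc, wz  ' C = next multiplier bit, Z = no bits left
        if_c    add     p_lo, host_long1        wc      ' add the shifted multiplicand...
        if_c    addx    p_hi, host_long2                ' ...carrying into the high long
                shl     host_long1, #1          wc      ' multiplicand <<= 1 across both longs
                rcl     host_long2, #1
        if_nz   jmp     #:loop                          ' Z still holds the result of the shr
                mov     host_long1, p_lo                ' return the product in the host longs
                mov     host_long2, p_hi
mult_ret        ret

p_lo            res     1                               ' hypothetical product registers
p_hi            res     1
```

The loop ends as soon as the multiplier has no bits left, so a multiplier of 100 (%1100100) takes 7 passes and 3 additions instead of 99 calls to addi, and the worst case is 32 passes.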
