I'm trying to do my first assembly code, but no luck

Robert Watkins · 2009-12-12 04:27

I'm just starting with the Propeller. As an exercise I've written a numerical method in Spin to estimate pi, and I want to do the same in assembly.

In spin, I've used an object (object) to do the floating point math. This object appears to support floating point in assembler, but I'm not able to figure out how to use these features in assembler.

I've attached the assembler version of the algorithm, but I'm not able to figure out how to connect the floating point features to my code.

Any assistance would be helpful.

Robert

Mike Green · 2009-12-12 04:42

You want to create a user-defined-function as described on the bottom of page 8 of the Floating Point Library documentation. You could use your assembly routine, but you'd have to combine it with the assembly code for Float32 and possibly delete one of the transcendental functions to make room for your code.

You'd need to use Float32Full rather than Float32A if you want to create a user-defined-function. Look at the documentation and the source code for Float32Full.

stevenmess2004 · 2009-12-12 04:46

That all looks right but what you have to do is copy the routines you want to use (and any they use) into the piText.spin file.

Robert Watkins · 2009-12-12 04:52

Thanks for the quick reply!

I'm still not sure what to do, though. Page 8 only has a snippet of code, and I still can't connect what is written there to what I need to do.

I want to use the functions in the library. With Spin, I just reference them in the OBJ section. How is this done in assembler?

EDIT:

I just saw the second post. I tried to copy the assembler code to mine, but there must be something more to the user-defined functions that I don't understand.

stevenmess2004 · 2009-12-12 05:59

Okay, what you want is something like the attached. It compiles, but I haven't checked that it gives the answer you want. It should give you a good enough idea of what you need to do, though. I had to change your last line to

abs  time, time     'to accommodate the negative value

Mike Green · 2009-12-12 06:14

We're talking about two different solutions.

1) Remove some of the routines from Float32 to make room for your code. Probably Sin and Cos. You'd substitute your code for them and probably use the same calls from Spin to call your routine (make both Sin and Cos call your routine).

2) Use the user-defined-function mechanism in Float32Full/Float32A. This essentially provides an interpreter for a simple instruction set (see FFuncCmd through JmpNaNCmd and FAddCmd through CeilCmd in the list of command constants at the beginning of Float32Full.spin) with the commands in the upper 16 bits of a long and some kind of operand in the lower 16 bits. You'd then call FFunc with the address of the interpretive code as a parameter. As shown in the documentation, the first long has to be a JmpCmd with the lower 16 bits being the address of the JmpCmd. A zero long ends the list. Jumps (Jmp...) are relative to the first long.
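
To make the layout in (2) concrete, here is a small Python model of how one such long could be packed and unpacked. It only illustrates the split described above (command in the upper 16 bits, operand in the lower 16 bits). The command numbers and the hub address are invented for the example; the real values are the constants in Float32Full.spin.

```python
# Toy model of a user-defined-function "instruction" long.
# ASSUMPTION: the numeric command codes below are invented for illustration;
# the real values are the constants defined in Float32Full.spin.
LOAD_CMD = 0x0001 << 16   # hypothetical command code, already in the upper 16 bits
FADD_CMD = 0x0002 << 16   # hypothetical command code


def pack(cmd, operand=0):
    """Combine a command (upper 16 bits) with a 16-bit operand (lower 16 bits)."""
    if not 0 <= operand <= 0xFFFF:
        raise ValueError("operand must fit in 16 bits")
    return (cmd & 0xFFFF0000) | operand


def unpack(word):
    """Split a 32-bit instruction long into (command, operand)."""
    return word & 0xFFFF0000, word & 0xFFFF


j_address = 0x1234                 # hypothetical hub address of a variable
word = pack(LOAD_CMD, j_address)   # plays the role of: long flt#LoadCmd + @j
print(hex(word))                   # prints 0x11234
```

Because the command already sits in the upper half of the long, a plain addition such as flt#LoadCmd + @j is enough to build the instruction, provided the address fits in 16 bits.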

stevenmess2004's solution would probably work except there's no provision for returning the result to the Spin code and signalling that your assembly routine is done. The "ret" from Toggle won't work because there's no way to return to the other cog.

Post Edited (Mike Green) : 12/12/2009 6:23:12 AM GMT

Mike Green · 2009-12-12 07:08

Here's an untested conversion of your program that does your algorithm as a user-defined-function:

CON   
  _clkmode = xtal1 + pll16x
  _xinfreq = 5_000_000

OBJ flt : "Float32Full"   
   
PUB Main | startTime, elapsed
  flt.start
  startTime := cnt                          ' Starting time
  flt.FFunc(@Compute)                       ' Do computation
  elapsed := cnt - startTime                ' Elapsed time

DAT
Compute   long   flt#JmpCmd   + @Compute    ' Establish starting address
LoopIt    long   flt#LoadCmd  + @j          ' Get starting value
          long   flt#FAddCmd                ' Increment it
          long   1.0
          long   flt#SaveCmd  + @j
          long   flt#FMulCmd                ' Multiply by two
          long   2.0
          long   flt#SaveCmd  + @temp1      ' Save to be able to square it
          long   flt#FMulCmd  + @temp1      ' Now we have (j * 2) * (j * 2)
          long   flt#SaveCmd  + @temp1      ' Keep (4 * j^2) for the division below
          long   flt#FSubCmd                ' (4 * j^2) - 1
          long   1.0
          long   flt#SaveCmd  + @temp2      ' Save so we can swap
          long   flt#LoadCmd  + @temp1
          long   flt#FDivCmd  + @temp2      ' (4 * j^2) / ((4 * j^2) - 1)
          long   flt#FMulCmd  + @temp3      ' Multiply by previous result
          long   flt#SaveCmd  + @temp3      ' Save for next time
          long   flt#LoadCmd  + @j          ' Check count for limit
          long   flt#FCmpCmd
          long   100000.0
          long   flt#JmpLtCmd + @LoopIt
          long   0                          ' End of user function
        
 j        long      0.0                     ' Loop counter
 temp1    long      0.0
 temp2    long      0.0
 temp3    long      1.0                     ' Result

Post Edited (Mike Green) : 12/12/2009 5:09:33 PM GMT

Robert Watkins · 2009-12-12 16:36

Thanks to the both of you. I truly appreciate being able to see the multiple solutions and to get a sense of making them do what I want.

I'll dig into this some more and see what I come up with.

Beau Schwabe · 2009-12-12 16:59

Mike,

Is this line of your example code in error?

long flt#LongCmd + @j ' Check count for limit

Robert Watkins,

I will see you over at Dee's tonight, and I can look at your code in person there.

Welcome aboard to the forum!!

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Beau Schwabe

IC Layout Engineer
Parallax, Inc.

Mike Green · 2009-12-12 17:11

@Beau - Thanks for catching that ... too much typing late at night. I corrected the previous message.

It would be nice if Cam could provide a little more documentation on this. I'm still figuring out the details by going through his code. I'm at the point where I can see mostly what he's done with the user-defined-function stuff, but not why. I'm sure that'll come eventually.

Robert Watkins · 2009-12-13 02:34

OK, Here is a working version. Yay! Thanks for everyone's help

CON
  _clkmode = xtal1 + pll16x
  _xinfreq = 5_000_000

OBJ flt : "Float32Full"
  Debug   : "FullDuplexSerialPlus"
    fString : "FloatString"

PUB Main | startTime, elapsed

  Debug.start(31, 30, 0, 57600)
  Waitcnt(clkfreq*2 + cnt)
  Debug.tx(Debug#CLS)
  flt.start
  startTime := cnt                          ' Starting time
  flt.FFunc(@Compute)                       ' Do computation
  elapsed := cnt - startTime                ' Elapsed time
  elapsed := elapsed / clkfreq
  debug.str(String("Time "))
  debug.dec (elapsed)
  Debug.str(String(":Pi="))
  debug.str(fstring.FloatToString(temp3))
  'Debug.str(String(":iterations="))
  'Debug.str(fstring.FloatToString(j))

DAT
Compute   long   flt#JmpCmd   + @Compute    ' Establish starting address
LoopIt    long   flt#LoadCmd  + @j          ' Get starting value
          long   flt#FAddCmd                ' Increment it
          long   1.0
          long   flt#SaveCmd  + @j
          long   flt#FMulCmd                ' Multiply by two
          long   2.0
          long   flt#SaveCmd  + @temp1      ' Save to be able to square
          long   flt#FMulCmd  + @temp1      ' Now we have (j * 2) * (j * 2)
          long   flt#SaveCmd  + @temp1
          long   flt#FSubCmd                ' (4 * j^2) - 1
          long   1.0
          long   flt#SaveCmd  + @temp2      ' Save so we can swap
          long   flt#LoadCmd  + @temp1
          long   flt#FDivCmd  + @temp2      ' (4 * j^2) / ((4 * j^2) - 1)
          long   flt#FMulCmd  + @temp3      ' Multiply by previous result
          long   flt#SaveCmd  + @temp3      ' Save for next time
          long   flt#LoadCmd  + @j          ' Check count for limit
          long   flt#FCmpCmd
          long   100000.0
          long   flt#JmpGtCmd + @LoopIt
          long   flt#LoadCmd  + @temp3
          long   flt#FmulCmd
          long   2.0
          long   flt#SaveCmd  + @temp3
          long   0                          ' End of user function

 j        long      0.0                     ' Loop counter
 temp1    long      1.0
 temp2    long      1.0
 temp3    long      1.0                     ' Result

Mike Green · 2009-12-13 02:36

I'd be curious about how much faster this version is compared to one completely in Spin.

Robert Watkins · 2009-12-13 02:41

Here's the spin version

''Calculate pi numerically

CON
   
  _clkmode = xtal1 + pll16x
  _xinfreq = 5_000_000
   

OBJ
   
  Debug   : "FullDuplexSerialPlus"
  fMath   : "Float32"
  fString : "FloatString"
   
   
PUB Calculate | a, b, c, i, j, time

  fMath.start
 
  Debug.start(31, 30, 0, 57600)
  Waitcnt(clkfreq*2 + cnt)
  Debug.tx(Debug#CLS)
  b := 1.0
  a := 1.0
  i := 0
  j := 1.0
  time := cnt

  repeat i from 1 to 100000
    b := fmath.FMul(fmath.FMul(2.0,j),fmath.FMul(2.0,j))
    b := fmath.FDiv(b,fmath.FSub(b,1.0))
    a := fmath.FMul (a,b)
    j := fmath.FAdd (j,1.0)

  a := fmath.FMul (a,2.0)
  time := cnt - time
  time := time / clkfreq
  debug.str(String("Time "))
  debug.dec (time)
  Debug.str(String(":"))
  debug.str(fstring.FloatToString(a))
  Debug.str(String(":"))

  b := pi
  c := fmath.FSub(a, b)

  Debug.str(String("Calculated Pi - System Pi = "))

  debug.str(fstring.FloatToString(c))
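
For anyone who wants to check the algorithm off the chip: the loop above multiplies together the terms 4j^2 / (4j^2 - 1), which is the Wallis product for pi/2, and then doubles the result. Below is a rough Python equivalent. It is only a sketch for comparison: it runs in double precision on a PC, so the digits will not match the 32-bit floats on the Propeller, and the function name is made up.

```python
import math


def wallis_pi(terms):
    """Estimate pi with the Wallis product: pi/2 = product of 4j^2 / (4j^2 - 1)."""
    product = 1.0
    for j in range(1, terms + 1):
        four_j_sq = (2.0 * j) * (2.0 * j)          # same as b := (2j) * (2j)
        product *= four_j_sq / (four_j_sq - 1.0)   # same as a := a * (b / (b - 1))
    return 2.0 * product                           # final doubling, as in the Spin code


estimate = wallis_pi(100_000)
print(estimate, math.pi - estimate)
```

The product converges slowly. After 100000 terms the estimate is still slightly below pi, with an error of a little under 1e-5, so most of the remaining difference from the system pi comes from the method itself.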

Robert Watkins · 2009-12-13 02:42

Funny you should ask... :)

The Spin version takes 24 seconds and the assembler version takes 8 seconds. With all the cogs working together, maybe we could get that down to about 2?
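
A side note on how these times are measured: cnt is a 32-bit counter that ticks once per system clock, and Spin longs are signed, so the subtraction wraps. Here is a small Python model of that arithmetic, assuming the 80 MHz clock from the CON block (5 MHz crystal with pll16x). The helper names are made up for the sketch.

```python
CLKFREQ = 80_000_000          # 5 MHz crystal * pll16x, as set in the CON block
MASK32 = 0xFFFFFFFF


def to_signed32(value):
    """Interpret a 32-bit pattern as a signed Spin long."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def elapsed_ticks(start, now):
    """Model of Spin's `cnt - startTime` with 32-bit wraparound."""
    return to_signed32(now - start)


start = 0xFFFF_0000                             # counter about to wrap
now = (start + 24 * CLKFREQ) & MASK32           # 24 seconds later
print(elapsed_ticks(start, now) // CLKFREQ)     # prints 24
print((2**31 - 1) / CLKFREQ)                    # longest interval (seconds) that stays positive
```

The subtraction still gives the right answer when the counter wraps in between, but only for intervals under roughly 26.8 seconds at this clock speed. Subtracting in the other order gives the same magnitude with a negative sign.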

Beau Schwabe · 2009-12-13 18:04

Mike, All,

After reviewing your code example last night with Robert, I think I understand the user-defined-function option within Float32Full.

In a way, 'Float32A' is written to serve as an interpreter for the longs you place in the DAT section. Given that, I am reposting Mike's code with heavy comments. Hopefully this will make better sense of it, as it wasn't obvious to me when I first sat down to look at this method of defining a user function within Float32Full.

CON
  _clkmode = xtal1 + pll16x
  _xinfreq = 5_000_000

OBJ flt : "Float32Full"
  Debug   : "FullDuplexSerial"
    fString : "FloatString"

PUB Main | startTime, elapsed

  Debug.start(31, 30, 0, 38400)
  Waitcnt(clkfreq*2 + cnt)
  Debug.tx(0)
  flt.start
  startTime := cnt                          ' Starting time
  flt.FFunc(@Compute)                       ' Do computation
  elapsed := cnt - startTime                ' Elapsed time
  elapsed := elapsed / clkfreq
  debug.str(String("Time "))
  debug.dec (elapsed)
  Debug.tx(13)
  Debug.str(String("i="))
  debug.str(fstring.FloatToString(temp3))
  'Debug.str(String(":iterations="))
  'Debug.str(fstring.FloatToString(j))

DAT
Compute   long   flt#JmpCmd   + @Compute    ' Establish starting address to define a function called "Compute"
'---------------------------------------------------------------------------------------------------------------------
LoopIt    long   flt#LoadCmd  + @j          ' Load the accumulator of the "Float32A" interpreter with the value of "j"

'---------------------------------------------------------------------------------------------------------------------
          long   flt#FAddCmd                ' Add 1 to the accumulator
          long   1.0
'---------------------------------------------------------------------------------------------------------------------
          long   flt#SaveCmd  + @j          ' Copy the accumulator into "j"
'---------------------------------------------------------------------------------------------------------------------
          long   flt#FMulCmd                ' Multiply the accumulator by two
          long   2.0
                                            ' Now we have (j * 2)
'---------------------------------------------------------------------------------------------------------------------
          long   flt#SaveCmd  + @temp1      ' Copy the accumulator into "temp1"
'---------------------------------------------------------------------------------------------------------------------
          long   flt#FMulCmd  + @temp1      ' Multiply the accumulator by "temp1"

                                            ' Now we have (j * 2) * (j * 2)  ...or...  (4 * j^2)
'---------------------------------------------------------------------------------------------------------------------
          long   flt#SaveCmd  + @temp1      ' Copy the accumulator into "temp1"
'---------------------------------------------------------------------------------------------------------------------
          long   flt#FSubCmd                ' Subtract 1 from the accumulator
          long   1.0
                                            ' Now we have (4 * j^2) - 1
'---------------------------------------------------------------------------------------------------------------------
          long   flt#SaveCmd  + @temp2      ' Copy the accumulator into "temp2" ; so we can swap
'---------------------------------------------------------------------------------------------------------------------
          long   flt#LoadCmd  + @temp1      ' Load the accumulator with the value of "temp1"

                                            ' Now we have (4 * j^2)
'---------------------------------------------------------------------------------------------------------------------
          long   flt#FDivCmd  + @temp2      ' Divide the accumulator by "temp2"
                                            ' Now we have (4 * j^2) / ((4 * j^2) - 1)
'---------------------------------------------------------------------------------------------------------------------
          long   flt#FMulCmd  + @temp3      ' Multiply the accumulator by "temp3" ; Multiply previous result
'---------------------------------------------------------------------------------------------------------------------
          long   flt#SaveCmd  + @temp3      ' Copy the accumulator into "temp3" ; The Result
'---------------------------------------------------------------------------------------------------------------------
          long   flt#LoadCmd  + @j          ' Load the accumulator with the value of "j" ;  Check count for limit
'---------------------------------------------------------------------------------------------------------------------
          long   flt#FCmpCmd                ' Compare the accumulator value against 100000 and store result back into
                                            ' the accumulator
          long   100000.0
'---------------------------------------------------------------------------------------------------------------------
          long   flt#JmpGtCmd + @LoopIt     ' If the accumulator indicated that the previous comparison was less than
                                            ' the compared value then Jump to the Address labeled "LoopIt" and repeat
                                            ' the process all over again.
'---------------------------------------------------------------------------------------------------------------------
          long   flt#LoadCmd  + @temp3      ' Load the accumulator with the value of "temp3"
'---------------------------------------------------------------------------------------------------------------------
          long   flt#FmulCmd                ' Multiply the accumulator by "2"
          long   2.0
'---------------------------------------------------------------------------------------------------------------------
          long   flt#SaveCmd  + @temp3      ' Copy the accumulator into "temp3" ; The Result
'---------------------------------------------------------------------------------------------------------------------
          long   0                          ' End of user function ; Terminates user defined function
'---------------------------------------------------------------------------------------------------------------------

'---------------------------------------------------------------------------------------------------------------------
                                            '' User defined variables to be used in function
'---------------------------------------------------------------------------------------------------------------------
 j        long      0.0                     ' Loop counter
 temp1    long      0.0
 temp2    long      0.0
 temp3    long      1.0                     ' Result


Post Edited (Beau Schwabe (Parallax)) : 12/13/2009 9:04:56 PM GMT

Mike Green · 2009-12-13 18:21

To add to Beau's comments ...

The lower 16 bits of a floating point "instruction" are usually the address of the operand (a long in hub memory). If they are zero, the next LONG is taken as a literal constant.

The floating point accumulator is initialized to zero. The value left in the accumulator is returned as the result of the FFunc Spin call.

Notice that there are no provisions for arrays, subscripting, or integer calculations. This facility is really intended for implementing new functions, not complex whole programs; those are meant to be done in Spin, where such capabilities are available.
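
If it helps to see the accumulator model in action without a Propeller, here is a toy Python interpreter that follows the rules described above: the accumulator starts at zero, an instruction with no operand takes the next entry as a literal, a zero entry ends the function, and whatever is left in the accumulator is the result. This is only a sketch and not Float32Full's code. The command names are made up, hub addresses are replaced by dictionary keys, jump targets are list indices, the leading JmpCmd entry is left out, and the way the comparison result is kept and tested by the jump is a simplifying assumption.

```python
import math


def run(program, memory):
    """Toy accumulator machine; returns the value left in the accumulator."""
    acc = 0.0          # accumulator starts at zero
    cmp_result = 0     # ASSUMPTION: sign of the last comparison, kept separately
    pc = 0
    while True:
        entry = program[pc]
        pc += 1
        if entry == 0:                 # a zero entry ends the function
            return acc
        cmd, operand = entry
        if cmd == "JLT":               # hypothetical: jump if acc was < compared value
            if cmp_result < 0:
                pc = operand
            continue
        if cmd == "SAVE":              # copy the accumulator into a variable
            memory[operand] = acc
            continue
        if operand is None:            # no operand: the next entry is a literal
            value = program[pc]
            pc += 1
        else:
            value = memory[operand]
        if cmd == "LOAD":
            acc = value
        elif cmd == "FADD":
            acc += value
        elif cmd == "FSUB":
            acc -= value
        elif cmd == "FMUL":
            acc *= value
        elif cmd == "FDIV":
            acc /= value
        elif cmd == "FCMP":
            cmp_result = (acc > value) - (acc < value)
        else:
            raise ValueError(f"unknown command {cmd!r}")


LOOP = 0  # index of the first instruction of the loop
program = [
    ("LOAD", "j"), ("FADD", None), 1.0, ("SAVE", "j"),    # j += 1
    ("FMUL", None), 2.0, ("SAVE", "t1"),                  # t1 = 2j
    ("FMUL", "t1"), ("SAVE", "t1"),                       # t1 = 4j^2
    ("FSUB", None), 1.0, ("SAVE", "t2"),                  # t2 = 4j^2 - 1
    ("LOAD", "t1"), ("FDIV", "t2"),                       # 4j^2 / (4j^2 - 1)
    ("FMUL", "result"), ("SAVE", "result"),               # running product
    ("LOAD", "j"), ("FCMP", None), 1000.0,                # compare j with the limit
    ("JLT", LOOP),                                        # loop while j < 1000
    ("LOAD", "result"), ("FMUL", None), 2.0,              # leave 2 * product in acc
    0,
]
memory = {"j": 0.0, "t1": 0.0, "t2": 0.0, "result": 1.0}
print(run(program, memory), math.pi)
```

The limit is set to 1000 here just to keep the run short; the returned value is the pi estimate left in the accumulator, in the same way that the FFunc call returns its result.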
