Learning PASM2 and cordic!
I've been learning PASM to convert some math intensive routines over to PASM for improved speeds. In the process I've learned lots of new things and spent a lot of time digging into whatever documentation and examples I could get my hands on.
I wanted to share some of the things I learned that might help someone else who is new to P2 PASM.
Writing new PASM code from scratch was difficult: figuring out each next step was tedious and I wasn't making much progress. So I coded the math routine in SPIN first and worked on it until I was getting the expected output. I wrote the SPIN as a series of steps so I could break down the sequence and use it as a guideline. After that I converted only a line or 2 of SPIN at a time, which made it much easier to keep track of everything.
I coded the PASM in the same routine as the SPIN (it might have been easier to create separate routines, but this worked as I bounced back and forth between SPIN and PASM). Debug statements follow most SPIN code lines so I could match the SPIN values with the equivalent PASM values. If they didn't match or I got an error, then I had something new to learn about PASM!
pub GetXYLegPosition(femurA, tibiaA, coxaA, body):x1, y1 | f1, t1, c1, c2, femurL, tibiaL, coxaL, scale1, degree1, tmp1, step1, step2, step3, step4, step5, step6, step7, step8
  'calculate distances using angles in Forward Kinematics
  'input 3 leg angles and body from the floor distance
  'return leg Cartesian location in x,y coordinates
  'uses pr0-pr2 as additional temporary variable registers
  femurL := femurLength
  tibiaL := tibiaLength
  coxaL := coxaLength
  org
        mov     f1, ##1800
        sub     f1, femurA
        mov     t1, tibiaA
        add     t1, femurA
        mov     c1, coxaA
        cmp     c1, ##900 wcz
  if_a  sub     c1, ##1800
        abs     c1, c1
        debug (sdec(f1), sdec(t1), sdec(c1))
        mov     tmp1, femurL
        mul     tmp1, tmp1              'femurL squared
        mov     pr0, tibiaL
        mul     pr0, pr0                'tibia squared
        add     tmp1, pr0               'add values and store in tmp1
        mov     pr1, tmp1               'copy tmp1 over to pr1
        debug (sdec(tmp1), sdec(pr1))
        mov     pr0, femurL
        mul     pr0, #2                 'mult femurL by 2
        mul     pr0, tibiaL             'mult tibiaL by prev results, store in pr0
        debug (sdec(pr0))
        mov     scale1, ##10000         'cosine start
        mov     degree1, ##$80000000
        setq    #0
        qdiv    degree1, ##1800
        getqx   degree1
        ' debug (sdec(degree1))
        qmul    degree1, t1
        getqx   tmp1
        ' debug (sdec(tmp1))
        qrotate scale1, tmp1
        ' debug (sdec(scale1), sdec(tmp1))
        getqx   tmp1                    'cosine value in tmp1
        debug (sdec(tmp1))
        qmul    tmp1, pr0               'mult cosine by pr0
        getqx   tmp1                    'store in tmp1
        qdiv    tmp1, ##10000           'div by 10000 to get in right range
        getqx   tmp1                    'store in tmp1
        ' debug (sdec(tmp1))
        sub     pr1, tmp1               'sub tmp1 from pr1, result in pr1
        ' debug (sdec(pr1), sdec(pr0))
        qsqrt   pr1, #0                 'take square root
        getqx   pr2                     'store square root in pr2
        ' debug (sdec(pr2))
        mul     pr2, pr2                'square pr2
        mul     body, body              'square body height
        sub     pr2, body               'subtract values, store in pr2
        qsqrt   pr2, #0
        getqx   tmp1                    'store value in tmp1
        ' debug (sdec(tmp1))
        add     tmp1, coxaL             'get total length, value of side from tip to femur pivot
        ' debug (sdec(tmp1))
        mov     degree1, ##$80000000    'convert coxaA to value readable by pasm cordic
        setq    #0
        qdiv    degree1, ##1800
        getqx   degree1
        qmul    degree1, c1
        getqx   degree1                 'degree1 is c1 coxaA ready for cordic
        ' debug (sdec(tmp1), sdec(degree1))
        qrotate tmp1, degree1
        getqx   x1
        getqy   y1
        ' debug (sdec(x1), sdec(y1))
  end
  f1 := 1800 - 701 'femurAngle[n]
  t1 := 4 + 701 'tibiaAngle[n] + femurAngle[n]
  c2 := 1000
  c1 := c2
  if c1 > 900
    c1 := 1800 - c1 'convert angle to CCW rotation for math
  debug("Leg XY angle positions: Leg ", sdec(f1), sdec(t1), sdec(c1))
  'test faster calculations
  step1 := t1 'get angle
  debug(sdec(step1))
  step2 := (femurLength*femurLength) + (tibiaLength*tibiaLength)
  debug(sdec(step2))
  step3 := 2*femurLength*tibiaLength
  debug(sdec(step3))
  step4 := abs(qcos(10000,step1,3600))
  debug(sdec(step4))
  step5 := (step3 * step4)/10000
  debug(sdec(step5))
  step6 := step2 - step5
  debug(sdec(step6))
  step7 := sqrt(step6)
  debug(sdec(step7))
  step8 := sqrt((step7*step7) - (bodyz*bodyz))
  debug(sdec(step8))
  step1 := step8 + coxaLength
  debug(sdec(step1))
  step2 := abs(c1 * ($80000000 / 180))
  debug("c1 conversion: ", sdec_(step2))
  x1, y1 := polxy(step1, step2)
  if c2 < 900
    step3 := step3 * -1
  debug("Results: ", "Hypot: ", sdec_(step1), " angle: ", sdec_(step2), " x: ", sdec_(x1), " y: ", sdec_(y1))
Found a limitation of 16 registers (for variables) in PASM for a routine. This includes the variables passed to the routine, return values, and local variables. From the SPIN docs there are an additional 8 registers labeled pr0 through pr7 that can also be used for any purpose and don't count toward that 16 register limit.
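To make the register budget concrete, here is a minimal sketch. The method name ScaleAndOffset and its variables are made up for illustration and are not part of the routine above. Parameters, return values, and locals together use up the 16 slots, while pr0 serves as scratch without taking one of them.

```spin2
pub ScaleAndOffset(value, offset) : result | tmp
  ' value, offset, result and tmp use 4 of the 16 longs visible to inline PASM
  org
        mov     tmp, value          ' work on a copy of the parameter
        shl     tmp, #2             ' tmp := value * 4
        mov     pr0, offset         ' pr0..pr7 are extra scratch registers
        add     pr0, tmp            ' pr0 := offset + value * 4
        mov     result, pr0         ' written back to the Spin2 return value at END
  end
```

Any local declared after the 16th long (like step1 through step8 in my routine) can still be used from SPIN, just not from the inline PASM.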
Qrotate was interesting to figure out, there were several discussions about using it but little code that I could use. Getting the input angle into the right format was the hardest as I wanted to input 1/10 degree increments so scaling the value was required. Another thing that tripped me up for a while was multiplication sequences that exceeded 32 bits, a problem until I finally realized what was happening and scaled the values down enough to stay inside the 32 bit limit.
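One possible way around the 32-bit ceiling, instead of scaling the inputs down, is to keep the full 64-bit product and divide it back down. This is only a sketch and assumes all values are unsigned: QMUL returns the low half through GETQX and the high half through GETQY, and a SETQ directly before QDIV supplies the upper 32 bits of the dividend. The name MulDivU is hypothetical.

```spin2
pub MulDivU(a, b, divisor) : quotient | lo, hi
  ' computes (a * b) / divisor for unsigned values
  org
        qmul    a, b                ' unsigned 32 x 32 -> 64-bit product
        getqx   lo                  ' lower 32 bits of the product
        getqy   hi                  ' upper 32 bits of the product
        setq    hi                  ' upper half of the 64-bit dividend
        qdiv    lo, divisor         ' {hi, lo} / divisor
        getqx   quotient            ' only meaningful if the quotient fits in 32 bits
  end
```

For example, a cosine scaled by 10000 could be multiplied by a large length and then divided by 10000 again without the intermediate product being truncated.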
I really liked using qrotate to get sine and cosine by making the hypotenuse value = 1 or a scaled value!
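Here is a small sketch of that sine/cosine trick on its own. It swaps the QDIV/QMUL pair from the listing for a single QFRAC, so treat it as an alternative rather than what the routine above does. It assumes the input is in tenths of a degree and stays within 0..3599; the name SinCosTenths is made up.

```spin2
pub SinCosTenths(tenths) : cosv, sinv | angle
  ' tenths: angle in 1/10 degree, assumed to be in the range 0..3599
  org
        qfrac   tenths, ##3600      ' {tenths, 32 zero bits} / 3600 = tenths * 2^32 / 3600
        getqx   angle               ' 32-bit angle, $4000_0000 = 90 degrees
        qrotate ##10000, angle      ' rotate the point (10000, 0) by angle
        getqx   cosv                ' x = cosine scaled by 10000
        getqy   sinv                ' y = sine scaled by 10000
  end
```

The results are signed, so angles past 90 degrees give negative cosines without any extra handling.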
The ability to use direct integer values using # in front of the value was great, but when I tried that with a value >511 I got an error. It took a while, but digging in the docs I finally found that ## allows numbers up to 32 bits.
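A quick illustration of the two immediate forms (the method name ImmediateDemo is just for this example):

```spin2
pub ImmediateDemo() : total
  org
        mov     total, #511         ' largest value that fits the 9-bit immediate field
        add     total, ##1800       ' ## makes the assembler add an AUGS prefix instruction
        add     total, ##$8000_0000 ' so any 32-bit constant can be used
  end
```

The ## form costs one extra instruction, so # is still the better choice whenever the value is 511 or less.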
I couldn't find much on qsqrt example wise, still don't know what the second input value is supposed to be but putting qsqrt value1, #0 works.
I looked but couldn't find a way to use the SPIN declared constants in PASM without using local variables in the routine.
This method worked well for me to keep me focused on using small steps and on the specific task I needed. I'm sure this is a long way around for the experienced PASM coders out there but it worked well in my case.
Next item is to use a variety of angle inputs to make sure it doesn't break and work to see if there is any way to simplify and speed up the routine. I know cordic routines can be stacked but the way I am using them the results of one cordic is needed for the next cordic so not sure that option is available to me here.
Suggestions for improvements are welcome and definitely let me know if I misstated something or have the wrong interpretation of what I saw.
Bob
Comments
The S input for QSQRT is the upper 32 bits of the 64-bit value to root, so indeed, zero is correct.
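As a sketch of when a non-zero S would come up: taking the root of a product that no longer fits in 32 bits. The method name SqrtOfProduct is hypothetical, and the values are assumed to be unsigned.

```spin2
pub SqrtOfProduct(a, b) : root | lo, hi
  org
        qmul    a, b                ' 64-bit product of two unsigned longs
        getqx   lo                  ' lower 32 bits
        getqy   hi                  ' upper 32 bits
        qsqrt   lo, hi              ' root of {hi, lo}: D is the low half, S the high half
        getqx   root
  end
```

With a plain 32-bit input the high half is zero, which is exactly the qsqrt value1, #0 form.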
(Unrelatedly, you never need to do SETQ #0, it defaults to zero.)
Also, you always want to try to get as many instructions as you can in between the CORDIC command and getting the result. If you're daring you can even run multiple commands in parallel.
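A minimal sketch of the first suggestion, with made-up names (RotateWithOverlap, a, b, sum): work that does not depend on the CORDIC result is placed between the command and the GETQX.

```spin2
pub RotateWithOverlap(length, angle, a, b) : x, y, sum
  org
        qrotate length, angle       ' start the CORDIC operation
        mov     sum, a              ' independent work runs while the CORDIC is busy
        add     sum, b
        getqx   x                   ' waits only for whatever latency remains
        getqy   y
  end
```

This only helps when there really is independent work available; a chain where each CORDIC input depends on the previous result still has to wait.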
Furthermore, I think you could eliminate a lot of the multiplies and divides if you changed your angle units to be based on powers of two instead of tenth degrees (i.e. what the CORDIC takes natively and also the only valid(tm) way to represent angles in a computer).
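To show what that could look like, here is a sketch that assumes angles are stored with 4096 steps per full turn. That resolution is picked purely for illustration, as is the name RotateBinary. The conversion to the native 32-bit CORDIC angle becomes a single shift instead of a divide and a multiply.

```spin2
pub RotateBinary(length, angle12) : x, y | angle
  ' angle12: hypothetical 12-bit angle, 4096 steps per full turn
  org
        mov     angle, angle12
        shl     angle, #20          ' 4096 * 2^20 = 2^32, the native CORDIC full circle
        qrotate length, angle       ' rotate the point (length, 0) by angle
        getqx   x
        getqy   y
  end
```

Wrap-around also comes for free, since adding angles simply overflows past the full circle.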
Constants should just work. Just use them. In fact, PASM immediate values can be any constant expression (add x,#4+6*8 is perfectly valid).
I would have never thought of trying to do the math by hand, ouch. I have built a SCARA robot and a Hexapod and never once thought about doing the math by hand. If I had to do the math by hand I would still be trying to figure out the SCARA math.
Fortunately I program in C and all the angle and floating point math was done for me. I just had to figure out how to calculate the angle given two sides of the triangle.
I also limited my math to 1 degree, as the servos are not that accurate, and even with the stepper motors 1 degree seems reasonable to me.
Mike
PS: It also looks like SPIN could benefit from a math library.
Remember, your method arguments, local variables, and PASM have to be moved into the lower memory of the Spin interpreter cog to run. You might consider dividing the problem into smaller steps and then putting them into their own cog (all PASM) that you can call as needed. Also consider doing timing tests on your code. I have found in a couple of cases that native Spin2 is as fast or faster than moving code into the Spin interpreter and executing it on the fly.
As Ada pointed out, inline PASM can use declared constants. In the PASM code they have to be prefixed with # or ## (if the value is > 511).
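A short sketch of both cases. The constant names and values here are invented for the example and are not the ones from the routine above.

```spin2
con
  FEMUR_LEN = 600                   ' hypothetical value, larger than 511 so it needs ##
  SCALE     = 10                    ' hypothetical value, small enough for #

pub UseConstants(x) : r
  org
        mov     r, ##FEMUR_LEN      ' 32-bit immediate from a CON symbol
        add     r, #SCALE * 4       ' constant expressions work as immediates too
        add     r, x
  end
```

Using the constants directly also frees up the local variables that were only there to carry them into the PASM.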
Thanks for the suggestions! I learned about the limit on the number of registers in PASM because I chose to code the PASM in the same routine as the SPIN. It was a good lesson to learn and forced me to do some looking in the SPIN docs, which led me to learning about the PR0-PR7 registers as an option also.
I’ll be trying out using the constants in the code, it didn’t work for me when I tried it initially, it may have been because the value was >511.
Keep in mind that inline PASM cannot use more than $123 registers in the cog, or your code will impinge on the interpreter's code.
I directly accessed the SPIN constants in PASM, not sure why it didn't work the first time I tried, must have been trying to use it in the wrong context. I also removed the setq #0 entry based on the info that the Q register is = 0 when initialized.
Question, is there a way to access global variables in PASM? Can they only be passed to inline pasm via parameters?
You can pass them as arguments to the method, or you could pass a base address and then read a collection of values from there.
That's not quite right: Q will have some indeterminate value when entering your inline ASM. However, most instructions ignore the Q register if they aren't directly preceded by a SETQ.
Thanks for the clarification
Thanks, I’ll try that out today. I will try out passing the base address, I believe I saw an example somewhere.
Just remember to advance your base address by the size of the variable you read. If compiling with Propeller Tool, you can mix longs, words, and bytes together -- the compiler doesn't re-order them. If you use FlexProp, though, the variables will be re-ordered (longs, then words, then bytes).
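A minimal sketch of the base-address approach, assuming three longs stored back to back. The names legAngles, SumAngles and SumFrom are hypothetical.

```spin2
var
  long  legAngles[3]                ' hypothetical globals: femur, tibia, coxa

pub SumAngles() : total
  total := SumFrom(@legAngles)      ' pass the hub address of the first long

pri SumFrom(base) : total | v
  org
        rdlong  v, base             ' read the first long from hub RAM
        mov     total, v
        add     base, #4            ' advance by the size of a long
        rdlong  v, base
        add     total, v
        add     base, #4
        rdlong  v, base
        add     total, v
  end
```

For words or bytes the same pattern would use rdword or rdbyte and advance the address by 2 or 1 instead.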
Not anymore, I don't think (was fixed quite a couple versions ago).
Thank you for letting me know that, Ada. I ran into troubles with FlexProp when reading a WAV file header from memory -- back then the variables were re-ordered which created a bit of trouble.