Enhancing Spin2 with Inline Assembly - Let's Build a Code Cookbook
JonnyMac
Updates in the P2 architecture and instruction set resulted in a significant speed improvement over the P1 -- roughly 16x with the same code at the same clock speed. Still, there are processes that will require the precision of assembly. In the P1, we are forced to start a cog to use assembly. In the P2, we're able to use inline assembly, a feature long enjoyed by those using compilers.
For those wondering how inline assembly works with the Spin interpreter, let's look at the structure of a P2 method that uses inline assembly.
    pub method(param1, param2) : result | local1, local2

      ' setup code (Spin)

      org
        ' assembly instructions
      end

      ' finish code (Spin)

When a method that uses inline assembly is encountered, any parameters, result variable(s), local variable(s), and the assembly code are moved into a reserved area of the Spin interpreter cog. If the assembly segment needs variables, they are defined as local variables of the method. When the assembly code is finished, the parameters, result(s), and local(s) are moved back to the hub. This allows pre- and post-assembly work with these variables. You can think of the process as temporarily adding a command to the Spin interpreter.
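To make the skeleton concrete, here is a minimal sketch. The method name `scale_add` and its arithmetic are hypothetical, chosen only to show a parameter, a local, and the result variable being used as registers between `org` and `end`.

```spin2
pub scale_add(value, offset) : result | tmp

' Returns (value * 4) + offset
' -- hypothetical example, not from the original post

  org
                mov       tmp, value                    ' copy parameter into local work register
                shl       tmp, #2                       ' tmp := value * 4
                add       tmp, offset                   ' tmp := tmp + offset
                mov       result, tmp                   ' hand the answer back to Spin
  end
```

Called from Spin as `x := scale_add(10, 3)`, this should leave 43 in `x`. The local `tmp` is not strictly needed here; it is included to show that work registers for the assembly are declared as ordinary method locals.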
Please join me in this thread by sharing your inline assembly snippets. Let's build a code cookbook to help each other and those new to the Propeller 2.
Tips:
-- Keep your code short, and neatly formatted; if it's easy on the eyes, it will be easier to follow.
-- Comments are always a good thing; more is better.
-- Include an archive with a demo that shows off your cool code.
Comments
Here's a working Spin2 method that duplicates the functionality of the PBASIC PULSIN command (with a little better resolution):
While this works, we may try our hand at PASM2 by duplicating the heart of the method in assembly. Working with a small bit of assembly in a method like this is easier than launching a cog. Here's my final assembly version.
Inline assembly makes it easy to learn PASM2.
And, turning on LEDs is always fun...
Edit: This is a far more elegant solution, as suggested by @Wuerfel_21 (thank you!). Note, again, that this code uses a constant for the number of ticks in a millisecond.
I think you could write it slightly better as
To Ray's point, if you really wanted to duplicate the HIGH and LOW instructions from PBASIC, you could -- as he points out, twiddling an output (especially one with an LED) is a great place to start learning. When I attended the Skip Barber Racing School they would tell us, "Slow in is fast out." This was referring to setting up for a corner -- exit speed from the corner is far more important than entry speed. That is to say, start slowly/simply and work your way toward complex code.
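A minimal sketch of that idea follows. The method names `set_high` and `set_low` are hypothetical, not from any code posted in this thread. Each one wraps a single PASM2 instruction, which makes it a gentle first experiment with inline assembly.

```spin2
pub set_high(pin)

' Make pin an output and drive it high (like PBASIC HIGH)
' -- hypothetical method name

  org
                drvh      pin                           ' DIR = 1, OUT = 1 for this pin
  end


pub set_low(pin)

' Make pin an output and drive it low (like PBASIC LOW)
' -- hypothetical method name

  org
                drvl      pin                           ' DIR = 1, OUT = 0 for this pin
  end
```

Spin2 already provides pinhigh() and pinlow() for this job, so these methods are for practice rather than production use.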
I'm just taking my first steps in reading the PASM2 documents, which led me to this thread and a question:
Isn't there a chance to get a wrong reading when doing a getct for the high long and then another getct for the low long?
For example, the getct for the high long reads $0000_0000 and the getct for the low long also reads $0000_0000. This might mean that ct(h) was read just 2 ticks before ct(l) rolled over, so the correct result would actually be $0000_0001_0000_0000.
This means that it is dangerous to rely on the result of your division, as it might be off by (2^32 - 1) / 200 MHz, roughly 21.5 seconds, every once in a while.
That actually doesn't happen. I think the high long is incremented if the low long is about to overflow or something.
Chip designed the hardware to account for low word rollover. I’m on my phone, so can’t find where it states it in the documentation.
My recollection is that a GETCT WC latches the state of both halves and stalls interrupts for one instruction to protect a following GETCT.
There may be a fix-up as she states, but either way, the error you are concerned about is prevented by Chip’s thoughtful design. Just one more thing that makes the P2 such a great processor.
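For reference, the two-instruction read being discussed can be sketched as below. This is a sketch under assumptions, not the code posted earlier in the thread: the method names `get_ct64` and `millis` are hypothetical, the 200 MHz clock is taken from the question, and the rollover protection is as described in the replies above.

```spin2
con

  _clkfreq = 200_000_000                                ' assumed system clock (from the question)
  MS_001   = _clkfreq / 1_000                           ' system ticks in one millisecond


pub get_ct64() : lo, hi

' Read the 64-bit system counter as two longs
' -- hypothetical method name

  org
                getct     hi                    wc      ' upper 32 bits first
                getct     lo                            ' lower 32 bits, immediately after
  end


pub millis() : ms | lo, hi

' Milliseconds since reset: 64-bit counter / ticks per millisecond
' -- hypothetical method name

  org
                getct     hi                    wc      ' upper 32 bits
                getct     lo                            ' lower 32 bits
                setq      hi                            ' high long of the 64-bit dividend
                qdiv      lo, ##MS_001                  ' start CORDIC divide: {hi, lo} / MS_001
                getqx     ms                            ' fetch the quotient when the CORDIC is done
  end
```

The order matters: the high long is read with WC first and the low long follows on the very next instruction. At 200 MHz the millisecond quotient fits in a long for roughly 49.7 days.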