Using Inline PASM2 to Learn PASM2
JonnyMac
During my presentation on Spin programming I stated that one of my favorite features of the P2 is inline PASM for small sections of code. I love this feature. Just yesterday I was working on a simplistic flash storage object and wanted to speed up the bit-banged SPI coms to the flash. Some will wonder why I didn't choose smart pins for SPI; the truth is that it takes more code to set up and use smart pins for SPI than it does to just write bit-banged code. These are the working methods for shifting data in and out of the flash.
pri shiftout(value, bits) | t, x

  t := ticks                                    ' copy timing

  org
                cmp       bits, #32       wz    ' 32 bits?
    if_e        jmp       #.bbso                ' yes, go
                mov       x, #32                ' no, move msb to value.[31]
                sub       x, bits
                shl       value, x

.bbso           rep       #8, bits
                 shl      value, #1       wc    ' get next bit
                 drvc     #SF_SDO               ' put on SDO
                 waitx    t                     ' let SDO settle
                 drvh     #SF_SCK               ' clock the bit
                 waitx    t
                 waitx    t
                 drvl     #SF_SCK
                 waitx    t
  end


pri shiftin(bits) : value | t

  t := ticks                                    ' copy timing

  org
                rep       #8, bits
                 waitx    t                     ' let SDI settle
                 testp    #SF_SDI         wc    ' sample SDI
                 rcl      value, #1             ' move new bit into value.[0]
                 drvh     #SF_SCK               ' clock the next bit
                 waitx    t
                 waitx    t
                 drvl     #SF_SCK
                 waitx    t
  end

For ad hoc use like this, bit-banged SPI is sensible. For massive data moves, using the streamer (which I haven't learned yet) and smart pin SPI makes sense -- the loader does that.
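For anyone who wants to check the bit ordering off-chip, here is a rough host-side model in Python (not P2 code). It only mimics the data path of the two methods above: left-justify the value, shift the MSB into carry, and rebuild the value on the receiving side with a rotate-through-carry. The function names are made up for this sketch, the timing and pin driving are ignored, and it assumes bits is between 1 and 32.

```python
MASK32 = 0xFFFFFFFF

def shiftout_bits(value, bits):
    """Return the bits the shiftout loop would put on SDO, MSB first (hypothetical helper)."""
    value &= MASK32
    if bits != 32:
        # Mirrors the mov/sub/shl preamble: move the field's MSB up to bit 31.
        value = (value << (32 - bits)) & MASK32
    out = []
    for _ in range(bits):
        carry = (value >> 31) & 1        # shl value, #1 wc: C = bit shifted out
        value = (value << 1) & MASK32
        out.append(carry)                # drvc would drive the pin from C
    return out

def shiftin_bits(stream):
    """Rebuild a value from sampled bits, first bit received ends up highest (hypothetical helper)."""
    value = 0
    for bit in stream:
        value = ((value << 1) | bit) & MASK32   # rcl value, #1: C enters at bit 0
    return value

if __name__ == "__main__":
    print(shiftout_bits(0xA5, 8))
    print(hex(shiftin_bits(shiftout_bits(0xA5, 8))))
```

Feeding the output of one function into the other should return the original value, which is a quick way to confirm that both sides agree on MSB-first order.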
What if you're brand new to PASM? The great thing about having inline PASM is that we can use it to learn and test very small bits of code. Case in point: another forum member asked me about the differences between SHL (shift left) and SAL (shift arithmetic left). To be candid, I didn't even know the latter instruction existed! That said, I made a guess that is described in the cryptic notation of the PASM spreadsheet -- and verified with some code.
pub main() | x

  setup()

  wait_for_terminal(true)

  x := $12345678
  term.fhex(shl_test(x, 4), 8)
  term.tx(13)
  term.fhex(sal_test(x, 4), 8)
  term.tx(13)
  term.tx(13)

  x := $12345679
  term.fhex(shl_test(x, 4), 8)
  term.tx(13)
  term.fhex(sal_test(x, 4), 8)
  term.tx(13)

  repeat
    waitct(0)


pub shl_test(value, bits) : result

  org
                shl       value, bits
                mov       result, value
  end


pub sal_test(value, bits) : result

  org
                sal       value, bits
                mov       result, value
  end

The results from PST:
23456780
23456780

23456790
2345679F

My suspicion was confirmed: SAL shifts the value left and fills in the LSB side with the original bit 0. This makes sense as SAR shifts everything to the right and pads the MSB end with the original bit 31.
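One way to convince yourself of that symmetry is a small host-side model in Python (not P2 code). The sketch below assumes the behaviour described in this post: SAR pads with the original bit 31, SAL pads with the original bit 0, and only the low five bits of the shift count are used (that last point is stated in the spreadsheet for SHL and SAL; I'm assuming it for SAR too). The function names are made up for illustration.

```python
MASK32 = 0xFFFFFFFF

def sar(d, s):
    """Model of SAR: shift right, pad the MSB end with the original bit 31."""
    d &= MASK32
    s &= 31                                  # assumption: only S[4:0] is used
    fill = MASK32 if d & 0x80000000 else 0
    return ((d >> s) | (fill << (32 - s))) & MASK32

def sal(d, s):
    """Model of SAL as observed above: shift left, pad the LSB end with the original bit 0."""
    d &= MASK32
    s &= 31
    fill = ((1 << s) - 1) if d & 1 else 0
    return ((d << s) & MASK32) | fill

def rev32(x):
    """Reverse the order of the 32 bits in x."""
    return int(f"{x & MASK32:032b}"[::-1], 2)

if __name__ == "__main__":
    x = 0x12345679
    # SAL should equal SAR applied to the bit-reversed value, reversed back.
    print(f"{sal(x, 4):08X}", f"{rev32(sar(rev32(x), 4)):08X}")
```

If the model is right, the two printed values match for any input and shift count, which is just another way of saying SAL is SAR seen in a mirror.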
The notes in the spreadsheet can be a little cryptic. Here's the description for SHL.
SHL
Shift left. D = [63:32] of ({D[31:0], 32'b0} << S[4:0]). C = last bit shifted out if S[4:0] > 0, else D[31]

In English this is saying that a 64-bit value is built from the original value (in the high long) and 32 zeros in the low long. The entire thing is shifted left and the final result is the upper 32 bits. This explains why SHL pads the LSB side with 0s.
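If the notation is still hard to read, it can help to transcribe it literally. The following is a hedged Python model (host-side, not P2 code) of the SHL description quoted above, including the C flag rule. The function name and the tuple return are my own choices for this sketch.

```python
MASK32 = 0xFFFFFFFF

def shl(d, s):
    """Model of the spreadsheet text for SHL; returns (result, carry)."""
    d &= MASK32
    s &= 31                              # only S[4:0] takes part in the shift
    wide = (d << 32) << s                # {D[31:0], 32'b0} << S[4:0]
    result = (wide >> 32) & MASK32       # keep bits [63:32]
    if s > 0:
        carry = (d >> (32 - s)) & 1      # last bit shifted out of the top
    else:
        carry = (d >> 31) & 1            # no shift: C = D[31]
    return result, carry

if __name__ == "__main__":
    result, carry = shl(0x12345678, 4)
    print(f"{result:08X} C={carry}")
```

With $12345678 and a shift of 4, the model gives the same 23456780 that the test program printed.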
Now for SAL.
SAL
Shift arithmetic left. D = [63:32] of ({D[31:0], {32{D[0]}}} << S[4:0]). C = last bit shifted out if S[4:0] > 0, else D[31]

This time, the lower long of the internal 64-bit register is made up of 32 copies of whatever was in our original bit 0. You can see the result of this behavior in the example above, and I've attached my little PASM test program if you want to experiment (and if you're not already a PASM expert, you should experiment).
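As a cross-check on the two descriptions, here is a small Python model (host-side, not P2 code) that builds the 64-bit intermediate exactly as the spreadsheet text reads and runs the same four cases as the test program. The function name and the arithmetic flag are hypothetical, chosen only for this sketch.

```python
MASK32 = 0xFFFFFFFF

def shift_left(d, s, arithmetic):
    """Model of SHL (arithmetic=False) and SAL (arithmetic=True) from the spreadsheet text."""
    d &= MASK32
    # Low long: 32 zeros for SHL, 32 copies of D[0] for SAL.
    low = MASK32 if (arithmetic and (d & 1)) else 0
    wide = ((d << 32) | low) << (s & 31)     # shift the 64-bit value by S[4:0]
    return (wide >> 32) & MASK32             # result is bits [63:32]

def run_cases():
    """Return the four results in the same order the test program prints them."""
    lines = []
    for x in (0x12345678, 0x12345679):
        lines.append(f"{shift_left(x, 4, False):08X}")   # shl_test
        lines.append(f"{shift_left(x, 4, True):08X}")    # sal_test
    return lines

if __name__ == "__main__":
    print("\n".join(run_cases()))
```

The model reproduces the PST output from the post, which suggests the reading of the notation given here is the right one.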
Comments
I’m trying to sort out why Chip’s SAL is different from every other Shift Arithmetic Left instruction that I have seen. Example: on Intel x86 platforms, ASL/SAL and SHL produce the same opcode. They are simply aliases for each other. The low-order bits get filled in with zeros. But on the P2, SHL and SAL produce two different opcodes, and the SAL fills the low-order bits with ones while SHL fills with zeros.
I’m totally lost as to why this is so different. Can anyone straighten me out on this? Chip? Jon? Buehler?
You could do it like this in Spin.
Edited to add: this is the behaviour I am used to (from Wikipedia: https://en.m.wikipedia.org/wiki/Arithmetic_shift )
“Equivalence of arithmetic and logical left shifts and multiplication:
Arithmetic left shifts are equivalent to multiplication by a (positive, integral) power of the radix (e.g., a multiplication by a power of 2 for binary numbers). Logical left shifts are also equivalent, except multiplication and arithmetic shifts may trigger arithmetic overflow whereas logical shifts do not.”
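To make the difference concrete, here is a hedged Python sketch (host-side, not P2 code) that compares a plain left shift, multiplication by a power of two, and the P2 SAL behaviour as reported in the first post (low bits filled with copies of the original bit 0). All three function names are made up for this illustration, and results are truncated to 32 bits.

```python
MASK32 = 0xFFFFFFFF

def shl(d, s):
    """Logical left shift, truncated to 32 bits."""
    return ((d & MASK32) << (s & 31)) & MASK32

def times_pow2(d, s):
    """Multiply by 2**s, truncated to 32 bits."""
    return ((d & MASK32) * (1 << (s & 31))) & MASK32

def p2_sal(d, s):
    """P2 SAL as reported in the post: vacated low bits take the original bit 0."""
    d &= MASK32
    s &= 31
    fill = ((1 << s) - 1) if d & 1 else 0
    return ((d << s) & MASK32) | fill

if __name__ == "__main__":
    for d in (2, 3):
        print(d, hex(shl(d, 4)), hex(times_pow2(d, 4)), hex(p2_sal(d, 4)))
```

In this model SHL always agrees with the multiplication, while SAL agrees only when the original value is even (bit 0 clear) or the shift count is zero.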
First line I created a new word in the dictionary called SHIFT which automatically switches to assembler (then indents) and then with one single line of code "_ret_ sal a,#8" I shift the parameter 'a' (the top of the data stack) left arithmetically by 8 bits, leaving the result on the stack, and return in the same instruction (because we can). As I type each line in code it will list the machine code since I have the listing turned on.
Now the word SHIFT is available at the TAQOZ prompt, so type in a number which goes on top of the stack (a) and implicitly call our new word SHIFT, then examine what's on top of the stack in binary format with .BIN (Print Binary).
One area we shift left is when we are bit-bashing SPI and I2C data which goes msb first, so we keep shifting one bit at a time with the result in the carry and then drive the data pin according to the state of the carry.
P.S. While there is such a thing as "shift arithmetically right" since the msb is the sign bit, it is not the same thing when shifting left since the lsb is not the sign bit. Nonetheless, the P2 can do this and it is the reverse of the SAR operation and had to be called something.
In most other micros you can only shift/rotate one bit per instruction so the result is straightforward.
For this output: