Inline asm not working as expected
OK, I am learning PASM as inline assembly in my C programs. This is a copy of the entire program; as written, it doesn't flash the QuickStart LEDs every second:
#include <propeller.h>
#include "simpletools.h"
int main(void)
{
  //pause(1000);
  __asm__ volatile (
    "mov r1, #0xff\n\t"
    "shl r1, #16\n\t"
    "or dira, r1\n\t"
    "or outa, r1\n\t"
    "my_label: "
    "rdlong r2, #0\n\t"
    "add r2, cnt\n\t"
    "or outa, r1\n\t"
    "rdlong r2, #0\n\t"
    "add r2, cnt\n\t"
    "waitcnt r2, #0\n\t"
    "andn outa, r1\n\t"
    "rdlong r2, #0\n\t"
    "add r2, cnt\n\t"
    "waitcnt r2, #0\n\t"
    "brs #my_label\n\t"
  );
  return 0;
}
Now, if I uncomment "pause(1000)", it runs just fine on my QuickStart. If I leave it commented out, it does not flash the lights as expected. I understand that a brief pause is required before using statements such as "printf", but is it also needed before ASM statements? I'm confused.
Thanks
Daniel
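For anyone following the timing logic in the program above: each "rdlong r2, #0" / "add r2, cnt" / "waitcnt r2, #0" group reads the system clock frequency from hub address 0, adds the current counter value and waits until the counter reaches that sum, which is about one second. The mask in r1 is 0xFF shifted left by 16, i.e. pins P16..P23 (the QuickStart LEDs). Below is a minimal host-side C sketch of that arithmetic. It is not Propeller code, the function names are hypothetical, and the 80 MHz clock in the note is an assumption.

```c
#include <stdint.h>

/* Pin mask built by "mov r1, #0xff" + "shl r1, #16": bits 16..23. */
static uint32_t led_mask(void)
{
    return (uint32_t)0xFF << 16;
}

/* Value handed to waitcnt: current counter plus one second of ticks.
 * Unsigned 32-bit addition wraps around, just like the PASM "add". */
static uint32_t waitcnt_target(uint32_t cnt_now, uint32_t clkfreq)
{
    return cnt_now + clkfreq;
}

/* Ticks still to wait; also correct when the target wrapped past zero. */
static uint32_t ticks_remaining(uint32_t cnt_now, uint32_t target)
{
    return target - cnt_now;
}
```

With an assumed 80 MHz clock and a counter value of 0xFFFFFF00, waitcnt_target wraps to 0x04C4B300 and ticks_remaining still reports 80000000 ticks, so the counter rollover by itself should not disturb the blink rate.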

Comments
If you need that kind of control, you should launch assembly in a COG.
It might be getting put into fcache, and that might break your code. If it is, you'll be able to see that happening in the assembly output.
Here is some code that intentionally goes into the fcache:
/**
 * Output a byte on the I2C bus.
 *
 * @param byte the 8 bits to send on the bus.
 * @returns true if the device acknowledges, false otherwise.
 */
bool SendByte(const unsigned char byte)
{
  int result;
  int datamask, nextCNT, temp;

  __asm__ volatile(
    " fcache #(PutByteEnd - PutByteStart)\n\t"
    " .compress off \n\t"

    /* Setup for transmit loop */
    "PutByteStart: "
    " mov %[datamask], #256 \n\t" /* 0x100 */
    " mov %[result], #0 \n\t"
    " mov %[nextCNT], cnt \n\t"
    " add %[nextCNT], %[clockDelay] \n\t"

    /* Transmit Loop (8x) */
    //Output bit of byte
    "PutByteLoop: "
    " shr %[datamask], #1 \n\t" // Set up mask
    " and %[datamask], %[databyte] wz,nr \n\t" // Move the bit into Z flag
    " muxz dira, %[SDAMask] \n\t"

    //Pulse clock
    " waitcnt %[nextCNT], %[clockDelay] \n\t"
    " andn dira, %[SCLMask] \n\t" // Set SCL high
    " waitcnt %[nextCNT], %[clockDelay] \n\t"
    " or dira, %[SCLMask] \n\t" // Set SCL low

    //Return for more bits
    " djnz %[datamask], #__LMM_FCACHE_START+(PutByteLoop-PutByteStart) nr \n\t"

    // Get ACK
    " andn dira, %[SDAMask] \n\t" // Float SDA high (release SDA)
    " waitcnt %[nextCNT], %[clockDelay] \n\t"
    " andn dira, %[SCLMask] \n\t" // SCL high (by float)
    " waitcnt %[nextCNT], %[clockDelay] \n\t"
    " mov %[temp], ina \n\t" //Sample input
    " and %[SDAMask], %[temp] wz,nr \n\t" // Z set if SDA is low (ack'd), else nack
    " muxz %[result], #1 \n\t" // Set result to equal to Z flag (aka, 1 if ack'd)
    " or dira, %[SCLMask] \n\t" // Set scl low
    " or dira, %[SDAMask] \n\t" // Set sda low
    " jmp __LMM_RET \n\t"
    "PutByteEnd: "
    " .compress default \n\t"
    : // Outputs
      [datamask] "=&r" (datamask),
      [result] "=&r" (result),
      [nextCNT] "=&r" (nextCNT),
      [temp] "=&r" (temp)
    : // Inputs
      [SDAMask] "r" (sda_mask_),
      [SCLMask] "r" (scl_mask_),
      [databyte] "r" (byte),
      [clockDelay] "r" (clock_delay_)
  );

  return result;
}
I disagree a bit on that point. My example case? An I2C driver, where all realtime happenings work at the command of the master.
Most of the Propeller objects have an unfortunate flaw: you must use two cogs to do one thing. One cog for the realtime, and one cog for the processing. With PropellerGCC, in certain circumstances, you can merge these functions into a single cog. I2C is a good example: you want the speed and precision, but you don't need it all the time. In between you want the ease of programming that C offers. The above function shows a bit of that. Anyway, that's my opinion.
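To make the transmit loop in SendByte above easier to follow, here is a small host-side C model of which bits end up on SDA and in what order. It is only a sketch under assumptions: the helper name is hypothetical, it runs on a PC rather than a Propeller, and it uses a plain counter where the PASM loop relies on "djnz ... nr" finishing once the mask reaches 1.

```c
#include <stdint.h>

/* Hypothetical helper: fills sda_released[0..7] with 1 where SDA is left
 * floating (data bit = 1) and 0 where the master drives it low (data
 * bit = 0), most significant bit first, following the datamask walk
 * from 0x80 down to 0x01 in the PASM loop. */
static void i2c_bit_sequence(uint8_t byte, int sda_released[8])
{
    unsigned datamask = 0x100;          /* mov datamask, #256 */

    for (int i = 0; i < 8; i++) {
        datamask >>= 1;                 /* shr datamask, #1 */
        /* "and ... wz,nr" then "muxz dira, SDAMask":
         * the pin is driven low only when the selected bit is 0. */
        sda_released[i] = (datamask & byte) != 0;
    }
}
```

For the byte 0xA5 the sequence is 1, 0, 1, 0, 0, 1, 0, 1, i.e. the byte sent MSB first with open-drain signalling.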
Doing a JMP to __LMM_RET is different because it is a predefined symbol for the function return address in LMM mode.
I use in-line ASM in the 115200 bps simple serial driver in libsimpletext and other drivers for one-COG solutions. Such things are very valuable. I wouldn't use any tricks in this kind of code though.
__attribute__((fcache)) static void _outbyte(int bitcycles, int txmask, int value)
{
  int j = 10;
  int waitcycles;

  waitcycles = CNT + bitcycles;
  while(j-- > 0) {
    /* C code is too big and not fast enough for all memory models.
    // waitcycles = waitcnt2(waitcycles, bitcycles);
    */
    __asm__ volatile("waitcnt %[_waitcycles], %[_bitcycles]"
                     : [_waitcycles] "+r" (waitcycles)
                     : [_bitcycles] "r" (bitcycles));

    /* if (value & 1) OUTA |= txmask else OUTA &= ~txmask; value = value >> 1; */
    __asm__ volatile("shr %[_value],#1 wc \n\t"
                     "muxc outa, %[_mask]"
                     : [_value] "+r" (value)
                     : [_mask] "r" (txmask));
  }
}
In case it matters, optimizer was off, and memory model was LMM.
I guess my problem was solved, I will just have to read more docs on PropGCC to understand the options better.
Thanks
Daniel
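As a side note on the _outbyte loop posted earlier in the thread: the two inline-asm statements wait one bit time and then shift the next bit out through the carry flag, ten times per byte. The host-side C sketch below models that. The helper names are hypothetical, the 8N1 framing of value (start bit, 8 data bits, stop bit) is an assumption about what the caller passes in, and so is the 80 MHz clock used in the note.

```c
#include <stdint.h>

/* Clock ticks per serial bit, i.e. the bitcycles argument. */
static uint32_t bit_cycles(uint32_t clkfreq, uint32_t baud)
{
    return clkfreq / baud;
}

/* Assumed framing: start bit (0), 8 data bits LSB first, stop bit (1). */
static uint32_t frame_8n1(uint8_t byte)
{
    return ((uint32_t)byte | 0x100u) << 1;
}

/* Mirrors the loop body: "shr value, #1 wc" moves the lowest bit into
 * carry, and "muxc outa, mask" copies carry to the TX pin. */
static void shift_out(uint32_t value, int levels[10])
{
    for (int j = 0; j < 10; j++) {
        levels[j] = (int)(value & 1u);  /* bit that lands in carry */
        value >>= 1;
    }
}
```

At an assumed 80 MHz and 115200 bps this gives 694 ticks per bit, and the byte 0x41 goes out as 0, 1, 0, 0, 0, 0, 0, 1, 0, 1. That tight per-bit budget is the reason given in the code comment for using waitcnt in assembly rather than C.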
It sounds like the original problem was the "Enable pruning" option. My guess is that without the pause() call, the compiler thought nothing was happening in main and removed the whole program.
Shouldn't the __volatile__ in "__asm__ __volatile__" prevent the inline assembly sequence from getting pruned?
===Jac
That will protect it from the compiler, but perhaps not the linker. That's the only explanation I can think of for his described behavior. I think turning off linker pruning is the easiest work-around.
As far as I understand, the linker works at the function level, and uses references to determine which functions are needed; unreferenced functions (functions that aren't called by any other function) are pruned. The main function should never get pruned by the linker because it gets called by the C runtime startup code which is at the entry point into the executable file. In other words, I'm still not sure if pruning is the reason that the code doesn't work.
Maybe danielstritt can post the assembly output of the failing code?
===Jac
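The reachability argument above can be sketched in a few lines of host-side C: start at the entry point, follow references, and keep only what was visited. This is a simplified model, not how the GNU linker is actually implemented, and the function table is hypothetical. As far as I know, "Enable pruning" corresponds to compiling with -ffunction-sections/-fdata-sections and linking with --gc-sections, but treat that mapping as an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_FUNCS 8

/* Hypothetical program: startup code calls main, nothing calls the rest. */
enum { FN_STARTUP, FN_MAIN, FN_PAUSE, FN_UNUSED, FN_COUNT };

/* calls[i][j] is true when function i references function j.
 * Marks every function reachable from 'from' in keep[]. */
static void mark_reachable(bool calls[MAX_FUNCS][MAX_FUNCS], size_t n,
                           size_t from, bool keep[MAX_FUNCS])
{
    if (keep[from])
        return;                         /* already visited */
    keep[from] = true;
    for (size_t j = 0; j < n; j++)
        if (calls[from][j])
            mark_reachable(calls, n, j, keep);
}
```

With only the reference FN_STARTUP -> FN_MAIN recorded, main is kept while FN_PAUSE and FN_UNUSED are dropped. Under this model, commenting out pause() removes pause from the image but cannot remove main itself.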
No pruning, works as expected:
GNU assembler version 2.21 (propeller-elf)
  using BFD version (propellergcc_v1_0_0_2090) 2.21.
options passed : -lmm -ahdlnsg=lmm/activity board.asm
input file     : C:\Users\Daniel\AppData\Local\Temp\cc7JVFNP.s
output file    : lmm/activity board.o
target         : propeller-parallax-elf
time stamp     :

   1                    .text
   2                .Ltext0
   3                    .balign 4
   4                    .global _main
   5                _main
   6                .LFB0
   7                    .file 1 "activity board.c"
   1:activity board.c **** #include <propeller.h>
   2:activity board.c ****
   3:activity board.c **** int main(void)
   4:activity board.c **** {
   8                    .loc 1 4 0
   5:activity board.c **** __asm__ volatile (
   9                    .loc 1 5 0
  10                ' 5 "activity board.c" 1
  11 0000 FF00FCA0      mov r1, #0xff
  12 0004 1000FC2C      shl r1, #16
  13 0008 0000BC68      or dira, r1
  14 000c 0000BC68      or outa, r1
  15 0010 0000FC08      my_label: rdlong r2, #0
  16 0014 0000BC80      add r2, cnt
  17 0018 0000BC68      or outa, r1
  18 001c 0000FC08      rdlong r2, #0
  19 0020 0000BC80      add r2, cnt
  20 0024 0000FCF8      waitcnt r2, #0
  21 0028 0000BC64      andn outa, r1
  22 002c 0000FC08      rdlong r2, #0
  23 0030 0000BC80      add r2, cnt
  24 0034 0000FCF8      waitcnt r2, #0
  25 0038 2C00FC84      brs #my_label
  26
  27                ' 0 "" 2
   6:activity board.c **** "mov r1, #0xff\n\t"
   7:activity board.c **** "shl r1, #16\n\t"
   8:activity board.c **** "or dira, r1\n\t"
   9:activity board.c **** "or outa, r1\n\t"
  10:activity board.c **** "my_label: "
  11:activity board.c **** "rdlong r2, #0\n\t"
  12:activity board.c **** "add r2, cnt\n\t"
  13:activity board.c **** "or outa, r1\n\t"
  14:activity board.c **** "rdlong r2, #0\n\t"
  15:activity board.c **** "add r2, cnt\n\t"
  16:activity board.c **** "waitcnt r2, #0\n\t"
  17:activity board.c **** "andn outa, r1\n\t"
  18:activity board.c **** "rdlong r2, #0\n\t"
  19:activity board.c **** "add r2, cnt\n\t"
  20:activity board.c **** "waitcnt r2, #0\n\t"
  21:activity board.c **** "brs #my_label\n\t"
  22:activity board.c **** );
  23:activity board.c ****
  24:activity board.c **** return 0;
  25:activity board.c **** }
  28                    .loc 1 25 0
  29 003c 0000FCA0      mov r0, #0
  30 0040 0000BCA0      mov pc,lr
  31                .LFE0
  59                .Letext0

DEFINED SYMBOLS
C:\Users\Daniel\AppData\Local\Temp\cc7JVFNP.s:5   .text:00000000 _main
C:\Users\Daniel\AppData\Local\Temp\cc7JVFNP.s:9   .text:00000000 L0
C:\Users\Daniel\AppData\Local\Temp\cc7JVFNP.s:15  .text:00000010 my_label
  .debug_frame:00000000 .Lframe0
  .text:00000000 .LFB0
  .debug_abbrev:00000000 .Ldebug_abbrev0
  .text:00000000 .Ltext0
  .text:00000044 .Letext0
  .debug_line:00000000 .Ldebug_line0
  .text:00000044 .LFE0
  .debug_info:00000000 .Ldebug_info0

UNDEFINED SYMBOLS
r1
dira
outa
r2
cnt
pc
r0
lr

With pruning, lights on quickstart don't turn on or blink (not really sure what it's doing then):
GNU assembler version 2.21 (propeller-elf)
  using BFD version (propellergcc_v1_0_0_2090) 2.21.
options passed : -lmm -ahdlnsg=lmm/activity board.asm
input file     : C:\Users\Daniel\AppData\Local\Temp\cc7JVFNP.s
output file    : lmm/activity board.o
target         : propeller-parallax-elf
time stamp     :

   1                    .text
   2                .Ltext0
   3                    .balign 4
   4                    .global _main
   5                _main
   6                .LFB0
   7                    .file 1 "activity board.c"
   1:activity board.c **** #include <propeller.h>
   2:activity board.c ****
   3:activity board.c **** int main(void)
   4:activity board.c **** {
   8                    .loc 1 4 0
   5:activity board.c **** __asm__ volatile (
   9                    .loc 1 5 0
  10                ' 5 "activity board.c" 1
  11 0000 FF00FCA0      mov r1, #0xff
  12 0004 1000FC2C      shl r1, #16
  13 0008 0000BC68      or dira, r1
  14 000c 0000BC68      or outa, r1
  15 0010 0000FC08      my_label: rdlong r2, #0
  16 0014 0000BC80      add r2, cnt
  17 0018 0000BC68      or outa, r1
  18 001c 0000FC08      rdlong r2, #0
  19 0020 0000BC80      add r2, cnt
  20 0024 0000FCF8      waitcnt r2, #0
  21 0028 0000BC64      andn outa, r1
  22 002c 0000FC08      rdlong r2, #0
  23 0030 0000BC80      add r2, cnt
  24 0034 0000FCF8      waitcnt r2, #0
  25 0038 2C00FC84      brs #my_label
  26
  27                ' 0 "" 2
   6:activity board.c **** "mov r1, #0xff\n\t"
   7:activity board.c **** "shl r1, #16\n\t"
   8:activity board.c **** "or dira, r1\n\t"
   9:activity board.c **** "or outa, r1\n\t"
  10:activity board.c **** "my_label: "
  11:activity board.c **** "rdlong r2, #0\n\t"
  12:activity board.c **** "add r2, cnt\n\t"
  13:activity board.c **** "or outa, r1\n\t"
  14:activity board.c **** "rdlong r2, #0\n\t"
  15:activity board.c **** "add r2, cnt\n\t"
  16:activity board.c **** "waitcnt r2, #0\n\t"
  17:activity board.c **** "andn outa, r1\n\t"
  18:activity board.c **** "rdlong r2, #0\n\t"
  19:activity board.c **** "add r2, cnt\n\t"
  20:activity board.c **** "waitcnt r2, #0\n\t"
  21:activity board.c **** "brs #my_label\n\t"
  22:activity board.c **** );
  23:activity board.c ****
  24:activity board.c **** return 0;
  25:activity board.c **** }
  28                    .loc 1 25 0
  29 003c 0000FCA0      mov r0, #0
  30 0040 0000BCA0      mov pc,lr
  31                .LFE0
  59                .Letext0

DEFINED SYMBOLS
C:\Users\Daniel\AppData\Local\Temp\cc7JVFNP.s:5   .text:00000000 _main
C:\Users\Daniel\AppData\Local\Temp\cc7JVFNP.s:9   .text:00000000 L0
C:\Users\Daniel\AppData\Local\Temp\cc7JVFNP.s:15  .text:00000010 my_label
  .debug_frame:00000000 .Lframe0
  .text:00000000 .LFB0
  .debug_abbrev:00000000 .Ldebug_abbrev0
  .text:00000000 .Ltext0
  .text:00000044 .Letext0
  .debug_line:00000000 .Ldebug_line0
  .text:00000044 .LFE0
  .debug_info:00000000 .Ldebug_info0

UNDEFINED SYMBOLS
r1
dira
outa
r2
cnt
pc
r0
lr

I hope this helps.
Daniel
Are you running this as a program saved to the EEPROM?
We had some trouble with pruning and EEPROM that was fixed in version release_1_0_2097. I have a new build, but we are still testing with it.
Thanks
Daniel
I'll post a pointer to the latest package after more testing.
Thanks
Daniel