Inline asm not working as expected — Parallax Forums

Inline asm not working as expected

danielstritt Posts: 43
edited 2013-11-02 10:45 in Propeller 1
Ok, I am learning PASM as inline assembly in my C programs. Here is the entire program; it is supposed to flash the QuickStart LEDs every second, but it doesn't:
#include <propeller.h>
#include "simpletools.h"

int main(void)
{
    //pause(1000);

    __asm__ volatile (
        "mov r1, #0xff\n\t"
        "shl r1, #16\n\t"
        "or  dira, r1\n\t"
        "or  outa, r1\n\t"
        "my_label: "
        "rdlong r2, #0\n\t"
        "add r2, cnt\n\t"
        "or outa, r1\n\t"
        "rdlong r2, #0\n\t"
        "add r2, cnt\n\t"
        "waitcnt r2, #0\n\t"
        "andn outa, r1\n\t"
        "rdlong r2, #0\n\t"
        "add r2, cnt\n\t"
        "waitcnt r2, #0\n\t"
        "brs #my_label\n\t"
    );

    return 0;
}

Now, if I uncomment "pause(1000)", it runs just fine on my QuickStart. If I leave it commented out, it does not flash the lights as expected. I understand that a brief pause is required before using statements such as "printf", but is it also needed before ASM statements? I'm confused.
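
As a reading aid, here is a host-side C sketch of the two values the inline assembly computes: the pin mask built by `mov`/`shl`, and the `waitcnt` target formed by adding the long read from hub address 0 to `cnt`. It assumes hub long 0 holds the system clock frequency and that the QuickStart LEDs sit on P16..P23. The function names `led_mask` and `wait_target` are illustrative and not part of any library.

```c
#include <assert.h>
#include <stdint.h>

/* Mask built by "mov r1, #0xff" followed by "shl r1, #16":
 * eight consecutive pins starting at P16. */
static uint32_t led_mask(void)
{
    uint32_t r1 = 0xFFu;
    r1 <<= 16;
    return r1;
}

/* Target for waitcnt: the current counter value plus one second's worth
 * of clock ticks. The counter is 32 bits wide, so the sum wraps around. */
static uint32_t wait_target(uint32_t cnt, uint32_t clkfreq)
{
    assert(clkfreq != 0);   /* a zero frequency would request no delay */
    return cnt + clkfreq;   /* unsigned arithmetic wraps modulo 2^32 */
}
```

For example, with an 80 MHz clock, `wait_target(cnt, 80000000)` is the counter value one second ahead, even when the addition wraps past the top of the 32-bit range.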

Thanks

Daniel

Comments

  • jazzed Posts: 11,803
    edited 2013-10-26 14:09
    I could be wrong, but I don't think that branch or jump is supported in Propeller-GCC inline ASM.

    If you need that kind of control, you should launch assembly in a COG.
  • SRLM Posts: 5,045
    edited 2013-10-26 14:14
    What's the actual assembly output of your program?

    It might be getting put into fcache, and that might break your code. If it is, you'll be able to see that happening in the assembly output.

    Here is some code that intentionally goes into the fcache:
        /** Output a byte on the I2C bus.
         *
         * @param   byte the 8 bits to send on the bus.
         * @returns true if the device acknowledges, false otherwise.
         */
        bool SendByte(const unsigned char byte) {
            int result;
    
            int datamask, nextCNT, temp;
    
            __asm__ volatile(
                    "         fcache #(PutByteEnd - PutByteStart)\n\t"
                    "         .compress off                  \n\t"
                    /* Setup for transmit loop */
                    "PutByteStart: "
                    "         mov %[datamask], #256          \n\t" /* 0x100 */
                    "         mov %[result],   #0            \n\t"
                    "         mov %[nextCNT],  cnt           \n\t"
                    "         add %[nextCNT],  %[clockDelay] \n\t"
    
                    /* Transmit Loop (8x) */
                    //Output bit of byte
                    "PutByteLoop: "
                    "         shr  %[datamask], #1                \n\t" // Set up mask
                    "         and  %[datamask], %[databyte] wz,nr \n\t" // Move the bit into Z flag
                    "         muxz dira,        %[SDAMask]        \n\t"
    
                    //Pulse clock
                    "         waitcnt %[nextCNT], %[clockDelay] \n\t"
                    "         andn    dira,       %[SCLMask]    \n\t" // Set SCL high
                    "         waitcnt %[nextCNT], %[clockDelay] \n\t"
                    "         or      dira,       %[SCLMask]    \n\t" // Set SCL low
    
                    //Return for more bits
                    "         djnz %[datamask], #__LMM_FCACHE_START+(PutByteLoop-PutByteStart) nr \n\t"
    
                    // Get ACK
                    "         andn    dira,       %[SDAMask]    \n\t" // Float SDA high (release SDA)
                    "         waitcnt %[nextCNT], %[clockDelay] \n\t"
                    "         andn    dira,       %[SCLMask]    \n\t" // SCL high (by float)
                    "         waitcnt %[nextCNT], %[clockDelay] \n\t"
                    "         mov     %[temp],    ina           \n\t" //Sample input
                    "         and     %[SDAMask], %[temp] wz,nr \n\t" // If != 0, ack'd, else nack
                    "         muxz    %[result],  #1            \n\t" // Set result to equal to Z flag (aka, 1 if ack'd)
                    "         or      dira,       %[SCLMask]    \n\t" // Set scl low
                    "         or      dira,       %[SDAMask]    \n\t" // Set sda low 
                    "         jmp     __LMM_RET                 \n\t"
                    "PutByteEnd: "
                    "         .compress default                 \n\t"
                    : // Outputs
                    [datamask] "=&r" (datamask),
                    [result] "=&r" (result),
                    [nextCNT] "=&r" (nextCNT),
                    [temp] "=&r" (temp)
                    : // Inputs
                    [SDAMask] "r" (sda_mask_),
                    [SCLMask] "r" (scl_mask_),
                    [databyte] "r" (byte),
                    [clockDelay] "r" (clock_delay_)
                    );
    
            return result;
        }
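
To make the transmit loop above easier to follow, the sketch below models it on a host machine. It is only an illustration: `sda_levels_for_byte` is a hypothetical helper, and it records the logical SDA level for each bit rather than touching any hardware. It mirrors two details of the assembly: the mask starts at 256 and is shifted right before each test, so bits go out most significant first, and a 0 bit is sent by driving the line low (setting the `dira` bit) while a 1 bit is sent by releasing the line. It assumes the `outa` bit for SDA is 0 and that the bus has pull-ups.

```c
#include <assert.h>
#include <stdint.h>

/* Fill levels[0..7] with the logical SDA level for each clock pulse,
 * in the order the bits are transmitted. */
static void sda_levels_for_byte(uint8_t byte, int levels[8])
{
    uint32_t datamask = 256;            /* mov datamask, #256 */
    int i = 0;

    do {
        datamask >>= 1;                 /* shr datamask, #1 */
        int z = (datamask & byte) == 0; /* and ... wz,nr: Z is set for a 0 bit */
        int dira_bit = z;               /* muxz dira, SDAMask */
        assert(i < 8);
        levels[i++] = !dira_bit;        /* driven means low, released means high */
    } while (datamask - 1 != 0);        /* djnz ... nr: test only, mask not written */
}
```

The loop runs exactly eight times because the `nr` suffix on `djnz` keeps the decremented value from being written back, so the mask itself acts as the loop counter.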
    
  • SRLM Posts: 5,045
    edited 2013-10-26 14:17
    jazzed wrote: »
    If you need that kind of control, you should launch assembly in a COG.

    I disagree a bit on that point. My example case? An I2C driver, where all realtime happenings work at the command of the master. Most of the Propeller objects have an unfortunate flaw: you must use two cogs to do one thing. One cog for the realtime, and one cog for the processing. With PropellerGCC, in certain circumstances, you can merge these functions into a single cog. I2C is a good example: you want the speed and precision, but you don't need it all the time. In between you want the ease of programming that C offers. The above function shows a bit of that. Anyway, that's my opinion.
  • jazzed Posts: 11,803
    edited 2013-10-26 15:04
    The level of control I was referring to was "brs #my_label" ... I don't think that is a valid thing to do with propeller-gcc inline ASM - it "might work out of coincidence".

    Doing a JMP to __LMM_RET is different because it is a predefined symbol for the function return address in LMM mode.

    I use in-line ASM in the 115200 bps simple serial driver in libsimpletext and other drivers for one-COG solutions. Such things are very valuable. I wouldn't use any tricks in this kind of code though.
    __attribute__((fcache)) static void _outbyte(int bitcycles, int txmask, int value)
    {
      int j = 10;
      int waitcycles;
    
    
      waitcycles = CNT + bitcycles;
      while(j-- > 0) {
        /* C code is too big and not fast enough for all memory models.
        // waitcycles = waitcnt2(waitcycles, bitcycles); */
        __asm__ volatile("waitcnt %[_waitcycles], %[_bitcycles]"
                         : [_waitcycles] "+r" (waitcycles)
                         : [_bitcycles] "r" (bitcycles));
    
    
        /* if (value & 1) OUTA |= txmask else OUTA &= ~txmask; value = value >> 1; */
        __asm__ volatile("shr %[_value],#1 wc \n\t"
                         "muxc outa, %[_mask]"
                         : [_value] "+r" (value)
                         : [_mask] "r" (txmask));
      }
    }
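
The function above shifts out ten bits, least significant first, one per `bitcycles` ticks. The host-side sketch below illustrates what a caller would need to supply. The framing expression and the helper names (`bit_cycles`, `frame_byte`, `line_levels`) are assumptions made for illustration and are not taken from the libsimpletext source; the sketch assumes a conventional 8N1 frame with a low start bit and a high stop bit.

```c
#include <assert.h>
#include <stdint.h>

/* Clock ticks per serial bit, e.g. 80 MHz / 115200 bps = 694 ticks. */
static uint32_t bit_cycles(uint32_t clkfreq, uint32_t baud)
{
    assert(baud != 0);
    return clkfreq / baud;
}

/* Build a 10-bit frame: start bit (0) in bit 0, data in bits 1..8,
 * stop bit (1) in bit 9. */
static uint32_t frame_byte(uint8_t c)
{
    return ((uint32_t)c | 0x100u) << 1;
}

/* Model the loop body: "shr value, #1 wc" moves bit 0 into carry,
 * and "muxc outa, mask" copies carry to the TX pin. */
static void line_levels(uint32_t value, int levels[10])
{
    for (int j = 0; j < 10; j++) {
        levels[j] = (int)(value & 1u);  /* carry after the shift */
        value >>= 1;
    }
}
```

At 694 ticks per bit there is little room for a C loop body in the slower memory models, which is consistent with the comment in the function about the C version being too big and not fast enough.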
    
    
    
    
  • danielstritt Posts: 43
    edited 2013-10-26 15:17
    Hmm, after fiddling around with the source, I decided to check the options set in SIDE. I always enable "All warnings" and "Enable pruning"; this time I had also enabled "32-bit doubles". I discovered that with "pause(1000)" commented out and "Enable pruning" disabled, it runs just fine again. I don't see the connection to "pause", and unless the __asm__ statement got "pruned" out, I don't understand that either. I thought pruning just removed unused functions, unreachable code, and maybe even unused variables.

    In case it matters, optimizer was off, and memory model was LMM.

    I guess my problem is solved. I will just have to read more of the PropGCC docs to understand the options better.

    Thanks

    Daniel
  • David Betz Posts: 14,516
    edited 2013-10-27 04:37
    jazzed wrote: »
    The level of control I was referring to was "brs #my_label" ... I don't think that is a valid thing to do with propeller-gcc inline ASM - it "might work out of coincidence".
    I'm pretty sure that BRS is fine in inline assembly, since it just compiles to an ADD or SUB of a constant to the PC.
  • ersmith Posts: 6,054
    edited 2013-10-28 08:24
    David is correct, "brs #my_label" is OK, it will be translated into an appropriate instruction based on the memory model selected (JMP would not work in LMM, but in that case "brs" becomes "ADD PC, #X").
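
The assembler listing posted later in this thread is consistent with this: the `brs #my_label` at offset 0x38 is emitted as the bytes `2C00FC84`, and `my_label` sits at offset 0x10. The sketch below decodes that word on a host machine and recomputes the branch target. It assumes the standard Propeller 1 instruction layout (opcode in bits 31..26, immediate flag in bit 22, source in bits 8..0) and that the LMM kernel has already advanced `pc` past the instruction by the time it executes. The type and function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a Propeller 1 instruction word used in this example. */
typedef struct {
    unsigned opcode;    /* bits 31..26 */
    unsigned immediate; /* bit 22 (the I flag) */
    unsigned src;       /* bits 8..0 */
} p1_fields;

/* The listing prints instruction bytes in memory order (little-endian). */
static uint32_t word_from_listing_bytes(uint8_t b0, uint8_t b1,
                                        uint8_t b2, uint8_t b3)
{
    return (uint32_t)b0 | ((uint32_t)b1 << 8) |
           ((uint32_t)b2 << 16) | ((uint32_t)b3 << 24);
}

static p1_fields decode(uint32_t word)
{
    p1_fields f;
    f.opcode = (word >> 26) & 0x3Fu;
    f.immediate = (word >> 22) & 1u;
    f.src = word & 0x1FFu;
    return f;
}

/* Target of "sub pc, #src" located at instr_addr, assuming pc already
 * points at the next instruction when the SUB runs. */
static uint32_t sub_pc_target(uint32_t instr_addr, uint32_t src)
{
    assert(instr_addr + 4u >= src);
    return instr_addr + 4u - src;
}
```

Decoding `2C 00 FC 84` this way gives the SUB opcode with an immediate source of 0x2C, and 0x38 + 4 - 0x2C lands on 0x10, the address of `my_label`.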

    It sounds like the original problem was the "Enable pruning" option. My guess is that without the pause() call, the compiler thought nothing was happening in main and removed the whole program.
  • jac_goudsmit Posts: 418
    edited 2013-10-28 08:44
    ersmith wrote: »
    It sounds like the original problem was the "Enable pruning" option. My guess is that without the pause() call, the compiler thought nothing was happening in main and removed the whole program.

    Shouldn't the __volatile__ in "__asm__ __volatile__" prevent the inline assembly sequence from getting pruned?

    ===Jac
  • ersmith Posts: 6,054
    edited 2013-10-31 11:26
    jac_goudsmit wrote: »
    Shouldn't the __volatile__ in "__asm__ __volatile__" prevent the inline assembly sequence from getting pruned?

    ===Jac

    That will protect it from the compiler, but perhaps not the linker. That's the only explanation I can think of for his described behavior. I think turning off linker pruning is the easiest work-around.
  • jac_goudsmit Posts: 418
    edited 2013-11-01 10:18
    ersmith wrote: »
    That will protect it from the compiler, but perhaps not the linker. That's the only explanation I can think of for his described behavior. I think turning off linker pruning is the easiest work-around.

    As far as I understand, the linker works at the function level, and uses references to determine which functions are needed; unreferenced functions (functions that aren't called by any other function) are pruned. The main function should never get pruned by the linker because it gets called by the C runtime startup code which is at the entry point into the executable file. In other words, I'm still not sure if pruning is the reason that the code doesn't work.
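
The reference-based pruning described above can be sketched as a reachability walk over a call graph. This is a toy model on a host machine, not how the PropGCC linker is actually implemented; the `func` structure, `mark_reachable`, and the idea of a single entry function that references `main` are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_FUNCS 8

/* A toy "object file": each function lists the functions it references. */
typedef struct {
    const char *name;
    int refs[MAX_FUNCS];   /* indices of referenced functions */
    int nrefs;
} func;

/* Mark every function reachable from `entry`. Anything left unmarked is
 * what a reference-based pruner would discard. */
static void mark_reachable(const func *funcs, int nfuncs, int entry, bool *keep)
{
    assert(entry >= 0 && entry < nfuncs);
    if (keep[entry])
        return;            /* already visited */
    keep[entry] = true;
    for (int i = 0; i < funcs[entry].nrefs; i++)
        mark_reachable(funcs, nfuncs, funcs[entry].refs[i], keep);
}
```

In this model, a startup function that references `main` keeps `main` and everything it calls, while a helper that nothing references stays unmarked. Under that assumption, pruning alone would not remove `main`.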

    Maybe danielstritt can post the assembly output of the failing code?

    ===Jac
  • danielstritt Posts: 43
    edited 2013-11-01 12:09
    I generated assembly listings for both builds; here are the results:

    No pruning, works as expected:
     GNU assembler version 2.21 (propeller-elf)
         using BFD version (propellergcc_v1_0_0_2090) 2.21.
     options passed    : -lmm -ahdlnsg=lmm/activity board.asm 
     input file        : C:\Users\Daniel\AppData\Local\Temp\cc7JVFNP.s
     output file       : lmm/activity board.o
     target            : propeller-parallax-elf
     time stamp        : 
    
       1                      .text
       2                  .Ltext0
       3                      .balign    4
       4                      .global    _main
       5                  _main
       6                  .LFB0
       7                      .file 1 "activity board.c"
       1:activity board.c **** #include <propeller.h>
       2:activity board.c **** 
       3:activity board.c **** int main(void)
       4:activity board.c **** {
       8                      .loc 1 4 0
       5:activity board.c ****     __asm__ volatile (
       9                      .loc 1 5 0
      10                  ' 5 "activity board.c" 1
      11 0000 FF00FCA0         mov r1, #0xff
      12 0004 1000FC2C         shl r1, #16
      13 0008 0000BC68         or  dira, r1
      14 000c 0000BC68         or  outa, r1
      15 0010 0000FC08         my_label: rdlong r2, #0
      16 0014 0000BC80         add r2, cnt
      17 0018 0000BC68         or outa, r1
      18 001c 0000FC08         rdlong r2, #0
      19 0020 0000BC80         add r2, cnt
      20 0024 0000FCF8         waitcnt r2, #0
      21 0028 0000BC64         andn outa, r1
      22 002c 0000FC08         rdlong r2, #0
      23 0030 0000BC80         add r2, cnt
      24 0034 0000FCF8         waitcnt r2, #0
      25 0038 2C00FC84         brs #my_label
      26                      
      27                  ' 0 "" 2
       6:activity board.c ****          "mov r1, #0xff\n\t"
       7:activity board.c ****          "shl r1, #16\n\t"
       8:activity board.c ****          "or  dira, r1\n\t"
       9:activity board.c ****          "or  outa, r1\n\t"
      10:activity board.c ****          "my_label: "
      11:activity board.c ****          "rdlong r2, #0\n\t"
      12:activity board.c ****          "add r2, cnt\n\t"
      13:activity board.c ****          "or outa, r1\n\t"
      14:activity board.c ****          "rdlong r2, #0\n\t"
      15:activity board.c ****          "add r2, cnt\n\t"
      16:activity board.c ****          "waitcnt r2, #0\n\t"
      17:activity board.c ****          "andn outa, r1\n\t"
      18:activity board.c ****          "rdlong r2, #0\n\t"
      19:activity board.c ****          "add r2, cnt\n\t"
      20:activity board.c ****          "waitcnt r2, #0\n\t"
      21:activity board.c ****          "brs #my_label\n\t"
      22:activity board.c ****     );
      23:activity board.c **** 
      24:activity board.c ****     return 0;
      25:activity board.c **** }
      28                      .loc 1 25 0
      29 003c 0000FCA0         mov    r0, #0
      30 0040 0000BCA0         mov    pc,lr
      31                  .LFE0
      59                  .Letext0
    DEFINED SYMBOLS
    C:\Users\Daniel\AppData\Local\Temp\cc7JVFNP.s:5      .text:00000000 _main
    C:\Users\Daniel\AppData\Local\Temp\cc7JVFNP.s:9      .text:00000000 L0
    C:\Users\Daniel\AppData\Local\Temp\cc7JVFNP.s:15     .text:00000010 my_label
                         .debug_frame:00000000 .Lframe0
                                .text:00000000 .LFB0
                        .debug_abbrev:00000000 .Ldebug_abbrev0
                                .text:00000000 .Ltext0
                                .text:00000044 .Letext0
                          .debug_line:00000000 .Ldebug_line0
                                .text:00000044 .LFE0
                          .debug_info:00000000 .Ldebug_info0
    
    UNDEFINED SYMBOLS
    r1
    dira
    outa
    r2
    cnt
    pc
    r0
    lr
    
    
    

    With pruning, the LEDs on the QuickStart don't turn on or blink (I'm not really sure what it's doing then):
     GNU assembler version 2.21 (propeller-elf)
         using BFD version (propellergcc_v1_0_0_2090) 2.21.
     options passed    : -lmm -ahdlnsg=lmm/activity board.asm 
     input file        : C:\Users\Daniel\AppData\Local\Temp\cc7JVFNP.s
     output file       : lmm/activity board.o
     target            : propeller-parallax-elf
     time stamp        : 
    
       1                      .text
       2                  .Ltext0
       3                      .balign    4
       4                      .global    _main
       5                  _main
       6                  .LFB0
       7                      .file 1 "activity board.c"
       1:activity board.c **** #include <propeller.h>
       2:activity board.c **** 
       3:activity board.c **** int main(void)
       4:activity board.c **** {
       8                      .loc 1 4 0
       5:activity board.c ****     __asm__ volatile (
       9                      .loc 1 5 0
      10                  ' 5 "activity board.c" 1
      11 0000 FF00FCA0         mov r1, #0xff
      12 0004 1000FC2C         shl r1, #16
      13 0008 0000BC68         or  dira, r1
      14 000c 0000BC68         or  outa, r1
      15 0010 0000FC08         my_label: rdlong r2, #0
      16 0014 0000BC80         add r2, cnt
      17 0018 0000BC68         or outa, r1
      18 001c 0000FC08         rdlong r2, #0
      19 0020 0000BC80         add r2, cnt
      20 0024 0000FCF8         waitcnt r2, #0
      21 0028 0000BC64         andn outa, r1
      22 002c 0000FC08         rdlong r2, #0
      23 0030 0000BC80         add r2, cnt
      24 0034 0000FCF8         waitcnt r2, #0
      25 0038 2C00FC84         brs #my_label
      26                      
      27                  ' 0 "" 2
       6:activity board.c ****          "mov r1, #0xff\n\t"
       7:activity board.c ****          "shl r1, #16\n\t"
       8:activity board.c ****          "or  dira, r1\n\t"
       9:activity board.c ****          "or  outa, r1\n\t"
      10:activity board.c ****          "my_label: "
      11:activity board.c ****          "rdlong r2, #0\n\t"
      12:activity board.c ****          "add r2, cnt\n\t"
      13:activity board.c ****          "or outa, r1\n\t"
      14:activity board.c ****          "rdlong r2, #0\n\t"
      15:activity board.c ****          "add r2, cnt\n\t"
      16:activity board.c ****          "waitcnt r2, #0\n\t"
      17:activity board.c ****          "andn outa, r1\n\t"
      18:activity board.c ****          "rdlong r2, #0\n\t"
      19:activity board.c ****          "add r2, cnt\n\t"
      20:activity board.c ****          "waitcnt r2, #0\n\t"
      21:activity board.c ****          "brs #my_label\n\t"
      22:activity board.c ****     );
      23:activity board.c **** 
      24:activity board.c ****     return 0;
      25:activity board.c **** }
      28                      .loc 1 25 0
      29 003c 0000FCA0         mov    r0, #0
      30 0040 0000BCA0         mov    pc,lr
      31                  .LFE0
      59                  .Letext0
    DEFINED SYMBOLS
    C:\Users\Daniel\AppData\Local\Temp\cc7JVFNP.s:5      .text:00000000 _main
    C:\Users\Daniel\AppData\Local\Temp\cc7JVFNP.s:9      .text:00000000 L0
    C:\Users\Daniel\AppData\Local\Temp\cc7JVFNP.s:15     .text:00000010 my_label
                         .debug_frame:00000000 .Lframe0
                                .text:00000000 .LFB0
                        .debug_abbrev:00000000 .Ldebug_abbrev0
                                .text:00000000 .Ltext0
                                .text:00000044 .Letext0
                          .debug_line:00000000 .Ldebug_line0
                                .text:00000044 .LFE0
                          .debug_info:00000000 .Ldebug_info0
    
    UNDEFINED SYMBOLS
    r1
    dira
    outa
    r2
    cnt
    pc
    r0
    lr
    

    I hope this helps.

    Daniel
  • jazzed Posts: 11,803
    edited 2013-11-01 12:54
    Hi,

    Are you running this as a program saved to the EEPROM?

    We had some trouble with pruning and EEPROM that was fixed in version release_1_0_2097. I have a new build, but we are still testing with it.
  • danielstritt Posts: 43
    edited 2013-11-01 14:05
    Actually, I do almost all my testing in RAM only. I just tried writing the program to EEPROM and saw no difference from running in RAM, with either pruning setting. If I can do anything to help, let me know.

    Thanks

    Daniel
  • jazzed Posts: 11,803
    edited 2013-11-01 14:30
    I've tested your original code with the latest propeller-gcc release_1_0_2162 build, and I don't see any trouble.
    I'll post a pointer to the latest package after more testing.
  • danielstritt Posts: 43
    edited 2013-11-02 10:45
    Thanks. It's nice to know that the problem might not have been my fault after all :)

    Thanks

    Daniel