Compiling LLVM for P2 on Windows (it works!) - Page 2 — Parallax Forums



Comments

  • rogloh Posts: 6,331
    edited 2026-04-15 13:45

    I just dumped the symbols, and wow, yours includes a lot of extra stuff. I see you are setting the clock but not much more than that vs my example. The clock seems to have sucked in a lot of extra string functionality as well as time functions. I think you may well have used different compiler command line settings. This is what I used to build my elf file.

    clang -I. -I../.. -Ibuild -Wall -Werror --target=p2 -fno-jump-tables -c -fdata-sections -ffunction-sections -o main.o main.c
    clang -v --target=p2 -Wl,--gc-sections -Wl,-L/Users/roger/Applications/p2llvm/libc/lib -Wl,-L/Users/roger/Applications/p2llvm/libp2/lib -o main.elf main.o

    EDIT: yep after using this on your source I get the symbols in out2.txt which is a smaller image vs your original .elf file dumped in out.txt. It might be the --gc-sections option doing this. EDIT2: yes it is that option which makes the difference.

  • Rayman Posts: 16,308
    edited 2026-04-15 18:13

    Last time went through this, was told that the .elf doesn't really reflect the size of the actual binary...

    Seems you have to convert the .elf to a .bin to see exactly how big it is on chip?

  • Rayman Posts: 16,308

    Used web search to figure out how to convert .elf to .bin

  • Rayman Posts: 16,308

    Built it the way from post #32 above and the size is smaller...

  • Rayman Posts: 16,308

    @rogloh Guess I can look at what you posted in other thread, but...

    Are the changes you made just to the lib files? Or, to the main LLVM files too?

  • 32k for a hello world is still ridiculous.
    I always find it annoying that the typical compiler->linker arrangement is unable to really provide readable listings that actually correspond to the output binary. Very hard to figure out what's actually in your code.

    (don't read this post as me being all negative!)

  • Rayman Posts: 16,308

    @Wuerfel_21 code is posted above…

    Would be interesting to compare binary size with Flexprop…

  • rogloh Posts: 6,331
    edited 2026-04-15 23:36

    @Rayman said:
    @rogloh Guess I can look at what you posted in other thread, but...

    Are the changes you made just to the lib files? Or, to the main LLVM files too?

    Details are in the other thread so best to read it. But yes, it has changes to the libs (which is what I provided you), and also to the LLVM source, which needs fixes for the C modulus "%" operation (otherwise it crashes LLVM) and other CORDIC dependency fixes etc. You really should take those file changes and update your own LLVM build to resolve these. Also, there are still cases where disassembling random bytes, say with llvm-objdump -D, will crash the tool because the disassembler doesn't know about ALL P2 instructions yet. But if you disassemble genuine P2 compiled C code with llvm-objdump -d it's fine.

    @Wuerfel_21 said:
    32k for a hello world is still ridiculous.

    Often the case with C with libs included.

    I always find it annoying that the typical compiler->linker arrangement is unable to really provide readable listings that actually correspond to the output binary. Very hard to figure out what's actually in your code.

    I find in this case that objdump -d is generally okay. I get a good disassembly listing of all P2 code in the binary. But the symbols for global data accesses don't show up in the code; you just get absolute read/write hex addresses, which is not nice. You only see data symbols and their addresses in the symbol table. They really should cross reference them back in the disassembly listing IMO, or maybe that's just not implemented in the P2 port right now. Also it'd be real nice to have a way in the listing to see which C function arguments are being accessed in the stack frame, or which initial registers they get copied into, by somehow referring them back to the source code. Bit tricky if they don't have a way to carry it through in the .elf file. From memory I think enabling debugging helps pass more info down through the intermediate files.

    main.elf:       file format elf32-p2
    
    Disassembly of section .text:
    
    00000000 <__entry>:
           0: f8 a1 03 fb                    rdlong r0, ptra        
           4: 10 00 80 fd                    jmp #\16
                    ...
    
    00000040 <__start0>:
          40: f8 a1 03 fb                    rdlong r0, ptra        
          44: 98 00 00 ff                    augs #152
          48: 28 a1 07 f6                    mov r0, #296   
          4c: d0 a1 03 fb                    rdlong r0, r0  
          50: 02 a0 97 fb                    tjz r0, #2
          54: 00 00 90 ff                    augd #1048576
          58: 00 fe 65 fd                    hubset #255
          5c: 00 00 00 ff                    augs #0
          60: 68 a0 07 f6                    mov r0, #104   
          64: d0 01 e8 fc                    coginit #0, r0 
    
    00000068 <__start>:
          68: f8 a1 03 fb                    rdlong r0, ptra        
          6c: 98 00 00 ff                    augs #152
          70: 38 a5 07 f6                    mov r2, #312   
          74: 29 fe 67 fd                    setq2 #511
          78: 01 00 00 ff                    augs #1
          7c: 00 00 04 fb                    rdlong $0x000, #0      
    
  • Rayman Posts: 16,308

    Guess it’d be interesting to use spin2cpp on something and see if clang can compile it….

  • In the other thread it was noted that the elf files were bloated because they clear all of memory and not just the code.

    I don't remember if there was an option to not do that or I built a loader that removed that.

    Mike

  • Rayman Posts: 16,308

    Think I have a minimum build folder (at least compiles hello.c).

    To make .o from .c:
    clang -I. -I./sys -Ibuild -Wall -Werror --target=p2 -fno-jump-tables -c -fdata-sections -ffunction-sections -o hello.o hello.c

    To make .elf from .o
    clang -v --target=p2 -Wl,--gc-sections -Wl,-L./ -Wl,-L./ -o hello.elf hello.o

    To make .bin from .elf:
    llvm-objcopy -O binary hello.elf hello.bin

  • Rayman Posts: 16,308

    Guess forgot that made webpage for Clang a long time ago...

    https://www.rayslogic.com/Propeller2/Clang.htm

    I've uploaded a minimum build folder that can compile as in post #42.
    But, still need to add the fixes from @rogloh , so not really ready yet...

  • Wuerfel_21 Posts: 5,839
    edited 2026-04-16 15:55

    @Rayman said:
    @Wuerfel_21 code is posted above…

    Would be interesting to compare binary size with Flexprop…

    Ahh, we're doing printf, which is a typical bloat landmine. Most likely pulling in a bunch of floating point support along the way. Though IIRC P1 GCC could do printf with float support and not run totally out of RAM.

    FlexC has multiple levels of mitigation for this problem, so beating it is hard:
    - printf (though not sprintf or other variants) is treated as a builtin (__builtin_printf) and the compiler scans the format string and uses simpler functions to accomplish the same job if possible
    - If the user program doesn't use floats, float support in the library is automatically disabled
    - The actual library formatting implementation is pretty lean overall.

    (It used to be possible to reduce bloat related to file descriptors etc if you just want to print to the console, but this got busted)

    The fairer comparison would be to use puts, I guess.


    Most direct equivalent code to your LLVM example:

    enum {
        _CLKFREQ = 200000000
    };
    #include "propeller.h"
    #include "stdio.h"
    int main() {
        _setbaud(115200 * 2);
        printf("Hello World!\n");
        while(1) {
            _waitx(CLKFREQ);
        }
    }
    

    Comes out to 7352 bytes.

    If we force the real formatting implementation by using fprintf, which has no builtin processing:

    enum {
        _CLKFREQ = 200000000
    };
    #include "propeller.h"
    #include "stdio.h"
    int main() {
        _setbaud(115200 * 2);
        fprintf(stdout,"Hello World!\n");
        while(1) {
            _waitx(CLKFREQ);
        }
    }
    

    Comes out to 8588 bytes.

    If we also enable float support...

    enum {
        _CLKFREQ = 200000000
    };
    #include "propeller.h"
    #include "stdio.h"
    int main() {
        _setbaud(115200 * 2);
        float foo = 1.0;
        fprintf(stdout,"Hello World!\n");
        while(1) {
            _waitx(CLKFREQ);
        }
    }
    

    We get 13244 bytes

    Going the other way, if we don't include stdio (which will bog us down with a bunch of function pointers the compiler struggles to get rid of) and call the builtin directly:

    enum {
        _CLKFREQ = 200000000
    };
    #include "propeller.h"
    int main() {
        _setbaud(115200 * 2);
        __builtin_printf("Hello World!\n");
        while(1) {
            _waitx(CLKFREQ);
        }
    }
    

    We're down to 5768 bytes

    If the aforementioned simple IO feature wasn't busted... (does not work on current versions)

    enum {
        _CLKFREQ = 200000000
    };
    #define _SIMPLE_IO
    #pragma exportdef _SIMPLE_IO
    #include "propeller.h"
    int main() {
        _setbaud(115200 * 2);
        __builtin_printf("Hello World!\n");
        while(1) {
            _waitx(CLKFREQ);
        }
    }
    

    It would be 3040 bytes. Still a lot, but most of it is actually zero-padding that the compiler will generate no matter what.

    (EDIT: by adding -H 32 to the command line, some of it is saved and the size is exactly 2048 bytes - IDK why it's there by default)

    And for comparison, using puts instead of printf:

    enum {
        _CLKFREQ = 200000000
    };
    #include "propeller.h"
    #include "stdio.h"
    int main() {
        _setbaud(115200 * 2);
        puts("Hello World!\n");
        while(1) {
            _waitx(CLKFREQ);
        }
    }
    

    4688 bytes

    (all at default settings, -Os might make things slightly smaller but let's not)

    So that's definitely something that needs improving to make LLVM a good option for P2.


    @Wuerfel_21 said:
    32k for a hello world is still ridiculous.

    Often the case with C with libs included.

    I always find it annoying that the typical compiler->linker arrangement is unable to really provide readable listings that actually correspond to the output binary. Very hard to figure out what's actually in your code.

    I find in this case that objdump -d is generally okay.

    Didn't even know about that one... >.<
    Though it still ends up being an annotated disassembly of the already built program.

  • Rayman Posts: 16,308

    Hmm... If things can be under 32k, maybe can compile for P1 too somehow?
    That's probably pretty futile without XMM though, would guess...

  • Rayman Posts: 16,308

    Thought those old notes on compiling LLVM would give me some insight on how to compile the .a libraries. But, seems couldn't figure it out back then either :(

  • Rayman Posts: 16,308

    Ok, copied the @rogloh files from https://forums.parallax.com/discussion/169862/micropython-for-p2/p23 into a build folder and rebuilt.
    Copied over new files from llmvfixes.zip first. Seems should be ready to go.

    The p2llmv-fixes.zip looks to have stuff for building the .a libraries, but since was gifted those .a files from @rogloh (above) and can't build it anyway, skipping that.

    Do have a question about the clang*.exe files... They all have exactly the same file size. Thinking they are all actually the same file. Are they?
    7-Zip can compress them down as though they were one file, so think that is true...
