Compiling LLVM for P2 on Windows (it works!) - Page 2 — Parallax Forums


Comments

  • rogloh Posts: 6,337
    edited 2026-04-15 13:45

    I just dumped the symbols; wow, yours includes a lot of extra stuff. I see you are setting the clock but not much more than that vs. my example. The clock seems to have sucked in a lot of extra string functionality as well as time functions. I think you may well have used different compiler command line settings. This is what I used to build my ELF file:

    clang -I. -I../.. -Ibuild -Wall -Werror --target=p2 -fno-jump-tables -c -fdata-sections -ffunction-sections -o main.o main.c
    clang -v --target=p2 -Wl,--gc-sections -Wl,-L/Users/roger/Applications/p2llvm/libc/lib -Wl,-L/Users/roger/Applications/p2llvm/libp2/lib -o main.elf main.o

    EDIT: yep after using this on your source I get the symbols in out2.txt which is a smaller image vs your original .elf file dumped in out.txt. It might be the --gc-sections option doing this. EDIT2: yes it is that option which makes the difference.
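A small illustration of why that option matters, assuming the usual ELF behaviour where -ffunction-sections places each function in its own section and --gc-sections lets the linker discard sections that nothing references. The file and function names below are made up for this sketch; only the flags come from the commands above.

```c
/* gc_demo.c: illustrative only; all function names here are hypothetical. */
#include <stdint.h>

/* Referenced from main_work(), so the linker has to keep its section. */
uint32_t used_helper(uint32_t x)
{
    return x * 2u + 1u;
}

/* Never referenced. With -ffunction-sections it gets its own section
 * (typically .text.unused_helper), which -Wl,--gc-sections can drop. */
uint32_t unused_helper(uint32_t x)
{
    uint32_t acc = 0;
    for (uint32_t i = 0; i < x; i++)
        acc += i * i;
    return acc;
}

/* Stands in for whatever main() would call in a real program. */
uint32_t main_work(void)
{
    return used_helper(20u);
}
```

Built with the two clang commands above, unused_helper would be expected to vanish from the symbol dump, while without -Wl,--gc-sections it would stay in the image. The same should apply to unreferenced library functions, as long as the libraries were themselves compiled with per-function sections.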

  • Rayman Posts: 16,314
    edited 2026-04-15 18:13

    Last time I went through this, was told that the .elf doesn't really reflect the size of the actual binary...

    Seems you have to convert the .elf to a .bin to see exactly how big it is on chip?

  • Rayman Posts: 16,314

    Used web search to figure out how to convert .elf to .bin

  • Rayman Posts: 16,314

    Built it the way from post #32 above and size is smaller...

  • Rayman Posts: 16,314

    @rogloh Guess I can look at what you posted in other thread, but...

    Are the changes you made just to the lib files? Or, to the main LLVM files too?

  • Wuerfel_21 Posts: 5,839

    32k for a hello world is still ridiculous.
    I always find it annoying that the typical compiler->linker arrangement is unable to really provide readable listings that actually correspond to the output binary. Very hard to figure out what's actually in your code.

    (don't read this post as me being all negative!)

  • Rayman Posts: 16,314

    @Wuerfel_21 code is posted above…

    Would be interesting to compare binary size with Flexprop…

  • rogloh Posts: 6,337
    edited 2026-04-15 23:36

    @Rayman said:
    @rogloh Guess I can look at what you posted in other thread, but...

    Are the changes you made just to the lib files? Or, to the main LLVM files too?

    Details are in the other thread, so best to read it. But yes, it has changes to the libs (which is what I provided you), and also to the LLVM source, which needs fixes for the C modulus "%" operation (otherwise it crashes LLVM), and other CORDIC dependency fixes etc. You really should take those file changes and update your own LLVM build to resolve these. Also, there are still cases where disassembling random bytes, say with llvm-objdump -D, will crash the tool because the disassembler doesn't know about ALL P2 instructions yet. But if you disassemble genuine P2 compiled C code with llvm-objdump -d it's fine.

    @Wuerfel_21 said:
    32k for a hello world is still ridiculous.

    Often the case with C with libs included.

    I always find it annoying that the typical compiler->linker arrangement is unable to really provide readable listings that actually correspond to the output binary. Very hard to figure out what's actually in your code.

    I find in this case that objdump -d is generally okay. I get a good disassembly listing of all P2 code in the binary. But you don't see the symbols for global data accesses used in the code; it's just absolute read/write hex addresses, which is not nice. You only see data symbols and their addresses in the symbol table. They really should cross-reference them back in the disassembly listing IMO, or maybe that's just not implemented in the P2 port right now. Also it'd be real nice to have a way in the listing to see which C function arguments are being accessed in the stack frame, or which initial registers they get copied into, by somehow referring them back to the source code. Bit tricky if they don't have a way to carry it through in the .elf file. From memory I think enabling debugging helps pass more info down through the intermediate files.

    main.elf:       file format elf32-p2
    
    Disassembly of section .text:
    
    00000000 <__entry>:
           0: f8 a1 03 fb                    rdlong r0, ptra        
           4: 10 00 80 fd                    jmp #\16
                    ...
    
    00000040 <__start0>:
          40: f8 a1 03 fb                    rdlong r0, ptra        
          44: 98 00 00 ff                    augs #152
          48: 28 a1 07 f6                    mov r0, #296   
          4c: d0 a1 03 fb                    rdlong r0, r0  
          50: 02 a0 97 fb                    tjz r0, #2
          54: 00 00 90 ff                    augd #1048576
          58: 00 fe 65 fd                    hubset #255
          5c: 00 00 00 ff                    augs #0
          60: 68 a0 07 f6                    mov r0, #104   
          64: d0 01 e8 fc                    coginit #0, r0 
    
    00000068 <__start>:
          68: f8 a1 03 fb                    rdlong r0, ptra        
          6c: 98 00 00 ff                    augs #152
          70: 38 a5 07 f6                    mov r2, #312   
          74: 29 fe 67 fd                    setq2 #511
          78: 01 00 00 ff                    augs #1
          7c: 00 00 04 fb                    rdlong $0x000, #0      
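
The cross-referencing wished for above amounts to mapping an absolute address back to the symbol that covers it. A minimal sketch of that lookup, not how llvm-objdump actually does it: the Symbol struct and find_symbol are hypothetical names, and the three sample entries are function symbols borrowed from a P2LLVM symbol dump posted later in this thread (data symbols would be handled the same way).

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t addr;     /* start address of the symbol */
    uint32_t size;     /* size in bytes */
    const char *name;
} Symbol;

/* Sample entries; the table must be sorted by ascending address. */
static const Symbol sample_syms[] = {
    { 0x4d4, 0x94, "memcpy"  },
    { 0x568, 0x68, "memmove" },
    { 0x5d0, 0x28, "memset"  },
};

/* Return the symbol whose [addr, addr + size) range contains the given
 * address, or NULL if no symbol covers it. Uses a binary search. */
const Symbol *find_symbol(const Symbol *syms, size_t n, uint32_t addr)
{
    size_t lo = 0, hi = n;

    /* Find how many symbols start at or below addr. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (syms[mid].addr <= addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo == 0)
        return NULL;            /* addr is below the first symbol */

    const Symbol *s = &syms[lo - 1];
    if (addr - s->addr >= s->size)
        return NULL;            /* addr falls in a gap after the symbol */
    return s;
}
```

A disassembler could call something like this for each absolute operand and print name+offset next to the hex address whenever the lookup succeeds.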
    
  • Rayman Posts: 16,314

    Guess it’d be interesting to use spin2cpp on something and see if clang can compile it….

  • iseries Posts: 1,539

    In the other thread it was noted that the elf files were bloated because they clear all of memory and not just the code.

    I don't remember if there was an option to not do that or I built a loader that removed that.

    Mike

  • Rayman Posts: 16,314

    Think I have a minimum build folder (at least compiles hello.c).

    To make .o from .c:
    clang -I. -I./sys -Ibuild -Wall -Werror --target=p2 -fno-jump-tables -c -fdata-sections -ffunction-sections -o hello.o hello.c

    To make .elf from .o
    clang -v --target=p2 -Wl,--gc-sections -Wl,-L./ -Wl,-L./ -o hello.elf hello.o

    To make .bin from .elf:
    llvm-objcopy -O binary hello.elf hello.bin
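
On the earlier question of .elf size versus on-chip size: a rough sketch of how the flat image size could be estimated, assuming llvm-objcopy -O binary writes everything that has file contents from the lowest load address to the highest end address, fills gaps with zeros, and leaves out .bss-style areas that occupy no file bytes. Chunk and flat_image_size are hypothetical names, not part of any LLVM API.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t addr;       /* load address of the chunk */
    uint32_t file_size;  /* bytes stored in the file; 0 for .bss-style chunks */
} Chunk;

/* Estimate the size of a flat binary image: the span from the lowest
 * load address to the highest end address among chunks that actually
 * carry file contents. Returns 0 if no chunk has any contents. */
uint32_t flat_image_size(const Chunk *chunks, size_t n)
{
    uint32_t lowest = UINT32_MAX;
    uint32_t highest = 0;

    for (size_t i = 0; i < n; i++) {
        if (chunks[i].file_size == 0)
            continue;                       /* nothing to write for this one */
        uint32_t end = chunks[i].addr + chunks[i].file_size;
        if (chunks[i].addr < lowest)
            lowest = chunks[i].addr;
        if (end > highest)
            highest = end;
    }
    return (highest > lowest) ? highest - lowest : 0;
}
```

For example, with made-up chunks at 0x0 (0x40 bytes) and 0x200 (0x658 bytes) plus a zero-file-size .bss chunk, this gives 0x858 bytes, gap included. The ELF file itself also carries headers, symbol tables, and possibly debug info, which is one reason its size on disk says little about the image size.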

  • Rayman Posts: 16,314

    Guess forgot that made webpage for Clang a long time ago...

    https://www.rayslogic.com/Propeller2/Clang.htm

    I've uploaded a minimum build folder that can compile as in post #42.
    But, still need to add the fixes from @rogloh , so not really ready yet...

  • Wuerfel_21 Posts: 5,839
    edited 2026-04-16 15:55

    @Rayman said:
    @Wuerfel_21 code is posted above…

    Would be interesting to compare binary size with Flexprop…

    Ahh, we're doing printf, which is a typical bloat landmine. Most likely pulling in a bunch of floating point support along the way. Though IIRC P1 GCC could do printf with float support and not run totally out of RAM.

    FlexC has multiple levels of mitigation for this problem, so beating it is hard:
    - printf (though not sprintf or other variants) is treated as a builtin (__builtin_printf) and the compiler scans the format string and uses simpler functions to accomplish the same job if possible
    - If the user program doesn't use floats, float support in the library is automatically disabled
    - The actual library formatting implementation is pretty lean overall.
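
A sketch of the kind of format-string scan the first two points describe. This is not FlexC's actual implementation, just an illustration of the idea; format_is_plain and format_needs_float are hypothetical names, and the scan only understands the standard flag, width, precision, and length characters.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if fmt contains no conversions (only literal text and "%%"),
 * so the call could be lowered to a plain string write. */
int format_is_plain(const char *fmt)
{
    for (size_t i = 0; fmt[i] != '\0'; i++) {
        if (fmt[i] != '%')
            continue;
        if (fmt[i + 1] == '%') {    /* escaped percent sign */
            i++;
            continue;
        }
        return 0;                   /* a real conversion follows */
    }
    return 1;
}

/* Returns 1 if fmt contains a floating point conversion (a, e, f, g in
 * either case), meaning float formatting support has to be linked in. */
int format_needs_float(const char *fmt)
{
    for (size_t i = 0; fmt[i] != '\0'; i++) {
        if (fmt[i] != '%')
            continue;
        i++;
        /* Skip flags, width, precision, and length modifiers. */
        while (fmt[i] != '\0' && strchr("-+ #0123456789.*hlLjzt", fmt[i]) != NULL)
            i++;
        if (fmt[i] == '\0')
            break;                  /* truncated conversion at end of string */
        if (strchr("aAeEfFgG", fmt[i]) != NULL)
            return 1;
    }
    return 0;
}
```

A compiler doing this at build time could emit a plain string write for printf("Hello World!\n"), since format_is_plain returns 1 for it, and only pull in the float formatter when format_needs_float returns 1 for some call.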

    (It used to be possible to reduce bloat related to file descriptors etc. if you just want to print to the console, but this got busted.)

    The fairer comparison would be to use puts, I guess.


    Most direct equivalent code to your LLVM example:

    enum {
        _CLKFREQ = 200000000
    };
    #include "propeller.h"
    #include "stdio.h"
    int main() {
        _setbaud(115200 * 2);
        printf("Hello World!\n");
        while(1) {
            _waitx(CLKFREQ);
        }
    }
    

    Comes out to 7352 bytes.

    If we force the real formatting implementation by using fprintf, which has no builtin processing:

    enum {
        _CLKFREQ = 200000000
    };
    #include "propeller.h"
    #include "stdio.h"
    int main() {
        _setbaud(115200 * 2);
        fprintf(stdout,"Hello World!\n");
        while(1) {
            _waitx(CLKFREQ);
        }
    }
    

    Comes out to 8588 bytes.

    If we also enable float support...

    enum {
        _CLKFREQ = 200000000
    };
    #include "propeller.h"
    #include "stdio.h"
    int main() {
        _setbaud(115200 * 2);
        float foo = 1.0;
        fprintf(stdout,"Hello World!\n");
        while(1) {
            _waitx(CLKFREQ);
        }
    }
    

    We get 13244 bytes

    Going the other way, if we don't include stdio (which will bog us down with a bunch of function pointers the compiler struggles to get rid of) and call the builtin directly:

    enum {
        _CLKFREQ = 200000000
    };
    #include "propeller.h"
    int main() {
        _setbaud(115200 * 2);
        __builtin_printf("Hello World!\n");
        while(1) {
            _waitx(CLKFREQ);
        }
    }
    

    We're down to 5768 bytes

    If the aforementioned simple IO feature wasn't busted... (does not work on current versions)

    enum {
        _CLKFREQ = 200000000
    };
    #define _SIMPLE_IO
    #pragma exportdef _SIMPLE_IO
    #include "propeller.h"
    int main() {
        _setbaud(115200 * 2);
        __builtin_printf("Hello World!\n");
        while(1) {
            _waitx(CLKFREQ);
        }
    }
    

    It would be 3040 bytes. Still a lot, but most of it is actually zero-padding that the compiler will generate no matter what.

    (EDIT: by adding -H 32 to the command line, some of it is saved and the size is exactly 2048 bytes - IDK why it's there by default)

    And for comparison, using puts instead of printf:

    enum {
        _CLKFREQ = 200000000
    };
    #include "propeller.h"
    #include "stdio.h"
    int main() {
        _setbaud(115200 * 2);
        puts("Hello World!\n");
        while(1) {
            _waitx(CLKFREQ);
        }
    }
    

    4688 bytes

    (all at default settings, -Os might make things slightly smaller but let's not)

    So that's definitely something that needs improving to make LLVM a good option for P2.


    @rogloh said:
    @Wuerfel_21 said:
    32k for a hello world is still ridiculous.

    Often the case with C with libs included.

    I always find it annoying that the typical compiler->linker arrangement is unable to really provide readable listings that actually correspond to the output binary. Very hard to figure out what's actually in your code.

    I find in this case that objdump -d is generally okay.

    Didn't even know about that one... >.<
    Though it still ends up being an annotated disassembly of the already built program.

  • Rayman Posts: 16,314

    Hmm... If things can be under 32k, maybe can compile for P1 too somehow?
    That's probably pretty futile without XMM though, would guess...

  • Rayman Posts: 16,314

    Thought those old notes on compiling LLVM would give me some insight on how to compile the .a libraries. But, seems couldn't figure it out back then either :(

  • Rayman Posts: 16,314

    Ok, copied the @rogloh files from https://forums.parallax.com/discussion/169862/micropython-for-p2/p23 into a build folder and rebuilt.
    Copied over new files from llmvfixes.zip first. Seems should be ready to go.

    The p2llmv-fixes.zip looks to have stuff for building the .a libraries, but since was gifted those .a files from @rogloh (above) and can't build it anyway, skipping that.

    Do have a question about the clang*.exe files... They all have exactly the same file size. Thinking they are all actually the same file. Are they?
    7-Zip can compress them down as though they were one file, so think that is true...

  • rogloh Posts: 6,337

    @Rayman said:
    Ok, copied the @rogloh files from https://forums.parallax.com/discussion/169862/micropython-for-p2/p23 into a build folder and rebuilt.
    Copied over new files from llmvfixes.zip first. Seems should be ready to go.

    The p2llmv-fixes.zip looks to have stuff for building the .a libraries, but since was gifted those .a files from @rogloh (above) and can't build it anyway, skipping that.

    Do have a question about the clang*.exe files... They all have exactly the same file size. Thinking they are all actually the same file. Are they?
    7-Zip can compress them down as though they were one file, so think that is true...

    When I build LLVM I get these files in the "bin" folder area and the clang* files are different. Not sure what exactly you are talking about, or maybe it's a Windows-specific thing with Visual Studio. I do see a couple of symlinks are used to target the same clang binary if that's what you meant.

    ❯ ls -l
    .rwxr-xr-x roger staff 556 B  Mon Mar 16 16:56:27 2026  analyze-build
    .rwxr-xr-x roger staff 109 MB Fri Apr 10 11:22:54 2026  bugpoint
    .rwxr-xr-x roger staff  99 MB Fri Apr 10 11:22:57 2026  c-index-test
    lrwxr-xr-x roger staff   8 B  Mon Mar 16 17:14:20 2026  clang ⇒ clang-14
    lrwxr-xr-x roger staff   5 B  Sat Apr 11 17:05:48 2026  clang++ ⇒ clang
    .rwxr-xr-x roger staff 281 MB Fri Apr 10 11:22:56 2026  clang-14
    .rwxr-xr-x roger staff 176 MB Fri Apr 10 11:22:55 2026  clang-check
    lrwxr-xr-x roger staff   5 B  Sat Apr 11 17:05:48 2026  clang-cl ⇒ clang
    lrwxr-xr-x roger staff   5 B  Sat Apr 11 17:05:48 2026  clang-cpp ⇒ clang
    .rwxr-xr-x roger staff  88 MB Fri Apr 10 11:22:54 2026  clang-extdef-mapping
    .rwxr-xr-x roger staff 8.1 MB Mon Mar 16 17:10:24 2026  clang-format
    .rwxr-xr-x roger staff 6.0 MB Mon Mar 16 17:11:40 2026  clang-nvlink-wrapper
    .rwxr-xr-x roger staff  15 MB Mon Mar 16 17:10:55 2026  clang-offload-bundler
    .rwxr-xr-x roger staff  16 MB Mon Mar 16 17:11:52 2026  clang-offload-wrapper
    .rwxr-xr-x roger staff  99 MB Mon Mar 16 17:12:39 2026  clang-refactor
    .rwxr-xr-x roger staff  93 MB Mon Mar 16 17:12:38 2026  clang-rename
    .rwxr-xr-x roger staff 257 MB Fri Apr 10 11:22:56 2026  clang-repl
    .rwxr-xr-x roger staff 199 MB Fri Apr 10 11:22:56 2026  clang-scan-deps
    .rwxr-xr-x roger staff  33 KB Mon Mar 16 17:06:41 2026  count
    .rwxr-xr-x roger staff  30 MB Mon Mar 16 17:12:37 2026  diagtool
    .rwxr-xr-x roger staff  64 MB Fri Apr 10 11:22:53 2026  dsymutil
    .rwxr-xr-x roger staff 3.0 MB Mon Mar 16 17:07:02 2026  FileCheck
    .rwxr-xr-x roger staff  22 KB Mon Mar 16 16:56:27 2026  git-clang-format
    .rwxr-xr-x roger staff 9.7 KB Mon Mar 16 16:56:27 2026  hmaptool
    .rwxr-xr-x roger staff 562 B  Mon Mar 16 16:56:27 2026  intercept-build
    lrwxr-xr-x roger staff   3 B  Sat Apr 11 17:05:48 2026  ld.lld ⇒ lld
    lrwxr-xr-x roger staff   3 B  Sat Apr 11 17:05:48 2026  ld64.lld ⇒ lld
    lrwxr-xr-x roger staff   3 B  Sat Apr 11 17:05:48 2026  ld64.lld.darwinnew ⇒ lld
    lrwxr-xr-x roger staff   3 B  Sat Apr 11 17:05:48 2026  ld64.lld.darwinold ⇒ lld
    .rwxr-xr-x roger staff  91 MB Fri Apr 10 11:22:53 2026  llc
    .rwxr-xr-x roger staff 144 MB Fri Apr 10 11:22:54 2026  lld
    lrwxr-xr-x roger staff   3 B  Sat Apr 11 17:05:48 2026  lld-link ⇒ lld
    .rwxr-xr-x roger staff  78 MB Mon Mar 16 17:13:44 2026  lli
    .rwxr-xr-x roger staff 4.1 MB Mon Mar 16 17:13:25 2026  lli-child-target
    lrwxr-xr-x roger staff  15 B  Sat Apr 11 17:05:48 2026  llvm-addr2line ⇒ llvm-symbolizer
    .rwxr-xr-x roger staff  15 MB Fri Apr 10 11:22:49 2026  llvm-ar
    .rwxr-xr-x roger staff  17 MB Mon Mar 16 17:11:42 2026  llvm-as
    .rwxr-xr-x roger staff 2.3 MB Mon Mar 16 17:10:47 2026  llvm-bcanalyzer
    lrwxr-xr-x roger staff  12 B  Sat Apr 11 17:05:48 2026  llvm-bitcode-strip ⇒ llvm-objcopy
    .rwxr-xr-x roger staff  54 MB Fri Apr 10 11:22:52 2026  llvm-c-test
    .rwxr-xr-x roger staff  16 MB Mon Mar 16 17:11:42 2026  llvm-cat
    .rwxr-xr-x roger staff  22 MB Fri Apr 10 11:22:50 2026  llvm-cfi-verify
    .rwxr-xr-x roger staff 1.1 MB Mon Mar 16 17:06:56 2026  llvm-config
    .rwxr-xr-x roger staff  18 MB Mon Mar 16 17:11:10 2026  llvm-cov
    .rwxr-xr-x roger staff 5.8 MB Mon Mar 16 17:10:53 2026  llvm-cvtres
    .rwxr-xr-x roger staff  15 MB Mon Mar 16 17:10:54 2026  llvm-cxxdump
    .rwxr-xr-x roger staff 1.8 MB Mon Mar 16 17:09:13 2026  llvm-cxxfilt
    .rwxr-xr-x roger staff 2.4 MB Mon Mar 16 17:10:39 2026  llvm-cxxmap
    .rwxr-xr-x roger staff  11 MB Mon Mar 16 17:10:50 2026  llvm-diff
    .rwxr-xr-x roger staff  10 MB Mon Mar 16 17:10:50 2026  llvm-dis
    lrwxr-xr-x roger staff   7 B  Sat Apr 11 17:05:46 2026  llvm-dlltool ⇒ llvm-ar
    .rwxr-xr-x roger staff  19 MB Fri Apr 10 11:22:44 2026  llvm-dwarfdump
    .rwxr-xr-x roger staff  59 MB Fri Apr 10 11:22:52 2026  llvm-dwp
    .rwxr-xr-x roger staff  29 MB Mon Mar 16 17:14:11 2026  llvm-exegesis
    .rwxr-xr-x roger staff  23 MB Mon Mar 16 17:12:50 2026  llvm-extract
    .rwxr-xr-x roger staff  58 MB Fri Apr 10 11:22:52 2026  llvm-gsymutil
    .rwxr-xr-x roger staff  15 MB Mon Mar 16 17:11:13 2026  llvm-ifs
    lrwxr-xr-x roger staff  12 B  Sat Apr 11 17:05:48 2026  llvm-install-name-tool ⇒ llvm-objcopy
    .rwxr-xr-x roger staff  48 MB Fri Apr 10 11:22:44 2026  llvm-jitlink
    .rwxr-xr-x roger staff 4.1 MB Mon Mar 16 17:09:34 2026  llvm-jitlink-executor
    lrwxr-xr-x roger staff   7 B  Sat Apr 11 17:05:46 2026  llvm-lib ⇒ llvm-ar
    .rwxr-xr-x roger staff  15 MB Mon Mar 16 17:10:55 2026  llvm-libtool-darwin
    .rwxr-xr-x roger staff  20 MB Mon Mar 16 17:12:51 2026  llvm-link
    .rwxr-xr-x roger staff  15 MB Fri Apr 10 11:22:51 2026  llvm-lipo
    .rwxr-xr-x roger staff 105 MB Fri Apr 10 11:22:54 2026  llvm-lto
    .rwxr-xr-x roger staff 115 MB Fri Apr 10 11:22:54 2026  llvm-lto2
    .rwxr-xr-x roger staff 7.1 MB Fri Apr 10 11:22:47 2026  llvm-mc
    .rwxr-xr-x roger staff 6.6 MB Fri Apr 10 11:22:47 2026  llvm-mca
    .rwxr-xr-x roger staff 6.2 MB Fri Apr 10 11:22:47 2026  llvm-ml
    .rwxr-xr-x roger staff  16 MB Mon Mar 16 17:11:42 2026  llvm-modextract
    .rwxr-xr-x roger staff 1.4 MB Mon Mar 16 17:09:15 2026  llvm-mt
    .rwxr-xr-x roger staff  16 MB Fri Apr 10 11:22:50 2026  llvm-nm
    .rwxr-xr-x roger staff  19 MB Mon Mar 16 17:11:05 2026  llvm-objcopy
    .rwxr-xr-x roger staff  21 MB Fri Apr 10 11:22:44 2026  llvm-objdump
    .rwxr-xr-x roger staff 2.8 MB Mon Mar 16 17:10:53 2026  llvm-opt-report
    lrwxr-xr-x roger staff  12 B  Sat Apr 11 17:05:48 2026  llvm-otool ⇒ llvm-objdump
    .rwxr-xr-x roger staff  24 MB Mon Mar 16 17:11:31 2026  llvm-pdbutil
    .rwxr-xr-x roger staff  71 KB Mon Mar 16 17:06:42 2026  llvm-PerfectShuffle
    .rwxr-xr-x roger staff 9.3 MB Mon Mar 16 17:10:51 2026  llvm-profdata
    .rwxr-xr-x roger staff  28 MB Fri Apr 10 11:22:44 2026  llvm-profgen
    lrwxr-xr-x roger staff   7 B  Sat Apr 11 17:05:46 2026  llvm-ranlib ⇒ llvm-ar
    .rwxr-xr-x roger staff 6.8 MB Mon Mar 16 17:10:58 2026  llvm-rc
    lrwxr-xr-x roger staff  12 B  Sat Apr 11 17:05:48 2026  llvm-readelf ⇒ llvm-readobj
    .rwxr-xr-x roger staff  23 MB Mon Mar 16 17:11:39 2026  llvm-readobj
    .rwxr-xr-x roger staff  16 MB Fri Apr 10 11:22:51 2026  llvm-reduce
    .rwxr-xr-x roger staff  14 MB Fri Apr 10 11:22:44 2026  llvm-rtdyld
    .rwxr-xr-x roger staff  12 MB Mon Mar 16 17:11:34 2026  llvm-sim
    .rwxr-xr-x roger staff  14 MB Mon Mar 16 17:10:54 2026  llvm-size
    .rwxr-xr-x roger staff  18 MB Mon Mar 16 17:11:50 2026  llvm-split
    .rwxr-xr-x roger staff 8.6 MB Mon Mar 16 17:11:33 2026  llvm-stress
    .rwxr-xr-x roger staff 1.8 MB Mon Mar 16 17:10:53 2026  llvm-strings
    lrwxr-xr-x roger staff  12 B  Sat Apr 11 17:05:48 2026  llvm-strip ⇒ llvm-objcopy
    .rwxr-xr-x roger staff  20 MB Mon Mar 16 17:11:32 2026  llvm-symbolizer
    .rwxr-xr-x roger staff  15 MB Mon Mar 16 17:10:56 2026  llvm-tapi-diff
    .rwxr-xr-x roger staff  16 MB Mon Mar 16 17:07:04 2026  llvm-tblgen
    .rwxr-xr-x roger staff 2.1 MB Mon Mar 16 17:06:57 2026  llvm-undname
    lrwxr-xr-x roger staff   7 B  Sat Apr 11 17:05:48 2026  llvm-windres ⇒ llvm-rc
    .rwxr-xr-x roger staff  22 MB Mon Mar 16 17:11:35 2026  llvm-xray
    .rwxr-xr-x roger staff 801 KB Mon Mar 16 17:06:56 2026  not
    .rwxr-xr-x roger staff  28 MB Mon Mar 16 17:11:33 2026  obj2yaml
    .rwxr-xr-x roger staff 115 MB Fri Apr 10 11:22:54 2026  opt
    .rwxr-xr-x roger staff  22 MB Fri Apr 10 11:22:44 2026  sancov
    .rwxr-xr-x roger staff  20 MB Mon Mar 16 17:11:34 2026  sanstats
    .rwxr-xr-x roger staff  56 KB Mon Mar 16 16:56:27 2026  scan-build
    .rwxr-xr-x roger staff 550 B  Mon Mar 16 16:56:27 2026  scan-build-py
    .rwxr-xr-x roger staff 4.6 KB Mon Mar 16 16:56:27 2026  scan-view
    .rwxr-xr-x roger staff 3.8 KB Mon Mar 16 16:56:27 2026  set-xcode-analyzer
    .rwxr-xr-x roger staff 1.7 MB Mon Mar 16 17:06:56 2026  split-file
    .rwxr-xr-x roger staff  18 MB Mon Mar 16 17:11:43 2026  verify-uselistorder
    lrwxr-xr-x roger staff   3 B  Sat Apr 11 17:05:48 2026  wasm-ld ⇒ lld
    .rwxr-xr-x roger staff 2.0 MB Mon Mar 16 17:06:57 2026  yaml-bench
    .rwxr-xr-x roger staff  14 MB Mon Mar 16 17:11:13 2026  yaml2obj
    
  • rogloh Posts: 6,337

    @Rayman said:
    The p2llmv-fixes.zip looks to have stuff for building the .a libraries, but since was gifted those .a files from @rogloh (above) and can't build it anyway, skipping that.

    I just logged the output of the make process for building these libraries which should help you reverse engineer things so you can build your own versions if needed.

  • RossH Posts: 5,740
    edited 2026-04-18 02:46

    @Wuerfel_21 said:

    Ahh, we're doing printf, which is a typical bloat landmine. Most likely pulling in a bunch of floating point support along the way. Though IIRC P1 GCC could do printf with float support and not run totally out of RAM.

    As Catalina still can, of course - I can't resist putting an ad in here! :) ...

    <advertisment>

    Using Catalina's COMPACT mode you can have full stdio support for programs executing entirely from Hub RAM - including full floating point and full file system support - on a P1 or a P2. The maximum overhead is about 14k.

    For example, "Hello World" for the Propeller 1:

    catalina hello_world.c -lcx -O5 -C COMPACT
    code size 14952

    or, for the Propeller 2:

    catalina hello_world.c -p2 -lcx -O5 -C COMPACT
    code size 14988

    Of course, if you don't need full file system or full floating point support (as "Hello World" doesn't) you don't have to include either one.

    Catalina does this by providing different libraries, that have different combinations of stdio and floating point support:

    -lcx full floating point support, full stdio support, full file system support - max overhead about 14k
    -lcix floating point support, stdio support but no floating point I/O, full file system support - max overhead about 11k
    -lc full floating point support, stdio support but no file system - max overhead about 10k
    -lci floating point support, stdio support but no floating point I/O and no file system - max overhead about 3k

    So ...

    catalina hello_world.c -p2 -lcix -O5 -C COMPACT
    code size 11372 bytes

    catalina hello_world.c -p2 -lc -O5 -C COMPACT
    code size 10136 bytes

    catalina hello_world.c -p2 -lci -O5 -C COMPACT
    code size 3436 bytes

    As you might expect, it is including file system support that is the largest memory hog (edit: see note, below).

    In the case of "Hello, World", Catalina also offers several other ways to reduce code size.

    You can use stdio but add a smaller version of printf (slightly less functional, but adequate for most programs) by adding -ltiny:

    catalina hello_world.c -p2 -lci -O5 -C COMPACT -ltiny
    code size 1520 bytes

    Or you can replace printf with an even smaller version that does not pull in any stdio code at all:

    catalina hello_world.c -p2 -lci -O5 -C COMPACT -Dprintf=t_printf
    code size 1056 bytes

    None of these require any modifications to hello_world.c, which in all the above cases is as follows:

    #include <stdio.h>
    
    void main() {
       printf("Hello, world!\n");
    }
    

    However, if you are ok with modifying the program, you can do better.

    For example, this program - tiny_world.c - is functionally identical to hello_world.c, but uses only "built in" capabilities:

    #define printf(str) t_string(1, str)
    
    void main() {
       printf("Hello, world!\n");
    }
    

    Then ...

    catalina tiny_world.c -p2 -lci -O5 -C COMPACT -C NO_EXIT -C NO_REBOOT
    code size 100 bytes

    Catalina's strength is that you usually do not need to modify a C program to get it to execute. Of course, there are additional libraries offered that add functionality that will only work on the Propeller 1 or Propeller 2, but if you stick to "clean" C (originally only C89, now also C99, C11 or C23), you don't need to modify programs, whether they are going to execute on the Propeller 1, Propeller 2, and whether they are compiled as COMPACT or NATIVE programs to execute from Hub RAM, or as XMM programs to execute from external RAM.

    </advertisment>

    Edited to add note: Technically, it is not "file system" support that bloats stdio so much - it is "stream" support. The -lci library variant has simplified streams, which support only stdin, stdout and stderr. The other library variants all have full stream support.

  • rogloh Posts: 6,337
    edited 2026-04-18 03:41

    I just compiled this...under P2LLVM with the printf and stdio.h include file commented out. I still get a large program generated so it looks like it sucks in a lot of library stuff by default. That definitely needs to be optimized to reduce the size of the binaries being created. You can see what it's bringing in inside the sorted symbol table output I extracted as out.txt using llvm-objdump -x. I think part of the problem is that P2LLVM is packing certain built-in functions into LUTRAM to accelerate commonly(?) made calls which then reference external code in HUB it brings in afterwards by default. This includes a lot of floating point conversion code which isn't necessary in many cases. It makes sense to do memcpy and memmove in LUT but not sure about all the other ones (unless you really want to do floating point).

    //#include <stdio.h>
    #include <propeller.h>
    
    int main(void)
    {
    _uart_init(63,62,115200,0);
    //printf("hello world\n");
    return 0;
    }
    

    Here's part of what it puts into LUTRAM. A lot of floating point conversion and comparison calls, seemingly back to HUB RAM anyway at copies of the same label (with different code). A bug in the generated output perhaps? IMO it's probably best to keep this all in HUB anyway, only included when needed, and not even use LUTRAM. Also it could have used a JMP instead and let the RETA of the called function return to the original location, which would be faster and save a long each time.
    ```
    00000358 <__fixunsdfdi>:
    358: b4 44 c0 fd calla #__fixuint
    35c: 2e 00 64 fd reta

    00000360 <__fixunsdfsi>:
    360: 90 45 c0 fd calla #__fixuint
    364: 2e 00 64 fd reta

    00000368 <__fixunssfdi>:
    368: 30 46 c0 fd calla #__fixuint
    36c: 2e 00 64 fd reta

    00000370 <__fixunssfsi>:
    370: e4 46 c0 fd calla #__fixuint
    374: 2e 00 64 fd reta

    ...

    00000200 g F .text 00000008 __adddf3
    00000208 g F .text 00000008 __addsf3
    00000210 g F .text 00000054 __ashldi3
    00000264 g F .text 00000058 __ashrdi3
    000002bc g F .text 00000008 __eqdf2
    000002bc g F .text 00000008 __ledf2
    000002bc g F .text 00000008 __ltdf2
    000002bc g F .text 00000008 __nedf2
    000002c4 g F .text 00000008 __gedf2
    000002c4 g F .text 00000008 __gtdf2
    000002cc g F .text 00000008 __unorddf2
    000002d4 g F .text 00000008 __eqsf2
    000002d4 g F .text 00000008 __lesf2
    000002d4 g F .text 00000008 __ltsf2
    000002d4 g F .text 00000008 __nesf2
    000002dc g F .text 00000008 __gesf2
    000002dc g F .text 00000008 __gtsf2
    000002e4 g F .text 00000008 __unordsf2
    000002ec g F .text 00000008 __divdi3
    000002f4 g F .text 00000008 __divdf3
    000002fc g F .text 00000034 __divsi3
    00000330 g F .text 00000008 __divsf3
    00000338 g F .text 00000008 __extendsfdf2
    00000340 g F .text 00000008 __fixdfsi
    00000348 g F .text 00000008 __fixsfdi
    00000350 g F .text 00000008 __fixsfsi
    00000358 g F .text 00000008 __fixunsdfdi
    00000360 g F .text 00000008 __fixunsdfsi
    00000368 g F .text 00000008 __fixunssfdi
    00000370 g F .text 00000008 __fixunssfsi
    00000378 g F .text 00000008 __floatdisf
    00000380 g F .text 00000008 __floatsidf
    00000388 g F .text 00000008 __floatundidf
    00000390 g F .text 00000008 __floatundisf
    00000398 g F .text 00000008 __floatunsidf
    000003a0 g F .text 00000008 __floatunsisf
    000003a8 g F .text 000000d8 __floatsisf
    00000480 g F .text 00000054 __lshrdi3
    000004d4 g F .text 00000094 memcpy
    00000568 g F .text 00000068 memmove
    000005d0 g F .text 00000028 memset
    000005f8 g F .text 00000008 __moddi3
    00000600 g F .text 00000034 __modsi3
    00000634 g F .text 00000008 __muldf3
    0000063c g F .text 00000008 __mulsf3
    00000644 g F .text 00000098 __muldi3
    000006dc g F .text 00000014 __negdi2
    000006f0 g F .text 00000024 __subdf3
    00000714 g F .text 00000018 __subsf3
    0000072c g F .text 00000014 __udivdi3
    00000740 g F .text 000000cc __udivmoddi4
    0000080c g F .text 0000003c __umoddi3
    00000848 g F .text 00000008 sqrtf
    00000850 g F .text 00000008 powf
    ```
