Comments
I just dumped the symbols, wow, yours includes a lot of extra stuff. I see you are setting the clock but not much more than that vs my example. The clock seems to have sucked in a lot of extra string functionality as well as time functions. I think you may well have used different compiler command line settings. This is what I used to build my elf file.
clang -I. -I../.. -Ibuild -Wall -Werror --target=p2 -fno-jump-tables -c -fdata-sections -ffunction-sections -o main.o main.c
clang -v --target=p2 -Wl,--gc-sections -Wl,-L/Users/roger/Applications/p2llvm/libc/lib -Wl,-L/Users/roger/Applications/p2llvm/libp2/lib -o main.elf main.o
EDIT: yep after using this on your source I get the symbols in out2.txt which is a smaller image vs your original .elf file dumped in out.txt. It might be the --gc-sections option doing this. EDIT2: yes it is that option which makes the difference.
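A rough sketch of why that option matters, assuming the usual Clang/GCC behaviour: with -ffunction-sections each function gets its own section (typically named .text.<function>), and --gc-sections lets the linker drop any section nothing reachable refers to. The function names below are made up for illustration and are not from either of our sources.

```c
#include <assert.h>

/* Called from entry_point(), so its section is reachable and must be kept. */
int used_helper(int x)
{
    return x + 1;
}

/* Never referenced anywhere. With -ffunction-sections it sits in its own
   section, which the linker is free to discard under --gc-sections. */
int unused_helper(int x)
{
    return x * 1000;
}

/* Stand-in for main() in this sketch. */
int entry_point(void)
{
    int r = used_helper(41);
    assert(r == 42);   /* sanity check on the path that is kept */
    return r;
}
```

Building something like this both ways and dumping the symbols should show whether unused_helper survives in the image. I have not checked this against the P2 backend specifically.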
Last time I went through this, was told that the .elf doesn't really reflect the size of the actual binary...
Seems you have to convert the .elf to a .bin to see exactly how big it is on chip?
Used web search to figure out how to convert .elf to .bin
Built it the way from post #32 above and the size is smaller...
@rogloh Guess I can look at what you posted in other thread, but...
Are the changes you made just to the lib files? Or, to the main LLVM files too?
32k for a hello world is still ridiculous.
I always find it annoying that the typical compiler->linker arrangement is unable to really provide readable listings that actually correspond to the output binary. Very hard to figure out what's actually in your code.
(don't read this post as me being all negative!)
@Wuerfel_21 code is posted above…
Would be interesting to compare binary size with Flexprop…
Details are in the other thread so best to read it. But yes it has changes to the libs (which is what I provided you), and also to LLVM source which needs fixes for the C modulus "%" operation otherwise it crashes LLVM, and also other CORDIC dependency fixes etc. You really should take those file changes and update your own LLVM build to resolve these. Also there are still cases where, if you disassemble random bytes, say with llvm-objdump -D, it will crash this tool because the disassembler doesn't know about ALL P2 instructions yet. But if you disassemble genuine P2 compiled C code with llvm-objdump -d it's fine.
Often the case with C with libs included.
I find in this case that objdump -d is generally okay. I get a good disassembly listing of all P2 code in the binary. But if you wanted to see the symbols for global data accesses you don't see them used in the code, it's just absolute read/write hex addresses, which is not nice. You only see data symbols and their addresses in the symbol table. They really should cross reference them back in the disassembly listing IMO, or maybe that's just not implemented in the P2 port right now. Also it'd be real nice to have a way in the listing to see which C function arguments are being accessed in the stack frame or which initial registers they get copied into by somehow referring them back to the source code. Bit tricky if they don't have a way to carry it through in the .elf file. From memory I think enabling debugging helps pass more info down through the intermediate files.
main.elf: file format elf32-p2
Disassembly of section .text:
00000000 <__entry>:
   0: f8 a1 03 fb    rdlong r0, ptra
   4: 10 00 80 fd    jmp #\16
 ...
00000040 <__start0>:
  40: f8 a1 03 fb    rdlong r0, ptra
  44: 98 00 00 ff    augs #152
  48: 28 a1 07 f6    mov r0, #296
  4c: d0 a1 03 fb    rdlong r0, r0
  50: 02 a0 97 fb    tjz r0, #2
  54: 00 00 90 ff    augd #1048576
  58: 00 fe 65 fd    hubset #255
  5c: 00 00 00 ff    augs #0
  60: 68 a0 07 f6    mov r0, #104
  64: d0 01 e8 fc    coginit #0, r0
00000068 <__start>:
  68: f8 a1 03 fb    rdlong r0, ptra
  6c: 98 00 00 ff    augs #152
  70: 38 a5 07 f6    mov r2, #312
  74: 29 fe 67 fd    setq2 #511
  78: 01 00 00 ff    augs #1
  7c: 00 00 04 fb    rdlong $0x000, #0
Guess it’d be interesting to use spin2cpp on something and see if clang can compile it….
In the other thread it was noted that the elf files were bloated because they clear all of memory and not just the code.
I don't remember if there was an option to not do that or I built a loader that removed that.
Mike
Think I have a minimum build folder (at least compiles hello.c).
To make .o from .c:
clang -I. -I./sys -Ibuild -Wall -Werror --target=p2 -fno-jump-tables -c -fdata-sections -ffunction-sections -o hello.o hello.c
To make .elf from .o
clang -v --target=p2 -Wl,--gc-sections -Wl,-L./ -Wl,-L./ -o hello.elf hello.o
To make .bin from .elf:
llvm-objcopy -O binary hello.elf hello.bin
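To put a number on the .elf vs .bin comparison, here is a small host-side helper sketch (runs on the PC, not on the P2). The name file_size_bytes is made up for this example; it just counts bytes with standard C file I/O, so it works the same on hello.elf and hello.bin.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the size of the file in bytes, or -1 if it cannot be opened
   or a read error occurs. */
long file_size_bytes(const char *path)
{
    assert(path != NULL);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    unsigned char buf[4096];
    long total = 0;
    size_t n;

    /* Count bytes chunk by chunk until end of file. */
    while ((n = fread(buf, 1, sizeof buf, f)) > 0)
        total += (long)n;

    int failed = ferror(f);
    fclose(f);
    return failed ? -1 : total;
}
```

Calling file_size_bytes("hello.elf") and file_size_bytes("hello.bin") from a small main() and printing both would show the difference directly.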
Guess forgot that made webpage for Clang a long time ago...
https://www.rayslogic.com/Propeller2/Clang.htm
I've uploaded a minimum build folder that can compile as in post #42.
But, still need to add the fixes from @rogloh , so not really ready yet...
Ahh, we're doing printf, which is a typical bloat landmine. Most likely pulling in a bunch of floating point support along the way. Though IIRC P1 GCC could do printf with float support and not run totally out of RAM.
FlexC has multiple levels of mitigation for this problem, so beating it is hard:
- printf (though not sprintf or other variants) is treated as a builtin (__builtin_printf) and the compiler scans the format string and uses simpler functions to accomplish the same job if possible
- If the user program doesn't use floats, float support in the library is automatically disabled
- The actual library formatting implementation is pretty lean overall.
( It used to be possible to reduce bloat related to file descriptors etc if you just want to print to the console, but this got busted)
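To illustrate the format-string scan from the first bullet, here is a minimal sketch of the idea in plain C. This is not FlexC's actual implementation, and format_is_plain is a hypothetical name: it only answers "does this format string contain any conversions?", which is the case where a printf call could be replaced by a plain string write.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 when fmt has no conversion specifiers,
   so the call needs no real formatting. "%%" is a literal percent sign
   and does not count as a conversion. */
int format_is_plain(const char *fmt)
{
    assert(fmt != NULL);

    for (size_t i = 0; fmt[i] != '\0'; i++) {
        if (fmt[i] != '%')
            continue;
        if (fmt[i + 1] == '%') {
            i++;        /* skip the escaped percent */
            continue;
        }
        return 0;       /* a real conversion such as %d or %f */
    }
    return 1;
}
```

With this, "Hello World!\n" counts as plain, while "x=%d\n" does not. A real compiler pass would go further and pick a specific simpler routine for each conversion it finds.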
The fairer comparison would be to use puts, I guess.
Most direct equivalent code to your LLVM example:
enum { _CLKFREQ = 200000000 };
#include "propeller.h"
#include "stdio.h"

int main() {
    _setbaud(115200 * 2);
    printf("Hello World!\n");
    while(1) {
        _waitx(CLKFREQ);
    }
}
Comes out to 7352 bytes.
If we force the real formatting implementation by using fprintf, which has no builtin processing:
enum { _CLKFREQ = 200000000 };
#include "propeller.h"
#include "stdio.h"

int main() {
    _setbaud(115200 * 2);
    fprintf(stdout,"Hello World!\n");
    while(1) {
        _waitx(CLKFREQ);
    }
}
Comes out to 8588 bytes.
If we also enable float support...
enum { _CLKFREQ = 200000000 };
#include "propeller.h"
#include "stdio.h"

int main() {
    _setbaud(115200 * 2);
    float foo = 1.0;
    fprintf(stdout,"Hello World!\n");
    while(1) {
        _waitx(CLKFREQ);
    }
}
We get 13244 bytes
Going the other way, if we don't include stdio (which will bog us down with a bunch of function pointers the compiler struggles to get rid of) and call the builtin directly:
enum { _CLKFREQ = 200000000 };
#include "propeller.h"

int main() {
    _setbaud(115200 * 2);
    __builtin_printf("Hello World!\n");
    while(1) {
        _waitx(CLKFREQ);
    }
}
We're down to 5768 bytes
If the aforementioned simple IO feature wasn't busted... (does not work on current versions)
enum { _CLKFREQ = 200000000 };
#define _SIMPLE_IO
#pragma exportdef _SIMPLE_IO
#include "propeller.h"

int main() {
    _setbaud(115200 * 2);
    __builtin_printf("Hello World!\n");
    while(1) {
        _waitx(CLKFREQ);
    }
}
It would be 3040 bytes. Still a lot, but most of it is actually zero-padding that the compiler will generate no matter what.
(EDIT: by adding -H 32 to the command line, some of it is saved and the size is exactly 2048 bytes - IDK why it's there by default)
And for comparison, using puts instead of printf:
enum { _CLKFREQ = 200000000 };
#include "propeller.h"
#include "stdio.h"

int main() {
    _setbaud(115200 * 2);
    puts("Hello World!\n");
    while(1) {
        _waitx(CLKFREQ);
    }
}
4688 bytes
(all at default settings, -Os might make things slightly smaller but let's not)
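Side note on the puts variant: standard puts appends its own newline, so puts("Hello World!\n") prints one more newline than the printf version does. For byte-identical output the options are puts("Hello World!") or fputs, which adds nothing. A small sketch in plain C follows (hello_fputs is a made-up name, and I make no claim about what size either form gives on the P2):

```c
#include <assert.h>
#include <stdio.h>

/* Writes exactly the bytes printf("Hello World!\n") would write:
   fputs() does not append a newline, unlike puts(). */
void hello_fputs(FILE *out)
{
    assert(out != NULL);
    fputs("Hello World!\n", out);
}
```

Calling hello_fputs(stdout) prints the 13 bytes of the string and nothing more.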
So that's definitely something that needs improving to make LLVM a good option for P2.
Didn't even know about that one... >.<
Though it still ends up being an annotated disassembly of the already built program.
Hmm... If things can be under 32k, maybe can compile for P1 too somehow?
That's probably pretty futile without XMM though, would guess...
Thought those old notes on compiling LLVM would give me some insight on how to compile the .a libraries. But, seems couldn't figure it out back then either
Ok, copied the @rogloh files from https://forums.parallax.com/discussion/169862/micropython-for-p2/p23 into a build folder and rebuilt.
Copied over new files from llmvfixes.zip first. Seems should be ready to go.
The p2llmv-fixes.zip looks to have stuff for building the .a libraries, but since was gifted those .a files from @rogloh (above) and can't build it anyway, skipping that.
Do have a question about the clang*.exe files... They all have exactly the same file size. Thinking they are all actually the same file. Are they?
7-Zip can compress them down as though they were one file, so think that is true...