Awesome! It looks like the main conflict is the propeller2.h changes. I see you turned a lot of the pin access macros into functions. I don't think I'd want to pull this in, since what would be a single instruction inline now becomes a multi-instruction function call. Adding the leading underscore is fine, I don't think I have enough code that I would not want to change it, and it is more consistent with the other functions in that library.
I'd also probably make simpletools a separate library, maybe even a separate repo, as it's relatively unrelated to the actual function of the compiler/core libraries to execute code.
The last changes to the propeller2.h and propeller2.c fixed the issue with the inline functions. I thorough tested all the functions to make sure they were inline instructions.
I also documented most of the functions so doxygen can be used.
Yes, I did not want to put that in libp2 but need it to compile with just a reference and have it link in.
Not so fast, I just came up for some air and found I missed a few functions.
You know in the old simpleIDE GCC it could not remove functions contained in a program so most of the individual functions were separate programs and now putting them back together leads to some missing functions if you don't watch what you're doing....
Also had to convert the eeprom library functions over to flash since we don't have that.
Need another assembly function:
cogstop
Got to have that one if were going use them.
So a separate folder called tools with the final library name as Libtools.a should work?
In the process of testing i2c with a 24lc256 chip to see if the functions came over correctly and the speed doesn't get in the way.
You know in the old simpleIDE GCC it could not remove functions contained in a program so most of the individual functions were separate programs and now putting them back together leads to some missing functions if you don't watch what you're doing....
Can you elaborate what you mean by this? You can still have each function be a separate .c file, if that's what you mean.
cogstop is added, not sure how I missed it.
Yes, I think a separate top level folder called libtools generating a libtools.a should work. I don't think it should be called just "tools" in the top level, since could be misinterpreted as a folder containing tools for the repo, not the tools library.
Out of curiosity, what was the motivation for changing from macros to inline functions for pin control? Overall, it does the same thing, but adds complexity to the source.
@iseries said:
So a separate folder called tools with the final library name as Libtools.a should work?
I'd strongly encourage you to create a new repository for the Simple. If you want help creating a build system for it, I'd be happy to lend a hand, but this probably shouldn't be in the compiler itself (especially not if we ever want this merged upstream).
@iseries said:
So a separate folder called tools with the final library name as Libtools.a should work?
I'd strongly encourage you to create a new repository for the Simple. If you want help creating a build system for it, I'd be happy to lend a hand, but this probably shouldn't be in the compiler itself (especially not if we ever want this merged upstream).
I agree with making a separate repo, but it won't affect upstream merge if we keep it in the same repo--the llvm-project folder is a submodule that I explicitly keep separate so that we can merge in the future.
On the P1 there is limited amount of memory and the GCC compiler would keep functions that were contained in a program as a library. So when the program got compiled and linked in the code size was larger than it needed to be because it did not drop the functions not referenced in the main program.
To get around this they extracted all the functions into small programs and then linked them all together so that the GCC compiler would not include them if not referenced.
I hate having little programs with one function in them, so I tried to put them back together again just cus.
For example the simpletools library is 63 little programs.
got it. Yeah clang does basically the same thing. -ffunction-sections puts every function in it's own section, and then if the section is not referenced, the garbage collector in the linker will throw out that section. If you want to keep everything, you can disable that by removing that flag from compilation.
I agree that having one small function per file can be a little bit much, but I do think it makes sense to keep code that is functionally unrelated in separate C files. otherwise finding where things occur is a mess. And once the project gets large enough, you can enable must faster compiling because each c file can be compiled in parallel.
include <propeller.h>
int test(int);
int i, j;
int main(int argc, char** argv)
{
i = 0x0a2829;
test(i);
while (1)
{
_wait(1000);
}
}
int test(int x)
{
int t;
char p1, p2, w;
t = x;
w = t >> 16;
p2 = (t >> 8) & 0xff;
p1 = t & 0xff;
_pinh(p1);
_pinh(p2);
_pinl(p2);
_pinl(p1);
return t;
}
When compiled it thinks its passing character values to the pinh and pinl but in reality it's passing the hole value.
Your code snippet does look like a bug--I will investigate today. Seems like it's throwing out the &. 0xff part of the operation for some reason.
Regarding when things get compiled--that is totally up to the user and how they set up their environment. I sometimes make the first step of my make all command a clean step just to be sure I'm compiling the latest code when there could interactions that cause bugs that the make system doesn't know about. Lately I've been using cmake and it is much better about making sure the right dependencies are known.
Because _pinh and _pinl have a char type for their argument, the ABI should dictate that the function itself performs the cast/truncation. As a result, the compiler knows that the callee will perform the truncation to a char, so it removes that operation from the caller to save code execution time and space.
Take a look at the _pinh or _pinl disassembly and see if it's doing that. If it's not, then it is a bug and I will investigate. I will try to recreate as well to double check.
With that inline asm, you'd need to mark pin as being written to with "+r"(pin), then it should mark it as used, and push and pop it correctly, I think.
So I re-created your example, and I get the following. You can see it does a signx to truncate the value properly. So not sure why yours is compiling differently. And this is what I think you'd expect to happen as well, rather than relying on the callee to do the truncation.
#include <propeller.h>
#include <stdio.h>
int test(int);
int i, j;
void _pinh(char pin);
void _pinl(char pin);
int main(int argc, char** argv)
{
i = 0x0a2829;
test(i);
while (1) {
}
}
int test(int x)
{
int t;
char p1, p2, w;
t = x;
w = t >> 16;
p2 = (t >> 8) & 0xff;
p1 = t & 0xff;
_pinh(p1);
_pinh(p2);
_pinl(p2);
_pinl(p1);
return t;
}
void __attribute__((noinline)) _pinh(char pin) {
asm ("dirh %0\n"::"r"(pin));
}
void __attribute__((noinline)) _pinl(char pin) {
asm ("dirl %0\n"::"r"(pin));
}
(unrelated, I finally added a symbolizer, so function names are now resolved in disassembly. It only works in on complete programs though, not objects that contain relocations)
example 1. For the assembly being removed, change the inline asm constraint to be an input: asm("and %0, #255\ndrvh %0\n"::"+r"(pin)); By putting the constraint as an input, it gets marked as used, and the full function will push/pop the used registers and not throw away the code it thinks is unused:
@n_ermosh said:
Actually I don't think it's a bug, it's a feature
Because _pinh and _pinl have a char type for their argument, the ABI should dictate that the function itself performs the cast/truncation. As a result, the compiler knows that the callee will perform the truncation to a char, so it removes that operation from the caller to save code execution time and space.
Take a look at the _pinh or _pinl disassembly and see if it's doing that. If it's not, then it is a bug and I will investigate. I will try to recreate as well to double check.
But then you'd have to mask with $3F before passing the value into DRVL/DRVH, for those take 11 bit pin-fields.
@n_ermosh said:
Actually I don't think it's a bug, it's a feature
Because _pinh and _pinl have a char type for their argument, the ABI should dictate that the function itself performs the cast/truncation. As a result, the compiler knows that the callee will perform the truncation to a char, so it removes that operation from the caller to save code execution time and space.
Take a look at the _pinh or _pinl disassembly and see if it's doing that. If it's not, then it is a bug and I will investigate. I will try to recreate as well to double check.
But then you'd have to mask with $3F before passing the value into DRVL/DRVH, for those take 11 bit pin-fields.
I was actually wrong in that statement. The caller should be the one to prepare the argument properly. There's definitely something wrong--very early in code generation, it breaks the association between the truncated value as a char and the value passed to the inline asm statement, so it throws away the truncation code.
Regarding the bug with casting to chars. I think this is a bug in LLVM's code. I created a super simple example in LLVM IR with just two instructions that recreates this, and sent an email to the llvm-dev team. I'll keep digging and see if I'm doing something wrong, but so far can't find anything.
It moved the test after the pin was changed which won't work.
Don't know how they used this product with ARM unless they don't use inline which come to think about it when I was working with arm code everything was a call unless it was sensitive and then it turned into an assembly driver that was called.
@Rayman said:
@n_ermosh Sounds like you are close to getting this C++ compiler working on P2. Keep it up, good work! Will be great when ready.
@Rayman I'd say it's in a pretty useable state, with the exception of code with a lot of bit-banging like Mike is working on. Fortunately, he's been a great tester to uncover all these issues . However, I've been able to work on a few small projects here and there with it and it's been working well, including getting lwIP ported and working to get ethernet connectivity to P2.
@iseries It looks like it re-orders the instructions because it has no knowledge that something external could modify the INx/OUTx registers, and I don't know if I marked those as implicit operands to those instructions. Can you try adding volatile to the inline assembly statements, maybe that will be enough to tell it not to reorder things? But I don't think so; I'll work on it from the compiler side.
I haven't run into that issue much yet because like you mentioned, I do what you'd do on ARM, make it a single asm block rather than multiple statements. But, I don't love that approach--what you did above should work.
Adding volatile to the ASM statements seems to do the trick. Makes sense in a way--tells the compiler this statement has side effects and not make assumptions about what effects this statement has on the state of the processor
Comments
Instructions added.
Awesome! It looks like the main conflict is the propeller2.h changes. I see you turned a lot of the pin access macros into functions. I don't think I'd want to pull this in, since what would be a single instruction inline now becomes a multi-instruction function call. Adding the leading underscore is fine, I don't think I have enough code that I would not want to change it, and it is more consistent with the other functions in that library.
I'd also probably make simpletools a separate library, maybe even a separate repo, as it's relatively unrelated to the actual function of the compiler/core libraries to execute code.
The last changes to the propeller2.h and propeller2.c fixed the issue with the inline functions. I thorough tested all the functions to make sure they were inline instructions.
I also documented most of the functions so doxygen can be used.
Yes, I did not want to put that in libp2 but need it to compile with just a reference and have it link in.
Just need to put it all together.
Mike
Congrats to both of you. @n_ermosh for getting LLVM so far, and @iseries for porting of Simple. Huge steps here!
Not so fast, I just came up for some air and found I missed a few functions.
You know in the old simpleIDE GCC it could not remove functions contained in a program so most of the individual functions were separate programs and now putting them back together leads to some missing functions if you don't watch what you're doing....
Also had to convert the eeprom library functions over to flash since we don't have that.
Need another assembly function:
cogstop
Got to have that one if were going use them.
So a separate folder called tools with the final library name as Libtools.a should work?
In the process of testing i2c with a 24lc256 chip to see if the functions came over correctly and the speed doesn't get in the way.
Mike
Can you elaborate what you mean by this? You can still have each function be a separate .c file, if that's what you mean.
cogstop is added, not sure how I missed it.
Yes, I think a separate top level folder called libtools generating a libtools.a should work. I don't think it should be called just "tools" in the top level, since could be misinterpreted as a folder containing tools for the repo, not the tools library.
Out of curiosity, what was the motivation for changing from macros to inline functions for pin control? Overall, it does the same thing, but adds complexity to the source.
I'd strongly encourage you to create a new repository for the Simple. If you want help creating a build system for it, I'd be happy to lend a hand, but this probably shouldn't be in the compiler itself (especially not if we ever want this merged upstream).
I agree with making a separate repo, but it won't affect upstream merge if we keep it in the same repo--the llvm-project folder is a submodule that I explicitly keep separate so that we can merge in the future.
On the P1 there is limited amount of memory and the GCC compiler would keep functions that were contained in a program as a library. So when the program got compiled and linked in the code size was larger than it needed to be because it did not drop the functions not referenced in the main program.
To get around this they extracted all the functions into small programs and then linked them all together so that the GCC compiler would not include them if not referenced.
I hate having little programs with one function in them, so I tried to put them back together again just cus.
For example the simpletools library is 63 little programs.
Mike
got it. Yeah clang does basically the same thing. -ffunction-sections puts every function in it's own section, and then if the section is not referenced, the garbage collector in the linker will throw out that section. If you want to keep everything, you can disable that by removing that flag from compilation.
I agree that having one small function per file can be a little bit much, but I do think it makes sense to keep code that is functionally unrelated in separate C files. otherwise finding where things occur is a mess. And once the project gets large enough, you can enable must faster compiling because each c file can be compiled in parallel.
I think I have uncovered an inline bug.
This program:
When compiled it thinks its passing character values to the pinh and pinl but in reality it's passing the hole value.
Mike
Well, after a day of head banging, I got the 24LC256 memory chip to work with the i2c driver on the P2.
It turned out that the driver was just fine, the compiler not so much.
One should not change a whole bunch of code here and there and then try to get something new to work only to find out that nothing was working.
Linking is great because you don't get to find a new bug based on previous compiled code. With Arduino everything gets compiled every time.
Mike
Your code snippet does look like a bug--I will investigate today. Seems like it's throwing out the
&. 0xff
part of the operation for some reason.Regarding when things get compiled--that is totally up to the user and how they set up their environment. I sometimes make the first step of my
make all
command aclean
step just to be sure I'm compiling the latest code when there could interactions that cause bugs that the make system doesn't know about. Lately I've been using cmake and it is much better about making sure the right dependencies are known.Actually I don't think it's a bug, it's a feature
Because
_pinh
and_pinl
have achar
type for their argument, the ABI should dictate that the function itself performs the cast/truncation. As a result, the compiler knows that the callee will perform the truncation to a char, so it removes that operation from the caller to save code execution time and space.Take a look at the
_pinh
or_pinl
disassembly and see if it's doing that. If it's not, then it is a bug and I will investigate. I will try to recreate as well to double check.I think it's a bug because I tried doing it at the other end.
This cause r0 to be truncated correctly but then the follow on instructs don't restore the register and try to work with truncated values.
I think the compiler can't see inside the assembly code and just passes the value on.
In that case the inline will have to push the register, modify the value, use it and pop the registers back off.....
Mike
With that inline asm, you'd need to mark pin as being written to with
"+r"(pin)
, then it should mark it as used, and push and pop it correctly, I think.So I re-created your example, and I get the following. You can see it does a
signx
to truncate the value properly. So not sure why yours is compiling differently. And this is what I think you'd expect to happen as well, rather than relying on the callee to do the truncation.Full compile command (generated by CMake)
/opt/p2llvm/bin/clang++ -D__P2_GIT_VERSION__=\"1c4ab22-modified\" -fno-exceptions -fno-jump-tables --target=p2 -Dprintf=__simple_printf -ffunction-sections -fdata-sections -Oz -std=gnu++17 -MD -MT src/CMakeFiles/t101.dir/t101.cpp.obj -MF CMakeFiles/t101.dir/t101.cpp.obj.d -o CMakeFiles/t101.dir/t101.cpp.obj -c /Users/nikita/Github/p2llvm/test/src/t101.cpp
(unrelated, I finally added a symbolizer, so function names are now resolved in disassembly. It only works in on complete programs though, not objects that contain relocations)
Right, your code is not doing inline so everything works fine. That is what I did to put things back to normal.
Here is a test program that does not work. It completely removes the assembly code for both examples.
This next program works exempt test2 is using the wrong values even though I told the compiler these variables are character.
The underlining problem here is if someone uses this function there could be side effects that are not going to work.
Mike
example 1. For the assembly being removed, change the inline asm constraint to be an input:
asm("and %0, #255\ndrvh %0\n"::"+r"(pin));
By putting the constraint as an input, it gets marked as used, and the full function will push/pop the used registers and not throw away the code it thinks is unused:example 2. I will take a look. You are right, something funky is happening here--I'll see what it takes to fix it.
Compiler optimization is a bit of a black box... hard to tell why it's doing something a lot of the time.
But then you'd have to mask with $3F before passing the value into DRVL/DRVH, for those take 11 bit pin-fields.
I was actually wrong in that statement. The caller should be the one to prepare the argument properly. There's definitely something wrong--very early in code generation, it breaks the association between the truncated value as a char and the value passed to the inline asm statement, so it throws away the truncation code.
@Wuerfel_21 ,
You mean 6 bit fields.
0x3f = 0b0011 1111
Right, we assume the lower 8 bits are a valid pin value.
I tried your example but it doesn't see the and in the assembly code and whacks the r0 register right off the bat.
This however works though:
Can't use 255 though as the compiler sees it as a character and drops it from the program altogether.
Mike
No, I mean 11 bit fields. If you do
drvh #$40
, it will drive high pin 0 and pin 1 - read the documentation.@Wuerfel_21 ,
The spreadsheet says:
5:0 = 6 bits right?
Mike
Your spreadsheet is outdated.
@Wuerfel_21 ,
No wander the program is doing funky things with a whole bunch of different pins at the same time.
Well, my book mark is for Rev A. of the spreadsheet which is what I have.
Mike
Regarding the bug with casting to chars. I think this is a bug in LLVM's code. I created a super simple example in LLVM IR with just two instructions that recreates this, and sent an email to the llvm-dev team. I'll keep digging and see if I'm doing something wrong, but so far can't find anything.
@n_ermosh Sounds like you are close to getting this C++ compiler working on P2. Keep it up, good work! Will be great when ready.
I put the and patch into the inline assembly issue so I could retest the i2c code to see if it would work with the inline assembly code.
It does not. The code generated is fine but it reorders the instructions.
Compiles like this:
It moved the test after the pin was changed which won't work.
Don't know how they used this product with ARM unless they don't use inline which come to think about it when I was working with arm code everything was a call unless it was sensitive and then it turned into an assembly driver that was called.
That might be what is needed here.
Mike
@Rayman I'd say it's in a pretty useable state, with the exception of code with a lot of bit-banging like Mike is working on. Fortunately, he's been a great tester to uncover all these issues . However, I've been able to work on a few small projects here and there with it and it's been working well, including getting lwIP ported and working to get ethernet connectivity to P2.
@iseries It looks like it re-orders the instructions because it has no knowledge that something external could modify the INx/OUTx registers, and I don't know if I marked those as implicit operands to those instructions. Can you try adding
volatile
to the inline assembly statements, maybe that will be enough to tell it not to reorder things? But I don't think so; I'll work on it from the compiler side.I haven't run into that issue much yet because like you mentioned, I do what you'd do on ARM, make it a single asm block rather than multiple statements. But, I don't love that approach--what you did above should work.
Adding volatile to the ASM statements seems to do the trick. Makes sense in a way--tells the compiler this statement has side effects and not make assumptions about what effects this statement has on the state of the processor
example:
I tried it at the function level and not at the asm level and it didn't work.
I will try it at the asm level as you did and see it works that way.
Mike