Where it fails is in driver code where the order of operations might matter.
I'm curious about this.
I guess we are all familiar with the issue of inputs not getting read, or variables that are written by other processes not getting read (same thing really). Basically, because the compiler thinks it "owns" all the data, it knows it has previously read the thing and it has not changed it, so why read it again? It will be the same, right? Oops... you have just optimized away the fetching of input data.
This is normally easily fixed by telling the compiler it does not control something: adding "volatile" to variable declarations.
But, what you mention there is about order of execution. So I might write:
...
a = 1;
...
b = 1;
...
And for whatever reason find the assignments are done in reverse order. Perhaps even if a and b are volatile. Perhaps a and b are outputs that need to be set in that order. Oops...
Have you ever seen such a thing happen? Do you have an example? Does the C standard even allow that?
I have sometimes glossed over the descriptions of "sequence points" in C, which I presume, perhaps wrongly, enable you to control this in standard syntax.
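To make the two failure modes concrete, a minimal C sketch - this is my reading of the standard, so treat it as a hedged sketch rather than gospel, and note the register address is invented purely for illustration. The compiler only has to preserve observable behaviour, so plain stores can be cached or reordered; accesses to volatile objects are themselves observable, so they must actually happen, and in program order relative to each other. What the processor then does at run time is a separate question:

    #include <stdint.h>

    /* Hypothetical memory-mapped input register - address invented. */
    #define INPUT_REG (*(volatile uint32_t *)0x80000000)

    int a, b;             /* plain objects: stores may be reordered    */
    volatile int va, vb;  /* volatile objects: stores keep this order  */

    void wait_for_ready(void)
    {
        /* Without the volatile qualifier the compiler could read the
           register once and spin on the cached value forever; the
           volatile access forces a fresh load every iteration. */
        while ((INPUT_REG & 0x1) == 0)
            ;
    }

    void set_outputs(void)
    {
        a = 1;   /* nothing observable depends on the order, so at    */
        b = 1;   /* -O2 the compiler may legally write b before a     */

        va = 1;  /* volatile accesses: the compiler must emit the     */
        vb = 1;  /* store to va before the store to vb                */
    }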
So, this ordering business is not C's fault. Processors can reorder operations on the fly at execution time. If you want your C code to be portable you have to assume this will happen somewhere and take the precautions suggested in your link.
Except the article does say "Complicating these issues is the fact that both the compiler and the processor can reorder reads and writes". So even if you are targeting a known processor that is known not to reorder, you could still not be sure the compiler does not reorder.
I'd really like to see a piece of C for the Propeller where this happens.
Aha! What is happening is that your platform by default is including a serial HMI driver, which is using the same pins the debugger wants to use for its serial comms. You have found the correct answer, which is to either specify NO_HMI, or choose another type of HMI (e.g. VGA or TV). If you need to use a serial HMI, you can still use the debugger, but you need to tell the program to use a different set of pins for the debugger serial comms, and you have to have two USB connections to your platform (I have to do this on the HYDRA or HYBRID, neither of which can use the standard serial port when XMM memory is in use - instead, I use the mouse port or a keyboard port, with a special cable described on page 23 of the Catalina Reference Manual).
And there is one of my problems: I can't easily see what is activated for my platform. For me it is a trial-and-error game, and I feel rather stupid.
Now, after you wrote that a serial HMI driver is included, I searched for where I could see this. Ah, there is Custom_HMI.inc, and there at the end, in the #else, there it is. Now I understand. But what, then, is the TTY checkbox in the build options for? I am confused.
Are these config/definition files really easier than a few includes/pragmas in the code? Or just the _DEF.inc plus the build options?
There is a discussion on code size in the Catalina Reference Manual starting on page 118 ("A Note about Catalina Code Sizes"). However, it is mainly concerned with reducing code size, and it really only covers LMM and CMM modes - it does not cover all the various XMM modes.
A table of all possible combinations of even a simple example program compiled with all possible memory models, loader options, XMM configurations, HMI options, and optimizer levels would be pages long (on the C3, for example, there are 4 different ways just to use the same XMM memory) - but it is a good idea, so I will see if I can come up with a sensible subset to include in the next release.
You are right, there are maybe too many options to give total numbers. But perhaps start a table with the build options and their effect on included code/plugins, code size, and any needed changes/precautions in the config files.
None of this is meant as criticism. English is not my mother tongue, so maybe it is all because I do not understand the manuals well enough.
Actually if you look at raw (i.e. un-optimized) LCC output compared to un-optimized GCC output, LCC output is much cleaner, simpler and more efficient. I was amazed to find this - the LCC compiler is actually quite good at basic code generation - which helps explain why it is still used as the basis for many compilers - both commercial and free ones - even after all this time.
But although GCC is pretty ordinary at basic code generation, it makes up for it by having an astonishingly good optimizer, whereas I have had to write my own one for LCC. I would be the first to admit mine is nowhere near as good as the GCC one - but it is getting better all the time!
Ross.
Is there any file where I can see a mixed view (C code and assembler)?
I think there are hardware read and write barriers on some processors.
I'm not au fait with how this all works. I just imagine that modern CPUs have long pipelines and multiple execution units, and that somehow the processor stuffs instructions into those units in parallel where possible. It has to ensure that data dependencies are not broken. That still allows reads and writes to happen out of order if there are no apparent data dependencies.
Clearly, to fix that, one would have to be able to tell the processor's internal instruction scheduler that there are dependencies it cannot work out for itself. Hence barrier instructions: mfence, lfence, and sfence on x86, I believe.
So far so good.
But there is still that nagging statement that the compiler can reorder things for its own weird optimization purposes.
I'm just curious if there is an example of that we can create somehow. Even on x86 would do, we can always look at the assembler output to see it happening.
Well google found it for me. The most simple code can induce GCC to reorder stuff:
http://preshing.com/20120625/memory-ordering-at-compile-time/
Which shows the following example:

    int A, B;

    void foo()
    {
        A = B + 1;
        B = 0;
    }

Compile that with -O0 and -O2 for Intel and you can clearly see it in the assembler output. However propgcc is a bit harder to grok. This is the -O0 output: ...
And this is the -O2 output:
        .text
        .balign 4
        .global _foo
_foo
        mvi     r6,#_B
        mov     r5, #0
        rdlong  r7, r6
        add     r7, #1
        wrlong  r5, r6
        mvi     r6,#_A
        wrlong  r7, r6
        lret
        .comm   _B,4,4
        .comm   _A,4,4
Err, is that backwards or not?
Ross, what does Catalina do here?
Now, if we make A and B volatile we get: ... Which looks like we are back in order again!
I guess it's trying to reuse the _B address in r6. It would be interesting to see what was done on ARM or MIPS. The x86 isn't really a good comparison since it doesn't really have general registers.
That makes sense. In the absence of volatile, the compiler assumes there are no dependencies between A and B so it doesn't matter if they are written in the opposite order. The volatile keyword basically tells it not to make such assumptions.
You can, of course, turn off all optimization and then you get exactly what you wrote. However, as many have mentioned, the code is really *too* redundant to be of much use. There might not be an optimization option that gets rid of the redundant code but doesn't move things around. If that's the case then I guess GCC is not for you. Maybe Ross will say whether Catalina does any code reordering. If not, then it would be a good option. Luckily we have both choices on the Propeller.
Errr. Nothing - here is Catalina's output ...
Ross.
Well, Catalina's optimizer does some re-ordering, if you consider "inlining" to be re-ordering.
And LCC does a small amount of re-ordering when it is constructing loops and such things - these can confuse people using the debugger for the first time - but I think these would all be considered "normal" and "expected" for any compiler/optimizer.
Is there any file where I can see a mixed view (C code and assembler)?
No - you can see the PASM code output but the C source code is not included.
However, if you add -g to your compile command and look in the resulting .dbg file, you will see the address of each source line, which you can cross-reference to the listing (.lst file).
Is it easy to improve that, so source lines are inserted as comments?
That helps a lot in both learning new systems, and in tracking possible bugs.
It's certainly possible, since LCC already identifies the source lines when debugging is enabled - but it doesn't make the source available to the back-end code generator (where the PASM is produced), so I don't think it is simple - but I'll have a look when I get time.
Is there any file where I can see a mixed view (C code and assembler)?
It's a bit of a pain but GCC can do it. First compile with "-g" so as to include debugging symbols in the binary.
Then use the objdump command to get a listing. Like so: ...
Which produces something like this: ...
Problem is, that is not the code you want to run; it's not optimized and far too big.
Setting optimization on makes a mess of the listing output.
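For reference, the typical binutils form is something like the following - myprog.elf is just a stand-in name for your binary:

    objdump -S myprog.elf

The -S option disassembles the code and intersperses the matching C source lines, provided the binary was compiled with -g.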
Catalina's optimizer does some re-ordering, if you consider "inlining" to be re-ordering.
There is reordering of instruction layout in memory, as in inlining. And there is reordering of the actual run-time execution sequence. We have been discussing the latter. In the example, the source says "write A then write B"; the compiled code does the opposite.
This is serious, because if variables are written out in reverse order at run time your program is not behaving as you might naively expect. I was surprised how easily GCC can be made to do such temporal reordering.
All is made well again by the use of "volatile". I hope...!
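As an aside, the preshing.com article linked above also shows the other escape hatch: an explicit compiler barrier, via a GCC inline-asm extension. A minimal sketch (GCC-specific; mfence, mentioned in the comment, is x86-only):

    int A, B;

    void foo(void)
    {
        A = B + 1;
        /* Compiler barrier: GCC must not move loads or stores across
           this point, though it emits no instruction. To also stop the
           processor reordering things at run time you would need a real
           fence as well, e.g. asm volatile("mfence" ::: "memory") on x86. */
        asm volatile("" ::: "memory");
        B = 0;
    }

With the barrier in place, even -O2 has to emit the store to A before the store to B.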
I don't know a lot about GCC these days, but GCC used to be quite well known for overly-optimistic optimization. Presumably it is better these days, but it still seems to be very aggressive.
Catalina's optimization is far less aggressive, but so far I have not seen an instance of it doing the wrong thing.
It's a bit of a pain but GCC can do it. First compile with "-g" so as to include debugging symbols in the binary.
Then use the objdump command to get a listing.
Yes, I considered doing it this way for Catalina - essentially providing a "list" utility that used the debugging information to identify the source lines, and then interspersed them in the listing. But it would be better if they just naturally appeared in the normal listing output.
...used to be quite well known for overly-optimistic optimization.
Gosh, when was that? I have been using GCC since 1998 and I don't recall any time our projects' code failed due to optimizer bugs. I have managed to crash the compiler a few times, though. Not recently. That was mostly using very old GCC versions supplied with Wind River's VxWorks RTOS. One of the many reasons we switched to Linux for all our embedded systems.
Then you never used the Microchip MPLAB C Compiler, which is a port of gcc!
You suddenly make me feel old! I began using GCC in the early eighties, when we eventually had to give up the GCC compiler altogether because of all its problems. But I used it again later (with much more success) indirectly via GNAT.
Like any complex software, of course the GNU optimizer has bugs in it! A quick google search for "gcc optimizer bugs" still shows many instances even in recent versions. Yes, some of them just crash the compiler, but others cause the generation of incorrect code.
But in the case of an optimizer (which is, after all, usually optional - before the Propeller, I had never worked on a program where memory was so tight you absolutely had to use an optimizer), I suspect many people just do what we always did and simply turn off - or at least reduce - the level of optimization till it works again. You might try turning it back up later (e.g. on the next release of the compiler) or you might not.
Or the AVR toolchain!
I did try GCC for the AVR briefly. Gave up with them, not the compiler's fault, I discovered the propeller.
You suddenly make me feel old! I began using GCC in the early eighties,
Yes, your memory is fading:)
The good news is you are not as old as you think. GCC was not released until 1987. The first "stable" release was in 1991. GCC was not adopted by BSD until 1994.
I was amazed to find it in serious use inside Nokia in 1998. But then, at that time I had only just recently become aware of Linux and the world of Free Software.
Sounds like you were among the brave pioneering early adopters though.
Are you sure? (about the date I mean!) I guess it may have been late eighties. I certainly remember GCC 1.x (not sure what "x" was!) But I do remember we had to use multiple C++ compilers to compile the project, because C++ was so new that no single compiler could yet compile the whole language successfully, so we had to use different compilers for different components. What a mess that was! In the end, the Sun C++ compiler won out, because it was the first one that managed to compile the whole project.
Using C++ in the 1980's was just foolhardy:)
A new and impossible-to-implement language AND a new compiler. Definitely treading dodgy ground.
Mind you, I still think that about C++ today.
Well, at least we agree on that!
Did I say "dodgy ground"? It's like hiking out on the pack ice, full of the excitement that at any moment a crack will open up and swallow you whole and all your huskies. Or perhaps parachuting without a reserve. Or is it that experience of building a beautiful arch and then realizing that if you want to change any small part it's all going to collapse? Or just painting yourself into a very small corner of a very large room, when you finally realize your class hierarchy does not model the actual problem you have.
Apart from all that C++ is fine. It redeems itself by the fact that one can write things like JavaScript engines in it.