CogC: Why so many useless mov commands?
dnalor
In the assembly listing of a compiled cogc file I see many useless mov commands.
If I have two variables:
_COGMEM volatile _UDWORD udwV1;
_COGMEM volatile _UDWORD udwAnswer1;
The line:
udwAnswer1 = udwV1;
is "translated" into this assembler code:
mov r7, _udwV1
mov _udwAnswer1, r7
Why?
The optimization level seems to be irrelevant.
Is there a way to change this behavior, or is it possible to declare variables as registers?
Comments
-Phil
The reason the output is awkward is because of the "volatile" keyword on the variables. GCC has to be very conservative about removing any read or write accesses to volatile memory, and so many optimizations on volatile variables are suppressed.
"volatile" says that the contents of the variable can change unexpectedly, either because of hardware (the variable is connected to some hardware) or because other threads can read/write the variable. Unless you're doing some fancy multiple threading within the cog, it's usually not necessary to specify "volatile" for any _COGMEM variables (other than the hardware registers, which are already defined in <propeller.h>).
Thanks,
Actually GCC will allow memory to memory moves on non-volatile values, and so taking the "volatile" keyword off you'll get the expected:
Mind you, there are still some optimization opportunities that gcc is missing. There's always room for improvement!
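The volatile versus non-volatile difference discussed above can be reproduced with a small test file. This is only a minimal sketch: the names vol_src, vol_dst, plain_src, plain_dst and the two copy functions are made up for illustration, uint32_t stands in for _UDWORD, and the empty _COGMEM fallback is an assumption so that the file also builds with a host compiler (on PropGCC the real attribute would come from the Propeller headers).

```c
#include <stdint.h>

/* Assumption: if no Propeller header has defined _COGMEM, fall back to an
   empty definition so this sketch also compiles with a host compiler. */
#ifndef _COGMEM
#define _COGMEM
#endif

/* Volatile pair: the compiler has to keep the read and the write
   exactly as they are written in the source. */
_COGMEM volatile uint32_t vol_src = 5;
_COGMEM volatile uint32_t vol_dst;

/* Non-volatile pair: the compiler is free to combine or simplify accesses. */
_COGMEM uint32_t plain_src = 5;
_COGMEM uint32_t plain_dst;

/* Copy between the two volatile variables. */
void copy_volatile(void)
{
    vol_dst = vol_src;
}

/* Copy between the two non-volatile variables. */
void copy_plain(void)
{
    plain_dst = plain_src;
}
```

Comparing the assembly listing of copy_volatile and copy_plain should show whether the intermediate register is still used in each case.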
Eric
Remember that _COGMEM is still memory as far as GCC is concerned, and I guess on some architectures the semantics of memory to memory moves are not compatible with "volatile" (maybe depending on the bus width the order of byte accesses is different?). Or maybe it's just an artifact of how GCC parses variables internally -- initially everything is moved into registers, and later the register moves are optimized away, but those optimizations are probably suppressed by the volatile keyword.
I have put some Propeller-specific peephole optimizations for this kind of thing in the "performance" branch of propgcc, but those aren't stable enough for release yet.
Eric
Ok. It's true. Removed the volatile keyword and it looks better in this case.
And that also explains why gives not
So the best solution seems to be to use local variables whenever possible. But then you quickly end up using the stack, because 15 registers are not always enough.
Is it possible to have more registers? Maybe 32?
OK, you might lose some code space, but if you can save 4 stack accesses, you win.
And unused register space could be reused by a variable declared as register. That seems to work.
In this subroutine udwI is not volatile, and at the end you still see this useless moving. If you use a local variable i instead, this is not the case.
But then you often get stack usage:
But if you had not only r0...r14 but maybe r0...r30, this might not be the case. I do not know how many registers gcc can handle, though.
Ah, I see the problem. Thanks for the idea -- it's a good one. Unfortunately, whatever number of registers we pick will probably be wrong for some people -- either too many or not enough.
I think the better answer is to keep improving the Propeller specific code generator so that _COGMEM variables are as good as register variables. We're still working on that. For example, the CMM preview does produce better code than the original beta; things like: should now produce at least some of the time.
In the meantime using a mixture of local variables (for anything that does not need to be preserved across function calls) and COGMEM variables is probably the best way to work around this issue.
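A minimal sketch of that mixture, assuming a hypothetical running total that has to survive between calls: the loop works only on locals, and the _COGMEM variable is written once at the end. The names total and sum_to are invented for this example, and the empty _COGMEM fallback is only there so the file also builds with a host compiler.

```c
#include <stdint.h>

/* Assumption: if no Propeller header has defined _COGMEM, fall back to an
   empty definition so this sketch also compiles with a host compiler. */
#ifndef _COGMEM
#define _COGMEM
#endif

/* Hypothetical state that must be preserved across function calls. */
_COGMEM uint32_t total;

/* Sum 1..n using only local variables inside the loop, then store the
   result into the _COGMEM variable with a single write. */
uint32_t sum_to(uint32_t n)
{
    uint32_t acc = 0;   /* local accumulator */
    uint32_t i;         /* local loop counter */

    for (i = 1; i <= n; i++) {
        acc += i;
    }

    total = acc;        /* one write to the persistent variable */
    return acc;
}
```

Whether the locals really stay in registers depends on the compiler version and on register pressure, so it is worth checking the assembly listing.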
Thank you for thinking about this, though! It's good that people are questioning the code generation and offering these suggestions.
Regards,
Eric
That would of course be best, if possible. Is making the register count settable via a pragma or the command line too difficult? Maybe impossible?
That's how I start out with every C compiler, to get a feeling for which code gives the desired result and which should be avoided. You're on a very good path and close to the goal.