CogC: Why so many useless mov commands?
dnalor
In the assembly listing of a compiled cogc file I see many useless mov commands.
If I have two variables:

    _COGMEM volatile _UDWORD udwV1;
    _COGMEM volatile _UDWORD udwAnswer1;

The line:

    udwAnswer1 = udwV1;

is "translated" into this assembler code:

    mov r7, _udwV1
    mov _udwAnswer1, r7

Why?
The optimisation level seems to be irrelevant.
Is there a way to change this behavior, or is it possible to declare variables as registers?

Comments
-Phil
The reason the output is awkward is because of the "volatile" keyword on the variables. GCC has to be very conservative about removing any read or write accesses to volatile memory, and so many optimizations on volatile variables are suppressed.
"volatile" says that the contents of the variable can change unexpectedly, either because of hardware (the variable is connected to some hardware) or because other threads can read/write the variable. Unless you're doing some fancy multiple threading within the cog, it's usually not necessary to specify "volatile" for any _COGMEM variables (other than the hardware registers, which are already defined in <propeller.h>).
Thanks,
Actually GCC will allow memory to memory moves on non-volatile values, and so taking the "volatile" keyword off you'll get the expected:
Mind you, there are still some optimization opportunities that gcc is missing. There's always room for improvement!
Eric
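As a rough illustration of the volatile discussion above, here is a minimal sketch that puts the two declaration styles side by side. The first pair of variables is taken from the original question. Everything else is an assumption made for illustration: udwV2, udwAnswer2 and the two function names are hypothetical, _UDWORD is assumed to be the poster's alias for an unsigned 32-bit type, and the empty _COGMEM fallback exists only so the snippet also builds with a host compiler (on PropGCC the real definition is assumed to come from the toolchain headers such as <propeller.h>, included first).

```c
#include <stdint.h>

/* Assumption: on PropGCC the real _COGMEM comes from the toolchain headers.
   This empty fallback only lets the sketch build on a host compiler. */
#ifndef _COGMEM
#define _COGMEM
#endif

/* Assumption: _UDWORD is the poster's own alias for an unsigned 32-bit type. */
typedef uint32_t _UDWORD;

/* Volatile pair from the original question: the compiler must perform
   every read and write exactly as written. */
_COGMEM volatile _UDWORD udwV1;
_COGMEM volatile _UDWORD udwAnswer1;

/* Hypothetical non-volatile pair for comparison: the compiler is free
   to combine or simplify the accesses. */
_COGMEM _UDWORD udwV2;
_COGMEM _UDWORD udwAnswer2;

/* Copy between the volatile variables (the case from the question). */
void copy_volatile(void)
{
    udwAnswer1 = udwV1;
}

/* Same copy between the non-volatile variables. */
void copy_plain(void)
{
    udwAnswer2 = udwV2;
}
```

Compiling both functions to an assembly listing and comparing them should, going by the explanation above, show the detour through a scratch register only in copy_volatile, while copy_plain would presumably reduce to a single memory-to-memory mov. The exact output depends on the compiler version and options.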
Remember that _COGMEM is still memory as far as GCC is concerned, and I guess on some architectures the semantics of memory to memory moves are not compatible with "volatile" (maybe depending on the bus width the order of byte accesses is different?). Or maybe it's just an artifact of how GCC parses variables internally -- initially everything is moved into registers, and later the register moves are optimized away, but those optimizations are probably suppressed by the volatile keyword.
I have put some propeller specific peephole optimizations to improve this kind of thing in the "performance" branch of propgcc, but those aren't stable enough for release yet.
Eric
OK, that's true. I removed the volatile keyword and it looks better in this case.
And that also explains why gives not
So the best solution seems to be to use local variables whenever possible. But then you run into stack usage very quickly, because 15 registers are not always enough.
Is it possible to have more registers? Maybe 32?
OK, you would maybe lose some code space, but if you can save 4 stack accesses, then you win.
And unused register space could be reused with a variable declared as register.

    register _UDWORD udwReg10 asm("r10");

Seems to work.
Here in this subroutine udwI is not volatile, and at the end of the loop you can see the useless moves. If you use a local variable i instead, this is not the case.

    219:alfatdriver.cogc **** for(udwI = 0; udwI < length; udwI++)
    148                .loc 1 219 0
    149 00f0 00EEFCA0  mov _udwI, #0
    150 00f4 00007C86  cmp r14, #0 wz
    151 00f8 4900685C  IF_E jmp #.L9
    152                .LVL16
    153                .L13
    220:alfatdriver.cogc **** {
    221:alfatdriver.cogc **** b = udwPutByte(' ');
    154                .loc 1 221 0
    155 00fc 2000FCA0  mov r0, #32
    156 0100 0022FC5C  call #_udwPutByte
    157                .LVL17
    222:alfatdriver.cogc **** if(ubpBuffer)
    158                .loc 1 222 0
    159 0104 7600BCA2  mov r7,_ubpBuffer wz
    223:alfatdriver.cogc **** ubpBuffer[udwI] = b;
    160                .loc 1 223 0
    161 0108 77009480  IF_NE add r7, _udwI
    162 010c 00001400  IF_NE wrbyte r0, r7
    219:alfatdriver.cogc **** for(udwI = 0; udwI < length; udwI++)
    163                .loc 1 219 0
    164 0110 7700BCA0  mov r7, _udwI
    165 0114 0100FC80  add r7, #1
    166 0118 00EEBCA0  mov _udwI, r7
    167 011c 00003C85  cmp r7, r14 wc
    168 0120 3F00705C  IF_B jmp #.L13
    169                .LVL18
    170                .L9

But then you often get stack usage:
But if you had not only r0...r14 but maybe r0...r30, then this might not be the case. But I do not know how many registers gcc can handle.
Ah, I see the problem. Thanks for the idea -- it's a good one. Unfortunately, whatever number of registers we pick will probably be wrong for some people -- either too many or not enough.
I think the better answer is to keep improving the Propeller specific code generator so that _COGMEM variables are as good as register variables. We're still working on that. For example, the CMM preview does produce better code than the original beta; things like: should now produce at least some of the time.
In the meantime using a mixture of local variables (for anything that does not need to be preserved across function calls) and COGMEM variables is probably the best way to work around this issue.
Thank you for thinking about this, though! It's good that people are questioning the code generation and offering these suggestions.
Regards,
Eric
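The workaround described above (local variables mixed with COGMEM variables) could look roughly like the following sketch, which reworks the loop from the listing earlier in the thread so that the counter is a local instead of the global udwI. This is an illustration, not the poster's actual code: the function name udwReadBlock is hypothetical, udwPutByte is replaced by a hypothetical stand-in that simply echoes its argument, _UDWORD is assumed to be an unsigned 32-bit typedef, and the empty _COGMEM fallback is only there so the snippet builds with a host compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: on PropGCC the real _COGMEM comes from the toolchain headers.
   This empty fallback only lets the sketch build on a host compiler. */
#ifndef _COGMEM
#define _COGMEM
#endif

/* Assumption: _UDWORD is the poster's own alias for an unsigned 32-bit type. */
typedef uint32_t _UDWORD;

/* State that really has to outlive the function stays in COGMEM. */
_COGMEM _UDWORD udwI;        /* final count, written once at the end */
_COGMEM uint8_t *ubpBuffer;  /* destination buffer, may be NULL */

/* Hypothetical stand-in for the driver routine from the listing:
   it just echoes the byte it was given. */
static _UDWORD udwPutByte(_UDWORD udwByte)
{
    return udwByte;
}

/* Hypothetical rework of the loop using a local counter. */
_UDWORD udwReadBlock(_UDWORD length)
{
    _UDWORD i; /* local counter: a candidate for a register */

    for (i = 0; i < length; i++) {
        _UDWORD b = udwPutByte(' ');
        if (ubpBuffer)
            ubpBuffer[i] = (uint8_t)b;
    }

    udwI = i; /* single write-back instead of a load/add/store per pass */
    return i;
}
```

Whether the compiler actually keeps i in a register depends on register pressure in the surrounding code, so the generated listing is still worth checking. The write-back to udwI is only needed if other code reads the final count.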
That would of course be best, if possible. Is making the number of registers settable via a pragma or the command line too difficult? Maybe impossible?
That's the way I work with every C compiler at the beginning, to get a feeling for which code gives the desired result and which should not be used. You're on a very good path and close to the goal.