spin2cpp code generation for S3: a conundrum
ersmith
Posts: 6,088
in Propeller 1
I've run into an interesting problem in the spin2cpp conversion of S3 code, and I'd like to ask a wider audience for their opinions. The problem is in the S3 motor driver functions. The code looks like:
The new (experimental) spin2cpp --gas converts this to:
The problem is that GCC knows only that Motor_cmd is 16 bit aligned, and so to do the 32 bit store it issues 2 wrword commands. This would normally not be a problem -- it's the right thing to do if the variable isn't aligned. But in fact it *is* 32 bit aligned, and the motor cog reads both the time and command with one rdlong. The result is that the motor control cog can get the command in the middle of the two 16 bit writes, and ends up reading the wrong timeout.
I've tried a number of "obvious" work-arounds (casts involving volatile, and different optimization options) but none of them seem to persuade GCC to ignore alignment and just perform a 32 bit write. Turning off optimization entirely does it, but that's not practical -- the resulting code is way too big.
Possible fixes for the original Spin code:
(1) Change the motor COG PASM code to read cmd and time separately. Slower, but with care we can get rid of the race condition.
(2) Declare Motor_cmd as a .long, and remove Motor_time. This is probably the best solution. since it guarantees the alignment, and Motor_time is not in fact referenced on its own anywhere! So in a way it's the right thing to do.
But the original code did work; are we likely to run into similar problems in other projects? Maybe we should change spin2cpp instead. Possible fixes for spin2cpp:
(1) Emit __attribute__((aligned(4))) directives for any word DAT variables that do happen to be long aligned. This is ugly, and liable to break if the user edits the C code. Of course, the original Spin code is also likely to break if anyone edits it!
(2) Do the cast using an external function; something like:
Note that the long_ptr function cannot be visible in this compilation unit, or else the compiler will inline it and we're back to the same alignment problems. This is probably the most robust solution, but doing this for all Spin long[] directives will add a bunch of redundant function calls.
What are your opinions about this? I'm torn between saying the original Spin is broken (if a variable must be long aligned, the DAT section should indicate this, not just in comments...), saying spin2cpp --gas is broken (it doesn't translate a functional Spin program correctly) and saying GCC is broken (it doesn't do a 32 bit write when we ask it to).
Incidentally, without --gas the code generated by spin2cpp works, because the generated DAT array (a binary blob) is known to the compiler and it knows that the array as a whole is 32 bit aligned, and we ask for an offset in the array that is 32 bit aligned. But the C code is ugly and hard for the user to maintain:
DAT Motor_cmd word 0 ' Must begin on a long boundary Motor_time word 0 ... PUB run_motors(command, left_distance, right_distance, timeout, max_speed, end_speed) ... long[@Motor_cmd] := timeout << 16 | (0 #> max_speed <# 15) << 8 | (0 #> end_speed <# 15) << 4 | command & (MOT_IMM | MOT_CONT)
The new (experimental) spin2cpp --gas converts this to:
extern uint16_t Motor_cmd[] __asm__("Motor_cmd"); _dat_( Motor_cmd: ); _dat_( .word 0 ); _dat_( Motor_time: ); _dat_( .word 0 ); ... ((int32_t *)(int32_t)(Motor_cmd))[0] = (((timeout << 16) | ((Min__((Max__(0, max_speed)), 15)) << 8)) | ((Min__((Max__(0, end_speed)), 15)) << 4)) | (command & (SCRIBBLER_MOT_IMM | SCRIBBLER_MOT_CONT));
The problem is that GCC knows only that Motor_cmd is 16 bit aligned, and so to do the 32 bit store it issues 2 wrword commands. This would normally not be a problem -- it's the right thing to do if the variable isn't aligned. But in fact it *is* 32 bit aligned, and the motor cog reads both the time and command with one rdlong. The result is that the motor control cog can get the command in the middle of the two 16 bit writes, and ends up reading the wrong timeout.
I've tried a number of "obvious" work-arounds (casts involving volatile, and different optimization options) but none of them seem to persuade GCC to ignore alignment and just perform a 32 bit write. Turning off optimization entirely does it, but that's not practical -- the resulting code is way too big.
Possible fixes for the original Spin code:
(1) Change the motor COG PASM code to read cmd and time separately. Slower, but with care we can get rid of the race condition.
(2) Declare Motor_cmd as a .long, and remove Motor_time. This is probably the best solution. since it guarantees the alignment, and Motor_time is not in fact referenced on its own anywhere! So in a way it's the right thing to do.
But the original code did work; are we likely to run into similar problems in other projects? Maybe we should change spin2cpp instead. Possible fixes for spin2cpp:
(1) Emit __attribute__((aligned(4))) directives for any word DAT variables that do happen to be long aligned. This is ugly, and liable to break if the user edits the C code. Of course, the original Spin code is also likely to break if anyone edits it!
(2) Do the cast using an external function; something like:
// in a library int32_t *long_ptr(void *arg) { return (int32_t *)arg; } // in the code (long_ptr(Motor_cmd))[0] = (((timeout << 16) | ((Min__((Max__(0, max_speed)), 15)) << 8)) | ((Min__((Max__(0, end_speed)), 15)) << 4)) | (command & (SCRIBBLER_MOT_IMM | SCRIBBLER_MOT_CONT));
Note that the long_ptr function cannot be visible in this compilation unit, or else the compiler will inline it and we're back to the same alignment problems. This is probably the most robust solution, but doing this for all Spin long[] directives will add a bunch of redundant function calls.
What are your opinions about this? I'm torn between saying the original Spin is broken (if a variable must be long aligned, the DAT section should indicate this, not just in comments...), saying spin2cpp --gas is broken (it doesn't translate a functional Spin program correctly) and saying GCC is broken (it doesn't do a 32 bit write when we ask it to).
Incidentally, without --gas the code generated by spin2cpp works, because the generated DAT array (a binary blob) is known to the compiler and it knows that the array as a whole is 32 bit aligned, and we ask for an offset in the array that is 32 bit aligned. But the C code is ugly and hard for the user to maintain:
((int32_t *)(int32_t)(&(*(uint16_t *)&dat[3012])))[0] = (((timeout << 16) | ((Min__((Max__(0, max_speed)), 15)) << 8)) | ((Min__((Max__(0, end_speed)), 15)) << 4)) | (command & (MOT_IMM | MOT_CONT));
Comments
EDIT: OK, it appears that the optimizer breaks this into two word writes. The external function may be the way to go, but only apply it when writing to variables defined as WORD or BYTE.
I don't know if you can do this translation better, but it is certainly more readable/understandable... BTW Of course, since Motor_time is not referenced, the correct and best solution is
I don't think it's unreasonable to change the Spin code so that it produces better C code. In this case, use Cluso's last suggestion and declare Motor_cmd32. This would be especially important if Parallax decides to use the Spin code as the master source, and the C code is regenerated by spin2cpp whenever the Spin code is updated.
Since there's no way to do that or specify alignment in Spin, the user has done this "manually". Without some kind of hint to the code translator there's no trivial way to make this work all the time. You could conceivably tag variables as "potentially" aligned, and then wait to see if the generated code ever does a long-write like you showed above. If that happens, then actually emit the align attribute on them, otherwise skip it since it ended up not being required.
Is that workable? Kind of ugly, but without a way for the original code to declare the intention, none of the solutions are pretty.
Perhaps Spin and C have the same idea of endianness, I don't know. For sure such unions cause problems when moving code from an architecture with one endianness to another with the opposite. For this reason I have always avoided unions.
No more so than any other solution presented here. If you're trying to translate from a known source form to a known target, it's reasonably safe. I doubt anyone is going to be attempting to run the S3 motor driver on a PowerPC chip.
This problem seems like creating C code out of assembler source. C is a lot more structured and strict than assembler so it's hard to do accurately. Unless you make an emulator of the target machine in C that can run the assembler instructions.
That's impossible.
We are not about to create a Propeller emulator in C that runs Propeller machine code on the Propeller !
In this case the address is on a long boundary, so the long is made up of the addressed word and the next word in memory in a little-endian fashion. If the C code is run on a little-endian machine, such as the Prop, the data can be accessed as a single long. If the target machine is big-endian than the words will need to be swapped. That is, if the other cog reads the long as two words. Now if the target machine happens to be a single-processor machine, the problem could be handled by doing two word accesses, but with interrupts disabled to prevent any thread switching.
And then there is the issue about non-aligned accesses. Some machines allow this and will access the odd word address and the next even address. The Prop will access the previous even word address and the odd word address. And some machine will not allow it at all. So if spin2cpp were to generate code for these machines it would need special memory access routines to support each type of machine.
If you really want to support Prop assembly on any machine you could integrate the code from spinsim to your program. However, assembly code would run very slowly compared to C code, and I don't believe it would work correctly on big-endian machines.
I'm with (2), if the vars are read as a long, they should be declared as a long.
If Spin lacks records/unions, then it needs to be 'coded manually'.
Yes, tho my personal style would be the include time in the name, & comment the packed record nature.
If you can add warnings, and helpers, that is great, but I would expect any project translation effort to also result in some clean-ups in the original source. (so good warnings are more important than complex band-aids)
The clarity of the output of this is looking excellent, and I'm sure Ken can use this to introduce Spin to C users.
ie Seeing the same code in two languages, is a great way to quickly register the variances.
For that use, it is in Parallax's interests to have 'clean, well coded' Spin as the starting reference.
Sorry for the confusion.
You are right, it has nothing to do with assembler.
I was alluding to the fact that perhaps there are things going on in the source language that cannot be expected to be easily translatable to the target language. At least not without a huge lot of complex analysis of the source and perhaps some understanding of the original intent of the author.
Many years ago while working a Racal they had many man years of code developed in PL/M 86. They wanted to move all that to C in order to make it portable. They contracted a guy to create a translator. The end result was a pretty good translation but there were some details it could not handle. Those had to be fixed up by hand.