automatically convert Spin to C or C++

I've just released a new version (0.96) of the spin2cpp tool for converting spin files to C++. It now also supports conversion to plain C, with some restrictions. At the moment the C output is static, so only one instance of each object is allowed. I've made an effort to make the C output generic (i.e. it should generally compile with either PropGCC or Catalina), but a few C++ and/or GCC-specific features are still output for some constructs, e.g. lookup and lookdown.
Catalina already has a way to interface with spin, of course, but sometimes one may want to convert a spin file in order to benefit from the improved speed of C.
This version of spin2cpp also supports putting inline C and C++ code in spin comments. Any comment starting with {++ is assumed to be C++ (or C) code and is passed through. The intent is that a spin driver that interfaces to PASM code can have C added to it as well, so that the same spin file can be used to produce a C/C++ driver.
The download link is at:
http://code.google.com/p/spin2cpp
Comments
Thinking of a typical Spin program, there is generally a list of global variables in the VAR section. My understanding of C is that one tends not to use global variables and instead passes variables into functions, and the function then returns new values. So I've been thinking of a way of writing Spin that is more like C to start with, i.e. no global variables.
I'm a C newbie but I think the general concept is you pass a pointer to a variable rather than the variable. Then inside the function you now have direct access to that variable and can change it and in effect, this makes it a global variable.
The simple case is passing one variable and returning one variable. The slightly more complex case is returning two variables. I think the general concept in Spin is to pass two pointers with the @, eg
PUB Main | a,b
  a := 4
  b := 5
  PointerDemo(@a,@b)

PUB PointerDemo(c,d)
  long[c] := 6
  long[d] := 7
I'm not 100% sure that is correct but the idea is for variables a and b in the main routine to be changed by the function.
So I ran this through Spin2CPP and it produced this code
static pointSpin thisobj;

int32_t pointSpin_Main(void)
{
  int32_t _local__0000[2];
  int32_t result = 0;
  _local__0000[0] = 4;
  _local__0000[1] = 5;
  pointSpin_Pointerdemo((int32_t)(&_local__0000[0]), (int32_t)(&_local__0000[1]));
  return result;
}

int32_t pointSpin_Pointerdemo(int32_t C, int32_t D)
{
  int32_t result = 0;
  ((int32_t *)C)[0] = 6;
  ((int32_t *)D)[0] = 7;
  return result;
}
If this is correct, one could presumably do the same thing with arrays?
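On the array question, here is a minimal hand-written C sketch (not spin2cpp output) of the same pointer idea. The names pointer_demo and fill_squares are made up for illustration. An array is passed as the address of its first element plus a length, so the callee writes straight into the caller's storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes two results through pointers,
   mirroring PointerDemo(@a, @b) in the Spin example. */
void pointer_demo(int32_t *c, int32_t *d)
{
    assert(c != NULL && d != NULL);
    *c = 6;
    *d = 7;
}

/* Hypothetical helper: the same idea for an array. The callee gets the
   address of the first element and the element count, and fills the
   caller's array in place. */
void fill_squares(int32_t *values, size_t count)
{
    assert(values != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        values[i] = (int32_t)(i * i);
    }
}
```

A caller would write pointer_demo(&a, &b) for two scalars, or fill_squares(buf, 4) for a four-element array; no globals are needed in either case.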
Eventually spin2cpp will fully support objects even in C mode, and then the output of:
VAR
  a,b

PUB Main
  a := 4
  b := 5
  PointerDemo(6, 7)

PUB PointerDemo(x, y)
  a := x
  b := y
will look like:
int32_t pointSpin_Main(pointSpin *thisobj)
{
  int32_t result = 0;
  thisobj->A = 4;
  thisobj->B = 5;
  pointSpin_Pointerdemo(thisobj, 6, 7);
  return result;
}

int32_t pointSpin_Pointerdemo(pointSpin *thisobj, int32_t X, int32_t Y)
{
  int32_t result = 0;
  thisobj->A = X;
  thisobj->B = Y;
  return result;
}
i.e. a pointer to the struct containing the VAR variables will be passed in as a parameter to all the functions. This is the way C++ works "under the hood"
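As a rough illustration of why the struct pointer matters, here is a hand-written sketch. The struct layout is an assumption (the real generated pointSpin type may differ). Once the VAR variables live in a struct that is passed in, nothing stops a program from having more than one instance.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: one field per Spin VAR variable. */
typedef struct {
    int32_t A;
    int32_t B;
} pointSpin;

/* Same shape as the function shown above: the object pointer is an
   explicit first parameter. */
int32_t pointSpin_Pointerdemo(pointSpin *thisobj, int32_t X, int32_t Y)
{
    int32_t result = 0;
    assert(thisobj != 0);
    thisobj->A = X;
    thisobj->B = Y;
    return result;
}
```

Two pointSpin variables can then be updated independently by passing the address of each one in turn.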
Ok, brainstorming here as I have been writing some spin code that is pushing up against the limits of what spin can do. Spin does not seem fully "object oriented". Consider an SD driver object. You can reference all its methods from the main program and that is fine. With the touchscreen however, it has been incredibly useful to create a common object called touch.spin which contains a whole bunch of useful methods. Using this object, it is possible to create a calculator program and a synth program using very little extra code. The problem is that integral to this touch.spin object is an SD driver. So the sd driver object becomes a sub-object of the touch.spin object. You can still reference the sd code from the main, but you have to do this via the touch.spin object. The next problem though is if you add in some new code that contains an object that references the sd card object. You either end up with two copies of the sd object (each of which takes about 1/3 of the hub ram), or you have to delete one and pass everything back up to the main, and then down into the tree of another object.
With your code you just posted, does this mean C++ can pass references to objects to and from functions? If so, that makes things a lot more flexible.
Or to put it another way, thinking graphically, Spin is like a tree with branches, and if one leaf on one branch talks to another leaf it needs to talk via the base of the tree. What would be more flexible would be to tell every leaf about a common object and then each leaf can put variables in this object and retrieve variables.
To take a specific example, create an object like a button. Put into that object two variables and a string. Pass the object with your routine *thisobj. Then every function can use the object.
I have this vague idea that if we take all the obex objects and convert them via spin2cpp to C objects, we could use them in ways we can't use them in spin (eg my .wav player which currently has two sd objects in it)
Help understanding this would be most appreciated.
Combine this with your other efforts and Spin2Cpp program binaries could end up faster and smaller than Spin binaries.
Sweet
@Dr_Acula, I've had the same problem with multiple copies of the SD drivers. I got around it by making a modified version of FSRW that uses a pointer to a file block rather than VAR variables. It's the same idea as passing a struct pointer in C, but doing it in Spin. However, since Spin doesn't handle structures I had to copy the contents of the file block to local variables in FSRW. Fortunately, a LONGMOVE of 11 variables in Spin only takes about the same amount of time as a couple of assignment statements, so there's not much overhead in doing this.
@Steve, when are you adding spin2cpp to the SimpleIDE? This would make Spin development very flexible, where it could be compiled to bytecodes or to the various PropGCC PASM models. Taking this one step further would allow C/C++ programs to call Spin routines that are converted by spin2cpp. This would make all of the Spin objects in the OBEX available to PropGCC programs.
Of course, spin2cpp doesn't help you much here, since it's taking Spin code as input, so it will never actually take advantage of the structure passing capability. Perhaps you could use spin2cpp to convert to C (or probably C++ is better if there are a lot of objects) and then hand optimize the output to use structures.
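A minimal sketch of what such hand optimization might look like, with every name invented for illustration (sd_driver, touch_object and wav_player are not real OBEX or spin2cpp types). One driver struct is created once, and each client object holds a pointer to it instead of its own copy.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the SD driver's VAR section. */
typedef struct {
    int32_t sectors_read;
} sd_driver;

/* Two hypothetical client objects that share the driver by pointer
   rather than each embedding a private copy. */
typedef struct { sd_driver *sd; } touch_object;
typedef struct { sd_driver *sd; } wav_player;

/* Stand-in for a real sector read: just counts calls. */
void sd_read_sector(sd_driver *sd)
{
    assert(sd != 0);
    sd->sectors_read++;
}

void touch_load_icon(touch_object *t) { sd_read_sector(t->sd); }
void wav_play_block(wav_player *w)    { sd_read_sector(w->sd); }
```

Both clients end up updating the same sd_driver, so only one copy of its state (and, in a real driver, its buffers) lives in hub RAM.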
Great news. I can see a new way of working - convert a whole lot of obex objects into C, then start using them in new and creative ways.
Nice work, Eric. On the Prop II (where the necessary Hub space will probably be available) this will be a useful option.
By the way - lookup and lookdown are easy enough to program in C (or PASM for that matter). I've had to do this already for my own purposes - perhaps both GCC and Catalina should agree on C function prototypes for all the standard Spin functions (lookup, lookdownz, bytemove, wordfill, coginit, locknew etc etc) and then add the functions to their respective standard libraries. This would sure help people migrating from Spin to C (whether Catalina or GCC). If you want to explore this possibility further, send me a PM.
Ross.
The common case of lookup/lookdown (when the elements are constant) is easy enough to handle -- just generate a temporary array and index into it. It's the case where the element array includes a reference to the (possibly modified) index that causes me headaches. For example:
r := lookupz(i++ : i, 9)
I was trying to avoid changing the libraries, although that's obviously a possibility. Using a variadic function is another possibility, I guess -- is that what you ended up doing?
Eric
It will handle the Spin code, but the result is too big to fit into hub memory without some manual changes (for Graphics_Demo, changing to single buffering; I haven't actually tried VGA_Demo yet).
Once the various C compilers have their compressed memory models (CMM) working then there's a good chance these demos will work without changes.
Eric
The LCD codes I'm interested in are very close to the VGA_Demo...
BTW, if you have access to Propeller-GCC with compressedcode, you can add the CMM model to the SimpleIDE by defining a file called memorymodels.txt in the SimpleIDE.exe folder.
Contents of my memorymodels.txt:
CMM
COG
LMM
XMMC
XMM-SINGLE
XMM-SPLIT
Hi Dave,
I would still like to have the Spin bytecode interpreter available on the Prop II. I originally thought there might not be one, but Parallax has assured us all that there will, and I see no reason for them to reconsider that just because of CMM. Spin will always give you the smallest code sizes on the Propeller - and code size will continue to be both a Prop I and a Prop II killer for many users and applications.
Of course, the Prop II will have more RAM than the Prop I, but even so we already run many C programs that wouldn't fit on the P2 without CMM (we currently run them using XMM). Where I believe CMM will get most traction is in entirely supplanting the need for bolting on expensive XMM hardware to both the Prop I and the Prop II.
While the GCC team is of course entirely free to use the Catalina CMM kernel (or a derivative), I hope they decide to go their own way. I'm interested in seeing if they can do better - and I'm sure they will try hard, since Catalina has effectively stolen a march on them with CMM support.
Ross.
I think it's great that we're experimenting with different solutions... I don't think there will ever be a single "one size fits all" best, and certainly a diverse selection of tools is good for the Propeller community (IMHO).
If you want to take a peek at the PropGCC CMM code, which is still in development, you can look at the "compressedcode" branch of the source repository. We're using a byte code based interpreter, which likely has more overhead than the Catalina CMM kernel but which allows for potentially better compression. Here's the btea function of xxtea as produced by propeller-elf-gcc -Os -mcmm:
GAS LISTING xxtea.s 			page 1

   1              		.text
   2              		.global	_btea
   3              	_btea
   4 0000 0388     		lpushm	#(8<<4)+8
   5 0002 2112     		cmps	r1, #1 wz,wc
   6 0004 0B5061   		xmov	r5,r0 mov r6,r1
   7 0007 4E0000   		IF_BE	brw	#.L2
   8 000a 2611     		sub	r6, #1
   9 000c 0AD6     		mov	r13, r6
  10 000e A034     		mov	r0, #52
  11 0010 2D29     		shl	r13, #2
  12 0012 09       		ldiv
  13 0013 1D50     		add	r13, r5
  14 0015 E09660   		xmov	r9,r6 add r0,#6
  15 0018 BC       		mov	r12, #0
  16 0019 E98634   		xmov	r8,r6 and r9,#3
  17 001c 17DD     		rdlong	r7, r13
  18              	.L5
  19 001e 56B97937 		mvi	r6,#-1640531527
  19      9E
  20 0023 1C60     		add	r12, r6
  21 0025 E44C2A   		xmov	r4,r12 shr r4,#2
  22 0028 2434     		and	r4, #3
  23 002a 0A15     		mov	r1, r5
  24 002c BB       		mov	r11, #0
  25 002d 7F33     		brs	#.L3
  26              	.L4
  27 002f E66B34   		xmov	r6,r11 and r6,#3
  28 0032 D6E148   		xmov	r14,r1 xor r6,r4
  29 0035 2E40     		add	r14, #4
  30 0037 2629     		shl	r6, #2
  31 0039 1620     		add	r6, r2
  32 003b EAA749   		xmov	r10,r7 shl r10,#4
  33 003e 2B10     		add	r11, #1
  34 0040 1FED     		rdlong	lr, r14
  35 0042 D33FC8   		xmov	r3,lr xor r3,r12
  36 0045 166D     		rdlong	r6, r6
  37 0047 1678     		xor	r6, r7
  38 0049 1630     		add	r6, r3
  39 004b E33F3A   		xmov	r3,lr shr r3,#3
  40 004e 2F29     		shl	lr, #2
  41 0050 275A     		shr	r7, #5
  42 0052 13A8     		xor	r3, r10
  43 0054 17F8     		xor	r7, lr
  44 0056 1730     		add	r7, r3
  45 0058 1768     		xor	r7, r6
  46 005a 161D     		rdlong	r6, r1
  47 005c 1760     		add	r7, r6
  48 005e 171F     		wrlong	r7, r1
  49 0060 0A1E     		mov	r1, r14
  50              	.L3
  51 0062 1B83     		cmp	r11, r8 wz,wc
  52 0064 7CC9     		IF_B	brs	#.L4
  53 0066 115D     		rdlong	r1, r5
  54 0068 0B6731   		xmov	r6,r7 mov r3,r1
  55 006b 1498     		xor	r4, r9
  56 006d 233A     		shr	r3, #3
  57 006f 2649     		shl	r6, #4
  58 0071 2429     		shl	r4, #2
  59 0073 1638     		xor	r6, r3
  60 0075 0BF137   		xmov	lr,r1 mov r3,r7
  61 0078 1420     		add	r4, r2
  62 007a 235A     		shr	r3, #5
  63 007c 2F29     		shl	lr, #2
  64 007e 13F8     		xor	r3, lr
  65 0080 11C8     		xor	r1, r12
  66 0082 1630     		add	r6, r3
  67 0084 FB010084 		sub	r0, #1 wz
  68 0088 144D     		rdlong	r4, r4
  69 008a 1748     		xor	r7, r4
  70 008c 1710     		add	r7, r1
  71 008e 1768     		xor	r7, r6
  72 0090 16DD     		rdlong	r6, r13
  73 0092 1760     		add	r7, r6
  74 0094 17DF     		wrlong	r7, r13
  75 0096 450000   		IF_NE	brw	#.L5
  76 0099 4F0000   		brw	#.L1
  77              	.L2
  78 009c 2716     		neg	r7, #1
  79 009e 1172     		cmps	r1, r7 wz,wc
  80 00a0 430000   		IF_AE	brw	#.L1
  81 00a3 1416     		neg	r4, r1
  82 00a5 A034     		mov	r0, #52
  83 00a7 0B14A6   		xmov	r1,r4 mov r10,r6
  84 00aa 09       		ldiv
  85 00ab 51B97937 		mvi	r1,#-1640531527
  85      9E
  86 00b0 2060     		add	r0, #6
  87 00b2 3AFF8F   		xor	r10,__MASK_FFFFFFFF
  88 00b5 0ADA     		mov	r13, r10
  89 00b7 07       		lmul
  90 00b8 E11411   		xmov	r1,r4 sub r1,#1
  91 00bb 2129     		shl	r1, #2
  92 00bd 2D29     		shl	r13, #2
  93 00bf 1150     		add	r1, r5
  94 00c1 1D50     		add	r13, r5
  95 00c3 594786C8 		mvi	r9,#1640531527
  95      61
  96 00c8 175D     		rdlong	r7, r5
  97              	.L9
  98 00ca E3302A   		xmov	r3,r0 shr r3,#2
  99 00cd 2334     		and	r3, #3
 100 00cf 0BEAFD   		xmov	r14,r10 mov lr,r13
 101 00d2 7F33     		brs	#.L7
 102              	.L8
 103 00d4 E66E34   		xmov	r6,r14 and r6,#3
 104 00d7 1638     		xor	r6, r3
 105 00d9 E6BF29   		xmov	r11,lr shl r6,#2
 106 00dc 2B41     		sub	r11, #4
 107 00de 1620     		add	r6, r2
 108 00e0 D44708   		xmov	r4,r7 xor r4,r0
 109 00e3 E8873A   		xmov	r8,r7 shr r8,#3
 110 00e6 2729     		shl	r7, #2
 111 00e8 2E11     		sub	r14, #1
 112 00ea 1CBD     		rdlong	r12, r11
 113 00ec 166D     		rdlong	r6, r6
 114 00ee 16C8     		xor	r6, r12
 115 00f0 1640     		add	r6, r4
 116 00f2 E44C49   		xmov	r4,r12 shl r4,#4
 117 00f5 2C5A     		shr	r12, #5
 118 00f7 1C78     		xor	r12, r7
 119 00f9 1488     		xor	r4, r8
 120 00fb 14C0     		add	r4, r12
 121 00fd 1648     		xor	r6, r4
 122 00ff 17FD     		rdlong	r7, lr
 123 0101 1761     		sub	r7, r6
 124 0103 17FF     		wrlong	r7, lr
 125 0105 0AFB     		mov	lr, r11
 126              	.L7
 127 0107 2E02     		cmps	r14, #0 wz,wc
 128 0109 75C9     		IF_NE	brs	#.L8
 129 010b 1F1D     		rdlong	lr, r1
 130 010d 0B674F   		xmov	r6,r7 mov r4,lr
 131 0110 2449     		shl	r4, #4
 132 0112 263A     		shr	r6, #3
 133 0114 1648     		xor	r6, r4
 134 0116 0BEF47   		xmov	r14,lr mov r4,r7
 135 0119 2429     		shl	r4, #2
 136 011b 2E5A     		shr	r14, #5
 137 011d 2329     		shl	r3, #2
 138 011f 14E8     		xor	r4, r14
 139 0121 1320     		add	r3, r2
 140 0123 1640     		add	r6, r4
 141 0125 1708     		xor	r7, r0
 142 0127 FA090080 		add	r0, r9 wz
 143 012b 143D     		rdlong	r4, r3
 144 012d 14F8     		xor	r4, lr
 145 012f 1740     		add	r7, r4
 146 0131 1678     		xor	r6, r7
 147 0133 175D     		rdlong	r7, r5
 148 0135 1761     		sub	r7, r6
 149 0137 175F     		wrlong	r7, r5
 150 0139 450000   		IF_NE	brw	#.L9
 151              	.L1
 152 013c 048F     		lpopm	#(8<<4)+15
 153 013e 0200     		lret

DEFINED SYMBOLS
            xxtea.s:3      .text:0000000000000000 _btea
            xxtea.s:18     .text:000000000000001e .L5
            xxtea.s:77     .text:000000000000009c .L2
            xxtea.s:97     .text:00000000000000ca .L9
            xxtea.s:151    .text:000000000000013c .L1

NO UNDEFINED SYMBOLS
Eric
I must admit I had not considered this case. Apart from this, I think a variadic function is the best answer (I think it may even be the only possible answer, given the way the function is supposed to handle character strings). The case you're trying to address seems to me to involve unspecified behavior anyway - do you know of any instances of people actually using these functions this way?
I agree. Unfortunately, I don't really have time to keep abreast of what's happening in GCC as well as keeping Catalina going, but thanks for the listing - very interesting. I tried byte encoding as my very first option, since I knew that would give you the best code sizes (I tried nearly everything - 8 bit, 16 bit, 24 bit and 32 bits, as well as many of the possible combinations of these). Yes, I could certainly get smaller code sizes with 8 bit encoding than the encoding I finally settled on (essentially, 16 bit) - but the resulting execution speeds were simply not compelling. Certainly not compelling enough to justify claiming that CMM could ever displace either LMM or Spin. Of course, you may be able to do a bit better than I managed - happy to wait and see.
Ross.
It probably is an edge case, but I wanted to handle it correctly. Actually though I think there is another solution. Since the mapping between Spin and C does not have to be one to one, I can scan over the lookup statement and extract the index, creating code like:
{
  int tmp__ = i++;
  int tmparray__[2] = { 0, 9 };
  tmparray__[0] = i;
  r = lookupz(tmp__, 2, tmparray__);
}
or something of that ilk. I'd like to stick with arrays if possible rather than variadic functions, because the performance will be better (especially for the common case of constant arrays). But providing a variadic lookup function in the library may be a good idea for people doing a port by hand.
Thanks for your help with this!
Eric
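For anyone following along, here is a rough sketch of the two library shapes being discussed. Both functions are hypothetical (the names lookupz_array and lookupz_va are invented, and neither is taken from PropGCC, Catalina or spin2cpp). The array version mirrors the lookupz(index, count, array) call shape used above; the variadic version is the alternative for hand ports. Returning 0 for an out-of-range index is assumed here to match Spin's behaviour.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>

/* Hypothetical array-based helper: zero-based lookup into a table that
   the caller has already built. Out-of-range indexes yield 0. */
int32_t lookupz_array(int32_t index, int32_t count, const int32_t *elements)
{
    assert(elements != 0 && count >= 0);
    if (index < 0 || index >= count) {
        return 0;
    }
    return elements[index];
}

/* Hypothetical variadic helper: the elements are passed as arguments.
   Callers must pass exactly `count` values of type int32_t. */
int32_t lookupz_va(int32_t index, int count, ...)
{
    va_list ap;
    int32_t result = 0;

    va_start(ap, count);
    for (int n = 0; n < count; n++) {
        int32_t value = va_arg(ap, int32_t);
        if (n == index) {
            result = value;
        }
    }
    va_end(ap);
    return result;
}
```

The array form lets a constant table sit in read-only data and be indexed directly, which is the performance argument made above. The variadic form has to walk its argument list on every call.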
No worries, since I didn't actually do anything
Ross.
x := lookupz(3 : i++, suba, y /= 5, subb, y *= 5)
The expressions i++, suba and y /=5 will all be evaluated, but subb and y *= 5 will not be. This is an undocumented feature of lookupz, so it is probably not important to emulate this feature.
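A hand-written C sketch of that evaluation order, under the assumption that elements are evaluated strictly left to right and evaluation stops at the selected one. The function lazy_lookupz and the stub bodies of suba and subb are invented for illustration, and what happens for an out-of-range index is a guess, not documented behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Call counters so the side effects can be observed. */
int suba_calls = 0;
int subb_calls = 0;

/* Hypothetical stand-ins for the Spin methods suba and subb. */
int32_t suba(void) { suba_calls++; return 100; }
int32_t subb(void) { subb_calls++; return 200; }

/* Emulates lookupz(index : i++, suba, y /= 5, subb, y *= 5) with the
   assumed semantics: evaluate each element in order, stop as soon as
   the selected element has been evaluated. */
int32_t lazy_lookupz(int32_t index, int32_t *i, int32_t *y)
{
    int32_t value;

    assert(i != 0 && y != 0);
    value = (*i)++;     if (index-- == 0) return value;
    value = suba();     if (index-- == 0) return value;
    value = (*y /= 5);  if (index-- == 0) return value;
    value = subb();     if (index-- == 0) return value;
    value = (*y *= 5);  if (index-- == 0) return value;

    /* In this sketch an out-of-range index evaluates every element and
       returns 0; the real interpreter may behave differently. */
    return 0;
}
```

With a zero-based index of 2, this evaluates i++, suba and y /= 5 and leaves subb and y *= 5 untouched, which is quite different from building a temporary array of all five values first.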
I've tried to leave the Windows world behind and am using BST. Is there any way that this might be provided for other OSes?
Oh wow, that's interesting. It also implies a completely different way of implementing lookup. I should probably look into it -- it's been my experience that if there's a weird undocumented feature of the Spin interpreter, then some program somewhere uses it :-).
Thanks,
Eric
The source actually compiles under Linux (that's my "native" platform) and it's pretty generic code, so it should work on the Mac, too. If you're comfortable with the Linux or Mac command line, just pull down the source and type "make" in the spin2cpp directory. I should get around to posting a Linux binary one of these days, I just haven't had time.
(Oh, and the Windows version works fine under Wine on Linux as well, except that the --elf option looks for propeller-elf-gcc.exe in the emulated Windows drive.)
Eric
But, I just see a couple of zip files for the Windows version, not the source code.
I did manage to locate the source code and print a copy, but am unsure of how to go about downloading the actual files. The web site seems to insist that I 'clone' your site and I am not sure what that might lead to.
So I'll use the .exe for now.
Linux is an open source operating system with a Unix-like heritage. It means that a source distribution should be able to compile on the computer assuming a minimum set of tools (make and gcc).
Cloning means that you need a source control tool to get the source. Once you have the source, you can build the tool. That is all.
What Linux do you use? I could probably make an application .zip or .tar for you.
Cheers,
Eric
But if you actually are not bothered by this, I use Ubuntu Linux. I am sure there will be others who may be more timid about using 'make' to generate a program. But that is a little absurd, since we are talking about a tool for learning C and C++ so that we may do that compile process.
Thanks for your concern. I will take a look at the Linux .zip