C/Spin benchmarks?
RossH
Posts: 5,511
Hi all,
I'm looking for programs suitable for use in comparing the performance and code size of C vs Spin - i.e. any instances where the same algorithm has (as far as practical) been implemented in both C and Spin. I've begun working on a new Catalina kernel that could potentially reduce the code size of C programs much closer to the sizes typical of Spin programs - but at a performance cost over existing C compilers. I'd like to find some more real examples to test it out on before I decide if it is worth releasing.
I know of Heater's fft_bench (which is a great starting point) but I need more examples. If anyone knows of any candidates please post them here (or link to them). It may be a whole program or just an algorithm that has been ported from C to Spin (or vice versa).
Thanks,
Ross.
P.S. This thread is not about benchmarking between the various C compilers - there are other threads for that, and the tradeoff between speed and code sizes of the various C compilers (ICC, Catalina, Zog, GCC) is reasonably well understood. I want this thread to be specifically about Spin vs C. And at this stage, I'm actually much more interested in looking at code sizes rather than performance.
Comments
Are you thinking along the lines of some form of interpreter for this?
It would make sense to me to focus the effort on PropGCC and the Parallax SPIN compiler, since they are both open source and official code of Parallax.
This is no slight against you, but what hole does Catalina fill now? More importantly, why weren't Parallax's efforts focused around Catalina, since it was already there? That isn't so much a question for Parallax as for you: why didn't they consider it for the official compiler?
Sorry to take your thread off-topic, but I'm wary of confusing newcomers and duplicating effort, to what end?
You can use Eric's spin2cpp converter if you don't mind re-purposing the resulting code. The C++ness of the output can be easily replaced with normal C in most cases with a simple sed script or regex in vim.
For example, the hello program is translated directly to the .cpp below. The FullDuplexSingleton.spin is translated function by function (not shown here). The one drawback so far with spin2cpp is that all files need to live in the same directory.
request: an option to compress/optimize code without removing comments. (Like Phil's clean & BST)
OBC
Don't we always start these benchmark quests with the good old recursive FIBO? It's a crappy benchmark in many ways but has become something of a tradition.
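For reference, the traditional recursive FIBO is about as small as a benchmark gets. A minimal C sketch follows; the name `fibo` is just the conventional choice here and is not taken from any particular posted benchmark.

```c
#include <stdint.h>

/* Naive recursive Fibonacci. It is deliberately inefficient, so the run
   time is dominated by function call, add and compare overhead. */
uint32_t fibo(uint32_t n)
{
    if (n < 2) {
        return n;                      /* fibo(0) = 0, fibo(1) = 1 */
    }
    return fibo(n - 1) + fibo(n - 2);  /* two recursive calls per level */
}
```

Timing something like `fibo(20)` and comparing the compiled size of this one function against a Spin equivalent gives a rough first data point, although it exercises very little of either language.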
A major challenge might be to get the C equivalent of one of the Spin FAT file systems down to Spin size.
That was a "silly", "stupid", "foolish" thing for anyone to say.
It was that kind of talk that stopped Brad from doing more work on BST.
These were my numbers ages ago, comparing spin to Imagecraft's C.
http://javalins.wordpress.com/2009/09/04/propeller-languages-speed-comparison/
I thought it was an interesting comparison. I haven't used your compiler though.
James
The spin2cpp tool is a good idea - thanks. But it does have an inherent bias in that it is self-selecting for programs already hand-crafted to be efficient in Spin - converting them to C would not use much of the C language, since C is a considerably larger language than Spin. To balance it out, I would also need at least some examples of programs that were translated the other way. Hence this thread.
Hi Cluso (& also Jazzed)
You could call the approach I'm taking a "hybrid" kernel. I suppose all LMM kernels are hybrid in the sense that they are all at least partly interpreted - i.e. they implement "primitive" or "pseudo" instructions using various techniques that extend the basic PASM operations (e.g. to manage the stack). The important thing is to find the right balance. I'll be prepared to go into more detail when I have some actual results to report, but at the moment all I can say is that it looks promising. My current rule-of-thumb is that I probably won't bother persevering with it unless it can execute C more than twice as fast as Spin, using less than twice the code size (any better than that would be a bonus). As you are probably aware, this is a significantly different time/space tradeoff to what any of the current C compilers are designed to achieve.
Hi pedward,
You'd have to ask those questions of Parallax, not me. As it stands, I think Catalina offers a solution for C (not C++) that is still unmatched by GCC - although I am sure GCC will catch up eventually. As to what role Catalina fills - well, I think it is still the most practical option if you want to use a language other than Spin on a bare Propeller (i.e. no XMM), or program using a mix of Spin and C - and I have already said elsewhere that I intend to move it further in that direction (so the GCC team need not feel threatened!). I certainly won't be abandoning XMM (I also have a new XMM kernel in development, but I'm waiting to see if this idea pans out first, since I would probably also redesign that kernel if it does). But XMM on the Propeller is very much like "expanded memory" was on the old 8086 DOS systems - i.e. a temporary expedient we had to put up with until something better came along.
Hi Heater,
Yes, I also use FIBO - but in any language FIBO only needs about 3 operations (i.e. call, add, compare) - it just executes them over and over and over. It is not really a good benchmark for comparing diverse languages. But the file system idea is a good one - we have DOSFS (written in C) and several Spin alternatives. None of the Spin ones are exactly analogous to DOSFS, since they all tend to include large chunks of hand-coded PASM - but I seem to recall there is at least one that is implemented mostly in Spin to save space. Perhaps someone may remember which one that was?
Hi sapieha,
Don't worry - if I was thin-skinned I would have abandoned these forums long ago
Ross.
Thanks James - I had a quick look, but couldn't find your actual Spin or C code. Can you point me to it?
The ImageCraft ICC compiler is discontinued and unsupported.
Zog is not a compiler but rather a virtual machine/emulator for running ZPU byte codes on the Propeller. It was something I just had to do because the ZPU architecture is very small and easily fits in a COG's worth of PASM, and because there was an existing GCC target for it. ZPU byte codes turn out not to produce executables as small as I would like, and they run at about the speed of Spin. I still think there are use cases for it, though, such as running from serially connected FLASH/EEPROM. Anyway, I have not had much time to work on Zog for a long while, so it's not really in the running against Catalina and propgcc.
So your question is really "Now that propgcc is here why don't you abandon Catalina and work on propgcc instead?"
That is rather an impertinent question. Can you really expect someone who has spent thousands of hours over a number of years working on a project to just give up on it? As you see from this thread, Ross has plans to push it ahead and give GCC a run for its money.
One killer for C at the moment is the large size of its binaries. You cannot fit as much functionality into a Prop with LMM code as you can with Spin byte codes. Looks like Ross has a plan for that.
I can't speak for Ross or Parallax but I can only imagine that there is a feeling in many circles that GCC is the "professionals" choice. This is more a gut marketing thing than a solid engineering decision. One might also surmise that there are more GCC developer gurus around than there are for other compilers.
Welcome to the opensource world. Why do we have hundreds of different Linux distributions, or you can use BSD, why do we have Gnome and KDE and so on and so on.
Reading this with interest. We have a lot of "run it from anywhere but the HUB" options now. Good for larger projects and likely good for higher complexity objects too, depending on the complexity and how it manifests in terms of code required.
SPIN + PASM is really great on a bare Prop, and code size is the primary reason why that is so. A bare Prop will continue to be an entry scenario for the foreseeable future. Simple cost will drive this, as will the variety of external storage options. Not everybody will select an SD card, EMM, XMM, etc...
This is a great discussion, and I think having it here perfectly answers the question posed earlier.
@Ross, would something like a line drawing routine be of use? Mostly shifts, adds, writes to HUB and decisions. I've got some VGA C graphics code that I ported selected bits to SPIN for a project. The SPIN port isn't optimal in any way, it just works.
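The VGA code mentioned above is not posted in this thread, so the following is only a generic stand-in: a plain integer Bresenham line routine of the kind that is mostly adds, compares and memory writes. The names `draw_line`, `plot` and `framebuffer`, and the buffer dimensions, are hypothetical and chosen purely for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

#define FB_WIDTH  64   /* hypothetical frame buffer size */
#define FB_HEIGHT 48

uint8_t framebuffer[FB_HEIGHT][FB_WIDTH];  /* one byte per pixel */

/* Write one pixel, silently ignoring coordinates outside the buffer. */
void plot(int x, int y, uint8_t colour)
{
    if (x >= 0 && x < FB_WIDTH && y >= 0 && y < FB_HEIGHT) {
        framebuffer[y][x] = colour;
    }
}

/* Integer-only Bresenham line from (x0, y0) to (x1, y1), inclusive. */
void draw_line(int x0, int y0, int x1, int y1, uint8_t colour)
{
    int dx = abs(x1 - x0);
    int dy = -abs(y1 - y0);
    int sx = (x0 < x1) ? 1 : -1;   /* step direction in x */
    int sy = (y0 < y1) ? 1 : -1;   /* step direction in y */
    int err = dx + dy;             /* combined error term */

    for (;;) {
        plot(x0, y0, colour);
        if (x0 == x1 && y0 == y1) {
            break;
        }
        int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }  /* step in x */
        if (e2 <= dx) { err += dx; y0 += sy; }  /* step in y */
    }
}
```

A routine like this translates almost line for line into Spin, which makes it a reasonable candidate for comparing code sizes even if neither version is optimal.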
Sure - just post it (or point me to it) and I'll have a look. At the very least, I can extract the C and Spin implementations of the algorithm and use them to compare code sizes.
Ross.
JKISS32: http://forums.parallax.com/showthread.php?136975-JKISS32-High-quality-PRNG-with-RealRandom-seed-save-a-COG.
MWC256: http://forums.parallax.com/showthread.php?137118-MWC256-Very-high-quality-PRNG-with-RealRandom-seed-save-a-COG.
The latter contains both routines.
Thanks Heater. You don't by any chance have any C algorithms converted to Spin that exercise (at least in the C version) things like arrays, pointers, structures, unions or bit fields? Anything with switch statements or conditional expressions? I wonder if Dave's C to Spin converter handles these things?
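To make the request concrete, here is a small made-up C fragment of the sort that would be useful: it touches arrays, pointers, a structure, a union, bit fields, a switch statement and a conditional expression in a few lines. Everything in it (`instr_t`, `opcode_t`, `run`) is invented for illustration and does not come from any existing benchmark.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { OP_ADD, OP_SUB, OP_NEG } opcode_t;

/* One "instruction" for a toy accumulator machine. */
typedef struct {
    unsigned op     : 2;   /* bit field: which operation to perform */
    unsigned is_imm : 1;   /* bit field: 1 = use arg.imm, 0 = use table[arg.index] */
    union {
        int32_t imm;       /* immediate operand */
        uint8_t index;     /* index into a lookup table */
    } arg;
} instr_t;

/* Run len instructions starting at prog and return the final accumulator. */
int32_t run(const instr_t *prog, size_t len, const int32_t *table)
{
    int32_t acc = 0;
    for (const instr_t *p = prog; p < prog + len; p++) {     /* pointer walk */
        int32_t v = p->is_imm ? p->arg.imm : table[p->arg.index];
        switch (p->op) {
        case OP_ADD: acc += v;   break;
        case OP_SUB: acc -= v;   break;
        case OP_NEG: acc = -acc; break;
        default:                 break;
        }
    }
    return acc;
}
```

A Spin port of something like this would have to emulate the union and bit fields by hand with masks and shifts, which is exactly the kind of case where the two languages might diverge in code size.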
Ross.
Ross, let me know if you want me to try converting some C programs to Spin.
I thought you were only looking for a one-to-one correspondence. Ya, C is more complicated, and there will never exist a Spin program that uses all of C's features. Still, I don't follow the logic completely because, theoretically at least, whatever is written for Spin can be translated to C++ (and to C with simple manipulations). I accept that it doesn't fit your needs - it's your project after all.
But I suspect running GCC compiled code for the ZPU under Catalina might cause some kind of time space inversion and Ross would disappear in singularity.
http://code.google.com/p/spin2cpp
In my experience the converted C++ code tends to be about 4x bigger than Spin and about 4-8x faster when compiled with gcc.
As you've already mentioned Dave Hein has a C to Spin converter to go the other way.
Yes, those numbers are about what I see with Catalina as well - they're kind of "inherent" in typical LMM type solutions. Originally I thought those numbers would be "good enough" to encourage more C use, but I'm now beginning to wonder. Like most people, I keep finding that when I try and take advantage of the faster execution speeds C offers, I keep running out of RAM. That's why I'm wondering whether other tradeoffs might be more useful.
Ross.
If you limit yourself to programs that exhibit a 1-1 correspondence then the answer (at least on code sizes) is always going to be Spin.
Ross.
Hi Dave,
Thanks! I'd forgotten you posted a version of euler in Spin. I don't think there was a Spin version of xtea or dhrystone, though.
Ross.
Many people who ask about the Propeller ask if it supports GCC. That's not to say there's anything wrong with Catalina, but the name does not get as much respect as GCC. Rather than trying to fight an uphill battle, Parallax chose to go with the flow for the C compiler.
Not to say the propeller chip's architecture goes with the flow...
Anyway,
According to Parallax, Spin can execute 500,000 Spin instructions per second, so 4x is 2 million and 8x is 4 million. Obviously PASM is 20 MIPS, though all of these numbers depend on use.
That means compiled C is 5x to 10x slower than PASM, assuming LMM vs COG as the issue.
If you could implement a faster interpreter, you could probably find a happy middle ground between LMM and code density for speed. You might end up with 4x code density and 1/20th or 1/10th the speed of COG PASM.
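The arithmetic above can be checked with a few lines of C. This is only a back-of-envelope sketch using the figures quoted in this post (500,000 Spin instructions per second, 20 MIPS for COG PASM); the macro and function names are made up for the purpose.

```c
/* Figures quoted in the thread, not measurements. */
#define SPIN_IPS 500000L     /* Spin instructions per second */
#define PASM_IPS 20000000L   /* COG PASM instructions per second (20 MIPS) */

/* Instruction rate of code that runs `speedup` times faster than Spin. */
long lmm_ips(long speedup)
{
    return SPIN_IPS * speedup;
}

/* How many times slower than COG PASM a given instruction rate is. */
long pasm_slowdown(long ips)
{
    return PASM_IPS / ips;
}
```

With these figures, `pasm_slowdown(lmm_ips(4))` is 10 and `pasm_slowdown(lmm_ips(8))` is 5, matching the 5x to 10x range, while Spin itself comes out at 40x slower than COG PASM.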
Exactly. My current thinking is that C on the Propeller is not getting the traction it might because it is currently "falling between two stools" - it is neither fast enough to be attractive for those applications that require speed, nor small enough to be attractive for those applications that require complex algorithms.
Perhaps there is some happy "middle ground". In this thread I'm just exploring possible alternatives.
Ross.
Isn't the best approach there one that allows small sections of C to create PASM, rather than byte-code equivalents?
(I think the GCC port is following this already?)
- or at least, this should be part of any solution.
If there is another switch choice that allows memory to 'go further', at the cost of some speed, that is also useful.
Is the memory you run short of code or data memory?
If it is code, then QuadSPI memory (even DDR) could remove that barrier?
The FSRW file driver was originally written in C and then translated to Spin. The C code is also in the object's ZIP, so this may be a good one for comparing code size. The performance is a bit harder to measure, because it's mainly the SPI driver that determines the SD access speed (and, with fast SPI code, the SD card itself), not the file driver.
Andy