Spin under SimpleIDE: compiled or interpreted?

samuell · 2016-04-12 08:45

Hi,

I've recently discovered that SimpleIDE supports the Spin language. I'm really new to that language, and I'm confused because it can be compiled (or translated to machine code) or interpreted on chip, depending on the reference I read. So, my question is, if I use SimpleIDE to program in Spin, will my code compile or be interpreted on the fly?

I ask this because interpreted languages are not my cup of tea, as they are by no means faster.

Also, is C actually being compiled, or sort of interpreted? I notice that it could be a lot faster. My PC has two "cogs" running at 3.2GHz, and is at least 1000 times faster when calculating prime numbers than the P8X32A, which has 8 cogs and uses 7 of them to perform the calculations. It doesn't make sense that the difference is so big, considering that I'm using LMM to boost it up.

Kind regards, Samuel Lourenço

Heater. · 2016-04-12 09:59

The Spin language is compiled, but not into native processor instructions; it is compiled into bytecodes. These bytecodes are what gets loaded into the Propeller. The Propeller contains within its ROM an interpreter for those bytecodes.

The result is that your program compiles down to a small size, which allows more functionality in the limited space of the Prop. But it is much slower than executing native instructions.
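
To illustrate the compile-to-bytecode idea, here is a minimal sketch in Python. The opcodes and the `run` function are made up for illustration; they are not the real Spin bytecodes, nor the interpreter in the Propeller's ROM.

```python
# Hypothetical stack-machine bytecodes; NOT the real Spin instruction set.
PUSH, ADD, MUL, HALT = 0, 1, 2, 3

def run(bytecode):
    """Interpret a list of small integers, one bytecode at a time."""
    stack = []
    pc = 0  # index of the next bytecode to decode
    while True:
        op = bytecode[pc]
        pc += 1
        if op == PUSH:
            stack.append(bytecode[pc])  # the operand follows the opcode
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")

# (2 + 3) * 4 "compiled" down to nine small numbers.
program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT]
result = run(program)
```

The program is compact, but every bytecode costs a fetch, a decode and a dispatch before any real work happens. That is the size versus speed trade-off in a nutshell.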

This is all somewhat like the Java system.

The above is true for the original Prop Tool, and later tools like BST, HomeSpun, OpenSpin, PropellerIDE, SimpleIDE. I believe you can now compile Spin to native code with some newer tools. I have never tried this.

The C compilers generate a sort of native code. A Prop cannot execute native code from the 32K shared RAM, only from the 512 LONGs within its COG space. To get around this, C compilers generate native instructions that are read from HUB by a "kernel" running in a COG and executed in the COG one at a time. This allows for big C programs but is about 4 times slower than actual in-COG native code.

I'm not sure why you are surprised at the difference in speed between your PC and a Propeller. A rough estimate is that a Prop can run code at a speed of 20MIPS per COG. It has 8 COGs, so that is 160MIPS. That is the absolute maximum, if you make use of all the COGs in your program at 100% efficiency.

So a Prop is 20 times slower than your PC.

But even if your program made use of all 8 COGs it would probably only get a speed up of about 4.

So a Prop is 40 times slower.

But if you are running compiled C code you might lose another factor of 4

So a Prop is 160 times slower.

But if you are using Spin you might lose a factor of 100

So a Prop is 4000 times slower.

Then the Prop has no multiply or divide instructions, so you lose a lot more.
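
The chain of estimates above can be written out as a few lines of Python. The figures are the rough ones from this post; `PC_MIPS` is an assumption, picked so that the first ratio comes out at the quoted 20 times.

```python
# Rough figures from the post above. PC_MIPS is an assumed value, chosen so
# that the first step reproduces the quoted "20 times slower".
COG_MIPS = 20
COGS = 8
PC_MIPS = 3200  # assumption: about one instruction per clock at 3.2 GHz

def slowdown_chain():
    """Return (label, how many times slower than the PC) for each step."""
    steps = []
    ratio = PC_MIPS / (COG_MIPS * COGS)   # all 8 COGs at 100% efficiency
    steps.append(("8 COGs, ideal", ratio))
    ratio *= COGS / 4                     # realistic speed-up of about 4, not 8
    steps.append(("8 COGs, realistic", ratio))
    steps.append(("compiled C", ratio * 4))   # C loses another factor of 4
    steps.append(("Spin", ratio * 100))       # Spin loses a factor of 100
    return steps

for label, ratio in slowdown_chain():
    print(f"{label}: about {ratio:.0f} times slower")
```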

Honestly, how can you expect a 20MIPS micro-controller to come near a 3GHz PC?

By the way, don't have a downer on interpreted languages. Java and C# get their speed through compilation to bytecode and then Just In Time compilation of that bytecode to native instructions when they are run. JavaScript has no bytecode stage but it runs nearly as fast as native C++ in many applications.

samuell · 2016-04-12 10:28

Nice explanation Heater.

Yes, I would be expecting the Propeller to be much slower, but not that much. I'm still impressed by the speed, don't get me wrong. This is quite a powerful micro with a great architecture. But then I did some calculations, and the assumed difference ratio was not right. I would expect it to be 20 times slower, or a value like that. Having an Intel E6700 (Pentium) doesn't make my PC awesomely fast.

Heater. wrote: »

...
But if you are running compiled C code you might lose another factor of 4

So a Prop is 160 times slower.

But if you are using Spin you might lose a factor of 100

So a Prop is 4000 times slower.
...

So, basically, if I translate my C program to Spin I'm actually getting no advantage?

Kind regards, Samuel Lourenço

Heater. · 2016-04-12 10:49

The method by which the code compiled from C programs is run is called "Large Memory Model" (LMM). Basically a very tight loop fetches instructions from HUB and runs them in COG. The compilers have to take special care of things like calls and jumps as you are now jumping around a list of instructions in HUB rather than COG so the normal calls and jumps cannot be used.

But it's more complex than that. prop-gcc can identify short loops in your source code. When it finds them it can compile them to real native instructions rather than LMM. At run time it loads the entire compiled loop into the COG and runs it at full native speed. This is called "fcache" if you want to search for it here.

I wrote a Fast Fourier Transform in C which has an inner loop that can be fcached. The result is that the C version runs almost as fast as the hand crafted assembler version of the same algorithm! Not bad.
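
A back-of-envelope model shows why caching a loop pays off. This is only a sketch under stated assumptions: 4 clocks per in-COG instruction, LMM at 4 times that (the "about 4 times slower" figure from earlier in the thread), and a made-up cost of one LMM-sized step to copy each instruction into the COG. None of these numbers are measured from prop-gcc.

```python
NATIVE_CLOCKS = 4               # assumed cost of one in-COG instruction
LMM_CLOCKS = 4 * NATIVE_CLOCKS  # "about 4 times slower", per the thread
LOAD_CLOCKS = LMM_CLOCKS        # assumed cost to copy one instruction into the COG

def lmm_cost(loop_len, iterations):
    """Clocks to run the loop when every instruction is fetched from HUB."""
    return loop_len * iterations * LMM_CLOCKS

def fcache_cost(loop_len, iterations):
    """Clocks to copy the loop into the COG once, then run it natively."""
    load = loop_len * LOAD_CLOCKS
    return load + loop_len * iterations * NATIVE_CLOCKS

def speedup(loop_len, iterations):
    return lmm_cost(loop_len, iterations) / fcache_cost(loop_len, iterations)

for iterations in (1, 10, 1000):
    print(iterations, round(speedup(20, iterations), 2))
```

In this model a loop that runs only once is slower when cached, because the copy is never paid back. A loop that runs many times approaches the full factor of 4.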

I then went the next step and made my FFT use multiple COGS. I used OpenMP to get the compiler to split the loop and distribute the work to 2 or 4 COGs. Whilst this does speed things up it is by no means by a factor of 2 and 4 respectively. The more COGS you use the less effective parallelization becomes.
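
One common way to reason about this kind of diminishing return is Amdahl's law. To be clear, this is a textbook model brought in for illustration, not something measured on the FFT, and the 90% parallel fraction below is a made-up number. It also ignores contention for the HUB.

```python
def amdahl_speedup(parallel_fraction, cogs):
    """Ideal speed-up when only part of the work can be spread over COGs."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cogs)

# Hypothetical program in which 90% of the run time is parallelizable.
for cogs in (1, 2, 4, 8):
    print(cogs, round(amdahl_speedup(0.9, cogs), 2))
```

With these assumptions, 2 COGs give well under 2 times and 8 COGs stay below 5 times, because the serial part never shrinks.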

The code for that FFT is here: https://github.com/ZiCog/fftbench There is the same algorithm implemented in C, Spin, BASIC and other languages just for comparison's sake. The C version is pretty amazing I think, it gets a speed boost through fcache and another speed boost through OpenMP, all without having to change the code.

Bottom line. If you want fast, don't use Spin. Unless you are prepared to write the fast bits in PASM.

Peter Jakacki · 2016-04-12 11:14

I was surprised when I saw you were trying to run your code on the Prop over multiple cogs as if it were some kind of supercomputer. That it ain't. It's a microcontroller, that's right, not a CPU, which comes in a big, expensive, power-hungry package designed for speed and, even in the huge numbers it sells in, is still expensive. Microcontrollers, on the other hand, are the little chips that might be inside the keyboard, looking after simple but necessary tasks and relieving the host CPU, which does not at all like to have to stop crunching big numbers to "scan keys". The Prop is a bit more than that humble microcontroller, but a microcontroller nonetheless. It is very good at I/O tasks, but the cogs have very limited "cache" memory if you like: only 2kB versus at least 2MB on the Intel, which is running 160 times faster and has hardware floating point. I'm not sure what you are trying to do, but the Prop will do stuff the Intel can't, the stuff that we here are interested in. What is your background and interest in the Prop?

ersmith · 2016-04-12 12:20

samuell wrote: »

I've recently discovered that SimpleIDE supports the Spin language. I'm really new to that language, and I'm confused because it can be compiled (or translated to machine code) or interpreted on chip, depending on the reference I read. So, my question is, if I use SimpleIDE to program in Spin, will my code compile or be interpreted on the fly?

In general, there are 3 ways that Spin can be used:

(1) It can be converted to bytecodes which are then interpreted by a program in the ROM. This is by far the most common way Spin is used. Bytecode converters include Propeller Tool, openspin, bstc, and homespun. I don't use SimpleIDE myself, but from glancing at the manual it appears that it uses openspin to convert Spin to bytecode. It may be possible that you can change the bytecode compiler, in which case it might be possible to change SimpleIDE to use one of the methods below.

(2) It can be converted to C++ (or plain C) via a program called spin2cpp, and then compiled with PropGCC. This would kind of make sense for SimpleIDE, so it was my first guess as to how it works, but I don't see it mentioned in the manual.

(3) Recent versions of spin2cpp are able to output PASM code (or indeed executable binary) directly, providing yet another way to compile Spin to machine language. Code may be placed directly in a COG (if small enough) or executed in LMM mode from hub (see below).

Also, is C actually being compiled, or sort of interpreted?

Again, there are 3 choices:

(1) Very small programs may be compiled with -mcog, which produces direct machine code that runs in a single COG. The program must fit in the 2K memory space available.

(2) The default mode of operation is LMM. In this mode there is a sort of interpreter that fetches instructions from hub memory and executes them in COG memory. The instructions fetched are actually machine code, so LMM represents an in-between state between an interpreter and compiler; technically it's an interpreter, but it's "interpreting" machine code so it's very fast. As Heater mentioned above, even in LMM mode some small loops are executed entirely inside the COG via FCACHE.

(3) PropGCC also provides a mode called CMM which is more of a traditional interpreter. In CMM it outputs compressed versions of the machine instructions, which are then decoded via an interpreter running in the COG. There's a 1-1 correspondence between CMM instructions and actual PASM instructions, so the CMM interpreter can convert CMM codes into machine code relatively quickly, but the CMM codes take up more space than Spin bytecodes.

Regards,
Eric

David Betz · 2016-04-12 12:24

FYI, I believe CMM is the default memory model when using SimpleIDE.

Heater. · 2016-04-12 12:25

There has been a recurring theme around here for many years that the Propeller, with its 8 cores and 32 bits, is some kind of device from which a super computer can be built. Remember the "Big Brain"?

That kind of ignores its basic features, like its clock rate and MIPS and its lack of hardware multiply/divide. Never mind floating point and so on.

It also ignores some of its main attractions, like the tight coupling between cores and I/O.

In some regards the Prop is a "super computer": if what you want to do is "bit-banging" in a timing-deterministic way, it will run rings around any 3GHz Intel-equipped PC, or a super-computer-sized cluster of them.

Otherwise. No.

ersmith · 2016-04-12 12:28

I should also mention here that besides the 4 choices for execution environment that I discussed above (direct execution of machine code in COG mode, LMM mode, CMM interpreter, and ROM spin bytecode interpreter) there are other kinds of interpreters available for the Propeller:

(5) A "threaded" interpreter is very fast because its bytecodes are simply addresses of machine code routines to execute. Forth is usually implemented this way, and Tachyon Forth in particular is a very fast interpreter for the Propeller. Generally these are much faster than Spin bytecodes, but not as fast as LMM.

(6) There are also interpreters that implement instruction sets of other CPUs, e.g. a Z80 interpreter. These are slowest of all, but interesting because they allow us to run programs from other computers (including potentially compilers for the interpreted CPU).
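
The threaded interpreter in (5) can be sketched in Python, with function references standing in for the addresses of machine code routines. The routine names are hypothetical and this is not how Tachyon Forth is actually written; it only shows that the inner loop has no decode step.

```python
def run_threaded(program, stack):
    """Call each routine in turn; there is no opcode decoding step."""
    for routine in program:
        routine(stack)

def lit(n):
    """Build a routine that pushes the constant n."""
    return lambda stack: stack.append(n)

def add(stack):
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)

def mul(stack):
    b, a = stack.pop(), stack.pop()
    stack.append(a * b)

# (2 + 3) * 4 as a list of routine references rather than opcode numbers.
stack = []
run_threaded([lit(2), lit(3), add, lit(4), mul], stack)
```

Compared with a bytecode interpreter that looks up each opcode, the "program" here is already a list of things to call, which is where the speed comes from.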

samuell · 2016-04-12 12:53

Peter Jakacki wrote: »

I was surprised when I saw you were trying to run your code on the Prop over multiple cogs as if it were some kind of supercomputer. That it ain't. It's a microcontroller, that's right, not a CPU, which comes in a big, expensive, power-hungry package designed for speed and, even in the huge numbers it sells in, is still expensive. Microcontrollers, on the other hand, are the little chips that might be inside the keyboard, looking after simple but necessary tasks and relieving the host CPU, which does not at all like to have to stop crunching big numbers to "scan keys". The Prop is a bit more than that humble microcontroller, but a microcontroller nonetheless. It is very good at I/O tasks, but the cogs have very limited "cache" memory if you like: only 2kB versus at least 2MB on the Intel, which is running 160 times faster and has hardware floating point. I'm not sure what you are trying to do, but the Prop will do stuff the Intel can't, the stuff that we here are interested in. What is your background and interest in the Prop?

Hi Peter,

I'm studying the possibility of applying this chip in devices that require some analytical computing power, but nonetheless are easy to program (don't require an OS like the Intel), are self contained and can run embedded. I think the Propeller is great for those applications. It is remarkably fast (never dared to do something like this on a PIC) and still fairly easy to program.

The prime number calculation program was just an evaluation program. I didn't intend to do something useful to the end user. However, it has been useful already in the most unexpected ways. I now use it to test the stability of my projects (especially when the Prop is overclocked). I'm already tinkering with my board in order to improve the overvoltage protections (which, in the meantime, make any overclocking impossible, as the voltage doesn't pass over 3.6V, and the ideal would be 3.8V).

Heater. wrote: »

...
But it's more complex than that. prop-gcc can identify short loops in your source code. When it finds them it can compile them to real native instructions rather than LMM. At run time it loads the entire compiled loop into the COG and runs it at full native speed. This is called "fcache" if you want to search for it here.

I wrote a Fast Fourier Transform in C which has an inner loop that can be fcached. The result is that the C version runs almost as fast as the hand crafted assembler version of the same algorithm! Not bad.
...

Well, heater, it seems I have to try that "fcache" option. It didn't run well last time, but I forgot that I had the Prop overclocked at 128MHz (I know it is insane - long story).

Kind regards, Samuel Lourenço

Heater. · 2016-04-12 12:58

ersmith,

(including potentially compilers for the interpreted CPU)

There is no "potentially" about it. Leor Zolman's BDS C compiler runs fine on the Propeller under the ZiCog Z80 emulator running CP/M on the Propeller. And no doubt under PullMoll's qz80 emulator as well. Still the first and only C compiler that runs on the Propeller!

DavidZemon · 2016-04-12 13:02

I'm half asking other members of the forums, and half suggesting to you samuell, that perhaps the Propeller isn't the machine you're looking for. If you want to off-load math-heavy operations from your main embedded processor, I think anything with hardware multiply/divide or even hardware floating point would serve you much better. With a Propeller coming in at $8, it's quite expensive as a math co-processor. Just as a reference, you have the Pi Zero running at 1 GHz with hardware floating point and 512 MB of RAM, GPU, etc etc. If you can get all that for $5 (bulk price of course), imagine how cheap the chip itself must be, and what you could do with a chip more appropriately designed for your use case.

samuell · 2016-04-12 13:50

DavidZemon wrote: »

I'm half asking other members of the forums, and half suggesting to you samuell, that perhaps the Propeller isn't the machine you're looking for. If you want to off-load math-heavy operations from your main embedded processor, I think anything with hardware multiply/divide or even hardware floating point would serve you much better. With a Propeller coming in at $8, it's quite expensive as a math co-processor. Just as a reference, you have the Pi Zero running at 1 GHz with hardware floating point and 512 MB of RAM, GPU, etc etc. If you can get all that for $5 (bulk price of course), imagine how cheap the chip itself must be, and what you could do with a chip more appropriately designed for your use case.

Hi David,

I'm afraid I was misunderstood here. I'm not looking for a co-processor. Of course the Propeller doesn't support floating point or multiply/divide in hardware, but nonetheless it is a favourite so far (or I just say that because I'm in love with it). I'm using it as the main processor. Nevertheless, for an oscilloscope or something like that I may have to look for another micro (or maybe use an FPGA, but I have to learn Verilog, yuck). I don't imagine that the P1 can acquire data from a 500MSPS ADC and, furthermore, oversample the data in real time. Maybe the P2 does.

By the way, will the P2 support hardware floating point operations, as well as multiplications and divisions? I can always use a math co-processor. This one seems to be interesting, as it can communicate with the Propeller via SPI. I think this was somewhere indicated in the forum, but I don't recall.

Kind regards, Samuel Lourenço

Heater. · 2016-04-12 14:05

samuell has been into finding prime numbers. The Prop is of course hopeless at that. Not enough mathematical grunt and not enough memory to work in. Given the biggest prime known, found this year, is 22 million digits long (wide?).

samuell · 2016-04-12 14:14

Heater. wrote: »

samuell has been into finding prime numbers. The Prop is of course hopeless at that. Not enough mathematical grunt and not enough memory to work in. Given the biggest prime known, found this year, is 22 million digits long (wide?).

That, not even with my CPU. I would run into several problems, being:
- I would have to find a way of creating a new type, as long long wouldn't cut it;
- Memory wouldn't cut it;
- Using the disk drive would be too slow, too hard on the drive and wouldn't cut it.

Now, replacing the Propeller with a CPU, as someone suggested:
- Would have to integrate the CPU with all other peripherals and memory - Propeller has that implemented;
- PCB hi-speed design is better left for the professionals - I'm not one;
- Would not be embedded nor easy to program (if you consider Raspberry Pi to be embedded, it is not, it is a PC like any other).

So, given the above, I think the Propeller is just ideal. My idea is not to surpass my PC (that would be absurd of me), but to make the most of it. If I can run faster with the same hardware, then why not?

Kind regards, Samuel Lourenço

Heater. · 2016-04-12 14:26

samuell,

What is it you actually want to do?

We would like to offer suggestions but without a goal in mind it's impossible.

For example, if prime numbers is the thing then it's just as well to pre-calculate them and store them on SD card or whatever for the program on the Prop to use as required.
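
As a rough sketch of that pre-calculation idea, something like the following could run on the PC. The sieve of Eratosthenes and the table layout (32-bit little-endian words) are choices made for this example only; nothing in the thread says how samuell's program finds its primes or how the Prop side would read the file.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes: return all primes <= limit."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    n = 2
    while n * n <= limit:
        if is_prime[n]:
            # Strike out multiples of n, starting at n * n.
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = 0
        n += 1
    return [i for i, flag in enumerate(is_prime) if flag]

def pack_table(primes):
    """Pack primes as 32-bit little-endian words (an assumed file layout)."""
    return b"".join(p.to_bytes(4, "little") for p in primes)

table = pack_table(primes_up_to(30))
```

The resulting bytes could then be written to a file on the card, and the program on the Prop would only need to read longs back rather than compute anything.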

But I doubt that is what you are aiming for.

The Raspberry Pi can certainly be "embedded". It may well be a full up Linux running machine (usually) but if I embed all that in some system where it just does whatever I programmed it to do and the user does not even know it's there, then that is an embedded system. I have built many such Linux based embedded systems starting about 15 years ago. The Pi is a new kid on the block.

samuell · 2016-04-12 14:38

Heater. wrote: »

samuell,

What is it you actually want to do?

We would like to offer suggestions but without a goal in mind it's impossible.

For example, if prime numbers is the thing then it's just as well to pre-calculate them and store them on SD card or whatever for the program on the Prop to use as required.

But I doubt that is what you are aiming for.

The Raspberry Pi can certainly be "embedded". It may well be a full up Linux running machine (usually) but if I embed all that in some system where it just does whatever I programmed it to do and the user does not even know it's there, then that is an embedded system. I have built many such Linux based embedded systems starting about 15 years ago. The Pi is a new kid on the block.

Hi heater,

Now you've asked the right question. I don't have a goal. I'm just studying what this puppy can do. I am not a programmer, nor an electrical engineer, but I do like to venture into electronics. It wouldn't be challenging to me to use a board that was already made. A big part of the fun consists in designing my PCBs. Actually, I learn far more from doing it than from seeing others doing it, or studying what others do. Having said that, a CPU is out of reach for me because it involves too much technical stuff and hi-speed design concepts that I don't have, nor am I capable of (nor will I be, I think, but if I find time that can change).

On the other hand, FPGAs require knowledge that I don't have, nor will I. I know nothing about micro-electronics, nor will I invest in that field, I'm certain (micro-electronics is boring and I prefer to be shot in the head, sorry if offence is taken for this choice of mine, but I hate physics). I'm more an analog and low speed type of guy, because that is what I like. I ventured into the Propeller out of curiosity, but I'm now finding it very useful. That's it.

Kind regards, Samuel Lourenço

Publison · 2016-04-12 15:42

samuell wrote: »

By the way, will the P2 support hardware floating point operations, as well as multiplications and divisions? I can always use a math co-processor. This one seems to be interesting, as it can communicate with the Propeller via SPI. I think this was somewhere indicated in the forum, but I don't recall.

Kind regards, Samuel Lourenço

I'm afraid that FPU will be hard to source. The owner of the company died of cancer a couple of months ago, bad news, he was a one man operation. I have been trying to find a source for months. The only stock seems to be in France, and they will not ship out of the EU.

EDIT: They MAY ship to PORTUGAL.

samuell · 2016-04-12 15:47

Publison wrote: »

I'm afraid that FPU will be hard to source. The owner of the company died of cancer a couple of months ago, bad news, he was a one man operation. I have been trying to find a source for months. The only stock seems to be in France, and they will not ship out of the EU.

EDIT: They MAY ship to PORTUGAL.

Hi Publison,

Sorry to hear that. I can't find any equivalents. Since the chip is discontinued, it would not be good for new designs, but still a very interesting option for development.

I'm in Portugal, yes. Where can I source those chips?

Kind regards, Samuel Lourenço

Publison · 2016-04-12 16:04

samuell wrote: »

Publison wrote: »

I'm afraid that FPU will be hard to source. The owner of the company died of cancer a couple of months ago, bad news, he was a one man operation. I have been trying to find a source for months. The only stock seems to be in France, and they will not ship out of the EU.

EDIT: They MAY ship to PORTUGAL.

Hi Publison,

Sorry to hear that. I can't find any equivalents. Since the chip is discontinued, it would not be good for new designs, but still a very interesting option for development.

I'm in Portugal, yes. Where can I source those chips?

Kind regards, Samuel Lourenço

http://www.lextronic.fr/R1601-micromega-corporation.html

samuell · 2016-04-12 16:33

Thanks, Publison!

Heater. · 2016-04-12 18:29

samuell,

I admire your curiosity. There is no way we can all know everything about computer science, electronics, mathematics, etc, etc. Each one of which is a huge subject area. We all play with what attracts us.

BUT. What's up with Physics?

The basis of everything. Trying to understand the way the world works. What geek, of any discipline, could not be fascinated by that?

DavidZemon · 2016-04-12 18:43

Heater. wrote: »

samuell,

I admire your curiosity. There is no way we can all know everything about computer science, electronics, mathematics, etc, etc. Each one of which is a huge subject area. We all play with what attracts us.

BUT. What's up with Physics?

The basis of everything. Trying to understand the way the world works. What geek, of any discipline, could not be fascinated by that?

Being fascinated, and being fascinated enough to make it through all the math required for a proper understanding, are two different things. I'm with samuell on this one.

Heater. · 2016-04-12 19:09

There is something wrong here.

Physics is not about the maths. Physics is about experiment, observation of what the stuff around us actually does, reality. Kids in primary school can start to investigate that with no maths required.

Sure, your Newtons and Einsteins come along and discover a ton of maths that happens to miraculously model what reality does. But, for example, the Faradays of this world had no idea about such maths when they were playing with electricity and magnets. The maths came later with Maxwell.

To paraphrase Newton: "Sure, I have the equations of motion and gravity, I have no frikken idea why they work"

Wish I could find the actual quote now.

So much for the "proper understanding". There is no such thing.

samuell · 2016-04-12 19:40

Heater. wrote: »

samuell,

I admire your curiosity. There is no way we can all know everything about computer science, electronics, mathematics, etc, etc. Each one of which is a huge subject area. We all play with what attracts us.

BUT. What's up with Physics?

The basis of everything. Trying to understand the way the world works. What geek, of any discipline, could not be fascinated by that?

Well, I'm fascinated with the practical aspect of things. But explaining the world around us by equations doesn't fit the equation to me. That is why I love electronics but hate micro-electronics. There is no rationale for me to love it.

Heater. · 2016-04-13 07:27

I always loved the practical side of Physics. So many experiments to do and weird phenomena to observe and measure. And by extension electronics and so on.

I was also fascinated by mathematics. Although never very good at it.

I must admit that when it came to university level physics the maths content pretty much overwhelmed me. It just kept coming, more and more, every day, for four years, far too much for my tiny mind to absorb. Perhaps I did not pick and choose what I should concentrate on wisely.

Then at some point you find you are doing a lot of maths but don't get to do the experiments. Like we didn't have a nuclear reactor or particle accelerator to play with! So it all gets to look very abstract and floats out of one's mind.

It amazes me that I actually managed to graduate at all. The end result of that overload is that I forgot pretty much all of it the day after graduating. Still remember the experiments though. I mean, how do you weigh a single electron?

Anyway, maths equations don't explain anything. Newton is famous for such things as F = m * a and F = G * M * m / (r * r). These are just a model of what goes on that enables one to make useful predictions about various situations. Newton also said:

"I have not yet been able to discover the cause of these properties of gravity from phenomena and I feign no hypotheses..."

Which is to say, here are some equations that work, I have no idea why.

It's not clear how one does even basic electronics without some equations. Ohm's law, impedance of capacitors and inductors, resonant frequency of L/C tanks etc, etc.
