Several Newbie Questions
Chuck Davis
Posts: 23
I'm new to the Propeller (although I tried it for a while many years ago). I was wondering if someone could answer some minor questions for me:
1) On most 32-bit processors, 32-bit operations are faster than byte or word operations. Is this true of the SPIN language? That is, given that memory is not an issue, should I prefer long operands wherever possible?
2) Does the SPIN interpreter fit entirely into a cog, or does it use some sort of overlay scheme? If the latter, are there any performance implications we should know about?
3) Is there a rough number of some sort available for SPIN interpreter performance (instructions per second) ? Obviously this depends on complexity, just looking for a ball park... At what point do I have to resort to PASM?
4) Isn't it possible to have memory collisions even when updating a single variable, if two cogs are trying to access it? Examples seem to talk about multiple variables, but even an operation like X=X+1 should be a problem if multiple COGs are sharing the variable.
5) Just a small gripe: Why in the world is the assignment statement := instead of just =? This is the source of dozens of recompiles, especially when moving back and forth between SPIN and C.
Love the system! Once I get used to it I'll probably love it more....
Chuck
Comments
2) Yes, the Spin interpreter fits entirely in a cog.
3) The Spin interpreter runs around 1 MIPS. You use PASM when Spin isn't fast enough.
4) Yes, an operation such as X:=X+1 could collide if two cogs are doing it. That's why the Prop supports locks.
5) Spin uses := because that's what Chip wanted. There's no other reason. When going from C to Spin watch out for >= and <=. They do something different in Spin: they are the assignment forms of > and < (X >= Y means X := X > Y), and the comparisons are written => and =<.
By the way, how is 8-bit access on any 32-bit CPU slower? Any pointers to examples?
If these things are an issue for your application you are going to want to use PASM.
Yes, multiple COGs accessing the same data can be an issue. Locks can resolve this, but you will find that most applications get away without them. If you have one writer and one reader you can get by without locks. The FullDuplexSerial object with its input and output cyclic buffers is a classic example of that.
":=" comes from the ancient and long-forgotten Pascal programming language, the language used to create the Propeller Tool. I guess it's what Chip likes to use.
Those other operators are some weird aberration nobody can explain:)
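To illustrate the one-writer, one-reader point above, here is a minimal sketch of a cyclic buffer in C. It is not the FullDuplexSerial code; the names `ring_t`, `ring_put`, `ring_get` and the size `RING_SIZE` are made up for the example. The idea is that each index is written by exactly one side, so no read-modify-write is ever shared.

```c
#include <stdint.h>
#include <stdbool.h>

#define RING_SIZE 16u               /* hypothetical size; must be a power of two */

typedef struct {
    volatile uint32_t head;         /* written only by the producer */
    volatile uint32_t tail;         /* written only by the consumer */
    uint8_t data[RING_SIZE];
} ring_t;

/* Producer side: returns false if the buffer is full. */
bool ring_put(ring_t *r, uint8_t byte)
{
    uint32_t next = (r->head + 1u) & (RING_SIZE - 1u);
    if (next == r->tail)
        return false;               /* full: one slot is always kept empty */
    r->data[r->head] = byte;        /* store the data first ... */
    r->head = next;                 /* ... then publish it with a single write */
    return true;
}

/* Consumer side: returns false if the buffer is empty. */
bool ring_get(ring_t *r, uint8_t *out)
{
    if (r->tail == r->head)
        return false;               /* empty */
    *out = r->data[r->tail];
    r->tail = (r->tail + 1u) & (RING_SIZE - 1u);
    return true;
}
```

This assumes aligned 32-bit reads and writes are indivisible and become visible in program order, which is a reasonable assumption for hub memory on the Propeller. On a modern multi-core desktop CPU, `volatile` alone would not be enough and proper atomics or barriers would be needed.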
2) Yes
3) No
4) Yes; use of locks is required for synchronization as the memory subsystem has no read-modify-write primitives
5) C has perhaps the worst syntax and most ambiguous semantics of any language, ever. The use of := for assignment predates "C" by many, many years. The use of the == operator and the errors that result when one uses = in value context are absurd shortcomings of both C and C++. The better question is why does Spin use == for its equality operator instead of simply =.
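On point 4 above, a rough sketch of what "use a lock around the read-modify-write" looks like. This is C11, not Spin: `atomic_flag_test_and_set` stands in for Spin's LOCKSET (both return the previous state of the lock) and `atomic_flag_clear` stands in for LOCKCLR. The names `counter_lock`, `shared_x`, `increment_shared` and `read_shared` are invented for the example.

```c
#include <stdatomic.h>
#include <stdint.h>

static atomic_flag counter_lock = ATOMIC_FLAG_INIT;  /* hypothetical lock guarding shared_x */
static uint32_t shared_x = 0;                        /* the variable several workers update */

/* X := X + 1, made safe by holding the lock for the whole read-modify-write. */
void increment_shared(void)
{
    while (atomic_flag_test_and_set(&counter_lock)) {
        /* lock was already set by someone else: spin until it is released */
    }
    shared_x = shared_x + 1;         /* read, add, write back: nobody can interleave here */
    atomic_flag_clear(&counter_lock);
}

/* Take a consistent copy of the shared value under the same lock. */
uint32_t read_shared(void)
{
    while (atomic_flag_test_and_set(&counter_lock)) {
        /* spin */
    }
    uint32_t copy = shared_x;
    atomic_flag_clear(&counter_lock);
    return copy;
}
```

Without the lock, two workers can both read the same old value, both add one, and both write back, so one increment is lost.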
Dave;
I'm not sure where these numbers come from.... my experience is that Spin instructions are much slower than that.... often in the 20 to 50 usec range.
For a Prop running at 80 MHZ, I measure times for the following code as :
So, a very simple instruction is around 10 usec; 10 times what you are saying.
What gives here ?
Cheers,
Peter (pjv)
The Spin interpretive codes take on the order of a couple of microseconds to execute. The Spin source examples you cited consist of several operations. It's like stating instruction execution times, but giving examples in C source code. You have to look at the compiler generated code.
I ran a dumb test
repeat 1_000_000
and got 16.3 seconds or about 61,000 instructions per second
other basics
Counter ++ = 40,000 instructions per sec
Counter := Counter +1 = 21,400 instructions per second (pays to use ++, I guess)
Counter := Counter * 3 = 10,700 instructions per second
Of course if you have 8 cogs going, you can multiply by 8, if you can divide your work up that way.
As far as byte vs. long, I read somewhere that (at least in C) the compiler has to convert the byte to a long, do the math, and convert it back to a byte. Thus the idea that it's faster to just use longs in the first place.
Just for fun, I ran the above test for Counter := Counter +1 with Counter as a byte instead of a long, and it only ran at 15,200 instructions per second, so apparently there is some overhead
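The rates above are easier to compare with the "20 to 50 usec" figures earlier in the thread once they are turned into time per loop iteration. A small sketch of that arithmetic in C follows; the function names are made up, and the 80 MHz clock used in the note below is taken from Peter's post, since this test does not state its clock.

```c
#include <stdio.h>

/* Loop iterations per second, from an iteration count and a measured time. */
double rate_per_second(double iterations, double seconds)
{
    return iterations / seconds;
}

/* Microseconds spent on each iteration at a given rate. */
double usec_per_iteration(double per_second)
{
    return 1.0e6 / per_second;
}

/* System clocks spent on each iteration; clock_hz is an assumed clock frequency. */
double clocks_per_iteration(double per_second, double clock_hz)
{
    return clock_hz / per_second;
}
```

With these, 1,000,000 iterations in 16.3 seconds is about 61,350 per second, or 16.3 usec each. 40,000 per second is 25 usec each, which would be 2,000 clocks at an assumed 80 MHz. 21,400 per second is roughly 47 usec and 10,700 per second roughly 93 usec, so these numbers sit in the same range as Peter's measurements.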
Thanks for the comments.
I do understand that the SPIN interpreter executes its internal instructions at the native speed of the processor: 20 MIPS (at an 80 MHz clock). And it takes several, perhaps many, of those to effect a SPIN code statement. So any time I thought of the speed of the SPIN "language", I was not thinking of the sub-statement level, as that is not terribly meaningful, at least to me. I'm interested in how fast SPIN will execute the code I'm writing, and I presumed that was the nature of the poster's question, as he was considering using assembler.
Cheers,
Peter (pjv)
Thanks for your input.
I really don't have a very good understanding of the complexities that SPIN deals with in executing byte codes. Or for that matter how many of those it takes to effect any particular SPIN "instruction", or what the meaning of an "extended instruction" is. So to me it's all a matter of how fast I can wiggle the port bits, or how fast I can achieve a certain result.
So I repeated my small test using a simple assignment statement instead of the previous XOR function, and got the following:
That certainly is a bunch faster. Still, when writing SPIN code, all I can do is allow 20-ish usec per relatively simple statement, and 100 or more for complex ones. In the end, I need a scope to see what is really going on, as I have no way to "cycle count" or "bytecode count" in SPIN. I'm just not at that level.
Cheers,
Peter (pjv)
If you're already familiar with C, you might feel more at home with PropGCC - the port of GCC 4.6 for the Propeller.
The only example I've seen where X-bit access is slower than 2X-bit access comes from an assignment such as "mySmallVar = myBigVar;" - the compiler will mask off the extra bits, requiring an extra instruction or two.
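For what it's worth, here is a small C sketch of that narrowing case, plus the byte-math case mentioned earlier in the thread. The function names are invented for the example. Whether the masking actually costs an extra instruction depends on the compiler and target, so treat this as showing the semantics only, not the timing.

```c
#include <stdint.h>

/* "mySmallVar = myBigVar;" : converting to uint8_t keeps only the low 8 bits. */
uint8_t narrow_implicit(uint32_t big)
{
    uint8_t small = (uint8_t)big;    /* value is reduced modulo 256 */
    return small;
}

/* The same result written as the explicit mask the compiler has to apply. */
uint8_t narrow_masked(uint32_t big)
{
    return (uint8_t)(big & 0xFFu);
}

/* Byte arithmetic in C: both operands are promoted to int, added at full
   width, and the sum is narrowed again when stored back into a byte. */
uint8_t add_bytes(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}
```

For example, `narrow_implicit(0x12345678)` gives `0x78`, and `add_bytes(200, 100)` gives 44, because the intermediate sum 300 does not fit in a byte.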