Re: Why no SPIN compile to assembly.
potatohead
Posts: 10,261
Edit: @VIRAND --Thought it might work on its own thread!
Assembly runs in the COG only, unless one runs a VM to interpret assembly instructions in the HUB memory. The SPIN interpreter fits into a COG, and it runs a SPIN program that exists in the HUB memory space.
So, SPIN, compiled to assembly, would be limited to small programs that fit into COGS.
Or...
More development would have been needed to enable LMM (hub-resident assembly code), run in a fashion similar to how SPIN code is currently executed. An assembly kernel, running on a COG, would handle executing the LMM code, just as the SPIN interpreter does for SPIN byte code.
This is also the reason why no "in-line" assembly constructs are possible within SPIN. There is no context for them.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Propeller Wiki: Share the coolness!
Post Edited (potatohead) : 1/4/2008 3:33:53 AM GMT
Comments
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Paul Baker
Propeller Applications Engineer
Parallax, Inc.
PASM doesn't have a terminal console either so why is PASM there ? In fact, why does the Propeller chip even exist ?
The answer is that neither PASM nor Spin need to have a terminal console to perform useful work and it is easy enough to add one if required ( it's usually easiest with Spin ).
The Propeller Tool could compile to PASM rather than bytecode but then it could only compile programs small enough to fit into a Cog's memory. By compiling to bytecode much larger programs can be created and executed.
The trade-off is: PASM - small programs but fast, Spin - larger programs but slower.
Spin implies to me that it uses cogs, when in the case of long programs it may not,
especially most programs I've written thus far. It seems to me that ...
OK, I just realized something: cogs have their own extra 2K of memory apart from the 32K.
... it seemed to me that otherwise a long program that wouldn't fit in one cog would
shut down after starting up the next cog where the following code is, unless it must
loop back across the boundary of a cog. I guess it doesn't work that way at all.
Someone wrote a short 'trace program' (as I anachronistically call it) which runs in a
cog and pulls instructions out of hub ram and single step executes them, which is not
interpretation and should only be a little slower than normal PASM execution in a cog.
I don't know what the trace program was called except perhaps a virtual machine that
runs native code. Ahh, yes, it was called LARGE MEMORY MODEL (LMM).
Again sorry for upsetting with my question about SPIN. I use SPIN much more than PASM,
but just wondered why it interprets instead of assembles. I understand the answer given
is that it's more convenient for large programs.
Maybe off-topic again, but related: someone wrote an on-Propeller assembler here. I can't get the keyboard
to work, so I'll find that topic and ask how, expecting a problem with the different pins used
by my keyboard possibly being incompatible with their keyboard driver because they aren't adjacent.
Sorry for wasting space here thinking out loud about this. Propeller is awesome and I don't want
anyone mad at me for simple misunderstandings. One thing that's awesome is so much can be on
the chip, and perhaps the reason it isn't is because people got excited about that after it was done.
It's probably coincidence but I might have posted a couple years ago about the propeller that when
I heard how fast it was going to be, it should make its own video. Nevertheless I was surprised to
find it has 8 cogs which each have that capability just before I started using Propellers. Peace and Joy!
Do not worry about that, I am sure no one is mad at you.
Spin works in a similar manner to Large Memory Model. The small Spin Interpreter resides in the Cog ( like that 'trace program' ) and interprets the Spin Bytecode from outside. It is a very elegant solution but can suffer from limited execution speed.
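To make the "small interpreter in the Cog, program outside" idea concrete, here is a minimal Python sketch of a fetch-and-dispatch loop. It is only an illustration: the opcodes (`PUSH`, `ADD`, `STORE`, `HALT`) are hypothetical and are not real Spin bytecodes, and a `bytearray` stands in for hub RAM.

```python
# Hypothetical opcodes for illustration only; these are NOT real Spin bytecodes.
PUSH, ADD, STORE, HALT = 0x01, 0x02, 0x03, 0xFF


def run(hub, pc=0):
    """Interpret bytecode held in `hub`, a bytearray standing in for hub RAM.

    Only this loop would need to fit in the cog; the program it runs can be
    as large as the hub memory allows.
    """
    stack = []
    while True:
        op = hub[pc]              # fetch one opcode byte from "hub RAM"
        pc += 1
        if op == PUSH:            # PUSH <byte>: push an 8-bit constant
            stack.append(hub[pc])
            pc += 1
        elif op == ADD:           # ADD: pop two values, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == STORE:         # STORE <addr>: pop a value into hub[addr]
            hub[hub[pc]] = stack.pop() & 0xFF
            pc += 1
        elif op == HALT:          # HALT: stop and hand back the memory image
            return hub
        else:
            raise ValueError(f"unknown opcode {op:#x} at {pc - 1}")


# Program: hub[15] = 2 + 3, followed by 8 bytes of zeroed "data" space.
program = bytearray([PUSH, 2, PUSH, 3, ADD, STORE, 15, HALT]) + bytearray(8)
run(program)
```

An LMM kernel has the same shape, except that the word fetched from hub memory is (mostly) a native instruction that the kernel executes directly instead of decoding it through a dispatch chain like the one above.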
The unusual architecture of the propeller can be looked upon in at least two ways:
(A) It is a computer with 8 very flexible, nano-programmable "functional units". A standard (von Neumann) computer contains dedicated functional units such as "control unit/instruction decoding", "I/O", "ALU", and "floating point unit"; modern processors also have many of them, so they can sometimes perform more than one instruction per clock ("superscalar"). From this point of view it is absolutely natural to define a (virtual) machine language that is executed by a control unit. This Nano Program is called the SPIN interpreter, but you can easily (as proven by Hippy and at least one FORTH author) write your own. SPIN does not depend on other functional units. It behaves as if all processing in a computer were performed by the "control unit" alone, which would be an extreme waste of resources!
An extended version of such an "interpreter" would of course delegate some lengthy operations to another COG. This is especially useful for complex I/O operations.
However, scheduling of concurrent work is no easy task, neither in real processors nor in the Propeller environment. So the "SPIN way" is this: you must point the interpreter to such possibilities explicitly with COGNEW instructions.
Putting it all together: something like SPIN - a virtual machine - is very natural for an architecture like the Propeller. I think Hippy's recent activities have shown that the power and beauty of the Propeller have hardly been explored: defining your own optimized VM for your specific application!
The concept behind the LMM model is to use a VM 90% similar to the "Nano Code" of the functional units (PASM). This concept profits from their native speed in the case of that 90% match.
It is still unclear - and a very exciting question! - how large the average speed-up wrt SPIN (or another "interpreted" language as FORTH) will be. Anyhow you will pay a large penalty wrt memory, as a PASM instruction is rather redundant (this can easily be measured by "zipping" a binary file full of 32k machine code).
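The "zipping" measurement mentioned above can be sketched in a few lines of Python using the standard `zlib` module. The data here is a stand-in (one 32-bit word repeated to fill 32K), not a real Propeller binary, and `compression_ratio` is just an illustrative helper name.

```python
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size / original size; lower means more redundancy."""
    if not data:
        raise ValueError("empty input")
    return len(zlib.compress(data, 9)) / len(data)


# Stand-in image (an assumption, not real machine code):
# a single 32-bit word repeated until 32K is filled.
word = (0xA0BC0001).to_bytes(4, "little")
image = word * (32 * 1024 // 4)
ratio = compression_ratio(image)
print(f"compressed to {ratio:.2%} of original size")
```

To try it on an actual binary, read the file with `open(path, "rb").read()` and pass the bytes to `compression_ratio`. Keep in mind that the result measures compressibility under one particular compressor, which is only a rough proxy for redundancy.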
(B) A complementary (and more conventional) view shows the Propeller as 8 fully staffed microcomputers plus a large (as it were) fast (as it were) accessible secondary memory. This is the "non VM approach".
A "real program" must fit into one COG. Period. If not you have to use overlay techniques, which however becomes difficult when the overlay loading logic has to reside side by side with the overlays themselves in a limited memory. The second challenge is linear data memory access.
This could be done manually in machine language with support from a clever IDE - in fact deSilva is working on that from time to time. The memory penalty is the same as with LMM; the speed is also unclear, as the wait for "overlay load operations" can add up to the same order of magnitude as the LMM overhead. There are more awkward constraints, e.g. only tight loops will be possible...
As you see this is all very political, compromise upon compromise.
ImageCraft has decided - and most likely for good reasons - to use the LMM concept. After some provisos I think this is a wise move: they will profit directly from the enhancements of the Propeller II.
It is most important to be able to run programs from a rich library environment at competitive speed on the Propeller.
Post Edited (deSilva) : 1/5/2008 3:06:27 AM GMT
The same goes for C vs spin. There's a speed/size trade off. C code will be faster than Spin. But you'll be able to squeeze more functionality into 32K with Spin than C.
Unless someone does a byte code version of C. Which might come close to the size/speed balance of Spin.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Help to build the Propeller wiki - propeller.wikispaces.com
Play Defender - Propeller version of the classic game
Prop Room Robotics - my web store for Roomba spare parts in the UK
Spin bytecode opcodes are extremely well optimised with most bytecodes being single or two byte instructions and most operations being reasonably complex. Even after taking the fact that Spin is a stack machine architecture out of the equation, PASM requires two longs to load a constant greater than $1FF and 'register indirect' requires a not inconsiderable sequence of PASM instructions. Something like 'a[ b ] := true' would usually be three bytes of bytecode but I estimate around six PASM instructions / Cog memory cells, a ratio of 1:8. Some Spin constructs, mainly branches, give better ratios and can drop as low as 1:2, maybe lower. My feeling is that the average ratio is 1:6.
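The 1:8 figure follows from each PASM instruction occupying one 32-bit long. A small Python check of that arithmetic, using the estimates from the post (three bytes of bytecode, six PASM instructions); `density_ratio` is an illustrative helper name:

```python
BYTES_PER_LONG = 4  # one PASM instruction occupies one 32-bit long


def density_ratio(bytecode_bytes: int, pasm_longs: int) -> float:
    """How many times larger the PASM form is than the bytecode form."""
    return (pasm_longs * BYTES_PER_LONG) / bytecode_bytes


# Estimates quoted in the post: 3 bytes of bytecode vs about 6 PASM longs.
ratio = density_ratio(bytecode_bytes=3, pasm_longs=6)
print(f"1:{ratio:g}")  # 3 bytes against 24 bytes
```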
Compared to a traditional microprocessor architecture the Propeller could be said to be lacking; just 496 in-Cog instructions for maximum processing speed with a notable lack of register indirect instructions, 8K instructions using LMM at around four-times slower speed, 16K with a Thumb-style LMM and another reduction of speed, or 24K ( my guesstimate ) of Spin instructions at the slowest speed. That doesn't mean the Propeller is as lacking as it may sound in practice, and the Prop II will help resolve some issues. The unchanged Cog design will mean the Propeller will never be as fast as it could be with a larger direct executable memory, but that doesn't mean it's necessarily less suitable in the real world either.
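The instruction counts above can be reproduced from the memory sizes. This Python sketch assumes the usual Propeller 1 figures (32K of hub RAM, 512 longs per Cog with 16 of them taken by special-purpose registers) and treats the LMM numbers as upper bounds that ignore space needed for data. The 24K Spin figure is the poster's guesstimate and is not derived here.

```python
HUB_RAM_BYTES = 32 * 1024   # hub RAM size assumed for the Propeller 1
COG_LONGS = 512             # cog RAM size in 32-bit longs
SPECIAL_REGISTERS = 16      # longs reserved for special-purpose registers

in_cog = COG_LONGS - SPECIAL_REGISTERS   # native instructions per cog
lmm_32bit = HUB_RAM_BYTES // 4           # LMM with full 32-bit instructions
lmm_16bit = HUB_RAM_BYTES // 2           # "Thumb-style" 16-bit instructions

print(in_cog, lmm_32bit, lmm_16bit)
```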
Where any C ( or other ) compiler targets itself for generated code, and consequently its usefulness and value, remains to be seen. Some may be looking for compilers which can generate fastest in-Cog code to avoid coding in PASM, others may be looking for compilers which can generate large applications and are less interested in speed. A compiler which can do both is the ideal.
The second important thing is that you can finally use data structures.
A well written algorithm in a well compiled language for a well designed machine architecture should have no redundancy at all.
If some PASM code happens to generate a lot of zero bits in each 32-bit instruction, that should get tremendously well compressed compared to the same PASM code which compresses worse simply because different register addresses were used which mashes up the 'lots of zero bits' advantage.
If two identical ( other than address location ) PASM programs don't give the same "information density" result then the whole theory seems to fall to pieces. It's measuring compressibility not anything else.
But I see some pitfalls here. A well known means for "information hiding" is randomization. A compiler can easily and unknowingly generate "noise" by - e.g. - always selecting different registers for intermediate results rather than always using the same - free - register. This will look like a more complex algorithm to a compressor. But I think such things will average out when the program consists of some thousand instructions...
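The register-selection effect is easy to demonstrate with a toy experiment in Python. The `encode` function below packs a simplified 32-bit word with fixed opcode bits and two 9-bit register fields; it is loosely modelled on the PASM layout and is not a faithful encoder. Both "programs" perform the same notional operation 2000 times and differ only in which registers they name.

```python
import random
import zlib


def encode(dest: int, src: int) -> bytes:
    """Pack a simplified 32-bit word: fixed opcode bits plus 9-bit dest/src.

    Loosely modelled on the PASM layout; NOT a faithful encoder.
    """
    word = (0b101000 << 26) | ((dest & 0x1FF) << 9) | (src & 0x1FF)
    return word.to_bytes(4, "little")


def compressed_size(words) -> int:
    """Size in bytes of the zlib-compressed concatenation of `words`."""
    return len(zlib.compress(b"".join(words), 9))


rng = random.Random(0)  # seeded so the run is repeatable
N = 2000

# Same operation every time, always using the same two registers.
same_reg = [encode(0x10, 0x11) for _ in range(N)]
# Same operation, but the "compiler" picks arbitrary registers each time.
random_regs = [encode(rng.randrange(0x1F0), rng.randrange(0x1F0)) for _ in range(N)]

fixed_size = compressed_size(same_reg)
random_size = compressed_size(random_regs)
print(fixed_size, random_size)
```

The second stream compresses far worse even though it does no more "work", which supports the point that the zip test measures compressibility rather than algorithmic information.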
Similar issues will occur with preset mass data... I would much like to know how the sine tables in ROM would compress; they obviously carry nearly no mentionable information at all...
Since 32K was plenty for 8-bit machines, it might be a good project to make and use a 6502 emulator in PASM for example,
since 6502 had not too many instructions and incredible programs were written for it in 32K,
such as on Apple II, C64, Atari, etc.
In other words, coding in Assembler on 8 bit systems often made smaller faster programs
than in a High Level Language, but compact code is not a given advantage in this case of PASM vs. Spin.