Propeller basics
mickelsen
Posts: 4
I'm trying to get my mind around the basics and theory of operation of the Propeller chip. From what I've read, in most cases each cog runs by loading a copy of the Spin interpreter and then running a program that resides in main memory. Is this correct? Are there any cases where the running program resides entirely in the cog's memory? Or have I missed the point entirely? Please pardon a newbie's lack of understanding.
Thanks,
Mark
Comments
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Paul Baker
Propeller Applications Engineer
Parallax, Inc.
Thanks again for helping me to "change my mind", literally.
Mark
Each cog accesses the main memory in round-robin fashion, with each cog getting access every 16 clock cycles. Since each assembly instruction takes 4 cycles, except for a few which take 5 or more (like a main memory access), there's only time to perform 2 assembly instructions between main memory access slots if every slot is to be used.
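As a rough check of the "2 instructions" figure, here is a small arithmetic sketch. The 16-clock window and 4-clock instruction come from the post above; the cost of the hub access itself (`hub_op_clocks`, set to 8 here) is an assumption for illustration, since the post only says "5 or more".

```python
def instructions_between_hub_slots(hub_window=16, hub_op_clocks=8, instr_clocks=4):
    """Whole ordinary instructions that fit between two back-to-back hub accesses.

    hub_window    -- clocks between a cog's hub access slots (from the post)
    hub_op_clocks -- ASSUMED cost of a hub access that lands right on its slot
    instr_clocks  -- clocks per ordinary assembly instruction (from the post)
    """
    spare_clocks = hub_window - hub_op_clocks
    return spare_clocks // instr_clocks


if __name__ == "__main__":
    print(instructions_between_hub_slots())
```

With either 7 or 8 clocks assumed for the hub access, the integer division comes out to the two instructions mentioned above.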
Both have their place in applications, and for those which require all eight processors to be doing time-intensive functions, a "coginit(0,asm_ptr,parameter_ptr)" replaces the original Spin cog with an assembly program of your own. Chip's released bootstrap program and all third-party "non-Spin" development environments do exactly this: they insert the Spin bytecode to override Spin (i.e., a cog is loaded with Spin, which in turn tells itself to overwrite itself with another assembly program).
Word of warning though, once you kill all cogs which run Spin, it is an arduous task to reinitialize a cog running Spin.
Post Edited (Paul Baker (Parallax)) : 11/5/2006 11:10:45 AM GMT
It is a fairly unusual setup in the microcontroller field, especially "out of the box", and something which has arisen from Parallax's BASIC-Stamp products. It's a clever and powerful approach, which opens up development to both hardcore low-level types and more casual programmers, who due to the power of the chip can get good performance without getting too close to the metal. I suspect we'll start to see alternatives to the Spin language popping up before long, which are either compiled to Spin bytecode, or use their own custom interpreter.
I'm actually working on a runtime to tie in with my GUI project. It's for a 4GL based on something called Retrieve, a kind of COBOL/Pascal hybrid which I've developed a lot of business applications in over the years. Initially it'll be used primarily as a way of defining screen layouts, simple user interaction and the like, but I hope to expand it and ideally allow development to be done directly on the Propeller by writing editors and a compiler in the language itself. Even if this never sees the light of day (and I'm very hopeful it will) it's a lot of fun and is keeping me out of the pub, which apparently is a good thing.
Most assembly routines are effectively interpreters in that they pick up parameters from some fixed locations in memory, act on them, then look for another set of parameters. High speed I/O usually works through main memory buffers that way. The high speed floating point package works that way. That way, the routines don't have to be reloaded into a cog over and over again.
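The "pick up parameters, act, look again" pattern can be modelled in a few lines. This is only a sketch in Python, not Propeller code: the mailbox offsets (`CMD`, `ARG_A`, `ARG_B`, `RESULT`) and command codes are made up for illustration, and the polling routine is called in a loop because there is no second processor here to run it in parallel.

```python
# Hypothetical mailbox layout in "main memory" (a plain list stands in for hub RAM).
CMD, ARG_A, ARG_B, RESULT = 0, 1, 2, 3
# Hypothetical command codes.
IDLE, ADD, MUL = 0, 1, 2


def cog_poll_once(hub):
    """One pass of the resident routine: check the mailbox, act, signal completion."""
    cmd = hub[CMD]
    if cmd == IDLE:
        return False                      # nothing to do this pass
    if cmd == ADD:
        hub[RESULT] = hub[ARG_A] + hub[ARG_B]
    elif cmd == MUL:
        hub[RESULT] = hub[ARG_A] * hub[ARG_B]
    hub[CMD] = IDLE                       # clearing the command marks the result as ready
    return True


def call(hub, cmd, a, b):
    """Caller side: post arguments, then the command, then wait for it to clear."""
    hub[ARG_A], hub[ARG_B] = a, b
    hub[CMD] = cmd                        # written last so the arguments are already in place
    while hub[CMD] != IDLE:               # on real hardware the other cog would run concurrently
        cog_poll_once(hub)
    return hub[RESULT]


if __name__ == "__main__":
    hub = [0, 0, 0, 0]
    print(call(hub, ADD, 2, 3))
    print(call(hub, MUL, 4, 5))
```

The point of the design is that the routine stays loaded and only the few mailbox values change between requests.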
SPIN bytecodes are variable length, although a lot of them are intentionally single byte. As Paul implied, successive accesses to main memory are optimum when separated by two instructions, and it's not hard to find two unrelated instructions to move between two main memory accesses, so the SPIN interpreter doesn't waste time waiting for its turn to come around.
SPIN is quite fast. Half duplex serial I/O routines in SPIN can handle up to at least 19200 Baud for example. On the other hand, a cog executes an instruction every 50ns. Two synchronized cogs can produce VGA text at a resolution of 1024 x 768 with only a couple of resistors for digital to analog conversion as external components (each cog taking turns building scan lines).
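To put the 50ns and 19200 Baud figures side by side, here is a small arithmetic sketch. The 80 MHz system clock is an assumption (it is the clock rate at which a 4-cycle instruction takes 50ns); the function names are just for illustration.

```python
CLOCK_HZ = 80_000_000     # ASSUMED system clock; 4 clocks at 80 MHz gives the 50 ns quoted above
CLOCKS_PER_INSTR = 4      # ordinary assembly instruction, per the earlier post


def instr_time_ns(clock_hz=CLOCK_HZ):
    """Duration of one ordinary assembly instruction in nanoseconds."""
    return CLOCKS_PER_INSTR * 1_000_000_000 / clock_hz


def instrs_per_bit(baud, clock_hz=CLOCK_HZ):
    """Whole assembly instructions one cog can execute during one serial bit time."""
    return clock_hz // (CLOCKS_PER_INSTR * baud)


if __name__ == "__main__":
    print(instr_time_ns())          # nanoseconds per instruction
    print(instrs_per_bit(19200))    # instructions available per bit at 19200 Baud
```

Under that assumption a cog running assembly has roughly a thousand instruction times per bit at 19200 Baud, which gives a feel for the gap between interpreted SPIN and native code.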
It seems apparent to me that, with only a maximum of 496 assembly instruction slots available in cog memory, assembly programs can become code space limited very quickly unless you implement some sort of paging scheme which moves segments of code into cog memory from main memory. In fact, I can see that the only permanently resident part of some assembly programs would be the paging code, leaving the largest possible area to move code into. Has anybody run into this limited code space problem? Has anyone written a code swapping routine for assembly programs that are too big to fit into cog memory?
The apparently limited size of each cog's codespace is offset by the fact that the instructions are two-address instructions that operate on 32-bit operands. Combine this with some really clever math and bit-manipulation instructions, selective setting of flags, and conditional execution for each instruction, and you have a highly efficient and compact programming model. This is exemplified by the Spin interpreter, all of which fits in a single cog without overlays.
-Phil
It's a fairly trivial task to write a code overlay scheme using main memory overlays. It would take maybe 10-12 instructions including setup. The real trick is to use some type of external memory to save on the space in main memory needed for the overlays. A general purpose I2C driver takes maybe 250 instructions leaving 1/2 of the cog for overlays. An SPI driver would probably take less, depending on the details of the memory used, but would take extra I/O pins to use.
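The bookkeeping behind an overlay scheme can be sketched as a simple model. This is Python standing in for the idea, not Propeller assembly: lists stand in for cog and main memory, and `load_overlay` and `overlay_area` are hypothetical names. The 496-slot limit and the roughly 250-instruction resident driver are the figures quoted in this thread.

```python
COG_LONGS = 496   # instruction slots per cog, as quoted earlier in the thread


def overlay_area(resident_longs):
    """Longs left for overlays once the resident code (loader, driver) is placed."""
    return COG_LONGS - resident_longs


def load_overlay(cog_ram, hub_ram, hub_addr, cog_addr, count):
    """Copy `count` longs from main memory into the cog's overlay area."""
    if cog_addr + count > COG_LONGS:
        raise ValueError("overlay does not fit in cog memory")
    if hub_addr + count > len(hub_ram):
        raise ValueError("overlay runs past the end of main memory")
    cog_ram[cog_addr:cog_addr + count] = hub_ram[hub_addr:hub_addr + count]


if __name__ == "__main__":
    cog = [0] * COG_LONGS
    hub = list(range(1000))          # dummy "code" so the copy is easy to see
    room = overlay_area(250)         # resident driver of about 250 longs
    load_overlay(cog, hub, hub_addr=100, cog_addr=250, count=room)
    print(room, cog[250], cog[495])
```

With a 250-long resident driver this leaves 246 longs per overlay, which matches the "1/2 of the cog" estimate above.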
Your points are well taken. I need to go back and study the assembly language instructions so that when I start writing code I can get as much function as possible out of each instruction. I've spent many years programming 8-bit micros whose instructions have only one function and maybe, in some rare cases, two. Once again I need a paradigm shift. This certainly is an interesting processor(s) to work with. Thanks for all your help. I'm sure you'll hear from me again.
Mark
I was really just lost for a while. No registers (well the COG ram is really a lotta registers --if you want to think of it that way), no indexing, self-modifying code, conditional flags and results and execution. It's really different!
So far, each time I code a loop it gets smaller!
Coupla things I'm still missing:
Byte level operators. Having everything 32 bits in the COG RAM seems clumsy to me. It's getting less so, so maybe that's just me. I'm sure the net gain from the powerful instructions and the benefits of the very regular timing aspects of this outweigh byte operators.
Would be really nice to have single cycle delays possible. A NOP takes 4 cycles. Would be great to have it take 1 - 4 depending on something... Somehow this seems wasteful.
Things I'm liking!
Lots of smart branches. At first the large number of them seemed redundant compared to your typical 8bitter. However, this does grant you an instruction for every case you might need.
Taken branches are fastest! Good call.
No need to learn and tweak lots of addressing modes. This in the 8bit world is the key to everything and is hard. Love the 6809 because of its many addressing modes and ability to make relocatable and re-entrant code. At first, this language seemed really goofy, but now it's making all that other stuff seem like cruft one does not need. ---damn cool.
(Still love the 6809 though --it was just too fun to program)
No interrupts. At first, this seemed like a serious shortcoming to me. But after finally getting my mind wrapped around the COGs, it makes a heck of a lot of sense. There are going to be a lot of new ways to do things! This is essentially why I bought a prop. That spark of discovery and learning is alive and well in this regard.
The no interrupts also still sits in the things I don't like column too. It's because I'm still really digesting multiprocessing and what it means where states are concerned. Parallel execution is just really different. It's difficult for me to grok what the state of things will be at times. IMHO, this will pass!
Conditional results! It's possible to have long strings of instructions without having to worry about flag states and such. Very cool.
Post Edited (Paul Baker (Parallax)) : 11/5/2006 10:38:43 PM GMT
And this JMPRET trick? (I'll do some searching..)