Your favorite features of the Propeller P8X32A multicored microprocessor chip
Cluso99
It's my thread so here are the rules to post here...
- Try and be concise about what feature(s) you like about the prop.
- The feature(s) can also be advantage(s) of the prop, so list them too.
- Where practical, give an example of the use of this/these feature(s)/advantage(s)
- List possible uses
- Attach code in a file, not embedded in the posting unless it is only a few lines
- Place obviously separate feature(s)/advantage(s) in a separate posting.
- Do not compare the feature(s) to other chips except generally.
- Do not start a comparison with other chips and don't name them either.
- Disseminate possible feature(s)/advantage(s) others have not yet discovered.
- Disseminate methods to use those feature(s)/advantage(s).
- Possible use by Parallax as a marketing tool.
Comments
- PASM is clean, simple, regular, easy to use, and is 32-bit RISC based.
- There are 64 basic instruction op codes
- Each instruction has a S (source) and D (destination) address
- A few specialised instructions do not fall into this category (e.g. CogInit)
- The S (source) can be an immediate 9-bit value
- There are no accumulators as such. All cog memory can be the source or destination.
- Each instruction has optional conditional execution based on the C and/or Z flag(s).
- Each instruction can optionally change (set/reset) the C and/or Z flag(s).
- Each instruction can optionally not write-back the result
- e.g. a CMP instruction is really a SUB with no write-back
- Deterministic instruction execution. Almost all instructions are 4 clocks.
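The flag and write-back options listed above can be modelled in a few lines. This is a minimal Python sketch, not PASM and not a full emulator: the function names `sub` and `cmp_` and the keyword arguments `wc`, `wz`, `wr` and `cond` are illustrative stand-ins for the instruction's effect and condition fields, and only the unsigned SUB/CMP case is covered.

```python
MASK32 = 0xFFFFFFFF

def sub(d, s, c_flag, z_flag, wc=False, wz=False, wr=True, cond=True):
    """Model one SUB-style instruction on 32-bit unsigned values.

    wc / wz : update the C / Z flag from this instruction (hypothetical names)
    wr      : write the result back to the destination
    cond    : result of the conditional-execution test (e.g. if_c)
    Returns (new_d, new_c, new_z).
    """
    if not cond:
        # Condition false: nothing changes, the instruction behaves like a NOP.
        return d, c_flag, z_flag
    result = (d - s) & MASK32
    new_c = (d < s) if wc else c_flag          # C = unsigned borrow
    new_z = (result == 0) if wz else z_flag    # Z = result is zero
    new_d = result if wr else d
    return new_d, new_c, new_z

def cmp_(d, s, c_flag, z_flag, wc=False, wz=False):
    """CMP modelled as SUB with no write-back, as the post describes."""
    return sub(d, s, c_flag, z_flag, wc=wc, wz=wz, wr=False)

# Compare, then conditionally decrement without any jump in between.
x, c, z = cmp_(5, 9, False, False, wc=True, wz=True)   # x is unchanged, flags updated
x, c, z = sub(x, 1, c, z, cond=c)                      # models "if_c sub x,#1"
```

Because the second instruction neither sets `wc` nor `wz`, the flags from the compare are still available to later conditional instructions, which is the saving the list describes.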
There are two really major advantages to me...
- The flexibility of optional conditional instruction execution based on the C and/or Z flag(s)
- Saves having to jump around code based on the result of flags
- The ability of each instruction to optionally change (set/reset) the C and/or Z flag(s)
- Saves having to either save the flags for later or executing different paths
These features save both code space and execution time.
1) Built In Video Generator hardware
2) Multi-Core with no interrupts
3) Parallax's OBEX.
There was a module I came this close ("pinches fingers together") to building, but ran out of time on this project. I have a system that goes into a pipeline for a repair about 500ft from the pipe flange entry. The system has a bunch of tools to do the repair, and all the tools require a bunch of NTSC video channels over fiber for all of the various functionality. The module would have included a simple color camera, LED lighting, a Prop with a BNC video output, an LISY 300 Gyroscope, a SHT-11 Temp Humidity Sensor, an H48C 3 axis accelerometer, along with a HM55B Compass. The data from the sensors was to be overlaid on top of the video from the NTSC camera and transmitted through the fiber to a 24" LCD monitor with an 8ch multiviewer. For the price, package, and functionality, I don't think I could have done it on any other platform, bar none. To top it off, the whole program was written with 12Blocks, and thrown together in an evening from objects straight off the OBEX and a bit of tweaking to get into 12Blocks. I'm still going to build it, but it will have to wait for revs to the system next year.
That sounds like an awesome application of 12Blocks and the prop! Sounds like the software part of it was pretty far along. What do you need to finish this?
Hanno
- The MUXC, MUXNC, MUXZ & MUXNZ instructions
- MUXC Dest,Mask ' the destination bits identified by a mask (source) are set to the state of the C flag
Usage scenario: we need to set a group of pins according to various clock counts (sort of like PWM).
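The MUXC behaviour described above, and the PWM-like use, can be sketched as follows. This is a Python model rather than PASM; `pwm_frame` and its parameters are hypothetical names for illustration, and the comparison that produces the C flag is simplified to `phase < duty`.

```python
MASK32 = 0xFFFFFFFF

def muxc(dest, mask, c):
    """Set every bit of dest selected by mask to the C flag; leave the rest alone."""
    return ((dest | mask) if c else (dest & ~mask)) & MASK32

def muxnc(dest, mask, c):
    """Same as muxc, but uses the inverse of the C flag."""
    return muxc(dest, mask, not c)

def pwm_frame(duty, period, mask, outa=0):
    """Hypothetical PWM-style loop: the masked pins are high while phase < duty."""
    frame = []
    for phase in range(period):
        c = phase < duty            # stands in for a compare that writes C
        outa = muxc(outa, mask, c)  # one instruction updates the whole pin group
        frame.append(outa)
    return frame

# Pins 2 and 3 (mask 0b1100) driven high for 3 ticks out of 8.
frame = pwm_frame(duty=3, period=8, mask=0b1100)
```

The point of the instruction is visible in the loop body: the pin group follows the flag with no branch, so every pass through the loop takes the same time whichever way the comparison went.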
The Propeller's counters make possible many of its diverse capabilities, including but not limited to:
2. Delta-sigma ADC input.
3. DUTY-mode DAC output.
4. Frequency-modulated carriers.
5. PLL timing to the video circuitry.
6. I/Q demodulation.
7. Auto-incrementing long hub pointers.
I love the Propeller's counters! (Was that redundant?)
-Phil
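As a rough illustration of the DUTY-mode DAC item in the list above, here is a Python sketch of a 32-bit phase accumulator whose carry drives the output pin. It assumes the usual description of DUTY mode (PHS gains FRQ every clock and the pin follows the carry), a 3.3 V supply, and an external RC filter to average the pulses; the function names are illustrative only.

```python
MASK32 = 0xFFFFFFFF

def duty_mode(frq, clocks, phs=0):
    """Return the fraction of clocks on which the accumulator carries (pin high)."""
    highs = 0
    for _ in range(clocks):
        total = phs + frq
        highs += total >> 32      # carry out of bit 31 drives the pin for this clock
        phs = total & MASK32
    return highs / clocks

def frq_for_voltage(v_out, v_dd=3.3):
    """FRQ value whose average duty approximates v_out after filtering (assumes v_dd)."""
    frq = round(v_out / v_dd * 2**32)
    return max(0, min(frq, MASK32))

quarter = duty_mode(0x4000_0000, 1000)            # FRQ = 2^30, one carry every 4 clocks
half = duty_mode(frq_for_voltage(1.65), 100_000)  # close to 50% duty
```

The average duty is FRQ / 2^32, so the output resolution comes from a register write rather than from a software timing loop, which leaves the cog free to do other work.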
As you know I love the 8 cores, the lack of hassle with interrupts, the instruction set architecture, the freedom to choose in software what "hardware" is my chip today and above all else the 40 pin DIP package:)
Deffo DIP 40. That's just friendly in all sorts of good ways.
PASM is simply beautiful. Love the instruction set, and that simplicity combined with the deterministic operation of the COGs, makes programming assembly language a lot of fun. Honestly, all of those things make for challenges in compiler land, but those are getting the good treatment right now. Two fine efforts, Catalina and GCC are going to make larger programs possible and practical. Catalina already does, and it's a snap.
The community here. Man, I don't think I've ever experienced such a sharp and solid set of people. For some reason, the Prop seems to attract a lot of great folks. Thanks all for that. When I don't have a lot of propeller time, I miss that as much as I do just exploring the chip.
SPIN + PASM is brilliant. It has its limits, but working within those is just fun. The whole affair is very lean, and I find myself staying focused on things I want to do, rather than being focused on the things needed to do it, more often than not.
I've only made light use of the counters so far, but I like them, and clearly seeing what others have done, they are a great feature.
Mostly though, I like that Chip built it. If you go back to some of the earlier threads where he's got some things to say about how and why things were done, it's clear to me he's got a basic vision of how things can be and had the means and dedication to simply go and do them. How cool is that?
Jason Dorie's quadcopter code, I'll bet!
8 independent MCUs w/ counters in a single smt package.
All have access to all I/O pins (PITA until experience sets in, though a nuisance sometimes)
Always know how long a function will need to run for a given clock freq.
Main use:
Test devices. Starting with the curve tracer. Multi-input A/D sampling and processing system which can be paralleled across multiple props and cogs as well as paralleling and pipeline processing of the inputs. For example, in the OBEX entry I put up: can expand the A/D channel up to five with the MCP3201, all parallel, synced, no skew between measurement times at all and currently sampling at 100 ksps. Or use fewer A/D and set up cogs for inline pipeline processing of the samples w/ a cog handling each process and passing the result to another through the hub memory. Consider the nightmare of multiple parallel captures with other MCU types. It would require an MCU PACKAGE for EACH A/D channel to do the same thing as I have done with the PROP chip. Sure, you could use only one very fast MCU using either interrupts or TDM, but the samples would not be captured exactly in parallel in time, now would they? Kinda why I chose not to use multi-channel A/D chips. Of course I could parallel these AND synchronize the channels so that I could have up to n device samples in parallel with up to 8 possible inputs on each channel. I currently cannot do this as simply with any other device I have played with to this point.
Code is in the OBEX, and I doubt the base will be changing anytime soon, the file will continue to have modules built upon the concept added to it.
Other features: The prop forums. The OBEX. The tools(sort of)
I don't work for Parallax either, but at some point I may do a couple of instructional videos on the A/D apps possible, especially as the counters make this clockgen object work. I have not pushed it to its max speed yet; I need to get some socket/QFN64 adapters to try a couple of very high speed TI samples.
Frank Freedman
I've never really understood this "deterministic execution time" argument so perhaps I'm missing something.
I'll agree, that on non-Propeller architectures, if interrupts are turned on then you may not know how long it will take for two consecutive instructions to be executed, but for a system which does not utilise interrupts then the instruction timing is usually as predictable and consistent as it is for the Propeller.
Perhaps more so when one considers -
How many instruction cycles there? Not only does it depend on where the Hub cycling has got to, it also depends on the status of the carry bit, and that can affect when subsequent instructions execute.
In isolation no one can say whether that instruction will take 4 or between 7 and 22 cycles to execute, and if carry is dependent on arbitrary external data input then there can be no determinism.
So by "determinism" it seems to be that, if you write code in a certain way, you can predict exactly when instructions will execute and that's true for most micros, and, if you write code in some other way, it will be impossible or difficult to predict, and that's equally true of the Propeller.
In most cases of describing this deterministic behaviour the example given is two I/O interactions at time A and at time A+K, that as long as 'nothing silly' is done between the two then the actions will maintain a consistent time interval. That's as true on most micros as it is on the Propeller, and you can equally do things on the Propeller to mess up larger granularity determinism as you can on any other micro.
As I said, I may be missing something or misunderstanding what others who use the term "determinism" mean by that.
Second place is the multiple cores, because it's such a trip to finally write programs that do things in parallel after decades of wondering what that would be like - and it's so useful to be able to do that.
The most interesting (I think) Prop project I've done so far is telemetry from a high power rocket. I have a Prop Protoboard running the receiver, with one cog constantly reading serial altitude data from an XBee, another cog constantly updating an LCD display, another one running buzzers that signal flight events (apogee, drogue deployment, main parachute deployment, landing), and another one simply monitoring the states of two XBee digital I/O pins that indicate whether or not the parachutes have deployed. I could have done a lot of that with a single processor, but it would almost certainly be slow and buggy, and the buzzer would have required quite a bit of extra circuitry.
hippy, this is a valid point and something that has to be taken into consideration when writing time sensitive code on any micro. Every microprocessor, minicomputer, or mainframe I have worked on or read about has instructions with variable execution times. With PASM on the Propeller you can at least be certain of the minimum and maximum time a loop will take, and it is relatively simple to calculate those times.
On the other hand, you seem to be comparing other micros to a single cog on the prop. With 8 cogs I can dedicate individual cogs to the various functions required without having to worry about affecting cogs dedicated to time sensitive functions. The only way to accomplish that with other micros is to use interrupts or multiple chips, either of which can become very complex compared to the Propeller.
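The minimum and maximum loop times mentioned above can be worked out with a small model. This Python sketch rests on stated assumptions, not on the poster's code: ordinary instructions cost 4 clocks, a cog's hub window comes round every 16 clocks, and a hub instruction costs `HUB_BASE` clocks once its window arrives. `HUB_BASE = 8` is an assumed figure; the 7 to 22 range quoted earlier in the thread corresponds to `HUB_BASE = 7`.

```python
HUB_PERIOD = 16   # assumed: each cog gets a hub window every 16 clocks
HUB_BASE = 8      # assumed cost of a hub instruction that hits its window exactly

def hub_op_clocks(t, window_phase=0):
    """Clocks a hub instruction takes when it starts at clock t."""
    wait = (window_phase - t) % HUB_PERIOD   # 0..15 clocks until the cog's window
    return wait + HUB_BASE

def loop_bounds(n_plain):
    """(min, max) clocks for one pass of n_plain 4-clock instructions plus one hub op."""
    base = 4 * n_plain + HUB_BASE
    return base, base + HUB_PERIOD - 1

def run_loop(n_plain, iterations, start=0, window_phase=0):
    """Per-iteration clock counts for the same loop, starting at clock `start`."""
    t, times = start, []
    for _ in range(iterations):
        t0 = t
        t += 4 * n_plain
        t += hub_op_clocks(t, window_phase)
        times.append(t - t0)
    return times

bounds = loop_bounds(3)               # worst and best case for a single pass
passes = run_loop(3, 4, start=5)      # first pass differs, later passes settle
```

In this model only the first pass depends on where the hub happens to be; after that the loop is locked to the hub window and every pass takes the same multiple of 16 clocks. A data-dependent conditional hub instruction, as raised earlier in the thread, would break that lock, so both sides of the argument show up in the same sketch.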
The argument is in the context of multiprocessing. You are entirely correct about deterministic cycle times being a feature of other CPUs. But those cycle times tend to vary based on a few things:
1. DMA access for support devices, video, sound, etc...
2. Refresh cycles, where external memories are being used
3. Interrupts.
4. Caches.
Really, I personally was writing about this:
Time sensitive code on a Propeller is generally easy. Being able to compartmentalize the problem through the multiprocessing capability is a great feature on top of that. To do so with things like video and sound happening is brilliant.
I think there's an issue as to which micros it's reasonable to compare the Propeller to.
It's true to say the Propeller has a determinism better than a Pentium, but so too do PICmicro and AVR. Likewise PICmicro and AVR have ADC, UART, SPI and I2C interfaces, Flash and Data Eeprom on-chip, and the Propeller, while different, still competes with those, so all we're doing is showing how a Pentium is inappropriate in those roles rather than showing what advantage the Propeller has in a more realistic sense.
I'm not overly familiar with ARM but understand those may have timings affected by caching so that would be a valid comparison to make.
That's also true, I was questioning the deterministic claim rather than the advantage of multiple Cogs and shared Ram in one package which I see as a different issue.
Valid points Leon, but with the Propeller I can use/stock one chip and still have the exact functions/peripheral capabilities required by using an existing object, modifying an object, or writing my own object. With other chips I need to stock multiple chips, possibly with different architectures and instruction sets, to get the functions needed.
The Propeller architecture is different enough that comparisons to other micros are difficult. The great advantage it has (beyond its timing determinacy) for my use is that all the software peripherals can be tailored to a specific application where those same peripherals on other micros are fixed in hardware. I can also choose the specific peripherals I want for an application.
It is a different issue, but all the relevant issues have to be considered when determining which micro is the best choice for an application. Coordinating multiple PICs (or other micros) to handle a task that a single Prop can handle would not be an easy task.
Having 8 cpu's, a shared memory, and shared pins can be a great advantage for many problems.
Easy to prototype and get running
DIP40 package
Not forced to use C and only C for programming it.
Determinism:
Hippy has a valid point. BUT, this is a micro and it does matter when you are sampling pins or driving pins and implementing functional devices. In these cases often/mostly your code cannot handle interrupts or non-determinism. Within pasm, so long as you do not execute a hub instruction, you can absolutely calculate timing loops.
Even the jump-type instructions are fully deterministic. If an instruction has conditional execution (e.g. "if_c jmp #dosomething") and the condition is false, the instruction still takes 4 clocks. If the jump is executed, it takes 4 clocks. Finally, if a test-and-jump instruction (e.g. "tjnz test,#somewhere") does not jump, it takes 8 clocks (because the pipeline must be flushed).
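Those clock counts make loop timing simple arithmetic. The Python sketch below assumes the figures given in the post (4 clocks for an ordinary or skipped instruction, 4 clocks for a jump that is taken, 8 for a test-and-jump that falls through) and further assumes that a decrement-and-jump loop instruction such as djnz follows the same 4/8 pattern as tjnz. The function name and the 80 MHz clock are illustrative.

```python
PLAIN = 4            # ordinary instruction, or a conditional one whose condition is false
JUMP_TAKEN = 4       # jump executed
JUMP_NOT_TAKEN = 8   # test-and-jump that falls through (pipeline flushed)

def counted_loop_clocks(body_instructions, count):
    """Clocks for a loop of `body_instructions` plain instructions plus one
    decrement-and-jump, executed `count` times (count >= 1)."""
    body = PLAIN * body_instructions
    looping = (count - 1) * (body + JUMP_TAKEN)   # every pass that jumps back
    final = body + JUMP_NOT_TAKEN                 # last pass falls through
    return looping + final

clocks = counted_loop_clocks(body_instructions=2, count=10)
nanoseconds = clocks * 1e9 / 80_000_000           # at an assumed 80 MHz system clock
```

As long as no hub instruction sits in the body, the result does not depend on the data being processed, only on the instruction count and the loop count.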
It is also quite simple to get a program back into timestep by using "waitcnt". The counters also provide ways to handle timing sensitivities, depending upon the application of course.
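The way "waitcnt" pulls a loop back into step can be modelled as below. This is a Python sketch under simplifying assumptions: the system counter is treated as an unbounded integer (the real one is 32 bits and wraps), the overhead of the waitcnt instruction itself is ignored, and the function name and arguments are hypothetical.

```python
def periodic_loop(work_clocks, period, start=0):
    """Model of a loop ending in `waitcnt target, delta`.

    work_clocks : clocks spent in the loop body on each pass (may vary)
    period      : the delta added to the target after every wake-up
    Returns the counter value at which each pass wakes up.
    """
    cnt = start
    target = start + period
    wake_times = []
    for work in work_clocks:
        cnt += work                      # variable-length body
        if cnt >= target:
            # On real hardware a missed target stalls until the counter wraps.
            raise ValueError("loop body overran the period")
        cnt = target                     # waitcnt stalls until the counter hits target
        wake_times.append(cnt)
        target += period                 # next deadline is relative to the old one
    return wake_times

wakes = periodic_loop([40, 100, 12, 76], period=200)
```

Because each new target is derived from the previous target rather than from the current counter value, variation in the body does not accumulate: the wake-ups stay exactly one period apart.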
And don't forget, you have 8 cores, so the timing sensitive ones can be contained within their own cores.
Sample of its use:
I have a program that uses 4 cogs to sample all 32 I/O pins. Each cog (core cpu) is initialised to sample on successive clocks (i.e. cogA samples on clock+0, cogB samples on clock+1, cogC samples on clock+2, cogD samples on clock+3, and then the loops repeat with cogA sampling on clock+4....). I get a total of ~1700 samples before the cogs fill and have to execute other functions. At 80MHz the 32 I/O pins are sampled every 12.5ns (by overclocking to 100MHz, sampling is every 10ns; I have successfully overclocked to 104MHz and, without full testing, 108MHz). I am not the only one to have done this, and my code is in the OBEX and is MIT licenced - use it for anything you like, no restrictions, even commercially.
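The interleaving described above can be checked with a short model. This Python sketch is not the OBEX code; it assumes each cog stores one sample per 4-clock instruction and that the cogs are started one clock apart, and the function names are illustrative.

```python
def sample_schedule(n_cogs=4, clocks_per_sample=4, samples_per_cog=4):
    """Return (clock, cog) pairs in time order for cogs started one clock apart."""
    schedule = []
    for cog in range(n_cogs):
        for k in range(samples_per_cog):
            schedule.append((cog + k * clocks_per_sample, cog))
    return sorted(schedule)

def sample_interval_ns(clock_hz, n_cogs=4, clocks_per_sample=4):
    """Effective time between samples once the cogs' streams are merged."""
    return clocks_per_sample / n_cogs / clock_hz * 1e9

schedule = sample_schedule()
at_80mhz = sample_interval_ns(80_000_000)
at_100mhz = sample_interval_ns(100_000_000)
```

Merged in time order, the schedule has one sample on every system clock with the cogs taking turns (A, B, C, D, A, ...), which is where the 12.5 ns figure at 80 MHz and the 10 ns figure at 100 MHz come from.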
I have used multiple micros in many of my commercial designs since ~1980. The propeller exploits this multiprocessor concept in a fantastically simple way.
One last thing... The Propeller chip is NOT a processor to replace the Intel x86 type series for a PC, nor the super powerful ARM or ATOM processor chips in those superfast applications. These chips are computer chips whereas the Propeller is a microprocessor/microcontroller.
When I have the time, I will add a post about the soft peripherals and the ultimate flexibility of this single chip to perform what is a whole family of chips on most other micros.
You can evangelize all day and you are either preaching to the choir or being ignored as a "fan boy".
I see a lot of folks on here that make all manner and flavors of boards with all sorts of possibilities, but very few examples of actual projects.
Don't TELL ME how wonderful the prop is, SHOW ME.
C.W.
Also valid points Leon, but “production quantities” also depends on the application. In smaller quantities the engineering cost per unit will outweigh the cost difference of the microcontrollers. Better to use one that is more flexible and easier to work with even if it does cost a few dollars more.
@Potatohead, I don't really have a “favorite” feature. I like all the features, including the layout of the I/O pins, the 8 cogs, the shared hub memory, the lovely orthogonal instruction set, the flexible I/O arrangement, the counters, the video circuitry, etc., etc., and last but far from least is the elegant simplicity and power of this multiprocessor. The only additional thing I wish for occasionally is more I/O pins, but that is pretty easy to get around.
@rod1963, true the prop is a microcontroller, but it is not “just” a microcontroller. If it can emulate so many of the earlier 8 bit microprocessors at or near their original speeds then it is more than just a microcontroller.
And likewise with other micros; as long as you don't muck up its determinism, you can absolutely calculate timing loops.
I completely agree this accurate timing ability is important but I've never used a micro intended for embedded applications that did not have this determinism, which could not bit-bang with accurate and consistent cycle timing. Hence I don't see how it's an advantage for the Propeller over every other micro which has the same, though it is frequently touted as an advantage.
Whenever I hear "Propeller is deterministic" I'm reminded of Idiocracy and the round table discussion of "Brawndo's got electrolytes" :-)