Prop II: Speculation & Details... Will it do what you want???

David Betz · 2011-05-05 11:30

Beau Schwabe (Parallax) wrote: »

- Chip has BIG plans for the ROM.

On chip development environment? :-)

Sal Ammoniac · 2011-05-05 12:17

Interesting thread. Were these issues discussed back when it might have made a difference in the design of the Prop II?

The things that disappoint me the most about the Prop II design are its lack of hardware debug support and lack of cog memory.

Many posters to this thread have pooh-poohed hardware debuggers as only needed by people who write huge monolithic blocks of code and only then try to get it working. This may be true to some extent, but hardware debugging capabilities, in my experience, often make it trivial to find errors that would otherwise take a lot of head scratching, inserted print statements, and other primitive techniques. It would be great to have a hardware debugger that would allow non-invasive debugging of a thread of execution running on a cog without affecting the other cogs. (This is available on that unmentionable 'X' MCU and it's really cool.)

My biggest disappointment is the continued 2KB cog memory limitation. Sure, there are techniques available, like LMM, but these are kludges at best and are reminiscent of the bad old x86 days when we had 64KB segments and loads of memory models (tiny, small, medium, large, etc.). This was fixed once and for all in the 386, and I was hoping that the Prop II would solve this problem for the Propeller. Alas, it looks like it was not to be. (Suggestion: for the Prop III, go directly to a 64-bit architecture to eliminate the 9-bit restriction on source/destination addresses in the instruction word.)
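To make the 9-bit restriction concrete, here is a minimal Python sketch. It assumes the Prop I instruction-word layout as given in the Propeller documentation (6-bit opcode, 4 effect bits, 4 condition bits, 9-bit destination, 9-bit source). The `decode` helper and the sample word are illustrative and do not come from the thread.

```python
# Assumed field layout of a 32-bit Propeller 1 instruction word:
#   [31:26] opcode   [25:22] Z,C,R,I effect bits   [21:18] condition
#   [17:9]  destination address   [8:0] source address or immediate

def decode(word):
    """Split a 32-bit instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "zcri": (word >> 22) & 0xF,
        "cond": (word >> 18) & 0xF,
        "dest": (word >> 9) & 0x1FF,
        "src": word & 0x1FF,
    }

ADDRESS_BITS = 9
BYTES_PER_LONG = 4
addressable_longs = 1 << ADDRESS_BITS            # 512 longs reachable
cog_bytes = addressable_longs * BYTES_PER_LONG   # 2048 bytes, i.e. 2KB

# Hypothetical word with every destination and source bit set.
fields = decode(0x0003FFFF)
print(fields["dest"], fields["src"], cog_bytes)
```

With both address fields saturated the highest reachable cog address is 511, so a cog cannot directly address more than 512 longs (2KB) without widening the instruction word or changing its encoding.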

The 128K of hub RAM is also disappointing. I was expecting at least 256 and hoping for 512KB. NXP has an ARM Cortex-M4 with 1MB of FLASH and 264KB of SRAM; heck you can even get 8-bit AVRs with 256KB of FLASH. The industry trend is towards more memory as device geometries shrink.

The number of cogs is a minor disappointment. I was hoping for 16 to make it easier to squeeze in everything I want to do. The whole philosophy behind Prop I is to use a dedicated resource (cog) to handle something that otherwise would need interrupts and/or some kind of tasking scheme. With only eight cogs, one is forced to come up with some kind of kludge if all eight are already in use.

I went to the Embedded Systems Conference yesterday and went to lunch with a group of friends in the embedded development field and we discussed the various MCU offerings at the show. I brought up the Prop and Prop II as discussion points, but got little interest from this group. The biggest issues raised were poor support for C (and C++; it's an Eclipse/GCC world), lack of debug support, and lack of peripherals. I mentioned the object exchange, but they pointed out that a mish-mash of objects written by a large group of diverse developers (ranging from rank amateurs to experienced pros) is hardly a match for some of the extensive manufacturer written libraries such as Microchip's. In their view, the Prop is a hobbyist niche processor and is not likely to gain much traction in their company's future development plans.

davidsaunders · 2011-05-05 13:40

Sal:
Thank you for the very good news on the Prop. The fact that people view the shortcomings as being a lack of manufacturer libs is an extreme positive. I have spent much time debugging code on other chips just to find that my code was fine and the bug was in one of those manufacturer libs. As to HW debugging support, you have it: use another cog temporarily during development. This way you are not limited by the HW debugger, which is a big problem with other chips.

th3jester · 2011-05-05 13:42

Ken Gracey wrote: »

... now ported the Propeller 2 (may be only a code name for now) ...

Ken Gracey

code name...unless I'm misunderstanding, it seems the Prop 2 could have a different name. Interesting.

A small comment on the upcoming Prop 2 and its features. I haven't had an issue with ROM/RAM space yet, but I have hit the wall with execution speed and number of cogs. Since the number of cogs will be the same, the 160 MIPS will be very useful. I will also enjoy the ADC/DAC on every pin.

I can hardly wait!

potatohead · 2011-05-05 15:25

I want the ROM.

jmg · 2011-05-05 16:22

ROM makes good sense, provided it has the right stuff in it!

That then brings up the next issue : Revising that 'right stuff'.

So the ideal ROM is actually OTP, not mask - otherwise you get long supply chain latencies, and the ROM is never quite right, and it never gets revised....

SiLabs seems to still believe there is price/volume traction in offering OTP and Flash models. Their newest USB micro mentions $1.07.

I did see mention of fuse issues, earlier in a Prop II test discussion.

Question : Is the 'ROM' in Prop II actually OTP ?

Mike Green · 2011-05-05 17:34

It's masked ROM, just like on the Prop I. Fuses take lots and lots of space. That's why there'll only be 32 bits of that. OTP is no different from EEPROM or fuse ROM in that it takes lots of space and special handling to build it.

The Prop I's ROM has never needed to be revised since it was released, just like the rest of the chip. No guarantees that the same will happen with the Prop II, but it's a great precedent to follow.

Invent-O-Doc · 2011-05-05 19:04

It was an interesting experience to speak at the UPENE last year with Beau. They are trying to put this chip on a usable and small size chip employing a relatively low tech process that makes the chip very affordable for us. There is only so much they can put on this chip and they had to make a decision at some point as to what it would contain. I'm very curious about what will be on the ROM (IDE?).

If there is any item that may seem limiting, it is perhaps the 2K COG RAM.

Beau Schwabe · 2011-05-05 20:42

FUSE Bits:

There will be 65 Fuse Bits (The image shown is a 13x5 array and occupies 66,681 square microns <- about 12 single dots from a 300 dpi printer)

Cluso99 · 2011-05-06 01:37

Thanks Beau. Most of us have no real idea about the space taken by various circuits. I have learnt so much on this forum from yourself, Chip and others about the die and sizes, etc. So again, thank you so much for the insights you all have given me and others.

Since the ROM is about 1/6 the size of RAM, it is a real pity we do not know in advance everything we require, otherwise we could put those objects into ROM and save our precious RAM. Anyone have a reliable crystal ball?

Heater. · 2011-05-06 01:46

I suggested we put the Zog ZPU bytecode interpreter in there so that the thing can boot straight into C code instead of Spin.
At a pinch the Catalina LMM kernel would also suffice:)

David Betz · 2011-05-06 04:53

Heater. wrote: »

I suggested we put the Zog ZPU bytecode interpreter in there so that the thing can boot straight into C code instead of Spin.
At a pinch the Catalina LMM kernel would also suffice:)

Forget ZOG. Just put a hardware ZPU core on the chip! :-)

Heater. · 2011-05-06 05:00

David,

Whacky but interesting idea.

Have the COGS implement the ZPU instruction set as well as the regular COG instructions.
Some mode trickery gets you into ZPU mode.
Have the ZPU ops execute code from HUB. Bingo nice fast C programs. Eight of them running in parallel if you like.
The ZPU instruction set is tiny and simple so that's just a weeny bit of silicon, perhaps occupying the space where we remove useless ROM stuff:)

jazzed · 2011-05-06 06:27

David Betz wrote: »

Forget ZOG. Just put a hardware ZPU core on the chip! :-)

That would give GCC for free except for lack of little endian support in the ZPU tool chain ....

David Betz · 2011-05-06 07:01

jazzed wrote: »

That would give GCC for free except for lack of little endian support in the ZPU tool chain ....

I'm sure the little endian support could be made to work with a minimal amount of effort.

jazzed · 2011-05-06 09:34

David Betz wrote: »

I'm sure the little endian support could be made to work with a minimal amount of effort.

Could be, but then the Propeller model would be up-ended by having a special cpu on board.
I think the idea is very attractive because the "business logic" code can be any size supported by external memory.

Bill Henning · 2011-05-06 09:47

Actually, this is a very interesting idea.

If one cog was replaced with a ZOG core, it could do a 16-byte cache line fetch every hub cycle (RDQLONG) and given how small the ZOG core is, it could probably have a 2KB cache (128 cache lines of 16 bytes). With even a simplistic direct mapped cache (2 way associative would be better) my guess would be it could do close to 100MIPS!

Ideally, the cache could be split into 1K for code, 1K for data, as otherwise data access cycles (including stack) to the hub would take 8 cycles as usual.
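The direct mapped cache described above can be sketched in a few lines of Python. This is only a behavioural model under stated assumptions: 16-byte lines and a 2KB cache as in the post (which works out to 128 lines), and a hypothetical workload with one byte-wide opcode per address. The class name and the workload are invented for illustration.

```python
LINE_BYTES = 16                        # one RDQLONG-sized fetch, per the post
CACHE_BYTES = 2048                     # 2KB total, per the post
NUM_LINES = CACHE_BYTES // LINE_BYTES  # 128 lines

class DirectMappedCache:
    """Tag-only model: tracks hits and misses, stores no data."""

    def __init__(self, num_lines=NUM_LINES, line_bytes=LINE_BYTES):
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.tags = [None] * num_lines
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        """Return True on a hit; on a miss, mark the line as fetched."""
        line_addr = addr // self.line_bytes
        index = line_addr % self.num_lines   # each address maps to one slot
        tag = line_addr // self.num_lines
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag               # evict whatever was there
        self.misses += 1
        return False

cache = DirectMappedCache()
# Hypothetical workload: a 64-byte loop executed 100 times.
for _ in range(100):
    for addr in range(0x1000, 0x1040):
        cache.access(addr)
print(cache.hits, cache.misses)
```

A tight loop like this misses only on its first pass. The weakness of direct mapping shows up when two hot addresses are exactly 2KB apart, since they land on the same slot and evict each other, which is the case a 2-way associative design would soften.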

With said cache, I am guessing ZOG would not take more space on die than a regular cog. ZOG-COG should keep all the regular Prop I/O features.

ZOG+7Dwarfs!

My suspicion is that something like this could be done for Prop3, but not Prop2.

That's still OK, as I am sure that a new ZOG VM taking advantage of the new features (auto increment pointers etc) would run at approx. 20 ZOG MIPS...

evanh · 2011-05-06 09:56

I may be a bit late saying this, I've been absent and I suspect I may be repeating what others have already said, but comparing a Cog to traditional processors is a bit off track, imho. If you want to compare something like that then you should be comparing LMM and its offshoots. Forget about Cog RAM for anything other than I/O loops, optimised subroutines and machine engines.

The LMM mechanism may not always compare well on speed but for everything else it is fine. You want to run a huge compiled C app from external RAM? - no problem. And speed is gained by offloading functionality to the other Cogs a bit like asymmetrical co-pro's and/or I/O processors, take your pick. The Prop is very flexible. PropII even more so.

Or at least that's how I foresee the PropII panning out as a platform for OS type developments.

On the subject of the 128KB Hub RAM, again I think many of you are not looking at the ease of adding maybe hundreds of megabytes externally - both RAM and Flash. Hub RAM will become a deterministic buffer and exchange space when dealing with a large environment.

If there is one thing I would have wanted more of than we got it would be more Cogs.

potatohead · 2011-05-06 10:59

Strongly agree with this. I see the COG almost as micro-code, where it's possible to define the "CPU" in ways appropriate for the task at hand.

I also really like the idea of a kernel being in the ROM. Having the SPIN interpreter in there is great. Adding a Catalina style kernel in there would be a big boost for C, and would produce a standard target for development efforts. It would go a long way toward addressing that "but it does not work with C" objection so often seen.

On a side note, this discussion is why I liked "Concurrent Multi-Processor" instead of "multi-core". Treating the COG like "a CPU" leads directly to the instruction limits. When one does not think of a COG that way, the small set of instructions makes sense, because "software defined silicon" means the COG can be a CPU, or a peripheral, etc...

Heater. · 2011-05-06 11:14

evanh,

Yep, we should not forget that having a lot more pins makes adding external RAM/ROM a much more practical proposition.

The issue of more than 8 COGS is not so clear. Given that they all have to share HUB RAM bandwidth, going to high numbers of COGS makes things slower.
In the extreme an infinite number of COGS would result in zero MIPS. You would have an infinite amount of processing power (infinity * 160MIPS) but not be able to use any of it!!

So even if you had the space what is the optimum number of processors when sharing RAM? I'm still not sure and it's probably app dependent.
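The bandwidth argument can be put into numbers with a rough model. The sketch below assumes strict round-robin hub access with one clock per slot and a 160 MHz clock (one instruction per clock at the 160 MIPS quoted in the thread). Both figures and the function name are assumptions for illustration, not published Prop II specifications.

```python
def hub_accesses_per_second(clock_hz, num_cogs, clocks_per_slot=1):
    """Hub accesses a single cog can make per second under strict round-robin."""
    return clock_hz / (num_cogs * clocks_per_slot)

CLOCK_HZ = 160_000_000  # assumed clock rate, see the note above

for n in (8, 16, 64, 1024):
    per_cog = hub_accesses_per_second(CLOCK_HZ, n)
    total = per_cog * n  # aggregate hub bandwidth stays constant
    print(f"{n:5d} cogs: {per_cog / 1e6:8.3f} M accesses/s per cog, "
          f"{total / 1e6:.0f} M total")
```

In this model the total hub bandwidth never grows, so every added cog shrinks each cog's share. Code that lives mostly in cog RAM is unaffected, while hub-bound code (LMM, for instance) slows in proportion to the cog count, which is why the best number is probably application dependent.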

Also, if I understood correctly it will be easier to set up almost independent threads within a Prop II COG so there is less pressure to have more COGS.

Phil Pilgrim (PhiPi) · 2011-05-06 11:16

potatohead wrote:

I see the COG almost as micro-code, where it's possible to define the "CPU" in ways appropriate for the task at hand.

That's very true. Prop instructions border on a VLIW implementation with their flag setting and conditional execution bits.

-Phil

Heater. · 2011-05-06 11:19

potatohead,

Having the SPIN interpreter in there is great

Given that no application code can be run until it is loaded from an EEPROM or PC there is no point in having the Spin interpreter in ROM. Or indeed anything else bar the boot loader code. It may as well be downloaded with your app code.

Upside is that you save ROM space perhaps to use for RAM or other functionality.

Downside is a slightly longer boot time, not much though.

Heater. · 2011-05-06 11:20

Agreed about the micro-code view of things. This is borne out by the various LMM kernels for C.

David Betz · 2011-05-06 11:23

Heater. wrote: »

Agreed about the micro-code view of things. This is borne out by the various LMM kernels for C.

I agree that using the COG to implement a VM is very flexible and can allow lots of different architectures to be implemented. I wonder if the performance is good enough for C though. I have a relatively simple compiler that runs really slowly on the C3 under Catalina but is pretty snappy on a 16 bit dsPIC chip. Of course, Propeller 2 will improve this considerably.

Leon · 2011-05-06 11:27

The 16-bit PICs were designed to execute high-level languages like C efficiently.

lonesock · 2011-05-06 11:30

I'm certainly in favor of leaving the Spin interpreter in (not that we're voting on it or anything ;-) for a couple of reasons:

1 - I often do some programming/testing without the EEPROM in, or after I managed to damage it.
2 - it only takes 2kB (which would only give us, what, 256 bytes of RAM?)
3 - paranoia: EEPROMs dying of old age / over-use scares me...I know, I know, the application would be in trouble then too.
4 - someone could accidentally / maliciously modify the Spin interpreter if it were loaded from outside the chip

The plus side of (optionally) loading the interpreter from EEPROM or SD would be that we could easily use some of the updated versions people crank out here on the forums (faster sqrt, embedded LMM, etc.). For me this would be a very nice benefit.

Jonathan

Heater. · 2011-05-06 11:43

lonesock,

1. Moving the interpreter out of ROM and into the binary that is loaded from EEPROM or PC does not prevent you working without the EEPROM.
2. 2KB, yes, but then there is all the other ROM stuff that can be moved out. Only the boot loader is required.
3. EEPROM dropping dead is not helped by any ROM content.
4. Good point.

I'm not so worried that anyone might maliciously modify the Spin interpreter if it were combined with your binary download file. After all they can always hack your app in EEPROM anyway.
But perhaps in Prop II more ROM is required for the security features of the boot loader. Hopefully a small area though.

potatohead · 2011-05-06 11:48

I edited my post above, to say Catalina style kernel, instead of interpreter.

Seems to me, the program image could contain a few identity bytes. One for SPIN, one for LMM, etc... Stuff some useful, standard things in the ROM, so that we get those efforts focused. It will always be possible to do new and better things in RAM, just as we do now. The upside is having some standard services on chip that can handle various modes of development. Booting to an LMM kernel, for example, would permit a C program to start directly, and quickly. Booting to SPIN does the same, and either could launch an instance of the other, with all the core things known and consistent.

The new Prop will do LMM very well, compared to the existing one. Again, with the COG as micro-code view, the overall performance expectations would be more in line with what we know today, and not seem diminished or less, like they do with COG as CPU view, where the inability to just run everything as raw PASM tends to clash with the overall design of the Prop.

LMM was not known at the time of Prop I. It is now. That changes things considerably where the ROM is concerned. Who knows what Chip will stuff in there?

I suppose another identifier would indicate a raw PASM image too. Just point COG 0 at the target address, load and go, delivering the standard Prop I model, LMM kernel, and PASM boot modes, allowing people to do what they want to do from there.
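The identity-byte idea amounts to a small dispatch table in the boot loader. Here is a minimal Python sketch of that logic. The identity values, handler names, and image layout (one identity byte followed by the payload) are all hypothetical, since the thread does not define any of them.

```python
# Hypothetical identity values; nothing in the thread fixes these numbers.
BOOT_SPIN = 0x01
BOOT_LMM = 0x02
BOOT_PASM = 0x03

def start_spin(payload):
    # Stand-in for launching the ROM bytecode interpreter on the payload.
    return f"Spin interpreter: {len(payload)} bytes"

def start_lmm(payload):
    # Stand-in for launching a ROM-resident LMM kernel on the payload.
    return f"LMM kernel: {len(payload)} bytes"

def start_pasm(payload):
    # Stand-in for loading the payload into COG 0 and jumping to it.
    return f"raw PASM: {len(payload)} bytes"

HANDLERS = {
    BOOT_SPIN: start_spin,
    BOOT_LMM: start_lmm,
    BOOT_PASM: start_pasm,
}

def boot(image):
    """Pick a boot mode from the first byte of a program image."""
    if not image:
        raise ValueError("empty image")
    ident, payload = image[0], image[1:]
    handler = HANDLERS.get(ident)
    if handler is None:
        raise ValueError(f"unknown identity byte 0x{ident:02X}")
    return handler(payload)

print(boot(bytes([BOOT_LMM, 0xDE, 0xAD, 0xBE, 0xEF])))
```

Rejecting unknown identity bytes, rather than falling back to a default mode, keeps a corrupted image from starting in the wrong runtime.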

Since we are on the topic of ROM, it's prolly a waste of space in the eyes of most, but I want a 4 color 8x8 font, Parallax style, with the symbols and such, interleaved just like the existing one is. That would standardize text, and give us two compatible options. I will author that, as I've written before, if it's a consideration.

Honestly, given we know it can boot from SD, etc... I would not be surprised to see Chip do a basic BIOS type thing, where quite a bit can be on-chip, standard to make the most of the HUB.

potatohead · 2011-05-06 12:01

@heater, the benefit is known implementations that focus development. Unless we are giving up something very significant, having a standard LMM kernel and SPIN in the ROM makes a lot of sense.

Just look at video. We don't have a standard implementation. Well, we do have the Parallax reference drivers and circuit. That's good. But, it's hard to ignore the vast array of video options we have. I love that personally, but... there is a strong case for some standards there.

And LMM is the same. We've a lot of different options, each valid, important, etc...

Seems to me, that focus for SPIN and LMM makes a lot of sense. Guess we shall see what Chip thinks!

Phil Pilgrim (PhiPi) · 2011-05-06 12:04

The Spin interpreter has to stay in ROM. That said, we should quit calling it a "Spin interpreter." It simply is not. It's an interpreter for bytecodes that can be the target of many languages, not just Spin. For that reason, it's too useful not to stay on-chip.

-Phil
