More Prop II info..!?!

Coley · 2008-08-27 07:28

@rjo_

Yeah you are right, I can't think of any other company that would do this.

But Parallax isn't just any other company, is it? It's more like a huge family with everyone's favourite, Uncle Chip!!!

A 12 Euro box, interesting but limited by cost more than anything, I take it that the idea would be to bring computing to the masses?

I guess one of the things that I had in mind with the Hybrid was its educational qualities, and we have already supplied some to a local school that has an after-school electronics club.

Bundled with a BASIC interpreter it would tick the box, but I wonder what the take-up would be. Out of all the people in the world who use computers, how many know or want to know how to program???

Regards,

Coley

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
PropGFX Forums - The home of the Hybrid Development System and PropGFX Lite

evanh · 2008-08-27 11:03

stevenmess2004 said...
What about something that could handle FireWire 100? There could be licensing issues but it may be able to do dv with the PropII.

Not if it's just a hack job and not advertised as being a Firewire product.

stevenmess2004 said...
I assume that you are aiming at a maximum clock of 160MHz so anything faster would be out.

I was under the impression that both 100 and gigabit were around the 20-30 Mbaud area because unshielded cables had an EMI cap imposed. However, a quick google gets me the answer of 125 MBaud for Gigabit. Still possible for the PropII maybe, dunno, but very demanding and not worth it at all imho given that Gigabit is backwards compatible to 100TX.
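The 125 MBaud figure can be checked with a little arithmetic. The sketch below uses the commonly quoted line-coding figures for 100BASE-TX (one pair per direction, 4B5B coding) and 1000BASE-T (four pairs, two data bits per symbol); the function name and parameters are made up for illustration.

```python
# Payload rate from symbol rate, assuming the commonly quoted coding figures
# for 100BASE-TX and 1000BASE-T. payload_mbit is a hypothetical helper.

def payload_mbit(symbol_rate_mbaud, pairs, data_bits, line_symbols):
    """Payload in Mbit/s: data_bits are carried per line_symbols on each pair."""
    return symbol_rate_mbaud * pairs * data_bits // line_symbols

# 100BASE-TX: 125 Mbaud on one pair, 4 data bits per 5 line bits (4B5B)
fast_ethernet = payload_mbit(125, pairs=1, data_bits=4, line_symbols=5)

# 1000BASE-T: 125 Mbaud on each of four pairs, 2 data bits per symbol
gigabit = payload_mbit(125, pairs=4, data_bits=2, line_symbols=1)

print(fast_ethernet, gigabit)
```

Both variants signal at the same 125 Mbaud per pair; the tenfold payload difference comes from the extra pairs and the denser coding, not from a faster symbol clock.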

(expletive), I just found out that gigabit does some sort of direction switching like half-duplex. So as to use all four channels for bursting in one direction I guess. That's not something I'd expected to find.

Evan

evanh · 2008-08-27 11:29

Oooh, ah, what about delta-sigma D/A's? Using the same capacitors. Messy? I haven't tried to visualise a configuration, it just popped into my head.

Sapieha · 2008-08-27 11:35

Hi all.

Why not handle all possible methods?

That solution must not violate the COG's construction and must fit within the COG's programming capabilities.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Nothing is impossible, there are only different degrees of difficulty.

Sapieha

Sapieha · 2008-08-27 21:35

Hi Chip Gracey.

There was one more thread with my proposal for a Propeller modification.

http://forums.parallax.com/forums/default.aspx?f=25&m=235126

In my fault-tolerant experiments, having only 512 DWords of COG memory is a big problem.
(The Propeller architecture is perfect for this.)

But for that I must program 3 COGs to run the same program independently, with no Hub RAM.
Only very small programs with small capabilities can run in just 512 DWords of RAM.

With more COG RAM, the Propeller would be perfectly suited to advanced control systems.
That would open a very big market too.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Nothing is impossible, there are only different degrees of difficulty.

Sapieha

evanh · 2008-08-27 22:39

Sapieha, looks like you should be using Imagecraft's C.

Paul Baker · 2008-08-27 22:56

Hi Sapieha,
The amount of memory in the cog is one thing that will not change with the new Propeller. There is no space in the 32 bit instruction to express more than 512 distinct locations. And there is no elegant solution to circumvent this.
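The 512 limit follows from how the 32 bits are divided up. As a rough illustration, assuming the Propeller 1 layout of a 6-bit opcode, 4 effect/immediate flag bits, a 4-bit condition, a 9-bit destination and a 9-bit source (the field names below are labels chosen for this sketch, not official mnemonics):

```python
# Sketch of the 32-bit instruction layout, assuming field widths of
# 6 + 4 + 4 + 9 + 9 bits. Field names are illustrative labels only.
FIELDS = [("opcode", 6), ("zcri", 4), ("cond", 4), ("dest", 9), ("src", 9)]

def decode(instr):
    """Split a 32-bit instruction word into its fields, most significant first."""
    out = {}
    shift = 32
    for name, width in FIELDS:
        shift -= width
        out[name] = (instr >> shift) & ((1 << width) - 1)
    return out

# The widths use up the whole word, so there is no spare bit for a wider address.
total_bits = sum(width for _, width in FIELDS)

# A 9-bit field can name 2**9 = 512 distinct cog locations.
max_locations = 1 << 9

word = (0b101000 << 26) | (0b0010 << 22) | (0b1111 << 18) | (5 << 9) | 7
print(total_bits, max_locations, decode(word))
```

Widening either address field by even one bit would mean taking a bit away from the opcode, flags or condition fields.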

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Paul Baker
Propeller Applications Engineer

Parallax, Inc.

Sapieha · 2008-08-27 23:01

Hi evanh

Using Imagecraft's C is no help.

It does not give me bigger programs in one COG.
In fault-tolerant systems you must have 3 or more independent processors/COGs running the same program, and test the results from all 3 simultaneously at high speed.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Nothing is impossible, there are only different degrees of difficulty.

Sapieha

evanh · 2008-08-28 00:19

Sapieha said...
Using Imagecraft's C is no help.
It does not give me bigger programs in one COG.

Maybe you need to think outside the square a little. What you are really wanting is verification and correction right? All cores can read the same data so they can check up on each other when operating in LMM.

Mike Huselton · 2008-08-28 07:40

Chip,

Paul Baker said...
The amount of memory in the cog is one thing that will not change with the new Propeller. There is no space in the 32 bit instruction to express more than 512 distinct locations. And there is no elegant solution to circumvent this.

Would some type of enhanced overlay mechanism or other program image trickery get your Wizard's hat activated? This area might be worth some consideration.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
JMH - Electronics: Engineer - Programming: Professional

Sapieha · 2008-08-28 08:23

Hi evanh.

LMM is not a parallel system with separate CPU and RAM hardware.
In LMM, a fault in the Hub RAM gives you a fault on every CPU/COG.

Hi Paul Baker.
Yes. Not linear but by banking.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Nothing is impossible, there are only different degrees of difficulty.

Sapieha

ErNa · 2008-08-28 09:23

Paul said...
There is no space in the 32 bit instruction to express more than 512 distinct locations. And there is no elegant solution to circumvent this.

Often it makes sense to have arrays in the cog memory. The array elements are addressed by, e.g., a word pointer in the 512-address space but can give access to 64k. That surely would be useful. At least for me!

Post Edited (ErNa) : 8/28/2008 9:29:55 AM GMT

evanh · 2008-08-28 09:38

The instruction allocation is not the only limitation. The SRAM required for those Cog "registers" is not small. The more Cogs there are and the bigger that memory block is the less space there is for Hub ram. However, if there was a new process (MRAM :>) that replaced the SRAM cells with much smaller structures then this concern would diminish considerably.

evanh · 2008-08-28 09:57

Sapieha said...
LMM is not a parallel system with separate CPU and RAM hardware.
In LMM, a fault in the Hub RAM gives you a fault on every CPU/COG.

You can't expect to not use hub memory.

There is only one type of fault that is sensible to repair in software and that is a bit flip - which does happen. I wouldn't know how to manage this but it's fair to say that standard error correction techniques would apply. And you could have your duplicate programs running on multiple cogs that can actively compare with each other and correct hub ram when there is a problem.
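The compare-and-correct idea is essentially majority voting. A minimal sketch, assuming three or more cogs each deposit a result for the same calculation and some checker then compares them; the function name and the list-of-results interface are invented here for illustration, not taken from any Propeller library:

```python
from collections import Counter

def vote(results):
    """Return (majority_value, indices_that_disagree).

    majority_value is None when no value is held by more than half of
    the units, in which case every unit is reported as suspect.
    """
    value, count = Counter(results).most_common(1)[0]
    if count * 2 <= len(results):
        return None, list(range(len(results)))
    return value, [i for i, r in enumerate(results) if r != value]

# Three redundant units, one of which suffered a bit flip
print(vote([0x2A, 0x2A, 0x2B]))
```

A unit listed as disagreeing can be overwritten with the majority value; when there is no majority at all, the only safe response is the watchdog-style reset mentioned in the next paragraph.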

Any hard errors are likely fatal, no way of knowing what is causing the corruption, and are best caught with a watchdog power off.

If you need multiple large memories to go with the multiple processors then the Prop really isn't the right solution.

Evan

evanh · 2008-08-28 10:06

The other neat part about replacing all the SRAM cells with MRAM is you can do away with both the internal ROM and external EPROM for good. And freeing up those pins.

evanh · 2008-08-28 10:10

Well, almost all the internal ROM. Might still need a bootstrap. :)

Sapieha · 2008-08-28 11:55

Hi evanh.

Really no comments, but!

You are behind me in fault-tolerant system construction technique.
In such a system not only the CPUs/COGs must be duplicated, but also power and more.
Every program sits in a separate space to guard against possible faults in the chip's die.

No chip is perfect; each has areas with problems.
One aspect of duplicate processing is to guard against faults in the chip and to correct them: process the same program in parallel, test whether all the results are the same, and decide which result is correct if one is different.

P.S. Think forward, not backwards. Think of tomorrow's possibilities, not what you have today.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Nothing is impossible, there are only different degrees of difficulty.

Sapieha

Post Edited (Sapieha) : 8/28/2008 1:12:56 PM GMT

evanh · 2008-08-28 13:50

Sapieha said...
One aspect of duplicate processing is to guard against faults in the chip and to correct them: process the same program in parallel, test whether all the results are the same, and decide which result is correct if one is different.

I know I was only generalising but do try to understand what I said about what is sensible error correction. It is parallel checking and correcting. However, I think you are probably expecting too much for this method to work well with a faulty chip. Other than to maybe force a power off.

Sapieha said...
P.S. Think forward, not backwards. Think of tomorrow's possibilities, not what you have today.

Yes, of course a larger cog address space would be nicer. But one has to balance that against the real estate used by duplicating it for every core while at the same time still maintaining microamp idle currents.

Evan

evanh · 2008-08-28 13:51

Sapieha: How much are you looking for btw? You seem to be wanting something like 64k per core. That's not where the Prop wants to go.

Sapieha · 2008-08-28 13:55

Hi evanh.

Yes. The Propeller has the potential to be a very big-selling product.
Only Chip Gracey has the power to balance its capabilities.

But it is the customer who revises that power by buying it or not, depending on its possibilities.

P.S. 2k DWords is OK (4x512 banks)

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Nothing is impossible, there are only different degrees of difficulty.

Sapieha

Post Edited (Sapieha) : 8/28/2008 2:06:08 PM GMT

evanh · 2008-08-28 14:37

To me, future looking is 32 cogs and 2 MB hub ram, instead of more cog ram.

hippy · 2008-08-28 14:53

I was a 'more memory in Cogs please' person until I discovered the joys of LMM. True, it's not as fast as native PASM so it's not perfect for everything but it makes any VM kernel extensible in that it can execute code in Hub, I2C Eeprom, SD Card or on attached memory chips.

It would be possible to add banked memory to Cogs and perhaps to just one Cog to keep real-estate overhead down and, whilst a 'hack', it would probably meet most people's desires. I don't expect it to happen, maybe in a later generation.
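To make the banking idea concrete, here is a purely hypothetical sketch: it assumes the 4 x 512 figure suggested earlier in the thread and a bank-select register that does not exist on any announced Propeller. The instruction still carries only a 9-bit address; the bank number supplies the upper bits.

```python
# Hypothetical banked cog memory: 4 banks of 512 longs each.
# Neither the bank register nor these names exist on real hardware.
COG_WORDS = 512
BANKS = 4

def physical_address(bank, addr9):
    """Combine a bank number with the 9-bit address from the instruction."""
    if not 0 <= bank < BANKS:
        raise ValueError("bank out of range")
    if not 0 <= addr9 < COG_WORDS:
        raise ValueError("address does not fit in 9 bits")
    return bank * COG_WORDS + addr9

print(physical_address(3, 511))  # last long of the last bank
```

The cost is that code has to switch banks explicitly, which is why it reads as a hack rather than a clean extension of the address space.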

For leading edge stuff, robust fault-tolerant systems on a chip, it may be worthwhile considering taking the Propeller idea and implementing that in FPGA. It doesn't have to be 100% compatible ( nor compatible at all ) and one can then create what one actually needs. That seems to be a specialised situation which would justify a specialised solution.

For everyone else, what there is seems to be a good all-round match for most situations. Quadrupled RAM on-chip, quadrupled execution speed, double the number of Cogs plus all the whiz-bang being added should fulfil even more desires. More Cog memory would be low on my priority list.

Now if only Ferrari manufactured a Smart Car version with the same performance, 400mpg and sub $1K cost I'd be happy. They can call it the OCPBR, One Car Per Boy-Racer

heater · 2008-08-28 15:20

hippy: I feel the same way.

The COGs are such weird processors. Is there any other processor that executes its native instructions exclusively from its own registers?

With LMM especially the COG RAM is not RAM so much as it is REGISTERS.

I hope Chip is concentrating on speeding up the LMM execution loop as much as possible. Perhaps with auto incrementing rdlong, perhaps with some new hopop that reduces the whole loop to almost nothing. I don't know. Ultimately LMM instruction fetch should be limited only to the HUB memory bandwidth divided by the number of COGS.

With Prop II the C compiler and LMM will be very important so it should have some help. Perhaps even Spin should be changed to use LMM instead of byte codes.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
For me, the past is not over yet.

Beau Schwabe · 2008-08-28 16:06

evanh,

en.wikipedia.org/wiki/Image:MRAM-Cell-Simplified.svg
According to the "simplified" version of the MRAM cell, this would add at least 4 visible process layers to the design, which translates into several thousand dollars per layer that we must pay.

en.wikipedia.org/wiki/MRAM
To paraphrase the above reference:
MRAM has similar speeds to SRAM but is slightly slower. "...a CPU designer may be inclined to use MRAM to offer a much larger but somewhat slower cache, rather than a smaller but faster one. It remains to be seen how this trade off will play out in the future... ...However, to date, MRAM has not been widely adopted in the market. It may be that vendors are not prepared to take the risk of allocating a modern fab to MRAM producing when such fabs cost a billion dollars to build... ...In comparison, MRAM is still largely "in development", and being produced on older non-critical fabs... ...As demand for Flash continues to outstrip supply, it appears it will be some time before a company can afford to "give up" one of their latest fabs for MRAM production. Even then, MRAM designs currently do not come close to Flash in terms of cell size, even using the same fab."

In many cases Memory is outsourced Intellectual Property to companies dedicated to do nothing but create memory.... A customer specifies the parameters and a GDSii file is provided to the FAB along with the design submission from the customer. Sometimes the customer may not even see the Physical layout of the memory, the only thing that needs to be conveyed is where the connections are for power/ground, signals, and the "keep-out" dimensions for the memory block. Because of the specialized nature of Memories, there are several more processes involved that ultimately translate into more time at the FAB and more money required for tooling at the FAB to manufacture a chip. In the long run, it's a balancing act to find just the right recipe.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Beau Schwabe

IC Layout Engineer
Parallax, Inc.

ErNa · 2008-08-28 16:42

Nothing is impossible. Until you try to do it. To me it is a wonder that a little company can have such a product as the Propeller. So we should not put too many exotic requirements on it. We want to have a PropII in finite time, so make everything as simple as possible. Anyway: running the cogs from the hub will bring a lot of wait states. Therefore the concept of a cog with local memory is adequate. Only because SPIN code is interpreted does it make sense to have it stored in global memory.
I only hope the discussion about address range doesn't mean that we are far from takeoff ;-(

Sapieha · 2008-08-28 18:01

Hi All.

The COGs in the Propeller are like Ferraris; with LMM they run one after the other but never in parallel.
It is a waste of their power.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Nothing is impossible, there are only different degrees of difficulty.

Sapieha

hippy · 2008-08-28 20:55

All Cogs running LMM do run in parallel but the performance of each Cog does drop. Whether that's wasted power or not depends on if one needs the speed for anything else. Each Cog still gets up to 5 MIPS performance.

A single LMM interpreter can run multiple LMM programs. There's context switching time but it should be possible to get 2 MIPS each for two programs, 1 MIPS each for three.

That 5 MIPS is what we can get now; with the Prop Mk II that jumps to 20 MIPS ( LMM potentially as fast as PASM is now ), and it may get better still if Chip tweaks the LMM loop capabilities.
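As a back-of-envelope check on those figures, assume an 80 MHz clock, 4 clocks per native instruction, one hub access window per cog every 16 clocks, and a best-case LMM loop that completes one fetched instruction per hub window. All four are assumptions for this sketch, and the helper names are invented.

```python
# Rough MIPS estimate under the stated assumptions; not a timing model.

def native_mips(clock_mhz, clocks_per_instr=4):
    """Instructions per microsecond when running from cog memory."""
    return clock_mhz / clocks_per_instr

def lmm_mips(clock_mhz, hub_window_clocks=16):
    """Best case for LMM: one fetched instruction per hub window."""
    return clock_mhz / hub_window_clocks

pasm_now = native_mips(80)   # native PASM on the current chip
lmm_now = lmm_mips(80)       # LMM on the current chip
lmm_prop2 = lmm_now * 4      # assuming the quoted fourfold speed-up

print(pasm_now, lmm_now, lmm_prop2)
```

Under these assumptions the projected LMM rate on the new chip equals native PASM on the current one, which is the comparison made above. A less tightly scheduled LMM loop that misses a hub window would land below the 5 MIPS figure.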

Sapieha · 2008-08-28 21:03

Hi hippy.

Yes. You said it.
"" All Cogs running LMM do run in parallel but the performance of each Cog does drop ""
Yes, it is a very big decrease in Cog performance!

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Nothing is impossible, there are only different degrees of difficulty.

Sapieha

potatohead · 2008-08-28 21:49

IMHO, doing this would require surrendering determinism and that comes with a whole other set of problems, which will increase code complexity and with that comes overhead, thus limiting the potential overall.

Can't have something for nothing.

If the COGs do round robin, and ideally with the burst access mode discussed, then there is a balance between access and computation. Since both don't happen at the same time, the very real potential exists to get near the full throughput of the chip, for a lot of tasks in a straightforward and predictable way.

If, on the other hand, they all can just access the HUB in a non-deterministic way, then programs must then handle their own timing of things and thus become more complex, which eats into compute time available. Still there is the potential to get near the full throughput, but with the added cost of complexity for more tasks.

I've not done any analysis of the scope of tasks and where these curves fall, but just simple observation of how code has evolved to date suggests a great many fairly easy tasks would be impacted in a negative way, should determinism be lost.

To take your Jaguar analogy (and with cars and computers it's always trouble), having all the cylinders fire at the same time would then lead to what? Serious imbalance and very strong peaks of power, but with serious gaps in the overall delivery of said power. Carry that analogy to the tasks the engine (CPU) is required to perform and it's clear that mass power, without being managed in some fashion is mostly useless.

I think that's about how it is with the prop.

Look at the CELL processor designs of the PS3 and to a degree the design of the PS2. Both of those are capable of some pretty amazing things, but they are a total bear to program. The devil is in the details, and it costs a lot more than people think. These are not deterministic designs and considerable effort has been spent on building robust code that can perform in... a deterministic way, so that the rest of the problem can interface reliably with the mass compute power these chips bring to the table. Why bother with that whole affair, when...

In the deterministic world of the propeller, the determinism gives us a path where we transform hard problems into I/O problems and doing so means managing the I/O in a balanced way so the number of states we must track and account for are known.

The fewer of these there are at any point in the process, the quicker our brains can solve for them and thus code for them to obtain useful results.

Determinism then has the very real benefit of forcing lower complexity code, or at worst case, a structured complexity that can actually be managed without having to literally code an entire OS for that purpose. That benefit means spending far more time on code that directly impacts the problem at hand, not a meta-problem, and that's a more rapid, likely robust, and practical solution.

(all serious business issues where solving problems is concerned)

In other multi-processing scenarios, where the system is not deterministic, there always exists a fair amount of code that essentially has to bring order to chaos in order for the solution to then be realized. I don't think forcing people to entertain this, particularly in the embedded space, makes any real sense.

Finally then, a holistic view of how these things go really means the "power" is not wasted, just moved around some in order to achieve a more productive balance. That's part of the Propeller magic smoke!

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Propeller Wiki: Share the coolness!

Chat in real time with other Propellerheads on IRC #propeller @ freenode.net

Post Edited (potatohead) : 8/28/2008 9:54:31 PM GMT

evanh · 2008-08-28 23:52

Sapieha: You can do both at the same time. Especially if there are 16 cogs.

You can have the small tight "I/O controllers" running in parallel. And you can have the large sequential LMM programs also running in parallel.
