What would you want more of, cogs or RAM?

Tracy Allen · 2006-11-29 05:56

Lower quiescent current, but you know I would say that.

Now, I am assuming that quiescent or leakage current means the current floor when running at 32 kHz or when essentially stopped. Is that right? The operating current at full tilt at 160 or 200 MHz would be a lot more. But maybe not too much more than the current prop, because the gate capacitors are smaller. Would this new chip have the same stepped clocking options as the current one?

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Tracy Allen
www.emesystems.com

Mike Green · 2006-11-29 06:07

I would opt for lower quiescent current. The performance improvement is very small for a large increase in quiescent current, enough to be problematic for battery operation, say as a data logger or in other intermittent operation.

Paul Baker · 2006-11-29 06:24

Tracy Allen said...
Lower quiescent current, but you know I would say that.

Now, I am assuming that quiescent or leakage current means the current floor when running at 32 kHz or when essentially stopped. Is that right? The operating current at full tilt at 160 or 200 MHz would be a lot more. But maybe not too much more than the current prop, because the gate capacitors are smaller. Would this new chip have the same stepped clocking options as the current one?

Hehe, when Chip and I were going over the models for the low power and high performance options, I said "You know Tracy is really going to push for the low power option".

Yes, it is the static current consumption caused by the channel length and doping concentrations. Chip calculated the number of NMOS transistors in the hub memory to arrive at the figure, since that contributes the lion's share of quiescent current consumption. And the new chip will also have the gated clock functionality of the current Propeller.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Paul Baker
Propeller Applications Engineer

Parallax, Inc.

Phil Pilgrim (PhiPi) · 2006-11-29 06:25

I guess the question this discussion begs is, "What kind of datalogging app requires a chip capable of 160MIPS apiece in 8 or 16 cogs?" In other words, if a chip is being designed for high performance, is it even necessary to consider how well it might function in applications that are better served by slower, nanowatt-sipping micros?

-Phil

M. K. Borri · 2006-11-29 06:31

An expendable "camera UAV" with loitering capabilities, used for search and rescue.

Will it be possible to underclock the new Prop, or run it without a crystal? (I know that defeats the point -- just asking if it will be possible at all or not)

Bill Henning · 2006-11-29 06:46

Personally, I'd prefer the 200MHz option :)

Chip Gracey (Parallax) said...

Wow! You all have lots of good ideas. It seems that many of you already have a very thorough grasp of the current Propeller.

I have a question to ask you all:

Which would you rather have:

a) 350uA quiescent current with 160MHz operation (as planned)

b) 1400uA quiescent current, but with 200MHz operation (lower Vt transistors for faster operation, but more leakage).

In 0.18um process technology, there is going to be much greater leakage than there is in the current 0.35um process. Would it be worth it to suffer another 3x increase in leakage for 25% more performance? What do you think?

P.S. The current Propeller's leakage is only 600nA, which makes it viable for battery-powered, always-on applications. A 0.18um chip, in any case, will waste a lot of juice just by being powered up, even if there is no toggling.
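To put the two options in perspective, here is a rough sketch of how long a battery would last on quiescent current alone. Only the 600 nA, 350 uA, 1400 uA, 160 MHz and 200 MHz figures come from the post above. The 2500 mAh AA capacity is an assumption (a typical alkaline figure), and self-discharge and all active current are ignored, so the results are upper bounds at best.

```python
# Rough battery life on quiescent (leakage) current alone.
# ASSUMPTION: 2500 mAh usable capacity (typical AA alkaline); not from the thread.
# Self-discharge and active current are ignored.
HOURS_PER_YEAR = 24 * 365

def idle_life_years(quiescent_ua, capacity_mah=2500.0):
    """Years a battery lasts if the chip only ever draws its quiescent current."""
    hours = capacity_mah * 1000.0 / quiescent_ua  # mAh -> uAh, then divide by uA
    return hours / HOURS_PER_YEAR

# Quiescent currents quoted in the post, in microamps.
options = {
    "current Propeller (600 nA)": 0.6,
    "option a, 160 MHz (350 uA)": 350.0,
    "option b, 200 MHz (1400 uA)": 1400.0,
}

for name, ua in options.items():
    print(f"{name}: {idle_life_years(ua):.2f} years")

leakage_ratio = 1400.0 / 350.0   # option b leakage relative to option a
speed_ratio = 200.0 / 160.0      # option b clock relative to option a
print(f"leakage x{leakage_ratio:.1f} for speed x{speed_ratio:.2f}")
```

Under these assumptions, option a idles for most of a year on one cell and option b for a few months. The figure for the current chip comes out so large that battery self-discharge, not leakage, would be the real limit.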

Alex Silva · 2006-11-29 07:35

I vote lower power. I was thinking of the same thing as Borri, a camera-bearing UAV :)

Besides, if the 8x8 prop supercomputer project ever gets going, lower power consumption is going to be a good thing.

I was looking over the thread on the RF stuff posted earlier and think that whatever can be done to support that would be nice too. Lots of great ideas floating around.

OzStamp · 2006-11-29 10:52

Hi Chip.

Low power preferred.

160 MHz awesome.. sounds like you're enjoying this all.
Thanks for what you're doing.. this is really very exciting stuff..

I am still having fun with the old trusty Stamp as well..
Creating large characters on a std serial lcd .. it got me hooked.
May have to try and port it to a Propeller as an exercise ..

Did I mention protection ?

How difficult would it be to add real A/D conversion?
4 or so channels sitting in a corner somewhere, 12-bit+ resolution..
always ready to supply a COG with a value, no coding.. just reading a register..
Tracy would love that also..

Ronald Nollet
Parallax Australia distributor

william chan · 2006-11-29 13:31

After reading some posts and their reasonings, I now change my mind.
I now prefer a 16 cog version with 128K RAM. ( 128K is still much larger than the current 32K ).
Having more cogs is what makes the Propeller different from the crowd.
Having more RAM just makes the Propeller more and more like an ARM processor, of which many are already available in the market.
Lower current consumption is also preferable.
... and don't forget the on-chip eeprom and a full speed hardware serializer for bluetooth or 802.11b.
Thanks.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
www.fd.com.my
www.mercedes.com.my

Post Edited (william chan) : 11/29/2006 1:52:35 PM GMT

ciw1973 · 2006-11-29 14:35

Much though I love raw power, I don't see that the penalty in terms of leakage in this case is worthwhile. So put me down on the low power list.

I've been ill for a couple of days, and have had lots of time to think, and I've come back around to the idea of 16 cogs and 128K RAM.

As the library of modules we have available increases (I'm thinking a year down the line) it will become more and more important that we have more cogs to allow all people to combine these as they see fit. Before long I can see there being an increasing number of "must-have" multi-cog modules, and can see us hitting the limits of the current hardware pretty quickly as a result.

After all, the Propeller's biggest selling point is its multi-processing capabilities, and a new version with the same number of cogs (albeit running a lot faster and with more shared RAM) would do little to bring new people to the party. Getting more from a new version of the chip with the same number of cogs will require a knowledge of assembly language, or a dependence upon those who do, and whilst that's where I'm happiest personally, there is a whole world of people out there who'll not have heard of the Propeller yet, and won't have any interest in getting their hands that dirty. They'll just want to plug together a load of pre-existing modules and write some spin code to get their project working, and I think that's a market that's too big and exciting to ignore.

The Propeller gets me excited in the same way that CPLDs & FPGAs did a few years back, the potential is simply incredible, but the difference here is that all this potential is available to the masses who only require some really basic coding skills to make use of it. There really hasn't been anything like this whole package before.

More seasoned developers can put in the time to write e.g. a single cog combined keyboard and mouse driver, but a much wider, less technical potential audience will see the Propeller as an opportunity to build new and exciting things using a LEGO-type approach. Tell them that they have 8 "slots" they can fill with modules and they'll be happy, tell them they've got 16, and even though some of the things they'll want to "plug in" may take two or more of these, they'll be much happier, and see the chip as twice as powerful. I think John Abshier made this point very well in an earlier post.

I know many people have a need for more memory, but using a cog as a dedicated external serial memory access interface would solve many of the problems which would initially suggest that more internal memory is needed, especially with the proposed increase in speed. Having banks of say 4K, with a long to indicate which 256 byte blocks are "dirty" and need to be written back to external memory could be implemented easily, and I feel would be very useful.
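A minimal sketch of the dirty-block idea, assuming a 4K bank split into 256-byte blocks (16 blocks, so the flags fit comfortably in one 32-bit long). The `CachedBank` class and its method names are hypothetical, and a plain `bytearray` stands in for the external serial memory; a real cog driver would be written in assembly and talk to an actual device.

```python
# Sketch of a write-back bank with one dirty bit per 256-byte block.
# ASSUMPTION: class/method names are illustrative; `backing` stands in
# for external serial memory.
BANK_SIZE = 4096
BLOCK_SIZE = 256
BLOCKS_PER_BANK = BANK_SIZE // BLOCK_SIZE  # 16 blocks, fits in one long

class CachedBank:
    def __init__(self, backing, base=0):
        self.backing = backing
        self.base = base
        # Local copy of the bank, as it would sit in on-chip RAM.
        self.data = bytearray(backing[base:base + BANK_SIZE])
        self.dirty = 0  # bit n set => block n was modified locally

    def read_byte(self, offset):
        return self.data[offset]

    def write_byte(self, offset, value):
        self.data[offset] = value & 0xFF
        self.dirty |= 1 << (offset // BLOCK_SIZE)  # mark the containing block

    def flush(self):
        """Write back only the dirty blocks; return how many were written."""
        written = 0
        for block in range(BLOCKS_PER_BANK):
            if self.dirty & (1 << block):
                start = block * BLOCK_SIZE
                dest = self.base + start
                self.backing[dest:dest + BLOCK_SIZE] = self.data[start:start + BLOCK_SIZE]
                written += 1
        self.dirty = 0
        return written

external = bytearray(8192)          # pretend external memory
bank = CachedBank(external, base=4096)
bank.write_byte(0, 0xAA)            # touches block 0
bank.write_byte(300, 0x55)          # touches block 1
blocks_written = bank.flush()
```

The appeal of the scheme is that a flush costs transfer time only for the blocks that actually changed, which matters on a slow serial link.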

I'd also just like to add/re-iterate my backing to the following ideas, roughly in order of priority:

- Having on-chip FLASH/EEPROM, ideally as a replacement for the current ROM. This would open up whole new avenues for alternative language interpreters, fonts etc. would reduce component counts and reduce start-up times etc. All code could live here, and leave the RAM as actual working memory.

- Allowing some degree of scheduling of hub access for the cogs. I know this is potentially a nightmare to implement, but if a suitable mechanism can be found, and each cog gets an equal share of access by default for reasons of backwards compatibility, then it could be very powerful and allow much more efficient use of the chip's shared resources. I think the resolution would have to be somewhere in the region of 256 slots to allow enough flexibility to make it worthwhile though.

- Having the more generalised enhanced video shift registers that Bill Henning proposed. The potential here is quite simply incredible, and I really hope these can be implemented.

- Having a single cycle hardware multiply (and divide?) would be fantastic.

- Having some form of direct inter-cog communication. Using port B seems like a good, simple, flexible way to achieve this for syncing and the transfer of small amounts of data etc., with Bill's fast serial link used for larger amounts of data. My only concern is the amount of code required in both communicating cogs to implement the transfer, so I think having both would be important.

- I think keeping all cogs identical in hardware terms is essential, and that any additional peripherals should be accessible via the hub.

- Also having a couple of ADCs and DACs built in and accessible in the same way as the system counter via the hub rather than built into the cog hardware.

- Whilst most people (myself included) had been more interested in producing VGA output, having an absolute limit of only four colours per 16 pixel block on the TV output side of things is pretty limiting, especially considering the enthusiasm for using the Propeller for video games that the Hydra has aroused. But I'm sure Andre' will already have given you some good ideas in this area.

- Making a DIP version of the chip available, or at least a version in a PLCC package for through-hole prototyping.
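The hub scheduling item in the list above can be sketched as a 256-entry slot table, where each entry names the cog that owns that hub window. Everything here is an assumption made for illustration: the table format, the function names, and the use of a smooth weighted round-robin to spread a cog's slots evenly. Nothing in the thread says how such a mechanism would work in silicon.

```python
# Sketch of a programmable hub slot table with 256 slots.
# ASSUMPTION: table layout and function names are hypothetical.
from collections import Counter

N_SLOTS = 256

def default_table(n_cogs=8):
    """Equal shares: plain round-robin, the backwards-compatible default."""
    return [slot % n_cogs for slot in range(N_SLOTS)]

def weighted_table(weights):
    """Build a table where cog c owns weights[c] slots, spread out evenly.

    Uses smooth weighted round-robin: every slot, each cog earns credit
    equal to its weight, and the cog with the most credit takes the slot.
    """
    total = sum(weights.values())
    if total != N_SLOTS:
        raise ValueError("weights must sum to N_SLOTS")
    credit = {cog: 0 for cog in weights}
    table = []
    for _ in range(N_SLOTS):
        for cog, weight in weights.items():
            credit[cog] += weight
        chosen = max(credit, key=credit.get)
        credit[chosen] -= total
        table.append(chosen)
    return table

# Example: cog 0 is hub-hungry, the other seven cogs need little access.
weights = {0: 144}
weights.update({cog: 16 for cog in range(1, 8)})
table = weighted_table(weights)
shares = Counter(table)
```

With the default table every cog keeps its one-in-eight share, so existing code would see no change in timing unless it opted in to a custom table.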

There are also a couple of things which I'd like to propose:

- With the current architecture there is a practical limit of 512 longs of cog memory, how about we double this, as well as the status registers, PC etc. and allow each cog core to effectively run two processes on a time-share basis? There would need be a way to swap between the two states at intervals based on the system clock, as well as under cog control. This would allow more effective use of cogs for processes which are less active or non-time critical, and could also be a way of implementing fast interrupts which I've seen calls for on other threads.

- For reasons of backwards compatibility, there should be a way to speed limit each cog so it runs at 1/2, 1/4 or 1/8th of full speed. This would allow old modules to be run on a new, faster version of the chip without there being timing issues, and all you'd need to know as a user would be the speed the original module was designed to run at.

Regardless of which route is taken, let's face it, the new Propeller is going to be magnificent. We're really nearing the limits of the current version's capabilities, and I can only imagine what we'll be doing with it by the time the next generation of this marvellous chip starts shipping.

helloseth · 2006-11-29 15:50

ciw1973 said...

- Having on-chip FLASH/EEPROM, ideally as a replacement for the current ROM.

Chip said this was 'impossible' due to their current chip fab tech.

I would like to know if it is possible to make a 'custom' prop with my own rom. I realize that this would be big bucks, but it would allow the prop to be used in a killer device with huge volumes.
(Just dreaming, but you never know.)

Seth

Phillip Y. · 2006-11-29 16:21

I suggest that with 16 cogs they be set up as pairs of cogs,
that is, each of the 8 pairs of cogs would have more interconnection
of registers, memory, counters, etc.
Perhaps one cog could be disabled and the other could use the memory of both.
For compatibility, start up with 8 cogs and then enable others as needed.

CJ · 2006-11-29 16:36

after reading all the info suggested, my vote now goes to the 16/128k version

160 MHz
2.56 BIPS total, that's more power than some desktop computers still in use today
32 counters with 6.25ns granularity
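The arithmetic behind these figures, as a quick check. It assumes one instruction per clock per cog (the "160MIPS apiece" figure used earlier in the thread) and two counters per cog, as on the current Propeller; neither is confirmed for the new chip.

```python
# Back-of-envelope totals for a 16-cog chip at 160 MHz.
# ASSUMPTIONS: one instruction per clock per cog; two counters per cog.
CLOCK_HZ = 160_000_000
N_COGS = 16
COUNTERS_PER_COG = 2

total_bips = N_COGS * CLOCK_HZ / 1e9        # billions of instructions per second
total_counters = N_COGS * COUNTERS_PER_COG
clock_period_ns = 1e9 / CLOCK_HZ            # counter granularity = one clock period

print(f"{total_bips:.2f} BIPS, {total_counters} counters, {clock_period_ns:.2f} ns per clock")
```

One clock period at 160 MHz is 6.25 ns, which is the finest step a counter clocked directly from the system clock could resolve.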

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Who says you have to have knowledge to use it?

I've killed a fly with my bare mind.

pjv · 2006-11-29 17:34

Hi All;

While most of my needs (industrial process control) do not require the nice video features the Propeller has, I do find the concept of one thread per COG somewhat limiting, and wasteful when the COG goes into a "wait" mode. What I would really like is some simple method of implementing deterministic interrupts/task switching per cog. If we had that capability, the 8 COGS would be plenty, especially with the higher clock speed. In that case I would favor more RAM over more COGS.

However if such a task switching scheme is not possible or reasonable, then I would vote for extra COGS at the expense of RAM.

As for power consumption, the lower power wins hands down........ I'm presently implementing the current Propeller in an industrial product that wants to run for 5 years on a set of AA alkaline batteries, and I'd love to be able to consider the new chip for this in the future.

Cheers,

Peter (pjv)

potatohead · 2006-11-29 18:03

I'm liking the points presented for the larger number of cogs. There is something elegant in just putting those necessary software peripherals on their own cogs, then weaving them together with a big memory model (or higher level language) program running on the remaining cogs.

Software efforts on the prop are still early. This decision will impact those to a degree. What we've seen so far is the lego approach really appealing.

In this, I think the higher number of cogs is a clear winner. No matter which way this goes, I'm stoked personally.

This thread is getting messy, what are the real tradeoffs with the higher number of cogs?

-lower overall hub memory access speed
-lower overall system ram
-higher overall power requirements.

Does this outweigh the positives the higher number of cogs bring to the table?

A standard method of interfacing external RAM would really balance this aspect of the discussion.

For me, the prop has four primary appeals:

-all in one package that does a lot with very little additional electronics
-speed
-true multi processor
-simplicity and symmetry in design.

Eventually, the prop is gonna scale beyond these things. That's my opinion and it's based on the clear advantages the multiprocessing aspect brings to the table. A while back, somebody here posted their robotics win with a prop that was up against far more powerful linear multi-thread machines running a lotta stuff. The ability to multi-process removed a lot of complexity from the task at hand. IMHO, this is very significant and is a primary differentiator for the prop.

When the prop scales to the point where it's being used for higher level computing tasks, running applications, etc... it's gonna have different needs than it will now.

Given this, I vote for the 16 cog model. Those of us wanting to build all in one applications are going to find the RAM limits somewhat restrictive for larger scale projects. In the end though, it's gonna come down to interfacing with external resources. Bigger RAM will put this off, but not stop it. With the amount of speed coming to us, we are gonna outgrow even the 256 quicker than we think.

My only reservation comes down to the higher precision possible with fewer cogs. I'm not sure that's worth diminishing the multi-processor aspect of things. Again, I point back to the robotics application posted here. IMHO, that's a direction that should be continued.

We know more about multi-thread code than we do real multi-processing code. Bending the prop in that direction will make some things easier, but at the cost of encouraging more true multi-processing application development on the chip.

Given the body of software we have right now, one would be able to easily build a system with a video display, control I/O, mouse, keyboard, speech output, sound output, on board lcd display, large capacity external storage, external RAM, another video display, etc...

The 16 cog model will allow for all of that, and still have cogs left for the actual intelligence or application running on the system. IMHO, that's a lot of power in the package!

Would the next scaling of the prop after that have only 8 cogs? I don't think so. Would be too coarse. Why not embrace it now then? Software that ties the cogs together will tie 8 cogs as well as it will 16 cogs. We might end up with more cogs yet, depending on what people do.

I'm sold on the 16 cog. Frankly, it will force creative objects and circuits for RAM and storage. Dedicating a coupla these cogs to these tasks would essentially bring us the 16 cog / large memory prop anyway. It's gonna take code to do the bigger things no matter what the onboard RAM model is. Why deny those who don't need big code, the cogs they would use for their projects?

Rsadeika · 2006-11-29 18:10

I have been following this thread for the last couple of days, and I guess I can voice my opinion. I started out gung-ho for the 8 cog/256kb option, but after reading all the different info that was offered up, I have switched to the 16 cog, 128kb option. I thought about, for my application at least, what I would be running out of first, RAM or cogs. The answer was probably cogs, so the more cogs the better. Adding an interrupt option would defeat the concept of the Propeller.

As for the power issue, again, in my applications I will be using battery power, so the lower the power consumption the better.

Ray

Phil Pilgrim (PhiPi) · 2006-11-29 18:47

Peter,

If you'll indulge my playing devil's advocate for a moment: This regards the question about base-level current consumption and pertains to your 5-year AA-battery app. Given that you can't be leveraging much of the performance and features designed into the Prop at that level of consumption, why did you choose the Propeller for that app in the first place (as opposed to, say, an MSP430)? I'm just trying to wrap my mind around why minimum base-level current consumption is so important in a device designed for performance. It seems a little like trying to optimize a Formula 1 racer to get 40mpg when it's going under 25mph, so it can be used to deliver mail.

Thanks,
Phil

Post Edited (Phil Pilgrim (PhiPi)) : 11/29/2006 8:53:00 PM GMT

parts-man73 · 2006-11-29 18:48

With this talk of higher power consumption.... Will cooling ever become an issue? Right now, my Props never seem to warm up at all, at least not to the point of needing any kind of cooling.

With some of the specifications that have been mentioned, it comes to mind that at least some type of heatsink may be required.

Brian

helloseth · 2006-11-29 19:03

potatohead said...
My only reservation comes down to the higher precision possible with fewer cogs.

Can you explain what you mean by "higher precision possible with fewer cogs"?

For me, the current "ideal" would be 16 cogs with programmable hub access. So a PWM cog, can monitor the motors, and every once in a while access the hub ram to look for commands, and leave other cogs with faster (or more frequent) hub access.

Seth

potatohead · 2006-11-29 20:06

Lower number of cogs means faster overall access to shared resources in the hub. Where matters of timing are concerned, that involve the hub, or communication between cogs, we would have more precision in that regard.
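A small sketch of that tradeoff. On the current Propeller the hub serves the cogs in strict round-robin and each cog's window comes around once every 16 system clocks, i.e. 2 clocks per cog. Whether the new chip keeps 2 clocks per slot is an assumption here, and the function name is hypothetical.

```python
# Worst-case wait for a hub window under strict round-robin access.
# ASSUMPTION: 2 system clocks per cog slot, carried over from the current chip.
def worst_case_hub_wait_ns(n_cogs, clock_hz, clocks_per_slot=2):
    """Time for the hub to go once around all cogs, in nanoseconds."""
    rotation_clocks = n_cogs * clocks_per_slot
    return rotation_clocks * 1e9 / clock_hz

for n_cogs in (8, 16):
    wait = worst_case_hub_wait_ns(n_cogs, 160_000_000)
    print(f"{n_cogs} cogs at 160 MHz: up to {wait:.0f} ns between hub windows")
```

Under these assumptions, doubling the cog count doubles the interval between a cog's hub windows, which is the loss of timing precision described above.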

Paul Baker · 2006-11-29 20:47

parts-man73 said...
With this talk of higher power consumption.... Will cooling ever become an issue? Right now, my Props never seem to warm up at all, at least not to the point of needing any kind of cooling.

With some of the specifications that have been mentioned, it comes to mind that at least some type of heatsink may be required.

Brian

No, the next generation will not require a heatsink either.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Paul Baker
Propeller Applications Engineer

Parallax, Inc.

pjv · 2006-11-29 21:08

Hello Phil;

Currently I sell a product that is mostly asleep, but has a few millisecond burst of chaotic activity at precisely pre-determined intervals. I now do this with a PIC (I hate PICs so I had someone else write the code) and it consumes an average current of 160 uAmps, with a resultant AA alkaline battery life in excess of one year.

My market will be significantly expanded if I can stretch that life..... 7 years would be nirvana, but 3 years is very good.

With clever power conservation methods I hope to get down below 50 uAmps average, and I believe the Propeller can do this for me. The 32 bits with (future) hardware multiply should be phenomenal, and I can write part of my code in high level language to boot.
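How a mostly-asleep design arrives at an average current can be sketched as a duty-cycle calculation. The 160 uA and 50 uA averages are from this post; the sleep current, burst current, burst length, interval and the 2500 mAh AA capacity below are all hypothetical values chosen only to illustrate the method.

```python
# Average current and battery life for a sleep/burst duty cycle.
# ASSUMPTIONS: every numeric input except the 160 uA and 50 uA averages
# is hypothetical; capacity ignores self-discharge.
def average_current_ua(sleep_ua, active_ua, burst_s, period_s):
    """Time-weighted average of burst current and sleep current."""
    duty = burst_s / period_s
    return active_ua * duty + sleep_ua * (1.0 - duty)

def battery_life_years(avg_ua, capacity_mah=2500.0):
    hours = capacity_mah * 1000.0 / avg_ua   # uAh divided by uA
    return hours / (24 * 365)

# Hypothetical profile: 10 uA asleep, 20 mA for a 2 ms burst once per second.
avg = average_current_ua(sleep_ua=10.0, active_ua=20_000.0, burst_s=0.002, period_s=1.0)

print(f"hypothetical profile: {avg:.2f} uA average")
print(f"160 uA average: {battery_life_years(160.0):.2f} years")
print(f"50 uA average: {battery_life_years(50.0):.2f} years")
```

With this profile the sleep current and the bursts contribute 10 uA and 40 uA respectively, so both the chip's quiescent floor and the length of each burst matter for reaching a multi-year target.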

On top of all that, I just don't want to learn yet another micro architecture. Between the SX and the Propeller, all my needs are satisfied.

Cheers,

Peter (pjv)

GreyBox Tim · 2006-11-29 22:41

I'd go for the lower power consumption, 8-cog high memory version. This seems like the easiest way to retain the ROI for the existing silicon, and still keep the basic architecture of the Propeller the same.

I think that there is quite a bit that can be done in an 8-processor configuration - but I'd be wary of upping the cog count for a "PC replacement" type application. The embedded applications for this chip are what brought me into it - and it's very important to have a larger RAM so that one can build more involved low-level drivers for many of the existing ICs on the market.

There's a ton of processing time in a single second at 160MHz, and I'd venture a guess that most of the current cogs are not being used efficiently and thus sit idle an awful lot...

The other issue is the IO and data transfer – either way you’ll likely end up with more data to send or receive than can be simply managed at one time with the current 32-IO layout and the hub times. By adding the memory to the Cogs – more functions can be done within one Cog, minimizing Cog-Hub-Cog data handoffs and Cog/Hub-RAM transactions. This should improve the processing throughput enough to make a difference (as opposed to doubling the Cog count and having to manage more complicated data transfer timings and bandwidth overheads).

Dual-port memory (while space hungry and expensive) is not too hard to do, but splitting the time that the hub and a second memory controller gets (and optionally hardwiring it to the “B” port I/O for super-bus transactions), should reduce re-development to an absolute minimum.

Getting as many single instruction operations in the die is really helpful at increasing the work output of a micro. Multiply is great - and divide can be spoofed easily with a quick multiply.
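One common way to get a divide out of a multiply, when the divisor is a known constant, is to multiply by a scaled reciprocal and shift. The sketch below divides an unsigned 32-bit value by 10 using the widely used constant 0xCCCCCCCD and a 35-bit shift. It assumes the hardware multiply can deliver the upper bits of a 64-bit product, which the thread does not specify, and it does not cover division by a variable.

```python
# Division by a constant using one multiply and one shift.
# ASSUMPTION: a 32x32 multiply with access to the high product bits.
MAGIC_DIV10 = 0xCCCCCCCD   # ceil(2**35 / 10)
SHIFT_DIV10 = 35

def div10(x):
    """Return x // 10 for 0 <= x < 2**32 without using a divide."""
    return (x * MAGIC_DIV10) >> SHIFT_DIV10

for value in (0, 9, 10, 99, 12345, 2**32 - 1):
    assert div10(value) == value // 10
```

The same approach works for other constant divisors with a different constant and shift, which is how many compilers avoid divide instructions.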

Cheers!

-Tim

cgracey · 2006-11-30 00:28

Another question for you all:

What if the entire chip ran at 1.8V only?

Pros:

This would eliminate separate 3.3V VDD_IO pins.

No need for a separate 3.3V supply.

Pins would transition much faster for LVDS possibilities.

It would simplify the device architecture.

At 1.8V, video/VGA generation would still work fine, so would direct interfacing to elemental sensors.

Cons:

1.8V would be inadequate for driving LEDs.

Transistors would have to be used to generate 3.3v+ outputs.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔

Chip Gracey
Parallax, Inc.

OzStamp · 2006-11-30 00:35

Hi Chip.

My vote would be no.
For industrial type apps you would need to add too many components externally..
You need to be able to drive opto's and stuff like that without having to add too much ext stuff.

Ronald Nollet

Bean · 2006-11-30 00:58

Couldn't you still drive an LED by connecting the anode to 3.3V and the cathode to the pin?
When the pin is high there will only be 1.5V across the LED (not enough to light it), but when the pin is low there will be 3.3V across it (enough to light it).
That's what I would do with a 1.8V output pin.
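The numbers in this trick can be checked with a simple model. The 3.3V supply and 1.8V pin levels are from the post. The 2.0V forward voltage and the 330 ohm series resistor are assumptions, and the model treats the LED as fully off below its forward voltage, which is only approximately true (a low-Vf LED might glow faintly at 1.5V).

```python
# Sinking an LED from a 3.3V rail with a 1.8V-logic pin.
# ASSUMPTIONS: 2.0 V forward voltage, 330 ohm series resistor, ideal LED model.
SUPPLY_V = 3.3
PIN_HIGH_V = 1.8
PIN_LOW_V = 0.0

def voltage_across_led_branch(pin_v, supply_v=SUPPLY_V):
    """Voltage available across the LED (and its resistor)."""
    return supply_v - pin_v

def led_current_ma(pin_v, led_vf=2.0, resistor_ohms=330.0):
    """Approximate LED current; zero when the branch voltage is below Vf."""
    headroom = voltage_across_led_branch(pin_v) - led_vf
    return max(headroom, 0.0) / resistor_ohms * 1000.0

for label, pin_v in (("pin high", PIN_HIGH_V), ("pin low", PIN_LOW_V)):
    print(f"{label}: {voltage_across_led_branch(pin_v):.1f} V across branch, "
          f"{led_current_ma(pin_v):.2f} mA")
```

With the pin high the branch sees 1.5V and the modelled current is zero; with the pin low it sees the full 3.3V and the LED conducts.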

Bean.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Cheap used 4-digit LED display with driver IC www.hc4led.com

Low power SD Data Logger www.sddatalogger.com
SX-Video Display Modules www.sxvm.com
Stuff I'm selling on ebay http://search.ebay.com/_W0QQsassZhittconsultingQQhtZ-1

"People who are willing to trade their freedom for security deserve neither and will lose both." Benjamin Franklin

cgracey · 2006-11-30 01:08

Bean (Hitt Consulting) said...
Couldn't you still drive an LED by connecting the anode to 3.3V and the cathode to the pin?
When the pin is high there will only be 1.5V across the LED (not enough to light it), but when the pin is low there will be 3.3V across it (enough to light it).

Yeah, that's what I was thinking when Ron said you couldn't drive opto-isolators. The bigger headache would come when you needed to interface to bidirectional 3.3V I/Os.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔

Chip Gracey
Parallax, Inc.

James Long · 2006-11-30 01:14

Chip,

My opinion......the separate supply adds one regulator and a few caps......a total 1.8 volt chip adds a bunch more components.

James L

Phil Pilgrim (PhiPi) · 2006-11-30 01:20

Chip,

Given how many questions and issues there've already been in the forum with 3.3V vs. 5V, I think a 1.8V interface is a non-starter. That big a speed bump in the road to design-ins would lose too many potential users.

-Phil

Post Edited (Phil Pilgrim (PhiPi)) : 11/30/2006 5:28:09 AM GMT

william chan · 2006-11-30 01:30

Can we run the core at 1.8v, but allow the I/Os to be powered by anything from 1.8v to 5v?
That would be perfect.

One more thing, since the SPIN interpreter doesn't fully use up the 512 longs, SPIN should allow REGVAR ( register variables )
that allocates space on the unused cog memory. This will allow faster cog processing and reduce hub memory bottlenecks.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
www.fd.com.my
www.mercedes.com.my
