We would not see a P1B until the end of this year, at the earliest. It would NOT end up being quick or easy as there would be 'things' that would surface in expanding it that would need to be dealt with. To make it viable at all you would need to add at least some of the new instructions from the P2 and that will open a new can of worms.
That puts the P2 out at least another 2 years (this year lost + next to get back to it and get it done). And that is ONLY if this new P1B sells enough to help fund the P2 further.
I agree with posts above, a P1B at this point would pretty much doom the P2 to never getting done.
That said, I would like to see a P1B AFTER the P2 comes out to keep the P1 series fresh. There is room for an improved P1, if only in adding analog to the I/O and more memory (Hub and Cog; no HUBEXEC, so Cog RAM expansion is worth its weight in gold...).
The P1B would also not solve the power issue with the P2 so all of the work and/or marketing decisions to finalize it and get it done are still there to do. Better to just do that NOW and move on than to delay it for a year and what? Hope for a miracle and that someone drops the entry price for 65nm processes down to what 180nm are today?
The P2 will be amazing and I really think that Chip can do a lot to reduce power usage when he starts looking at optimizing the power chain in the instructions. I would rather see him spend valuable time working on that than on some P1 FPGA test variants.
Probably not, thinking about it, as the 5W figure is not proven, and running at a lower MHz would drop that a lot.
Not to mention I cannot think of a single application that would churn all gates in all eight cogs simultaneously. Keeping cordic, mul, div, mac, video, hub, counters, and serial all 100% busy all the time... LOL!
No matter what Chip proposes, the forum demands more. Perfection so far has killed good enough and we only have a P1. I like Chip's proposal in post #494. Twice the cogs (I have run out of cogs.), 8x hub ram (Since I don't do graphics or gaming not as important for me.), 2x pins (I have run out of pins.), smart IO (Built in pull-ups and ADC is really big.), >2x counters, 10x faster than current chip, cost comparable to current chip (Boards will be more expensive.). I really would like to have a new chip to play with when winter curtails my outdoor activities.
Parallax, please ignore all the suggestions for a P1+ and focus on getting the P2 out.
I agree most strongly with this and should have qualified what I wrote above much better. There is a reason why I've not made, nor will make, P1+ suggestions.
I know bean counters. If the P1B is made first, we will NEVER see the P2.
Right on!
What frustrates me is every time I start to get serious about some code and begin building, we have one of these messes which makes me want to file the DE2 and go do something else.
The P2 as it has been known has actually morphed into a P3: after it failed the first two fab attempts, design enhancements were added in the meantime almost daily. Knowing this, we should not even be referring to Chip's latest suggestion as a P1B, as a P1B would have just been a P1 with extra I/O. The P2 was intended to be at least a faster P1 with more memory and more I/O, perhaps even more cogs, and this is what Chip has said he can do "EASILY". So the P2 will indeed come to fruition as Parallax promised, but with aces up their sleeve to follow.
The P3 can only make it to market if it is done properly but it has to have a market to come to, which it won't have if it continues to have design enhancements which keep delaying the silicon. The P3 design should ideally be fabricated in 65nm as Chip mentioned and it will be fantastic once all the bugs are ironed out and the tools are made available.
We have been waiting a long time for the P2, so if it is indeed easy enough, then can we just have the P2 in the meantime for real commercial products? I'm quite happy to wait for the P3, which I hope will be used in rather more than just some fancy boys' toys.
Please, remember that Parallax currently has a very particular RMA policy (it looks like they accept everything). I am worried about this 5W.
The PLL multiplier may need to be limited to 4x or 8x (not 16x). The price per die will need to be increased to cover the expected failure rate, or Parallax would need to change their RMA policy, at least for the P2 IC.
Technical comparison between P16X32B @ 100MIPS (200MHz) and P2 @ 160MHz
Maximum Hub Bandwidth per Cog:
P1B: 200/16 * 4 = 50MB/sec
P2: 160/8 * 32 = 640MB/sec
Winner: P2 by a factor of 12.8!
Total hub bandwidth per chip:
P1B: 16*50 = 800MB/sec
P2: 640*8 = 5,120MB/sec
Winner: P2 by a factor of 6.4!
MIPS per cog
P1B: 100MIPS
P2: 160MIPS
Winner: P2 by a factor of 1.6-10 depending on instruction mix.
MIPS per chip
P1B: 1,600 max MIPS
P2: 1,280 max MIPS
Winner: P2, due to more DMIPS despite fewer raw MIPS
*NOTE: P2 can do a lot more work per cycle, has auto increment address modes, mul, div, cordic etc
Hub bandwidth makes a huge difference. Compiled code would run much faster on P2. Overall, P2 system throughput will be faster.
Video Limits
1080p60, 8 bits per R,G,B
165MHz dot clock
2,073,600 pixels per frame (1920*1080, 8bpp) ==> 124.42MB/sec (no CLUT, so RRRGGGBB is the best possible color) ==> NOT POSSIBLE ON P1B (maybe using 3 cogs)
6,220,800 bytes per frame (24bpp) ==> 373.25MB/sec @ 60Hz ==> NOT POSSIBLE ON P1B (maybe using 8 cogs)
NOT POSSIBLE ON P1B because it CANNOT read even an 8 bit 1080p60 bitmap from SDRAM fast enough to refresh the screen!
On P2, possible using 1/4 of a cog (1 task)
Winner: P2 by 20:1 (8bpp) 60:1 (24bpp) ... a landslide
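The ratios above follow directly from the quoted figures; a quick sketch (Python, using only the numbers stated in the post) to double-check the arithmetic:

```python
import math

# Double-checking the comparison arithmetic with the figures quoted above:
# P1B = 16 cogs @ 200MHz, one 4-byte hub slot every 16 clocks;
# P2  = 8 cogs @ 160MHz, 32 bytes per 8-clock hub window.

p1b_cog_bw = 200 / 16 * 4      # 50 MB/s per cog
p2_cog_bw = 160 / 8 * 32       # 640 MB/s per cog
print(p2_cog_bw / p1b_cog_bw)  # 12.8, the quoted per-cog ratio

print(p1b_cog_bw * 16, p2_cog_bw * 8)   # 800 vs 5120 MB/s per chip

# 1080p60 frame-buffer bandwidth (here 1 MB = 10**6 bytes):
pixels = 1920 * 1080                    # 2,073,600 pixels per frame
bw_8bpp = pixels * 1 * 60 / 1e6         # ~124.42 MB/s
bw_24bpp = pixels * 3 * 60 / 1e6        # ~373.25 MB/s

# Cogs a P1B would need just to stream the bitmap out of hub:
print(math.ceil(bw_8bpp / p1b_cog_bw),   # 3 cogs at 8bpp
      math.ceil(bw_24bpp / p1b_cog_bw))  # 8 cogs at 24bpp
```

This also shows where the "maybe using 3 cogs" and "maybe using 8 cogs" parentheticals come from: they are just the refresh bandwidth divided by the 50MB/s per-cog hub ceiling.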
But what do Parallax's current high-volume customers think about a $12 5W BGA IC (or a $12 3W BGA IC)?
That is really the issue, and I think it's lacking a bit in the whole P2 saga. Parallax can only afford to do one design, not only in money but in time (i.e., it took 8 years to get here).
Just back from EELive (the old Embedded Systems Conference), and I can tell you a 5W microcontroller would get laughed off the floor. Power is a big issue, not only because of battery-operated applications, but because these days many products have more than one micro inside, and if your one micro is hogging 5W, it will get replaced. I've never thought Parallax could reasonably compete in the general marketplace for micros, as they were starting from the disadvantage of using process technology that is 10 years old. Yes, some of the small 8-bit and ARM chips are using 150nm, but those are the <$1 chips, which are pad limited anyway. The other selling point of those devices is power at less than 50uA/MHz. Even if you buy the aggregate MIPS of a P2 (and I don't), that works out to 1500uA/MHz.
The reason I don't buy the 1600 MIPS is that the design idea of the P2 is to trade off programmable cores for peripherals, so that most of the P2 COGs are dedicated to peripheral tasks. The main application program gets handled by a couple of the COGs. If you really think the P2 can compete, I'd suggest trying a real-world example benchmark. As the P2 does video, I would suggest comparing the decompressing of a JPEG image and displaying it, and seeing how the FPGA implementation stands up vs. an ARM11 (BeagleBone market) or a Cortex-M4 (non-Linux market). I don't think the P2 times will be all that great (even with hub execution, the P2's problem has always been the memory access bottleneck).
I think more and simpler cores is probably a better tradeoff, especially if you get the power down (it really should be way less than 1W; a Raspberry Pi is 4W, but that includes SDRAM and an Ethernet/USB controller, and most big Cortex-M4 parts are around 200mW max per CPU -- yes, there are 2- and 3-core devices available now). Though I think for the application side of processors, at least a couple of COGs should be able to do hub execution.
So the real question is: those couple of people buying P1s in any quantity, what is it that they need? The P2 can't compete on price, power, or speed, but there may be niches out there for it. And there is always the education business, where a simple multi-processor could be a good teaching platform. However, even here I think Parallax dropped the ball. The P1 is far superior to the AVR, but 10 years later, which one owns the market? As has always been true, software sells hardware, and the P1 was always too complex for the beginner; heck, it's even too complex for some "professionals". When one company introduced a 3-CPU-core chip, I had one of their marketing guys say "why do you need that?" I kind of just shook my head.
Originally Posted by Peter Jakacki
I think I favor this mix better as years ago I thought that if the Prop had an ARM + cogs that the ARM would make a good processor for applications while the cogs would handle all the realtime stuff. Now the enhanced P2 cog or two would make a very nice applications processor with the sprinkle of P1 cogs dutifully performing their realtime tasks. Is this possible?
It IS possible, though it's a little messy to mix such different things. It would be nice to keep the next chip homogeneous.
LOL. All I got with this suggestion several years ago was a blank stare and several seconds of awkward silence.
Chip, thanks for at least entertaining the idea of a bigger Propeller1 in the interim. Just don't go overboard!
Yes, nice post. Agree. After this storm (4 days and more than 500 messages) I think that we can add little to solve this dilemma. We hope and trust that the good judgement of Ken and Chip will bring some magical solution.
Don't get me wrong, P1B would be an interesting chip for many uses - it is just not the promised P2.
I know bean counters. If the P1B is made first, we will NEVER see the P2.
Even to suggest that Parallax is dominated by "bean counters" -- a pejorative term -- is beyond the pale. Whatever chip they ultimately market has to have broad market acceptance and make a profit, or there's no point in pursuing it. You guys seem to think that Parallax owes you something just because you bought expensive FPGA boards and spent time writing P2 software for all the pet features you suggested be loaded onto the P2's back. Well, they don't. That risk was entirely yours, and if you're crying over wasted time, too freaking bad.
The P2 trajectory has hit a pretty major speed bump, and those who don't accept that are in denial. I think the P1+ discussion is a healthy one to be having at this time, even if nothing comes of it. At the very least, it's a mind cleanser -- albeit not as good as a month on the beach in Mexico. We can't undo the past. The money and time that've been spent are gone. But the experience and lessons from that effort still have enduring value. The starting point to focus on is now; not yesterday, not last week, not two years ago. If that means a P1+ soon instead of a P2 who knows when, fine. If, somehow, the P2 can be resurrected in marketable form, that's fine, too. Whatever the case, we'll survive -- even thrive. It's important that Parallax does, too.
Re Post #551 - From very early on, it was known that the P2 would be significantly more power hungry than the P1 due to inherent leakage in the manufacturing process involved. Smaller geometries are even leakier and have higher overall power floors even though the power per gate requirement is lower. Power switching helps, but the P2, even if it were functionally similar to the P1, would not be able to compete on idle or wait state power consumption.
But what do Parallax's current high-volume customers think about a $12 5W BGA IC (or a $12 3W BGA IC)?
I would hope that Parallax has discussed this with their current high volume customers. They are the ones that pay the bills.
I still believe the best approach is to continue with the original plan, but then follow that with a 65nm version later on. 5 Watts is not ideal, but it can be managed. 15 years ago I worked on a product that had 4 Philips Trimedia TM1000 chips in it. Each one consumed 4 Watts. The product did not use fans, but used convection cooling only. It would get very warm on top, and we joked that there should be cup holders on top so that our customers could use it to keep their coffee warm.
I think there would be many applications for a 4-Watt P2, or even a 2-Watt P2 at half the clock rate or using only half the cogs. The high volume developers would find many uses for such a chip. Then a year later when the 65nm version comes out the sales for the P2 would increase dramatically.
The alternative is to go directly with the 65nm version. However, I think that's a bit too risky for Parallax at this time.
I like your post. I don't want to speak for you, but it appears that in your opinion energy consumption is key and is already too high, despite the best applied power reduction strategies.
For the full bore P2 coming in at around 3W, I think a more useful real world test would be to load an uncompressed bmp over a serial line, rotate it and save it to SDRAM. I have no idea what the comparison would show, but I think this a better real world example.
Bill et al (including me) favor a variant before the P2/3, so long as it doesn't kill the P2/P3. To me, the conversation should stick to power reduction, memory enhancement, pin count, and cog bandwidth…. The less P2-like the variant is… the better the argument to continue the main development of the P2/3.
I would be curious how much Hub/Cog RAM we could have… with 8 (2-clocks-per-instruction) cogs, double the pin count, and the absolute minimum power consumption, running at 160MHz.
Even to suggest that Parallax is dominated by "bean counters" -- a pejorative term -- is beyond the pale. Whatever chip they ultimately market has to have broad market acceptance and make a profit, or there's no point in pursuing it. You guys seem to think that Parallax owes you something just because you bought expensive FPGA boards and spent time writing P2 software for all the pet features you suggested be loaded onto the P2's back. Well, they don't. That risk was entirely yours, and if you're crying over wasted time, too freaking bad.
The P2 trajectory has hit a pretty major speed bump, and those who don't accept that are in denail. I think the P1+ discussion is a healthy one to be having at this time, even if nothing comes of it. At the very least, it's a mind cleanser -- albeit not as good as a month on the beach in Mexico. We can't undo the past. The money and time that've been spent are gone. But the experience and lessons from that effort still have enduring value. The starting point to focus on is now; not yesterday, not last week, not two years ago. If that means a P1+ soon instead of a P2 who knows when, fine. If, somehow, the P2 can be resurrected in marketable form, that's fine, too. Whatever the case, we'll survive -- even thrive. It's important that Parallax does, too.
-Phil
Well said Phil, my thoughts exactly.
Let's hope some form of Phoenix can rise from the flames.
We all just need to keep developing for P1 in the meantime.....
Correct me if I am wrong… but Bill never complained about all of the work he has done on the P2… or the P1 for that matter.
I agree that the reference to "bean counters" was a little off the track, but the folks at Parallax have fairly thick skin and I think they can easily understand what Bill was trying to say.
I think a full discussion about the consequences of choosing among the various tracks is a good thing. No consensus can be achieved, but this isn't about building a consensus… it is about speaking to the sensibilities.
I think Phil could have been more careful as well.
Any real test application which handles a lot of memory would be a better test, and closer to a real world application. Rotating an image would be OK, except it would not highlight some of the computational deficiencies of the P2. ARM M4 and 9-11 all have hardware multiply (good thing for a general purpose COG, but wasted on a COG doing just peripheral stuff), and also floating point hardware and divide assist. And video means pictures and most pictures are JPG.
Energy is definitely key, and vendors have been jumping through hoops to get both active and standby power down. Hence the old benchmarks have been replaced with CoreMark, which tries to measure how much computing bang you get per watt. Most embedded ARMs go into sleep modes consuming less than 1uA, but can wake up quickly, get the job done, then go back to sleep.
For anyone feeling battle fatigue who can't take a month off in Mexico:
There is a vitamin preparation, marketed for the eye, called Ocuvite. The history is that originally this preparation was assaulted by the official ophthalmology establishment as being horse feathers. Then the controlled studies came back, and it was true. The stuff fights the retinal effects of overuse. In many cases it will stop or retard the onset of macular degeneration. It does this by supplying the nutrients that an active eye runs out of first. The eye is part of the brain. No studies have been done on that, but this might be the healthiest thing you can give your tired brains.
I agree that P2 has to be built ASAP. A wealth of ideas has gone into it and it is the prime platform. Today, the engineer at OnSemi is going to check power dissipation for the case of S and D not toggling. This will tell us right away what power level we could approach with lots of flop gating into the ALU sections.
About the Prop1 ideas: I was thinking that hub exec could be realized by having no more than a single cache line that would afford 100% execution speed in a straight line. Also, this would allow the dual-port cog RAM to be cut in half, from 0.292 sq mm to ~0.15 sq mm, since we only need it for deterministic code and variables. Add in the ~0.1 sq mm of cog logic and the cog area drops from 0.4 sq mm to 0.25 sq mm - a ~38% shrink, while affording execution from hub. This is all at 180nm.
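Running the numbers in that paragraph (a quick sketch; the sq-mm figures are Chip's stated approximations, and the roundings are his):

```python
# Working through the cog-area arithmetic quoted above
# (all figures in sq mm at 180nm, taken from the post).

dual_port_ram = 0.292           # current dual-port cog RAM
halved_ram = dual_port_ram / 2  # ~0.146, quoted as ~0.15
cog_logic = 0.1                 # approximate cog logic
current_cog = 0.4               # approximate current total cog area

new_cog = 0.15 + cog_logic      # ~0.25, using the post's rounded RAM figure
shrink_pct = (current_cog - new_cog) / current_cog * 100

print(round(halved_ram, 3), round(new_cog, 2), round(shrink_pct, 1))
# ~0.146 sq mm RAM, ~0.25 sq mm per cog, ~37.5% smaller (the "~38% shrink")
```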
When I went in to look at the old Prop1 code, I was blown away how there is almost nothing there. It doesn't have the creature comforts of the Prop2, but it is as lean as can be. It's so compact that you can just add more of them, rather than inflate them.
A moderator resorting to ad hominem attacks instead of technical arguments.
"bean counter", a common term, is beyond the pale, but "you're crying over wasted time, too freaking bad" is OK?
(and I was not worried about wasted time, but about ParallaxSemiconductor losing credibility - you know, face?)
I wonder who moderates the moderators.
1) the "pretty major speed bump" is still not proven.
2) 1W-2W "typical power usage" is what was projected, and is compatible even with a 5W maximum
3) Chip has already started thinking about power management.
4) reducing clock speed helps a lot without crippling the architecture.
5) I was referring to *CORPORATE BUYERS* who for YEARS have been promised a feature set on ParallaxSemiconductor.com, not the developers on the forum, which naysayers now want to kill off.
6) you have always opposed the addition of features.
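On point 4: to first order, CMOS dynamic (switching) power scales linearly with clock frequency, so throttling the clock buys power back almost proportionally; leakage does not scale this way. A toy sketch with hypothetical numbers (the 5W worst case is itself unproven, per point 1):

```python
# First-order CMOS dynamic power: P ~ C * V^2 * f. At fixed voltage, the
# switching component scales linearly with frequency. Leakage does NOT,
# so this models only the dynamic part. All numbers are hypothetical.

def dynamic_power_w(p_at_full_w: float, f_mhz: float, full_mhz: float = 160) -> float:
    """Scale a full-clock dynamic-power figure linearly with frequency."""
    return p_at_full_w * (f_mhz / full_mhz)

# Even if the worst case really were 5W of switching power at 160MHz:
for f in (160, 120, 80, 40):
    print(f"{f:3d} MHz -> {dynamic_power_w(5.0, f):.2f} W")
```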
but "you're crying over wasted time, too freaking bad" is OK?
'Sorry if my post offended or seemed over the top. Despite its context following a quote, it wasn't aimed at any one individual. It's just that the process undertaken months ago to solicit and incorporate more and more features from forum requests seemed to take an eye off of a more cohesive vision for the P2, and that the current turn of events is the consequence of that. I'm an ardent adherent of the following mantra by Antoine de Saint-Exupery:
La perfection est atteinte, non pas quand il n'y a plus rien à ajouter, mais quand il n'y a plus rien à retrancher.
(Perfection is attained, not when there is nothing more to add, but when there is nothing left to take away.)
Your post was way over the top, and mischaracterized what I said.
I know you have been frustrated by the additions, however we have different points of view.
I have been frustrated by all the attempts at crippling the new features and the "sky is falling" attitude.
1) the additions happened due to failed shuttle runs, and needing to remove the DAC bus, freeing logic
2) the additions make P2 FAR more capable, for more markets
3) keeping to "simplicity" and "purity" at the expense of significant performance and capability enhancements, the removal of which will decrease markets P2 will be appropriate for, is ill advised
I do request that if you disagree with one of my technical arguments, you respond with a technical argument, with supporting data and calculations.
See my P1B/P2 analyses - I provide backup for my arguments.
One thing both of us agree on:
We want Parallax, and the P2 (and later P1B) to succeed.
The reason I chose my example is that we already have a camera (http://www.parallax.com/product/28320), and it doesn't produce JPG.
Right now the easiest way to link that camera up to a P2 would be a serial line.
If you combine this camera with the computational strengths of the P2, then you can give the ELEV-8 (http://www.parallax.com/product/80000) a horizon to look at. This would be one way to null out some of the characteristics of MEMS sensors.
I was under the impression that the P1 was made using different software and that making a new version of it was off the table. But now it seems it is possible, apparently almost even easy to do so. If that is the case, that a new P1 can be made with more cogs, more IO, more memory, more MHz, and with ADCs, and can use existing objects, and can be available soon, and is power efficient, and, and... Wow. Do it. Do it now! (said in Schwarzenegger's voice)
I don't see why that would hurt the P2 development.
I do request that if you disagree with one of my technical arguments, you respond with a technical argument, with supporting data and calculations.
I would if I thought we were both qualified to make such arguments, considering the dearth of hard data available to support either position. I know I'm not. Even if you're right -- and I'm not saying you aren't -- it still seems like a good time to step back, take a breath, and consider other alternatives, even if they don't materialize.
I've worked with SXes and NMOS Z8s in industrial environments. Both got hot, and the thermal considerations were unpleasant to deal with. When I see figures like 5W -- or even 3W -- I think, "Oh, no, here we go again!" But maybe I've just gotten spoiled by the cool-running P1.
Signal Capture / Generation Limits
P1B: 100MHz, interleaving 5 cogs, to write/read 32 bits @ 100MHz to/from hub.
P2: 160MHz, one cog
Winner: P2 by 8:1 (5*1.6)
SDRAM
P1B: has to be bit banged, best guess 50MB/sec, 20 pins left after 16 bit SDRAM
P2: 320MB/sec with 16 bit interface (640MB/sec with 32 bit interface), 48 user pins left after 16 bit SDRAM
Winner: P2 by 12.8:1
LMM vs HUBEXEC
P1B Cog: 50MIPS
P2 Cog: 160MIPS
Winner: P2 by 3.2:1
Pins
P1B: 64 pins, less flash + serial, so 58 user pins
P2: 92 pins, less flash + serial, so 86 user pins
Winner: P2 by 28 pins
Power consumption
Roughly the same
And much much more.
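For completeness, the remaining ratios in the list check out the same way (a quick sketch, using only the figures as quoted in the post, including the ~50MB/s bit-banged SDRAM best guess):

```python
# Signal capture: 5 interleaved P1B cogs @ 100MHz vs one P2 cog @ 160MHz
print(5 * 160 / 100)             # 8.0 -> the quoted 8:1 (5 * 1.6)

# SDRAM: 640 MB/s (P2, 32-bit interface) vs ~50 MB/s bit-banged on P1B
print(640 / 50)                  # 12.8 -> the quoted 12.8:1

# LMM vs hubexec: 160 MIPS (P2 hubexec) vs 50 MIPS (P1B LMM cog)
print(160 / 50)                  # 3.2 -> a 3.2:1 ratio

# User pins: 64 - 6 (flash + serial) vs 92 - 6
print(64 - 6, 92 - 6, (92 - 6) - (64 - 6))   # 58 86 28
```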
P1B should be made AFTER P2 as an I/O expander for P2, and for lower end applications.
I know bean counters. If the P1B is made first, we will NEVER see the P2.
Wow, what a turn of events.
I continue to be a strong believer in the P1, but with the proposed changes, will the power consumption still be as low as before? I.e., 5-ish microamps when all cogs are in WAIT mode? This is really important to me for multi-year D-cell battery applications.
And secondly, would it be possible and reasonable to add indirect or relative addressing ? I really miss that in the current design.... it makes the code so much less effective.
Keep up the good work.
Cheers,
Peter (pjv)
Just back from EElive, (the old Embedded Systems Conference), and I can tell you a 5W microcontroller would get laughed off the floor. Power is a big issue, not only because of the battery operated applications, but these days many products have more than 1 micro inside, and if your 1 micro is hogging 5W, it will get replaced. I've never thought Parallax could reasonably compete in the general marketplace for micros, as they were starting from the disadvantage of using process technology that is 10 years old. Yes some of the small 8bit and ARM chips are using 150 nm, but those are the <1$ chips, which are pad limited anyway. the other selling point of those devices is power at less than 50uA/MHz. Even if you by the aggregate MIPs of a P2 (and I don't), that works out to 1500uA/MHz.
The reason I don't buy the 1600MIPs, is the design idea of the P2, is that you trade off programmable cores, for peripherals, so that most of the P2 COGs are dedicated to peripheral tasks. The main application program gets handled by a couple of the COGs. If you really think the P2 can compete, I'd suggest trying a real world example benchmark. As the P2 does video, I would suggest comparing the decompressing of a JPEG image and displaying that. And see how the FPGA implementation stands up vs. and ARM11 (BeagleBone market) or a CortexM4 (non-Linux market). I don't think the P2 times will be all that great (even with hub-execution, the P2's problem has always been the memory access bottleneck).
I think more and simpler cores is probably a better tradeoff, especially if you get the power down (it really should be way less than 1W, a RasbPi is 4W but that includes SDRAM and Ethernet/USB controller, and most big CortexM4 parts are around 200 mW max per CPU -- yes there are 2 and 3 core devices available now). Though I think for the application side of processors, at least a couple COGs should be able to do hub-execution.
So the real answer lies with those couple of people buying P1s in any quantity: what is it that they need? The P2 can't compete on price, power or speed, but there may be niches out there for it. And there is always the education business, where a simple multi-processor chip could be a good teaching platform. However, even here I think Parallax dropped the ball. The P1 is far superior to the AVR, but 10 years later, which one owns the market? As has always been true, software sells hardware, and the P1 was always too complex for the beginner. Heck, it's even too complex for some "professionals": when one company introduced a 3-core chip, I had one of their marketing guys ask, "Why do you need that?" I kind of just shook my head.
Nice post.
Chip, thanks for at least entertaining the idea of a bigger Propeller1 in the interim. Just don't go overboard!
Yes, nice post. Agree. After this storm (4 days and more than 500 messages) I think that we can add little to solve this dilemma. We hope and trust that the good judgement of Ken and Chip will bring some magical solution.
Even to suggest that Parallax is dominated by "bean counters" -- a pejorative term -- is beyond the pale. Whatever chip they ultimately market has to have broad market acceptance and make a profit, or there's no point in pursuing it. You guys seem to think that Parallax owes you something just because you bought expensive FPGA boards and spent time writing P2 software for all the pet features you suggested be loaded onto the P2's back. Well, they don't. That risk was entirely yours, and if you're crying over wasted time, too freaking bad.
The P2 trajectory has hit a pretty major speed bump, and those who don't accept that are in denial. I think the P1+ discussion is a healthy one to be having at this time, even if nothing comes of it. At the very least, it's a mind cleanser -- albeit not as good as a month on the beach in Mexico. We can't undo the past. The money and time that've been spent are gone. But the experience and lessons from that effort still have enduring value. The starting point to focus on is now; not yesterday, not last week, not two years ago. If that means a P1+ soon instead of a P2 who knows when, fine. If, somehow, the P2 can be resurrected in marketable form, that's fine, too. Whatever the case, we'll survive -- even thrive. It's important that Parallax does, too.
-Phil
I still believe the best approach is to continue with the original plan, but then follow that with a 65nm version later on. 5 Watts is not ideal, but it can be managed. 15 years ago I worked on a product that had 4 Philips Trimedia TM1000 chips in it. Each one consumed 4 Watts. The product did not use fans, but used convection cooling only. It would get very warm on top, and we joked that there should be cup holders on top so that our customers could use it to keep their coffee warm.
I think there would be many applications for a 4-Watt P2, or even a 2-Watt P2 at half the clock rate or using only half the cogs. The high volume developers would find many uses for such a chip. Then a year later when the 65nm version comes out the sales for the P2 would increase dramatically.
The alternative is to go directly with the 65nm version. However, I think that's a bit too risky for Parallax at this time.
I like your post. I don't want to speak for you, but it appears that in your opinion energy consumption is key and is already too high, despite the best applied power reduction strategies.
For the full-bore P2 coming in at around 3W, I think a more useful real-world test would be to load an uncompressed BMP over a serial line, rotate it, and save it to SDRAM. I have no idea what the comparison would show, but I think this is a better real-world example.
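For reference, the rotate step of that proposed benchmark is a small, easily ported kernel. Here is a minimal sketch in Python (the serial transfer and SDRAM write are omitted; `rotate90_cw` and the 3x2 test image are my own illustration, not anything from the thread):

```python
# Rotate a row-major grayscale image buffer 90 degrees clockwise.
# This is just the kernel of the proposed benchmark, not a full harness.
def rotate90_cw(pixels, width, height):
    """pixels is a flat sequence of width*height bytes, row-major."""
    out = bytearray(width * height)
    for y in range(height):
        for x in range(width):
            # (x, y) in the source maps to column (height - 1 - y) of row x
            # in the result, whose rows are `height` pixels wide.
            out[x * height + (height - 1 - y)] = pixels[y * width + x]
    return out

img = bytes(range(6))                 # 3x2 test image: rows [0,1,2] and [3,4,5]
print(list(rotate90_cw(img, 3, 2)))   # prints [3, 0, 4, 1, 5, 2]
```

The interesting part for a P2-vs-ARM comparison is exactly the memory-access pattern this exposes: the reads are sequential but the writes stride by `height`, which stresses whatever hub or SDRAM bandwidth the part has.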
Bill et al (including me) favor a variant before the P2/P3, so long as it doesn't kill the P2/P3. To me, the conversation should stick to power reduction, memory enhancement, pin count and cog bandwidth. The less P2-like the variant is, the better the argument to continue the main development of the P2/P3.
I would be curious how much hub/cog RAM we could have with 8 (2-clock-per-instruction) cogs, double the pin count, and the absolute minimum power consumption running at 160MHz.
Well said Phil, my thoughts exactly.
Let's hope some form of Phoenix can rise from the flames.
We all just need to keep developing for P1 in the meantime.....
I agree that the reference to "bean counters" was a little off the track, but the folks at Parallax have fairly thick skin and I think they can easily understand what Bill was trying to say.
I think a full discussion about the consequences of choosing among the various tracks is a good thing. No consensus can be achieved, but this isn't about building a consensus; it is about speaking to the sensibilities.
I think Phil could have been more careful as well.
Energy is definitely key, and vendors have been jumping through hoops to get both active and standby power down. Hence the old benchmarks have been replaced with CoreMark, which tries to measure how much computing bang you get per watt. Most embedded ARMs have sleep modes consuming less than 1uA, but can wake up quickly, get the job done, then go back to sleep.
There is a vitamin preparation marketed for the eye, called Ocuvit. Originally this preparation was assaulted by the official ophthalmology establishment as horse feathers. Then the controlled studies came back, and it held up: the stuff fights the retinal effects of overuse, and in many cases it will stop or retard the onset of macular degeneration. It does this by supplying the nutrients that an active eye runs out of first. The eye is part of the brain. No studies have been done, but this might be the healthiest thing you can give your tired brains.
About the Prop1 ideas: I was thinking that hub exec could be realized by having no more than a single cache line that would afford 100% execution speed in a straight line. Also, this would allow the dual-port cog RAM to be cut in half, from 0.292 sq mm to ~0.15 sq mm, since we only need it for deterministic code and variables. Add in the ~0.1 sq mm of cog logic and the cog area drops from 0.4 sq mm to 0.25 sq mm - a ~38% shrink, while affording execution from hub. This is all at 180nm.
When I went in to look at the old Prop1 code, I was blown away how there is almost nothing there. It doesn't have the creature comforts of the Prop2, but it is as lean as can be. It's so compact that you can just add more of them, rather than inflate them.
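Chip's area arithmetic above checks out; here is the same calculation spelled out (all figures taken from his post, at 180nm):

```python
# Verify the cog-area shrink figures quoted in the post (all in sq mm, 180nm).
old_cog_ram = 0.292   # full dual-port cog RAM
new_cog_ram = 0.15    # roughly halved, keeping only deterministic code + variables
cog_logic   = 0.10    # cog logic, unchanged

old_area = 0.40                      # quoted original cog area
new_area = new_cog_ram + cog_logic   # 0.25
shrink = (old_area - new_area) / old_area
print(f"new area {new_area:.2f} sq mm, shrink {shrink:.0%}")
# prints "new area 0.25 sq mm, shrink 38%"
```

Note the quoted 0.40 is slightly more than 0.292 + 0.10; the ~38% figure uses the rounded totals, which is why it comes out a hair over the raw RAM saving alone.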
A moderator resorting to ad hominem attacks instead of technical arguments.
"bean counter", a common term, is beyond the pale, but "you're crying over wasted time, too freaking bad" is OK?
(and I was not worried about wasted time, but about ParallaxSemiconductor losing credibility - you know, face?)
I wonder who moderates the moderators.
1) the "pretty major speed bump" is still not proven.
2) 1W-2W "typical power usage" is what was projected, and is compatible even with a 5W maximum
3) Chip has already started thinking about power management.
4) reducing clock speed helps a lot without crippling the architecture.
5) I was referring to *CORPORATE BUYERS* who for YEARS have been promised a feature set on ParallaxSemiconductor.com, not the developers on the forum, which naysayers now want to kill off.
6) you have always opposed the addition of features.
not when there is nothing left
I know you have been frustrated by the additions, however we have different points of view.
I have been frustrated by all the attempts at crippling the new features and the "sky is falling" attitude.
1) the additions happened due to failed shuttle runs, and needing to remove the DAC bus, freeing logic
2) the additions make P2 FAR more capable, for more markets
3) keeping to "simplicity" and "purity" at the expense of significant performance and capability enhancements (the removal of which will shrink the markets the P2 is appropriate for) is ill-advised
I do request that if you disagree with one of my technical arguments, you respond with a technical argument, with supporting data and calculations.
See my P1B/P2 analyses - I provide backup for my arguments.
One thing both of us agree on:
We want Parallax, and the P2 (and later P1B) to succeed.
The reason I chose my example is that we already have a camera (http://www.parallax.com/product/28320), and it doesn't produce JPEG.
Right now the easiest way to link that camera up to a P2 would be a serial line.
If you combine this camera with the computational strengths of the P2, then you can give the ELEV-8 (http://www.parallax.com/product/80000) a horizon to look at. This would be one way to null out some of the characteristics of MEMS sensors.
I don't see why that would hurt the P2 development.
I've worked with SXes and NMOS Z8s in industrial environments. Both got hot, and the thermal considerations were unpleasant to deal with. When I see figures like 5W -- or even 3W -- I think, "Oh, no, here we go again!" But maybe I've just gotten spoiled by the cool-running P1.
-Phil