@Rick if the past is any indicator the forum could have 100 new board designs for you pretty quick. ;-)
Quite true but I want every board designer to open up their board designs to the forum so everyone can decide what features their boards MUST have to be successful. It's only fair, isn't it?? We can tell them what will be easy and what they can afford and what their target markets should be.
No problem ;-) Let's say we all work that out the day it goes to fab?
One possible way would be for us to buy chip(s) from the shuttle run and pay up front. Whether they work or not, it's our problem and not Parallax's.
Perhaps $100 / chip ??? I know this does not cover the shuttle run, but if Parallax sold 100 then that is $10K. Might make for a decent quantity to be made on the first shuttle, and if they work we have working chips early. Obviously we need Ken's input, but this is a starting point.
I think this could work, but I would also suggest that for each $100 you invest (i.e. each shuttle run chip you buy), you get an additional $1 discount on up to 100 production run chips (or some other formula that Parallax comes up with - which might depend on volume - provided it comes out at a cost-neutral $100).
This makes it attractive not only for enthusiasts, but also for anyone intending to use the chip for commercial applications - since for the latter it also removes the risk that the initial shuttle run has some problem that makes the chips unusable.
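To make the "cost-neutral" arithmetic concrete, here is a rough sketch of one possible reading of that offer: each shuttle chip bought adds $1 of discount per production chip, on up to 100 production chips. The function name, its parameters and the $12 production price are hypothetical and used only for illustration; Parallax has not proposed any such formula.

```python
def shuttle_offer(shuttle_chips, production_chips, production_price,
                  shuttle_price=100.0, discount_per_chip=1.0, max_discounted=100):
    """Return (net_outlay, rebate) under one reading of the proposal.

    Assumption: the $1 discounts stack per shuttle chip bought, and apply
    to at most `max_discounted` production chips.
    """
    # The per-chip discount can never exceed the production price itself.
    per_chip_discount = min(production_price, shuttle_chips * discount_per_chip)
    discounted_chips = min(production_chips, max_discounted)
    rebate = per_chip_discount * discounted_chips
    net_outlay = (shuttle_chips * shuttle_price
                  + production_chips * production_price
                  - rebate)
    return net_outlay, rebate


# One shuttle chip, then 100 production chips at a hypothetical $12 each.
net, rebate = shuttle_offer(shuttle_chips=1, production_chips=100,
                            production_price=12.0)
print(f"net outlay: ${net:.2f}, rebate earned back: ${rebate:.2f}")
```

Under this reading, the $100 paid for a shuttle chip is only fully recovered by a buyer who goes on to purchase the full 100 production chips, which is why the offer mostly benefits commercial users.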
Hi All
I already have a HUGE investment in P2 so I think it's going to be of no surprise that I vote for P2.
I do use P1's in 99% of my projects though, so an updated P1 would be great too, as long as
it DOES NOT kill off P2 and further development!
Here's my 2 cents worth.
Fab a 180nm P8X32B in Chip's 2-clock/instruction design with 8 Cogs @ 200+MHz.
Add the extra 32 I/O pins (Port B).
NO changes to the P1 instruction set at all.
Fill rest of chip with HUB ram (512K,1M or 2M?)
Now package it in the existing P1 packages (DIP,QFN) with no PortB connected
and also package it in a new SMT package for the full 64 I/O capabilities.
This gives everyone their extra I/O, HUB and speed with no radical changes to the core.
Existing OBEX and docs, as stated in an earlier post, remain the same.
The other bonus in using P1 style packaging and pinouts is that existing designs can
easily update to a better spec chip. All Parallax's P1 Eval/demo and educational boards
don't need redesigning. Just load em up with the new chips.
This also keeps the hobbyists in the loop with the DIP package.
Hi Brian,
I understand you would vote Yes for the P2, but I'm having trouble figuring out if this is a Yes or a No for the PnnX32! The basis of this thread is that the PnnX32 would not derail the P2.
Hmmm... yes... but... 16/32cores? 512KB/1MB of hub ram? Seriously?? :frown:
If a simplified product has to be produced, I only see two possible scenarios:
1. an evolution of the original P1, eventually going to replace it, adding the things that were left out from the start, and possibly a few modifications dictated by things that happened later (LMM and compiled C mainly).
2. a REAL "intermediate" chip, equally distant from both P1 and P2 to be worth keeping in production even after the P2 is made.
While other manufacturers can afford to offer a nearly continuous scaling of features, with a next generation chip already laid out, and after the demise of the SX line, I think that the segment to cover ought to be "smaller"/"less pins".
How is pulling in too many P2 features going to work when the real P2 is out?
So if I were in Chip's shoes, and if, struck by lightning, I acquired some of his abilities, this would be my plan for a P1B variant:
No analog pins. We don't want to step on the P2's toes, do we? But maybe pullups would be useful, saving a bunch of external resistors.
No externally clocked deserializer. But the capability to shift in reliably at instruction rate would be nice. Think about SD cards: we mostly gave up 20/25 Mbps just to be on the (spi_)safe side! Also for chip-to-chip links (with the same clock), think of Beau's high speed serial, but at double rate.
For this intermediate generation at least, should we really need those pesky and tricky protocols that the rest of the world considers a must-have, we could just avoid being shy and do like most of them (XMOS, Microchip, etc.), and use a ULPI or (R)MII external PHY chip?
Absolutely none of those "look ma, BCD in one clock!" kind of additions. IMHO the preferred way to save on code size, is adding one more subroutine in 99% of the real world use cases. The remaining 1% can be solved by writing a "sweet64" interpreter in 300 longs.
Same number of COGs (8), or 16 only if it doesn't break other things. Two clocks per instruction is great.
Fastest possible clock rate (but eventually, publicly pretend it tops out at half of that)
64KB hub ram (!), and just a small rom, the upper half of ram being loaded with P1 compatible rom image, which can be overwritten if required.
Port B implemented, but only internally. A new instruction to wait for a write by another cog without wasting one bit as strobe, to allow full 32-bit COG-to-(multi)COG transfers.
Hey wait a minute... internal?!? Most people lamented that they end 1 or 2 pins short most of the time! (me too!!!)
Well, I'd solve this by downgrading the reference VGA mode to 4bpp+sync.
At this point, a 4bpp video mode is in order. No palette required, we only need to drive RGBI or grayscale DAC using 4 pins. Maybe the palette argument could be repurposed to represent order of nibbles, but it's not really necessary.
Here's our missing 2 pins!
And also:
The four instructions not implemented in P1.
Single instructions to save/load C and Z (can't remember which one required two ops).
Largely unchanged programming model and register layout, except some registers get a dual function depending on cog "mode" (sorry heater, I just said *MODE*).
When cog is used to execute from HUB:
Ability to clock the counters from hub read/writes, scaled by size. So that PHSA and PHSB can be used as indexes for hub access. Pre-decrement or post-increment.
We already had tricks to do block transfers at hub window rate, using counters, but I'm not sure if that would work right with 2-clock instructions. In case it doesn't, block move instructions from/to hub would be nice (or REP prefix maybe?).
Some kind of "range" registers, to catch code jumping outside a single line of L1-code cache. Much like fcache works already, except this would trigger an interr... (OMG! Did I just say "INTERRUPT"?!) to handle reload.
m68k-like link/unlink acting on indexes, to implement a software-controlled L1 data cache, but limited to the current subroutine's parameters and locals. Absolute memory references go through normal hub access.
Whacky enough, uh?
*quickly runs and takes cover*
P.S.: seriously, I hope nobody takes offence. I had a similar gut reaction when some mods were proposed for P2, but I admit that later, when the dust settled, I was pleasantly surprised by the way things were integrated into the big picture. P2(3?) is being tuned, and I hope contributions from everyone can be retained.
I didn't want to give that variant an official designation until Parallax did. In particular, what does the trailing "A" or "B" mean?
Ross.
Chip used the designation P16X32B for the 16 cog version proposed. The "B" is the second Parallax chip. So I presume the P2 will become P8X32C in this scenario.
Fab a 180nm P8X32B in Chip's 2-clock/instruction design with 8 Cogs @ 200+MHz.
Add the extra 32 I/O pins (Port B).
NO changes to the P1 instruction set at all.
Fill rest of chip with HUB ram (512K,1M or 2M?)
Without 16 cogs, I would vote no. Not enough improvement now.
I don't expect >512KB hub as it uses ~25mm2 and we only had 37mm2 for the P2 die (+I/O ring?)
32 x P1 cogs ~12mm2
(info from Chip's post)
Recalculating, perhaps 16 cogs and 768KB might fit.
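The area budget above can be checked with a few lines of arithmetic. This is only a back-of-envelope sketch using the figures quoted in the post (about 25 mm2 for 512KB of hub RAM, about 12 mm2 for 32 P1 cogs, 37 mm2 for the P2 die). It assumes area scales linearly with RAM size and cog count, and it ignores the I/O ring and routing, which the post itself flags as uncertain. The function names are made up for illustration.

```python
# Figures quoted in the post; everything else here is an assumption.
DIE_AREA_MM2 = 37.0                  # P2 die area (I/O ring unclear)
HUB_AREA_MM2_PER_KB = 25.0 / 512     # ~25 mm2 for 512 KB of hub RAM
COG_AREA_MM2 = 12.0 / 32             # ~12 mm2 for 32 P1 cogs


def max_hub_kb(cogs, die_area_mm2=DIE_AREA_MM2):
    """Hub RAM (in KB) that fits in the area left over after the cogs."""
    remaining = die_area_mm2 - cogs * COG_AREA_MM2
    return max(0.0, remaining / HUB_AREA_MM2_PER_KB)


def fits(cogs, hub_kb, die_area_mm2=DIE_AREA_MM2):
    """True if the cogs plus hub RAM fit within the die area budget."""
    used = cogs * COG_AREA_MM2 + hub_kb * HUB_AREA_MM2_PER_KB
    return used <= die_area_mm2


for cogs in (8, 16, 32):
    print(f"{cogs:2d} cogs -> up to {max_hub_kb(cogs):.0f} KB hub RAM")
```

With these numbers taken at face value, 32 cogs plus 512KB exactly fills 37 mm2, while 16 cogs leaves room for roughly 635KB. The 768KB figure would need a somewhat larger budget, so it depends on whether the 37 mm2 includes the I/O ring.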
I am with you on the argument to keep it as close as doable to the P1.
16 cogs at 2 clocks/instruction will be very close to the existing P1 with 8 cogs at 4 clocks/instruction. Good for existing software.
And Port B should be implemented as planned years ago. Just a question of packaging. Even DIP is possible again, with Port B NOT accessible from the outside: a direct replacement, but faster, with 2x the cogs and more RAM.
Same chip in other packaging with more pins can expose port B...
Thanks Ray,
512KB(or 768?) HUB + 16 Cogs + 64I/O (32 in DIP version)
I can live with that!
Represents incremental improvement on an existing design, an approach that is business-wise and technically feasible (and parts of which have been customer-requested features). I'm not too comfortable putting Chip under pressure with a KickStarter effort for 65nm due to remaining manual layout, serdes design, integrated testing of core with I/O pin which remain big unknowns in my view. Each would take "two months" to complete.
The unit costs identified by Chip don't have relevance to us unless considered together with projected volumes and ROI expectations. Everything costs more than we expect.
Fair point. Those of us who do this as a hobby are not particularly price sensitive. Nor, I suspect, is Chip!
Kickstarter is not just financing. Getting quick-easy-huge money is just 50% of it. The other 50% is about marketing, idea development, customers, ...
You can get free advertisement and reach new, unknown customers. If you do it in a smart way, you can create demand for your product. You will also learn things from customers that your competitors probably don't know. You will have real numbers to predict sales, price-demand, etc.
It is low-hanging fruit. And will pave the road for future Px.
I vote a resounding YES, and preference for the P32X32B (32 P1 cogs with 512KB).
...
BTW I agree with Bill's 32 hub slot mechanism - easy to describe, fairly simple to implement (I think).
+1 with £££
And with (your?) suggestion of a simple register mechanism to pass 32-bit values to/from the COG on either side.
+1 Yes with £££ should they go the funding route (personally I don't see how using Kickstarter will help it come quicker; it would take months to build enough awareness to reach a huge figure. Plus, the 180nm process is absolutely fine for the PxxX32B).
Cog count and HUBRAM size to me is whatever Chip & Ken decide is more viable for them to do.
Unlike P2 we need to sit back and let Chip and Ken sort this out, as they know their market more than anyone here!
However the PxxX32B turns out, it will be awesome and will totally outperform the pants off the P8X32A, and we will all be amazed by what it can do!
HUBRAM size will be awesome at 256KB or 512KB or whatever. It won't matter: if people want more, we'll have 64 pins, so we will be able to add more memory if needed!
Imagine a SPIN/C/C++ IDE with a 'Peripheral Wizard'.
You want a USART so you click on the wizard and select the USART option. You then have 3 choices...
1) A single USART in a COG with 50ns baud rate resolution
2) A dual USART in a COG with 1us resolution
3) A quad USART which supports all standard baud rates between 75 and 115,200
You select the one you want and the compiler does the rest, dropping the necessary skeleton access functions into your code.
Ditto with SPI/Dual SPI/Quad SPI/I2C/ADC/DAC/Timer/PWM/.....
Now target that at a chip with 32 peripheral blocks and flexible pin routing. You want 64 USARTs for some strange multiport line concentrator? No problem.
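To show why the resolution figures in those wizard options matter, here is a small sketch that turns bit-time resolution into baud-rate error. It assumes "resolution" means the granularity of the bit period a software USART can generate, which is only one reading of the post. The function name is hypothetical and does not correspond to any Parallax tool.

```python
def baud_error_percent(target_baud, resolution_ns):
    """Percent error when the bit time is rounded to the nearest tick."""
    ideal_bit_ns = 1e9 / target_baud
    # A bit must last at least one tick of the timing resolution.
    ticks = max(1, round(ideal_bit_ns / resolution_ns))
    actual_baud = 1e9 / (ticks * resolution_ns)
    return 100.0 * (actual_baud - target_baud) / target_baud


for resolution_ns in (50, 1000):
    err = baud_error_percent(115200, resolution_ns)
    print(f"{resolution_ns:5d} ns resolution at 115200 baud: {err:+.2f}% error")
```

On this reading, 50 ns resolution keeps 115,200 baud within about 0.2% of the target, while 1 us resolution is off by about 3.5%. That would be one reason a multi-USART cog might be offered only for lower or specific standard rates.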
And with (your?) suggestion of a simple register mechanism to pass 32-bit values to/from the COG on either side.
I think it was Cluso who suggested a "ring" configuration for cogs to pass data to the cogs on either side. My suggestion was for a "common bus" architecture - I had in mind the kind of "multiple access with collision detection" communication paradigm popularized by ethernet (showing my age, perhaps).
I loved the idea of Prop2 and have several designs for which I thought it would have been perfect. It has/had two problems- the first is it doesn't exist and the second is power. The prop1 is flexible enough to do battery operated applications on up. I (wrongly) figured the Prop2 would be a big step up in features with a small step up in power. It's more than a big step up in features and also more than a big step up in power. Currently most of my designs are battery powered applications for which the current Prop2 is not useable. I'd love to see something in between. I can afford a bit more power but not what the P2 currently consumes. And I think to compare favorably in the market place P2 power numbers are too high.
What's the expected power consumption of this new proposal? Seems there's lots of support for this but I haven't seen power numbers. I think some caution is in order. What stops this design being power hungry like the P2 design?
Cog count and HUBRAM size to me is whatever Chip & Ken decide is more viable for them to do.
Unlike P2 we need to sit back and let Chip and Ken sort this out, as they know their market more than anyone here!
^^^ THIS!
@Ross
it's a "yes", meaning that I would support whatever initiative Parallax thinks is appropriate at this time.
However I don't know what my vote is worth, given that as a hobby consumer, I'm way behind humanoido in P8X32A population
What's the expected power consumption of this new proposal? Seems there's lots of support for this but I haven't seen power numbers. I think some caution is in order. What stops this design being power hungry like the P2 design?
Well, obviously we don't know yet - but we do know that the P1 only requires 1/16th the logic elements of the P2, so even a "super" P1 is unlikely to require more than 1/4 or maybe 1/3 the power of a P2. If this turns out not to be the case, I would assume Parallax would not proceed.
Comments
I didn't want to give that variant an official designation until Parallax did. In particular, what does the trailing "A" or "B" mean?
Ross.
Quite true but I want every board designer to open up their board designs to the forum so everyone can decide what features their boards MUST have to be successful. It's only fair, isn't it?? We can tell them what will be easy and what they can afford and what their target markets should be.
I think this could work, but I would also suggest that for each $100 you invest (i.e. each shuttle run chip you buy), you get an additional $1 discount on up to 100 production run chips (or some other formula that Parallax comes up with - which might depend on volume - provided it comes out at a cost-neutral $100).
This makes it attractive not only for enthusiasts, but also for anyone intending to use the chip for commercial applications - since for the latter it also removes the risk that the initial shuttle run has some problem that makes the chips unusable.
Ross.
I already have a HUGE investment in P2 so I think it's going to be of no surprise that I vote for P2.
I do use P1's in 99% of my projects though, so an updated P1 would be great too, as long as
it DOES NOT kill off P2 and further development!
Here's my 2 cents worth.
Fab a 180nm P8X32B in Chip's 2-clock/instruction design with 8 Cogs @ 200+MHz.
Add the extra 32 I/O pins (Port B).
NO changes to the P1 instruction set at all.
Fill rest of chip with HUB ram (512K,1M or 2M?)
Now package it in the existing P1 packages (DIP,QFN) with no PortB connected
and also package it in a new SMT package for the full 64 I/O capabilities.
This gives everyone their extra I/O, HUB and speed with no radical changes to the core.
Existing OBEX and docs, as stated in an earlier post, remain the same.
The other bonus in using P1 style packaging and pinouts is that existing designs can
easily update to a better spec chip. All Parallax's P1 Eval/demo and educational boards
don't need redesigning. Just load em up with the new chips.
This also keeps the hobbyists in the loop with the DIP package.
My 2c FWIW....Brian
Hi Brian,
I understand you would vote Yes for the P2, but I'm having trouble figuring out if this is a Yes or a No for the PnnX32! The basis of this thread is that the PnnX32 would not derail the P2.
So should we count you as a Yes?
Ross.
If a simplified product has to be produced, I only see two possible scenarios:
1. an evolution of the original P1, eventually going to replace it, adding the things that were left out from the start, and possibly a few modifications dictated by things that happened later (LMM and compiled C mainly).
2. a REAL "intermediate" chip, equally distant from both P1 and P2 to be worth keeping in production even after the P2 is made.
While other manufacturers can afford to offer a nearly continuous scaling of features, with a next generation chip already laid out, and after the demise of the SX line, I think that the segment to cover ought to be "smaller"/"less pins".
How is pulling in too many P2 features going to work when the real P2 is out?
So if I were in Chip's shoes, and if, struck by lightning, I acquired some of his abilities, this would be my plan for a P1B variant:
- Same number of COGs (8), or 16 only if it doesn't break other things. Two clocks per instruction is great.
- Fastest possible clock rate (but eventually, publicly pretend it tops out at half of that)
- 64KB hub ram (!), and just a small rom, the upper half of ram being loaded with P1 compatible rom image, which can be overwritten if required.
- Port B implemented, but only internally. A new instruction to wait for a write by another cog without wasting one bit as strobe, to allow full 32-bit COG-to-(multi)COG transfers.
Hey wait a minute... internal?!? Most people lamented that they end 1 or 2 pins short most of the time! (me too!!!)
Well, I'd solve this by downgrading the reference VGA mode to 4bpp+sync.
Here's our missing 2 pins!
And also:
- The four instructions not implemented in P1.
- Single instructions to save/load C and Z (can't remember which one required two ops).
- Largely unchanged programming model and register layout, except some registers get a dual function depending on cog "mode" (sorry heater, I just said *MODE*).
When cog is used to execute from HUB:
- Ability to clock the counters from hub read/writes, scaled by size. So that PHSA and PHSB can be used as indexes for hub access. Pre-decrement or post-increment.
- We already had tricks to do block transfers at hub window rate, using counters, but I'm not sure if that would work right with 2-clock instructions. In case it doesn't, block move instructions from/to hub would be nice (or REP prefix maybe?).
- Some kind of "range" registers, to catch code jumping outside a single line of L1-code cache. Much like fcache works already, except this would trigger an interr... (OMG! Did I just say "INTERRUPT"?!) to handle reload.
- m68k-like link/unlink acting on indexes, to implement a software-controlled L1 data cache, but limited to the current subroutine's parameters and locals. Absolute memory references go through normal hub access.
Whacky enough, uh?
*quickly runs and takes cover*
P.S.: seriously, I hope nobody takes offence. I had a similar gut reaction when some mods were proposed for P2, but I admit that later, when the dust settled, I was pleasantly surprised by the way things were integrated into the big picture. P2(3?) is being tuned, and I hope contributions from everyone can be retained.
I don't expect >512KB hub as it uses ~25mm2 and we only had 37mm2 for the P2 die (+I/O ring?)
32 x P1 cogs ~12mm2
(info from Chip's post)
Recalculating, perhaps 16 cogs and 768KB might fit.
I agree with what he said ^^^
I am with you on the argument to keep it as close as doable to the P1.
16 cogs at 2 clocks/instruction will be very close to the existing P1 with 8 cogs at 4 clocks/instruction. Good for existing software.
And Port B should be implemented as planned years ago. Just a question of packaging. Even DIP is possible again, with Port B NOT accessible from the outside: a direct replacement, but faster, with 2x the cogs and more RAM.
Same chip in other packaging with more pins can expose port B...
not whacky at all
Enjoy!
Mike
Thanks Ray,
512KB(or 768?) HUB + 16 Cogs + 64I/O (32 in DIP version)
I can live with that!
That would be a YES then for a 16 Cog version
Hi Mike,
I'm counting this as a Yes. Yes?
I have added a tally in the first post. Let me know if I missed anyone, or I am misrepresenting your opinion!
Ross.
Represents incremental improvement on an existing design, an approach that is business-wise and technically feasible (and parts of which have been customer-requested features). I'm not too comfortable putting Chip under pressure with a KickStarter effort for 65nm due to remaining manual layout, serdes design, integrated testing of core with I/O pin which remain big unknowns in my view. Each would take "two months" to complete.
The unit costs identified by Chip don't have relevance to us unless considered together with projected volumes and ROI expectations. Everything costs more than we expect.
Ken Gracey
Fair point. Those of us who do this as a hobby are not particularly price sensitive. Nor, I suspect, is Chip!
Ross.
You can get free advertisement and reach new, unknown customers. If you do it in a smart way, you can create demand for your product. You will also learn things from customers that your competitors probably don't know. You will have real numbers to predict sales, price-demand, etc.
It is low-hanging fruit. And will pave the road for future Px.
EDIT: I am talking about 180nm
I'm taking this as a Yes.
Ross.
I just wanted to say that there are additional benefits other than cash through crowd funding. Let it be Kickstarter, or any other.
This will also help to define project deadlines.
+1 with £££
And with (your?) suggestion of a simple register mechanism to pass 32-bit values to/from the COG on either side.
Cog count and HUBRAM size to me is whatever Chip & Ken decide is more viable for them to do.
Unlike P2 we need to sit back and let Chip and Ken sort this out, as they know their market more than anyone here!
However the PxxX32B turns out, it will be awesome and will totally outperform the pants off the P8X32A, and we will all be amazed by what it can do!
HUBRAM size will be awesome at 256KB or 512KB or whatever. It won't matter: if people want more, we'll have 64 pins, so we will be able to add more memory if needed!
And the best thing, is it would be with us soon!
You want a USART so you click on the wizard and select the USART option. You then have 3 choices...
1) A single USART in a COG with 50ns baud rate resolution
2) A dual USART in a COG with 1us resolution
3) A quad USART which supports all standard baud rates between 75 and 115,200
You select the one you want and the compiler does the rest, dropping the necessary skeleton access functions into your code.
Ditto with SPI/Dual SPI/Quad SPI/I2C/ADC/DAC/Timer/PWM/.....
Now target that at a chip with 32 peripheral blocks and flexible pin routing. You want 64 USARTs for some strange multiport line concentrator? No problem.
I think it was Cluso who suggested a "ring" configuration for cogs to pass data to the cogs on either side. My suggestion was for a "common bus" architecture - I had in mind the kind of "multiple access with collision detection" communication paradigm popularized by ethernet (showing my age, perhaps).
Ross.
What's the expected power consumption of this new proposal? Seems there's lots of support for this but I haven't seen power numbers. I think some caution is in order. What stops this design being power hungry like the P2 design?
^^^ THIS!
@Ross
it's a "yes", meaning that I would support whatever initiative Parallax thinks is appropriate at this time.
However I don't know what my vote is worth, given that as a hobby consumer, I'm way behind humanoido in P8X32A population
Well, obviously we don't know yet - but we do know that the P1 only requires 1/16th the logic elements of the P2, so even a "super" P1 is unlikely to require more than 1/4 or maybe 1/3 the power of a P2. If this turns out not to be the case, I would assume Parallax would not proceed.
So should I count you as Yes or a No?
Ross.
We are a democracy here! Your vote is worth the same as anyone else's!
Since when? hehe ...