Well, that's actually what Chip asked in his first post, isn't it? What are those other reasons that a two-cog P2 has to be selected? Is it because it has a small footprint? I don't think 64 is small. Is it because it has better analog performance than any other MCU? This needs to be tested, but 12 bits are quite common nowadays, with some other MCUs reaching 14 or 16 bits. Does it have better digital performance than a current 28nm US$6 Cyclone 10 FPGA, or a 55nm US$9 MAX 10 FPGA? I think it doesn't. Does it support out of the box a myriad of protocols like SPI, I2C, I2S, Ethernet, WiFi, BLE, CAN, PROFIBUS, ...? I don't think so. Is it because it has plenty of RAM and flash? Or because it has code protection?
An eight- or 16-cog P2 has in itself something unique (deterministic, interrupt-less, parallel-thinking magic). But what does a two-cog version have?
It's very hard to compete out there!
If it's not going to be cheap then better not waste any time with it.
Intel/Altera are already moving their lowest-cost FPGAs into MCU territory (MAX 10 and Cyclone 10 have smaller packages, fewer required power supplies, and integrated extras like flash/ADC at less than $10). Other players like PSoC are moving there too. And thousands of different common MCUs like ARM Cortex, PIC, AVR, MSP430/432, H8, 8051 have those options too.
Right now there are no reasons to move to a two-cog P2 other than that we like Parallax's freedom and vision (Chip's vision of how MCUs should be done) or that the price is so cheap that we want to try it out.
Don't be fooled. It's not going to be $3, not even for 10K quantities.
Parallax have to make back a big investment and at $3 that's not much above raw cost.
And then you have to add 2 power supplies, brownout, a flash chip or SD, bypass caps and possibly a crystal or oscillator. Not all these are required on other micros that the P2X2C128K40P will be competing with.
That isn't to say that it doesn't have features other chips don't have, but we have to be realistic too.
It seems to me that the 2 COG version would have to sell for more than the P1 although I suppose you could argue that the P1 has 8 COGs and hence is more powerful.
On another note, if we get this 2 COG version we will probably need to find a way to stuff a bunch of peripherals into a single COG since we won't have the luxury of assigning a COG to every peripheral. This means we'll have to get comfortable with interrupts since I imagine they will be necessary for multiple peripherals in a single COG.
Absolutely true, but the smart pins will take some of the workload.
I estimate that the 2-cog die would be 4.5 x 4.5mm and fit perfectly into the 7x7mm 48-pin Amkor exposed-pad package with the 5x5mm die pad. That part would be a lot of fun to play with. It would be very fast and much lower power than the current chip. Way less-expensive, too. Maybe in the $3 range.
That fits my definition for 'cheap'. Almost perfect! small, cheaper, and faster.
The 2-cog version would draw almost as much power as the 8-cog version, is that right?
I think I remember the plot showing that power consumption increased only slightly with #cogs...
I believe Chip has indicated that the power consumption is high because the entire clock tree is always active in the current chip. I would expect the clock tree in the 2 cog version to be 1/4 of that in the 8 cog version, so power consumption should be 1/4 as well.
What was the expected power consumption reduction for the current P2 with clock-gating enabled? I remember you said 10% lower Fmax but didn't remember any data about power.
And what would be the power reduction and Fmax for a two cogs P2 with clock-gating?
(I'd prefer a higher Fmax for both of them, but I am curious about the exact numbers)
If everything is running, the power would be the same, either way. If one cog out of eight is running, the power might be 20%.
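To put rough numbers on the clock-gating estimate above, here is a minimal Python sketch. It assumes (my assumption, not something stated in the thread) that gated power scales linearly with the number of running cogs, pinned to the two data points given: all eight cogs running = 100%, one cog running = about 20%. The function name and the linear model are purely illustrative.

```python
def gated_power_fraction(active_cogs: int,
                         total_cogs: int = 8,
                         one_cog_fraction: float = 0.20) -> float:
    """Relative power (1.0 = all cogs running) with clock gating.

    Hypothetical linear model: a fixed base cost plus an equal cost
    per running cog, fitted so that 1 cog -> one_cog_fraction and
    total_cogs -> 1.0.
    """
    if not 1 <= active_cogs <= total_cogs:
        raise ValueError("active_cogs must be between 1 and total_cogs")
    if total_cogs == 1:
        return 1.0
    # Slope and intercept of the line through the two known points.
    per_cog = (1.0 - one_cog_fraction) / (total_cogs - 1)
    base = one_cog_fraction - per_cog
    return base + per_cog * active_cogs


if __name__ == "__main__":
    for n in range(1, 9):
        print(f"{n} cog(s) running: {gated_power_fraction(n):.0%} of full power")
```

Under this model most of the power follows the number of running cogs, with only a small fixed share left over. Real silicon will not be this tidy, so treat it as a back-of-envelope guide only.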
Yes, it will be important to sanity check going down to 2 COG.
COG count does not have to be 2 or 4, as 3, 5 and 6 are also possible; those just have empty eggbeater slot(s).
There are already 2-core MCUs, with very good debug.
Debug of a 2 cog part will be a challenge.
On a small part, another Verilog path could be P1V,
what MHz / RAM could P1V hit ?
It’s really the smart pins that are the selling point here.
The 0.6" 68k is a 68008; look at the possible suffixes in the table on page 11.7. Even if the first package shown says pin 64 on the top left, the width (A) is only ~60 mm.
Here is a photo of a 1" breakout board I made recently with a TQFP-100 package. I didn't go for a narrower one (one could trim it "easily" to 0.9"), I just wanted to finish it asap.
Looks like maybe that 0.9" socket could take the same nifty soldering trick, to make a P2 Stacker ?
Further to the talk around a 0.9" Standard package P2 DIP, I've done some quick 'will it fit' scribbles for a FLiP_P2_Dip64
into that thread above which was for FLiP-like alternatives.
The pinout provides unique VIO pins for each set of four I/O pins. On the 2-cog 32-pin part, you could have 8 different VIO supplies.
Did you ever discuss with OnSemi, about putting the P2 die, into a BGA package ?
Calls for "smaller size" seem to be a reasonable part of the push to subset P2s, so if you can address part of that with the 'same die, different package' approach that other vendors use, that makes more business sense than creating a new part?
The package we use is important because it has a very low Tja. I think a BGA would get very hot without such a solid way of getting the heat out.
BGA gives a lot of connections...
I could see it being about the same
As soon as copper heat dissipation area is cut by a trace, the PCB's heat conduction is compromised. A BGA wouldn't afford the kind of heat conduction that this exposed pad package provides.
.. but FPGAs run hotter than P2, and they manage the heat ok in their BGA packages ? Likewise RaspPi etc
Here is a table for a Lattice 10x10 BGA package - note the JC/JB thetas decrease as die size increases.
As expected, since a bigger die, sits over more BGA pads, and so spreads the watts/mm2
BGA is not a 2 Layer package, but your BGA customers, are not using 2L PCBs to start with.
BGA thermal resistance

Device       Package  Size           Balls  θJB (°C/W)  θJC (°C/W)
LFE5U-12F    CSFBGA   10 mm x 10 mm  285    20.7        12.5
LFE5U-25F    CSFBGA   10 mm x 10 mm  285    20.7        12.5
LFE5UM-25F   CSFBGA   10 mm x 10 mm  285    20.7        12.5
LFE5UM-45F   CSFBGA   10 mm x 10 mm  285    17.2        9.6
LFE5U-45F    CSFBGA   10 mm x 10 mm  285    17.2        9.6
LFE5U-85F    CSFBGA   10 mm x 10 mm  285    12.5        6.7
LFE5UM-85F   CSFBGA   10 mm x 10 mm  285    12.5        6.7
Yes, the real work is done by the vias, and the inner plane layers, then spread the heat sideways.
The largest die Lattice parts have quite low Tjb, and I'm not sure what size die they actually have.
Note that P2 has a very large die.
P2D2 has ~ 49 vias, whilst a 10x10 BGA could expect to have well over 100 GND Balls and associated vias.
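For a feel of what those thetas mean, junction temperature rise over the board is roughly power times θJB. A minimal Python sketch follows, using the θJB values from the Lattice table. The 1 W dissipation and 50 °C board temperature are made-up illustration values, not P2 or Lattice measurements, and the function name is hypothetical.

```python
def junction_temp_c(board_temp_c: float,
                    power_w: float,
                    theta_jb_c_per_w: float) -> float:
    """Estimate junction temperature from board temperature.

    Uses the simple steady-state relation Tj = Tb + P * theta_JB,
    which ignores heat leaving through the top of the package.
    """
    return board_temp_c + power_w * theta_jb_c_per_w


# theta_JB values (deg C/W) copied from the Lattice 10x10 csfBGA table.
THETA_JB = {
    "LFE5U-12F": 20.7,
    "LFE5U-45F": 17.2,
    "LFE5U-85F": 12.5,
}

if __name__ == "__main__":
    BOARD_TEMP_C = 50.0  # assumed board temperature, illustration only
    POWER_W = 1.0        # assumed dissipation, illustration only
    for part, theta in THETA_JB.items():
        tj = junction_temp_c(BOARD_TEMP_C, POWER_W, theta)
        print(f"{part}: estimated Tj = {tj:.1f} C")
```

The larger-die parts come out cooler for the same wattage, which is the trend noted above. How a P2 die in a BGA would compare with the exposed-pad package still depends on numbers nobody has posted here.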
I like this idea. It seems like a perfect intermediate between the P1 and full P2. P2 is overkill for a lot of P1 projects that come just a bit short. Only 4 cogs is made up for by running faster and interrupts, and one of the big reasons we've lusted for more pins on P1 is external RAM -- which we don't need on P2 because it has a much more adequate RAM for whole projects with business logic included. I wouldn't want it to impact P2 launch but if you can do it cheaply on the side, then it would be a nice example of what we in Louisiana call lagniappe -- a little something extra.
@cgracey said:
Here is a 2-cog, 128K-hub, 32-pin version that would fit into the Amkor 7x7mm 48-pin package:
What is the chance of this seeing the light of day?
Well, fab capacity is sold out through next year and we'd have to spend maybe $300k to make it happen. We probably won't be able to do this for a while, yet.
If ON could get it to us for half that.
C.W.
https://forums.parallax.com/discussion/comment/1448773/#Comment_1448773
Same for BGA...
Looks good on Wikipedia... FPGAs all use it...
Reviving this old thread ...
Wow, two years of back orders for 180 nm production? My jaw is on the floor.