From what I understand, every section of the P2 will be done via synthesis by Treehouse.
That means that the manual pin layout section done by Beau is being redone by Treehouse (currently underway or done).
Therefore, it seems to me that there is nothing the P1+ would require that would not be available to it.
This would include the clock input & PLL, and anything else required. I would be fairly sure there will be a PLL in the P2 that could be used for the video in P1+. If not, then surely Treehouse would have access to one.
Now, that does not mean these have been validated in this instance. For that matter, nothing has been validated by Parallax using either Treehouse or OnSemi. Remember, the only silicon produced for Parallax by OnSemi was the failed P2 design that was a mix of the manual pin layout by Beau plus the remainder of the design by another contractor. Unfortunately an error crept into the join between the manual section and the remainder: the two sections had a major short between them, rendering the chip totally unusable.
FWIW I wonder what features this P2 version had in it ??? If it was successful, IIRC it was production ready.
Since I believe Treehouse (will) have the manual pin design converted to HLL Synthesis, I wonder if this could be a viable P2- ???
@pjv,
I wonder if a tailored version of a P1+ run at the old P1 feature size (360nm?) might work for you, and have other markets too???
The die would be larger for 64 I/O (maybe a QFP84 package)
Probably would fit more hub RAM. Maybe just 64KB and serial load the ROM if required.
Chip may be able to help increase the speed since he knows where the critical sections were.
The video uses a lot of LEs, so some space saving is possible by only having 1 or 2 cogs with video. (Ken's customer survey said they weren't interested in video!)
IMHO an increase in cog RAM, at least in some cogs, would be worthwhile provided there was support for relative jump/calls (i.e. cogexec), and maybe rd/wrlut (movlut).
Since there would be no hubexec, perhaps LMM would benefit with a PTR register (shadow PAR could be used) that auto-incremented for rd/wr-long/word/byte.
Maybe MUL/MULS and perhaps DIV.
Super simple serial in the cogs without video would basically round out the feature set.
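To make the auto-incrementing PTR idea above a bit more concrete, here is a toy Python model. It is only a sketch: the `HubMemory` class and its method names are made up for illustration (they just borrow the rdlong/rdword/rdbyte naming), and the only thing it shows is the pointer stepping by 4, 2 or 1 bytes after each read, which is what would save an LMM loop its separate add instruction.

```python
class HubMemory:
    """Toy model of hub RAM with a hypothetical auto-incrementing PTR register."""

    def __init__(self, data: bytes):
        self.ram = bytearray(data)
        self.ptr = 0  # hypothetical PTR register (e.g. a shadow of PAR)

    def _read(self, size: int) -> int:
        # Read `size` bytes little-endian at PTR, then advance PTR by `size`.
        value = int.from_bytes(self.ram[self.ptr:self.ptr + size], "little")
        self.ptr += size
        return value

    def rdbyte(self) -> int:
        return self._read(1)

    def rdword(self) -> int:
        return self._read(2)

    def rdlong(self) -> int:
        return self._read(4)


# One long, one word and one byte laid out back to back in "hub RAM".
hub = HubMemory(bytes([0x78, 0x56, 0x34, 0x12, 0xCD, 0xAB, 0xEF]))
first = hub.rdlong()   # PTR moves from 0 to 4
second = hub.rdword()  # PTR moves from 4 to 6
third = hub.rdbyte()   # PTR moves from 6 to 7
```

Writes would work the same way in this model; real hardware would also have to decide on alignment rules, which this sketch ignores.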
The only other question is security - the problem is no eeprom/flash and if 360nm I don't think the P2 fuses will work.
I still don't understand why eeprom/flash cannot be part of the design. I think it requires an extra mask or two. Surely that would be cheaper in the long run than adding an external eeprom/flash part. IIRC the actual die production cost was around $2.30??? for the P2 (before packaging/testing etc). I do know OnSemi produce the CAT24C512 eeproms. I don't think two-die packages are viable (cost wise), but that could be worth asking.
This would include the clock input & PLL, and anything else required. I would be fairly sure there will be a PLL in the P2 that could be used for the video in P1+. If not, then surely Treehouse would have access to one.
Sure, but that is valuable Analog IP, available at a price.
Note it is also NOT designed for the older P1 process.
I wonder if a tailored version of a P1+ run at the old P1 feature size (360nm?) might work for you, and have other markets too???
The die would be larger for 64 I/O (maybe a QFP84 package)
Probably would fit more hub RAM. Maybe just 64KB and serial load the ROM if required.
Chip may be able to help increase the speed since he knows where the critical sections were.
Or, it may run slower, as rats-nest autorouting is not as low-C as a manually crafted layout.
An 80MHz 32b adder at 360nm? is pretty impressive.
I still don't understand why eeprom/flash cannot be part of the design. I think it requires an extra mask or two. Surely that would be cheaper in the long run than adding an external eeprom/flash part. IIRC the actual die production cost was around $2.30??? for the P2 (before packaging/testing etc). I do know OnSemi produce the CAT24C512 eeproms. I don't think two-die packages are viable (cost wise), but that could be worth asking.
Again, anything is possible, but remember flash is much slower than SRAM and it needs to be available on the process you want.
You could use a small loader flash in the corner, but then the whole die wears the extra mask costs, for what is a tiny portion of die area.
If that was identified as vitally important to customers, then I'd look at stacked die - already used in many devices,
and it can give an optional family mix, with no main-die price adder. (e.g. NUC505 has 2 die - Main MCU and SerialFlash)
I think Xilinx do the same on one FPGA family.
You can also get reasonable system security with a 31c MCU.
Agreed. Just noting differences. Multiple clock domains was mentioned a number of times when the P2 process went along the same route as P1 did. Lots of differences now, most centered on avoidance of manual or custom silicon.
I do however feel P1 plus discussion should be elsewhere. It's not productive mixed in with the P2 discussion going on. That doesn't mean it is somehow not valid, just that it has little to nothing to do with the core focus here.
And, iirc, that discussion has been had a number of times too. There exists an upfront effort that is non trivial. IMHO, funding some time with the fab to do a synthesis and design review is the most likely outcome. The guys who own the process are going to have the most meaningful insight.
In any case, the current design differs very considerably from how the P1 was done, and that deviation is an artifact of hard, expensive and time-consuming lessons learned along the way.
A P1 project can expect to see similar dynamics, though at a smaller scale and perhaps mitigated by an aggressive pruning of features.
The P1 is a custom one-off. There is no tweak-and-remake on the table.
Really, the P1v is all about making a new chip that resembles the P1, not about making a derivative. That is where the initial cost and overall difficulty lie.
Good luck with it. I completely understand the motivations.
From what little I do understand of chip production, it seems the smaller the internal workings of the chip -- the less power required.
Taiwan Semi is claiming to be getting 8 nanometer production on-line and China can manage 12 nanometer production, while the Propeller is at something like 180 or 145 nanometer due to prohibitive production start-up costs.
OnSemi seems to only produce down to 145 nanometer. So an entirely new supplier would have to be located in order to really optimize low power.
Am I correct? Did I miss something?
This is what I understand...
As the feature size goes lower, the leakage per transistor goes up. But because the voltage has been lowered, the power leakage actually goes down per transistor. But there are usually more transistors on a finer feature size. Hence the typical statement that a finer geometry uses more power. But it's not necessarily the case, because if you have the same number of transistors at two feature sizes, the finer feature size will use less power switching, and provided the core voltage has been lowered, there will be less power used in total.
Anyway, that is how I understand companies like Atmel are getting their AVRs etc using much less power for the same chip - new design with a finer geometry, and some power down bits too.
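A rough way to put numbers on that argument is the usual first-order CMOS model: switching power per transistor is about activity x C x V^2 x f, and leakage power is V x I_leak. The sketch below is only an illustration. Every value in it (capacitance, voltages, leakage currents, transistor count) is made up to match the description above, not taken from any real process.

```python
def dynamic_power(c_farads, v_volts, f_hz, activity=0.1):
    """First-order switching power of one transistor: activity * C * V^2 * f."""
    return activity * c_farads * v_volts ** 2 * f_hz


def leakage_power(v_volts, i_leak_amps):
    """Static leakage power of one transistor: V * I_leak."""
    return v_volts * i_leak_amps


def total_power(n_transistors, c_farads, v_volts, f_hz, i_leak_amps):
    """Total power for n identical transistors (switching plus leakage)."""
    per_transistor = dynamic_power(c_farads, v_volts, f_hz) + leakage_power(v_volts, i_leak_amps)
    return n_transistors * per_transistor


# Made-up example values: same transistor count and clock on both geometries.
N = 1_000_000
F = 80e6

# Coarser geometry: more capacitance, higher core voltage, lower leakage current.
old_total = total_power(N, c_farads=2e-15, v_volts=3.3, f_hz=F, i_leak_amps=1.0e-12)

# Finer geometry: less capacitance, lower core voltage, higher leakage current.
new_total = total_power(N, c_farads=1e-15, v_volts=1.8, f_hz=F, i_leak_amps=1.5e-12)

print(f"old: {old_total:.4f} W, new: {new_total:.4f} W")
```

With these assumed numbers the finer geometry comes out lower in total. If you pack in far more transistors, or the leakage current rises much faster than assumed here, the comparison can go the other way.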
Hey, how about an 8nm or 12nm P2 ? There would be a lot of dies on a wafer!!!
Or 64KB Cogs
From what little I do understand of chip production, it seems the smaller the internal workings of the chip -- the less power required.
That only works in one aspect - active switching losses. The charge in a smaller transistor is smaller, therefore less energy is needed to change its state.
Problem is, with CMOS, there are two types of static leakage loss that become significant as the feature size shrinks. One is because of source-drain proximity - with a short channel length the channel's hi-Z is not so high any longer, so they have to bring the gate closer to combat this. That leads to the other, uncontrolled gate tunnelling or some such, i.e. the gate starts to leak badly when the insulator is only a few atomic layers thick.
As far as I can tell every time the process size shrinks by half the cost of actually getting anything made with it doubles.
This is the down side of Moore's Law. Making it harder and harder for the little guy to make a start.
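Taking that rule of thumb at face value (cost doubles each time the feature size halves), the arithmetic can be sketched as below. This is just the stated rule turned into a formula, not real foundry pricing, and `relative_cost` is a name made up for this example.

```python
import math


def relative_cost(old_nm, new_nm):
    """Cost multiplier if cost doubles for every halving of feature size."""
    halvings = math.log2(old_nm / new_nm)
    return 2 ** halvings


one_halving = relative_cost(360, 180)   # one halving of feature size
two_halvings = relative_cost(360, 90)   # two halvings of feature size
```

Note that 2 to the power log2(ratio) is just the ratio itself, so under this rule cost simply scales inversely with feature size.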
Not even the big guys can keep up...
IBM has sold off their fabs.
AFAIK there are only Intel, TSMC, Samsung & ex-IBM left at the cutting edge.
Pity help us if we get into one of those shortage nightmares we had in the 80's and 90's !
Ummm, it is a bit like gold mining. Every time the price of gold rises, the cost of mining seems to move up as well.
So every time a chip offers more value, the price of the production process increases.
The producers of process equipment may be making more stable profits than the actual producers of gold or micro-controllers. So Parallax tries to stay away from all that by using a producer that is pretty much running old technology, maybe even second-hand equipment.
But business is getting slow for TSMC, so maybe there will be a savings due to drop in global demand that will allow the Propeller 2 to be sourced on a low-power technology. The book-to-bill ratio fell below 1.0 recently.
Agreed, this would be a fantastic achievement. I am willing to help and have done some of the things necessary. Will email within 24 hrs when I get home. Only have my iPhone here so typing is a bit tedious.
Comments
I suspect a very brief chat with Chip would quickly clear up these various questions, and determine what the tough issues would be.
Cheers,
Peter (pjv)
Other news:
There is this alternative : (some FPGAs offer OTP and RAM, and IIRC XMOS have a small OTP Cell )
http://www.digitimes.com/supply_chain_window/story.asp?datepublish=2015/11/11&pages=PR&seq=200
However, note their focus is around '55nm technology'; finding that IP qualified in the P2 or P1 fab is a whole different story.
Cluso, read this - http://forums.parallax.com/discussion/comment/1353689/#Comment_1353689
On paper, 8nm is about 500x denser than 180nm by area: (180/8)^2 is roughly 506!
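For anyone who wants to check the density figure, it is just the square of the linear shrink. A minimal sketch, assuming the node names can be treated as real linear dimensions (which is optimistic for modern processes); `density_ratio` is a made-up helper name:

```python
def density_ratio(old_nm, new_nm):
    """Area density gain from a linear shrink: (old / new) squared."""
    return (old_nm / new_nm) ** 2


ratio_180_to_8 = density_ratio(180, 8)
print(f"180nm -> 8nm: about {ratio_180_to_8:.0f}x denser")
```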
PS: I was reading a little about gate leakage just yesterday after JMG made mention of this.
Currently, IBM has 7nm in the lab, may be ready for mass prod in 2017, and is EUV based...
Intel has forecast 10nm in early 2017, however it's not known how that is going to ramp.
TSMC has forecast 10nm for 2016, and 7nm for 2017, however their recent track record has been poor, so....
This is immaterial though; the newer processes are too expensive for Parallax, and for most uC anyways.
So at the moment the proposed development appears to be in the reasonable range from a business perspective, but many more details still need to be resolved.
In discussions with Treehouse today, it was decided to get certain parties together to assess some issues. That will happen on Nov 16, and the steps following will be determined by the outcome of those discussions.
In the meantime.......
As I am totally devoid of any FPGA/VHDL/Verilog knowledge, I am soliciting the forum members to find someone willing to contract for modification of the P1 files to incorporate the (as yet unspecified) features desired. These will, among some possible other things, encompass activating port B. According to Chip, indirect addressing should be almost trivial, and the oscillators are likely stock items from OnSemi. Fuses, such as OTP, also seemed standard to Treehouse.
So, are there any experts here who might like to take on a challenge (to be well specified) and be financially rewarded for their effort? Personally, I cannot yet judge the magnitude of the effort, but I expect it to be not onerous, and Chip will be able to vet those estimates.
We are also scouring our local Universities for competent talent to add to this picture, so, if it makes sense, it need not be a one-man-show.
What do you say ?
Cheers,
Peter (pjv)
I sincerely hope you can pull this off not just for yourself but also for the whole Parallax/Propeller community.
I'll certainly be in the queue to purchase some from you.
Actually, if the project is successful, the chips will be sold by Parallax.
Cheers,
Peter (pjv)