What would you want in a new P8X32B ?
Cluso99
Posts: 18,069
The P2 is close now.
So I am now wondering about the P8X32B (previously referred to as the P1B). We know that there are problems with the chip design software to make the P1B so it begs the question, in the light of what has been learnt with the P2 design, what could be done with a revised P1B design?
There is no need to distract Chip for now. These are just some random thoughts and requirements...
1. Low power (like the P1 or lower) - therefore probably the same process as the P1
2. More I/O
3. Is a 1% or 0.5% internal oscillator possible with this fabrication process?
4. Hub access every 8 clocks instead of 16
5. 64KB Hub RAM with 2KB of that for monitor/loader (i.e. no ROM, like the P2)
6. Could the VGA pins be internally joined (as in the P2)? Similarly for TV/composite.
7. Would simple analog on the pins be possible, or do we use the same sigma-delta?
8. Increase operating frequency - would 120MHz be possible as standard?
9. Obviously fix the PLL, and increase multiplier options so we could use the same xtal (internal?) for, say, 80/96/100/120MHz
10. Could a QFP64 10x10mm @ 0.5mm pitch package work - 52 I/O (or 14x14mm @ 0.8mm pitch)?
11. Use an I/O bank selection like the P2 (not the stated C bit)
12. I2C EEPROM or SPI FLASH
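As a rough check on the PLL item in the list above: the target clocks of 80/96/100/120MHz share a greatest common divisor of 4MHz, so one reference of that frequency could reach all of them with integer multipliers. This is only a sketch. The integer-only multiplier and the 4MHz reference are assumptions for illustration, not anything specified for a real chip.

```python
from functools import reduce
from math import gcd

# Target system clocks from the wish list, in MHz.
targets_mhz = [80, 96, 100, 120]

# Largest reference frequency that divides every target evenly.
ref_mhz = reduce(gcd, targets_mhz)

# Integer multiplier a hypothetical PLL would need for each target.
multipliers = {f: f // ref_mhz for f in targets_mhz}

print(ref_mhz)      # 4
print(multipliers)  # {80: 20, 96: 24, 100: 25, 120: 30}
```

None of those multipliers is a power of two, so a power-of-two-only PLL could not produce this set from one crystal.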
Comments
Definitely agree with #11.
I like #5.
Don't like #6 at all: too many other things to use VGA-style output for than just VGA display.
Re #9: longer time constant for phase detector filtering in counter PLL mode => less jitter. (Programmable filtering?)
#13: More counter modes.
-Phil
Since the current multiplier is a power of two perhaps a 4MHz xtal with multipliers of 1, 2, 4, 8, 16, 32 and max frequency of 128MHz would be better. Gives us both a lower power and higher speed option.
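For reference, the clock options that suggestion would give can be tabulated quickly. This is a minimal sketch; the variable names are made up for illustration.

```python
# Crystal frequency and power-of-two multipliers suggested in the post above.
XTAL_MHZ = 4
multipliers = [1, 2, 4, 8, 16, 32]

# System clock for each multiplier setting, in MHz.
clocks_mhz = [XTAL_MHZ * m for m in multipliers]

print(clocks_mhz)  # [4, 8, 16, 32, 64, 128]
```

Note that 80, 96, 100 and 120MHz are not in that set, so this scheme trades those specific frequencies for a simpler multiplier.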
In view of the prices I would suggest Flash in place of eeprom.
This comes down to measurements and numbers.
We do not yet know what the Static and low-speed Icc of P2 is.
We do not yet know the relative price (which relates to yields and testing time) for P2 :: P1
What many other suppliers do, is take one die, and sell as multiple variants.
In the P2 die case, one obvious easy step is to offer in a TQFP64-0.8mm pitch package.
That pretty much hits the P8X32B target, with very minimal new engineering effort.
It becomes a commercial decision on what price to give the reduced pin count version.
#14: Execute in place support for QuadSPI(DDR) memory.
This would be hardware supported to allow full speed DDR, and fast burst access.
Much of this can be mission-proven first, on a P2, by using the slower SW pathways.
#6 would only be an option because I agree, there are plenty of other things we use the VGA counters for.
#9 is your baby (expertise) Phil
#13 Yes, and a simple gate to select an input pin instead of internal counters could be a huge help. Nothing too special, just a little more flexible.
The more I design propeller boards the more I realise I just need one thing:
More pins.
I don't even care about the package any more - if there were a QFN package with more pins someone would very quickly come out with a breakout board.
Let's say, hypothetically, that one had 64 I/O pins instead of 32. Then things like could be done with an external sram chip and devote a cog to handling the memory access and then you could have 512k or more.
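To put rough numbers on the external SRAM idea, here is a small pin-budget sketch. It assumes a plain parallel SRAM with no address latching and three control lines (CE, OE, WE); both are assumptions for illustration, and the function name is hypothetical.

```python
def sram_pins(size_bytes, data_bits=8, control_pins=3):
    """Pins for a parallel SRAM: address + data + control (CE, OE, WE assumed)."""
    words = size_bytes * 8 // data_bits
    # Address lines needed to select any one word.
    address_pins = (words - 1).bit_length()
    return address_pins + data_bits + control_pins

needed = sram_pins(512 * 1024)
print(needed)       # 30
print(32 - needed)  # 2 pins left with 32 I/O
print(64 - needed)  # 34 pins left with 64 I/O
```

Under those assumptions a 512KB x8 part uses nearly every pin on a 32-I/O chip, but leaves more than half free on a 64-I/O one.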
Cool idea!
How many I/O pins does a DE0 or a DE2 have? Do those FPGAs have enough grunt to emulate all 8 cogs of the standard propeller?
If I understand correctly the PI was designed the old-fashioned way, schematically, gate by gate. A slow and tedious way to proceed. It does not scale.
The PII has been designed with a hardware description language (Verilog, is it?), where much more complex things can be tackled, development is faster and changes are easier.
I imagine by now Chip does not want to go back to the old ways. Any new Prop, bigger or smaller, is more likely to be a variant of the PII.
I imagine something more like:
1) Start from the PII design
2) Cut it down to 64 I/O pins for smaller packaging.
3) Remove a big lot of the transistors around the pins getting rid of some fancy Prop II I/O features.
4) Halve the RAM.
5) With all the space saved increase the process size to allow for lower power consumption. As well as slowing the clock maybe.
6) More radically: only use 4 COGs, thus saving more power. We have threading within COGs now to make up for fewer COGs.
If you want to have significant differences, then it would likely involve redoing it in the new HDL and require a significant time and cost investment to do it.
Making a P2 variant that works on a different process would be fairly significant also, remember the I/Os are hand laid out by Beau to the current process specs, and the rules change for a different process.
I think the best bet for Prop chip that is lower power than the P2 and more I/Os than the P1, is to just run the P2 at lower clock rates.
Is the power difference really going to be that terrible? I mean we're still going to be able to run a P2 from batteries, and with the advanced batteries we have these days I'm thinking it'll be just fine.
Otherwise the new chip should have a different name like P"1.5".
Keeping cog, hub and instruction timing and the instruction set and opcodes 100% compatible, more hub RAM, no ROM, less cogs don't look like a good idea. I think more cogs would hurt too.
...and the I2C EEPROM was a great idea; please don't replace it with wear-out memory inside the P1b.
Just more pins (and maybe a few more MHz being officially supported) would yield a P1b opening many new doors without breaking anything except the packaging ... and please have a heart for hobbyists and breadboards right from the start.
Something like P4X32C...
Basically the P2 but with 4 cogs and 64 I/O, maybe reduce hub RAM to 64K if that has a big cost benefit.
I know it would run against Chip's philosophy, but if it makes a big enough difference in cost it might make sense to have 32 P1 type "digital" I/O and 32 P2 type "super" I/O.
C.W.
If Parallax wanted to make one, I think that minimal changes would allow for:
- 64 I/O (TQFP 80)
- 62KB RAM / 2KB boot loader+monitor
- SPI flash booting
That would be sweet for many applications, and only minor revisions to P1 docs to make P1B docs. It would also fit nicely into the product list, between the P1 and P2.
But... we (as in forum) are more likely to start a P3 wish thread after the P2 is fully functional, and we think of tweaks to improve it.
I'd love another register in the counters and a mode such that one could set a pin's "on" ticks and "off" ticks -- like in the SX48 (though with 32-bit registers it would be far more flexible). Would be so nice to have set-and-forget variable duty cycle, fixed-frequency pwm without using a code loop.
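A quick sketch of how such "on" and "off" tick values might be derived for a given PWM frequency and duty cycle. The two-register counter mode is the hypothetical feature described above, not something the P1 has, and the function name is made up for illustration.

```python
def pwm_ticks(clk_hz, pwm_hz, duty):
    """Split one PWM period into on/off tick counts (hypothetical registers)."""
    period = round(clk_hz / pwm_hz)  # system clocks per PWM period
    on = round(period * duty)        # clocks the pin stays high
    return on, period - on           # remaining clocks the pin stays low

# 80MHz system clock, 20kHz PWM, 25% duty.
on_ticks, off_ticks = pwm_ticks(80_000_000, 20_000, 0.25)
print(on_ticks, off_ticks)  # 1000 3000
```

Changing the duty cycle would then only mean rewriting the two values while their sum, and so the frequency, stays fixed.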
And of course my really big bugaboo with the P1: indirect and relative addressing. I realize this would be nigh impossible with the current P1 layout.
Cheers,
Peter (pjv)
John Abshier
Functional MUL/DIV
64KB HUB if possible (-2KB booter)
64 IOs
Faster clock (smaller process to ~double SRAM would naturally be faster)
Marginally higher power dissipation ok
I'd go for additional counter modes, if not a ticks register, then more feedback options. For example, one in the NCO mode that would condition the advance of the counter on the state of the high bit or carry, so the counter could self-extinguish. Or, a mode that cross-couples ctra to ctrb, where each runs when the b31 bit of the other is high. Would that allow free-running standard PWM? It is already possible to achieve free-running standard PWM by offset aligning two counters.
I know it would be nice to have EE or flash on board, but I think the process that allows for the I/O pads to have their nice characteristics doesn't allow for that.
How many will really sell? (and at what price point? 1.5 times P1 cost? How does that go against P2 planned costs?)
How much will it cost in up front engineering and fabrication costs? (P1 re-engineering I imagine is pretty expensive based on the technology used.)
The same goes for re-engineering a P1.5 from a P2. You gain some savings in engineering costs since it's modern technology but it is still a lot of time and money for Chip and Beau. Will the changes increase per wafer yield enough to justify it? Are the sales volumes there? EXCEPT for the power issues, will it be more economical to just grab a P2 and use as much of it as you need? Where would the P1.5 fit into the cost model?
Yes, it's lots of fun to talk about. Maybe it is practical for an Atmel or Microchip to do something like this, but is it really a good use of resources for Parallax?
Unless I'm totally off in left field and these changes amount to a few lines of "ode" and a "recompile" at the silicon level.
Is a poem coming...
Ode to Prop 1.5...
C.W.
You can always put more than one die in a package.
I learned this at defcon when one of the presenters disclosed their research where they had discovered that someone in their customer's supply chain had surreptitiously inserted a radio transceiver into an IC package.
When they x-ray'd the package they found three dies: an MCU, flash memory, and a 430MHz radio transceiver. They only expected the first two.
If I were their customer, that level of "interest" would keep me awake at night.
You could, but (1) you need room and both Propellers are near the limits of die size for at least one of their packages, and (2) it makes the packaging more expensive. I'd bet it's nonstandard enough that the extra packaging cost to Parallax would be more than the cost to us to add an external chip.
#1 make it cheaper
#2 increase clock speed
#3 decrease power consumption
#4 keep it code and pin compatible
#1 make it cheaper
#3 decrease power consumption
+ 1
more pins would be nice, but with the P2 addressing most of anything I could imagine for a P1 improvement, I think that a lower cost would bring the chip to more markets, especially with the C efforts.
On another note, this morning I read about DDC transistors - 35% faster for same power, 50% lower power for same speed, 55% faster for increased power at same voltage. The problem here is that (apart from new synthesis) it requires different core voltages. It seems like the same chip is useable (scalable) from lower power to faster higher power.
http://www.eejournal.com/archives/articles/20130916-suvolta/
Unfortunately, probably not available to Parallax without great expense
Those were tested at 65nm and 55nm, with 28nm coming. Nowhere close to the Prop process.
Besides, Chip probably does not want to hear the words 'hotter process' at this point in time.