More than 1/3 of the power dissipation just flew away. Awesome magic!
And without having to resort to any sort of assisted cooling.
Perhaps now is the right moment to ask Wendy, so she has time to replace those two Chinese guys, the ones that were riding alligators, with two Inuits happily feeding some penguins?
It's just a pity OnSemi didn't have any crystal-clear, thermally-enhanced glob-top compound available that passed their characterization/quality screening, to surround the whole cake.
Looking good. With pretty much the same silicon active in the everything-running case, what has made the power lower?
Clock gating has reduced the power. This design even has 15% more logic than the current silicon.
That makes sense for 1 COG, but in the test where all 8 cogs are executing from hub and all 64 smart pins are running PWM at a clock frequency of 182MHz, what is not being used and so is gated off?
I think CORDIC is disabled in that test? What else is gated off to save power?
It's really good to read that power has reduced by a third in the torture test, Chip. That alone seems worth it, and all portable battery powered P2 applications have just gained more runtime - almost up to 50% more in the most extreme cases where all power goes to the P2 and it's running flat out. Also board design currents are now lowered and resistive losses in power inductors will be a lot less. I like it a lot.
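As a rough check on the "reduced by a third" and "almost up to 50% more runtime" figures, here is a minimal sketch using the torture-test numbers quoted in this thread (1.2W old, 0.7894W new). It assumes the P2 core is the only load on the battery, which is the extreme case described above; the function names are just illustrative.

```python
# Torture-test core power figures quoted in this thread, in watts.
OLD_TORTURE_W = 1.2      # old silicon
NEW_TORTURE_W = 0.7894   # new silicon ("Total Power" in the report)


def power_reduction(old_w: float, new_w: float) -> float:
    """Fractional drop in power when going from old_w to new_w."""
    return 1.0 - new_w / old_w


def runtime_gain(old_w: float, new_w: float) -> float:
    """Fractional increase in runtime on a fixed energy budget.

    ASSUMPTION: all battery energy goes to the P2 core (no regulator
    losses, no other loads), i.e. the 'most extreme case'.
    """
    return old_w / new_w - 1.0


print(f"power reduction: {power_reduction(OLD_TORTURE_W, NEW_TORTURE_W):.1%}")
print(f"runtime gain:    {runtime_gain(OLD_TORTURE_W, NEW_TORTURE_W):.1%}")
```

In any real design the gain would be smaller, since regulators and other parts on the board take their own share of the power.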
Wendy sent me the power reports for the download test (1 cog at 20MHz) and the power torture test, where all 8 cogs are executing from hub and all 64 smart pins are running PWM at a clock frequency of 182MHz:
Here is the download power test. This is how much power will be dissipated when the chip is waiting for a download or actively downloading, where one cog is active. The current silicon dissipates 136mW, while the new silicon will dissipate 75mW:
****************************************
Report : Averaged Power
Design : CHIP
Version: M-2016.12-SP1
Date   : Fri Mar 29 12:07:38 2019
****************************************

                  Internal   Switching  Leakage    Total
Power Group       Power      Power      Power      Power      (     %)  Attrs
--------------------------------------------------------------------------------
clock_network     0.0292     0.0154     3.322e-06  0.0446     (59.10%)  i
register          3.302e-03  1.622e-04  2.591e-05  3.490e-03  ( 4.63%)
combinational     7.393e-03  0.0132     7.511e-05  0.0207     (27.44%)
sequential        3.650e-04  3.531e-06  1.264e-06  3.698e-04  ( 0.49%)
memory            1.819e-03  1.972e-03  1.950e-04  3.986e-03  ( 5.28%)
io_pad            0.0000     2.225e-03  8.824e-05  2.313e-03  ( 3.07%)
black_box         0.0000     0.0000     9.160e-14  9.160e-14  ( 0.00%)

Net Switching Power = 0.0330     (43.72%)
Cell Internal Power = 0.0421     (55.77%)
Cell Leakage Power  = 3.889e-04  ( 0.52%)
                      ---------
Total Power         = 0.0754     (100.00%)
And here's the power torture test. The old silicon dissipated 1.2W, while the new silicon will dissipate 790mW:
****************************************
Report : Averaged Power
Design : CHIP
Version: M-2016.12-SP1
Date   : Thu Mar 28 12:17:02 2019
****************************************

                  Internal   Switching  Leakage    Total
Power Group       Power      Power      Power      Power      (     %)  Attrs
--------------------------------------------------------------------------------
clock_network     0.3084     0.1488     3.312e-06  0.4572     (57.93%)  i
register          8.300e-03  3.107e-03  2.573e-05  0.0114     ( 1.45%)
combinational     0.0361     0.0769     7.545e-05  0.1131     (14.33%)
sequential        4.566e-04  2.212e-05  1.270e-06  4.800e-04  ( 0.06%)
memory            0.1056     1.972e-03  1.950e-04  0.1078     (13.65%)
io_pad            0.0000     0.0992     8.938e-05  0.0993     (12.58%)
black_box         0.0000     0.0000     9.151e-14  9.151e-14  ( 0.00%)

Net Switching Power = 0.3301     (41.82%)
Cell Internal Power = 0.4589     (58.13%)
Cell Leakage Power  = 3.902e-04  ( 0.05%)
                      ---------
Total Power         = 0.7894     (100.00%)
Note that these power figures are for CORE logic and memory, and do not include the 3.3V I/O pads.
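For anyone who wants to poke at the two reports, here is a small sketch that re-adds the "Total Power" column per power group and prints the clock network's share. The numbers are copied by hand from the reports above and are taken to be in watts (the totals match the 75mW and 790mW quoted in the text); the dictionary and function names are made up for this example. Small differences from the reported totals are just rounding in the printed report.

```python
# "Total Power" column of each power group, in watts, copied from the reports.
DOWNLOAD_W = {
    "clock_network": 0.0446,
    "register": 3.490e-03,
    "combinational": 0.0207,
    "sequential": 3.698e-04,
    "memory": 3.986e-03,
    "io_pad": 2.313e-03,
    "black_box": 9.160e-14,
}
TORTURE_W = {
    "clock_network": 0.4572,
    "register": 0.0114,
    "combinational": 0.1131,
    "sequential": 4.800e-04,
    "memory": 0.1078,
    "io_pad": 0.0993,
    "black_box": 9.151e-14,
}


def total_w(groups: dict) -> float:
    """Sum of all power groups, in watts."""
    return sum(groups.values())


def share(groups: dict, name: str) -> float:
    """Fraction of the summed power that one group accounts for."""
    return groups[name] / total_w(groups)


for label, groups, reported in (
    ("download", DOWNLOAD_W, 0.0754),
    ("torture", TORTURE_W, 0.7894),
):
    print(
        f"{label}: summed {total_w(groups):.4f} W, "
        f"reported {reported:.4f} W, "
        f"clock_network share {share(groups, 'clock_network'):.1%}"
    )
```

In both tests the clock network is well over half of the total, which fits with clock gating being where the savings came from.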
This is quite a big improvement, and we have to take into account that the current P2 prototype already gets warm, not hot, while using all its cogs to calculate big prime numbers at 160MHz. I have never measured the power consumption on the P2 eval board while doing this, but that might be in the works.
I'm expecting the final P2 to be very stable and quite overclockable, just by seeing this trend. Probably, one can expect to see a less jittery VGA output, judging by the simulations of this new iteration.
CORDIC commands are also being issued in the power torture test. I forgot about that.
I am more interested to know what the max freq might be for lower specs like commercial temp 0-70C and 5% Vcc.
That will likely need bench tests on real silicon to get a typical figure, but the better numbers simulated here mean it should come in better than the existing P2es, as the die temperature is lower.
Maybe even a 3% or 2% Vcc spec is viable?
No. It's not a typical value.
It's a tools value, and yes, it could be 2.5% Vcc.
It would be a value OnSemi would calculate from the tools. These days you often get multiple specs.
Thanks,
What was wrong that gave 2.1 Watts? Should the numbers be scaled down?
Certainly some surprising absolute differences between the two:
- Cell Leakage Power is almost doubled now. (The transistor count can't have increased that much.)
- Black_box is >10x bigger now! (I have no idea what this is.)
- Io_pad is 4x bigger now. (Different inclusion parameter, I presume.)
All the others have dropped significantly, which is expected due to the clock-gating.
I think that earlier silicon test must have been worst-case power (low Vt & high leakage, low temp, high Vdd). Not sure what the black-box difference is about.
The power test results from the new silicon are at typical case.
Okay, that 2.1W power test result was for a simulation of 20% of the flops toggling on every clock, without any specific test program running. It was some kind of estimation the tools made on their own.
Wendy is looking for the 1.2W report that was a product of running our power torture test code. When and if she finds it, I will post it here.
Wendy reminded me yesterday that timing closure for the new P2 is set from -55C to +150C junction temperature. Our package Tja is ~20C/W. The anticipated power dissipation was ~2.25W. This would result in a ~45C (20×2.25) rise in junction temperature over ambient temperature, which affords us a -55C to +85C packaged temperature range with ~20C (150-45-85) allowance for local hot spots on the die.
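The thermal arithmetic above can be sketched as follows. The constants are the ones quoted in this thread (Tja is taken here to mean the junction-to-ambient thermal resistance, in C/W); the function names are only illustrative, and the second call, which plugs in the 0.7894W typical-case simulated torture figure, is my own comparison and not a spec of any kind.

```python
THETA_JA_C_PER_W = 20.0   # package junction-to-ambient thermal resistance (~20C/W)
TJ_MAX_C = 150.0          # upper junction temperature used for timing closure
AMBIENT_MAX_C = 85.0      # upper packaged (ambient) temperature


def junction_rise_c(theta_ja_c_per_w: float, power_w: float) -> float:
    """Junction temperature rise over ambient, in degrees C."""
    return theta_ja_c_per_w * power_w


def hotspot_margin_c(tj_max_c: float, ambient_max_c: float, rise_c: float) -> float:
    """Headroom left for local hot spots at maximum ambient temperature."""
    return tj_max_c - rise_c - ambient_max_c


# Anticipated dissipation from the post above (~2.25W) and, for comparison
# only, the typical-case simulated torture-test figure (0.7894W).
for power_w in (2.25, 0.7894):
    rise = junction_rise_c(THETA_JA_C_PER_W, power_w)
    margin = hotspot_margin_c(TJ_MAX_C, AMBIENT_MAX_C, rise)
    print(f"{power_w:.4f} W -> rise {rise:.1f} C, hot-spot margin {margin:.1f} C")
```

With 2.25W this reproduces the ~45C rise and ~20C allowance quoted above.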
Chip,
We use the P1 up to 125°C. Do you think the P2 will be able to run in an environment that hot ? Possibly at a reduced clock frequency ?
We make test equipment, so it needs to be spec'd by Parallax that way, or we can't use it. IOW we can't take a chance on "it seems to work" etc.
Thanks,
Bean
If you reduced the clock frequency enough to reduce the junction temperature rise, it would be okay, but I don't know how low-frequency you'd have to go. We will determine these things with ON's help soon.
Comments
LET THE FABRICATION BEGIN!!
Just had to type that. Not sure why. Go, dog, go!
KG
Thanks for the update. Going to be a good summer, I think.
In addition to the P2-ES boards, are there going to be a few chips available from this run to populate our P2D2 boards?
The max. clock frequency is probably limited by something other than heat, right?
Well, aside from temperature, there are process conditions, which are fixed per die, and then there's voltage level.
~30% power reduction for full operation is a really nice achievement - and don't forget you added quite a bit of logic too.
I'm expecting the final P2 to be very stable and quite overclockable, just by seeing this trend. Probably, one can expect to see a less jittery VGA output, judging by the simulations of this new iteration.
Kind regards, Samuel Lourenço
The P2es models as having this Cpd:
Cpd = 1.60nF + 8*83.33pF/COG
If I assume that the boot test was at the typical 23MHz, the new figures give:
Fi = 23MHz, Icc = 41.9mA
Cpd = (Icc*Vcc)/(Vcc*Vcc*Fi) = 1.012nF
Fi = 182MHz, Icc = 438.5mA
Cpd = (Icc*Vcc)/(Vcc*Vcc*Fi) = 1.338nF
The difference divided by 8 gives 40.805pF per (COG + 8 smart pins).
The P2+ then has Cpd ~ 1.012nF + 40.805pF/(COG + 8 smart pins).
At 20kHz RCSLOW, that indicates a Cpd contribution of ~36uA (+ RCSLOW Icc + static leakage).
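The Cpd numbers above can be reproduced with a few lines. This is only a sketch: Vcc = 1.8V is my assumption, inferred from the fact that 41.9mA and 438.5mA at 1.8V match the 75.4mW and 789.4mW report totals, and the 23MHz boot clock is the assumption stated in the post. Function names are illustrative.

```python
VCC_V = 1.8  # ASSUMPTION: core supply, inferred from 41.9mA * 1.8V ~= 75.4mW


def cpd_farads(icc_a: float, vcc_v: float, f_hz: float) -> float:
    """Cpd = (Icc*Vcc)/(Vcc*Vcc*Fi), which simplifies to Icc/(Vcc*Fi)."""
    return (icc_a * vcc_v) / (vcc_v * vcc_v * f_hz)


def dynamic_current_a(cpd_f: float, vcc_v: float, f_hz: float) -> float:
    """Inverse relation: Icc = Cpd * Vcc * Fi."""
    return cpd_f * vcc_v * f_hz


cpd_boot = cpd_farads(41.9e-3, VCC_V, 23e6)     # download test, 23MHz assumed
cpd_full = cpd_farads(438.5e-3, VCC_V, 182e6)   # torture test at 182MHz
per_cog = (cpd_full - cpd_boot) / 8             # per (COG + 8 smart pins)
i_rcslow = dynamic_current_a(cpd_boot, VCC_V, 20e3)  # 20kHz RCSLOW

print(f"Cpd boot    = {cpd_boot * 1e9:.3f} nF")
print(f"Cpd full    = {cpd_full * 1e9:.3f} nF")
print(f"per COG     = {per_cog * 1e12:.3f} pF")
print(f"RCSLOW Icc  = {i_rcslow * 1e6:.1f} uA (Cpd contribution only)")
```

If the download test really ran at 20MHz, as the first post says, rather than 23MHz, the boot Cpd comes out somewhat higher and the per-COG figure correspondingly lower.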
Was there such data for the P2ES chip? It would be nice to compare.
Here is the power torture test from the first silicon:
This shows 2.1W. I thought it was 1.2W.
Were those the same test conditions, i.e. 8 COGs + CORDIC + 64 smart pins running PWM at 182MHz?
A direct comparison is surprising. There are big shifts; it seems either the test conditions changed, or OnSemi have revised their models?
Yeah, that doesn't make sense.
I will ask Wendy tomorrow if she has the report that showed 1.2W for the original silicon.
We are hoping for 100MHz. That would give us a nice round 10nSec measurement resolution.
Bean