Comments
Tried again and top speed increased from 320 to 340 MHz @ Vdd = 2.0 V.
Seems there's a bit of variability. Board might be heating up or something...
Wow, just got that old board to 370 MHz by using the variable regulator on the first new board to generate 2.0 V for Vdd
Second new board just hit 375 MHz with Vdd = 2.0V. Was only doing 335 MHz before...
Think now know the first change coming to this board!
Thanks @Tubular
Now thinking the startup failure isn't thermal at all really, but a tug on Vdd. Can that be? Guess I could scope Vdd and take a look...
Actually checking VGA output now with Vdd boost to 2.0 V...
2nd New board at 375 MHz starts to show some flicker in palette after about 1 minute.
With fan on it though, been going for 5 minutes now with no issue.
Glad added those (2 of 4) fan mounting holes.
Have to remember to come up with a header so can plug fan power into the board...
1 hour 10 minutes now. Probably at thermal equilibrium by now. It's not hot under chip, just above room temperature though. QED, Stopping test...
That's those hubRAM read errors. Video output is pretty good for testing this. The driver resides in cogRAM so doesn't crash, while the display shows the read errors as glitches.
There is definitely a thermal improvement with the new boards, even with the vias surrounding the core.
Having to crank up heat gun to solder components was first indication.
Doing the ground polygon pour on all the layers, copied from Eval, is probably the main reason for that.
I should remember to always do that. Was not smart not to...
Also, probably less chemical waste this way as only have to remove a bit of the copper...
I'm not sure what startup failure you're referring to but COGINIT for example is a burst of hubRAM reads. So if the clock frequency is high enough then a read error loading the cog is a possibility.
My approach has been to startup at a stable clock rate then explicitly set the frequency from within my test code - https://forums.parallax.com/discussion/comment/1563109/#Comment_1563109
Hmm, the datasheet says absolute max is 2.2 V and DC max is 1.9 V.
Have to research this a bit. 1.9 V probably also gives an improvement. Have to see if it's enough.
The 2V test is interesting - surely that's just moving a "glitch" threshold slightly higher?
Does that suggest the 1.8V power supply is not able to deliver the current bursts?
Have you tried measuring the ripple on VDD whilst VGA is running? Check for the minimum voltage over a couple of minutes whilst your VGA code is running and misbehaving.
Also, as you are using the old board to generate 2 V, have you also tried leaving the old board at 1.8 V and feeding that into the new board?
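To make the minimum-voltage check concrete: if the scope capture of VDD can be exported as a list of samples, the numbers of interest are the minimum and the peak-to-peak ripple. A minimal Python sketch, assuming samples in millivolts; the function name and the sample values are made up for illustration and are not measurements from this thread.

```python
def vdd_stats(samples_mv):
    """Summarise a VDD capture: minimum, maximum and peak-to-peak ripple (mV)."""
    if not samples_mv:
        raise ValueError("need at least one sample")
    lo = min(samples_mv)
    hi = max(samples_mv)
    return {"min_mv": lo, "max_mv": hi, "ripple_mv": hi - lo}


# Made-up capture, NOT real data: a nominal 1.8 V rail with one dip.
capture_mv = [1802, 1795, 1810, 1748, 1690, 1760, 1801]
print(vdd_stats(capture_mv))
```

The minimum is the figure to watch here, since a brief dip can cause read errors even when the average looks fine.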
That’s an interesting idea @VonSzarvas
…
I'll check with double 1.8 V.
Should also test with the onboard supply disabled to make sure I'm not already double dipping...
VGA Stable at 350 MHz with no fan (still 2nd new board at 2.0V Vdd)
Double 1.8 V Vdd ok at 340 MHz, dies at 345 MHz. So, not a real improvement. 5 MHz maybe...
With the core voltage set to 2.3 V, can have all cogs running at 380 MHz, stable with fan on.
One funny thing though is that the core voltage is being loaded down to 2.0 V.
Have to check the datasheet to see how this is possible.
Maybe this drag down is the real issue...
Looks like it is possible if the core current is approaching 500 mA
If you run my tester, it won't load the regulator. You'll be able to compare loaded vs unloaded at various sysclock frequencies.
@evanh do you have a core current vs freq chart?
Also, where is the tester?
I relinked it nine posts back as a reminder then - https://forums.parallax.com/discussion/comment/1563251/#Comment_1563251
Here's current vs frequency from my simple repeat loop testing (not evanh's useful boundary hunter)
Ok, wow, guess didn’t consider crazy current draw at high freq.
Copied a lot from Eval, but not the beefy power supply…
What's your max current rating @Rayman ?
500 mA
I never did a current chart because it's so dependent on the program. I did do some tests targeting max power a long time back but never plotted anything. It was just a "what can cook the hottest" type of question. I exceeded 5 Watts at 1.8 Volts ... so up to 3 Amps.
I was wondering whether we could make it go to 5 watts, paying homage to the "we're looking at 5 watts in a bga" thread. Sounds like you've already made it there evanh
Somewhat different conditions though. I had to go to 360 MHz to break 5 Watts. Chip was talking about 160 MHz back then I think.
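For reference, the "up to 3 Amps" figure is just P = V × I rearranged. A minimal Python sketch of that arithmetic; the function name is illustrative only, and the 0.5 A value is the regulator rating quoted earlier in the thread.

```python
def core_current_amps(power_watts: float, vdd_volts: float) -> float:
    """Current drawn from the core rail for a given power: I = P / V."""
    if vdd_volts <= 0:
        raise ValueError("vdd_volts must be positive")
    return power_watts / vdd_volts


amps = core_current_amps(5.0, 1.8)
print(f"{amps:.2f} A")  # 2.78 A

# Compare against a 500 mA regulator rating.
REGULATOR_LIMIT_A = 0.5
print("over limit" if amps > REGULATOR_LIMIT_A else "within limit")  # over limit
```

So 5 W at 1.8 V is roughly 2.8 A, several times what a 500 mA regulator can supply.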
Ok, seems need to upgrade regulators to 2 A from 500 mA, like Eval/Edge to do the crazy overclocking.
Was definitely pushing the 500 mA past the limit...
I like the AP3445L from Eval. 2 A and adjustable. Will use for both 3.65 and 1.8 V.
It must be that there's a range in VIO current draws between chips, would explain why some were doing better than others on my boards...
Or, a range on how low VIO can go without trouble...
@evanh What does the "unstable" output tell me? Max freq with one cog going?
Memory moved to top layer appears slightly worse than before. Testing with Chip's driver shows a top speed ~320 MHz. Was ~330 MHz with Simple boards.
MegaYume runs fine.
NeoYume seems to run Metal Slug fine, but the big one (Crossed Swords?) starts OK but crashes after about a minute.
(USB controller doesn't work at all here. Strangely works on some Eval headers but not others. Buying a new controller to hopefully fix this issue...)
Might try removing the ground fill on the inner layers around the memory to maybe reduce capacitance to ground a hair...
It's the bottom frequency where hubRAM read errors begin.
It's more of a temperature vs frequency logging program. The code deliberately does very little work, so that the die temperature closely matches the heatsink temperature. Lowering the board temperature will increase that bottom/threshold frequency.
It's probably a little excessive in one respect - It checks the entirety of hubRAM on each pass. Nice for thoroughness but also unnecessarily spikes power draw for longer than needed.
If you want to measure it with more power draw then just remove the loop delay in my code. It should shift to a lower frequency for a given cooling setup. Other activities can also be added to the other cogs but you're limited to cogexec - They'll crash if using hubexec.
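The idea of a "bottom frequency where read errors begin" can be sketched as a simple upward sweep. This is not the actual tester code, just a Python illustration under stated assumptions: `check_passes` stands in for a full hubRAM verify at a given sysclock, and `make_fake_board` is a made-up model in which reads are clean below some limit that drops as the board warms up.

```python
from typing import Callable, Optional


def find_error_threshold(check_passes: Callable[[int], bool],
                         start_mhz: int, stop_mhz: int,
                         step_mhz: int = 5) -> Optional[int]:
    """Return the lowest tested frequency (MHz) where the check fails, or None."""
    for freq in range(start_mhz, stop_mhz + 1, step_mhz):
        if not check_passes(freq):
            return freq
    return None


def make_fake_board(limit_mhz: float) -> Callable[[int], bool]:
    # Hypothetical stand-in for real hardware: reads are clean below limit_mhz.
    return lambda freq: freq < limit_mhz


cool = find_error_threshold(make_fake_board(352.0), 300, 400)
warm = find_error_threshold(make_fake_board(338.0), 300, 400)
print(cool, warm)  # 355 340
```

In this toy model the warmer board reports a lower threshold, which matches the behaviour described: more heat (or more power draw) shifts the unstable point down in frequency.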
Is this with the new 1.4-RC1 version?