OnSemi has been doing some prep work and their initial power estimation for the core came back at 4-6W. This is without clock-gating, so when they get their synthesis tools to properly infer clock-gating, power will drop by probably 1/3. This is still huge, and doesn't account for memory and I/O power, which could be significant. My gut feeling is that we can get the power down to 5W and not need cooling by using a 17x17mm BGA package.
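As a rough check on those numbers, here is a minimal sketch of the arithmetic. It assumes "drop by 1/3" means a one-third reduction (not a drop to one third), and it treats 5W as a whole-chip budget. Both are my reading, and the function and variable names are made up for illustration. Memory and I/O power are not estimated because there are no figures for them yet.

```python
# Back-of-envelope core power estimate. Assumption: clock-gating
# removes about one third of the ungated core power.

def gated_power_range(low_w, high_w, reduction=1/3):
    """Scale an ungated power range (watts) by (1 - reduction)."""
    factor = 1.0 - reduction
    return low_w * factor, high_w * factor

# Ungated core estimate from the prep work: 4-6W.
low, high = gated_power_range(4.0, 6.0)
print(f"core with clock-gating: {low:.2f}-{high:.2f} W")

# Assumption: 5W is the target for the whole chip, so whatever the
# core does not use is what remains for memory and I/O.
target_w = 5.0
print(f"left for memory and I/O: {target_w - high:.2f}-{target_w - low:.2f} W")
```

Under those assumptions the core lands at roughly 2.7-4W, which leaves somewhere between 1W and a bit over 2W for everything else.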
This all seems outrageous, I know. I think we are being very ambitious for a 180nm process. This level of design complexity really needs 90nm or less to be practical. 40nm would be great - low power and GHz+ speed, but we don't have the budget for that.
I don't know what level of enthusiasm there might be for a 5W BGA Prop2 that costs $12. I think the 5W is a bigger problem than the $12.
What we can do at this point:
A) Continue on the current trajectory.
B) Pare down the design, jettisoning all kinds of features, like hub exec, and get back to something much smaller and maybe faster. The current FPGA implementation could be further honed and later, hopefully, made into a chip using a smaller technology.
C) Drop the current design to four cogs, which would also reduce cache sizes and hub memory down to 128KB. This would also allow us to shrink the die considerably, as we could change the I/O pin aspect ratio so that they fit together more densely and occupy some of the area the core previously needed. This would also mean the whole chip would fit on an FPGA.
D) Retire to an opium den.
E) Other ideas?