Yeah, the Prop2-HOT Cog design was fully pipelined as 1 to 1 (one instruction per clock). That was costly space-wise as well as thermally. It's a bit of a trade-off between having lots of simpler cores or fewer high-performance cores. The Propeller tends toward the former. 1 to 1 could still happen in the future, but it will likely be at a finer process than 180nm.
I could be wrong, but you could say that on the Prop2-COLD the Hub is really running at sys clock rate, i.e. 200 MIPS. It can transact with any and all Cogs on any and all clock cycles. There's just a restriction on which address is available to each Cog at any one interval.
I'm not sure how much you've read about this new Hub. It has been referred to as the blender and other less becoming names because of the address interleaving imposed on the Cogs. It's built like a cross-point switch.
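To make the interleaving idea a bit more concrete, here is a rough Python sketch. It is only a model under assumptions that are mine and not confirmed anywhere in this thread: 16 RAM slices (one per Cog), consecutive longs sitting in consecutive slices, and a simple rotation so that Cog c is connected to slice (c + clock) mod 16. The function names are made up for illustration.

```python
NUM_COGS = 16    # Cog count used in the throughput figure in this thread
NUM_SLICES = 16  # assumption: one hub RAM slice per Cog


def slice_for(cog, clock):
    """Hypothetical rotation: the RAM slice a Cog is connected to on a given clock."""
    return (cog + clock) % NUM_SLICES


def wait_clocks(cog, address, clock):
    """Clocks a Cog waits, starting at `clock`, until the slice holding `address` comes round."""
    target = address % NUM_SLICES  # assumption: consecutive longs live in consecutive slices
    return (target - slice_for(cog, clock)) % NUM_SLICES


def stream_waits(cog, start_address, count, clock=0):
    """Wait before each access when a Cog reads `count` sequential longs."""
    waits = []
    for i in range(count):
        w = wait_clocks(cog, start_address + i, clock)
        waits.append(w)
        clock += w + 1  # the access itself occupies one clock
    return waits


def all_cogs_served(clock):
    """True if every Cog is connected to a different slice on this clock."""
    return len({slice_for(c, clock) for c in range(NUM_COGS)}) == NUM_COGS


def worst_case_wait():
    """Longest wait any single random access can see in this model."""
    return max(wait_clocks(0, a, 0) for a in range(NUM_SLICES))
```

In this model every Cog gets a slice on every clock, a random access waits at most 15 clocks, and a sequential stream only pays the wait once: `stream_waits(3, 5, 4)` gives `[2, 0, 0, 0]`.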
It should have a theoretical peak throughput of 16 Cogs x 200 MHz x 4 bytes = 12.8 GB/s, but of course that's never going to be fully utilised, as it requires all Cogs to be streaming 100% of their respective Hub bandwidth simultaneously.
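For anyone who wants to play with the numbers, the arithmetic can be sketched in a few lines of Python. The assumption here (mine) is that the "x 4" means one 32-bit long, i.e. 4 bytes, per Cog per clock; the names are just for illustration.

```python
COGS = 16
SYS_CLOCK_HZ = 200_000_000   # 200 MHz sys clock
BYTES_PER_TRANSFER = 4       # assumption: one 32-bit long per Cog per clock


def peak_hub_bandwidth(cogs=COGS, clock_hz=SYS_CLOCK_HZ,
                       bytes_per_transfer=BYTES_PER_TRANSFER):
    """Theoretical peak in bytes/s, with every Cog streaming on every clock."""
    return cogs * clock_hz * bytes_per_transfer


if __name__ == "__main__":
    total = peak_hub_bandwidth()
    per_cog = peak_hub_bandwidth(cogs=1)
    print(f"all Cogs: {total / 1e9} GB/s")
    print(f"one Cog:  {per_cog / 1e6} MB/s")
```

So a single Cog streaming flat out would top out at 800 MB/s of the 12.8 GB/s total in this picture.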
I don't think burst streaming for an individual Cog is DMA'd to/from the Cog. It will require reading/writing an exchange buffer of some sort. The details never quite became clear during the HOT development, to me at least. Then again, these ideas may only be history now.
HubExec will automatically burst the instruction fetches.
Comments
I could be wrong, but you could say that on the Prop2-COLD the Hub is really running at sys clock rate, i.e. 200 MIPS. It can transact with any and all Cogs on any and all clock cycles. There's just a restriction on which address is available to each Cog at any one interval.
Going by the latest specs that I know of (Sept-2014), the Cogs at least are at 0.5 MIPS/MHz (100 MIPS at 200 MHz), so I assume the Hub is, as you say, at 200 MIPS.
See here for a diagram of the new hub design
Many thanks, great diagram.