This thread is about the new chip we are going to build in the 180nm process.
About code compatibility with Prop1:
Is this absolutely mandatory? I ask because, having learned a lot of things making the Prop2, we can augment the Prop1-type cogs in a few simple ways to quadruple hub memory bandwidth, which affords way better video and LMM operation. Improved video, alone, is going to break code compatibility. Also, we'll need to support DAC outputs and access pipelined math circuits. I don't see how we can have 100% code compatibility without walking on egg shells.
We can easily make an improved cog that is going to be very familiar looking, though.
The big picture:
I've been hammering out a new, minimalist design that should be along the lines of a Prop1 in cog complexity, with a few things taken away and a few things added.
First, hub memory will be comprised of 16 instances of 32768 x 8 RAM for 512KB. This is going to yield a hub data path of 128 bits. Only the RAMs that are needed on a given cycle will be activated, saving power.
Though the cog memory map is still 512x32, cog RAM will be physically organized as 128 x 128, so we can read or write four contiguous registers with RDQUAD/WRQUAD instructions. This is way better than what we had on the Prop2, because, rather than just affecting mappable overlay registers, these transfers are into and out of the actual cog registers, themselves. These 128 bit paths don't take too much mux'ing and they keep the power down to reasonable levels. Interfaces to any peripherals can take advantage of them, too. This also gives cogs running at 200MHz (100 MIPS) a hub memory bandwidth of 200MB/s, which is enough to do any kind of VGA that we have the internal hub memory to support, at any color depth (up to 24bpp) - without any hub slot reallocation scheme needed to favor particular cogs. LMM greatly benefits from this, too.
VGA is going to simply be a use case of a shifter that drives four DAC outputs to a set of fixed pins attached to each cog. The four channels from highest to lowest can be used as: R, G, B, HSYNC. In some modes, the shifter handles data in ways that is obviously for video, but, otherwise, it's a generic circuit that can simultaneously update four DACs with unique 8-bit data. You can also write the DACs directly in software, with 8 bits of dither, to realize something like 16-bit DACs.
What complicated the heck out of the P2 video was all the accommodation to support fancy color-space conversion for TV's. I plan to get rid of all that, as it's very costly, being full of staged multipliers and CORDIC rotators. Every flat screen TV I've seen has a VGA connector, and it is tidier than component connections, anyway. You can still drive a TV using the DAC shifter to make a 1-wire composite signal, but color modulation is no longer part of the package. This is a little sad, in that a one-wire color signal was nice, but you wouldn't want to have to read small text on a TV, anyway, so it was kind of a novelty.
There are some Prop1 instructions that I've never used, like CMPSX. Maybe we could cull a few of those for other things. Any ideas on getting rid of any of those instructions?
Do we really need to support words any more, with RDWORD/WRWORD? Bytes, I think, are always needed, but I don't think I've ever used words for anything before. Convention says we need them, but do we, really?
Here is the pin-out, as posted earlier in another thread:
Attachment not found.
This is going to take several weeks, probably, to develop. I'll post FPGA images for the DE0-Nano and DE2-115 boards as soon as we have something working.