pipelined COGs, FPU COG & RMAC COG
steph_tsf
Posts: 1
Is it feasible to design 90 nanometer silicon operating at 80 MHz featuring 8 pipelined COGs delivering 80 MIPS each, with the exact same instruction set as the original COGs?
Is it feasible to add one FPU COG operating at 80 MHz, inspired by the x86 SSE2 instruction set but restricted to 2 parallel channels instead of 4?
Is it feasible to add one RMAC COG operating at 80 MHz, implementing a dual-channel repetitive multiply-accumulate instruction that operates on 32-bit fixed-point numbers and outputs a 64-bit fixed-point number? This is for FIR or IIR filtering of two channels using the same FIR or IIR coefficients.
Is it feasible to add hardware support for 8 I2S ports, each assignable to one of the COGs?
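For reference, the dual-channel repetitive multiply-accumulate asked about above can be sketched in software. This is only a behavioural model under assumptions that are not in the post: samples and coefficients are signed 32-bit integers, the accumulators wrap at 64 bits (no saturation), and the names `rmac_dual` and `to_signed64` are made up for illustration.

```python
MASK64 = (1 << 64) - 1

def to_signed64(x):
    """Wrap an arbitrary Python int to a signed 64-bit value (assumed wrap, not saturate)."""
    x &= MASK64
    return x - (1 << 64) if x & (1 << 63) else x

def rmac_dual(coeffs, left, right):
    """Multiply-accumulate two channels against one shared coefficient list.

    coeffs, left, right: equal-length sequences of signed 32-bit integers.
    Returns one 64-bit accumulator per channel.
    """
    acc_l = 0
    acc_r = 0
    for c, l, r in zip(coeffs, left, right):
        # Each step is one 32x32 -> 64-bit product per channel,
        # both channels sharing the same coefficient c.
        acc_l = to_signed64(acc_l + c * l)
        acc_r = to_signed64(acc_r + c * r)
    return acc_l, acc_r

if __name__ == "__main__":
    print(rmac_dual([1, 2, 3], [4, 5, 6], [-1, -2, -3]))
```

For an FIR filter, `left` and `right` would be the most recent samples of each channel, and the call would be repeated once per output sample.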
Comments
Some of those features you get using for instance the *cough* xmos chip *cough*.
For the FPU part I'd ask: what do you want it for? Which operations, and how fast?
You can implement it yourself using a high-end FPGA like a Stratix or Virtex part...
Or... you may also consider one of the many ARM, DSP or PowerPC processors already available that run at 1 GHz. Those, while single-threaded (there are dual-core and larger versions), may deliver what you want out of the box.
If you have the money, go from an FPGA to an ASIC; both Altera and Xilinx provide such a path.
There are a couple of forum members who implemented one COG in an FPGA. I do not know how "expensive" they are in terms of slices/LUTs and so on, and for DSP functions the FPGAs provide 18x18-bit multipliers and ready-to-use DSP cores.
Have a lot of fun!
@ David: I have written a processor in Verilog that executes a subset of Prop ASM, so I can use the PropTool to write small ASM programs for it, latest info here: http://forums.parallax.com/showthread.php?t=122127. The .PPT contains information on the subset of instructions.
My goal was to have a simple processor inside a DE1 development board to access the board resources (I don't want to use the Altera NIOS processor), not to make a complete COG in Verilog. Code is (partly) available on request, but I have several versions floating around for various experiments with the DE1 board (translation: I have a version control problem). A project involves not just the processor core, but also the processor interface with the board resources and memory, the ASM programs, and the Propeller code to download the ASM into the soft COG. And my ideas on the perfect architecture still change each time I pick up a pencil...
I wrote it in Verilog, which I had to learn in order to do it.
Nico Hattink
I was also able to check the ccodes in the 1st clock.
I found most of the basics of Verilog easy.
I'm not really up for this, but it seems there have been a number of Verilog attacks on the COG problem already.
How about going with VHDL for a change?
Then the resulting COG(s) would be runnable on Windows or Linux when compiled with the GHDL simulator.
http://ghdl.free.fr/
Perhaps one day I'll splash out on that Altera Cyclone III kit I can get locally and get to grips with VHDL myself, but that's a long road.
I did try to learn VHDL some while ago... and somehow it didn't work out; I found Verilog easier to follow.
Your suggestion of a compiler makes things more appealing, though... I'll consider it.
I'll target a Spartan 3 with 200 kgates (built on the pPropFPGA board I did some months ago).
A few years back I decided to have look into FPGAs and started to investigate VHDL. Here is what I found:
1) My first non-trivial VHDL was to be a digital implementation of the analogue 4th order state variable active crossover filter from the Rane Corporation http://www.rane.com/pdf/linriley.pdf
2) VHDL turns out to be quite verbose, like ADA, but I kind of liked that.
3) VHDL is a much bigger language than what can actually be synthesized on an FPGA using Altera/Xilinx tools.
4) Result was that my crossover simulated very well under GHDL on my Linux box. I could throw wav files at it and capture wav file output using a test harness written in C, easy.
5) But this thing could not be synthesized because it used too many multipliers for a reasonably priced FPGA. They would have to be multiplexed somehow.
6) Item 3 above is a point that most books/tutorials on VHDL skipped over. They would happily teach you all corners of the language with no regard to what could really be synthesized. Had I been set up with an FPGA board and, say, Altera tools, perhaps I would have learned to distinguish these things soon enough.
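The "multiplexed somehow" in item 5 usually means reusing one hardware multiplier over several clock cycles instead of instantiating one multiplier per coefficient. A rough sketch of the budget follows; the 50 MHz clock and 48 kHz sample rate are assumptions for illustration only and are not taken from the crossover design, and both function names are hypothetical.

```python
def clocks_per_sample(clock_hz, sample_rate_hz):
    """How many clock cycles are available between two audio samples."""
    return clock_hz // sample_rate_hz

def products_time_multiplexed(coeffs, values):
    """Sum of products computed with a single shared multiplier.

    Returns (result, cycles), where cycles counts how many times the
    one multiplier was used, i.e. one product per clock cycle.
    """
    acc = 0
    cycles = 0
    for c, x in zip(coeffs, values):
        acc += c * x      # the only multiplier in the design is used here
        cycles += 1
    return acc, cycles

if __name__ == "__main__":
    budget = clocks_per_sample(50_000_000, 48_000)
    result, used = products_time_multiplexed([1, 2, 3], [4, 5, 6])
    print(budget, result, used)
```

As long as the number of products per sample stays below the cycle budget, one multiplier is enough, at the cost of extra control logic to sequence the operands.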
That crossover is still waiting to be reworked into a real FPGA with some decent audio I/O. In the meantime a friend of mine implemented the thing in C++, in the style of the VHDL code, and uses it in his PC sound studio system to this day.
I have no VHDL experience, so I cannot compare. Verilog is good for me; the Terasic boards I am using come with (badly commented) source-level reference designs in Verilog. The Quartus design tools give good information on resource usage (logic cells, memory, multipliers) while inserting high-level building blocks in a design. Verilog also has constructs that cannot be synthesized, but are used for simulation only.
Why are Functions not working ???
Using an Altera Quartus megafunction involves: invoke the wizard, select the megafunction, give this particular instance a name, fill in the parameters, finish, copy the named megafunction calling sequence into your Verilog module, and fill in the real-life signals.
The attached copy shows the shifter function and the Verilog code that uses it. Yes, a barrel shifter accounts for several hundred LEs of the 1000+ that my CPU requires. Remember, the implementation is probably 32 multiplexers, each 32 bits to 1 bit.
I meant functions defined in Verilog. I use Icarus Verilog as the simulator, and somehow a function I wrote does not seem to work... some error in my code, for sure.
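To make the "32 multiplexers, each 32 bits to 1 bit" picture concrete, here is a small software model. It is a sketch under assumptions, not the attached Verilog or the Altera megafunction: it models a logical left shift only, and `mux32` and `barrel_shift_left` are illustrative names.

```python
WIDTH = 32

def mux32(inputs, select):
    """32-to-1 multiplexer: pick one of 32 one-bit inputs."""
    return inputs[select]

def barrel_shift_left(value, amount):
    """Logical left shift of a 32-bit value built from 32 separate muxes.

    amount must be in the range 0..31.
    """
    bits = [(value >> i) & 1 for i in range(WIDTH)]
    out = 0
    for i in range(WIDTH):
        # Output bit i has its own mux. Input k of that mux is wired to
        # source bit i - k, or to constant 0 when that falls off the low end.
        mux_inputs = [bits[i - k] if i - k >= 0 else 0 for k in range(WIDTH)]
        out |= mux32(mux_inputs, amount) << i
    return out

if __name__ == "__main__":
    print(hex(barrel_shift_left(0x80000001, 1)))
```

All 32 muxes share the same 5-bit select, which is why the shift completes in a single step and also why it costs so many logic elements.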
That would be because it is based on ADA!
I guess I was just lucky to have that explained to me in an actual VHDL course in college. It helps to also take a course in VLSI around the same time--you get a much better idea of what actually can be done--and what one should bother to attempt to do. (You should also spend some time designing your own cells--you'll get a better understanding of what Chip and Beau are going through right now.)
Granted, at this point I'm not sure what good it did me in the long term to focus on that stuff in college as the chance I'll ever do design work is rapidly diminishing.