pipelined COGs, FPU COG & RMAC COG — Parallax Forums

pipelined COGs, FPU COG & RMAC COG

steph_tsf Posts: 1
edited 2010-09-20 11:48 in Propeller 1
Is it feasible to design 90 nanometer silicon operating at 80 MHz featuring 8 pipelined COGs delivering 80 MIPS each, with the exact same instruction set as the original COGs?
Is it feasible to add one FPU COG operating at 80 MHz, inspired by the x86 SSE2 instruction set but restricted to 2 parallel channels instead of 4?
Is it feasible to add one RMAC COG operating at 80 MHz that implements a dual-channel repetitive multiply-accumulate instruction, operating on 32-bit fixed-point numbers and outputting a 64-bit fixed-point number? This is for FIR or IIR filtering of two channels using the same IIR or FIR coefficients.
Is it feasible to add hardware support for 8 I2S ports assignable to one of the COGs?
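
For reference, the arithmetic the RMAC question describes can be sketched in a few lines of Python. This only illustrates the operation and is not a proposed hardware design: the names `rmac_dual` and `to_signed64`, the Q1.31 coefficient scaling in the example, and the wraparound of the accumulators at 64 bits are all assumptions.

```python
def to_signed64(x):
    """Wrap a Python int to a signed 64-bit value (two's complement)."""
    x &= (1 << 64) - 1
    return x - (1 << 64) if x & (1 << 63) else x


def rmac_dual(coeffs, left, right):
    """Multiply-accumulate two channels against one shared coefficient set.

    coeffs, left, right: equal-length sequences of signed 32-bit integers
    (fixed-point; where the binary point sits is up to the caller).
    Returns two signed 64-bit accumulators, one per channel.
    """
    acc_l = 0
    acc_r = 0
    for c, xl, xr in zip(coeffs, left, right):
        acc_l = to_signed64(acc_l + c * xl)
        acc_r = to_signed64(acc_r + c * xr)
    return acc_l, acc_r


# One output sample of a 4-tap moving-average FIR.
# Coefficients are 0.25 in Q1.31, i.e. 2**29 (assumed format).
coeffs = [1 << 29] * 4
left = [100, 200, 300, 400]
right = [-100, -200, -300, -400]
acc_l, acc_r = rmac_dual(coeffs, left, right)
print(acc_l >> 31, acc_r >> 31)  # scale the Q1.31 products back to samples
```

Both channels walk the same coefficient list, so one coefficient fetch serves two multiplies per step, which is the saving the question is after.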

Comments

  • Ale Posts: 2,363
    edited 2010-09-09 06:34
    It may be possible, why not... The question is what you want it for (because something that does it may already exist...)?

    Some of those features you get using, for instance, the *cough* XMOS chip *cough*.
    For the FPU part I'd say... what do you want it for? Which operations, and how fast?

    You can implement it yourself using a high-end FPGA like a Stratix or Virtex part...
    Or... you may also consider one of the many ARM, DSPs or PowerPC processors already available that run at 1GHz. Those, while single-threaded (there are dual+ core versions) may deliver what you want out-of-the-box.

    If you have the money, go from an FPGA to an ASIC :), both Altera and Xilinx provide such a path :)

    There are a couple of forum members who implemented one COG in an FPGA. I do not know how "expensive" they are in terms of slices/LUTs and so on. For DSP functions, FPGAs provide 18x18-bit multipliers and ready-to-use DSP cores.

    Have a lot of fun!
  • David Betz Posts: 14,516
    edited 2010-09-09 07:46
    Ale wrote: »
    There are a couple of forum members who implemented one COG in an FPGA. I do not know how "expensive" they are in terms of slices/LUTs and so on. For DSP functions, FPGAs provide 18x18-bit multipliers and ready-to-use DSP cores.
    That sounds interesting! Is the Verilog/VHDL code for the single COG FPGA available?
  • Ale Posts: 2,363
    edited 2010-09-10 00:07
    I don't think they made their code available... but they said it was partially working. I'm tempted to try this myself but I do not know Verilog that well :(
  • nutson Posts: 242
    edited 2010-09-10 02:10
    @ Steph: the thing you want exists, it is called a DSP. Go to the Texas Instruments product page and you will find a wide choice of configurations and speeds. They even have a very powerful 6-core version TMS320C674x

    @ David: I have written a processor in Verilog that executes a subset of Prop ASM, so I can use the PropTool to write small ASM programs for it, latest info here: http://forums.parallax.com/showthread.php?t=122127. The .PPT contains information on the subset of instructions.

    My goal was to have a simple processor inside a DE1 development board to access the board resources (I don't want to use the Altera NIOS processor), not to make a complete COG in Verilog. Code is (partly) available on request, but I have several versions floating around for various experiments with the DE1 board (translation: I have a version control problem). A project involves not just the processor core, but also the processor interface with the board resources and memory, the ASM programs, and the Propeller code to download the ASM into the soft COG. And my ideas on the perfect architecture still change each time I pick up a pencil...
  • David Betz Posts: 14,516
    edited 2010-09-10 04:44
    nutson wrote: »
    @ David: I have written a processor in Verilog that executes a subset of Prop ASM, so I can use the PropTool to write small ASM programs for it, latest info here: http://forums.parallax.com/showthread.php?t=122127. The .PPT contains information on the subset of instructions.
    Thanks! I tried to download the PPT file but wasn't able to open it on my Macintosh under PowerPoint 2008. I removed the ".txt" extension, leaving just the ".ppt" extension, but that didn't help.
  • Leon Posts: 7,620
    edited 2010-09-10 05:02
    I couldn't view it with Open Office (Windows).
  • Cluso99 Posts: 18,069
    edited 2010-09-10 05:41
    I have implemented most of the instruction set and it runs in 3 clocks, but nothing is properly tested. The code has not been released.

    I wrote it in Verilog, which I had to learn in order to do it.
  • nutson Posts: 242
    edited 2010-09-10 05:59
    Must be the forum changes. I have re-attached the document; it is a .PPT file of 396 KByte made with Office 2003.

    Nico Hattink
  • Leon Posts: 7,620
    edited 2010-09-10 07:32
    That's OK with Open Office (remove the .txt).
  • Ale Posts: 2,363
    edited 2010-09-11 04:38
    I decided to start my own effort. For the time being I need to learn Verilog (better), so I decided to implement few(er) instructions: add, sub, cmp and jmp :). I have to start somewhere. I'm targeting 4 clocks per instruction, let's see :)
  • Cluso99 Posts: 18,069
    edited 2010-09-11 05:03
    Ale: I was using the Spartan 3A 400 (the Avnet board). The RAM supports dual-port access, so I had the I & R in the first clock, fetched the D & S in the second clock, and executed in the 3rd clock.
    I was also able to check the condition codes in the 1st clock.

    I found most of the basics of Verilog easy.
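
A rough Python model of that 3-clock sequence, as one possible reading of the description above and not the unreleased Verilog. It assumes "I & R" means instruction fetch plus write-back of the previous result, and it assumes the Propeller 1 field layout and the ADD/SUB/MOV opcode values shown in the comments. Only the "always" condition is modelled, and the names `encode`, `step` and `run` are made up for the sketch.

```python
# Assumed Propeller 1 instruction layout:
# opcode[31:26] Z[25] C[24] R[23] I[22] cond[21:18] dest[17:9] src[8:0]
OP_ADD, OP_SUB, OP_MOV = 0b100000, 0b100001, 0b101000
COND_ALWAYS = 0b1111
MASK32 = 0xFFFFFFFF


def encode(opcode, dest, src, immediate=True, cond=COND_ALWAYS):
    """Build an instruction word with the R (write result) bit set."""
    zcri = 0b0010 | (1 if immediate else 0)
    return (opcode << 26) | (zcri << 22) | (cond << 18) | (dest << 9) | src


def step(ram, pc, pending):
    """Execute one instruction as three clocks; return (next_pc, pending_write)."""
    # Clock 1: one RAM port fetches I, the other writes the previous result R.
    instr = ram[pc]
    if pending is not None:
        ram[pending[0]] = pending[1]
    # The condition field can be checked as soon as I has been fetched.
    # Only "always" is modelled; every other condition is treated as false.
    run_it = ((instr >> 18) & 0xF) == COND_ALWAYS

    # Clock 2: the two RAM ports fetch D and S.
    d_addr = (instr >> 9) & 0x1FF
    s_field = instr & 0x1FF
    d = ram[d_addr]
    s = s_field if (instr >> 22) & 1 else ram[s_field]

    # Clock 3: execute; the result is written back during the next clock 1.
    opcode = instr >> 26
    if opcode == OP_ADD:
        result = (d + s) & MASK32
    elif opcode == OP_SUB:
        result = (d - s) & MASK32
    elif opcode == OP_MOV:
        result = s
    else:
        raise ValueError(f"opcode {opcode:06b} not modelled")
    write_result = run_it and (instr >> 23) & 1
    return pc + 1, ((d_addr, result) if write_result else None)


def run(ram, count):
    """Run `count` instructions from address 0; return the clocks used."""
    pc, pending = 0, None
    for _ in range(count):
        pc, pending = step(ram, pc, pending)
    if pending is not None:  # flush the last result
        ram[pending[0]] = pending[1]
    return 3 * count


ram = [0] * 512
ram[0] = encode(OP_MOV, 10, 5)  # mov 10, #5
ram[1] = encode(OP_ADD, 10, 7)  # add 10, #7
ram[2] = encode(OP_SUB, 10, 2)  # sub 10, #2
print(ram[10], run(ram, 3))
```

Because the write-back of instruction N lands in clock 1 of instruction N+1, before D and S are read in clock 2, a data dependency between neighbouring instructions needs no special handling in this model.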
  • Heater. Posts: 21,230
    edited 2010-09-11 05:15
    Ale,

    I'm not really up for this but it seems there have been a number of Verilog attacks on the COG problem already.

    How about going with VHDL for a change?

    Then the resulting COG(s) would be runnable on Windows or Linux when compiled with the GHDL simulator.

    http://ghdl.free.fr/

    Perhaps one day I'll splash out on that Altera Cyclone III kit I can get locally and get to grips with VHDL myself, but that's a long road.
  • Ale Posts: 2,363
    edited 2010-09-11 05:25
    heater:

    I did try to learn VHDL a while ago... and somehow it didn't work out; I found Verilog easier to follow.
    Your suggestion of a compiler makes things more appealing, though... I'll consider it.

    I'll target a Spartan 3 200 kgates (built on the pPropFPGA board I did some months ago).
  • Heater. Posts: 21,230
    edited 2010-09-11 06:08
    Ale,

    A few years back I decided to have a look into FPGAs and started to investigate VHDL. Here is what I found:

    1) My first non-trivial VHDL was to be a digital implementation of the analogue 4th order state variable active crossover filter from the Rane Corporation http://www.rane.com/pdf/linriley.pdf

    2) VHDL turns out to be quite verbose, like ADA, but I kind of liked that.

    3) VHDL is a much bigger language than what can actually be synthesized on an FPGA using Altera/Xilinx tools.

    4) Result was that my crossover simulated very well under GHDL on my Linux box. I could throw wav files at it and capture wav file output using a test harness written in C, easy.

    5) But this thing could not be synthesized because it used too many multipliers for a reasonably priced FPGA. They would have to be multiplexed somehow.

    6) Item 3 above is a point that most books/tutorials on VHDL skipped over. They would happily teach you all corners of the language with no regard to what could really be synthesized. Had I been set up with an FPGA board and, say, Altera tools, perhaps I would have learned to distinguish these things soon enough.

    That crossover is still waiting to be reworked into a real FPGA with some decent audio I/O. In the meantime a friend of mine implemented the thing in C++, in the style of the VHDL code, and uses it in his PC sound studio system to this day.
  • nutson Posts: 242
    edited 2010-09-11 14:05
    Using on-chip dual-port memory, 3 clocks / instruction is the easiest, as there are no pipeline complications. DJNZ is 3 clocks in my design, jump taken or not. Early in the process I made a version that used 4 clocks / instruction but was pipelined, effectively executing one instruction / 2 clocks. However, the pipelining complicates things quite a lot, as instruction N+1 and its operands are fetched while the result of instruction N is not known yet. This was too complicated for me.
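
That hazard can be shown with a small Python model. It is a generic illustration, not the Verilog design discussed here: the two-operand add-only "instructions", the register names, and the `forward` option (result forwarding, one common way of dealing with the problem) are all assumptions made for the sketch.

```python
def run_sequential(program, regs):
    """Reference behaviour: each instruction sees the previous result."""
    regs = dict(regs)
    for dest, src in program:
        regs[dest] = regs[dest] + regs[src]
    return regs


def run_overlapped(program, regs, forward=False):
    """Operands of instruction N+1 are read before instruction N writes back."""
    regs = dict(regs)
    in_flight = None  # (dest, value): computed by N but not yet written
    for dest, src in program:
        # Operand fetch for this instruction overlaps the previous execute.
        d, s = regs[dest], regs[src]
        if forward and in_flight is not None:
            # Forwarding: take the in-flight result instead of the stale register.
            f_dest, f_value = in_flight
            if dest == f_dest:
                d = f_value
            if src == f_dest:
                s = f_value
        # Only now does the previous instruction's write-back complete.
        if in_flight is not None:
            regs[in_flight[0]] = in_flight[1]
        in_flight = (dest, d + s)
    if in_flight is not None:
        regs[in_flight[0]] = in_flight[1]
    return regs


program = [("a", "b"), ("c", "a")]  # add a, b  then  add c, a
start = {"a": 1, "b": 10, "c": 100}
print(run_sequential(program, start)["c"])                # correct result
print(run_overlapped(program, start)["c"])                # stale operand
print(run_overlapped(program, start, forward=True)["c"])  # fixed by forwarding
```

The second instruction reads `a` before the first has written it, so without forwarding (or a stall) it computes with the old value. On a COG the same problem also applies to the instruction fetch itself, since code and data share the same RAM.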

    I have no VHDL experience, so I cannot compare. Verilog is good for me; the Terasic boards I am using come with (badly commented) source-level reference designs in Verilog. The Quartus design tools give good information on resource usage (logic cells, memory, multipliers) while inserting high-level building blocks in a design. Verilog also has constructs that cannot be synthesized and are used for simulation only.
  • Cluso99 Posts: 18,069
    edited 2010-09-11 16:49
    I have noticed there are Spartan 6 boards with locked, full-blown Xilinx FPGA software for ~$500. But my 400 kgate Spartan 3A board from Avnet is now $50 (I got it for $40). It doesn't come with JTAG but has USB download and some nice PSoC products with it, as well as external memory.
  • Ale Posts: 2,363
    edited 2010-09-12 08:30
    I have been playing with a small 32-bit ALU that implements a subset of the COG's instructions... the shifters account for half of the logic involved!
    Why are functions not working???
  • nutson Posts: 242
    edited 2010-09-12 10:13
    What do you mean by functions?

    Using an Altera Quartus megafunction involves: invoke the wizard, select the megafunction, give this particular instance a name, fill in the parameters, finish, copy the named megafunction calling sequence into your Verilog module, and fill in the real-life signals.

    The attached copy shows the shifter function and the Verilog code that uses it. Yes, a barrel shifter accounts for several hundred LEs of the 1000+ that my CPU requires. Remember, the implementation is probably 32 multiplexers, each 32 bits to 1 bit.
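
To picture the "32 multiplexers, each 32 bits to 1 bit" structure, here is a bit-level Python sketch of a left shifter built that way, next to the usual five-stage alternative made of 2-to-1 multiplexers. It is a behavioural illustration only: the function names are made up, and how Quartus actually maps the megafunction onto LEs is not something this sketch claims to show.

```python
def mux(inputs, select):
    """N-to-1 multiplexer over single-bit inputs."""
    return inputs[select]


def shl_wide_mux(value, amount):
    """Left shift built from 32 multiplexers, each 32 bits to 1 bit."""
    bits = [(value >> i) & 1 for i in range(32)]
    out = 0
    for i in range(32):
        # Input k of this mux is the source bit for a shift by k (0 if shifted in).
        inputs = [bits[i - k] if i >= k else 0 for k in range(32)]
        out |= mux(inputs, amount & 31) << i
    return out


def shl_log_stages(value, amount):
    """Same shift as five stages of 2-to-1 multiplexers (shift by 1, 2, 4, 8, 16)."""
    bits = [(value >> i) & 1 for i in range(32)]
    for stage in range(5):
        step = 1 << stage
        sel = (amount >> stage) & 1
        bits = [mux([bits[i], bits[i - step] if i >= step else 0], sel)
                for i in range(32)]
    return sum(b << i for i, b in enumerate(bits))


print(hex(shl_wide_mux(0x80000001, 4)), hex(shl_log_stages(0x80000001, 4)))
```

The first form needs 32 wide multiplexers, the second 5 x 32 = 160 narrow ones. Either way it is a lot of logic for one ALU function, which fits the LE count mentioned above.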
  • Ale Posts: 2,363
    edited 2010-09-13 00:02
    nutson,

    I meant functions defined in Verilog. I use Icarus Verilog as a simulator and somehow a function I wrote does not seem to work... some error in my code for sure.
  • RvnPhnx Posts: 36
    edited 2010-09-20 11:48
    Heater. wrote: »
    2) VHDL turns out to be quite verbose, like ADA, but I kind of liked that.

    That would be because it is based on ADA!
    Heater. wrote: »
    3) VHDL is a much bigger language than what can actually be synthesized on an FPGA using Altera/Xilinx tools.
    ......
    6) Item 3 above is a point that most books/tutorials on VHDL skipped over. They would happily teach you all corners of the language with no regard to what could really be synthesized. Had I been set up with an FPGA board and, say, Altera tools, perhaps I would have learned to distinguish these things soon enough.

    I guess I was just lucky to have that explained to me in an actual VHDL course in college. It helps to also take a course in VLSI around the same time--you get a much better idea of what actually can be done--and what one should bother to attempt to do. (You should also spend some time designing your own cells--you'll get a better understanding of what Chip and Beau are going through right now.)

    Granted, at this point I'm not sure what good it did me in the long term to focus on that stuff in college as the chance I'll ever do design work is rapidly diminishing.