What's wrong with my HDL FIFO?

After a long hiatus I was getting back into my processor design for FPGA and built a UART for it. Then I decided the UART could do with a FIFO on the receiver and possibly on the transmitter. Sounds simple enough but it took me all day to get the logic right and the test harness in shape.

Now, I'm hoping this is a rhetorical question because as far as I can tell my FIFO works just fine. Under simulation at least. Bytes go in, bytes come out, in the right order, it says FULL when it's full and EMPTY when it's empty. No matter what I throw at it.

But then I started looking at Verilog FIFO designs around the net. Wow, they can be very long winded, complicated and hard to follow. Mine is really short and sweet. That made me start thinking I had missed an important point somewhere.

Here some examples of what look like over complex, tortuous, solutions:
https://www.fpga4student.com/2017/01/verilog-code-for-fifo-memory.html
https://www.quora.com/What-is-the-Verilog-code-for-synchronous-FIFO

The latter comes from an Intel engineer, and as far as I can tell is taking twice as much code to do exactly the same as mine. Perhaps that's why Intel processors need so many transistors and run so hot :)

So what is wrong with my FIFO? What am I missing here?

Here is my FIFO in SpinalHDL:
class Fifo(width: Int, depth: Int) extends Component {

  val addressWidth = (scala.math.log(depth) / scala.math.log(2)).toInt

  val io = new Bundle {
    val dataIn = in UInt (width bits)
    val dataOut = out UInt (width bits)
    val read = in Bool
    val write = in Bool
    val full = out Bool
    val empty = out Bool
  }

  val mem = Mem(Bits(width bits), wordCount = depth)
  val head = Reg(UInt (addressWidth  bits )) init (0)
  val tail = Reg(UInt (addressWidth  bits)) init (0)
  val full = Reg(Bool) init False
  val empty = Reg(Bool) init True

  mem.write(head, io.dataIn.asBits, !full & io.write)
  io.dataOut := U(mem.readAsync(tail))

  when (io.write && !io.read) {
    when (!full) {
      head := head + 1
      full := ((head + 1) === tail)
      empty := False
    }
  }

  when (!io.write && io.read) {
    when (!empty) {
      tail := tail + 1
      empty := (tail + 1  === head)
      full := False
    }
  }

  when (io.write & io.read) {
    when (full) {
      tail := tail + 1
      full := False
    }
    when (empty) {
      head := head + 1
      empty := False
    }
    when (!full & !empty) {
      tail := tail + 1
      head := head + 1
    }
  }

  io.empty := empty
  io.full := full
}
How simple could that be? In fact I think it's one of the most beautiful pages of code I have ever written. Never mind the specifics of the Spinal syntax, it has a certain poetic symmetry about it. That generates the following Verilog, longer and less pretty but still has the poetic symmetry:
module Fifo (
      input  [7:0] io_dataIn,
      output [7:0] io_dataOut,
      input   io_read,
      input   io_write,
      output  io_full,
      output  io_empty,
      input   clk,
      input   reset);
  wire [7:0] _zz_1;
  wire [6:0] _zz_2;
  wire [6:0] _zz_3;
  wire [7:0] _zz_4;
  wire  _zz_5;
  reg [6:0] head;
  reg [6:0] tail;
  reg  full;
  reg  empty;
  reg [7:0] mem [0:127];
  assign _zz_2 = (head + (7'b0000001));
  assign _zz_3 = (tail + (7'b0000001));
  assign _zz_4 = io_dataIn;
  assign _zz_5 = ((! full) && io_write);
  always @ (posedge clk) begin
    if(_zz_5) begin
      mem[head] <= _zz_4;
    end
  end

  assign _zz_1 = mem[tail];
  assign io_dataOut = _zz_1;
  assign io_empty = empty;
  assign io_full = full;
  always @ (posedge clk or posedge reset) begin
    if (reset) begin
      head <= (7'b0000000);
      tail <= (7'b0000000);
      full <= 1'b0;
      empty <= 1'b1;
    end else begin
      if((io_write && (! io_read)))begin
        if((! full))begin
          head <= (head + (7'b0000001));
          full <= (_zz_2 == tail);
          empty <= 1'b0;
        end
      end
      if(((! io_write) && io_read))begin
        if((! empty))begin
          tail <= (tail + (7'b0000001));
          empty <= (_zz_3 == head);
          full <= 1'b0;
        end
      end
      if((io_write && io_read))begin
        if(full)begin
          tail <= (tail + (7'b0000001));
          full <= 1'b0;
        end
        if(empty)begin
          head <= (head + (7'b0000001));
          empty <= 1'b0;
        end
        if(((! full) && (! empty)))begin
          tail <= (tail + (7'b0000001));
          head <= (head + (7'b0000001));
        end
      end
    end
  end

endmodule
I know, way off topic for this forum. But I know there are some HDL gurus around here and I blame Chip for setting me off down this road.

Comments

  • 17 Comments sorted by Date Added Votes
  • KeithEKeithE Posts: 940
    edited November 9 Vote Up0Vote Down
    I don't want to take a lot of time to review this, but this part:

    io.dataOut := U(mem.readAsync(tail))
    ...
    assign _zz_1 = mem[tail];
    assign io_dataOut = _zz_1;

    Indicates that you are doing async reads. I'm not sure what device you are targeting, but it may mean that this gets implemented with a bunch of registers. This is probably not what you want? Often times RAM reads are clocked. e.g. see sysMEM in Memory Usage Guide for iCE40 Devices

    But yeah a synchronous FIFO shouldn't be too complicated. It's the asynchronous ones that get interesting.

    Also one funny thing that I noticed recently when trying to understand things that other people wrote, is that head and tail are ambiguous terms. (I read a specification where they used inconsistently.) I would use something like wrPtr and rdPtr now. See Head_or_tail_first
  • KeithE,

    Thanks for taking a quick peek at it. For sure don't waste much time on it.

    I had not really looked into what Spinal does with readAsyn and readSync. Just now I changed it to Sync as an experiment. All that happens is that the memory read gets clocked out:
    Spinal:
        mem.readSync:
    
    Verilog:
        assign io_dataOut = _zz_2;
        assign _zz_1 = 1'b1;
        always @ (posedge clk) begin
            if(_zz_1) begin
            _zz_2 <= mem[tail];
            end
        end
    
    That causes all my test bench to fail as the data is now one clock late.

    Also as an experiment I dropped it into my xoro project, with a picorv32. In both cases Quartus synthesized it into SYNC_RAM. Which surprised me as I had a devil of a job getting Quartus to use memory blocks for the picorv32's RAM, ended up having to use the Quartus memory wizard thing to do it for me.

    In this case I might be happier if the memory was a bunch of LUTs. I had to reduce my picorv32 RAM from 64K bytes to 40K bytes to stop Quartus complaining that it can't find enough RAM blocks. Not sure why it want's to eat so much for my 32 stage FIFO.

    Ah, yes, I had not considered asynchronous FIFO's. Dealing with different clock domains is out of my league at the moment.

    That's interesting about the ambiguity of "head" and "tail". Having created many software FIFO's over the years and seen many others I have never had anyone argue that point in reviews. I have never even thought about it.

    I'm going to leave as is. Because that is the way I have always seen it. Because, apparently, that is how the Linux kernel does it so everyone else must be wrong.

    Perversely I could say that what we have here is queue of empty space. So "head" indicates the first free space slot to be consumed from the queue :)
  • Yes I’m in agreement with the Linux usage too. Whew!

    We used some 8051 IP where the internal data RAM had to support async reads. No way to configure it as sync. It would have impacted performance. But it was only 128 bytes, so we could live with a regfile.
  • I was thinking about this "head" vs "tail" ambiguity.

    Strangely enough I suspect that if one asked me to implement a "FIFO" in software, or "cyclic buffer" as they have been known, I would use the terms "head" and "tail" as I have here.

    On the other hand if the requirement was for a "queue" then I might use "head" and "tail" the other way round.

    How so?

    Because "FIFO" and "cyclic buffer" are sort of hardware guys way to look at it and the terms those guys would use. You don't by a "queue" chip you buy a "FIFO" chip. For example: https://www.digikey.com/products/en/integrated-circuits-ics/logic-fifos-memory/707

    But "queue" is from the computer science world where queue's consume data from the "head" and add it to the "tail".

    It's all the same idea of course, the confusion comes when hardware guys meet software guys in the middle, in the FPGA world.

    Anyway, my problem now is that my serial receiver works fine with my picorv32 core in an FPGA. My receiver + FIFO works fine in Verilator simulation. But when I put the receiver + FIFO in FPGA all kinds of garbage comes out.

    This all gets too big and complex to discuss here. I have to fight with it some more....
  • Oh shoot. This "head"/"tail" thing is worse than it seems.

    In the esoteric programming world of Haskell we have this:
    >head [5,4,3,2,1]
    5
    
    So "head" is the first element in a list.
    >tail [5,4,3,2,1]  
    [4,3,2,1]
    
    So "tail" is everything else!
    >last [5,4,3,2,1]  
    1   
    
    So "last" is the one on the end. Fair enough.
    >init [5,4,3,2,1]  
    [5,4,3,2]
    
    Which everything before "last". Hmm..OK

    So "head" is the opposite to "last" and "tail" is the opposite to "init"

    Confused yet?

    OK, given we are talking about a FIFO here, as in "first in, first out", we should not use "head" and "tail" but rather "first" and "last".

    "first" is the position things go out of the FIFO, they were first in.

    "last" is the position things go in to the FIFO, they were last in.

    That would at least remove the "head"/"tail" ambiguity. At risk of confusing everyone because they had never seen these things named like that before.

    Perhaps I worry about trivia too much....



  • ElectrodudeElectrodude Posts: 1,192
    edited November 9 Vote Up0Vote Down
    Heater. wrote: »
    Oh shoot. This "head"/"tail" thing is worse than it seems.

    In the esoteric programming world of Haskell we have this:
    >head [5,4,3,2,1]
    5
    
    So "head" is the first element in a list.
    >tail [5,4,3,2,1]  
    [4,3,2,1]
    
    So "tail" is everything else!
    >last [5,4,3,2,1]  
    1   
    
    So "last" is the one on the end. Fair enough.
    >init [5,4,3,2,1]  
    [5,4,3,2]
    
    Which everything before "last". Hmm..OK

    So "head" is the opposite to "last" and "tail" is the opposite to "init"

    Confused yet?

    OK, given we are talking about a FIFO here, as in "first in, first out", we should not use "head" and "tail" but rather "first" and "last".

    "first" is the position things go out of the FIFO, they were first in.

    "last" is the position things go in to the FIFO, they were last in.

    That would at least remove the "head"/"tail" ambiguity. At risk of confusing everyone because they had never seen these things named like that before.

    Perhaps I worry about trivia too much....



    That's not just Haskell. Most functional languages either call them car and cdr (like most Lisps) or call them head and tail.

    This section of Wikipedia's Linked List article says that tail can refer to either the rest of the list or the last element of the list. The article uses the term "tail" in the sense of the rest of the list extensively: e.g. search the page for "tail-sharing".
  • Oh Boy, yes, "car" and "cdr". Lisp and Scheme and others...

    At least "car" and "cdr" avoid ambiguity by being really weird.

    To any normal person this "head" and "tail" thing is obvious. Animals have a head at one end and a tail at the other.Food goes in at the head end and comes out at the tail end. The stuff in the middle is not either head nor tail.

    The CS guys get it wrong.

    In Haskell lists can be infinitely long, for example [1, 2..]

    So the head is "1".

    But what is the last? ... hangs up computer...

    Perhaps for this reason the Haskell guys define "tail" as everything that is not "head", [2..]. They cannot get to the actual tail!
  • instead of head and tail one could find better names. I for example would be tempted to name one of them FirstIn and the other FirstOut so that they say what they do, but that is to COBOL-like for other people...

    Mike
    I am just another Code Monkey.
    A determined coder can write COBOL programs in any language. -- Author unknown.
    Press any key to continue, any other key to quit

    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this post are to be interpreted as described in RFC 2119.
  • msrobots wrote: »
    instead of head and tail one could find better names. I for example would be tempted to name one of them FirstIn and the other FirstOut so that they say what they do, but that is to COBOL-like for other people...

    Mike
    No, that might be obvious these days. Much better to use something ambiguous to obfuscate the code.
    My Prop boards: P8XBlade2, RamBlade, CpuBlade, TriBlade
    Prop OS (also see Sphinx, PropDos, PropCmd, Spinix)
    Website: www.clusos.com
    Prop Tools (Index) , Emulators (Index) , ZiCog (Z80)
  • How about the Fortran approach, just call them I and J ?

  • How about using Fi and Fo. Saves keystrokes and leaves Fee and Fum free for other uses ;-)
    In science there is no authority. There is only experiment.
    Life is unpredictable. Eat dessert first.
  • Heater.Heater. Posts: 21,177
    edited November 10 Vote Up0Vote Down
    Yes.

    Of course if you call them "Fi" and "Fo" then the "F" is redundant.

    Might as well call them "i" and "o". Which brings us back to Fortran world.

    I don't like to use "ptr" as in "wrPtr" and "rdPtr". Pointers are a low level software things, as in C, they don't really describe what is going on. And again the "Ptr" part is redundant. Might as well call them "wr" and "rd".

    Oh what, "Fee-fi-fo-fum, I smell the blood of an Englishman, Be he alive, or be he dead I'll grind his bones to make my bread."

    That put the fear of God into me as a five year old. Who are these big guys that want to eat us?!
  • What have I started? ;-)

    This sort of reminds me of something that happened at a job. There was a mode called green mode. I asked - why is it called green mode? Oh, that's because it was the mode drawn with a green dry erase marker on the white board. Similarly there was a mode called mode 3, because it was the third mode of operation introduced. Somehow modes 1 and 2 got reasonable names indicating what they did, but mode 3 never did. And the key GPS software engineers seemed to enjoy this sort of thing.
  • KeithE wrote: »
    What have I started? ;-)
    ....

    No worries. After dealing with problem after problem this week I was am in a bit of a whimsical mood. I'm sure it will pass soon.
    In science there is no authority. There is only experiment.
    Life is unpredictable. Eat dessert first.
  • KeithE,

    Colour me stupid but modes named "1", "green", "3" do not indicate what any of them did.

  • I didn’t give the updated names for 1 and 2.
  • Heater.Heater. Posts: 21,177
    edited November 19 Vote Up0Vote Down
    Well of course there is something wrong with my FIFO. It won't work at 100MHz. At least not when embedded into my UART receiver. Which is a pain because the picorv32 processor core I'm using does.

    I guess it's hardly surprising as there is a huge lot of logic to get through between a byte arriving and a FIFO write and similarly on read.

    In the receiver:
    ...
    when (baudClockX64Sync2.rise) {
        ...
        switch(state) {
          ...
          is(3) {
            // Check stop bit
            when(bitTimer === 0) {
              when(io.rx === True) {
                when (!fifo.io.full) {
                  fifo.io.dataIn := shifter
                  fifo.io.write := True
                  ...
    
    Then in the fifo:
                   ...
                    when (io.write.rise && !io.read) {
                        when (!full) {
                        head := head + 1
                        full := ((head + 1) === tail)
                        empty := False
                   ...
    
    What to do? Some how slow down the FIFO read/write cycles? Add some kind of pipelining buffers on the FIFO inputs and outputs. All sounds a bit messy.

    Code is here if anyone is curious:

    SpinalHDL : https://github.com/ZiCog/xoroshiro/tree/master/src/main/scala/xoroshiro
    Verilog : https://github.com/ZiCog/xoroshiro

    Edit:

    STOP PRESS:

    Quite by chance I now have the FIFO running with a 100MHz clock in my UART receiver. BUT it only does so when I hook the UART's internal state variable to the DE0 Nano LEDS. Take that out and it all locks up!

    Make me think this is working on the edge timing wise.

Sign In or Register to comment.