Interesting. Isn't the SymbiFlow tool chain for ECP5 a bit far off?
Currently my state machine is in my UART Rx, connected to a picorcv32. I'm not sure how I would test it on the FPGA without the processor. I'd have to build another state machine to exercise it...
From hitting the "go" button in Quartus to blinky lights on FPGA is almost 1 minute and 20 seconds. That's only using 12% of the FPGA.
Not really. The state machine I have inside my async receiver has been working solidly for days now. My serial test has been running as fast as possible all the while, with no error.
However, if I make any seemingly innocuous change, like using "one hot" encoding for the state then Quartus makes random junk out of it that does not work. Even if my Verilator simulation is still fine. I dare not touch anything! I have no idea what breaks it.
Meanwhile, I can make no sense out of the Quartus generated state machine as shown in the netlist viewer. Which looks like this:
When I inspect the table of transitions that comes with that diagram I see crazy things that basically say "If you are not in this state, this is how you transition to the next state".
I.e. there is no such possible transition as shown in the diagram! Why is it shown with an arc in the diagram then?
"You can analyze this state machine instance using the state machine
diagram, transition table, and encoding table. Clearly something is
wrong with the state machine because every state has a transition to
every other state (Figure 12–34). After inspecting the state machine
behavior, you determine that in this scenario, the designer forgot to
create default assignments for the next state (that is, next_state =
current_state if the conditions are not met)."
I didn't look back at your generated Verilog, but here's hoping it is your issue.
So I spent an hour adding default assignments, in my case "next = state" to suitable places in my code. Some caused it to fail. One worked. But always that state machine diagram shows all possible transitions. Something is not right.
Now I just notice something likely very important. Quartus has generated two state machines in my async receiver. One for "state" and one for "next". The diagram I posted earlier was for "next". Here is the one for "state". It's even more weird, it has no transitions out of sate zero!
I'm guessing it's not correct to have two state machines.
1) They don't use a "next_state", they only have the "state" variable.
2) Their case statement is directly driven by "posedge clk or posedge reset"
3) All possible paths through a case assign to state what the next state should be.
4) Their state machine has two case statements. One does nothing but update state and the other sets output according to state.
With that in mind I rearranged my SpinalHDL code a bit:
1) Got rid of the "next". I never did like that next state thing, it's redundant. Why do so many example Verilog state machines around the net advise to do that?
2) My "switch" statement (verilog case) was inside a test for the baud rate clock edge. My hunch is that this confuses Quartus so I turned things inside out. I lifted the switch to the outer level and put the test for baud clock edge inside every case. This is logically equivalent, and works, so I have no idea why Quartus might get confused.
3) I put a default state assignment into every case.
4) Nothing, unlike the Intel example I left mine as a single switc/case block. Why do they split things up into two case blocks like that?
With those changes BINGO! Quartus now generates a single state machine instead of two and the state machine viewer shows only the correct state transitions. Oh, and it actually works!
A big thank you Keith for prompting me to look into that some more.
The crucial difference, I believe, is that the "switch" is now at the top level with the tests for "baudClockX64Sync2.rise" inside each case. Rather than the other way around, clock test at the top level and the switch inside that.
Logically this is the same. And it did work. But this way round Quartus draws me a correct transition diagram. And only one of them!
Looking at it now I think I can simplify it further. I could remove the test for "baudClockX64Sync2.rise" in each case and just test "when((bitTimer === 0).rise)".
Edit: I made the above state machine simplifications, removed state S4.
module AsyncReceiver (
input io_enable,
input io_mem_valid,
output io_mem_ready,
input [3:0] io_mem_addr,
output [31:0] io_mem_rdata,
input io_baudClockX64,
input io_rx,
input clk,
input reset);
reg _zz_5;
reg _zz_6;
wire [7:0] _zz_7;
wire _zz_8;
wire _zz_9;
wire _zz_10;
wire _zz_11;
wire [31:0] _zz_12;
wire [0:0] _zz_13;
reg [1:0] state;
reg [5:0] bitTimer;
reg [2:0] bitCount;
reg [7:0] shifter;
reg baudClockX64Sync1;
reg baudClockX64Sync2;
reg rxSync1;
reg _zz_1;
reg _zz_2;
reg _zz_3;
reg [7:0] rdata;
reg ready;
wire busCycle;
reg _zz_4;
assign _zz_10 = (busCycle && (! _zz_4));
assign _zz_11 = (bitTimer == (6'b000000));
assign _zz_12 = {24'd0, rdata};
assign _zz_13 = (! _zz_9);
Fifo fifo_1 (
.io_dataIn(shifter),
.io_dataOut(_zz_7),
.io_read(_zz_5),
.io_write(_zz_6),
.io_full(_zz_8),
.io_empty(_zz_9),
.clk(clk),
.reset(reset)
);
always @ (*) begin
_zz_5 = 1'b0;
if(_zz_10)begin
case(io_mem_addr)
4'b0000 : begin
_zz_5 = 1'b1;
end
4'b0100 : begin
end
default : begin
end
endcase
end
end
always @ (*) begin
_zz_6 = 1'b0;
case(state)
2'b00 : begin
end
2'b01 : begin
end
2'b10 : begin
end
default : begin
if(_zz_11)begin
if((rxSync1 == 1'b1))begin
if((! _zz_8))begin
_zz_6 = 1'b1;
end
end
end
end
endcase
end
assign busCycle = (io_mem_valid && io_enable);
assign io_mem_rdata = (busCycle ? _zz_12 : (32'b00000000000000000000000000000000));
assign io_mem_ready = (busCycle ? ready : 1'b0);
always @ (posedge clk or posedge reset) begin
if (reset) begin
state <= (2'b00);
bitTimer <= (6'b000000);
bitCount <= (3'b000);
shifter <= (8'b00000000);
baudClockX64Sync1 <= 1'b0;
baudClockX64Sync2 <= 1'b0;
rxSync1 <= 1'b0;
rdata <= (8'b00000000);
ready <= 1'b0;
end else begin
baudClockX64Sync1 <= io_baudClockX64;
baudClockX64Sync2 <= baudClockX64Sync1;
rxSync1 <= io_rx;
if((baudClockX64Sync2 && (! _zz_1)))begin
bitTimer <= (bitTimer - (6'b000001));
end
case(state)
2'b00 : begin
state <= (2'b00);
if(((! rxSync1) && _zz_2))begin
state <= (2'b01);
bitTimer <= (6'b011111);
end
end
2'b01 : begin
state <= (2'b01);
if((bitTimer == (6'b000000)))begin
if((rxSync1 == 1'b0))begin
bitTimer <= (6'b111111);
state <= (2'b10);
end else begin
state <= (2'b00);
end
end
end
2'b10 : begin
state <= (2'b10);
if((baudClockX64Sync2 && (! _zz_3)))begin
if((bitTimer == (6'b000000)))begin
shifter[bitCount] <= rxSync1;
bitCount <= (bitCount + (3'b001));
if((bitCount == (3'b111)))begin
state <= (2'b11);
end
end
end
end
default : begin
state <= (2'b11);
if(_zz_11)begin
state <= (2'b00);
end
end
endcase
ready <= busCycle;
if(_zz_10)begin
case(io_mem_addr)
4'b0000 : begin
rdata <= _zz_7;
end
4'b0100 : begin
rdata <= {7'd0, _zz_13};
end
default : begin
end
endcase
end
end
end
always @ (posedge clk) begin
_zz_1 <= baudClockX64Sync2;
_zz_4 <= busCycle;
end
always @ (posedge clk) begin
_zz_2 <= rxSync1;
end
always @ (posedge clk) begin
_zz_3 <= baudClockX64Sync2;
end
endmodule
Wow, that is one audacious project and an excellent write up too. He has lot's ideas there to borrow.
Never heard of that panologic device. Sounds like a sweet FPGA box to get hold of.
Thanks for all your hints and tips re: my FIFO and state machine. Especially prompting me to investigate this rabbit hole to the end rather than giving up in confusion and despair.
I might be back with FIFO woes if I ever decide to make it asynchronous!
Comments
https://www.tindie.com/products/tinyfpga/tinyfpga-bx/
I just bought one.
The edit/run turn around time of Quartus has been driving me nuts while try to get that state machine to work.
It's time to get that tinyfpga board or two.
https://www.crowdsupply.com/tinyfpga/tinyfpga-ex
As far as your state machine, can you build a simpler example for quicker iteration? How long does an iteration take currently?
Currently my state machine is in my UART Rx, connected to a picorcv32. I'm not sure how I would test it on the FPGA without the processor. I'd have to build another state machine to exercise it...
From hitting the "go" button in Quartus to blinky lights on FPGA is almost 1 minute and 20 seconds. That's only using 12% of the FPGA.
Not really. The state machine I have inside my async receiver has been working solidly for days now. My serial test has been running as fast as possible all the while, with no error.
However, if I make any seemingly innocuous change, like using "one hot" encoding for the state then Quartus makes random junk out of it that does not work. Even if my Verilator simulation is still fine. I dare not touch anything! I have no idea what breaks it.
Meanwhile, I can make no sense out of the Quartus generated state machine as shown in the netlist viewer. Which looks like this:
When I inspect the table of transitions that comes with that diagram I see crazy things that basically say "If you are not in this state, this is how you transition to the next state".
I.e. there is no such possible transition as shown in the diagram! Why is it shown with an arc in the diagram then?
https://courses.cs.washington.edu/courses/cse467/08au/labs/Resources/Quartus II Netlist Viewers.pdf
"You can analyze this state machine instance using the state machine
diagram, transition table, and encoding table. Clearly something is
wrong with the state machine because every state has a transition to
every other state (Figure 12–34). After inspecting the state machine
behavior, you determine that in this scenario, the designer forgot to
create default assignments for the next state (that is, next_state =
current_state if the conditions are not met)."
I didn't look back at your generated Verilog, but here's hoping it is your issue.
Interesting. Thanks.
So I spent an hour adding default assignments, in my case "next = state" to suitable places in my code. Some caused it to fail. One worked. But always that state machine diagram shows all possible transitions. Something is not right.
Now I just notice something likely very important. Quartus has generated two state machines in my async receiver. One for "state" and one for "next". The diagram I posted earlier was for "next". Here is the one for "state". It's even more weird, it has no transitions out of sate zero!
I'm guessing it's not correct to have two state machines.
Oh my God, I fixed it! What a struggle.
I was looking at the Mealy State Machine template that Intel has here: https://www.intel.com/content/www/us/en/programmable/support/support-resources/design-examples/design-software/verilog/ver-state-machine.html how simple could it be?! I noticed a couple of points:
1) They don't use a "next_state", they only have the "state" variable.
2) Their case statement is directly driven by "posedge clk or posedge reset"
3) All possible paths through a case assign to state what the next state should be.
4) Their state machine has two case statements. One does nothing but update state and the other sets output according to state.
With that in mind I rearranged my SpinalHDL code a bit:
1) Got rid of the "next". I never did like that next state thing, it's redundant. Why do so many example Verilog state machines around the net advise to do that?
2) My "switch" statement (verilog case) was inside a test for the baud rate clock edge. My hunch is that this confuses Quartus so I turned things inside out. I lifted the switch to the outer level and put the test for baud clock edge inside every case. This is logically equivalent, and works, so I have no idea why Quartus might get confused.
3) I put a default state assignment into every case.
4) Nothing, unlike the Intel example I left mine as a single switc/case block. Why do they split things up into two case blocks like that?
With those changes BINGO! Quartus now generates a single state machine instead of two and the state machine viewer shows only the correct state transitions. Oh, and it actually works!
A big thank you Keith for prompting me to look into that some more.
and the Verilog it generates here: https://github.com/ZiCog/xoroshiro/blob/master/AsyncReceiver.v
Also below for convenience.
The crucial difference, I believe, is that the "switch" is now at the top level with the tests for "baudClockX64Sync2.rise" inside each case. Rather than the other way around, clock test at the top level and the switch inside that.
Logically this is the same. And it did work. But this way round Quartus draws me a correct transition diagram. And only one of them!
Looking at it now I think I can simplify it further. I could remove the test for "baudClockX64Sync2.rise" in each case and just test "when((bitTimer === 0).rise)".
Edit: I made the above state machine simplifications, removed state S4.
https://tomverbeure.github.io/rtl/2018/11/26/Racing-the-Beam-Ray-Tracer.html
He has a link to yet another SpinalHDL RISC V as well.
And then there’s this in Chisel:
https://www2.eecs.berkeley.edu/Pubs/TechRpts/2018/EECS-2018-151.html
Never heard of that panologic device. Sounds like a sweet FPGA box to get hold of.
Thanks for all your hints and tips re: my FIFO and state machine. Especially prompting me to investigate this rabbit hole to the end rather than giving up in confusion and despair.
I might be back with FIFO woes if I ever decide to make it asynchronous!