RISC V ? - Page 3 — Parallax Forums

Comments

  • I've used higher-end commercial simulators such as VCS+Verdi before, so I've had the experience. BTW - you should have a copy of ModelSim if you've installed the Altera tools. It's pretty decent. I also think that H/W unit tests are a good idea - they can be quite quick and easy to write. But of course you need simulations of the entire chip as well. Whole chip simulations can be slow even at the RTL level. Also debugging the whole chip is harder, especially when all of the stimulus is randomized.

    I wonder how the simulation performance of MyHDL (Python based) compares to the free Verilog simulators? It looks like there's not a whole lot of active development, but it's still alive. Google Summer of Code still supports it, and the forums have activity. I don't think that any of the free Verilog simulators support SystemVerilog, so being able to use Python for verification work might be nice. I haven't tried MyHDL and was initially pretty skeptical, but for FPGA work it might be worth a try.

    Didn't the Scheme guys develop a Scheme chip in Scheme? Someone needs to do a JavaScript version of MyHDL then ;-)
  • Well, you need to learn the RISC V instructions too, or those of whatever other CPU you put in there. After a longer absence, just lurking, I took a deep breath and read the current documentation for the P2.

    It's not as bad as I was afraid of. The overall design looks quite consistent to me now and has a bunch of interesting features compared to the P1. PASM2 is still overwhelming me a little, but so did PASM1 years ago.

    The smart pins are easier to understand than the timers ever were (for me); the same goes for the streamer, compared to the whole waitvid & co. stuff. It just seems to make more sense, although I have never tried to do anything with it yet.

    So I am stealing with my eyes, reading @Chip's and @ozpropdev's examples, just as I did with PASM years ago. And I think the current state is already a very nice assembly language, adding the pointers PASM was missing and addressing the HUB access latency/variance of the P1 with the FIFO.

    I wish someone other than me and Roy would try to convince Chip that an 8-character tab stop should not dictate 7-character mnemonics; just going to 12 would allow 11 characters and much more descriptive names.

    But that is the COBOL programmer in me. I like longer, more descriptive source code. No need to look things up if you can read them.

    Mike
  • Heater. Posts: 21,230
    edited 2017-04-08 21:54
    There is a whole world of tools out there, I guess. As I'm just tinkering for fun, I like to see what can be done with Open Source tools as much as possible.
    Mind you, it often happens that when I tinker with something for fun for ages, it ends up with someone offering me a job doing exactly that. Perhaps I'm getting a bit too old for that now.

    There seem to be a few problems here:

    Firstly, is my design even syntactically and logically correct? I'm relying on the Icarus compiler and simulator to tell me that.

    Then, does that design synthesize to a logically correct arrangement of Logic Elements in my FPGA, or a logically correct circuit of gates in a real chip? Well, I'm trusting Quartus to get that right just now :) It looks like the same issue as delivering C/C++ source code to someone who has a different compiler, CPU architecture or operating system; all kinds of bugs can show up.

    Then, will my design work at the clock rate I want? Hmm... just now I ask the FPGA itself. If I wind the clock on the picorv32 on the DE0-Nano up past 100MHz it don't work no more!

    What else have I missed?

    The Icarus Verilog simulator does support SystemVerilog. A while back one had to enable SystemVerilog support with a flag, apparently. Perhaps that is still so.

    The Verilator Verilog-to-C++ compiler also supports SystemVerilog. I have not tried that yet.
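    From memory, the invocations are something like this (file names are examples only):

        # Icarus: -g2012 selects IEEE 1800-2012 (SystemVerilog) support
        iverilog -g2012 -o sim picorv32.v testbench.v
        vvp sim

        # Verilator: lint and translate the Verilog to C++ (untried by me)
        verilator -Wall --cc picorv32.v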

    No idea about MyHDL or Scheme. But they are designing RISC V chips in Haskell.

  • jmg Posts: 15,140
    edited 2017-04-08 22:02
    Ale wrote: »
    Some time ago I tried to develop my own RISC V cores, an async model from the memory standpoint. I wanted a 32-bit version that fits in a ... Lattice MachXO2 :), which is what I have at hand and love.
    I got side-tracked recreating some old HP calculators (in FPGAs). I wanted to buy some iCE5LP4K–SG48 from Mouser the other day, just to test their low-power abilities. They had none. I got the 1k-LUT one, which will do for testing... I didn't check Digikey, though... but they don't have it :( It would be great if the MachXO2 were available in such a package; the ~4k LUT version seems to come in some kind of dual-row QFN :(. What do they have against a nice 64- or 48-pin TQFP? :( Only the smaller members are available in such a package. Yes, I know you mention the iCE40 UltraPlus and I speak about the Ultra only... the names are... not helping. Anyway, still not available :(

    I do have an iCE40 UltraPlus here (QFN48, 128 kBytes RAM on chip), on the eval board, but it is showing not-unexpected signs of being very new.

    E.g. they have a canned demo of the RGB LED, which I think is not working - the LED stays red.
    It's hard to tell if the PC side or the FPGA side is to blame, as they do not include just a simple blinky, but had to add a spiffy GUI with sliders to control things... (they need to learn KISS).
    Smarter would have been at least two demos: sure, do one that links to the PC, but provide another, far simpler one that just runs three differing dividers, one for each LED.

    I can download a binary, after digging for their example, with the same result.
    However, only some sources are provided, so I cannot rebuild.
    When I manually add the extra files, the synthesis is OK, but the mapper complains it cannot place the CSn...

    So, Lattice do still need to get it together on the newest parts, and should ship a source set that actually works.

  • jmg Posts: 15,140
    msrobots wrote: »
    I wish someone other than me and Roy would try to convince Chip that an 8-character tab stop should not dictate 7-character mnemonics; just going to 12 would allow 11 characters and much more descriptive names.

    But that is the COBOL programmer in me. I like longer, more descriptive source code. No need to look things up if you can read them.
    Agreed, and for their target market I would think simple code readability matters a whole lot more.
    Cryptically short names mattered decades ago, when machines had 2K of memory. Best avoided in 2017 and beyond.
  • Heater. Posts: 21,230
    I wish someone could convince the world that TABs are really broken and annoying. They should never appear in source code.

    I do agree that limiting mnemonics and other names to 7 chars is rather 1970s and unnecessary.

    On the other hand, needing more than a few characters in assembler instruction mnemonics might be a sign that you have far too many instructions :)
  • Heater - using open source tools is nice, it's just painful at times when major features are missing. But it is interesting how much can be done on a Pi with free tools now. I wonder if the MagPi has ever written anything about it? I fully support this, but did want to point out that you likely have ModelSim if you ever want to give that a try.

    There are tools to address many of the problems that you raise: proving formal properties (assertions, or that a design matches a specification), proving logical equivalence (e.g. RTL to gates), static timing analysis, linting, checking for issues related to clock domain crossing, coverage analysis, ...
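    For a flavour of the assertion side, here's a minimal SystemVerilog assertion sketch (req/ack are hypothetical handshake signals, not from any particular design):

        // Hypothetical check: every request must be acknowledged
        // within 1 to 3 clock cycles.
        property p_req_ack;
            @(posedge clk) req |-> ##[1:3] ack;
        endproperty
        assert property (p_req_ack);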

    The problem with SystemVerilog support is that there is so much to support. Getting UVM (the hot verification framework of the day) running is non-trivial. Here's an explanation that I pasted:
    SystemVerilog is a combination of a hardware description language (HDL) based on Verilog and a hardware verification language (HVL) based on Vera, with additional features coming from assertion languages. Apart from being under a single name and sharing a similar low-level syntax, SystemVerilog remains a collection of different languages.

    The HDL portion of the language comes from IEEE 1364-2001, which is now technically deprecated in favor of SystemVerilog (IEEE 1800). It is extended with some features of VHDL, such as interfaces. New application programming interfaces were added, including the Direct Programming Interface (DPI).

    The HVL itself is multiple languages. The main body is an object-oriented language that targets a constrained-random test pattern methodology. The assertion language is a declarative language that has now essentially converged with the Property Specification Language (PSL).
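    To give a flavour of the constrained-random side, a tiny hypothetical SystemVerilog class (the fields and constraints are made up for illustration):

        // A made-up bus transaction: fields get randomized subject
        // to the declarative constraints when txn.randomize() is called.
        class BusTxn;
            rand bit [31:0] addr;
            rand bit [7:0]  len;
            constraint c_legal {
                addr[1:0] == 2'b00;   // word aligned
                len inside {[1:64]};  // bounded burst length
            }
        endclass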
  • Come on @Heater., even with half the number of instructions, stuff should be readable.

    ld, mv, cp were nice when we had screens 40 characters wide.

    Why mov instead of move? I really think PASM2 could gain a lot of readability by having longer mnemonics.

    And RISC V assembly also.

    Enjoy!

    Mike
  • Heater. Posts: 21,230
    KeithE,

    I am under no illusion that Open Source tools can do everything. We cannot expect it to be possible to create open tools for closed and undocumented things like the bitstreams of FPGAs, or whatever it is you need to produce for real ASICs and chips. That would be like expecting people to write a C/C++ compiler for an undocumented CPU instruction set.

    Still. I find it stunning that clever guys do manage to reverse engineer a lot of things and get them working with Open Source software.

    I did a quick search of the Pi forums regarding FPGAs and FPGA tools on the Pi the other day. Almost no mention anywhere. Which is odd since the icoboard has been out for a while.

    I may get around to ModelSim at some point. So much to do...

    I understand what you say about Verilog/SystemVerilog being a mess of languages. There is stuff you can write that can be synthesized into an FPGA or chip design. Then there is stuff that is great for test benches and such. I noticed this years ago when I dabbled with VHDL/GHDL for a while.

    Perhaps that is why the guys at Berkeley are using Haskell to design chips now.

  • KeithE Posts: 957
    edited 2017-04-09 02:11
    Heater. wrote: »
    Perhaps that is why the guys at Berkeley are using Haskell to design chips now.

    I have to admit that I don't really understand the motivation for Chisel. The real problem in industry seems to be in verification not in design. And the verification environment for RISC V is not all that impressive. Did I miss something? Some others seem skeptical - e.g. https://news.ycombinator.com/item?id=10861072 and way down the page at https://news.ycombinator.com/item?id=10831601. If I missed something let me know, and this could be an interesting example to study.

    (Also in my experience the biggest problem is writing the specification which drives everything. If Chip had a P2 specification ten years ago...)

    And when you work in one of these tools, such as Chisel or MyHDL, you typically generate Verilog or VHDL, which means that at some point you are debugging code that doesn't look quite like what you wrote. The commercial tools won't support something like Chisel, so debugging is more of a pain. In Verdi it's very convenient to navigate Verilog designs and even go from edges in the waveform viewer to the code that caused them. If you don't have your DUT + test bench coded in a language that Verdi supports, then oops.

    Even Chisel is being redesigned (again):

    http://bar.eecs.berkeley.edu/projects/2015-firrtl.html
    "However, Chisel's external rate of adoption was slow for the following reasons.

    Writing custom circuit transformers requires intimate knowledge about the internals of the Chisel compiler.
    Chisel semantics are underspecified and thus difficult to target from other languages.
    Error checking is unprincipled due to underspecified semantics resulting in at times unclear error messages.
    Learning a functional programming language (Scala) is difficult for RTL designers with limited programming language experience.
    Confounding the previous point, conceptually separating the embedded Chisel HDL from the host language is difficult for new users.
    The output of Chisel (Verilog) is unreadable and slow to simulate.
    ...
    As a consequence, Chisel needed to be redesigned from the ground up to standardize its semantics, modularize its compilation process, and cleanly separate its front-end, intermediate representation, and backends."

    To be fair, they are a small group, and the industry has probably spent thousands of man-years or more on the tools that this is trying to replace/supplement. And it's good to have people experimenting. But the RISC V design that you linked to seems to have come together in Verilog pretty easily.

    Edited to add:

    I don't think that the Haskell HDL stuff is from Berkeley? http://www.clash-lang.org
  • Wasn't the point of Chisel to allow functional specs and ignore many lower-level hardware considerations?

  • Here's some back and forth with the MyHDL guys. Chisel sounds low level to me based on some of the statements.

    https://news.ycombinator.com/item?id=9949630

    http://www.jandecaluwe.com/blog/chisel-flawed-approach.html

    There are other efforts. I thought Bluespec was doing this.
  • Heater. Posts: 21,230
    msrobots,

    I don't particularly mind instruction mnemonics getting a bit longer. It's probably a good idea for the more unusual ones.

    Humans are lazy. If they use a word or phrase enough they end up shortening it to some acronym or just a symbol. Which is why mathematicians have +, -, etc., Unix has ls, cp, mv, rm, Londoners drop their "h", and the internet has LOL and WTF. At which point things become unreadable to those not in the know.

    I also welcome being able to use more than 80 columns per line. Although using long lines in source code does not make it easier to read most of the time. It makes it messy to print or paste into forums and elsewhere.

    My comment was merely my reflection that the P2 has so many instructions now that only a very few hard core users are ever going to get to know them all and be able to use them sensibly.

    Oh, and yes "mov" is silly, should just be "m" :)

  • Heater. Posts: 21,230
    KeithE,

    I have no idea about Chisel, MyHDL etc. Every couple of years I end up learning a new programming language. This year it seems to be Verilog. Those others are going to have to wait for me...

    Of course Chisel is not really anything to do with RISC V. RISC V is only an instruction set specification. One can implement a CPU to that specification in whatever way one likes (TTL, anyone?) and use whatever verification tools. As many have done.

    It does seem rather scary to be designing in a highly abstracted language that does not allow linking back from its output to the original source. That kind of traceability is rather important.
    I don't think that the Haskell HDL stuff is from Berkeley?
    Oops. I have been saying "Haskell" all along here when it should have been "Scala".

  • Heater. wrote: »
    RISC V is only an instruction set specification. One can implement a CPU to that specification in whatever way one likes (TTL, anyone?) and use whatever verification tools. As many have done.

    That's true. I was hoping for some really great ISA verification tools, though; perhaps everyone is keeping those proprietary because they think it's too much to give away.
  • Heater. Posts: 21,230
    KeithE,

    Are there "really great ISA verification tools" available for any instruction set design, commercial or otherwise?

    I'm a total greenhorn at this, but I have heard it said that formal verification of any ISA implementation, that is to say a CPU that implements the ISA, is impossible.

    When I think of pipelines, branch prediction, caches etc., I can well believe that is true. It all gets far too complex to check every possible thing that can happen during instruction execution. Even Intel with its huge resources puts out buggy chips.

    I don't believe the creators of RISC V have any secret tools they are keeping proprietary. They are academics, after all. I guess it's possible someone else out there has something they keep to themselves. That's OK by me.

  • Heater. wrote: »
    msrobots,

    Oh, and yes "mov" is silly, should just be "m" :)

    Technically, MOVE instructions should actually be called COPY, because the source of the data is not affected. :nerd:

    With a true MOVE, the information is translated from one place to another, leaving the source empty.
    Before doing a true MOVE, one must use a DISCARD or SELL instruction.

    On a multi-core processor, one could also have a STEAL instruction, too. :cool:
  • Heater. Posts: 21,230
    whicker,
    Technically, MOVE instructions should actually be called COPY, because the source of the data is not affected.
    How true that is.

    I love these semantic issues.

    When I move myself across the room nobody expects a valid working copy of Heater to be left behind in the place I started from. No, that place is full of air. (Hmm...some would say that is the same thing)

    But if I'm swimming, the place that I started from gets replaced with water.

    Or, should I ever make it to outer space, the place I start from gets replaced with vacuum.

    So, a MOV instruction should of course put the source data into the destination. But after that the source should no longer be considered valid. The now empty space that something was moved out of could be full of any old junk.

    Now, if we are talking multi-core processors we have a problem. A move becomes a transaction between one machine and another. We have to be sure that, whatever it is, is moved from one machine to the other in a fail safe way.

  • Heater. wrote: »
    KeithE,
    Are there "really great ISA verification tools" available for any instruction set design, commercial or otherwise?

    Based on what I've read I believe that there are, but yes, it's difficult. Here's something extracted from a current Apple job listing that is along the lines of what I'm thinking for non-formal testing:
    As a random testing engineer for the processor, you will have the following responsibilities:
    § Work closely with verification team in creating random test templates and providing random tests in assembly
    § Improve existing random test generators or write new test generators to improve test coverage for various architecture corner cases in debug features, exceptions, error handling, multiprocessor, and memory management.
    § Develop architecture coverage to direct the test generation.
    § Verify some components in a processor design.
  • Heater. Posts: 21,230
    I don't believe it.

    Thing is, if you want to test a black box, like a CPU, you can stick inputs in and see what comes out. Already the task is impossible, because there are so many possible instructions and operands to input.

    But a CPU has a ton of internal state. So the same input may produce a different output every time.

    The internal state is huge. We start with the 32 by 32-bit register file. That's already 2^(32 × 32) = 2^1024 states you have to test against!

    That's before we consider pipelines and so on.

    Exhaustive black box testing cannot be done before the end of the universe.

    OK, what about "white box" testing, where you know the logic that is inside? Well, then you have the problem of creating a formal proof that whatever is inside does what you want.

    I suspect this is also impossible.

    That leaves us with the Apple approach you hinted at above: shoot random inputs into the thing and check that it always does the same as some other system we trust.

    This is in no way a "formal verification" of anything, in the mathematical sense.

  • Yes, but we can reasonably write tests for the important use cases and expected instruction outcomes.

    Just doing that is a big leg up. We may find state issues, and need to avoid them or do things differently, but the chance of those being a big problem on the intended use cases is sharply reduced.

  • jmg Posts: 15,140
    Heater. wrote: »
    OK, what about "white box" testing, where you know the logic that is inside? Well, then you have the problem of creating a formal proof that whatever is inside does what you want.

    I suspect this is also impossible.
    That's not impossible, but such 'expected outputs' testing has real dangers, in that it can miss unexpected combinations.
    I.e. the formal proof is less about "proof that whatever is inside does what you want" and more about proving that whatever is inside does not do what you do not want.

    If you look at many of the chip errata out there, a good number of errors come from tests simply not covering enough.
    (I've seen a recent one where a large and experienced chip maker managed to break the stop-bits setting on a UART.
    I imagine the testing went something like: run a loop-back and change all the modes; yup, all pass.
    Even a logic probe with UART protocol extraction would have given the right answer.
    However, 30 seconds on a standard scope would have revealed no physical change in the stop bits, as would careful checking of message times.)

    Thus I can see the appeal of a random generation system that sits between the hopelessly complex 'black box' test and the operator-risk-prone manually generated tests.
    It can have even a < 1% efficiency, as it can run 24 hours a day and self-check the results.
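    A minimal sketch of that self-checking idea (dut_out and ref_out are hypothetical; a trusted reference model is assumed to exist):

        // Drive the same random stimulus into the DUT and a trusted
        // reference model, and flag any divergence between them.
        reg [31:0] stim;
        always @(posedge clk) begin
            stim <= $random;                 // new random input each cycle
            if (dut_out !== ref_out) begin   // compare DUT vs reference
                $display("MISMATCH at %0t: dut=%h ref=%h",
                         $time, dut_out, ref_out);
                $stop;
            end
        end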

  • Heater. Posts: 21,230
    True enough, Spud.

    That is far away from any formal proof that the thing always works as expected.
    We may find state issues, and need to avoid them or do things differently,...
    Do you recall the famous "FDIV" bug? It was an error in some versions of Intel's Pentium that produced incorrect floating point results.

    Intel guessed that this problem would be hit so rarely nobody would notice. They sold the chip anyway. How wrong they were.

    Years before that, I had a document under NDA from Intel describing all the known bugs in the then-new 286 chip. Mostly to do with busting the memory protection.

    But one of those bugs, which we had discovered, was mind-bending. If you happened to do an integer multiply by an immediate value that was negative, you got totally the wrong answer!

    The workaround: everyone was using MSDOS at the time, so memory protection was not an issue. And Microsoft's tools did not emit multiply-by-immediate instructions!

  • I know that Jasper has been used for formal checking of CPUs, but I don't know at what level. I've seen it used where I worked for things like verifying pin multiplexing, where it was effective. Much easier than checking via functional tests - you know what a mess pin muxing can be in SoCs.

    There are some presentations about the use of Jasper at ARM.
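    For the pin-mux case, the properties can be as simple as one assertion per select value; a hypothetical sketch (all signal names invented):

        // When the mux select chooses the UART function, the pad
        // output must follow the UART transmit signal.
        assert property (@(posedge clk)
            pinmux_sel == SEL_UART |-> pad_out == uart_tx);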
  • I found this post by Andreas Olofsson, CEO/Founder of Adapteva, commenting on some of the RISC V verification - the interesting part is the GitHub link at the end:
    My 2 cents... neither risc-torture nor csmith is sufficient for use in product silicon (not random enough, with clear coverage holes). This is especially critical for super-optimized micro-architectures and custom RTL implementations, where corner cases tend to pop up (despite the risc-v architecture being clean and orthogonal). Apologies if I misunderstood the torture code, but it doesn't seem like it's completely random, but instead based on predetermined sequences? This might be OK for a simple pipeline, but as soon as you start having deep pipelines and exceptions, nasty bugs will pop up that are sequence dependent (and demand complete randomization).

    In my opinion (based on many product tapeouts) the probability of building bug-free silicon without true random verification is VERY low. You can get to something that is pretty clean without having high-quality DV, and there might not even be any "killer bugs", but the support burden associated with a lengthy errata sheet of annoying little bugs could easily kill the profitability and success of the chip.

    Having a high-quality random DV environment should be a priority for building trust in the risc-v architecture and associated RTL implementations.

    Here's a pointer to what we did for Epiphany (open source GPL DV code based on SystemC libraries). Not sure if it's useful to anyone, but I wanted to point to it in case anyone is interested in building a random test generator for risc-v. The basic idea is to create legal tests that completely randomize EVERYTHING (regs, values, memory locations, instructions, sequence, options, etc.) and use machine power to get the coverage. This approach has served the industry well for 20 years with tools like Specman, SystemVerilog, etc. We didn't have the money for a lot of tools back in 2008, so we went the open source route based on SystemC libraries.

    https://github.com/adapteva/epiphany-dv

    I can't really understand what Bluespec has for RISC V, but it may be interesting.
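    As a toy illustration of the "randomize EVERYTHING" idea above, a generator can start by randomizing every field of a legal instruction word; here's a sketch for a RISC V R-type ADD (register fields only, nothing like a full generator):

        // Assemble an ADD with random rd/rs1/rs2 register fields:
        // funct7 | rs2 | rs1 | funct3 | rd | opcode
        function [31:0] random_add;
            reg [4:0] rd, rs1, rs2;
            begin
                rd  = $random;
                rs1 = $random;
                rs2 = $random;
                random_add = {7'b0000000, rs2, rs1, 3'b000, rd, 7'b0110011};
            end
        endfunction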
  • Heater. Posts: 21,230
    edited 2017-04-10 22:23
    It lives!

    My RISC V core on the DE0-Nano is now running a "Hello world" C program, outputting via my UART module, updated with a picorv32 bus interface.

    helloworld3.jpg

    That text also shows up nicely in the SimpleIDE terminal window.

    Now to get it to output Chip's xoroshiro128+ shuffled random numbers for some statistical testing on the PC....

    The new UART:

    //
    // Design Name : uartTx
    // File Name   : uartTx.v
    // Function    : Simple UART transmitter.
    //               115200 baud when driven from 50MHz clock.
    //               Single byte buffer for concurrent loading while transmitting.
    // Coder       : Heater.
    //
    
    module uartTx  #(
        parameter [31:0] BAUD_DIVIDER = 1301
    ) (
        // Bus interface
        input  wire        clk,
        input  wire        resetn,
        input  wire        enable,
        input  wire        mem_valid,
        output wire        mem_ready,
        input  wire        mem_instr,
        input  wire [3:0]  mem_wstrb,
        input  wire [31:0] mem_wdata,
        input  wire [31:0] mem_addr,
        output wire [31:0] mem_rdata,
    
        // Serial interface
        output reg  serialOut     // The serial output.
    );
        // Internal Variables
        reg [7:0]  shifter;
        reg [7:0]  buffer;
        reg [7:0]  state;
        reg [3:0]  bitCount;
        reg [19:0] bitTimer;
        reg        bufferEmpty;          // TRUE when ready to accept next character.
        reg        rdy;
    
        // UART TX Logic
        always @ (posedge clk or negedge resetn) begin
            if (!resetn) begin
                state       <= 0;
                buffer      <= 0;
                bufferEmpty <= 1;
                shifter     <= 0;
                serialOut   <= 1;
                bitCount    <= 0;
                bitTimer    <= 0;
                rdy         <= 0;
            end else begin
                if (mem_valid & enable) begin
                    if  ((mem_wstrb[0] == 1) && (bufferEmpty == 1)) begin
                        buffer <= mem_wdata;
                        bufferEmpty <= 0;
                end
                    rdy <= 1;
                end else begin
                    rdy <= 0;
                end
    
                // Generate bit clock timer for 115200 baud from 50MHz clock
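            // Note: bitTimer wraps every (BAUD_DIVIDER + 1) clocks,
            // so the bit rate is clk / (BAUD_DIVIDER + 1).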
                bitTimer <= bitTimer + 1;
                if (bitTimer == BAUD_DIVIDER) begin
                    bitTimer <= 0;
                end
    
                if (bitTimer == 0) begin
                    case (state)
                        // Idle
                        0 : begin
                            if (bufferEmpty == 0) begin
                                shifter <= buffer;
                                bufferEmpty <= 1;
                                bitCount <= 8;
    
                                // Start bit
                                serialOut <= 0;
                                state <= 1;
                            end
                        end
    
                        // Transmitting
                        1 : begin
                            if (bitCount > 0) begin
                                // Data bits
                                serialOut <= shifter[0];
                                bitCount <= bitCount - 1;
                                shifter <= shifter >> 1;
                            end else begin
                                // Stop bit
                                serialOut <= 1;
                                state <= 0;
                            end
                        end
    
                        default : ;
                    endcase
                end
            end
        end
    
        // Tri-state the bus outputs.
        assign mem_rdata = enable ? bufferEmpty : 'bz;
        assign mem_ready = enable ? rdy : 'bz;
    
    endmodule
    

    Which is here: https://github.com/ZiCog/xoro

  • Wow! That's amazing. How long did it take to compile using Quartus?
  • Heater. Posts: 21,230
    edited 2017-04-10 21:52
    That build takes almost exactly one minute under Quartus.

    Mind you, I don't do much Quartus building. Everything gets simulated with Icarus first, which takes only the blink of an eye to build and some seconds to simulate and check the results. Over many iterations, as I'm a kind of "trial and error" Verilog programmer.

    Of course this is a very small design. The picorv32 and my little peripherals are only 2704 logic elements, 12% of the DE0-Nano. I can imagine builds take a lot longer when you try to fill it up.

    By the way, I moved the UART TX pin to the same position Chip uses for the Prop Plug on his bare DE0 Nano P2 builds. For convenience.

    Below is the test bench I used with Icarus to test the UART TX. It looks long-winded, but most of it is boilerplate that can be cut and pasted into other tests for peripheral devices.
    //
    // Test bench for uartTx.v
    //
    // Inspired by tutorial here: http://www.asic-world.com/verilog/art_testbench_writing.html
    //
    
    module uart_tb;
    
        // Define input signals (reg)
        reg        clk;
        reg        resetn;
        reg        enable;
        reg        mem_valid;
        reg        mem_instr;
        reg [3:0]  mem_wstrb;
        reg [31:0] mem_wdata;
        reg [31:0] mem_addr;
    
        // Define output signals (wire)
        wire        mem_ready;
        wire [31:0] mem_rdata;
        wire        serialOut;    // The serial output.
    
        // UART TX empty status
        reg notEmpty = 1;
    
        // Instantiate DUT.
        defparam uartTx.BAUD_DIVIDER = 10;
        uartTx uartTx (
            .clk(clk),
            .resetn(resetn),
            .enable(enable),
            .mem_valid(mem_valid),
            .mem_instr(mem_instr),
            .mem_wstrb(mem_wstrb),
            .mem_wdata(mem_wdata),
            .mem_addr(mem_addr),
            .mem_ready(mem_ready),
            .mem_rdata(mem_rdata),
            .serialOut(serialOut)
        );
    
        // Initialize all inputs
        initial
        begin
            clk = 0;
            resetn = 0;
            enable = 0;
            mem_valid = 0;
            mem_instr = 0;
            mem_wstrb = 0;
            mem_wdata = 0;
            mem_addr = 0;
        end
    
        // Specify file for waveform dump
        initial  begin
            $dumpfile ("uartTx_tb.vcd");
            $dumpvars;
        end
    
        // Monitor all signals
        initial  begin
            $display("\tclk,\tresetn,\tenable,\tmem_valid,\tmem_ready,\tmem_instr,\tmem_addr,\tmem_wstrb,\tmem_wdata,\tmem_rdata,\tserialOut");
            $monitor("\t%b, \t%b, \t%b, \t%b, \t\t%b, \t\t%b, \t\t%h, \t%b\t\t%h\t%h\t%b", clk,  resetn,  enable,  mem_valid,  mem_ready,  mem_instr,  mem_addr,  mem_wstrb,  mem_wdata,  mem_rdata, serialOut);
        end
    
        // Generate a clock tick
        always
            #5 clk = !clk;
    
        // Generate a reset on start up
        event reset_trigger;
        event reset_done_trigger;
    
        initial begin
            forever begin
                @ (reset_trigger);
                @ (negedge clk);
                resetn = 0;
                @ (negedge clk);
                resetn = 1;
                -> reset_done_trigger;
            end
        end
    
        // Write 32 bit word to some address on the picorv32 bus
        task writeBus32;
            input [31:0] address;
            input [31:0] data;
    
            begin
                @(posedge clk);
                enable = 1;
                mem_valid = 1;
                mem_instr = 0;
                mem_wstrb = 4'b1111;
                mem_wdata = data;
                mem_addr = address;
                @(posedge clk);
                enable = 0;
                mem_valid = 0;
                mem_instr = 0;
                mem_wstrb = 0;
                mem_wdata = 0;
                mem_addr = 0;
            end
        endtask
    
        // Read a 32 bit word from some address on the picorv32 bus
        task readBus32;
            input  [31:0] address;
            output [31:0] data;
            begin
                @(posedge clk);
                enable = 1;
                mem_valid = 1;
                mem_instr = 0;
                mem_wstrb = 4'b0000;
                mem_addr = address;
                @(posedge clk);
                enable = 0;
                mem_valid = 0;
                mem_instr = 0;
                mem_wstrb = 0;
                mem_wdata = 0;
                mem_addr = 0;
                data = mem_rdata;
            end
        endtask
    
        reg bufferEmpty = 0;
    
        initial
        begin: TEST_CASE
            #10 -> reset_trigger;
            @ (reset_done_trigger);
            $write("reset done\n");
    
            $write("Transmit first char.\n");
            writeBus32(32'hffff0040, 8'haa);
    
            while (1) begin
                readBus32(32'hffff0040, bufferEmpty);
                while (bufferEmpty != 1) begin
                    readBus32(32'hffff0040, bufferEmpty);
                end
                $write("Transmit following char.\n");
                writeBus32(32'hffff0040, 8'h55);
            end
        end
    endmodule
     
    
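    To run the bench with Icarus and view the wave dump, something along these lines should do it (the VCD file name comes from the $dumpfile call above; note the bench loops forever, so kill it with Ctrl-C):

        iverilog -o uartTx_tb.vvp uartTx.v uartTx_tb.v
        vvp uartTx_tb.vvp
        gtkwave uartTx_tb.vcd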
  • Heater. Posts: 21,230
    Oh yeah, David, sounds like you are all fired up to do some FPGA hacking. Go for it.

    I'm wondering how to integrate Chip's P1 verilog into this....
  • jmg Posts: 15,140
    edited 2017-04-10 22:51
    I fed the source
    https://github.com/cliffordwolf/picorv32/blob/master/picorv32.v

    into the Lattice iCE40 tools, and it spins its wheels for ~580 report lines.
    It gives 67 @W lines and 98 @N lines; some snippets below...
    Line 24:  Verilog syntax check successful!
    Line216: Process completed successfully.
    Line219: Process completed successfully.
    ..Finished Pre Mapping Phase.
    Line 293:
    Clock Summary
    *****************
    
    Start                    Requested     Requested     Clock        Clock                     Clock
    Clock                    Frequency     Period        Type         Group                     Load 
    -------------------------------------------------------------------------------------------------
    picorv32_wb|wb_clk_i     78.3 MHz      12.771        inferred     Autoconstr_clkgroup_0     750  
    ....
    Pre-mapping successful!
    
    ###########################################################]
    Map & Optimize Report
    
    Synopsys Lattice Technology Mapper, Version maplat, Build 1612R, Built Dec  5 2016 10:31:39
    ....
    
    Line534
    @N: FX1016 :"c:\lattice\picorv\picorv32.v":2586:7:2586:14|SB_GB_IO inserted on the port wb_clk_i.
    @N: FX1016 :"c:\lattice\picorv\picorv32.v":2585:7:2585:14|SB_GB_IO inserted on the port wb_rst_i.
    @N: FX1017 :|SB_GB inserted on the net wbm_adr_o_0_sqmuxa_0.
    @N: FX1017 :|SB_GB inserted on the net picorv32_core.pcpi_insn4.
    @N: FX1017 :|SB_GB inserted on the net picorv32_core.cpu_state_0[7].
    @N: FX1017 :|SB_GB inserted on the net N_3618.
    @N: FX1017 :|SB_GB inserted on the net picorv32_core.N_1613.
    
    @S |Clock Optimization Summary
    
    #### START OF CLOCK OPTIMIZATION REPORT #####[
    
    1 non-gated/non-generated clock tree(s) driving 812 clock pin(s) of sequential element(s)
    0 gated/generated clock tree(s) driving 0 clock pin(s) of sequential element(s)
    0 instances converted, 0 sequential instances remain driven by gated/generated clocks
    
    ============================== Non-Gated/Non-Generated Clocks 
    Clock Tree ID     Driving Element         Drive Element Type     Fanout     Sample Instance 
    --------------------------------------------------------------------------------------------
    @K:CKID0001       wb_clk_i_ibuf_gb_io     SB_GB_IO               812        wbm_adr_o_esr[2]
    
    
    ##### END OF CLOCK OPTIMIZATION REPORT ######]
    ...
    Writing EDIF Netlist and constraint files
    @N: BW103 |The default time unit for the Synopsys Constraint File (SDC or FDC) is 1ns.
    @N: BW107 |Synopsys Constraint File capacitance units using default value of 1pF 
    @N: FX1056 |Writing EDF file: C:\Lattice\PicoRV\Chk_PicoRV32\Chk_PicoRV32_Implmnt\Chk_PicoRV32.edf
    @W: FX708 |Found invalid parameter 0 
    @W: FX708 |Found invalid parameter 0 
    @W: FX708 |Found invalid parameter 0 
    @W: FX708 |Found invalid parameter 0 
    
    Hmm, no mention of which file, or which step, coughed on 'Found invalid parameter 0' ?!


    Addit: Changed to a larger Lattice device, no change.