RISC V ?

KeithE · 2017-04-13 23:05

Heater - since you mentioned your previous incarnation with multiprocessor systems, this RISC V item that Google just chose for me could be of interest:

https://m.phys.org/news/2017-04-tool-architectures-reveals-flaws-emerging.html

The researchers, testing a technique they created for analyzing computer memory use, found over 100 errors involving incorrect orderings in the storage and retrieval of information from memory in variations of the RISC-V processor architecture. The researchers warned that, if uncorrected, the problems could cause errors in software running on RISC-V chips. Officials at the RISC-V Foundation said the errors would not affect most versions of RISC-V but would have caused problems for higher-performance systems.

Heater. · 2017-04-13 23:22

KeithE,

I have had a long day and am a bit too tired to follow all that.

But I think you are being paranoid.

The intention of my code is:

1) Make an SCLK out of my system clock(100MHz) by dividing by 33.

2) On the rising edge of that SCLK do something.

3) On the falling edge of that SCLK do something else.

Yes, SCLK edges will lag a bit behind clk edges. Who cares?

Now, I thought about having separate always blocks waiting on posedge and negedge of SCLK:

always @ (posedge SCLK) begin
    // ... da da da
    rdData[bitCount] <= MISO;
    // ... da da da
end

always @ (negedge SCLK) begin
    // ... da da da
    MOSI <= wrData[bitCount];
    // ... da da da
end

But that led me to updating "state" from two different always blocks, which violates all the guidelines I have ever seen.
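
One common way to keep "state" in a single always block is to stay in the system clock domain and generate one-cycle strobes for the SCLK edges. The following is a minimal sketch under that assumption, not Heater's actual code; the module name, the `rst` input and the `sclk_rise`/`sclk_fall` strobes are hypothetical.

```verilog
// Sketch only: names are hypothetical, not taken from the project.
module sclk_div #(
    parameter DIV = 33               // system clocks per SCLK period
) (
    input  wire clk,                 // system clock (100 MHz in the post)
    input  wire rst,                 // synchronous reset, active high
    output reg  sclk,                // divided clock, driven from a flop
    output reg  sclk_rise,           // one clk cycle wide, as sclk goes high
    output reg  sclk_fall            // one clk cycle wide, as sclk goes low
);
    localparam HALF = DIV / 2;       // 16 when DIV is 33

    reg [$clog2(DIV)-1:0] count;

    always @(posedge clk) begin
        // Strobes default to 0 and are set for a single cycle below.
        sclk_rise <= 1'b0;
        sclk_fall <= 1'b0;
        if (rst) begin
            count <= 0;
            sclk  <= 1'b0;
        end else begin
            count <= (count == DIV - 1) ? 0 : count + 1'b1;
            if (count == DIV - 1) begin
                sclk      <= 1'b1;   // high for HALF cycles
                sclk_rise <= 1'b1;
            end else if (count == HALF - 1) begin
                sclk      <= 1'b0;   // low for DIV - HALF cycles
                sclk_fall <= 1'b1;
            end
        end
    end
endmodule
```

With strobes like these, the MISO sample (guarded by `sclk_rise`) and the MOSI update (guarded by `sclk_fall`) can both sit in one `always @(posedge clk)` block, so every register is written from one place. Because 33 is odd, the duty cycle in this sketch is 16/33 rather than exactly 50%.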

Next up, I'd like to talk about state machines and "case" statements. I was wondering why people use "one hot" encoding of their cases rather than my naive simple numbering. That led me to a paper that made simple state machines look so complex I don't ever want to go there.

I have to get back to your "truly asynchronous" thing after a snooze.

Thanks again.

KeithE · 2017-04-14 00:23

Heater. wrote: »

But just this once I attach a ZIP of the firmware directory for you.

So the Sieve of Eratosthenes code is enabled in there, and executes before the helloWorld stuff? I ran a sim for a while, but only saw a single 0x20 written to the UART.

KeithE · 2017-04-14 00:24

Why all of the da da da? Can it just be one flop?

always @ (negedge SCLK) begin
    MOSI <= something_calculated_based_on_the_posedge;
end

Since you're only playing with FPGAs it's fine to experiment. I usually draw some stuff on an actual piece of paper when it involves so few signals.

Regarding one-hot. I guess a lot of people figure there are tons of flops in an FPGA and they visualize an efficient implementation based on it. Some synthesis tools are smart enough to recode this stuff. I don't know that I have the best state machine style, so don't want to write too much on this topic. But I don't typically use a one-hot style ;-)
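
For readers unfamiliar with the two styles, here is a small side-by-side sketch. It is only an illustration with made-up state names and signals (`go`, `done`, `busy_*` are hypothetical), and as noted above a synthesis tool may re-encode either version anyway.

```verilog
// Sketch only: the same 3-state machine written with two encodings.
module fsm_encoding_sketch (
    input  wire clk,
    input  wire rst,
    input  wire go,
    input  wire done,
    output wire busy_bin,
    output wire busy_hot
);
    // Simple numbering: 3 states fit in 2 flops.
    localparam [1:0] B_IDLE = 2'd0, B_RUN = 2'd1, B_STOP = 2'd2;
    // One-hot: one flop per state, exactly one bit set at a time.
    localparam [2:0] H_IDLE = 3'b001, H_RUN = 3'b010, H_STOP = 3'b100;

    reg [1:0] sb = B_IDLE;
    reg [2:0] sh = H_IDLE;

    always @(posedge clk) begin
        if (rst) begin
            sb <= B_IDLE;
            sh <= H_IDLE;
        end else begin
            case (sb)
                B_IDLE:  if (go)   sb <= B_RUN;
                B_RUN:   if (done) sb <= B_STOP;
                default:           sb <= B_IDLE;
            endcase
            case (sh)
                H_IDLE:  if (go)   sh <= H_RUN;
                H_RUN:   if (done) sh <= H_STOP;
                default:           sh <= H_IDLE;
            endcase
        end
    end

    assign busy_bin = (sb == B_RUN);  // decode needs a 2-bit compare
    assign busy_hot = sh[1];          // decode is a single flop output
endmodule
```

The trade is more flops for simpler decode logic, which is why one-hot is often seen as a natural fit for flop-rich FPGAs.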

KeithE · 2017-04-14 00:35

Talk about long days - I've been working on my taxes. In other countries (e.g. Spain) I heard that the government does your taxes for you, and you just check them for correctness if you feel like it. Here an entire industry of software and CPAs is created by the insanity. (For a lot of people it's pretty simple, but in a way that's bad news because it means that they don't get to participate in employee stock purchase plans, get stock options or restricted stock units, have much in the way of investments,...)

Heater. · 2017-04-14 01:03

KeithE,

Oh poo, you are right, the sieve is still enabled in my latest version. I was just playing. It did once print its results to my SimpleIDE terminal.

I have to clean up my act in the firmware department. That will have to wait till I have figured out how to get real C programs, with a main() and using newlib, to work.

The "da da da" is all about counting bits that have been clocked out and changing state etc.

As for the one-hot thing, I found a paper by Clifford Cummings discussing state machines. He made it so horrendously complicated! I could not follow.

Like I said, I have no feel for what actual logic my Verilog generates just now. When I write C code or whatever I can get a dump of the compiled machine instructions. One starts to realize what a compiler can optimize and what it cannot. The best I have now is the schematic produced by the Quartus RTL viewer.

KeithE · 2017-04-14 01:12

Heater. wrote: »

KeithE,
The "da da da" is all about counting bits that have been clocked out and changing state etc.

If you draw just a little bit of a waveform on paper, can you create it just fine only using the posedge of SCLK, except for the MOSI edge coming out early? If so, then you just need to reclock MOSI with the falling edge of SCLK. Then all of the counting etcetera can be in one posedge block.
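
A minimal sketch of that idea, assuming MSB-first 8-bit transfers: everything is computed on the rising edge, and a single flop re-times MOSI on the falling edge. The module name and `mosiNext` are hypothetical, and start-up is not handled (the first bit would have to be on MOSI before the first rising edge).

```verilog
// Sketch only: all counting in one posedge block, MOSI reclocked on negedge.
module mosi_reclock_sketch (
    input  wire       SCLK,
    input  wire       MISO,
    input  wire [7:0] wrData,
    output reg  [7:0] rdData,
    output reg        MOSI
);
    reg [2:0] bitCount = 3'd7;   // MSB first
    reg       mosiNext = 1'b0;   // "early" MOSI, valid just after posedge

    always @(posedge SCLK) begin
        rdData[bitCount] <= MISO;                    // sample the input
        mosiNext         <= wrData[bitCount - 3'd1]; // bit for the next cycle
        bitCount         <= bitCount - 3'd1;         // wraps 0 -> 7
    end

    // The only thing on the falling edge is this one flop.
    always @(negedge SCLK) begin
        MOSI <= mosiNext;
    end
endmodule
```

Each register is still written from exactly one always block, so the "state updated from two blocks" problem does not arise.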

Edited to add: I wonder if I can get the tools up on a Linux Mint laptop until I try again with the Pi. I don't want to do much on my main computer until taxes are filed.

ersmith · 2017-04-14 01:15

Heater:

I think there's an embedded toolchain (w. newlib) for RISC-V available from www.sifive.com. They also have RISC-V development boards (with a real RISC-V ASIC, not an FPGA).

Heater. · 2017-04-14 01:25

KeithE,

If you draw just a little bit of a waveform on paper, can you create it just fine only using the posedge of SCLK, except for the MOSI edge coming out early? If so, then you just need to reclock MOSI with the falling edge of SCLK. Then all of the counting etcetera can be in one posedge block.

That is not making sense to me just now. I have to sleep on it.

KeithE · 2017-04-14 01:28

BTW - maybe RISC V trivia, but I asked and found out what extension N means. Table 3.2 in https://people.eecs.berkeley.edu/~krste/papers/riscv-privileged-v1.9.1.pdf defines this as "User-level interrupts supported".

Heater. · 2017-04-14 01:38

Eric,

Yeah, I have the tool chain built and ready to go here. I just have to get my makefile sorted out for building "real" programs. Makefiles always give me a migraine!

I might just have to get me a HiFive board. Thanks for the heads up on sifive.com.

But with a RISC V frozen in silicon I don't get to hack on my own custom peripherals in Verilog.

potatohead · 2017-04-14 01:46

Yeah you do. Just share a clock and wire the FPGA to it.

KeithE · 2017-04-14 01:51

Regarding the HiFive1 - I assume that they have some sort of debugger interface? That's one thing about the picorv32 - don't you have to build the FPGA to get new software loaded? I guess you could put some off-chip non-volatile storage. Is there some sort of monitor program already written? Or some way to interface a hardware debugger?

Heater. · 2017-04-14 02:15

potatohead,

Yeah you do. Just share a clock and wire the FPGA to it.

True enough.

But I already have a RISC V core in my FPGA for free. I save 50 bucks!

Heater. · 2017-04-14 02:34

KeithE,

That's one thing about the picorv32 - don't you have to build the FPGA to get new software loaded?

I have started to wonder about that on my own humble project.

If I ever get past getting the core and my little peripherals working I don't want to have Quartus wasting my life resynthesizing the whole thing just to get application code in there.

Clearly it needs a boot loader built in that fetches application code from elsewhere. EEPROM, or SD, or whatever.

Perhaps from a serial link, like the Propeller.

That is a bit in the future for xoro I think.

KeithE · 2017-04-14 02:37

I just wanted to make sure that I wasn't missing anything.

Some discussion of the SiFive boot loader here - https://forums.sifive.com/t/otp-bootloader-source/214

Edited to add: And https://forums.sifive.com/t/do-you-want-to-try-running-forth-code-on-your-hifive1/479

Heater. · 2017-04-14 02:47

I have to order one of those boards. They are gutsy enough to do that.

KeithE · 2017-04-14 03:06

ersmith wrote: »

Heater:

I think there's an embedded toolchain (w. newlib) for RISC-V available from www.sifive.com. They also have RISC-V development boards (with a real RISC-V ASIC, not an FPGA).

Well there is this https://static.dev.sifive.com/bsp/arduino/package_sifive_index.json

So you can easily get the tools for the Arduino IDE.

KeithE · 2017-04-14 03:24

Heater. wrote: »

I have to order one of those boards. They are gutsy enough to do that.

OTP contents are documented here https://dev.sifive.com/hifive1/hifive1-getting-started-guide/

Looks like they kept things pretty simple. (Compared to some schemes proposed on these forums for flexibility.)

Heater. · 2017-04-14 03:34

That is awesome.

I notice they have a USB/JTAG chip on the board that is vastly bigger than the RISC V SoC itself. Why, for goodness' sake!

Ale · 2017-04-14 05:27

SO much useful info...
I wanted to clarify some issues regarding clocks and some experience with SPI I gained in the last few days... KeithE said everything I wanted to say and more...

Regarding SPI: Once upon a time I did a processor core in Verilog; it could only execute from SPI RAM (FRAM was my target). I wrote a model of an SPI FRAM to test it. That is the way I saw some people do it at work too, the ones that know how it is really done. But I only simulated the code with Icarus Verilog. I am sure it has every possible mistake in it that you can make in Verilog, as I was at the beginning of the learning curve... I may have moved along a bit now...

On this other project, I wanted to drive a DOGM132 and a SSD1306-based graphic display.

I used the following code:

assign disp_sck_o = clock_active ? ~clk_in:1'b0;
assign disp_cs_n_o = ~ss;

Yes, I gated the clock... bad. The result? There is a narrow spike when clock_active is set. I had a logic analyzer trace to show, but the program (the logic analyzer environment) ate it.

I replaced it with a shorter version:

assign disp_sck_o = (clock_active & clk_in);

That works. The target is a MachXO2-7000ZE. But it is not good, as the ss signal falls at about the same time as sck rises, because the clock is not inverted.

There is a reason why the SPI clock always seems to be half of the input clock: you enable a flip-flop instead of gating the signal, so it should be glitch-free (spike-free).
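
As an illustration of the flip-flop approach, here is a minimal sketch that reuses the signal names from the snippets above. It is an assumption about how such a clock could be generated, not the code from the project, and the module name is hypothetical.

```verilog
// Sketch only: SPI clock produced by a toggling flop instead of a gate.
module sck_toggle_sketch (
    input  wire clk_in,
    input  wire clock_active,        // enable, as in the post above
    output reg  disp_sck_o = 1'b0
);
    always @(posedge clk_in) begin
        if (clock_active)
            disp_sck_o <= ~disp_sck_o;  // toggles every clk_in edge: clk_in / 2
        else
            disp_sck_o <= 1'b0;         // idle low when not active
    end
endmodule
```

Because the output comes straight from a register, it only changes on a clk_in edge, so a change on clock_active cannot produce a runt pulse the way a combinational gate can.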

jmg · 2017-04-14 06:37

Heater. wrote: »

Here is the SPI driver so far, do let me know if it is crappy, actually I have never bit banged a SPI device before so the whole idea might be wrong :

SPI design is a whole topic in itself.

The very simplest is just a master shift register with parallel/serial modes.
That's usually enough to talk a few bytes of data to a slave peripheral.

However, this simplest design cannot write while shifting, you must add a buffer if you want to avoid dead-zones.
Slave SPI complicates things further, as you have no control of clock or phase.
Smallest MCUs are byte-only slaves, better ones have FIFOs allowing multiple bytes as slaves.
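
The "very simplest" case described above might look like the following sketch. The `load` and `shift` strobes and the module name are hypothetical, and for brevity MISO is sampled on the same strobe that shifts, whereas a real design would sample on one SCLK edge and shift on the other.

```verilog
// Sketch only: one shift register with a parallel mode and a serial mode.
module spi_shift_sketch (
    input  wire       clk,
    input  wire       load,      // parallel mode: capture wrData
    input  wire       shift,     // serial mode: one strobe per bit
    input  wire       MISO,
    input  wire [7:0] wrData,
    output wire       MOSI,
    output wire [7:0] rdData
);
    reg [7:0] sr = 8'h00;

    always @(posedge clk) begin
        if (load)
            sr <= wrData;              // overwrites any transfer in progress
        else if (shift)
            sr <= {sr[6:0], MISO};     // MSB out, new bit in at the LSB
    end

    assign MOSI   = sr[7];             // current MSB drives the pin
    assign rdData = sr;                // valid after 8 shifts
endmodule
```

The `load` branch shows the limitation mentioned above: writing while a transfer is in flight clobbers the register, which is why a separate buffer is needed to avoid dead zones.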

Next is the detail of do you want 'gapless SPI' ? That means no added clocks between byte-byte, and many small MCUs fail on this one.
This tends to need a FIFO and some care in the next-value load, as you must both update the pin, and fetch the next FIFO element on the same clock.

Next is fine control of the Number of bits and the clock speed... as in the smart pins.

I also like the idea of a SPI peripheral, that can manage JTAG io too..
( I don't think the smart pins can do this ?)

cgracey · 2017-04-14 06:41

jmg wrote: »

Heater. wrote: »

Here is the SPI driver so far, do let me know if it is crappy, actually I have never bit banged a SPI device before so the whole idea might be wrong :

SPI design is a whole topic in itself.

The very simplest is just a master shift register with parallel/serial modes.
That's usually enough to talk a few bytes of data to a slave peripheral.

However, this simplest design cannot write while shifting, you must add a buffer if you want to avoid dead-zones.
Slave SPI complicates things further, as you have no control of clock or phase.
Smallest MCUs are byte-only slaves, better ones have FIFOs allowing multiple bytes as slaves.

Next is the detail of do you want 'gapless SPI' ? That means no added clocks between byte-byte, and many small MCUs fail on this one.
This tends to need a FIFO and some care in the next-value load, as you must both update the pin, and fetch the next FIFO element on the same clock.

In the smart pins, everything gets registered from the I/O pins and then analyzed. This is how SPI stuff must work. You can't just bring a clock in straight from a pin and use it to clock things. You'll have a world of synchronization pitfalls. Better to nip it in the bud and get all signals into your own clock domain and then look for changes. This means, of course, that you must be going at least twice as fast as the external clock coming in, in order to grab it and track its changes without missing anything. Maybe 3x faster, or more, is practical, since you need to accommodate external signal skew.
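
A generic sketch of that register-then-look-for-changes approach is shown below. It is not the smart pin implementation; the module and signal names are hypothetical, and the depth of the shift registers is an assumption.

```verilog
// Sketch only: bring external SPI pins into the local clock domain,
// then detect edges on the synchronized copy.
module spi_pin_sync_sketch (
    input  wire clk,         // internal clock, well above 2x the pin clock
    input  wire sck_pin,     // raw external SPI clock
    input  wire mosi_pin,    // raw external data
    output wire sck_rise,    // one-clk pulse per rising edge of sck_pin
    output wire sck_fall,    // one-clk pulse per falling edge of sck_pin
    output wire mosi_sync    // data as seen in the clk domain
);
    reg [2:0] sck_r  = 3'b000;
    reg [1:0] mosi_r = 2'b00;

    always @(posedge clk) begin
        sck_r  <= {sck_r[1:0], sck_pin};   // newest sample enters at bit 0
        mosi_r <= {mosi_r[0], mosi_pin};
    end

    // Compare the two oldest samples; bit 0 is left out of the compare
    // because it may still be settling.
    assign sck_rise  = (sck_r[2:1] == 2'b01);
    assign sck_fall  = (sck_r[2:1] == 2'b10);
    assign mosi_sync = mosi_r[1];
endmodule
```

Nothing downstream is clocked by the pin itself; the rest of the logic runs on `clk` and acts on the `sck_rise` and `sck_fall` pulses.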

jmg · 2017-04-14 07:03

cgracey wrote: »

In the smart pins, everything gets registered from the I/O pins and then analyzed. This is how SPI stuff must work. You can't just bring a clock in straight from a pin and use it to clock things. You'll have a world of synchronization pitfalls. Better to nip it in the bud and get all signals into your own clock domain and then look for changes. This means, of course, that you must be going at least twice as fast as the external clock coming in, in order to grab it and track its changes without missing anything. Maybe 3x faster, or more, is practical, since you need to accommodate external signal skew.

Yes, if Heater wants to support Slave as well, the above is exactly what I meant by Slave SPI complicates things further, as you have no control of clock or phase.
Master-only SPI, with gaps, is the simplest, but least flexible. In this case, 'small logic' may trump 'general purpose'.

jmg · 2017-04-14 07:09

Heater. wrote: »

I notice they have a USB/JTAG chip on the board that is vastly bigger than the RISC V SoC itself. Why, for goodness' sake!

Yup, the same on the Lattice ICE40UP5K-B-EVN.
The FT2232H is larger than the QFP48 FPGA, but Lattice added a large white arrow on the silkscreen in case you overlook their part.

cgracey · 2017-04-14 07:20

jmg wrote: »

cgracey wrote: »

In the smart pins, everything gets registered from the I/O pins and then analyzed. This is how SPI stuff must work. You can't just bring a clock in straight from a pin and use it to clock things. You'll have a world of synchronization pitfalls. Better to nip it in the bud and get all signals into your own clock domain and then look for changes. This means, of course, that you must be going at least twice as fast as the external clock coming in, in order to grab it and track its changes without missing anything. Maybe 3x faster, or more, is practical, since you need to accommodate external signal skew.

Yes, if Heater wants to support Slave as well, the above is exactly what I meant by Slave SPI complicates things further, as you have no control of clock or phase.
Master-only SPI, with gaps, is the simplest, but least flexible. In this case, 'small logic' may trump 'general purpose'.

This reminds me. All our synchronous modes use slave clocking. Rather than time our own output with our own clock, we read the clock coming back in, because that's what the other side is going to have to do, too. We can do some data read phasing by picking taps along a 2-3 bit delay, clocked by the system clock.

Ale · 2017-04-14 07:26

This reminds me. All our synchronous modes use slave clocking. Rather than time our own output with our own clock, we read the clock coming back in, because that's what the other side is going to have to do, too.

All that is something we miss in the P1, among other things, like the new all-powerful SERDES. Bitbanging is OK when you have many cores, but reading a clocked signal in is still a slow process. Conditional execution of opcodes helps too.

Ale · 2017-04-14 16:59

@Heater: I got your project working on the DE10-Lite (MAX10). The Fmax is 54 MHz, and no, it doesn't work at 100. That means the MAX10 is not a Cyclone IV but something else, though similar. I thought it would be at least as good....
The built-in Byte Blaster isn't recognized when I use my propplug; the one on the BeMicro didn't have such a problem.

Heater. · 2017-04-14 17:23

Great stuff. Shame about the Fmax. Actually I was surprised when I managed to get it to run at 100MHz.

KeithE · 2017-04-14 17:58

Heater - I had forgotten about an old MacBook with Linux Mint that was sitting unused in the office. So I installed everything on it since it's a lot faster than the Pi for experimentation. It turns out that picorv32 includes a Lattice IceStorm example in picorv32/scripts/icestorm. It seemed to work OK, but I had to delete an "-m32" from the Makefile because riscv32-unknown-elf-gcc didn't understand it.

Anyway, it seems like it would be easy to run your example through those tools and see what they have to say. You would need to disable the Altera PLL if you were targeting Lattice.
