RISC V ?

Heater. Posts: 19,815
edited April 2 in Propeller 2
Q: What's the quickest way to get GCC working for the P2 ?

A: Drop a RISC V processor core in there. There are fully functional RISC V compilers from GCC and Clang/LLVM. RISC V cores can be pretty small and there is a bunch of them available already, free for use, in Verilog and VHDL.

Well, I'm joking. Mostly. I don't expect such a crazy thing to happen and I certainly would not want such a thing holding up P2 progress.

However, I just did something that made my eyes pop. A little while ago Chip posted the Verilog for the new P2 PRNG. It seemed short and sweet, so I was inspired to install the Icarus Verilog simulator and see if I could learn just enough Verilog to check that its output was correct. Turned out to be pretty easy. I had just installed Quartus, waiting for the new P2 with PRNG release, so of course I had to see if I could get the PRNG test running on a real FPGA. Soon I had randomly flashing LEDs on my DE0 Nano. I joked that I would now proceed to design my own CPU.
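For anyone wanting to do the same kind of check, a software reference model is handy to compare against the simulator's output. The following is only a minimal sketch of one xoroshiro128+ step, assuming the 2016 reference constants (55, 14, 36); whether Chip's Verilog uses exactly these constants and this output width is an assumption, and the function names are made up for illustration.

```python
MASK64 = (1 << 64) - 1


def rotl64(x, k):
    """Rotate a 64-bit value left by k bits."""
    return ((x << k) | (x >> (64 - k))) & MASK64


def xoroshiro128plus_step(s0, s1, a=55, b=14, c=36):
    """Advance the generator one step; return (output, new_s0, new_s1).

    a, b, c default to the 2016 reference constants. That choice is an
    assumption about the hardware, not something taken from the P2 source.
    """
    result = (s0 + s1) & MASK64           # output is the sum of the two state words
    s1 ^= s0
    new_s0 = rotl64(s0, a) ^ s1 ^ ((s1 << b) & MASK64)
    new_s1 = rotl64(s1, c)
    return result, new_s0, new_s1


if __name__ == "__main__":
    # Dump a few outputs as hex for side-by-side comparison with a simulation log.
    s0, s1 = 1, 0
    for _ in range(4):
        out, s0, s1 = xoroshiro128plus_step(s0, s1)
        print(f"{out:016x}")
```

Seed the model with the same state the Verilog test bench uses, then compare the hex dumps line by line.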

I have not designed my own CPU. I found a ready made one. The picorv32 by Clifford Wolf : https://github.com/cliffordwolf/picorv32

The prospect of getting a CPU core working looked pretty daunting but after a while looking at what Clifford has there I realized it may not be impossible. I just cut and pasted his core into a Verilog project and started wrapping it around with memory and peripherals. I did not want all of Clifford's peripherals and buses and stuff. Too complicated for this humble beginner. Besides the challenge is to learn some Verilog so I wanted to make my own peripherals, as crude as they may be. Turns out that adding memory and GPIO to such a core is dead easy.

The end result is:

32 bit RISC V integer core with MUL and DIV running at 100MHz, about 25 MIPS.
12Kbytes RAM, using the memory on the Cyclone IV of the Nano.
GPIO port driving LEDs.
UART (Not quite done yet)
A PRNG port that serves up xoroshiro128+ random numbers.
It runs some "firmware" compiled with GCC for RISC V that just counts up on 8 LEDs. (The "Hello World" of embedded systems)

This all fits in 2600 Logic Elements, about 12% of the FPGA. Shrinks to 8% without MUL and DIV.

Where does this all lead?

I have no idea. Just having fun. I could stick 8 of those cores on there and make a RISC V "poor man's Propeller" :) There is the SDRAM to take into use. And peripherals like the DE0 Nano's ADC and accelerometer. Or perhaps what about replacing a COG from the open source P1 Verilog with a picorv32 core? Think I have a lot to learn for that one.

Anyway, if anyone is tempted to get their feet wet with FPGA and Verilog I highly recommend it. I suggest getting hold of the Icarus Verilog compiler/simulator. It makes experimenting very quick and easy. Rather than wait for the slow and ponderous Quartus to build anything. Bit like hacking code in Python or Javascript. Also it's easy to knock up quick test harnesses so you have some confidence your gadget will work. Without Icarus I would have given up in frustration ages ago.
http://iverilog.icarus.com/

Yeah, I know this is all off topic for a P2 forum. I was just so amazed at what is possible to do relatively easily nowadays that I had to tell someone. Besides, it's Chip's fault for kicking me down this Verilog road :) Thanks Chip. Oh and it does include Chip's P2 PRNG so it is in very small part a P2!

Rude and crude as it is the code is here:
https://github.com/ZiCog/xoro

I might get round to adding some documentation when I have simplified the way to build C code for it.

Comments

  • Rayman Posts: 8,336
    edited April 2
    I just bought a book "Programming FPGAs, Getting Started with Verilog".
    Might have been a mistake, we'll see...

    I'd like to make a custom P1V one day maybe...

    I'll have to remember this Verilog Simulator.
    Prop Info and Apps: http://www.rayslogic.com/
  • jmg Posts: 10,619
    Heater. wrote: »
    32 bit RISC V integer core with MUL and DIV running at 100MHz, about 25 MIPS.
    12Kbytes RAM, using the memory on the Cyclone IV of the Nano.
    GPIO port driving LEDs.
    UART (Not quite done yet)
    A PRNG port that serves up xoroshiro128+ random numbers.
    It runs some "firmware" compiled with GCC for RISC V that just counts up on 8 LEDs. (The "Hello World" of embedded systems)

    This all fits in 2600 Logic Elements, about 12% of the FPGA. Shrinks to 8% without MUL and DIV.

    Where does this all lead?

    Sounds cool.
    Did you try a build for the Lattice ICE40UP5K-SG48ITR50 ? (testable on ICE40UP5K-B-EVN)
    This part has 128K Bytes SRAM, and 5280 LE, but I'm unclear on how Lattice LE map to Altera LE....
    Heater. wrote: »
    Anyway, if anyone is tempted to get their feet wet with FPGA and Verilog I highly recommend it. I suggest getting hold of the Icarus Verilog compiler/simulator. It makes experimenting very quick and easy. Rather than wait for the slow and ponderous Quartus to build anything. Bit like hacking code in Python or Javascript. Also it's easy to knock up quick test harnesses so you have some confidence your gadget will work. Without Icarus I would have given up in frustration ages ago.
    http://iverilog.icarus.com/
    Did you run the above on icarus, and what speed does icarus simulate at ?
    Can icarus read a ROM file, or does it need to recompile the verilog for every simulate ?

  • Rayman,
    Might have been a mistake, we'll see...
    No. Go for it. It's fun.

    I should have done this years ago. Hmm...actually I did, I tried some experiments in VHDL running under the GHDL simulator. But VHDL is complicated and verbose. And FPGA boards were not so cheap and readily available then.

    Might take a while to get one's head around the fact that Verilog is not like a regular programming language. Potentially every statement you write can be happening at the same time. But if you are used to juggling parallel things on the Propeller it's not so shocking.


  • Heater. Posts: 19,815
    edited April 2
    jmg,
    Did you try a build for the Lattice ICE40UP5K-SG48ITR50 ? (testable on ICE40UP5K-B-EVN)
    No. The nano is all I have.

    But that is where things get interesting. Clifford Wolf runs that RISC V core on some Lattice FPGA. I forget which one but they are physically tiny and very cheap.

    Not only that but Clifford and a few other guys have reverse engineered the Lattice bit streams you need to configure those devices and created synthesis tools. With that one can get an FPGA working with a totally open source tool chain. Those guys are serious turbo nerds!

    So yeah, a Lattice FPGA dev board is now on my want list...
    Did you run the above on icarus, and what speed does icarus simulate at ?
    Yep, it all runs under Icarus and you can watch the RISC V core execute instructions. Trace the memory accesses etc. I guess it's dead slow. Good enough to dump a few hundred or thousand RISC V instruction steps per second. Good enough to see something actually works or not.

    What I have been doing mostly is using Icarus to develop/test the components. Eg. Create a UART, create a test bench for it, play with it till it works. Then integrate to the project. Icarus may be slow but the edit/test cycle is fast. As I said, like hacking code in Python.
    Can icarus read a ROM file, or does it need to recompile the verilog for every simulate ?
    Icarus compiles code into some kind of byte code. For example:
    $ iverilog -o uart_tb.vvp uart.v uart_tb.v
    
    compiles the UART and its test bench into a uart_tb.vvp file, which can then be run:
    $ vvp uart_tb.vvp
    
    This is all very quick for a simple module test.
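On the ROM file question: Verilog's `$readmemh` system task loads a memory array from a text file of hex words when the simulation starts, so the firmware can change without touching the Verilog source. Below is a minimal sketch of turning a raw firmware binary into that text format. It assumes 32-bit little-endian words, one per line; the helper name and the file names are hypothetical, and this is not necessarily how the repo builds its memory image.

```python
def bin_to_readmemh(data, word_bytes=4):
    """Convert raw little-endian bytes to $readmemh text, one hex word per line."""
    # Pad with zero bytes so the image is a whole number of words.
    pad = (-len(data)) % word_bytes
    data = data + b"\x00" * pad
    lines = []
    for i in range(0, len(data), word_bytes):
        word = int.from_bytes(data[i:i + word_bytes], "little")
        lines.append(f"{word:0{word_bytes * 2}x}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    with open("firmware.bin", "rb") as f:
        text = bin_to_readmemh(f.read())
    with open("firmware.hex", "w") as f:
        f.write(text)
```

On the Verilog side the memory would then be filled with something like `initial $readmemh("firmware.hex", mem);`, where `mem` is whatever the RAM array happens to be called.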

    If you feel the need for speed, or have a huge design that is slow to simulate then there is the "verilator". That compiles verilog into C++. Which you can then compile and run. I did not manage to get that working yet.

    I should add this kind of stuff to docs in the repo.



  • jmg Posts: 10,619
    edited April 2
    Heater. wrote: »
    If you feel the need for speed, or have a huge design that is slow to simulate then there is the "verilator". That compiles verilog into C++. Which you can then compile and run. I did not manage to get that working yet.
    Ah, that is the path I was looking for.
    Be interested if you do get that working, with speed stats on RISC V, as that seems a good way to get an exact P2 Simulator.
    I believe the AVR simulator Atmel have, works this way - they feed it the chip design files, and get an EXE/DLL out.
    Heater. wrote: »
    But that is where things get interesting. Clifford Wolf runs that RISC V core on some Lattice FPGA. I forget which one but they are physically tiny and very cheap.
    Could be the iCE40, that 128k RAM family member is very new. (~$6) Eval boards in stock, but no disti-silicon yet.
    Do you have any links to his Lattice work, the link above mentions only Xilinx (but does hit some impressive MHz numbers)

  • Very cool, Heater. I've often thought that putting 8 or 16 RISC-V cores on a chip, with memory, a hub module and some custom Propeller like instructions (the RISC-V instruction set is extensible) would make for a very compelling Propeller3.

    There are a ton of open source RISC-V implementations available now (e.g. the VectorBlox Orca which is made for FPGAs). RISC-V hardware is becoming available now from companies like SiFive. It'll be very interesting to see where it all ends up.
  • I will certainly get back to the verilator thing.

    That is certainly an interesting way to get an accurate P2 simulator.

    Yep, it's the iCE40 HX8K.

    Be prepared to be amazed: Fully open source FPGA synthesis tools, demo with the picorv32 and 128K RAM on iCE40 HX8K, even a board for the Raspberry Pi:

    http://www.clifford.at/papers/2015/icestorm-flow/slides.pdf

    http://www.clifford.at/icestorm/



  • Heater. Posts: 19,815
    edited April 2
    ersmith,

    Oh yeah, 8 picorv32 cores and some kind of HUB memory was on my mind too. Should just about fit in the nano. I didn't realize a RISC V core could be so small.
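A rough back-of-envelope check of "should just about fit", using the figures from the first post (2600 LEs with MUL and DIV, about 8% of the chip without). The total of 22,320 logic elements is an assumption that the board's FPGA is the Cyclone IV EP4CE22. The estimate also treats the whole single-core design, peripherals included, as one unit and ignores any hub arbitration logic, so take it as a sketch only.

```python
FPGA_LES = 22_320          # assumption: Cyclone IV EP4CE22 on the DE0 Nano
LES_WITH_MULDIV = 2600     # from the first post: core plus RAM glue, GPIO, etc.
LES_WITHOUT_MULDIV = round(0.08 * FPGA_LES)  # "shrinks to 8%" from the first post


def utilisation(les_per_core, cores=1):
    """Fraction of the FPGA used by `cores` copies of a design of the given size."""
    return cores * les_per_core / FPGA_LES


if __name__ == "__main__":
    for label, les in (("with MUL/DIV", LES_WITH_MULDIV),
                       ("without MUL/DIV", LES_WITHOUT_MULDIV)):
        for cores in (1, 8):
            print(f"{cores} core(s) {label}: {utilisation(les, cores):.0%}")
```

Eight copies with MUL and DIV land a little over 90% under these assumptions, which fits the "just about" in the post; dropping MUL and DIV leaves a lot more headroom for hub logic.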

    As you say, RISC-V is extensible. With a few carefully crafted extensions to the instruction set and the smart pins it would make an excellent P3. I can't imagine Chip going for it though. The RISC V instruction set is designed to be compiler friendly not human assembler coder friendly. Consider this for example:
        // Write 0x55555555 to 0xffff0060
        li      a5,-65536                   // Get 0xffff0060 into reg a5
        add     a5,a5,96
        li      a4,1431654400               // Get 0x55555555 into reg a4
        add     a4,a4,1365
        sw      a4,0(a5)                    // Write it out.
    

    The Orca core looks interesting too. Should be an easy drop in replacement for the picorv32 here.

    Anyway, how does this compare to a P2 COG? In terms of logic elements/chip area and speed?
  • jmg Posts: 10,619
    edited April 2
    Heater. wrote: »
    Be prepared to be amazed: Fully open source FPGA synthesis tools, demo with the picorv32 and 128K RAM on iCE40 HX8K, even a board for the Raspberry Pi:

    http://www.clifford.at/papers/2015/icestorm-flow/slides.pdf
    http://www.clifford.at/icestorm/
    Wow, that's impressive.

    I find these stats
    Comparison with Lattice iCEcube2
    Flow         Yosys         Synplify Pro   Lattice LSE
                 Arachne-pnr   SBT Backend    SBT Backend
    Packed LCs   2996          2647           2533
    LUT4         2417          2147           2342
    DFF          1005          1072           945
    CARRY        497           372            372
    RAM4K        8             7              8
    Synth Time   30s           30s            21s
    Impl. Time   81s           405s           415s
    Notes:
    1) Timings for Intel Core2 Duo 6300 at 1860 MHz running Ubuntu 15.04.
    2) Using iCEcube2.2014.12 because I had troubles activating the license on newer versions.
    3) SoC without internal boot memory and frame buffer because Synplify Pro and LSE both could not
    infer implementations using iCE40 block RAM resources from the behavioral Verilog code.
    
    I guess the packed LCs is the chip usage value. Interesting Lattice packs better...

    The new ICE40UP5K-SG48ITR50 has slightly less logic (5280 LC vs 7680 LC on the HX8K), but it has an easier QFN48 package and adds 128 kBytes of SRAM (vs 128k bits). Based on those stats, it should be roughly half full.
    Space for a P1V COG ?



  • Ariba Posts: 2,104
    edited April 3
    Heater. wrote: »
    ...

    As you say, RISC-V is extensible. With a few carefully crafted extensions to the instruction set and the smart pins it would make an excellent P3. I can't imagine Chip going for it though. The RISC V instruction set is designed to be compiler friendly not human assembler coder friendly. Consider this for example:
        // Write 0x55555555 to 0xffff0060
        li      a5,-65536                   // Get 0xffff0060 into reg a5
        add     a5,a5,96
        li      a4,1431654400               // Get 0x55555555 into reg a4
        add     a4,a4,1365
        sw      a4,0(a5)                    // Write it out.
    
    ...

    I'm pretty sure you can write it like that:
        li      a5,0xffff0006           // Get 0xffff0006 into reg a5
        li      a4,0x55555555           // Get 0x55555555 into reg a4
        sw      a4,0(a5)                // Write it out.
    
    The assembler will produce 2 instructions for li that will be a LUI and an ADDI to compose a 32bit constant, just like the Prop2 assembler produces an AUGS and a MOV for every ## for the equivalent Prop2 code:
        mov     r5,##$ffff0006          // Get 0xffff0006 into reg r5
        mov     r4,##$55555555          // Get 0x55555555 into reg r4
        wrlong  r4,r5                   // Write it out.
    

    I don't see a big difference.

    Andy
  • jmg Posts: 10,619
    edited April 3
    Heater. wrote: »
    Yep, it's the iCE40 HX8K.

    Be prepared to be amazed: Fully open source FPGA synthesis tools, demo with the picorv32 and 128K RAM on iCE40 HX8K, even a board for the Raspberry Pi:

    http://www.clifford.at/papers/2015/icestorm-flow/slides.pdf

    http://www.clifford.at/icestorm/

    Did you find the Verilog source for picorv32 on iCE40 HX8K?
    I can find a lot of tools links, but the source file for picorv32 for Lattice tools is proving elusive.

    I did find this, which is another FPGA board - looks nice.
    https://mystorm.uk/
    https://folknologylabs.wordpress.com/2016/08/03/storm-in-a-pint-pot/

    and one comment here :
    https://news.ycombinator.com/item?id=12193769
    " cliffordvienna 245 days ago [-]
    There is no 4K die. The 4K chips are using 8K dies, the lattice software limits the number of usable LUTs to 4K. IceStorm will give you access to all 8K LUTs in the device. "

    Lattice may not like that information leaking out .. :)


    I downloaded the latest iCEcube2 tools, and did a dummy run on an iCE40UP5K – SG48
    All seems ok, Synth -> P&R with green ticks everywhere.

    This iCE40UP5K is quite a recent addition, here is the change log ...

    iCEcube2 Version 2017.01
    Added VPP_2V5_TO_1P8V synthesis feature for iCE40 Ultra and iCE40 UltraPlus devices.
    Enhanced auto assign SPI dedicate pin if no SPI instance.
    Fixed bug for RGB LED driver pin mapping issue in iCE40 UltraPlus device.
    Fixed bug for I3C simulation model for iCE40 UltraPlus device.

    iCEcube2 Version 2016-12
    Removed the license control for iCE40UP5K – UWG30, iCE40UP3K – UWG30 and iCE40UP5K – SG48

    iCEcube2 Version 2016.08
    This version of the iCEcube2 software adds support for the device-package combinations:
    iCE40UP5K-UWG30
    iCE40UP3K-UWG30
    iCE40UP5K-SG48
    Enhanced Pin Constraint Editor with pullup/weakpullup constraints process.
    Added support for Windows 10 OS, 32-bit and 64-bit.
    Added Pack Area option to placer tool options.
    Added VPP_2V5_TO_1P8V synthesis feature for ICE40 UltraLite devices
  • Ariba,

    You are right. The assembler does deal with things like "li a5,0xffff0006" by generating two instructions.

    My gut does not like the idea of an assembler producing extra instructions behind my back. That's what compilers are for.

    But in cases like this it makes a lot of sense. Nobody wants to dick around figuring out how to split immediates up for loading. And I guess it's not much more of a worry than an Intel assembler producing huge sequences of instruction and operand bytes whose length depends on the actual values and addressing modes you use.

    Next up I have to turn on the RISC V compressed code feature and see what space savings we get.
  • jmg Posts: 10,619
    Heater. wrote: »
    You are right. The assembler does deal with things like "li a5,0xffff0006" by generating two instructions.

    My gut does not like the idea of an assembler producing extra instructions behind my back. That's what compilers are for.
    Think of it as simply a 64b opcode, and that problem goes away.

  • Heater. Posts: 19,815
    edited April 3
    jmg,
    Did you find the verilog source for picoriscv32 on iCE40 HX8K ?
    I can find a lot of tools links, but the source file for picoriscv32 for Lattice tools is proving elusive ?
    The slide deck I linked to talks of a picorv32 demo on the "icoboard":
    http://icoboard.org/
    http://icoboard.org/risc-v.html

    Which points us to icoSoC which runs on that board:
    https://github.com/cliffordwolf/icotools/tree/master/icosoc

    Which contains a copy of the picorv32 core:
    https://github.com/cliffordwolf/icotools/blob/master/icosoc/common/picorv32.v

    So I guess if you follow the installation instructions in that repo you end up with a RISC V SoC for iCE40.

    I love that tidbit about getting around the 4K limit.

    Honestly I think this whole IceStorm thing is huge. I mean, we can now develop for FPGA in Verilog using totally Open Source tools, even running on a Raspberry Pi. That is a monumental achievement. I'm surprised I have not seen any talk of it on the Raspi forums.

    Soon it's goodbye clunky Quartus for me !
  • evanh Posts: 4,384
    edited April 3
    jmg wrote: »
    Heater. wrote: »
    You are right. The assembler does deal with things like "li a5,0xffff0006" by generating two instructions.

    My gut does not like the idea of an assembler producing extra instructions behind my back. That's what compilers are for.
    Think of it as simply a 64b opcode, and that problem goes away.
    Bull, the problem doesn't go away at all. It's just inserted a hidden instruction that takes more space and more time to execute. I'm not anti the assembler doing this but you can't say there are no gotchas.
    The Prisoner's Dilemma, in english - "Selfishness beats altruism within groups. Altruistic groups beat selfish groups." - Quoted part from 2007, D.S Wilson/E.O Wilson.
  • Watch the language, please.
    Infernal Machine
  • Heater. Posts: 19,815
    edited April 3
    Certainly there can be gotchas in the assembler sneaking in extra instructions for you. I'm guessing it's only a problem if one is into timing things by counting up instructions and clock cycles so as to meet some strict timing constraints. As people do on the P1 and no doubt will do on the P2. Or perhaps when squeezing code into really small memory spaces, like a COG.

    All in all not something the RISC-V designers or people writing assemblers for it worry about. RISC-V is intended as a general purpose instruction set architecture.
  • Ariba Posts: 2,104
    edited April 3
    You just can't encode a 32bit immediate value in a 32bit wide instruction. Every processor architecture needs to handle that with 2 instructions or an additional immediate word after a load instruction.
    Also the Propeller 2 does that (with AUGx).

    Load Immediate (LI) is not a native RISC-V instruction, it's an assembler pseudo instruction to simplify the load of constants, just like ## on the P2.

    You can write:
        lui     a5,%hi(0xffff0006)      // Get 0xffff0006 into reg a5 (higher 20 bits)
        addi    a5,a5,%lo(0xffff0006)   // lower 12 bits
        lui     a4,%hi(0x55555555)      // Get 0x55555555 into reg a4
        addi    a4,a4,%lo(0x55555555)
        sw      a4,0(a5)                // Write it out.
    
    if you want to see every instruction.
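What `%hi` and `%lo` compute can be sketched in a few lines. The one subtlety is that ADDI sign-extends its 12-bit immediate, so when the low 12 bits are 0x800 or more the upper part has to be bumped by one to compensate. This is only a sketch of the arithmetic with a hypothetical function name, not the assembler's actual code.

```python
def split_hi_lo(value):
    """Split a 32-bit constant into (hi20, lo12).

    The result satisfies (hi20 << 12) + lo12 == value (mod 2**32),
    with lo12 in the signed 12-bit range -2048..2047 that ADDI accepts.
    """
    value &= 0xFFFFFFFF
    lo = value & 0xFFF
    if lo >= 0x800:
        lo -= 0x1000                      # ADDI will sign-extend, so treat it as negative
    hi = ((value - lo) >> 12) & 0xFFFFF   # compensated upper 20 bits for LUI
    return hi, lo


if __name__ == "__main__":
    for v in (0x55555555, 0xFFFF0060, 0x12345FFF):
        hi, lo = split_hi_lo(v)
        print(f"{v:#010x}: lui {hi:#x} ; addi {lo}")
```

For 0x55555555 this gives the 0x55555 / 1365 pair, and for 0xffff0060 the 0xffff0 / 96 pair, matching the numbers in Heater's hand-expanded version earlier in the thread.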


    The big difference between a Propeller (1 or 2) and RISC-V is in the tight integration of counters and ports with the instructions on the Propeller. This allows fast bitbanged software peripherals which are much harder on RISC-V.
    On RISC-V the ports are normally memory mapped which needs separate instructions for Load Modify and Store.

    What is just an XOR OUTA,#1 on the Propeller, becomes:
        li      a5,PORTA_ADDR
        lw      a4,0(a5)
        xori    a4,a4,1
        sw      a4,0(a5)
    
    on RISC-V. And Load/Store are often one of the slower instructions.
    Same for things like WAITCNT or WAITPNE.
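The load/modify/store point can be made concrete with a toy model that just counts bus accesses. Everything here is hypothetical: the `Bus` class, the `PORTA_ADDR` value and the helper name are made up for illustration and say nothing about real cycle counts on any particular core.

```python
PORTA_ADDR = 0xFFFF0060  # hypothetical address of a memory-mapped output port


class Bus:
    """Toy memory-mapped bus that counts loads and stores."""

    def __init__(self):
        self.mem = {}
        self.loads = 0
        self.stores = 0

    def lw(self, addr):
        self.loads += 1
        return self.mem.get(addr, 0)

    def sw(self, addr, value):
        self.stores += 1
        self.mem[addr] = value & 0xFFFFFFFF


def toggle_bit0(bus):
    """Mirror of the lw / xori / sw sequence: one load, one ALU op, one store."""
    value = bus.lw(PORTA_ADDR)
    value ^= 1
    bus.sw(PORTA_ADDR, value)


if __name__ == "__main__":
    bus = Bus()
    toggle_bit0(bus)
    print(bus.mem[PORTA_ADDR], bus.loads, bus.stores)
```

Each toggle costs two bus transactions on top of the XOR itself, which is the overhead that a single XOR OUTA,#1 avoids on the Propeller.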

    So if you want a Propeller like multicore with RISC-V cores, you will need custom instructions that allow tight integration with ports and counters.

    Andy
  • Heater. Posts: 19,815
    edited April 3
    That's right. A 32 bit immediate won't fit in a 32 bit instruction !

    That's not the issue though. The issue is simply extra instructions being generated by the assembler that you did not explicitly write. Which complicates simple minded instruction counting when making tight bit banging loops and so on. Also if you increase the size of a literal all of a sudden your code gets bigger!
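To see when the code size jumps, here is a minimal sketch that estimates how many real instructions an `li` turns into. It assumes plain RV32I without the compressed extension and the usual expansion (ADDI alone for small values, LUI alone when the low 12 bits are zero, LUI plus ADDI otherwise). The function name is hypothetical and a given assembler may do something different.

```python
def li_instruction_count(value):
    """Estimate the number of RV32I instructions needed to load a 32-bit constant."""
    value &= 0xFFFFFFFF
    signed = value - (1 << 32) if value & 0x80000000 else value
    if -2048 <= signed <= 2047:
        return 1        # fits ADDI's signed 12-bit immediate
    if value & 0xFFF == 0:
        return 1        # LUI alone covers the upper 20 bits
    return 2            # LUI followed by ADDI


if __name__ == "__main__":
    for v in (96, 2047, 2048, -65536, 0x55555555):
        print(f"li {v}: {li_instruction_count(v)} instruction(s)")
```

So bumping a literal from 2047 to 2048 is enough to add an instruction to the loop, which is exactly the kind of surprise that matters when counting cycles.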

    Anyway, I'm not inclined to worry about that much. My RISC-V will be in FPGA, unless someone starts selling actual RISC-V chips, so any such bit banging will be done in Verilog!

    Certainly the tight integration of I/O into the COG instruction set is a wonderful thing.

    I was pondering the idea of RISC-V extensions for such bit banging and timing. The picorv32 core has a coprocessor interface for exactly that purpose. Currently it is only used for the optional MUL and DIV instructions. I was starting to wonder how easy it might be to add my own instructions to that interface for ports and counters etc.


  • When timing is super important, just write those instructions explicitly, and/or hold constants in COG memory, P1 style.
    Do not taunt Happy Fun Ball! @opengeekorg ---> Be Excellent To One Another SKYPE = acuity_doug
    Parallax colors simplified: http://forums.parallax.com/showthread.php?123709-Commented-Graphics_Demo.spin
  • Seems what the RISC-V needs is a COG like thing as a coprocessor...
  • The COGS are what make a Propeller.

    RISC-V is like being stuck in HUBEXEC. Several of them would be like several COGS stuck in HUBEXEC.

    This isn't a bad thing! It's just not tuned for real-time, embedded, sense, process, response like we are aiming for.

    Imagine a RISC-V with tons more registers. Now, imagine running code out of the registers themselves. That's a COG.

  • You could get something similar with a RISC-V core that had both a small local memory and shared hub memory. There really isn't any need to run code out of registers.
  • Quite so Spud.

    If I imagine a RISCV with tons of registers and executing from regs, it's not a RISC V anymore!

    Horses for courses and all that.

  • Heater. wrote: »
    I have not designed my own CPU. I found a ready made one. The picorv32 by Clifford Wolf : https://github.com/cliffordwolf/picorv32

    If you haven't done yet, you should take a look at another project he has made: Yosys (Open Verilog SYnthesis Suite)

    http://www.clifford.at/yosys/

    Heater. wrote: »
    Where does this all lead?

    I have no idea. Just having fun. I could stick 8 of those cores on there and make a RISC V "poor mans Propeller"

    Next step is to download OpenRAM (open Memory Compiler) and then you have almost 95% to make a P1 in 0.5um (FreePDK45 / SCN3ME_SUBM)

    https://github.com/mguthaus/OpenRAM

    OpenRAM was released around 6 months ago.

    This post is from 2 years ago, when openRAM was not released yet: https://www.reddit.com/r/yosys/comments/2g426s/readmemh_support/
  • potatohead Posts: 8,957
    edited April 3
    There really isn't any need to run code out of registers.

    On a Propeller, they are one and the same. It's a register when you want it to be, small, local memory when you want it to be.

    As for need, a Prop is a memory to memory direct design at the COG level. Code and Data are unified, registers / memory, etc... This means avoiding the load / store cycle, which improves throughput and real-time response.

    At the HUB level, a Prop is a load-store machine, just having a ton of registers.

    One distinction is the I/O is memory mapped, but in the same space as the COG memory is, or it's dedicated, accessed by implied addressing.

    This makes it a micro-controller, in my view, as that generally isn't the model used for general purpose computing. This also makes it very fast in terms of sense, process, response too.

    Finally, we have some shared resources, like CORDIC, PRNG. Most things are relative to a COG, cloned to maximize both throughput and real time.

    Should we get it done this year, it's going to be distinctive. Capable of things at a process and clock speed that is hard to beat. :D

  • Yeah, I noticed the Yosys thing. It's what he uses to build his picorv32 SoC in the repository I got the picorv32 core module from. It's used with the IceStorm open source Verilog compiler we mentioned above. All mind blowing stuff.

    Perhaps I'll get to looking at Yosys sometime. Just now I'm still feeling my way around Verilog itself.
  • Heater. wrote: »
    Perhaps I'll get to looking at Yosys sometime. Just now I'm still feeling my way around Verilog itself.

    Yosys is verilog too (same as Quartus or Vivado). You can do both.

    http://www.clifford.at/yosys/files/yosys_presentation.pdf

    Are you curious about how xoroshiro128+ can be implemented with simple TTL logic gates? You can use the xoroshiro128+ code as input for Yosys and it will show you the logic gates needed to implement it. There is even a command that will show you a graph of that.

    He made some automated scripts to compare the output of his verilog program (yosys) with Quartus and Vivado : http://www.clifford.at/yosys/vloghammer.html

    And found many bugs in both Yosys and commercial tools.

    Completely mind blowing !

  • Yosys is the synthesis engine of the IceStorm tools, it compiles your Verilog input into LUT definitions and Netlists. But it has other usecases like formal verification (I have no clue of that).

    ArachnePNR is the Place and route tool of IceStorm which decides which LUTs are used and routes them correctly according the netlists. The output is the configuration bitstream.
    It's mainly the Place and Route that takes so long in Quartus and other commercial tools. Arachne is blindingly fast in comparison, at the cost of a bigger LUT count.

    But the supported ICE40 FPGAs are quite limited. Only the bitstreams of older ICE40 types are known. They have max 8k LUTs and no Multipliers.
    So don't expect you can use it for a P1V bigger than 4 cogs with some custom peripherals.
  • I think this is all totally amazing. Actually managing to create all the tool steps needed to get Verilog into an FPGA. Reverse engineering the required bit streams and all. Incredible.

    The supported FPGAs may not be so new or big, but if it's enough to put a 32 bit RISC-V core onto with a usable memory space and room to spare for some custom logic, that is very useful.

    Why would one need 16 cores of P2 when the functions you want to create can be done in Verilog on a super cheap FPGA?

    What with open source tools that run on a Raspberry Pi we might see 10 year olds knocking out their own logic designs!

    I might be checking out this Yosys tool chain sooner than I thought, thanks Ramon for the little push there.