Software Emulation of PASM code

Wossname Posts: 169
edited May 27 in Propeller 1
One thing I have always wanted since I began using PASM for programming the P8X32A is a software emulator.
I have looked around and have not found such a thing (if anyone knows of one *please* let me know).

So I thought I'd write one myself. Why? Convenience, mainly. I do like to write code when I'm away from the electronics bench, and it would be really nice to just throw quick ideas into an emulator without having to rummage through my equipment to see if they work or not.

I'd like to get your opinions on what you think of the following idea for Propeller PASM emulation software...

Overview:
A command-line tool (Linux/Windows etc.) that accepts a PASM source file as input, steps through the code, halts at user-defined breakpoints, and outputs debugging information such as register states.
  • Open Source (GPL or MIT or whatever is most suitable) and free ($0.00)
  • Command-line interface to make scripting and automation easy
  • Written in GNU99-standard C (portable to any platform). I am also considering the Ada language as a possibility.

The initial release could have the following features (eg. the low-hanging fruit):
  • Support limited to a single Cog.
  • 'Watchers' for automatically pausing execution when a condition is met ("break when VariableX == 0x0001be7d")
  • Simple interactive commands like "dr" (dump register values), "0x19?" (print the value at cog RAM offset 0x19), etc.
  • Takes a "stimulus" file as input ("wait 2417998 cycles then set PORTA[4] HIGH.....")
  • Other stuff (any suggestions welcome)
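A 'watcher' like the one above is cheap to implement: after every simulated instruction, compare each watched cog RAM cell against its target value. A minimal sketch in C (the struct layout and all names here are my own invention, not taken from any existing tool):

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical watcher: pause the simulation when a watched cog RAM
   cell (one of the 512 longs) takes a specific value. */
typedef struct {
    uint16_t addr;   /* cog RAM offset, 0x000..0x1FF */
    uint32_t value;  /* break when cog_ram[addr] == value */
} watcher_t;

/* Checked once per simulated instruction; cheap at this scale. */
bool watcher_hit(const uint32_t cog_ram[512], const watcher_t *w)
{
    return cog_ram[w->addr] == w->value;
}
```

So "break when VariableX == 0x0001be7d" just becomes a watcher on VariableX's cog RAM address.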

Later versions might also have these features:
  • Full 8 cog support with correct handling of GPIO sharing / state detection
  • Hub operation support with accurate round-robin sequence emulation
  • Logic-analyser output: generate "GPIO pin states and internal register states over time" data that can be plotted as a graph (Excel or GnuPlot or whatever)
  • Accurate counter (CTRA/CTRB et al) support
  • More elaborate breakpoints that can wait for specific events on specific cogs or even the Hub itself ("break when any cog attempts to access a Hub resource X or Y")
  • Pass in a "normal" propeller image and it will be disassembled back into PASM source code and run in the emulator. (I actually already wrote a mostly-working disassembler a few years back)
  • Pixel correct VGA / PAL / NTSC generation simulation for those writing video games
  • Other stuff (any suggestions welcome)
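For the hub round-robin in particular, the P1's fixed rotation (as I understand the hardware: the hub moves to the next cog every two system clocks, so each of the 8 cogs gets a slot every 16 clocks) is straightforward to model. A sketch, with invented names:

```c
#include <stdint.h>

enum { NUM_COGS = 8, CLOCKS_PER_SLOT = 2 };

/* Which cog owns the hub at a given system clock count. */
int hub_owner(uint64_t clock)
{
    return (int)((clock / CLOCKS_PER_SLOT) % NUM_COGS);
}

/* How many clocks a cog must stall from 'clock' until it owns the hub;
   this stall is where the variable cost of hub instructions comes from. */
uint64_t clocks_until_slot(uint64_t clock, int cog)
{
    uint64_t t = clock;
    while (hub_owner(t) != cog)
        t++;
    return t - clock;
}
```

A hub instruction in the emulator would then charge `clocks_until_slot()` extra cycles before executing.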


I'm fairly familiar with the PASM language and the Prop's architecture (not so much SPIN to be honest, so I doubt that'll be included until *much* later).

I'd like to hear your thoughts on whether you'd be interested in such a tool (or do you think it's a complete waste of time?).

Comments

  • I like it.
    Jim
  • Dave Hein's SpinSim already exists. Here is a link to it on the Parallax GitHub account but I think there might be a later version available directly from Dave.

    https://github.com/parallaxinc/spinsim
  • As Seaith mentioned, Gear would be one existing example, though it is neither open source nor free.

    If you're going to create something new, I wonder what it would take to make it possible for GDB to connect to it. I'm imagining being able to use something like propeller-load or propload to program the emulator and then connect to the emulator with GDB such that I can step through the code in my IDE while controlling the pins through this new tool's interface as well... that would be really powerful.

    You brought up the ability to stimulate the pins with an input file. Most of the time what I need help debugging is serial communication interfaces like SPI and I2C, so a file that just says "pin 1 high, pin 1 low" would be really tedious. It'd be a lot better if the emulator understood a few basic protocols like UART, I2C, and SPI and allowed the author to input scripts using these protocols.
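    For example (a sketch only: the event format, pin assignments, and the SPI mode-0 MSB-first choice are my assumptions), a protocol-aware stimulus engine could expand one "send this byte over SPI" directive into the individual pin edges so the user never writes them by hand:

```c
#include <stdint.h>
#include <stddef.h>

/* One pin-level event the emulator should apply. */
typedef struct { int pin; int level; } pin_event_t;

/* Expand one SPI mode-0, MSB-first byte into pin-level events.
   Returns the number of events written (3 per bit = 24). */
size_t spi_send_byte(uint8_t byte, int sck_pin, int mosi_pin,
                     pin_event_t *out)
{
    size_t n = 0;
    for (int bit = 7; bit >= 0; bit--) {
        out[n++] = (pin_event_t){ mosi_pin, (byte >> bit) & 1 }; /* data first */
        out[n++] = (pin_event_t){ sck_pin, 1 };  /* rising edge: slave samples */
        out[n++] = (pin_event_t){ sck_pin, 0 };  /* falling edge */
    }
    return n;
}
```

    One script line like "spi 0xA5" then stands in for 24 hand-written pin transitions.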

    Also, can I encourage you to pick a higher level language than C to start with? Python and Java would be two fine languages. C# maybe if you prefer native binaries over interpreted. With Linux's mono, it should be possible to make it cross-platform even. Perhaps Ada also fits these requirements - I don't know since it's still foreign to me.

    Good luck! I look forward to hearing more about this :)
    David
    PropWare: C++ HAL (Hardware Abstraction Layer) for PropGCC; Robust build system using CMake; Integrated Simple Library, libpropeller, and libPropelleruino (Arduino port); Instructions for Eclipse and JetBrain's CLion; Example projects; Doxygen documentation
  • DavidZemon wrote: »
    As Seaith mentioned, Gear would be one existing example, though it is neither open source nor free.

    The link above is to the GPLed source. Am I missing something?
  • Seairth wrote: »
    DavidZemon wrote: »
    As Seaith mentioned, Gear would be one existing example, though it is neither open source nor free.

    The link above is to the GPLed source. Am I missing something?

    Nope, probably the other way around. I must be thinking of something else. That's great :)
    David
  • Oh that's beautiful :) I just checked out the code from SVN onto my Ubuntu 17.04 machine, opened it up with JetBrains Rider, compiled it, and started the main app and it worked! Haven't actually debugged a program yet, but it ran without any problems. I might have to start using this...
    David
  • Wossname Posts: 169
    edited May 27
    DavidZemon wrote: »
    Also, can I encourage you to pick a higher level language than C to start with?

    Encouragement is to be encouraged :) Can you provide a reason for using a "language-level" higher than C? I chose C for a few reasons, not least of which is that it's a very well understood and solid basis, and one that the standard tools can actively enforce with the appropriate switches. Secondly, everyone (myself included) and his dog knows C. In my limited experience, Python or Ruby or R or PHP or F# or whatever changes its stripes every 2 months and then nothing works properly.

    However, I do appreciate the C# / Mono argument. Yes, I think that would be a lead worth pursuing. I have had some dealings with that in the past (like 10 years ago, I hope things have improved since then, comments welcome on this topic).

    Also bear in mind the fact that it's PASM that we wish to simulate/emulate. Staying close to the "bare copper" (as C can often do with minimal effort) is beneficial.

    It's hard to argue with C. *






    *Or C++ if I'm feeling charitable.

  • Wossname wrote: »
    One thing I have always wanted since I began using PASM for programming the P8X32A is a software emulator.
    I have looked around and have not found such a thing (if anyone knows of one *please* let me know).

    Others have pointed you to Gear; I just want to add that I ported the code to C++. If you are interested I can clean it up a bit and post the source here. It is not optimized, doesn't support all the features of Gear (mainly the plugins), and doesn't have a UI. It works, as far as I can tell, at least with Spin and PASM; for some reason it doesn't work when I feed it a C program. It is very slow, as Gear itself is; I was hoping that a C++ port would gain some speed, but it is very far from being realtime or even useful at all. If you know some method to speed it up, that would be great!
  • I implemented a PASM C# emulator for my own use to support symbol-level debugging and unit testing frameworks.

    I first explored Gear - it's written in C# and is pretty nicely done, but its performance is tied to an internal clocking system that is hard to improve much. The Gear SPIN emulator is also quite nicely done, with faster potential performance, but only for SPIN programs. I don't use SPIN but have come to appreciate the idea behind it - if Parallax had invested more in their software tools it might have gone further, faster.

    The real PITA is the poor support for debugging from the Spin generation tools. I looked quite hard for symbol tables etc. that I could plug back into Gear to generate the debug information I was after, but came up empty.

    I tried to understand the C++ openSIM version of the x86 (really!?) original compiler to try to figure out how to generate the debug info I was after and ended up giving up and writing my own PASM code emulator instead.

    I think before writing a PASM simulator it would be worthwhile figuring out how to generate the debug information to use with the output of the Spin tools. There were some other 3rd-party efforts but they seemed unmaintained and I couldn't get them to run. Maybe I just missed something, but debug info would be the first step before taking the time to write a sim.

    If you can get a C# implementation running at near real-time with debugging support I can wrap an IO footprint and plug it into Virtual Breadboard (www.virtualbreadboard.com) with integrated editing and code edit with debug support which is why I was exploring Gear in the first place.

  • macca wrote: »
    (Regarding Gear)...I ported the code to C++, if you are interested I can clean it up a bit and post the source here...

    That is a very generous offer, macca. However, I'm not looking to adopt or port an existing project.

    I'm hoping to make something small, simple and utilitarian, but useful. I don't like the idea of having a huge GUI overhead (sounds like an endless porting nightmare).
  • Wossname wrote: »
    DavidZemon wrote: »
    Also, can I encourage you to pick a higher level language than C to start with?

    Encouragement is to be encouraged :) Can you provide a reason for using a "language-level" higher than C?

    Happily! It will be easier to write, read, and maintain. C doesn't know what a "string" is, so any language with a real string type will be a big jump right there. Any object-oriented language would be another big jump (assuming you're an OO guy and haven't jumped on the functional bandwagon). Garbage collection would be a nice addition as well, but that is certainly one of the things that could negatively affect performance. One might argue that speed is a primary concern, and a good reason to stick with C or C++, and maybe this is a good example of when language choice can actually make a difference... but it rarely does.
    macca wrote: »
    I ported the code to C++... It is very slow, as Gear itself is; I was hoping that a C++ port would gain some speed, but it is very far from being realtime or even useful at all.

    This would imply that dropping to a low-level language and dropping the GC didn't help much.
    Wossname wrote: »
    Secondly, everyone (myself included) and his dog knows C. In my limited experience, Python or Ruby or R or PHP or F# or whatever changes its stripes every 2 months and then nothing works properly.

    If language popularity is what you're after, JavaScript is the undisputed champion (please don't take this sentence seriously). StackOverflow's annual survey shows C and C++ are hovering around 20% popularity, compared to JS's 62% and Java's 40%. Python even beats C/C++ with a score of 32%. I can't speak for Ruby, R, Php, or F#, but I first learned Python (more than "Hello, world") back in 2011 and haven't had to undergo any major relearning since then. Of course there is the infamous Python 2 -> Python 3 discussion, but having learned on Python 2.7 the upgrade to 3.x was a piece of cake for me. Perhaps if you learned Python on 2.5 or less it's a bigger deal, but it has not slowed my work much at all. And if you went Java, that is quite backwards compatible. If you develop it in Java 8 or 9 today, it will continue running on future JVMs for many years to come.

    I very much like the ideas and discussion so far in this thread. No matter your language of choice, I'll do my best to offer support in the form of ideas (and hopefully pull requests) in the future!
    David
  • DavidZemon wrote: »
    macca wrote: »
    I ported the code to C++... It is very slow, as Gear itself is; I was hoping that a C++ port would gain some speed, but it is very far from being realtime or even useful at all.

    This would imply that dropping to a low-level language and dropping the GC didn't help much.

    I'm not very familiar with C# but the emulator is mostly static code, the GC should never do anything. We are talking about emulating a thing that is running at an equivalent speed of 80x8=640MHz, even on a 3GHz processor you have about 18 clock cycles to emulate each instruction, not very much unless you code it in assembly (and I believe it will be very difficult even then).
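    That budget works out like so (plain arithmetic; the helper function is purely illustrative):

```c
/* Host clocks available to emulate one PASM instruction: 8 cogs at
   80 MHz is an aggregate 640 MHz of simulated clocks, and a typical
   instruction spans 4 of them. */
double host_clocks_per_instruction(double host_hz)
{
    double sim_clocks_per_sec = 80e6 * 8;    /* 640 MHz aggregate */
    return host_hz / sim_clocks_per_sec * 4; /* ~18.75 on a 3 GHz host */
}
```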

    Of course if you want to just emulate a single cog without any other feature (like PLL, counters, etc.) it will be much faster and maybe even realtime (haven't tested that).
  • Thank you David, I also enjoy this kind of topic.
    DavidZemon wrote: »
    If language popularity is what you're after...
    Good point well made :). I did foolishly state that I thought C was valuable because of its popularity. Not the first time my loose tongue has scuppered me! That isn't really what I should have said...

    I think C is a good fit for this project because it is a mature language which happens to also be ubiquitous and robustly documented. Plus it has simple data types that happen to correlate to the PASM model rather conveniently. C is also pretty much dominant in electronics at the level we're dealing with here.
    DavidZemon wrote: »
    ...(((A language of a higher level than C)))... will be easier to write, read, and maintain.
    Personally, I disagree.
    The typical users/maintainers of our hypothetical new tool would be conversant in the languages common in very low-level electronics (C, assembly, can't think of much else). This won't be a Spin simulator. Spin is a super-high-level language, rather problematical for us.

    I don't think many Python (or other VVHL language) developers also spend a lot of time writing assembly language. There are exceptions of course. This project is so niche it's ridiculous. Do we have the luxury of picking a VVHL language? No, I'm not convinced.

    Speed performance had not occurred to me as a requirement in this project until you mentioned it. I had only considered that the squishy human at the keyboard would be manually stepping through their lines of PASM code one line at a time. But indeed, it is very likely that a user will, at some point, want to allow their code to run freely at full speed (20 MIPS per cog, let's say).

    This needs more thought...
  • Wossname Posts: 169
    edited May 27
    None of this needs to be real-time simulation by the way. Just as long as the clock-cycles counter is correct, it doesn't matter if a 1Hz blinky LED program takes 30 seconds to simulate each blink. It's data validity we're interested in, not real-time.

    A Prop1 chip running from a 5MHz external source, and a 16x PLL config will have a clock speed of 80MHz. Each instruction takes (waving hand in the air) 4 clock cycles, which gives 20,000,000 instructions per second per Cog.
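    The bookkeeping for that is trivial: advance a 64-bit counter by each instruction's clock cost and derive simulated time from it, however slowly the host actually runs. A sketch (all names invented):

```c
#include <stdint.h>

/* Minimal cycle accounting for one simulated cog. */
typedef struct {
    uint64_t cycles;  /* total system clocks elapsed */
    uint16_t pc;      /* cog RAM address of the next instruction */
} cog_t;

/* Most P1 instructions cost 4 clocks; hub ops and waits cost more,
   so the caller supplies the right number per instruction. */
void step(cog_t *cog, unsigned cost_clocks)
{
    cog->cycles += cost_clocks;
    cog->pc = (cog->pc + 1) & 0x1FF;  /* cog RAM wraps at 512 longs */
}

/* Simulated wall-clock time at an 80 MHz system clock. */
double sim_seconds(const cog_t *cog)
{
    return cog->cycles / 80e6;
}
```

    The 1Hz blinky that takes 30 seconds per blink to simulate still reports exactly 1.000000 simulated seconds per blink.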

    Totally irrelevant though because the user will be using typed-in commands to trigger breakpoints and dumping register values all the time. I don't think anyone is doing Bitcoin hashing on a Propeller!

    It would be desirable to run code at realistic speeds but I don't think it should be a project goal. Would be extremely cool though!
  • Dave Hein Posts: 5,279
    edited May 27
    Wossname, spinsim will do most of what you listed in your initial post. The only thing it's missing is some of the debug features you described, and a cycle-accurate mode. Spinsim takes some shortcuts, and simulates PASM instructions at a four-cycle granularity. I do have a private version of spinsim that does simulate every cycle, but this makes it run slower. The private version also simulates an SD card so that code that accesses an SD card can be tested. I also added a feature to write the values of the pins out to a file. And there was another option to generate a BMP file for every frame generated by the VGA output. If there's sufficient interest I could add these features as options to the next public version of spinsim that I post.

    Spinsim also supports the P2, but I haven't updated it for about a year. I'm planning on updating spinsim to the latest version of the P2, but it will probably be a few weeks before I do that.

    EDIT: BTW, the latest version of spinsim is 0.95. It's posted in the P2 Simulator thread.
  • jmg Posts: 10,345
    Dave Hein wrote: »
    Wossname, spinsim will do most of what you listed in your initial post. The only thing it's missing is some of the debug features you described, and a cycle-accurate mode. Spinsim takes some shortcuts, and simulates PASM instructions at a four-cycle granularity. I do have a private version of spinsim that does simulate every cycle, but this makes it run slower.

    Can that selection be a user-run-time-switch ?

    Dave Hein wrote: »
    The private version also simulates an SD card so that code that accesses an SD card can be tested. I also added a feature to write the values of the pins out to a file. And there was another option to generate a BMP file for every frame generated by the VGA output. If there's sufficient interest I could add these features as options to the next public version of spinsim that I post.

    Logging & peripheral simulation is always useful.
    Can a variant of the SD peripheral, simulate the SPI boot device for P2 ? (and maybe i2c for P1 ?)


  • jmg Posts: 10,345
    edited May 28
    Wossname wrote: »
    None of this needs to be real-time simulation by the way. Just as long as the clock-cycles counter is correct, it doesn't matter if a 1Hz blinky LED program takes 30 seconds to simulate each blink. It's data validity we're interested in, not real-time.

    Yes, and no. Speed does matter: if the simulator is too slow, users will find it easier to just download to the real hardware and test there.
    Sometimes users want to run to some advanced code location, and then debug from there.

    Wossname wrote: »
    It would be desirable to run code at realistic speeds but I don't think it should be a project goal. Would be extremely cool though!
    The simulators I'm used to run at between 10% and 100% of the real MCU's speed, and they have a deliberate cap at 100%.
    I tried one that was under 1%, and quickly gave up.
    That means speed does need to be a project goal, though hitting 100.0% is not that important.

    Because speed matters, C is a reasonable choice, at least for the sim engine.

    However, there is another, very different path to simulation that may be even better.

    A problem all simulators suffer from, is properly simulating everything. Opcode coverage is ok, but peripheral interaction quickly becomes important, and it consumes a lot of time testing the peripheral interactions on MCUs.

    What the AVR people do in their simulator is use the Verilog as a base.
    I believe they do Verilog -> C.

    This has a speed cost over a crafted core-sim, but it does buy you very good peripheral simulation.

    This approach would also allow P2 Sim to be added, as well as P1 via P1V.

    Given there are already software based simulators, would you consider looking into Verilog based sim ?
  • This idea has come up here a few times. Use Verilator to compile the Propeller Verilog into C++. Once you have that C++ you can surround it with whatever input, output, logging, simulated peripherals you like. The resulting simulation should be plenty fast enough.

    The Verilator:
    https://www.veripool.org/wiki/verilator/

    The Propeller Verilog:
    https://github.com/parallaxinc/Propeller_1_Design

    Also has the neat side effect that one can then transpile the C++ simulation into Javascript and run it in the browser.
  • jmg wrote: »
    Dave Hein wrote: »
    Wossname, spinsim will do most of what you listed in your initial post. The only thing it's missing is some of the debug features you described, and a cycle-accurate mode. Spinsim takes some shortcuts, and simulates PASM instructions at a four-cycle granularity. I do have a private version of spinsim that does simulate every cycle, but this makes it run slower.

    Can that selection be a user-run-time-switch ?
    Yes, the cycle-accurate mode could be controlled by a command-line option.
    jmg wrote: »
    Dave Hein wrote: »
    The private version also simulates an SD card so that code that accesses an SD card can be tested. I also added a feature to write the values of the pins out to a file. And there was another option to generate a BMP file for every frame generated by the VGA output. If there's sufficient interest I could add these features as options to the next public version of spinsim that I post.

    Logging & peripheral simulation is always useful.
    Can a variant of the SD peripheral, simulate the SPI boot device for P2 ? (and maybe i2c for P1 ?)

    An SPI flash chip should be fairly easy to simulate. In the P1 mode, spinsim already has the capability to boot from a file named eeprom.dat. It also simulates an I2C EEPROM, which allows reading and writing eeprom.dat.

  • Does the Verilog approach bring with it a large amount of dependencies and libraries etc? Are there any closed-source components in there?

    I have played with VHDL recently and the tools are far from simple or lightweight. If Verilog is similar in nature then it sounds like it would be a maintenance headache.
  • As far as I can tell the Verilator does not produce code that has any weird dependencies, perhaps even none at all: just straight-up C++ source code. The Verilator is entirely open source. However, your generated simulator need not be.

    Which VHDL tools have you played with? I only know that using Verilog from Altera Quartus is monstrous. Quartus is a huge and complex program. Quartus compilation is horribly slow. The editor is pretty poor. From what I read, other commercial HDL tools are equally bad.

    On the other hand the open source Icarus Verilog simulator is very light weight, quick and easy to use. It makes the edit, run, debug cycle of writing Verilog code as quick as hacking code in something like BASIC or Python.

    I have not really tried the Verilator. I only compiled a little "hello world" verilog module with it a while back to see that it worked at all. Seems pretty straight forward.

    I don't see that Verilog compiled with Verilator need be any more of a maintenance problem than code written in C++.

    Still, we won't know what issues this approach might bring until someone tries it. Sadly I'm a bit too busy to experiment with such a thing just now.
  • heater wrote:
    Which VHDL tools have you played with?

    I've dabbled with Xilinx ISE, Xilinx Vivado and Lattice Diamond in the past. I soon decided that life was literally too short to learn it in any detail. A total nightmare of insane complexity.

    I think I understand the idea though... Use Verilator once to generate a lump of C++ code that represents the P8X32A and then we don't have to deal with the Verilog code ever again. The C++ lump becomes a "black box" which we shove some PASM code into and give it a clock source and some IO stimulus and let it run.

    It does make sense.
  • Yep, those tools are a long winded pain.

    Verilog itself though is pretty easy. For example, "Hello world" in Verilog:
    module hello_world;
    initial begin
        $display ("Hello World!");
        #10  $finish;
    end
    endmodule
    
    Kind of looks like any other programming language. The #10 is a bit odd but that just means wait ten time ticks before moving on.
    ("$display()" and "#10" are not things you can use in synthesizable circuits, but they are very useful in writing test harnesses for your circuit modules.)

    The complexity comes from the fact that you are designing circuit elements that are supposed to be actual hardware, and therefore there is a huge amount of parallelism going on. There are things in the language that can jar the mind of a software engineer at first sight though. For example:
    module counter (
        input wire clock,
        input wire reset,
        input wire enable,
        output reg [31:0] counter_out
    );
        always @ (posedge clock) begin
            if (reset == 1) begin
                counter_out <= 0;
            end else if (enable == 1) begin
                counter_out <= counter_out + 1;
            end
        end
    endmodule
    
    To a software guy, those things look like parameters to a function that gets called, but they are not really. They are wires connected to a circuit module. The hardware defined in the module will respond to changes on those wires at any time.

    That "always" is a bit odd to a software guy. It just means do something when the clock goes from low to high.

    Let's test this counter module:
    module counter_tb;
      reg clock;
      reg reset;
      reg enable;
      wire [31:0] counter_out;
    
        initial begin        
            $display ("time\t clk reset enable counter");	
            $monitor ("%g\t %b   %b     %b      %b", $time, clock, reset, enable, counter_out);	
            clock = 1;
            reset = 0;
            enable = 0;
            #5  reset = 1;
            #10  reset = 0;
            #10  enable = 1;
            #100  enable = 0;
            #5  $finish;
        end
      
        always begin
            #5  clock = ~clock;
        end
     
        counter deviceUnderTest (
            clock,
            reset,
            enable,
            counter_out
        );
    endmodule
    

    That's not so bad is it? It's a module, it creates some registers and wires like declarations in other languages. It instantiates a counter as deviceUnderTest and hooks the registers and wires up to that device. It runs through some test steps wiggling reset and enable inputs. It monitors the input and output values.

    A slightly tricky thing is that clock signal. That "#5 clock = ~clock" is running in parallel with everything else toggling the clock every 5 time steps.

    Let's run that using the Icarus simulator:
    $ iverilog -o counter.vvp counter.v
    $
    $ vvp.exe counter.vvp
    time     clk reset enable counter
    0        1   0     0      xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    5        0   1     0      xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    10       1   1     0      00000000000000000000000000000000
    15       0   0     0      00000000000000000000000000000000
    20       1   0     0      00000000000000000000000000000000
    25       0   0     1      00000000000000000000000000000000
    30       1   0     1      00000000000000000000000000000001
    35       0   0     1      00000000000000000000000000000001
    40       1   0     1      00000000000000000000000000000010
    45       0   0     1      00000000000000000000000000000010
    50       1   0     1      00000000000000000000000000000011
    55       0   0     1      00000000000000000000000000000011
    60       1   0     1      00000000000000000000000000000100
    
    Yay, it works.

    All in all much like hacking code in Python. Quick and easy.

    Next up, how do we compile that to C++ using Verilator? ....
  • Heater - I would think that the Verilator idea might be interesting for P2 as well. Why debug on an FPGA board when you can debug in simulation? Also there might be people who don't want to spend money on FPGA boards, but would use a simulator.
  • Heater. Posts: 19,540
    edited May 28
    OK, the Verilator compilation of Verilog into C++ goes pretty easily.

    This is my counter Verilog module:
    module counter (
        input wire clock,
        input wire reset,
        input wire enable,
        output reg [31:0] counter_out
    );
        always @ (posedge clock) begin
            if (reset == 1) begin
                counter_out <= 0;
            end else if (enable == 1) begin
                counter_out <= counter_out + 1;
            end
        end
    endmodule
    
    We write a test harness for that in C++:
    #include "Vcounter.h"
    #include "verilated.h"
    #include <iostream>
    
    vluint64_t main_time = 0;       // Current simulation time
    
    double sc_time_stamp () {       // Called by $time in Verilog
        return main_time;
    }
    
    int main(int argc, char **argv, char **env) {
        Verilated::commandArgs(argc, argv);
        Vcounter* top = new Vcounter;
    
        // Set initial values of inputs
        top->clock = 0;
        top->reset = 1;
        top->enable = 1;
    
        while (!Verilated::gotFinish())
        {
            if (main_time > 10)
            {
                top->reset = 0;        // Deassert reset
            }
    
            // Toggle the clock.
            if ((main_time % 2) == 1) top->clock = 1; else top->clock = 0;
    
            // Run one iteration of our module
            top->eval();
    
            main_time++;            // Time passes...
            if (top->counter_out == 1000000000)
            {
                    std::cout << "counter_out = " << top->counter_out << std::endl;       // Read a output
                    std::cout << "Done!" << std::endl;
                    break;
            }
        }
        delete top;
        exit(0);
    }
    
    Compile the verilog counter into C++:
    $ verilator -Wall -O3 --noassert --cc counter.v --exe sim_main.cpp
    
    Compile the generated C++ together with our C++ test harness:
    $ cd obj_dir
    $ make OPT_FAST="-O2" -j -f Vcounter.mk Vcounter
    
    And run the resulting executable:
    $ time ./Vcounter.exe
    counter_out = 1000000000
    Done!
    
    real    0m6.659s
    user    0m6.609s
    sys     0m0.015s
    
    Hmmm...six and a half seconds to run a billion cycles of our counter. Not bad.

    Looks like I have to try Verilator with my RISC V core...





  • @Wossname,

    That's right. Compile the Verilog once and use it as a black box in your C++ simulator.

    The beauty of this is that it gets you a cycle-accurate simulation of the whole Propeller: COGs, counters, HUB, I/O, video etc.

    Looks like it might be fairly painless to compile the Prop Verilog, see above. Not yet sure about the resulting performance when building a whole Propeller though.

    @KeithE,

    I think the Verilator idea is a very interesting idea for the P2 as well.

    Problem is we don't have the P2 Verilog to play with. It's probably not a good idea to disturb Chip with such notions at this time.
  • jmg Posts: 10,345
    Great proof of concept work.
    Heater. wrote: »
    OK, the Verilator compilation of Verilog into C++ goes pretty easily.

    Did you look at the C code created ?
    Heater. wrote: »
    Hmmm...six and a half seconds to run a billion cycles of our counter. Not bad.
    Yes, that suggests the code is working above register-level, at least.
    If it is worked back to the registers used, that's 32*1e9/6.5 = 4.923 GFlop, where Flop here means Flip-Flop equivalent.
    Even at a 32-bit quantum, that's ~153M/s.
    Heater. wrote: »
    Looks like it might be fairly painless to compile the Prop Verilog, see above. Not yet sure about the resulting performance when building a whole Propeller though.
    Look forward to those numbers....
    Heater. wrote: »
    Problem is we don't have the P2 Verilog to play with. It's probably not a good idea to disturb Chip with such notions at this time.
    Just the fact P2 exists in Verilog, is enough to pursue using already useful P1 emulation as proof of concept for eventual P2 flows.

  • jmg,
    Did you look at the C code created ?
    Oh yeah. You really don't want to see that.

    Well OK, here you go:
    // Verilated -*- C++ -*-
    // DESCRIPTION: Verilator output: Design implementation internals
    // See Vcounter.h for the primary calling header
    
    #include "Vcounter.h"          // For This
    #include "Vcounter__Syms.h"
    
    //--------------------
    // STATIC VARIABLES
    
    
    //--------------------
    
    VL_CTOR_IMP(Vcounter) {
        Vcounter__Syms* __restrict vlSymsp = __VlSymsp = new Vcounter__Syms(this, name());
        Vcounter* __restrict vlTOPp VL_ATTR_UNUSED = vlSymsp->TOPp;
        // Reset internal values
    
        // Reset structure values
        _ctor_var_reset();
    }
    
    void Vcounter::__Vconfigure(Vcounter__Syms* vlSymsp, bool first) {
        if (0 && first) {}  // Prevent unused
        this->__VlSymsp = vlSymsp;
    }
    
    Vcounter::~Vcounter() {
        delete __VlSymsp; __VlSymsp=NULL;
    }
    
    //--------------------
    
    
    void Vcounter::eval() {
        Vcounter__Syms* __restrict vlSymsp = this->__VlSymsp; // Setup global symbol table
        Vcounter* __restrict vlTOPp VL_ATTR_UNUSED = vlSymsp->TOPp;
        // Initialize
        if (VL_UNLIKELY(!vlSymsp->__Vm_didInit)) _eval_initial_loop(vlSymsp);
        // Evaluate till stable
        VL_DEBUG_IF(VL_PRINTF("\n----TOP Evaluate Vcounter::eval\n"); );
        int __VclockLoop = 0;
        QData __Vchange=1;
        while (VL_LIKELY(__Vchange)) {
            VL_DEBUG_IF(VL_PRINTF(" Clock loop\n"););
            vlSymsp->__Vm_activity = true;
            _eval(vlSymsp);
            __Vchange = _change_request(vlSymsp);
            if (++__VclockLoop > 100) vl_fatal(__FILE__,__LINE__,__FILE__,"Verilated model didn't converge");
        }
    }
    
    void Vcounter::_eval_initial_loop(Vcounter__Syms* __restrict vlSymsp) {
        vlSymsp->__Vm_didInit = true;
        _eval_initial(vlSymsp);
        vlSymsp->__Vm_activity = true;
        int __VclockLoop = 0;
        QData __Vchange=1;
        while (VL_LIKELY(__Vchange)) {
            _eval_settle(vlSymsp);
            _eval(vlSymsp);
            __Vchange = _change_request(vlSymsp);
            if (++__VclockLoop > 100) vl_fatal(__FILE__,__LINE__,__FILE__,"Verilated model didn't DC converge");
        }
    }
    
    //--------------------
    // Internal Methods
    
    VL_INLINE_OPT void Vcounter::_sequent__TOP__1(Vcounter__Syms* __restrict vlSymsp) {
        VL_DEBUG_IF(VL_PRINTF("    Vcounter::_sequent__TOP__1\n"); );
        Vcounter* __restrict vlTOPp VL_ATTR_UNUSED = vlSymsp->TOPp;
        // Variables
        VL_SIG(__Vdly__counter_out,31,0);
        // Body
        __Vdly__counter_out = vlTOPp->counter_out;
        // ALWAYS at counter.v:7
        if (vlTOPp->reset) {
            __Vdly__counter_out = 0U;
        } else {
            if (vlTOPp->enable) {
                __Vdly__counter_out = ((IData)(1U) + vlTOPp->counter_out);
            }
        }
        vlTOPp->counter_out = __Vdly__counter_out;
    }
    
    void Vcounter::_eval(Vcounter__Syms* __restrict vlSymsp) {
        VL_DEBUG_IF(VL_PRINTF("    Vcounter::_eval\n"); );
        Vcounter* __restrict vlTOPp VL_ATTR_UNUSED = vlSymsp->TOPp;
        // Body
        if (((IData)(vlTOPp->clock) & (~ (IData)(vlTOPp->__Vclklast__TOP__clock)))) {
            vlTOPp->_sequent__TOP__1(vlSymsp);
        }
        // Final
        vlTOPp->__Vclklast__TOP__clock = vlTOPp->clock;
    }
    
    void Vcounter::_eval_initial(Vcounter__Syms* __restrict vlSymsp) {
        VL_DEBUG_IF(VL_PRINTF("    Vcounter::_eval_initial\n"); );
        Vcounter* __restrict vlTOPp VL_ATTR_UNUSED = vlSymsp->TOPp;
    }
    
    void Vcounter::final() {
        VL_DEBUG_IF(VL_PRINTF("    Vcounter::final\n"); );
        // Variables
        Vcounter__Syms* __restrict vlSymsp = this->__VlSymsp;
        Vcounter* __restrict vlTOPp VL_ATTR_UNUSED = vlSymsp->TOPp;
    }
    
    void Vcounter::_eval_settle(Vcounter__Syms* __restrict vlSymsp) {
        VL_DEBUG_IF(VL_PRINTF("    Vcounter::_eval_settle\n"); );
        Vcounter* __restrict vlTOPp VL_ATTR_UNUSED = vlSymsp->TOPp;
    }
    
    VL_INLINE_OPT QData Vcounter::_change_request(Vcounter__Syms* __restrict vlSymsp) {
        VL_DEBUG_IF(VL_PRINTF("    Vcounter::_change_request\n"); );
        Vcounter* __restrict vlTOPp VL_ATTR_UNUSED = vlSymsp->TOPp;
        // Body
        // Change detection
        QData __req = false;  // Logically a bool
        return __req;
    }
    
    void Vcounter::_ctor_var_reset() {
        VL_DEBUG_IF(VL_PRINTF("    Vcounter::_ctor_var_reset\n"); );
        // Body
        clock = VL_RAND_RESET_I(1);
        reset = VL_RAND_RESET_I(1);
        enable = VL_RAND_RESET_I(1);
        counter_out = VL_RAND_RESET_I(32);
        __Vclklast__TOP__clock = VL_RAND_RESET_I(1);
    }
    
    void Vcounter::_configure_coverage(Vcounter__Syms* __restrict vlSymsp, bool first) {
        VL_DEBUG_IF(VL_PRINTF("    Vcounter::_configure_coverage\n"); );
    }
    
    It looks better when it's compiled to x86 assembler!


  • I think if you boil it down the part of the C++ that is getting run most at emulation time is this:
       if (vlTOPp->reset) {
            __Vdly__counter_out = 0U;
        } else {
            if (vlTOPp->enable) {
                __Vdly__counter_out = ((IData)(1U) + vlTOPp->counter_out);
            }
        }
        vlTOPp->counter_out = __Vdly__counter_out;
    
    Which is pretty much a C++ version of my counter.