Is the P1V bug free?
steddyman
Posts: 91
The reason I ask is that I have recreated the DracBlade on the DE2-115 (originally with an unmodified P1V core), and it always crashes when emulating the Z80 with any program I have tried.
However, every other application works fine, even my test ones. I have even set up the DracBlade in SimpleIDE and compiled XMMC memory model applications to the board, and it runs everything flawlessly, so the hardware emulation of the DracBlade is clearly correct.
However, every ZiCog / z80 emulator I have tried that supports the DracBlade just crashes. Is it possible there are still bugs in the P1V verilog, or was this the actual code used for the real P1 production run?
Comments
If I understand correctly the P1 chip was not designed in a hardware description language. For this reason the P1 verilog was released as the "P8X32A_Emulation".
I always thought "emulation" was a bit of a weird way to describe a different hardware implementation, that term is so often used for software emulators, like ZiCog say. But in this case it makes a bit of sense.
Is it bug free? Who knows? It's Verilog. Verilog is software. Software is basically impossible to prove correct. (Mind you, so is a hand-made schematic of logic gates.)
Can we get some things straight:
1) Can we assume this mishap happens when using an out of the box P8X32A_Emulation configuration? I guess you have some external RAM attached in that case. We need to know if this is a problem in the original Verilog code or if it's due to something you have changed in it.
2) What do you mean by "crashes"? Is it that the Z80 emulation hangs, or that the whole Propeller itself gets into a mess? What actually happens?
Hopefully this is some instruction that is not implemented correctly in Verilog, wrong flag setting or something, which just happens to be used by ZiCog in some rarely used way.
I have a nano board here. How can I recreate your set up here? If you have a very latest ZiCog to package up that would be great.
I can't tell you exactly where it crashes, but in all the emulator apps it hangs almost as soon as it hits the Z80 engine and works fine up until that point. For example, the Spectrum emulator displays the menu of ROMs on the SD card and lets you scroll up and down them, viewing screenshots of each. All of that is just standard Spin and assembler code, but that code uses the external SRAM and SD along with everything else on my schematic. As soon as you select a ROM and press Enter to start it, the screen corrupts and it crashes. It's the same with the Colour Genie emulator and with other Z80-based apps.
Cluso, are you saying that the p1v download source on the parallax site has two known bugs that need to be manually patched?
1. The reset bug fix.
Change in cog.v in the outa/dira section
2. The counter module PLL bug fix
Change in cog_ctr.v in the simulated PLL section
3. The cog led bug for the BEmicro CV version.
Change in top.tdf file the cog led section and in the top.qsf file change the led assignments to
Hope this helps.
Cheers
Brian
doesn't loop; the loop executes one time only. But this
works.
Don't know why this nop makes the difference.
Then this thing in spin
doesn't work, every loop executes one time, but when
these bad loops are executed as if they were good.
Of course I added this nop and forgot the problem, but... why? I can understand something is wrong with this loop without this nop. For example maybe the hardware needs some more time for one loop to write to the SRAM. But why - why???? this waitcnt can make the difference? The loop is of course executed in another cog, but the spin procedure waits for the end of the loop. I checked this, it really waits. Strange.
But then I added a nop and forgot about the problem.
Thanks for the heads up Pik. I guess these issues are what is probably killing the emulator. Very hard in emulation like this to find out where it might be going wrong.
They sound related, and could be something like the DJNZ needing more than 4 cycles to settle.
If NOP is OK as a preceding opcode, are there any others?
ie DJNZ always needing NOP would have been picked up before now, so the last real opcode that the DJNZ packs may be important. ie only some opcode pairings with DJNZ may trigger this phase effect.
I run all my P1v's @ 80MHz now. Lots of different code with lots of DJNZ loops, no obvious issues so far.
I am running code on a Nano, a DE2-115 and BEMicro CV too.
It's not impossible that the P1V code still has bugs. There have been several bug fixes since it went into Github; if you get the Master branch from Mindrobots, you will have the most stable version as far as we know.
Any bugs that may still be in there, are likely to be related to the timers. For example, the PLL is emulated in digital hardware so there may be some jitter. Also, Quartus II generates many warnings and specifically states that it doesn't meet timing requirements; we have to look into that some day but Chip has no time, and most of us are HDL noobs. We're happy it works as well as it does, and I'd certainly be interested in troubleshooting anything that works on the real P1 but not on the P1V.
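To illustrate the kind of jitter a purely digital oscillator can show, here is a minimal sketch in Python. It models a generic 32-bit phase accumulator stepped once per system clock and is NOT taken from cog_ctr.v; the function name and the example step values are assumptions made up for this illustration.

```python
def nco_edge_intervals(frq, n_clocks, width=32):
    """Spacing, in system clocks, between rising edges of the accumulator MSB."""
    mask = (1 << width) - 1
    msb = 1 << (width - 1)
    phase = 0
    prev_bit = 0
    last_edge = None
    intervals = []
    for t in range(n_clocks):
        phase = (phase + frq) & mask      # accumulate and wrap like a hardware register
        bit = 1 if phase & msb else 0
        if bit and not prev_bit:          # rising edge of the output bit
            if last_edge is not None:
                intervals.append(t - last_edge)
            last_edge = t
        prev_bit = bit
    return intervals

# Power-of-two step: the output period is a whole number of clocks.
even = nco_edge_intervals(1 << 30, 64)
# Step of roughly 0.3 * 2**32: the ideal period is 3.33 clocks,
# so edges can only land on whole clocks around that value.
uneven = nco_edge_intervals(int(0.3 * 2**32), 64)
print(sorted(set(even)), sorted(set(uneven)))
```

With the power-of-two step every edge is the same distance apart; with the other step the spacing alternates between neighbouring whole-clock values, which is one system clock of jitter on the output.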
===Jac
I've implemented countless ARMs in Xilinx and on-chip and you have to 100% close timing on targeted clock rate with margin otherwise flakiness appears, even with golden code. Also, not sure if Propeller is same, but ARM is piped so compacted code may differ with instruction sequence used. Proper semaphoring must also be adhered to.
I suspect that we can say affirmative to this question:)
I have just spent an incredible amount of time trying to do the simplest of things...
write to the SRAM with direct indexing. I thought I had solved it in an earlier post and then moved on to VGA... but then as I was about to reach a milestone... I went back and thoroughly tested the SRAM implementation.
The SRAM protocol is simple... but when you assume your own ignorance and try every
perturbation... it can take time:)
In this case, it turned out that if I tried to write to the SRAM (100 MHz independent PLL)
with a Prop clock at 100 MHz... I often got good results... but occasionally bad results... each followed by furtive Verilog changes.
Finally, I reduced the Prop clock to 80Mhz and on first testing, I got 350 million consecutive writes and reads without an error.
BUT... if I restarted my Spin program repeatedly, I occasionally got errors.
And the errors seemed patterned. Generally, there was a fixed rate of error for a particular run,
but the error rate differed significantly between runs.
Sometimes I would get one error per 35000 writes and sometimes 30000 errors in 35000 writes.
(the 35000 writes/reads were in a continuous loop with 35000 writes followed by 35000 reads.)
So, if a run started out bad, it stayed bad with the same amount of badness...
If a run was perfect in the beginning it stayed that way for the entire run.
What is interesting to me (but perfectly meaningless)... is that the errors didn't return as random 16 bit numbers... they always came back as "-1".
I have no idea what this means, but it has to mean something... seems vaguely hardware-ish:)
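For anyone wanting to repeat this kind of test, here is a rough sketch of the write-then-read pass described above, in Python. The `FakeSram` class, the test pattern, and the `bad_every` fault model are all hypothetical stand-ins, not the code from this post; on real hardware `write_word` and `read_word` would be whatever routines talk to the board. It tallies errors that come back as all ones (16-bit -1 is 0xFFFF) separately from other wrong values.

```python
def run_pass(write_word, read_word, n_words=35000):
    """Write a known pattern to n_words addresses, read it back, count mismatches."""
    for addr in range(n_words):
        write_word(addr, addr & 0x7FFF)   # pattern never equals 0xFFFF
    errors = 0
    all_ones = 0
    for addr in range(n_words):
        got = read_word(addr)
        if got != (addr & 0x7FFF):
            errors += 1
            if got == 0xFFFF:             # 16-bit "-1"
                all_ones += 1
    return errors, all_ones

class FakeSram:
    """Hypothetical stand-in for the board; returns 0xFFFF on every Nth read."""
    def __init__(self, bad_every=0):
        self.cells = {}
        self.bad_every = bad_every
        self.reads = 0

    def write(self, addr, value):
        self.cells[addr] = value & 0xFFFF

    def read(self, addr):
        self.reads += 1
        if self.bad_every and self.reads % self.bad_every == 0:
            return 0xFFFF
        return self.cells[addr]

good = FakeSram()
flaky = FakeSram(bad_every=1000)
good_result = run_pass(good.write, good.read)
flaky_result = run_pass(flaky.write, flaky.read)
print(good_result, flaky_result)
```

Keeping the two counts apart makes it easy to see whether a given run fails only with -1 or with arbitrary wrong values.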
Today, I reread this thread and reduced the Prop's clock to 50Mhz.
I have been sitting here clicking the program button in PropellerIDE all morning without a direct indexing glitch.
I still have occasional issues with auto-indexing, but now it returns 16 bit values... occasionally wrong but not -1:)
Do not use an independent PLL; this is the mother of all SRAM/Propeller problems, and it stopped my retromachine project for a long time. There were random read/write SRAM errors. The solution is to use ONE PLL with synchronous frequencies. The Cyclone 4 PLL can have more than one output; use this.
I have 152 MHz pixel and SRAM clock, and 114 MHz (0.75*152) Propeller clocked from the same PLL.
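A quick bit of arithmetic on those two numbers, as a Python sketch (the variable names are just for illustration): because both clocks come from one PLL and their ratio is a small fraction, their edges line up again after a fixed number of cycles.

```python
from fractions import Fraction
from math import gcd

sram_mhz, prop_mhz = 152, 114

ratio = Fraction(prop_mhz, sram_mhz)      # Propeller clock relative to SRAM clock
beat_mhz = gcd(sram_mhz, prop_mhz)        # rate at which the edge pattern repeats
prop_cycles = prop_mhz // beat_mhz        # Propeller cycles per repeat
sram_cycles = sram_mhz // beat_mhz        # SRAM cycles per repeat
repeat_ns = 1000 / beat_mhz               # length of one repeat in nanoseconds

print(ratio, beat_mhz, prop_cycles, sram_cycles, round(repeat_ns, 1))
```

So every 3 Propeller cycles span exactly 4 SRAM cycles, and the phase relationship between the two clocks is fixed rather than drifting, which is what two independent PLLs cannot guarantee.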
Initially I had problems at 80 MHz, but then I shifted the clock feeding the P1V by -90 degrees... and
that seems to have fixed it. Humming along at 100 MHz.
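For a feel of how big a -90 degree shift is in time, a small Python sketch (the helper name is made up for this example) converts a phase shift to nanoseconds at the two clock rates mentioned:

```python
def phase_shift_ns(freq_mhz, degrees):
    """Convert a clock phase shift in degrees to a time offset in nanoseconds."""
    period_ns = 1000.0 / freq_mhz
    return period_ns * degrees / 360.0

shift_80 = phase_shift_ns(80, -90)
shift_100 = phase_shift_ns(100, -90)
print(shift_80, shift_100)
```

A quarter period is only a few nanoseconds at these speeds, which is the same order as the setup and hold margins involved when talking to external SRAM.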
Thanks for all the work you have done on the retromachine... none of what I am doing right now would have been possible without it.
Rich
I use DFFs for this purpose. Recompile the project after changes: errors. Add a DFF here and there: it works.
Recompile once again with some minor change: errors. What to do then? Remove the DFF, and then it works.
FPGA is fun
Speaking of chip, it'd be fun to chat with Chip sometime.
Not sure about any of the SRAM issues, but on the board, there are timing paths pin to pin which have to be considered. Tighter timing requires matched paths off-chip, to-chip, and on-chip. Try implementing 40GHz paths through a SERDES and onto an FPGA.
A couple of things I wanted to mention. Verilog IS HDL in this form, not software. It is not being used in a Verilog testbench, it is being implemented in an FPGA, which means it is converted to (programmable) gates, which are then laid out and routed in the FPGA, and MUST meet all timing to be robust, just like the real ASIC. Also, the P1 (that's the ASIC, right?) had to have been designed in HDL.
Cheers!
IIRC, that was done for simplicity, but as most FPGAs have more than one PLL, it would be nice to make use of them, and in many use cases the Counter PLL is not needed.
The ideal would be to have a proper PLL for SysCLK generation, and access to a proper PLL for Counter-NCO work, but of course that second item will be highly target dependent.