Propeller 1 running on Pipistrello (Xilinx Spartan6-LX45)

mkarlsson Posts: 13
edited 2014-09-03 - 11:55:30 in Propeller 1
Greetings!

I have played a bit with the code and got it to run on the Pipistrello board. More testing needs to be done, but at least it runs the tutorial files in Propeller Tools v1.3.2.
The board can be programmed directly in Propeller Tools since the code detects DTR going high and generates a long reset pulse to emulate the RC-filter on the Prop Plug.
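
As a rough illustration of the DTR handling described above, here is a minimal Verilog sketch of a rising-edge detector that stretches the event into a long reset pulse. The module name, port names, and counter width are assumptions made for this example and are not taken from the attached project.

```verilog
// Hypothetical sketch: turn a rising edge on DTR into a long reset pulse.
// Names and the pulse length are illustrative assumptions, not the project's code.
module dtr_reset_pulse #(
    parameter WIDTH = 20              // pulse lasts 2^WIDTH - 1 clocks (assumed value)
) (
    input  wire clk,
    input  wire dtr,                  // asynchronous DTR line from the USB-serial chip
    output wire res                   // active-high reset to the core
);
    reg [2:0]       sync  = 3'b000;   // synchronizer plus one stage for edge detection
    reg [WIDTH-1:0] count = {WIDTH{1'b0}};

    wire dtr_rise = sync[1] & ~sync[2];

    always @(posedge clk) begin
        sync <= {sync[1:0], dtr};
        if (dtr_rise)
            count <= {WIDTH{1'b1}};   // reload: start (or restart) the pulse
        else if (count != 0)
            count <= count - 1'b1;    // count down to zero
    end

    assign res = (count != 0);        // reset is asserted while the counter runs
endmodule
```

The counter replaces the RC time constant: a wider counter or a slower clock gives a longer pulse.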

Pins 0-15 are mapped to Wing A0-A15, pins 16-29 are mapped to Wing B0-B13. Pins 30 and 31 are connected to the FTDI chip for serial communication.

The cog LEDs are mapped to Wing C0-C7.

Two of the on-board LEDs (LED1 and LED2) are monitoring pin 30 (tx) and pin 31 (rx).
Two other LEDs (LED3 and LED4) are monitoring pins 16 and 17 which makes it easy to play with the tutorial examples that toggle those two pins.

Attached is a zip file with a full Xilinx ISE project (ISE14.4) including all verilog source files and a Pipistrello bit file (in the synth directory).

This is based on the 2014-08-11 code release.


Info about Pipistrello can be found here: http://pipistrello.saanlima.com/index.php?title=Welcome_to_Pipistrello

Enjoy!

Magnus Karlsson
Saanlima Electronics

Comments

  • Tubular Posts: 3,723
    edited 2014-08-18 - 19:09:02
    Very nice, Magnus. Good that you can deal with the DTR pulse

    Can you comment on how full the Spartan 6 is (how many resources used/free?)
  • mkarlsson Posts: 13
    edited 2014-08-18 - 19:13:38
    Sure. 51% of the slices are occupied, so there is plenty of room left over. BTW, the code has 64 KB of RAM in the hub, with the top 4 KB preloaded with the boot code etc.
  • Bill Henning Posts: 6,445
    edited 2014-08-18 - 19:28:31
    Nice work!

    Sounds like the design might fit in an LX25.
    www.mikronauts.com / E-mail: mikronauts _at_ gmail _dot_ com / @Mikronauts on Twitter
    RoboPi: The most advanced Robot controller for the Raspberry Pi (Propeller based)
  • jmg Posts: 14,013
    edited 2014-08-18 - 19:37:11
    mkarlsson wrote: »
    Greetings!

    I have played a bit with the code and got it to run on the Pipistrello board. More testing needs to be done, but at least it runs the tutorial files in Propeller Tools v1.3.2.

    Sounds good.
    Just trying to work out the flows on this ?
    * Can the P1V code load from on board memory (SPI flash?), or does it expect an i2c chip ?
    * Does the FPGA load from config memory if no JTAG is present (ie can re-run stand alone, once programmed)

    Addit: The web site was a little vague, but google did find this
    http://www.element14.com/community/blogs/alex-the-kidd/2013/02/06/pipistrello-rev-20-fpga-board

    That says the FPGA can load from Flash and typically uses 2MB of the 16MB.
    - which leaves ~14MB? FLASH for users, and the Flash Memory can run to 108MHz QuadSPI.

    This also says "dedicated MCB (memory controller blocks) which allow the user to interface to SDRAM chips without having to code (or even understand) all the complexities and intricacies of driving the SDRAM's bus and performing refreshes." - but no numbers on random and burst access speeds, or how much logic this uses.

    The default P1V would expect i2c boot, but a modified loader could read from the SPI Flash @ QuadSPI Speeds.
  • overclocked Posts: 80
    edited 2014-08-18 - 23:34:23
    mkarlsson wrote: »
    Greetings!

    I have played a bit with the code and got it to run on the Pipistrello board. More testing needs to be done, but at least it runs the tutorial files in Propeller Tools v1.3.2.
    The board can be programmed directly in Propeller Tools since the code detects DTR going high and generates a long reset pulse to emulate the RC-filter on the Prop Plug.

    Thanks for sharing Magnus!
    A question: when trying to port this to a Spartan-3E 1600E I get the following messages:
    -snip-
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog_ctr.v" line 65 'trig' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog.v" line 224 'm' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog.v" line 224 'px' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog.v" line 236 'waiti' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog.v" line 237 'waiti' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog.v" line 252 'cond' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog.v" line 252 'jump_cancel' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog.v" line 253 'px' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog.v" line 258 'cond' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog.v" line 258 'i' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog.v" line 259 'alu_co' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog.v" line 264 'cond' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog.v" line 264 'i' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog.v" line 265 'alu_zo' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog.v" line 294 'cond' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog.v" line 294 'i' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog.v" line 310 'cond' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog.v" line 310 'i' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog.v" line 312 'alu_wr' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog.v" line 314 'px' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog.v" line 315 'i' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog.v" line 316 'i' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog.v" line 449 'cancel' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/hub.v" line 145 'sys_q' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/hub.v" line 150 'sys_c' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/hub.v" line 173 'lock_e' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/hub.v" line 173 'cog_e' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/top.v" line 101 'res' has not been declared
    -snip-

    It actually synthesized your original solution for the S6-LX45.
    I've seen these messages before, and the only way to remove them seems to be to move every declaration above its first use. Have you seen these, and what is your solution?
    And why do they only show up for the S3? Is a different toolchain used for building?
  • mkarlsson Posts: 13
    edited 2014-08-19 - 06:37:42
    overclocked wrote: »
    Thanks for sharing Magnus!
    A question: when trying to port this to a Spartan-3E 1600E I get the following messages:
    -snip-
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog_ctr.v" line 65 'trig' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/cog.v" line 224 'm' has not been declared
    ....
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/hub.v" line 173 'lock_e' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/hub.v" line 173 'cog_e' has not been declared
    ERROR:HDLCompilers:28 - "C:/Share/FPGA/60050-60056-Propeller-1-Design-2014-08-06/P8X32A_Emulation/P8X32A_Pipistrello/src/top.v" line 101 'res' has not been declared
    -snip-

    It actually synthesized your original solution for the S6-LX45.
    I've seen these messages before, and the only way to remove them seems to be to move every declaration above its first use. Have you seen these, and what is your solution?
    And why do they only show up for the S3? Is a different toolchain used for building?

    Yeah, it looks like the S6 HDL parser allows the use of a variable before it's declared but the S3 parser doesn't.
    The simplest fix is to declare all wires at the top and then use an assign statement rather than declaring and assigning in one statement, like this (to fix top.v):

    wire res;

    <code>

    assign res = ~inp_resn;

    Magnus
  • mkarlsson Posts: 13
    edited 2014-08-19 - 07:10:45
    jmg wrote: »
    Sounds good.
    Just trying to work out the flows on this ?
    * Can the P1V code load from on board memory (SPI flash?), or does it expect an i2c chip ?
    * Does the FPGA load from config memory if no JTAG is present (ie can re-run stand alone, once programmed)

    Addit: The web site was a little vague, but google did find this
    http://www.element14.com/community/blogs/alex-the-kidd/2013/02/06/pipistrello-rev-20-fpga-board

    That says the FPGA can load from Flash and typically uses 2MB of the 16MB.
    - which leaves ~14MB? FLASH for users, and the Flash Memory can run to 108MHz QuadSPI.

    This also says "dedicated MCB (memory controller blocks) which allow the user to interface to SDRAM chips without having to code (or even understand) all the complexities and intricacies of driving the SDRAM's bus and performing refreshes." - but no numbers on random and burst access speeds, or how much logic this uses.

    The default P1V would expect i2c boot, but a modified loader could read from the SPI Flash @ QuadSPI Speeds.

    The P1V code expects an i2c chip connected to pins 28 and 29. To use the SPI flash instead, the code needs to be rewritten and more pins used (at least 4).
    The FPGA loads from config memory, so it can run stand-alone.

    With an FPGA in the mix there are many ways you can load the code. Here is one option:
    When I ported an NES emulation to Pipistrello, it only had the option to load a game via serial communication with a PC. To make it stand-alone I added a small microcontroller (microblaze_mcs) front-end hooked up to the SD-card socket. At power-up the microcontroller checked the SD card for a game file and, if found, sent it serially to the NES module. That way you could load any game by simply putting it on the SD card and resetting the board. A similar solution could be done for the P1V core, i.e. the microcontroller would toggle the DTR line and send the code to the P1V just like the Propeller Tool does.

    As for the SDRAM, the performance is set by the SDRAM chip, which can do up to 800 MB/s in peak data rate. A realistic sustained data rate is something like 700 MB/s but would require burst access.

    Magnus
  • overclocked Posts: 80
    edited 2014-08-19 - 07:11:40
    mkarlsson wrote: »
    Yeah, it looks like the S6 HDL parser allows the use of a variable before it's declared but the S3 parser doesn't.
    The simplest fix is to declare all wires at the top and then use an assign statement rather than declaring and assigning in one statement, like this (to fix top.v):

    wire res;

    <code>

    assign res = ~inp_resn;

    Magnus

    Thanks! To follow up on this subject:
    After getting your advice on this I searched further and found that this is ACTUALLY the case. ISE switches between different parsers for old and "new" devices. Very odd. I found several discussions about the subject.

    There is even a way to force the new parser on old devices.
    NOTE: Not tested yet, but as soon as I have an ISE I'll try it.
    http://forums.xilinx.com/t5/Synthesis/How-to-enable-the-new-parser-for-XST-in-ISE-12-1/m-p/133272
  • Ramon Posts: 364
    edited 2014-08-19 - 08:17:39
    Magnus, thanks a lot !

    I really liked your conversion, it is really nice. I will try to adapt your code to the Spartan 3. Will post results when done.

    Today I ran Verilator and Icarus Verilog on the SystemVerilog version and they failed. But now that there are no more packed arrays and the ROM files are loaded using $readmemh, I think that there is no reason for them to fail anymore. Will try again tomorrow with this Verilog 2001 code.

    I have seen these two changes in the code:

    cog.v
    line 212: ptr <= 27'b00000000000000_11111000000000;
    line 212: ptr <= 28'b00000000000000_11111000000000;

    cog_alu.v
    wire ri -> from 31 bits to 32 bits (several lines changed)
    wire rot -> from 63 bits to 65 bits

    Were those changes fixes for bugs in the original code?
  • Ramon Posts: 364
    edited 2014-08-19 - 08:21:48

    Overclocked, thanks for the link.

    When I saw the code from Magnus I said: "Hey! What a genius. I needed millions more changes to get rid of those annoying dependencies between wires and regs."

    Will try tomorrow ...
  • mkarlsson Posts: 13
    edited 2014-08-19 - 09:04:28
    Ramon wrote: »
    Magnus, thanks a lot !

    I really liked your conversion, it is really nice. I will try to adapt your code to the Spartan 3. Will post results when done.

    Today I ran Verilator and Icarus Verilog on the SystemVerilog version and they failed. But now that there are no more packed arrays and the ROM files are loaded using $readmemh, I think that there is no reason for them to fail anymore. Will try again tomorrow with this Verilog 2001 code.

    I have seen these two changes in the code:

    cog.v
    line 212: ptr <= 27'b00000000000000_11111000000000;
    line 212: ptr <= 28'b00000000000000_11111000000000;

    cog_alu.v
    wire ri -> from 31 bits to 32 bits (several lines changed)
    wire rot -> from 63 bits to 65 bits

    Were those changes fixes for bugs in the original code?

    I made no changes to cog.v from the 2014-08-11 release, which has this as a 28-bit field.

    I did change ri from (8 * 31) bits to (8 * 32) bits and rot from 63 to 64 bits to make the ri field indexing simpler (i[2:0] * 31 needs a full multiplier while i[2:0] * 32 is just a shift).
    There is no functional change, this is part of the SystemVerilog -> Verilog2001 conversion.
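
    The indexing point can be illustrated with a small sketch. Assuming eight fields packed into one flat vector and selected by a 3-bit index (the names below are hypothetical and not copied from cog_alu.v), a 32-bit stride lets the offset be formed by plain concatenation, whereas a 31-bit stride would need a multiply by a non-power-of-two:

    ```verilog
    // Hypothetical sketch of selecting one of eight packed fields.
    // Signal names are illustrative; the real cog_alu.v may differ.
    module field_select (
        input  wire [8*32-1:0] ri,        // eight fields, each padded to 32 bits
        input  wire [2:0]      i,         // field index
        output wire [31:0]     field
    );
        // i * 32 is just i shifted left by 5, so no multiplier is inferred.
        wire [7:0] offset = {i, 5'b00000};

        // Verilog-2001 indexed part-select: 32 bits starting at 'offset'.
        assign field = ri[offset +: 32];
    endmodule
    ```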

    Magnus
  • overclocked Posts: 80
    edited 2014-08-19 - 10:37:43
    overclocked wrote: »
    Thanks! To follow up on this subject:
    After getting your advice on this I searched further and found that this is ACTUALLY the case. ISE switches between different parsers for old and "new" devices. Very odd. I found several discussions about the subject.

    There is even a way to force the new parser on old devices.
    NOTE: Not tested yet, but as soon as I have an ISE I'll try it.
    http://forums.xilinx.com/t5/Synthesis/How-to-enable-the-new-parser-for-XST-in-ISE-12-1/m-p/133272

    SUCCESS! After taking Magnus's code above, changing the PLL to a Spartan-3E DCM, applying the switch above, and rewriting the UCF file with the correct pin configuration for the new FPGA, it actually works! For now I use a switch to spoof the reset signal, so you need to be "quick" and try it a few times, but I'm starting to get the hang of it! :-) It will of course work better with a real DTR signal from the Propeller programming tool.

    Thanks for sharing and supporting. I actually had the core built for Spartan-3, but due to things like that Reset/DTR fix and probably other stuff, it just didn't boot/work! But now both "Identify Hardware" and programming work.

    Later when I get home (I'm travelling now, at a hotel and on a course), I will start connecting some fun stuff like Video/VGA/Keyboard/SD-Card!
    Are you planning to mirror these files on GitHub, Magnus?
    I'm going to keep my promise and share these files, but it seems more correct to fork the work you've done.
  • AntoineDoinel Posts: 307
    edited 2014-08-19 - 14:24:23
    Many thanks for this Magnus!

    Also many thanks to overclocked for his work in the xilinx port thread!

    I tried to synthesize for LogiPi board (LX9), with 3 COGs it completes at 93% LUTs and 100% BRAM.

    <Mram_r> is getting moved to slice LUTs. Is that the cog RAM?
    If so, something doesn't add up... since it's using only 768 slices, and that would be too small, right?

    Translate fails; I'm still trying to figure out the pin replacements, from your ucf file to the one supplied with LogiPi.


    Thanks again
    Alessandro
  • mkarlsson Posts: 13
    edited 2014-08-20 - 09:26:08
    Since the parser used for Spartan3E won't allow the use of a reg or net before it's declared, I decided to clean up the code where those errors showed up, and it now compiles on both Spartan3E and Spartan6 without any special parser select option.

    I also added options to use either a PLL or a DCM for clock generation (Spartan6 can use either, Spartan3E must use the DCM), as well as an option to either use an active-low reset input like the original P1V release code or use a DTR input and simulate the RC network of the Prop Plug in Verilog. Both options are controlled by ifdefs.
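
    For readers unfamiliar with this style of build-time switch, a minimal sketch of an ifdef-selected reset source might look like the following. The macro name and port names are invented for illustration; the names used in the attached project may be different.

    ```verilog
    // Hypothetical macro name; comment this line out to use the active-low reset pin.
    `define USE_DTR_RESET

    module reset_source (
        input  wire inp_resn,             // active-low reset pin (original P1V style)
        input  wire dtr_pulse,            // reset pulse derived from DTR elsewhere
        output wire res                   // active-high reset to the core
    );
    `ifdef USE_DTR_RESET
        assign res = dtr_pulse;           // reset driven by the emulated Prop Plug pulse
    `else
        assign res = ~inp_resn;           // reset driven directly by the pin
    `endif
    endmodule
    ```

    Only one branch is compiled, so the unused input is simply left unconnected inside the module.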

    See attached zip file for the complete ISE project (currently set for Spartan6 LX45).

    Enjoy!

    Magnus
  • jac_goudsmit Posts: 415
    edited 2014-08-20 - 09:43:07
    Magnus, I would like to add your code to my Github repository.

    I know I can add your code without asking (because GPL) but I would prefer if you would make your own fork and maintain your code there, and post pull requests to integrate changes. Would you be interested?

    Thanks!

    ===Jac
    Rancho Cucamonga, CA
  • overclocked Posts: 80
    edited 2014-08-20 - 11:19:11
    mkarlsson wrote: »
    Since the parser used for Spartan3E won't allow the use of a reg or net before it's declared, I decided to clean up the code where those errors showed up, and it now compiles on both Spartan3E and Spartan6 without any special parser select option.

    I also added options to use either a PLL or a DCM for clock generation (Spartan6 can use either, Spartan3E must use the DCM), as well as an option to either use an active-low reset input like the original P1V release code or use a DTR input and simulate the RC network of the Prop Plug in Verilog. Both options are controlled by ifdefs.

    See attached zip file for the complete ISE project (currently set for Spartan6 LX45).

    Enjoy!

    Magnus

    Tack.. or should I say Thank you for sharing! :-) I'll try it right away!
  • mkarlsson Posts: 13
    edited 2014-08-20 - 12:04:25
    jac_goudsmit wrote: »
    Magnus, I would like to add your code to my Github repository.

    I know I can add your code without asking (because GPL) but I would prefer if you would make your own fork and maintain your code there, and post pull requests to integrate changes. Would you be interested?

    Thanks!

    ===Jac

    OK, you can find it here: https://github.com/magnuskarlsson/P8X32A_Emulation

    Magnus
  • jac_goudsmit Posts: 415
    edited 2014-08-20 - 12:58:34
    Merged!

    Thanks Magnus!

    ===Jac
    Rancho Cucamonga, CA
  • overclocked Posts: 80
    edited 2014-08-20 - 16:17:30
    mkarlsson wrote: »
    Since the parser used for Spartan3E won't allow the use of a reg or net before it's declared, I decided to clean up the code where those errors showed up, and it now compiles on both Spartan3E and Spartan6 without any special parser select option.

    I also added options to use either a PLL or a DCM for clock generation (Spartan6 can use either, Spartan3E must use the DCM), as well as an option to either use an active-low reset input like the original P1V release code or use a DTR input and simulate the RC network of the Prop Plug in Verilog. Both options are controlled by ifdefs.

    See attached zip file for the complete ISE project (currently set for Spartan6 LX45).

    Enjoy!

    Magnus

    OK, I have now tried your new project. As it seems, using the "old" parser does not change as much as expected when it comes to resources or FMAX, but it DOES actually break the code.

    I CAN boot the Propeller when running your original code, building for S3E-1600E with switch: "-use_new_parser yes".
    I CAN'T boot the Propeller when using your new code (with moved to top reg/wires), building for S3E-1600E without switch
    I CAN boot the Propeller when using your new code (with moved to top reg/wires), building for S3E-1600E with switch: "-use_new_parser yes".

    So very strange.. the old parser made for the S3E does NOT create executable BIT-file in this case..
    The timing is met according to TimeSpec.
  • mkarlsson Posts: 13
    edited 2014-08-20 - 16:40:14
    Yeah, found the problem. It barfs at the memory initialization:

    WARNING:Xst:2319 - "J:/../P8X32A_Emulation/P8X32A_Xilinx/src/hub_mem.v" line 55: Signal ram0 in initial block is partially initialized. The initialization will be ignored.
    WARNING:Xst:2319 - "J:/../P8X32A_Emulation/P8X32A_Xilinx/src/hub_mem.v" line 56: Signal ram1 in initial block is partially initialized. The initialization will be ignored.
    WARNING:Xst:2319 - "J:/../P8X32A_Emulation/P8X32A_Xilinx/src/hub_mem.v" line 57: Signal ram2 in initial block is partially initialized. The initialization will be ignored.
    WARNING:Xst:2319 - "J:/../P8X32A_Emulation/P8X32A_Xilinx/src/hub_mem.v" line 58: Signal ram3 in initial block is partially initialized. The initialization will be ignored.

    I don't know why it complains but it seems the old version has lots of problems with $readmemh based on this post:
    http://forums.xilinx.com/t5/Synthesis/Error-initializing-memory-using-readmemh-Xst-2319-bug/td-p/44680

    So the bottom line is that the netlist is good but there is no bootloader in the RAM, so it won't boot.

    Magnus
  • mkarlsson Posts: 13
    edited 2014-08-20 - 16:52:01
    I found this info somewhere on the net:


    Xilinx's support of $readmemh is pretty brain-dead. Here's the rules that have worked for me:
    1) No addresses in the hex file. Sorry - not supported.
    2) Exactly one value per line of text.
    3) No comments.
    4) Size of the initialization data must exactly match the size of the memory array it initializes. Partial initialization of an array results in no initialization (entire array reverts to its default of all zeroes).

    So I think the problem is that only the top 1KB of each RAM is initialized. I guess one workaround would be to add 15*1024 lines of "00" to the data files so that the whole RAM is initialized. What a pain...
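
    The padding does not have to be done by hand. As a rough sketch, a small simulation-only Verilog module (run in a simulator such as Icarus Verilog, never synthesized) could write out a padded copy of one data file. The file names below are placeholders, and the input is assumed to hold exactly 1024 byte values with no address directives:

    ```verilog
    // Hypothetical simulation-only helper; file names are placeholders.
    // Assumes the input file holds exactly 1024 byte values for one byte lane.
    module pad_rom_file;
        reg [7:0] rom [0:1023];           // the 1 KB of ROM data for one byte lane
        integer   fd, k;

        initial begin
            $readmemh("rom_byte_0.hex", rom);
            fd = $fopen("rom_byte_0_padded.hex", "w");

            // 15 * 1024 zero entries for the lower, otherwise uninitialized part.
            for (k = 0; k < 15 * 1024; k = k + 1)
                $fdisplay(fd, "00");

            // Followed by the original 1024 ROM bytes, one value per line.
            for (k = 0; k < 1024; k = k + 1)
                $fdisplay(fd, "%h", rom[k]);

            $fclose(fd);
            $finish;
        end
    endmodule
    ```

    The output has one value per line, no addresses and no comments, which matches the rules quoted above.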

    Magnus
  • overclocked Posts: 80
    edited 2014-08-20 - 17:37:46
    mkarlsson wrote: »
    I found this info somewhere on the net:


    Xilinx's support of $readmemh is pretty brain-dead. Here's the rules that have worked for me:
    1) No addresses in the hex file. Sorry - not supported.
    2) Exactly one value per line of text.
    3) No comments.
    4) Size of the initialization data must exactly match the size of the memory array it initializes. Partial initialization of an array results in no initialization (entire array reverts to its default of all zeroes).

    So I think the problem is that only the top 1KB of each RAM is initialized. I guess one workaround would be to add 15*1024 lines of "00" to the data files so that the whole RAM is initialized. What a pain...

    Magnus

    Yes, I found this too. I've added lines to all the files to match the RAM size, and the errors/warnings about memory init are gone. Sorry to say, it still doesn't work.
    In the process of cleaning up I've also made some other changes. Do they seem correct? I've also tried defaulting back to the original RAM size, as you can see.

    The current lines I've got are:
    `define COG_COUNT 8
    `define COG_RAM_SIZE 512 // 512*32bit= 2KB
    `define HUB_MEM_SIZE 8 // KB*4 => 8*4=32KB
    reg [7:0] ram3 [0:(`HUB_MEM_SIZE*1024-1)];
    reg [7:0] ram2 [0:(`HUB_MEM_SIZE*1024-1)];
    reg [7:0] ram1 [0:(`HUB_MEM_SIZE*1024-1)];
    reg [7:0] ram0 [0:(`HUB_MEM_SIZE*1024-1)];
    $readmemh ("ROM_$F000-$FFFF_BYTE_3.spin", ram3, `HUB_MEM_SIZE*1024*8);
    $readmemh ("ROM_$F000-$FFFF_BYTE_2.spin", ram2, `HUB_MEM_SIZE*1024*8);
    $readmemh ("ROM_$F000-$FFFF_BYTE_1.spin", ram1, `HUB_MEM_SIZE*1024*8);
    $readmemh ("ROM_$F000-$FFFF_BYTE_0.spin", ram0, `HUB_MEM_SIZE*1024*8);
    
    
  • mkarlsson Posts: 13
    edited 2014-08-20 - 18:04:44
    Two things:

    1) The syntax of $readmemh is

    $readmemh("File",ArrayName,StartAddr,EndAddr);

    If you omit EndAddr it will go to the end of the RAM. If you omit both StartAddr and EndAddr it will load the whole ram.

    So in your case my guess is that if your data file now has data for the whole RAM then the line should look like this:
    $readmemh ("ROM_$F000-$FFFF_BYTE_3.spin", ram3);
    etc.

    2) Don't forget that you need to map the top 4KB of the HUB RAM address space to the top 4KB of your RAM.
    In my code that's not needed since I use the whole address space but if you have less than 64KB then this mapping must happen.
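
    To make point 2 concrete, here is one possible shape for such a mapping. This is only a sketch under assumed names and sizes (a 14-bit long address covering 64 KB, and byte-lane RAMs that are DEPTH entries deep); it is not taken from hub_mem.v:

    ```verilog
    // Hypothetical sketch: place the $F000-$FFFF region in the last 1024 entries
    // of a smaller RAM. Names and sizes are illustrative assumptions.
    module hub_addr_map #(
        parameter DEPTH = 8192            // entries per byte-lane RAM (assumed: 32 KB total)
    ) (
        input  wire [13:0] a,             // long address in the full 64 KB space
        output wire [13:0] ram_a          // index into the byte-lane RAMs
    );
        localparam [13:0] ROM_BASE = DEPTH - 1024;

        // $F000-$FFFF corresponds to long addresses with a[13:10] == 4'b1111.
        wire is_rom = (a[13:10] == 4'b1111);

        // ROM accesses go to the top 1024 entries; other addresses pass through.
        // Addresses beyond the populated range are not handled in this sketch.
        assign ram_a = is_rom ? (ROM_BASE + {4'b0000, a[9:0]}) : a;
    endmodule
    ```

    With a power-of-two DEPTH such as 8192, the ROM branch gives the same index as simply dropping the upper address bit, so the explicit form mainly matters for other depths.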

    Magnus
  • Cluso99 Posts: 15,458
    edited 2014-08-20 - 18:28:10
    Try removing the comments from the rom files.
    BTW there are a few, not just the first line.
    My Prop boards: P8XBlade2, RamBlade, CpuBlade, TriBlade
    Prop OS (also see Sphinx, PropDos, PropCmd, Spinix)
    Website: www.clusos.com
    Prop Tools (Index) , Emulators (Index) , ZiCog (Z80)
  • mkarlsson Posts: 13
    edited 2014-08-20 - 18:30:42
    Yep, that's already done. That generated a hard error.

    Magnus
  • Cluso99 Posts: 15,458
    edited 2014-08-20 - 19:32:46
    Magnus, did you see there are about 6+ comments throughout the ROM file? I updated my post but you may have missed it.
  • mkarlsson Posts: 13
    edited 2014-08-20 - 19:50:05
    Yes. As I said, those comments caused errors in the Spartan3 HDL parser so I had to take them out.

    In hindsight this was probably not worth it (i.e. modifying the code so that the old Spartan3 HDL parser accepts it), since overclocked posted that he didn't see much difference in speed and area when the old parser is used, and the code works on Spartan3E when the new Spartan6 parser is used.

    BTW, the code we talk about can be found here: https://github.com/magnuskarlsson/P8X32A_Emulation/tree/master/P8X32A_Pipistrello/src

    Magnus
  • jmg Posts: 14,013
    edited 2014-08-20 - 19:53:53
    mkarlsson wrote: »
    ....since overclocked posted that he didn't see much difference in speed and area when the old parser is used, and the code works on Spartan3E when the new Spartan6 parser is used.
    .

    Do we have some actual numbers on the LUT & MHz differences on old/new parsers ?
  • overclocked Posts: 80
    edited 2014-08-20 - 23:31:11
    jmg wrote: »
    Do we have some actual numbers on the LUT & MHz differences on old/new parsers ?

    Yes to answer your question:
    96% for new, 97% for old. Thus, it uses MORE resources on the old, but very close.

    Timing with "use_new_parser yes": 6.029 ns => 165 MHz
    Timing with the old parser: 6.243 ns => 160 MHz

    But with timing, this isn't really a good comparison. We ask for 160 MHz and get it in both cases. What would happen if we asked for something higher is of course more interesting.

    The same goes for device utilization. Today I ran both compiles with the balanced setting for quicker compile times. With different switches both of these can change.

    I second Magnus's thoughts about this. Because we actually CAN boot the codebase using the "-use_new_parser yes" switch on old and new devices, we met the goal.

    So now it is just a matter of figuring out (as a challenge) why it does not work with the old parser.
    If we were to move these changed files to Altera, maybe new problems would occur. I will try this when I get the chance.
    So it would actually be a good thing to figure out why it does not boot/work using the old parser. I'll keep looking for a while.

    Details of the 2 versions of the verilog parser below.


    Device Utilization Summary (Compiling for S3E-1600E and using switch: "use_new_parser yes")

    Logic Utilization                                     Used  Available  Utilization
    Number of Slice Flip Flops                           5,537     29,504          18%
    Number of 4 input LUTs                              25,295     29,504          85%
    Number of occupied Slices                           14,283     14,752          96%
      Number of Slices containing only related logic    14,283     14,283         100%
      Number of Slices containing unrelated logic            0     14,283           0%
    Total Number of 4 input LUTs                        25,548     29,504          86%
      Number used as logic                              17,103
      Number used as a route-thru                          253
      Number used for 32x1 RAMs                          8,192
    Number of bonded IOBs                                   46        250          18%
    Number of RAMB16s                                       32         36          88%
    Number of BUFGMUXs                                      13         24          54%
    Number of DCMs                                           1          8          12%
    Average Fanout of Non-Clock Nets                      4.80



    Device Utilization Summary (Using old parser)

    Logic Utilization                                     Used  Available  Utilization
    Number of Slice Flip Flops                           5,569     29,504          18%
    Number of 4 input LUTs                              25,182     29,504          85%
    Number of occupied Slices                           14,339     14,752          97%
      Number of Slices containing only related logic    14,339     14,339         100%
      Number of Slices containing unrelated logic            0     14,339           0%
    Total Number of 4 input LUTs                        25,533     29,504          86%
      Number used as logic                              16,990
      Number used as a route-thru                          351
      Number used for 32x1 RAMs                          8,192
    Number of bonded IOBs                                   46        250          18%
    Number of RAMB16s                                       32         36          88%
    Number of BUFGMUXs                                      13         24          54%
    Number of DCMs                                           1          8          12%
    Average Fanout of Non-Clock Nets                      4.90
  • overclockedoverclocked Posts: 80
    edited 2014-08-20 - 23:35:48
    mkarlsson wrote: »
    Two things:

    1) The syntax of $readmemh is

    $readmemh("File",ArrayName,StartAddr,EndAddr);

    If you omit EndAddr it will go to the end of the RAM. If you omit both StartAddr and EndAddr it will load the whole ram.

    So in your case my guess is that if your data file now has data for the whole RAM then the line should look like this:
    $readmemh ("ROM_$F000-$FFFF_BYTE_3.spin", ram3);
    etc.

    2) Don't forget that you need to map the top 4KB of the HUB RAM address space to the top 4KB of your RAM.
    In my code that's not needed since I use the whole address space but if you have less than 64KB then this mapping must happen.

    Magnus
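
    For reference, the three $readmemh forms described in the quote can be sketched as follows. This is only an illustration: the file name data.hex and the 4 KB array size are made up, and only the array name ram3 is borrowed from the quoted line. Tools may warn when the file holds fewer words than the requested range.

    ```verilog
    // Hypothetical 4 KB x 8-bit memory; size and file name are illustrative only.
    module readmemh_forms;
        reg [7:0] ram3 [0:4095];

        initial begin
            // Form 1: no addresses given, loading starts at index 0 here
            // and continues for as many words as the file provides.
            $readmemh("data.hex", ram3);

            // Form 2: start address only, loading begins at index 2048
            // and continues upward toward the end of the array.
            $readmemh("data.hex", ram3, 2048);

            // Form 3: start and end address, loads indices 2048..4095.
            $readmemh("data.hex", ram3, 2048, 4095);
        end
    endmodule
    ```

    In a real design only one of these calls would be used per memory; they are shown together purely to compare the argument lists.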

    Thanks for clearing up the parameters of $readmemh(). After changing the sources back to the original syntax, it now at least compiles using the old parser. I also changed back to using the full memory, to avoid confusing things too much. It still does not work with the old parser (no connection in PropTool), yet when compiling the same sources with the new parser it just works!
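
    For anyone who does go below 64 KB, the mapping Magnus mentions in point 2 might look roughly like the sketch below. It is not taken from his project or from the released sources: the module name, port names and the 36 KB layout (32 KB of hub RAM followed by the 4 KB boot area) are all assumptions made for illustration.

    ```verilog
    // Hypothetical address remap for a hub memory smaller than 64 KB.
    // Assumes 36 KB of physical RAM organised as 9216 longs:
    //   longs 0..8191    back hub addresses $0000-$7FFF
    //   longs 8192..9215 back hub addresses $F000-$FFFF (top 4 KB)
    module hub_addr_remap (
        input  wire [13:0] hub_a,   // long address within the 64 KB hub space
        output wire [13:0] phys_a,  // index into the 9216-long physical RAM
        output wire        mapped   // 0 for $8000-$EFFF, which has no backing RAM here
    );
        wire is_top4k = (hub_a[13:10] == 4'b1111);  // byte addresses $F000-$FFFF
        wire is_ram   = (hub_a[13] == 1'b0);        // byte addresses $0000-$7FFF

        assign mapped = is_top4k | is_ram;

        // Top 4 KB is relocated to sit directly above the 32 KB RAM area;
        // everything else passes through with the top address bit cleared.
        assign phys_a = is_top4k ? (14'd8192 + {4'b0000, hub_a[9:0]})
                                 : {1'b0, hub_a[12:0]};
    endmodule
    ```

    With a layout like this, the $readmemh start address for the boot image would also have to point at the relocated area (long index 8192 in this sketch) rather than at the original top of the 64 KB space.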

    Well thanks Magnus for the work on this. I'll give it a last look and then let it go!