Propeller II: Emulation of the P2 on FPGA boards (Prop123-A7/A9, DE0-NANO, DE2-115, etc) - Page 7 — Parallax Forums

Comments

  • cgracey Posts: 14,131
    edited 2012-12-04 12:04
    Thanks Chip!

    One cycle for stack reads is great, as are the CALLD and RETD instructions.

    If I read your previous message correctly, then
           REP #1,#64
           RDLONGC      INDA++,PTR++
    

    Would load 64 longs from the hub to the cog in eight hub cycles (plus time for loading INDA, PTR, REP, and time to sync to the first hub access)

    I will document the REPeat instructions next.

    For what you are doing there:

    REPS #64,#1 'repeat 1 instruction 64 times
    SETINDA startreg 'need one spacer instruction between REPS and what is going to get repeated
    RDLONG INDA++,PTRA++
  • David Betz Posts: 14,511
    edited 2012-12-04 12:07
    cgracey wrote: »
    Okay, but what about the delta modes where a '+' or '-' must be expressed?

    SETINDA -#5
    SETINDA +#5

    I don't really like that.

    How about this:

    SETINDA @5
    SETINDA @-5
    I like your second option better.
  • Bill Henning Posts: 6,445
    edited 2012-12-04 12:11
    Thanks Chip.

    If I read the specification PDF right, REPD would need two spacer instructions - right?
    cgracey wrote: »
    I will document the REPeat instructions next.

    For what you are doing there:

    REPS #64,#1 'repeat 1 instruction 64 times
    SETINDA startreg 'need one spacer instruction between REPS and what is going to get repeated
    RDLONG INDA++,PTRA++
  • cgracey Posts: 14,131
    edited 2012-12-04 12:24
    Bill Henning wrote: »
    Thanks Chip.

    If I read the specification PDF right, REPD would need two spacer instructions - right?

    It needs three. The difference between REPS and REPD is that REPS uses an immediate repeat count, while REPD can use a register. REPS can execute early because the immediate repeat count is in the instruction, whereas REPD must wait for the register to be read.
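
The spacer rule can be sketched as a small host-side check. This is a hypothetical Python helper, not part of PNUT.EXE. It assumes the first operand is the repeat count and the second is the number of instructions in the repeated block, as in the `REPS #64,#1` example, and it takes the spacer counts (one for REPS, three for REPD) from this thread only.

```python
# Spacer instructions required between a REP instruction and the block it
# repeats, as stated in this thread (not taken from the full P2 documentation).
SPACERS = {"REPS": 1, "REPD": 3}


def repeated_block(listing, rep_index):
    """Return the instructions that the REP at listing[rep_index] repeats.

    `listing` is a list of source lines such as "REPS #64,#1".
    Assumes the second operand is an immediate block length like "#1".
    """
    mnemonic, operands = listing[rep_index].split(None, 1)
    spacers = SPACERS[mnemonic.upper()]
    block_len = int(operands.split(",")[1].strip().lstrip("#"))
    start = rep_index + 1 + spacers
    if start + block_len > len(listing):
        raise ValueError("listing ends before the repeated block is complete")
    return listing[start:start + block_len]


listing = [
    "REPS #64,#1",
    "SETINDA startreg",        # the one spacer REPS needs
    "RDLONG INDA++,PTRA++",    # the instruction that gets repeated
]
print(repeated_block(listing, 0))
```

A check like this makes it harder to forget how many instructions execute before the loop starts, which is the debugging risk raised later in the thread.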
  • Bill Henning Posts: 6,445
    edited 2012-12-04 12:50
    Thanks!

    I don't mind pipeline delays, after all, it does not matter where initialization/setup code goes :)
  • cgracey Posts: 14,131
    edited 2012-12-04 14:58
    David Betz wrote: »
    I like your second option better.

    I just realized that the way it works for deltas is already:

    SETINDA ++3
    SETINDB --4
    SETINDS --6,++5

    Do you like that better than using @'s?
  • Cluso99 Posts: 18,066
    edited 2012-12-04 15:07
    cgracey wrote: »
    I just realized that the way it works for deltas is already:

    SETINDA ++3
    SETINDB --4
    SETINDS --6,++5

    Do you like that better than using @'s?

    Personally, I don't like the use of @ where it is not an indirection/address of.

    Therefore, I prefer:

    SETINDA #32
    SETINDA #cogaddrs

    and

    SETINDA ++3
    SETINDA --4
    SETINDS --6,++5
  • cgracey Posts: 14,131
    edited 2012-12-04 15:10
    Cluso99 wrote: »
    Personally, I don't like the use of @ where it is not an indirection/address of.

    Therefore, I prefer:

    SETINDA #32
    SETINDA #cogaddrs

    and

    SETINDA ++3
    SETINDA --4
    SETINDS --6,++5

    I agree. So, I'll make the assembler insist on # for immediate values and ++/-- for delta values.
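
To make the agreed rule concrete, here is a minimal sketch of how an operand could be classified: `#` marks an immediate value, `++`/`--` mark a delta, and anything else is rejected. The function name and return shape are made up for illustration; this is not how PNUT.EXE is implemented.

```python
def classify_setind_operand(text):
    """Classify one SETINDx operand as an immediate or a delta.

    Returns ("immediate", value_or_symbol) for "#..." operands and
    ("delta", signed_int) for "++n" / "--n" operands.
    """
    text = text.strip()
    if text.startswith("#"):
        body = text[1:]
        # Numeric immediates become ints; anything else is kept as a symbol name.
        return ("immediate", int(body) if body.isdigit() else body)
    if text.startswith("++"):
        return ("delta", int(text[2:]))
    if text.startswith("--"):
        return ("delta", -int(text[2:]))
    raise ValueError(f"operand needs '#', '++' or '--': {text!r}")


for operand in ["#32", "#cogaddrs", "++3", "--4"]:
    print(operand, "->", classify_setind_operand(operand))
```

For a two-operand form such as `SETINDS --6,++5`, each operand would be split on the comma and classified separately. The rejected `@5` form raises `ValueError`.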
  • Cluso99 Posts: 18,066
    edited 2012-12-04 15:17
    Chip:

    re REPS & REPD

    I understand the difference and the requirement that they use different instructions because of the differing pipeline delays.

    I could not find a REP instruction where the pipe just stalls.

    Might a better way to name the REPS & REPD instructions be

    REPD1 #times,#loops
    REPD3 times,#loops

    At least this way we are forced to remember how many instructions will be executed. Otherwise I can see lots of wasted time debugging because we forgot how many instructions are executed before the loop takes place.

    Perhaps the same could be applied for other Delayed instructions?
  • Rayman Posts: 13,767
    edited 2012-12-04 15:31
    Can you use something like this syntax?
    jmp #$-1 'Loop back endlessly

    cgracey wrote: »
    I agree. So, I'll make the assembler insist on # for immediate values and ++/-- for delta values.
  • cgracey Posts: 14,131
    edited 2012-12-04 16:22
    Rayman wrote: »
    Can you use something like this syntax?
    jmp #$-1 'Loop back endlessly

    You can do 'JMP #$' to loop endlessly, or 'JMP #$-1' to jump back one instruction.
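
The `$` arithmetic can be sketched as follows, assuming `$` stands for the address of the instruction that contains it, which is what makes `JMP #$` loop on itself. The helper name is hypothetical and it only handles the plain `$`, `$+n` and `$-n` forms.

```python
import re


def resolve_here_expr(expr, here):
    """Resolve "$", "$+n" or "$-n" to an absolute instruction address.

    `here` is the address of the instruction that contains the expression.
    """
    match = re.fullmatch(r"\$\s*(?:([+-])\s*(\d+))?", expr.strip())
    if match is None:
        raise ValueError(f"not a $-relative expression: {expr!r}")
    sign, digits = match.groups()
    if sign is None:
        return here                      # "$" alone: jump to self
    offset = int(digits)
    return here + offset if sign == "+" else here - offset


here = 0x10
print(hex(resolve_here_expr("$", here)))     # target of JMP #$
print(hex(resolve_here_expr("$-1", here)))   # target of JMP #$-1
```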
  • cgracey Posts: 14,131
    edited 2012-12-04 16:42
    I've placed updated documentation, along with a .zip that contains everything anyone needs for DE0-Nano and DE2-115 Prop2 emulation. There's a new PNUT.EXE which supports the new SETINDx/FIXINDx #,# syntax, too:

    http://forums.parallax.com/showthread.php?144199-Propeller-II-Emulation-of-the-P2-on-DE0-NANO-amp-DE2-115-FPGA-boards&p=1146196&viewfull=1#post1146196
  • David Betz Posts: 14,511
    edited 2012-12-04 16:59
    cgracey wrote: »
    I've placed updated documentation, along with a .zip that contains everything anyone needs for DE0-Nano and DE2-115 Prop2 emulation. There's a new PNUT.EXE which supports the new SETINDx/FIXINDx #,# syntax, too:

    http://forums.parallax.com/showthread.php?144199-Propeller-II-Emulation-of-the-P2-on-DE0-NANO-amp-DE2-115-FPGA-boards&p=1146196&viewfull=1#post1146196
    Thanks Chip!!
    I'm working on the two-stage loader. Hopefully, it will be done soon.
  • Sapieha Posts: 2,964
    edited 2012-12-04 17:12
    Hi Chip.

    Are there any changes in the Config file for the FPGA?



    cgracey wrote: »
    I've been working on the instruction set documentation and I've completed the parts that cover:

    1) Hub memory instructions
    2) Hub control instructions
    3) Cog RAM indirect instructions - New # syntax for SETINDx/FIXINDx
    4) Cog stack RAM instructions

    There is a new PNUT.EXE in this .zip which supports the new SETINDx/FIXINDx syntax. Also, all the files anyone needs to use the DE0-Nano or DE2-115 are in here:
  • Bill Henning Posts: 6,445
    edited 2012-12-04 17:18
    Hi Chip,

    I was looking at your DE0_Nano_Hookup.png, and it gave me some ideas...

    - It looks like it would be possible to map P64-P89 onto the header that PropPlug is plugged into (JP1).

    - JP3 could provide P32-P52

    - the LEDs could be mapped to P53-P60

    This would allow trying other I/O intensive tests before the SDRAM and VGA out work.

    If it would take too much of your time please forget the above suggestion, as we need the docs more :)

    Thanks for all your hard work!
    cgracey wrote: »
    I've placed updated documentation, along with a .zip that contains everything anyone needs for DE0-Nano and DE2-115 Prop2 emulation. There's a new PNUT.EXE which supports the new SETINDx/FIXINDx #,# syntax, too:

    http://forums.parallax.com/showthread.php?144199-Propeller-II-Emulation-of-the-P2-on-DE0-NANO-amp-DE2-115-FPGA-boards&p=1146196&viewfull=1#post1146196
  • cgracey Posts: 14,131
    edited 2012-12-04 17:26
    Sapieha wrote: »
    Hi Chip.

    Are there any changes in the Config file for the FPGA?

    Not yet. I'll need to add the SDRAM pins a little later, and then the I/O for our various boards.
  • cgracey Posts: 14,131
    edited 2012-12-04 17:30
    Bill Henning wrote: »
    Hi Chip,

    I was looking at your DE0_Nano_Hookup.png, and it gave me some ideas...

    - It looks like it would be possible to map P64-P89 onto the header that PropPlug is plugged into (JP1).

    - JP3 could provide P32-P52

    - the LEDs could be mapped to P53-P60

    This would allow trying other I/O intensive tests before the SDRAM and VGA out work.

    If it would take too much of your time please forget the above suggestion, as we need the docs more :)

    Thanks for all your hard work!

    Yes, that's possible. I was figuring I'd wait for our board and Sapieha's board to be done, and then map the pins to those, as they provide nice DACs and other hook-ups.

    I'll be working on docs for a little while longer, it seems.
  • Sapieha Posts: 2,964
    edited 2012-12-04 17:31
    Hi Chip.

    Thanks.

    No need for reprogramming, then.

    cgracey wrote: »
    Not yet. I'll need to add the SDRAM pins a little later, and then the I/O for our various boards.
  • David Betz Posts: 14,511
    edited 2012-12-04 20:53
    I've written the beginnings of a second-stage loader for the P2 and I'm having some trouble getting the program I'm loading to actually start. In case you want to see my code I've attached it to this message. The idea is that the PC sends a bunch of CRC-checked packets to the second-stage loader and it writes them to memory. If it gets a full load with no CRC errors it starts the program. I start writing code at $e80 and I load the COG image at that location after a successful download. I use the following code to start the downloaded image:
    CON
    
      BASE = $e80
    
    DAT
    
    ' code to load hub memory over the serial link
    
    start                   setcog  #0                      'relaunch cog0 with loaded program
                            coginit base_addr, zero
    
    base_addr       long    BASE
    zero            long    0
    

    My assumption is that the coginit instruction will load a COG image starting at $e80. What I'm loading is the 2K image produced by PNut.exe for my "Hello, Propeller II!" program that I know works when loaded directly by PNut.exe. Any idea what might be going wrong?
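
For readers following along, the CRC-checked packet idea can be sketched on the host side like this. The packet layout here (4-byte hub address, 2-byte length, payload, CRC-32 from `zlib`) is purely an assumption for illustration, as are the function names; the thread does not describe the actual packet format or CRC used by the loader.

```python
import struct
import zlib

HEADER = struct.Struct("<IH")   # hypothetical: hub address, payload length
CRC = struct.Struct("<I")       # hypothetical: CRC-32 of header + payload


def make_packets(image, base, chunk=256):
    """Split `image` into CRC-protected packets addressed from `base`."""
    packets = []
    for offset in range(0, len(image), chunk):
        payload = image[offset:offset + chunk]
        body = HEADER.pack(base + offset, len(payload)) + payload
        packets.append(body + CRC.pack(zlib.crc32(body)))
    return packets


def receive(packets, hub_size=32 * 1024):
    """Write packets into a simulated hub; return it only if every CRC matches."""
    hub = bytearray(hub_size)
    for packet in packets:
        body, (crc,) = packet[:-CRC.size], CRC.unpack(packet[-CRC.size:])
        if zlib.crc32(body) != crc:
            return None                      # a bad CRC aborts the load
        address, length = HEADER.unpack(body[:HEADER.size])
        hub[address:address + length] = body[HEADER.size:HEADER.size + length]
    return hub


image = bytes(range(256)) * 8                # stand-in for a 2K COG image
hub = receive(make_packets(image, base=0xE80))
print("load ok:", hub is not None and hub[0xE80:0xE80 + len(image)] == image)
```

Only after `receive` returns a hub image (no CRC errors) would the loader go on to start the program, matching the flow described above.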
  • David Betz Posts: 14,511
    edited 2012-12-04 21:08
    David Betz wrote: »
    I've written the beginnings of a second-stage loader for the P2 and I'm having some trouble getting the program I'm loading to actually start. In case you want to see my code I've attached it to this message. The idea is that the PC sends a bunch of CRC-checked packets to the second-stage loader and it writes them to memory. If it gets a full load with no CRC errors it starts the program. I start writing code at $e80 and I load the COG image at that location after a successful download. I use the following code to start the downloaded image:
    CON
    
      BASE = $e80
    
    DAT
    
    ' code to load hub memory over the serial link
    
    start                   setcog  #0                      'relaunch cog0 with loaded program
                            coginit base_addr, zero
    
    base_addr       long    BASE
    zero            long    0
    

    My assumption is that the coginit instruction will load a COG image starting at $e80. What I'm loading is the 2K image produced by PNut.exe for my "Hello, Propeller II!" program that I know works when loaded directly by PNut.exe. Any idea what might be going wrong?

    Ugh, never mind. There was a bug on the PC side of my program. The two-stage loader is now working for loading a program that is only 2K. Of course, the PNut.exe loader can already do that. Now I need to create a bigger program to make sure that loads correctly as well. If so, it should be possible to load all 32k of hub memory on the DE0-Nano. I don't have a DE2-115 so I can't try loading all 128k.
  • cgracey Posts: 14,131
    edited 2012-12-04 21:38
    David Betz wrote: »
    Ugh, never mind. There was a bug on the PC side of my program. The two-stage loader is now working for loading a program that is only 2K. Of course, the PNut.exe loader can already do that. Now I need to create a bigger program to make sure that loads correctly as well. If so, it should be possible to load all 32k of hub memory on the DE0-Nano. I don't have a DE2-115 so I can't try loading all 128k.

    You don't need to do that 'SETCOG #0', since that register was initialized to 0 when your loader launched.
  • David Betz Posts: 14,511
    edited 2012-12-04 21:42
    cgracey wrote: »
    You don't need to do that 'SETCOG #0', since that register was initialized to 0 when your loader launched.
    Thanks! That will save me an instruction.

    Can you comment on how programs should be laid out in hub memory? I am currently just writing your .obj files starting at $e80. Does that make sense, or should there be something else in low memory like CLKFREQ, etc.?
  • Bill Henning Posts: 6,445
    edited 2012-12-04 21:48
    Chip & David,

    I think $0E80-$0FFF should be reserved for mailboxes and various system pointers, and suggest that programs be loaded starting at $1000
  • Cluso99 Posts: 18,066
    edited 2012-12-05 02:12
    Bill Henning wrote: »
    Chip & David,

    I think $0E80-$0FFF should be reserved for mailboxes and various system pointers, and suggest that programs be loaded starting at $1000
    I think this is an excellent suggestion.
  • David Betz Posts: 14,511
    edited 2012-12-05 03:03
    Bill Henning wrote: »
    Chip & David,

    I think $0E80-$0FFF should be reserved for mailboxes and various system pointers, and suggest that programs be loaded starting at $1000

    Sounds reasonable. I'll modify my loader. I'll try to get an early version posted later today.
  • David Betz Posts: 14,511
    edited 2012-12-05 03:57
    Bill Henning wrote: »
    Chip & David,

    I think $0E80-$0FFF should be reserved for mailboxes and various system pointers, and suggest that programs be loaded starting at $1000

    I've been thinking about this and it might be good to support a scheme where the second-stage loader could load COG images and start the COGs before loading the main program. I think RossH does this in Catalina. The idea would be to have the COGs start up spinning on a mailbox in this $0e80-$0fff area of memory. Then when the main program starts it can pass any initialization parameters to the COGs and start them up. This has the advantage that you don't waste 2K of space for the COG image for every "driver". This would require a slightly more complex executable image format but it could be optional so it is still possible to have programs that work in the traditional way where everything is loaded at once. In fact, some programs may want to use a combination of these methods so that, for instance, a Forth COG image can be linked in with the main program but COG images for "drivers" could be loaded by this second-stage loader. If we do something like this, we could either use a fixed layout in low memory to describe the drivers that are pre-loaded (like Catalina's registry) or we could have a linker bind those addresses statically to avoid any runtime searches.

    Beyond that, I suggest that we modify Bill's proposal a bit and have the loader load memory starting at $0e80 but have the address of the COG image for the main program stored at $1000. That way the loader can initialize the data in the $0e80-$0fff area.
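
The proposed layout can be sketched as a host-side image builder: the load starts at $0e80, the area up to $0fff holds mailbox data, and the long at $1000 holds the address of the main program's COG image. Placing the COG image immediately after that pointer, little-endian longs, and the function name are all assumptions made only for this sketch.

```python
import struct

MAILBOX_BASE = 0x0E80   # start of the area proposed for mailboxes/pointers
POINTER_ADDR = 0x1000   # where the main COG image's address would be stored
LONG = 4                # bytes per long


def build_load_image(mailbox_init, cog_image):
    """Return (load_address, bytes) for a load that starts at MAILBOX_BASE.

    Placing the COG image right after the pointer is an assumption made
    for this sketch only.
    """
    reserved = POINTER_ADDR - MAILBOX_BASE
    if len(mailbox_init) > reserved:
        raise ValueError("mailbox data overflows the reserved area")
    cog_addr = POINTER_ADDR + LONG
    image = mailbox_init.ljust(reserved, b"\x00")
    image += struct.pack("<I", cog_addr) + cog_image
    return MAILBOX_BASE, image


reserved_bytes = POINTER_ADDR - MAILBOX_BASE
print(f"reserved area: {reserved_bytes} bytes = {reserved_bytes // LONG} longs")
```

The reserved area works out to $180 bytes, that is 384 bytes or 96 longs, which is the space available for mailboxes and system pointers under this proposal.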
  • David Betz Posts: 14,511
    edited 2012-12-05 04:05
    Has anyone started using their DE0-Nano or DE2-115 P2 boards yet? If so, what platforms do you use? I can make my loader available on Windows, Macintosh, or Linux but I'm wondering what platforms people are actually using. I guess this is a silly question to some extent because PNut.exe will only run under Windows and that will still be needed to assemble your PASM code but I suspect that situation will change rapidly once Chip's instruction set document is done and Roy adapts his compiler to generate P2 binaries. My assumption is that Windows will probably be the most common platform once the P2 gets into general use but I wasn't sure if that would also be true of the early adopter population. What platform do you want to use for your early P2 work?
  • Sapieha Posts: 2,964
    edited 2012-12-05 04:09
    Hi David.

    I run on Windows XP.

    However, I run PuTTY at 2,500,000 baud.

    David Betz wrote: »
    Has anyone started using their DE0-Nano or DE2-115 P2 boards yet? If so, what platforms do you use? I can make my loader available on Windows, Macintosh, or Linux but I'm wondering what platforms people are actually using. I guess this is a silly question to some extent because PNut.exe will only run under Windows and that will still be needed to assemble your PASM code but I suspect that situation will change rapidly once Chip's instruction set document is done and Roy adapts his compiler to generate P2 binaries. My assumption is that Windows will probably be the most common platform once the P2 gets into general use but I wasn't sure if that would also be true of the early adopter population. What platform do you want to use for your early P2 work?
  • David Betz Posts: 14,511
    edited 2012-12-05 04:12
    Sapieha wrote: »
    Hi David.

    I run on Windows XP.

    However, I run PuTTY at 2,500,000 baud.
    You're able to load the Propeller at that rate? I have never tried anything higher than 115200. I can easily provide support for selecting different baud rates. I can even use one baud rate for the first-phase loader that talks to Chip's ROM loader and a different one for the second-stage loader that loads your program if that would be helpful.
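
To put the two baud rates in perspective, here is a rough transfer-time estimate. It assumes 8N1 framing (10 bits on the wire per byte) and ignores packet headers, CRCs and acknowledgements, so real load times would be somewhat longer. The image sizes are the ones mentioned in this thread.

```python
def transfer_seconds(num_bytes, baud, bits_per_byte=10):
    """Estimate serial transfer time, assuming 8N1 framing (10 bits per byte)."""
    return num_bytes * bits_per_byte / baud


for name, size in [("2K COG image", 2 * 1024),
                   ("32K hub (DE0-Nano)", 32 * 1024),
                   ("128K hub (DE2-115)", 128 * 1024)]:
    slow = transfer_seconds(size, 115_200)
    fast = transfer_seconds(size, 2_500_000)
    print(f"{name}: {slow:.2f} s at 115200 baud, {fast:.3f} s at 2500000 baud")
```

Under these assumptions a full 32K hub image takes roughly 2.8 s at 115200 baud and roughly 0.13 s at 2,500,000 baud, which is why a faster rate for the second-stage loader could be worthwhile.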
  • Sapieha Posts: 2,964
    edited 2012-12-05 04:14
    Hi David.

    Sounds reasonable.


    David Betz wrote: »
    You're able to load the Propeller at that rate? I have never tried anything higher than 115200. I can easily provide support for selecting different baud rates. I can even use one baud rate for the first-phase loader that talks to Chip's ROM loader and a different one for the second-stage loader that loads your program if that would be helpful.