Prop2 FPGA files!!! - Updated 2 June 2018 - Final Version 32i - Page 2 — Parallax Forums


Comments

  • Chip,

    What loader protocol is required to load code into a P2? Is it the same as it was with P2-hot? I'd like to make a loader that will run on the Mac and Linux without having to resort to using wine to run PNut.exe.

    Thanks,
    David
  • Yes, please! A more portable loader. (I'm not slacking, I'm typing this from my lawn mower!)
  • Chores? Pfffft. That's nothing. I'm outta State on contract, until the 2nd. My FPGA boards miss me and require feeding. :p

    Have fun. I'll be watching with great interest.
  • It's been a while since I've looked at the new P2 instruction set. What does "@" mean? Is that relative addressing? Also, there are lots of CALLx (CALLA, CALLB, CALLD) instructions. Where do they put their return addresses? Are there still PTRA and PTRB registers? I assume CALL still works like before.

    What is the full register set and addresses?
    cgracey Posts: 14,131
    Rayman wrote: »
    I think I almost understand the blinky example, but some things look magic...

    What does "orgh 1" do? Why not just orgh ? Does this code start at $1000 (I think so)?

    The last two lines with org and res x are hurting my brain...
    Does the compiler load anything after "org" into cog before starting?
    Or does this only work for "res" reserved space that doesn't need initializing?

    That ORGH 1 is there because that's where the loader jumps into your code. It's that non-aligned hub exec below $1000 that people hate. I just haven't changed it yet. I kind of don't want to, because it allows the most efficient use of memory. You could always just put a JMP #$1000 after it and pretend it's not really happening.

    That ORG + RES business was just a quick way to get some symbolic cog registers declared. It doesn't generate any code. Each blinking cog will use its own instance of those registers.
    cgracey Posts: 14,131
    David Betz wrote: »
    Chip,

    What loader protocol is required to load code into a P2? Is it the same as it was with P2-hot? I'd like to make a loader that will run on the Mac and Linux without having to resort to using wine to run PNut.exe.

    Thanks,
    David

    The ROM_Booter.spin is what runs on boot. It doesn't handle anything, yet, but serial loading. Then, MainLoader.spin gets downloaded by PNut.exe and it receives all the memory data and JMPs to your app. The last three longs get customized by PNut.exe for the board's RAM size and speed.
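For anyone writing a portable loader, the "last three longs get customized" step could be sketched roughly as below. This is only an illustration in Python: the function name `patch_last_three_longs`, the placeholder values, and the little-endian byte order are assumptions, and the thread does not say what each of the three longs actually holds.

```python
import struct


def patch_last_three_longs(image: bytes, longs: tuple[int, int, int]) -> bytes:
    """Return a copy of image with its final three 32-bit longs replaced."""
    if len(longs) != 3:
        raise ValueError("exactly three longs are required")
    if len(image) < 12 or len(image) % 4 != 0:
        raise ValueError("image must be long-aligned and at least 12 bytes")
    # "<3I" packs three unsigned 32-bit values, little-endian (assumed order).
    return image[:-12] + struct.pack("<3I", *longs)


# Stand-in bytes for a second-stage loader binary (hypothetical data).
loader = bytes(range(16))
patched = patch_last_three_longs(loader, (0x11111111, 0x22222222, 0x33333333))
print(patched.hex())
```

The rest of the image is left untouched, so the same helper would work for any second-stage loader that keeps its board settings in the final three longs.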
    potatohead Posts: 10,253
    edited 2015-09-26 19:19
    Not everyone hates it... :) I think it's a perfectly great idea that makes good use of the addressing scheme.

    I really, really, really don't think we should change it. Having that region under $1000, non aligned as it is, makes for a perfect boot code area.

    If I were king, I would add a write protect bit and some instruction state type latch to that bit so the region can be loaded from ROM, then kept from being trashed easily to simulate having a real ROM there. This is one pretty great feature P2 "hot" had, and I think we should keep it in this P2.

    The beauty of doing it this way is there being no need to compromise on ROM / RAM like we had to in the "hot" chip. In that one, we all wanted to keep the ROM small, because it cost us RAM.

    In this one, if we were to add a write protect bit for that region, or maybe the whole 16K region ($4000), debug, dev, monitor, and whatever else could go there. The chip would ship with some stuff, and a binary image could ship with other stuff. We get the choice of default ROM, use it all as RAM, or a newer "rom" and I think people would take great advantage of this facility to offer up tools, utilities, etc...

    Say one is done with development and/or just wants to use the RAM as RAM. Simply include whatever you want in your binary image and write over the area, using it as a data, font, or whatever buffer. Ignore the non-aligned code feature and carry on.
  • cgracey wrote: »
    David Betz wrote: »
    Chip,

    What loader protocol is required to load code into a P2? Is it the same as it was with P2-hot? I'd like to make a loader that will run on the Mac and Linux without having to resort to using wine to run PNut.exe.

    Thanks,
    David

    The ROM_Booter.spin is what runs on boot. It doesn't handle anything, yet, but serial loading. Then, MainLoader.spin gets downloaded by PNut.exe and it receives all the memory data and JMPs to your app. The last three longs get customized by PNut.exe for the board's RAM size and speed.
    Okay, so ROM_Booter.spin is the only thing that is fixed. MainLoader.spin is the second-stage loader and could be different for different loader implementations, right?

    cgracey Posts: 14,131
    David Betz wrote: »
    It's been a while since I've looked at the new P2 instruction set. What does "@" mean? Is that relative addressing? Also, there are lots of CALLx (CALLA, CALLB, CALLD) instructions. Where do they put their return addresses? Are there still PTRA and PTRB registers? I assume CALL still works like before.

    What is the full register set and addresses?

    CALLA uses PTRA
    CALLB uses PTRB
    CALL uses the internal 8-level hardware stack
    CALLD is the link instruction

    # is absolute 20-bit address
    @ is relative 20-bit address, 9-bit for D,S branches
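A rough way to picture where each CALLx variant keeps its return address is the small Python model below. It is a sketch under assumptions, not the hardware definition: the class and method names are made up, the PTRA/PTRB stacks are assumed to grow upward by one long per call, and the 8-level stack is assumed to drop its oldest entry when a ninth address is pushed.

```python
from collections import deque


class CallModel:
    """Toy model of return-address storage for CALL, CALLA/CALLB and CALLD."""

    def __init__(self, ptra: int = 0x2000, ptrb: int = 0x3000):
        self.hw_stack = deque(maxlen=8)  # CALL: internal 8-level hardware stack
        self.hub = {}                    # sparse hub RAM: address -> long
        self.ptr = {"A": ptra, "B": ptrb}
        self.regs = {}                   # cog registers, keyed by name

    def call(self, return_addr: int) -> None:
        self.hw_stack.append(return_addr)

    def ret(self) -> int:
        return self.hw_stack.pop()

    def call_ptr(self, which: str, return_addr: int) -> None:
        # CALLA / CALLB: store at PTRx, then advance by one long (assumed).
        self.hub[self.ptr[which]] = return_addr
        self.ptr[which] += 4

    def ret_ptr(self, which: str) -> int:
        self.ptr[which] -= 4
        return self.hub[self.ptr[which]]

    def calld(self, d: str, return_addr: int) -> None:
        # CALLD: the "link" form, return address lands in a register.
        self.regs[d] = return_addr


m = CallModel()
m.call(0x0010)
m.call_ptr("A", 0x1234)
m.calld("link", 0x0044)
```

The point of the model is only the destination of the return address: an internal stack for CALL, hub memory through PTRA or PTRB for CALLA and CALLB, and a register for CALLD.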
  • cgracey wrote: »
    David Betz wrote: »
    It's been a while since I've looked at the new P2 instruction set. What does "@" mean? Is that relative addressing? Also, there are lots of CALLx (CALLA, CALLB, CALLD) instructions. Where do they put their return addresses? Are there still PTRA and PTRB registers? I assume CALL still works like before.

    What is the full register set and addresses?

    CALLA uses PTRA
    CALLB uses PTRB
    CALL uses the internal 8-level hardware stack
    CALLD is the link instruction

    # is absolute 20-bit address
    @ is relative 20-bit address, 9-bit for D,S branches

    Thanks! What are the semantics of PTRA and PTRB? Are they stack pointers that auto increment and decrement?

    cgracey Posts: 14,131
    David Betz wrote: »
    cgracey wrote: »
    David Betz wrote: »
    Chip,

    What loader protocol is required to load code into a P2? Is it the same as it was with P2-hot? I'd like to make a loader that will run on the Mac and Linux without having to resort to using wine to run PNut.exe.

    Thanks,
    David

    The ROM_Booter.spin is what runs on boot. It doesn't handle anything, yet, but serial loading. Then, MainLoader.spin gets downloaded by PNut.exe and it receives all the memory data and JMPs to your app. The last three longs get customized by PNut.exe for the board's RAM size and speed.
    Okay, so ROM_Booter.spin is the only thing that is fixed. MainLoader.spin is the second-stage loader and could be different for different loader implementations, right?

    Correct.
    cgracey Posts: 14,131
    David Betz wrote: »
    cgracey wrote: »
    David Betz wrote: »
    It's been a while since I've looked at the new P2 instruction set. What does "@" mean? Is that relative addressing? Also, there are lots of CALLx (CALLA, CALLB, CALLD) instructions. Where do they put their return addresses? Are there still PTRA and PTRB registers? I assume CALL still works like before.

    What is the full register set and addresses?

    CALLA uses PTRA
    CALLB uses PTRB
    CALL uses the internal 8-level hardware stack
    CALLD is the link instruction

    # is absolute 20-bit address
    @ is relative 20-bit address, 9-bit for D,S branches

    Thanks! What are the semantics of PTRA and PTRB? Are they stack pointers that auto increment and decrement?

    Yes. And any offset expressed is scaled by the word size.
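To make "scaled by the word size" concrete, here is a minimal Python sketch of the address arithmetic. The helper name `ptr_indexed_address` is hypothetical, only byte, word and long accesses are covered, and wrapping to 20 bits is an assumption based on the 20-bit addresses mentioned earlier in the thread.

```python
# Bytes per element for each access size.
SCALE = {"byte": 1, "word": 2, "long": 4}


def ptr_indexed_address(ptr: int, index: int, size: str) -> int:
    """Hub address reached by PTRx[INDEX] for the given access size."""
    return (ptr + index * SCALE[size]) & 0xFFFFF  # keep 20 bits (assumed)


print(hex(ptr_indexed_address(0x1000, 3, "long")))   # index 3 of longs
print(hex(ptr_indexed_address(0x1000, -1, "word")))  # one word below PTRx
```

So the same index of 3 moves 3 bytes for a byte access but 12 bytes for a long access.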
  • cgracey wrote: »
    David Betz wrote: »
    cgracey wrote: »
    David Betz wrote: »
    It's been a while since I've looked at the new P2 instruction set. What does "@" mean? Is that relative addressing? Also, there are lots of CALLx (CALLA, CALLB, CALLD) instructions. Where do they put their return addresses? Are there still PTRA and PTRB registers? I assume CALL still works like before.

    What is the full register set and addresses?

    CALLA uses PTRA
    CALLB uses PTRB
    CALL uses the internal 8-level hardware stack
    CALLD is the link instruction

    # is absolute 20-bit address
    @ is relative 20-bit address, 9-bit for D,S branches

    Thanks! What are the semantics of PTRA and PTRB? Are they stack pointers that auto increment and decrement?

    Yes. And any offset expressed is scaled by the word size.
    Ah, okay. So the new P2 instruction set still has all of the PTRx addressing modes of P2-hot? I didn't realize that.

  • Is this all still valid?
    PTRx expressions:
    
        INDEX = -32..+31 for simple offsets, 0..31 for ++'s, or 0..32 for --'s
        SCALE = 1 for byte, 2 for word, 4 for long, or 32 for wide
    
        S = 0 for PTRA, 1 for PTRB
        U = 0 to keep PTRx same, 1 to update PTRx
        P = 0 to use PTRx + INDEX*SCALE, 1 to use PTRx (post-modify)
        NNNNNN = INDEX
        nnnnnn = -INDEX
    
    
        SUPNNNNNN     PTR expression
        -----------------------------------------------------------------------------
        000000000     PTRA              'use PTRA
        100000000     PTRB              'use PTRB
        011000001     PTRA++            'use PTRA,                PTRA += SCALE
        111000001     PTRB++            'use PTRB,                PTRB += SCALE
        011111111     PTRA--            'use PTRA,                PTRA -= SCALE
        111111111     PTRB--            'use PTRB,                PTRB -= SCALE
        010000001     ++PTRA            'use PTRA + SCALE,        PTRA += SCALE
        110000001     ++PTRB            'use PTRB + SCALE,        PTRB += SCALE
        010111111     --PTRA            'use PTRA - SCALE,        PTRA -= SCALE
        110111111     --PTRB            'use PTRB - SCALE,        PTRB -= SCALE
    
        000NNNNNN     PTRA[INDEX]       'use PTRA + INDEX*SCALE
        100NNNNNN     PTRB[INDEX]       'use PTRB + INDEX*SCALE
        011NNNNNN     PTRA++[INDEX]     'use PTRA,                PTRA += INDEX*SCALE
        111NNNNNN     PTRB++[INDEX]     'use PTRB,                PTRB += INDEX*SCALE
        011nnnnnn     PTRA--[INDEX]     'use PTRA,                PTRA -= INDEX*SCALE
        111nnnnnn     PTRB--[INDEX]     'use PTRB,                PTRB -= INDEX*SCALE
        010NNNNNN     ++PTRA[INDEX]     'use PTRA + INDEX*SCALE,  PTRA += INDEX*SCALE
        110NNNNNN     ++PTRB[INDEX]     'use PTRB + INDEX*SCALE,  PTRB += INDEX*SCALE
        010nnnnnn     --PTRA[INDEX]     'use PTRA - INDEX*SCALE,  PTRA -= INDEX*SCALE
        110nnnnnn     --PTRB[INDEX]     'use PTRB - INDEX*SCALE,  PTRB -= INDEX*SCALE
    
    
    cgracey Posts: 14,131
    David Betz wrote: »
    Is this all still valid?

    That is all the same, but with one difference. Now, there are only five bits of offset, so you have a +15 to -16 range. Those SUP bits have moved down by one. Now, if the MSB is zero, you access immediate addresses 00 to FF.
  • potatohead wrote: »
    Not everyone hates it... :) I think it's a perfectly great idea that makes good use of the addressing scheme.

    I really, really, really don't think we should change it. Having that region under $1000, non aligned as it is, makes for a perfect boot code area.

    If I were king, I would add a write protect bit and some instruction state type latch to that bit so the region can be loaded from ROM, then kept from being trashed easily to simulate having a real ROM there. This is one pretty great feature P2 "hot" had, and I think we should keep it in this P2.

    The beauty of doing it this way is there being no need to compromise on ROM / RAM like we had to in the "hot" chip. In that one, we all wanted to keep the ROM small, because it cost us RAM.

    In this one, if we were to add a write protect bit for that region, or maybe the whole 16K region ($4000), debug, dev, monitor, and whatever else could go there. The chip would ship with some stuff, and a binary image could ship with other stuff. We get the choice of default ROM, use it all as RAM, or a newer "rom" and I think people would take great advantage of this facility to offer up tools, utilities, etc...

    Say one is done with the development and or just wants to use the RAM as RAM. Simply include whatever you want in your binary image, and write over the area, using it as a data / font whatever buffer. Ignore the non-aligned code feature and carry on.

    The kluge to make hub exec work offset by 1 at addresses below $1000 is just that... a kluge. It's obviously a hack, and looks really lame to have in hardware as the way things work. Also, it doesn't give any advantage at all, I don't understand why anyone thinks it does?!?!

    Having the 16K ROM loaded and then jumping to $0001 instead of $1000 is the only difference in practice. I don't see how it uses memory more efficiently at all. There is still 16K of ROM loaded into the first 16K of hub; the only difference is where the entry point is. However, now things work differently when you branch to addresses below $1000, depending on whether your branch target is aligned or not. This applies all the time, and it's odd and makes little sense.

    Things should work in straightforward, easy-to-describe ways, not with weird gotcha kluges.
  • Roy Eltham wrote: »
    potatohead wrote: »
    Not everyone hates it... :) I think it's a perfectly great idea that makes good use of the addressing scheme.

    I really, really, really don't think we should change it. Having that region under $1000, non aligned as it is, makes for a perfect boot code area.

    If I were king, I would add a write protect bit and some instruction state type latch to that bit so the region can be loaded from ROM, then kept from being trashed easily to simulate having a real ROM there. This is one pretty great feature P2 "hot" had, and I think we should keep it in this P2.

    The beauty of doing it this way is there being no need to compromise on ROM / RAM like we had to in the "hot" chip. In that one, we all wanted to keep the ROM small, because it cost us RAM.

    In this one, if we were to add a write protect bit for that region, or maybe the whole 16K region ($4000), debug, dev, monitor, and whatever else could go there. The chip would ship with some stuff, and a binary image could ship with other stuff. We get the choice of default ROM, use it all as RAM, or a newer "rom" and I think people would take great advantage of this facility to offer up tools, utilities, etc...

    Say one is done with the development and or just wants to use the RAM as RAM. Simply include whatever you want in your binary image, and write over the area, using it as a data / font whatever buffer. Ignore the non-aligned code feature and carry on.

    The kluge to make hub exec work offset by 1 at addresses below $1000 is just that... a kluge. It's obviously a hack, and looks really lame to have in hardware as the way things work. Also, it doesn't give any advantage at all, I don't understand why anyone thinks it does?!?!

    Having the 16k rom be loaded and then jumping to $0001 instead of $1000 is the difference in practice. I don't see how it more efficiently uses memory at all? There is still 16k of ROM loaded into the first 16k of hub. the only difference is where the entry point is at. However, now things work differently when you branch to addresses below $1000 depending on if your branch target is aligned or not. This is all the time, and it's odd and makes little sense.

    Things should work in straight forward easy to describe ways, not with weird gotcha kluges.
    Ummm... Is there any reason that the ROM has to be loaded starting at $0000? Why can't it be loaded starting at $1000 in the first place?
  • David,
    That works fine too. Not sure why Chip thinks it needs to be loaded at $0 and make this kluge to use that memory for hubexec.
    David Betz Posts: 14,511
    edited 2015-09-26 22:22
    cgracey wrote: »
    David Betz wrote: »
    Is this all still valid?

    That is all the same, but with one difference. Now, there are only five bits of offset, so you have a +15 to -16 range. Those SUP bits have moved down by one. Now, if the MSB is zero, you access immediate addresses 00 to FF.
    Thanks Chip. So the encoding now looks like this? And the I bit in the instruction determines whether the S field is interpreted as a register or as below.
    PTRx expressions:
    
        INDEX = -16..+15 for simple offsets, 0..15 for ++'s, or 0..16 for --'s
        SCALE = 1 for byte, 2 for word, 4 for long, or 32 for wide
    
        S = 0 for PTRA, 1 for PTRB
        U = 0 to keep PTRx same, 1 to update PTRx
        P = 0 to use PTRx + INDEX*SCALE, 1 to use PTRx (post-modify)
        NNNNN = INDEX
        nnnnn = -INDEX
    
    
        ISUPNNNNN     PTR expression
        -----------------------------------------------------------------------------
        0NNNNNNNN     #NNNNNNNN
    
        100000000     PTRA              'use PTRA
        110000000     PTRB              'use PTRB
        101100001     PTRA++            'use PTRA,                PTRA += SCALE
        111100001     PTRB++            'use PTRB,                PTRB += SCALE
        101111111     PTRA--            'use PTRA,                PTRA -= SCALE
        111111111     PTRB--            'use PTRB,                PTRB -= SCALE
        101000001     ++PTRA            'use PTRA + SCALE,        PTRA += SCALE
        111000001     ++PTRB            'use PTRB + SCALE,        PTRB += SCALE
        101011111     --PTRA            'use PTRA - SCALE,        PTRA -= SCALE
        111011111     --PTRB            'use PTRB - SCALE,        PTRB -= SCALE
    
        1000NNNNN     PTRA[INDEX]       'use PTRA + INDEX*SCALE
        1100NNNNN     PTRB[INDEX]       'use PTRB + INDEX*SCALE
        1011NNNNN     PTRA++[INDEX]     'use PTRA,                PTRA += INDEX*SCALE
        1111NNNNN     PTRB++[INDEX]     'use PTRB,                PTRB += INDEX*SCALE
        1011nnnnn     PTRA--[INDEX]     'use PTRA,                PTRA -= INDEX*SCALE
        1111nnnnn     PTRB--[INDEX]     'use PTRB,                PTRB -= INDEX*SCALE
        1010NNNNN     ++PTRA[INDEX]     'use PTRA + INDEX*SCALE,  PTRA += INDEX*SCALE
        1110NNNNN     ++PTRB[INDEX]     'use PTRB + INDEX*SCALE,  PTRB += INDEX*SCALE
        1010nnnnn     --PTRA[INDEX]     'use PTRA - INDEX*SCALE,  PTRA -= INDEX*SCALE
        1110nnnnn     --PTRB[INDEX]     'use PTRB - INDEX*SCALE,  PTRB -= INDEX*SCALE
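The table above can be cross-checked with a small encoder. This is a Python sketch that follows the proposed 9-bit layout (a leading 1, then S, U, P and a 5-bit signed index) exactly as tabulated in this post; the function name `encode_ptr` and its `mode` strings are made up for illustration, and the layout itself was still a question to Chip at this point.

```python
# (U, P) bit pairs for each addressing form in the table.
MODES = {
    "offset": (0, 0),  # PTRx[INDEX]: use PTRx + INDEX*SCALE, PTRx unchanged
    "pre": (1, 0),     # ++PTRx / --PTRx: modify first, then use
    "post": (1, 1),    # PTRx++ / PTRx--: use first, then modify
}


def encode_ptr(ptrb: bool, mode: str, delta: int = 0) -> int:
    """Pack a PTRx expression into the 9-bit field; delta is the signed index."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if not -16 <= delta <= 15:
        raise ValueError("index must fit in 5 signed bits (-16..+15)")
    u, p = MODES[mode]
    # Leading 1 selects a PTRx expression rather than an immediate 00..FF.
    return (1 << 8) | (int(ptrb) << 7) | (u << 6) | (p << 5) | (delta & 0x1F)


print(f"{encode_ptr(False, 'post', 1):09b}")   # PTRA++
print(f"{encode_ptr(True, 'pre', -1):09b}")    # --PTRB
```

Negative indexes are stored in two's complement, which is why the "--" rows in the table end in 11111 for a step of one.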
    
    cgracey Posts: 14,131
    Roy Eltham wrote: »
    potatohead wrote: »
    Not everyone hates it... :) I think it's a perfectly great idea that makes good use of the addressing scheme.

    I really, really, really don't think we should change it. Having that region under $1000, non aligned as it is, makes for a perfect boot code area.

    If I were king, I would add a write protect bit and some instruction state type latch to that bit so the region can be loaded from ROM, then kept from being trashed easily to simulate having a real ROM there. This is one pretty great feature P2 "hot" had, and I think we should keep it in this P2.

    The beauty of doing it this way is there being no need to compromise on ROM / RAM like we had to in the "hot" chip. In that one, we all wanted to keep the ROM small, because it cost us RAM.

    In this one, if we were to add a write protect bit for that region, or maybe the whole 16K region ($4000), debug, dev, monitor, and whatever else could go there. The chip would ship with some stuff, and a binary image could ship with other stuff. We get the choice of default ROM, use it all as RAM, or a newer "rom" and I think people would take great advantage of this facility to offer up tools, utilities, etc...

    Say one is done with the development and or just wants to use the RAM as RAM. Simply include whatever you want in your binary image, and write over the area, using it as a data / font whatever buffer. Ignore the non-aligned code feature and carry on.

    The kluge to make hub exec work offset by 1 at addresses below $1000 is just that... a kluge. It's obviously a hack, and looks really lame to have in hardware as the way things work. Also, it doesn't give any advantage at all, I don't understand why anyone thinks it does?!?!

    Having the 16k rom be loaded and then jumping to $0001 instead of $1000 is the difference in practice. I don't see how it more efficiently uses memory at all? There is still 16k of ROM loaded into the first 16k of hub. the only difference is where the entry point is at. However, now things work differently when you branch to addresses below $1000 depending on if your branch target is aligned or not. This is all the time, and it's odd and makes little sense.

    Things should work in straight forward easy to describe ways, not with weird gotcha kluges.

    Here's the thing, though. The assembler will never let you create cog code at non-long-aligned addresses. In other words, you will never be jumping to cog locations that are not long-aligned. If you jump to low addresses that are not long aligned, it can only be hub code.
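The rule being debated here can be written down in a few lines. The Python sketch below only restates what the posts in this thread say about this FPGA release (targets at or above $1000 are hub code, and below $1000 only long-aligned targets are cog/LUT code); the function name `exec_mode` is hypothetical, and the behavior may change in later releases.

```python
def exec_mode(target: int) -> str:
    """Classify a 20-bit branch target as cog/LUT exec or hub exec."""
    if not 0 <= target <= 0xFFFFF:
        raise ValueError("branch targets are 20-bit addresses")
    # At or above $1000, or not long-aligned: treated as hub code.
    if target >= 0x1000 or target % 4 != 0:
        return "hub"
    # Long-aligned and below $1000: cog/LUT code.
    return "cog/LUT"


for addr in (0x0000, 0x0001, 0x0020, 0x1000):
    print(f"${addr:05X} -> {exec_mode(addr)}")
```

Under this reading, ORGH 1 works because address $00001 is below $1000 but not long-aligned, so it can only be a hub-exec target.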
    Cluso99 Posts: 18,066
    edited 2015-09-26 22:58
    Can someone test this please (not sure about the rep instruction)?
    dat
            orgh    1
    
    ' launch 15 cogs (cog 0 falls through and runs 'blink', too)
    ' any cogs missing from the FPGA won't blink
    
            rep     #1,#15-1        'repeat 1 instruction 15 times
            coginit #16,#blink               '<--- think this might need to be coginit #3,#blink ???
    
    blink   cogid   x               'which cog am I?
            cogid   pin
            setb    dirb,pin        'make that pin an output
            add     x,#16           'add to my id
            shl     x,#18           'shift it up to make it big
    
            rep     #2,#0           'repeat 2 instructions for ever
            notb    outb,pin        'flip its output state
            waitx   x               'wait that many clocks
    
    me      jmp     @me             'never gets here!
    
            org
    x       res     1               'variable at cog address 32 (register 8, RAM)
    pin     res     1
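The delay arithmetic in the blink routine (cogid, add #16, shl #18) can be checked quickly in Python. This is a sketch: the helper name is made up, the 80 MHz clock is purely an assumed figure for turning clocks into seconds, and any fixed overhead of WAITX itself is ignored.

```python
CLOCK_HZ = 80_000_000  # assumed system clock, not stated in the thread


def blink_wait_clocks(cog_id: int) -> int:
    """Value left in x by: cogid x / add x,#16 / shl x,#18."""
    if not 0 <= cog_id <= 15:
        raise ValueError("cog id must be 0..15")
    return (cog_id + 16) << 18


for cog in (0, 15):
    clocks = blink_wait_clocks(cog)
    print(f"cog {cog:2d}: {clocks} clocks, {clocks / CLOCK_HZ:.4f} s per toggle")
```

Each cog therefore toggles its pin at a slightly different rate, with cog 15 waiting just under twice as long as cog 0.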
    
    cgracey Posts: 14,131
    edited 2015-09-26 22:57
    Cluso99 wrote: »
    Can someone test this please (not sure about the rep instruction)?
    dat
            orgh    1
    
    ' launch 15 cogs (cog 0 falls through and runs 'blink', too)
    ' any cogs missing from the FPGA won't blink
    
            rep     #1,#15-1        'repeat 1 instruction 15 times
            coginit #16,#blink
    
    blink   cogid   x               'which cog am I?
            cogid   pin
            setb    dirb,pin        'make that pin an output
            add     x,#16           'add to my id
            shl     x,#18           'shift it up to make it big
    
            rep     #2,#0           'repeat 2 instructions for ever
            notb    outb,pin        'flip its output state
            waitx   x               'wait that many clocks
    
    me      jmp     @me             'never gets here!
    
            org
    x       res     1               'variable at cog address 32 (register 8, RAM)
    pin     res     1
    

    REP only works in cog/LUT-exec mode.

    I made a change the other day that causes it to just fall through in hub-exec mode, rather than blow up.
    Cluso99 Posts: 18,066
    edited 2015-09-26 23:06
    Chip,
    It is really nice to finally see a new P2 release. Well done!


    I totally agree with Roy that orgh 1 seems a kluge. I have a few ideas to solve this.
    However, for now, let's proceed with some testing.

    Will you be releasing a DE0-Nano and/or BeMicroCV (5CEFA2F23C8N) version?
    Would you like to post the pinout mapping for the 123 A7? We could then re-do the mappings for these to help.
  • Cluso99 wrote: »
    blink   cogid   x               'which cog am I?
            cogid   pin
    

    Also, I thought there was something special about calling COGID twice in a row. Or was that a P2-hot thing? Or am I just recalling incorrectly?
    Cluso99 Posts: 18,066
    cgracey wrote: »
    Cluso99 wrote: »
    Can someone test this please (not sure about the rep instruction)?
    dat
            orgh    1
    
    ' launch 15 cogs (cog 0 falls through and runs 'blink', too)
    ' any cogs missing from the FPGA won't blink
    
            rep     #1,#15-1        'repeat 1 instruction 15 times
            coginit #16,#blink     '<---I think this should be coginit #3,#blink
    blink   cogid   x               'which cog am I?
            .....
    

    REP only works in cog/LUT-exec mode.

    I made a change the other day that causes it to just fall through in hub-exec mode, rather than blow up.
    Thanks Chip. Now I recall some things don't work in hubexec.
    BTW I edited the coginit after you read my code. I think it should be coginit #3,#blink ???
    Ariba Posts: 2,682
    Chip

    Do we also get an FPGA image for the DE0-Nano with the adapter board?
    It was a supported board for the P2-hot, and I and maybe others bought it specially for P2 development and tests.
    The pin-assignment files and other necessary files should already exist, and this way the adapter board would not be wasted.

    Not sure how many cogs it can fit, but I was quite happy with the one P2-hot cog. Okay, there were 4 hardware tasks then.
    Perhaps now 2 cogs of the new simplified P2 will fit, without the CORDIC.

    I hope some others also want to play with the DE0-Nano; otherwise it doesn't make much sense to go to the effort.
    I understand that you can't support too many different boards.

    -Andy

    Edit: I see Cluso99 also asked for it, so we are already two..
    Seairth Posts: 2,474
    edited 2015-09-27 01:51
    (moved to a separate thread.)
    Cluso99 Posts: 18,066
    edited 2015-09-27 00:27
    Seairth wrote: »
    Cluso99 wrote: »
    blink   cogid   x               'which cog am I?
            cogid   pin
    

    Also, I thought there was something special about calling COGID twice in a row. Or was that a P2-hot thing? Or am I just recalling incorrectly?
    Nothing special. Could replace the second cogid with mov pin,x

    re your following post about REP
    I agree something nicer would be better. It will be the job of the compiler (not the FPGA image) so perhaps we can wait for this. We get by with Chip's pnut.exe and save complexities for Roy's open compiler later.

    I would also like to see CALL -> CALLS since it uses the internal stack.
    CALLD is fine for me although CALLR (cog register) would work fine too.
  • Cluso99 wrote: »
    BTW I edited the coginit after you read my code. I think it should be coginit #3,#blink ???

    I think the first parameter is:

    #0 - #15 : specific cog
    #16 : next available cog (WC indicates whether a cog was started or not)

    Since the cog memory is no longer automatically cleared/reloaded, I think it would also be possible to have one cog force another cog to "jmp". For instance, an auxiliary cog could run a snippet (instigated by another cog), then call cogstop on itself.
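The cog-selection rule guessed at above can be modeled like this. It is a Python sketch of the description only: the function name, the `running` list, and the choice of the lowest-numbered free cog for #16 are assumptions, and the returned `started` flag stands in for whatever WC actually reports on the hardware.

```python
NEXT_AVAILABLE = 16  # first-operand value meaning "next available cog"


def coginit(running: list[bool], target: int):
    """Return (cog_id, started) and mark the chosen cog as running."""
    if target == NEXT_AVAILABLE:
        for cog_id, busy in enumerate(running):
            if not busy:
                running[cog_id] = True
                return cog_id, True
        return None, False  # every cog is already busy
    if 0 <= target < len(running):
        running[target] = True  # a specific cog is (re)started regardless
        return target, True
    raise ValueError("target must be 0..15 or 16")


cogs = [False] * 16
cogs[0] = True  # cog 0 is the one executing COGINIT
print(coginit(cogs, NEXT_AVAILABLE))
print(coginit(cogs, 5))
```

The model returns the chosen cog id for convenience; how the real instruction reports which cog it started is not covered by this description.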
    Cluso99 Posts: 18,066
    edited 2015-09-27 00:41
    Seairth wrote: »
    Cluso99 wrote: »
    BTW I edited the coginit after you read my code. I think it should be coginit #3,#blink ???

    I think the first parameter is:

    #0 - #15 : specific cog
    #16 : next available cog (WC indicates whether a cog was started or not)

    Of course. The #blink is the start address - need more coffee ;)
    I wonder how you find out which cog was started when using "next available cog"?

    Since the cog memory is no longer automatically cleared/reloaded, I think it would also be possible to have one cog force another cog to "jmp". For instance, an auxiliary cog could run a snippet (instigated by another cog), then call cogstop on itself.

    I believe that is the plan :)