Someone mentioned this long ago, but I don't recall there ever being an answer/response to it: what if the longs were 34-bit?
That's more of a viable option on an FPGA, as the bits are almost free there.
The P2 silicon is an OnSemi memory compiler block and I'm not sure they can generate 34b.
34b also messes with Byte and int16 overlays, so there are plenty of fish-hooks, for little benefit....
I would have no idea if On could do it.
The ALU/cordic path could remain 32bit, so that bit width for longs/words/bytes remains unchanged, with the two additional bits containing only, say, C & Z flags.
I can see the use for the WZ in the RDBYTE/WORD/LONG, but not really WC.
I can see the use for S/PTRx although I am unsure what use PTRx will be without auto-incrementing/decrementing (see lower).
But I cannot see the use for S being immediate #. Do we really require immediate for access to the first 512B of hub???
It would be nice to be able to use D as immediate # in RDxxxx, just like in WRxxxx. But it is not that important.
However, what would be really nice is these instructions...
CCCC xxxxx00 0Z@ DDDDDDDDD SSSSSSSSS RDBYTE D/@,S {WZ}
CCCC xxxxx01 0Z@ DDDDDDDDD SSSSSSSSS RDWORD D/@,S {WZ}
CCCC xxxxx10 0Z@ DDDDDDDDD SSSSSSSSS RDLONG D/@,S {WZ}
CCCC xxxxx00 1Z@ DDDDDDDDD SSSSSSSSS WRBYTE D/@,S {WZ}
CCCC xxxxx01 1Z@ DDDDDDDDD SSSSSSSSS WRWORD D/@,S {WZ}
CCCC xxxxx10 1Z@ DDDDDDDDD SSSSSSSSS WRLONG D/@,S {WZ}
where
S LONG %xxxxxxxx_xxxxhhhh_hhhhhhhh_hhhhhhhh ' x=0(future), h=20-bit byte hub address
(Note: xxLONG ignores bits[1:0], xxWORD ignores bit[0]. ie LONG & WORD boundaries enforced!)
D LONG %xxxxxxxx_xxxxxxxx_xxxxxccc_cccccccc ' x=0(future), c=11-bit long cog/lut contiguous address
(Note: x=0(future) are bits currently ignored by the P2 but could be used in a later P2 with expanded hub and/or cog/lut RAM)
This would mean that we could have the Destination COG/LUT address stored as an 11-bit result in a Cog Register and access it indirectly. There is an extra level of indirection when we use the @D and that may mean that we need an extra clock, but it's a small price to pay to be able to address COG/LUT contiguously. And it removes the requirement to have a RD/WRLUT instruction.
This keeps the usual RD/WR-BYTE/WORD/LONG simple and easy to explain/understand.
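To make the proposed S and D layouts concrete, here is a minimal sketch in Python (not PASM). It models only the masking rules described above: a 20-bit hub byte address with LONG and WORD boundaries enforced, and an 11-bit contiguous cog/LUT long address. The function names are hypothetical, and treating D[10:9] = 00 as "in COG RAM" is an assumption drawn from the SETQ note in this proposal, not from any released P2 design.

```python
HUB_ADDR_MASK = 0xFFFFF    # the 20 'h' bits of S: hub byte address
COGLUT_ADDR_MASK = 0x7FF   # the 11 'c' bits of D: cog/LUT long address


def hub_address(s, size):
    """Effective hub byte address for an access of `size` bytes (1, 2 or 4).

    The 'x' bits of S are ignored. A LONG access drops bits[1:0] and a
    WORD access drops bit[0], so the boundary is enforced by masking.
    """
    if size not in (1, 2, 4):
        raise ValueError("size must be 1 (BYTE), 2 (WORD) or 4 (LONG)")
    addr = s & HUB_ADDR_MASK
    return addr & ~(size - 1)


def coglut_address(d):
    """Contiguous cog/LUT long address held in the low 11 bits of D."""
    return d & COGLUT_ADDR_MASK


def in_cog_ram(d):
    """Assumed split: D[10:9] == 00 selects COG RAM, anything else the LUT."""
    return (coglut_address(d) >> 9) == 0


if __name__ == "__main__":
    print(hex(hub_address(0x12345, 4)))   # LONG access, low two bits dropped
    print(hex(hub_address(0x12345, 2)))   # WORD access, low bit dropped
    print(in_cog_ram(0x1FF), in_cog_ram(0x200))
```

Masking rather than faulting on a misaligned address is what "boundaries enforced" is taken to mean here; a real implementation could choose differently.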
SETQ & SETQ2
Since there would no longer be any requirement to have separate cog and lut r/w hub instructions, SETQ2 would no longer be required.
SETQ should only apply to RD/WRLONG (or a new RD/WRBLOCK instruction). All block moves will be LONGs and must be on a LONG boundary.
So SETQ could become...
CCCC 1101011 LDD DDDDDDDDD 000010110 SETQ D/#
or
CCCC 1101011 DDL DDDDDDDDD 000010110 SETQ D/#
where...
D LONG %xxxxxxxx_xxxxxxxx_xxxxxccc_cccccccc 'x=0(future), c=11-bit long count for block move hub to/from cog/lut
(Note: when D is specified as an address in the SETQ instruction, D may only reside in COG RAM, ie D[10:9] will be ignored and should be "00")
Doesn't this simplify things and make it easier to understand as well ???
Would it be possible to load the ROM into the end of the addressable 1 MB hub space? More than 1 MB is not possible without modifying Hub Exec's PC anyway. Just put 16K of RAM (or however long the ROM is) at the end there? Otherwise put it at the end of the 512K.
There Hub Exec is possible, it can still be overwritten, but the memory at $1000 is available for the user and gives a nice addressing scheme for COG/LUT/HUB/ROM in one continuous space.
I see no problem in having the first 4K of HUB used just for Data. We will need some space for Cog Pasm Images anyway, having 16 Cogs. Like on the P1 this space can be reused for Buffers or other Data if needed.
The decryption, the monitor and some serial could be quite helpful while developing, but are not needed in the end product.
Now you have to keep them while developing and your program has to start at say $1200 to keep the copied ROM but in production it is at $1000. Not good.
Production Code on a locked device fails and your Development Code shows completely different addresses? Not good.
I've almost got the Prop123 FPGA release done. I was hoping it would be today, but it looks like tomorrow. I set it at 50 MHz, since everybody can run that. It fit 12 cogs at 97% capacity. When we add smart pins, that will probably drop to 10. It compiles with 12 cogs at an Fmax of 80 MHz, which means we could certainly run it at 100 MHz. Need the PLLs for that, though.
After the Prop123 version, I'll get the DE2-115 version together.
this would imply that whole COG+LUT space can be used for registers
with the only caveat, that immediate addressing is only available in the 9-bit COG address range
right??
No, definitely not at this time.
What it means is that we could load from hub to cog and/or lut, or save cog and/or lut to hub.
Apart from the fact that normal programs only have 9 bits available to address the registers ($000..1FF), COG RAM is dual ported permitting both D and S values to be read simultaneously, and permitting both I to be read and R (result of a previous instruction) to be written simultaneously. LUT is only single port RAM.
I've almost got the Prop123 FPGA release done. I was hoping it would be today, but it looks like tomorrow. I set it at 50 MHz, since everybody can run that. It fit 12 cogs at 97% capacity. When we add smart pins, that will probably drop to 10. It compiles with 12 cogs at an Fmax of 80 MHz, which means we could certainly run it at 100 MHz. Need the PLLs for that, though.
After the Prop123 version, I'll get the DE2-115 version together.
*WHEW!* I was actually worried that you would release it today. All my "real" work would have come to a screeching halt! Fortunately, it's raining all weekend, so a perfect excuse to play with an FPGA (starting tomorrow, of course)!
Any chance of a DE2-115 image today or will that have to wait until next week?
I've been working on it all day. I needed to get PNut.exe to work with multiple boards, so I thought I would do the DE2-115 and the Prop123 at the same time, to start.
I noticed a few little problems as I made all the compiles and cross-checks. So, I am fixing those right now. Maybe later tonight I will have something. Sorry this is taking so long.
Ah, not happy ... CALL D, CALLD, CALLD D! ... Just looking at what Cluso has compiled.
One problem with the relabelling of LINK to CALLD is there is no paired RETD because it's not quite a full stacked mechanism.
But obviously there is also the comprehension issue of D being used both as part of an opcode name and also a register direct place holder. This is particularly bad when both are in the one instruction together!
I kind of liked "LINK". I was thinking CALLR for 'register' might make more sense than CALLD, when there's only one operand, anyway.
Where is I/O now on DE2_115 with Prop board relative to where it's going to be finally?
I guess right now, I'm mainly interested in digital I/O. Is this working the same way now that it will in final chip?
I think I've heard that smart-pins are the next thing Chip will work on... I don't seem to remember what "smart pins" are...
Do they include the analog I/O modes and resistor pull-up and down modes? Or, are they just special digital modes?
I think it'd be fun to connect an LCD to DE2-115 or Prop123 and would like to figure out how to do that mechanically and code wise...
The streamer can write data directly to the i/o pins, not just to the DACs, up to 32 bits per clock, from hub or LUT.
Comments
I would have no idea if On could do it.
The ALU/cordic path could remain 32bit, so that bit width for longs/words/bytes remains unchanged, with the two additional bits containing only, say, C & Z flags.
BTW I got it into my head that the RD/WRxxxx worked differently to how it really does
However, this got me thinking about what I had been thinking!
Here are the related instructions. I am unsure what they all are for, but here they are...
CCCC 1010110 CZI DDDDDDDDD SSSSSSSSS RDLUT   D,S/# {WC,WZ}
CCCC 1100001 0LI DDDDDDDDD SSSSSSSSS WRLUT   D/#,S/#
CCCC 1011000 CZI DDDDDDDDD SSSSSSSSS RDBYTE  D,S/#/PTRx {WC,WZ}
CCCC 1011001 CZI DDDDDDDDD SSSSSSSSS RDWORD  D,S/#/PTRx {WC,WZ}
CCCC 1011010 CZI DDDDDDDDD SSSSSSSSS RDLONG  D,S/#/PTRx {WC,WZ}
CCCC 1100010 0LI DDDDDDDDD SSSSSSSSS WRBYTE  D/#,S/#/PTRx
CCCC 1100010 1LI DDDDDDDDD SSSSSSSSS WRWORD  D/#,S/#/PTRx
CCCC 1100011 0LI DDDDDDDDD SSSSSSSSS WRLONG  D/#,S/#/PTRx
CCCC 1100011 1LI DDDDDDDDD SSSSSSSSS RDFAST  D/#,S/#
CCCC 1100100 0LI DDDDDDDDD SSSSSSSSS WRFAST  D/#,S/#
CCCC 1100100 1LI DDDDDDDDD SSSSSSSSS FBLOCK  D/#,S/#
CCCC 1101011 CZ0 DDDDDDDDD 000010000 RFBYTE  D {WC,WZ}
CCCC 1101011 CZ0 DDDDDDDDD 000010001 RFWORD  D {WC,WZ}
CCCC 1101011 CZ0 DDDDDDDDD 000010010 RFLONG  D {WC,WZ}
CCCC 1101011 00L DDDDDDDDD 000010011 WFBYTE  D/#
CCCC 1101011 00L DDDDDDDDD 000010100 WFWORD  D/#
CCCC 1101011 00L DDDDDDDDD 000010101 WFLONG  D/#
CCCC 1101011 00L DDDDDDDDD 000010110 SETQ    D/#
CCCC 1101011 00L DDDDDDDDD 000010111 SETQ2   D/#
CCCC 11110nn nnn nnnnnnnnn nnnnnnnnn AUGS    #23bits
CCCC 11111nn nnn nnnnnnnnn nnnnnnnnn AUGD    #23bits
RDBYTE/RDWORD/RDLONG & WRBYTE/WRWORD/WRLONG
Let's look at some of those that we all know pretty well, and are similar to P1...
CCCC 1011000 CZI DDDDDDDDD SSSSSSSSS RDBYTE  D,S/#/PTRx {WC,WZ}
CCCC 1011001 CZI DDDDDDDDD SSSSSSSSS RDWORD  D,S/#/PTRx {WC,WZ}
CCCC 1011010 CZI DDDDDDDDD SSSSSSSSS RDLONG  D,S/#/PTRx {WC,WZ}
CCCC 1100010 0LI DDDDDDDDD SSSSSSSSS WRBYTE  D/#,S/#/PTRx
CCCC 1100010 1LI DDDDDDDDD SSSSSSSSS WRWORD  D/#,S/#/PTRx
CCCC 1100011 0LI DDDDDDDDD SSSSSSSSS WRLONG  D/#,S/#/PTRx
I can see the use for the WZ in the RDBYTE/WORD/LONG, but not really WC.
I can see the use for S/PTRx although I am unsure what use PTRx will be without auto-incrementing/decrementing (see lower).
But I cannot see the use for S being immediate #. Do we really require immediate for access to the first 512B of hub???
It would be nice to be able to use D as immediate # in RDxxxx, just like in WRxxxx. But it is not that important.
However, what would be really nice is these instructions...
CCCC xxxxx00 0Z@ DDDDDDDDD SSSSSSSSS RDBYTE D/@,S {WZ}
CCCC xxxxx01 0Z@ DDDDDDDDD SSSSSSSSS RDWORD D/@,S {WZ}
CCCC xxxxx10 0Z@ DDDDDDDDD SSSSSSSSS RDLONG D/@,S {WZ}
CCCC xxxxx00 1Z@ DDDDDDDDD SSSSSSSSS WRBYTE D/@,S {WZ}
CCCC xxxxx01 1Z@ DDDDDDDDD SSSSSSSSS WRWORD D/@,S {WZ}
CCCC xxxxx10 1Z@ DDDDDDDDD SSSSSSSSS WRLONG D/@,S {WZ}
where
S LONG %xxxxxxxx_xxxxhhhh_hhhhhhhh_hhhhhhhh ' x=0(future), h=20-bit byte hub address
(Note: xxLONG ignores bits[1:0], xxWORD ignores bit[0]. ie LONG & WORD boundaries enforced!)
D LONG %xxxxxxxx_xxxxxxxx_xxxxxccc_cccccccc ' x=0(future), c=11-bit long cog/lut contiguous address
(Note: x=0(future) are bits currently ignored by the P2 but could be used in a later P2 with expanded hub and/or cog/lut RAM)
This would mean that we could have the Destination COG/LUT address stored as an 11-bit result in a Cog Register and access it indirectly. There is an extra level of indirection when we use the @D and that may mean that we need an extra clock, but it's a small price to pay to be able to address COG/LUT contiguously. And it removes the requirement to have a RD/WRLUT instruction.
This keeps the usual RD/WR-BYTE/WORD/LONG simple and easy to explain/understand.
SETQ & SETQ2
Since there would no longer be any requirement to have separate cog and lut r/w hub instructions, SETQ2 would no longer be required.
SETQ should only apply to RD/WRLONG (or a new RD/WRBLOCK instruction). All block moves will be LONGs and must be on a LONG boundary.
So SETQ could become...
CCCC 1101011 LDD DDDDDDDDD 000010110 SETQ D/#
or
CCCC 1101011 DDL DDDDDDDDD 000010110 SETQ D/#
where...
D LONG %xxxxxxxx_xxxxxxxx_xxxxxccc_cccccccc 'x=0(future), c=11-bit long count for block move hub to/from cog/lut
(Note: when D is specified as an address in the SETQ instruction, D may only reside in COG RAM, ie D[10:9] will be ignored and should be "00")
Doesn't this simplify things and make it easier to understand as well ???
There Hub Exec is possible, it can still be overwritten, but the memory at $1000 is available for the user and gives a nice addressing scheme for COG/LUT/HUB/ROM in one continuous space.
I see no problem in having the first 4K of HUB used just for Data. We will need some space for Cog Pasm Images anyway, having 16 Cogs. Like on the P1 this space can be reused for Buffers or other Data if needed.
The decryption, the monitor and some serial could be quite helpful while developing, but are not needed in the end product.
Now you have to keep them while developing and your program has to start at say $1200 to keep the copied ROM but in production it is at $1000. Not good.
Production Code on a locked device fails and your Development Code shows completely different addresses? Not good.
Leaving the ROM copy always in? Not good either.
Thoughts?
Mike
Production code is written at $1000 always.
While developing, leave the ROM tools there. When done, clear them.
Optional write protect bit to prevent overwrites when developing and troubleshooting.
The chip allows non aligned hub code anyway.
After the Prop123 version, I'll get the DE2-115 version together.
Even 10 cogs would be fine, too, if it gives a bit more room for optimization/fitting
P2 INSTRUCTIONS (ALL) 24SEP2015
====================================================================================================
CCCC XXXXXXX CZI DDDDDDDDD SSSSSSSSS
----------------------------------------------------------------------------------------------------
                 <----------------------------- D,S/# {WC,WZ} --------------------------------->
         nnn     0000nnn 0001nnn 0010nnn 0011nnn 0100nnn 0101nnn 0110nnn 0111nnn
CCCC 0iii000 CZI ROR     ADD     CMP     MIN     ISOB    ANDN    MOV     ALTDS
CCCC 0iii001 CZI ROL     ADDX    CMPX    MAX     NOTB    AND     NOT     DECOD
CCCC 0iii010 CZI SHR     ADDS    CMPS    MINS    CLRB    OR      ABS     TOPONE
CCCC 0iii011 CZI SHL     ADDSX   CMPSX   MAXS    SETB    XOR     NEG     BOTONE
CCCC 0iii100 CZI RCR     SUB     CMPR    SUMC    SETBC   MUXC    NEGC    INCMOD
CCCC 0iii101 CZI RCL     SUBX    CMPM    SUMNC   SETBNC  MUXNC   NEGNC   DECMOD
CCCC 0iii110 CZI SAR     SUBS    SUBR    SUMZ    SETBZ   MUXZ    NEGZ    MUL
CCCC 0iii111 CZI SAL     SUBSX   CMPSUB  SUMNZ   SETBNZ  MUXNZ   NEGNZ   MULS
----------------------------------------------------------------------------------------------------
                 <----------------------------- D,S/# -------------------------------> <- D,S/@ ->
                 100000nnn 100001nnn 100010nnn 100011nnn 100100nnn 100101nnn 100110nnn 100111nnn
CCCC 100iii0 00I SETNIB0   GETNIB0   ROLNIB0   SETBYT0   ROLBYT0   ROLWRD0   SEUSSF    DJZ
CCCC 100iii0 01I SETNIB1   GETNIB1   ROLNIB1   SETBYT1   ROLBYT1   ROLWRD1   SEUSSR    DJNZ
CCCC 100iii0 10I SETNIB2   GETNIB2   ROLNIB2   SETBYT2   ROLBYT2   SETBYTS   REV       DJS
CCCC 100iii0 11I SETNIB3   GETNIB3   ROLNIB3   SETBYT3   ROLBYT3   MOVBYTS   SETI      DJNS
CCCC 100iii1 00I SETNIB4   GETNIB4   ROLNIB4   GETBYT0   SETWRD0   SPLITB    SETD      TJZ
CCCC 100iii1 01I SETNIB5   GETNIB5   ROLNIB5   GETBYT1   SETWRD1   MERGEB    GETD      TJNZ
CCCC 100iii1 10I SETNIB6   GETNIB6   ROLNIB6   GETBYT2   GETWRD0   SPLITW    SETS      TJS
CCCC 100iii1 11I SETNIB7   GETNIB7   ROLNIB7   GETBYT3   GETWRD1   MERGEW    GETS      TJNS
----------------------------------------------------------------------------------------------------
                 1010nnn                      1011nnn.n
CCCC 101i000 CZI TESTN   D,S/# {WC,WZ}    CZI RDBYTE  D,S/#/PTRx {WC,WZ}
CCCC 101i001 CZI TEST    D,S/# {WC,WZ}    CZI RDWORD  D,S/#/PTRx {WC,WZ}
CCCC 101i010 CZI ANYB    D,S/# {WC,WZ}    CZI RDLONG  D,S/#/PTRx {WC,WZ}
CCCC 101i011 CZI TESTB   D,S/# {WC,WZ}    CZI <empty>
CCCC 101i100 CZI WAITCNT D,S/# {WC,WZ}    CZI <empty>
CCCC 101i101 CZI CALLD   D,S/@ {WC,WZ}    CZI <empty>
CCCC 101i110 CZI RDLUT   D,S/# {WC,WZ}    0LI SETPAE  D/#,S/#
                 "       "                1LI SETPAN  D/#,S/#
CCCC 101i111 CZI MSGIN   D,S/# {WC,WZ}    0LI SETPBE  D/#,S/#
                 "       "                1LI SETPBN  D/#,S/#
----------------------------------------------------------------------------------------------------
                 11000nn.n                 11001nn.n                 11010nn.n
CCCC 110ii00 0LI JP      D/#,S/@           0LI WRFAST  D/#,S/#       0LI QMUL    D/#,S/#
CCCC 110ii00 1LI JNP     D/#,S/@           1LI FBLOCK  D/#,S/#       1LI QDIV    D/#,S/#
CCCC 110ii01 0LI WRLUT   D/#,S/#           0LI XINIT   D/#,S/#       0LI QSQR    D/#,S/#
CCCC 110ii01 1LI MSGOUT  D/#,S/#           1LI XZERO   D/#,S/#       1LI QSIN    D/#,S/#
CCCC 110ii10 0LI WRBYTE  D/#,S/#/PTRx      0LI XCONT   D/#,S/#       0LI QROT    D/#,S/#
CCCC 110ii10 1LI WRWORD  D/#,S/#/PTRx      1LI REP     D/#,S/#       1LI QATN    D/#,S/#
CCCC 110ii11 0LI WRLONG  D/#,S/#/PTRx      CLI COGINIT D/#,S/# {WC}  <**Note 1**>
CCCC 110ii11 1LI RDFAST  D/#,S/#           "   "                     "
----------------------------------------------------------------------------------------------------
CCCC 1101100 Rnn nnnnnnnnn nnnnnnnnn JMP     #abs/@rel
CCCC 1101101 Rnn nnnnnnnnn nnnnnnnnn CALL    #abs/@rel
CCCC 1101110 Rnn nnnnnnnnn nnnnnnnnn CALLA   #abs/@rel
CCCC 1101111 Rnn nnnnnnnnn nnnnnnnnn CALLB   #abs/@rel
CCCC 11100ww Rnn nnnnnnnnn nnnnnnnnn CALLD   reg,#abs/@rel
CCCC 11101ww Rnn nnnnnnnnn nnnnnnnnn LOC     reg,#abs/@rel
CCCC 11110nn nnn nnnnnnnnn nnnnnnnnn AUGS    #23bits
CCCC 11111nn nnn nnnnnnnnn nnnnnnnnn AUGD    #23bits
====================================================================================================
<**Note 1**> **SPECIAL OPCODE 1101011 (uses S field)**
CCCC 1101011 CZL DDDDDDDDD sssssssss
----------------------------------------------------------------------------------------------------
    sssssssss                            sssssssss
CZL 000000000 CLKSET  D/# {WC,WZ}    CZ0 000010000 RFBYTE  D {WC,WZ}
CZL 000000001 COGID   D/# {WC,WZ}    CZ0 000010001 RFWORD  D {WC,WZ}
CZL 000000010 <empty>                CZ0 000010010 RFLONG  D {WC,WZ}
00L 000000011 COGSTOP D/#            00L 000010011 WFBYTE  D/#
CZ0 000000100 LOCKNEW D {WC,WZ}      00L 000010100 WFWORD  D/#
00L 000000101 LOCKRET D/#            00L 000010101 WFLONG  D/#
C0L 000000110 LOCKCLR D/# {WC}       00L 000010110 SETQ    D/#
C0L 000000111 LOCKSET D/# {WC}       00L 000010111 SETQ2   D/#
CZL 000001000 <empty>                CZ0 000011000 GETQX   D {WC,WZ}
CZL 000001001 <empty>                CZ0 000011001 GETQY   D {WC,WZ}
CZL 000001010 <empty>                000 000011010 GETCNT  D
CZL 000001011 <empty>                CZ0 000011011 GETRND  {D} {WC,WZ}
CZL 000001100 <empty>                00L 000011100 SETXDAC D/#
CZL 000001101 <empty>                00L 000011101 SETXFRQ D/#
00L 000001110 QLOG    D/#            000 000011110 GETXCOS D
00L 000001111 QEXP    D/#            000 000011111 GETXSIN D

    sssssssss                            sssssssss
00L 000100000 SETPER  D/#            00L 000101000 PUSH    D/#
00L 000100001 SETEDG  D/#            CZ0 000101001 CALL    D {WC,WZ}
00L 000100010 SETRDL  D/#            CZ0 000101010 CALLA   D {WC,WZ}
00L 000100011 SETWRL  D/#            CZ0 000101011 CALLB   D {WC,WZ}
C00 000100100 <**Note 2**>           CZ0 000101100 POP     D {WC,WZ}
00L 000100101 SETINT1 D/#            CZ0 000101101 RET     {WC,WZ}
00L 000100110 SETINT2 D/#            CZ0 000101110 RETA    {WC,WZ}
00L 000100111 SETINT3 D/#            CZ0 000101111 RETB    {WC,WZ}

    sssssssss
00L 000110000 WAITX   D/#
CZL 000110001 SETCZ   D/# {WC,WZ}
000 000110010 <**Note 3**>
00L 000110011 SETBRK  D/#
xxx 0001101xx <empty>
xxx 000111xxx <empty>
xxx 001xxxxxx <empty>
xxx 01xxxxxxx <empty>
xxx 1xxxxxxxx <empty>
====================================================================================================
<**Note 2**> **SPECIAL OPCODE 1101011 with S= 000100100 (uses D field)**
CCCC 1101011 C00 ddddddddd 000100100
----------------------------------------------------------------------------------------------------
    ddddddddd sssssssss                  ddddddddd sssssssss
C00 000000000 000100100 GETINT {WC}  C00 000001000 000100100 WAITINT {WC}
C00 000000001 000100100 GETPER {WC}  C00 000001001 000100100 WAITPER {WC}
C00 000000010 000100100 GETEDG {WC}  C00 000001010 000100100 WAITEDG {WC}
C00 000000011 000100100 GETPAT {WC}  C00 000001011 000100100 WAITPAT {WC}
C00 000000100 000100100 GETRDL {WC}  C00 000001100 000100100 WAITRDL {WC}
C00 000000101 000100100 GETWRL {WC}  C00 000001101 000100100 WAITWRL {WC}
C00 000000110 000100100 GETXRO {WC}  C00 000001110 000100100 WAITXRO {WC}
C00 000000111 000100100 GETFBW {WC}  C00 000001111 000100100 WAITFBW {WC}
====================================================================================================
<**Note 3**> **SPECIAL OPCODE 1101011 with S= 000110010 (uses D field)**
CCCC 1101011 000 ddddddddd 000110010
----------------------------------------------------------------------------------------------------
czi ddddddddd sssssssss
000 000000000 000110010 ALLOWI
000 000000001 000110010 STALLI
xxx 00000001x 000110010 <empty>
====================================================================================================
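Every row of the table follows the same CCCC XXXXXXX CZI DDDDDDDDD SSSSSSSSS pattern, which adds up to 4 + 7 + 3 + 9 + 9 = 32 bits. As a rough illustration, here is a small Python sketch that packs and unpacks those fields. It assumes the fields sit in the instruction long in the order printed, most significant first; that bit placement and the names `Fields`, `encode` and `decode` are assumptions made for this example, not something the table states.

```python
from typing import NamedTuple


class Fields(NamedTuple):
    cond: int    # CCCC      (4 bits)
    opcode: int  # XXXXXXX   (7 bits)
    czi: int     # CZI       (3 bits, shown as CZL/0LI/etc. for some opcodes)
    d: int       # DDDDDDDDD (9 bits)
    s: int       # SSSSSSSSS (9 bits)


def encode(cond, opcode, czi, d, s):
    """Pack the five fields into one 32-bit word (assumed MSB-first order)."""
    return ((cond & 0xF) << 28) | ((opcode & 0x7F) << 21) | \
           ((czi & 0x7) << 18) | ((d & 0x1FF) << 9) | (s & 0x1FF)


def decode(word):
    """Split a 32-bit word back into its five fields."""
    word &= 0xFFFFFFFF
    return Fields(
        cond=(word >> 28) & 0xF,
        opcode=(word >> 21) & 0x7F,
        czi=(word >> 18) & 0x7,
        d=(word >> 9) & 0x1FF,
        s=word & 0x1FF,
    )


if __name__ == "__main__":
    # SETQ D from Note 1: opcode 1101011, S field 000010110, register D = $123.
    # The condition value 1111 is just a placeholder here.
    word = encode(0b1111, 0b1101011, 0b000, 0x123, 0b000010110)
    print(f"{word:032b}")
    print(decode(word))
```

A decoder for the real chip would next branch on `opcode`, and for 1101011 on `s` (Note 1) and then `d` (Notes 2 and 3), mirroring the nesting of the tables.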
Refresh....refresh.....refresh.....
*WHEW!* I was actually worried that you would release it today. All my "real" work would have come to a screeching halt! Fortunately, it's raining all weekend, so a perfect excuse to play with an FPGA (starting tomorrow, of course)!
I was going to get the P123 A7 version, but then the issues came up and I figured I should wait for the A9 one.
P2 INSTRUCTIONS (JUMPS etc) 24SEP2015
====================================================================================================
                                     R=0                 R=1
CCCC 100111R 00I DDDDDDDDD SSSSSSSSS DJZ   D,S/@         TJZ   D,S/@
CCCC 100111R 01I DDDDDDDDD SSSSSSSSS DJNZ  D,S/@         TJNZ  D,S/@
CCCC 100111R 10I DDDDDDDDD SSSSSSSSS DJS   D,S/@         TJS   D,S/@
CCCC 100111R 11I DDDDDDDDD SSSSSSSSS DJNS  D,S/@         TJNS  D,S/@
                                     Q=0                 Q=1
CCCC 1100000 QLI DDDDDDDDD SSSSSSSSS JP    D/#,S/@       JNP   D/#,S/@       'j pinD [not]positive?
====================================================================================================
CCCC 1101100 Rnn nnnnnnnnn nnnnnnnnn JMP   #abs/@rel           'jump 20-bit absolute/relative address
CCCC 1101011 CZ0 000000000 000101101 RET   {WC,WZ}             'jump via internal stack
CCCC 1101011 CZ0 000000000 000101110 RETA  {WC,WZ}             'jump via Register A
CCCC 1101011 CZ0 000000000 000101111 RETB  {WC,WZ}             'jump via Register B
----------------------------------------------------------------------------------------------------
CCCC 1101011 CZ0 DDDDDDDDD 000101001 CALL  D {WC,WZ}           'save return address on internal stack
CCCC 1101011 CZ0 DDDDDDDDD 000101010 CALLA D {WC,WZ}           'save return address in Register A
CCCC 1101011 CZ0 DDDDDDDDD 000101011 CALLB D {WC,WZ}           'save return address in Register B
CCCC 1010101 CZI DDDDDDDDD SSSSSSSSS CALLD D,S/@ {WC,WZ}       'save return address in Register ???
CCCC 1101101 Rnn nnnnnnnnn nnnnnnnnn CALL  #abs/@rel           'save return address on internal stack
CCCC 1101110 Rnn nnnnnnnnn nnnnnnnnn CALLA #abs/@rel           'save return address in Register A
CCCC 1101111 Rnn nnnnnnnnn nnnnnnnnn CALLB #abs/@rel           'save return address in Register B
CCCC 11100ww Rnn nnnnnnnnn nnnnnnnnn CALLD reg,#abs/@rel       'save return address in Register ???
====================================================================================================
CCCC 1101011 00L DDDDDDDDD 000101000 PUSH  D/#                 'push D/# on internal stack
CCCC 1101011 CZ0 DDDDDDDDD 000101100 POP   D {WC,WZ}           'pop D from internal stack
CCCC 11101ww Rnn nnnnnnnnn nnnnnnnnn LOC   reg,#abs/@rel       'loads Register A..D with 20-bit address ???
CCCC 11110nn nnn nnnnnnnnn nnnnnnnnn AUGS  #23bits             'sets S[31..9] for next instruction
CCCC 11111nn nnn nnnnnnnnn nnnnnnnnn AUGD  #23bits             'sets D[31..9] for next instruction
====================================================================================================
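The AUGS/AUGD comments say the 23-bit value supplies bits [31..9] of the following instruction's S or D, which leaves the usual 9-bit immediate to fill bits [8..0]. A minimal Python sketch of that arithmetic, assuming the two parts are simply concatenated (the function names are made up for illustration):

```python
def split_immediate(value):
    """Split a 32-bit constant into (23-bit AUG part, 9-bit immediate)."""
    value &= 0xFFFFFFFF
    return value >> 9, value & 0x1FF


def augmented_immediate(aug23, imm9):
    """Rebuild the 32-bit value: the AUG part lands in [31..9], the immediate in [8..0]."""
    return ((aug23 & 0x7FFFFF) << 9) | (imm9 & 0x1FF)


if __name__ == "__main__":
    aug, imm = split_immediate(0x12345678)
    print(hex(aug), hex(imm))
    print(hex(augmented_immediate(aug, imm)))
```

An assembler could use the same split to decide when a `#` constant needs an AUGS or AUGD prefix: only when the upper 23 bits are non-zero.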
There's a serious lack of documentation, yet, but those of you that have been playing with FPGA's will find your way.
I'll make a new thread and hope I can post the ~5MB file there.
IIRC Peter did a splendid job and put them up on google???
The P2 FPGA Image is finally here!
P2 Day is September 26, 2015. It will be celebrated for years to come.
Thanks Chip.
I'd be happy with CALLR. CALLK also.
JMPLR or JMPR or JMPLK or JMPK could work as this makes clear, just from the name, it's a branching operation - which LINK doesn't.
LINK is already well known naming so is also still fine, imho.
Is that up to 32 bits, one bit per clock?
Can you also read up to 32 bits, one per clock from a pin?
Is that instruction clock, or system clock?
Can the clock be exposed on an adjacent pin?
If the clock can be exposed, that gives us SPI master (half duplex) with the above for free.
For full duplex SPI, two pins would need to be sync'd with a third as a clock. An arbitrary other pin could be the chip select.
It captures bytes, words, or longs. I like the idea of one, two, or four bits, as well, getting written as bytes! The rate is already programmable by SETXFRQ: $80000000 = every clock, $40000000 = every 2nd clock, $2AAAAAAB = every 3rd clock. In the every-third-clock case, the LSB must be set to ensure that it rolls over (reaches $80000000+) on the initial third clock. Bit 31 is not kept by the phase accumulator.
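The rollover rule is easy to check with a toy model. The Python sketch below assumes a 31-bit phase accumulator that adds the SETXFRQ value every clock, triggers a transfer whenever the sum reaches $80000000 or more, and then throws bit 31 away. That is one reading of the description above, not a statement of how the hardware is built, and `streamer_ticks` is a hypothetical helper name.

```python
def streamer_ticks(freq, clocks):
    """Return the clock numbers (1-based) on which a transfer would fire."""
    acc = 0
    ticks = []
    for clock in range(1, clocks + 1):
        total = acc + (freq & 0xFFFFFFFF)
        if total & 0x80000000:        # sum reached $80000000+: rollover
            ticks.append(clock)
        acc = total & 0x7FFFFFFF      # bit 31 is not kept
    return ticks


if __name__ == "__main__":
    for freq in (0x80000000, 0x40000000, 0x2AAAAAAB, 0x2AAAAAAA):
        print(f"${freq:08X}: {streamer_ticks(freq, 9)}")
```

With $2AAAAAAB the sum after three clocks is $80000001, so the first transfer lands on the third clock. With the LSB clear ($2AAAAAAA) the sum is only $7FFFFFFE, and in this model the first transfer slips to the fourth clock, which is the reason given for setting the LSB.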