The New 16-Cog, 512KB, 64 analog I/O Propeller Chip - Part 2

mark · 2015-09-24 23:49

jmg wrote: »

mark wrote: »

Someone mentioned this long ago, but I don't recall there ever being an answer/response to it: what if the longs were 34-bit?

That's more a viable option on FPGA, as the bits are almost free there.
The P2 silicon is an OnSemi memory compiler block and I'm not sure they can generate 34b.
34b also messes with Byte and int16 overlays, so there are plenty of fish-hooks, for little benefit....

I would have no idea if On could do it.

The ALU/cordic path could remain 32bit, so that bit width for longs/words/bytes remains unchanged, with the two additional bits containing only, say, C & Z flags.

Cluso99 · 2015-09-25 00:11

I don't think any of this needs to be as complex as is being proposed.

BTW I got it into my head that the RD/WRxxxx worked differently to how it really does

However, this got me thinking about what I had been thinking!

Here are the related instructions. I am unsure what they all are for, but here they are...

CCCC 1010110 CZI DDDDDDDDD SSSSSSSSS        RDLUT   D,S/#       {WC,WZ}

CCCC 1100001 0LI DDDDDDDDD SSSSSSSSS        WRLUT   D/#,S/#

CCCC 1011000 CZI DDDDDDDDD SSSSSSSSS        RDBYTE  D,S/#/PTRx  {WC,WZ}
CCCC 1011001 CZI DDDDDDDDD SSSSSSSSS        RDWORD  D,S/#/PTRx  {WC,WZ}
CCCC 1011010 CZI DDDDDDDDD SSSSSSSSS        RDLONG  D,S/#/PTRx  {WC,WZ}

CCCC 1100010 0LI DDDDDDDDD SSSSSSSSS        WRBYTE  D/#,S/#/PTRx
CCCC 1100010 1LI DDDDDDDDD SSSSSSSSS        WRWORD  D/#,S/#/PTRx
CCCC 1100011 0LI DDDDDDDDD SSSSSSSSS        WRLONG  D/#,S/#/PTRx

CCCC 1100011 1LI DDDDDDDDD SSSSSSSSS        RDFAST  D/#,S/#
CCCC 1100100 0LI DDDDDDDDD SSSSSSSSS        WRFAST  D/#,S/#
CCCC 1100100 1LI DDDDDDDDD SSSSSSSSS        FBLOCK  D/#,S/#

CCCC 1101011 CZ0 DDDDDDDDD 000010000        RFBYTE  D           {WC,WZ}
CCCC 1101011 CZ0 DDDDDDDDD 000010001        RFWORD  D           {WC,WZ}
CCCC 1101011 CZ0 DDDDDDDDD 000010010        RFLONG  D           {WC,WZ}

CCCC 1101011 00L DDDDDDDDD 000010011        WFBYTE  D/#
CCCC 1101011 00L DDDDDDDDD 000010100        WFWORD  D/#
CCCC 1101011 00L DDDDDDDDD 000010101        WFLONG  D/#

CCCC 1101011 00L DDDDDDDDD 000010110        SETQ    D/#
CCCC 1101011 00L DDDDDDDDD 000010111        SETQ2   D/#

CCCC 11110nn nnn nnnnnnnnn nnnnnnnnn        AUGS    #23bits
CCCC 11111nn nnn nnnnnnnnn nnnnnnnnn        AUGD    #23bits

RDBYTE/RDWORD/RDLONG & WRBYTE/WRWORD/WRLONG

Let's look at some of those that we all know pretty well, and are similar to P1...

CCCC 1011000 CZI DDDDDDDDD SSSSSSSSS        RDBYTE  D,S/#/PTRx  {WC,WZ}
CCCC 1011001 CZI DDDDDDDDD SSSSSSSSS        RDWORD  D,S/#/PTRx  {WC,WZ}
CCCC 1011010 CZI DDDDDDDDD SSSSSSSSS        RDLONG  D,S/#/PTRx  {WC,WZ}

CCCC 1100010 0LI DDDDDDDDD SSSSSSSSS        WRBYTE  D/#,S/#/PTRx
CCCC 1100010 1LI DDDDDDDDD SSSSSSSSS        WRWORD  D/#,S/#/PTRx
CCCC 1100011 0LI DDDDDDDDD SSSSSSSSS        WRLONG  D/#,S/#/PTRx

I can see the use for the WZ in the RDBYTE/WORD/LONG, but not really WC.
I can see the use for S/PTRx although I am unsure what use PTRx will be without auto-incrementing/decrementing (see lower).
But I cannot see the use for S being immediate #. Do we really require immediate for access to the first 512B of hub???
It would be nice to be able to use D as immediate # in RDxxxx, just like in WRxxxx. But it is not that important.

However, what would be really nice is these instructions...

CCCC xxxxx00 0Z@ DDDDDDDDD SSSSSSSSS        RDBYTE  D/@,S  {WZ}
CCCC xxxxx01 0Z@ DDDDDDDDD SSSSSSSSS        RDWORD  D/@,S  {WZ}
CCCC xxxxx10 0Z@ DDDDDDDDD SSSSSSSSS        RDLONG  D/@,S  {WZ}

CCCC xxxxx00 1Z@ DDDDDDDDD SSSSSSSSS        WRBYTE  D/@,S  {WZ}
CCCC xxxxx01 1Z@ DDDDDDDDD SSSSSSSSS        WRWORD  D/@,S  {WZ}
CCCC xxxxx10 1Z@ DDDDDDDDD SSSSSSSSS        WRLONG  D/@,S  {WZ}

where
S   LONG   %xxxxxxxx_xxxxhhhh_hhhhhhhh_hhhhhhhh  ' x=0(future), h=20-bit byte hub address
   (Note: xxLONG ignores bits[1:0], xxWORD ignores bit[0]. ie LONG & WORD boundaries enforced!)
D   LONG   %xxxxxxxx_xxxxxxxx_xxxxxccc_cccccccc   ' x=0(future), c=11-bit long cog/lut contiguous address
(Note: x=0(future) are bits currently ignored by the P2 but could be used in a later P2 with expanded hub and/or cog/lut RAM)

This would mean that we could have the Destination COG/LUT address stored as an 11-bit result in a Cog Register and access it indirectly. There is an extra level of indirection when we use the @D and that may mean that we need an extra clock, but it's a small price to pay to be able to address COG/LUT contiguously. And it removes the requirement to have a RD/WRLUT instruction.
This keeps the usual RD/WR-BYTE/WORD/LONG simple and easy to explain/understand.
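
To make the proposed S and D layouts concrete, here is a minimal Python sketch of how the two longs might be interpreted. It is only an illustration of the proposal above, not how any P2 build works. The function names are hypothetical, and the split of the 11-bit space into cog ($000..$1FF) followed by LUT is an assumption based on "contiguous" addressing.

```python
HUB_ADDR_BITS = 20      # h field of the proposed S long
COGLUT_ADDR_BITS = 11   # c field of the proposed D long

def hub_address(s_value, size):
    """Effective hub byte address for a 1-, 2- or 4-byte access.

    The upper 12 bits of S are ignored; xxWORD drops bit 0 and
    xxLONG drops bits 1:0, so alignment is enforced.
    """
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2 or 4 bytes")
    addr = s_value & ((1 << HUB_ADDR_BITS) - 1)
    return addr & ~(size - 1)

def coglut_address(d_value):
    """Split the 11-bit contiguous address into (region, index).

    Assumption: cog RAM occupies $000..$1FF and LUT follows it.
    """
    addr = d_value & ((1 << COGLUT_ADDR_BITS) - 1)
    if addr < 0x200:
        return ("COG", addr)
    return ("LUT", addr - 0x200)
```

For example, hub_address(0x12347, 4) gives $12344, and coglut_address(0x200) lands on the first LUT long under the assumed layout.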

SETQ & SETQ2
Since there would no longer be any requirement to have separate cog and lut r/w hub instructions, SETQ2 would no longer be required.
SETQ should only apply to RD/WRLONG (or a new RD/WRBLOCK instruction). All block moves will be LONGs and must be on a LONG boundary.
So SETQ could become...

CCCC 1101011 LDD DDDDDDDDD 000010110        SETQ    D/#
or
CCCC 1101011 DDL DDDDDDDDD 000010110        SETQ    D/#
where...
D   LONG   %xxxxxxxx_xxxxxxxx_xxxxxccc_cccccccc   'x=0(future), c=11-bit long count for block move hub to/from cog/lut
(Note: when D is specified as an address in the SETQ instruction, D may only reside in COG RAM, ie D[10:9] will be ignored and should be "00")

Doesn't this simplify things and make it easier to understand as well ???

msrobots · 2015-09-25 00:16

Would it be possible to load the ROM into the end of the addressable hub space of one MB? More than 1MB is not possible without modifying Hub Exec's PC anyways. Just put 16K RAM (or however long the ROM is) at the end there? Else put it at the end of the 512K.

There, Hub Exec is possible and it can still be overwritten, but the memory at $1000 is available for the user and gives a nice addressing scheme for COG/LUT/HUB/ROM in one continuous space.

I see no problem in having the first 4K of HUB used just for Data. We will need some space for Cog Pasm Images anyways, having 16 Cogs. Like on the P1 this space can be reused for Buffers or other Data if needed.

But the decryption, the monitor and some serial could be quite helpful while developing, but not needed in the end product.

Now you have to keep them while developing and your program has to start at say $1200 to keep the copied ROM but in production it is at $1000. Not good.

Production Code on a locked device fails and your Development Code shows completely different addresses? Not good.

Leaving the ROM copy always in? Not good either.

Thoughts?

Mike

potatohead · 2015-09-25 01:23

All that is why allowing code to execute under $1000 makes so much sense.

Production code is written at $1000 always.

While developing, leave the ROM tools there. When done, clear them.

Optional write protect bit to prevent overwrites when developing and troubleshooting.

The chip allows non aligned hub code anyway.

cgracey · 2015-09-25 04:34

I've almost got the Prop123 FPGA release done. I was hoping it would be today, but it looks like tomorrow. I set it at 50Mhz, since everybody can run that. It fit 12 cogs at 97% capacity. When we add smart pins, that will probably drop to 10. It compiles with 12 cogs at an Fmax of 80Mhz, which means we could certainly run it at 100Mhz. Need the PLLs for that, though.

After the Prop123 version, I'll get the DE2-115 version together.

Tubular · 2015-09-25 05:30

Really looking forward to it Chip

Even 10 cogs would be fine, too, if it gives a bit more room for optimization/fitting

Cluso99 · 2015-09-25 05:40

Here is a condensed list of P2 Instructions...

P2 INSTRUCTIONS (ALL) 24SEP2015
====================================================================================================
CCCC XXXXXXX CZI DDDDDDDDD SSSSSSSSS 
----------------------------------------------------------------------------------------------------
                   <----------------------------- D,S/# {WC,WZ} --------------------------------->
         nnn       0000nnn   0001nnn   0010nnn   0011nnn   0100nnn   0101nnn   0110nnn   0111nnn
CCCC 0iii000 CZI   ROR       ADD       CMP       MIN       ISOB      ANDN      MOV       ALTDS  
CCCC 0iii001 CZI   ROL       ADDX      CMPX      MAX       NOTB      AND       NOT       DECOD  
CCCC 0iii010 CZI   SHR       ADDS      CMPS      MINS      CLRB      OR        ABS       TOPONE 
CCCC 0iii011 CZI   SHL       ADDSX     CMPSX     MAXS      SETB      XOR       NEG       BOTONE 
CCCC 0iii100 CZI   RCR       SUB       CMPR      SUMC      SETBC     MUXC      NEGC      INCMOD 
CCCC 0iii101 CZI   RCL       SUBX      CMPM      SUMNC     SETBNC    MUXNC     NEGNC     DECMOD 
CCCC 0iii110 CZI   SAR       SUBS      SUBR      SUMZ      SETBZ     MUXZ      NEGZ      MUL    
CCCC 0iii111 CZI   SAL       SUBSX     CMPSUB    SUMNZ     SETBNZ    MUXNZ     NEGNZ     MULS   
----------------------------------------------------------------------------------------------------
                   <----------------------------- D,S/# -------------------------------> <- D,S/@ -> 
                   100000nnn 100001nnn 100010nnn 100011nnn 100100nnn 100101nnn 100110nnn 100111nnn
CCCC 100iii0 00I   SETNIB0   GETNIB0   ROLNIB0   SETBYT0   ROLBYT0   ROLWRD0   SEUSSF    DJZ      
CCCC 100iii0 01I   SETNIB1   GETNIB1   ROLNIB1   SETBYT1   ROLBYT1   ROLWRD1   SEUSSR    DJNZ     
CCCC 100iii0 10I   SETNIB2   GETNIB2   ROLNIB2   SETBYT2   ROLBYT2   SETBYTS   REV       DJS      
CCCC 100iii0 11I   SETNIB3   GETNIB3   ROLNIB3   SETBYT3   ROLBYT3   MOVBYTS   SETI      DJNS     
CCCC 100iii1 00I   SETNIB4   GETNIB4   ROLNIB4   GETBYT0   SETWRD0   SPLITB    SETD      TJZ      
CCCC 100iii1 01I   SETNIB5   GETNIB5   ROLNIB5   GETBYT1   SETWRD1   MERGEB    GETD      TJNZ     
CCCC 100iii1 10I   SETNIB6   GETNIB6   ROLNIB6   GETBYT2   GETWRD0   SPLITW    SETS      TJS      
CCCC 100iii1 11I   SETNIB7   GETNIB7   ROLNIB7   GETBYT3   GETWRD1   MERGEW    GETS      TJNS     
----------------------------------------------------------------------------------------------------
                   1010nnn                                 1011nnn.n
CCCC 101i000 CZI   TESTN   D,S/#  {WC,WZ}             CZI  RDBYTE  D,S/#/PTRx  {WC,WZ}
CCCC 101i001 CZI   TEST    D,S/#  {WC,WZ}             CZI  RDWORD  D,S/#/PTRx  {WC,WZ}
CCCC 101i010 CZI   ANYB    D,S/#  {WC,WZ}             CZI  RDLONG  D,S/#/PTRx  {WC,WZ}
CCCC 101i011 CZI   TESTB   D,S/#  {WC,WZ}             CZI  <empty>
CCCC 101i100 CZI   WAITCNT D,S/#  {WC,WZ}             CZI  <empty>
CCCC 101i101 CZI   CALLD   D,S/@  {WC,WZ}             CZI  <empty>
CCCC 101i110 CZI   RDLUT   D,S/#  {WC,WZ}             0LI  SETPAE  D/#,S/#
                   "   "                              1LI  SETPAN  D/#,S/#
CCCC 101i111 CZI   MSGIN   D,S/#  {WC,WZ}             0LI  SETPBE  D/#,S/#
                   "   "                              1LI  SETPBN  D/#,S/#
----------------------------------------------------------------------------------------------------
                   11000nn.n                   11001nn.n                  11010nn.n
CCCC 110ii00 0LI   JP      D/#,S/@            0LI  WRFAST  D/#,S/#       0LI  QMUL    D/#,S/# 
CCCC 110ii00 1LI   JNP     D/#,S/@            1LI  FBLOCK  D/#,S/#       1LI  QDIV    D/#,S/# 
CCCC 110ii01 0LI   WRLUT   D/#,S/#            0LI  XINIT   D/#,S/#       0LI  QSQR    D/#,S/# 
CCCC 110ii01 1LI   MSGOUT  D/#,S/#            1LI  XZERO   D/#,S/#       1LI  QSIN    D/#,S/# 
CCCC 110ii10 0LI   WRBYTE  D/#,S/#/PTRx       0LI  XCONT   D/#,S/#       0LI  QROT    D/#,S/# 
CCCC 110ii10 1LI   WRWORD  D/#,S/#/PTRx       1LI  REP     D/#,S/#       1LI  QATN    D/#,S/# 
CCCC 110ii11 0LI   WRLONG  D/#,S/#/PTRx       CLI  COGINIT D/#,S/# {WC}  <**Note 1**>
CCCC 110ii11 1LI   RDFAST  D/#,S/#                 "   "                     
----------------------------------------------------------------------------------------------------
CCCC 1101100 Rnn nnnnnnnnn nnnnnnnnn        JMP     #abs/@rel
CCCC 1101101 Rnn nnnnnnnnn nnnnnnnnn        CALL    #abs/@rel
CCCC 1101110 Rnn nnnnnnnnn nnnnnnnnn        CALLA   #abs/@rel
CCCC 1101111 Rnn nnnnnnnnn nnnnnnnnn        CALLB   #abs/@rel
CCCC 11100ww Rnn nnnnnnnnn nnnnnnnnn        CALLD   reg,#abs/@rel
CCCC 11101ww Rnn nnnnnnnnn nnnnnnnnn        LOC     reg,#abs/@rel
CCCC 11110nn nnn nnnnnnnnn nnnnnnnnn        AUGS    #23bits
CCCC 11111nn nnn nnnnnnnnn nnnnnnnnn        AUGD    #23bits
====================================================================================================
<**Note 1**>  **SPECIAL OPCODE 1101011 (uses S field)**
CCCC 1101011 CZL DDDDDDDDD sssssssss
----------------------------------------------------------------------------------------------------
      sssssssss                                  sssssssss
 CZL  000000000    CLKSET  D/#  {WC,WZ}     CZ0  000010000    RFBYTE  D    {WC,WZ} 
 CZL  000000001    COGID   D/#  {WC,WZ}     CZ0  000010001    RFWORD  D    {WC,WZ} 
 CZL  000000010    <empty>                  CZ0  000010010    RFLONG  D    {WC,WZ} 
 00L  000000011    COGSTOP D/#              00L  000010011    WFBYTE  D/#          
 CZ0  000000100    LOCKNEW D    {WC,WZ}     00L  000010100    WFWORD  D/#          
 00L  000000101    LOCKRET D/#              00L  000010101    WFLONG  D/#          
 C0L  000000110    LOCKCLR D/#  {WC}        00L  000010110    SETQ    D/#          
 C0L  000000111    LOCKSET D/#  {WC}        00L  000010111    SETQ2   D/#          
 CZL  000001000    <empty>                  CZ0  000011000    GETQX   D    {WC,WZ} 
 CZL  000001001    <empty>                  CZ0  000011001    GETQY   D    {WC,WZ} 
 CZL  000001010    <empty>                  000  000011010    GETCNT  D            
 CZL  000001011    <empty>                  CZ0  000011011    GETRND  {D}  {WC,WZ} 
 CZL  000001100    <empty>                  00L  000011100    SETXDAC D/#          
 CZL  000001101    <empty>                  00L  000011101    SETXFRQ D/#          
 00L  000001110    QLOG    D/#              000  000011110    GETXCOS D            
 00L  000001111    QEXP    D/#              000  000011111    GETXSIN D            
                                          
      sssssssss                                  sssssssss
 00L  000100000    SETPER  D/#              00L  000101000    PUSH    D/#                
 00L  000100001    SETEDG  D/#              CZ0  000101001    CALL    D    {WC,WZ}
 00L  000100010    SETRDL  D/#              CZ0  000101010    CALLA   D    {WC,WZ}
 00L  000100011    SETWRL  D/#              CZ0  000101011    CALLB   D    {WC,WZ}
 C00  000100100    <**Note 2**>             CZ0  000101100    POP     D    {WC,WZ}
 00L  000100101    SETINT1 D/#              CZ0  000101101    RET          {WC,WZ}
 00L  000100110    SETINT2 D/#              CZ0  000101110    RETA         {WC,WZ}
 00L  000100111    SETINT3 D/#              CZ0  000101111    RETB         {WC,WZ}

      sssssssss
 00L  000110000    WAITX   D/#
 CZL  000110001    SETCZ   D/#  {WC,WZ}
 000  000110010    <**Note 3**>            
 00L  000110011    SETBRK  D/#
 xxx  0001101xx    <empty>             
 xxx  000111xxx    <empty>             
 xxx  001xxxxxx    <empty>             
 xxx  01xxxxxxx    <empty>             
 xxx  1xxxxxxxx    <empty>             
====================================================================================================
<**Note 2**>  **SPECIAL OPCODE 1101011 with S= 000100100 (uses D field)**
CCCC 1101011 C00  ddddddddd 000100100
----------------------------------------------------------------------------------------------------
      ddddddddd  sssssssss                                ddddddddd  sssssssss  
 C00  000000000  000100100   GETINT   {WC}           C00  000001000  000100100   WAITINT  {WC} 
 C00  000000001  000100100   GETPER   {WC}           C00  000001001  000100100   WAITPER  {WC} 
 C00  000000010  000100100   GETEDG   {WC}           C00  000001010  000100100   WAITEDG  {WC} 
 C00  000000011  000100100   GETPAT   {WC}           C00  000001011  000100100   WAITPAT  {WC} 
 C00  000000100  000100100   GETRDL   {WC}           C00  000001100  000100100   WAITRDL  {WC} 
 C00  000000101  000100100   GETWRL   {WC}           C00  000001101  000100100   WAITWRL  {WC} 
 C00  000000110  000100100   GETXRO   {WC}           C00  000001110  000100100   WAITXRO  {WC}
 C00  000000111  000100100   GETFBW   {WC}           C00  000001111  000100100   WAITFBW  {WC}
====================================================================================================
<**Note 3**>  **SPECIAL OPCODE 1101011 with S= 000110010 (uses D field)**
CCCC 1101011 000  ddddddddd 000110010
----------------------------------------------------------------------------------------------------
 czi  ddddddddd  sssssssss
 000  000000000  000110010   ALLOWI
 000  000000001  000110010   STALLI
 xxx  00000001x  000110010   <empty>
====================================================================================================
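
For anyone who wants to poke at these encodings, here is a small Python sketch that splits a 32-bit instruction word into the fields used in the tables above (CCCC, 7-bit opcode, the 3-bit CZI/0LI field, D and S). The helper names are hypothetical, and the 3-bit field is returned raw because its meaning changes per opcode group.

```python
def encode(cond, opcode, czi, d, s):
    """Pack CCCC XXXXXXX CZI DDDDDDDDD SSSSSSSSS into one 32-bit word."""
    return ((cond & 0xF) << 28 |
            (opcode & 0x7F) << 21 |
            (czi & 0x7) << 18 |
            (d & 0x1FF) << 9 |
            (s & 0x1FF))

def decode(word):
    """Unpack a 32-bit word into its five fields."""
    return {
        "cond":   (word >> 28) & 0xF,
        "opcode": (word >> 21) & 0x7F,
        "czi":    (word >> 18) & 0x7,    # CZI, 0LI, CZL ... depends on opcode
        "d":      (word >> 9) & 0x1FF,
        "s":      word & 0x1FF,
    }

# RDLONG is listed as opcode 1011010; the other field values are arbitrary.
word = encode(cond=0b1111, opcode=0b1011010, czi=0b001, d=0x010, s=0x004)
fields = decode(word)
print(hex(word), fields)
```

For opcode 1101011 the S field (and, for Notes 2 and 3, the D field) selects the actual instruction, so a fuller decoder would need a second lookup on those fields.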

Cluso99 · 2015-09-25 05:42

Sounds good Chip. I am sure no one will run out of 10-12 cogs for a while yet

ozpropdev · 2015-09-25 05:43

All systems go here Chip!
Refresh....refresh.....refresh.....

MJB · 2015-09-25 08:30

Cluso99 wrote: »
I don't think any of this needs to be as complex as is being proposed.

BTW I got it into my head that the RD/WRxxxx worked differently to how it really does
However, this got me thinking about what I had been thinking!

Here are the related instructions. I am unsure what they all are for, but here they are...
CCCC 1010110 CZI DDDDDDDDD SSSSSSSSS        RDLUT   D,S/#       {WC,WZ}

CCCC 1100001 0LI DDDDDDDDD SSSSSSSSS        WRLUT   D/#,S/#

CCCC 1011000 CZI DDDDDDDDD SSSSSSSSS        RDBYTE  D,S/#/PTRx  {WC,WZ}
CCCC 1011001 CZI DDDDDDDDD SSSSSSSSS        RDWORD  D,S/#/PTRx  {WC,WZ}
CCCC 1011010 CZI DDDDDDDDD SSSSSSSSS        RDLONG  D,S/#/PTRx  {WC,WZ}

CCCC 1100010 0LI DDDDDDDDD SSSSSSSSS        WRBYTE  D/#,S/#/PTRx
CCCC 1100010 1LI DDDDDDDDD SSSSSSSSS        WRWORD  D/#,S/#/PTRx
CCCC 1100011 0LI DDDDDDDDD SSSSSSSSS        WRLONG  D/#,S/#/PTRx

CCCC 1100011 1LI DDDDDDDDD SSSSSSSSS        RDFAST  D/#,S/#
CCCC 1100100 0LI DDDDDDDDD SSSSSSSSS        WRFAST  D/#,S/#
CCCC 1100100 1LI DDDDDDDDD SSSSSSSSS        FBLOCK  D/#,S/#

CCCC 1101011 CZ0 DDDDDDDDD 000010000        RFBYTE  D           {WC,WZ}
CCCC 1101011 CZ0 DDDDDDDDD 000010001        RFWORD  D           {WC,WZ}
CCCC 1101011 CZ0 DDDDDDDDD 000010010        RFLONG  D           {WC,WZ}

CCCC 1101011 00L DDDDDDDDD 000010011        WFBYTE  D/#
CCCC 1101011 00L DDDDDDDDD 000010100        WFWORD  D/#
CCCC 1101011 00L DDDDDDDDD 000010101        WFLONG  D/#

CCCC 1101011 00L DDDDDDDDD 000010110        SETQ    D/#
CCCC 1101011 00L DDDDDDDDD 000010111        SETQ2   D/#

CCCC 11110nn nnn nnnnnnnnn nnnnnnnnn        AUGS    #23bits
CCCC 11111nn nnn nnnnnnnnn nnnnnnnnn        AUGD    #23bits
RDBYTE/RDWORD/RDLONG & WRBYTE/WRWORD/WRLONG

Let's look at some of those that we all know pretty well, and are similar to P1...
CCCC 1011000 CZI DDDDDDDDD SSSSSSSSS        RDBYTE  D,S/#/PTRx  {WC,WZ}
CCCC 1011001 CZI DDDDDDDDD SSSSSSSSS        RDWORD  D,S/#/PTRx  {WC,WZ}
CCCC 1011010 CZI DDDDDDDDD SSSSSSSSS        RDLONG  D,S/#/PTRx  {WC,WZ}

CCCC 1100010 0LI DDDDDDDDD SSSSSSSSS        WRBYTE  D/#,S/#/PTRx
CCCC 1100010 1LI DDDDDDDDD SSSSSSSSS        WRWORD  D/#,S/#/PTRx
CCCC 1100011 0LI DDDDDDDDD SSSSSSSSS        WRLONG  D/#,S/#/PTRx
I can see the use for the WZ in the RDBYTE/WORD/LONG, but not really WC.
I can see the use for S/PTRx although I am unsure what use PTRx will be without auto-incrementing/decrementing (see lower).
But I cannot see the use for S being immediate #. Do we really require immediate for access to the first 512B of hub???
It would be nice to be able to use D as immediate # in RDxxxx, just like in WRxxxx. But it is not that important.

However, what would be really nice is these instructions...
CCCC xxxxx00 0Z@ DDDDDDDDD SSSSSSSSS        RDBYTE  D/@,S  {WZ}
CCCC xxxxx01 0Z@ DDDDDDDDD SSSSSSSSS        RDWORD  D/@,S  {WZ}
CCCC xxxxx10 0Z@ DDDDDDDDD SSSSSSSSS        RDLONG  D/@,S  {WZ}

CCCC xxxxx00 1Z@ DDDDDDDDD SSSSSSSSS        WRBYTE  D/@,S  {WZ}
CCCC xxxxx01 1Z@ DDDDDDDDD SSSSSSSSS        WRWORD  D/@,S  {WZ}
CCCC xxxxx10 1Z@ DDDDDDDDD SSSSSSSSS        WRLONG  D/@,S  {WZ}

where
S   LONG   %xxxxxxxx_xxxxhhhh_hhhhhhhh_hhhhhhhh  ' x=0(future), h=20-bit byte hub address
   (Note: xxLONG ignores bits[1:0], xxWORD ignores bit[0]. ie LONG & WORD boundaries enforced!)
D   LONG   %xxxxxxxx_xxxxxxxx_xxxxxccc_cccccccc   ' x=0(future), c=11-bit long cog/lut contiguous address
(Note: x=0(future) are bits currently ignored by the P2 but could be used in a later P2 with expanded hub and/or cog/lut RAM)
This would mean that we could have the Destination COG/LUT address stored as an 11-bit result in a Cog Register and access it indirectly. There is an extra level of indirection when we use the @D and that may mean that we need an extra clock, but it's a small price to pay to be able to address COG/LUT contiguously. And it removes the requirement to have a RD/WRLUT instruction.
This keeps the usual RD/WR-BYTE/WORD/LONG simple and easy to explain/understand.

SETQ & SETQ2
Since there would no longer be any requirement to have separate cog and lut r/w hub instructions, SETQ2 would no longer be required.
SETQ should only apply to RD/WRLONG (or a new RD/WRBLOCK instruction). All block moves will be LONGs and must be on a LONG boundary.
So SETQ could become...
CCCC 1101011 LDD DDDDDDDDD 000010110        SETQ    D/#
or
CCCC 1101011 DDL DDDDDDDDD 000010110        SETQ    D/#
where...
D   LONG   %xxxxxxxx_xxxxxxxx_xxxxxccc_cccccccc   'x=0(future), c=11-bit long count for block move hub to/from cog/lut
(Note: when D is specified as an address in the SETQ instruction, D may only reside in COG RAM, ie D[10:9] will be ignored and should be "00")
Doesn't this simplify things and make it easier to understand as well ???

this would imply that whole COG+LUT space can be used for registers
with the only caveat, that immediate addressing is only available in the 9-bit COG address range
right??

Cluso99 · 2015-09-25 08:39

MJB wrote: »

this would imply that whole COG+LUT space can be used for registers
with the only caveat, that immediate addressing is only available in the 9-bit COG address range
right??

No, definitely not at this time.

What it means is that we could load from hub to cog and/or lut, or save cog and/or lut to hub.

Apart from the fact that normal programs only have 9 bits available to address the registers ($000..1FF), COG RAM is dual ported permitting both D and S values to be read simultaneously, and permitting both I to be read and R (result of a previous instruction) to be written simultaneously. LUT is only single port RAM.

Seairth · 2015-09-25 12:28

cgracey wrote: »

I've almost got the Prop123 FPGA release done. I was hoping it would be today, but it looks like tomorrow. I set it at 50Mhz, since everybody can run that. It fit 12 cogs at 97% capacity. When we add smart pins, that will probably drop to 10. It compiles with 12 cogs at an Fmax of 80Mhz, which means we could certainly run it at 100Mhz. Need the PLLs for that, though.

After the Prop123 version, I'll get the DE2-115 version together.

*WHEW!* I was actually worried that you would release it today. All my "real" work would have come to a screeching halt! Fortunately, it's raining all weekend, so a perfect excuse to play with an FPGA (starting tomorrow, of course)!

David Betz · 2015-09-25 16:13

cgracey wrote: »

I've almost got the Prop123 FPGA release done. I was hoping it would be today, but it looks like tomorrow. I set it at 50Mhz, since everybody can run that. It fit 12 cogs at 97% capacity. When we add smart pins, that will probably drop to 10. It compiles with 12 cogs at an Fmax of 80Mhz, which means we could certainly run it at 100Mhz. Need the PLLs for that, though.

After the Prop123 version, I'll get the DE2-115 version together.

Any chance of a DE2-115 image today or will that have to wait until next week?

Roy Eltham · 2015-09-25 20:09

I need a DE2-115 image also.

I was going to get the P123 A7 version, but then the issues came up and I figured I should wait for the A9 one.

Baggers · 2015-09-25 21:58

Great news, Chip. Like everyone here, I am looking forward to having a play.

Cluso99 · 2015-09-26 00:41

Here is a summary of the JMP/CALL/RET instructions...

P2 INSTRUCTIONS (JUMPS etc) 24SEP2015
====================================================================================================
                                       R=0                 R=1
CCCC 100111R 00I DDDDDDDDD SSSSSSSSS   DJZ     D,S/@       TJZ     D,S/@ 
CCCC 100111R 01I DDDDDDDDD SSSSSSSSS   DJNZ    D,S/@       TJNZ    D,S/@ 
CCCC 100111R 10I DDDDDDDDD SSSSSSSSS   DJS     D,S/@       TJS     D,S/@ 
CCCC 100111R 11I DDDDDDDDD SSSSSSSSS   DJNS    D,S/@       TJNS    D,S/@ 
                                       Q=0                 Q=1
CCCC 1100000 QLI DDDDDDDDD SSSSSSSSS   JP      D/#,S/@     JNP     D/#,S/@  'j pinD [not]positive?
====================================================================================================
CCCC 1101100 Rnn nnnnnnnnn nnnnnnnnn   JMP     #abs/@rel      'jump 20-bit absolute/relative address
CCCC 1101011 CZ0 000000000 000101101   RET           {WC,WZ}  'jump via internal stack
CCCC 1101011 CZ0 000000000 000101110   RETA          {WC,WZ}  'jump via Register A
CCCC 1101011 CZ0 000000000 000101111   RETB          {WC,WZ}  'jump via Register B
----------------------------------------------------------------------------------------------------
CCCC 1101011 CZ0 DDDDDDDDD 000101001   CALL    D     {WC,WZ}  'save return address on internal stack
CCCC 1101011 CZ0 DDDDDDDDD 000101010   CALLA   D     {WC,WZ}  'save return address in Register A    
CCCC 1101011 CZ0 DDDDDDDDD 000101011   CALLB   D     {WC,WZ}  'save return address in Register B    
CCCC 1010101 CZI DDDDDDDDD SSSSSSSSS   CALLD   D,S/@ {WC,WZ}  'save return address in Register ???  

CCCC 1101101 Rnn nnnnnnnnn nnnnnnnnn   CALL    #abs/@rel      'save return address on internal stack
CCCC 1101110 Rnn nnnnnnnnn nnnnnnnnn   CALLA   #abs/@rel      'save return address in Register A
CCCC 1101111 Rnn nnnnnnnnn nnnnnnnnn   CALLB   #abs/@rel      'save return address in Register B
CCCC 11100ww Rnn nnnnnnnnn nnnnnnnnn   CALLD   reg,#abs/@rel  'save return address in Register ???
====================================================================================================
CCCC 1101011 00L DDDDDDDDD 000101000   PUSH    D/#            'push D/# on internal stack
CCCC 1101011 CZ0 DDDDDDDDD 000101100   POP     D     {WC,WZ}  'pop  D from internal stack
CCCC 11101ww Rnn nnnnnnnnn nnnnnnnnn   LOC     reg,#abs/@rel  'loads Register A..D with 20-bit address ???
CCCC 11110nn nnn nnnnnnnnn nnnnnnnnn   AUGS    #23bits        'sets S[31..9] for next instruction
CCCC 11111nn nnn nnnnnnnnn nnnnnnnnn   AUGD    #23bits        'sets D[31..9] for next instruction
====================================================================================================
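
As a quick illustration of the AUGS/AUGD comments in the table (23 bits supplying bits 31..9 of the next instruction's immediate), here is a Python sketch of how a 32-bit constant would split into the two pieces and join back up. The function names are hypothetical; this only models the bit arithmetic, not the assembler syntax or timing.

```python
def split_immediate(value):
    """Split a 32-bit constant into (23-bit AUG payload, 9-bit S/D immediate)."""
    value &= 0xFFFFFFFF
    return value >> 9, value & 0x1FF

def join_immediate(aug23, imm9):
    """Rebuild the 32-bit constant the next instruction would see."""
    return ((aug23 & 0x7FFFFF) << 9) | (imm9 & 0x1FF)

aug, low = split_immediate(0x12345678)
print(hex(aug), hex(low), hex(join_immediate(aug, low)))
```

Without the AUG prefix, only the low 9 bits are available, which is where the "first 512B of hub" limit for an immediate S comes from.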

cgracey · 2015-09-26 03:30

David Betz wrote:

Any chance of a DE2-115 image today or will that have to wait until next week?

I've been working on it all day. I needed to get PNut.exe to work with multiple boards, so I thought I would do the DE2-115 and the Prop123 at the same time, to start.

I noticed a few little problems as I made all the compiles and cross-checks. So, I am fixing those right now. Maybe later tonight I will have something. Sorry this is taking so long.

ozpropdev · 2015-09-26 03:43

No apologies necessary Chip, after all this is a huge project.

evanh · 2015-09-26 06:03

Ah, not happy ... CALL D, CALLD, CALLD D! ... Just looking at what Cluso has compiled.

One problem with the relabelling of LINK to CALLD is there is no paired RETD because it's not quite a full stacked mechanism.

But obviously there is also the comprehension issue of D being used both as part of an opcode name and also a register direct place holder. This is particularly bad when both are in the one instruction together!

cgracey · 2015-09-26 11:36

evanh wrote: »

Ah, not happy ... CALL D, CALLD, CALLD D! ... Just looking at what Cluso has compiled.

One problem with the relabelling of LINK to CALLD is there is no paired RETD because it's not quite a full stacked mechanism.

But obviously there is also the comprehension issue of D being used both as part of an opcode name and also a register direct place holder. This is particularly bad when both are in the one instruction together!

I kind of liked "LINK". I was thinking CALLR for 'register' might make more sense than CALLD, when there's only one operand, anyway.

cgracey · 2015-09-26 11:38

Okay. I got the Prop123 and DE2_115 FPGA images together.

There's a serious lack of documentation, yet, but those of you that have been playing with FPGA's will find your way.

I'll make a new thread and hope I can post the ~5MB file there.

Rayman · 2015-09-26 14:35

Where is I/O now on DE2_115 with Prop board relative to where it's going to be finally?

I guess right now, I'm mainly interested in digital I/O. Is this working the same way now that it will in final chip?

I think I've heard that smart-pins are the next thing Chip will work on... I don't seem to remember what "smart pins" are...
Do they include the analog I/O modes and resistor pull-up and down modes? Or, are they just special digital modes?

I think it'd be fun to connect an LCD to DE2-115 or Prop123 and would like to figure out how to do that mechanically and code wise...

cgracey · 2015-09-26 20:04

Rayman wrote: »

Where is I/O now on DE2_115 with Prop board relative to where it's going to be finally?

I guess right now, I'm mainly interested in digital I/O. Is this working the same way now that it will in final chip?

I think I've heard that smart-pins are the next thing Chip will work on... I don't seem to remember what "smart pins" are...
Do they include the analog I/O modes and resistor pull-up and down modes? Or, are they just special digital modes?

I think it'd be fun to connect an LCD to DE2-115 or Prop123 and would like to figure out how to do that mechanically and code wise...

The streamer can write data directly to the i/o pins, not just to the DACs, up to 32 bits per clock, from hub or LUT.

Cluso99 · 2015-09-26 22:41

Where are the old docs?
IIRC Peter did a splendid job and put them up on google???

Dave Hein · 2015-09-26 23:27

P2 Watch:

The P2 FPGA Image is finally here!

P2 Day is September 26, 2015. It will be celebrated for years to come.

Thanks Chip.

evanh · 2015-09-26 23:47

Schedule some annual leave, crack open the caffeine. Party with keyboard time!

evanh · 2015-09-27 00:04

cgracey wrote: »

I kind of liked "LINK". I was thinking CALLR for 'register' might make more sense than CALLD, when there's only one operand, anyway.

I'd be happy with CALLR. CALLK also.

JMPLR or JMPR or JMPLK or JMPK could work as this makes clear, just from the name, it's a branching operation - which LINK doesn't.

LINK is already well known naming so is also still fine, imho.

Bill Henning · 2015-09-27 04:37

Questions:

Is that up to 32 bits, one bit per clock?

Can you also read up to 32 bits, one per clock from a pin?

Is that instruction clock, or system clock?

Can the clock be exposed on an adjacent pin?

If the clock can be exposed, that gives us SPI master (half duplex) with the above for free.

For full duplex SPI, two pins would need to be sync'd with a third as a clock. An arbitrary other pin could be the chip select.

cgracey wrote: »

The streamer can write data directly to the i/o pins, not just to the DACs, up to 32 bits per clock, from hub or LUT.

cgracey · 2015-09-27 05:32

Bill Henning wrote: »

Questions:

Is that up to 32 bits, one bit per clock?

Can you also read up to 32 bits, one per clock from a pin?

Is that instruction clock, or system clock?

Can the clock be exposed on an adjacent pin?

If the clock can be exposed, that gives us SPI master (half duplex) with the above for free.

For full duplex SPI, two pins would need to be sync'd with a third as a clock. An arbitrary other pin could be the chip select.

cgracey wrote: »

The streamer can write data directly to the i/o pins, not just to the DACs, up to 32 bits per clock, from hub or LUT.

It captures bytes, words, or longs. I like the idea of one, two, or four bits, as well, getting written as bytes! The rate is already programmable by SETXFRQ: $80000000 = every clock, $40000000 = every 2nd clock, $2AAAAAAB = every 3rd clock. In the every-third-clock case, the LSB must be set to ensure that it rolls over (reaches $80000000+) on the initial third clock. Bit 31 is not kept by the phase accumulator.
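
The rollover behaviour can be checked with a short Python simulation. This is only a sketch of one plausible model: a 31-bit phase accumulator that adds the SETXFRQ value every clock, triggers a transfer when the sum reaches $80000000 or more, and discards bit 31 afterwards. The function name and the model itself are assumptions, not taken from the Verilog.

```python
def transfer_clocks(frq, clocks):
    """Return the clock numbers (1-based) on which a transfer would occur."""
    acc = 0                      # 31-bit phase accumulator, starts cleared
    ticks = []
    for n in range(1, clocks + 1):
        total = acc + (frq & 0xFFFFFFFF)
        if total & 0x80000000:   # sum reached $80000000+: rollover
            ticks.append(n)
        acc = total & 0x7FFFFFFF # bit 31 is not kept
    return ticks

for frq in (0x80000000, 0x40000000, 0x2AAAAAAB, 0x2AAAAAAA):
    print(hex(frq), transfer_clocks(frq, 9))
```

Under this model $2AAAAAAB fires on clocks 3, 6 and 9, while $2AAAAAAA (LSB clear) misses the third clock and first fires on clock 4, which matches the remark about needing the LSB set.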

AntoineDoinel · 2015-09-27 05:55

Rename STALLI and ALLOWI to FORBID and PERMIT !
