Chip,
Did you manage to fit the USB read bit (GETXP) and CRC (CRCBIT) instructions in this release?
I didn't put them in yet, but there's room for them. Could you please direct me to a post that spells out what is needed, or just repost it, here? What you wanted to do is not a big deal, at all. I just need to know what it is, again. Thanks!
Or would starting a new thread for the USB opcodes be easier to track?
I just realized that since all instructions during hub execution come from the hub, the cog RAM instruction fetching is still going on, but it's being ignored. We could stuff some other address in the instruction-read-address of the cog RAM and get any long out of cog RAM we want. I wonder if there is something useful that can be done by repurposing the cog's internal instruction fetch. It's a free cog RAM read on every hub exec instruction.
Chip
If there is some room in the opcode map to craft a tight follower to the instruction block being fetched from Hub memory, perhaps it could open the door to the suggestion Cluso99 made at post #4217 of this thread: creating some real-time method to "teach" the cache LRU algorithm how to deal with the following fetches from Hub memory.
Unfortunately, it still would not handle conditional-branch preview, unless you could devise a method to automatically do some "Y" branching at the cache level, advancing a single eight-long fetch as an alternate, faster way to "predict" where it will start gathering new data, in case the branch really is taken.
Yanomani
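The "Y" branching idea above can be sketched roughly as follows. This is purely illustrative (no such hardware exists, and the function name and sizes are my assumptions): when a conditional branch is seen in the stream, the loader would also fetch a single wide at the branch target, so either outcome finds its first instructions already cached.

```python
# Rough sketch of the "Y" branch-preview idea (illustrative only).
WIDE = 8  # longs per wide (8-long cache line)

def wides_to_fetch(pc, branch_target=None):
    """Wides the cache loader would request for the long at pc."""
    wanted = [pc // WIDE + 1]                 # usual next-block prefetch
    if branch_target is not None:
        wanted.append(branch_target // WIDE)  # one wide down the other arm of the "Y"
    return wanted

print(wides_to_fetch(10))        # fall-through path only
print(wides_to_fetch(10, 100))   # fall-through plus branch-target wide
```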
I think it would be nice if we could just "force" the loading of the instruction cache with an instruction like I proposed. It would be worth playing with this concept on the FPGA to see what we could determine. I don't think the autoload of the instruction cache can ever be as good as what a well-placed instruction could do. This way, we could control the cache contents in software.
As long as there is auto-load in the absence of forced-loading.
Auto load is needed to reduce the work compilers need to do.
Auto load is the prime requirement.
I just think it could be extremely useful to override this default. Our code would have to be more complex to take advantage of it, but it could be a real boost to some specialised code.
It would be nice to play with this in the fpga while other things are being done.
It would also be nice for the instruction cache autoloader to be able to take advantage of the next available slot to load the wide.
As Cluso said, we can't help much on PNut, but:
Since instructions now use absolute and relative addresses, in my opinion that needs two directives, ABSCOD and RELCOD.
Then, as code can execute from HUB or COG: HUBCOD and COGCOD.
What you were seeking was a PREFETCHI #hubaddr16bit ... to prefetch. Assembly coders (and optimizing compilers) could make good use of that.
hubexec would really benefit from slurping up unused slots...
Actually, I was hoping for a little more
PREFETCHI #hubaddr16bit,#1..3
to prefetch 1, 2 or 3 wides into the instruction cache because the program would know that the routine about to be called was <8, <16, <24 longs.
It would be particularly beneficial if the prefetching (auto or manual) could grab the next unused slot! What a performance boost this would be, and its slot would then be free for another cog's prefetching!
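The proposed PREFETCHI could be modeled in a few lines. This is a hypothetical instruction, so everything here (the function, `TinyCache`, the 32-byte wide size) is just an illustration of the proposal: force-load 1 to 3 consecutive wides starting at a hub address before calling a routine known to fit in them.

```python
# Hypothetical model of PREFETCHI #hubaddr16bit,#1..3 (illustrative only).
WIDE_BYTES = 32    # 8 longs of 4 bytes per wide

class TinyCache:
    """Stand-in for the instruction cache; just records loaded wides."""
    def __init__(self):
        self.wides = set()
    def add(self, wide):
        self.wides.add(wide)

def prefetchi(cache, hub_addr, n):
    """Force-load n wides (1..3) starting at hub byte address hub_addr."""
    assert 1 <= n <= 3
    first = hub_addr // WIDE_BYTES
    for wide in range(first, first + n):
        cache.add(wide)

cache = TinyCache()
prefetchi(cache, 0x1000, 2)    # routine about to be called is < 16 longs
print(sorted(cache.wides))     # wides 128 and 129 are now cached
```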
Currently I expect that the auto icache loader waits until a hub instruction (that's not in the cache) is required. It then stalls until that wide is fetched. The wide will replace the LRU (least recently used) line.
What would be nice is if the loader then automatically presumed the next block would be required, and fetched that wide into the next LRU line. And, once that first block is used and the second wide's use has begun, the next block is fetched.
However, all this is a little complex for now. It will be nice to start playing to see what performance we can get, and where the stalls are.
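The auto-load-plus-next-block behaviour described above can be played with in a toy model before touching the FPGA. Everything here is an assumption for illustration (4 lines of 8 longs, a simple LRU list); it just shows the miss-fill and next-wide prefetch that the posts propose:

```python
# Toy model of an auto-loading instruction cache with next-block prefetch.
WIDE = 8  # longs per cache line ("wide")

class ICache:
    def __init__(self, lines=4):
        self.tags = [None] * lines       # which wide each line holds
        self.order = list(range(lines))  # LRU order, least recent first

    def fetch(self, long_addr, prefetch_next=False):
        """Return True on a hit; on a miss, load the wide into the LRU line."""
        wide = long_addr // WIDE
        hit = wide in self.tags
        if hit:
            line = self.tags.index(wide)
            self.order.remove(line)
            self.order.append(line)      # mark most recently used
        else:
            self._load(wide)
        if prefetch_next and (wide + 1) not in self.tags:
            self._load(wide + 1)         # presume the next block is needed
        return hit

    def _load(self, wide):
        line = self.order.pop(0)         # evict the least recently used line
        self.tags[line] = wide
        self.order.append(line)

cache = ICache()
print(cache.fetch(0))                      # cold miss
print(cache.fetch(8, prefetch_next=True))  # miss, but also pulls in wide 2
print(cache.fetch(16))                     # hit, thanks to the prefetch
```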
Nice list, Cluso99. You could do the ff-bit compaction for those initial instructions, too, and shrink it down some more.
I was at Parallax yesterday for some meetings, but I'm back on the Prop2 work today. I've got to finish a little work on PNut.exe and then do some checks on it. Then, I can make changes to the ROM code (branches are different now) and see if it all works together. After that, I'll be able to work on the cache line mechanism that feeds instructions from the hub. It's taken two weeks to get things in order to be able to make that addition. When I get it working, I'll post an update after I modify Prop2_Docs.txt.
Can you include in Prop2_Docs.txt a list of all directives used in PNut?
I will in the next release. The ORG-type directives select between cog and hub assembler modes. In cog mode, the cog address is used. In hub mode, the hub address >> 2 is used.
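The cog/hub addressing rule above is small enough to show concretely. This is a sketch with my own naming (PNut's internals aren't public here): in hub mode the assembler uses the byte address shifted right by 2, since instructions are longs, while cog addresses already count longs.

```python
# Sketch of the address field the assembler would emit (illustrative).
def asm_address(addr, hub_mode):
    """In hub mode, convert a hub byte address to a long index (>> 2);
    in cog mode, the cog address is used as-is."""
    if hub_mode:
        return addr >> 2   # hub byte address -> long index
    return addr            # cog addresses already index longs

print(hex(asm_address(0x1000, hub_mode=True)))   # hub byte 0x1000 -> long 0x400
print(hex(asm_address(0x1F0, hub_mode=False)))   # cog address unchanged
```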
After that, I'll be able to work on the cache line mechanism that feeds instructions from the hub.
Isn't this going to be kind of tricky since you could possibly have all four tasks trying to fill a cache line at the same time? Somehow you have to make sure they don't step on each other. Although, maybe with LRU that falls out of the design. Anyway, it sounds a little mind bending to me! :-)
Isn't this going to be kind of tricky since you could possibly have all four tasks trying to fill a cache line at the same time? Somehow you have to make sure they don't step on each other. Although, maybe with LRU that falls out of the design. Anyway, it sounds a little mind bending to me! :-)
Yes, they could all want new caches at once. It's hard to think about. I think that's why I'm getting everything else ready before I get into the implementation of the cache lines. I'm hoping that the LRU technique will perform equitably in all cases. I'm thinking the LRU "timer" used to measure usage must be the instruction clock. I'll probably get things running, at first, with no cache lines, just a single long. This will let me prove that hub execution is working properly. Then, I'll get into the caching. This has been a lot of work, but it's not going to be complicated to use.
Thanks Chip. I will do the initial ones too.
Much of the work is done with formulas in Excel. This makes it easier when the instruction set changes.
I need to break the instructions into common format groups for my disassembler.
When you get the cache line mechanism working, could you consider adding an instruction to force the loading of a line(s) please?
Chip, why don't you just use an 8-bit byte for the LRU?
When you load a cache line, shift the byte right, put the 2 bit line ID in the upper 2 bits. When you load the next line, you just pull the 2 bit ID off the bottom of the byte and use that to determine what the next cache line is.
Init LRU to 11100100 at startup
First load grabs 00, shifts right, pushes 00 to the top: 00111001
Next load, and so on.
When the cogs start running instructions through the cache, the efficiency of the code will mix up the order, creating LRU instead of round-robin, but at start it will simply be round-robin.
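The shift-register LRU byte described above is simple enough to verify in a few lines. This is a sketch with my own class and method names; it packs four 2-bit line IDs into one byte, pulls the victim off the bottom on a load, and, on a cache hit, moves a line's ID to the most-recent position (which is what turns the initial round-robin into true LRU):

```python
# Sketch of the 8-bit LRU byte: four 2-bit line IDs, most recent at the top.
class LruByte:
    def __init__(self):
        self.byte = 0b11100100           # lines 3,2,1,0; line 0 is next victim

    def victim(self):
        """Pick the least recently used line and mark it most recent."""
        line = self.byte & 0b11                              # pull ID off the bottom
        self.byte = ((self.byte >> 2) | (line << 6)) & 0xFF  # shift right, push to top
        return line

    def touch(self, line):
        """On a cache hit, move `line` to the most-recent position."""
        ids = [(self.byte >> s) & 0b11 for s in (6, 4, 2, 0)]
        ids.remove(line)
        ids.insert(0, line)              # most recent first
        self.byte = ids[0] << 6 | ids[1] << 4 | ids[2] << 2 | ids[3]

lru = LruByte()
print(lru.victim())     # first load grabs line 0
print(bin(lru.byte))    # 0b111001 with the 00 pushed to the top: 00111001
```

Note how the first `victim()` call reproduces the worked example in the post: 11100100 becomes 00111001.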
Also, I don't see how all tasks could want data at once, since they are staggered through the 4 stages of the pipeline. Sure, it takes 3-8 clocks to load, but the tasks will stall anyway. So, yes you can have a pile-up, and RDLONG should have priority over HUBEX, which will cause further stalls.
This is all a better reason not to support HUBEX for more than task 0. You will have totally non-deterministic execution with 4 HUBEX tasks, because of issue order of the HUBEX loads and overriding priority for RDXXXX instructions. That means if the first instruction of a HUBEX is RDXXXX, it will stall the next HUBEX load, then if it's multiple RDXXXX instructions in a row, those other HUBEX tasks will stall indefinitely.
If you only make 1 task HUBEX and have a 4 line cache, that will give best possible performance.
pre-caching can make hub execution fast, but you must allow explicit memory ops to interrupt pending pre-cache requests.
Comments?
It seems too choppy in time. It might have to correlate functionally with instruction execution to be useful.
http://forums.parallax.com/showthread.php/125543-Propeller-II-update-BLOG?p=1224661&viewfull=1#post1224661
And here are some fixes/updates/etc. (summaries) that I posted a short time ago (most were resolved or are now irrelevant):
http://forums.parallax.com/showthread.php/151904-Here-is-the-update-from-the-Big-Change!!!?p=1224734&viewfull=1#post1224734
Thanks for these links. I should be able to get to these in a few days. I'm still modifying PNut.exe to get the new instruction set integrated.
I don't suppose there is any way we can help you with PNut?
This time they are in a pdf.
P2_Instruction_Set_20131217b.pdf
I'm not sure that is still correct. See Chip's post here:
http://forums.parallax.com/showthread.php/152079-Hub-Execution-Model-Thread-%28split-from-blog%29?p=1228729&viewfull=1#post1228729
I looked at the instruction list and it shows both absolute and relative possibilities.
I am pretty sure this is what Chip did...
Indeed...
- single-cog mode: LRU 4-line cache
- multi-task mode: each task gets 1 line (bypasses the LRU mechanism)