Prop2 FPGA files!!! - Updated 2 June 2018 - Final Version 32i

cgracey · 2017-04-06 01:37

This is rather problematic:

	calld	myret,##hub	' <-- Error - Expected a constant,unary operator or "("

Crossing from cog to hub using a 9-bit relative jump augmented to 20 bits is a brain bender. I'll need to think about this. Sorry this issue has carried on so long.

Dave Hein · 2017-04-06 02:15

cgracey wrote: »

Dave Hein,

Try this v17b of PNut.exe:

https://drive.google.com/file/d/0B9NbgkdrupkHRE80d1lqRnd2NkE/view?usp=sharing

That fixed the problem I was having. Thanks.

ozpropdev · 2017-04-06 02:16

cgracey wrote: »
This is rather problematic:
	calld	myret,##hub	' <-- Error - Expected a constant,unary operator or "("
Crossing from cog to hub using a 9-bit relative jump augmented to 20 bits is a brain bender. I'll need to think about this. Sorry this issue has carried on so long.

It works OK using the D reg variant, so maybe a note in the documentation saying DON'T AUGment the #rel9 variant and you're done.

jmg · 2017-04-06 02:22

ozpropdev wrote: »

Crossing from cog to hub using a 9-bit relative jump augmented to 20 bits is a brain bender.

It works OK using the D reg variant, so maybe a note in the documentation saying DON'T AUGment the #rel9 variant and you're done.

I thought COG -> HUB must always be absolute, so isn't this something the tools can manage without even needing a note in the documents?

ozpropdev · 2017-04-06 04:25

jmg wrote: »

I thought COG -> HUB must always be absolute, so isn't this something the tools can manage without even needing a note in the documents?

In this case the CALLD only has a relative option (#rel9).

EEEE 1011001 CZI DDDDDDDDD SSSSSSSSS        CALLD   D,S/#rel9   {WC,WZ}
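
For anyone who wants to poke at the bit layout, here is a minimal Python sketch of the encoding pattern above. The field widths are read straight off the pattern (4 + 7 + 3 + 9 + 9 = 32 bits). The helper names `pack`, `unpack` and `rel9` are made up for illustration and are not part of PNut, and the signed range of -256..255 for the 9-bit field is an assumption based on ordinary two's-complement encoding.

```python
# Field widths read off "EEEE 1011001 CZI DDDDDDDDD SSSSSSSSS".
CALLD_OPCODE = 0b1011001  # 7-bit opcode from the CALLD D,S/#rel9 line


def rel9(offset):
    """Encode a signed offset into 9 bits (assumed two's complement)."""
    if not -256 <= offset <= 255:
        raise ValueError("offset does not fit in 9 bits without AUGS")
    return offset & 0x1FF


def pack(cond, opcode, c, z, i, d, s):
    """Pack the fields into one 32-bit instruction word (hypothetical helper)."""
    if not (0 <= cond < 16 and 0 <= opcode < 128):
        raise ValueError("cond or opcode out of range")
    if not (0 <= d < 512 and 0 <= s < 512):
        raise ValueError("D or S out of range")
    flags = ((c & 1) << 2) | ((z & 1) << 1) | (i & 1)
    return (cond << 28) | (opcode << 21) | (flags << 18) | (d << 9) | s


def unpack(word):
    """Split a 32-bit instruction word back into its fields."""
    return {
        "cond": (word >> 28) & 0xF,
        "opcode": (word >> 21) & 0x7F,
        "c": (word >> 20) & 1,
        "z": (word >> 19) & 1,
        "i": (word >> 18) & 1,
        "d": (word >> 9) & 0x1FF,
        "s": word & 0x1FF,
    }


if __name__ == "__main__":
    # I = 1 selects the immediate (#rel9) form; the register number is arbitrary.
    word = pack(0b1111, CALLD_OPCODE, 0, 0, 1, 0x1F0, rel9(-2))
    print(hex(word), unpack(word))
```

The point is only that S has 9 bits to work with, which is why a cog-to-hub distance cannot fit without augmentation.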

ozpropdev · 2017-04-06 04:45

Actually the ##rel9 issue is not isolated to just CALLD.

EEEE 1011001 CZI DDDDDDDDD SSSSSSSSS        CALLD   D,S/#rel9   {WC,WZ}

EEEE 1011010 00I DDDDDDDDD SSSSSSSSS        IJZ     D,S/#rel9
EEEE 1011010 01I DDDDDDDDD SSSSSSSSS        IJNZ    D,S/#rel9
EEEE 1011010 10I DDDDDDDDD SSSSSSSSS        IJS     D,S/#rel9
EEEE 1011010 11I DDDDDDDDD SSSSSSSSS        IJNS    D,S/#rel9

EEEE 1011011 00I DDDDDDDDD SSSSSSSSS        DJZ     D,S/#rel9
EEEE 1011011 01I DDDDDDDDD SSSSSSSSS        DJNZ    D,S/#rel9
EEEE 1011011 10I DDDDDDDDD SSSSSSSSS        DJS     D,S/#rel9
EEEE 1011011 11I DDDDDDDDD SSSSSSSSS        DJNS    D,S/#rel9

EEEE 1011100 00I DDDDDDDDD SSSSSSSSS        TJZ     D,S/#rel9
EEEE 1011100 01I DDDDDDDDD SSSSSSSSS        TJNZ    D,S/#rel9
EEEE 1011100 10I DDDDDDDDD SSSSSSSSS        TJS     D,S/#rel9
EEEE 1011100 11I DDDDDDDDD SSSSSSSSS        TJNS    D,S/#rel9

EEEE 1011101 0LI DDDDDDDDD SSSSSSSSS        JP      D/#,S/#rel9
EEEE 1011101 1LI DDDDDDDDD SSSSSSSSS        JNP     D/#,S/#rel9

EEEE 1011110 0LI DDDDDDDDD SSSSSSSSS        CALLPA  D/#,S/#rel9
EEEE 1011110 1LI DDDDDDDDD SSSSSSSSS        CALLPB  D/#,S/#rel9

EEEE 1011111 01I 000000000 SSSSSSSSS        JINT    S/#rel9
EEEE 1011111 01I 000000001 SSSSSSSSS        JCT1    S/#rel9
EEEE 1011111 01I 000000010 SSSSSSSSS        JCT2    S/#rel9
EEEE 1011111 01I 000000011 SSSSSSSSS        JCT3    S/#rel9
EEEE 1011111 01I 000000100 SSSSSSSSS        JSE1    S/#rel9
EEEE 1011111 01I 000000101 SSSSSSSSS        JSE2    S/#rel9
EEEE 1011111 01I 000000110 SSSSSSSSS        JSE3    S/#rel9
EEEE 1011111 01I 000000111 SSSSSSSSS        JSE4    S/#rel9
EEEE 1011111 01I 000001000 SSSSSSSSS        JPAT    S/#rel9
EEEE 1011111 01I 000001001 SSSSSSSSS        JFBW    S/#rel9
EEEE 1011111 01I 000001010 SSSSSSSSS        JXMT    S/#rel9
EEEE 1011111 01I 000001011 SSSSSSSSS        JXFI    S/#rel9
EEEE 1011111 01I 000001100 SSSSSSSSS        JXRO    S/#rel9
EEEE 1011111 01I 000001101 SSSSSSSSS        JXRL    S/#rel9
EEEE 1011111 01I 000001110 SSSSSSSSS        JATN    S/#rel9
EEEE 1011111 01I 000001111 SSSSSSSSS        JQMT    S/#rel9

EEEE 1011111 01I 000010000 SSSSSSSSS        JNINT   S/#rel9
EEEE 1011111 01I 000010001 SSSSSSSSS        JNCT1   S/#rel9
EEEE 1011111 01I 000010010 SSSSSSSSS        JNCT2   S/#rel9
EEEE 1011111 01I 000010011 SSSSSSSSS        JNCT3   S/#rel9
EEEE 1011111 01I 000010100 SSSSSSSSS        JNSE1   S/#rel9
EEEE 1011111 01I 000010101 SSSSSSSSS        JNSE2   S/#rel9
EEEE 1011111 01I 000010110 SSSSSSSSS        JNSE3   S/#rel9
EEEE 1011111 01I 000010111 SSSSSSSSS        JNSE4   S/#rel9
EEEE 1011111 01I 000011000 SSSSSSSSS        JNPAT   S/#rel9
EEEE 1011111 01I 000011001 SSSSSSSSS        JNFBW   S/#rel9
EEEE 1011111 01I 000011010 SSSSSSSSS        JNXMT   S/#rel9
EEEE 1011111 01I 000011011 SSSSSSSSS        JNXFI   S/#rel9
EEEE 1011111 01I 000011100 SSSSSSSSS        JNXRO   S/#rel9
EEEE 1011111 01I 000011101 SSSSSSSSS        JNXRL   S/#rel9
EEEE 1011111 01I 000011110 SSSSSSSSS        JNATN   S/#rel9
EEEE 1011111 01I 000011111 SSSSSSSSS        JNQMT   S/#rel9

jmg · 2017-04-06 05:22

ozpropdev wrote: »

Actually the ##rel9 issue is not isolated to just CALLD.

You would expect most of the looping and Jump-Bit variants to be short, relative jumps, as they are on other MCUs.
Isn't it just the CALL variants where the tools usually offer a far/long version?

ozpropdev · 2017-04-06 06:36

A simple workaround is to substitute this

	callpa	#%1101,##hub	'pnut error ****

with this

	augs	#@hub-@here
here	callpa	#%1101,#(@hub-@here) & $1ff

Seems to work OK.
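
The reason the same expression appears twice in the workaround is that AUGS carries the upper 23 bits of the 32-bit value while the instruction's own S field carries the lower 9. A rough Python sketch of that split follows. The function names are hypothetical, and it only models the bit arithmetic, not how PNut evaluates `@hub-@here`.

```python
MASK32 = 0xFFFFFFFF


def split_for_augs(value):
    """Split a 32-bit value into (upper 23 bits for AUGS, lower 9 bits for S)."""
    value &= MASK32              # wrap negative offsets to 32-bit two's complement
    return value >> 9, value & 0x1FF


def join_from_augs(aug23, low9):
    """Rebuild the 32-bit value the instruction would see."""
    return ((aug23 << 9) | low9) & MASK32


if __name__ == "__main__":
    for offset in (0x12345, -4):
        aug23, low9 = split_for_augs(offset)
        print(hex(aug23), hex(low9), hex(join_from_augs(aug23, low9)))
```

The `& $1ff` in the workaround corresponds to the `low9` half here.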

ozpropdev · 2017-04-06 14:48

Chip
Any chance, when you compile V18 for the DE2-115, of dropping a cog(s) to get some smartpins back?
This also brings back DAC3(red).

cgracey · 2017-04-06 15:54

CALLD is more complicated than DJNZ because there are two different instructions for it. One is like DJNZ and the other has a 20-bit immediate field, plus absolute/relative bit.

Another thing about all the DJNZ-like instructions with immediate relative addressing: those addresses step 4x in hub exec, so they can only branch to commonly-aligned addresses. The same rule applies when using AUGS with them.
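
To make the alignment rule concrete, here is a small Python sketch of what "step 4x in hub exec" implies for which targets are reachable. It assumes the offset is taken relative to the address following the branch, which is an assumption for illustration and is not stated in this thread. The function name and the `bits` parameter are hypothetical.

```python
def hub_rel_offset(branch_addr, target_addr, bits=9):
    """Return the relative offset field for a hub-exec branch, or raise.

    Assumption: the base is the byte address of the next instruction
    (branch_addr + 4), and each offset step moves 4 bytes.
    """
    base = branch_addr + 4
    delta = target_addr - base
    if delta % 4 != 0:
        raise ValueError("target is not aligned with the branch, unreachable")
    offset = delta // 4
    lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lowest <= offset <= highest:
        raise ValueError("target is out of range for this offset width")
    return offset


if __name__ == "__main__":
    print(hub_rel_offset(0x1000, 0x1010))   # aligned target, small offset
    try:
        hub_rel_offset(0x1000, 0x1002)      # misaligned relative to the branch
    except ValueError as err:
        print("rejected:", err)
```

Passing a larger `bits` value models the AUGS case: the range grows, but the alignment restriction stays.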

cgracey · 2017-04-06 15:58

ozpropdev wrote: »

Chip
Any chance, when you compile V18 for the DE2-115, of dropping a cog(s) to get some smartpins back?
This also brings back DAC3(red).

The next step down from 8 cogs is 4 cogs.

I'm not sure what you mean about DAC3. Is pin 3 currently not a smart pin? I'm only on my phone at the moment.

evanh · 2017-04-06 22:32

cgracey wrote: »

The next step down from 8 cogs is 4 cogs.

It'll be interesting to see how many Smartpins that'll produce.

cgracey · 2017-04-06 22:37

evanh wrote: »

cgracey wrote: »

The next step down from 8 cogs is 4 cogs.

It'll be interesting to see how many Smartpins that'll produce.

As soon as compilation finishes for the BeMicro-A9 board, we'll find out.

ozpropdev · 2017-04-06 22:45

cgracey wrote: »

I'm not sure what you mean about DAC3. Is pin 3 currently not a smart pin? I'm only on my phone at the moment.

That's correct, Pin 3 is no longer a smartpin.
The DE2-115 has only 5 smartpins: 63, 62, 2..0.

cgracey · 2017-04-07 01:35

I changed the DE2-115 compile to 4 cogs and 38 smart pins (63, 62, 35..0). It's at 97% capacity now. I need to make these AUGS+RDxxxx/WRxxxx changes and then I'll do a final compile on everything.

ozpropdev · 2017-04-07 01:57

cgracey wrote: »

I changed the DE2-115 compile to 4 cogs and 38 smart pins (63, 62, 35..0). It's at 97% capacity now. I need to make these AUGS+RDxxxx/WRxxxx changes and then I'll do a final compile on everything.

Wow, that's quite a jump in smartpins.
Hmm, so an 8-cog "A9" would have all 64 smartpins then?
Could be a useful additional build for testing.

evanh · 2017-04-07 04:12

It's not as many as I was hoping for. That means eight Smartpins are as big as one Cog. They're bulkier than I imagined.

cgracey · 2017-04-07 04:24

evanh wrote: »

It's not as many as I was hoping for. That means eight Smartpins are as big as one Cog. They're bulkier than I imagined.

Logic-wise, yes, but they don't have RAMs.

msrobots · 2017-04-07 04:26

evanh wrote: »

It's not as many as I was hoping for. That means eight Smartpins are as big as one Cog. They're bulkier than I imagined.

I also found that interesting, but considering what they do, it makes some sense. Just naming them Smartpins took away the feeling that they are independent sub-processors running in parallel with the 16 COGS.

Interesting times are coming.

Mike

Cluso99 · 2017-04-07 06:19

1 Cog (w/o RAM) = 8 Smart Pins

I wonder what a really fast reduced instruction set P1 Cog with 512B (128 registers) would be in size and what it could perform instead of a Smart Pin ???

IOW a tailored software cog specifically for each I/O.

cgracey · 2017-04-07 06:46

Cluso99 wrote: »

1 Cog (w/o RAM) = 8 Smart Pins

I wonder what a really fast reduced instruction set P1 Cog with 512B (128 registers) would be in size and what it could perform instead of a Smart Pin ???

IOW a tailored software cog specifically for each I/O.

What's unique about those smart pins is that they can input, output, and respond on a clock-by-clock basis.

Cluso99 · 2017-04-07 08:13

cgracey wrote: »

Cluso99 wrote: »

1 Cog (w/o RAM) = 8 Smart Pins

I wonder what a really fast reduced instruction set P1 Cog with 512B (128 registers) would be in size and what it could perform instead of a Smart Pin ???

IOW a tailored software cog specifically for each I/O.

What's unique about those smart pins is that they can input, output, and respond on a clock-by-clock basis.

A specialised cog might be able to run at 2x the speed of the P2 cog.
Add a serialiser/deserialiser plus the counters with extra modes.
Add some basic software instructions.
Might be smaller than current, and perhaps more flexible too.

Perhaps 1 per P2 cog and connect as OR onto the existing I/O ring.

Anyway, it's too late for this. But it's nice to think about just the same.

kwinn · 2017-04-07 13:17

Cluso99 wrote: »

cgracey wrote: »

Cluso99 wrote: »

1 Cog (w/o RAM) = 8 Smart Pins

I wonder what a really fast reduced instruction set P1 Cog with 512B (128 registers) would be in size and what it could perform instead of a Smart Pin ???

IOW a tailored software cog specifically for each I/O.

What's unique about those smart pins is that they can input, output, and respond on a clock-by-clock basis.

A specialised cog might be able to run at 2x the speed of the P2 cog.
Add a serialiser/deserialiser plus the counters with extra modes.
Add some basic software instructions.
Might be smaller than current, and perhaps more flexible too.

Perhaps 1 per P2 cog and connect as OR onto the existing I/O ring.

Anyway, it's too late for this. But it's nice to think about just the same.

Yes, nice idea, and worth thinking about, but let's save it until the P2 is done.

evanh · 2017-04-07 13:44

No! ... no ... NO! ... Won't be faster nor smaller. "Add a serialiser/deserialiser plus the counters with extra modes." - That's basically a smartpin right there.

evanh · 2017-04-07 13:52

I may have been surprised at their size but I'm happy with how they are and the flexibility they offer. Smartpins was one area that Chip extended when the extra die space came up. And, as he mentioned above, once you add in the die space needs of CogRAM/LUTRAM (4kB of dual ported SRAM per Cog) - which is not accounted for on the FPGA image compiles - it'll rebalance the proportions better.

evanh · 2017-04-07 14:58

Oh, I just found the old posting from Chip about a test run at Treehouse - http://forums.parallax.com/discussion/comment/1344242/#Comment_1344242

SRAM is bulky! Dual ported is really bad for some reason. It put CogRAM alone at nearly 70% of the Cog's 424322 um² die area. That was back when LUTRAM was still single ported. LUTRAM is now the same size as CogRAM, so in thousands of um²: 292 x 2 = 584, plus 45ish for everything else, gives about 630000 um² for a current Cog.

About 45000 um² is the logic and flops, which is a mere 7.1% of the Cog's die space.

So, I'm guessing around 6000 um², or 1% of a Cog, will fit a Smartpin.
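
The arithmetic behind that guess can be checked with a few lines of Python. The figures are the rounded ones from this post (292000 um² per RAM block, 45000 um² of logic), and the divide-by-eight step relies on the "1 Cog (w/o RAM) = 8 Smart Pins" observation from the FPGA compile earlier in the thread, so treat the result as a rough estimate rather than a measured area.

```python
COGRAM_UM2 = 292_000         # dual-ported CogRAM, rounded figure from the post
LUTRAM_UM2 = 292_000         # LUTRAM, now assumed equal in size to CogRAM
LOGIC_UM2 = 45_000           # logic and flops, "45ish" thousand
SMARTPINS_PER_COG_LOGIC = 8  # from the FPGA compile: 1 cog of logic = 8 smart pins


def estimate():
    """Return cog area, logic share, smart pin area and smart pin share."""
    cog = COGRAM_UM2 + LUTRAM_UM2 + LOGIC_UM2
    logic_share = LOGIC_UM2 / cog
    smartpin = LOGIC_UM2 / SMARTPINS_PER_COG_LOGIC
    smartpin_share = smartpin / cog
    return cog, logic_share, smartpin, smartpin_share


if __name__ == "__main__":
    cog, logic_share, smartpin, smartpin_share = estimate()
    print(f"cog: {cog} um2, logic: {logic_share:.1%}")
    print(f"smart pin: {smartpin:.0f} um2, {smartpin_share:.1%} of a cog")
```

That lands a little under 6000 um² and a little under 1% of a Cog, which is where the rounded guess comes from.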

kwinn · 2017-04-07 20:02

evanh wrote: »

No! ... no ... NO! ... Won't be faster nor smaller. "Add a serialiser/deserialiser plus the counters with extra modes." - That's basically a smartpin right there.

Getting ideas and approaches from a group of knowledgeable people is always a good idea. Technology changes rapidly so it pays to take a look around when starting any new project. Keeps it from being limited by the preconceptions and limited knowledge of what is currently possible.

evanh · 2017-04-08 04:19

Is that meant to be a thank you? Hard to tell.

potatohead · 2017-04-08 04:52

You should take it.

kwinn · 2017-04-08 13:08

evanh wrote: »

Is that meant to be a thank you? Hard to tell.

In part a thank you, but also a caution. With rapid changes in technology an impractical idea today may be a great one tomorrow.
