Prop2 FPGA files!!! - Updated 2 June 2018 - Final Version 32i

Seairth · 2016-06-26 11:06

Cluso99 wrote: »

Just finished verifying the internal stack. It is a depth of 8.
As you push more onto the stack, the bottom one will drop off when full.
As you pop off the stack, the bottom value moves up one and the new bottom value is the same as it was previously, i.e. once you empty the stack, the last legitimate value popped will continue to be delivered on underflow.

I needed to verify this behaviour as I am building some hubexec routines where I want to use the internal stack but I do not know what depth is available for use. So I can now save the stack (to hub???), and restore it when I am finished.

That behavior is correct. As for saving the stack, you don't need to know the depth. Just pop all 8, then push all 8 when restoring it.
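A rough Python model of the behaviour described above may make the save/restore idea easier to follow. It is only a sketch: the class and function names are made up for illustration, and it models nothing beyond what this thread reports (depth of 8, bottom entry dropped on overflow, bottom entry duplicated on pop).

```python
class HardwareStack:
    """Toy model of the 8-level stack behaviour reported in this thread."""

    def __init__(self, depth=8):
        self.slots = [0] * depth  # slots[0] is the top of the stack

    def push(self, value):
        # The new value goes on top; the bottom entry drops off.
        self.slots = [value] + self.slots[:-1]

    def pop(self):
        # The top value is returned; everything moves up one and the
        # bottom entry is duplicated rather than cleared.
        top = self.slots[0]
        self.slots = self.slots[1:] + [self.slots[-1]]
        return top


def save_stack(stack, depth=8):
    """Pop every level; the result is ordered top first."""
    return [stack.pop() for _ in range(depth)]


def restore_stack(stack, saved):
    """Push the saved values back, bottom first, so the top ends up on top."""
    for value in reversed(saved):
        stack.push(value)


stack = HardwareStack()
for value in range(1, 9):
    stack.push(value)
saved = save_stack(stack)      # no need to know how many levels were in use
restore_stack(stack, saved)    # stack contents are back where they started
```

Because every pop duplicates the bottom entry, popping all 8 levels never fails, which is why the depth in use does not need to be known.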

ozpropdev · 2016-06-26 14:10

Cluso99 wrote: »

Just finished verifying the internal stack. It is a depth of 8.
As you push more onto the stack, the bottom one will drop off when full.
As you pop off the stack, the bottom value moves up one and the new bottom value is the same as it was previously, i.e. once you empty the stack, the last legitimate value popped will continue to be delivered on underflow.

I needed to verify this behaviour as I am building some hubexec routines where I want to use the internal stack but I do not know what depth is available for use. So I can now save the stack (to hub???), and restore it when I am finished.

Also be aware that the hardware stack is 22 bits wide not 32.
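To illustrate why the width matters when using the stack for general data, here is a small Python sketch. It assumes that only the low 22 bits of a pushed value survive; the thread gives the width but not which bits are kept, so that part is a guess, and the function name is hypothetical.

```python
STACK_WIDTH_BITS = 22  # width reported in this thread


def through_stack(value, width_bits=STACK_WIDTH_BITS):
    """Value that would come back after a push and pop.

    ASSUMPTION: the low `width_bits` bits are the ones stored.
    """
    return value & ((1 << width_bits) - 1)


print(hex(through_stack(0x89ABCDEF)))  # 0x2bcdef, upper 10 bits lost
print(hex(through_stack(0x0003FFFF)))  # 0x3ffff, fits so it is unchanged
```

Under that assumption, addresses and other small values pass through intact, but a full 32-bit long does not.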

Seairth · 2016-06-26 14:44

You know, storing the stack to hub memory will be kinda slow. Too bad there isn't a pair of instructions that would pop/push all of them at once to/from the hub. They would take ~9-24 clock cycles. This would make context switches much faster than the current approach (I think).

Dave Hein · 2016-06-26 15:08

Transferring the stack to cog RAM would take just 8 cycles, and then storing it to hub RAM ~9-24 cycles. It doesn't seem worth the effort to create special instructions to do the same thing. I suppose the special instructions could be faster by overlapping the stack accesses with the hub accesses. If we're voting, I'd vote not to do it so the P2 would be available sooner. I don't think I can wait another decade.

Seairth · 2016-06-26 15:24

Dave Hein wrote: »

Transferring the stack to cog RAM would take just 8 cycles, and then storing it to hub RAM ~9-24 cycles. It doesn't seem worth the effort to create special instructions to do the same thing. I suppose the special instructions could be faster by overlapping the stack accesses with the hub accesses. If we're voting, I'd vote not to do it so the P2 would be available sooner. I don't think I can wait another decade.

Wouldn't that be 16 cycles, not 8? And you would also require 8 unused cog registers. And you would need to set up the fast write. But, you are right. It certainly shouldn't be added if it affects delivery of the chip. This is purely a minor optimization, and therefore not critical to the design.
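The arithmetic behind the 16-cycle figure, worked through in Python. It assumes 2 clock cycles per instruction (the premise of the question above) and takes the "~9-24" hub figure at face value; any cost for setting up the fast write is left out.

```python
STACK_DEPTH = 8
CYCLES_PER_INSTRUCTION = 2   # assumption behind the "16 cycles" figure
HUB_WRITE_CYCLES = (9, 24)   # the "~9-24" range quoted in this thread

# One pop instruction per stack level.
pop_cycles = STACK_DEPTH * CYCLES_PER_INSTRUCTION

# Add the best-case and worst-case hub transfer times.
total_cycles = tuple(pop_cycles + hub for hub in HUB_WRITE_CYCLES)

print(pop_cycles)    # 16
print(total_cycles)  # (25, 40)
```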

Rayman · 2016-06-26 15:29

As ozpropdev just said, stack is only 22 bits. I don't think it's very useful in the general case. I'm just going to leave it alone and let it do its call/ret thing.

Dave Hein · 2016-06-26 15:38

Sorry, I forgot that instructions take 2 cycles instead of 1. It's been a while since I paid much attention to Parallax. As time marches on there are more new and exciting chips and boards out there, and Parallax is kind of getting lost in the crowd. They really need to get the P2 out ASAP to remain relevant in the market.

Rayman · 2016-06-26 16:48

For smartpin instructions, think I'd like:

SetPinM
SetPinX
SetPin
GetPin
AckPin

Peter Jakacki · 2016-06-26 16:58

Rayman wrote: »

For smartpin instructions, think I'd like:

SetPinM
SetPinX
SetPinY
GetPin
AckPin

Have you tried writing code and keeping track of what you are doing? Which is it that we write data to again? Was that X or was that Y?

Having written some smartpin code in assembler and in high level I must say that to read and write data using RDPIN and WRPIN makes it a lot easier to read and remember. As for setting the mode then SETPIN also makes sense but so as not to confuse what we do with the extra parameter register I would avoid using the word set and simply say WXPIN. Also all these mnemonics end cleanly in "PIN" and are clearly identified as the PIN family along with their cousin AKPIN. I would say that WXPIN could be called something else but it is just as easy to leave it too.

Rayman · 2016-06-26 17:02

Looks to me like M, X, and Y will usually only get set once, so they're not so important.

The thing you'll do a lot of is reading and acking the pin.

I'd be fine with READPIN instead of GETPIN.
Not a fan of contracting syntax when not necessary. Think that makes it less clear, and harder to remember.

Rayman · 2016-06-26 17:07

Also, the way I have it, they all sorta rhyme with "SmartPin". I think that will also make it easier to remember...

Peter Jakacki · 2016-06-26 17:10

contracting is normal for mnemonics, that's why they're called mnemonics, you know like rdbyte and wrfast etc. I don't know where you get the idea from that you would only be doing a lot of reading from a pin as I have found that that is not the case. What do you do when you are transmitting serial data?

Rayman · 2016-06-26 17:15

You're right Y, you do need that a lot too...
Maybe SetPinY should just be SetPin, or WritePin (if that's not too long) or SendPin or something else without the Y.

There is rdbyte and wrfast already, like you say, so maybe that's OK. I think those things are harder to remember though.

Cluso99 · 2016-06-26 19:05

When you are coding or looking at your code later, it's nice to be able to clearly see where you are reading and writing the data register. I think we are agreed on WRPIN and RDPIN (the Y and Z registers).

That just leaves us with the M and X registers.

Since you are mostly only ever going to set the M once, and that is the smart pin MODE, then why not just call it MODEPIN?

In some modes the X can be set once, and in other modes it is set often. How about we call it either SETPIN or SETXPIN? I very much dislike WXPIN as it is easily confused with WRPIN, which is the data.

I vote for

M:   MODEPIN
X:   SETPIN (or SETXPIN)
Y:   WRPIN
Z:   RDPIN

and (postedit)

     ACKPIN

Cluso99 · 2016-06-26 19:09

Stack saving...

I don't see a need for special saving instructions. My use is not very time specific as it will usually result in waiting for user input. If my code only requires a depth of 4 then I only need to save/restore 4 levels.

My code is for doing an extended monitor/debugger like I did before P2Hot. The routines will also be useful for standard input/output plus serial. Things like data, hex, decimal, binary, dump, string, etc.

And yes, the width is only 22 bits. (Not verified - will check tomorrow)

Cluso99 · 2016-06-26 19:30

<ducks for cover>

While coding, I have been thinking about the differences between the P1 instruction set and the P2 instruction set. They are really quite different aside from the basic instructions such as ADD, AND, ROR, etc.

This made me ponder the question...

What if the first 8 cogs were P1B compatible?

The P1 section would be faster than the P1, with a much smaller die than the P2, 64 I/O, more hub, security, and in the wash-up maybe not much more power than the P1, and it would give an immediate upgrade path.

Remember, P1V verilog is already done.

Or dare I even suggest, configurable P1 or P2 ALU ???

At least, it could be considered for a pin compatible variant.

Dave Hein · 2016-06-26 20:27

[Comment deleted]

Rayman · 2016-06-26 21:25

After getting to know this P2, I don't really want to go back to P1...
Except maybe for cases where small size is important...

Cluso99 · 2016-06-26 21:39

Rayman wrote: »

After getting to know this P2, I don't really want to go back to P1...
Except maybe for cases where small size is important...

I am really missing the "glue" objects in the P1. I have to get them running before I can get to the real P2 stuff.

So I was just musing that a mix of P1 & P2 cogs would be quite nifty. We could have the old combined with the new. Of course we would want a version with just the new bits. Others would likely be happy with just a P1 with more hub and faster. Later maybe we could get a mix of P2 and P2Hot cogs.

But if the first cab out of the ranks were a P1 & P2 combo, then if the P2 section had problems, at least the P1 would most likely work.

User Name · 2016-06-26 22:33

Cluso99 wrote: »

Others would likely be happy with just a P1 with more hub and faster.

Or just faster.

Tiny and with four fast P1 cogs - that's where I live.

jmg · 2016-06-27 01:41

Cluso99 wrote: »

<ducks for cover>
What if the first 8 cogs were P1B compatible?

It is easy to write, just a mere few words, but much harder to actually do.

In reality, this would add more complexity to testing, increasing the chance of falling between two stools and having neither work properly.
Documentation gets much more difficult, as do toolchains, as now you have to manage two compilers at the same time.... well, probably 3 as the P1B is bound to end up not-quite-P1.

There seems little risk of P2 16 COGS not fitting, and that is the only scenario where any consideration of a half-baked hybrid would be needed.

Compared to the issues you create, the issue you sought to solve is quite minor.
Most new users will not be coding in PASM anyway.

ozpropdev · 2016-06-27 01:42

Cluso99 wrote: »

And yes, the width is only 22 bits. (Not verified - will check tomorrow)

See here for more discussion on the hardware stack

jmg · 2016-06-27 01:43

User Name wrote: »

Or just faster.

Tiny and with four fast P1 cogs - that's where I live.

Remember, Chip has the Verilog configured so four fast P2 COGS is quite practical.

Cluso99 · 2016-06-27 01:52

Does anyone recall what the ROLBYTE and MOVBYTS instructions do?

I am trying to reverse the bytes in a long. A while ago we had ESWAP8 but this is gone now.

Seairth · 2016-06-27 02:45

Cluso99 wrote: »

Does anyone recall what the ROLBYTE and MOVBYTS instructions do?

I am trying to reverse the bytes in a long. A while ago we had ESWAP8 but this is gone now.

I think you want:

MOVBYTS d, #%%0123

(Edit: dangit! I thought MOV while I typed ROL!)

ozpropdev · 2016-06-27 03:25

Use MOVBYTS for a replacement for ESWAP8

			movbyts	myreg,#%%0123
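For anyone wondering why %%0123 reverses the bytes, here is a small Python model. It assumes each 2-bit field of the selector picks the source byte for the corresponding result byte (field 0 for byte 0, and so on); that matches the usage shown above but has not been checked against this FPGA image. The function name just mirrors the mnemonic.

```python
def movbyts(d, selector):
    """Rearrange the four bytes of the 32-bit value d.

    ASSUMPTION: bits [2n+1:2n] of `selector` give the index of the
    source byte that ends up in result byte n.
    """
    result = 0
    for position in range(4):
        source = (selector >> (2 * position)) & 0b11
        byte = (d >> (8 * source)) & 0xFF
        result |= byte << (8 * position)
    return result


REVERSE = 0b00_01_10_11   # %%0123 (base-4 digits 0,1,2,3)
IDENTITY = 0b11_10_01_00  # %%3210 leaves every byte where it is

print(hex(movbyts(0x01234567, REVERSE)))   # 0x67452301
print(hex(movbyts(0x01234567, IDENTITY)))  # 0x1234567
```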

User Name · 2016-06-27 03:49

jmg wrote: »

User Name wrote: »

Or just faster.

Tiny and with four fast P1 cogs - that's where I live.

Remember, Chip has the Verilog configured so four fast P2 COGS is quite practical.

Good point. A reduced P2 has an excellent chance of being made. Seems likely the package & pin count would be reduced too.

evanh · 2016-06-27 09:29

User Name wrote: »

Good point. A reduced P2 has an excellent chance of being made. Seems likely the package & pin count would be reduced too.

A big yes to all of the above - http://forums.parallax.com/discussion/164364/prop2-family/p1

ozpropdev · 2016-06-27 10:18

Re: ROLBYTE instruction

This instruction shifts the D register left by 8 bits and inserts the selected byte of the S register into D[7:0].

Edit: Treat the following example as a sequence of instructions.

		mov	myreg,##$01234567
		mov	myreg2,##$89abcdef

		rolbyte	myreg,myreg2,#0		'results in myreg = $234567ef
		rolbyte	myreg,myreg2,#1		'results in myreg = $4567efcd
		rolbyte	myreg,myreg2,#2		'results in myreg = $67efcdab
		rolbyte	myreg,myreg2,#3		'results in myreg = $efcdab89

Here are the Prop1 equivalents


'equivalent P1 code for P2 "rolbyte myreg,myreg2,#0"

		shl	myreg,#8
		mov	temp,myreg2
		and	temp,#$ff
		or	myreg,temp

'equivalent P1 code for P2 "rolbyte myreg,myreg2,#1"

		shl	myreg,#8
		mov	temp,myreg2
		shr	temp,#8
		and	temp,#$ff
		or	myreg,temp

'equivalent P1 code for P2 "rolbyte myreg,myreg2,#2"

		shl	myreg,#8
		mov	temp,myreg2
		shr	temp,#16
		and	temp,#$ff
		or	myreg,temp

'equivalent P1 code for P2 "rolbyte myreg,myreg2,#3"

		shl	myreg,#8
		mov	temp,myreg2
		shr	temp,#24
		and	temp,#$ff
		or	myreg,temp

Does this instruction and ROLWORD,ROLNIB really need to be SHLBYTE, SHLWORD & SHLNIB?

Seairth · 2016-06-27 11:16

ozpropdev wrote: »

		mov	myreg,##$01234567
		mov	myreg2,##$89abcdef

		rolbyte	myreg,myreg2,#0		'results in myreg = $234567ef
		rolbyte	myreg,myreg2,#1		'results in myreg = $4567efcd
		rolbyte	myreg,myreg2,#2		'results in myreg = $67efcdab
		rolbyte	myreg,myreg2,#3		'results in myreg = $efcdab89

Based on your description, should that be:

		rolbyte	myreg,myreg2,#0		'results in myreg = $234567ef
		rolbyte	myreg,myreg2,#1		'results in myreg = $234567cd
		rolbyte	myreg,myreg2,#2		'results in myreg = $234567ab
		rolbyte	myreg,myreg2,#3		'results in myreg = $23456789
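Both sets of results can be reproduced with a short Python model of the description given earlier (shift D left by 8 bits, insert byte N of S into D[7:0]). The function name mirrors the mnemonic and the model assumes nothing beyond that description. The difference between the two listings is only whether the four instructions run one after another on the same register or each start from the original value.

```python
def rolbyte(d, s, n):
    """Shift d left by 8 bits (32-bit wrap) and insert byte n of s at the bottom."""
    return ((d << 8) & 0xFFFFFFFF) | ((s >> (8 * n)) & 0xFF)


MYREG = 0x01234567
MYREG2 = 0x89ABCDEF

# Run as a sequence: each result feeds the next instruction.
d = MYREG
sequence = []
for n in range(4):
    d = rolbyte(d, MYREG2, n)
    sequence.append(hex(d))
print(sequence)     # ['0x234567ef', '0x4567efcd', '0x67efcdab', '0xefcdab89']

# Run independently: every instruction starts from the original myreg.
independent = [hex(rolbyte(MYREG, MYREG2, n)) for n in range(4)]
print(independent)  # ['0x234567ef', '0x234567cd', '0x234567ab', '0x23456789']
```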
