Just a thought. Can all this discussion about enhancements to the Spin language wait until after the chip goes into synthesis? Any time spent on making Spin look more like C++ is just delaying the date when we can all get our hands on a P2 chip. Our efforts may be better spent on testing the latest and "final" FPGA image.
An implied library which is an object, but does not need the object.method() syntax, just the method() syntax. If any method uses an unknown keyword, the implied object's methods get checked for a name match. Any object that uses any of those methods will have that implied object included, which is just a 2-long cost. The top-level file can specify this implied object. That way, Spin can get extended without any tool changes.
This is almost exactly how fastspin implements most of the standard Spin functions like stringlen, waitcnt, lockclr, etc; there is a "system" object that's automatically included by the compiler and has definitions for all of those. When it sees a function call without an explicit object it looks first in the current object's methods, then in the system object. The optimizer removes all the unused methods, so most of the system functions don't actually have to be placed in the final output (and many of them are small and get inlined anyway).
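The two-level lookup described here can be sketched in a few lines of Python. This is a hedged illustration of the resolution order only (current object first, then the implied/system object); the name `resolve_call` is hypothetical, and fastspin's actual implementation is of course its own code:

```python
# Hypothetical model of the lookup order: check the current object's
# methods first, then fall back to the implied "system" object.
def resolve_call(name, current_obj_methods, system_methods):
    if name in current_obj_methods:
        return current_obj_methods[name]   # user method wins
    if name in system_methods:
        return system_methods[name]        # implied-object fallback
    raise NameError("unknown method: " + name)
```

A user-defined method shadows a same-named system method, which is what makes the scheme safe to extend without tool changes.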
My understanding is that these things are happening in parallel. One is not dependent on the other. The silicon (PASM) is done and is being synthesized while Chip-and-the-forum work on building the SPIN language and compiler/interpreter using the "locked in" PASM opcodes.
Chip also mentioned some layout changes that were needed still, I assume that is with the frame that will contain the synthesized verilog stuff.
Chip isn't going to send the stuff to synthesis until he is confident in the vetting of the FPGA version, and part of that for him is getting Spin2 up and running.
That's right.
We're going to have to spend a lot of money to get through synthesis and fabrication. When we pull that trigger, we want to know, more than we do now, that things are in proper order. Having Spin working will get the confidence level up to where it needs to be.
This is the path Chip wants to take, why do you keep saying to stop doing what Chip wants?
The reason I'm asking Chip to stop doing what he wants is a selfish one on my part. I want to see the P2 as soon as possible and I'm not interested in seeing an enhanced Spin running under a bytecode interpreter -- at least not until the P2 development goes into synthesis and fab. I think then Chip will have lots of time on his hands to work on enhancing Spin. In the meantime, I think Chip can test how well the P2 works with the Spin bytecode interpreter by just porting the P1 interpreter and optimizing it for the P2. I really don't understand why Chip is going down the New Spin tangent at this time.
I seriously doubt that synthesis is happening at this time. I think there is a lot of work to verify the latest FPGA before that can happen. Also, Chip wants to verify that the Spin bytecode interpreter works efficiently with the latest design. As I said many times, the bytecode interpreter can be tested using a P1-like Spin interpreter. All the work on enhancing the Spin syntax and functionality is happening right in the middle of the P2 design critical path, and is pushing out the P2 delivery date day-by-day for every day that New Spin is being pursued. Please correct me if I'm wrong.
My understanding is that the Verilog for the P2 is not fixed yet.
Chip just stated in the FPGA thread that the Verilog was DONE. Heater, didn't you get a piece of the cake from the celebration? Chip did add that the booter code may get some minor changes.
Yes, so we should all be testing out the latest FPGA image with all the stuff we've run before to make sure there aren't any problems. I agree on testing with Spin2, but not with all the bells and whistles that people have been kicking around.
I agree, but it seems like the basic P1 Spin functionality plus P2-specific features should be sufficient at this time. Many of the new features that have been proposed are enhancements to the language, but do not change the bytecode interpreter. These features can be added later on after the Verilog has been tested.
BTW, I ported the P1 bytecode interpreter to the P2 back in November of 2015. Maybe this can be optimized and used to test out bytecode processing on the P2.
The code is contained in the p1spin thread at http://forums.parallax.com/discussion/162858/p1spin . Actually, "porting" is probably not the right term in this case. I basically wrote PASM code to implement each of the Spin bytecodes, and then used a jump table to execute each fetched bytecode. At one time I did attempt to port the interpreter contained in the ROM code, but I gave up because it relies heavily on self-modifying code specific to the P1.
It's doubtful that the November 2015 code will run on the latest FPGA image. However, maybe someone would like to give it a try. I am away from my FPGA board currently, but I'll try to get it to run next week when I get home.
EDIT: I found an earlier thread from March 2014 -- 3 years ago! -- when I first wrote p1spin for the P2. It's located at http://forums.parallax.com/discussion/154460/p1spin . Boy, time flies when you're getting old having fun.
In starting the math stack operations for the Spin2 interpreter, I realized that it was taking three instructions to do a cog-register-stack pop:
sub stkptr,#1
alts stkptr
mov x,0
In the current v16, instructions ALTSN..ALTB do not alter their D register.
I changed these instructions to use S[17:09] as a signed adder value for their D register. Under normal usage, where S is zero or #imm9, these S[17:09] bits are always 0, making the instruction behave as currently documented. When those S[17:09] bits are non-0, they get sign-extended and added to D. This makes indexing possible with just two instructions.
Here's what a cog-register-stack pop looks like now:
alts stkptr,stkpop
mov x,0
...
stkpop long $1FF<<9 + (stkbase-1) & $1FF 'use stkbase-1 (like --stkptr) and decrement D
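The new behavior can be modeled in Python. This is an illustrative simulation, not the silicon logic: the function name `alts` and the decomposition are mine, and the detail that the next instruction's S field becomes (D + S[8:0]) & $1FF is my reading of the stkpop encoding above:

```python
# Hypothetical model of the modified ALTS: the next instruction's S
# field becomes (D + S[8:0]) & $1FF, and D is then bumped by the
# sign-extended 9-bit adder held in S[17:09].
def alts(d, s):
    eff_s = (d + (s & 0x1FF)) & 0x1FF   # address substituted into the next instruction
    adder = (s >> 9) & 0x1FF            # the S[17:09] field
    if adder & 0x100:                   # sign bit of the 9-bit field set?
        adder -= 0x200                  # sign-extend to a negative value
    new_d = (d + adder) & 0x1FF         # updated index register (e.g. stkptr)
    return eff_s, new_d

# The stkpop pattern from the post, with a $000-based stack (stkbase = 0):
spop = (0x1FF << 9) + ((0 - 1) & 0x1FF)  # adder = -1, offset = stkbase - 1
```

With stkptr = 3, `alts(3, spop)` yields effective address 2 and a new stkptr of 2: a pre-decrement pop in one ALTS plus one MOV, as described.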
There will be a v17 release soon which contains this change, along with the xoroshiro128+ PRNG.
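For reference, the published xoroshiro128+ algorithm (Blackman/Vigna, with the original rotation constants 55, 14, 36) looks like this in Python. Whether the v17 hardware matches this exact variant bit-for-bit is an assumption here; this sketch just shows the algorithm the name refers to:

```python
MASK64 = (1 << 64) - 1

def rotl(x, k):
    # 64-bit rotate left
    return ((x << k) | (x >> (64 - k))) & MASK64

class Xoroshiro128Plus:
    """Reference xoroshiro128+ (original 55/14/36 constants); the P2's
    internal implementation may differ in seeding and output scrambling."""
    def __init__(self, s0, s1):
        self.s0, self.s1 = s0 & MASK64, s1 & MASK64
    def next(self):
        s0, s1 = self.s0, self.s1
        result = (s0 + s1) & MASK64      # the "+" in xoroshiro128+
        s1 ^= s0
        self.s0 = rotl(s0, 55) ^ s1 ^ ((s1 << 14) & MASK64)
        self.s1 = rotl(s1, 36)
        return result
```

The state is two 64-bit words; each step costs a handful of shifts and XORs, which is why it maps well onto simple hardware.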
There may be more simple enhancements along the way to getting Spin2 working. This is why I don't want to jump to synthesis immediately. Little things come up when you start writing software.
I thought you had decided to put the stack in hub memory?
True; on the other hand there's always "just one more thing" that you can do to improve the chip. At some point it does have to ship!
Nice, stack operations are so common that saving 1 instruction per stack op is a huge deal.
This should be a bonus for C/C++ too.
I very much doubt that the C compiler will use this. The C stack will (normally) be in hub memory, so the ptra/ptrb instructions would be used for stack operations. Even if we did want to add a mode to allow the stack to be in cog RAM, I'm not sure the compiler would be able to use ALTS, since it requires the register to be set up in a very particular way. We couldn't use it for an arbitrary predecrement unless the bottom 9 bits of the S register are treated as signed.
The stack will be in hub memory, but I'm thinking that the current stack frame can be cached in cog RAM.
Here are some stack-based math operations:
'
'
' Add
'
_add alts sp,spop
mov y,0
alts sp,spop
mov x,0
add x,y
altd sp,spush
mov 0,x
ret
'
'
' Sub
'
_sub alts sp,spop
mov y,0
alts sp,spop
mov x,0
sub x,y
altd sp,spush
mov 0,x
ret
'
'
' Logical AND
'
_logand alts sp,spop
mov x,0 wz
alts sp,spop
if_nz mov x,0 wz
if_nz not x,#0
altd sp,spush
mov 0,x
ret
'
'
' Logical OR
'
_logor alts sp,spop
mov x,0 wz
alts sp,spop
if_z mov x,0 wz
if_nz not x,#0
altd sp,spush
mov 0,x
ret
'
'
' Logical NOT
'
_lognot alts sp,spop
mov x,0 wz
if_z not x,#0
if_nz mov x,#0
altd sp,spush
mov 0,x
ret
'
'
' Stack setup
'
spop long $1FF<<9 + $1FF '$000-based stack pop [--sp]
spush long $001<<9 + $000 '$000-based stack push [sp++]
sp long $000 '$000-based stack pointer
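The net effect of these snippets can be modeled with a small Python stack machine. This is a hedged sketch of the semantics only (class and method names are mine): note how the logical operations yield all-ones ($FFFFFFFF, i.e. the `not x,#0` idiom) for true and 0 for false:

```python
class CogStack:
    """Hypothetical model of a cog-register stack with wrap at 512 longs."""
    def __init__(self):
        self.regs = [0] * 512            # cog RAM model
        self.sp = 0
    def push(self, v):                   # post-increment push [sp++]
        self.regs[self.sp] = v & 0xFFFFFFFF
        self.sp = (self.sp + 1) & 0x1FF
    def pop(self):                       # pre-decrement pop [--sp]
        self.sp = (self.sp - 1) & 0x1FF
        return self.regs[self.sp]
    def add(self):
        y, x = self.pop(), self.pop()
        self.push((x + y) & 0xFFFFFFFF)  # 32-bit wraparound add
    def logand(self):
        y, x = self.pop(), self.pop()
        self.push(0xFFFFFFFF if (x and y) else 0)   # true = all ones
    def logor(self):
        y, x = self.pop(), self.pop()
        self.push(0xFFFFFFFF if (x or y) else 0)
```

Each PASM snippet above is two pops, one operation, and one push; the model makes that data flow explicit.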
The instruction(s) at their heart could be modified to save lots of instances of almost the same code.
If you have the stack in cog RAM, then you can do the calculation directly on the stack (the stack is made of registers). That way you save a pop and a push.
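The net effect of that idea can be sketched in Python (the original PASM snippet is not shown here, so this is a hedged model of the outcome only; `add_in_place` is a hypothetical name): a binary operation reads the top stack slot and combines it into the slot below, sparing the second pop and the final push.

```python
def add_in_place(regs, sp):
    # Model: regs is the cog-register stack, sp points past the top.
    y = regs[(sp - 1) & 0x1FF]                            # top of stack
    below = (sp - 2) & 0x1FF
    regs[below] = (regs[below] + y) & 0xFFFFFFFF          # combine in place
    return (sp - 1) & 0x1FF                               # one slot consumed
```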
Andy, thanks. That should have occurred to me, too.
I've been working on the code snippets today that make up the interpreter operations. I've switched to hub push/pop, for now, since it seems too early to know how to optimize for caching.
I think the idea of structures was discussed earlier in this thread, and I believe the consensus was not to add it to Spin2. If you're going to add structures you might as well just program in C. The concept of passing an object pointer to a method was discussed, and there seemed to be general agreement to add that. The object VAR space is kind of like a structure. However, because longs, words and bytes are grouped together in VAR space it messes up the order of elements with different sizes.
OK, so on the P2 the Spin compiler will not re-order the VAR data. And unaligned longs and words will be allowed. Got it. Structs will be a nice addition to Spin. I've got a few more suggestions if you're interested. Pretty soon Spin will look just like C, but with indentation instead of braces.
libSPIN
On second thought, maybe we should NOT go there. Sounds like an avenue for rapid baggage accumulation.
Sounds like a good way to go!
ersmith,
That's pretty much what I was thinking then, with the system library thing.
Correct me?
It's only a few days ago that a new PRNG was added to it.
And some instruction or other that was supposed to help the Spin byte code.
.....
Oh yeah. I missed that "DONE" announcement. The date on the last FPGA release did not move so I did not check that thread.
Hurrah! I'm going to get me some cake!
Worth it.
C style structures
I have no idea how hard that is to add, but would be very useful...
-Phil
I believe Chip still wants to add structures. I think you are wrong about the consensus.
Also, with the P2 the VARs do not need to be reordered anymore, and I think Chip already made that change.
That's right, VARs can be assembled in order of declaration, without any alignment concerns.
I'm working on the call/return code in the interpreter now.
I'm interested!