Running in LUT exec mode

Seairth · 2015-10-01 13:18

Suppose I want to run code exclusively in LUT exec mode. How would I go about bootstrapping the LUT code efficiently? I only see WRLUT, which would require hub memory to first be copied to cog ram. I know there's some other method for quickly filling a LUT, but I'm not sure which instructions do that.

ozpropdev · 2015-10-01 13:37

dat		orgh	1

comms		setq	#$1F7
		rdlong	comms_tx,ptrb[(code2-comms)>>2]
		jmp	#comms_tx
code2
		org	$200

comms_tx

Edit: Oops! Pressed submit instead of preview

mindrobots · 2015-10-01 13:38

In Chip's cog_1k_program.spin, he uses rdlong to load the code from HUBRAM to the LUT

		loc	adra,@code + $1F0<<2	'load lut starting at $000 with 2nd half code
		setq2	#$200-1
		rdlong	$000,adra

I don't completely understand all this yet but it does appear to work.

Since I don't have a scope, I changed Chip's program to output on PortB so you can at least see some blinky lights. For all the toggle instructions,

long	$F4240400 [250]		'notb outa,#0 (250 instances)

becomes

long	$F4240600 [250]		'notb outb,#0 (250 instances)

I'm still not sure about any way to really tell you are running in COGEXEC, LUTEXEC or HUBEXEC except looking at the code addresses, where you put the code. If all that looks like it makes sense to your abstraction of the P2 memory models and if something appears to be running, you're good to go!

P.S. Don't ask PNUT to load something to your FPGA until you are sure the P2 is loaded. I just did this to run the above program and PNUT is horribly, miserably hung up on Win10. Task Manager won't even get rid of it.

Seairth · 2015-10-01 14:41

mindrobots wrote: »
In Chip's cog_1k_program.spin, he uses rdlong to load the code from HUBRAM to the LUT
		loc	adra,@code + $1F0<<2	'load lut starting at $000 with 2nd half code
		setq2	#$200-1
		rdlong	$000,adra

*doink* I forgot that there was sample code for this! SETQ2 seems to be the thing you need. I wish there were a more meaningful name for this, even if it was just an alias for SETQ2.

ozpropdev · 2015-10-01 14:55

In my example above where does "org $200" code end up then?
Was it masked to 9 bits and loaded in cog?
Code ran Ok.
Not with my P2 at the moment so can't check it.

David Betz · 2015-10-01 15:14

SETQ? This is driving me crazy. (setq foo 1) is the LISP expression to set the value of a variable. Does P2 implement LISP in hardware? :-)

mindrobots · 2015-10-01 17:04

ozpropdev wrote: »

In my example above where does "org $200" code end up then?
Was it masked to 9 bits and loaded in cog?
Code ran Ok.
Not with my P2 at the moment so can't check it.

From what I've learned (or mislearned) so far, my guess is that your code ended up in COGRAM because you used the SETQ. As for the labels you referred to in the code after the org $200, they would have lost that upper bit, but since they are all 9-bit S and D values anyway, it probably didn't matter whether the 10th bit was there or not.

I would also think any of the branches you did were OK because you essentially wrote COG (9 bit) code.
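If that guess holds, the effect on a label assembled after org $200 is just a 9-bit truncation. A minimal Python sketch of the arithmetic only, not of what PNUT actually does; `to_field` and `FIELD_MASK` are hypothetical names made up for illustration:

```python
# Hypothetical model: squeeze a 10-bit cog/LUT address into a 9-bit S or D field.
FIELD_MASK = 0x1FF  # nine bits, the width of an S or D field


def to_field(address):
    """Return the part of the address that survives in a 9-bit field."""
    return address & FIELD_MASK


for label_address in (0x200, 0x201, 0x3FF):
    # A label at $200 + n lands on the same field value as a label at n.
    print(f"${label_address:03X} -> ${to_field(label_address):03X}")
```

Under this assumption a label at $200 is indistinguishable from one at $000, which would explain why the code behaved like ordinary cog code.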

I have my P2, I'll try your test program skeleton with an org $200

(all of the above is speculation and black magic and most likely has little basis in any actual understanding of the P2)

mindrobots · 2015-10-01 19:48

Ok, I tried, got something to work, one thing doesn't make sense and the means to the end is UGLY.

This little program is an extension of Brian's test routine. It loads its data into COGRAM and its code into LUT, zeroes out the HUBRAM where the code was (just to prove we are really running in LUTEXEC) and then jumps to $200 to start running code in LUT.

con
	sys_clk = 50_000_000
	baud_rate = 115_200
	rx_pin = 63
	tx_pin = 62


' start off HUBEXEC in HUBRAM
dat		orgh	1

' move the data space to COGRAM

		loc	adra,@data_start
		setq	#$1FF
		rdlong	$000,adra

' move the code to LUT for LUTEXEC

		setq2	#$1FF
		loc	adra,@data_start + $200 << 2
		rdlong	$000,adra

' zero the HUBRAM after data & code is moved, to prove we are in LUTEXEC
:loop	
		mov	ptra, @data_start << 2
		wrlong	val, ptra++
		djnz	count, @:loop

		jmp	@$200		' jump to LUT LONG #$000

' build the $3FF image you will copy to COG + LUT

data_start
		org	$0, $400 << 2

		long	0[16]
regx		long	$12345678

bit_time	long	sys_clk / baud_rate
val		long	0
count		long	$400

timer		res	1
dx		res	1
nibc		res	1
code0		res	1
lut0		res	1
		res	$200 - lut0
data_end

' this is ugly - I'd like to say org $200 here or SEGLUT
' instead of just doing a res to account for the rest of COG space
' the address counter has moved up to start of LUT	

lut_start	long	$0FFFFFFF 	' dummy value, easy to find (NOP)
		setb	outb,#tx_pin
		setb	dirb,#tx_pin

wait4start	testb	 inb,#rx_pin wz 	'press a key to start in PST
	if_nz	jmp	#wait4start

'### TEST CODE ####
		rdlut	lut0,#0		' I expect to pick up $0FFFFFFF
					' from LUT address $0 here
		mov	val,regx
		call	@show_hex
		call	@newline
		mov	val,lut0
		call	@show_hex
		call	@newline
		
here		jmp	@here

'============================================================		
newline		mov	dx,#13

send_byte	setb	dx,#8
		shl	dx,#1
		getcnt	timer
		rep	@endrep,#10
		testb 	dx,#0 wz
		setbnz	outb,#tx_pin
		addcnt	timer,bit_time
		waitcnt
		shr	dx,#1
endrep
		ret
'============================================================

show_hex	mov	nibc,#8		'8 x nibbles

show_hex2	mov	dx,val
		shr	dx,#28
		cmp	dx,#9 wz,wc
	if_a	add	dx,#"A"-10
	if_be	add	dx,#"0"
		call	@send_byte
		shl	val,#4
		djnz	nibc,@show_hex2
		ret

'============================================================

I tried to fix the tab issues with the code - It does compile but you may need to fix some spaces.

load it, enable PST and hit a key, you should see the value of regx displayed and also contents of LUT0
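For anyone checking the expected output without hardware, here is a rough Python model of what show_hex and send_byte appear to do. It assumes integer division in the CON block and that each bit time drives the pin to the value of bit 0 of dx; `frame_bits` and `BIT_TIME` are names made up for this sketch:

```python
SYS_CLK = 50_000_000
BAUD_RATE = 115_200
BIT_TIME = SYS_CLK // BAUD_RATE  # clocks per serial bit (assumes integer division)


def show_hex(val):
    """Mirror show_hex: eight hex digits, most significant nibble first."""
    digits = []
    for _ in range(8):
        nibble = (val >> 28) & 0xF           # shr dx,#28
        if nibble > 9:                       # if_a  add dx,#"A"-10
            digits.append(chr(nibble + ord("A") - 10))
        else:                                # if_be add dx,#"0"
            digits.append(chr(nibble + ord("0")))
        val = (val << 4) & 0xFFFFFFFF        # shl val,#4
    return "".join(digits)


def frame_bits(byte):
    """Mirror send_byte: stop bit at bit 8, then shift in a 0 start bit."""
    dx = (byte | (1 << 8)) << 1              # setb dx,#8 / shl dx,#1
    return [(dx >> i) & 1 for i in range(10)]  # ten bits, LSB first


print(BIT_TIME)
print(show_hex(0x12345678))
print(show_hex(0x0FFFFFFF))
print(frame_bits(13))  # the carriage return sent by newline
```

If the RDLUT picks up the dummy long, the second line in PST should read 0FFFFFFF.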

The only thing I really don't understand is my rdlut isn't getting the value I think it should be getting - I expect $0FFFFFFF.

The org and address manipulations are rather ugly to get to the point where you have code in LUT to execute but it does work.

Was it fun? Sure....just like an adventure game!!

Cluso99 · 2015-10-01 23:02

The SETQ2 should be immediately prior to the RDLONG.

mindrobots · 2015-10-01 23:05

Thanks! Ok, I'll flip them around but the above code still runs as is.

Need to try when back at P2.

David Betz · 2015-10-02 00:21

Cluso99 wrote: »

The SETQ2 should be immediately prior to the RDLONG.

Can someone point me to a description of SETQ and SETQ2?

Cluso99 · 2015-10-02 00:33

I don't have a link.

SETQ is used to set a count 1-256 whereby the following RD/WRLONG instruction is repeated "n" times, incrementing hub and cog each time, i.e. used to load cog up to 256 longs in one operation at full clock speed.

SETQ2 is/was used as an equivalent to SETQ, but replaces cog with LUT RAM. Not sure if this is still required.

Hope this helps David.
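A small Python model of the block read as described. The samples earlier in the thread pass values such as #$200-1, so this sketch assumes the operand is the count minus one; `block_read` is a hypothetical name and the hub contents are dummy data:

```python
def block_read(hub, hub_index, dest, dest_index, q):
    """Copy q + 1 longs from hub into dest, advancing both addresses.

    Assumption: the SETQ/SETQ2 operand is (number of longs - 1).
    """
    for i in range(q + 1):
        dest[dest_index + i] = hub[hub_index + i]
    return q + 1


hub = list(range(0x1000))   # dummy hub image: each long holds its own index
cog = [0] * 0x200
lut = [0] * 0x200

copied_cog = block_read(hub, 0x010, cog, 0, 0x1F7)  # like setq  #$1F7 + rdlong
copied_lut = block_read(hub, 0x300, lut, 0, 0x1FF)  # like setq2 #$1FF + rdlong

print(copied_cog, copied_lut)
```

The only difference between the two calls is the destination array, which is the cog versus LUT distinction between SETQ and SETQ2.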

David Betz · 2015-10-02 00:49

Cluso99 wrote: »

I don't have a link.

SETQ is used to set a count 1-256 whereby the following RD/WRLONG instruction is repeated "n" times, incrementing hub and cog each time, i.e. used to load cog up to 256 longs in one operation at full clock speed.

SETQ2 is/was used as an equivalent to SETQ, but replaces cog with LUT RAM. Not sure if this is still required.

Hope this helps David.

Thanks!

jmg · 2015-10-02 00:54

Cluso99 wrote: »

The SETQ2 should be immediately prior to the RDLONG.

If this always has to be paired, it is probably better wrapped in the Assembler as one 64b opcode ?

A BlockMove type name would avoid it ever being accidentally split.

ozpropdev · 2015-10-02 01:12

@mindrobots
I tried running your code and got no output at all.
It seems Pnut does not like pasm instruction following directly after RES directives.
I changed these to longs and established comms again.
Now to try and work out this LUT stuff.

mindrobots · 2015-10-02 01:14

Strange, that code is cut straight out of PNUT after running it on my P2.

I'll report back after I get some P2 time.

Edit: Brian, very strange. I changed my RES to LONG and it started working. I ran it all afternoon with the RES in there, wondering how it was doing that, because res usually needs to be the last thing under the current org.

It still isn't doing what I expected (displaying $0FFFFFFF as the contents of LUT[0]).

I thought maybe I had "old code" stuck in the LUT since it doesn't get zeroed on reset but I was changing the code between lut_start and newline, and those changes were faithfully reproduced.

Much more that I don't understand than I do understand. I'm done for the day, time for the other half of the world to play!

P.S. The position of the SETQ2 doesn't appear to matter, it works both ways.

cgracey · 2015-10-02 08:36

I started to document that the other day, but got sidetracked by the addressing conundrum.

Here is a link to what I started:

https://docs.google.com/document/d/10qQn_-B7avY2ce0N1MDDdzOF1lACPNWUJkjHFyzITiY/edit?usp=sharing

Look at the end.

cgracey · 2015-10-02 08:37

SETQ/SETQ2 must immediately precede RDLONG in order to read multiple longs.
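Taken literally, that rule can be modelled as a count that is armed by SETQ/SETQ2 and discarded by whatever instruction comes next. The Python sketch below is only that literal reading, with a count-minus-one operand assumed; `run` and the tuple-based program format are invented for illustration, and the earlier report that the swapped order still ran suggests the FPGA image may be more forgiving:

```python
def run(program, hub):
    """Toy interpreter: SETQ/SETQ2 only affects the very next instruction."""
    cog, lut = {}, {}
    armed = None                      # (mnemonic, q) while a count is pending
    for op, *args in program:
        if op in ("setq", "setq2"):
            armed = (op, args[0])
            continue
        if op == "rdlong":
            dest, hub_index = args
            if armed is None:
                target, count = cog, 1            # plain single-long read
            else:
                target = lut if armed[0] == "setq2" else cog
                count = armed[1] + 1              # assumed count-minus-one operand
            for i in range(count):
                target[dest + i] = hub[hub_index + i]
        armed = None                  # any other instruction drops the count
    return cog, lut


hub = list(range(100, 200))
paired = [("setq2", 3), ("rdlong", 0, 10)]
split = [("setq2", 3), ("loc",), ("rdlong", 0, 10)]

print(len(run(paired, hub)[1]))   # longs that reached the LUT when paired
print(len(run(split, hub)[1]))    # longs that reached the LUT when split
```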

David Betz · 2015-10-02 10:39

cgracey wrote: »

SETQ/SETQ2 must immediately precede RDLONG in order to read multiple longs.

Just out of curiosity, what does the "Q" in SETQ and SETQ2 stand for?

Seairth · 2015-10-02 10:47

Qonfusing.

mindrobots · 2015-10-02 11:42

Quantity?
Quota? Short for Hub Read Write Quota Enforcement Register but a SETHRWQER mnemonic was a bit awkward. :0)

Dave Hein · 2015-10-02 11:54

I think SETQ sets the Quotient register, which is normally used with the divide instruction.

ozpropdev · 2015-10-02 12:34

@mindrobots
I made the following change to your code.

		loc	adra,@data_start + $200 << 2
changed to
		mov	adra,#@lut_start

and the result was

12345678
0FFFFFFF

:cool:

Edit: As well as these changes mentioned earlier

timer		long	0	'res	1
dx		long	0	'res	1
nibc		long	0	'res	1
code0		long	0	'res	1
lut0		long	0	'res	1
	'	res	$200 - lut0

Seairth · 2015-10-02 12:39

So if you want to copy a single long from HUB to LUT, you still have to use SETQ2?

SETQ2 #0
RDLONG $000, adra

Hmm...

ozpropdev · 2015-10-02 13:08

Arghh...
Just removed DE2-115 and powered up Prop123-A7, hit F11 and code failed??
This memory stuff is starting to stress me out!

Seairth · 2015-10-02 13:11

ozpropdev wrote: »

Arghh...
Just removed DE2-115 and powered up Prop123-A7, hit F11 and code failed??
This memory stuff is starting to stress me out!

Hopefully, Chip's next image drop will make some of this stuff a bit easier to write.

mindrobots · 2015-10-02 13:33

ozpropdev wrote: »

Arghh...
Just removed DE2-115 and powered up Prop123-A7, hit F11 and code failed??
This memory stuff is starting to stress me out!

Good, it's not just me, we're both crazy!!!

I'll try your changes (I've been doing everything on the A7 lately)

ozpropdev · 2015-10-02 13:41

Rick
I've dumped SETQ2 for now and dropped this in.

	'	setq2	#$1FF
	'	loc	adra,@data_start + $200 << 2
	'	rdlong	$000,adra

		mov	ptrb,#@lut_start
		mov	dx,#0
		mov	nibc,#$1ff
load_lut	rdlong	val,ptrb++
		wrlut	val,dx
		add	dx,#1
		djnz	nibc,@load_lut

This worked on both DE2-115 and Prop123-A7 after a power up.
I'm more confused now....

Rayman · 2015-10-02 13:52

Thanks for posting some documentation Chip!
There's been so many changes, it's good to get grounded with some facts...

ozpropdev · 2015-10-02 14:12

mindrobots wrote: »

Good, it's not just me, we're both crazy!!!

Crazy is good, right?

Cluso99 · 2015-10-02 17:15

Ozpropdev,
What does
Mov ptrb,#@lut_start
Do? The #@ has me confused.

FWIW only 511 longs are loaded into LUT RAM.
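Assuming the 511 refers to the WRLUT loop above, which starts nibc at $1ff, the count can be checked with a quick Python model of DJNZ (decrement, then jump back if the result is not zero); `djnz_iterations` is a made-up helper name:

```python
def djnz_iterations(start):
    """Count how many times a loop body runs when closed by DJNZ."""
    count = start
    iterations = 0
    while True:
        iterations += 1      # loop body: rdlong / wrlut / add
        count -= 1           # djnz decrements first...
        if count == 0:       # ...and falls through once it reaches zero
            break
    return iterations


loaded = djnz_iterations(0x1FF)
print(loaded, hex(loaded - 1))   # longs written, last LUT index written
```

With that starting value the last index written is $1FE, so LUT[$1FF] is never touched; starting nibc at $200 would cover all 512 longs.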
