To ORGH or not to ORGH?

ersmith · 2019-04-09 11:25

This is a bug in the program code, I think. The variable x is declared as:

x     res 1

which does not initialize it, just reserves some space for it (in COG memory... "res" does nothing to HUB memory!). It'll have as an initial value whatever happens to be in HUB memory after the label; in this case it'll be the first instruction of the PropCam_Acq function.

But in the actual code x appears to be used without being explicitly set, so that initial value will determine the behavior of the program. The first use of x I see is in this loop:

doVga		
                rdfast	#0,##$1000-$400		'load .bmp palette into lut
				rep	@.end,#$100
				rflong	y
				shl	y,#8
				wrlut	y,x
				add	x,#1
.end

x is used as the index into LUT for writing, but since it is not set first the results will depend on the contents of HUB memory.

This explains why everything works if the first instruction after PropCam_Acq is "nop"; in that case x happens to be initialized to the bit pattern of the nop instruction, which is all 0. It also works if you change the "x res 1" to "x long 0", for a similar reason (now x is initialized to 0), or if there's an "orgh" before PropCam_Acq (orgh pads with 0). Another fix would be to insert "mov x, #0" before the rep loop.
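
A minimal sketch of that last fix, assuming the palette loop from the program is otherwise unchanged (only the "mov" line is new):

```
doVga
		mov	x,#0			'explicitly start the LUT index at 0
		rdfast	#0,##$1000-$400		'load .bmp palette into lut
		rep	@.end,#$100
		rflong	y
		shl	y,#8
		wrlut	y,x			'x no longer depends on what follows in HUB memory
		add	x,#1
.end
```

With the explicit initialization, the result should no longer depend on which instruction happens to come first in PropCam_Acq.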

cgracey · 2019-04-09 12:12

I think ORGH needs to bump the address to at least $400 to get around these ambiguities.

I'm changing PNut.exe to separate the ORGH address from the actual object code address.

Cluso99 · 2019-04-09 12:21

IMHO I don’t think ORGH should prevent values below $400 as we may well want to put buffers, variables and tables there.

ersmith · 2019-04-09 12:29

cgracey wrote: »

I think ORGH needs to bump the address to at least $400 to get around these ambiguities.

This particular example didn't have anything to do with ORGH, except accidentally -- it was just use of an uninitialized variable.

I'm changing PNut.exe to separate the ORGH address from the actual object code address.

What exactly do you mean by that, Chip? Do you mean that labels will now have 3 values, their default value, notional hub value as set by ORGH, and actual location in HUB memory? This seems like it could get very confusing. I think one of the things people didn't like in P1 was that "@label" had different meanings in Spin and PASM; in Spin it gave you the actual HUB location, but in PASM it just gave you an offset of sorts. It would be nice to fix that in P2.

Rayman · 2019-04-09 12:41

What does

ORGH 0

do exactly?
I think it pushes to at least $400.

I think the way it works now (but could be wrong) is that

ORGH X

tries to start code at X in HUB memory with X being made to be at least $400.
This seems to be working just fine for me...

Rayman · 2019-04-09 12:47

BTW: "RES" is nothing but trouble. It's too easy to get caught putting variables after a RES and messing up your code.
And here, we see how it can really mess you up if you don't realize it's not initialized...
I'm trying to think if RES serves any use that justifies itself..

ersmith · 2019-04-09 13:00

Rayman wrote: »
What does
ORGH 0
do exactly?
I think it pushes to at least $400.

In fastspin at least "orgh 0" will only work if it's the very first thing in the program. Otherwise you'd be asking the assembler to go backwards, which would be impossible.

I think the way it works now (but could be wrong) is that
ORGH X
tries to start code at X in HUB memory with X being made to be at least $400.
This seems to be working just fine for me...

In fastspin (and I'm pretty sure the other assemblers) "ORGH X" will try to insert padding to make sure the code/data will start at location X in hub memory. If there is already more than X bytes of things in HUB then it will report an error. There's no special case for $400.
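
As a rough sketch of that padding behavior (the labels "start" and "table" are made up for illustration, and the exact error text will depend on the assembler):

```
DAT
		org				'cog code, emitted at the start of the image
start		nop
		jmp	#start

		orgh	$100			'8 bytes emitted so far, so $08..$FF gets padded
table		long	1, 2, 3, 4		'"table" lands at hub address $100

'		orgh	$80			'would be an error: more than $80 bytes are already in HUB
```

The second ORGH is commented out because it asks the assembler to go backwards.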

Note that unlike ORG, just plain ORGH on its own is *not* the same as "ORGH 0". Just "ORGH" with no number says "labels defined after this point are in HUB memory", but it doesn't change the hub memory location. So in:

   ORGH 0  ' first thing
   ORG 0 ' COG labels
x  nop
y  nop
   ORGH ' HUB labels
z  nop

"x" has the value 0, "y" has the value 1 (it's a COG address), but "z" has the value 8 (it's a HUB address). If the ORGH wasn't there, "z" would have the value 2.

Rayman · 2019-04-09 13:04

Hmm... In that case, I'm surprised I didn't have an issue with "ORGH" yet...

I thought there were some things that don't work if address is below $400.
What was it? I think it was something about a relative vs. hub address...
Or maybe it's that hubexec code doesn't work if it's below $400?...

evanh · 2019-04-09 13:05

Eric,
Thank you! I never checked for that. I've tested it and it works just by adding a MOV x,#0.

'******************************
'*  VGA 640 x 480 x 8bpp-lut  *
'******************************

CON

  intensity	= 80	'0..128

  fclk		= 80_000_000.0
  fpix		= 25_000_000.0
  fset		= (fpix / fclk * 2.0) * float($4000_0000)

  vsync		=	0	'vsync pin (all FPGA boards now)

DAT
org
'
'
' Setup
'
		hubset	#$FF			'set clock to 80MHz


		coginit #2,##@PropCam_Acq


		mov	x,#0
		rdfast	#0,##$1000-$400		'load .bmp palette into lut
		rep	@.end,#$100
		rflong	y
		shl	y,#8
		wrlut	y,x
		add	x,#1
.end
		rdfast	##640*480/64,##$1000	'set rdfast to wrap on bitmap

		setxfrq ##round(fset)		'set transfer frequency to 25MHz

		'the next 4 lines may be commented out to bypass level scaling

		setcy	##intensity << 24	'r	set colorspace for rgb
		setci	##intensity << 16	'g
		setcq	##intensity << 08	'b
		setcmod	#%01_0_000_0		'enable colorspace conversion

		wrpin	dacmode,#0		'enable dac modes in pins 0..3
		wrpin	dacmode,#1
		wrpin	dacmode,#2
		wrpin	dacmode,#3
'
'
' Field loop
'
field		mov	x,#33			'top blanks
		call	#blank

		mov     x,#480			'set visible lines
line		call	#hsync			'do horizontal sync
		xcont	m_rf,#0			'visible line
		djnz    x,#line           	'another line?

		mov	x,#10			'bottom blanks
		call	#blank

		drvnot	#vsync			'sync on

		mov	x,#2			'sync blanks
		call	#blank

		drvnot	#vsync			'sync off

                jmp     #field                  'loop
'
'
' Subroutines
'
blank		call	#hsync			'blank lines
		xcont	m_vi,#0
	_ret_	djnz	x,#blank

hsync		xcont	m_bs,#0			'horizontal sync
		xcont	m_sn,#1
	_ret_	xcont	m_bv,#0
'
'
' Initialized data
'
dacmode		long	%0000_0000_000_1010000000000_01_00000_0

m_bs		long	$CF000000+16		'before sync
m_sn		long	$CF000000+96		'sync
m_bv		long	$CF000000+48		'before visible
m_vi		long	$CF000000+640		'visible

m_rf		long	$7F000000+640		'visible rlong 8bpp lut

x		res	1
y		res	1

'****************************************************************************
'endfold
'comment

'fold propcam
org

PropCam_Acq
'		nop                      'if this is first instruction everything fine
'		waitx ##80_000_000   'if this is first instruction ... no display
		cogid camera_number  'if this is first instruction ... bad lut
		cogstop #2            'if this is first instruction ... bad lut

'temporary jmp
temploop
		nop
		jmp #temploop


camera_number	res  1

'endfold


'
' Bitmap
'
		orgh	$1000 - $436	'justify pixels at $1000, palette at $1000-$400
		file	"bitmap2.bmp"	'640 x 480, 8bpp-lut

evanh · 2019-04-09 13:07

So, whole issue caused by a single uninitialised variable.

ersmith · 2019-04-09 13:08

Rayman wrote: »

BTW: "RES" is nothing but trouble. It's too easy to get caught putting variables after a RES and messing up your code.
And here, we see how it can really mess you up if you don't realize it's not initialized...
I'm trying to think if RES serves any use that justifies itself..

You have to use RES carefully, but it does have uses. The main one is that it allows you to declare variables that you know will be initialized from elsewhere (e.g. data read from pins or SD card) without requiring that space be allocated in the binary. That can save a lot on load time, and also lets you fit more things into HUB RAM. For example:

x res 100

reserves space for 100 longs (400 bytes) at runtime, but since it's not initialized you don't have to download that to the P2, and if it's in an ORG section it consumes no extra HUB RAM; whereas

x long 0[100]

makes your .binary file 400 bytes larger. In some cases this could be the difference between having code that fits in memory and not.
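
Here is a small sketch of the two declarations side by side. The names "entry", "zeros", "count" and "scratch" are only for illustration:

```
DAT
		org
entry		mov	count,#0		'RES gives no initial value, so set it at run time
.loop		add	count,#1
		jmp	#.loop

zeros		long	0[100]			'100 longs stored in the binary (400 bytes), all 0
count		res	1			'1 register reserved, nothing stored in the binary
scratch		res	100			'100 registers reserved, nothing stored in the binary
```

Note that the RES lines come last, after all instructions and initialized data, and that "count" is set explicitly before it is used.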

Rayman · 2019-04-09 13:12

I would argue that you could just use FIT to achieve the same purpose with less risk...

But, I do use RES, I'm just very careful with it after having been burned more than once.

evanh · 2019-04-09 13:49

Here's a modified version for driving the P2ES-Eval accessory board. If the MOV x,#0 at line 69 is commented out then things go bad.

'******************************
'*  VGA 640 x 480 x 8bpp-lut  *
'******************************

CON
	XTALFREQ	= 20_000_000				'PLL stage 0: crystal frequency
	XDIV		= 2					'PLL stage 1: crystal divider
	XMUL		= 8					'PLL stage 2: crystal / div * mul
	XDIVP		= 1					'PLL stage 3: crystal / div * mul / divp (1,2,4..30)

	XOSC		= %10                             'OSC    ' %00=OFF, %01=OSC, %10=15pF, %11=30pF
	XSEL		= %11                             'XI+PLL ' %00=rcfast(20+MHz), %01=rcslow(~20KHz), %10=XI(5ms), %11=XI+PLL(10ms)
	XPPPP		= ((XDIVP>>1) + 15) & $F                  ' 1->15, 2->0, 4->1, 6->2...30->14
	CLOCKFREQ	= round(float(XTALFREQ) / float(XDIV) * float(XMUL) / float(XDIVP))
	SETFREQ		= 1<<24 + (XDIV-1)<<18 + (XMUL-1)<<8 + XPPPP<<4 + XOSC<<2
	ENAFREQ		= SETFREQ + XSEL                          ' %0000_000e_dddddd_mmmmmmmmmm_pppp_cc_ss  ' enable oscillator


'Serial functions for data logging
	rx_pin		= 63
	tx_pin		= 62
	BAUDRATE	= 115_200
	ASYNCFG		= round(float(CLOCKFREQ) * 64.0 / float(BAUDRATE))<<10 + 7	'bitrate format is 16.6<<10, 8N1 framing


  intensity	= 80	'0..128

  fclk		= 80_000_000.0
  fpix		= 25_000_000.0
  fset		= (fpix / fclk * 2.0) * float($4000_0000)

'vsync		=	0	'vsync pin (all FPGA boards now)
vsync		=	4	'vsync pin (for Prop2-Eval accessory board)

DAT
org
'
'
' Setup
'
'		hubset	#$FF			'set clock to 80MHz
'Silicon frequencies
'-----------------------------------------------------------
		hubset	clk_mode			'switch to RCFAST using known prior mode
		mov	clk_mode, ##SETFREQ		'replace old with new
		jmp	#bootinit
'--------------------------------------------------------
'***  Boot-loader can fill all four of the following  ***
'--------------------------------------------------------
spare1		long	0			'hubRAM addr $010 - compatible reserved for system variable
clk_freq	long	CLOCKFREQ		'hubRAM addr $014 - sysclock frequency, integer frequency in hertz
clk_mode	long	0			'hubRAM addr $018 - clock mode config word, used directly in HUBSET
asyn_baud	long	BAUDRATE		'hubRAM addr $01c - comport baud rate, integer baud in hertz

bootinit
		hubset	clk_mode			'setup for new mode, still RCFAST
		waitx	##20_000_000/100		'~10ms for crystal/PLL to come up to speed
		hubset	##ENAFREQ			'engage
'-----------------------------------------------------------

		coginit #2,##@PropCam_Acq


		mov	x,#0
		rdfast	#0,##$1000-$400		'load .bmp palette into lut
		rep	@.end,#$100
		rflong	y
		shl	y,#8
		wrlut	y,x
		add	x,#1
.end
		rdfast	##640*480/64,##$1000	'set rdfast to wrap on bitmap

		setxfrq ##round(fset)		'set transfer frequency to 25MHz

		'the next 4 lines may be commented out to bypass level scaling

		setcy	##intensity << 24	'r	set colorspace for rgb
		setci	##intensity << 16	'g
		setcq	##intensity << 08	'b
		setcmod	#%01_0_000_0		'enable colorspace conversion

		wrpin	dacmode_s,#0		'enable dac modes in pins 0..3
		wrpin	dacmode_c,#1
		wrpin	dacmode_c,#2
		wrpin	dacmode_c,#3
		setnib  dira,#$f,#0		'RJA:  New for real P2
'
'
' Field loop
'
field		mov	x,#33			'top blanks
		call	#blank

		mov     x,#480			'set visible lines
line		call	#hsync			'do horizontal sync
		xcont	m_rf,#0			'visible line
		djnz    x,#line           	'another line?

		mov	x,#10			'bottom blanks
		call	#blank

		drvnot	#vsync			'sync on

		mov	x,#2			'sync blanks
		call	#blank

		drvnot	#vsync			'sync off

                jmp     #field                  'loop
'
'
' Subroutines
'
blank		call	#hsync			'blank lines
		xcont	m_vi,#0
	_ret_	djnz	x,#blank

hsync		xcont	m_bs,#0			'horizontal sync
		xcont	m_sn,#1
	_ret_	xcont	m_bv,#0
'
'
' Initialized data
'
'dacmode	long	%0000_0000_000_1010000000000_01_00000_0
'RJA:  New dacmodes for real P2
dacmode_s       long    %0000_0000_000_1011000000000_01_00000_0         'hsync is 123-ohm, 3.3V
dacmode_c       long    %0000_0000_000_1011100000000_01_00000_0         'R/G/B are 75-ohm, 2.0V

m_bs		long	$CF000000+16		'before sync
m_sn		long	$CF000000+96		'sync
m_bv		long	$CF000000+48		'before visible
m_vi		long	$CF000000+640		'visible

m_rf		long	$7F000000+640		'visible rlong 8bpp lut

x		res	1
y		res	1

'****************************************************************************
'endfold
'comment

'fold propcam
org

PropCam_Acq
'		nop                      'if this is first instruction everything fine
'		waitx ##80_000_000   'if this is first instruction ... no display
		cogid camera_number  'if this is first instruction ... bad lut
		cogstop #2            'if this is first instruction ... bad lut

'temporary jmp
temploop
		nop
		jmp #temploop


camera_number	res  1

'endfold


'
' Bitmap
'
		orgh	$1000 - $436	'justify pixels at $1000, palette at $1000-$400
		file	"bitmap2.bmp"	'640 x 480, 8bpp-lut

evanh · 2019-04-09 13:57

Rayman,
I've had bad experience too but, at this stage, it doesn't look to be the fault of the tools.

Dave Hein · 2019-04-09 14:07

I also ran into the issue with x not being initialized. I posted a thread about this called Minor issue with VGA_640_x_480_8bpp.spin2 and VGA_640_x_480_16bpp.spin2. It worked OK in the original programs because the ORGH caused the memory to be padded out with zeros. However, if you put code immediately after x and y the memory will be nonzero.

The RES directive should be used with care. If a variable is required to be initialized to zero, it is better to use "long 0" than to use res and then depend on the assembler to pad with zeros.

cgracey · 2019-04-09 23:44

For assembler mode, like PNut.exe works in right now, nothing will change.

For Spin2 mode, though, there isn't going to be controllable absolute addresses because of how objects build. So, in the case of Spin2 mode, ORGH will only affect the assembler-aware origin, and will default to $400, and not allow addresses below $400.

cgracey · 2019-04-09 23:50

About RES, this is a very useful directive, but it is only good for allocating uninitialized registers within ORG'd code.

jmg · 2019-04-10 00:01

Rayman wrote: »

I would argue that you could just use FIT to achieve the same purpose with less risk...

But, I do use RES, I'm just very careful with it after having been burned more than once.

I think there is room for both, but they need to be very clear about what they actually do, and where they are doing it !!

As P2 is totally RAM based, there are code-gains to be had, by declaring a VAR and initialising it at the same time.
Flash based MCUs have to load any initialized RAM vars, which costs time and code space.
Some MCUs sweep RAM to clear to 0x00, as part of their init code. That approach does reduce code size and helps avoid 'oops' bugs.

Arrays that are declared maybe should have an option to load a known value ?

As mentioned above, large arrays could have a download cost, even if their values are really don't-care.

Rayman · 2019-04-10 00:01

Why can’t res vars be relocated to after declared registers?

cgracey · 2019-04-10 00:17

Rayman wrote: »

Why can’t res vars be relocated to after declared registers?

The assembler would have to group them up under each ORG and arrange them before the end of the DAT block or before the next ORG or ORGH. It's a lot simpler to just place them manually and understand what they do.

Rayman · 2019-04-10 00:36

Seems like there should be a warning in such cases...

I like warnings
Ignore them all the time

ozpropdev · 2019-04-10 00:38

The RES directive is not a new directive for P2.
It has been used extensively in P1 code.
The P1 manual explains its use and even includes this statement.

Caution: Use RES Only After Instructions and Data
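
To illustrate what that caution guards against, here is a sketch of the mistake (the labels "entry", "total" and "step" are hypothetical):

```
DAT
		org
entry		add	total,step		'intended: total += 5
		jmp	#entry

total		res	1			'symbol = cog register 2, but nothing is emitted for it
step		long	5			'symbol = cog register 3, yet the 5 is the third long of
						'the image, so it loads into register 2 ("total")
```

Because RES advances the cog address without emitting anything, every symbol after it is one register off from where its data actually loads. Swapping the last two lines, so that RES comes last, avoids this.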

Rayman · 2019-04-10 00:50

Caution exactly

ersmith · 2019-04-10 16:17

cgracey wrote: »

For assembler mode, like PNut.exe works in right now, nothing will change.

For Spin2 mode, though, there isn't going to be controllable absolute addresses because of how objects build. So, in the case of Spin2 mode, ORGH will only affect the assembler-aware origin, and will default to $400, and not allow addresses below $400.

I would urge you to think about this again. Addresses were one of the problem points in Spin 1; I think people often tripped over the difference between @ in Spin and PASM. I know that making @ work as expected in mixed PASM and Spin is difficult. But it is possible -- homespun and bstc did it, for example (although they had to invent a new syntax, @ @ @, to maintain Spin compatibility).

Rayman · 2019-04-10 16:39

Now I'm confused... When do we need the triple-@?

I found this quote from ersmith:

Correct, code in a method can use (at)image to get the absolute hub address of a DAT symbol "image". (Code in a DAT section would have to use a more complicated method involving a run-time fixup, unless you're using bstc or fastspin, in which case triple (at) will do the trick.)

Seems the triple is only for PASM sections?
But, I think the tools right now already give us the absolute address of things when inside a PASM section.

ersmith · 2019-04-10 17:04

Rayman wrote: »

Now I'm confused... When do we need the triple-@?

I found this quote from ersmith:

Correct, code in a method can use (at)image to get the absolute hub address of a DAT symbol "image". (Code in a DAT section would have to use a more complicated method involving a run-time fixup, unless you're using bstc or fastspin, in which case triple (at) will do the trick.)

Seems the triple is only for PASM sections?
But, I think the tools right now already give us the absolute address of things when inside a PASM section.

Not in Spin 1. On the Propeller 1, @ inside a DAT section always meant an address relative to the start of the DAT section. Right now in P2 code this is different and @ means absolute address, but I think Chip was talking about making Spin 2 act like Spin 1. I think this could end up being very confusing.

cgracey · 2019-04-10 17:35

In Spin2, hub-exec PASM code can be ORGH'd at any address >= $400, but it will never wind up there after compilation. With relative addressing modes within PASM blocks, though, all you'd need to know is the entry address, which is absolute and easy to determine at runtime, but not as ORGH'd. As objects are stacked together, there's no telling where some block of PASM code is going to wind up.

Am I missing something?

We'd have to have some kind of linker step to afford absolute addressing between uncommonly ORGH'd blocks of PASM code, right? It seems way more trouble than it's worth if we can determine an entry-point at run time and just use that, instead.

cgracey · 2019-04-10 17:37

So, my thinking is that ORGH'd labels return the absolute hub address via @label.

Rayman · 2019-04-10 19:00

Ok, that's mostly what I do now, except for the VGA bitmap, etc.
There we currently rely on the orgh value, I think, so that would have to change somehow...

Dave Hein · 2019-04-10 19:07

Couldn't there be two types of DAT sections with one type connected to a Spin object, and the other independent of an object? The independent one would look like what we have now with absolute addresses. The other type would look like the Spin1 DAT sections, where the addresses are relative to the start of the object.

EDIT: Maybe the independent DAT section could contain global symbols that any object could directly address. That's a feature that would have been useful in Spin1.
