p2asm

RossH · 2019-04-10 23:07

Hi Dave - sorry to keep throwing these at you, but here's another couple of minor incompatibilities ...

1. p2asm appears to allow "wzc" in place of "wcz" - i.e. the code compiles without error. But actually it silently ignores it, leading to much head scratching!

2. p2asm accepts the "fit" directive but it seems to do nothing - i.e. it doesn't complain if the code doesn't actually fit.

PNUT complains about both of these.

ersmith · 2019-04-11 00:06

evanh wrote: »

Anyone looking at the LOC bug?
EDIT: I've just updated it.

Thanks for your test program. LOC and JMP below $400 are certainly difficult cases, and I think you've run into a mixture of assembler bugs, hardware misfeatures, and misunderstandings on all sides.

One potential problem in your code is that you use raw labels rather than asking for hub addresses, so for example the COG code that does:

    loc pa, #loctest2
    ...
    orgh $110
loctest2 long $deadbeef

gets compiled as if it read

    loc pa, #$110

which of course won't work because as you note in your comments the value $deadbeef is actually located at $44 in COG memory. Should the assemblers catch this mistake? In principle I think they could issue a warning, at least (since the label "loctest2" is after an "orgh" directive we should know it's a hub expression). I think none of them do so now though.
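The two addresses in that example differ by a factor of four, which is what you would expect if the image is copied into COG RAM starting from hub address 0 (an assumption here, though the $44 figure suggests it): hub addresses count bytes, COG addresses count longs. A quick Python check of the arithmetic, using a made-up helper name:

```python
def hub_byte_to_cog_long(hub_addr, load_base=0):
    """Long index in COG RAM of a long-aligned hub byte address.

    Assumes the image was copied into COG RAM starting at hub
    address load_base (an assumption, not stated in the thread).
    """
    offset = hub_addr - load_base
    if offset < 0 or offset % 4:
        raise ValueError("address is not a long-aligned offset from load_base")
    return offset // 4

print(hex(hub_byte_to_cog_long(0x110)))   # 0x44
```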

Your program did point out a bug in fastspin, where it needs to divide the relative offset by 4 in COG mode. It looks like the hardware handles LOC and JMP differently in this regard; is that correct @cgracey ?

Actually it rather looks like we always need to divide the LOC value by 4 when running in COG mode, even for HUB addresses. That makes using LOC on byte addresses problematic. Is that intentional? It seems like it may be a hardware restriction that programmers need to be aware of -- it means that we can't use the same LOC code in COG and HUB, and that COG code cannot use LOC to load a HUB byte or word address. Never mind, that was based on a misunderstanding of how Evan's code works.

I think it would be prudent though for "loc" to default to absolute mode rather than relative; it might save trouble in the long run.

Regards,
Eric

evanh · 2019-04-11 00:28

Eric,
That example LOC is very much intended to be used as a hub address. There is valid data access to hubram below $400. Byte addressing is correct there.

Also, the first LOC is a cog address, 8, which also has the machine code encoded wrongly.

All three assemblers produce the same.

Only the third LOC, above hub $400, is encoded correctly.

evanh · 2019-04-11 00:41

Hubexec code assembles correctly. It's only cogexec that has the problem. I haven't tested lutram but I expect it to be the same as cogram.

evanh · 2019-04-11 00:47

ersmith wrote: »

I think it would be prudent though for "loc" to default to absolute mode rather than relative; it might save trouble in the long run.

Good question. Looking at the hubexec results - https://forums.parallax.com/discussion/comment/1457051/#Comment_1457051 I can see it has done an absolute encoding of $110 when it didn't need to.

It would be nice to retain the relative function. It can be done as Chip intended I think. So even the hubexec encoding needs a little work over for data references below $400.

evanh · 2019-04-11 01:35

ersmith wrote: »
gets compiled as if it read
    loc pa, #$110
which of course won't work because as you note in your comments the value $deadbeef is actually located at $44 in COG memory. Should the assemblers catch this mistake? In principle I think they could issue a warning, at least (since the label "loctest2" is after an "orgh" directive we should know it's a hub expression). I think none of them do so now though.

Your program did point out a bug in fastspin, where it needs to divide the relative offset by 4 in COG mode. It looks like the hardware handles LOC and JMP differently in this regard; is that correct @cgracey ?

All of that is related. Yes, there is a distinct difference between data addressing and program addressing. Program address encoding is byte scaled in all domains, but data addressing is longword scaled for cog space while byte scaled for hub space.

A LOC instruction has the encoding features of program addressing but is more likely to be used for data addressing ...

Dave Hein · 2019-04-11 02:50

RossH wrote: »

Hi Dave - sorry to keep throwing these at you, but here's another couple of minor incompatibilities ...

1. p2asm appears to allow "wzc" in place of "wcz" - i.e. the code compiles without error. But actually it silently ignores it, leading to much head scratching!

2. p2asm accepts the "fit" directive but it seems to do nothing - i.e. it doesn't complain if the code doesn't actually fit.

PNUT complains about both of these.

p2asm now prints an error message if an invalid symbol is found when expecting wc, wz or wcz. I also fixed the FIT directive.

evanh · 2019-04-11 03:18

ersmith wrote: »
gets compiled as if it read
    loc pa, #$110

I guess I should point out that that would have been a correct answer but the buggy relative address is really $408 for some reason. Producing an absolute address of $416, not the $110 it should be.

evanh · 2019-04-11 03:22

I doubt there is any legit reason to want to LOC a cog data address. MOV immediate is just as compact there. If LOC gets assembled as always calculated to a program address (byte scaled) then that would be the easy fix.

It would then get treated the same as the existing branch long-immediate encoding.

evanh · 2019-04-11 07:10

evanh wrote: »

Yes, there is a distinct difference between data addressing and program addressing. Program address encoding is byte scaled in all domains, but data addressing is longword scaled for cog space while byte scaled for hub space.

Oh, that's only true for long-immediate program addressing. Cogexec register direct addressing, for example, doesn't do that. Bit of a brain twister but for hubRAM addressing it's all byte scaled. That's where LOC wants to be used.

RossH · 2019-04-11 08:45

Dave Hein wrote: »

p2asm now prints an error message if an invalid symbol is found when expecting wc, wz or wcz. I also fixed the FIT directive.

Much appreciated, Dave!

pilot0315 · 2019-04-13 03:28

@evanh

question?

I am attempting to tease out your code to the serial terminal. Can you separate it out so I can access it??
Your putch appears to be simple. I was not part of the FPGA development. I would like to develop a simple subroutine to get to the serial terminal.
thanks

evanh · 2019-04-13 04:23

The routines, putch and getch, are simple, yes. The naming was from Dave Hein's bit-bashing routines that I started from.

'===============================================
'  input:  (none)
' result:  pb
'scratch:  (none)
'
getch
		testp	#rx_pin		wz	'byte received? (IN high == yes)
if_z		rdpin	pb, #rx_pin		'get data
if_z		shr	pb, #32-8		'shift the data to bottom of register
if_z		ret			wcz	'restore C/Z flags of calling routine
		jmp	#getch			'wait while Smartpin is idle


'===============================================
'  input:  pb
' result:  (none)
'scratch:  (none)
'
putch
		rqpin	inb, #tx_pin	wc	'transmitting? (C high == yes)  *Needed to initiate tx
		testp	#tx_pin		wz	'buffer free? (IN high == yes)
if_z_or_nc	wypin	pb, #tx_pin		'write new byte to Y buffer
if_z_or_nc	ret			wcz	'restore C/Z flags of calling routine
		jmp	#putch			'wait while Smartpin is both full (nz) and transmitting (c)

Okay, so the two smartpins first need to be configured before they operate as a comport. This is in the _diaginit routine.

'----- Configure diag comport to use smartpins instead of bit-bashing -----
		wrpin	#%00_11111_0, #rx_pin		'Asynchronous serial receive
		wxpin	##ASYNCFG, #rx_pin		'set baudrate and framing
		dirh	#rx_pin
		wrpin	#%01_11110_0, #tx_pin		'Asynchronous serial transmit mode
		wxpin	##ASYNCFG, #tx_pin		'set X with baudrate and framing
		dirh	#tx_pin
'		wypin	#0, #tx_pin			'trigger first tx ready state (not needed if dual checked)
							'single check is buffer full only, dual checking adds in tx flag

ASYNCFG is a preset constant defined at the start of the source. It is dependent on the prop2 system clock frequency, which is also preset in the same group of constants.

Example of 115200 baud and sysclock of 80 MHz from a crystal of 20 MHz.

CON
	XTALFREQ	= 20_000_000				'PLL stage 0: crystal frequency
	XDIV		= 2					'PLL stage 1: crystal divider
	XMUL		= 8					'PLL stage 2: crystal / div * mul
	XDIVP		= 1					'PLL stage 3: crystal / div * mul / divp (1,2,4..30)

	XOSC		= %10                             'OSC    ' %00=OFF, %01=OSC, %10=15pF, %11=30pF
	XSEL		= %11                             'XI+PLL ' %00=rcfast(20+MHz), %01=rcslow(~20KHz), %10=XI(5ms), %11=XI+PLL(10ms)
	XPPPP		= ((XDIVP>>1) + 15) & $F                  ' 1->15, 2->0, 4->1, 6->2...30->14
	CLOCKFREQ	= round(float(XTALFREQ) / float(XDIV) * float(XMUL) / float(XDIVP))
	SETFREQ		= 1<<24 + (XDIV-1)<<18 + (XMUL-1)<<8 + XPPPP<<4 + XOSC<<2
	ENAFREQ		= SETFREQ + XSEL                          ' %0000_000e_dddddd_mmmmmmmmmm_pppp_cc_ss  ' enable oscillator


'Serial presets for data logging
	rx_pin		= 63
	tx_pin		= 62
	BAUDRATE	= 115_200
	ASYNCFG		= round(float(CLOCKFREQ) * 64.0 / float(BAUDRATE))<<10 + 7	'bitrate format is 16.6<<10, 8N1 framing
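
For anyone wanting to check these constants by hand, here is the same arithmetic as a small Python sketch. The names mirror the CON block above. One assumption to be aware of: in Spin the shift operators bind tighter than addition, while in Python it is the other way round, so the shifts are parenthesised explicitly here.

```python
# Same values as the CON block: 20 MHz crystal, 80 MHz sysclock, 115200 baud.
XTALFREQ = 20_000_000
XDIV, XMUL, XDIVP = 2, 8, 1
XOSC, XSEL = 0b10, 0b11
BAUDRATE = 115_200

XPPPP = ((XDIVP >> 1) + 15) & 0xF
CLOCKFREQ = round(XTALFREQ / XDIV * XMUL / XDIVP)
# Spin evaluates each shift before the additions, hence the parentheses.
SETFREQ = (1 << 24) + ((XDIV - 1) << 18) + ((XMUL - 1) << 8) + (XPPPP << 4) + (XOSC << 2)
ENAFREQ = SETFREQ + XSEL
# Bit period in 16.6 fixed point, shifted up by 10, plus 7 for 8 data bits.
ASYNCFG = (round(CLOCKFREQ * 64.0 / BAUDRATE) << 10) + 7

print(CLOCKFREQ)      # 80000000
print(hex(SETFREQ))   # 0x10407f8
print(hex(ENAFREQ))   # 0x10407fb
print(ASYNCFG)        # 45510663
```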

pilot0315 · 2019-04-13 22:13

@evanh
Hello,
I copied your code and have time now to work with it. I am getting an error that I assume is not right. Looking at your loc-cbug code shows similar stuff.
Trying to just print a single character.
Any Ideas?? when you get a chance.
Thanks

evanh · 2019-04-13 22:31

You've placed everything in the CON section. You need to add a DAT section for assembly code to go in.

Eg:

CON
	rx_pin		= 63
	tx_pin		= 62

DAT
		wrpin	#%00_11111_0, #rx_pin		'Asynchronous serial receive

evanh · 2019-04-13 22:37

Oh, and there is more code in the _diaginit routine that sets the sysclock rate as well. I don't think pnut has any clock setting features.

'Silicon frequencies
		hubset	clk_mode			'switch to RCFAST using known prior mode
		mov	clk_mode, ##SETFREQ		'replace old with new
'		mov	clk_freq, ##CLOCKFREQ		'optional, if clk_freq used
		hubset	clk_mode			'setup for new mode, still RCFAST
		waitx	##20_000_000/100		'~10ms for crystal/PLL to come up to speed
		hubset	##ENAFREQ			'engage

evanh · 2019-04-13 22:44

And that, in turn, uses the conventionally reserved config longwords located at hub addresses $10 to $1f.

So I recommend using this at the start of your testing:

CON
	XTALFREQ	= 20_000_000				'PLL stage 0: crystal frequency
	XDIV		= 2					'PLL stage 1: crystal divider
	XMUL		= 8					'PLL stage 2: crystal / div * mul
	XDIVP		= 1					'PLL stage 3: crystal / div * mul / divp (1,2,4..30)

	XOSC		= %10                             'OSC    ' %00=OFF, %01=OSC, %10=15pF, %11=30pF
	XSEL		= %11                             'XI+PLL ' %00=rcfast(20+MHz), %01=rcslow(~20KHz), %10=XI(5ms), %11=XI+PLL(10ms)
	XPPPP		= ((XDIVP>>1) + 15) & $F                  ' 1->15, 2->0, 4->1, 6->2...30->14
	CLOCKFREQ	= round(float(XTALFREQ) / float(XDIV) * float(XMUL) / float(XDIVP))
	SETFREQ		= 1<<24 + (XDIV-1)<<18 + (XMUL-1)<<8 + XPPPP<<4 + XOSC<<2
	ENAFREQ		= SETFREQ + XSEL                          ' %0000_000e_dddddd_mmmmmmmmmm_pppp_cc_ss  ' enable oscillator


'Serial presets for data logging
	rx_pin		= 63
	tx_pin		= 62
	BAUDRATE	= 115_200
	ASYNCFG		= round(float(CLOCKFREQ) * 64.0 / float(BAUDRATE))<<10 + 7	'bitrate format is 16.6<<10, 8N1 framing



DAT
		jmp     #_diaginit
		long	0[3]
'--------------------------------------------------------
'***  Boot-loader can fill all four of the following  ***
'--------------------------------------------------------
spare1		long	0			'hubRAM addr $010 - compatible reserved for system variable
clk_freq	long	CLOCKFREQ		'hubRAM addr $014 - sysclock frequency, integer frequency in hertz
clk_mode	long	0			'hubRAM addr $018 - clock mode config word, used directly in HUBSET
asyn_baud	long	BAUDRATE		'hubRAM addr $01c - comport baud rate, integer baud in hertz
'--------------------------------------------------------

_diaginit
'Silicon frequencies
		hubset	clk_mode			'switch to RCFAST using known prior mode
		mov	clk_mode, ##SETFREQ		'replace old with new
'		mov	clk_freq, ##CLOCKFREQ		'optional, if clk_freq used
		hubset	clk_mode			'setup for new mode, still RCFAST
		waitx	##20_000_000/100		'~10ms for crystal/PLL to come up to speed
		hubset	##ENAFREQ			'engage

'----- Configure diag comport to use smartpins instead of bit-bashing -----
		wrpin	#%00_11111_0, #rx_pin		'Asynchronous serial receive
		wxpin	##ASYNCFG, #rx_pin		'set baudrate and framing
		dirh	#rx_pin
		wrpin	#%01_11110_0, #tx_pin		'Asynchronous serial transmit mode
		wxpin	##ASYNCFG, #tx_pin		'set X with baudrate and framing
		dirh	#tx_pin
'		wypin	#0, #tx_pin			'trigger first tx ready state (not needed if dual checked)
							'single check is buffer full only, dual checking adds in tx flag

evanh · 2019-04-13 23:02

And the last problem is that, because this transmits gapless, the character framing becomes invisible if the first start bit is not observed. That means the receiving terminal can't work out what the transmitted characters are without some inserted gaps. A long gap at the beginning is enough.

Eg, 1/2 second delay for terminal startup:

		waitx	##CLOCKFREQ/2			'0.5 seconds
		mov	pb, #"x"
testing
		call	#putch
		jmp	#testing

Or wait for a keypress before transmitting:

		call	#getch
		mov	pb, #"x"
testing
		call	#putch
		jmp	#testing

pilot0315 · 2019-04-13 23:48

@evanh

Didn't see that. Changed it and am getting gobbledygook but that is better than nothing. Gonna work on it later.
Thanks.

evanh · 2019-04-14 01:08

Or echo what you type:

testing
		call	#getch
		call	#putch
		mov	pb, #" "			'add a space for frill
		call	#putch
		jmp	#testing

RossH · 2019-04-14 02:27

Hi Dave - just a question. Are you planning to have p2asm support the Encode and Decode operators?

From the P1 reference manual ...

|< Decode value (modulus of 32; 0-31) into single-high-bit long; p 160.
>| Encode long into magnitude (0 - 32) as high-bit priority; p 160.

I can work around this, so it's not a big problem.
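
For reference, the two operators as described in those manual entries can be sketched in Python like this. The function names are just for illustration, and this follows the P1 descriptions quoted above rather than any particular assembler's implementation:

```python
MASK32 = 0xFFFFFFFF

def decode(value):
    """|< : single-high-bit long; only the low 5 bits of value are used."""
    return 1 << (value & 31)

def encode(value):
    """>| : position of the highest set bit plus one; 0 for an input of 0."""
    return (value & MASK32).bit_length()

print(hex(decode(5)))                  # 0x20
print(encode(0x20))                    # 6
print(encode(0), encode(0x80000000))   # 0 32
```

Note that encode is not the exact inverse of decode: encoding a single-bit value gives the bit number plus one.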

Dave Hein · 2019-04-14 02:36

I added support for the decode operator a few months ago. I'll look into supporting the encode operator.

RossH · 2019-04-14 08:35

Dave Hein wrote: »

I added support for the decode operator a few months ago. I'll look into supporting the encode operator.

Thanks, Dave!

Dave Hein · 2019-04-14 14:32

I added support for the ">|" operator, and checked it into GitHub.

evanh · 2019-04-14 14:44

I could make use of such an operator/function in a Bash script right about now. I'm making do with a long-winded log(x)/log(2) via Awk instead.

RossH · 2019-04-16 00:07

Hi Dave

Found another "head scratching" incompatibility. The following code compiles with p2asm, but not with PNUT ...

DAT
 nop
.sym1 nop
.sym1 nop
 jmp #\.sym1

Here is the output ...

DAT
00000 000 00000000  nop
00004 001 00000000 .sym1 nop
00008 002 00000000 .sym1 nop
0000c 003 fd800001  jmp #\.sym1

Actually, does anyone have a description of precisely what the "." notation does for symbols - I am assuming it is supposed to be similar to what ":" used to do on the P1?

Dave Hein · 2019-04-16 01:11

The ":" notation that was used for the P1 was changed to "." for the P2. I'm not sure why this change was made, but I expressed my concerns about using "." at the time when this was being discussed. When p2asm is in the object mode, it supports some of the GAS directives that happen to start with the "." character. Using "." for local variables complicates things a bit, so I currently don't allow local labels in the object mode. I'll fix this eventually, and just not allow local labels that match the supported GAS directives.

p2asm handles local labels a bit differently than PNut does. In PNut, local labels can exist only between global variables. p2asm allows local labels to be defined without a preceding global label. The scope of the local label does end when a global label is encountered, just like with PNut. Clearly, allowing two local labels with the same name within the same scope is a bug. I'll look into fixing that. At some point I'll also probably adopt the same rules as PNut.

jmg · 2019-04-16 01:59

Dave Hein wrote: »

The ":" notation that was used for the P1 was changed to "." for the P2. I'm not sure why this change was made, but I expressed my concerns about using "." at the time when this was being discussed. When p2asm is in the object mode, it supports some of the GAS directives that happen to start with the "." character. Using "." for local variables complicates things a bit, so I currently don't allow local labels in the object mode. I'll fix this eventually, and just not allow local labels that match the supported GAS directives.

p2asm handles local labels a bit differently than PNut does. In PNut, local labels can exist only between global variables. p2asm allows local labels to be defined without a preceding global label. The scope of the local label does end when a global label is encountered, just like with PNut. Clearly, allowing two local labels with the same name within the same scope is a bug. I'll look into fixing that. At some point I'll also probably adopt the same rules as PNut.

It's also common for assemblers to let you define the local label prefix - maybe that's another, better solution?
E.g. this from an assembler manual:

$LOCALPREFIX (_)        ; local label prefix is _
                         ; Local labels generated with the previous global label + local label. Local labels must only be unique within the global labels and global label + local label must be unique.
_ThisIsLocalLabel:

; and the associated
$LOCALLIST    ;   Function: Enable local labels listing in symbol table. 
$NOLOCALLIST ; Disable

Dave Hein · 2019-04-16 02:23

I just need to change how I handle local labels. Concatenating the last global label with the local label seems to be the way to do it. The local label scope would then be identical to scope implemented by PNut.

EDIT: I ran a few tests with local labels in PNut, and it appears that they are limited to 26 characters instead of 30 characters, which is the limit for global labels. Of course, local labels are typically very short, so a 26-character limit isn't really a problem. I suspect that PNut internally concatenates local labels with a unique 4-character tag.

EDIT2: In my previous post I was wrong about local labels requiring a preceding global label in PNut. A local label can be defined before a global label, and its scope extends to the following global label.
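
The concatenation idea can be sketched in a few lines of Python. This is only an illustration of the scoping rule described above, not p2asm's actual code, and the function name is made up. Local labels seen before any global label share one initial scope, matching the EDIT2 observation.

```python
def scope_labels(labels):
    """Qualify each local label with the most recent global label.

    labels is a list of label names in source order; names starting
    with "." are local. Returns {qualified_name: line_number}.
    """
    table = {}
    current_global = ""   # locals before the first global share this scope
    for line_no, name in enumerate(labels, start=1):
        if name.startswith("."):
            key = current_global + name
        else:
            current_global = name
            key = name
        if key in table:
            raise ValueError(
                f"line {line_no}: duplicate label {key!r} "
                f"(first defined on line {table[key]})"
            )
        table[key] = line_no
    return table

# The same local name is fine under two different globals...
print(scope_labels(["start", ".loop", "next", ".loop"]))

# ...but two ".sym1" labels in one scope are rejected.
try:
    scope_labels([".sym1", ".sym1"])
except ValueError as err:
    print(err)
```

With this scheme the duplicate ".sym1" from RossH's test case is caught as an ordinary duplicate-symbol error, with no separate scope bookkeeping needed.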

pilot0315 · 2019-04-21 21:02

@evanh

Thanks for your help. I am getting serial terminal data on a rudimentary level.
Now working on decimal numbers.
Attempting to look at Dave Hein's code. Do you have any examples or suggestions?
Martin
