p2asm - Page 3 — Parallax Forums

p2asm


Comments

  • RossH Posts: 5,462
    Hi Dave - sorry to keep throwing these at you, but here's another couple of minor incompatibilities ...

    1. p2asm appears to allow "wzc" in place of "wcz" - i.e. the code compiles without error. But actually it silently ignores it, leading to much head scratching!

    2. p2asm accepts the "fit" directive but it seems to do nothing - i.e. it doesn't complain if the code doesn't actually fit.

    PNUT complains about both of these.
  • ersmith Posts: 6,053
    edited 2019-04-11 00:21
    evanh wrote: »
    Anyone looking at the LOC bug?
    EDIT: I've just updated it.

    Thanks for your test program. LOC and JMP below $400 are certainly difficult cases, and I think you've run into a mixture of assembler bugs, hardware misfeatures, and misunderstandings on all sides.

    One potential problem in your code is that you use raw labels rather than asking for hub addresses, so for example the COG code that does:
        loc pa, #loctest2
        ...
        orgh $110
    loctest2 long $deadbeef
    
    gets compiled as if it read
        loc pa, #$110
    
    which of course won't work because as you note in your comments the value $deadbeef is actually located at $44 in COG memory. Should the assemblers catch this mistake? In principle I think they could issue a warning, at least (since the label "loctest2" is after an "orgh" directive we should know it's a hub expression). I think none of them do so now though.

    Your program did point out a bug in fastspin, where it needs to divide the relative offset by 4 in COG mode. It looks like the hardware handles LOC and JMP differently in this regard; is that correct @cgracey ?

    Actually it rather looks like we always need to divide the LOC value by 4 when running in COG mode, even for HUB addresses. That makes using LOC on byte addresses problematic. Is that intentional? It seems like it may be a hardware restriction that programmers need to be aware of -- it means that we can't use the same LOC code in COG and HUB, and that COG code cannot use LOC to load a HUB byte or word address. Never mind, that was based on a misunderstanding of how Evan's code works.

    I think it would be prudent though for "loc" to default to absolute mode rather than relative; it might save trouble in the long run.

    Regards,
    Eric
  • evanh Posts: 15,916
    edited 2019-04-11 00:29
    Eric,
    That example LOC is very much intended to be used as a hub address. There is valid data access hubram below $400. Byte addressing is correct there.

    Also, the first LOC is a cog address, 8, which also has the machine code encoded wrongly.

    All three assemblers produce the same.

    Only the third LOC, above hub $400, is encoded correctly.
  • evanh Posts: 15,916
    Hubexec code assembles correctly. It's only cogexec that has the problem. I haven't tested lutram but I expect it to be the same as cogram.
  • evanh Posts: 15,916
    edited 2019-04-11 01:59
    ersmith wrote: »
    I think it would be prudent though for "loc" to default to absolute mode rather than relative; it might save trouble in the long run.
    Good question. Looking at the hubexec results - https://forums.parallax.com/discussion/comment/1457051/#Comment_1457051 I can see it has done an absolute encoding of $110 when it didn't need to.

    It would be nice to retain the relative function. It can be done as Chip intended, I think. So even the hubexec encoding needs a little rework for data references below $400.
  • evanh Posts: 15,916
    ersmith wrote: »
    gets compiled as if it read
        loc pa, #$110
    
    which of course won't work because as you note in your comments the value $deadbeef is actually located at $44 in COG memory. Should the assemblers catch this mistake? In principle I think they could issue a warning, at least (since the label "loctest2" is after an "orgh" directive we should know it's a hub expression). I think none of them do so now though.

    Your program did point out a bug in fastspin, where it needs to divide the relative offset by 4 in COG mode. It looks like the hardware handles LOC and JMP differently in this regard; is that correct @cgracey ?
    All of that is related. Yes, there is a distinct difference between data addressing and program addressing. Program address encoding is byte scaled in all domains, but data addressing is longword scaled for cog space while byte scaled for hub space.

    A LOC instruction has the encoding features of program addressing but is more likely to be used for data addressing ...
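The factor of four between the two scales can be checked with a short Python sketch. It uses the $44 / $110 pair from the quoted post and assumes the image is loaded into the cog starting from hub address 0. The function names are hypothetical, and this models only the address arithmetic, not the LOC instruction encoding itself.

```python
# Cog registers are addressed per long (4 bytes); hub RAM is addressed per byte.

def cog_long_to_byte_offset(reg_index: int) -> int:
    """Byte offset of a cog register, counted from the start of the cog image."""
    return reg_index * 4


def byte_offset_to_cog_long(byte_offset: int) -> int:
    """Cog register index holding the long at a long-aligned byte offset."""
    if byte_offset % 4 != 0:
        raise ValueError("byte offset is not long-aligned")
    return byte_offset // 4


print(hex(cog_long_to_byte_offset(0x44)))   # register $44 as a byte offset
print(hex(byte_offset_to_cog_long(0x110)))  # byte offset $110 as a register index
```

A label placed after "orgh $110" therefore names the same long that cog code sees as register $44, which is why mixing the two scales gives a wrong address.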
  • RossH wrote: »
    Hi Dave - sorry to keep throwing these at you, but here's another couple of minor incompatibilities ...

    1. p2asm appears to allow "wzc" in place of "wcz" - i.e. the code compiles without error. But actually it silently ignores it, leading to much head scratching!

    2. p2asm accepts the "fit" directive but it seems to do nothing - i.e. it doesn't complain if the code doesn't actually fit.

    PNUT complains about both of these.

    p2asm now prints an error message if an invalid symbol is found when expecting wc, wz or wcz. I also fixed the FIT directive.
  • evanh Posts: 15,916
    ersmith wrote: »
    gets compiled as if it read
        loc pa, #$110
    
    I guess I should point out that that would have been a correct answer but the buggy relative address is really $408 for some reason. Producing an absolute address of $416, not the $110 it should be.
  • evanh Posts: 15,916
    edited 2019-04-11 03:48
    I doubt there is any legit reason to want to LOC a cog data address. MOV immediate is just as compact there. If LOC gets assembled as always calculated to a program address (byte scaled) then that would be the easy fix.

    It would then get treated the same as the existing branch long-immediate encoding.
  • evanh Posts: 15,916
    evanh wrote: »
    Yes, there is a distinct difference between data addressing and program addressing. Program address encoding is byte scaled in all domains, but data addressing is longword scaled for cog space while byte scaled for hub space.
    Oh, that's only true for long-immediate program addressing. Cogexec register direct addressing, for example, doesn't do that. Bit of a brain twister but for hubRAM addressing it's all byte scaled. That's where LOC wants to be used.
  • RossH Posts: 5,462
    Dave Hein wrote: »
    p2asm now prints an error message if an invalid symbol is found when expecting wc, wz or wcz. I also fixed the FIT directive.

    Much appreciated, Dave!
  • @evanh

    question?

    I am attempting to tease out your code to the serial terminal. Can you separate it out so I can access it?
    Your putch appears to be simple. I was not part of the FPGA development. I would like to develop a simple subroutine to get to the serial terminal.
    thanks
  • evanh Posts: 15,916
    edited 2019-04-13 04:27
    The routines, putch and getch, are simple, yes. The naming was from Dave Hein's bit-bashing routines that I started from.
    '===============================================
    '  input:  (none)
    ' result:  pb
    'scratch:  (none)
    '
    getch
    		testp	#rx_pin		wz	'byte received? (IN high == yes)
    if_z		rdpin	pb, #rx_pin		'get data
    if_z		shr	pb, #32-8		'shift the data to bottom of register
    if_z		ret			wcz	'restore C/Z flags of calling routine
    		jmp	#getch			'wait while Smartpin is idle
    
    
    '===============================================
    '  input:  pb
    ' result:  (none)
    'scratch:  (none)
    '
    putch
    		rqpin	inb, #tx_pin	wc	'transmitting? (C high == yes)  *Needed to initiate tx
    		testp	#tx_pin		wz	'buffer free? (IN high == yes)
    if_z_or_nc	wypin	pb, #tx_pin		'write new byte to Y buffer
    if_z_or_nc	ret			wcz	'restore C/Z flags of calling routine
    		jmp	#putch			'wait while Smartpin is both full (nz) and transmitting (c)
    
    


    Okay, so the two smartpins first need to be configured before they operate as a comport. This is in the _diaginit routine.
    '----- Configure diag comport to use smartpins instead of bit-bashing -----
    		wrpin	#%00_11111_0, #rx_pin		'Asynchronous serial receive
    		wxpin	##ASYNCFG, #rx_pin		'set baudrate and framing
    		dirh	#rx_pin
    		wrpin	#%01_11110_0, #tx_pin		'Asynchronous serial transmit mode
    		wxpin	##ASYNCFG, #tx_pin		'set X with baudrate and framing
    		dirh	#tx_pin
    '		wypin	#0, #tx_pin			'trigger first tx ready state (not needed if dual checked)
    							'single check is buffer full only, dual checking adds in tx flag
    
    

    ASYNCFG is a preset constant defined at the start of the source. It is dependent on the prop2 system clock frequency, which is also preset in the same group of constants.

    Example of 115200 baud and sysclock of 80 MHz from a crystal of 20 MHz.
    CON
    	XTALFREQ	= 20_000_000				'PLL stage 0: crystal frequency
    	XDIV		= 2					'PLL stage 1: crystal divider
    	XMUL		= 8					'PLL stage 2: crystal / div * mul
    	XDIVP		= 1					'PLL stage 3: crystal / div * mul / divp (1,2,4..30)
    
    	XOSC		= %10                             'OSC    ' %00=OFF, %01=OSC, %10=15pF, %11=30pF
    	XSEL		= %11                             'XI+PLL ' %00=rcfast(20+MHz), %01=rcslow(~20KHz), %10=XI(5ms), %11=XI+PLL(10ms)
    	XPPPP		= ((XDIVP>>1) + 15) & $F                  ' 1->15, 2->0, 4->1, 6->2...30->14
    	CLOCKFREQ	= round(float(XTALFREQ) / float(XDIV) * float(XMUL) / float(XDIVP))
    	SETFREQ		= 1<<24 + (XDIV-1)<<18 + (XMUL-1)<<8 + XPPPP<<4 + XOSC<<2
    	ENAFREQ		= SETFREQ + XSEL                          ' %0000_000e_dddddd_mmmmmmmmmm_pppp_cc_ss  ' enable oscillator
    
    
    'Serial presets for data logging
    	rx_pin		= 63
    	tx_pin		= 62
    	BAUDRATE	= 115_200
    	ASYNCFG		= round(float(CLOCKFREQ) * 64.0 / float(BAUDRATE))<<10 + 7	'bitrate format is 16.6<<10, 8N1 framing
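
The constants above can be evaluated by hand; the following Python sketch does the same arithmetic for the 20 MHz crystal / 115200 baud example. It assumes Spin's operator precedence, where << binds tighter than + (the opposite of Python, hence the extra parentheses), and that ROUND behaves like round-half-up for positive values. spin_round is a hypothetical helper, not part of any toolchain.

```python
# Python model of the Spin CON arithmetic for the 80 MHz / 115200 baud example.

XTALFREQ = 20_000_000
XDIV = 2
XMUL = 8
XDIVP = 1
XOSC = 0b10
XSEL = 0b11
BAUDRATE = 115_200


def spin_round(x: float) -> int:
    """Round a positive value half-up (assumed equivalent to Spin's ROUND here)."""
    return int(x + 0.5)


XPPPP = ((XDIVP >> 1) + 15) & 0xF
CLOCKFREQ = spin_round(XTALFREQ / XDIV * XMUL / XDIVP)
# Each shift is parenthesised so it is applied before the additions.
SETFREQ = (1 << 24) + ((XDIV - 1) << 18) + ((XMUL - 1) << 8) + (XPPPP << 4) + (XOSC << 2)
ENAFREQ = SETFREQ + XSEL
ASYNCFG = (spin_round(CLOCKFREQ * 64.0 / BAUDRATE) << 10) + 7

print(f"CLOCKFREQ = {CLOCKFREQ}")
print(f"SETFREQ   = ${SETFREQ:08X}")
print(f"ENAFREQ   = ${ENAFREQ:08X}")
print(f"ASYNCFG   = ${ASYNCFG:08X}")
print(f"clocks per bit = {(ASYNCFG >> 10) / 64}")
```

The upper field of ASYNCFG is the bit period in sysclock ticks scaled by 64 (the "16.6" format in the comment), so 44444 / 64 is about 694.4 clocks per bit. The low field of 7 goes with the "8N1 framing" comment.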
    
    
  • @evanh
    Hello,
    I copied your code and have time now to work with it. I am getting an error that I assume is not right. Looking at your loc-cbug code shows similar stuff.
    Trying to just print a single character.
    Any ideas? When you get a chance.
    Thanks

  • evanh Posts: 15,916
    You've placed everything in the CON section. You need to add a DAT section for assembly code to go in.

    Eg:
    CON
    	rx_pin		= 63
    	tx_pin		= 62
    
    DAT
    		wrpin	#%00_11111_0, #rx_pin		'Asynchronous serial receive
    
    
  • evanh Posts: 15,916
    Oh, and there is more code in the _diaginit routine that sets the sysclock rate as well. I don't think pnut has any clock setting features.
    'Silicon frequencies
    		hubset	clk_mode			'switch to RCFAST using known prior mode
    		mov	clk_mode, ##SETFREQ		'replace old with new
    '		mov	clk_freq, ##CLOCKFREQ		'optional, if clk_freq used
    		hubset	clk_mode			'setup for new mode, still RCFAST
    		waitx	##20_000_000/100		'~10ms for crystal/PLL to come up to speed
    		hubset	##ENAFREQ			'engage
    
    
  • evanh Posts: 15,916
    edited 2019-04-13 22:45
    And that, in turn, uses the conventionally reserved config longwords located at hub addresses $10 to $1f.

    So, recommend to use this at the start of your testing:
    CON
    	XTALFREQ	= 20_000_000				'PLL stage 0: crystal frequency
    	XDIV		= 2					'PLL stage 1: crystal divider
    	XMUL		= 8					'PLL stage 2: crystal / div * mul
    	XDIVP		= 1					'PLL stage 3: crystal / div * mul / divp (1,2,4..30)
    
    	XOSC		= %10                             'OSC    ' %00=OFF, %01=OSC, %10=15pF, %11=30pF
    	XSEL		= %11                             'XI+PLL ' %00=rcfast(20+MHz), %01=rcslow(~20KHz), %10=XI(5ms), %11=XI+PLL(10ms)
    	XPPPP		= ((XDIVP>>1) + 15) & $F                  ' 1->15, 2->0, 4->1, 6->2...30->14
    	CLOCKFREQ	= round(float(XTALFREQ) / float(XDIV) * float(XMUL) / float(XDIVP))
    	SETFREQ		= 1<<24 + (XDIV-1)<<18 + (XMUL-1)<<8 + XPPPP<<4 + XOSC<<2
    	ENAFREQ		= SETFREQ + XSEL                          ' %0000_000e_dddddd_mmmmmmmmmm_pppp_cc_ss  ' enable oscillator
    
    
    'Serial presets for data logging
    	rx_pin		= 63
    	tx_pin		= 62
    	BAUDRATE	= 115_200
    	ASYNCFG		= round(float(CLOCKFREQ) * 64.0 / float(BAUDRATE))<<10 + 7	'bitrate format is 16.6<<10, 8N1 framing
    
    
    
    DAT
    		jmp     #_diaginit
    		long	0[3]
    '--------------------------------------------------------
    '***  Boot-loader can fill all four of the following  ***
    '--------------------------------------------------------
    spare1		long	0			'hubRAM addr $010 - compatible reserved for system variable
    clk_freq	long	CLOCKFREQ		'hubRAM addr $014 - sysclock frequency, integer frequency in hertz
    clk_mode	long	0			'hubRAM addr $018 - clock mode config word, used directly in HUBSET
    asyn_baud	long	BAUDRATE		'hubRAM addr $01c - comport baud rate, integer baud in hertz
    '--------------------------------------------------------
    
    _diaginit
    'Silicon frequencies
    		hubset	clk_mode			'switch to RCFAST using known prior mode
    		mov	clk_mode, ##SETFREQ		'replace old with new
    '		mov	clk_freq, ##CLOCKFREQ		'optional, if clk_freq used
    		hubset	clk_mode			'setup for new mode, still RCFAST
    		waitx	##20_000_000/100		'~10ms for crystal/PLL to come up to speed
    		hubset	##ENAFREQ			'engage
    
    '----- Configure diag comport to use smartpins instead of bit-bashing -----
    		wrpin	#%00_11111_0, #rx_pin		'Asynchronous serial receive
    		wxpin	##ASYNCFG, #rx_pin		'set baudrate and framing
    		dirh	#rx_pin
    		wrpin	#%01_11110_0, #tx_pin		'Asynchronous serial transmit mode
    		wxpin	##ASYNCFG, #tx_pin		'set X with baudrate and framing
    		dirh	#tx_pin
    '		wypin	#0, #tx_pin			'trigger first tx ready state (not needed if dual checked)
    							'single check is buffer full only, dual checking adds in tx flag
    
    
  • evanh Posts: 15,916
    edited 2019-04-13 23:18
    And the last problem is that, because this transmits gapless, the character framing becomes invisible if the first start bit is not observed. This means the receiving terminal can't work out what the transmitted characters are without some inserted gaps. A long gap at the beginning is enough.

    Eg, 1/2 second delay for terminal startup:
    		waitx	##CLOCKFREQ/2			'0.5 seconds
    		mov	pb, #"x"
    testing
    		call	#putch
    		jmp	#testing
    

    Or wait for a keypress before transmitting:
    		call	#getch
    		mov	pb, #"x"
    testing
    		call	#putch
    		jmp	#testing
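
As a rough check of the timing involved, the Python sketch below works out the WAITX count for the half-second delay and the length of one 8N1 character at 115200 baud. It assumes the 80 MHz sysclock from the constants earlier in the thread; both function names are hypothetical.

```python
CLOCKFREQ = 80_000_000   # sysclock in Hz (assumed from the 20 MHz crystal example)
BAUDRATE = 115_200


def ticks_for_delay(clock_hz: int, seconds: float) -> int:
    """Sysclock ticks needed to wait for the given time."""
    return round(clock_hz * seconds)


def frame_time_us(baud: int, bits_per_frame: int = 10) -> float:
    """Duration of one character in microseconds (8N1 = start + 8 data + stop)."""
    return bits_per_frame / baud * 1_000_000


print(ticks_for_delay(CLOCKFREQ, 0.5))      # same value as ##CLOCKFREQ/2
print(round(frame_time_us(BAUDRATE), 1))    # microseconds per 8N1 character
```

The half-second idle period is several thousand character times long, so a terminal that opens during it sees a clean idle line before the first start bit.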
    

  • @evanh

    Didn't see that. Changed it and am getting gobbledygook, but that is better than nothing. Gonna work on it later.
    Thanks.

  • evanh Posts: 15,916
    edited 2019-04-14 01:08
    Or echo what you type:
    testing
    		call	#getch
    		call	#putch
    		mov	pb, #" "			'add a space for frill
    		call	#putch
    		jmp	#testing
    
  • RossH Posts: 5,462
    Hi Dave - just a question. Are you planning to have p2asm support the Encode and Decode operators?

    From the P1 reference manual ...
    |< Decode value (modulus of 32; 0-31) into single-high-bit long; p 160.
    >| Encode long into magnitude (0 - 32) as high-bit priority; p 160.

    I can work around this, so it's not a big problem.
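
For reference, the two operators as described in the manual excerpt above can be modelled in a few lines of Python. This is a sketch of the documented behaviour on 32-bit values, not of how p2asm or PNut implement it; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def decode(value: int) -> int:
    """|< : turn a bit number (taken modulo 32) into a long with only that bit set."""
    return 1 << (value & 31)


def encode(value: int) -> int:
    """>| : position of the highest set bit plus one, or 0 if no bit is set."""
    return (value & MASK32).bit_length()


print(hex(decode(4)))      # single bit at position 4
print(encode(0b0100))      # highest set bit is bit 2
```

Note that encode(decode(n)) gives n + 1 rather than n, because the encode result ranges over 0 to 32 with 0 reserved for "no bits set".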
  • I added support for the decode operator a few months ago. I'll look into supporting the encode operator.
  • RossH Posts: 5,462
    Dave Hein wrote: »
    I added support for the decode operator a few months ago. I'll look into supporting the encode operator.

    Thanks, Dave!
  • I added support for the ">|" operator, and checked it into GitHub.
  • evanh Posts: 15,916
    I could make use of such an operator/function in a Bash script right about now. I'm making do with a long-winded log(x)/log(2) via Awk instead.
  • RossH Posts: 5,462
    Hi Dave

    Found another "head scratching" incompatibility. The following code compiles with p2asm, but not with PNUT ...
    DAT
     nop
    .sym1 nop
    .sym1 nop
     jmp #\.sym1
    

    Here is the output ...
    DAT
    00000 000 00000000  nop
    00004 001 00000000 .sym1 nop
    00008 002 00000000 .sym1 nop
    0000c 003 fd800001  jmp #\.sym1
    

    Actually, does anyone have a description of precisely what the "." notation does for symbols - I am assuming it is supposed to be similar to what ":" used to do on the P1?
  • The ":" notation that was used for the P1 was changed to "." for the P2. I'm not sure why this change was made, but I expressed my concerns about using "." at the time when this was being discussed. When p2asm is in the object mode, it supports some of the GAS directives that happen to start with the "." character. Using "." for local labels complicates things a bit, so I currently don't allow local labels in the object mode. I'll fix this eventually, and just not allow local labels that match the supported GAS directives.

    p2asm handles local labels a bit differently than PNut does. In PNut, local labels can exist only between global labels. p2asm allows local labels to be defined without a preceding global label. The scope of the local label does end when a global label is encountered, just like with PNut. Clearly, allowing two local labels with the same name within the same scope is a bug. I'll look into fixing that. At some point I'll also probably adopt the same rules as PNut.
  • jmg Posts: 15,173
    edited 2019-04-16 02:00
    Dave Hein wrote: »
    The ":" notation that was used for the P1 was changed to "." for the P2. I'm not sure why this change was made, but I expressed my concerns about using "." at the time when this was being discussed. When p2asm is in the object mode, it supports some of the GAS directives that happen to start with the "." character. Using "." for local labels complicates things a bit, so I currently don't allow local labels in the object mode. I'll fix this eventually, and just not allow local labels that match the supported GAS directives.

    p2asm handles local labels a bit differently than PNut does. In PNut, local labels can exist only between global labels. p2asm allows local labels to be defined without a preceding global label. The scope of the local label does end when a global label is encountered, just like with PNut. Clearly, allowing two local labels with the same name within the same scope is a bug. I'll look into fixing that. At some point I'll also probably adopt the same rules as PNut.

    It's also common for assemblers to let the user define the local-label prefix - maybe that's another, better solution?
    E.g. this from an assembler manual:
    $LOCALPREFIX (_)        ; local label prefix is _
                             ; Local labels generated with the previous global label + local label. Local labels must only be unique within the global labels and global label + local label must be unique.
    _ThisIsLocalLabel:
    
    ; and the associated
    $LOCALLIST    ;   Function: Enable local labels listing in symbol table. 
    $NOLOCALLIST ; Disable
    
  • Dave Hein Posts: 6,347
    edited 2019-04-16 02:55
    I just need to change how I handle local labels. Concatenating the last global label with the local label seems to be the way to do it. The local label scope would then be identical to scope implemented by PNut.

    EDIT: I ran a few tests with local labels in PNut, and it appears that they are limited to 26 characters instead of 30 characters, which is the limit for global labels. Of course, local labels are typically very short, so a 26-character limit isn't really a problem. I suspect that PNut internally concatenates local labels with a unique 4-character tag.

    EDIT2: In my previous post I was wrong about local labels requiring a preceding global label in PNut. A local label can be defined before a global label, and its scope extends to the following global label.
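
The concatenation idea can be sketched in Python as below. This is only an illustration of the scoping rule described above, not p2asm's or PNut's actual code. It assumes labels start in column 0, instructions are indented, and symbol names are case-insensitive; check_labels and SECTION_KEYWORDS are hypothetical names.

```python
SECTION_KEYWORDS = {"con", "dat", "var", "obj", "pub", "pri"}


def check_labels(lines):
    """Return a list of error messages for duplicate labels.

    A local label (leading ".") is stored as <last global label> + <local label>,
    so the same local name may be reused after the next global label.
    """
    seen = {}      # qualified label name -> line number of its definition
    errors = []
    scope = ""     # most recent global label; empty before the first one
    for lineno, line in enumerate(lines, start=1):
        if not line.strip() or line[0].isspace():
            continue                      # blank line or no label in column 0
        label = line.split()[0].lower()
        if label in SECTION_KEYWORDS:
            continue
        if label.startswith("."):
            qualified = scope + label     # local: qualify with the current scope
        else:
            scope = label                 # global: starts a new local scope
            qualified = label
        if qualified in seen:
            errors.append(f"line {lineno}: {label} already defined on line {seen[qualified]}")
        else:
            seen[qualified] = lineno
    return errors


example = ["DAT", " nop", ".sym1 nop", ".sym1 nop", r" jmp #\.sym1"]
for message in check_labels(example):
    print(message)
```

With this rule the example from RossH is rejected because both ".sym1" definitions fall in the same scope, while the same local name on either side of a global label is accepted.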
  • @evanh

    Thanks for your help. I am getting serial terminal data on a rudimentary level.
    Now working on decimal numbers.
    Attempting to look at Dave Hein's code. Do you have any examples or suggestions?
    Martin