PNut Problem with CALLD when calling from a cog address to another cog address — Parallax Forums

If I assemble the following program with PNut I get the wrong instruction encoding for CALLD.
dat
        orgh    0
        org     0
entry
        'jmp     #start
        calld   adra, #start
        'long    $fe100004
        jmp     #entry
start   or      dirb, #1
.loop   waitx   ##50000000
        xor     outb, #1
        jmp     #.loop
PNut generates the code $fe0ffffe, which is an absolute jump to location $ffffe. If I comment out the CALLD and uncomment the JMP, I get $fd900004, which is a relative jump to the correct address. I believe the CALLD should be generating $fe100004. If I comment out both the CALLD and the JMP and uncomment the LONG, the program seems to work correctly.
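To make the three encodings above easier to compare, here is a minimal Python sketch that splits each word into fields. It assumes the 20-bit branch layout `CCCC ooooooo R nnnnnnnnnnnnnnnnnnnn` (the same shape as the `CCCC 11100ww Rnn ...` pattern quoted later in this thread) and assumes R=1 means relative, which is inferred from the description of $fd900004 as a relative jump. The function name `decode_long_branch` is made up for illustration.

```python
def decode_long_branch(word):
    """Split a 32-bit word using the assumed layout
    CCCC ooooooo R nnnnnnnnnnnnnnnnnnnn (4 + 7 + 1 + 20 bits)."""
    addr = word & 0xFFFFF
    return {
        "cond": (word >> 28) & 0xF,
        "opcode": (word >> 21) & 0x7F,
        "relative": (word >> 20) & 1,  # assumption: 1 = relative, 0 = absolute
        "addr20": addr,
        # the same 20 bits read as a two's-complement value
        "signed20": addr - 0x100000 if addr & 0x80000 else addr,
    }


if __name__ == "__main__":
    for w in (0xFE0FFFFE, 0xFE100004, 0xFD900004):
        f = decode_long_branch(w)
        print(f"${w:08x}: opcode={f['opcode']:07b} R={f['relative']} "
              f"addr=${f['addr20']:05x} signed={f['signed20']}")
```

Under these assumptions, $fe0ffffe has R=0 with a 20-bit field of $ffffe (which is -2 if read as signed), while $fe100004 and $fd900004 both have R=1 and a field of 4. A value of 4 is consistent with a byte offset from the instruction after address 0 to `start` at cog address 2, but the thread does not spell out that calculation.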

Comments

  • ozpropdev Posts: 2,793
    edited 2015-12-19 00:42
    The correct encoding for CALLD is
    CCCC 1010101 CZI DDDDDDDDD SSSSSSSSS        CALLD   D,S/#rel9   {WC,WZ}
    
    Pnut encodes CALLD fine for all cog locations except the following:
    	ADRA	1f6
    	ADRB	1f7
    	PTRA	1f8
    	PTRB	1f9
    	INA	1fe
    	INB	1ff
    
    I assume it may be related to some of the aliases
    RETI0                           =       CALLD   INB,INB     WC,WZ
    RETI1                           =       CALLD   INB,$1F5    WC,WZ
    RETI2                           =       CALLD   INB,$1F3    WC,WZ
    RETI3                           =       CALLD   INB,$1F1    WC,WZ
    
    RESI0                           =       CALLD   INA,INB     WC,WZ
    RESI1                           =       CALLD   $1F4,$1F5   WC,WZ
    RESI2                           =       CALLD   $1F2,$1F3   WC,WZ
    RESI3                           =       CALLD   $1F0,$1F1   WC,WZ
    


  • Here's the code I verified Pnut with and tested on a P123_A9 board.
    con
    	myret = $123		'works for all except 1f6..1f9,1fe,1ff
    
    dat
            orgh    0
            org 
    
    	or	dirb,#$ff
    	calld	myret,#@s1
    	calld	myret,#@s2
    	calld	myret,#@s3
    	calld	myret,#@s4
    	setb	outb,#7		'done
    here	jmp	#here
    
    s1	setb	outb,#0
    	jmp	myret
    
    s2	setb	outb,#1
    	jmp	myret
    
    s3	setb	outb,#2
    	jmp	myret
    
    s4	setb	outb,#3
    	jmp	myret
    



  • @ozpropdev, there are two encodings for CALLD.
    CCCC 1010101 CZI DDDDDDDDD SSSSSSSSS        CALLD   D,S/#rel9   {WC,WZ}
    CCCC 11100ww Rnn nnnnnnnnn nnnnnnnnn        CALLD   reg,#abs/#rel
    
    The first is used to cover the entire range of cog memory for the destination register, and either use a cog memory location for the source, or specify a limited range of -256 to 255. The second encoding is used to cover a large jump range with either an absolute or relative address. The destination register only has a range from $1f6 to $1f9, which is the same as ADRA, ADRB, PTRA and PTRB. The aliases for RETIx and RESIx use the first form of CALLD.

    Your examples of "calld myret,#@sx" also use the first form of CALLD since myret is $123, and is outside of the range of $1f6 to $1f9. In my case I'm using ADRA for the destination register, so PNut uses the second form of CALLD to provide a larger range. I tried "calld adra, #@start" in my code, and it does give the correct encoding. However, I don't feel that the "@" should be used. It is not used in the case of "jmp #start", which encodes correctly. It also doesn't make sense if I were to do "calld adra, #\@start", which forces an absolute address. If my cog code were not located starting at hub address zero this would give the wrong address.
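The choice between the two CALLD forms described above can be sketched in Python. This is only an illustration of the rule as stated in this post, not PNut's actual logic: the names `calld_form` and `fits_rel9` are hypothetical, and the mapping ww = D - $1f6 is an assumption (it is consistent with ww=00 in $fe100004 for ADRA, but the thread does not confirm it).

```python
# Registers that the second (20-bit) CALLD form can use as destination,
# per the post above.
LONG_FORM_REGS = {0x1F6: "ADRA", 0x1F7: "ADRB", 0x1F8: "PTRA", 0x1F9: "PTRB"}


def calld_form(dest, immediate):
    """Return (form, ww) for a CALLD with the given destination register.

    immediate is True for '#address' sources, False for register sources.
    """
    if immediate and dest in LONG_FORM_REGS:
        # assumption: ww simply indexes the four registers in address order
        return ("reg,#abs/#rel", dest - 0x1F6)
    return ("D,S/#rel9", None)


def fits_rel9(offset):
    """True if a relative offset fits the first form's range of -256..255."""
    return -256 <= offset <= 255


if __name__ == "__main__":
    print(calld_form(0x123, True))   # myret = $123
    print(calld_form(0x1F6, True))   # ADRA
    print(calld_form(0x1FF, False))  # register source, as in the RETIx aliases
```

With these rules, `calld myret,#...` stays in the 9-bit form and `calld adra,#...` selects the 20-bit form, which matches the behaviour described in the two posts above.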
  • ozpropdev Posts: 2,793
    edited 2015-12-19 03:59
    Oh yes, I forgot about the OTHER CALLD.
    It works for me too but note the ORGH $400 in the following code was needed.
    The absolute address needs to be $400 or higher.
    dat
            orgh    0
            org 
    
    	or	dirb,#$ff
    	calld	adra,#s1
    	calld	adra,#s2
    	calld	adra,#s3
    	calld	adra,#s4
    	setb	outb,#7		'done
    here	jmp	#here
    
    
    	orgh	$400
    
    s1	setb	outb,#0
    	jmp	adra
    
    s2	setb	outb,#1
    	jmp	adra
    
    s3	setb	outb,#2
    	jmp	adra
    
    s4	setb	outb,#3
    	jmp	adra
    




  • cgracey Posts: 14,210
    I'll get this fixed in the next release. Thanks for finding this. I'm just on the cusp of testing the smart pins.
  • Hi Chip
    I've noticed a couple of issues when trying to use the CALLD instruction (rel9) variant (Pnut_v11a).
    The first issue is simply that the WC and WZ effects are not allowed according to Pnut, yet
    the opcode has these bits defined.
    CCCC 1011001 CZI DDDDDDDDD SSSSSSSSS   *    CALLD   D,S/#rel9   {WC,WZ}
    CCCC 11100ww Rnn nnnnnnnnn nnnnnnnnn        CALLD   reg,#abs/#rel
    

    Other instructions that support "rel9" addressing don't require the "@" symbol (e.g. DJNZ).
    The second issue is that Pnut returns different results depending on whether the code
    is defined as COGEXEC or HUBEXEC.

    Here are three different program configurations and their responses from Pnut.

    'Configuration #1 COGEXEC calling cog memory (Hub address 0)
    '@ symbol required by Pnut 
    
    dat	org
    
    'start	calld	0,#fred		'Relative address is not aligned with instruction
    start	calld	0,#@fred	'OK
    	mov	0,#20
    there	outn	#32
    	waitx	##20_000_000
    	djnz	0,#there
    	cogstop	#0
    
    fred	dirh	#32
    	jmp	0
    
    'Configuration #2 COGEXEC calling cog memory  (Hub address > $3FF)
    'Both CALLD D,#rel9 variants error
    
    dat	org
    
    	loc	ptra,#@start
    	coginit	#0,ptra
    
    	orgh	$1000
    	org
    
    'start	calld	0,#fred 	'Relative address is not aligned with instruction
    start	calld	0,#@fred 	'Register must be PTRA,PTRB,ADRA,ADRB
    
    	mov	0,#20
    there	outn	#33
    	waitx	##20_000_000
    	djnz	0,#there
    	cogstop	#0
    
    fred	dirh	#33
    	jmp	0
    
    'Configuration #3 HUBEXEC calling hub memory
    'Both CALLD D,#rel9 variants work OK
    
    dat	org
    
    	loc	ptra,#@start
    	coginit	#0,ptra
    
    	orgh	$1000
    
    start	calld	0,#fred  	'OK
    'start	calld	0,#@fred	'OK
    
    	mov	0,#20
    there	outn	#34
    	waitx	##20_000_000
    	djnz	0,#there
    	cogstop	#0
    
    fred	dirh	#34
    	jmp	0
    



  • cgracey Posts: 14,210
    I've been going over this addressing stuff today. There were some problems with the assembler which I'm still working on. It's going to take me a day, or so, to get this straightened out.
  • jmg Posts: 15,175
    cgracey wrote: »
    I've been going over this addressing stuff today. There were some problems with the assembler which I'm still working on. It's going to take me a day, or so, to get this straightened out.

    and earlier
    cgracey wrote: »
    Twice today, I was bitten by relative addressing.
    I think I need to change 20-bit branches originating from and going to $00000..$003FF to absolute addressing.
    Also, all LOC addressing from within cog space needs to be absolute, for sure. I don't know how I didn't see that coming.
    Maybe some of you have realized that you need to put a "\" before some addresses to get them to assemble sensibly. These kinds of problems are draining to deal with.
    I'm going to revisit all addressing cases and make sure they are set up properly. Sorry for any exasperation you might have experienced.
    ....
    cgracey wrote: »
    I finally got PNut.exe working with the new loader. I had forgotten to use a # on a constant in the ROM loader and it took me two days to figure out what the problem was. I hope to get a new release out tomorrow.

    I have to mention that these issues around P2 assembler syntax, which confuse even its creator, do worry me...

    Is it a good time to do a clean-up pass on the assembler, so it is less way-out-in-left-field and easier to work in?

    Seems the P2 is complex enough without adding the confusion of a non-standard assembler on top?



  • cgracey Posts: 14,210
    edited 2016-09-18 00:57
    I've been thinking the same thing, jmg. I'm hoping that with an easy-to-understand set of rules, this madness can be contained. I think I might have it now. Too much complexity does drive people away. The Prop1 was very simple and direct and easy to step into. There is quite a lot more here to deal with. It feels a little like x86 to me.
  • Cluso99 Posts: 18,069
    While thinking, what about a simple P1++ in the meantime?
    <goes offline for cover>
  • jmg Posts: 15,175
    cgracey wrote: »
    I've been thinking the same thing, jmg. I'm hoping that with an easy-to-understand set of rules, this madness can be contained. I think I might have it now. Too much complexity does drive people away. The Prop1 was very simple and direct and easy to step into. There is quite a lot more here to deal with. It feels a little like x86 to me.
    Doing a quick refresh on google, I find some possible 'borrow ideas from' sources like
    NASM - looks active, but is rather x86 centric
    FASM - has an interesting new direction, called FASMG

    https://flatassembler.net/download.php

    The flat assembler g is the new assembly engine designed to become a successor of the one used by flat assembler 1. It does not have a built-in support for x86 instructions. It is a generic assembler that can be used in place of flat assembler 1 in applications where only the pure macroinstruction engine is needed instead of x86 encoder, for example when an instruction set for a different architecture is defined through macroinstructions. However it still has the ability of self-hosting, thanks to the included set of macroinstructions implementing basic x86 instructions and formats.
    flat assembler g 0.98 size: 235 kilobytes last update: 14 Sep 2016 13:13:53 UTC
    This release contains executables for 32-bit Linux and Windows. It contains examples of macroinstructions that allow assembly of simple programs for the architectures like x86, x64, 8052, AVR, or Java Virtual Machine.


    The multiple-targets angle of this has real appeal. It's not large, so I'll download it and have a play around...


  • jmg Posts: 15,175
    edited 2016-09-18 22:55
    I am quite impressed with FASMG.

    Downloaded what is a very small assembler exe - fasmg.exe is 47.5k + examples = just 236k
    Ran tests on examples for 8051, 8048, ez80, x86
    Speed indication is largest eZ80 test file of 240k, 8912 lines, reports 1.0s
    (plus that reads 1097 lines of macro defines, via simple include 'ez80.inc' )

    eg This source code from 8051 test
    include 'hex.inc'
    include '8051.inc'
    AliasNameR7 EQU R7	
    	
    	DEC	  A
    	DEC   R0
    	DEC   @R1
    	dec   AliasNameR7
    	DEC   0x55
    	DEC   55H
    	DEC    $55
    

    uses this part of '8051.inc' Macro definition file,
    macro DEC? operand
    	local value
    	match =A?, operand
    		db 14h
    	else
    		value = +operand
    		if value relativeto 0
    			db 15h,value
    		else if value eq value element 1
    			if value metadata 1 relativeto @
    				db 16h + value metadata 1 - @
    			else if value metadata 1 relativeto R
    				db 18h + value metadata 1 - R
    			else
    				err "invalid operand"
    			end if
    		else
    			err "invalid operand"
    		end if
    	end match
    end macro
    

    to give this DisAsm hex
                    DEC     A                                       ; 005F 14
                    DEC     R0                                      ; 0060 18
                    DEC     @R1                                     ; 0061 17
                    DEC     R7                                      ; 0062 1F
                    DEC     55h                                     ; 0063 15 55
                    DEC     55h                                     ; 0065 15 55
                    DEC     55h                                     ; 0067 15 55
    

    Note that all of those macro scripts are available to your Assembler source too, because this is just a simple include step.
    Conditional assembler on steroids...
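For readers who don't know the fasmg macro language, the opcode choices made by the DEC macro above can be mirrored in a few lines of Python. This is a rough sketch rather than a translation of the macro: the function name `encode_dec` is made up, operands are passed as strings for registers and as integers for direct addresses, and the restriction of `@Ri` to R0/R1 comes from the 8051 instruction set rather than from the macro text shown.

```python
def encode_dec(operand):
    """Return the 8051 machine code bytes for DEC <operand>.

    operand: "A", "R0".."R7", "@R0"/"@R1", or an int direct address.
    """
    if isinstance(operand, int):
        return bytes([0x15, operand & 0xFF])      # DEC direct
    op = operand.strip().upper()
    if op == "A":
        return bytes([0x14])                      # DEC A
    if op in ("@R0", "@R1"):
        return bytes([0x16 + int(op[2])])         # DEC @Ri
    if len(op) == 2 and op[0] == "R" and op[1] in "01234567":
        return bytes([0x18 + int(op[1])])         # DEC Rn
    raise ValueError("invalid operand")


if __name__ == "__main__":
    for operand in ("A", "R0", "@R1", "R7", 0x55):
        print(operand, encode_dec(operand).hex(" "))
```

The byte values produced for A, R0, @R1, R7 and $55 can be checked against the disassembly listing above.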

    Addit:
    Default output is binary, and if you want Intel hex you
    include 'hex.inc'

    Generating a listing file is similar: no command switch, just another
    include 'listing.inc'
    with the script code from the 3rd forum post here,
    https://board.flatassembler.net/topic.php?p=190446#190446
    simply pasted into listing.inc

    and you get a flat/absolute-listing file (ie addresses resolved), on console like this...
    00000060:                         AliasNameR7 EQU R7
    00000060: 14                      DEC A
    00000061: 18                      DEC R0
    00000062: 17                      DEC @R1
    00000063: 1F                      dec AliasNameR7
    00000064: 15 55                   DEC 0x55
    00000066: 15 55                   DEC 55H
    00000068: 15 55                   DEC $55
    0000006A: 74 0F                   MOV A,#00001111b
    0000006C: 53 69 6D 70 6C 65 20 53 
              74 72 69 6E 67          DB 'Simple String'
    00000079: 00                      DB 00
    0000007A: DB 0F 49 40             DD 3.141592653589793238462643
    0000007E: 18 2D 44 54 FB 21 09 40 DQ 3.141592653589793238462643
    00000086: 00                      DB 00
    

    Seems to nicely manage Generic Any-MCU.ASM -> .BIN and .LST quite easily :)
  • cgracey Posts: 14,210
    edited 2016-09-20 06:40
    Ozprop, try this version of PNut.exe:

    https://drive.google.com/file/d/0B9NbgkdrupkHbkNRcnRmT19YUUU/view?usp=sharing

    I think I got the assembler fixed, as it seems to assemble every case of CALLD correctly now.
  • Chip
    CALLD all working now, thanks. :)

    Oops, BMASKN appears to be missing from the keywords now.
  • cgracey Posts: 14,210
    ozpropdev wrote: »
    Chip
    CALLD all working now, thanks. :)

    Oops, BMASKN appears to be missing from the keywords now.

    Thanks for testing CALLD. I need to improve the error reporting on it, still.

    In the current Verilog, BMASKN is now MOVINT, which moves S into D and triggers INT3. It's not an ideal software interrupt, since the interrupt doesn't occur until after the next instruction. I might change it into something else, or maybe leave it that way. Any better ideas?

  • cgracey wrote: »
    In the current Verilog, BMASKN is now MOVINT, which moves S into D and triggers INT3. It's not an ideal software interrupt, since the interrupt doesn't occur until after the next instruction. I might change it into something else, or maybe leave it that way. Any better ideas?

    We now have TRGINT1, TRGINT2 and TRGINT3 for software triggering of interrupts; do we need one more variant?

    Looking at BMASK and BTRIM a little deeper I have the following thoughts.
    I think BTRIM is a very useful instruction and maybe it could be called LTRIM (Left trim) instead.
    Maybe BMASK could then be replaced with a RTRIM (right trim) variant.
    The combination of the two makes for easy MASK/OR operations in 3 ops.

  • jmg Posts: 15,175
    cgracey wrote: »
    In the current Verilog, BMASKN is now MOVINT, which moves S into D and triggers INT3. It's not an ideal software interrupt, since the interrupt doesn't occur until after the next instruction. I might change it into something else, or maybe leave it that way. Any better ideas?

    Such a structure has been used for Single-Step of code on other MCUs - could it be used for that on P2 ?


  • jmg wrote: »
    Such a structure has been used for Single-Step of code on other MCUs - could it be used for that on P2 ?

    FYI The P2 can support single-step now through the SETBRK instruction.

  • cgracey Posts: 14,210
    ozpropdev wrote: »
    cgracey wrote: »
    In the current Verilog, BMASKN is now MOVINT, which moves S into D and triggers INT3. It's not an ideal software interrupt, since the interrupt doesn't occur until after the next instruction. I might change it into something else, or maybe leave it that way. Any better ideas?

    We now have TRGINT1, TRGINT2 and TRGINT3 for software triggering of interrupts; do we need one more variant?

    Looking at BMASK and BTRIM a little deeper I have the following thoughts.
    I think BTRIM is a very useful instruction and maybe it could be called LTRIM (Left trim) instead.
    Maybe BMASK could then be replaced with a RTRIM (right trim) variant.
    The combination of the two makes for easy MASK/OR operations in 3 ops.

    I like LTRIM and RTRIM. I'll implement those.
  • cgracey Posts: 14,210
    jmg wrote: »
    cgracey wrote: »
    In the current Verilog, BMASKN is now MOVINT, which moves S into D and triggers INT3. It's not an ideal software interrupt, since the interrupt doesn't occur until after the next instruction. I might change it into something else, or maybe leave it that way. Any better ideas?

    Such a structure has been used for Single-Step of code on other MCUs - could it be used for that on P2 ?


    We have pretty comprehensive debugging, already. I've always thought it would be nice to be able to write a register and have an implied CALL take place. For example, sending a serial byte.