Testing rdbyte/word/long and wrbyte/word/long — Parallax Forums

Testing rdbyte/word/long and wrbyte/word/long

78rpm Posts: 264
edited 2015-10-24 07:20 in Propeller 2
**** EDITED - 24th Oct ****
I've decided to attach the source code to this post in case anyone is struggling with P2 FPGA. Not found the error yet. :frown:


I have volunteered to check the rdbyte/word/long and wrbyte/word/long instructions. I have written a basic test harness which uses my DE0-Nano and Parallax Serial Terminal (PST), thanks to the serial test program uploaded on the 14th Oct by mindrobots. :smile:

I think there is a memory overwrite within the harness caused by my binary to ASCII decimal conversion routine, but I haven't found where it is happening. Making data areas larger or adding more code causes slightly different problems. I shall of course continue to look for the error, and I wish to find it before posting the code.

The rd/wr tests need to check every combination, which is a fair few with all pre/post decrement, indexing and scaling options. They also should be checked for execution in COG, LUT and HUB space.
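
To give a feel for how many combinations that is, here is a minimal Python sketch that simply enumerates them. The option lists (which pointer updates exist, which index values to try) are my assumptions for illustration and are not taken from the harness; trim or extend them to match what the FPGA image actually supports.

```python
from itertools import product

# Every name and option list here is an illustrative assumption; adjust to
# whatever the instruction set in your FPGA image actually supports.
DIRECTIONS = ("rd", "wr")
SIZES = ("byte", "word", "long")
POINTERS = ("ptra", "ptrb")
UPDATES = (None, "pre_inc", "pre_dec", "post_inc", "post_dec")
INDEXES = (0, 1, -1)
EXEC_SPACES = ("cog", "lut", "hub")


def enumerate_cases():
    """Yield one dict per rd/wr variant the harness would need to exercise."""
    for direction, size, ptr, update, index, space in product(
        DIRECTIONS, SIZES, POINTERS, UPDATES, INDEXES, EXEC_SPACES
    ):
        yield {
            "instr": direction + size,
            "ptr": ptr,
            "update": update,
            "index": index,
            "exec_from": space,
        }


if __name__ == "__main__":
    cases = list(enumerate_cases())
    print(len(cases), "cases, first:", cases[0])
```

With these particular lists the product is 2 x 3 x 2 x 5 x 3 x 3 = 540 cases, which is why generating the instructions on the fly looks more attractive than writing each test by hand.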

I have written a couple of support functions for the test harness to assist with binary to decimal conversion, which I will post cut'n'pasted if anyone wishes to view or use them:
multiply32x32 -
multiplies two 32 bit values together for a 64 bit result. Two registers supply the two values and return two longs, a high long and a low long. This is useful when you need to multiply two longs where one has its value entirely in the low word of the long and the other has some data in the high word of its long, and where the result still fits in 32 bits.

divide32b32 -
divides a 32 bit value by a 32 bit value for a 32 bit quotient and a 32 bit remainder.
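
As a host-side cross-check for multiply32x32, a small Python reference model can be compared against what the harness prints. This is only a sketch: building the product from 16 bit partial products is my assumption about one reasonable way to do it, not a description of the routine in the attached source.

```python
MASK32 = 0xFFFFFFFF


def multiply32x32(a, b):
    """Return (high_long, low_long) of the unsigned 64 bit product a * b.

    Assumption: the product is built from four 16x16 partial products,
    as a register-based routine without a 32x32 multiplier might do.
    """
    a &= MASK32
    b &= MASK32
    a_lo, a_hi = a & 0xFFFF, a >> 16
    b_lo, b_hi = b & 0xFFFF, b >> 16
    product = (
        (a_lo * b_lo)
        + ((a_lo * b_hi + a_hi * b_lo) << 16)
        + ((a_hi * b_hi) << 32)
    )
    return (product >> 32) & MASK32, product & MASK32


if __name__ == "__main__":
    high, low = multiply32x32(0x0001_2345, 0x0000_00FF)
    print(f"high=${high:08X} low=${low:08X}")
```

If the high long comes back non-zero, the result no longer fits in 32 bits, which is the case the description above says the caller should avoid.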

Comments

  • ozpropdev Posts: 2,792
    edited 2015-10-24 06:09
    Welcome to P2 testing!
    FYI. The P2 has a cordic engine that facilitates multiply 32x32 and 64/32 divide in hardware.
    CCCC 1101000 0LI DDDDDDDDD SSSSSSSSS        QMUL    D/#,S/#
    CCCC 1101000 1LI DDDDDDDDD SSSSSSSSS        QDIV    D/#,S/#
    CCCC 1101001 0LI DDDDDDDDD SSSSSSSSS        QFRAC   D/#,S/#
    CCCC 1101001 1LI DDDDDDDDD SSSSSSSSS        QSQRT   D/#,S/#
    CCCC 1101010 0LI DDDDDDDDD SSSSSSSSS        QROTATE D/#,S/#
    CCCC 1101010 1LI DDDDDDDDD SSSSSSSSS        QVECTOR D/#,S/#
    
    CCCC 1101011 CZ0 DDDDDDDDD 000011000        GETQX   D           {WC,WZ}
    CCCC 1101011 CZ0 DDDDDDDDD 000011001        GETQY   D           {WC,WZ}
    


    Edit: Sorry, I just noticed you mentioned DE0-Nano. So ignore the above.
  • ozpropdev wrote: »
    Welcome to P2 testing!
    FYI. The P2 has a cordic engine that facilitates multiply 32x32 and 64/32 divide in hardware.

    Edit: Sorry, I just noticed you mentioned DE0-Nano. So ignore the above.

    No cordic ... no problem.
  • I have edited the top post to attach the source of my test harness for anyone who may wish to make use of it.

    I have yet to find the error which appears to move.
  • Line 461: change to SETQ #(cog_end-cog_start).

  • Seairth wrote: »
    Line 461: change to SETQ #(cog_end-cog_start).

    Thank you, Seairth, well spotted. I have changed my code but the problem is still there ... sometimes ... depending on what is where.

    I wonder if Chip would support a new instruction ... FINDBUG D, S/#
    Where D is the destination to receive the bug report.
    Where S is the source code to check starting address.
    Where # is:
    %xx0 - simple report / 1= full report
    %x0x - find this bug / 1 = find future bugs
    %0xx - find bug in my code / 1 = find bug in others' code
  • cgracey Posts: 14,152
    78rpm wrote: »
    Seairth wrote: »
    Line 461: change to SETQ #(cog_end-cog_start).

    Thank you, Seairth, well spotted. I have changed my code but the problem is still there ... sometimes ... depending on what is where.

    I wonder if Chip would support a new instruction ... FINDBUG D, S/#
    Where D is the destination to receive the bug report.
    Where S is the source code to check starting address.
    Where # is:
    %xx0 - simple report / 1= full report
    %x0x - find this bug / 1 = find future bugs
    %0xx - find bug in my code / 1 = find bug in others' code

    Do you suspect there's a bug in your code, the chip, or you're not sure?
  • cgracey wrote: »
    Do you suspect there's a bug in your code, the chip, or you're not sure?

    I'm not sure, as it could be either. I always suspect my code, though I can't see anything obvious. The problem appears when code or data are added or moved. It doesn't always happen, which makes me suspect a code overwrite somewhere.

    I am running in HubExec and using @# for Hub references and #\ for Cog references.

    I have made another change, looking into what the code looks like for passing parameters and return values on the stack. Looks nice and readable even with the limitations of PNut macro support.
    jmp #over_my_data
    
    CON
      STK_LOCAL_1	   	=  0
      STK_TOS_ON_ENTRY	= -1		'ptrb here
      STK_RET_ADDR     	= -2
      STK_ARG_3_DIV_RESULT  = -3	
      STK_ARG_2_DIVISOR     = -4
      STK_ARG_1_NUMERATOR	= -5
      STK_RESULT            = -6
    DAT
    
    pass_on_stack
    		pusha	ptrb
    		pusha	ptra
    		popa	ptrb			' now addressing stack
    	' make space for locals by adding to ptra, 
    	' accessed by rdlong x,ptrb[ 1 ] etc
    		pusha	r1
    		pusha	r2
    
    		' divide takes values in registers, so we must load them
    		' instead of doing a pusha scratch
    		rdlong	scratch, ptrb[ STK_ARG_1_NUMERATOR ]
    		mov	r1, scratch
    		rdlong	scratch, ptrb[ STK_ARG_2_DIVISOR ]
    		mov	r2, scratch
    
    		calla	#@divide32b32		' returns Q:R in r1:r2, r1 / r2
    
    		pusha	r1			' Q
    		rdlong	scratch, ptrb[ STK_ARG_3_DIV_RESULT ]		
    		cmp	r1, scratch  wc,  wz
    	if_e	jmp	#.L10
    
    		' error - we may wish to return an error code or 
    		' null value here
    		wrlong	scratch, ptrb[ STK_RESULT ]
    
    		loc 	adrb, #@pass_on_stack_div_error
    		calla 	#@send_msg
    		pusha	ptrb
    		popa	adrb			' address of result
    		add	adrb, ##STK_ARG_3_DIV_RESULT * 4 ' longs
    		calla	#@send_dec
    		loc 	adrb, #@newline
    		calla 	#@send_msg
    		jmp	#.L20
    .L10		
    		' result is correct
    		wrlong	scratch, ptrb[ STK_RESULT ]
    .L20
    		popa	scratch			' discard r1
    		popa	r2
    		popa	r1
    		popa	ptrb
    		reta
    ' pass_on_stack end
    
    
    
    stack_args_result byte	"Result from passing arguments on the stack = ", 0
    pass_on_stack_div_error	byte	"Error dividing values passed on stack", 0
    
    over_my_data
    		pusha	##0			' return value
    		pusha	##$76543210		' arg 1
    		pusha	##$1cedcafe		' arg 2
    		pusha	##$76543210/$1cedcafe	' arg 3
    		calla	#@pass_on_stack
    		sub	ptra, #3 * 4		' 3 args of longs
    		popa	r8			' return value
    		
    		loc 	adrb, #@stack_args_result
    		calla 	#@send_msg
    		loc	adrb,#\r8
    		calla	#@send_dec
    		loc 	adrb, #@newline
    		calla 	#@send_msg
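
For anyone checking the STK_* offsets by hand, here is a toy Python model of the frame. It rests on assumptions that are not confirmed anywhere in this thread: that PUSHA writes at ptra and then advances it by one long, that POPA steps back one long and then reads, and that CALLA pushes a single return-address long. The return address and old ptrb values are made-up placeholders.

```python
# Offsets copied from the CON block above (in longs, relative to ptrb).
OFFSETS = {
    "STK_TOS_ON_ENTRY": -1,
    "STK_RET_ADDR": -2,
    "STK_ARG_3_DIV_RESULT": -3,
    "STK_ARG_2_DIVISOR": -4,
    "STK_ARG_1_NUMERATOR": -5,
    "STK_RESULT": -6,
}


class Stack:
    """Toy hub stack addressed by ptra (assumed semantics, see lead-in)."""

    def __init__(self, base=0x1000):
        self.mem = {}
        self.ptra = base

    def pusha(self, value):
        self.mem[self.ptra] = value   # assumed: write first...
        self.ptra += 4                # ...then advance by one long

    def popa(self):
        self.ptra -= 4                # assumed: step back first...
        return self.mem[self.ptra]    # ...then read

    def rdlong(self, ptrb, index):
        return self.mem[ptrb + index * 4]   # index is scaled to longs


def frame_view(arg1, arg2, arg3, ret_addr=0xCAFE, old_ptrb=0x0B0B):
    """Replay the caller and the prologue, then read each named slot."""
    s = Stack()
    s.pusha(0)            # return value slot
    s.pusha(arg1)
    s.pusha(arg2)
    s.pusha(arg3)
    s.pusha(ret_addr)     # stands in for what calla pushes (assumption)
    s.pusha(old_ptrb)     # pusha ptrb
    s.pusha(s.ptra)       # pusha ptra
    ptrb = s.popa()       # popa ptrb, now addressing the stack
    return {name: s.rdlong(ptrb, off) for name, off in OFFSETS.items()}


if __name__ == "__main__":
    for name, value in frame_view(0x76543210, 0x1CEDCAFE, 4).items():
        print(f"{name:22s} ${value:08X}")
```

Under those assumptions the offsets in the CON block line up with the arguments pushed at over_my_data; if the hardware orders the pointer update differently, every offset shifts by one long.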
    
    
    

    Now a problem which was there, in the binary to ASCII conversion, has gone away again. Perhaps it is based on a modulo of an address? It may be a case of doing something completely different and having an AHA! moment later, or tomorrow.

  • It almost sounds like what Dave Hein(?) reported in his HubExec thread - the results depended on what was where. I have seen similar but wrote mine off to programmer error. It's hard to recreate (and believe) but if it is some sort of timing thing certainly needs to be isolated and somehow identified. It can be very easy to write off as programmer error - certainly in my case!
  • mindrobots wrote: »
    It almost sounds like what Dave Hein(?) reported in his HubExec thread - the results depended on what was where. I have seen similar but wrote mine off to programmer error. It's hard to recreate (and believe) but if it is some sort of timing thing certainly needs to be isolated and somehow identified. It can be very easy to write off as programmer error - certainly in my case!

    I thought I recalled someone mentioning similar before.

    If it is a real problem, how can we:
    a. Rule out software bug
    b. Determine if it is a hardware bug

    I can likely get my code into a state where the effect is happening, which I don't mind posting here. That would allow more eyes on the software problem. It would also mean Chip could use the code in some way to help with a hardware debugging feature in the FPGA Verilog code and image.

    Fortunately my code is running on the original boot single cog in HubExec mode, without interrupts. The serial receive routine uses edge detection. The debug interrupt vectors are correct; I dumped the memory to ensure they were there, even though the Nano is limited to 32K of Hub RAM.

    If I can help with this I certainly will.
  • Shall we try this?
    BTW - you need to send a character to the Nano from a serial terminal to kick it off, done to prevent losing messages.

    I have the code arranged so it generates 2 "Decimal error" messages. Code attached below (whole file). Don't worry about the clock and spinners on the screen, they're meant to be there.

    Locate the 'start' symbol in the source. When some long data above the start symbol are commented out (by uncommenting the block comment lines, hope that makes sense), the code then generates 5 "Decimal error" messages.

    By also uncommenting byte definitions before the start, the number of errors changes. I've recorded the numbers I see as comments in the code for both commented block (assembled) and uncommented block (not assembled - purely comment) to see if you get the same.
    '{ 'uncomment this line to generate 5 decimal errors - see few lines below
    
    ' this is 'trap' data to aid in identifying incorrect indexing
    hub_long_after	long	 16,  17,  18,  19,  20,  21,  22,  23 ' some data before the array
    		long	 24,  25,  26,  27,  28,  29,  30,  31
    		long	 32
    '} 'uncomment this line to generate 5 decimal errors - see few lines above
    
    in_hub		long 	0
    
    '-----------------------------------------------------------------------------------
    
    		alignl
    
    '     aligned long 2 Decimal error reports, with hub_long_after assembled
    'byte 1  ' long off by 1 byte 1 Decimal error report
    'byte 2  ' long off by 2 byte 1 Decimal error report
    'byte 3  ' long off by 3 byte 1 Decimal error report
    'byte 4  ' long off by 0 byte 1 Decimal error report - long aligned again
    'byte 5  ' long off by 1 byte 1 Decimal error report
    'byte 6  ' long off by 2 byte 1 Decimal error report
    'byte 7  ' long off by 3 byte 1 Decimal error report
    'byte 8  ' long off by 0 byte 1 Decimal error report - long aligned again
    
    '     aligned long 5 Decimal error reports, with hub_long_after 
    '    commented out
    'byte 1  ' long off by 1 byte 1 Decimal error report
    'byte 2  ' long off by 2 byte 2 Decimal error report
    'byte 3  ' long off by 3 byte 2 Decimal error report
    'byte 4  ' long off by 0 byte 2 Decimal error report - long aligned again
    'byte 5  ' long off by 1 byte 1 Decimal error report
    'byte 6  ' long off by 2 byte 1 Decimal error report
    'byte 7  ' long off by 3 byte 1 Decimal error report
    'byte 8  ' long off by 0 byte 1 Decimal error report - long aligned again
    
    
    ' comment out all the hub_long_after data a few lines above and 5 Decimal 
    ' errors are logged
    
    start
    
    		mov 	dira, ##$ffff
    

    I look forward to hearing your comments.
  • I realised I have not described the code to help you. It's evening time here, so time for a story. Are you sitting comfortably? Good, then we will begin.

    In a P2 image in an FPGA in a foreign land, far, far away, there resided a small assembler program, fluent in its native language of PASM2. The program lived in a file on a disc which went round and round all day long...

    A short way down from the start symbol, this code is executed. The send_dec routine is the start of a chain of calls. The value to be converted from binary to ASCII numeric is in register r0 in COG.
    		mov	r0, ##SYS_CLK		' system clock
    		loc 	adrb, #@sys_clocks
    		calla 	#@send_msg
    		loc	adrb,#\r0
    		calla	#@send_dec
    		loc 	adrb, #@newline
    		calla 	#@send_msg
    
    In the send_dec routine, divide32b32 is called. r2 is the divisor. Although it is called multiplier, it serves as both, because of where it originates.
    .mult_is_valid	loc	ptrb, #@digit_buf	
    .next_digit	mov	r1, r0			' r0 = value
    		mov	r2, multiplier		' r2 = 2nd term 	
    		calla	#@divide32b32		' 32b / 32b, returns Q:R in r1:r2
    		calla	#@check_for_decimal_error
    
    
    The check_for_decimal_error routine is small and checks that the value returned is between 0 and 9 inclusive. For some reason the send_hex doesn't output to the screen, perhaps it has been clobbered? The values you see on the screen for the entire converted value, 5íë24544, have had an ASCII '0' added to them, ie $30.
    check_for_decimal_error
    		cmp	r1, #9  wc, wz
    	if_be	jmp	#.L888
    		pusha	adrb
    		loc	adrb, #@decimal_error
    		calla	#@send_msg
    		loc 	adrb, #\r1
    		calla	#@send_hex
    		loc 	adrb, #@newline
    		calla	#@send_msg
    		popa 	adrb
    .L888
    		reta
    ' check_for_decimal_error end
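
To make the digit check concrete, here is a minimal Python model of a divide-by-power-of-ten conversion with the same 0 to 9 range test. It assumes send_dec starts with a divisor of 1,000,000,000 and carries the remainder forward into the next digit; that matches the snippets above but the full routine is not shown here, so treat the structure as a guess.

```python
def to_decimal_digits(value, width=10):
    """Convert an unsigned 32 bit value to ASCII digits, most significant first.

    Assumed structure: divide by 10**9, 10**8, ... 1, keep the quotient as
    the digit and carry the remainder into the next step.
    """
    value &= 0xFFFFFFFF
    divisor = 10 ** (width - 1)
    out = []
    while divisor:
        digit, value = divmod(value, divisor)
        if not 0 <= digit <= 9:
            # This is the condition check_for_decimal_error reports.
            raise ValueError(f"decimal error: digit {digit:#x} out of range")
        out.append(chr(digit + 0x30))   # add ASCII '0'
        divisor //= 10
    return "".join(out)


if __name__ == "__main__":
    print(to_decimal_digits(50_000_000))
```

With a correct divide the check can never fire for a 32 bit input, so any "Decimal error" message points at the quotient being wrong or being overwritten after the divide returns.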
    

    It would be good to get to the bottom of this, but I think it possible that the registers used in the divide32b32 function, which are located in the single COG, are corrupted by some means. I'm not overwriting them when doing the divide, so is something else doing it, or did I overwrite them prior to its call? I suspect it is registers r1 and r2, and/or t0 through t7, but what?
    divide32b32
    		pusha	t0
    		pusha	t1
    		pusha	t2
    		pusha	t3
    		pusha	t4
    		pusha	t5
    		pusha	t6
    
    		or	r2, r2  wz
    	if_nz	jmp	#send_dbg_div_go
    		loc	adrb, #@division_by_zero_error
    		calla	#@send_msg
    		jmp	#send_dbg_div_done
    
    send_dbg_div_go
    
    
    		mov	t1, #0		' Q
    		mov	t2, #0		' R
    		mov	t3, r1		' N
    		mov	t4, ##1 << 31   ' bit mask
    		mov	t0, #32		' num bits in N
    		mov	t6, t0
    		sub	t6, #1		' range 31 to 0
    .next_loop	  
    		  shl   t2, #1		' R = R << 1
       		  mov   t5, t3		' N
    		  and	t5, t4  wz	' bit n set of N
    	if_z	  clrb  t2, #0		' clr bit 0 of R
    	if_nz	  setb  t2, #0		' set bit 0 of R
    		  cmp   t2, r2  wc, wz
    	if_b	  jmp	#.after
    		    sub	  t2, r2 ' R <- R - D
    		    setb  t1, t6	' bit 31 to 0 of Q
    .after
    		  shr	t4, #1 		' bit mask
    		  sub   t6, #1		' adjust bit number 31 to 0
    		djnz	t0, #.next_loop
    	  
    send_dbg_div_done
    		mov	r1, t1		' return quotient result in r1
    		mov	r2, t2		' return remainder result in r2
    
    		popa	 t6
    		popa	 t5
    		popa	 t4
    		popa	 t3
    		popa	 t2
    		popa	 t1
    		popa	 t0
    		reta
    ' divide32b32	end
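
For comparison on the PC side, the loop above is a shift-and-subtract (restoring) divide, and it can be mirrored in a few lines of Python and checked against divmod. This is a sketch of the algorithm as I read it from the listing, not a bit-exact model of the registers.

```python
def divide32b32(n, d):
    """Unsigned 32 bit divide by shift and subtract; returns (quotient, remainder)."""
    n &= 0xFFFFFFFF
    d &= 0xFFFFFFFF
    if d == 0:
        raise ZeroDivisionError("division by zero")
    q = 0
    r = 0
    for bit in range(31, -1, -1):           # bit number 31 down to 0
        r = (r << 1) | ((n >> bit) & 1)     # R = R << 1, bit 0 = bit n of N
        if r >= d:                          # skipped when R is below D
            r -= d                          # R <- R - D
            q |= 1 << bit                   # set bit n of Q
    return q, r


if __name__ == "__main__":
    q, r = divide32b32(0x76543210, 0x1CEDCAFE)
    print(f"Q={q} R=${r:08X}")
```

One difference worth knowing: Python integers do not wrap, whereas in a 32 bit register the shifted remainder can overflow when the divisor is above $8000_0000. That should not matter for the power-of-ten divisors send_dec uses, but it would for a general-purpose routine.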
    
    Tune in on this channel, at this time tomorrow, for the same story...or perhaps not.

    I wish everyone looking into this good fortune in bug hunting.
  • jmg Posts: 15,173
    78rpm wrote: »
    They also should be checked for execution in COG, LUT and HUB space.
    Is this problem only in HUB exec, and is ok in COG & LUT ?

  • jmg wrote: »
    78rpm wrote: »
    They also should be checked for execution in COG, LUT and HUB space.
    Is this problem only in HUB exec, and is ok in COG & LUT ?

    I have NOT tried it in COG or LUT, so yes, only HUB EXEC. I could try that. I think the main thing I need to do is change all the global calla #@fun to calla #fun or #\fun, and I think all the jumps are relative anyway. Oh, and obviously load it into a COG and init it or branch to it.
  • jmg wrote: »
    78rpm wrote: »
    They also should be checked for execution in COG, LUT and HUB space.
    Is this problem only in HUB exec, and is ok in COG & LUT ?

    It is too big to fit in COG. Then I'm caught up in the same problem of moving things around and commenting some out and the problem disappears for a while.

    However, I created a copy of the divide function and appended _inCog to its name. Then I checked that, with that version loaded in COG, the HUB version still produced the same error, which it did. Then I changed the call to the _inCog version. The same error result occurs.

    That function only uses registers in the COG RAM. Thus it looks to me as if the problem occurs regardless of HUB EXEC.
  • Would someone please help me understand the instruction format for these rd/wr byte/word/longs.

    I do not know what the L and I bits in the WR instructions refer to. One obviously means immediate, and I suspect that would be the I bit, but what is the L? Is it used to indicate that the immediate field addresses hub locations $00 to $ff inclusive? If it is for a hub address, why does it not go the full 9 bit range to $1ff? An enquiring mind wishes to know. I also need to know so that I can encode the instructions on the fly, as there are so many combinations to check with all the optional indexing and pre/post increment/decrement, and therefore register values to check before and after each operation.

    The following taken from P2 documentation:
    $000..$1FF indicates a register
    S= %0_xxxx_xxxx == #$00..$FF indicates a hub address

    Here are the basic instructions that read hub RAM into the cog:
    CCCC 1011000 CZI DDDDDDDDD SSSSSSSSS        RDBYTE  D,S/#/PTRx      {WC,WZ}
    CCCC 1011001 CZI DDDDDDDDD SSSSSSSSS        RDWORD  D,S/#/PTRx      {WC,WZ}
    CCCC 1011010 CZI DDDDDDDDD SSSSSSSSS        RDLONG  D,S/#/PTRx      {WC,WZ}
    Here are the basic instructions that write hub RAM from the cog:
    CCCC 1100010 0LI DDDDDDDDD SSSSSSSSS        WRBYTE  D/#,S/#/PTRx
    CCCC 1100010 1LI DDDDDDDDD SSSSSSSSS        WRWORD  D/#,S/#/PTRx
    CCCC 1100011 0LI DDDDDDDDD SSSSSSSSS        WRLONG  D/#,S/#/PTRx
  • ozpropdev Posts: 2,792
    edited 2015-10-25 00:20
    L is the immediate D register flag
    Immediate S values > $FF indicate indexed PTRx variants.
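
With L read as the immediate-D flag and I as the immediate-S flag, the WR bit patterns quoted earlier in the thread can be packed into a 32 bit opcode on the host for on-the-fly test generation. The sketch below only follows those three listed patterns; the function name is mine, the condition field is passed through untouched, and the PTRx encodings inside S are not modelled.

```python
# 7 bit opcode plus the fixed bit that precedes L and I, from the table above.
WR_OPCODES = {
    "wrbyte": (0b1100010, 0),
    "wrword": (0b1100010, 1),
    "wrlong": (0b1100011, 0),
}


def encode_wr(mnemonic, cond, d, s, d_immediate=False, s_immediate=False):
    """Pack CCCC ooooooo fLI DDDDDDDDD SSSSSSSSS into one 32 bit word.

    encode_wr is a hypothetical helper name. cond is the raw 4 bit CCCC
    field; d and s are the raw 9 bit D and S fields.
    """
    opcode, fixed = WR_OPCODES[mnemonic]
    if not (0 <= cond <= 0xF and 0 <= d <= 0x1FF and 0 <= s <= 0x1FF):
        raise ValueError("field out of range")
    return (
        (cond << 28)
        | (opcode << 21)
        | (fixed << 20)
        | (int(d_immediate) << 19)   # L: D is an immediate
        | (int(s_immediate) << 18)   # I: S is an immediate
        | (d << 9)
        | s
    )


if __name__ == "__main__":
    # The cond value here is arbitrary, chosen only to show the field position.
    word = encode_wr("wrlong", 0b1111, 0x1F0, 0x0AA)
    print(format(word, "032b"))
```

Printing the word in binary and lining it up under the CCCC/DDDDDDDDD/SSSSSSSSS pattern is a quick way to confirm each field landed where the table says it should.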

  • ozpropdev wrote: »
    L is the immediate D register flag
    Thank you, I'd forgotten that. And to think I've used it in my code a few times this week! Wood for trees
    :smile:
  • I notice that you make reference to labels declared in cog space using LOC #\ instructions.
    These labels when referenced as hub addresses fall below $400 (cog and lut addresses).
    Erratic issues in some of my code were cured by dropping an ORGH $400 in to bump them into hub space.
    Strange behaviour disappeared.
    Might be worth looking at. :)

  • ozpropdev wrote: »
    I notice that you make reference to labels declared in cog space using LOC #\ instructions.
    These labels when referenced as hub addresses fall below $400 (cog and lut addresses).
    Erratic issues in some of my code were cured by dropping an ORGH $400 in to bump them into hub space.
    Strange behaviour disappeared.
    Might be worth looking at. :)

    Ah, maybe I am misunderstanding this. I think you can access registers in Cog RAM from Hub Exec mode, without having to do a wrlong etc to Hub RAM. Are you saying this isn't the case?

    The use of #\ is to get the absolute address of Cog Ram as that is where the registers are located. The addresses returned are long addresses where adjacent longs are located by current + 1, not +4 as in Hub. They are ORG 0.

    My Hub Exec code and data is ORGH $1000.




  • jmg Posts: 15,173
    78rpm wrote: »
    Ah, maybe I am misunderstanding this. I think you can access registers in Cog RAM from Hub Exec mode, without having to do a wrlong etc to Hub RAM.

    I think that is correct, if it were not, HubExec would be of limited use.
    78rpm wrote: »

    The use of #\ is to get the absolute address of Cog Ram as that is where the registers are located. The addresses returned are long addresses where adjacent longs are located by current + 1, not +4 as in Hub. They are ORG 0.

    My Hub Exec code and data is ORGH $1000.
    This type of confusion is why I think the Assembler needs a clean-up pass to remove 'hieroglyph tags'.
    I think introducing specific memory areas ( called data segments in some assemblers) can make it clear to the assembler and user which Registers/Ram they are addressing, without needing tags.

    Specific data segments also makes code more naturally relocatable, which will matter more as ASM mixes with HLL.
  • Hubexec can use cog ram no problem at all.
    The issue seems to be hub operations addressing < $400
    Consider the following code
    dat		orgh	0
    		org
    
    		setb	dirb,#62
    		setb	outb,#62
    
    wait4key	jp	#63,#wait4key
    
    		loc	ptrb,#message
    		call	#send_msg
    here		jmp	#here
    
    send_msg	rdbyte	ax,ptrb++ wz
    	if_z	ret
    		call	#send_byte
    		jmp	#send_msg
    
    send_byte	setb	ax,#8
    		shl	ax,#1
    		getcnt	bx
    
    		rep	@sb_loop,#10
    		addcnt	bx,bit_time
    		testb	ax,#0 wz
    		setbnz	outb,#62
    		waitcnt
    		shr	ax,#1
    sb_loop		ret
    
    bit_time	long	50_000_000 / 115_200
    ax		long	0
    bx		long	0
    
    		orgh	$400
    
    message		byte	"Ozpropdev was here!",0
    
    Remove the ORGH $400 and it does not work.
  • jmg wrote: »
    78rpm wrote: »
    Ah, maybe I am misunderstanding this. I think you can access registers in Cog RAM from Hub Exec mode, without having to do a wrlong etc to Hub RAM.

    I think that is correct, if it were not, HubExec would be of limited use.
    78rpm wrote: »

    The use of #\ is to get the absolute address of Cog Ram as that is where the registers are located. The addresses returned are long addresses where adjacent longs are located by current + 1, not +4 as in Hub. They are ORG 0.

    My Hub Exec code and data is ORGH $1000.
    This type of confusion is why I think the Assembler needs a clean-up pass to remove 'hieroglyph tags'.
    I think introducing specific memory areas ( called data segments in some assemblers) can make it clear to the assembler and user which Registers/Ram they are addressing, without needing tags.

    Specific data segments also makes code more naturally relocatable, which will matter more as ASM mixes with HLL.


    I hope we're both correct on the matter of accessing registers in Cog RAM in this manner in Hub Exec.

    Are you testing / developing on an FPGA board?

    Too true, a good assembler would be useful for all this tinkering, but they take a while to develop. Certainly relocating data segments would be useful.

    I find PNut delightful to use, and it's so fast at assembling and downloading to the FPGA that I'm not able to make out more than a few characters on the screen whilst it does its stuff. My only problem with PNut is when I move to the browser I press F11 to post a message.
  • jmg Posts: 15,173
    edited 2015-10-25 04:00
    ozpropdev wrote: »
    Remove the ORGH $400 and it does not work.

    and if it is smaller than ORGH $400, what happens ?
    With no ORGH there, wouldn't that drop message into COG memory ?
    Because opcodes presently have the '# forcing', there is no possible error checks on this.

  • ozpropdev wrote: »
    Hubexec can use cog ram no problem at all.
    The issue seems to be hub operations addressing < $400
    Consider the following code
    dat		orgh	0
    		org
    
    		setb	dirb,#62
    		setb	outb,#62
    
    wait4key	jp	#63,#wait4key
    
    		loc	ptrb,#message
    		call	#send_msg
    here		jmp	#here
    
    send_msg	rdbyte	ax,ptrb++ wz
    	if_z	ret
    		call	#send_byte
    		jmp	#send_msg
    
    send_byte	setb	ax,#8
    		shl	ax,#1
    		getcnt	bx
    
    		rep	@sb_loop,#10
    		addcnt	bx,bit_time
    		testb	ax,#0 wz
    		setbnz	outb,#62
    		waitcnt
    		shr	ax,#1
    sb_loop		ret
    
    bit_time	long	50_000_000 / 115_200
    ax		long	0
    bx		long	0
    
    		orgh	$400
    
    message		byte	"Ozpropdev was here!",0
    
    Remove the ORGH $400 and it does not work.

    No, it wouldn't work, because message would then be in Cog RAM and you'd need to do a long access to it and mask or shift to get to the individual bytes in the long.

    However, my code in Hub, when accessing bytes in Cog RAM, writes a long from Cog into Hub RAM and then moves each byte out into a local array, via Cog RAM, by means of a RDBYTE loop. Then if the data is valid, it executes a WRBYTE to a RAM buffer in Hub. That all looks correct to me even if it sounds a bit long-winded. Look at send_msg to see what I mean for the copying from Cog, LUT or Hub to the message buffer.

    So, I only access data as longs in Cogs from Hub, even though that data may be packed into a long.

    I'm no nearer to finding the problem, though I did look to see if I could see something in my code. And yes, I use rdlut and wrlut to access LUT RAM.

    I think my brain might melt :)


  • Even dropping an ORGH by itself to put it in hub space does not work.
    All attempts to set ORGH to addresses below $400 will fail.
    From what I see, all hub references must be $400+

  • Electrodude Posts: 1,657
    edited 2015-10-25 04:40
    ozpropdev wrote: »
    Hubexec can use cog ram no problem at all.
    The issue seems to be hub operations addressing < $400
    Consider the following code
    dat		orgh	0
    		org
    
    		setb	dirb,#62
    		setb	outb,#62
    
    wait4key	jp	#63,#wait4key
    
    		loc	ptrb,#message
    		call	#send_msg
    here		jmp	#here
    
    send_msg	rdbyte	ax,ptrb++ wz
    	if_z	ret
    		call	#send_byte
    		jmp	#send_msg
    
    send_byte	setb	ax,#8
    		shl	ax,#1
    		getcnt	bx
    
    		rep	@sb_loop,#10
    		addcnt	bx,bit_time
    		testb	ax,#0 wz
    		setbnz	outb,#62
    		waitcnt
    		shr	ax,#1
    sb_loop		ret
    
    bit_time	long	50_000_000 / 115_200
    ax		long	0
    bx		long	0
    
    		orgh	$400
    
    message		byte	"Ozpropdev was here!",0
    
    Remove the ORGH $400 and it does not work.

    LOC is meant to get code addresses, not data addresses. Replace "loc ptrb, #message" with "mov ptrb, ##@message" and it should work.

    Code addresses below $400 are in cogram, while data addresses below $400 are in hubram.
  • My point is data addresses below $400 in hubram cause issues.
  • If data addresses below $400 are in Hub RAM, how does Hub Exec work?

    My understanding is:
    long aligned x 512, long access only, +1 addresses next long
    $000 <= Cog RAM <= $1ff
    $200 <= LUT RAM <= $3ff
    byte aligned x lots, byte/word/long access, + 1 addresses next byte

    The LOC instruction is for location and can only be used with special address registers, adra, adrb, ptra, ptrb. LOC is for loading these ptr registers. If you want to know the value contained in a pointer register, can you do
    mov   my_reg, ptrb
    
    or is it safer to do
    push  ptrb
    pop   my_reg
    
    I think they both work as the ptr access and indexing and scaling are only used with RDBYTE, WRBYTE etc

    Read the end section of the instructions.txt in the FPGA download files.
    
    For immediate-branch and LOC address operands, "#" is used before the
    address. In cases where there is an option between absolute and relative
    addressing, the assembler will choose absolute addressing when the branch
    crosses between cog and hub domains, or relative addressing when the
    branch stays in the same domain. Absolute addressing can be forced by
    following "#" with "\".
    
        DJZ/DJNZ/DJS/DJNS/TJZ/TJNZ/TJS/TJNS/JP/JNP   - relative/indirect
        JMP/CALL/CALLA/CALLB/CALLD          - absolute/relative/indirect
        LOC                                 - absolute/relative
    
    However, there is also this in the P2 Assembly Instructions, so which is right?
    Label values are determined as follows:
    ● Labels defined in an ORGH section resolve to a hub address or offset (in
    bytes), regardless of whether the label is referenced in an ORGH or ORG section.
    ● Labels defined in an ORG section resolve to a cog address or offset (in longs),
    regardless of whether the label is referenced in an ORGH or ORG section.
    ● When the effective hub address or offset is needed for a label that is defined
    in an ORG section, the label may be preceded by a "@" to force resolution to a
    hub address or offset.
    ● Though it is possible to apply the "@" to labels defined in ORGH sections, it
    has no effect.
    Expressions
    ● Expressions can contain numbers, labels, and nested expressions. The
    simplest expression is either a single number or label.
    ● An expression that begins with # or ## is known as an "immediate" value.
    ● For branching instructions, immediate values can be either "absolute" or
    "relative", depending on context.
    "relative", depending on context.
    ● For non-branching instructions, immediate values are always "absolute".
    ● "Absolute immediate" interpretation can be forced by using "#\" or "##\".
    ● There is no operator for forcing a "relative immediate" interpretation.
    ● # indicates a 9-bit (short-form) or 20-bit (long-form) immediate value:
    ○ For short-form branch instructions, this is a 9-bit relative immediate.
    ○ For long-form branch instructions that change execution mode (cog <->
    hub), this is a 20-bit absolute immediate.
    ○ For long-form branch instructions that do not change execution mode,
    this is a 20-bit relative immediate.
    ○ For all other instructions, this is a 9-bit absolute immediate.
    ○ In circumstances where an absolute immediate must be forced, the
    expression is prefaced with "#\".
    ● ## indicates a 32-bit immediate value
    ○ An implicit AUGx will precede the instruction containing the expression.
    ○ The lower 9 bits will be encoded in the instruction and the upper 23 bits
    will be encoded in the AUGx.
    ○ For short-form branch instructions, this is a 20-bit relative immediate.
    The upper 12 bits are ignored.
    ○ For non-branch instructions, this is a 32-bit absolute immediate.
    ○ This is meaningless for long-form branch instructions. PNUT throws an
    error.
    ● For BYTE/WORD/LONG, the expression is encoded as raw data. If the
    expression begins with # or ##, PNUT throws an error.
    ● For all other expressions that do not begin with # or ##, the expression is
    encoded as a register address and must be between $000 and $1FF.
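
The ## rule quoted above (lower 9 bits in the instruction, upper 23 bits in the AUGx prefix) is easy to check on the host. This is a minimal sketch of just that split; the helper names are mine and it says nothing about which AUGx opcode the assembler emits.

```python
def split_augmented(value):
    """Split a 32 bit immediate into (upper 23 bits for AUGx, lower 9 bits)."""
    value &= 0xFFFFFFFF
    return value >> 9, value & 0x1FF


def join_augmented(upper23, lower9):
    """Rebuild the 32 bit immediate from the two encoded fields."""
    return ((upper23 & 0x7FFFFF) << 9) | (lower9 & 0x1FF)


if __name__ == "__main__":
    upper, lower = split_augmented(0x76543210)
    print(f"AUGx field=${upper:06X} instruction field=${lower:03X}")
```

For the $76543210 constant used earlier in the thread this gives $3B2A19 for the prefix and $010 for the instruction field, and joining them returns the original value.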
    

    Thoughts?
  • jmg Posts: 15,173
    ozpropdev wrote: »
    My point is data addresses below $400 in hubram cause issues.

    If you pattern fill HUB and then dump that, what does that show ?

  • Electrodude Posts: 1,657
    edited 2015-10-25 06:06
    ozpropdev wrote: »
    My point is data addresses below $400 in hubram cause issues.

    You didn't do your data access to hubram correctly. You had "loc ptrb, #message", which gets the code address of message into ptrb. You wanted "mov ptrb, ##message", which puts the data address of message into ptrb. Code and data addresses are the same for addresses greater than or equal to $400, which is why it worked with "orgh $400".


    Addresses passed to mov and normal ALU instructions are always cog addresses, exactly the same as on the P1. If AUGx or something is used to make it more than 9 bits, those extra bits are ignored.

    If you do "WRxxxx data, address", you will always get data out of hubram.

    These are both data addresses.

    If you do "jmp #x" or "loc ptrb, #x", you get the code address of x. If x is greater than or equal to $400, then x refers to the first byte of the instruction in hubram. If x is less than $400 and greater than or equal to $200, x & $1FF refers to the long of lutram that has the instruction. If x is less than $200, x refers to the long of cogram that has the instruction.

    You cannot hubexec out of addresses less than $400. There was a long thread about that.
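
The code-address rule described above can be written down as a small Python helper, which is handy when deciding what a LOC result will point at. It only restates the three ranges given in this post for the current FPGA image; the function name is hypothetical.

```python
def classify_code_address(x):
    """Map a code address to (memory, offset) following the ranges above.

    Below $200 is a cog long, $200..$3FF is a LUT long (x & $1FF),
    and $400 upward is a hub byte address.
    """
    if x < 0:
        raise ValueError("address must be non-negative")
    if x < 0x200:
        return ("cog", x)
    if x < 0x400:
        return ("lut", x & 0x1FF)
    return ("hub", x)


if __name__ == "__main__":
    for addr in (0x010, 0x210, 0x400, 0x1000):
        print(f"${addr:04X} ->", classify_code_address(addr))
```

A label placed with ORGH below $400 therefore has a data address in hub but a code address that lands in cog or LUT, which would explain why "loc ptrb, #message" only behaved once the string sat at $400 or above.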

    EDITED: added quote to top, moved last paragraph to top