Problem with LOCLONG instruction — Parallax Forums

Problem with LOCLONG instruction

ozpropdev Posts: 2,792
edited 2014-03-25 04:03 in Propeller 2
Hi All

I'm having a problem with the LOCLONG instruction.
If the instruction sits in certain address spaces it fails.
The code below looks up a long in a table and flashes an LED representing the value. (DE2-115 FPGA)
If I insert spacer NOPs or any other instruction to move the LOCLONG away from those addresses it works.
I kept inserting spacers and a pattern emerged in the bad addresses.
The bad addresses were $E18,$E38,$E58. In another program (Toolbox) I got similar patterns $2338,$2358,$2378.
The common pattern seems to be bits 3 and 4 set in the absolute address.
dat		orgh	$e00/4	'F11 to run from PNUT
		org
		jmp	@hub_code

bx		long	0
timer		long	0
delay		long	40_000_000

		orgh
hub_code	mov	bx,#2	'set item index = 2 (list value 4)

		nop	'fail bug address = $e18
		nop	'ok
		nop	'ok
		nop	'ok
		nop	'ok
		nop	'ok
		nop	'ok
		nop	'ok
		nop	'fail bug address = $e38
		nop	'ok
		nop	'ok
		nop	'ok
		nop	'ok
		nop	'ok
		nop	'ok
		nop	'ok
		nop	'fail bug address = $e58
		nop	'ok

bug		loclong	bx,@list
		rdlong	bx,bx

		getcnt	timer
		add	timer,delay
		waitcnt	timer,delay

:loop		setp	#32
		waitcnt	timer,delay
		clrp	#32
		waitcnt	timer,delay
		djnz	bx,@:loop

		jmp	#$

list		long	3	'index 0
		long	1	'index 1
		long	4	'index 2
		long	1	'index 3
		long	5	'index 4
		long	9	'index 5



If OK, the LED flashes 4 times; on a failure it flashes more times than I can be bothered to count!
Am I missing something? :(
Cheers
Brian

Comments

  • Cluso99 Posts: 18,069
    edited 2014-03-22 05:30
    What is more interesting is that it is the 6th long of the 8-long WIDE instruction cache line in each case.
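
    One way to check the pattern from the first post against this observation is to split each failing address into its position within a cache line. The sketch below is only an illustration in Python; the helper names `wide_slot` and `bits_3_and_4_set` are made up here, and it assumes 4-byte longs and 8-long (32-byte) lines as described in this thread.

```python
# Failing LOCLONG byte addresses reported in the first post.
BAD_ADDRESSES = [0xE18, 0xE38, 0xE58, 0x2338, 0x2358, 0x2378]

def wide_slot(byte_addr):
    """Zero-based index of the long inside its 32-byte (8-long) line."""
    return (byte_addr >> 2) & 7

def bits_3_and_4_set(byte_addr):
    """True when bits 3 and 4 of the byte address are both set."""
    return (byte_addr & 0x18) == 0x18

for addr in BAD_ADDRESSES:
    print(f"${addr:04X}: slot {wide_slot(addr)}, "
          f"bits 3 and 4 set: {bits_3_and_4_set(addr)}")
```

    Every reported address has bits 3 and 4 set and falls in slot 6 when slots are counted from 0.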
  • ozpropdev Posts: 2,792
    edited 2014-03-22 17:51
    FYI Chip,
    I just went back to the previous FPGA Build (6 Feb 2014) and the problem is still there.
    Cheers
    Brian
  • Bill Henning Posts: 6,445
    edited 2014-03-22 19:13
    I just tested my code to see if I'd have problems.

    I forced the locptra to start out at $400 ($1000 byte address)

    then I added nops to test all 8 longs in the cache line $400-$407

    locptra worked in every slot.

    Why don't you snag the bit-banged tx (verified working in cog and hubexec)? Then you won't have to count blinking lights... which are still very helpful tools!
    CON
    		_rx	= 91
    		_tx	= 90
    DAT
    		orgh	$380
    
    start		org
    
    		clkset	#$FF		'set 80MHz
    		
    		'setsera	ser_parms, bit_time
    		clrp	#_tx		' Enable _tx pin
    		mov	dirb,leds
    
    		mov	outb,#1		' if we don't get to another outb, we jumped to never-never land
    		jmp	#hubexec
    
    		nop
    		nop
    		nop
    
    ser_parms	long	%10 << 16 + _rx << 9 + %10 << 7 + _tx
    bit_time	long	80_000_000 / 115200
    ch		long	0
    leds		long    $FFF
    y		long 	0
    x		long	0
    w		long	0
    count		long	0
        
    '----------------------------------------------------------------
    
    		orgh $400			' force locptra to $400 - pass
    
    		nop	' force locptra to $401 - pass
    		nop	' force locptra to $402 - pass
    		nop	' force locptra to $403 - pass
    		nop	' force locptra to $404 - pass
    		nop	' force locptra to $405 - pass
    		nop	' force locptra to $406 - pass
    		nop	' force locptra to $407 - pass
    
    hubexec		locptra #hello_world		' orgh, F10 download works  $38A 
    
     
    		mov	outb,##hubexec		' $38B/C
    '		mov	outb,##hello_world	
    '		getptra	outb
    '		nop	
    
    loop		rdbyte	ch, ptra++ wz		' $38D
    		mov	x,ch
    	if_nz	call	#tx 			' $38E
    '	if_nz	serouta	ch 			' $38E
    	if_nz	jmp	#loop			' $38F
    
    		jmp	@hubexec		' $390
    
    hello_world	byte	"Hello World!",13,10,0	' $391
    
    bub		serouta #64
    		jmp	#bub
    
    tx              shl     x,#1                    ' insert start bit
                    setb    x,#9                    ' set stop bit
    		mov	count,#10
                    getcnt  w                       ' get initial time
    :loop           add     w,bit_time              ' add bit period to time
                    shr     x,#1  wc
                    setpc   #_tx                    ' write c to tx pin
    		'passcnt  w
                    waitcnt w,#0                    ' loop until bit period elapsed
                    jnz	x,@:loop                 ' loop until 10 bits done
    		add     w,bit_time              ' add bit period to time
                    waitcnt w,#0                    ' loop until bit period elapsed
                    ret
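
    For readers following the tx routine above, here is a rough Python model of the framing it performs: shift the character left to create a 0 start bit, set bit 9 as the stop bit, then shift bits out LSB first until the word is empty. The function name `uart_frame` is made up for this sketch, and the timing side (WAITCNT) is reduced to the same integer division used for bit_time.

```python
def uart_frame(ch):
    """Bits sent for one 8-bit character, in transmission order."""
    x = (ch & 0xFF) << 1      # shl x,#1  -> bit 0 becomes the start bit (0)
    x |= 1 << 9               # setb x,#9 -> stop bit (1)
    bits = []
    while x:                  # jnz x,@:loop -> stop once every bit is shifted out
        bits.append(x & 1)    # shr x,#1 wc / setpc -> low bit goes to the pin
        x >>= 1
    return bits

# Clocks per bit at 80 MHz and 115200 baud, as in the bit_time constant.
BIT_TIME = 80_000_000 // 115200

print(uart_frame(ord("H")), BIT_TIME)
```

    Because bit 9 is always set, the loop runs exactly 10 times for any 8-bit character.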
    
    
  • ozpropdev Posts: 2,792
    edited 2014-03-22 23:23
    Looking at the issue a bit deeper, it appears to be related to instruction prefetch.

    With prefetch on (the cog start-up default), the code works OK on the first pass and then fails thereafter.
    00000EDC 00000004
    00000084 FCFD000D
    00000084 FCFD000D
    00000084 FCFD000D
    00000084 FCFD000D
    00000084 FCFD000D
    00000084 FCFD000D
    

    With prefetch off, the first pass fails and every subsequent pass works.
    00000084 FCFD000D
    00000EDC 00000004
    00000EDC 00000004
    00000EDC 00000004
    00000EDC 00000004
    00000EDC 00000004
    00000EDC 00000004
    

    This is with the LOCLONG aligned to hub addresses with bits 3 and 4 set. (As described in first post above)
    con
    
    	tx=90
    	rx=91
    	baudrate = 9600
    
    dat		orgh	$e00/4	'F11 to run from PNUT
    		org
    		jmp	@hub_code
    
    ax		long	0
    bx		long	0
    cx		long	0
    val		long	0
    
    timer		long	0
    delay		long	10_000_000
    delay2		long	80_000_000 * 20
    
    config		long	%10 << 16 | rx << 9 | %10 << 7 | tx
    bittime		long	80_000_000 / baudrate
    
    		orgh
    
    hub_code	setsera	config,bittime
    		clrp	#tx
    
    	'	icachep		'works first pass then fails thereafter
    		icachen		'fails first pass then works thereafter
    
    		long	0[5]
    	'	long	0[8]
    
    		getcnt	timer
    		add	timer,delay2
    		waitcnt	timer,delay
    
    
    again		mov	bx,#2
    
    bug		loclong	bx,@list
    
    		rdlong	cx,bx
    
    		mov	val,bx
    		call	@show_hex
    		serouta	#32
    		mov	val,cx
    		call	@show_hex
    		serouta	#13
    
    		getcnt	timer
    		add	timer,delay
    		and	cx,#15
    
    :loop		setp	#32
    		waitcnt	timer,delay
    		clrp	#32
    		waitcnt	timer,delay
    		djnz	cx,@:loop
    
    		getcnt	timer
    		mov	cx,delay2
    		shr	cx,#3
    		add	timer,cx
    		waitcnt	timer,cx
    	
    		jmp	@again
    
    
    show_hex	reps	#8,#6
    		nop
    		getnib	ax,val,#7
    		cmp	ax,#9 wz,wc
    	if_a	add	ax,#"A"-10
    	if_be	add	ax,#"0"
    		serouta	ax
    		shl	val,#4
    		ret
    
    list		long	3	'index 0
    		long	1	'index 1
    brian		long	4	'index 2
    		long	1	'index 3
    		long	5	'index 4
    		long	9	'index 5
    

    I'll keep digging.... :) Brian
  • Bill Henning Posts: 6,445
    edited 2014-03-23 06:22
    Very interesting... especially since LOCPTRA does not seem to exhibit the same issue.

    I wonder if the related LOCWORD LOCBYTE LOCINST have the same issue?
  • ozpropdev Posts: 2,792
    edited 2014-03-23 06:40
    Very interesting... especially since LOCPTRA does not seem to exhibit the same issue.

    I wonder if the related LOCWORD LOCBYTE LOCINST have the same issue?

    Bill
    Identical fault for LOCWORD and LOCBYTE.
    I'm looking at LOCINST now...
  • Bill Henning Posts: 6,445
    edited 2014-03-23 06:50
    I suspected as much.

    LOCBASE is also part of the same group...
    ozpropdev wrote: »
    Bill
    Identical fault for LOCWORD and LOCBYTE.
    I'm looking at LOCINST now...
  • ozpropdev Posts: 2,792
    edited 2014-03-23 07:13
    I suspected as much.

    LOCBASE is also part of the same group...

    As expected LOCBASE is the same as the others.
    BTW What is LOCINST supposed to do? Wasn't in the last Docs.
  • ctwardell Posts: 1,716
    edited 2014-03-23 07:46
    Tried this on the nano by changing the LED on P32 to an LED I have on P1.
    I see the same behavior.

    I like the number sequence, makes me hungry for pie...

    C.W.
  • ctwardell Posts: 1,716
    edited 2014-03-23 12:47
    I played around with this a little more this afternoon.

    I have a nano, so I modified the original example so that it saves a couple of test values to the hub and then launches the monitor so they can be reviewed.

    It looks like in the cases where LOCLONG fails it is using the index value from D and the relative offset from S but is failing to include the absolute address of the instruction itself.

    Examples:

    When label bug is at address $e54 the value returned by the LOCLONG is $e70 which points at the correct value in the list.
    When label bug is at address $e58 the value returned by the LOCLONG is $1C which is $e74 - $e58. ($e74 is the expected value)

    The error pattern repeats as shown by Brian and the error value is always off by the address of the LOCLONG.
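
    The arithmetic in these examples can be checked with a short Python sketch. The function names are made up for illustration, and the table addresses ($e68 and $e6c) are derived by assuming each instruction between the bug label and the list occupies one long; the model only mirrors the observation above, not the actual hardware logic.

```python
def loclong_expected(table_addr, index):
    """Byte address of long number `index` in a table starting at `table_addr`."""
    return table_addr + 4 * index

def loclong_faulty(instr_addr, table_addr, index):
    """Observed bad result: correct address minus the instruction's own address."""
    return loclong_expected(table_addr, index) - instr_addr

# Working case: bug at $e54, list five longs later at $e68.
print(hex(loclong_expected(0xE68, 2)))         # 0xe70
# Failing case: bug at $e58, list at $e6c.
print(hex(loclong_faulty(0xE58, 0xE6C, 2)))    # 0x1c
```

    Brian's earlier serial dump fits the same model if the LOCLONG in that program sits at $E58: the good pointer was $EDC and the bad one was $84, and $EDC - $E58 = $84.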
    dat		orgh	$e00/4	'F11 to run from PNUT
    		org
    		jmp	@hub_code
    
    bx		long	0
    timer		long	0
    delay		long	40_000_000
    monitor		long	91 << 24 + 90 << 16 + $52C >> 2	'added for launching monitor
    tval1		long	$4000	'a place to write a value to check from the monitor
    tval2		long	$4004	'a place to write a value to check from the monitor
    
    		orgh
    hub_code	mov	bx,#2	'set item index = 2 (list value 4)
    
    		nop
    		nop
    		nop
    		nop
    		nop
    		nop	'fail bug @ $e38
    		nop
    		nop
    		nop
    		nop
    		nop
    		nop
    		nop
    		nop	'fail bug @ $e58
    
    bug		loclong	bx,@list
    		wrlong	bx, tval1	'save bx so it can be checked from the monitor
    		rdlong	bx, bx
    		wrlong	bx, tval2	'save bx so it can be checked from the monitor
    		cogrun	monitor, #0	'run the monitor
    
    
    list		long	3	'index 0
    		long	1	'index 1
    		long	4	'index 2
    		long	1	'index 3
    		long	5	'index 4
    		long	9	'index 5
    

    C.W.
  • ozpropdev Posts: 2,792
    edited 2014-03-23 17:30
    Hi C.W.
    Nice work. That explains the results nicely.
    I have been testing on DE2 and DE0 with the same results.
    The tests I am doing at the moment also seem to show the results being affected by what type
    of instruction precedes the LOCLONG instruction. Still working on that one.
    Regarding my earlier question on LOCINST it appears to return the offset from the current PC to the @label instruction.
    Cheers
    Brian

    Post Edit: I wondered if anyone would notice the number sequence! :lol:
  • ozpropdev Posts: 2,792
    edited 2014-03-23 18:28
    Here's a different result. Moving the "again" label back 3 instructions makes the code work in prefetch mode.
    If prefetch is turned off, the result is the same as before. Hopefully this makes sense to Chip and he has an "Aha" moment. :)
    con
    
    	tx=90
    	rx=91
    	baudrate = 9600
    
    dat		orgh	$e00/4	'F11 to run from PNUT
    		org
    		jmp	@hub_code
    
    ax		long	0
    bx		long	0
    cx		long	0
    val		long	0
    
    timer		long	0
    delay		long	10_000_000
    delay2		long	80_000_000 * 8
    
    config		long	%10 << 16 | rx << 9 | %10 << 7 | tx
    bittime		long	80_000_000 / baudrate
    
    		orgh
    
    hub_code	setsera	config,bittime
    		clrp	#tx
    
    		icachep		'works first pass then fails thereafter
    	'	icachen		'fails first pass then works thereafter
    
    		long	0[5]
    	'	long	0[8]
    
    
    again
    		getcnt	timer
    		add	timer,delay2
    		waitcnt	timer,delay
    
    
    'again
    		mov	bx,#2
    
    bug		loclong	bx,@list
    
    		rdlong	cx,bx
    
    		mov	val,bx
    		call	@show_hex
    		serouta	#32
    		mov	val,cx
    		call	@show_hex
    		serouta	#13
    
    		getcnt	timer
    		add	timer,delay
    		and	cx,#15 wz
    	if_z	mov	cx,#15
    
    :loop		setp	#32
    		waitcnt	timer,delay
    		clrp	#32
    		waitcnt	timer,delay
    		djnz	cx,@:loop
    
    		getcnt	timer
    		mov	cx,delay2
    		shr	cx,#3
    		add	timer,cx
    		waitcnt	timer,cx
    	
    		jmp	@again
    
    
    show_hex	reps	#8,#6
    		nop
    		getnib	ax,val,#7
    		cmp	ax,#9 wz,wc
    	if_a	add	ax,#"A"-10
    	if_be	add	ax,#"0"
    		serouta	ax
    		shl	val,#4
    		ret
    
    list		long	3	'index 0
    		long	1	'index 1
    brian		long	4	'index 2
    		long	1	'index 3
    		long	5	'index 4
    		long	9	'index 5
    
    
  • cgracey Posts: 14,151
    edited 2014-03-23 21:12
    ozpropdev wrote: »
    Here's a different result. Moving the "again" label back 3 instructions makes the code work in prefetch mode.
    If prefetch is turned off, the result is the same as before. Hopefully this makes sense to Chip and he has an "Aha" moment. :)

    I'm glad you guys found this problem. I'll be on it tomorrow morning.

    I've been thinking that for the JMP #addresslabel(>>2) issue, I'll have the assembler return >>2 values for labels in the cases of operand use. I'll see in the morning how this will work, but that would be the proper way to handle things. That way, JMP #constant would still be what you'd expect. JMP #addresslabel would also force a long-address check, which is important.

    It turns out that no '>>2' was ever needed. I didn't realize that I already had the assembler handling this properly. Sorry for all the confusion on this >>2 matter. Now, to find what's wrong with LOCLONG...
  • Cluso99 Posts: 18,069
    edited 2014-03-23 22:56
    cgracey wrote: »
    I'm glad you guys found this problem. I'll be on it tomorrow morning.

    I've been thinking that for the JMP #addresslabel(>>2) issue, I'll have the assembler return >>2 values for labels in the cases of operand use. I'll see in the morning how this will work, but that would be the proper way to handle things. That way, JMP #constant would still be what you'd expect. JMP #addresslabel would also force a long-address check, which is important.
    That would only be for addresses >=$200 wouldn't it? Because cog addresses are already longs.
  • ctwardell Posts: 1,716
    edited 2014-03-24 02:36
    Found another interesting condition.

    If there is a JMP directly after the LOCLONG the error does not occur.
    It does not matter if the jump is to a COG or HUB address.

    If the line with the label 'buggy' is uncommented to insert a NOP between the LOCLONG and the JMP, the error occurs.

    Question: Do the cache lines always load from a WIDE boundary such as $00, $20, $40, etc.?
    dat		orgh	$e00/4	'F11 to run from PNUT
    		org
    		jmp	@hub_code
    
    bx		long	0
    timer		long	0
    delay		long	40_000_000
    monitor		long	91 << 24 + 90 << 16 + $52C >> 2	'added for launching monitor
    tval1		long	$4000	'a place to write a value to check from the monitor
    tval2		long	$4004	'a place to write a value to check from the monitor
    
    
    cog_code	nop
    		jmp	@hub_code2
    
    		orgh
    hub_code	mov	bx,#2	'set item index = 2 (list value 4)
    
    		nop
    		nop
    		nop
    		nop	'fail bug @ $e38 when line 'buggy' uncommented, no fail when line 'buggy' commented out
    		nop
    		nop
    		nop
    		nop
    		nop
    		nop
    		nop
    		nop	'fail bug @ $e58 when line 'buggy' uncommented, no fail when line 'buggy' commented out
    		nop
    
    bug		loclong	bx,@list
    'buggy		nop
    		jmp	@cog_code	'can also use jmp @hub_code2, this was to see if the jump target mattered
    hub_code2	wrlong	bx, tval1	'save bx so it can be checked from the monitor
    		rdlong	bx, bx
    		wrlong	bx, tval2	'save bx so it can be checked from the monitor
    		cogrun	monitor, #0	'run the monitor
    
    
    list		long	3	'index 0
    		long	1	'index 1
    		long	4	'index 2
    		long	1	'index 3
    		long	5	'index 4
    		long	9	'index 5
    

    C.W.
  • Bill Henning Posts: 6,445
    edited 2014-03-24 05:34
    Yes, the WIDE's always load from a 32 byte (8 long) boundary
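
    As a small illustration of that boundary rule, the Python sketch below computes which 32-byte line a given byte address belongs to. The helper name `wide_base` is hypothetical, and the addresses are the ones quoted earlier in the thread.

```python
WIDE_BYTES = 32  # 8 longs of 4 bytes each

def wide_base(byte_addr):
    """Start address of the 32-byte line that contains `byte_addr`."""
    return byte_addr & ~(WIDE_BYTES - 1)

for addr in (0xE00, 0xE38, 0xE58, 0xE74):
    base = wide_base(addr)
    print(f"${addr:03X} lies in line ${base:03X}..${base + WIDE_BYTES - 1:03X}")
```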
  • cgracey Posts: 14,151
    edited 2014-03-24 10:41
    ctwardell wrote: »
    It looks like in the cases where LOCLONG fails it is using the index value from D and the relative offset from S but is failing to include the absolute address of the instruction itself.


    Good sleuthing, ctwardell!

    I was running tests of my own and I remembered that someone said this about the address of the instruction being missing. I did the math and, sure enough, that was the problem, all right.

    I had accidentally omitted the pipeline-stage-advance signal from a flop clock gate, so when the cache was reloading, it clocked more than once, and after the first time, the data became errant. I fixed it and now I'm checking for any other such omissions.

    I'll do a recompile soon and post new files.

    GOOD JOB DISCOVERING THIS, GUYS!!!
  • Cluso99 Posts: 18,069
    edited 2014-03-24 13:44
    WTG guys. One bug killed ;)
  • cgracey Posts: 14,151
    edited 2014-03-24 16:46
    I updated the FPGA configuration file here:

    http://forums.parallax.com/showthread.php/125543-Propeller-II-update-BLOG?p=1251927&viewfull=1#post1251927

    This fixes the LOCxxxx bug.
  • Cluso99 Posts: 18,069
    edited 2014-03-24 21:22
    Here is a little test I ran to call some of the monitor routines from a different cog program.
    Unfortunately it is not possible to use all the routines! For instance, tx_string calls the cog routine rdstring (which of course is not present).
    It is ok to call tx_crlf though. It is also possible to use the text strings such as hello, hitspace and error.

    Note that the compiler requires the address in an EQU (ie in a CON block) to be shifted >>2. This is both for jmp/call and for locptra/b/etc. The shift is not required when referencing labels for data/instructions within a DAT. It is a pnut inconsistency that can be looked at later.

    So here is an example...
    CON
      _clkmode = xinput
      _xinfreq = 80_000_000
      _baud    = 115_200
      _bitrate = _xinfreq / _baud
      _txpin   = 90                                 ' P90=SO
      _rxpin   = 91                                 ' P91=SI
    
    ' P2 Monitor hub addresses
    rx_line         = $990 >> 2
    tx_crlf         = $A0C >> 2
    tx_string       = $B78 >> 2
    tx_hex          = $BCC >> 2
    rx_chr          = $BF8 >> 2
    rx_check        = $C04 >> 2
    error           = $C70 >> 2
    hitspace        = $C7C >> 2
    hello           = $C88 >> 2
    help            = $CE0 >> 2
     
    
      
    DAT
                    orgh    $00E00/4                        ' start of hub ram
                    org     0
    start        
                  setp      #0                      '\ external led on
                  clrp      #1                      '/
    
    'the following is a 5 sec delay mechanism only (allows PST to start)
                    getcnt  waitx
                    add     waitx,delta5
                    waitcnt waitx,0
                  SETSERA   #<<7 + _txpin, baud          'set SERA for 8-bit transmit on pin at baud
                  CLRP      #_txpin                         'make pin an output, SERA drives it high
    :loop         SEROUTA   #"O"                            'send message
                  SEROUTA   #"K"
                  SEROUTA   #$0D
                  SEROUTA   #"*"
                    getcnt  waitx                           '\ 1s delay
                    add     waitx,delta                     '|
                    waitcnt waitx,0                         '/
                  notp      #1                              ' toggle external led  
    '             jmp       #:loop
                  call      #tx_crlf
                  locptra   #hitspace
    '              locptra   #msg
    '             call      #tx_string                      ' we cannot use this because it then calls cog code!!
    :nextchr      rdbyte    x,ptra++        wz,wc           'read string byte
            if_nz serouta   x              
            if_nz jmp       @:nextchr
    :monitor
                    getcnt  waitx                           '\ 1s delay
                    add     waitx,delta                     '|
                    waitcnt waitx,0                         '/
                  cogrun    monitor_pgm,#0                  'relaunch cog0 with shutdown or monitor
    monitor_pgm   long      _rxpin<<24+_txpin<<16+($1B0+$37C)>>2    'monitor parameter (conveys pins)
    x             long      0
    '------------------------------------------------------------------------------------------------
    countx          long    _xinfreq
    count           long    5
    waitx           long    0
    delta           long    _xinfreq                ' 1 sec
    delta5          long    _xinfreq * 5            ' 5 sec
    baud            long    _bitrate
    '================================================================================================
                  orgh      $1000
    msg           long
                  byte      "Hit <space> to start monitor...",0
    
    Note: I have an LED and resistor across P0-P1.
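
    The >>2 convention used for the CON constants above can be sketched in Python as follows. This is only an illustration: the helper name `to_long_addr` is made up, the byte addresses are copied from the CON block, and the alignment check is an added assumption about what a careful conversion would want.

```python
# Byte addresses of a few ROM monitor routines, copied from the CON block.
MONITOR_BYTE_ADDRS = {
    "rx_line": 0x990,
    "tx_crlf": 0xA0C,
    "tx_string": 0xB78,
    "tx_hex": 0xBCC,
}

def to_long_addr(byte_addr):
    """Convert a long-aligned hub byte address to a long address (>>2)."""
    if byte_addr & 3:
        raise ValueError(f"${byte_addr:X} is not long-aligned")
    return byte_addr >> 2

for name, addr in MONITOR_BYTE_ADDRS.items():
    print(f"{name:10s} ${addr:03X} >> 2 = ${to_long_addr(addr):03X}")
```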
  • Ariba Posts: 2,690
    edited 2014-03-24 21:49
    Where should we report the bugs we find? Perhaps this thread can get a more general title.

    Anyway, here is what I have found so far:

    1) Delayed jumps do not work correctly if one of the delayed instructions is a WAITVID and tasks are enabled.

    2) In the Monitor: If you want to start a cog from hubmemory with 0+addr, the addr must be entered as hubaddr/4, which is a bit strange.
    I think the monitor should calculate that for us (shr value,#2 before cogrun).

    Andy
  • ctwardell Posts: 1,716
    edited 2014-03-24 22:17
    Ariba wrote: »
    Delayed jumps do not work correctly if one of the delayed instructions is a WAITVID and tasks are enabled.

    I think that would be by design since waitvid jumps to itself instead of stalling the pipeline when in multitasking mode.

    All of the instructions listed below would likely be in the same category.

    This is one of those cases where we will need good documentation.


    From the 20Mar2014 Prop2_Docs.txt:
    Some instructions which stall the pipeline during single-task execution will, instead, jump back to
    themselves during multi-task execution (JMP #$), until their release condition is met. This way they
    avoid stalling the pipeline, allowing other tasks to execute in the interstitial time slots:
    
      WAITVID D/#,S/#    wait for VID to grab new data
    
      SERINA  D          wait for serial input on SERA
      SERINB  D          wait for serial input on SERB
      SEROUTA D/#        wait to send serial output on SERA
      SEROUTB D/#        wait to send serial output on SERB
    
      GETMULL D          wait for lower multiplier result
      GETMULH D          wait for upper multiplier result
      GETDIVQ D          wait for divider quotient result
      GETDIVR D          wait for divider remainder result
      GETSQRT D          wait for square root result
      GETQX   D          wait for CORDIC X result
      GETQY   D          wait for CORDIC Y result
      GETQZ   D          wait for CORDIC Z result
    
      SYNCTRA            wait for PHSA to roll over
      SYNCTRB            wait for PHSB to roll over
    
    
    For the above instructions, multi-tasking is considered to be active when SETTASK D/# has written
    a mixture of tasks to the time slots. Remember that in multi-tasking, the above instructions behave
    as branches, and therefore cannot be used in REPD/REPS instruction-repeat blocks. Also, you should
    not use INDx++/INDx--/++INDx with these instructions during multi-tasking, as they will cause
    INDA/INDB to increment or decrement each time they loop back to themselves, before the release
    condition is met.
    

    C.W.
  • Ariba Posts: 2,690
    edited 2014-03-24 22:37
    Thank you, C.W.

    Yes, that explains it. I am trying to port old code from a time when this was possible, but now that I think about it, it was not very efficient because the task stalled until the WAITVID was done. So I should have used POLLVID, and then it's the same as we have now.

    Andy
  • cgracey Posts: 14,151
    edited 2014-03-25 04:03
    Ariba wrote: »
    Where should we report the bugs we find? Perhaps this thread can get a more general title.

    Anyway, here is what I have found so far:

    1) Delayed jumps do not work correctly if one of the delayed instructions is a WAITVID and tasks are enabled.

    2) In the Monitor: If you want to start a cog from hubmemory with 0+addr, the addr must be entered as hubaddr/4, which is a bit strange.
    I think the monitor should calculate that for us (shr value,#2 before cogrun).

    Andy


    Andy, this is a hard thing to decide. With hub execution now, things are a little different, since the cog uses a 16-bit address (ignores the two LSBs) for instructions in hub memory, which are longs. Have you noticed the 'X' mode in the ROM Monitor yet? I put it there to help orient people to a cog's execution perspective. In that mode, you see and manipulate the hub memory in this 16-bit-address/long perspective. If you see what the cog sees, it becomes very simple again.
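
    To make the two perspectives concrete, here is a rough Python sketch of the mapping between hub byte addresses and the 16-bit long addresses a cog uses for hub instructions. The names `cog_view` and `hub_view` are invented for this example, and the addressable span is simply what a 16-bit long address implies.

```python
LONG_ADDR_BITS = 16  # width of the cog's hub-instruction address, per the post above

def cog_view(byte_addr):
    """Hub byte address as the cog sees it: the two LSBs are ignored."""
    return (byte_addr >> 2) & ((1 << LONG_ADDR_BITS) - 1)

def hub_view(long_addr):
    """Long address converted back to a hub byte address."""
    return long_addr << 2

# Span reachable with a 16-bit long address, in bytes.
ADDRESSABLE_BYTES = (1 << LONG_ADDR_BITS) * 4

print(hex(cog_view(0xE00)), hex(hub_view(0x380)), ADDRESSABLE_BYTES)
```

    In this view the program origin $E00 used throughout the thread is long address $380, which is the value the monitor expects when starting a cog from hub memory.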