Prop2 FPGA files!!! - Updated 2 June 2018 - Final Version 32i - Page 151 — Parallax Forums

Prop2 FPGA files!!! - Updated 2 June 2018 - Final Version 32i


Comments

  • evanh, could you explain in more detail what you're suggesting? The ORG directive currently tells the assembler to go into cog mode, and it sets the starting cog address. What else should it do?
  • evanh Posts: 15,126
    edited 2018-12-06 01:30
    I'm fine with ORG, that was just a passing remark. It's LOC that needs the work.

    EDIT: I'd call it a base address rather than start address. "Start" might be mistaken for start of execution.
    EDIT2: Hmm, base is wrong too, ORG is not a relative thing at all. Section origin then.

  • I added a warning in p2asm when LOC is used with a relative address. I think that should be sufficient. Maybe PNut should add that as well.
  • evanh Posts: 15,126
    Thanks.
  • evanh wrote: »
    It's not base-relative but PC-relative. PC-relative only makes sense for actual branches.

    That really isn't true. One of the things Chip made explicit early on was the fact that data can be intermingled with code.
  • Dave Hein wrote: »
    I added a warning in p2asm when LOC is used with a relative address. I think that should be sufficient. Maybe PNut should add that as well.

    I still don't quite see the danger of using LOC with a relative address, at least one above $400. After this code:
       orgh $400
       loc pa, #@label  ' relative addressing
       loc pb, #\@label ' absolute addressing
       cogstop #0
    label
       long 1
    
    PA and PB should have the same value. Am I missing something?
  • PA will contain $40C-$400 = $C. PB will contain $40C.
  • I was wrong. PA will contain a value of 8. Here's the listing from p2asm:
                       dat
    00400                 orgh $400
    3: WARNING: Relative mode used with LOC instruction
    00400     fe900008    loc pa, #@label  ' relative addressing
    00404     fea0040c    loc pb, #\@label ' absolute addressing
    00408     fd640003    cogstop #0
    0040c              label
    0040c     00000001    long 1
    
    As I said before, there might be value in using the difference for position-independent code.
  • OK, I was wrong. I ran this under spinsim, and I got PA=$1020 and PB=$1030. This may be correct, or p2asm may be wrong, or spinsim might be wrong. I'll have to check the binary with PNut's binary.
  • ozpropdev Posts: 2,791
    edited 2018-12-06 03:10
    Both PA/PB will contain $408 $40C

    Edit: typo
  • Dave Hein wrote: »
    OK, I was wrong. I ran this under spinsim, and I got PA=$1020 and PB=$1030. This may be correct, or p2asm may be wrong, or spinsim might be wrong. I'll have to check the binary with PNut's binary.

    p2asm looks OK, it produces the same thing as fastspin does, and when I run the result on the FPGA both pa and pb have the same value. I've attached the code I used for testing: foo.bas is the original source, foo.spin2 is the raw PASM produced by fastspin, foo.lst is the listing file that p2asm produces when it compiles foo.spin2. The output is:
    $ bin/loadp2 foo.binary -t
    [ Entering terminal mode.  Press ESC to exit. ]
    getting values
    pa=1064 pb=     1064
    
    which is correct (1064 = $428, which is where the label ends up in memory).

    Note that there's a bug in fastspin 3.9.10 such that it cannot handle @ and \ in inline assembly. That's fixed in the current github sources, so you'll need to use those if you want to regenerate foo.spin2.
  • ozpropdev Posts: 2,791
    edited 2018-12-05 23:10
    Checked in Pnut
     orgh $400
       loc pa, #@label  ' relative addressing
       loc pb, #\@label ' absolute addressing
       cogstop #0
    label
    
    shows PA = $8 PB = $40c
    00400- 08 00 90 FE 0C 04 A0 FE 03 00 64 FD 00 00 00 00   '..........d.....'
    

    but this code shows PA = $40C and PB = $40C
    orgh	$400
    	org
    	loc	pa,#@label
    	loc	pb,#\@label
    	cogstop	#0
    label
    
    
    shows
    00400- 0C 04 80 FE 0C 04 A0 FE 03 00 64 FD 00 00 00 00   '..........d.....'
    

    Edit: Pnut switches to absolute because the ORG directive causes a domain crossing.






  • ersmith Posts: 5,900
    edited 2018-12-06 00:41
    ozpropdev wrote: »
    Checked in Pnut
     orgh $400
       loc pa, #@label  ' relative addressing
       loc pb, #\@label ' absolute addressing
       cogstop #0
    label
    
    shows PA = $8 PB = $40c
    00400- 08 00 90 FE 0C 04 A0 FE 03 00 64 FD 00 00 00 00   '..........d.....'
    

    I think you're confusing the instruction encoding with what is actually put in the register when the instruction executes. If you execute the "relative addressing" version of the loc instruction, the PC after the instruction ($404) is added to the offset ($8 in this case) to get the final value of $40c. In other words, at run time PA and PB will end up with the same value of $40c in them when the two instructions execute.

    (Try it!)
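The arithmetic described above can be checked with a small model. This is only a sketch in Python, not anything from PNut, p2asm or fastspin: the function name `decode_loc_hubexec` is made up, and the bit layout it assumes (bits 27..23 = %11101, bits 22..21 selecting PA/PB/PTRA/PTRB, bit 20 as the relative flag, bits 19..0 as the address field) is inferred from the opcodes quoted in this thread. It only models LOC executed from hubexec, where the next PC is the instruction address plus 4.

```python
import struct

def decode_loc_hubexec(opcode: int, address: int) -> tuple[str, int]:
    """Return (register, run-time value) for a LOC opcode fetched at a hub address.

    Assumed layout: bits 27..23 = %11101, bits 22..21 = register select,
    bit 20 = relative flag, bits 19..0 = address or signed offset.
    """
    if (opcode >> 23) & 0x1F != 0b11101:
        raise ValueError(f"not a LOC opcode: {opcode:08x}")
    register = ("PA", "PB", "PTRA", "PTRB")[(opcode >> 21) & 0b11]
    field = opcode & 0xFFFFF
    if (opcode >> 20) & 1:
        # Relative form: sign-extend the 20-bit offset and add the next PC.
        if field & 0x80000:
            field -= 0x100000
        value = (address + 4 + field) & 0xFFFFF
    else:
        # Absolute form: the field is the value itself.
        value = field
    return register, value

# First eight bytes of the PNut dump quoted above, stored little-endian at $400.
dump = bytes.fromhex("08 00 90 FE 0C 04 A0 FE")
first, second = struct.unpack("<2I", dump)

for addr, op in ((0x400, first), (0x404, second)):
    reg, val = decode_loc_hubexec(op, addr)
    print(f"${addr:05x}: {op:08x} -> {reg} = ${val:x}")
```

Under those assumptions both instructions give $40c, even though the encoded fields are $8 and $40c.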
  • Cluso99 Posts: 18,066
    ersmith wrote: »
    ozpropdev wrote: »
    Checked in Pnut
     orgh $400
       loc pa, #@label  ' relative addressing
       loc pb, #\@label ' absolute addressing
       cogstop #0
    label
    
    shows PA = $8 PB = $40c
    00400- 08 00 90 FE 0C 04 A0 FE 03 00 64 FD 00 00 00 00   '..........d.....'
    

    I think you're confusing the instruction encoding with what is actually put in the register when the instruction executes. If you execute the "relative addressing" version of the loc instruction, the PC after the instruction ($404) is added to the offset ($8 in this case) to get the final value of $40c. In other words, at run time PA and PB will end up with the same value of $40c in them when the two instructions execute.

    (Try it!)

    WHAT ?!?!
  • evanh Posts: 15,126
    hehe ... isn't it lovely ... come join me in my torment ... hahaha



  • Cluso99 wrote: »
    ersmith wrote: »
    ozpropdev wrote: »
    Checked in Pnut
     orgh $400
       loc pa, #@label  ' relative addressing
       loc pb, #\@label ' absolute addressing
       cogstop #0
    label
    
    shows PA = $8 PB = $40c
    00400- 08 00 90 FE 0C 04 A0 FE 03 00 64 FD 00 00 00 00   '..........d.....'
    

    I think you're confusing the instruction encoding with what is actually put in the register when the instruction executes. If you execute the "relative addressing" version of the loc instruction, the PC after the instruction ($404) is added to the offset ($8 in this case) to get the final value of $40c. In other words, at run time PA and PB will end up with the same value of $40c in them when the two instructions execute.

    (Try it!)

    WHAT ?!?!

    Look at foo.spin2 and/or foo.lst that I posted a few pages back (that's foo.bas converted to PASM2 by fastspin). The relevant instructions are:
                       ' 
                       ' sub getlabelvals()
    00408              _getlabelvals
                       '   asm
    00408     fe90001c 	loc	pa, #@label
    0040c     fea00428 	loc	pb, #\@label
    00410     f6006df6 	mov	_var_00, pa
    00414     f6006ff7 	mov	_var_01, pb
                       '   paval = x
    00418     fc606c2b 	wrlong	_var_00, objptr
                       '   pbval = y
    0041c     f1045604 	add	objptr, #4
    00420     fc606e2b 	wrlong	_var_01, objptr
    00424     f1845604 	sub	objptr, #4
                       ' label:
    00428              label
    00428              _getlabelvals_ret
    00428     fd64002e 	reta
    
    Note that the first loc is encoded as $fe90001c (relative addressing) whereas the second loc is encoded as $fea00428 (absolute addressing). At runtime they both put $428 into the respective registers, as is proven by the program output.

    The reason is simple: the PC relative "loc" instruction adds the next PC (PC+4) to the offset to get the value to put into the register, just like a relative "jmp" adds the next PC to the offset to get the new PC. So the first loc, at address $408, adds $40c to the offset $1c to get the final value $428.

    Note that it isn't *just* the offset that is different in the two loc encodings, there's actually a bit in the instruction that says whether the offset is absolute or relative.

    You should be able to assemble and run foo.spin2 with PNut to verify this. Actually maybe not, it may use @@@, so you may have to use fastspin or p2asm. But all 3 assemblers agree about the encoding of the LOC instructions, so this isn't some quirk of fastspin or p2asm, it's the way the hardware works.
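Going the other way, the two encodings in the listing can be reproduced from the instruction address and the label address. The following Python sketch is an illustration only: `encode_loc` is a hypothetical helper, the condition field is assumed to be %1111 (always) as in the listing, and the field positions are inferred from the opcodes shown in this thread. It covers hubexec only, where the next PC is the instruction address plus 4.

```python
REGISTERS = ("PA", "PB", "PTRA", "PTRB")

def encode_loc(register: str, instr_address: int, target: int, relative: bool) -> int:
    """Build a LOC opcode for hubexec (assumed layout, condition = always)."""
    ww = REGISTERS.index(register)
    if relative:
        # Offset is measured from the next PC, as for a relative jmp.
        field = (target - (instr_address + 4)) & 0xFFFFF
    else:
        field = target & 0xFFFFF
    return (0xF << 28) | (0b11101 << 23) | (ww << 21) | (int(relative) << 20) | field

# The two LOC instructions from the foo.lst excerpt, label at $428.
print(f"{encode_loc('PA', 0x408, 0x428, relative=True):08x}")   # loc pa, #@label
print(f"{encode_loc('PB', 0x40C, 0x428, relative=False):08x}")  # loc pb, #\@label
```

With the label at $428 this yields fe90001c and fea00428, matching the listing: the relative form carries the offset $1c plus the relative bit, the absolute form carries $428 directly.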

  • evanh Posts: 15,126
    edited 2018-12-06 01:40
    ersmith wrote: »
    The reason is simple: the PC relative "loc" instruction adds the next PC (PC+4) to the offset to get the value to put into the register, just like a relative "jmp" adds the next PC to the offset to get the new PC. So the first loc, at address $408, adds $40c to the offset $1c to get the final value $428.

    Oh, oops, I've not been examining the final register content ... and I was convinced I was too, damn ...

  • I get $40C for both cases when running on the FPGA. However, spinsim seems to be confused. It produces $1020 and $1030. It's shifting the value up by 2 bits, which means it must think it's in the COG mode.

    If I move the routine to a different location other than $400 I get the correct value in the relative mode, but an incorrect value in the absolute mode. This kind of shows the value of having position-independent code. It appears that my linker isn't adjusting the address for the absolute mode. It doesn't surprise me since I don't recall handling relocation for the LOC command.

    I'm going to take the warning print out for the LOC command.
  • I get $40C in both cases on silicon.
  • evanh Posts: 15,126
    edited 2018-12-06 04:29
    The specified ORG addresses are for the "label". There are actually three labels, one for each ORG case.
    ===================================================
     LOC/MOV syntax        PA register results
     from hubRAM         ORG $0F  ORGH $110  ORGH $600
    ===================================================
    loc  pa, #label     0000000f   00000110   00000600
    loc  pa, #@label    0000003c   00000110   00000600
    loc  pa, #\label    0000000f   00000110   00000600
    loc  pa, #\@label   0000003c   00000110   00000600
    mov  pa, ##label    0000000f   00000110   00000600
    mov  pa, ##@label   0000003c   00000110   00000600
    ===================================================
     LOC/MOV syntax        PA register results
     from cogRAM         ORG $0F  ORGH $110  ORGH $600
    ===================================================
    loc  pa, #label     000fffee   000003e9   00000600
    loc  pa, #@label    00000081   000003c8   00000600
    loc  pa, #\label    0000000f   00000110   00000600
    loc  pa, #\@label   0000003c   00000110   00000600
    mov  pa, ##label    0000000f   00000110   00000600
    mov  pa, ##@label   0000003c   00000110   00000600
    
    
  • evanh Posts: 15,126
    Here's code for one line:
    		loc     pa, #\@str1
    		call    #puts
    		loc     pa, #palette1
    		call    #itoh
    		call    #putsp
    		loc     pa, #palette2
    		call    #itoh
    		call    #putsp
    		loc     pa, #palette3
    		call    #itoh
    		call    #putnl
    
  • evanh Posts: 15,126
    edited 2018-12-06 04:40
    Apologies on the PC-relative complaint. I was way off there.

    There is still the bug in Pnut though. It is in the cogexec LOC instruction encoding for PC-relative encoding below absolute $400. I guess that's where Cluso came unstuck and got me digging.

  • evanh Posts: 15,126
    edited 2018-12-06 07:26
    Here's another one:
    I've just been experimenting with building some diagnostic code and discovered it would be nice to know if the caller code was from cogexec or hubexec. A third status bit in the stacked address maybe.

    In this case I'm wanting a subroutine to extract the encoding of the instruction prior to the call. If I don't know whether the caller was in cogexec at the time or not then I can't calculate the relative address from the call stack.

    EDIT: Ah, forgot that code can't execute below $400 in hubRAM. That should be enough ...

    EDIT2: And working source code:
    		pop     char                  'grab caller address
    		push    char                  'restack it
    
    		cmp     char, ##$400   wcz    'test if caller was cogexec or hubexec, C = borrow of (D - S)
    
    if_c		sub     char, #2              'was cogexec
    if_c		alts    char                  'MOV indirection - get register content of register number in "char"
    if_c		mov     pa, 0-0
    
    if_nc		sub     char, #8              'was hubexec
    if_nc		rdlong  pa, char
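The address adjustment in that routine can be written out as a small Python model. This is a sketch under stated assumptions, not part of the original code: `previous_instruction` is a made-up name, and the argument is taken to be just the return address, with any flag bits stacked alongside it already removed. It relies on the point made in the EDIT above, that code cannot execute below $400 in hubRAM, so a return address below $400 means a cogexec caller.

```python
HUBEXEC_START = 0x400  # code below this address runs from cog/LUT RAM

def previous_instruction(return_address: int) -> tuple[str, int]:
    """Locate the instruction just before the CALL that stacked return_address.

    The return address points one instruction past the CALL, so the wanted
    instruction is two instructions back: 2 register addresses in cogexec
    (long addressing), 8 bytes in hubexec (byte addressing).
    """
    if return_address < HUBEXEC_START:
        return "cogexec", return_address - 2
    return "hubexec", return_address - 8

print(previous_instruction(0x012))  # caller ran from cog RAM
print(previous_instruction(0x448))  # caller ran from hub RAM
```

In the PASM2 version the cogexec result is then used as a register number (via ALTS) and the hubexec result as a hub address (via RDLONG).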
    
    
  • evanh Posts: 15,126
    edited 2018-12-06 07:43
    Here's the results of above:
     LOC/MOV syntax         ORG $00F           ORGH $00110         ORGH $00600
     from hubRAM         op-code  PA-data    op-code  PA-data    op-code  PA-data
    ==============================================================================
    loc  pa, #label     fe80000f 0000000f   fe800110 00000110   fe900198 00000600
    loc  pa, #@label    fe80003c 0000003c   fe800110 00000110   fe900174 00000600
    loc  pa, #\label    fe80000f 0000000f   fe800110 00000110   fe800600 00000600
    loc  pa, #\@label   fe80003c 0000003c   fe800110 00000110   fe800600 00000600
    mov  pa, ##label    f607ec0f 0000000f   f607ed10 00000110   f607ec00 00000600
    mov  pa, ##@label   f607ec3c 0000003c   f607ed10 00000110   f607ec00 00000600
    
     LOC/MOV syntax         ORG $00F           ORGH $00110         ORGH $00600
     from cogRAM         op-code  PA-data    op-code  PA-data    op-code  PA-data
    ==============================================================================
    loc  pa, #label     fe9fffe0 000ffff7   fe9003dc 000003f5   fe800600 00000600
    loc  pa, #@label    fe900070 00000090   fe9003b8 000003da   fe800600 00000600
    loc  pa, #\label    fe80000f 0000000f   fe800110 00000110   fe800600 00000600
    loc  pa, #\@label   fe80003c 0000003c   fe800110 00000110   fe800600 00000600
    mov  pa, ##label    f607ec0f 0000000f   f607ed10 00000110   f607ec00 00000600
    mov  pa, ##@label   f607ec3c 0000003c   f607ed10 00000110   f607ec00 00000600
    
    
  • evanh Posts: 15,126
    edited 2018-12-06 08:34
    Wow, that detail is needed. Pnut is making a mess of the PC-relative LOC encodings. Only six of the twelve PC-relative combinations above are correct. Even two of the hubexec encodings ($fe800110 is an absolute encoding) are wrong because they use absolute encoding below $400 in hubRAM where it should still be PC-relative.

    Or is that case intentional because hubexec can't go there?

  • Cluso99 Posts: 18,066
    evanh wrote: »
    Wow, that detail is needed. Pnut is making a mess of the PC-relative LOC encodings. Only six of the twelve PC-relative combinations above are correct. Even two of the hubexec encodings ($fe800110 is an absolute encoding) are wrong because they use absolute encoding below $400 in hubRAM where it should still be PC-relative.

    Or is that case intentional because hubexec can't go there?

    LOC is also usable to obtain the hub address of a table, which can reside below or above hub $400.

    That is one of the problems I found - the hard way as I wasted a whole day trying to find a bug in my program.
  • evanh Posts: 15,126
    edited 2018-12-17 02:47
    Chip,
    I've bumped into a design flaw/bug in lutRAM sharing! RDLUT data, or address, is being garbaged if the sharing cog WRLUTs to the same address on the same sysclock.

    In my case, an instruction stall would also mess me up but it would have to be a number of clocks to produce the result I'm getting.

    PS: I'm very certain. Testing is on P123 board with v32i image loaded.

  • jmg Posts: 15,140
    evanh wrote: »
    Chip,
    I think I've bumped into a design flaw/bug in lutRAM sharing! RDLUT data, or address, is being garbaged if the sharing cog WRLUTs to the same address on the same sysclock.

    In my case, an instruction stall would also mess me up but it would have to be a number of clocks to produce the result I'm getting.

    Do you mean RDLUT from COGn, occurring on the same address, and on the same sysclk as the WRLUT from COGm, is corrupted ?
    ie it is neither the old value, nor the new value ?
    Do you have some test code that reproduces this ?
  • evanh Posts: 15,126
    jmg wrote: »
    ie it is neither the old value, nor the new value ?

    Definitely not the new value. I don't think an old value could upset things the way it has because it will be the same every time and the importance of the data is metronomic ...

    jmg wrote: »
    Do you have some test code that reproduces this ?

    It's messy and non-specific.

  • evanh Posts: 15,126
    The source as is:
    '==================================
    ' paired mailbox
    '==================================
    ORG $3fe
    monitor         res     1
    duration        res     1
    
    
    '==================================
    ' Sinc3 filter (cogexec, paired)
    '==================================
    ORG
    start_sinc3
    cid		cogid   cid
    		testb   cid, #0         wz
    if_z		jmp     #start_monitor        'identical code on paired cogs
    
    		wrpin   ##%0111_0000_000_0000100000000_00_01111_0, #mpin
    		                'adc/counter mode, bitstream is #tpin, clock input is #mpin (#tpin+1)
    		wypin   #0, #mpin             'inc on high
    		wxpin   #0, #mpin             'totaliser
    		dirh    #mpin                 'enable smart pin
    
    'Sinc3 loop (8 sysclocks)
    		rep     @.lend, #0            'loop forever
    
    		rdpin   acc1, #mpin
    		add     acc2, acc1
    		add     acc3, acc2
    		wrlut   acc3, #(monitor & $1ff)           'for the decimator (lut sharing is active)
    '		add     acc4, acc3
    '		wrlut   acc4, #(monitor & $1ff)           'for the decimator (lut sharing is active)
    .lend
    		cogstop cid
    
    
    
    acc1		long    0
    acc2		long    0
    acc3		long    0
    acc4		long    0
    diff1		long    0
    diff2		long    0
    diff3		long    0
    diff4		long    0
    period		long    400                   'max 1024 clocks, needs 30-bit registers
    
    
    '============================
    ' Monitor (cogexec, paired)
    '============================
    start_monitor
    		lutson
    samp		wrpin   ##%1010000000000_00_00010_0, #1
    				'set DAC mode for DAC1/Blue, 16-bit dither smartpin mode
    offset		wxpin   #1, #1                'continuous dither
    scale		wypin   #0, #1                'DAC level
    		dirh    #1                    'enable DAC
    
    delay		setse1  #%110_000000|rx_pin     'high IN from smartpin
    
    tick		getct   tick
    key		addct1  tick, period          'start the clock!
    
    'keyboard controls
    '===================
    .keyboard
    		rdpin   key, #rx_pin
    		shr     key, #32-8           'lsbit align
    
    'scaling recalculation
    		encod   scale, period
    		shr     scale, #1
    		mul     scale, #8
    
    		cmp     key, #"+"      wz
    if_z		add     offset, #511
    		cmp     key, #"-"      wz
    if_z		sub     offset, #511
    
    		cmp     key, #"*"      wz
    if_z		add     period, scale
    		cmp     key, #"/"      wz
    if_z		sub     period, scale
    		fles    period, #504           'max of 1024 clocks, multiple of 8
    		fges    period, #32            'fastest monitor loop time, multiple of 8
    
    		mov     delay, period
    		sub     delay, #25
    		waitx   #6                     'lutRAM sharing flaw!!  Change the #6 to hide it
    
    'monitor loop
    '==============
    .monl
    '		waitct1
    '		addct1  tick, period
    
    '		rep     @.monend, #0          'loop forever
    
    		waitx   delay                 '2
    		rdlut   samp, #(monitor & $1ff) '5
    		sub     samp, diff1           '7
    		add     diff1, samp           '9
    		sub     samp, diff2           '11
    		add     diff2, samp           '13
    		sub     samp, diff3           '15
    		add     diff3, samp           '17
    '		sub     samp, diff4
    '		add     diff4, samp
    
    'non-smartpin mode
    '		shr     samp, scale           '     scale signal to fit DAC trace
    '		add     samp, offset          '     offset centring
    '		shl     samp, #8              '     align for 8-bit DAC1
    '		setdacs samp
    'smartpin mode
    '		shr     samp, scale           '     scale signal to fit DAC trace
    		add     samp, offset          '19   offset centring
    		wypin   samp, #1              '21   smartpin 16-bit dither into DAC1
    '		jse1   #.keyboard             '23   branch on event reduces monitor loop by one instruction
    .monend
    
    		jnse1   #.monl                '25   branch on event reduces monitor loop by one instruction
    		jmp     #.keyboard
    
    		cogstop cid
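For anyone following the monitor loop, the three `sub samp, diffN` / `add diffN, samp` pairs act as three chained first-difference stages: after each pair, `diffN` holds that stage's previous input and `samp` holds the difference. Here is a rough Python model of just that part, assuming 32-bit wrapping registers; the class name `Sinc3Comb` is invented for illustration, and the DAC scaling, offset and timing are left out.

```python
MASK32 = 0xFFFFFFFF  # registers wrap at 32 bits

class Sinc3Comb:
    """Model of the three sub/add pairs applied to each RDLUT sample."""

    def __init__(self) -> None:
        self.diff = [0, 0, 0]  # diff1..diff3, all starting at zero

    def step(self, acc3: int) -> int:
        samp = acc3 & MASK32
        for i in range(3):
            samp = (samp - self.diff[i]) & MASK32          # sub samp, diffN
            self.diff[i] = (self.diff[i] + samp) & MASK32  # add diffN, samp
        return samp

comb = Sinc3Comb()
# n**3 has a constant third difference of 6 once the stages have filled.
outputs = [comb.step(n ** 3) for n in range(5)]
print(outputs)
```

Feeding in n³ gives 0, 1, 5 while the stages fill and then settles at 6, which is the expected third difference.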
    
    