Prop2 FPGA files!!! - Updated 2 June 2018 - Final Version 32i - Page 136 — Parallax Forums

Comments

  • cgracey Posts: 14,155
    Cluso99, only the 9-bit S field of the next instruction is affected. I may not be understanding. Did you see the documentation?
  • Cluso99 Posts: 18,069
    edited 2018-04-28 10:40
    From the docs...
    REGISTER INDIRECTION

    Cog registers can be accessed indirectly most easily by using the ALTS/ALTD/ALTR instructions. These instructions sum their D[5:0] and S/#[5:0] values to compute an address that is directly substituted into the next instruction’s S field, D field, or result register address (normally, this is the same as the D field). This all happens within the pipeline and does not affect the actual program code. The idea is that S/# can serve as a register base address and D can be used as an index:

    ALTS index,#table 'set next S field to table+index
    MOV OUTA,0 'output register[table+index] to OUTA

    ALTD index,#table 'set next D field to table+index
    MOV 0,INA 'write INA to register[table+index]

    ALTR index,#table 'set next write to table+index
    XOR INA,INB 'write INA^INB to register[table+index]
    The docs were not clear as to which bits are being used.

    However, the issue is that normally when a sub-range of bits is used, the others are ignored.

    This is not the case here, and in particular, ALTS overwrites the D value, and in some cases it is incorrect.
    Also, the result passed through to the next instruction after the ALTS is also incorrect if the ignored bits are not zero.

    With an ALTS where D=$144 and S=$144, D is overwritten with $5D and the S passed on is $88. Note that S[7:0] seems to be added, and S[8] results in a subtraction from the D value.

    While it is not a show-stopper, it best be spelt out as it is a restriction, not normal use.
    ALTS sums D[8:0] and S/#[5:0]. All other bits of D & S must be zero

    BTW I often store values in unused bits.
    I note so does Chip...
    .v _ret_ cmp 0,#0 'bottom byte used as a counter

    I discovered this as a consequence of trying to use ALTS followed by a RDLONG, expecting an S result of 20 bits :(
  • cgracey Posts: 14,155
    edited 2018-04-28 11:59
    Whoops!

    The docs should say...

    "These instructions sum their D[8:0] and S/#[8:0] values to compute a 9-bit address that is directly substituted into the next instruction’s S field (ALTS), D field (ALTD), or result register address (ALTR) (normally, this is the same as the D field).

    Upper bits of S are used to inc/dec D[8:0].
  • Cluso99 Posts: 18,069
    cgracey wrote: »
    Whoops!

    The docs should say...

    "These instructions sum their D[8:0] and S/#[8:0] values to compute a 9-bit address that is directly substituted into the next instruction’s S field (ALTS), D field (ALTD), or result register address (ALTR) (normally, this is the same as the D field).

    Upper bits of S are used to inc/dec D[8:0].
    So S[D] means base[index], and the new incremented/decremented index is rewritten by ALTS

    Needs more explanation in the docs (later)

    Looks like the following happens...

    Inserts S into the next instruction...
    Take original D[31:0] and replace D[8:0] with the sum of (D[8:0]+S[8:0]) ignoring overflow.

    Take original D[31:0] and write back with ALTS instruction D[31:0] ADDS S[17:9]>>9
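
A minimal Python sketch of the behaviour described above, checked only against the values reported in this thread. The function name `alts` is hypothetical, and two points are assumptions rather than documented facts: S[17:9] is treated as a signed 9-bit step, and the index update is assumed to wrap inside D[8:0]. For the values posted here, a full 32-bit addition would give the same results.

```python
def alts(d, s):
    """Model of ALTS as described in this thread (not taken from official docs).

    Returns (s_field_for_next_instruction, d_written_back).
    """
    MASK9 = 0x1FF
    next_s = (d + s) & MASK9          # D[8:0] + S[8:0], overflow discarded
    step = (s >> 9) & MASK9           # S[17:9]
    if step & 0x100:                  # assumption: S[17] acts as a sign bit
        step -= 0x200
    # Assumption: the update wraps inside D[8:0]; upper bits of D are kept.
    new_d = (d & 0xFFFFFE00) | ((d + step) & MASK9)
    return next_s, new_d


for base in (0x144, 0x244, 0x444, 0x844, 0x11223344):
    next_s, new_d = alts(0x144, base)
    print(f"base=${base:X}: next S=${next_s:03X}, index=${new_d:X}")
```

With index $144 the model gives next S = $088 for base $144 and $188 for bases $244, $444 and $844, with the index stepping by 0, +1, +2 and +4. A base of $11223344 gives next S = $088 and index $5D.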

  • Cluso99
    Another contributing factor to the incorrect results in your tests is the effect of the
    AUGS instruction also being applied to the instruction following the ALTS.
    This can be demonstrated with the following code.
    		mov	y,#$144
    		alts	y,##$11223344
    		mov	a,#0-0
    
    		mov	y,#$144
    		mov	x,##$11223344
    		alts	y,x
    		mov	a,#0-0
    
    
    Results :
    '
    
    00000: F6045144              MOV     y,#$144
    00001: FF089119              AUGS    #$89119
    00002: F9945144              ALTS    y,#$144 {$11223344}
    00003: F6045200              MOV     a,#$000
    
    COG | $00029: "a"      $11223288 %00010001_00100010_00110010_10001000 #287453832
    
    00004: F6045144              MOV     y,#$144
    00005: FF089119              AUGS    #$89119
    00006: F6045544              MOV     x,#$144 {$11223344}
    00007: F990502A              ALTS    y,x
    00008: F6045200              MOV     a,#$000
    
    COG | $00029: "a"      $00000088 %00000000_00000000_00000000_10001000 #136
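
The two results above can be reproduced with a few lines of Python arithmetic. This is a sketch under one assumption drawn from the listing: in the first case the AUGS prefix is taken to apply to the MOV after the ALTS, so the ALTS itself only sees the 9-bit immediate $144. The variable names are illustrative only.

```python
value = 0x11223344
augs = value >> 9                  # upper 23 bits carried by AUGS
imm9 = value & 0x1FF               # 9-bit immediate left in the S field
y = 0x144

next_s = (y + imm9) & 0x1FF        # S field substituted into the following MOV

# Case 1: ALTS y,##value. AUGS is assumed to extend the MOV's immediate.
case1 = (augs << 9) | next_s

# Case 2: ALTS y,x with x preloaded. No AUGS is pending when the MOV runs.
case2 = next_s

print(f"augs=${augs:X} imm9=${imm9:X} case1=${case1:08X} case2=${case2:08X}")
```

This matches the dumps: $11223288 for the immediate form and $00000088 for the register form.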
    

  • Cluso99 Posts: 18,069
    edited 2018-04-29 04:15
    I did do results with presetting the S & D values without AUGS.
     mov  base,  #$144      ' base                           
     mov  ndx,   #$144      ' original index                 
     mov  index, ndx        ' index                          
     alts index, base       ' index,base  s[d] = base[index] 
     mov  answer, #0                                         
           ----base--- ---index--- ---index--- ---answer--
      084- 44 01 00 00 44 01 00 00 44 01 00 00 88 00 00 00  'D...D...D.......'
                                                  ^^
     -----------------------------------------------------
     mov  base,  ##$244     ' base +1
     mov  ndx,   ##$144     ' original index
     mov  index, ndx        ' index
     alts index, base       ' index,base  s[d] = base[index]
     mov  answer, #0
      084- 44 02 00 00 44 01 00 00 45 01 00 00 88 01 00 00
                                   +1
     -----------------------------------------------------
     mov  base,  ##$444     ' base +2
     mov  ndx,   ##$144     ' original index
     mov  index, ndx        ' index
     alts index, base       ' index,base  s[d] = base[index]
     mov  answer, #0
      084- 44 04 00 00 44 01 00 00 46 01 00 00 88 01 00 00
                                   +2
     -----------------------------------------------------
     mov  base,  ##$844     ' base +4
     mov  ndx,   ##$144     ' original index
     mov  index, ndx        ' index
     alts index, base       ' index,base  s[d] = base[index]
     mov  answer, #0
      084- 44 08 00 00 44 01 00 00 48 01 00 00 88 01 00 00
                                   +4
     -----------------------------------------------------
    
    You will note the overflow loss. I did work out which bit made the ALTS decrement D.
    IIRC it was S[17], and S[16:9] is the value to add/subtract.

    Certainly the instruction needs expanding, including a warning about the caveats of lost overflow and non-zero upper bits.
     mov  base,  ##%0000_0000__0000_00___111110000___101000100  ' $xxxx_xx44 base -16??
     mov  base,  ##%0000_0000__0000_00___011110000___101000100  ' $xxxx_xx44 base +$F0 -> $0134
    

    I do understand the advantages the instruction gives, but it is not obvious, and a trap.
  • jmg Posts: 15,173
    cgracey wrote: »
    Upper bits of S are used to inc/dec D[8:0]
    Cluso99 wrote: »

    Take original D[31:0] and write back with ALTS instruction D[31:0] ADDS S[17:9]>>9
    <then>
    Certainly the instruction needs expanding, including a warning about the caveats of lost overflow and non-zero upper bits.

    I'm not following here, the equation given shows a 31:0 wide addition, but Chip's comment, and the lost overflow comment, both suggest this +/- wraps inside the lower 9 bits D[8:0] ?



  • Cluso99 Posts: 18,069
    jmg wrote: »
    cgracey wrote: »
    Upper bits of S are used to inc/dec D[8:0]
    Cluso99 wrote: »

    Take original D[31:0] and write back with ALTS instruction D[31:0] ADDS S[17:9]>>9
    <then>
    Certainly the instruction needs expanding, including a warning about the caveats of lost overflow and non-zero upper bits.

    I'm not following here, the equation given shows a 31:0 wide addition, but Chip's comment, and the lost overflow comment, both suggest this +/- wraps inside the lower 9 bits D[8:0] ?


    yes
  • jmg Posts: 15,173
    edited 2018-04-29 05:23
    Cluso99 wrote: »
    yes
    So the equation you gave above needs to be modified to reflect the wrap?
    Such a wrap is not going to be very compiler/relocatable friendly.
  • Cluso99 Posts: 18,069
    Chip or anyone,
    I forget, how much HUB ROM at the top needs to be kept free for the debug and interrupts?
  • 2K $ff800 to $fffff

    isr_address = $ff840 + (!cog & $f) << 7
    reg_buffer = $ff800 + (!cog & $f) << 7
  • cgracey Posts: 14,155
    Cluso99 wrote: »
    Chip or anyone,
    I forget, how much HUB ROM at the top needs to be kept free for the debug and interrupts?

    There are 16 longs for buffering cog regs $000..$00F and 16 longs for the debug routine. That's 32 longs per cog. Times 8 cogs means 256 longs, or 1KB. So, $FFC00...$FFFFF are directly needed, plus maybe that much, again, for overlay code to make the whole debugger work. So, perhaps $FF800..$FFFFF should be reserved.
  • ozpropdev wrote: »
    2K $ff800 to $fffff

    isr_address = $ff840 + (!cog & $f) << 7
    reg_buffer = $ff800 + (!cog & $f) << 7

    I'm still trying to hang onto 16 cogs!

    isr_address = $ffc40 + (!test_cog & 7) << 7
    reg_buffer = $ffc00 + (!test_cog & 7) << 7
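
As a quick check of the address formulas in this exchange, here is a small Python sketch. The helper name `debug_addresses` and its `cogs` parameter are hypothetical. Note that in Python `+` binds tighter than `<<`, so the shift is parenthesized explicitly; the posted formulas are read as base + ((!cog & mask) << 7).

```python
def debug_addresses(cog, cogs=16):
    """Return (reg_buffer, isr_address) hub addresses for one cog.

    cogs must be a power of two (8 or 16 in this discussion).
    """
    block = 0x80                           # 32 longs = 128 bytes per cog
    base = 0x100000 - cogs * block         # $FF800 for 16 cogs, $FFC00 for 8
    offset = (~cog & (cogs - 1)) << 7      # cog 0 lands in the highest block
    return base + offset, base + 0x40 + offset


for cogs in (16, 8):
    for cog in (0, cogs - 1):
        buf, isr = debug_addresses(cog, cogs)
        print(f"{cogs} cogs, cog {cog}: reg_buffer=${buf:05X} isr_address=${isr:05X}")
```

Either way cog 0 ends up at $FFF80/$FFFC0; only the lowest address used changes ($FF800 for 16 cogs, $FFC00 for 8).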


  • cgracey Posts: 14,155
    Yes, allowance for 16 cogs would work on any possible Prop2 chip.
  • Cluso99 Posts: 18,069
    None of that code is being set up by the ROM though?

    So we can use that area to preload something??? Because all of a sudden, 16KB ROM doesn't seem that much :(
  • Cluso99 Posts: 18,069
    edited 2018-05-01 01:55
    Chip,
    I seem to be unable to successfully test for a pull-down.

    By chance, is there a weak pull-down enabled in the FPGA? And if so, is this also present in the real P2 silicon?

    My mistake! Of course the pins without pull-downs will read as high :(

    Here is my test code
    test_pullups    mov     pullups, #0
                    mov     pulldns, #0
    
                    callpa  #cfg_dq,#check_pullup   'spi_dq pull-up?     load from SERIAL
            if_c    bith    pullups, #0             ' y:
                    outh    pa                      'spi_dq pull-down?   ignore SERIAL
                    callpa  #cfg_dq,#check_pulldn   '
            if_c    bith    pulldns, #0             ' y:
    
                    callpa  #cfg_cs,#check_pullup   'spi_cs pull-up?     load from FLASH
            if_c    bith    pullups, #2             ' y:
                    outh    pa                      'spi_cs pull-down? 
                    callpa  #cfg_cs,#check_pulldn   '
            if_c    bith    pulldns, #2             ' y:
    
                    callpa  #cfg_ck,#check_pullup   'spi_ck pull-up?     load from SD
            if_c    bith    pullups, #1             ' y:
                    outh    pa                      'spi_ck pull-down?   
                    callpa  #cfg_ck,#check_pulldn   '
            if_c    bith    pulldns, #1             ' y:
    
    '+-----------------------------------------------------------------------------+
    '+      Check pin pull-up (pull-down)                                          +
    '+-----------------------------------------------------------------------------+
    check_pullup    outl    pa                      'out bit low
    check_pulldn    dirh    pa                      'drive pin low (out bit must be low)
                    waitx   #30*1                   'wait >1us
                    dirl    pa                      'float pin
                    waitx   #30*5                   'wait >5us
            _ret_   testp   pa              wc      'sample pin, c=1 if pull-up
    '+-----------------------------------------------------------------------------+
    
  • Also your pulldns result will be the same as pullups result.
    'should be
            if_nc    bith    pulldns, #0
    '..etc
    
  • Cluso99 Posts: 18,069
    ozpropdev wrote: »
    Also your pulldns result will be the same as pullups result.
    'should be
            if_nc    bith    pulldns, #0
    '..etc
    
    Yes, I did change that too.

  • Cluso99 Posts: 18,069
    Using v32c fpga code, Windows 10, PST (115,200) and a BeMicroCV-A9

    I cannot get the autobaud/Prop_Chk sequence to return anything. pnut will download code.

    Maybe the timeout isn't long enough to manually type in the sequence. I tried to load a new ROM code with bigger timeout into $4C000 but that didn't work.

    Can someone please confirm that the manual sequence "> " and then "Prop_Chk<cr>" does indeed work (or not) with W10 & PST?
  • jmg Posts: 15,173
    Cluso99 wrote: »
    Using v32c fpga code, Windows 10, PST (115,200) and a BeMicroCV-A9

    I cannot get the autobaud/Prop_Chk sequence to return anything. pnut will download code.

    Maybe the timeout isn't long enough to manually type in the sequence.
    The code shows a 100ms monostable that resets on each char, so I doubt you would be able to type that. Most terminals can paste strings, though?

    My notes from earlier postings say the minimum send string is 19 chars, "> Prop_Chk 0 0 0 0 ",
    and the echo is 15 chars: CR+LF+“Prop_Ver Au”+CR+LF

    You could also capture what pnut sends, as a sanity check.

  • I tried that in a WIN7 system with TeraTerm and typed fairly slowly, and it worked, responding with "Prop_Ver A", so I don't think it is timing out. There is definitely something wrong with Cluso's setup.
  • ozpropdev Posts: 2,792
    edited 2018-05-01 05:41
    Cluso99 wrote: »
    Using v32c fpga code, Windows 10, PST (115,200) and a BeMicroCV-A9

    I cannot get the autobaud/Prop_Chk sequence to return anything. pnut will download code.

    Maybe the timeout isn't long enough to manually type in the sequence. I tried to load a new ROM code with bigger timeout into $4C000 but that didn't work.

    Can someone please confirm that the manual sequence "> " and then "Prop_Chk<cr>" does indeed work (or not) with W10 & PST?
    Running same setup here.
    > Prop_Chk 0 0 0 0 works fine.
    > Prop_Chk 0 0 0 0
    
    Prop_Ver F
    

    Edit: Make sure you issue a space after the ">" character.



  • Cluso99 Posts: 18,069
    edited 2018-05-01 08:17
    Yes, issuing a space after ">".
    Gave up. It just won't work, yet I can download. I can also run my own code with absolutely no problems, which outputs characters, then reads and echoes correctly.

    ozprop, what are the 0 0 0 0 after the Prop_Chk? Are they necessary??? I gave it "> Prop_Chk<cr>" and maybe "<lf>" (not sure what Enter gives on PST).

    Postedit: Just spoke to Peter. Yep requires the 0 0 0 0 :(
  • You need the 0 0 0 0 as a "select all Props" mask I believe and I know it doesn't work if you leave them out.

    Just copy and paste the next two lines (includes a CR)
    > Prop_Chk 0 0 0 0
    
    
  • jmg Posts: 15,173
    edited 2018-05-01 20:34
    Just copy and paste the next two lines (includes a CR)
    > Prop_Chk 0 0 0 0
    
    

    I think it replies on the 4th 0, and does not need a trailing space, or <cr> ? oops, get_hex (of course) needs anything non-hex to exit, however ">" is swallowed inside INT as timing char.

    One detail worth checking is to send a large file of many repeating 19 char blocks
    > Prop_Chk 0 0 0 0 > Prop_Chk 0 0 0 0 > Prop_Chk 0 0 0 0 > Prop_Chk 0 0 0 0 > Prop_Chk 0 0 0 0 > Prop_Chk 0 0 0 0 > Prop_Chk 0 0 0 0 > Prop_Chk 0 0 0 0 > Prop_Chk 0 0 0 0 > Prop_Chk 0 0 0 0 > Prop_Chk 0 0 0 0
    

    That repeat is what MCU booters, and PC hosts, can repeat for the shortest possible reset-boot times.

    Test is to confirm P2 always autobauds and echoes ID(s), no matter what the reset exit phase is ?
    The echo ID is 15 bytes for 19 bytes in, and I see a largish 1ms allowed for turnaround, which is 11.52 char times at 115200,
    so that means once out of reset, P2 will echo one ID (15+12chT) for every two Prop_Chks (36chT) received.

    A small turnaround is useful, but that 1ms sounds high, for Baud speeds of 115200~2MBd ?

    If that is dropped to 1 Char time, the echo now always takes 15+1 chars, and gives a 3 char margin to catch the next _Chk - thus will echo on every _Chk, thanks to rx FIFO.
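
The character-time arithmetic in this post can be sketched in Python as follows. It assumes 8N1 framing (10 bits per character) and takes the command and echo strings quoted earlier in the thread; the function name `reply_char_times` is illustrative only.

```python
BITS_PER_CHAR = 10                    # assumption: 8N1 (start + 8 data + stop)
CMD = "> Prop_Chk 0 0 0 0 "           # 19 chars sent by the host
REPLY = "\r\nProp_Ver Au\r\n"         # 15 chars echoed by the P2


def reply_char_times(baud, turnaround_s):
    """Turnaround plus echo length, measured in character times at this baud."""
    char_time = BITS_PER_CHAR / baud
    return turnaround_s / char_time + len(REPLY)


for baud in (115_200, 2_000_000):
    fixed_1ms = reply_char_times(baud, 1e-3)
    one_char = reply_char_times(baud, BITS_PER_CHAR / baud)
    print(f"{baud} baud: command {len(CMD)} chT, "
          f"reply with 1 ms pause {fixed_1ms:.2f} chT, "
          f"reply with 1-char pause {one_char:.2f} chT")
```

At 115200 baud the 1 ms pause makes the reply about 26.5 character times, longer than the 19-character command. At 2 MBd the same 1 ms is 200 character times. A 1-character pause keeps the reply at 16 character times regardless of baud rate.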


    edit: corrected get_hex exit and TX:RX timing
  • Cluso99 Posts: 18,069
    It needs a trailing whitespace (space, cr etc)
  • jmg wrote:
    One detail worth checking is to send a large file of many repeating 18 char blocks
    Sending a repeating string of "> Prop_Chk 0 0 0 0 " responds to each command with "Prop_Ver F" as expected.


  • Prop_Chk <INAmask> <INAdata> <INBmask> <INBdata>
    A while back I hooked up 3 Nanos to a propplug and configured assorted pullups/pulldowns to set their IDs.
    I was able to load the Nanos separately and in parallel with ease.
    Works great! :)
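
A small Python sketch of how a host might build this command string. The helper name `prop_chk` is hypothetical, and formatting the four fields as hex is an assumption based on the get_hex mention earlier in the thread. How the mask/data pairs are compared against the pins is not spelled out here, so the sketch only formats the text.

```python
def prop_chk(ina_mask=0, ina_data=0, inb_mask=0, inb_data=0):
    """Build a "> Prop_Chk ..." command as ASCII bytes.

    The leading "> " is the autobaud sequence and the trailing space
    terminates the last field, as discussed in this thread.
    """
    fields = (ina_mask, ina_data, inb_mask, inb_data)
    text = "> Prop_Chk " + " ".join(f"{v:X}" for v in fields) + " "
    return text.encode("ascii")


print(prop_chk())   # the "select all Props" form used above
```

With all four fields left at zero this produces the same 19-character string pasted earlier in the thread.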
  • jmg Posts: 15,173
    edited 2018-05-01 20:45
    ozpropdev wrote: »
    jmg wrote:
    One detail worth checking is to send a large file of many repeating 18 char blocks
    Sending a repeating string of "> Prop_Chk 0 0 0 0 " responds to each command with "Prop_Ver F" as expected.

    hmm... Was that a packed/repeating string, no gaps ? - and every _Chk gave a _Ver ? ie you checked counts ?

    That's not quite what I'd expect from the 1ms TX pause, as that adds 11.52 chars(!) to the 15ch reply, and now the rx block IN(19) is smaller than the tx block OUT(11.52+15), and the Rx fifo overruns/wraps

    If there is instead a 1-char time TX pause, now the Tx block fits inside the Rx timing, and you can get a _Ver for every _Chk
    present 1ms turnaround (w..w = wait, varies with baud rate, so unpredictable behaviour) RxFIFO - 16 bytes
    RX:    > Prop_Chk 0 0 0 0 > Prop_Chk 0 0 0 0 > Prop_Chk 0 0 0 0 > Prop_Chk 0 0 0 0 
    FIFO:  -000000000000000000-123456789abcdefWWWWWWWWWW??
    TX:                       wwwwwwwwwwwclProp_Ver Acl                                wwwwwwwwwwwclProp_Ver Acl
    
    proposed 1 char turnaround (w=wait)
    RX:    > Prop_Chk 0 0 0 0 > Prop_Chk 0 0 0 0 > Prop_Chk 0 0 0 0 > Prop_Chk 0 0 0 0 
    FIFO:  -000000000000000000-123456789abcde0000-123456789abcde0000-123456789abcde000
    TX:                       wclProp_Ver Acl    wclProp_Ver Acl    wclProp_Ver Acl   wclProp_Ver Acl
    
    
  • jmg wrote: »
    hmm... Was that a packed/repeating string, no gaps ? - and every _Chk gave a _Ver ? ie you checked counts ?
    The test sent 10 x "> Prop_Chk 0 0 0 0 " @ 115200 with a 3 ms gap.
    10 x "Prop_Ver F" responses were returned.
    As the boot loader isn't a full duplex arrangement I wasn't expecting it to respond during the response message transmission.
    Sorry I was a bit vague on that detail.
