Prop2 FPGA files!!! - Updated 2 June 2018 - Final Version 32i - Page 160 — Parallax Forums


Comments

  • evanh Posts: 15,091
    cgracey wrote: »
    Evanh, here's the spoiler: the ASIC tools optimize the logic cell placement and routing only to meet the timing goal, so that signals with plenty of slack get routed around the hot spots, losing their slack, while the signals needing speed get the prime placement and shortest routes. In the end, hundreds of thousands of paths are stacked against the timing wall, forming a cliff, where the chip fails systemically if the clock period becomes too short. So, while in theory some things take less time than others, the implementation is a blob of nearly identically-timed paths that affords no possibility of speed-up via selective clock cycle shortening. When you hit the speed limit, everything fails at once.
    I know, I didn't present the gap as a speed-up feature; it was just indicative of where the propagations end. The propagations must complete ahead of the flops' setup timing requirements.
  • cgracey Posts: 14,131
    Roy Eltham wrote: »
    @cgracey aren't the new chips due soon?

    The new wafers came out of the fab on the 13th and are now being packaged into 10 glob tops. The rest of the dice will be packaged in the real Amkor ePad package, but they won't be shipped for several weeks.

    ON Semi has been promising delivery of the 10 glob tops on August 1, but they always under-promise and over-deliver, so I'm thinking we may have chips sometime next week. Man, I really hope the design is okay. If it works as planned, it's going to be great. We'll have a really nice chip.
  • Cluso99 Posts: 18,066
    Fingers doubly crossed. With all the work you've done Chip, it needs to be a rewarded success!

    I am off to the UK for 4 weeks, so I am taking my ES board with me to work on while there B)
  • cgracey Posts: 14,131
    Cluso99 wrote: »
    Fingers doubly crossed. With all the work you've done Chip, it needs to be a rewarded success!

    I am off to the UK for 4 weeks, so I am taking my ES board with me to work on while there B)

    We've all done a lot of work on this. This chip is a distillation of a lot of fantastic ideas that people have had over the last 13 years. We packed at least 15 pounds into the 5 pound bag.
  • Cluso99 Posts: 18,066
    cgracey wrote: »
    Cluso99 wrote: »
    Fingers doubly crossed. With all the work you've done Chip, it needs to be a rewarded success!

    I am off to the UK for 4 weeks, so I am taking my ES board with me to work on while there B)

    We've all done a lot of work on this. This chip is a distillation of a lot of fantastic ideas that people have had over the last 13 years. We packed at least 15 pounds into the 5 pound bag.

    More like 50 pounds into that 5 pound bag B)
    There is something for everyone in the P2!
  • Excellent news Chip!
    Validation with the 10 glob tops in a couple of weeks, then the proper packaged ones in late September or early October (on the new rev of the eval board) for us! Yay!
  • jmg Posts: 15,140
    cgracey wrote: »
    The new wafers came out of the fab on the 13th and are now being packaged into 10 glob tops. The rest of the dice will be packaged in the real Amkor ePad package, but they won't be shipped for several weeks.

    ON Semi has been promising delivery of the 10 glob tops on August 1, but they always under-promise and over-deliver, so I'm thinking we may have chips sometime next week....

    Have those 10 passed the test program you were working on? Has ON Semi indicated yields on this run yet?

  • cgracey Posts: 14,131
    jmg wrote: »
    cgracey wrote: »
    The new wafers came out of the fab on the 13th and are now being packaged into 10 glob tops. The rest of the dice will be packaged in the real Amkor ePad package, but they won't be shipped for several weeks.

    ON Semi has been promising delivery of the 10 glob tops on August 1, but they always under-promise and over-deliver, so I'm thinking we may have chips sometime next week....

    Have those 10 passed the test program you were working on? Has ON Semi indicated yields on this run yet?

    I'm not sure about the test program status for the new silicon. It would be all new files within ON Semi, since the logic was resynthesized.
  • Chip,
    He meant the PASM2 code that you wrote to test things. You indicated that it was for ON Semi to use for testing the chips.
  • cgracey Posts: 14,131
    Roy Eltham wrote: »
    Chip,
    He meant the PASM2 code that you wrote to test things. You indicated that it was for ON Semi to use for testing the chips.

    Our analog test program is the same, but they must make new digital test programs.
  • evanh Posts: 15,091
    Chip,
    This is cool! I finally understand QFRAC and why you've included it! I had the following code working; it seemed perfect. The purpose is to scale up the remainder of (clk_freq / asyn_baud) to make use of the full 16.6-bit format of the clock divider in smartpin asynchronous serial mode.
    		qdiv	##$ffff_ffff, asyn_baud
    


    But there was a small doubt I was missing something, so I went to ask you about it: whether it really was as perfect as it seemed, and whether you had any recommendations. As I was typing the question, it dawned on me that there was this other CORDIC divide I'd never understood ... QFRAC. So I tried it and, blow me down, it did the job even better because it didn't need the large constant.
    		qfrac	#1, asyn_baud
    
    Same answer. :)
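A quick way to see why the two instructions give the same answer: per the P2 docs, QFRAC produces the fractional quotient (x << 32) / y, while the QDIV form divides the constant $FFFF_FFFF by the baud rate. A minimal Python model (the function names and the sample baud value here are illustrative, not from the post):

```python
# Toy model of the two P2 CORDIC divides discussed above.
# Assumption (per the P2 docs): QDIV yields the 32-bit unsigned quotient
# x // y, and QFRAC yields the fractional quotient (x << 32) // y.

def qdiv(x, y):
    # QDIV quotient result (read back with GETQX), modeled as unsigned divide
    return (x // y) & 0xFFFF_FFFF

def qfrac(x, y):
    # QFRAC: treat x as a fraction of y, scaled up to 32 bits
    return ((x << 32) // y) & 0xFFFF_FFFF

asyn_baud = 115_200  # illustrative baud rate

a = qdiv(0xFFFF_FFFF, asyn_baud)  # qdiv  ##$ffff_ffff, asyn_baud
b = qfrac(1, asyn_baud)           # qfrac #1, asyn_baud
print(a, b)  # 37282 37282
```

The two agree whenever asyn_baud does not divide 2^32 exactly; in the rare case that it does, the QDIV form comes out one count low, which is another reason the QFRAC form is the cleaner choice.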
  • evanh Posts: 15,091
    edited 2019-09-22 17:24
    Chip,
    Is there a way to release the FIFO back to hubexec after a RD/WRFAST? Grr, silly question, it's not what I wanted anyway.

    I've been testing the timing of hubRAM accesses vs non-hubRAM hub-ops like COGID and cordic commands. I've found that issuing WRFAST seems to produce peculiar responses and I'm wondering, to get something sensible, if I need to reset the FIFO ops back to idle in some fashion.

    I've tried issuing a RDFAST in between each WRFAST test but that's not particularly effective. The numbers shuffle around a little but are still just as weird.

    Here's the table of results. The values in the middle are the execution duration of the second instruction, in sysclocks. The column labels are the hubRAM addresses of the first instruction's data access, and the right-hand column is the address that produced the shortest execution.
                         0   28   24   20   16   12    8    4
    --------------------------------------------------------------------
     RDLONG  QMUL        9    2    3    4    5    6    7    8   28
     WRLONG  QMUL        7    8    9    2    3    4    5    6   20
     RDFAST  QMUL        9    2    3    4    5    6    7    8   28
     WRFAST  QMUL        4    7    3    7    3    3    7    3   24
     RDLONG  COGID      11    4    5    6    7    8    9   10   28
     WRLONG  COGID       9   10   11    4    5    6    7    8   20
     RDFAST  COGID      11    4    5    6    7    8    9   10   28
     WRFAST  COGID      11    9    5    9    5    5    9    5   24
     RDLONG  COGID WC   11    4    5    6    7    8    9   10   28
     WRLONG  COGID WC    9   10   11    4    5    6    7    8   20
     RDFAST  COGID WC   11    4    5    6    7    8    9   10   28
     WRFAST  COGID WC    7    9    5    9    5    5    9    5   24
     RDLONG  LOCKRET     9    2    3    4    5    6    7    8   28
     WRLONG  LOCKRET     7    8    9    2    3    4    5    6   20
     RDFAST  LOCKRET     9    2    3    4    5    6    7    8   28
     WRFAST  LOCKRET     2    7    3    7    7    7    7    3    0
    

    Ignoring the WRFASTs, you can see a regular pattern in each row: the execution duration increases by one with each column of the table.

    WRFAST doesn't even slightly follow that pattern. Any idea why?
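For what it's worth, the staircase in the non-WRFAST rows is what a rotating "egg beater" hub would produce: consecutive longs live in consecutive slices ((addr >> 2) & 7), and a given slice services a cog once every 8 sysclocks. Here is a toy Python model fitted to the RDLONG/QMUL row; the base and phase constants are fitted to the table above, not derived from the silicon:

```python
# Fitted toy model of the egg-beater hub rotation behind the staircase
# in the non-WRFAST rows: consecutive longs sit in consecutive slices,
# and each slice comes around once every 8 sysclocks.
# `base` and `phase` are fitted constants, not taken from the docs.

def second_op_duration(addr, base=2, phase=0):
    hub_slice = (addr >> 2) & 7              # which of the 8 slices holds this long
    return ((phase - hub_slice - 1) % 8) + base

row = [second_op_duration(a) for a in (0, 28, 24, 20, 16, 12, 8, 4)]
print(row)  # [9, 2, 3, 4, 5, 6, 7, 8] -- matches the RDLONG/QMUL row
```

The COGID rows are the same staircase shifted up by 2 (i.e. base=4), consistent with the "+2 if result" note discussed further down; only the WRFAST rows break the pattern.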

    Here's the critical measuring code
    		...
    inst1		nop
    		getct	tickstart		'measure time
    inst2		nop
    		getct	pa			'measure time
    		rdfast	#0, #0
    		...
    

    And the instruction tables that fill those two NOPs
    hubram_tab
    		rdlong	inb, phase
    		byte	13,10," RDLONG ",0
    		wrlong	inb, phase
    		byte	13,10," WRLONG ",0
    		rdfast	#0, phase
    		byte	13,10," RDFAST ",0
    		wrfast	#0, phase
    		byte	13,10," WRFAST ",0
    
    hubop_tab
    		qmul	tickstart, #37
    		byte	" QMUL    ",0
    		cogid	inb
    		byte	" COGID   ",0
    		cogid	#15	wc
    		byte	" COGID WC",0
    		lockret	#0
    		byte	" LOCKRET ",0
    
  • evanh Posts: 15,091
    Huh, it comes right when I do a three-instruction variant of the above. "inst4" and "inst5" are hubram and hubop respectively.
    		...
    inst3		wrlong	inb, phase2
    		getct	tickstart		'measure time
    inst4		nop
    inst5		nop
    		getct	pa			'measure time
    		rdfast	#0, #0
    		...
    
  • evanh Posts: 15,091
    Ah, got it! WRFAST isn't the controlling factor because it always takes 2 clocks unless it is blocked by a prior WRFAST flushing. Lol, that took a while to sink in. I guess it is 6:00 AM now, time to hit the sack.
  • cgracey Posts: 14,131
    edited 2019-09-22 19:32
    evanh wrote: »
    Ah, got it! WRFAST isn't the controlling factor because it always takes 2 clocks unless it is blocked by a prior WRFAST flushing. Lol, that took a while to sink in. I guess it is 6:00 AM now, time to hit the sack.

    I had forgotten that WRFAST takes only 2 clocks.
  • cgracey Posts: 14,131
    WRFAST will take more than two clocks if a prior WRFAST has not finished and queued data still needs to be written.
  • evanh Posts: 15,091
    edited 2019-09-23 02:10
    EDIT: Err, it's 3 clocks for WRFAST normally. I'd guessed 2.

    Okay, those results indicate a discrepancy with the docs:
    The COGID entry in the execution-time tables says "2...9, +2 if result". I'm always seeing a minimum of 4 (2+2), with no apparent way to get down to 2 clocks.

  • cgracey Posts: 14,131
    evanh wrote: »
    EDIT: Err, it's 3 clocks for WRFAST normally. I'd guessed 2.

    Okay, those results indicate a discrepancy with the docs:
    The COGID entry in the execution-time tables says "2...9, +2 if result". I'm always seeing a minimum of 4 (2+2), with no apparent way to get down to 2 clocks.

    Yes, COGID must always have a result. I will update the sheets.
  • evanh Posts: 15,091
    Bumped into a couple of out-of-date names in the doc:

    The section on branch addressing talks about the JMPREL instruction but the listed encoding line right below labels it as JMP only:
    EEEE 1101011 00L DDDDDDDDD 000110000 JMP {#}D

    The section on interrupts has this line with SCLU and SCL instead of SCA and SCAS respectively:
    ALTxx / CRCNIB / SCLU / SCL / GETXACC / SETQ / SETQ2 / XORO32 / XBYTE must not be executing
  • cgracey Posts: 14,131
    evanh wrote: »
    Bumped into a couple of out-of-date names in the doc:

    The section on branch addressing talks about the JMPREL instruction but the listed encoding line right below labels it as JMP only:
    EEEE 1101011 00L DDDDDDDDD 000110000 JMP {#}D

    The section on interrupts has this line with SCLU and SCL instead of SCA and SCAS respectively:
    ALTxx / CRCNIB / SCLU / SCL / GETXACC / SETQ / SETQ2 / XORO32 / XBYTE must not be executing

    Thanks, Evanh. I will get those cleaned up.
  • cgracey Posts: 14,131
    edited 2019-11-20 10:10
    This thread can be un-stuck. There is a newer thread somewhere for the current-silicon FPGA files.
  • Unstuck.

    The latest (and archived) FPGA files, along with many other resources including IDEs and sample code, can be found here: https://propeller.parallax.com/