Prop2 FPGA files!!! - Updated 2 June 2018 - Final Version 32i


Comments

  • cgracey wrote: »
    Evanh, here's the spoiler: the ASIC tools optimize the logic cell placement and routing only to meet the timing goal, so that signals with plenty of slack get routed around the hot spots, losing their slack, while the signals needing speed get the prime placement and shortest routes. In the end, hundreds of thousands of paths are stacked against the timing wall, forming a cliff, where the chip fails systemically if the clock period becomes too short. So, while in theory some things take less time than others, the implementation is a blob of nearly identically-timed paths that affords no possibility of speed-up via selective clock cycle shortening. When you hit the speed limit, everything fails at once.
    I know, I didn't show the gap as a speed-up feature. It was just indicative of where the propagations end is all. The propagations must complete ahead of flop setup timing requirements.
    "We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
    By doing that, we can more accurately measure their mass, and determine whether
    scientists have systematically been underestimating how much matter they contain."
  • Roy Eltham wrote: »
    @cgracey aren't the new chips due soon?

    The new wafers came out of the fab on the 13th and are now being packaged into 10 glob tops. The rest of the dice will be packaged in the real Amkor ePad package, but they won't be shipped for several weeks.

    ON Semi has been promising delivery of the 10 glob tops on August 1, but they always under-promise and over-deliver, so I'm thinking we may have chips sometime next week. Man, I really hope the design is okay. If it works as planned, it's going to be great. We'll have a really nice chip.
  • Fingers doubly crossed. With all the work you've done Chip, it needs to be a rewarded success!

    I am off to the UK for 4 weeks, so I am taking my ES board with me to work on while there B)
    My Prop boards: P8XBlade2, RamBlade, CpuBlade, TriBlade
    Prop OS (also see Sphinx, PropDos, PropCmd, Spinix)
    Website: www.clusos.com
    Prop Tools (Index) , Emulators (Index) , ZiCog (Z80)
  • Cluso99 wrote: »
    Fingers doubly crossed. With all the work you've done Chip, it needs to be a rewarded success!

    I am off to the UK for 4 weeks, so I am taking my ES board with me to work on while there B)

    We've all done a lot of work on this. This chip is a distillation of a lot of fantastic ideas that people have had over the last 13 years. We packed at least 15 pounds into the 5 pound bag.
  • cgracey wrote: »
    Cluso99 wrote: »
    Fingers doubly crossed. With all the work you've done Chip, it needs to be a rewarded success!

    I am off to the UK for 4 weeks, so I am taking my ES board with me to work on while there B)

    We've all done a lot of work on this. This chip is a distillation of a lot of fantastic ideas that people have had over the last 13 years. We packed at least 15 pounds into the 5 pound bag.

    More like 50 pounds into that 5 pound bag B)
    There is something for everyone in the P2!
  • Excellent news Chip!
    Validation with the 10 glob tops in a couple of weeks, then the proper packaged ones in late September or early October (on the new rev of the eval board) for us! Yay!
  • jmg Posts: 13,928
    cgracey wrote: »
    The new wafers came out of the fab on the 13th and are now being packaged into 10 glob tops. The rest of the dice will be packaged in the real Amkor ePad package, but they won't be shipped for several weeks.

    ON Semi has been promising delivery of the 10 glob tops on August 1, but they always under-promise and over-deliver, so I'm thinking we may have chips sometime next week....

    Have those 10 passed the test program you were working on? Has ON Semi indicated yields on this run yet?

  • jmg wrote: »
    cgracey wrote: »
    The new wafers came out of the fab on the 13th and are now being packaged into 10 glob tops. The rest of the dice will be packaged in the real Amkor ePad package, but they won't be shipped for several weeks.

    ON Semi has been promising delivery of the 10 glob tops on August 1, but they always under-promise and over-deliver, so I'm thinking we may have chips sometime next week....

    Have those 10 passed the test program you were working on? Has ON Semi indicated yields on this run yet?

    I'm not sure about the test program status for the new silicon. It would be all new files within ON Semi, since the logic was resynthesized.
  • Chip,
    He meant the PASM2 code that you wrote to test things. You indicated that it was for ON Semi to use for testing the chips.
  • Roy Eltham wrote: »
    Chip,
    He meant the PASM2 code that you wrote to test things. You indicated that it was for ON Semi to use for testing the chips.

    Our analog test program is the same, but they must make new digital test programs.
  • Chip,
    This is cool! I finally understand QFRAC and why you've included it! I had the following code working, and it seemed perfect. The purpose is to scale up the remainder of (clk_freq / asyn_baud) so as to make use of the full 16.6-bit format for the clock divider in smartpin asynchronous serial mode.
    		qdiv	##$ffff_ffff, asyn_baud
    


    But there was a small doubt I was missing something. I went to ask you about it, whether it really was as perfect as it seemed and whether you had any recommendations. As I was typing the question, it dawned on me there was this other cordic divide I'd never understood ... QFRAC, so I tried it and blow me down it did the job even better because it didn't need the large constant.
    		qfrac	#1, asyn_baud
    
    Same answer. :)
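    The equivalence of the two instructions can be checked with plain integer arithmetic. The sketch below (in Python, with assumed clock and baud values) mirrors the steps described above: QFRAC #1, asyn_baud yields (1 << 32) // asyn_baud, while QDIV ##$ffff_ffff, asyn_baud yields (2**32 - 1) // asyn_baud, which is the same value or at most 1 LSB less.

    ```python
    # Sketch (assumed values): the arithmetic behind building a 16.6
    # fixed-point bit period for a smartpin async-serial clock divider.
    clk_freq = 160_000_000   # assumed sysclock, Hz
    asyn_baud = 115_200      # assumed baud rate

    # QDIV clk_freq, asyn_baud -> integer clocks per bit plus a remainder
    q, r = divmod(clk_freq, asyn_baud)

    # QFRAC #1, asyn_baud computes (1 << 32) // asyn_baud:
    # the reciprocal of the baud rate as a 0.32 fraction.
    recip = (1 << 32) // asyn_baud

    # QDIV ##$ffff_ffff, asyn_baud gives (2**32 - 1) // asyn_baud instead,
    # the same value or 1 LSB below it -- no large constant needed.
    recip2 = 0xFFFF_FFFF // asyn_baud

    # Scale the remainder by the reciprocal to express the fractional clocks
    # as a 0.32 fraction, then keep the top 6 bits for the 16.6 format.
    frac32 = r * recip              # approximately (r << 32) // asyn_baud
    bitperiod = (q << 6) | (frac32 >> 26)
    ```

    For these assumed values, both reciprocal forms agree exactly, and bitperiod packs 1388 integer clocks with 56/64ths of a clock in the low 6 bits.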
    "We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
    By doing that, we can more accurately measure their mass, and determine whether
    scientists have systematically been underestimating how much matter they contain."
  • evanh Posts: 7,922
    edited 2019-09-22 - 17:24:22
    Chip,
    Is there a way to release the FIFO back to hubexec after a RD/WRFAST? Grr, silly question, it's not what I wanted anyway.

    I've been testing the timing of hubRAM accesses vs non-hubRAM hub-ops like COGID and cordic commands. I've found that issuing WRFAST seems to produce peculiar responses and I'm wondering, to get something sensible, if I need to reset the FIFO ops back to idle in some fashion.

    I've tried issuing a RDFAST in between each WRFAST test but that's not particularly effective. The numbers shuffle around a little but are still just as weird.

    Here's the table of results. The values in the middle are the execution duration of the second instruction, in sysclocks. The X-axis labels are hubRAM addresses of the first instruction's data access, and the right column is the address that produced the shortest execution.
                         0   28   24   20   16   12    8    4
    --------------------------------------------------------------------
     RDLONG  QMUL        9    2    3    4    5    6    7    8   28
     WRLONG  QMUL        7    8    9    2    3    4    5    6   20
     RDFAST  QMUL        9    2    3    4    5    6    7    8   28
     WRFAST  QMUL        4    7    3    7    3    3    7    3   24
     RDLONG  COGID      11    4    5    6    7    8    9   10   28
     WRLONG  COGID       9   10   11    4    5    6    7    8   20
     RDFAST  COGID      11    4    5    6    7    8    9   10   28
     WRFAST  COGID      11    9    5    9    5    5    9    5   24
     RDLONG  COGID WC   11    4    5    6    7    8    9   10   28
     WRLONG  COGID WC    9   10   11    4    5    6    7    8   20
     RDFAST  COGID WC   11    4    5    6    7    8    9   10   28
     WRFAST  COGID WC    7    9    5    9    5    5    9    5   24
     RDLONG  LOCKRET     9    2    3    4    5    6    7    8   28
     WRLONG  LOCKRET     7    8    9    2    3    4    5    6   20
     RDFAST  LOCKRET     9    2    3    4    5    6    7    8   28
     WRFAST  LOCKRET     2    7    3    7    7    7    7    3    0
    

    Ignoring the WRFASTs, you can see a regular sequence in each line: the execution duration increases by one for each column in the table.

    WRFAST doesn't even slightly follow that pattern. Any idea why?
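    The regular rows can be reproduced with a simple rotating-slot model. This is only a sketch fitted to the table above, not a claim about the silicon: it assumes each 4-byte step away from the best-case address (the right column) adds one clock of waiting for the hub slice, wrapping modulo 8 longs.

    ```python
    # Sketch: model of the regular (non-WRFAST) rows in the table above.
    # Assumption: duration = base clocks + clocks spent waiting for the
    # rotating hub slice serving the accessed long address.
    def duration(addr, best_addr, base=2):
        # best_addr is the column that gave the minimum (right column);
        # each 4-byte step away adds one clock, wrapping modulo 8 longs.
        return base + (((best_addr - addr) // 4) % 8)

    cols = (0, 28, 24, 20, 16, 12, 8, 4)

    # Reproduce the RDLONG QMUL row (best address 28, base 2):
    row = [duration(a, 28) for a in cols]
    # row == [9, 2, 3, 4, 5, 6, 7, 8]

    # Reproduce the WRLONG COGID row (best address 20, base 4):
    row2 = [duration(a, 20, base=4) for a in cols]
    # row2 == [9, 10, 11, 4, 5, 6, 7, 8]
    ```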

    Here's the critical measuring code
    		...
    inst1		nop
    		getct	tickstart		'measure time
    inst2		nop
    		getct	pa			'measure time
    		rdfast	#0, #0
    		...
    

    And the instruction tables that fill those two NOPs
    hubram_tab
    		rdlong	inb, phase
    		byte	13,10," RDLONG ",0
    		wrlong	inb, phase
    		byte	13,10," WRLONG ",0
    		rdfast	#0, phase
    		byte	13,10," RDFAST ",0
    		wrfast	#0, phase
    		byte	13,10," WRFAST ",0
    
    hubop_tab
    		qmul	tickstart, #37
    		byte	" QMUL    ",0
    		cogid	inb
    		byte	" COGID   ",0
    		cogid	#15	wc
    		byte	" COGID WC",0
    		lockret	#0
    		byte	" LOCKRET ",0
    
    "We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
    By doing that, we can more accurately measure their mass, and determine whether
    scientists have systematically been underestimating how much matter they contain."
  • Huh, it comes right when I do a three instruction variant of the above. "inst4" and "inst5" are hubram and hubop respectively.
    		...
    inst3		wrlong	inb, phase2
    		getct	tickstart		'measure time
    inst4		nop
    inst5		nop
    		getct	pa			'measure time
    		rdfast	#0, #0
    		...
    
    "We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
    By doing that, we can more accurately measure their mass, and determine whether
    scientists have systematically been underestimating how much matter they contain."
  • Ah, got it! WRFAST isn't the controlling factor because it always takes 2 clocks unless it is blocked by a prior WRFAST flushing. Lol, that took a while to sink in. I guess it is 6:00 AM now, time to hit the sack.
    "We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
    By doing that, we can more accurately measure their mass, and determine whether
    scientists have systematically been underestimating how much matter they contain."
  • cgracey Posts: 11,711
    edited 2019-09-22 - 19:32:02
    evanh wrote: »
    Ah, got it! WRFAST isn't the controlling factor because it always takes 2 clocks unless it is blocked by a prior WRFAST flushing. Lol, that took a while to sink in. I guess it is 6:00 AM now, time to hit the sack.

    I had forgotten that WRFAST takes only 2 clocks.
  • WRFAST will take more than two clocks if a prior WRFAST has not finished and queued data still needs to be written.
  • evanh Posts: 7,922
    edited 2019-09-23 - 02:10:10
    EDIT: Err, it's 3 clocks for WRFAST normally. I'd guessed 2.

    Okay, those results indicate a discrepancy with the docs:
    The COGID execution time is listed as "2...9, +2 if result", but I'm always seeing a minimum of 4 (2+2). There is no apparent way to get down to 2 clocks.

    "We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
    By doing that, we can more accurately measure their mass, and determine whether
    scientists have systematically been underestimating how much matter they contain."
  • evanh wrote: »
    EDIT: Err, it's 3 clocks for WRFAST normally. I'd guessed 2.

    Okay, those results indicate a discrepancy with the docs:
    The COGID execution time is listed as "2...9, +2 if result", but I'm always seeing a minimum of 4 (2+2). There is no apparent way to get down to 2 clocks.

    Yes, COGID must always have a result. I will update the sheets.
  • Bumped into a couple of out-of-date names in the doc:

    The section on branch addressing talks about the JMPREL instruction but the listed encoding line right below labels it as JMP only:
    EEEE 1101011 00L DDDDDDDDD 000110000 JMP {#}D

    The section on interrupts has this line with SCLU and SCL instead of SCA and SCAS respectively:
    ALTxx / CRCNIB / SCLU / SCL / GETXACC / SETQ / SETQ2 / XORO32 / XBYTE must not be executing
    "We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
    By doing that, we can more accurately measure their mass, and determine whether
    scientists have systematically been underestimating how much matter they contain."
  • evanh wrote: »
    Bumped into a couple of out-of-date names in the doc:

    The section on branch addressing talks about the JMPREL instruction but the listed encoding line right below labels it as JMP only:
    EEEE 1101011 00L DDDDDDDDD 000110000 JMP {#}D

    The section on interrupts has this line with SCLU and SCL instead of SCA and SCAS respectively:
    ALTxx / CRCNIB / SCLU / SCL / GETXACC / SETQ / SETQ2 / XORO32 / XBYTE must not be executing

    Thanks, Evanh. I will get those cleaned up.