New FPGA files for next silicon version - 5th/final release - contains new ROM!! - Page 2 — Parallax Forums

New FPGA files for next silicon version - 5th/final release - contains new ROM!!

Comments

  • Cluso99 Posts: 18,069
    cgracey wrote: »
    evanh wrote: »
    Chip,
    Have you set the dual-port SRAM parameter, READ_DURING_WRITE_MODE_MIXED_PORTS? See https://forums.parallax.com/discussion/comment/1462814/#Comment_1462814

    I forgot!

    I just talked to Wendy at ON Semi about this, though, and she is looking into what we must do to ensure that random data is not returned on a READ during a simultaneous write to the same location from the other port. She is going to call me back soon about this. If it's doable, I'll update the FPGA images, accordingly.

    Thanks for bringing this up!!!

    If there is a problem, having now seen the speed of the P2, I would be happy to forgo the dual porting if needs be.
  • Cluso99 Posts: 18,069
    If there is a bugfix, is there any possibility of a P1-style JMPRET (or a partial one)?

    The missing part is the CALL D,#/S where we want to write the 9-bit return address without the C & Z bits, so that it can be placed into a JMP absolute instruction. A 20-bit return address would work, but it's the C & Z bits that destroy the return instruction.
  • evanh wrote: »
    Well, the problem with event branching, Jxx, instructions within a REP block counts as a design bug. I never saw any fix mentioned for those. See https://forums.parallax.com/discussion/comment/1459273/#Comment_1459273
    See some extra notes here
    https://forums.parallax.com/discussion/169438/rep-blocks-and-branching-issue

  • evanh Posts: 15,916
    Thanks Brian, I'd lost that one. I'll add it to the traps post ...
  • jmg Posts: 15,173
    Cluso99 wrote: »
    If there is a problem, having now seen the speed of the P2, I would be happy to forgo the dual porting if needs be.

    Why forgo dual porting?
    The issue is one of same-clock access, so a SW work-around exists: keep reading until the same answer occurs twice, if your design means this can occur.
    I think this could be fixed in HW, as the apertures of registers are very small in P2.
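
    The read-until-stable work-around described above can be sketched as follows. This is only a minimal Python model, not P2 code: `read_stable`, `max_tries` and the sample read sequence are hypothetical names and values chosen for illustration, and the sketch assumes a corrupted value is never returned twice in a row.

```python
from typing import Callable


def read_stable(read_word: Callable[[], int], max_tries: int = 8) -> int:
    """Re-read until two consecutive reads agree, then return that value."""
    previous = read_word()
    for _ in range(max_tries):
        current = read_word()
        if current == previous:
            return current
        previous = current
    raise RuntimeError("value did not settle within max_tries reads")


# Simulated shared location (illustrative values only): one corrupted read
# while the other port is writing, followed by the settled new value.
samples = iter([0x09009DFF, 0x00000000, 0x00000000])
value = read_stable(lambda: next(samples))
print(f"{value:08X}")
```

    The cost is at least one extra read per access, which is why a hardware fix is preferable if one is available.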
  • Publison wrote: »
    What are the minimum / maximum Quartus versions for the 1-2-3 A9? I think 15.0 was safe.

    If you want to flash this FPGA image to a P123-A9 board you can't use Quartus.
    That board uses a custom Parallax loader (PX.EXE).


    Set switch to "PGM"
    px Prop123_A9_Prop2_v33g.rbf /p /4
    
    Set switch to "RUN" when complete.
    :)

  • Publison Posts: 12,366
    edited 2019-01-31 14:33
    ozpropdev wrote: »
    Publison wrote: »
    What are the minimum / maximum Quartus versions for the 1-2-3 A9? I think 15.0 was safe.

    If you want to flash this FPGA image to a P123-A9 board you can't use Quartus.
    That board uses a custom Parallax loader (PX.EXE).


    Set switch to "PGM"
    px Prop123_A9_Prop2_v33g.rbf /p /4
    
    Set switch to "RUN" when complete.
    :)

    Thanks Brian. I'm new to the FPGA world. Just got the last 1-2-3 A9.

    @cgracey

    Loaded v33g.rbf. Blinkey works fine. PNut only reports 4 COGS. I was under the assumption the new A9 version was 8 COGs.

    EDIT: Same response Pnut v32j. Eight green LEDs are blinking.
  • cgracey Posts: 14,155
    Publison wrote: »
    ozpropdev wrote: »
    Publison wrote: »
    What are the minimum / maximum Quartus versions for the 1-2-3 A9? I think 15.0 was safe.

    If you want to flash this FPGA image to a P123-A9 board you can't use Quartus.
    That board uses a custom Parallax loader (PX.EXE).


    Set switch to "PGM"
    px Prop123_A9_Prop2_v33g.rbf /p /4
    
    Set switch to "RUN" when complete.
    :)

    Thanks Brian. I'm new to the FPGA world. Just got the last 1-2-3 A9.

    @cgracey

    Loaded v33g.rbf. Blinkey works fine. PNut only reports 4 COGS. I was under the assumption the new A9 version was 8 COGs.

    EDIT: Same response Pnut v32j. Eight green LEDs are blinking.

    My mistake. I was assuming I had left off with version "A" in the ROM file, which would have indicated 8 cogs. As long as the memory is understood to be 512KB, though, there should be no problem.
  • Thanks Chip.
  • cgracey Posts: 14,155
    In talking to Wendy some more, it was apparent that our needs for not glitching LUT reads during LUT writes were beyond what she could address via clock inversion and timing constraints. So, I made some Verilog changes to detect these r/w conflicts and pass the write data to the read port of the otherwise-victim.

    I was able to produce the glitch condition on the current FPGA image, so it enabled me to write a work-around which has verified okay and Wendy now has the latest Verilog code.

    I also slipped in the C=C change for 'GETCT reg WC'.

    I even tried to replicate the JMP-event within a REP block, but couldn't get this case to fail. I need to find the link someone left about the cases which fail.

    This program works okay, though:
    dat		org
    
    		getct	t
    .loop		addct1	t,#100
    
    		rep	@.r,#0		'infinite REP block
    		jct1	#.out		'JCT1 happens every 100 clocks
    		drvnot	#0
    .r
    		drvnot	#1		'never gets here
    		jmp	#.loop
    
    .out		drvnot	#2		'gets here every 100 clocks
    		jmp	#.loop
    
    
    t		res	1
    
  • cgracey Posts: 14,155
    I'm recompiling some FPGA images.

    I'm hoping some of you will be able to verify that the LUT-sharing bug is now gone. This fix is good for the streamer, too, as it allows live updating of palette and DDS data without glitching.
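
    The fix described above (detect a read/write conflict on the same location and pass the write data to the read port) can be modelled roughly as follows. This is a behavioural Python sketch, not the actual Verilog: the `SharedLut` class, its `clock` method and the one-call-per-clock timing are assumptions made for illustration, and the 512-entry size is simply taken to match the P2 LUT.

```python
from typing import List, Optional


class SharedLut:
    """Toy model of a dual-port LUT RAM with a read-during-write bypass."""

    def __init__(self, size: int = 512) -> None:
        self.mem: List[int] = [0] * size

    def clock(self, read_addr: int,
              write_addr: Optional[int] = None,
              write_data: Optional[int] = None) -> int:
        """Model one clock: an optional write on one port, a read on the other.

        Returns the word seen by the reading port.
        """
        conflict = write_addr is not None and write_addr == read_addr
        # On a same-address conflict, forward the write data to the reader
        # instead of trusting the RAM's possibly undefined read output.
        result = write_data if conflict else self.mem[read_addr]
        if write_addr is not None:
            self.mem[write_addr] = write_data
        return result


lut = SharedLut()
lut.mem[5] = 0xFFFFFFFF
during_write = lut.clock(read_addr=5, write_addr=5, write_data=0x00000000)
after_write = lut.clock(read_addr=5)
print(f"{during_write:08X} {after_write:08X}")
```

    With the bypass in place, the reader sees either the old or the new word, never a mixture of the two.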
  • cgracey wrote: »
    I also slipped in the C=C change for 'GETCT reg WC'.

    I really like this new C=C, don't know why exactly. Just out of interest, was it easier to (a) actually write C to C, or (b) disable write to C for GETCT?

    This could be a path to extra functionality for certain instructions in the future, by using an otherwise redundant opcode bit without any side-effects.
  • cgracey Posts: 14,155
    TonyB_ wrote: »
    cgracey wrote: »
    I also slipped in the C=C change for 'GETCT reg WC'.

    I really like this new C=C, don't know why exactly. Just out of interest, was it easier to (a) actually write C to C, or (b) disable write to C for GETCT?

    This could be a path to extra functionality for certain instructions in the future, by using an otherwise redundant opcode bit without any side-effects.

    It was much easier to just copy C, than to make the instruction not write C.
  • evanh Posts: 15,916
    edited 2019-01-31 22:07
    cgracey wrote: »
    In talking to Wendy some more, it was apparent that our needs for not glitching LUT reads during LUT writes were beyond what she could address via clock inversion and timing constraints. So, I made some Verilog changes to detect these r/w conflicts and pass the write data to the read port of the otherwise-victim.
    Good stuff! It explains why the default is not OLD_DATA.

    I even tried to replicate the JMP-event within a REP block, but couldn't get this case to fail. I need to find the link someone left about the cases which fail.

    This program works okay, though:
    So it does! I'm blown away. More test cases to come I guess ... EDIT: Yay, I see Chip has found a reason for it.
  • cgracey Posts: 14,155
    Does anyone remember any other bugs that haven't been fixed yet?
  • evanh Posts: 15,916
    Other things I thought I found were just my own mistakes.

    There was an idea or two I had but they weren't of much significance, or too big.
  • evanh Posts: 15,916
    edited 2019-01-31 23:25
    evanh wrote: »
    There was an idea or two I had but they weren't of much significance, or too big.
    One idea that would be nice to have is changing the XORO32 and SCA results to feed the next instruction's D input instead of its S input.
  • cgracey Posts: 14,155
    evanh wrote: »
    evanh wrote: »
    There was an idea or two I had but they weren't of much significance, or too big.
    One idea that would be nice to have is changing the XORO32 and SCA results to feed the next instruction's D input instead of its S input.

    Yes. I looked into this. It's doable, but I wasn't convinced of its benefit. Could you please refresh me on this? A link would do. Thanks, Evanh.
  • evanh Posts: 15,916
    edited 2019-02-01 00:01
    The main benefit of this change for these instruction pairings is that together they become like a 3-operand arrangement, because the D field of the second instruction is still valid for its result address.

    PS: Which is also why the idea of generalising it for the prop3 came up.
  • Hey Chip
    Sounds like you're almost there with the Verilog.
    How many days before the component values (ADC cap etc.) get tweaked?
  • cgracey Posts: 14,155
    evanh wrote: »

    Thanks, Evanh. I looked all that over. I also looked at the Verilog code. I don't feel like this would be worth doing, at this point. Thanks for bringing it up, again, though.
  • cgracey Posts: 14,155
    Tubular wrote: »
    Hey Chip
    Sounds like you're almost there with the Verilog.
    How many days before the component values (ADC cap etc.) get tweaked?

    That has to happen soon, maybe within the next week.
  • evanh Posts: 15,916
    edited 2019-02-05 01:17
    I never got round to testing OUT to IN speed on the real chip. That was something I wasn't happy with in the FPGA. Forgot about it till now.

    The pin drivers and input buffers on the FPGA are only a few nanoseconds combined, but there were very long lags of maybe 30 ns of asynchronous transition coming back to IN from the prior OUT, in addition to the clocked stages.

    EDIT: I've found the prior effort - https://forums.parallax.com/discussion/comment/1439248/#Comment_1439248
    and https://forums.parallax.com/discussion/comment/1430499/#Comment_1430499
    and where it all started from: http://forums.parallax.com/discussion/comment/1426018/#Comment_1426018
  • jmg Posts: 15,173
    evanh wrote: »
    I never got round to testing OUT to IN speed on the real chip. That was something I wasn't happy with in the FPGA. Forgot about it till now.

    The pin drivers and input buffers on the FPGA are only a few nanoseconds combined, but there were very long lags of maybe 30 ns of asynchronous transition coming back to IN from the prior OUT, in addition to the clocked stages.

    The other area of pin-to-core delay that needs to be checked is the Xtal buffer to PFD detector path.
    Ideally, that non-clocked path should be matched with an equal-gate-delay path in the counter feedback (so they track), to avoid the PFD moving with temperature across a SysCLK threshold.
    That mechanism may explain the observed temperature 'hot zones' for jitter issues.

    On P2, as you mention, I have seen similar ten+ ns movement in Xtal to SysCLK pin vs temperature sweeps.

    If the PFD paths are matched, that also means external clocks will (mostly) keep pin-relative placement, and that will be important for applications that clock the P2 from a master clock and expect P2 pins to keep exact relative time.
  • cgracey Posts: 14,155
    edited 2019-02-01 04:29
    I just posted a new set of FPGA files at the top of the thread - v33i. Please try them out.

    If you can, please verify that the LUT-sharing bug is fixed, as well as the JMP-event-within-REP bug.

    Thanks.
  • Chip
    v33i LUT sharing tests OK here.
    Offset  Original New
    +0      FFFFFFFF FFFFFFFF
    +1      FFFFFFFF FFFFFFFF
    +2      FFFFFFFF 00000000
    +3      FFFFFFFF 00000000
    +4      FFFFFFFF 00000000
    
    Offset  Original New
    +0      00000000 00000000
    +1      00000000 00000000
    +2      00000000 FFFFFFFF
    +3      00000000 FFFFFFFF
    +4      00000000 FFFFFFFF
    
    Offset  Original New
    +0      55555555 55555555
    +1      55555555 55555555
    +2      55555555 AAAAAAAA
    +3      55555555 AAAAAAAA
    +4      55555555 AAAAAAAA
    
    Offset  Original New
    +0      AAAAAAAA AAAAAAAA
    +1      AAAAAAAA AAAAAAAA
    +2      AAAAAAAA 55555555
    +3      AAAAAAAA 55555555
    +4      AAAAAAAA 55555555
    
    

  • cgracey Posts: 14,155
    ozpropdev wrote: »
    Chip
    v33i LUT sharing tests OK here.
    Offset  Original New
    +0      FFFFFFFF FFFFFFFF
    +1      FFFFFFFF FFFFFFFF
    +2      FFFFFFFF 00000000
    +3      FFFFFFFF 00000000
    +4      FFFFFFFF 00000000
    
    Offset  Original New
    +0      00000000 00000000
    +1      00000000 00000000
    +2      00000000 FFFFFFFF
    +3      00000000 FFFFFFFF
    +4      00000000 FFFFFFFF
    
    Offset  Original New
    +0      55555555 55555555
    +1      55555555 55555555
    +2      55555555 AAAAAAAA
    +3      55555555 AAAAAAAA
    +4      55555555 AAAAAAAA
    
    Offset  Original New
    +0      AAAAAAAA AAAAAAAA
    +1      AAAAAAAA AAAAAAAA
    +2      AAAAAAAA 55555555
    +3      AAAAAAAA 55555555
    +4      AAAAAAAA 55555555
    
    

    Thanks, Brian. And this is a difference in behavior from before, right?
  • cgracey wrote: »
    And this is a difference in behavior from before, right?
    That's right Chip.
    Here are the silicon results showing the "glitch":
    Offset  Original New
    +0      FFFFFFFF FFFFFFFF
    +1      FFFFFFFF FFFFFFFF
    +2      FFFFFFFF 09009DFF	'glitch
    +3      FFFFFFFF 00000000
    +4      FFFFFFFF 00000000
    
    Offset  Original New
    +0      00000000 00000000
    +1      00000000 00000000
    +2      00000000 00000000
    +3      00000000 FFFFFFFF
    +4      00000000 FFFFFFFF
    
    Offset  Original New
    +0      55555555 55555555
    +1      55555555 55555555
    +2      55555555 01005555	'glitch
    +3      55555555 AAAAAAAA
    +4      55555555 AAAAAAAA
    
    Offset  Original New
    +0      AAAAAAAA AAAAAAAA
    +1      AAAAAAAA AAAAAAAA
    +2      AAAAAAAA 88288AAA	'glitch
    +3      AAAAAAAA 55555555
    +4      AAAAAAAA 55555555
    