HyperRAM driver for P2

Comments

  • evanh Posts: 9,867
    edited 2020-05-14 - 12:23:25
    Excellent, thanks Von. You have to remember 200 MT/s is the rated speed for these parts at 3v3. And even at 1v8 the rated speed is still only 333 MT/s. So, given we're clearing 350 MT/s without even optimising the board for it, I'd say the failures above 360 MT/s are down to severe attenuation rather than any timing issue.

  • rogloh wrote: »
    Interesting there was one single error observed at 256MHz on P32 even with the capacitor. Maybe some brief random noise spike?
    It'll be related to the poor matching on the Eval Board for that pin group. Reflection interference or something. I get a long string of single/double digit error counts above 300 MHz. The difference could be down to me using a leaded capacitor vs Von's surface mount. Maybe 27 pF would be a safer choice.

  • VonSzarvas Posts: 2,045
    edited 2020-05-14 - 13:02:34
    To summarise, each test hit its first error at (MHz):

    P32, no cap = 30
    P32, 22pF = 256 (possible glitch, with the next error coming at 353)

    P16, no cap = 89
    P16, 22pF = 359

    P0, no cap = 110
    P0, 22pF = 360
    P0, 10pF = 363
    P0, 1pF = 207

    It would be fair to add that without caps, P16 had significantly more errors than P0. However, they both hit the first error at similar points, both with and without caps. Also, P16 had fairly close matching by accident in the layout, whereas P32 was quite a bit off (which is reflected in the results).

    Thus it seems the improvement achieved by impedance matching of traces is not by itself the key to higher speeds, but it helps significantly.
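
As a rough illustration of the summary above (not part of the original test code), the first-error figures can be tabulated to show how far each capacitor moved the first failure. The numbers are copied from the list; the dictionary layout and the `cap_gain` helper are assumptions made for this sketch, and the P32 22pF entry uses the 256 figure that was flagged as a possible glitch.

```python
# First-error sysclock frequency (MHz) per pin group, copied from the summary.
# The data structure and helper name are hypothetical, for illustration only.
first_error_mhz = {
    "P32": {"no cap": 30, "22pF": 256},   # 256 may be a glitch; next error was at 353
    "P16": {"no cap": 89, "22pF": 359},
    "P0": {"no cap": 110, "22pF": 360, "10pF": 363, "1pF": 207},
}


def cap_gain(results, cap="22pF"):
    """Return how many MHz the first error moved up once `cap` was fitted."""
    return {
        group: values[cap] - values["no cap"]
        for group, values in results.items()
        if cap in values
    }


gains = cap_gain(first_error_mhz)
for group, gain in gains.items():
    print(f"{group}: first error moves up by {gain} MHz with 22pF")
```

Read this way, the capacitor accounts for most of the gain on every pin group, while matching mainly shows up in the no-cap figures.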

    I'm thinking it would be worth trace-matching at least P16 to P31 on the next Eval board, if not P32 to P47 also (if possible). P0 to P15 are already matched for customers who got a RevB board with date-code 1952 or higher, or want to grab one whilst there's only 16 left!

    For the HyperRAM accessory I'll make a note to look at the layout around the data and clock traces and see if we can improve anything for a future rev, or maybe add a footprint for a cap on each clock line at least. I'm fairly sure we included some series R on the clock and data? traces... perhaps those values could be adjusted for higher speed gains too. hmm.. better open the layout...

    Edit-- ok, series 10R's on the clock and RWDS signals! Yeah, a cap will fit quite nicely next to each of the clock series R's without needing to change anything else that might have other unintended consequences.
  • VonSzarvas wrote: »
    ... P0 to P15 are already matched for customers who got a RevB board with date-code 1952 or higher, or want to grab one whilst there's only 16 left!
    Ah, that could be the difference for me. I've got 1928 date stamp on the underside.

  • Gosh, there's some teeny tiny trimmer cap and res packages available that would fit nicely for experimentation! I'll have a hunt around for a spare module to adjust and scope out.

    Does that test code @evanh have an option to run continuously at a certain fixed rate ? (Sorry- I should look really!)
  • Yep, I didn't make a #define for it but it's as simple as uncommenting line 513 and setting the starting XMUL to what you want.
    		incmod	comp, #compref+9	wc	'number of columns
    	if_nc	jmp	#comploop
    
    		call	#putnl
    '		jmp	#logloop			'uncomment this line to cycle forever at one frequency
    		cmp	xtalmul, #370		wcz
    	if_b	ijnz	xtalmul, #logloop		'loop back for next sysclock setting, keep going up until crash
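
For readers not fluent in PASM2, here is a minimal Python sketch of the control flow in the snippet above. It only models the loop structure as I read it: with the `jmp #logloop` line commented out, `xtalmul` steps up by one per pass until it reaches 370; with it uncommented, the same value repeats forever. The function name, the `hold` flag and the `max_passes` cap are inventions for this sketch (the cap exists only so the example terminates).

```python
def sweep(start_xmul, limit=370, hold=False, max_passes=5):
    """Yield the xtalmul value used on each pass of the logging loop.

    hold=False models the snippet as posted (step up until `limit`).
    hold=True models uncommenting `jmp #logloop` (stay at one frequency);
    the real code would loop forever, so max_passes stops this sketch.
    """
    xmul = start_xmul
    passes = 0
    while True:
        yield xmul
        passes += 1
        if hold:
            if passes >= max_passes:
                return
            continue
        if xmul < limit:      # cmp xtalmul, #370 / if_b ...
            xmul += 1         # ... ijnz increments before looping back
        else:
            return


print(list(sweep(368)))                           # stepping mode
print(list(sweep(300, hold=True, max_passes=3)))  # fixed-frequency mode
```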
    
  • Right now my driver doesn't support sysclk/1 writes (only sysclk/2), though it could be added in time if this capacitor change and signal trace matching works out well. It may add a slight penalty of a couple of instructions to dynamically select it in the write path but I'd think it should be able to fit in the code space.
  • VonSzarvas Posts: 2,045
    edited 2020-05-14 - 15:31:36
    Thanks guys. Thank you both for walking me through this testing.

    line 513 got it!

    and about sysclk/1, is that something that the test-code simulates (or could it?)
  • VonSzarvas wrote: »
    and about sysclk/1, is that something that the test-code simulates (or could it?)

    Not quite sure what you are asking for there. I think evanh's standalone test code is actually using sysclk/1 operation for writes, so no need to simulate, just use it for real.
  • evanh Posts: 9,867
    edited 2020-05-15 - 04:20:57
    Yep, with HR_DIV (dmadiv) = 1 and HR_READS not defined it'll burst write at sysclock/1. If you define HR_READS then it'll burst read at sysclock/1 instead.

    EDIT: And, yes, as Roger says, it really is writing lots of data to the HyperRAM chip. The high bit error counts reported, ~400,000, are roughly 50% of the data written for each case.
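
A quick sketch of why an error count near 50% of the bits written points to complete failure rather than a marginal one: if what comes back is unrelated to what was written, each bit still matches by chance about half the time. This assumes the test pattern is effectively random, which the thread does not state; the buffer size, seed and function name are arbitrary choices for the example.

```python
import random


def bit_errors(written: bytes, read_back: bytes) -> int:
    """Count the bits that differ between two equal-length buffers."""
    return sum(bin(a ^ b).count("1") for a, b in zip(written, read_back))


rng = random.Random(1)   # fixed seed so the run is repeatable
n_bytes = 100_000        # hypothetical buffer size, not the test's real size

written = bytes(rng.getrandbits(8) for _ in range(n_bytes))
garbage = bytes(rng.getrandbits(8) for _ in range(n_bytes))  # unrelated readback

error_fraction = bit_errors(written, garbage) / (8 * n_bytes)
print(f"bit error fraction against unrelated data: {error_fraction:.3f}")
```

A perfect readback gives zero errors, so a column showing roughly half the bits wrong is simply a column that sampled at the wrong time.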

  • So to understand the results, is getting the middle column all to zeros the only target, or should all the columns be zero for an ideal test?
  • Only one column will likely match good results without error. There is one software timing delay value that lines up with getting the correct result (I think that is what evanh is calling "compensations" in the table but I might be wrong...)
  • For sysclock/1, only a single column can be valid. Doesn't matter which column, I have so many columns just so that I don't have to guess the right compensation as much. For sysclock/2 potentially up to two columns could be valid. Sysclock/3 can achieve three columns, and so on ...
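
The column rule above can be pictured with a toy model, offered only as an idealised sketch: assume each data item stays valid on the bus for `div` sysclock cycles, starting at some unknown round-trip latency, and a compensation value is good when it samples inside that window. The ten-column default mirrors the `#compref+9` in the test snippet; the `latency` values and function name are made up for the example, and real boards will do worse than this upper bound.

```python
def valid_compensations(div, latency, n_columns=10):
    """Return the compensation values that sample inside the data-valid window.

    div       -- sysclock divider (1 for sysclock/1, 2 for sysclock/2, ...)
    latency   -- hypothetical delay, in sysclocks, before data becomes valid
    n_columns -- number of compensation columns the test sweeps
    """
    return [c for c in range(n_columns) if latency <= c < latency + div]


for div in (1, 2, 3):
    cols = valid_compensations(div, latency=4)
    print(f"sysclock/{div}: up to {len(cols)} clean column(s): {cols}")
```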

  • Ah, ok.
    Is changing HR_DIV to 2 or 3 the right way to test at sysclk/2 or 3 ?

  • evanh Posts: 9,867
    edited 2020-05-15 - 06:23:20
    Yup

    EDIT: The constant is actually called "dmadiv". It's a legacy of much earlier code that was copying straight between two streamers long before I had my hands on any HyperRAM parts.

  • Awesome. Thanks again guys.

    I'm expecting trimmer deliveries on Tuesday, and will post results soon after.

    If there's anything else you think of which you'd like testing just shout!
  • Lol, you asked! :) A custom layout with dedicated prop2 pins for a single HR chip. No long routes beyond the HR so that best speed can be achieved. Particularly interested in how that would improve the usable frequency bands for reading data out of the HyperRAM chip.

  • And obviously it would make sense to test out a board fitted with new 3V v2.0 HyperRAMs as soon as those chips become available - must be soon. We could see how well they perform given they are not going to be overclocked when running a P2 up to 333 MHz.
  • evanh Posts: 9,867
    edited 2020-05-15 - 07:14:42
    Oh, and certainly have no use for the /RESET pin on a HyperRAM. Rescue an I/O pin by tying /RESET high. If the RAM cells are corrupt on power up, it's no biggie. DRAM isn't expected to be coherent after a power down. EDIT: Just attach it to a small capacitor for an automatic power-up reset feature. EDIT2: Err, had it right the first time. There is already a built-in power-on-reset.

    RWDS I guess should be left connected for those single byte writes ... just seems overkill throwing a whole I/O pin at it is all. I'd try leaving it out if I did my own layout.

    It'd be cool to merge CLK and /CS to rescue another I/O pin. It'd be tricky though, CLK is required to idle low and /CS needs to return high while CLK is idling.

  • @evanh Are there any speed gains to be had though ? If your test shows 0 errors up to 350 MHz already, then it seems about as fast as P2 limits already?

    @rogloh Not seeing them just yet, and I suppose they'd need a dedicated LDO for 3V- not a simple swap out on the existing PCB. (Which is a shame, as I'd gladly swap out a couple HR chips). Poop!

  • VonSzarvas wrote: »
    @evanh Are there any speed gains to be had though ? If your test shows 0 errors up to 350 MHz already, then it seems about as fast as P2 limits already?
    Read data is not so friendly as writes. There are a lot of narrow bands that work and don't work. And the 22 pF capacitor makes them a little narrower!

    Some band overlap is achieved with combinations of registered and unregistered pins but it's pretty hairy, especially if taking temperature effect into account.

  • VonSzarvas Posts: 2,045
    edited 2020-05-15 - 07:22:28
    evanh wrote: »
    Oh, and certainly have no use for the /RESET pin on a HyperRAM. Rescue an I/O pin by tying /RESET high. If the RAM cells are corrupt on power up, it's no biggie. DRAM isn't expected to be coherent after a power down.

    RWDS I guess should be left connected for those single byte writes ... just seems overkill throwing a whole I/O pin at it is all. I'd try leaving it out if I did my own layout.

    It'd be cool to merge CLK and /CS to rescue another I/O pin. It'd be tricky though, CLK is required to idle low and /CS needs to return high while CLK is idling.

    Noted on RESET, good point.

    I suppose one issue with pin sharing those others, is that the traces/circuits to achieve that will add the very crud one seeks to avoid. RWDS might be tempting to leave off if multi-writes are fast enough, but we need minimum 10 pins anyway, so already into a second bank of 8. (Although I get what you are saying for a custom project).

  • I'm expecting great things of an optimised board layout. I just don't know how great.

  • VonSzarvas wrote: »

    @rogloh Not seeing them just yet, and I suppose they'd need a dedicated LDO for 3V- not a simple swap out on the existing PCB. (Which is a shame, as I'd gladly swap out a couple HR chips). Poop!

    I think the acceptable range of the VCC supply was up to 3.6V on the data sheet for 2.0 HyperRAM here:

    S70KL1282/S70KS1282
    https://www.cypress.com/file/501841/download
  • VonSzarvas wrote: »
    I suppose one issue with pin sharing those others, is that the traces/circuits to achieve that will add the very crud one seeks to avoid. RWDS might be tempting to leave off if multi-writes are fast enough, but we need minimum 10 pins anyway, so already into a second bank of 8. (Although I get what you are saying for a custom project).
    Yeah, 11 pins is the sensible answer for first shot at this.

  • rogloh Posts: 2,689
    edited 2020-05-15 - 07:32:27
    VonSzarvas wrote: »
    evanh wrote: »
    Oh, and certainly have no use for the /RESET pin on a HyperRAM. Rescue an I/O pin by tying /RESET high. If the RAM cells are corrupt on power up, it's no biggie. DRAM isn't expected to be coherent after a power down.

    RWDS I guess should be left connected for those single byte writes ... just seems overkill throwing a whole I/O pin at it is all. I'd try leaving it out if I did my own layout.

    It'd be cool to merge CLK and /CS to rescue another I/O pin. It'd be tricky though, CLK is required to idle low and /CS needs to return high while CLK is idling.

    Noted on RESET, good point.

    I suppose one issue with pin sharing those others, is that the traces/circuits to achieve that will add the very crud one seeks to avoid. RWDS might be tempting to leave off if multi-writes are fast enough, but we need minimum 10 pins anyway, so already into a second bank of 8. (Although I get what you are saying for a custom project).

    My current driver supports byte-granular memory writes, and that requires and uses the RWDS pin. I think you'd want to keep that routed to the memory chips on the plug-in boards for P2-EVAL use. Other pin-constrained setups could potentially drop it, sacrifice byte writes and create their own custom drivers, I guess. But I'd route it. Reset is probably less important: my driver will optionally pulse it if specified, though it doesn't require it. In a dedicated setup you'd probably just connect it to the system master reset signal.
  • For the accessory boards we would keep both. They are all about evaluation and experimenting with all the features. There may have only been a choice to make if dropping RESET would allow a single edge-header accessory, but that's not going to happen :)

  • evanh Posts: 9,867
    edited 2020-05-15 - 07:48:26
    Note that when it comes to using RWDS as the write mask at high speed the signal performance of RWDS has to be identical to the data pins, ie: It's effectively a 9th data bit then.

    EDIT: That said, on the prop2, I doubt it'll get used for more than a single byte at a time. So pointless needing it to electrically perform when it's faster to just bit-bash the one byte and leave the performance for complete bursts without RWDS.

  • VonSzarvas Posts: 2,045
    edited 2020-05-15 - 07:44:07
    rogloh wrote: »
    VonSzarvas wrote: »

    @rogloh Not seeing them just yet, and I suppose they'd need a dedicated LDO for 3V- not a simple swap out on the existing PCB. (Which is a shame, as I'd gladly swap out a couple HR chips). Poop!

    I think the acceptable range of the VCC supply was up to 3.6V on the data sheet for 2.0 HyperRAM here:

    S70KL1282/S70KS1282
    https://www.cypress.com/file/501841/download

    Ah, ok that would be interesting to run them at 3.3V and see where they top-out. Those parts are available.... hmm, I'd better not distract myself on that until the Evals are in production, but I've just added S70KL1282 to a samples order. (I'll check later if that's the best available choice)

    Edit: Excitement got the better of me! Not actually available, I was reading the "minimum qty" column !