64 MB PSRAM module using 16 pins? --> 96 MB w/16 pins or 24 MB w/8 pins - Page 10 — Parallax Forums


Comments

  • yea non-consecutive clocks don't work

  • Ok. Was just looking at that and realizing the clock pin would be a problem even if I managed to fix the CE. Ideally your banks should allow any CE and any CLK group. It will be quite difficult to have one single global clock pin drive 8 or 12 chips on a 64 or 96MB board, for example, without loading it up rather a lot. Maybe the P2 has enough drive for some speeds, but 12 loads is quite a lot.

  • You can have multiple clock pins, they just need to be consecutive pins.

  • Maybe we can remap pin neighbors somehow to fix this problem...?

  • evanh Posts: 15,847

    Nope.

    I presume the clocks need to be consecutive to allow the ADDPINS effect.

  • Yes, that's it, needs to be a single ADDPINS group.

  • Wuerfel_21 Posts: 4,999
    edited 2022-07-02 10:47

    Unrelatedly, sysclk/4 PSRAM. It kinda works, but kinda just doesn't. (not pictured: the later levels, where it goes full slideshow)

    Also odd: Had to change PSRAM_WAIT (extra clocks between command and data). Was 10 not the right value to begin with?

    PSRAM_WAIT  = 9
    PSRAM_DELAY = 2
    
  • Damn, would have been able to try this out with this 64MB board had the code supported different clk pins for different banks. Is a patch possible? Given this is the faster x16 config, in theory there will be cycle savings that might compensate for the extra clock pin patching... vs 4 bit PSRAM.

  • Yea, you can totally hack the code to allow for weird pin configs. There are 3 different read paths in the memory arbiter section. You'd have to get the clock pingroup into a register and then use that instead of the constant.

  • rogloh Posts: 5,745
    edited 2022-07-02 11:08

    Probably just needs two registers (one for CE and one for CLK). Because the pingroup size is the same for both banks (2), it would just have to patch bit 2 of both pin registers based on A25 of the address to select the upper or lower 32MB bank. So really it's just 3 instructions for a 2 bank setup, which is one less than you already have... I am assuming we can use C or Z here.
    Something like this...

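    testb   address, #25 wc     ' C = A25 (0 = lower 32 MB bank, 1 = upper)
    bitc    cegroup, #2         ' write C into bit 2 of the CE pin group
    bitc    clkgroup, #2        ' write C into bit 2 of the CLK pin group
    ...
    cegroup     long    PSRAM_SELECT    ' pin group for low 32 MB bank
    clkgroup    long    PSRAM_CLK       ' pin group for low 32 MB bank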
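
The patching idea above can be modelled off-target. This is only a sketch in Python, not driver code: it assumes the usual P2 pin-field layout (base pin in bits 5..0, ADDPINS count above it), and it assumes the upper bank's CE and CLK pins sit exactly 4 pins above the lower bank's, with bit 2 of the lower base pins clear. The pin numbers 40 and 48 are hypothetical.

```python
ADDPINS_SHIFT = 6  # base pin lives in bits 5..0, extra-pin count starts at bit 6


def pin_group(base_pin, extra_pins):
    """Build a pin-group value the way `base ADDPINS extra` would."""
    return (base_pin & 0x3F) | (extra_pins << ADDPINS_SHIFT)


def select_bank(address, cegroup, clkgroup):
    """Model of: testb address,#25 wc / bitc cegroup,#2 / bitc clkgroup,#2."""
    c = (address >> 25) & 1                      # C flag = A25
    cegroup = (cegroup & ~0b100) | (c << 2)      # BITC writes C into bit 2
    clkgroup = (clkgroup & ~0b100) | (c << 2)
    return cegroup, clkgroup


# Hypothetical wiring: 2-pin groups, lower bank on P40/P41 (CE) and P48/P49 (CLK).
CE_LOW = pin_group(40, 1)
CLK_LOW = pin_group(48, 1)

if __name__ == "__main__":
    for addr in (0x0000_0000, 0x01FF_FFFF, 0x0200_0000):
        ce, clk = select_bank(addr, CE_LOW, CLK_LOW)
        print(f"addr ${addr:08X}: CE base P{ce & 0x3F}, CLK base P{clk & 0x3F}")
```

Because BITC writes the flag rather than toggling the bit, the same two registers can be patched on every access without restoring them first.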
    
  • Just try it out if you want. The code isn't that hard to read.

  • Rayman Posts: 14,544

    Have you all tried using the smartpin pull downs on the bus?
    Maybe a bit of termination would help?

  • During writes you'll need the maximum possible drive, just to fight the capacitances of the target chips, the connectors, and the clock/enable/data lanes.

    During reads, data-lane signal integrity can perhaps be improved a bit by setting the P2 input pins' termination to current mode (10 to 100 uA), for both HHH and LLL.

    It must be remembered that there's an initial write phase, when the address/commands are transferred towards the RAM chips, so the data-bus termination will need to be changed on-the-fly during the delay interval countdown, and restored to maximum drive as soon as chip enable is negated at the end of the read-data-transfer period.

    Chip enable and clock will need to be kept at a constant maximum drive, all the time.

    Maybe it can act somewhat like a DDR2 active-termination regulator, trying to keep the bus near a mid point (~1.65 V), but this will "force" the PSRAM's 50 Ohm output drivers a little bit more.

    Nothing bad will happen, so every attempt is valid.

  • evanh Posts: 15,847

    Going down that line of thinking needs termination at the far end as well, and removal of physical branches in all signal paths. Really should be designed in.

    I thought about adding it to the six 8-bit RAMs I laid out but figured it wasn't the biggest issue when compared to the loading of the six chips.

  • Rayman Posts: 14,544

    Sounds like bottom line is that the 24 MB module will work as is but 96 MB will not, at least at the high speed that Neoyume needs…

    Is it possible to just use N of the 24 MB modules to get as much memory as needed?

  • @Rayman said:
    Sounds like bottom line is that the 24 MB module will work as is but 96 MB will not, at least at the high speed that Neoyume needs…

    Is it possible to just use N of the 24 MB modules to get as much memory as needed?

    In theory yes, in practice it'd require big overhead to handle switching all the pins. Also, the bandwidth from a 4 bit bus is not really ideal, it slows down quite a bit. I'd vote for a half-capacity version of the 96MB design: 6 chips, 8 bit bus (leaves 3 pins free...). Or a corner board with a 16 bit bus, I guess. Or try making another 8bit board, but with a really tight layout (wonder if connecting the P2 pins in the middle instead of at one end does any good?).

    Speaking of, if @rogloh can pull an 8bit driver out of his hat at some point, we'll find out if 8bit sysclock/4 is at least fast enough to not have video corruption.
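
The capacity and pin counts being traded off here can be tallied with a few lines of Python. This is a rough sketch under stated assumptions, none of which are confirmed in the thread: each chip is taken to be an 8 MB, 4-bit part (which is what 24 MB from 3 chips implies), each bank gets its own CE pin, and the 8-bit boards use 2 clock pins.

```python
CHIP_MB = 8     # assumed capacity of one PSRAM chip
CHIP_BITS = 4   # assumed data width of one PSRAM chip


def capacity_mb(banks, bus_bits):
    """Total capacity: each bank holds bus_bits / 4 chips side by side."""
    return banks * (bus_bits // CHIP_BITS) * CHIP_MB


def pins_needed(banks, bus_bits, clock_pins):
    """Data pins + one CE per bank + clock pins."""
    return bus_bits + banks + clock_pins


if __name__ == "__main__":
    # (label, banks, bus width, clock pins, header pins available)
    configs = [
        ("24 MB, 4-bit", 3, 4, 1, 8),
        ("48 MB, 8-bit", 3, 8, 2, 16),
        ("64 MB, 8-bit", 4, 8, 2, 16),
        ("96 MB, 8-bit", 6, 8, 2, 16),
    ]
    for label, banks, bits, clocks, header in configs:
        used = pins_needed(banks, bits, clocks)
        print(f"{label}: {capacity_mb(banks, bits)} MB, "
              f"{used} pins used, {header - used} free")
```

Under these assumptions the half-capacity 8-bit board comes out at 13 pins, which matches the "leaves 3 pins free" remark for a 16-pin footprint.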

  • Speaking of, if @rogloh can pull an 8bit driver out of his hat at some point, we'll find out if 8bit sysclock/4 is at least fast enough to not have video corruption.

    It may take a while to happen (if it fits in the COG). You could probably test the performance in the meantime if you make an 8 bit reader and just use my driver in 4 bit mode to fill the memory initially. You just need to do the transformation so the correct data goes into the correct devices. You'll basically just write the even nibbles of the ROMs into one memory half in one pass and the odd nibbles into the other half in another pass. Shouldn't be too hard to do for a test.
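
The nibble transformation described above could be prototyped on a PC before touching the P2. The following Python is a minimal sketch, not the driver's actual behaviour: it assumes "even nibbles" means the low nibble of each byte and "odd nibbles" the high nibble, and the order in which two nibbles are packed into one byte for the 4 bit driver (`first_in_high`) is a guess that would need checking against the real hardware.

```python
def _pack(nibbles, first_in_high):
    """Pack a nibble list two per byte, in the chosen order."""
    if first_in_high:
        return bytes((nibbles[i] << 4) | nibbles[i + 1]
                     for i in range(0, len(nibbles), 2))
    return bytes(nibbles[i] | (nibbles[i + 1] << 4)
                 for i in range(0, len(nibbles), 2))


def _unpack(half, first_in_high):
    """Inverse of _pack: recover the nibble list from packed bytes."""
    out = []
    for b in half:
        out.extend((b >> 4, b & 0xF) if first_in_high else (b & 0xF, b >> 4))
    return out


def split_nibbles(rom, first_in_high=True):
    """Return (even_half, odd_half): the images to write in two 4-bit passes."""
    if len(rom) % 2:
        raise ValueError("ROM length must be even")
    even = [b & 0xF for b in rom]   # low nibbles -> one memory half
    odd = [b >> 4 for b in rom]     # high nibbles -> the other half
    return _pack(even, first_in_high), _pack(odd, first_in_high)


def read_8bit(even_half, odd_half, first_in_high=True):
    """What an 8-bit reader would see with both halves on one bus."""
    lo = _unpack(even_half, first_in_high)
    hi = _unpack(odd_half, first_in_high)
    return bytes((h << 4) | l for l, h in zip(lo, hi))


if __name__ == "__main__":
    rom = bytes([0x12, 0x34, 0xAB, 0xCD])
    even_half, odd_half = split_nibbles(rom)
    print(even_half.hex(), odd_half.hex())
    print(read_8bit(even_half, odd_half) == rom)
```

Each half ends up half the size of the ROM, so the two passes together write the same amount of data as a plain 4-bit load.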

  • Rayman Posts: 14,544
    edited 2022-07-03 19:02

    Here's maybe a tighter layout for 48 MB on 8-bit bus, as @Wuerfel_21 mentioned...

    Is it worth testing that 96 MB board with only half the chips installed?

    Also, do we really need 2 clocks? Seems like one should be enough...

    (attached image: layout sketch, 744 x 717)

  • Wuerfel_21 Posts: 4,999
    edited 2022-07-03 19:08

    I said tighter layout OR half-capacity ;) I guess you could do the compromise and up it to a 4th bank for 64MB then.

    Single clock should work (since the P2 pin drive seems to be strong enough), but idk.

  • Rayman Posts: 14,544
    edited 2022-07-03 19:12

    I remember now... It's not byte addressable with just one clock...
    Guess we can try 4th bank...

  • You can do byte addressing with one clock, you just don't send a sensible command to the half you don't want to muck with.

  • The way the pins of that SOP package are organized isn't at all flow-through or layout-friendly.

    If one's intention is to keep any nibble-oriented bit order, horizontal/vertical crossing on two layers (and the resulting multitude of vias) seems to be inevitable.

    That adds extra (and unnecessary) series inductance, thus increasing the whole signaling distortion budget, which worsens it a bit more, and also "spreads" the chips, forcing the use of a larger pcb area.

    It's just another proof that those chips were never designed/meant to be "bused" together. Unfortunately, the same applies to the USON package version.

    It would be fantastic if the software drivers could ever "scramble" those four bits (SIO[3:0]), so as to ease the burden of finding the straightest path towards the data bus connector...

  • @Yanomani said:
    It would be fantastic if the software drivers could ever "scramble" those four bits (SIO[3:0]), so as to ease the burden of finding the straightest path towards the data bus connector...

    You know what, that's actually possible.
    Code for formatting the command long looks like this:

                  setbyte ma_prog_addr,#$EB,#3
                  splitb  ma_prog_addr
                                       '< bit scrambling w/ MOVBYTS here
                  rev     ma_prog_addr
                  movbyts ma_prog_addr, #%%0123
                  mergeb  ma_prog_addr
    

    Adding bit scrambling adds just one instruction.
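
To see why a single MOVBYTS is enough, the sequence can be modelled in Python. This is a sketch based on the documented behaviour of SPLITB, MERGEB, REV and MOVBYTS as I understand them; the `scramble` parameter and the example permutation `%%3201` are hypothetical and not taken from the driver. After SPLITB each byte holds the bits of one SIO lane, so permuting bytes at that point permutes lanes.

```python
def splitb(d):
    """SPLITB: byte k, bit j of the result = bit 4*j + k of d."""
    out = 0
    for k in range(4):
        for j in range(8):
            out |= ((d >> (4 * j + k)) & 1) << (8 * k + j)
    return out


def mergeb(d):
    """MERGEB: inverse of SPLITB."""
    out = 0
    for k in range(4):
        for j in range(8):
            out |= ((d >> (8 * k + j)) & 1) << (4 * j + k)
    return out


def rev(d):
    """REV: reverse all 32 bits."""
    return int(f"{d & 0xFFFFFFFF:032b}"[::-1], 2)


def movbyts(d, s):
    """MOVBYTS: result byte i = source byte selected by s[2i+1:2i]."""
    out = 0
    for i in range(4):
        src = (s >> (2 * i)) & 3
        out |= ((d >> (8 * src)) & 0xFF) << (8 * i)
    return out


IDENTITY = 0b11_10_01_00   # %%3210, no lane scrambling


def format_command(addr, scramble=IDENTITY):
    d = (addr & 0x00FFFFFF) | (0xEB << 24)   # setbyte ma_prog_addr,#$EB,#3
    d = splitb(d)
    d = movbyts(d, scramble)                 # hypothetical lane scramble
    d = rev(d)
    d = movbyts(d, 0b00_01_10_11)            # movbyts ma_prog_addr,#%%0123
    return mergeb(d)


if __name__ == "__main__":
    print(f"{format_command(0x123456):08X}")                  # lanes as wired
    print(f"{format_command(0x123456, 0b11_10_00_01):08X}")   # SIO0/SIO1 swapped
```

With the identity permutation the model just reverses the nibble order of the command long; with `%%3201` it additionally swaps bits 0 and 1 inside every nibble, which is the effect a board with SIO0 and SIO1 crossed would need.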

  • @Wuerfel_21 said:

    You know what, that's actually possible.
    Code for formatting the command long looks like this:

                  setbyte ma_prog_addr,#$EB,#3
                  splitb  ma_prog_addr
                                       '< bit scrambling w/ MOVBYTS here
                  rev     ma_prog_addr
                  movbyts ma_prog_addr, #%%0123
                  mergeb  ma_prog_addr
    

    Adding bit scrambling adds just one instruction.

    I bet the answer would be yes!

    I just wasn't sure who would write the first answer...

    Ladies first! Please! :smiley:

  • Rayman Posts: 14,544

    Verified 96 MB board not working right in NeoYume for some banks/nibbles.
    Unsoldered the top 6 chips and now all chips run NeoYume (Crossed Swords).

    Think this is a good sign that new design will work, at least for 48 MB, hopefully, 64 MB.
    Maybe I should add two more chips back...

  • @Rayman said:
    Verified 96 MB board not working right in NeoYume for some banks/nibbles.
    Unsoldered the top 6 chips and now all chips run NeoYume (Crossed Swords).

    Crossed Swords only uses the first chip; that one always works (and should for you; I think it's delay=5, sync_data=false).

  • Rayman Posts: 14,544

    Ok, adding two more chips back in doesn't work.
    Either limit is 3 in parallel or else doesn't like long tracks..

  • Yanomani Posts: 1,524
    edited 2022-07-03 22:34

    @Wuerfel_21 said:
    Hmm, is there anything that'd stop the data bus from going like this? Not sure what'd be ideal for clock. I think it might be beneficial to run it along the data bus, but source it on the opposite end. So the banks further from the data pins are closer to the clock pin in equal proportion.

    But a 3-header corner setup with 16 bit bus and 3 banks would also give 96MB with a lot less headache, I think.

    Now, it only depends on you, @evanh , @Rayman , and @rogloh to agree on how the data bus connector's D[7:0] should best map to the related RAM chips' data pins...

    P.S. Better yet: it can already be tested (if anyone has a strong heart) with a cut&jump on any of the current boards. So, "Knives Out" anyone? :smile:

  • @Rayman said:
    Ok, adding two more chips back in doesn't work.
    Either limit is 3 in parallel or else doesn't like long tracks..

    Please post config. You should have PSRAM_SYNC_DATA on false. That should always work on the first chip.

  • Rayman Posts: 14,544
    edited 2022-07-03 22:51

    PSRAM_SYNC_DATA = false ... seems to help

    Just did a test where I piggybacked a chip removed from bank 5 on top of the upper nibble chip of bank 4 (ran a jumper wire for chip select). Seems to work now.

    That's a good sign for 4 chips in parallel...
