64 MB PSRAM module using 16 pins? --> 96 MB w/16 pins or 24 MB w/8 pins - Page 6 — Parallax Forums


Comments

  • Wow, that was quick. Mine is still in transit somewhere unknown. I'm thinking that if you can get 8 bit mode working, or at least split the 4 bit accesses over 12 banks, you can use my 4 bit driver to at least populate the memories with something. If you do attempt to get an 8 bit read working but use my 4 bit driver to write the memory, you'll probably have to interleave the written data when you copy in the ROM data. Getting the 4 bit burst reads split up at memory boundaries will be interesting. Hopefully the information in these ROM types didn't span more than 8MB each, so you could assume a burst read won't cross into the next bank in the same access, but I'm not sure how the cartridge addressing worked. My driver code currently starts a new access on page boundaries (in the same bank), but it might have problems when trying to burst write across banks, as I don't think I ever increment the bank in the driver. I wrote this in my own documentation when I did the original HyperRam stuff:

    In all cases with memory fills, copies and burst read/write operations, if the number of transfers from the starting source address (or destination address) exceeds the last address for that device’s size, the address will just wrap around within the device. Both the source and the destination address increment only, so reverse copies are not possible.
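
A minimal sketch of the wrap-around behaviour described in the quoted documentation, written in Python purely for illustration (the driver itself is PASM2). The function name `burst_addresses` is hypothetical and not taken from the driver source; the only assumption is that an 8 MB device spans 23 address bits.

```python
def burst_addresses(start, count, size_bits):
    """Yield `count` byte addresses from `start`, wrapping inside a device
    of 2**size_bits bytes. Addresses only ever increment."""
    mask = (1 << size_bits) - 1
    addr = start & mask
    for _ in range(count):
        yield addr
        addr = (addr + 1) & mask  # wrap to 0 past the last device address

# An 8 MB device has 23 address bits. A 4-byte burst that starts 2 bytes
# before the end of the device wraps back to offset 0.
EIGHT_MB_BITS = 23
wrapped = list(burst_addresses((1 << EIGHT_MB_BITS) - 2, 4, EIGHT_MB_BITS))
```

The point of the model is that a burst never spills into a neighbouring device on its own; it lands back at the start of the same one.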

  • Wuerfel_21 Posts: 4,513
    edited 2022-06-28 11:17

    NeoYume only writes 64K or 32K at a time, which individually shouldn't usually cross even a 1MB boundary.

    The gfx/adpcm read never cross a page boundary at all, so no problem there. The way the program read code is written, it should also just work.

    Can you give me a 4bit driver with that "two CS per bank" linear addressing hack you spoke of?

  • (I do have NeoYume running with a single bank currently, btw. Works as well as it does.)

  • rogloh Posts: 5,171
    edited 2022-06-28 12:20

    @Wuerfel_21 said:
    Can you give me a 4bit driver with that "two CS per bank" linear addressing hack you spoke of?

    Yeah, although it will take a bit longer than I initially thought because of the extra chip initialization sequence which needs more work too so that all devices get put into the correct QSPI mode. This is the very first time multiple PSRAM banks have been tested and I'm not sure if anything else is missing yet. I'll probably want to test it myself here when the board arrives otherwise I'll be sending over versions to you that might fail and create other problems.

  • Rayman Posts: 13,903

    @Wuerfel_21 You got both boards to work with MegaYume? That's great news! I was thinking it needed a redesign...

    How much too wide is it? The board could be much smaller, but I didn't prioritize that at all...

  • Rayman Posts: 13,903

    @rogloh said:
    Change it to this:

    That works! All 12 chips seem to now be able to do the video test @ 325 MHz.

  • About a centimeter or so.

    Ignore bad angle, strange measure and partially disassembled monitor in background

  • Rayman Posts: 13,903

    Maybe I could just remove the mounting holes on either side of the connectors? Would 1/2 centimeter in width and length on both sides be enough for it to fit?

  • No. I thought about just filing off the hole bits, but it intersects with some of the traces.

  • evanh Posts: 15,192

    @rogloh said:
    Wow that was quick. Mine is still on its way in transit somewhere unknown.

    Did it hop through Japan by any chance? That's where mine was a couple of days back.

  • Rayman Posts: 13,903

    Hope @evanh and @rogloh get your boards soon. The USPS tracking is usually next to worthless, but actually seemed OK for @Wuerfel_21 , amazingly...

  • Rayman Posts: 13,903

    Got new boards and stencils...

  • rogloh Posts: 5,171
    edited 2022-06-29 02:21

    @Rayman said:
    Hope @evanh and @rogloh get your boards soon. The USPS tracking is usually next to worthless, but actually seemed OK for @Wuerfel_21 , amazingly...

    Yeah no updates since this one a few days ago...the plane should have arrived by now.

    June 25, 2022, 1:25 pm
    Departed
    NEW YORK, UNITED STATES
    Your item departed a transfer airport in JOHN F KENNEDY INTL, NEW YORK, UNITED STATES on June 25, 2022 at 1:25 pm. The item is currently in transit to the destination.

    Evanh's made it to Japan? Well hopefully my board's flight didn't run out of fuel and ditch somewhere in the Pacific, or it took a vacation in Hawaii instead. :smile:

    A short delay is probably okay as it gives me time to get the extra initialization changes done before it arrives and I'm ready to test it.

  • rogloh Posts: 5,171
    edited 2022-06-29 06:56

    @Wuerfel_21 , I was able to create the updated PSRAM device init sequence that I think is required to initialize the 12 devices on Rayman's custom 96MB board. I hope I've not missed something and if by some fluke this works first shot you can get the first crack at using it before I've had a chance to even test... Hacked driver code attached.

    You are going to want to create 6 "16MB" contiguous banks using the lower CLK pin for each bank along with the associated CE pin for that bank, and also pass in the lowest pin number for the data bus (must be on 8 pin boundary). The driver code will do the rest and split the 16MB internally into two different devices, and access the upper 8MB using the next highest clock pin device dynamically based on A23 in the memory address of the request, as well as selecting the nibble of the data bus based on this too.

    Access itself is still nibble oriented. 8MB wrap-around during burst reads/writes won't work because the correct device nibble is selected only once per transfer and doesn't get recomputed partway through, but you already mentioned this likely won't be an issue for NeoYume directly. You may just need to be careful, when doing your ROM copy into PSRAM from HUB, not to cross this boundary in any single transfer.

    The device data would be setup in the startup parameters something like this...where CLK0 is the lower pin of the pair.

    ' data for memory
    deviceData
        ' 16 bank parameters follow (16MB per bank)
        long    ((burst<<16)|(delay<<12)|23) [6] ' where burst and delay are computed values
        long    0[10]                               ' banks 6-15
        ' 16 banks of pin parameters follow
        long    (CLK0_PIN << 8) | CE0_PIN             ' bank 0 
        long    (CLK0_PIN << 8) | CE1_PIN             ' bank 1 
        long    (CLK0_PIN << 8) | CE2_PIN             ' bank 2 
        long    (CLK0_PIN << 8) | CE3_PIN             ' bank 3 
        long    (CLK0_PIN << 8) | CE4_PIN             ' bank 4 
        long    (CLK0_PIN << 8) | CE5_PIN             ' bank 5 
        long    -1[10]                              ' banks 6-15
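
The 8MB boundary caveat in this post can be sketched as a helper that breaks a copy into transfers which never cross a sub-bank boundary. This is an illustrative Python model, not driver code: the names `split_at_sub_banks` and `sub_bank_of` are hypothetical, and it assumes A23 is simply bit 23 of a byte address within the 16MB bank.

```python
SUB_BANK_BYTES = 8 * 1024 * 1024  # one 8 MB PSRAM device

def split_at_sub_banks(addr, length, boundary=SUB_BANK_BYTES):
    """Split (addr, length) into chunks that never cross a multiple of `boundary`."""
    chunks = []
    while length > 0:
        room = boundary - (addr % boundary)  # bytes left before the next boundary
        n = min(length, room)
        chunks.append((addr, n))
        addr += n
        length -= n
    return chunks

def sub_bank_of(addr):
    """A23 of the address: 0 = lower 8 MB device, 1 = upper 8 MB device."""
    return (addr >> 23) & 1

# A 512-byte copy that straddles the 8 MB boundary becomes two transfers.
chunks = split_at_sub_banks(0x7FFF00, 0x200)
```

Each resulting chunk stays inside one device, so the once-per-transfer device selection described above would remain valid for the whole chunk.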
    
  • Oh, I wanted the one that just uses two CS per bank. The nibble switching thing just seems like a bad idea.

  • rogloh Posts: 5,171
    edited 2022-06-29 10:46

    @Wuerfel_21 said:
    Oh, I wanted the one that just uses two CS per bank. The nibble switching thing just seems like a bad idea.

    I see, well I can look at that...it might not take too long for me to re-arrange.

    I think that would only give you 48MB with a 4 bit bus (3x16MB), until the 8 bit stuff is fully coded up and that will take quite a while longer and I'll need the board for that too.

    Is 48MB enough?

  • I mean it's more than 32MB...

  • rogloh Posts: 5,171
    edited 2022-06-29 10:50

    Not by much but maybe more games will work...

    I think I can hack it up pretty easily, anyway.

  • rogloh Posts: 5,171
    edited 2022-06-29 11:25

    Ok here's the version you want instead. You'll need to setup 3x16MB banks using even CE0, CE2, CE4 chip select pins... the next odd CE pin will get driven if the upper 8MB of each bank is accessed by looking at A23.

    It's definitely a lot easier/faster to manage this. I do initialize each "16MB" bank in pairs now with both CE0+CE1 pin active, etc. Hopefully writing control sequences into 2 memories at the same time should be okay, I've not tried that before.

    Let me know if this works. It's pretty simple so it should at least do something...

    ' data for memory
    deviceData
        ' 16 bank parameters follow (16MB per bank)
        long    ((burst<<16)|(delay<<12)|23) [3] ' where burst and delay are computed values
        long    0[13]                               ' banks 3-15
        ' 16 banks of pin parameters follow
        long    (CLK0_PIN << 8) | CE0_PIN             ' bank 0 
        long    (CLK0_PIN << 8) | CE2_PIN             ' bank 1 
        long    (CLK0_PIN << 8) | CE4_PIN             ' bank 2 
        long    -1[13]                              ' banks 3-15
    

    UPDATE: I'll have to dig deeper into the whole bank wrapping/sizing thing. From my recollection it may actually use the size parameter of each bank (which is set to 23 above, one less than the number of significant address bits), to form a mask applied after the address is incremented. It has to be a multiple of 2 but if this size value is set to 64MB instead for each bank, it may be possible to do larger bursts crossing over banks. Fragmentation for PSRAM's 1kB page boundaries already occurs as well inherently at the 8MB boundaries so the A23 bit address change should be taken care of there too...a small win for us.
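
As a rough model of the even/odd chip-select scheme in this post: the bank supplies an even CE pin, and A23 picks between it and the next odd pin. The Python below is only a sketch. The pin numbers 40, 42 and 44 are made up for the example, `ce_for_address` is a hypothetical name, and it assumes the bank index comes from address bits 24..27 (16 banks of 16MB).

```python
def ce_for_address(addr, even_ce_pins):
    """Return the CE pin driven for a linear address.

    even_ce_pins[bank] is the even CE pin configured for that 16 MB bank.
    The upper 8 MB of a bank (A23 = 1) uses the next odd pin.
    """
    bank = (addr >> 24) & 0xF   # assumed: 16 banks of 16 MB
    a23 = (addr >> 23) & 1      # 0 = lower 8 MB, 1 = upper 8 MB
    return (even_ce_pins[bank] & ~1) | a23

# Hypothetical pin numbers for CE0, CE2, CE4 (3 x 16 MB = 48 MB).
EVEN_CE = [40, 42, 44]
first = ce_for_address(0x0000000, EVEN_CE)
upper0 = ce_for_address(0x0800000, EVEN_CE)
last = ce_for_address(0x2FFFFFF, EVEN_CE)
```

Under these assumptions all six chips on the three configured banks are reachable through a single linear 48MB address range.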

  • Wuerfel_21 Posts: 4,513
    edited 2022-06-29 11:46

    Odd/even doesn't work for the 24MB board though (CS on 5,6,7)

    (Don't have time for messing with code right now, will test later)

  • rogloh Posts: 5,171
    edited 2022-06-29 12:23

    @Wuerfel_21 said:
    Odd/even doesn't work for the 24MB board though (CS on 5,6,7)

    (Don't have time for messing with code right now, will test later)

    Okay I thought you wanted to use the 96MB board.

    To deal with that you would probably have to reorder your chip select order and patch the code (in two places) to reverse the address bit mapping.

    Change these:
        bitc cspin, #0 'patch into CE pin LSB to select hi/low 8MB sub-bank
    to this each time:
        bitnc cspin, #0 'patch into CE pin LSB to select hi/low 8MB sub-bank

    You'd pass CE6 for bank 0 (and CE6 will be used for the upper 8MB) and "CLK" for bank 1 and CE5 would be used for the lower 8MB of the second bank. CE4 doesn't exist (it's the clock pin) so be sure to not access the upper 8MB after the first 24MB.

    ' data for memory
    deviceData
        ' 16 bank parameters follow (16MB per bank)
        long    ((burst<<16)|(delay<<12)|23) [2] ' where burst and delay are computed values
        long    0[14]                               ' banks 2-15
        ' 16 banks of pin parameters follow
        long    (CLK_PIN << 8) | CE6_PIN             ' bank 0 
        long    (CLK_PIN << 8) | CLK_PIN             ' bank 1 
        long    -1[14]                              ' banks 2-15
    
    

    I need to come up with a way to initialize the memories though and that's a problem. It sort of needs to select CE5,CE6,CE7 all in one go instead of trying to init (CE4,CE5) and (CE6,CE7) because CE4 is the clock pin on the 24MB board so it needs more thought there and probably another custom build....
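
To see what swapping `bitc` for `bitnc` does to the pin mapping, here is a small Python model. It assumes the C flag holds A23 at that point in the driver (the posts imply this but do not show it), and it uses the 24MB board's relative pin numbers mentioned in the thread: clock on 4, chip selects on 5, 6, 7. The function name `select_ce` is hypothetical.

```python
def select_ce(pin, addr, invert):
    """Model of patching bit 0 of the CE pin number from A23.

    invert=False models `bitc`  (bit 0 = A23).
    invert=True  models `bitnc` (bit 0 = NOT A23).
    """
    a23 = (addr >> 23) & 1
    bit0 = a23 ^ 1 if invert else a23
    return (pin & ~1) | bit0

LOWER, UPPER = 0x000000, 0x800000  # offsets in the lower / upper 8 MB of a bank

# Bank 0 is configured with CE6, bank 1 with the clock pin (4).
bank0 = (select_ce(6, LOWER, True), select_ce(6, UPPER, True))
bank1 = (select_ce(4, LOWER, True), select_ce(4, UPPER, True))
```

In this model bank 0 resolves to pins 7 and 6, and bank 1 resolves to pin 5 for its lower half and pin 4 (the clock) for its upper half, which is why the last 8MB must not be accessed.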

  • @rogloh said:
    Okay I thought you wanted to use the 96MB board.

    I have both, so I want both to work.

  • Wasn't the initial idea to just specify a second CS pin?

    That'd work fine for odd bank counts (including using the same driver for a single bank setup)

  • Yeah that was an idea discussed some time back in the thread. I can make one of those to test, it will need another small change....will post it here shortly.

  • Here's the change that supports up to 48MB of the 96MB board (and also 24 MB) with primary and secondary chip selects per 16MB bank. Note you will need to setup the banks like this below and include a new byte in the 32 bit word with the secondary chip enable pin number that gets used for the upper 8MB sub-bank of the 16MB bank. The primary (original) CE pin is still used for the lower 8MB.

    For the 96MB board to address 48MB:

    deviceData
        ' 16 bank parameters follow (16MB per bank)
        long    ((burst<<16)|(delay<<12)|23) [3] ' where burst and delay are computed values
        long    0[13]                               ' banks 3-15
        ' 16 banks of pin parameters follow
        long    (CE1_PIN << 16) | (CLK0_PIN << 8) | CE0_PIN             ' bank 0 
        long    (CE3_PIN << 16) | (CLK0_PIN << 8) | CE2_PIN             ' bank 1 
        long    (CE5_PIN << 16) | (CLK0_PIN << 8) | CE4_PIN             ' bank 2
        long    -1[13]                              ' banks 3-15
    

    For the 24MB board, the second bank uses the same CE pin for both primary and secondary, so accesses to the upper 8MB of bank 1 (beyond 24MB) will fold over back into the 16-24MB range. I've used CE5, CE6, CE7 for the three chip enable names here but you know what I mean; maybe they are labelled 0, 1, 2 on the board, not sure.

    deviceData
        ' 16 bank parameters follow (16MB per bank)
        long    ((burst<<16)|(delay<<12)|23) [2] ' where burst and delay are computed values
        long    0[14]                               ' banks 2-15
        ' 16 banks of pin parameters follow
        long    (CE6_PIN << 16) | (CLK_PIN << 8) | CE5_PIN             ' bank 0 
        long    (CE7_PIN << 16) | (CLK_PIN << 8) | CE7_PIN             ' bank 1  (same value for both so only 8MB addressable)
        long    -1[14]                              ' banks 2-15
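
The primary/secondary layout of the pin word can be modelled in a few lines of Python. This is a sketch of the packing shown in the post (secondary CE in bits 16..23, CLK in bits 8..15, primary CE in bits 0..7). The helper names are hypothetical, and the pin numbers are the 24MB board's relative numbers from the thread (clock 4, chip selects 5, 6, 7).

```python
def pack_pins(primary_ce, clk, secondary_ce):
    """Build one bank's pin word: (secondary << 16) | (clk << 8) | primary."""
    return (secondary_ce << 16) | (clk << 8) | primary_ce

def ce_for(word, addr):
    """Pick the CE pin for an address within a 16 MB bank using A23."""
    a23 = (addr >> 23) & 1
    return (word >> 16) & 0xFF if a23 else word & 0xFF

CLK, CE5, CE6, CE7 = 4, 5, 6, 7
banks = [pack_pins(CE5, CLK, CE6),   # bank 0: two distinct chips
         pack_pins(CE7, CLK, CE7)]   # bank 1: same chip twice, only 8 MB usable

def ce_for_linear(addr):
    """Map a linear address to the CE pin that would be driven."""
    return ce_for(banks[addr >> 24], addr & 0xFFFFFF)

pins_hit = [ce_for_linear(a) for a in (0x0000000, 0x0800000, 0x1000000, 0x1800000)]
```

The last entry shows the fold-over: an access at 24MB lands on the same chip as one at 16MB.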
    
  • Wuerfel_21 Posts: 4,513
    edited 2022-06-29 14:18

    Thanks. Currently fighting with flexspin, seems to generate bung code if I try setting up banks in a loop. Flexsplorp.

    Edit: any loop in exmem_start breaks. cool.

  • I think it's related to alloca-ing the parameter structures - anything that adds a 4th local to the function breaks. Oh well, malloc it is.

  • Doesn't seem to be working. Small games still work. Not sure if bug is in driver or my read code.

    repeat i from 0 to (PSRAM_BANKS-1)/2
      long[banks][0+i] := 128<<16 + (PSRAM_DELAY+1)<<13 + (PSRAM_SYNC_DATA?0:1)<<12 + 24
      long[banks][16+i] := PSRAM_SELECT+i*2 + PSRAM_CLK<<8 + (PSRAM_SELECT+i*2+(((PSRAM_BANKS&1)&&(i>=PSRAM_BANKS/2)) ? 0 : 1))<<16
    

    (yes I tried shortening that subexpression to just (PSRAM_SELECT+i*2+1)<<16)
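
For checking the loop by hand, here is a Python transcription of what the two Spin2 expressions appear to compute. It assumes Spin2 binds `<<` tighter than `+`, so every shift is parenthesised explicitly (Python's precedence is the other way round). The function name `bank_words` and the argument values used below are made up for the example. The size field is left as a parameter because this snippet uses 24 while the `deviceData` examples earlier in the thread use 23.

```python
def bank_words(banks, select, clk, delay, sync_data, size_field=24):
    """Return (params, pins) lists as the Spin2 loop would fill them.

    banks  : number of PSRAM chips (PSRAM_BANKS)
    select : first chip-select pin (PSRAM_SELECT)
    clk    : clock pin (PSRAM_CLK)
    """
    params, pins = [], []
    # Spin2 `repeat i from 0 to N` includes N.
    for i in range((banks - 1) // 2 + 1):
        params.append((128 << 16) + ((delay + 1) << 13)
                      + ((0 if sync_data else 1) << 12) + size_field)
        primary = select + i * 2
        # With an odd chip count, the last bank reuses its primary CE.
        shared = bool(banks & 1) and i >= banks // 2
        secondary = primary + (0 if shared else 1)
        pins.append(primary + (clk << 8) + (secondary << 16))
    return params, pins

# 24 MB board with the relative pin numbers from the thread: CLK 4, CS 5..7.
params3, pins3 = bank_words(3, 5, 4, 10, False)
# Six chips on consecutive, made-up pin numbers 40..45 with clock on 32.
params6, pins6 = bank_words(6, 40, 32, 10, False)
```

With three chips this reproduces the same pin words as the hand-written 24MB table in the thread, which makes it a convenient reference when hard-coding values for a first test.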

  • If you diff the driver's sub-bank changes vs the original 4 bit PSRAM driver you can see they are reasonably minor in scope. I hope that it's functional but I can't test it out yet until Rayman's boards arrive here.

    If you have a scope or logic analyzer (or even another P2) maybe you can quickly hack something that reads PSRAM on different 8MB boundaries and look to see if you get a trigger on the correct CE pin...? And, also that each PSRAM CE is exercised at init time to prove all chips get setup. That'd hopefully show the driver doing the right thing.

  • @Wuerfel_21 said:
    Doesn't seem to be working. Small games still work. Not sure if bug is in driver or my read code.

    repeat i from 0 to (PSRAM_BANKS-1)/2
      long[banks][0+i] := 128<<16 + (PSRAM_DELAY+1)<<13 + (PSRAM_SYNC_DATA?0:1)<<12 + 24
      long[banks][16+i] := PSRAM_SELECT+i*2 + PSRAM_CLK<<8 + (PSRAM_SELECT+i*2+(((PSRAM_BANKS&1)&&(i>=PSRAM_BANKS/2)) ? 0 : 1))<<16
    

    (yes I tried shortening that subexpression to just (PSRAM_SELECT+i*2+1)<<16)

    Maybe hard code it once to prove it's working before getting too fancy with parameters...
