Exmem_mini.spin2 testing - Page 2 — Parallax Forums



Comments

  • rogloh Posts: 6,097
    edited 2025-09-27 01:39

    I think that's probably it. That would have been the first (and only) real 8 bit mode board as a target. And the chip select pattern or clock speed may have been tied somehow in that driver, assuming lots of fanout load due to having 6 banks of 16MB. As a sanity test you could try my 8 bit driver with your code on a P2Edge and only use half the data pins. You may still need to tweak the clock timing delay of course.

  • No, it's something about how the transfers are chunked up, bad leftover code from the 16-bit driver or something.

  • Rayman Posts: 15,799

    Testing with SWaP's hyperram now... Can't find a setting that works... Missing something?

    Trying things like this:

    'Config for SWaP Module (hyperram)
    exmem : "exmem_mini" | HYPER_DELAY = 12, MEMORY_TYPE = "H", HYPER_CLK = 47, HYPER_SELECT = 46, HYPER_BASE = 48, HYPER_RWDS = 56, HYPER_RESET = -1, HYPER_LATENCY = 7, HYPER_SYNC_CLOCK = false
    

    These are settings from MegaYume that work...

    HYPER_CLK    = 47' 8+HYPER_ACCESSORY
    HYPER_RWDS   = 56'10+HYPER_ACCESSORY
    HYPER_SELECT = 46'12+HYPER_ACCESSORY
    HYPER_BASE   = 48' 0+HYPER_ACCESSORY
    HYPER_RESET  = -1'15+HYPER_ACCESSORY
    
    HYPER_LATENCY = 7'6
    HYPER_WAIT  = HYPER_LATENCY*4 - 2
    HYPER_DELAY = 12   'scan starting from 11
    HYPER_SYNC_CLOCK = false
    HYPER_SYNC_DATA = false
    

    Just realized might have to play with clock freq...

  • Rayman Posts: 15,799

    Think found the problem in ExMem_mini...
    Seems the latency is hard coded to 6 whereas this module needs 7:

    exmem_banks[16+i] := HYPER_SELECT+i + HYPER_CLK<<8 + HYPER_RWDS<<16 + 7<<25 '6<<25 ' Latency????

  • Rayman Posts: 15,799
    edited 2025-09-27 14:39

    Added a "HYPER_LATENCY" constant to exmem_mini and have the code above use that instead.
    Seems good now.
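
    For readers following the bit layout, here is a minimal C sketch of the packed bank word from the Spin2 line quoted above. The field positions (select pin from bit 0, clock pin from bit 8, RWDS pin from bit 16, latency from bit 25) are read off that expression; the helper names `pack_hyper_bank` and `hyper_wait` and the range checks are assumptions for illustration, not part of exmem_mini. The `HYPER_WAIT` formula is the one from the MegaYume settings quoted earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring the Spin2 expression
   HYPER_SELECT+i + HYPER_CLK<<8 + HYPER_RWDS<<16 + latency<<25.
   Spin2 evaluates << before +, so each term is shifted first. */
static uint32_t pack_hyper_bank(uint32_t select_pin, uint32_t clk_pin,
                                uint32_t rwds_pin, uint32_t latency)
{
    /* Assumption: pin numbers are 0..63 and latency fits in bits 25..31. */
    assert(select_pin < 64 && clk_pin < 64 && rwds_pin < 64 && latency < 128);
    return select_pin + (clk_pin << 8) + (rwds_pin << 16) + (latency << 25);
}

/* HYPER_WAIT as written in the MegaYume settings: HYPER_LATENCY*4 - 2. */
static uint32_t hyper_wait(uint32_t latency)
{
    return latency * 4 - 2;
}
```

    With the SWaP pin numbers from this thread, `pack_hyper_bank(46, 47, 56, 7)` differs from the hard-coded latency-6 word only in the bits from 25 upward, which is the single field the new constant changes.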

  • Rayman Posts: 15,799

    @Wuerfel_21 Do you have settings for the EC32 module?

  • Should just work

  • Rayman Posts: 15,799

    Ok, it does:

    'Config for Parallax Edge with 32MB PSRAM
    exmem : "exmem_mini" | MEMORY_TYPE = 16

  • Rayman Posts: 15,799

    Having trouble finding any code that works with 96MB module... Sad...

  • Rayman Posts: 15,799

    Ok, @Wuerfel_21 reminded me that the emulators all have suggested PSRAM settings.
    Got 96MB working with MegaYume on P2 Eval board.

    But, these settings don't work with exmem_mini for some reason. Tried a lot of variations, but nothing works...

  • evanh Posts: 16,856

    Might want to try Roger's delay-test reporting programs. They show what "delay" values are suitable. If nothing shows up in the map then that would indicate there are other problems.

    My ram-tester.spin2 is easier to setup and run if you just want to prove the hardware - https://forums.parallax.com/discussion/comment/1569287/#Comment_1569287

  • Rayman Posts: 15,799
    edited 2025-09-27 20:15

    @evanh The module works with MegaYume but not exmem_mini. Think will just wait and hope that @Wuerfel_21 can figure it out...
    Going to move on to higher resolutions.

    Also thinking about what kind of video your 4-bit uSD driver might be able to deliver...

  • Rayman Posts: 15,799

    This one loads the 8bpp bitmap from uSD instead of hub ram.
    Using the @evanh bashed3 uSD driver with FAT32 (instead of FSRW this time).

    the bmp file has to be on the uSD of course...

  • evanh Posts: 16,856
    edited 2025-09-28 00:30

    @Rayman said:
    Also thinking about what kind of video your 4-bit uSD driver might be able to deliver...

    If you're wanting to push the read speed then disabling the CRC check during reads allows the sysclock/2 divider to hit its full capability.
    The example speed tester program has the relevant couple of lines commented out. Uncomment them and you'll see the difference.

    It uses ioctl() and the device handle to set them, so the tester makes the setting change at mount time, but in reality those settings can be adjusted on the fly as you like.

        clkdiv = 0;
        _ioctl(handle, 70, &clkdiv);    // disable read-block CRC processing
        clkdiv = 2;
        _ioctl(handle, 72, &clkdiv);    // set CLK_DIV value
    

    Of note is the parameter value to be set must be held in hubRAM, hence the "&". That's just the way ioctl() works, since it is allowed to pass a whole structure. The control numbers, 70 and 72, I chose in the driver myself. Here's the full list:

        CTRL_BLOCKREAD_CRC = 70,    // Disables/enables CRC processing of block read, default is enabled
        CTRL_GET_CLKDIV = 71,    // Get the active clock-divider
        CTRL_SET_CLKDIV = 72,    // Set the active clock-divider
        CTRL_GET_CLKPOL = 73,    // (Not implemented)
        CTRL_SET_CLKPOL = 74,    // (Not implemented)
    
  • evanh Posts: 16,856

    Oh, huh, the Spin2 libc support doesn't have ioctl(). I'll ask Eric about it ...

  • Rayman Posts: 15,799

    Think have the 800x480 4.3" IPS LCD board working.
    At least, have something on the screen...

    Seems to be a good fit for this exmem driver.
    Hopefully, will have it going soon...

  • Rayman Posts: 15,799

    Was testing converting the raycaster to use exmem today...

    Starting with the background image (mountains).
    Loading the image from uSD, so doesn't have to be embedded anymore.
    Anyway, at the start of each frame the assembly graphics drivers draw the background image by copying 100 lines of 320 pixels from exmem to HUB screen buffer.

    One interesting thing is how to monitor that exmem is ready for another command.
    Seems it would be better to poll the mailbox base just before writing to the base with new command.
    But, this doesn't work so well, a couple lines flicker.

    Appears better to poll the mailbox base after writing to it.
    Guess this is a hair slower, but produces a stable image...

  • Check out how the Spin exmem_mini code implements this - you check the first long is zero before writing ANYTHING. The longs should either be written last to first or in one go with SETQ+WRLONG (this is generally the fruitful approach for ASM code)
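
    The ordering described in this post can be modelled in a few lines of C. This is only a sketch of the handshake, assuming a 3-long mailbox whose first long is nonzero while a request is pending; the type `mailbox_t` and the functions `mailbox_post` and `mailbox_idle` are hypothetical names, and real P2 code would do this in Spin2 or PASM (for example with SETQ+WRLONG) rather than through a C struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one COG's 3-long mailbox in hub RAM. */
typedef struct {
    volatile uint32_t word[3];  /* word[0] = command, nonzero while pending */
} mailbox_t;

/* Try to post a request. Returns false, and writes nothing, if the driver
   has not yet cleared the first long of the previous request. */
static bool mailbox_post(mailbox_t *mb, uint32_t cmd,
                         uint32_t arg1, uint32_t arg2)
{
    assert(cmd != 0);           /* zero means "idle", so it cannot be a command */
    if (mb->word[0] != 0)
        return false;           /* still busy: do not touch any of the longs */
    mb->word[2] = arg2;         /* write the last long first ... */
    mb->word[1] = arg1;
    mb->word[0] = cmd;          /* ... and the command long last */
    return true;
}

/* True once the driver has cleared the command long. */
static bool mailbox_idle(const mailbox_t *mb)
{
    return mb->word[0] == 0;
}
```

    Writing the command long last matters because, under this model, the driver may start reading the other two longs as soon as the first one becomes nonzero.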

  • evanh Posts: 16,856

    I just did a little testing with both the Swap Module and the Edge Card using Rayman's 24 MB add-on, attaching with a short piece of ribbon cable. The results are a mess! The Edge scores basically 0% success, an occasional 1% pops up ... While the Swap mostly achieves 99% score. Almost nothing is perfect, some frequencies worse, a few frequencies do hit 100%. The same 24 MB add-on works flawlessly when plugged into my Eval Boards.

    What's interesting is I've used both those ribbon cables without issue with the 4-bit SD add-on. Not even an occasional CRC error. So I now wonder how important those inline 33 ohm resistors are on the 4-bit SD add-on board? I might have to try adding them to the DRAM add-on boards too ...

  • rogloh Posts: 6,097
    edited 2025-10-04 13:09

    I think that card needs a sysclk/3 to work due to the capacitive load with all the banks in parallel. If you plug your high speed scope on the clock/data lines you might see how bad it is...?
    EDIT: oops confused 24 variant for 96MB which had 6 banks. 24MB has 3 in parallel right?

  • evanh Posts: 16,856
    edited 2025-10-04 14:07

    @rogloh said:
    24MB has 3 in parallel right?

    Yep, that's the one. I checked at sysclock/4 and /8 to see if that might help - It didn't, although the pattern of frequencies where better and worse scores occurred did change as a result.

    PS: I've dug up one 22R and one 47R, of the short resistors, from my misc box of parts. Not enough to test with. Need to go to the shop I think.
    PPS: I do have a bunch of wire wound at those values but I'm not about to use those.
    PPPS: I can get this at the local, and can always use a big variety of resistors - https://www.jaycar.co.nz/the-easy-way-to-buy-1-4-watt-carbon-film-resistors/p/RR1697

  • Rayman Posts: 15,799

    What is Burst mode about?

    Does it break up long read/write into smaller chunks ?
    If so, Is there a default chunk size?

  • Large transfers need to be broken into chunks, either when they cross the page boundary or when they get too long to meet the spec for "CE# low pulse-width" (8µs). That last limit is set by maxburst. Though in practice I'm not sure if the latter is actually a real concern? I think my default settings might actually exceed that? idk right now. Though each burst is uninterruptible, so reducing the burst limit can help reduce latency for smaller transfers.
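
    As a rough illustration of the chunking rule described in this post, the C sketch below cuts a transfer at whichever comes first: the end of the current page or the burst limit. The function names `next_chunk` and `count_chunks` are hypothetical, and the page size and burst limit are plain parameters here, not the driver's actual defaults.

```c
#include <assert.h>
#include <stdint.h>

/* Length of the next chunk for a transfer starting at addr with `remaining`
   bytes left. page_size must be a power of two; max_burst models the
   burst limit. Both are illustrative parameters, not driver defaults. */
static uint32_t next_chunk(uint32_t addr, uint32_t remaining,
                           uint32_t page_size, uint32_t max_burst)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    assert(max_burst != 0);
    uint32_t to_page_end = page_size - (addr & (page_size - 1));
    uint32_t len = remaining;
    if (len > to_page_end) len = to_page_end;  /* stop at the page boundary */
    if (len > max_burst)   len = max_burst;    /* stop at the burst limit */
    return len;
}

/* Count how many chunks a whole transfer is split into. */
static uint32_t count_chunks(uint32_t addr, uint32_t total,
                             uint32_t page_size, uint32_t max_burst)
{
    uint32_t n = 0;
    while (total > 0) {
        uint32_t len = next_chunk(addr, total, page_size, max_burst);
        addr  += len;
        total -= len;
        n++;
    }
    return n;
}
```

    Lowering `max_burst` in this model raises the chunk count for a large transfer, which is the trade-off mentioned above: more, shorter bursts give other requesters more chances to get in between them.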

  • Rayman Posts: 15,799

    Let me rephrase, sorry...

    When would you use R_READBURST vs R_READLONG ?

    Does the driver not automatically break up long reads if you pick R_READLONG ?
    Maybe that is what you mean by the above?

  • I think that one just reads one long (4 bytes), but that's a question for @rogloh

  • evanh Posts: 16,856

    I'd guess it is how the data gets passed. With READLONG the data is returned in the mailbox. With READBURST you pass an address to a block of hubRAM to be filled.

  • Rayman Posts: 15,799

    Ok, see one needs to use BURST mode now...

    If one requests a burst that is very long, will it work? Or, does one manually need to break it up?
    Seeing a setting in QOS:

    PUB setQoS(cogn,maxburst,priority,locked,attention) | mb
      exmem_cogs[cogn] := (maxburst<<16)+(priority<<12)+(locked?1<<10:0)+(attention?1<<11:0)
      mb := (exmem_cog-1)*12 + @exmem_mailbox
      repeat while long[mb]
      long[mb][2] := 0
      long[mb][1] := 0
      long[mb][0] := drv16.R_CONFIG + cogid()
      repeat while long[mb]
    

    If one doesn't call this and asks for a very long burst, will it break?
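
    To make the per-COG word in `setQoS` easier to read, here is a small C sketch of the same packing. The bit positions come straight from the Spin2 expression above (maxburst from bit 16, priority from bit 12, attention at bit 11, locked at bit 10); the name `pack_qos` and the range limits in the assertion are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical C equivalent of
   (maxburst<<16)+(priority<<12)+(locked?1<<10:0)+(attention?1<<11:0). */
static uint32_t pack_qos(uint32_t maxburst, uint32_t priority,
                         bool locked, bool attention)
{
    /* Assumption: maxburst fits in 16 bits and priority in 4 bits,
       so the fields cannot spill into each other. */
    assert(maxburst <= 0xFFFFu && priority <= 0xFu);
    return (maxburst << 16) + (priority << 12)
         + (locked    ? (1u << 10) : 0u)
         + (attention ? (1u << 11) : 0u);
}
```

    For example, `pack_qos(256, 3, false, true)` sets a burst limit of 256 with priority 3 and COGATN notification enabled; what unit the driver counts `maxburst` in is not stated in this thread.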

  • evanh Posts: 16,856

    I imagine it's all automatic. Roger's driver has the kitchen sink.

  • evanh Posts: 16,856

    At the top of each driver file:

     Features:
     --------
     * single COG PSRAM based driver
     * supports 4x4bit PSRAM devices forming a 16 bit data bus, with common clock and chip enable
     * up to 16 banks of memory devices can be mapped on 16MB/32MB/64MB/128MB/256MB boundaries 
     * device selected on the bus will be based on the address bank in the memory request
     * configurable control pins and shared data bus group for the memory devices
     * uses a 3 long mailbox per COG for reading memory requests and for writing results
     * error reporting for all failed requests
     * supports strict priority and round-robin request polling (selectable per COG)
     * optional notification of request completion with the COGATN signal to the requesting COG
     * re-configurable maximum transfer burst size limits setup per COG, and per device
     * automatic fragmentation of transfers exceeding configured burst sizes or page
     * sysclk/2 read/write transfer rates are supported and optionally sysclk/1 transfer rates
     * provides single byte/word/long and burst transfers for reading/writing external memory
     * input delay can be controlled to allow driver to operate with varying P2 clocks/temps/boards
     * graphics copies/fill and other external memory to memory copy operations are supported
     * request lists supported allowing multiple requests with one mailbox transaction (DMA engine)
     * maskable read-modify-write support for atomic memory changes and sub-byte sized pixel writes
     * unserviced COGs can be removed from the polling loop to reduce latency
    