New SD mode P2 accessory board - Page 15 — Parallax Forums



Comments

  • evanh Posts: 16,023
    edited 2024-10-10 09:01

    First thing to note is that while read performance is okay, write performance is way behind the raw speeds seen during driver development. That's something I expect can be improved as the filesystem code gets reworked.

    Here's one of the better examples using the Sandisk Extreme 32 GB (2017). First the development program run (using only an 8 kByte buffer):

     400 kHz SD clock divider = 900
      CMD8 R7 0000015a - Valid v2.0+ SD Card
      ACMD41 busy duration: 184 ms
      ACMD41 OCR c0ff8000 - Valid SDHC/SDXC Card
      CMD2/CMD3 - Data Transfer Mode entered - Published RCA aaaa0000
      ACMD6 - 4-bit data interface engaged
      CMD6 - High-Speed access mode engaged
      CMD9 - CSD backed-up
      CMD10 - CID backed-up
      Full Speed clock divider = 3 (120.0 MHz)
      rxlag=7 selected  Lowest=6 Highest=8
     CID decode:  ManID=03   OEMID=SD   Name=SE32G
       Ver=8.0   Serial=6018369B   Date=2017-12
     CSD decode:  Ver 2
       Unformatted User Capacity = 29.22 GiBytes
       Selected Bus Speed = 50.00 MHz   Strength Adjust = no
       Supported Command Classes =  basic  blockread  blockwrite  erase  lock  appspec  switch 
       Unsupported Command Classes =  queuing  reserved  writeprotect  i/o  extension 
     Functions decode:
       Current limit = 200 mA (averaged)
       Function BitMap: G6=8001 G5=8001 G4=8001 G3=8001 G2=c001 G1=8003
       Function Select: G6 = 0  G5 = 0  G4 = 0  G3 = 0  G2 = 0  G1 = 1 
    Init successful
     Other compile options:
      - Response handler #2
      - Buffer size is 8 kiBytes
      - Multi-block-read loop
      - Read data CRC processed in parallel
      - P_PULSE clock-gen
      RX_SCHMITT=0  DAT_REG=0  CLK_REG=0  CLK_POL=0  CLK_DIV=3
    
    Write blocks speed test (ACMD23 not used):
    32768 blocks = 16384 kiB   busy=37   rate = 42.4 MiB/s   duration = 376912 us   zero-overhead = 279620 us   overheads = 25.8 %
    16384 blocks = 8192 kiB   busy=37   rate = 47.0 MiB/s   duration = 170105 us   zero-overhead = 139810 us   overheads = 17.8 %
    8192 blocks = 4096 kiB   busy=37   rate = 47.2 MiB/s   duration = 84577 us   zero-overhead = 69905 us   overheads = 17.3 %
    4096 blocks = 2048 kiB   busy=37   rate = 42.7 MiB/s   duration = 46740 us   zero-overhead = 34953 us   overheads = 25.2 %
    2048 blocks = 1024 kiB   busy=37   rate = 46.5 MiB/s   duration = 21497 us   zero-overhead = 17476 us   overheads = 18.7 %
    1024 blocks = 512 kiB   busy=37   rate = 45.5 MiB/s   duration = 10987 us   zero-overhead = 8738 us   overheads = 20.4 %
    512 blocks = 256 kiB   busy=37   rate = 43.6 MiB/s   duration = 5727 us   zero-overhead = 4369 us   overheads = 23.7 %
    256 blocks = 128 kiB   busy=37   rate = 40.3 MiB/s   duration = 3098 us   zero-overhead = 2185 us   overheads = 29.4 %
    128 blocks = 64 kiB   busy=37   rate = 43.9 MiB/s   duration = 1422 us   zero-overhead = 1092 us   overheads = 23.2 %
    64 blocks = 32 kiB   busy=37   rate = 40.8 MiB/s   duration = 765 us   zero-overhead = 546 us   overheads = 28.6 %
    32 blocks = 16 kiB   busy=37   rate = 35.7 MiB/s   duration = 437 us   zero-overhead = 273 us   overheads = 37.5 %
    16 blocks = 8 kiB   busy=37   rate = 28.6 MiB/s   duration = 273 us   zero-overhead = 137 us   overheads = 49.8 %
    8 blocks = 4 kiB   busy=37   rate = 20.5 MiB/s   duration = 190 us   zero-overhead = 68 us   overheads = 64.2 %
    4 blocks = 2 kiB   busy=37   rate = 13.1 MiB/s   duration = 149 us   zero-overhead = 34 us   overheads = 77.1 %
    2 blocks = 1 kiB   busy=37   rate = 7.5 MiB/s   duration = 129 us   zero-overhead = 17 us   overheads = 86.8 %
    
    Read blocks speed test:
    32768 blocks = 16384 kiB   rate = 47.3 MiB/s   duration = 337991 us   zero-overhead = 279620 us   overheads = 17.2 %
    16384 blocks = 8192 kiB   rate = 47.5 MiB/s   duration = 168375 us   zero-overhead = 139810 us   overheads = 16.9 %
    8192 blocks = 4096 kiB   rate = 47.4 MiB/s   duration = 84252 us   zero-overhead = 69905 us   overheads = 17.0 %
    4096 blocks = 2048 kiB   rate = 47.3 MiB/s   duration = 42251 us   zero-overhead = 34953 us   overheads = 17.2 %
    2048 blocks = 1024 kiB   rate = 47.0 MiB/s   duration = 21251 us   zero-overhead = 17476 us   overheads = 17.7 %
    1024 blocks = 512 kiB   rate = 46.5 MiB/s   duration = 10751 us   zero-overhead = 8738 us   overheads = 18.7 %
    512 blocks = 256 kiB   rate = 45.4 MiB/s   duration = 5500 us   zero-overhead = 4369 us   overheads = 20.5 %
    256 blocks = 128 kiB   rate = 43.4 MiB/s   duration = 2875 us   zero-overhead = 2185 us   overheads = 24.0 %
    128 blocks = 64 kiB   rate = 39.9 MiB/s   duration = 1563 us   zero-overhead = 1092 us   overheads = 30.1 %
    64 blocks = 32 kiB   rate = 34.4 MiB/s   duration = 907 us   zero-overhead = 546 us   overheads = 39.8 %
    32 blocks = 16 kiB   rate = 27.0 MiB/s   duration = 578 us   zero-overhead = 273 us   overheads = 52.7 %
    16 blocks = 8 kiB   rate = 21.6 MiB/s   duration = 361 us   zero-overhead = 137 us   overheads = 62.0 %
    8 blocks = 4 kiB   rate = 16.7 MiB/s   duration = 233 us   zero-overhead = 68 us   overheads = 70.8 %
    4 blocks = 2 kiB   rate = 10.0 MiB/s   duration = 195 us   zero-overhead = 34 us   overheads = 82.5 %
    2 blocks = 1 kiB   rate = 5.5 MiB/s   duration = 175 us   zero-overhead = 17 us   overheads = 90.2 %
    

    Then the filesystem tester run:

       clkfreq = 360000000   clkmode = 0x10011fb
     Clock divider for SD card is 3 (120 MHz)
    mount: OK
     Buffer = 2 kB,  Written 2048 kB at 741 kB/s,  Verified,  Read 2048 kB at 6353 kB/s
     Buffer = 2 kB,  Written 2048 kB at 796 kB/s,  Verified,  Read 2048 kB at 6286 kB/s
     Buffer = 2 kB,  Written 2048 kB at 756 kB/s,  Verified,  Read 2048 kB at 6299 kB/s
     Buffer = 2 kB,  Written 2048 kB at 822 kB/s,  Verified,  Read 2048 kB at 6157 kB/s
    
     Buffer = 4 kB,  Written 2048 kB at 1494 kB/s,  Verified,  Read 2048 kB at 11038 kB/s
     Buffer = 4 kB,  Written 2048 kB at 1312 kB/s,  Verified,  Read 2048 kB at 11055 kB/s
     Buffer = 4 kB,  Written 2048 kB at 1490 kB/s,  Verified,  Read 2048 kB at 11231 kB/s
     Buffer = 4 kB,  Written 2048 kB at 1476 kB/s,  Verified,  Read 2048 kB at 10339 kB/s
    
     Buffer = 8 kB,  Written 4096 kB at 2172 kB/s,  Verified,  Read 4096 kB at 18582 kB/s
     Buffer = 8 kB,  Written 4096 kB at 2046 kB/s,  Verified,  Read 4096 kB at 16885 kB/s
     Buffer = 8 kB,  Written 4096 kB at 2304 kB/s,  Verified,  Read 4096 kB at 18455 kB/s
     Buffer = 8 kB,  Written 4096 kB at 2272 kB/s,  Verified,  Read 4096 kB at 18500 kB/s
    
     Buffer = 16 kB,  Written 4096 kB at 2988 kB/s,  Verified,  Read 4096 kB at 27761 kB/s
     Buffer = 16 kB,  Written 4096 kB at 2782 kB/s,  Verified,  Read 4096 kB at 25235 kB/s
     Buffer = 16 kB,  Written 4096 kB at 3174 kB/s,  Verified,  Read 4096 kB at 27669 kB/s
     Buffer = 16 kB,  Written 4096 kB at 3106 kB/s,  Verified,  Read 4096 kB at 27615 kB/s
    
     Buffer = 32 kB,  Written 8192 kB at 4975 kB/s,  Verified,  Read 8192 kB at 26591 kB/s
     Buffer = 32 kB,  Written 8192 kB at 3432 kB/s,  Verified,  Read 8192 kB at 27955 kB/s
     Buffer = 32 kB,  Written 8192 kB at 4651 kB/s,  Verified,  Read 8192 kB at 26414 kB/s
     Buffer = 32 kB,  Written 8192 kB at 4953 kB/s,  Verified,  Read 8192 kB at 27815 kB/s
    
     Buffer = 64 kB,  Written 8192 kB at 6420 kB/s,  Verified,  Read 8192 kB at 26215 kB/s
     Buffer = 64 kB,  Written 8192 kB at 5877 kB/s,  Verified,  Read 8192 kB at 27816 kB/s
     Buffer = 64 kB,  Written 8192 kB at 5684 kB/s,  Verified,  Read 8192 kB at 27824 kB/s
     Buffer = 64 kB,  Written 8192 kB at 6784 kB/s,  Verified,  Read 8192 kB at 27806 kB/s
    
     Buffer = 128 kB,  Written 16384 kB at 5012 kB/s,  Verified,  Read 16384 kB at 26996 kB/s
     Buffer = 128 kB,  Written 16384 kB at 8265 kB/s,  Verified,  Read 16384 kB at 26973 kB/s
     Buffer = 128 kB,  Written 16384 kB at 8212 kB/s,  Verified,  Read 16384 kB at 27853 kB/s
     Buffer = 128 kB,  Written 16384 kB at 7704 kB/s,  Verified,  Read 16384 kB at 26901 kB/s
    
     Buffer = 256 kB,  Written 16384 kB at 9425 kB/s,  Verified,  Read 16384 kB at 26920 kB/s
     Buffer = 256 kB,  Written 16384 kB at 9410 kB/s,  Verified,  Read 16384 kB at 26960 kB/s
     Buffer = 256 kB,  Written 16384 kB at 9467 kB/s,  Verified,  Read 16384 kB at 27021 kB/s
     Buffer = 256 kB,  Written 16384 kB at 9375 kB/s,  Verified,  Read 16384 kB at 27020 kB/s
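
For anyone wanting to check the columns in the raw block tests above, the figures appear to follow from the 4-bit bus running at the reported 120 MHz SD clock, which gives 60 MB/s with no gaps at all. The Python sketch below is not the test program's code. The function name and the 512-byte block size are assumptions made for illustration. It recomputes the first write line of the Sandisk run (32768 blocks in 376912 us).

```python
BLOCK_BYTES = 512  # assumed SD block size


def raw_test_columns(blocks, duration_us, sd_clock_hz=120_000_000, bus_bits=4):
    """Recompute rate, zero-overhead time and overheads for one log line."""
    total_bytes = blocks * BLOCK_BYTES
    # Ideal bus throughput: one nibble per SD clock on a 4-bit bus.
    bus_bytes_per_s = sd_clock_hz * bus_bits / 8
    zero_overhead_us = total_bytes / bus_bytes_per_s * 1e6
    rate_mib_s = total_bytes / (1024 * 1024) / (duration_us / 1e6)
    overheads_pct = (1 - zero_overhead_us / duration_us) * 100
    return rate_mib_s, zero_overhead_us, overheads_pct


rate, zero_us, ovh = raw_test_columns(32768, 376912)
print(f"rate = {rate:.2f} MiB/s   zero-overhead = {zero_us:.0f} us   overheads = {ovh:.1f} %")
```

The results agree with the logged line to within its last printed digit. The log looks like it truncates rather than rounds, so a small difference in the final digit is expected.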
    
  • evanh Posts: 16,023

    And a lesser example using the Samsung EVO 128 GB (2023):

     400 kHz SD clock divider = 900
      CMD8 R7 0000015a - Valid v2.0+ SD Card
      ACMD41 busy duration: 150 ms
      ACMD41 OCR c0ff8000 - Valid SDHC/SDXC Card
      CMD2/CMD3 - Data Transfer Mode entered - Published RCA 59b40000
      ACMD6 - 4-bit data interface engaged
      CMD6 - High-Speed access mode engaged
      CMD9 - CSD backed-up
      CMD10 - CID backed-up
      Full Speed clock divider = 3 (120.0 MHz)
      rxlag=7 selected  Lowest=6 Highest=8
     CID decode:  ManID=1B   OEMID=SM   Name=ED2S5
       Ver=3.0   Serial=49C16906   Date=2023-2
     CSD decode:  Ver 2
       Unformatted User Capacity = 119.37 GiBytes
       Selected Bus Speed = 50.00 MHz   Strength Adjust = no
       Supported Command Classes =  basic  queuing  blockread  blockwrite  erase  lock  appspec  switch  extension 
       Unsupported Command Classes =  reserved  writeprotect  i/o 
     Functions decode:
       Current limit = 200 mA (averaged)
       Function BitMap: G6=8001 G5=8001 G4=8001 G3=8001 G2=c001 G1=8003
       Function Select: G6 = 0  G5 = 0  G4 = 0  G3 = 0  G2 = 0  G1 = 1 
    Init successful
     Other compile options:
      - Response handler #2
      - Buffer size is 8 kiBytes
      - Multi-block-read loop
      - Read data CRC processed in parallel
      - P_PULSE clock-gen
      RX_SCHMITT=0  DAT_REG=0  CLK_REG=0  CLK_POL=0  CLK_DIV=3
    
    Write blocks speed test (ACMD23 not used):
    32768 blocks = 16384 kiB   busy=37   rate = 43.9 MiB/s   duration = 363813 us   zero-overhead = 279620 us   overheads = 23.1 %
    16384 blocks = 8192 kiB   busy=37   rate = 46.0 MiB/s   duration = 173777 us   zero-overhead = 139810 us   overheads = 19.5 %
    8192 blocks = 4096 kiB   busy=37   rate = 37.0 MiB/s   duration = 107824 us   zero-overhead = 69905 us   overheads = 35.1 %
    4096 blocks = 2048 kiB   busy=37   rate = 43.2 MiB/s   duration = 46268 us   zero-overhead = 34953 us   overheads = 24.4 %
    2048 blocks = 1024 kiB   busy=37   rate = 43.5 MiB/s   duration = 22966 us   zero-overhead = 17476 us   overheads = 23.9 %
    1024 blocks = 512 kiB   busy=37   rate = 35.1 MiB/s   duration = 14235 us   zero-overhead = 8738 us   overheads = 38.6 %
    512 blocks = 256 kiB   busy=37   rate = 35.9 MiB/s   duration = 6950 us   zero-overhead = 4369 us   overheads = 37.1 %
    256 blocks = 128 kiB   busy=37   rate = 20.0 MiB/s   duration = 6226 us   zero-overhead = 2185 us   overheads = 64.9 %
    128 blocks = 64 kiB   busy=37   rate = 20.5 MiB/s   duration = 3036 us   zero-overhead = 1092 us   overheads = 64.0 %
    64 blocks = 32 kiB   busy=37   rate = 7.3 MiB/s   duration = 4225 us   zero-overhead = 546 us   overheads = 87.0 %
    32 blocks = 16 kiB   busy=37   rate = 7.0 MiB/s   duration = 2220 us   zero-overhead = 273 us   overheads = 87.7 %
    16 blocks = 8 kiB   busy=37   rate = 3.6 MiB/s   duration = 2164 us   zero-overhead = 137 us   overheads = 93.6 %
    8 blocks = 4 kiB   busy=37   rate = 1.7 MiB/s   duration = 2215 us   zero-overhead = 68 us   overheads = 96.9 %
    4 blocks = 2 kiB   busy=37   rate = 0.8 MiB/s   duration = 2242 us   zero-overhead = 34 us   overheads = 98.4 %
    2 blocks = 1 kiB   busy=37   rate = 0.4 MiB/s   duration = 2321 us   zero-overhead = 17 us   overheads = 99.2 %
    
    Read blocks speed test:
    32768 blocks = 16384 kiB   rate = 35.1 MiB/s   duration = 455120 us   zero-overhead = 279620 us   overheads = 38.5 %
    16384 blocks = 8192 kiB   rate = 32.4 MiB/s   duration = 246691 us   zero-overhead = 139810 us   overheads = 43.3 %
    8192 blocks = 4096 kiB   rate = 29.9 MiB/s   duration = 133429 us   zero-overhead = 69905 us   overheads = 47.6 %
    4096 blocks = 2048 kiB   rate = 28.1 MiB/s   duration = 71126 us   zero-overhead = 34953 us   overheads = 50.8 %
    2048 blocks = 1024 kiB   rate = 27.2 MiB/s   duration = 36760 us   zero-overhead = 17476 us   overheads = 52.4 %
    1024 blocks = 512 kiB   rate = 27.1 MiB/s   duration = 18390 us   zero-overhead = 8738 us   overheads = 52.4 %
    512 blocks = 256 kiB   rate = 26.8 MiB/s   duration = 9303 us   zero-overhead = 4369 us   overheads = 53.0 %
    256 blocks = 128 kiB   rate = 26.2 MiB/s   duration = 4761 us   zero-overhead = 2185 us   overheads = 54.1 %
    128 blocks = 64 kiB   rate = 25.2 MiB/s   duration = 2473 us   zero-overhead = 1092 us   overheads = 55.8 %
    64 blocks = 32 kiB   rate = 23.7 MiB/s   duration = 1316 us   zero-overhead = 546 us   overheads = 58.5 %
    32 blocks = 16 kiB   rate = 20.4 MiB/s   duration = 765 us   zero-overhead = 273 us   overheads = 64.3 %
    16 blocks = 8 kiB   rate = 16.9 MiB/s   duration = 462 us   zero-overhead = 137 us   overheads = 70.3 %
    8 blocks = 4 kiB   rate = 12.8 MiB/s   duration = 303 us   zero-overhead = 68 us   overheads = 77.5 %
    4 blocks = 2 kiB   rate = 7.3 MiB/s   duration = 265 us   zero-overhead = 34 us   overheads = 87.1 %
    2 blocks = 1 kiB   rate = 3.9 MiB/s   duration = 246 us   zero-overhead = 17 us   overheads = 93.0 %
    
       clkfreq = 360000000   clkmode = 0x10011fb
     Clock divider for SD card is 3 (120 MHz)
    mount: OK
     Buffer = 2 kB,  Written 2048 kB at 312 kB/s,  Verified,  Read 2048 kB at 6154 kB/s
     Buffer = 2 kB,  Written 2048 kB at 313 kB/s,  Verified,  Read 2048 kB at 6193 kB/s
     Buffer = 2 kB,  Written 2048 kB at 318 kB/s,  Verified,  Read 2048 kB at 6193 kB/s
     Buffer = 2 kB,  Written 2048 kB at 331 kB/s,  Verified,  Read 2048 kB at 6421 kB/s
    
     Buffer = 4 kB,  Written 2048 kB at 568 kB/s,  Verified,  Read 2048 kB at 11115 kB/s
     Buffer = 4 kB,  Written 2048 kB at 570 kB/s,  Verified,  Read 2048 kB at 10954 kB/s
     Buffer = 4 kB,  Written 2048 kB at 572 kB/s,  Verified,  Read 2048 kB at 11304 kB/s
     Buffer = 4 kB,  Written 2048 kB at 574 kB/s,  Verified,  Read 2048 kB at 11379 kB/s
    
     Buffer = 8 kB,  Written 4096 kB at 902 kB/s,  Verified,  Read 4096 kB at 15562 kB/s
     Buffer = 8 kB,  Written 4096 kB at 910 kB/s,  Verified,  Read 4096 kB at 18040 kB/s
     Buffer = 8 kB,  Written 4096 kB at 924 kB/s,  Verified,  Read 4096 kB at 16157 kB/s
     Buffer = 8 kB,  Written 4096 kB at 916 kB/s,  Verified,  Read 4096 kB at 17806 kB/s
    
     Buffer = 16 kB,  Written 4096 kB at 1295 kB/s,  Verified,  Read 4096 kB at 21790 kB/s
     Buffer = 16 kB,  Written 4096 kB at 1288 kB/s,  Verified,  Read 4096 kB at 20542 kB/s
     Buffer = 16 kB,  Written 4096 kB at 1284 kB/s,  Verified,  Read 4096 kB at 24078 kB/s
     Buffer = 16 kB,  Written 4096 kB at 1289 kB/s,  Verified,  Read 4096 kB at 21756 kB/s
    
     Buffer = 32 kB,  Written 8192 kB at 1770 kB/s,  Verified,  Read 8192 kB at 31619 kB/s
     Buffer = 32 kB,  Written 8192 kB at 1690 kB/s,  Verified,  Read 8192 kB at 25854 kB/s
     Buffer = 32 kB,  Written 8192 kB at 1622 kB/s,  Verified,  Read 8192 kB at 28498 kB/s
     Buffer = 32 kB,  Written 8192 kB at 1675 kB/s,  Verified,  Read 8192 kB at 32336 kB/s
    
     Buffer = 64 kB,  Written 8192 kB at 2748 kB/s,  Verified,  Read 8192 kB at 26696 kB/s
     Buffer = 64 kB,  Written 8192 kB at 2893 kB/s,  Verified,  Read 8192 kB at 31585 kB/s
     Buffer = 64 kB,  Written 8192 kB at 2728 kB/s,  Verified,  Read 8192 kB at 25030 kB/s
     Buffer = 64 kB,  Written 8192 kB at 2757 kB/s,  Verified,  Read 8192 kB at 27363 kB/s
    
     Buffer = 128 kB,  Written 16384 kB at 3909 kB/s,  Verified,  Read 16384 kB at 30130 kB/s
     Buffer = 128 kB,  Written 16384 kB at 4107 kB/s,  Verified,  Read 16384 kB at 28206 kB/s
     Buffer = 128 kB,  Written 16384 kB at 3826 kB/s,  Verified,  Read 16384 kB at 27884 kB/s
     Buffer = 128 kB,  Written 16384 kB at 4128 kB/s,  Verified,  Read 16384 kB at 27955 kB/s
    
     Buffer = 256 kB,  Written 16384 kB at 5071 kB/s,  Verified,  Read 16384 kB at 27712 kB/s
     Buffer = 256 kB,  Written 16384 kB at 5299 kB/s,  Verified,  Read 16384 kB at 28316 kB/s
     Buffer = 256 kB,  Written 16384 kB at 5022 kB/s,  Verified,  Read 16384 kB at 28461 kB/s
     Buffer = 256 kB,  Written 16384 kB at 5261 kB/s,  Verified,  Read 16384 kB at 28816 kB/s
    
  • rogloh Posts: 5,837

    I just read this in your results:

    Full Speed clock divider = 3 (120.0 MHz)

    So is this being tested at 360MHz? I'll be running around 297MHz or 270MHz I guess. Wonder what I can expect to see there.
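
As a rough guide to the question above, the SD clock is simply the system clock over the divider, and a 4-bit bus moves half a byte per SD clock. The Python sketch below works that through for 360, 297 and 270 MHz. It assumes the driver keeps the divider at 3 for all three, and the function names are made up for illustration. The result is only the bus ceiling, not what the filesystem will actually deliver.

```python
def sd_clock_hz(sysclock_hz, divider=3):
    """SD clock derived from the system clock by an integer divider."""
    return sysclock_hz / divider


def bus_limit_mb_s(sysclock_hz, divider=3, bus_bits=4):
    """Upper bound on transfer rate in MB/s, ignoring all protocol overhead."""
    return sd_clock_hz(sysclock_hz, divider) * bus_bits / 8 / 1e6


for sysclock in (360_000_000, 297_000_000, 270_000_000):
    print(f"{sysclock / 1e6:.0f} MHz sysclock: "
          f"{sd_clock_hz(sysclock) / 1e6:.1f} MHz SD clock, "
          f"{bus_limit_mb_s(sysclock):.1f} MB/s bus ceiling")
```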

  • evanh Posts: 16,023

    Point is the filesystem code is showing its weaknesses now that we have a more performant driver.

    I haven't tried looking into how it all works, so I'm not able to give any reason though.

  • rogloh Posts: 5,837
    edited 2024-10-10 09:28

    Hmm, maybe it's choking on lots of random sector seeks for updating the file system's FAT tables etc, although a lot of that activity should mostly happen once at the start of a large file write if the disk isn't fragmented and uses large cluster size. The cluster chain needs to be updated, presumably done as the file is being written in case of sudden loss. It'd be interesting to see the list of sectors being accessed by the filesystem. A timestamped debug log history in your driver might be needed to see this.

  • evanh Posts: 16,023

    I could do it real simple just by printing each start block number. There's no need for performance when mapping ...

  • rogloh Posts: 5,837

    @evanh said:
    I could do it real simple just by printing each start block number. There's no need for performance when mapping ...

    Yeah then you'll know what the underlying file system is doing between all the sector transfers etc. Might be good to timestamp still, just for seeing relative gaps in calls etc. Absolute time is not so important.

  • evanh Posts: 16,023
    edited 2024-10-10 11:00

    Here's just the writes alone - for writing one 512 kB file, using a 256 kB buffer, to a 16 GB card:

     wr#7e00  wr#84c  wr#433c  wr#84d  wr#433d  wr#84e  wr#433e  wr#84f  wr#433f  wr#850  wr#4340  wr#851  wr#4341  wr#852  wr#4342  wr#853  wr#4343  wr#854  wr#4344  wr#855  wr#4345  wr#856  wr#4346  wr#857  wr#4347  wr#858  wr#4348  wr#859  wr#4349  wr#85a  wr#434a  wr#85b  wr#434b  wr#85c  wr#434c  Buffer = 256 kB,  wr#7e00  wr#801  wr#1de10+10  wr#1de20+10  wr#1de30+10  wr#1de40+10  wr#1de50+10  wr#1de60+10  wr#1de70+10  wr#1de80+10  wr#1de90+10  wr#1dea0+10  wr#1deb0+10  wr#1dec0+10  wr#1ded0+10  wr#1dee0+10  wr#1def0+10  wr#1df00+10  wr#1df10+10  wr#1df20+10  wr#1df30+10  wr#1df40+10  wr#1df50+10  wr#1df60+10  wr#1df70+10  wr#1df80+10  wr#1df90+10  wr#1dfa0+10  wr#1dfb0+10  wr#1dfc0+10  wr#1dfd0+10  wr#1dfe0+10  wr#1dff0+10  wr#1e000+10  wr#84c  wr#433c  wr#7e00  wr#801  wr#1e010+10  wr#1e020+10  wr#1e030+10  wr#1e040+10  wr#1e050+10  wr#1e060+10  wr#1e070+10  wr#1e080+10  wr#1e090+10  wr#1e0a0+10  wr#1e0b0+10  wr#1e0c0+10  wr#1e0d0+10  wr#1e0e0+10  wr#1e0f0+10  wr#1e100+10  wr#1e110+10  wr#1e120+10  wr#1e130+10  wr#1e140+10  wr#1e150+10  wr#1e160+10  wr#1e170+10  wr#1e180+10  wr#1e190+10  wr#1e1a0+10  wr#1e1b0+10  wr#1e1c0+10  wr#1e1d0+10  wr#1e1e0+10  wr#1e1f0+10  wr#1e200+10  wr#84c  wr#433c  wr#7e00  wr#801  Written 512 kB at 2411 kB/s,  Verified,  Read 512 kB at 13244 kB/s
    

    The ones with +10 (16 block, 8 kB cluster size) are multi-block bursts. So clusters don't look to be grouped into a larger burst. Which means the larger buffer in the program isn't doing much good.
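
To make that observation concrete, the Python sketch below parses a short excerpt of the trace, reads the cluster size off the burst length, and checks whether neighbouring bursts are contiguous on the card. The token format is taken from the trace above. The helper names and the 512-byte block size are assumptions, and whether the filesystem could really issue the merged transfer is not something this sketch shows.

```python
import re

BLOCK_BYTES = 512  # assumed SD block size
TOKEN = re.compile(r"wr#([0-9a-f]+)\+([0-9a-f]+)")

# Verbatim excerpt: the first four data bursts after "Buffer = 256 kB," in the trace.
TRACE = "wr#1de10+10  wr#1de20+10  wr#1de30+10  wr#1de40+10"


def parse_bursts(trace):
    """Return (start_block, block_count) pairs; both fields are hex in the trace."""
    return [(int(start, 16), int(count, 16)) for start, count in TOKEN.findall(trace)]


def coalesce(bursts):
    """Merge bursts whose start block follows directly on from the previous burst."""
    merged = []
    for start, count in bursts:
        if merged and merged[-1][0] + merged[-1][1] == start:
            merged[-1] = (merged[-1][0], merged[-1][1] + count)
        else:
            merged.append((start, count))
    return merged


bursts = parse_bursts(TRACE)
cluster_bytes = bursts[0][1] * BLOCK_BYTES
merged = coalesce(bursts)
print(f"cluster size: {cluster_bytes // 1024} kB")
print(f"{len(bursts)} bursts could be issued as {len(merged)}: "
      + ", ".join(f"#{start:x}+{count:x}" for start, count in merged))
```

Every burst in the excerpt starts where the previous one ended, so in principle they could go out as a single multi-block write.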

  • rogloh Posts: 5,837
    edited 2024-10-10 10:06

    Looks like it's interleaving writes between two sector groups at the start. Maybe this part is updating the two FAT tables or something in preparation for the real write. Once the writes start after you print Buffer=256kB then it doesn't seem to touch the other sectors for a bit at least until the end, so this should be pretty fast. Are there gaps between each slowing things down I wonder. You probably want to log the timestamps in real time into a buffer for printing later. Then any gaps between calls will be noticed.
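
The "log into a buffer, print later" idea can be modelled in a few lines. The Python sketch below only illustrates the approach. The real driver would do this in its own language with its own cycle counter, and the class and method names here are hypothetical. Recording is a cheap append into a fixed-size ring, and the gap between calls is only worked out when the log is dumped.

```python
import time
from collections import deque


class BlockTrace:
    """Fixed-size ring of (timestamp, op, start_block, count) records."""

    def __init__(self, capacity=1024):
        # deque with maxlen silently drops the oldest record when full.
        self.events = deque(maxlen=capacity)

    def record(self, op, start_block, count=1):
        # Keep the hot path minimal: just stamp and store.
        self.events.append((time.perf_counter_ns(), op, start_block, count))

    def dump(self):
        """Format the records afterwards, showing the gap since the previous call."""
        lines = []
        previous = None
        for stamp, op, start, count in self.events:
            gap_us = 0 if previous is None else (stamp - previous) // 1000
            lines.append(f"+{gap_us:>8} us  {op}#{start:x}+{count:x}")
            previous = stamp
        return lines


trace = BlockTrace()
trace.record("wr", 0x7e00)
trace.record("wr", 0x801)
trace.record("wr", 0x1de10, 0x10)
print("\n".join(trace.dump()))
```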

  • evanh Posts: 16,023

    @rogloh said:
    Looks like it's interleaving writes between two sector groups at the start. Maybe this part is updating the two FAT tables or something in preparation for the real write.

    That'd be right. The fopen() is before it prints Buffer = 256 kB.

    @rogloh said:
    Once the writes start after you print Buffer=256kB then it doesn't seem to touch the other sectors for a bit at least until the end, so this should be pretty fast. Are there gaps between each slowing things down I wonder. You probably want to log the timestamps in real time into a buffer for printing later. Then any gaps between calls will be noticed.

    Actually, funnily, although a long way short of the raw performance, it's quite fast at writing with a 256 kB buffer. The slow ones are when using smaller buffer sizes.

    Oh, oops, that block list is not 16 MB at all. It's only 512 kB, 2 x 256 kB.

  • rogloh Posts: 5,837

    If the reads were not being printed, then they might also be happening between some of the write calls as well if the filesystem keeps checking the table (hopefully it's not dumb and repeatedly reading the same thing). You'd expect to see some reads happening when the file is initially opened.

  • evanh Posts: 16,023
    edited 2024-10-10 11:18

    That card above was a different one. Here's the same test again using the Samsung 128 GB card. You can see the cluster size is 4x larger.

     wr#f740  wr#84b  wr#7fcb  wr#84c  wr#7fcc  wr#84d  wr#7fcd  wr#84e  wr#7fce  wr#84f  wr#7fcf  Buffer = 256 kB,  wr#f740  wr#801  wr#25780+40  wr#257c0+40  wr#25800+40  wr#25840+40  wr#25880+40  wr#258c0+40  wr#25900+40  wr#25940+40  wr#84b  wr#7fcb  wr#f740  wr#801  wr#25980+40  wr#259c0+40  wr#25a00+40  wr#25a40+40  wr#25a80+40  wr#25ac0+40  wr#25b00+40  wr#25b40+40  wr#84b  wr#7fcb  wr#f740  wr#801  Written 512 kB at 3061 kB/s,  Verified,  Read 512 kB at 28504 kB/s
    

    Another thing that becomes more obvious here is that there are four single-block writes between each 256 kB buffer written.

    EDIT: Same again but with reads now reported as well:

     RD0  RD800  RD801  RDf740  WRf740  RD84b  WR84b  WR7fcb  RDf740  Buffer = 256 kB,  WRf740  WR801  RD84b  WR25780+40  WR257c0+40  WR25800+40  WR25840+40  WR25880+40  WR258c0+40  WR25900+40  WR25940+40  WR84b  WR7fcb  RDf740  WRf740  WR801  RD84b  WR25980+40  WR259c0+40  WR25a00+40  WR25a40+40  WR25a80+40  WR25ac0+40  WR25b00+40  WR25b40+40  WR84b  WR7fcb  RDf740  WRf740  WR801  Written 512 kB at 3458 kB/s,  RDf740  RD25780+40  RD84b  RD257c0+40  RD25800+40  RD25840+40  RD25880+40  RD258c0+40  RD25900+40  RD25940+40  RD25980+40  RD259c0+40  RD25a00+40  RD25a40+40  RD25a80+40  RD25ac0+40  RD25b00+40  RD25b40+40  Verified,  RD25780+40  RD257c0+40  RD25800+40  RD25840+40  RD25880+40  RD258c0+40  RD25900+40  RD25940+40  RD25980+40  RD259c0+40  RD25a00+40  RD25a40+40  RD25a80+40  RD25ac0+40  RD25b00+40  RD25b40+40  Read 512 kB at 19893 kB/s
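
To put numbers on the housekeeping traffic, the Python sketch below tallies one buffer's worth of the trace above: everything from the first WR after "Buffer = 256 kB," up to the RDf740 that precedes the next buffer. The excerpt is copied verbatim. The helper name and the split into "single" and "burst" operations are assumptions for illustration, and the sketch does not try to say which filesystem structure each single block belongs to.

```python
import re
from collections import Counter

TOKEN = re.compile(r"\b(RD|WR)([0-9a-f]+)(?:\+([0-9a-f]+))?")

# Verbatim excerpt: one 256 kB buffer cycle from the trace with reads reported.
CYCLE = ("WRf740  WR801  RD84b  WR25780+40  WR257c0+40  WR25800+40  WR25840+40  "
         "WR25880+40  WR258c0+40  WR25900+40  WR25940+40  WR84b  WR7fcb  RDf740")


def tally(trace):
    """Count operations by direction and by single-block vs multi-block burst."""
    counts = Counter()
    for op, _start, count in TOKEN.findall(trace):
        kind = "burst" if count else "single"
        counts[(op, kind)] += 1
    return counts


counts = tally(CYCLE)
for (op, kind), n in sorted(counts.items()):
    print(f"{op} {kind}: {n}")
```

The tally gives the four single-block writes noted above, plus two single-block reads, around eight data bursts per buffer.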
    
  • evanh Posts: 16,023
    edited 2024-10-10 13:24

    For comparison, using the SPI driver with the Samsung EVO 128 GB card:

       clkfreq = 360000000   clkmode = 0x10011fb
    Filesystem = fatfs,  Driver = sdcardx
    mount: OK
     Buffer = 2 kB,  Written 2048 kB at 299 kB/s,  Verified,  Read 2048 kB at 2441 kB/s
     Buffer = 2 kB,  Written 2048 kB at 289 kB/s,  Verified,  Read 2048 kB at 2277 kB/s
     Buffer = 2 kB,  Written 2048 kB at 289 kB/s,  Verified,  Read 2048 kB at 2461 kB/s
     Buffer = 2 kB,  Written 2048 kB at 291 kB/s,  Verified,  Read 2048 kB at 2286 kB/s
    
     Buffer = 4 kB,  Written 2048 kB at 496 kB/s,  Verified,  Read 2048 kB at 2873 kB/s
     Buffer = 4 kB,  Written 2048 kB at 499 kB/s,  Verified,  Read 2048 kB at 2877 kB/s
     Buffer = 4 kB,  Written 2048 kB at 494 kB/s,  Verified,  Read 2048 kB at 2855 kB/s
     Buffer = 4 kB,  Written 2048 kB at 498 kB/s,  Verified,  Read 2048 kB at 2785 kB/s
    
     Buffer = 8 kB,  Written 4096 kB at 736 kB/s,  Verified,  Read 4096 kB at 3210 kB/s
     Buffer = 8 kB,  Written 4096 kB at 735 kB/s,  Verified,  Read 4096 kB at 3195 kB/s
     Buffer = 8 kB,  Written 4096 kB at 731 kB/s,  Verified,  Read 4096 kB at 3216 kB/s
     Buffer = 8 kB,  Written 4096 kB at 722 kB/s,  Verified,  Read 4096 kB at 3199 kB/s
    
     Buffer = 16 kB,  Written 4096 kB at 960 kB/s,  Verified,  Read 4096 kB at 3383 kB/s
     Buffer = 16 kB,  Written 4096 kB at 963 kB/s,  Verified,  Read 4096 kB at 3375 kB/s
     Buffer = 16 kB,  Written 4096 kB at 963 kB/s,  Verified,  Read 4096 kB at 3367 kB/s
     Buffer = 16 kB,  Written 4096 kB at 961 kB/s,  Verified,  Read 4096 kB at 3382 kB/s
    
     Buffer = 32 kB,  Written 8192 kB at 1140 kB/s,  Verified,  Read 8192 kB at 3460 kB/s
     Buffer = 32 kB,  Written 8192 kB at 1157 kB/s,  Verified,  Read 8192 kB at 3466 kB/s
     Buffer = 32 kB,  Written 8192 kB at 1187 kB/s,  Verified,  Read 8192 kB at 3468 kB/s
     Buffer = 32 kB,  Written 8192 kB at 1257 kB/s,  Verified,  Read 8192 kB at 3461 kB/s
    
     Buffer = 64 kB,  Written 8192 kB at 1653 kB/s,  Verified,  Read 8192 kB at 3468 kB/s
     Buffer = 64 kB,  Written 8192 kB at 1653 kB/s,  Verified,  Read 8192 kB at 3459 kB/s
     Buffer = 64 kB,  Written 8192 kB at 1606 kB/s,  Verified,  Read 8192 kB at 3466 kB/s
     Buffer = 64 kB,  Written 8192 kB at 1681 kB/s,  Verified,  Read 8192 kB at 3470 kB/s
    
     Buffer = 128 kB,  Written 16384 kB at 1980 kB/s,  Verified,  Read 16384 kB at 3469 kB/s
     Buffer = 128 kB,  Written 16384 kB at 2033 kB/s,  Verified,  Read 16384 kB at 3468 kB/s
     Buffer = 128 kB,  Written 16384 kB at 2029 kB/s,  Verified,  Read 16384 kB at 3469 kB/s
     Buffer = 128 kB,  Written 16384 kB at 2034 kB/s,  Verified,  Read 16384 kB at 3469 kB/s
    
     Buffer = 256 kB,  Written 16384 kB at 2270 kB/s,  Verified,  Read 16384 kB at 3470 kB/s
     Buffer = 256 kB,  Written 16384 kB at 2290 kB/s,  Verified,  Read 16384 kB at 3470 kB/s
     Buffer = 256 kB,  Written 16384 kB at 2267 kB/s,  Verified,  Read 16384 kB at 3470 kB/s
     Buffer = 256 kB,  Written 16384 kB at 2282 kB/s,  Verified,  Read 16384 kB at 3469 kB/s
    

    Write speeds are maybe 50% up with the newer driver, and double with a large buffer. Quite dismal considering I was claiming 10x uplift from the raw mode testing. :( Read speeds are something like 8x so not all gloom. :)

    The Sandisk Extreme 32 GB, compared against its SPI driver run below, fares a little better on write speeds: the uplift starts around +50% but moves above 3x at larger buffer sizes. Reads are a solid 8x all round. The Sandisk will be fighting a smaller cluster size too.

       clkfreq = 360000000   clkmode = 0x10011fb
    Filesystem = fatfs,  Driver = sdcardx
    mount: OK
     Buffer = 2 kB,  Written 2048 kB at 645 kB/s,  Verified,  Read 2048 kB at 2410 kB/s
     Buffer = 2 kB,  Written 2048 kB at 612 kB/s,  Verified,  Read 2048 kB at 2450 kB/s
     Buffer = 2 kB,  Written 2048 kB at 655 kB/s,  Verified,  Read 2048 kB at 2447 kB/s
     Buffer = 2 kB,  Written 2048 kB at 651 kB/s,  Verified,  Read 2048 kB at 2443 kB/s
    
     Buffer = 4 kB,  Written 2048 kB at 919 kB/s,  Verified,  Read 2048 kB at 2912 kB/s
     Buffer = 4 kB,  Written 2048 kB at 997 kB/s,  Verified,  Read 2048 kB at 2834 kB/s
     Buffer = 4 kB,  Written 2048 kB at 999 kB/s,  Verified,  Read 2048 kB at 2908 kB/s
     Buffer = 4 kB,  Written 2048 kB at 990 kB/s,  Verified,  Read 2048 kB at 2911 kB/s
    
     Buffer = 8 kB,  Written 4096 kB at 1234 kB/s,  Verified,  Read 4096 kB at 3212 kB/s
     Buffer = 8 kB,  Written 4096 kB at 1298 kB/s,  Verified,  Read 4096 kB at 3130 kB/s
     Buffer = 8 kB,  Written 4096 kB at 1323 kB/s,  Verified,  Read 4096 kB at 3219 kB/s
     Buffer = 8 kB,  Written 4096 kB at 1230 kB/s,  Verified,  Read 4096 kB at 3214 kB/s
    
     Buffer = 16 kB,  Written 4096 kB at 1589 kB/s,  Verified,  Read 4096 kB at 3380 kB/s
     Buffer = 16 kB,  Written 4096 kB at 1581 kB/s,  Verified,  Read 4096 kB at 3383 kB/s
     Buffer = 16 kB,  Written 4096 kB at 1594 kB/s,  Verified,  Read 4096 kB at 3379 kB/s
     Buffer = 16 kB,  Written 4096 kB at 1478 kB/s,  Verified,  Read 4096 kB at 3378 kB/s
    
     Buffer = 32 kB,  Written 8192 kB at 2038 kB/s,  Verified,  Read 8192 kB at 3382 kB/s
     Buffer = 32 kB,  Written 8192 kB at 1677 kB/s,  Verified,  Read 8192 kB at 3378 kB/s
     Buffer = 32 kB,  Written 8192 kB at 2022 kB/s,  Verified,  Read 8192 kB at 3379 kB/s
     Buffer = 32 kB,  Written 8192 kB at 1937 kB/s,  Verified,  Read 8192 kB at 3379 kB/s
    
     Buffer = 64 kB,  Written 8192 kB at 2350 kB/s,  Verified,  Read 8192 kB at 3380 kB/s
     Buffer = 64 kB,  Written 8192 kB at 2350 kB/s,  Verified,  Read 8192 kB at 3378 kB/s
     Buffer = 64 kB,  Written 8192 kB at 2347 kB/s,  Verified,  Read 8192 kB at 3380 kB/s
     Buffer = 64 kB,  Written 8192 kB at 2345 kB/s,  Verified,  Read 8192 kB at 3378 kB/s
    
     Buffer = 128 kB,  Written 16384 kB at 2555 kB/s,  Verified,  Read 16384 kB at 3371 kB/s
     Buffer = 128 kB,  Written 16384 kB at 2486 kB/s,  Verified,  Read 16384 kB at 3371 kB/s
     Buffer = 128 kB,  Written 16384 kB at 2533 kB/s,  Verified,  Read 16384 kB at 3371 kB/s
     Buffer = 128 kB,  Written 16384 kB at 2565 kB/s,  Verified,  Read 16384 kB at 3371 kB/s
    
     Buffer = 256 kB,  Written 16384 kB at 2677 kB/s,  Verified,  Read 16384 kB at 3372 kB/s
     Buffer = 256 kB,  Written 16384 kB at 2676 kB/s,  Verified,  Read 16384 kB at 3371 kB/s
     Buffer = 256 kB,  Written 16384 kB at 2679 kB/s,  Verified,  Read 16384 kB at 3372 kB/s
     Buffer = 256 kB,  Written 16384 kB at 2574 kB/s,  Verified,  Read 16384 kB at 3373 kB/s
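
The uplift figures quoted in this post can be checked against the logs. The Python sketch below averages the four runs at the smallest and largest buffer sizes for the Samsung EVO 128 GB, using numbers copied from the new-driver and SPI-driver runs in this thread. The dictionary layout and function name are just for illustration.

```python
from statistics import mean

# kB/s figures copied from the Samsung EVO 128 GB runs, keyed by buffer size in kB.
NEW_DRIVER = {
    "write": {2: [312, 313, 318, 331], 256: [5071, 5299, 5022, 5261]},
    "read": {2: [6154, 6193, 6193, 6421], 256: [27712, 28316, 28461, 28816]},
}
SPI_DRIVER = {
    "write": {2: [299, 289, 289, 291], 256: [2270, 2290, 2267, 2282]},
    "read": {2: [2441, 2277, 2461, 2286], 256: [3470, 3470, 3470, 3469]},
}


def speedup(direction, buffer_kb):
    """Ratio of mean new-driver rate to mean SPI-driver rate."""
    return mean(NEW_DRIVER[direction][buffer_kb]) / mean(SPI_DRIVER[direction][buffer_kb])


for direction in ("write", "read"):
    for buffer_kb in (2, 256):
        print(f"{direction} @ {buffer_kb} kB buffer: {speedup(direction, buffer_kb):.2f}x")
```

The ratio depends heavily on buffer size, so a single headline figure only describes one end of the range.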
    
  • Rayman Posts: 14,744

    @evanh impressive speeds. So, it's ready to test? Does one just drop those files you posted into Flexprop to use it?

    Does the driver support a power enable pin?

  • evanh Posts: 16,023
    edited 2024-10-10 13:54

    As for 270 MHz with the new driver, it barely affects write performance. Reads are down maybe 15%.

    Sandisk Extreme 32 GB:

       clkfreq = 270000000   clkmode = 0x1041afb
    Filesystem = fatfs,  Driver = sdsdcard
     Clock divider for SD card is 3 (90 MHz)
    mount: OK
     Buffer = 2 kB,  Written 2048 kB at 736 kB/s,  Verified,  Read 2048 kB at 5473 kB/s
     Buffer = 2 kB,  Written 2048 kB at 814 kB/s,  Verified,  Read 2048 kB at 5954 kB/s
     Buffer = 2 kB,  Written 2048 kB at 814 kB/s,  Verified,  Read 2048 kB at 5990 kB/s
     Buffer = 2 kB,  Written 2048 kB at 753 kB/s,  Verified,  Read 2048 kB at 5996 kB/s
    
     Buffer = 4 kB,  Written 2048 kB at 1473 kB/s,  Verified,  Read 2048 kB at 10225 kB/s
     Buffer = 4 kB,  Written 2048 kB at 1456 kB/s,  Verified,  Read 2048 kB at 10375 kB/s
     Buffer = 4 kB,  Written 2048 kB at 1257 kB/s,  Verified,  Read 2048 kB at 10277 kB/s
     Buffer = 4 kB,  Written 2048 kB at 1425 kB/s,  Verified,  Read 2048 kB at 10149 kB/s
    
     Buffer = 8 kB,  Written 4096 kB at 2237 kB/s,  Verified,  Read 4096 kB at 16406 kB/s
     Buffer = 8 kB,  Written 4096 kB at 1262 kB/s,  Verified,  Read 4096 kB at 16523 kB/s
     Buffer = 8 kB,  Written 4096 kB at 2237 kB/s,  Verified,  Read 4096 kB at 16251 kB/s
     Buffer = 8 kB,  Written 4096 kB at 2224 kB/s,  Verified,  Read 4096 kB at 16375 kB/s
    
     Buffer = 16 kB,  Written 4096 kB at 2659 kB/s,  Verified,  Read 4096 kB at 23517 kB/s
     Buffer = 16 kB,  Written 4096 kB at 2963 kB/s,  Verified,  Read 4096 kB at 23622 kB/s
     Buffer = 16 kB,  Written 4096 kB at 3049 kB/s,  Verified,  Read 4096 kB at 23307 kB/s
     Buffer = 16 kB,  Written 4096 kB at 3024 kB/s,  Verified,  Read 4096 kB at 23495 kB/s
    
     Buffer = 32 kB,  Written 8192 kB at 4205 kB/s,  Verified,  Read 8192 kB at 23583 kB/s
     Buffer = 32 kB,  Written 8192 kB at 3033 kB/s,  Verified,  Read 8192 kB at 23411 kB/s
     Buffer = 32 kB,  Written 8192 kB at 4730 kB/s,  Verified,  Read 8192 kB at 23561 kB/s
     Buffer = 32 kB,  Written 8192 kB at 4287 kB/s,  Verified,  Read 8192 kB at 23471 kB/s
    
     Buffer = 64 kB,  Written 8192 kB at 6326 kB/s,  Verified,  Read 8192 kB at 23566 kB/s
     Buffer = 64 kB,  Written 8192 kB at 6382 kB/s,  Verified,  Read 8192 kB at 23476 kB/s
     Buffer = 64 kB,  Written 8192 kB at 6268 kB/s,  Verified,  Read 8192 kB at 23438 kB/s
     Buffer = 64 kB,  Written 8192 kB at 6002 kB/s,  Verified,  Read 8192 kB at 23441 kB/s
    
     Buffer = 128 kB,  Written 16384 kB at 7243 kB/s,  Verified,  Read 16384 kB at 23200 kB/s
     Buffer = 128 kB,  Written 16384 kB at 7092 kB/s,  Verified,  Read 16384 kB at 23128 kB/s
     Buffer = 128 kB,  Written 16384 kB at 7662 kB/s,  Verified,  Read 16384 kB at 23197 kB/s
     Buffer = 128 kB,  Written 16384 kB at 7663 kB/s,  Verified,  Read 16384 kB at 23201 kB/s
    
     Buffer = 256 kB,  Written 16384 kB at 8927 kB/s,  Verified,  Read 16384 kB at 23128 kB/s
     Buffer = 256 kB,  Written 16384 kB at 8915 kB/s,  Verified,  Read 16384 kB at 23229 kB/s
     Buffer = 256 kB,  Written 16384 kB at 8957 kB/s,  Verified,  Read 16384 kB at 23211 kB/s
     Buffer = 256 kB,  Written 16384 kB at 8976 kB/s,  Verified,  Read 16384 kB at 23214 kB/s
    

    Read speeds flatline from the 16 kB buffer size onward, which tells me the cluster size is 16 kB.
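
    The plateau can also be picked out mechanically. Below is a minimal sketch in C, assuming a hypothetical helper `plateau_start_kb()` and a 5% "still improving" threshold (both are illustrative, not part of the driver or test program). The table holds the first read figure of each buffer-size group from the 90 MHz run above. A plateau at 16 kB is consistent with a 16 kB cluster size, but it does not prove it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the first buffer size (in kB) after which
 * the read speed no longer grows by at least min_gain_pct percent, or 0
 * if the speed is still climbing at the end of the table. */
unsigned plateau_start_kb(const unsigned *size_kb, const unsigned *read_kbps,
                          size_t n, unsigned min_gain_pct)
{
    assert(size_kb != NULL && read_kbps != NULL);
    for (size_t i = 0; i + 1 < n; i++) {
        /* Integer form of: next < current * (1 + pct/100) */
        unsigned long long next = (unsigned long long)read_kbps[i + 1] * 100u;
        unsigned long long need = (unsigned long long)read_kbps[i] * (100u + min_gain_pct);
        if (next < need)
            return size_kb[i];
    }
    return 0;
}

/* First read figure of each group from the 90 MHz run (kB/s). */
static const unsigned run_size_kb[]   = { 2, 4, 8, 16, 32, 64, 128, 256 };
static const unsigned run_read_kbps[] = { 5473, 10225, 16406, 23517,
                                          23583, 23566, 23200, 23128 };
```

    Calling `plateau_start_kb(run_size_kb, run_read_kbps, 8, 5)` on this table picks the 16 kB row, since the step from 16 kB to 32 kB gains well under 5%.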

    Removing block read CRC processing and using sysclock/2 - Same story with 16 kB clusters:

       clkfreq = 270000000   clkmode = 0x1041afb
    Filesystem = fatfs,  Driver = sdsdcard
     Clock divider for SD card is 2 (135 MHz)
    mount: OK
     Buffer = 2 kB,  Written 2048 kB at 750 kB/s,  Verified,  Read 2048 kB at 6597 kB/s
     Buffer = 2 kB,  Written 2048 kB at 814 kB/s,  Verified,  Read 2048 kB at 6796 kB/s
     Buffer = 2 kB,  Written 2048 kB at 742 kB/s,  Verified,  Read 2048 kB at 6527 kB/s
     Buffer = 2 kB,  Written 2048 kB at 808 kB/s,  Verified,  Read 2048 kB at 6514 kB/s
    
     Buffer = 4 kB,  Written 2048 kB at 1473 kB/s,  Verified,  Read 2048 kB at 12196 kB/s
     Buffer = 4 kB,  Written 2048 kB at 1278 kB/s,  Verified,  Read 2048 kB at 11430 kB/s
     Buffer = 4 kB,  Written 2048 kB at 1475 kB/s,  Verified,  Read 2048 kB at 11683 kB/s
     Buffer = 4 kB,  Written 2048 kB at 1456 kB/s,  Verified,  Read 2048 kB at 11661 kB/s
    
     Buffer = 8 kB,  Written 4096 kB at 2250 kB/s,  Verified,  Read 4096 kB at 19782 kB/s
     Buffer = 8 kB,  Written 4096 kB at 2001 kB/s,  Verified,  Read 4096 kB at 19990 kB/s
     Buffer = 8 kB,  Written 4096 kB at 2207 kB/s,  Verified,  Read 4096 kB at 19963 kB/s
     Buffer = 8 kB,  Written 4096 kB at 1998 kB/s,  Verified,  Read 4096 kB at 19991 kB/s
    
     Buffer = 16 kB,  Written 4096 kB at 2987 kB/s,  Verified,  Read 4096 kB at 29832 kB/s
     Buffer = 16 kB,  Written 4096 kB at 3069 kB/s,  Verified,  Read 4096 kB at 30003 kB/s
     Buffer = 16 kB,  Written 4096 kB at 3120 kB/s,  Verified,  Read 4096 kB at 29910 kB/s
     Buffer = 16 kB,  Written 4096 kB at 3053 kB/s,  Verified,  Read 4096 kB at 29982 kB/s
    
     Buffer = 32 kB,  Written 8192 kB at 2089 kB/s,  Verified,  Read 8192 kB at 28451 kB/s
     Buffer = 32 kB,  Written 8192 kB at 4682 kB/s,  Verified,  Read 8192 kB at 30135 kB/s
     Buffer = 32 kB,  Written 8192 kB at 3951 kB/s,  Verified,  Read 8192 kB at 29978 kB/s
     Buffer = 32 kB,  Written 8192 kB at 3963 kB/s,  Verified,  Read 8192 kB at 30280 kB/s
    
     Buffer = 64 kB,  Written 8192 kB at 6363 kB/s,  Verified,  Read 8192 kB at 28573 kB/s
     Buffer = 64 kB,  Written 8192 kB at 6290 kB/s,  Verified,  Read 8192 kB at 29972 kB/s
     Buffer = 64 kB,  Written 8192 kB at 6343 kB/s,  Verified,  Read 8192 kB at 28627 kB/s
     Buffer = 64 kB,  Written 8192 kB at 5380 kB/s,  Verified,  Read 8192 kB at 30158 kB/s
    
     Buffer = 128 kB,  Written 16384 kB at 7736 kB/s,  Verified,  Read 16384 kB at 29431 kB/s
     Buffer = 128 kB,  Written 16384 kB at 7803 kB/s,  Verified,  Read 16384 kB at 29510 kB/s
     Buffer = 128 kB,  Written 16384 kB at 7903 kB/s,  Verified,  Read 16384 kB at 29562 kB/s
     Buffer = 128 kB,  Written 16384 kB at 7866 kB/s,  Verified,  Read 16384 kB at 29510 kB/s
    
     Buffer = 256 kB,  Written 16384 kB at 8052 kB/s,  Verified,  Read 16384 kB at 29511 kB/s
     Buffer = 256 kB,  Written 16384 kB at 7875 kB/s,  Verified,  Read 16384 kB at 29538 kB/s
     Buffer = 256 kB,  Written 16384 kB at 8954 kB/s,  Verified,  Read 16384 kB at 29519 kB/s
     Buffer = 256 kB,  Written 16384 kB at 8919 kB/s,  Verified,  Read 16384 kB at 29517 kB/s
    
  • evanhevanh Posts: 16,023
    edited 2024-10-11 02:09

    @Rayman said:
    @evanh impressive speeds. So, it's ready to test? Does one just drop those files you posted into Flexprop to use it?

    Yep, and yes. But you will need the accessory board that has 4-bit bus width. Roger designed the board, so Roger and I are the only two that have it.

    Since the SPI wired bootable SD slot has that EEPROM clash limiting resistor it can't be clocked at extreme rates, so I've not bothered to support 1-bit mode. Although it certainly could be done in the future. There might be some speed-up since the new driver is using streamer ops for everything, which can operate at sysclock/2 even at high sysclocks. At high sysclocks the SPI driver steps down to sysclock/8.

    Does the driver support a power enable pin?

    Yes. Here's a report from the development code:

    Card detected ... power cycle of SD card
      power-down threshold = 37   pin state = 1
      power-down slope = 34914 us   pin state = 0
      power-up threshold = 209   pin state = 0
      power-up slope = 1128 us   pin state = 1
    

    It's using the DAC-comparator pin mode to set each threshold (0.5 V low and 2.7 V high) and then measure how long it takes the power rail to transition. I take 32 samples spread across 1.0 ms, unanimous voting like the debounce circuit, to provide the spec'd hold time before state change. Voltage measuring is at the CLK pin. All I/O pins each have a 22 kR pull-up soldered on the accessory board.
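
    The unanimous-vote idea can be sketched in plain C. This is only an illustration under assumptions: the function name `first_stable_index()` is hypothetical, the comparator readings are passed in as an array rather than read from a pin, and restarting the run on any disagreeing sample is my guess at the behaviour, not something taken from the driver source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sketch: scan comparator readings and return the index of
 * the sample that completes the first run of `need` consecutive readings
 * equal to `wanted` (the post uses 32 samples spread across 1.0 ms).
 * Any disagreeing sample restarts the run, so the vote is unanimous.
 * Returns -1 if no such run exists. */
long first_stable_index(const bool *samples, size_t n, size_t need, bool wanted)
{
    assert(samples != NULL || n == 0);
    if (need == 0)
        return -1;                  /* nothing to vote on */

    size_t run = 0;
    for (size_t i = 0; i < n; i++) {
        if (samples[i] == wanted) {
            if (++run == need)
                return (long)i;     /* run of `need` agreeing samples done */
        } else {
            run = 0;                /* one dissenter rejects the window */
        }
    }
    return -1;
}
```

    Multiplying the returned index by the sample period gives a settle time of the kind reported as the "slope" figures above.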

    EDIT: Removed one repeated sentence for clarity.

  • evanhevanh Posts: 16,023
    edited 2024-10-10 14:30

    I'm guessing Von is planning on making 4-bit versions for Parallax to sell.

    Now that it's proven, I guess it wouldn't be such a huge effort to port the driver to Spin2 and place it in the Obex. I doubt I would have made it this far without printf() for debug. It's not just the ease of formatting the prints either, scrollable terminal history is a massive sanity saver.

    Another thing I do now, not unlike how Chip works, is keep four separate editor windows open, tiled across the width of a single 43" 4k monitor. Each has multiple tabs that are mixtures of the files I'm examining. I can have all windows showing different parts of the one file if I like. None are full height. Multiple terminals open below, and a multi-tabbed file manager, are fixtures.

    It's not a specific IDE but it looks like one.

  • RaymanRayman Posts: 14,744

    @evanh Awesome! I have to try it out. See how sensitive it is to trace lengths and all...

    Wonder how hard it would be to make this work with FSRW...
    Maybe just replace the block driver with this one and it's done?

  • RaymanRayman Posts: 14,744

    Just looked at it and see it's all in C with inline PASM2.
    Thought it might be spin2 that was converted with spin2cpp, but guess not.

  • evanhevanh Posts: 16,023

    @Rayman said:
    Just looked at it and see it's all in C with inline PASM2.
    Thought it might be spin2 that was converted with spin2cpp, but guess not.

    printf() was a critical part of development.

    Yeah, interfacing to FSRW should be the next step. Can you provide the newest edition of that? I know it had lots of contributions over the years and many were broken.

  • RaymanRayman Posts: 14,744

    The latest might be here?
    https://forums.parallax.com/discussion/173378/the-actually-functional-spin2-sd-driver-i-hope-kyefat-sdspi-with-audio/p2

    Guess needs a better maintainer...
    Also see a note where @ke4pjw wanted SD power to cycle with reset. Think that would be good enough?
    Or, needs a dedicated pin?

  • roglohrogloh Posts: 5,837

    @evanh said:

    @Rayman said:
    @evanh impressive speeds. So, it's ready to test? Does one just drop those files you posted into Flexprop to use it?

    Yep, and yes. But you will need the accessory board that has 4-bit bus width. Roger designed the board, so Roger and I are the only two that have it.

    It's an open design shared earlier in this thread, and with any luck Parallax may run with a design based on it, or at least try to share the same pin mapping/functionality. It makes good use of the 8 pins of a P2 accessory module and evanh's driver obviously uses this pin mapping now too, so it's the "standard" right now. But anyone else making their own larger P2 boards can just use the same type of circuit, and evanh's code should work at speed if the layout is not too crazy. The little pFET, which I just had in my spare parts bin, could be changed to something easier to solder if needed; it just needs a reasonably low Rds so it doesn't drop too much voltage at SD card currents. Or power could be controlled by a regulator from 5V down to 3.3V. The card detect trick of sharing the pin is handy too: it ensures power gets shut off when a card is ejected, which is also handy for controlling regulator use in low powered setups.

  • evanhevanh Posts: 16,023
    edited 2024-10-11 13:39

    Ah, oh, I've found something: the filesystem is requesting a SYNC after each cluster is written. That's excessively pedantic behaviour, imho. I never imagined SYNC was even being used, to be honest. Once, upon file close, would seem a reasonable place. Calling SYNC makes no difference to the SD card; it'll store the sent data just the same either way.

  • roglohrogloh Posts: 5,837
    edited 2024-10-11 13:47

    If you return early by skipping that request does it speed it up much? Also with those extra interleaved single sector reads/writes between bursts of 256kB I wonder how much delay that contributes and how much of the bandwidth improvements are reduced.

    Interestingly you can see it write to a sector "84b", then it reads it again between the multi-sector bursts. What has changed it in the meantime that it needs to read it again? Some buffer limitation in the filesystem storage? Ideally it could cache this.
    ... WR25940+40 WR84b WR7fcb RDf740 WRf740 WR801 RD84b WR25980+40 ...

  • evanhevanh Posts: 16,023
    edited 2024-10-11 15:55

    @rogloh said:
    If you return early by skipping that request does it speed it up much? Also with those extra interleaved single sector reads/writes between bursts of 256kB I wonder how much delay that contributes and how much of the bandwidth improvements are reduced.

    Doesn't help at the moment. The problem is those singles, they kill any chance of performance because they break up the contiguous order. I just tried, via the driver, concatenating the separate but contiguous cluster writes into one CMD25. And got it working seemingly reliably too. It made a difference, but only for buffers larger than the cluster size. So kind of annoying.

    Eg: Samsung, using 256 kB buffer, went from 5 MB/s to 9.5 MB/s writing a 16 MB file.
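
    A rough sketch of the bookkeeping behind that concatenation, in C. The names `write_burst_t`, `burst_write()` and `burst_break()` are hypothetical and not from the driver; the sketch only tracks whether a new write request starts exactly where the previous one ended, which is the condition for continuing the same CMD25 instead of starting a new one. It assumes the trace entries quoted earlier, such as WR25940+40, are a hex start sector plus a hex sector count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping for an open multi-block write (CMD25). */
typedef struct {
    bool     open;         /* a burst is currently in progress */
    uint32_t next_sector;  /* sector that would extend the burst */
    unsigned bursts;       /* number of bursts started so far */
} write_burst_t;

/* Record a write of `count` sectors at `start`. Returns true when the
 * request extends the open burst, false when a new burst had to start. */
bool burst_write(write_burst_t *b, uint32_t start, uint32_t count)
{
    assert(b != NULL);
    bool extended = b->open && start == b->next_sector;
    if (!extended) {
        b->open = true;    /* a real driver would end the old CMD25 here */
        b->bursts++;
    }
    b->next_sector = start + count;
    return extended;
}

/* Any other card access (a single-sector read or write elsewhere)
 * ends the burst, so the next cluster write starts a new CMD25. */
void burst_break(write_burst_t *b)
{
    assert(b != NULL);
    b->open = false;
}
```

    With a zero-initialised `write_burst_t`, a write at 0x25940 of 0x40 sectors followed by one at 0x25980 stays in a single burst, while calling `burst_break()` in between (as the interleaved single-sector accesses effectively do) forces a second burst.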

    Interestingly you can see it write to a sector "84b", then it reads it again between the multi-sector bursts. What has changed it in the meantime that it needs to read it again? Some buffer limitation in the filesystem storage? Ideally it could cache this.
    ... WR25940+40 WR84b WR7fcb RDf740 WRf740 WR801 RD84b WR25980+40 ...

    Yeah, that's the sort of thing that needs addressed for sure.

  • RaymanRayman Posts: 14,744

    going to try it with this very soon...

    [Attached image: sdio.jpg, 640 x 480, 128.6K]
  • RaymanRayman Posts: 14,744

    Never seen atexit() before... Interesting...

  • roglohrogloh Posts: 5,837

    @Rayman said:
    going to try it with this very soon...

    Nice. Are there any series resistors underneath the board, or were they skipped? Will be interesting to see what performance can be gained if you did omit them.

  • roglohrogloh Posts: 5,837

    @evanh said:

    @rogloh said:
    If you return early by skipping that request does it speed it up much? Also with those extra interleaved single sector reads/writes between bursts of 256kB I wonder how much delay that contributes and how much of the bandwidth improvements are reduced.

    Doesn't help at the moment. The problem is those singles, they kill any chance of performance because they break up the contiguous order. I just tried, via the driver, concatenating the separate but contiguous cluster writes into one CMD25. And got it working seemingly reliably too. It made a difference, but only for buffers larger than the cluster size. So kind of annoying.

    Eg: Samsung, using 256 kB buffer, went from 5 MB/s to 9.5 MB/s writing a 16 MB file.

    Interestingly you can see it write to a sector "84b", then it reads it again between the multi-sector bursts. What has changed it in the meantime that it needs to read it again? Some buffer limitation in the filesystem storage? Ideally it could cache this.
    ... WR25940+40 WR84b WR7fcb RDf740 WRf740 WR801 RD84b WR25980+40 ...

    Yeah, that's the sort of thing that needs addressed for sure.

    Where is the filesystem code that controls all this exactly? Is it part of flexspin source tree? I'd like to see the C code if it's published somewhere.

  • evanhevanh Posts: 16,023
    edited 2024-10-12 01:26

    @rogloh said:
    Where is the filesystem code that controls all this exactly? Is it part of flexspin source tree? I'd like to see the C code if it's published somewhere.

    It'll be there in that same directory as the driver file. I copied it all into another directory because it needed the top level API altered to get the extra pins assigned. Presumably ff.c is the actual filesystem source since it's the big file.

    If you wanted to have both the new and old drivers for two SD cards operating at once then there would presumably be two complete copies of the filesystem compiled in as well.

    Another temporary hack by me really.

    EDIT: Taking a peek, it's quite the beast I see. Supports long file names, TRIMming, and ExFAT too.
    The Unicode stuff seems overkill. That's going to just be for filenames.
