New SD mode P2 accessory board - Page 35 — Parallax Forums



  • evanh Posts: 16,856
    edited 2025-10-20 01:41

    Comparing progress on slimming down the driver: a compiled speedtest.binary using the sdsd driver v1.3, where I added the lazy CMD12 feature, nets a 63108 byte file size. With driver v1.9 it's 62480 bytes. So 628 bytes smaller.

    That's a long way off the smartpins SPI solution, which nets a compiled speedtest.binary file size of 56964 bytes when built using the same plug-in API. That's 5516 bytes smaller again.

    Interestingly, when the smartpins SPI solution is built using the original inbuilt API, the speedtest file size is even smaller: 55404 bytes.

    And lastly, the bashed SPI solution (based on the smartpins solution), when built using the plug-in API, is a fair bit bigger: 59064 bytes. This surprised me a little since its SD card init routine is simpler and smaller than the smartpins solution's. I guess it's all down to there being four low level routines instead of two.

    On that note, the three low level routines inside the 4-bit SD mode driver are all very large and complex compared with those in any of the SPI mode drivers. The streamer + clock pin pairing is a struggle to make compact, particularly when the clock divider is adjustable by the user at runtime.
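
    The size figures quoted above can be tabulated to check the deltas. This is only a minimal sketch: the labels, the `builds` table and the `size_saving()` helper are made up for illustration, and only the byte counts come from the post.

```c
#include <stddef.h>

/* speedtest.binary sizes in bytes, as quoted in the post.
   The labels are illustrative, not official driver names. */
struct build_size {
    const char *label;
    long bytes;
};

static const struct build_size builds[] = {
    { "sdsd v1.3, plug-in API",     63108 },  /* index 0 */
    { "sdsd v1.9, plug-in API",     62480 },  /* index 1 */
    { "bashed SPI, plug-in API",    59064 },  /* index 2 */
    { "smartpins SPI, plug-in API", 56964 },  /* index 3 */
    { "smartpins SPI, inbuilt API", 55404 },  /* index 4 */
};

/* Bytes saved by build `to` relative to build `from`.
   A positive result means `to` is the smaller binary. */
long size_saving(size_t from, size_t to)
{
    return builds[from].bytes - builds[to].bytes;
}
```

    `size_saving(0, 1)` gives 628 and `size_saving(1, 3)` gives 5516, matching the numbers in the post. By the same arithmetic the bashed SPI build is 2100 bytes bigger than the smartpins build under the plug-in API.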

  • rogloh Posts: 6,097

    Perhaps 5-10k extra code size may be a reasonable tradeoff to get ~5x - 10x transfer performance gains?

  • evanh Posts: 16,856

    A question mark? That should be a statement of opinion! :)

  • rogloh Posts: 6,097

    @evanh said:
    A question mark? That should be a statement of opinion! :)

    LOL I'll certainly take it, was not sure about everyone else.

  • Rayman Posts: 15,799

    @evanh I've forgotten... Did you make a 4-bit code for Spin2?

  • evanh Posts: 16,856
    edited 2025-10-22 20:20

    Yes, but it's not really useful since the existing Spin-based filesystem handlers, FSRW and FAT-Engine, can't make use of the speed. They need to be rewritten for performance. Probably porting Flexspin's handler would be the way to go.

    EDIT: I tested here - https://forums.parallax.com/discussion/comment/1563795/#Comment_1563795
    It seems it did make a difference, but it was still below what Flexspin gets with the 1-bit SPI mode driver, so I concluded it wasn't worth promoting.

    PS: From memory, that Spin2 edition of the 4-bit SD mode driver was based on v0.9 of the C code - which was basically the same as v1.0.

  • evanh Posts: 16,856
    edited 2025-10-24 20:15

    @evanh said:
    Comparing progress on slimming down the driver: a compiled speedtest.binary using the sdsd driver v1.3, where I added the lazy CMD12 feature, nets a 63108 byte file size. With driver v1.9 it's 62480 bytes. So 628 bytes smaller.

    I've improved the speedtest.binary size to 62320 by compacting the three set*set() functions, so a 788 byte reduction now.

  • evanh Posts: 16,856
    edited 2025-10-28 06:43

    Ohhhh, it seems sdmm.cc is still present in the compiled binary! Just had a squiz at the .lst file and can see both sdsd_cc (2525 lines) and sdmm_cc (861 lines) labels.

    EDIT: I've hacked it out, by deleting the _sdmm_open() function from include/filesys/fatfs/fatfs_vfs.c, and that reduces the binary another 2612 bytes. By far the biggest impact I've achieved on size.

  • evanh Posts: 16,856

    PS: I merged the block read/write functions too. Got one of my better outcomes, another 364 bytes smaller, because of the large amount of identical prep in both functions.

  • Hmm, so SDSD isn't actually much bigger, the compiler just pulls in the plain SPI driver due to bugs?

    I think it could be solved by moving the _sdmm_open and the _vfs_open_sdcardx wrapper into their own file?

  • https://github.com/totalspectrum/spin2cpp/pull/503 etc, etc

    Yep, that claws back some 2.5K of RAM (1.3K of storage after compression)

  • evanh Posts: 16,856
    edited 2025-10-29 21:20

    I think SDSD still has a robust 7.5 kB footprint. Including the extraneous SDMM, it would have been around 11 kB at its peak.

  • evanh Posts: 16,856

    Oh, the block r/w merge is in v1.11. I held it back thinking I would get the _sdsd_open() smaller. Didn't work though.

  • evanh Posts: 16,856
    edited 2025-11-01 01:06

    Uh-oh, I messed up the command-response merge in some way, from v1.8 onward. Only been testing on one card since ...

    Right, yep, confirmed, one of my seven cards wasn't responding to a CMD9 during card init. Must have lost some trailing clocks as a result of the merge. Solved by adding a second clock pulse ahead of command issuing.

    I've got an unrelated issue that might be my good Eval board failing on P16 pin group. :( more testing ... Nah, it's gone away. I'd tried with older drivers too. :shrug:

  • evanh Posts: 16,856
    edited 2025-11-01 02:09

    @evanh said:
    Right, yep, confirmed, one of my seven cards wasn't responding to a CMD9 during card init. Must have lost some trailing clocks as a result of the merge. Solved by adding a second clock pulse ahead of command issuing.

    It'll be the Nrc = 8 minimum clocks required before a fresh command. I try to have excess trailing clocks so that there's no need for leading clocks ahead of a command, but I guess there are now cases where that doesn't happen.

    EDIT: Ah, it wasn't the merge itself that was at fault. For some reason I had deleted the insertion of trailing clocks for non-response cases. CMD7 deselect (needed for CMD9) is one such case.
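
    The trailing versus leading clock trade-off described above can be sketched as simple bookkeeping. This is not the driver's code: `nrc_leading_clocks()` is a hypothetical helper, and the only value taken from the post is the Nrc minimum of 8.

```c
/* Minimum clock pulses required before a fresh command (Nrc), per the post. */
#define NRC_MIN 8u

/* Hypothetical helper: number of leading clocks still owed before the
   next command, given how many trailing clocks were issued after the
   previous command or response. Returns 0 once Nrc is already covered. */
unsigned nrc_leading_clocks(unsigned trailing_clocks)
{
    return (trailing_clocks >= NRC_MIN) ? 0u : NRC_MIN - trailing_clocks;
}
```

    A non-response command such as the CMD7 deselect has no response phase to absorb the gap, so in this model it reports `trailing_clocks` as 0 unless clocks are inserted explicitly, and the full 8 are owed before the following CMD9.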

  • urghghgh I just updated it to v1.10

  • evanh Posts: 16,856
    edited 2025-11-01 02:11

    @Wuerfel_21 said:
    urghghgh I just updated it to v1.10

    Yeah, I figured I needed to warn.

    PS: I've now identified the cause in the edit above.

  • evanh Posts: 16,856

    I wonder, I might have been thinking I'd remove all trailing clocks and rely completely on command leading clocks instead ...

  • When you think you got a fixed version coming?

  • evanh Posts: 16,856
    edited 2025-11-01 23:53

    Dunno, it could be hours but more likely days. I've dug up another issue that appears when using large clock dividers that looks to have existed since v1.3, inclusive. I want to understand it before deciding on what to do with either of them. There could be a relationship.

    EDIT: Hmm, yeah, making sense, still got bugs in the lazy CMD12. I have been assuming the card busy would serve as the post-CMD12 handshake, ignoring any response on the CMD pin. Bad move.

  • evanh Posts: 16,856
    edited 2025-11-03 10:42

    A proper regression squashed: The minimum write block-to-block gap (Nwr = 2) of two clock pulses wasn't being respected from v1.8 onward. I'd even deleted the comment stating spec requirements. Brain fart!

    This only manifested when operating at a large clock divider because then clock gen can be cancelled less than one SD clock cycle after card-ready is detected, and Nwr also applies to the end of card-busy. The block write routine fully preps for a quick transition while waiting on a long card-busy.

    The screenshot shows the beginning (top half) and end (bottom half) of the second block written of consecutive blocks.
    The end of the second block (bottom half) shows the very end of the 512 byte data block, then the written CRC, followed by the slower clocked CRC response back from the SD card. The response is a CRC mismatch error, which occurred because the SD card wasn't ready for the start bit when it was sent.

    Orange trace is SD CLK pin.
    Green trace is SD DAT0 pin.
    Pink trace is SD CMD pin.

    DAT0 is low (top left) while the card is busy handling the first block written. DAT0 goes high to indicate the card has become ready. The driver stops the clock and sets DAT0 low (start bit) to send the second block, then restarts the clock and writes the data, all nicely aligned.

    Problem is the clock should have gone at least one more pulse before stopping to reconfigure. Now it does, again.
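
    The fix can be modelled as a count of clock pulses after card-ready is seen. This is a rough sketch under assumptions rather than the driver's logic: `clocks_until_safe_stop()` and the per-pulse `dat0` sample array are invented for illustration, and the model conservatively counts the Nwr pulses starting after the pulse on which ready was sampled, which may not match how the driver accounts for them.

```c
#include <stddef.h>

/* Minimum clock pulses between card-ready and the next start bit (Nwr), per the post. */
#define NWR_MIN 2u

/* dat0[i] is the DAT0 level sampled on clock pulse i (0 = busy, 1 = ready).
   Returns the total number of pulses to issue before the clock may be
   paused to set up the next block, or 0 if ready was never seen in n pulses. */
size_t clocks_until_safe_stop(const unsigned char *dat0, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (dat0[i]) {
            /* Pulse i saw ready; NWR_MIN further pulses must follow it. */
            return i + 1 + NWR_MIN;
        }
    }
    return 0;
}
```

    With samples {0, 0, 0, 1}, ready is seen on the fourth pulse, so the function returns 6: four pulses to see ready plus two for Nwr. The regression amounts to pausing the clock before those extra pulses had been issued.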

  • rogloh Posts: 6,097

    Good job finding this @evanh

  • evanh Posts: 16,856

    Thanks.

    That seems to be everything. I'm no longer getting any Nrc problems even though I'm not explicitly ensuring the minimum value of 8. Implementing full response collecting of the many CMD12 occurrences has rectified this one. Most command handling has always had excess trailing clocks beyond end of response to fill the minimum Nrc spec.

    So, stable again. :)

  • evanh Posts: 16,856
    edited 2025-11-03 14:23

    This all kicked off when preparing for using an Eval board to copy files between two cards - https://forums.parallax.com/discussion/comment/1569967/#Comment_1569967
    I slotted a uSD card I hadn't used in a while and immediately got errors.

    Copy tests are working fine now. Multiple instances of the same driver can manage multiple cards simultaneously. :)

  • evanh Posts: 16,856
    edited 2025-11-03 21:31

    @evanh said:
    That seems to be everything. I'm no longer getting any Nrc problems even though I'm not explicitly ensuring the minimum value of 8. Implementing full response collecting of the many CMD12 occurrences has rectified this one. Most command handling has always had excess trailing clocks beyond end of response to fill the minimum Nrc spec.

    I made one other notable change in v1.12: I've tweaked the lazy CMD12 state tracker logic. It had been generating spurious unresponsive error messages, which weren't acted on, due to excessive unneeded CMD12 issuing. This had caused me to make incorrect assumptions about Nrc violations for a while. I'd almost forgotten this relationship.

    Control logic and state trackers are fiendishly tricky to get perfect, but I still enjoy the challenge of creating them.
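
    For readers unfamiliar with the idea, here is one way a lazy CMD12 tracker could be structured. The thread doesn't spell out the mechanism, so this assumes "lazy" means leaving a multi-block transfer open and deferring the stop command until a request arrives that can't continue it. All names (`lazy_cmd12`, `lazy_cmd12_request()`, the `xfer_state` values) are hypothetical and none of this is taken from the sdsd driver source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tracker for a multi-block transfer that has been left open. */
enum xfer_state { XFER_IDLE, XFER_READING, XFER_WRITING };

struct lazy_cmd12 {
    enum xfer_state state;  /* direction of the open transfer, if any */
    uint32_t next_block;    /* block the open transfer would handle next */
};

/* Decide whether CMD12 must be issued before servicing a request for
   `count` blocks starting at `block`, then record the transfer that will
   be left open afterwards. Passing XFER_IDLE as `dir` closes any open
   transfer. Returns true only when a stop is actually needed. */
bool lazy_cmd12_request(struct lazy_cmd12 *t, enum xfer_state dir,
                        uint32_t block, uint32_t count)
{
    bool need_stop = false;

    if (t->state != XFER_IDLE) {
        /* An open transfer is reusable only in the same direction
           and at the next contiguous block address. */
        need_stop = (t->state != dir) || (t->next_block != block);
    }

    t->state = dir;
    t->next_block = block + count;
    return need_stop;
}
```

    Starting idle, a read of 8 blocks at 100 followed by a read at 108 needs no stop, while a jump to block 500 or a switch to writing does. An idle tracker never asks for a stop, which is the property that avoids unneeded CMD12 issuing. Per the earlier post, whenever a stop is issued its response on the CMD pin should be collected rather than relying on card busy alone.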
