FSRW (uSD card read/write) for P2 with 2.4 MB/s read speed
Rayman
Here is a version of FSRW that works with both FastSpin and the Parallax Spin compiler (AKA PNut and soon to be PropTool for P2).
Thanks to @cheezus for providing the base this came from, as discussed here: http://forums.parallax.com/discussion/169786/
Thanks to @Cluso99 and @"Peter Jakacki" for showing me how to improve read speed.
And, of course, thanks to Tomas Rokicki and Jonathan Dummer for providing the original P1 FSRW 2.6 that all of this is derived from...
This driver works very much like the P1 version, with some small differences:
- Mount() used to return 0 on success and negative on failure. Now, it returns 1 or 2 on success to tell you card type (1==SD and 2==SDHC).
- There are two versions of the low level SPI driver, sdspi, to select from in Fsrw.spin2. The "bashed" version does not use a separate cog, while the other, "ASM" one does. The "ASM" version is about 10% or so faster in some tests.
Note: If you choose the assembly version, edit the "sdspi_asm_mb.spin2" driver if you want to change the cog# being used. This version has "SPI_COG=6".
- There are two new functions FastBlocksRead() and FasterBlocksRead() that blindly read a series of blocks without following the FAT chain. This can be useful for maximizing read speed of large files on freshly formatted disks. Both of these options require the use of the new remount() function to switch files.
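Since the blind block reads skip the FAT chain, they only return the right data when the file's clusters sit one after another on the card. Here is a minimal sketch (in Python, not part of FSRW) of how that could be checked from a copy of the FAT. It assumes FAT32-style 28-bit entries, and the names `is_contiguous`, `fat` and `first_cluster` are made up for illustration.

```python
FAT32_EOC = 0x0FFFFFF8  # FAT32 entries at or above this value mark end of chain


def is_contiguous(fat, first_cluster):
    """Walk a FAT32 cluster chain and report (contiguous, cluster_count).

    fat is any mapping or list indexed by cluster number (hypothetical
    in-memory copy of the FAT); first_cluster is the file's first cluster.
    """
    cluster = first_cluster
    count = 1
    contiguous = True
    # Bound the walk so a corrupt, looping chain cannot hang the check.
    for _ in range(len(fat)):
        nxt = fat[cluster] & 0x0FFFFFFF  # top 4 bits of an entry are reserved
        if nxt >= FAT32_EOC:
            return contiguous, count
        if nxt != cluster + 1:
            contiguous = False  # a jump means blind sequential reads would go wrong
        cluster = nxt
        count += 1
    raise ValueError("cluster chain does not terminate")
```

A file copied onto a freshly formatted card would normally pass this check; a file written to a well-used card may not.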
The read speed can be tested using the "fsrw_test2.spin2" file in the attached zip. There are three different read options to choose from.
This code repeatedly loads and displays the file bitmap2.bmp over VGA and sends diagnostic info over the USB serial connection. This is useful for checking for read errors.
With the FasterBlocksRead() option, I'm seeing about 2000 kB/s at 250 MHz clock and 2400 kB/s at 300 MHz clock.
This is fast enough to do QVGA video demonstrated by the included file, "P2VideoPlayer.spin2", as described here: http://www.rayslogic.com/Propeller2/P2Video/P2Video.html
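As a rough sanity check on these numbers, the arithmetic can be sketched in Python. This assumes kB means 1000 bytes and that a QVGA frame is 320x240 pixels; the bytes-per-pixel values are guesses for illustration, not the format the video player necessarily uses.

```python
def clocks_per_byte(sysclk_hz, bytes_per_second):
    """System clocks spent per byte read at a given throughput."""
    return sysclk_hz / bytes_per_second


def max_fps(bytes_per_second, width=320, height=240, bytes_per_pixel=1):
    """Upper bound on uncompressed frame rate at a given read throughput."""
    return bytes_per_second / (width * height * bytes_per_pixel)


# Both reported figures work out to the same cost per byte,
# so the read speed scales linearly with the system clock.
print(clocks_per_byte(250_000_000, 2_000_000))   # 125.0
print(clocks_per_byte(300_000_000, 2_400_000))   # 125.0

# Frame-rate ceiling at 2400 kB/s for two assumed pixel depths.
print(max_fps(2_400_000, bytes_per_pixel=1))     # 31.25
print(max_fps(2_400_000, bytes_per_pixel=2))     # 15.625
```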
This is by no means complete; there are still a lot of improvements that can be made. Any improvements or suggestions are welcome.
I don't know if I'll be the one doing any updates or not, as it seems to fill my needs as it is now.
Also, I haven't really tested the writing part of this very much; I've mostly been focused on reading.
25May20: Fixed a timeout issue with the bashed version that would cause opening a new file to fail when a long time passed since opening the first file. Also, added in a basic timeout to the assembly version.
29May20: Removed a PINL(56) and a PINL(57) in fsrw.spin2 that were used for debugging.
Attachment: zip, 527K
Comments
One disadvantage, also like the original, is that it only works with 8.3 DOS style filenames and also only in the root directory.
However, you can still open files with long filenames if you can figure out the 8.3 version of the filename.
In Windows, you can bring up the cmd DOS window and do "dir /x" to see both long and short filenames.
I just googled this to remember how to do it and got this from here: https://digitalsupport.ge.com/en_US/Article/Is-Windows-8-3-File-Naming-Enabled-How-Do-I-Enable-8-3-File-Naming-If-It-Is-Not
Tried this on the attached and it showed up as "LARGEF~1.bmp". A quick test shows this does work (see screenshot of fsrw_test2.spin2 with the filename changed to this).
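For anyone wondering what "8.3" means on the card itself: a FAT directory entry stores the short name as 11 bytes, 8 for the name and 3 for the extension, upper case, space padded, with no dot. A small Python sketch of that conversion is below. The function name `to_dir_name` is made up for illustration and is not an FSRW call.

```python
def to_dir_name(filename):
    """Convert 'NAME.EXT' to the 11-byte padded form used in FAT directory entries.

    Raises ValueError if the name does not fit the 8.3 pattern.
    """
    base, dot, ext = filename.upper().rpartition(".")
    if not dot:
        # No dot at all: rpartition puts the whole name in the last slot.
        base, ext = ext, ""
    if not (1 <= len(base) <= 8 and len(ext) <= 3):
        raise ValueError(f"{filename!r} is not an 8.3 filename")
    return (base.ljust(8) + ext.ljust(3)).encode("ascii")


print(to_dir_name("bitmap2.bmp"))    # b'BITMAP2 BMP'
print(to_dir_name("LARGEF~1.bmp"))   # b'LARGEF~1BMP'
```

A long name such as "largefilename.bmp" is rejected, which is why the short alias from "dir /x" is needed.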
Other directories in the root can be opened just like a file, except that the first sector points to the new directory. So it is rather simple to navigate directories if you use a pointer to the directory, initially loaded with the root. 8.3 file names are very efficient for embedded systems, although I wish they had come up with almost anything else but the mangled horrible mess that LFN is.
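To illustrate the point about directories being opened like files, here is a hedged Python sketch of decoding one 32-byte FAT directory entry. The field offsets are the standard FAT ones; the function name and the returned dictionary layout are invented for the example and have nothing to do with how FSRW or TAQOZ store things internally.

```python
import struct

ATTR_DIRECTORY = 0x10  # attribute bit that marks a subdirectory entry


def parse_dir_entry(entry):
    """Decode the fields of a 32-byte FAT short-name directory entry."""
    if len(entry) != 32:
        raise ValueError("a FAT directory entry is exactly 32 bytes")
    attr = entry[11]
    (cluster_hi,) = struct.unpack_from("<H", entry, 20)  # high word (FAT32)
    (cluster_lo,) = struct.unpack_from("<H", entry, 26)  # low word
    (size,) = struct.unpack_from("<I", entry, 28)
    return {
        "name": bytes(entry[0:11]).decode("ascii"),
        "is_dir": bool(attr & ATTR_DIRECTORY),
        "first_cluster": (cluster_hi << 16) | cluster_lo,
        "size": size,
    }
```

If `is_dir` is true, `first_cluster` is where the subdirectory's own table of entries starts, so following it is the same operation as finding the first cluster of a file.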
Here is a quick look at TAQOZ that stores the current working directory sector in the cwdir variable, and checking the sector where it is pointing.
This means you can do this: Highly amusing.
Today I did a couple of tests and the results were interesting...
The card is a SanDisk Ultra 16GB micro SDHC I C10.
These are the times to set up a read of sector 0 of the SD card after it had been initialised with CMD0, CMD8, CMD55, ACMD41, CMD58 and CMD16. The CSD and CID have not been read. The setup time is measured in system clocks, starting after the SPI pins were set as outputs but before any command was sent.
Timer starts, then Command 17 with sector=00000000 was sent to the card along with the CRC byte. Once the card had replied following the busy bytes (ie $FE received) the timer was stopped. This time excludes the actual read of the 512 bytes and crc16 of 2 bytes.
This was then repeated a second time to show the first time and subsequent time to access the same sector. These two times are significantly different as expected.
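For reference, the 6-byte command frame being timed here can be sketched in Python. This is not the code used for the measurements; it just shows the standard SPI-mode layout (start bits plus command index, 32-bit argument, CRC7 plus stop bit). The helper names `crc7` and `build_command` are made up.

```python
def crc7(data):
    """CRC-7 used by SD commands (polynomial x^7 + x^3 + 1)."""
    crc = 0
    for byte in data:
        for bit in range(7, -1, -1):
            in_bit = (byte >> bit) & 1
            top = (crc >> 6) & 1
            crc = (crc << 1) & 0x7F
            if in_bit ^ top:
                crc ^= 0x09
    return crc


def build_command(index, argument):
    """Return the 6-byte SPI-mode frame for command `index`."""
    body = bytes([0x40 | index]) + argument.to_bytes(4, "big")
    return body + bytes([(crc7(body) << 1) | 1])  # CRC7 in bits 7..1, stop bit in bit 0


# CMD17 (single block read) of sector 0. On SDHC the argument is a block
# number; on standard-capacity SD it is a byte address.
print(build_command(17, 0).hex())
```

The well-known CMD0 frame (ending in $95) and CMD8 frame with argument $1AA (ending in $87) make handy checks for the CRC routine.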
For P2 system clock frequencies of 100MHz, 200MHz, 300MHz and 360MHz the times were essentially identical.
First read sector setup time 4.18ms
Second read sector setup time 2.7ms
From this I deduce that the method of coding for the setup is almost immaterial. Only the actual code reading or writing the sector data bytes has any bearing on the throughput. The setup time is determined by the SD Card being used.
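To put the setup time in perspective, here is a back-of-envelope Python sketch. It assumes the 2.7 ms setup is paid once per read command and that data then streams at about 2400 kB/s (the figure quoted in the first post). Whether a given driver really pays the setup only once per run of blocks is an assumption here, not something measured.

```python
BLOCK_BYTES = 512


def effective_throughput(setup_s, stream_bytes_per_s, blocks_per_setup=1):
    """Bytes per second when one setup delay is followed by a run of blocks."""
    data_bytes = blocks_per_setup * BLOCK_BYTES
    transfer_s = data_bytes / stream_bytes_per_s
    return data_bytes / (setup_s + transfer_s)


# One setup per 512-byte block: the setup time dominates completely.
print(round(effective_throughput(2.7e-3, 2_400_000)))

# One setup per 100 consecutive blocks: close to the raw streaming rate.
print(round(effective_throughput(2.7e-3, 2_400_000, blocks_per_setup=100)))
```

Under these assumptions a single-block read pattern lands well under 200 kB/s, which is why reading long runs of consecutive blocks matters so much more than shaving cycles off the setup code.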
I've also got a 16 GB SanDisk Ultra. Does yours have the little A1 symbol? The A-class is more relevant to setup time (and to speed in general) than the old speed classes; to actually write at the specced speed, a special "speed class recording" mode is required. A1 also implies Class 10.
Anyways, some things I've noticed:
- If the card is left idle for a while (less than one NTSC field, so maybe 10 ms?), the first access afterwards has a much longer setup time.
- Enabling performance enhancement features (notably on-card write-behind cache) is not possible in SPI mode because CMD49 is not allowed in SPI mode.
Well, if you can follow my code, you will see that the results put it between 5 ms and 10 ms, though I haven't checked any further yet.
Here's the code; the delay is in µs, stepping through 100, 500, 5000 and 10000, as it rereads the same sector 3 times.
I tried sectors from all over the card and they responded in a similar manner. The latency figure is not technically correct since it also includes the sector read time, but the actual sector transfer time was measured as 150 µs.
The point at which it did dither was around the 6ms mark. I'd say that the SD control chip is saving power perhaps? I wonder if we always kept it busy whether it would exhibit the same behavior. More tests.
Here's the card's CSD report:
You can mix FAT and NTFS volumes on the same PC. So installshield scripts, BAT files, rundll, open and save dialog boxes, etc. All work as expected.
Yes you can think of ways to break it (use a Linux tool to write invalid characters and symbols to the filename), but it worked great for decades.
The fact that they had to do it that way for compatibility reasons doesn't somehow imply that it isn't a mangled horrible mess. Unfortunately, attempts at compatibility turn into horrible messes all too often.
It appears to be disabled in the assembly version and missing a start time initialization in the bashed version that can cause opening a new file to fail...
This looks easy to fix, so I'll work on that now.
I found a PINL(56) and a PINL(57) in there that I was using for troubleshooting.
Can't let that stand...
Found whilst turning it into an eMMC driver...
This one has that removed.
I always set the DIRCMD environment variable (system, not user) to /OGN /X so that these switches are always active. This is also deployed via GPO to all the client machines I administer.
If you prefer using Spin, the CLib object in the OBEX contains a Spin version of FSRW that supports subdirectories and handles. It is located at https://github.com/parallaxinc/propeller/tree/master/libraries/community/p1/All/CLib .
BTW, the "ls -l" command in TAQOZ prints out a more standard listing which indicates the names that are directories, I should update DIR to do the same.
Trying to figure out what's going on and if it ever worked...
Seems I switched Cheezus's code from smartpin mode to bit-bang mode and broke writing in the process. Guess I was so focused on read speed that I didn't pay attention...
Cheezus's assembly code version still works for writing though....