I'm wondering how much faster this is with the assembly driver versus just plain Spin (or FastSpin)…
I think the regular SPI interface for SD is limited to 25 MHz, right?
Also, can a smartpin be used to make it better in Spin?
That's a really good question! To be honest, I'm not really sure. I think a pure spin2/fastspin version would probably be pretty close in speed to the current code. There are still a lot of optimizations I'm missing. I wanted to get something working and then improve it incrementally, although now that it's working I haven't had a lot of time to revisit it.
One of the things I've been thinking about a lot is pushing data directly to LUT. I also need to add the ability to load a binary.
@"Peter Jakacki" Is the maximum clock speed reported by the SDCARD the same for SPI and SD modes? For some weird reason I thought I read somewhere that legacy SPI mode was limited to 25MHz and if you wanted to operate at higher frequencies than that you'd need to move into SD mode. Perhaps this is something that is card dependent or only for older card types. I realize from a board signal integrity viewpoint both modes should be limited to about the same so it might come down to whether the SD card's internal processor / logic path is designed to clock data via SPI at this rate.
TRAN_SPEED: the following table defines the maximum data transfer rate per one data line.

TRAN_SPEED bit   Code
2:0              transfer rate unit: 0=100kbit/s, 1=1Mbit/s, 2=10Mbit/s, 3=100Mbit/s, 4...7=reserved
6:3              time value: 0=reserved, 1=1.0, 2=1.2, 3=1.3, 4=1.5, 5=2.0, 6=2.5, 7=3.0, 8=3.5, 9=4.0, A=4.5, B=5.0, C=5.5, D=6.0, E=7.0, F=8.0
7                reserved

Table 5-6: Maximum Data Transfer Rate Definition

Note that for current SD Memory Cards, this field shall be always 0_0110_010b (032h) which is equal to 25 MHz, the mandatory maximum operating frequency of SD Memory Card. In High-Speed mode, this field shall be always 0_1011_010b (05Ah) which is equal to 50 MHz, and when the timing mode returns to the default by CMD6 or CMD0 command, its value will be 032h.
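As a quick sanity check on the two values quoted from the spec, here is a minimal sketch of decoding a TRAN_SPEED byte. It is written in Python only so the arithmetic is easy to run on a PC; the names `decode_tran_speed`, `TIME_VALUE_TENTHS` and `RATE_UNIT_BPS` are made up for this example and do not come from the driver.

```python
# Time values from Table 5-6, stored as tenths (1.0 -> 10) to keep the math in integers.
TIME_VALUE_TENTHS = [0, 10, 12, 13, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 70, 80]
# Transfer rate units in bit/s for unit codes 0..3; codes 4..7 are reserved.
RATE_UNIT_BPS = [100_000, 1_000_000, 10_000_000, 100_000_000]


def decode_tran_speed(tran_speed):
    """Return the maximum transfer rate in bit/s, or None for a reserved encoding."""
    unit_code = tran_speed & 0x07          # bits 2:0
    time_code = (tran_speed >> 3) & 0x0F   # bits 6:3
    if tran_speed & 0x80 or unit_code > 3 or time_code == 0:
        return None                        # bit 7, unit codes 4..7 and time code 0 are reserved
    return TIME_VALUE_TENTHS[time_code] * RATE_UNIT_BPS[unit_code] // 10


print(decode_tran_speed(0x32))  # default-speed value quoted in the spec
print(decode_tran_speed(0x5A))  # High-Speed value quoted in the spec
```

With one data line, bit/s and Hz are the same number, so 032h decodes to 25 MHz and 05Ah to 50 MHz, matching the note in the spec.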
So another improvement I need to make is to read this register and adjust the SPI clock accordingly. Right now I have the constant spi_clk_max = 25, which I should probably keep as a hard limit.
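A rough sketch of that idea, in Python for illustration only: take the lower of the card's reported maximum and the hard limit, then pick an integer divider of the system clock that does not overshoot it. The function name `pick_spi_clock`, the integer-divider model and the minimum divider of 2 are assumptions for this example, not how the driver actually generates its clock.

```python
SPI_CLK_MAX_MHZ = 25  # mirrors the spi_clk_max constant mentioned above


def pick_spi_clock(sysclk_hz, card_max_hz, hard_limit_hz=SPI_CLK_MAX_MHZ * 1_000_000):
    """Return (divider, actual_hz) for the fastest sysclk/divider clock that
    exceeds neither the card's reported maximum nor the hard limit."""
    target_hz = min(card_max_hz, hard_limit_hz)
    # Ceiling division, so the resulting clock never overshoots the target.
    divider = max(2, -(-sysclk_hz // target_hz))  # assumed minimum divider of 2
    return divider, sysclk_hz // divider


# A card reporting 50 MHz is still held to the 25 MHz hard limit.
print(pick_spi_clock(160_000_000, 50_000_000))
```

Raising `hard_limit_hz` later would be a one-line change once higher clocks have been proven on the actual board.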
The old SD SPI spec and cards were limited to 25MHz, but this was upped a long time ago to 50MHz. The various SD modes themselves may specify clock rates over 200MHz.
Obviously I have gone well and truly over 25MHz without any problems.
Note that for current SD Memory Cards, this field shall be always 0_0110_010b (032h) which is equal to 25 MHz - the mandatory maximum operating frequency of SD Memory Card. In High-Speed mode,
This is true for plain old-fashioned "SD" cards, but nobody sells or uses SD cards anymore, they are all SDHC or better, and I have never seen this field anything other than 50MHz in SDHC.
That was poorly formatted, sorry. That table looks a bit better now. All HC cards are High Speed... The part I need to wrap my head around next is "when the timing mode returns to the default by CMD6 or CMD0 command".
I keep thinking it might be handy to have a test, to determine IF the hardware is able to run up to the speed specified by the card. Still not sure how to do this...
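One possible way to do such a test, sketched in Python: read a fixed block at a known-safe clock, then re-read it at increasing clocks and stop at the first rate where the data no longer matches. Everything here is hypothetical; `read_block_at` stands for whatever routine sets the SPI clock and returns one block, and comparing against a slow reference read is just one option (checking the block CRC would be another).

```python
def find_max_reliable_clock(read_block_at, candidate_hz, safe_hz, trials=8):
    """Return the highest candidate clock whose reads all match a reference
    read taken at safe_hz.

    read_block_at(hz) is a caller-supplied (hypothetical) function that sets
    the SPI clock to hz and returns the bytes of one fixed block.
    """
    reference = read_block_at(safe_hz)
    best = safe_hz
    for hz in sorted(candidate_hz):
        if all(read_block_at(hz) == reference for _ in range(trials)):
            best = hz
        else:
            break  # stop at the first clock that corrupts data
    return best
```

Running several trials per clock matters because marginal signal integrity tends to show up as occasional bad reads rather than a clean failure.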
I like that there are two versions of the spi driver: sdspi_asm.spin2 and sdspi_bashed.spin2
The bashed one is inline assembly, very nice when you are low on cogs.
The only issue I seem to have is that I can't rerun this test program without physically removing the card for a second.
Another card doesn't have this issue...
I've tried putting various things into the "release" function but haven't found anything that helps...
Anyway, "fsrw_test1.spin2" is a little program that just shows the directory and tests read and write speed (adapted from a previous file).
PRI getmask(clk, data) : t
  t := clk - data
  if t < 0
    t := ABS t + %0100
But I think this is wrong for negative t values (CLK pin lower than DAT pin). The BBBB field just expects a 2s complement value for the B pin displacement, so this should work:
PRI getmask(clk, data) : t
  t := (clk - data) & %0111
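To see where the two versions differ, here is the same arithmetic transcribed into Python (used only because it is easy to run on a PC; the function names are made up for the comparison). It shows the bit patterns only and makes no claim about how the hardware interprets each code.

```python
def getmask_original(clk, data):
    """First version: a negative offset becomes its magnitude plus %0100."""
    t = clk - data
    if t < 0:
        t = abs(t) + 0b0100
    return t


def getmask_twos_complement(clk, data):
    """Suggested fix: keep the low three bits of the signed difference."""
    return (clk - data) & 0b0111


# Compare both versions for pin offsets -3..+3 (data pin fixed at 0).
for offset in range(-3, 4):
    a = getmask_original(offset, 0)
    b = getmask_twos_complement(offset, 0)
    print(f"{offset:+d}: original %{a:04b}  two's complement %{b:04b}")
```

The two agree for offsets 0 to +3 and, by coincidence, for -2 (%0110 either way). They differ for -1 (%0101 versus %0111) and -3 (%0111 versus %0101), so the original only misbehaves when CLK is one or three pins below DAT.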
Thanks!
The spin language really is an oddball compared to other languages. || means abs in spin 1?! I never knew that, but most of my time was spent in pasm. Nowadays (in a professional project) I use C++ and pasm only on the propeller 1. I'm back with spin on the P2 though, since it is a fast prototyping language (and I kind of like it).
Ahle2,
You should check out FlexGui with its FlexC implementation. Super nice. Inline PASM, interop with Spin/BASIC/C/PASM. It also targets both P1 and P2.
I'm already using FastSpin as my main compiler for the P2, so I know about all that. I like to do object oriented programming though, so FlexC isn't an option. I do actually like spin2, but I'm not as fluent in it as in C++ (20 years ongoing, 8 years hobby + 12 years as a professional).
Ahle2,
I am also a long time C/C++ coder (since the late 80s). As you probably know, Eric intends to add (and already has some) elements of C++. You can do object oriented programming with it now.
I think I might have made some improvements to the bashed spi driver...
The attached seems to work just as well as the original, but may be a hair more robust.
This also has a test program and some bitmap files for test reading from uSD.
I spent way too much time trying to make one of my uSD cards work without pulling it out and reinserting it.
Just decided it was defective somehow...
Just tested with a new 64 GB uSD card formatted as 32 GB FAT32. Seems to work fine.
Read speed is maybe not the best (450 kB/s), but this version doesn't need a separate cog to run.
(new version attached) Compiled with FastSpin version 4.1.7
@Rayman,
I posted a P2 SD Driver (hubexec pasm) with examples of calling it from pasm and spin; it does not use an extra cog.
And yesterday I posted a sdspi_bashed.spin2 replacement using this code over on Mike's FemtoBasic thread, and it works.
I will put some more info about this driver over on my P2 SD Driver thread. If you still want to work on sdspi_bashed then there are a few things I found about the SD Cards that may help.
I saw your driver and looked to see if it could help me with my bad card, but I think that card has something funny about it...
Anyway, I spent some time today cleaning up FSRW for compatibility with PNUT.
I'm also very interested in read speed on large files.... How many kB/s do you get?
I've tried tweaking up bashed but can only get to ~660 kB/s on a class 10 card.
I think FSRW on P1 was faster (at least with small files).
BTW: Do you think your sdspi_bashed replacement could be combined into one file?
@Cluso99: Ok, I speed tested your sdspi driver and got a respectable 538 kB/s read speed on a Class 10 card.
I added a method to FSRW to blind read sequential blocks. For cases of a freshly formatted card, you can often assume a file is in sequential blocks.
This gives 558 kB/s.
Compiled with FastSpin 4.1.8.
Not just freshly formatted cards. In my experience, Windows (not sure about other OSs) will (almost?) always place a new file you copy to a card in sequential clusters if possible. (In particular, if you copy a file such that it replaces a smaller file with the same name, it may become fragmented if the cluster after the file to be replaced is already used). And obviously you can always defragment the entire card or any file if you need it to be unfragmented (useful for fast random access...)
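If you would rather check than assume, the FAT itself tells you whether a file is in sequential clusters. A minimal sketch in Python, assuming FAT32 and a `fat` table that has already been read into memory and is indexable by cluster number; `is_contiguous` is a hypothetical helper, not part of FSRW.

```python
FAT32_EOC = 0x0FFFFFF8  # FAT32 entries at or above this value mark end of chain


def is_contiguous(fat, start_cluster):
    """Return True if every link in the chain points to the next cluster number."""
    cluster = start_cluster
    while True:
        nxt = fat[cluster] & 0x0FFFFFFF  # the top four bits of a FAT32 entry are reserved
        if nxt >= FAT32_EOC:
            return True                  # reached end of chain without a jump
        if nxt != cluster + 1:
            return False                 # chain jumps elsewhere: fragmented
        cluster = nxt


# Hypothetical three-cluster file stored in clusters 2, 3, 4.
print(is_contiguous({2: 3, 3: 4, 4: 0x0FFFFFFF}, 2))
```

A blind sequential read is only safe when this check passes; otherwise fall back to following the chain as usual.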
Comments
This is the CSD report from a common Sandisk:
I was just wondering the same and found this. -see pg 88
http://users.ece.utexas.edu/~valvano/EE345M/SD_Physical_Layer_Spec.pdf
*edited table- poor formatting
Set for VGA accessory on P8..16, but you can change that.
LOL Gee, thanks a lot. Now I have to clean cereal off my laptop ;-)
Done that myself a time or two.
In file "sdspi_asm.spin2"
But I think this is wrong for negative t values (CLK pin lower than DAT pin). The BBBB field just expects a 2s complement value for the B pin displacement, so this should work:
Andy
Looks for a file named P2BOOT.BIN.
Could be used for other things...
Can now get 700 kB/s read rate regular way and 730 kB/s in a blind, sequential block read with class 10 card (64 GB formatted as 32 GB FAT32).
A regular 2 GB card gives only 430 kB/s the regular way and 441 kB/s with sequential block read.
Files compiled with FastSpin 4.1.8 (But, I'm in the process of fixing so might work with PNut one day).
Getting 920 kB/s for both 64 GB and 2 GB cards...
BTW: Found some simple looking C code for MMC mode on SD cards here:
https://people.ece.cornell.edu/land/courses/ece4760/FinalProjects/s2007/cd247_maw72/cd247_maw72/mmc.c