Help with Using SPI to Access the SD Card
I want to be able (eventually) to access files on an sd card using SPI.
Since I am just learning, and finding it difficult to comprehend PASM or C code, I put together a basic skeleton of a block driver in Spin. It is incomplete, since there are several points I'm stuck on.
I'm using spin because that's the one I feel most comfortable with.
In my incomplete driver (see attached file) there are several points which are befuddling me, and if anyone could be so kind as to point out my mistakes, that would be super.
I know there are many excellent complete drivers out there already in the wild, but it's more important to me to understand what's going on under the hood rather than just take a script and implement it.
The driver is based on Jonathon Dummer's Safe_Spi.spin object that came with Tomas Rokicki's fsrw.spin object.
What's more important to me is to know WHY, rather than HOW (if that makes sense lol)
Thanks JD
Point 1 - Use chip_select and chip_deselect methods.
I want to use ChipSelect() and ChipDeselect() methods to signal to the card that it is selected,
as opposed to using inline statements to set the pins states.
Where I am confused is where exactly to place those method calls and in which methods.
I'm not sure at all that I can use ChipSelect() and/or ChipDeselect() calls, because the SendCommand() method calls itself. (see footnote)
Point 2 - Use wait_till_ready() method.
I want to use the WaitTillReady() method to ensure the card is free to accept commands, before actually sending the commands.
Once again, I'm confused at which methods should call the WaitTillReady() method. (see footnote)
Point 3 - CRC table
Do I need a CRC table?
I don't think it's needed, but as I understand things, some cards want or expect the correct CRC.
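For reference: in SPI mode CRC checking is off by default, but CMD0 and CMD8 are still checked, so at least those two frames need a valid CRC7 byte. Below is a minimal sketch in Python (not Spin, just to show the arithmetic) that computes the byte bit by bit, so no table is needed. The names crc7 and command_frame are hypothetical helpers, not part of any existing driver.

```python
def crc7(data: bytes) -> int:
    """Bitwise CRC-7 (polynomial x^7 + x^3 + 1) over the given bytes."""
    crc = 0
    for byte in data:
        for bit in range(7, -1, -1):      # most significant bit first
            in_bit = (byte >> bit) & 1
            top_bit = (crc >> 6) & 1
            crc = (crc << 1) & 0x7F
            if in_bit ^ top_bit:
                crc ^= 0x09               # low bits of the polynomial
    return crc


def command_frame(cmd: int, arg: int) -> bytes:
    """Build the 6-byte frame: command byte, 32-bit argument, CRC7 plus stop bit."""
    body = bytes([0x40 | (cmd & 0x3F)]) + arg.to_bytes(4, "big")
    return body + bytes([(crc7(body) << 1) | 1])


cmd0_frame = command_frame(0, 0)          # GO_IDLE_STATE
cmd8_frame = command_frame(8, 0x1AA)      # SEND_IF_COND, 2.7-3.6 V, check pattern $AA
```

Since only CMD0 and CMD8 must carry a real CRC when checking is left off, a driver could also just hard-code their last bytes ($95 and $87) and send a dummy byte for everything else.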
Point 4 - Speed
How do I change the speed when I need to, between 100 kHz, 400 kHz and max speed?
I have no idea how this is done, whether to place delays in the methods or alter the frequency, which I don't know how to do either lol
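For a bit-banged clock, one common approach is a delay between SCK edges whose length is worked out from the system clock. The card has to be initialised at 100 to 400 kHz and can be clocked faster afterwards (25 MHz in default speed mode). A rough sketch of the arithmetic in Python, assuming a 200 MHz system clock purely for illustration; half_period_ticks is a hypothetical name:

```python
def half_period_ticks(sysclk_hz: int, sck_hz: int) -> int:
    """System-clock ticks to wait between SCK edges for a target SPI clock."""
    # Ceiling division, so the resulting SCK never exceeds the target
    return -(-sysclk_hz // (2 * sck_hz))


init_ticks = half_period_ticks(200_000_000, 400_000)      # slow clock for card init
fast_ticks = half_period_ticks(200_000_000, 25_000_000)   # after init completes
```

Changing speed is then just a matter of storing a different tick count in a variable that the byte-transfer method waits on. In interpreted Spin the loop overhead alone may already be longer than the fast delay, in which case no added wait is needed at full speed.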
Point 5 - Timeout
How do I implement a timeout, say 1 or 2 seconds, so as not to wait forever for a read/write method to complete?
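The usual shape of a timeout is to note a deadline before the polling loop and give up once it has passed. A minimal sketch in Python, where read_byte is a hypothetical stand-in for whatever method clocks one byte in from the card; in Spin2 the same pattern would presumably be built on the system counter (getct() and clkfreq) instead of time.monotonic():

```python
import time


def wait_till_ready(read_byte, timeout_s: float = 1.0) -> bool:
    """Poll until the card returns $FF (not busy) or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if read_byte() == 0xFF:       # card holds its data line low while busy
            return True
    return False                      # caller decides how to report the failure


# Stand-in card: busy for three polls, then ready
_responses = iter([0x00, 0x00, 0x00, 0xFF])
ready = wait_till_ready(lambda: next(_responses))

# Stand-in card that never becomes ready
stuck = wait_till_ready(lambda: 0x00, timeout_s=0.05)
```

Returning a flag rather than hanging lets the read/write methods pass an error code back up to the caller.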
Point 6 - Recommendations
I know that there are bugs or errors, and if members can be kind enough to point them out, I'll try to correct them before moving on to my main course, which can only be done once the SPI layer is completed.
Footnote
Since some of the commands involve calling the SendCommand() method from within itself,
it's difficult to decide where and when to use methods such as WaitTillReady() or ChipSelect() or ChipDeselect(), since it may break one of the nested calls.
I'm suspecting that I need to use those methods in my SendCommand() method only?
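One way around the nesting problem is to split the work in two: an inner method that only waits for ready and sends the bytes, and an outer method that owns the chip select. Application commands then call the inner method twice (CMD55 first) inside a single select/deselect pair. A sketch in Python with a fake bus that just records what happens; FakeBus, _send_raw and the app flag are all hypothetical names for illustration:

```python
class FakeBus:
    """Records pin and command activity instead of driving real hardware."""

    def __init__(self):
        self.log = []

    def select(self):
        self.log.append("CS low")

    def deselect(self):
        self.log.append("CS high")

    def wait_ready(self):
        self.log.append("wait ready")

    def transfer(self, cmd, arg):
        self.log.append(f"CMD{cmd}")
        return 0x00                   # pretend the R1 response was OK


def _send_raw(bus, cmd, arg):
    """Inner step: never touches chip select, so it is safe to call repeatedly."""
    bus.wait_ready()
    return bus.transfer(cmd, arg)


def send_command(bus, cmd, arg=0, app=False):
    """Outer step: the only place chip select is asserted and released."""
    bus.select()
    try:
        if app:
            _send_raw(bus, 55, 0)     # APP_CMD prefix for ACMDs
        return _send_raw(bus, cmd, arg)
    finally:
        bus.deselect()


bus = FakeBus()
send_command(bus, 41, 0x4000_0000, app=True)   # ACMD41 with the HCS bit set
```

Commands followed by a data phase (block reads and writes) would need chip select held low until the data has been transferred, so for those the deselect belongs at the end of the block method rather than here.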
Comments
You might consider using Smart Pins for SPI instead of bit-banging; this will let you move data a whole lot faster. If you still want to bit-bang then consider using pinhigh(), pinlow(), and pinwrite() instead of using conditional code to deal with which set of IO pins are employed.
I've attached a simple driver for the flash that uses Smart Pins as a guide to get you going.
I just posted some block diagrams that I did when writing the SD ROM Boot code in the P2 on my SD thread.
I posted them years ago but I couldn't find those threads.
My drivers do not need an extra cog and can read at up to 7,500k bytes/second. The underlying PASM routines are very simple to understand as there are basically the bit-bash SPI read and write for commands and response etc, and also the smartpin assisted block mode SPI routines for fast sector reads and writes. While you clearly state your points, the rationale for this has not been made obvious, but if it were then maybe we can understand your particular requirements better.
Here's the console output from TAQOZ when I request information on the SD card and speeds etc.
For masochistic people who don't want to use the streamer or Smartpins for receiving serial data I think I've found an I/O pin poll sequence that should be able to read SD SPI data in 32 bit bursts at (instantaneous) polling rates up to sysclk/2 bps with 32 bits read every ~84 P2 clocks in the example below. Depending on the P2 clock rate this might be too fast for some cards or for the signal integrity available on the PCB but it might still suit systems with slower clocks to maximize the transfer rate. The clock would need to be handled by a Smartpin and the input data synced up to it and so it may also need a small delay added here to tune it. Depending on any delay needed it might achieve sector transfers around ~11MBps at 250MHz. Obviously this is not sustainable due to the other access overheads but it's still quite fast.
Uses 4 temp regs to accumulate the 32 bits (here a,b,c,d). Untested sample code for the inner read loop of 128 longs (a 512 byte sector):
@rogloh - I use a smartpin as a clock too, although I use a simple rep loop of 2 instructions that can be expanded to 3 when I need to read slower. I don't have the cog memory to spare for an unrolled rolnib loop unless I dedicate a cog for this.
Yes I recall you had a good solution from some time back. What transfer rate do you get again Peter? Would this sample code be able to transfer sectors faster or slower than yours does? I know it is unrolled and takes up some space.
Update: ok, yes I see above it would peak at 4 clocks per bit excluding wflong and clock retrigger. Mine peaks at 2 clocks per bit but with overheads I'd imagine it goes up to just under 3 clocks/bit.
Ok I found Peter's original code from another post (reproduced below).
It looks like it would read a sector in 152*128 clocks assuming the 2 cycle time you'd need to init the "a" loop register elsewhere is compensated for by the last djnz at the end.
Assuming I need the same 12 clock cycle delay after starting the clock before reading in pin data (and reading from ina/inb with rolnib has the same delay as testp - it may not) then my loop should be able to read a 512 byte sector in 96x128+2 clocks. This is 12290 vs 19456 or 58% faster which is significant.
Update: you could still achieve a speed up with a reduced unroll count too:
8 bits per loop = 34 clocks per 8 bits = 4.25 clocks/bit (12 instructions)
16 bits per loop = 54 clocks per 16 bits = 3.375 clocks/bit
32 bits per loop = 96 clocks per 32 bits = 3 clocks/bit
Peter's current loop @ 200 MHz is 152 clocks per 32 bits = 4.75 clocks/bit (9 instructions)
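The figures above can be cross-checked with a few lines of Python. This only reproduces the arithmetic quoted in the posts (clock counts are taken as stated, not measured):

```python
LONGS_PER_SECTOR = 128                # 512 bytes = 128 longs of 32 bits


def clocks_per_bit(clocks_per_loop: int, bits_per_loop: int) -> float:
    return clocks_per_loop / bits_per_loop


# Unrolled rolnib loop at three unroll depths: (bits per loop, clocks per loop)
unrolled = {bits: clocks_per_bit(clocks, bits)
            for bits, clocks in [(8, 34), (16, 54), (32, 96)]}
peter = clocks_per_bit(152, 32)

# Whole-sector cost in clocks for the two 32-bit loops
unrolled_sector = 96 * LONGS_PER_SECTOR + 2
peter_sector = 152 * LONGS_PER_SECTOR
speedup_pct = (peter_sector / unrolled_sector - 1) * 100
```

The sector totals come out at 12290 and 19456 clocks, which is where the roughly 58% figure comes from.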
I am outputting an 80MHz clock to the SD when running at 320MHz and the limit in the spec is 50MHz. I don't think the card would work at 160MHz!!!!
Thank You Jonny Mac,
I do like the look of PinHigh and PinLow way of doing things,
It helps get me away from assessing if the pin was on a PortA or PortB...
As for smartpins, I think since I'm kind of struggling already, maybe it is something to look at in the future.
Maybe if time allows I will take a look at the SmartPins in Easy Adopter tutorials or videos to learn a little more about how they work and how to implement them.
Thank you for your "simple" driver
I think it will take me a few days or weeks at least to try and understand what's going on.
But as you say, if it's faster then surely it's something I need to learn.
Thank You Cluso,
I actually saw everyone's replies on Friday 7th May while I was at work, but didn't get any free time to respond until now.
I took the liberty of printing the flowcharts out so I could study them at home.
I did see a similar one at http://elm-chan.org/docs/mmc/i/sdinit.png
But your flowcharts are a bit more detailed.
So cheers for sharing them.
Thank You Peter,
I completely understand that PASM is way faster.
And I know from all the forum threads that your TAQOZ is a super program/OS, of that I have no doubt.
But I am a beginner who is using Spin to learn, as I find that easiest to understand.
I thought I would learn about FAT32 and wanted to have a dabble at writing a program to read/write files from disk, and for it to be FAT32 compatible.
Initially I had struggled because the method used to initialise an sd card was alien to me.
But the flow chart, Tomas Rokicki's FSRW 2.x and Jonathon Dummer's safe_spi.spin helped lots too.
So I wrote that program which I'd uploaded, more or less so that experienced members could point out flaws in my logic.
I don't doubt that all the members are giving me good advice, based upon their experiences,
but I cannot understand many of the complex statements in much of the code, so I tried to break it down into individual statements.
And hence I tried to write it in a way which I could understand, and that is the rationale behind my question.
I really am at the beginner stage but have a passion and thirst to know more.
Thank you for your i/o pin poll sequence sample/code.
I like to learn how a thing works, so that when it breaks I know how to fix it, or where to look to find the cause of an issue.
And that has always been the mentality of how I like to learn about things.
The sample you posted is way over my head, since I'm only at a beginner level.
Maybe it's something I would understand when I have more knowledge.
Thank you nonetheless.