Help with Using Spi To Access The Sd Card

JaanDoh · 2021-05-05 21:30

I want to be able (eventually) to access files on an SD card using SPI.

And since I am just learning, and finding it difficult to comprehend PASM or C code, I put together a basic skeleton of a block driver in Spin. It is incomplete, since there are several points I'm stuck on.

I'm using Spin because that's the one I feel most comfortable with.

In my unfinished driver (see attached file) there are several points which are befuddling me, and if anyone could be so kind as to point out my mistakes, that would be super.

I know there are many excellent complete drivers out there already in the wild, but it's more important to me to understand what's going on under the hood rather than just take a script and implement it.

It is based on Jonathon Dummer's Safe_Spi.spin object that came with Tomas Rokicki's fsrw.spin object.

What's more important to me is to know WHY, rather than HOW (if that makes sense lol)

Thanks JD

Point 1 - Use chip_select and chip_deselect methods.
I want to use ChipSelect() and ChipDeselect() methods to signal to the card that it is selected,
as opposed to using inline statements to set the pin states.

Where I am confused is exactly where to place those method calls, and in which methods.
I'm not at all sure that I can use ChipSelect() and/or ChipDeselect() calls, because the SendCommand() method calls itself. (see footnote)

Point 2 - Use wait_till_ready() method.
I want to use the WaitTillReady() method to ensure the card is free to accept commands, before actually sending the commands.

Once again, I'm confused about which methods should call the WaitTillReady() method. (see footnote)

Point 3 - CRC table
Do I need a CRC table?
I don't think it's needed, but as I understand things, some cards want or expect the correct CRC.
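
On the CRC question, a lookup table is probably not required just to get a card initialised. In SPI mode, CRC checking is off by default, but CMD0 is sent while the card is still in native SD mode and CMD8 is also checked, so those two frames need a valid CRC7. Below is a minimal sketch in Python (used instead of Spin only so it can be run on a PC) of a bit-by-bit CRC7 and a 6-byte command frame. The names crc7 and command_frame are illustrative and do not come from any existing driver.

```python
def crc7(data: bytes) -> int:
    """CRC7 with polynomial x^7 + x^3 + 1, processed MSB first, no lookup table."""
    crc = 0
    for byte in data:
        for bit in range(7, -1, -1):
            in_bit = (byte >> bit) & 1
            top = (crc >> 6) & 1
            crc = (crc << 1) & 0x7F
            if in_bit ^ top:
                crc ^= 0x09
    return crc


def command_frame(cmd: int, arg: int) -> bytes:
    """Build the 6 bytes of an SD command: start bits + index, 32-bit argument, CRC7 + stop bit."""
    body = bytes([0x40 | (cmd & 0x3F)]) + arg.to_bytes(4, "big")
    return body + bytes([(crc7(body) << 1) | 1])


print(command_frame(0, 0).hex())        # CMD0, GO_IDLE_STATE
print(command_frame(8, 0x1AA).hex())    # CMD8, SEND_IF_COND
```

Because CMD0 and CMD8 always carry the same arguments during initialisation, many drivers simply hard-code their CRC bytes ($95 and $87) and send a dummy CRC byte for everything else.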

Point 4 - Speed
How do I switch the speed between 100 kHz, 400 kHz and maximum speed when I need to?
I have no idea how this is done, whether to place delays in the methods or alter the frequency, which I don't know how to do either lol
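
For a bit-banged driver, one common way to set the speed is to wait a fixed number of system clock ticks for each half of the SPI clock period. The SD specification asks for a clock of 100 to 400 kHz during initialisation, after which it can be raised. The sketch below only does the arithmetic, in Python so it can be run on a PC. The 200 MHz system clock is an assumption, and the function name is illustrative. It also ignores the time the Spin statements themselves take, so the real clock would come out somewhat slower than requested.

```python
def half_period_ticks(sysclk_hz: int, spi_hz: int) -> int:
    """System clock ticks to wait for each half (high or low) of one SPI clock."""
    ticks = -(-sysclk_hz // (2 * spi_hz))   # ceiling division: never run faster than asked
    return max(1, ticks)


SYSCLK_HZ = 200_000_000   # assumed system clock, adjust to the real clkfreq

for spi_hz in (100_000, 400_000):
    print(spi_hz, "Hz ->", half_period_ticks(SYSCLK_HZ, spi_hz), "ticks per half period")
```

Under these assumptions the driver would keep the delay in a variable, use the 400 kHz value until the card has finished initialising, and then shrink or drop the delay for "maximum speed".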

Point 5 - Timeout
How do I implement a timeout of, say, 1 or 2 seconds,
so as not to wait forever for a read/write method to complete?
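
One way to think about a timeout is to note the free-running system counter before the wait starts and give up once enough ticks have passed. The sketch below models this in Python so it can be run on a PC. The names elapsed and wait_until_ready are illustrative, and read_byte and get_ticks are hypothetical stand-ins for "clock one byte in from the card" and "read the 32-bit system counter". A busy card holds its data line low and answers $FF once it is ready.

```python
MASK32 = 0xFFFF_FFFF


def elapsed(now: int, start: int) -> int:
    """Ticks since start; still correct if the 32-bit counter wrapped in between."""
    return (now - start) & MASK32


def wait_until_ready(read_byte, get_ticks, timeout_ticks: int) -> bool:
    """Poll the card until it answers 0xFF. Returns False if the timeout expires first."""
    start = get_ticks()
    while elapsed(get_ticks(), start) < timeout_ticks:
        if read_byte() == 0xFF:
            return True
    return False
```

For a 1 second timeout, timeout_ticks would simply be the system clock frequency in Hz. Comparing the elapsed ticks rather than the raw counter values is what keeps the test valid when the counter rolls over.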

Point 6 - Recommendations
I know that there are bugs or errors, and if members can be kind enough to point them out, I'll try to correct them before moving on to my main course, which can only be done once the SPI layer is completed.

Footnote
Since some of the commands involve calling the SendCommand() method from within itself,
it's difficult to decide where and when to use methods such as WaitTillReady(), ChipSelect() or ChipDeselect(), since that may break one of the nested calls.

I suspect that I need to use those methods in my SendCommand() method only?
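
One possible way around the nesting problem is to split the work into layers: an inner helper that sends a single frame and never touches chip select, and outer methods that own chip select for one whole transaction. The sketch below models that layering in Python so it can be run on a PC. It is not the attached driver; FakeCard, send_frame, send_command and send_app_command are hypothetical names, and the card object only records what would happen on the pins.

```python
class FakeCard:
    """Hypothetical stand-in for the SPI pins; it only records what would happen."""
    def __init__(self):
        self.log = []

    def select(self):                 # ChipSelect(): CS low
        self.log.append("CS low")

    def deselect(self):               # ChipDeselect(): CS high
        self.log.append("CS high")

    def wait_ready(self):             # WaitTillReady(): clock the bus until the card answers $FF
        self.log.append("wait ready")

    def shift_frame(self, cmd, arg):  # shift out the 6 command bytes, return the R1 response
        self.log.append(f"CMD{cmd}")
        return 0


def send_frame(card, cmd, arg):
    """Inner helper: waits for ready and sends one frame. Never touches CS, never recurses."""
    card.wait_ready()
    return card.shift_frame(cmd, arg)


def send_command(card, cmd, arg=0):
    """Public method for a simple command: owns CS for exactly one frame."""
    card.select()
    try:
        return send_frame(card, cmd, arg)
    finally:
        card.deselect()


def send_app_command(card, cmd, arg=0):
    """ACMDn is CMD55 followed by CMDn: two plain calls instead of a nested one."""
    send_command(card, 55)
    return send_command(card, cmd, arg)


card = FakeCard()
send_app_command(card, 41, 0x4000_0000)
print(card.log)
```

A block read or write would follow the same idea: the public method selects the card, calls send_frame(), transfers the data, and only then deselects, so CS stays low for the whole transaction.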

JonnyMac · 2021-05-06 00:01

You might consider using Smart Pins for SPI instead of bit-banging; this will let you move data a whole lot faster. If you still want to bit-bang then consider using pinhigh(), pinlow(), and pinwrite() instead of using conditional code to deal with which set of IO pins are employed.

I've attached a simple driver for the flash that uses Smart Pins as a guide to get you going.

Cluso99 · 2021-05-06 00:18

I just posted some block diagrams that I did when writing the SD ROM Boot code in the P2 on my SD thread.
I posted them years ago but I couldn't find those threads.

Peter Jakacki · 2021-05-06 00:47

My drivers do not need an extra cog and can read at up to 7,500k bytes/second. The underlying PASM routines are very simple to understand as there are basically the bit-bash SPI read and write for commands and response etc, and also the smartpin assisted block mode SPI routines for fast sector reads and writes. While you clearly state your points, the rationale for this has not been made obvious, but if it were then maybe we can understand your particular requirements better.

Here's the console output from TAQOZ when I request information on the SD card and speeds etc.

TAQOZ# .disk ---  CARD: SANDISK   SD SL08G REV$80 #188035386 DATE:2016/7                                             

                   *** OCR ***                                                                                       
    VALUE........................... $C0FF_8000                                                                      
    RANGE........................... 2.7V to 3.6V                                                                    

                   *** CSD ***                                                                                       
    CARD TYPE....................... SDHC                                                                            
    LATENCY......................... 1ms+1400 clocks                                                                 
    SPEED........................... 50MHz                                                                           
    CLASSES......................... 0 1 0 1 1 0 1 1 0 1 0 1                                                         
    BLKLEN.......................... 512                                                                             
    SIZE............................ 7,761MB                                                                         
    Iread Vmin...................... 100ma                                                                           
    Iread Vmax...................... 25ma                                                                            
    Iwrite Vmin..................... 1ma                                                                             
    Iwrite Vmax..................... 45ma                                                                            

                 *** SPEEDS ***                                                                                      
    LATENCY......................... 206us,228us,220us,229us,220us,230us,240us,205us,                                
    SECTOR.......................... 257us,290us,282us,292us,283us,293us,303us,268us,                                
    BLOCKS.......................... 7,440kB/s @320MHz                                                               

                   *** MBR ***                                                                                       
    PARTITION....................... 0 00 INACTIVE                                                                   
    FILE SYSTEM..................... FAT32 LBA                                                                       
    CHS START....................... 1023,254,63                                                                     
    CHS END......................... 0,0,0                                                                           
    FIRST SECTOR.................... $0000_2000                                                                      
    TOTAL SECTORS................... 15,515,648 = 7,944MB                                                            

00170: 0000_0000 0000_0001 0002_0000 506F_7250     '............ProP'                                                

                  *** FAT32 ***                                                                                      
    OEM............................. TAQOZ P2                                                                        
    Byte/Sect....................... 512                                                                             
    Sect/Clust...................... 16 = 8KB                                                                        
    FATs............................ 2                                                                               
    Media........................... F8                                                                              
    Sect/Track...................... 63                                                                              
    Heads........................... 255                                                                             
    Hidden Sectors.................. 8,192 = 4MB                                                                     
    Sect/Part....................... 15,515,648 = 7,944MB                                                            
    Sect/FAT........................ 7,576 = 3MB                                                                     
    Flags........................... 0                                                                               
    Ver............................. 00 00                                                                           
    ROOT Cluster.................... $0000_0002 SECTOR: $0000_5B50                                                   
    INFO Sector..................... $0001 = $0000_2001                                                              
    Backup Sector................... $0006 = $0000_2006                                                              
    res............................. 00 00 00 00 00 00 00 00 00 00 00 00                                             
    Drive#.......................... 128                                                                             
    Ext sig......................... $29 OK!                                                                         
    Part Serial#.................... $50AD_0021 #1353515041                                                          
    Volume Name..................... P2 CARD    FAT32    ok                                                          
TAQOZ#

rogloh · 2021-05-06 04:29

For masochistic people who don't want to use the streamer or Smartpins for receiving serial data I think I've found an I/O pin poll sequence that should be able to read SD SPI data in 32 bit bursts at (instantaneous) polling rates up to sysclk/2 bps with 32 bits read every ~84 P2 clocks in the example below. Depending on the P2 clock rate this might be too fast for some cards or for the signal integrity available on the PCB but it might still suit systems with slower clocks to maximize the transfer rate. The clock would need to be handled by a Smartpin and the input data synced up to it and so it may also need a small delay added here to tune it. Depending on any delay needed it might achieve sector transfers around ~11MBps at 250MHz. Obviously this is not sustainable due to the other access overheads but it's still quite fast.

Uses 4 temp regs to accumulate the 32 bits (here a,b,c,d). Untested sample code for the inner read loop of 128 longs (a 512 byte sector):

rep #42, #128
wypin #32, CLKPIN ' start 32 clocks
' tuning delay needed here perhaps?
rolnib a,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib a,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib a,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib a,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib a,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib a,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib a,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib a,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib b,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib b,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib b,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib b,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib b,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib b,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib b,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib b,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib c,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib c,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib c,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib c,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib c,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib c,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib c,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib c,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib d,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib d,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib d,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib d,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib d,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib d,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib d,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
rolnib d,  ina+((MISOPIN>>5)&1), #(MISOPIN>>2)&7
splitb a
splitb b
splitb c
splitb d
rolbyte d, d, #MISOPIN & 3
rolbyte d, c, #MISOPIN & 3
rolbyte d, b, #MISOPIN & 3
rolbyte d, a, #MISOPIN & 3
wflong d
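
For readers new to PASM2, the data path of the unrolled loop above can be modelled on a PC. The Python sketch below assumes that ROLNIB, SPLITB and ROLBYTE behave as modelled in the three helper functions, and it assumes MISOPIN = 58 purely for illustration. It ignores all timing, so it says nothing about the clock alignment or the tuning delay. It only shows how 32 port samples turn into four bytes.

```python
import random

MASK32 = 0xFFFF_FFFF
MISOPIN = 58                      # assumed pin number, for illustration only


def rolnib(d, s, n):
    """Model of ROLNIB: D = {D[27:0], S.NIBBLE[n]}."""
    return ((d << 4) | ((s >> (4 * n)) & 0xF)) & MASK32


def splitb(d):
    """Model of SPLITB: bit j of every nibble is gathered into byte j."""
    out = 0
    for k in range(32):
        out |= ((d >> (4 * (k % 8) + k // 8)) & 1) << k
    return out


def rolbyte(d, s, n):
    """Model of ROLBYTE: D = {D[23:0], S.BYTE[n]}."""
    return ((d << 8) | ((s >> (8 * n)) & 0xFF)) & MASK32


def port_sample(miso_bit):
    """One read of the 32-bit port: MISO carries data, the other pins carry noise."""
    noise = random.getrandbits(32)
    bit = MISOPIN & 31
    return (noise & ~(1 << bit)) | (miso_bit << bit)


def read_long(data):
    """Model one pass of the loop body for four bytes sent MSB first."""
    regs = []
    for byte in data:
        r = 0
        for i in range(7, -1, -1):                 # eight ROLNIBs per register
            r = rolnib(r, port_sample((byte >> i) & 1), (MISOPIN >> 2) & 7)
        regs.append(splitb(r))                     # one SPLITB per register
    a, b, c, d = regs
    for src in (d, c, b, a):                       # four ROLBYTEs into d
        d = rolbyte(d, src, MISOPIN & 3)
    return d


result = read_long(bytes([0xDE, 0xAD, 0xBE, 0xEF]))
print(hex(result))
```

Each ROLNIB grabs the four pins that share a nibble with MISO, SPLITB then sorts those samples so that one byte holds only the MISO bits, and ROLBYTE picks that byte out. In this model the first byte received ends up in the low byte of d, which is the order a little-endian long write would place first in memory.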

Peter Jakacki · 2021-05-06 04:51

@rogloh - I use a smartpin as a clock too, although I use a simple rep loop of 2 instructions that can be expanded to 3 when I need to read slower. I don't have the cog memory to spare for an unrolled rolnib loop unless I dedicate a cog for this.

rep     x,#32       ' x=2 for fast or 3 for slower
testp   miso wc         ' 4 cycle/bit read
rcl     r1,#1       ' shift in next bit
waitx   clkdly      ' optional extra wait instruction for modified SPIRX

rogloh · 2021-05-06 04:55

Yes, I recall you had a good solution from some time back. What transfer rate do you get again, Peter? Would this sample code be able to transfer sectors faster or slower than yours does? I know it is unrolled and takes up some space.
Update: ok, yes I see above it would peak at 4 clocks per bit excluding wflong and clock retrigger. Mine peaks at 2 clocks per bit but with overheads I'd imagine it goes up to just under 3 clocks/bit.

rogloh · 2021-05-06 05:23

Ok I found Peter's original code from another post (reproduced below).

.l0             wypin   #32,sck                 ' trigger 32 clocks
                waitx   sddly                   ' delay (set to 8 for 200MHz)
                rep     x,#32                   '  x = 2 for fast or 3 for slower
                 testp  miso wc                 ' 4 cycle/bit read
                 rcl    r1,#1
                 waitx  clkdly                  ' extra wait instruction for modified SPIRX
                movbyts r1,#%00011011           ' rearrange bytes
                wflong  r1                      ' save four bytes
                djnz    a,#.l0                  ' for long count

It looks like it would read a sector in 152*128 clocks assuming the 2 cycle time you'd need to init the "a" loop register elsewhere is compensated for by the last djnz at the end.

Assuming I need the same 12 clock cycle delay after starting the clock before reading in pin data (and reading from ina/inb with rolnib has the same delay as testp - it may not) then my loop should be able to read a 512 byte sector in 96x128+2 clocks. This is 12290 vs 19456 or 58% faster which is significant.

Update: you could still achieve a speed up with a reduced unroll count too:
8 bits per loop = 34 clocks per 8 bits = 4.25 clocks/bit (12 instructions)
16 bits per loop = 54 clocks per 16 bits = 3.375 clocks/bit
32 bits per loop = 96 clocks per 32 bits = 3 clocks/bit
Peter's current loop @200MHz is 152 clocks per 32 bits = 4.75 clocks/bit (9 instructions)
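
The figures in this post can be checked with a few lines of arithmetic. The sketch below (Python, so it can be run on a PC) takes the per-long clock counts exactly as quoted above and does not try to re-derive them from the instruction timings. The 200 MHz figure is the clock mentioned for Peter's loop.

```python
LONGS_PER_SECTOR = 128            # 512 bytes / 4 bytes per long
SYSCLK_HZ = 200_000_000           # clock quoted for Peter's loop

peter_clocks = 152 * LONGS_PER_SECTOR
unrolled_clocks = 96 * LONGS_PER_SECTOR + 2
speedup_percent = (peter_clocks / unrolled_clocks - 1) * 100

print("Peter's loop :", peter_clocks, "clocks =", peter_clocks / SYSCLK_HZ * 1e6, "us per sector")
print("Unrolled loop:", unrolled_clocks, "clocks =", unrolled_clocks / SYSCLK_HZ * 1e6, "us per sector")
print("Speed-up     :", round(speedup_percent), "%")

# Clocks per bit for each unroll depth: (bits per loop, clocks per loop)
for bits, clocks in ((8, 34), (16, 54), (32, 96), (32, 152)):
    print(bits, "bits in", clocks, "clocks =", clocks / bits, "clocks/bit")
```

These are loop times only; command latency and the wait for the data token come on top, which is why measured sector times are much longer.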

Peter Jakacki · 2021-05-06 06:49

I am outputting an 80MHz clock to the SD when running at 320MHz and the limit in the spec is 50MHz. I don't think the card would work at 160MHz!!!!

                 *** SPEEDS *** 
    LATENCY......................... 535us,248us,246us,263us,246us,246us,266us,247us,
    SECTOR.......................... 321us,305us,305us,322us,304us,305us,325us,305us,
    BLOCKS.......................... 7,905kB/s @340MHz ok
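
The clock figures in this exchange follow from dividing the system clock by the number of clocks spent per bit. A small check in Python, using the 50 MHz limit quoted in the post above; the function name is illustrative.

```python
SD_MAX_HZ = 50_000_000            # limit quoted in the post above


def spi_clock_hz(sysclk_hz: int, clocks_per_bit: int) -> int:
    """SPI clock that results from spending a fixed number of system clocks per bit."""
    return sysclk_hz // clocks_per_bit


for clocks_per_bit in (4, 2):
    hz = spi_clock_hz(320_000_000, clocks_per_bit)
    note = "(over the quoted limit)" if hz > SD_MAX_HZ else ""
    print(clocks_per_bit, "clocks/bit ->", hz // 1_000_000, "MHz", note)
```

At 4 clocks per bit, a 200 MHz system clock is the point where the SPI clock lands exactly on the quoted 50 MHz limit.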

JaanDoh · 2021-05-09 02:15

@JonnyMac said:
You might consider using Smart Pins for SPI instead of bit-banging; this will let you move data a whole lot faster. If you still want to bit-bang then consider using pinhigh(), pinlow(), and pinwrite() instead of using conditional code to deal with which set of IO pins are employed.

I've attached a simple driver for the flash that uses Smart Pins as a guide to get you going.

Thank You Jonny Mac,

I do like the look of the PinHigh and PinLow way of doing things.
It gets me away from having to work out whether the pin is on PortA or PortB...

As for smartpins, I think since I'm kind of struggling already, maybe it is something to look at in the future.
Maybe if time allows I will take a look at the SmartPins in Easy Adopter tutorials or videos to learn a little more about how they work and how to implement them.

Thank you for your "simple" driver
I think it will take me a few days or weeks at least to try and understand what's going on.
But as you say, if it's faster then surely it's something I need to learn.

JaanDoh · 2021-05-09 02:29

@Cluso99 said:
I just posted some block diagrams that I did when writing the SD ROM Boot code in the P2 on my SD thread.
I posted them years ago but I couldn't find those threads.

Thank You Cluso,

I actually saw everyone's replies on Friday 7th May while I was at work, but didn't get any free time to respond until now.
I took the liberty of printing the flowcharts out so I could study them at home.
I did see a similar one at http://elm-chan.org/docs/mmc/i/sdinit.png
But your flowcharts are a bit more detailed.
So cheers for sharing them.

JaanDoh · 2021-05-09 02:44

@"Peter Jakacki" said:
My drivers do not need an extra cog and can read at up to 7,500k bytes/second. The underlying PASM routines are very simple to understand as there are basically the bit-bash SPI read and write for commands and response etc, and also the smartpin assisted block mode SPI routines for fast sector reads and writes. While you clearly state your points, the rationale for this has not been made obvious, but if it were then maybe we can understand your particular requirements better.

Thank You Peter,

I completely understand that PASM is way faster.
And I know from all the forum threads that your TAQOZ is a super program/OS, of that I have no doubt.

But I am a beginner who is using Spin to learn, as I find that easiest to understand.
I thought I would learn about FAT32 and wanted to have a dabble at writing a program to read/write files from disk, and for it to be FAT32 compatible.
Initially I struggled because the method used to initialise an SD card was alien to me.
But the flow chart, Tomas Rokicki's FSRW 2.x and Jonathon Dummer's safe_spi.spin helped lots too.
So I wrote that program which I'd uploaded, more or less so that experienced members could point out flaws in my logic.

I don't doubt that all the members are giving me good advice, based upon their experiences,
but I cannot understand many of the complex statements in much of the code, so I tried to break it down into individual statements.
And hence I tried to write it in a way which I could understand, and that is the rationale behind my question.

I really am at the beginner stage but have a passion and thirst to know more.

JaanDoh · 2021-05-09 02:52

@rogloh said:
For masochistic people who don't want to use the streamer or Smartpins for receiving serial data I think I've found an I/O pin poll sequence that should be able to read SD SPI data in 32 bit bursts at (instantaneous) polling rates up to sysclk/2 bps with 32 bits read every ~84 P2 clocks in the example below. Depending on the P2 clock rate this might be too fast for some cards or for the signal integrity available on the PCB but it might still suit systems with slower clocks to maximize the transfer rate. The clock would need to be handled by a Smartpin and the input data synced up to it and so it may also need a small delay added here to tune it. Depending on any delay needed it might achieve sector transfers around ~11MBps at 250MHz. Obviously this is not sustainable due to the other access overheads but it's still quite fast.

Thank you for your i/o pin poll sequence sample/code.

I like to learn how a thing works, so that when it breaks I know how to fix it, or where to look to find the cause of an issue.
And that has always been the mentality of how I like to learn about things.

The sample you posted is way over my head, since I'm only at a beginner level.
Maybe it's something I would understand when I have more knowledge,
Thank you none-the-less.
