Expected read speed for a 512 block on micro SD with SPI?

gis667en11 · 2015-12-08 20:19

Hi everyone,

I couldn't find this information anywhere and my SD card module is in the mail; I was curious enough that I decided to ask. What do you think I could expect in terms of read speed to get 512 bytes, or 1 block, from a class 10 micro SD card? I'll certainly be conducting my own experiments over the weekend, but if someone else has done a time-sensitive project with an SD card over SPI, I'd love to hear what your experience was with fast reads.

Thanks

msrobots · 2015-12-09 01:03

A while ago I tested both available SD card drivers for the Propeller.

Kye's Fat_Engine is more comfortable and FSRW is faster.

Both provide a whole FAT system, so they are a bit slower than just reading/writing sectors.

The PASM block driver out of FSRW is IMHO the fastest way to access sectors. It provides read-ahead and write-behind functions.

The result is that your main cog does not have to wait long for writing a sector, and if you, say, read sector 34, sector 35 will be read ahead and is available very fast.

Overall you can get speeds between 800 and 1200 Kbytes per second.

Enjoy!

Mike

DavidZemon · 2015-12-09 02:31

With SD cards on the Propeller, I'm not sure if the class matters very much. I think we're more limited by the Propeller's max SPI speed which is 4 MHz last I heard (unless someone else knows of a way to go faster???)

The numbers @msrobots gave above are about right. They're so much higher than my own I had to go find a source for them, which I did: http://forums.parallax.com/discussion/136781/fsrw-2-6-speed-tests

Raw write 3968 kB in 2191 ms at 1810 kB/s
Raw read 1920 kB in 2179 ms at 880 kB/s
fsrw pwrite 4064 kB in 3766 ms at 1079 kB/s
fsrw pread 4064 kB in 4626 ms at 878 kB/s
FSRW pputc 63 kB in 2097 ms at 30 kB/s
FSRW pgetc 63 kB in 1838 ms at 34 kB/s

My own implementation (C++) is much slower since I did not implement look-ahead or write-behind. I only benchmarked the equivalent of FSRW's pputc and pgetc, but I get about 13.3 kB/s in LMM and 3.7 kB/s in CMM.
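To tie these figures back to the original question, a sustained rate can be turned into a rough time per 512-byte block. This is only a back-of-the-envelope sketch: `block_time_us` is a made-up helper, it assumes 1 kB = 1000 bytes (the benchmark output doesn't say which unit it uses), and it ignores command overhead and card latency.

```cpp
#include <cassert>

// Hypothetical helper: microseconds needed to move one block at a sustained rate.
// Assumption: 1 kB = 1000 bytes; command overhead and card latency are ignored.
double block_time_us(double rate_kB_per_s, int block_bytes = 512) {
    assert(rate_kB_per_s > 0.0 && block_bytes > 0);
    double bytes_per_us = rate_kB_per_s * 1000.0 / 1e6;  // kB/s -> bytes per microsecond
    return block_bytes / bytes_per_us;
}
```

Under those assumptions, the 880 kB/s raw read figure works out to roughly 0.58 ms per block, while the 34 kB/s pgetc figure works out to roughly 15 ms per block.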

gis667en11 · 2015-12-09 14:23

How could these read tests be done if hub RAM is only 32kB?

kwinn · 2015-12-09 14:45

gis667en11 wrote: »

How could these read tests be done if hub RAM is only 32kB?

It's not necessary to store all the data in RAM, only to read it in, so only a small amount of RAM is required. Possibly as little as a single byte depending on how the code is written.

gis667en11 · 2015-12-09 14:54

Thanks guys, I get it. I will need to keep all of the data in hub RAM as it's read, so even the 64kB read test isn't extremely helpful, beyond making me concerned that the trend seems to be that the less data you read at once, the slower the effective rate is... So, because I need to keep all of the data in hub memory (I'm going to be using a back-and-forth pair of buffers to continuously stream out), I'll probably be reading in 4kB at a time. I need to be streaming data out of my Propeller at around 100kB/s, so if the above data is any indication of the read times I could expect with FSRW, I might be in trouble. I'm basically making a controller that reads data from an SD card and sends it to other devices continuously.
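The back-and-forth buffer pair described here could look something like the sketch below. It is not taken from FSRW or any other driver: `PingPong` and `refill_deadline_ms` are hypothetical names, and the deadline calculation assumes 1 kB = 1000 bytes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical ping-pong buffer: one half is drained while the other is refilled.
template <std::size_t HalfBytes>
class PingPong {
public:
    // Half that the SD reader should fill next.
    std::uint8_t* fill_half() { return halves_[1 - active_].data(); }
    // Half that the output side is currently draining.
    const std::uint8_t* drain_half() const { return halves_[active_].data(); }
    // Call once the drain half is empty and the fill half is full.
    void swap() { active_ = 1 - active_; }

private:
    std::array<std::array<std::uint8_t, HalfBytes>, 2> halves_{};
    std::size_t active_ = 0;
};

// Milliseconds the reader has to refill one half before the output runs dry.
// Assumption: 1 kB = 1000 bytes, so N kB/s is N bytes per millisecond.
double refill_deadline_ms(std::size_t half_bytes, double out_kB_per_s) {
    assert(out_kB_per_s > 0.0);
    return static_cast<double>(half_bytes) / out_kB_per_s;
}
```

With 4 kB halves and a 100 kB/s output rate, the reader would have about 41 ms to fetch the next eight 512-byte blocks. Whether that is comfortable depends on which access path is used: the block-level reads in the table above run at hundreds of kB/s, the byte-at-a-time calls at tens of kB/s.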

DavidZemon · 2015-12-09 15:04

kwinn wrote: »

gis667en11 wrote: »

How could these read tests be done if hub RAM is only 32kB?

It's not necessary to store all the data in RAM, only to read it in, so only a small amount of RAM is required. Possibly as little as a single byte depending on how the code is written.

That's exactly it. I can't speak for FSRW's benchmark, but PropWare's "benchmark" is pretty basic:

volatile int start = CNT;
while (!reader.eof())
    writer.put_char(reader.get_char());
volatile int time = CNT - start;

This will do a byte-by-byte copy from the read buffer to the write buffer. As the read buffer (512 bytes) reaches the end, it overwrites it with a new one and as the write buffer (also 512 bytes) reaches the end it writes it to the SD card and overwrites it too. This means that you only need 1024 bytes of memory for buffers, not the full file size.
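A minimal sketch of that buffering scheme follows. It is not PropWare's actual code: `Card`, `BlockReader`, `BlockWriter` and `copy_file` are hypothetical, and a byte vector stands in for the SD card so the example can run on a PC.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::size_t kBlock = 512;

// Stand-in for the SD card: a byte vector addressed in 512-byte blocks.
using Card = std::vector<std::uint8_t>;

// Hypothetical reader: holds exactly one block in RAM, whatever the file size.
class BlockReader {
public:
    BlockReader(const Card& card, std::size_t file_bytes)
        : card_(card), size_(file_bytes) { assert(file_bytes <= card.size()); }
    bool eof() const { return pos_ >= size_; }
    std::uint8_t get_char() {
        assert(!eof());
        if (pos_ % kBlock == 0) load(pos_);  // crossed into a new block: overwrite buffer
        return buf_[pos_++ % kBlock];
    }

private:
    void load(std::size_t offset) {
        std::size_t n = std::min(kBlock, card_.size() - offset);
        std::memcpy(buf_.data(), card_.data() + offset, n);
    }
    const Card& card_;
    std::size_t size_;
    std::size_t pos_ = 0;
    std::array<std::uint8_t, kBlock> buf_{};
};

// Hypothetical writer: collects bytes and writes them out one block at a time.
class BlockWriter {
public:
    explicit BlockWriter(Card& card) : card_(card) {}
    void put_char(std::uint8_t c) {
        buf_[used_++] = c;
        if (used_ == kBlock) flush();
    }
    // Writes whatever is buffered. A real card would pad the last block to 512 bytes.
    void flush() {
        card_.insert(card_.end(), buf_.begin(),
                     buf_.begin() + static_cast<std::ptrdiff_t>(used_));
        used_ = 0;
    }

private:
    Card& card_;
    std::array<std::uint8_t, kBlock> buf_{};
    std::size_t used_ = 0;
};

// Byte-by-byte copy that needs only 2 * 512 bytes of buffer in total.
void copy_file(BlockReader& reader, BlockWriter& writer) {
    while (!reader.eof()) writer.put_char(reader.get_char());
    writer.flush();
}
```

The file being copied can be any size; the only RAM that scales with it here is the simulated card itself, which on real hardware lives on the SD card rather than in hub memory.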

gis667en11 · 2015-12-09 15:07

Are the read and write buffers you're talking about both inside hub memory? Or would the read buffer be in the SD card and the write buffer is in hub RAM?

DavidZemon · 2015-12-09 15:29

gis667en11 wrote: »

Are the read and write buffers you're talking about both inside hub memory? Or would the read buffer be in the SD card and the write buffer is in hub RAM?

An SD card can only be read from and written to in 512-byte chunks, so the chip must keep at least a 512-byte buffer for reading and writing.

DavidZemon · 2015-12-09 15:30

PropWare's current implementation always uses HUB memory. FSRW uses HUB memory for the user and cog memory for the read-ahead/write-behind buffers.

gis667en11 · 2015-12-09 15:36

Alright, thanks David

I'm getting the micro SD module tomorrow, I'm excited to start playing with it. I'll probably start by trying FSRW with two 2kB buffers, so I'd be reading 4 blocks at a time.

DavidZemon · 2015-12-09 15:42

Awesome! Have fun

If at any point you decide you like FSRW but want to give C/C++ a try, libpropeller has ported FSRW over to C++, so you can get the same feature set and performance in a completely different environment. I don't know much about PropForth and Tachyon, but I know they each support SD cards as well and are very fast and efficient.

gis667en11 · 2015-12-09 15:46

I looked at PropForth and was terrified haha. As a guy that's really only programmed in C, and is totally new to the Propeller, Spin is intimidating enough. It does look very powerful, though. I think I might try to start this project over and "do it again" in PropForth once it's all working as a way to learn it.

DavidZemon · 2015-12-09 15:57

If you're familiar with C already, give libpropeller a shot in SimpleIDE. It even comes with nice API documentation

https://rawgit.com/libpropeller/libpropeller.docs/master/html/classSD.html

gis667en11 · 2015-12-09 17:41

I've considered it, but at the rate I need these cores to be reading and spitting out data over SPI, I've been planning to ultimately boil the code down from Spin to PASM to save time and space. I'm not doing a lot of complex calculations, it's just reading data really fast and sending that data over several different protocols simultaneously. So, my needs dictate that programming in C probably isn't the best choice for this project. Though I really wish it was :P

DavidZemon · 2015-12-09 18:01

gis667en11 wrote: »

I've considered it, but at the rate I need these cores to be reading and spitting out data over SPI, I've been planning to ultimately boil the code down from Spin to PASM to save time and space. I'm not doing a lot of complex calculations, it's just reading data really fast and sending that data over several different protocols simultaneously. So, my needs dictate that programming in C probably isn't the best choice for this project. Though I really wish it was :P

I'm not sure who convinced you that programming in C is slower than Spin, but that isn't the case. There are ways to make C slower than Spin, and there are ways to make Spin slower than C.

If you think you'll have it all in assembly anyway, then I definitely suggest starting with C/C++ because you write only the functions you need in assembly (inline assembly) and then slowly convert more and more into assembly as performance demands. When the inline assembly becomes too unwieldy, you use a .S file the same way you would the DAT section of a .spin file.

gis667en11 · 2015-12-09 18:15

Reads your signature

Ahhh, now I understand haha. I read in a few places, and heard from a Propeller technical support representative, that Spin was faster. What do you mean when the "inline assembly becomes too unwieldy"? What is "inline assembly"?

Scratch that, I googled it. Seems complicated, but powerful. I wonder, why not just use a Propeller C to PASM compiler?

DavidZemon · 2015-12-09 18:49

gis667en11 wrote: »

I wonder, why not just use a propeller C to PASM compiler?

That's exactly what a C compiler is.

It takes your C or C++ code and creates assembly instructions for it. If you add the "-S" flag when compiling (such as "gcc -S main.c"), the compiler stops after generating assembly and writes it to a file such as main.s. That main.s file is the assembly generated from main.c. Now, if you're using the CMM memory model, then it's a little different (see the ongoing discussion here: http://forums.parallax.com/discussion/163002/simpleide-c-output-tokens-or-native#latest) but basically the same.

The reason you might use inline assembly is that the compiler isn't omniscient. It's REALLY smart, but every now and again, a human is able to write better code than a compiler (especially with a CPU that isn't RISC-based, like the Propeller). So, for those cases we can tell the compiler exactly what assembly to write, instead of allowing it to make an educated guess.

Peter Jakacki · 2015-12-10 06:41

Normally when you read and write to the SD you have to do something with it at the higher level, so plain transfer speed itself doesn't tell you all that much. For instance I think there are drivers that use the counters for much faster byte transfers but the overall throughput is compromised by other code.

I wrote Tachyon so I could use it in my commercial products, and so it just has to work. Tachyon is very fast at the high-level plus it treats the SD card and files as virtual memory, even at the FAT32 level. To open a file and print out a 32-bit number at position 2000 in a file is as easy as typing into the console:
FOPEN MYFILE.TXT
2000 F@ PRINT
- or to dump the whole file to the screen -
0 FSIZE@ DUMP

The latest update to Tachyon which I am backporting from the P2 version supports virtual memory access from within files as well as the original access from the first 4GB of the card, so files up to 4GB can be accessed as if they were memory. Up to 4 files can be opened at a time plus Tachyon doesn't require a dedicated cog just for SPI or SD card access. Networking and VGA etc are also built-in, all in the standard unexpanded Prop.

EDIT: just found a post where I stated that I get "sustained" read speeds of 250kB/second running out of the same cog as the Tachyon kernel and application.

I have a project where I need to be madly logging data to the card from a network of over 40 Props up to 240 times a second!

gis667en11 · 2015-12-10 13:13

A network of over 40 props?? I hope they're breadboard diy props, or that would be an expensive project.

Peter Jakacki · 2015-12-10 14:50

A commercial project is never cheap and never breadboarded. The Props are distributed over a common four-quadrant serial bus and can all be programmed from any one Prop. All Props are independent and I can plug a serial terminal into any one of them and "talk Tachyon" while they are doing their thing. Or I can access the cluster from the Internet via Telnet, FTP, or browse web pages served up by one of the supervisory Props. That's what Tachyon was designed for: real, hard, and fast commercial projects. And yet I have made it available for the Prop community to "tinker" with, which is about the most they do... if they do.

gis667en11 · 2015-12-10 15:20

That's all fantastic, but... why not a PIC? Why not an ARM? What's the appeal for you to use props in such a large scale, commercial application?

DavidZemon · 2015-12-10 16:33

gis667en11 wrote: »

That's all fantastic, but... why not a PIC? Why not an ARM? What's the appeal for you to use props in such a large scale, commercial application?

Just a guess, but if he needs 40 of them, each with 8 cores, it sounds to me like he needs some serious parallelism!

Heater. · 2015-12-10 16:44

gis667en11

why not a PIC? Why not an ARM?

1) Longevity of the selected device.

2) No need to comprehend 4000 page manuals to get the thing working.

3) Code running on separate cores makes things predictable and reliable. Especially timing wise. Google "Communicating Sequential Processes".

4) Code running on separate cores means that changes to one part are easy, you know that they cannot upset other parts.

5) Lots of support, easily available, like this very forum.

potatohead · 2015-12-10 21:31

Yep.

That, and Forth really excels at these kinds of things. I only know a few Forth guru types, they all pretty much echo the sentiments Peter does here. The amazing thing to me is what can be done quickly, once a working kernel is bootstrapped on to some machine.

@Peter, thanks. You shared a lot. It is not my thing, but I have learned a lot from your efforts. Honestly, I see the Forth mindset as a significant barrier. It is for me. However, for those who grok it, you have left pure gold here.

DavidZemon · 2015-12-10 22:04

First, I want to echo what potatohead has said: Peter, your work is amazing, and I wish I understood Forth enough to take advantage of it.

However, I am surprised by the talk about Forth specifically instead of Tachyon. For instance, Peter gave the following examples:

FOPEN MYFILE.TXT
2000 F@ PRINT
- or to dump the whole file to the screen -
0 FSIZE@ DUMP

One could just as easily have a C/C++ interface that looked like:

f = fopen("myfile.txt");
print(f, 2000);
// or dump the whole file
print(f);

What I see that's magical is that Peter went through the work to create the functions that do this, not that the Forth language supports the ability to do it. Am I missing something?
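For what it's worth, the "2000 F@" part of the example could be mirrored in C++ along these lines. This is only a sketch: `fetch_u32` is a hypothetical helper, not Tachyon's or PropWare's API, an in-memory byte vector stands in for the open file, and little-endian byte order is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: read a 32-bit value at a byte offset, treating the
// file contents as addressable memory. Assumes little-endian storage.
std::uint32_t fetch_u32(const std::vector<std::uint8_t>& file, std::size_t offset) {
    assert(offset + 4 <= file.size());
    std::uint32_t value = 0;
    for (std::size_t i = 4; i-- > 0;) {
        value = (value << 8) | file[offset + i];  // most significant byte first
    }
    return value;
}
```

A call such as fetch_u32(file, 2000) then plays the role of "2000 F@", with printing left to whatever output routine is at hand.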

Peter Jakacki · 2015-12-10 22:32

Thanks guys for your comments. All those extra functions that I have built into Tachyon are the way Forth is meant to be extended, just as C itself is meant to be extended with libraries. I just don't agree with traditional or ANSI ways of doing these things when they don't make any sense, especially for an embedded system based on the Propeller.

Part of that mindset I suppose is that of simplicity. Take the PRINT function in Forth (normally "."): all it does is take a 32-bit number off the top of the stack and print it in the currently selected base. So essentially it is THE subroutine, whereas in C "print" is far from that, as first it is more of a compiler directive really. We are asking the compiler to generate the correct sequence of operations and calls based on the syntax, so there is a whole heap of stuff that has to be done on the PC at compile time, whereas Forth has none of that. It normally "assembles", as I like to say, on the target device itself without regard to syntax, and all we get is the equivalent of a call to that PRINT subroutine. Simple!

(like me)

Forth is not the answer for everything, but it does work very well for chips at this level. A small PIC I would just code in assembler, an RPi has plenty of language support, but at that level you are compiling and testing directly on the target, just as I do now with Forth on the Prop.

@gis - just like Heater said. I don't need the pain and grief wrangling with these little fellas with their inversely (dis)proportional levels of documentation and tools. Besides you may have gleaned from my description that each Prop has 4 network buses plus the console serial port running, something that is very easy to do with the Prop besides many other things.

Heater. · 2015-12-10 22:39

SwimDudeZemon

You have blown my mind there.

Assuming we are talking C here:

f = fopen("myfile.txt");

Leaves us with a handle/pointer thing "f" for a file.

I have no idea what "print" does but it seems to me that printing file handles is pretty dull.

I am amazed at what Peter can do with the Tachyon. He is living in a twilight zone of his own creation. Which seems to be profitable for him. More power to his elbow. Dare we enter there? It's not for me.

David Betz · 2015-12-10 22:46

FOPEN MYFILE.TXT

This obviously opens the file whose name is "MYFILE.TXT". How do I open a file whose name is stored in a variable and is not constant?

DavidZemon · 2015-12-10 23:25

Heater. wrote: »
SwimDudeZemon

You have blown my mind there.

Assuming we are talking C here:
f = fopen("myfile.txt");
Leaves us with a handle/pointer thing "f" for a file.

I have no idea what "print" does but it seems to me that printing file handles is pretty dull.

I'm more referring to C++, so perhaps that clears up some confusion.

I only wrote it like that as an example that it could be done. In practice, I'd write a function with the signature: "PropWare::Printer& operator<< (FILE *file)". Then if someone really wanted the syntax above, the function would be as simple as:

void print(FILE *f) {
    // pwOut is a global variable in PropWare projects, so no parameter is needed
    pwOut << f;
}

Or rather than the FILE pointer from the C world, I'd use PropWare's FileReader class.

Mike Green · 2015-12-11 05:09

David,
There's another Forth word (FOPEN$) that takes a pointer to a filename and opens it. FOPEN is intended for either interactive use or in compiled code where the filename is fixed.
