SD Card Speed - Page 2 — Parallax Forums

SD Card Speed

Comments

  • Peter...Would you explain your data above?

    96 MHz suggests you were not driving the 8 GB SD memory with a Propeller.

    How does the data translate to word rate out of the SD?

    What does the 1.611 ms data represent?

    Discovery
  • Peter Jakacki Posts: 10,193
    edited 2017-03-07 14:27
    Prop chips are normally run at 80MHz with a 5MHz crystal but the Propeller also runs quite happily at 96MHz using a 6MHz crystal with a suitable layout and bypass caps.

    With SDHC cards the memory is addressed in sectors or blocks of 512 bytes (default). When we request a sector the SD card does what it has to do, signals ready, and then 512 bytes of data are read sequentially from the card into a RAM buffer. So from issuing a command to read a sector to completion of that task was taking 1.611 ms. In actual fact it is faster than that, but I had sector scanning enabled, which looks for special characters and marks their position. With a simple adjustment I could read at twice that rate, still with no special cogs and no special clocking.

    Anyhow, if one sector of 512 bytes takes 1.611ms then that means we can read 318kB/sec which is more than fast enough for your application since that equates to 636k nibbles/sec. Of course another cog would simply sift through those buffers and output nibbles at 2us intervals which is very similar to how we buffer and play wav files. That still leaves the problem of creating that file or sequence of files.

    EDIT: just tested my read speed with my latest kernel and I get 1.544ms per sector so that's 331.6kB/sec read speed, still without anything special.

    Time how long it takes to read in 1,000 sectors of 512 bytes each.
    ( 0004 $3C2A  ok )   0 1000 LAP ADO I SECTOR LOOP LAP .LAP
    148282736 cycles at 96MHz  or 1544.611ms 
    ( 0005 $3C2A  ok )   
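
    The arithmetic behind these figures can be checked with a short sketch. It assumes 512-byte sectors, 1 kB = 1000 bytes, and the cycle counts reported in the thread; the function names are made up for illustration and are not part of Tachyon.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for checking the timing figures quoted in the thread. */

/* Milliseconds per sector, from total clock cycles, clock rate in Hz,
   and the number of sectors read during those cycles. */
double ms_per_sector(uint64_t cycles, double clock_hz, unsigned sectors)
{
    assert(sectors > 0 && clock_hz > 0.0);
    return (double)cycles / clock_hz * 1000.0 / (double)sectors;
}

/* Sustained read rate in kB/s (1 kB = 1000 bytes), assuming 512-byte sectors.
   512 bytes per ms is 512,000 bytes per second, so the rate is 512 / ms. */
double kb_per_sec(double ms)
{
    assert(ms > 0.0);
    return 512.0 / ms;
}
```

    For example, `ms_per_sector(148282736, 96e6, 1000)` corresponds to the 1,000-sector timing above, and passing the result to `kb_per_sec` gives the read rate.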
    


  • OK, rather than storing every step in your stream, why not just store the step rate and the number of steps per motor until the next change? Then you will have a lot less data, and you will be able to generate the steps directly, and fast enough, from the pre-calculated data.

    I do not know what calculations you are attempting that are so slow. Using two cogs to completely interpret raw G-Code, and however many cogs it takes to drive the steppers (I am thinking of code written in PASM, Propeller Assembly Language), you should be able to process data faster than any stepper motor on earth can step, that I know of anyway. I think when I tested the maximum back in 2013, it came out to about 400,000 steps per second while interpreting the G-Code on the Propeller; do not quote me, I am going from faded human memory. Unfortunately I never did finish that project, which is ironic as I am now attempting to get back to the Propeller for similar projects.
  • Peter...That sounds great. I didn't realize that the Propeller would run nicely with a 6 MHz crystal. Your speeds appear excellent. The question is, "What are you doing in your code that achieves the stated speeds that is not present in my code?"

    I attached my test code for you to examine.

    Discovery
  • The included code above running on the Propeller Activity Board produces a series of two pulses spaced 250us apart with a period of 1.2 ms roughly on output pin #2. The code outputs all 3,000 pulses.

    Peter...is this representative of what you get with your code?

    Discovery

  • David...I bought another laptop specifically for developing PropWare projects. I downloaded SimpleIDE and followed your instructions for (SimpleIDE Any System). I loaded your SD Speed code and immediately got an error stating that include PropWare/utility/utility.h could not be found.

    When I ran the extract code the screen display did not match your instructions so I doubt that the PropWare instructions were properly implemented. What did I leave out?

    Discovery
  • David...the new laptop is running Windows 10 if that can be of any use.

    Discovery
  • I'll be home soon and will walk through the instructions on my own Windows 10 machine (I, too, happened to have purchased a new laptop very recently with Windows 10 on it, so it will be my first time installing SimpleIDE and PropWare on it). In the meantime, can you please share with me the .side file for your project? Or, a zip of the entire project might be even better (but I can't remember off the top of my head how to do that from SimpleIDE).
  • jmg Posts: 15,182
    Discovery wrote: »
    ... but the propeller cannot compute and output the steps fast enough to cut thin materials. The CNC works perfectly well at moderate speeds.
    Maybe you need to improve the compute speed ?
    Discovery wrote: »
    The propeller would do the calculations and load the shift register to the depth necessary to cut out the part and the readout would use a dedicated circuit.
    How long have you budgeted for this load phase ?
    SD writing will always be slower than read, and a calculate time that takes too long, will cost too much.
    ie if you struggle to read this, the time to write is likely to kill this stone dead.
    Discovery wrote: »
    Clocking nibbles into a shift register then clocking them back out in reverse through a fast decode then driving the motors is the simplest concept I can think of for making a modification to the existing design.
    Getting the smooth playback could be a lottery. It may start fine, then as the SD card wears, the fetch times could increase. Even change to a new SD could give differing results.
    I'm not sure you could rely on a fixed rate, so would need (very) good buffering, and some means to decide what would be worst case tolerated SD delays, and flag those with a large 'replace SD' light bulb...
    Discovery wrote: »
    .... The cutter can handle steel or aluminum raw stock 7 feet long by 3.5 feet wide with a precision of 0.0005"; therefore, the number of nibbles gets to be quite large.
    My interest now is to locate a shift register that is four bits (a nibble) by 46 giga nibbles deep.
    Unusual approach, but if you must do this, have you looked at run-length-encode storage.
    Instead of a direct copy, only changes are stored, which slashes the time & Size needed.
    Further compression is possible if you store velocity and time and location sets.

    Does this machine have Quadrature counters for position verify, or is it pure Stepper open loop ?



  • jmg Posts: 15,182
    On a Sandisk Ultra 8GB I get sector read times which includes command response phase and reading in 512 bytes of 1.6ms. I had an older Sandisk which was taking over 2ms.
    ( 0031 $3C78  ok )   $40.0000 $40 ADO CR I .LONG SPACE I LAP SECTOR LAP .LAP LOOP
    
    0040.0000 624 cycles at 96000000Hz  or 6.500us 
    0040.0001 154688 cycles at 96000000Hz  or 1.611ms 
    0040.0002 154688 cycles at 96000000Hz  or 1.611ms 
    0040.0003 154688 cycles at 96000000Hz  or 1.611ms 
    0040.0004 154688 cycles at 96000000Hz  or 1.611ms 
    0040.0005 154688 cycles at 96000000Hz  or 1.611ms 
    0040.0006 154688 cycles at 96000000Hz  or 1.611ms 
    0040.0007 154688 cycles at 96000000Hz  or 1.611ms 
    0040.0008 154688 cycles at 96000000Hz  or 1.611ms 
    0040.0009 154688 cycles at 96000000Hz  or 1.611ms 
    0040.000A 154688 cycles at 96000000Hz  or 1.611ms 
    

    Interesting. Can you scan a SD looking for the MIN & MAX of those, and try some older more worn parts ?
    I see even 8G will take 7 hours to read at those speeds.
    Makes the claimed '46 giga nibbles' look suss ? - that's 20 hours read-back at your top SD speed ~ 318kB/s
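
    The read-back estimates above can be reproduced with a few lines of C. This is only a sketch: it assumes decimal units (1 GB = 10^9 bytes, 1 kB = 1000 bytes) and two nibbles packed per byte, and the helper names are hypothetical.

```c
/* Hypothetical helpers for estimating how long a full read-back takes. */

/* Bytes needed to hold a given number of 4-bit nibbles, two per byte. */
double nibbles_to_bytes(double nibbles)
{
    return nibbles / 2.0;
}

/* Hours needed to read `bytes` at a sustained rate of `kb_per_sec` kB/s. */
double read_hours(double bytes, double kb_per_sec)
{
    return bytes / (kb_per_sec * 1000.0) / 3600.0;
}
```

    With these assumptions, `read_hours(8e9, 318.0)` comes out at roughly 7 hours and `read_hours(nibbles_to_bytes(46e9), 318.0)` at roughly 20 hours, in line with the figures quoted.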
  • jmg Posts: 15,182
    Discovery wrote: »
    My interest now is to locate a shift register that is four bits (a nibble) by 46 giga nibbles deep. The propeller would do the calculations and load the shift register to the depth necessary to cut out the part and the readout would use a dedicated circuit. I noticed a German CMOS shift register of 128 bits deep. A really big shift register would be perfect but I have not located one.
    Focusing on that, Serial Flash has continual read ability, and the Quad-SPI version will thus work exactly like " a really big shift register"


    Looks like Micron makes those up to 4GB in Serial NAND and 2GB in Serial NOR
    Serial NOR parts can clock well above P1 ability, so 500k clk speeds are easy.
    You can set the clock rate with high precision in P1.

  • Storing the complete pulse train seems like massive overkill for what you want. There are a number of ways you could simplify this:

    - Store sequences of pulse counts with specific timing (IE, don't store 1,1,1,1,1,0,0,0,0,0,1,1,1,1,1,0,0,0,0,0), store (4, 5), meaning toggle 4 times, delay 5 beats per toggle

    - Store run length compressed sequences (the above would be 1,5, 0,5, 1,5, 0,5) as in general this will be smaller than your uncompressed pulse train

    - Store short line segments and use a Bresenham algorithm to convert them back to pulse trains

    - Store the pulse trains "un-interleaved" - IE, instead of storing your data as ABCD EFGH ABCD EFGH ... where each letter is a bit for one of the motors, store a series of pulses like AAAA BBBB CCCC DDDD ... because they'll be more likely to compress really well with any of the above schemes.

    Storing raw pulse train data isn't really going to be workable. If you have one cog in the Prop decoding a compressed form, and pushing the results into an array of pulse data that is being read by another cog in the same chip, it'll work much more like most hobby-level Arduino or other GCode interpreters. TinyG, Grbl, and SmoothieBoard CNC controllers all work very much like this, as do things like RepRap 3D printers.
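
    As a rough illustration of the line-segment option, here is a minimal Bresenham-style sketch that expands one stored segment (dx, dy in steps) back into one nibble per tick. The bit layout (STEP_X, DIR_X, STEP_Y, DIR_Y) is an assumption made for this example and is not the nibble coding used on the actual machine.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed bit layout for this sketch only. */
enum { STEP_X = 1, DIR_X = 2, STEP_Y = 4, DIR_Y = 8 };

/* Expand a segment of dx by dy steps into one nibble per tick using
   integer error accumulators (Bresenham style). Writes at most `cap`
   nibbles to `out` and returns the number of ticks the segment needs,
   which is max(|dx|, |dy|). A DIR bit set means the negative direction. */
size_t segment_to_nibbles(long dx, long dy, uint8_t *out, size_t cap)
{
    long ax = labs(dx), ay = labs(dy);
    long major = ax > ay ? ax : ay;
    uint8_t dir = (uint8_t)((dx < 0 ? DIR_X : 0) | (dy < 0 ? DIR_Y : 0));
    long ex = major / 2, ey = major / 2;   /* start at half to centre the steps */

    for (long t = 0; t < major; t++) {
        uint8_t n = dir;
        ex += ax;
        if (ex >= major) { ex -= major; n |= STEP_X; }
        ey += ay;
        if (ey >= major) { ey -= major; n |= STEP_Y; }
        if ((size_t)t < cap) out[t] = n;
    }
    return (size_t)major;
}
```

    A segment then costs only two stored numbers regardless of its length, and the decoding cog regenerates the pulse train on the fly.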
  • Can you please share with me the .side file for your project?

    David...it won't be helpful since the current .side file is configured differently than the proposed new version. The current .side file computes the number of steps for the X-axis and the Y-axis. There is no decoding; the pulses go directly to two four-phase stepper motor algorithms to rotate the x and y stepper motors in the correct direction and the correct number of steps for each axis.

    The upgrade will utilize the concept that I have described earlier: the nibble contains the eight coded conditions for direction and step commands to the ClearPath steppers. In my tests, I output direction and step commands to my ClearPath stepper trying to get the stepper to reach its highest rpm (4,000 rpm). So this focuses on the software and hardware to output the stored nibbles at a rate near 500 kHz.

    Discovery
  • Discovery wrote: »
    Can you please share with me the .side file for your project?

    David...it won't be helpful since the current .side file is configured differently than the proposed new version. The current .side file computes the number of steps for the X-axis and the Y-axis. There is no decoding; the pulses go directly to two four-phase stepper motor algorithms to rotate the x and y stepper motors in the correct direction and the correct number of steps for each axis.

    The upgrade will utilize the concept that I have described earlier: the nibble contains the eight coded conditions for direction and step commands to the ClearPath steppers. In my tests, I output direction and step commands to my ClearPath stepper trying to get the stepper to reach its highest rpm (4,000 rpm). So this focuses on the software and hardware to output the stored nibbles at a rate near 500 kHz.

    Discovery

    First, all of the suggestions above are very good. I, too, would encourage you to consider them. But I'm always happy to help a new PropWare user.

    The .side file contains only the metadata used by SimpleIDE for compiling your project. You're absolutely right that I don't care about the code you're executing - assuming you know how to copy and paste ( :) ), I'm going to assume the code is correct. I need to see the .side file so that I can determine whether or not SimpleIDE was configured incorrectly or if PropWare may have been installed incorrectly or if there is a problem with the PropWare package.
  • @Discovery, sorry I wasn't able to try this out last night. Been banging my head trying to get SimpleIDE packaged for Linux and spent all night doing that. I'll give PropWare on Windows another try this weekend.
  • kwinn Posts: 8,697
    @Discovery

    After looking at what you want to do and the available memory technologies there seem to be only two choices that can provide the 46 Giga nibbles (23GB) your application requires.

    SDRAM can provide the capacity required, but it will need multiple chips and circuitry/software to refresh the memory, which makes it a fairly complicated choice.

    SD cards have the storage capacity and speed needed, and seem to be the better choice. There are disadvantages, but they can be overcome. By storing the data in blocks rather than using a file system, and storing one X and Y nibble per byte, they are more than fast enough to provide the data at 500KHz.
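
    Storing one X and one Y nibble per byte could look like the sketch below. Which axis goes in the high half is an arbitrary choice made for this example, and the function names are hypothetical.

```c
#include <stdint.h>

/* Pack the X-axis nibble into the high half of a byte and the Y-axis
   nibble into the low half. The layout is an assumption of this sketch. */
uint8_t pack_xy(uint8_t x, uint8_t y)
{
    return (uint8_t)(((x & 0x0F) << 4) | (y & 0x0F));
}

/* Recover the X-axis nibble from a packed byte. */
uint8_t unpack_x(uint8_t b)
{
    return (uint8_t)(b >> 4);
}

/* Recover the Y-axis nibble from a packed byte. */
uint8_t unpack_y(uint8_t b)
{
    return (uint8_t)(b & 0x0F);
}
```

    With this layout each byte is one output tick, so a 512-byte block holds 512 ticks for both axes.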
  • Discovery wrote: »
    In answer to Mickster...if you understood the design you would say okay.

    Hummm, no I wouldn't.

    Your performance spec is nothing special. You can do that with less data and more interpolation of the axes.

    In the early 90's we created this lathe control.

    Banging data at 1KHz was way overkill.

    Heck, paper-tape-fed machines could do what you are trying to do.

  • jmg Posts: 15,182
    kwinn wrote: »
    After looking at what you want to do and the available memory technologies there seem to be only two choices that can provide the 46 Giga nibbles (23GB) your application requires.


    I'm not even sure it does require 46GN ?

    Going back to simple lineal cuts, the claimed bed is
    The cutter can handle steel or aluminum raw stock 7 feet long by 3.5 feet wide with a precision of 0.0005"; therefore, the number of nibbles gets to be quite large.

    Quite large, yes, but large numbers still need a sanity check....

    The 8 pin 2Gb QuadSPI memory I referenced, can clock very simply, and it can store a shipload of lineal-feet of cuts...

    I get this number of cuts
    (2G/4)/(7*12*2000) = 2976.19 full length (7') cuts
    or, if those are evenly made, each of those ~ 3000 pcs, is 14 thou across.
    (2G/4)/(7*12*2000)/(3.5*12) = 70.86 slices per inch

    Nothing I've machined has come anywhere near 3000 x 7' cuts per sheet.
  • The current design data are the following:

    The ClearPath stepper motor takes 800 steps to make 1 revolution.
    The pitch of the threaded rod that moves the cutting head is 16 revolutions per inch.
    That makes 12,800 steps per inch.
    To move the cutting head 7 feet along the x-axis takes 1,075,200 steps.
    To move the cutting head 3.5 feet along the y-axis takes half the number of the x-axis or 537,600 steps.
    To cut a rectangle 7 feet by 3.5 feet therefore takes 3,225,600 steps.
    Each step represents one nibble of SD memory or 3,225,600 4-bit nibbles. Not much memory at all.
    The ClearPath stepper motor can turn at 4,000 revolutions per minute and thus cover 7 feet in 0.3360 minutes.
    To cut the complete rectangle takes 60.48 seconds or just over one minute.
    The precision of one step is 0.0013 inches.
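
    The step counts and timings in this list can be recomputed from the three inputs (800 steps per revolution, 16 revolutions per inch, 4,000 rpm). The sketch below does only that; the names are hypothetical. Note that 12,800 steps per inch works out to about 0.000078 inch per step, so `inches_per_step` returns that value rather than the 0.0013 inch quoted in the post.

```c
/* Inputs taken from the design data in the post. */
#define STEPS_PER_REV 800.0
#define REVS_PER_INCH 16.0
#define MAX_RPM       4000.0

/* Steps needed to move the cutting head one inch. */
double steps_per_inch(void) { return STEPS_PER_REV * REVS_PER_INCH; }

/* Steps needed to travel a distance given in feet. */
double steps_for_feet(double feet) { return feet * 12.0 * steps_per_inch(); }

/* Steps needed to trace the outline of a rectangle (all four sides). */
double perimeter_steps(double len_ft, double wid_ft)
{
    return 2.0 * (steps_for_feet(len_ft) + steps_for_feet(wid_ft));
}

/* Step rate when the motor turns at MAX_RPM. */
double steps_per_second(void) { return MAX_RPM * STEPS_PER_REV / 60.0; }

/* Time to issue a number of steps at the maximum step rate. */
double seconds_for_steps(double steps) { return steps / steps_per_second(); }

/* Linear travel of a single step, in inches. */
double inches_per_step(void) { return 1.0 / steps_per_inch(); }
```

    For the 7 ft by 3.5 ft rectangle, `perimeter_steps(7.0, 3.5)` gives 3,225,600 steps and `seconds_for_steps` of that value gives 60.48 seconds, matching the list.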

    I will take a look at the 8 pin 2Gb QuadSPI memory.

    Discovery
  • kwinn Posts: 8,697
    @jmg

    I'm just answering based on the specs that were posted by Discovery. I agree that 46Gnibbles is a lot larger than what is required to cut out a rectangle, but I can see how cutting a large sheet into a number of smaller complex shapes can require much more.

    @Discovery

    46Gnibbles will fit on a 32GB sd card, and by reading raw blocks rather than going through the software required for a file system, it can be read fast enough for your application. With one cog to read blocks from the sd card to hub memory and one or two cogs to output the data to the steppers there should be no problem getting the data out at 500KHz.
  • jmg Posts: 15,182
    edited 2017-03-11 05:44
    Discovery wrote: »
    .....
    That makes 12,800 steps per inch.
    ....
    The precision of one step is 0.0013 inches.
    If your precision is .0013, you do not need to be storing 12800 nibbles per inch.
    You only need around 769 nibbles per inch stored, so you can save 8~16x in storage.

    Run some of your files into a simple run-length compression, and see what results.
    Simplest is a nibble of count, and a nibble of Motor data.
    The count can be N for 2^N, or a table lookup, of 16 preloaded counts.
    A table needs a little more processing to decide on the best values. 2^N is very easy to code/decode.

    "There are specifically eight values that are decoded (2, 3, 8, 10, 11, 12, 14,and 15)."
    or you could look at using the spare nibble space, of 8 entries, for the repeat count.
    Because the values are unique, this encoding can do
    VVVVLVLVL
    V=value, L=count
    or even
    VVLVVLLVLL
    where double L's expand the compression length, exponent style.
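
    The simplest scheme mentioned above (a nibble of count plus a nibble of motor data, with the count meaning 2^N repeats) might be sketched as follows. Putting the count in the high nibble and splitting each run into powers of two, largest first, are choices made for this example only.

```c
#include <stddef.h>
#include <stdint.h>

/* Each encoded byte: high nibble N (repeat 2^N times), low nibble motor value.
   Encodes n input nibbles (one per input byte). Writes at most `cap` bytes
   to `out` and returns the number of bytes the encoding needs. */
size_t rle_encode(const uint8_t *in, size_t n, uint8_t *out, size_t cap)
{
    size_t o = 0, i = 0;
    while (i < n) {
        uint8_t v = in[i];
        size_t run = 1;
        while (i + run < n && in[i + run] == v) run++;
        i += run;
        /* Split the run into powers of two, largest first. */
        for (int e = 15; e >= 0; e--) {
            while (run >= ((size_t)1 << e)) {
                if (o < cap) out[o] = (uint8_t)((e << 4) | (v & 0x0F));
                o++;
                run -= (size_t)1 << e;
            }
        }
    }
    return o;
}

/* Expands encoded bytes back to one nibble per output byte. Writes at most
   `cap` nibbles and returns the number of nibbles the data expands to. */
size_t rle_decode(const uint8_t *in, size_t n, uint8_t *out, size_t cap)
{
    size_t o = 0;
    for (size_t i = 0; i < n; i++) {
        size_t count = (size_t)1 << (in[i] >> 4);
        for (size_t k = 0; k < count; k++, o++)
            if (o < cap) out[o] = (uint8_t)(in[i] & 0x0F);
    }
    return o;
}
```

    A run of five identical nibbles becomes two bytes (4 + 1), so how much this saves depends entirely on how long the runs in the real motor data are.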
  • jmg...Let's take one step at a time. I will store uncompressed motor control nibbles. If successful, I will consider compression methods.

    kwinn...How in the world would I write the code to do precisely what you say can be done. If someone can supply the "C" code in SimpleIDE so that data can be written into the SD memory on my Propeller Activity Board then read out by the Propeller to my oscilloscope for observation, that would be great.
    As I mentioned earlier, I wrote 3,000 "1s" and "0s" into the SD memory using the propeller on the Propeller Activity Board then read out on a propeller pin to the oscilloscope. I was expecting to see blocks of 512 "1s" and "0s" but observed four pieces of data: 1,0,1,0 with the rising edge spaced 250us and this pattern repeated every 1.6ms (or so) until all 3,000 data points were output. This will not do for my application.

    One step at a time please!

    Discovery
  • @Discovery - I think you are getting it wrong about what you "observe". The pattern repeated every 1.6ms should be another sector of 512 bytes but you may be too dependent upon your oscilloscope which may not be up to the task or set correctly anyway. Try a faster time base next time and you should see more detail.

    Here is a capture of a sector read. For almost the first half of that read the MISO channel is idle after a read sector command is issued until the data becomes ready, then the data is read in a burst. It doesn't matter that the data comes out this way, as long as it's buffered; then another cog can output uncompressed nibbles to your heart's delight while another sector is being read. You didn't say how long it took for 3,000 data points to be output, which I assume are streamed nibbles, not the raw bit stream from the card.

    [Image: sectorread.png]
  • Apples and oranges Peter.

    I ran the test setup again and found that the full read of 3,000 data points comes out in 1.75 seconds. As shown in my program below, I am monitoring the propeller signal output(2) to the stepper motor. The motor shaft turns for 1.75 seconds as the scope indicates. Output(3) is set to turn the motor counter clockwise. You must be looking at something else.

    #include "propeller.h"
    #include "stdio.h"
    #include "math.h"
    #include "stdlib.h"
    #include "simpletools.h" // Include simpletools header

    int val[3000];
    int i;
    int DO = 22, CLK = 23, DI = 24, CS = 25; // SD card pins on Propeller BOE
    int main(void) // main function
    {
      high(3); // Direction output: counter clockwise
      sd_mount(DO, CLK, DI, CS); // Mount SD card

      for(i=0; i<3000; i+=2) // Fill the array with alternating 1, 0
      {
        val[i]=1;
        val[i+1]=0;
      }
      FILE* fp = fopen("test.txt", "w"); // Open a file for writing

      print("Start Program.\n");
      for(i=0; i<3000; i++) // Write one 4-byte value per call
      {
        fwrite(&val[i], sizeof(val[i]), 1, fp);
      }
      fclose(fp);

      fp = fopen("test.txt", "r"); // Reopen the file for reading
      for(i=0; i<3000; i++) // Read one value per call and drive pin 2
      {
        fread(&val[i], 4, 1, fp);
        if(val[i]==1)
        {
          high(2);
        }
        else
        {
          low(2);
        }
      }
      fclose(fp);
      print("End Program.\n");
    }

    Discovery
  • kwinn Posts: 8,697
    edited 2017-03-11 20:50
    @Discovery

    Wish I could help you with the code but I have only just started on learning "C" for the propeller. All my projects to date have been in Spin/PASM.

    What I can say with certainty is that most current SD cards have more than enough speed to output data as fast as you need.

    This may be possible in "C", is probably doable in Tachyon, and can definitely be done in PASM with two cogs. The main thing is that the data needs to be fetched from the SD card in a block of 512 bytes, not one byte at a time as you are doing. There is much less overhead in the block read: the start address is sent to the SD once, and then 512 bytes are fetched, versus sending the address once for each byte.

    Cog1 would:
    -wait for buff1flag empty
    -send read commands and start address to SD
    -transfer 512 bytes from the sd to buffer 1
    -set buff1flag full
    -wait for buff2flag empty
    -send read commands and start address to SD
    -transfer 512 bytes from the sd to buffer 2
    -set buff2flag full
    loop

    Cog2 would:
    -wait for buff1flag full
    -output data to pins
    -set buff1flag empty
    -wait for buff2flag full
    -output data to pins
    -set buff2flag empty
    loop
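
    The two-cog outline above can be sketched in portable C as a pair of step functions sharing two buffers and two flags. This is only an illustration of the handshake: on the Propeller each side would run in its own cog and spin on its flag, and `demo_read_sector` and `demo_output` are hypothetical stand-ins for a real raw-sector read and for driving the motor pins.

```c
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512

typedef struct {
    uint8_t data[2][SECTOR_SIZE];
    volatile int full[2];   /* one flag per buffer: 0 = empty, 1 = full */
    int fill_idx;           /* buffer the reader fills next */
    int drain_idx;          /* buffer the output side drains next */
    uint32_t next_sector;   /* next SD sector to fetch */
} pingpong_t;

typedef void (*read_sector_fn)(uint32_t sector, uint8_t *dst);
typedef void (*output_fn)(const uint8_t *src, size_t len);

/* "Cog1": fill the next buffer if it is empty. Returns 1 if a sector was read. */
int reader_step(pingpong_t *p, read_sector_fn read_sector)
{
    if (p->full[p->fill_idx]) return 0;          /* still waiting for empty */
    read_sector(p->next_sector++, p->data[p->fill_idx]);
    p->full[p->fill_idx] = 1;                    /* set flag full */
    p->fill_idx ^= 1;                            /* alternate buffers */
    return 1;
}

/* "Cog2": output the next buffer if it is full. Returns 1 if one was output. */
int output_step(pingpong_t *p, output_fn output)
{
    if (!p->full[p->drain_idx]) return 0;        /* still waiting for full */
    output(p->data[p->drain_idx], SECTOR_SIZE);
    p->full[p->drain_idx] = 0;                   /* set flag empty */
    p->drain_idx ^= 1;
    return 1;
}

/* Demo stand-in: fills the buffer with the low byte of the sector number. */
void demo_read_sector(uint32_t sector, uint8_t *dst)
{
    for (size_t i = 0; i < SECTOR_SIZE; i++) dst[i] = (uint8_t)sector;
}

/* Demo stand-in: counts the bytes that would have gone to the pins. */
size_t demo_bytes_out = 0;
void demo_output(const uint8_t *src, size_t len)
{
    (void)src;
    demo_bytes_out += len;
}
```

    The reader can run at most two sectors ahead of the output side, which is what lets the output keep a steady rate while the next sector is being fetched.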
  • You most certainly do not want to use fopen(), because that will use the FAT16 or FAT32 filesystem for reading/writing data. This incurs a significant (and for you, unacceptable) overhead. You need to access the 512 byte blocks on the SD card directly.
  • David...How do I do that?
    Would it be an implementation of Kwinn's outline above?
    I could use some help coding Cog1 and Cog2.

    If this works...it will be excellent.

    Discovery
  • Oh my golly gosh, the one byte at a time method? No wonder you are having such poor results :) This is where it helps if you attach images of what you are seeing. Interestingly, though, it shouldn't be taking that long between bytes even with this method. Nonetheless, as pointed out, read in a whole sector. So while one cog is nibbling away at buffered sectors, another one is busy reading in sector after sector into alternate buffers, synchronizing with the nibbler cog by waiting for it to use up a sector buffer before refilling it. David will guide you on your way I'm sure.



    BTW, just mentioning that Tachyon uses virtual memory methods under the FAT32 layer, so you can ask for bytes using a 32-bit address offset into a file. If the sector isn't buffered then there is that 1.6ms latency as it is read in; otherwise it will return with the byte from the buffer immediately.
  • Some sample code would get me started.

    Discovery
  • I'm afraid I only know how to read sectors from an SD card via PropWare, not any other driver. That is the code I posted earlier in this thread, which you were unable to compile. I know, I know... you're still waiting for SimpleIDE help from me. I shall look into that now :)