How fast could a Prop chip parse and extract data from an ASCII text file that is 5000 rows and 10 columns? I could break the file down into small pieces and parse it with multiple cogs to speed things up. Thanks.
I need to be able to read my file multiple times a second. So, in theory, with say five cogs running it, I could read 50,000 entries (5000 rows by 10 columns) five times a second? That would be awesome.
pbhuter: "I need to be able to read my file multiple times a second."
and "...ASCII text file with 50,000 or so entries" and "...five times a second"
This makes no sense to me. Why not just parse the ASCII text, whatever numbers you have in columns and rows, just once?
Perhaps you have to initially parse it into binary data written out to a second file. Then read that binary data back into memory and use it as is.
However, 50 thousand numbers won't fit into 32K RAM.
So you may end up repeatedly reading that binary into RAM in smaller chunks and using it for whatever purpose. As binary data, 5 times per second straight off the file, it may well be doable.
If not, get an external RAM onto your Prop and pull the data in from there.
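The parse-once-to-binary step suggested above could look something like this in C. The filenames, the fixed 10-column layout, and plain-integer columns are assumptions for illustration; on a real Prop this would go through an SD card driver rather than stdio:

```c
/* Sketch: one-time conversion of an ASCII table to packed binary.
   Assumes 10 whitespace-separated integer columns per row; the
   filenames are hypothetical. */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

#define COLS 10

/* Converts txt_path to bin_path; returns the number of rows, or -1. */
int convert(const char *txt_path, const char *bin_path)
{
    FILE *in = fopen(txt_path, "r");
    FILE *out = fopen(bin_path, "wb");
    if (!in || !out) return -1;

    char line[256];
    int rows = 0;
    while (fgets(line, sizeof line, in)) {
        int32_t vals[COLS];
        char *p = line;
        for (int c = 0; c < COLS; c++)
            vals[c] = (int32_t)strtol(p, &p, 10); /* parse next column */
        fwrite(vals, sizeof vals, 1, out);        /* 40 bytes per row  */
        rows++;
    }
    fclose(in);
    fclose(out);
    return rows;
}
```

After this, each row is a fixed 40-byte record, so any row can be seeked to directly instead of re-parsing text.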
I will have some values from a source that will (hopefully) also show up in the file. I need to find the place where those values show up and extract additional information from the file.
I know I'll need external memory for the file. The data source (mentioned in my post a minute ago) will refresh at 5Hz or so with new information to be found in the file.
Hmm...So we are not really parsing a file here but rather a continuous stream of data coming from somewhere.
So where is it coming from, how is it being delivered, and how fast can that be?
I don't see any mention of your data source here.
Thing is, if it's a stream from which you have to extract some stuff, then there is no "file" anywhere. What I mean is there is no need to have it all stored anywhere for any time. Just let it pass by, pull out what you want, and discard the original stream as you go.
I want to go through the ASCII file until I find a line or block of data that matches the data from my data source. Once that line or block of data is found, I need to read additional information from the ASCII file associated with that line or block of data.
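The lookup described here can be sketched as a line-by-line scan. The file format and the idea that each line starts with a searchable key are assumptions for illustration:

```c
/* Sketch: scan a file line by line; when a line starts with the key
   from the live data source, return the rest of that line (the
   "additional information"). Format and key layout are assumed. */
#include <stdio.h>
#include <string.h>

/* Returns 1 and copies the remainder of the matching line into out,
   or 0 if no line starts with key. */
int find_line(const char *path, const char *key, char *out, size_t outsz)
{
    FILE *f = fopen(path, "r");
    if (!f) return 0;

    char line[256];
    size_t klen = strlen(key);
    int found = 0;
    while (fgets(line, sizeof line, f)) {
        if (strncmp(line, key, klen) == 0) {
            line[strcspn(line, "\n")] = '\0'; /* strip trailing newline */
            strncpy(out, line + klen, outsz - 1);
            out[outsz - 1] = '\0';
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}
```

A linear scan like this is the simplest version; as discussed further down the thread, a pre-sorted table with a binary search is the faster approach when the lookup must happen several times a second.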
ARM doesn't make chips, they only license the cores. Companies like NXP actually make the chips. Have a look at the LPC1768, it has 64k SRAM on-chip which should be able to hold your array. You can buy a little development board for it for under $30.
RE: Attached is a txt file with some data. Could you parse it and find the very last entry (last row, last column)? Thanks.
Good and bad news.
Good news: yes, it can be done on a Propeller. See screen grab. This is your 12k text file. Downloaded via xmodem to the Propeller, then run a small program, and it prints out your last row, last column entry. I'd prefer C or Sbasic, but Mbasic does have a rather nice command for finding the end of files.
Bad news: it took 5 seconds to read the file off the SD card.
But it is possible to do, slowly. I think it would be much, much faster using Kyedos and some custom Spin code, plus that would not involve external memory.
I don't think that is what he really wants to do, though:
"I want to go through the ASCII file until I find a line or block of data that matches the data from my data source. Once that line or block of data is found, I need to read additional information from the ASCII file associated with that line or block of data."
I misunderstood the requirement, I guess. Just reading the last column of the last row should be no slower than reading the file, which is apparently quite slow.
As far as parsing goes, which is probably the more relevant performance measure, I created a test to find the last column of data on EACH row and measured how long that takes. Parsing each item up to the last row gives the worst case.
I get 1.38s for Spin to find the last column on each of 250 rows at 80MHz.
So parsing 50,000 such records in Spin would take > 4.6 minutes at minimum, and that's without reading from the SD card. Written in PASM, you might knock that down to maybe 15 seconds, since Spin is about 20 times slower than PASM, but you still have the SD card overhead.
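Those back-of-envelope numbers, spelled out: the measured 1.38 s for 250 rows extrapolates linearly to 50,000 rows, and the rough 20x Spin-to-PASM speedup divides that down. SD card time is excluded, as noted above.

```c
/* Extrapolate the measured Spin parsing time (1.38 s for 250 rows). */
double spin_seconds(long rows)
{
    return rows * (1.38 / 250.0);  /* 5.52 ms per row */
}

/* Rough PASM estimate, assuming Spin is about 20x slower than PASM. */
double pasm_seconds(long rows)
{
    return spin_seconds(rows) / 20.0;
}
```

For 50,000 rows this gives 276 s (about 4.6 minutes) in Spin and about 13.8 s in PASM, matching the figures quoted above.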
That is still not what the OP wants. As I understand his requirement he wants to search the file for text in input data, and when a match is found, output the matched text and some of the subsequent records.
So what we have here is a big table containing value sets An, Bn, Cn, Dn, En..., where "n" runs from 0 to some number in the tens of thousands.
Then we have incoming data X, Y.
We want to match X, Y with something in the table, say A and B.
Then we want to return the corresponding C, D, E... for that A, B.
Does that sound correct?
If so, the table should be pre-parsed only once and held as binary numbers somewhere. The incoming data chunks are small and can be parsed very quickly as they arrive. You probably want to store the numbers as scaled integers rather than floating point, as it looks like they only have a couple of decimal places.
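The scaled-integer idea can be sketched like so: a value such as "12.34" becomes the integer 1234 (hundredths), so no floating point is needed on the micro. The two-decimal-place assumption comes straight from the observation above:

```c
/* Sketch: parse a number with up to two decimal places into a scaled
   integer (hundredths), e.g. "12.34" -> 1234. Assumes the values in
   the table never need more than two decimal places. */
#include <stdint.h>
#include <stdlib.h>

int32_t parse_scaled(const char *s)
{
    char *p;
    int32_t whole = (int32_t)strtol(s, &p, 10);
    int32_t frac = 0;
    if (*p == '.' && p[1] >= '0' && p[1] <= '9') {
        frac = (p[1] - '0') * 10;       /* tenths */
        if (p[2] >= '0' && p[2] <= '9')
            frac += p[2] - '0';         /* hundredths */
    }
    /* Apply the sign of the whole part; checking s[0] catches "-0.x". */
    return (whole < 0 || s[0] == '-') ? whole * 100 - frac
                                      : whole * 100 + frac;
}
```

Stored this way, every value is a fixed-size 32-bit integer, which also makes the binary table records a predictable size.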
So basically we have the problem of searching for the X, Y pairs in the table on each new input. Searching linearly through the table on each new input will be slow, so something smarter should be done. Basically, the incoming X, Y value is a key for which we have to find the corresponding value in the table.
A simple approach would be to sort the table at start-up in increasing X, Y order. Then, for each incoming key value, a binary search on the table will quickly find the value you want. Given that the table is now basically arrays of values, a binary search is easy to do.
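A minimal sketch of that binary search, with the record layout (two key columns, three payload columns) assumed for illustration:

```c
/* Sketch: binary search over a table pre-sorted by (x, y). The
   record layout is an assumption; the point is that a 50,000-row
   table needs only ~16 comparisons per lookup instead of 50,000. */
#include <stdint.h>

typedef struct {
    int32_t x, y;     /* key columns (A, B in the discussion)   */
    int32_t c, d, e;  /* payload columns (C, D, E)              */
} record_t;

/* Compare (x, y) pairs lexicographically. */
static int key_cmp(int32_t x1, int32_t y1, int32_t x2, int32_t y2)
{
    if (x1 != x2) return x1 < x2 ? -1 : 1;
    if (y1 != y2) return y1 < y2 ? -1 : 1;
    return 0;
}

/* Return the index of the matching record, or -1 if absent. */
int find_record(const record_t *tab, int n, int32_t x, int32_t y)
{
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        int c = key_cmp(x, y, tab[mid].x, tab[mid].y);
        if (c == 0) return mid;
        if (c < 0) hi = mid - 1;
        else       lo = mid + 1;
    }
    return -1;
}
```

With the table in external RAM, `tab[mid]` would become a read through the external-memory driver, but the access pattern stays the same.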
Problem: the table still does not fit in HUB RAM. But it could be held in external RAM and the binary search done on it there. I think there is plenty of speed for that.
However, in the absence of other "real-time" tasks in this application, perhaps a different processor/microcontroller with enough RAM or FLASH to hold the table would be a simpler approach.
Heater, I think you hit on exactly what I am trying to do. I downloaded a complete file with a couple hundred thousand entries (which I'll probably cut down some), and the file size was about 256 kB, so I know I'll need to keep it stored in external memory somewhere. I'm going to look into other chips, though. The ARM processors Leon mentioned may do the trick. Thanks.
Does anyone have an idea of how long it will take to parse an ASCII text file with 50,000 or so entries? Thanks.