wikipedia on a circuit board (in progress) — Parallax Forums

Broker Posts: 4
edited 2015-01-12 20:29 in Propeller 1
Has anyone put Wikipedia on a Propeller PCB? I downloaded the entire Wikipedia site to my laptop (250 GB to start) and have begun making a PCB that contains a propeller and a 1TB SD-card, onto which I can eventually put the site data.

My plan is to make an intelligent robot that can retrieve Wikipedia data on command, and eventually I would like it to get data by itself, but so far I am having difficulties just getting started.

How would I go about retrieving data quickly from the SD card? Would the data storage need to be arranged the way Google or MySQL does it? What do I need to learn?

Help and comments would be appreciated.

Comments

  • Heater. Posts: 21,230
    edited 2015-01-11 10:44
    You are not going to be retrieving data from an SD card with a Propeller quickly. It's going to be terribly slow. I'm not even sure we have a file system driver for the Prop that can handle a 1TB file system.

    "What do I need to learn?" - Get a Prop with an SD card interface. Or build one, as I did ages ago, using a DIP Prop on a breadboard. Learn to program in Spin. Spin is very easy and should not take too long. Use an SD driver from the Parallax OBEX and do some experiments reading and writing files on the SD card.

    It's all good fun but I'm not sure it will get you where you want to go. However, having learned all that you will very likely find other interesting ideas to pursue.

    By the way, how does Wikipedia store its data? I guess they have some database system, not just raw text files hanging around. The best way to mirror Wikipedia surely would be to use the same DB on a machine (or machines) that can run it.
  • kwinn Posts: 8,697
    edited 2015-01-12 06:48
    That's a huge undertaking, and setting up storage space for it is one of the smaller challenges. To get any reasonable speed using a propeller will require multiple propellers accessing multiple SD cards in parallel. Try what Heater suggested. Start with one propeller, SD card, and a portion of the database. See how Wikipedia stores and accesses that data, and if it is possible to split it into chunks and do the same with the propeller. Lots of fun and even if this idea proves to be impractical you will learn a lot.
  • David B Posts: 592
    edited 2015-01-12 09:29
    This would be a big project but I don't see any reason that it couldn't work.

    Learn about database indexing; that would be the number one thing to get right for this to have any chance of success, in my opinion. Plus get a lot of practical experience using a propeller and a few sd cards; that's the other number one thing.

    Once it's working, keeping the data updated with changes might be challenging, though.
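
A minimal sketch of the kind of index David B describes, assuming a sorted file of fixed-width records that map an article title to a byte offset and length in a separate data file. The record layout, `TITLE_LEN`, and the function names are hypothetical and chosen only for illustration; this is not how Wikipedia or Kiwix stores its data. The point is that a lookup reads one small record per probe, so it needs very little RAM.

```python
import io
import struct

TITLE_LEN = 32  # assumed fixed-width, zero-padded title key (hypothetical)
RECORD = struct.Struct(f"<{TITLE_LEN}sQI")  # title key, byte offset, byte length


def pad_key(title):
    """Encode a title as a fixed-width, zero-padded key."""
    return title.encode("utf-8")[:TITLE_LEN].ljust(TITLE_LEN, b"\0")


def build_index(articles):
    """articles: dict of title -> (offset, length). Returns index bytes sorted by key."""
    rows = sorted((pad_key(t), off, ln) for t, (off, ln) in articles.items())
    return b"".join(RECORD.pack(*row) for row in rows)


def lookup(index_file, title):
    """Binary search over fixed-size records, reading one record per probe."""
    key = pad_key(title)
    index_file.seek(0, io.SEEK_END)
    lo, hi = 0, index_file.tell() // RECORD.size
    while lo < hi:
        mid = (lo + hi) // 2
        index_file.seek(mid * RECORD.size)
        rec_key, offset, length = RECORD.unpack(index_file.read(RECORD.size))
        if rec_key == key:
            return offset, length
        if rec_key < key:
            lo = mid + 1
        else:
            hi = mid
    return None


# Usage: an in-memory file stands in for an index file on the SD card.
index = io.BytesIO(build_index({"Propeller": (100, 20), "Arduino": (0, 50)}))
print(lookup(index, "Propeller"))
```

Because each probe halves the search range, an index of a few million titles needs on the order of two dozen record reads per lookup. Titles that share the same first 32 bytes would collide in this sketch.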
  • Heater. Posts: 21,230
    edited 2015-01-12 10:18
    It could of course work, in the same way that it's possible to boot Linux on an 8-bit AVR chip, which takes three hours to get to a command prompt.

    This is totally impractical.

    However, I do urge Broker to get down with the Prop and an SD card and see what can be done.
  • David Carrier Posts: 294
    edited 2015-01-12 13:19
    I used Kiwix to access the English portion of Wikipedia on my phone, and it is compressed down to just over 40 GB, albeit with low-resolution images. It uses the open-source ZIM file format to reduce the file size. You may be able to write a parser to read the files in a Propeller microcontroller application.

    I also have a device called a WikiReader that I bought for around $15. It has a proprietary file format, but the software and hardware are both open source, so it wouldn't be too difficult to implement on a Propeller microcontroller system.
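
A rough sketch of the first step of such a parser: reading the fixed-size header at the start of a ZIM file. The field layout below (an 80-byte little-endian header beginning with the magic number 72173914) follows the openZIM format description as best understood here and should be checked against the current specification before relying on it. The field names and `parse_zim_header` are hypothetical. This reads only the header; the article data sits in compressed clusters that would still have to be decompressed.

```python
import struct

ZIM_MAGIC = 72173914  # 0x044D495A, stored little-endian as b"ZIM\x04"

# Assumed layout: magic, major/minor version, UUID, entry and cluster counts,
# four 64-bit file positions, main page, layout page, checksum position.
HEADER = struct.Struct("<IHH16sIIQQQQIIQ")

FIELDS = (
    "magic", "major_version", "minor_version", "uuid",
    "entry_count", "cluster_count",
    "url_ptr_pos", "title_ptr_pos", "cluster_ptr_pos", "mime_list_pos",
    "main_page", "layout_page", "checksum_pos",
)


def parse_zim_header(data):
    """Unpack the leading header bytes of a ZIM file into a dict."""
    if len(data) < HEADER.size:
        raise ValueError(f"need at least {HEADER.size} bytes")
    header = dict(zip(FIELDS, HEADER.unpack_from(data)))
    if header["magic"] != ZIM_MAGIC:
        raise ValueError("not a ZIM file (bad magic number)")
    return header


# Usage with a synthetic header; a real file would be opened in binary mode
# and its first HEADER.size bytes passed in instead.
raw = HEADER.pack(ZIM_MAGIC, 5, 0, bytes(16), 3, 1, 80, 104, 128, 200, 0, 0xFFFFFFFF, 300)
print(parse_zim_header(raw)["entry_count"])
```

On a microcontroller the same idea applies: read the first block of the file, pick out the pointer-list positions, and seek from there rather than loading anything large into RAM.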
  • Heater. Posts: 21,230
    edited 2015-01-12 13:32
    The ZIM format requires that you implement the LZMA algorithm and whatever else.

    Assuming you have bolted the required amount of memory and file system storage to your Propeller to do all that, plus run an OS to support it all, it's still going to take forever.

    Perhaps the best way to do this is to write a MIPS emulator for the Prop, which can be used to run Linux, which can then do the rest of the job.

    If you have time to wait....
  • Dave Hein Posts: 6,347
    edited 2015-01-12 14:46
    Where do you get a 1 TB SD card?
    How much does a 1 TB SD card cost?
  • localroger Posts: 3,451
    edited 2015-01-12 19:33
    I have a WikiReader, which is a very cool device. The Propeller is seriously RAM bound for a project of this sort, unfortunately. The WikiReader can be programmed in Forth, and I modified mine to add external access to the serial interface; I actually use my Parallax PropPlug to communicate with it. I had thought it would make a nifty ultra-portable low-power word processor, and it probably would, but documentation on its Forth dialect is nil and I never had time to get into it. It is a slick, very low power device (runs for 100+ hours on 2 AAA batteries). And the uSD card can be changed out both to upgrade the Wiki database and to repurpose the hardware. For what you want to do I'd be looking into one of them.
  • Peter Jakacki Posts: 10,193
    edited 2015-01-12 19:58
    I've been treating this whole thread as a "dunny dream" and a few of you locals will know what I mean. There is nothing practical or realistic about it other than a learning exercise in finding out that there is nothing practical or realistic about it. Even if you said you were going to use an RPi which certainly has a lot more memory and computing power, I would still say it's impractical. Why?

    You said, "contain a propeller and a 1TB SD-card". Well that made me laugh as it is just possible to buy 512GB SD cards for around $600 or so, but not 1TB. Then you mentioned a propeller which we are all great fans of but it is not the chip to use for accessing huge amounts of data. Now talking about huge amounts of data, what use is it unless it can use it in an intelligent manner, otherwise it is nothing more than a reader where you have to type in what you are looking for. Then you are making an intelligent robot that can retrieve wikipedia data on "command" etc...

    I wish you all the best in just getting a 1TB SD card. Once they are available and you have saved up for that then you can proceed to the next step, which BTW, would NOT be a Propeller of any kind. Perhaps a Propeller can run the robot systems, but it is very ill suited to database processing, especially huge databases.

    But by all means get a Propeller with an SD card and play, it's fun.
  • Broker Posts: 4
    edited 2015-01-12 20:29
    I've read all your replies and am currently processing it all; please allow me time to respond. I will state that I am very familiar with Spin and the Propeller, but some peripherals are somewhat foggy to me, e.g. the SD card.

    Please be patient.
  • Broker Posts: 4
    edited 2016-09-13 15:42
    Peter -> 1Tb sd-cards are available and fairly cheep - currently under 40 american dollars.

    I still want to place wikipedia on an sd-card and use a propeller to access it.

    I've installed the large Wikipedia version using Kiwix. It's fantastic.
    Now I need to somehow hack or reverse engineer some code, perhaps to bypass Kiwix.
  • Is that 1 Terabit (128 Gigabytes)?
  • Wot!? Almost two years since you posted this and even if these 1TB cards were really available now which they aren't then I sincerely doubt your claim that they are "cheep", and the "under 40 american dollars" you quote is not a genuine price for a genuine product. You go buy cheep cheep, and you can go bleep bleep, because that one I saw listed shows a no name brand "microSD" and that ain't right either since Sandisk's highest capacity full sized SD is 512GB, which is impressive. Try again in another couple of years to see if the price has come down to $40, I think not even then.

    btw, kiwix wikipedia is around 61GB which means it is heavily compressed and to decompress on the fly requires both high processing speed and lots of RAM. This is not a job for a microcontroller like the Propeller.

    Now you need to somehow work out how you can reverse engineer your own claims to better understand why you need to talk about the fact that you can't do what you would like to do that can't be done the way you would like it :) In other words, examine your motives and find some more immediate and realistic goals that you can set for yourself and work from there.
  • kwinn Posts: 8,697
    Seairth wrote: »
    Is that 1 Terabit (128 Gigabytes)?

    Most likely so. I have seen 128GB on the store shelf, 512GB, and 1TB (1024GB) on the internet. The 1TB cards were listed for $39.99, in "no name" packaging, which makes me somewhat suspicious.

    On top of that, I think trying to store the entire wiki on a single card and accessing the data via a single Propeller or other CPU would be dreadfully slow. Better to use multiple cards and CPUs/cogs and access the cards in parallel.
  • 1TB SD cards are bogus. No one makes/sells one that size.

    You can find them on eBay and Amazon. Read their reviews.
    One on eBay advertises 90MB/sec data transfer.
    At that speed it would take 131 DAYS to fill the 1TB...
  • kwinn Posts: 8,697
    DaveJenson wrote: »
    1TB SD cards are bogus. No one makes/sells one that size.

    You can find them on eBay and Amazon. Read their reviews.
    One on eBay advertises 90MB/sec data transfer.
    At that speed it would take 131 DAYS to fill the 1TB...

    Yep, which is why I am somewhat suspicious of them. If I were to attempt such a large data storage and retrieval task using propellers (highly unlikely) it would be with multiple smaller cards and at least one cog per card for reading and one for searching.
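
One possible way to split the data across several smaller cards, as kwinn suggests, is to pick the card from a stable hash of the article title, so a lookup only has to touch one card. The sketch below is an assumption for illustration only: the CRC-32 choice, `card_for_title`, and `split_titles` are hypothetical names, and nothing here models the cogs doing the reading or searching.

```python
import zlib


def card_for_title(title, num_cards):
    """Map a title to a card number with a stable hash (CRC-32, chosen arbitrarily)."""
    return zlib.crc32(title.encode("utf-8")) % num_cards


def split_titles(titles, num_cards):
    """Group titles by the card that would hold them."""
    cards = [[] for _ in range(num_cards)]
    for title in titles:
        cards[card_for_title(title, num_cards)].append(title)
    return cards


# Usage: distribute a handful of titles over four cards.
for number, held in enumerate(split_titles(["Propeller", "SD card", "Wikipedia", "Forth"], 4)):
    print(number, held)
```

A full-text search would still have to visit every card, which is where one searching cog per card would come in.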
  • altosack Posts: 132
    edited 2016-09-14 19:19
    DaveJenson wrote: »
    1TB SD cards are bogus. No one makes/sells one that size.

    True.

    DaveJenson wrote: »
    ...90MB/sec data transfer. At that speed it would take 131 DAYS to fill the 1TB...

    Off by a factor of 1000; it would take a little more than 3 hours. But since the real speed of the P1 writing to an SD card, especially if you include *getting* the data from somewhere, is about 1000 times less than that, it could be pretty close for that non-useful use case.
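
The arithmetic in this exchange can be checked with a few lines. This sketch assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and takes altosack's "about 1000 times less" literally as 90 kB/s for the P1, which is an illustrative figure, not a measurement. It also works out Seairth's terabit question using binary units.

```python
TB = 10**12  # decimal terabyte, in bytes
MB = 10**6   # decimal megabyte, in bytes


def fill_time_seconds(capacity_bytes, rate_bytes_per_s):
    """Time to write capacity_bytes at a constant rate."""
    return capacity_bytes / rate_bytes_per_s


card_claim = fill_time_seconds(1 * TB, 90 * MB)       # advertised 90 MB/s
p1_guess = fill_time_seconds(1 * TB, 90 * MB / 1000)  # assumed 90 kB/s on a P1

print(f"{card_claim / 3600:.2f} hours")  # 3.09 hours
print(f"{p1_guess / 86400:.1f} days")    # 128.6 days

# One terabit in binary units: 2**40 bits / 8 bits per byte / 2**30 bytes per GiB
terabit_in_gib = 2**40 // 8 // 2**30
print(terabit_in_gib, "GiB")             # 128 GiB
```

So the 131-day figure is roughly what the fill would take at the assumed P1 rate, not at the card's advertised rate.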
  • Even if you could get it all on an SD card, there's still the problem of indexing in a way that the info becomes useful. Plus, much of Wikipedia's "data" relies on the included graphics. How will those be accessed and displayed?

    -Phil
  • Maybe Wikipedia on a Raspberry Pi.
    https://akhenakh.github.io/gozim/