Forum Archive. Coming Soon. Need Programmer Help/Suggestions - Parallax Forums

mctriviamctrivia Posts: 3,772
edited 2009-09-19 20:53 in Propeller 1
I have gotten the go-ahead from Parallax to download the entire forum. I will be doing this January 1st, 2010.

I will be compiling the data into a DVD or multiple DVD images that will be available for free download.

I can easily write the spider to download all the text and files publicly viewable on the forum. The question is: what is the best format to store them in on the DVD?

HTML would make finding what you want difficult. There will need to be some kind of search method.

Anyone up to writing a program to put on the DVD to display, sort, and search the large amount of data there will be?

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
propmod_us and propmod_1x1 are in stock. Only $30. PCB available for $5

Want to make projects and have Gadget Gangster sell them for you? propmod-us_ps_sd and propmod-1x1 are now available for use in your Gadget Gangster Projects.

Need to upload large images or movies for use in the forum? You can do so at uploader.propmodule.com for free.

Comments

  • Oldbitcollector (Jeff)Oldbitcollector (Jeff) Posts: 8,091
    edited 2009-09-15 20:23
    Is it possible for you to store them in a form as close to browser-compatible as possible?
    With attached files?

    I'd consider .torrent transfer for this, at least initially to save on bandwidth. (I'll seed)

    OBC

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    New to the Propeller?

    Visit the: The Propeller Pages @ Warranty Void.
  • Mike GreenMike Green Posts: 23,101
    edited 2009-09-15 20:27
    I'd suggest some kind of RTF formatting. My main concern is that I have a Mac and a number of people use Linux and a search program is likely to be written for only Windows users. There are cross-platform database engines (like OpenBase) that are available for MacOS, Windows, and Linux. Even if the user interface is written for Windows, it could be redone later in a cross-platform way. For example, there's a RealBasic interface and RealBasic programs can be compiled for all three operating systems.
  • Bill HenningBill Henning Posts: 6,445
    edited 2009-09-15 20:30
    Perhaps use wget on a Unix box to fetch the whole tree, and maybe index it with Google Desktop? (Not sure the indexing would work.)

    Excellent idea btw - it will let me take the forum with me on future cruises!

    (They charge $55 US for 100 on NCL)
    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Please use mikronauts _at_ gmail _dot_ com to contact me off-forum, my PM is almost totally full
    Morpheus & Mem+dual Prop SBC w/ 512KB kit $119.95, 2MB memory IO board kit $89.95, both kits $189.95
    www.mikronauts.com - my site 6.250MHz custom Crystals for running Propellers at 100MHz
    Las - Large model assembler for the Propeller Largos - a feature full nano operating system for the Propeller
  • mctriviamctrivia Posts: 3,772
    edited 2009-09-15 20:30
    I was thinking that, if there is a way to automate the conversion, PDF with a keyword database would be the best way to go. Could use HTML also.


    I definitely plan to download all attachments.

    The DVDs will definitely be available via torrent and my web site. I am not worried about bandwidth, but torrent will be faster for sending to many people.

  • mctriviamctrivia Posts: 3,772
    edited 2009-09-15 20:36
    I will be programming my web server to do the download. I have multi-gig internet access.

  • CounterRotatingPropsCounterRotatingProps Posts: 1,132
    edited 2009-09-15 20:41
    Please PLEASE no PDFs unless you can make *each* thread a separate PDF.

    ( Pretty Demented Format )

    The only advantage to this format is that it's hard to search, hard to (really) use, and dang slow. Making the threads available for the slowly growing number of folks using E-readers is the *only* reason I can conjure up for PDFs.

    Stick with HTML.

    Slurp the whole site into a mambo PERL routine and spit HTML out the other end...

  • Beau SchwabeBeau Schwabe Posts: 6,568
    edited 2009-09-15 20:42
    mctrivia,

    Will there be any "clean up" efforts? ... i.e. there are some images saved as 24-bit color, when only two colors are used in the drawing.

    Depending on how the image data is referenced, you could use a batch image converter to save a huge amount of space.
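
A rough sketch of the check a batch converter might run before re-saving an image, assuming the pixels have already been decoded into (red, green, blue) tuples. The function names are hypothetical, and the saving estimate ignores palette overhead and file compression, so treat it as an upper bound on raw pixel storage only.

```python
from typing import Iterable, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0-255


def min_bits_per_pixel(pixels: Iterable[Pixel]) -> int:
    """Return the smallest common palette depth (1, 2, 4 or 8 bits) that can
    hold every distinct color, or 24 if more than 256 colors are present."""
    distinct = set(pixels)
    for bits in (1, 2, 4, 8):
        if len(distinct) <= 2 ** bits:
            return bits
    return 24


def estimated_saving(bits: int) -> float:
    """Fraction of raw pixel storage saved compared with 24 bits per pixel."""
    return 1 - bits / 24


# A hypothetical two-color line drawing that was saved as 24-bit color.
drawing = [(255, 255, 255)] * 900 + [(0, 0, 0)] * 100
bits = min_bits_per_pixel(drawing)
saving = estimated_saving(bits)
```

For the two-color drawing this picks a 1-bit palette, which is the case described above.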

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Beau Schwabe

    IC Layout Engineer
    Parallax, Inc.
  • mctriviamctrivia Posts: 3,772
    edited 2009-09-15 20:57
    Yes, any BMP images will be converted to PNG.

    I can save all the data in any format.

    HTML would definitely make it so any OS can read it.

    It is still relatively easy to render or link to from a search program.

  • PavelPavel Posts: 43
    edited 2009-09-15 21:32
    The forum posts are most likely stored in a MySQL or Postgres database, so instead of spidering the whole forum, I'd consider tapping directly into the underlying database and obtaining a copy of the data directly from there. The data could then be stored on the DVD in the same (or similar) database format and a search/display client program (also distributed on the DVD) could be used to search and display the posts. The advantage of this approach is that the data are in database format (which makes them easy and fast to search/index and preserves the original data structure) and the search/display program can be updated at any time to make the searches more accurate or faster or the display of posts more coherent/pretty. You could also provide the ability for the user to download the new posts and add them to their local copy (because if you spider the whole forum on a certain date, by the time you have the DVD made, the forum itself has many new posts). Different search/display clients can be written for different platforms to maximize the effectiveness/quality, or a multiplatform approach (wxWidgets, Java, scripting languages) can be used to minimize the amount of code to write.

    If you flatten the posts into some output format (HTML, PDF, DOC), you may encounter undue difficulties trying to retrieve the original information from the posts, as they are now flattened and some of the original data-structure might have been lost.
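
A minimal sketch of the database-on-disc idea, with assumptions stated up front: it swaps in SQLite (through Python's standard sqlite3 module) for MySQL or Postgres because a single file needs no server, and the table names, columns and sample rows are all made up for illustration. The real forum schema is not known here.

```python
import sqlite3

# In-memory for the sketch; on a disc this would be a path to a database file.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE threads (id INTEGER PRIMARY KEY, forum TEXT, title TEXT);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        thread_id INTEGER REFERENCES threads(id),
        author TEXT,
        posted TEXT,   -- ISO 8601 timestamp, so text order is time order
        body TEXT
    );
    CREATE INDEX posts_by_thread ON posts(thread_id);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO threads VALUES (1, 'Propeller 1', 'Forum Archive')")
conn.executemany(
    "INSERT INTO posts (thread_id, author, posted, body) VALUES (?, ?, ?, ?)",
    [
        (1, "alice", "2009-09-15T20:23", "Store them as HTML with attachments."),
        (1, "bob", "2009-09-15T21:32", "Keep the original database structure."),
    ],
)


def search_posts(connection, term):
    """Return (thread title, author, body) for posts whose body contains term.

    LIKE is case-insensitive for ASCII in SQLite; '%' and '_' in term are
    treated as wildcards, which a real client would want to escape.
    """
    return connection.execute(
        "SELECT t.title, p.author, p.body FROM posts p "
        "JOIN threads t ON t.id = p.thread_id "
        "WHERE p.body LIKE ? ORDER BY p.posted",
        (f"%{term}%",),
    ).fetchall()
```

Because thread, author and date stay in separate columns, a display client can sort or filter on them later without re-parsing flattened pages.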
  • CounterRotatingPropsCounterRotatingProps Posts: 1,132
    edited 2009-09-15 21:44
    I agree with Pavel's analysis - what kind of DB is on the back end?

  • mctriviamctrivia Posts: 3,772
    edited 2009-09-15 21:46
    I can easily generate MySQL.

    And I will ask Parallax if they would consider letting me have a direct dump of the appropriate tables.

  • CounterRotatingPropsCounterRotatingProps Posts: 1,132
    edited 2009-09-15 21:49
    Is it MySQL?

    That would make sense, as the beasty was built by 'dotNetBB', and SQL Server usually is a tad more.

    [EDIT]

    Actually, info via the link at the page bottom indicates SQL Server 2000.

    You'd need an ODBC connection if you really want to do this easily... but that's somewhat of a security hole.

    Post Edited (CounterRotatingProps) : 9/15/2009 9:55:36 PM GMT
  • CounterRotatingPropsCounterRotatingProps Posts: 1,132
    edited 2009-09-15 21:51
    Well, the question that's probably on many lips:

    Is this also a preliminary step to replacing the forum software?

  • Ken PetersonKen Peterson Posts: 806
    edited 2009-09-15 21:54
    I concur. Getting the tables directly is your best bet. I have done quite a bit of web / database programming, but unfortunately I just don't have time for such a project.

    One thought: first, generate an index. Then provide the index in something that can be used in Javascript. Do the UI and search functions in Javascript and then nothing needs to be installed from the DVD.
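
One way the index step could look, as a sketch only: a word-to-thread map is built once when the disc is made and written out as a small script file, so a static page can load it with a script tag and do the searching in JavaScript. The file paths, the variable name SEARCH_INDEX and the tokenizing rule are assumptions for illustration.

```python
import json
import re
from collections import defaultdict


def build_index(threads):
    """Map each lowercase word to the sorted list of thread files containing it."""
    index = defaultdict(set)
    for filename, text in threads.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(filename)
    return {word: sorted(files) for word, files in sorted(index.items())}


def to_javascript(index, variable="SEARCH_INDEX"):
    """Render the index as a script that defines one global variable."""
    return f"var {variable} = {json.dumps(index)};\n"


# Hypothetical thread files and their plain-text contents.
threads = {
    "propeller/thread_0001.html": "Forum archive on DVD",
    "propeller/thread_0002.html": "Batch image converter for the archive",
}
index = build_index(threads)
script = to_javascript(index)
```

Emitting a script rather than a bare JSON file is a deliberate choice here: pages opened straight from a disc often cannot fetch other local files, but they can include them with a script tag.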

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    "I have not failed. I've just found 10,000 ways that won't work. "
    - Thomas A. Edison
  • CounterRotatingPropsCounterRotatingProps Posts: 1,132
    edited 2009-09-15 22:00
    Yes,

    this way you can make a simple, web like HTML entry point.

    (I'm in a similar situation as Ken, but can at least offer ideas.)

  • BradCBradC Posts: 2,601
    edited 2009-09-15 23:30
    Leave the format as HTML; then it can be easily searched by any text-based search tool of the user's choice (grep!)
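
For anyone without grep to hand, roughly the same search can be sketched in a few lines of Python. The function name grep_tree is made up, and the sketch assumes the archive is a folder tree of .html files.

```python
import re
from pathlib import Path


def grep_tree(root, pattern, ignore_case=True):
    """Yield (path, line number, line) for each matching line under root."""
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    for path in sorted(Path(root).rglob("*.html")):
        # errors="replace" keeps the search going past badly encoded bytes.
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    yield path, number, line.rstrip("\n")
```

Like grep, it matches raw lines, so a search term can also hit text inside HTML tags.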

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    It's not particularly silly, is it?
  • CounterRotatingPropsCounterRotatingProps Posts: 1,132
    edited 2009-09-15 23:36
    Ah, the humble grep. :)

  • mctriviamctrivia Posts: 3,772
    edited 2009-09-15 23:36
    It would be multiple emails. How many programs will give useful results from 10,000 HTML files? It would be easiest to save as 1 HTML file per thread.

  • edited 2009-09-15 23:43
    My suggestion would be to back it up on a solid state disk, because I have CDs that have gone bad and I have a DVD that won't play because of "wobble". You might want to have several means of backup.
  • CounterRotatingPropsCounterRotatingProps Posts: 1,132
    edited 2009-09-15 23:55
    mctrivia said...
    It would be multiple emails.

    Ah, *what* would be multiple emails ?

    The data Parallax is sending you, or the search fetch results back to the user?

    > How many programs will give useful results from 10000 html files. Would be easiest to save as 1 html per thread

    Not sure where you're headed there, Matt, but there's a reason why this thing is in a DB in the first place... I'd say the easiest way to do this is to forget the DVD idea and just set it up as a static image on your server.

    Since you have access to a server, can you put SQL Server on it? If you can't (due to $), a SQL Server database can usually be converted to MySQL.

    Either way, the simplest thing: just snapshot the existing tables from the Parallax DB and replicate the database on your server, then make a simple web front-end to it. Let the DB do the indexing, searching, etc.

    This could be coded, moved, and launched in a matter of hours.

  • CounterRotatingPropsCounterRotatingProps Posts: 1,132
    edited 2009-09-16 00:00
    A follow up --- do you know how big the entire dataset is, roughly?

  • mctriviamctrivia Posts: 3,772
    edited 2009-09-16 00:18
    The reason for the DVD is so it can be used offline.

    I will keep the image on my server indefinitely, so losing it due to a damaged disc is not a problem.

    I have a MySQL server, but the purpose is offline use. Use Parallax for online.

    On a BlackBerry. Not sure what word "emails" was supposed to be.

    Database is probably not that big. Attachments are probably very big.



    The question is the best format for a static DVD:
    1 folder per forum
    1 HTML file per thread
    Comma-delimited index of key phrases for a script to help find things.

    Or keep it entirely as a database, with code for each of the major OSes.
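
The static layout above could be generated along these lines. This is a sketch under assumptions: the folder and file naming, the thread number 42, and the three index columns (phrase, title, path) are all invented for illustration, and nothing here decides how key phrases get chosen.

```python
import csv
import io
import re


def slug(name):
    """Make a forum name safe to use as a folder name."""
    return re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")


def thread_path(forum, thread_id):
    """One folder per forum, one HTML file per thread."""
    return f"{slug(forum)}/thread_{thread_id:06d}.html"


def index_csv(threads):
    """Build the comma-delimited index: one row per (key phrase, thread)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["phrase", "title", "path"])
    for forum, thread_id, title, phrases in threads:
        for phrase in phrases:
            writer.writerow([phrase.lower(), title, thread_path(forum, thread_id)])
    return buffer.getvalue()


# Hypothetical thread: (forum name, thread id, title, key phrases).
threads = [
    ("Propeller 1", 42, "Forum Archive, Coming Soon", ["archive", "DVD"]),
]
index_text = index_csv(threads)
```

Using the csv module rather than joining strings by hand matters because thread titles can contain commas; the writer quotes those fields so a later reader can split the rows correctly.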

  • MicrocontrolledMicrocontrolled Posts: 2,461
    edited 2009-09-16 00:26
    I think that a PDF would be good if it was bookmarked for each thread. The disadvantage of the PDF is that to load the WHOLE FORUM it would take Windows a WEEK WITHOUT BREAKS!!!

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Computers are microcontrolled.

    Robots are microcontrolled.
    I am microcontrolled.

    But you can call me micro.

    Want to experiment with the SX or just put together a cool project?
    SX Spinning light display


  • CounterRotatingPropsCounterRotatingProps Posts: 1,132
    edited 2009-09-16 00:29
    > I think that a PDF would be good

    FIRE GOOD!

    PDF BAD!

  • mctriviamctrivia Posts: 3,772
    edited 2009-09-16 00:32
    It would be many PDFs, not one, if done that way, but it looks like the consensus is no PDF.

    HTML or database.

    HTML I can do by myself (it would look almost exactly like this site, but with no option to write stuff and no search feature). I would like to be able to package a program along with it to help in finding stuff.

    Database I need help with. There would need to be 3 programs to interpret it, 1 for each of the OSes.


    Post Edited (mctrivia) : 9/16/2009 12:49:41 AM GMT
  • MicrocontrolledMicrocontrolled Posts: 2,461
    edited 2009-09-16 00:49
    I think that it does stand for "Pretty Dumb Format"!!! I just tried to load a 6 page datasheet, I know what you mean!!!!

  • SRLMSRLM Posts: 5,045
    edited 2009-09-16 00:55
    I'd like to suggest that the archiving includes links one deep to non-forum sites. Specifically, it's to get the attachments that are posted off site and the informational pages that people link to.
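
A sketch of how a spider might apply the "one deep" rule: links found on a forum page are split into forum pages, which get crawled further, and off-site pages, which get fetched once without following their links. The class and function names and the example host are hypothetical; only Python's standard html.parser and urllib.parse modules are used.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect href targets of <a> tags, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


def plan_fetches(page_url, html, forum_host):
    """Return (internal, external) link lists for one forum page.

    Internal links stay in the crawl; external links are saved one level
    deep only, so their own links are never followed.
    """
    collector = LinkCollector(page_url)
    collector.feed(html)
    internal, external = [], []
    for link in collector.links:
        parts = urlparse(link)
        if parts.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript: and similar
        target = internal if parts.hostname == forum_host else external
        if link not in target:
            target.append(link)
    return internal, external
```

Example call, with a made-up host: `plan_fetches("http://forums.example.org/thread/1", page_html, "forums.example.org")`.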
  • mctriviamctrivia Posts: 3,772
    edited 2009-09-16 01:00
    I will save embedded images and maintain links, though I will not copy code from links.

  • mctriviamctrivia Posts: 3,772
    edited 2009-09-16 04:04
    Parallax won't provide the original tables.

    Other than search, a database format does not provide much benefit, and it would force the PC to generate each page.

    I think HTML for the majority of the data, with a comma-delimited database for search purposes, would be best. Any OS can read HTML, and even if no one writes code to use the database by release, it could be installed on the PC afterwards.
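
A small sketch of what code using the comma-delimited database might do, assuming a header row with phrase, title and path columns. Those column names, the sample rows and the function name are assumptions for illustration, not a settled format.

```python
import csv
import io

# Hypothetical index contents; on the disc this would be read from a file.
SAMPLE_INDEX = (
    "phrase,title,path\n"
    "forum archive,Forum Archive. Coming Soon.,propeller_1/thread_000042.html\n"
    "dvd image,Forum Archive. Coming Soon.,propeller_1/thread_000042.html\n"
    "image converter,Batch Image Cleanup,propeller_1/thread_000043.html\n"
)


def search_index(index_text, query):
    """Return unique (title, path) pairs whose phrase contains every query word."""
    words = query.lower().split()
    if not words:
        return []
    results = []
    for row in csv.DictReader(io.StringIO(index_text)):
        phrase = row["phrase"].lower()
        if all(word in phrase for word in words):
            hit = (row["title"], row["path"])
            if hit not in results:
                results.append(hit)
    return results
```

Each hit carries the path of a plain HTML file, so the search tool only has to point a browser at it; nothing needs to render pages from the database.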

  • w8anw8an Posts: 176
    edited 2009-09-16 04:37
    Keeping it in HTML would be best for multi-platform use. A search tool could be written with TaffyDB (taffydb.com/).