Tool to convert HTML to DAT — Parallax Forums

Tool to convert HTML to DAT

DynamoBen Posts: 366
edited 2012-12-11 20:46 in Accessories
I'm interested in a tool or script that converts HTML into something DAT-friendly. I've been embedding my HTML code in Spin, and to do this I have to break all the lines apart by hand and replace all the double quotes with their byte values. This would be less painful if I could run a script against the HTML and have it spit out a new file with everything reformatted, which I could then copy and paste into the DAT block.

Comments

  • Mike G Posts: 2,702
    edited 2012-10-24 13:40
    I can knock that out for ya if you like.
  • DynamoBen Posts: 366
    edited 2012-10-24 13:45
    Mike G wrote: »
    I can knock that out for ya if you like.

    Thanks. I'm thinking this will be helpful for those like myself who have simple pages in code, or those that want to put it on EEProm.
  • Mike G Posts: 2,702
    edited 2012-10-24 14:37
    Just about done. It's a console app so you'll have to type the source file name. Regex is a wonderful tool.
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2012-10-24 14:43
    Mike,

    You beat me to it. I've almost got a web-based app done.

    -Phil
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2012-10-24 15:08
    Here's an online version:

    -Phil
  • Mike G Posts: 2,702
    edited 2012-10-24 15:19
    Nice Phil. I'll post mine when I get home from work. Need to verify in the prop tool.
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2012-10-24 15:30
    Here's the source for mine.

    index.html:
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
    <html>
      <head>
        <title>HTML2DAT</title>
        <style type="text/css">
          p {margin-top:5px;margin-bottom:5px}
        </style>
      </head>
      <body bgcolor="#202030" style="font-family:sans-serif;font-size:10pt;">
        <center>
          <div align="center" style="width:700px;border-width:1;border-color:black;border-style:solid;padding:15px;background:#ffffe0;">
            <h2 style="font-family:sans-serif;color:navy;text-align:center;margin-top:0px;">Propeller HTML to DAT Data Converter</h2>
            <form method="post" action="http://www.phipi.com/cgi-bin/html2dat.pl">
              Paste your HTML text in the box below, then click the Submit button.<p>
              <textarea name="html" cols="80" rows="40" wrap="off" style="font-size:8pt"></textarea><p>
              <input type="reset"> <input type="submit" value="Submit">
            </form>
          </div>
        </center>
      </body>
    </html>
    


    html2dat.pl:
    #!/usr/bin/perl -T
    
    use strict;
    use CGI;
    use CGI::Carp qw(fatalsToBrowser);
    
    $CGI::POST_MAX = 100000;
    
    my $dat = "Content-type: text/plain\n\n' Copy and paste the following into your Spin program's DAT section:\n\n";
    my $query = new CGI; 
    my $html = $query->param('html'); 
    my @html = split (/[\r\n]+/, $html);
    
    foreach (@html) {
      s/\t//g;
      while (length($_)) {
        my $line;
        if (length($_) > 60) {
          $line = substr($_, 0, 60); 
          $_ = substr($_, 60)
        } else {
          $line = $_;
          $_ = ''
        }
        $line =~ s/\"/\",34,\"/g;
        $line = '"' . $line . '"';
        $line =~ s/\"\",//g;
        $line =~ s/34,\"\"$/34/;
        $dat .= " " x 14 . "byte      $line";
        $dat .= ',13' unless length($_);
        $dat .= "\n"
      }
    }
    print $dat . "              byte      0\n";
    

    -Phil
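
For readers who don't read Perl, the conversion steps in html2dat.pl can be sketched in Python. This is an illustrative port, not Phil's code: the names `quote_chunk` and `html_to_dat` are made up here, and the sketch assumes the same rules as the script above (tabs dropped, lines split into 60-character chunks, each double quote emitted as 34, a 13 after each source line, and a final zero byte).

```python
import re


def quote_chunk(chunk: str) -> str:
    """Turn one chunk of text into a Spin byte list, emitting 34 for each double quote."""
    parts = []
    for i, piece in enumerate(chunk.split('"')):
        if i > 0:
            parts.append("34")          # the double quote that separated two pieces
        if piece:
            parts.append(f'"{piece}"')  # literal text between double quotes
    return ",".join(parts)


def html_to_dat(html: str, width: int = 60) -> str:
    """Convert HTML text to Spin DAT byte lines (hypothetical helper, mirrors the Perl rules)."""
    out = []
    for line in re.split(r"[\r\n]+", html):
        line = line.replace("\t", "")   # tabs are removed, other white space is kept
        chunks = [line[i:i + width] for i in range(0, len(line), width)]
        for n, chunk in enumerate(chunks):
            # Only the last chunk of a source line gets the carriage return (13).
            end = ",13" if n == len(chunks) - 1 else ""
            out.append(" " * 14 + "byte      " + quote_chunk(chunk) + end)
    out.append(" " * 14 + "byte      0")  # zero terminator for the whole page
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(html_to_dat('<a href="x">\n\t<b>'))
```

Splitting on the double quote avoids the follow-up clean-up substitutions the Perl version needs for empty `""` strings, at the cost of not being a line-for-line translation.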
  • Mike G Posts: 2,702
    edited 2012-10-24 16:02
    Here's my eyeball-verified source:
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.IO;
    using System.Text.RegularExpressions;
    
    namespace HtmlToDatConsole
    {
        class Program
        {
            static void Main(string[] args)
            {
                ConvertFile(@"D:\html\index.htm", @"D:\html\index.txt");
                ViewFile(@"D:\html\index.txt");
            }
    
    
            private static void ConvertFile(string source, string destination)
            {
                bool isFirstLine = true;
                // Create a file to write to. 
                using (StreamWriter sw = File.CreateText(destination))
                {
                    using (StreamReader sr = File.OpenText(source))
                    {
                        string s = string.Empty;
                        while ((s = sr.ReadLine()) != null)
                        {
                            s = ReplaceWhiteSpace(s);
                            s = ReplaceDoubleQuote(s);
                            if (!sr.EndOfStream)
                                s = AddCrLfAndComments(s, isFirstLine);
                            else
                                s = LastLine(s);
                            isFirstLine = false;
                            sw.WriteLine(s);
                        }
                     
                    }//StreamReader
                }//StreamWriter
     
            }
    
    
            private static string LastLine(string line)
            {
                string first = "}\t";
                line = string.Format("{0}\t{1}{2}{3}, {4}, {5} {6}", first, '\"', line, '\"', "$0D", "$0A", "{  }");
                return line; 
            }
    
            private static string AddCrLfAndComments(string line, bool isFirstline)
            {
                string first = string.Empty;
                if (isFirstline)
                    first = "label\tbyte";
                else
                    first = "}\t";
    
                //                       }   "  --  "   cr   lf    {
                line = string.Format("{0}\t{1}{2}{3}, {4}, {5}, {6}", first, '\"', line, '\"', "$0D", "$0A", "{");
                return line;
            }
    
            private static string ReplaceDoubleQuote(string line)
            {
                string pattern = "\x22";
                string replacement = string.Format("{0}, $22, {1}", '\"', '\"');
                Regex rgx = new Regex(pattern);
                string result = rgx.Replace(line, replacement);
                return result;
            }
    
            private static string ReplaceWhiteSpace(string line)
            {
                string pattern = "\\s+";
                string replacement = " ";
                Regex rgx = new Regex(pattern);
                string result = rgx.Replace(line, replacement);
                return result;
            }
    
            private static void ViewFile(string path)
            {
                // Open the file to read from. 
                using (StreamReader sr = File.OpenText(path))
                {
                    string s = string.Empty;
                    while ((s = sr.ReadLine()) != null)
                    {
                        Console.WriteLine(s);
                    }
                }
            }
        }
    }
    
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2012-10-24 16:10
    Mike,

    I see you're replacing multiple consecutive spaces with a single space. I avoided doing that, in case there were any <pre> sections in the HTML. (I didn't feel like parsing the HTML to that level of detail to make the distinction, either. :) ) I do, however, eliminate all tabs.

    Ben,

    Just as a heads-up, the DAT section is not the best place to store HTML in the Spinneret. Text blocks in DAT chew up enormous amounts of hub memory. It would be better to keep the HTML out of your program proper and save it on an SD card.

    -Phil
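
The difference between the two clean-up strategies shows up on a line inside a `<pre>` block. The following is a minimal Python sketch, not code from either tool; the function names and the sample line are invented for illustration.

```python
import re


def collapse_whitespace(line: str) -> str:
    """Replace every run of white space with one space (the ReplaceWhiteSpace approach)."""
    return re.sub(r"\s+", " ", line)


def strip_tabs(line: str) -> str:
    """Remove tab characters only and leave spaces alone (the html2dat.pl approach)."""
    return line.replace("\t", "")


if __name__ == "__main__":
    pre_line = "<pre>col1    col2</pre>"     # column alignment done with spaces
    print(collapse_whitespace(pre_line))     # alignment is lost
    print(strip_tabs(pre_line))              # alignment is preserved
```

Note that stripping tabs is not free either: a `<pre>` block that aligns its columns with tabs would still be altered.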
  • Mike G Posts: 2,702
    edited 2012-10-24 16:32
    Good point
  • Igor_Rast Posts: 357
    edited 2012-10-24 18:30
    Phil Pilgrim (PhiPi) wrote: »
    Ben,

    Just as a heads-up, the DAT section is not the best place to store HTML in the Spinneret. Text blocks in DAT chew up enormous amounts of hub memory. It would be better to keep the HTML out of your program proper and save it on an SD card.

    -Phil

    @Phil, that's not a bad idea at all.
    Do you mean using a bigger EEPROM and putting the DAT data where the Prop normally doesn't reach, and only fetching the data when it's being posted?
    What's the best way to do this?
    I'd like to do that to get rid of some of my HTML code that's now taking up a lot of space, and I'm not using an SD card with the Wiznet.
    Any helpful links?


    UPDATE: Oh, sorry, I misread you saying SD card.

    Is it possible to do it the EEPROM way I was mentioning, if you know?
  • DynamoBen Posts: 366
    edited 2012-10-24 18:44
    Phil Pilgrim (PhiPi) wrote: »
    Ben,

    Just as a heads-up, the DAT section is not the best place to store HTML in the Spinneret. Text blocks in DAT chew up enormous amounts of hub memory. It would be better to keep the HTML out of your program proper and save it on an SD card.

    -Phil

    That would be true if I were using the Spinneret. I'm building a configuration webpage for a design that doesn't have an SD card, so I will probably store the webpages in EEProm.
  • Mike G Posts: 2,702
    edited 2012-10-24 21:54
    HTML to DAT attached. The attachment contains the source code as well as the binary located in the Html2Dat folder. It's a command line app. Simply type h2d [path] where [path] is the location of the file to convert.
    h2d c:\index.htm
    

    The output file is in the same directory as the source. The output file name is the source file ending with .txt.

    I took Phil's advice and removed tabs rather than all white space, but I left the ReplaceWhiteSpace method in the source code in case someone wants to use it.
  • DynamoBen Posts: 366
    edited 2012-10-25 07:39
    It is pretty rewarding seeing all of this code get converted in a few hundred milliseconds. I haven't confirmed that it is formatted correctly, but it's a strong start. Thanks!

    Next up will be writing a routine to program this to EEProm.
  • DynamoBen Posts: 366
    edited 2012-12-11 11:58
    DynamoBen wrote: »
    It is pretty rewarding seeing all of this code get converted in a few hundred milliseconds. I haven't confirmed that it is formatted correctly, but it's a strong start. Thanks!

    Next up will be writing a routine to program this to EEProm.

    BTW, I'm wondering if EEProm is the right way to go for storing webpages; I'm concerned that I may run out of space. Is there a better method for storing webpages that I should consider?
  • Mike G Posts: 2,702
    edited 2012-12-11 12:20
    An SD card works well but has the added driver overhead.

    It depends on the project. Many browser type projects can be realized with a single page. Lightweight AJAX requests executed from the client are used to update the HTML.

    Another approach is to use a real web server, where the web server gets data (sensors and such) from HTTP GET/POST requests to the Wiznet device. To the web server, the Wiznet is just a client request away, like a database request. The Wiznet device becomes a service...
  • DynamoBen Posts: 366
    edited 2012-12-11 13:23
    Mike G wrote: »
    An SD card works well but has the added driver overhead.

    It depends on the project. Many browser type projects can be realized with a single page. Lightweight AJAX requests executed from the client are used to update the HTML.

    Another approach is to use a real web server, where the web server gets data (sensors and such) from HTTP GET/POST requests to the Wiznet device. To the web server, the Wiznet is just a client request away, like a database request. The Wiznet device becomes a service...

    I'm creating some basic device configuration pages. While I could combine all of it into a single page, visually it gets kind of unwieldy. I think a separate webserver would be overkill for this application.

    I suppose I could do SD (via SPI, shared with the Wiznet). I was just hoping there was something between EEProm and SD.
  • Mike G Posts: 2,702
    edited 2012-12-11 15:42
    I'd store configuration data in EEPROM. I would think that a 64K EEPROM, using the upper 32K, would provide plenty of storage space.

    How much configuration data do you need? How large are the HTML pages? How many pages are required?
  • DynamoBen Posts: 366
    edited 2012-12-11 16:21
    Mike G wrote: »
    I'd store configuration data in EEPROM. I would think that a 64K EEPROM, using the upper 32K, would provide plenty of storage space.

    How much configuration data do you need? How large are the HTML pages? How many pages are required?

    The configuration data itself will be in EEProm; I'm talking about the webpages. I'm thinking 4-5 webpages. As for size, I will have to write a program to calculate how much space each would take up.
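
Such a size check could be sketched in Python as follows. It is a minimal sketch under assumptions: the stored form is taken to follow the rules of the html2dat.pl script posted earlier in this thread (tabs removed, blank lines dropped, one carriage return per line, one zero terminator), characters are counted as UTF-8 bytes, and `dat_size` is a made-up name.

```python
import re


def dat_size(html: str) -> int:
    """Bytes a page would occupy as a zero-terminated string under the assumed rules."""
    total = 0
    for line in re.split(r"[\r\n]+", html):
        line = line.replace("\t", "")             # tabs are not stored
        if line:                                  # blank lines are not stored
            total += len(line.encode("utf-8")) + 1    # text plus one CR byte
    return total + 1                              # trailing zero terminator


if __name__ == "__main__":
    sample = "<p>\r\n\t<b>\n"
    print(dat_size(sample), "bytes")
```

To measure a real page, read the file as text and pass its contents to `dat_size`; the result will generally be smaller than the size the operating system reports, because indentation tabs and line-ending pairs are not stored.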
  • Mike G Posts: 2,702
    edited 2012-12-11 18:06
    DynamoBen wrote: »
    The configuration data itself will be in EEProm; I'm talking about the webpages. I'm thinking 4-5 webpages. As for size, I will have to write a program to calculate how much space each would take up.
    I kinda' figured the configuration data and configuration web pages would live in EEPROM. That's what I would do anyway. Use a specially named directory or file to retrieve the configuration page from EEPROM, like cnfg5x01.

    You don't need to write a program to check file size; just build the HTML and take a look at the file properties. That is, if you want browser based configuration. You could also use telnet or a custom UDP app to set up the configuration.
  • DynamoBen Posts: 366
    edited 2012-12-11 18:23
    Mike G wrote: »
    I kinda' figured the configuration data and configuration web pages would live in EEPROM. That's what I would do anyway.
    That is my preference.
    Use a specially named directory or file to retrieve the configuration page from EEPROM, like cnfg5x01.
    Not following you on this one, mind explaining a little further?
    You don't need to write a program to check file size; just build the HTML and take a look at the file properties.
    File properties include white space in the total (in Windows). But for reference, one page is 13.6KB.
    That is, if you want browser based configuration.
    Going browser based for this project.
  • Mike G Posts: 2,702
    edited 2012-12-11 20:28
    In order to GET the configuration file(s), you'll need to somehow tell the server to retrieve the files from EEPROM as opposed to an SD card or HUB RAM. Browser based means using the HTTP protocol. Therefore, an HTTP process will be listening for a command like GET. With the GET there will be a request line that contains the resource being requested. Make the request unique so the process knows to grab the HTML page from EEPROM.
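
The first line of an HTTP request has the form `GET /path HTTP/1.1`, so the routing decision can be made from the path alone. Here is a minimal Python sketch of that idea, not code from any Wiznet or Spinneret server: the function names, the `/cnfg` prefix convention (borrowed from the cnfg5x01 example name), and the `"eeprom"`/`"sd"` labels are all assumptions for illustration.

```python
def parse_request_line(line: str) -> tuple[str, str, str]:
    """Split a request line such as 'GET /index.htm HTTP/1.1' into method, target, version."""
    method, target, version = line.strip().split(" ", 2)
    return method, target, version


def storage_for(target: str, prefix: str = "/cnfg") -> str:
    """Choose where to read a page from, based on a made-up path prefix convention."""
    return "eeprom" if target.lower().startswith(prefix) else "sd"


if __name__ == "__main__":
    method, target, version = parse_request_line("GET /cnfg5x01 HTTP/1.1\r\n")
    print(method, target, "->", storage_for(target))
```

On the Propeller the same check would be a short string comparison on the received buffer; the point is only that a unique name in the request is enough to select the storage device.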

    In my opinion, 13.6k for a configuration file is a bit on the large side. Especially if you are trying to fit the pages in the upper 32k of a 64k EEPROM.

    Maybe it would be better to place the configuration files on an SD card.
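
The space concern can be checked with a little arithmetic. The sketch below is hedged: it assumes every page is about as large as the 13.6KB one mentioned in this thread, that KB means 1024 bytes, and that only the upper 32K of a 64K EEPROM is free for web pages. The measured figure is a file size that includes white space, so the stored size may come out smaller.

```python
def pages_that_fit(available_bytes: int, page_bytes: int) -> int:
    """Number of whole pages of a given size that fit in the available space."""
    return available_bytes // page_bytes


EEPROM_BYTES = 64 * 1024
WEB_AREA_BYTES = EEPROM_BYTES - 32 * 1024    # upper 32K; lower 32K holds the program
PAGE_BYTES = int(13.6 * 1024)                # assumed size of every page

fit = pages_that_fit(WEB_AREA_BYTES, PAGE_BYTES)
needed_for_four = 4 * PAGE_BYTES

if __name__ == "__main__":
    print(f"{fit} pages fit; 4 pages need {needed_for_four} of {WEB_AREA_BYTES} bytes")
```

Under these assumptions only two such pages fit, and four pages would need well over the 32K available, which is consistent with the suggestion to trim the pages or move them to an SD card.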
  • DynamoBen Posts: 366
    edited 2012-12-11 20:46
    Mike G wrote: »
    In order to GET the configuration file(s), you'll need to somehow tell the server to retrieve the files from EEPROM as opposed to an SD card or HUB RAM. Browser based means using the HTTP protocol. Therefore, an HTTP process will be listening for a command like GET. With the GET there will be a request line that contains the resource being requested. Make the request unique so the process knows to grab the HTML page from EEPROM.

    Thanks, now I'm following you.
    In my opinion, 13.6k for a configuration file is a bit on the large side. Especially if you are trying to fit the pages in the upper 32k of a 64k EEPROM.

    Maybe it would be better to place the configuration files on an SD card.

    I suppose I could add a second EEProm just for webpage storage; on the other hand, an SD card gains me the ability to do images. Engineering is always filled with choices. :)