Scripting language that isn't specific to data types? — Parallax Forums

Microcontrolled Posts: 2,461
edited 2012-05-30 11:55 in General Discussion
Hi,

I've recently been revisiting my old ALICE-to-Propeller project. The main downfall of the project was that, to speed it up, the entire ALICE library had to be sorted. After endless problems with the sorting program, mainly due to the Propeller's speed and numerous stack overflow errors, I gave up on the prospect of having the Propeller chip do the sorting. It would be much easier to write a PC script to sort it, seeing as PCs are better suited to handling file systems and moving large amounts of memory than a Propeller is. I've only experimented with scripting languages before, and coming back to them for this project, I noticed something in all of them that tripped me up: specific data types.

In SPIN, when you declare a byte, you are declaring an 8-bit section of the memory that can hold an ASCII character, a decimal value, a hexadecimal value, or a binary value. You can set them up into arrays with a simple bracket notation ( [#] ) and can access the arrays through a loop or directly using variables. This does not seem to be the case for most scripting languages. VBScript, Python, and Visual Basic (the three I have attempted so far) do not let you switch easily between data types, which means you can't easily clear or nullify an array, and that makes programming difficult for someone not accustomed to it.

My question is, before I spend more time learning a new scripting language, can anyone suggest one that sticks to the SPIN-style way of denoting data types? Any help is appreciated.

Thanks,
Microcontrolled

Comments

  • jmg Posts: 15,183
    edited 2012-05-27 20:06
    In SPIN, when you declare a byte, you are declaring an 8-bit section of the memory that can hold an ASCII character, a decimal value, a hexadecimal value, or a binary value. You can set them up into arrays with a simple bracket notation ( [#] ) and can access the arrays through a loop or directly using variables. This does not seem to be the case for most scripting languages. VBScript, Python, and Visual Basic (the three I have attempted so far) do not let you switch easily between data types, which means you can't easily clear or nullify an array, and that makes programming difficult for someone not accustomed to it.

    I'm not following your point. Visual Basic allows arrays and bytes, and it can mostly convert between types as needed.
    Types exist mainly for speed and size reasons. If you care about neither, use Float for everything.

    Microcontrollers tend to care very much about size and speed.

    Some languages (or even chips) try to reduce the number of types, but while that makes the language compiler simpler, the user's code often gets larger.
    - e.g. you can have only 32-bit variables, but then you have to resort to shifting and masking if you want bytes.
  • Heater. Posts: 21,230
    edited 2012-05-27 22:06
    Scripting languages tend to be dynamically typed but they can be coaxed into handling bytes and such.
    If you really want to be fussy about types why not just use a compiled language like C or Pascal? There are plenty of free compilers around.
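As a minimal sketch of the point that a dynamically typed language can be coaxed into handling bytes, assuming Python 3: the built-in `bytearray` behaves much like a SPIN byte array. The variable names (`buf`, `views`, `doubled`) are only illustrative.

```python
# A bytearray is a mutable run of 8-bit values, roughly like a SPIN "byte buf[8]".
buf = bytearray(8)            # eight bytes, all zero

buf[0] = ord("A")             # store an ASCII character
buf[1] = 65                   # the same value written in decimal
buf[2] = 0x41                 # ... in hexadecimal
buf[3] = 0b0100_0001          # ... and in binary

# All four slots hold the identical byte; only the notation differed.
same = buf[0] == buf[1] == buf[2] == buf[3]

# One stored byte can be displayed in whichever form is convenient.
views = (chr(buf[0]), buf[0], hex(buf[0]), bin(buf[0]))

# Walk the array with a loop, then clear it in place.
for i in range(len(buf)):
    buf[i] = i * 2
doubled = list(buf)
buf[:] = bytes(len(buf))      # back to all zeros
```

Values outside 0..255 raise a `ValueError` on assignment, so the 8-bit limit is enforced rather than silently wrapped as it would be on the Propeller.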
  • Dave Hein Posts: 6,347
    edited 2012-05-28 06:38
    I think the ALICE database file is just a text file. You can use the sort command under Linux or at the Windows Command Prompt to sort it.

    EDIT: After thinking about it for a while, I seem to recall that the ALICE database file uses multiple lines for some entries, so my original suggestion wouldn't work. Do you have a link for the original data file? I'll search on my computer to see if I can find it.
  • Microcontrolled Posts: 2,461
    edited 2012-05-28 08:12
    @Dave: Here is what I need to do to sort it:
    1. Find the Pattern tags
    2. Grab the text inside the pattern tags
    3. Take the first 7 characters of the pattern and create a new text file with that name.
    4. Write the entire category tag's contents (which includes the pattern and template tags) to the text file.
    5. Go to the next instance of a pattern tag and repeat this process.

    Doing this will allow the Propeller to easily narrow down what text file to search for a query in, rather than having to scan through 14MB of text.
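The five steps above can be sketched on the PC side roughly as follows. This assumes Python 3, that the data uses literal `<category>`, `<pattern>` and `<template>` tags, and that the whole file fits in memory; the function names and the filename clean-up rule are hypothetical, not part of the original project.

```python
import re
from collections import defaultdict
from pathlib import Path

# Non-greedy matches so each <category>...</category> block is taken on its own.
CATEGORY_RE = re.compile(r"<category>.*?</category>", re.DOTALL | re.IGNORECASE)
PATTERN_RE = re.compile(r"<pattern>(.*?)</pattern>", re.DOTALL | re.IGNORECASE)

def bucket_categories(aiml_text, prefix_len=7):
    """Group whole category blocks by the first prefix_len characters of their pattern."""
    buckets = defaultdict(list)
    for match in CATEGORY_RE.finditer(aiml_text):
        category = match.group(0)
        pattern = PATTERN_RE.search(category)
        if pattern is None:
            continue                      # skip categories without a pattern tag
        key = pattern.group(1).strip()[:prefix_len]
        buckets[key].append(category)
    return dict(buckets)

def safe_name(key):
    """Assumed rule: replace characters such as spaces and '*' that are awkward in filenames."""
    return re.sub(r"[^A-Za-z0-9]", "_", key) or "_"

def write_buckets(buckets, out_dir):
    """Write one text file per bucket; append mode lets colliding names share a file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for key, categories in buckets.items():
        with open(out / (safe_name(key) + ".txt"), "a", encoding="utf-8") as f:
            f.write("\n".join(categories) + "\n")
```

Because `write_buckets` appends, the output directory should be empty before each run, otherwise entries are duplicated.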
  • Heater. Posts: 21,230
    edited 2012-05-28 09:53
    I would suggest building an index file. Go through your data file and find all the tags. Add the tag name to the index file along with the offset of the tag in the data file.
    Now you only need to search the index file to find any tag's position in the data file, which you can then read quickly with a seek command.
    Next step is to sort your index file so that when you do the look up on the Prop you can perform a binary search which will be very quick.
    It's best to use a fixed number of bytes for the tag name and file offset fields in the index file. Just zero out any unused space in the name.
    A sorted index plus binary search must be quicker than rifling through a mess of little files.
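A minimal sketch of the sorted fixed-width index and binary search described above, assuming Python 3 on the PC side. The 32-byte name field, the 32-bit little-endian offset, and the function names are assumptions chosen for illustration; the thread does not specify a record layout.

```python
import struct

NAME_LEN = 32                                # assumed fixed width of the name field
RECORD = struct.Struct(f"<{NAME_LEN}sI")     # zero-padded name + 32-bit offset, no padding

def build_index(entries):
    """Pack (name, offset) pairs into sorted fixed-width records."""
    records = sorted((name.encode("ascii")[:NAME_LEN], offset)
                     for name, offset in entries)
    # struct pads short names with zero bytes, i.e. "zeroes out" unused space.
    return b"".join(RECORD.pack(name, offset) for name, offset in records)

def lookup(index, name):
    """Binary search over the packed records; returns the offset or None."""
    key = name.encode("ascii")[:NAME_LEN].ljust(NAME_LEN, b"\0")
    lo, hi = 0, len(index) // RECORD.size
    while lo < hi:
        mid = (lo + hi) // 2
        rec_name, offset = RECORD.unpack_from(index, mid * RECORD.size)
        if rec_name == key:
            return offset
        if rec_name < key:
            lo = mid + 1
        else:
            hi = mid
    return None
```

Here the index lives in memory as a `bytes` object. On the Propeller the same search would seek to `mid * RECORD.size` in the index file and read one record at a time, so only about log2(N) reads are needed per lookup.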
  • Gadgetman Posts: 2,436
    edited 2012-05-28 12:44
    Have you considered REXX?

    http://en.wikipedia.org/wiki/REXX

    There's a lot of text parsing and file handling commands in the language.
    Variables are... typeless and really, really fun...
    (No arrays, but a compound structure that can be used as Arrays)

    I once built a MySQL frontend in REXX that ran in an OS/2 Command window.
    (OS/2 version of MySQL came with a REXX library)

    Here's a REXX-based XML-parser:
    http://rexxxmlparser.sourceforge.net/
  • 4x5n Posts: 745
    edited 2012-05-28 14:02
    I'm rather partial to Perl on a PC or server (I'm a Unix admin by profession). It allows arrays (lists, really) to be created and destroyed dynamically if needed. I've never used it in a way that requires me to access or deal with data as bytes, words, etc., and honestly wouldn't know how.
  • Dave Hein Posts: 6,347
    edited 2012-05-28 14:47
    Microcontrolled, you should re-read your original thread on this subject at http://forums.parallax.com/showthread.php?137233-Artificial-Intelligence-(ALICE-AIML-interpreter)-on-the-Propeller!-(v.0.7)&highlight=alice . An efficient search algorithm was discussed there. Can you post a link to the original database file? I think it was called brain.aim, or something like that.
  • Microcontrolled Posts: 2,461
    edited 2012-05-28 16:29
    I'm rewriting the ALICE program for use with Kye's FAT driver rather than FSRW, so I won't have [some of] the same problems that I had with the other version. I've changed the indexing system so that everything in the pattern tags is now written to a single index tag, terminated by a decimal 13 and followed by the file seek position. The indexing program appears to be working and has been processing the file for over an hour now. I'll let it run overnight and see what I get. Thanks for the help!

    - Microcontrolled

    EDIT: Dave, I took your advice and looked over the old thread. What I'm doing now is similar to what was discussed. I hope it's as fast as I think it will be. :)
  • softcon Posts: 217
    edited 2012-05-29 20:35
    Perl would do just fine with this. Everything can be treated as anything else: keep everything as a string unless you need it as a number or something else, and if you do, just put a variable type modifier on it and, poof, it's instantly something else.
    Perl also has the ability to swallow large strings with ease, so sorting large pieces of memory or files is super simple. It's a bit difficult to remember all its ins and outs at times, but there's nothing better for doing quick and dirty things like what you're referring to.

    Although it sounds like you got it solved, I figured I'd throw in my vote for Perl anyhow.
  • Dave Hein Posts: 6,347
    edited 2012-05-30 05:07
    Microcontrolled, I found the ALICE code that I had written back in January when you started the original thread. It was in the recycle bin on my computer at work. The BRAIN.ail file contains the text "<category>" at the beginning of each entry, and most entries are contained in a single line. Some of the entries are very long if they contain several random responses. The longest one is 10,025 bytes long. There are a few entries that use multiple lines, so I wrote a small program to make each entry a single line. The program is attached below. I then used the "sort" utility to sort the brain file.
    [Attachment: c (988B)]
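The attached program itself is not reproduced in the thread. As a rough sketch of the same idea (joining multi-line entries into single lines and then sorting them), assuming Python 3 and that every entry begins with a lowercase `<category>` at the start of a line, something like the following could be used. The function names are hypothetical, and joining continuation lines with a single space is an assumption.

```python
def join_entries(lines, marker="<category>"):
    """Fold continuation lines into the entry that precedes them."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue                       # drop blank lines
        if line.startswith(marker) or not entries:
            entries.append(line)           # a new entry starts here
        else:
            entries[-1] += " " + line      # continuation of the previous entry
    return entries

def sorted_entries(lines):
    """One entry per line, in plain byte-wise order like a basic sort utility."""
    return sorted(join_entries(lines))
```

Since every line starts with `<category>` and, in most entries, the pattern follows immediately, sorting whole lines effectively orders the entries by pattern text.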
  • Microcontrolled Posts: 2,461
    edited 2012-05-30 11:55
    Thanks Dave!!!