Best way to handle text messaging

idbruce · 2015-03-03 21:49

There are three parts to this puzzle: 1) Storage 2) Iteration 3) Parsing

I suppose parsing could be eliminated, if each type of user had their own container for a certain scenario, but I like the idea of keeping all users together, to help keep track of the scenarios.

Peter Jakacki · 2015-03-03 22:01

idbruce wrote: »

One more point for now, because I do not believe anyone has brought this subject up, but if you run out of space on the EEPROM, you could add on-board flash memory with a chip or by adding the Propeller Memory Card to your project. Here is a snip from the Propeller Memory Card datasheet:

My first preference for this is a larger eeprom as it becomes a simple matter of swapping out the existing one. The maximum capacity appears to be 2Mbit which would translate to 3,584 messages of 63 characters + terminator in length, surely more than enough. Of course it is no problem to use an SPI Flash but that's an extra chip that has to be added unless of course you are designing new hardware but still the driver for this is more complicated than simple eeprom.
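The 3,584 figure works out if the lower 32 KB of the 2 Mbit part stays reserved for the Propeller program image. The post does not spell that out, so treat it as an assumption. A quick check of the arithmetic, as a sketch:

```python
# Capacity check for a 2 Mbit EEPROM holding 64-byte message slots.
# Assumption: the lower 32 KB stays reserved for the Propeller program image.

EEPROM_BITS = 2 * 1024 * 1024    # 2 Mbit part
RESERVED_BYTES = 32 * 1024       # boot image area (assumed)
SLOT_BYTES = 63 + 1              # 63 characters + terminator


def message_slots(eeprom_bits=EEPROM_BITS, reserved=RESERVED_BYTES, slot=SLOT_BYTES):
    """Return how many fixed-size slots fit above the reserved area."""
    total_bytes = eeprom_bits // 8
    return (total_bytes - reserved) // slot


print(message_slots())
```

Passing `reserved=0` gives the count for a chip used purely for text.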

But I do this stuff all the time and I can't see why it is becoming so complicated. Use a plain text file and either copy and paste it or send as plain ascii through your terminal emulator, as simple as that. For formatting just agree on how the messages are referenced so that if it is a simple number then perhaps start the line with a special symbol and the number then perhaps the message after a whitespace or newline.

However in Forth I let Forth do all the hard work and might just let the message text file look like this:

104 MSG: How now brown cow.
105 MSG: Now is the time for all good men to come to the aid of the party.
& But if not then let all the bad men party on themselves.
106 MSG: She sells sea shells by the sea shore.
& If she smells then she sells she shells by the see saw.

etc,
There are many ways of formatting this, but I know it would only take minutes to make this method work. It would allow messages to be changed individually or as a batch. The eeprom file would store each message as a null-terminated string appended to the file in eeprom. If a string was being updated and it was the same size or smaller, it would just be written over the old one; otherwise it would zero out the old message and append it as if it were a new message. For fast lookup a table would be maintained so that the message number would index a word that pointed to the message. You could still keep the pointer to 16 bits while addressing 256kB (largest eeprom) by simply making sure that strings were aligned to 4-byte boundaries; otherwise use 3- or 4-byte pointers.
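The aligned-pointer idea can be tried out on the PC side before any Forth or Spin is written. The following is only a minimal sketch, not Peter's code: the function names and the use of a dictionary for the lookup table are illustrative assumptions. Each string is null-terminated and padded to a 4-byte boundary, so the stored 16-bit word is the byte offset divided by 4.

```python
# Sketch: pack null-terminated messages on 4-byte boundaries so a 16-bit
# index entry (byte offset // 4) can reach anywhere in 256 KB.
# Names and layout are illustrative assumptions.

ALIGN = 4


def pack_messages(messages):
    """messages: dict {number: text}. Returns (index, blob).

    index maps message number -> 16-bit word (offset // ALIGN).
    blob holds the null-terminated, zero-padded strings.
    """
    blob = bytearray()
    index = {}
    for number in sorted(messages):
        word = len(blob) // ALIGN
        if word > 0xFFFF:
            raise ValueError("text area exceeds 256 KB")
        index[number] = word
        blob += messages[number].encode("ascii") + b"\x00"
        blob += b"\x00" * (-len(blob) % ALIGN)   # pad to the next boundary
    return index, bytes(blob)


def read_message(index, blob, number):
    """Follow the scaled pointer and read up to the terminator."""
    start = index[number] * ALIGN
    end = blob.index(b"\x00", start)
    return blob[start:end].decode("ascii")


index, blob = pack_messages({104: "How now brown cow.",
                             106: "She sells sea shells by the sea shore."})
print(index, read_message(index, blob, 106))
```

The padding costs at most 3 bytes per message, which is the price of keeping the table entries at 16 bits.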

btw, I could write the code for this in about the same amount of time it takes me to draft this reply.

idbruce · 2015-03-03 22:51

106 MSG: She sells sea shells by the sea shore.
& If she smells then she sells she shells by the see saw.

Just exactly what kind of project did this come from?

btw, I could write the code for this in about the same amount of time it takes me to draft this reply.

Peter Jakacki · 2015-03-03 22:57

idbruce wrote: »

Just exactly what kind of project did this come from?

Oh, spur of the moment thing, couldn't stand the tedium of "plain text" messages

Oh, I have already written it, or just about anyway before I got sidetracked. A little bit more and then I just have to play with it but it looks like it will work. Only about 30 lines of code. I will post the results anyway but not right now as I have to run off to attend to other "duties".

MJB · 2015-03-04 00:08

Erlend wrote: »

TTS (TextToSpeech chip)
Erlend

Peter has been talking about the approach he would take in Tachyon ...
So I remembered his very compact WAV player.
When you go for a SD solution, then there is plenty of space to record the messages as audio ...

Heater. · 2015-03-04 01:21

Lots of suggestions here to get munching on.

I really have to disagree with kwinn. Spreadsheets are not the tool to use for maintaining text.

Let's say this gets really big. Thousands of messages in a dozen languages.

My first issue then is how do I maintain all that, and how do I keep it under version control with the rest of my source code? You can't do a diff on a spreadsheet file. You don't want to be keeping spreadsheet blobs in a source code repository.

What if I have many contributors to that effort? Like guys around the world doing translations. How do they collaborate on it? How do you merge their efforts into the project easily?

Plain text files make all that easy. Plain text files in JSON format or even CSV make manipulating the whole thing with simple Python (or whatever language) scripts easy. It allows easy use of git or other source code management system. That allows easy collaboration.

I certainly was not suggesting that the JSON/CSV format was actually used on the target. But once you have the data organized like that it's easy to generate whatever final layout the target program needs. And easy to change that in future if need be.

On the other hand if the thing never gets that big why bother with a spreadsheet? Just hack it in a simple text file, with your favourite editor.

For the target I would go for simple. Just to get started transform your text "source" file into a set of zero terminated strings to be included into a "file" statement. Or transform it into a one message per fixed size block (for easy and fast access) in an SD card or large EEPROM.

If you need an index at some point to speed up access that can easily be generated from your source text.
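As a rough illustration of that workflow, here is a minimal sketch in Python. The JSON layout (language, then message number, then text) and the sample messages are made up for the example; nothing in the thread fixes them. The script turns one language into a run of zero-terminated strings and records where each one starts, so an index can be generated from the same source.

```python
import json

# Hypothetical source layout: {"language": {"message number": "text"}}.
SOURCE = """
{
  "en": {"1": "Heating water", "2": "Cup removed"},
  "no": {"1": "Varmer vann",   "2": "Kopp fjernet"}
}
"""


def build_blob(source_text, language):
    """Return (blob, offsets) for one language, in message-number order.

    blob is the zero-terminated strings back to back; offsets[i] is the
    byte position where the i-th message starts.
    """
    table = json.loads(source_text)[language]
    blob = bytearray()
    offsets = []
    for number in sorted(table, key=int):
        offsets.append(len(blob))
        blob += table[number].encode("ascii") + b"\x00"
    return bytes(blob), offsets


blob, offsets = build_blob(SOURCE, "en")
# A real script would write blob to a file that the Spin "file" statement names.
print(blob, offsets)
```

Because the source stays plain text, changing the target layout later only means changing this script.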

Whatever fits the application.

Erlend · 2015-03-04 01:59

abecedarian wrote: »

Another idea might be instead of storing complete phrases, build phrases from a collection of dictionary-like objects.
Some of the objects could be stored in arrays whose index indicates severity.
You might have ...
... an array with "I'd like some", "I need", "I'm going to die without" and so on.
... an array with "food", "water", "fertilizer", "sunlight" and so on.
... an array with "please", "really soon", and maybe some expletives.
Then build sentences with what's in your dictionary.

That is the idea for the next step - when the machine constructs sentences. But probably at that time we will have pLisa on Obex. She'll do the talking.

Erlend

Erlend · 2015-03-04 02:22

MJB wrote: »

Peter has been talking about the approach he would take in Tachyon ...
So I remembered his very compact WAV player.
When you go for a SD solution, then there is plenty of space to record the messages as audio ...

I did go up that avenue at first, but the work to create and maintain spoken audio messages quickly got out of hand, and I realized that I would have to go for text. As part of this adventure I integrated a VMUSIC2 mp3 player into the system, it plays named files commanded over simple serial comms. This will serve as a sound effects device.
Erlend

Erlend · 2015-03-04 03:01

Lots of food for thought here.

Create & manage:
A significant part of the job is to create the messages themselves. I have created a spreadsheet on Google Drive for this, and I have shared it with some other people who may have creative ideas for the variant messages. For this kind of collaboration it works well. Configuration control is therefore zero, but this project is basically for fun, learning and research, so I can live with that.

Message format:
Fixed column makes everything easy. Of course it hurts to think about because it is a waste of space. But if space is no issue? CSV or ~SV files are also easy to handle, but require a little more code at the P1 end, either to build an index first, or to search and find a text each time. Search & find is maybe fast enough though, even when trawling through 32k to find a record.
JSON and Python might be good friends, but I have barely begun to teach myself Python, so that approach would be too ambitious now.

Transfer and storage means:
Unplug/plug USB stick or SD is easy, but both have a significant software footprint. I do really want to utilize that upper 32k empty space in the EEPROM which is already on the board. So then, how to transfer? A simple text file transfer would be nice, and there might be good help to be had from Obex or from you on that; I need to find out.

Erlend

Peter Jakacki · 2015-03-04 04:50

Erlend wrote: »

Lots of food for thought here.

Create & manage:
A significant part of the job is to create the messages themselves. I have created a spreadsheet on Google Drive for this, and I have shared it with some other people who may have creative ideas for the variant messages. For this kind of collaboration it works well. Configuration control is therefore zero, but this project is basically for fun, learning and research, so I can live with that.

Message format:
Fixed column makes everything easy. Of course it hurts to think about because it is a waste of space. But if space is no issue? CSV or ~SV files are also easy to handle, but require a little more code at the P1 end, either to build an index first, or to search and find a text each time. Search & find is maybe fast enough though, even when trawling through 32k to find a record.
JSON and Python might be good friends, but I have barely begun to teach myself Python, so that approach would be too ambitious now.

Transfer and storage means:
Unplug/plug USB stick or SD is easy, but both have a significant software footprint. I do really want to utilize that upper 32k empty space in the EEPROM which is already on the board. So then, how to transfer? A simple text file transfer would be nice, and there might be good help to be had from Obex or from you on that; I need to find out.

Erlend

The USB stick requires USB which the Prop does not really handle well at all so external hardware would be required. The SD is very simple and even with all my FAT32 and being able to open multiple files as well as the interactive shell commands the memory footprint is still only just over 4,800 bytes in Tachyon. As MJB mentioned the text-to-speech could be played from wave files very easily and you could use a similar method for text-to-speech in that you have every word recorded individually and those that aren't would be synthesised. In fact I may even play with that tonight, a text-to-speech add-on module for the filesystem so that we could just select the TTS as an output device and print messages to it just like we would for the console or any other output device.
TTS PRINT" I am Tachyon, resistance is futile, you will be assimilated." CR

Now you seem to talk about trawling through 32k as if that is enough, is it? If it is then I don't know why you haven't considered a larger eeprom if it's only for text as this only involves swapping out the existing chip.

Message format really should just be a plain text file with readable identifiers, why make it any more complicated than that???

Anyway, I'm pretty sure I could write and test the whole shebang in just a few hours which I may just do tonight before I turn in.

kwinn · 2015-03-04 08:11

Heater. wrote: »

Lots of suggestions here to get munching on.

I really have to disagree with kwinn. Spreadsheets are not the tool to use for maintaining text.

Let's say this gets really big. Thousands of messages in a dozen languages...........

For a large collaborative project like that I agree, something more than a spreadsheet is needed, but for something where the code and text fits in a 64K eeprom a spreadsheet works well enough, although a text editor could also be used.

For the target I would go for simple. Just to get started transform your text "source" file into a set of zero terminated strings to be included into a "file" statement. Or transform it into a one message per fixed size block (for easy and fast access) in an SD card or large EEPROM.

If you need an index at some point to speed up access that can easily be generated from your source text.

Whatever fits the application.

Why waste space on fixed size blocks when zero terminated strings take up less space and make printing the messages simpler? If you look at the output CSV file created using a spreadsheet, parsing and storing it in eeprom is very simple. For instance, the line from my earlier post can be stored as is and easily parsed and printed:

1,"This is a test to see if commas, can be entered and saved, as a csv file from a spreadsheet."

The message number is always followed by a , and the message always ends with a cr.
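For checking the exported file on the PC before it is sent, the same line can be parsed with Python's standard `csv` module, which already copes with commas inside the quoted text. This is only a host-side sketch; the function name is made up, and it is not the Spin parsing code attached to the post.

```python
import csv
import io


def parse_csv_messages(text):
    """Yield (number, zero_terminated_bytes) for each CSV line.

    The csv module strips the surrounding quotes, so commas inside a
    message survive intact.
    """
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue                      # skip blank lines
        number, message = int(row[0]), row[1]
        yield number, message.encode("ascii") + b"\x00"


sample = ('1,"This is a test to see if commas, can be entered and saved, '
          'as a csv file from a spreadsheet."\r\n')
for number, data in parse_csv_messages(sample):
    print(number, data)
```

The zero byte appended here is the same terminator the strings would carry once stored in eeprom.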

The text file could also be sent to the embedded propeller using a serial or bluetooth connection, and an index created as the text is stored in eeprom. A lot of the code for doing this is contained in JonnyMac's parsing routines (modified copy attached).

The changes made were adding a PUB to FullDuplexSerial to receive and store a text string, and modifying jm_parse_value to test the rxstr subroutine using PST.

Heater. · 2015-03-04 08:25

I'm all for saving space. My fixed size scheme was all about using an SD card. One could dedicate a kilobyte per message and still fit a million messages per gigabyte onto an SD. They are cheap enough not to worry about wasting space, it's only sand after all.
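The appeal of the fixed size scheme is that finding a message is a single multiplication. A minimal sketch, using an in-memory buffer to stand in for a file on the SD card (the function names are illustrative assumptions):

```python
import io

RECORD_SIZE = 1024   # one kilobyte per message, as suggested above


def write_record(store, number, text):
    """Write text into slot `number`, zero-padded to RECORD_SIZE."""
    data = text.encode("ascii")
    if len(data) >= RECORD_SIZE:
        raise ValueError("message too long for one record")
    store.seek(number * RECORD_SIZE)
    store.write(data.ljust(RECORD_SIZE, b"\x00"))


def read_record(store, number):
    """Seek straight to the slot; no index or search is needed."""
    store.seek(number * RECORD_SIZE)
    return store.read(RECORD_SIZE).split(b"\x00", 1)[0].decode("ascii")


store = io.BytesIO()          # stands in for a file on the SD card
write_record(store, 0, "Now heating water")
write_record(store, 2, "Cup removed")
print(read_record(store, 2))
print(2**30 // RECORD_SIZE)   # records per gigabyte
```

An unused slot simply reads back as an empty string.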

kwinn · 2015-03-04 09:54

Heater. wrote: »

I'm all for saving space. My fixed size scheme was all about using an SD card. One could dedicate a kilobyte per message and still fit a million messages per gigabyte onto an SD. They are cheap enough not to worry about wasting space, it's only sand after all.

So true. When using an sd the simplified searching and speed advantage of fixed size messages make that the way to go. We're not really disagreeing at all, mostly it's a result of the difference between fitting everything in a 64K eeprom vs using an sd card.

idbruce · 2015-03-04 10:46

What is the trigger mechanism for an individual message?

Erlend · 2015-03-04 10:53

kwinn wrote: »

...

An even better approach IMHO would be to leave the .csv file as is, have the PC send it to the propeller, and have a program on the propeller format and write the data to the upper 32K of the eeprom, creating an index while it writes the text, and finally writing that index to the eeprom.

The software needed for this would most likely take up less space than the text it moves to the upper 32K of the eeprom, and JonnyMac's parsing routines would make formatting fairly simple. I also have added a routine to FullDuplexSerial that will receive and store a string of data to a byte array that I will post some time tomorrow.

Sounds like this is the way to go. I would appreciate if you would share the routine you mention, and then I can start to try to figure out how to transmit, receive, convert, index and store the texts.

Erlend

EDIT: always check if there is a next page on the thread before replying...
I have to study the attachments @kwinn included. I have to do an estimate if 32k really is enough. I have to make a decision. Everyone here has a good point, I learn a lot from this discussion.

Genetix · 2015-03-04 11:00

In 2 of Chip's VGA demos, he references a TXT file in the DAT section of the program and uses it with line PRINT_STRING(@TEXT) near the top of the program.
VGA Tile Driver Demo2 (1280x1024) and VGA Tile Driver Demo3 (1600x1200).

I love your statement Peter, " I am Tachyon, resistance is futile, you will be assimilated.", but it's not grammatically correct.

Erlend · 2015-03-04 11:12

idbruce wrote: »

What is the trigger mechanism for an individual message?

The intention is that the top level code in Cog 0 triggers the messages based on:
process sequence events (now heating water..), on user events (irish coffee has been selected...), on condition events (12 voltage is running low), and on environmental events (hey, I feel a person approaching), and on user feedback events (this is your third coffee in a row).
As it is the Cog 0 which is 'controlling the behaviour' of the machine, I believe this is the best place to do the talking. All the other Cogs are doing I/O, conversion, and specific sequences.

I am curious - why do you ask?

Erlend

Erlend · 2015-03-04 11:17

Genetix wrote: »

In 2 of Chip's VGA demos, he references a TXT file in the DAT section of the program and uses it with line PRINT_STRING(@TEXT) near the top of the program.
VGA Tile Driver Demo2 (1280x1024) and VGA Tile Driver Demo3 (1600x1200).

I love your statement Peter, " I am Tachyon, resistance is futile, you will be assimilated.", but it's not grammatically correct.

I believe that the amount of text quickly outgrows what can be handled in a DAT section. Right now, that is where it is, but that is just for debugging the outputting routines.
I too am amazed by Peter the Forth Warrior's endurance at the frontline. I am sure some day we will all give in and go Forth.

Erlend

idbruce · 2015-03-04 11:26

I am curious - why do you ask?

Because first you have an event trigger, then you have an event handler. I was thinking about the event handler end of things. If the type of user was known before the event trigger or the event handler, then the total event could be processed much faster, such as the proper selection of a message based upon user experience. So instead of having 3 possibilities for the handler to sort out, you would only have 1.

Erlend · 2015-03-04 11:37

To finish the size uncertainty I have done some estimation:
Around 80 base messages, allowing for further growth say 100
Each message maximum 40 characters, otherwise the talk becomes too long-winded
Each message in 6 variants, but say 8 for good measure
100 x 40 x 8 = 32K if fixed size records are used, if not it is probably more like 20K
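The estimate is close enough to the 32 KB limit that it is worth checking exactly. A small sketch, assuming the text goes into the upper 32 KB of a 64 KB EEPROM; whether each fixed record also carries a terminator byte is not stated, so both cases are computed:

```python
MESSAGES = 100             # 80 base messages plus room to grow
MAX_CHARS = 40             # characters per message
VARIANTS = 8               # 6 planned, 8 for good measure
UPPER_EEPROM = 32 * 1024   # upper half of a 64 KB EEPROM (assumed target area)

fixed_bytes = MESSAGES * MAX_CHARS * VARIANTS
with_terminator = MESSAGES * (MAX_CHARS + 1) * VARIANTS   # if each record ends in a zero byte

print(fixed_bytes, UPPER_EEPROM - fixed_bytes)
print(with_terminator, UPPER_EEPROM - with_terminator)
```

Without terminators the records fit with a little room to spare; with one terminator byte per record the total comes out slightly over 32 KB, so the margin is thin either way.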

My old Professional Development Board only has a 32K chip in it though, whereas the target hw, a Propeller Project Board has a 64K. Can the PDB be upgraded by swapping memory chip? EDIT: can I put in a 24LC512 64 KB EEPROM just like that?

Erlend

Erlend · 2015-03-04 11:48

idbruce wrote: »

Because first you have an event trigger, then you have an event handler. I was thinking about the event handler end of things. If the type of user was known before the event trigger or the event handler, then the total event could be processed much faster, such as the proper selection of a message based upon user experience. So instead of having 3 possibilities for the handler to sort out, you would only have 1.

The <variant> will be chosen based on previous user behaviour (in itself an event handler?), so yes, the event only needs to trigger which message (number), whereas the variant (number) will already be a given. For example, previously the user was recognized as a repeat customer by swiping RFID, which causes variant 3 to be set; then event(21_cup removed) triggers message 21, variant 3.
Does this make sense?

I have implemented this by building a filename from <variant> and <message> and then playing that mp3 file (3_021.mp3), but I have realized it is unmanageable, so I will use text and TTS instead.
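The variant-then-message selection can be sketched in a few lines. The function name is made up; the filename pattern follows the 3_021.mp3 example, with the message number zero-padded to three digits:

```python
def mp3_name(variant, message):
    """Build the mp3 filename for a given variant and message number."""
    return f"{variant}_{message:03d}.mp3"


current_variant = 3            # set earlier, e.g. by the RFID swipe
print(mp3_name(current_variant, 21))
```

The same (variant, message) pair would serve equally well as the key for looking up a text record.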

Erlend

Genetix · 2015-03-04 11:53

Erlend wrote: »

I believe that the amount of text quickly outgrows what can be handled in a DAT section. Right now, that is where it is, but that is just for debugging the outputting routines.

Erlend

Chip references an EXTERNAL file.

Demo 2

DAT
     Text   File   "Lincoln.TXT"
            Byte   0

Demo 3

DAT
     Text   File   "Lincoln2.TXT"
            Byte   0

Heater. · 2015-03-04 12:06

Genetix,

Chip references an EXTERNAL file.

Yes, your examples of the "file" directive pull in an external file.

But the bytes in that file end up in your DAT section just as if you had written them there directly.

It does not help save space in Propeller HUB memory.

kwinn · 2015-03-04 12:06

Yes he does, but the data from that file ends up as part of the DAT section, and so in hub ram when the program is loaded.

kwinn · 2015-03-04 12:38

Erlend wrote: »

Sounds like this is the way to go. I would appreciate if you would share the routine you mention, and then I can start to try to figure out how to transmit, receive, convert, index and store the texts...............

Transmitting and receiving should be fairly straightforward. Logic level serial data from a wired connection or bluetooth receiver in to the propeller. Several protocols to choose from. Propeller USB cable from PC to the project board to start with would be the simplest. Bluetooth could be added later.

Convert and index by storing the eeprom starting address of each message in an array of words. If the message numbering is not sequential store the message number in a byte array so that the number in Byte corresponds with the matching start address in Word.

Convert and store the text by changing the " at the end of the text to zero and deleting the message number and characters added by the spreadsheet.

All pretty easy to do courtesy of JonnyMac's routines.
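The two parallel arrays can be modelled on the PC to see how the lookup behaves. This is a minimal sketch in Python rather than Spin, with made-up function names and sample messages. It assumes message numbers fit in a byte and that the text starts at 0x8000, the beginning of the upper 32K of a 64K eeprom.

```python
from array import array


def build_index(records, base_address):
    """records: iterable of (message_number, zero_terminated_bytes).

    Returns (numbers, addresses): a byte array of message numbers and a
    word array of start addresses, kept in matching order.
    """
    numbers = array("B")      # unsigned bytes, so message numbers must be 0..255
    addresses = array("H")    # unsigned 16-bit words
    address = base_address
    for number, data in records:
        numbers.append(number)
        addresses.append(address)
        address += len(data)
    return numbers, addresses


def lookup(numbers, addresses, wanted):
    """Return the start address of message `wanted`."""
    return addresses[numbers.index(wanted)]


records = [(1, b"Heating water\x00"), (5, b"Cup removed\x00")]
numbers, addresses = build_index(records, 0x8000)
print(hex(lookup(numbers, addresses, 5)))
```

With sequential message numbers the byte array can be dropped and the message number used directly as the position in the word array.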

BTW, keep in mind that the indexing arrays will be created in hub memory. They will need to be written to the equivalent locations in the lower 32K of the eeprom so they are saved permanently.

Erlend · 2015-03-04 13:04

kwinn wrote: »

BTW, keep in mind that the indexing arrays will be created in hub memory. They will need to be written to the equivalent locations in the lower 32K of the eeprom so they are saved permanently.

Thanks @kwinn, I think I can do what you are outlining, but the last thing you mention I am not sure about. I guess to avoid having to create the index and then paste it into a DAT section, I need to write it into the eeprom directly. More research required. Or I could simplify further, and use fixed size records, and the address would always be Base + 40ch x (msg# x number of variants + var#).
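For the fixed size alternative, each (message, variant) pair needs its own slot. A plain product of the two numbers would send different pairs, such as (2, 3) and (3, 2), to the same address, so the sketch below lays the records out message by message with the variants side by side. The base address, the variant count and zero-based numbering are assumptions for illustration.

```python
RECORD_SIZE = 40      # characters per record
VARIANTS = 8          # variants stored per message (assumed)
BASE = 0x8000         # start of the upper 32K of a 64K eeprom (assumed)


def record_address(msg, var, base=BASE):
    """Address of variant `var` (0-based) of message `msg` (0-based)."""
    if not 0 <= var < VARIANTS:
        raise ValueError("variant out of range")
    return base + RECORD_SIZE * (msg * VARIANTS + var)


print(hex(record_address(21, 3)))
```

With 100 messages and 8 variants the last record ends below the top of a 64K eeprom, so the whole table fits in the upper half.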

Erlend

kwinn · 2015-03-04 18:46

Fixed size records would work and be simpler but they waste space and are not as flexible. You could also store the records with the message number at the beginning of the message and either search the text area for the requested number, or create an index in hub each time the prop starts up.

Creating an index in hub as you read and store the records is simple, and writing the index to the lower area of eeprom is no different than writing to the upper 32K. If the index starts at 6F00 in the hub then it needs to start at 6F00 in the eeprom.

msrobots · 2015-03-05 00:59

@Erlend,

the trick @kwinn is talking about is that when booting from EEPROM the RAM gets filled with the lower 32K of the EEPROM. So you can simply change the boot up value of a variable in the DAT section by writing the new value into the EEPROM. And the address in the EEPROM is the HUB address of the variable in your DAT section.

And - yes - you can replace the EEPROM with a 64K one on the PDB. You can also install another one (even a much bigger one) with a different address on the same I2C pins.

Enjoy!

Mike

Erlend · 2015-03-05 04:26

I showed this thread to a friend this morning: "Look", I said "a few days ago I had little clue about how to solve this problem I have, but now a good handful of good people around the world have been helping me find a solution, and showing how to do it". "Wow" he said "I thought Internet forums were all about arguing". Not so. Not on this Forum.

Erlend

Heater. · 2015-03-05 05:43

Parallax has the best technical forum here I have ever seen.

We like a good argument, err debate, as much as anyone else. But generally the arguments are backed up with solid evidence and made with the best of intentions.

Well done everybody.

Note to self: I must try to not argue so much
