Hippy's got his "way-back" machine cranked up on high! I remember some small DOS executables that worked the same way. They sounded pretty good. I might have some still on 5.25" floppies. Now - where did I put the 5.25" floppy drive? Hmmm....
An awful-sounding speech demo: (for educational purposes?)
One product I was involved in booted, as a lark, to the sound of a welcome message recorded by a child of one of the team's developers. Management liked it, so it was shipped that way.
Comments
Oh! you mean SAM! Surely the Prop should be able to handle the same thing!
OBC
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
New to the Propeller?
Getting started with the Protoboard? - Propeller Cookbook
Got an SD card? - PropDOS
A Living Propeller FAQ - The Propeller Wiki
(Got the Knowledge? Got a Moment? Add something today!)
Semi-failure to make type&say for SP0256 phonemes. Yucky test discouraged finishing the type&say feature.
It could work better, but I made it hastily and it has errors in the phonetic data pointer table.
Maybe when I get around to transferring old phoneme sets from 8-bit pre-PC machines it will work much better.
Written for Hydra, needs a few tweaks for other boards. Sufficient to make it try to say something else.
It doesn't come close to the Vocal Tract object, as it is a remake of something primitive from 1984.
http://www.ict.kth.se/courses/IL131V/pictalk/fonems/index.htm
I'm interested in making a system which any Basic Stamp can use,
not just the Prop. I think this is the way to go for a new "chip"
equivalent, since the previous chips are discontinued and increasingly
hard to find. If we build it, let's build it with simple, common
parts and as few components as possible, and keep it generic so it can run
on any microprocessor.
There are several projects that use a recording chip
to store sounds and retrieve them in a specific sequence,
but I don't know if the retrieval is fast enough. Anyone have data
on this?
Since we have Emics now, how about some wave files of the
Emic sounds? Then we can preserve both the man in the SP0256 and
the girl inside Emic. :)
BTW, I really like the SP0256 sound, have used the chip in numerous
robots, and interfaced it to the Basic Stamps and various project boards.
Since it can handle both phonemes and allophones, I was able to program
it in up to 7 languages.
humanoido
I'm really keen to use these text-to-speech doodads, but not so much if the TTS chip is no longer in production.
I notice Epson have come out with one: the S1V30120.
See details at:
http://www.epson.jp/device/semicon_e/product/speech_audio/index.htm#ac01
I also note that Parallax are distributors for Epson products. Can you guys make these chips available?
Cheers
cncguy
I've used pre-generated phrases, and compressed them for storage. Voice is highly compressible. If you plan to do anything fancy with it you run the risk of patent infringement, but most standard compression works a treat. I seem to remember using delta compression (e.g. last value was 36, current value is 37, output a delta of +1) followed by simple Huffman encoding of the deltas, as decompression didn't require much memory. Applying a low-pass filter of about 8 kHz (or less) helps when encoding. I used SoX to do the WAV-to-binary stuff and filtering.
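For anyone who wants to try the delta-then-Huffman idea, here is a minimal Python sketch. It is not the original code from that project: the function names (`delta_encode`, `huffman_code`, `encode`, `decode`) are made up for illustration, the bitstream is kept as a text string of "0"/"1" for readability, and a real embedded decoder would pack the bits and store the code table in a compact form.

```python
import heapq
from collections import Counter

def delta_encode(samples):
    """First value is stored as a delta from 0, then successive differences."""
    prev, deltas = 0, []
    for s in samples:
        deltas.append(s - prev)
        prev = s
    return deltas

def delta_decode(deltas):
    """Running sum of the deltas restores the original samples."""
    acc, out = 0, []
    for d in deltas:
        acc += d
        out.append(acc)
    return out

def huffman_code(symbols):
    """Build a prefix code {symbol: bitstring} from symbol frequencies."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tiebreak, {symbol: code-so-far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

def encode(samples):
    """Delta-encode, then Huffman-encode the deltas. Returns (bits, code table)."""
    deltas = delta_encode(samples)
    code = huffman_code(deltas)
    return "".join(code[d] for d in deltas), code

def decode(bits, code):
    """Walk the bitstring, emitting a delta each time a full codeword is seen."""
    inverse = {v: k for k, v in code.items()}
    deltas, cur = [], ""
    for b in bits:
        cur += b
        if cur in inverse:
            deltas.append(inverse[cur])
            cur = ""
    return delta_decode(deltas)

if __name__ == "__main__":
    samples = [36, 37, 37, 38, 36, 35, 35, 36]
    bits, code = encode(samples)
    assert decode(bits, code) == samples
    print(len(bits), "bits instead of", 8 * len(samples))
```

The reason this works well on speech is that neighbouring samples are usually close together, so small deltas dominate and get the shortest codes. Decoding needs only the code table and one accumulator, which fits the remark about decompression not requiring much memory.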
To generate phrases I've used (both FREE):
- rsynth, which is a very simple vocal-tract simulator so it sounds quite artificial like Stephen Hawking's chair's synth.
- MBROLA, a highly advanced and high quality simulator that is available in binary only. MBROLA was a pain to use because it was so darn configurable, didn't really use text (though there are front-ends available for it), but with a little effort the quality was amazing.
Another popular software speech solution is Festival / Festivox.
One can also hire a voice actor for even better quality for commercial products.
Post Edited (Andrew E Mileski) : 12/9/2008 8:38:35 PM GMT
Glad to see someone enjoying my ChipTalk program.
I basically wrote that to see if concatenating phonemes would work. It does (sort of). Since then, I've written a synth that generates and transitions the formants for the vowel frequencies. I also added limited noise generation which pretty much allows it to generate most sounds required for English.
I did some work on automatically extracting parameters from speech to get phoneme values and pretty much found that to be useless. The results sound like a bad vocoder. I did have success using the extracted parameters as a starting point and hand coding the parameters. The number of parameters per vowel is pretty low (~100 bytes).
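To give a rough idea of what "generating and transitioning formants" can look like, here is a small Python sketch. It is not the ChipTalk code or the newer synth; it assumes a textbook approach (an impulse train at the pitch rate fed through parallel two-pole resonators, with formant frequencies interpolated linearly between two vowels). The function names, the 80 Hz bandwidth, and the formant values in the usage lines (approximate textbook averages for /a/ and /i/) are all illustrative assumptions.

```python
import math

def resonator_coeffs(freq_hz, bandwidth_hz, sample_rate):
    """Two-pole resonator: y[n] = gain*x[n] + a1*y[n-1] + a2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)
    a1 = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    a2 = -r * r
    gain = 1.0 - a1 - a2  # normalises the gain to 1 at DC
    return gain, a1, a2

def synth_transition(start_formants, end_formants, duration_s=0.3,
                     pitch_hz=110.0, sample_rate=8000, bandwidth_hz=80.0):
    """Glide from one vowel's formants to another's; returns float samples."""
    n = round(duration_s * sample_rate)
    period = int(sample_rate / pitch_hz)      # samples per glottal pulse
    state = [[0.0, 0.0] for _ in start_formants]  # y[n-1], y[n-2] per formant
    out = []
    for i in range(n):
        t = i / max(n - 1, 1)                 # 0.0 at start, 1.0 at end
        source = 1.0 if i % period == 0 else 0.0  # impulse-train excitation
        sample = 0.0
        for k, (f_start, f_end) in enumerate(zip(start_formants, end_formants)):
            f = f_start + (f_end - f_start) * t   # linear formant transition
            gain, a1, a2 = resonator_coeffs(f, bandwidth_hz, sample_rate)
            y1, y2 = state[k]
            y = gain * source + a1 * y1 + a2 * y2
            state[k] = [y, y1]
            sample += y                       # parallel resonators are summed
        out.append(sample)
    return out

if __name__ == "__main__":
    vowel_a = (730.0, 1090.0, 2440.0)   # approximate F1..F3 for /a/
    vowel_i = (270.0, 2290.0, 3010.0)   # approximate F1..F3 for /i/
    audio = synth_transition(vowel_a, vowel_i)
    print(len(audio), "samples")
```

The output is unscaled floats, so it would need normalising and converting to 8- or 16-bit samples before playback. Recomputing coefficients every sample is wasteful but keeps the sketch short; on a microcontroller you would update them every few milliseconds instead.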
If only I had more hours in the day!
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
www.speechchips.com
Speech & Video IC's for BasicStamps
The closest to the Emic and Parallax TTS is the TTS-03 from TextSpeak.
It sounds very similar and takes RS-232 in:
http://textspeak.com/oemtts.htm
They also have a multi-language 3.3 V tiny ARM9 module that has REALLY good voices... it's pricey, but can sound almost human.
BTW, they were selling Winbond a while ago, but I think they make their own now. I say this because it is in Spanish now too.
This from the site
TextSpeak Embedded Text-To-Speech modules series convert ASCII text to a natural, clear voice with unlimited vocabulary. The small footprint, plug-in solution accepts wide range of input data to generating real-time speech.
-jm
The multi-language TTS comes with both male and female voices in multiple languages: Arabic, Spanish Catalan, Danish, Dutch, English (UK), English (US), Finnish, French, French (Canadian), German, Greek, Italian, Norwegian, Polish, Portuguese, Spanish (North America), Swedish.