Text To Speech
Agent420
Posts: 439
Just getting back to some more Propeller experimentation, and Text To Speech came to mind. I know from my initial introduction to the Prop and demo board that Chip's VocalTract was an included demo, and it surprised me that speech on the Prop had not advanced much in the last year or two. Synthetic speech seems like it would be nearly as popular as any of the Prop's video abilities, yet information on the subject is sparse and there are no links in any of the stickies or objects in the exchange.
I note that Phil Pilgrim has done some work on Phonemic Speech Synthesis, creating a Prop object named Talk. I won't be able to check that out until later tonight, but I see that a couple of posts suggest it may be difficult to understand. It is also phoneme-based, and I would like to investigate free English text conversion.
I recall from my early C64 days a program titled SAM (Software Automatic Mouth), which was apparently the first commercial software-based speech synthesizer. The speech produced was very comprehensible. The free-text-to-phonetic RECITER program occupied only 6K, so I would think that something similar should be possible within the memory limits of the Propeller.
So with that in mind, does anybody have any relevant information or links on the subject?
I am using these references so far:
Phil's thread on his Talk object:
http://forums.parallax.com/showthread.php?p=613308
eSpeak, an open source speech app to review for the English text to phonetic conversion:
http://espeak.sourceforge.net/
Free TTS, another open source project:
http://freetts.sourceforge.net/docs/index.php
edit -
SAM original documentation for the Atari version, which includes some interesting theory of operation (if not simply nostalgic value)
http://www.retrobits.net/atari/sam.shtml
SAM said...
The program uses about 450 rules to convert English into S.A.M.'s phonetic language. Included among these rules are some stress markers for situations where the stress choice is unambiguous. In addition, S.A.M.'s usual punctuation rules still operate with some additional symbols ("!", ";", and ":") being considered as periods. The net result is that even directly-translated English text has a fair amount of inflection.
Post Edited (Agent420) : 8/12/2009 4:54:29 PM GMT
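For anyone who wants to play with the idea, the rule-driven conversion described in the SAM quote can be sketched roughly as below. This is only a toy: the rule table and phoneme symbols are made up for illustration and are not SAM's or RECITER's actual rules (the real program reportedly has about 450 of them). The only detail taken from the quote is that "!", ";" and ":" are treated as periods.

```python
# Toy letter-to-sound converter. The rules and phoneme symbols are invented
# for illustration; they are NOT the actual SAM / RECITER rule set.

# Rules are grouped by starting letter. Within a group the longer pattern
# comes first so that "sh" wins over a plain "s".
RULES = {
    "c": [("ch", "CH"), ("c", "K")],
    "e": [("ee", "IY"), ("e", "EH")],
    "h": [("h", "HH")],
    "l": [("l", "L")],
    "o": [("oo", "UW"), ("o", "OW")],
    "p": [("ph", "F"), ("p", "P")],
    "s": [("sh", "SH"), ("s", "S")],
    "t": [("th", "TH"), ("t", "T")],
}

# Per the SAM documentation quoted above, these all act like a period.
PERIOD_LIKE = {".", "!", ";", ":"}


def to_phonemes(text):
    """Return a list of phoneme tokens for the given English text."""
    text = text.lower()
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in PERIOD_LIKE:
            out.append(".")
            i += 1
        elif ch == " ":
            out.append(" ")
            i += 1
        else:
            for pattern, phoneme in RULES.get(ch, []):
                if text.startswith(pattern, i):
                    out.append(phoneme)
                    i += len(pattern)
                    break
            else:
                # No rule covers this character, so skip it.
                i += 1
    return out


print(to_phonemes("she sells!"))
```

A real rule set would also look at the letters before and after the match (context), which is where most of the rules and most of the memory would go.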
Comments
But, for everything I've done so far, it has been far easier to just pre-record messages as .wav files on the SD card and play them out...
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
My Prop Info&Apps: http://www.rayslogic.com/propeller/propeller.htm
I had a text-to-speech card for my TRS-80 back in the early to mid 1980s. My TRS-80 had 16K of RAM and ran at a ballistic 3.5 MHz.
This should be a breeze for the Prop.
Have you tried the EMIC speech module? I have used it with the BS2, but I haven't tried it with the Prop.
_________________________$WMc%______________
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
The Truth is out there    BoogerWoods, FL. USA
The WTS701 is a great chip; the speech is good quality and the built-in text-to-phoneme converter allows you to create random phrases on the fly, so you don't have to pre-code your phrases with phonetic spelling. One example of this would be speaking text that was input from a terminal or keyboard connection.
Unfortunately, these modules are a bit expensive, at about $70.
I believe the WTS701 speech chip actually does store the various phonemes in analog form and then pieces them together to create speech. From my understanding, this may provide better quality and may be less complex than synthesizing the formants in real time, at the expense of memory space (though the WTS701 uses a very clever method of actually storing analog values inside the chip; they are not digitized samples). That method may be an alternative, but it obviously adds the complexity of requiring a memory card [and the associated preloading of data on it].
I checked out Phil's Talk program last night and it's pretty close to what I remember from the SAM software. Real-time formant synthesis seems a bit complex, but I think I may experiment with it some more. It sure would be nice to have a pure software solution for creating speech.
That was 27 years ago! Has there been anything close from a software offering? Those speech chips aren't very active.
The Talk demo doesn't sound as understandable as 27-year-old SAM on a ~250 KIPS processor. I do appreciate the work Phil did on it, but I think it needs more work.
I agree that a software only solution would be great.
Doug
Wow! I still have my original SAM Apple ][ disk and 8 bit DAC card that came with it!
I just can't remember who I lent my Apple ][ to, to run it.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
lt's not particularly silly, is it?
Bob
For the PC there was some software before the SB days. It sounded ugly, especially for non-English languages.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Visit some of my articles at Propeller Wiki:
MATH on the propeller propeller.wikispaces.com/MATH
pPropQL: propeller.wikispaces.com/pPropQL
pPropQL020: propeller.wikispaces.com/pPropQL020
OMU for the pPropQL/020 propeller.wikispaces.com/OMU
Long, Long, ago...............
There was Windows 98.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Computers are microcontrolled.
Robots are microcontrolled.
I am microcontrolled.
But you can call me micro.
If it's not Parallax then don't even bother.
I have changed my avatar so that I will no longer be confused with others who use generic avatars (and I'm more of a Prop head than a BS2 nut, anyway).
A datasheet for one of the ChipCorder ICs better explains the process:
http://www.winbond-usa.com/products/isd_products/chipcorder/datasheets/4004/ISD4004_Rev1.2.pdf
One of the patents also has some information:
http://www.patentgenius.com/patent/7554844.html
EDIT: ... I forgot: the sound was very robotic (metallic)
There was indeed. Given the processor core was pretty much identical it was only a matter of porting the interfaces (user and HW) to make the emulation core work.
I must admit I'd given more than a passing thought to disassembling SAM and porting it to the Propeller. I got it in about '85. What's copyright these days? Still about 20 years? ;)
http://www.members.tripod.com/the-cbm-files/speak/
http://www.ccs64.com/
EDIT - I had some trouble getting that disc image to work, but found a good one here:
http://www.emuasylum.com/forums/showthread.php?t=31785
It's not always easy to infer the intent of raw ML assembly, however.
I don't think that hardware technology is the issue here so much as the algorithms and logic. The original developers are still around and quite prevalent, now known as SoftVoice (http://www.text2speech.com/).
Post Edited (Agent420) : 8/13/2009 4:01:53 PM GMT
I've got a bit of experience reversing all sorts of assembler algorithms (Italian fuel injection for example) and I cut my teeth on the 6502 and trying to figure out how Woz did what he did prior to obtaining a listing of the Apple ROMS (thanks Beagle Brothers!).. so I figured SAM would not be too hard a target.
To be honest I was aiming more for brute force and ignorance. Translate the code rather than understand and re-write the algorithms.
SAM (the core anyway) was written in assembler, and like the Parallax Propeller Compiler, being written in assembler, it's a lot easier to read and understand the resultant disassembly than it is to read and understand something created from a high level compiler.
In any case, I put it on the back burner a year ago.. if I ever locate my Apple which has my SAM D-A card in it, I'll certainly give it some more thought.
It was also used for the robot voices in "Short Circuit" (except for Johnny 5, of course), and it was the device that Stephen Hawking used for many years to vocalize his thoughts.
I was a C64 brat, and never spent a lot of time hacking Apples or Ataris, so I'm not sure how much different their versions may have been.
While we're on the subject, there is also some good info on computerized phonetics at this link, including some spectral analysis software...
http://www.fon.hum.uva.nl/praat/
This topic can become very complex quite quickly...
http://unaesthetic.net/st/index.shtml
The Stephen Hawking reference reminded me of this free gadget.
I found some Devantech SP03s for sale at http://www.junun.org/MarkIII/Info.jsp?item=31 but they are $105!
The SP0256 requires a 3.12 MHz crystal, but a 3.57 MHz colorburst crystal works OK. I built a circuit using an SP0256-AL and a PIC16F84 and it worked well.
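As a rough check on what the crystal swap does, here is the arithmetic. It assumes (my assumption, not something from the datasheet) that all of the chip's timing scales directly with the clock, so speech speed and pitch both rise by the same ratio. The colorburst value used is the standard NTSC 3.579545 MHz.

```python
import math

SPEC_HZ = 3_120_000        # crystal frequency quoted above for the SP0256
COLORBURST_HZ = 3_579_545  # standard NTSC colorburst crystal

# Assumption: speech rate and pitch both scale linearly with the clock.
ratio = COLORBURST_HZ / SPEC_HZ
semitones = 12 * math.log2(ratio)

print(f"speed-up: {(ratio - 1) * 100:.1f}%")
print(f"pitch shift: {semitones:.2f} semitones")
```

Under that assumption the speech comes out roughly 15% faster and a bit over two semitones higher, which would fit with it still sounding OK.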
tronsnavy: If you do hook up your SP0256-17, could you post the code? I was thinking of adding mine to my Propeller-based Boe-Bot.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
What electronics need - MORE POWER!!!!!!!
SAM used a set of phonemes to synthesize speech. This is quite different from the VocalTract engine that Chip created, in that VocalTract uses speech formants to slide from one transition to another rather than the cut-and-paste approach used with phonemes.
Chip has written an entire chapter for a book that is soon to be released, explaining how to use speech formants with the VocalTract engine. (I will see if I can get his permission to post it.)
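The difference between sliding formants and cutting and pasting phonemes can be illustrated with a small sketch. This is not VocalTract or SAM code; the function names are hypothetical, and the formant targets are only approximate textbook-style values for an "ah" and an "ee" vowel, chosen for illustration.

```python
# Approximate, illustrative formant targets in Hz (F1, F2, F3). These are
# not values taken from VocalTract or SAM.
AH = (730.0, 1090.0, 2440.0)
EE = (270.0, 2290.0, 3010.0)


def glide(start, end, steps):
    """Yield formant frames that slide linearly from start to end."""
    for n in range(steps + 1):
        t = n / steps
        yield tuple(a + (b - a) * t for a, b in zip(start, end))


def cut_and_paste(start, end, steps):
    """Yield frames that hold start, then jump straight to end."""
    half = (steps + 1) // 2
    for n in range(steps + 1):
        yield start if n < half else end


for frame in glide(AH, EE, 4):
    print("glide:", frame)
for frame in cut_and_paste(AH, EE, 4):
    print("paste:", frame)
```

The glide version passes through intermediate formant positions on the way from one vowel to the other, while the cut-and-paste version has an abrupt step in the middle.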
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Beau Schwabe
IC Layout Engineer
Parallax, Inc.
Until then I will keep playing around, as well as investigating the rules-based text/phoneme conversion.
I imagine that aspect is also one of the powers of software synthesis, and if we can better apply the VocalTract elements you could theoretically create any voice with a Prop.
I am going to continue pursuing this using the PRAAT software I linked earlier. I have not had time to fully experiment with it, but it significantly expands on the simple Spectrum Display application that is included in Chip's documents, apparently performing formant analysis and identifying their frequencies.
I guess if this was all too simple there wouldn't be as much satisfaction getting it to work ;-)
Sorry I did not get back to you sooner... got real busy at work and had friends in from out of town. Anyway, I did get a chance last night to see if I still had the SP0256-017... sure enough, it was in my junk box, right where I put it 27 years ago. I also did a quick search on the internet for info and found the original booklet. I will build the circuit this weekend. I will try to write some code this weekend too, but Spin is ambiguous at times and I only have about a month of training. I'm thinking about generating an array for the [70] or so allophones (in binary, for sending to the I/O pins). This will make it easy to generate loops and call only the elements needed for specific words and sentences. I will post the code (and schematic) when complete.
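Something like the following is the sort of table-driven layout that idea suggests, sketched in Python rather than Spin just to show the structure. Everything here is hypothetical: the allophone names, the binary codes, the word list and the 6-bit width are placeholders, and the real values would have to come from the chip's booklet.

```python
# Placeholder allophone codes. Check the real values against the booklet
# before wiring anything up.
ALLOPHONES = {
    "PA": 0b000000,  # pause
    "HH": 0b000001,
    "EH": 0b000010,
    "LL": 0b000011,
    "OW": 0b000100,
}

# Each word is just a list of allophone names, ending in a pause.
WORDS = {
    "hello": ["HH", "EH", "LL", "OW", "PA"],
}


def codes_for(sentence):
    """Return the list of codes to send for every word in the sentence."""
    codes = []
    for word in sentence.lower().split():
        for name in WORDS[word]:
            codes.append(ALLOPHONES[name])
    return codes


def pin_levels(code, width=6):
    """Split a code into individual pin levels, least significant bit first."""
    return [(code >> bit) & 1 for bit in range(width)]


for code in codes_for("hello"):
    print(format(code, "06b"), pin_levels(code))
```

Keeping the words as lists of names rather than raw numbers makes it easier to add vocabulary later without touching the loop that drives the pins.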
I see that you have your Prop hooked up to boe... how's it going? I am going to do the same thing once I gain more knowledge of Spin. I have ping... going to buy "basic boe" next. Have a good weekend.
Bob (transnavy)
This looks as though it is a complex field