Hanno's next "Grand Challenge": Text to Speech — Parallax Forums

Hanno's next "Grand Challenge": Text to Speech

Hanno Posts: 1,130
edited 2010-03-17 10:15 in Propeller 1
Hi,
Chip has written an awesome object that synthesizes speech- one formant at a time. He's very proud and enthusiastic about his object, but I haven't heard more than a few words from our Propellers. The reason is that currently, each word has to be programmed, one formant at a time. I think people want to pass a string to an object and have it do the translation from characters to phonemes to formants. 20 years ago I played with a simple set of chips that did this mapping. There's also open source software out there that does this.

So, here's my second grand challenge to the parallax community:

Publish an MIT licensed object that:
- uses Chip's VocalTract object
- uses <5KB of HUB ram
- uses 1 cog
- can speech synthesize this post- so that I can understand it

Submissions should include the code and an mp3 recording. Winner gets an ultimate license to ViewPort or 12Blocks...
Hanno

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Co-author of the official Propeller Guide- available at Amazon
Developer of ViewPort, the premier visual debugger for the Propeller (read the review here, thread here),
12Blocks, the block-based programming environment (thread here)
and PropScope, the multi-function USB oscilloscope/function generator/logic analyzer

Post Edited (Hanno) : 3/16/2010 8:36:45 AM GMT

Comments

  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2010-03-16 06:33
    Hanno said...
    - can speech synthesize this post- so that I can understand it
    This could be taken to imply that you want text-to-speech. Is that what you meant?

    -Phil
  • mctrivia Posts: 3,772
    edited 2010-03-16 06:41
  • Hanno Posts: 1,130
    edited 2010-03-16 08:34
    Hi Phil,
    Sorry for not being clear...
    Yes, I want an object that expands Chip's object to do text to speech.
    Chip's object requires very little memory to produce very nice speech.
    His object models the vocal tract and uses just 13 byte-sized parameters to define how the speech will be generated. From Chip's chapter it looks like Chip was successful in replicating all phonemes with his object- however, he left it as an exercise to the reader. Getting all phonemes right shouldn't take too long and should be fun.
    Once you have phonemes, you need to map words to a collection of phonemes- as mentioned, there are sources to get the rules for this.
    So, here's what the object should do:
    input: text2speech(string("hello world"))
    map text to phonemes: hello world-> h e l o~ wh o~ r ld (I'm making this up)
    call Chip's object with parameters for each phoneme
    Would be great for all projects that need to "display" something...
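
The three steps above can be sketched roughly as follows. This is a minimal illustration in Python rather than Spin, and everything in it is an assumption: the rule table only reproduces the made-up phoneme string from this post, the 13-byte frames hold placeholder values (not real formant data), and `send_frame` is a hypothetical stand-in for whatever call hands a frame to Chip's object.

```python
# Hypothetical letter-to-phoneme rules (not taken from any real rule set).
# Multi-letter rules are tried before single-letter ones (longest match wins).
RULES = {
    "ll": "l",
    "ld": "ld",
    "h": "h",
    "e": "e",
    "o": "o~",
    "w": "wh",
    "r": "r",
}
MAX_RULE_LEN = max(len(key) for key in RULES)

FRAME_SIZE = 13  # the post says each frame is 13 byte-sized parameters


def text_to_phonemes(text):
    """Map text to a list of phoneme symbols using longest-match rules."""
    text = text.lower()
    phonemes = []
    i = 0
    while i < len(text):
        for length in range(MAX_RULE_LEN, 0, -1):
            chunk = text[i:i + length]
            if chunk in RULES:
                phonemes.append(RULES[chunk])
                i += len(chunk)
                break
        else:
            i += 1  # skip spaces, punctuation, and letters without a rule
    return phonemes


def placeholder_frame(phoneme):
    """Return 13 dummy bytes for a phoneme (placeholder, not real formant data)."""
    seed = sum(phoneme.encode("ascii"))
    return bytes((seed + k) % 256 for k in range(FRAME_SIZE))


def text2speech(text, send_frame):
    """Convert text to phonemes and pass one frame per phoneme to send_frame."""
    for phoneme in text_to_phonemes(text):
        send_frame(placeholder_frame(phoneme))


if __name__ == "__main__":
    print(" ".join(text_to_phonemes("hello world")))
    # prints: h e l o~ wh o~ r ld
```

A real rule set would need context-sensitive rules (the letters before and after a match), which is where most of the memory budget would go.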
    Hanno

  • TonyWaite Posts: 219
    edited 2010-03-16 13:02
    As Chip has written the speech engine, the missing link is the text engine, i.e. the software that analyzes the text and converts it into commands.

    This is a *significant* coding requirement, and would normally be written in C/C++ for high-level OSes, e.g. most recently for Android 1.6 as an example of an embedded platform.

    I would guess that a port via Catalina/LMM would be one route, using open source software, for example from the Festival/CMU community.

    Regards,

    T o n y
  • jazzed Posts: 11,803
    edited 2010-03-16 14:04
    TonyWaite said...

    I would guess that a port via Catalina/LMM would be one route, using open source software, for example from the Festival/CMU community.
    Except that Festival requires GNU. Catalina is not GNU. The closest thing we have is heater's ZOG emulator running GNU.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Short answers? Not available at this time since I think you deserve more information than you requested.
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2010-03-16 16:00
    Hanno,

    I've already got an object that does phonemic synthesis using Chip's object:

    http://forums.parallax.com/showthread.php?p=613308

    It's big and bad (neither meant in a good sense, unfortunately). 'Been meaning to work on it again someday. Maybe this is the kick I need.

    -Phil
  • mctrivia Posts: 3,772
    edited 2010-03-16 17:06
    I think a preprocessing program on the computer that generates code to feed into Chip's object would be a good start.
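
A rough sketch of what such a PC-side preprocessor might look like, in Python. It is only an illustration under assumptions: the phoneme names and byte values are placeholders, the `ph_` label prefix is made up, and the output is a plain Spin `DAT` block of `byte` tables that a Propeller object could then read frame by frame.

```python
FRAME_SIZE = 13  # bytes per frame, as described earlier in the thread


def frames_to_spin_dat(table, label_prefix="ph_"):
    """Render {phoneme_name: 13-byte frame} as the text of a Spin DAT block."""
    lines = ["DAT"]
    for name, frame in table.items():
        if not name.isidentifier():
            raise ValueError(f"{name!r} cannot be used as a Spin label")
        if len(frame) != FRAME_SIZE:
            raise ValueError(f"{name!r} needs exactly {FRAME_SIZE} bytes")
        values = ", ".join(str(b) for b in frame)
        lines.append(f"{label_prefix}{name}  byte  {values}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder frames only; real values would come from tuning by ear.
    demo = {"AA": bytes(range(13)), "EH": bytes([100] * 13)}
    print(frames_to_spin_dat(demo), end="")
```

Generating the tables offline keeps the rule processing and tuning tools off the Propeller, so only the finished byte tables count against hub RAM.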

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    24 bit LCD Breakout Board now in. $24.99 has backlight driver and touch sensitive decoder.

    If you have not already. Add yourself to the prophead map
  • Hanno Posts: 1,130
    edited 2010-03-16 18:57
    Hi Phil!
    Great to see you've already completed the phonemes to vocaltract part of the problem- I'll check that out later today.
    Here's the open-source text to speech project I mentioned earlier:
    espeak.sourceforge.net/

    Like VocalTract, it allows users to define formants by parameters. However, it also lets you use wav files, and after a brief scan, that's all I found.

    However, the text-to-phoneme work should be very applicable here. It has a command line mode where it converts text to phonemes...
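
For anyone who wants to try that command-line mode from a script, here is a small Python sketch. It assumes an `espeak` binary is installed and on the PATH, and it relies on the `-x` option (print phoneme mnemonics) and `-q` option (no audio output) as I understand them from espeak's help text; check `espeak --help` on your system. The function names are hypothetical.

```python
import subprocess


def espeak_phoneme_command(text):
    """Build the argument list for printing phoneme mnemonics without audio."""
    return ["espeak", "-q", "-x", text]


def text_to_espeak_phonemes(text):
    """Run espeak and return its phoneme mnemonics as a stripped string."""
    result = subprocess.run(
        espeak_phoneme_command(text),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if espeak reports a failure
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(text_to_espeak_phonemes("hello world"))
```

The mnemonics it prints are espeak's own phoneme names, so they would still need a mapping table onto whatever phoneme set the Propeller object uses.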

    Here's the vocaltract-equivalent chip I used ages ago: courses.cit.cornell.edu/ee476/Speech/SPO256-AL2.pdf
    A separate chip, the CTS256, drives the SP0256: it converts text to phonemes. I can't find the datasheet, but a program that simulates the chip is here: www.speechchips.com/shop/item.aspx?itemid=13

    Hanno

  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2010-03-16 19:43
    Hanno,

    I wouldn't say "completed", necessarily. You'll see what I mean. :)

    -Phil
  • Microcontrolled Posts: 2,461
    edited 2010-03-17 01:37
    I have been using Phil's speech for at least a year. IT IS GREAT! For your challenge, I will modify Phil's program, but I do not need something else in return. I notice that this is the 2nd time I have piggybacked my challenge-entry on Phil's work, dang it. I need to learn how to write my own awesome programs.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Don't click on this.....

    Use the Propeller icon!! Propeller.gif
  • potatohead Posts: 10,261
    edited 2010-03-17 05:04
    In light of past speech recognition topics, and this one running right now, I have to link this:

    http://www.haskins.yale.edu/featured/sws/sws.html

    These guys are working on some great leading edge research applied directly to the formants that I found interesting and wanted to share.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Propeller Wiki: Share the coolness!
    8x8 color 80 Column NTSC Text Object
    Safety Tip: Life is as good as YOU think it is!
  • Phil Pilgrim (PhiPi) Posts: 23,514
    edited 2010-03-17 05:24
    potatohead,

    Whoa! That sinewave speech is amazing! I mean both amazingly weird sounding and amazing that it's so understandable.

    -Phil
  • potatohead Posts: 10,261
    edited 2010-03-17 06:51
    Yeah, my thoughts too. After sitting with Chip, learning about the formants, then seeing this, I am intrigued.

  • Graham Stabler Posts: 2,510
    edited 2010-03-17 10:15
    It's clear there are still many potential ways to skin a cat

    Graham