A challenge for computer speech recognition?

What do you hear when you play this video?



"Laurel" or "yanni?" How can two people hear two such different words from the same recording?

-Phil
“Perfection is achieved not when there is nothing more to add, but when there is nothing left to take away.” -Antoine de Saint-Exupery

Comments

  • 21 Comments
  • erco Posts: 19,226
    I get "yammy" just hearing that.
    "When you make a thing, a thing that is new, it is so complicated making it that it is bound to be ugly. But those that make it after you, they don’t have to worry about making it. And they can make it pretty, and so everybody can like it when others make it after you."

    - Pablo Picasso
  • You should be able to hear either one if you try. I think the recording is both words right on top of each other. Your brain has to pick one to make sense of what it is receiving, so you perceive one or the other; a few people hear both at the same time.
  • erco Posts: 19,226
    edited July 12
    I just listened to it again and I swear they changed it. ~ 4 hours ago it sounded exactly like yammy, nothing like laurel. Made me wonder what the gag was. And now I hear laurel very clearly, nothing close to yammy at all.
  • I hear Laurel and think the Yannis are nuts.

    Here's a simulator.

    https://www.nytimes.com/interactive/2018/05/16/upshot/audio-clip-yanny-laurel-debate.html

  • I was convinced for a time that they were messing with phasing between the two stereo channels. When I sit in front of my MacBook, I hear "laurel." When I move off to the side, I hear "yanni." But then, I slowly moved from the side to the front and still heard "yanni."

    I'm not convinced by the frequency argument yet. There has to be something more subtle going on than that.

    -Phil
  • I think it has to do with the nature of the digital synthesizer used to render speech. There is a coarse (digital) component and a somewhat smoother (still digital, but better) component. I think the original word to be synthesized is "Laurel," which would be the smoother component; the coarse component reveals itself as "Yanni" as the synthesizer struggles to render the sounds together. It's a bad example anyway... I hear "Yummy"... what is Yanni anyway? Just digital artifacts, if you ask me.


    Beau Schwabe -- Submicron Forensic Engineer
    www.Kit-Start.com - bschwabe@Kit-Start.com ෴෴ www.BScircuitDesigns.com - icbeau@bscircuitdesigns.com ෴෴

  • Shawn Lowe Posts: 595
    edited July 12
    This is so weird. I distinctly hear Laurel. Played it for my wife, and she hears yammi. Weird.
  • As Dubya would have said: it must be sublimaniminals.
  • I think a lot of this is down to what you expect to hear. What you are used to and what you are not.

    For me as an old Brit "Laurel" is familiar, "yanni" is not even a word. Why would my brain match a sound to something it knows nothing about?



  • Here is what wiki says. It depends on what frequencies you hear best.
    Re-inventing the wheel is not a waste of time if, when you are done, you understand why it is round.
  • Yes, the frequency. Your hearing and the sound system in your computer/TV play a huge part in what you can hear. There are a ton of misheard lyrics ("bathroom on the right," "wrapped up like a douche," etc.) that are a testament to this.

    It has nothing to do with some mystical brain magic, except that the brain can interpret only the sounds that come through the cochlea. Impair the quality of that sound and your brain will make a closest association, even if it's a nonsensical word.
  • GordonMcComb Posts: 3,366
    edited July 12
    Heater. wrote: »
    For me as an old Brit "Laurel" is familiar, "yanni" is not even a word.

    It may not be a "word" but it is the name of a famous musician who's been playing for nearly four decades. Odds are you've heard the word/name at one point or another in your life, though that has nothing to do with why you or anyone would hear "yanni." Our brains will approximate; "Earn more sessions by sleeving / Ten more seconds and I'm leaving." Steve Martin knows, just ask him.

  • When you brought up the point about frequencies, I decided to try something. I listened through the speaker on my tablet, and then I plugged a set of earbuds in. The difference was surprising...when I use the speakers on the tablet I hear "yanni", and when I use the earbuds I hear "laurel".
    San Mateo, CA
  • GordonMcComb Posts: 3,366
    edited July 13
    For many years I worked for several Hollywood film labs, including Technicolor, developing software used in the transcription of film and TV dialogue. These to-the-frame transcripts are done to create lists for subtitling, foreign language translations, captioning, and other uses. Film sound is often highly augmented and compressed, with effects and music added, making interpreting a character's spoken lines a real art. I was never very good at it, but fortunately I only had to write the software.

    There were a lot of times when a transcriber got the words totally wrong, and the studios always made an unholy fuss about it, even though they were fully aware of the challenges. It can be a real problem in translations because if they've already started dubbing they have to pay for a retranslation, and maybe get the actor to come back in and voice the new dialogue. It's not unusual for voice actors to be paid a minimum four-hour day, even if they work for 10 minutes. Mistakes add up.

    After a couple of instances of this with a particular studio that liked to be a PITA for price leverage, we always demanded an isolated dialogue track only. Arnold Schwarzenegger is hard enough to understand without loud music and explosions. In a particular project involving robots from the future, when hearing the actual dialogue stems without filtering, augmentation, or effects, a number of lines came out very differently.
  • Arnold Schwarzenegger is hard enough to understand without loud music and explosions. In a particular project involving robots from the future, when hearing the actual dialogue stems without filtering, augmentation, or effects, a number of lines came out very differently.
    Hasta la vista, maybe!

    -Phil
  • I just hear 'jello'

    Mike
    I am just another Code Monkey.
    A determined coder can write COBOL programs in any language. -- Author unknown.
    Press any key to continue, any other key to quit

    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this post are to be interpreted as described in RFC 2119.
  • GordonMcComb,
    It may not be a "word" but it is the name of a famous musician who's been playing for nearly four decades.
    Ah, you mean Yanni as in Yiannis Chryssomallis. Interesting, never heard of him, thanks for the heads up. Although that's not really my style.

    I'm more likely to think "jänis" the Finnish word for "hare".

  • Alexa, how do you wreck-a-nice beach?
  • erco Posts: 19,226
    The BIGGER question here in the Matrix is choice: What do you WANT to hear? Yanni or Kenny G?
  • Heater. Posts: 20,926
    edited July 18
    Neither. Same to me. All sounds like that stuff you hear while waiting for a call to a customer support center to be answered.

    I want Janis:

  • GordonMcComb Posts: 3,366
    edited July 18
    I never listened much to Yanni. I throw rocks at the stereo system when there's a Kenny G song on. I cannot stand soprano sax. I'm a Stan Getz (mostly tenor sax) kind of guy.

    Now, was that Girl from Ipanema or Grill for Hypotenuse?
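
On the frequency explanation raised earlier in the thread (what you hear depends on which frequencies your ears and your speakers pass along), here is a minimal Python sketch of the general idea. It is not an analysis of the actual clip. It assumes a made-up two-tone signal (200 Hz and 3000 Hz at a hypothetical 8 kHz sample rate) and uses a simple one-pole filter as a rough stand-in for a "bassy" versus a "tinny" playback device; every name and number in it is illustrative.

```python
import math

RATE = 8000  # samples per second (assumed for this sketch)

def two_tone(low_hz, high_hz, seconds=0.25, rate=RATE):
    """Synthetic test signal: two equal-amplitude sine tones added together."""
    n = int(seconds * rate)
    return [math.sin(2 * math.pi * low_hz * i / rate)
            + math.sin(2 * math.pi * high_hz * i / rate)
            for i in range(n)]

def one_pole_lowpass(samples, cutoff_hz, rate=RATE):
    """Very simple low-pass filter: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / rate)
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out

def tone_level(samples, freq_hz, rate=RATE):
    """Estimate the amplitude of one frequency by correlating with sin and cos."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / rate)
             for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / rate)
             for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n

LOW_HZ, HIGH_HZ = 200, 3000
signal = two_tone(LOW_HZ, HIGH_HZ)

# "Bassy" playback: keep mostly the low tone.
bassy = one_pole_lowpass(signal, cutoff_hz=500)
# "Tinny" playback: whatever the low-pass removed, i.e. mostly the high tone.
tinny = [x - y for x, y in zip(signal, bassy)]

for name, version in (("original", signal), ("bassy", bassy), ("tinny", tinny)):
    print(name,
          "low tone:", round(tone_level(version, LOW_HZ), 2),
          "high tone:", round(tone_level(version, HIGH_HZ), 2))
```

The same input comes out dominated by the low tone in one version and by the high tone in the other, which is the kind of difference the tablet speaker versus earbuds experiment above would produce. Whether that fully accounts for the two words people hear is a separate question the sketch does not settle.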


