Wake Word detection using electret microphone and Chip's FFT
Chip's excellent FFT example got me thinking that if I can discriminate between words by what I see on the output, maybe there's some simple algorithm that will do it automatically.
Here's Chip's original thread on the FFT: https://forums.parallax.com/discussion/170948/
This is a first step, for sure, but it kind of works.
One big thing that is missing is automatic gain control. I think it currently relies on the input volume matching the volume of the reference.
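One possible way to reduce the dependence on volume (this is not in the current code, just an idea) is to scale each captured sample to unit energy before comparing. Here is a minimal Python sketch; the function name `normalize_energy` and the list-of-frames layout are assumptions for illustration only.

```python
import math

def normalize_energy(frames):
    """Scale a captured sample (list of frames, each a list of FFT bin
    magnitudes) so that its total energy (sum of squares) is 1.0."""
    energy = sum(v * v for frame in frames for v in frame)
    if energy == 0:
        # Silence: nothing to scale, return an unchanged copy.
        return [list(frame) for frame in frames]
    scale = 1.0 / math.sqrt(energy)
    return [[v * scale for v in frame] for frame in frames]

# Hypothetical data: the same "word" captured at two volumes.
quiet = [[1.0, 2.0], [3.0, 4.0]]
loud = [[10.0, 20.0], [30.0, 40.0]]  # 10x louder
```

After normalization, `quiet` and `loud` come out (nearly) identical, so a comparison between them no longer depends on how loudly the word was spoken.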
Anyway, here's a video of a test with wake word "robot":
The blue LED on the bottom right lights up when the word is detected.
You can see in the video it's not perfect, but pretty good. Needs some work to beat Alexa...
The code is rigged to use the left button of a USB mouse to enter the reference word. You hold the left button down, say the wake word, then release.
The wake word is captured and displayed in the top left window.
Current input is shown on the bottom, like Chip's original, but it's windowed to only look at a certain frequency range.
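For anyone wondering what "windowed to a frequency range" amounts to: it is roughly a matter of keeping only the FFT bins that fall inside the band of interest. A rough Python sketch follows; the sample rate, FFT size and band edges are made-up numbers for illustration, not the values used on the chip.

```python
import math

def band_slice(magnitudes, sample_rate, fft_size, f_lo, f_hi):
    """Return the FFT magnitude bins whose center frequency lies in
    [f_lo, f_hi]. Bin k is centered at k * sample_rate / fft_size Hz."""
    bin_hz = sample_rate / fft_size
    lo = math.ceil(f_lo / bin_hz)    # first bin at or above f_lo
    hi = math.floor(f_hi / bin_hz)   # last bin at or below f_hi
    return magnitudes[lo:hi + 1]

# Assumed example: 8 kHz sampling, 16-point FFT, so bins are 500 Hz apart.
# Bins 0..8 cover 0..4000 Hz; each value here is just the bin index.
spectrum = [0, 1, 2, 3, 4, 5, 6, 7, 8]
voice_band = band_slice(spectrum, 8000, 16, 1000, 2500)
```

With these assumed numbers, `voice_band` holds bins 2 through 5 (1000 Hz to 2500 Hz).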
The input is constantly copied to the top sorta-middle window and compared to the reference.
Right now the algorithm is just looking at the sum of squared errors.
Blue LED is lit when error is below some threshold.
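The comparison boils down to something like the Python sketch below. The function names, the flattened-list layout and the threshold value are placeholders for illustration; the real threshold was found by experiment and is not shown here.

```python
def sum_squared_error(reference, current):
    """Sum of squared differences between two equal-length samples."""
    if len(reference) != len(current):
        raise ValueError("samples must be the same length")
    return sum((r - c) ** 2 for r, c in zip(reference, current))

def word_detected(reference, current, threshold):
    """True when the error is below the threshold (LED on)."""
    return sum_squared_error(reference, current) < threshold

# Hypothetical flattened samples.
reference = [0, 3, 5, 2]
close_match = [0, 3, 4, 2]   # error = 1
no_match = [5, 0, 0, 5]      # error = 25 + 9 + 25 + 9 = 68
THRESHOLD = 4                # placeholder value
```

Here `word_detected(reference, close_match, THRESHOLD)` is true and `word_detected(reference, no_match, THRESHOLD)` is false. Note that a plain sum of squared errors grows with amplitude, which is why the volume has to match the reference.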
Also looked at using the sum of vertical lines in the sample. It's shown under the samples. Hasn't proved useful yet but might be...
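In case it helps picture the vertical-line sums: assuming the sample is stored as rows of pixel or bin values, the idea is to add up each column, which gives one number per vertical line. A small Python sketch, with a made-up layout and the hypothetical name `column_sums`:

```python
def column_sums(rows):
    """Sum each vertical line (column) of a sample stored as a list of
    equal-length rows. Returns one total per column."""
    return [sum(column) for column in zip(*rows)]

# Hypothetical 2-row, 3-column sample.
sample = [
    [1, 2, 3],
    [4, 5, 6],
]
profile = column_sums(sample)  # [5, 7, 9]
```

The resulting profile is much smaller than the full sample, so it could serve as a cheap first-pass feature if it turns out to be useful.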
It would need a lot more work to be generally useful, but I think this is a start...