Who would want a voice control module for BS?
GICU812
Posts: 289
I don't see this as being a conflict, since Parallax does not offer anything similar, and I think it is something a lot of people would like to build into their projects. If anyone from Parallax has a problem with this, I'll take it down.

I've been working with the VR Stamp (no Parallax relation) and I think I have developed a rather user-friendly program that will interface with the Basic Stamp. It could probably interface with anything, but I've programmed it for the BS2 for now.

Basically, there are four key words it listens for constantly.

When it hears a key word, it emits a very soft beep (optional), then listens for a sub word. Each key word has up to 4 sub words.
When it hears a valid sub word, it sends a serial-ish signal out one wire at 3.3 V (so I can't use actual RS232 serial). If it doesn't hear a valid sub word within 3 seconds, it times out and resumes key word listening.
It uses sound files to prompt during training, and for an additional charge I could add in whatever sounds you wanted for various sub words.
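For anyone who wants to picture the key word / sub word flow described above, here is a rough Python model of it. This is only a sketch of the behaviour as described in the post, not the VR Stamp firmware: the class name, method names, and the idea of feeding it already-recognized words with timestamps are all assumptions made for illustration.

```python
KEYWORD_TIMEOUT_S = 3.0  # from the post: resume key word listening after 3 seconds


class TwoLevelListener:
    """Toy model of the key word -> sub word flow (hypothetical, not the real firmware)."""

    def __init__(self, vocabulary):
        # vocabulary maps each key word to its list of sub words.
        if len(vocabulary) > 4 or any(len(subs) > 4 for subs in vocabulary.values()):
            raise ValueError("at most 4 key words with 4 sub words each")
        self.vocabulary = vocabulary
        self.active_keyword = None  # None means "listening for a key word"
        self.heard_at = 0.0

    def hear(self, word, now):
        """Feed one recognized word and a timestamp in seconds.

        Returns (key word, sub word) when a full command is heard, else None.
        """
        # Fall back to key word listening once the sub word window has expired.
        if self.active_keyword is not None and now - self.heard_at > KEYWORD_TIMEOUT_S:
            self.active_keyword = None

        if self.active_keyword is None:
            if word in self.vocabulary:
                self.active_keyword = word
                self.heard_at = now
            return None

        if word in self.vocabulary[self.active_keyword]:
            command = (self.active_keyword, word)
            self.active_keyword = None
            return command
        return None  # not a valid sub word: keep waiting until the timeout


listener = TwoLevelListener({"lights": ["hall", "kitchen"]})
for t, word in enumerate("turn on the lights in the hall".split()):
    result = listener.hear(word, now=t * 0.3)
    if result:
        print(result)
```

Words that are neither a key word nor a valid sub word are simply ignored, which is how a full sentence can still trigger a command in this model.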

You can program up to 4 key words, and each has up to 4 sub words, for 20 words in total. Once you have trained all 20 words, you can retrain any of the sub words whenever you like, clear the training, and adjust the recognition tolerance.

You would need 4 buttons and, optionally, 3 LEDs for training, plus one Stamp pin for the "serial" line on 'poll' compatible Stamps. Non-polling Stamps need more lines.

It works VERY fast. It will recognize the key word the instant you have said it, before you even get to the next word. There is a very brief pause before it picks up on the sub word. If your words were Lights and Hall, it would work if you said "Turn on the lights in the hall". It works with a TV on in the background, so it's rather noise robust as well, and it seems to have no problem picking words out of sentences. It will also recognize short phrases, and in fact those would be better for reducing false positives. I don't have too much trouble with false positives, except with pairs like "bedroom" and "familyroom": it seems to place more emphasis on the last syllables than on the first.

Things I can't change:
1: You have to train the chip: press the button, say the word, repeat, etc. I can't preprogram the chips, but the training is stored in EEPROM, so it's non-volatile.
2: You can have a maximum of 4 key words, no more, though you can train as few as you like. It could be reprogrammed for more sub words; we'll see how many people would need that.

I'm not looking to make money off this, more to open this opportunity to Stamp users, as actual development on this chip has a HUGE learning curve; it's taken me at least 2 months. But considering chip cost, shipping to me, programming, and everything else, I think it would need to be $65 + shipping to you.

So I'd like to hear from everyone who took the time to read this: would you or wouldn't you be interested, and if not, why not (unless it's just something you couldn't use or don't want)? At this point I could make some changes and tweaks, but I don't think I would be able to custom tune every single one, not for that price. But let me know what your major hangup is, or if you're really interested.

Thanks for reading,
Adam
Post Edited (GICU812) : 10/29/2008 4:38:13 AM GMT
Comments
On the things that you can't change, regarding the max of 4 key words: that seems a bit limited. I suppose you could hook up several modules and add their key word counts, but that could get expensive quickly. I'm sure that you have a good reason for a max of 4, but I'll whine: why? Hardware? Software?
Anyway, very impressive that you built something like this. Do different voice types work with the same key words? Say a man trains the word and a woman tries to use it?
Anyway, take a look at gadgetgangster.com. You can potentially make some money off your project, and you'll have something well documented to show for all your hard work.
The 4 word max is a hardware limitation. There is a way to have many more, but it's a different software technology and does not offer nearly the performance. The modules are somewhat expensive, so it probably wouldn't be worth it to piggy-back them.
Having 4 key words isn't that much of a limitation. You can address 4 different devices, or 4 different commands, like
Move - Forward
Move - Back
Move - Up
Move - Down
Turn - Right
Turn - Left
Turn - ?
Turn - ?
Robot - Return to start
Robot - Run program
Robot - Sleep
Robot - Wake up
Keyword 4 - ?
etc.
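One way to picture that list on the Stamp side is as a small lookup table that turns each key word / sub word pair into a number from 1 to 16. The numbering scheme below (key word index times 4, plus sub word index, plus 1) is purely an assumption for illustration; the post does not say how the module numbers its commands. The table mirrors the list above, with "Down" assumed for the fourth Move slot and unassigned slots left as None.

```python
SUB_WORDS_PER_KEYWORD = 4

# Hypothetical command table; None marks a slot that has not been assigned yet.
COMMANDS = {
    "Move": ["Forward", "Back", "Up", "Down"],
    "Turn": ["Right", "Left", None, None],
    "Robot": ["Return to start", "Run program", "Sleep", "Wake up"],
}

KEYWORDS = list(COMMANDS)  # dict order is preserved, so this fixes the key word index


def command_code(keyword, sub_word):
    """Map a (key word, sub word) pair to a code from 1 to 16 (assumed numbering)."""
    k = KEYWORDS.index(keyword)
    s = COMMANDS[keyword].index(sub_word)
    return k * SUB_WORDS_PER_KEYWORD + s + 1


def decode(code):
    """Inverse of command_code: turn a received code back into its words."""
    k, s = divmod(code - 1, SUB_WORDS_PER_KEYWORD)
    keyword = KEYWORDS[k]
    return keyword, COMMANDS[keyword][s]


print(command_code("Turn", "Left"))
print(decode(command_code("Robot", "Wake up")))
```

With 4 key words of 4 sub words each, the largest code under this scheme is 16.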
As for different speakers, there is definitely software that will let you pre-program action words that respond to any voice, but it's $1000 for the low-end version, so that's not an option. I haven't had many other people try the module. It does have options for biometric voice recognition, where it almost guarantees you are the only one who can activate it. I don't know how picky it is with this program, but I can completely alter the tone or pitch of my voice and it still hits every time. I'll have some other people try it.
Post Edited (GICU812) : 10/29/2008 5:55:24 AM GMT
This last message paints a much clearer picture of how it might be applied. Consider this for a moment.
You have a large and/or expensive robot. It's operated by I/R, RF or other similar remote control. You have told it to turn right or left to avoid going into a nearby pond. The batteries on the remote DIE. Is it worth $65.00 to keep your robot from becoming a submersible? I'd say so!
I'd buy one, if I had the proper application. Did you approach Ken Gracey? More than one of the Parallax products over the years have come from those who have taken the time to learn a particular technology and applied it in a way that's suitable for robotics application. The worst he can say is no thanks!
Regards,
Bruce Bates
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
When all else fails, try inserting a new battery.
What other components are needed to support the chip?
How much current does the chip draw?
Running, it draws 26 mA at 3 V. It has a sleep function (<20 µA at 3 V), but I don't think that would be useful given the application.
Attached is a basic connection diagram. You need to supply 3.3 V, 4 buttons and 3 LEDs, as well as a mic and speaker. It will work with PWM connected directly to a speaker, but it sounds better with an LM386 amp circuit running off the Audio pin.
Post Edited (GICU812) : 10/29/2008 10:45:59 PM GMT
Also, why couldn't you put two chips together and each on a separate i/o line?
What does it cost, and can I get one sent to Australia?
It should be able to work on any Stamp, but because they can use the POLLIN commands, the 2p series Stamps can do it with one line.
Other Stamps will require 2 lines. I'm still working on a 2-line interface, but I think the best way would be:
Comm PIN 0
ACK PIN 1
The VRS pulls COMM high.
When the BS gets around to checking and sees it high, the BS pulls ACK high and begins listening.
When the VRS sees ACK high, it drops COMM for x ms, then begins transmitting data on COMM.
The easiest way to transmit the data would probably be as 1 ms pulses, read with a COUNT command. Given that the longest transmission would be 16 pulses, which would take 32 ms, I don't think it's really worth it to code a comm protocol. It can do plain old serial comms, but then you would need a MAX232 chip; I figured this would be easier and use fewer components. It has worked flawlessly for me so far and is very easy to implement in the BS.
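To make the timing concrete, here is a small Python simulation of the two-line scheme described above. It is a sketch under assumptions, not the actual VRS or Stamp code: the 5 ms value for the unspecified "x ms" gap, the 1 ms low time between pulses, and all function names are made up for illustration. The pulse counter only approximates what a PBASIC COUNT would report over its window.

```python
def vrs_waveform(code, gap_ms=5):
    """COMM levels, one sample per ms, starting when the VRS sees ACK high.

    gap_ms stands in for the post's unspecified "x ms" (assumed value).
    """
    if not 1 <= code <= 16:
        raise ValueError("code must be between 1 and 16")
    samples = [0] * gap_ms      # COMM is dropped low for the gap
    for _ in range(code):
        samples += [1, 0]       # each pulse: 1 ms high, then 1 ms low (assumed)
    return samples


def count_pulses(samples):
    """Count rising edges, roughly what COUNT would see on the pin."""
    edges = 0
    previous = 0
    for level in samples:
        if level and not previous:
            edges += 1
        previous = level
    return edges


def transfer(code, gap_ms=5):
    """Walk through the handshake once and return the value the BS would read."""
    comm = 1                    # VRS raises COMM to say it has a command
    ack = 0
    if comm:                    # BS eventually checks COMM and sees it high
        ack = 1                 # BS raises ACK and starts listening
    if not ack:
        return None
    return count_pulses(vrs_waveform(code, gap_ms))


print(transfer(7))
print(len(vrs_waveform(16, gap_ms=0)), "ms for the longest transmission")
```

With 1 ms high and 1 ms low per pulse, 16 pulses come to the 32 ms mentioned above, not counting the gap.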
In reality, you could use as many VRS chips as you have Stamp I/O pins.
I'm starting at $65 + shipping to you. If anyone is interested, PM me for my email. I can take PayPal (+fees) or money order (actual cost). Again, I can add custom sound responses if you like, but it involves recoding the program, so there will be an extra cost.
I'm working on getting a video of this working.
I'm starting to play with a single-level 31 word program. It would listen for 31 different words or very short phrases all the time, and a single word would activate it.
The advantages of this program:
1: More words: up to 31 (maybe more)
2: Faster action, single word activation
The downsides of this setup:
1: More likely to false accept. Since there is only one word, there is no protection against false positives.
2: It is pickier about noise and sounds around the action words.
3: It's less robust against background noise.
Would this be of interest to anyone? It would work well in some projects.
Here's my next proposal, which unlike this offer is more of a 'what if' than an 'I'm ready to ship'. Voice-controlled lighting, etc., with a Star Trek communicator:
I take a modified version of the code I'm running in my house and install the chip, with supporting circuitry including an X10 FireCracker, in this:
http://www.thinkgeek.com/images/products/additional/large/star_trek_classic_communicator_inhand.jpg
Package it with an X10 transceiver, so anyone can voice control one outlet out of the box. With additional X10 parts (which are relatively cheap) you could control up to 16 of anything: outlets, light switches, relay contacts, etc.
I think the price on the package would probably be $150, given all the parts I would need to buy. I'd have to custom rebuild the communicator and put in a MAX232 (the FireCracker is wireless but driven by RS232), plus batteries, brownout protection, regulator, mic, speaker, and probably something else I'm not thinking of. Add the cost of an X10 transceiver, the communicator, and a VR Stamp module, plus my coding time, and that's actually probably a little underpriced. But if someone wanted to 'fund' the project, I could build one for them and see how it went.
And yes, it would come preloaded with Scotty's voice for a number of quippy responses.
Hmmm... I wonder if there are trademark issues with that. I'd have to look into that.
Post Edited (GICU812) : 11/14/2008 8:11:18 PM GMT
Actually, keyboards and keypresses and IR-Remotes and RF-Remotes are all very nice, unambiguous, rarely-spoofed devices. It would be annoying to be having a conversation with someone, and have the computer get a false-positive and turn the lights off.
If you HAD one of these, and it cost $20, you'd get some response. Hypotheticals don't get much play, we've seen too much vaporware.
I guarantee it won't false trigger unless you speak the trigger phrase in conversation; it won't go off with just "lights". And if you use the one word triggering, it can't pick the words out of conversation: you have to pause before and after. Though I think you could train a phrase as the 'word', so you could say "lights on" and it would accept that as one 'word' and trigger that action. But again, you would have to pause, say "lights on", then pause before you could continue talking.
I don't know, I like being able to sit on the couch in the dark watching TV and say "lights on", and the lights come on so I can get up and walk around, or being able to go to bed and say "lights, all off", and every light in the house goes off. If two of the three motion sensors in the front of the house trigger within x seconds, it alerts me with "front motion detected". Same with the rear, only "rear motion". Likewise, I can turn on the spotlights without having to walk to the switch.
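The "two of three sensors within x seconds" rule above is easy to sketch. This Python version is only an illustration of the idea, not the code running in the house: the sensor names, the 10 second window standing in for "x", and the function name are all assumptions.

```python
def motion_confirmed(last_trigger_times, window_s):
    """Return True if any two sensors triggered within window_s seconds of each other.

    last_trigger_times maps a sensor name to the time (in seconds) it last
    triggered, or None if it has not triggered.
    """
    times = sorted(t for t in last_trigger_times.values() if t is not None)
    # After sorting, the closest pair is always an adjacent pair.
    return any(later - earlier <= window_s
               for earlier, later in zip(times, times[1:]))


front = {"porch": 100.0, "driveway": 104.5, "walkway": None}  # hypothetical sensors
if motion_confirmed(front, window_s=10):
    print("front motion detected")
```

Requiring two sensors to agree is what keeps a single stray trigger from setting off the announcement.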
My GF likes being able to walk in the house when she gets off work in the morning and it's dark and say "lights, all on", so she doesn't have to be scared of the dark. Of course, I have the bedroom and spotlights excluded from that.
I've toyed with integrating the thermostat, but that is a little impractical, mostly because there is no immediate result and you don't usually adjust the thermostat routinely. I can arm and disarm the alarm using the biometrics, so it will only respond to my commands; again, it's nice to be in bed and be able to set the alarm to "home" without having to get up. The biometrics aren't included in what I'm offering, but there's no reason you couldn't allow anyone to arm the alarm, and frankly, with the training recognition, the chip is pretty user specific anyway.
Man... I sound lazy, lol.