Voice Recognition Analysis
mallred
Posts: 122
The following commentary is Dr. Jim's empirical observation with regard to voice recognition. He has completed the hardware that removes and discards the carrier wave, a step necessary for his approach to voice recognition (see the article below), and has proven it out on the oscilloscope. All that is left is the modulation wave, which is much less complex and can therefore be analyzed on a Propeller-based system such as ours.
He has also written code into the Datalogger's EEPROM that adds FAT32 support. This has been tested on a variety of thumb drives, and also on a 100+ GB hard drive. Our system can now support very large programs, files, etc., which are also 100% DOS- and Windows-compatible.
This also means that, for machine intelligence, much more intelligence can be stored than we previously thought, in long-term memory (the hard drive), which can now scale into the terabytes. Access is fast as well. For most purposes, virtually unlimited capacity is now available to the KISS OS, which completely cuts the umbilical cord to the PC: it is no longer necessary for your applications. They can be completely self-contained on your MIT-enhanced Proto Board system.
Work is currently underway on the voice recognition software.
This message is generally not for those who respond, but for the silent majority who can make observations and draw conclusions on their own, without the help of the so-called "chattering class".
This is how Dr. Jim proposes to do speaker-independent voice recognition that is capable of natural language(s).
Here is Dr. Jim:
The computer world has struggled with voice recognition and its correct implementation for many years. Only primitive and very poor results have been achieved thus far.
We will look at this problem starting from empirical observation of the human as a model.
You have vocal cords and a pharynx, a face and an oral cavity, and a tongue and teeth that form the words of human speech.
You have to be able to strike a note vibrating at a specific frequency in order to speak. Some people have lower or higher voices – the frequencies with which they are most comfortable. That will be the voice with which you speak, the one most comfortable to you.
When you whisper, you have the rush of air out of your lungs and oral cavity, over the tongue and teeth, and out through the mouth. No vibration of vocal cords occurs with whispering.
It should be noted that words can be formed understandably either with the vocal cords or by whispering. Again, whispering can be understood. This is a clue that everybody in voice recognition misses. So, obviously, the vocal cords only serve to project and increase the volume and distance that your words carry.
There are some inflections that can be added to your speech that increase or decrease volume only. These intonations are not necessary to understand speech. If they were necessary, whispered speech could not be understood.
The vocal cords provide a carrier wave for the spoken word. When you whisper, that carrier wave is eliminated, yet you can still understand speech. All that is left is the rush of air through the oral cavity.
When you whisper, all you are modulating with your facial muscles is the rush of air from the lungs to the throat, tongue, teeth, and lips. There is no carrier wave, i.e. the vibration of vocal cords.
Without the carrier wave that the vocal cords generate, you can still be understood just as well, as long as you can be heard.
We assume that the microphone (audio pickup) has sufficient sensitivity to pick up a whisper.
What all this means is that we can extract and discard the carrier wave, because it is not necessary for speech. That is the key. All you have left is the modulation. In a whisper, all you have is the modulation, nothing else. The rush of air in a whisper is the carrier, but it is nothing more than white noise in the audio spectrum.
Therefore, we have established that to do speaker-independent voice recognition, it is absolutely imperative that you remove the carrier, whether it be the vibration of vocal cords or white noise, and analyze nothing but the modulation, absent any carrier.
Would that not be step one in the creation of independent voice recognition, the removal of the carrier wave and the analysis of only the modulations?
This contrasts with the current technology of voice recognition, where the carrier wave is always considered. This needlessly and exponentially increases the complexity and computational power necessary for voice recognition, not to mention speaker independence.
By empirical observation alone, we have exponentially decreased the complexity of speech recognition.
My approach to speech recognition is to filter out all vestiges of the carrier wave, leaving nothing but the modulation waveform.
On the oscilloscope, all you have left is the modulation wave, a far less complex waveform to analyze.
My voice recognition uses nothing but the modulation.
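For reference, one standard way to discard a carrier and keep only its modulation is rectification followed by low-pass filtering, the classic AM envelope detector. Here is a minimal digital sketch of that general idea; the cutoff frequency is assumed, and nothing here is Dr. Jim's actual (undisclosed) circuit:

import numpy as np
from scipy.signal import butter, filtfilt

def extract_envelope(audio, fs, cutoff_hz=50.0):
    # Classic AM envelope detector in digital form: rectify, then low-pass.
    # cutoff_hz is an assumed value; the posts give no filter parameters.
    rectified = np.abs(audio)               # discard the carrier's sign/phase
    b, a = butter(4, cutoff_hz / (fs / 2))  # 4th-order low-pass, normalized cutoff
    return filtfilt(b, a, rectified)        # zero-phase smoothing -> modulation envelope

Sampling this envelope at 1 to 2 KSPS, as Step 2 below proposes, would then capture only the modulation.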
Here are all the steps I take to create my voice recognition software, except for the actual algorithms I use, which I will never divulge in source code, but will provide for sale in binary format only.
Step 1: We have to design a circuit to filter out the carrier wave component of speech, regardless of what that is, whether the vibration of vocal cords in normal speech, or the rush of air in a whisper. This should be done before digitization of the audio input and should leave nothing but the modulation waveforms.
Step 2: Now we digitize the raw modulation information. This is a relatively low-frequency component. A one- to two-KSPS (thousand samples per second) digitization should be more than sufficient to track the modulation information well.
Since most words require less than one second to pronounce, the raw digitized data will be less than or equal to 2000 bytes, where 1 byte = 1 sample. Eight bits per sample is more than sufficient to track this modulation waveform.
Step 3: We convert the 2000 samples, as an example, into a delta modulation waveform, which yields approximately 1 bit per sample. This, then, is a valid compression technique for the modulation component. It yields 250 bytes to represent the original 2000 bytes, an 8:1 compression ratio (see the sketch after Step 5).
Step 4: We pass the 250 bytes of information through the neural synaptic generator which then yields one synaptic interconnection to represent the original spoken word.
Step 5: Repeat steps 1 – 4 by speaking the word 10 – 20 times, which will represent 10 – 20 variations on the word being taught. This will yield a 95 – 98% recognition rate of the spoken word in a speaker-independent form.
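Here is the sketch of Steps 2 and 3 mentioned above, under stated assumptions: 8-bit envelope samples arriving at 2 KSPS and a fixed-step delta modulator. It illustrates generic delta modulation only, not Dr. Jim's undisclosed algorithm:

import numpy as np

def delta_modulate(samples, step=2):
    # 1 bit per sample: 1 = predictor steps up, 0 = predictor steps down.
    # Integer compare-and-add only, consistent with the no-floating-point claim.
    bits = np.zeros(len(samples), dtype=np.uint8)
    estimate = int(samples[0])
    for i, s in enumerate(samples):
        bits[i] = 1 if s >= estimate else 0
        estimate = estimate + step if bits[i] else estimate - step
        estimate = min(max(estimate, 0), 255)   # clamp to the 8-bit range
    return np.packbits(bits)                    # 2000 bits -> 250 bytes (8:1)

# One second of envelope at 2 KSPS: 2000 one-byte samples in, 250 bytes out.
envelope = np.random.randint(0, 256, 2000)      # stand-in for real envelope data
assert len(delta_modulate(envelope)) == 250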
This recognition percentage can be increased by merely increasing the number of teachers with differing dialects – northern, southern, eastern, and western, and perhaps English and Australian dialects as well. The machine can then be allowed to make a "best guess" at the word, which further increases its accuracy.
These algorithms require no floating-point or signed arithmetic to implement, and can be implemented using an extremely simple instruction set, i.e., a reduced instruction set computer (RISC) environment.
This is why we don't need a powerful mainframe architecture; we merely need speed, and that only in moderation.
By following these engineering axioms, a highly sensitive, speaker-independent speech recognition system can be implemented on a low-cost platform.
Let us compare this to the brain. In the neurobiological model (the brain) there are no adders, subtractors, floating-point units, or signed arithmetic units, and yet a child can learn to recognize any spoken language – or several at once – with great accuracy.
The key to my version of voice recognition is, as Einstein stated, “empirical observation.”
Dr. Jim Gouge
Machine Intelligence Technologies
support@machineinteltech.com
Comments
Can I ask what the baud rate is that he is getting (assuming that he is using the Parallax Datalogger)? I've managed to get about 13KB per second (write), although with larger transfers it could jump up to perhaps 50KB per second. That's at 460_800 baud. Now, assuming that he is running at the full spec'ed speed of 3 Mbaud, that is 300 KB per second. With a 2 TB hard drive (about 2 billion KB), that would take 6.6 million seconds to fill up the drive, or 77 days. Can he really use all that space considering that the Propeller has only 32K of RAM?
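Working the arithmetic as a quick sketch (assuming 10 bits per byte on the wire for 8N1 framing):

baud = 3_000_000                       # full spec'ed speed
bytes_per_sec = baud // 10             # 10 bits/byte under 8N1 -> 300 KB per second
drive_bytes = 2 * 10**12               # 2 TB drive
seconds = drive_bytes / bytes_per_sec  # about 6.7 million seconds
print(seconds / 86_400)                # about 77 days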
Remember that he's using a multi-level memory system. The Propeller's 32K is probably used mostly for active code and for buffering information from a relatively large SRAM (a couple of MB). This in turn is archived / swapped out to the Datalogger storage for retrieval later. If he does this right, most of the patterns he needs at any given time are in the large SRAM along with some kind of index to the stuff on the Datalogger. Again, if he does this right, he can overlap the retrieval of information from the Datalogger with other accesses to the large SRAM.
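Nothing about the actual design has been published, so purely as an illustration of that kind of hierarchy (every name and size here is hypothetical): a small fast tier standing in for the SRAM, backed by a slow bulk store standing in for the Datalogger drive, with least-recently-used eviction:

from collections import OrderedDict

class TwoLevelStore:
    # Hypothetical sketch: hot patterns live in a small fast cache ("SRAM"),
    # everything else is fetched on demand from slow bulk storage ("Datalogger").
    def __init__(self, bulk, cache_size=64):
        self.bulk = bulk                    # dict-like slow storage
        self.cache = OrderedDict()          # fast tier, kept in LRU order
        self.cache_size = cache_size

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)     # hit: mark as recently used
            return self.cache[key]
        value = self.bulk[key]              # miss: slow fetch from bulk storage
        self.cache[key] = value
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict the least recently used entry
        return value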
Decent behavior isn't that hard to achieve. I want to set the standard higher in my future conversations.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
JMH
Is it that we need two algorithms, one for whispers and another for speaking, or is there some other characteristic of the sound (it isn't the envelope... at least not for the vowels)?
I.e., he's correct as far as the consonants go, but his theory doesn't cover vowels (yet).
It's worth looking at more, though.
Note that conventional techniques have been implemented on small micros for years (no big mainframe is needed). Sensory Inc (www.sensoryinc.com) has been doing this for several years now.
Very interesting thinking though.
-D
The only difference is that voiced formants are excited by vibration-induced impulses (and their harmonics) from the vocal cords, whereas whispered formants are excited by pink noise from a constant rush of air. You can see the difference in the spectrogram. The vocal cord harmonics create a striped fine structure in the formants; the pink noise creates more of a continuum. But the formants, which depend on the size and shape of the vocal cavity (i.e. mouth, sinuses, throat) remain the same. That's why it's still possible to recognize whispered speech.
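This is easy to check with a spectrogram; a short sketch (the file names are hypothetical, and mono recordings are assumed):

import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
from scipy.signal import spectrogram

# Voiced speech shows harmonic "stripes" under the formant bands; whispered
# speech shows a noise continuum, but the formants sit in the same places.
for i, name in enumerate(["voiced.wav", "whispered.wav"]):  # hypothetical mono files
    fs, audio = wavfile.read(name)
    f, t, Sxx = spectrogram(audio, fs, nperseg=512)
    plt.subplot(1, 2, i + 1)
    plt.pcolormesh(t, f, 10 * np.log10(Sxx + 1e-12))        # power in dB
    plt.title(name)
plt.show()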
-Phil
http://www.fifthgen.com/speaker-independent-connected-s-r.htm
I especially liked the quote at the bottom of the article:
"Stay, you imperfect speakers, tell me more."
Shakespeare's Macbeth
Are "imperfect speakers" and the "chattering class" the same thing? I hope so.·
Duffer·
This must be astronomical, given the claims you were making when you thought you were limited in space.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Dr. Jim, Mr. Mallred,
This is written in the future tense - is it correct to assume, then, that Step 1 is not complete yet?
Would you please elaborate some on how such a filter is done?
thanks
- Howard
(Skeptical, but always willing to give the benefit of the doubt.)
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
(This post is just another +tick in the community interest column which will no doubt be used for favorable MIT marketing).
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
--Steve
Propeller Tools
I'm sorry you have to look at some of the unflattering things other posters here say.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Nyamekye,
Kye, we are here for people like you who appreciate fresh ideas.
Not sure about the baud rate. I would have to ask Dr. Jim. He does not like talking on this forum; I have to beg him for any input. However, he did want me to post his theory and approach.
As far as the electronics to filter out the carrier go, they are finished and working. We have proved them out on the oscilloscope. The circuit is not complex, but rather simple. I think there are a total of seven resistors, a single logic chip, some capacitors, and a microphone. Not too much to it at all.
I know we have made bold claims and statements in the past. We are making good progress on making those claims a reality. We are happy to share our enthusiasm for our approach to a difficult and perplexing problem. Maybe we will have something great emerge, maybe not. But we are working toward this end. Don't you think it noble to work toward something difficult? I think JFK said something about that when we were trying to reach the moon within the decade.
Thanks,
Mr. Mallred
This module has been tested and is working. You can see that it is very simple in construction.
Our Best,
MIT
John Abshier
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
--Steve
Propeller Tools
A whisper works fine, of course, but there is a practical problem: a whisper is very quiet. So it works in a quiet room, but not so well with background noise. For a robot, the real world is going to be noisy.
Is that filter just a high-pass filter? If so, what is the cutoff frequency?
When I get home I might post some research on what happens to the very best speech recognition systems when you start adding background noise in 3 dB increments. (Addit: file attached; PDFs are not allowed on the forum, so rename it as a .pdf.)
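For anyone who wants to reproduce that kind of test, here is a small sketch that mixes white noise into a clean recording at a target SNR and steps it down in 3 dB increments (the recording here is a stand-in):

import numpy as np

def add_noise(signal, snr_db):
    # Scale white noise so the mix has the requested signal-to-noise ratio.
    sig_power = np.mean(signal.astype(float) ** 2)
    noise_power = sig_power / (10 ** (snr_db / 10.0))
    noise = np.random.normal(0.0, np.sqrt(noise_power), len(signal))
    return signal + noise

clean = np.random.randn(16000)       # stand-in for a real one-second utterance
for snr_db in range(30, -3, -3):     # 30, 27, ..., 0 dB in 3 dB steps
    noisy = add_noise(clean, snr_db)
    # feed `noisy` to the recognizer under test and log the %correct score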
mallred - please keep us posted - this is all most interesting.
He said it was a logic chip, but I don't know of any 8-pin ones. It looks like an op amp circuit.
Leon
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Amateur radio callsign: G1HSM
Suzuki SV1000S motorcycle
Thanks,
Mark
I would like to take a moment to talk to you about myself.
I know that I am the nemesis of many of you. I am slightly technical, but clearly not as technical as most of you on this forum.
Dr. Jim is highly technical. He refuses to post on this forum.
The only reason anything gets posted is that I push for it occasionally. Perhaps I could do a better job of waiting until a product is more complete, but my intent is to let those who wish to follow our progress do so as we add, develop, or implement new technologies and ideas.
I am not a professional business person. That being said, I am a part of this company and I am the only reason you even know about us. I was the first to find and post on this thread. If nothing else, you owe me hours and hours of downright hilarious entertainment you would not find anywhere else.
But I'm more than that. I present ideas that people have struggled with for decades. In my simplistic way, I say: we are working on an answer to that problem.
You always want more data. OK, but I am not equipped to give you the data you desire, and getting it out of Dr. Jim is next to impossible when he is focused on his work.
So, as you see, we are at sort of an impasse. I am doing my best, in my modest way, to relay what is going on. I know it does not meet your standards, but it's all I have. Dr. Jim sees this forum as antagonistic and does not have time for it. And, to give credit where it is due, he is truly working hard on our project and sees this forum as a waste of time.
If you want to blame someone for the half-baked approach, that would be me. But again, I am speaking to those who want to see what we are up to, and I post for those people. Even if I cannot do it justice, I want you to know that we are working steadily toward, and making real progress on, our goals.
So, it seems to me that you either get a glimpse of what we are doing through me, or nothing until the finished project. When I first posted, I wanted to share the excitement of a possible breakthrough. That is still my intent.
I hope I have clarified some of my postings in my explanation here.
All my best,
Mark Allred
Can you give a reference for your assertion?
I have studied cochlear implants in great depth (indeed, I am wearing one right now), so I want to clear up some confusion here.
First, there are people who have been wearing cochlear implants for over 20 years and are still very happy and successful with them, so it's not clear that the electrical impulses will damage the nerves in any way. That said, there are very specific and important characteristics of the pulses (i.e., they are DC-balanced over short intervals, so electrolysis does not occur).
Secondly, the level of stimulation is determined more by user tolerance than anything else (T-levels).
Third, speech coding for cochlear implants has moved away from any sort of super-smart "extract vocal characteristics and encode them appropriately" approach to a much simpler one: extract some approximation of the spectrum and encode that. So I know of no currently used coding strategies that "dedicate some of the electrodes to voicing". Instead, the current effort is to exploit brain plasticity and the incredible power of the wetware to take care of the actual *analysis*.
In terms of "minimum pulses", I believe the effort goes more toward presenting useful spectral information than toward minimizing the count or duration of the pulses.
But as I am not an active researcher in the field, I may not have access to all the latest information.
See, a post like that goes a long way towards earning some respect. The problem arises when you present information suggesting things are further along than they are, and additionally seem to be flogging your wares with these high-profile postings. Only a week or so ago we all thought the KISS OS was going to be available, with features such as voice recognition and intelligence abilities. Clearly that is still some way off.
It is obvious that Dr Jim must be a clever guy and that he has some good ideas. If you were to present those as they are, rather than build them up, you'd probably receive a better response. The cryptic nature of many of these posts can only serve to cause more harm than good.
I'm still curious why several of Dr Jim's interviews state he has accomplished many of these things on other platforms and is working on 'porting' that work to the Propeller... To me that suggests there ought to be some demonstration of the work he has done, and it is that kind of media that would garner you much more respect.
It is in nobody's interest that you fail at achieving your goals.
> Dr. Jim sees this forum as antagonistic and does not have time for it. And, to give credit where it is due, he is truly working hard on our project and sees this forum as a waste of time.
That is most unfortunate, but I think it is quite clear that response is a direct result of the manner in which you presented yourself, and is not indicative of this forum in general. One need only browse through many of the user-submitted projects here to see the community support that follows. And smart as he may be, he is doing himself a great disservice by disregarding an excellent resource on the hardware he is using to create his project.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
What truly grates on me, though, is that you call for evidence, which I try to supply in my own simple way, and as soon as it posts, there is a flurry of rejection, spotted with a few positive comments here and there. It seems nothing I try to convey is ever good enough.
That has been the reaction on this forum to any of my posts or even to posts about us that we did not initiate, such as from Ken.
Has anyone else in this "friendly" support group ever experienced what I have?
Not exactly, but I usually get corrected if I'm wrong. Mostly I accept it and become humble. Sometimes I keep going anyway, just to prove something to myself, until I trip and fall :) Other times I've found the dissenters had exactly the same opinion the year before, and I just chuckle until I forget about the great injustice :)
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
--Steve
Propeller Tools
Mr. Mallred,
I'm sorry, but that *almost* borders on blaming the victim. (The victim, however, may not be you, but rather us, the forum members.)
With all due respect, we are a good bunch of people here. Even when we argue, it is done with respect and civility in the overwhelming majority of instances. You have read the many reasons why some forum members, in reading your posts, have gone from honest, open curiosity, to confusion, to reasonable disbelief, to annoyance, and to sarcasm.
Sarcasm here is usually light-hearted. But when you call forum members "the chattering class", you should reasonably expect to garner the same level of disrespect back. We, however, have only "stooped" to sarcasm and playful joking at your expense. That is a pretty healthy and humane response, one that some here may feel your former abrasive condescension does not deserve.
Having been a technical advisor to investors working on non-run-of-the-mill projects, I have seen many ways to present ideas, concepts, and prototypes to potential supporters. Not that I'm from Missouri, the "Show Me State", but your approach here to raising at least mutual respect has been, to put it politically correctly, highly unusual.
What do you *really* want from us here at the Parallax forums?
regards,
Howard
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
As I've mentioned before, I worked on a small (3-4 person) project 30 years ago that sounds so much like yours. The head of the project was brilliant, had a long list of important patents to his name, most of them classified, and a lot of practical experience in computing as well. He and one other person had come up with a scheme for an associative database that made sense and sounded really promising for things like text and speech recognition. I was one of the implementers. We were going to get it to run on a PDP-11 which was what was available at the time. As I recall, we were able to get it to work for small datasets. It was theoretically scalable to large enough datasets for practical use, but just didn't work when scaled up past the toy / demonstration size.
I've really not seen the kind of reaction to a series of postings that you seem to have engendered. I think part of it is that you're acting as a go-between for someone who's too busy for a conversation with peers. There's nothing wrong with being too busy, and I can imagine that Dr. Gouge feels that he has something unique, but the community here is very diverse. There are some extremely experienced people who drift in from time to time and plenty of PhDs and professional doctorates of all kinds. From the sort of information you've provided (to the best of your ability, perhaps), there's a lot about what you're doing on this project that just doesn't make sense, and some of the responses reflect that. Your responses to that level of criticism have been unhelpful, perhaps because you just don't have the background and detailed information to do better. I remember being in a similar position as a graduate student on the project I mentioned.
I don't know what to recommend other than a high level of transparency and being really, really clear about what are plans and intentions vs. demonstrable accomplishments (and I mean demonstrable ... with real demonstrations included, at least as links) with some details about what's actually going on. If you don't have the time to put together a good demonstration, don't post it. There are plenty of good technology demonstrations on the web. Apple's presentations are very well done. You don't have to do something as polished, but you can learn from them.
That's funny, "highly unusual".
I seem to have a way with this crowd, don't I?
What do I want from you at the Parallax forums? Great question.
Personally I would like to share my enthusiasm about a possible breakthrough on tough issues. Unfortunately, it always ends in name calling and a shouting match.
I am slowly coming to the conclusion that I do not have the technical ability to present what we have to offer in terms you appreciate, that is, very technical and detailed. I have access to Dr. Jim, but he is more interested in pushing forward on our project than in answering questions.
That's fine, I suppose, but then I have nothing but vague guesses and perhaps words taken out of context sometimes. If I have done this, I apologize. I am not trying to lie or even bend the truth, as some have said. I am trying to make sense of what I perceive we are doing, and then sharing perhaps an incomplete picture. At times I truly do get it, but may not know how to convey it technically enough for your tastes.
I have two options. I can stop posting until we have everything perfect, or I can continue posting, raising the ire of many of you, but still reach the silent majority out there that may wish to follow along, even if it is only a curiosity right now.
I know that the subject matter is highly interesting and also highly controversial, since no one currently has true machine intelligence like we see in the movies.
That's the holy grail, of course, and our iteration may actually be a letdown, but I hope it is a start, a beginning to truly cognitive computing.
Those are my thoughts.
Fair enough.
Then I would like to respectfully offer the following advice and suggestions:
* Please stop referring to forum members, directly or indirectly, as chatterers, ignorant, or whatnot. If you really want to "share [your] enthusiasm," calling us names is more likely to get us to share bile.
* Refrain from indirect provocation - you just did so, not so subtly, in your reply to me: "but [I] may not know how to convey it technically enough for your tastes."
* Reread your posts here from top to bottom as objectively as you can. You may be surprised at who actually has been setting up or triggering what "always ends in name calling." Highly controversial things can be discussed calmly - especially here. But if you put on airs, or condescend, wouldn't you expect people to become annoyed at the very least?
"...then I have nothing but vague guesses and perhaps words taken out of context sometimes."
Sorry, but the main reason, as far as I can see, that people's ire has been raised here is precisely because you have *not* provided a context, despite many repeated calls to do so. Why do you think we keep asking for more *substance*? Not only because we're interested, but because of this:
"I can stop posting until we have everything perfect [...]"
"[...] I do not have the technical ability to present what we have to offer in terms you appreciate."
If you do not have the technical skill to present this material, as you seem to be clearly admitting, then you should stop posting. Stop because *in most instances* here you may well be doing more harm than good to your reputation. This is one reason why many are so skeptical: if you all really had a viable product/idea, then you would be too busy to post, too busy making your business work. Further, if the idea is as viable as you say, then you would have the proper funding to hire an *experienced* marketing person who could clearly and reasonably articulate these ideas publicly without giving away any proprietary or I.P. information. That person, if he or she knew what they were doing, even as your subordinate employee, would advise you to stop talking publicly about these ideas, as your seeming missteps here reflect badly on what you wish to represent.
"...or I can continue posting, raising the ire of many of you, but still reach the silent majority out there that may wish to follow along, even if it is only a curiosity right now."
* Curb your enthusiasm and put it into making your business work - have someone who *does* understand provide us with some technically correct, substantive marketing materials. You will need this anyway.
sincerely,
Howard
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
To rokicki: I hope this isn't hijacking the thread, but your comments raise some interesting points. When I first got interested in speech recognition and cochlear implants back in the early 1990s, I asked the very simple question: why don't you just have 22 band-pass filters, feed their outputs into the 22 electrodes, and let the brain work it out? The answer came from the three researchers I co-wrote that paper with, and that was the state of knowledge back in the early 1990s. Of course, things have moved on since then: the nerves were fine (with the balanced DC, as you say), and it was possible to increase the pulse frequency and hence convey more and more information, then let the brain do all the clever work. I guess everyone was just being extremely cautious and conservative back then.
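That early idea, a bank of band-pass filters whose per-channel envelopes drive the electrodes, is simple to sketch in software. The channel count matches the question above; the band edges and filter order are assumed:

import numpy as np
from scipy.signal import butter, sosfiltfilt

def filterbank_envelopes(audio, fs, n_channels=22, lo=200.0, hi=8000.0):
    # Split the audio into n_channels log-spaced bands and return the
    # rectified envelope of each band, one row per electrode channel.
    edges = np.geomspace(lo, hi, n_channels + 1)   # log-spaced band edges (assumed)
    envelopes = []
    for low, high in zip(edges[:-1], edges[1:]):
        sos = butter(4, [low, high], btype="bandpass", fs=fs, output="sos")
        envelopes.append(np.abs(sosfiltfilt(sos, audio)))   # crude envelope
    return np.array(envelopes)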
A quick history, from one of those co-authors, Prof. Hugh McDermott, is here: tia.sagepub.com/cgi/content/abstract/8/2/49. As described, the first implants just detected sound in a simplistic way. They then moved on to speech, and by the time I was doing research, some cochlear implant users were able to talk on the telephone with no lip reading. Now they have moved on to music perception.
And the thing that I appreciate more and more is that the brain is far more clever than anything we humans have devised. It is why I have trouble with grand statements about machine intelligence. I'd like to see some hard evidence. Anything really - even something as simple as the %correct data for "yes" vs "no" for a sample of different speakers.
I'm guessing at some values, and the output capacitor value is highly suspect. The green bulk bypass capacitor and header are missing in the drawing ... I don't have time to do it all :) Somebody check it, please.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
--Steve
Propeller Tools