Prop Vision Outline Recognition
Humanoido
Posts: 5,770
I need to have a prop see a rough image (rectangle, circle, square, or irregular shape) and compare the outline with that of a simple stored outline and get a percentage of match.
Ideally it would be a simple vision recognition program and operate in principle much like the prop's speech recognition program, i.e. store around 3 to 5 shapes for recognition.
Has anyone developed this, and if not, what resources are available for the prop to accomplish it?
Comments
I think this is a great application for Hanno's machine vision method.
I personally think I took the easier way with a breakout board for the ADC chip. I know you've seen the project I'm working on. I'm not sure what kind of filters I'll use with my system yet. I know I'll at least want to be able to find the brightest pixels in the image. I might try some shape recognition too.
I don't remember if Hanno talks about using filters (software algorithms) in the tutorial I linked to above or if the filter information is just in the book The Official Guide ...
Hanno's method is limited to black and white images. I know several forum members are working on color camera projects. You could also use color filters (plastic film) over the black and white camera's lens to gain color information. (The first color pictures from the moon used three different color filters with a black and white camera.)
Duane
1. Locate the object in the camera image and compute the centroid of its outline.
2. For each of, say, 64 equally-spaced angles spanning a full 360 degrees, measure and record the distance from the centroid to the first edge encountered at the selected angle. This array of distances is the object's signature.
3. To compare two objects for a match, slide their signatures against each other, one step at a time, wrapping to the beginning when you go past the end. This forms a set of number pairs against which a Pearson correlation coefficient can be computed. Do this for each of the 64 positions in the array and record the largest correlation value. This becomes the degree of match for the pair.
4. Perform step 3 between the unknown and all the template signatures, and pick the one with the highest degree of match, assuming it's high enough.
By using this technique, you should be able to identify shapes that are geometrically similar to your stored examples, regardless of their scale or rotation.
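
For concreteness, here is a rough C sketch of steps 3 and 4 (untested and purely illustrative; it assumes the 64-entry distance signatures from step 2 have already been extracted):

    #include <math.h>

    #define N 64   /* angles per signature, per step 2 */

    /* Pearson correlation between signature a and signature b rotated by
       'shift' steps, wrapping past the end as described in step 3. */
    static double correlate(const double a[N], const double b[N], int shift)
    {
        double sa = 0, sb = 0, saa = 0, sbb = 0, sab = 0;
        for (int i = 0; i < N; i++) {
            double x = a[i];
            double y = b[(i + shift) % N];
            sa += x;  sb += y;
            saa += x * x;  sbb += y * y;  sab += x * y;
        }
        double cov = sab - sa * sb / N;
        double var = (saa - sa * sa / N) * (sbb - sb * sb / N);
        return (var > 0.0) ? cov / sqrt(var) : 0.0;
    }

    /* Step 3: slide one signature against the other and keep the largest
       correlation; this is the degree of match for the pair. */
    double degree_of_match(const double a[N], const double b[N])
    {
        double best = -1.0;
        for (int shift = 0; shift < N; shift++) {
            double r = correlate(a, b, shift);
            if (r > best) best = r;
        }
        return best;
    }

    /* Step 4: compare the unknown against all templates and pick the best,
       provided it clears a minimum threshold (e.g. 0.9). Returns the index
       of the winning template, or -1 if nothing matched well enough. */
    int classify(const double unknown[N], const double templates[][N],
                 int ntemplates, double threshold)
    {
        int best_idx = -1;
        double best = threshold;
        for (int t = 0; t < ntemplates; t++) {
            double m = degree_of_match(unknown, templates[t]);
            if (m > best) { best = m; best_idx = t; }
        }
        return best_idx;
    }

The circular shift supplies the rotation invariance, and because the Pearson coefficient is unchanged by a uniform scaling of the distances, the same routine handles objects of different sizes.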
-Phil
In the project thread I linked to in my earlier post, I have the word "HI" (as seen by the video camera) displayed on my LED array. It is much easier to tell what the image is with some movement. I wonder if there is a way to adapt the algorithm Phil mentions to use multiple images over time. The extra time dimension would make the program harder to write, but I wonder if adding multiple frames would help a machine see better, the same way multiple frames help us humans see better.
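
One minimal way to fold in the time dimension might be to average the single-frame match scores over several consecutive frames, so a momentary bad frame doesn't sink the result. A purely illustrative C sketch, reusing the degree_of_match() routine from Phil's post above:

    #define N 64   /* signature length, as in the earlier sketch */

    double degree_of_match(const double a[N], const double b[N]);  /* see above */

    /* Average the per-frame match scores across nframes signatures
       extracted from consecutive video frames. */
    double match_over_frames(const double frames[][N], int nframes,
                             const double template_sig[N])
    {
        double sum = 0.0;
        for (int f = 0; f < nframes; f++)
            sum += degree_of_match(frames[f], template_sig);
        return sum / nframes;
    }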
Duane
Phil, this could supplement speech synthesis and speech recognition. Simple object recognition would be so useful on the Propeller chip. How would you recommend getting the image scan?
I wish there were a low-cost, common item that could be used for image input to the prop. I thought about using common photodetectors, but in larger numbers they take up too much physical space. Some people have used LEDs as detectors, but their light sensitivity falls off.
It would be spectacular to have the next Propeller Demo Board include a small speaker for speech, a mic for speech input, and a small image sensor/lens for object recognition.
Back in the day I was doing this on a Z80, and because it is a neural network, it lends itself very well to parallel processing. I've written neocognitron software in Visual Basic and even in an Excel spreadsheet. I reckon it would do well on a multi-prop network.
Check out the paper at this link (takes a while to load):
http://ro.uow.edu.au/cgi/viewcontent.cgi?article=1575&context=engpapers
(I happen to be working on an SSD1928-based camera solution.)
Ever seen a Charles Schwab commercial where the actors appear partially animated? I was able to automatically generate the effect in a small way using a frame grabber and statistics as described. Processing speed was a limiting issue.
Do you have an example 'image' of what your robot might be looking at? That might help in deciding on the right filter. An idea I have in mind achieves binocular vision with a single camera.
http://www.youtube.com/watch?v=yLzXThxZStE&feature=player_embedded
The robot is the Big Brain and it would look at itself. With this post
http://forums.parallax.com/showthread.php?124495-Fill-the-Big-Brain&p=1002718&viewfull=1#post1002718
it was concluded that the Big Brain could be made self-aware by viewing and recognizing itself in a mirror.
[Image: Silhouette of the Big Brain, showing the basic rectangular shape used for recognition]
Because the Big Brain is in the shape of a tall, narrow rectangle, I think recognizing this basic geometrical shape is a priority for self-recognition. Simple monocular vision at 1X would be sufficient, or a shorter focal-length lens would work too.
I'm still looking for a very simple, low-resolution way to get an image into the prop, maybe 64 or 128 photosites. It would help if the home-made camera could be built from common parts, like the small phototransistors offered by Parallax, so students could duplicate the camera and do vision recognition too.
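
For what it's worth, the scan loop for such a camera could be tiny. Below is a hypothetical C sketch for an 8x8 phototransistor array (64 photosites) multiplexed one row at a time; select_row() and adc_read() are placeholders for whatever row-driver and ADC circuitry is actually built:

    #define ROWS 8
    #define COLS 8

    /* Placeholders for the actual hardware interface. */
    extern void select_row(int row);          /* energize one row of sensors */
    extern unsigned char adc_read(int col);   /* sample one column, 0..255   */

    /* Read all 64 photosites into a frame buffer, one row at a time. */
    void grab_frame(unsigned char frame[ROWS][COLS])
    {
        for (int r = 0; r < ROWS; r++) {
            select_row(r);
            for (int c = 0; c < COLS; c++)
                frame[r][c] = adc_read(c);
        }
    }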
You're right: Hanno's work is really fantastic and serves as a remarkable model of what one can accomplish. What I'm thinking about is open-source Spin programming, a system that students can understand, and a home-made camera built from simple components, so anyone can make their own vision-recognition prop system.
I think you trivialize "self-recognition." How, for example, would your Big Brain tell the difference between itself and other big brains of the same genre? It's not as if the image of a tall rectangle on the Big Brain's retina will suddenly instill consciousness, as with apes touching the monolith in 2001. That only happens in Hollywood. Computers are amazing, but they aren't magic.
-Phil
http://www.dealextreme.com/p/ntsc-mini-surveillance-av-camera-628x582px-6019
You don't need an SD card...
You are right, I do make it trivial. If it's made ultimately simple, it could have the problems you mention: mistaking the 2001: A Space Odyssey monolith for itself (if the monolith has exactly the same dimensions as the Big Brain, or if some other rectangle presents the same angular proportions).
But for something ultimately simple, I mean really, how often does that happen? It's not like we go out every day to discover the monolith sticking up in our front yard, or take our Big Brain to see the classic movies (ok, maybe the latter but only if the Brain requests it..) so I think we could get by with a trivialized system for demonstrative purposes.
Maybe the Big Brain will have something unique sticking out on the side for greater self recognition (just as some human may have a really big obtuse nose sticking out farther).
As you once said in reference to machine intelligence and the AI community, it is important to start at the basic level and develop the fundamentals rather than jumping in at the top right off the bat. The system could always be improved in the future.
Computers and software are definitely not magic but some people on this forum would have to agree that the results of some of your Propeller programs seem magical.
http://forums.parallax.com/showthread.php?98516-stupid-video-capture
and this
http://www.dealextreme.com/p/ntsc-mini-surveillance-av-camera-628x582px-6019
and some mods..
best regards
Stefan
I have been working on a new version with the GreyTV driver and keyboard control instead of IR.
I'm also doing some research on line/edge extraction algorithms.
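
For anyone following along, the Sobel operator is a common starting point for edge extraction. Here is a minimal, illustrative C sketch over an 8-bit grayscale buffer (the 128x96 frame size is only an example; border pixels are left untouched):

    #include <stdlib.h>

    #define W 128
    #define H 96

    /* Approximate the gradient magnitude at each interior pixel using the
       3x3 Sobel kernels; |gx| + |gy| avoids a square root. */
    void sobel(const unsigned char in[H][W], unsigned char out[H][W])
    {
        for (int y = 1; y < H - 1; y++) {
            for (int x = 1; x < W - 1; x++) {
                int gx = -in[y-1][x-1] + in[y-1][x+1]
                         - 2*in[y][x-1] + 2*in[y][x+1]
                         - in[y+1][x-1] + in[y+1][x+1];
                int gy = -in[y-1][x-1] - 2*in[y-1][x] - in[y-1][x+1]
                         + in[y+1][x-1] + 2*in[y+1][x] + in[y+1][x+1];
                int mag = abs(gx) + abs(gy);
                out[y][x] = (mag > 255) ? 255 : (unsigned char)mag;
            }
        }
    }

Thresholding the output gives a rough outline that could feed a signature-matching scheme like the one Phil described earlier.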
Perry
Computer vision typically requires lots of memory and processing power; for humans it requires quite a bit of our brain. It's therefore not a good fit for the Propeller...
So, start with simple things: by finding the brightest spot you can track a laser pointer, or you can look for an artificial pattern that sticks out from the environment. Reliably detecting arbitrary objects is very hard. I'm close to finishing some work on a serial camera, the C329, where I chose to slowly download the raw RGB pixels from the camera, allowing me to compute while the data streams in at ~1 fps. If you don't really need a fully embedded system, I highly recommend looking at the integrated OpenCV functionality of ViewPort. Lots of really cool vision research is performed with OpenCV, and that functionality is much more compelling than just finding the brightest spot. Audiences are always wowed by my ViewPort "face finder" demo, which steers 2 servos using the Propeller.
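
To show how little code the "brightest spot" idea needs, here is a minimal C sketch that scans a grayscale frame for the peak pixel (the frame size is an arbitrary example):

    #define W 128
    #define H 96

    /* Scan the frame for the brightest pixel and report its coordinates;
       with a laser pointer in view, this is usually the dot. */
    void find_brightest(const unsigned char frame[H][W], int *bx, int *by)
    {
        unsigned char best = 0;
        *bx = 0;
        *by = 0;
        for (int y = 0; y < H; y++) {
            for (int x = 0; x < W; x++) {
                if (frame[y][x] > best) {
                    best = frame[y][x];
                    *bx = x;
                    *by = y;
                }
            }
        }
    }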
Hanno