Using the CMU cam as a robotic eye (Has it been done?) — Parallax Forums

Using the CMU cam as a robotic eye (Has it been done?)

SciTech02 Posts: 154
edited 2006-06-15 23:45 in Robotics
I heard that the CMU cam navigates by watching the floor, and when a color difference is detected (an object) it will know when to turn. Can you make the CMU cam work like a human eye? I was thinking of putting a Ping sensor under it so it could "see" the object and know how far away it is. Could it navigate like that, or is that too advanced for the CMU cam? Just curious. -SciTech02

Comments

  • SciTech02 Posts: 154
    edited 2006-04-05 03:40
    What I meant to ask was: can I make the CMU cam navigate by really "seeing" an object, and react according to how it "saw" it, instead of watching the ground for a large color change? Also, do I need to put a rangefinder under it so it can tell how far away the object is? Is that needed, or not? Or am I getting this all wrong; does it already do what I'm asking? Like before, I'm just curious. -SciTech02

    Post Edited (SciTech02) : 4/5/2006 4:13:33 PM GMT
  • allanlane5 Posts: 3,815
    edited 2006-04-06 20:40
    I once read a quote from Edsger Dijkstra that said "Determining whether a computer can 'think' like a human is about as useful as determining if a submarine can swim like a fish".

    The point being, they're in different domains, with different purposes, but sometimes they accomplish the same goals. In the submarine, that goal is moving through the water. In the computer, that goal is reaching conclusions and reacting accordingly. This is all triggered by your phrase "really 'seeing' the object". The CMU-Cam can't "see" in the sense of drawing conclusions about what the pattern on its CCD camera is. "Watching the ground for a large color change" IS one way the CMU-Cam interprets its data.

    Also, there's lots of people on here with real problems, trying to develop real solutions. Asking a question because you're "just curious" is not likely to get a lot of responses.

    Now, the CMU-Cam is a brilliant concept. It was created by two guys at Carnegie Mellon University with the purpose of having a 'line-follower' that could 'look ahead'. It has an on-board SX processor, which runs fast enough that the CMU-Cam can pick out contrasty regions and send simple messages to a controlling computer about where those regions are. Given a particular color, it can even control a panning servo to keep a ball (or object) of that color centered in the camera.

    And I believe you can 'dump' what is in effect a low-res screen capture of what it is seeing -- but analyzing two consecutive frames is probably way beyond what a BS2 can do. Google CMU-Cam for more information.

    Oh, and with a SINGLE CMU-Cam, you'll probably ALSO need a distance-determining device. With TWO CMU-Cams, and a lot of processing horsepower, you MIGHT be able to use parallax to find distance, but it would be non-trivial.
  • SciTech02 Posts: 154
    edited 2006-04-07 02:19
    Okay, I see what you mean; I'm sorry. But I did some research and found out that what I meant to ask about was "machine vision". They make special "frame grabbers" to do that. It's expensive, but they use it in industrial robots all the time. Has anyone here done that? (Note it's not in development; they have been made. They've even created "soccer robots" that use it, and they sell the frame grabbers for $100.00.) Again, I'm sorry about the posts. -SciTech02.
  • allanlane5 Posts: 3,815
    edited 2006-04-07 12:37
    And I'm sorry I took that snarky tone. So we're all friends again.

    Yes, the CMU-Cam does have a 'frame-grabber' mode. You'd have to look at the docs for it to find its resolution -- I think it's rather low, which reduces the amount of processing you need to do to it. It may be more appropriate for a PC/laptop based solution though -- the BS2 is really limited in horsepower when it comes to that kind of processing. In the 'frame-grabber' mode, the CMU-Cam will output a serial stream of the pixels it most recently captured.

    The SX/28 and SX/48, however, definitely DO have the horsepower, if not the memory space.
  • SciTech02 Posts: 154
    edited 2006-04-07 18:23
    Cool. I did some research, and it says that almost all CMU cams do that. Can the CMU cam for the Boe-Bot do that (the CMU cam for the Boe-Bot was designed for it)? If it could, can I make it roam (go and avoid objects)? Because one of my goals in this hobby is making a robot actually "see" like we can, and navigate by doing that.
  • Tom Walker Posts: 509
    edited 2006-04-07 19:47
    You are wandering into a field populated by those with advanced degrees... :) It's a fun place to be, but it sounds like you need to do some more research to understand what some of your expectations really entail and what the hardware is "really" doing. Sight, as we humans use it, is a tricky thing to get a computer to do, virtually impossible to do with one camera, and certainly more than a lone Stamp can handle. Terms like "edge detection", "parallax" (no relation), "blob centering", and the like are important to understand when trying to imitate how we use the information we get from our eyes.

    There was a good series of articles in either SERVO or Nuts & Volts on image processing a short while back, using the power of a PC to do most of the "heavy lifting"... definitely worth a read to get a better feel for some of the basics as they are currently interpreted.

    That being said, it is possible to use something like the CMU cam to capture and "interpret" a limited amount of information. Keep in mind that it will generally only be "aware" of a 2-dimensional space. If you add an ultrasonic sensor, you can give your 'bot a crude idea of the 3rd dimension (at least of what it is looking directly at...most of the time...within certain parameters).

    The BOE-Bot CMUCam is identical to the "stock" CMUCam in all respects except that it communicates at TTL- instead of RS-232 levels, as far as I know.

    Please keep us informed...

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Truly Understand the Fundamentals and the Path will be so much easier...
  • SciTech02 Posts: 154
    edited 2006-04-07 20:18
    I did some more research; they sell all types of frame grabbers... just type it into Yahoo and you'll come up with all types of stuff. They use them in security systems and industrial robots all the time. The frame grabbers seem to be connected to your PC, which then sends a signal on how to respond, but I think they make them for smaller applications too. They also have something called a "smart camera", which has the frame grabber built into it.

    How come you would need more than one camera to do that?

    Also, I read up about ASIMO. He uses two camera eyes to navigate around his environment.

    And, I was on some site (I can't remember which) that was selling one that they said you can integrate into a robotic platform. It was a big site that sold all types of robot stuff. I think it was called (Something) Robotics. If anyone knows of this site, please show us.

    Here's a link about this field of robotics. It has links at the bottom to other related stuff, like the smart cameras I talked about.

    http://en.wikipedia.org/wiki/Machine_vision

    -SciTech02.
  • Whelzorn Posts: 256
    edited 2006-04-07 20:33
    Allanlane5: Did you seriously have a problem with him asking because he was curious? To be perfectly honest, half my posts are out of sheer curiosity. How do you know he wasn't curious because he wondered if this type of project was in his realm of experience? There could be any number of reasons why he was curious about it. There are plenty of replies for everyone, so I don't think someone with a real problem is going to lose out because one person asked a question that you do not believe is "worthy" of being asked.

    Anyway, SciTech: One thing you could do (if you have the allowed space and size) is have a webcam or similar feed the frames to a computer, which would then analyze them. It depends, I suppose, on how much detail you need out of the image, and how fast you need the frames analyzed. You may be able to use a handheld computer (or maybe even the Propeller when it comes out) if the frames are low-res enough.
    The Ping sensor under the camera is a perfectly valid idea. I am actually going to do something very similar to that on my next robot. The only thing you have to be aware of is that the Ping may have a greater cone of detection than the field of view of the camera, and the Ping will read the distance to the closest thing in its detection cone.
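The Ping-under-the-camera pairing can be sketched in a few lines. This is only an illustration: the frame width, field of view, and all function names here are assumptions, not the actual CMUcam or Ping))) interface.

```python
# Sketch: pairing a camera blob position with an ultrasonic range reading.
# FRAME_WIDTH and FOV_DEGREES are assumed values for illustration.

FRAME_WIDTH = 80          # assumed horizontal resolution in pixels
FOV_DEGREES = 48.0        # assumed horizontal field of view of the lens

def blob_bearing(blob_x):
    """Convert a blob's x pixel coordinate to a bearing in degrees:
    0 is straight ahead, negative is left, positive is right."""
    offset = blob_x - FRAME_WIDTH / 2
    return offset * (FOV_DEGREES / FRAME_WIDTH)

def describe_target(blob_x, ping_cm):
    """Combine the camera's bearing with the ultrasonic range.
    Caveat noted above: the Ping returns the nearest echo in its cone,
    which may not be the blob the camera is looking at."""
    return {"bearing_deg": blob_bearing(blob_x), "range_cm": ping_cm}

print(describe_target(40, 35))   # blob dead center, 35 cm away
```

The caveat in the comment is the real limitation: the pairing is only trustworthy when the blob is the nearest object inside the Ping's cone.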
  • Paul Baker Posts: 6,351
    edited 2006-04-07 20:34
    The problem with using an ultrasonic ranger for distance is that it typically returns the distance of the closest object, not necessarily the object of interest. To achieve a detailed distance map that could be overlaid on a still image would require a scanning LIDAR like the ones used by the Blue and Red teams for this year's DARPA autonomous vehicle challenge.

    When using vision-only systems you need to have two viewpoints to estimate distance, by calculating the parallax of a nearby object with respect to a distant object.
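The two-viewpoint idea reduces to a simple relation: for two parallel cameras separated by a baseline B, with focal length f expressed in pixels, an object whose image shifts d pixels between the two frames (the disparity) sits at depth Z = f * B / d. A minimal sketch, with made-up numbers rather than measured values:

```python
# Stereo-parallax depth relation: Z = f * B / d.
# focal_px, baseline_cm, and disparity_px below are illustrative numbers.

def depth_from_disparity(focal_px, baseline_cm, disparity_px):
    if disparity_px <= 0:
        return float("inf")   # no shift: object is effectively at infinity
    return focal_px * baseline_cm / disparity_px

# A nearby object shifts a lot, a distant one hardly at all:
print(depth_from_disparity(100.0, 6.0, 20))  # 30.0 cm
print(depth_from_disparity(100.0, 6.0, 2))   # 300.0 cm
```

Note how the zero-disparity case matches the point made later in the thread: a far-field object appears in the same place in both frames, so parallax tells you nothing about its distance.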

    The biggest issue is the whole AI aspect that arises from doing these things; it's a black-magic field with mixed results. If you can provide a very structured environment these systems can do quite well, but take them out of the structured environment and they can start behaving erratically. Almost all of the issues arise from deciding what's an object and what's not. Let's say I have a soccer ball that I'm holding over my head as though I were about to do a sideline throw-in. The bot would see me and the soccer ball, see that we are at the same distance, and conclude the soccer ball and I are one object. Now if I let the ball fall in front of me, the bot sees the motion of the ball, but not me; it must realize that what it thought was one object is actually two and act accordingly. Now when I pick the ball up and place it over my head, does the bot go back to thinking only one object is present, or does it still see two? These are the types of decisions that are easy for a human, but not so easy for a robot.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    6+6=10 (Long live the duodecimal system)

    Post Edited (Paul Baker) : 4/7/2006 8:42:39 PM GMT
  • SciTech02 Posts: 154
    edited 2006-04-07 20:46
    Interesting. Why not have it "lock on" to the soccer ball before you pick it up? Also, a soccer ball is a different color than a human, but I know what you mean. So, you need at least two cams, hmm. So you would send the images to a computer to process them; could you use a Pocket PC that is "on board" the robot to do that? -SciTech02.


    Post Edited (SciTech02) : 4/7/2006 8:53:21 PM GMT
  • Paul Baker Posts: 6,351
    edited 2006-04-07 20:59
    I'm not sure; it depends on the frame size, frame rate, and the exact algorithm used. The system I did a little research on edge-detected both frames, then did subframe correlation along the x axis.

    The edge detection simplifies the image: instead of worrying about the full gamut of colors, you are only concerned with the parts of the picture where there are relatively quick transitions of color. The resultant image is grayscale: black = no color edges, white = a very sharp color transition. The correlation compares the two edge-detected frames; if the two cameras are perfectly aligned (they won't be), an object in the very far distance (far field) will be in the exact same place in each frame.

    Subframe correlation takes a region of each frame and mathematically slides one image over the other, trying to find an alignment of the two frames; the output of the correlator peaks when the two frames are aligned. The index value corresponding to the peak is the disparity, which gives the distance of the object being examined.

    Object detection and recognition come into play when determining the boundaries of the subframe: since you want to find the distance of a single object, you want to choose a subframe which includes as large a portion of the object as possible while excluding areas not belonging to the object, because this increases the reliability of the distance measurement produced by the correlation method.

    There are other ways of reaching the same goal, but this is the one I understand.

    I omitted another step used within the system. Since it's nigh impossible to align two cameras so that they are pointing pixel-by-pixel in the same direction, it used a 2D correlation calibration step on a far-field image. This created a software alignment of the two cameras, where the frames were translated (moved x, y pixels) before doing any further operations on them, so they are now in perfect alignment (barring any optical aberrations, which are a whole other story).
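A toy one-dimensional version of the edge-detect-then-correlate scheme can make the idea concrete. The data and scoring below are illustrative assumptions, not the actual system Paul describes; a real implementation would work on full 2D frames.

```python
# Toy sketch of subframe correlation in one dimension: edge-detect each
# scanline, then slide the right frame's edges across the left frame's
# and take the best-matching offset as the disparity.

def edge_detect(row):
    """Absolute difference of neighboring pixels: large where the image
    changes quickly, zero in flat regions."""
    return [abs(b - a) for a, b in zip(row, row[1:])]

def best_disparity(left, right, max_shift):
    """Return the pixel shift of `right` that best lines up with `left`,
    scored by a simple dot-product correlation."""
    le, re = edge_detect(left), edge_detect(right)
    best, best_score = 0, None
    for shift in range(max_shift + 1):
        score = sum(le[i] * re[i - shift] for i in range(shift, len(le)))
        if best_score is None or score > best_score:
            best, best_score = shift, score
    return best

# The same edge pattern, shifted 3 pixels between the two "cameras":
left  = [0, 0, 0, 0, 0, 9, 9, 9, 0, 0, 0, 0]
right = [0, 0, 9, 9, 9, 0, 0, 0, 0, 0, 0, 0]
print(best_disparity(left, right, 5))   # 3
```

The correlation peak at shift 3 is the disparity; plugging it into the parallax relation then yields the object's distance.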

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    6+6=10 (Long live the duodecimal system)

    Post Edited (Paul Baker) : 4/7/2006 9:13:52 PM GMT
  • SciTech02 Posts: 154
    edited 2006-04-07 21:12
    Hmm, what you just explained was the parallax method. Also, like you said, if the object is far away the parallax method doesn't work, because it appears in exactly the same place in both frames. I like the idea of just black and white; scientists use it because it has more detail than color, so you could use that for better object detection/distance detection. Also, to align the cams, could you use a compass? But still, could I use a Pocket PC to process the images? -SciTech02.
  • Whelzorn Posts: 256
    edited 2006-04-07 21:15
    Certainly, as long as the camera you are using somehow interfaces with the Pocket PC (serial, USB, CF adapter, etc.).
  • Paul Baker Posts: 6,351
    edited 2006-04-07 21:20
    Alignment is a very complicated process depending on the exact means of mounting; that's why the software alignment is performed: you get them as close as mechanically possible, then use software to do the fine-tuned alignment. A black-and-white scene is one of those structured-environment issues I was referring to, and using black-and-white images reduces the complexity of the first stage by a factor of 3 (less information is conveyed in each frame). I cannot fully answer the Pocket PC question; the technical answer is yes, it can, but how responsive the system will be will depend on the size of the frame, the number of objects tracked in the frames, and the exact model of handheld computer you will be using. It may take a second or even much longer to process a single frame, and this may (or may not) be an acceptable latency.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    6+6=10 (Long live the duodecimal system)
  • SciTech02 Posts: 154
    edited 2006-04-08 01:15
    I agree with the software alignment. Now I have a question: do they make image processors for Pocket PCs? I'm not familiar with Pocket PCs and their capabilities, and all the image processors I've seen look like they connect to your computer.

    Also, all of these processors have an Ethernet connection; is that important?

    One more thing: the two cams, how far apart should they be? Should they be right next to each other, or two inches apart? -SciTech02.
  • SciTech02 Posts: 154
    edited 2006-04-08 01:38
    Here are some links to some frame grabbers. Do you think they would work?

    http://www.pixelsmart.com/ps512-8.html

    Or this one...

    http://www.datatranslation.com/products_hardware/prod_dt3120.htm

    -SciTech02.


    Post Edited (SciTech02) : 4/8/2006 1:43:48 AM GMT
  • Paul Baker Posts: 6,351
    edited 2006-04-08 17:19
    Neither will work, because they are cards that have to be plugged into a bus, something a Pocket PC doesn't have. I'm starting to have serious doubts about whether a Pocket PC can do it, and here's why: to do this method you'll need two cameras, and even if you can find two cameras, the likelihood of being able to connect both to the Pocket PC is quite small, since the Pocket PC will only have a single expansion slot. The only way I can think of is to get two Ethernet cameras, but in order to get the Pocket PC to connect to both, you'll need an Ethernet hub, and these are typically plugged into the wall. You may be able to design an unpowered hub yourself, but there is another issue: Pocket PCs run Windows CE. To my knowledge you cannot run ordinary Windows programs on Windows CE. This means you would have to get a Windows CE development package, write your own device drivers to access the cameras, and write the software to grab frames and process them. So it could be done, but it is far from trivial.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    1+1=10
  • SciTech02 Posts: 154
    edited 2006-04-10 03:55
    Thank you, that's what I needed to know. Hmm, if I can't use a Pocket PC, what should I use? I saw on a website that a guy sends the serial data from the cam(s) through an RF transmitter to his PC, which then processes the image(s), sends them back, and so on.

    Now, here's a recap on how to navigate with CMU cam(s); am I right?

    1: Digitize the image(s), and send them to a processor.

    2: Process the image(s), and use the parallax method to find the distance of an object.

    3: Send what was detected to the Stamp (BS2), and act on what it "saw".

    I know how to do steps 1 and 2, but I'm not too sure about step 3.
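Step 3 is mostly a matter of protocol: the processing side reduces its conclusions to tiny tokenized commands that a BS2 SERIN statement can read over the serial/RF link. The one-byte command codes and the avoidance policy below are hypothetical, just a sketch of the shape such a link could take.

```python
# Sketch of step 3: the PC reduces its vision results to one-byte
# commands for the BS2. Codes and policy are made up for illustration.

CMD_FORWARD, CMD_LEFT, CMD_RIGHT, CMD_STOP = b"F", b"L", b"R", b"S"

def choose_command(obstacle_bearing_deg, obstacle_range_cm):
    """Trivial avoidance policy: steer away from a near obstacle,
    otherwise keep going."""
    if obstacle_range_cm > 50:
        return CMD_FORWARD            # nothing close: drive on
    if obstacle_bearing_deg < 0:
        return CMD_RIGHT              # obstacle on the left: turn right
    return CMD_LEFT                   # obstacle on the right: turn left

# The PC would write the returned byte to the RF/serial link, e.g.:
#   serial_port.write(choose_command(bearing, range_cm))
print(choose_command(-10.0, 30.0))   # b'R'
```

On the robot side, the Stamp only has to match each received byte against the known codes and drive the servos accordingly, which keeps the link and the BS2 program simple.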

    Thank you for your help so far. -SciTech02.
  • Paul Baker Posts: 6,351
    edited 2006-04-12 17:23
    That's the aspect glossed over thus far; that's where the AI part of the equation lies. It all depends on what your end goal is.

    Perhaps you could use some wireless cameras and a couple of TV cards that have the ability to snapshot the input via a program (i.e. a card with a development library; I don't know offhand which, if any, do).

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    1+1=10

    Post Edited (Paul Baker) : 4/12/2006 5:26:29 PM GMT
  • SciTech02 Posts: 154
    edited 2006-04-13 18:55
    Well, I know what my goal is. That is, to detect objects and avoid them. I know that's a better job for IR and ultrasonic sensors, but to have a robot actually "see" its world would be great.

    Here's some more detail on how ASIMO navigates with two cams (Sadly it's not a lot)

    Using the visual information captured by the camera(s) mounted in its head, ASIMO can detect the movements of multiple objects, assessing distance and direction. Common applications this feature would serve include the ability to follow the movements of people with its camera, to follow a person, or greet a person when he or she approaches.

    ASIMO can recognize the objects and terrain of his environment and act in a way that is safe for both himself and nearby humans. For example, recognizing potential hazards such as stairs, and by stopping and starting to avoid hitting humans or other moving objects.

    Maybe I could find a way to put a bus on my robot (is that even possible?). Or maybe just go back to the idea of putting an ultrasonic sensor under the cam (but the cone of capture is larger than the image itself). I could use a laser rangefinder (but that's expensive). I read about a DIY rangefinder where you shoot a laser pointer at the object, and depending on how big the dot is on a camera, you can find out how far away the object is (but having a laser pointer on all the time isn't good, and it could hurt someone).

    You have helped me a lot so far, thank you for that.··· -SciTech02.
  • Paul Baker Posts: 6,351
    edited 2006-04-15 15:56
    That's the one benefit of using cameras: when used stereoscopically they can detect depth, and by comparing two (time) frames you can detect motion. I don't think employing a bus on your bot is a good direction; it opens a whole can of worms with interfacing, software drivers, software, etc. I would just use two wireless cameras tuned to separate channels and beam the images to two receivers connected to two TV cards in a PC. Have the PC do the depth and motion detection, and have it also do the AI stuff; then it sends action commands back to the bot. This way you could use a Stamp on the bot, because it's only controlling movement instead of trying to do any of the complex processing. This will also cheapen the required RF link between the PC and bot, since simple tokenized commands are all that's transmitted across the channel. Also, this arrangement should reduce the cost of your upgrade path if you find the processing power insufficient. Not including the cost of the PC and bot, it should be about $400 for the equipment (2 RF cameras, RF module, 2 TV tuner cards). It may be possible to find a cheaper avenue, but the tradeoff is an exponential degree of difficulty.

    One thing about this direction I didn't mention is that wireless cameras introduce noise into the picture (unless you buy really expensive units), so there would have to be a "de-noising" stage added to the processing. Though perhaps if you get Wi-Fi Ethernet cameras, the error detection used for 802.11b would de-noise the transmission for you (this would also remove the TV tuner requirement).

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    1+1=10

    Post Edited (Paul Baker) : 4/15/2006 4:00:45 PM GMT
  • Robert25 Posts: 53
    edited 2006-04-15 16:13
    Just a quick question along these lines (it's actually in a different forum thread)... but... what 802.11 Wi-Fi system would you suggest to link the Stamp on the bot with the PC? This is the next stage in my exploration, and I would like to do something similar to what you describe... i.e., let the PC do much of the higher processing and return simple commands to the Stamp. I am not that familiar with the Wi-Fi protocols, although my home has Wi-Fi through a router that wirelessly links two other computers to the internet.

    There does seem to be a lot of interest in this... I've seen Wi-Fi discussions as well as RF links... I think the Wi-Fi approach would be better for most, since many people already have it running in their homes.
  • Paul Baker Posts: 6,351
    edited 2006-04-15 18:03
    I have experience with RF wireless cameras but not Wi-Fi cameras. After looking for a little while at customer reviews, this one seemed to be liked by the two reviewers: http://www.newegg.com/Product/Product.asp?Item=N82E16830137002

    Finding professional reviewers doing side-by-side comparisons of wireless cameras is harder than finding the proverbial needle in a haystack. Just be sure to buy from a company with a liberal return policy (that's why I linked to Newegg).


    You should look for a camera which supports TCP/IP or UDP connections, since they will provide the lowest latency of acquiring the image.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    1+1=10
  • SciTech02 Posts: 154
    edited 2006-04-25 19:39
    Update: I've been asking around at the Propeller chip forum, and they said that it can process images. So my plan is set: I'll use two cams, send the serial data to the Propeller chip for processing, and deliver the pulses. It's great because I can process the images and control the rest of my robot on the same chip. How does that sound?

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    There is always an answer.

    There is always a way.
    There is always a reason.··· -SciTech02.
  • Paul Baker Posts: 6,351
    edited 2006-04-25 20:32
    Sounds good, good luck.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    1+1=10
  • Tom Walker Posts: 509
    edited 2006-04-25 20:49
    It's a great idea, but I think that there's a whole lot more going on than you realize...

    Break the project down into smaller sub-goals and achieve them individually...then integrate them into the whole. I only recommend this to keep you from dropping the project if you try to do it as one piece and realize that each of the little pieces can be a challenge unto itself.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Truly Understand the Fundamentals and the Path will be so much easier...
  • Paul Baker Posts: 6,351
    edited 2006-04-25 20:53
    The nice thing about the Propeller is that it naturally lends itself to compartmentalization of your project: you can first worry about getting the image data and where to stick it, then later develop other cogs that work on processing the data.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    1+1=10
  • SciTech02 Posts: 154
    edited 2006-04-26 21:04
    Exactly. I could have one cog get and position the pictures, and up to six others could do the processing (the last one would be for servo and robot control).

    Also, I am breaking the project down into smaller ones. First I have to learn how to program the cogs to do what I want them to do, connect the cams to it, test whether it processed the images, and so on. Thank you all for your help. :)

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    There is always an answer.

    There is always a way.
    There is always a reason.··· -SciTech02.

    Post Edited (SciTech02) : 5/1/2006 9:26:12 PM GMT
  • bbienvenue Posts: 2
    edited 2006-05-01 15:14
    Greetings,

    Let me delurk a bit before making my response. I'm a new member of the group and a newbie to robotics, but I am a graduate student in cognitive psychology and cognitive science, and I have some theoretical background in the link between what computers/robots do and what humans do. My focus is in human language comprehension, but I have some knowledge of vision too.

    SciTech02 wrote:
    [snip] "Can you make the CMU cam work like a human eye?" [snip]

    Meaning something like: would it be possible to use one (or two) CMU cams to make a robot see the world "like a human" and then do navigation based on this vision, "like a human"?

    The issue I have is with the phrase "like a human". This makes the assumption that we actually have a good grasp of how a human does navigation, and that it is a reasonable way to make robots navigate. This is a common assumption even among people with PhDs who do machine vision, but it isn't really right.

    OK, some background on cognitive science & cognitive psychology and why this assumption is a problem.
    Back in the dark ages people invented serial computers, and then Turing, Newell, Simon, and a bunch of others came along and said: hey, we can make them think, and in fact maybe human cognition is really equivalent to artificial intelligence in the end. The big assumption here was that human cognition was symbol manipulation: data is represented as a set of abstract symbols that are manipulated via syntactic rules. This is an important assumption for early AI, because this is what computers can do really well. So if cognition works like this, then we can do AI.

    AI worked on the assumption that sensing the world was pretty easy and entailed mapping the "stuff in the world" onto a set of symbols in the brain. All the hard stuff was the symbol manipulation.

    In the 1980s David Marr wrote a book entitled _Vision_ that epitomizes the old-style AI approach to machine vision and to how human vision might work. We start with a bunch of pixels in the retina. Using some simple algorithms (that are hardwired into the visual system in humans) we can turn the pixels into lines and corners, and from there into surfaces and objects. Once we see the objects, we build a map of the world (what Marr calls a 2.5D representation of the world) and reason about it to do things like walk without tripping on the coffee table and catch baseballs. If you are interested in machine vision and how it links to human vision, this is really worth reading (although a difficult read).

    But we have since realized this view is wrong in some important ways. Progress on symbol-manipulation systems moved really rapidly (chess-playing computers better than humans, mathematical theorem provers); progress on the sensory input from the world really stank (speech recognition that is still pretty bad, machine vision that is still pretty bad, etc.). The hard vs. easy problems were reversed. An important part of human cognition really seems to be the sensory link to the world. Getting the data into a useful form is part of the cognition we do. This view is called embodied cognition, the idea being that part of doing human cognition is having human links to the world. For a good book on this see _Mindware_, by Andy Clark, a moderately difficult read.

    So embodied cognition has a new set of assumptions to consider. Some of the relevant ones are:
      1) the human brain isn't a serial computer; it is a parallel and distributed system
      2) the human brain works via lots of small systems that don't need to know what other systems are up to
      3) object recognition is not the same as navigation

    There is evidence to support all of this, and for our purposes the last point in particular. Once in the brain (the primary visual cortex), visual information splits into two very separate pathways: the dorsal pathway and the ventral pathway. Patients with brain damage (and surgically altered monkeys) allow us to see that these pathways are distinct and serve different functions. The ventral pathway is commonly called the "what" stream. Its job is to tell you what you are looking at (i.e., do object recognition, what machine vision wants to do). The dorsal pathway is the "where" stream and tells you that something is moving (or that you are moving towards something). Patients with damage to the ventral stream cannot see objects in the normal sense. If you were to hold up a baseball they could not tell you what it was, and in fact their perception is of being totally blind. But if you were to toss a baseball at them, they would duck! So the dorsal stream can do its thing _without_ object recognition: there's something flying at you, I don't know what it is, but duck! If you've seen Jurassic Park, recall when the kids are hiding from the T-Rex. They stand still and the T-Rex can't see them. The T-Rex (and other evolutionarily old visual systems) is like the dorsal path: it doesn't do object recognition, just motion perception. If you are a T-Rex, you bite anything that moves, because it will be food; you don't really care if it's humans or chimps or other dinosaurs. Later on, object recognition evolved as a related, but distinct, visual system.

    So seeing "like a human" isn't one type of seeing. Seeing like a human is task dependent. Are you picking an M&M to eat based on color or are you catching a baseball?

    There's more to this issue, and there are lots of people with more knowledge than me about how to get a robot to do dorsal- or ventral-style processing, but I just want you to realize that the "like a human" way of doing things isn't what most people think it is. Feel free to ask (in the forum or off-line) for more information on the human parts of this equation.

    Thanks for reading.

    Breton Bienvenue
  • SciTech02 Posts: 154
    edited 2006-05-01 21:25
    Wow, that's very interesting. Very well laid out. Now I know why the T-Rex couldn't see them!

    But I think the idea just discussed would work: treating each pixel (or group of pixels) as a sensor, then getting the distance by using the parallax method. With the X, Y, and Z axes covered, it could map out an image and navigate on a higher level by actually "seeing" its environment and reacting to it.

    Thank you for the information.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    There is always an answer.

    There is always a way.
    There is always a reason.··· -SciTech02.