I like seeing the engineering solutions to vision problems. They tend to be (as most engineering solutions are) well designed.
In contrast, biological systems are always ad-hoc and kludgy, because evolution doesn't have a goal in mind at the outset. There wasn't a plan to do full-color binocular object recognition and then make a system to do it. A few light-sensitive neurons appeared and then they were able to serve a more and more complex purpose after the fact via small changes. Hence the backwards orientation of the retina and a few other bugs.
It's akin to the difference between having a project in mind and then going shopping for parts versus looking in the parts bin under the workbench and seeing what you can make from what's there. And since this is a robotics forum (not even BEAM-style at that), not a biopsych forum, design away!
Another vision note: humans don't need to use binocular disparity/convergence to do depth perception. We can, and it's really good for close stuff (which is what you're aiming for in navigation for your bot), but at largish distances there's not much parallax, so the system isn't as accurate; it can say "far away" but not how far. So we have additional ways to do things (another sign of biological systems: lots of redundancy).
As a test, patch one eye for a day and go about your life (don't drive, just to be safe) and you'll see you do really well. It isn't perfect; for example, you cannot hold two pencils and touch them eraser to eraser one-eyed, but you won't trip on the sofa and you can pour your morning coffee. Humans have lots of cognitive tricks (i.e., stuff that's done via computation/software) that let them do depth perception one-eyed (as do animals like birds that give up binocular depth for a wide field of view by putting eyes on opposite sides of the head). I've got a page from a course I taught with examples of some of these monocular depth cues at:
http://www.acsu.buffalo.edu/~bmb/Courses/Old-Courses/PSY341-Fa2004/Exercises/Depth-cues/depth-cues.html
In theory, an alternative approach for object-recognition-based navigation would be to do things that way. You'd be trading off a bunch of hardware (the second cam and its servos) for a bunch of software, and the software overhead is huge, requiring some AI and major preprogrammed background knowledge for the system; it hasn't been done outside of some really controlled settings.
OK, back to my dissertation writing.
Breton
Actually, what I really had in mind (but couldn't word) was an "object recognition" type of navigation. Instead of just detecting an object, it could actually "see" what the object is. So it could be a wall, ball, floor, or a human, and it could recognize what it is and navigate on that. It could also recognize particular environments, an alternative to traditional mapping. Unfortunately, that might be beyond Stamp-based processing, but I don't know for sure. If it is, there are other ways to do the processing.
That's what I wanted it to do. Can the CMUcam recognize objects?
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
There is always an answer.
There is always a way.
There is always a reason. -SciTech02.
The CMUcam possesses a lot less functionality than it appears you are giving it credit for. It can only tell you a few things about what it "sees". It can tell you the color of a certain area, what the bounds are for a color "blob", and some other simple "image information". Beyond that, it is up to your processor of choice to determine what to do with that information. The only "response" native to the CMUcam is its ability to send servo pulses to try to keep a certain color "blob" centered in its viewspace.
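For what it's worth, the CMUcam1 talks to the host over plain ASCII serial: you send it a short command (e.g. a TC "track colour" command) and it streams back tracking packets. Going from memory of the CMUcam1 manual (so treat the exact packet layout as an assumption and check the real docs), the middle-mass "M" packet carries the blob centroid, bounding box, pixel count, and confidence, and a host-side parser is only a few lines of C:

    #include <stdio.h>

    /* One CMUcam1 tracking packet (middle-mass "M" packet).
       Field layout recalled from the CMUcam1 manual; verify before relying on it. */
    typedef struct {
        int mx, my;          /* centroid of the tracked colour blob          */
        int x1, y1, x2, y2;  /* bounding-box corners                         */
        int pixels;          /* number of tracked pixels inside the box      */
        int confidence;      /* how solidly the blob fills the box           */
    } cmucam_blob_t;

    /* Parse one ASCII "M ..." line as streamed after a track-colour command.
       Returns 1 on success, 0 if the line isn't a well-formed M packet.     */
    int cmucam_parse_m(const char *line, cmucam_blob_t *b)
    {
        return sscanf(line, "M %d %d %d %d %d %d %d %d",
                      &b->mx, &b->my, &b->x1, &b->y1, &b->x2, &b->y2,
                      &b->pixels, &b->confidence) == 8;
    }

    int main(void)
    {
        cmucam_blob_t blob;
        /* example line only, not captured from a real camera */
        if (cmucam_parse_m("M 45 60 30 50 60 70 23 40", &blob))
            printf("blob at (%d,%d), confidence %d\n", blob.mx, blob.my, blob.confidence);
        return 0;
    }

Everything past that (deciding what the blob means) is up to the host processor, which is exactly the point above.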
As has been pointed out, it takes a LOT of computing to turn image DATA into INFORMATION that can be acted on.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Truly Understand the Fundamentals and the Path will be so much easier...
I see. If the Propeller isn't strong enough (which unfortunately might be true), I was planning on using an on-board PC for the processing. That should be strong enough, but it would be complicated to do. It doesn't matter, I wasn't going to do this any time soon; just looking at the options on how to do it.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
There is always an answer.
There is always a way.
There is always a reason. -SciTech02.
I think it would be a matter of what sort of frame rate you could handle. For the Propeller it would really help to keep everything in on-chip memory. The Propeller is probably fast enough to handle things on the fly; you just need to buffer 3 scan lines or so and digest all that into an edge map. Using a relatively low-res b/w camera would help. But I don't know how to do the more advanced things you want to do, so I couldn't estimate that. Still, I think the Propeller might be useful in transforming the raw video into an edge map (or doing other image processing) that could be used by a second processor for further analysis.
Phar has a point. If you switch around the processing blocks you could reduce the memory needed to store the images. Placing the edge-detection algorithm first would help, because you can downsample the 8-bit input into a 2-4 bit/pixel edge map. If you construct it so that one cog fills a memory buffer one scan line at a time, you can have another cog process the scan lines into the edge map. The number of scan lines required for your buffer will be determined by how fast the two cogs work with respect to each other.
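To make that handoff concrete, here's a rough sketch of the scheme in C (standing in for two cogs sharing hub RAM; the counters and memcpy are just an illustration, not actual Spin/PASM): one side drops digitized scan lines into a small ring of buffers and bumps a counter, while the edge side trails behind it.

    #include <stdint.h>
    #include <string.h>

    #define LINE_W   320   /* pixels per 8-bit scan line                   */
    #define N_LINES    4   /* 3 lines for the edge cog + 1 being filled    */

    static uint8_t  ring[N_LINES][LINE_W];   /* shared "hub RAM" buffer    */
    static volatile uint32_t lines_in;       /* bumped by the capture side */
    static volatile uint32_t lines_out;      /* bumped by the edge side    */

    /* Capture side: store one digitized scan line.
       Returns 0 if the ring is full (the edge side has fallen behind).    */
    int capture_put(const uint8_t *line)
    {
        if (lines_in - lines_out >= N_LINES)
            return 0;                         /* would overwrite unread data */
        memcpy(ring[lines_in % N_LINES], line, LINE_W);
        lines_in++;                           /* publish the new line        */
        return 1;
    }

    /* Edge side: borrow the oldest unread line, or get NULL if none yet.  */
    const uint8_t *edge_get(void)
    {
        return (lines_out == lines_in) ? 0 : ring[lines_out % N_LINES];
    }

    void edge_done(void) { lines_out++; }     /* release the line            */

How deep the ring needs to be is exactly Paul's point: it depends on how far the edge cog can fall behind the capture cog before catching back up.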
Let's play with some numbers: if you use two cameras, each with 320x240 resolution, and you use a 4-scan-line buffer (3 for the processing cog, and 1 to be filled by the input cog), that's 1,280 bytes per input buffer, or 2,560 bytes for both. If your edge map is 2 bits/pixel (76,800 pixels), you'll require 19,200 bytes per image, or 38,400 bytes for both. OK, so there's a problem: we've run out of memory.
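As a quick sanity check on those totals (320x240 = 76,800 pixels per camera; the Propeller has 32 KB of hub RAM):

    #include <stdio.h>

    int main(void)
    {
        const int line_bytes = 320;               /* one 8-bit scan line          */
        const int in_buf     = 4 * line_bytes;    /* 4-line input buffer, 1 cam   */
        const int edge_map   = 320 * 240 * 2 / 8; /* 2-bit/pixel edge map, 1 cam  */
        const int hub_ram    = 32 * 1024;         /* Propeller hub RAM            */

        int total = 2 * in_buf + 2 * edge_map;    /* two cameras                  */
        printf("buffers %d B + edge maps %d B = %d B of %d B available\n",
               2 * in_buf, 2 * edge_map, total, hub_ram);
        return 0;
    }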
OK, so let's downsample the picture; say we cut the image resolution in half. We don't want to just pick out every other pixel, because an edge might be missed, and averaging the pixels will blur edges, making them harder to detect. So we employ a min/max downsampling scheme. As a scan line is read in, each group of 2 pixels is min/maxed (i.e., the minimum value of the 2 pixels is written first, the maximum value is written next). Doing this means the input buffers stay the same size, but after a scan line is written, the next scan line is merged with the previous one by taking the min of both and the max of both, so by the end two scan lines are stored in the space of one. The edge-processing cog then performs dual-direction edge detection by comparing the min/max and max/min between lines, in order to ensure you capture the edge even though you aren't storing all the data. The result is an edge map of 4,800 bytes, or 9,600 for both; with the input buffers that's 12,160 bytes, and with the two merged maps placed in a 3rd buffer to be processed by the AI, that's a total of 16,960 bytes, or a little more than half the memory on the Propeller.
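Here's one way to read that scheme in code (plain C rather than Spin/PASM, and with the details Paul left open filled in with guesses: the edge test below just thresholds the min/max spread between neighbouring cells, and the threshold value is made up). Each 2x2 block of the input collapses to a min/max pair, and a 2-bit output code marks horizontal and/or vertical contrast.

    #include <stdint.h>

    #define IN_W   320
    #define IN_H   240
    #define OUT_W  (IN_W / 2)          /* 160 */
    #define OUT_H  (IN_H / 2)          /* 120 */
    #define THRESH 24                  /* edge threshold: a guess, tune it */

    /* One downsampled cell keeps the min and max of its 2x2 source block. */
    typedef struct { uint8_t lo, hi; } cell_t;

    static uint8_t min8(uint8_t a, uint8_t b) { return a < b ? a : b; }
    static uint8_t max8(uint8_t a, uint8_t b) { return a > b ? a : b; }

    /* Merge two 320-pixel scan lines into one row of 160 min/max cells. */
    static void minmax_rows(const uint8_t *r0, const uint8_t *r1, cell_t *out)
    {
        for (int x = 0; x < OUT_W; x++) {
            out[x].lo = min8(min8(r0[2*x], r0[2*x+1]), min8(r1[2*x], r1[2*x+1]));
            out[x].hi = max8(max8(r0[2*x], r0[2*x+1]), max8(r1[2*x], r1[2*x+1]));
        }
    }

    /* Build a 2-bit/cell edge map: bit 0 = horizontal contrast, bit 1 = vertical.
       `img` is a full IN_W x IN_H greyscale frame; `edges` holds OUT_W*OUT_H
       2-bit codes, stored one per byte here for clarity (pack 4 per byte to hit
       the 4,800-byte figure from the thread). */
    void edge_map(const uint8_t *img, uint8_t *edges)
    {
        cell_t prev[OUT_W], cur[OUT_W];

        for (int y = 0; y < OUT_H; y++) {
            minmax_rows(img + (2*y) * IN_W, img + (2*y + 1) * IN_W, cur);
            for (int x = 0; x < OUT_W; x++) {
                uint8_t code = 0;
                /* horizontal edge: big jump between this cell and the one to its left */
                if (x > 0 && (cur[x].hi - cur[x-1].lo > THRESH ||
                              cur[x-1].hi - cur[x].lo > THRESH))
                    code |= 1;
                /* vertical edge: big jump between this row and the row above */
                if (y > 0 && (cur[x].hi - prev[x].lo > THRESH ||
                              prev[x].hi - cur[x].lo > THRESH))
                    code |= 2;
                edges[y * OUT_W + x] = code;
            }
            for (int x = 0; x < OUT_W; x++) prev[x] = cur[x];
        }
    }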
The other thing that occurs to me is that the algorithms I'm aware of are very local; even determining the threshold for detecting an edge is local, so this would decompose fairly well between 2 or more Propellers. I originally was thinking in terms of decomposing it on one Propeller, maybe having 4 cogs doing edge detection, but I haven't really run the numbers to see how much processing power is required. It appears to me that it's better to break things down into blocks than to have something like interlaced processing, because the algorithm looks at the surrounding pixels.
But in the case of multiple Propellers I think you would only be looking at the amount of memory required; you could essentially assume infinite processing power.
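On the blocks-versus-interlaced point: because an operator like the one sketched above only ever looks at the previous row and the previous column, splitting the frame into horizontal bands (one per cog, or per Propeller) only costs a one-row overlap between neighbours. A sketch of the bookkeeping, with hypothetical names:

    /* Split `total_rows` of the edge map across `n_workers`. Each band also
       reads the last row of the band above it (a one-row "halo"), because the
       edge test only ever compares against the previous row.                 */
    typedef struct { int first_row, last_row; } band_t;

    void make_bands(int total_rows, int n_workers, band_t *bands)
    {
        int per = (total_rows + n_workers - 1) / n_workers;   /* rows per band */
        for (int i = 0; i < n_workers; i++) {
            int first = i * per;
            int last  = first + per - 1;
            if (last >= total_rows) last = total_rows - 1;
            bands[i].first_row = (first > 0) ? first - 1 : 0; /* include halo  */
            bands[i].last_row  = last;
        }
    }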
So it's more of a memory problem than a processing problem. I thought of using multiple Propellers if one couldn't cut it, and it looks like that's possible. I was hoping to have it on one if possible.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
There is always an answer.
There is always a way.
There is always a reason. -SciTech02.
My guesstimate of the inner loop of an assembly program (I'm not sure about the timing of instructions) would indicate that 6 cogs could produce an edge map at 60 frames/sec for the resolution Paul mentioned. I think that's pretty interesting, because my original thought was that you would have to have some other system doing frame captures and feeding them to your Propeller, but I think it would be possible to directly input from a flash ADC (or whatever it is they use to convert raw video to digital). That would really simplify things.
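A rough way to check that guesstimate: the Propeller runs at 80 MHz and most PASM instructions take 4 clocks, so each cog is good for about 20 MIPS. Six cogs against a 320x240, 60 frame/sec stream works out to roughly 26 instructions per input pixel, which is tight but not crazy for a small edge kernel:

    #include <stdio.h>

    int main(void)
    {
        const double clock_hz = 80e6;           /* Propeller system clock         */
        const double mips_cog = clock_hz / 4;   /* ~4 clocks per PASM instruction */
        const int    cogs     = 6;
        const double pix_rate = 320.0 * 240.0 * 60.0;   /* input pixels/second    */

        printf("%.1f instructions per input pixel\n", mips_cog * cogs / pix_rate);
        return 0;
    }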
You could play with things to fit it on one chip (Paul didn't like averaging pixels, but one source I read actually recommended it to reduce spurious edges). I mention the multichip thing because producing an edge map is one problem which seems to lend itself to parallel processing. But having a distributed edge map may be terrible for further processing, as far as I know.
You could be correct that averaging would help denoise the system. The systems I've seen (PC-based) use black-frame noise cancellation (take an image in 0-lux conditions, then subtract that frame from subsequent images), but that technique would be too cumbersome for the Propeller.
So they're correcting for flaws in individual cameras? The reference I mentioned didn't actually explain where the problems came from. I hadn't thought of that even though I saw a segment the other day on software that can determine the digital camera a picture was taken with.
I wonder if you could program that black frame into a flash RAM and then run that and the digital video output through a subtractor?
CMOS sensors, and the processes used to make them, have inherent flaws, which show up as noise in the system. Sometimes this is not visible, sometimes it is. There is permanent and temporary noise; the black frame only takes care of the former. Some people find that using an appropriate threshold in the edge-detection algorithm is sufficient to weed out spurious edges.
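For what it's worth, the black-frame correction itself is nothing more than a clamped per-pixel subtraction, so wherever the stored dark frame lives (external flash, as phar suggests, or anywhere else), the core operation is just this:

    #include <stdint.h>
    #include <stddef.h>

    /* Subtract a stored dark ("0 lux") frame from a live frame, clamping at 0.
       Pixels that read bright even in darkness are fixed-pattern noise; after
       subtraction they stop showing up as spurious edges.                    */
    void dark_frame_subtract(uint8_t *live, const uint8_t *dark, size_t n_pixels)
    {
        for (size_t i = 0; i < n_pixels; i++)
            live[i] = (live[i] > dark[i]) ? (uint8_t)(live[i] - dark[i]) : 0;
    }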
On what phar said, I was planning on getting a cam that can send serial data. The CMUcam does it; I just need to find a black-and-white one with a lower resolution.
So it looks like edge detection is the way to go. It would be a nice first step to object recognition (like if you put a black block in a white room it could "see" it as a block), but that's not needed for my goal of vision-based navigation.
So as of now, these are the steps that have to be done to get an image and navigate by it:
1: Send serial data from the cams to the Propeller chip for processing.
2: Use the edge-detection formula that Paul provided.
3: Use the parallax method for distance detection.
Of course, I wonder if it's possible to just process one of the images for edges, then use the other, unprocessed one just for measuring distance?
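For step 3, the usual pinhole-stereo relation is depth = focal length x baseline / disparity: find the same edge at column x_left in one image and x_right in the other, and the pixel disparity gives range with one multiply and one divide (the focal length, baseline, and disparity below are made-up example numbers; you'd calibrate them for your cameras). On the one-image question: you still need to find the matching feature in both images to measure a disparity, but the second image only has to be searched in a narrow band around the edges already found in the first, not fully processed on its own.

    #include <stdio.h>

    /* Depth from binocular parallax for an ideal pinhole stereo pair:
       z = f * b / d, where f is the focal length in pixels, b the baseline
       between the two cameras, and d the disparity in pixels for one feature. */
    double depth_from_disparity(double focal_px, double baseline_m, double disparity_px)
    {
        if (disparity_px <= 0.0)
            return -1.0;            /* no measurable parallax: "far away"       */
        return focal_px * baseline_m / disparity_px;
    }

    int main(void)
    {
        /* Example numbers only: ~350 px focal length for a 320x240 camera,
           cameras 6 cm apart, feature shifted 14 px between the two views.   */
        printf("estimated distance: %.2f m\n",
               depth_from_disparity(350.0, 0.06, 14.0));   /* 1.50 m           */
        return 0;
    }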
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
There is always an answer.
There is always a way.
There is always a reason. -SciTech02.
I just got off NASA's website about their Mars rovers. They use the exact same method that we discussed (2 cameras, parallax method), then project a 3D image and choose the safest route. That's what I was planning on making mine do. You can find more about it at their website if you're interested.
Also, they use black-and-white cameras too. This is starting to look more and more possible as the days progress.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
There is always an answer.
There is always a way.
There is always a reason. -SciTech02.
Hey, look at this website: they use a standard color webcam and a PC to process the data. They have a demo program you can get a password to download and play with, and they also have demo videos. http://www.evolution.com/
Wow. That's useful. I could add that to my current vision-navigation plans (that solves the problem of mapping). Could it work in multiple rooms?
Never mind, it says it can. This system seems to be almost better than my vision-based navigation idea. I like it! Now does anyone know how much it is? (The site doesn't show.)
Thanks, bennettdan.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
There is always an answer.
There is always a way.
There is always a reason. -SciTech02.
They have a "low cost development version" but it is about 1400 dollars and runs stepper motors the servo motor platform version is about 1000 more this includes software and a mobial platform for a laptop but they have some free software down loads to get you started it is reall nice and works great. They have a very nice robotics platform that has windows software for around 500 dollars look for the EV1-K on their websight you can add a stamp to do alot of things with this platform.
I was looking for one that I could just integrate with a robotic platform I already have; I guess that would be the $1,400 one. I couldn't find the $1,400 one, just a $1,700 one.
$1,400? Wow. I wonder why it's so much? Do you think the price could drop over time? I'm not dissing it, it's just that that's a lot of money.
You say that it works well; do you have it? If you do, how easy is integrating it with a robotic platform, and how does it work for you?
Anyway, Evolution Robotics is a great place. They have state-of-the-art sensors for consumers. I hear they're going to use their sensors, including NorthStar, in the next-gen Robosapien robots.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
There is always an answer.
There is always a way.
There is always a reason. -SciTech02.
The ER1-K is a kit and includes the software and stepper drivers for about $500, but I think you can get just the software for about half that cost, and the software will drive stepper or servo traction motors.
Wow! The ER1 is my next robot, hands down. I like how it's a laptop computer, how it can hear, how it can see and recognize, and... well, everything about it is great. Thanks again, bennettdan, for showing me this.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
There is always an answer.
There is always a way.
There is always a reason. -SciTech02.
SciTech02
I plan on building a tank-tread robot as a security guard outside my house with this ER1-K kit from them. I am going to use BS2's to control the sensors, and I plan to link the cam images through a WiFi network to my home PC security system. Have you downloaded the free recognition software yet? It is fun to work with; it lets you learn an object in front of a white background, then pick it out of the room when your cam is pointed in its direction. I also saw in another thread that you were asking about voice recognition; this platform does that too.
Hey, I looked on their website, and the ER1 robot costs $7,500. Is this the $500 kit you talked about? (I also saw the program for the ER1 that was worth $7,500.)
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
There is always an answer.
There is always a way.
There is always a reason. -SciTech02.
Hey
I found that the $500 is the motorized-platform price, but they do have a scaled-down version of the prototype package that is about $1,500, still within reach of, say, a small group if you started a little team, maybe?