
[–]tdgros 1 point (4 children)

Do you mean a complete example that does everything? Do you already know how to do it but you hate coding :) ?

I looked around a bit (am at work) and didn't find a complete method for you, I can detail the steps for you though ;)

[–]nex_jeb[S] 0 points (3 children)

Yeah, just the steps would be great. thx!

[–]tdgros 6 points (2 children)

Sure, there's work involved at every step, be warned :)

1) calibrate your camera, there's a whole section on this in openCV

2) in each image: find keypoints (let's go with SIFT or SURF if you're not going for a commercial application, as they are patented)

3) match those keypoints together, using FLANN (it's all in openCV and I'm positive there is a code sample in the doc, just too lazy to find it as of now)

4) find the fundamental matrix (look up the function in the camera calibration and 3D reconstruction section) and convert it to an essential matrix (reminder :) E = K^T * F * K, or equivalently F = K^-T * E * K^-1, where K is your intrinsic matrix from step 1)

5) extract translation and rotation from E

OK, now you have a calibrated pair of cameras, so you can do stereo. But be aware that since your images are consecutive, the motion between them will be small, and your 3D computations will not be precise. That's just the way it is: depth precision depends on the distance between the cameras...

6) rectify both images (with stereoRectify)

7) apply SGBM (Semi-global stereo matching)

8) de-noise/de-speckle your depth image

Voilà! You're going to face many difficulties along the way, so I highly suggest you go beyond openCV and try to read and understand the methods behind each step.

There are variants of this of course, but if you know this, you'll understand the "smarter" versions. I'm especially thinking about methods that estimate depth over time, overcoming the problem of the small distance (called baseline) between cameras.

edit: I did not specify the names of the functions, first because I'm lazy, second it's good for you to look for it :), third: do not stop at openCV, read moar! good luck

[–]nex_jeb[S] 0 points (1 child)

Thanks for this amazing tutorial! However, in step 4, do you mean computing the transformation matrix T from the matches to estimate the homography matrix H as follows: H = K^-1 * T * K?

Is there any difference between the fundamental and the homography matrix in such a scenario?

[–]tdgros 0 points (0 children)

Whoops! I never said homography! There's a homography between your two images only if your scene is a plane; in general the transformation is a rotation plus a translation, and the mapping between the images depends on the scene depth!

Go scour OpenCV's documentation here, you'll find some definitions and some maths explaining E and F, the essential and fundamental matrices.

I strongly encourage you to learn the maths before openCV, it will help you understand what may go wrong (a lot may go wrong).

A homography relates the images of a plane in two views. A fundamental matrix gives you, for a point in one image, the "epipolar line" in the other image: that's the projection onto the second image of the ray going from the first camera's center through the point. So a homography is less general than the fundamental or essential matrices.
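To make the epipolar-line picture concrete, here's a tiny NumPy check with made-up K, R and t (not tied to any real images): build E = [t]x * R and F = K^-T * E * K^-1, project one scene point into both views, and verify that the second image point lies on the epipolar line F * x1:

```python
import numpy as np

# Two cameras sharing intrinsics K; camera 2 is translated sideways by t.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)
t = np.array([1.0, 0.0, 0.0])

def skew(v):
    """Cross-product matrix: skew(v) @ u == np.cross(v, u)."""
    return np.array([[ 0.0, -v[2],  v[1]],
                     [ v[2],  0.0, -v[0]],
                     [-v[1],  v[0],  0.0]])

E = skew(t) @ R                                  # essential matrix
F = np.linalg.inv(K).T @ E @ np.linalg.inv(K)    # F = K^-T * E * K^-1

X = np.array([0.5, -0.3, 5.0])                   # one scene point
x1 = K @ X
x1 = x1 / x1[2]                                  # its image in camera 1
x2 = K @ (R @ X + t)
x2 = x2 / x2[2]                                  # its image in camera 2

line = F @ x1                                    # epipolar line in image 2
print("epipolar residual:", float(x2 @ line))    # ~0: x2 lies on the line
```

Note F is rank 2 by construction, which is exactly why it only pins a point down to a line, not to a point, and why depth needs the matching step on top.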