
Comparison of Dense Stereo Matching Metrics for Real Time Applications

Merlin George, Student Member, IEEE, and Rejimol Robinson R.R

Abstract: Stereo matching is one of the classical problems in computer vision. The stereo matching problem is to compute the disparity map for a reference image using two or more images of the same scene. This work focuses on local stereo matching methods, which generally have low computational complexity and modest storage requirements and are therefore suitable for real-time and embedded implementations. Among the available classes of algorithms, correlation-based stereo algorithms were selected because they are the only ones that can produce sufficiently dense range maps with an algorithmic structure that lends itself to fast implementations, owing to the simplicity of the underlying computation. This work compares several block matching similarity measures, namely Sum of Absolute Differences (SAD), Sum of Squared Differences (SSD) and Normalized Cross-Correlation (NCC), for calculating depth maps. The results show that NCC provides a closer match to ground truth, with less error and noise, than SAD and SSD.

Index Terms: Disparity Map, Epipolar Constraint, Stereo Correspondence, Stereo Vision.

I. INTRODUCTION

The word "stereo" comes from the Greek word "stereos", which means firm or solid. With stereo vision you see an object as solid in three spatial dimensions: width, height and depth (x, y and z). It is the added perception of the depth dimension that makes stereo vision so rich and special. Stereo matching has been, and continues to be, one of the most active research topics in computer vision. The task of a stereo matching algorithm is to analyse the images taken from a stereo camera pair and to estimate the displacement of corresponding points in the two images in order to extract depth information (inversely proportional to the pixel displacement) of objects in the scene. The displacement is measured in pixels and is called the disparity. Disparity values normally lie within a certain range, the disparity range, and the disparities of all image pixels form the disparity map, which is the output of a stereo matching process. An example with the Teddy benchmark image set is shown in Fig. 1. In the figure, the disparities are visualized as grayscale intensities: the brighter the pixel, the closer the object is to the stereo cameras. The disparity map therefore encodes the depth information of each pixel, and once the depth information has been inferred by stereo matching, the 3D information can be obtained and the 3D scene reconstructed using triangulation. Because stereo matching provides depth information, it has great potential in 3D reconstruction, stereoscopic TV, navigation systems, virtual reality and so on.

Fig. 1. An example of a disparity map. (a) Image taken by the left camera. (b) Image taken by the right camera. (c) The ground truth disparity map associated with the left image.

Many stereo algorithms make use of the epipolar constraint, meaning that for a pixel in the left image the corresponding point in the right image lies on the same horizontal line, the epipolar line. This strong constraint reduces the search space of the correspondence algorithms that calculate depth maps from two dimensions to one. In the past two decades, various stereo matching algorithms have been proposed, and they were summarized and evaluated by Scharstein and Szeliski [1]. In their notable work, the proposed stereo matching algorithms are categorized into two major types: local area based methods and global optimization based methods. In local methods, the disparity evaluation at a given pixel is based on a similarity measurement performed in a finite window. The similarity metric is defined by a matching cost, and the costs within the local window are usually aggregated to provide a more reliable and robust result. Global methods, on the other hand, define a global cost function and solve an optimization problem. Global algorithms typically do not perform an aggregation step, but rather seek a disparity assignment that minimizes the global cost function. In this work we are particularly interested in local stereo matching methods, which generally have low computational complexity and modest storage requirements and are therefore suitable for real-time and embedded implementations.
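The inverse relation between depth and disparity stated in the Introduction can be made concrete with a short sketch. It assumes a rectified stereo pair and the standard pinhole relation Z = f * B / d. The function name and the parameters `focal_px` (focal length in pixels) and `baseline_m` (camera separation in metres) are hypothetical and are not taken from this paper.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres of a point with the given disparity (in pixels).

    Assumes a rectified stereo pair, so that Z = f * B / d.
    A disparity of zero (or less) is treated as a point at infinity.
    """
    if disparity_px <= 0:
        return float("inf")
    return focal_px * baseline_m / disparity_px


# Hypothetical calibration values, for illustration only.
near = depth_from_disparity(64, focal_px=800.0, baseline_m=0.1)
far = depth_from_disparity(32, focal_px=800.0, baseline_m=0.1)
print(near, far)  # the larger disparity gives the smaller depth
```

Halving the disparity doubles the estimated depth, which is why brighter (larger-disparity) pixels in a disparity map correspond to closer objects.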

II. BLOCK MATCHING

The block matching method is one of the most popular local methods because of its simplicity of implementation. The basic idea of block matching for stereo correspondence is as follows: to estimate the disparity of a point in the left image, we define a reference block surrounding this point, and then find the closest matching block within a search range in the right image using a pre-specified matching criterion. The relative displacement between the reference block and the closest matching block constitutes the disparity of the point being evaluated. In this work, the matching criteria used for comparison are the Sum of Absolute Differences (SAD), the Sum of Squared Differences (SSD) and the Normalized Cross-Correlation (NCC).

Normalized Cross-Correlation (NCC) is the standard statistical method for determining similarity. Its normalization, both in the mean and the variance, makes it relatively insensitive to radiometric gain and bias. The Sum of Squared Differences (SSD) metric is computationally simpler than cross-correlation, and it can be normalized as well. In addition to NCC and SSD, many variations of each with different normalization schemes have been used. One popular example is the Sum of Absolute Differences (SAD), which is often used for computational efficiency [3].

III. MATCHING METRICS

This work compares several block matching similarity measures, namely Sum of Absolute Differences (SAD), Sum of Squared Differences (SSD) and Normalized Cross-Correlation (NCC), for calculating depth maps. These are listed in Table I.

A. Sum of Absolute Differences (SAD)

Sum of Absolute Differences (SAD) is one of the simplest similarity measures. It is calculated by subtracting pixels within a square neighbourhood between the reference image I1 and the target image I2, followed by aggregation of the absolute differences within the square window and optimization with the winner-take-all (WTA) strategy [1]. If the left and right image blocks match exactly, the result is zero.

B. Sum of Squared Differences (SSD)

In Sum of Squared Differences (SSD), the differences are squared and aggregated within a square window, and the result is then optimized by the WTA strategy. This measure has a higher computational complexity than SAD because it involves numerous multiplication operations.
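The three metrics and the winner-take-all search described above can be sketched in plain Python as follows. This is a minimal illustration and not the authors' implementation: the helper names (`window`, `disparity_at`) are hypothetical, NCC is assumed to be the zero-mean form (consistent with the statement that it is normalized in both mean and variance), and the match for a left-image pixel at column x is assumed to lie at column x - d in the right image. The caller is expected to choose x, y and the window radius r so that the window fits inside both images.

```python
import math


def window(img, x, y, r):
    """Flatten the (2r+1) x (2r+1) block centred at column x, row y."""
    return [img[y + j][x + i]
            for j in range(-r, r + 1)
            for i in range(-r, r + 1)]


def sad(a, b):
    """Sum of absolute differences; zero for identical blocks."""
    return sum(abs(p - q) for p, q in zip(a, b))


def ssd(a, b):
    """Sum of squared differences; zero for identical blocks."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def ncc(a, b):
    """Zero-mean normalized cross-correlation, in [-1, 1]."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [p - mean_a for p in a]
    db = [q - mean_b for q in b]
    denom = math.sqrt(sum(p * p for p in da) * sum(q * q for q in db))
    if denom == 0.0:
        return 0.0  # flat block: correlation undefined, treated as no match
    return sum(p * q for p, q in zip(da, db)) / denom


def disparity_at(left, right, x, y, r, max_d, metric="sad"):
    """Winner-take-all disparity of left-image pixel (x, y).

    Searches d = 0..max_d along the same row of the right image.
    SAD and SSD are minimized; NCC is maximized (its negative is minimized).
    """
    ref = window(left, x, y, r)
    best_d, best_cost = 0, None
    for d in range(max_d + 1):
        if x - d - r < 0:
            break  # candidate window would leave the right image
        cand = window(right, x - d, y, r)
        if metric == "sad":
            cost = sad(ref, cand)
        elif metric == "ssd":
            cost = ssd(ref, cand)
        elif metric == "ncc":
            cost = -ncc(ref, cand)
        else:
            raise ValueError("unknown metric: " + metric)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

As a small check of the gain and bias insensitivity of NCC, `ncc([1, 2, 3], [12, 14, 16])` evaluates to 1.0, because the second block is the first one scaled by 2 and offset by 10, whereas SAD and SSD report a large difference for the same pair.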
TABLE I
BLOCK MATCHING METRICS USED FOR COMPARISON

Match Metric                          Definition
Sum of Absolute Differences (SAD)     $\mathrm{SAD}(x,y,d) = \sum_{(i,j) \in W} \left| I_1(x+i,\, y+j) - I_2(x+i-d,\, y+j) \right|$
Sum of Squared Differences (SSD)      $\mathrm{SSD}(x,y,d) = \sum_{(i,j) \in W} \left( I_1(x+i,\, y+j) - I_2(x+i-d,\, y+j) \right)^2$
Normalized Cross-Correlation (NCC)    $\mathrm{NCC}(x,y,d) = \dfrac{\sum_{(i,j) \in W} \left( I_1(x+i,\, y+j) - \bar{I}_1 \right) \left( I_2(x+i-d,\, y+j) - \bar{I}_2 \right)}{\sqrt{\sum_{(i,j) \in W} \left( I_1(x+i,\, y+j) - \bar{I}_1 \right)^2 \, \sum_{(i,j) \in W} \left( I_2(x+i-d,\, y+j) - \bar{I}_2 \right)^2}}$

Here $W$ denotes the square matching window centred on $(x, y)$, and $\bar{I}_1$ and $\bar{I}_2$ are the mean intensities of the reference and target windows.

C. Normalized Cross-Correlation (NCC)

Normalized Cross-Correlation is more complex than both SAD and SSD because it involves numerous multiplication, division and square root operations. However, the results show that it gives the best disparity map of the three.

IV. RESULTS AND DISCUSSIONS

In this section, we present experimental results on the Teddy stereo pair with ground truth from the Middlebury Stereo Vision page. The Teddy stereo image pair was chosen for the study because it is rich in depth discontinuities. Sum of Absolute Differences (SAD) is easier and faster to compute than Sum of Squared Differences (SSD) and Normalized Cross-Correlation (NCC). However, Table II shows that NCC gives a more accurate disparity map than SAD and SSD. NCC also reduces the error and noise of the disparity map, since the calculation averages the noise of each pixel. The error has been calculated for different window sizes. Table III shows that NCC provides a closer match to ground truth by reducing the noise produced by SAD and SSD.

TABLE II
COMPARATIVE PERFORMANCE OF ALGORITHMS ON TEDDY STEREO IMAGE PAIR

Image    Method    Error (3x3 window)    Error (5x5 window)    Error (7x7 window)
Teddy    SAD       4.3420e+004           4.3151e+004           4.3029e+004
Teddy    SSD       4.3286e+004           4.3076e+004           4.2977e+004
Teddy    NCC       4.2908e+004           4.2502e+004           4.2398e+004

TABLE III
DISPARITY MAP COMPARISON OF TEDDY STEREO IMAGE PAIR

Disparity maps produced by SAD, SSD and NCC for window sizes 3x3, 5x5 and 7x7.

V. CONCLUSIONS AND FUTURE WORK

In general, SAD is easier to compute and is less sensitive to outliers than other measures. Stereo by SAD correlation has proven to be a robust and reliable tool in moderately complex environments. This work shows that Normalized Cross-Correlation (NCC) provides a closer match to ground truth, with a lower computed error, than Sum of Absolute Differences (SAD) and Sum of Squared Differences (SSD). However, the computing time taken by NCC is much higher than that of SAD and SSD. Our future work in this area is therefore to develop an efficient NCC-based stereo matching algorithm that runs faster than conventional NCC.

REFERENCES

[1] D. Scharstein and R. Szeliski, "A taxonomy and evaluation of dense two-frame stereo correspondence algorithms," International Journal of Computer Vision, vol. 47, no. 1, pp. 7-42, 2002.
[2] D. Scharstein and R. Szeliski, "A taxonomy and evaluation of dense two-frame stereo correspondence algorithms," International Journal of Computer Vision, vol. 47, no. 1, pp. 7-42, 2002.
[3] M. Z. Brown, D. Burschka, and G. D. Hager, "Advances in computational stereo," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 25, no. 8, pp. 993-1001, 2003.
[4] E. Salari and J. Strong, "On the reliability of correlation based stereo matching," in IEEE Int. Conf. on Systems Engineering, pp. 559-561, 1990.
[5] T. Kanade and M. Okutomi, "A Stereo Matching Algorithm with an Adaptive Window: Theory and Experiments," PAMI, vol. 16, no. 9, pp. 920-932, 1994.
[6] S. T. Barnard and M. A. Fischler, "Computational Stereo," ACM Computing Surveys, vol. 14, pp. 553-572, 1982.
[7] S. Birchfield and C. Tomasi, "Depth Discontinuities by Pixel-to-Pixel Stereo," Technical Report STAN-CS-TR-96-1573, Stanford Univ., 1996.
[8] O. Faugeras, B. Hotz, H. Matthieu, T. Vieville, Z. Zhang, P. Fua, E. Theron, L. Moll, G. Berry, J. Vuillemin, P. Bertin, and C. Proy, "Real Time Correlation-Based Stereo: Algorithm, Implementations and Applications," INRIA Technical Report 2013, 1993.
[9] S. Birchfield and C. Tomasi, "Depth Discontinuities by Pixel-to-Pixel Stereo," Proc. IEEE Int'l Conf. Computer Vision, pp. 1073-1080, 1998.
[10] http://vision.middlebury.edu/stereo/data/...

Merlin George received the B.Tech degree in Computer Science and Engineering from M.G University, Kottayam, in 2006 and is now an M.Tech student in Computer Science and Engineering at Kerala University, Thiruvananthapuram. Her fields of interest include stereo matching, 3D reconstruction, and computational photography. She is a student member of the IEEE.

Rejimol Robinson R.R received the B.Tech degree in Computer Science and Engineering from the University of Kerala in 1999 and the M.Tech degree in Computer Science, with specialization in Digital Image Computing, from the same university in 2007. She is currently working as a Senior Lecturer in Computer Science and Engineering at the University of Kerala. Her research interests include Digital Image Processing, Pattern Recognition, Network Security, and Intrusion Detection Systems.
