
6. FINDING BASIC SHAPES


6.1 Combining Edges
Fragments of edges, even when they have been joined up in some way (for example, by crack edge relaxation), are not very useful in themselves unless they are used to enhance a previous image. From an identification point of view it is more useful to determine the structure of lines: their equations, lengths, thicknesses and so on. There are a variety of edge-combining methods in the literature. These include edge following and Hough transforms.

6.2 Hough Transform


This technique discovers shapes from image edges. It assumes that a primitive edge detection has already been performed on the image. It attempts to combine edges into lines, where a sequence of edge pixels in a line indicates that a real edge exists. As well as detecting straight lines, versions of the Hough transform can be used to detect regular or non-regular shapes. As will be seen, however, the most generalized Hough transform, which detects a specific two-dimensional shape of any size or orientation, requires a lot of processing power to do its work in a reasonable time.

6.2.1 Basic principle of the straight-line Hough transform

After primitive edge detection and then thresholding to keep only pixels with a strong edge gradient, the screen may look like Figure 6.1.

Figure 6.1 Screen after primitive edge detection and thresholding (only significant edge pixels shown).

A straight line connecting a sequence of pixels can be expressed in the form

$$y = mx + c$$

If we can evaluate values for m and c such that the line passes through a number of the pixels that are set, then we have a usable representation of a straight line. The Hough transform takes the above image and converts it into a new image in what is termed a new space. In fact, it transforms each significant edge pixel in (x,y) space into a straight line in this new space.

Figure 6.2 Original data (four edge pixels and the line to be found).

Clearly, many lines go through a single point (x,y): a horizontal line can be drawn through the point, a vertical line, and all the lines at different angles between these. However, each line will have a slope (m) and intercept (c) such that the above equation holds true. A little manipulation of the above equation gives

$$c = (-x)m + y$$

For the four data points this gives:

    (x, y)    Gives            Transposed
    (1, 3)    3 = m.1 + c      c = -1m + 3
    (2, 2)    2 = m.2 + c      c = -2m + 2
    (4, 3)    3 = m.4 + c      c = -4m + 3
    (4, 0)    0 = m.4 + c      c = -4m

Figure 6.3 Accumulator array in (m,c) space, showing the four lines c = -1m + 3, c = -2m + 2, c = -4m + 3 and c = -4m; three lines coincide at one element. The maximum in the accumulator array is 3 at (-1,4), suggesting that a line y = -1x + 4 goes through three of the original data points.
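The voting for these four data points can be checked with a few lines of code. The following is only a minimal sketch: the function name `hough_mc`, the use of a dictionary-style counter in place of a fixed accumulator array, and the integer slope grid from -4 to 4 are all assumptions made for illustration, and each line is plotted one element wide rather than as a wide line.

```python
from collections import Counter

def hough_mc(points, slopes):
    """Vote in (m, c) space: each point (x, y) adds one vote to every
    element (m, c) with c = -x*m + y, for each sampled slope m."""
    acc = Counter()
    for x, y in points:
        for m in slopes:
            c = -x * m + y          # the transposed line equation
            acc[(m, c)] += 1
    return acc

points = [(1, 3), (2, 2), (4, 3), (4, 0)]   # the four data points
slopes = range(-4, 5)                       # assumed integer slope grid
acc = hough_mc(points, slopes)
(best_m, best_c), votes = acc.most_common(1)[0]
print(best_m, best_c, votes)                # -> -1 4 3
```

The peak of three votes at m = -1, c = 4 corresponds to the line y = -x + 4 through (1,3), (2,2) and (4,0); the point (4,3) does not lie on it.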

In the form c = (-x)m + y, the values of x and y are known (the position where the pixel may be on an edge), so the equation now represents a straight line in (m,c) space; i.e. with a horizontal m-axis and a vertical c-axis, each (x,y) edge pixel corresponds to a straight line on this new (m,c) graph. Space is needed to hold this set of lines in an array (called the accumulator array). Then for every (x,y) point, each element that lies on the corresponding line in the (m,c) accumulator array is incremented, so that after the first point in (x,y) space has been processed there will be a line of 1s in the (m,c) array. This plotting in the (m,c) array is done using an enhanced form of Bresenham's algorithm, which plots a wide straight line (so that crossing lines are not missed at the ends). After all the (x,y) pixels have been processed, the highest value in the (m,c) accumulator array indicates that a large number of lines cross at that point (m,c). The value in this element corresponds to the same number of pixels lying on a straight line in (x,y) space, and the position of this element gives the equation of that line in (x,y) space:

$$y = mx + c$$

6.2.2 Problems

There are serious problems in using (m,c) space. For each pixel, m may properly vary from minus infinity to infinity (a vertical line). Clearly this is unsatisfactory: no accumulator array can be set up with enough elements. There are alternatives, such as using two accumulator arrays, with m ranging over $-1 \le m \le +1$ in one and $-1 \le 1/m \le +1$ in the second. It is safer, though it requires more calculation, to use angles, transforming to polar coordinates $(r, \theta)$, where

$$x\cos\theta + y\sin\theta = r$$
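A minimal sketch of voting in polar space may help to show why the polar form avoids the infinite-slope problem. The function name `hough_polar`, the 1° angle step, the rounding of r to the nearest integer and the decision to keep only r >= 0 are assumptions made for this illustration; one element is incremented per vote.

```python
import math
from collections import Counter

def hough_polar(points, theta_step_deg=1):
    """Vote in (r, theta) space using r = x*cos(theta) + y*sin(theta)."""
    acc = Counter()
    for x, y in points:
        for theta_deg in range(0, 360, theta_step_deg):
            t = math.radians(theta_deg)
            r = x * math.cos(t) + y * math.sin(t)
            if r >= 0:                       # keep one representation per line
                acc[(round(r), theta_deg)] += 1
    return acc

points = [(1, 3), (2, 2), (4, 3), (4, 0)]
acc = hough_polar(points)
print(acc[(3, 45)])   # votes for the line y = -x + 4 (r = 2.83, theta = 45 degrees)
print(acc[(4, 0)])    # votes for the vertical line x = 4
```

The vertical line x = 4, which has no finite slope m, is simply the element r = 4, θ = 0°. Because r is rounded, neighbouring elements around (3, 45°) may collect the same number of votes, which is one reason for looking at groups of high values and not a single element.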
Figure 6.4 Family of lines (Cartesian coordinates) through the point (x,y): $y = a_kx + b_k$, $k = 1, \dots, 5$.

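The shortest distance from the origin to the line defines the line in terms of $r$ and $\theta$. For one of the many possible lines through (x,y), e.g. y = ax + b, the construction splits r into a segment of length $x/\cos\theta$, which reaches height $x\tan\theta$ above the point x on the x-axis, and a segment of length $(y - x\tan\theta)\sin\theta$, so that

$$r = \frac{x}{\cos\theta} + (y - x\tan\theta)\sin\theta = \frac{x}{\cos\theta} + y\sin\theta - \frac{x\sin^2\theta}{\cos\theta} = \frac{x(1 - \sin^2\theta)}{\cos\theta} + y\sin\theta = x\cos\theta + y\sin\theta$$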

Figure 6.5 Relationship between Cartesian straight line and polar defined line.

Technique 6.1. Real straight-edge discovery using the Hough transform.

USE. This technique is used to find and connect substantial straight edges already found using an edge detector.

OPERATION. For each edge pixel value I(x,y), vary $\theta$ from 0° to 360° and calculate $r = x\cos\theta + y\sin\theta$. Given an accumulator array of size (N+M, 360), increment those elements in the array that lie in a box (b x b) with centre $(r, \theta)$. Clearly if the box is 1 x 1, only one element of the array is incremented; if the box is 3 x 3, nine elements are incremented. This gives a "thick" line in the new space so that intersections are not missed. Finally, look for the highest values in the accumulator array $(r, \theta)$ and thus identify the pairs $(r, \theta)$ that are most likely to indicate a line in (x,y) space.

This method can be enhanced in a number of ways:

1. Instead of just incrementing the cells in the accumulator array, the gradient of the edge, prior to thresholding, could be added to the cell, thus plotting a measure of the likelihood of this being an edge.

2. Gradient direction can be taken into account. If this suggests that the direction of the real edge lies between two angles $\theta_1$ and $\theta_2$, then only the elements in the $(r, \theta)$ array that lie in $\theta_1 < \theta < \theta_2$ are plotted.

3. The incrementing box does not need to be uniform. It is known that the best estimate of $(r, \theta)$ is at the centre of the box, so this element is incremented by a larger figure than the elements around it.

Note that the line length is not given, so as it stands the lines extend to infinity. Three approaches may be considered:

1. Pass a 3 x 3 median filter over the original image and subtract the value of the centre pixel in the window from the result. This tends to find some corners of images, thus enabling line endings to be estimated.

2. Set up four further accumulator arrays. The first pair can hold the most north-east position on the line and the second pair the most south-west position, these positions being updated as and when a pixel contributes to the corresponding accumulating element in the main array.

3. Again with four further accumulator arrays, let the main accumulator array be increased by w for some pixel (x,y). Increase the first pair by wx and wy and the second pair by $(wx)^2$ and $(wy)^2$. At the end of the operation a good estimate of the ends of the line is the mean of the line $\pm 2\sigma$, where $\sigma$ is the standard deviation, i.e.

$$\text{End of line estimate} = \frac{\sum wx}{\sum w} \pm 2\sqrt{\frac{\sum (wx)^2}{\sum w} - \left(\frac{\sum wx}{\sum w}\right)^2}$$

for the x range, and the similar expression for the y range. This makes some big assumptions regarding the distribution of edge pixels (e.g. it assumes that the distribution is not skewed to one end of the line) and so may not always be appropriate.

The Hough technique is good for finding straight lines. It is even better for finding circles. Again the algorithm requires significant edge pixels to be identified, so some edge detector must be passed over the original image before it is transformed using the Hough technique.

Technique 6.2. Real circle discovery using the Hough transform.

USE. Finding circles from an edge-detected image.

OPERATION. If the object is to search for circles of a known radius R, say, then the following identity can be used:

$$(x - a)^2 + (y - b)^2 = R^2$$

where (a,b) is the centre of the circle. Again in (x,y) space either all pixels on an edge are identified (by thresholding) or every pixel with I(x,y) > 0 is processed. For each edge pixel to be processed, a circle of elements of radius R, centred on that pixel, is incremented in the (a,b) accumulator array ($0 < a < M-1$, $0 < b < N-1$). Bresenham's circle drawing algorithm can be used to increment the circle elements quickly. Finally, the highest values in the (a,b) array indicate coincident circles in (a,b) space, corresponding to a number of pixels on the edge of the same circle in (x,y) space.
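The operation of Technique 6.2 can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the names `circle_offsets` and `hough_circle` are hypothetical, the circle of elements is generated by simple angle stepping instead of Bresenham's circle algorithm, and the test data are twelve pixels lying exactly on a circle of radius 5 centred at (10,10).

```python
import math
from collections import Counter

def circle_offsets(R, steps=360):
    """Integer offsets approximating a circle of radius R.
    (Angle stepping is used here only for brevity.)"""
    offs = set()
    for k in range(steps):
        t = 2 * math.pi * k / steps
        offs.add((round(R * math.cos(t)), round(R * math.sin(t))))
    return offs

def hough_circle(edge_pixels, R, width, height):
    """Increment a circle of radius R around every edge pixel in (a, b) space."""
    acc = Counter()
    offs = circle_offsets(R)
    for x, y in edge_pixels:
        for dx, dy in offs:
            a, b = x + dx, y + dy
            if 0 <= a < width and 0 <= b < height:
                acc[(a, b)] += 1
    return acc

# Twelve pixels on the circle (x - 10)^2 + (y - 10)^2 = 5^2.
exact = [(5, 0), (-5, 0), (0, 5), (0, -5), (3, 4), (3, -4),
         (-3, 4), (-3, -4), (4, 3), (4, -3), (-4, 3), (-4, -3)]
edges = [(10 + dx, 10 + dy) for dx, dy in exact]
acc = hough_circle(edges, R=5, width=21, height=21)
best = acc.most_common(1)[0]
print(best)   # -> ((10, 10), 12)
```

Every edge pixel contributes one vote to the true centre, so the peak value equals the number of edge pixels found on the circle.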
Figure 6.6. Original data in (x,y) domain, showing the circle to be found.

Again it is possible to reduce the amount of work by using the gradient direction to indicate the likely arc within which the circle centre is expected to lie. Figure 6.7 illustrates this technique. It is possible to look for the following types of circles:

    Type of circle                              Accumulator
    different radii                             plot in (a,b,R) space
    different radii, same vertical centres      plot in (b,R) space
    different radii, same horizontal centres    plot in (a,R) space

Figure 6.7 Illustration of Hough circle transform (looking for circles radius 1/2): corresponding accumulator circles in (a,b) domain. Four circles coincide at one point only.

If the circle radius is known to be one of three values, say, then (a,b,R) space can be three planes of (a,b) arrays. The following points are important:

1. As the number of unknown parameters increases, the amount of processing increases exponentially.

2. The Hough technique above can be used to discover any edge that can be expressed as a simple identity.

3. The generalized Hough transform can also be used to discover shapes that cannot be represented by simple mathematical identities. This is described below.

Technique 6.3. The generalized Hough transform.

USE. Find a known shape in its most general form (of any size or orientation) in an image. In practice it is best to go for a known size and orientation.

OPERATION. Some preparation is needed prior to the analysis of the image. Given the object boundary, and assuming that the object in the image is of the same size and orientation (otherwise a number of accumulator arrays have to be set up for different sizes and orientations), a centre (x,y) is chosen somewhere within the boundary of the object. The boundary is then traversed, and after every step d along the boundary the angle of the boundary tangent with respect to the horizontal is noted, together with the x difference and y difference of the boundary position from the centre point. For every pixel I(x,y) in the edge-detected image, the gradient direction is found. Each noted boundary step whose angle matches that direction gives, through its x and y differences, a candidate position for the centre; the accumulator array (same size as the image) is incremented by 1 for each such element. Finally, the highest-valued elements in the accumulator array point to the possible centres of the object in the image.
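The preparation and voting steps of Technique 6.3 can be sketched as follows, for a shape of known size and orientation. The names `build_r_table` and `generalized_hough` are hypothetical, and the sketch assumes that the angle noted for each model boundary step and the direction measured at each image edge pixel have already been quantised to the same set of values (here 0, 45 and 90). The boundary samples are made-up illustration data.

```python
from collections import Counter, defaultdict

def build_r_table(boundary, centre):
    """Preparation: for each boundary sample (x, y, angle), store the
    x and y differences from the sample to the chosen centre."""
    cx, cy = centre
    table = defaultdict(list)
    for x, y, angle in boundary:
        table[angle].append((cx - x, cy - y))
    return table

def generalized_hough(edges, table):
    """Voting: every edge pixel votes for each centre position that a
    model boundary step with the same angle would imply."""
    acc = Counter()
    for x, y, angle in edges:
        for dx, dy in table.get(angle, ()):
            acc[(x + dx, y + dy)] += 1
    return acc

# Made-up model boundary: (x, y, quantised angle in degrees).
model = [(0, 0, 0), (2, 0, 0), (4, 0, 0), (4, 2, 90),
         (4, 4, 90), (3, 3, 45), (2, 2, 45), (1, 1, 45)]
centre = (3, 1)
# The same shape found in an image, translated by (10, 5).
edges = [(x + 10, y + 5, angle) for x, y, angle in model]

acc = generalized_hough(edges, build_r_table(model, centre))
best = acc.most_common(1)[0]
print(best)   # -> ((13, 6), 8)
```

The peak lies at the model centre shifted by the same translation as the shape, and its value equals the number of boundary samples that were matched.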

6.3 Bresenham's Algorithm


Bresenham's line algorithm is an efficient method for scan-converting straight lines in that it uses only integer addition, subtraction, and multiplication by 2. A computer can perform integer addition and subtraction very rapidly, and it is also time-efficient when performing integer multiplication and division by powers of 2. The algorithm described in the following is a modified version of the Bresenham algorithm. It is commonly referred to as the midpoint line algorithm.
Figure 6.8 Midpoint algorithm (the figure marks the candidate pixels U and D at $x_k + 1$, the rows $y_k$ and $y_{k+1}$, and the distances $d_1$ and $d_2$).

The equation of a straight line in 2-dimensional space can be written in an implicit form as

$$F(x, y) = ax + by + c = 0$$

From the slope-intercept form

$$y = \frac{dy}{dx}x + B$$

we can bring it to the implicit form as

$$dy \cdot x - dx \cdot y + B\,dx = 0$$

So

$$a = dy, \qquad b = -dx, \qquad c = B\,dx$$

Suppose that point $(x_i, y_i)$ has been plotted. We move $x_i$ to $x_i + 1$. The problem is to select between two pixels, $U(x_i + 1, y_i + 1)$ and $D(x_i + 1, y_i)$. For this purpose, we consider the midpoint $M(x_i + 1, y_i + \tfrac{1}{2})$. We have

$$d = F(M) = a(x_i + 1) + b(y_i + \tfrac{1}{2}) + c$$

If d > 0, choose U; if d < 0, choose D; if d = 0, either U or D may be chosen, so choose U.

- When D is chosen, M is incremented one step in the x direction. So

$$d_{new} = F(x_i + 2, y_i + \tfrac{1}{2}) = a(x_i + 2) + b(y_i + \tfrac{1}{2}) + c$$

while

$$d_{old} = F(x_i + 1, y_i + \tfrac{1}{2}) = a(x_i + 1) + b(y_i + \tfrac{1}{2}) + c$$

So the increment in d (denoted $d_D$) is

$$d_D = d_{new} - d_{old} = a = dy$$

- When $U(x_i + 1, y_i + 1)$ is chosen, M is incremented one step in both directions:

$$d_{new} = F(x_i + 2, y_i + \tfrac{3}{2}) = a(x_i + 2) + b(y_i + \tfrac{3}{2}) + c = d_{old} + a + b$$

So the increment in d (denoted $d_U$) is

$$d_U = a + b = dy - dx$$

In summary, at each step, the algorithm chooses between two pixels based on the sign of d. It updates d by adding dD or dU to the old value.

First, we have the point $(x_1, y_1)$. So the first midpoint is $M(x_1 + 1, y_1 + \tfrac{1}{2})$ and

$$F(M) = a(x_1 + 1) + b(y_1 + \tfrac{1}{2}) + c = F(x_1, y_1) + a + \frac{b}{2}$$

Since $F(x_1, y_1) = 0$, we have

$$d = d_1 = dy - \frac{dx}{2}$$

In order to avoid a division by 2, we use $2d_1$ instead; afterward, 2d is used throughout. So, with d written in place of 2d, we have

First set $d_1 = 2dy - dx$.
If $d_i \ge 0$ then $x_{i+1} = x_i + 1$, $y_{i+1} = y_i + 1$ and $d_{i+1} = d_i + 2(dy - dx)$.
If $d_i < 0$ then $x_{i+1} = x_i + 1$, $y_{i+1} = y_i$ and $d_{i+1} = d_i + 2dy$.

The algorithm can be summarized as follows:

Midpoint Line Algorithm [Scan-convert the line between (x1, y1) and (x2, y2)]

    dx = x2 - x1;
    dy = y2 - y1;
    d = 2*dy - dx;          /* initial value of d */
    dD = 2*dy;              /* increment used to move D */
    dU = 2*(dy - dx);       /* increment used to move U */
    x = x1;
    y = y1;
    Plot Point (x, y);      /* the first pixel */
    While (x < x2)
        if d < 0 then
            d = d + dD;     /* choose D */
            x = x + 1;
        else
            d = d + dU;     /* choose U */
            x = x + 1;
            y = y + 1;
        endif
        Plot Point (x, y);  /* the selected pixel closest to the line */
    EndWhile
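The summary above translates almost line for line into a runnable form. The following is one possible transcription, offered as a sketch: the function name `midpoint_line`, the returned list of pixels (in place of a Plot Point routine) and the guard that rejects slopes outside the range 0 to 1 are additions made for illustration.

```python
def midpoint_line(x1, y1, x2, y2):
    """Scan-convert the line from (x1, y1) to (x2, y2), slope between 0 and 1."""
    dx, dy = x2 - x1, y2 - y1
    if not 0 <= dy <= dx:
        raise ValueError("this sketch handles only slopes between 0 and 1")
    d = 2 * dy - dx          # initial decision value (2 * F at the first midpoint)
    dD = 2 * dy              # increment when D is chosen
    dU = 2 * (dy - dx)       # increment when U is chosen
    x, y = x1, y1
    pixels = [(x, y)]        # the first pixel
    while x < x2:
        if d < 0:            # midpoint above the line: choose D
            d += dD
        else:                # midpoint on or below the line: choose U
            d += dU
            y += 1
        x += 1
        pixels.append((x, y))
    return pixels

print(midpoint_line(5, 8, 9, 11))
# -> [(5, 8), (6, 9), (7, 10), (8, 10), (9, 11)]
```

Only integer additions and comparisons occur inside the loop, which is the property that makes the method attractive.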

Remark. The described algorithm works only for lines with slope between 0 and 1. It is generalized to lines of arbitrary slope by considering the symmetry between the various octants and quadrants of the xy-plane.

Example. Scan-convert the line between (5, 8) and (9, 11).

Here dy = 11 - 8 = 3 and dx = 9 - 5 = 4. Since 0 < dy < dx, the slope lies between 0 and 1 and the algorithm can be applied.

First, $d_1 = 2dy - dx = 6 - 4 = 2 > 0$, so the new point is (6, 9) and
$d_2 = d_1 + 2(dy - dx) = 2 + 2(-1) = 0$, so the chosen pixel is (7, 10) and
$d_3 = d_2 + 2(dy - dx) = 0 + 2(-1) = -2 < 0$, so the chosen pixel is (8, 10); then
$d_4 = d_3 + 2dy = -2 + 6 = 4 > 0$, so the chosen pixel is (9, 11).

6.3.2 Circle incrementation

A circle is a symmetrical figure. Any circle-generating algorithm can take advantage of the circle's symmetry to plot eight points for each value that the algorithm calculates. Eight-way symmetry is used by reflecting each calculated point around each 45° axis. For example, if point 1 in Figure 6.9 were calculated with a circle algorithm, seven more points could be found by reflection. The reflection is accomplished by reversing the x, y coordinates as in point 2; reversing the x, y coordinates and reflecting about the y axis as in point 3; reflecting about the y axis as in point 4; switching the signs of x and y as in point 5; reversing the x, y coordinates, reflecting about the y axis and reflecting about the x axis as in point 6; reversing the x, y coordinates and reflecting about the x axis as in point 7; and reflecting about the x axis as in point 8. To summarize:

    P1 = (x, y)      P5 = (-x, -y)
    P2 = (y, x)      P6 = (-y, -x)
    P3 = (-y, x)     P7 = (y, -x)
    P4 = (-x, y)     P8 = (x, -y)

Figure 6.9 Eight-way symmetry of a circle, illustrated with (x, y) = (8, 2): the eight points are (8, 2), (2, 8), (-2, 8), (-8, 2), (-8, -2), (-2, -8), (2, -8) and (8, -2).

(i) Defining a Circle

There are two standard methods of mathematically defining a circle centered at the origin. The first method defines a circle with the second-order polynomial equation (see Figure 6.10)

$$y^2 = r^2 - x^2$$

where x = the x coordinate, y = the y coordinate, and r = the circle radius.

With this method, each x coordinate in the sector from 90° to 45° is found by stepping x from 0 to $r/\sqrt{2}$, and each y coordinate is found by evaluating $\sqrt{r^2 - x^2}$ for each step of x. This is a very inefficient method, however, because for each point both x and r must be squared and subtracted from each other; then the square root of the result must be found.
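The eight-way symmetry and the polynomial method can be combined in a short sketch. The function names `eight_way` and `circle_polynomial` are hypothetical, and rounding y to the nearest integer is an assumption about how the result would be mapped onto the raster.

```python
import math

def eight_way(x, y):
    """Return the eight symmetric points P1..P8 generated by (x, y)."""
    return [(x, y), (y, x), (-y, x), (-x, y),
            (-x, -y), (-y, -x), (y, -x), (x, -y)]

def circle_polynomial(r):
    """Step x from 0 to r/sqrt(2), evaluate y = sqrt(r^2 - x^2) for each
    step, and mirror every computed point eight ways."""
    points = set()
    x = 0
    while x <= r / math.sqrt(2):
        y = round(math.sqrt(r * r - x * x))   # one square root per step
        points.update(eight_way(x, y))
        x += 1
    return points

print(eight_way(8, 2))
# -> [(8, 2), (2, 8), (-2, 8), (-8, 2), (-8, -2), (-2, -8), (2, -8), (8, -2)]
print(sorted(circle_polynomial(5))[:4])
```

Each step of x costs a multiplication, a subtraction and a square root, which is the inefficiency noted above.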

The second method of defining a circle makes use of trigonometric functions (see Figure 6.11):

$$x = r\cos\theta, \qquad y = r\sin\theta$$

where $\theta$ = current angle, r = circle radius, x = x coordinate, and y = y coordinate.

Fig. 6.10 Circle defined with a second-degree polynomial equation: $P = (x, \sqrt{r^2 - x^2})$.

Fig. 6.11 Circle defined with trigonometric functions: $P = (r\cos\theta, r\sin\theta)$.

By this method, $\theta$ is stepped in small increments through a sector of angle $\pi/4$, and each value of x and y is calculated. However, computation of the values of $\sin\theta$ and $\cos\theta$ is even more time-consuming than the calculations required by the first method.

(ii) Bresenham's Circle Algorithm

If a circle is to be plotted efficiently, the use of trigonometric and power functions must be avoided. As with the generation of a straight line, it is also desirable to perform the calculations necessary to find the scan-converted points with only integer addition, subtraction, and multiplication by powers of 2. Bresenham's circle algorithm allows these goals to be met. Scan-converting a circle using Bresenham's algorithm works as follows. If the eight-way symmetry of a circle is used to generate a circle, points will only have to be generated through a 45° angle. And, if points are generated from 90° to 45°, moves will be made only in the +x and -y directions (see Figure 6.12).

Figure 6.12 Circle scan-converted with Bresenham's algorithm (moves are made in the +x and -y directions through the 45° sector).

The best approximation of the true circle will be described by those pixels in the raster that fall the least distance from the true circle. Examine Figures 6.13(a) and 6.13(b). Notice that if points are generated from 90° to 45°, each new point closest to the true circle can be found by taking either of two actions: (1) move in the x direction one unit or (2) move in the x direction one unit and move in the negative y direction one unit. Therefore, a method of selecting between these two choices is all that is necessary to find the points closest to the true circle. Due to the 8-way symmetry, we need to concentrate only on the arc from (0, r) to $(r/\sqrt{2}, r/\sqrt{2})$. Here we assume r to be an integer.

Suppose that $P(x_i, y_i)$ has been selected as closest to the circle. The choice of the next pixel is between U and D (Figure 6.13). Let $F(x, y) = x^2 + y^2 - r^2$. We know that

    F(x, y) = 0  then (x, y) lies on the circle
    F(x, y) > 0  then (x, y) is outside the circle
    F(x, y) < 0  then (x, y) is inside the circle

Let M be the midpoint of DU. If M is outside the circle then pixel D is closer to the circle, and if M is inside, pixel U is closer to the circle. Let

$$d_{old} = F(x_i + 1, y_i - \tfrac{1}{2}) = (x_i + 1)^2 + (y_i - \tfrac{1}{2})^2 - r^2$$

* If $d_{old} < 0$, then $U(x_i + 1, y_i)$ is chosen and the next midpoint will be one increment over in x. Thus

$$d_{new} = F(x_i + 2, y_i - \tfrac{1}{2}) = d_{old} + 2x_i + 3$$

The increment in d is

$$d_U = d_{new} - d_{old} = 2x_i + 3$$

* If $d_{old} \ge 0$, M is outside the circle and D is chosen. The new midpoint will be one increment over in x and one increment down in y:

$$d_{new} = F(x_i + 2, y_i - \tfrac{3}{2}) = d_{old} + 2x_i - 2y_i + 5$$

The increment in d is therefore

$$d_D = d_{new} - d_{old} = 2(x_i - y_i) + 5$$

Since the increments $d_U$ and $d_D$ are functions of $(x_i, y_i)$, we call point $P(x_i, y_i)$ the point of evaluation.

Initial point: (0, r). The next midpoint lies at $(1, r - \tfrac{1}{2})$ and so

$$F(1, r - \tfrac{1}{2}) = 1 + (r - \tfrac{1}{2})^2 - r^2 = \tfrac{5}{4} - r$$

To avoid the fractional initialization of d, we take $h = d - \tfrac{1}{4}$. So the initial value of h is $1 - r$ and the comparison $d < 0$ becomes $h < -\tfrac{1}{4}$. However, since h starts out with an integer value and is incremented with integer values ($d_U$ and $d_D$), we can change the comparison to $h < 0$. Thus we have an integer algorithm in terms of h. It is summarized as follows:
Figure 6.13 Bresenham's Circle Algorithm (Midpoint algorithm). Panels (a) and (b) mark the arc from (0, r) to $(r/\sqrt{2}, r/\sqrt{2})$, the point $P(x_i, y_i)$, the candidate pixels $U(x_i + 1, y_i)$ and $D(x_i + 1, y_i - 1)$, and their midpoint M.

Bresenham Midpoint Circle Algorithm

    h = 1 - r;                  /* initialization */
    x = 0;
    y = r;
    Plot Point (x, y);
    While y > x
        if h < 0 then           /* Select U */
            dU = 2*x + 3;
            h = h + dU;
            x = x + 1;
        else                    /* Select D */
            dD = 2*(x - y) + 5;
            h = h + dD;
            x = x + 1;
            y = y - 1;
        endif
        Plot Point (x, y);
    End While

(iii) Second-order differences

If U is chosen in the current iteration, the point of evaluation moves from $(x_i, y_i)$ to $(x_i + 1, y_i)$. The first-order difference has been calculated as $d_U = 2x_i + 3$. At point $(x_i + 1, y_i)$ this will be $d'_U = 2(x_i + 1) + 3$. Thus the second-order difference is
$$\Delta_U = d'_U - d_U = 2$$

(a prime denotes the value at the new point of evaluation). Similarly, $d_D$ at $(x_i, y_i)$ is $2(x_i - y_i) + 5$ and at $(x_i + 1, y_i)$ is $d'_D = 2(x_i + 1 - y_i) + 5$. Thus the second-order difference is

$$\Delta_D = d'_D - d_D = 2$$

If D is chosen in the current iteration, the point of evaluation moves from $(x_i, y_i)$ to $(x_i + 1, y_i - 1)$. The first-order differences are

$$d_D = 2(x_i - y_i) + 5, \qquad d'_D = 2[x_i + 1 - (y_i - 1)] + 5 = 2(x_i - y_i) + 4 + 5$$

$$d_U = 2x_i + 3, \qquad d'_U = 2(x_i + 1) + 3$$

Thus the second-order differences are

$$\Delta_U = 2, \qquad \Delta_D = 4$$

So the revised algorithm using the second-order differences is as follows, where dU and dD hold the current first-order differences, starting from their values at (0, r):

(1) h = 1 - r, x = 0, y = r, dU = 3, dD = 5 - 2r; plot point (x, y) (the initial point).

(2) Test whether the condition y <= x has been reached. If not, then

(3) If h < 0 (select U):
        x = x + 1
        h = h + dU
        dU = dU + 2
        dD = dD + 2
    else (select D):
        x = x + 1
        y = y - 1
        h = h + dD
        dU = dU + 2
        dD = dD + 4
    end if
    plot point (x, y) and return to step (2).
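Both forms of the circle algorithm can be written out and compared. The sketch below is illustrative only: the function names are hypothetical, the pixels are returned as a list in place of a plotting routine, and only the arc starting at (0, r) is generated (the remaining points would follow by eight-way symmetry).

```python
def midpoint_circle(r):
    """First-order form: the increment is recomputed from (x, y) at each step."""
    h, x, y = 1 - r, 0, r
    pixels = [(x, y)]
    while y > x:
        if h < 0:                     # select U
            h += 2 * x + 3
        else:                         # select D
            h += 2 * (x - y) + 5
            y -= 1
        x += 1
        pixels.append((x, y))
    return pixels

def midpoint_circle_second_order(r):
    """Second-order form: dU and dD are kept up to date by adding 2 or 4."""
    h, x, y = 1 - r, 0, r
    dU, dD = 3, 5 - 2 * r             # first-order differences at (0, r)
    pixels = [(x, y)]
    while y > x:
        if h < 0:                     # select U
            h += dU
            dU += 2
            dD += 2
        else:                         # select D
            h += dD
            dU += 2
            dD += 4
            y -= 1
        x += 1
        pixels.append((x, y))
    return pixels

print(midpoint_circle(5))
# -> [(0, 5), (1, 5), (2, 5), (3, 4), (4, 3)]
print(midpoint_circle(5) == midpoint_circle_second_order(5))   # -> True
```

The second form replaces the multiplications inside the loop with additions of the constants 2 and 4, at the cost of two extra variables.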

6.4 Using Interest Points


The previous chapter described how interest points might be discovered from an image. From these, it is possible to determine whether the object being viewed is a known object. Here the two-dimensional problem, without occlusion (objects being covered up by other objects), is considered. Assume that the interest points from the known two-dimensional shape are held on file in some way and that the two-dimensional shape to be identified has been processed by the same interest operator, giving interest points that now have to be compared with the known shape. We further assume that the shape may have been rotated, scaled, and/or translated from the original known shape. Hence it is necessary to determine a matrix that satisfies

discovered interest points = known shape interest points x M

or D = KM, where M is a two-dimensional transformation matrix of the form

$$M = \begin{pmatrix} a & b & 0 \\ c & d & 0 \\ e & f & 1 \end{pmatrix}$$

and the interest point sets are of the form

$$\begin{pmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ \vdots & \vdots & \vdots \\ x_n & y_n & 1 \end{pmatrix}$$

The matrix M described above does not allow for shearing transformations because this is essentially a three-dimensional transformation of an original shape.

There is usually some error in the calculation of interest point positions, so that

$$D = KM + \varepsilon$$

and the purpose is to find the M that gives the least error and then determine whether that error is small enough to indicate that the match is correct. A good approach is to use a least-squares approximation to determine M and the errors, i.e. minimize F(D - KM), where

$$F(Z) = \sum_i (x_i^2 + y_i^2)$$

with $(x_i, y_i)$ taken from row i of Z. Writing (x, y) for the known points and (X, Y) for the discovered points, this gives the following normal equations:

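$$\begin{pmatrix} \sum x^2 & \sum xy & \sum x \\ \sum xy & \sum y^2 & \sum y \\ \sum x & \sum y & n \end{pmatrix} \begin{pmatrix} a \\ c \\ e \end{pmatrix} = \begin{pmatrix} \sum xX \\ \sum yX \\ \sum X \end{pmatrix} \quad \text{or} \quad L\mathbf{a} = \mathbf{s}_1$$

and

$$\begin{pmatrix} \sum x^2 & \sum xy & \sum x \\ \sum xy & \sum y^2 & \sum y \\ \sum x & \sum y & n \end{pmatrix} \begin{pmatrix} b \\ d \\ f \end{pmatrix} = \begin{pmatrix} \sum xY \\ \sum yY \\ \sum Y \end{pmatrix} \quad \text{or} \quad L\mathbf{b} = \mathbf{s}_2$$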

If the inverse of the square matrix L is calculated, then the values of a to f can be evaluated and the error determined. This is calculated as $L^{-1}L\mathbf{a} = L^{-1}\mathbf{s}_1$ and $L^{-1}L\mathbf{b} = L^{-1}\mathbf{s}_2$, resulting in $\mathbf{a} = L^{-1}\mathbf{s}_1$ and $\mathbf{b} = L^{-1}\mathbf{s}_2$.
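The normal equations can be set up and solved in a few lines. The sketch below is an illustration under stated assumptions: the function names `solve3`, `fit_transform` and `squared_error` are hypothetical, the systems are solved by Gauss-Jordan elimination instead of forming the inverse of L explicitly, and the test shape is a made-up unit square that has been rotated by 90°, scaled by 2 and translated by (3, 1). The point pairing is assumed to be known, and L is assumed to be non-singular (the known points must not all lie on one line).

```python
def solve3(A, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(3):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [u - f * v for u, v in zip(M[r], M[col])]
    return [M[i][3] / M[i][i] for i in range(3)]

def fit_transform(known, found):
    """Return (a, c, e) and (b, d, f) with X = a*x + c*y + e, Y = b*x + d*y + f."""
    n = len(known)
    sx = sum(x for x, _ in known)
    sy = sum(y for _, y in known)
    sxx = sum(x * x for x, _ in known)
    syy = sum(y * y for _, y in known)
    sxy = sum(x * y for x, y in known)
    L = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    pairs = list(zip(known, found))
    s1 = [sum(x * X for (x, _), (X, _) in pairs),
          sum(y * X for (_, y), (X, _) in pairs),
          sum(X for X, _ in found)]
    s2 = [sum(x * Y for (x, _), (_, Y) in pairs),
          sum(y * Y for (_, y), (_, Y) in pairs),
          sum(Y for _, Y in found)]
    return solve3(L, s1), solve3(L, s2)

def squared_error(known, found, ace, bdf):
    """Sum of squared distances between transformed known points and found points."""
    (a, c, e), (b, d, f) = ace, bdf
    return sum((a * x + c * y + e - X) ** 2 + (b * x + d * y + f - Y) ** 2
               for (x, y), (X, Y) in zip(known, found))

known = [(0, 0), (1, 0), (0, 1), (1, 1)]
found = [(3 - 2 * y, 1 + 2 * x) for x, y in known]   # rotate 90°, scale 2, shift (3, 1)
ace, bdf = fit_transform(known, found)
print(ace, bdf, squared_error(known, found, ace, bdf))
```

For this exact (noise-free) data the recovered coefficients are a = 0, c = -2, e = 3 and b = 2, d = 0, f = 1 to within rounding, and the error is essentially zero; with real interest points the size of the remaining error is what decides whether the match is accepted.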

6.5 Problems
There are some problems with interest points. First, coordinates must be paired beforehand. That is, there are known library coordinates, each of which must correspond to the correct unknown coordinate for a match to occur. This can be done by extensive searching, i.e. by matching each known coordinate with each captured coordinate; all possible permutations have to be considered. For example, consider an interest point algorithm that delivers five interest points for a known object. Also let there be N images, each containing an unknown object, the purpose of the exercise being to identify whether any or all of the images contain the known object. The search can be reduced by eliminating all those images that do not have five interest points. If this leaves n images there will be n x 5! = 120n possible permutations to search. One search reduction method is to order the interest points. The interest operator itself may give a value which can place that interest point at a particular position in the list. Alternatively, a simple sum of the brightness of the surrounding pixels can be used to give a position. Either way, if the order is known, the searches are reduced from O(n x i!) to O(n), where i is the number of interest points in the image.

The second problem is that the system cannot deal with occlusion or part views of objects, nor can it deal with three-dimensional objects in different orientations.

6.6 Exercises
6.6.1 Using standard graph paper, perform a straight-line Hough transform on the binary pixel array shown in the following figure, transforming into (m,c) space.

Figure 6.8 Binary array

6.6.2 A library object has the following ordered interest point classification:

{(0,0), (3,0), (1,0), (2,4)}

Identify, using the above technique, which of the following two sets of interest points represents a translation, rotation, and/or scaling of the above object:

{(1,1), (6,12), (2,5), (12,23)}
{(1,3), (1,12), (-1,8), (3,6)}

Check your answer by showing that a final point maps near to its corresponding known point.
