
NEW ADVANCES IN AUTOMATIC READING OF V.L.P.'s (VEHICLE LICENSE PLATES).

FERNANDO MARTÍN RODRÍGUEZ, XULIO FERNÁNDEZ HERMIDA. Departamento de Tecnologías de las Comunicaciones. E.T.S.I. de Telecomunicación. Universidad de Vigo. Ciudad Universitaria s/n. 36200 Vigo (Pontevedra). Spain. Phone: +34-986-812131. Fax: +34-986-812116. E-mail: fmartin@tsc.uvigo.es, xulio@tsc.uvigo.es.

ABSTRACT: In a former paper [1], we described a system that recognizes car V.L.P.s (Vehicle License Plates) using machine vision. Here, we describe new developments in this field (we include brief descriptions of previous material to keep the paper self-contained). Our system has many applications: parking accounting, traffic monitoring, stolen car detection and security systems.

KEYWORDS: Machine Vision, O.C.R., V.L.P. Recognition.

1 .- INTRODUCTION:
The purpose of the project is the automatic reading of V.L.P.s. We do not want to make assumptions about size, color (we will use gray-scale images) or nationality. We establish two classifications for the plates:
- Form Classification: there are one-row plates (rectangular plates) and two-row plates (square ones).
- Background Color Classification: there are plates with dark characters on a bright background (like Spanish plates) and plates with bright characters on a dark background (like old Portuguese or French plates).

With these assumptions in mind, we divided the total problem into the following stages (we devote one subsection to each stage below):
- Character Location: locating all the zones in the image that may be a plate.
- Binarization: distinguishing between black and white pixels.
- Skew Angle Correction: detecting and correcting a possible skew angle.
- Character Extraction: segmenting the possible zones into individual characters.
- Character Recognition: recognizing the characters.
- Post-Processing: postprocessing the result to detect and correct errors.

The paper finishes with practical results, future lines and conclusions.

2 .- RECOGNITION ALGORITHMS:
2.1 .- Character Location: Our first approach was based on image gradients. We also designed another method using mathematical morphology. We will present both approaches and compare them.

Figure 1. Image of a car with a recognizable V.L.P.

Figure 2. Piece of scan line for gradient detection.

2.1.2 .- The Gradient Method: We compute the horizontal Sobel gradient. The character zone presents high positive values followed by high negative ones (the thresholds used on gradient values are computed from a histogram). We look for rectangular zones in which this gradient pattern repeats. The distance between the positive values and the negative ones must be approximately constant (this distance is an estimate of the stroke width). We then try to find the rectangle that encloses the plate. We have implemented many heuristic rules to remove false plates: aspect ratio, maximum and minimum size... (this stage may produce more than one rectangle; the next stages will have to decide). We tested this method on 185 cars and found 96% of the plates (in 2% of the cars we found incomplete plates, and we lost the remaining 2%). Sometimes the found rectangle is much bigger than the true plate, creating problems for the next stages. This method works well regardless of whether the characters are black on a bright background or vice versa. What is more, the method can detect to which of the two kinds the input plate belongs (for dark characters the positive gradient comes first).
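The per-line gradient pattern described above can be sketched as follows. This is a minimal numpy illustration under our own assumptions (a simple forward difference instead of the Sobel operator, and a fixed threshold instead of the histogram-derived one); with this difference convention a dark stroke on a bright background gives a negative edge first, the reverse of the sign convention used in the paper.

```python
import numpy as np

def count_strokes(row, thresh=80):
    """Detect dark strokes on a bright background in one scan line:
    a strong negative gradient (entering the stroke) followed by a
    strong positive one (leaving it).  The gap between the two edges
    estimates the stroke width, which should stay roughly constant
    across a character zone."""
    grad = np.diff(row.astype(int))
    widths, entry = [], None
    for i, g in enumerate(grad):
        if g <= -thresh and entry is None:
            entry = i                      # entered a dark stroke
        elif g >= thresh and entry is not None:
            widths.append(i - entry)       # left it: record the width
            entry = None
    return widths

row = np.array([200]*5 + [50]*3 + [200]*4 + [50]*3 + [200]*5)
print(count_strokes(row))   # -> [3, 3]: two strokes of width 3
```

A zone qualifies as a candidate plate when many consecutive scan lines repeat this pattern with similar widths; swapping the edge polarities detects bright-on-dark plates.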

Figure 3. Located zone for the car in fig. 1.

Figure 5. "Top-hat" of the initial image.

2.1.3 .- The Morphological Method: This method starts from a different model. Characters are objects of small thickness. A morphological transformation called "top-hat" will help us to find them. If we perform a morphological closing with a circular structuring element (S.E.) whose diameter is bigger than the stroke width, we erase the characters from the image (fig. 4).

Figure 6. Characters are joined into a rectangle.

Figure 7. Location result.

This method depends strongly on the character size (all S.E. sizes can be computed from the expected character size). If the images are taken at a fixed distance, this is not a problem.
Figure 4. Characters have been erased (input car is that of fig. 1).

If we subtract the image in fig. 4 from the initial one, we get an image in which the characters are made prominent. This result is known as the "top-hat" of the initial image (fig. 5). We can see more white points apart from the characters, and some processing is needed to eliminate them. We binarize the image in fig. 5 (using the classical method [2]) so that we can work with binary morphology. We then perform a closing with a horizontal linear S.E. If the S.E. width is bigger than the character spacing, the plate is converted into a white rectangle (that will be our result), fig. 6. We have to apply other morphological operations to filter out all the other white objects in the image. We use size information until we have the elementary rectangle, then we make it a bit bigger so as not to lose any point (fig. 7). A special case is that of square plates. In these cases, we get two regions. When we finally dilate both rectangles, they are joined together (fig. 8). This knowledge can be used as a help in the process of row segmentation. After testing this algorithm, we found that it was able to find 95.3% of the plates. Sometimes the algorithm also produces false outputs (the next stages will reject them).
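The closing/top-hat step can be sketched with plain numpy (a flat square structuring element stands in for the circular S.E., and the helper names are ours). We use the standard "black top-hat", closing minus image, so that dark strokes thinner than the S.E. come out bright:

```python
import numpy as np

def dilate(img, k):
    """Grayscale dilation with a flat (2k+1)x(2k+1) square S.E."""
    p = np.pad(img, k, mode='edge')
    h, w = img.shape
    out = p[0:h, 0:w].copy()
    for dy in range(2*k + 1):
        for dx in range(2*k + 1):
            out = np.maximum(out, p[dy:dy+h, dx:dx+w])
    return out

def erode(img, k):
    """Grayscale erosion with the same S.E."""
    p = np.pad(img, k, mode='edge')
    h, w = img.shape
    out = p[0:h, 0:w].copy()
    for dy in range(2*k + 1):
        for dx in range(2*k + 1):
            out = np.minimum(out, p[dy:dy+h, dx:dx+w])
    return out

def black_tophat(img, k):
    """closing(img) - img: the closing erases dark strokes thinner
    than the S.E. (cf. fig. 4); subtracting makes them prominent
    (cf. fig. 5)."""
    closing = erode(dilate(img, k), k)
    return closing - img

# toy plate: bright background with a single one-pixel-wide dark stroke
img = np.full((7, 7), 200, dtype=int)
img[1:6, 3] = 40
th = black_tophat(img, 1)        # bright (160) exactly on the stroke
```

Binarizing `th` and closing it with a horizontal linear S.E. wider than the character spacing would then merge the strokes into the white rectangle of fig. 6.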

The method we have explained is only valid for dark characters on bright backgrounds. To locate bright characters, it is necessary to perform an opening on the initial image. The system must search for dark characters first and, if it finds nothing, search for bright ones. 2.1.4 .- Comparison of Both Methods: Global location results are very similar for both methods. Nevertheless, the morphological method has a big advantage: location is usually more precise (fig. 9). This makes life easier for the subsequent stages and improves recognition. The reason could be that other high-gradient objects surrounding the plate (like lights) affect the gradient method. The gradient method has its own advantages: it is almost insensitive to character size and has no problems with dark/bright characters. We are trying to find a merging strategy able to apply both methods and exploit the advantages of each.

Figure 10. Graphical interpretation of the formula.

Figure 11. Binarization of the image in fig. 3.

Figure 8. Square plate and results.

We divide the plate into small rectangles (48 regions, 8x6, for a rectangular plate). Then we compute a threshold for each region using Otsu's method. From these we compute an individual threshold for each point (a threshold image): the Otsu threshold is only good at the subimage center, so for the other points we apply a bilinear interpolation [4].
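The tiled-Otsu scheme can be sketched as follows (a rough numpy sketch; the 2x1 grid and the toy image are ours, the paper uses 8x6 on a rectangular plate):

```python
import numpy as np

def otsu(values):
    """Classical Otsu threshold [2]: maximize between-class variance."""
    hist = np.bincount(values.ravel(), minlength=256).astype(float)
    total = hist.sum()
    csum = np.cumsum(hist)
    cmean = np.cumsum(hist * np.arange(256))
    best_t, best_var = 0, -1.0
    for t in range(255):
        w0 = csum[t] / total
        w1 = 1.0 - w0
        if w0 == 0 or w1 == 0:
            continue
        m0 = cmean[t] / csum[t]
        m1 = (cmean[-1] - cmean[t]) / (total - csum[t])
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def tiled_thresholds(img, nx, ny):
    """One Otsu threshold per tile of an nx-by-ny grid."""
    h, w = img.shape
    grid = np.zeros((ny, nx))
    for j in range(ny):
        for i in range(nx):
            grid[j, i] = otsu(img[j*h//ny:(j+1)*h//ny, i*w//nx:(i+1)*w//nx])
    return grid

def threshold_image(grid, h, w):
    """Bilinear expansion of the tile-centre thresholds to a per-pixel
    threshold image (np.interp along each axis; edges are clamped)."""
    ny, nx = grid.shape
    ys = (np.arange(ny) + 0.5) * h / ny
    xs = (np.arange(nx) + 0.5) * w / nx
    rows = np.array([np.interp(np.arange(w), xs, grid[j]) for j in range(ny)])
    return np.array([np.interp(np.arange(h), ys, rows[:, x]) for x in range(w)]).T

# toy plate with an illumination gradient: bright left half, dim right half
img = np.zeros((8, 8), dtype=int)
img[:, :4], img[:, 4:] = 200, 100
img[3:5, 1], img[3:5, 6] = 50, 10        # one dark stroke per half
grid = tiled_thresholds(img, 2, 1)        # higher threshold on the bright side
seg = img > threshold_image(grid, 8, 8)   # strokes end up False, background True
```

A single global threshold may misclassify one of the two halves here; the interpolated threshold follows the illumination, which is the behavior shown in figs. 13-15.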

Figure 9. Results with the gradient method (above) and the morphological method.

2.2 .- Threshold Computation: We first tested the classical method from [2]. Although the results were not bad, we designed another method that uses gradient information from the location stage (if location is not based on gradients, we have to compute the gradient first). This new method outperforms Otsu's in this particular problem (the resulting bilevel image is usually easier to segment). Besides, the gradient method is much faster. Finally, we implemented a refinement of Otsu's method (inspired by [3]) that works well even with illumination gradients on the plate. Our final system applies a combination of the gradient method and the refined Otsu method.
2.2.1 .- The Gradient Method: The idea of the method is: "moving along the image, the gradient will signal changes between black and white". Algorithm: choose a line of the image where the probability of having many transitions is high (for example, at the center). Then move along the line and detect all transitions. Compute the sum of the gray levels for each color (black and white), Sb and Sw, and also the number of pixels of each color, Nb and Nw. The threshold (U) is the midpoint of the two class means:
U = (Sb·Nw + Sw·Nb) / (2·Nw·Nb)
This method can trivially be extended to use several lines. We apply this extended algorithm to 80% of the lines in the plate (discarding the first and last 10% of the lines).
2.2.2 .- The Refined Otsu Method: The former method is not good when we have an illumination gradient on the plate. The method we describe now is suggested in [3].
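The scan-line threshold of section 2.2.1 can be sketched like this (a rough sketch with our own names and gradient threshold; U is taken as the midpoint of the black and white means, (Sb/Nb + Sw/Nw)/2, which is what fig. 16 depicts):

```python
import numpy as np

def classify_line(row, grad_thresh):
    """Split one scan line into dark and bright pixels: a strong
    positive gradient switches dark->bright, a strong negative one
    switches bright->dark."""
    grad = np.diff(row.astype(int))
    bright = row[0] > row.mean()            # crude guess for the first pixel
    dark_vals, bright_vals = [], []
    for i, v in enumerate(row):
        if i > 0:
            if grad[i - 1] >= grad_thresh:
                bright = True
            elif grad[i - 1] <= -grad_thresh:
                bright = False
        (bright_vals if bright else dark_vals).append(int(v))
    return dark_vals, bright_vals

def gradient_threshold(img, grad_thresh=60):
    """Accumulate Sb, Nb, Sw, Nw over the central 80% of the lines and
    return the midpoint of the two class means."""
    Sb = Nb = Sw = Nw = 0
    h = img.shape[0]
    for row in img[h // 10: h - h // 10]:
        d, b = classify_line(row, grad_thresh)
        Sb += sum(d); Nb += len(d)
        Sw += sum(b); Nw += len(b)
    return (Sb / Nb + Sw / Nw) / 2

plate = np.array([[200, 200, 50, 50, 200]] * 10)
print(gradient_threshold(plate))    # -> 125.0, halfway between 50 and 200
```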

Figure 12. Initial image.

The image in fig. 12 is a good example of an illumination gradient: a global threshold would erase the lower part of the seven. We apply the new method here. Fig. 13 shows an image made of the thresholds computed for each subimage. When we perform the bilinear interpolation, we get the image in fig. 14. The final result is shown in fig. 15. Note that the result is correct and that the threshold is bigger at the brightest points. This method is very time consuming (we are applying Otsu's method, not a fast one, 48 times). Some optimizations can be implemented, but it will be slow anyway.

Figure 13. Threshold for each subimage.

Figure 14. Interpolated threshold image.

Figure 15. Binarized image.

2.2.3 .- Merging of Both Methods: We wanted to merge the two methods above. The first is almost always good and very fast; the second is good for the most difficult cases. We designed an integrating process: - First, we apply the gradient method, which yields good results in most cases. - Then we measure the threshold quality. - If the quality is below a limit, we use the second method. To measure threshold quality, we measure the percentage of points lying in the doubtful region. If this percentage is small, the quality is good.
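The quality measure can be sketched as follows (a minimal sketch; the band width x is our placeholder, and the black/white means Mb and Mw are those of the bipolar histogram in fig. 16):

```python
import numpy as np

def doubtful_fraction(img, U, Mb, Mw, x=0.25):
    """Fraction of pixels inside the doubtful band centred on the
    threshold U, whose total width is the fraction x of the distance
    between the black and white means."""
    half = x * (Mw - Mb) / 2.0
    doubtful = np.logical_and(img > U - half, img < U + half)
    return float(doubtful.mean())
```

The merging rule then reads: binarize with the fast gradient threshold and, only if `doubtful_fraction` exceeds some limit, recompute with the refined Otsu method.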

After segmentation, a contour-softening filter is applied to the characters (fig. 19).

Figure 16. Bipolar histogram.

Figure 18. Separation of rows for a square plate.

In fig. 16 we show the computed threshold (U), the white mean (Mw) and the black mean (Mb). We define as doubtful pixels those lying in a range centered on U. The range length (X) is a percentage of the difference between the means. 2.3 .- Skew Angle Correction: Computation of the skew angle is achieved using the Hough transform. We apply the transform only to the horizontal contour points. This improves speed and accuracy.
Figure 19. Character segmentation and filtering.

Figure 17. Image of fig. 11 after correction (skew angle = 3.3°).
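The Hough-style skew search can be sketched as follows (our own simplification: instead of a full (rho, theta) accumulator, each candidate angle gets a one-dimensional accumulator of deskewed heights, and the score rewards sharp maxima, in the spirit of [5]):

```python
import numpy as np

def skew_angle(points, angles):
    """Estimate the skew of near-horizontal contour lines.  For each
    candidate angle, remove that slope from every point (y - x*tan(a));
    points lying on lines of that slope collapse into few accumulator
    bins, so a sharply peaked histogram means a good angle."""
    best_a, best_score = 0.0, -1.0
    for a in angles:
        yp = points[:, 1] - points[:, 0] * np.tan(np.radians(a))
        hist = np.bincount(np.round(yp - yp.min()).astype(int))
        score = float((hist.astype(float) ** 2).sum())   # rewards concentration
        if score > best_score:
            best_a, best_score = a, score
    return best_a
```

Applying this only to horizontal contour points, as the paper does, keeps the accumulator small and avoids votes from the (vertical) character strokes.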

2.4.4 .- Deciding the True Plate: At this point we may have several candidate plates from the character locator. We make some assumptions: - Character location seldom produces false plates. - The good zone is usually the largest. - A plate has from 4 to 11 characters. We process the possible plates in descending size order. If we find a plate with 4 to 11 characters, we stop processing and give the result. 2.5 .- The Final O.C.R.: Our first version of the final O.C.R. was based on a "template-matching" approach [1]. As a more advanced method, we tried feature extraction based on the Kirsch gradient [7], followed by a probabilistic neural network (PNN) [8]. This new method yielded much better results but proved to be very time consuming. We then developed a method that combines the two previous ones and, in fact, outperforms them. 2.5.1 .- The Projection Method: We extract the horizontal and vertical projections of the bilevel image, i.e. the sums of the columns and of the rows (background pixels are 0 and foreground ones are 1). We normalize, dividing by the size of the column (row), to get size-independent projections. We also compute the number of strokes, defined as the number of foreground-color zones (black strokes) through which we pass when computing the projection. This will be necessary for character comparisons. The key choice is the metric or distance (we compare the input character to all characters in an alphabet; the minimum distance defines the recognized character). We compare projections of different sizes with a heuristic comparison algorithm (similar to DTW). We assume that the alphabet image is always the bigger one (if not, we decimate the input image; in practice, this is never necessary). We compare each value of the smaller vector with one or more values of the

The main skew angle produces many local maxima in the transform. We use a measure of the number and strength of the maxima for each value of the angle [5]. 2.4 .- Character Segmentation: 2.4.1 .- Borders Removal: Now that we have binarized the image and corrected the skew angle, it is easy to remove the plate borders (simply removing the horizontal and vertical lines whose length is very close to the total width or height of the plate). 2.4.2 .- Segmentation into Rows: To segment the plate into rows (after border removal), we compute the vertical projection of the characters and divide the plate by the minima of this projection. In fact, we only look for a local minimum in the central region of the plate (there are only two possibilities: one or two rows). Besides, we impose the condition that the local minimum is smaller than a threshold U (this threshold distinguishes between one-row plates and two-row ones). The value of U is computed from the projection data. 2.4.3 .- Segmentation of Characters: To segment the characters, we make an adaptive search of the connected black spots (if the characters were white, we would take the negative of the image). We start searching at the center, and then move to the left and to the right. When we find a character, we adapt our searching position to its coordinates. We apply many heuristic rules to remove spots that are not characters: aspect ratio, relative size... We also added a stage that searches for pairs of connected characters (using an aspect ratio threshold) and splits them (using the horizontal projection [6]).
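The connected-spot search plus heuristic filtering can be sketched as follows (a plain-Python flood fill; the aspect-ratio limits are illustrative placeholders, not the paper's values, and the paper's adaptive center-outwards search order is omitted):

```python
import numpy as np

def connected_spots(binary):
    """Bounding boxes (top, left, bottom, right) of the 4-connected
    foreground spots, sorted left to right."""
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y, x] and not seen[y, x]:
                stack, pix = [(y, x)], []
                seen[y, x] = True
                while stack:
                    cy, cx = stack.pop()
                    pix.append((cy, cx))
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w and binary[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            stack.append((ny, nx))
                ys, xs = zip(*pix)
                boxes.append((min(ys), min(xs), max(ys), max(xs)))
    return sorted(boxes, key=lambda b: b[1])

def plausible_characters(boxes, min_aspect=1.0, max_aspect=6.0):
    """Heuristic filter in the spirit of the paper's rules: keep spots
    whose height/width ratio looks like a character."""
    return [(t, l, b, r) for t, l, b, r in boxes
            if min_aspect <= (b - t + 1) / (r - l + 1) <= max_aspect]

binary = np.zeros((6, 12), dtype=bool)
binary[1:4, 1:3] = True     # two character-like spots
binary[1:4, 5:7] = True
binary[5, 2:10] = True      # a wide remnant of the plate border
chars = plausible_characters(connected_spots(binary))   # border is filtered out
```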

bigger one (the number of values is equal to the ratio of the projection lengths). We apply a special penalty when there is a difference in the number of strokes. We could say that the number of strokes divides the projection curve into certain significant zones and that we must compare only corresponding zones (if they exist). The O.C.R. has (when the segmentation has worked perfectly) an accuracy of 94%. This was measured on 1271 real characters (extracted using our segmentor from 185 car images). Furthermore, the method is fast: 15.26 milliseconds per character (Intel Pentium MMX, 133 MHz). 2.5.2 .- The Kirsch & PNN Method: Kirsch extraction [7] is based on obtaining image components in four main directions: horizontal, vertical, first diagonal (top left to bottom right) and second diagonal. We achieve this with gradient operators. Afterwards, we obtain a vector of 80 real numbers: 16 per component, considering 5 components (the original image and the 4 directional components); the components are scaled to 4x4 images and the resulting pixels are the 16 numbers. We have computed Kirsch feature vectors for more than 1000 characters (extracted from real plates). Then we computed the Wilks parameter [9]. If this parameter is less than 1, the feature data are suitable for classification (smaller values mean easier classification). We obtained values on the order of 10^-18, confirming a good extraction. The PNN [8] is based on "a posteriori" probability calculations, i.e. it is a maximum-likelihood method. To implement the computations, the PNN uses Gaussian modeling of the input feature vectors (centered on the training samples). Our implementation of the PNN uses a random reduction of the training set. This trick allows a dramatic reduction in computing time (the main problem of this classifier). We trained the PNN with 1054 characters, extracted using our segmentor from 155 cars. Then we tested the recognizer with 217 characters (extracted from 30 cars).
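The PNN classification step can be sketched as follows (a generic Parzen-window formulation with our own names; in the paper the inputs are the 80-dimensional Kirsch vectors, and the training set is randomly subsampled for speed):

```python
import numpy as np

def pnn_discriminants(x, train_vectors, train_labels, sigma=1.0):
    """One discriminant per class: the mean of Gaussian kernels centred
    on that class's training samples [8]."""
    x = np.asarray(x, dtype=float)
    scores = {}
    for c in sorted(set(train_labels)):
        pts = np.array([v for v, l in zip(train_vectors, train_labels) if l == c],
                       dtype=float)
        d2 = ((pts - x) ** 2).sum(axis=1)       # squared distances to class samples
        scores[c] = float(np.exp(-d2 / (2.0 * sigma ** 2)).mean())
    return scores

def pnn_classify(x, train_vectors, train_labels, sigma=1.0):
    scores = pnn_discriminants(x, train_vectors, train_labels, sigma)
    return max(scores, key=scores.get)
```

The ratio of the two largest discriminants is also what feeds the confidence measure of section 2.6.1.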
The training and testing sets have no common elements. Accuracy is 97.68%, but the method is slow: 98.44 milliseconds per character (Intel Pentium MMX, 133 MHz). 2.5.3 .- The Combined Method: We decided to combine both methods to have the advantages of each (the speed of the projection method and the accuracy of the PNN). The combined method works as follows: - We apply the projection method. This method selects, at most, 5 possible characters. We compare the computed distances and decide where the cutting point that defines the "dominant group" is. - If only one character was selected, we are finished. - Otherwise, we use the PNN to decide (we only compute discriminant functions for the selected classes).
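That two-stage decision can be sketched as follows (the interface and the cut ratio defining the "dominant group" are our own illustrative choices):

```python
def combined_recognize(distances, pnn_scores_fn, cut_ratio=1.25):
    """Two-stage O.C.R. decision: 'distances' maps each class to its
    projection-method distance; classes close enough to the best one
    form the dominant group.  Only when more than one class survives is
    the slower PNN asked to score just those classes."""
    ranked = sorted(distances, key=distances.get)[:5]    # at most 5 candidates
    best = distances[ranked[0]]
    group = [c for c in ranked if distances[c] <= cut_ratio * best]
    if len(group) == 1:
        return group[0]                 # the fast method already decided
    scores = pnn_scores_fn(group)       # PNN discriminants, selected classes only
    return max(group, key=scores.get)
```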

The new recognition rate is 97.92%, even higher than that of the PNN alone (the combination of the two methods exploits the benefits of both, compensating for their weaknesses). The time is 52.14 milliseconds per character (Intel Pentium MMX, 133 MHz). This value lies between those of the two former methods (as was to be expected).

Figure 20. Example of projections and number of strokes.
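The projection features of section 2.5.1 can be computed as in this sketch (function names are ours; the DTW-like comparison and the stroke penalty are not reproduced here):

```python
import numpy as np

def count_runs(line):
    """Number of foreground runs (strokes) crossed along one line."""
    runs, inside = 0, False
    for v in line:
        if v and not inside:
            runs += 1
        inside = bool(v)
    return runs

def projections(char):
    """Normalized column/row projections of a bilevel character
    (foreground = 1), plus the stroke count of every column and row."""
    char = np.asarray(char, dtype=int)
    h, w = char.shape
    col_proj = char.sum(axis=0) / h           # size-independent: divided by height
    row_proj = char.sum(axis=1) / w
    col_strokes = [count_runs(char[:, x]) for x in range(w)]
    row_strokes = [count_runs(char[y, :]) for y in range(h)]
    return col_proj, row_proj, col_strokes, row_strokes

# a 5x5 letter "H": full columns at the sides, a crossbar in the middle
H = [[1, 0, 0, 0, 1],
     [1, 0, 0, 0, 1],
     [1, 1, 1, 1, 1],
     [1, 0, 0, 0, 1],
     [1, 0, 0, 0, 1]]
cp, rp, cs, rs = projections(H)
```

For "H" the column projection is high at both sides, and every row except the crossbar crosses two strokes; these stroke counts are what delimit the "significant zones" compared during matching.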

Figure 21. Directional components for Kirsch extraction.

2.6 .- Post-Processing: In this section we have grouped two stages with different purposes but some relation between them. The first one measures the self-confidence of the recognizer in its own result. This is interesting in order to have a recognizer that rejects the most difficult plates, leaving them for inspection by a human operator. The second one is an application of context rules to detect and correct errors in the recognized plate. As we will see, this second stage uses the confidence computed in the first one. 2.6.1 .- Self-Confidence Measurements: The O.C.R. measures confidence on each recognized character, saving the PNN discriminant for the most probable class (the winning class, D1) and for the second most probable class (the finalist class, D0). We compute the quotient c = D1/D0. If, as a consequence of combining the O.C.R. methods, the PNN is not applied, the confidence is 1.0 (c is infinite). We compute the confidence on that character (its meaning is similar to the probability that the character has been properly recognized) using an empirical formula:
f(c) = 1 - e^(α(1-c))
We take α = (1/7)·ln 10, so that c = 8 gives f(c) = 0.9. This gives us only the confidence of each character. To compute the confidence on a complete plate we use the formula:
C_G = ∏(i=0..M) c_i
where M is the number of found characters, c0 is the confidence of the initial processes (location, thresholding and segmentation) and ci is the O.C.R. confidence for the i-th recognized character of the plate.
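The confidence computation can be sketched as follows (the constant alpha = (1/7)·ln 10 reproduces the paper's calibration point c = 8 -> f(c) = 0.9; c0 itself comes from the empirical curve of fig. 22 and is passed in here):

```python
import math

def char_confidence(D1, D0, alpha=math.log(10) / 7):
    """f(c) = 1 - exp(alpha * (1 - c)) with c = D1/D0, the ratio of the
    winning and finalist PNN discriminants.  f(1) = 0, f(8) = 0.9 and
    f(c) -> 1 as c grows (the PNN-not-needed case)."""
    c = D1 / D0
    return 1.0 - math.exp(alpha * (1.0 - c))

def plate_confidence(c0, char_confs):
    """Global confidence C_G: the product of c0 (location, thresholding
    and segmentation) and the per-character O.C.R. confidences."""
    conf = c0
    for ci in char_confs:
        conf *= ci
    return conf
```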

Figure 22. Empirical curve for c0.

c0 is computed from the number of found characters M using an empirical curve (fig. 22). Practical results were good (section 3). 2.6.2 .- Context Rules: At this point we have the two most probable results per input character and a measurement of how much the winner dominates the finalist. We define a morphology and a syntax for license numbers. - Morphology: the characters on the plates are usually grouped into blocks. Block spacing is considerably larger than character spacing (we use this fact to split each row into blocks). Each block is usually numeric or alphabetic; mixtures are unusual. We modify the O.C.R. decision (changing only low-confidence characters) so that this rule is honored. We tell whether a block has digits or letters by looking at the high-confidence characters (if we find a mixed block, we do nothing). - Syntax: the idea is to look for known plate patterns, applying additional constraints. For example, rectangular Spanish plates have one row and three blocks. The first block is alphabetic and has between 1 and 3 characters. The second block is numeric with 4 characters. The last block is alphabetic with 1 or 2 characters. When a plate complies with these conditions, we conclude that it is a Spanish one. Then we know that the first block is a province prefix (of course, the number of possible province prefixes is limited). If the rule is not satisfied, we check whether we can change some of the low-confidence characters to comply with it. The system tries to match all the plate models in its database. If the plate is not matched with any model, we do nothing. This method improves the results on some plates and does not interfere with the others. We have checked that it improves overall plate recognition by 3.5% (see section 3).

3 .- PRACTICAL RESULTS:
Our global recognition rate (per plate) is 84.11% (more than 97% per character, ignoring location and segmentation errors). If we apply rejection techniques, the rate goes up to 94.07% (the system rejects 10.59% of the cars). This was measured on 185 car images. Using the postprocessing correction (section 2.6), the overall recognition rate goes up by 3.5%. The mean execution time per car is 1.5 seconds (Intel Pentium MMX, 133 MHz).

4 .- FUTURE LINES:
There are still more ideas to develop in this system: - Developing a good merging method for the two location algorithms of section 2.1. - Improving the postprocessing stage.

5 .- CONCLUSION:
We have designed a reader for car license numbers with no assumptions about size, color or nationality. The algorithms are simple, and the results are at the state of the art. Nevertheless, many things can still be improved. This system has many practical applications, such as parking accounting, traffic monitoring and security systems.

6 .- REFERENCES:
[1] X. Fernández et al., Automatic and Real Time Recognition of V.L.P.'s (Vehicle License Plates), Lecture Notes in Computer Science (Springer), 1311(2), 1997, 552-559.
[2] N. Otsu, A Threshold Selection Method from Gray-Level Histograms, IEEE Transactions on Systems, Man and Cybernetics, 9(1), 1979, 62-66.
[3] J. Ohya et al., Recognizing Characters in Scene Images, IEEE Transactions on P.A.M.I., 16(2), 1994, 214-220.
[4] G. Farin, Curves and Surfaces for Computer Aided Geometric Design, Academic Press, 1993.
[5] G. S. D. Farrow et al., Detecting the Skew Angle in Document Images, Signal Processing: Image Communication, 6, 1994, 101-114.
[6] M. D. Garris, Component-Based Handprint Segmentation Using Adaptive Writing Style Model, N.I.S.T. Internal Report 5843.
[7] D. Cruces et al., Printed and Handwritten Digits Recognition Using Neural Networks, Proceedings of ICSPAT'98, 1, 1998, 839-843.
[8] J. L. Blue et al., Evaluation of Pattern Classifiers for Fingerprint and OCR Applications, Pattern Recognition, 27(4), 1994, 485-501.
[9] M. M. Tatsuoka, Multivariate Analysis, John Wiley & Sons, 1971.
