
2010 Second International Conference on Computer Engineering and Applications

α-Soft: An English Language OCR


Junaid Tariq
Department of Computer Science, COMSATS Institute of Information Technology, Islamabad, Pakistan
junaid_tariq@comsats.edu.pk

Umar Nauman
Department of Computer Science, COMSATS Institute of Information Technology, Islamabad, Pakistan
umarnauman@comsats.edu.pk

Muhammad Umair Naru
Department of Computer Science, COMSATS Institute of Information Technology, Islamabad, Pakistan
umair_naru@comsats.edu.pk

Abstract

Because of growing technology and fast living, business cards are very much in demand. Most cards use a fixed font size and style, so the OCR required for such documents need not be computationally expensive or complex (for example, built on an Artificial Neural Network). This paper presents a simple, efficient, and low-cost approach to constructing an OCR for reading cards or any document that has a fixed font size and style. English is an international language, and almost every card carries English characters. To achieve efficiency and a low computational cost, the OCR in this paper uses a database instead of an ANN (Artificial Neural Network) to recognize English characters, which makes it very simple to manage. As this paper is about English character recognition, the authors name this OCR α-Soft, pronounced "alpha soft". We have developed a prototype of this system. Experiments were conducted to show that 100% accuracy is possible with this OCR.

I. INTRODUCTION

English is an international language, spoken and used in almost every country, and is therefore called a world language [1]. Most business cards and documents use a fixed font size and style so that the document or card follows a universal standard. Nowadays business cards are very much in demand, as they make processing fast, so card users do not have to wait for balance checking. In an environment where the font size and font style of documents are fixed, as is the case for official documents, business cards, air ticketing, and so on, the system can easily be automated with the help of a technology called Optical Character Recognition, usually referred to as OCR [2]. This technology recognizes the characters present in images [3]. OCR technology is widely used, but its accuracy is still far from that of a second-grade student [4].

People nevertheless want OCR to convert information into electronic form to increase the efficiency of data storage and retrieval [5]. No doubt OCR systems are costly, and their accuracy varies from one OCR to another. OCR systems typically use an ANN (Artificial Neural Network) at the back end for recognition. In α-Soft we replace the ANN with a database. Because of the database, the implementation of α-Soft becomes very simple, and the same results can be achieved as with an ANN. As we are implementing an English language OCR, some information about the language is needed. The English alphabet consists of 26 characters, which are shown in Table I. In English, characters combine to form words, and words combine to form sentences.

TABLE I
ENGLISH LANGUAGE ALPHABETS [6]

Letter   Letter name
A        a
B        bee
C        cee
D        dee
E        e
F        ef
G        gee
H        aitch / haitch
I        i
J        jay / jy
K        kay
L        el
M        em
N        en
O        o
P        pee
Q        cue
R        ar
S        ess
T        tee
U        u
V        vee
W        double-u
X        ex
Y        wy or wye
Z        zed

II. IDEA BEHIND α-SOFT

English has 26 characters, and the environment in which α-Soft is used has a fixed font size and style. In such a scenario each of the 26 characters can easily be identified by its height, width, and checksum (the sum of the pixels that form the shape of that character).

978-0-7695-3982-9/10 $26.00 © 2010 IEEE DOI 10.1109/ICCEA.2010.112

Authorized licensed use limited to: Reva Institute of Tehnology and Management. Downloaded on June 22,2010 at 10:53:52 UTC from IEEE Xplore. Restrictions apply.

Fig. 1. Height of the character.

Figure 1 shows the height of a character. The height is the difference between the first and last black pixels encountered in a top-to-bottom scan of the character.

Fig. 2. Width of the character.

Figure 2 shows the width of a character. The width is the difference between the first and last black pixels encountered in a left-to-right scan of the character.

Fig. 3. Bitmap of the character.

Figure 3 shows the checksum of a character. The checksum is the sum of all the pixels that form the shape (body) of the character; it can also be described as the sum of the character bitmap. Because the checksum depends on the pixels present in the bitmap, documents must be scanned with a constant pixel depth so that each character matches an entry in the database. In our case we use the following settings:

Bit Depth = 24
Horizontal Resolution = 96 dpi
Vertical Resolution = 96 dpi

III. α-SOFT ARCHITECTURE

Fig. 4. α-Soft Architecture [2]

Figure 4 shows the overall architecture of the α-Soft OCR. Almost all components of α-Soft are the same as in other OCRs, except the database. This is the new feature used for English language recognition instead of an ANN.

IV. ALPHA SOFT STAGES

A. Image Acquisition

Fig. 5. α-Soft Image Acquisition Process [7]

In α-Soft, data is captured by scanning the document with a scanner. The image is loaded into α-Soft in the form of a matrix that has the same number of rows and columns as the original image. The whole image acquisition process is shown in Figure 5.
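As a minimal sketch of what "loading the image as a matrix" can look like, the snippet below parses a plain-text PPM (P3) image into a list of rows, each holding one (r, g, b) triple per column. The PPM format and the function name `load_ppm_ascii` are assumptions chosen to keep the example self-contained; the paper does not say which file format the scanner produces or how α-Soft reads it.

```python
def load_ppm_ascii(text):
    """Parse a plain PPM (P3) image into a rows x cols matrix of (r, g, b) tuples."""
    # Strip '#' comments, then split everything into whitespace-separated tokens.
    tokens = []
    for line in text.splitlines():
        tokens.extend(line.split("#", 1)[0].split())
    if not tokens or tokens[0] != "P3":
        raise ValueError("not a plain PPM (P3) image")
    width, height, max_value = int(tokens[1]), int(tokens[2]), int(tokens[3])
    values = [int(t) for t in tokens[4:]]
    if len(values) != width * height * 3:
        raise ValueError("pixel data does not match the declared size")
    if any(v < 0 or v > max_value for v in values):
        raise ValueError("pixel value outside the declared range")
    # One matrix row per image row, one (r, g, b) entry per image column.
    matrix = []
    for row in range(height):
        start = row * width * 3
        matrix.append([
            tuple(values[start + 3 * col: start + 3 * col + 3])
            for col in range(width)
        ])
    return matrix


# Hypothetical 3 x 2 image, 8 bits per channel (24-bit colour).
sample = """P3
# width height
3 2
255
255 255 255   0 0 0   255 255 255
0 0 0   255 255 255   0 0 0
"""
image = load_ppm_ascii(sample)
print(len(image), len(image[0]))  # number of rows, number of columns
```

The matrix keeps the row and column counts of the image, which is the only property the text above relies on.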


B. Preprocessing

Fig. 6. α-Soft Preprocessing [8]

After the image is loaded into α-Soft, it is preprocessed. The image is usually a color image. The color image is converted to gray scale, and the gray-scale image is then converted to a binary image so that each pixel is either 0 or 1. Figure 6 shows the whole preprocessing process.

C. Line Extraction

To extract lines from the image, a horizontal projection of the image is made. The horizontal projection is the histogram of the ON pixels along every row of the image. The space between lines is used to separate one written line from another [9].

Fig. 7. α-Soft Line Extraction Image

Fig. 8. Image and its equivalent horizontal projection.

D. Characters Extraction

To extract words or characters from the lines extracted in the previous step, a vertical projection of each line is made. The vertical histogram shows the start and end of each character present in the line. The white space (gap) between characters and words is used for character and word segmentation. Figure 9 shows a line of characters and its equivalent histogram.

Fig. 9. Line and its equivalent vertical projection.

Fig. 10. The character boundary.

Figure 10 clearly shows the boundary of each character extracted by α-Soft from the vertical histogram projection.

E. Features Extraction

Fig. 11. Feature extraction of a single character.

Figure 11 shows how α-Soft extracts features from a single character that has been extracted from the image. The checksum is the sum of the pixels that form the body of the character.
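The three features (height, width, checksum) can be sketched as follows for a binary glyph in which 1 marks an ink pixel. This is an illustration under stated assumptions rather than the authors' implementation: the function name `extract_features` is hypothetical, and height and width are counted inclusively (last index minus first index plus one). Either convention works as long as the database entries are built the same way.

```python
def extract_features(glyph):
    """Return (height, width, checksum) of a binary glyph (1 = ink, 0 = background)."""
    # Rows and columns that contain at least one ink pixel.
    ink_rows = [r for r, row in enumerate(glyph) if any(row)]
    ink_cols = [c for c in range(len(glyph[0])) if any(row[c] for row in glyph)]
    if not ink_rows:
        return (0, 0, 0)  # empty glyph
    # Assumption: inclusive extent between the first and last ink pixel.
    height = ink_rows[-1] - ink_rows[0] + 1
    width = ink_cols[-1] - ink_cols[0] + 1
    # Checksum: number of ink pixels, i.e. the sum of the bitmap.
    checksum = sum(sum(row) for row in glyph)
    return (height, width, checksum)


# Hypothetical bitmap of a letter "L" with a one-pixel blank margin.
glyph_L = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(extract_features(glyph_L))  # (5, 4, 8)
```

Because the features are plain integers, they depend directly on the scan settings, which is why the text requires a constant pixel depth and resolution.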

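The preprocessing and projection-based segmentation described in Sections IV-B to IV-D can be sketched in a few lines. The sketch assumes details the paper does not give: gray scale is taken as the mean of the three channels, a fixed threshold of 128 separates ink from background, and every blank row or column is treated as a separator. All function names are hypothetical.

```python
def to_binary(rgb_image, threshold=128):
    """Colour -> gray scale -> binary; 1 marks a dark (ink) pixel."""
    binary = []
    for row in rgb_image:
        gray_row = [(r + g + b) // 3 for r, g, b in row]  # assumed: channel mean
        binary.append([1 if gray < threshold else 0 for gray in gray_row])
    return binary


def runs(profile):
    """Return (start, end) pairs of consecutive non-zero entries, end exclusive."""
    spans, start = [], None
    for i, count in enumerate(profile):
        if count and start is None:
            start = i
        elif not count and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def extract_lines(binary):
    """Split a page into text lines using the horizontal projection."""
    horizontal = [sum(row) for row in binary]  # ON pixels per image row
    return [binary[top:bottom] for top, bottom in runs(horizontal)]


def extract_characters(line):
    """Split one text line into characters using the vertical projection."""
    vertical = [sum(col) for col in zip(*line)]  # ON pixels per image column
    return [[row[left:right] for row in line] for left, right in runs(vertical)]


# Hypothetical binary page: two text lines, the first holding two characters.
page = [
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
lines = extract_lines(page)
print([len(extract_characters(line)) for line in lines])  # [2, 1]
```

This sketch does not distinguish character gaps from the wider word gaps, and it cannot separate characters that touch; both would need extra handling in a complete system.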


V. RECOGNITION

α-Soft uses a database at the back end, so a query has to retrieve from the database the character that has the same height, width, and checksum values. Because the font size is fixed, we use hard matching. Hard matching means retrieving the character whose height, width, and checksum exactly equal the measured values. Soft matching can also be used. Soft matching means giving a range for the height, width, and checksum values in the query when retrieving the character from the database. The accuracy of soft matching depends on the range: the greater the range, the lower the accuracy.

VI. RESULTS

A. Line Extraction

The line extraction accuracy of α-Soft is 100%.

B. Character Extraction

The character extraction accuracy of α-Soft is 100%.

C. Digitization

Because both hard and soft matching are used, results are reported separately for each kind of matching in Tables 2 to 4.
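The hard and soft matching queries of Section V can be illustrated with an in-memory SQLite database. This is a sketch under assumptions, not the authors' schema: the table name `glyphs`, its columns, the helper names, and the feature values are all made up for illustration, and the "range" is interpreted as a symmetric tolerance of plus or minus `tolerance` around each measured value.

```python
import sqlite3


def build_database(templates):
    """Create an in-memory table of (symbol, height, width, checksum) templates."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE glyphs "
        "(symbol TEXT, height INTEGER, width INTEGER, checksum INTEGER)"
    )
    conn.executemany("INSERT INTO glyphs VALUES (?, ?, ?, ?)", templates)
    return conn


def match(conn, height, width, checksum, tolerance=0):
    """Return all symbols whose features lie within +/- tolerance of the query.

    tolerance=0 is hard (exact) matching; tolerance>0 is soft matching.
    """
    rows = conn.execute(
        "SELECT symbol FROM glyphs "
        "WHERE height BETWEEN ? AND ? "
        "AND width BETWEEN ? AND ? "
        "AND checksum BETWEEN ? AND ? "
        "ORDER BY symbol",
        (height - tolerance, height + tolerance,
         width - tolerance, width + tolerance,
         checksum - tolerance, checksum + tolerance),
    )
    return [symbol for (symbol,) in rows]


# Made-up feature values, for illustration only.
templates = [("!", 12, 2, 20), ("l", 12, 2, 23), ("L", 12, 8, 30), ("o", 8, 7, 26)]
conn = build_database(templates)

print(match(conn, 12, 2, 20))               # hard matching: ['!']
print(match(conn, 12, 2, 21))               # hard matching, no exact entry: []
print(match(conn, 12, 2, 21, tolerance=2))  # soft matching: ['!', 'l']
```

The last query returns two candidates, which shows why a wider range lowers accuracy: characters with similar height, width, and checksum become indistinguishable.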
TABLE 2
HARD MATCHING (EXACT MATCHING)

Characters in Image   α-Soft Output   No. of undigitized characters   α-Soft Correctness %
14                    14              0                               100
20                    20              0                               100
37                    37              0                               100
7                     7               0                               100

TABLE 3
SOFT MATCHING (2 VALUE RANGE)

Characters in Image   α-Soft Output   No. of undigitized characters   α-Soft Correctness %
14                    14              0                               100
20                    20              0                               100
26                    26              0                               100
7                     7               0                               100

TABLE 4
SOFT MATCHING (4 VALUE RANGE)

Characters in Image   α-Soft Output   No. of undigitized characters   α-Soft Correctness %
14                    14              0                               100
20                    20              0                               100
26                    25              1                               96.15
7                     7               0                               100

VII. LIMITATIONS

Because a database is used at the back end, the result depends on the query. If soft matching is used, the result may not be the desired one. For example, the exclamation mark (!) might be matched with capital L or small letter l.

VIII. BEST USAGE

This approach to constructing an OCR is best suited to situations such as:
A. Card Reading
B. Air Ticketing
C. Any environment that deals with a fixed font.

IX. CONCLUSION

Before our technique, OCRs were generally constructed with ANNs, which makes the whole OCR process complex: how many nodes are required, and what is the training cost of the neural network? Our results show that the database approach gives the best results where the font size and font style are fixed. With a database, the problems associated with ANNs can be overcome.

REFERENCES

[1] Topic: English Language, Sub heading: English as global language, Paragraph # 1, Line # 1. Available: http://en.wikipedia.org/wiki/English_language.
[2] Haidar Almohri, John S. Gray, Histam Alnajjar, "A Real-time DSP-Based Optical Character Recognition System for Isolated Arabic characters using the TI TMS320C6416T," Proceedings of the 2008 IAJC-IJME International Conference. Sub heading: Introduction, Line # 1. Available: http://www.ijme.us/cd_08/PDF/228%20ENT%20201.pdf
[3] AIM, Inc., Alpha Drive, Pittsburgh, USA. Title: Optical Character Recognition, Paragraph # 1, Line # 1. Available: http://www.aimglobal.org/technologies/othertechnologies/ocr.pdf
[4] George Nagy, Thomas A. Nartker, Stephen, "Optical Character Recognition: An illustrated Guide to the frontier," SPIE Vol. 3967, 58-69, Sub heading: Introduction, Paragraph # 2, Line # 1. Available: http://www.ecse.rpi.edu/~nagy/PDF_chrono/2000_Nartker_Rice_SPIE3967-DR&R2000.pdf
[5] Million Meshesha, C. V. Jawahar, "Optical Character Recognition of Amharic Documents," Sub heading: Introduction, Paragraph # 1, Line # 3. Available: http://cvit.iiit.ac.in/papers/million07Optical.pdf
[6] English Alphabet, Sub heading: letter names, Table 1. Available: http://en.wikipedia.org/wiki/English_alphabet.
[7] "To understand is to perceive patterns - Isaiah Berlin" on pattern Recognition, Sub heading: The various steps involved in any standard, Figure 1. Available: http://profile.iiita.ac.in/ssuman_02/home/homepage_files/resources/ocr.pdf
[8] Adnan Mohammand Shoeb Shatil and Mumit Khan, "Minimally segmenting high performance Bangla Optical Character Recognition Using Kohonen network," Sub heading: image processing, Figure 3. Available: http://www.panl10n.net/english/final%20reports/pdf%20files/Bangladesh/BAN27.pdf
[9] R Sanjeev Kunte and Sudhaker Samuel, "A simple and efficient optical character recognition system for basic symbols in printed kannada text," Sub heading: segmentation. Sub-sub heading: line extraction. Paragraph # 1. Line # 2. Available: http://www.ias.ac.in/sadhana/Pdf2007Oct/521.PDF.

