Abstract—Text recognition in imagery yields semantically meaningful information, which makes it relevant to fields such as content-based image retrieval, navigation, assistance for blind people, intelligent transportation systems, and vehicle testing. Text detection in a scene image is the process by which text zones are segmented from non-textual ones and arranged in their correct reading order. Diverse text patterns and varying background interference are the challenges that affect the reliability of text character extraction. This paper proposes a novel system for text detection and recognition in images. The proposed method uses Fractional Poisson enhancement to remove Laplacian noise from the input image. Edge-enhanced Maximally Stable Extremal Regions (MSERs) are then obtained from the pre-processed image. Region filtering removes the non-text regions, and the remaining text regions are recognized by an Optical Character Recognition (OCR) system. The algorithm outperforms existing methods in terms of Peak Signal to Noise Ratio (PSNR) and Structural Similarity (SSIM) measurements. The method is evaluated on the standard ICDAR dataset, which consists mostly of real-time images.
Keywords—Text recognition; text detection; fractional Poisson enhancement; maximally stable extremal regions; region filtering; optical character recognition
I. INTRODUCTION
In recent years, text recognition from images and videos has gained popular interest with advances in pattern recognition and computer vision technology. Semantic, or high-level, text data present in an image is significant because it can describe the image with good clarity. It can be extracted using low-level features such as color and texture, but these vary with language, font, style, and background, which makes text extraction a challenging task [1]. Recognition of text is a further challenge for researchers, since low-resolution text with small fonts may be present in an image or video with a complex or textured background [2].
The aim of the proposed method is to detect and recognize the text candidates in a scene-text image. The first stage of the method is text detection, and the second stage deals with recognition of the detected text characters. The proposed algorithm begins with pre-processing of the input image using Fractional Poisson enhancement. The MSER regions corresponding to the Laplacian-noise-filtered image are obtained. The result is then segmented using Canny’s edge operator and enhanced using morphological operations. Text-candidate regions are separated from the non-text regions by connected component analysis of the edge-enhanced MSER image. The text candidates extracted from the region-filtered image are clustered using morphological operations. An OCR is then used to recognize the characters. The remainder of this paper is organized as follows. Section II gives a brief literature survey of text extraction algorithms, Section III details the proposed methodology, Section IV presents the performance evaluation, and Section V concludes with the inference and future scope.
II. RELATED WORKS
Over the last few decades, detection and recognition of text in images and videos has developed into an active research area. The unique features of text help to distinguish it from non-textual regions. The most commonly used text extraction methods are summarized in this section.
A. Texture based methods
In texture-based methods, text is treated as a special texture that distinguishes it from the background. A machine learning technique is used to train a classifier to detect the presence of text within an image. Zhong et al. [3] analysed gray-scale images using local spatial variations, with the assumption that text regions exhibit a high degree of variance. This approach was limited to the detection of text with horizontal orientation only. Horizontal and vertical frequencies were taken into consideration by Zhong et al. in [4]. They also applied the discrete cosine transform (DCT) for the extraction of text characters. This algorithm was robust but failed to give precise localization.
B. Region-based methods
Region-based methods rely on the pixel difference between a text region and its background. Here the pixels
Fig. 3: Edge-enhanced MSER mask formation. (a) MSER mask, (b) Edge mask, (c) Edge mask after morphological operation, (d) Binary edge mask after morphological filling, (e) Resultant edge-enhanced MSER mask.
Fig. 4: Region filtered text mask.
Fig. 5: Sample image from ICDAR dataset showing text detection and recognition results. (a) Morphological mask, (b) Bounding boxes for text-region, and (c) Recognized text with character annotations.
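The region filtering that produces the text mask of Fig. 4 relies on connected component analysis. A minimal Python sketch of that idea follows: it labels 8-connected components in a binary mask and keeps those whose pixel area and bounding-box aspect ratio fall inside given limits. The function names, the choice of area and aspect ratio as filtering criteria, and the threshold values are assumptions for illustration; the paper's own criteria and its MATLAB implementation are not reproduced here.

```python
from collections import deque


def connected_components(mask):
    """Label 8-connected foreground regions of a binary mask (list of lists of 0/1).

    Returns a list of components, each a list of (row, col) pixel coordinates.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                # Breadth-first flood fill from this unvisited foreground pixel.
                queue = deque([(r, c)])
                seen[r][c] = True
                pixels = []
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and mask[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
                components.append(pixels)
    return components


def bounding_box(pixels):
    """Return (top, left, bottom, right), inclusive, for one component."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return min(ys), min(xs), max(ys), max(xs)


def filter_regions(mask, min_area=3, max_aspect=5.0):
    """Keep components that pass hypothetical area and aspect-ratio limits."""
    kept = []
    for pixels in connected_components(mask):
        top, left, bottom, right = bounding_box(pixels)
        height, width = bottom - top + 1, right - left + 1
        aspect = max(height, width) / min(height, width)
        if len(pixels) >= min_area and aspect <= max_aspect:
            kept.append((top, left, bottom, right))
    return kept


# Toy mask: a 2x2 blob, an isolated pixel, and a long thin line.
mask = [
    [1, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
]
print(filter_regions(mask))  # prints [(0, 0, 1, 1)]
```

In this toy mask the isolated pixel fails the area limit and the one-pixel-high line fails the aspect-ratio limit, so only the 2x2 blob survives as a text candidate.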
D. Connected Component Segmentation
The bounding box of the text region can be computed by merging the individual characters into a single connected component. In the proposed method, individual characters are connected to form a text cluster by morphological closing, followed by opening to clean up any outliers [15]. The region-filtered text mask after the morphological operations and the detected text regions with bounding boxes are shown in Fig. 5(a) and Fig. 5(b), respectively.
E. Character Recognition
The binary text mask obtained is fed to an optical character recognition (OCR) system to improve the accuracy of recognition. A threshold T is set to find characters with a high character confidence index. Extracted characters whose character confidence index is greater than the threshold are said to have a high character confidence value, and their number gives the count of characters recognized with high probability. The resultant OCR image with character annotations is given in Fig. 5(c).
IV. EXPERIMENTAL RESULTS
The proposed method has been implemented in MATLAB 2015a using sample images taken from the standard ICDAR 2013 scene dataset with horizontal texts [16].
A. Performance evaluation of detection stage
The text extraction results for different text detection algorithms are displayed in Fig. 6. It is clear from the observation that the proposed method outperforms other text extraction algorithms such as the edge-based algorithm, edge-enhanced MSER along with the stroke-width-based algorithm, and
2017 International Conference on Intelligent Computing, Instrumentation and Control Technologies (ICICICT)
Method                            Character confidence (%)   Character accuracy (%)   Error rate (%)   (unlabeled in source)
MSER and SWD based method [10]    29.00                      33.39                    66.61            23.67
Poisson method [11]               44.90                      52.96                    47.04            28.06
Proposed method                   90.30                      96.42                    3.58             21.00
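The thresholding rule of the character recognition step (count the characters whose confidence index exceeds a threshold T) can be sketched in Python as below. The input format (a list of character and confidence pairs, as an OCR engine might report them), the function name, and the value of T are assumptions for illustration; the paper does not state them in this text.

```python
def high_confidence_characters(ocr_chars, threshold):
    """Return the characters whose confidence index is strictly greater than the threshold."""
    return [ch for ch, conf in ocr_chars if conf > threshold]


# Hypothetical OCR output: (character, confidence in [0, 1]) pairs.
ocr_chars = [("S", 0.97), ("T", 0.91), ("0", 0.42), ("P", 0.88)]
T = 0.80  # hypothetical threshold

confident = high_confidence_characters(ocr_chars, T)
print(len(confident), "".join(confident))  # prints: 3 STP
```

Raising T trades recall for reliability: fewer characters are counted, but each counted character is more likely to be correct.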
PSNR and MSE are defined in (1) and (2) respectively.
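Equations (1) and (2) are not reproduced in this text. As a hedged illustration, the Python sketch below uses the standard textbook definitions: MSE as the mean of squared pixel differences, and PSNR = 10 log10(MAX^2 / MSE) in decibels. These are assumed, not confirmed, to match the paper's equations. The function names and the 8-bit peak value of 255 are likewise assumptions.

```python
import math


def mse(reference, test):
    """Mean squared error between two equally sized grayscale images (lists of rows)."""
    total, count = 0.0, 0
    for ref_row, test_row in zip(reference, test):
        for a, b in zip(ref_row, test_row):
            total += (a - b) ** 2
            count += 1
    return total / count


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    error = mse(reference, test)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


# Toy 2x2 images: two pixels differ by 10 gray levels.
reference = [[100, 100], [100, 100]]
noisy = [[110, 90], [100, 100]]
print(mse(reference, noisy))  # prints 50.0
print(round(psnr(reference, noisy), 2))
```

A higher PSNR means the processed image is closer to the reference, which is the sense in which the detection stage is compared across methods.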
V. CONCLUSION
A novel system for detection and recognition of text, using removal of Laplacian noise from the image and edge-enhanced MSERs, is proposed. The method gives higher PSNR and SSIM than existing methods for the detection stage. It also gives a very good character confidence of 90.30%, the highest character accuracy, of the order of 96.42%, and the least error rate, 3.58%. From these results, it can be concluded that the proposed method is more robust than existing methods for high-resolution, complex-background images with horizontal and arbitrarily oriented texts. Future work includes methods for improving the results for images with small fonts and distorted text.
REFERENCES
[1] Keechul Jung, Kwang In Kim and Anil K. Jain, “Text information extraction in images and video: a survey,” The journal of the Pattern Recognition society, 2004.
[2] Victor Wu, Raghavan Manmatha, and Edward M. Riseman, “TextFinder: An Automatic System to Detect and Recognize Text in Images,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 21, No. 11, November 1999.
[3] Y. Zhong, K. Karu, and A.K. Jain, “Locating Text in Complex Color Images,” Pattern Recognition, vol. 28, no. 10, pp. 1,523-1,536, Oct. 1995.
[4] Y. Zhong, H. Zhang, and A. K. Jain, “Automatic caption localization in compressed video,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 4, pp. 385–392, 2000.
[5] Xiaoqing Liu and Jagath Samarabandu, “An Edge-based text region extraction algorithm for Indoor mobile robot navigation,” Proceedings of the IEEE, July 2005.
[6] Xiaoqing Liu and Jagath Samarabandu, “Multiscale edge-based Text extraction from Complex images,” IEEE, 2006.
[7] Xiaoqing Liu and Jagath Samarabandu, “A Simple and Fast Text Localization Algorithm for Indoor Mobile Robot Navigation,” Proceedings of SPIE-IS&T Electronic Imaging, SPIE Vol. 5672, 2005.
[8] K. Jung, K. I. Kim, and A. K. Jain, “Text information extraction in images and video: a survey,” Pattern Recognition, vol. 37, no. 5, pp. 977–997, 2004.
[9] Julinda Gllavata, Ralph Ewerth and Bernd Freisleben, “A Robust algorithm for Text Detection in images,” Proceedings of the 3rd international symposium on Image and Signal Processing and Analysis, 2003.
[10] H. Chen, S.S. Tsai, G. Schorth, D.M. Chen, R. Grzeszczuk, B. Girod, “Robust text detection in natural scene images with edge-enhanced maximally stable extremal regions,” in: Proceedings of ICIP, 2011, pp. 2609–2612.
[11] Jiji Mol, Anisha Muhammed, Nikhil G Kurup, “A Novel Method for Text Detection in Imagery,” in NCICIS, 2017, in press.
[12] S. Roy, P. Shivakumara, H.A. Jalab, R.W. Ibrahim, U. Pal, T. Lu, “Fractional poisson enhancement model for text detection and recognition in video frames,” Pattern Recogn. 52 (2016), 433-447.
[13] J. Matas, O. Chum, M. Urban, and T. Pajdla, “Robust wide baseline stereo from maximally stable extremal regions,” Proc. of British Machine Vision Conference, pages 384-396, 2002.
[14] http://radio.feld.cvut.cz/matlab/toolbox/images/region.html
[15] https://in.mathworks.com.
[16] http://rrc.cvc.uab.es/?ch=4&com=downloads
[17] Dr. S.Vijayarani and Ms. A.Sakila, “Performance comparison of ocr tools,” International Journal of UbiComp (IJU), Vol.6, No.3, July 2015.