ICASSP 2008
$$
W_{BM}(p,q) =
\begin{cases}
0, & 0 < p+q \le 7 \\
1, & 7 < p+q \le 10 \\
0, & 10 < p+q \le 16
\end{cases}
\tag{6}
$$
$$
S = \frac{\sigma^2_{\mathrm{between}}}{\sigma^2_{\mathrm{within}}}
$$
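Also of interest are the low and high frequencies. The corresponding binary weight functions, $W_{BL}$ and $W_{BH}$, are given by

$$
W_{BL}(p,q) =
\begin{cases}
0, & p = q = 1 \\
1, & 1 < p+q \le 7 \\
0, & 7 < p+q \le 16
\end{cases}
\tag{7}
$$

$$
W_{BH}(p,q) =
\begin{cases}
0, & 0 < p+q \le 10 \\
1, & 10 < p+q \le 16
\end{cases}
\tag{8}
$$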
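The three binary masks can be generated from their band limits. The following is a minimal sketch, assuming 1-based indices p, q in 1..8 (so that p = q = 1 is the DC term and p + q ranges from 2 to 16); the helper names `binary_weight` and `count_ones` are illustrative and not from the paper.

```python
def binary_weight(lo, hi, size=8):
    """Return a size x size mask that is 1 where lo < p + q <= hi.

    ASSUMPTION: p and q are 1-based, matching the p = q = 1 DC case.
    """
    return [[1 if lo < p + q <= hi else 0
             for q in range(1, size + 1)]
            for p in range(1, size + 1)]


def count_ones(mask):
    """Number of coefficients selected by a binary mask."""
    return sum(sum(row) for row in mask)


# Band limits read from Equations (6)-(8).
W_BL = binary_weight(2, 7)    # low band; lo = 2 excludes the DC term (p + q = 2)
W_BM = binary_weight(7, 10)   # middle band
W_BH = binary_weight(10, 16)  # high band

print(count_ones(W_BL), count_ones(W_BM), count_ones(W_BH))
```

Under these assumptions the three bands are disjoint and together cover every coefficient except the DC term.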
2.2.3. Linear Weight
Binary weights can reduce computations, but at the cost of increasing detection errors. Since the text blocks have more energy in high-frequency components, linearly increasing weights are selected to amplify the high-frequency energy while mitigating the effect of low-frequency energy. For 8 × 8 DCT coefficients, the linear weights, denoted by $W_L$, are given by

$$
W_L(p,q) =
\begin{cases}
0, & p = q = 1 \\
\dfrac{32}{7}\,(p+q-2), & p+q > 2
\end{cases}
\tag{9}
$$

Thus, the elements of $W_L$ increase linearly along the diagonal.
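As an illustration, the linear weight matrix can be tabulated directly. This sketch reads Equation (9) as W_L(p, q) = (32/7)(p + q - 2) with 1-based indices p, q in 1..8; the function name `linear_weight` is hypothetical, and exact fractions are used only to keep the arithmetic free of rounding.

```python
from fractions import Fraction


def linear_weight(size=8):
    """Tabulate W_L(p, q) = (32/7) * (p + q - 2) for 1-based p, q.

    The DC term p = q = 1 evaluates to 0 without a special case.
    """
    scale = Fraction(32, 7)
    return [[scale * (p + q - 2) for q in range(1, size + 1)]
            for p in range(1, size + 1)]


W_L = linear_weight()

# Weights along the main diagonal p = q grow in equal steps of 64/7.
diagonal = [W_L[i][i] for i in range(8)]
steps = [b - a for a, b in zip(diagonal, diagonal[1:])]
print(diagonal[0], diagonal[-1], steps[0])
```

With this reading, the weight is 0 at the DC term and reaches 64 at p = q = 8, and every coefficient on the same anti-diagonal (constant p + q) receives the same weight.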
2.2.4. Quadratic Weight
Similarly, for 8 × 8 DCT coefficients, the quadratic weight function $W_Q$ is given by

$$
W_Q(p,q) =
\begin{cases}
0, & p = q = 1 \\
\left(\dfrac{p+q}{2}\right)^{2}, & p+q > 2
\end{cases}
\tag{10}
$$
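The quadratic weights can be tabulated in the same way and applied to a block of DCT coefficients. This is a sketch under two assumptions: Equation (10) is read as W_Q(p, q) = ((p + q)/2)^2 with 1-based indices, and the weighted energy is taken to be the sum of W(p, q)·|C(p, q)| over the block, which is a guess at the energy definition rather than something stated in this section. The names `quadratic_weight` and `weighted_energy` and the toy block are hypothetical.

```python
from fractions import Fraction


def quadratic_weight(size=8):
    """Tabulate W_Q(p, q) = ((p + q) / 2) ** 2, with the DC term set to 0."""
    return [[Fraction(0) if p == q == 1 else Fraction(p + q, 2) ** 2
             for q in range(1, size + 1)]
            for p in range(1, size + 1)]


def weighted_energy(weights, coeffs):
    """ASSUMPTION: energy = sum over the block of W(p, q) * |C(p, q)|."""
    return sum(w * abs(c)
               for w_row, c_row in zip(weights, coeffs)
               for w, c in zip(w_row, c_row))


W_Q = quadratic_weight()

# Toy block: a large DC coefficient and one small high-frequency coefficient.
block = [[0] * 8 for _ in range(8)]
block[0][0] = 100   # DC term, weight 0
block[7][7] = -2    # highest frequency, weight 64

energy = weighted_energy(W_Q, block)  # 0 * 100 + 64 * 2 = 128
print(energy)
```

The toy block shows the intended effect of the weighting: the DC coefficient contributes nothing, while a small high-frequency coefficient dominates the weighted energy.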
Fig. 2. (a) energy plot with $W_{BL}$, (b) energy plot with $W_{BM}$, (c) energy plot with $W_{BH}$, (d) energy plot with $W_U$, (e) energy plot with $W_L$, (f) energy plot with $W_Q$.
Fig. 1. (a) input image, (b) detection result with $W_{BL}$, (c) detection result with $W_{BM}$, (d) detection result with $W_{BH}$, (e) detection result with $W_U$, (f) detection result with $W_L$, (g) detection result with $W_Q$, (h) bounding box generated onto the input image.

3. EXPERIMENTAL RESULTS
In this section, the proposed detection algorithms are evaluated through program simulations running on a Microsoft Windows XP, Pentium D, 3.2 GHz platform. The programs are implemented in MATLAB. In the simulation, 20 images with text are randomly selected from our image library. All are grayscale images of size 256 × 256. Since the block size does not significantly affect the result, we choose an 8 × 8 block size. In the following, we compare the performance of the proposed algorithms with different types of weights. We use two metrics, recall and precision, that are commonly used in Information Retrieval (IR) to evaluate the detection algorithms:
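$$
\mathrm{recall} = \frac{TP}{TP + FN}, \qquad
\mathrm{precision} = \frac{TP}{TP + FP}
$$

where $TP$, $FN$, and $FP$ denote the numbers of true positives, false negatives, and false positives, respectively.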
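For reference, the two metrics can be computed from block-level counts as follows. This is a minimal sketch using the standard IR definitions; the counts in the example are made up for illustration and are not results from the paper.

```python
def recall(tp, fn):
    """Fraction of true text blocks that were detected."""
    return tp / (tp + fn)


def precision(tp, fp):
    """Fraction of detected blocks that really contain text."""
    return tp / (tp + fp)


# Hypothetical counts: 90 text blocks found, 10 missed, 60 false alarms.
tp, fn, fp = 90, 10, 60
r = recall(tp, fn)        # 90 / 100 = 0.9
p = precision(tp, fp)     # 90 / 150 = 0.6
print(f"recall = {r:.1%}, precision = {p:.1%}")
```

A detector can reach high recall with only moderate precision, as in this example, when it flags many non-text blocks alongside nearly all of the text blocks.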
Table 1 gives the evaluation for different weighted text energy algorithms. The first row is Crandall [3]'s method, which selects 18 coefficients in row-major order as the discrimination statistic. Shiratori [4]'s method is equivalent to using the binary weight $W_{BM}$, which is defined in Equation (6). These two methods are evaluated by recall and precision metrics along
Table 1.

Method          Recall    Precision
Crandall [3]    71.1%     55.2%
$W_{BL}$        60.7%     52.9%
$W_{BM}$        78.5%     59.2%
$W_{BH}$        68.1%     56.1%
$W_U$           93.3%     58.6%
$W_L$           95.5%     59.4%
$W_Q$           96.3%     63.7%

[Figure: ROC curves (TPR versus FPR) for $W_{BL}$, $W_{BM}$, $W_{BH}$, $W_U$, $W_L$, and $W_Q$.]
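$$
\mathrm{FPR} = \frac{FP}{FP + TN}
$$

where $FP$ and $TN$ denote the numbers of false positives and true negatives, respectively.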
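An ROC curve of this kind is obtained by sweeping the decision threshold and recording one (FPR, TPR) pair per threshold. The sketch below uses the standard definitions TPR = TP/(TP + FN) and FPR = FP/(FP + TN); the scores, labels, and thresholds are hypothetical, and the rule "declare text when the score is at least the threshold" is an assumption about the decision direction.

```python
def roc_points(scores, labels, thresholds):
    """Return one (FPR, TPR) pair per threshold.

    ASSUMPTION: a block is declared text when its score >= threshold.
    labels use 1 for text blocks and 0 for non-text blocks.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= t)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical block scores (e.g. weighted energies scaled to [0, 1]).
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels, thresholds=[0.85, 0.5, 0.2])
for fpr, tpr in points:
    print(f"FPR = {fpr:.2f}, TPR = {tpr:.2f}")
```

Lowering the threshold moves the operating point up and to the right: more text blocks are caught, at the price of more false alarms.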
based on the fixed optimal threshold generated by LDA. With varied thresholds, the linear, quadratic, and binary weighted methods each have advantages at different sensitivity levels. Specifically, the quadratic weight should be chosen when a high TPR is required.
5. REFERENCES
[1] N. Ezaki, M. Bulacu, L. Schomaker, "Detection from Natural Scene Image: Towards a System for Visually Impaired Persons," in Proc. of IEEE International Conference on Pattern Recognition (ICPR'04), pp. 683-686, 2004.
[2] W. Wu, X. Chen, J. Yang, "Detection of Text on Road Signs from Video," IEEE Trans. on Intelligent Transportation Systems, vol. 6, no. 4, pp. 378-390, Dec. 2005.
[3] D. Crandall, S. Antani, R. Kasturi, "Extraction of special effects caption text events from digital video," International Journal on Document Analysis and Recognition, pp. 138-157, 2003.
[4] H. Shiratori, H. Goto, H. Kobayashi, "An Efficient Text Capture Method for Moving Robots Using DCT Feature and Text Tracking," in Proc. of IEEE International Conference on Pattern Recognition (ICPR'06), pp. 1050-1053, 2006.
[5] S. Nadarajah and S. Kotz, "On the DCT Coefficient Distributions," IEEE Signal Processing Letters, vol. 13, no. 10, pp. 601-603, Oct. 2006.
[6] E. Y. Lam and J. W. Goodman, "A Mathematical Analysis of the DCT Coefficients," IEEE Trans. on Image Processing, vol. 9, no. 10, pp. 1661-1666, Oct. 2000.