
Notice of Retraction

After careful and considered review of the content of this paper by a duly
constituted expert committee, this paper has been found to be in violation of
IEEE's Publication Principles.
We hereby retract the content of this paper. Reasonable effort should be made to
remove all past references to this paper.
The presenting author of this paper has the option to appeal this decision by
contacting TPII@ieee.org.

2010 International Conference on e-Education, e-Business, e-Management and e-Learning

Tumor Image Classifier Design Based on Support Vector Machines


Lan Gan
School of Information Engineering
East China Jiaotong University
Nanchang, China
gl7046798@yahoo.com.cn

Zhongping Yu
School of Information Engineering
East China Jiaotong University
Nanchang, China
babaheiio@126.com

Abstract - In this paper, we design a tumor image classification and recognition system. Through a series of image pre-processing, segmentation and tracking steps, we extract the global characteristic parameters of each image (the characteristics of the whole picture) and use the obtained parameters to classify and identify the images. Because this work deals mainly with small-sample data, we compare the classification and recognition rates of three methods: the Fisher method, the KNN method and the support vector machine (SVM). Experimental results show that the support vector machine gives a higher recognition rate and is more reliable.

Keywords: global characteristics; Fisher method; KNN method; support vector machine

I. INTRODUCTION

In medical image research, tumor image classification and recognition has long been a hot spot. However, because of adhesions between cells, many tumor images cannot be identified accurately. How to classify tumor images well is therefore the focus of this study.

In view of this situation, this article approaches the problem from a global point of view and classifies tumor images with several different methods. Many image classification and recognition methods are available at this stage, such as statistical-model classification, structural methods, classification trees and neural networks, but in the absence of prior knowledge, and with few samples, these methods have difficulty obtaining the desired results. The support vector machine has demonstrated unique advantages and good prospects in solving small-sample, nonlinear and high-dimensional pattern recognition problems.

This article describes cell image feature extraction and the principle of the support vector machine, builds a support vector classifier to classify tumor images, and compares it with other algorithms. Experimental results show that the support vector machine approach obtains better results.

II. FEATURE EXTRACTION

In the tracking process we must set the scope of the tracking perimeter, namely L; different scopes of L yield different feature values [1].

Under normal circumstances, compared with a normal cell, a cancer cell is irregular and its nucleus is small. According to this characteristic, we extract the following four global features: the number of nuclei whose perimeter lies in the scope of L, namely C_N; the average perimeter, namely A_C; the number of nuclei whose area lies in the scope of L, namely A_N; and the average area, namely A_A. To classify the images quickly, accurately and effectively, how should the scope of L be set? We performed feature extraction separately with the scope of L set to 100<=L<=1000, 200<=L<=1000, 300<=L<=1000 and 500<=L<=1000. Comparing the resulting feature values, we obtain the following results: as the scope of L narrows, C_N and A_N gradually become smaller; when the lower limit of L is 100, 200 or 300, there is no significant difference between the abnormal images and the normal images; when the scope of L is 500<=L<=1000, C_N and A_N equal 0 for the abnormal image, but C_N and A_N are greater than 0 for the normal image. Part of the global feature values is shown in TABLE I.

TABLE I. A LIST OF GLOBAL FEATURE VALUES

Image Name     C_N   A_C   A_N   A_A
Normal          16   457    15   471
Cancer          10   280    14   437
Hyperplasia     15   288    22   407

Feature extraction and selection play an important role in the classification and recognition of medical images. The accuracy of feature extraction, and whether or not the chosen features are appropriate, directly affect the efficiency and the accuracy of image recognition. More features are not necessarily better for identification; on the contrary, too many features lengthen the system processing time and lower the recognition rate. Based on the synthesis of a large number of actual cell images and cell feature data, this paper uses a portion of the morphological parameters and the optical density analysis parameters, as shown in TABLE II. Cell morphology is relatively rough, but it is intuitive and easy to compute. As cells absorb light in varying degrees, we also integrate some optical density features so as to reach a better recognition effect.

978-0-7695-3948-5/10 $26.00 © 2010 IEEE
DOI 10.1109/IC4E.2010.48
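The four global features above reduce to simple counting and averaging over the tracked nuclei. The following pure-Python sketch (not the authors' code; the nucleus perimeters and areas are made-up sample data) computes C_N, A_C, A_N and A_A for a given scope of L:

```python
# Hypothetical sketch of the global feature computation described above.
# Each tracked nucleus is represented by a (perimeter, area) pair; the
# values in the demo below are invented, not measurements from the paper.

def global_features(nuclei, lo, hi):
    """Return (C_N, A_C, A_N, A_A) for the scope lo <= L <= hi.

    C_N: number of nuclei whose perimeter lies in the scope of L
    A_C: average perimeter of those nuclei
    A_N: number of nuclei whose area lies in the scope of L
    A_A: average area of those nuclei
    """
    perims = [p for p, _ in nuclei if lo <= p <= hi]
    areas = [a for _, a in nuclei if lo <= a <= hi]
    c_n = len(perims)
    a_c = sum(perims) / c_n if c_n else 0.0
    a_n = len(areas)
    a_a = sum(areas) / a_n if a_n else 0.0
    return c_n, a_c, a_n, a_a

if __name__ == "__main__":
    nuclei = [(620, 540), (450, 800), (130, 90), (950, 620)]  # (perimeter, area)
    print(global_features(nuclei, 500, 1000))  # scope 500 <= L <= 1000
```

Narrowing the scope [lo, hi] shrinks both counts, which mirrors the observation above that C_N and A_N decrease as the scope of L narrows.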

The selected morphological and optical density features are:

(1) Cell area S1: the number of cell pixels.
(2) Nuclear area S2: the number of nuclear pixels.
(3) Nucleus-to-cytoplasm area ratio NP:

    NP = S2 / (S1 - S2)                                    (1)

(4) Cell compactness C: the ratio of the square of the cell perimeter to the cell area,

    C = D^2 / S1                                           (2)

where D is the cell perimeter.
(5) Nuclear transmission Fp:

    Fp = sum of F(x, y)                                    (3)

where F(x, y) is the gray-scale of point (x, y).
(6) Average nuclear transmittance F1p:

    F1p = Fp / S1                                          (4)

TABLE II. LIST OF CELLS CHARACTERISTIC VALUES

NP       F1p
18.94    51.59
18.95    4.55
43.5     19
0.06     72
23.4     10.75
50.02

III. THREE KINDS OF CLASSIFICATION IN THE APPLICATION OF THIS ARTICLE

In this article the training samples are divided into two types of tumor image, normal and abnormal, expressed as ω1 and ω2 respectively. We collected 55 normal images, 32 cancer images and 16 hyperplasia test images. Among these three kinds of images, we first treat the latter two as one kind; that is, the sample images are first divided into the two kinds normal and abnormal.

A. Fisher Method

In tumor recognition we use the global features obtained by the feature extraction described above, setting the detection vector X = (x1, x2, x3, x4), where x1 is the value of C_N, x2 is the value of A_C, x3 is the value of A_N and x4 is the value of A_A. Using the Fisher method to estimate the means and covariance, the estimated mean vector of the normal class G1 is

    u(1) = (u1(1), u2(1), u3(1), u4(1)) = (5.0598, 463.1880, 7.3419, 468.1966),

the estimated mean vector of the abnormal class G2 is

    u(2) = (u1(2), u2(2), u3(2), u4(2)) = (5.1038, 467.5943, 7.2264, 468.9151),

and the common covariance matrix is

        | 0.0011  0.0175  0.0012  0.0260 |
    V = | 0.0175  2.5455  0.0141  1.8309 |
        | 0.0012  0.0141  0.0030  0.0339 |
        | 0.0260  1.8309  0.0339  2.4201 |

Such an approach, implemented with the fisher() function in MATLAB, yields the category of the sample under test.

B. KNN Method

The k-nearest-neighbor algorithm (kNN) is a widely used pattern recognition method that can identify many types of samples, whether linearly separable or not. The basic idea is to take the k nearest neighbors of a sample of unknown type, check which category the majority of these neighbors belong to, and classify the sample as that type. Specifically, among the known samples of c types, find the k nearest neighbors of x. If k1, k2, k3, ..., kc are the numbers of those k neighbors belonging to the classes ω1, ω2, ω3, ..., ωc, we can define the discriminant function as

    gi(x) = ki,  i = 1, 2, ..., c.

The decision-making rule is: if gj(x) = max over i of ki, then x belongs to ωj. In this way the sample under test can be distinguished and the final classification completed. As the proper value of k differs for different samples, in this paper a large number of experiments led us to give k the value 5.

C. Support Vector Machine

1) Mechanism of Support Vector Machine Classification.
The support vector machine, developed from 1992 to 1995 [2~4], is a new pattern recognition method based on statistical learning theory. It is the concrete realization of statistical learning theory, namely the theory of the VC dimension and the structural risk minimization principle [2~5]. SVM originated from the problem of the optimal separating surface for linearly separable samples. Consider the case of two linearly separable classes. As shown in Figure 1, the solid points and hollow points represent the two types of training samples. H is a line that classifies the two types correctly; H1 and H2 pass through the samples of each category nearest to H and are parallel to it, and the distance between H1 and H2 is the class interval (margin). The optimal separating line must not only separate the two types of samples without error, which ensures empirical risk minimization, but also maximize the class interval, which minimizes the confidence range. Extended to a higher-dimensional space, the optimal separating line becomes the optimal separating surface.

Figure 1. Optimal separating surface diagram

The training samples lying on H1 and H2 are the support vectors, marked with big circles in Figure 1. The optimal separating plane problem can be expressed as a constrained optimization problem, and the optimal classification function is

    f(x) = sgn{<w, x> + b} = sgn{ sum over i of ai * yi * <xi, x> + b }        (5)

In this sum, samples that are not support vectors take ai = 0, and b can be obtained from the equality condition satisfied by any support vector. For general nonlinear problems, an appropriate kernel function defines a nonlinear transformation that maps the input space into a high-dimensional space, in which the optimal linear classification plane is then sought [6].

2) SVM basic algorithm
In this paper we use support vector machines in a layered mode: the tumor pictures are first divided as a whole into two categories, normal and abnormal; the same approach is then applied at a deeper level, further dividing the abnormal class into cancer and hyperplasia. Classification continues according to this model until all categories have been separated, which gives a support vector machine classifier with an inverted-binary-tree-like structure, shown in Figure 2.

Figure 2. Structure of Flow

We use 80 training samples, comprising 40 normal images and 40 abnormal images; the number of samples under test is also 80. The feature vectors and feature means are obtained with the Fisher method. According to formula (5), the sign of f(x) determines the category of the image under test:

Svmtrain()
{   Input the image under test
    If f(x) > 0
        the image under test is classified as a normal image
    else                            // f(x) <= 0
    {   the image under test is classified as an abnormal image
        If f1(x) > 0
            the image under test is a cancer image
        else                        // f1(x) <= 0
            the image under test is a hyperplasia image
    }
}

In the SVM classifier construction process we must consider its structure and parameters. Kernel function selection is difficult and is usually done by experiment. In this paper we selected the polynomial function, the radial basis function (RBF) and the sigmoid function as candidate kernel functions for verification. The results showed that the RBF kernel gave the highest classification accuracy, so the RBF kernel was adopted. In the MATLAB 7.0 environment we designed an SVM multi-classifier using the RBF kernel. Training it on the training samples, cross-validation gave the relevant parameter value: a kernel width of 4. Combined with the classification flow chart above, we use this optimized approach to divide the measured images clearly and complete the final classification.
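The Fisher decision of section III.A can be sketched as follows. This is not the paper's MATLAB fisher() routine; it is a minimal pure-Python reconstruction that builds the linear discriminant w = V^-1 (u(1) - u(2)) from the mean vectors and common covariance quoted above, and classifies a feature vector X by the sign of w . (X - (u(1) + u(2))/2):

```python
# Minimal sketch of a Fisher linear discriminant using the mean vectors
# u1 (normal, G1), u2 (abnormal, G2) and common covariance V quoted in the
# text. This reconstructs the method, not the paper's fisher() implementation.

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

u1 = [5.0598, 463.1880, 7.3419, 468.1966]   # G1 (normal) mean vector
u2 = [5.1038, 467.5943, 7.2264, 468.9151]   # G2 (abnormal) mean vector
V = [[0.0011, 0.0175, 0.0012, 0.0260],
     [0.0175, 2.5455, 0.0141, 1.8309],
     [0.0012, 0.0141, 0.0030, 0.0339],
     [0.0260, 1.8309, 0.0339, 2.4201]]

w = solve(V, [a - b for a, b in zip(u1, u2)])        # w = V^-1 (u1 - u2)
mid = [(a + b) / 2 for a, b in zip(u1, u2)]          # midpoint threshold

def classify(x):
    f = sum(wi * (xi - mi) for wi, xi, mi in zip(w, x, mid))
    return "G1 (normal)" if f > 0 else "G2 (abnormal)"
```

By construction each class mean falls on its own side of the threshold, so classify(u1) returns the normal class and classify(u2) the abnormal class.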

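The kNN rule of section III.B, with k = 5 as chosen in the paper, can be sketched in pure Python. The training points below are hypothetical two-dimensional samples, not the paper's cell features:

```python
# Minimal sketch of the kNN decision rule g_i(x) = k_i described above,
# with k = 5 as in the paper. Training points are made-up 2-D samples.
import math
from collections import Counter

def knn_classify(train, x, k=5):
    """train: list of (feature_vector, label) pairs. Returns the label of
    the class j maximizing g_j(x) = k_j, i.e. the majority label among
    the k nearest neighbors of x."""
    neighbors = sorted(train, key=lambda s: math.dist(s[0], x))[:k]
    counts = Counter(label for _, label in neighbors)   # k_i for each class
    return counts.most_common(1)[0][0]

train = ([((0.0 + i, 0.0), "normal") for i in range(5)] +
         [((10.0 + i, 10.0), "abnormal") for i in range(5)])
print(knn_classify(train, (1.0, 0.5)))   # point near the "normal" cluster
```

A query near one cluster picks up all five of its neighbors from that cluster, so the vote is unanimous; with overlapping classes the majority vote is what resolves the label.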
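The two-level decision flow of the Svmtrain() pseudocode above can be sketched as follows. The support vectors, coefficients and bias below are invented for illustration; only the RBF kernel form and the kernel width of 4 come from the text, and a real classifier would obtain all of these from training:

```python
# Sketch of the inverted-binary-tree SVM decision flow described above.
# f decides normal vs. abnormal; f1 then decides cancer vs. hyperplasia.
# Support vectors and coefficients are hypothetical; only the RBF kernel
# and the kernel width sigma = 4 are taken from the text.
import math

SIGMA = 4.0  # RBF kernel width reported in the paper

def rbf(u, v, sigma=SIGMA):
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-d2 / (2 * sigma ** 2))

def decision(svs, x, b=0.0):
    """f(x) = sum_i a_i y_i K(x_i, x) + b, per formula (5);
    svs is a list of (x_i, y_i, a_i) triples."""
    return sum(a * y * rbf(xi, x) for xi, y, a in svs) + b

# Level 1: normal (y=+1) vs. abnormal (y=-1).
f_svs = [((0.0, 0.0), +1, 1.0), ((10.0, 10.0), -1, 1.0)]
# Level 2, applied only to abnormal images: cancer (+1) vs. hyperplasia (-1).
f1_svs = [((10.0, 12.0), +1, 1.0), ((12.0, 8.0), -1, 1.0)]

def classify(x):
    if decision(f_svs, x) > 0:
        return "normal"
    return "cancer" if decision(f1_svs, x) > 0 else "hyperplasia"

print(classify((1.0, 0.0)), classify((9.0, 11.0)))
```

With one support vector per class and equal coefficients, the RBF decision reduces to "which support vector is closer", which makes the tree's two sign tests easy to follow.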

IV. EXPERIMENT AND RESULT ANALYSIS

In this paper we take 80 training samples and 80 samples under test, and use the above three methods to classify the test samples. The results are listed in TABLE III:

TABLE III. ALGORITHM ANALYSIS TABLE

                         Image recognition rate
Algorithm    100<=L<=1000   200<=L<=1000   500<=L<=1000
Fisher           70.2%          71.78%         71.53%
KNN              71.13%         70.59%         72.27%
SVM              72.54%         73.43%         74.68%

From the table it can be found that, among the three above-mentioned methods, the support vector machine algorithm is the most reliable and most effective. Since these algorithms involve only the global features of a picture, other features such as cell compactness and the nucleus-to-cytoplasm ratio remain to be further addressed. In addition, image acquisition is overlooked in this paper; acquisition is also a crucial step, and how to capture the desired images more reasonably remains to be studied further.

ACKNOWLEDGMENT

I would like to thank Professor Gan Lan for useful discussions and suggestions. Without her consistent and illuminating instruction, this thesis could not have reached its present form. I also owe my sincere gratitude to my friends and fellow classmates, who gave me their help and time, listening to me and helping me work out my problems during the difficult course of this thesis.
REFERENCES
[1] He Chuan-bin. Feature Extraction and Classification for 2D Medical Image. Northwestern University, 2005.
[2] Vapnik V N. The Nature of Statistical Learning Theory [M]. New York: Springer-Verlag, 1995.
[3] Vapnik V N. An Overview of Statistical Learning Theory [J]. IEEE Trans on Neural Networks, 1999, 10(5): 988-999.
[4] Guergachi A A, Patry G G. Statistical Learning Theory, Model Identification and System Information Content [J]. International Journal of General Systems, 2002, 31(4): 343-357.
[5] Bian Zhao-qi, Zhang Xue-gong. Pattern Recognition (second edition) [M]. Beijing: Tsinghua University Press, 2000.
[6] Vapnik V. The Nature of Statistical Learning Theory [M]. New York: Springer-Verlag, 1995.
