Neil MacEwen
DECLARATION
I declare that this report is entirely the result of my own work under the
Neil MacEwen
Medical Image Analysis Using Texture Analysis Neil MacEwen
CONTENTS
ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF FIGURES
LIST OF TABLES
1. INTRODUCTION
4. CLASSIFICATION
4.3 Classifiers
6. SEGMENTATION
7.3 Classifiers
7.4.2 Normalisation
REFERENCES
APPENDICES
ABSTRACT
The main aim of this project was to validate the use of a set of 5 texture analysis
The project built on a previous final year project that used the 5 algorithms to generate a
set of texture features describing an image. This report outlines the work done in
validating these algorithms. A set of classifiers was designed and tested, and results
showed that the algorithms can be used to successfully differentiate between different
texture types. The segmentation of a mixed texture image was undertaken to validate the
Feature reduction was investigated in order to reduce the number of features needed to
ACKNOWLEDGEMENTS
I would like to thank Bill for introducing me to this interesting topic, and for his support
TABLE OF FIGURES
Figure 3.1: Two textural images with their histograms and some first-order statistics
Figure 3.2: An example of a greyscale MRI image and the graphical representation of its textural features. See appendices A and B for feature details and code respectively.
Figure 4.1: Two-dimensional feature space containing two classes. An incoming pattern is assigned to the class corresponding to the region it falls in.
Figure 4.3: A histogram of feature values. Anything falling to the right hand side of the boundary is classified as a 0, anything to the left as a 1.
Figure 4.4: (a) Linear decision function (b) Nonlinear decision function (both shown in red)
Figure 4.5: Three linearly separable classes in R2, the decision boundary for a class Ci is given by di(x)
Figure 4.6: Three pairwise separable classes in R2, two decision boundaries are needed to select each class.
Figure 7.1: The four textures used for classifier development.
Figure 7.2: Initial classifier performance results (% test vectors correctly identified)
Figure 7.4: Classifier performance after normalisation (% test vectors correctly identified)
Figure 7.7: Neural Net classifier performance for different spreads for 32 x 32 pixel images
Figure 7.8: Average classifier performance over all four classes at different sizes
Figure 8.1: Left: combination of the four Brodatz source textures.
Figure 9.1: Classifier performances using subsets selected using the Bhattacharyya distance
Figure 9.2: Classifier performances using subsets selected using forward and backward selection procedures
LIST OF TABLES
Table 7.1: Number of smaller images extracted from source images at each resolution
Table 7.2: Classification of Fisher’s Iris Data before and after normalisation
Table 9.1: Feature subset sizes created by using only individual texture algorithms
1. INTRODUCTION
image analysis to remove any subjective bias. In a medical context for example, it can be
very difficult to distinguish between different clinical features present in a medical image,
such as between grey and white matter in an MRI scan. One type of computer-based
analysis that can be used is texture analysis, which can be used to identify different
This project is an extension to a previous final year project [1] that aimed to create
a robust image-viewing platform and investigate the use of advanced image analysis
strategies for assisting clinical diagnosis. Five texture analysis algorithms were used to
generate a set of 38 textural features describing an image, and this project will
consequently extend the previous work to analyse a gold-standard data set to validate the
2. PATTERN RECOGNITION
This project fits into the overall subject area known as pattern recognition.
between four types of humans [2]: (a) tall and thin, (b) tall and fat, (c) short and thin, (d)
short and fat. A classification process is therefore carried out on certain features
belonging to these persons, to put them into the correct class. A good choice of features
This process is called feature generation. A subset of the selected features may then be
chosen for various reasons (see section 6); the selection of this subset is called feature
reduction. These features are then used as the input for a classifier, which will assign the
original object into a corresponding class. In the previous example, the classes are the
four types of human, the observations are all the observed qualities of each human (which
are almost limitless, such as age, employment, etc.), and the extracted features are height
y – feature vector
wi - selected class
In the specific case of this project, the observation vector is a digital image (i.e.
the pixel values), the feature vector is a vector of textural features, and the various classes
observation vector that describes the original object. In the specific case of this project
features must be extracted from medical images. There are many ways to extract features
from images [3]; in this project the image is described using textural features extracted by
texture analysis.
The previous project [1] undertaken involved the textural analysis of medical
images in order to extract textural features to aid clinical diagnosis. Various algorithms
were explored which produced a total of 38 features (see appendix A). A Graphical User
Interface (GUI) was created in MATLAB [10] to allow simple graphical and textual
image [4]. In a digital image, texture describes the relationship between the intensities of
neighbouring pixels (not necessarily adjacent). Texture can be examined in two ways,
structurally and statistically. The statistical approach was used [1]. One first-order
algorithm was used, producing nine features, and four second-order algorithms producing
the pixel intensity values. The nine features calculated are detailed in appendix A. An
Figure 3.1: Two textural images with their histograms and some first-order statistics [1]
intermediate matrix describing the digital image was created, from which the features
visual data such as size, colour, shape and orientation, and then
in appendix A.
The next two algorithms are based on the co-occurrence matrix. The co-occurrence
matrix represents the joint probability distribution of pairs of grey level intensities. [1]
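As an illustration of the co-occurrence matrix described above, the following is a minimal sketch in Python rather than the MATLAB implementation of [1]. The function name `cooccurrence_matrix`, the pixel offset `(dx, dy)` and the small test image are assumptions made for the example only; the offsets actually used in [1] are not stated here.

```python
def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Normalised grey-level co-occurrence matrix for one pixel offset.

    image  : list of rows of integer grey levels in range(levels)
    dx, dy : offset from each pixel to its partner (assumed: right neighbour)
    Entry [i][j] estimates the joint probability of grey level i being
    paired with grey level j at that offset.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[n / total for n in row] for row in counts]


# Hypothetical 4 x 4 image with four grey levels.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
glcm = cooccurrence_matrix(image, levels=4)
```

Because the entries are divided by the number of pixel pairs, they sum to one and can be read as a joint probability distribution, from which second-order features can then be computed.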
Figure 3.2: An example of a greyscale MRI image and the graphical representation of its textural
features. See appendices A and B for feature details and code respectively.
The algorithms were implemented using MATLAB. For both the intermediate
matrix and texture feature calculations, computation was found to be very slow. To
more suitable for these types of algorithms [1]. A GUI viewer was also created, which
allows the user to view the input image and its numerical feature values. Regions of
interest (ROI) can also be selected, allowing the user to compare textural features for up
4. CLASSIFICATION
To validate the texture algorithms, textures were identified using the previously
Classification is the process of categorising an object using certain features describing the
object. The features create a feature space in which all objects will lie and the aim of a
classifier is to identify the regions of the feature space taken up by each class. Thus when
a new feature vector is applied to the classifier it will be assigned to the corresponding
[Figure 4.1 plot: axes x1 and x2; regions labelled Region 1 and Region 2; markers o (pattern type 1) and x (pattern type 2)]
Figure 4.1: Two-dimensional feature space containing two classes. An incoming pattern is assigned
to the class corresponding to the region it falls in.
class case the classifier must differentiate between a 1 and a 0. The objects (the
numbers) are shown in figure 4.2. They have been placed on a grid so that an observation
[Figure 4.2: the characters 1 and 0, each placed on a grid of 25 numbered cells, together with their 25-component observation vectors x1 and x0 (entries such as 0 and 0.2)]
As the 0’s generally cover more area than the 1’s, a feature that will differentiate between
the two characters could intuitively be chosen to be the total area covered by each
character. The components of each observation vector are therefore summed to give the
feature vector y.
$y_1 = 3.8, \quad y_2 = 9.1$
A classifier can be designed by plotting a histogram of all the features obtained from a
“training” set, as shown in figure 4.3. The histogram for each class (1 and 0) is plotted on
the same line, and then a boundary can be visually applied, a character being classified to
a class depending on which side of the boundary its feature vector falls in.
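The simple example above can be sketched in a few lines of Python. This is an illustration only: the report places the boundary by eye on the histogram, whereas the sketch assumes a boundary halfway between the two class means, and apart from the values 3.8 and 9.1 quoted above the training features are hypothetical.

```python
def area_feature(observation):
    """Feature y: the sum of the components of the observation vector."""
    return sum(observation)


def train_boundary(features_ones, features_zeros):
    """Assumed rule: place the boundary midway between the class means."""
    mean_ones = sum(features_ones) / len(features_ones)
    mean_zeros = sum(features_zeros) / len(features_zeros)
    return (mean_ones + mean_zeros) / 2


def classify(feature, boundary):
    """Anything to the right of the boundary is a 0, anything to the left a 1."""
    return "0" if feature > boundary else "1"


# Hypothetical training features (3.8 and 9.1 are the values quoted in the text).
features_ones = [3.8, 4.1, 3.5]
features_zeros = [9.1, 8.7, 9.4]
boundary = train_boundary(features_ones, features_zeros)

label_for_one = classify(3.8, boundary)
label_for_zero = classify(9.1, boundary)
```

Any character whose feature falls on the wrong side of the boundary is misclassified, which is why the overlap between the two histograms determines the error rate.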
[Figure 4.3 plot: histograms of the 1's and the 0's against feature value y1 (vertical axis N), with the decision boundary marked]
Figure 4.3: A histogram of feature values. Anything falling to the right hand side of the boundary
is classified as a 0, anything to the left as a 1.
This classifier is obviously not ideal, and the error regions can be seen visually as
the portions of the 1 histogram to the right of the boundary, and the portions of the 0
contemplating the use of a classifier [5]. A given pattern x has to be assigned to one of C
classes w1, w2,…., wc based on its feature values (x1, x2,…., xN). There is therefore an N-
dimensional feature space. The features have a density function conditioned on the
However, even if the densities are not known a common approach is to estimate them
using a training set of patterns, as was highlighted in the previous simple example. That
is, a selection of feature vectors belonging to a single class are taken and an estimate is
made of the conditional density belonging to that class. There are also non-parametric
techniques that assume no prior knowledge of the classes. This is when it is impossible
to construct a classifier using training samples taken from a known class due to the
unavailability of labelled training samples. It may not even be known how many classes
there should be. In these cases cluster analysis is used to organize the training sets into
4.3 Classifiers
the feature space. A statistical classifier examines the risk involved with every
statistical classifier is the Bayes classifier, which is based on Bayes' formula from probability
theory and minimises the total expected risk; the classifier is thus an optimum classifier.
The Bayes classifier calculates the posterior probability of a pattern being in each class,
and assigns it to the class that gives the largest probability. A simple Bayesian decision
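If
$$
l(\mathbf{x}) \equiv \frac{p_{\mathbf{x}|w}(\mathbf{x} \mid w_1)}{p_{\mathbf{x}|w}(\mathbf{x} \mid w_2)}
\begin{cases}
> \dfrac{\Pr(w_2)}{\Pr(w_1)} & \text{choose } w_1 \\[1ex]
< \dfrac{\Pr(w_2)}{\Pr(w_1)} & \text{choose } w_2
\end{cases}
$$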
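The two-class decision rule above can be illustrated with a short Python sketch. It assumes, purely for the example, that each class-conditional density is a one-dimensional Gaussian with known mean and standard deviation; the class parameters and priors below are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """One-dimensional Gaussian density (assumed form of p(x | w))."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))


def bayes_decide(x, class1, class2):
    """Likelihood-ratio test: choose w1 if l(x) > Pr(w2) / Pr(w1), else w2.

    Each class is a tuple (mean, std, prior).
    """
    mean1, std1, prior1 = class1
    mean2, std2, prior2 = class2
    ratio = gaussian_pdf(x, mean1, std1) / gaussian_pdf(x, mean2, std2)
    return 1 if ratio > prior2 / prior1 else 2


# Hypothetical classes: the same densities with two different sets of priors.
equal_priors = bayes_decide(2.3, (0.0, 1.0, 0.5), (4.0, 1.0, 0.5))
skewed_priors = bayes_decide(2.3, (0.0, 1.0, 0.9), (4.0, 1.0, 0.1))
```

The same pattern can be assigned to different classes depending on the priors: a class that is known to be more common needs less support from the likelihood ratio to be chosen.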
When the number of classes is known and the training patterns produce
pattern. For a two-class example, where two classes C1 and C2 exist in Rn and a
hyperplane d (x) = 0 separates their patterns, the decision function d (x) can be used as
a linear classifier.
d (x) > 0 ⇒ x ∈ C1
d ( x) < 0 ⇒ x ∈ C 2
The hyperplane d (x) = 0 is called the decision boundary. In some cases where the
created using generalised decision functions, or the feature space can be transformed into
a much higher dimension where linear decision boundaries can be used. Examples of
[Figure 4.4 plots, panels (a) and (b): axes x1 and x2; regions labelled Region 1 and Region 2; markers o (pattern type 1) and x (pattern type 2)]
Figure 4.4: (a) Linear decision function (b) Nonlinear decision function (both shown in red)
surface d (x) = 0, x ∈ Rn, separates some class Ci from the remaining Cj, j ≠ i, i.e.
d (x) > 0 , x ∈ Ci
d (x) < 0 , x ∈ C j , j ≠ i
then d (x) is a decision function of Ci. This concept is illustrated in figure 4.5, which
gives an example of absolutely separable classes. Classes can also be pairwise separable,
which means that there is a possible linear decision boundary between each pair of
[Figure 4.5 plot: axes x1 and x2; classes C1, C2 and C3; decision boundaries d1(x) = 0, d2(x) = 0 and d3(x) = 0, each with a positive and a negative side]
Figure 4.5: Three linearly separable classes in R2, the decision boundary for a class Ci is given by di(x)
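[Figure 4.6 plot: axes x1 and x2; classes C1, C2 and C3; pairwise decision boundaries d12(x) = 0, d13(x) = 0 and d23(x) = 0, each with a positive and a negative side]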
Figure 4.6: Three pairwise separable classes in R2, two decision boundaries are needed to select each
class.
If the training patterns form clusters, a distance function and clustering approach
can be used. This entails classifying an incoming pattern according to its proximity to
patterns of existing classes. Two ways of deciding to what class the incoming pattern
distance classification represents each class by prototype vectors, for example the class
mean. The simplest case is when the patterns of each class are very close to each other,
and each class can therefore be represented by a single prototype. If there are m pattern
classes in Rn, {C1,…,Cm}, represented by the prototype vectors y1,…,ym, then the
distances between an incoming pattern x and the prototype vectors can be defined as [2]:
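$$
D_i = \lVert \mathbf{x} - \mathbf{y}_i \rVert = \left[ (\mathbf{x} - \mathbf{y}_i)^T (\mathbf{x} - \mathbf{y}_i) \right]^{1/2}, \quad 1 \le i \le m
$$
$$
D_j = \min_i \lVert \mathbf{x} - \mathbf{y}_i \rVert, \quad 1 \le i \le m
$$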
This calculates the minimum Euclidean distance to the class prototypes. The
Euclidean distance however does not take into account any correlation between features,
and thus an improvement is to use the Mahalanobis distance that takes into account the
class covariance matrices. This classifier is also occasionally known as the Gaussian
classifier. Given a class mean and covariance matrix ui and Wi respectively, the distance
is defined as [6]:
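$$
D_i = (\mathbf{x} - \mathbf{u}_i)^T W_i^{-1} (\mathbf{x} - \mathbf{u}_i) + \ln \lvert W_i \rvert
$$
$$
\mathbf{x} \in C_L \quad \text{when} \quad D_L = \min_i \{ D_i \}
$$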
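To show how the two distance measures can disagree, the following Python sketch evaluates both for a single incoming pattern. It is restricted to two features so that the covariance matrix can be inverted by hand, and the class means and covariances are hypothetical values chosen for the illustration.

```python
import math


def euclidean_distance(x, prototype):
    """D_i = ||x - y_i||."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, prototype)))


def gaussian_distance(x, mean, cov):
    """D_i = (x - u_i)^T W_i^-1 (x - u_i) + ln|W_i| for a 2 x 2 covariance W_i."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inverse = [[d / det, -b / det], [-c / det, a / det]]
    diff = [x[0] - mean[0], x[1] - mean[1]]
    quadratic = sum(diff[i] * inverse[i][j] * diff[j]
                    for i in range(2) for j in range(2))
    return quadratic + math.log(det)


def nearest_class(distances):
    """Index of the class with the minimum distance."""
    return min(range(len(distances)), key=distances.__getitem__)


# Hypothetical classes: class 0 is compact, class 1 is widely spread along x1.
means = [(0.0, 0.0), (4.0, 0.0)]
covariances = [[[1.0, 0.0], [0.0, 1.0]], [[9.0, 0.0], [0.0, 1.0]]]
pattern = (1.8, 0.0)

euclidean_choice = nearest_class(
    [euclidean_distance(pattern, m) for m in means])
gaussian_choice = nearest_class(
    [gaussian_distance(pattern, m, w) for m, w in zip(means, covariances)])
```

Here the pattern is closer to the mean of class 0 in the Euclidean sense, but once the spread of class 1 is taken into account the covariance-based distance assigns it to class 1.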
When using the minimum-distance classifier a major problem is defining the class
prototypes. This is especially a problem if classes are split into several clusters. A
measure of similarity between patterns must be used, so that “similar” patterns can be
grouped together to form clusters. Clustering algorithms usually aim to optimise some
performance index, such as the sum of the distances between each pattern and its
corresponding cluster centre. Several clustering algorithms have been developed [2],
such as the c-Means Iterative algorithm (CMI), which iteratively updates each cluster
centre by replacing it with the mean of its samples, or the ISODATA algorithm, which is
class corresponding to its nearest neighbour in the set of sample patterns. The Nearest
Neighbour classifier can also be extended to take into account the k nearest neighbours
[2].
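The iterative update used by the c-means algorithm mentioned above, in which each cluster centre is replaced by the mean of its samples, can be sketched as follows. This is a minimal illustration with a fixed number of iterations and hypothetical data; it is not the toolbox routine used later in the project.

```python
def c_means(points, centres, iterations=10):
    """Basic c-means: assign each point to its nearest centre, then
    replace each centre with the mean of the points assigned to it."""
    for _ in range(iterations):
        clusters = [[] for _ in centres]
        for p in points:
            nearest = min(
                range(len(centres)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centres[i])),
            )
            clusters[nearest].append(p)
        centres = [
            # An empty cluster keeps its previous centre.
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centre
            for members, centre in zip(clusters, centres)
        ]
    return centres


# Hypothetical two-dimensional patterns forming two well separated groups.
points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
final_centres = c_means(points, centres=[(0, 0), (10, 10)])
```

The result depends on the initial centres and on the number of clusters requested, which is one reason the algorithm can give a variety of outcomes on real data.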
Classification can be carried out using a fuzzy logic approach. This would result
in each incoming pattern being classified to every class with varying degrees of certainty.
This approach would not be useful for this project, as definite pattern identification must
be achieved, i.e. an incoming pattern must be classified into one single class.
In order to use the ANN classifier it must be assumed that a set of training
patterns and their correct classifications are a priori available. ANNs are based on the
functionality of the human brain. The human brain is made up of many neurons
connected together by synapses. ANNs are based on the same idea; they consist of
neurons connected by “weights”. Each neuron performs an operation on its input signal
to produce its output, and the weights simply multiply the signal by a fixed value. A
form of ANN that can be used for classification is the Probabilistic Neural Network
(PNN), which can be easily implemented using the MATLAB Neural Net toolbox [10].
The PNN consists of two layers: the first calculates the distance from an input vector to
the training vectors, and the second determines the probabilities of the input being in
each class. Finally the input is classified according to the maximum of these
probabilities.
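The two-layer structure described above can be sketched in Python as follows. The sketch assumes a Gaussian kernel whose width is set by a `spread` parameter and averages the kernel responses within each class; the exact kernel scaling used by the MATLAB toolbox may differ, and the training vectors are hypothetical.

```python
import math


def pnn_classify(x, training, spread=1.0):
    """Probabilistic neural network sketch.

    training : dict mapping class label -> list of training vectors
    Layer 1  : Gaussian response to the distance from x to every training vector
    Layer 2  : average response per class; the largest value wins
    """
    scores = {}
    for label, vectors in training.items():
        total = 0.0
        for v in vectors:
            squared_distance = sum((a - b) ** 2 for a, b in zip(x, v))
            total += math.exp(-squared_distance / (2 * spread ** 2))
        scores[label] = total / len(vectors)
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical training set with two classes.
training = {
    "A": [(0.0, 0.0), (0.0, 1.0)],
    "B": [(5.0, 5.0), (6.0, 5.0)],
}
label, scores = pnn_classify((1.0, 0.0), training, spread=1.0)
```

A small spread makes the decision depend almost entirely on the closest training vector, while a larger spread lets several neighbouring vectors contribute.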
5. FEATURE REDUCTION
Feature reduction is the process of reducing the dimensionality of the feature space. The
reduce feature computation expense, which may be unnecessarily high due to irrelevant
features.
Feature extraction is where a smaller number of new features are created from linear
combinations of the original features. The obvious drawback is that this means that the
same number of measurements must be taken in the first place. Feature selection is the
can be a huge number of possible subsets. There are however some other heuristic
methods of feature selection which, although perhaps not finding the “best” subset, will
find a reasonable subset [3]. All the methods require some form of quantifying the “best”
features, usually a measure closely related to the error rate of the resulting classifier if the
Stepwise forward: this method first finds the single feature that maximises the
measure of “best”. Then another feature is selected which, coupled with the original,
again maximises the measure. A third feature is then chosen and this process continues
Stepwise backward: this method starts with all the features and, at each step,
removes the feature that maximises (or least reduces) the measure.
Full stepwise: this method combines both the previous methods to form a method
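The stepwise forward procedure can be sketched in Python as a greedy search. The function `score` stands for whatever measure of "best" is chosen; the toy measure below, with its feature names, usefulness values and redundancy penalty, is entirely hypothetical and serves only to show the mechanics.

```python
def forward_selection(features, score, subset_size):
    """Greedy stepwise forward selection.

    At each step the feature that, added to those already selected,
    maximises score(subset) is chosen.
    """
    selected = []
    remaining = list(features)
    while remaining and len(selected) < subset_size:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical measure: individual usefulness, with features "a" and "b"
# penalised when used together because they carry the same information.
usefulness = {"a": 5, "b": 4, "c": 3}


def toy_score(subset):
    total = sum(usefulness[f] for f in subset)
    if "a" in subset and "b" in subset:
        total -= 4
    return total


chosen = forward_selection(["a", "b", "c"], toy_score, subset_size=2)
```

In this toy case the second feature chosen is "c" rather than the individually stronger "b", because the measure is evaluated on the subset as a whole rather than on each feature separately.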
Another method [11] uses the Bhattacharyya distance to rank order the features in terms
of ‘relevance’ in separating the classes. The ranked features can then be used to create
subsets of chosen sizes. The Bhattacharyya distance for a single feature between two
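$$
BD(a, b) = \frac{1}{4} \ln \left\{ \frac{1}{4} \left( \frac{\sigma_a^2}{\sigma_b^2} + \frac{\sigma_b^2}{\sigma_a^2} + 2 \right) \right\} + \frac{1}{4} \left\{ \frac{(\mu_a - \mu_b)^2}{\sigma_a^2 + \sigma_b^2} \right\}
$$
where $\mu_j$ and $\sigma_j^2$ are the class mean and variance for class j.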
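The ranking step can be illustrated with a short Python sketch of the single-feature Bhattacharyya distance given above. The feature names and the class means and variances are hypothetical.

```python
import math


def bhattacharyya(mean_a, var_a, mean_b, var_b):
    """Single-feature Bhattacharyya distance between classes a and b."""
    variance_term = 0.25 * math.log(
        0.25 * (var_a / var_b + var_b / var_a + 2))
    mean_term = 0.25 * (mean_a - mean_b) ** 2 / (var_a + var_b)
    return variance_term + mean_term


# Hypothetical statistics per feature: (mean_a, var_a, mean_b, var_b).
feature_stats = {
    "f1": (0.0, 1.0, 2.0, 1.0),
    "f2": (0.0, 1.0, 0.5, 1.0),
    "f3": (0.0, 1.0, 4.0, 1.0),
}

distances = {name: bhattacharyya(*stats) for name, stats in feature_stats.items()}
# Rank features from most to least separating.
ranked = sorted(distances, key=distances.get, reverse=True)
```

A subset of a chosen size is then simply the first few entries of the ranked list. For more than two classes the pairwise distances would need to be combined in some way, which this sketch does not attempt.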
6. SEGMENTATION
many medical image analysis applications is to separate an image into regions defined by
different clinical features. For example this may be to define clear regions in an MRI
brain scan containing white matter or grey matter. This could provide an extremely
split into smaller blocks, and each block analysed individually, and either given a
classification as one of the classes, or designated “unknown”. The unknown blocks are
then analysed in more detail, as they are likely to contain boundaries between classes. A
progressive “zooming-in” process is then undertaken in order to find the boundary lines
between classes. The image has therefore been split into small regions each containing a
single class, and a new image could be created showing clearly the distinctions between
the classes. The performance of this segmentation process would validate the
The Brodatz texture database [8] was chosen for the development of a suitable
classifier as it provides a benchmark set of texture images. Four separate textures were
selected to provide four distinct classes, as illustrated in figure 7.1. Clockwise from the
D92 – Pigskin
i.e. they were broken up into smaller sized regions which were analysed individually.
Initially, various smaller images were extracted from the source images. Once the
numerous smaller images were extracted, they were divided into training and test sets, as
shown in table 7.1, so that the classifiers could be designed and tested. The number of
images extracted at each resolution is a function of (and limited by for the larger sizes)
Image size (pixels)   Images extracted   Training set   Test set
16 x 16               400                320            80
32 x 32               400                340            60
64 x 64               100                70             30
128 x 128             25                 15             10
Table 7.1: Number of smaller images extracted from source images at each resolution
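The relationship between block size and the number of images available can be sketched in Python. The sketch assumes non-overlapping blocks and a 640 x 640 pixel source image; neither is stated in this section, so both should be read as assumptions made for the illustration.

```python
def count_blocks(image_size, block_size):
    """Number of non-overlapping square blocks in a square image."""
    per_side = image_size // block_size
    return per_side * per_side


def extract_blocks(image, block_size):
    """Cut an image (list of rows) into non-overlapping square blocks."""
    rows, cols = len(image), len(image[0])
    blocks = []
    for r in range(0, rows - block_size + 1, block_size):
        for c in range(0, cols - block_size + 1, block_size):
            blocks.append([row[c:c + block_size]
                           for row in image[r:r + block_size]])
    return blocks


# ASSUMPTION: 640 x 640 source images (not stated in this section).
available = {size: count_blocks(640, size) for size in (16, 32, 64, 128)}

# Small hypothetical 8 x 8 image to exercise the extraction itself.
small_image = [[r * 8 + c for c in range(8)] for r in range(8)]
small_blocks = extract_blocks(small_image, 4)
```

Under these assumptions the counts for 32 x 32, 64 x 64 and 128 x 128 blocks come out at 400, 100 and 25, matching the table, while more than 400 blocks of 16 x 16 pixels would be available than were used.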
The extracted texture images were then used to create feature vectors. Initially for
classifier development the study was carried out only on the intermediate-sized 32 x 32
pixel images. A feature vector containing values for each of the 38 texture features was
generated for each image, as was explained in section 3. The feature vectors were stored
in matrix form for ease of future access; classifier development was thus carried out
on two matrices for each class: a 340 x 38 training matrix and a 60 x 38 test matrix.
7.3 Classifiers
Three classifiers were chosen for examination: the simple minimum Euclidean
distance classifier, the Gaussian Mahalanobis distance classifier, and the probabilistic
neural network classifier contained within the MATLAB Neural Network Toolbox.
Neural Network classifiers and the Gaussian classifier are commonly seen in
texture literature, and they have been shown to give good results. The Euclidean distance
classifier was chosen to provide a simple, fast alternative to the other more complex
classifiers.
there must first be a choice made of class prototypes to represent the class. An incoming
feature vector is then classified to the class represented by the nearest prototype, ‘nearest’
established using the Euclidean distance. The simplest prototype that can be used is
simply the class mean. Class prototypes can also be generated using some clustering
algorithm. As a comparison to the class means, prototypes were also created using the c-
means algorithm, and a function was also written to attempt to match the c-means cluster
centres to their best-fit class. The c-means algorithm has various modifiable attributes,
and thus produced a wide variety of different results; from this point on only the best
result will be referred to. The c-means process was undertaken using a MATLAB
toolbox function kmeans.m, which allowed for various different versions of the algorithm
incoming feature vector to the nearest class, ‘nearest’ established using the Mahalanobis
distance (see section 4.3.3). The Mahalanobis distance measures the distance between a
point in space and a data set; the classifier thus needs a priori class mean and covariance
values, which are of course available from the training matrix. A Mahalanobis classifier
The MATLAB Neural Network toolbox was used to create a probabilistic neural
network (see section 4.3.5) using the training data, which was then used for classification
purposes. The classification was then carried out by inputting a test feature vector into
the neural net, which output the corresponding class. The Neural Net classifier
vectors and working out the probabilities of the test vector belonging to each class. This
is achieved by measuring the Euclidean distance from the test point to each of its
neighbours. The test vector is then allocated to the class corresponding to the highest
probability.
As explained earlier, initial classifier testing was carried out on the 32 x 32 pixel
set of texture images with 340 training vectors for each class used to create a priori class
means and covariance matrices, and for constructing the neural net. The test vectors were
input one by one into each classifier, and each classifier’s performance was measured
by examining the number of correct classifications it achieved. The initial results are
shown in figure 7.2. The figure shows the percentage of test vectors from each class that
[Figure 7.2 chart: vertical axis 0 to 120 (%); horizontal axis D19, D55, D92, D93 and average; series: Euclidean distance (means), Euclidean distance (c-means), Mahalanobis distance, Neural Net]
Figure 7.2: Initial classifier performance results (% test vectors correctly identified)
were correctly classified, and finally the average performance for each classifier. In the
case of the c-means minimum distance classifier, the results shown are the best achieved
for all algorithm set-ups (see MATLAB documentation for further information) in order
As can be seen in figure 7.2, the classifiers produced widely varying results. The
best performance was from the Mahalanobis classifier, which obtained an overall
classification performance of 96.25% of test vectors correctly identified. It was also seen
that the c-means algorithm provided no great advantage over simply using the class
means as prototypes. This is because the c-means algorithm is essentially trying to find
cluster centres that minimise the total distance between each cluster centre and its
members, and thus the best result that can be found is in fact the class mean. The Neural
7.4.2 Normalisation
The feature vectors that are created come from five different texture analysis
techniques, and thus there are significant range variations between some values in a
feature vector. This is illustrated in figure 7.3, in which the range of texture feature
values for a simple texture image can be seen to range over several orders of magnitude.
A normalisation process was therefore carried out on all vectors used in order to
bring them all into the same range, as also illustrated in figure 7.3. The normalisation
undertaken was z-score normalisation (or standardisation), which allows for the
$$
Z = \frac{x - \mu}{\sigma}
$$
where µ and σ are the feature mean and standard deviation.
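A minimal Python sketch of this normalisation, applied column by column to a matrix of feature vectors, is given below. It assumes the sample standard deviation is used and that no feature is constant (which would give a zero divisor); the small matrix is hypothetical.

```python
import statistics


def zscore_columns(matrix):
    """Standardise each column (feature) to zero mean and unit standard deviation.

    matrix : list of feature vectors, one row per image
    Assumes the sample standard deviation and no constant columns.
    """
    columns = list(zip(*matrix))
    means = [statistics.mean(col) for col in columns]
    stds = [statistics.stdev(col) for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in matrix
    ]


# Hypothetical features whose ranges differ by three orders of magnitude.
features = [
    [1, 1000],
    [2, 2000],
    [3, 3000],
]
normalised = zscore_columns(features)
```

After normalisation both columns span the same range, so neither feature dominates a Euclidean distance simply because of the units in which it is measured.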
[Figure 7.4 chart: vertical axis 0 to 120 (%); horizontal axis D19, D55, D92, D93 and average; series: Euclidean distance (means), Euclidean distance (k-means), Mahalanobis distance, Neural Net]
Figure 7.4: Classifier performance after normalisation (% test vectors correctly identified)
the neural net classifier was completely transformed, returning a 98.75% success rate.
therefore an exploration of a 2-dimensional data set was carried out with the aim of better
The performance of each classifier was examined for a 2-dimensional data set, as
this allowed visual analysis of the data. Fisher’s Iris data set, contained within the
150 iris specimens, 50 of each of 3 types. The 2 features selected for analysis were sepal
length and petal width. Figure 7.5 shows a graphical representation of the data, before
and after normalisation; the black dots show the class means.
types, and thus one would expect it to classify well. The other two iris types, versicolor
and virginica, are slightly overlapping, and thus one would expect some possible
misclassifications. In this case the normalisation process has a smaller effect as each
feature is measured on the same scale to begin with; however, the means can be seen to be
The classifiers were again tested on the new data: every specimen was input to
each classifier in an attempt to find some correlation between the results given by the
classifiers and the spatial representation of the data. For the Euclidean distance classifier
the class means were used as class prototypes. The results of the classification are shown
in table 7.2. Figure 7.6 shows the results of the classification graphically. The circled
samples are those that were misclassified, the black corresponding to the Euclidean
classifier, magenta to the Mahalanobis classifier and cyan to the Neural Net classifier.
Class        Normalisation   Euclidean Distance   Mahalanobis Distance   Neural Net
Setosa       Before          100                  100                    100
Setosa       After           100                  100                    100
Versicolor   Before          76                   96                     84
Versicolor   After           80                   96                     88
Virginica    Before          78                   94                     78
Virginica    After           86                   94                     80
Average      Before          84.67                96.67                  87.3
Average      After           86                   96.67                  89.33
Table 7.2: Classification of Fisher’s Iris Data before and after normalisation
some misclassifications. Normalisation has no great spatial effect on the data; however, it
still affects the classifier performances. Again the Euclidean distance and the Neural Net
classifiers have improved their performance. This time however the Mahalanobis
The Neural Network and Euclidean classifiers both improve performance as they
are using the Euclidean distance, which can be adversely affected by being measured
over distorted scales. The Mahalanobis distance however measures its distance in units
of standard deviation from the class mean, and therefore the z-score normalisation has no
defined. For initial classification testing spread was set to one, however the value of
spread affects the performance of the classifier. When spread is set to around 0, the
takes into account several neighbouring vectors, and thus becomes a k-nearest-neighbour
classifier. Figure 7.7 shows the average Neural Net classifier performance for values of
spread varying from 0.1 to 2. It can be seen that the Neural Net classifier performed best
for 32 x 32 pixel images when the spread was between 0.8 and 0.9.
Figure 7.7: Neural Net classifier performance for different spreads for 32 x 32 pixel images
The classifier performances were tested at image sizes other than 32 x 32
pixels. The results of the classifications are shown in figure 7.8. Neural Net and
Euclidean classifier performances were evaluated under normalised conditions, while the
that evaluations at 64 x 64 and 128 x 128 pixel images were carried out on reduced data
sets.
Figure 7.8: Average classifier performance over all four classes at different image sizes
All three classifier performances dropped significantly at the smallest image size,
8 x 8 pixels. This is because it is hard to extract significant texture information from such
a small region. In general, all three classifiers performed best at 32 x 32 pixels, with the
performance characteristics generally dropping off to either side. The Neural Net
classifier returned the best overall performance. The general dip in performance at
sizes greater than 32 x 32 can be attributed to the smaller data sets, and thus reduced
The variation of the Neural Net classifier’s performance according to spread was also
examined for the various image sizes. Figure 7.9 shows the performance of the classifier
at all sizes the performance had a peak somewhere between 0.9 and 1.1. The figure also
illustrates very well the different performances at different image sizes, with the best
With the classifiers having now been designed and tested, an image made up from
the four previous Brodatz textures was segmented to verify the classifier performance
Figure 8.1 shows the image created by combining sections of the source image,
interest (ROI), classifying the region, and creating a new image using a different colour
for each identified class. The combination image was created using 32 x 32 pixel blocks
taken from the source image, and as such performance using an ROI of greater than 32 x
Figure 8.2 shows the segmented images created using ROIs of 8, 16 and 32
pixels.
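The segmentation procedure (take a region of interest, classify it, and colour the output by class) can be sketched as follows. This is an illustrative outline only: it assumes non-overlapping square ROIs, and `extract_features` and `classify` are hypothetical callables standing in for the texture algorithms and classifiers of the earlier sections. The toy helpers `mean_feature` and `nearest_mean` exist purely to make the sketch runnable.

```python
import numpy as np

def segment_by_roi(image, roi, extract_features, classify):
    """Label each non-overlapping roi x roi block with a class index.
    Assumption: blocks do not overlap; edge pixels that do not fill a
    whole block keep the default label 0."""
    image = np.asarray(image)
    rows, cols = image.shape
    labels = np.zeros((rows, cols), dtype=int)
    for r in range(0, rows - roi + 1, roi):
        for c in range(0, cols - roi + 1, roi):
            block = image[r:r + roi, c:c + roi]
            labels[r:r + roi, c:c + roi] = classify(extract_features(block))
    return labels

def colourise(labels, palette):
    """Map class indices to RGB colours, one colour per identified class."""
    return np.asarray(palette, dtype=np.uint8)[labels]

# Toy stand-ins (hypothetical): one feature, nearest class mean.
def mean_feature(block):
    return np.array([block.mean()])

def nearest_mean(features, class_means=(0.0, 10.0)):
    return int(np.abs(np.asarray(class_means) - features[0]).argmin())
```

A call such as `colourise(segment_by_roi(image, 32, mean_feature, nearest_mean), palette)` returns an RGB image in which each block carries the colour of its assigned class.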
previously. Using an 8 x 8 ROI the results were again generally quite poor, with a high
occurrence of misclassifications across all classifiers. Increasing the ROI size improved
the segmentation for all classifiers, and at 32 x 32 the Mahalanobis and Neural Net
number of features used by the classifier. Calculating the 38 texture features was a very
means to reduce the amount of computation needed to create the input vectors for the
task, thus techniques to find sub-optimal subsets were investigated. Again for
Firstly, the features were rank ordered using the Bhattacharyya distance (see
section 5). The resulting classifier performances for normalised and non-
Figure 9.1: Classifier performances using subsets selected using the Bhattacharyya distance
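As an illustration of rank ordering by Bhattacharyya distance, the sketch below scores each feature separately. It assumes each feature is normally distributed within a class and averages the two-class distance over all class pairs. Both choices are assumptions made for this example and may differ from the procedure defined in section 5. The function names are hypothetical.

```python
import itertools
import numpy as np

def bhattacharyya_1d(a, b, eps=1e-12):
    """Bhattacharyya distance between two univariate Gaussian fits."""
    m1, m2 = a.mean(), b.mean()
    v1, v2 = a.var() + eps, b.var() + eps  # eps guards against zero variance
    mean_term = 0.25 * (m1 - m2) ** 2 / (v1 + v2)
    var_term = 0.5 * np.log((v1 + v2) / (2.0 * np.sqrt(v1 * v2)))
    return mean_term + var_term

def rank_features(x, y):
    """Return feature indices ordered from most to least separable,
    together with the per-feature scores."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    scores = np.zeros(x.shape[1])
    for j in range(x.shape[1]):
        # Assumption: average the pairwise distances over all class pairs.
        pair_distances = [
            bhattacharyya_1d(x[y == c1, j], x[y == c2, j])
            for c1, c2 in itertools.combinations(classes, 2)
        ]
        scores[j] = np.mean(pair_distances)
    order = np.argsort(scores)[::-1]
    return order, scores
```

A subset of size k is then formed from the first k entries of `order`, and the classifier is re-evaluated on those columns only.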
The results showed that a vastly reduced subset of features can be used for no
original 38-feature performances can be seen at the extreme right of each graph. It is
interesting to note that the Neural Net classifier achieved good performance using only 2
features of non-normalised data, and thereafter fell off to its usual poor performance. All
sizes for non-normalised data, however for normalised data subset size affected
performances with normalised data, unlike previously, which suggests that there were
Performance at reduced subsets was also examined using the stepwise procedures
to select the subsets (see section 5). Both forward and backward algorithms were
investigated using both normalised and non-normalised data and using all 3 classifiers as
performance indicators. Figure 9.2 shows the results obtained using these procedures.
Figure 9.2: Classifier performances using subsets selected using forward and backward selection procedures.
The results again showed that excellent performance could be achieved using
reduced subsets. The graphs show forward and backward selection results using each
classifier as the measure of performance. Moving from left to right across the graph the
forward algorithm adds a feature and the backward algorithm removes one. Thus at the
very left side of the graph, the forward subset contains one feature, and the backward
subset contains 37 features (one has been removed). Likewise, at the extreme right of
each plot the forward subset contains 38 features (showing original performances) and
the backward subset contains one feature (not necessarily the same feature as the forward
algorithm).
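The forward and backward procedures can be sketched as greedy searches. In this sketch `score` is a hypothetical callable that returns a performance figure (for example, classifier accuracy) for a list of feature indices. The greedy add-one and remove-one rules shown are the standard stepwise forms and are assumed here, not taken from the project code.

```python
def forward_selection(n_features, score):
    """Start empty and repeatedly add the feature that helps most."""
    selected, remaining, history = [], list(range(n_features)), []
    while remaining:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
        history.append((list(selected), score(selected)))
    return history

def backward_selection(n_features, score):
    """Start with every feature and repeatedly drop the one whose
    removal leaves the best-scoring subset."""
    selected = list(range(n_features))
    history = []
    while len(selected) > 1:
        drop = max(
            selected,
            key=lambda f: score([g for g in selected if g != f]),
        )
        selected.remove(drop)
        history.append((list(selected), score(selected)))
    return history
```

For 38 features the forward history runs from a one-feature subset up to all 38, while the backward history runs from 37 features down to one. The single features at the two ends need not coincide, because each search makes its greedy choices in a different context.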
Again it was seen that normalisation has very little effect on the Mahalanobis
interesting to note the reverse characteristics of the forward and backward algorithms, for
example for the Euclidean classifier using normalised data the backward characteristic is
The performance of each texture algorithm was also examined, thus producing 5
reduced subsets of varying sizes, as shown in table 9.1. These performances are shown in
figure 9.3.
Texture algorithm    Number of features
First order          9
NGTDM                5
GLDM                 5
GLRLM                5
SGLDM                14
Table 9.1: Feature subset sizes created by using only individual texture algorithms
It can be seen that the GLRLM algorithm is on average the best performing for
both normalised and non-normalised data; it is also the least affected by normalisation.
These results again show that a much-reduced subset can be used for good classifier
performances.
A set of classifiers was developed which were used to validate the use of the five
texture algorithms for texture identification. Results showed that the algorithms generate
features that can be used to classify images by texture, and a texture combination image
reduction was examined, and good classification was achieved using reduced subsets of
textures, the next step of the project is to use the algorithms to identify clinical features in
APPENDICES
First-order - mean f1
- variance f2
- skew f3
- kurtosis f4
- energy f5
- coarseness f6
- entropy f7
- median f8
- mode f9
- contrast f11
- busyness f12
- complexity f13
- energy f16
- entropy f17
- mean f18
- energy f26
- homogeneity f27
- correlation f28
- entropy f29