
Classification of Hardwood Species using ANN Classifier
Arvind R. Yadav, M. L. Dewal, R. S. Anand
Department of Electrical Engineering
Indian Institute of Technology, Roorkee
Roorkee, India
arvind.yadav.me@gmail.com, mohanfee@iitr.ac.in, anandfee@iitr.ac.in

Sangeeta Gupta
Botany Division
Forest Research Institute, Dehradun
Dehradun, India
guptas@icfre.org

Abstract: In this paper, an approach for classifying hardwood
species from an open-access database, using texture feature
extraction and a supervised machine learning technique, has been
implemented. The edges of the complex cellular structure in
microscopic images of hardwood are enhanced with a Gabor filter,
and the Gray Level Co-occurrence Matrix (GLCM) is revalidated as an
effective texture feature extraction technique. A total of 44
features are extracted from the GLCM and normalized to the range
[0.1, 1]. A Multilayer Perceptron Backpropagation Artificial Neural
Network (MLP-BP-ANN) is used for classification. Experiments
conducted on 25 wood species resulted in recognition accuracies of
about 88.60% and 92.60% using the Levenberg-Marquardt
backpropagation training function with training, validation and
testing ratios of 70%/15%/15% and 80%/10%/10%, respectively. The
proposed methodology can be extended with optimized machine
learning techniques for online identification of wood.
Keywords: Microscopic Image; GLCM; Gabor Filter; Multilayer
Perceptron Backpropagation Artificial Neural Network; Receiver
Operating Characteristics

I. INTRODUCTION

Wood is considered to be one of nature's supreme gifts to mankind.
India has rich forest resources, with more than a thousand tree
species that are commercially exploited for wood (timber). Each
application demands certain prerequisite specifications, so it is
important to select the wood best suited to it. Wood species are
broadly classified into hardwood and softwood. Hardwood trees
(angiosperms) are broad-leaved and deciduous, whereas softwood
trees (gymnosperms) are conifers (evergreens) that have needles or
scale-like foliage and are not deciduous. Hardwood species have a
complex cellular structure that varies considerably between
species. Vessels, rays, parenchyma and fibers are the major
elements of hardwood species. The cellular structure of softwood
species is simple, and 90-95% of the cells are longitudinal
tracheids. It is difficult to discriminate softwood species from
one another because of the limited number of cell types [1].
To fight illegal logging, and to let customs officers assess
tariffs on wood properly, correct recognition of wood is essential
[2]. At present, hardwood species samples are identified through
microstructure examination. The microscopic features of unknown
wood samples are compared with those of known ones. The features
examined follow the list of 221 features published by the
International Association of Wood Anatomists (IAWA, 1989). To
identify wood correctly in and around India, and throughout the
globe, an efficient, fast and accurate machine-vision-based
"Intellectual Wood Identification System" is needed to overcome the
errors of traditional wood identification methods, which depend
exclusively on human expertise.
Image processing has recently provided an alternative for this
purpose. Tou et al. [3] proposed a computer-vision-based wood
recognition system that uses the GLCM (Gray Level Co-occurrence
Matrix) to extract texture features and an MLP (Multilayer
Perceptron) for classification. They reported recognition rates of
75% and 60% using 5 and 4 GLCM features, respectively, for 5 wood
species. Khalid et al. [4] developed an expensive in-house VSDP
(Visual System Development Platform) for the classification of 20
different tropical Malaysian wood species. They reported a
recognition rate of 95%, based on 5 GLCM features extracted from
each sample, using a Multilayer Perceptron Backpropagation
Artificial Neural Network (MLP-BP-ANN) as the classifier. Later,
Wang et al. [5] reported the identification of 24 wood species
using wood stereogram images. They achieved a classification
accuracy of 91.70% using six texture features extracted by the GLCM
approach, with an SVM as the classifier. Because such work requires
a database, Martins et al. [6] prepared a wood image database. They
further extracted structural, GLCM and LBP features from the
images, with 6, 24 and 59 features, respectively. The
classification techniques they used were K-NN, LDA and SVM. A
recognition rate of 86.0% was reported for the combination of LBP
and SVM. Wood species identification is not yet fully established,
especially in India, and extensive research is yet to be carried
out in this area.
In this paper, the performance of MLP-BP-ANN is evaluated for the
classification of 25 different hardwood species using their
microscopic images. The quality of the input data has a significant
impact on the classification accuracy of an ANN. A combination of
Gabor filter and GLCM is used to extract features from the
microscopic images of hardwood species. Different backpropagation
training algorithms are compared on the basis of their
classification accuracies to evaluate the performance of the ANN.
This research paper is organized as follows. The methodology for
wood identification with Gabor filter and GLCM in association with
MLP-BP-ANN is described in Section II. Further, the implementation
of an effective feature extraction technique, using the Gabor
filter for edge enhancement and GLCM for texture feature
extraction, is discussed. The supervised machine learning technique
for classification of hardwood species is also described, and its
implementation is covered in Section III. The evaluation and
discussion of the proposed methodology on an open-access
experimental hardwood image dataset are reported in Section IV.
Finally, the conclusion of the work is presented in Section V.
II. METHODOLOGY

The complete block diagram of the proposed method is given in
Fig. 1. A detailed description of each block is presented below.
[Flow chart: Start -> Microscopic Image Database of Hardwood ->
Preprocessing (RGB to Gray Image Conversion) -> Gabor Filter (Edge
Enhancement) -> GLCM Feature Extraction -> Data Normalization ->
Training / Validation / Test Image Datasets -> ANN Classifier
(MLP-BP-NN) -> End]

Fig. 1. Flow chart of the proposed methodology

A. Preprocessing
The microscopic image samples of hardwood species contain added
artificial colours that enhance their anatomical features. The
first step is to preprocess the microscopic images by converting
each RGB image to a gray scale image, in order to reduce the
computational complexity.
B. Gabor Filter
Useful features of the complex microscopic image are extracted
using a set of Gabor filters [7] with different orientations and
frequencies. The Gabor function is represented by (1).

$$g(x, y; \lambda, \theta, \psi, \sigma, \gamma) = \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right) \cos\left(2\pi \frac{x'}{\lambda} + \psi\right) \quad (1)$$

$$x' = x \cos\theta + y \sin\theta \quad (2)$$

$$y' = y \cos\theta - x \sin\theta \quad (3)$$

where $\lambda$ is the wavelength of the cosine factor, $\theta$ is
the orientation of the normal to the parallel stripes of the Gabor
function, $\psi$ is the phase offset, $\sigma$ is the sigma of the
Gaussian envelope, and $\gamma$ is the spatial aspect ratio.
We have used two phase offsets, 0° and 90°, and 8 orientations:
45°, 90°, 135°, 180°, 225°, 270°, 315° and 360°. For each pixel,
the Gabor energy is calculated for the different orientation and
spatial frequency ($1/\lambda$) combinations by superposition of
the phase offsets 0° and 90°, an approach used extensively in the
image processing field. The Gabor energy filter is preferred
because of its ability to generate a smooth response to an edge,
with a local maximum exactly at the edge. The L-2 norm
super-imposed normalized image for the concerned orientations is
obtained by squaring the convolution results, adding them together
pixel-wise, and then taking the pixel-wise square root to produce
the combined result [8-9] (the L-2 norm of a vector is the square
root of the sum of the squared absolute values). Gray scale input
image samples of the californica and parahyba species and their
L-2 super-imposed normalised images obtained after Gabor processing
are shown in Fig. 2.

[Image panels (a), (b), (c), (d)]

Fig. 2. (a) Gray scale input image - californica species, (b) Gray
scale input image - parahyba species, (c) L-2 super-imposed
normalized Gabor output - californica species, and (d) L-2
super-imposed normalized Gabor output - parahyba species.
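The Gabor processing described in this section can be sketched in Python with NumPy. This is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, the value sigma = 0.56 * wavelength is an assumption (the source does not give a recoverable sigma), the rotation sign convention follows the common textbook form of the Gabor function, and scaling the combined image so that its maximum is 1 is an assumed meaning of "normalized".

```python
import numpy as np

def gabor_kernel(lam, theta, psi, sigma, gamma):
    """Real-valued Gabor kernel: Gaussian envelope times a cosine carrier."""
    half = int(np.ceil(3.0 * sigma / gamma))  # about 3 std. dev. on the long axis
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    x_r = x * np.cos(theta) + y * np.sin(theta)    # rotated coordinates
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2))
    return envelope * np.cos(2.0 * np.pi * x_r / lam + psi)

def convolve_same(image, kernel):
    """Linear 2-D convolution via zero-padded FFT, cropped to the image size."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    shape = (ih + kh - 1, iw + kw - 1)
    spectrum = np.fft.rfft2(image, s=shape) * np.fft.rfft2(kernel, s=shape)
    full = np.fft.irfft2(spectrum, s=shape)
    top, left = (kh - 1) // 2, (kw - 1) // 2
    return full[top:top + ih, left:left + iw]

def gabor_l2_superposition(image, lam=8.0, gamma=0.5, sigma=None,
                           orientations_deg=(45, 90, 135, 180, 225, 270, 315, 360)):
    """Square every filter response, sum pixel-wise, then take the square root."""
    image = np.asarray(image, dtype=float)
    if sigma is None:
        sigma = 0.56 * lam  # ASSUMPTION: not stated in a recoverable form in the paper
    acc = np.zeros_like(image)
    for deg in orientations_deg:
        for psi in (0.0, np.pi / 2.0):  # phase offsets 0 and 90 degrees
            kernel = gabor_kernel(lam, np.deg2rad(deg), psi, sigma, gamma)
            acc += convolve_same(image, kernel) ** 2
    combined = np.sqrt(acc)
    peak = combined.max()
    # ASSUMPTION: "normalized" is taken to mean scaling the maximum to 1.
    return combined / peak if peak > 0 else combined
```

On a synthetic image containing a single vertical step, the output is much larger at the step than in the flat regions on either side, which is the edge-enhancement behaviour the section relies on.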

C. GLCM (Gray Level Co-occurrence Matrix)
One disadvantage of texture information computed using only the
histogram is that it doesn't carry any information about the
relative position of the pixels with respect to each other. The
GLCM, or gray tone spatial dependence matrix (GTSDM), originally
proposed by Haralick et al. [10], has been used extensively to
extract texture features of images. A co-occurrence matrix is
generated, which measures how often different combinations of
pixel gray values occur in an image at a specified distance and
orientation.

Consider an image f = {f(x, y), 0 ≤ x ≤ M-1, 0 ≤ y ≤ N-1} of size
M x N with L gray (intensity) levels. The GLCM G is a square matrix
of order L. Each (i, j)th entry of G represents the number of times
a pixel with gray level i is adjacent to a pixel with gray level j.
Different spatial distances and 4 directions, 0°, 45°, 90° and
135°, are used for the generation of the GLCM. Second-order
statistical texture features, namely Autocorrelation, Contrast,
Correlation (MATLAB), Correlation (proposed), Cluster prominence,
Cluster shade, Dissimilarity, Energy, Entropy, Homogeneity
(MATLAB), Homogeneity (proposed), Maximum probability, Sum of
squares, Sum variance, Sum average, Sum entropy, Difference
variance, Difference entropy, Information measure of correlation 1,
Information measure of correlation 2, Inverse difference normalized
(INN) and Inverse difference moment normalized, are calculated from
the GLCM. For each of these 22 texture features, two values
(minimum and maximum) are obtained, thus forming 44 features for
each sample. The authors have used the computationally efficient
MATLAB code provided by [11] to calculate these 44 features.
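As an illustration of how a co-occurrence matrix and a few of the listed features are obtained, the following Python/NumPy sketch counts gray-level pairs for one pixel offset and computes four of the 22 features using their standard definitions. It is not the MATLAB code of [11]; the function names are hypothetical, and the non-symmetric counting (each pair counted in one direction only) is an assumption.

```python
import numpy as np

def glcm(image, dy, dx, levels):
    """Count pairs: level i at (y, x) together with level j at (y + dy, x + dx)."""
    image = np.asarray(image)
    h, w = image.shape
    # Restrict to reference pixels whose neighbour lies inside the image.
    y0, y1 = max(0, -dy), min(h, h - dy)
    x0, x1 = max(0, -dx), min(w, w - dx)
    ref = image[y0:y1, x0:x1]
    nbr = image[y0 + dy:y1 + dy, x0 + dx:x1 + dx]
    g = np.zeros((levels, levels), dtype=np.int64)
    np.add.at(g, (ref.ravel(), nbr.ravel()), 1)  # unbuffered accumulation
    return g

def texture_features(g):
    """Four second-order statistics computed from a co-occurrence matrix."""
    p = g / g.sum()                  # joint probability of (i, j)
    i, j = np.indices(p.shape)
    nz = p[p > 0]                    # skip zero entries in the entropy sum
    return {
        "contrast": float(np.sum(p * (i - j) ** 2)),
        "energy": float(np.sum(p ** 2)),
        "homogeneity": float(np.sum(p / (1.0 + np.abs(i - j)))),
        "entropy": float(-np.sum(nz * np.log(nz))),
    }

# Small 4 x 4 image with L = 4 gray levels, horizontal neighbour at distance 1.
example = np.array([[0, 0, 1, 1],
                    [0, 0, 1, 1],
                    [0, 2, 2, 2],
                    [2, 2, 3, 3]])
features = texture_features(glcm(example, dy=0, dx=1, levels=4))
```

For this example the matrix holds 12 pairs, and the contrast works out to 7/12 because only the pairs (0, 1), (0, 2) and (2, 3) lie off the diagonal.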
D. Data Normalization
The 44 x 500 feature matrix generated from the GLCM has widely
varying ranges of values, which are not suitable for
classification. To generate a matrix that can be applied as input
to the classifier, the data has to be normalized. Equation (4) is
used for data normalization.

$$F_N = 0.9 \times \frac{F - \min(F)}{\max(F) - \min(F)} + 0.1 \quad (4)$$

where $F_N$ is the normalised matrix and $F$ is the feature matrix;
the normalized data lies in the range [0.1, 1].
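A minimal Python/NumPy sketch of this normalization step follows. The function name is hypothetical, and two details are assumptions because the source does not state them: the minimum and maximum are taken per feature (per row of the 44 x 500 matrix), and a constant row is mapped to 0.1.

```python
import numpy as np

def normalize_features(F):
    """Scale each row (feature) of F linearly into the range [0.1, 1.0]."""
    F = np.asarray(F, dtype=float)
    f_min = F.min(axis=1, keepdims=True)
    f_max = F.max(axis=1, keepdims=True)
    # ASSUMPTION: a constant feature would give 0/0, so its span is set to 1
    # and the whole row maps to the lower bound 0.1.
    span = np.where(f_max > f_min, f_max - f_min, 1.0)
    return 0.9 * (F - f_min) / span + 0.1
```

For a row with values 0, 5 and 10 this returns 0.1, 0.55 and 1.0, so the smallest value maps to 0.1 and the largest to 1.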
E. Multilayer Perceptron Backpropagation Artificial Neural
Network (MLP-BP-ANN)
To classify the hardwood species into 25 classes, a Multilayer
Perceptron Artificial Neural Network trained with backpropagation
has been used. An ANN is a massively parallel distributed processor
in which a supervised or unsupervised learning process is used to
acquire knowledge [12]. A supervised learning approach (the target
patterns and training patterns are known to the ANN during the
learning process) is used in this paper to classify species of
wood.
Neural networks are inspired by biological brains [13-14] and are
composed of interconnected, interacting components commonly
referred to as nodes or neurons. Inputs are given to every node;
the nodes emulate biological neurons by performing operations on
the data and selectively passing the information on to other nodes.
The weights hold the information used to solve a particular
problem. Activation functions are required to calculate the output
response of a node, and in an MLP network nonlinear activation
functions are used to solve complex problems. A perceptron
(threshold unit) can learn anything that it can represent (anything
separable by a hyperplane). The bias improves the performance of
the neural network and acts as a weight on a connection from a unit
whose activation value is always 1.
The backpropagation learning algorithm [13] is applied to obtain
the weights of the MLP network. The algorithm consists of a forward
phase, in which activations (function signals) are propagated from
the input layer to the output layer through the hidden layer, and a
backward phase, which computes the error signal (the difference
between actual and target output values) and propagates it backward
through the network, from the output nodes to the input nodes, to
modify the weights of the network. Training aims to minimize the
mean square error over all the training patterns.
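The forward and backward phases can be illustrated with a small one-hidden-layer network in Python/NumPy. This is a sketch under stated assumptions, not the MATLAB toolbox network used in the paper: the function names are hypothetical, the tanh hidden units and sigmoid outputs are assumed activations, and the weight update is plain gradient descent rather than any of the training functions compared later. Samples are stored as columns, matching the 44 x 500 feature matrix layout.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_mlp(n_in, n_hidden, n_out, rng):
    """Small random weights, zero biases."""
    return {
        "W1": rng.normal(0.0, 0.5, (n_hidden, n_in)),
        "b1": np.zeros((n_hidden, 1)),
        "W2": rng.normal(0.0, 0.5, (n_out, n_hidden)),
        "b2": np.zeros((n_out, 1)),
    }

def forward(params, X):
    """Forward phase: propagate activations input -> hidden -> output."""
    H = np.tanh(params["W1"] @ X + params["b1"])
    Y = sigmoid(params["W2"] @ H + params["b2"])
    return H, Y

def mse(Y, T):
    """Mean square error over all outputs and all patterns."""
    return float(np.mean((Y - T) ** 2))

def backward(params, X, H, Y, T):
    """Backward phase: propagate the error signal output -> hidden."""
    d_z2 = 2.0 * (Y - T) / Y.size * Y * (1.0 - Y)       # through the sigmoid
    d_z1 = (params["W2"].T @ d_z2) * (1.0 - H ** 2)     # through the tanh
    return {
        "W1": d_z1 @ X.T, "b1": d_z1.sum(axis=1, keepdims=True),
        "W2": d_z2 @ H.T, "b2": d_z2.sum(axis=1, keepdims=True),
    }

def train(params, X, T, lr=0.1, epochs=200):
    """Plain gradient descent (an assumed, simple update rule); returns MSE history."""
    history = []
    for _ in range(epochs):
        H, Y = forward(params, X)
        history.append(mse(Y, T))
        grads = backward(params, X, H, Y, T)
        for name in params:
            params[name] -= lr * grads[name]
    return history
```

For the task in this paper the input size would be 44 and the output size 25 (one output per species, with one-hot targets); the hidden layer size is a free parameter.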
III. IMPLEMENTATION ASPECTS

A. Wood Database
A microscopic image database of 25 hardwood species [6] is used for
experimentation; the species are listed in Table I. The images have
a resolution of 1024 x 768 pixels. Each species has 20 images
(samples), so a total of 500 microscopic images of hardwood species
have been used in the experiment.
TABLE I. LIST OF 25 HARDWOOD SPECIES

Family          Genus           Species
Ephedraceae     Ephedra         californica
Lecythidaceae   Cariniana       estrellensis
Lecythidaceae   Couratari       sp
Lecythidaceae   Eschweilera     matamata
Lecythidaceae   Eschweilera     chartaccae
Sapotaceae      Chrysophyllum   sp
Sapotaceae      Micropholis     guianesis
Sapotaceae      Pouteria        pachycarpa
Fabaceae-Cae.   Copaifera       trapezifolia
Fabaceae-Cae.   Eperua          falcata
Fabaceae-Cae.   Hymenaea        courbaril
Fabaceae-Cae.   Hymenaea        sp
Fabaceae-Cae.   Schizolobium    parahyba
Fabaceae-Fab.   Pterocarpus     violaceus
Fabaceae-Mim.   Acacia          tucunamensis
Fabaceae-Mim.   Anadenanthera   colubrina
Fabaceae-Mim.   Anadenanthera   peregrina
Fabaceae-Fab.   Dalbergia       jacaranda
Fabaceae-Fab.   Dalbergia       spruceana
Fabaceae-Fab.   Dalbergia       variabilis
Fabaceae-Mim.   Dinizia         excelsa
Fabaceae-Mim.   Enterolobium    schomburgkii
Fabaceae-Mim.   Inga            sessilis
Fabaceae-Mim.   Leucaena        leucocephala
Fabaceae-Fab.   Lonchocarpus    subglaucescens

B. Processing Steps
The microscopic image samples of hardwood contain added artificial
colours that enhance the anatomical features. The first step is to
pre-process the microscopic images by converting each RGB image to
a gray scale image. Each image is then convolved with a Gabor
filter bank to enhance the edges of the image. The parameters
selected for the Gabor filter in our approach are: wavelength
λ = 8; orientation θ = 45°, 90°, 135°, 180°, 225°, 270°, 315° and
360°; phase offset ψ = 0° and 90°; [symbol missing in source] = 1;
and spatial aspect ratio γ = 0.5. An L-2 norm super-imposed
normalized image over the 8 orientations is obtained, which
enhances the edges of the microscopic image. The Gabor-processed
image is then passed to the GLCM block to extract the texture
features. A spatial distance of two pixels between the pixel of
interest and its neighbouring pixel, along with the 0° and 180°
orientations, has been used to extract the texture features from
each Gabor-filtered image. The minimum and maximum values of each
texture feature are taken to obtain 44 features from the GLCM
technique. For 500 images, one feature matrix of size 44 x 500 has
been generated, which is then used by the classifier for
classification of the different hardwood species.
Two sets of experiments are performed to investigate the
performance of MLP-BP-ANN for the classification of hardwood
species. The pattern recognition toolbox of MATLAB R2012b has been
used to implement the classification, with the aim of evaluating
the performance of different backpropagation training functions of
MLP-BP-ANN.
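The way 22 texture features per orientation become 44 features per image, and then one 44 x 500 matrix, can be sketched as follows in Python/NumPy. The function names are hypothetical, and placing the 22 minima before the 22 maxima in each column is an assumed ordering that the source does not specify.

```python
import numpy as np

def min_max_feature_vector(per_orientation):
    """Map an (n_orientations, n_features) array to a 2 * n_features vector."""
    values = np.asarray(per_orientation, dtype=float)
    # ASSUMPTION: minima first, then maxima.
    return np.concatenate([values.min(axis=0), values.max(axis=0)])

def build_feature_matrix(samples):
    """Stack one column per image: result has shape (2 * n_features, n_images)."""
    return np.column_stack([min_max_feature_vector(s) for s in samples])

# Stand-in data: 500 images, 2 orientations, 22 texture features each.
rng = np.random.default_rng(0)
fake_samples = [rng.random((2, 22)) for _ in range(500)]
feature_matrix = build_feature_matrix(fake_samples)  # shape (44, 500)
```

The random values only stand in for real GLCM features; the point of the sketch is the shape of the result, which matches the 44 x 500 matrix passed to the classifier.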
IV. RESULTS & DISCUSSIONS

Hardwood species classification has been performed here with a
supervised machine learning technique, MLP-BP-ANN. The 25 different
hardwood species of the microscopic image database have been
assigned as the output targets of the machine learning technique.
In this work 20 samples of each species are considered, so in total
500 samples of hardwood images are classified with the proposed
supervised machine learning technique. To evaluate the performance
of the MLP-BP-ANN classifier, two experiments are performed. All
the results have been generated using an i7 processor, 16 GB RAM,
the 64-bit Windows 7 operating system and MATLAB R2012b (64-bit).

A. Experiment 1
The feature dataset is divided into 3 parts: training, validation
and testing datasets. Out of 500 samples, 350 (70%), 75 (15%) and
75 (15%) samples are used for the training, validation and testing
datasets, respectively. The neural network pattern recognition
toolbox of MATLAB has been used with different training functions
for the classification of the hardwood species. The classification
accuracy obtained for all 25 hardwood species with each training
function is listed in Table II.

TABLE II. CLASSIFICATION ACCURACY FOR ALL 25 HARDWOOD SPECIES WITH
VARIOUS TRAINING FUNCTIONS OF MLP-BP-ANN

Training   MSE      No. of Hidden   Classification   Execution Time
Function            Neurons         Accuracy (%)     (seconds)
trainlm    0.0100   12              88.60            84.41789
trainscg   0.0097   84              84.80            8.106982
trainrp    0.0127   123             82.00            7.416739
traincgb   0.0142   148             75.40            11.34363
traincgp   0.0125   87              75.40            9.072579
traingdx   0.0178   31              72.40            6.944351
trainoss   0.0259   166             52.60            15.894873
traincgf   0.0251   109             51.40            8.573005
traingda   0.0270   44              51.00            7.55883
trainbfg   0.0345   29              23.80            319.181743
traingd    0.0394   166             10.40            18.498352
traingdm   0.0394   166             10.00            18.609838

The Levenberg-Marquardt backpropagation (trainlm) training function
[15] resulted in the best classification accuracy, 88.60%. The
trainscg (scaled conjugate gradient backpropagation) training
function [16] resulted in a classification accuracy of 84.80%.
Although trainscg gave a comparatively lower classification
accuracy, it took less computation time than trainlm. Since an
offline hardwood species classification system is considered,
classification accuracy is of utmost importance compared with the
few extra seconds needed to generate the result. Further, the
gradient descent with momentum backpropagation (traingdm) training
function resulted in the lowest classification accuracy, 10%. The
receiver operating characteristic (ROC) curve (a plot of true
positive rate versus false positive rate as the threshold varies)
is shown in Fig. 3 and Fig. 4 for the trainlm and trainscg
functions, respectively. Perfect results are obtained when all the
test points are concentrated in the upper left corner of the ROC
plot. The ROC curve in Fig. 3 has a higher concentration of points
towards the upper left corner than that in Fig. 4.

Fig. 3. Receiver Operating Characteristic (trainlm)

Fig. 4. Receiver Operating Characteristic (trainscg)
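The 70%/15%/15% partition of the 500 samples used in Experiment 1 can be reproduced with a short Python/NumPy sketch. The function name and the seed are hypothetical, and drawing the three subsets from a random permutation is an assumption; the source does not say how samples were assigned to each subset.

```python
import numpy as np

def split_indices(n_samples, ratios, seed=0):
    """Partition range(n_samples) into training, validation and test indices."""
    n_train = int(round(ratios[0] * n_samples))
    n_val = int(round(ratios[1] * n_samples))
    # ASSUMPTION: samples are assigned at random; the test set takes the remainder.
    order = np.random.default_rng(seed).permutation(n_samples)
    return order[:n_train], order[n_train:n_train + n_val], order[n_train + n_val:]

train_idx, val_idx, test_idx = split_indices(500, (0.70, 0.15, 0.15))
# Sizes are 350, 75 and 75, matching the counts reported for Experiment 1.
```

Calling the same function with other ratio tuples gives the alternative partitions examined in the second experiment.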

B. Experiment 2
In this experiment, the performance of MLP-BP-ANN has been
evaluated with the trainlm and trainscg training functions using
datasets with different proportions of training, validation and
testing samples.

TABLE III. CLASSIFICATION ACCURACY FOR ALL 25 HARDWOOD SPECIES WITH
TRAINLM AND TRAINSCG FUNCTIONS OF MLP-BP-ANN

T.F.       Tr/Va/Te      MSE      No. of   C. A.   E. T.
           Ratio in %             H. N.    (%)     (seconds)
trainlm    80/10/10      0.0061   23       92.60   403.897
trainlm    75/10/15      0.0084   11       90.20   69.355
trainlm    55/15/30      0.0111   29       85.60   569.530
trainscg   80/10/10      0.081    146      88.80   10.305
trainscg   75/10/15      0.0126   135      79.60   10.016
trainscg   55/15/30      0.0143   40       79.80   7.165

In Table III, T.F.: Training Function, Tr/Va/Te: Training /
Validation / Test, H. N.: Hidden Neurons, C. A.: Classification
Accuracy, and E. T.: Execution Time.
It has been observed that the trainlm function exhibits 92.60%
classification accuracy for a training, validation and test dataset
ratio of 80%, 10% and 10%, respectively, whereas for the same ratio
the trainscg function resulted in a classification accuracy of
88.80%, as listed in Table III. It may be noted that the
classification accuracies in Table III are the average of the
training, validation and testing classification accuracies.

Fig. 5. Receiver Operating Characteristic (trainlm)

Fig. 6. Receiver Operating Characteristic (trainscg)

The ROC curve of the trainlm function shows a higher concentration
of data points in the upper left corner, which supports the strong
performance of MLP-BP-ANN using the trainlm function, as shown in
Fig. 5. Compared with trainlm, the trainscg function produced
poorer classification accuracy. This is reflected in the ROC curve
shown in Fig. 6, where fewer data points are concentrated in the
upper left corner.

V. CONCLUSION

In this work, hardwood species classification has been performed
with a supervised machine learning technique, MLP-BP-ANN. 25
different hardwood species from an open-access database have been
considered. 44 features have been extracted with the GLCM technique
and the data have been normalized as part of the proposed
methodology. This work has identified the Gabor filter as an
effective pre-processing tool to enhance the edges of the complex
cellular structure in hardwood images. Further, the authors have
revalidated GLCM as an effective texture feature extraction
technique for hardwood images. In experiment 1, the classification
accuracies of all the backpropagation training functions have been
compared, and it is found that the Levenberg-Marquardt
backpropagation training function has the best classification
accuracy, 88.60%, among all the training functions for a training,
validation and test dataset ratio of 70%, 15% and 15%,
respectively. In experiment 2, the Levenberg-Marquardt
backpropagation training function resulted in 92.60% classification
accuracy for a training, validation and test dataset ratio of 80%,
10% and 10%, respectively. The proposed methodology can be
implemented for online identification of hardwood species using
better feature extraction techniques and optimized machine learning
techniques. Thus an effective hardwood recognition tool has been
implemented in this paper to assist human experts in hardwood
identification.

REFERENCES
[1] B. Bond, Wood Identification for Hardwood and Softwood Species
Native to Tennessee, Agricultural Extension Service, University of
Tennessee, 2002.
[2] E. A. Wheeler and P. Baas, "Wood identification-a review," IAWA
Jl.(NS), vol. 19, 1998, pp. 241-264.
[3] J. Y. Tou, P. Y. Lau, and Y. H. Tay, "Computer vision-based wood
recognition system," in Proceedings of International Workshop on
Advanced Image Technology, 2007.
[4] M. Khalid, E. L. Y. Lee, R. Yusof, and M. Nadaraj, "Design of an
intelligent wood species recognition system," International Journal
of Simulation System, Science and Technology, vol. 9, 2008,
pp. 9-19.
[5] B.-h. Wang, H.-j. Wang, and H.-n. Qi, "Wood recognition based on
grey-level co-occurrence matrix," in Computer Application and System
Modeling (ICCASM), 2010 International Conference on, 2010,
pp. V1269-V1-272.
[6] J. Martins, L. Oliveira, S. Nisgoski, and R. Sabourin, "A
database for automatic classification of forest species," Machine
Vision and Applications, 2013, pp. 1-12.
[7] J. G. Daugman, "Uncertainty relation for resolution in space,
spatial frequency, and orientation optimized by two-dimensional
visual cortical filters," Optical Society of America, Journal, A:
Optics and Image Science, vol. 2, 1985, pp. 1160-1169.
[8] P. Kruizinga and N. Petkov, "Nonlinear operator for oriented
texture," Image Processing, IEEE Transactions on, vol. 8, 1999,
pp. 1395-1407.
[9] P. Kruizinga, N. Petkov, and S. E. Grigorescu, "Comparison of
texture features based on Gabor filters," in Image analysis and
processing, 1999. Proceedings. International conference on, 1999,
pp. 142-147.
[10] R. M. Haralick, K. Shanmugam, and I. H. Dinstein, "Textural
features for image classification," Systems, Man and Cybernetics,
IEEE Transactions on, 1973, pp. 610-621.
[11] http://www.mathworks.in/matlabcentral/fileexchange/22354glcmfeatures4-m-vectorized-version-of-glcmfeatures1-m-with-codechanges.
[12] S. Haykin, Neural networks: a comprehensive foundation,
Prentice Hall PTR, 1994.
[13] D. E. Rumelhart and J. L. McClelland, "Parallel distributed
processing: explorations in the microstructure of cognition.
Volume 1. Foundations," 1986.
[14] G. F. Luger and W. A. Stubblefield, "Artificial intelligence:
structures and strategies for complex problem solving," 1993.
[15] M. I. Lourakis, "A brief description of the Levenberg-Marquardt
algorithm implemented by levmar," Institute of Computer Science,
Foundation for Research and Technology, vol. 11, 2005.
[16] M. F. Møller, "A scaled conjugate gradient algorithm for fast
supervised learning," Neural networks, vol. 6, 1993, pp. 525-533.
