
Unsupervised classification of grayscale image using

Probabilistic Neural Network (PNN)

*Jawad Iounousse, Ahmed Farhi, Ahmed El motassadeq, Hassan Chehouani, Salah Erraki
Laboratoire des Procedes, Metrologie et Materiaux pour l'Energie et L'Environnement
Faculte des Sciences et Techniques
Marrakech, Morocco
iounousse@gmail.com, ahmedfarhi@gmail.com, motassadeq@gmail.com,
chehouani@fstg-marrakech.ac.ma, s.erraki@gmail.com

Abstract-Image classification is a very common step in the image analysis process. It is a low-level processing that precedes the steps of measuring, understanding and decision. Its purpose is to partition the image into related, homogeneous regions in the sense of a homogeneity criterion.
In this paper, we propose a procedure to determine the optimal number of classes in a grayscale image classification based on a Probabilistic Neural Network (PNN). The procedure is completely automatic, with no parameter adjusting. The results on synthetic images show high robustness and good performance, and indicate that the PNN is a good technique for classifying one-dimensional data.

Keywords-image processing; classification; probabilistic neural network; automation; cluster validity index; grayscale

I. INTRODUCTION

Classification is a concept which occurs frequently in real life: it is often desirable to group the elements of a heterogeneous set into as limited a number of homogeneous classes as possible. It plays an important role in resolving many problems in pattern recognition, imaging, color image segmentation and data mining, and in different domains such as medicine, biology, marketing, land use, etc.

We speak of unsupervised classification, or clustering, when there is no information on the variables to be processed, and of supervised classification otherwise. In this study, we developed a methodology for unsupervised classification, which aims to search for homogeneous groups in a multidimensional mixture where the number of groups is unknown. The classification results obtained depend strongly on the number of classes fixed, so it is important to choose the correct number of classes to achieve a good quality of classification. This is not always easy, especially in the case of overlapping clusters.

Reference [1] showed that classification is the most researched topic of neural networks and confirmed that neural networks are a promising alternative to various conventional classification methods. The advantage of neural networks is that they use self-adaptive methods to adjust to the data without any explicit specification.

An alternative to the complex-domain neural network for classification is the probabilistic neural network (PNN), an implementation of the Bayes optimal decision rule in the form of a neural network. Several recent studies [2][3][4][5] have used it for classification. This technique provides satisfactory results, but the user is initially obliged to define the classes. In this paper, we propose an automation process for grayscale image classification that solves this problem and chooses the correct number of classes for the PNN training by using a validity index. This procedure is inspired by a method [6][7] used for automating the celebrated Fuzzy C-Means (FCM) technique.

II. DESCRIPTION OF THE PROBABILISTIC NEURAL NETWORK (PNN)

In 1990, Donald F. Specht proposed a network based on nearest neighbor classifiers and named it the "Probabilistic Neural Network" (PNN) [8]. It is often used for data classification problems such as noise classification [2], protein superfamily classification [3], ship classification [4], classification of highway vehicles [5] and others. The PNN performs a supervised classification because the classes are listed in advance.

The architecture of a PNN [9] is presented in Fig. 1: an input layer, Layer 1 (a radial basis layer computing a1 = radbas(||W1,1 - p|| b1)) and Layer 2 (a competitive layer computing a2 = compet(W2,1 a1)), where
Q = number of input/target pairs = number of neurons in layer 1,
K = number of classes of input data = number of neurons in layer 2,
R = number of elements in the input vector (pixels).

Figure 1. Architecture of a PNN
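As a concrete illustration, the two-layer computation of Fig. 1 can be sketched as follows. This is a minimal NumPy sketch under our own naming, with a Gaussian radial basis function (matching MATLAB-style `radbas(n) = exp(-n^2)`) and one exemplar per class; it is not the authors' code.

```python
import numpy as np

def radbas(n):
    # MATLAB-style radial basis transfer function: exp(-n^2)
    return np.exp(-n ** 2)

def pnn_forward(pixels, centers, spread):
    """Classify 1-D pixel values against K class centers (sketch of Fig. 1)."""
    b = np.sqrt(-np.log(0.5)) / spread           # bias so radbas(b * spread) = 0.5
    # Layer 1 (RBL): distance of every pixel to every class center, through radbas
    dist = np.abs(pixels[:, None] - centers[None, :])    # shape (n_pixels, K)
    a1 = radbas(dist * b)
    # Layer 2 (competitive): 1 for the most probable class, 0 elsewhere
    winners = np.argmax(a1, axis=1)
    a2 = np.zeros_like(a1)
    a2[np.arange(len(pixels)), winners] = 1.0
    return a2

pixels = np.array([10.0, 12.0, 200.0, 205.0])
centers = np.array([11.0, 202.0])
# two pixels near 11 fall in class 0, two near 202 in class 1
print(pnn_forward(pixels, centers, spread=30.0))
```

In a full PNN, Layer 2 would sum the contributions of all Q training exemplars belonging to each class before competing; with a single center per class, as here, that sum is trivial.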
PNN is composed of two layers:

978-1-4673-1520-3/12/$31.00 ©2012 IEEE


• 1st layer, "Radial Basis Layer": When an input P is presented, this layer computes the distances from the target vectors representing the defined classes to the training input vectors (the pixels of the image in our case) and produces a vector whose elements indicate how close each class is to the training set. This vector is multiplied, element by element, by the bias and then sent to the radial basis transfer function. An input vector close to a target vector is represented by a number close to 1 in the output vector a1. In this paper, we denote this layer RBL.

• 2nd layer, "Competitive Layer": This layer sums these contributions for each class of inputs to produce a vector of probabilities as a net output. Finally, a competitive transfer function applied to this output selects the maximum of these probabilities and produces 1 for the largest element of n2 and 0s elsewhere.

It is assumed that there are Q input vector/target vector pairs. Each target vector has K elements; one of these elements is 1 and the rest are 0. Thus, each input vector is assigned to one of the K classes, the class with the maximum probability of being correct.

III. AUTOMATION PROCESS

The PNN has been widely used for classification. The algorithm initially requires the target vectors, which are not easy to find: the choice of the target classes and of their number should be made without errors. An evaluation methodology is therefore required to determine the optimal number of clusters k*; it is called a cluster validity index.

The process to calculate the cluster validity index is summarized in the following steps:
• Step 1: Select the target classes for a number of clusters k.
• Step 2: Apply the PNN algorithm for different values of k, with k = 2, 3, ..., Cmax (Cmax is set by the user).
• Step 3: Calculate the validity index for each partition obtained in step 2.
• Step 4: Select the optimal number of clusters k*.

The schema in Fig. 2 illustrates the automation process.

Figure 2. The automation procedure

To make the PNN automatic, we use the output of the RBL, which takes the form of a matrix of probabilities. By analogy with the methodology used for FCM [6][7], this matrix allows us to determine the target vector (classes) by calculating a validity index Vin while varying the number of classes N in a given interval. The optimal number of classes is obtained when Vin reaches its maximum value.

A. Choice of classes

We start by choosing the number of classes N in the range [Cmin, Cmax] (Fig. 2). We make a linear distribution of the input data values into N classes, allowing a margin of 2 sp between two classes and a margin of 1 sp at the extremities, as illustrated in Fig. 3.

Figure 3. Distribution of input data values in N classes

Here sp is the spread of the radial basis transfer function of the RBL. The peak of the radial basis function is always at its center, and the spread sp varies with the number of classes N to avoid the risk of mutual influence between distant points:

sp = (max - min) / (N + 1)    (1)

In order to move closer to the dominant classes in the input data, we use the histogram of the input image to shift each class Ci within the interval [Ci - sp, Ci + sp], replacing it by the weighted mean (mi) of the input values belonging to this interval (Fig. 4), where the weight of each value is the number of pixels having it.

Figure 4. Displacement of classes according to the histogram

B. Validity index computing

The displaced classes represent the target vectors of the RBL. At the output of this layer, we retrieve a probability matrix Uik which represents the membership of a pixel Xk in a class Ci. The matrix is normalized in the interval [0, 1] with Σi Uik = 1.
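The class initialization of Sec. III.A can be sketched as follows. The exact placement of the N centers is ambiguous in the extracted text; here we space them uniformly at min + k·sp for k = 1..N, which is consistent with Eq. (1), and use the image histogram for the displacement. Function and variable names are ours.

```python
import numpy as np

def init_classes(pixels, N):
    """Place N class centers per Eq. (1), then shift each center to the
    histogram-weighted mean of the pixel values within [c - sp, c + sp]."""
    lo, hi = pixels.min(), pixels.max()
    sp = (hi - lo) / (N + 1)                       # Eq. (1)
    centers = lo + sp * np.arange(1, N + 1)        # linear placement, sp margins
    values, counts = np.unique(pixels, return_counts=True)   # image histogram
    for i, c in enumerate(centers):
        mask = (values >= c - sp) & (values <= c + sp)
        if mask.any():
            # weighted mean: the weight of a value is its pixel count
            centers[i] = np.average(values[mask], weights=counts[mask])
    return centers, sp
```

For example, with 90 pixels at 10, 10 at 50 and 100 at 90 and N = 3, the first center is pulled toward the dominant value 10 and the last toward 90.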
Then we compute the validity index Vin defined by (2):

Vin(C, U, N) = (1/C) Σ(k=1..C) max_i(Uik)    (2)

where:
C: number of vectors to classify;
N: number of classes;
U: normalized membership matrix produced by the RBL;
max(U): the maximum value of U associated with each pixel; it represents the closest class to the pixel.

In other words, Vin is the average over all pixels of their maximum class membership. By varying the number of classes N in a given interval [Cmin, Cmax], the maximum of the index Vin corresponds to the optimal distribution of the N classes. After determining the adequate number of classes N, we perform a PNN classification using the obtained classes Ci as the PNN's target vectors.

IV. RESULTS

Synthetic and real grayscale images (Fig. 5) were used to test the developed method. The objective of these tests is to show whether the decision on the classes and their number works well for any type of grayscale distribution (simple, real and complex images).

We tested our algorithm on a synthetic image (Fig. 5-a) representing a gradient of eight levels of gray. In this case, we chose the number of classes N in the range [Cmin=3, Cmax=10] to see whether our algorithm determines the exact number of classes. Tab. I summarizes the obtained results: the maximum validity index (0.969) corresponds to a number of classes of 8.

TABLE I. VARIABILITY OF VALIDITY INDEX WITH NUMBER OF CLASSES FOR A SYNTHETIC IMAGE (FIG. 5-A)
Number of classes | Index Vin

The results displayed in Tab. I and Fig. 6-a show that our method recognized the eight classes existing in the image.

We then tested our method on the real image of a Moroccan tile (zelij) (Fig. 5-b). This image appears to contain five levels of gray, so we chose the number of classes N in the range [Cmin=3, Cmax=8]. The maximum validity index (0.882) is associated with a number of classes of 5 (Tab. II and Fig. 6-b).

TABLE II. VARIABILITY OF VALIDITY INDEX WITH NUMBER OF CLASSES FOR A REAL IMAGE (FIG. 5-B)
Number of classes | Index Vin
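The computation of the membership matrix Uik and of the index Vin in (2) can be sketched as follows, under the same assumptions as before (Gaussian radial basis units, memberships normalized to sum to 1 per pixel; names are ours):

```python
import numpy as np

def validity_index(pixels, centers, sp):
    """Vin of Eq. (2): mean over pixels of the maximum normalized membership."""
    b = np.sqrt(-np.log(0.5)) / sp
    # RBL output: membership of every pixel (rows) in every class (columns)
    U = np.exp(-((np.abs(pixels[:, None] - centers[None, :]) * b) ** 2))
    U = U / U.sum(axis=1, keepdims=True)   # normalize so each row sums to 1
    return U.max(axis=1).mean()            # close to 1 means a crisp partition
```

Centers that sit on the actual gray levels give memberships close to 0 or 1 and hence a Vin near 1; centers placed between levels give ambiguous memberships and a lower Vin, which is what the sweep over N exploits.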

A final test was applied to another real image with a complex distribution of grayscale (Fig. 5-c). The image contains a mixture of three classes. The number of classes N is taken between Cmin=2 and Cmax=8. The obtained results showed that the maximum validity index (0.823) corresponds to 3 classes (Tab. III).

TABLE III. VARIABILITY OF VALIDITY INDEX WITH NUMBER OF CLASSES FOR A REAL IMAGE WITH COMPLEX DISTRIBUTION OF GRAYSCALE (FIG. 5-C)
Number of classes | Index Vin

Our algorithm was able to recognize the classes. The results in Tab. III and Fig. 6-c demonstrate the robustness of our methodology in distinguishing the classes and choosing their adequate number for a complex distribution of grayscale.

Figure 5. Grayscale images: (a) Synthetic image (b,c) Real images
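Putting the pieces of Sec. III together, the selection loop (sweep N over [Cmin, Cmax], keep the N maximizing Vin) can be sketched as follows. This is our own minimal reconstruction under the assumptions stated earlier, not the authors' code, exercised on a tiny synthetic two-level "image":

```python
import numpy as np

def best_class_count(pixels, cmin, cmax):
    """Sweep N over [cmin, cmax]; return (N*, centers*) maximizing Vin."""
    best_vin, best_n, best_centers = -1.0, None, None
    lo, hi = pixels.min(), pixels.max()
    values, counts = np.unique(pixels, return_counts=True)   # histogram
    for N in range(cmin, cmax + 1):
        sp = (hi - lo) / (N + 1)                    # Eq. (1)
        centers = lo + sp * np.arange(1, N + 1)     # linear placement
        for i, c in enumerate(centers):             # histogram displacement
            m = (values >= c - sp) & (values <= c + sp)
            if m.any():
                centers[i] = np.average(values[m], weights=counts[m])
        b = np.sqrt(-np.log(0.5)) / sp
        U = np.exp(-((np.abs(pixels[:, None] - centers[None, :]) * b) ** 2))
        U /= U.sum(axis=1, keepdims=True)
        vin = U.max(axis=1).mean()                  # Eq. (2)
        if vin > best_vin:
            best_vin, best_n, best_centers = vin, N, centers
    return best_n, best_centers

# two well-separated gray levels: the sweep should settle on N = 2
pixels = np.repeat([15.0, 240.0], 400)
```

A final PNN classification would then be run with the returned centers as target vectors, as described above.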

Figure 7. Classified images using the automatic FCM method: (a) Synthetic image (b,c) Real images

For all images, FCM detects the wrong number of classes. There is a remarkable difference between the two methods in determining the number of classes, even though they use the same concept.
To further demonstrate the robustness and performance of our method, we compared it with the automatic FCM method using the same images. Figures 6 and 7 illustrate the weaknesses of the FCM method compared to ours.

Figure 6. Classified images using our method: (a) Synthetic image (b,c) Real images

V. CONCLUSION AND PERSPECTIVES

We developed in this paper an automation process that transforms the supervised PNN classification into an unsupervised one by determining the adequate number of classes using a validity index.

We tested our method on different synthetic and real grayscale images and compared it with the automatic FCM method, which uses the same concept to detect the number of classes. The results showed that our method proves to be a good and reliable technique for classifying one-dimensional data, especially when the number of classes is unknown. Among the advantages of our algorithm are that only the interval [Cmin, Cmax] needs to be configured and that the PNN training is easy and nearly instantaneous.

For critical examples such as noisy images, as with several other classification methods, a preliminary filtering is applied before using our method.

Finally, it should be noted that the method was only tested on synthetic and real images. Further study including satellite images with different spatial resolutions is desirable for testing the proposed technique in order to obtain an unsupervised land cover classification.

REFERENCES

[1] G.P. Zhang, "Neural networks for classification: a survey," IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 30, no. 4, November 2000.
[2] T. Santhanam and S. Radhika, "Probabilistic neural networks - a better solution for noise classification," Journal of Theoretical and Applied Information Technology (JATIT), vol. 27, no. 1, May 2011, pp. 39-42.
[3] P.V. Nageswara Rao, T. Uma Devi, D.S.G.V.K. Kaladhar, G.R. Sridhar and Allam Appa Rao, "A probabilistic neural network approach for protein superfamily classification," Journal of Theoretical and Applied Information Technology (JATIT), vol. 6, no. 1, 2009, pp. 101-105.
[4] L. Fallah Araghi, H. Khaloozade and M. Reza Arvan, "Ship identification using probabilistic neural networks (PNN)," Proceedings of the International MultiConference of Engineers and Computer Scientists 2009, vol. II, IMECS 2009, March 18-20, 2009, Hong Kong.
[5] V. Kwigizile, M. Selekwa and R. Mussa, "Highway vehicle classification by probabilistic neural networks," Proceedings of the Seventeenth International Florida Artificial Intelligence Research Society Conference, FLAIRS 2004, May 12-14, 2004, Miami, Florida.
[6] Weina Wang and Yunjie Zhang, "On fuzzy cluster validity indices," Fuzzy Sets and Systems, vol. 158, 2007, pp. 2095-2117.
[7] H. Sahbani Mahersia and K. Hamrouni, "Segmentation d'images texturées par transformée en ondelettes et classification C-moyenne floue" [Segmentation of textured images by wavelet transform and fuzzy C-means classification], SETIT, Tunisia, March 2005.
[8] D.F. Specht, "Probabilistic neural networks," Neural Networks, vol. 3, no. 1, pp. 109-118, 1990.
[9] P.D. Wasserman, Advanced Methods in Neural Computing, New York: Van Nostrand Reinhold, 1993, pp. 35-55.