Neurocomputing 73 (2010) 1244–1255

A danger theory inspired artificial immune algorithm for on-line supervised two-class classification problem

Chenggong Zhang a,*, Zhang Yi b

a The Computational Intelligence Laboratory, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610054, PR China
b The Machine Intelligence Laboratory, College of Computer Science, Sichuan University, Chengdu 610054, PR China

Article history: Received 12 January 2009; received in revised form 6 January 2010; accepted 9 January 2010; available online 28 January 2010. Communicated by T. Heskes.

Keywords: Artificial immune system; Danger theory; Supervised classification; Computational intelligence

Abstract

Self–nonself discrimination has long been the fundamental model of modern theoretical immunology. Based on this principle, some effective and efficient artificial immune algorithms have been proposed and applied to a wide range of engineering applications. Over the last few years, a new model called "danger theory" has been developed to challenge the classical self–nonself model. In this paper, a novel immune algorithm inspired by danger theory is proposed for solving on-line supervised two-class classification problems. The general framework of the proposed algorithm is described, and several essential issues related to the learning process are also discussed. Experiments based on both artificial data sets and real-world problems are carried out to visualize the learning process, as well as to evaluate the classification performance of our method. It is shown empirically by the experimental results that the proposed algorithm exhibits competitive classification accuracy and generalization capability.

© 2010 Elsevier B.V. All rights reserved.

1. Introduction

Over the past 50 years, it was widely believed that the immune system functions by discriminating between the self and the nonself. This simple and straightforward idea became one of the foundations of modern immunology, which we call the SNS (Self–NonSelf) model. However, recently a different viewpoint has appeared, and it has attracted much interest in both the theoretical immunology and artificial immunology communities. This new model is the "danger model" proposed by Matzinger in 1994 [11,13]. The essential difference between the two is that the danger model does not suggest that the discrimination between self and nonself is the key factor in triggering the immune response. Although the extent to which the danger model factually reflects the inner mechanisms of the immune system remains controversial [20], it is uncontroversial to extract the promising mechanisms from this new theory to help us design new bionic techniques. From the point of view of artificial immunology, we are actually interested in applying the metaphors behind the danger model, especially those that the former models cannot offer us, to design novel artificial immune algorithms.

The danger theory assumes that cells that undergo unnatural deaths may release danger signals to a small area around that cell. This area is called the "danger zone" [3]. On the other hand, the danger signal should not be sent by healthy cells or by cells undergoing normal physiological deaths [13]. The APCs (antigen-presenting cells) that receive the danger signal within the danger zone are activated and co-stimulate the B-cells or helper T-cells that have already captured the antigen, i.e., received the nonself signal. Under the effect of these two types of signals, the B-cells undergo the clonal selection process. Even if a B-cell or helper T-cell outside the danger zone captures an antigen, it cannot be stimulated because it does not receive a danger signal from an APC. The danger zone thus establishes a way to locally develop an antibody population, thereby preventing interference from antigens in distant areas.

In this paper, we present a new immune algorithm inspired by danger theory and apply it to solve two-class classification problems. The concept of the "danger zone", derived from danger theory, is incorporated into our model in order to effectively develop an antibody population. The danger zone in the proposed model is defined as an area in the shape-space that is caused by antigenic stimulation. The size of the danger zone gradually decreases as the accumulated intensity of the antibody response increases. This strategy, as far as we know, is the first attempt at introducing time-varying antigen status into artificial immunology. In addition, based on the concept of danger zones, two kinds of signals, i.e., nonself and danger signals, are adopted in the proposed model. The precondition for triggering the antibody response is that the antibody receives both nonself signals and danger signals simultaneously. Although the signal mechanisms in our model are somewhat different from those of Matzinger's model, the model is effective in terms of both classification accuracy and computational

* Corresponding author. E-mail addresses: zhchgo@gmail.com, cgzh@hotmail.com (C. Zhang), zhangyi@scu.edu.cn (Z. Yi).

0925-2312/$ - see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.neucom.2010.01.005

complexity. A suppression mechanism is also implemented to prevent the antibody population from growing too fast. Furthermore, a clonal selection mechanism based on the historical performance records of antibodies is used to guarantee that the antibodies capable of recognizing antigens more effectively will proliferate and survive longer.

The rest of the paper is organized as follows. In Section 2, we present a brief review of the danger theory, especially of those mechanisms directly related to our model, and summarize some related work in the artificial immune community that shares a common inspiration with our model. In Section 3, we outline the framework of the proposed algorithm and detail several essential issues related to its learning procedure. In Section 4, experimental results based on artificial data sets as well as on real-world problems are presented to show the effectiveness of our algorithm. We draw conclusions and give directions for future research in Section 5.

2. Background and related work

Because danger theory is a relatively new field in both theoretical immunology and artificial immunology, it is necessary to give a brief review. For simplicity, we focus our description mainly on the mechanisms directly related to our model. The reader can refer to [11–13,7] for a comprehensive understanding of this theory.

An easy way of understanding the danger theory is to point out the similarities between it and the SNS model. As Matzinger stated [12], both types of models agree that there is a need for some sort of discrimination at the effector stage of the immune response, and that this critical need to discriminate is the evolutionary selection pressure behind the mechanisms that endow T-cell receptors and antibodies with their enormous range and exquisite degrees of specificity. There are, however, differences between the danger model and the SNS model. The danger model suggests that the immune system is more concerned with damage than with foreignness, and is called into action by alarm signals from injured tissues rather than by the recognition of the nonself. In a word, the immune system responds to dangers, but not always to foreigners. This description might seem to be a mere renaming of the SNS model: if all self cells were benign and all nonself substances dangerous, the danger model would be unnecessary. However, there indeed exist some kinds of dangerous self entities (e.g., some mutations) and benign, even beneficial, nonself entities (e.g., fetuses). It is these special self and nonself entities that give rise to phenomena that the classic SNS model cannot explain very well. For example, there are a number of harmless viruses in our body and, more specifically, in the food that we ingest and in our gut, but why then does the immune system not respond to them? Why do mothers not reject their fetuses? Why are tumors not rejected, even though they carry "foreign" tumor-specific antigens? These observations that do not fit well with the SNS model became the driving force behind the creation of the danger theory.

The danger theory assumes that cells that undergo unnatural death may release danger signals that cover a small area around that cell. This area is called the "danger zone" [3]. However, the danger signal should not be sent by healthy cells or by cells undergoing normal physiological deaths [13]. The APCs (antigen-presenting cells) that receive the danger signal within the danger zone are activated and co-stimulate the B-cells or helper T-cells that have already captured the antigen, i.e., received the nonself signal. Under the effect of these two types of signals, the B-cells undergo the clonal selection process. Even if B-cells or helper T-cells outside the danger zone capture the antigen, they cannot be stimulated because they do not receive danger signals from an APC. The above mechanisms, especially the concept of the danger zone and the two types of signals, become the primary metaphors that we take from the danger theory. It should be pointed out that in our model, the process of immune response is simplified. The co-stimulation signal is directly detected by antibodies rather than relayed by APCs and helper T-cells. Moreover, it is the antibodies rather than the B-cells that undergo clonal selection to produce offspring.

The danger theory has attracted many researchers in recent years and is showing its potential in various areas such as data mining [3,17] and anomaly detection [2,3,9]. The first paper directly concerning the application of danger theory in artificial immune systems is [3]. In this conceptual paper, the authors propose that the classic negative selection algorithm for artificial immune systems is bound to be imperfect. (We think the imperfection of negative selection is due to the imperfection of the SNS model, e.g., the lack of a reasonable explanation for some phenomena with respect to a changing self.) They also propose that, because of the deficiency of previous algorithms, there may be some potential application areas, such as anomaly detection and data mining, that can accommodate this new theory. The authors pointed out that the principal problem in applying danger theory to engineering applications is to suitably define the danger signal. Based on their work, the idea of designing a danger theory inspired IDS (Intrusion Detection System) was proposed by Aickelin et al. [2]. They use two metaphors, apoptosis (natural cell death) and necrosis (anomalous cell death), as the basis of the danger signal. The IDS responds quickly when it detects any danger signal. The self–nonself discrimination is still important in the system, though it may not be able to trigger a response. In [9], a danger model based algorithm that uses the functionality of dendritic cells is proposed and applied to anomaly detection. The authors claim that their algorithm illustrates a prediction from the danger theory that is also a crucial element in designing anomaly detection: "self-reaction killers should be found during the early phases of most response to foreign antigens, and they should disappear with time". Another conceptual paper [17] utilizes the context dependent response, an important property of danger theory, to build a web mining system.

3. The proposed method

In this section, we first outline the general framework of our proposed model, and then explain in detail several key issues related to the learning process.

3.1. Overview of the proposed model

Before we begin the description, we give a brief summary of the notations and operators used in the learning process detailed later.

The notations are listed as follows:

- S: shape-space where the immune recognition takes place, S ⊆ R^n;
- ab: an individual antibody, ab ∈ S;
- AB: antibody population, AB ⊆ S;
- mAB: memory antibody population;
- gAB: general antibody population, mAB ∩ gAB = ∅, mAB ∪ gAB = AB;
- sAB: stimulated antibodies, sAB ⊆ AB;
- ag: an individual antigen, ag ∈ S;
- AG: antigen population, AG ⊆ S;
- DZ: current danger zone.

The operators are listed as follows:

- ChangeStatus(ab): operator used to change the status of antibody ab;
- ClonalSelection(ab): clonal selection operator;
- Suppress(AB): suppression operator;
- React(ab, ag): antibody ab reacts to the danger caused by antigen ag;
- DecreaseDanger(ag): operator used to decrease the size of the danger zone caused by antigen ag.

Every cell (antigen or antibody) has a label indicating its class. The value of the label is taken from {0, 1} because we deal with two-class classification problems.

The status of an individual antibody is defined by a triple (sti, rel, cr), where sti indicates the antibody's accumulated stimulation level, rel indicates the antibody's accumulated reliability in terms of classification accuracy, and cr indicates whether or not the antibody belongs to the same class as the current antigen. Among these attributes, sti and rel are determined by the antibody's historical records. Every time an antigen is presented to the antibody population and causes an immune response, the status of each antibody may change. How the antibodies' statuses influence the immune memory development is a key issue in our work and will be discussed in Section 3.2.

Based on the above notations and operators, we give a framework for our proposed algorithm (see Algorithm 1).

Algorithm 1. Danger theory based immune algorithm.

1: Initialize gAB, mAB ← ∅;
2: while stopping criterion not satisfied do
3:   for i ← 0 to |AG| do
4:     Present AG(i) as the current active ag;
5:     DZ created by ag;
6:     gAB receives signal 0 from ag;
7:     gAB receives signal 1 from DZ;
8:     Select the antibodies that receive both signal 0 and signal 1 to form sAB;
9:     for all ab ∈ sAB do
10:      ChangeStatus(ab);
11:      React(ab, ag);
12:    end for
13:    Suppress(AB);
14:    DecreaseDanger(ag);
15:    for all ab ∈ sAB do
16:      if ab.sti reaches a certain threshold then
17:        ClonalSelection(ab);
18:      end if
19:    end for
20:  end for
21:  Check the stopping criterion;
22: end while
23: Output mAB;

When the learning algorithm is finished, the obtained memory antibodies can be used to classify previously unknown antigens. This is a simple process in which an unknown antigen is classified as the same class as the antibody with which it has the lowest affinity.

3.2. Details of the learning algorithm

In the following subsections, we detail several key issues related to the learning process of our proposed algorithm, from which we can obtain a clear view of the interaction between antibodies and the immune memory development mechanisms.

(1) Initialization: The algorithm begins with a random initialization of the gAB population of antibodies in the shape-space. The label of each antibody is randomly assigned. The status of each antibody, i.e., sti, rel and cr, is set to zero. The mAB is initialized as an empty set. An important parameter in this phase is iniSize, the initial size of gAB. We found that the final size of the antibody population and the classification performance are fairly insensitive to iniSize. In fact, the redundant antibodies caused by a too large iniSize can be gradually cleaned out under the filtering effect of clonal selection. On the other hand, if iniSize is too small, the algorithm is still capable of producing sufficient antibodies under the proliferating effect of clonal selection.

(2) Nonself and danger, two kinds of signals: As mentioned previously, the central idea of the danger model is that the immune system does not react to foreigners but to dangerous entities. However, "no reaction" does not mean "no detection". The detection of danger serves only as a co-stimulation signal, which we call "signal 1". The perceiving of foreign antigens is called "signal 0", which first triggers the immune function.

In our model, the antibody population is divided into two sub-populations: general antibodies and memory antibodies. Any general antibody with a sufficient accumulated stimulation level may be converted to a memory antibody. Memory antibodies do not take part in reactions to antigens. They serve as a fixed memory for antigens. Once a memory antibody is generated, its status will not change until it is suppressed (killed) by other memory antibodies.

Every time an antigen is presented, all general antibodies receive signal 0. This means that all general antibodies in the shape-space can detect the stimulus of the current antigen. Such detection is performed by calculating the affinity between each antibody and the current antigen:

affinity(ab, ag) = ||ab − ag||.  (1)

However, unlike signal 0, signal 1 is only sent to antibodies within the danger zone created by the current antigen (see Fig. 1). The

Fig. 1. Illustration of two signals. Every general antibody can receive signal 0, but only those within the danger zone can receive signal 1. Antibodies which receive both
signal 0 and signal 1 are stimulated.
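The two-signal selection and the nearest-antibody classification rule described above can be sketched as follows. This is a minimal Python illustration; the dictionary-based antibody representation is our own assumption, not the paper's implementation.

```python
import math

def affinity(ab, ag):
    # Eq. (1): affinity is the Euclidean distance in the shape-space
    return math.dist(ab, ag)

def stimulated(general_antibodies, ag, danger):
    # Signal 0 reaches every general antibody; signal 1 only reaches
    # those inside the danger zone, so sAB is the set of general
    # antibodies whose affinity to ag is below ag's danger value.
    return [ab for ab in general_antibodies
            if affinity(ab["pos"], ag) < danger]

def classify(memory_antibodies, ag):
    # After learning, an unknown antigen takes the class label of the
    # memory antibody with the lowest affinity (the nearest one).
    nearest = min(memory_antibodies, key=lambda ab: affinity(ab["pos"], ag))
    return nearest["label"]
```

Because affinity is a distance, "lowest affinity" amounts to a 1-nearest-neighbour rule over the memory population.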

antibodies that receive both signal 0 and signal 1 are stimulated by the current antigen and are allowed to change their status (see Algorithm 2).

Algorithm 2. ChangeStatus(ab).

1: ab.sti ← ab.sti + 1;
2: if ab.label == ag.label then
3:   ab.cr ← 1;
4: else
5:   ab.cr ← −1;
6: end if
7: ab.rel ← ab.rel + ab.cr;

(3) Antibody reaction and variable danger zone: In addition to updating their status, stimulated antibodies can also react to the current antigen. The reaction intensity of an antibody is inversely proportional to its affinity with the current antigen; the antibodies closer to the current antigen have a higher reaction intensity (see Algorithm 3). Moreover, the stimulated antibodies that belong to the same class as the current antigen (i.e., cr = 1) move toward that antigen, whereas the stimulated antibodies that belong to a different class (i.e., cr = −1) move away from that antigen (see Algorithm 3).

Algorithm 3. React(ab, ag).

1: var ← 1 − affinity(ab, ag)/ag.danger;
2: ab ← ab + ab.cr · var · (ag − ab);

In our model, the danger values of antigens decrease over time. Such a decrease is, as in the natural immune system, a result of antibody reaction to the antigen (see Fig. 2). Therefore, it is reasonable to assume that the more antibodies are stimulated by an antigen, the more its danger value (i.e., the range of its danger zone) is decreased. Thereafter, the antigen will stimulate fewer antibodies when it is presented again. In fact, from the point of view of immune response, more stimulated antibodies mean that the immune system exhibits a stronger response to the current antigen, and hence the antigen receives stronger resistance. On the other hand, from the point of view of learning, more stimulated detectors (antibodies) mean that the current sample (antigen) contributes more to the learning process, so it is reasonable to weaken its activity in future learning. Based on these considerations, we designed Algorithm 4 to gradually decrease the danger value of antigens.

Algorithm 4. DecreaseDanger(ag).

1: var ← (|sAB| + 1)^k1;
2: ag.danger ← ag.danger / var;

The introduction of a variable danger value has the additional benefit of simplifying the criterion for checking whether or not to stop the algorithm. In fact, by properly setting the value of k1 in Algorithm 4, we can guarantee that the decreasing rate of the danger values always remains larger than the increasing rate of the number of antibodies stimulated by the antigen. Hence, there must be a time at which the antigens have such a small danger value that they cannot stimulate any antibody. Thus, we define the stopping criterion as follows: stop if, beyond a predefined number of iterations, no antibody is stimulated by any antigen. However, there may be an exception to this criterion. When a large number of antigens are concentrated in a relatively small area, it may take a long time to satisfy the stopping criterion, even if the antibody population has already matured. In this situation, we may predefine a maximum generation number (usually a certain proportion of the original antigen population size). The learning process must then be terminated at that time even if the stopping criterion has not yet been satisfied.

(4) Suppression between antibodies: As with other classification or data analysis methods, obtaining effective classifiers of appropriate size is an essential issue in designing bio-inspired algorithms. In our proposed model, we control the size of the antibody population by implementing suppression among the antibody population. The intensity of suppression between any two antibodies is inversely proportional to the affinity between them.

The suppression is divided into two independent processes. One is the suppression between stimulated antibodies (general antibodies within the current danger zone). The other is the suppression between memory antibodies within the current danger zone.

For any two stimulated antibodies ab_i and ab_j, the one with the higher affinity to the current antigen is deleted with probability p1, calculated by the following equation:

p1 = 1 − (affinity(ab_i, ab_j) / (2 · ag.danger))^k2,  (2)

Fig. 2. Antibody reaction and antigen danger value decreasing. Left: before antigen presentation; right: after antigen presentation. The stimulated antibodies with cr = 1 move toward the antigen; the stimulated antibodies with cr = −1 move away from the antigen. The decrease of the danger value of antigen ag2 is larger than that of antigen ag1 since it stimulated more antibodies. Notice that we draw two antigens here to illustrate the difference of the antibody reaction mechanism under different situations. In a real implementation, only one antigen is presented in each generation.
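Algorithms 2–4 can be sketched together, since they all act on the antibody status triple (sti, rel, cr) and the antigen's danger value. This is a minimal Python illustration under our own representation assumptions (antibodies and antigens as dictionaries), not the authors' code.

```python
import math

def change_status(ab, ag):
    # Algorithm 2: bump the stimulation level, set the class-match
    # flag cr to +1/-1, and accumulate it into the reliability rel.
    ab["sti"] += 1
    ab["cr"] = 1 if ab["label"] == ag["label"] else -1
    ab["rel"] += ab["cr"]

def react(ab, ag):
    # Algorithm 3: the reaction intensity var shrinks linearly with the
    # antibody's distance to the antigen; cr = +1 moves the antibody
    # toward the antigen, cr = -1 moves it away.
    var = 1.0 - math.dist(ab["pos"], ag["pos"]) / ag["danger"]
    ab["pos"] = tuple(x + ab["cr"] * var * (a - x)
                      for x, a in zip(ab["pos"], ag["pos"]))

def decrease_danger(ag, n_stimulated, k1):
    # Algorithm 4: the more antibodies the antigen stimulated (|sAB|),
    # the more its danger value (danger-zone radius) shrinks.
    ag["danger"] /= (n_stimulated + 1) ** k1
```

With k1 > 0 the danger value decays geometrically whenever at least one antibody is stimulated, which is what eventually silences every antigen and triggers the stopping criterion.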

where k2 is a parameter that controls the suppression intensity. Similarly, for any two memory antibodies Mab_i and Mab_j within the current danger zone, the one with the higher affinity to the current antigen is deleted with probability p2, calculated by the following equation:

p2 = 1 − (affinity(Mab_i, Mab_j) / (2 · ag.danger))^k3,  (3)

where k3 is a parameter that controls the suppression intensity. Algorithm 5 depicts the suppression process (see also Fig. 3).

Algorithm 5. Suppress(sAB).

1: randomly group the stimulated antibodies into pairs (if the size is odd, ignore the single one);
2: for all pairs do
3:   calculate probability p1 according to Eq. (2);
4:   if random < p1 then
5:     delete the antibody with the higher affinity to the current antigen;
6:   end if
7: end for
8: randomly group the memory antibodies within the danger zone into pairs (if the size is odd, ignore the single one);
9: for all pairs do
10:  calculate probability p2 according to Eq. (3);
11:  if random < p2 then
12:    delete the memory antibody with the higher affinity to the current antigen;
13:  end if
14: end for

(5) Clonal selection: Clonal selection is adopted in the proposed model to aid the reproduction of superior antibodies, i.e., those with good classification accuracy, and the deletion of antibodies that often make mistakes. We need to define a measurement to determine whether an antibody is superior or inferior. We achieve this by introducing two positive threshold values, namely THR1 and THR2 (THR2 ≤ THR1). Only when the sti of a general antibody reaches THR1 can it enter the clonal selection process. Then, according to the rel value of that antibody, one of the following operations is implemented:

(1) If rel ≥ THR2, convert the antibody to a memory antibody, clone it, and then mutate the clones.
(2) If rel ≤ −THR2, invert the label of the antibody and reset its status to zero.
(3) If |rel| < THR2, delete the antibody with probability p, where p = 1 − (|rel|/THR2)^2.

Such a clonal selection mechanism measures the potential of an antibody from the point of view of its historical record of classification performance. If the sti reaches THR1 while |rel| is still lower than THR2, then the antibody was alternately stimulated by antigens with different class labels. In other words, the antibody has a low reliability in terms of

Fig. 3. Suppression between antibodies. Left: before the suppression. Right: after the suppression. Four antibodies ab1, ab2, ab3, and ab4 are randomly grouped into two pairs (ab1, ab2) and (ab3, ab4), with affinity(ab1, ab2) > affinity(ab3, ab4) and affinity(ab3, ag) > affinity(ab4, ag). Hence the intensity of suppression between ab1 and ab2 is smaller than that between ab3 and ab4. After the suppression both ab1 and ab2 may survive, whereas ab3 will be suppressed by ab4 due to its larger affinity with the current antigen.
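The pairwise suppression of Eqs. (2)–(3) and Algorithm 5 can be sketched as follows. This is an illustrative Python sketch; the injectable rng parameter is our own addition so the random step can be tested deterministically.

```python
import math
import random

def suppress_prob(a, b, danger, k):
    # Eqs. (2)/(3): a pair of close antibodies (small mutual distance,
    # i.e., low mutual affinity value) is pruned with high probability;
    # the exponent k (k2 or k3) controls the suppression intensity.
    return 1.0 - (math.dist(a, b) / (2.0 * danger)) ** k

def suppress(pop, ag, danger, k, rng=random.random):
    # Algorithm 5: randomly pair the antibodies; in each pair, the one
    # with the higher affinity to the current antigen may be deleted.
    idx = list(range(len(pop)))
    random.shuffle(idx)          # odd leftover index is simply ignored
    dead = set()
    for i, j in zip(idx[::2], idx[1::2]):
        if rng() < suppress_prob(pop[i], pop[j], danger, k):
            dead.add(max(i, j, key=lambda n: math.dist(pop[n], ag)))
    return [ab for n, ab in enumerate(pop) if n not in dead]
```

The same routine serves both suppression passes, applied once to the stimulated general antibodies with k2 and once to the memory antibodies in the danger zone with k3.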

Fig. 4. Clonal selection. Left: before the clonal selection. Right: after the clonal selection. There are four antibodies ab1, ab2, ab3, and ab4 whose sti reach THR1. ab3 was alternately stimulated by antigens of different classes, so it will be deleted due to its unreliability in terms of classification accuracy. ab4 was continually stimulated by opposite-class antigens, so its label will be inverted. ab1 and ab2 both go through proliferation. The rel of ab1 is larger than that of ab2, which makes the offspring size of ab1 larger than that of ab2, and the mutation rate of ab2 larger than that of ab1.
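The three clonal selection cases and the clone-size formula (Eq. (4)) can be sketched as follows. This is a minimal Python illustration; the field names and the returned action strings are our own conventions, and the cloning/mutation step itself is abstracted away.

```python
def clone_size(rel, thr1, thr2, mc):
    # Eq. (4): rel in [THR2, THR1] maps linearly to a clone count in
    # [1, mc], so more reliable antibodies produce more offspring.
    return int((mc - 1) * (rel - thr2) / (thr1 - thr2) + 1)

def clonal_selection(ab, thr2, rng):
    # Algorithm 6: unreliable antibodies (|rel| < THR2) are deleted
    # with probability 1 - (|rel|/THR2)^2; consistently wrong ones
    # (rel <= -THR2) have their label inverted and status reset;
    # the rest are cloned, mutated, and promoted to memory.
    if abs(ab["rel"]) < thr2 and rng() < 1 - (abs(ab["rel"]) / thr2) ** 2:
        return "delete"
    if ab["rel"] <= -thr2:
        ab["label"] = 1 - ab["label"]
        ab["sti"] = ab["rel"] = 0
        return "invert"
    return "clone-and-memorize"
```

Note the label inversion: an antibody that was consistently stimulated by opposite-class antigens is still spatially well placed, so flipping its label recycles it instead of discarding it.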

classification, and it is reasonable to force it to forgo reproduction. Algorithm 6 depicts the process of clonal selection (see also Fig. 4).

Algorithm 6. ClonalSelection(ab).

1: if |ab.rel| < THR2 and random < 1 − (|ab.rel|/THR2)^2 then
2:   delete ab;
3: else if ab.rel ≤ −THR2 then
4:   ab.label ← 1 − ab.label;
5:   ab.sti ← 0;
6:   ab.rel ← 0;
7: else
8:   Clone ab;
9:   Each offspring of ab goes through mutation;
10:  Convert ab to a memory antibody;
11: end if

The size of clones is calculated by the following formula:

size = ⌊(mc − 1) · (ab.rel − THR2)/(THR1 − THR2) + 1⌋,  (4)

where the integer mc represents the maximum size of clones. Notice that for any antibody ab that can take part in cloning, the following relation must hold: THR2 ≤ ab.rel ≤ THR1. From Eq. (4), we have 0 ≤ size ≤ mc.

4. Simulations

In this section, we present experiments that have been performed to illustrate the learning process, as well as to evaluate the classification performance of our proposed method.

4.1. Artificial data

(1) Experimental setup: The artificial data set contains 18 000 antigens in a two-dimensional unit square. The antigens are evenly divided into two classes that are highly non-linearly separable (see Fig. 5(a)). As shown in Fig. 5(b) and (c), extensive overlapping regions exist at the border of each class. Our objective is to obtain an antibody population of appropriate size that can effectively represent the distribution of antigens in each class. The parameter settings used in the experiment are listed in Table 1, where iniSize indicates the initial antibody population size and iniDV indicates the initial danger value of each antigen. The stopping criterion is defined as follows: stop when, after 4500 generations (a quarter of the antigen population size), no antibody is stimulated.

Table 1
Parameter settings of our model used in the experiment based on the artificial data set.

THR1  THR2  iniSize  iniDV  mc  k1   k2   k3
6     4     10       0.02   4   1.5  0.5  1.5

(2) Results: Fig. 6(a) shows the final memory antibody population. The learning process lasts 634 263 generations. The final memory antibody population contains 1419 antibodies, among which 730 belong to class 1 and 689 belong to class 2. Hence, the compression ratio is 4.06%. From Fig. 6(a), we can find that the evolved antibodies effectively represent the overlapping regions and that the boundary separating the two classes is clear.

Next we quantitatively validate that the proposed model has promising data representation capability. For each class i (i ∈ {1, 2}), we test the vector quantization error EQ(Ag_i, Ab_i), which is defined as

EQ(Ag_i, Ab_i) = (1/2) Σ_{ab ∈ Ab_i} (1/|N(ab)|) Σ_{ag ∈ N(ab)} affinity(ab, ag),  (5)

where Ag_i and Ab_i represent the antigen population and the final antibody population of class i, respectively. The neighborhood (receptive field) N(ab) of an antibody ab is defined as

N(ab) = {ag ∈ Ag_i | ab = arg min_{ab′ ∈ Ab_i} affinity(ag, ab′)}.  (6)

To obtain comparative results, we used another two artificial immune algorithms to learn the artificial data set: aiNet [4,5] and RLAIS [19], which have proved to be very efficient for data representation. For each algorithm, including the proposed model, we performed 50 independent trials and the final

Fig. 5. (a) Antigen population. The dots represent class 1 antigens; the circles represent class 2 antigens. Each class contains 9000 antigens. The two classes are highly non-linearly separable. (b) and (c) Zoomed areas 1 and 2 of (a), where the overlapping regions exist.
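The paper's exact antigen distribution is given only graphically in Fig. 5. As an illustrative stand-in for experimentation, the following sketch generates a two-class, heavily overlapping data set in the unit square; the interleaved sinusoidal bands are our own choice, not the authors' data.

```python
import math
import random

def make_antigens(n_per_class, noise=0.08, seed=0):
    # Two interleaved sinusoidal bands in the unit square with additive
    # Gaussian noise, producing heavily overlapping class borders
    # (an illustrative stand-in for the 18 000-antigen set of Fig. 5).
    rng = random.Random(seed)
    antigens = []
    for label in (0, 1):
        for _ in range(n_per_class):
            x = rng.random()
            y = 0.5 + (0.25 if label else -0.25) * math.sin(2 * math.pi * x)
            y += rng.gauss(0.0, noise)
            antigens.append(((x, min(max(y, 0.0), 1.0)), label))
    return antigens
```

Any generator with non-linearly separable, overlapping classes serves the same purpose for reproducing the qualitative behaviour of the learning process.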

[Fig. 6 panels show the memory antibody population at generations 25 000, 50 000, 75 000, 100 000, 125 000, 150 000, 200 000, 400 000, and 600 000.]

Fig. 6. Evolutionary process of the memory antibody population. (a) Final memory antibody population. The dots and circles represent class 1 and class 2 antibodies,
respectively. (b) Memory antibody population in different generations.

[Fig. 7 compares the vector quantization error of the proposed model, aiNet, and RLAIS on class 1 and class 2 antigens.]
Fig. 7. Vector quantization error on artificial data set. (a) Results on class 1 antigens. (b) Results on class 2 antigens.
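The vector quantization error of Eqs. (5)–(6) can be computed as follows. This is a minimal Python sketch in which antibodies and antigens are plain coordinate tuples.

```python
import math
from collections import defaultdict

def vq_error(antigens, antibodies):
    # Eqs. (5)-(6): each antigen falls into the receptive field N(ab)
    # of its nearest antibody; the error is half the sum, over
    # antibodies, of the mean distance within each receptive field.
    fields = defaultdict(list)
    for ag in antigens:
        i = min(range(len(antibodies)),
                key=lambda j: math.dist(antibodies[j], ag))
        fields[i].append(math.dist(antibodies[i], ag))
    return 0.5 * sum(sum(d) / len(d) for d in fields.values())
```

The error is computed per class, i.e., on (Ag_i, Ab_i) pairs, so a lower value means the class-i antibodies sit closer to the centres of the antigen regions they quantize.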

quantization error is averaged over these 50 trials. Fig. 7 shows the results and their statistical properties. From Fig. 7, we can find that the proposed model obtained the lowest mean quantization error on both class 1 and class 2 antigens, namely 0.095 (with standard deviation 0.0054) and 0.090 (with standard deviation 0.0034), respectively. RLAIS performed worst on both class 1 and class 2 antigens, obtaining mean quantization errors of 0.144 (with standard deviation 0.0024) and 0.131 (with standard deviation 0.0040), respectively. aiNet performed better than RLAIS, but worse than the proposed model.

Moreover, we used the original data set to test the classification performance of our algorithm. The final classification accuracy is 87.24%. Considering that the data set is highly non-linearly separable and that a large number of antigens lie within the overlapping regions (see Fig. 5), the proposed model can be thought of as exhibiting good classification capability on the artificial data set.

Fig. 6(b) shows the memory antibody population in different generations. The algorithm behaves like incremental learning: gradual stimulation by different antigens causes the randomly initialized antibodies to undergo reaction, suppression, and clonal selection, and makes them eventually grow into matured antibodies. As shown in Figs. 6(b) and 8(a), the learning process can be divided into two distinct phases: the shaping phase and the maturing phase. In the first 200 000 generations, the memory antibody population is gradually shaped into a rough distribution of the antigen population. The population grows quickly in this phase. When the shaping is finished, the learning process enters the

[Figure 8: (a) general and memory antibody population sizes vs. generation number (0–700 000); (b) number of stimulated antibodies vs. generation number. Plot data not recoverable from this extraction.]

Fig. 8. Size of memory antibody population and number of stimulated antibodies during the on-line learning. The sampling frequency is 50 generations. (a) Antibody population size. (b) Number of stimulated antibodies.

maturing phase. This phase lasts about 400 000 generations, in which the population grows slowly. Fig. 8(b) shows the number of stimulated antibodies in different generations. We can find that the number of stimulated antibodies decreases along the learning process. In the shaping phase, the danger values of antigens are relatively large, and hence more antibodies are stimulated by their danger zones. Consequently, both clonal selection and suppression take place frequently. The skeleton topology of the antibody population is formed in this phase. In the maturing phase, the danger values of antigens are relatively small, and hence fewer antibodies are stimulated. Consequently, clonal selection and suppression take place infrequently.

4.2. Real problem

In this section, the proposed algorithm is tested on breast cancer diagnosis, which is a widely used benchmark problem.
(1) Experimental setup: The experiment is based on the Wisconsin Breast Cancer Database taken from the University of California at Irvine (UCI) Machine Learning Repository [14]. The original database contains 699 instances, and each instance has nine numerically valued attributes. Because 16 instances contain missing attribute values, we only use the remaining 683 instances in our experiment. The instances are divided into two classes: class 0 (tested benign) contains 444 (65.0%) instances; class 1 (tested malignant) contains 239 (35.0%) instances. All nine attributes of each instance are normalized before the experiment.
Breast cancer diagnosis is a widely used benchmark problem among machine learning researchers, and several previous algorithms have been proposed for and applied to it. We studied some of these algorithms and carried out comparative experiments to show that our proposed model has competitive classification capability. The involved algorithms are C4.5 [16], RIAC [10], LDA [18], NEFCLASS [15], Optimized-LVQ [8], Big-LVQ [8], AIRS [8], Supervised fuzzy clustering [1], CLONALG [6], and DCA [9]. Among these algorithms, AIRS, CLONALG, and DCA are artificial immune algorithms. In particular, DCA is inspired by danger theory.

Table 2
Parameter settings of our model used in the experiment based on the Wisconsin Breast Cancer Database.

THR 1  THR 2  iniSize  iniDV  mc  k1    k2   k3
5      3      300      1.5    3   0.05  2.2  7.0

For each algorithm, we apply 10-fold cross-validation for 20 independent runs. Specifically, the original data set is randomly divided into 10 mutually exclusive subsets. Each of the first nine subsets contains 68 instances, while the last subset contains 71 instances. In each of the 20 runs, the tested algorithms go through training and validating 10 times. In each trial, one of the 10 subsets is selected as the validating set, and the remaining nine subsets are used as the training set. Table 2 lists the parameter settings of our model used in the experiment, where iniSize indicates the initial antibody population size and iniDV indicates the initial danger value of each antigen. The stopping criterion is defined as follows: stop when no antibody has been stimulated for 615 generations (the number of antigens in the training set).
(2) Results: Figs. 9 and 10 show the experimental results of the involved algorithms on the Wisconsin breast cancer problem. From Fig. 9 we can observe that DCA outperforms the other 10 algorithms with regard to the mean classification accuracy on the training data sets over 20 independent runs. It achieved a mean accuracy of 97.44%. AIRS performs the second best, with a mean accuracy of 97.14%. Among the other nine algorithms, our proposed model performs the best, with a mean accuracy of 96.88%. In order to obtain statistical information, we conducted standard one-tailed t-tests on the mean classification accuracy at the 95% significance level. The corresponding results show that the proposed model performs significantly better than C4.5, RIAC, NEFCLASS, Supervised fuzzy clustering, and CLONALG; the differences between our model and the other five algorithms have no statistical significance. From Fig. 10, we can observe that the proposed model outperforms the other 10 algorithms with regard to the mean classification accuracy on the validating data set over 20 independent runs. The proposed model achieved a mean accuracy of 96.79%. Big-LVQ performs the second best with
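The 10-fold cross-validation protocol described above can be sketched as follows; the function name and seed are our own illustrative choices, but the fold sizes follow the 683-instance split given in the text:

```python
import random

def ten_fold_splits(data, seed=0):
    """Shuffle the data and split it into 10 mutually exclusive folds;
    yield (training, validating) pairs, each fold validating exactly once."""
    items = list(data)
    random.Random(seed).shuffle(items)
    size = len(items) // 10                     # 683 // 10 = 68
    folds = [items[i * size:(i + 1) * size] for i in range(9)]
    folds.append(items[9 * size:])              # last fold takes the remainder: 71
    for k in range(10):
        training = [x for j, fold in enumerate(folds) if j != k for x in fold]
        yield training, folds[k]

sizes = [len(v) for _, v in ten_fold_splits(range(683))]
print(sizes)  # nine folds of 68 instances, one of 71
```

Note that when one of the first nine folds is held out, the training set contains 683 − 68 = 615 instances, which matches the 615 generations appearing in the stopping criterion above.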

[Figure 9: box plots of classification accuracy (roughly 0.945–0.98) for the 11 compared algorithms, including C4.5, RIAC, LDA, NEFCLASS, Optimized-LVQ, Big-LVQ, AIRS, Supervised fuzzy clustering, CLONALG, DCA, and the proposed model. Plot data not recoverable from this extraction.]

Fig. 9. Results on training data set in experiment based on Wisconsin breast cancer database.

[Figure 10: box plots of classification accuracy (roughly 0.94–0.97) for the 11 compared algorithms, including C4.5, RIAC, LDA, NEFCLASS, Optimized-LVQ, Big-LVQ, AIRS, Supervised fuzzy clustering, CLONALG, DCA, and the proposed model. Plot data not recoverable from this extraction.]

Fig. 10. Results on validating data set in experiment based on Wisconsin breast cancer database.
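The standard one-tailed two-sample t-test used for these significance comparisons can be sketched as follows. The per-run accuracy lists are hypothetical placeholders (the paper's raw per-run accuracies are not available here), and the 1.686 critical value is the approximate one-tailed 5% threshold for 38 degrees of freedom:

```python
import math

def one_tailed_t(a, b):
    """Pooled two-sample t statistic for testing mean(a) > mean(b)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    pooled = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (ma - mb) / (pooled * math.sqrt(1.0 / na + 1.0 / nb))

# Hypothetical per-run accuracies over 20 runs (placeholders, not the paper's data).
model = [0.9679 + 0.001 * (i % 5 - 2) for i in range(20)]
other = [0.9600 + 0.001 * (i % 5 - 2) for i in range(20)]

t = one_tailed_t(model, other)
# With df = 20 + 20 - 2 = 38, t above roughly 1.686 is significant at the 5% level.
print(t > 1.686)  # prints True
```

In the paper's setting, each comparison pits the proposed model's 20 run accuracies against another algorithm's 20 run accuracies in exactly this one-sided fashion.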

mean accuracy of 96.60%. Standard one-tailed t-tests at the 95% significance level show that the proposed model performs significantly better than C4.5, RIAC, NEFCLASS, Supervised fuzzy clustering, CLONALG, and DCA.
To summarize, the proposed model performs third best with regard to the mean classification accuracy on the training data set, and best with regard to the mean classification accuracy on the validating data set. Further, for the training data set the proposed model performs better than five other algorithms at the 95% significance level; for the validating data set it performs better than six other algorithms at the 95% significance level. These results show that our model exhibits competitive classification capability and promising generalization capability.
Fig. 11 shows the size of the memory antibody population during the first run. The evolution of the antibody population is divided into two distinct phases. In the first 5000 generations (10 000 generations in the case of fold 4 and fold 8), the population is shaped into a rough distribution of the antigen population; antibodies gradually occupy the areas in which the antigens congregate. In later generations, the learning process enters the maturing phase, in which the antibodies go through local proliferation and suppression, and the slight differences between clustering areas are formed.
Fig. 12 shows the number of stimulated antibodies during the first run. The number of stimulated antibodies decreases along with the learning process due to the time-varying danger values of antigens. Notice that there are some evident peaks in the case of folds 2, 4 and 8. This is because of the random population initialization mechanism. If the initial 300 general antibodies congregate in a relatively small area, then in the shaping phase

[Figure 11: ten panels (Fold 1–Fold 10) of antibody population size vs. generation number. Plot data not recoverable from this extraction.]

Fig. 11. Size of memory antibody population during each of the 10 learning processes in the first run. The x-axis represents generation number; the y-axis represents population size. The red thick curves represent the general antibody population; the blue thin curves represent the memory antibody population. The sampling frequency is set as 10 generations. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

some areas containing a considerable number of antigens may not be explored by antibodies. Therefore, when the newborn antibodies eventually find these areas, the antigens may stimulate a large number of antibodies due to their relatively large initial danger values, and hence the peaks appear.

5. Conclusion

The natural immune system is an interesting research field because it provides an intricate model of adaptive learning processes, and it therefore inspires novel, effective paradigms for engineering applications. Such a model is proposed in this paper for solving on-line two-class classification problems. Our method is based on the mechanisms of a novel theoretical immunology model—danger theory, e.g., the danger zone and danger signals. We evaluate our method through experiments based on an artificial data set as well as on a real-world problem. Experimental results show that our method exhibits good topology learning and classification capability.
The main contribution of this paper is the introduction of a framework for a danger theory based immune algorithm that may become the foundation of a series of bio-inspired methods. The concept of a time-varying antigen status provides further motivation to design novel immune operators. Although danger theory based immune algorithms are relatively young compared to traditional immune algorithms such as negative selection and clonal selection, this new approach is showing its potential in more and more application areas and will play increasingly important roles in both artificial immunology and engineering applications.
Most of the work in this paper is empirical and experimental, so the next step of our work is to theoretically analyze the learning process of our method and to apply it to more complicated problems. Moreover, we have found that there may be some intricate relationships between the parameters used in our model, and we believe that these relationships are worth studying theoretically.

Acknowledgments

The authors would like to thank the anonymous reviewers and the Associate Editor for their helpful remarks and comments, which have greatly improved this paper.

[Figure 12: ten panels (Fold 1–Fold 10) of the number of stimulated antibodies (0–20) vs. generation number. Plot data not recoverable from this extraction.]

Fig. 12. Number of stimulated antibodies during each of the 10 learning processes in the first run. The x-axis represents generation number; the y-axis represents the number of stimulated antibodies. The sampling frequency is set as 10 generations.

References

[1] J. Abonyi, F. Szeifert, Supervised fuzzy clustering for the identification of fuzzy classifiers, Pattern Recognition Letters 24 (2003) 2195–2207.
[2] U. Aickelin, P. Bentley, S. Cayzer, J. Kim, J. McLeod, Danger theory: the link between AIS and IDS, in: ICARIS, 2003, pp. 147–155.
[3] U. Aickelin, S. Cayzer, The danger theory and its application to artificial immune systems, in: ICARIS, 2002, pp. 141–148.
[4] L.N. De Castro, F.J. Von Zuben, An evolutionary immune network for data clustering, in: Proceedings of the IEEE SBRN'00, Brazil, 2000.
[5] L.N. De Castro, F.J. Von Zuben, aiNet: an artificial immune network for data analysis, International Journal of Computational Intelligence and Applications 1 (3) (2001).
[6] L.N. De Castro, F.J. Von Zuben, Learning and optimization using the clonal selection principle, IEEE Transactions on Evolutionary Computation 6 (3) (2002) 239–251.
[7] S. Gallucci, P. Matzinger, Danger signals: SOS to the immune system, Current Opinion in Immunology 13 (1) (2001) 114–119.
[8] D.E. Goodman, L. Boggess, A. Watkins, Artificial immune system classification of multiple-class problems, in: Proceedings of the Artificial Neural Networks in Engineering Conference, 2002, pp. 179–183.
[9] J. Greensmith, U. Aickelin, S. Cayzer, Introducing dendritic cells as a novel immune-inspired algorithm for anomaly detection, in: ICARIS, 2005, pp. 153–167.
[10] H.J. Hamilton, N. Shan, N. Cercone, RIAC: a rule induction algorithm based on approximate classification, Technical Report CS 96-06, University of Regina, 1996.
[11] P. Matzinger, Tolerance, danger, and the extended family, Annual Review of Immunology 12 (1994) 991–1045.
[12] P. Matzinger, Essay 1: the danger model in its historical context, Scandinavian Journal of Immunology 54 (2001) 4–9.
[13] P. Matzinger, The danger model: a renewed sense of self, Science 296 (2002) 301–305.
[14] D.J. Newman, S. Hettich, C.L. Blake, C.J. Merz, UCI repository of machine learning databases, 1998.
[15] D. Nauck, R. Kruse, Obtaining interpretable fuzzy classification rules from medical data, Artificial Intelligence in Medicine 16 (1999) 149–169.
[16] J.R. Quinlan, Improved use of continuous attributes in C4.5, Journal of Artificial Intelligence Research 4 (1996) 77–90.
[17] A. Secker, A.A. Freitas, J. Timmis, A danger theory inspired approach to web mining, in: ICARIS, 2003, pp. 156–167.
[18] B. Ster, A. Dobnikar, Neural networks in medical diagnosis: comparison with other methods, in: Proceedings of the International Conference on Engineering Applications of Neural Networks, 1996, pp. 427–430.

[19] J. Timmis, M. Neal, A resource limited artificial immune system for data analysis, Knowledge-Based Systems 14 (2001) 121–130.
[20] R.E. Vance, Cutting edge commentary: a Copernican revolution? Doubts about the danger theory, Journal of Immunology 165 (2000) 1725–1728.

Chenggong Zhang received his B.Sc. degree in Computer and Information Science from Southwest University in 2002 and his M.Sc. degree in Computer Science from the University of Electronic Science and Technology of China (UESTC) in 2005. He is currently pursuing the Ph.D. degree in the Computational Intelligence Laboratory, School of Computer Science and Engineering, UESTC.

Zhang Yi received his B.Sc. degree in Mathematics from Sichuan Normal University, Chengdu, China, in 1983 and his M.S. degree in Mathematics from Hebei Normal University in 1986. In December 1987, he was promoted to Associate Professor at the University of Electronic Science and Technology of China. From 1989 to 1990, he was a Senior Visiting Scholar in the Department of Automatic Control and Systems Engineering at The University of Sheffield, England. In 1994, he received his Ph.D. degree in Mathematics from the Institute of Mathematics, The Chinese Academy of Sciences, Beijing, China, and was promoted to full Professor at the University of Electronic Science and Technology of China in the same year. From February 1999 to February 2000 and from August 2000 to August 2001, he was a Research Associate in the Department of Computer Science and Engineering at The Chinese University of Hong Kong. From August 2001 to December 2002, he was a Research Fellow in the Department of Electrical and Computer Engineering at The National University of Singapore. Currently, he is a professor in the College of Computer Science, Sichuan University.
