I. INTRODUCTION
Dimensionality reduction is an important issue in classification problems, as it can reduce the computational cost of training with a large number of features that may be redundant, noisy, or irrelevant, and it can avoid the problems that arise when a data set has more features than patterns (the curse of dimensionality). Such problems are frequent in bioinformatics applications, as described in [1], which reviews feature selection techniques used in bioinformatics and provides analyses and references for feature selection in bioinformatics applications such as sequence analysis, microarray analysis, and mass spectra analysis. In [2], feature selection in high-dimensional feature spaces with small pattern
[Table: parallel implementations of evolutionary algorithms and their communication issues. Master-Worker: distributed fitness computation; topology: communication through the master; frequency: {periodic, adaptive, probabilistic}; information exchanged: {solutions, searching memory}. Island (coarse-grain model) and Diffusion (fine-grain model): concurrent evolutionary algorithms on subpopulations.]
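The island (coarse-grain) model in the table above can be sketched in a few lines of Python. The ring migration topology, the migration interval, and the toy bit-counting fitness are illustrative assumptions for the sketch, not details taken from the paper.

```python
import random

def evolve_islands(n_islands=4, island_size=10, generations=40,
                   migration_interval=5, genome_len=8, seed=1):
    """Island model sketch: independent populations evolve concurrently
    and periodically exchange their best individuals over a ring."""
    rng = random.Random(seed)
    fitness = lambda g: sum(g)  # toy objective: maximize the number of 1-bits
    islands = [[[rng.randint(0, 1) for _ in range(genome_len)]
                for _ in range(island_size)] for _ in range(n_islands)]
    for gen in range(generations):
        for pop in islands:
            # elitist step on each island: keep the best half,
            # replace the rest with one-bit-flip mutants of the parents
            pop.sort(key=fitness, reverse=True)
            parents = pop[:island_size // 2]
            children = []
            for p in parents:
                c = p[:]
                c[rng.randrange(genome_len)] ^= 1
                children.append(c)
            pop[island_size // 2:] = children
        if (gen + 1) % migration_interval == 0:
            # ring migration: best of island i-1 replaces worst of island i
            bests = [max(pop, key=fitness) for pop in islands]
            for i, pop in enumerate(islands):
                pop.sort(key=fitness)
                pop[0] = bests[(i - 1) % n_islands][:]
    return max((ind for pop in islands for ind in pop), key=fitness)

best = evolve_islands()
```

Because parents and migrated elites are never discarded, the best fitness found is monotonically non-decreasing across generations in this sketch.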
that determine the way the individuals are selected for the next iteration (whether only non-dominated ones are selected, whether a solutions archive is used, etc.). The distribution of the individuals among the different subpopulations is not important in this case, as the evaluation of the cost functions is completely independent for each individual. Load balancing only needs to be considered when the computational costs of evaluating different individuals are known to differ.
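When per-individual evaluation costs are known to differ, the load-balancing consideration above can be sketched as a greedy longest-processing-time assignment. The cost estimates and the two-worker example below are illustrative assumptions, not values from the paper.

```python
import heapq

def balance_load(costs, n_workers):
    """Greedy LPT assignment: hand the most expensive individual
    to the currently least loaded worker."""
    # min-heap of (accumulated cost, worker index)
    heap = [(0.0, w) for w in range(n_workers)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(n_workers)]
    # process individuals from most to least expensive
    for idx in sorted(range(len(costs)), key=lambda i: -costs[i]):
        load, w = heapq.heappop(heap)
        assignment[w].append(idx)
        heapq.heappush(heap, (load + costs[idx], w))
    return assignment

parts = balance_load([5.0, 3.0, 3.0, 2.0, 2.0, 1.0], 2)
```

For this example both workers end up with a total cost of 8.0, even though the individual costs are unequal.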
Master process
01  Initialize a population composed of P subpopulations, SP[i] (i=1,..,P), of N/P individuals
02  for i=1 to P workers
03      Send the i-th subpopulation SP[i] to Worker[i];
04  end;
05  t=1;
06  do
07      for i=1 to P workers
08          Receive subpopulation SP[i] from Worker[i];
09      end;
10      Execute one iteration of the MOEA on the population (SP[1] ∪ SP[2] ∪ ... ∪ SP[P]);
11      Distribute the population into new subpopulations SP[i] (i=1,..,P) of N/P individuals;
12      for i=1 to P workers
13          Send the i-th subpopulation SP[i] to Worker[i];
14      end;
15      t=t+1;
16  while stop criterion is not reached;

Worker[i]
01  while true
02      Receive subpopulation SP[i] from Master process;
03      Evaluate the individuals in SP[i];
04      Send subpopulation SP[i] to Master process;
05  end;
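One round of this master-worker scheme can be sketched in executable form. A thread pool stands in for the P workers, and the fitness function is a placeholder; both are assumptions of the sketch rather than the paper's implementation.

```python
from concurrent.futures import ThreadPoolExecutor

def master_worker_evaluate(population, fitness, P):
    """Split the population into P subpopulations, let each 'worker'
    evaluate its share concurrently, then gather the evaluated
    individuals back at the master."""
    n = len(population)
    chunk = (n + P - 1) // P
    subpops = [population[i:i + chunk] for i in range(0, n, chunk)]

    def worker(subpop):
        # Worker[i]: evaluate every individual in its subpopulation
        return [(ind, fitness(ind)) for ind in subpop]

    with ThreadPoolExecutor(max_workers=P) as pool:
        evaluated = list(pool.map(worker, subpops))
    # master: combine SP[1] ∪ ... ∪ SP[P] before the next MOEA iteration
    return [pair for sub in evaluated for pair in sub]

scored = master_worker_evaluate(list(range(10)), lambda x: x * x, P=3)
```

Since `pool.map` preserves subpopulation order, the gathered population comes back in the same order it was distributed, matching the synchronous receive loop in the pseudocode.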
[Figure 3 shows P processors evolving subpopulations (subsets of feature selections) concurrently. On each processor, the features selected by each individual i (xi1, xi2,.., xiF) are applied to the r = N/P input patterns assigned to that processor; a clustering algorithm (SOM) runs its learning iterations; the objectives f1, f2,..,fn of the individuals (feature selections) are evaluated; and evolutionary operators plus selection of individuals are applied. A Combine_&_Distribute step exchanges individuals among the P processors.]
Figure 3. Concurrent evolution of subpopulations for unsupervised multiobjective feature selection based on SOM
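The per-individual evaluation step of this scheme can be sketched as follows. The tiny 1-D SOM, its learning schedule, and the two objectives used here (quantization error and number of selected features, both minimized) are illustrative assumptions about the wrapper, not the paper's exact configuration.

```python
import random

def som_objectives(patterns, mask, n_neurons=4, iters=50, seed=0):
    """Wrapper evaluation of one individual (a binary feature mask):
    train a small 1-D SOM on the selected features only, then return
    (quantization error, number of selected features) as objectives."""
    rng = random.Random(seed)
    feats = [j for j, bit in enumerate(mask) if bit]
    if not feats:
        return float("inf"), 0  # no features selected: worst clustering
    proj = [[p[j] for j in feats] for p in patterns]  # keep selected features

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    weights = [list(rng.choice(proj)) for _ in range(n_neurons)]
    for t in range(iters):                       # SOM learning iterations
        lr = 0.5 * (1 - t / iters)               # decaying learning rate
        radius = max(1, n_neurons // 2 - t * n_neurons // (2 * iters))
        x = proj[rng.randrange(len(proj))]
        bmu = min(range(n_neurons), key=lambda k: dist2(weights[k], x))
        for k in range(n_neurons):               # update BMU neighborhood
            if abs(k - bmu) <= radius:
                weights[k] = [w + lr * (xi - w) for w, xi in zip(weights[k], x)]
    # objective f1: mean quantization error over all patterns
    qe = sum(min(dist2(w, x) for w in weights) for x in proj) / len(proj)
    return qe, len(feats)

pts = [[random.Random(i).random() for _ in range(5)] for i in range(30)]
f1, f2 = som_objectives(pts, [1, 0, 1, 1, 0])
```

Each individual's evaluation is fully independent of the others, which is what makes the concurrent per-processor evaluation of Figure 3 possible.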
[Figure panels (2)-(4): panel (2) shows the new population of individuals after the first combination (ALT1), built from the non-dominated solutions found by P1 and by P2 after the first set of independent iterations. Panels (3) and (4) show the areas explored by P1 (dotted lines) and by P2 (continuous lines) in the second set of independent iterations, together with the non-dominated solutions found by P1 (circle points) and by P2 (square points) after that second set. The solutions in the subpopulations of P1 and P2 are then combined after the independent iterations and assigned to P2.]
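The combination of the non-dominated fronts found by two processes can be sketched with a plain Pareto-dominance filter; minimization of both objectives and the sample points are illustrative assumptions.

```python
def dominates(a, b):
    """a dominates b if a is no worse in every objective and strictly
    better in at least one (both objectives minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def combine_fronts(front_p1, front_p2):
    """Merge the non-dominated solutions found by P1 and P2, keeping
    only the points that remain non-dominated in the union."""
    union = front_p1 + front_p2
    return [p for p in union if not any(dominates(q, p) for q in union)]

merged = combine_fronts([(1, 5), (3, 3)], [(2, 4), (5, 1), (4, 4)])
```

Here (4, 4) is discarded because it is dominated by (3, 3); the other four points survive as the combined front.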
Figure 7. Best (B) and worst (W) Pareto fronts obtained by the different parallel alternatives (SQ: sequential, MWE, ALT1, ALT2, and ALT3) in different runs for benchmark b152
[Figure residue: an efficiency plot comparing the alternatives MWE, ALT1, ALT2, and ALT3 against the number of processors for the benchmarks b152, b384, and b512.]
V. CONCLUSIONS
Parallel implementations of a wrapper procedure based on multi-objective optimization for feature selection in supervised and unsupervised classification have been described and evaluated by using different synthetic benchmarks and benchmarks for EEG signal classification in BCI applications.
Four parallelization alternatives of an evolutionary multi-objective procedure based on NSGA-II have been considered. The MWE alternative corresponds to a model of concurrent evaluation of the individual fitness with a master-worker