Abstract: The number of heuristic optimization algorithms has exploded over the last decade, with new methods being proposed constantly. A recent overview of existing heuristic methods lists over 130 algorithms. The majority of these optimization algorithms have been designed and applied to solve real-parameter function optimization problems, each claiming to be superior to other methods in terms of performance. In this paper, three heuristic algorithms are systematically analyzed and tested in detail on real-parameter optimization problems, especially those involving a large number of parameters. Three traditional methods, i.e., genetic algorithms (GA), particle swarm optimization (PSO) and differential evolution (DE), are compared in terms of accuracy and runtime using several high-dimensional standard benchmark functions and real-world problems.
Keywords: heuristic optimization, high dimensional optimization, optimization techniques, nature-inspired algorithms
where D and Y are the domain and codomain of the objective function, one has to estimate the optimal parameter vector x^* such that

f(x^*) = \max_{x \in D} f(x),    (2)

for i = 1, ..., N and j = 1, ..., D, where X_{i,j} denotes the j-th element, x_j, of the i-th vector, x_i. U(low_j, up_j) is a random number in [low_j, up_j] drawn according to a uniform distribution, and the symbol ~ denotes sampling from a given distribution.
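The initialization rule above, X_{i,j} ~ U(low_j, up_j), can be sketched in Python; the function and variable names are illustrative, not from the paper:

```python
import random

def init_population(N, low, up):
    """Build an initial population X of N vectors, where each element
    X[i][j] is drawn uniformly from [low[j], up[j]]."""
    D = len(low)
    return [[random.uniform(low[j], up[j]) for j in range(D)]
            for _ in range(N)]

# Example: 5 individuals in the 3-dimensional search space [-5, 5]^3.
X = init_population(5, low=[-5.0] * 3, up=[5.0] * 3)
```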
Algorithm 1. An abstract Evolutionary Algorithm (EA)
Input: N, kmax
Output: xBEST
 1: xBEST := ∅, fBEST := 0
 2: Build an initial population X
 3: k := 0
 4: repeat
 5:   k := k + 1
 6:   for each individual x_i in X
 7:     Calculate f_i
 8:     if xBEST = ∅ or f_i > fBEST then
 9:       xBEST := x_i
10:       fBEST := f_i
11:     end if
12:   end for
13:   X := Merge(X, Reproduce(X))
14: until xBEST is the ideal solution or k > kmax

Evolutionary algorithms differ from one another largely in how they perform the Reproduce and Merge operations. The Reproduce operation usually has two parts: selecting parents from the old population, then creating new individuals, or children, from them (usually by mutating or recombining them in some way). The Merge operation usually either completely replaces the parents with the children, or includes fit parents along with their children to form the next generation [14].

The stopping condition of the algorithm is often defined in a few ways, such as: 1) limiting the execution time of the algorithm, normally done either by defining the maximum number of iterations, as shown in Algorithm 1, or by limiting the maximum number of fitness function evaluations; 2) stopping when fBEST does not change appreciably over successive iterations; 3) attaining a pre-specified objective function value.

One of the first EAs is the GA, invented by John Holland in 1975 [5]. The standard GA consists of three genetic operators: selection, crossover and mutation. During each generation, parents are selected using the selection operator, which selects individuals in such a way that individuals with better fitness values have a greater chance of being selected. Then new individuals, or children, are generated using the crossover and mutation operators. The Reproduce operation used in Algorithm 1 consists of these three operators. There are many variants of GA due to the different selection, crossover and mutation operators proposed, some of which can be found in [1, 5-6, 14-17]. The GA analyzed in this paper is available in the Global Optimization Toolbox of Matlab R2010a. The implemented genetic operators used in this study are defined as follows.

Selection
The selection function used in this paper is the Stochastic Universal Sampling (SUS) method [17]. Parents are selected in a biased fitness-proportionate way such that fit individuals get picked at least once. This method can be explained with the aid of Fig. 1, which shows an array of all individuals sized by their fitness values (N = 7). It can be noticed that f_4 > f_7 > f_2 > f_1 > f_6 > f_3 > f_5. The total fitness range f is initially determined using equation (5):

f = \sum_{i=1}^{N} f_i.    (5)

Then, the sampling length f/NS is determined, where NS denotes the number of individuals that need to be selected from the entire population.

Fig. 1. Array of individual ranges (individuals 1-7 sized by fitness), the initial search range [0, f/NS], and the chosen positions in Stochastic Universal Sampling (NS = 6 in this example; individuals 1, 2, 3, 4, 6 and 7 are sampled).

A random position is generated between 0 and f/NS, and the individual covering this position is selected as the first individual. The value f/NS is then added to this initial position to determine the second position and, thus, the second individual. Hence, each subsequent individual is selected by adding the value f/NS to the previous position. This process is performed until NS individuals have been selected.

Crossover
The representation of an individual in GA determines the type of crossover and mutation operators that can be implemented. By far, the most popular representation of an individual in GA is the vector representation. Depending on the problem, the individual can be defined using a boolean vector, an integer vector or a real-valued vector, as is the case in this paper.

The crossover operator used in this paper is the Scattered or Uniform crossover method. Assuming the parents x_i and x_k have been selected, a random binary vector, or mask, is generated. The children x_{i,new} and x_{k,new} are then formed by combining genes of both parents. This recombination is defined by equations (6) and (7):

x_{i,new}(j) = \begin{cases} x_i(j), & \text{if } mask(j) = 1 \\ x_k(j), & \text{otherwise,} \end{cases}    (6)

x_{k,new}(j) = \begin{cases} x_k(j), & \text{if } mask(j) = 1 \\ x_i(j), & \text{otherwise.} \end{cases}    (7)

The number of children to be formed by the crossover operator is given by a user-defined parameter, Pcrossover, which represents the fraction of the population involved in crossover operations.

The crossover operator tends to improve the overall quality of the population since better individuals are involved. As a result, the population will eventually converge, often prematurely, to copies of the same individual. In order to introduce new information, i.e., move to unexplored areas of the search space, the mutation operator is needed.
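The selection and crossover operators described above can be sketched in Python. This is an illustrative sketch, not the Matlab toolbox implementation analyzed in the paper; the function names are assumptions, and SUS as written assumes strictly positive fitness values:

```python
import random

def sus_select(fitness, NS):
    """Stochastic Universal Sampling: the first pointer falls at a random
    position in [0, f/NS]; each subsequent pointer advances by f/NS, and
    the individual covering each pointer is selected (eq. (5), Fig. 1)."""
    f = sum(fitness)                 # total fitness range, eq. (5)
    step = f / NS                    # sampling length f/NS
    start = random.uniform(0, step)  # first random position
    selected, cumulative, i = [], fitness[0], 0
    for k in range(NS):
        position = start + k * step
        while cumulative < position:  # advance to the covering individual
            i += 1
            cumulative += fitness[i]
        selected.append(i)
    return selected

def scattered_crossover(xi, xk):
    """Scattered (uniform) crossover: a random binary mask decides which
    parent contributes each gene, yielding two children (eqs. (6)-(7))."""
    mask = [random.randint(0, 1) for _ in xi]
    xi_new = [a if m == 1 else b for a, b, m in zip(xi, xk, mask)]
    xk_new = [b if m == 1 else a for a, b, m in zip(xi, xk, mask)]
    return xi_new, xk_new
```

With equal fitness values and NS = N, SUS degenerates to selecting every individual exactly once, which illustrates its low sampling variance compared with repeated roulette-wheel draws.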
Mutation
The Uniform mutation operator is used in this paper. Uniform mutation is a two-step process. Assuming an individual has been selected for mutation, the algorithm first selects a fraction of the vector elements for mutation; each element has the same probability, Rmutation, of being selected. Then, the algorithm replaces each selected element by a random number selected uniformly from the domain of that element. For example, assuming the element x_j of the individual x_i has been selected for mutation, the value of element x_j is changed by generating a random number from U(low_j, up_j).

In order to guarantee convergence of GA, an additional feature, elitism, is used. Elitism ensures that at least one of the best individuals of the current generation is passed on to the next generation. It is controlled by a user-defined value, Nelite, which indicates the number of top individuals, ranked according to their fitness values, that are copied to the next generation directly.

3.2. DIFFERENTIAL EVOLUTION (DE)

DE is a very powerful yet simple real-parameter optimization algorithm proposed by Storn and Price about 20 years ago [7, 8]. As with GA, a lot of variants of the basic algorithm with improved performance have been proposed [18, 19]. The evolutionary operations of classical DE can be summarized as follows [8].

Mutation
The mutation of a given individual x_i is defined by equation (8):

v_i = x_k + F (x_m - x_n),    (8)

where i, k, m, n \in {1, ..., N} are mutually different, and F > 0 is the mutation scale factor used to control the differential variation d_i = x_m - x_n.

Crossover
The crossover operator is defined by equation (9):

u_i(j) = \begin{cases} v_i(j), & \text{if } U(0,1) \le CR \\ x_i(j), & \text{otherwise,} \end{cases}    (9)

where CR \in [0, 1] is the crossover rate and controls how many elements of an individual are changed. u_i is the new individual generated by recombining the mutated individual v_i and the original individual x_i. This operator is basically the Uniform crossover ((6) or (7)), except for the fact that only one child is generated.

Selection
The selection operator is defined by equation (10):

x_{i,new} = \begin{cases} u_i, & \text{if } f(u_i) \ge f(x_i) \\ x_i, & \text{otherwise.} \end{cases}    (10)

Thus, the individual x_i is replaced by the new individual u_i only if u_i represents a better solution.

Based on these equations, it can be noticed that DE has three main control parameters, F, CR and N, which are problem dependent. Storn and Price [8] recommended N to be chosen between 5D and 10D, and F to be between 0.4 and 1. A lot of papers and research work have been published indicating methods to improve the ultimate performance of DE by tuning its control parameters [20-22]. In this paper, a variant of the types of DE discussed in [19] is used, where the mutation scale factor and the crossover rate are generated randomly from continuous uniform distributions.

3.3. PARTICLE SWARM OPTIMIZATION (PSO)

PSO belongs to the set of swarm intelligence algorithms. Even though there are some similarities to EA, it is not modeled after evolution but after swarming and flocking behaviors in animals and birds. It was initially proposed by Kennedy and Eberhart in 1995 [9]. A lot of variations and modifications of the basic algorithm have been proposed ever since [23, 24]. A candidate solution in PSO is referred to as a particle, while a set of candidate solutions is referred to as a swarm. A particle i is defined completely by three vectors: its position, x_i, its velocity, v_i, and its personal best position, x_{i,Best}. The particle moves through the search space according to a few simple formulae. Its movement is determined by its own best known position, x_{i,Best}, as well as the best known position of the whole swarm, xBEST. First, the velocity of the particle is updated using equation (11):

v_{i,new} = c_0 v_i + c_1 r_1 (x_{i,Best} - x_i) + c_2 r_2 (x_{BEST} - x_i),    (11)

then, the position is updated using equation (12):

x_{i,new} = x_i + v_{i,new},    (12)

where r_1 and r_2 are random numbers generated from U(0, 1), c_0 is the inertia weight, and c_1 and c_2 are the cognitive and social acceleration weights, respectively.

Modern versions of PSO, such as the one analyzed in this paper, do not use the global best solution, xBEST, in equation (11), but rather the local best solution, x_{i,LBest} [23, 25]. Hence, the velocity update equation is given by

v_{i,new} = c_0 v_i + c_1 r_1 (x_{i,Best} - x_i) + c_2 r_2 (x_{i,LBest} - x_i).    (13)

The local best solution of a given individual is determined by the best-known position within that particle's neighborhood. Different ways of defining the neighborhood of a particle can be found in [23, 25-28]. The analyzed PSO algorithm in this paper uses an adaptive random topology, where each particle randomly informs K particles and itself (the same particle may be chosen several times), with K usually set to 3. In this topology, the connections between particles randomly change when the global optimum shows no improvement [23, 25].
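The DE operators (8)-(10) and the basic PSO updates (11)-(12) can be sketched in Python as follows. This is an illustrative sketch under the paper's maximization convention, not the analyzed Matlab implementation; the function names and the objective `f` are assumptions:

```python
import random

def de_step(X, fitnesses, f, F, CR):
    """One DE generation for a maximization problem:
    mutation (8), uniform crossover (9), greedy selection (10)."""
    N, D = len(X), len(X[0])
    X_new = []
    for i in range(N):
        # Mutation: v_i = x_k + F * (x_m - x_n), with i, k, m, n mutually different.
        k, m, n = random.sample([j for j in range(N) if j != i], 3)
        v = [X[k][j] + F * (X[m][j] - X[n][j]) for j in range(D)]
        # Crossover: take the mutated gene with probability CR, eq. (9).
        u = [v[j] if random.random() <= CR else X[i][j] for j in range(D)]
        # Selection: keep u_i only if it is the better solution, eq. (10).
        X_new.append(u if f(u) >= fitnesses[i] else X[i])
    return X_new

def pso_step(x, v, x_best, x_gbest, c0, c1, c2):
    """One PSO update for a single particle: velocity (11), position (12).
    Using x_{i,LBest} instead of x_gbest gives the local-best variant (13)."""
    r1, r2 = random.random(), random.random()
    v_new = [c0 * vj + c1 * r1 * (pb - xj) + c2 * r2 * (gb - xj)
             for xj, vj, pb, gb in zip(x, v, x_best, x_gbest)]
    x_new = [xj + vj for xj, vj in zip(x, v_new)]
    return x_new, v_new
```

Because selection (10) is greedy per individual, the fitness of each population member in `de_step` is non-decreasing from one generation to the next.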
4. EXPERIMENTAL ANALYSIS

Three standard optimization test functions were used in performing the analyses: the Ackley function, the Rastrigin function and the Rosenbrock function.

Ackley's function, (14), in its 2D form is characterized by a nearly flat outer region, a lot of local minima, and a large hole at the center (Fig. 2):

f(x) = -20 \exp\left(-0.2 \sqrt{\frac{1}{D} \sum_{i=1}^{D} x_i^2}\right) - \exp\left(\frac{1}{D} \sum_{i=1}^{D} \cos(2 \pi x_i)\right) + 20 + e.    (14)

The domain is defined on the hypercube x_i \in [-5, 5], i = 1, ..., D, with the global minimum f(x^*) = 0 at x^* = (0, ..., 0).

Fig. 2. Ackley function for D = 2.

Fig. 3. Rastrigin function for D = 2.

On the other hand, the Rosenbrock function, (16), which is a popular test problem for gradient-based optimization algorithms, is unimodal, and the global minimum lies in a narrow, parabolic valley. Even though this valley is easy to find, convergence to the minimum is difficult [29]. The 2D plot is shown in Fig. 4:

f(x) = \sum_{i=1}^{D-1} \left[ 100 (x_{i+1} - x_i^2)^2 + (x_i - 1)^2 \right].    (16)

The domain is defined on the hypercube x_i \in [-5, 5], i = 1, ..., D, with the global minimum f(x^*) = 0 at x^* = (1, ..., 1).

Fig. 4. Rosenbrock function for D = 2.

GA, DE and PSO were tested on these three test functions for D = 2, 5, 10, 50 and 100. All analyses were performed in Matlab. The algorithm-specific control parameter values used in the experiments are listed in Table 1.

Table 1. Algorithm-specific control parameter values used in the experiments

GA:  Nelite = 2; Pcrossover = 0.8; Rmutation = 0.01
DE:  F ~ U(0.5, 2); CR ~ U(0.2, 0.9)
PSO: c_0 = 1 / (2 ln 2); c_1 = c_2 = 0.5 + ln 2
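The benchmark functions of this section can be sketched in Python. The Ackley constants (20 and 0.2) follow the standard benchmark definition, since they are partly illegible in the extracted equation (14); the Rastrigin formula is not reproduced in the extracted text at all, so its standard form is used here as an assumption:

```python
import math

def ackley(x):
    """Ackley function, eq. (14): nearly flat outer region, many local
    minima, large hole at the origin; global minimum f(0,...,0) = 0."""
    D = len(x)
    s1 = sum(xi ** 2 for xi in x) / D
    s2 = sum(math.cos(2 * math.pi * xi) for xi in x) / D
    return -20 * math.exp(-0.2 * math.sqrt(s1)) - math.exp(s2) + 20 + math.e

def rosenbrock(x):
    """Rosenbrock function, eq. (16): unimodal, narrow parabolic valley;
    global minimum f(1,...,1) = 0."""
    return sum(100 * (x[i + 1] - x[i] ** 2) ** 2 + (x[i] - 1) ** 2
               for i in range(len(x) - 1))

def rastrigin(x):
    """Rastrigin function (standard form, assumed): highly multimodal;
    global minimum f(0,...,0) = 0."""
    return 10 * len(x) + sum(xi ** 2 - 10 * math.cos(2 * math.pi * xi)
                             for xi in x)
```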
It can be concluded that PSO, in general, has better accuracy for high dimensional problems, but with a very poor runtime performance. If runtime is the main condition, then GA is a better optimization tool. However, care should be taken if the optimization problem is similar to the Ackley function.

5. CONCLUSION

In this paper, three heuristic algorithms are systematically analyzed and tested in detail for high dimensional real-parameter optimization problems. These algorithms are GA, DE and PSO. An overview of the implemented algorithms is provided. The algorithms are tested on three standard optimization functions, namely the Rastrigin, Rosenbrock and Ackley functions. For lower dimensional problems, i.e., problems involving at most 10 parameters, all three algorithms had comparable results. However, for higher dimensional problems, PSO outperformed the other algorithms in terms of accuracy but had a very poor runtime performance. On the other hand, the runtime performances of GA and DE did not change much with an increase in problem dimensionality.
Fig. 6. Accuracy performance of the heuristic algorithms for 100 trials for a) Rastrigin, b) Rosenbrock and c) Ackley function.

6. REFERENCES

[1] T. Weise, Global Optimization Algorithms: Theory and Applications, e-book, available online at http://www.it-weise.de/, accessed 20.08.2014.

[5] J. H. Holland, Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, The University of Michigan Press, Ann Arbor, 1975. Reprinted by MIT Press, April 1992.
[9] J. Kennedy, R. Eberhart, Particle Swarm Optimization, Proceedings of the IEEE International Conference on Neural Networks, Vol. 4, 1995, pp. 1942-1948.

[14] S. Luke, Essentials of Metaheuristics, Lulu, 2nd edition,

[20] J. Brest, S. Greiner, B. Boskovic, M. Mernik, V. Zumer, Self-adapting control parameters in differential evolution: a comparative study on numerical benchmark problems,

Comprehensive learning particle swarm optimizer for