
Rubik's Cube

Ibanescu Diana
Dudeanu Ermoghen
Cracaoanu Sergiu

Abstract

There are many algorithms for solving a scrambled Rubik's Cube. It is not known what the minimum number of moves required to solve any instance of the Rubik's Cube is.
When discussing the length of a solution, there are two common ways to measure it. The first is to count the number of quarter turns; the second is to count the number of face turns. A half turn of the front face counts as 2 moves in the quarter-turn metric and as only 1 move in the face-turn metric. A center square always remains a center square no matter how you turn the cube.
This paper presents an evolutionary approach that tries to solve Rubik's Cube. A chromosome represents a chain of moves that makes the cube pass from one state to another. The cube starts from the final (solved) state; we apply a number of randomly chosen transformations to this final state, and the resulting state becomes the initial state.
The operators used to transform the cube are mutation and crossover. The way these operators are used depends on the stage of the algorithm; they are tuned to maintain a balance between exploration and exploitation. If a local optimum is reached, exploration is increased to avoid getting stuck.
The fitness function depends on the number of squares that are in their final position and on the number of crosses that the cube has; in fact, two fitness functions are used.
The implemented algorithm does not always find the solution, but it offers an alternative that can find a solution in a short time, depending on the input cube.

Contents

I. Introduction
II. Deterministic algorithms
II.1. Simple solution
II.2. Ortega's Method
III. Genetic Algorithm
III.1. Cube representation
III.2. The components of the genetic algorithm
IV. Evolutionary computing SDKs
IV.1. ECJ
IV.2. Origin
V. Experiments

I. Introduction
Rubik's cube is a cube whose edge length is approximately 56 mm (2.2 in). It consists of 26 smaller cubes; one side is made up of 3x3 such cubes. If you take an ordinary cube and cut it as you would a Rubik's cube, you are left with 27 smaller cubes; the extra smaller cube comes from the center of the original cube, and this center piece is not present in the Rubik's cube. In fact, the smaller pieces of the Rubik's cube are not really cube-shaped. We call these 26 small pieces cubbies.
There are 6 cubbies in the middle of the individual sides, and those are rigidly interconnected by a six-armed spatial cross (they merely rotate around their axles and keep the other cubbies from falling off the cube). We will call those cubbies centers. The centers are connected to the cross by means of screws and springs, which holds the other cubbies nicely and tightly together. The center cubbies have a colored sticker on their external sides, which determines the color of each face. The colors are usually white, red, blue, yellow, orange, and green; sometimes a brownish color replaces red. The original Rubik's cube (as well as all recently produced Rubik-brand cubes) has white opposite yellow, red opposite orange, and blue opposite green. This coloring has a certain logic, called "plus yellow," because the side opposite a primary color (white, red, blue) is obtained by adding yellow (giving yellow, orange, green). An alternative coloring instead combines opposite sides for maximum contrast. Each coloring can further have two variants due to mirror reflection.
We also have 12 edge cubbies, or just edges. Those cubbies have two stickers of different colors. All color combinations occur, except those of colors on opposite faces.
The last type of cubbie is the eight corner cubbies, or just corners. Those cubbies have three differently-colored stickers on their mutually orthogonal sides.
Labeling of moves
As you surely know, the Rubik's cube has 6 sides. Each side can be rotated by a certain angle; this rotation is called a move (or a turn).
In order to be able to perform other moves afterwards, a side should be rotated by 0 (360), 90, 180, or -90 (270) degrees. The 90 and -90 degree turns are called quarter turns. The 180-degree turn is called a double turn (or sometimes, a bit misleadingly, a face turn). Of course, when you rotate a side, you rotate one third of the cube (9 cubbies). This array is called a layer.
Having a cube in front of you, the individual layers are rotated thus:
A layer facing towards you is the front layer, and is labeled as F (Front).
A layer facing away from you is the back layer and is labeled as B (Back).
A layer which is on top is the up layer and is labeled as U (Up).
A layer which is at the bottom is the down layer and is labeled as D (Down).
A layer which is on your right is the right layer and is labeled as R (Right).
A layer which is on your left is the left layer and is labeled as L (Left).
A layer which is between the left and right layer is labeled as M (Middle).
A layer which is between the top and bottom layer is labeled as E (Equator).
A layer which is between the front and back layer is labeled as S (Standing).
A move is labeled by the symbol of the layer you are turning (F, B, U, D, L, R). A symbol by itself labels a clockwise rotation of the layer. For a counterclockwise rotation, the layer symbol is followed by an apostrophe (a single quote). For a 180-degree rotation, the layer symbol is followed by the digit '2' or an exponent "to the second power."
Turning the M, E, and S layers is called a slice turn. For an M turn the direction is top-down, for M' bottom-up. For an E turn the direction is left-right, for E' right-left. For an S turn the direction is clockwise as seen from the front, for S' counterclockwise.
For a clearer understanding here is an example:

For example F2 U' R M' means: Rotate the front layer by a half-turn (180 degrees in any direction), then
rotate the up layer counterclockwise by a quarter-turn (90 degrees), then rotate the right layer clockwise by
a quarter-turn (90 degrees) and finally rotate the middle layer in a bottom-up direction by a quarter-turn.
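To make the notation concrete in code, here is a minimal Java sketch of a parser for such move strings. The Move record, the class name, and the encoding of turns as clockwise quarter turns are illustrative assumptions, not part of any particular cube library.

import java.util.ArrayList;
import java.util.List;

public class MoveParser {
    // A move is a layer symbol (F, B, U, D, L, R, M, E, S) plus a number of
    // clockwise quarter turns: 1 for "F", 2 for "F2", 3 for "F'" (-90 degrees).
    public record Move(char layer, int quarterTurns) { }

    public static List<Move> parse(String sequence) {
        List<Move> moves = new ArrayList<>();
        for (String token : sequence.trim().split("\\s+")) {
            char layer = token.charAt(0);
            int turns = 1;                            // plain symbol: clockwise quarter turn
            if (token.endsWith("2")) turns = 2;       // half turn
            else if (token.endsWith("'")) turns = 3;  // counterclockwise quarter turn
            moves.add(new Move(layer, turns));
        }
        return moves;
    }

    public static void main(String[] args) {
        System.out.println(parse("F2 U' R M'"));      // the example from the text
    }
}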

II. Deterministic algorithms


II.1. Simple solution
This method uses very few sequences that you need to memorize in order to solve the cube. Although there
are quite a few sequences provided in this solution, most of them are intuitive steps.
The method is split into two main steps: solve all eight corners of the cube, and then solve all twelve edges (and centers) while keeping the corners solved.
This method is an alternative with many advantages:
A smaller number of unintuitive sequences (nowadays often computer-generated)
Generally shorter sequences
Higher "symmetry" of solution (you solve the cube evenly)
You do not break what you solved in previous steps as much (easier recovery from mistakes)
It is quite efficient with respect to its simplicity
You can achieve very good times just by practicing the very basic method
You can scale up the method incrementally to gain speed and efficiency
1. Solve Four Bottom Corners
We will start by solving the four corners of the cube that share one color (in this case we will select white). This step can be solved intuitively.
Select one corner with a white sticker and turn the whole cube so that the white sticker of this corner is facing down. You have solved one of the 4 corners this way. Now look for the other corners with a white sticker and put them into the bottom layer using the appropriate one of the following sequences. Solve the corners one by one. When searching for the next corner to solve, you may freely turn the top layer to put the corner into a position in which you can apply the sequence.
Pay attention to aligning the colors of the corners on the sides: if they do not match as well, the corners are not in the correct places. The orange and green colors are just an example; there may be other color combinations (like blue-red, green-red, ...).
The cubbie on top, bottom sticker on the front side:

The cubbie on top, bottom sticker on the right side:

If you do not see any situation similar to one of the first two above (remember that you can freely turn the top layer to position the corner at the top-right-front position), the corners are in positions that are more difficult to solve. The following sequences will help you transform such positions into the ones you should already be familiar with.

The bottom sticker on the top:

The cubbie on bottom, bottom sticker on the front side:

The cubbie on bottom, bottom sticker on the right side:

One possible way to remember the last two sequences is "bring the white sticker to the top, put it back (inside the layer you just turned), reverse the first step".

2. Place Four Top Corners


To solve the four top corners you will need to temporarily destroy the 4 bottom corners. The question is: how to destroy and restore the bottom corners so that the top corners become solved? The simplest idea is to remove one bottom corner from its position (using one of the sequences given earlier) and solve it back in a different way. Let us look at an example showing the removal and restoration of the front-right-bottom corner:
Remove, position top layer, and restore corner (shown applied to a solved cube):

If you look at the result you may notice that the top corners have changed. Two corners are twisted (orange-blue-yellow and red-blue-yellow) and two are swapped (the top-right ones). If we carefully select how to turn the whole cube before applying this corner sequence, so that it affects the right corners, we can solve the top corners with just this one sequence! In this step, we will only move the corners to their correct positions while ignoring the way they are twisted. Thus our task is quite simple: apply the corner sequence (possibly several times) to place the corners in their correct positions (use the colors of the side stickers of the bottom corners to find the right ones).
As you can notice, the corner sequence swaps the top-right-front and top-right-back corners. You just need to turn the top layer and/or the whole cube (keeping the top layer facing up) to a position where swapping these two top-right corners will place at least one corner in a correct position. You can always get one of the following cases when turning the top layer to place the corners:
All corners are in their positions (although probably twisted) - this step is finished.
If two adjacent corners can be correctly positioned by turning the top face then only one swap of the other
two corners is necessary (make sure that you turn the cube so that these two corners are in top-right
positions when applying the sequence).
If two (diagonally) opposite corners can be correctly positioned by turning the top face then perform a swap
of any two top corners and you will obtain the previous situation.

3. Twist Four Top Corners


Now we are able to position the top corners using one (quite simple) corner sequence explained in the previous text, so there is no magic here up to this point. Let us try to follow the same approach for twisting the corners. We can twist (two) corners using the previous corner sequence; however, it also moves corners, which is not good for this step.
(Just a reminder: in this step we want to twist corners and NOT move them, because they were already positioned in the previous step.) The idea behind the corner sequence was to do some change and redo it in a different way, so that other parts of the cube change while everything solved before remains solved. Let us try the same idea in this step using our corner sequence: swap two corners using the corner sequence and swap them back from a different angle using the same corner sequence. If we can do so, the corners will be in their correct positions (swap + swap back = nothing), but will be somehow twisted. Before trying that, note that swapping two corners back from a different angle requires a left/right mirror of the corner sequence, which is shown below:
Mirror vision of the corner sequence (shown applied to a solved cube):

Now you can try the presented idea of doing and redoing (in a different way) the corner swap to twist top
corners:
Normal corner sequence, turn cube, mirrored corner sequence (shown applied to a solved cube):

You can see that this new twist sequence leaves all corners in their original positions and twists only two corners: the top-left-front corner is twisted clockwise and the top-left-back corner counter-clockwise. It is not difficult to twist all top corners into any orientation using this twist sequence. After repeatedly applying the twist sequence to the corners, as soon as three corners become oriented correctly, the remaining corner has no chance of being twisted incorrectly.
Examples:
The following examples show which angle to choose for the twist sequence. After one application the orientation of the top corners changes and you will get another case that is also shown below. You will not get into an infinite cycle if you follow them correctly.
Two twisted, facing opposite sides - apply from this angle
Two twisted, facing opposite sides - apply from this angle

Two twisted, facing the same side - apply from this angle

Two twisted, facing adjacent sides - apply from this angle

Three twisted, clock-wise - apply from this angle

Three twisted, counter clock-wise - apply from this angle

Four twisted, facing opposite sides - apply from this angle

Four twisted, facing three sides - apply from this angle

4. Solve Three Ledges


To solve the Ledges (which stands here for left-side edges - those
with white stickers in the pictures) you use the following simple
sequences.

Ledge in bottom-front:

Ledge in front-bottom:

Ledge in top-right:

Ledge in right-top:

Ledge in top-left (flipped in its place):

5. Solve Four Ridges


Ridges are right-side edges, which have yellow stickers in the pictures.
Ridge in bottom-front:

Ridge in front-bottom:

Ridge in top-left:

Ridge in left-top:

Ridge in top-right (flipped in its place):

6. Solve Last Ledge

Ledge in bottom-front:

Ledge in bottom-back:

7. Flip Midges
The edges in the ring usually need to be flipped before you can proceed to the following step of positioning them. How do you know which ones need to be flipped? There is a simple rule to spot incorrectly oriented edges: look at two colors - the color of an edge sticker (choose either one of the two) and the color of the center adjacent to the chosen edge sticker. If the colors are the same or opposite (red + orange or blue + green), the edge is just fine; it is flipped otherwise. There will be either none, two, or four flipped edges.

Two top midges flipped:

8. Place Midges
Three midges in forward cycle:

Three midges in backward cycle:

Two top midges and two bottom midges swapped:

Two and two midges diagonally swapped:

II.2. Ortega's Method


This solution method is designed to solve Rubik's cube quickly, efficiently, and without having to memorize a lot of sequences. For ease and speed of execution, turns are mostly restricted to the top, right, and front faces, and to the center and middle slices. Strong preference is given to the right face, since it is one of the easiest faces to turn for many people. Yet all sequences are minimal (or very close to minimal) in the slice-turn metric.
This solution method orients cubbies before positioning them. The idea is that it is easier to permute cubbies after they have been oriented than before, because once the cubbies have been oriented, the facelet colors that determine their permutation make easily identifiable patterns on the cube. Orienting cubbies, whether done before or after positioning them, is always easy because orientation requires focusing on only one face color and on the patterns that that color makes on the cube. For middle-slice edges on the last layer, permuting cubbies after they have been oriented is a very simple affair, thus reinforcing this principle.
1. Orient Top Corners
You should be able to manage this on your own. Do not worry about positions - all corners will be permuted in step 3. For the greatest speed and efficiency, try to do this in one look. For smoother cubing you should try to orient these corners on the bottom face, because the next step can then be done faster (no cube rotation afterwards, easier look-ahead).
(average number of turns for this step ... 5)

2. Orient Bottom Corners


Rotate the whole cube so that bottom face becomes top face. Orient
the corners depending on which of the seven patterns below you see:

T pattern:
R U R' U' F' U' F

L (F) pattern:
F R' F' U' R' U R

MI pattern:
R U R' U R U2 R'

PI pattern:
R U R2 F' R'2 U R'

U pattern:
R' F' U' F U R

H pattern:
R2 U2 R' U2 R2

(average number of turns for this step ... 7)


3. Position All Corners
A pair here represents two adjacent corners on the top or bottom layer.
Such a pair is considered to be solved correctly if the two corners are
positioned correctly relative to each other. A solved pair will be easy
to identify because the two adjacent facelets on the side (not top or
bottom) will be of the same color. A layer can have only zero, one, or
four correct pairs.
The number and location of correct pairs can be quickly identified by merely looking at two adjacent side
faces (that is, not top or bottom). For a given layer, if you see one correct pair and one incorrect pair, then
there is only one correct pair on that layer. If you see two correct pairs, then all four pairs are correct. If you
see no correct pairs but both pairs consist of opposite colors, then there are no correct pairs on that layer. If
you see no correct pairs and only one pair consisting of opposite colors, then there is one correct pair on
that layer, and it is opposite to the pair with the opposite colors.
Proceed with one of the following sequences depending on how many solved pairs you have:

0 (no pairs solved):


R2 F2 R2

1 (bottom-back pair solved):


R' U R' B2 R U' R

2 (top-back and bottom-back pairs solved):


R2 U F2 U2 R2 U R2

4 (bottom pairs solved):

5 (bottom and top-back pairs solved):


R U' R F2 R' U R F2 R2

(average number of turns for this step ... 8)


At this point, align the corners and position the centers. The cube is now fully symmetric except for the edges. Pick the new top and bottom face depending on what will make solving the top and bottom edges easiest. Steps 4 and 5 can be combined, although this requires monitoring more cubbies simultaneously and may not yield a speed gain or a reduction in the number of moves.
4. Solve Three Top Edges
In order to do this step efficiently, you do not need to position the centers and align the corners in the previous step. Instead, you can solve the first (or the first two opposite) top edges using one or two turns while ignoring the centers, and then solve the top center together with another top edge.
(average number of turns for this step ... 9)
5. Solve Three Bottom Edges
To reduce the number of turns required, you can combine this step with the following one when solving the third bottom edge. There are several possible cases that are easy to find and very efficient. In addition, you should force yourself to look ahead in this step and try to prevent the slower cases from occurring.
(average number of turns for this step ... 12)

6. Solve One More Top or Bottom Edge


Often, you can solve the last top or bottom edge in the previous step and thus omit this step, reducing turns and time.
(average number of turns for this step ... 4)
At this point, the last top or bottom edge will either be in the middle layer, in position but not oriented, or solved. Depending on the case, proceed as follows to solve that last edge (if necessary) while orienting the middle-layer edges.
7. Solve Last Top Edge and Orient Middle Edges

a) Top Edge in Middle Layer


Position the "notch" at top-right and the edge cubbie at left-front, with
the facelet with the top color on the left face. If the edge cubbie is
twisted, mirror vertically (top-right becomes bottom-right, right-face
turns go in the opposite direction)
As shown in the diagram, the pink-marked edges are oriented
correctly - O - if the pink facelet's color matches the color of the
adjacent or opposite center. Otherwise the edge is oriented incorrectly
(flipped) - X.
OOO
E R E R' E' R' E R

OOX
E R' E' R' E R' E' R'

OXO
E' R E R2 E R

OXX
R' E' R E R' E R

XOO
R' E' R E R' E R

XOX
R' E R' E' R' E' R E R E' R

XXO
R' E' R' E R E R

XXX
R' E R' E R' E' R2 E' R

b) U Edge Twisted in Its Position


There will be 1 or 3 twisted edges in the middle layer:
front-right twisted:

R U2 R' E2 R2 E' R' U2 R'

front-right not twisted:


R' E' R' E' R' E' R'

c) U Edge Solved
There will be 0, 2, or 4 edges twisted in the middle layer:
2 adjacent (front-left and front-right):
R2 F M F' R2 F M' F'

2 opposite (front-left and back-right):


F M F' R2 F M' F' R2

4:
R F2 R2 E R E2 R F2 R2 E R

(average number of turns for this step ... 9)


8. Position Midges
Send front-right to back-left, back-left to back-right, and back-right to front-right:
R2 E' R2

Exchange centers with opposites:


M2 E' M2

Exchange front-right with back-right, front-left with back-left:


R2 E2 R2

(average number of turns for this step ... 4)


Average number of turns for this method ... 58

III. Genetic Algorithm


The Rubik's Cube problem consists of finding a minimal chain of moves that solves the cube. Let us imagine that after a given chain of moves the cube arrives in some state.
The genetic algorithm used in this paper is in fact a chain of genetic algorithms. A genetic algorithm is used for every state that the cube passes through, and all of these put together constitute the solution of the problem.
The chain that brings the cube from one state to a better state is selected by a genetic algorithm. In this way, the cube passes through the best states, so the chances of reaching the final state are higher.
Initialization:
begin
  t := 0;
  Generate(P(t));
  Evaluate(P(t));
  SortPopulation(P(t));
  fitness_new := average(fitness);
end
Iteration:
do
  t := t + 1;
  fitness_old := fitness_new;
  P(t) := P(t) + Mutation(P(t)) + Crossover(P(t));
  Evaluate(P(t));
  SortPopulation(P(t));
  Selection(P(t));
  fitness_new := average(fitness);
while (|fitness_old - fitness_new| > 10^-7);

III.1. Cube representation


The algorithm represents each of the six faces of the cube as an array of 27 bits stored in a BitSet; each of the nine cells holds a color encoded in 3 bits. Using a BitSet minimizes memory consumption, which matters because a whole set of candidate solutions has to be kept in memory.
The twist is the only operation on the cube. For every face there are three possible twists: 90 degrees clockwise, 180 degrees, and 90 degrees counter-clockwise.
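As an illustration of this representation, the following is a minimal Java sketch of a single face stored in a 27-bit BitSet with 3 bits per cell. The exact bit layout and the color indices (0-5) are assumptions made here for clarity; the paper does not specify them.

import java.util.BitSet;

public class Face {
    private final BitSet bits = new BitSet(27);   // 9 cells x 3 bits per cell

    // Store the color index (0..5) of cell 0..8 in bits [3*cell, 3*cell+2].
    public void setColor(int cell, int color) {
        for (int i = 0; i < 3; i++)
            bits.set(3 * cell + i, ((color >> i) & 1) == 1);
    }

    // Read the 3-bit color index back out of the BitSet.
    public int getColor(int cell) {
        int color = 0;
        for (int i = 0; i < 3; i++)
            if (bits.get(3 * cell + i)) color |= (1 << i);
        return color;
    }
}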

III.2. The components of the genetic algorithm


Coding the solutions
A solution (chromosome) represents a chain of moves that makes the cube pass from one state to another.
Initialization
To establish which state will be the initial one, the cube performs some random moves starting from the final (solved) state. This is done in order to ensure that, starting from this state and applying different moves, the cube is able to arrive back in the final state. This randomly chosen state becomes the initial state and the starting point for the genetic algorithm. The genetic algorithm will then try to find a way to reach the final state by applying chains of moves. Every chain of moves brings the cube into transition states.
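A minimal sketch of this initialization step is shown below, assuming moves are encoded as a face index (0-5) plus a number of clockwise quarter turns (1-3); these conventions and the class name are illustrative assumptions only.

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class Scrambler {
    public record Twist(int face, int quarterTurns) { }

    // Generate a random chain of twists; applying it to the solved cube
    // yields the scrambled state used as the initial state of the GA.
    public static List<Twist> randomScramble(int numMoves, long seed) {
        Random rng = new Random(seed);
        List<Twist> scramble = new ArrayList<>();
        for (int i = 0; i < numMoves; i++)
            scramble.add(new Twist(rng.nextInt(6), 1 + rng.nextInt(3)));
        return scramble;
    }
}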
The fitness function
Every individual in the population is evaluated in order to measure its performance. The fitness function used in this algorithm counts the cells that are in their final position. The algorithm stops when all the cells are in their final position. In addition, the number of crosses gives a boost to the fitness value, acknowledging the importance of this particular pattern in the most successful solving algorithms.
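A minimal sketch of this fitness idea follows, assuming the cube state and the solved state are given as flat arrays of facelet colors; the cross bonus and its weight are simplified assumptions, since the paper does not give the exact formula.

public class Fitness {
    // Count the cells already in their final position and add a weighted
    // bonus for each completed cross on the cube.
    public static double evaluate(int[] state, int[] solved, int crosses, double crossWeight) {
        int correct = 0;
        for (int i = 0; i < state.length; i++)
            if (state[i] == solved[i]) correct++;
        return correct + crossWeight * crosses;
    }
}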
Mutation
The mutation operator consists of randomly changing moves in the chromosome. The number of genes of an individual that participate in mutation varies from one individual to another. A parameter alpha ∈ [0, 1] is drawn and used to decide whether mutation is applied to a gene (if alpha < pm) or not (if alpha > pm). The same mechanism decides whether an individual participates in mutation at all. The parameter pm denotes the probability of mutation and depends on various factors such as the iteration number, the fitness dispersion, and the crossover rate. The initial value of pm is pm = 0.7 * N, where N is the total number of chromosomes in the population. The chromosome that results from this process is added to the current population without erasing the initial chromosome; this way, the risk of losing possibly good solutions in future generations is eliminated.
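A minimal sketch of this mutation operator, assuming a chromosome is a list of moves encoded as integers 0-17 (6 faces times 3 twist amounts); the encoding and the class name are illustrative assumptions.

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class Mutation {
    private static final Random RNG = new Random();

    // For each gene, draw a fresh alpha in [0,1] and replace the move with a
    // random one when alpha < pm. The parent is kept; the mutated copy is
    // returned and added to the population elsewhere.
    public static List<Integer> mutate(List<Integer> chromosome, double pm) {
        List<Integer> child = new ArrayList<>(chromosome);
        for (int gene = 0; gene < child.size(); gene++) {
            double alpha = RNG.nextDouble();
            if (alpha < pm)
                child.set(gene, RNG.nextInt(18));
        }
        return child;
    }
}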
Crossover
Let pc be the probability of crossover. The mating pool S for crossover is formed by the best 40% of individuals. From this pool, the number of individuals that will participate in the crossover operation is N * 0.4 * pc. The initial value of pc is pc = 0.2 * N, where N is the total number of chromosomes in the population.
Two parents, denoted x1 and x2, are selected from the pool S. Given a value a ∈ [0, 1], two descendants are formed from the two parents using a single cut point.
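A minimal sketch of the single-cut-point crossover described above, again representing moves as plain integers; parent selection from the pool S is assumed to happen elsewhere.

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class Crossover {
    private static final Random RNG = new Random();

    // Exchange the tails of the two parents at one random cut point,
    // producing two descendants (assumes both parents have at least two genes).
    public static List<List<Integer>> cross(List<Integer> x1, List<Integer> x2) {
        int cut = 1 + RNG.nextInt(Math.min(x1.size(), x2.size()) - 1);

        List<Integer> child1 = new ArrayList<>(x1.subList(0, cut));
        child1.addAll(x2.subList(cut, x2.size()));

        List<Integer> child2 = new ArrayList<>(x2.subList(0, cut));
        child2.addAll(x1.subList(cut, x1.size()));

        return List.of(child1, child2);
    }
}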
Selection for the next iteration
Applying the mutation and crossover operations creates new individuals that are added to the current population. This population is sorted decreasingly by fitness value, so that the first individual has the best fitness and the last one the worst. Selection then keeps the first N best individuals, where N is the size of the initial population.
Stop condition
The algorithm is allowed to run for 200 iterations. If it does not reach the final state, the algorithm picks the best individual found so far, i.e. the one that brings the cube into the best state. Then another genetic algorithm starts from this state, trying to bring the cube into an even better state.
Mutation versus crossover rate
Observing the behavior of the cube, there are situations in which the cube arrives in a state where only two or three cells are out of place, and no matter how many mutations and crossovers are applied it cannot escape this local maximum. A solution for this situation is to increase exploration and decrease exploitation. This is done by increasing the mutation rate and decreasing the crossover rate in order to move away from this local maximum.
At every iteration the value fitness_old is computed, representing the average fitness at the previous iteration, and the value fitness_new, representing the average fitness at the current iteration. When |fitness_old - fitness_new| < td, where td is a tolerance threshold (td = 10^-7), the mutation rate is increased by 0.7 and the crossover rate is decreased by 0.2.
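A minimal sketch of this adaptation rule; the field names and the class are assumptions, and only the threshold and the rate adjustments come from the description above.

public class RateAdaptation {
    static final double TD = 1e-7;   // tolerance threshold td

    double pm;   // mutation rate
    double pc;   // crossover rate

    // When the average fitness stagnates (local maximum), shift the balance
    // toward exploration: raise the mutation rate, lower the crossover rate.
    void adapt(double fitnessOld, double fitnessNew) {
        if (Math.abs(fitnessOld - fitnessNew) < TD) {
            pm += 0.7;
            pc -= 0.2;
        }
    }
}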

Improvements of the algorithm


The algorithm uses a tabu search memory in order to avoid the cycling situation in which two moves cancel each other. This improves the performance of the algorithm, because it forces the cube to avoid recently visited states.
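A minimal sketch of such a tabu memory, assuming cube states can be hashed to a long; the memory size and the hashing are assumptions not specified in the paper.

import java.util.ArrayDeque;
import java.util.Deque;

public class TabuMemory {
    private final Deque<Long> recent = new ArrayDeque<>();
    private final int capacity;

    public TabuMemory(int capacity) { this.capacity = capacity; }

    // Returns true and records the state if it was not visited recently;
    // returns false if the move would lead back to a recently visited state.
    public boolean tryVisit(long stateHash) {
        if (recent.contains(stateHash)) return false;
        recent.addLast(stateHash);
        if (recent.size() > capacity) recent.removeFirst();
        return true;
    }
}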

Enhancement
One enhancement for solving the cube is bidirectional search. Let s0 be the initial state and g the final state. The genetic algorithm generates chromosomes of length m. Two genetic algorithms are used: one starts from the initial state and the other from the final state, and they evolve independently until a common state is reached. A correlation function compares the states generated by the forward and backward searches. If any paths from the forward and backward searches cross, the path from s0 to g is determined. If no paths cross, the correlation function finds the two closest states, and the cube passes into the states s1 and g1, until the sequence that joins the two searches is found.
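A minimal sketch of the meeting test used by such a bidirectional search: each search keeps the set of (hashed) states it has reached, and the path is complete when the two sets intersect. The hashing and the set representation are assumptions.

import java.util.HashSet;
import java.util.Set;

public class BidirectionalMeet {
    // True if the forward search (from the scrambled state) and the backward
    // search (from the solved state) have reached at least one common state.
    public static boolean pathsCross(Set<Long> forwardStates, Set<Long> backwardStates) {
        Set<Long> meeting = new HashSet<>(forwardStates);
        meeting.retainAll(backwardStates);
        return !meeting.isEmpty();
    }
}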

IV. Evolutionary computing SDKs


IV.1. ECJ
ECJ is a research EC system written in Java. It models iterative evolutionary processes using a series of pipelines arranged to connect one or more subpopulations of individuals with selection, breeding (such as crossover) and mutation operators. It was designed to be highly flexible, with nearly all classes (and all of their settings) dynamically determined at runtime by a user-provided parameter file. All structures in the system are arranged to be easily modifiable. Even so, the system was designed with an eye toward efficiency.
The ECJ source is licensed under the Academic Free License 3.0, except for the MersenneTwister and MersenneTwisterFast Java classes, which are covered by their own license.
General Features:
- GUI with charting
- Platform-independent checkpointing and logging
- Hierarchical parameter files
- Multithreading
- Mersenne Twister random number generators
- Abstractions for implementing a variety of EC forms

Vector (GA/ES) Representations:
- Fixed-length and variable-length genomes
- Arbitrary representations
- Ten pre-done vector application problem domains (rastrigin, sum, rosenbrock, sphere, step, noisy-quartic, booth, griewangk, nk, hiff)

Other Representations:
- Multiset-based genomes in the rule package, for evolving Pitt-approach rulesets or other set-based representations

EC Features:
- Asynchronous island models over TCP/IP
- Master/slave evaluation over multiple processors, with support for generational, asynchronous steady-state, and coevolutionary distribution
- Genetic Algorithms/Programming style steady-state and generational evolution, with or without elitism
- Evolution Strategies style (mu, lambda) and (mu + lambda) evolution
- Very flexible breeding architecture
- Many selection operators
- Multiple subpopulations and species
- Inter-subpopulation exchanges
- Reading populations from files
- Single- and multi-population coevolution
- SPEA2 multiobjective optimization
- Particle Swarm Optimization
- Differential Evolution
- Spatially embedded evolutionary algorithms
- Hooks for other multiobjective optimization methods

An example: Build a Genetic Algorithm for the MaxOnes Problem


We will build an evolutionary computation system that uses:
- Generational evolution
- A GA-style selection and breeding mechanism (a pipeline of tournament selection, then crossover, then
mutation)

- A single, non-coevolutionary population


- A simple, floating-point fitness value (no multiobjective fitness stuff)
- A fixed-length vector representation (MaxOnes uses a vector of bits)
- Only one thread of execution.
- Only one process (no island models or other such funkiness).
The example was tested on a UNIX system.

Create an app subdirectory and parameters file


Go into the ec/app directory and create a directory called tutorial1. In this directory, create a file called
tutorial1.params. The params file is where we will specify parameters which direct ECJ to do an
evolutionary run. ECJ parameters guide practically every aspect of ECJ's operation, down to the specific
classes to be loaded for various functions.
ECJ's top-level object is ec.Evolve. Evolve has only one purpose: to initialize a subclass of
ec.EvolutionState, set it up, and get it going. The entire evolutionary system is contained somewhere
within the EvolutionState object or a sub-object hanging off of it.
The EvolutionState object stores a lot of top-level global evolution parameters and several important top-level objects which define the general evolution mechanism. Some of the parameters include:
- The number of generations to run
- Whether or not we should quit when we find an ideal individual, or go on to the end of generations
Some of the top-level objects inside EvolutionState include:
- A subclass of ec.Initializer, responsible for creating the initial population.
- An ec.Population, created initially by the Initializer. A Population stores an array of
ec.Subpopulations. Each Subpopulation stores an array of ec.Individuals, plus an ec.Species which
specifies how the Individuals are to be created and bred. We'll be using a Population with just a single
Subpopulation.
- A subclass of ec.Evaluator, responsible for evaluating individuals.
- A subclass of ec.Breeder, responsible for breeding individuals.
- A subclass of ec.Exchanger, responsible for exchanging individuals among subpopulations or among
different processes. Our version of Exchanger won't do anything at all.
- A subclass of ec.Finisher , responsible for cleaning up when the system is about to quit. Our Finisher
won't do anything at all.
- A subclass of ec.Statistics, responsible for printing out statistics during the run.
- An ec.util.Output facility, responsible for logging messages. We use this instead of
System.out.println(...), because Output makes certain guarantees about checkpointing, thread-safeness, etc.,
and can also prematurely quit the system for us if we send it a fatal or error message.
- An ec.util.ParameterDatabase. The ParameterDatabase stores all the parameters loaded from our
params file and other parameter files, and is used to help the system set itself up.
- One or more ec.util.MersenneTwisterFast random number generators, one per thread. Since we're
using only one thread, we'll only have one random number generator.

Define Parameters for the Evolve object


Let's begin by defining some basic parameters in our params file which the Evolve class uses. Since Evolve
(oddly given its name) isn't involved in evolution, these parameters are mostly administrative stuff. Add the
following parameters to your tutorial1.params file.
verbosity = 0
flush = true
store = true

Most of the things ECJ prints out to the terminal are messages. A message is a string which is sent to the Output facility to be printed and logged. Messages can take several forms, though you'll usually see: plain-old messages, warnings, errors, and fatal errors. A fatal error causes ECJ to quit as soon as it is printed and logged. An ordinary error raises an error flag in the Output facility; ECJ can wait after a string of errors before it finally quits (giving you more debugging information). Warnings and messages do not quit ECJ.

The verbosity parameter tells ECJ what kinds of things it should print to the screen: a verbosity of 0 says
ECJ should print everything to the screen, no matter how inconsequential. The verbosity can be changed.
The flush parameter tells ECJ whether or not it should immediately attempt to flush messages to the
screen as soon as it logs them; generally, you'd want this to be true. The store parameter tells ECJ
whether or not it should store messages in memory as it logs them. Unless you have an absolutely
gargantuan number of messages, this should probably be true.
breedthreads = 1
evalthreads = 1
seed.0 = 4357

This tells ECJ whether or not it should be multithreaded. If you're running on a single-processor machine, it
rarely makes sense to be multithreaded (in fact, it's generally slower). breedthreads tells the Breeder how
many threads to spawn when breeding. evalthreads tells the Evaluator how many threads to spawn when
evaluating.
Each thread will be given its own unique random number generator. You should make sure that these
generators have different seeds from one another. The generator seeds are seed.0, seed.1, ...., up to
seed.n where n = max(breedthreads,evalthreads) - 1. Since we have only one thread, we only need one
random number generator. 4357 is a good initial seed for the generator: but remember that if you run your
evolution twice with the same seed, you'll get the same results! So change your seed for each run. If you'd
like the system to automatically change the seed to an arbitrary seed each time you run, you can base the
seed on the current wall clock time. You do this by saying seed.0 = time.
Next let's define our evolution state. The simple package defines lots of basic generational evolution stuff,
and we can borrow liberally from it for most of our purposes. We'll start by using its EvolutionState
subclass, ec.simple.SimpleEvolutionState. We do this by defining a final parameter which Evolve uses to
set stuff up:
state = ec.simple.SimpleEvolutionState

Define Parameters for the SimpleEvolutionState object


SimpleEvolutionState defines a simple, generational, non-coevolutionary evolution procedure. The
procedure is as follows:
1. Call the Initializer to create a Population.
2. Call the Evaluator on the Population, replacing the old Population with the result.
3. If the Evaluator found an ideal Individual, and if we're quitting when we find an ideal individual, then go
to Step 9.
4. Else if we've run out of generations, go to Step 9.
5. Call the Exchanger on the Population (asking for a Pre-breeding Exchange), replacing the old Population
with the result.
6. Call the Breeder on the Population, replacing the old Population with the result.
7. Call the Exchanger on the Population (asking for a Post-breeding Exchange), replacing the old
Population with the result.
8. Increment the generation count, and go to Step 2.
9. Call the Finisher on the population, then quit.
In between any of these steps, there are hooks to call the Statistics object so it can update itself and print
out statistics information. Since our Exchanger will do nothing, steps 5 and 7 won't do anything at all.
SimpleEvolutionState can work with a variety of Initializers, Evaluators, Breeders, Exchangers, Finishers,
and Populations. But to keep things simple, let's use the basic ones which go along with it nicely. Here are
some parameters which will direct SimpleEvolutionState to load these classes:
pop = ec.Population
init = ec.simple.SimpleInitializer
finish = ec.simple.SimpleFinisher
breed = ec.simple.SimpleBreeder
eval = ec.simple.SimpleEvaluator
stat = ec.simple.SimpleStatistics
exch = ec.simple.SimpleExchanger

SimpleInitializer makes a population by loading an instance of (in this case ec.Population) and telling it to
populate itself randomly. Populations, by the way, can also load themselves from files (see the
Subpopulation documentation). The SimpleEvaluator evaluates each individual in the population
independently. The SimpleStatistics just reports basic statistical information on a per-generation basis. The
SimpleExchanger and SimpleFinisher do nothing at all.
Additionally, there are some more parameters that SimpleEvolutionState needs:
generations = 200
quit-on-run-complete = true
checkpoint = false
prefix = ec
checkpoint-modulo = 1

generations is the number of generations to run. quit-on-run-complete tells us whether or not we should quit ECJ when it finds an ideal individual; otherwise it will continue until it runs out of generations.
checkpoint tells ECJ that it should perform checkpointing every checkpoint-modulo generations, using
a Gzipped checkpoint file whose name begins with the prefix specified in prefix. Checkpointing saves out
the state of the entire evolutionary process to a file; you can then start from that point by launching ECJ on
that checkpoint file. If you have a long run and expect that the power might go out or the system might be
shut down, you may want to checkpoint. Otherwise don't do it, it's an expensive thing to do.

Define the Statistics File


SimpleStatistics requires a file to write out to. Let's tell it that it should write out to a file called out.stat, located in the directory where the user launched ECJ (that's what the $ is for):
stat.file = $out.stat

How do we know that SimpleStatistics needs a file? Because it says so. A great many objects in ECJ have
parameter bases. The parameter base is passed to the object when it is created, and is prefixed to its
parameter names. That way, for example, you could conceivably create two different Statistics objects, pass
them different bases, and they'd be able to load different parameters. Some ECJ objects also have a default
base which defines a secondary parameter location that the object will look for if it can't find a parameter it
needs at its standard parameter base. This allows some objects to all use the same default parameters, but
specialize only on certain ones.
SimpleStatistics doesn't have a default base. It's too high-level an object to need one. The base for our
SimpleStatistics object is stat. Usually the bases for objects correspond with the parameter name that
specified what class they were supposed to be. For SimpleStatistics, for example, the class-specifying
parameter was stat = ec.simple.SimpleStatistics, hence stat is the base, and the SimpleStatistics'
output filename is at stat.file.
If no file is specified, by the way, SimpleStatistics will just output statistics to the screen.

Define the Population parameters


We begin by telling ECJ that the Population will have only one Subpopulation, and we'll use the default
Subpopulation class for subpopulation #0.
pop.subpops = 1
pop.subpop.0 = ec.Subpopulation

Note that Population, like Statistics, also uses parameter bases (in this case its base is pop). Similarly,
Subpopulation #0 has a parameter base. It will be, you guessed it, pop.subpop.0. Let's define some stuff
about Subpopulation #0:
pop.subpop.0.size = 100
pop.subpop.0.duplicate-retries = 0
pop.subpop.0.species = ec.vector.VectorSpecies

We've first stated that the size of the subpopulation is going to be 100 individuals. Also, when initializing themselves, subpopulations can guarantee that they won't duplicate individuals: they do this by generating an individual over and over again until it's different from its peers. Here we're telling the system not to bother to do this; duplicates are fine.
As mentioned earlier, every Subpopulation has an associated ec.Species which defines features of the
Individuals in the Subpopulation: specifically, how to create them and how to breed them. This is the first
representation-specific object we've seen so far: ec.vector.VectorSpecies defines a particular kind of
Species that knows how to make BitVectorIndividuals, which are the kind of individuals we'll be using.
Other kinds of individuals require their own special Species classes.

Define the Representation


Species hold a prototypical Individual which they clone multiple times to create new Individuals for that
Species. This is the first place you will see the notion of prototypes in ECJ, a concept that's used widely. A
prototype is an object which can be loaded once from the parameter files, and set up, then cloned
repeatedly to make lots of customized copies of itself. In ECJ, Individuals are prototypes.
The parameters for ec.Species are where the individual is specified:
pop.subpop.0.species.ind = ec.vector.BitVectorIndividual

Here we stipulate that the kind of individual used is ec.vector.BitVectorIndividual, which defines an Individual that holds a vector of boolean values. VectorSpecies also holds various parameters that all individuals of that species will abide by:
pop.subpop.0.species.genome-size = 100
pop.subpop.0.species.crossover-type = one
pop.subpop.0.species.crossover-prob = 1.0
pop.subpop.0.species.mutation-prob = 0.01

This stipulates that our individuals will be vectors of 100 bits, that their "default" crossover will be one-point crossover, that if we use the default crossover we will use it 100% of the time to breed individuals (as opposed to 0% direct copying), and finally that if we use the "default" mutation, then each bit will have a 1% probability of getting bit-flipped, independent of other bits.
We'll get to the "default" crossover and mutation in a second, but first note that VectorSpecies is a Prototype, and Prototypes almost always have default parameter bases to fall back on. The default parameter base for VectorSpecies is vector.species (see the VectorSpecies documentation). For example, instead of explicitly saying that all individuals in the species used in subpopulation #0 of the population are supposed to have a genome size of 100, we could have simply said that all individuals belonging to any VectorSpecies have a genome size of 100 unless otherwise stipulated. We say it like this:
vector.species.genome-size = 100

Define the Fitness


Fitnesses are similarly defined:
pop.subpop.0.species.fitness = ec.simple.SimpleFitness

Every Individual has some kind of fitness attached to it, defined by a subclass of ec.Fitness. Fitnesses are
not built into Individuals; and instances of the same Individual subclass can have different kinds of
Fitnesses if you see fit. Fitnesses are prototypes just like Individuals are: each Species instantiates one
Fitness subclass, called the prototypical Fitness, and uses that class to clone copies which are attached to
new Individuals the Species has created.
Here we say that we will use ec.simple.SimpleFitness as our fitness class. SimpleFitness defines fitness values from 0.0 inclusive to infinity exclusive, where 0.0 is the worst possible fitness and infinity is better than the ideal fitness. You can define the ideal fitness to be any value greater than 0; we'll get to that later.

Define the Breeding Procedure


ECJ has a very flexible breeding mechanism called a breeding pipeline. It's not actually a pipeline per se:
it's really a tree. The leaves of the tree are selection methods, responsible for picking individuals from the
old population. Nonleaf nodes of the tree are breeding operators, which take individuals handed to them by their child nodes, breed them, and send them to their parent node. The root of the tree then hands completely-bred individuals to be added to the new population.
We will define a breeding pipeline which does the following. First, it picks two individuals from the
population and hands them to be crossed over. The crossover operator then hands the individuals to a
mutation operator to be mutated. The mutation operator then hands the individuals off to be placed into the
new population. The tree thus has a mutation operator at the root, with one child (the crossover operator).
The crossover operator has two children, each selection methods.
For a mutation operator we will use ec.vector.breed.VectorMutationPipeline. This operator requests
Individuals of its sole child source (the crossover operator), then mutates all of them. It mutates them by
simply calling the default mutation method defined in the Individuals themselves. If you want some non-default mutation method (like vector inversion), you'll need to define your own BreedingPipeline subclass
to do the custom mutation.
Similarly, for a crossover operator we will use ec.vector.breed.VectorCrossoverPipeline. This
operator requests one Individual from each of its two child sources (in this case, the selection methods),
then crosses them over and returns both of them at once. This pipeline does its crossover simply by calling
the default crossover method defined in the Individuals themselves. Once again, if you want a special kind
of crossover not stipulated in the defaults, you'll need to define your own BreedingPipeline subclass to do
the special crossover.
Lastly, for our selection methods, let's use ec.select.TournamentSelection, which defines basic
tournament-selection.
The root of the pipeline is defined by the parameter pop.subpop.0.species.pipe, and everything else
derives its base off of it in a hierarchical fashion:
pop.subpop.0.species.pipe = ec.vector.breed.VectorMutationPipeline
pop.subpop.0.species.pipe.source.0 = ec.vector.breed.VectorCrossoverPipeline
pop.subpop.0.species.pipe.source.0.source.0 = ec.select.TournamentSelection
pop.subpop.0.species.pipe.source.0.source.1 = ec.select.TournamentSelection

Because the default mutation and crossover probabilities and types were defined as part of the
BitVectorIndividuals, we don't need to stipulate those parameters here. But one thing is left: we have to
define the tournament size for our TournamentSelection to be 2. We could explicitly define sizes for each
of the selection operators as follows:
pop.subpop.0.species.pipe.source.0.source.0.size = 2
pop.subpop.0.species.pipe.source.0.source.1.size = 2

...but TournamentSelection (and all selection methods and breeding pipeline operators) is a Prototype, and
so it has a default base we could simply use instead:
select.tournament.size = 2

Define the Problem


So far, we've managed to define the high-level evolutionary process, administrative details, representation,
and complete breeding procedure without writing a single drop of code. But not any more. Now we have to
write the object that's actually responsible for assessing the fitness of our Individuals. This object is called a
Problem, and it is specified as a parameter in our Evaluator. We will create a Problem subclass called
ec.app.tutorial1.MaxOnes which will take an Individual, evaluate it, and hand it back. Before we do so,
we have one more self-explanatory parameter to define:
eval.problem = ec.app.tutorial1.MaxOnes

Now close the tutorial1.params file and open a new file (also in the tutorial1 directory) called
MaxOnes.java. In the file, write:
package ec.app.tutorial1;
import ec.*;
import ec.simple.*;

import ec.util.*;
import ec.vector.*;
public class MaxOnes extends Problem implements SimpleProblemForm
{

First, it defines a setup method, which you can override (remember to call super.setup(...) ) to set up the
prototypical Problem from a parameter file. Your Problem will be a clone of this prototypical Problem.
Second, it defines the method clone() which is used to make (deep) copies of the Problem. Java's clone()
method doesn't deep-clone by default; so if you have an object holding (for example) an array inside it, and
clone the object, the array isn't cloned. Instead both objects now point to the same array. ECJ instead calls
clone() on you, and you're responsible for cloning yourself properly.
Since we're not defining any instance data that needs to be loaded from a parameter file or specially cloned,
we don't even need to touch these methods. So what methods do we actually need to implement? As it turns
out, Problem doesn't actually define any methods for evaluating individuals. Instead, there are special
interfaces which various Evaluators use that you must implement. SimpleEvaluator requires that its
Problems implement the ec.simple.SimpleProblemForm interface. This interface defines two methods,
evaluate (required) and describe (optional). evaluate takes an individual, evaluates it somehow, sets its
fitness, and marks it evaluated. describe takes an individual and prints out to a log some information about
how the individual operates (maybe a map of it running around, or whatever you'd like). describe is called
when the statistics wants to print out "special" information about the best individual of the generation or of
the run, and it's not necessary. We'll leave it blank.
Here's the first part of the evaluate method:
// ind is the individual to be evaluated.
// We are given state and threadnum primarily so we
// have access to a random number generator
// (in the form: state.random[threadnum])
// and to the output facility
public void evaluate(final EvolutionState state,
                     final Individual ind,
                     final int threadnum)
    {
    if (ind.evaluated) return;   // don't evaluate the individual if it's already evaluated

Individuals contain two main pieces of data: evaluated, which indicates that they've been evaluated
already, and fitness, which stores their fitness object. Continuing:
if (!(ind instanceof BitVectorIndividual))
state.output.fatal("Whoa! It's not a BitVectorIndividual!!!",null);
BitVectorIndividual ind2 = (BitVectorIndividual)ind;

First we check to see if ind is a BitVectorIndividual -- otherwise something has gone terribly wrong. If
something's wrong, we issue a fatal error through the state's Output facility. Messages (like fatal) all have
one or two additional arguments where you can specify a Parameter that caused the fatal error, because it's
very common to issue a fatal error on loading something from the ParameterDatabase and discovering it's
incorrectly specified. Since this fatal error doesn't have anything to do with any specific parameter we
know about, we pass in null. Continuing:
int sum=0;
for(int x=0; x<ind2.genome.length; x++)
sum += (ind2.genome[x] ? 1 : 0);

VectorIndividuals all have an array called genome. The type of this array (int, boolean, etc.) varies depending on the subclass. For BitVectorIndividual, genome is a boolean array. We're simply counting the number of trues in it. Continuing:
if (!(ind2.fitness instanceof SimpleFitness))
    state.output.fatal("Whoa! It's not a SimpleFitness!!!",null);
((SimpleFitness)ind2.fitness).setFitness(state,
    // ...the fitness...
    (float)(((double)sum)/ind2.genome.length),
    // ...is the individual ideal? Indicate here...
    sum == ind2.genome.length);
ind2.evaluated = true;
}

Note that Fitness itself doesn't actually contain any methods for setting the fitness, only for getting the
fitness. This is because different Fitness subtypes operate differently. In order to set a fitness, we must
assume that it's some particular Fitness, in this case, SimpleFitness. Just in case, we double-check first. [If
you're just hacking something up fast and you know that you're using a given kind of Individual and a
given kind of Fitness, the double-checking is probably unnecessary, but if you change your Individual or
Fitness in your parameters, your code may break in an icky way of course].
SimpleFitness defines a fitness-setting method where you provide the EvolutionState object, the fitness
value you want to set (for SimpleFitness, this is between 0 inclusive and infinity exclusive, 0 being worst
and infinity being better than the best), and a flag indicating whether or not this fitness is the ideal fitness.
We do exactly this.
Lastly, we mark the individual as evaluated, and we're done!
We finish out with an empty version of the describe method, since we don't have anything special to say
about individuals:
public void describe(final Individual ind,
                     final EvolutionState state,
                     final int threadnum, final int log,
                     final int verbosity)
    {
    }

Run the program


Close the MaxOnes.java file and compile it. If you're inside the tutorial1 directory, you can run it by
calling:
java ec.Evolve -file tutorial1.params

As of ECJ 8, the ideal individual will be discovered in Generation 102, at least on Java VMs which obey strict math (Microsoft's VM does not). The system dumps its statistics into the out.stat file as you requested. Look in the file and note the style of the statistics that SimpleStatistics uses. The last few lines of the file look like this:
Generation: 101
Best Individual:
Evaluated: true
Fitness: 0.99
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1
Generation: 102
Best Individual:
Evaluated: true
Fitness: 1.0

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1
Best Individual of Run:
Evaluated: true
Fitness: 1.0
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1

If you'd like information instead in a columnar format, and don't care about what the best individuals look
like, you might try using ec.simple.SimpleShortStatistics instead of SimpleStatistics. You can of course
modify your parameter file, but it might be easier to simply override a parameter on the command line:
java ec.Evolve -file tutorial1.params -p stat=ec.simple.SimpleShortStatistics

...the last few lines look like this:


100 0.9550000005960464 0.99 0.99
101 0.9568999993801117 0.99 0.99
102 0.9563999980688095 1.0 1.0

These columns are: the generation number, the mean fitness of the first subpopulation for that generation, the
best fitness of the first subpopulation for that generation, and the best fitness of the first subpopulation so far
in the run. You can turn on even more statistics gathering in most Statistics objects by setting
stat.gatherfull = true. More than one Statistics object can be defined for a run as well, though
that is outside the scope of this tutorial.
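For example, reusing the command-line override mechanism shown above, you can enable the extra statistics
gathering without editing the parameter file (a sketch combining the two overrides):
java ec.Evolve -file tutorial1.params -p stat=ec.simple.SimpleShortStatistics -p stat.gatherfull=true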
Remember, you can also change the random number generator seed as well:
java ec.Evolve -file tutorial1.params -p seed.0=4

IV.2. Origin
Origin is a Java-based software development platform for developing distributed evolutionary computation
and genetic programming applications. It's based on ECJ, a research evolutionary computation framework
developed by Dr. Sean Luke of George Mason University, and the Parabon Frontier Grid Platform. Like
ECJ, most of Origin's behavior is dynamically configured at runtime by parameter files, and most Origin
classes can be easily subclassed or replaced to extend Origin's operation.
Evolutionary computation is a problem-solving method inspired by biological evolutionary processes. A
random population of individuals - possible solutions to a problem - is generated and executed, and those
individuals that produce better results are favored to breed additional solutions, while poorly performing
individuals are eventually eliminated. Evolutionary computation is attractive for problems where the range
of possible solutions (the "solution space") is too large to test exhaustively, and where
the goal is to find a "pretty good" solution rather than the best possible one.
ORIGIN is licensed under the ORIGIN End User License Agreement.

1. The Parabon Frontier Grid Platform


The Parabon Frontier Grid Platform eliminates traditional computing limitations. Built for extreme
scalability, Frontier can employ a virtually unlimited number of computers, delivering supercomputation
without a supercomputer. It is also inexpensive: Frontier harnesses the unused capacity of computers
(desktops, servers, clusters, etc.), so the cost of computation is extremely low. It's a pay-per-use service, so
you pay only for the capacity you need, when you need it. Introduced in 2000 as the first
commercially available grid computing solution, Frontier gives access and control through a simple,
browser-based "dashboard"; guarantees task execution by shielding applications from unreliable
computing resources and networking; supports most common computing platforms and is easily adaptable
to others; and has powerful applications available today and many more in development, thanks to the
Frontier Software Development Kit (SDK) and a rich suite of other Frontier development tools.

Built from the ground up with the most advanced mobile code security capabilities, Frontier has safeguards
for both providers and users. It's the most secure grid computing platform on the market.

2. The Origin Distributed Grid Models


Grid computing (or the use of computational grids) is the application of several computers to a single
problem at the same time, usually a scientific or technical problem that requires a great number of
computer processing cycles or access to large amounts of data.
One of the main grid computing strategies is to use software to divide and apportion pieces of a program
among several computers, sometimes up to many thousands. Grid computing can also be thought of as
distributed, large-scale cluster computing, as well as a form of network-distributed parallel processing.
It can be small (confined to a network of computer workstations within a corporation, for example) or
it can be a large, public collaboration across many companies or networks.
The Origin distributed grid models use the Frontier Grid Platform to distribute the work of an
evolutionary run across the machines of a Frontier Grid. An evolutionary run is broken into units of work
called tasks and these tasks are sent to a Frontier Grid Server for scheduling. The server forwards tasks to
host computers running the Frontier Grid Engine for execution. When a task completes, its results are
returned to the Origin application on the local machine. A properly-written Origin application can run
without modification either conventionally on a local machine or as a distributed application on a Frontier
Grid.
The grid is best suited to evolutionary problems that have some combination of large populations
(thousands to millions of individuals) and computationally expensive evaluation functions.
Because of the latency inherent in distributing tasks over the grid, an evolutionary run with a small
population may run faster locally. Since changing between conventional and grid models usually requires
only modifying a few parameter settings, you can take advantage of this to test your evolutionary model by
running Origin locally, then scale up to a much larger population running on the Frontier grid.
Origin supports two grid models: master/slave, where individual fitness evaluations are distributed over the
grid, and Opportunistic Evolution (OE), which combines remote generational evolution with an
asynchronous steady-state local model. The master/slave model can be used with almost any generational
evolutionary model and is the simplest grid model to work with - a properly written generational model will
produce identical results in master/slave mode as when running locally. The OE model makes optimal use
of the grid and is well suited for very large populations or long evolutionary runs.

The Master/Slave Model


In the Origin Master/Slave model, individual fitness evaluations are distributed over the grid, while
selection and breeding are performed locally. To increase performance, multiple individuals are sent to each
Frontier task, but the individuals' fitness values are computed independently. The computed fitness values
are returned to the Origin application and selection and breeding proceed as usual, followed by launching
another set of Frontier tasks to compute the next generation's fitness.
Most Origin generational evolutionary programs can be changed to a master/slave model with the following
parameter file:
parent.0 = original-parameter-file
parent.1 = ${origin.parameter.dir}/com/parabon/ec/eval/master.origin.params
This loads the original program parameters and then the Origin master/slave model parameters. Any
parameter setting specific to the master/slave model should also go in this file. You can then switch
between master/slave and local mode by specifying the new parameter file instead of
original-parameter-file on the Origin command line.
Because of the fixed overhead of distributing tasks over the Frontier grid and contention for compute
resources with other grid users, each task instance should be sent enough individuals to require at least
several minutes to evaluate. On the other hand, reducing the number of individuals sent to each task, and
thus increasing the number of tasks used to evaluate each generation, will distribute evaluations over a
greater number of grid machines. The optimum balance between the amount of work per task and the
number of tasks is usually determined empirically, based on earlier evolutionary runs for the problem.
Origin provides two parameters to control the number of individuals sent to each task: job-size (or
eval.masterproblem.max-jobs-per-slave) and eval.masterproblem.max-data-per-slave. job-size sets the
maximum number of individuals ("jobs" in ECJ terminology) per task. This should be based on the average
time required to compute an individual's fitness: reduce this value as fitness evaluation times increase.
eval.masterproblem.max-data-per-slave is best suited to GP problems where individuals grow in size over
generations. As individuals get larger and take longer to evaluate, Origin will reduce the number of
individuals sent to each task. If grid tasks are failing due to "out of memory" exceptions, decrease this value.
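Putting these pieces together, a complete master/slave wrapper file might look like the sketch below;
my-problem.params is a placeholder for your own generational program's parameter file and the job-size
value is purely illustrative:
parent.0 = my-problem.params
parent.1 = ${origin.parameter.dir}/com/parabon/ec/eval/master.origin.params
# illustrative cap on the number of individuals ("jobs") sent to each Frontier task
eval.masterproblem.max-jobs-per-slave = 500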

The Opportunistic Evolution Model


Opportunistic Evolution (OE) combines remote generational evolution with a local steady-state
evolutionary model. OE distributes groups of individuals to remote tasks, where these subpopulations are
evolved for a specified number of generations. The remote tasks return their final individuals to the local
Origin application, which merges them into its population, then selects and breeds a new subset of
individuals to be evolved by a remote task.
Unlike the Master/Slave model, where all remote tasks must complete before the next generation is
evaluated, OE is asynchronous - as each task returns, the OE application merges that task's individuals into
the population, breeds new individuals, and launches another task.
To change an existing steady-state evolutionary model to an Opportunistic Evolution model use a
parameter file like this:
parent.0 = original-parameter-file
parent.1 = ${origin.parameter.dir}/ec/steadystate/steadystate.origin.params
generations = local-evaluations
slave.generations = remote-generations
slave.generations is the number of generations each remote task should evolve, while generations is the
number of times the population size must be returned from remote tasks: for example, if generations is 10
and the population size is 50,000, the Origin application completes after 500,000 individuals have been
returned from remote evolution.
OE also supports the same task-sizing parameters as the Master/Slave model, together with the standard
steady-state evolutionary model parameters.
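For concreteness, an OE wrapper file is just the template above with values filled in; the numbers here are
purely illustrative and original-parameter-file again stands for your own steady-state program's
parameter file:
parent.0 = original-parameter-file
parent.1 = ${origin.parameter.dir}/ec/steadystate/steadystate.origin.params
# the population size must be returned 10 times before the run completes
generations = 10
# each remote task evolves its group of individuals for 25 generations
slave.generations = 25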

V. Experiments
The results of some experiments with our algorithm are shown in the tables below. All experiments used a
population of 100 chromosomes, evolved for 200 generations in each of at most 5 runs; 10 tests were made
for each situation, and times are given in seconds.
We compared the success rates of unidirectional and bidirectional search.
For unidirectional search:

Initial random moves   Success rate   Length of solutions
4                      81%            2, 4, 6, 8, 9
5                      50%            4, 5, 7, 8
6                      30%            7, 10

The variation of the mutation and crossover rates during a run is shown in the figure below:
[Figure: mutation and crossover rates (vertical axis 0-120) plotted over the course of a run (horizontal axis 1-91)]

For bidirectional search:

Initial random moves   Success rate   Mean success time (s)   Mean failure time (s)
4                      60%            5                       202
5                      40%            4                       232
10                     0%             -                       225
