Jim Kukunas
Allegheny College
2010
Copyright © 2010 Jim Kukunas
All rights reserved
JIM KUKUNAS. A Genetic Algorithm to Improve Kernel Performance
on Resource-Constrained Devices.
(Under the direction of Dr. G. M. Kapfhammer.)
Abstract
Acknowledgments
I would like to thank my family, friends, and advisors, Dr. Gregory Kapfhammer and
Dr. Robert Cupper, for all their help and support throughout this long process.
Contents
Acknowledgments
1 Overview
1.1 Diamondville
1.2 Linux Kernel
1.3 Intel C Compiler
1.4 Genetic Algorithms
1.5 Thesis Statement
1.6 Thesis Outline
2 Prior Work
2.1 ACOVEA
2.2 MILEPOST GCC
2.3 Cooper et al.
2.4 COLE
2.5 Davidson et al.
4 Results
4.1 Configurations
4.2 Resulting Kernels
4.3 Produced Kernels
4.4 Application Benchmarks
4.4.1 Tests with Improved Performance
4.4.2 Tests with Decreased Performance
4.4.3 Tests with Static Performance
4.5 Results Summary
4.6 Threats To Validity
A Code Listings
A.1 Profiler Code
A.2 GA Code
A.2.1 main.h
A.2.2 main.c
A.2.3 Makefile
A.3 Builder Code
A.3.1 main.c
A.3.2 Makefile
A.4 Client Code
A.4.1 main.c
A.4.2 check.sh
A.4.3 Makefile
A.5 Compiler Options
Bibliography
List of Tables
1.1 Compiler Flags Set at Optimization Levels for the Linux Intel C Compiler
List of Figures
Chapter 1
Overview
#include <stdlib.h>
int main(int argc, char** argv)
{
    int* test = malloc(sizeof(int)*100);
    for(int i = 0; i < 100; i++) {
        *(test+i) = 15;
    }
    free(test);
}
Source code Listing 1 iterates through every element in the array test and sets it
to 15. Without any optimization, this loop will perform 100 loop iterations, which
means that it will not only need to execute 100 instructions to store the values into
the array, but it will also need to check the loop constraint 101 times and increment
the loop counter variable 100 times. This can be seen in the assembly generated by
the Intel C++ Compiler in source code listing 2, with optimizations disabled.
Listing 2 Compiler Generated Assembler Without Optimization.
Source code listing 3 demonstrates loop unrolling performed on the previous example.
While the number of instructions that store values into the array remains the same,
the number of loop iterations is reduced from 100 to 20, which means that each add
instruction replaces 5 inc instructions.
#include <stdlib.h>
int main(int argc, char** argv)
{
    int* test = malloc(sizeof(int)*100);
    for(int i = 0; i < 100; i = i + 5) {
        *(test+i) = 15;
        *(test+i+1) = 15;
        *(test+i+2) = 15;
        *(test+i+3) = 15;
        *(test+i+4) = 15;
    }
    free(test);
}
However, when we look at the assembly generated by this code in source code
listing 4, we notice that the resulting binary size has increased.
In typical userspace applications, the increased image size is usually outweighed
by the decreased execution time; however, the performance of the Linux kernel is very
closely tied to its size in memory. A smaller kernel is a faster kernel, as a smaller
kernel image uses less memory, and thus more memory is available for user applications.
This optimization might therefore actually hurt kernel performance, rather than help it.
Listing 4 Loop Unrolled Version of Program 1
Often, developers do not perform these optimizations by hand, but rather allow the
compiler to perform them. For the Intel C++ Compiler, loop unrolling is activated
with the command-line flag -funroll-all-loops. This optimization is automatically
activated by optimization level -O1 on x64 platforms and -O2 on x86 platforms. The
Intel Compiler also provides the flag -unroll-aggressive, which allows the compiler
to completely unroll certain loops, namely those with small trip counts, as well as
-unroll[=n], which allows the user to specify the maximum number of times a loop
can be unrolled.
In the previous assembly listings, all optimizations other than loop unrolling were
disabled; however, loop unrolling used in conjunction with other optimizations, such
as parallelization, can provide even stronger benefits. In source code listing 5, the
Intel Compiler exploited the implicit parallelism of the loop with the use of single
instruction, multiple data (SIMD) instructions. SIMD instructions perform a single
operation in parallel over multiple chunks of data.
This example demonstrates how optimizations exist in a delicate balance, which
must be taken into consideration when determining which optimizations to employ.
Even if an optimization provides a significant performance increase on its own, its
benefit can be nullified or even reversed when it is used in conjunction with other
“optimizations” which hinder it. To remedy this condition, vendors build predetermined
optimization levels into their compilers. Each level activates a set of optimizations
with one flag and targets a specific purpose. Optimization level s optimizes the code
for decreased image size. Optimization level 0 disables all compiler optimizations.
Optimization levels 1 − 3 provide different levels of compiler optimizations, 1 being
light optimizations, and 3 being aggressive optimizations. If no other optimization
flags are selected, 2 is the default for the Intel C++ Compiler. Table 1.1 provides a
detailed description of each level.
These blanket optimizations perform well in the general case, but when dealing
with highly specialized architectures, such as those found in mobile devices, more
consideration needs to be placed on specialized optimizations.
1.1 Diamondville
This work focuses on the Intel Diamondville platform, Intel’s first generation
micro-architecture designed for netbooks and mobile Internet devices.
The Diamondville platform includes the Intel Atom n270 processor, as well as the
945GME chipset and 82801DBM I/O controller [12]. The n270 is a processor fabricated
on a 45nm process, containing on-die a 32kB instruction cache, as well as a 24kB
write-back data cache. It operates at a steppable frequency of 1.60 GHz and a
front-side bus speed of 533MHz.
Listing 5 Intel Optimized Version of Source Code 1
Table 1.1: Compiler Flags Set at Optimization Levels for the Linux Intel C Compiler
Optimization Level    Optimization Flags
O0                    Disables All Optimizations
O1                    Global Optimizations
O2                    All optimizations from O1 and . . .
                      Constant Propagation
                      Copy Propagation
                      Dead-Code Elimination
                      Global Register Allocation
                      Loop Unrolling
                      Optimized Code Selection
                      Partial Redundancy Elimination
                      Strength Reduction/Induction Variable Simplification
                      Variable Renaming
                      Exception Handling Optimizations
                      Tail Recursions
                      Peephole Optimizations
                      Structure Assignment Lowering and Optimizations
                      Dead Store Eliminations
O3                    All optimizations from O2 and . . .
                      Prefetching
                      Scalar Replacement
                      Loop and Memory Access Transformations
                      Branch Elimination
                      Cache Padding
sor footprint to eliminate features that consume significant power, often hindering
performance.
The Intel Diamondville micro-architecture includes power-saving features such as
dynamic cache sizing and dynamic bus parking [12]. Dynamic cache sizing, in low
power states, flushes and disables chunks of L2 cache to conserve power. Dynamic
bus parking powers down the chipset while the processor is in low-frequency mode.
While these features have a minor impact on performance, this potential impact is
not nearly as great as that of excluding an out-of-order instruction scheduler.
In-Order Instruction Scheduler.
Unlike other typical Intel processors, the n270 utilizes an in-order instruction
scheduler for decreased power consumption and reduced heat generation. As opposed
to an out-of-order instruction scheduler, which reorders instructions at the processor
level, an in-order instruction scheduler executes instructions in the same order as
they were input [25]. Due to this static instruction placement, instructions that are
placed in less-than-optimal positions can cause bubbles to form in the pipeline and
thus reduce throughput. These bubbles can be avoided by reordering the instructions
at compile time. However, the compiler must be aware of the type of instruction
scheduler at the target platform to know how to accurately model the instruction
pipeline.
Consider the following example from Intel’s Atom optimization guide [20]. The
code in Listing 6, without careful optimization, can lead to a memory access
dependency stall. The assembly generated from Listing 6, shown in Listing 7,
demonstrates these dependency stalls. While the first movl instruction is executing,
the next instruction, imull, cannot execute, because it depends on the value of b
being loaded into register eax. The same stall occurs for the next move and multiply
for d. Reordering the instructions to match Listing 8 removes the dependency stalls,
as both sets of movl and imull instructions can be executed in parallel.
a = b * 7;
c = d * 7;
movl b, %eax
imull $7, %eax
movl %eax, a
movl d, %edx
imull $7, %edx
movl %edx, c
Listing 8 Optimized Assembler of Listing 6 to remove dependency stall
movl b, %eax
movl d, %edx
imull $7, %eax
imull $7, %edx
movl %eax, a
movl %edx, c
Movbe Instruction.
Due to the role of netbooks and mobile Internet devices, which often have to
interact with many different peripherals, efficient conversion between big-endian and
little-endian, and vice versa, is important to compensate for the lack of high clock
speeds. One common situation which often requires endian conversions is networking,
where a machine may be required to communicate with a machine using a different
endianness. During high processor usage, this conversion can cause network transfers
to suffer. To remedy this, Intel added the movbe instruction, which performs a move
and byte swap, thus allowing for single-instruction conversions. This instruction can
also be used to increase the performance of certain arithmetic operations. Since the
Intel Atom processor is the only processor line in the x86 family that supports this
instruction, the Intel C compiler is the only compiler that will generate code utilizing
this instruction [12].
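The work that a combined move and byte swap replaces can be pictured in portable C.
The following is only a sketch: the helper names bswap32 and load_be32 are
hypothetical and do not come from the thesis code, and whether a given compiler
lowers either function to a single movbe instruction is not guaranteed.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value (hypothetical helper). */
uint32_t bswap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

/* Load a big-endian 32-bit value from a byte buffer, independent of the
 * endianness of the host (hypothetical helper). */
uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

Without a fused instruction, each such conversion costs a load plus several shift
and mask operations, or a load plus a separate byte swap instruction.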
[Figure 1.1: layered diagram, from top to bottom: User Space, System Calls, Kernel Space, Hardware]
to the system, than an exploitation in kernel space. However, since user space ap-
plications often require access to the functionality controlled in kernel space, such as
memory management and I/O, they must interface with the kernel, which can then
perform the requested action. As shown in Figure 1.1, this interface is performed
through system calls [28].
By optimizing the kernel, we optimize the memory management, I/O and process
management of every process on the system.
1.4 Genetic Algorithms
Inspired by natural selection, genetic algorithms (GA) are an adaptive heuristic search
technique used for evolving solutions. The GA begins with the generation of a random
population, the individuals of which are encodings of solutions to the problem.
In each generation, the population undergoes the mutation, reproduction, selection,
and fitness operators.
Fitness Operator.
The fitness operator evaluates each individual based on the quality of the solution
it represents. A fitness is then assigned to each individual which allows the individuals
to be compared. This allows the GA to encourage and nurture good solutions while
eliminating bad solutions.
Selection Operator.
After each individual is evaluated with the fitness operator, the selection operator
chooses which individuals will reproduce for the next generation of the GA. Typically,
selection operators foster elitism within the population to encourage better solutions.
There are multiple kinds of selection operators, including roulette-wheel selection,
tournament selection, and truncation selection [19].
The most straightforward type of selection operator is truncation. The truncation
selection operator orders the population by fitness and then chooses a percentage of
the most fit individuals to reproduce [19].
The tournament selection operator runs multiple tournaments, where multiple
random individuals are compared. Only the individual with the highest fitness from
each tournament, the tournament winner, is chosen for reproduction [19].
The roulette-wheel selection operator, also known as the fitness proportionate
selection operator, assigns a percentage to each individual, based on the percentage
of the individual’s fitness compared to the total fitness available in the population.
Then, based on these percentages, individuals are chosen at random for reproduction.
Unlike the truncation and tournament selection operators, which attempt to keep as
many high-fitness individuals as possible, the roulette-wheel selection operator runs
the risk of selecting low-fitness individuals over high-fitness individuals, depending on
the random values chosen [19].
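The difference between roulette-wheel and tournament selection can be sketched in a
few lines of C. This is an illustration only, under the assumptions that fitness
values are non-negative, that a higher fitness is better, and that the caller supplies
the random choices; the function names are hypothetical and are not taken from the
thesis code.

```c
#include <stddef.h>

/* Roulette-wheel selection: each individual owns a slice of [0, total)
 * proportional to its fitness. r is a uniform random number in [0, 1)
 * supplied by the caller. Assumes non-negative fitness values. */
size_t roulette_select(const double *fitness, size_t n, double r)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += fitness[i];

    double target = r * total;
    double running = 0.0;
    for (size_t i = 0; i < n; i++) {
        running += fitness[i];
        if (target < running)
            return i;
    }
    return n - 1; /* guards against floating-point rounding */
}

/* Tournament selection: among k randomly drawn contestants (indices
 * supplied by the caller), the one with the highest fitness wins. */
size_t tournament_select(const double *fitness,
                         const size_t *contestants, size_t k)
{
    size_t best = contestants[0];
    for (size_t i = 1; i < k; i++)
        if (fitness[contestants[i]] > fitness[best])
            best = contestants[i];
    return best;
}
```

With fitness values 1, 3, and 6, the roulette wheel still selects the weakest
individual for one tenth of the possible values of r, whereas a tournament that
includes either of the other two individuals never does.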
Mutation Operator.
The mutation operator randomly mutates some of the individuals, so as to add
genetic diversity within the population. To mutate an individual, the mutation
operator makes a subtle change to one or more parts of the individual at random. For
example, for an individual representing a set of compiler flags, a mutation might
involve setting a flag that was previously on to off, or vice versa.
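For an individual encoded as a packed bit string, such a mutation amounts to a single
exclusive-or. The sketch below assumes one bit per compiler flag; the name mutate_flip
is hypothetical, and the choice of which bit to flip is left to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Toggle the flag stored at bit_index in a packed bit string:
 * an active flag becomes inactive and vice versa. */
void mutate_flip(uint8_t *genome, size_t bit_index)
{
    genome[bit_index / 8] ^= (uint8_t)(1u << (bit_index % 8));
}
```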
Reproduction Operator.
The reproduction operator combines the encodings of two different individuals to
produce a new individual with some characteristics of each of the original parent
individuals. The overall goal of the reproduction operator is to combine the good
characteristics of each parent to produce a better child.
Figure 1.2: Flowchart For a Generic Genetic Algorithm
These operators are performed at each generation, with each generation producing
a local optimum [23]. Figure 1.2 demonstrates the typical flow of a GA.
Termination. There is no general way of knowing when a GA should terminate.
Typically, too few generations will not give the GA enough time to “evolve” a good
solution. Too many generations will waste time if the best solution found by the GA
was found early on.
Because genetic algorithms are a heuristic search technique, there is no guarantee
that the best solution found after N generations is indeed the optimal solution;
however, GAs are typically good at finding local optima, and solutions better than
the solution available at the start of the GA.
1.5 Thesis Statement
On the Intel Diamondville platform, a Linux kernel built with compiler flags
evolved from a genetic algorithm will improve user space application performance
for some applications when compared to a Linux kernel built with the standard
build flags, because the genetic algorithm will adapt to how different optimizations
interact amongst themselves.
1.6 Thesis Outline
The next chapter discusses the prior work that motivated this work. The third chapter
describes the implementation and design details of constructing the system. The
fourth chapter shows the results achieved, and finally the fifth chapter provides ideas
for future work.
Chapter 2
Prior Work
“Some men give up their designs when they have almost reached the goal; while
others, on the contrary, obtain a victory by exerting, at the last moment, more
vigorous efforts than ever before”
– Herodotus of Halicarnassus
2.1 ACOVEA
The Analysis of Compiler Options via Evolutionary Algorithm system (ACOVEA) is
a C++ framework that implements a genetic algorithm to “find the ‘best’ options for
compiling programs with the GNU Compiler Collection (GCC) C and C++ compilers”
[16]. Currently, versions of ACOVEA supporting the SPARC platform, as well as the
Intel C++ compiler, are in development.
Population Representation and Initial Population.
ACOVEA initially represented individuals as a binary string, with each compiler
flag represented as a bit in a long long data primitive, typically 8 bytes on a 32 bit
system, thus limiting the number of potential considered options to 64. This was then
changed to reference extensible markup language (XML) descriptions of the compiler
and its options, to reduce the complexity of dealing with compiler flags which offer
multiple states [16]. However, the added overhead of parsing XML motivated a shift
from XML to an object hierarchy for the final representation of the population.
The initial population is created at random; however, the “blanket” optimizations
described in Table 1.1 are added so that the random individuals must compete against
the typically chosen optimizations.
Following the biological model of African lions, ACOVEA attempts to run multi-
ple populations simultaneously. These populations, or prides, link together through
migration, where individuals relocate from one population to another. Over time, the
populations each approach unique genetic uniformity, with each population focusing
on locally optimal results. Then, as individuals migrate, these local optima are
spread and combined among the populations to improve these results [16].
Fitness Evaluation.
ACOVEA evaluates fitness by compiling the given application, which the user
wants optimized, using each individual’s corresponding compiler options. The result-
ing executable is run and then assigned a fitness score based on the resulting execution
time. Individuals that cause compiler or program failure receive a fitness score that
reduces the likelihood that such individuals are chosen for reproduction.
Reproduction and Mutation.
After each individual is assigned a fitness, a subset of the individuals is selected
for reproduction and mutation. In this system, reproduction uses two-point crossover,
in which two indexes are chosen within the parents, and then everything in between
those indexes is swapped.
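Two-point crossover is straightforward to express on a bit-string encoding. The sketch
below operates on a single 64-bit word per parent, which is an assumption made for
brevity and not a description of ACOVEA's internals; the names low_mask and
two_point_crossover are hypothetical.

```c
#include <stdint.h>

/* Mask with the lowest n bits set; n may be anywhere in 0..64. */
uint64_t low_mask(unsigned n)
{
    return n >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << n) - 1;
}

/* Swap the bits in positions [lo, hi) between the two parents,
 * turning them into the two children. Requires lo <= hi <= 64. */
void two_point_crossover(uint64_t *a, uint64_t *b, unsigned lo, unsigned hi)
{
    uint64_t mask = low_mask(hi) & ~low_mask(lo);
    uint64_t diff = (*a ^ *b) & mask; /* bits that differ inside the window */
    *a ^= diff;
    *b ^= diff;
}
```

Everything outside the chosen window is inherited unchanged, so each child keeps the
head and tail of one parent and the middle of the other.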
Mutation, on the other hand, is performed either by switching a compiler option
from off to on, or vice versa, or by changing the state of a compiler option
which can exist in multiple states. An example of the former would be changing
-fno-unroll-loops to -funroll-loops. An example of the latter would be
changing -fp-model=strict to -fp-model=fast. The first example enables
loop unrolling, which was previously disabled. The second example changes the
floating point model from strict to fast.
Results.
ACOVEA has seen significant success, in some cases improving run time performance
by up to 57% [16]. While this system seems like a good solution to the problem
mentioned earlier, ACOVEA focuses more on small algorithmic benchmarks, rather
than on full systems. Therefore, it would not be an ideal system to use for optimizing
the Linux kernel; however, it does reinforce the successful application of genetic
algorithms in optimization-based scenarios.
the results of a set of optimization strategies without actually compiling and running
the application, based on the program features in the code to be optimized. During
this phase, the actual application the user wishes to optimize is input to the compiler
and the optimization strategies determined effective on the training applications are
used [7].
Results.
Tested on multiple platforms, and on multiple popular open source projects such
as BerkeleyDB and Mozilla, MILEPOST GCC has seen significant success in practical
use. For example, on the Intel Xeon processor, MILEPOST GCC achieved a 140%
increase in performance, without a significant increase in code size [5].
The genetic algorithm is run for 1000 generations. Over these 1000 generations,
up to 40% smaller code resulted, by employing the sequence evolved by the genetic
algorithm [4].
2.4 COLE
The Compiler Optimization Level Exploration (COLE) system claims to be the first
multi-objective evolutionary algorithm, inspired by a genetic algorithm, designed to
find Pareto optimal optimization levels. While most systems optimize for either
reduced execution time or reduced space, COLE allows for multi-objective fitness.
This allows for the system to optimize for both reduced execution time and code
size. Due to these multiple fitness constraints, COLE evaluates fitness based on
Pareto optimality. A solution is Pareto optimal when no other solution performs
better for at least one objective while performing at least as well for all of the
other objectives. The COLE system outperformed both GCC’s standard optimization
levels and random search techniques when applied to the SPEC CPU2000 benchmarks [8].
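The comparison underlying Pareto optimality is a dominance test between two
candidates. The sketch below assumes every objective is to be minimized, as with
execution time and code size; the function name dominates is hypothetical and is not
taken from COLE.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if candidate a dominates candidate b: a is at least as
 * good on every objective and strictly better on at least one.
 * Lower values are assumed to be better for all objectives. */
bool dominates(const double *a, const double *b, size_t n_objectives)
{
    bool strictly_better = false;
    for (size_t i = 0; i < n_objectives; i++) {
        if (a[i] > b[i])
            return false;
        if (a[i] < b[i])
            strictly_better = true;
    }
    return strictly_better;
}
```

A candidate is Pareto optimal within a population when no other member dominates it.
Two candidates that trade execution time against code size do not dominate one
another, so both can remain on the Pareto front.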
Chapter 3
The system described consists of 4 main segments. The first segment is the ap-
plication profiler, which determines how fitness will be evaluated within the genetic
algorithm. The second segment is the genetic algorithm, which drives the entire sys-
tem. The third segment is the build farm, which allows for distribution of the kernel
building workload. Finally, the fourth segment is the testing farm, which provides
distribution of the kernel evaluation workload.
Algorithm 1
Procedure Profile(T)
(∗ Keeps Track of System Call Usage ∗)
Input: Set of Applications to be Profiled T
Output: Total Count of Each System Call Occurrence
1. for i ← 0 to Total Number of System Calls
2.     Counts_i ← 0
3. for i ← 0 to |T|
4.     do Create New Process
5.     if New Process
6.         Ptrace System Call with PTRACE_TRACEME
7.         Execute T_i
8.     else
9.         while true
10.            do Wait for Child
11.            if Child Exited
12.                stop
13.            else
14.                Obtain Child Registers
15.                tmp ← EAX
16.                Counts_tmp ← Counts_tmp + 1
17.                Allow Child to Continue
18. return Counts
to the performance of binary strings. While serialization typically is not an expen-
sive operation, deserialization can be very expensive. At the same time, care must
be taken to ensure that the XML files stored on the disk, and the objects stored in
memory are kept in sync to prevent data loss.
The final representation considered is an object-oriented hierarchy. While an
object hierarchy allows for data structure abstraction and increased extensibility, it
can also incur significant overhead compared to the binary string representation.
Expensive inheritance features, such as virtual functions, which allow for dynamic
dispatch but also force virtual function addresses to be referenced in a vtable, slow
down operations that would normally take only a few instructions for a binary string.
Both the XML and object-oriented representations could implement the binary
string representation internally, however they both add additional overhead in provid-
ing more functionality. The representation chosen in this work was that of the binary
string, incorporating some of the benefits provided by the other representations, while
attempting to significantly reduce the overhead which accompanied those features.
Achieving data persistence without the performance penalties of XML relies on
the POSIX mmap system call. The mmap system call allows for a file to be mapped
directly into memory. Replacing the calls to malloc with mmap when initializing the
population allows the system to allocate memory which is backed by a file. Thus,
as the population is created and evolved, the file is updated periodically, or upon
request, with the msync system call, without the parser that XML requires and
without any expensive serialization and deserialization. While implementing the
ability to pause and continue the genetic algorithm during execution exceeds the
focus of this work, more information can be found in Section 5.3.1.
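A file-backed allocation of this kind can be sketched with a handful of POSIX calls.
The code below is a minimal illustration and not the thesis implementation: the
function names population_map, population_sync, and population_unmap are hypothetical,
and error handling is reduced to returning NULL or a non-zero value.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Map `size` bytes of the file at `path` into memory, creating and sizing
 * the file if necessary. Returns NULL on failure. */
void *population_map(const char *path, size_t size)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return NULL;
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return NULL;
    }
    /* MAP_SHARED makes stores to the mapping reach the underlying file. */
    void *mem = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    return mem == MAP_FAILED ? NULL : mem;
}

/* Force the current contents of the mapping out to the file. */
int population_sync(void *mem, size_t size)
{
    return msync(mem, size, MS_SYNC);
}

/* Release the mapping. */
int population_unmap(void *mem, size_t size)
{
    return munmap(mem, size);
}
```

Mapping the same file again in a later run yields the population exactly as it was
last synchronized, which is what makes persistence possible without a separate
serialization format.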
To provide support for compiler options that can exist in multiple states, an
overflow buffer can be allocated to each individual. This overflow buffer, whose size is
chosen based on the number of options that can exist in multiple states, contains the
possible states for each compiler flag. For example, a flag which can exist in multiple
states such as -O, which can exist in states 0, 1, 2, 3, and s, would be stored in an
overflow buffer of size 5. The first element of the overflow buffer would be -O0, the
second would be -O1, and so on.
Consider the following example, representing the compiler flags and the population
shown in Figure 3.2. Individual 0, which has a value of 0x03, is a composite of
both 0x01 and 0x02, thus this individual would represent both -axSSE3 and -vec
optimizations active. The overflow buffer for individual 0 is empty, as both of those
active flags are binary in nature. On the other hand, individual 1 has a value of
0x04, thus representing only the -fp-model compiler flag. This flag can exist in
multiple states, either -fp-model=strict or -fp-model=fast, and thus further
clarification is required for this individual. For this clarification, the overflow buffer is
referenced, which contains the specific state of the flag. For individual 1, the overflow
buffer contains -fp-model=fast, and thus this state of the optimization is used.
Individual 2, on the other hand, whose binary string is equivalent to that of
individual 1, represents -fp-model=strict.
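The decoding described in this example can be sketched as follows. The option table
and its bit order (0x01 for -axSSE3, 0x02 for -vec, 0x04 for -fp-model) are assumptions
made to mirror the example, and the overflow buffer is simplified to one slot per
option; the function name build_flags is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define OPTION_COUNT 3

/* Assumed option table: NULL marks a flag that can exist in multiple
 * states, whose text must be taken from the overflow buffer. */
static const char *const binary_flags[OPTION_COUNT] = {
    "-axSSE3", "-vec", NULL
};

/* Build the compiler flag string for one individual.
 * bits:     packed on/off bits, one per option.
 * overflow: chosen state for each multi-state option (NULL if unused).
 * Returns 0 on success, -1 if a state is missing or out is too small. */
int build_flags(unsigned char bits, const char *const overflow[],
                char *out, size_t out_size)
{
    size_t used = 0;
    if (out_size == 0)
        return -1;
    out[0] = '\0';
    for (size_t i = 0; i < OPTION_COUNT; i++) {
        if (!(bits & (1u << i)))
            continue; /* option inactive for this individual */
        const char *flag = binary_flags[i] ? binary_flags[i] : overflow[i];
        if (flag == NULL)
            return -1;
        int n = snprintf(out + used, out_size - used, "%s%s",
                         used ? " " : "", flag);
        if (n < 0 || (size_t)n >= out_size - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

Under these assumptions, individuals 1 and 2 share the bit string 0x04 and differ only
in the contents of their overflow buffers.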
Algorithm 2
Procedure parse_options(F)
(∗ Parses Option File into Memory for Use in Population ∗)
Output: Compiler Options to Create Population
Input: File Path F
1. Open F for Reading
2. i ← 0
3. while Not End of File F
4.     do line ← Read from F
5.     if line = “EITHER”
6.         then flag_i ← NULL
7.         n ← Read from F
8.         counts_i ← n
9.         for j ← 0 to n
10.            tmp ← Read from F
11.            Add tmp to Overflow Buffer
12.    else
13.        flag_i ← line
14.        counts_i ← 1
15.    i ← i + 1
subtractive random number generator. While analysing the effect of these different
algorithms is beyond the scope of this work, see Section 5.3.3 for more information.
Algorithm 3 demonstrates how each member of the population is initialized. Take
note that on Line 3, the initial fitness is set to −1.
Algorithm 3
Procedure init_population(T, X)
(∗ Creates an Initial Population ∗)
Output: Initialized Population Ready for Evolution
Input: Population Size T, Option Count X
1. for i ← 0 to T − 1
2.     do
3.     fitness_i ← −1
4.     flags_i ← Read X/8 + 1 bytes from /dev/urandom
5. return 0
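A C rendering of Algorithm 3 might look like the sketch below. It is an illustration
rather than the listing from the appendix: the struct layout is assumed, and malloc is
used for brevity where the system described here would use file-backed memory.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumed layout of one individual. */
struct individual {
    double fitness;        /* -1 means "not yet evaluated" */
    unsigned char *flags;  /* packed option bits */
};

/* Initialize pop_size individuals with random flag bits.
 * Returns 0 on success and -1 on failure. */
int init_population(struct individual *pop, size_t pop_size,
                    size_t option_count)
{
    size_t nbytes = option_count / 8 + 1; /* as in line 4 of Algorithm 3 */
    FILE *rng = fopen("/dev/urandom", "rb");
    if (rng == NULL)
        return -1;

    for (size_t i = 0; i < pop_size; i++) {
        pop[i].fitness = -1.0;
        pop[i].flags = malloc(nbytes);
        if (pop[i].flags == NULL ||
            fread(pop[i].flags, 1, nbytes, rng) != nbytes) {
            /* Undo the allocations made so far; free(NULL) is harmless. */
            for (size_t j = 0; j <= i; j++)
                free(pop[j].flags);
            fclose(rng);
            return -1;
        }
    }
    fclose(rng);
    return 0;
}
```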
3.2.4 Fitness
Fitness evaluation is the most expensive operator in the genetic algorithm. As seen
in Subsection 3.2.3, each individual was initially assigned a fitness of −1. If any of
the individuals are modified, either by mutation or reproduction, their fitness is reset
back to −1, thus indicating their fitness needs to be reevaluated. This obviates dupli-
cate fitness evaluations between generations. Further evaluations can be avoided by
employing a global lookup table, containing the fitnesses of previously tested kernels.
This lookup table, especially if persisted in between runs of the genetic algorithm,
could greatly reduce the fitness operator’s time overhead. For more information, see
Section 5.3.4.
The first step of determining the fitness of each individual in the population is
to construct its corresponding compiler flags, so that its optimization set can be
used to build and test a kernel. To accomplish this, the individual’s bits are iterated
to determine which optimizations are active. If a bit is found to be active, the
corresponding compiler flag is concatenated onto the individual’s compiler flags.
Rather than searching for which optimization corresponds to each bit, the compiler
flags are stored in such a way that the required bit shift for the current byte plus
the total iteration counter is equivalent to the optimization’s index in the storage
array. Thus, after a bit is found to be active, looking up the corresponding string
is O(1).
Once the flags are built, the workload of building the kernels is divided between
the machines within the build-farm. First the genetic algorithm chooses the next
individual for fitness and then begins to build a kernel with the corresponding flags.
For n machines in the build farm, the next n individuals’ corresponding flags are sent
to those machines, using the transmission control protocol (TCP). TCP was chosen
as it provides reliable data transfers, with safeguards against data-loss and data-
corruption. By capitalizing on the inherently embarrassingly parallel task of building
separate kernels, the overall duration of the genetic algorithm can be drastically
reduced.
To actually build the kernels, a child process is created for each machine in
the build farm. To create these processes, the POSIX standard fork, execl and
waitpid system calls are used. The fork system call creates a new child process
which inherits an exact copy of the memory image of the parent process. The execl
system call then replaces that memory image with a different executable, which runs
in the specified environment. Finally, the waitpid system call allows a process
to block until another process finishes, thus allowing process synchronization. While
fork technically requires that the entire address space of the parent be copied for the
child, the copy-on-write optimization used within the Linux kernel allows the child to
share the parent’s address space until the child attempts to write to it, at which
point the affected pages are copied into a private address space for the child to
modify. Since execl is called immediately after the child is forked when child
processes are created to execute kernel build tasks, no changes are made to the shared
address space, and thus no address space copy occurs, leaving only the penalty of
duplicating the parent’s page tables for the child. Before this optimization was
implemented in the kernel, vfork was used to achieve similar results. When initially
forking the children to build the kernels and interact with the build farm, each child
does write to the shared address space, and thus incurs the copy penalty; however, the
parent’s address space can be pruned with the madvise system call before the
invocation of fork.
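The fork, execl, and waitpid pattern can be condensed into one helper. The sketch
below is illustrative only: the name run_step is hypothetical, and routing the command
through /bin/sh is an assumption made to keep the example short.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `command` in a child process and block until it finishes.
 * Returns the exit status of the command, or -1 on failure. */
int run_step(const char *command)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: replace the inherited memory image immediately, so
         * copy-on-write never has to duplicate the parent's pages. */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127); /* reached only if execl itself fails */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A sequence of dependent build steps can then be expressed as consecutive calls, for
instance run_step("make clean") followed by run_step("make oldconfig"), stopping at
the first non-zero result.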
Each child sends the corresponding compiler flags to the build farm, and then
awaits a response containing the fitness score. The parent process, rather than sending
the compiler flags to a build farm, compiles the kernel locally. To build the kernel, the
parent process forks a child to perform each kernel build task, and then synchronizes
the two processes using waitpid, so that in cases where a job depends upon the
previous step, no dependency violations occur. Algorithm 4 demonstrates the fitness
evaluation function. At Line 1, a child process is created for each machine within the
builder server. While this is performed, the machine running the genetic algorithm
also builds the kernel, using the build rules, clean, oldconfig, bzImage, and then
creates an archive of the necessary files to install the produced kernel. The first build
rule executed is clean. This rule deletes the files created from the last kernel build.
This reduces the chances of cross-contamination, the kernel using already generated
files from the last compilation. After this, the build rule oldconfig updates the
.config file, which contains the kernel’s configuration and settings. This process
generates some files which are necessary for the kernel image to build properly.
Algorithm 4
Procedure fitness(P)
(∗ Evaluates the Fitness of Each Individual ∗)
Output: Population with New Fitness Values
Input: Population with Some Individuals Requiring Fitness Evaluation
1.  for i ← 0 to Number of Build Machines − 1
2.      do
3.          Fork Child
4.          if Pid of Child
5.              then
6.                  Send Individual Flag to Builder
7.                  Await Fitness Response
8.  Fork Child
9.  if Pid of Child
10.     then
11.         make clean
12. Waitpid Child
13. Fork Child
14. if Pid of Child
15.     then
16.         make oldconfig
17. Waitpid Child
18. Fork Child
19. if Pid of Child
20.     then
21.         make bzImage
22. Waitpid Child
23. Fork Child
24. if Pid of Child
25.     then
26.         Create tar archive of bzImage and System.map
27. Waitpid Child
28. Wait To Sequentially Receive Fitness for Each Kernel Created by Children
29. Send Kernel Archive Created by Parent to Testing Farm
30. Wait To Receive Fitness For Parent Kernel
Once this task has completed, the actual compressed kernel image is built. The
build rule bzImage is invoked with the flags HOSTCFLAGS and HOSTCXXFLAGS set
to the individual's corresponding compiler options, AR set to xiar, the Intel
archiver, and LD set to xild, the Intel linker. This process constructs the
compressed kernel image, which is then ready for installation on the target system.
Typically, the next step would be to build the kernel modules. However, since the
modules are never modified, compiling them at every iteration wastes valuable time.
Therefore, the kernel modules are built separately and installed on the testing farm
machines before the genetic algorithm is run.
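The variable assignments passed to the bzImage rule can be assembled with a bounded
formatting call. The following is only a sketch: make_var is a hypothetical helper,
and the buffer sizes are left to the caller.

```c
#include <stdio.h>
#include <string.h>

/*
 * Writes "NAME=value" into dest, which holds cap bytes.
 * Returns 0 on success, or -1 if the result would not fit.
 * make_var is a hypothetical helper, not a function from the thesis.
 */
int make_var(char *dest, size_t cap, const char *name, const char *value)
{
    int n = snprintf(dest, cap, "%s=%s", name, value);

    /* snprintf reports the length it wanted to write, so a value of
     * cap or more means the output was truncated. */
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```

The resulting strings, for HOSTCFLAGS and HOSTCXXFLAGS, could then be handed to
execl alongside "AR=xiar", "LD=xild", and "bzImage". Unlike an unchecked strcpy and
strcat pair, an overly long flag list is reported instead of overflowing the buffer.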
Now that the kernel is built, it is ready for installation. Typically, to install the
kernel, the build rule install is invoked. However, this requires archiving, and ideally
compressing, the entire kernel source in preparation for network transfer. Regardless
of the compression algorithm used, including bzip2, gzip, lzop, and lzma, these tasks
are very time-consuming, and can take up to 30 minutes. To remedy this, only
the necessary parts of the kernel are archived: the compressed kernel image, located at
arch/x86/boot/bzImage, and the System.map file, located in the root source
directory. At boot, the compressed kernel image is decompressed and executed. Because
kernel layouts differ between builds, the address of each function in the kernel is
recorded in the System.map file. By targeting only these two files, the entire process
of preparing the kernel for installation is reduced to just under 1 minute. Combined
with exploiting the parallelism in GNU Make through the -j n flag, where n jobs are
executed in tandem, the entire preparation process takes approximately 6 minutes,
depending on disk latency, processor speed, and the selected compiler optimizations.
state, more kernels can begin to build, while the previous set of kernels are still under
evaluation.
The implementation of the build farm can be found in Section A.3.
Listing 9 Timing the mmap2 System Call.
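clockid_t cl;
struct timespec start, end;
void* buffer;

/* CPU-time clock of the calling process (pid 0) */
clock_getcpuclockid(0, &cl);
clock_gettime(cl, &start);
__asm__ __volatile__ (
    "movl $192 ,%%eax\n\t"   /* system call number of mmap2 */
    "movl $0 ,%%ebx\n\t"     /* addr: let the kernel choose */
    "movl $8192,%%ecx\n\t"   /* length in bytes */
    "movl $0x3 ,%%edx\n\t"   /* prot: PROT_READ | PROT_WRITE */
    "movl $0x22,%%esi\n\t"   /* flags: MAP_PRIVATE | MAP_ANONYMOUS */
    "movl $-1 ,%%edi\n\t"    /* fd: none */
    "movl $0 ,%%ebp\n\t"     /* page offset */
    "int $0x80 \n\t"
    "movl %%eax, %0"
    : "=m" (buffer)
    :
    : "%eax", "%ebx", "%ecx", "%edx", "%esi",
      "%edi", "%ebp");
clock_gettime(cl, &end);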
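The same measurement can be sketched portably by swapping the inline assembly for
the C library's mmap wrapper and by using the CLOCK_PROCESS_CPUTIME_ID clock
directly. This is an illustrative variant under those assumptions, not the code used
for the experiments; timespec_diff and time_mmap are hypothetical names.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* expose MAP_ANONYMOUS and clock_gettime */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>

/* Returns end - start, normalized so that 0 <= tv_nsec < 1000000000.
 * Assumes end is not earlier than start. */
struct timespec timespec_diff(struct timespec start, struct timespec end)
{
    struct timespec d;

    d.tv_sec = end.tv_sec - start.tv_sec;
    d.tv_nsec = end.tv_nsec - start.tv_nsec;
    if (d.tv_nsec < 0) {       /* borrow one second */
        d.tv_sec -= 1;
        d.tv_nsec += 1000000000L;
    }
    return d;
}

/* Times one anonymous 8192-byte mapping, the same request made by the
 * assembly listing. Stores the CPU time consumed in *elapsed and
 * returns 0 on success, or -1 on failure. */
int time_mmap(struct timespec *elapsed)
{
    struct timespec start, end;
    void *buffer;

    if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &start) != 0)
        return -1;
    buffer = mmap(NULL, 8192, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &end) != 0)
        return -1;
    if (buffer == MAP_FAILED)
        return -1;
    munmap(buffer, 8192);      /* released outside the timed region */
    *elapsed = timespec_diff(start, end);
    return 0;
}
```

Going through the library wrapper adds a small amount of user space overhead to each
measurement, which is presumably why the listing above issues the interrupt directly.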
After all system calls have been timed, the next step is to calculate the fitness
score. Algorithm 5 demonstrates how the fitness score is derived from the total time
measured for the system calls considered. So that higher fitness scores are better, the
fitness starts at the largest possible value and is then decreased according to the total
measured system call duration.
Algorithm 5
Procedure Calculate_Fitness(T)
(∗ Calculates the Fitness Score ∗)
Input: Total Time For System Calls To Execute T
Output: Fitness Score n
1. n ← INT_MAX
2. for i ← 0 to Number of Seconds in T
3.     do n ← n − 10000000
4. n ← n − Number of NanoSeconds in T
5. return n
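Algorithm 5 can be expressed compactly in C. The sketch below follows the algorithm
as written; the name calculate_fitness, the use of a struct timespec for T, and the
clamp at INT_MIN are assumptions of this example rather than details taken from the
implementation.

```c
#include <limits.h>
#include <time.h>

/* Penalty per whole second, taken from line 3 of Algorithm 5. */
#define SECOND_PENALTY 10000000LL

/*
 * Starts at INT_MAX and subtracts a fixed penalty for every whole
 * second plus one point for every remaining nanosecond.
 */
int calculate_fitness(struct timespec total)
{
    long long n = INT_MAX;                            /* line 1 */

    n -= (long long)total.tv_sec * SECOND_PENALTY;    /* lines 2-3 */
    n -= total.tv_nsec;                               /* line 4 */

    /* Assumption of this sketch: clamp instead of overflowing when
     * the measured time is very large. */
    if (n < INT_MIN)
        n = INT_MIN;
    return (int)n;                                    /* line 5 */
}
```

The multiplication replaces the loop over seconds in lines 2 and 3, and the
arithmetic is carried out in a wider type so that long durations cannot wrap around.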
Chapter 4
Results
4.1 Configurations
The genetic algorithm was run in three unique configurations, which are listed in
Table 4.1. For all three configurations, a population size of ten was used.
As discussed in Section 2.3, Cooper et al. found that population sizes over twenty did
not provide better results than population sizes under twenty [4]. The population of
twenty was reduced to ten for these experiments due to time constraints. By reducing
the population size to ten, more time was available for testing different configurations.
Two of the three configurations tested ran for five generations. The other
configuration ran for ten generations. The primary test was run 0, and the other two were
modifications of that run. Run 1 increased the generation count to ten, and added
two more system calls, to gauge the effect of these changes, compared to run 0. Run
2 maintained the same number of generations as run 0, but added more system calls.
The system calls chosen were determined by the application profiler discussed in
Section 3.1. The application profiler analyzed the UNIX commands ls, du, and
mkdir. The results from the profiler can be found in Table 4.2. Out of the 300+
system calls, only the top 12 are listed; all other system calls had counts in the single
digits, with most at 0. From this table, open, close, read, write, fstat64,
mmap2, and munmap were chosen for use.
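Ranking the profiled system calls by frequency, as needed to produce a list of the
most used calls, can be sketched with qsort. The struct mirrors the call structure
from the profiler listing in Appendix A; by_counts_desc and rank_calls are
hypothetical names introduced for this example.

```c
#include <stdlib.h>

/* Mirrors the profiler's struct call from Appendix A. */
struct call {
    char *name;      /* name of the system call */
    int counts;      /* number of times it was observed */
    long int num;    /* system call number (eax) */
};

/* Orders entries from the highest count to the lowest. */
static int by_counts_desc(const void *a, const void *b)
{
    const struct call *x = a;
    const struct call *y = b;

    /* Comparing instead of subtracting avoids integer overflow. */
    return (y->counts > x->counts) - (y->counts < x->counts);
}

/* Sorts the table in place so that the most frequent calls come first. */
void rank_calls(struct call *calls, size_t total)
{
    qsort(calls, total, sizeof *calls, by_counts_desc);
}
```

After rank_calls returns, the first twelve entries of the array correspond to the
rows that a table of the top 12 system calls would contain.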
Run  Population  Generations  System Calls
0    10          5            Open, Close, Read, Write, Fstat64
1    10          5            Open, Close, Read, Write, Fstat64, Fork
2    10          10           Open, Close, Read, Write, Fstat64, Fork, Mmap2, Munmap
All three evolved kernels resulted in a higher total fitness than both the Fedora
default kernel and the ICC kernel. Interestingly, the ICC kernel scored a lower fitness
value than the default Fedora kernel. The Fedora kernel was most likely built with
either -O2 or -O3, whereas the ICC kernel was built without any optimizations.
While this may seem like an unfair comparison, it provides a baseline for evaluating
the flags the genetic algorithm chose.
While the evolved kernels have a higher fitness value, they must be analyzed and
benchmarked to determine whether this increased fitness corresponds to an increase
in performance.
added overhead on SSE3 instructions may be responsible, among other factors, for a
lower fitness score [9].
The kernel produced by run 2 was built using the flags -mcpu=pentium4,
-march=pentium4, -scalar-rep, -alias-const, -fargument-noalias-global,
-opt-ra-region-strategy=default, -vec, -par-schedule=auto,
-fast-transcendentals, -fp-port, -rcd, -ftz, -inline-level=2, -finline,
-Zp16, -align, -falign-functions=16, and -falign-stack=assume-16-byte.
Instead of performing SSE optimizations like the other two kernels, the first two flags,
-mcpu=pentium4 and -march=pentium4, tune code generation for the Pentium
line of processors. While these optimizations are probably somewhat beneficial,
they will not fully utilize any architecture-specific features introduced after the
Pentium series of Intel processors, and thus will not take full advantage of the Intel Atom.
Just as in the previous case, the compiler options show favor to 16-byte alignment.
The option -par-schedule=auto allows either the compiler or the run-time libraries
to determine the best scheduling algorithm for parallel loop iteration. This can be
beneficial in some situations, or an added overhead in others, depending on which
scheduler is chosen and how it is chosen.
While these observations are merely speculations based on fitness scores and
compiler optimizations, the real performance analysis occurs at the user space level, where
application benchmarks are run to characterize the performance of the kernels.
Test       Description
LAME       WAV to MP3 Encoding Using LAME
OGG        WAV to Ogg Encoding
FFMpeg     AVI to VCD Using FFMpeg
7Zip       7-Zip File Compression
Scimark0   Composite Scientific Computation
Scimark1   FFT Calculation
Scimark2   Monte Carlo Simulation
IOZone0    512 MB Write Performance
IOZone1    512 MB Read Performance
IOZone2    1 GB Write Performance
SqlLite    12,500 Inserts to SQLite DB
GNUPG      2 GB File Encryption
CRay       Ray Tracing
Ram        Integer Add
GTKPerf0   GtkComboBox
GTKPerf1   GtkPixBuf
GTKPerf2   GtkRadioButton
The Phoronix netbook test suite was run on each kernel considered. Each of these
tests provides unique insight into which kernels were best suited for certain tasks.
Also considered was a kernel built with the Intel C compiler with no optimizations.
This allows for a comparison of the compiler options the genetic algorithm chose.
OGG Test.
The next test performed was the OGG test, where a WAV file was encoded as an
OGG file. Table 4.6 displays the results. The fastest kernel was the run 0 kernel, and
again the slowest kernel was the stock Fedora kernel. The combination of -axSSE3
and -mia32 might have been responsible for this performance increase, although that
is purely speculation.
Scimark.
The next three tests are performed using SciMark, a Java benchmark for testing
scientific and numerical computing. The first test, SciMark0, runs five
computational kernels: an FFT kernel, a Gauss-Seidel relaxation kernel, a sparse
matrix-multiply kernel, a Monte Carlo integration kernel, and a dense LU factorization kernel
[22]. The test compares the combined composite score of all of these kernels. Table
4.7 shows the results of this test. The fastest kernel was produced by run 0. The
second fastest kernel was produced by run 2. The stock Fedora kernel was the slowest
of the kernels tested.
Kernel   Average Duration (sec)
ICC      120.42
Run 1    120.4
Stock    120.74
Run 0    120.03
Run 2    120.1
The next SciMark test, SciMark1, only compares the performance of the FFT
kernel [22]. As seen in Table 4.8, the ICC kernel with no optimizations performed the
best here. The stock Fedora kernel performed substantially worse than the others,
taking a full 2 seconds more.
The last SciMark test, SciMark2, compares the performance of only the Monte
Carlo integration kernel [22]. The results are shown in Table 4.9. For this test,
the kernels performed very similarly, with almost no difference between them,
perhaps because this task did not interact with the kernel often. The stock Fedora
kernel took the longest to complete the computations.
SQLite.
SQLite is a lightweight file-based database system. The SQLite benchmark
evaluates the performance of 12,500 inserts. Table 4.10 displays the results. In this
test, the stock Fedora kernel performed significantly worse than the Intel-built
kernels. The kernel produced by run 2 outperformed all of the other Intel-built
kernels by 1 to 2 seconds. This result may be caused by the Intel-built kernels using
a different IO scheduler than the Fedora kernel. The Fedora kernel, which was
configured by the Fedora kernel developers, uses the Completely Fair Queuing (CFQ)
scheduler, while the kernels built with the Intel compiler were configured to use the
Anticipatory IO Scheduler (AS). The Anticipatory IO Scheduler is optimized to avoid
disk head movements, which can be ideal in mobile devices, in an attempt to reduce
power consumption.
Kernel   Average Duration (sec)
ICC      41.65
Run 1    41.81
Stock    41.87
Run 0    41.84
Run 2    41.84
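Which IO scheduler a running kernel uses can be checked from user space. On Linux
kernels of this generation, the file /sys/block/<device>/queue/scheduler lists the
available schedulers on one line and encloses the active one in square brackets. The
sketch below extracts that name from such a line; active_scheduler is a hypothetical
helper, and the device name used when reading the file is left to the caller.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copies the scheduler name enclosed in [brackets] from `line` into
 * `out`, which holds `cap` bytes. Returns 0 on success, or -1 if no
 * bracketed name is present or it does not fit.
 * active_scheduler is a hypothetical helper, not part of the thesis.
 */
int active_scheduler(const char *line, char *out, size_t cap)
{
    const char *lb = strchr(line, '[');
    const char *rb = lb ? strchr(lb, ']') : NULL;
    size_t len;

    if (lb == NULL || rb == NULL)
        return -1;
    len = (size_t)(rb - lb - 1);      /* characters between the brackets */
    if (len + 1 > cap)
        return -1;
    memcpy(out, lb + 1, len);
    out[len] = '\0';
    return 0;
}
```

Reading one line from the scheduler file with fgets and passing it to this helper on
both the Fedora kernel and an Intel-built kernel would confirm whether the two really
differ in scheduler configuration.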
GNUPG.
The GNU Privacy Guard (GNUPG) test evaluates encrypting a 2 GB file using
GNUPG. The results are visible in Table 4.11. The slowest kernel was the stock
Fedora kernel. The fastest kernel was the kernel produced by run 0. This is probably
a result of SSE3 optimizations, but that is merely speculation.
Cray.
The C-ray test measures the time required to ray-trace a scene. The results from
this test can be found in Table 4.12. The fastest kernel produced was the kernel from
run 1. The slowest kernel was the stock Fedora kernel. In general, ray-tracing tends
to be a highly parallel task, which can benefit from architecture parallelism the Intel
C compiler can exploit.
GTKPerf0.
The next three tests perform benchmarks using the GTK graphics library, which
is the basis of the popular GNOME desktop. The first test performs a series of
operations on a GtkComboBox. Table 4.13 displays the results. The slowest kernel of
the group was the stock Fedora kernel. The fastest of the group was the kernel produced
by run 1.
Kernel   Average Duration (sec)
ICC      2564.34
Run 1    2563.4
Stock    2570.03
Run 0    2564.19
Run 2    2563.91
GTKPerf1.
The next GTK test was a benchmark of the GTK PixBufs. Table 4.14 displays
the results. The fastest kernel in this test was the stock Fedora kernel. The slowest
was run 1.
GTKPerf2.
The final GTK test was a benchmark of the GTK radio button. Table 4.15 displays
the results. The fastest kernel in this test was the kernel produced in run 0. The
slowest kernel produced was the stock Fedora kernel.
Kernel   Average Duration (sec)
ICC      26.93
Run 1    26.65
Stock    27.34
Run 0    26.52
Run 2    26.57
IOZone0.
The next three tests are performed using the IOZone filesystem benchmark. As
seen in Table 4.17, Table 4.18, and Table 4.19, the Fedora kernel was the fastest
kernel for all IOZone tests by a significant margin.
The first test benchmarked 512 MB write performance. Interestingly, the slowest
kernel was the ICC kernel with no optimizations.
Comparing this kernel with the results of the three runs, it is apparent that in each
case, the optimizations chosen by the genetic algorithm did improve performance.
IOZone1.
In the next IOZone test, 512 MB read performance is benchmarked. Just like
the results for the first IOZone test, the worst kernel was the ICC kernel. When
compared to the ICC kernel, each of the kernels evolved by the genetic algorithm
improved performance.
IOZone2.
For the last IOZone test, 1 GB write performance was benchmarked. Yet again,
the trend continues from the previous two IOZone tests. The ICC kernel was the
worst, and the three evolved kernels improved performance in comparison.
While there is no empirical evidence to prove this, it is the author's opinion that
these performance results occurred due to the difference in IO scheduler configurations
discussed earlier in regard to the SQLite test.
Kernel   Average Duration (sec)
ICC      54.8
Run 1    53.3
Stock    36.46
Run 0    51.95
Run 2    52.48
Ram.
The RAMspeed test performs integer addition in memory, to benchmark the per-
formance of memory access. Table 4.20 shows the results. The slowest kernel pro-
duced was the kernel from run 1, and the Fedora kernel was the fastest. In this case,
the ICC kernel was the next fastest kernel, thus indicating that all options chosen by
the genetic algorithm reduced performance in this case.
Kernel   Average Duration (sec)
ICC      2006.93
Run 1    2021.47
Stock    1888.01
Run 0    2006.98
Run 2    2018.61
In the FFMpeg test, an AVI file was converted to an NTSC VCD file. Table 4.21
displays the results. While all of the kernels were very close, run 0 was slightly faster
than the rest, and the slowest was the stock Fedora kernel. This lack of variety within
the test results hints that this benchmark did not interact with the kernel often.
Kernel   Average Duration (sec)
ICC      94.43
Run 1    94.47
Stock    94.52
Run 0    94.20
Run 2    94.17
Figure 4.1: Phoronix Benchmark Results (Smaller Is Better).
Chapter 5
“In the future, computers may weigh no more than 1.5 tonnes.”
– Popular Mechanics, 1949
5.1 Conclusion
This work accomplished what it set out to do. The genetic algorithm was
able to effectively find a set of compiler optimizations that interacted well and yielded
better performance than the stock Fedora kernel on the Intel Diamondville platform.
Now that this system has been a successful proof-of-concept, it is time to expand
to more platforms and more compilers, while also expanding the system features to
eventually meet the needs of the home user, who wishes to automatically optimize
his kernel, or of the chipset manufacturer, who wishes to tune vendor compilers and
provide optimization strategies to developers.
5.2 Contributions
This system is not going to change the way millions of people compile their kernels.
What it does contribute is a proof-of-concept. It shows that genetic algorithms can
not only improve user space executable performance, but can also be expanded to
improve kernel performance. It also reinforces the claim that genetic algorithms excel
at finding correlations between options in optimization problems, and it motivates
future work.
Appendix A
Code Listings
#define _BSD_SOURCE

#include<sys/ptrace.h>
#include<sys/types.h>
#include<sys/wait.h>
#include<sys/mman.h>
#include<sys/user.h>
#include<unistd.h>
#include<sys/syscall.h>
#include<stdio.h>
#include<stdlib.h>
#include<string.h>

/**
 * \struct call
 * Holds the name and total of system calls
 */
struct call
{
        char* name;     /** \brief the name of the system call */
        int counts;     /** \brief the total number of calls throughout the
                         * benchmarks
                         * */
        long int num;   /** \brief the system call number (eax) */
};

        /* ... */
        if(argc < 3) {
                fprintf(stderr, "Usage:\n%s <syscall file> <benchmark 0>"
                                " ... <benchmark N>\n", *argv);
                return EXIT_FAILURE;
        }

        /* ... */
        if(syscalls == NULL) {
                perror("Opening System Call File");
                return EXIT_FAILURE;
        }

        count = 0;
        line = malloc(BUFSIZ);
        if(line == NULL) {
                perror("Allocating Memory");
                fclose(syscalls);
                return EXIT_FAILURE;
        }

        total = 512;
        calls = malloc(sizeof(struct call)*total);

        if(calls == NULL){
                perror("Allocating Memory for System Calls");
                free(line);
                fclose(syscalls);
                return EXIT_FAILURE;
        }

        /* stop as soon as a full "name number" pair can no longer be read */
        while(fscanf(syscalls, "%s %li\n", line,
                        &((count+calls)->num)) == 2) {
                /* +1 leaves room for the terminating NUL byte */
                (count+calls)->name = malloc(strlen(line)+1);
                strcpy((count+calls)->name, line);
                (count+calls)->counts = 0;
                if(count == total-1) {
                        /* grow the table by another 256 entries */
                        total += 256;
                        void* new = realloc(calls,
                                        sizeof(struct call)*total);
                        if(new == NULL) {
                                perror("Reallocating more memory "
                                        "for the system calls");
                                for(count = 0; count < total-256; count++){
                                        if( (calls+count)->name != NULL) {
                                                free( (calls+count)->name);
                                        }
                                }
                                fclose(syscalls);
                                free(line);
                                free(calls);
                                return EXIT_FAILURE;
                        }
                        else {
                                calls = new;
                        }
                }
                count++;
        }

        free(line);
        fclose(syscalls);
        madvise(calls, sizeof(struct call) * total, MADV_DONTFORK);
                /* ... */
                }
                else {
                        struct user_regs_struct regs;
                        while(1) {
                                wait(&status);
                                ptrace(PTRACE_SYSCALL, child, 0, 0);
                                wait(&status);
                                ptrace(PTRACE_GETREGS, child, 0, &regs);
                                for(int i = 0; i < total; i++) {
                                        if( (calls+i)->num == regs.orig_eax) {
                                                (calls+i)->counts++;
                                                break;
                                        }
                                }
                                ptrace(PTRACE_SYSCALL, child, 0, 0);
                                if(WIFEXITED(status)) {
                                        break;
                                }
                        }
                }
        }

175 (calls+count)->name,
176 (calls+count)->counts);
177 }
178 }
179
190 );
191
192
193 fclose(results);
194 fclose(script);
195 sync();
196
206
212 free(calls);
213 }
cc = gcc
out = profiler

build_debug: profile.c;
	${cc} profile.c -g -std=c99 -Wall -pedantic -o ${out}
build: profile.c;
	${cc} profile.c -std=c99 -O2 -o ${out}
A.2 GA Code
A.2.1 main.h
1 /**
2 * \file main.h
3 * \author Jim Kukunas <jkukunas@acm.org>
4 * \date 02/2010
5 * \brief Function declarations to create and run the system
6 *
7 * \mainpage Senior Thesis
8 *
9 * \section Introduction
10 * This project ...
11 *
12 */
13
14 #ifndef _MAIN_H
15 #define _MAIN_H
16
20 #include<string.h>
21 #include<stdio.h>
22 #include<stdlib.h>
23 #include<stdint.h>
24 #include<math.h>
25 #include<sys/mman.h>
26 #include<sys/wait.h>
27 #include<sys/types.h>
28 #include<sys/stat.h>
29 #include<fcntl.h>
30 #include<unistd.h>
31 #include<errno.h>
32 #include<time.h>
33
34 #include<sys/socket.h>
35 #include<sys/sendfile.h>
36 #include<arpa/inet.h>
37 #include<netdb.h>
38
39 #include"hash.h"
40
44 struct command_opt
45 {
46 char** flag;
47 int counts;
48 int dif_index;
49 };
50
51 struct inhab
52 {
53 char* member,
54 * line;
55 int fitness,
56 mem_size;
57 int choices[29];
58 };
59
71 int* size,
72 const char* file);
73
84
99 //hash_t table;
100
5 const uint64_t option_count)
6 {
7 char* line;
8 int rand_fd;
9
10 line = malloc((int)((option_count/8)+1));
11 if(line == NULL) {
12 perror("Allocating Memory");
13 return 0x01;
14 }
15 rand_fd = open("/dev/urandom", O_RDONLY);
16 if(rand_fd == -1) {
17 perror("Could Not Open /dev/urandom");
18 free(line);
19 return 0x01;
20 }
21
40
41 close(rand_fd);
42 free(line);
43
44 return 0;
45 }
46
68 int diff_index = 0;
69
70 while(!feof(cmd)) {
71 fscanf(cmd, "%s", line);
72 if(!strncmp(line, "EITHER", 6)) {
73 fscanf(cmd, "%i", &(((*dest)+index)
->counts));
74 (((*dest)+index)->flag) = malloc(((*
dest)+index)->counts * sizeof(
char*));
75 ((*dest)+index)->dif_index =
diff_index;
76 diff_index++;
77 for(tmp = 0; tmp < ((*dest)+index)->
counts; tmp++) {
78 ((*dest)+index)->flag[tmp] =
malloc(80);
79 getline(&((*dest)+index)->
flag[tmp], &line_size,
cmd);
80
81 }
82 }
83 else {
84 (((*dest)+index)->flag) = malloc(
sizeof(char*));
85 ((*dest)+index)->flag[0] = malloc
(80);
86 strcpy(((*dest)+index)->flag[0],
line);
87 ((*dest)+index)->counts = 1;
88 }
89 index++;
90 }
91
92 *size = index;
93 free(line);
94 fclose(cmd);
95
98 return 0x00;
99 }
100
101
119 if(size) {
120 bit = tmp[1] & 1 ? 1 : 0;
121 }
122 *tmp >>= 1;
123 *tmp |= bit;
124 }
125 }
126
154
        for(int i = 0; i < pop_size; i++) {
                int byte = 0;
                if(((*pop)+i)->fitness == -1) {
                        for(int j=0; ((byte*8)+j) < opt_size; j++) {
                                if(*((((*pop)+i)->member)+byte) & (1 << j)) {
                                        if( ((*opts)+((byte*8)+j))->counts > 1) {
                                                if(((*pop)+i)->choices[
                                                    ((*opts)+((byte*8)+j))->dif_index]
                                                    == 0)
                                                {
                                                        ((*pop)+i)->choices[((…
                                                            ra…
                                                            (…
                                                        strcat(((*pop)+i)->line,
                                                            ((*opts)+((byte*8)+j))->flag[
                                                            ((*pop)+i)->choices[
                                                            ((*opts)+((byte*8)+j))->dif_index]]);
                                                        strcat(((*pop)+i)->line, " ");
                                                }
                                        }
                                        else {
                                                strcat( ((*pop)+i)->line,
                                                    ((*opts)+((byte*8)+j))->flag[0]);
                                                strcat( ((*pop)+i)->line, " ");
                                        }
                                }

                                if(j % 7 == 0) {
                                        byte++;
                                        j = 0;
                                }
                        }

225 inet_pton(AF_INET, "141.195.226.19",
226 &netbook.sin_addr);
227
228 con:
229 if(connect(send_sock, (struct sockaddr*)&netbook,
230 sizeof(netbook)) == -1) {
231 perror("Connecting to Netbook... Trying
again");
232 sleep(10);
233 goto con;
234 }
235
236
265 perror("Reading Fitness Score");
266 }
267
272 /* Kernel 2 */
273
283 _con:
284 if(connect(send_sock, (struct sockaddr*)&netbook,
285 sizeof(netbook)) == -1) {
286 perror("Connecting to Netbook... Trying
again");
287 sleep(10);
288 goto _con;
289 }
290
291
305 meh = 0;
306 while( meh < file_stat2.st_size) {
307 s = read(file, meh_, 1023);
308 meh += send(send_sock, meh_, s, 0);
309 }
310 shutdown(send_sock, SHUT_RDWR);
311 close(file);
312
325 free(meh_);
326 close(new_sock);
327 }
328
341 if(child == 0) {
342 if( ((*pop)+i)->fitness == -1) {
343
                                        SOCK_STREAM, 0);
                                struct sockaddr_in builder;

                                memset(&builder, 0, sizeof(struct sockaddr_in));
                                builder.sin_family = AF_INET;
                                builder.sin_port = htons(12345);
                                // inet_pton(AF_INET, "141.195.226.146",
                                inet_pton(AF_INET, "141.195.226.136",
                                        &builder.sin_addr);

                                if(connect(sock, (struct sockaddr*)&builder,
                                        sizeof(builder)) == -1) {
                                        perror("Connecting to Builder");
                                        close(sock);
                                }

                                /* ... */

                                        memset(&fit_addr, 0,
                                                sizeof(struct sockaddr_in));
                                        fit_addr.sin_family = AF_INET;
                                        fit_addr.sin_port = htons(12345);
                                        fit_addr.sin_addr.s_addr = INADDR_ANY;
                                        if(bind(fit_sock, (struct sockaddr*)&fit_addr,
                                                sizeof(fit_addr)) == -1) {
                                                perror("Binding Failed");
                                                break;
                                        }

                                        if(listen(fit_sock, 1024) == -1) {
                                                perror("Listening");
                                                break;
                                        }

                                }
                                int sock_n = accept(fit_sock,
                                        (struct sockaddr*)&meh_addr,
                                        &length__);
                                /* ... */
                                        Connection");
                                        break;
                                }
                                if(recv(sock_n, buffer, 80,0) < 1) {
                                        perror("Reading Fitness Score");
                                        break;
                                }

                                ((*pop)+i)->fitness = atoi(buffer);
                                fprintf(stdout, "Kernel Received Fitness of %s\n",
                                        buffer);
                                //hstore(&table, ((*pop)+i)->member, ((*pop)+i)->fitness);
                                free(buffer);
                                _exit(0);
                        }

                }
409 else {
410 /*
411 int store_ = hload(&table, ((*pop)+i
+1)->member);
412 if(store_ != 0) {
413 ((*pop)+i+1)->fitness =
store_;
414 }
415 else {
416 */
417 char* HOSTCFLAGS = malloc
(8192+13),
418 * HOSTCXXFLAGS = malloc
(8192+15);
419
430
-2.6.30.5/");
437 execl("/usr/bin/make
", "make", "-j4",
438 "
clean
",
NULL
)
;
439 }
440 waitpid(pid, &status, 0);
441
456 ((*pop)+i+1)
->fitness
= -500;
457 break;
458 }
459 }
460
461
470 HOSTCFLAGS,
HOSTCXXFLAGS
,
471 "LD=xild", "
bzImage",
472 NULL);
473 }
474
")
;
480 ((*pop)+i+1)
->fitness
= -500;
481 break;
482 }
483 }
484
498
501 if(send_sock == -1) {
502 perror("Creating the
send socket");
503 }
504
512 con:
513 if(connect(send_sock, (
struct sockaddr*)&netbook
,
514 sizeof
(
netbook
)
)
==
-1)
{
515 perror("Connecting
to Netbook...
Trying again");
516 sleep(10);
517 goto con;
518 }
519
520
Netbook");
522

                                memset(&fit_addr, 0, sizeof(struct sockaddr_in));
                                fit_addr.sin_family = AF_INET;
                                fit_addr.sin_port = htons(12345);
                                fit_addr.sin_addr.s_addr = INADDR_ANY;
                                if(bind(fit_sock, (struct sockaddr*)&fit_addr,
                                        sizeof(fit_addr)) == -1) {
                                        perror("Binding Failed");
                                }

                                if(listen(fit_sock, 1024) == -1) {
                                        perror("Listening");
                                }

                        }

568
580 close(new_sock);
581 free(rcbuffer);
582 }
583 }
584 }
585
596
602 /*
603 *(dest->member) |= ( *(a->member) << (int)floor(
604 (strlen(a->member)*8)/2));
605 *(dest->member) |= ( *(b->member) >> (int)floor(
606 (strlen(b->member)*8)/2));
607 */
608
617
633
641 int part = rand() % (int)floor(.50 * (
pop_size-1)) +
642 (int) floor(.25 * (pop_size
-1));
643 reproduction(log, ((*pop)+pop_size-i-1),
644 ((*pop)+pop_size -part-1),
645 ((*pop)+i));
646 }
647 }
648
649
650
680
723 }
724
A.2.3 Makefile
CC = gcc
OUT = out

debug:
	${CC} main.c hash.c -std=c99 -g -Wall -pedantic -o ${OUT} -lm
release:
	${CC} main.c hash.c -std=c99 -O2 -lm -o ${OUT}
parser:
	lex opt_parser.lex
	${CC} lex.yy.c -o lex_parser -ll
3 #include<stdio.h>
4 #include<stdlib.h>
5 #include<string.h>
6
7 #include<unistd.h>
8 #include<sys/types.h>
9 #include<sys/socket.h>
10 #include<sys/sendfile.h>
11 #include<sys/mman.h>
12 #include<sys/wait.h>
13 #include<sys/stat.h>
14
15 #include<fcntl.h>
16 #include<arpa/inet.h>
17 #include<netdb.h>
18
19 void send_bad_fitness(void);
20 void send_complete(void);
21
57 buffer = malloc(8192);
58 while(1) {
59 begin:
60 length = sizeof(new_addr);
61 new_sock = accept(sock, (struct sockaddr*)&
new_addr,
62 &length);
63 if(new_sock == -1) {
64 perror("Accepting");
65 break;
66 }
67 memset(buffer, 0, 8192);
68
71 strcpy(HOSTCFLAGS, "HOSTCFLAGS=");
72 strcat(HOSTCFLAGS, buffer);
73 strcpy(HOSTCXXFLAGS, "HOSTCXXFLAGS=");
74 strcat(HOSTCXXFLAGS, buffer);
75
78 child = fork();
79 if(child == -1) {
80 perror("Another Fork Problem");
81 break;
82 }
83 else if(child == 0) {
84 chdir("/home/cs8/kukunaj/linux
-2.6.30.5/");
85 execl("/usr/bin/make", "make", "-j4
",
86 "clean", NULL);
87 }
88 waitpid(child, &status, 0);
89
90 child = fork();
91 if(child == -1){
92 perror("Comeon");
93 break;
94 }
95 else if(child == 0) {
96 chdir("/home/cs8/kukunaj/linux
-2.6.30.5/");
97 execl("/usr/bin/make", "make", "-j4
",
98 "oldconfig",NULL);
99 }
100 waitpid(child, &status, 0);
101
102
131 "/home/cs8/kukunaj/linux
-2.6.30.5/arch/x86/boot/
bzImage",
132 "/home/cs8/kukunaj/linux
-2.6.30.5/System.map",
NULL);
133 }
134 waitpid(child, &status, 0);
135
167 break;
168 }
169
209 }
210
4 debug:
5 ${CC} -Wall -pedantic -std=c99 main.c -o ${OUT} -g
6 release:
7 ${CC} main.c -std=c99 -O2 -o ${OUT}
3
4 #include<stdio.h>
5 #include<stdlib.h>
6 #include<string.h>
7 #include<errno.h>
8 #include<unistd.h>
9 #include<time.h>
10 #include<limits.h>
11
12 #include<sys/types.h>
13 #include<sys/socket.h>
14 #include<sys/stat.h>
15 #include<fcntl.h>
16 #include<linux/reboot.h>
17 #include<sys/reboot.h>
18 #include<sys/wait.h>
19 #include<netinet/in.h>
20 #include<netinet/ip.h>
21 #include<arpa/inet.h>
22
23
44 if(dest->tv_nsec >= 1000000000) {
45 dest->tv_sec++;
46 dest->tv_nsec -= 1000000000;
47 }
48 return;
49 }
50
63 meh:
64 if(connect(sock, (struct sockaddr*)&addr, sizeof(
addr)) != 0) {
65 perror("Connecting to Send Fitness");
66 sleep(10);
67 goto meh;
68 }
69 fprintf(stdout, "Succesfully Connected\n");
70
71 fit = malloc(15);
72 memset(fit, 0, 15);
73 sprintf(fit, "%d", fitness);
74
82
83
84 int runTests(void)
85 {
86 clockid_t cl;
87 int child;
88 struct timespec start, end, total,
89 op, mm, clo, fs64, rd, fork, wr, mum
;
90
91 if(clock_getcpuclockid(0, &cl) != 0) {
92 perror("Getting Clock Id");
93 return EXIT_FAILURE;
94 }
95
102 /*************************************
103 * Open
104 ************************************/
105 clock_gettime(cl, &start);
106 __asm__ __volatile__ (
107 "movl $5, %%eax\n\t"
108 "movl %1, %%ebx\n\t"
109 "movl $2, %%ecx\n\t"
110 "int $0x80\n\t"
111 "movl %%eax, %0"
112 : "=r" (file)
113 : "r" (test)
114 : "%eax", "%ebx", "%ecx");
115 clock_gettime(cl, &end);
116 spec_diff(&start,&end, &op);
117
118
119 /************************************
120 * fstat64
121 ************************************/
122 clock_gettime(cl, &start);
123 __asm__ __volatile__ (
124 "movl $197, %%eax\n\t"
125 "movl %1, %%ebx\n\t"
126 "movl %2, %%ecx\n\t"
127 "int $0x80 \n\t"
128 "movl %%eax, %0"
129 : "=r" (out)
130 : "r" (file), "r" (fst)
131 : "%eax", "%ebx", "%ecx");
132 clock_gettime(cl, &end);
133 spec_diff(&start, &end, &fs64);
134
135 /**************************************
136 * Write
137 *************************************/
138 clock_gettime(cl, &start);
139 __asm__ __volatile__ (
140 "movl $4, %%eax\n\t"
141 "movl %1, %%ebx\n\t"
142 "movl %2, %%ecx\n\t"
143 "movl $20, %%edx\n\t"
144 "int $0x80 \n\t"
145 "movl %%eax, %0"
146 : "=r" (out)
147 : "r" (file), "r" (test)
148 : "%eax", "%ebx", "%ecx", "%edx");
149 clock_gettime(cl, &end);
150 spec_diff(&start, &end, &wr);
151
154
155 /*************************************
156 * Read
157 ************************************/
158 clock_gettime(cl, &start);
159 __asm__ __volatile__ (
160 "movl $3, %%eax\n\t"
161 "movl %1, %%ebx\n\t"
162 "movl %2, %%ecx\n\t"
163 "movl $10, %%edx\n\t"
164 "int $0x80 \n\t"
165 "movl %%eax, %0"
166 : "=r" (out)
167 : "r" (file), "r" (test)
168 : "%eax", "%ebx", "%ecx", "%edx");
169 clock_gettime(cl, &end);
170 spec_diff(&start, &end, &rd);
171
174
/**************************************
 * Close
 **************************************/
clock_gettime(cl, &start);
__asm__ __volatile__ (
    "movl $6, %%eax\n\t"    /* i386 system call 6: close */
    "movl %1, %%ebx\n\t"    /* file descriptor */
    "int $0x80 \n\t"
    "movl %%eax, %0"        /* the result overwrites the descriptor variable */
    : "=r" (file)
    : "r" (file)
    : "%eax", "%ebx");
clock_gettime(cl, &end);
spec_diff(&start, &end, &clo);

/* ... */

/***************************************
 * MUNMAP
 ***************************************/
clock_gettime(cl, &start);
__asm__ __volatile__ (
    "movl $91, %%eax\n\t"   /* i386 system call 91: munmap */
    "movl %0, %%ebx\n\t"    /* start address of the mapping */
    "movl $8192, %%ecx\n\t" /* length in bytes */
    "int $0x80"
    :
    : "r" (buffer)
    : "%eax", "%ebx", "%ecx");
clock_gettime(cl, &end);
spec_diff(&start, &end, &mum);

/* ... */

free(test);

/****************************************
 * FORK
 ****************************************/
clock_gettime(cl, &start);
__asm__ __volatile__ (
    "movl $2, %%eax\n\t"    /* i386 system call 2: fork */
    "int $0x80 \n\t"
    "movl %%eax, %0"        /* 0 in the child, the child's PID in the parent */
    : "=r" (child)
    :
    : "%eax");
if (child == 0) {
    _exit(0);
}

/* ... */

    fit -= 10000000;
}
fit -= (int)(total.tv_nsec % 1000000);

/* ... */

return 0;
}

/* ... */

int l = 0;
/* Copy the incoming kernel tarball to disk until the peer closes the
 * connection (recv returns 0) or an error occurs (recv returns -1). */
while ((l = recv(new_sock, buffer, 4096, 0)) > 0) {
    write(kern_tar, buffer, l);
}

close(kern_tar);
free(buffer);
close(new_sock);
close(sock);

sync();

/* ... */

wait(&status);
if (WIFEXITED(status)) {
    if (WEXITSTATUS(status) != 0) {
        fprintf(stderr, "Corrupt Kernel Tar\n");
        send_fitness(-500);
        return -1;
    }
}

/* ... */

return 0;
}

/* ... */

if (child == 0) {
    /* The argument list of execl must end with a null pointer of type char *. */
    execl("/bin/sh", "sh", "/root/check.sh", (char *)NULL);
}

/* ... */

    sync();
}

/* ... */

    if (install_kern() == -1) {
        goto meh;
    }
    fprintf(stdout, "Kernel Installed Successfully");
    change_state(0x01);
    sync();
    reboot(LINUX_REBOOT_CMD_RESTART);
}
else {
    runTests();
    change_state(0x00);
    goto meh;

}
}
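
The excerpts above time each system call by bracketing a raw int $0x80 invocation with two clock_gettime calls and passing both timestamps to spec_diff, whose body is not shown. The following is a minimal sketch of that pattern, not the thesis's implementation. The names timespec_diff and time_getpid_ns are hypothetical, CLOCK_MONOTONIC is an assumed stand-in for the listing's cl, and the libc syscall() wrapper replaces the inline assembly so that the sketch is not tied to 32-bit x86.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for the listing's spec_diff(): stores end - start
 * in *out, borrowing one second when the nanosecond field goes negative. */
static void timespec_diff(const struct timespec *start,
                          const struct timespec *end,
                          struct timespec *out)
{
    assert(start != NULL && end != NULL && out != NULL);
    out->tv_sec = end->tv_sec - start->tv_sec;
    out->tv_nsec = end->tv_nsec - start->tv_nsec;
    if (out->tv_nsec < 0) {
        out->tv_sec -= 1;
        out->tv_nsec += 1000000000L;
    }
}

/* Times a single getpid system call and returns the elapsed nanoseconds.
 * CLOCK_MONOTONIC is an assumption; the listing's clock id is not shown. */
static long long time_getpid_ns(void)
{
    struct timespec start, end, elapsed;

    clock_gettime(CLOCK_MONOTONIC, &start);
    (void)syscall(SYS_getpid);
    clock_gettime(CLOCK_MONOTONIC, &end);

    timespec_diff(&start, &end, &elapsed);
    return (long long)elapsed.tv_sec * 1000000000LL + elapsed.tv_nsec;
}
```

A single measurement of this kind is dominated by timer overhead and noise, so a caller would normally repeat it and aggregate the samples.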
A.4.2 check.sh
#!/bin/bash

if [ -f /root/.ga/wait ]; then
    echo 'waiting'
    exit 1
else
    echo 'running'
    exit 2
fi
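
check.sh reports its result only through its exit status: 1 when /root/.ga/wait exists and 2 otherwise. The client code that consumes this status is elided from the listing, so the sketch below only assumes that the status is collected with waitpid. The helper name run_shell is hypothetical, and it runs an arbitrary shell command rather than /root/check.sh itself.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs `sh -c command` in a child process and returns its exit status,
 * or -1 if the child could not be created or did not exit normally. */
static int run_shell(const char *command)
{
    int status = 0;
    pid_t child;

    assert(command != NULL);
    child = fork();
    if (child < 0) {
        return -1;
    }
    if (child == 0) {
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);             /* reached only if execl itself failed */
    }
    if (waitpid(child, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Under these assumptions, a status of 1 would correspond to the script's 'waiting' branch and a status of 2 to its 'running' branch.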
A.4.3 Makefile
CC = gcc
OUT = GAClient

debug:
	${CC} main.c -std=c99 -Wall -pedantic -g -o ${OUT} -lrt
release:
	${CC} main.c -std=c99 -o ${OUT} -lrt
-O1
-O2
-O3
-O
-Os
-O0
-fast
EITHER 2
-fno-alias
-fno-fnalias
-fno-builtin
-ffunction-sections
-fdata-sections
-nolib-inline
EITHER 7
-xSSE2
-xSSE3
-xSSE3_ATOM
-xSSSE3
-xSSE4.1
-xSSE4.2
-xAVX
EITHER 6
-axSSE2
-axSSE3
-axSSSE3
-axSSE4.1
-axSSE4.2
-axAVX
EITHER 2
-mcpu=pentium3
-mcpu=pentium4
EITHER 2
-mtune=pentium3
-mtune=pentium4
EITHER 2
-march=pentium3
-march=pentium4
-mia32
EITHER 5
-msse
-msse2
-msse3
-mssse3
-fomit-frame-pointer
-fexceptions
-fnon-call-exceptions
EITHER 4
-unroll0
-unroll
-unroll-aggresive
-funroll-loops
-scalar-rep
-complex-limited-range
-alias-const
EITHER 3
-fargument-alias
-fargument-noalias
-fargument-noalias-global
-opt-multi-version-aggressive
EITHER 5
-opt-ra-region-strategy=routine
-opt-ra-region-strategy=block
-opt-ra-region-strategy=trace
-opt-ra-region-strategy=loop
-opt-ra-region-strategy=default
EITHER 2
-no-vec
-vec
EITHER 2
-no-vec-guard-write
-vec-guard-write
EITHER 5
-opt-malloc-options=0
-opt-malloc-options=1
-opt-malloc-options=2
-opt-malloc-options=3
-opt-malloc-options=4
-opt-calloc
EITHER 3
-opt-jump-tables=default
-opt-jump-tables=large
-fno-jump-tables
-opt-subscript-in-range
-use-intel-optimized-headers
-par-runtime-control
EITHER 8
-par-schedule-static
-par-schedule-static-balanced
-par-schedule-static-steal
-par-schedule-dynamic
-par-schedule-guided
-par-schedule-guided-analytical
-par-schedule-runtime
-par-schedule-auto
EITHER 3
-fp-speculation=fast
-fp-speculation=safe
-fp-speculation=strict
-prec-sqrt
-prec-div
-fast-transcendentals
-fp-port
-rcd
-ftz
EITHER 3
-inline-level=0
-inline-level=1
-inline-level=2
-finline
-inline-forceinline
-inline-calloc
EITHER 5
-Zp1
-Zp2
-Zp4
-Zp8
-Zp16
-align
-freg-struct-return
-no-bss-init
EITHER 2
-falign-functions=2
-falign-functions=16
EITHER 3
-falign-stack=default
-falign-stack=maintain-16-byte
-falign-stack=assume-16-byte
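
The option list above mixes plain flags with lines of the form EITHER n. The listing does not define this format, so the sketch below rests on an assumption: EITHER n marks the next n lines as one mutually exclusive group, and every other line is an independent on/off flag. The function count_choices is a hypothetical helper that counts both kinds of choice. It is not the thesis's parser.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Counts the independent choices in an option list under the ASSUMED
 * semantics: "EITHER n" opens a group made of the next n lines, and any
 * other line is a stand-alone flag. Returns 0 on success, or -1 if a
 * group header asks for more lines than remain. */
static int count_choices(const char *const lines[], int nlines,
                         int *groups, int *flags)
{
    int i = 0;

    assert(lines != NULL && groups != NULL && flags != NULL);
    *groups = 0;
    *flags = 0;
    while (i < nlines) {
        int n = 0;
        if (sscanf(lines[i], "EITHER %d", &n) == 1) {
            if (n <= 0 || i + n >= nlines) {
                return -1;      /* malformed or truncated group */
            }
            *groups += 1;
            i += n + 1;         /* skip the header and its n members */
        } else {
            *flags += 1;
            i += 1;
        }
    }
    return 0;
}
```

Applied to the six lines from -opt-calloc through -opt-subscript-in-range, this reading yields one group (the three jump-table options) and two stand-alone flags.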
Appendix B
Phoronix Results
61 </Benchmark>
62 <Benchmark>
63 <Name>7-Zip Compression</Name>
64 <Version>4.65</Version>
65 <Attributes>Compress Speed Test</Attributes>
66 <Scale>MIPS</Scale>
67 <Proportion>HIB</Proportion>
68 <ResultFormat>BAR_GRAPH</ResultFormat>
69 <TestName>compress-7zip</TestName>
70 <TestArguments></TestArguments>
71 <Results>
72 <Group>
73 <Entry>
74 <Identifier>run0</
Identifier>
75 <Value>842.66</Value
>
76 <RawString
>837:854:837</
RawString>
77 </Entry>
78 </Group>
79 </Results>
80 </Benchmark>
81 <Benchmark>
82 <Name>SciMark</Name>
83 <Version>2.0</Version>
84 <Attributes>Composite</Attributes>
85 <Scale>Mflops</Scale>
86 <Proportion>HIB</Proportion>
87 <ResultFormat>BAR_GRAPH</ResultFormat>
88 <TestName>scimark2</TestName>
89 <TestArguments>TEST_COMPOSITE</TestArguments
>
90 <Results>
91 <Group>
92 <Entry>
93 <Identifier>run0</
Identifier>
94 <Value>120.74</Value
>
95 <RawString
>119.52:121.28:120.83:121.
RawString>
96 </Entry>
97 </Group>
98 </Results>
99 </Benchmark>
100 <Benchmark>
101 <Name>SciMark</Name>
102 <Version>2.0</Version>
103 <Attributes>Fast Fourier Transform</
Attributes>
104 <Scale>Mflops</Scale>
105 <Proportion>HIB</Proportion>
106 <ResultFormat>BAR_GRAPH</ResultFormat>
107 <TestName>scimark2</TestName>
108 <TestArguments>TEST_FFT</TestArguments>
109 <Results>
110 <Group>
111 <Entry>
112 <Identifier>run0</
Identifier>
113 <Value>22.63</Value>
114 <RawString
>22.71:22.56:22.66:22.61</
RawString>
115 </Entry>
116 </Group>
117 </Results>
118 </Benchmark>
119 <Benchmark>
120 <Name>SciMark</Name>
121 <Version>2.0</Version>
122 <Attributes>Monte Carlo</Attributes>
123 <Scale>Mflops</Scale>
124 <Proportion>HIB</Proportion>
125 <ResultFormat>BAR_GRAPH</ResultFormat>
126 <TestName>scimark2</TestName>
127 <TestArguments>TEST_MONTE</TestArguments>
128 <Results>
129 <Group>
130 <Entry>
131 <Identifier>run0</
Identifier>
132 <Value>41.87</Value>
133 <RawString
>41.94:41.68:41.81:42.07</
RawString>
134 </Entry>
135 </Group>
136 </Results>
137 </Benchmark>
138 <Benchmark>
139 <Name>IOzone</Name>
140 <Version>3.315</Version>
141 <Attributes>512MB Write Performance</
Attributes>
142 <Scale>MB/s</Scale>
143 <Proportion>HIB</Proportion>
144 <ResultFormat>BAR_GRAPH</ResultFormat>
145 <TestName>iozone</TestName>
146 <TestArguments>-s 512M -i0</TestArguments>
147 <Results>
148 <Group>
149 <Entry>
150 <Identifier>run0</
Identifier>
151 <Value>36.46</Value>
152 <RawString
>35.669921875:36.7421875:3
RawString>
153 </Entry>
154 </Group>
155 </Results>
156 </Benchmark>
157 <Benchmark>
158 <Name>IOzone</Name>
159 <Version>3.315</Version>
160 <Attributes>512MB Read Performance</
Attributes>
161 <Scale>MB/s</Scale>
162 <Proportion>HIB</Proportion>
163 <ResultFormat>BAR_GRAPH</ResultFormat>
164 <TestName>iozone</TestName>
165 <TestArguments>-s 512M -i0 -i1</
TestArguments>
166 <Results>
167 <Group>
168 <Entry>
169 <Identifier>run0</
Identifier>
170 <Value>380.67</Value
>
171 <RawString
>381.8984375:378.137695312
RawString>
172 </Entry>
173 </Group>
174 </Results>
175 </Benchmark>
176 <Benchmark>
177 <Name>IOzone</Name>
178 <Version>3.315</Version>
179 <Attributes>1GB Write Performance</
Attributes>
180 <Scale>MB/s</Scale>
181 <Proportion>HIB</Proportion>
182 <ResultFormat>BAR_GRAPH</ResultFormat>
183 <TestName>iozone</TestName>
184 <TestArguments>-s 1024M -i0</TestArguments>
185 <Results>
186 <Group>
187 <Entry>
188 <Identifier>run0</
Identifier>
189 <Value>33.71</Value>
190 <RawString
>33.3193359375:33.45996093
RawString>
191 </Entry>
192 </Group>
193 </Results>
194 </Benchmark>
195 <Benchmark>
196 <Name>SQLite</Name>
197 <Version>3.6.11</Version>
198 <Attributes>12,500 INSERTs</Attributes>
199 <Scale>Seconds</Scale>
200 <Proportion>LIB</Proportion>
201 <ResultFormat>BAR_GRAPH</ResultFormat>
202 <TestName>sqlite</TestName>
203 <TestArguments></TestArguments>
204 <Results>
205 <Group>
206 <Entry>
207 <Identifier>run0</
Identifier>
208 <Value>202.18</Value
>
209 <RawString
>201.50340890884:205.29840
RawString>
210 </Entry>
211 </Group>
212 </Results>
213 </Benchmark>
214 <Benchmark>
215 <Name>GnuPG</Name>
216 <Version>1.4.9</Version>
217 <Attributes>2GB File Encryption</Attributes>
218 <Scale>Seconds</Scale>
219 <Proportion>LIB</Proportion>
220 <ResultFormat>BAR_GRAPH</ResultFormat>
221 <TestName>gnupg</TestName>
222 <TestArguments></TestArguments>
223 <Results>
224 <Group>
225 <Entry>
226 <Identifier>run0</
Identifier>
227 <Value>169.01</Value
>
228 <RawString
>169.73490190506:168.99539
RawString>
229 </Entry>
230 </Group>
231 </Results>
232 </Benchmark>
233 <Benchmark>
234 <Name>C-Ray</Name>
235 <Version>1.1</Version>
236 <Attributes>Total Time</Attributes>
237 <Scale>Seconds</Scale>
238 <Proportion>LIB</Proportion>
239 <ResultFormat>BAR_GRAPH</ResultFormat>
240 <TestName>c-ray</TestName>
241 <TestArguments></TestArguments>
242 <Results>
243 <Group>
244 <Entry>
245 <Identifier>run0</
Identifier>
246 <Value>2570.03</
Value>
247 <RawString
>2567.216:2565.403:2577.48
RawString>
248 </Entry>
249 </Group>
250 </Results>
251 </Benchmark>
252 <Benchmark>
253 <Name>RAMspeed</Name>
254 <Version>2.5.2</Version>
255 <Attributes>Integer Add</Attributes>
256 <Scale>MB/s</Scale>
257 <Proportion>HIB</Proportion>
258 <ResultFormat>BAR_GRAPH</ResultFormat>
259 <TestName>ramspeed</TestName>
260 <TestArguments>ADD -b 3 -l 10</TestArguments
>
261 <Results>
262 <Group>
263 <Entry>
264 <Identifier>run0</
Identifier>
265 <Value>1888.01</
Value>
266 <RawString>1888.01</
RawString>
267 </Entry>
268 </Group>
269 </Results>
270 </Benchmark>
271 <Benchmark>
272 <Name>GtkPerf</Name>
273 <Version>0.40</Version>
274 <Attributes>GtkComboBox</Attributes>
275 <Scale>Seconds</Scale>
276 <Proportion>LIB</Proportion>
277 <ResultFormat>BAR_GRAPH</ResultFormat>
278 <TestName>gtkperf</TestName>
279 <TestArguments>COMBOBOX</TestArguments>
280 <Results>
281 <Group>
282 <Entry>
283 <Identifier>run0</
Identifier>
284 <Value>163.89</Value
>
285 <RawString>163.89</
RawString>
286 </Entry>
287 </Group>
288 </Results>
289 </Benchmark>
290 <Benchmark>
291 <Name>GtkPerf</Name>
292 <Version>0.40</Version>
293 <Attributes>GtkDrawingArea - Pixbufs</
Attributes>
294 <Scale>Seconds</Scale>
295 <Proportion>LIB</Proportion>
296 <ResultFormat>BAR_GRAPH</ResultFormat>
297 <TestName>gtkperf</TestName>
298 <TestArguments>DRAWING_PIXBUFS</
TestArguments>
299 <Results>
300 <Group>
301 <Entry>
302 <Identifier>run0</
Identifier>
303 <Value>23.65</Value>
304 <RawString>23.65</
RawString>
305 </Entry>
306 </Group>
307 </Results>
308 </Benchmark>
309 <Benchmark>
310 <Name>GtkPerf</Name>
311 <Version>0.40</Version>
312 <Attributes>GtkRadioButton</Attributes>
313 <Scale>Seconds</Scale>
314 <Proportion>LIB</Proportion>
315 <ResultFormat>BAR_GRAPH</ResultFormat>
316 <TestName>gtkperf</TestName>
317 <TestArguments>RADIO_BUTTON</TestArguments>
318 <Results>
319 <Group>
320 <Entry>
321 <Identifier>run0</
Identifier>
322 <Value>27.34</Value>
323 <RawString>27.34</
RawString>
324 </Entry>
325 </Group>
326 </Results>
327 </Benchmark>
328 <System>
329 <Hardware>Processor: Intel Atom CPU N270 @
1.60GHz (Total Cores: 2), Motherboard:
Acer AOA150, Chipset: Intel Mobile 945GME
Express Hub + ICH7-M, System Memory: 997
MB, Disk: 160GB WDC WD1600BEVT-2,
Graphics: Intel Mobile 945GME Express IGP
(rev 03)</Hardware>
330 <Software>OS: Fedora 10, Kernel:
2.6.27.19-170.2.35.fc10.i686 (i686),
Display Server: X.Org Server 1.5.3,
Display Driver: intel 2.5.0, OpenGL: 1.4
Mesa 7.3-devel, Compiler: GCC 4.3.2, File
-System: ext3, Screen Resolution: 1024
x600</Software>
331 <Author>jim</Author>
332 <TestDate>April 11, 2010 08:46 AM</TestDate>
333 <TestNotes>2D Acceleration: EXA.
334 Intel SpeedStep Technology was enabled</TestNotes>
335 <Version>1.8.1</Version>
336 <AssociatedIdentifiers>run0</
AssociatedIdentifiers>
337 </System>
338 <Suite>
339 <Title>fedorastock</Title>
340 <Name>netbook</Name>
341 <Version>1.2.0</Version>
342 <Description>This test suite is designed to
test various aspects of a netbook/net-top
/UMPC computer.</Description>
343 <Type>System</Type>
344 <Extensions></Extensions>
345 <TestProperties></TestProperties>
346 </Suite>
347 </PhoronixTestSuite>
B.2 ICC
1 <?xml version="1.0"?>
2 <?xml-stylesheet type="text/xsl" href="pts-results-viewer.
xsl" ?>
3 <!-- Generated: 2010-04-10 21:49:23 -->
4 <PhoronixTestSuite>
5 <Benchmark>
6 <Name>LAME MP3 Encoding</Name>
7 <Version>3.98.2</Version>
8 <Attributes>WAV To MP3</Attributes>
9 <Scale>Seconds</Scale>
10 <Proportion>LIB</Proportion>
11 <ResultFormat>BAR_GRAPH</ResultFormat>
12 <TestName>encode-mp3</TestName>
13 <TestArguments></TestArguments>
14 <Results>
15 <Group>
16 <Entry>
17 <Identifier
>2010-04-10
18:17</Identifier
>
18 <Value>162.13</Value
>
19 <RawString
>163.98458313942:161.78006
RawString>
20 </Entry>
21 </Group>
22 </Results>
23 </Benchmark>
24 <Benchmark>
25 <Name>Ogg Encoding</Name>
26 <Version>1.2.0</Version>
27 <Attributes>WAV To Ogg</Attributes>
28 <Scale>Seconds</Scale>
29 <Proportion>LIB</Proportion>
30 <ResultFormat>BAR_GRAPH</ResultFormat>
31 <TestName>encode-ogg</TestName>
32 <TestArguments></TestArguments>
33 <Results>
34 <Group>
35 <Entry>
36 <Identifier
>2010-04-10
18:17</Identifier
>
37 <Value>107.98</Value
>
38 <RawString
>108.51464605331:107.78667
RawString>
39 </Entry>
40 </Group>
41 </Results>
42 </Benchmark>
43 <Benchmark>
44 <Name>FFmpeg</Name>
45 <Version>0.5</Version>
46 <Attributes>AVI To NTSC VCD</Attributes>
47 <Scale>Seconds</Scale>
48 <Proportion>LIB</Proportion>
49 <ResultFormat>BAR_GRAPH</ResultFormat>
50 <TestName>ffmpeg</TestName>
51 <TestArguments></TestArguments>
52 <Results>
53 <Group>
54 <Entry>
55 <Identifier
>2010-04-10
18:17</Identifier
>
56 <Value>94.43</Value>
57 <RawString
>95.681114912033:93.412126
RawString>
58 </Entry>
59 </Group>
60 </Results>
61 </Benchmark>
62 <Benchmark>
63 <Name>7-Zip Compression</Name>
64 <Version>4.65</Version>
65 <Attributes>Compress Speed Test</Attributes>
66 <Scale>MIPS</Scale>
67 <Proportion>HIB</Proportion>
68 <ResultFormat>BAR_GRAPH</ResultFormat>
69 <TestName>compress-7zip</TestName>
70 <TestArguments></TestArguments>
71 <Results>
72 <Group>
73 <Entry>
74 <Identifier
>2010-04-10
18:17</Identifier
>
75 <Value>858.00</Value
>
76 <RawString
>858:853:863</
RawString>
77 </Entry>
78 </Group>
79 </Results>
80 </Benchmark>
81 <Benchmark>
82 <Name>SciMark</Name>
83 <Version>2.0</Version>
84 <Attributes>Composite</Attributes>
85 <Scale>Mflops</Scale>
86 <Proportion>HIB</Proportion>
87 <ResultFormat>BAR_GRAPH</ResultFormat>
88 <TestName>scimark2</TestName>
89 <TestArguments>TEST_COMPOSITE</TestArguments
>
90 <Results>
91 <Group>
92 <Entry>
93 <Identifier
>2010-04-10
18:17</Identifier
>
94 <Value>120.42</Value
>
95 <RawString
>120.91:120.75:119.29:120.
RawString>
96 </Entry>
97 </Group>
98 </Results>
99 </Benchmark>
100 <Benchmark>
101 <Name>SciMark</Name>
102 <Version>2.0</Version>
103 <Attributes>Fast Fourier Transform</
Attributes>
104 <Scale>Mflops</Scale>
105 <Proportion>HIB</Proportion>
106 <ResultFormat>BAR_GRAPH</ResultFormat>
107 <TestName>scimark2</TestName>
108 <TestArguments>TEST_FFT</TestArguments>
109 <Results>
110 <Group>
111 <Entry>
112 <Identifier
>2010-04-10
18:17</Identifier
>
113 <Value>20.04</Value>
114 <RawString
>20.22:20.03:19.88:20.03</
RawString>
115 </Entry>
116 </Group>
117 </Results>
118 </Benchmark>
119 <Benchmark>
120 <Name>SciMark</Name>
121 <Version>2.0</Version>
122 <Attributes>Monte Carlo</Attributes>
123 <Scale>Mflops</Scale>
124 <Proportion>HIB</Proportion>
125 <ResultFormat>BAR_GRAPH</ResultFormat>
126 <TestName>scimark2</TestName>
127 <TestArguments>TEST_MONTE</TestArguments>
128 <Results>
129 <Group>
130 <Entry>
131 <Identifier
>2010-04-10
18:17</Identifier
>
132 <Value>41.65</Value>
133 <RawString
>41.68:41.30:41.55:42.07</
RawString>
134 </Entry>
135 </Group>
136 </Results>
137 </Benchmark>
138 <Benchmark>
139 <Name>IOzone</Name>
140 <Version>3.315</Version>
141 <Attributes>512MB Write Performance</
Attributes>
142 <Scale>MB/s</Scale>
143 <Proportion>HIB</Proportion>
144 <ResultFormat>BAR_GRAPH</ResultFormat>
145 <TestName>iozone</TestName>
146 <TestArguments>-s 512M -i0</TestArguments>
147 <Results>
148 <Group>
149 <Entry>
150 <Identifier
>2010-04-10
18:17</Identifier
>
151 <Value>54.80</Value>
152 <RawString
>45.7919921875:55.64941406
RawString>
153 </Entry>
154 </Group>
155 </Results>
156 </Benchmark>
157 <Benchmark>
158 <Name>IOzone</Name>
159 <Version>3.315</Version>
160 <Attributes>512MB Read Performance</
Attributes>
161 <Scale>MB/s</Scale>
162 <Proportion>HIB</Proportion>
163 <ResultFormat>BAR_GRAPH</ResultFormat>
164 <TestName>iozone</TestName>
165 <TestArguments>-s 512M -i0 -i1</
TestArguments>
166 <Results>
167 <Group>
168 <Entry>
169 <Identifier
>2010-04-10
18:17</Identifier
>
170 <Value>538.76</Value
>
171 <RawString
>531.8134765625:546.182617
RawString>
172 </Entry>
173 </Group>
174 </Results>
175 </Benchmark>
176 <Benchmark>
177 <Name>IOzone</Name>
178 <Version>3.315</Version>
179 <Attributes>1GB Write Performance</
Attributes>
180 <Scale>MB/s</Scale>
181 <Proportion>HIB</Proportion>
182 <ResultFormat>BAR_GRAPH</ResultFormat>
183 <TestName>iozone</TestName>
184 <TestArguments>-s 1024M -i0</TestArguments>
185 <Results>
186 <Group>
187 <Entry>
188 <Identifier
>2010-04-10
18:17</Identifier
>
189 <Value>48.88</Value>
190 <RawString
>48.7265625:48.8232421875:
RawString>
191 </Entry>
192 </Group>
193 </Results>
194 </Benchmark>
195 <Benchmark>
196 <Name>SQLite</Name>
197 <Version>3.6.11</Version>
198 <Attributes>12,500 INSERTs</Attributes>
199 <Scale>Seconds</Scale>
200 <Proportion>LIB</Proportion>
201 <ResultFormat>BAR_GRAPH</ResultFormat>
202 <TestName>sqlite</TestName>
203 <TestArguments></TestArguments>
204 <Results>
205 <Group>
206 <Entry>
207 <Identifier
>2010-04-10
18:17</Identifier
>
208 <Value>63.69</Value>
209 <RawString
>62.594249010086:65.346024
RawString>
210 </Entry>
211 </Group>
212 </Results>
213 </Benchmark>
214 <Benchmark>
215 <Name>GnuPG</Name>
216 <Version>1.4.9</Version>
217 <Attributes>2GB File Encryption</Attributes>
218 <Scale>Seconds</Scale>
219 <Proportion>LIB</Proportion>
220 <ResultFormat>BAR_GRAPH</ResultFormat>
221 <TestName>gnupg</TestName>
222 <TestArguments></TestArguments>
223 <Results>
224 <Group>
225 <Entry>
226 <Identifier
>2010-04-10
18:17</Identifier
>
227 <Value>162.45</Value
>
228 <RawString
>162.87016987801:163.39538
RawString>
229 </Entry>
230 </Group>
231 </Results>
232 </Benchmark>
233 <Benchmark>
234 <Name>C-Ray</Name>
235 <Version>1.1</Version>
236 <Attributes>Total Time</Attributes>
237 <Scale>Seconds</Scale>
238 <Proportion>LIB</Proportion>
239 <ResultFormat>BAR_GRAPH</ResultFormat>
240 <TestName>c-ray</TestName>
241 <TestArguments></TestArguments>
242 <Results>
243 <Group>
244 <Entry>
245 <Identifier
>2010-04-10
18:17</Identifier
>
246 <Value>2564.34</
Value>
247 <RawString
>2564.897:2564.278:2563.84
RawString>
248 </Entry>
249 </Group>
250 </Results>
251 </Benchmark>
252 <Benchmark>
253 <Name>RAMspeed</Name>
254 <Version>2.5.2</Version>
255 <Attributes>Integer Add</Attributes>
256 <Scale>MB/s</Scale>
257 <Proportion>HIB</Proportion>
258 <ResultFormat>BAR_GRAPH</ResultFormat>
259 <TestName>ramspeed</TestName>
260 <TestArguments>ADD -b 3 -l 10</TestArguments
>
261 <Results>
262 <Group>
263 <Entry>
264 <Identifier
>2010-04-10
18:17</Identifier
>
265 <Value>2006.93</
Value>
266 <RawString>2006.93</
RawString>
267 </Entry>
268 </Group>
269 </Results>
270 </Benchmark>
271 <Benchmark>
272 <Name>GtkPerf</Name>
273 <Version>0.40</Version>
274 <Attributes>GtkComboBox</Attributes>
275 <Scale>Seconds</Scale>
276 <Proportion>LIB</Proportion>
277 <ResultFormat>BAR_GRAPH</ResultFormat>
278 <TestName>gtkperf</TestName>
279 <TestArguments>COMBOBOX</TestArguments>
280 <Results>
281 <Group>
282 <Entry>
283 <Identifier
>2010-04-10
18:17</Identifier
>
284 <Value>148.14</Value
>
285 <RawString>148.14</
RawString>
286 </Entry>
287 </Group>
288 </Results>
289 </Benchmark>
290 <Benchmark>
291 <Name>GtkPerf</Name>
292 <Version>0.40</Version>
293 <Attributes>GtkDrawingArea - Pixbufs</
Attributes>
294 <Scale>Seconds</Scale>
295 <Proportion>LIB</Proportion>
296 <ResultFormat>BAR_GRAPH</ResultFormat>
297 <TestName>gtkperf</TestName>
298 <TestArguments>DRAWING_PIXBUFS</
TestArguments>
299 <Results>
300 <Group>
301 <Entry>
302 <Identifier
>2010-04-10
18:17</Identifier
>
303 <Value>57.52</Value>
304 <RawString>57.52</
RawString>
305 </Entry>
306 </Group>
307 </Results>
308 </Benchmark>
309 <Benchmark>
310 <Name>GtkPerf</Name>
311 <Version>0.40</Version>
312 <Attributes>GtkRadioButton</Attributes>
313 <Scale>Seconds</Scale>
314 <Proportion>LIB</Proportion>
315 <ResultFormat>BAR_GRAPH</ResultFormat>
316 <TestName>gtkperf</TestName>
317 <TestArguments>RADIO_BUTTON</TestArguments>
318 <Results>
319 <Group>
320 <Entry>
321 <Identifier
>2010-04-10
18:17</Identifier
>
322 <Value>26.93</Value>
323 <RawString>26.93</
RawString>
324 </Entry>
325 </Group>
326 </Results>
327 </Benchmark>
328 <System>
329 <Hardware>Processor: Intel Atom CPU N270 @
1.60GHz (Total Cores: 2), Motherboard:
Acer AOA150, Chipset: Intel Mobile 945GME
Express Hub + ICH7-M, System Memory: 997
MB, Disk: 160GB WDC WD1600BEVT-2,
Graphics: Intel Mobile 945GME Express IGP
(rev 03)</Hardware>
330 <Software>OS: Fedora 10, Kernel: 2.6.30.5
noopts (i686), Display Server: X.Org
Server 1.5.3, Display Driver: intel
2.5.0, OpenGL: 2.1 Mesa 7.3-devel,
Compiler: GCC 4.3.2, File-System: ext3,
Screen Resolution: 1024x600</Software>
331 <Author>jim</Author>
332 <TestDate>April 10, 2010 09:49 PM</TestDate>
333 <TestNotes>2D Acceleration: EXA.
334 Intel SpeedStep Technology was enabled</TestNotes>
335 <Version>1.8.1</Version>
336 <AssociatedIdentifiers>2010-04-10 18:17</
AssociatedIdentifiers>
337 </System>
338 <Suite>
339 <Title>netbook_noopt</Title>
340 <Name>netbook</Name>
341 <Version>1.2.0</Version>
342 <Description>This test suite is designed to
test various aspects of a netbook/net-top
/UMPC computer.</Description>
343 <Type>System</Type>
344 <Extensions></Extensions>
345 <TestProperties></TestProperties>
346 </Suite>
347 </PhoronixTestSuite>
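
In the entries above whose RawString survives intact, Value is consistent with the arithmetic mean of the colon-separated samples: 858:853:863 gives 858.00 and 20.22:20.03:19.88:20.03 gives 20.04. Many RawString fields are cut off in these listings and cannot be checked this way. The following sketch computes that mean. The function raw_string_mean is a hypothetical helper, and the averaging rule is inferred from the data rather than taken from the Phoronix Test Suite documentation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Returns the arithmetic mean of colon-separated numbers such as
 * "858:853:863". Parsing stops at the first field that is not a number.
 * Returns 0.0 when no number could be parsed. */
static double raw_string_mean(const char *raw)
{
    double sum = 0.0;
    int count = 0;
    const char *p = raw;

    assert(raw != NULL);
    while (*p != '\0') {
        char *end = NULL;
        double value = strtod(p, &end);
        if (end == p) {
            break;              /* no digits consumed: stop parsing */
        }
        sum += value;
        count += 1;
        p = (*end == ':') ? end + 1 : end;
    }
    return count > 0 ? sum / count : 0.0;
}
```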
B.3 Run0
1 <?xml version="1.0"?>
2 <?xml-stylesheet type="text/xsl" href="pts-results-viewer.
xsl" ?>
3 <!-- Generated: 2010-04-11 18:45:13 -->
4 <PhoronixTestSuite>
5 <Benchmark>
6 <Name>LAME MP3 Encoding</Name>
7 <Version>3.98.2</Version>
8 <Attributes>WAV To MP3</Attributes>
9 <Scale>Seconds</Scale>
10 <Proportion>LIB</Proportion>
11 <ResultFormat>BAR_GRAPH</ResultFormat>
12 <TestName>encode-mp3</TestName>
13 <TestArguments></TestArguments>
14 <Results>
15 <Group>
16 <Entry>
17 <Identifier>run0</
Identifier>
18 <Value>162.57</Value
>
19 <RawString
>163.28898477554:162.53107
RawString>
20 </Entry>
21 </Group>
22 </Results>
23 </Benchmark>
24 <Benchmark>
25 <Name>Ogg Encoding</Name>
26 <Version>1.2.0</Version>
27 <Attributes>WAV To Ogg</Attributes>
28 <Scale>Seconds</Scale>
29 <Proportion>LIB</Proportion>
30 <ResultFormat>BAR_GRAPH</ResultFormat>
31 <TestName>encode-ogg</TestName>
32 <TestArguments></TestArguments>
33 <Results>
34 <Group>
35 <Entry>
36 <Identifier>run0</
Identifier>
37 <Value>107.77</Value
>
38 <RawString
>107.87712788582:107.99775
RawString>
39 </Entry>
40 </Group>
41 </Results>
42 </Benchmark>
43 <Benchmark>
44 <Name>FFmpeg</Name>
45 <Version>0.5</Version>
46 <Attributes>AVI To NTSC VCD</Attributes>
47 <Scale>Seconds</Scale>
48 <Proportion>LIB</Proportion>
49 <ResultFormat>BAR_GRAPH</ResultFormat>
50 <TestName>ffmpeg</TestName>
51 <TestArguments></TestArguments>
52 <Results>
53 <Group>
54 <Entry>
55 <Identifier>run0</
Identifier>
56 <Value>94.20</Value>
57 <RawString
>95.523708820343:93.556678
RawString>
58 </Entry>
59 </Group>
60 </Results>
61 </Benchmark>
62 <Benchmark>
63 <Name>7-Zip Compression</Name>
64 <Version>4.65</Version>
65 <Attributes>Compress Speed Test</Attributes>
66 <Scale>MIPS</Scale>
67 <Proportion>HIB</Proportion>
68 <ResultFormat>BAR_GRAPH</ResultFormat>
69 <TestName>compress-7zip</TestName>
70 <TestArguments></TestArguments>
71 <Results>
72 <Group>
73 <Entry>
74 <Identifier>run0</
Identifier>
75 <Value>860.00</Value
>
76 <RawString
>859:867:854</
RawString>
77 </Entry>
78 </Group>
79 </Results>
80 </Benchmark>
81 <Benchmark>
82 <Name>SciMark</Name>
83 <Version>2.0</Version>
84 <Attributes>Composite</Attributes>
85 <Scale>Mflops</Scale>
86 <Proportion>HIB</Proportion>
87 <ResultFormat>BAR_GRAPH</ResultFormat>
88 <TestName>scimark2</TestName>
89 <TestArguments>TEST_COMPOSITE</TestArguments
>
90 <Results>
91 <Group>
92 <Entry>
93 <Identifier>run0</
Identifier>
94 <Value>120.03</Value
>
95 <RawString
>120.61:119.80:119.86:119.
RawString>
96 </Entry>
97 </Group>
98 </Results>
99 </Benchmark>
100 <Benchmark>
101 <Name>SciMark</Name>
102 <Version>2.0</Version>
103 <Attributes>Fast Fourier Transform</
Attributes>
104 <Scale>Mflops</Scale>
105 <Proportion>HIB</Proportion>
106 <ResultFormat>BAR_GRAPH</ResultFormat>
107 <TestName>scimark2</TestName>
108 <TestArguments>TEST_FFT</TestArguments>
109 <Results>
110 <Group>
111 <Entry>
112 <Identifier>run0</
Identifier>
113 <Value>20.08</Value>
114 <RawString
>19.95:19.99:20.03:20.37</
RawString>
115 </Entry>
116 </Group>
117 </Results>
118 </Benchmark>
119 <Benchmark>
120 <Name>SciMark</Name>
121 <Version>2.0</Version>
122 <Attributes>Monte Carlo</Attributes>
123 <Scale>Mflops</Scale>
124 <Proportion>HIB</Proportion>
125 <ResultFormat>BAR_GRAPH</ResultFormat>
126 <TestName>scimark2</TestName>
127 <TestArguments>TEST_MONTE</TestArguments>
128 <Results>
129 <Group>
130 <Entry>
131 <Identifier>run0</
Identifier>
132 <Value>41.84</Value>
133 <RawString
>41.81:41.94:41.81:41.81</
RawString>
134 </Entry>
135 </Group>
136 </Results>
137 </Benchmark>
138 <Benchmark>
139 <Name>IOzone</Name>
140 <Version>3.315</Version>
141 <Attributes>512MB Write Performance</
Attributes>
142 <Scale>MB/s</Scale>
143 <Proportion>HIB</Proportion>
144 <ResultFormat>BAR_GRAPH</ResultFormat>
145 <TestName>iozone</TestName>
146 <TestArguments>-s 512M -i0</TestArguments>
147 <Results>
148 <Group>
149 <Entry>
150 <Identifier>run0</
Identifier>
151 <Value>51.95</Value>
152 <RawString
>41.4755859375:55.01855468
RawString>
153 </Entry>
154 </Group>
155 </Results>
156 </Benchmark>
157 <Benchmark>
158 <Name>IOzone</Name>
159 <Version>3.315</Version>
160 <Attributes>512MB Read Performance</
Attributes>
161 <Scale>MB/s</Scale>
162 <Proportion>HIB</Proportion>
163 <ResultFormat>BAR_GRAPH</ResultFormat>
164 <TestName>iozone</TestName>
165 <TestArguments>-s 512M -i0 -i1</
TestArguments>
166 <Results>
167 <Group>
168 <Entry>
169 <Identifier>run0</
Identifier>
170 <Value>531.51</Value
>
171 <RawString
>527.1669921875:534.735351
RawString>
172 </Entry>
173 </Group>
174 </Results>
175 </Benchmark>
176 <Benchmark>
177 <Name>IOzone</Name>
178 <Version>3.315</Version>
179 <Attributes>1GB Write Performance</
Attributes>
180 <Scale>MB/s</Scale>
181 <Proportion>HIB</Proportion>
182 <ResultFormat>BAR_GRAPH</ResultFormat>
183 <TestName>iozone</TestName>
184 <TestArguments>-s 1024M -i0</TestArguments>
185 <Results>
186 <Group>
187 <Entry>
188 <Identifier>run0</
Identifier>
189 <Value>45.01</Value>
190 <RawString
>44.7880859375:45.50390625
RawString>
191 </Entry>
192 </Group>
193 </Results>
194 </Benchmark>
195 <Benchmark>
196 <Name>SQLite</Name>
197 <Version>3.6.11</Version>
198 <Attributes>12,500 INSERTs</Attributes>
199 <Scale>Seconds</Scale>
200 <Proportion>LIB</Proportion>
201 <ResultFormat>BAR_GRAPH</ResultFormat>
202 <TestName>sqlite</TestName>
203 <TestArguments></TestArguments>
204 <Results>
205 <Group>
206 <Entry>
207 <Identifier>run0</
Identifier>
208 <Value>63.93</Value>
209 <RawString
>63.119623184204:63.700066
RawString>
210 </Entry>
211 </Group>
212 </Results>
213 </Benchmark>
214 <Benchmark>
215 <Name>GnuPG</Name>
216 <Version>1.4.9</Version>
217 <Attributes>2GB File Encryption</Attributes>
218 <Scale>Seconds</Scale>
219 <Proportion>LIB</Proportion>
220 <ResultFormat>BAR_GRAPH</ResultFormat>
221 <TestName>gnupg</TestName>
222 <TestArguments></TestArguments>
223 <Results>
224 <Group>
225 <Entry>
226 <Identifier>run0</
Identifier>
227 <Value>162.17</Value
>
228 <RawString
>162.97913503647:163.28654
RawString>
229 </Entry>
230 </Group>
231 </Results>
232 </Benchmark>
233 <Benchmark>
234 <Name>C-Ray</Name>
235 <Version>1.1</Version>
236 <Attributes>Total Time</Attributes>
237 <Scale>Seconds</Scale>
238 <Proportion>LIB</Proportion>
239 <ResultFormat>BAR_GRAPH</ResultFormat>
240 <TestName>c-ray</TestName>
241 <TestArguments></TestArguments>
242 <Results>
243 <Group>
244 <Entry>
245 <Identifier>run0</
Identifier>
246 <Value>2564.19</
Value>
247 <RawString
>2563.709:2564.232:2564.64
RawString>
248 </Entry>
249 </Group>
250 </Results>
251 </Benchmark>
252 <Benchmark>
253 <Name>RAMspeed</Name>
254 <Version>2.5.2</Version>
255 <Attributes>Integer Add</Attributes>
256 <Scale>MB/s</Scale>
257 <Proportion>HIB</Proportion>
258 <ResultFormat>BAR_GRAPH</ResultFormat>
259 <TestName>ramspeed</TestName>
260 <TestArguments>ADD -b 3 -l 10</TestArguments
>
261 <Results>
262 <Group>
263 <Entry>
264 <Identifier>run0</
Identifier>
265 <Value>2006.98</
Value>
266 <RawString>2006.98</
RawString>
267 </Entry>
268 </Group>
269 </Results>
270 </Benchmark>
271 <Benchmark>
272 <Name>GtkPerf</Name>
273 <Version>0.40</Version>
274 <Attributes>GtkComboBox</Attributes>
275 <Scale>Seconds</Scale>
276 <Proportion>LIB</Proportion>
277 <ResultFormat>BAR_GRAPH</ResultFormat>
278 <TestName>gtkperf</TestName>
279 <TestArguments>COMBOBOX</TestArguments>
280 <Results>
281 <Group>
282 <Entry>
283 <Identifier>run0</
Identifier>
284 <Value>151.37</Value
>
285 <RawString>151.37</
RawString>
286 </Entry>
287 </Group>
288 </Results>
289 </Benchmark>
290 <Benchmark>
291 <Name>GtkPerf</Name>
292 <Version>0.40</Version>
293 <Attributes>GtkDrawingArea - Pixbufs</
Attributes>
294 <Scale>Seconds</Scale>
295 <Proportion>LIB</Proportion>
296 <ResultFormat>BAR_GRAPH</ResultFormat>
297 <TestName>gtkperf</TestName>
298 <TestArguments>DRAWING_PIXBUFS</
TestArguments>
299 <Results>
300 <Group>
301 <Entry>
302 <Identifier>run0</
Identifier>
303 <Value>56.63</Value>
304 <RawString>56.63</
RawString>
305 </Entry>
306 </Group>
307 </Results>
308 </Benchmark>
309 <Benchmark>
310 <Name>GtkPerf</Name>
311 <Version>0.40</Version>
312 <Attributes>GtkRadioButton</Attributes>
313 <Scale>Seconds</Scale>
314 <Proportion>LIB</Proportion>
315 <ResultFormat>BAR_GRAPH</ResultFormat>
316 <TestName>gtkperf</TestName>
317 <TestArguments>RADIO_BUTTON</TestArguments>
318 <Results>
319 <Group>
320 <Entry>
321 <Identifier>run0</
Identifier>
322 <Value>26.52</Value>
323 <RawString>26.52</
RawString>
324 </Entry>
325 </Group>
326 </Results>
327 </Benchmark>
328 <System>
329 <Hardware>Processor: Intel Atom CPU N270 @
1.60GHz (Total Cores: 2), Motherboard:
Acer AOA150, Chipset: Intel Mobile 945GME
Express Hub + ICH7-M, System Memory: 997
MB, Disk: 160GB WDC WD1600BEVT-2,
Graphics: Intel Mobile 945GME Express IGP
(rev 03)</Hardware>
330 <Software>OS: Fedora 10, Kernel: 2.6.30.5
run2 (i686), Display Server: X.Org Server
1.5.3, Display Driver: intel 2.5.0,
OpenGL: 2.1 Mesa 7.3-devel, Compiler: GCC
4.3.2, File-System: ext3, Screen
Resolution: 1024x600</Software>
331 <Author>jim</Author>
332 <TestDate>April 11, 2010 06:45 PM</TestDate>
333 <TestNotes>2D Acceleration: EXA.
334 Intel SpeedStep Technology was enabled</TestNotes>
335 <Version>1.8.1</Version>
336 <AssociatedIdentifiers>run0</
AssociatedIdentifiers>
337 </System>
338 <Suite>
339 <Title>run2</Title>
340 <Name>netbook</Name>
341 <Version>1.2.0</Version>
342 <Description>This test suite is designed to
test various aspects of a netbook/net-top
/UMPC computer.</Description>
343 <Type>System</Type>
344 <Extensions></Extensions>
345 <TestProperties></TestProperties>
346 </Suite>
347 </PhoronixTestSuite>
B.4 Run1
1 <?xml version="1.0"?>
2 <?xml-stylesheet type="text/xsl" href="pts-results-viewer.
xsl" ?>
3 <!-- Generated: 2010-04-11 04:58:30 -->
4 <PhoronixTestSuite>
5 <Benchmark>
6 <Name>LAME MP3 Encoding</Name>
7 <Version>3.98.2</Version>
8 <Attributes>WAV To MP3</Attributes>
9 <Scale>Seconds</Scale>
10 <Proportion>LIB</Proportion>
11 <ResultFormat>BAR_GRAPH</ResultFormat>
12 <TestName>encode-mp3</TestName>
13 <TestArguments></TestArguments>
14 <Results>
15 <Group>
16 <Entry>
17 <Identifier>run0</
Identifier>
18 <Value>162.12</Value
>
19 <RawString
>163.22724604607:162.69049
RawString>
20 </Entry>
21 </Group>
22 </Results>
23 </Benchmark>
24 <Benchmark>
25 <Name>Ogg Encoding</Name>
26 <Version>1.2.0</Version>
27 <Attributes>WAV To Ogg</Attributes>
28 <Scale>Seconds</Scale>
29 <Proportion>LIB</Proportion>
30 <ResultFormat>BAR_GRAPH</ResultFormat>
31 <TestName>encode-ogg</TestName>
32 <TestArguments></TestArguments>
33 <Results>
34 <Group>
35 <Entry>
36 <Identifier>run0</
Identifier>
37 <Value>107.93</Value
>
38 <RawString
>107.78687500954:107.99853
RawString>
39 </Entry>
40 </Group>
41 </Results>
42 </Benchmark>
43 <Benchmark>
44 <Name>FFmpeg</Name>
45 <Version>0.5</Version>
46 <Attributes>AVI To NTSC VCD</Attributes>
47 <Scale>Seconds</Scale>
48 <Proportion>LIB</Proportion>
49 <ResultFormat>BAR_GRAPH</ResultFormat>
50 <TestName>ffmpeg</TestName>
51 <TestArguments></TestArguments>
52 <Results>
53 <Group>
54 <Entry>
55 <Identifier>run0</
Identifier>
56 <Value>94.47</Value>
57 <RawString
>95.719970941544:93.913688
RawString>
58 </Entry>
59 </Group>
60 </Results>
61 </Benchmark>
62 <Benchmark>
63 <Name>7-Zip Compression</Name>
64 <Version>4.65</Version>
65 <Attributes>Compress Speed Test</Attributes>
66 <Scale>MIPS</Scale>
67 <Proportion>HIB</Proportion>
68 <ResultFormat>BAR_GRAPH</ResultFormat>
69 <TestName>compress-7zip</TestName>
70 <TestArguments></TestArguments>
71 <Results>
72 <Group>
73 <Entry>
74 <Identifier>run0</
Identifier>
75 <Value>864.33</Value
>
76 <RawString
>863:865:865</
RawString>
77 </Entry>
78 </Group>
79 </Results>
80 </Benchmark>
81 <Benchmark>
82 <Name>SciMark</Name>
83 <Version>2.0</Version>
84 <Attributes>Composite</Attributes>
85 <Scale>Mflops</Scale>
86 <Proportion>HIB</Proportion>
87 <ResultFormat>BAR_GRAPH</ResultFormat>
88 <TestName>scimark2</TestName>
89 <TestArguments>TEST_COMPOSITE</TestArguments
>
90 <Results>
91 <Group>
92 <Entry>
93 <Identifier>run0</
Identifier>
94 <Value>120.70</Value
>
95 <RawString
>120.73:120.56:120.80:120.
RawString>
96 </Entry>
97 </Group>
98 </Results>
99 </Benchmark>
100 <Benchmark>
101 <Name>SciMark</Name>
102 <Version>2.0</Version>
103 <Attributes>Fast Fourier Transform</
Attributes>
104 <Scale>Mflops</Scale>
105 <Proportion>HIB</Proportion>
106 <ResultFormat>BAR_GRAPH</ResultFormat>
107 <TestName>scimark2</TestName>
108 <TestArguments>TEST_FFT</TestArguments>
109 <Results>
110 <Group>
111 <Entry>
112 <Identifier>run0</
Identifier>
113 <Value>20.34</Value>
114 <RawString
>20.07:20.14:20.10:21.05</
RawString>
115 </Entry>
116 </Group>
117 </Results>
118 </Benchmark>
119 <Benchmark>
120 <Name>SciMark</Name>
121 <Version>2.0</Version>
122 <Attributes>Monte Carlo</Attributes>
123 <Scale>Mflops</Scale>
124 <Proportion>HIB</Proportion>
125 <ResultFormat>BAR_GRAPH</ResultFormat>
126 <TestName>scimark2</TestName>
127 <TestArguments>TEST_MONTE</TestArguments>
128 <Results>
129 <Group>
130 <Entry>
131 <Identifier>run0</
Identifier>
132 <Value>41.81</Value>
133 <RawString
>41.81:41.30:42.07:42.07</
RawString>
134 </Entry>
135 </Group>
136 </Results>
137 </Benchmark>
138 <Benchmark>
139 <Name>IOzone</Name>
140 <Version>3.315</Version>
141 <Attributes>512MB Write Performance</
Attributes>
142 <Scale>MB/s</Scale>
143 <Proportion>HIB</Proportion>
144 <ResultFormat>BAR_GRAPH</ResultFormat>
145 <TestName>iozone</TestName>
146 <TestArguments>-s 512M -i0</TestArguments>
147 <Results>
148 <Group>
149 <Entry>
150 <Identifier>run0</
Identifier>
151 <Value>53.30</Value>
152 <RawString
>44.083984375:54.630859375
RawString>
153 </Entry>
154 </Group>
155 </Results>
156 </Benchmark>
157 <Benchmark>
158 <Name>IOzone</Name>
159 <Version>3.315</Version>
160 <Attributes>512MB Read Performance</
Attributes>
161 <Scale>MB/s</Scale>
162 <Proportion>HIB</Proportion>
163 <ResultFormat>BAR_GRAPH</ResultFormat>
164 <TestName>iozone</TestName>
165 <TestArguments>-s 512M -i0 -i1</
TestArguments>
166 <Results>
167 <Group>
168 <Entry>
169 <Identifier>run0</
Identifier>
170 <Value>520.11</Value
>
171 <RawString
>518.37890625:519.46191406
RawString>
172 </Entry>
173 </Group>
174 </Results>
175 </Benchmark>
176 <Benchmark>
177 <Name>IOzone</Name>
178 <Version>3.315</Version>
179 <Attributes>1GB Write Performance</
Attributes>
180 <Scale>MB/s</Scale>
181 <Proportion>HIB</Proportion>
182 <ResultFormat>BAR_GRAPH</ResultFormat>
183 <TestName>iozone</TestName>
184 <TestArguments>-s 1024M -i0</TestArguments>
185 <Results>
186 <Group>
187 <Entry>
188 <Identifier>run0</
Identifier>
189 <Value>44.69</Value>
190 <RawString
>44.7919921875:44.68945312
RawString>
191 </Entry>
192 </Group>
193 </Results>
194 </Benchmark>
195 <Benchmark>
196 <Name>SQLite</Name>
197 <Version>3.6.11</Version>
198 <Attributes>12,500 INSERTs</Attributes>
199 <Scale>Seconds</Scale>
200 <Proportion>LIB</Proportion>
201 <ResultFormat>BAR_GRAPH</ResultFormat>
202 <TestName>sqlite</TestName>
203 <TestArguments></TestArguments>
204 <Results>
205 <Group>
206 <Entry>
207 <Identifier>run0</
Identifier>
208 <Value>64.33</Value>
209 <RawString
>62.0997569561:65.31823015
RawString>
210 </Entry>
211 </Group>
212 </Results>
213 </Benchmark>
214 <Benchmark>
215 <Name>GnuPG</Name>
216 <Version>1.4.9</Version>
217 <Attributes>2GB File Encryption</Attributes>
218 <Scale>Seconds</Scale>
219 <Proportion>LIB</Proportion>
220 <ResultFormat>BAR_GRAPH</ResultFormat>
221 <TestName>gnupg</TestName>
222 <TestArguments></TestArguments>
223 <Results>
224 <Group>
225 <Entry>
226 <Identifier>run0</
Identifier>
227 <Value>163.19</Value
>
228 <RawString
>164.5307559967:165.627771
RawString>
229 </Entry>
230 </Group>
231 </Results>
232 </Benchmark>
233 <Benchmark>
234 <Name>C-Ray</Name>
235 <Version>1.1</Version>
236 <Attributes>Total Time</Attributes>
237 <Scale>Seconds</Scale>
238 <Proportion>LIB</Proportion>
239 <ResultFormat>BAR_GRAPH</ResultFormat>
240 <TestName>c-ray</TestName>
241 <TestArguments></TestArguments>
242 <Results>
243 <Group>
244 <Entry>
245 <Identifier>run0</
Identifier>
246 <Value>2563.40</
Value>
247 <RawString
>2563.28:2563.756:2563.17<
RawString>
248 </Entry>
249 </Group>
250 </Results>
251 </Benchmark>
252 <Benchmark>
253 <Name>RAMspeed</Name>
254 <Version>2.5.2</Version>
255 <Attributes>Integer Add</Attributes>
256 <Scale>MB/s</Scale>
257 <Proportion>HIB</Proportion>
258 <ResultFormat>BAR_GRAPH</ResultFormat>
259 <TestName>ramspeed</TestName>
260 <TestArguments>ADD -b 3 -l 10</TestArguments
>
261 <Results>
262 <Group>
263 <Entry>
264 <Identifier>run0</
Identifier>
265 <Value>2021.47</
Value>
266 <RawString>2021.47</
RawString>
267 </Entry>
268 </Group>
269 </Results>
270 </Benchmark>
271 <Benchmark>
272 <Name>GtkPerf</Name>
273 <Version>0.40</Version>
274 <Attributes>GtkComboBox</Attributes>
275 <Scale>Seconds</Scale>
276 <Proportion>LIB</Proportion>
277 <ResultFormat>BAR_GRAPH</ResultFormat>
278 <TestName>gtkperf</TestName>
279 <TestArguments>COMBOBOX</TestArguments>
280 <Results>
281 <Group>
282 <Entry>
283 <Identifier>run0</
Identifier>
284 <Value>146.95</Value
>
285 <RawString>146.95</
RawString>
286 </Entry>
287 </Group>
288 </Results>
289 </Benchmark>
290 <Benchmark>
291 <Name>GtkPerf</Name>
292 <Version>0.40</Version>
293 <Attributes>GtkDrawingArea - Pixbufs</
Attributes>
294 <Scale>Seconds</Scale>
295 <Proportion>LIB</Proportion>
296 <ResultFormat>BAR_GRAPH</ResultFormat>
297 <TestName>gtkperf</TestName>
298 <TestArguments>DRAWING_PIXBUFS</
TestArguments>
299 <Results>
300 <Group>
301 <Entry>
302 <Identifier>run0</
Identifier>
303 <Value>57.71</Value>
304 <RawString>57.71</
RawString>
305 </Entry>
306 </Group>
307 </Results>
308 </Benchmark>
309 <Benchmark>
310 <Name>GtkPerf</Name>
311 <Version>0.40</Version>
312 <Attributes>GtkRadioButton</Attributes>
313 <Scale>Seconds</Scale>
314 <Proportion>LIB</Proportion>
315 <ResultFormat>BAR_GRAPH</ResultFormat>
316 <TestName>gtkperf</TestName>
317 <TestArguments>RADIO_BUTTON</TestArguments>
318 <Results>
319 <Group>
320 <Entry>
321 <Identifier>run0</
Identifier>
322 <Value>26.65</Value>
323 <RawString>26.65</
RawString>
324 </Entry>
325 </Group>
326 </Results>
327 </Benchmark>
328 <System>
329 <Hardware>Processor: Intel Atom CPU N270 @
1.60GHz (Total Cores: 2), Motherboard:
Acer AOA150, Chipset: Intel Mobile 945GME
Express Hub + ICH7-M, System Memory: 997
MB, Disk: 160GB WDC WD1600BEVT-2,
Graphics: Intel Mobile 945GME Express IGP
(rev 03)</Hardware>
330 <Software>OS: Fedora 10, Kernel: 2.6.30.5
run3 (i686), Display Server: X.Org Server
1.5.3, Display Driver: intel 2.5.0,
OpenGL: 2.1 Mesa 7.3-devel, Compiler: GCC
4.3.2, File-System: ext3, Screen
Resolution: 1024x600</Software>
331 <Author>jim</Author>
332 <TestDate>April 11, 2010 04:58 AM</TestDate>
333 <TestNotes>2D Acceleration: EXA.
334 Intel SpeedStep Technology was enabled</TestNotes>
335 <Version>1.8.1</Version>
336 <AssociatedIdentifiers>run0</
AssociatedIdentifiers>
337 </System>
338 <Suite>
339 <Title>run3</Title>
340 <Name>netbook</Name>
341 <Version>1.2.0</Version>
342 <Description>This test suite is designed to
test various aspects of a netbook/net-top
/UMPC computer.</Description>
343 <Type>System</Type>
344 <Extensions></Extensions>
345 <TestProperties></TestProperties>
346 </Suite>
347 </PhoronixTestSuite>
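In the listing above, each <Value> appears to be the arithmetic mean of the colon-separated samples recorded in the matching <RawString>; for example, the 7-Zip samples 863:865:865 average to 864.33. The following is a minimal sketch of how such a result file could be checked, assuming Python's standard xml.etree.ElementTree module. The element names are taken from the listing; the function name summarize and the SAMPLE constant are hypothetical and introduced only for illustration.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the 7-Zip entry of the listing above, with the
# wrapped tags rejoined and the listing line numbers removed.
SAMPLE = """
<PhoronixTestSuite>
  <Benchmark>
    <Name>7-Zip Compression</Name>
    <Scale>MIPS</Scale>
    <Proportion>HIB</Proportion>
    <Results>
      <Group>
        <Entry>
          <Identifier>run0</Identifier>
          <Value>864.33</Value>
          <RawString>863:865:865</RawString>
        </Entry>
      </Group>
    </Results>
  </Benchmark>
</PhoronixTestSuite>
"""


def summarize(xml_text):
    """Return one dict per <Entry>: benchmark name, reported value, recomputed mean."""
    rows = []
    root = ET.fromstring(xml_text)
    for bench in root.iter("Benchmark"):
        name = bench.findtext("Name")
        for entry in bench.iter("Entry"):
            reported = float(entry.findtext("Value"))
            # RawString holds the individual trial results separated by colons.
            samples = [float(s) for s in entry.findtext("RawString").split(":")]
            rows.append({
                "name": name,
                "reported": reported,
                "mean": sum(samples) / len(samples),
            })
    return rows


if __name__ == "__main__":
    for row in summarize(SAMPLE):
        print(f"{row['name']}: reported {row['reported']:.2f}, "
              f"mean of samples {row['mean']:.2f}")
```

Entries whose <RawString> is cut off in the printed listing cannot be checked this way, because the missing samples are not recoverable from the text.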
B.5 Run2
1 <?xml version="1.0"?>
2 <?xml-stylesheet type="text/xsl" href="pts-results-viewer.
xsl" ?>
3 <!-- Generated: 2010-04-11 23:11:39 -->
4 <PhoronixTestSuite>
5 <Benchmark>
6 <Name>LAME MP3 Encoding</Name>
7 <Version>3.98.2</Version>
8 <Attributes>WAV To MP3</Attributes>
9 <Scale>Seconds</Scale>
10 <Proportion>LIB</Proportion>
11 <ResultFormat>BAR_GRAPH</ResultFormat>
12 <TestName>encode-mp3</TestName>
13 <TestArguments></TestArguments>
14 <Results>
15 <Group>
16 <Entry>
17 <Identifier>run0</
Identifier>
18 <Value>162.12</Value
>
19 <RawString
>163.17384290695:162.00460
RawString>
20 </Entry>
21 </Group>
22 </Results>
23 </Benchmark>
24 <Benchmark>
25 <Name>Ogg Encoding</Name>
26 <Version>1.2.0</Version>
27 <Attributes>WAV To Ogg</Attributes>
28 <Scale>Seconds</Scale>
29 <Proportion>LIB</Proportion>
30 <ResultFormat>BAR_GRAPH</ResultFormat>
31 <TestName>encode-ogg</TestName>
32 <TestArguments></TestArguments>
33 <Results>
34 <Group>
35 <Entry>
36 <Identifier>run0</
Identifier>
37 <Value>107.66</Value
>
38 <RawString
>107.60985088348:108.16382
RawString>
39 </Entry>
40 </Group>
41 </Results>
42 </Benchmark>
43 <Benchmark>
44 <Name>FFmpeg</Name>
45 <Version>0.5</Version>
46 <Attributes>AVI To NTSC VCD</Attributes>
47 <Scale>Seconds</Scale>
48 <Proportion>LIB</Proportion>
49 <ResultFormat>BAR_GRAPH</ResultFormat>
50 <TestName>ffmpeg</TestName>
51 <TestArguments></TestArguments>
52 <Results>
53 <Group>
54 <Entry>
55 <Identifier>run0</
Identifier>
56 <Value>94.17</Value>
57 <RawString
>94.920183897018:93.461187
RawString>
58 </Entry>
59 </Group>
60 </Results>
61 </Benchmark>
62 <Benchmark>
63 <Name>7-Zip Compression</Name>
64 <Version>4.65</Version>
65 <Attributes>Compress Speed Test</Attributes>
66 <Scale>MIPS</Scale>
67 <Proportion>HIB</Proportion>
68 <ResultFormat>BAR_GRAPH</ResultFormat>
69 <TestName>compress-7zip</TestName>
70 <TestArguments></TestArguments>
71 <Results>
72 <Group>
73 <Entry>
74 <Identifier>run0</
Identifier>
75 <Value>862.00</Value
>
76 <RawString
>858:867:861</
RawString>
77 </Entry>
78 </Group>
79 </Results>
80 </Benchmark>
81 <Benchmark>
82 <Name>SciMark</Name>
83 <Version>2.0</Version>
84 <Attributes>Composite</Attributes>
85 <Scale>Mflops</Scale>
86 <Proportion>HIB</Proportion>
87 <ResultFormat>BAR_GRAPH</ResultFormat>
88 <TestName>scimark2</TestName>
89 <TestArguments>TEST_COMPOSITE</TestArguments
>
90 <Results>
91 <Group>
92 <Entry>
93 <Identifier>run0</
Identifier>
94 <Value>120.10</Value
>
95 <RawString
>119.90:119.70:119.86:120.
RawString>
96 </Entry>
97 </Group>
98 </Results>
99 </Benchmark>
100 <Benchmark>
101 <Name>SciMark</Name>
102 <Version>2.0</Version>
103 <Attributes>Fast Fourier Transform</
Attributes>
104 <Scale>Mflops</Scale>
105 <Proportion>HIB</Proportion>
106 <ResultFormat>BAR_GRAPH</ResultFormat>
107 <TestName>scimark2</TestName>
108 <TestArguments>TEST_FFT</TestArguments>
109 <Results>
110 <Group>
111 <Entry>
112 <Identifier>run0</
Identifier>
113 <Value>20.34</Value>
114 <RawString
>20.26:20.26:20.41:20.45</
RawString>
115 </Entry>
116 </Group>
117 </Results>
118 </Benchmark>
119 <Benchmark>
120 <Name>SciMark</Name>
121 <Version>2.0</Version>
122 <Attributes>Monte Carlo</Attributes>
123 <Scale>Mflops</Scale>
124 <Proportion>HIB</Proportion>
125 <ResultFormat>BAR_GRAPH</ResultFormat>
126 <TestName>scimark2</TestName>
127 <TestArguments>TEST_MONTE</TestArguments>
128 <Results>
129 <Group>
130 <Entry>
131 <Identifier>run0</
Identifier>
132 <Value>41.84</Value>
133 <RawString
>42.07:41.81:41.43:42.07</
RawString>
134 </Entry>
135 </Group>
136 </Results>
137 </Benchmark>
138 <Benchmark>
139 <Name>IOzone</Name>
140 <Version>3.315</Version>
141 <Attributes>512MB Write Performance</
Attributes>
142 <Scale>MB/s</Scale>
143 <Proportion>HIB</Proportion>
144 <ResultFormat>BAR_GRAPH</ResultFormat>
145 <TestName>iozone</TestName>
146 <TestArguments>-s 512M -i0</TestArguments>
147 <Results>
148 <Group>
149 <Entry>
150 <Identifier>run0</
Identifier>
151 <Value>52.48</Value>
152 <RawString
>44.2880859375:53.87011718
RawString>
153 </Entry>
154 </Group>
155 </Results>
156 </Benchmark>
157 <Benchmark>
158 <Name>IOzone</Name>
159 <Version>3.315</Version>
160 <Attributes>512MB Read Performance</
Attributes>
161 <Scale>MB/s</Scale>
162 <Proportion>HIB</Proportion>
163 <ResultFormat>BAR_GRAPH</ResultFormat>
164 <TestName>iozone</TestName>
165 <TestArguments>-s 512M -i0 -i1</
TestArguments>
166 <Results>
167 <Group>
168 <Entry>
169 <Identifier>run0</
Identifier>
170 <Value>523.81</Value
>
171 <RawString
>517.7275390625:520.118164
RawString>
172 </Entry>
173 </Group>
174 </Results>
175 </Benchmark>
176 <Benchmark>
177 <Name>IOzone</Name>
178 <Version>3.315</Version>
179 <Attributes>1GB Write Performance</
Attributes>
180 <Scale>MB/s</Scale>
181 <Proportion>HIB</Proportion>
182 <ResultFormat>BAR_GRAPH</ResultFormat>
183 <TestName>iozone</TestName>
184 <TestArguments>-s 1024M -i0</TestArguments>
185 <Results>
186 <Group>
187 <Entry>
188 <Identifier>run0</
Identifier>
189 <Value>43.96</Value>
190 <RawString
>43.5712890625:43.22949218
RawString>
191 </Entry>
192 </Group>
193 </Results>
194 </Benchmark>
195 <Benchmark>
196 <Name>SQLite</Name>
197 <Version>3.6.11</Version>
198 <Attributes>12,500 INSERTs</Attributes>
199 <Scale>Seconds</Scale>
200 <Proportion>LIB</Proportion>
201 <ResultFormat>BAR_GRAPH</ResultFormat>
202 <TestName>sqlite</TestName>
203 <TestArguments></TestArguments>
204 <Results>
205 <Group>
206 <Entry>
207 <Identifier>run0</
Identifier>
208 <Value>62.39</Value>
209 <RawString
>61.145431995392:63.320338
RawString>
210 </Entry>
211 </Group>
212 </Results>
213 </Benchmark>
214 <Benchmark>
215 <Name>GnuPG</Name>
216 <Version>1.4.9</Version>
217 <Attributes>2GB File Encryption</Attributes>
218 <Scale>Seconds</Scale>
219 <Proportion>LIB</Proportion>
220 <ResultFormat>BAR_GRAPH</ResultFormat>
221 <TestName>gnupg</TestName>
222 <TestArguments></TestArguments>
223 <Results>
224 <Group>
225 <Entry>
226 <Identifier>run0</
Identifier>
227 <Value>163.54</Value
>
228 <RawString
>164.57925820351:166.04097
RawString>
229 </Entry>
230 </Group>
231 </Results>
232 </Benchmark>
233 <Benchmark>
234 <Name>C-Ray</Name>
235 <Version>1.1</Version>
236 <Attributes>Total Time</Attributes>
237 <Scale>Seconds</Scale>
238 <Proportion>LIB</Proportion>
239 <ResultFormat>BAR_GRAPH</ResultFormat>
240 <TestName>c-ray</TestName>
241 <TestArguments></TestArguments>
242 <Results>
243 <Group>
244 <Entry>
245 <Identifier>run0</
Identifier>
246 <Value>2563.91</
Value>
247 <RawString
>2564.097:2563.603:2564.03
RawString>
248 </Entry>
249 </Group>
250 </Results>
251 </Benchmark>
252 <Benchmark>
253 <Name>RAMspeed</Name>
254 <Version>2.5.2</Version>
255 <Attributes>Integer Add</Attributes>
256 <Scale>MB/s</Scale>
257 <Proportion>HIB</Proportion>
258 <ResultFormat>BAR_GRAPH</ResultFormat>
259 <TestName>ramspeed</TestName>
260 <TestArguments>ADD -b 3 -l 10</TestArguments
>
261 <Results>
262 <Group>
263 <Entry>
264 <Identifier>run0</
Identifier>
265 <Value>2018.61</
Value>
266 <RawString>2018.61</
RawString>
267 </Entry>
268 </Group>
269 </Results>
270 </Benchmark>
271 <Benchmark>
272 <Name>GtkPerf</Name>
273 <Version>0.40</Version>
274 <Attributes>GtkComboBox</Attributes>
275 <Scale>Seconds</Scale>
276 <Proportion>LIB</Proportion>
277 <ResultFormat>BAR_GRAPH</ResultFormat>
278 <TestName>gtkperf</TestName>
279 <TestArguments>COMBOBOX</TestArguments>
280 <Results>
281 <Group>
282 <Entry>
283 <Identifier>run0</
Identifier>
284 <Value>149.88</Value
>
285 <RawString>149.88</
RawString>
286 </Entry>
287 </Group>
288 </Results>
289 </Benchmark>
290 <Benchmark>
291 <Name>GtkPerf</Name>
292 <Version>0.40</Version>
293 <Attributes>GtkDrawingArea - Pixbufs</
Attributes>
294 <Scale>Seconds</Scale>
295 <Proportion>LIB</Proportion>
296 <ResultFormat>BAR_GRAPH</ResultFormat>
297 <TestName>gtkperf</TestName>
298 <TestArguments>DRAWING_PIXBUFS</
TestArguments>
299 <Results>
300 <Group>
301 <Entry>
302 <Identifier>run0</
Identifier>
303 <Value>57.19</Value>
304 <RawString>57.19</
RawString>
305 </Entry>
306 </Group>
307 </Results>
308 </Benchmark>
309 <Benchmark>
310 <Name>GtkPerf</Name>
311 <Version>0.40</Version>
312 <Attributes>GtkRadioButton</Attributes>
313 <Scale>Seconds</Scale>
314 <Proportion>LIB</Proportion>
315 <ResultFormat>BAR_GRAPH</ResultFormat>
316 <TestName>gtkperf</TestName>
317 <TestArguments>RADIO_BUTTON</TestArguments>
318 <Results>
319 <Group>
320 <Entry>
321 <Identifier>run0</
Identifier>
322 <Value>26.57</Value>
323 <RawString>26.57</
RawString>
324 </Entry>
325 </Group>
326 </Results>
327 </Benchmark>
328 <System>
329 <Hardware>Processor: Intel Atom CPU N270 @
1.60GHz (Total Cores: 2), Motherboard:
Acer AOA150, Chipset: Intel Mobile 945GME
Express Hub + ICH7-M, System Memory: 997
MB, Disk: 160GB WDC WD1600BEVT-2,
Graphics: Intel Mobile 945GME Express IGP
(rev 03)</Hardware>
330 <Software>OS: Fedora 10, Kernel: 2.6.30.5
run4 (i686), Display Server: X.Org Server
1.5.3, Display Driver: intel 2.5.0,
OpenGL: 2.1 Mesa 7.3-devel, Compiler: GCC
4.3.2, File-System: ext3, Screen
Resolution: 1024x600</Software>
331 <Author>jim</Author>
332 <TestDate>April 11, 2010 11:11 PM</TestDate>
333 <TestNotes>2D Acceleration: EXA.
334 Intel SpeedStep Technology was enabled</TestNotes>
335 <Version>1.8.1</Version>
336 <AssociatedIdentifiers>run0</
AssociatedIdentifiers>
337 </System>
338 <Suite>
339 <Title>run4</Title>
340 <Name>netbook</Name>
341 <Version>1.2.0</Version>
342 <Description>This test suite is designed to
test various aspects of a netbook/net-top
/UMPC computer.</Description>
343 <Type>System</Type>
344 <Extensions></Extensions>
345 <TestProperties></TestProperties>
346 </Suite>
347 </PhoronixTestSuite>
Bibliography
[1] Michael Abrash. Graphics Programming Black Book, chapter 7, pages 8–10.
[2] Boaz Barak and Shai Halevi. A model and architecture for pseudo-random
generation with applications to /dev/random. In CCS ’05: Proceedings of the 12th
ACM conference on Computer and communications security, pages 203–212, New
York, NY, USA, 2005. ACM.
[3] Paul E. Black. Fisher-Yates shuffle. National Institute of Standards and
Technology, 2009.
http://www.itl.nist.gov/div897/sqg/dads/HTML/fisherYatesShuffle.html.
[4] Keith D. Cooper, Philip J. Schielke, and Devika Subramanian. Optimizing for
reduced code space using genetic algorithms. In LCTES ’99: Proceedings of the
ACM SIGPLAN 1999 workshop on Languages, compilers, and tools for embedded
systems, pages 1–9, New York, NY, USA, 1999. ACM.
[5] Fabio Arnone, Phil Barnard, Francois Bodin, Zbigniew Chamski, Bjorn Franke,
Grigori Fursin, Taras Glek, Yuriy Kashnikov, Hugh Leather, Abdul Wahid Memon,
Cupertino Miranda, Mircea Namolaru, Diego Novillo, Sebastian Pop, Joern Rennecke,
Jeremy Singer, Basile Starynkevitch, and Ayal Zaks. CTools: MILEPOST GCC:
Motivation, 2010.
http://ctuning.org/wiki/index.php/CTools:MilepostGCC:Motivation.
[7] Grigori Fursin, Cupertino Miranda, Olivier Temam, Mircea Namolaru, Elad
Yom-Tov, Ayal Zaks, Bilha Mendelson, Phil Barnard, Elton Ashton, Eric Courtois,
Francois Bodin, Edwin Bonilla, John Thomson, Hugh Leather, Chris
Williams, and Michael O’Boyle. Milepost gcc: machine learning based research
compiler. In Proceedings of the GCC Developers’ Summit, June 2008.
[8] Kenneth Hoste and Lieven Eeckhout. Cole: compiler optimization level
exploration. In CGO ’08: Proceedings of the sixth annual IEEE/ACM international
symposium on Code generation and optimization, pages 165–174, New York, NY,
USA, 2008. ACM.
[9] Intel. Intel(R) C++ Compiler User and Reference Guides. Document number:
304968-023US.
[12] Intel. Mobile Intel Atom Processor N270 Single Core Datasheet, May 2008.
[13] Ilhyun Kim and Mikko H. Lipasti. Implementing optimizations at decode time.
In ISCA ’02: Proceedings of the 29th annual international symposium on
Computer architecture, pages 221–232, Washington, DC, USA, 2002. IEEE Computer
Society.
[15] Prasad A. Kulkarni, David B. Whalley, Gary S. Tyson, and Jack W. Davidson.
Practical exhaustive optimization phase order exploration and evaluation. ACM
Trans. Archit. Code Optim., 6(1):1–36, 2009.
[19] Brad L. Miller and David E. Goldberg. Genetic algorithms, selection schemes,
and the varying effects of noise. Evol. Comput., 4(2):113–131, 1996.
[20] Robert Muller-Albrecht. Optimized for the Intel Atom processor with Intel’s
compiler. Technical note, Intel, January 2009.
[21] S. K. Park and K. W. Miller. Random number generators: good ones are hard
to find. Commun. ACM, 31(10):1192–1201, 1988.
[24] Justin Ryan. LinuxDNA supercharges Linux with the Intel C/C++ compiler. Linux
Journal, February 2009.
http://www.linuxjournal.com/content/linuxdna-supercharges-linux-intel-cc-compiler.
[25] Eric Schnarr and James R. Larus. Instruction scheduling and executable editing.
In MICRO 29: Proceedings of the 29th annual ACM/IEEE international
symposium on Microarchitecture, pages 288–297, Washington, DC, USA, 1996. IEEE
Computer Society.
[26] Peter Selinger. The glibc pseudo-random number generator, 2007.
http://www.mscs.dal.ca/~selinger/random/.
[27] Staffan Algers, Eric Bernauer, and Marco Boero. Review of micro-simulation
models, 1997.
http://www.its.leeds.ac.uk/projects/smartest/deliv3.html.
[29] Linus Torvalds. Linus Torvalds on git. Google Tech Talk, May 2007.
[30] Kent Wilken, Jack Liu, and Mark Heffernan. Optimal instruction scheduling
using integer programming. SIGPLAN Not., 35(5):121–133, 2000.