Documente Academic
Documente Profesional
Documente Cultură
Performance
Two common measures
Latency (how long to do X)
Also called response time and execution time
Measuring Performance
Peak (MIPS, MFLOPS)
Often not useful
unachievable in practice, or unsustainable
Measuring Performance
Benchmarks
Real applications and application suites
E.g., SPEC CPU2000, SPEC2006, TPC-C, TPC-H
Kernels
Representative parts of real applications
Easier and quicker to set up and run
Often not really representative of the entire
app
Price-Performance
TPC Benchmarks
Measure transaction-processing
throughput
Benchmarks for different scenarios
TPC-C: warehouses and sales transactions
TPC-H: ad-hoc decision support
TPC-W: web-based business transactions
Throughput-Server Perf/Cost
High performance
Very expensive!
CPU time
Seconds
Program
Program
Instruction
Clock Cycle
ISA,
Compiler
Technology
A.K.A. The iron law of performance
Organization,
ISA
Hardware
Technology,
Organization
Car Analogy
Need to drive from Klaus to CRC
Clock Speed = 3500 RPM
CPI = 5250 rotations/km or 0.19 m/rot
Insts = 800m
CPU time
Seconds
Program
Program
Instruction
Clock Cycle
800 m
1 rotation
0.19 m
= 1.2 minutes
1 minute
3500 rotations
CPU Version
Program takes 33 billion instructions to run
CPU processes insts at 2 cycles per inst
Clock speed of 3GHz
CPU time
Seconds
Program
Program
Instruction
Clock Cycle
= 22 seconds
CPU time
i 1
ICi CPIi
Frequency
CPI
Integer
40%
1.0
Branch
20%
4.0
Load
20%
2.0
CPU time
Store
10%
3.0
Total Insts = 50B, Clock speed = 2 GHz
i 1
ICi CPIi
Comparing Performance
X is n times faster than Y
Execution time Y
n
Execution time X
Summarizing Performance
Arithmetic mean
Average execution time
Gives more weight to longer-running
programs
Speedup
Machine A
Machine B
Program 1
5 sec
4 sec
Program 2
3 sec
6 sec
i 1
CPI/IPC
Often when making comparisons in
comp-arch studies:
Program (or set of) is the same for two
CPUs
The clock speed is the same for two CPUs
Harmonic Mean
H.M.(x1,x2,x3,,xn) =
n
1 + 1 + 1
x1 x2 x3
+ + 1
xn
A.M.(CPI)
1
+ CPI2 + CPI3 + + CPIn
=
CPI1 + CPI2
=
1
+
=H.M.(IPC)
IPC1
1
IPC2
n
+ CPI3 + + CPIn
n
+
1
IPC3
IPCn
Speedup
Execution Time new
Fraction Enhanced
Execution Time old 1 Fraction Enhanced
Speedup Enhanced
Overall Speedup
Caution: fraction
of What?
1 Fraction Enhanced
Fraction Enhanced
Speedup Enhanced
Overall Speedup
1 Fraction Enhanced
Speedup Enhanced 20
Speedup
1
1 0.1 0.1
20
1.105
VS
Fraction Enhanced
Speedup Enhanced
Speedup
1
0.9
1 0.9
1. 2
1.176
Green Phase
Blue Phase
Speedup Overall 1.33
Generation 2
Green
Blue
Generation 3
Blue
Speedup Green 2
Fraction Green
over Generation 1
Speedup Green 2
Fraction Green
1
3
over Generation 2
1
2
Without Turbo
With Turbo
Car costs extra $3,000
Selling price is $16,000 $5K profit per car
But only a few gear heads buy the car:
We only sell 400 cars and make $2M in profit