
Code Optimization

Code optimization - a program transformation that preserves
correctness and improves the performance (e.g., execution
time, space) of the input program.
Code optimization may be performed at multiple levels of
program representation:
1. Source code
2. Intermediate code
3. Target machine code

Criteria for code optimization

Correctness
Optimizations can only be applied conservatively - the
result of the transformation cannot change the observable
behavior of the program.
Profitability
We must have a reasonable expectation that the
transformation will improve the code.
Efficiency
Can we easily locate the places where the transformation applies?

Optimizations
reduce the number of instructions
reduce the cost of instructions
reduce the number of times an instruction is executed

Code Optimization
Why
Reduce the programmer's burden
Allow programmers to concentrate on high-level concepts
without worrying about performance issues
Target
Reduce execution time
Reduce space
Sometimes these are tradeoffs
Types
Intermediate code level
We are looking at this part now
Assembly level
Instruction selection, register allocation, etc.

Code Optimization
Architecture Independent:
Common Subexpression Elimination
Constant Propagation
Dead Code Elimination

Architecture Dependent:
Instruction selection
Register allocation

Organization of an optimizing compiler

[Figure: the code optimizer consists of control flow analysis -> data flow analysis -> transformation]

Classifications of Optimization techniques

Peephole optimizations
Local Optimizations (straight-line code (basic block))
Global Optimizations
Intra-procedural (Consider entire procedures - data flow
analysis, control flow analysis)
Inter-procedural (Consider entire programs - problems
of parameter passing, aliasing, calling scope)
Loop Optimizations

Themes behind Optimization Techniques

Avoid redundancy: something already computed need
not be computed again

Smaller code: less work for CPU, cache, and memory!

Fewer jumps: jumps interfere with code pre-fetch
Code locality: code executed close together in time is
generated close together in memory - increases locality of
reference

Extract more information about code: more info -
better code generation

Redundancy elimination
Redundancy elimination = determining that two
computations are equivalent and eliminating one.
There are several types of redundancy elimination:
Value numbering
Associates symbolic values to computations and identifies expressions
that have the same value

Common subexpression elimination


Identifies expressions that have operands with the same name

Constant/Copy propagation
Identifies variables that have constant/copy values and uses the
constants/copies in place of the variables.

Partial redundancy elimination


Inserts computations in paths to convert partial redundancy to full
redundancy.

Code Optimization
Techniques
Constant propagation
Constant folding
Algebraic simplification, strength reduction
Copy propagation
Common subexpression elimination
Unreachable code elimination
Dead code elimination
Loop Optimization
Function related
Function inlining, function cloning
Most of the optimization goals are identified through
experience, but the techniques require a theoretical foundation

Compile-Time Evaluation

Expressions whose values can be pre-computed at
compilation time
Two ways:
Constant folding
Constant propagation

Compile-Time Evaluation

Constant folding: Evaluation of an expression with
constant operands to replace the expression with a single
value

Example:
area := (22.0/7.0) * r ** 2
=>
area := 3.14286 * r ** 2
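
A minimal sketch of constant folding, assuming expressions are represented as nested (op, left, right) tuples. That representation, the OPS table and the fold function are illustrative assumptions, not part of the slides.

```python
import operator

# Operators this sketch knows how to fold (a hypothetical, minimal set).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       "/": operator.truediv, "**": operator.pow}

def fold(expr):
    """Fold constant subexpressions of a nested (op, left, right) tuple."""
    if not isinstance(expr, tuple):
        return expr                       # a number or a variable name
    op, left, right = expr
    left, right = fold(left), fold(right)
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        try:
            return OPS[op](left, right)   # both operands known: evaluate now
        except ZeroDivisionError:
            pass                          # leave the faulting operation for run time
    return (op, left, right)

# area := (22.0/7.0) * r ** 2
area = ("*", ("/", 22.0, 7.0), ("**", "r", 2))
folded = fold(area)                       # the 22.0/7.0 subtree becomes one number
```

Only the subtree whose operands are all constants is evaluated; r ** 2 stays symbolic because r is unknown at compile time.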


Compile-Time Evaluation

Constant Propagation: If the value of a variable is a
constant, then replace the variable by the constant
Example:
pi := 3.14286
area := pi * r ** 2
=>
area := 3.14286 * r ** 2


Constant Propagation
What does it mean?
Given an assignment x = c, where c is a constant, replace
later uses of x with uses of c, provided there are no
intervening assignments to x.
When is it performed?
Early in the optimization process.
What is the result?
Smaller code
Fewer registers
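
The rule above can be sketched for straight-line code as follows. The statement format (a destination plus a list of right-hand-side tokens) and the name propagate_constants are assumptions made for illustration only.

```python
def propagate_constants(block):
    """Replace uses of variables that are known to hold a constant.

    block is a list of (dest, rhs) pairs; rhs is a list of tokens, where
    numbers are constants and strings are variable names or operators.
    """
    known = {}                            # variable -> constant value
    out = []
    for dest, rhs in block:
        new_rhs = [known.get(tok, tok) if isinstance(tok, str) else tok
                   for tok in rhs]
        out.append((dest, new_rhs))
        if len(new_rhs) == 1 and isinstance(new_rhs[0], (int, float)):
            known[dest] = new_rhs[0]      # dest := constant
        else:
            known.pop(dest, None)         # an intervening assignment kills the fact
    return out

block = [("pi", [3.14286]),
         ("area", ["pi", "*", "r", "**", 2]),
         ("pi", ["x"]),                   # pi is reassigned to a non-constant
         ("c", ["pi", "*", 2])]
result = propagate_constants(block)
```

The use of pi in the second statement is replaced, while the use in the last statement is left alone because of the intervening assignment to pi.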


Code Optimization Techniques

Algebraic simplification
More general form of constant folding, e.g.,
x + 0 => x     x - 0 => x
x * 1 => x     x / 1 => x
x * 0 => 0
Repeatedly apply the rules
(y * 1 + 0) / 1 => y
Strength reduction
Replace expensive operations with cheaper ones
E.g., x := x * 8 => x := x << 3
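
A small sketch of applying these rules bottom-up, assuming integer-valued expressions stored as (op, left, right) tuples. The tuple format and the simplify function are illustrative assumptions; the x * 0 rule and the shift rewrite are not safe for floating point in general.

```python
def simplify(expr):
    """Apply the identity rules and one strength reduction, bottom-up."""
    if not isinstance(expr, tuple):
        return expr
    op, l, r = expr
    l, r = simplify(l), simplify(r)       # simplify operands first
    if op == "+" and r == 0:
        return l
    if op == "+" and l == 0:
        return r
    if op == "-" and r == 0:
        return l
    if op == "*" and r == 1:
        return l
    if op == "*" and l == 1:
        return r
    if op == "*" and (l == 0 or r == 0):
        return 0
    if op == "/" and r == 1:
        return l
    # Strength reduction: multiply by a power of two becomes a left shift.
    if op == "*" and isinstance(r, int) and r > 1 and r & (r - 1) == 0:
        return ("<<", l, r.bit_length() - 1)
    return (op, l, r)

simplified = simplify(("/", ("+", ("*", "y", 1), 0), 1))
shifted = simplify(("*", "x", 8))
```

Because operands are simplified before their parent, one pass is enough to reduce (y * 1 + 0) / 1 to y.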

Common Sub-expression Evaluation


Identify common sub-expressions present in different
expressions, compute once, and use the result in all the
places.
The definitions of the variables involved should not change
Example:
a := b * c
x := b * c + 5
=>
temp := b * c
a := temp
x := temp + 5


Basic Blocks and Flow Graphs


A graph representation of three-address statements is called a
flow graph.
Nodes in the flow graph represent computations
Edges represent the flow of control
Basic Block:
A basic block is a sequence of consecutive statements
in which flow of control enters at the beginning and leaves at
the end without halt or possibility of branching except at the
end.

Basic Blocks and Flow Graphs (2)

This is a basic block

t1 = a * a
t2 = a * b
t3 = 2 * t2
t4 = t1 + t3
t5 = b * b
t6 = t4 + t5

A three-address statement x = y + z is said to define x and to use y and z.

A name in a basic block is said to be live at a given point if its
value is used after that point in the program, perhaps in another
basic block

Basic Blocks and Flow Graphs (3)


Partition into basic blocks
Method
We first determine the leaders
o The first statement is a leader
o Any statement that is the target of a conditional or
unconditional goto is a leader
o Any statement that immediately follows a goto or
conditional goto statement is a leader
For each leader, its basic block consists of the leader and
all the statements up to but not including the next leader
or the end of the program.

Basic Blocks and Flow Graphs (4)


B1:
(1) prod = 0
(2) i = 1

B2:
(3) t1 = 4 * i
...
(11) i = t7
(12) if i <= 20 goto (3)

Control Flow Graph


Two basic blocks were identified in the previous example
Add flow-of-control information to the basic blocks by
constructing a directed graph known as a flow graph or CFG
Nodes in a CFG are basic blocks
The block whose leader is the first statement is distinguished
as the initial node
There is a directed edge from block B1 to block B2 if
there is a conditional or unconditional jump from the
last statement of B1 to the first statement of B2, or
B2 immediately follows B1 in the order of the
program, and B1 does not end in an unconditional
jump.

Flow Graph of Previous Example

B1:
prod = 0
i = 1

B2:
t1 = 4 * i
...
i = t7
if i <= 20 goto B2

Edges: B1 -> B2, B2 -> B2

Basic Blocks (Revision)


A basic block is a sequence of consecutive
intermediate language statements in which flow of
control can only enter at the beginning and leave at
the end.

Only the last statement of a basic block can be a branch
statement and only the first statement of a basic block
can be a target of a branch.
In some frameworks, procedure calls may occur within a
basic block.

Basic Block
Partitioning Algorithm
1. Identify leader statements (i.e. the first statements
of basic blocks) by using the following rules:
(i) The first statement in the program is a leader
(ii) Any statement that is the target of a branch
statement is a leader (for most intermediate
languages these are statements with
an associated label)
(iii) Any statement that immediately follows a branch
or return statement is a leader

(AhoSethiUllman)

Example: Finding Leaders


The following code computes the inner product of two vectors.

Source code:
begin
  prod := 0;
  i := 1;
  do begin
    prod := prod + a[i] * b[i];
    i := i + 1;
  end
  while i <= 20
end

Three-address code:
(1) prod := 0
(2) i := 1
(3) t1 := 4 * i
(4) t2 := a[t1]
(5) t3 := 4 * i
(6) t4 := b[t3]
(7) t5 := t2 * t4
(8) t6 := prod + t5
(9) prod := t6
(10) t7 := i + 1
(11) i := t7
(12) if i <= 20 goto (3)
(AhoSethiUllman)

Example: Finding Leaders

The following code computes the inner product of two vectors.
Leaders in the three-address code, with the rule that selects each:

Rule (i)    (1) prod := 0
            (2) i := 1
Rule (ii)   (3) t1 := 4 * i
            (4) t2 := a[t1]
            (5) t3 := 4 * i
            (6) t4 := b[t3]
            (7) t5 := t2 * t4
            (8) t6 := prod + t5
            (9) prod := t6
            (10) t7 := i + 1
            (11) i := t7
            (12) if i <= 20 goto (3)
Rule (iii)  (13)

Forming the Basic Blocks


Now that we know the leaders, how do we form
the basic blocks associated with each leader?
The basic block corresponding to a leader consists of the leader,
plus all statements up to but not including the next leader or
up to the end of the program.

Example: Forming the Basic Blocks


Basic Blocks:

B1 (1) prod := 0
   (2) i := 1

B2 (3) t1 := 4 * i
   (4) t2 := a[t1]
   (5) t3 := 4 * i
   (6) t4 := b[t3]
   (7) t5 := t2 * t4
   (8) t6 := prod + t5
   (9) prod := t6
   (10) t7 := i + 1
   (11) i := t7
   (12) if i <= 20 goto (3)

B3 (13)
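
The leader rules and the block-forming step can be sketched as below. The encoding of each statement as a (text, target) pair, the placeholder text for statement (13), and the function names are assumptions for illustration, not the slides' notation.

```python
def find_leaders(code):
    """code[k] = (text, target); target is the 1-based number of the statement
    jumped to, or None when the statement is not a branch."""
    n = len(code)
    leaders = {1}                                   # rule (i)
    for num, (_, target) in enumerate(code, start=1):
        if target is not None:
            leaders.add(target)                     # rule (ii)
            if num + 1 <= n:
                leaders.add(num + 1)                # rule (iii)
    return sorted(leaders)

def basic_blocks(code):
    """Each block runs from a leader up to, but not including, the next leader."""
    leaders = find_leaders(code)
    bounds = leaders + [len(code) + 1]
    return [list(range(bounds[k], bounds[k + 1])) for k in range(len(leaders))]

code = [("prod := 0", None), ("i := 1", None), ("t1 := 4 * i", None),
        ("t2 := a[t1]", None), ("t3 := 4 * i", None), ("t4 := b[t3]", None),
        ("t5 := t2 * t4", None), ("t6 := prod + t5", None), ("prod := t6", None),
        ("t7 := i + 1", None), ("i := t7", None), ("if i <= 20 goto (3)", 3),
        ("<statement after the loop>", None)]       # placeholder for (13)
blocks = basic_blocks(code)
```

On this input the leaders are statements 1, 3 and 13, which gives the three blocks B1, B2 and B3 listed above.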

Control Flow Graph (CFG)


A control flow graph (CFG), or simply a flow graph,
is a directed multigraph in which:
(i) the nodes are basic blocks; and
(ii) the edges represent flow of control
(branches or fall-through execution).

The basic block whose leader is the first intermediate
language statement is called the start node.
In a CFG we have no information about the data.
Therefore an edge in the CFG means that the
program may take that path.

Control Flow Graph (CFG)


There is a directed edge from basic block B1 to basic block
B2 in the CFG if:
(1) There is a branch from the last statement of B1 to the first
statement of B2, or
(2) Control flow can fall through from B1 to B2 because:
(i) B2 immediately follows B1, and
(ii) B1 does not end with an unconditional branch
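
A minimal sketch of the two edge rules, assuming each block records the number of its leader, the kind of its last statement and the branch target. The dictionary keys and the cfg_edges name are hypothetical choices made for this example.

```python
def cfg_edges(blocks):
    """blocks: list of dicts in program order with keys 'name', 'first'
    (statement number of the leader), 'kind' of the last statement
    ('goto', 'if' or 'plain') and 'target' (statement number or None)."""
    by_leader = {b["first"]: b["name"] for b in blocks}
    edges = []
    for k, b in enumerate(blocks):
        if b["kind"] in ("goto", "if"):
            edges.append((b["name"], by_leader[b["target"]]))   # rule (1): branch
        if b["kind"] != "goto" and k + 1 < len(blocks):
            edges.append((b["name"], blocks[k + 1]["name"]))    # rule (2): fall-through
    return edges

blocks = [
    {"name": "B1", "first": 1, "kind": "plain", "target": None},
    {"name": "B2", "first": 3, "kind": "if", "target": 3},
    {"name": "B3", "first": 13, "kind": "plain", "target": None},
]
edges = cfg_edges(blocks)
```

For the inner-product example this yields the fall-through edge B1 -> B2, the back edge B2 -> B2 from the conditional branch, and the fall-through edge B2 -> B3.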

Example: Control Flow Graph Formation

B1 (1) prod := 0
   (2) i := 1

B2 (3) t1 := 4 * i
   (4) t2 := a[t1]
   (5) t3 := 4 * i
   (6) t4 := b[t3]
   (7) t5 := t2 * t4
   (8) t6 := prod + t5
   (9) prod := t6
   (10) t7 := i + 1
   (11) i := t7
   (12) if i <= 20 goto (3)

B3 (13)

Edges:
B1 -> B2   Rule (2), fall-through
B2 -> B2   Rule (1), branch from (12) to (3)
B2 -> B3   Rule (2), fall-through

Code Optimization Techniques


Common subexpression elimination
Example:
a := b + c
c := b + c
d := b + c
=>
a := b + c
c := a
d := b + c
(d := b + c is kept: c has been redefined, so b + c may now have a different value)
Example in array index calculations
c[i+1] := a[i+1] + b[i+1]
During address computation, i+1 should be reused
Not visible in high level code, but in intermediate
code

Local Common Sub-expression Elimination

Local & Global Optimization

Common Subexpression Elimination


Local common subexpression elimination
Performed within basic blocks
Algorithm sketch:
Traverse BB from top to bottom
Maintain table of expressions evaluated so far
if any operand of the expression is redefined, remove it from the table
Modify applicable instructions as you go
generate temporary variable, store the expression in it and use the
variable next time the expression is encountered.

x = a + b
...
y = a + b
=>
t = a + b
x = t
...
y = t
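
The algorithm sketch above might look like this in code. As a simplification, this version reuses the variable that first received the value instead of generating a fresh temporary, and it does not recognise commutative matches such as b + a. The statement format and the name local_cse are assumptions.

```python
def local_cse(block):
    """block: list of (dest, op, a, b) three-address statements of one basic block."""
    table = {}                            # (op, a, b) -> variable holding that value
    out = []
    for dest, op, a, b in block:
        key = (op, a, b)
        if key in table:
            out.append((dest, "copy", table[key], None))   # reuse earlier result
        else:
            out.append((dest, op, a, b))
        # dest is redefined: drop entries that read it or are held in it
        table = {k: v for k, v in table.items()
                 if dest not in (k[1], k[2]) and v != dest}
        if dest not in (a, b):            # e.g. a = j + a does not make j + a available
            table.setdefault(key, dest)
    return out

block = [("c", "+", "a", "b"), ("d", "*", "m", "n"), ("e", "+", "b", "d"),
         ("f", "+", "a", "b"), ("a", "+", "j", "a"), ("k", "*", "m", "n"),
         ("h", "+", "a", "b")]
result = local_cse(block)
```

Here f and k become copies of c and d, while the final a + b is recomputed because a was redefined in between.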


Common Subexpression Elimination


c = a + b
d = m * n
e = b + d
f = a + b
g = - b
h = b + a
a = j + a
k = m * n
j = b + d
a = - b
if m * n go to L
=>
t1 = a + b
c = t1
t2 = m * n
d = t2
t3 = b + d
e = t3
f = t1
g = -b
h = t1 /* commutative */
a = j + a
k = t2
j = t3
a = -b
if t2 go to L

Common Subexpression Elimination


Global common subexpression elimination
Performed on flow graph
Requires available expression information
In addition to finding what expressions are available
at the endpoints of basic blocks, we need to know
where each of those expressions was most recently
evaluated (which block and which position within
that block).

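Common Sub-expression Evaluation

[Figure: flow graph with numbered blocks. Block 1 contains x := a + b, a block between 1 and 4 contains a := b, and block 4 contains z := a + b + 10]

a + b is not a common sub-expression in 1 and 4

None of the variables involved should be modified on any path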

Code Optimization Techniques


Unreachable code elimination
Construct the control flow graph
An unreachable code block cannot be reached from the start node
(in the simplest case it has no incoming edge)
Dead code elimination
Ineffective statements
x := y + 1   (immediately redefined, eliminate!)
y := 5
x := 2 * z
=>
y := 5
x := 2 * z
A variable is dead if it is never used after its last definition
Eliminate assignments to dead variables
Need to do data flow analysis to find dead variables
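
Within a single basic block, dead assignments can be found with one backward pass, provided the set of variables live at the end of the block is already known from data flow analysis. The sketch below assumes side-effect-free statements given as (dest, uses) pairs; the format and the name eliminate_dead are illustrative.

```python
def eliminate_dead(block, live_out):
    """block: list of (dest, uses); live_out: variables live after the block."""
    live = set(live_out)
    kept = []
    for dest, uses in reversed(block):    # walk backwards from the block end
        if dest in live:
            kept.append((dest, uses))
            live.discard(dest)            # this definition satisfies later uses
            live.update(uses)             # its operands are live before it
        # otherwise the assigned value is never read: drop the statement
    kept.reverse()
    return kept

# x := y + 1 ; y := 5 ; x := 2 * z
block = [("x", ["y"]), ("y", []), ("x", ["z"])]
result = eliminate_dead(block, live_out={"x", "y"})
```

The first assignment to x is removed because x is redefined before any use, matching the example above.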

Code Optimization Techniques


Loop optimization
Loops consume most of the execution time (90% by a common rule of thumb)
=> a larger payoff to optimize the code within a loop
Techniques
Loop invariant detection and code motion
Strength reduction in loops
Induction variable elimination

Code Motion
Moving code from one part of the program to another
without modifying the algorithm
Reduce size of the program
Reduce execution frequency of the code subjected to
movement

Code Motion
1. Code space reduction: Similar to common subexpression elimination but with the objective to
reduce code size.
Example: Code hoisting
if (a < b) then
z := x ** 2
else
y := x ** 2 + 10
=>
temp := x ** 2
if (a < b) then
z := temp
else
y := temp + 10

x ** 2 is computed once in both cases, but the code size in the second
case reduces.

Code Motion
Loop invariant detection and code motion
If the result of a statement or expression does not
change within a loop, and it has no external side-effect,
the computation can be moved outside of the loop
Example 1:
for (i=0; i<n; i++) a[i] := a[i] + x/y;
Three address code:
for (i=0; i<n; i++) { c := x/y; a[i] := a[i] + c; }
After code motion:
c := x/y; for (i=0; i<n; i++) a[i] := a[i] + c;
Example 2:
while ( i < (max-2) )
Equivalent to:
t := max - 2
while ( i < t )
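
Example 1 can be written out as two functions to compare behaviour. The function names are hypothetical. The guard in the hoisted version is an addition of this sketch: it keeps the division from executing when the loop body would never have run.

```python
def add_ratio_naive(a, x, y):
    out = list(a)
    for i in range(len(out)):
        out[i] = out[i] + x / y           # x / y recomputed on every iteration
    return out

def add_ratio_hoisted(a, x, y):
    out = list(a)
    if out:                               # the original never divides when n == 0
        c = x / y                         # invariant computed once, before the loop
        for i in range(len(out)):
            out[i] = out[i] + c
    return out

same = add_ratio_naive([1.0, 2.0], 6, 3) == add_ratio_hoisted([1.0, 2.0], 6, 3)
```

Without the guard, hoisting x / y would raise a division error for y = 0 even on an empty array, which is a change in observable behaviour.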


Code Motion
Safety of Code movement
Movement of an expression e from a basic block bi to
another block bj, is safe if it does not introduce any new
occurrence of e along any path.


Code Optimization Techniques


Induction variable elimination
If there are multiple induction variables in a loop, can
eliminate the ones which are used only in the test
condition.
Example
s := 0; for (i=0; i<n; i++) { s := s + 4; }
=>
s := 0; while (s < 4*n) { s := s + 4; }
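
The example can be checked by writing both forms as functions (the names are hypothetical). The counter i is used only in the loop test, so the test is rewritten in terms of s.

```python
def with_counter(n):
    s = 0
    for i in range(n):                    # i is used only to control the loop
        s = s + 4
    return s

def without_counter(n):
    s = 0
    limit = 4 * n                         # loop bound expressed in terms of s
    while s < limit:
        s = s + 4
    return s

agree = all(with_counter(n) == without_counter(n) for n in range(50))
```

Both versions run the body n times; the second simply no longer maintains i.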

Strength Reduction

Replacement of an operator with a less costly one.

Example
s := 0; for (i=0; i<n; i++) { v := 4 * i; s := s + v; }
=>
s := 0; v := 0; for (i=0; i<n; i++) { s := s + v; v := v + 4; }
Typical cases of strength reduction occur in address
calculation of array references.
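
A short sketch of the loop above in both forms, with hypothetical function names. In the reduced version v is initialised before the loop and updated after it is used, so that it equals 4 * i at the point of use.

```python
def sum_multiply(n):
    s = 0
    for i in range(n):
        v = 4 * i                         # one multiplication per iteration
        s = s + v
    return s

def sum_additive(n):
    s = 0
    v = 0                                 # value of 4 * i for i = 0
    for i in range(n):
        s = s + v
        v = v + 4                         # advance to 4 * (i + 1) with an addition
    return s

agree = all(sum_multiply(n) == sum_additive(n) for n in range(20))
```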


Code Optimization Techniques


Loop unrolling
Execute loop body multiple times at each iteration
Get rid of some of the conditional branches
Space-time tradeoff
Increase in code size, reduce some instructions
i=1;
while(i<=100)
{a[i]=0;
++i;
}
=>
i=1;
while(i<=100)
{a[i]=0;
++i;
a[i]=0;
++i;
}
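
The slide's unrolled loop works because 100 is even. For an arbitrary trip count a leftover step is needed; the sketch below, with hypothetical function names and 0-based indexing, shows one way to handle that.

```python
def clear_rolled(a, n):
    i = 0
    while i < n:                          # one loop test per element
        a[i] = 0
        i += 1

def clear_unrolled(a, n):
    i = 0
    while i + 1 < n:                      # one loop test per two elements
        a[i] = 0
        a[i + 1] = 0
        i += 2
    if i < n:                             # leftover element when n is odd
        a[i] = 0

data = [7] * 5
clear_unrolled(data, 5)
```

The unrolled version executes roughly half as many loop tests at the cost of a larger body.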

Loop fusion
Example
for i=1 to N do
A[i] = B[i] + 1
endfor
for i=1 to N do
C[i] = A[i] / 2
endfor
After Loop Fusion
for i=1 to N do
A[i] = B[i] + 1
C[i] = A[i] / 2
endfor
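
The fusion example can be written as two functions to confirm they compute the same arrays. The names and the 0-based indexing are assumptions of this sketch. Fusion is legal here because C[i] reads only the A[i] produced in the same iteration.

```python
def separate_loops(b):
    n = len(b)
    a = [0] * n
    c = [0] * n
    for i in range(n):
        a[i] = b[i] + 1
    for i in range(n):                    # second pass over the same index range
        c[i] = a[i] / 2
    return a, c

def fused_loop(b):
    n = len(b)
    a = [0] * n
    c = [0] * n
    for i in range(n):                    # one pass, one set of loop tests
        a[i] = b[i] + 1
        c[i] = a[i] / 2                   # a[i] was just computed
    return a, c

same = separate_loops([1, 3, 5]) == fused_loop([1, 3, 5])
```

If the second loop read A[i+1] instead, the fused loop would see a value not yet computed and the transformation would not be valid.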

Code Optimization Techniques


Loop jamming
for(i=1;i<=10;++i)
for (j=1;j<=10;++j)
a[i][j]=0;
for(i=1;i<=10;++i)
a[i][i]=1;
=>
for(i=1;i<=10;++i)
for (j=1;j<=10;++j)
{ a[i][j]=0;
a[i][i]=1; }

Strength reduction
Replace expensive operations with cheaper ones
Replace multiply by shift
A := A * 4;
Can be replaced by 2-bit left shift (signed/unsigned)
E.g., x := x * 8 => x := x << 3
But must worry about overflow if the language checks for it
A := A / 4;
If unsigned, can replace with shift right
E.g., x := x / 8 => x := x >> 3
Language may allow it anyway (traditional C)
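
The reason for the "if unsigned" condition can be seen with a small experiment. The helper c_div is a hypothetical function that models integer division truncating toward zero, as C99 defines it for signed operands. Python's own >> is an arithmetic shift that rounds toward negative infinity.

```python
def c_div(x, d):
    """Integer division truncating toward zero (models C99 signed division)."""
    q = abs(x) // abs(d)
    return q if (x >= 0) == (d >= 0) else -q

# Left shift matches multiplication by a power of two for all Python ints.
mul_ok = all((x << 3) == x * 8 for x in range(-50, 50))

# Right shift matches truncating division only for non-negative values.
div_ok_nonneg = all((x >> 2) == c_div(x, 4) for x in range(0, 100))
negative_case = (-7 >> 2, c_div(-7, 4))   # shift rounds down, division truncates
```

For -7 the shift gives -2 while truncating division gives -1, so a compiler has to add a correction step, or know the operand is non-negative, before replacing a signed division with a shift.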

Code Optimization
Peephole and local analysis generally involve
Constant folding
Algebraic simplification, strength reduction
Global analysis generally requires
Control flow graph
Data flow analysis
Loop optimization
