
Lecture 3: Instruction Pipelining

Basic concepts

Pipeline hazards

Branch handling

Branch prediction

What is a Pipeline?

Traditional Pipeline Concept
Laundry example:
Ann, Brian, Cathy, and Dave (A, B, C, D) each have one load of clothes
to wash, dry, and fold

Washer takes 30 minutes

Dryer takes 40 minutes

Folder takes 20 minutes

Traditional Pipeline Concept

[Figure: timeline from 6 PM to midnight. Loads A, B, C, and D run one after another, each taking 30 + 40 + 20 minutes.]

Sequential laundry takes 6 hours for 4 loads.
If they learned pipelining, how long would laundry take?

Traditional Pipeline Concept

[Figure: timeline from 6 PM onward, task order A to D. The four loads overlap; the intervals along the time axis are 30, 40, 40, 40, 40, and 20 minutes.]

Pipelined laundry takes 3.5 hours for 4 loads.

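The 3.5-hour figure can be checked with a small schedule simulation. This is a minimal sketch, assuming a load starts a stage as soon as it has finished the previous stage and that stage's machine is free; the function and variable names are illustrative and not part of the lecture.

```python
# Stage durations in minutes, taken from the laundry example.
STAGES = [("wash", 30), ("dry", 40), ("fold", 20)]
LOADS = ["A", "B", "C", "D"]


def pipelined_schedule(loads, stages):
    """Return {load: [(stage, start, end), ...]} with times in minutes.

    Assumption: a load enters a stage as soon as it has left the
    previous stage and the machine for that stage is idle.
    """
    machine_free_at = [0] * len(stages)  # when each machine is next idle
    schedule = {}
    for load in loads:
        ready_at = 0  # when this load finished its previous stage
        rows = []
        for i, (name, minutes) in enumerate(stages):
            start = max(ready_at, machine_free_at[i])
            end = start + minutes
            machine_free_at[i] = end
            ready_at = end
            rows.append((name, start, end))
        schedule[load] = rows
    return schedule


schedule = pipelined_schedule(LOADS, STAGES)
sequential_minutes = len(LOADS) * sum(minutes for _, minutes in STAGES)
pipelined_minutes = max(end for rows in schedule.values() for _, _, end in rows)

for load, rows in schedule.items():
    print(load, rows)
print("sequential:", sequential_minutes / 60, "hours")
print("pipelined:", pipelined_minutes / 60, "hours")
```

Under these assumptions the script reproduces the two totals quoted on the slides: 6 hours sequentially and 3.5 hours pipelined. Loads B, C, and D each wait for the dryer, which is the slowest stage.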
Traditional Pipeline Concept

Pipelining doesn't help the latency of a single task; it helps the throughput of the entire workload.
Pipeline rate is limited by the slowest pipeline stage.
Multiple tasks operate simultaneously using different resources.
Potential speedup = number of pipe stages.
Unbalanced lengths of pipe stages reduce speedup.
Time to fill the pipeline and time to drain it reduce speedup.
Stall for dependences.
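The effect of unbalanced stages and of fill/drain time on speedup can be put into numbers. The sketch below is an assumption-laden illustration, not part of the lecture: it uses the closed form "time for the first task plus one slowest-stage interval per additional task", which holds when all tasks are identical and may wait between stages.

```python
def sequential_time(stage_times, n_tasks):
    """Every task runs through all stages before the next one starts."""
    return n_tasks * sum(stage_times)


def pipelined_time(stage_times, n_tasks):
    """Fill the pipeline once, then finish one task per slowest-stage interval.

    Assumption: identical tasks that may wait between stages.
    """
    return sum(stage_times) + (n_tasks - 1) * max(stage_times)


def speedup(stage_times, n_tasks):
    return sequential_time(stage_times, n_tasks) / pipelined_time(stage_times, n_tasks)


laundry = [30, 40, 20]    # unbalanced stages from the laundry example
balanced = [30, 30, 30]   # hypothetical balanced variant, same total work

for n in (4, 1000):
    print(n, "loads:",
          "unbalanced", round(speedup(laundry, n), 2),
          "balanced", round(speedup(balanced, n), 2))
```

With 4 loads the laundry example reaches a speedup of about 1.71, well below the potential speedup of 3. Balancing the stages raises that to 2.0, and the remaining gap is the fill and drain time. With many loads the balanced pipeline approaches 3, while the unbalanced one is capped at 90/40 = 2.25 by its slowest stage.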
Pipeline Idea

Use the Idea of Pipelining in a Computer

Fetch + Execution

Figure 8.1. Basic idea of instruction pipelining.
(a) Sequential execution: instructions I1, I2, I3 are fetched and executed one after another (F1 E1, F2 E2, F3 E3).
(b) Hardware organization: an instruction fetch unit and an execution unit, connected by interstage buffer B1.
(c) Pipelined execution: over clock cycles 1 to 4, the fetch of each instruction overlaps the execution of the previous one.

Use the Idea of Pipelining in a Computer

Fetch + Decode + Execution + Write

Basic Concepts
Sequential execution of an N-stage task:
Production time: N time units.
Resource needed: one general-purpose machine.
Productivity: one product per N time units.
Pipelined execution of an N-stage task:
Production time: N time units.
Resource needed: N special-purpose machines.
Productivity: about one product per time unit.
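The word "about" in the productivity figure comes from the time needed to fill the pipeline. A minimal sketch, assuming every stage takes exactly one time unit and nothing ever stalls (the function names are illustrative):

```python
def sequential_total(n_stages, n_products):
    """Total time when each product occupies the single machine for N units."""
    return n_stages * n_products


def pipelined_total(n_stages, n_products):
    """N units for the first product, then one more unit per further product."""
    return n_stages + (n_products - 1)


N = 6  # hypothetical number of stages
for m in (1, 10, 1000):
    total = pipelined_total(N, m)
    print(f"{m} products: sequential {sequential_total(N, m)}, "
          f"pipelined {total}, products per time unit {m / total:.3f}")
```

For a single product the pipeline gives no benefit; as the number of products grows, the rate approaches one product per time unit without ever quite reaching it.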

Instruction Execution Stages
A typical instruction execution sequence:
1. Fetch Instruction (FI): Fetch the instruction.
2. Decode Instruction (DI): Determine the op-code
and the operand specifiers.
3. Calculate Operands (CO): Calculate the effective
addresses.
4. Fetch Operands (FO): Fetch the operands.
5. Execute Instruction (EI): Perform the operation.
6. Write Operand (WO): Store the result in memory.
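How these six stages overlap for consecutive instructions can be drawn as a space-time diagram. The sketch below is illustrative only and assumes that every stage takes one clock cycle and that no hazards occur; `pipeline_diagram` is a hypothetical helper name.

```python
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]


def pipeline_diagram(n_instructions, stages=STAGES):
    """Return one text row per instruction; each column is one clock cycle.

    Assumption: one cycle per stage, no stalls.
    """
    total_cycles = len(stages) + n_instructions - 1
    rows = []
    for i in range(n_instructions):
        cells = ["  "] * total_cycles
        for s, name in enumerate(stages):
            cells[i + s] = name  # instruction i is in stage s during cycle i + s
        rows.append((f"I{i + 1}: " + " ".join(cells)).rstrip())
    return rows


for row in pipeline_diagram(4):
    print(row)
```

Each row is shifted one cycle to the right of the row above it, so in the middle of the run up to six instructions are in flight, one per stage.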

Instruction Pipelining

Typical Instruction Pipelining

Number of Pipeline Stages
In general, a larger number of stages gives better
performance.
However:
A larger number of stages increases the overhead in moving
information between stages and synchronization between
stages.
The complexity of the CPU grows with the number of stages.
It is difficult to keep a large pipeline at maximum rate because
of pipeline hazards.
Intel 80486 and Pentium:
Five-stage pipeline for integer instructions.
Eight-stage pipeline for FP (floating-point) instructions.
IBM PowerPC:
Four-stage pipeline for integer instructions.
Six-stage pipeline for FP instructions.
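The trade-off between more stages and more inter-stage overhead can be illustrated with a simple timing model. This is a sketch under stated assumptions, not a description of any of the processors above: the logic delay is split evenly across the stages, each stage adds a fixed buffer overhead to the clock period, and hazards are ignored. All numbers are hypothetical.

```python
def cycle_time(total_logic_ns, n_stages, overhead_ns):
    """Clock period: an even share of the logic plus the per-stage overhead."""
    return total_logic_ns / n_stages + overhead_ns


def speedup(total_logic_ns, n_stages, overhead_ns, n_instructions):
    """Speedup over an unpipelined unit that needs total_logic_ns per instruction."""
    unpipelined = n_instructions * total_logic_ns
    cycles = n_stages + n_instructions - 1  # fill the pipeline, then one per cycle
    return unpipelined / (cycles * cycle_time(total_logic_ns, n_stages, overhead_ns))


# Hypothetical values: 10 ns of logic, 0.5 ns overhead per stage, 1000 instructions.
for k in (2, 5, 10, 20, 50):
    print(k, "stages:", round(speedup(10.0, k, 0.5, 1000), 2))
```

In this model each doubling of the stage count yields less than a doubling of speedup, because the fixed overhead becomes a larger share of every clock period. The stalls caused by hazards, which the model leaves out, reduce the gain further.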

Pipeline Hazards (Conflicts)
Hazards are situations that prevent the next instruction in the instruction stream from executing during its designated clock cycle. The instruction is said to be stalled.
When an instruction is stalled:
All instructions issued later than the stalled instruction (and so not as far along in the pipeline) are also stalled;
No new instructions are fetched during the stall;
Instructions issued earlier than the stalled one (and so farther along in the pipeline) continue as usual.
Types of hazards:
Structural hazards
Data hazards
Control hazards
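The stall rules above can be traced with a small model of an in-order pipeline. This is a minimal sketch under assumptions: one cycle per stage, an instruction holds its current stage while it waits, and a stall is injected by hand through the hypothetical `stalls` argument rather than detected from real hazards.

```python
def stage_starts(n_instructions, n_stages, stalls=None):
    """Return start[i][s]: the cycle in which instruction i enters stage s.

    stalls maps (instruction index, stage index) to the number of extra
    cycles the instruction waits before entering that stage.
    """
    stalls = stalls or {}
    last = n_stages - 1
    start = []
    for i in range(n_instructions):
        row = []
        for s in range(n_stages):
            earliest = 0 if s == 0 else row[s - 1] + 1
            earliest += stalls.get((i, s), 0)
            if i > 0:
                prev = start[i - 1]
                # Stage s is free once the previous instruction has moved on.
                freed = prev[s + 1] if s < last else prev[s] + 1
                earliest = max(earliest, freed)
            row.append(earliest)
        start.append(row)
    return start


no_stall = stage_starts(4, 4)
# The second instruction waits 2 extra cycles before entering the third stage.
with_stall = stage_starts(4, 4, stalls={(1, 2): 2})

for before, after in zip(no_stall, with_stall):
    print(before, "->", after)
```

In this run the first instruction is unaffected, the stalled instruction and the two behind it finish two cycles late, and the fetch of the fourth instruction is held back until the stall has cleared.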
Structural (Resource) Hazards
Hardware conflicts caused by the use of the same
hardware resource at the same time (e.g., memory
conflicts).

Penalty: 1 cycle (NOTE: the performance lost is multiplied by the number of stages).

Structural Hazard Solutions
In general, the hardware resources in conflict are
duplicated in order to avoid structural hazards.
Functional units (ALU, FP unit) can also be pipelined
themselves to support several instructions at the same
time.
Memory conflicts can be solved by:
having two separate caches, one for instructions and the other
for operands (Harvard architecture);
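The benefit of separate instruction and operand caches can be shown by counting port conflicts in an idealized schedule. The sketch below rests on several assumptions that are not in the lecture: the six-stage FI/DI/CO/FO/EI/WO pipeline issues one instruction per cycle, every instruction reads memory in FI, only instructions labelled "load" read memory in FO, results go to registers, and each memory or cache has a single port.

```python
from collections import Counter

STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]


def memory_conflicts(program, split_caches):
    """Count cycles in which two instructions need the same memory port.

    program: list of "load" or "alu" (hypothetical instruction kinds).
    """
    fetch_port = "icache" if split_caches else "memory"
    operand_port = "dcache" if split_caches else "memory"
    uses = Counter()  # (cycle, port) -> number of accesses in that cycle
    for i, kind in enumerate(program):
        uses[(i + STAGES.index("FI"), fetch_port)] += 1
        if kind == "load":
            uses[(i + STAGES.index("FO"), operand_port)] += 1
    return sum(1 for count in uses.values() if count > 1)


program = ["load", "alu", "alu", "alu", "load", "alu", "alu", "alu"]
print("single memory:", memory_conflicts(program, split_caches=False))
print("split caches: ", memory_conflicts(program, split_caches=True))
```

With a single memory, each load's operand fetch collides with the instruction fetch of the instruction issued three cycles later, so one of the two must stall. With split caches the two accesses use different ports and the conflict disappears in this model.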
