[Figure: data path with separate instruction and data caches. The PC addresses the instruction cache; the instruction fields (op, fn, rs, rt, rd, imm, jta) feed the register file, a 16-to-32-bit sign extender (SE), the ALU, and the data cache, while a next-address block produces the next PC. Control signals: Br&Jump, RegDst, RegWrite, ALUSrc, ALUFunc, DataRead, RegInSrc, DataWrite.]
[Figure: data path with a single cache and intermediate registers (Inst Reg, Data Reg, x Reg, y Reg, z Reg). The x Mux and y Mux select the ALU inputs; the next PC is chosen among jta, SysCallAddr, and the ALU result. Control signals: InstData, PCWrite, MemWrite, MemRead, IRWrite, RegInSrc, RegDst, RegWrite, ALUSrcX, ALUSrcY, ALUFunc, PCSrc, JumpAddr. Status outputs: ALUZero, ALUOvfl.]
Pipelining Concepts
Strategies for improving performance
1. Use multiple independent data paths accepting several instructions that are read out at once: multiple-instruction-issue or superscalar
2. Overlap execution of several instructions, starting the next instruction before the previous one has run to completion: (super)pipelined
[Figure: pipelining analogy. People pass from "Start here" to "Exit" through a sequence of stations: Approval, Cashier, Registrar, ID photo, Pickup.]
[Figure: Instr 1 to Instr 5 (task dimension) against Cycle 1 to Cycle 9 (time dimension). Each instruction passes through Instr cache, Reg file, ALU, Data cache, and Reg file, starting one cycle after its predecessor.]
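[Figure: two representations of a five-stage pipeline over cycles 1 to 11.
(a) Task-time diagram: instructions against cycles.
(b) Space-time diagram: pipeline stages 1 to 5 against cycles, with a start-up region at the beginning and a drainage region at the end.
Legend: f = Fetch, r = Reg read, a = ALU op, d = Data access, w = Writeback]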
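The task-time diagram can be reproduced with a few lines of code. This is a minimal sketch assuming an ideal pipeline: one instruction enters per cycle and nothing stalls. The function names `task_time_diagram` and `total_cycles` are illustrative and do not come from the slides. Under these assumptions, n instructions on a q-stage pipeline finish in q + n - 1 cycles.

```python
# Stage letters follow the legend: fetch, reg read, ALU op, data access, writeback.
STAGES = ["f", "r", "a", "d", "w"]


def total_cycles(num_instructions, num_stages=len(STAGES)):
    """Cycles needed by an ideal pipeline with no stalls."""
    return num_stages + num_instructions - 1


def task_time_diagram(num_instructions, stages=STAGES):
    """Return one string per instruction; character c is the stage
    occupied in cycle c + 1, or '.' if the instruction is not in the pipeline."""
    width = total_cycles(num_instructions, len(stages))
    rows = []
    for i in range(num_instructions):
        row = ["."] * width
        for s, name in enumerate(stages):
            row[i + s] = name  # instruction i enters stage s in cycle i + s + 1
        rows.append("".join(row))
    return rows


if __name__ == "__main__":
    for number, row in enumerate(task_time_diagram(5), start=1):
        print(f"Instr {number}: {row}")
```

For five instructions this gives nine cycles, and for seven instructions eleven cycles. Reading the rows column by column gives the space-time view: the first and last columns, where fewer than five stages are busy, correspond to the start-up and drainage regions.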
Dependency
Data Dependency
Read after write
Read after load
Control Dependency
Data Dependency
First type of data dependency
$5 = $6 + $7
$8 = $8 + $6
$9 = $8 + $2
sw $9, 0($3)
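The read-after-write dependencies in this sequence can be found mechanically. The sketch below encodes each instruction as a destination register plus its source registers (an encoding chosen here for illustration, not taken from the slides) and reports every case where an instruction reads a register written by an earlier one.

```python
# Each entry: (destination register or None, source registers).
program = [
    ("$5", ("$6", "$7")),  # $5 = $6 + $7
    ("$8", ("$8", "$6")),  # $8 = $8 + $6
    ("$9", ("$8", "$2")),  # $9 = $8 + $2
    (None, ("$9", "$3")),  # sw $9, 0($3): reads $9 and $3, writes no register
]


def raw_dependencies(instructions):
    """Return (producer_index, consumer_index, register) triples for
    each read of a register whose latest writer is an earlier instruction."""
    deps = []
    last_writer = {}
    for i, (dest, sources) in enumerate(instructions):
        for reg in sources:
            if reg in last_writer:
                deps.append((last_writer[reg], i, reg))
        # Record the write only after checking the reads, so that
        # "$8 = $8 + $6" does not count as depending on itself.
        if dest is not None:
            last_writer[dest] = i
    return deps


if __name__ == "__main__":
    for producer, consumer, reg in raw_dependencies(program):
        print(f"instruction {consumer + 1} reads {reg} written by instruction {producer + 1}")
```

Two dependencies are reported: the third instruction reads $8 written by the second, and the store reads $9 written by the third. Both are between adjacent instructions, which is the case a pipeline must handle by forwarding or stalling.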
[Figure: the four instructions above in the five-stage pipeline (Instr cache, Reg file, ALU, Data cache, Reg file) over cycles 1 to 8, with data forwarding marked between dependent instructions.]
[Figure: Instr 1 to Instr 5 (task dimension) against Cycle 1 to Cycle 9 (time dimension). One instruction writes into $8 and a later instruction reads from $8; bubbles are inserted in the pipeline between them.]
[Figure: read-after-load dependency in the five-stage pipeline (Instr mem, Reg file, ALU, Data mem, Reg file), cycles 2 to 8. Instructions shown: sw $6, . . . ; lw $8, . . . ; $9 = $8 + $2. Options marked on the figure: Insert bubble? Reorder?]
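The two options for a read-after-load dependency, inserting a bubble or reordering, can be compared with a small sketch. It assumes, beyond what the slide states, that forwarding is available and that exactly one bubble is needed when an instruction reads a register loaded by the instruction immediately before it. The address operands are elided on the slide and are left out here. The instruction order used below (sw, lw, add) is also an assumption about the figure, and moving the sw after the lw is legal only if the two do not access the same memory location.

```python
# Each entry: (opcode, destination register or None, source registers).
# Address operands are omitted because the slide elides them.
original = [
    ("sw", None, ("$6",)),        # sw $6, . . .
    ("lw", "$8", ()),             # lw $8, . . .
    ("add", "$9", ("$8", "$2")),  # $9 = $8 + $2
]

# Reordered variant: the store is moved between the load and its consumer.
reordered = [original[1], original[0], original[2]]


def load_use_bubbles(instructions):
    """Count bubbles under the stated assumption: one bubble whenever an
    instruction reads the register loaded by its immediate predecessor."""
    bubbles = 0
    for prev, curr in zip(instructions, instructions[1:]):
        prev_op, prev_dest, _ = prev
        _, _, curr_sources = curr
        if prev_op == "lw" and prev_dest in curr_sources:
            bubbles += 1
    return bubbles


if __name__ == "__main__":
    print("original order:", load_use_bubbles(original), "bubble(s)")
    print("reordered:", load_use_bubbles(reordered), "bubble(s)")
```

Under these assumptions the original order needs one bubble and the reordered sequence needs none, because the store fills the slot that would otherwise be wasted.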
[Figure: control dependency in the five-stage pipeline (Instr mem, Reg file, ALU, Data mem, Reg file), cycles 2 to 8. Instructions shown include $6 = $3 + $5 and $9 = $8 + $2, with a point marked "Assume branch resolved here". Options marked on the figure: Insert bubble? Reorder? (delayed branch)]
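[Figure: a function unit with latency t divided into q pipeline stages (Stage 1, Stage 2, Stage 3, . . . , Stage q-1, Stage q), each with latency t/q.]
[Figure: throughput improvement factor against the number q of pipeline stages (1 to 8), with curves for τ/t = 0 (ideal), τ/t = 0.05, and τ/t = 0.1, where τ is the per-stage overhead.]
Throughput improvement factor = t / (t/q + τ), or q / (1 + qτ/t)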
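The curves in the throughput plot can be regenerated numerically. The sketch below assumes the plotted quantity is q / (1 + q × r), where r is the ratio of per-stage overhead to the unpipelined latency t; the overhead symbol is missing from the extracted text, so the name `overhead_ratio` and the function `throughput_improvement` are assumptions made for illustration.

```python
def throughput_improvement(q, overhead_ratio):
    """Throughput gain of a q-stage pipeline over the unpipelined unit.

    overhead_ratio is the per-stage overhead divided by the unpipelined
    latency t. Each stage then takes t/q plus the overhead, so the gain is
    t / (t/q + overhead) = q / (1 + q * overhead_ratio).
    """
    if q < 1:
        raise ValueError("q must be at least 1")
    return q / (1 + q * overhead_ratio)


if __name__ == "__main__":
    for ratio in (0.0, 0.05, 0.1):
        gains = [round(throughput_improvement(q, ratio), 2) for q in range(1, 9)]
        print(f"overhead ratio {ratio}: {gains}")
```

With zero overhead the gain equals q, the ideal line. At q = 8 the gain drops to about 5.71 for a ratio of 0.05 and about 4.44 for a ratio of 0.1, which shows why adding stages gives diminishing returns once the overhead is a noticeable fraction of the stage delay.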
[Figure: pipelined data path divided into Stages 1 to 5. Components: PC with incrementer (Incr, IncrPC) and next-address logic (Next addr, NextPC), instruction cache, register file (rs, rt), sign extender (imm SE), ALU, data cache. Control signals: SeqInst, Br&Jump, RegDst, RegWrite, ALUSrc, ALUFunc, DataRead, RetAddr, DataWrite, RegInSrc. Status output: ALUOvfl.]
Pipelined Control
[Figure: the five-stage pipelined data path (Stages 1 to 5) shown with the op and fn fields and the control signals SeqInst, Br&Jump, RegDst, RegWrite, ALUSrc, ALUFunc, DataRead, RetAddr, DataWrite, RegInSrc.]