Lecture 8
The Pipeline
Dan Tudose
Facultatea de Automatică și Calculatoare
Universitatea Politehnica București
20.02.2013
http://xkcd.com/619/
25-Apr-13
Calculatoare Numerice
slide 2
[Figure: single-cycle datapath. Blocks: PC, Add (PC + 0x4), Inst. Memory, GPRs, Imm Select, ALU, ALU Control, Br Logic (Bcomp?), Data Memory. Control signals derived from the OpCode: RegWriteEn, MemWrite, WBSel, WASel, ImmSel, FuncSel, Op2Sel.]
Opcode    ImmSel    Op2Sel  FuncSel  MemWr  RFWen  WBSel  WASel  PCSel
ALU       *         Reg     Func     no     yes    ALU    rd     pc+4
ALUi      IType12   Imm     Op       no     yes    ALU    rd     pc+4
LW        IType12   Imm     +        no     yes    Mem    rd     pc+4
SW        BsType12  Imm     +        yes    no     *      *      pc+4
BEQtrue   BrType12  *       *        no     no     *      *      br
BEQfalse  BrType12  *       *        no     no     *      *      pc+4
J         *         *       *        no     no     *      *      jabs
JAL       *         *       *        no     yes    PC     X1     jabs
JALR      *         *       *        no     yes    PC     rd     rind
[Figure: the datapath (PC, Inst. Memory, IR, GPRs, Imm Select, ALU, Data Memory) divided into a fetch phase, an execute phase, a memory phase, and a write-back phase.]
$$\frac{\text{Time}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Time}}{\text{Cycle}}$$
Instructions per program depends on the source code, the compiler, and the ISA
Cycles per instruction (CPI) depends on the ISA and the microarchitecture
Time per cycle depends on the microarchitecture and the underlying technology in which the processor is implemented

Microarchitecture          CPI   cycle time
Microcoded                 >1    short
Single-cycle unpipelined   1     long
Pipelined                  1     short
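As a rough illustration of the performance equation, the Python sketch below multiplies the three factors for the three microarchitecture styles. The CPI and cycle-time values are hypothetical and were chosen only to mirror the qualitative entries (>1 / 1, short / long); they do not come from the slides.

```python
def execution_time(instructions: int, cpi: float, cycle_time_ns: float) -> float:
    """Time/Program = Instructions/Program * Cycles/Instruction * Time/Cycle (ns)."""
    return instructions * cpi * cycle_time_ns


# Hypothetical design points (assumed numbers, not taken from the lecture).
DESIGNS = {
    "microcoded": {"cpi": 4.0, "cycle_time_ns": 1.0},                # CPI > 1, short cycle
    "single-cycle unpipelined": {"cpi": 1.0, "cycle_time_ns": 5.0},  # CPI = 1, long cycle
    "pipelined": {"cpi": 1.0, "cycle_time_ns": 1.0},                 # CPI = 1, short cycle
}


def compare(instructions: int) -> dict:
    """Execution time in ns of the same program on each hypothetical design."""
    return {
        name: execution_time(instructions, d["cpi"], d["cycle_time_ns"])
        for name, d in DESIGNS.items()
    }


if __name__ == "__main__":
    for name, t_ns in compare(1_000_000).items():
        print(f"{name}: {t_ns / 1e6:.1f} ms")
```

With these assumed numbers the pipelined design wins because it keeps the short cycle of the microcoded machine and the CPI of the single-cycle machine.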
Pipelining concepts
Strategies for improving performance
1 Use independent datapaths that can accept several instructions read in parallel: multiple-instruction-issue or superscalar
2 Overlap the execution of different instructions by starting the next instruction before the current one has finished: (super)pipelined
[Figure: a five-station service line from "Start here" to "Exit": 1 Approval, 2 Cashier, 3 Registrar, 4 ID photo, 5 Pickup.]
[Figure: task-time diagram. Instr 1 to Instr 5 each pass through Instr cache, Reg file, ALU, Data cache, Reg file, one cycle apart, over Cycle 1 to Cycle 9. Axes: time dimension, task dimension.]
[Figure: (a) Task-time diagram, plotted by instruction and by pipeline stage against the cycle number, with a start-up region and a drainage region. Legend: f = Fetch, r = Reg read, a = ALU op, d = Data access, w = Writeback.]
CPI examples
Microcoded machine: Inst 1 (7 cycles), Inst 2 (5 cycles), Inst 3 (10 cycles)
Unpipelined machine: Inst 1, Inst 2, Inst 3, one after another
Pipelined machine: Inst 1, Inst 2, Inst 3, overlapped in time
[Figure: the three timelines drawn against a common time axis.]
[Figure: five-stage pipelined datapath: I-Fetch (IF) with PC and Inst. Memory, decode with IR, GPRs and Imm Select, execute with the ALU, Memory (MA) with Data Memory, Write-Back (WB).]

time          t0   t1   t2   t3   t4   t5   t6   t7   ....
instruction1  IF1  ID1  EX1  MA1  WB1
instruction2       IF2  ID2  EX2  MA2  WB2
instruction3            IF3  ID3  EX3  MA3  WB3
instruction4                 IF4  ID4  EX4  MA4  WB4
instruction5                      IF5  ID5  EX5  MA5  WB5
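The ideal timing table can be generated mechanically: instruction i enters stage s at time i + s. The Python sketch below is only an illustration of that rule; the function names are made up for this example.

```python
STAGES = ["IF", "ID", "EX", "MA", "WB"]


def ideal_schedule(n_instructions: int) -> list:
    """One row per instruction; column c holds the stage occupied at time t_c ('' if none)."""
    n_cycles = n_instructions + len(STAGES) - 1
    rows = []
    for i in range(n_instructions):
        row = [""] * n_cycles
        for s, stage in enumerate(STAGES):
            row[i + s] = f"{stage}{i + 1}"  # instruction i+1 is in stage s at time i+s
        rows.append(row)
    return rows


def render(rows: list) -> str:
    """Format the schedule as an aligned plain-text table."""
    header = "".join(f"{'t' + str(c):<5}" for c in range(len(rows[0])))
    body = ["".join(f"{cell:<5}" for cell in row) for row in rows]
    return "\n".join([header] + body)


if __name__ == "__main__":
    print(render(ideal_schedule(5)))
```

Five instructions finish in 5 + 4 = 9 cycles, compared with 25 cycles if each instruction had to complete all five stages before the next one started.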
Resources
[Figure: the pipelined datapath and its resources: PC, Add, Inst. Memory, IR, GPRs, Imm Select, ALU, Data Memory, Write-Back (WB).]
Pipelined execution: ALU instructions
[Figure: pipelined datapath with pipeline registers IR, A, Y, MD1, and MD2 between the stages.]
Not quite correct!
We need an Instruction Register (IR) for each stage.
[Figure: pipelined datapath with one IR per stage and the control signals WASel, RegWriteEn, FuncSel, Op2Sel, ImmSel, WBSel, and MemWrite.]
The control points need to be connected.
data hazard
The dependence may be on the address of the next instruction:
control hazard (branches, exceptions)
Data hazards
...
x1 <- x0 + 10
x4 <- x1 + 17
...
[Figure: pipelined datapath with the two instructions in adjacent stages; the second instruction reads x1 before the first has written it back.]
Strategy 1:
Wait for the result to become available by freezing all the earlier pipeline stages: interlocking
[Figure: pipeline stages 1 to 4 with the signals FB1 to FB4.]
[Figure: task-time diagram over Cycle 1 to Cycle 8 for the sequence below, with data forwarding between dependent instructions.]
$5 = $6 + $7
$8 = $8 + $6
$9 = $8 + $2
sw $9, 0($3)
[Figure: two task-time diagrams for Instr 1 to Instr 5 over Cycle 1 to Cycle 9. One instruction writes into $8 and a later one reads from $8; bubbles are inserted so that the read follows the write.]
[Figure: task-time diagram over Cycle 2 to Cycle 8 for the sequence below. Annotations: "Reorder?", "Insert bubble?".]
sw $6, . . .
lw $8, . . .
$9 = $8 + $2
[Figure: pipelined datapath in which a bubble is inserted into the pipeline for the sequence below.]
...
x1 <- x0 + 10
x4 <- x1 + 17
...
Resource usage
time  t0  t1  t2  t3  t4  t5  t6  t7  ....
IF    I1  I2  I3  I3  I3  I3  I4  I5
ID        I1  I2  I2  I2  I2  I3  I4  I5
EX            I1  -   -   -   I2  I3  I4  I5
MA                I1  -   -   -   I2  I3  I4  I5
WB                    I1  -   -   -   I2  I3  I4  I5
- = pipeline bubble
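The resource-usage view is the per-instruction timing table read stage by stage instead of instruction by instruction. The Python sketch below performs that transposition for the stalled example; the dictionary layout and function name are assumptions made for illustration, and every empty slot is printed as "-", including those before the pipeline fills.

```python
STAGES = ["IF", "ID", "EX", "MA", "WB"]


def resource_usage(schedule: dict) -> dict:
    """Map each stage to the instruction occupying it at every time step ('-' if empty)."""
    n_cycles = max(len(row) for row in schedule.values())
    usage = {stage: ["-"] * n_cycles for stage in STAGES}
    for instr, row in schedule.items():
        for t, stage in enumerate(row):
            if stage:  # '' means the instruction is not in the pipeline at time t
                usage[stage][t] = instr
    return usage


# I2 waits in ID, and I3 in IF, from t2 to t5 until I1 has written back.
SCHEDULE = {
    "I1": ["IF", "ID", "EX", "MA", "WB"],
    "I2": ["", "IF", "ID", "ID", "ID", "ID", "EX", "MA", "WB"],
    "I3": ["", "", "IF", "IF", "IF", "IF", "ID", "EX", "MA", "WB"],
}

if __name__ == "__main__":
    for stage, row in resource_usage(SCHEDULE).items():
        print(stage, " ".join(f"{cell:<2}" for cell in row))
```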
[Figure: pipelined datapath with stall logic Cstall, which compares rs1 and rs2 of the instruction in decode against the write address (wa) of the later stages and inserts a bubble.]
[Figure: pipelined datapath in which Cstall takes rs1 and rs2, the read enables re1 and re2 from Cre, and the wa and we values produced by Cdest for each later stage.]
Must we always stall when an rs address equals an rd address?
not every instruction writes a register -> we
not every instruction reads a register -> re
Instruction format fields (as labelled on the slide):
ALU:           rd, rs1, rs2, func10, opcode
ALUI/LW/JALR:  rd, rs1, Imm[11:0], func3, opcode
SW/Bcond:      Imm[11:7], rs1, rs2, Imm[6:0], func3, opcode
Jump:          Jump offset[24:0], opcode

Instruction     Operation                                    source(s)  destination
ALU             rd <- rs1 func10 rs2                         rs1, rs2   rd
ALUI            rd <- rs1 op imm                             rs1        rd
LW              rd <- M[rs1 + imm]                           rs1        rd
SW              M[rs1 + imm] <- rs2                          rs1, rs2   -
Bcond rs1,rs2   true: PC <- PC + imm; false: PC <- PC + 4    rs1, rs2   -
J               PC <- PC + imm                               -          -
JAL             x1 <- PC, PC <- PC + imm                     -          x1
JALR            rd <- PC, PC <- rs1 + imm                    rs1        rd
Cdest
ws = Case opcode
  JAL  -> X1
  else -> rd
we = Case opcode
  ALU, ALUi, LW, JALR -> (ws != 0)
  JAL                 -> on
  ...                 -> off
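The Cdest case statements translate almost directly into code. A minimal Python sketch follows, assuming opcodes are passed as the mnemonic strings used on the slide and registers as integer indices; the function name is made up for this example.

```python
from typing import Tuple

X1 = 1  # index of the link register written by JAL


def c_dest(opcode: str, rd: int) -> Tuple[int, bool]:
    """Return (ws, we): the destination register index and its write enable."""
    # ws = Case opcode: JAL -> X1, else -> rd
    ws = X1 if opcode == "JAL" else rd

    # we = Case opcode
    if opcode in ("ALU", "ALUi", "LW", "JALR"):
        we = ws != 0          # a write to register 0 is not treated as a real write
    elif opcode == "JAL":
        we = True             # on
    else:
        we = False            # off (SW, Bcond, J, ...)
    return ws, we


if __name__ == "__main__":
    for op, rd in [("ALU", 4), ("ALU", 0), ("JAL", 9), ("SW", 4)]:
        print(op, rd, c_dest(op, rd))
```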
Cstall
Stall condition
...
M[x1+7] <- x2
x4 <- M[x3+5]
...
[Figure: pipelined datapath with the store followed by the load above.]
Strategy 2:
Route the data to the earlier pipeline stages as soon as it has been computed: bypass
Bypassing
time                 t0   t1   t2   t3   t4   t5   t6   t7   ....
(I1) x1 <- x0 + 10   IF1  ID1  EX1  MA1  WB1
(I2) x4 <- x1 + 17        IF2  ID2  ID2  ID2  ID2
(I3)                           IF3  IF3  IF3  IF3
(I4)
(I5)
The repeated ID2 and IF3 entries are stalled stages.

With the bypass, no stall is needed:
time   t1   t2   t3   t4   t5   t6   t7   ....
(I1)   IF1  ID1  EX1  MA1  WB1
(I2)        IF2  ID2  EX2  MA2  WB2
(I3)             IF3  ID3  EX3  MA3  WB3
(I4)                  IF4  ID4  EX4  MA4  WB4
(I5)                       IF5  ID5  EX5  MA5  WB5
[Figure: pipelined datapath with a bypass path from the ALU output in the execute stage (E) back to ALU input A, selected by ASrc. (I1) x1 <- ... is in execute while (I2) x4 <- x1 ... is in decode.]
Annotation on the slide: JAL 500 followed by x4 <- x1 + 17: no.
The bypass signal
Deriving the signal from the stall signal
stall = ( ((rs1D = wsE).weE + (rs1D = wsM).weM + (rs1D = wsW).weW).re1D
        + ((rs2D = wsE).weE + (rs2D = wsM).weM + (rs2D = wsW).weW).re2D )
ws = Case opcode
  JAL  -> X1
  else -> rd
ASrc = (rs1D = wsE).weE.re1D
we = Case opcode
  ALU, ALUi, LW, JALR -> (ws != 0)
  JAL                 -> on
  ...                 -> off
Is this correct?
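The stall and ASrc equations use "." for AND and "+" for OR. The Python sketch below transcribes them as written, assuming each later stage (E, M, W) is summarised by its ws and we values; the Dest class and the function names are hypothetical. It does not try to answer the question of whether ASrc is correct for every instruction type.

```python
from dataclasses import dataclass


@dataclass
class Dest:
    """Destination info of the instruction in one later stage."""
    ws: int    # destination register index
    we: bool   # write enable


def stall(rs1: int, rs2: int, re1: bool, re2: bool, e: Dest, m: Dest, w: Dest) -> bool:
    """stall = (any match on rs1).re1D + (any match on rs2).re2D"""
    later = (e, m, w)
    hit1 = any(rs1 == d.ws and d.we for d in later)
    hit2 = any(rs2 == d.ws and d.we for d in later)
    return (hit1 and re1) or (hit2 and re2)


def a_src(rs1: int, re1: bool, e: Dest) -> bool:
    """ASrc = (rs1D = wsE).weE.re1D : take ALU input A from the bypass path."""
    return rs1 == e.ws and e.we and re1


if __name__ == "__main__":
    idle = Dest(ws=0, we=False)
    in_execute = Dest(ws=1, we=True)   # x1 <- x0 + 10 is in the execute stage
    # x4 <- x1 + 17 is in decode: it reads rs1 = x1 and has no rs2 read.
    print("stall:", stall(1, 0, True, False, in_execute, idle, idle))
    print("ASrc: ", a_src(1, True, in_execute))
```

For this pair of instructions both signals are asserted, which is exactly the case the bypass is meant to take over from the interlock.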
[Figure: fully bypassed datapath with ASrc and BSrc MUXes selecting the ALU inputs; the stall and bubble logic is still present.]
Is a stall signal still needed?
[Figure: the sequence Inst 1, Inst 2, Inst 3 shown with no bubble, with one bubble, and with two bubbles (Bubble 1, Bubble 2) inserted.]
[Figure: fully bypassed datapath with stall, bubble, ASrc, BSrc, and a Guess_zero input.]
Control hazards
What do we need in order to compute the next PC?
For jumps
Opcode, PC, and offset
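A minimal Python sketch of the next-PC choice, following the register-transfer semantics listed earlier in the deck (J and JAL: PC + imm, JALR: rs1 + imm, taken Bcond: PC + imm, everything else: PC + 4). The function name and argument names are assumptions for illustration, and the values in the usage loop are arbitrary.

```python
def next_pc(opcode: str, pc: int, imm: int = 0, rs1_value: int = 0, taken: bool = False) -> int:
    """Return the address of the next instruction to fetch."""
    if opcode in ("J", "JAL"):
        return pc + imm              # jump: needs opcode, PC, and offset
    if opcode == "JALR":
        return rs1_value + imm       # indirect jump: also needs a register value
    if opcode == "Bcond" and taken:
        return pc + imm              # taken branch: also needs the comparison result
    return pc + 4                    # sequential execution


if __name__ == "__main__":
    for args in [("ALU", 96), ("J", 100, 204), ("Bcond", 100, 200, 0, True), ("JALR", 100, 8, 500)]:
        print(args, "->", next_pc(*args))
```

The later an input becomes available in the pipeline (offset in decode, register value and branch outcome in execute), the more already-fetched instructions may have to be discarded.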
[Figure: task-time diagram over Cycle 2 to Cycle 8 that includes $6 = $3 + $5 and $9 = $8 + $2. Annotations: "Assume branch resolved here", "Insert bubble?", "Reorder? (delayed branch)".]
Bubbles in the PC calculation
time                 t0   t1   t2   t3   t4   t5   t6   t7   ....
(I1) x1 <- x0 + 10   IF1  ID1  EX1  MA1  WB1
(I2) x3 <- x2 + 17        IF2  IF2  ID2  EX2  MA2  WB2
(I3)                                IF3  IF3  ID3  EX3  MA3  WB3
(I4)                                          IF4  IF4  ID4  EX4  MA4  WB4

Resource usage
time  t0  t1  t2  t3  t4  t5  t6  t7  ....
IF    I1  -   I2  -   I3  -   I4
ID        I1  -   I2  -   I3  -   I4
EX            I1  -   I2  -   I3  -   I4
MA                I1  -   I2  -   I3  -   I4
WB                    I1  -   I2  -   I3  -   I4
- = pipeline bubble
[Figure: front end of the pipelined datapath with "Jump?" logic. With PC = 104, I2 (the jump) is in decode and I1 is in execute; the instruction fetched after the jump is killed.]
I1  096  ADD
I2  100  J 304
I3  104  ADD   (kill)
I4  304  ADD
To kill an instruction that has already been fetched, insert a MUX before the IR.
[Figure: front end of the datapath with a MUX, controlled by IRSrcD, that loads a bubble into the decode-stage IR in place of the fetched instruction while the PC is redirected to 304.]
Is there any interaction between stall and jump?
time             t0   t1   t2   t3   t4   t5   t6   t7   ....
(I1) 096: ADD    IF1  ID1  EX1  MA1  WB1
(I2) 100: J 304       IF2  ID2  EX2  MA2  WB2
(I3) 104: ADD              IF3  (killed)
(I4) 304: ADD                   IF4  ID4  EX4  MA4  WB4

Resource usage
time  t0  t1  t2  t3  t4  t5  t6  t7  ....
IF    I1  I2  I3  I4  I5
ID        I1  I2  -   I4  I5
EX            I1  I2  -   I4  I5
MA                I1  I2  -   I4  I5
WB                    I1  I2  -   I4  I5
- = pipeline bubble
[Figure: front end of the datapath with "BEQ?" logic in decode and the "Taken?" result coming from the ALU. With PC = 104, I2 (the branch) is in decode and I1 is in execute.]
I1  096  ADD
I2  100  BEQ x1,x2 +200
I3  104  ADD
I4  304  ADD
[Figure: the same datapath one cycle later (PC = 108): the Bcond (I2) is in execute and produces "Taken?", I3 is in decode, and the stall signal is marked with "?".]
[Figure: datapath with PCSrc selecting the next PC (pc+4 / jabs / rind / br) and with MUXes IRSrcD and IRSrcE that can load a bubble into the decode-stage and execute-stage IRs.]
time                t0   t1   t2   t3   t4   t5   t6   t7   ....
(I1) 096: ADD       IF1  ID1  EX1  MA1  WB1
(I2) 100: BEQ +200       IF2  ID2  EX2  MA2  WB2
(I3) 104: ADD                 IF3  ID3  (killed)
(I4) 108:                          IF4  (killed)
(I5) 304: ADD                           IF5  ID5  EX5  MA5  WB5

Resource usage
time  t0  t1  t2  t3  t4  t5  t6  t7  ....
IF    I1  I2  I3  I4  I5
ID        I1  I2  I3  -   I5
EX            I1  I2  -   -   I5
MA                I1  I2  -   -   I5
WB                    I1  I2  -   -   I5
- = pipeline bubble
[Figure: a function unit with latency t divided into q pipeline stages (Stage 1, Stage 2, Stage 3, ..., Stage q-1, Stage q), each with latency t/q, with latching of results between stages.]
[Figure: plot against the number q of pipeline stages, with curves for τ/t = 0 (ideal), τ/t = 0.05, and τ/t = 0.1.]
$$\frac{t}{t/q + \tau} \quad\text{or}\quad \frac{q}{1 + q\tau/t}$$
$$\frac{q}{(1 + q\tau/t)(1 + \beta)}$$
Example instruction mix:
R-type  44%
Load    24%
Store   12%
Branch  18%
Jump     2%
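The throughput expression can be evaluated numerically to see how the latching overhead limits the gain from deeper pipelines. In the Python sketch below, tau_over_t stands for the ratio τ/t from the plot, and beta is assumed to be the fraction of instructions followed by a bubble; both names, and the beta value in the usage note, are assumptions made for this example.

```python
def throughput_improvement(q: int, tau_over_t: float, beta: float = 0.0) -> float:
    """q / ((1 + q * tau/t) * (1 + beta)) for a q-stage pipeline."""
    return q / ((1 + q * tau_over_t) * (1 + beta))


if __name__ == "__main__":
    ratios = (0.0, 0.05, 0.1)
    print("q   " + "  ".join(f"tau/t={r:<5}" for r in ratios))
    for q in range(1, 9):
        cells = "  ".join(f"{throughput_improvement(q, r):<11.2f}" for r in ratios)
        print(f"{q:<3} {cells}")
```

With τ/t = 0 the improvement equals q. A nonzero overhead bends the curve downward, and a nonzero beta scales every point by 1 / (1 + beta); for example, an assumed beta of 0.25 reduces each value by 20%.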
Acknowledgements
These slides contain material developed and
copyright by:
Arvind (MIT)
Krste Asanovic (MIT/UCB)
Joel Emer (Intel/MIT)
James Hoe (CMU)
John Kubiatowicz (UCB)
David Patterson (UCB)