
Chapter 7

MIPS microarchitecture

These transparencies are based on those provided with the book:


David Money Harris and Sarah L. Harris, “Digital Design and Computer Architecture”,
2nd Edition, 2012, Elsevier
Microarchitecture
• Multiple implementations for a single architecture:
– Single-cycle: each instruction executes in a single cycle
– Multicycle: each instruction is broken into a series of shorter steps
– Pipelined: each instruction is broken into a series of steps, and multiple instructions execute at once
Processor Performance
• Program execution time
Execution Time = (#instructions)(cycles/instruction)(seconds/cycle)

• Definitions:
– CPI: cycles/instruction
– Clock period: seconds/cycle
– IPC: instructions/cycle = 1/CPI
• Challenge is to satisfy constraints of:
– Cost
– Power
– Performance
MIPS Processor
• Consider subset of MIPS instructions:
– R-type instructions: and, or, add, sub, slt
– Memory instructions: lw, sw
– Branch instructions: beq
Architectural State
• Determines everything about a processor:
– PC
– 32 registers
– Memory
MIPS State Elements

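[Figure: the MIPS state elements. Program counter (input PC', output PC); instruction memory (address A, read data RD); register file (addresses A1, A2, A3, write data WD3, write enable WE3, read data RD1, RD2); data memory (address A, write data WD, write enable WE, read data RD). Register addresses are 5 bits wide; the other buses are 32 bits. The PC, register file, and data memory are clocked by CLK.]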
Single-Cycle MIPS Processor
• Datapath
• Control
[Figure: the four state elements (PC, instruction memory, register file, data memory), as on the previous slide.]

The instruction memory, register file, and data memory are all read combinationally:
• if the address changes, the new data appears at RD after some propagation delay
• no clock is involved
They are written only on the rising edge of the clock:
• the state of the system changes only at the clock edge
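
The read and write behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not the book's HDL: the class name `RegisterFile` and the method `rising_edge` are hypothetical, and the treatment of register 0 as hardwired to zero is the usual MIPS convention rather than something stated on this slide.

```python
class RegisterFile:
    """32 x 32-bit register file: two combinational read ports, one clocked write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, a1, a2):
        # Combinational read: the outputs follow the addresses, no clock involved.
        return self.regs[a1], self.regs[a2]

    def rising_edge(self, we3, a3, wd3):
        # State changes only here, on the rising clock edge, and only if WE3 is set.
        # Assumption: register 0 stays zero, as is conventional in MIPS.
        if we3 and a3 != 0:
            self.regs[a3] = wd3 & 0xFFFFFFFF


rf = RegisterFile()
rf.rising_edge(we3=1, a3=8, wd3=42)   # write 42 to register 8
rf.rising_edge(we3=0, a3=8, wd3=7)    # write enable low: nothing changes
print(rf.read(8, 0))
```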
Single-Cycle Datapath: lw fetch
STEP 1: Fetch instruction

[Figure: PC drives the instruction memory address A; the memory output RD is the instruction, Instr.]
Single-Cycle Datapath: lw Register Read

STEP 2: Read source operands from RF

[Figure: Instr[25:21] drives register file address A1; the source operand appears on RD1.]
Single-Cycle Datapath: lw Immediate

STEP 3: Sign-extend the immediate

[Figure: Instr[15:0] passes through the Sign Extend unit to produce SignImm.]
Single-Cycle Datapath: lw address
STEP 4: Compute the memory address

[Figure: the ALU adds SrcA (RD1) and SrcB (SignImm) with ALUControl2:0 = 010; ALUResult drives the data memory address A. The ALU also produces a Zero output.]
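
Steps 3 and 4 amount to a small piece of arithmetic, sketched below in Python. The function names `sign_extend16` and `lw_address` are made up for illustration, and 32-bit wraparound is modeled by masking.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16: int) -> int:
    """Sign-extend a 16-bit immediate to a 32-bit two's-complement bit pattern."""
    imm16 &= 0xFFFF
    # Copy bit 15 into bits 31:16.
    return (imm16 | 0xFFFF0000) if (imm16 & 0x8000) else imm16


def lw_address(base: int, imm16: int) -> int:
    """Memory address for lw/sw: base register plus SignImm, modulo 2**32."""
    return (base + sign_extend16(imm16)) & MASK32


print(hex(lw_address(0x10010000, 0x0008)))  # positive offset
print(hex(lw_address(0x10010000, 0xFFFC)))  # offset of -4
```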
Single-Cycle Datapath: lw Memory Read

STEP 5: Read data from memory and write it back to the register file

[Figure: the data memory output ReadData drives register file write data WD3; Instr[20:16] drives write address A3. RegWrite = 1, ALUControl2:0 = 010.]
Single-Cycle Datapath: lw PC Increment

STEP 6: Determine address of next instruction


[Figure: an adder computes PCPlus4 = PC + 4, which feeds PC'. RegWrite = 1, ALUControl2:0 = 010. The value written back to the register file is labeled Result.]
Single-Cycle Datapath: sw
Write data in rt to memory

[Figure: Instr[20:16] drives register file address A2; RD2 becomes WriteData on the data memory input WD. RegWrite = 0, ALUControl2:0 = 010, MemWrite = 1.]
Single-Cycle Datapath: R-Type
• Read from rs and rt
• Write ALUResult to register file
• Write to rd (instead of rt)
[Figure: three multiplexers are added to the datapath. RegDst selects WriteReg4:0 from Instr[20:16] (0) or Instr[15:11] (1); ALUSrc selects SrcB from RD2 (0) or SignImm (1); MemtoReg selects Result from ALUResult (0) or ReadData (1). For R-type instructions: RegWrite = 1, RegDst = 1, ALUSrc = 0, ALUControl2:0 varies, MemWrite = 0, MemtoReg = 0.]
Single-Cycle Datapath: beq
• Determine whether values in rs and rt are equal
• Calculate branch target address:
BTA = (sign-extended immediate << 2) + (PC+4)
[Figure: SignImm is shifted left by 2 and added to PCPlus4 to form PCBranch; a multiplexer controlled by PCSrc selects PC' from PCPlus4 (0) or PCBranch (1). PCSrc is asserted when Branch and Zero are both 1. For beq: RegWrite = 0, RegDst = X, ALUSrc = 0, ALUControl2:0 = 110, Branch = 1, MemWrite = 0, MemtoReg = X.]
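
The branch target formula can be checked with a short Python sketch. The helper names `sign_extend16` and `branch_target` and the example addresses are illustrative assumptions; the arithmetic follows BTA = (sign-extended immediate << 2) + (PC + 4).

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16: int) -> int:
    """Sign-extend a 16-bit immediate to a 32-bit bit pattern."""
    imm16 &= 0xFFFF
    return (imm16 | 0xFFFF0000) if (imm16 & 0x8000) else imm16


def branch_target(pc: int, imm16: int) -> int:
    """BTA = (SignImm << 2) + (PC + 4), wrapped to 32 bits."""
    pc_plus4 = (pc + 4) & MASK32
    return (pc_plus4 + (sign_extend16(imm16) << 2)) & MASK32


print(hex(branch_target(0x00400000, 0x0003)))  # forward by 3 instructions
print(hex(branch_target(0x00400010, 0xFFFF)))  # immediate of -1: branch to itself
```

An immediate of -1 sends the branch back to its own address, because the offset is taken relative to PC + 4 rather than PC.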
Single-Cycle Processor
[Figure: the complete single-cycle processor. The control unit takes Op (Instr[31:26]) and Funct (Instr[5:0]) and produces MemtoReg, MemWrite, Branch, ALUControl2:0, ALUSrc, RegDst, and RegWrite; PCSrc is derived from Branch and Zero. The datapath is the one built up on the preceding slides.]
Single-Cycle Control
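[Figure: the control unit contains a main decoder and an ALU decoder. The main decoder takes Opcode5:0 and produces MemtoReg, MemWrite, Branch, ALUSrc, RegDst, RegWrite, and ALUOp1:0. The ALU decoder takes ALUOp1:0 and Funct5:0 and produces ALUControl2:0.]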
Review: ALU
[Figure: ALU symbol with N-bit inputs A and B, 3-bit control input F, and N-bit output Y.]

F2:0  Function
000   A & B
001   A | B
010   A + B
011   not used
100   A & ~B
101   A | ~B
110   A - B
111   SLT
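
A possible software model of this function table is sketched below. It is an assumption-laden illustration, not the book's implementation: the function name `alu`, the fixed 32-bit width, and the returned Zero flag are choices made here. SLT is modeled as the sign bit of A - B, so like simple hardware it ignores signed overflow.

```python
from typing import Tuple

MASK32 = 0xFFFFFFFF


def alu(a: int, b: int, f: int) -> Tuple[int, int]:
    """Return (Y, Zero) for 32-bit operands a, b and 3-bit control f."""
    a &= MASK32
    b &= MASK32
    if f == 0b011:
        raise ValueError("F = 011 is not used")
    # F2 inverts B and also acts as the adder carry-in (A + ~B + 1 = A - B).
    bb = (~b & MASK32) if (f & 0b100) else b
    low = f & 0b11
    if low == 0b00:
        y = a & bb
    elif low == 0b01:
        y = a | bb
    else:
        s = (a + bb + (f >> 2)) & MASK32
        # F1:0 = 10 passes the sum; F1:0 = 11 zero-extends its sign bit (SLT).
        y = s if low == 0b10 else (s >> 31)
    return y, int(y == 0)


print(alu(5, 3, 0b110))   # subtract
print(alu(3, 5, 0b111))   # set less than
```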
Review: ALU
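[Figure: ALU implementation. F2 selects B or ~B and also drives the adder carry-in; an N-bit adder produces the sum S and Cout; a 4:1 multiplexer controlled by F1:0 selects Y from the AND result (0), the OR result (1), the sum S (2), or the zero-extended sign bit S[N-1] (3).]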
Control Unit: ALU Decoder
ALUOp1:0 Meaning
00 Add
01 Subtract
10 Look at Funct
11 Not Used

ALUOp1:0 Funct ALUControl2:0


00 X 010 (Add)
X1 X 110 (Subtract)
1X 100000 (add) 010 (Add)
1X 100010 (sub) 110 (Subtract)
1X 100100 (and) 000 (And)
1X 100101 (or) 001 (Or)
1X 101010 (slt) 111 (SLT)
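
The two truth tables above translate directly into a lookup, sketched here in Python. The function name `alu_decoder` and the dictionary `FUNCT_TO_CONTROL` are illustrative; the bit patterns are copied from the tables.

```python
# Funct field -> ALUControl2:0, used only when ALUOp1:0 = 1X.
FUNCT_TO_CONTROL = {
    0b100000: 0b010,  # add
    0b100010: 0b110,  # sub
    0b100100: 0b000,  # and
    0b100101: 0b001,  # or
    0b101010: 0b111,  # slt
}


def alu_decoder(alu_op: int, funct: int) -> int:
    """Return ALUControl2:0 for the given ALUOp1:0 and Funct5:0."""
    if alu_op == 0b00:
        return 0b010              # add (lw, sw)
    if alu_op & 0b01:
        return 0b110              # X1: subtract (beq)
    return FUNCT_TO_CONTROL[funct]  # 1X: look at Funct


print(format(alu_decoder(0b10, 0b100101), "03b"))  # or
```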
Control Unit Main Decoder
Truth table to complete, with columns RegWrite, RegDst, ALUSrc, Branch, MemWrite, MemtoReg, and ALUOp1:0, for:
– R-type (Op5:0 = 000000)
– lw (Op5:0 = 100011)
– sw (Op5:0 = 101011)
– beq (Op5:0 = 000100)

[Figure: the complete single-cycle processor, repeated for reference.]
Control Unit: Main Decoder

Instruction  Op5:0   RegWrite  RegDst  ALUSrc  Branch  MemWrite  MemtoReg  ALUOp1:0
R-type       000000  1         1       0       0       0         0         10
lw           100011  1         0       1       0       0         1         00
sw           101011  0         X       1       0       1         X         00
beq          000100  0         X       0       1       0         X         01
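
As a rough illustration, the main decoder can be written as a lookup table in Python. The names `Controls`, `MAIN_DECODER`, `main_decoder`, and `pc_src` are hypothetical, and don't-care entries (X) are represented as `None`. Note that lw uses MemtoReg = 1, so that ReadData rather than ALUResult is written back, in agreement with the later addi and j versions of this table.

```python
from typing import NamedTuple, Optional


class Controls(NamedTuple):
    reg_write: int
    reg_dst: Optional[int]     # None means don't-care (X)
    alu_src: int
    branch: int
    mem_write: int
    mem_to_reg: Optional[int]  # None means don't-care (X)
    alu_op: int


MAIN_DECODER = {
    0b000000: Controls(1, 1, 0, 0, 0, 0, 0b10),        # R-type
    0b100011: Controls(1, 0, 1, 0, 0, 1, 0b00),        # lw
    0b101011: Controls(0, None, 1, 0, 1, None, 0b00),  # sw
    0b000100: Controls(0, None, 0, 1, 0, None, 0b01),  # beq
}


def main_decoder(op: int) -> Controls:
    """Look up the control signals for a 6-bit opcode."""
    return MAIN_DECODER[op]


def pc_src(branch: int, zero: int) -> int:
    """The branch is taken only when Branch and the ALU Zero flag are both 1."""
    return branch & zero


print(main_decoder(0b100011))
```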
Single-Cycle Datapath: or
[Figure: the complete single-cycle processor annotated with the control values for an or instruction: RegWrite = 1, RegDst = 1, ALUSrc = 0, ALUControl2:0 = 001, Branch = 0, MemWrite = 0, MemtoReg = 0, PCSrc = 0.]
Extended Functionality: addi
[Figure: the complete single-cycle processor, unchanged.]

No change to datapath
Control Unit: addi
Instruction  Op5:0   RegWrite  RegDst  ALUSrc  Branch  MemWrite  MemtoReg  ALUOp1:0
R-type       000000  1         1       0       0       0         0         10
lw           100011  1         0       1       0       0         1         00
sw           101011  0         X       1       0       1         X         00
beq          000100  0         X       0       1       0         X         01
addi         001000  1         0       1       0       0         0         00
Extended Functionality: j
[Figure: the single-cycle datapath extended for j. PCJump is formed by concatenating PCPlus4[31:28] with Instr[25:0] shifted left by 2 (bits 27:0). A second multiplexer, controlled by the new Jump signal, selects PC' from the branch multiplexer output (0) or PCJump (1).]
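
The jump address calculation can be sketched as follows. The function name `jump_target` and the example instruction words are illustrative assumptions; the bit fields (31:28 from PC + 4, 25:0 from the instruction, shifted left by 2) are those labeled in the datapath.

```python
MASK32 = 0xFFFFFFFF


def jump_target(pc: int, instr: int) -> int:
    """PCJump = {(PC + 4)[31:28], Instr[25:0] << 2}."""
    pc_plus4 = (pc + 4) & MASK32
    addr26 = instr & 0x03FFFFFF            # Instr[25:0]
    return (pc_plus4 & 0xF0000000) | (addr26 << 2)


# Opcode 000010 (j) with a 26-bit address field of 0x0100008.
print(hex(jump_target(0x00400000, 0x08100008)))
```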
Control Unit: Main Decoder
Instruction  Op5:0   RegWrite  RegDst  ALUSrc  Branch  MemWrite  MemtoReg  ALUOp1:0  Jump
R-type       000000  1         1       0       0       0         0         10        0
lw           100011  1         0       1       0       0         1         00        0
sw           101011  0         X       1       0       1         X         00        0
beq          000100  0         X       0       1       0         X         01        0
j            000010  0         X       X       X       0         X         XX        1
Review: Processor Performance
Program Execution Time
= (# instructions)(cycles/instruction)(seconds/cycle)
= # instructions × CPI × Tc
Single-Cycle Performance
[Figure: the complete single-cycle processor annotated with the control values for lw: RegWrite = 1, RegDst = 0, ALUSrc = 1, ALUControl2:0 = 010, Branch = 0, MemWrite = 0, MemtoReg = 1, PCSrc = 0.]

Tc is limited by the critical path (lw)


Single-Cycle Performance
• Single-cycle critical path:
Tc = tpcq_PC + tmem + max(tRFread, tsext + tmux) + tALU + tmem + tmux + tRFsetup

• Typically, the limiting paths are through the memory, ALU, and register file:
Tc = tpcq_PC + 2tmem + tRFread + tmux + tALU + tRFsetup
Single-Cycle Performance Example
Element              Parameter  Delay (ps)
Register clock-to-Q  tpcq_PC    30
Register setup       tsetup     20
Multiplexer          tmux       25
ALU                  tALU       200
Memory read          tmem       250
Register file read   tRFread    150
Register file setup  tRFsetup   20

Tc = tpcq_PC + 2tmem + tRFread + tmux + tALU + tRFsetup
= [30 + 2(250) + 150 + 25 + 200 + 20] ps
= 925 ps
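
The same arithmetic can be reproduced with the full critical-path expression, as in this small Python sketch. The function name `single_cycle_tc` is made up, and the sign-extension delay `t_sext` is not listed in the table, so it is treated as an assumption (zero by default); any value with t_sext + tmux below tRFread gives the same result.

```python
def single_cycle_tc(t_pcq, t_mem, t_rf_read, t_mux, t_alu, t_rf_setup, t_sext=0):
    """Single-cycle period in ps, following the lw critical path."""
    return (t_pcq + t_mem + max(t_rf_read, t_sext + t_mux)
            + t_alu + t_mem + t_mux + t_rf_setup)


tc_ps = single_cycle_tc(t_pcq=30, t_mem=250, t_rf_read=150,
                        t_mux=25, t_alu=200, t_rf_setup=20)
print(tc_ps)  # 925
```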
Single-Cycle Performance Example
Program with 100 billion instructions:

Execution Time = # instructions × CPI × Tc
= (100 × 10^9)(1)(925 × 10^-12 s)
= 92.5 seconds
Multicycle MIPS Processor
• Single-cycle:
+ simple
- cycle time limited by longest instruction (lw)
- 2 adders/ALUs & 2 memories
• Multicycle:
+ higher clock speed
+ simpler instructions run faster
+ reuse expensive hardware on multiple cycles
- sequencing overhead paid many times
• Same design steps: datapath & control
Multicycle State Elements
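• Replace the instruction and data memories with a single unified memory, which is more realistic

[Figure: the multicycle state elements: the PC register (with enable EN), a combined instruction/data memory (address A, write data WD, write enable WE, read data RD), and the register file (A1, A2, A3, WD3, WE3, RD1, RD2), all clocked by CLK.]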
Multicycle Datapath: Instruction Fetch
STEP 1: Fetch instruction

[Figure: PC drives the memory address A; the memory output RD is stored in the instruction register, whose output is Instr, when IRWrite is asserted.]
Multicycle Datapath: lw Register Read

STEP 2a: Read source operands from RF

[Figure: Instr[25:21] drives register file address A1; RD1 is captured in register A.]
Multicycle Datapath: lw Immediate

STEP 2b: Sign-extend the immediate


[Figure: Instr[15:0] passes through the Sign Extend unit to produce SignImm.]
Multicycle Datapath: lw Address

STEP 3: Compute the memory address

[Figure: the ALU adds SrcA (register A) and SrcB (SignImm) under ALUControl2:0; ALUResult is captured in register ALUOut.]
Multicycle Datapath: lw Memory Read

STEP 4: Read data from memory

[Figure: a multiplexer controlled by IorD selects the memory address Adr from PC (0) or ALUOut (1); the word read from memory is captured in register Data.]
Multicycle Datapath: lw Write Register

STEP 5: Write data back to register file

[Figure: Data drives register file write data WD3; Instr[20:16] drives write address A3; RegWrite enables the write.]
Multicycle Datapath: Increment PC

STEP 6: Increment PC

[Figure: multiplexers are added in front of the ALU. ALUSrcA selects SrcA from PC (0) or A (1); ALUSrcB1:0 selects SrcB from RD2 (00), the constant 4 (01), or SignImm (10). ALUResult feeds PC', and PCWrite enables the PC register.]
Multicycle Datapath: sw
Write data in rt to memory

[Figure: Instr[20:16] drives register file address A2; RD2 is captured in register B, which drives the memory write data WD; MemWrite enables the write.]
Multicycle Datapath: R-Type
• Read from rs and rt
• Write ALUResult to register file
• Write to rd (instead of rt)
[Figure: RegDst selects register file write address A3 from Instr[20:16] (0) or Instr[15:11] (1); MemtoReg selects WD3 from ALUOut (0) or Data (1).]
Multicycle Datapath: beq
• rs == rt?
• BTA = (sign-extended immediate << 2) + (PC+4)
[Figure: SignImm shifted left by 2 feeds ALUSrcB input 11; PCSrc selects PC' from ALUResult (0) or ALUOut (1); the PC enable is PCEn = PCWrite OR (Branch AND Zero).]
Multicycle Processor
[Figure: the complete multicycle processor. The control unit takes Op (Instr[31:26]) and Funct (Instr[5:0]) and produces PCWrite, Branch, IorD, MemWrite, IRWrite, PCSrc, ALUControl2:0, ALUSrcB1:0, ALUSrcA, RegWrite, MemtoReg, and RegDst.]
Multicycle Control
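[Figure: the control unit contains a main controller (FSM) and an ALU decoder. The main controller takes Opcode5:0 and produces the multiplexer selects (MemtoReg, RegDst, IorD, PCSrc, ALUSrcB1:0, ALUSrcA), the register enables (IRWrite, MemWrite, PCWrite, Branch, RegWrite), and ALUOp1:0. The ALU decoder takes ALUOp1:0 and Funct5:0 and produces ALUControl2:0.]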
Main Controller FSM: Fetch
• S0: Fetch (entered on Reset)

[Figure: the multicycle processor during Fetch. IorD = 0, IRWrite = 1, ALUSrcA = 0, ALUSrcB1:0 = 01, ALUControl2:0 = 010, PCSrc = 0, PCWrite = 1; Branch = 0, MemWrite = 0, RegWrite = 0; RegDst and MemtoReg are don't-cares (X).]
Main Controller FSM: Fetch
• S0: Fetch (entered on Reset). IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite.

[Figure: the multicycle processor annotated with the Fetch control values, as on the previous slide.]
Main Controller FSM: Decode
• S0: Fetch (entered on Reset). IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite. Next: S1.
• S1: Decode. No outputs asserted yet.

[Figure: the multicycle processor during Decode. PCWrite = 0, Branch = 0, IRWrite = 0, MemWrite = 0, RegWrite = 0; the multiplexer selects and ALUControl2:0 are don't-cares (X).]
Main Controller FSM: Address
• S0: Fetch (entered on Reset). IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite. Next: S1.
• S1: Decode. Next: S2 if Op = LW or Op = SW.
• S2: MemAdr.

[Figure: the multicycle processor during MemAdr. ALUSrcA = 1, ALUSrcB1:0 = 10, ALUControl2:0 = 010; all enables (PCWrite, Branch, IRWrite, MemWrite, RegWrite) are 0; the remaining selects are don't-cares (X).]
Main Controller FSM: Address
• S0: Fetch (entered on Reset). IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite. Next: S1.
• S1: Decode. Next: S2 if Op = LW or Op = SW.
• S2: MemAdr. ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00.

[Figure: the multicycle processor annotated with the MemAdr control values, as on the previous slide.]
Main Controller FSM: lw
• S0: Fetch (entered on Reset). IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite. Next: S1.
• S1: Decode. Next: S2 if Op = LW or Op = SW.
• S2: MemAdr. ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00. Next: S3 if Op = LW.
• S3: MemRead. IorD = 1. Next: S4.
• S4: Mem Writeback. RegDst = 0, MemtoReg = 1, RegWrite. Next: S0.
Main Controller FSM: sw
• S0: Fetch (entered on Reset). IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite. Next: S1.
• S1: Decode. Next: S2 if Op = LW or Op = SW.
• S2: MemAdr. ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00. Next: S3 if Op = LW, S5 if Op = SW.
• S3: MemRead. IorD = 1. Next: S4.
• S4: Mem Writeback. RegDst = 0, MemtoReg = 1, RegWrite. Next: S0.
• S5: MemWrite. IorD = 1, MemWrite. Next: S0.
Main Controller FSM: R-Type
• S0: Fetch (entered on Reset). IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite. Next: S1.
• S1: Decode. Next: S2 if Op = LW or Op = SW, S6 if Op = R-type.
• S2: MemAdr. ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00. Next: S3 if Op = LW, S5 if Op = SW.
• S3: MemRead. IorD = 1. Next: S4.
• S4: Mem Writeback. RegDst = 0, MemtoReg = 1, RegWrite. Next: S0.
• S5: MemWrite. IorD = 1, MemWrite. Next: S0.
• S6: Execute. ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10. Next: S7.
• S7: ALU Writeback. RegDst = 1, MemtoReg = 0, RegWrite. Next: S0.
Main Controller FSM: beq
• S0: Fetch (entered on Reset). IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite. Next: S1.
• S1: Decode. ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00. Next: S2 if Op = LW or Op = SW, S6 if Op = R-type, S8 if Op = BEQ.
• S2: MemAdr. ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00. Next: S3 if Op = LW, S5 if Op = SW.
• S3: MemRead. IorD = 1. Next: S4.
• S4: Mem Writeback. RegDst = 0, MemtoReg = 1, RegWrite. Next: S0.
• S5: MemWrite. IorD = 1, MemWrite. Next: S0.
• S6: Execute. ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10. Next: S7.
• S7: ALU Writeback. RegDst = 1, MemtoReg = 0, RegWrite. Next: S0.
• S8: Branch. ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 1, Branch. Next: S0.
Multicycle Controller FSM
• S0: Fetch (entered on Reset). IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite. Next: S1.
• S1: Decode. ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00. Next: S2 if Op = LW or Op = SW, S6 if Op = R-type, S8 if Op = BEQ.
• S2: MemAdr. ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00. Next: S3 if Op = LW, S5 if Op = SW.
• S3: MemRead. IorD = 1. Next: S4.
• S4: Mem Writeback. RegDst = 0, MemtoReg = 1, RegWrite. Next: S0.
• S5: MemWrite. IorD = 1, MemWrite. Next: S0.
• S6: Execute. ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10. Next: S7.
• S7: ALU Writeback. RegDst = 1, MemtoReg = 0, RegWrite. Next: S0.
• S8: Branch. ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 1, Branch. Next: S0.
Extended Functionality: addi
[State diagram: the multicycle controller FSM (S0 to S8, unchanged) extended with two new states reached from S1: Decode when Op = ADDI. S9: ADDI Execute is followed by S10: ADDI Writeback, which returns to S0. The outputs of S9 and S10 are not yet filled in.]
Main Controller FSM: addi
• S0: Fetch (entered on Reset). IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite. Next: S1.
• S1: Decode. ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00. Next: S2 if Op = LW or Op = SW, S6 if Op = R-type, S8 if Op = BEQ, S9 if Op = ADDI.
• S2: MemAdr. ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00. Next: S3 if Op = LW, S5 if Op = SW.
• S3: MemRead. IorD = 1. Next: S4.
• S4: Mem Writeback. RegDst = 0, MemtoReg = 1, RegWrite. Next: S0.
• S5: MemWrite. IorD = 1, MemWrite. Next: S0.
• S6: Execute. ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10. Next: S7.
• S7: ALU Writeback. RegDst = 1, MemtoReg = 0, RegWrite. Next: S0.
• S8: Branch. ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 1, Branch. Next: S0.
• S9: ADDI Execute. ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00. Next: S10.
• S10: ADDI Writeback. RegDst = 0, MemtoReg = 0, RegWrite. Next: S0.
Extended Functionality: j

[Figure: the multicycle datapath extended for j. PCSrc becomes a two-bit select, PCSrc1:0, choosing PC' from ALUResult (00), ALUOut (01), or PCJump (10). PCJump is formed by concatenating PC[31:28] with Instr[25:0] shifted left by 2 (bits 27:0).]
Main Controller FSM: j
[State diagram: the multicycle controller FSM (S0 to S10) extended with a new state S11: Jump, reached from S1: Decode when Op = J. PCSrc is now two bits wide: S0: Fetch uses PCSrc = 00 and S8: Branch uses PCSrc = 01. The outputs of S11 are not yet filled in.]
Main Controller FSM: j
• S0: Fetch (entered on Reset). IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 00, IRWrite, PCWrite. Next: S1.
• S1: Decode. ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00. Next: S2 if Op = LW or Op = SW, S6 if Op = R-type, S8 if Op = BEQ, S9 if Op = ADDI, S11 if Op = J.
• S2: MemAdr. ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00. Next: S3 if Op = LW, S5 if Op = SW.
• S3: MemRead. IorD = 1. Next: S4.
• S4: Mem Writeback. RegDst = 0, MemtoReg = 1, RegWrite. Next: S0.
• S5: MemWrite. IorD = 1, MemWrite. Next: S0.
• S6: Execute. ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10. Next: S7.
• S7: ALU Writeback. RegDst = 1, MemtoReg = 0, RegWrite. Next: S0.
• S8: Branch. ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 01, Branch. Next: S0.
• S9: ADDI Execute. ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00. Next: S10.
• S10: ADDI Writeback. RegDst = 0, MemtoReg = 0, RegWrite. Next: S0.
• S11: Jump. PCSrc = 10, PCWrite. Next: S0.
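
The state transitions of the extended controller can be sketched as a next-state function in Python. This models only the sequencing, not the output signals; the names `next_state` and `cycles`, and the lowercase instruction labels, are illustrative. It assumes every terminal state returns to S0: Fetch, as in the state diagram.

```python
# Transition out of S1: Decode, keyed by instruction class.
NEXT_FROM_DECODE = {
    "lw": "S2", "sw": "S2", "r-type": "S6",
    "beq": "S8", "addi": "S9", "j": "S11",
}

# States with a single unconditional successor other than S0.
FIXED_NEXT = {"S3": "S4", "S6": "S7", "S9": "S10"}


def next_state(state: str, op: str) -> str:
    if state == "S0":
        return "S1"
    if state == "S1":
        return NEXT_FROM_DECODE[op]
    if state == "S2":
        return "S3" if op == "lw" else "S5"
    # S4, S5, S7, S8, S10, S11 all return to Fetch.
    return FIXED_NEXT.get(state, "S0")


def cycles(op: str) -> int:
    """Count the states visited from S0 until the FSM returns to S0."""
    state, n = "S0", 0
    while True:
        n += 1
        state = next_state(state, op)
        if state == "S0":
            return n


for op in ("lw", "sw", "r-type", "beq", "addi", "j"):
    print(op, cycles(op))
```

Counting states this way gives 5 cycles for lw, 4 for sw, R-type, and addi, and 3 for beq and j.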
Multicycle Processor Performance
• Instructions take different number of cycles:
– 3 cycles: beq, j
– 4 cycles: R-Type, sw, addi
– 5 cycles: lw
• CPI is weighted average
• SPECINT2000 benchmark:
– 25% loads
– 10% stores
– 11% branches
– 2% jumps
– 52% R-type
Average CPI = (0.11 + 0.02)(3) + (0.52 + 0.10)(4) + (0.25)(5) = 4.12
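
The weighted average can be recomputed with a few lines of Python. The dictionary names `mix` and `cycles_per_instr` are illustrative; the percentages and cycle counts are those listed above.

```python
# Instruction mix and cycles per instruction class.
mix = {"lw": 0.25, "sw": 0.10, "beq": 0.11, "j": 0.02, "r-type": 0.52}
cycles_per_instr = {"lw": 5, "sw": 4, "beq": 3, "j": 3, "r-type": 4}

# CPI is the average of the cycle counts, weighted by instruction frequency.
cpi = sum(mix[k] * cycles_per_instr[k] for k in mix)
print(round(cpi, 2))  # 4.12
```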
Multicycle Processor Performance
Multicycle critical path:
Tc = tpcq + tmux + max(tALU + tmux, tmem) + tsetup
[Figure: the complete multicycle processor, shown for tracing the critical path.]
Multicycle Performance Example
Element              Parameter  Delay (ps)
Register clock-to-Q  tpcq_PC    30
Register setup       tsetup     20
Multiplexer          tmux       25
ALU                  tALU       200
Memory read          tmem       250
Register file read   tRFread    150
Register file setup  tRFsetup   20

Tc = tpcq_PC + tmux + max(tALU + tmux, tmem) + tsetup
= tpcq_PC + tmux + tmem + tsetup
= [30 + 25 + 250 + 20] ps
= 325 ps
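
A short Python check of this calculation, using the delays from the table. The function name `multicycle_tc` is illustrative.

```python
def multicycle_tc(t_pcq, t_mux, t_alu, t_mem, t_setup):
    """Multicycle period in ps: the slower of the ALU path and the memory path."""
    return t_pcq + t_mux + max(t_alu + t_mux, t_mem) + t_setup


tc_ps = multicycle_tc(t_pcq=30, t_mux=25, t_alu=200, t_mem=250, t_setup=20)
print(tc_ps)  # 325
```

With these delays the memory read (250 ps) is slower than the ALU followed by a multiplexer (225 ps), so the memory path sets the clock period.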
Multicycle Performance Example
Program with 100 billion instructions:
Execution Time = (# instructions) × CPI × Tc
= (100 × 10^9)(4.12)(325 × 10^-12 s)
= 133.9 seconds

This is slower than the single-cycle processor (92.5 seconds). Why?
– Not all steps are the same length
– Sequencing overhead for each step (tpcq + tsetup = 50 ps)
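
The comparison between the two designs can be reproduced with the execution time formula. This is a small sketch; the function name `execution_time_s` is made up, and the CPI and Tc values are the ones derived on the earlier slides.

```python
def execution_time_s(n_instr, cpi, tc_ps):
    """Execution time in seconds = instructions x CPI x clock period."""
    return n_instr * cpi * tc_ps * 1e-12


n = 100e9  # 100 billion instructions
single = execution_time_s(n, cpi=1, tc_ps=925)
multi = execution_time_s(n, cpi=4.12, tc_ps=325)

print(f"single-cycle: {single:.1f} s")
print(f"multicycle:   {multi:.1f} s")
print(f"ratio:        {multi / single:.2f}")
```

The multicycle clock is roughly 2.8 times faster, but each instruction takes 4.12 cycles on average, so the program still runs longer overall.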
Review: Single-Cycle Processor
[Figure: the complete single-cycle processor, including the Jump multiplexer and the PCJump path.]
Review: Multicycle Processor
[Figure: the complete multicycle processor, including the two-bit PCSrc multiplexer and the PCJump path.]
