Acknowledgement: Many slides were adapted from Prof. Hsien-Hsin Lee’s ECE4100/6100 Advanced Computer
Architecture course at Georgia Inst. of Tech. with his generous permission.
Data Flow Graph (or Data Dependency Graph)
• Pros
  - Scalable performance: allows code to be compiled on one platform, but also run efficiently on another
  - Handles cases where a dependency is unknown at compile time
• Cons
  - Hardware complexity (the main argument from the VLIW/EPIC camp)
ECE 752: Advanced Computer Architecture I 4
Out-of-Order Execution
i1:  r2  = 4(r22)
i2:  r10 = 4(r25)
i3:  r10 = r2 + r10
i4:  4(r26) = r10
i5:  r14 = 8(r27)
i6:  r6  = (r22)
i7:  r5  = (r23)
i8:  r5  = r6 - r5
i9:  r4  = r14 * r5
i10: r15 = 12(r27)
i11: r7  = 4(r22)
i12: r8  = 4(r23)
i13: r8  = r7 - r8
i14: r8  = r15 * r8
i15: r8  = r4 - r8
i16: (r28) = r8
[Figure: data flow graph of instructions i1 to i16]
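The data flow graph for this listing can be rebuilt mechanically. The following is a minimal sketch, not part of the original slides: it assumes unit latency for every instruction, assumes the stores and loads do not alias, and ignores the base-address registers (r22 to r28) because the listing never writes them. The names `PROGRAM`, `raw_edges`, and `dataflow_levels` are hypothetical.

```python
# Each entry: (name, destination register or None for a store, source registers).
# Base-address registers are omitted: nothing in the listing writes them.
PROGRAM = [
    ("i1",  "r2",  []),
    ("i2",  "r10", []),
    ("i3",  "r10", ["r2", "r10"]),
    ("i4",  None,  ["r10"]),
    ("i5",  "r14", []),
    ("i6",  "r6",  []),
    ("i7",  "r5",  []),
    ("i8",  "r5",  ["r6", "r5"]),
    ("i9",  "r4",  ["r14", "r5"]),
    ("i10", "r15", []),
    ("i11", "r7",  []),
    ("i12", "r8",  []),
    ("i13", "r8",  ["r7", "r8"]),
    ("i14", "r8",  ["r15", "r8"]),
    ("i15", "r8",  ["r4", "r8"]),
    ("i16", None,  ["r8"]),
]

def raw_edges(program):
    """Return (producer, consumer) pairs for true (RAW) dependences only."""
    last_writer = {}
    edges = []
    for name, dest, srcs in program:
        for reg in srcs:
            if reg in last_writer:
                edges.append((last_writer[reg], name))
        if dest is not None:
            last_writer[dest] = name  # later readers depend on this write
    return edges

def dataflow_levels(program):
    """Earliest step at which each instruction could run (unit latency assumed)."""
    preds = {name: [] for name, _, _ in program}
    for producer, consumer in raw_edges(program):
        preds[consumer].append(producer)
    level = {}
    for name, _, _ in program:  # program order is already a topological order
        level[name] = 1 + max((level[p] for p in preds[name]), default=0)
    return level

levels = dataflow_levels(PROGRAM)
print(levels["i16"])         # 5: the longest dependence chain has five instructions
print(max(levels.values()))  # 5
```

Under these assumptions the sixteen instructions need only five steps when limited by true dependences alone, which is the parallelism out-of-order execution tries to expose. The reuse of r10, r5, and r8 creates only name dependences (WAW/WAR), which the sketch deliberately ignores.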
• Cons
  - Stop issue when WAW is detected
  - Stop writeback when WAR is detected
[Figure: scoreboard organization. The registers and the functional units (two FP Mult units, FP Divide, FP Add, Integer, Memory) are connected by data buses; the scoreboard controls them over a control/status bus.]

FU Status Table
FU     Busy  Op    Dest  Src1  Src2  Dep1   Dep2
Int    1     Load  F1    R3
Mult1  1     Mult  F0    F1    F4    Int
Mult2  0
Add    1     Sub   F8    F6    F1           Int
Div    1     Div   F2    F0    F6    Mult1

Register Update Table
Register  F0     F1   F2   ...  F31
FU        Mult1  Int  Div  ...  xxx
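The two stall rules of the scoreboard can be expressed as checks against the FU status table. The following is a rough sketch under stated assumptions, not the slides' implementation: the placement of each dependence in Dep1 or Dep2 is inferred from the register update table, and the `operands_read` flag and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FUEntry:
    busy: bool = False
    op: Optional[str] = None
    dest: Optional[str] = None
    src1: Optional[str] = None
    src2: Optional[str] = None
    dep1: Optional[str] = None   # unit still producing src1 (None = value ready)
    dep2: Optional[str] = None   # unit still producing src2 (None = value ready)
    operands_read: bool = False  # hypothetical flag: set once sources are read

def pending_writer(fu_status, reg):
    """Register update table lookup: which busy unit will write `reg`?"""
    for name, entry in fu_status.items():
        if entry.busy and entry.dest == reg:
            return name
    return None

def can_issue(fu_status, unit, dest):
    """Issue stalls on a structural hazard (unit busy) or on a WAW hazard."""
    return not fu_status[unit].busy and pending_writer(fu_status, dest) is None

def can_write_back(fu_status, unit):
    """Writeback stalls on WAR: another unit still has to read the old value."""
    dest = fu_status[unit].dest
    for name, entry in fu_status.items():
        if name == unit or not entry.busy or entry.operands_read:
            continue
        wants_old_src1 = entry.src1 == dest and entry.dep1 is None
        wants_old_src2 = entry.src2 == dest and entry.dep2 is None
        if wants_old_src1 or wants_old_src2:
            return False
    return True

# State taken from the FU status table (dependence columns are inferred).
fu_status = {
    "Int":   FUEntry(True, "Load", "F1", "R3"),
    "Mult1": FUEntry(True, "Mult", "F0", "F1", "F4", dep1="Int"),
    "Mult2": FUEntry(),
    "Add":   FUEntry(True, "Sub", "F8", "F6", "F1", dep2="Int"),
    "Div":   FUEntry(True, "Div", "F2", "F0", "F6", dep1="Mult1"),
}

print(can_issue(fu_status, "Mult2", "F0"))   # False: Mult1 will still write F0 (WAW)
print(can_issue(fu_status, "Mult2", "F10"))  # True: unit free, no pending writer
print(can_write_back(fu_status, "Int"))      # True: F1 readers wait for the new value
```

In this state nobody needs the old value of F1, so the load may write back. A WAR stall would appear only if some unit that has not yet read its operands named the written register as a ready source.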
• Tomasulo algorithm
  - Dynamic scheduling
  - Register renaming
  - Tags
• Due to in-order issue, the register status table always keeps the latest write (no WAW issue)
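Why in-order issue plus a register status table removes WAW hazards can be shown with a small sketch. It is illustrative only: the three instructions, the station names, and the helper functions are hypothetical, and operand capture over the common data bus is left out.

```python
def issue(reg_status, station, dest, srcs):
    """Rename at issue: sources become tags or values, dest maps to this station."""
    operands = []
    for reg in srcs:
        tag = reg_status.get(reg)
        if tag is not None:
            operands.append(("tag", tag))          # wait for that station's result
        else:
            operands.append(("value", f"R({reg})"))  # read the register file now
    reg_status[dest] = station  # latest writer wins; older writers are renamed away
    return operands

def write_back(reg_status, reg_file, station, value):
    """Update only the registers that still name `station` as their producer."""
    for reg in [r for r, tag in reg_status.items() if tag == station]:
        reg_file[reg] = value
        del reg_status[reg]

reg_status, reg_file = {}, {}
issue(reg_status, "Add1", "F6", ["F0", "F2"])        # first write to F6
ops = issue(reg_status, "Mult1", "F8", ["F6", "F4"])  # reads F6 from Add1
issue(reg_status, "Add2", "F6", ["F2", "F4"])        # second write to F6

write_back(reg_status, reg_file, "Add1", 1.0)  # F6 now belongs to Add2: no update
print(ops)         # [('tag', 'Add1'), ('value', 'R(F4)')]
print(reg_status)  # {'F6': 'Add2', 'F8': 'Mult1'}
print(reg_file)    # {}
```

Mult1 still waits on the tag Add1, so it receives the first F6 value, while the architectural F6 is reserved for Add2. The older write can therefore complete at any time without overwriting a newer one.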
[Figure: Tomasulo organization. Six FP load buffers (FLB), three store data buffers (SDB) feeding memory, the FP registers (FLR), and two groups of reservation stations (three entries and two entries). Each reservation station entry holds a control field, the sink (Vj) and source (Vk) operands, and their tags (Qj, Qk).]
Reservation Stations:     S1        S2        RS     RS
Time  Name   Busy  Op     Vj        Vk        Qj     Qk
      Add1   No
      Add2   No
      Add3   No
      Mult1  No
      Mult2  No

Register result status:
Clock        F0  F2  F4  F6  F8  F10  F12  ...  F30
0       Qi

Note: these are RS; we have only one FU for each type (MUL, ADD, LD). We reduce Load from 6 to 3 for simplicity. SDB is not shown either.
Reservation Stations: S1 S2 RS RS
Time Name Busy Op Vj Vk Qj Qk
Add1 No
Add2 No
Add3 No
Mult1 No
Mult2 No
Reservation Stations: S1 S2 RS RS
Time Name Busy Op Vj Vk Qj Qk
Add1 No
Add2 No
Add3 No
Mult1 No
Mult2 No
Reservation Stations:     S1        S2        RS     RS
Time  Name   Busy  Op     Vj        Vk        Qj     Qk
      Add1   No
      Add2   No
      Add3   No
      Mult1  Yes   MULTD            R(F4)     Load2
      Mult2  No

Reservation Stations:     S1        S2        RS     RS
Time  Name   Busy  Op     Vj        Vk        Qj     Qk
      Add1   Yes   SUBD   M(A1)                      Load2
      Add2   No
      Add3   No
      Mult1  Yes   MULTD            R(F4)     Load2
      Mult2  No
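The two snapshots above show what issue does to a reservation station. A minimal sketch follows; the instruction operands (MULTD F0,F2,F4 and SUBD F8,F6,F2), the pending loads, and all function names are assumptions chosen to be consistent with the table contents, since the instruction status table is not shown here.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Station:
    busy: bool = False
    op: Optional[str] = None
    vj: Optional[str] = None  # value of the first operand, if available
    vk: Optional[str] = None  # value of the second operand, if available
    qj: Optional[str] = None  # tag producing the first operand, if pending
    qk: Optional[str] = None  # tag producing the second operand, if pending

# Which station group serves each opcode (assumed mapping).
UNIT_OF = {"ADDD": "Add", "SUBD": "Add", "MULTD": "Mult", "DIVD": "Mult"}

def read_operand(reg, reg_status, reg_value):
    """Return (value, tag): exactly one of the two is None."""
    tag = reg_status.get(reg)
    if tag is not None:
        return None, tag
    return reg_value.get(reg, f"R({reg})"), None

def issue(stations, op, dest, src1, src2, reg_status, reg_value):
    """Place `op` in the first free matching station; None means structural stall."""
    for name, rs in stations.items():
        if name.startswith(UNIT_OF[op]) and not rs.busy:
            vj, qj = read_operand(src1, reg_status, reg_value)
            vk, qk = read_operand(src2, reg_status, reg_value)
            stations[name] = Station(True, op, vj, vk, qj, qk)
            reg_status[dest] = name  # rename the destination to this station
            return name
    return None

stations = {n: Station() for n in ("Add1", "Add2", "Add3", "Mult1", "Mult2")}
reg_status = {"F6": "Load1", "F2": "Load2"}  # two loads assumed in flight
reg_value = {}

issue(stations, "MULTD", "F0", "F2", "F4", reg_status, reg_value)
print(stations["Mult1"])  # vk='R(F4)', qj='Load2'

# Assume Load1 completes before SUBD issues: F6 now holds M(A1).
del reg_status["F6"]
reg_value["F6"] = "M(A1)"

issue(stations, "SUBD", "F8", "F6", "F2", reg_status, reg_value)
print(stations["Add1"])   # vj='M(A1)', qk='Load2'
```

An available operand is copied into Vj or Vk at issue, while a pending one is recorded as a tag in Qj or Qk, which is why MULTD holds R(F4) plus the tag Load2 and SUBD holds M(A1) plus the tag Load2.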
Reservation Stations:     S1        S2        RS     RS
Time  Name   Busy  Op     Vj        Vk        Qj     Qk
2     Add1   Yes   SUBD   M(A1)     M(A2)
      Add2   No
      Add3   No
10    Mult1  Yes   MULTD  M(A2)     R(F4)
      Mult2  Yes   DIVD             R(F6)     Mult1

Reservation Stations:     S1        S2        RS     RS
Time  Name   Busy  Op     Vj        Vk        Qj     Qk
1     Add1   Yes   SUBD   M(A1)     M(A2)
      Add2   Yes   ADDD             R(F2)     Add1
      Add3   No
9     Mult1  Yes   MULTD  M(A2)     R(F4)
      Mult2  Yes   DIVD             R(F6)     Mult1

Reservation Stations:     S1        S2        RS     RS
Time  Name   Busy  Op     Vj        Vk        Qj     Qk
0     Add1   Yes   SUBD   M(A1)     M(A2)
      Add2   Yes   ADDD             R(F2)     Add1
      Add3   No
8     Mult1  Yes   MULTD  M(A2)     R(F4)
      Mult2  Yes   DIVD             R(F6)     Mult1
Reservation Stations:     S1        S2        RS     RS
Time  Name   Busy  Op     Vj        Vk        Qj     Qk
      Add1   No
2     Add2   Yes   ADDD   (M1-M2)   R(F2)
      Add3   No
7     Mult1  Yes   MULTD  M(A2)     R(F4)
      Mult2  Yes   DIVD             R(F6)     Mult1

Reservation Stations:     S1        S2        RS     RS
Time  Name   Busy  Op     Vj        Vk        Qj     Qk
      Add1   No
1     Add2   Yes   ADDD   (M1-M2)   R(F2)
      Add3   No
6     Mult1  Yes   MULTD  M(A2)     R(F4)
      Mult2  Yes   DIVD             R(F6)     Mult1

Reservation Stations:     S1        S2        RS     RS
Time  Name   Busy  Op     Vj        Vk        Qj     Qk
      Add1   No
0     Add2   Yes   ADDD   (M1-M2)   R(F2)
      Add3   No
5     Mult1  Yes   MULTD  M(A2)     R(F4)
      Mult2  Yes   DIVD             R(F6)     Mult1
Reservation Stations:     S1        S2        RS     RS
Time  Name   Busy  Op     Vj        Vk        Qj     Qk
      Add1   No
      Add2   No
      Add3   No
4     Mult1  Yes   MULTD  M(A2)     R(F4)
      Mult2  Yes   DIVD             R(F6)     Mult1

Reservation Stations:     S1        S2        RS     RS
Time  Name   Busy  Op     Vj        Vk        Qj     Qk
      Add1   No
      Add2   No
      Add3   No
3     Mult1  Yes   MULTD  M(A2)     R(F4)
      Mult2  Yes   DIVD             R(F6)     Mult1

Reservation Stations:     S1        S2        RS     RS
Time  Name   Busy  Op     Vj        Vk        Qj     Qk
      Add1   No
      Add2   No
      Add3   No
2     Mult1  Yes   MULTD  M(A2)     R(F4)
      Mult2  Yes   DIVD             R(F6)     Mult1

Reservation Stations:     S1        S2        RS     RS
Time  Name   Busy  Op     Vj        Vk        Qj     Qk
      Add1   No
      Add2   No
      Add3   No
1     Mult1  Yes   MULTD  M(A2)     R(F4)
      Mult2  Yes   DIVD             R(F6)     Mult1

Reservation Stations:     S1        S2        RS     RS
Time  Name   Busy  Op     Vj        Vk        Qj     Qk
      Add1   No
      Add2   No
      Add3   No
0     Mult1  Yes   MULTD  M(A2)     R(F4)
      Mult2  Yes   DIVD             R(F6)     Mult1
Reservation Stations:     S1        S2        RS     RS
Time  Name   Busy  Op     Vj        Vk        Qj     Qk
      Add1   No
      Add2   No
      Add3   No
      Mult1  No
40    Mult2  Yes   DIVD   M*F4      R(F6)

Reservation Stations:     S1        S2        RS     RS
Time  Name   Busy  Op     Vj        Vk        Qj     Qk
      Add1   No
      Add2   No
      Add3   No
      Mult1  No
1     Mult2  Yes   DIVD   M*F4      R(F6)

Reservation Stations:     S1        S2        RS     RS
Time  Name   Busy  Op     Vj        Vk        Qj     Qk
      Add1   No
      Add2   No
      Add3   No
      Mult1  No
0     Mult2  Yes   DIVD   M*F4      R(F6)
Reservation Stations: S1 S2 RS RS
Time Name Busy Op Vj Vk Qj Qk
Add1 No
Add2 No
Add3 No
Mult1 No
Mult2 No
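The wake-up steps visible in the snapshots above (Add2 picking up (M1-M2) once Add1 finishes, and Mult2 picking up M*F4 once Mult1 finishes) amount to a tag match on the result broadcast. The sketch below is one illustrative way to model it; the dictionary layout and the names `broadcast` and `ready` are assumptions, and timing is not modeled.

```python
def broadcast(stations, tag, value):
    """Result broadcast: every station waiting on `tag` captures `value`."""
    woken = []
    for name, rs in stations.items():
        if rs["Qj"] == tag:
            rs["Vj"], rs["Qj"] = value, None
            woken.append(name)
        if rs["Qk"] == tag:
            rs["Vk"], rs["Qk"] = value, None
            if name not in woken:
                woken.append(name)
    stations.pop(tag, None)  # the producing station becomes free
    return woken

def ready(rs):
    """A station may start executing once neither operand is pending."""
    return rs["Qj"] is None and rs["Qk"] is None

# State matching the snapshot in which SUBD is about to complete in Add1.
stations = {
    "Add1":  {"Op": "SUBD",  "Vj": "M(A1)", "Vk": "M(A2)", "Qj": None,    "Qk": None},
    "Add2":  {"Op": "ADDD",  "Vj": None,    "Vk": "R(F2)", "Qj": "Add1",  "Qk": None},
    "Mult1": {"Op": "MULTD", "Vj": "M(A2)", "Vk": "R(F4)", "Qj": None,    "Qk": None},
    "Mult2": {"Op": "DIVD",  "Vj": None,    "Vk": "R(F6)", "Qj": "Mult1", "Qk": None},
}

print(broadcast(stations, "Add1", "(M1-M2)"))  # ['Add2']
print(stations["Add2"]["Vj"], ready(stations["Add2"]))  # (M1-M2) True
print(ready(stations["Mult2"]))                # False: still waiting on Mult1

print(broadcast(stations, "Mult1", "M*F4"))    # ['Mult2']
print(stations["Mult2"]["Vj"], ready(stations["Mult2"]))  # M*F4 True
```

Because consumers match on station tags rather than register names, DIVD receives the product directly from Mult1 without waiting for a register write.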