Architecture Categories
[Figure: block diagrams of the architecture categories, built from units labeled C, P and M connected by streams labeled IS and DS]
[Figure: plot with points labeled MPP (at 16K) and C.mmP (at 16)]
< K x K', D x D', W x W' >
K = control, D = data, W = word
The primed (dashed) factors K', D', W' give the degree of pipelining
TI - ASC <1, 4, 64 x 8>
CDC 6600 <1, 1 x 10, 60> x <10, 1, 12> (I/O)
C.mmP <16,1,16> + <1x16,1,16> + <1,16,16>
PEPE <1 x 3, 288, 32>
Cray-1 <1, 12 x 8, 64 x (1 ~ 14)>
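The triple notation above can be sketched as a small data structure. This is only an illustration: the class name `Triple` and its field names are hypothetical, and the sketch covers only plain integer entries (not ranges such as `1 ~ 14` or combinations of triples joined by `x` or `+`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One <K x K', D x D', W x W'> descriptor (names are illustrative)."""
    k: int           # K: control level
    d: int           # D: data level
    w: int           # W: word level
    k_pipe: int = 1  # K': degree of pipelining at the control level
    d_pipe: int = 1  # D': degree of pipelining at the data level
    w_pipe: int = 1  # W': degree of pipelining at the word level

    def __str__(self):
        def part(n, pipe):
            # The primed factor is written only when it differs from 1.
            return f"{n} x {pipe}" if pipe != 1 else str(n)

        return (f"<{part(self.k, self.k_pipe)}, "
                f"{part(self.d, self.d_pipe)}, "
                f"{part(self.w, self.w_pipe)}>")


ti_asc = Triple(k=1, d=4, w=64, w_pipe=8)
pepe = Triple(k=1, d=288, w=32, k_pipe=3)
print("TI - ASC", ti_asc)
print("PEPE", pepe)
```

Keeping the pipelining degree as a separate field per level mirrors the way the notation separates the plain and primed factors.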
Parallel architectures
• Data-parallel architectures
• Function-parallel architectures
Pipeline stages: IF, D, RF, EX/AG, M, WB
T = time for one instruction to pass through all S stages
b = frequency of interruptions
CPI = 1 + (S - 1) * b
Time = CPI * T / S
Anshul Kumar, CSE IITD slide 18
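As a quick numeric check of the two formulas above, here is a minimal sketch. The function names and the sample values (S = 6, b = 0.1, T = 6 ns) are assumptions chosen for illustration, not figures from the slides.

```python
def cpi(stages, b):
    """CPI = 1 + (S - 1) * b, with b the frequency of interruptions."""
    return 1 + (stages - 1) * b


def time_per_instruction(stages, b, total_time):
    """Time = CPI * T / S, where T / S is the time of one pipeline stage."""
    return cpi(stages, b) * total_time / stages


# Illustrative values only: 6 stages, 10% interruptions, T = 6 ns.
S, B, T_NS = 6, 0.1, 6.0
print("CPI:", cpi(S, B))
print("Time per instruction (ns):", time_per_instruction(S, B, T_NS))
print("Ideal, b = 0 (ns):", time_per_instruction(S, 0.0, T_NS))
```

Expanding the formula gives Time = T * (1 - b) / S + T * b, so adding stages helps less and less: for large S the time per instruction approaches b * T.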
ILP in VLIW processors
[Figure: VLIW organization. Cache/memory and a fetch unit supply a single multi-operation instruction to several functional units (FU), which share one register file. Legend: instruction/control paths, data paths, FU = Functional Unit]
• Instruction encoding
• Scalability: access time, area, and power consumption increase sharply with the number of register ports
Anshul Kumar, CSE IITD slide 22
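The scalability point can be illustrated with a rough back-of-the-envelope model. Both parts of it are assumptions, not statements from the slides: each functional unit is taken to need two read ports and one write port, and register file area is taken to grow with the square of the port count, a common first-order rule of thumb.

```python
def register_ports(num_fus, reads_per_fu=2, writes_per_fu=1):
    """Ports on the shared register file (assumed 2 reads + 1 write per FU)."""
    return num_fus * (reads_per_fu + writes_per_fu)


def relative_area(num_fus, baseline_fus=1):
    """Area relative to the baseline, assuming area grows as ports squared."""
    ratio = register_ports(num_fus) / register_ports(baseline_fus)
    return ratio ** 2


for n in (1, 2, 4, 8):
    print(f"{n} FU: {register_ports(n)} ports, "
          f"relative area {relative_area(n):.0f}x")
```

Under these assumptions, doubling the number of functional units doubles the port count but quadruples the register file area, which is one way to read "increase sharply".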
Tasks of superscalar processing
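Example: band matrix multiplication, C = A × B

\[
A = \begin{pmatrix}
A_{11} & A_{12} & 0 & 0 & 0 & 0 \\
A_{21} & A_{22} & A_{23} & 0 & 0 & 0 \\
A_{31} & A_{32} & A_{33} & A_{34} & 0 & 0 \\
0 & A_{42} & A_{43} & A_{44} & A_{45} & 0 \\
0 & 0 & A_{53} & A_{54} & A_{55} & A_{56} \\
0 & 0 & 0 & A_{64} & A_{65} & A_{66}
\end{pmatrix},
\qquad
B = \begin{pmatrix}
B_{11} & B_{12} & 0 & 0 & 0 & 0 \\
B_{21} & B_{22} & B_{23} & 0 & 0 & 0 \\
B_{31} & B_{32} & B_{33} & B_{34} & 0 & 0 \\
0 & B_{42} & B_{43} & B_{44} & B_{45} & 0 \\
0 & 0 & B_{53} & B_{54} & B_{55} & B_{56} \\
0 & 0 & 0 & B_{64} & B_{65} & B_{66}
\end{pmatrix}
\]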
[Figure: diagram with element labels B31 and A23]
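The band structure of the example means most of the multiply-adds in a dense product involve a zero operand. The following sketch skips them by looping only over in-band entries. It is a plain sequential illustration, not the parallel scheme the slides may have in mind; the function names and the test values (entry A_ik stored as the number 10*i + k) are hypothetical.

```python
def band_matmul(a, b, lower, upper):
    """Multiply two n x n band matrices, visiting only in-band entries.

    Entry [i][k] is assumed to be zero unless -upper <= i - k <= lower.
    """
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        # In-band columns of row i of A.
        for k in range(max(0, i - lower), min(n, i + upper + 1)):
            # In-band columns of row k of B.
            for j in range(max(0, k - lower), min(n, k + upper + 1)):
                c[i][j] += a[i][k] * b[k][j]
    return c


def make_band(n, lower, upper):
    """Hypothetical data: 1-based entry (i, k) holds 10*i + k inside the band."""
    return [[10 * (i + 1) + (k + 1) if -upper <= i - k <= lower else 0
             for k in range(n)] for i in range(n)]


def dense_matmul(a, b):
    """Reference dense product, used only to check the band version."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Same shape as the example: two sub-diagonals, one super-diagonal.
A = make_band(6, lower=2, upper=1)
B = make_band(6, lower=2, upper=1)
C = band_matmul(A, B, lower=2, upper=1)
print(C == dense_matmul(A, B))
```

The product is again a band matrix, but wider: with two sub-diagonals and one super-diagonal in each factor, C can have up to four sub-diagonals and two super-diagonals.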
[Figure: data-parallel vs. function-parallel architectures. Annotations: built using general purpose processors; Distributed Memory MIMD; Shared Memory MIMD; concurrent tasks/processes/threads/objects]