Hardware Architect
IBM
Execution BW
6 x 10^7 FLOPS
16000x
1 x 10^9 FLOPS
Storage BW
240 MB/sec to L1
25000x
2700x
6 TB/sec to L1
6 TB/sec to L2
3 TB/sec to L3
400 GB/sec 1 TB DRAM
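The storage bandwidth figures above can be cross-checked with simple arithmetic. This is a minimal sketch, assuming decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes) and assuming that the 25000x factor pairs the 240 MB/sec L1 figure with the 6 TB/sec L1 figure; the slide layout does not state that pairing explicitly, and the function name is illustrative.

```python
MB = 10**6   # decimal megabyte, an assumption about the slide's units
TB = 10**12  # decimal terabyte, same assumption

def scaled_bandwidth(base_bytes_per_sec, factor):
    """Scale a baseline bandwidth by a growth factor."""
    return base_bytes_per_sec * factor

l1_then = 240 * MB                          # 240 MB/sec to L1
l1_now = scaled_bandwidth(l1_then, 25_000)  # apply the 25000x factor
print(l1_now / TB)                          # 6.0, i.e. 6 TB/sec to L1
```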
But,
3D structures
Power reduction through more efficient gate utilization
What is OpenPOWER?
Industry Consortium focused on Innovation
- Across Server HW / SW stack
- For customized servers and components
- Leveraging complementary skills and investments
- To provide differentiated architectural alternatives
OpenPOWER: Open Innovation
IBM, Mellanox, TYAN, Google, NVIDIA
OpenPOWER: Today
Implementation / HPC / Research
System / Software / Services
3D Stacking
- Many levels of chips with low power and latency communication
- Enables larger caches stacked below the CPU
- Enables larger chips with good yield
- DRAM TSV enables larger capacity without power and frequency cuts
[Figure: chip layout with Cores, L2 Caches, Fast Local L3 Region, Mem Ctrl, Remote SMP +]
Gates in 3D advantage
Critical high power execution core takes up the penthouse (where heat can more easily be removed)
Smaller core yields higher frequency and reduced energy
Off chip
Wave pulses along a string
Multi-cycle transition
Current server-class designs gate large blocks (entire CPU)
Required to provide voltage stability through capacitance in the power grid
Fine-grain power gating will become possible in the server space with sophisticated 3D-based power delivery.
Potential ~4x reduction in leakage power.
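The contrast between coarse and fine grain power gating can be illustrated with a toy model. This is a sketch under stated assumptions, not the derivation behind the ~4x figure: the chip is assumed to split into equal blocks with equal leakage, coarse gating keeps every block powered whenever any block is active, and fine grain gating powers only the active blocks. The block counts are hypothetical.

```python
def leakage_units(active_blocks, total_blocks, fine_grain):
    """Leakage for one time step, in units of a single block's leakage."""
    if fine_grain:
        # Only the blocks doing work stay powered.
        return active_blocks
    # Coarse gating: the whole CPU stays powered if anything is active.
    return total_blocks if active_blocks > 0 else 0

coarse = leakage_units(2, 8, fine_grain=False)  # all 8 blocks leak
fine = leakage_units(2, 8, fine_grain=True)     # only 2 blocks leak
print(coarse / fine)                            # 4.0
```

With 2 of 8 blocks active, the toy model gives a 4x gap; a different activity level would give a different ratio.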
Future Memory
Today
DRAM ~100ns, Read and Write Durable, Volatile
Technology scaling slowdown
~100ns Read
~1 usec write
Read Durable
Non-volatile
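The latency figures above imply that the average cost of the non-volatile memory depends on the read/write mix. This is a minimal sketch, assuming a simple weighted average and a hypothetical 75% read share; neither the mix nor the function name comes from the slides.

```python
def mean_access_ns(read_fraction, read_ns, write_ns):
    """Average access latency for a given share of reads."""
    return read_fraction * read_ns + (1 - read_fraction) * write_ns

# DRAM today: ~100 ns for both reads and writes.
dram_ns = mean_access_ns(0.75, 100, 100)
# Non-volatile memory: ~100 ns read, ~1 usec (1000 ns) write.
nvm_ns = mean_access_ns(0.75, 100, 1000)
print(dram_ns, nvm_ns)  # 100.0 325.0
```

Under this assumed mix, the slower writes dominate the average, which is one reason write-heavy data may need different placement than read-mostly data.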
Optical Interconnects
POWER7 775 HPC system
Silicon Photonics, Multi-wavelength, 25 Gb/s Optics
10 Gb/s = 24 GB/s, 1 Color (Deuce)
25 Gb/s = 60 GB/s, 1 Color (Deuce)
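The two optical figures above are consistent with a single scale factor between per-lane line rate and aggregate bandwidth. This is a minimal sketch, assuming the aggregate scales linearly with line rate and taking the 10 Gb/s = 24 GB/s pair as the reference point; the slides do not state the lane count or encoding behind that factor, and the function name is illustrative.

```python
def aggregate_gbytes_per_sec(line_rate_gbits, ref_rate_gbits=10, ref_gbytes=24):
    """Scale aggregate bandwidth linearly from a reference line rate."""
    return line_rate_gbits * ref_gbytes / ref_rate_gbits

print(aggregate_gbytes_per_sec(10))  # 24.0 GB/s at 10 Gb/s
print(aggregate_gbytes_per_sec(25))  # 60.0 GB/s at 25 Gb/s
```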
Heterogeneous Computing
ASIC - An application-specific integrated circuit (ASIC) is an integrated circuit (IC) customized for a particular use, rather than intended for general-purpose use. For example, a chip designed solely to run a specific cell phone is an ASIC.
FPGA - A field-programmable gate array (FPGA) is an integrated circuit designed to be configured by the customer or designer after manufacturing, hence "field-programmable".
GPGPU - A General Purpose Graphics Processing Unit is a massively threaded processing engine capable of accelerating highly parallel
FPGA Capability
FPGA Trends
Current
Field
Deployed
Appliances
WFO
PoC /
Datapower
[Chart: FPGA cost trend in $/K LE's by year, 1993 to 2014. Data labels as extracted: 74.4, 49.6, 35.6, 31.3, 26.0, 14.0, 9.0, 6.6, 6.2, 5.3, 3.8, 2.0, 0.9, 0.8, 0.7. Axis ticks: 10.0, 1.0, 0.1]
Generic Logic
250 MHz
[Figure: processor core layout with Pervasive, Decimal Unit, Instruction Sequencing, Fixed Point Unit, Vector and Scalar Unit, Instruction Fetch and Decode, Load/Store Unit]
Software challenges
All of the following apply to:
Applications
Middleware: Compilers, database, etc.
System SW: OS, hypervisors, cluster, etc.
Parallel programming
Accelerator usage (heterogeneous computing)
Workload partitioning
FPGA compilation
Tiered memory management
More levels, diverse types
Melding of main memory and storage
EDA tools required to support complex design structures and circuit power optimization (e.g. productive fine grain power gating, diffraction mask generation, etc.)
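The workload partitioning challenge listed above often reduces to deciding whether a kernel is worth moving to an accelerator. This is a minimal sketch of one common rule of thumb, assuming that offload pays off only when accelerator compute time plus data transfer time beats CPU time; the function, parameters and numbers are hypothetical and do not come from the slides.

```python
def should_offload(cpu_seconds, accel_seconds, bytes_moved, link_bytes_per_sec):
    """Return True when accelerator time plus transfer time beats the CPU."""
    transfer_seconds = bytes_moved / link_bytes_per_sec
    return accel_seconds + transfer_seconds < cpu_seconds

GB = 10**9  # decimal gigabyte

# Hypothetical kernel: 2.0 s on CPU, 0.25 s on the accelerator,
# 4 GB moved over a 16 GB/s link (0.25 s of transfer).
print(should_offload(2.0, 0.25, 4 * GB, 16 * GB))   # True

# Same kernel with 64 GB moved: 4.0 s of transfer outweighs the speedup.
print(should_offload(2.0, 0.25, 64 * GB, 16 * GB))  # False
```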
World leading
Technology
Research labs
Hardware design
Software design
System design