Documente Academic
Documente Profesional
Documente Cultură
2
Moores Law
3
CMOS Computer Performance
100.00
intel 486
Specint 2006 intel pentium
intel pentium 2
intel pentium 3
intel pentium 4
intel itanium
10.00 Alpha 21064
Alpha 21164
Alpha 21264
Sparc
SuperSparc
Sparc64
1.00 M ips
HP PA
Power PC
AM D K6
AM D K7
AM D x86-64
IBM Power
0.10 SUN UltraSPARC
Intel Core 2
AM D Opteron
AM D Phenom
0.01
88 90 92 94 96 98 00 02 04 0
4
CMOS Computer Performance
100.00
intel 486
Specint 2006 intel pentium
intel pentium 2
intel pentium 3
intel pentium 4
intel itanium
10.00 Alpha 21064
Alpha 21164
Alpha 21264
Sparc
SuperSparc
Sparc64
1.00 M ips
HP PA
Power PC
AM D K6
AM D K7
AM D x86-64
IBM Power
0.10 SUN UltraSPARC
Intel Core 2
AM D Opteron
AM D Phenom
0.01
88 90 92 94 96 98 00 02 04 06 08 10
5
Moores Original Issues
Design cost
Power dissipation
What to do with all the functionality possible
ftp://download.intel.com/research/silicon/moorespaper.pdf
6
Scaling MOS Devices
1000
Watts
100
10
1
88 90 92 94 96 98 00 02 04 06 08 10
8
Power Density
1.00
Watts/mm2
0.10
0.01
85 87 89 91 93 95 97 99 01 03 05 07 09
9
Why Power Increased
10000
1000
100
10
85 87 89 91 93 95 97 99 01 03 05 07 09
10
Good News
10
85 87 89 91 93 95 97 99 01 03 05
11
Processor Power
10
1
85 87 89 91 93 95 97 99 01 03
12
Bad News
Ed Nowak, IBM
13
Technology Scaling Today
14
Other Technologies
15
Problem with Different Technologies
17
Optimizing Energy
Performance
18
Optimizing Energy
Performance
19
Optimizing Energy
Performance
20
Years of Low Power Research
Can waste
Energy (clock gating, leakage control, etc)
Performance
Adding additional constraints to operation flow
22
The Push for Parallelism
i ntel 386
i ntel 486
Watts/Spec*L*Vdd^2
i ntel penti um
i ntel penti um2
i ntel penti um3
i ntel penti um4
0.1 i ntel i tani um
Al pha 21064
Al pha 21164
Al pha 21264
Spar c
Super Spar c
Spar c64
Mi ps
10 parallel HP PA
Power PC
processors AMD K6
0.01 AMD K7
AMD x86-64
1 10 100 1000
Spec2000 *L
23
Exploit Parallelism / Scale Vdd
Energy/op
Add more function units
Fill up new die (2x)
Lower energy/op
E/P will decrease
Vdd, sizes, etc will reduce
Build simpler architectures
Performance
Works well when E/P is large
But what happens when that runs out?
24
Problem Reformulation
User Performance
25
Exploit Specialization
26
ASIC/SOC Design Trends
27
28
ASIC Future Depends on Your Religion
29
Computings Future:
30
Can We Do Better?
31
Chip Generator Idea
Application
$$$
Configure a Configure /
ASIC design
Process: programmable program a virtual
chip chip
Generate
optimized chip
Not
Efficient
Final Custom Configured Semi-custom
System System System
Product:
32
Another Way To Put It
ASIC
CMP Excel perf/watt per app
Performance
33
Smart Memories - A Pretend Generator
Data Instruction
Cache Cache
M M M M M M M M T D D T D D M M
M M M M M M M M T D D M M M M M
Crossbar Crossbar
34
Looks Promising
Chip
Multiprocessor
Generator
Optimization Verification
(Omid, Pete) (Megan, Ofer)
Given an application
Verification of a chip is
and a target (power /
difficult. How do we
perf.) How do we find
verify a generator?
the best chip plan?
36
Conclusions
37