Abstraction Layers
Algorithm
Programming Language
Operating System/Virtual Machines
Instruction Set Architecture (ISA)
Gates/Register-Transfer Level (RTL)
Circuits
Devices
Physics
Computer Architecture is Design and Analysis
Architecture is an iterative process:
• Searching the space of possible designs
• At all levels of computer systems
[Figure: the design/analysis loop. Labels: Design, Analysis, Creativity, Cost/Performance Analysis. Outcomes: Good Ideas, Mediocre Ideas, Bad Ideas.]
Computer Architecture
[Figure: feedback loop between applications and technology. Applications suggest how to improve technology and provide revenue to fund development; improved technologies make new applications possible.]
Crossroads: Conventional Wisdom
• Old Conventional Wisdom: Power is free, Transistors expensive
• New Conventional Wisdom: “Power wall”: Power is expensive, transistors are free
(Can put more on chip than can afford to turn on)
• Old CW: Instruction Level Parallelism (ILP) can be increased sufficiently via compilers
and innovation (Out-of-order, speculation, VLIW, …)
• New CW: “ILP wall”: law of diminishing returns on more HW for ILP
• Old CW: Multiplies are slow, Memory access is fast
• New CW: “Memory wall” Memory slow, multiplies fast
(200 clock cycles to DRAM memory, 4 clocks for multiply)
• Old CW: Uniprocessor performance 2X / 1.5 yrs
• New CW: Power Wall + ILP Wall + Memory Wall = Brick Wall
• Uniprocessor performance now 2X / 5(?) yrs
• Sea change in chip design: multiple “cores”
(2X processors per chip / ~ 2 years)
• A larger number of simpler processors is more power efficient
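The two doubling rates quoted above (2X / 1.5 yrs versus 2X / 5 yrs) are easier to compare as annual growth rates. The following is a minimal sketch of that arithmetic; the function names are illustrative, and the 5-year figure is the one the slide itself marks with a question mark.

```python
def annual_growth(factor: float, years: float) -> float:
    """Per-year multiplier equivalent to growing by `factor` every `years` years."""
    return factor ** (1.0 / years)


def growth_over(factor: float, years: float, span: float) -> float:
    """Total multiplier accumulated over `span` years at the same rate."""
    return factor ** (span / years)


old_rate = annual_growth(2.0, 1.5)  # old CW: 2X every 1.5 years
new_rate = annual_growth(2.0, 5.0)  # new CW: 2X every 5 years (uncertain on the slide)

print(f"old: {old_rate - 1:.0%}/yr, new: {new_rate - 1:.0%}/yr")
print(f"over 10 years: old {growth_over(2.0, 1.5, 10):.0f}x, "
      f"new {growth_over(2.0, 5.0, 10):.0f}x")
```

Under these assumptions the old trend is roughly 59% per year and the new one roughly 15% per year, which compounds to about 100x versus 4x over a decade.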
Instruction Set Architecture: Critical Interface
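[Figure: the instruction set is the interface between software (above) and hardware (below).]
[Figure: Computer Architecture (Organization, Hardware/Software Boundary) at the center of related areas: Technology, Parallelism, Programming Languages, Applications, Interface Design (ISA), Compilers.]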
[Figure: course map]
ECE369 (strong prerequisite): Basic computer organization, first look at pipelines + caches
ECE462/562: Computer Architecture, first look at parallel architectures
ECE569: High Performance Computing, Advanced Topics
Also shown: ECE 474/574, ECE 576, Parallel Processing
• Topics
1. Simple machine design (ISAs, microprogramming,
unpipelined machines, Iron Law, simple pipelines)
2. Memory hierarchy (DRAM, caches, optimizations) plus
virtual memory systems, exceptions, interrupts
3. Complex pipelining (score-boarding, out-of-order issue)
4. Explicitly parallel processors (vector machines, VLIW
machines, multithreaded machines)
5. Multiprocessor architectures (memory models, cache
coherence, synchronization)
Your ECE462/562
Coping with ECE462/562
Policies
• Background: ECE369 or equivalent, based on Patterson
and Hennessy’s Computer Organization and Design
• Prerequisite: ECE274 & ECE369 & Programming in C
• 3 to 4 assignments, 2 exams, final project
• Grad students: extra exam questions, survey
paper and presentation
• NO LATE ASSIGNMENTS
• Make-ups may be arranged prior to the scheduled activity.
• Inquiries about graded material must be made within 3 days of
receiving a grade.
• You are encouraged to discuss the assignment
specifications with your instructor and your fellow students.
However, anything you submit for grading must be your own work and
must NOT duplicate another source.
• Read before the class
• Participate and ask questions
• Manage your time
• Start working on assignments early
Grading
Distribution of Components
Assignments + Quiz + Participation   35%
Exam-I                               15%
Exam-II                              15%
Project                              35%

Grade Scale
90-100%   A
80-89%    B
70-79%    C
60-69%    D
Research Paper Reading
Project (Undergrad vs Grad)
• Topics: (Chapter 1)
Technology trends
Performance equations
Technology Trends and This Book
• In Applications
Data-level parallelism (DLP)
o Many data items that can be operated on concurrently
Task-level parallelism (TLP)
o Tasks of work that can operate independently
• In Hardware
Instruction-level parallelism (ILP): exploits DLP with compiler help, pipelining, and speculative execution
Vector architectures and GPUs: exploit DLP by applying a single instruction to a collection of data
Thread-level parallelism: exploits DLP and TLP in tightly coupled hardware that allows interaction among threads
Request-level parallelism: exploits largely decoupled tasks specified by the programmer
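The difference between data-level and task-level parallelism in applications can be sketched in a few lines. This is only a structural illustration under stated assumptions: the helper names (`scale`, `data_parallel`, `task_parallel`) are hypothetical, and in CPython a thread pool shows the shape of the parallelism rather than a real speedup for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def scale(x: int) -> int:
    """The single operation applied to every data item."""
    return 2 * x


def data_parallel(values):
    # Data-level parallelism: the SAME operation is applied to many
    # independent data items; pool.map returns results in input order.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(scale, values))


def task_parallel(values):
    # Task-level parallelism: DIFFERENT, independent tasks run concurrently.
    with ThreadPoolExecutor() as pool:
        total = pool.submit(sum, values)
        largest = pool.submit(max, values)
        return total.result(), largest.result()


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    print(data_parallel(data))
    print(task_parallel(data))
```

In hardware terms, the first pattern is what vector units and GPUs target, while the second maps more naturally onto multiple threads or cores.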
Processor Technology Trends
Trends: Historical Perspective
Power Consumption Trends
Recent Microprocessor Trends
[Figure: microprocessor trends, 2004 to 2010. Annotations: Performance 1.15x, Frequency 1.05x, Power 1.04x. Source: Micron University Symp.]
Improving Energy Efficiency Despite Flat Clock Rate
Modern Processor Today
• Intel Core i7
Other Technology Trends
First Microprocessor Intel 4004, 1971
• 4-bit accumulator architecture
• 8 µm pMOS
• 2,300 transistors
• 3 x 4 mm²
• 750 kHz clock
• 8-16 cycles/inst.
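The clock rate and cycles-per-instruction figures above imply a rough instruction throughput for the 4004. The following is a minimal sketch of that arithmetic, using only the numbers on the slide; the function name is illustrative.

```python
CLOCK_HZ = 750_000  # 750 kHz clock, from the slide


def instructions_per_second(clock_hz: float, cycles_per_inst: float) -> float:
    """Throughput when every instruction takes `cycles_per_inst` cycles."""
    return clock_hz / cycles_per_inst


fastest = instructions_per_second(CLOCK_HZ, 8)   # all 8-cycle instructions
slowest = instructions_per_second(CLOCK_HZ, 16)  # all 16-cycle instructions

print(f"{slowest:,.0f} to {fastest:,.0f} instructions per second")
```

So the chip executed somewhere between about 47 thousand and 94 thousand instructions per second, depending on the instruction mix.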
Hardware
• Team from IBM building PC prototypes in 1979
• Motorola 68000 chosen initially, but 68000 was late
• 8088 is 8-bit bus version of 8086 => allows cheaper system
• Estimated sales of 250,000
• 100,000,000s sold
Performance
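$$\text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}}$$

What affects each term:
               Instructions/Program   Cycles/Instruction   Seconds/Cycle
Compiler                X                     X
Inst. Set               X                     X
Organization                                  X                  X
Technology                                                       X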
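The CPU time equation above can be evaluated directly. The following is a minimal sketch; the function name and the workload numbers (instruction count, CPI, clock rate) are hypothetical values chosen for illustration, not measurements from the course.

```python
def cpu_time(instructions: float, cpi: float, clock_hz: float) -> float:
    """CPU time in seconds = instructions x cycles/instruction x seconds/cycle."""
    seconds_per_cycle = 1.0 / clock_hz
    return instructions * cpi * seconds_per_cycle


# Hypothetical program: 1 billion instructions on a 1 GHz machine.
baseline = cpu_time(instructions=1e9, cpi=2.0, clock_hz=1e9)
# Same program and clock, but a better organization lowers CPI to 1.5.
improved = cpu_time(instructions=1e9, cpi=1.5, clock_hz=1e9)

print(f"baseline: {baseline:.2f} s, improved: {improved:.2f} s, "
      f"speedup: {baseline / improved:.2f}x")
```

Because the three terms multiply, a change that improves one term only helps if it does not worsen another by a larger factor, for example a compiler optimization that removes instructions but raises the average CPI.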
Amdahl’s Law
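A minimal sketch of the standard form of Amdahl's Law, assuming a fraction f of the original execution time can be sped up by a factor s: overall speedup = 1 / ((1 - f) + f / s). The function and parameter names are illustrative, and the example fractions are hypothetical.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only part of the execution time is accelerated."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    unenhanced = 1.0 - fraction_enhanced
    return 1.0 / (unenhanced + fraction_enhanced / speedup_enhanced)


# Hypothetical cases: half of the time made 2x faster, and 90% made 10x faster.
print(f"{amdahl_speedup(0.5, 2.0):.3f}")
print(f"{amdahl_speedup(0.9, 10.0):.3f}")
# Upper bound as the enhanced part becomes arbitrarily fast: 1 / (1 - f).
print(f"{1.0 / (1.0 - 0.9):.1f}")
```

The last line shows why the unenhanced fraction dominates: with 10% of the time untouched, no enhancement can deliver more than a 10x overall speedup.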
Principle of Locality
Exploit Parallelism