Lesson 5: Processor Design: Topic 1 - Methods and Concepts

EE37E 2005
Introduction
References:
- Modern Processor Design Book (pp. 1-16)
- Computer Organization and Design Book (pp. 54-89)

While introducing this topic we will focus on these points:
- Evolution of microprocessors
- Instruction set processor design
- Principles

Microprocessors are instruction set processors (ISPs).
An ISP executes instructions from a predefined instruction set.
A microprocessor's functionality is fully characterized by the instruction set it is capable of executing.
This predefined instruction set is also called the instruction set architecture (ISA).
An ISA serves as an interface between software and hardware.
In terms of processor design methodology, an ISA is the specification of the design, while the microprocessor or ISP is the implementation of the design.

Computer System Components

CPU
- 1000 MHz - 3 GHz (a multiple of system bus speed)
- Pipelined (7-21 stages)
- Superscalar (max ~4 instructions/cycle), single-threaded
- Dynamically scheduled or VLIW
- Dynamic and static branch prediction

Caches: L1, L2, L3

System Bus (support for one or more CPUs)
- Examples: Alpha, AMD K7: EV6, 400 MHz
- Intel PII, PIII: GTL+, 133 MHz
- Intel P4: 800 MHz

Memory Controller, Memory Bus, Memory
- SDRAM PC100/PC133: 100-133 MHz, 64-128 bits wide, 2-way interleaved, ~900 MBytes/sec
- Double Data Rate (DDR) SDRAM PC3200: 400 MHz (effective 200x2), 64-128 bits wide, 4-way interleaved, ~3.2 GBytes/sec (second half 2002)
- Rambus DRAM (RDRAM) PC800, PC1060: 400-533 MHz (DDR), 16-32 bits wide channel, ~1.6-3.2 GBytes/sec (per channel)

Chipset: North Bridge, South Bridge

I/O Buses (controllers, adapters, NICs)
- Example: PCI-X, 133 MHz
- PCI: 33-66 MHz, 32-64 bits wide, 133-1024 MBytes/sec

I/O Devices
- Disks, displays, keyboards
- Networks: Fast Ethernet, Gigabit Ethernet, ATM, Token Ring, ...

Computer System Components: Enhanced CPU Performance & Capabilities

CPU (SMT, CMP)
- Support for Simultaneous Multithreading (SMT): Alpha EV8.
- VLIW & intelligent compiler techniques: Intel/HP EPIC IA-64.
- More advanced branch prediction techniques.
- Chip Multiprocessors (CMPs): The Hydra Project, IBM Power4 and Power5.
- Vector processing capability: Vector Intelligent RAM (VIRAM), or multimedia ISA extension.
- Digital Signal Processing (DSP) capability in system.
- Re-configurable computing hardware capability in system.

Memory Latency Reduction
- Conventional and block-based trace cache.
- Integrate the memory controller and a portion of main memory with the CPU: Intelligent RAM.
- Integrated memory controller: AMD Opteron, IBM Power5.

Other components shown in the diagram: caches (L1, L2, L3), system bus, chipset (North Bridge, South Bridge), memory controller, memory bus, memory, I/O buses, controllers, adapters, NICs, and I/O devices (disks (RAID), displays, keyboards, networks).

Recent Trends in Computer Design
The cost/performance ratio of computing systems has seen a steady decline due to advances in:
- Integrated circuit technology: decreasing feature size.
  - Clock rate improves roughly in proportion to the improvement in feature size.
  - Number of transistors improves in proportion to the square of that improvement (or faster).
- Architectural improvements in CPU design.
Microprocessor systems directly reflect IC improvement in terms of a yearly 35 to 55% improvement in performance.
Assembly language has been mostly eliminated and replaced by other alternatives such as C or C++.
Standard operating systems (UNIX, NT) lowered the cost of introducing new architectures.
Emergence of RISC architectures and RISC-core architectures.
Adoption of quantitative approaches to computer design based on empirical performance observations.

Microprocessor Architecture Trends

CISC Machines
- instructions take variable times to complete
RISC Machines (microcode)
- simple instructions, optimized for speed
RISC Machines (pipelined)
- same individual instruction latency
- greater throughput through instruction "overlap"
Superscalar Processors
- multiple instructions executing simultaneously
Multithreaded Processors
- additional HW resources (regs, PC, SP)
- each context gets processor for x cycles
VLIW
- "Super instructions" grouped together
- decreased HW control complexity
CMPs: Single Chip Multiprocessors
- duplicate entire processors
- (tech soon due to Moore's Law)
Simultaneous Multithreading (SMT)
- multiple HW contexts (regs, PC, SP)
- each cycle, any context may execute
SMT/CMPs (e.g. IBM Power5 in 2004)

Evolution of microprocessors
Figure 1: Evolution of microprocessors
[Plot: transistor count (log scale, 1,000 to 100,000,000) versus year (1970 to 2000), following Moore's Law through the i4004, i8080, i8086, i80286, i80386, i80486 and Pentium, with a "Graduation Window" marker.]
Annotated transistor counts: Alpha 21264: 15 million; Pentium Pro: 5.5 million; PowerPC 620: 6.9 million; Alpha 21164: 9.3 million; Sparc Ultra: 5.2 million.
CMOS improvements: die size 2X every 3 yrs; line width halves every 4-7 yrs.

Three decades of the history of microprocessors tell a truly remarkable story of advances in the computer industry (Table 1).

                     1970-1980    1980-1990    1990-2000       2000-2010
Transistor count     2K-100K      100K-1M      1M-100M         100M-2B
Clock frequency      0.1-3 MHz    3-30 MHz     30 MHz-1 GHz    1-15 GHz
Instructions/cycle   0.1          0.1-0.9      0.9-1.9         1.9-2.9

Table 1. The amazing decades of the evolution of microprocessors

Hierarchy of Computer Architecture

Levels, from software down to hardware:
- Application
- Operating System
- Compiler
- Firmware
- Instruction Set Architecture (Instr. Set Proc., I/O system): the software/hardware boundary
- Datapath & Control
- Digital Design
- Circuit Design
- Layout

Representations used across these levels:
- High-Level Language Programs
- Assembly Language Programs
- Machine Language Program
- Microprogram
- Register Transfer Notation (RTN)
- Logic Diagrams
- Circuit Diagrams

Instruction Set Processor Design

Critical to an instruction set processor (ISP) is the instruction set architecture, which specifies the functionality that the ISP must implement.

The Design Process

"To Design Is To Represent"
- Design activity yields a description/representation of an object.
- The traditional craftsman does not distinguish between the conceptualization and the artifact.
- Separation comes about because of complexity.
- The concept is captured in one or more representation languages.
- This process IS design.

Design Begins With Requirements
- Functional capabilities: what it will do
- Performance characteristics: speed, power, area, cost, ...

Design Process (cont.)

Design Finishes As Assembly
- Design is understood in terms of components and how they have been assembled.
- Top-down decomposition of complex functions (behaviors) into more primitive functions.
- Bottom-up composition of primitive building blocks into more complex assemblies.
[Diagram: a CPU decomposed into Datapath and Control, then into ALU, Regs and Shifter, down to the Nand Gate.]
Design is a "creative process," not a simple method.

Design as Search
[Diagram: a search tree with the nodes Problem A; Strategy 1, Strategy 2; SubProb 1, SubProb 2, SubProb 3; and building blocks BB1, BB2, BB3, ..., BBn.]
Design involves educated guesses and verification:
-- Given the goals, how should these be prioritized?
-- Given alternative design pieces, which should be selected?
-- Given design space of components & assemblies, which part will yield the best solution?
Feasible (good) choices vs. optimal choices

Instruction Set Architecture (subset of Computer Architecture)

"... the attributes of a [computing] system as seen by the programmer, i.e., the conceptual structure and functional behavior, as distinct from the organization of the data flows and controls, the logic design, and the physical implementation."
Amdahl, Blaauw, and Brooks, 1964

As seen by SOFTWARE, the instruction set architecture covers:
- Organization of Programmable Storage
- Data Types & Data Structures: Encodings & Representations
- Instruction Set
- Instruction Formats
- Modes of Addressing and Accessing Data Items and Instructions

The Instruction Set: a Critical Interface
[Figure 2: ISA. The instruction set sits between software (above) and hardware (below).]

Dynamic-Static Interface

We have discussed two critical roles played by the ISA:
- Contract between software and hardware, which facilitates the development of programs and machines
- Specification for microprocessor design
The third role is an associated definition of an interface that separates what is done statically at compile time from what is done dynamically at run time. This interface is called the dynamic-static interface (DSI).

Figure 3: The dynamic-static feature
[Diagram: above the architecture (DSI) line: program (software), compiler complexity, exposed to software, static. Below the line: machine (hardware), hardware complexity, hidden in hardware, dynamic.]

Computer Architecture Topics

- Input/Output and Storage: Disks, WORM, Tape; RAID
- Memory Hierarchy: DRAM, L2 Cache, L1 Cache
  - Emerging Technologies, Interleaving, Bus protocols
  - Coherence, Bandwidth, Latency
- VLSI
- Instruction Set Architecture: Addressing, Protection, Exception Handling
- Pipelining and Instruction Level Parallelism: Pipelining, Hazard Resolution, Superscalar, Reordering, Prediction, Speculation, Vector, DSP

Principles of Processor Performance
Definitions
Performance is in units of things per second: bigger is better.
If we are primarily concerned with response time:
\[ \text{performance}(x) = \frac{1}{\text{execution\_time}(x)} \]
"X is n times faster than Y" means:
\[ n = \frac{\text{Performance}(X)}{\text{Performance}(Y)} = \frac{\text{Execution\_time}(Y)}{\text{Execution\_time}(X)} \]

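As a quick illustration of the definition, the following sketch computes how many times faster one machine is than another. The function names and the timings (10 s and 15 s) are hypothetical values chosen for illustration, not measurements from the course.

```python
def performance(execution_time_s):
    """Performance is the reciprocal of execution time."""
    return 1.0 / execution_time_s


def times_faster(exec_time_x, exec_time_y):
    """Return n such that 'X is n times faster than Y'."""
    return exec_time_y / exec_time_x


# Hypothetical timings: X runs the program in 10 s, Y in 15 s.
n = times_faster(10.0, 15.0)
print(f"X is {n:.2f} times faster than Y")
```

Both forms of the ratio agree: dividing the performances gives the same n as dividing the execution times in the opposite order.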
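Cycles Per Instruction

IC = Instruction Count
CPI = Clock cycles Per Instruction
\[ \text{CPU time} = \text{Number of clock cycles} \times \text{Clock cycle time} = \frac{\text{Number of clock cycles}}{\text{Clock frequency}} \]
\[ \text{CPI} = \frac{\text{Number of clock cycles}}{\text{IC}} \]
\[ \text{CPU time} = \text{IC} \times \text{CPI} \times \text{Clock cycle time} = \frac{\text{IC} \times \text{CPI}}{\text{Clock rate}} \]
\[ \text{CPU time} = \text{Cycle time} \times \sum_{j=1}^{n} \text{CPI}_j \times \text{IC}_j \]
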
Cycles Per Instruction

We may separate the contribution of each type of instruction to the execution time by defining:
\[ \text{Number of clock cycles} = \sum_{j=1}^{n} \text{CPI}_j \times \text{IC}_j \]
where IC_j is the number of times that instruction j is executed, and CPI_j is the average number of clocks required to execute instruction j.
Processor pipelining and memory interactions limit the accuracy of this approach, but it's a good first guess. For accuracy, it is necessary to simulate the instructions of an entire program with issue, pipeline and memory interactions.

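The per-class sum can be sketched in a few lines. The instruction mix below (counts and CPI values for three instruction classes) and the 1 GHz clock are made-up numbers for illustration only.

```python
def cpu_time(instruction_mix, clock_rate_hz):
    """Return (cpu_time_seconds, average_cpi).

    instruction_mix is a list of (IC_j, CPI_j) pairs, one per instruction class.
    """
    total_cycles = sum(ic * cpi for ic, cpi in instruction_mix)
    total_instructions = sum(ic for ic, _ in instruction_mix)
    average_cpi = total_cycles / total_instructions
    return total_cycles / clock_rate_hz, average_cpi


# Hypothetical mix: (count, CPI) for ALU, load/store and branch instructions.
mix = [(50_000_000, 1), (30_000_000, 2), (20_000_000, 3)]
seconds, cpi = cpu_time(mix, 1_000_000_000)
print(f"average CPI = {cpi:.2f}, CPU time = {seconds:.3f} s")
```

With these numbers the program needs 170 million cycles for 100 million instructions, so the average CPI is 1.7 and the CPU time at 1 GHz is 0.17 s.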
Aspects of CPU Performance (CPU Law)

\[ \text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}} \]

Amdahl's Law
Speedup due to enhancement E:
\[ \text{Speedup}(E) = \frac{\text{ExTime w/o } E}{\text{ExTime w/ } E} = \frac{\text{Performance w/ } E}{\text{Performance w/o } E} \]
Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected.
E.g. special instructions, memory, I/O, parallel processing.

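Amdahl's Law
\[ \text{ExTime}_{new} = \text{ExTime}_{old} \times \left[ (1 - \text{Fraction}_{enhanced}) + \frac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}} \right] \]
\[ \text{Speedup}_{overall} = \frac{\text{ExTime}_{old}}{\text{ExTime}_{new}} = \frac{1}{(1 - \text{Fraction}_{enhanced}) + \dfrac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}}} \]
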
Amdahl's Law
Example: floating point instructions improved to run 2X, but only 10% of actual instructions are FP.
\[ \text{ExTime}_{new} = \text{ExTime}_{old} \times \left[ (1 - 0.1) + \frac{0.1}{2} \right] = 0.95 \times \text{ExTime}_{old} \]
\[ \text{Speedup}_{overall} = \frac{\text{ExTime}_{old}}{\text{ExTime}_{new}} = \frac{1}{0.95} = 1.053 \]

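The floating point example can be checked with a short sketch of Amdahl's Law. The function name is an illustrative choice; the inputs are the 10% fraction and 2X factor from the example.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when a fraction of the task is sped up by a given factor."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# 10% of the instructions are floating point and they run 2X faster.
print(f"{amdahl_speedup(0.1, 2.0):.3f}")
```

Trying a very large factor for the enhanced part, for example amdahl_speedup(0.1, 1e9), shows that the overall speedup never exceeds 1 / 0.9, because 90% of the task is unaffected.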
Topic 2: Instruction Set Architecture Design
Adapted from Prof. Jerry Breecher's Notes + my CS21Q Notes
(http://babbage.clarku.edu/~jbreecher/arch/arch.html)

Introduction
7.1 Introduction
7.2 Classifying Instruction Set Architectures
7.3 Memory Addressing
7.4 Operations in the Instruction Set
7.5 Type and Size of Operands
7.6 Encoding an Instruction Set
7.7 The Role of Compilers
7.8 The MIPS Architecture and Bonus
7.9 Endianness

Introduction
The Instruction Set Architecture is that portion of the machine visible to the assembly level programmer or to the compiler writer.
[Diagram: the instruction set sits between software and hardware.]
Questions:
- What are the advantages and disadvantages of various instruction set alternatives?
- How do languages and compilers affect ISA?

Classifying Instruction Set Architectures
Classifications can be by:
1. Stack/accumulator/register
2. Number of memory operands.
3. Number of total operands.

Instruction Set Architectures: Basic ISA Classes

Accumulator:
  1 address      add A        acc <- acc + mem[A]
  1+x address    addx A       acc <- acc + mem[A + x]
Stack:
  0 address      add          tos <- tos + next
General Purpose Register:
  2 address      add A B      EA(A) <- EA(A) + EA(B)
  3 address      add A B C    EA(A) <- EA(B) + EA(C)
  ALU instructions can have two or three operands.
Load/Store:
  0 memory       load R1, Mem1
                 load R2, Mem2
                 add R1, R2
  1 memory       add R1, Mem2
  ALU instructions can have 0, 1, 2, 3 operands. Shown here are cases of 0 and 1.

Instruction Set Architectures: Basic ISA Classes
The results of different address classes are easiest to see with the examples here, all of which implement the sequence for C = A + B.

Stack      Accumulator   Register (register-memory)   Register (load-store)
Push A     Load A        Load R1, A                   Load R1, A
Push B     Add B         Add R1, B                    Load R2, B
Add        Store C       Store C, R1                  Add R3, R1, R2
Pop C                                                 Store C, R3

Registers are the class that won out. The more registers on the CPU, the better.

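To make the stack sequence for C = A + B concrete, here is a minimal sketch of a stack machine interpreter. The instruction tuples, the run_stack_machine name and the values stored at A and B are all hypothetical; this is a toy model, not the instruction set of any real processor.

```python
def run_stack_machine(program, memory):
    """Execute a list of (opcode, operand...) tuples on a simple operand stack."""
    stack = []
    for opcode, *operands in program:
        if opcode == "push":          # push the value stored at a memory name
            stack.append(memory[operands[0]])
        elif opcode == "add":         # tos <- tos + next (0-address instruction)
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "pop":         # store the top of stack to a memory name
            memory[operands[0]] = stack.pop()
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


# Push A; Push B; Add; Pop C
program = [("push", "A"), ("push", "B"), ("add",), ("pop", "C")]
memory = run_stack_machine(program, {"A": 2, "B": 3})
print(memory["C"])
```

Note that the add instruction names no operands at all: both inputs and the result are implied by the stack, which is what "0 address" means.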
Instruction Set Architectures: Intel 80x86 Integer Registers

GPR0   EAX      Accumulator
GPR1   ECX      Count register, string, loop
GPR2   EDX      Data Register; multiply, divide
GPR3   EBX      Base Address Register
GPR4   ESP      Stack Pointer
GPR5   EBP      Base Pointer for base of stack seg.
GPR6   ESI      Index Register
GPR7   EDI      Index Register
       CS       Code Segment Pointer
       SS       Stack Segment Pointer
       DS       Data Segment Pointer
       ES       Extra Data Segment Pointer
       FS       Data Seg. 2
       GS       Data Seg. 3
PC     EIP      Instruction Counter
       Eflags   Condition Codes

Memory Addressing
Sections Include:
- Interpreting Memory Addresses
- Addressing Modes
- Displacement Address Mode
- Immediate Address Mode

Memory Addressing: Interpreting Memory Addresses
What object is accessed as a function of the address and length?

Objects have byte addresses: an address refers to the number of bytes counted from the beginning of memory.
Little Endian: puts the byte whose address is xx00 at the least significant position in the word.
Big Endian: puts the byte whose address is xx00 at the most significant position in the word.
Alignment: data must be aligned on a boundary equal to its size. Misalignment typically results in an alignment fault that must be handled by the operating system.

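The alignment rule (an object of size s must sit at an address that is a multiple of s) can be sketched as a one-line check. The function name and the sample addresses are illustrative assumptions.

```python
def is_aligned(address, size):
    """True if an object of `size` bytes at `address` sits on a size-byte boundary."""
    return address % size == 0


# Hypothetical addresses: a 4-byte word at 0x1000 and at 0x1002.
print(is_aligned(0x1000, 4))  # word on a 4-byte boundary
print(is_aligned(0x1002, 4))  # word straddling a boundary
print(is_aligned(0x1002, 2))  # the same address is fine for a half word
```

Because the sizes involved are powers of two, hardware can make the same check by testing that the low-order address bits are zero.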
Memory Addressing: Addressing Modes
This table shows the most common modes. A more complete set is in Figure 2.6.

Addressing Mode     Example Instruction   Meaning                           When Used
Register            Add R4, R3            R[R4] <- R[R4] + R[R3]            When a value is in a register.
Immediate           Add R4, #3            R[R4] <- R[R4] + 3                For constants.
Displacement        Add R4, 100(R1)       R[R4] <- R[R4] + M[100 + R[R1]]   Accessing local variables.
Register Deferred   Add R4, (R1)          R[R4] <- R[R4] + M[R[R1]]         Using a pointer or a computed address.
Absolute            Add R4, (1001)        R[R4] <- R[R4] + M[1001]          Used for static data.

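The "Meaning" column can be read as a recipe for fetching the second operand. The sketch below models the register file R and memory M as dictionaries; the fetch_operand name, the mode strings and the register and memory contents are all assumptions made for illustration.

```python
def fetch_operand(mode, R, M, operand):
    """Return the value of the second source operand for each addressing mode."""
    if mode == "register":            # Add R4, R3      -> R[R3]
        return R[operand]
    if mode == "immediate":           # Add R4, #3      -> 3
        return operand
    if mode == "displacement":        # Add R4, 100(R1) -> M[100 + R[R1]]
        displacement, base = operand
        return M[displacement + R[base]]
    if mode == "register_deferred":   # Add R4, (R1)    -> M[R[R1]]
        return M[R[operand]]
    if mode == "absolute":            # Add R4, (1001)  -> M[1001]
        return M[operand]
    raise ValueError(f"unknown addressing mode: {mode}")


# Hypothetical machine state.
R = {"R1": 900, "R3": 7, "R4": 10}
M = {900: 20, 1000: 30, 1001: 40}

for mode, operand in [("register", "R3"), ("immediate", 3),
                      ("displacement", (100, "R1")),
                      ("register_deferred", "R1"), ("absolute", 1001)]:
    print(mode, R["R4"] + fetch_operand(mode, R, M, operand))
```

Only the way the operand is located changes between modes; the addition into R4 is the same in every row of the table.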
Memory Addressing: Displacement Addressing Mode
How big should the displacement be?

For addresses that do fit in displacement size:
    Add R4, 10000(R0)
For addresses that don't fit in displacement size, the compiler must do the following:
    Load R1, address
    Add R4, 0(R1)
How big the displacement field should be depends on the displacements that typical programs use.
On DLX the displacement field is 16 bits. IA-32 encodes displacements of 8 or 32 bits (8 or 16 bits in 16-bit addressing mode).

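The compiler's decision between the two code sequences comes down to a range check. This sketch assumes a signed two's complement displacement field and defaults to the 16-bit width used by DLX; the function name is hypothetical.

```python
def fits_in_displacement(value, bits=16):
    """True if `value` fits in a signed two's complement field of `bits` bits."""
    low = -(1 << (bits - 1))
    high = (1 << (bits - 1)) - 1
    return low <= value <= high


print(fits_in_displacement(10000))   # fits: Add R4, 10000(R0) works directly
print(fits_in_displacement(40000))   # too large: load the address into a register first
```

A 16-bit signed field covers -32768 to 32767, so the 10000 in the example fits while 40000 does not.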
Memory Addressing: Immediate Address Mode
Used where we want to get to a numerical value in an instruction.

At high level:     At assembler level:
a = b + 3;         Load R2, 3
                   Add R0, R1, R2
if ( a > 17 )      Load R2, 17
                   CMPBGT R1, R2
goto Addr          Load R1, Address
                   Jump (R1)

So how would you get a 32 bit value into a register?

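One common answer on machines with 16-bit immediates (MIPS does this with a load-upper-immediate followed by an OR-immediate) is to build the constant in two halves. The sketch below models that idea with plain integer arithmetic; the function names are hypothetical and the code is a model of the technique, not an assembler.

```python
def split_constant(value):
    """Split a 32-bit constant into (upper 16 bits, lower 16 bits)."""
    value &= 0xFFFFFFFF
    return value >> 16, value & 0xFFFF


def build_constant(upper, lower):
    """Rebuild the constant the way a two-instruction sequence would."""
    register = (upper << 16) & 0xFFFFFFFF   # step 1: load the upper half, low bits zero
    register |= lower                       # step 2: OR in the lower half
    return register


upper, lower = split_constant(0x12345678)
print(hex(upper), hex(lower), hex(build_constant(upper, lower)))
```

Each half fits in a 16-bit immediate field, so two fixed-length instructions are enough for any 32-bit value.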
Operations In The Instruction Set

Sections Include:
- Detailed information about types of instructions.
- Instructions for Control Flow (conditional branches, jumps)

Operations In The Instruction Set: Operator Types

Arithmetic and logical   and, add
Data transfer            move, load
Control                  branch, jump, call
System                   system call, traps
Floating point           add, mul, div, sqrt
Decimal                  add, convert
String                   move, compare
Multimedia               2D, 3D? e.g., Intel MMX and Sun VIS

Operations In The Instruction Set: Control Instructions
Conditional branches are about 20% of all instructions!!
Control instruction issues:
- taken or not
- where is the target
- link return address
- save or restore
Instructions that change the PC:
- (conditional) branches, (unconditional) jumps
- function calls, function returns
- system calls, system returns

Type And Size of Operands

The type of the operand is usually encoded in the opcode: an LDW implies loading of a word.
Common sizes are:
- Character (1 byte)
- Half word (16 bits)
- Word (32 bits)
- Single Precision Floating Point (1 word)
- Double Precision Floating Point (2 words)
Integers are two's complement binary.
Floating point is IEEE 754.
Some languages (like COBOL) use packed decimal.

The MIPS Architecture

MIPS is very RISC oriented.
The MIPS Architecture: MIPS Characteristics

There's MIPS 32 that we learned in CS140:
- 32-bit byte addresses, aligned
- Load/store only, displacement addressing
- Standard datatypes
- 3 fixed length formats
- 32 32-bit GPRs (r0 = 0)
- 16 64-bit (32 32-bit) FPRs
- FP status register
- No condition codes
There's MIPS 64, the current arch.:
- Standard datatypes
- 4 fixed length formats (8, 16, 32, 64)
- 32 64-bit GPRs (r0 = 0)
- 32 64-bit FPRs

Addressing Modes
- Immediate
- Displacement
- (Register mode used only for ALU)
Data transfer
- load/store word, load/store byte/halfword (signed?)
- load/store FP single/double
- moves between GPRs and FPRs
ALU
- add/subtract (signed? immediate?)
- multiply/divide (signed?)
- and, or, xor (immediate?)
- shifts: ll, rl, ra (immediate?)
- sets (immediate?)

The MIPS Architecture: MIPS Characteristics (cont.)
Control
- branches == 0, <> 0
- conditional branch testing FP bit
- jump, jump register
- jump & link, jump & link register
- trap, return-from-exception
Floating Point
- add/sub/mul/div
- single/double
- fp converts, fp set

The MIPS Architecture: The MIPS Encoding

Register-Register:    Op [31-26] | Rs1 [25-21] | Rs2 [20-16] | Rd [15-11] | Opx [10-0, with a boundary marked between bits 6 and 5]
Register-Immediate:   Op [31-26] | Rs1 [25-21] | Rd [20-16] | immediate [15-0]
Branch:               Op [31-26] | Rs1 [25-21] | Rs2/Opx [20-16] | immediate [15-0]
Jump / Call:          Op [31-26] | target [25-0]

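Packing the fields into a 32-bit word is a matter of shifts and ORs. The sketch below follows the bit positions listed for the register-register and register-immediate formats; treating Opx as the whole low 11 bits is an assumption, and the function names and field values are hypothetical examples rather than a verified opcode table.

```python
def encode_register_register(op, rs1, rs2, rd, opx):
    """Pack Op[31-26], Rs1[25-21], Rs2[20-16], Rd[15-11], Opx[10-0] into one word."""
    return (op << 26) | (rs1 << 21) | (rs2 << 16) | (rd << 11) | (opx & 0x7FF)


def encode_register_immediate(op, rs1, rd, immediate):
    """Pack Op[31-26], Rs1[25-21], Rd[20-16], immediate[15-0] into one word."""
    return (op << 26) | (rs1 << 21) | (rd << 16) | (immediate & 0xFFFF)


# Hypothetical field values, chosen only to show where each field lands.
print(f"{encode_register_register(0, 1, 2, 3, 0x20):08x}")
print(f"{encode_register_immediate(8, 1, 2, -1):08x}")
```

Masking the immediate with 0xFFFF stores a negative value such as -1 in its 16-bit two's complement form.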
Byte Ordering
How should bytes within a multi-byte word be ordered in memory?
Conventions:
- Suns, Macs are Big Endian machines: least significant byte has highest address
- Alphas, PCs are Little Endian machines: least significant byte has lowest address

Byte Ordering Example

Big Endian: least significant byte has highest address
Little Endian: least significant byte has lowest address
Example: variable x has 4-byte representation 0x01234567. Address given by &x is 0x100.

Address         0x100   0x101   0x102   0x103
Big Endian      01      23      45      67
Little Endian   67      45      23      01

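The two layouts in the example can be reproduced with Python's built-in int.to_bytes, which returns the bytes in increasing address order. The byte_layout helper is a hypothetical name for this sketch.

```python
def byte_layout(value, size, byteorder):
    """Return the bytes of `value` in increasing address order, as hex strings."""
    return [f"{byte:02x}" for byte in value.to_bytes(size, byteorder)]


x = 0x01234567
print("big   :", byte_layout(x, 4, "big"))
print("little:", byte_layout(x, 4, "little"))
```

The first element of each list is the byte stored at address 0x100 in the example, the last is the byte at 0x103.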
Machine-Level Code Representation
Encode Program as Sequence of Instructions
- Each simple operation:
  - Arithmetic operation
  - Read or write memory
  - Conditional branch
- Instructions encoded as bytes
  - Alphas, Suns, Macs use 4 byte instructions: Reduced Instruction Set Computer (RISC)
  - PCs use variable length instructions: Complex Instruction Set Computer (CISC)
- Different instruction types and encodings for different machines
  - Most code not binary compatible
Programs are Byte Sequences Too!

Classification of Processors
We can classify processors according to the areas in which they are mostly used.
We can identify four different groups of processors:
- General purpose processors, which are used in building computers.
- Digital signal processors, which are designed specifically for signal processing.
- Microcontrollers, which are small microcomputers that integrate on the same chip a processor core plus I/O elements and a small amount of memory.
- Application-specific processors, which are designed to perform a specific function (e.g. network processors).

General Purpose Processors

These processors are used to build major computer platforms.
We can name:
- Intel/AMD based computers, also called IBM compatible
- Macintosh computers built using PowerPC processors
- Sun machines that use UltraSPARC processors

Examples of General Purpose Processors

Type of Computer   Processors Used                                Technology
Macintosh          PowerPC (IBM, Motorola)                        Superscalar
Sun                UltraSPARC (Sun)                               RISC
IBM Compatible     Intel processors; Athlon, Duron (AMD); Cyrix   Superscalar

DSP
Digital Signal Processing (DSP) is used in a wide variety of applications, and it is hard to find a good definition that is general.
We can start with dictionary definitions of the words:
Digital
* operating by the use of discrete signals to represent data in the form of numbers
Signal
* a variable parameter by which information is conveyed through an electronic circuit
Processing
* to perform operations on data according to programmed instructions
This leads us to a simple definition of digital signal processing: changing or analyzing information which is measured as discrete sequences of numbers.

Note two unique features of digital signal processing as opposed to plain old ordinary digital processing:
- Signals come from the real world: this intimate connection with the real world leads to many unique needs, such as the need to react in real time and a need to measure signals and convert them to digital numbers.
- Signals are discrete: this means the information in between discrete samples is lost.
The advantages of DSP are common to many digital systems and include:
Versatility:
- digital systems can be reprogrammed for other applications (at least where programmable DSP chips are used)
- digital systems can be ported to different hardware (for example a different DSP chip or board level product)
Repeatability:
- digital systems can be easily duplicated
- digital systems do not depend on strict component tolerances
- digital system responses do not drift with temperature
Simplicity:
- some things can be done more easily digitally than with analogue systems

DSP is used in a very wide variety of applications, but most share some common features:
- they use a lot of math (multiplying and adding signals)
- they deal with signals that come from the real world
- they require a response in a certain time
Where general purpose DSP processors are concerned, most applications deal with signal frequencies that are in the audio range.

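The "multiplying and adding" that dominates DSP workloads is typically a multiply-accumulate loop. As a hedged illustration, the sketch below implements a small finite impulse response (FIR) filter in plain Python; the filter, its coefficients and the sample values are assumptions chosen for this example and are not taken from the course material.

```python
def fir_filter(samples, coefficients):
    """Apply an FIR filter: each output is a sum of products of recent samples."""
    output = []
    for n in range(len(samples)):
        accumulator = 0.0
        for k, coefficient in enumerate(coefficients):
            if n - k >= 0:                      # samples before the start count as zero
                accumulator += coefficient * samples[n - k]   # multiply-accumulate
        output.append(accumulator)
    return output


# Hypothetical input: a two-tap averaging filter over four samples.
print(fir_filter([1.0, 2.0, 3.0, 4.0], [0.5, 0.5]))
```

Every output sample costs one multiply and one add per coefficient, and the result must be ready before the next input sample arrives, which is where the real-time requirement comes from.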
