Sunteți pe pagina 1din 83

Chapter 1.

Basic Structure of Computers

| COMPUTER ARCHITECTURE & ORGANIZATION | BEC 30303 | 2012/2013 SEM 2 |

DEPARTMENT OF COMPUTER ENGINEERING

Functional Units

| COMPUTER ARCHITECTURE & ORGANIZATION | BEC 30303 | 2012/2013 SEM 2 |

DEPARTMENT OF COMPUTER ENGINEERING

Functional Units
Arithmetic and logic
Memory

Input

Output

Control

I/O

Processor

Figure 1.1. Basic functional units of a computer.


| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

Information Handled by a Computer

Instructions/machine instructions
Govern the transfer of information within a computer as well as between the computer and its I/O devices Specify the arithmetic and logic operations to be performed Program

Data
Used as operands by the instructions Source program

Encoded in binary code 0 and 1


DEPARTMENT OF COMPUTER ENGINEERING

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

Memory Unit

Store programs and data Two classes of storage


Primary storage
Fast Programs must be stored in memory while they are being executed Large number of semiconductor storage cells Processed in words Address RAM and memory access time Memory hierarchy cache, main memory

Secondary storage larger and cheaper


DEPARTMENT OF COMPUTER ENGINEERING

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

Arithmetic and Logic Unit (ALU)

Most computer operations are executed in ALU of the processor. Load the operands into memory bring them to the processor perform operation in ALU store the result back to memory or retain in the processor. Registers Fast control of ALU
DEPARTMENT OF COMPUTER ENGINEERING

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

Control Unit

All computer operations are controlled by the control unit. The timing signals that govern the I/O transfers are also generated by the control unit. Control unit is usually distributed throughout the machine instead of standing alone. Operations of a computer:
Accept information in the form of programs and data through an input unit and store it in the memory Fetch the information stored in the memory, under program control, into an ALU, where the information is processed Output the processed information through an output unit Control all activities inside the machine through a control unit

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

The processor : Data Path and Control


Data PC Address Instructions Instruction Memory Register # Register Bank Register # Register # Data A L U

Address

Data Memory

Two types of functional units: elements that operate on data values (combinational) elements that contain state (state elements)
| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

Five Execution Steps


Step name Action for R-type instructions Action for Memoryreference Instructions Action for branches Action for jumps

Instruction fetch

IR = MEM[PC] PC = PC + 4
A = Reg[IR[25-21]] B = Reg[IR[20-16]] ALUOut = PC + (sign extend (IR[15-0])<<2) ALUOut = A op B ALUOut = A+sign extend(IR[15-0]) IF(A==B) Then PC=ALUOut PC=PC[3128]||(IR[250]<<2)

Instruction decode/ register fetch

Execution, address computation, branch/jump completion

Memory access or R-type completion

Reg[IR[15-11]] = ALUOut

Load:MDR =Mem[ALUOut] or Store:Mem[ALUOut] = B


Load: Reg[IR[20-16]] = MDR

Memory read completion

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

Basic Operational Concepts

| COMPUTER ARCHITECTURE & ORGANIZATION | BEC 30303 | 2012/2013 SEM 2 |

DEPARTMENT OF COMPUTER ENGINEERING

10

Review

Activity in a computer is governed by instructions. To perform a task, an appropriate program consisting of a list of instructions is stored in the memory. Individual instructions are brought from the memory into the processor, which executes the specified operations. Data to be used as operands are also stored in the memory.

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

11

A Typical Instruction

Add LOCA, R0 Add the operand at memory location LOCA to the operand in a register R0 in the processor. Place the sum into register R0. The original contents of LOCA are preserved. The original contents of R0 is overwritten. Instruction is fetched from the memory into the processor the operand at LOCA is fetched and added to the contents of R0 the resulting sum is stored in register R0.

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

12

Separate Memory Access and ALU Operation


Load LOCA, R1 Add R1, R0 Whose contents will be overwritten?

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

13

Memory

Connection Between the Processor and the Memory


MAR MDR Control PC R0 R1 IR Processor ALU Rn 1

n general purpose registers

Fi gure 1.2.

Connecti ons between the processor and the memory.

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

14

Registers

Instruction register (IR) Program counter (PC) General-purpose register (R0 Rn-1) Memory address register (MAR) Memory data register (MDR)

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

15

Typical Operating Steps

Programs reside in the memory through input devices PC is set to point to the first instruction The contents of PC are transferred to MAR A Read signal is sent to the memory The first instruction is read out and loaded into MDR The contents of MDR are transferred to IR Decode and execute the instruction
DEPARTMENT OF COMPUTER ENGINEERING

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

16

Typical Operating Steps (Cont)

Get operands for ALU


General-purpose register Memory (address to MAR Read MDR to ALU)

Perform operation in ALU Store the result back

To general-purpose register To memory (address to MAR, result to MDR Write)

During the execution, PC is incremented to the next instruction


DEPARTMENT OF COMPUTER ENGINEERING

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

17

Interrupt

Normal execution of programs may be preempted if some device requires urgent servicing. The normal execution of the current program must be interrupted the device raises an interrupt signal. Interrupt-service routine Current system information backup and restore (PC, general-purpose registers, control information, specific information)

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

18

Bus Structures

There are many ways to connect different parts inside a computer together. A group of lines that serves as a connecting path for several devices is called a bus. Address/data/control

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

19

Input

Output

Memory

Processor

Bus Structure
Figure 1.3. Single-bus structure.

Single-bus

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

20

Speed Issue

Different devices have different transfer/operate speed. If the speed of bus is bounded by the slowest device connected to it, the efficiency will be very low. How to solve this? A common approach use buffers.

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

21

Performance

| COMPUTER ARCHITECTURE & ORGANIZATION | BEC 30303 | 2012/2013 SEM 2 |

DEPARTMENT OF COMPUTER ENGINEERING

22

Performance

The most important measure of a computer is how quickly it can execute programs. Three factors affect performance:
Hardware design Instruction set Compiler

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

23

Performance

Processor time to execute a program depends on the hardware involved in the execution of individual machine instructions.

Main memory

Cache memory

Processor

Bus

Figure 1.5. The processor cache.


| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

24

Performance

The processor and a relatively small cache memory can be fabricated on a single integrated circuit chip. Speed Cost Memory management

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

25

Processor Clock

Clock, clock cycle, and clock rate The execution of each instruction is divided into several steps, each of which completes in one clock cycle. Hertz cycles per second

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

26

Basic Performance Equation


T processor time required to execute a program that has been prepared in high-level language N number of actual machine language instructions needed to complete the execution (note: loop) S average number of basic steps needed to execute one machine instruction. Each step completes in one clock cycle R clock rate Note: these are not independent to each other

N S T R
How to improve T?

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

27

Pipeline and Superscalar Operation


Instructions are not necessarily executed one after another. The value of S doesnt have to be the number of clock cycles to execute one instruction. Pipelining overlapping the execution of successive instructions. Add R1, R2, R3 Superscalar operation multiple instruction pipelines are implemented in the processor. Goal reduce S (could become <1!)

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

28

Clock Rate

Increase clock rate


Improve the integrated-circuit (IC) technology to make the circuits faster Reduce the amount of processing done in one basic step (however, this may increase the number of basic steps needed)

Increases in R that are entirely caused by improvements in IC technology affect all aspects of the processors operation equally except the time to access the main memory.
DEPARTMENT OF COMPUTER ENGINEERING

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

29

CISC and RISC


Tradeoff between N and S A key consideration is the use of pipelining


S is close to 1 even though the number of basic steps per instruction may be considerably larger It is much easier to implement efficient pipelining in processor with simple instruction sets

Reduced Instruction Set Computers (RISC) Complex Instruction Set Computers (CISC)

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

30

Compiler

A compiler translates a high-level language program into a sequence of machine instructions. To reduce N, we need a suitable machine instruction set and a compiler that makes good use of it. Goal reduce NS A compiler may not be designed for a specific processor; however, a high-quality compiler is usually designed for, and with, a specific processor.

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

31

Performance Measurement

T is difficult to compute. Measure computer performance using benchmark programs. System Performance Evaluation Corporation (SPEC) selects and publishes representative application programs for different application domains, together with test results for many commercially available computers. Compile and run (no simulation) Reference computer

Running time on the reference computer SPEC rating Running time on the computer under test SPEC rating ( SPEC i )
i 1 n 1 n

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

32

Multiprocessors and Multicomputers

Multiprocessor computer
Execute a number of different application tasks in parallel Execute subtasks of a single large task in parallel All processors have access to all of the memory shared-memory multiprocessor Cost processors, memory units, complex interconnection networks

Multicomputers
Each computer only have access to its own memory Exchange message via a communication network messagepassing multicomputers

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

33

Computers Evolution
First Generation: Vacuum Tubes

ENIAC

Electronic Numerical Integrator And Computer

Designed and constructed at the University of Pennsylvania Started in 1943 completed in 1946 By John Mauchly and John Eckert Worlds first general purpose electronic digital computer

Armys Ballistics Research Laboratory (BRL) needed a way to supply trajectory tables for new weapons accurately and within a reasonable time frame
Was not finished in time to be used in the war effort

Its first task was to perform a series of calculations that were used to help determine the feasibility of the hydrogen bomb
Continued to operate under BRL management until 1955 when it was disassembled
DEPARTMENT OF COMPUTER ENGINEERING

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

34

ENIAC
Major drawback

Weighed 30 tons

Occupied 1500 square feet of floor space

Contained more than 18,000 vacuum tubes

140 kW Power consumpti on

Capable of 5000 additions per second

Decimal rather than binary machine

Memory consisted of 20 accumulators, each capable of holding a 10 digit number

was the need


for manual programming by setting switches and plugging/ unplugging cables

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

35

John von Neumann


EDVAC (Electronic Discrete Variable Computer)

First publication of the idea was in 1945 Stored program concept

Attributed to ENIAC designers, most notably the mathematician John von Neumann Program represented in a form suitable for storing in memory alongside the data

IAS computer

Princeton Institute for Advanced Studies Prototype of all subsequent general-purpose computers Completed in 1952

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

36

Structure of von Neumann Machine

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

37

IAS Memory Formats

The memory of the IAS consists of 1000 storage locations (called words) of 40 bits each

Both data and instructions are stored there


Numbers are represented in binary form and each instruction is a binary code

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

38

Structure of IAS Computer

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

39

Registers
Memory buffer register (MBR)
Contains a word to be stored in memory or sent to the I/O unit Or is used to receive a word from memory or from the I/O unit

Memory address register (MAR)

Specifies the address in memory of the word to be written from or read into the MBR

Instruction register (IR)

Contains the 8-bit opcode instruction being executed

Instruction buffer register (IBR)

Employed to temporarily hold the right-hand instruction from a word in memory

Program counter (PC)

Contains the address of the next instruction pair to be fetched from memory

Accumulator (AC) and multiplier quotient (MQ)

Employed to temporarily hold operands and results of ALU operations

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

40

IAS Operations

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

41

The IAS Instruction Set

Table 2.1 The IAS Instruction Set


| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

42

Commercial Computers
UNIVAC

1947 Eckert and Mauchly formed the Eckert-Mauchly Computer Corporation to manufacture computers commercially UNIVAC I (Universal Automatic Computer) First successful commercial computer Was intended for both scientific and commercial applications Commissioned by the US Bureau of Census for 1950 calculations The Eckert-Mauchly Computer Corporation became part of the UNIVAC division of the Sperry-Rand Corporation UNIVAC II delivered in the late 1950s Had greater memory capacity and higher performance

Backward compatible
DEPARTMENT OF COMPUTER ENGINEERING

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

43

Was the major manufacturer of punched-card processing equipment Delivered its first electronic stored-program computer (701) in 1953

Intended primarily for scientific applications

Introduced 702 product in 1955

Hardware features made it suitable to business applications

IBM

Series of 700/7000 computers established IBM as the overwhelmingly dominant computer manufacturer

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

44

Computers Evolution
Second Generation: Transistors

Smaller

Cheaper
Dissipates less heat than a vacuum tube Is a solid state device made from silicon Was invented at Bell Labs in 1947

It was not until the late 1950s that fully transistorized computers were commercially available

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

45

Table 2.2 Computer Generations

COMPUTER GENERATIONS

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

46

Second Generation Computers

Introduced:

More complex arithmetic and logic units and control units The use of high-level programming languages Provision of system software which provided the ability to:

load programs move data to peripherals and libraries perform common computations

Appearance of the Digital Equipment Corporation (DEC) in 1957 PDP-1 was DECs first computer This began the minicomputer phenomenon that would become so prominent in the third generation

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

47

Example Members of the IBM 700/7000 Series

Example Members of the IBM 700/7000 Series


| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

48

IBM 7094 Configuration

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

49

Computers Evolution
Third Generation: Integrated Circuits

1958 the invention of the integrated circuit Discrete component Single, self-contained transistor Manufactured separately, packaged in their own containers, and soldered or wired together onto masonite-like circuit boards Manufacturing process was expensive and cumbersome The two most important members of the third generation were the IBM System/360 and the DEC PDP-8
DEPARTMENT OF COMPUTER ENGINEERING

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

50

Microelectronics

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

51

Integrated Circuits

Data storage provided by memory cells Data processing provided by gates Data movement the paths among components are used to move data from memory to memory and from memory through gates to memory Control the paths among components can carry control signals

A computer consists of gates, memory cells, and interconnections among these elements The gates and memory cells are constructed of simple digital electronic components
Exploits the fact that such components as transistors, resistors, and conductors can be fabricated from a semiconductor such as silicon Many transistors can be produced at the same time on a single wafer of silicon Transistors can be connected with a processor metallization to form circuits
DEPARTMENT OF COMPUTER ENGINEERING

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

52

Wafer, Chip, and Gate Relationship

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

53

Chip Growth

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

54

Moores Law
1965; Gordon Moore co-founder of Intel

Observed number of transistors that could be put on a single chip was doubling every year Consequences of Moores law:
The pace slowed to a doubling every 18 months in the 1970s but has sustained that rate ever since
The cost of computer logic and memory circuitry has fallen at a dramatic rate
The electrical path length is shortened, increasing operating speed
Computer becomes smaller and is more convenient to use in a variety of environments

Reduction in power and cooling requirements

Fewer interchip connections

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

55

Characteristics of the System/360 Family

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

56

Evolution of the PDP-8

Evolution of the PDP-8


| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

57

DEC - PDP-8 Bus Structure

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

58

+
Later Generations

VLSI

LSI

Very Large Scale Integration

Large Scale Integration

ULSI

Semiconductor Memory Microprocessors


| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

Ultra Large Scale Integration


DEPARTMENT OF COMPUTER ENGINEERING

59

Semiconductor Memory
In 1970 Fairchild produced the first relatively capacious semiconductor memory
Chip was about the size of a single core Could hold 256 bits of memory Non-destructive Much faster than core

In 1974 the price per bit of semiconductor memory dropped below the price per bit of core memory
There has been a continuing and rapid decline in memory cost accompanied by a corresponding increase in physical memory density Developments in memory and processor technologies changed the nature of computers in less than a decade

Since 1970 semiconductor memory has been through 13 generations


Each generation has provided four times the storage density of the previous generation, accompanied by declining cost per bit and declining access time

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

60

Microprocessors

The density of elements on processor chips continued to rise

More and more elements were placed on each chip so that fewer and fewer chips were needed to construct a single computer processor First chip to contain all of the components of a CPU on a single chip Birth of microprocessor First 8-bit microprocessor
First general purpose microprocessor Faster, has a richer instruction set, has a large addressing capability
DEPARTMENT OF COMPUTER ENGINEERING

1971 Intel developed 4004


1972 Intel developed 8008

1974 Intel developed 8080

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

61

Evolution of Intel Microprocessors

a. 1970s Processors

b. 1980s Processors
| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

62

Evolution of Intel Microprocessors

c. 1990s Processors

d. Recent Processors
| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

63

Microprocessor Speed
Techniques built into contemporary processors include:
Processor moves data or instructions into a conceptual pipe with all stages of the pipe processing simultaneously

Pipelining Branch prediction Data flow analysis


Speculative execution

Processor looks ahead in the instruction code fetched from memory and predicts which branches, or groups of instructions, are likely to be processed next

Processor analyzes which instructions are dependent on each others results, or data, to create an optimized schedule of instructions

Using branch prediction and data flow analysis, some processors speculatively execute instructions ahead of their actual appearance in the program execution, holding the results in temporary locations, keeping execution engines as busy as possible

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

64

Performance Balance
Adjust the organization and architecture to compensate for the mismatch among the capabilities of the various components

Increase the number of bits that are retrieved at one time by making DRAMs wider rather than deeper and by using wide bus data paths Reduce the frequency of memory access by incorporating increasingly complex and efficient cache structures between the processor and main memory Increase the interconnect bandwidth between processors and memory by using higher speed buses and a hierarchy of buses to buffer and structure data flow

Architectural examples include:

Change the DRAM interface to make it more efficient by including a cache or other buffering scheme on the DRAM chip

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

65

Typical I/O Device Data Rates

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

66

Improvements in Chip Organization and Architecture

Increase hardware speed of processor Fundamentally due to shrinking logic gate size More gates, packed more tightly, increasing clock rate Propagation time for signals reduced Increase size and speed of caches Dedicating part of processor chip Cache access times drop significantly Change processor organization and architecture Increase effective speed of instruction execution Parallelism

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

67

Problems with Clock Speed and Login Density

Power

Power density increases with density of logic and clock speed Dissipating heat Speed at which electrons flow limited by resistance and capacitance of metal wires connecting them Delay increases as RC product increases Wire interconnects thinner, increasing resistance Wires closer together, increasing capacitance
Memory speeds lag processor speeds
DEPARTMENT OF COMPUTER ENGINEERING

RC delay

Memory latency

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

68

+
Processor Trends

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

69

Multicore

Strategy is to use two simpler processors on the chip rather than one more complex processor
The use of multiple processors on the same chip provides the potential to increase performance without increasing the clock rate

With two processors larger caches are justified

As caches became larger it made performance sense to create two and then three levels of cache on a chip

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

70

Many Integrated Core (MIC) Graphics Processing Unit (GPU)


MIC Leap in performance as well as the challenges in developing software to exploit such a large number of cores The multicore and MIC strategy involves a homogeneous collection of general purpose processors on a single chip GPU

Core designed to perform parallel operations on graphics data Traditionally found on a plug-in graphics card, it is used to encode and render 2D and 3D graphics as well as process video Used as vector processors for a variety of applications that require repetitive computations
DEPARTMENT OF COMPUTER ENGINEERING

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

71

Overview
ARM

Results of decades of design effort on complex instruction set computers (CISCs) Excellent example of CISC design Incorporates the sophisticated design principles once found only on mainframes and supercomputers An alternative approach to processor design is the reduced instruction set computer (RISC) The ARM architecture is used in a wide variety of embedded systems and is one of the most powerful and best designed RISC based systems on the market In terms of market share Intel is ranked as the number one maker of microprocessors for non-embedded systems

Intel

x86 Architecture

CISC RISC

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

72

8080

First general purpose microprocessor 8-bit machine with an 8-bit data path to memory Used in the first personal computer (Altair)

8086

16-bit machine Used an instruction cache, or queue First appearance of the x86 architecture

8088

x86 Evolution

used in IBMs first personal computer Enabled addressing a 16-MByte memory +

80286

instead of just 1 MByte

80386

Intels first 32-bit machine First Intel processor to support multitasking

80486

More sophisticated cache technology and instruction pipelining Built-in math coprocessor

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

73

X86 EVOLUTION PENTIUM


Pentium
Superscalar Multiple instructions executed in parallel

Pentium Pro
Increased superscalar organization Aggressive register renaming Branch prediction Data flow analysis Speculative execution

Pentium II
MMX technology Designed specifically to process video, audio, and graphics data

Pentium III
Additional floating-point instructions to support 3D graphics software

Pentium 4 Includes additional floating-point and other enhancements for multimedia

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

74

x86 Evolution
Instruction set architecture is backward compatible with earlier versions

(continued)

Core
First Intel x86 microprocessor with a dual core, referring to the implementation of two processors on a single chip

X86 architecture continues to dominate the processor market outside of embedded systems

Core 2
Extends the architecture to 64 bits
Recent Core offerings have up to 10 processors per chip

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

75

General definition: A combination of computer hardware and software, and perhaps additional mechanical or other parts, designed to perform a dedicated function. In many cases, embedded systems are part of a larger system or product, as in the case of an antilock braking system in a car.

Embedded

Systems

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

76

Examples of Embedded Systems and Their Markets

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

77

Embedded Systems
Requirements and Constraints
Small to large systems, implying different cost constraints and different needs for optimization and reuse

Different models of computation ranging from discrete event systems to hybrid systems

Relaxed to very strict requirements and combinations of different quality requirements with respect to safety, reliability, real-time and flexibility

Different application characteristics resulting in static versus dynamic loads, slow to fast speed, compute versus interface intensive tasks, and/or combinations thereof Different environmental conditions in terms of radiation, vibrations, and humidity
| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

Short to long life times

DEPARTMENT OF COMPUTER ENGINEERING

78

Possible Organization of an Embedded System

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

79

Acorn RISC Machine (ARM)

Family of RISC-based microprocessors and microcontrollers Designs microprocessor and multicore architectures and licenses them to manufacturers Chips are high-speed processors that are known for their small die size and low power requirements

Widely used in PDAs and other handheld devices Chips are the processors in iPod and iPhone devices Most widely used embedded processor architecture Most widely used processor architecture of any kind

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

80

E v o A l R u M t i o n
DSP = digital signal processor SoC = system on a chip
DEPARTMENT OF COMPUTER ENGINEERING
| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

81

ARM Design Categories

ARM processors are designed to meet the needs of three system categories:
Secure applications
Smart cards, SIM

cards, and payment terminals

Embedded real-time

systems

Systems for storage, automotive body and powertrain, industrial, and networking applications

Application Linux, Palm OS, Symbian OS, and Windows CE in wireless, consumer entertainment and digital imaging platforms Devices running open operating systems including applications

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

82

System Clock

| COMPUTER ARCHITECTURE & ORGANIZATION 30303 |SEM 2012/2013 SEM 2 | | DIGITAL DESIGN | BEC 30503||BEC 2012/2013 I |

DEPARTMENT OF COMPUTER ENGINEERING

83

S-ar putea să vă placă și