
CPS 104 Computer Organization and Programming, Lecture 31: Virtual Memory, Intro to I/O

April 2, 2004 Gershon Kedem http://kedem.duke.edu/cps104/Lectures

CPS104 Lec31.1

GK Spring 2004

Admin.

Homework 7 is posted. The deadline has been extended: it is now due next Friday, April 9th.


Review: Reducing Translation Time

Machines with TLBs go one step further to reduce the number of cycles per cache access:
- They overlap the cache access with the TLB access
- This works because the high-order bits of the VA are used to look in the TLB while the low-order bits are used as the index into the cache
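
The bit split that makes this overlap possible can be sketched in Python. This is only an illustration under assumed parameters: a 32-bit virtual address, 4 KB pages (12-bit displacement), and a cache with 1K entries of 4 bytes each (10-bit index, 2-bit byte offset). The function name `split_virtual_address` is hypothetical.

```python
PAGE_OFFSET_BITS = 12   # assumption: 4 KB pages
BYTE_OFFSET_BITS = 2    # assumption: 4-byte cache entries
INDEX_BITS = 10         # assumption: 1K cache entries


def split_virtual_address(va):
    """Return (virtual page number, cache index, byte offset) for a 32-bit VA."""
    # High-order bits: the virtual page number, looked up in the TLB.
    vpn = va >> PAGE_OFFSET_BITS
    # Low-order bits: the cache index, taken entirely from the page displacement.
    index = (va >> BYTE_OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    byte_offset = va & ((1 << BYTE_OFFSET_BITS) - 1)
    return vpn, index, byte_offset


vpn, index, byte_offset = split_virtual_address(0x12345678)
print(hex(vpn), hex(index), byte_offset)
```

Because the index and byte offset together use only the 12 displacement bits, neither depends on the translation, so the cache can be indexed while the TLB is still looking up the page number.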


Review: Overlapped Cache & TLB Access

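[Figure: the 32-bit virtual address is split into a 20-bit page # and a 12-bit displacement. The page # goes to the TLB (associative lookup), which returns the PA and a Hit/Miss signal. The displacement supplies the index into the cache, which returns the data and a Hit/Miss signal. The cache tag is compared (=) with the PA from the TLB.]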

IF cache hit AND (cache tag = PA) THEN deliver data to CPU
ELSE IF [cache miss OR (cache tag ≠ PA)] AND TLB hit THEN access memory with the PA from the TLB
ELSE do standard VA translation
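
The three-way decision above can be written as a small Python function. This is a sketch, not a hardware description: the function name, argument names, and returned strings are hypothetical, and it assumes the PA is only valid when the TLB hits, so the first branch also checks `tlb_hit`.

```python
def overlapped_access(cache_hit, cache_tag, tlb_hit, tlb_pa):
    """Decide what to do once the parallel cache and TLB lookups return."""
    # Assumption: tlb_pa is meaningful only when tlb_hit is True.
    if cache_hit and tlb_hit and cache_tag == tlb_pa:
        return "deliver data to CPU"
    # Cache miss, or the indexed entry belongs to a different physical page.
    if tlb_hit:
        return "access memory with PA from TLB"
    # TLB miss: fall back to the full page-table walk.
    return "do standard VA translation"


print(overlapped_access(True, 0x1A, True, 0x1A))
print(overlapped_access(True, 0x1A, True, 0x2B))
print(overlapped_access(True, 0x1A, False, None))
```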

Review: Problems With Overlapped TLB Access


Overlapped access only works as long as the address bits used to index into the cache do not change as the result of VA translation.

This usually limits things to small caches, large page sizes, or highly set-associative caches if you want a large cache.

Example: suppose everything is the same except that the cache is increased to 8 KB instead of 4 KB.

[Figure: the virtual address is a 20-bit virtual page # followed by a 12-bit displacement. The cache index is now 11 bits, followed by the 2-bit byte offset (00), so the top index bit falls inside the virtual page #. This bit is changed by VA translation, but is needed for cache lookup.]

Solutions:
- go to 8 KB page sizes;
- go to a 2-way set-associative cache (two ways of 1K entries x 4 bytes, 10-bit index); or
- have software guarantee VA[13] = PA[13]
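
The constraint can be checked with a little arithmetic: the index and byte-offset bits stay inside the page displacement exactly when one way of the cache is no larger than a page. The helper below is a minimal sketch under that reading; its name and parameters are hypothetical.

```python
def index_fits_in_page_offset(cache_bytes, ways, page_bytes):
    """True if every bit used to index the cache lies inside the page displacement."""
    bytes_per_way = cache_bytes // ways
    return bytes_per_way <= page_bytes


KB = 1024
print(index_fits_in_page_offset(4 * KB, 1, 4 * KB))  # original 4 KB direct-mapped cache
print(index_fits_in_page_offset(8 * KB, 1, 4 * KB))  # 8 KB direct-mapped: one index bit is translated
print(index_fits_in_page_offset(8 * KB, 2, 4 * KB))  # fix: 2-way set associative
print(index_fits_in_page_offset(8 * KB, 1, 8 * KB))  # fix: 8 KB pages
```

The third fix on the slide (software guaranteeing that the translated bit is unchanged) is a policy rather than a geometry change, so it is not captured by this check.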

Memory Protection

Paged virtual memory provides protection as follows:
- Each process (user or OS) has a different virtual memory space.
- The OS maintains the page tables for all processes.
- A reference outside the process's allocated space causes an exception that lets the OS decide what to do.
- Memory sharing between processes is done via different virtual spaces but common physical frames.
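
A toy model of these points, assuming 4 KB pages and page tables represented as Python dictionaries from virtual page number to physical frame number. The names `PageFault`, `translate`, `page_table_a`, and `page_table_b` and all page and frame numbers are made up for illustration.

```python
class PageFault(Exception):
    """Raised when a process references a page it has no mapping for."""


def translate(page_table, va, page_bits=12):
    """Translate a virtual address using one process's page table."""
    vpn = va >> page_bits
    offset = va & ((1 << page_bits) - 1)
    if vpn not in page_table:
        # In a real system this exception transfers control to the OS.
        raise PageFault(f"no mapping for virtual page {vpn:#x}")
    return (page_table[vpn] << page_bits) | offset


# Two processes with separate virtual spaces that share physical frame 0x777.
page_table_a = {0x10: 0x400, 0x11: 0x777}
page_table_b = {0x20: 0x500, 0x21: 0x777}

print(hex(translate(page_table_a, 0x11ABC)))
print(hex(translate(page_table_b, 0x21ABC)))
```

Both calls reach the same physical address through different virtual addresses, while a reference by process B to virtual page 0x10 (mapped only for process A) raises `PageFault`.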


Putting it together: The SparcStation 20:


The SparcStation 20 has the following memory system.

Caches: two level-1 caches, an I-cache and a D-cache.

Parameter       Instruction cache    Data cache
Organization    20 KB, 5-way SA      16 KB, 4-way SA
Page size       4 KB                 4 KB
Line size       8 bytes              4 bytes
Replacement     Pseudo LRU           Pseudo LRU

TLB: 64-entry fully associative TLB, random replacement.

External level-2 cache: 1 MB, direct mapped, 128-byte blocks, 32-byte sub-blocks.


SparcStation 20 Data Access


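[Figure: the virtual address is split into a 20-bit page number and a 12-bit offset (10-bit cache index plus 2-bit byte offset). The index selects one 4-byte entry, with its tag, in each 1K-entry way of the data cache. In parallel, the page number is compared against all 64 TLB entries (tag0 to tag63). The translated page number is compared (=) with the cache tags to drive the data select, which sends data to the CPU; a 24-bit physical page number forms the physical address sent to memory.]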

SparcStation 20: Instruction Memory


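[Figure: the instruction address is split into a 20-bit page number and a 12-bit offset (10-bit cache index plus 2-bit byte offset). The index selects one 4-byte entry, with its tag, in each 1K-entry way of the instruction cache. In parallel, the page number is compared against all 64 TLB entries (tag0 to tag63). The translated page number is compared (=) with the cache tags to drive the instruction select, which sends the instruction to the CPU (instruction register); a 24-bit physical page number and the offset form the 36-bit physical address sent to memory.]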

Input / Output


The Big Picture: Where are We Now?

Today's Topic: I/O Systems

[Figure: two computers, each made up of a processor (control and datapath), memory, input, and output, connected to each other through a network.]


I/O System Design Issues


- Performance
- Expandability
- Resilience in the face of failure

[Figure: the processor and its cache sit on a shared memory-I/O bus together with main memory and three I/O controllers: one for two disks, one for graphics, and one for the network. The I/O controllers signal the processor with interrupts.]


Producer-Server Model

[Figure: Producer -> Queue -> Server]

Throughput:
- The number of tasks completed by the server in unit time
- In order to get the highest possible throughput:
  - The server should never be idle
  - The queue should never be empty

Response time:
- Begins when a task is placed in the queue
- Ends when it is completed by the server
- In order to minimize the response time:
  - The queue should be empty
  - The server will be idle
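
The tension between the two goals shows up even in a very small simulation. The sketch below assumes a single server, a FIFO queue, a fixed service time, and tasks that arrive at a fixed interval; the function `simulate` and its parameters are hypothetical and time is in arbitrary units.

```python
def simulate(arrival_interval, service_time, n_tasks):
    """Return (throughput, average response time) for a single FIFO server."""
    server_free_at = 0.0
    total_response = 0.0
    finish = 0.0
    for i in range(n_tasks):
        arrival = i * arrival_interval
        start = max(arrival, server_free_at)   # wait in the queue if the server is busy
        finish = start + service_time
        server_free_at = finish
        total_response += finish - arrival     # queueing delay plus service time
    return n_tasks / finish, total_response / n_tasks


# Lightly loaded: the queue is always empty, the server is idle half the time.
print(simulate(arrival_interval=2.0, service_time=1.0, n_tasks=100))
# Overloaded: the server is never idle, but tasks wait longer and longer.
print(simulate(arrival_interval=0.5, service_time=1.0, n_tasks=100))
```

In the first run the response time equals the service time but throughput is about half of what the server could deliver; in the second run throughput reaches one task per time unit while the average response time grows with the queue.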

I/O Device Examples


Device              Behavior        Partner    Data Rate (KB/sec)
Keyboard            Input           Human      0.01
Mouse               Input           Human      0.02
Line Printer        Output          Human      1.00
Laser Printer       Output          Human      100
Graphics Display    Output          Human      > 30,000
Network-LAN         Input/Output    Machine    10,000
Floppy disk         Storage         Machine    50
Optical Disk        Storage         Machine    500
Magnetic Disk       Storage         Machine    5,000


Technology Trends

Disk Capacity doubles every 1.5 years

Today: Processing power doubles every 18 months
Today: Memory size doubles every 18 months(?)
Today: Disk capacity doubles every 12-18 months

Disk positioning rate (seek + rotate) doubles every ten years!

The I/O gap: positioning rate improves far more slowly than processing power, memory size, or disk capacity.
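
To get a feel for the size of this gap, the doubling periods can be turned into growth factors over a fixed span. The helper `growth_factor` is hypothetical, and the ten-year horizon is chosen only for illustration.

```python
def growth_factor(years, doubling_period_years):
    """Improvement factor after `years` for something that doubles every period."""
    return 2 ** (years / doubling_period_years)


capacity = growth_factor(10, 1.5)      # doubling every 18 months
positioning = growth_factor(10, 10)    # doubling every ten years
print(f"capacity or processing power: x{capacity:.1f}")
print(f"disk positioning rate:        x{positioning:.1f}")
```

Over ten years, a quantity that doubles every 18 months grows roughly a hundredfold, while the positioning rate only doubles.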



Types of Storage Devices

- Magnetic disks
- Magnetic tapes
- CD-ROM
- Juke box (automated tape library, robots)


Magnetic Disks

- Long-term, nonvolatile storage
- Another slower, less expensive level of the memory hierarchy

[Figure: a disk drive with labels for track, arm, cylinder, head, platter, and sector.]


Organization of a Hard Magnetic Disk

[Figure: a stack of platters, with a track and a sector marked on one surface.]

Typical numbers (depending on the disk size):
- 500 to 2,000 tracks per surface
- 32 to 128 sectors per track

A sector is the smallest unit that can be read or written.

Traditionally, all tracks have the same number of sectors. This has recently been relaxed:
- Constant bit density: record more sectors on the outer tracks
- Constant bit size: speed varies with track location

Magnetic Disk Characteristic

[Figure: a disk with labels for track, sector, cylinder, platter, and head.]

Cylinder: all the tracks under the heads at a given point on all surfaces.

Reading or writing data is a three-stage process:
- Seek time: position the arm over the proper track
- Rotational latency: wait for the desired sector to rotate under the read/write head
- Transfer time: transfer a block of bits (sector) under the read/write head

Average seek time as reported by the industry:
- Typically in the range of 8 ms to 12 ms
- (Sum of the times for all possible seeks) / (total # of possible seeks)

Due to locality of disk references, the actual average seek time may be only 25% to 33% of the advertised number.


Typical Numbers of a Magnetic Disk


[Figure: a disk with labels for track, sector, cylinder, platter, and head.]

Rotational latency:
- Most disks rotate at 3,600 to 10,000 RPM
- Approximately 16.7 ms to 6 ms per revolution, respectively
- The average latency to the desired information is halfway around the disk: about 8 ms at 3600 RPM, about 4 ms at 7200 RPM

Transfer time is a function of:
- Transfer size (usually a sector): 1 KB / sector
- Rotation speed: 3600 RPM to 7200 RPM
- Recording density: bits per inch on a track
- Diameter: typical diameters range from 2.5 to 5.25 in
- Typical values: 2 to 12 MB per second
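
The rotation figures follow directly from the spindle speed. The two helpers below are a minimal sketch of that arithmetic; their names are hypothetical.

```python
def revolution_ms(rpm):
    """Time for one full revolution, in milliseconds."""
    return 60_000.0 / rpm


def average_rotational_latency_ms(rpm):
    """Average wait for a sector: half a revolution."""
    return revolution_ms(rpm) / 2


for rpm in (3600, 7200, 10000):
    print(rpm, round(revolution_ms(rpm), 2), round(average_rotational_latency_ms(rpm), 2))
```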


Disk Access

Access time = queue + seek + rotational + transfer + overhead

Seek time:
- move the arm over the track
- the average is confusing (startup, slowdown, locality of accesses)

Rotational latency:
- wait for the sector to rotate under the head
- average = 0.5 revolution / (3600 RPM / 60) = 8.3 ms

Transfer time:
- f(size, BW in bytes/sec)


Disk Access Time Example

Disk parameters:
- Transfer size is 8 KB
- Advertised average seek is 12 ms
- Disk spins at 7200 RPM
- Transfer rate is 4 MB/sec

Controller overhead is 2 ms.
Assume that the disk is idle, so there is no queuing delay.

What is the average disk access time for a sector?
- Ave seek + ave rot delay + transfer time + controller overhead
- 12 ms + 0.5/(7200 RPM / 60) + 8 KB / (4 MB/s) + 2 ms
- 12 + 4.17 + 2 + 2 = 20 ms

The advertised seek time assumes no locality. The actual seek is typically 1/4 to 1/3 of the advertised seek time: 20 ms => 12 ms
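
The same calculation as a short Python sketch. The function name and parameter names are hypothetical, and it assumes decimal units (8 KB = 8,000 bytes, 4 MB/s = 4,000,000 bytes/s) so that the transfer term comes out at the slide's 2 ms.

```python
def disk_access_time_ms(seek_ms, rpm, transfer_bytes, rate_bytes_per_s,
                        overhead_ms, queue_ms=0.0):
    """Access time = queue + seek + rotational + transfer + overhead, in ms."""
    rotational_ms = 0.5 / (rpm / 60.0) * 1000.0          # half a revolution
    transfer_ms = transfer_bytes / rate_bytes_per_s * 1000.0
    return queue_ms + seek_ms + rotational_ms + transfer_ms + overhead_ms


# Advertised seek time (no locality).
advertised = disk_access_time_ms(12.0, 7200, 8_000, 4_000_000, 2.0)
# With locality, assuming the seek shrinks to 1/3 of the advertised value.
with_locality = disk_access_time_ms(12.0 / 3, 7200, 8_000, 4_000_000, 2.0)
print(round(advertised, 2), round(with_locality, 2))
```

With these inputs the seek term dominates the total, which is why the locality correction moves the result from about 20 ms to about 12 ms.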

DRAM as Disk

Solid state disk, Expanded Storage, NVRAM

Disk is slow, DRAM is fast => replace disk with battery-backed DRAM

BUT disk is cheap, much cheaper than DRAM

Network memory:
- fast networks (e.g., Myrinet)
- use the DRAM of other workstations as backing store
- Trapeze/GMS project here


R-DAT Technology
- 2000 RPM
- Helical recording scheme
- Four-head recording
- Tracks recorded 20 w/o guard band
- Read-after-write verify

Tape vs. Disk


Longitudinal tape uses the same technology as hard disk and tracks its density improvements.

Inherent cost-performance is based on geometry:
- fixed rotating platters with gaps (random access, limited area, fixed media)
- vs. removable long strips wound on a spool (sequential access, "unlimited" length)

New technology trend: helical scan (VCR, camcorder, DAT). It spins the head at an angle to the tape to improve density.


Storage costs
Media          Capacity       Cost        $/byte
Disk           80-300 GB      $50-$300    1 x 10^-9
Tape           2-300 GB       $4-$100     3 x 10^-11
CDR            750 MB         $0.10       1.2 x 10^-10
Flash EPROM    256-1000 MB    $40-$200    2 x 10^-7
Paper          10 KB          $0.01       1 x 10^-6
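
The $/byte column is simply cost divided by capacity. As a rough check on the two rows with single-valued entries, here is a minimal sketch; the helper name is hypothetical and decimal units (1 MB = 10^6 bytes, 1 KB = 10^3 bytes) are assumed.

```python
def dollars_per_byte(cost_dollars, capacity_bytes):
    """Cost of storing one byte on a given medium."""
    return cost_dollars / capacity_bytes


cdr = dollars_per_byte(0.10, 750e6)    # CDR: $0.10 for 750 MB
paper = dollars_per_byte(0.01, 10e3)   # Paper: $0.01 for 10 KB
print(f"CDR:   {cdr:.1e} $/byte")
print(f"Paper: {paper:.1e} $/byte")
```

Both results land close to the table's order of magnitude, with paper about four orders of magnitude more expensive per byte than CDR.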

Digital storage cost has a profound effect on the way we store and access information!

Digital storage (and the Web) opens up exciting new possibilities for access to information, culture, and knowledge. Digital storage also introduces a whole new set of problems for long-term information storage.


CDR vs. Tape


              CDR           Helical Scan Tape
Type          5.25"         8mm
Capacity      0.75 GB       5-90 GB
Media Cost    $0.10         $8
Drive Cost    $100          $600
Access        Write Once    Read/Write
Robot Time    10 - 20 s     10 - 20 s

Media cost ratio, CDR vs. helical tape: ~ 4 : 1

Current Drawbacks to Tape


- Tape wear-out: hundreds of passes for helical, thousands for longitudinal
- Head wear-out: 2000 hours for helical
- Both must be accounted for in the economic / reliability model
- Long rewind, eject, load, and spin-up times

