
CS433: Computer System Organization
Main Memory
Virtual Memory
Translation Lookaside Buffer
Acknowledgement
 Slides from previous CS433 semesters
 TLB figures taken from
http://www.cs.binghamton.edu/~kang/cs350/Chap
 Memory figures from
http://larc.ee.nthu.edu.tw/~cthuang/courses/ee3450/lectures/flecture/F0713.gif
Topics Today
 Main Memory (chap 5.8, 5.9)
 Simple main memory
 Wider memory
 Interleaved memory
 Memory technologies
 Virtual Memory (chap 5.10, 5.11)
 Basics
 Address translation
Simple Main Memory
 Consider a memory with these parameters:
 1 cycle to send an address
 6 cycles to access each word
 1 cycle to send word back to CPU/cache

What is the miss penalty for a 4-word block?
(1 + 6 + 1) per word x 4 words = 32 cycles
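Using the slide's cycle counts, this calculation can be sketched in Python (the constant and function names are illustrative):

```python
# Simple (one-word-wide) memory: each word in the block pays the
# full address + access + transfer cost in sequence.
ADDR_CYCLES = 1    # send an address
ACCESS_CYCLES = 6  # access one word
XFER_CYCLES = 1    # send the word back to CPU/cache

def simple_miss_penalty(block_words):
    return (ADDR_CYCLES + ACCESS_CYCLES + XFER_CYCLES) * block_words

print(simple_miss_penalty(4))  # -> 32 cycles
```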
Wider Main Memory
 Make the memory wider
 Higher bandwidth by reading more
words at once
 Same problem…
 1 cycle to send an address
 6 cycles to access each doubleword
 1 cycle to send word back to
CPU/cache
 Miss penalty:
(1 + 6 + 1) x 2 doublewords = 16 cycles
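The same arithmetic for the wider memory, as a sketch (parameter names are illustrative):

```python
# Doubling the width halves the number of transfers: a 4-word block
# moves as 2 doublewords, each paying the full 1 + 6 + 1 cost.
def wide_miss_penalty(block_words, words_per_transfer,
                      addr_cycles=1, access_cycles=6, xfer_cycles=1):
    transfers = block_words // words_per_transfer
    return (addr_cycles + access_cycles + xfer_cycles) * transfers

print(wide_miss_penalty(4, 2))  # -> 16 cycles
print(wide_miss_penalty(4, 1))  # -> 32 cycles (one-word-wide baseline)
```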
Wider Main Memory
 Make the memory wider
 Higher bandwidth
 Read more words in parallel
 Cost:
 Wider bus
 Larger expansion size
 Minimum increment must correspond to width
 Double width -> minimum increment must be doubled
Interleaved Main Memory
 Organize memory in
banks
 Subsequent words map
to different banks
 Example: word A in bank
(A mod M)
 Within a bank, word A in
location (A div M)
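A minimal sketch of this low-order interleaving, assuming M = 4 banks as an example:

```python
M = 4  # number of banks (example value)

def bank_of(word_addr):
    return word_addr % M    # word A lives in bank (A mod M)

def location_in_bank(word_addr):
    return word_addr // M   # at location (A div M) within its bank

# Consecutive words round-robin across the banks:
print([bank_of(a) for a in range(8)])  # -> [0, 1, 2, 3, 0, 1, 2, 3]
```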
Interleaved Main Memory
 Same problem…
 1 cycle to send an
address
 6 cycles to access each
word
 1 cycle to send word
back to CPU/cache
 Miss penalty:
1 + 6 + (1 x 4) = 11 cycles
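A sketch of the interleaved case: with at least four banks, the four 6-cycle accesses overlap, so only the address send and the per-word transfers are serialized.

```python
def interleaved_miss_penalty(block_words,
                             addr_cycles=1, access_cycles=6,
                             xfer_cycles=1):
    # One address send, one (overlapped) access latency, then one
    # transfer cycle per word as the banks drain in turn.
    return addr_cycles + access_cycles + xfer_cycles * block_words

print(interleaved_miss_penalty(4))  # -> 11 cycles
```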
Memory Technologies
 Dynamic Random Access Memory (DRAM)
 Optimized for density, not speed
 One transistor per cell
 Cycle time about twice the access time
 Destructive reads
 Must refresh every few ms
 Refresh must access every row
Memory Technologies
 Static Random Access Memory (SRAM)
 Optimized for speed, then density
 About 4 – 6 transistors per cell
 Separate address pins
 Static = no refresh
 Access time = cycle time
Memory Technologies
 ROM
 Read-only
 1 transistor per cell
 Nonvolatile
 Flash
 Allows memory to be modified
 Nonvolatile
 Slow write
Virtual Memory
 Motivation:
 Give each running program its own private address space
 Prevent a process from modifying other processes' memory
 Want programs to be protected from each other
 bug in one program can’t corrupt memory in another
program
 Programs can run in any location in physical
memory
 Simplify loading the program
Virtual Memory
 Motivation:
 Want programs running simultaneously to share
underlying physical memory
 Want to use disk as another level in the memory
hierarchy
 Treat main memory as a cache for disk
Virtual Memory
 The program sees virtual addresses
 Main memory sees physical addresses

 Need translation from virtual to physical


Virtual Memory vs Cache
 Virtual Memory
 Longer miss penalty
 Handled by OS
 Size determined by processor address space
 Cache
 Handled by hardware
 Size independent of processor address space
Alias
 2 virtual addresses map
to the same physical
address
 Consistency problems
Virtual Memory
 Block replacement strategy
 LRU
 Write-back strategy
 With dirty bit
 Allows blocks to be written to disk when the block
is replaced
Memory-Management Scheme
 Segmentation

 Paging

 Segmentation + Paging
Segmentation
 Supports user view of memory.
 Virtual memory is divided into variable
length regions called segments
 Virtual address consists of a segment
number and a segment offset
 <segment-number, offset>
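One common way to resolve a <segment-number, offset> pair is a base-and-limit segment table; a hedged sketch (the table contents and names are made up for illustration, and the slides do not commit to a particular scheme):

```python
# segment number -> (base, limit); hypothetical values
segment_table = {0: (0x1000, 0x400), 1: (0x8000, 0x100)}

def translate(segment, offset):
    base, limit = segment_table[segment]
    if offset >= limit:                     # offset checked against limit
        raise ValueError("offset past end of segment")
    return base + offset                    # physical address

print(hex(translate(0, 0x20)))  # -> 0x1020
```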
Fragmentation
 Memory allocation is a dynamic process that uses a best-fit or first-fit algorithm
 Fragmentation is NOT segmentation
 External
 Internal
 Data
Internal vs External
 External fragmentation:
 free space divided into many small pieces
 result of allocating and deallocating pieces of the storage
space of many different sizes
 one may have plenty of free space, but it may not all be usable, or at least not as efficiently as one would like
 Unused portion of main memory
 Internal fragmentation:
 result of reserving a piece of space without ever intending
to use it
 Unused portion of page
Paging
 Virtual memory is divided into fixed-size blocks called pages
 typically a few kilobytes
 should be a natural unit of transfer to/from disk
 Page replacement
 LRU, MRU, Clock… etc
 Page placement
 Fully associative - efficient
Paging
 Page Identification
 Virtual address is divided into
<virtual page number, page offset>
 The virtual page number is translated into a
physical page number
 Provides indirection
 Indirection is always good
 Translation cached in a buffer
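The <virtual page number, page offset> split is simple bit arithmetic; a sketch assuming 4 KiB pages (12 offset bits) as an example:

```python
PAGE_SHIFT = 12                   # 4 KiB pages -> 12 offset bits
PAGE_MASK = (1 << PAGE_SHIFT) - 1

def split(vaddr):
    # High bits are the virtual page number, low bits the page offset.
    return vaddr >> PAGE_SHIFT, vaddr & PAGE_MASK

vpn, offset = split(0x1234ABC)
print(hex(vpn), hex(offset))  # -> 0x1234 0xabc
```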
Paging vs Segmentation
 Paging:  Segmentation:
 Block replacement easy  Block replacement hard
 Fixed-length blocks  Variable-length blocks

 Need to find
contiguous, variable-
sized, unused part of
main memory
Paging vs Segmentation
 Paging:  Segmentation:
 Invisible to application  Visible to application
programmer programmer
 No external fragmentation  No internal fragmentation
 There is internal  Unused pieces of main

fragmentation memory
 Unused portion of page  There is external
 Units of code and data are fragmentation
broken up into separate  Keeps blocks of code or
pages data as single units
Segmentation + Paging
 Segmentation is typically combined with
paging
 Paging is transparent to the programmer
 Segmentation is visible to the programmer

 Each segment is broken into fixed-size pages


 Simplify page replacement
 Memory doesn’t have to be contiguous
Page Table
 The page table is a collection of PTEs, indexed by virtual page number
 PTE = page table entry
 The page table tells where a particular virtual
page address is stored on disk and in the main
memory
 Each process has its own page table
 Size depends on # of pages, page size, and
virtual address space
Page Table

 virtual address is divided into


 virtual page number
 page offset
Page Table

 Page-table base-register tells where the page table starts


 Page offset maps directly to the page offset of the physical address
 Reference bit is set when a page is accessed
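Putting the pieces together, a one-level lookup sketch (4 KiB pages assumed; the table contents are hypothetical, and page faults are not handled):

```python
PAGE_SHIFT = 12
page_table = {0x1234: 0x00AB}  # hypothetical VPN -> physical page number

def translate(vaddr):
    vpn = vaddr >> PAGE_SHIFT
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    ppn = page_table[vpn]                # KeyError here would be a page fault
    return (ppn << PAGE_SHIFT) | offset  # offset carried through unchanged

print(hex(translate(0x1234ABC)))  # -> 0xababc
```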
Page Table

 Each virtual memory reference can cause two physical memory accesses
 One to fetch the page table entry
 One to fetch the data
Page Table

 Address translation must be performed for every memory access
 Need optimization!
TLB
 Problems:
 Every virtual memory reference -> 2 physical memory accesses
 Every memory access -> address translation
 To overcome this problem, a high-speed cache is set up for page table entries
 Called a Translation Lookaside Buffer (TLB)
TLB
 The Translation Lookaside Buffer
 a small cache
 contains translations for the most recently used
pages
 The processor consults the TLB first when accessing memory
Address Translation

 Virtual page number (Tag, Index)


 Index tells which row to access
Address Translation

 TLB is a cache
 Fully associative for efficiency
TLB
 Given a virtual address, processor examines
the TLB
 If page table entry is present (TLB hit)
 the frame number is retrieved and the real
address is formed
TLB
 If the page table entry is not found in the TLB (TLB miss)
 Either: hardware checks the page table and loads the new page table entry into the TLB
 Or: hardware traps to the OS; it is up to the OS to decide what to do
 OS knows which program caused the TLB fault
 the software handler will get the address mapping
TLB Miss
 If page is in main memory (page hit)
 If the mapping is in page table, we add the entry
to the TLB, evicting an old entry from the TLB
TLB
 If page is on disk (page fault)
 load the page off the disk into a free block of
memory
 update the process's page table