Architecture and Computer Organization
Part II: Memory Hierarchy
13.3 Virtual Memory


The main memory and the disk storage are not only hierarchically further away from the CPU, registers and cache; they are also physically further away. As depicted in Figure 13.6, the CPU, registers and cache are all situated on one chip, forming, together with other components, the processor. The buses on board the processor are extremely short (submicron scale) compared to the buses (cm scale) running between the processor and the main memory or the secondary storage. Because signal propagation through the buses is the major cause of delay, the increase in physical length accounts for the substantial increase in the timing of the system bus.

Figure 13.6: the processor (CPU, registers and on-board cache) connected through the system bus, by address and data lines, to the main memory and the secondary storage.

Similar to the cache, which exploits locality to close the timing gap between the registers and the main memory, the main memory (the upper level memory) can act as a cache between the cache and the secondary storage (the lower level memory). It is this technique that is called virtual memory (Fotheringham 1961; Kilburn 1962). The naming legacy, dating even from the time before the memory hierarchy, affects the terminology for similar concepts and components. It results in different names for the concepts and components across the virtual memory hierarchy, as illustrated in Figure 13.7.
Figure 13.7: the levels of the virtual memory hierarchy (registers, cache, main memory, disk) and the names of the data chunks exchanged between them: block, frame and page.
On each level of the memory hierarchy there is an upper level memory and a lower level memory with similar behaviour in the context of locality; however, the chunks of data on each level are named block, frame and page. The advantage of using different names is that it makes it unambiguous where the different chunks of data originate.

Historically there were two other reasons to put forward the idea of virtual memory. The first: programs grew larger more rapidly than the main memory size did. Luckily, the increase of the secondary storage size was able to keep pace with this trend, and even nowadays secondary storage is able to hold many programs in one location. The second: the safe sharing of the same main memory locations among multiple programs. It became possible to run large programs by executing only those parts of the programs that are loaded from the large secondary memory into the main memory. To put it the other way around: virtual memory refers to a technique where part of the hard disk is used as an extension of the main memory. So, in case a program outgrows the main memory during execution, parts of it can be swapped to the secondary memory.

13.3.1 Paging and the OS

Based on the principle of locality, there is no obligation to make the memory size the same for each of the levels in the memory hierarchy. That means that the address space of a higher level memory can be smaller than the address space of the lower level memory. Hence, the addresses generated by the process being executed must be translated into the address space of the virtual memory, as depicted in Figure 13.8.

Figure 13.8: the registers, cache and virtual memory of a process and its virtual memory address space.

The physical memory is partitioned in frames. The executable program is similarly divided into frame-sized chunks known as pages. So, the physical memory caches pages from the secondary memory. Hence, the processor sees a big virtual address space that is partitioned in pages. The application programmer does not normally concern himself with the management of the pages and leaves it to the virtual memory management services. The operating system (OS) controls this process.

Figure 13.9 illustrates the locality in a memory-reference pattern (IBM Systems Journal 1971). Each dot in the pattern represents a page reference (y-axis) generated during the execution (t-axis) of a program. The clustering of the dots confirms the locality. On starting the execution of a program, the OS loads from the secondary memory into the main memory only the specific pages needed for launching the program. At a certain moment

the fetch-execute cycle of the processor generates a request for data by means of an address that does not map onto the address space of the pages already loaded in the main memory. This situation is labeled a page fault. It leads to a page fault interrupt and triggers a process in the OS that will load a page from the secondary memory (see Figure 13.10).

Figure 13.10: the page fault sequence: the user program runs until a page fault occurs; the OS requests the page, a disk read takes place, the OS installs the page, and the user program resumes.
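As a minimal sketch of the sequence in Figure 13.10 (assuming dict-based stand-ins for the main memory and the disk; all names are illustrative):

```python
# Hypothetical sketch of the page-fault sequence: a page is returned from
# main memory when resident, and loaded from disk on a page fault.

def access(virtual_page, main_memory, disk):
    """Return the page contents, loading them from disk on a page fault."""
    if virtual_page in main_memory:      # page already resident: no fault
        return main_memory[virtual_page]
    # Page fault: the OS requests the page and a disk read takes place...
    page = disk[virtual_page]
    main_memory[virtual_page] = page     # ...the OS installs the page...
    return page                          # ...and the user program resumes.

disk = {0: "code", 1: "data"}
main_memory = {}
access(0, main_memory, disk)   # page fault: page 0 loaded from disk
access(0, main_memory, disk)   # hit: no disk access needed
```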

In case all frames in the main memory are occupied with pages, the OS needs to decide which frame can be replaced (see 13.3.3). Furthermore, the address generated during the fetch-execute cycle is a virtual address. The translation of the virtual address into a physical address of the main memory is performed by a combined effort of hardware and the OS. Because the data transfer time between main memory and disk is significantly longer than between register and cache, there is no necessity to implement the control in hardware. It is possible to implement the control in the OS.

13.3.2 Address translation

Figure 13.11 shows an arbitrary situation of the page status during a running process. The pages in the virtual memory depicted in white are the pages that are swapped in. Pages can be loaded at the start or after page faults are processed (see Figure 13.10). In case there is more than one process, each process has its own virtual memory address space that maps onto its part of the physical memory space and its part of the secondary storage. This mapping is called relocation and ensures that each program can be loaded anywhere in the main memory. Moreover, the concept of paging with fixed-size blocks eliminates the need to find adjoining blocks of memory to accommodate a program.

Figure 13.11: the virtual memory address space of a process (pages 0 to N) mapped onto the frames of the physical memory address space and onto the secondary storage; pages are marked swapped-in, swapped-out or unmapped.

To determine the physical address based on the virtual address, an address translation is performed. In virtual memory the address is split into two parts: a virtual page number and a page offset (see Figure 13.12). The page size delimits the number of bits in the page offset

field. The number of bits in the virtual page number field is in principle unrestricted. The illusion of a single big memory is a result of this unrestricted size of the field. It comes down to a decoupling of the one-to-one relationship between the virtual memory space and the physical memory address locations.

Figure 13.12: the virtual page number part of the virtual address is mapped through the translation table onto a physical page number, which is combined with the unchanged page offset to form the physical address.

Because the penalty for a page fault is extremely high, additional provisions are necessary to reduce the page fault rate. An implementation of the translation table as a lookup table that resides in main memory is a first step (see Figure 13.13). The page table register is extra hardware that stores the start address of the lookup table of a specific running process. For each process there is a page table register value that points to its own lookup table. The OS is in charge of the organization and control of the page table registers and of filling the lookup tables.
Figure 13.13: the page table register points to the lookup table (page table) of the running process; the virtual page number selects an entry holding a valid bit (V), an access control bit (AC), a dirty bit (D, see 13.3.3) and a frame number, which is combined with the page offset to form the physical address.
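A sketch of this split and lookup, assuming 4 KiB pages (a 12-bit page offset) and a toy page table; the sizes and names are assumptions, not taken from a specific machine:

```python
# Hypothetical address translation: split the virtual address into a
# virtual page number and a page offset, look the page number up in a
# small page table, and recombine frame number and offset.

PAGE_OFFSET_BITS = 12
PAGE_SIZE = 1 << PAGE_OFFSET_BITS   # 4096 bytes per page (assumed)

# Page table: virtual page number -> (valid bit, frame number)
page_table = {0: (1, 5), 1: (1, 2), 2: (0, None)}  # page 2 is on disk

def translate(virtual_address):
    vpn = virtual_address >> PAGE_OFFSET_BITS     # virtual page number
    offset = virtual_address & (PAGE_SIZE - 1)    # page offset (unchanged)
    valid, frame = page_table[vpn]
    if not valid:
        raise RuntimeError("page fault")          # page resides on disk
    return (frame << PAGE_OFFSET_BITS) | offset   # physical address

# Virtual address 0x1ABC = page 1, offset 0xABC -> frame 2, offset 0xABC
assert translate(0x1ABC) == 0x2ABC
```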

Each page table entry contains information about a single page. The most important part of this information is the frame number, i.e. where the page is located in physical memory. The address translation combines the frame number (the physical page number) with the page offset part of the virtual address to form the physical address. The valid bit is set if the page is in physical memory and not set if the page is on the disk. This is equivalent to the caching technique. The access control bit indicates how a page may be used; it is a rather primitive form of memory protection. Access control can be used to indicate whether the page may be read, read/written, or executed. In this manner processes are prevented from destroying data that may belong to other processes. Moreover, at this point all the information about the state of a process is available. Only the data residing in the program counter, the registers and the page table are essential. Because each process has its own page table register, at a process switch the OS only needs to save the program counter, the registers and the page table register of the interrupted process.
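The packing of these bits into one page table entry can be sketched as follows; the field widths (1 valid bit, 2 access control bits, 29 frame-number bits) and the AC encodings are illustrative assumptions:

```python
# Hypothetical page table entry layout: V | AC | frame number, packed
# into a single 32-bit integer. Widths and encodings are assumed.

AC_READ, AC_READ_WRITE, AC_EXECUTE = 0, 1, 2   # hypothetical AC values

def make_entry(valid, ac, frame):
    """Pack V (1 bit), AC (2 bits) and frame number (29 bits)."""
    return (valid << 31) | (ac << 29) | frame

def is_valid(entry):
    return (entry >> 31) & 1               # 0: page on disk, 1: resident

def access_control(entry):
    return (entry >> 29) & 0b11            # how the page may be used

def frame_number(entry):
    return entry & ((1 << 29) - 1)         # physical page number

e = make_entry(1, AC_READ_WRITE, 42)
assert (is_valid(e), access_control(e), frame_number(e)) == (1, AC_READ_WRITE, 42)
```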

Hence, there is no need to save the entire page table, because it remains idle in the main memory and the page table register contains its start address.

13.3.3 Handling page faults and penalty reduction

As depicted in Figure 13.10, the OS is in control of the placement of pages. Placement is straightforward as long as the main memory does not contain its maximum number of pages. Otherwise the OS needs to determine which page must return to the secondary memory to make room for the requested page. This replacement technique is called swapping. Because page faults introduce a huge penalty, the choice of which page will be swapped out of the main memory is crucial. Based on the locality principle, it is not the most recently used page that is chosen to swap. Instead the least recently used or LRU page is chosen, which reduces the number of faults. The LRU replacement scheme is performed by the OS, but other replacement schemes can be implemented as well. Because of the longer data transfer time (as mentioned at the end of 13.3.1), the data write schemes used for register-cache writes need some extra thought in the virtual memory case. A write to a hard disk of a single word on each update gives a huge penalty. Hence, the write-through scheme is very inefficient and instead a write-back scheme is used. Because the transfer time consists of two components, the throughput and the latency, there is another advantage in not writing single words: the throughput or disk transfer time is short compared to the latency or access time of the disk. Hence, it is much more efficient to transfer a whole page. Because write-backs still give a high penalty, it is interesting to know whether a page needs to be copied back when it is chosen for replacement. By adding a so-called dirty bit (see Figure 13.13) it is possible to track if a page has been written since it was placed into the main memory.
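The interplay of LRU replacement and the dirty bit can be sketched as a small simulation; the class and its counters are illustrative assumptions, not a real OS implementation:

```python
# Hypothetical LRU page replacement with write-back: only dirty pages
# cost a disk write when they are evicted.
from collections import OrderedDict

class MainMemory:
    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.pages = OrderedDict()   # page -> dirty bit, ordered by recency
        self.disk_writes = 0

    def touch(self, page, write=False):
        """Reference a page, faulting it in and evicting the LRU page if needed."""
        if page in self.pages:
            dirty = self.pages.pop(page) or write
        else:                                    # page fault
            if len(self.pages) == self.num_frames:
                victim, victim_dirty = self.pages.popitem(last=False)  # LRU
                if victim_dirty:                 # write back only dirty pages
                    self.disk_writes += 1
            dirty = write
        self.pages[page] = dirty                 # most recently used position

mem = MainMemory(num_frames=2)
mem.touch(0, write=True)   # load page 0, mark dirty
mem.touch(1)               # load page 1, clean
mem.touch(2)               # evicts LRU page 0 -> one disk write
assert mem.disk_writes == 1
mem.touch(3)               # evicts clean page 1 -> no disk write
assert mem.disk_writes == 1
```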
A dirty bit is set at the moment a word in a page is written, and that page is then called a dirty page. A page swap involves a disk write only if the dirty bit is set; no disk writes are performed if the dirty bit is not set.

13.3.4 Improving address translation performance

The locality principle can also be exploited to improve the performance of the address translation. The aim is to reduce the number of accesses to the page table that resides in main memory. This reduction is achieved by implementing a cache that keeps track of recently referenced virtual page numbers. This address translation cache is called a translation-lookaside buffer or TLB (see Figure 13.14). A TLB miss can indicate either a page fault or just a missing translation. If the page is in memory, the TLB miss only indicates that the translation is missing; in that case, the OS can handle the TLB miss by loading the translation from the page table into the TLB. If the page is not in main memory, the TLB miss indicates a true page fault, and the OS performs the standard action (see Figure 13.10). After a TLB miss, once the missing translation is retrieved from the page table, a TLB entry is chosen and replaced with this translation.


Figure 13.14: the translation-lookaside buffer. The virtual page number is compared with the tags of the TLB entries; on a hit, a valid entry (V) directly supplies the physical page address. On a miss, the page table, pointed to by the page table register, supplies the physical page number (if the page is in physical memory) or the disk address (if it is in disk storage).
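A TLB in front of the page table can be sketched as a small cache of recent translations; the sizes, the naive replacement choice and all names are illustrative assumptions:

```python
# Hypothetical TLB: a tiny dictionary caching recent VPN -> frame
# translations so that most lookups skip the page table in main memory.

PAGE_TABLE = {0: 7, 1: 3, 2: None}   # None: page on disk (page fault)
TLB_SIZE = 2
tlb = {}

def tlb_translate(vpn):
    if vpn in tlb:                        # TLB hit: no page table access
        return tlb[vpn]
    frame = PAGE_TABLE[vpn]               # TLB miss: consult the page table
    if frame is None:
        raise RuntimeError("page fault")  # true page fault: OS loads the page
    if len(tlb) == TLB_SIZE:              # choose an entry to replace
        tlb.pop(next(iter(tlb)))          # naive choice; real TLBs vary
    tlb[vpn] = frame                      # install the missing translation
    return frame

assert tlb_translate(0) == 7    # miss: translation installed in the TLB
assert tlb_translate(0) == 7    # hit: served from the TLB
```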

13.3.5 The Cache, TLB and Virtual Memory chain

The OS, together with the hardware, is in control of the complete memory hierarchy. The hierarchy is formed by the processor with its registers, the cache and the virtual memory. All stages work in lockstep with each other, so that data cannot be in the cache unless it is present in main memory. The OS modifies the content of the page tables and the TLBs according to the pages swapped between main memory and the disk, based on e.g. page faults or cache misses. The flow chart in Figure 13.15 shows that this process starts with the virtual address generated by the processor. The next step is to consult the TLB; if it is a hit, a physical address is produced. If the access is not a write, hence a read from the cache, it can give a hit or a miss; in case of a miss a new block must be read. If it is a write and the write access bit is on, the data can be written into the cache, the tag is updated, and the data and the address are put into the write buffer; if the write access bit is off, a write protection exception is raised.


Figure 13.15: flow chart of a memory access. The processor generates a virtual address and the TLB is accessed. On a TLB miss, a TLB miss exception is raised and the page table is looked up. On a TLB hit, a physical address is produced. For a read, the data is fetched from the cache: a hit delivers the data to the CPU, a miss stalls while the block is read. For a write, if the write access bit is on, the data is written into the cache, the tag is updated and the data and address are put into the write buffer; if the bit is off, a write protection exception is raised.
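The decisions of Figure 13.15 can be expressed as a single function; the TLB and cache are stand-in containers and all names are illustrative assumptions:

```python
# Hypothetical sketch of the memory-access flow: TLB lookup, then either
# a read (cache hit/miss) or a write (checked against the access bit).

def memory_access(vaddr, tlb, cache, is_write, write_access_ok):
    if vaddr not in tlb:                      # TLB miss exception:
        return "lookup page table"            # the OS walks the page table
    paddr = tlb[vaddr]                        # TLB hit: physical address
    if not is_write:
        if paddr in cache:
            return "deliver data to the CPU"  # cache hit
        return "cache miss: stall while reading block"
    if write_access_ok:                       # write access bit on
        return "write into cache, update tag, use write buffer"
    return "write protection exception"       # write access bit off

tlb = {0x1000: 0x2000}
cache = {0x2000}
assert memory_access(0x1000, tlb, cache, is_write=False,
                     write_access_ok=True) == "deliver data to the CPU"
```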

13.3.6 Questions

1. Which gives the lowest latency in accessing a page: access to physical memory or access to the hard disk? What is the order of magnitude of the latency difference between the two?
2. Discuss the page table with a suitable example and illustrate it with a drawing.
3. Explain the significance of the control bits in the paging mechanism.
4. In case a demanding process holds such a large memory space that the page table will not fit in memory, what strategy would you follow for the paging?
5. Revisit Figure 13.9 and explain how spatial and temporal locality are observable in this figure.
6. Name at least three places where the hardware interacts with the OS; these form the so-called hardware/software interface.

13.4 Further reading


John Fotheringham, "Dynamic storage allocation in the Atlas computer, including an automatic use of a backing store," Communications of the ACM, June 1961.
T. Kilburn, D. B. G. Edwards, M. J. Lanigan, and F. H. Sumner, "One-level storage system," IRE Transactions on Electronic Computers, April 1962.
D. J. Hatfield and J. Gerald, "Program restructuring for virtual memory," IBM Systems Journal, no. 3, 1971.
Peter J. Denning, "Before memory was virtual," George Mason University, January 1996.
Peter J. Denning, "The locality principle," Communications of the ACM, vol. 48, no. 7, July 2005.


13.5 References
Fotheringham, J. "Dynamic storage allocation in the Atlas computer, including an automatic use of a backing store." Communications of the ACM 4, no. 10 (October 1961): 435-436.
Hatfield, D. J., and Gerald, J. "Program restructuring for virtual memory." IBM Systems Journal, no. 3 (1971).
Kilburn, T., Edwards, D. B. G., Lanigan, M. J., and Sumner, F. H. "One-level storage system." IRE Transactions on Electronic Computers EC-11, no. 2 (April 1962): 223-235.

