
Operating Systems CMPSCI 377 Dynamic Memory Management

Emery Berger
University of Massachusetts Amherst

UNIVERSITY OF MASSACHUSETTS AMHERST Department of Computer Science

Dynamic Memory Management

How the heap manager is implemented

    malloc, free
    new, delete

Memory Management

Ideal memory manager:

Fast
    Raw time, asymptotic runtime, locality

Memory efficient
    Low fragmentation

With multicore & multiprocessors:
    Scalable to multiple processors

New issues:
    Secure from attack
    Reliable in face of errors

Memory Manager Functions

Not just malloc/free

realloc
    Change size of object, copying old contents
    ptr = realloc (ptr, 10);
    But: realloc (ptr, 0) = ?
    How about: realloc (NULL, 16) ?

Other fun
    calloc
    memalign

Needs ability to locate size & object start
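The two questions on this slide have standard answers worth keeping in mind: realloc (NULL, 16) behaves like malloc (16), while realloc (ptr, 0) was implementation-defined in older C standards and is undefined behavior in C23, so portable code avoids it. A minimal sketch of a careful wrapper follows; the name resize_buffer is hypothetical and not part of any library.

```c
#include <assert.h>
#include <stdlib.h>

/* Resize *buf to new_size bytes, keeping the old contents.
 * Returns 1 on success; on failure returns 0 and leaves *buf valid. */
int resize_buffer(char **buf, size_t new_size)
{
    assert(buf != NULL);
    if (new_size == 0)
        return 0;                      /* realloc(p, 0) is not portable: refuse it */
    char *p = realloc(*buf, new_size); /* realloc(NULL, n) acts like malloc(n) */
    if (p == NULL)
        return 0;                      /* old block is untouched, caller still owns it */
    *buf = p;                          /* the object may have moved */
    return 1;
}
```

Assigning the result to a temporary first avoids losing the only pointer to the old block when realloc fails.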

Fragmentation

Intuitively, fragmentation stems from breaking up heap into unusable spaces

More fragmentation = worse utilization of memory

External fragmentation
    Wasted space outside allocated objects

Internal fragmentation
    Wasted space inside an object

Classical Algorithms

First-fit

find the first free chunk large enough for the request

Classical Algorithms

Best-fit

find the free chunk that fits best (the smallest one that is large enough)

Minimizes wasted space
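First-fit and best-fit differ only in when the search stops. The sketch below assumes a singly linked free list of chunk descriptors; the chunk type and both function names are illustrative, not taken from any particular allocator.

```c
#include <assert.h>
#include <stddef.h>

typedef struct chunk {
    size_t size;           /* usable bytes in this free chunk */
    struct chunk *next;    /* next chunk on the free list */
} chunk;

/* First-fit: stop at the first chunk that is large enough. */
chunk *first_fit(chunk *free_list, size_t request)
{
    for (chunk *c = free_list; c != NULL; c = c->next)
        if (c->size >= request)
            return c;
    return NULL;
}

/* Best-fit: scan the whole list for the smallest chunk that still fits. */
chunk *best_fit(chunk *free_list, size_t request)
{
    chunk *best = NULL;
    for (chunk *c = free_list; c != NULL; c = c->next)
        if (c->size >= request && (best == NULL || c->size < best->size))
            best = c;
    return best;
}
```

With free chunks of 16, 64, and 24 bytes in that order, a 20-byte request gets the 64-byte chunk from first-fit and the 24-byte chunk from best-fit.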

Classical Algorithms

Worst-fit

find the chunk that fits worst (the largest one), then split the object

Reclaim space: coalesce free adjacent objects into one big object
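Coalescing can be sketched as one pass over a free list kept in address order. The block descriptor below (offset, size, next) is an assumption made for illustration; real allocators usually keep this information in headers inside the heap itself.

```c
#include <assert.h>
#include <stddef.h>

typedef struct block {
    size_t start;          /* offset of the free block within the heap */
    size_t size;           /* size of the free block in bytes */
    struct block *next;    /* next free block, in increasing address order */
} block;

/* Merge every run of touching free blocks into one block.
 * Returns the number of merges performed. Absorbed descriptors
 * are only unlinked; the caller still owns their storage. */
size_t coalesce(block *free_list)
{
    size_t merges = 0;
    block *b = free_list;
    while (b != NULL && b->next != NULL) {
        if (b->start + b->size == b->next->start) {
            b->size += b->next->size;   /* absorb the neighbor */
            b->next = b->next->next;
            merges++;
        } else {
            b = b->next;
        }
    }
    return merges;
}
```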

Implementation Techniques

Freelists

Linked lists of objects in same size class

Range of object sizes

First-fit, best-fit in this context


Implementation Techniques

Segregated size classes

Use free lists, but never coalesce or split

Choice of size classes
    Exact
    Powers-of-two
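With power-of-two size classes, a request is rounded up to the next class, and the rounding is where internal fragmentation comes from. A minimal sketch, assuming a smallest class of 8 bytes (MIN_SIZE and all function names here are hypothetical):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_SIZE 8u   /* assumed smallest size class */

/* Index of the smallest power-of-two class that holds sz bytes:
 * 1..8 -> 0, 9..16 -> 1, 17..32 -> 2, ... */
unsigned size_class_of(size_t sz)
{
    assert(sz <= SIZE_MAX / 2);   /* keeps the doubling below from overflowing */
    unsigned c = 0;
    size_t cap = MIN_SIZE;
    while (cap < sz) {
        cap <<= 1;
        c++;
    }
    return c;
}

/* Object size served by class c. */
size_t class_size(unsigned c)
{
    return (size_t)MIN_SIZE << c;
}

/* Internal fragmentation: bytes lost to rounding up. */
size_t wasted_bytes(size_t sz)
{
    return class_size(size_class_of(sz)) - sz;
}
```

A 100-byte request lands in the 128-byte class and wastes 28 bytes; exact size classes would waste none but need many more free lists.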

Implementation Techniques

Big Bag of Pages (BiBOP)

Page or pages (multiples of 4K)
    Usually segregated size classes

Header contains metadata
    Locate with bitmasking

Limits external fragmentation

Can be very fast
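Locating the header by bitmasking can be sketched as follows, assuming single 4K pages that start on a 4K boundary and a header at the start of each page. The page_header layout and the function names are assumptions for illustration; aligned_alloc is the C11 standard call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SIZE 4096u   /* assumed page size (4K) */

typedef struct page_header {
    size_t object_size;   /* every object on this page has this size */
} page_header;

/* Get one page-aligned page dedicated to objects of object_size bytes. */
page_header *new_page(size_t object_size)
{
    page_header *h = aligned_alloc(PAGE_SIZE, PAGE_SIZE);
    if (h != NULL)
        h->object_size = object_size;
    return h;
}

/* Bitmasking: clearing the low 12 bits of any address inside the page
 * yields the page start, which is where the header lives. */
page_header *header_of(const void *obj)
{
    return (page_header *)((uintptr_t)obj & ~(uintptr_t)(PAGE_SIZE - 1));
}

/* Size lookup without a per-object header. */
size_t object_size_of(const void *obj)
{
    return header_of(obj)->object_size;
}
```

Because the size lives in the page header, individual objects need no header of their own, and the size lookup needed by free and realloc is a mask plus one load.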

Runtime Analysis

Key components

    Cost of malloc (best, worst, average)
    Cost of free
    Cost of size lookup (for realloc & free)

Examine for first-fit, best-fit, segregated (with BiBOP)

Space Bounds

Fragmentation worst-case for optimal: O(log(M/m))
    M = largest object size
    m = smallest object size

Best-fit = O(M * m) !

Goal: perform well for typical programs

Considerations:
    Internal fragmentation
    External fragmentation
    Headers (metadata)

Performance Issues

We'll talk about scalability later
    Reliability, too

But: general-purpose allocator often seen as too slow

Custom Memory Allocation

Programmers replace new/delete, bypassing system allocator
    Reduce runtime: often
    Expand functionality: sometimes
    Reduce space: rarely

Very common practice
    Apache, gcc, lcc, STL, database servers
    Language-level support in C++

Widely recommended
    Use custom allocators

Drawbacks of Custom Allocators

Avoiding system allocator:


    More code to maintain & debug
    Can't use memory debuggers
    Not modular or robust:
        Mix memory from custom and general-purpose allocators → crash!

Increased burden on programmers

Are custom allocators really a win?

(I) Per-Class Allocators

Recycle freed objects from a free list

a = new Class1;
b = new Class1;
c = new Class1;
delete a;
delete b;
delete c;
a = new Class1;
b = new Class1;
c = new Class1;

[Diagram: Class1 free list]

Fast
    + Linked list operations

Simple
    + Identical semantics
    + C++ language support

- Possibly space-inefficient
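The slide's example is C++; the same recycling idea can be sketched in C with a per-type free list. Everything here (Class1's fields, the Slot union, class1_new, class1_delete) is an illustrative assumption, not the slide author's code.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Class1 { int payload[4]; } Class1;

/* A slot is either a live object or a link in the free list. */
typedef union Slot {
    Class1 obj;
    union Slot *next;
} Slot;

static Slot *class1_free_list = NULL;

/* Allocate: pop a recycled slot if one exists, else ask malloc. */
Class1 *class1_new(void)
{
    if (class1_free_list != NULL) {
        Slot *s = class1_free_list;
        class1_free_list = s->next;
        return &s->obj;
    }
    Slot *s = malloc(sizeof *s);
    return s != NULL ? &s->obj : NULL;
}

/* Free: push the slot on the list instead of returning it to the system. */
void class1_delete(Class1 *p)
{
    if (p == NULL)
        return;
    Slot *s = (Slot *)p;          /* obj is a member of the union, so this is the slot */
    s->next = class1_free_list;
    class1_free_list = s;
}
```

Both operations are a couple of pointer moves, which is the "fast" part. The "possibly space-inefficient" part is also visible: recycled slots are never handed back to the general-purpose allocator, so they cannot be reused for other types.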

(II) Custom Patterns

Tailor-made to fit allocation patterns

Example: 197.parser (natural language parser)


[Diagram: char[MEMORY_LIMIT] array holding objects a, b, c, and later d; the end_of_array pointer moves as objects are allocated and freed]

a = xalloc(8);
b = xalloc(16);
c = xalloc(8);
xfree(b);
xfree(c);
d = xalloc(8);

Fast
    + Pointer-bumping allocation

- Brittle
    - Fixed memory size
    - Requires stack-like lifetimes
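A pointer-bumping allocator with stack-like reclamation might look roughly like the sketch below. This is not the 197.parser source: the header layout, the slot-based bookkeeping, and the small MEMORY_LIMIT are assumptions chosen to reproduce the behavior the slide describes.

```c
#include <assert.h>
#include <stddef.h>

#define MEMORY_LIMIT 1024            /* bytes; tiny, for illustration */
#define NO_BLOCK ((size_t)-1)

typedef union header {
    struct { size_t prev; int is_free; } info;  /* prev: index of previous block */
    max_align_t align;                           /* keeps payloads aligned */
} header;

#define SLOTS (MEMORY_LIMIT / sizeof(header))

static header memory[SLOTS];
static size_t end_of_array = 0;      /* index of the first unused slot */
static size_t top = NO_BLOCK;        /* index of the newest block's header */

void *xalloc(size_t size)
{
    if (size > MEMORY_LIMIT)
        return NULL;
    /* one slot for the header plus enough slots for the payload */
    size_t slots = 1 + (size + sizeof(header) - 1) / sizeof(header);
    if (slots > SLOTS - end_of_array)
        return NULL;                 /* fixed memory size: no fallback */
    header *h = &memory[end_of_array];
    h->info.prev = top;
    h->info.is_free = 0;
    top = end_of_array;
    end_of_array += slots;           /* bump the pointer */
    return h + 1;
}

void xfree(void *p)
{
    if (p == NULL)
        return;
    ((header *)p - 1)->info.is_free = 1;
    /* Space is reclaimed only from the top, in stack order. */
    while (top != NO_BLOCK && memory[top].info.is_free) {
        end_of_array = top;
        top = memory[top].info.prev;
    }
}
```

Running the slide's sequence, xfree(b) only marks b, xfree(c) pops both c and b, and d then reuses b's address. An object freed below a long-lived one stays unusable until everything above it is freed, which is why stack-like lifetimes are required.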

(III) Regions

Separate areas, deletion only en masse


regioncreate(r)
regionmalloc(r, sz)
regiondelete(r)

[Diagram: region r]

Fast
    + Pointer-bumping allocation
    + Deletion of chunks

Convenient
    + One call frees all memory

- Risky
    - Dangling references
    - Too much space

Increasingly popular custom allocator
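The three region calls could be sketched as below. The function names follow the slide, but the signatures (a pointer to a region struct), the 4K chunk size, and the refusal of oversized requests are assumptions made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_SIZE 4096   /* assumed chunk size in bytes */

typedef struct chunk {
    struct chunk *next;   /* older chunks in this region */
    size_t used;          /* bytes handed out from data so far */
    max_align_t data[CHUNK_SIZE / sizeof(max_align_t)];
} chunk;

typedef struct region {
    chunk *chunks;        /* newest chunk first */
} region;

void regioncreate(region *r)
{
    r->chunks = NULL;
}

void *regionmalloc(region *r, size_t sz)
{
    size_t a = sizeof(max_align_t);
    if (sz > CHUNK_SIZE)
        return NULL;                       /* sketch: no oversized objects */
    sz = (sz + a - 1) / a * a;             /* round up to keep alignment */
    if (r->chunks == NULL || r->chunks->used + sz > CHUNK_SIZE) {
        chunk *c = malloc(sizeof *c);      /* start a fresh chunk */
        if (c == NULL)
            return NULL;
        c->next = r->chunks;
        c->used = 0;
        r->chunks = c;
    }
    void *p = (unsigned char *)r->chunks->data + r->chunks->used;
    r->chunks->used += sz;                 /* pointer bump */
    return p;
}

/* One call frees everything; no per-object free exists. */
void regiondelete(region *r)
{
    while (r->chunks != NULL) {
        chunk *next = r->chunks->next;
        free(r->chunks);
        r->chunks = next;
    }
}
```

Both risks from the slide show up directly: any pointer kept past regiondelete dangles, and a single live object keeps the whole region's memory alive.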

Custom Allocators Are Faster

[Chart: Runtime - Custom Allocator Benchmarks. Normalized runtime (0 to 1.75) of Custom vs. Win32 on 175.vpr, 197.parser, boxed-sim, apache, c-breeze, 176.gcc, mudlle, and lcc, grouped into non-regions and regions.]

As good as and sometimes much faster than Win32

Not So Fast

[Chart: Runtime - Custom Allocator Benchmarks. Normalized runtime (0 to 1.75) of Custom, Win32, and DLmalloc on 175.vpr, 197.parser, boxed-sim, apache, c-breeze, 176.gcc, mudlle, and lcc, grouped into non-regions and regions.]

DLmalloc: as fast or faster for most benchmarks

The Lea Allocator (DLmalloc 2.7.0)

Mature public-domain general-purpose allocator

Optimized for common allocation patterns
    Per-size quicklists ≈ per-class allocation
    Deferred coalescing (combining adjacent free objects)
    Highly-optimized fastpath

Space-efficient

Space Consumption: Mixed Results

[Chart: Space - Custom Allocator Benchmarks. Normalized space (0 to 1.75) of Custom vs. DLmalloc on 175.vpr, 197.parser, boxed-sim, apache, c-breeze, 176.gcc, mudlle, and lcc, grouped into non-regions and regions.]

Custom Allocators?

Generally not worth the trouble: use good general-purpose allocator

Avoids risky software engineering errors


Problems with Unsafe Languages

C, C++: pervasive apps, but languages are memory unsafe

Numerous opportunities for security vulnerabilities, errors
    Double free
    Invalid free
    Uninitialized reads
    Dangling pointers
    Buffer overflows (stack & heap)

Soundness for Erroneous Programs


Normally: memory errors ⇒ undefined behavior

Consider infinite-heap allocator:

Every new returns fresh memory; delete is ignored
    No dangling pointers, invalid frees, double frees

Every object infinitely large
    No buffer overflows, data overwrites

Transparent to correct program

Erroneous programs sound

Probabilistic Memory Safety


Approximate the infinite heap with M-heaps (e.g., M=2)

DieHard: fully-randomized M-heap
    Increases odds of benign errors
    Probabilistic memory safety
        i.e., P(no error) ≥ n
    Errors independent across heaps
        E(users with no error) ≥ n * |users|

Efficient implementation?

Implementation Choices

Conventional, freelist-based heaps
    Hard to randomize, protect from errors
        Double frees, heap corruption

What about bitmaps? [Wilson90]
    Catastrophic fragmentation
        Each small object likely to occupy one page

[Diagram: four small objects, each on its own page]

Randomized Heap Layout


[Diagram: heap metadata with allocation bitmaps 00000001, 1010, 10 for object sizes 2^(i+3), 2^(i+4), 2^(i+5)]

Bitmap-based, segregated size classes
    Bit represents one object of given size
        i.e., one bit = 2^(i+3) bytes, etc.
    Prevents fragmentation

Randomized Allocation

[Diagram: heap metadata; after malloc(8) the first bitmap changes from 00000001 to 00010001]

malloc(8):
    compute size class = ceil(log2 sz) - 3
    randomly probe bitmap for zero-bit (free)

Fast: runtime O(1)
    M = 2 ⇒ E[# of probes] ≤ 2

Randomized Deallocation

[Diagram: heap metadata; after free(ptr) the first bitmap changes from 00010001 back to 00000001]

free(ptr):
    Ensure object valid: aligned to right address
    Ensure allocated: bit set
    Resets bit

Prevents invalid frees, double frees
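The allocation and deallocation steps can be sketched together in a toy heap. This is not DieHard's implementation: the four size classes, the 64 slots per class, one byte per "bit", and rand() as the random source are all assumptions, and the dh_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define NUM_CLASSES 4             /* object sizes 8, 16, 32, 64 bytes */
#define SLOTS 64                  /* objects per size class */
#define MAX_LIVE (SLOTS / 2)      /* M = 2: each class stays at most half full */

static _Alignas(max_align_t) unsigned char heap[NUM_CLASSES][SLOTS * 64];
static unsigned char in_use[NUM_CLASSES][SLOTS];  /* one flag per object */
static unsigned live[NUM_CLASSES];

/* ceil(log2 sz) - 3, clamped at 0; returns NUM_CLASSES if sz is too large. */
static int dh_size_class(size_t sz)
{
    int c = 0;
    size_t cap = 8;
    while (cap < sz && c < NUM_CLASSES) {
        cap <<= 1;
        c++;
    }
    return c;
}

void *dh_malloc(size_t sz)
{
    int c = dh_size_class(sz);
    if (c >= NUM_CLASSES || live[c] >= MAX_LIVE)
        return NULL;
    for (;;) {                                   /* at most half full, so each   */
        size_t i = (size_t)rand() % SLOTS;       /* probe hits a free slot with  */
        if (!in_use[c][i]) {                     /* probability at least 1/2     */
            in_use[c][i] = 1;
            live[c]++;
            return &heap[c][i << (c + 3)];
        }
    }
}

/* Returns 1 if the object was freed, 0 if the request was ignored. */
int dh_free(void *ptr)
{
    uintptr_t p = (uintptr_t)ptr;
    for (int c = 0; c < NUM_CLASSES; c++) {
        uintptr_t base = (uintptr_t)heap[c];
        size_t span = (size_t)SLOTS << (c + 3);
        if (p < base || p >= base + span)
            continue;
        size_t off = (size_t)(p - base);
        if (off & (((size_t)1 << (c + 3)) - 1))
            return 0;                            /* not an object start */
        size_t i = off >> (c + 3);
        if (!in_use[c][i])
            return 0;                            /* double free */
        in_use[c][i] = 0;
        live[c]--;
        return 1;
    }
    return 0;                                    /* not from this heap: invalid free */
}
```

Ignoring a bad free instead of acting on it is what turns double frees and invalid frees into benign events, and the half-full cap is what bounds the expected number of probes by 2.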

Randomized Heaps & Reliability


[Diagram: two randomized heap layouts with object sizes 2^(i+3) and 2^(i+4). My Mozilla: malignant overflow. Your Mozilla: benign overflow.]

Objects randomly spread across heap

Different run = different heap
    Errors across heaps independent

DieHard software architecture

[Diagram: input is broadcast to replica1, replica2, and replica3 (separate processes), seeded with seed1, seed2, and seed3; the replicas execute and a vote determines the output]

Replication-based fault-tolerance
    Requires randomization: errors independent

DieHard Results

Analytical results (pictures!)
    Buffer overflows
    Uninitialized reads
    Dangling pointer errors (the best)

Empirical results
    Runtime overhead
    Error avoidance
        Injected faults & actual applications

Analytical Results: Buffer Overflows

Model overflow: random write of live data
    Heap half full (max occupancy)

Replicas: Increase odds of avoiding overflow in at least one replica

[Diagram: replicas]

P(Overflow in all replicas) = (1/2)^3 = 1/8
P(No overflow in ≥ 1 replica) = 1 - (1/2)^3 = 7/8

Analytical Results: Buffer Overflows

F = free space
H = heap size
N = # objects worth of overflow
k = replicas

Overflow one object
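The formula on this slide did not survive extraction. One form consistent with the definitions above is sketched below, under the assumption that each overflowed object lands on free space independently with probability F/H in each replica; treat it as a reconstruction rather than a quotation of the slide.

```latex
P(\text{overflow masked})
  \;=\; 1 - \left[\, 1 - \left(\frac{F}{H}\right)^{N} \right]^{k}
```

As a check against the three-replica numbers, a half-full heap (F/H = 1/2) with a one-object overflow (N = 1) and k = 3 gives 1 - (1/2)^3 = 7/8.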

Empirical Results: Runtime


Empirical Results: Error Avoidance

Injected faults:
    Dangling pointers (@50%, 10 allocations)
        glibc: crashes; DieHard: 9/10 correct
    Overflows (@1%, 4 bytes over)
        glibc: crashes 9/10, inf loop; DieHard: 10/10 correct

Real faults:
    Avoids Squid web cache overflow
        Crashes BDW & glibc
    Avoids dangling pointer error in Mozilla
        DoS in glibc & Windows

The End
