
Exploiting the jemalloc Memory Allocator: Owning Firefox's Heap

Patroklos Argyroudis <argp@census-labs.com>


Chariton Karamitas <huku@census-labs.com>
Census, Inc.
http://census-labs.com/
jemalloc: You are probably already using it
jemalloc is a userland memory allocator that is being increasingly adopted by software projects as a
high performance heap manager. It is used in Mozilla Firefox for the Windows, Mac OS X and Linux
platforms, and as the default system allocator on the FreeBSD and NetBSD operating systems.
Facebook also uses jemalloc in various components to handle the load of its web services. However,
despite such widespread use, there is no work on the exploitation of jemalloc.
Our research addresses this. We begin by examining the architecture of the jemalloc heap manager
and its internal concepts, while focusing on identifying possible attack vectors. jemalloc does not
utilize concepts such as 'unlinking' or 'frontlinking' that have been used extensively in the past to
undermine the security of other allocators. Therefore, we develop novel exploitation approaches and
primitives that can be used to attack jemalloc heap corruption vulnerabilities. As a case study, we
investigate Mozilla Firefox and demonstrate the impact of our developed exploitation primitives on the
browser's heap. In order to aid the researchers willing to continue our work, we have developed a
jemalloc debugging tool (named unmask_jemalloc) for GDB using its support for Python scripting.
jemalloc Technical Overview
jemalloc recognizes that minimal page utilization is no longer the most critical feature. Instead it
focuses on enhanced performance in retrieving data from the RAM. Based on the principle of locality
which states that items that are allocated together are also used together, jemalloc tries to situate
allocations contiguously in memory. Another fundamental design choice of jemalloc is its support for
SMP systems and multi-threaded applications by trying to avoid lock contention problems between
many simultaneously running threads. This is achieved by using many 'arenas' and the first time a
thread calls into the memory allocator (for example by calling malloc(3)) it is associated with a
specific arena. The assignment of threads to arenas happens with three possible algorithms:
1. with a simple hashing on the thread's ID if TLS is available
2. with a simple builtin linear congruential pseudo random number generator in case
MALLOC_BALANCE is defined and TLS is not available
3. or with the traditional round-robin algorithm.
Black Hat USA 2012
For the latter two cases, the association between a thread and an arena doesn't stay the same for
the whole life of the thread.
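The three assignment strategies listed above can be illustrated with a small toy model. This is only a sketch under our own assumptions and not jemalloc's code: the function names (pickArenaByHash, makeLcgPicker, makeRoundRobinPicker), the modulo "hash" and the LCG constants are hypothetical and chosen for readability.

```javascript
// Toy model of thread-to-arena assignment. All names and constants are
// illustrative assumptions; none of them are taken from the jemalloc sources.

// 1. Hash of the thread ID: the same thread always maps to the same arena.
function pickArenaByHash(threadId, narenas) {
    return (threadId >>> 0) % narenas;
}

// 2. Linear congruential PRNG: state = (a * state + c) mod 2^32.
function makeLcgPicker(seed, narenas) {
    let state = seed >>> 0;
    return function () {
        state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
        // The high bits of an LCG are better distributed than the low bits.
        return (state >>> 16) % narenas;
    };
}

// 3. Round-robin: arenas are handed out in rotation.
function makeRoundRobinPicker(narenas) {
    let next = 0;
    return function () {
        const arena = next;
        next = (next + 1) % narenas;
        return arena;
    };
}
```

In this model only the hash-based picker is a pure function of the thread ID; the other two return a different arena on each call, which mirrors the observation that the thread-to-arena association is not stable in those cases.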
Continuing our high-level overview of the main jemalloc structures, we have the concept of 'chunks'.
jemalloc divides memory into chunks, always of the same size, and uses these chunks to store all of
its other data structures (and user-requested memory as well). Chunks are further divided into 'runs'
that are responsible for requests/allocations up to certain sizes. A run keeps track of free and used
'regions' of these sizes. Regions are the heap items returned on user allocations (e.g. malloc(3)
calls). Finally, each run is associated with a 'bin'. Bins are responsible for storing structures (trees) of
free regions. Figure 1 illustrates in an abstract manner the relationships between the basic building
blocks of jemalloc.
Figure 1: jemalloc basic design
Chunks
In the context of jemalloc, chunks are big virtual memory areas that available memory is conceptually
divided into. As we have mentioned, chunks are always of the same size. However, each different
jemalloc version has a specific chunk size. For example, the jemalloc version used in Mozilla Firefox
has a chunk size of 1 MB, while that used in the FreeBSD libc has a chunk size of 2 MB. Chunks are
described by 'arena_chunk_t' structures, illustrated in Figure 2.
Arenas
An arena is a structure that manages the memory areas jemalloc divides into chunks and the
underlying pages. Arenas can span more than one chunk, and depending on the size of the chunks,
more than one page as well. As we have already mentioned, arenas are used to mitigate lock
contention problems between threads. Therefore, allocations and deallocations from a thread always
happen on the same arena. Theoretically, the number of arenas is in direct relation to the need for
concurrency in memory allocation. In practice the number of arenas depends on the jemalloc variant
we deal with. For example, in Firefox's jemalloc there is only one arena. In the case of single-CPU
systems there is also only one arena. In SMP systems the number of arenas is equal to either two (in
FreeBSD 8.2) or four (in the standalone variant) times the number of available CPU cores. Of course,
there is always at least one arena. Arenas are described by the structure illustrated in Figure 3.
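As a rough sketch of the rule described above, the following function computes the number of arenas for the three variants mentioned. The variant strings and the function name arenaCount are hypothetical labels of our own, and the multipliers are simply the ones quoted in the text; the real allocator derives this value internally.

```javascript
// Number of arenas per jemalloc variant, following the description in the text.
// The variant labels ("firefox", "freebsd-8.2", "standalone") are hypothetical.
function arenaCount(variant, ncpus) {
    // Firefox's jemalloc and single-CPU systems use exactly one arena.
    if (variant === "firefox" || ncpus <= 1) {
        return 1;
    }
    if (variant === "freebsd-8.2") {
        return 2 * ncpus;
    }
    if (variant === "standalone") {
        return 4 * ncpus;
    }
    throw new Error("unknown variant: " + variant);
}
```

For example, under these assumptions a four-core machine would get 8 arenas with the FreeBSD 8.2 variant and 16 with the standalone variant.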
Figure 2: Chunks (arena_chunk_t)
Runs
Runs are further memory denominations of the memory divided by jemalloc into chunks. Runs exist
only for small and large allocations (size classes are explained in the next paragraph), but not for
huge allocations. In essence, a chunk is broken into several runs. Each run is actually a set of one or
more contiguous pages (but a run cannot be smaller than one page). Therefore, they are aligned to
multiples of the page size. The runs themselves may be non-contiguous but they are as close as
possible due to the tree search heuristics implemented by jemalloc.
The main responsibility of a run is to keep track of the state (i.e. free or used) of end user memory
allocations, or regions as these are called in jemalloc terminology. Each run holds regions of a
specific size (however within the small and large size classes as we have mentioned) and their state
is tracked with a bitmask. This bitmask is part of a run's metadata; these metadata are portrayed in
Figure 4.
Figure 3: Arenas (arena_t)
Figure 4: Runs (arena_run_t)
Regions
In jemalloc the term 'regions' applies to the end user memory areas returned by malloc(3). As we
have briefly mentioned earlier, regions are divided into three classes according to their size, namely:
1. small/medium,
2. large and
3. huge.
Huge regions are considered those that are bigger than the chunk size minus the size of some
jemalloc headers. For example, in the case that the chunk size is 4 MB (4096 KB) then a huge
region is an allocation greater than 4078 KB. Small/medium are the regions that are smaller than a
page. Large are the regions that are smaller than the huge regions (chunk size minus some headers)
and also larger than the small/medium regions (page size).
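The size class boundaries described above can be worked through with a short sketch. The 4 MB chunk size and the 4078 KB huge threshold come from the example in the text; the 4 KB page size, the treatment of an allocation of exactly one page as 'large', and the name sizeClass are assumptions made for illustration.

```javascript
// Size class boundaries from the example in the text.
const PAGE_SIZE = 4096;              // assumption: 4 KB pages
const CHUNK_SIZE = 4096 * 1024;      // 4 MB (4096 KB) chunk, as in the example
const HUGE_THRESHOLD = 4078 * 1024;  // 4078 KB, as in the example

// Space taken by the jemalloc headers implied by the two figures above.
const HEADER_BYTES = CHUNK_SIZE - HUGE_THRESHOLD;

// Classify a request size in bytes. Treating exactly one page as "large"
// is our own assumption; the text leaves this boundary open.
function sizeClass(size) {
    if (size < PAGE_SIZE) {
        return "small/medium";
    }
    if (size <= HUGE_THRESHOLD) {
        return "large";
    }
    return "huge";
}
```

With these numbers the headers account for 18 KB of each chunk, a 320-byte request is small/medium, an 8 KB request is large, and anything above 4078 KB is huge.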
Huge regions have their own metadata and are managed separately from small/medium and large
regions. Specifically, they are managed by a global to the allocator red-black tree and they have their
own dedicated and contiguous chunks. Large regions have their own runs, that is each large
allocation has a dedicated run. Their metadata are situated on the corresponding arena chunk
header. Small/medium regions are placed on different runs according to their specific size. As we
have explained, each run has its own header in which there is a bitmask array specifying the free and
the used regions in the run.
Bins
Bins are used by jemalloc to store free regions. Bins organize the free regions via runs and also keep
metadata about their regions, like for example the size class, the total number of regions, etc. A
specific bin may be associated with several runs, however a specific run can only be associated with
a specific bin, i.e. there is a one-to-many correspondence between bins and runs. Bins have their
associated runs organized in a tree. Each bin has an associated size class and stores/manages
regions of this size class. A bin's regions are managed and accessed through the bin's runs. Each bin
has a member element representing the most recently used run of the bin, called 'current run' with
the variable name runcur. A bin also has a tree of runs with available/free regions. This tree is used
when the current run of the bin is full, that is it doesn't have any free regions. The bin structure is
portrayed in Figure 5.
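The interplay between a bin, its current run and its other runs can be sketched with a toy model. Apart from runcur, every name below (Run, Bin, allocRegion and so on) is hypothetical; a plain array stands in for the run's bitmask and for the bin's tree of runs, so this illustrates the lookup order only and is not jemalloc's implementation.

```javascript
// Toy model of a bin and its runs. Only the field name runcur comes from the
// text; everything else is a hypothetical simplification.
class Run {
    constructor(id, nregs) {
        this.id = id;
        // One boolean per region; stands in for the bitmask in the run header.
        this.used = new Array(nregs).fill(false);
    }
    // Mark the first free region as used and return its index, or -1 if full.
    allocRegion() {
        const idx = this.used.indexOf(false);
        if (idx !== -1) {
            this.used[idx] = true;
        }
        return idx;
    }
    freeRegion(idx) {
        this.used[idx] = false;
    }
    isFull() {
        return !this.used.includes(false);
    }
}

class Bin {
    constructor(regSize, nregs) {
        this.regSize = regSize; // size class served by this bin
        this.nregs = nregs;     // regions per run
        this.runcur = null;     // most recently used run
        this.runs = [];         // all runs of this bin (jemalloc keeps the non-full ones in a tree)
        this.nextRunId = 0;
    }
    malloc() {
        if (this.runcur === null || this.runcur.isFull()) {
            // The current run is full: reuse a run with free regions,
            // otherwise create a fresh run.
            let candidate = this.runs.find((r) => !r.isFull());
            if (!candidate) {
                candidate = new Run(this.nextRunId++, this.nregs);
                this.runs.push(candidate);
            }
            this.runcur = candidate;
        }
        return { run: this.runcur, index: this.runcur.allocRegion() };
    }
    free(region) {
        region.run.freeRegion(region.index);
    }
}
```

In this model a freed region in an older run is only handed out again once the current run has filled up, which is the behavior that the heap manipulation tactics discussed later rely on.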
Figure 6 summarizes our technical overview of the jemalloc architecture.
Figure 5: Bins (arena_bin_t)
Figure 6: Architecture of jemalloc
Exploitation Primitives
Before we start our analysis we would like to point out that jemalloc (as well as other malloc
implementations) does not implement concepts like 'unlinking' or 'frontlinking' which have proven to be
catalytic for the exploitation of dlmalloc and Microsoft Windows allocators. That said, we would like to
stress the fact that the attacks we are going to present do not directly achieve a write-4-anywhere
primitive. We, instead, focus on how to force malloc() (and possibly realloc()) to return a chunk that
will most likely point to an already initialized memory region, in hope that the region in question may
hold objects important for the functionality of the target application (C++ VPTRs, function pointers,
buffer sizes and so on). Considering the various anti-exploitation countermeasures present in modern
operating systems (ASLR, DEP and so on), we believe that such an outcome is far more useful for an
attacker than a 4 byte overwrite.
It is our goal to cover all possible cases of data or metadata corruption, specifically:
1. Adjacent region overwrites
2. Run header corruptions
3. Chunk header corruptions
4. Magazine (a.k.a. thread cache) corruptions are not covered in this whitepaper since Mozilla
Firefox does not use thread caching. For more information on this subject please see [PHRC]
and [PHRK].
Adjacent Region Overwrites
The main idea behind adjacent heap item corruptions is that you exploit the fact that the heap
manager places user allocations next to each other contiguously without other data in between. In
jemalloc regions of the same size class are placed on the same bin. In the case that they are also
placed on the same run of the bin then there are no inline metadata between them. Therefore, we
can place a victim object/structure of our choosing in the same run and next to the vulnerable
object/structure we plan to overflow. The only requirement is that the victim and vulnerable objects
need to be of a size that puts them in the same size class and therefore possibly in the same run.
Since there are no metadata between the two regions, we can overflow from the vulnerable region to
the victim region we have chosen. Usually the victim region is something that can help us achieve
arbitrary code execution, for example function pointers.
In order to be able to arrange the jemalloc heap in a predictable state we need to understand the
allocator's behavior and use heap manipulation tactics to influence it to our advantage. In the context
of browsers, heap manipulation tactics are usually referred to as 'Heap Feng Shui' after Alexander
Sotirov's work [FENG]. By 'predictable state' we mean that the heap must be arranged as reliably as
possible in a way that we can position data where we want. This enables us to use the tactic of
corrupting adjacent regions, but also to exploit use-after-free bugs. In use-after-free bugs a memory
region is allocated, used, freed and then used again due to a bug. In such a case if we know the
region's size we can manipulate the heap to place data of our own choosing in the freed region's
memory slot on its run before it is used again. Upon its subsequent incorrect use the region now has
our data that can help us hijack the flow of execution.
To explore jemalloc's behavior and manipulate it into a predictable state we use an algorithm similar
to the one presented in [HOEJ]. Since in the general case we cannot know beforehand the state of
the runs of the class size we are interested in, we perform many allocations of this size hoping to
cover the holes (i.e. free regions) in the existing runs and get a fresh run. Hopefully the next series of
allocations we will perform will be on this fresh run and therefore will be sequential. As we have seen,
sequential allocations on a largely empty run are also contiguous. Next, we perform such a series of
allocations controlled by us. In the case we are trying to use the adjacent regions corruption tactic,
these allocations are of the victim object/structure we have chosen to help us gain code execution
when corrupted. The following step is to deallocate every second region in this last series of
controlled victim allocations. This will create holes in between the victim objects/structures on the
run of the size class we are trying to manipulate. Finally, we trigger the heap overflow bug forcing,
due to the state we have arranged, jemalloc to place the vulnerable objects in holes on the target run
overflowing into the victim objects. We use and elaborate on this approach in the following
paragraphs while discussing a case study on the Mozilla Firefox browser.
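The grooming steps described above can be simulated on a toy model of a single size class. This is a sketch under simplifying assumptions, not jemalloc's allocation logic: each run is an array of slots, a new allocation takes the lowest free slot of the first run that has one, and the names makeHeap and groomAndPlace are hypothetical.

```javascript
// Toy heap for one size class. Each run is an array of slots; null means free.
// Assumption: allocations take the lowest free slot of the first non-full run.
function makeHeap(slotsPerRun) {
    const runs = [];
    return {
        runs,
        alloc(tag) {
            let run = runs.find((r) => r.includes(null));
            if (!run) {
                run = new Array(slotsPerRun).fill(null);
                runs.push(run);
            }
            const slot = run.indexOf(null);
            run[slot] = tag;
            return { run: runs.indexOf(run), slot };
        },
        free(ref) {
            runs[ref.run][ref.slot] = null;
        }
    };
}

// The four steps from the text; returns the slot given to the vulnerable object.
function groomAndPlace(heap, fillers, victims) {
    // Step 1: many allocations to cover the holes in the existing runs.
    for (let i = 0; i < fillers; i++) {
        heap.alloc("filler");
    }
    // Step 2: a controlled series of victim allocations, now sequential.
    const refs = [];
    for (let i = 0; i < victims; i++) {
        refs.push(heap.alloc("victim"));
    }
    // Step 3: free every second victim to create holes between victims.
    for (let i = 0; i < refs.length; i += 2) {
        heap.free(refs[i]);
    }
    // Step 4: the vulnerable object lands in one of the holes.
    return heap.alloc("vulnerable");
}
```

Starting from a fragmented run, for example eight allocations of which three are freed, followed by groomAndPlace(heap, 7, 4), the vulnerable object ends up in a slot whose next neighbor on the same run is a victim, which is the layout the adjacent region overwrite needs.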
Run Header Corruptions
In a heap overflow situation it is pretty common for the attacker to be able to overflow a memory
region which is not followed by other regions (like the wilderness chunk in dlmalloc, but in jemalloc
such regions are not that special). In such a situation, the attacker will most likely be able to
overwrite the run header of the next run. Since runs hold memory regions of equal size, the next
page aligned address will either be a normal page of the current run, or will contain the metadata
(header) of the next run which will hold regions of different size (larger or smaller, it doesn't really
matter). In the first case, overwriting adjacent regions of the same run is possible and thus an
attacker can use the techniques that were previously discussed. The latter case is the subject of the
following paragraphs.
People already familiar with heap exploitation may recall that it is pretty common for an attacker to
control the last heap item (region in our case) allocated, that is the most recently allocated region is
the one being overflown. This allows an attacker to corrupt a run's header. When a run's metadata
are overwritten, the 'bin' pointer can be made to point to a fake bin structure. This is not a good idea
because of two reasons. First, the attacker needs further control of the target process in order to
successfully construct a fake bin header somewhere in memory. Secondly, and most importantly, the
'bin' pointer of a region's run header is dereferenced only during deallocation. A careful study of the
jemalloc source code reveals that only 'run->bin->reg0_offset' is actually used (somewhere in
'arena_run_reg_dalloc()'), thus, from an attacker's point of view, the bin pointer is not that interesting
('reg0_offset' overwrite may cause further problems as well leading to crashes and a forced interrupt
of our exploit).
Our attack consists of the following steps. The attacker overflows the last item of a run (for example
run #1) and overwrites the next run's (e.g. run #2) header. Then, upon the next malloc() of a size
equal to the size serviced by run #2, the user will get as a result a pointer to a memory region of the
previous run (run #1 in our example). It is important to understand that in order for the attack to
work, the overflown run should serve regions that belong to any of the available bins.
Chunk Header Corruptions
We will now focus on what the attacker can do once she is able to corrupt the chunk header of an
arena. Although the probability of directly affecting a nearby arena is low, a memory leak or the
indirect control of the heap layout by continuous bin-sized allocations can render the technique
described in this section a useful tool in the attacker's hand. The scenario we will be analyzing is the
following: The attacker forces the target application to allocate a new arena by controlling its heap
allocations. She then triggers the overflow in the last region of the previous arena (the region that
physically borders the new arena) thus corrupting the chunk header metadata. When the application
calls 'free()' on any region of the newly allocated arena, the jemalloc housekeeping information is
altered. On the next call to 'malloc()', the allocator will return a region that points to already allocated
space of (preferably) the previous arena.
Case Study: Mozilla Firefox
Our jemalloc debugging tool, unmask_jemalloc, is implemented using the Python scripting support of
the GNU Debugger (gdb). While unmask_jemalloc supports as-is Linux 32-bit and 64-bit Mozilla
Firefox targets, there is a problem when it comes to the Mac OS X operating system. Apple's gdb is
based on the 6.x gdb tree, which means that it does not have support for Python scripting. New gdb
development snapshots support Mach-O binaries, but cannot load Apple's fat binaries. In order to
solve this problem we use Apple's lipo utility and a script we have developed called lipodebugwalk.py.
This script recursively uses Apple's lipo on the binaries of Firefox.app. Moreover, lipodebugwalk.py
also has support for Mozilla Firefox's debug symbol binary files. Figure 7 includes the output of using
fetch-symbols.py (provided by Mozilla) to get debug symbols for Firefox, and the use of
lipodebugwalk.py to allow a custom-compiled version of gdb to load Firefox.
The above procedure allows us to utilize unmask_jemalloc to explore how jemalloc manages Firefox's
heap and aid us in the process of exploit development. Figure 8 portrays the help message of
unmask_jemalloc and shows the functionality we have implemented in the tool:
Figure 7: Example use of fetch-symbols.py and lipodebugwalk.py
Figure 8: Functionality of the unmask_jemalloc utility
Using unmask_jemalloc we can investigate how we can manipulate Firefox's jemalloc-managed heap
from Javascript. The following script uses unescaped strings and arrays to demonstrate controlled
allocations and deallocations. Since Firefox implements mitigations against traditional heap spraying,
the script uses random padding to the allocated blocks [CORL].
<html>
<head>
<script>
function jemalloc_spray(blocks, size)
{
    var block_size = size / 2;

    // rop/bootstrap/whatever
    var marker = unescape("%ubeef%udead");
    marker += marker;

    // shellcode/payload
    var content = unescape("%u6666%u6666");

    while(content.length < (block_size / 2))
    {
        content += content;
    }

    var arr = [];

    for(i = 0; i < blocks; i++)
    {
        // construct the random block padding
        var rnd1 = Math.floor(Math.random() * 1000) % 16;
        var rnd2 = Math.floor(Math.random() * 1000) % 16;
        var rnd3 = Math.floor(Math.random() * 1000) % 16;
        var rnd4 = Math.floor(Math.random() * 1000) % 16;

        var rndstr = "%u" + rnd1.toString() + rnd2.toString();
        rndstr += "%u" + rnd3.toString() + rnd4.toString();

        var padding = unescape(rndstr);

        while(padding.length < block_size - marker.length - content.length)
        {
            padding += padding;
        }

        // construct the block
        var block = marker + content + padding;

        // if required repeat the block
        while(block.length < block_size)
        {
            block += block;
        }

        // spray block
        arr[i] = block.substr(0);
    }

    Math.asin(1);

    marker = unescape("%ubabe%ucafe");
    marker += marker;
    content = unescape("%u7777%u7777");

    while(content.length < (block_size / 2))
    {
        content += content;
    }

    for(i = 0; i < blocks; i += 2)
    {
        delete(arr[i]);
        arr.splice(i, 1);
        arr[i] = null;
    }

    var ret = trigger_gc();

    alert("After garbage collection: " + ret.length);

    for(i = 0; i < blocks; i += 2)
    {
        var rnd1 = Math.floor(Math.random() * 1000) % 16;
        var rnd2 = Math.floor(Math.random() * 1000) % 16;
        var rnd3 = Math.floor(Math.random() * 1000) % 16;
        var rnd4 = Math.floor(Math.random() * 1000) % 16;

        var rndstr = "%u" + rnd1.toString() + rnd2.toString();
        rndstr += "%u" + rnd3.toString() + rnd4.toString();

        var padding = unescape(rndstr);

        while(padding.length < block_size - marker.length - content.length)
        {
            padding += padding;
        }

        var block = marker + content + padding;

        while(block.length < block_size)
        {
            block += block;
        }

        arr[i] = block.substr(0);
    }

    Math.atan2(6, 6);
    return arr;
}

function trigger_gc()
{
    // force garbage collection
    var gc = [];

    for(i = 0; i < 100000; i++)
    {
        gc[i] = new Array();
    }

    return gc;
}

function run_spray()
{
    // 1024 spray blocks of size 320 (target run: 512)
    var foo = jemalloc_spray(1024, 320);
    alert(foo.length);
}
</script>
</head>
<body onload="run_spray();">
0x1337
</body>
</html>
Conclusion
In this whitepaper we have analyzed the jemalloc memory allocator from an exploitation perspective.
We have developed exploitation primitives that can be used to attack any application that utilizes
jemalloc. Moreover, we have applied these primitives to the most widely used jemalloc application,
namely the Mozilla Firefox browser. Our unmask_jemalloc debugging utility can be used during exploit
development to explore the internals of jemalloc and help the researchers willing to continue our
work.
About the Authors
Patroklos Argyroudis is a computer security researcher at Census Inc, a company that builds on
strong research foundations to offer specialized IT security services to customers worldwide.
Patroklos holds a PhD in Computer Security from the University of Dublin, Trinity College, where he
has also worked as a postdoctoral researcher on applied cryptography. His current focus is on
vulnerability research, exploit development, reverse engineering, source code auditing and malware
analysis. Patroklos has presented research at several international security conferences on topics
such as kernel exploitation, kernel mitigation technologies, and electronic payments.
Chariton Karamitas is an undergraduate student at the Electrical Engineering and Computer
Engineering Department of the Aristotle University of Thessaloniki (Greece), works as a part time
systems administrator at the same department, and is an intern at Census Inc. His research
interests include static analysis, compilers, reverse engineering and source code auditing. He also
enjoys spending his free time studying discrete mathematics, theory of computation, complex
analysis and of course, coding 0day exploits! Chariton has previously presented research on
automated blackbox fuzzing and glibc heap exploitation.
About Census, Inc.
Census, Inc. (www.census-labs.com) is an independent, privately funded company based in Greece
dedicated to providing highly specialized and professional IT security services. Census was founded in
November 2008 by computer security experts with distinguished credentials and extensive prior
experience. We are motivated by passion for IT security research and focused determination to help
our clients achieve the highest returns from their IT security investment. Our company's independent
status allows us to dynamically approach the needs of our clients without compromising our initial
vision.
The services provided by Census are different from the traditional approach to IT security. We
recognize that information security threats are constantly evolving. Our specialization and experience
in the field enables us to go beyond the publicly known attack vectors, thus giving our clients the
opportunity to be protected from possible future threats to their infrastructure and products.
Our services aim to:
- provide an in-depth examination of our client's IT security problems and assist in their resolution
- protect our clients' business continuity
- ensure that our clients achieve the best possible returns from their IT security investment
- keep our clients informed on current threats and the countermeasures needed to address these
- enable IT security vendors to provide enhanced services to their clients
Census offers the following IT security services:
- Security Testing
- Source Code Auditing
- Digital Forensics
- Vulnerability Research
- Malware Analysis
- Development of Custom Security Solutions
Census builds on strong research foundations to offer high quality services to customers worldwide.
Our research-driven IT security services enable our clients to be protected against previously
unknown (0-day) attacks and threats.
References
[PHRC] argp, huku, Pseudomonarchia jemallocum,
http://phrack.org/issues.html?issue=68&id=10
[PHRK] huku, argp, The Art of Exploitation: A case study on jemalloc heap overflows,
http://phrack.org/issues.html?issue=68&id=13
[HOEJ] Mark Daniel, Jake Honoroff, Charlie Miller, Engineering Heap Overflow Exploits with Javascript,
http://securityevaluators.com/files/papers/isewoot08.pdf
[FENG] Alexander Sotirov, Heap Feng Shui in Javascript,
http://www.phreedom.org/research/heap-feng-shui/heap-feng-shui.html
[CORL] corelanc0d3r, Heap spraying demystified,
http://www.corelan.be/index.php/2011/12/31/exploit-writing-tutorial-part-11-heap-spraying-demystified/