
SECURE SYSTEMS

Editors: Patrick McDaniel, mcdaniel@cse.psu.edu | Sean W. Smith, sws@cs.dartmouth.edu

A Patch for Postel’s Robustness Principle

Len Sassaman | Katholieke Universiteit Leuven
Meredith L. Patterson | Red Lambda
Sergey Bratus | Dartmouth College

Jon Postel’s Robustness Principle (“Be conservative in what you do, and liberal in what you accept from others”) played a fundamental role in how Internet protocols were designed and implemented. Its influence went far beyond direct application by Internet Engineering Task Force (IETF) designers, as generations of programmers learned from examples of the protocols and server implementations it had shaped. However, we argue that its misinterpretations were also responsible for the proliferation of Internet insecurity. In particular, several mistakes in interpreting Postel’s principle lead to the opposite of robustness: unmanageable insecurity. These misinterpretations, although frequent, are subtle, and recognizing them requires closely examining fundamental concepts of computation and exploitation (or equivalent intuitions). By discussing them, we intend neither an attack on the principle nor its deconstruction, any more than a patch on a useful program intends to slight the program. Our intention is to present a view of protocol design that helps avoid these mistakes and to “patch” the principle’s common formulation to remove the potential weakness that these mistakes represent.

Robustness and Internet Freedom

Postel’s principle acquired deep philosophical and political significance, discussed, for instance, in Dan Geer’s groundbreaking essay “Vulnerable Compliance” [1]. It created a world of programming thought, intuition, and attitude that made the Internet what it is: a ubiquitous, generally interoperable system that enables the use of communication technology to further political freedoms.

Yet this world of revolutionary forms of communication faces an insecurity crisis that erodes users’ trust in its software and platforms. If users continue to see Internet communication platforms as weak and vulnerable to push-button attack tools that are easily acquired by a repressive authority, they will eventually become unwilling to use these platforms for important tasks.

The world of free, private Internet communication must go on, and we must reexamine our design and engineering principles to protect it. Geer makes a convincing practical case for reexamining Postel’s principle from the defender’s position; Len Sassaman and Meredith L. Patterson arrived at a similar conclusion from a combination of formal-language theory and exploitation experience [2].

Robustness versus Malevolence

Postel’s principle wasn’t meant to be oblivious of security. For example, consider the context in which it appears in the IETF’s Request for Comments (RFC) 1122, Section 1.2.2, “Robustness Principle” [3]:

    At every layer of the protocols, there is a general rule whose application can lead to enormous benefits in robustness and interoperability [IP:1]:

    “Be liberal in what you accept, and conservative in what you send.”

    Software should be written to deal with every conceivable error, no matter how unlikely; sooner or later a packet will come in with that particular combination of errors and attributes, and unless the software is prepared, chaos can ensue. In general, it is best to assume that the network is filled with malevolent entities that will send in packets designed to have the worst possible effect. This assumption will lead to suitable protective

1540-7993/12/$31.00 © 2012 IEEE Copublished by the IEEE Computer and Reliability Societies March/April 2012 87

    design, although the most serious problems in the Internet have been caused by unenvisaged mechanisms triggered by low-probability events; mere human malice would never have taken so devious a course!

This formulation of the principle shows awareness of security problems caused by lax input handling misunderstood as “liberal acceptance.” So, reading Postel’s principle as encouraging implementers to generally trust network inputs would be wrong.

Note also the RFC’s statement that the principle should apply at every network layer. Unfortunately, this crucial design insight is almost universally ignored. Instead, implementations of layered designs are dominated by implicit assumptions that layer boundaries serve as “filters” that pass only well-formed data conforming to expected abstractions. Such expectations can be so pervasive that cross-layer vulnerabilities might persist unnoticed for decades. These layers of abstraction become boundaries of competence [4].

Robustness and the Language Recognition Problem

Insecurity owing to input data handling appears ubiquitous and is commonly associated with message format complexity. Of course, complexity shouldn’t be decried lightly; progress in programming has produced ever-more-complex machine behaviors and thus more complex data structures. But when do these structures become too complex, and how does message complexity interact with Postel’s principle?

The formal language-theoretic approach we outline here lets us quantify the interplay of complexity with Postel’s principle and draw a bright line beyond which message complexity should be discouraged by a strict reading of the principle. We then offer a “patch” that makes this discouragement more explicit.

The Language-Theoretic Approach

At every layer of an Internet protocol stack, implementations face a recognition problem: they must recognize and accept valid or expected inputs and reject malicious ones in a manner that doesn’t expose their recognition or processing logic to exploitation. We speak of valid or expected inputs to stress that, in the name of robustness, some inputs can be accepted rather than rejected without being valid or defined for a given implementation. However, they must be safe; that is, they must not lead the current layer or higher layers to perform a malicious computation or exploitation.

In previous research, we showed that, starting at certain message complexity levels, recognizing the formal language (which is made up of the totality of valid or expected protocol messages or formats) becomes undecidable [5,6]. Such protocols can’t tell valid or expected inputs from exploitative ones, and exploitation by crafted input is only a matter of exploit programming techniques [7]. No 80/20 engineering solution for such problems exists, any more than you can solve the Halting Problem by throwing in enough programming or testing effort.

For complex message languages and formats that correspond to context-sensitive languages, full recognition, although decidable, requires implementing powerful automata, equivalent to a Turing machine with a finite tape. When input languages require this much computational power, handling them safely is difficult, because various input data elements’ validity can be established only by checking bits of context that might not be in the checking code’s scope. Security-minded programmers understand that each function (or basic block) that works with input data must first check that the data is as expected; however, the context required to fully check the current data element is too rich to pass around. Programmers are intimately familiar with this frustration: even though they know they must validate the data, they can’t do so fully, wherever in the code they look. When operating with some data derived from the inputs, programmers are left to wonder how far back they should go to determine if using the data as is would lead to a memory corruption, overflow, or hijacked computation. The context necessary to make this determination is often scattered or too far down the stack. Similarly, during code review, code auditors often have difficulty ascertaining whether the data has been fully validated and is safe to use at a given code location.

Indeed, second-guessing developers’ data safety assumptions that are unlikely to be matched by actual ad hoc recognizer code (also called input validation or sanity checking code) has been a fruitful exploitation approach. This is because developers rarely implement full recognition of input messages but rather end up with an equivalent of an underpowered automaton, which fails to enforce their expectations. A familiar but important example of this failure is trying to match recursively nested structures with regular expressions.

“Liberal” parsing would seem to discourage a formal languages approach, which prescribes generating parsers from formal grammars and thus provides little leeway for liberalism. However, we argue that the entirety of Postel’s principle actually favors this approach. Although the principle doesn’t explicitly mention input rejection, and would seem to discourage it, proper, powerful rejection is crucial to safe

88 IEEE Security & Privacy March/April 2012


recognition. Our patch suggests a language in which the balance between acceptance and rejection can be productively discussed.

Computational Power versus Robustness

It’s easy to assume that Postel’s principle compels acceptance of arbitrarily complex protocols requiring significant computational power to parse. This is a mistake. In fact, such protocols should be deemed incompatible with the RFC 1122 formulation [3].

The devil here is in the details. Writing a protocol handler that can deal with “every conceivable error” [3] can be an insurmountable task for complex protocols, inviting further implementation error, or it might be impossible.

This becomes clear once we consider protocol messages as an input language to be recognized, and the protocol handler as the recognizer automaton. Whereas for regular and context-free languages as well as some classes of context-sensitive languages, recognition is decidable and can be performed by sub-Turing automata, for more powerful classes of formal languages, it’s generally undecidable.

In the face of undecidability, dealing with every conceivable error is impossible. For context-sensitive protocols requiring full Turing-machine power for recognition, it might be theoretically possible but utterly thankless. These complex protocols, hungry for computational power, should be deemed incompatible with Postel’s Robustness Principle.

Robust recognition, and therefore robust error handling, is possible only when the input messages are understood and treated as a formal language, with the recognizer preferably derived from its explicit grammar (or at least checked against one). Conversely, no other form of implementing acceptance will provide a way to enumerate and contain the space of errors and error states into which crafted inputs can drive an ad hoc recognizer. Indeed, had this problem been amenable to an algorithmic solution, we would have solved the Halting Problem.

Clarity versus Ambiguity in the Presence of Errors

It’s also easy to assume that, no matter the protocol’s syntax, Postel’s principle compels acceptance of ambiguous messages and silent “fixing” of errors. This is also a mistake. Prior formulations, such as IETF RFC 761, clarify the boundary between being accepting and rejecting ambiguity [8]:

    The implementation of a protocol must be robust. Each implementation must expect to interoperate with others created by different individuals. While the goal of this specification is to be explicit about the protocol there is the possibility of differing interpretations. In general, an implementation must be conservative in its sending behavior, and liberal in its receiving behavior. That is, it must accept any datagram that it can interpret (e.g., not object to technical errors where the meaning is still clear).

A strict reading of the last sentence would forbid ambiguity (nonclarity) of “meaning.” However, deciding a packet’s meaning in the presence of any particular set of “technical errors” can be tricky, and some meanings might be confused for others, owing to errors. So, what makes a protocol message’s meaning clear and unambiguous, and how can we judge this clarity in the presence of errors?

This property of nonambiguity can’t belong to an individual message of a protocol. To know what a message can be confused with, we need to know what other kinds of messages are possible. So, clarity must be a property of the protocol as a whole.

We posit that this property correlates with the nonambiguity of the protocol’s grammar and, generally, with its ease of parsing. It’s unlikely that the parser of a hard-to-parse protocol can be further burdened with fixing technical errors without introducing the potential for programmer error. Thus, clarity can be a property of only an easy-to-parse protocol.

As before, consider the totality of a protocol’s messages as an input language to be recognized by the protocol’s handler (which serves as a de facto recognizing automaton). Easy-to-parse languages with no or controllable ambiguity are usually in regular or context-free classes.

Context-sensitive languages require more computational power to parse and more state to extract the message elements’ meaning. So, they’re more sensitive to errors that make such meaning ambiguous. Length fields, which control the parsing of subsequent variable-length protocol fields, are a fundamental example. Should such a field be damaged, the rest of the message bytes will likely be misinterpreted before the whole

www.computer.org/security 89

message can be rejected thanks to a checksum, if any. If such a checksum follows the erroneous length field, it might also be misidentified [4]. Thus, ambiguous input languages should be deemed dangerous and excluded from Postel’s Robustness Principle requirements.

Adaptability versus Ambiguity

Postel’s principle postulates adaptability. As RFC 1122 states [3]:

    Adaptability to change must be designed into all levels of Internet host software. As a simple example, consider a protocol specification that contains an enumeration of values for a particular header field -- e.g., a type field, a port number, or an error code; this enumeration must be assumed to be incomplete. Thus, if a protocol specification defines four possible error codes, the software must not break when a fifth code shows up. An undefined code might be logged … but it must not cause a failure.

This example operates with an error code: a fixed-length field that can be unambiguously represented and parsed and doesn’t affect the interpretation of the rest of the message. That is, this example of “liberal” acceptance is limited to a language construct with the best formal language properties. Indeed, fixed-length fields make context-free or regular languages; tolerating their undefined values wouldn’t introduce context sensitivity or necessitate another computational power step-up for the recognizer.

So, by intuition or otherwise, this example of laudable tolerance stays on the safe side of recognition, from a formal language-theoretic perspective.

Other Views

Postel’s principle has come under recent scrutiny from several well-known authors. We already mentioned Dan Geer’s insightful essay; Eric Allman recently called for balance and moderation in the principle’s application [9].

We agree, but posit that such balance can exist only for protocols that moderate their messages’ language complexity, and thus the computational complexity and power demanded of their implementations. We further posit that moderating said complexity is the only way to create such balance. We believe that the culprit in the insecurity epidemic and the driver for patching Postel’s principle isn’t the modern Internet’s “hostility” per se (noted as far back as RFC 1122 [3]), but modern protocols’ excessive computational power greed.
The issues that, according to Allman, make interoperability notoriously hard are precisely those we point out as challenges to the security of composed, complex system designs [6]. We agree with much in Allman’s discussion. In particular, we see his “dark side” examples of “liberality taken too far” [9] as precisely the ad hoc recognizer practices that we call on implementers to eschew. His examples of misplaced trust in ostensibly internal (and therefore assumed safe) data sources help drive home one of the general lessons we argue for [5,6]:

    Authentication is no substitution for recognition, and trust in data should only be based on recognition, not source authentication.
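As a minimal sketch of this lesson (our illustration, under stated assumptions, not an implementation from the article), the handler below runs the same recognizer on every message, whether or not its source was authenticated. The 4-byte frame layout, the names `recognize` and `handle`, and the sets of message types and defined codes are all hypothetical. Because every field is fixed-length, the input language is regular, so tolerating an undefined error code, as in the RFC 1122 adaptability example, requires no additional computational power.

```python
import struct
from typing import NamedTuple, Optional

# Hypothetical frame: version (1 byte), type (1 byte), error code (2 bytes, big-endian).
FRAME = struct.Struct("!BBH")
SUPPORTED_VERSION = 1
MSG_TYPES = {0: "ok", 1: "error"}   # assumed message types
DEFINED_CODES = {1, 2, 3, 4}        # assumed: the spec defines four error codes


class Message(NamedTuple):
    kind: str
    code: int
    code_is_defined: bool


def recognize(data: bytes) -> Optional[Message]:
    """Return a Message if data belongs to the message language, else None."""
    if len(data) != FRAME.size:     # definite about length: no truncation, no trailing bytes
        return None
    version, msg_type, code = FRAME.unpack(data)
    if version != SUPPORTED_VERSION or msg_type not in MSG_TYPES:
        return None
    # Liberal only where it is safe: an undefined value in a fixed-length field
    # does not change how any other byte of the message is parsed.
    return Message(MSG_TYPES[msg_type], code, code in DEFINED_CODES)


def handle(data: bytes, source_authenticated: bool) -> str:
    """Recognition gates all processing; authentication never bypasses it."""
    msg = recognize(data)           # runs first, for every source
    if msg is None:
        return "rejected"
    origin = "authenticated" if source_authenticated else "unauthenticated"
    note = "" if msg.code_is_defined else " (undefined code logged)"
    return f"accepted {msg.kind} code {msg.code} from {origin} source{note}"
```

In this sketch, `handle(b"\xff\xff\xff\xff", True)` is rejected even though the source is authenticated, while a well-formed frame carrying a fifth, undefined error code is accepted and flagged for logging.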



We fully agree with the need for “checking everything, including results from local cooperating services and even function parameters” [9], not just user inputs. However, we believe that a more definite line is needed for protocol designers and implementers to make such checking work. A good example is the missing checks for Web input data reasonableness that Allman names as the cause of SQL injection attacks. The downstream developer expectations of such reasonableness in combination with data format complexity might place undecidable burdens on the implementer and prevent any reasonable balance from being struck.

The Postel’s Principle Patch

Here’s our proposed patch:

■ Be definite about what you accept.
■ Treat valid or expected inputs as formal languages, accept them with a matching computational power, and generate their recognizer from their grammar.
■ Treat input-handling computational power as a privilege, and reduce it whenever possible.

Being definite about what you accept is crucial for the security and privacy of your users. Being liberal works best for simpler protocols and languages and is in fact limited to such languages. Keep your language regular or at most context free (without length fields). Being more liberal didn’t work well for early IPv4 stacks: they were initially vulnerable to weak packet parser attacks and ended up eliminating many options and features from normal use. Furthermore, the presence of these options in traffic came to be regarded as a sign of suspicious or malicious activities to be mitigated by traffic normalization or outright rejection. At current protocol complexities, being liberal actually means exposing your software’s users to intractable or malicious computations.

Reversing the ubiquitous insecurity of the Internet and keeping it free require that we rethink its protocol design from first principles. We posit that insecurity comes from ambiguity and the computational complexity required for protocol recognition; minimizing protocol ambiguity and designing message formats so they can be parsed by simpler automata will vastly reduce insecurity. Our proposal isn’t incompatible with the intuitions behind Postel’s principle, but can be seen as its stricter reading that should guide its application to more secure protocol design.

Acknowledgments

While preparing this article for publication, we received extensive feedback about both the Postel principle and our patch for it. We asked for permission to publish these letters in their entirety and are grateful for the permission to do so. Find these letters at http://langsec.org/postel.

References

1. D. Geer, “Vulnerable Compliance,” ;login:, vol. 35, no. 6, 2010, pp. 26–30; http://db.usenix.org/publications/login/2010-12/pdfs/geer.pdf.
2. L. Sassaman and M.L. Patterson, “Exploiting a Forest with Trees,” Black Hat USA, Aug. 2010; http://langsec.org.
3. R. Braden, ed., Requirements for Internet Hosts -- Communication Layers, IETF RFC 1122, Oct. 1989; http://tools.ietf.org/html/rfc1122.
4. S. Bratus and T. Goodspeed, “How I Misunderstood Digital Radio,” submitted for publication to Phrack 68.
5. L. Sassaman et al., “The Halting Problems of Network Stack Insecurity,” ;login:, vol. 36, no. 6, 2011, pp. 22–32; www.usenix.org/publications/login/2011-12/openpdfs/Sassaman.pdf.
6. L. Sassaman et al., Security Applications of Formal Language Theory, tech. report TR2011-709, Computer Science Dept., Dartmouth College, 25 Nov. 2011; http://langsec.org/papers/langsec-tr.pdf.
7. S. Bratus et al., “Exploit Programming: From Buffer Overflows to ‘Weird Machines’ and Theory of Computation,” ;login:, vol. 36, no. 6, 2011, pp. 13–21.
8. J. Postel, ed., DoD Standard Transmission Control Protocol, IETF RFC 761, Jan. 1980; http://tools.ietf.org/html/rfc761.
9. E. Allman, “The Robustness Principle Reconsidered: Seeking a Middle Ground,” ACM Queue, 22 June 2011; http://queue.acm.org/detail.cfm?id=1999945.

Len Sassaman was a PhD student in Katholieke Universiteit Leuven’s COSIC research group. His work with the Cypherpunks on the Mixmaster anonymous remailer system and the Tor Project helped establish the field of anonymity research. In 2009, he and Meredith L. Patterson began formalizing the foundations of language-theoretic security. Sassaman passed away in July 2011. He was 31.

Meredith L. Patterson is a software engineer at Red Lambda. Contact her at mlp@thesmartpolitenerd.com.

Sergey Bratus is a research assistant professor in Dartmouth College’s Computer Science Department. Contact him at sergey@cs.dartmouth.edu.

Selected CS articles and columns are also available for free at http://ComputingNow.computer.org.
