
SYSTEMS SECURITY

Editors: Patrick McDaniel, mcdaniel@cse.psu.edu | Sean W. Smith, sws@cs.dartmouth.edu

Bolt-On Security Extensions for Industrial Control System Protocols: A Case Study of DNP3 SAv5

J. Adam Crain | Automatak
Sergey Bratus | Dartmouth College

IEEE Security & Privacy, May/June 2015. Copublished by the IEEE Computer and Reliability Societies. 1540-7993/15/$31.00 © 2015 IEEE

Industrial control system (ICS) protocols, key to public utility operations, have developed alongside the Internet but are largely isolated from it, carried by dedicated serial lines between closed networks with trusted software. However, as leased lines are replaced with transmission control protocol (TCP) or wireless connections to serve the needs of “smarter” energy systems and as ICS traffic comingles with other kinds of packets, legacy ICS protocol design becomes a problem. Protocols previously designed for isolated networks must receive “bolt-on” security extensions, compatible with the bulk of legacy implementations already deployed; implementations never meant to be exposed to maliciously malformed input must be hardened to reject it gracefully. Attempting to realize visions of smart utilities with the current ICSs’ attack surfaces will dramatically increase risks of their catastrophic failure due to hostile actions.

DNP3 (IEEE Standard 1815-2012) is widely used in the US power grid and is a typical representative of the supervisory control and data acquisition (SCADA)/ICS protocol family [1]. As with many other such protocols, DNP3’s original design didn’t include security features such as authentication. The recently standardized secure authentication (SA) extends DNP3 to provide optional and multiuser authentication services, with characteristic tradeoffs between security and bandwidth. These extensions modify the existing DNP3 application layer by creating additional function codes and object types that selectively apply to a subset of protocol features, leaving others unprotected and increasing overall syntactic complexity.

We believe that this manner of extension represents a security anti-pattern (a design that will keep producing bugs and weaknesses) and considerably increases the attack surface associated with protocol encoding, parsing, and implementation complexity. Reviews of SA have overlooked this additional attack surface, focusing instead on its cryptographic primitives and message flows. In this article, we discuss this increased attack surface and how to avoid its worst pitfalls.

DNP3 Overview
The DNP3 protocol stack is split into three layers: link, transport, and application (see Figure 1). The protocol is transport agnostic: all three layers are used regardless of whether the underlying network is a serial communications channel or a TCP stream (with its own open systems interconnection model layers below DNP3’s link).

The link layer is concerned primarily with framing, point-to-multipoint addressing, and error detection in a manner similar to Ethernet datagrams, but it also includes simple stateless functionality such as heartbeat messages.

The transport layer reassembles multiple link layer frames into larger application messages. This reassembly is based on a single-byte transport header with first, final, and sequence parameters that allow for only in-order reassembly. Unexpected transport segments are simply dropped, and the reassembly buffer is emptied. The maximum default size of a reassembled
application layer message is 2,048 octets, although this size is adjustable if both ends agree on the value using out-of-band configuration.

[Figure 1. Abstract DNP3 communication stack (top to bottom: Application, Transport, Link). The link layer has direct access to the communication channel.]

The application layer handles messages called application data service units (ASDUs) that derive their semantics from a combination of function codes and objects. Messages can consist of zero or more object headers that follow the main application layer header (see Figure 2). Object headers describe the type and quantity of objects that follow. The beginning of the next header is discoverable only by parsing the previous one.

[Figure 2. DNP3 application layer messages consist of a main header (including function) and zero or more object headers and associated data: Header (including function) | Header + data #1 | ... | Header + data #N.]

The rules for determining object payload lengths are complex and varied. This complexity gives implementations of the DNP3 application layer a large attack surface due to potential programmer errors. For example, programmers might fail to check a payload’s multiple object lengths for consistency, interpret a payload’s contents differently than intended, or assume the presence of objects that are actually absent from a maliciously crafted payload. As usual in software exploitation scenarios, acting on incorrect assumptions while allocating or copying maliciously crafted payload data results in memory corruptions, which attackers can leverage to crash or control ICS processes.

Most types of valid messages require at least one object header. Notable exceptions are confirm, cold restart, warm restart, delay measurement, and record current time, which are never paired with any objects. The specification exhaustively defines which objects can be paired with which function codes [1].

The vast majority of object headers can be processed independently; that is, they aren’t context sensitive with regard to other object headers in the ASDU. Nonsecure DNP3 has only one notable exception to this rule: common time of occurrence fields and the measurements with which they can be paired. This combination of headers requires a common reference time header to precede one or more relative times and acts as a crude way of compressing what are normally 48-bit time stamps on measurement values.

In any protocol, there’s an inherent design tradeoff between structural flexibility and attack surface, which doesn’t favor cryptography. Indeed, underlying complexity or ambiguity of encoding gave rise to a variety of attacks such as Serge Vaudenay’s and Daniel Bleichenbacher’s as well as the more recent BEAST, CRIME, Lucky13, POODLE, BERserk, and others, which all worked around the enduring strength of the cryptographic primitives.

With its high level of flexibility, the DNP3 application layer is a poor candidate for encoding cryptographic functions. Despite the constraints placed on function and object combinations, the number of valid combinations of objects for many DNP3 function codes is practically infinite. The ability to associate multiple objects to a single function makes the DNP3 application layer powerful in terms of flexibility and bandwidth but also particularly vulnerable when it comes to parsing and processing attacker-supplied input. By contrast, SCADA protocols of similar functionality, such as IEC 60870-5-104, have more rigid application layer structures in which the function code completely defines the type of data that follows, reducing the combinatorial complexity of valid inputs (and thus the complexity of the code that must validate them). Not surprisingly, DNP3’s complexity is reflected in its distribution of vulnerabilities.

Fuzzing Vulnerable Implementations
Vulnerabilities in DNP3 implementations that arise due to the protocol’s complexity aren’t theoretical. For nearly a decade, such vulnerabilities in DNP3 and other SCADA protocol implementations have been found by fuzzing [2,3]; however, little information has been made publicly available on DNP3 vulnerability specifics. A 2010 US government–funded report specifically mentioned the dire need to improve input parsing routines in DNP3 implementations without citing specific failure modes [4].

The most comprehensive study of DNP3 vulnerabilities was conducted by Crain (coauthor of this article) and Chris Sistrunk from 2013 to 2014 and resulted in numerous disclosures coordinated with vendors and asset owners (www.automatak.com/robus); a small representative fraction of the raw vulnerability data was released publicly [5]. We recap the results of this study here, as they pertain to DNP3 security extensions.

Examples of Vulnerabilities
Crain and Sistrunk tested the effects that crafted malformed frames could have on DNP3 implementations in master controllers and outstation (remote) equipment. Nearly all vendor products were
found to be vulnerable to single-frame attacks for certain frame types; these types were chosen to exercise the protocol’s syntactic complexity and to trigger programmer errors that would likely result from this complexity.

A single crafted frame received by a vulnerable implementation could crash the receiving process or drive it into an infinite loop, rendering the entire protocol stack inoperable. Moreover, for many vendors, broadcast frames could trigger such effects, so the attack doesn’t require any attacker knowledge about the link endpoint configurations.

For example, ASDUs that are too short to contain a valid object header could be delivered in a frame with a correct lower-layer cyclic redundancy check (CRC) value to cause an unhandled exception in the receiving code. An infinite loop could be exploited in another implementation by setting an object count to the maximum possible value of 65,535 but failing to provide the corresponding bytes. A response with two control objects unexpected in such a frame would cause a buffer overrun and crash, an example of a payload that’s syntactically valid according to the specification but meaningless. This creates room for ambiguity of payload interpretation. In other cases, malformations in simpler lower layers caused crashes, for example, a link frame encapsulating a single-byte malformed transport protocol data unit and no application protocol data unit (APDU).

A single frame triggering an unhandled exception can be as simple as a payload that contains no APDU under the valid link and transport layer checksums (see Figure 3).

[Figure 3. A DNP3 frame with source address 100 and destination address 100. It contains no application layer payload and caused a fault in a real system owing to poor input validation. CRC is cyclic redundancy check. Frame bytes: 05 64 06 44 64 00 64 00 FF F2 C0 1D 0A. Annotations in the original figure: 1-byte payload; unconfirmed user data; addresses 100 and 100; CRC (twice); First/Final, sequence number = 0.]

Distribution of Vulnerabilities
The generational fuzzer used in this study was designed to stress each layer of the protocol individually to expose weaknesses in each layer’s implementation. The tool was iteratively improved using code coverage analysis obtained from an open source implementation of DNP3.

More than 80 percent of discovered vulnerabilities were found in the application layer. This isn’t surprising given how DNP3’s complexity is distributed. The DNP3 specification devotes hundreds of pages to describing the application layer, its state machine, and the numerous object encodings, whereas the link layer is covered in only 21 pages and the transport layer reassembly gets a mere seven pages. We find similar ratios by counting the source lines of code associated with each layer in an open source implementation of the protocol. Simply put, when it comes to robustness and security, less is more.

Of the application layer vulnerabilities, a disproportionate number were associated with the unsolicited response functions. A crude way of explaining this is to analyze the specification to see how many object types can be paired with certain function codes. Performing this analysis on the Parsing Guideline Tables in IEEE Standard 1815-2012 reveals that the most overloaded functions in terms of the number of possible object types are read, response, and unsolicited response [1]. The near absence of vulnerabilities in the read function code is best explained by the fact that client ASDUs don’t carry data payloads but merely describe what data is being requested, resulting in a simpler syntax. Responses and unsolicited responses can be associated with the majority of the object headers and types in the specification, giving these functions the largest attack surface.

Underrepresentation of the Application Layer
Although the majority of the failures were discovered in the application layer, there’s reason to believe that this layer is underrepresented in the results as compared to the link and transport layers. The open source package used to verify the fuzzer is a conservative implementation that doesn’t include even more complex protocol feature subsets such as file transfer, datasets, and device attributes. The fuzzer was developed to verify this open implementation and therefore doesn’t model these optional features. Many of these features use more complicated encodings that include variable length fields, many of which can be specified in multiple ways and can be internally inconsistent and potentially confusing to a parser. It’s almost certain that significant latent vulnerabilities exist in these complex but untested areas of the various protocol implementations.

Optional Authentication
The SA specification lists a set of function codes that must always be authenticated as well as a smaller subset of function codes that can be optionally authenticated. This decision was made to conserve communication bandwidth [1]. However, selective authentication of
application layer messages is a counterproductive and dangerous design pattern, especially in SCADA. Optional authentication conveys a false sense of security to users, fails to address the vulnerability threats posed by parsing and processing payloads, and substantially increases the protocol’s overall complexity by requiring security mechanisms to be protocol aware.

Unauthenticated Closed Loop
The spoofing of measurement data has been a component of several major attacks against ICSs, including Stuxnet, allowing attackers to cause more undetected damage or losses to a process over time than with a sudden catastrophic event. In this context, not providing mandatory authentication of measurement data from the field is an important oversight. Man-in-the-middle attackers on an SA link with only mandatory authentication enabled can allow authenticated control information to pass but subtly alter measurement data in such a way that gradually degrades the process or damages equipment.

Lack of Stateless Functionality
Because almost no stateless functionality can be found in the protocol, configuring an SA system to not authenticate any particular function code is inadvisable. The DNP3 application layer has only a handful of completely stateless function codes. The delay measurement function code, for instance, doesn’t alter any server-side state in the outstation when processed. However, because of the event-oriented nature of DNP3, a combination of read and confirm functions allows attackers with access to the network to flush all queued event messages from an outstation if these functions aren’t authenticated.

DNP3 SAv5 requires the authentication of 21 out of 34 total function codes, whereas the remaining function codes can be optionally authenticated based on the configuration. Mandatory function codes, listed in IEEE Standard 1815-2012 [1], are primarily those that can alter the outstation’s state and the process’s output state. A notable exception is the assign class function code, which can be used to silence an outstation’s reporting mechanism by assigning all event data in the outstation to class 0. This would have an effect similar to disable unsolicited but could be even more harmful because it would likely persist across device reboots and remove event data from responses to normal event polls.

Responses Present the Most Risk for Exploitation
As we discussed, the most complex response and unsolicited response function codes present the largest attack surface and therefore the most risk of exploitation. Furthermore, remotely compromising a physically well-protected master from an isolated and less-protected field asset was until recently an underdiscussed attack vector [6,7].

The attack model under which the specification was designed doesn’t seem to include implementation defects as a viable threat. Selectively authenticating subsets of the protocol by function alone, and not by complexity, is a major oversight and should be regarded as a secure protocol anti-pattern.

Conversely, requiring authentication prior to parsing these complex areas of the specification would turn a preauthentication exploit into one requiring compromised credentials.

Protocol Complexity
DNP3 is a complex protocol, mostly due to the way it implements the transfer of event data using server-side state. A lot of bookkeeping and additional messages such as confirms are required to keep things synchronized. SA adds even more complexity to the same set of application layer state machines in a manner that’s difficult to untangle. This presents real challenges for implementers who must now support both the secure and nonsecure versions of the protocol.

This complexity also extends into parsing and ambiguous encodings. Cryptographic protocols should always defer as much parsing and processing as possible until after the sender’s identity has been established, both to derive the most benefit from cryptographic integrity protections and to avoid running afoul of the so-called “cryptographic doom principle” [8]. Unfortunately, SA’s design contradicts this principle.

Challenge–Response versus Aggressive Mode
After an initial session key exchange, normal DNP3 traffic can be authenticated using one of two modes: challenge–response and aggressive, which is a form of one-pass authentication using sequence numbers for replay protection.

The challenge–response mode introduces two additional messages into the normal traffic flow and can substantially impact latency and throughput for a serial link. This two-pass authentication mode is more resistant to replay attacks because each challenged message is authenticated using a unique nonce. In this mode, it’s fairly easy for the challenging party to treat everything after the function code as opaque “payload data” that isn’t parsed until the remote side authenticates. Figure 4 shows challenge–response mode’s traffic flow.

Aggressive mode adds a user object with a sequence number as the first object header in the ASDU and a hash message authentication code (HMAC) value as the last object header. The purpose of this mode is to reduce bandwidth and latency by authenticating messages within the request–response exchange itself. Figure 5 shows the request’s structure.


[Figure 4. Challenge–response message flow: (1) normal request: function + (payload bytes); (2) challenge: function + nonce; (3) authentication: HMAC (key, request, nonce); (4) normal response. Parsing of the message payload can be deferred until after authentication. HMAC is hash message authentication code.]

[Figure 5. In aggressive mode, application data service units sandwich the payload to be processed inside an ad hoc envelope consisting of user and sequence information and a trailing message digest: Normal function | User ID and CSQ | Payload objects ... | HMAC. The challenge sequence number (CSQ) protects messages from replay attacks.]

Aggressive Mode Ambiguity
The first issue with aggressive mode request encoding is the ambiguity of the request. Normally, DNP3 message payloads can be processed solely based on the function code. In aggressive mode, the first object header must be inspected to determine whether the ASDU is a normal request or an aggressive mode request. The lack of a proper envelope for the payload data requires implementers to perform special-case parsing in multiple places to safely handle aggressive mode requests.

The most dangerous issue with aggressive mode encoding is that many implementers will naively parse the entire payload data to reach the HMAC trailer. Recall that DNP3 object headers can’t normally be skipped over without at least some level of light parsing. Numerous vulnerabilities were identified in the parsing of these object headers, particularly integer overflow issues related to handling object headers that use start and stop indices. At first blush, it appears necessary to interpret the inner payload data to be able to determine the trailing HMAC’s position.

Fortunately, in this case, there’s a nonintuitive and undocumented workaround. The HMAC object and its header are of a known size and can be speculatively parsed off the end of the ASDU. Future versions of the specification should make explicit recommendations to implementers to use this methodology for reading the aggressive mode HMAC. We note that the signing schemes for Linux’s loadable kernel modules have finally converged on a similar design, in which a fixed-size signature is simply appended to the end of the module object file, after a string of unsuccessful designs that attempted to use more complex formats and metadata.

Conflicting Encodings of Length
Many variable-length objects related to security functionality have inconsistent encodings between objects as well as encodings with multiple ways of representing the length of certain fields in a single object. Having two sources of truth for lengths of certain payload elements has been a common source of implementation defects in various protocols, most recently OpenSSL’s Heartbleed and the GNU TLS Hello bug, as well as classic preauthentication bugs such as OpenSSH’s challenge–response vulnerability.

In DNP3, all variable-length objects are preceded by a UINT16 length that defines the entire object’s length. Fixed-length fields come first in the object, and variable-length fields come last. Every variable-length field except the last is preceded by its own UINT16 length field. The last field’s length is implicitly established as the remainder of the envelope length. Figure 6 shows this pattern.

[Figure 6. A session key status object with two variable-length fields, challenge data and message authentication code (MAC) value. The MAC value’s length is the remainder of the length field framing the entire object [1]. Fields in octet transmission order: key change sequence number (32 bits), user number (16 bits), key wrap algorithm (8 bits), key status (8 bits), MAC algorithm (8 bits), challenge data length (16 bits), challenge data (variable), MAC value (variable).]

In this encoding, message authentication code (MAC) value length is unambiguous in the sense that there’s only one way to determine its value. If the total size of the object is N and the length of all fields preceding the MAC value is P, then the length of the MAC value is N – P. However, this encoding scheme isn’t applied consistently. Some objects have a preceding length field for the final variable length field, as Figure 7 shows.

[Figure 7. Update key change request with two variable-length fields, user name and master challenge data. The length of the challenge data is explicitly encoded in the length field and implicitly encoded as the remainder of the length field framing the entire object. Fields in octet transmission order: key change method (8 bits), user name length (16 bits), master challenge data length (16 bits), user name (variable), master challenge data (variable).]

Thus, there are two ways to determine the master challenge data field’s length in an update key change request. In a valid encoding of this object, the entire object’s length must agree with the final field’s explicit length value. To complicate the issue, the specification informs implementers that they can use either method to establish the final field’s length [1], which can lead to implementations that disagree on the cryptographic data’s contents. If the protocol can’t be redesigned to remove such encoding ambiguities, the parsing recommendation should be to always check that these two methods produce the same length value.

DNP3 SA contains a number of anti-patterns that will likely serve as a significant source of bugs. Vendors and standards bodies adding security to SCADA/ICS protocols should strongly favor a layered approach to security in which legacy protocol issues can be decoupled from SCADA object models and semantics.

Acknowledgments
This specification review was performed as part of the process of implementing it in a preexisting open source project. The DHS S&T HOST program award partially funded this work.

References
1. IEEE Std. 1815-2012, IEEE Standard for Electric Power Systems Communications-Distributed Network Protocol (DNP3), IEEE, 2012; https://standards.ieee.org/findstds/standard/1815-2012.html.
2. G. Devarajan, “Unraveling SCADA Protocols: Using Sulley Fuzzer,” DEFCON 15, 2007; www.dc414.org/download/confs/defcon15/Speakers/Devarajan/Presentation/dc-15-devarajan.pdf.
3. D.G. Peterson, “Iccpsic Assessment Tool Set Released,” Digital Bond, 2007; www.digitalbond.com/blog/2007/08/28/iccpsic-assessment-tool-set-released.
4. NSTB Assessments Summary Report: Common Industrial Control System Cyber Security Weaknesses, tech. report INL/EXT-10-18381, Idaho Nat’l Laboratory, May 2010; http://fas.org/sgp/eprint/nstb.pdf.
5. D. Peterson, “S4x14 Video: Crain/Sistrunk - Project Robus, Master Serial Killer,” Digital Bond, 23 Jan. 2014; www.digitalbond.com/blog/2014/01/23/s4x14-video-crainsistrunk-project-robus-master-serial-killer.
6. D. Peterson, “Why Crain/Sistrunk Vulns Are a Big Deal,” Digital Bond, 2013; www.digitalbond.com/blog/2013/10/16/why-crain-sistrunk-vulns-are-a-big-deal.
7. E. Byers, “DNP3 Vulnerabilities Part 1 of 2 - NERC’s Electronic Security Perimeter Is Swiss Cheese,” Tofino Security, 7 Nov. 2013; www.tofinosecurity.com/blog/dnp3-vulnerabilities-part-1-2-nerc%E2%80%99s-electronic-security-perimeter-swiss-cheese.
8. M. Marlinspike, “The Cryptographic Doom Principle,” Thought Crime blog, 13 Dec. 2011; www.thoughtcrime.org/blog/the-cryptographic-doom-principle.

J. Adam Crain is a software engineer, security researcher, and open source advocate. He’s also a partner at Automatak, which aims to improve the penetration of robust open source software in the utility space. Contact him at jadamcrain@automatak.com.

Sergey Bratus is a research associate professor in the Computer Science Department at Dartmouth College. His research interests include Unix security and wireless networking. Contact him at sergey@cs.dartmouth.edu.
