
Building Security In

Editors: John Steven, jsteven@cigital.com • Gunnar Peterson, gunnar@arctecgroup.net

A Metrics Framework to Drive Application Security Improvement

Elizabeth A. Nichols, ClearPoint Metrics • Gunnar Peterson, Arctec Group

If you develop, manage, or administer Web application software and want to measure, analyze, and improve a development culture that produces secure code, this article provides an excellent starting point.

Web applications' functionality and user base have evolved along with the threat landscape. Although controls such as network firewalls are essential, they're wholly insufficient for providing overall Web application security. They provide security for the underlying hosts and the means of communication, but do little to help the application resist attacks against its software implementation or design. Enterprises must therefore focus on the security of the Web application itself. But in doing so, questions immediately arise: What could go wrong with my software? How vulnerable are my existing applications to the most common problems? What changes to my software development life cycle might affect these vulnerabilities? The Open Web Application Security Project (OWASP; www.owasp.org) Top Ten offers a starting point for figuring out what could go wrong. This installment of Building Security In presents metrics that can help quantify the impact that process changes in one life-cycle phase have on other phases. For the purposes of this short discussion, we've broken an application's life cycle into three main phases: design, deployment, and runtime. By organizing metrics according to life-cycle phase in addition to OWASP category, insight from the derived quantitative results can point to defective processes and even suggest strategies for improvement.

Life-cycle metrics
Software development managers use design-time metrics to make risk-management decisions when defining, implementing, and building software and related security mechanisms. Both managers and developers should harvest design-time metrics from source code via static analysis, from audits and assessments, and iteratively from other runtime and deployment-time metrics. The importance of design-time metrics stems from their ability to identify and characterize weaknesses early in the application's life cycle, when such weaknesses cost much less to fix.1
Deployment-time metrics measure changes to the system and its configuration over time. A common (if oversimplified) view is that change is the enemy of security. Deployment-time metrics provide hard data to characterize the amount of change actually present, uncover patterns over time, and help establish baselines for anomaly detection. When combined with runtime metrics, deployment-time metrics give insight into the rate of change and key service-level agreement metrics such as availability, mean time between failures, and mean time to repair.
Runtime metrics focus on the Web application's behavior in production and the security vulnerabilities discovered after deployment. Vulnerability discovery at runtime causes the most expense, both in terms of application performance and customer impact. Over time, if the metrics collected in the earlier phases show improvement due to design and deployment process changes, then we would expect to see a corresponding improvement in runtime metrics. The labels design time, deployment time, and runtime are illustrative rather than strict: although they correspond to distinct phases of the software development life cycle, we can harvest runtime metrics, for example, in the quality assurance phase.
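As a concrete illustration of this organization, the sketch below (Python, with hypothetical field names not drawn from the article) tags each metric observation with its life-cycle phase and OWASP category so results can later be sliced along both dimensions.

from dataclasses import dataclass
from enum import Enum

class Phase(Enum):
    DESIGN = "design"
    DEPLOYMENT = "deployment"
    RUNTIME = "runtime"

@dataclass
class MetricSample:
    """One observation of a security metric, tagged for later roll-ups."""
    name: str             # e.g., "PercentValidatedInput"
    application: str      # which Web application produced the value
    owasp_category: str   # e.g., "Unvalidated Input"
    phase: Phase          # design, deployment, or runtime
    value: float
    period: str           # reporting period, e.g., "2007-Q1"

# One design-time sample for a hypothetical application.
sample = MetricSample(
    name="PercentValidatedInput",
    application="storefront",
    owasp_category="Unvalidated Input",
    phase=Phase.DESIGN,
    value=0.88,
    period="2007-Q1",
)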

Top Ten items


To explore some explicit metrics, let's review each OWASP Top Ten item and an example design-time, deployment-time, or runtime metric for it.

Unvalidated input
The first item, unvalidated input, involves information from Web requests that isn't validated before the Web application uses it. Attackers can exploit these weaknesses to compromise back-end components through the Web application. A good design-time metric is PercentValidatedInput. To compute this metric, let T equal the number of input forms or interfaces the application exposes (the number of HTML form POSTs, GETs, and so on) and let V equal the number of these interfaces that use input validation mechanisms. The ratio V/T makes a strong statement about the Web application's vulnerability to exploits from invalid input: the higher the percentage, the better. If a company sees that all of its Web applications have low values for PercentValidatedInput, then mandating the use of a standard input validation framework would drive lasting improvement for current and future applications.
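As a rough sketch of this calculation, the following Python computes V, T, and their ratio from a hypothetical inventory of an application's input interfaces; the inventory format is an assumption for illustration only.

# PercentValidatedInput = V / T, where T is the number of exposed input
# interfaces and V is the number that use an input validation mechanism.
# The inventory below is a hypothetical stand-in for a real survey of the
# application's HTML form POSTs, GETs, and other entry points.
interfaces = [
    {"route": "POST /login",    "validated": True},
    {"route": "GET /search",    "validated": False},
    {"route": "POST /checkout", "validated": True},
]

T = len(interfaces)
V = sum(1 for i in interfaces if i["validated"])
percent_validated_input = V / T if T else 0.0  # higher is better
print(f"PercentValidatedInput = {percent_validated_input:.0%}")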

Broken access control
The second item, broken access control, means the application fails to impose and enforce proper restrictions on what authenticated users may do. Attackers can exploit such weaknesses to access other users' accounts, view sensitive files, or use unauthorized functions. An example runtime metric is AnomalousSessionCount, which we compute in two phases. The first phase derives a SessionTableAccessProfile by correlating application server log entries for a user session with the database tables that session accessed; the resulting SessionTableAccessProfile is represented as a user ID followed by a set of ordered pairs, each containing a table name and an access count. The second phase derives AnomalousSessionCount by counting how many of these profiles don't fit a predefined user profile. If AnomalousSessionCount is greater than one for any user, especially a privileged user, it could indicate the need for significant refactoring and redesign of the Web application's persistence layer. This is a clear case in which detection at design time is preferable.
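The two-phase computation might look roughly like the sketch below, which assumes hypothetical log records already correlated into (user, session, table) triples and a simple per-user allow-list standing in for the predefined profile.

from collections import Counter, defaultdict

# Phase 1: derive a SessionTableAccessProfile per user session from
# (user, session, table) records correlated out of application server
# and database logs. The record format here is hypothetical.
records = [
    ("alice", "s1", "orders"), ("alice", "s1", "orders"),
    ("alice", "s1", "payroll"),   # table outside alice's expected profile
    ("bob",   "s2", "orders"),
]

profiles = defaultdict(Counter)   # (user, session) -> {table: access count}
for user, session, table in records:
    profiles[(user, session)][table] += 1

# Phase 2: AnomalousSessionCount is the number of sessions whose table
# accesses don't fit the predefined profile (a per-user allow-list here).
allowed = {"alice": {"orders", "customers"}, "bob": {"orders"}}

anomalous_session_count = sum(
    1 for (user, _session), tables in profiles.items()
    if set(tables) - allowed.get(user, set())
)
print("AnomalousSessionCount =", anomalous_session_count)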

Broken authentication and session management
The third item, broken authentication and session management, means the application doesn't properly protect account credentials and session tokens. Attackers who compromise passwords, keys, session cookies, or other tokens can defeat authentication restrictions and assume other users' identities. An example runtime metric is BrokenAccountCount, which we can compute by counting the number of accounts that have had no activity for more than 90 days and will never expire. Such accounts represent a clear risk of password compromise and resulting illegal access.
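A minimal sketch of BrokenAccountCount, assuming a hypothetical account export with last-login and expiration fields:

from datetime import datetime, timedelta

# BrokenAccountCount: accounts with no activity for more than 90 days
# that are set to never expire. The account export format is hypothetical.
accounts = [
    {"name": "svc_report", "last_login": datetime(2006, 10, 1), "expires": None},
    {"name": "jdoe",       "last_login": datetime(2007, 3, 1),  "expires": None},
    {"name": "old_admin",  "last_login": datetime(2006, 6, 15), "expires": datetime(2007, 6, 1)},
]

as_of = datetime(2007, 4, 1)        # evaluation date for the reporting period
stale = timedelta(days=90)

broken_account_count = sum(
    1 for a in accounts
    if a["expires"] is None and as_of - a["last_login"] > stale
)
print("BrokenAccountCount =", broken_account_count)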

Cross-site scripting
With the fourth item, cross-site scripting (XSS), attackers can use the Web application as a mechanism to transport an attack to a user's browser. A successful attack can disclose the user's session token, attack the local machine, or spoof content to fool the user. An example runtime metric is XsiteVulnCount, which we can obtain via a penetration-testing tool. The results will likely enter a bug-tracking process (developers can quickly fix XSS bugs). However, this is another case in which catching the problem earlier is far better than later.
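One way to tally XsiteVulnCount is to count XSS findings per application in a penetration-test export; the CSV layout below is hypothetical, since each tool has its own report format.

import csv
from collections import Counter
from io import StringIO

# XsiteVulnCount: XSS findings per application, tallied from a
# penetration-testing tool's export. The CSV layout is hypothetical.
report_csv = """application,finding
storefront,Cross-Site Scripting
storefront,SQL Injection
storefront,Cross-Site Scripting
intranet,Cross-Site Scripting
"""

xsite_vuln_count = Counter(
    row["application"]
    for row in csv.DictReader(StringIO(report_csv))
    if row["finding"] == "Cross-Site Scripting"
)
print(dict(xsite_vuln_count))   # {'storefront': 2, 'intranet': 1}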

Buffer overflow
The fifth item, buffer overflow, can crash Web application components such as libraries and drivers in languages that fail to validate input, or, in some cases, attackers can use overflows to take control of a process. An example deployment-time metric is OverflowVulnCount, which we can obtain from standard vulnerability management tools that compare the patch level of installed software against the patch levels that repair known buffer overflow flaws. Another useful set of metrics provides statistics on the patching latency for known overflow vulnerabilities. To compute these metrics, calculate the minimum, maximum, mean, and standard deviation of the number of minutes, hours, or days it took to patch detected overflow vulnerabilities during a given time period. A high mean or a high standard deviation indicates either slow or inconsistent patching processes.
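The latency statistics reduce to a few standard-library calls; the days-to-patch values below are illustrative only.

import statistics

# Patching latency for known overflow vulnerabilities: minimum, maximum,
# mean, and standard deviation of days-to-patch within one reporting period.
days_to_patch = [2, 3, 5, 21, 4, 60]

latency_stats = {
    "min":   min(days_to_patch),
    "max":   max(days_to_patch),
    "mean":  statistics.mean(days_to_patch),
    "stdev": statistics.stdev(days_to_patch),
}
# A high mean or a high standard deviation flags slow or inconsistent patching.
print(latency_stats)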

Organizing metrics by OWASP Top Ten category and software life-cycle phase can drive improvement in existing processes.

Injection flaws
The sixth item, injection flaws, involves the parameters the Web application passes when accessing external systems or the local operating system. If an attacker embeds malicious commands in these parameters, the external system can execute those commands on the Web application's behalf. An example runtime metric is InjectionFlawCount, which we can derive from penetration tests that submit invalid parameters to a running copy of the Web application. This metric characterizes the Web application's vulnerability to potential attacks. Another runtime metric is ExploitedFlawCount, which we can derive from reported incidents in which an attacker successfully exploited the application via an injection flaw. This metric characterizes the impact actually suffered. Both metrics offer excellent feedback to the development organization about inadequate parameter checking.
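A sketch of ExploitedFlawCount derived from incident records (InjectionFlawCount would be tallied from penetration-test findings in the same way as the XSS example above); the incident fields are assumptions.

# ExploitedFlawCount: incidents during the period in which an attacker
# successfully exploited the application via an injection flaw. The
# incident record layout is hypothetical.
incidents = [
    {"app": "storefront", "cause": "sql-injection",        "confirmed": True},
    {"app": "storefront", "cause": "phishing",             "confirmed": True},
    {"app": "intranet",   "cause": "os-command-injection", "confirmed": False},
]

exploited_flaw_count = sum(
    1 for i in incidents
    if i["confirmed"] and "injection" in i["cause"]
)
print("ExploitedFlawCount =", exploited_flaw_count)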

Improper error handling
With the seventh item, improper error handling, the application source code doesn't properly check or handle error conditions that occur during normal operation. If an attacker can introduce errors that the Web application doesn't handle, he or she can gain detailed system information, deny service, cause security mechanisms to fail, or crash the server. For a design-time metric, use a static analysis tool to count the number of function calls that don't check return values. This per-application count provides a good indicator of improper error handling's prevalence. A simple raw count performs best: dividing by the number of all function calls to normalize the raw count into a percentage could mask a serious problem.
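A sketch of the design-time count, assuming a static analysis tool that can export findings such as unchecked return values; the JSON layout and rule name are hypothetical.

import json

# Design-time metric: function calls whose return values are never checked,
# taken from a static analysis tool's export. The structure is hypothetical.
findings_json = """
[
  {"app": "storefront", "rule": "unchecked-return-value", "file": "db.c"},
  {"app": "storefront", "rule": "unchecked-return-value", "file": "io.c"},
  {"app": "storefront", "rule": "dead-code",              "file": "ui.c"}
]
"""

findings = json.loads(findings_json)
unchecked_return_count = sum(
    1 for f in findings if f["rule"] == "unchecked-return-value"
)
# The article recommends reporting the raw count rather than a percentage.
print("Unchecked return values:", unchecked_return_count)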

Insecure storage
The eighth item, insecure storage, illustrates how Web applications frequently use cryptographic functions to protect information and credentials. However, these functions and the code needed to integrate them have proven difficult to code properly, frequently resulting in weak protection. For a deployment-time metric, compute the percentage of servers with installed and active automatic hard-disk encryption to find the level of protection available as part of a Web application's operating environment. In short, the higher the metric value, the higher the level of protection.

Application denial of service
With the ninth item, application denial of service, attackers can consume Web application resources to the point where other, legitimate users can no longer access or use the application. Attackers can also lock users out of their accounts or even cause the entire application to fail. For a runtime metric, derive metrics from penetration tests that cover denial-of-service attacks. Vulnerability discovery can help here, but preventing denial of service can be a complicated design issue.

Insecure configuration management
The final item, insecure configuration management, focuses on how secure Web applications depend on a strong server configuration. Servers possess many configuration options that affect security, and no default configuration is secure. For a deployment-time metric, count the number of service accounts (the ones a program uses to log into services such as database management systems) with weak or default passwords. This indicator helps quantify the risk of illegal access, breach of confidentiality, and loss of integrity. Consistent unacceptable exposure warrants better deployment standards.
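A sketch of the deployment-time count, assuming a hypothetical password-audit export and default-credential list; a real audit would rely on a credential-auditing tool rather than plaintext passwords.

# Deployment-time metric: service accounts still using weak or default
# passwords, based on a password-audit export. The account data and the
# default-password list are hypothetical.
DEFAULT_PASSWORDS = {"changeme", "password", "admin", "sa"}

service_accounts = [
    {"name": "app_db", "password": "changeme"},
    {"name": "app_mq", "password": "x9!Lk2#qR"},
    {"name": "backup", "password": "admin"},
]

weak_service_account_count = sum(
    1 for a in service_accounts
    if a["password"].lower() in DEFAULT_PASSWORDS or len(a["password"]) < 8
)
print("Service accounts with weak or default passwords:", weak_service_account_count)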

Figure 1. Security scorecard. This layout helps reviewers assess one or more Web applications' current state and quality by providing a color-coded score for each category of Open Web Application Security Project (OWASP; www.owasp.org) flaw.

A security scorecard
The scorecard in Figure 1 summarizes the relatively fine-grained metrics calculated from data produced by penetration testing, static code analysis, incident management systems, vulnerability scanners, and the other instrumentation mentioned in the previous sections. Several of our client companies have used this scorecard to track improvement in security-centric coding practices within their respective Web application development organizations. The scorecard gives a calculated rating for seven of the OWASP Top Ten categories. Color helps translate the metric results into a more qualitative state: red for bad, green for good, and yellow for somewhere in between. If you perform this exercise in your own company, the keys to success include forming consensus around the mapping and only making changes in a controlled and fully auditable manner. Making a change based on a pseudo-political need to turn a red into a yellow or a yellow into a green will cause a lot of damage in the long run, rendering the scorecard and its underlying metrics useless.

The following steps, inspired by the Six Sigma framework (www.isixsigma.com), help map quantitative metric data into color-coded ratings:

• Express each underlying metric in terms of defects divided by opportunities. If, for example, a Web application has 100 input forms and 12 of them have defective input validation, then the application gets a rating of 88. The equation is 1.0 - (#defects/#opportunities), expressed here as a score out of 100.
• Map values to colors by comparing each value to thresholds. For example, your group could establish that red maps to values less than 80, yellow maps to values from 80 through 90, and green maps to values over 90.
• Aggregate all individual Web application scores in a given vulnerability category to create a single summary score.
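The defect/opportunity scoring and threshold mapping can be expressed directly; the thresholds below mirror the example values above and are meant to be tuned by consensus.

def category_score(defects: int, opportunities: int) -> float:
    """Six Sigma-style score: 1 - defects/opportunities, scaled to 0-100."""
    if opportunities == 0:
        return 100.0
    return 100.0 * (1.0 - defects / opportunities)

def to_color(score: float, red_below: float = 80.0, green_above: float = 90.0) -> str:
    """Map a score to a qualitative state; the thresholds are illustrative."""
    if score < red_below:
        return "red"
    if score > green_above:
        return "green"
    return "yellow"

# Example from the text: 100 input forms, 12 with defective input validation.
score = category_score(defects=12, opportunities=100)
print(score, to_color(score))   # 88.0 yellow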

On this last point, you can map many scores to one color-coded state in several different ways. Some possibilities are:

• Assign the entire vulnerability category the worst value or color. This harsh but useful method gives lagging applications a lot of visibility and stresses improvement for the category's worst score.
• Map the mean of all individual metrics to a color via a threshold mechanism.
• Compute a weighted mean of all individual metrics, weighting each application by its agreed-upon criticality (consensus is key here), and map the weighted mean to a state using a threshold mechanism.
• Map the mean minus the standard deviation of all individual metrics to a state for a particular category. This approach favors consistency.
• Map the value of the lowest application in the top decile (or quartile, and so on) to a state.

The scorecard provides additional indicators that show an upward, downward, or unmoving trend for the given time period relative to previous periods. Subordinate scorecards can include trend lines covering several historical time periods.
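A sketch of a few of these aggregation strategies over per-application scores for one category; the scores and criticality weights are illustrative.

import statistics

# Rolling per-application scores for one vulnerability category into a
# single summary score, using a few of the strategies listed above.
scores = {"storefront": 88.0, "intranet": 95.0, "partner-portal": 72.0}
criticality = {"storefront": 3, "intranet": 1, "partner-portal": 2}

worst = min(scores.values())                              # worst value wins
mean = statistics.mean(scores.values())                   # simple mean
weighted_mean = (
    sum(scores[app] * criticality[app] for app in scores)
    / sum(criticality.values())
)                                                         # criticality-weighted
consistency = mean - statistics.stdev(scores.values())    # favors consistency

print(worst, round(mean, 1), round(weighted_mean, 1), round(consistency, 1))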

To enrich the above quantitative scoring, analysts should also include qualitative, unstructured annotations in the scorecard, describing how to use the data provided, what objectives it serves, how to interpret the results, and what actions the company has initiated as a result of the insights derived. In this way, organizations can begin to organize the myriad fine-grained metrics derived from their existing infrastructure and efficiently drive application security improvement. As for the time involved, you can implement and regularly review a scorecard such as the one in Figure 1 incrementally, starting with easily obtained metrics such as those from your existing penetration testers, static code scanners, and incident management systems. In our own tests, in which we used a purpose-built security metrics platform, the scorecard took roughly two weeks of effort from initial design to deployment for automatic metric production.

References
1. D.E. Geer, A.R. Jaquith, and K. Soo Hoo, "Information Security: Why the Future Belongs to the Quants," IEEE Security & Privacy, vol. 1, no. 4, 2003, pp. 24–32.
Elizabeth A. Nichols is the Chief Technology Officer at ClearPoint Metrics. Her research interests include information security metrics design, automation, visualization, and benchmarking. Nichols has a PhD in mathematics from Duke University. Contact her at ean@clearpointmetrics.com.

Gunnar Peterson is a founder and managing principal at Arctec Group, which supports clients in strategic technology decision making and architecture. His work focuses on distributed systems security architecture, design, process, and delivery. Contact him at gunnar@arctecgroup.net.

