
No Minimum Level For R-Squared In Regression Analysis

Law360, New York (April 22, 2016, 2:58 PM EDT) --

Presentations of regression analysis in litigation matters often emphasize the R-squared statistic (or the “coefficient of determination”),[1] which provides, in a single number, a measure of how well the regression model fits the data.[2] An R-squared equal to one means that 100 percent of the variation observed in the dependent variable is captured by the explanatory variables in the regression model, i.e., the regression model perfectly fits the data.[3] Conversely, an R-squared on the opposite end of the range, for example 0.05, indicates that the regression model captures 5 percent of the observed variation in the data, and that 95 percent of the observed variation in the dependent variable is not explained by the model used in the regression.
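As a concrete illustration of this definition, the sketch below (pure Python, with made-up data) fits a one-variable OLS line and computes R-squared as one minus the ratio of unexplained to total variation:

```python
def r_squared(x, y):
    """R-squared = 1 - SSR/SST: the share of the variation in y
    explained by the fitted OLS line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # OLS slope and intercept for y = a + b*x
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / \
        sum((xi - mean_x) ** 2 for xi in x)
    a = mean_y - b * mean_x
    sst = sum((yi - mean_y) ** 2 for yi in y)                      # total variation
    ssr = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))    # unexplained variation
    return 1 - ssr / sst

# Hypothetical data lying almost exactly on a line
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
print(round(r_squared(x, y), 3))  # 0.998
```

Because the points fall nearly on a line, almost all of the variation is explained and R-squared is close to one; scattering the points around the line would lower it.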

The deceptively easy-to-interpret measure of R-squared can make it an attractive statistic for evaluating the overall quality of a regression analysis. Indeed, some court decisions have focused on R-squared. In Valentino v. United States Postal Service, the plaintiff’s expert presented a regression model with an R-squared value of 0.28 (or 28 percent). The court deemed the regression to have no probative value, reasoning that the “low” R-squared value meant that potentially relevant explanatory variables had been omitted. Similarly, in Griffin v. Board of Regents, the court held that “the explanatory power of a model is a factor that may legitimately be considered by the district court in deciding whether the model may be relied upon.”[4] Although there are other cases where a low R-squared statistic did not disqualify a model, an R-squared that is viewed as low may present an additional avenue for challenging the validity or reliability of a regression analysis.[5]

A common question is whether there is a minimum acceptable value for R-squared. Does an R-squared value need to exceed a certain arbitrary level (often 50 percent or greater)? Is a regression model with a higher R-squared categorically preferable to one with a lower R-squared? In this article, we examine whether there is a minimum threshold for R-squared by examining the R-squared values reported in a sample of peer-reviewed economics studies. If there were a minimum threshold, it should be evident in those studies. On the contrary, we find that many empirical studies make no mention of R-squared, and among the articles that do report an R-squared value, approximately half report an R-squared of less than 0.5 (or 50 percent), and approximately 7 percent report an R-squared of less than 0.1 (or 10 percent).

Is There a Minimum Level of R-Squared in the Scientific Community?

One way to examine whether there is a minimum threshold for R-squared in an econometric model is to review the R-squared values reported in highly regarded, peer-reviewed economics journals. We reviewed 315 economics articles, published across three journals in 2014 and 2015, in which a regression analysis was undertaken.[6] If there were indeed a minimum level of R-squared, we would expect to see this threshold reflected in the peer-reviewed economics articles. On the contrary, we do not observe a minimum threshold and instead observe a wide range of R-squared values in these published articles. A reasonable interpretation is that there is no consensus on what constitutes an acceptably high (or unacceptably low) R-squared to warrant inclusion in (or exclusion from) a peer-reviewed economics journal. Another interesting observation is that almost half of the empirical articles (146 out of 315 papers, or 46 percent) do not even provide an R-squared statistic.

The chart below shows all reported instances of R-squared in the reviewed articles, including adjusted R-squared and pseudo R-squared.[7] Noticeably, the majority of reported models exhibit fairly “low” (and occasionally negative) R-squared values.

The second figure below shows (unadjusted) R-squared statistics from articles where the estimation methodology is, or appears to be, ordinary least squares (OLS).[8]

To further strengthen comparability, the figure below shows the distribution of standard R-squared statistics in instances where the authors explicitly identified the model as OLS. As we tighten the criteria for comparability, the distribution of R-squared statistics is increasingly skewed toward the lower end of the spectrum.

Interestingly, among the articles where the OLS methodology was explicit, only 6 percent comment on the R-squared statistic within the text, and those comments are invariably brief, generally doing no more than stating the R-squared value.

We also have not identified any article from our sample that explicitly cites the R-squared as
a criterion for preferring one model specification over another. Generally, the R-squared is
provided as one of several statistics describing the regression results with no discussion of
its significance within the text. Moreover, the value of the R-squared does not appear to be
a criterion for determining whether a regression analysis is fit for publication in peer-
reviewed economics journals.

Interpretation of the Results

The fact that the surveyed articles fail to show a minimum acceptable level of R-squared
should not be surprising for economists who recognize the limitations of interpreting the R-
squared value. As noted in a standard econometrics textbook, “low R-squareds in
regression equations are not uncommon, especially for cross-sectional analysis ... [A]
seemingly low R-squared does not necessarily mean that an OLS regression equation is
useless.”[9] In particular, a “high” R-squared is merely an indication that the model fits the
existing data well. By itself, it does not provide insight into whether the model is
economically or statistically meaningful, and it might in fact only reflect a strong but spurious
correlation (and not causation) between the dependent and explanatory variables.

Many examples of high R-squared values illustrate this point, such as the relationship between annual attendance at Fenway Park (the stadium of the Boston Red Sox) and the annual number of U.S. patent applications filed from 1995 to 2014, which has an R-squared of 0.8 (or 80 percent).[10] While the play of the Boston Red Sox may have inspired its share of inventors in the Boston area in recent decades, it is obviously not a significant driver of U.S. patent applications. In this case, the high R-squared value is economically meaningless and reflects spurious correlation.[11]
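The mechanics behind such spurious correlation are easy to reproduce with simulated data. The sketch below (pure Python; all numbers are hypothetical stand-ins, not the actual attendance or patent figures) regresses one trending series on another, causally unrelated, trending series:

```python
import random

def r_squared(x, y):
    # R-squared of a simple one-variable OLS fit of y on x
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / \
        sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    ssr = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - my) ** 2 for b in y)
    return 1 - ssr / sst

random.seed(0)
years = range(20)
# Hypothetical stand-ins: two series that share nothing but an upward trend
attendance = [2.5 + 0.05 * t + random.gauss(0, 0.05) for t in years]  # millions
patents = [200 + 20 * t + random.gauss(0, 15) for t in years]         # thousands

print(round(r_squared(attendance, patents), 2))  # high despite no causal link
```

The R-squared comes out high purely because both series drift upward over the sample period; it says nothing about any causal relationship between them.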

The value of R-squared will vary depending upon the type of economic analysis. For
example, time series analysis tends to result in a higher R-squared than cross-sectional
analysis, and it would be improper to compare R-squared values across disparate
regressions.[12] If the dependent variable is a nonstationary time series,[13] an R-squared
value close to one is unimpressive, and if very close to one, may be a troubling sign that
there are significant time patterns in the errors because a large part of the explanatory
power of the regression may rely on the time trend as opposed to economically interesting
variables. Alternatively, if the dependent variable is a stationary time series, then an R-
squared of 0.25 (or 25 percent), for example, may be reasonable so long as the model is
properly specified.[14]
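The nonstationarity point can be illustrated with a small simulation (hypothetical data): two independent series that share only a common upward drift have an R-squared near one in levels, but once the time pattern is removed by first-differencing, the R-squared collapses:

```python
import random

def r_squared(x, y):
    # R-squared of a simple one-variable OLS fit of y on x
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / \
        sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    ssr = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - my) ** 2 for b in y)
    return 1 - ssr / sst

random.seed(2)
x_lev, y_lev = [0.0], [0.0]
for _ in range(200):
    # Two independent random walks with the same deterministic drift
    x_lev.append(x_lev[-1] + 1 + random.gauss(0, 1))
    y_lev.append(y_lev[-1] + 1 + random.gauss(0, 1))

# First differences strip out the shared time trend
dx = [b - a for a, b in zip(x_lev, x_lev[1:])]
dy = [b - a for a, b in zip(y_lev, y_lev[1:])]

print(round(r_squared(x_lev, y_lev), 2))  # near 1: driven by the common drift
print(round(r_squared(dx, dy), 2))        # near 0: no true relationship
```

The high levels R-squared reflects only the shared time trend, which is exactly the troubling sign described above.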

It is important to bear in mind that often an analyst is not attempting to explain all or a high
proportion of the variation in the dependent variable. Rather, the analyst is often seeking to
assess whether the relationship between the explanatory and dependent variable is
economically material and statistically significant.[15] Given sufficient data it may still be
possible to reliably estimate the impact of individual explanatory variables even though the
model has a “low” R-squared.[16] For example, an econometric study analyzing the
relationship between selected variables and economic growth may yield a low R-squared,
but can reveal an economically meaningful (and potentially important) relationship between
one of the observed variables and economic growth, even if it leaves a large amount of
variation in the data unexplained.[17] It also is important to understand that R-squared is not
a proper measure of whether important explanatory variables have been omitted or whether
there is omitted variable bias.[18]
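A simulated example (hypothetical data) makes this concrete: even when idiosyncratic noise swamps the signal and R-squared is very low, a large enough sample still recovers the true coefficient on the explanatory variable:

```python
import random

def ols_slope_r2(x, y):
    # Slope and R-squared of a simple one-variable OLS fit of y on x
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((a - mx) * (c - my) for a, c in zip(x, y)) / \
        sum((a - mx) ** 2 for a in x)
    a0 = my - b * mx
    sst = sum((c - my) ** 2 for c in y)
    ssr = sum((c - (a0 + b * a)) ** 2 for a, c in zip(x, y))
    return b, 1 - ssr / sst

random.seed(1)
n = 5000
x = [random.gauss(0, 1) for _ in range(n)]
# True effect of x on y is 2, but the noise dominates the variation in y
y = [2 * xi + random.gauss(0, 10) for xi in x]

slope, r2 = ols_slope_r2(x, y)
print(f"slope ~ {slope:.2f}, R-squared ~ {r2:.2f}")
```

The R-squared is only a few percent (the noise variance is 25 times the signal variance), yet the estimated slope lands close to the true value of 2, because the large sample pins down the coefficient precisely.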

A second potential objective of regression modeling is to predict or forecast the dependent variable. In this context, a low R-squared may raise concerns about the accuracy of predictions generated by the model. However, even in this context, what one is really seeking is a confidence or prediction interval within which the analyst can be confident that the true value of the dependent variable will lie with a certain probability. Narrower intervals indicate more precise predictions. While a low R-squared suggests that prediction intervals may be wide, it is most appropriate to examine the actual intervals explicitly and consider whether their width is a problem given the degree of accuracy sought.
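As a sketch of what examining the intervals might look like in practice (made-up data, and a rough large-sample approximation that ignores parameter-estimation uncertainty), an approximate 95 percent prediction interval can be derived from the residual standard error of an OLS fit:

```python
import math

def fit_ols(x, y):
    # One-variable OLS fit returning intercept, slope, and
    # residual standard error s.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((a - mx) * (c - my) for a, c in zip(x, y)) / \
        sum((a - mx) ** 2 for a in x)
    a0 = my - b * mx
    resid = [c - (a0 + b * a) for a, c in zip(x, y)]
    s = math.sqrt(sum(e * e for e in resid) / (n - 2))
    return a0, b, s

# Hypothetical observed data
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [3.2, 4.1, 6.0, 6.8, 9.1, 9.9, 12.2, 12.8]
a0, b, s = fit_ols(x, y)

x_new = 9
pred = a0 + b * x_new
# Rough large-sample 95% interval, ignoring parameter uncertainty
print(f"forecast {pred:.1f} +/- {1.96 * s:.1f}")
```

The width of the interval, not the R-squared by itself, is what tells the analyst whether the forecast is precise enough for the purpose at hand.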

Finally, it would be improper to categorically favor one model simply because its specification yielded a higher R-squared than an alternative, and potentially warranted, specification. A focus on a “high” R-squared could lead an analyst to unduly discard a theoretically sound model in favor of another that achieves a higher R-squared by including arbitrarily chosen variables. Variables should be added to a model only with proper consideration of the cause-and-effect assumptions implicit in those variables, and one must be careful to examine how the additional variables change the estimated coefficients of the other variables. Targeting a high R-squared can make the interpretation of the overall model more difficult, mask relationships between key economic variables, and raise questions about the overall reliability of the model.[19]
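The mechanical reason that targeting R-squared is hazardous is that adding any regressor, even pure noise, can only raise (never lower) the R-squared of an OLS fit. The sketch below (pure Python, simulated data; a multivariate OLS solved via the normal equations) demonstrates this:

```python
import random

def ols_r2(X, y):
    """R-squared of a multivariate OLS fit, solving the normal
    equations (X'X) beta = X'y by Gaussian elimination.
    Each row of X includes a leading 1 for the intercept."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    # Forward elimination with partial pivoting
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(xtx[r][col]))
        xtx[col], xtx[piv] = xtx[piv], xtx[col]
        xty[col], xty[piv] = xty[piv], xty[col]
        for r in range(col + 1, k):
            f = xtx[r][col] / xtx[col][col]
            for c in range(col, k):
                xtx[r][c] -= f * xtx[col][c]
            xty[r] -= f * xty[col]
    # Back substitution
    beta = [0.0] * k
    for r in range(k - 1, -1, -1):
        beta[r] = (xty[r] - sum(xtx[r][c] * beta[c] for c in range(r + 1, k))) / xtx[r][r]
    my = sum(y) / len(y)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - my) ** 2 for yi in y)
    return 1 - ssr / sst

random.seed(3)
n = 30
x = [random.gauss(0, 1) for _ in range(n)]
y = [1 + 0.5 * xi + random.gauss(0, 1) for xi in x]

base = [[1.0, xi] for xi in x]
r2_base = ols_r2(base, y)

# Append a regressor of pure noise, unrelated to y by construction
junk = [random.gauss(0, 1) for _ in range(n)]
r2_junk = ols_r2([row + [j] for row, j in zip(base, junk)], y)

print(r2_junk >= r2_base)  # adding the junk variable never lowers R-squared
```

This is why adjusted R-squared penalizes the number of regressors, and why a raw R-squared comparison systematically rewards overfitted specifications.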

Verdict: Requiring a High R-Squared Is Not Supported

There is no support empirically or theoretically for accepting or dismissing a regression model on the basis of the R-squared value. There is no “good value” for R-squared, and a high R-squared value by itself cannot validate a regression model. In summary, R-squared should not be viewed as the bottom line for evaluating a regression, and rigorous analysis requires an evaluation of regression models that places the R-squared in the context of the quality and type of data being used, along with the overall theoretical justification of the variables within the model.

—By William Choi, Pablo Florian and Stuart Miller, AlixPartners LLP

William Choi, Ph.D., is a managing director in AlixPartners’ San Francisco office. Pablo
Florian is a vice president in the firm's Chicago office. Stuart Miller is a vice president in the
firm's Dallas office.

The opinions expressed are those of the author(s) and do not necessarily reflect the views
of the firm, its clients, or Portfolio Media Inc., or any of its or their respective affiliates. This
article is for general information purposes and is not intended to be and should not be taken
as legal advice.

[1] The interpretation of R-squared only applies for results from the ordinary least squares
(OLS) regression, which is the most commonly used regression model, particularly in
litigation.

[2] Specifically, it is the percentage of the variation in the dependent variable that is
explained by the regression model.

[3] Some brief comments on the purpose of regression models in litigation may be
appropriate. Suppose that, for example, the analyst wishes to assess the impact of a legal
dispute on a firm’s sales or the prices it pays for an input, such as due to a breach of
contract or anti-competitive behavior. Unfortunately, there may have been other factors
affecting sales or prices, so that the analyst cannot simply compare prices/sales before,
during and (if appropriate) after the dispute. The challenge for the analyst is therefore to
disentangle the effects of the legal dispute from other factors which may have affected sales
or prices, and this is where regression models can help. In this case, the dependent (or
response) variable would be sales or prices, and the explanatory (or independent or
predictor) variables would be those which potentially may have affected sales/prices (such
as costs and demand) as well as the impact of the dispute.
[4] Mary P. Valentino v. United States Postal Service. 511 F.Supp. 917 (1981) and Brenda
S. Griffin, et al. v. Board of Regents of Regency Universities, et al. 795 F.2d 1281 (1986).

[5] John J. Calandra, Michael D. Hall, and Sandra B. Saunders (2013), “U.S. Supreme
Court Again Strikes Down a Regression Model Offered for Class Certification: Is More
Rigorous Scrutiny on the Way?”, Bloomberg BNA: Expert Evidence Report.

[6] We reviewed papers in the American Economic Review, the American Economic Journal: Applied Economics, and the American Economic Journal: Economic Policy. Because the May issues of the American Economic Review contain the Papers and Proceedings of the Annual Meeting of the American Economic Association, we excluded them, as they tend to contain articles considering similar issues or topics. In total, our data reflect the content of 430 papers. Including the May issues of the AER would have added 224 articles to the database.

[7] An R-squared less than zero is possible in certain types of estimation techniques that
are not ordinary least-squares regressions, as well as for R-squared statistics that adjust for
the number of explanatory variables in the model.

[8] These include R-squared statistics for those instances where the authors did not specify
a non-OLS estimation methodology and the specification of the regression is such that OLS
is likely to have been employed.

[9] Wooldridge, Jeffrey M. Introductory Econometrics: A Modern Approach. 3rd edition, 2006, pp. 43-44.

[10] Calculated based on data from http://www.baseball-reference.com/teams/BOS/attend.shtml and http://www.uspto.gov/web/offices/ac/ido/oeip/taf/us_stat.htm

[11] For a number of entertaining spurious correlations, see http://www.tylervigen.com/spurious-correlations. The site provides data indicating an R-squared of 0.99 for the relationship between the annual divorce rate in Maine and the annual U.S. per capita consumption of margarine, and an R-squared of 0.96 for the relationship between the annual U.S. per capita consumption of mozzarella cheese and the annual number of civil engineering doctorate degrees awarded in the U.S.

[12] A time series analysis is a type of econometric analysis that, as the name suggests, is
based on changes in the values of given variables over time. In contrast, a cross-sectional
analysis is based on observed values for given variables at one point in time.

[13] A time series whose statistical properties, such as its mean, change over time; a series with a trend is a common example.

[14] However, we emphasize that caution is needed when evaluating a model with a low R-squared: the variables included in the model should be theoretically sound, and the data should not be contaminated by outliers or measurement errors that add unnecessary noise.

[15] The p-value is the probability of observing data at least as extreme as the available data if the null hypothesis were true, where the null hypothesis is typically that there is no effect, i.e., that the true value of the coefficient in question is zero. For example, if our null hypothesis is that a cartel had no effect on prices and the alternative hypothesis is that it did, then a p-value of 3 percent for the coefficient meant to capture the cartel effect means that the likelihood of observing data as extreme as the available data if the cartel had no effect is 3 percent, i.e., very unlikely, and the coefficient is held to be statistically different from zero.

[16] Wooldridge, Jeffrey M. Introductory Econometrics: A Modern Approach. 3rd edition, 2006, p. 207.

[17] Wooldridge, Jeffrey M. Introductory Econometrics: A Modern Approach. 3rd edition, 2006, pp. 43-44.

[18] Formal tools for assessing model specification include the likelihood ratio (LR) test, the Wald test, and the Lagrange multiplier (LM) test.

[19] When a statistical measure is used for decision-making, it ceases to be a good measure. This observation, commonly known as Goodhart’s Law after economist Charles Goodhart and first formulated by Donald T. Campbell, reflects the view that if it is known that the outcome of a policy will be assessed on the basis of a predetermined statistical measure, then the decision-makers being assessed will target that specific measure, possibly to the detriment of the wider aim of the policy. For example, assessing school performance solely on the basis of standardized achievement test results gives teachers an incentive to focus on students’ test performance. This outcome may not be desirable, particularly if the test scores are an imperfect reflection of the achievements they aim to quantify. Although test scores may rise as a result of such a policy, students’ ability to apply the taught material outside the setting of a standardized test may suffer, contrary to the policy’s objectives.
