Documente Academic
Documente Profesional
Documente Cultură
by
Jinook Jeong 1
School of Economics
Yonsei University
and
Daeyoung Jeong
School of Economics
Yonsei University
January 2010
Abstract
We propose an alternative Hausman test for normality in the Tobit model. Unlike
the previous tests by Ruud (1984) or Newey (1987), our test compares the Tobit estimator to
an estimator which is consistent both under the null hypothesis and under the alternative
hypothesis. To do so, it utilizes a purely nonparametric estimator for the Tobit model
proposed by Newey (1999b) and Jeong (2004). We present a Monte Carlo simulation to
compare the finite sample properties of our test to the previous tests.
Acknowledgement: We are grateful for the helpful comments on the earlier draft from
Graduate Seminars on Contemporary Economics at Yonsei University. The authors are
members of Brain Korea 21 Research Group of Yonsei University.
Corresponding Author, 1) address: School of Economics, Yonsei University, Seoul, Korea 120-749, 2) phone:
+82-2-2123-2493, 3) email: jinook@yonsei.ac.kr
I. Introduction
The maximum likelihood estimation of censored regression model has been named
Tobit after Tobin (1958).
on the assumption of normality. Arabmazar and Schmidt (1982) and Goldberger (1983)
among others show that the Tobit estimator becomes inconsistent when the normal
distribution assumption of the disturbance is not satisfied.
Lee (1984), Chesher and Irish (1987), and Pagan and Vella (1989) propose various types of
score tests.
He compares a method of
moment estimator and the maximum likelihood estimator to construct a Hausman test. 2
Ruud (1984) uses a probit estimator, truncated regression estimator, and the Tobit estimator
in constructing a similar pseudo-Hausman test. 3
Hausman-type test using the difference between Powells (1986) symmetrically censored
least squares (SCLS) estimator and the Tobit estimator.
However, the finite sample performance of the Hausman tests for normality is far
from perfect.
observations, Neweys test severely over-rejects the true null hypothesis, and significantly
under-rejects the false null hypothesis. Holden (2004) evaluates the performance of Ruuds
(1984) Hausman test which compares the probit estimator and the Tobit estimator.
2
When
Nelsons test is not exactly Hausman test because his second estimator is not consistent under the alternative
hypothesis. Although a Hausman-type test could be defined with an inconsistent second estimator, the original
Hausman test statistic uses a consistent estimator under both the null hypothesis and the alternative hypothesis
for the second estimator.
3
Nelsons (1981) test could be interpreted as a special case of Ruuds (1984).
2
the sample size is medium (150 observations), the size is inaccurate and the power is very
low.
Even when the sample size is large (900 observations), the size and power are not
satisfactory.
The size distortion and power reduction of the Hausman tests are probably due to the
bad design of the test statistics. Ruuds test compares the probit estimator and the Tobit
estimator.
Both the probit and Tobit estimators are consistent under the normality (H0).
The original
Hausman test compares two estimators, one of which is consistent only under H0, and the
other of which is consistent both under H0 and H1.
Of course, as Nelson
(1981) and Ruud (1984) argue, it is possible that Hausman test could be well defined even
with estimators both are inconsistent under H1.
depends on the actual difference of the two estimates even in such a case.
Neweys test also has a limitation. Neweys test employs the SCLS estimator for
the second estimator to be compared with the Tobit estimator.
Thus, if the
Again, of
course, asymptotically Hausman test has no problem if the probability limits of the estimators
are different under H1.
It utilizes a purely
nonparametric estimator for the Tobit model to avoid the inconsistency of the second
estimator under H1.
i = 1, 2, , n
di* = Zi + vi
(1)
(2)
yi = yi* and di = 1
if di* 0
yi = di = 0
if di* < 0
where Xi has k variables, Zi has m variables, and (ui, vi) is assumed to have a serially
independent bivariate normal distribution with a mean of zero and a variance matrix of
uv
uv
for all i. With the normality assumption, the model can be estimated by the
1
developed by Gourieroux, et al. (1987) and Chesher and Irish (1987), is defined to replace the
unobservable residual in latent variable models.
This simple Tobit model is sometimes called standard Tobit model or Type 1 Tobit model after Amemiya
(1985).
4
defined as:
e i E (ei | yi )
(3)
When the errors of the observed model have non-zero mean due to the limited
observability in the model, the non-zero mean is often a source of bias in estimation.
In
such cases, the generalized residual can capture the bias caused by the non-zero mean of
errors and make the model estimable.
by OLS using only the uncensored data (di* 0), the estimate of will be biased because the
conditional mean of ui is not zero.
term (ei) of the following regression will have a zero mean, and the equation can be estimated
by OLS..
yi= X i + uv v i + ei
(4)
Assuming
normality, the generalized residual becomes the well-known inverse Mills ratio.
v i =
(Zi )
(Zi )
(5)
where is the pdf of the standard normal distribution, and is the cdf of the standard normal
distribution.
In this case, the estimation of equation (4) becomes the Heckit or Heckmans
two-step estimation. 5
If the normality assumption is not satisfied, the generalized residual will take an
If X and Z have exactly the same variables, is only identified by the nonlinearity of the generalized residual.
However, as the generalized residual is often close to linear, such practice may create identification problem.
See Little (1985).
5
unknown form.
equation (2) is estimated by some nonparametric estimation method, such as maximum score
estimation by Manski (1975, 1985), a distribution-free maximum likelihood estimation by
Cosslett (1983, 1987), smoothed maximum score estimation by Horowitz (1992), or single
index estimation by Klein and Spady (1993) and Ichimura (1993), etc.
(6)
In the
y i = x i + g i + e i
(7)
Equation (7) is estimated by OLS, and then the standard error of the estimated , NP , is
computed through bootstrapping. 6
properties, although Newey (1999b) derives the asymptotic variance of NP under somewhat
strong regularity conditions. 7
finite samples, and that it has more accurate sizes and higher powers than the traditional ones
when the underlying distribution is nonnormal.
NP
Vella (1998)
Hausman test statistic with the above nonparametric estimator, NP , and the traditional Tobit
estimator, Tobit .
the Tobit estimator is consistent and efficient under H0, and inconsistent under H1.
The
1
H= n(
NP Tobit ) 'V ( NP Tobit )
is an estimator of the asymptotic variance (V) of the difference,
where V
(8)
n (NP Tobit ) .
distribution of H for the test of normality, to improve its finite sample properties.
estimate V also through a nested bootstrap.
bootstrap.
We
then HB is bootstrapped again in the second stage, to produce the critical values of the
1
normality test, where H B= n(
NP Tobit ) 'VB ( NP Tobit ) .
As the double bootstrap requires a considerable computing time, we propose an
alternative bootstrap procedure.
.
necessary to normalize the statistic with V
(
As this statistic is not pivotal,
NP Tobit ) '( NP Tobit ) , can be bootstrapped for the test.
however, its bootstrap distribution would have skewness and fat tails. 8
We apply Efrons
1 ( ( z[ ])), G
1 ( ( z[1 ]))]
[ G
(9)
where G is the cdf of the bootstrap distribution, is the cdf of the standard normal
distribution, and
(z 0 + z (i ) )
z[i] = z 0 +
1 a 0 (z 0 + z (i ) )
(i = or 1 )
(10)
In (10), z ( ) is the -level critical value of the standard normal distribution, z0 is bias
constant for bias correction, and a0 is the acceleration constant for variance stabilization.
z0 and a0 are computed from the sample as
( ))
z 0 = 1 (G
a0 =
1 3i
6 i2 3 / 2
( )
(11)
(12)
i = lim
In (13), = t (F) .
mass at xi.
point xi.
(13)
derive the critical point of the squared sum of the differences for the test of normality. 10
+ ) , the
For t ((1 ) F
i
statistic of interest, for instance the D-W statistic, is computed using the (n+1) observations. Alternatively,
may be set to a smaller value to give more accuracy to the influence function. In such cases, the empirical
empirical distribution.
test, one with the single index estimation by Klein and Spady (1993) for the binary regression,
and the other with the smoothed maximum score estimation by Horowitz (1992) for the
binary regression, respectively, as explained in the last section.
accuracy of the empirical sizes and empirical powers of the three tests.
The simulation
(14)
(15)
yi = yi* and di = 1
if di* 0
yi = di = 0
if di* < 0
where we set 0 = 1 = 2 = 0 = 1 = 2 = 1.
independently created from a uniform distribution with a mean of and a variance of 10,
respectively. 11
X2 = Z2.
In creating Xs and Zs, to mimic the reality better, one constraint is imposed:
distributions are used for (ui, vi) in the simulation: a bivariate normal distribution, a chisquare distribution, a symmetric mixed-normal distribution, and an asymmetric mixednormal distribution.
The parameters used for the distribution of (ui, vi) are: E(ui) = E(vi) = 0,
11
In order to keep about 15 percent zeros in the dependent variable, and to look into more general cases, we set
at 0, 0.1, 0.25, and 0.45.
12
If we alternatively assume that there is no overlap in Xs and Zs, the proposed nonparametric bootstrap
procedure will be easier because the two-step estimator will be consistent despite any misspecification of
distribution, as Newey (1999a) proves. However, in practice, it is safer to assume that there could be common
variables in Xs and Zs.
9
The
number of bootstrap replications is set to 1,000, and the simulation results are based on 1,000
simulations.
Table 1 presents a preliminary simulation result for the sample size of 50.
In the
table, NEWEY stands for the bootstrap version of Neweys test, and NP-KS for the
nonparametric bootstrap test with the Klein-Spady estimator. 13
Difference in 1
Difference in 2
Squared Sum of the
Differences
NP-KS
normal
mixed-normal
0.063
0.100
0.063
0.170
NEWEY
normal
mixed-normal
0.000
0.070
0.030
0.130
0.063
0.000
0.130
0.060
V. Conclusion
We propose an alternative Hausman test for normality in the Tobit model.
Unlike
the previous tests by Ruud (1984) or Newey (1987), our test compares the Tobit estimator to
an estimator which is consistent both under the null hypothesis and under the alternative
hypothesis.
H1 is a potential source of distorted finite sample properties of the previous tests, the
proposed test is expected to have better small sample performance.
We present a Monte
Carlo simulation to compare the finite sample properties of our test to the previous tests.
13
These are all based on the BCa confidence interval of the squared sum of the differences.
10
Reference
Chesher, A.D. and A. Irish (1987), Numerical and Graphical Residual Analysis in the
Grouped and Censored Normal Linear Model, Journal of Econometrics 34, pp.33-54.
Cosslett, S.R. (1983), Distribution-Free Maximum Likelihood Estimator of the Binary
Choice Model, Econometrica 51, pp.765-782.
Cosslett, S.R. (1987), Efficiency Bounds for Distribution-Free Estimators of the Binary
Choice and Censored Regression Models, Econometrica 55, pp.559-585.
Efron, B. (1979), Bootstrap Methods: Another Look at the Jackknife, Annals of Statistics 7,
pp.1-26.
Efron, B. (1981), Censored Data and the Bootstrap, Journal of the American Statistical
Association 76, pp.312-319.
Efron, B. (1987), Better Bootstrap Confidence Intervals, Journal of the American
Statistical Association 82, pp. 171-85.
Ericson, P. and J. Hansen (1999), A Note on the Performance of Simple Specification Tests
in the Tobit Model, Oxford Bulletin of Economics and Statistics 61, pp. 121127.
Goldberger (1983), Abnormal Selection Bias, in S. Karlin, T. Amemiya, and L. Goodman,
eds., Studies in Econometrics, Time Series and Multivariate Statistics, Academic Press, new
York.
Gourieroux, C., A. Monfort, E. Renault, and A. Trognon (1987), Generalised Residuals,
Journal of Econometrics 34, pp.5-32.
Heckman, J. (1976), The Common Structure of Statistical Models of Truncation, Sample
Selection and Limited Dependent Variables and a Simple Estimator for Such Models,
Annals of Economic and Social Measurement 5, pp.475-492.
Heckman, J. (1979), Sample Selection Bias as a Specification Error, Econometrica 47,
pp.153-161.
Horowitz, J.L. (1992), A Smoothed Maximum Score Estimator for the Binary Response
Model, Econometrica 60, pp.505-531.
Horowitz, J.L. (1997), Bootstrap Methods in Econometrics: Theory and Numerical
Performance, in Advances in Economics and Econometrics: Theory and Applications, 7th
World Congress Vol. 3, (eds) by K.M. Kreps and K.F. Wallis, Cambridge University Press.
Horowitz, J.L. (2000), The Bootstrap, Handbook of Econometrics V, North-Holland.
Hurd (1979) Estimation in Truncated Samples When There is Heteroscedasticity, Journal
of Econometrics, 11, pp. 247-58
11
12