
Fixed vs. Random Effects Panel Data Models: Revisiting the Omitted Latent Variables and Individual Heterogeneity Arguments

Aris Spanos
Department of Economics, Virginia Tech, Blacksburg, VA 24061, USA

May 2008

Abstract

One of the most crucial questions in panel data modeling concerns the choice between Fixed (FEP) and Random Effects Panel (REP) data models. A variety of arguments have been proposed in the literature on choosing between the two formulations, but none of them makes a clear case for the circumstances under which each of these models will be appropriate. The primary objective of this paper is to put forward such an argument based on statistical adequacy grounds. This is achieved by relating the error assumptions to the probabilistic structure of the stochastic process underlying the observed data. It is argued that the omitted latent variables argument, when properly interpreted, provides an elucidating interpretation for the REP model, and the individual heterogeneity argument does the same for the FEP model. The recast specification of the REP model is used to shed light on several issues relating to the Mundlak auxiliary regression formulation.

This is a preliminary draft; please do not quote without the permission of the author.

1 Introduction

One of the most crucial questions in panel data modeling concerns the appropriateness of the fixed vs. random effects formulations in modeling individual heterogeneity in such data. It turns out that the choice between them has important implications for the consistency and/or efficiency of the estimators of the parameters of interest; see Arellano (2003), Baltagi (2005), Hsiao (2003). Despite its importance, there is a dearth of convincing arguments in the literature concerning the circumstances under which each formulation is appropriate. As a result, the two formulations are often used interchangeably, with the final choice often made using a Hausman-type specification test. However, as argued by Baltagi (2005), relating the result of the latter test to the choice between the two formulations raises its own problems, rendering the issue anything but clear cut.

In this paper it is argued that the source of the problem of choosing between fixed vs. random effects models lies with the specification of these panel data models in terms of unobservable error terms:
$$u_{it} = c_i + \varepsilon_{it}, \quad i \in \mathbb{N},\ t \in \mathbb{T}, \qquad (1)$$
where $c_i$ denotes unobserved individual-specific effects (fixed or stochastic), and $\varepsilon_{it}$ denotes the remaining non-systematic effects. The two sets of probabilistic assumptions comprising the fixed and random effects models seem arbitrary because their validity cannot be ascertained at the specification stage (when choosing the model), since the error term is unobservable. Hence, the key to addressing this problem is to render the choice between the two formulations ascertainable vis-a-vis the data. This can be achieved by recasting the fixed and random effects models in terms of probabilistic assumptions pertaining to the observable stochastic processes involved, say $\{Z_{it} := (y_{it}, X_{it}),\ i \in \mathbb{N},\ t \in \mathbb{T}\}$, and not the error term process $\{u_{it},\ i \in \mathbb{N},\ t \in \mathbb{T}\}$. This recasting transforms the problem of fixed vs. random effects models into one of statistical adequacy [are the model assumptions pertaining to $\{Z_{it},\ i \in \mathbb{N},\ t \in \mathbb{T}\}$ valid for the particular data?]. By providing a complete set of probabilistic assumptions in terms of the structure of the process $\{Z_{it},\ i \in \mathbb{N},\ t \in \mathbb{T}\}$, and specifying the parameterizations of the unknown parameters for each model explicitly, one can discuss the similarities and differences between the different panel data models, including when the various estimators are consistent/inconsistent and efficient/inefficient. In addition, this recasting can elucidate the alternative interpretations of these models based on the omitted latent variables and/or the individual heterogeneity renditions. In particular, it is shown that the omitted latent variables interpretation is inappropriate for the fixed effects model, but it can provide an elucidating perspective for the random effects model. On the other hand, the individual heterogeneity interpretation can provide a coherent interpretation for the fixed effects model.

2 Textbook perspective on Panel Data models

It is widely appreciated that panel data models offer two major advantages for empirical modeling in economics: for reasonable values of N and T the sample size NT is quite large, which creates an opportunity for (a) less restrictive statistical models defining the premises of inference, and (b) enhanced reliability and precision of statistical inference. What is not widely appreciated is that statistical misspecification, that is, the probabilistic assumptions comprising these models being invalid for the data in question, will abnegate both of these potential advantages.

2.1 Specification, estimation and testing


The basic two statistical models for panel data are given in tables 1-2 below.

Table 1: Fixed Effects Panel (FEP) data model
$$y_{it} = c_i + x_{it}^{\top}\beta + \varepsilon_{it}, \quad i \in \mathbb{N},\ t \in \mathbb{T},$$
[i] $E(\varepsilon_{it})=0$, [ii] $E(\varepsilon_{it}^{2})=\sigma_{\varepsilon}^{2}$, [iii] $E(\varepsilon_{it}\varepsilon_{js})=0$, for $t \neq s$, $i \neq j$, $i,j \in \mathbb{N}$, $t,s \in \mathbb{T}$.

Table 2: Random Effects Panel (REP) data model
$$y_{it} = x_{it}^{\top}\beta + \alpha_i + \varepsilon_{it}, \quad i \in \mathbb{N},\ t \in \mathbb{T},$$
[i] $E(\varepsilon_{it})=0$, [ii] $E(\varepsilon_{it}^{2})=\sigma_{\varepsilon}^{2}$, [iii] $E(\varepsilon_{it}\varepsilon_{js})=0$, for $t \neq s$, $i \neq j$,
[iv] $E(\alpha_i)=0$, [v] $E(\alpha_i^{2})=\sigma_{\alpha}^{2}$, [vi] $E(\alpha_i\alpha_j)=0$, for $i \neq j$, [vii] $E(\varepsilon_{it}\alpha_j)=0$, for all $i,j \in \mathbb{N}$, $t \in \mathbb{T}$,

where $\mathbb{N}:=\{1,2,\ldots,N,\ldots\}$ and $\mathbb{T}:=\{1,2,\ldots,T,\ldots\}$ denote the cross-section and time dimensions. These two models are usually viewed by the literature as interchangeable, with the components $c_i$ and $\alpha_i$ representing generic terms which aim to capture the cross-section heterogeneity. On the nature of the heterogeneity terms, the prevailing view is that both should be treated as random variables, with the crucial difference being that the REP model raises the possibility that $X_{it}$ might be correlated with $\alpha_i$; see Wooldridge (2002). As they stand, the probabilistic assumptions of the two models are not ascertainable vis-a-vis the data $Z_0 := (y : X)$ because the terms $(c_i, \alpha_i, \varepsilon_{it})$ are unobservable.

Using an obvious notation for all $T$ observations, the FEP data model, as specified in table 1, takes the form:
$$y_i = X_i\beta + \mathbf{1}_T c_i + \varepsilon_i, \quad i = 1, \ldots, N,$$
where $y_i$ is $(T \times 1)$, $X_i$ is $(T \times k)$, $\mathbf{1}_T := (1,1,\ldots,1)^{\top}$ and $\varepsilon_i$ is $(T \times 1)$. One can express this model for all $TN$ observations in matrix notation as:
$$\underbrace{\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix}}_{y} = \underbrace{\begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_N \end{pmatrix}}_{X}\beta + \underbrace{\begin{pmatrix} \mathbf{1}_T & 0 & \cdots & 0 \\ 0 & \mathbf{1}_T & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \mathbf{1}_T \end{pmatrix}}_{D}\underbrace{\begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_N \end{pmatrix}}_{c} + \underbrace{\begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_N \end{pmatrix}}_{\varepsilon},$$
i.e. $y = X\beta + Dc + \varepsilon$.

The OLS (Dummy Variables) estimators of $(\beta, c, \sigma_{\varepsilon}^{2})$, with $\hat{\varepsilon} = y - X\hat{\beta} - D\hat{c}$, are:
$$\hat{\beta} = (X^{\top}M_D X)^{-1}X^{\top}M_D y, \quad \hat{c} = (D^{\top}M_X D)^{-1}D^{\top}M_X y, \quad s^{2} = \frac{\hat{\varepsilon}^{\top}\hat{\varepsilon}}{NT - k - N},$$
where the projection matrices $(M_D, M_X)$ are $M_D = I - D(D^{\top}D)^{-1}D^{\top}$ and $M_X = I - X(X^{\top}X)^{-1}X^{\top}$.

Expressing the REP model for all $NT$ observations takes the form:
$$\underbrace{\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix}}_{y} = \underbrace{\begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_N \end{pmatrix}}_{X}\beta + \underbrace{\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_N \end{pmatrix}}_{u}, \qquad \Omega = \big[E(u_i u_j^{\top})\big]_{i,j} = \begin{pmatrix} \Sigma & 0 & \cdots & 0 \\ 0 & \Sigma & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \Sigma \end{pmatrix} = (I_N \otimes \Sigma),$$
where $\Omega = (I_N \otimes \Sigma)$ is a $(TN \times TN)$ matrix and:
$$\Sigma = E(u_i u_i^{\top}) = \begin{pmatrix} \sigma_{\alpha}^{2}+\sigma_{\varepsilon}^{2} & \sigma_{\alpha}^{2} & \cdots & \sigma_{\alpha}^{2} \\ \sigma_{\alpha}^{2} & \sigma_{\alpha}^{2}+\sigma_{\varepsilon}^{2} & \cdots & \sigma_{\alpha}^{2} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{\alpha}^{2} & \sigma_{\alpha}^{2} & \cdots & \sigma_{\alpha}^{2}+\sigma_{\varepsilon}^{2} \end{pmatrix}.$$
Hence, the GLS estimator of $\beta$ takes the form:
$$\widetilde{\beta} = (X^{\top}\Omega^{-1}X)^{-1}X^{\top}\Omega^{-1}y.$$
In practice, $\Omega$ is unknown and needs to be estimated using a feasible GLS procedure to yield $\widetilde{\beta}_F = (X^{\top}\widehat{\Omega}^{-1}X)^{-1}X^{\top}\widehat{\Omega}^{-1}y$; see Baltagi (2005), Hsiao (2002). The choice between the FEP and REP models in tables 1 and 2 is often based on a Hausman-type specification test whose test statistic relies on the distance between the two estimators:
$$H(z) = (\hat{\beta} - \widetilde{\beta}_F)^{\top}\big[\mathrm{Cov}(\hat{\beta}) - \mathrm{Cov}(\widetilde{\beta}_F)\big]^{-1}(\hat{\beta} - \widetilde{\beta}_F).$$
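To make the estimators above concrete, the following is a minimal numerical sketch (my own illustration, not code from the paper): it computes the Dummy Variables (within/LSDV) estimator, a simple feasible GLS estimator based on the error-components form of $\Sigma$, and the resulting Hausman-type distance statistic. The simulated design, the variable names, and the particular variance-component estimators are assumptions chosen for clarity; other feasible GLS conventions exist.

```python
# Illustrative sketch (not from the paper): within/LSDV estimation, a simple feasible
# GLS (error-components) estimator, and the Hausman-type distance statistic.
import numpy as np

rng = np.random.default_rng(0)
N, T, k = 200, 8, 2
beta = np.array([1.0, -0.5])

c = rng.normal(0, 1.0, size=N)                             # individual effects
X = rng.normal(size=(N, T, k)) + 0.7 * c[:, None, None]    # regressors correlated with c
y = X @ beta + c[:, None] + rng.normal(size=(N, T))

# --- Fixed effects (within / LSDV) estimator ---
Xw = (X - X.mean(axis=1, keepdims=True)).reshape(-1, k)    # within-group demeaning
yw = (y - y.mean(axis=1, keepdims=True)).reshape(-1)
b_fe = np.linalg.solve(Xw.T @ Xw, Xw.T @ yw)
s2_e = np.sum((yw - Xw @ b_fe) ** 2) / (N * (T - 1) - k)
V_fe = s2_e * np.linalg.inv(Xw.T @ Xw)

# --- Feasible GLS (random effects) estimator ---
Xb = np.column_stack([np.ones(N), X.mean(axis=1)])         # between regression
yb = y.mean(axis=1)
g = np.linalg.lstsq(Xb, yb, rcond=None)[0]
s2_b = np.sum((yb - Xb @ g) ** 2) / (N - k - 1)
s2_a = max(s2_b - s2_e / T, 0.0)                           # sigma_alpha^2 estimate
Si = np.linalg.inv(s2_a * np.ones((T, T)) + s2_e * np.eye(T))
XtOX, XtOy = np.zeros((k + 1, k + 1)), np.zeros(k + 1)
for i in range(N):                                         # block-diagonal GLS sums
    Xi = np.column_stack([np.ones(T), X[i]])
    XtOX += Xi.T @ Si @ Xi
    XtOy += Xi.T @ Si @ y[i]
b_re_full, V_re_full = np.linalg.solve(XtOX, XtOy), np.linalg.inv(XtOX)
b_re, V_re = b_re_full[1:], V_re_full[1:, 1:]              # slope coefficients only

# --- Hausman-type statistic on the slopes ---
d = b_fe - b_re
H = d @ np.linalg.solve(V_fe - V_re, d)
print("true beta  :", beta)
print("FE (within):", b_fe.round(3), " RE (FGLS):", b_re.round(3), " H =", round(H, 2))
```

Because this design makes $X_{it}$ correlated with the individual effects, the two sets of estimates differ noticeably and the statistic is large; under a design with effects uncorrelated with the regressors they would be close.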

In addition to the above OLS and GLS estimators, several other estimators have been proposed in the literature, including the pooled OLS, the between, the within and the first-difference estimators, with the discussion focusing on the circumstances under which each of these estimators is consistent and/or efficient; see Cameron and Trivedi (2005). The focus of this discussion revolves around whether the error term in each model is orthogonal to the explanatory variables $X_{it}$ or not (orthogonality ensures the consistency of the estimator) and around the form of the covariance matrix.

The conventional wisdom in textbook econometrics dismisses any arguments for securing the statistical adequacy of estimated models before any inferences are drawn, by counter-arguing that all one needs are consistent estimators of $(\beta, \sigma^{2})$. For a large enough $n$, one does not need to worry about misspecification, because one can always invoke (under some general conditions) asymptotic Normality to provide the basis for inference. To double-insure, one can deploy misspecification-consistent estimators for certain parameters, such as the conditional variances. This counter-argument is clearly misleading because there are no limit theorems which do not invoke probabilistic assumptions. Moreover, certain misspecifications, like heterogeneity, worsen the reliability of inference as $n \to \infty$.

Consistent estimator of what? What is not clearly spelled out in this literature is the explicit form of the coefficients that $(\hat{\beta}, \widetilde{\beta})$ are consistent estimators of. To bring out the potential problem, consider the simple linear model:
$$y = X\beta + u, \quad u \sim \text{NIID}(0, \sigma_u^{2}I_n),$$
where $E(u \mid X) = 0$ but also $E(u \mid Z) = 0$; $\text{Cov}(Z_t, X_t) = \Sigma_{32} \neq 0$, $\text{Cov}(Z_t) = \Sigma_{33} > 0$; $y: (T \times 1)$, $X: (T \times k)$, $Z: (T \times k)$, $\text{rank}(X^{\top}X) = k = \text{rank}(Z^{\top}X)$. The conventional wisdom suggests that both the OLS estimator $\hat{\beta} = (X^{\top}X)^{-1}X^{\top}y$ and the Instrumental Variables (IV) estimator $\widetilde{\beta} = (Z^{\top}X)^{-1}Z^{\top}y$ are consistent estimators of $\beta$. As the traditional argument goes, one can substitute $y$ into the estimators and show that:
$$\hat{\beta} = \beta + \left(\tfrac{X^{\top}X}{n}\right)^{-1}\tfrac{X^{\top}u}{n}, \qquad \widetilde{\beta} = \beta + \left(\tfrac{Z^{\top}X}{n}\right)^{-1}\tfrac{Z^{\top}u}{n}.$$
Consistency is then derived (mainly) from the fact that:
$$\text{plim}\left(\tfrac{X^{\top}u}{n}\right) = 0, \qquad \text{plim}\left(\tfrac{Z^{\top}u}{n}\right) = 0.$$
This argument, however, is highly misleading because $\hat{\beta}$ and $\widetilde{\beta}$ are consistent estimators of very different parameters. In particular,
$$\hat{\beta} \overset{P}{\to} \beta^{*} = \Sigma_{22}^{-1}\sigma_{21}, \qquad \widetilde{\beta} \overset{P}{\to} \beta^{\circ} = \Sigma_{32}^{-1}\sigma_{31},$$
where $\Sigma_{22} = \text{Cov}(X_t)$, $\sigma_{21} = \text{Cov}(X_t, y_t)$, $\Sigma_{32} = \text{Cov}(Z_t, X_t)$ and $\sigma_{31} = \text{Cov}(Z_t, y_t)$, with $\beta^{*} \neq \beta^{\circ}$ in general. The proper way to view the consistency argument is to focus on the observables (not the errors), with the argument taking the form:
$$\hat{\beta} = (X^{\top}X)^{-1}X^{\top}y = \left(\tfrac{X^{\top}X}{n}\right)^{-1}\tfrac{X^{\top}y}{n} \overset{P}{\to} \Sigma_{22}^{-1}\sigma_{21}, \qquad \widetilde{\beta} = (Z^{\top}X)^{-1}Z^{\top}y = \left(\tfrac{Z^{\top}X}{n}\right)^{-1}\tfrac{Z^{\top}y}{n} \overset{P}{\to} \Sigma_{32}^{-1}\sigma_{31}.$$
These results follow from $\text{plim}(X^{\top}X/n) = \Sigma_{22}$, $\text{plim}(X^{\top}y/n) = \sigma_{21}$, $\text{plim}(Z^{\top}X/n) = \Sigma_{32}$ and $\text{plim}(Z^{\top}y/n) = \sigma_{31}$, in conjunction with Slutsky's theorem; see White (2001). In general, claims about consistency of estimators in panel data models should be treated with caution when they rely on assumptions concerning the orthogonality of certain variables with the error term, but are not accompanied by the underlying parameterizations.
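The following short simulation (my own sketch; the design and numbers are arbitrary assumptions) illustrates the point: both estimators settle down, but to different functions of the second moments of the observables, so "consistency" is empty until the implicit parameterization is spelled out.

```python
# Sketch (assumed design, not from the paper): OLS and IV converge to different
# functions of the observables' second moments,
#   OLS -> Cov(X)^{-1} Cov(X, y)   and   IV -> Cov(Z, X)^{-1} Cov(Z, y).
import numpy as np

rng = np.random.default_rng(1)
n = 500_000
u, v, w = rng.normal(size=(3, n))
x = u + w                     # observed regressor
z = u                         # instrument, uncorrelated with the error below
err = w + 0.5 * v             # error correlated with x through w
y = 1.0 * x + err             # coefficient is 1.0 in this parameterization

b_ols = (x @ y) / (x @ x)     # scalar OLS slope (all variables have mean zero)
b_iv = (z @ y) / (z @ x)      # scalar IV slope

# population limits implied by the assumed moments:
# Cov(x,y)/Var(x) = 3/2 = 1.5   and   Cov(z,y)/Cov(z,x) = 1/1 = 1.0
print("OLS estimate:", round(b_ols, 3), "(limit 1.5)")
print("IV  estimate:", round(b_iv, 3), "(limit 1.0)")
```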

3 The Error Statistical (ES) perspective

The Error Statistical (ES) perspective views statistical models as parameterizations of the probabilistic structure of the underlying observable stochastic process $\{Z_{it} := (y_{it}, X_{it}),\ i \in \mathbb{N},\ t \in \mathbb{T}\}$, defined on a probability space $(S, \mathcal{F}, P(\cdot))$. In particular, given that this probabilistic structure can be fully described using the joint distribution $D(Z_{11}, Z_{21}, \ldots, Z_{NT}; \varphi)$, one can view statistical models as arising as reductions from this joint distribution; see Spanos (1986, 1999). To illustrate this perspective, consider the simplest statistical model for panel data.

3.1 The Pooled Panel Data (PPD) model

Let us assume that the vector stochastic process $\{Z_{it},\ i \in \mathbb{N},\ t \in \mathbb{T}\}$ is Normal, Independent and Identically Distributed (NIID). The NIID reduction assumptions imply that the joint distribution of this process, say $D(Z_{11}, Z_{21}, \ldots, Z_{NT}; \varphi)$, can be simplified as follows:
$$D(Z_{11}, Z_{21}, \ldots, Z_{NT}; \varphi) \overset{\text{I}}{=} \prod_{i=1}^{N}\prod_{t=1}^{T} D_{it}(Z_{it}; \varphi(i,t)) \overset{\text{IID}}{=} \prod_{i=1}^{N}\prod_{t=1}^{T} D(Z_{it}; \varphi) = \prod_{i=1}^{N}\prod_{t=1}^{T} D(y_{it} \mid x_{it}; \varphi_1)\,D(X_{it}; \varphi_2). \qquad (2)$$
In addition, Normality implies that $D(y_{it}, X_{it}; \varphi)$ is:
$$\begin{pmatrix} y_{it} \\ X_{it} \end{pmatrix} \sim N\!\left(\begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix},\ \begin{pmatrix} \sigma_{11} & \sigma_{21}^{\top} \\ \sigma_{21} & \Sigma_{22} \end{pmatrix}\right), \quad i \in \mathbb{N},\ t \in \mathbb{T}.$$
Hence, the regression and skedastic functions take the form:
$$E(y_{it} \mid X_{it} = x_{it}) = \beta_0 + x_{it}^{\top}\beta, \qquad Var(y_{it} \mid X_{it} = x_{it}) = \sigma_u^{2}, \qquad (3)$$
where the model parameters are:
$$\beta_0 = \mu_1 - \beta^{\top}\mu_2, \quad \beta = \Sigma_{22}^{-1}\sigma_{21}, \quad \sigma_u^{2} = \sigma_{11} - \sigma_{21}^{\top}\Sigma_{22}^{-1}\sigma_{21}. \qquad (4)$$
The reduction in (2) gives rise to a statistical model known as the pooled panel data (PPD) model:
$$y_{it} = \beta_0 + x_{it}^{\top}\beta + u_{it}, \quad (u_{it} \mid x_{it}) \sim \text{NIID}(0, \sigma_u^{2}), \quad i \in \mathbb{N},\ t \in \mathbb{T}. \qquad (5)$$

The complete specification of this model in terms of the observable stochastic processes is given in table 3. An important dimension of the specification of statistical models is the statistical Generating Mechanism (GM) (table 3). This is based on an orthogonal decomposition of the form $y_{it} = \mu_{it} + u_{it}$, for $i \in \mathbb{N}$, $t \in \mathbb{T}$, with $\mu_{it} = E(y_{it} \mid \mathcal{D}_{it})$ and $u_{it} = y_{it} - E(y_{it} \mid \mathcal{D}_{it})$, where $\mu_{it}$ represents the systematic and $u_{it}$ the non-systematic (error) component, with $\mathcal{D}_{it} \subset \mathcal{F}$ specifying the relevant conditioning information set that would render $\{(u_{it}, \mathcal{D}_{it}),\ i \in \mathbb{N},\ t \in \mathbb{T}\}$ a Martingale Difference (MD) process; see White (2001). In the case of the PPD model (table 3) the statistical GM was based on $\mathcal{D}_{it} = \{X_{it} = x_{it}\}$.

Table 3 - Pooled Panel Data (PPD) model
Statistical GM: $y_{it} = \beta_0 + x_{it}^{\top}\beta + u_{it}$, $i \in \mathbb{N}$, $t \in \mathbb{T}$.
[1] Normality: $(y_{it} \mid X_{it} = x_{it}) \sim N(\cdot, \cdot)$,
[2] Linearity: $E(y_{it} \mid X_{it} = x_{it}) = \beta_0 + x_{it}^{\top}\beta$,
[3] Homoskedasticity: $Var(y_{it} \mid X_{it} = x_{it}) = \sigma_u^{2}$,
[4] Independence: $\{(y_{it} \mid X_{it} = x_{it}),\ i \in \mathbb{N},\ t \in \mathbb{T}\}$ is an independent process,
[5] $(i,t)$-invariance: $(\beta_0, \beta, \sigma_u^{2})$ are $(i,t)$-invariant.
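As a concrete illustration of the parameterization in (4) (a sketch of my own, under an assumed NIID design), pooled OLS on the stacked (i, t) observations recovers $(\beta_0, \beta, \sigma_u^{2})$ as the stated functions of the first two moments of $D(y_{it}, X_{it}; \varphi)$:

```python
# Sketch (assumed design): under the NIID reduction, pooled OLS on all NT observations
# estimates beta = Cov(X)^{-1} Cov(X, y), beta0 = mu1 - mu2' beta, and
# sigma_u^2 = Var(y) - Cov(y, X) Cov(X)^{-1} Cov(X, y).
import numpy as np

rng = np.random.default_rng(2)
N, T, k = 500, 10, 3
mu2 = np.array([1.0, -1.0, 0.5])
Sigma22 = np.array([[1.0, 0.3, 0.0], [0.3, 1.0, 0.2], [0.0, 0.2, 1.0]])
beta0, beta, s2u = 0.5, np.array([1.0, -0.5, 0.2]), 1.0

X = rng.multivariate_normal(mu2, Sigma22, size=N * T)
y = beta0 + X @ beta + rng.normal(0, np.sqrt(s2u), size=N * T)

Xc = np.column_stack([np.ones(N * T), X])               # intercept plus regressors
coef = np.linalg.lstsq(Xc, y, rcond=None)[0]
s2_hat = np.sum((y - Xc @ coef) ** 2) / (N * T - k - 1)
print("beta0, beta:", coef.round(3))
print("sigma_u^2  :", round(s2_hat, 3))
```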

3.2 The FEP and REP models from the ES perspective

The question that naturally arises at this stage is whether there is a way one can view the FEP and REP models given in tables 1 and 2, respectively, in the context of the ES perspective. The answer is provided by seeking the appropriate probabilistic structure for the stochastic process $\{Z_{it} := (y_{it}, X_{it}),\ i \in \mathbb{N},\ t \in \mathbb{T}\}$ which would give rise to the two models, with their respective statistical GMs taking the form:
$$\text{FEP:}\ \ \mu_{it} = c_i + x_{it}^{\top}\beta,\ \ u_{it} = \varepsilon_{it}; \qquad \text{REP:}\ \ \mu_{it} = x_{it}^{\top}\beta,\ \ u_{it} = \alpha_i + \varepsilon_{it}. \qquad (6)$$
Equivalently, one needs to address the issue of how the terms $(c_i, \alpha_i)$ pertain to the underlying probabilistic structure of the observable stochastic processes $\{Z_{it},\ i \in \mathbb{N},\ t \in \mathbb{T}\}$, by specifying the relevant conditioning information set $\mathcal{D}_{it}$ in each case that will yield the systematic and non-systematic components given in (6). In the panel data literature there are several alternative arguments used to explain how the terms $(c_i, \alpha_i)$ could be rationalized, but they are often equivocal in the sense that they do not provide convincing justifications which can be directly assessed

vis-a-vis the data $Z_0$; see Wooldridge (2002). Indeed, some of these arguments can be misleading, as the discussion in the next section explains. In the next two sections we consider the two most widely used arguments, the omitted latent variables and the individual heterogeneity arguments, first articulated by Chamberlain (1982, 1984).

4 Revisiting the omitted latent variables argument

Consider now an extension of the Error Statistical specification based on the joint distribution $D(Z_{11}, Z_{21}, \ldots, Z_{NT}; \varphi)$, where there are several omitted, but potentially relevant, latent factors $W_i$. In order to relate the latent vector $W_i$ to $(y_{it}, X_{it})$ we extend the relevant stochastic process to $\{Z_{it}^{*} := (y_{it}, X_{it}, W_i),\ i \in \mathbb{N},\ t \in \mathbb{T}\}$, assumed to be NIID, with the joint distribution $D(y_{it}, X_{it}, W_i; \psi)$ taking the form:
$$\begin{pmatrix} y_{it} \\ X_{it} \\ W_i \end{pmatrix} \sim N\!\left(\begin{pmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \end{pmatrix},\ \begin{pmatrix} \sigma_{11} & \sigma_{12} & \sigma_{13} \\ \sigma_{21} & \Sigma_{22} & \Sigma_{23} \\ \sigma_{31} & \Sigma_{32} & \Sigma_{33} \end{pmatrix}\right), \quad i \in \mathbb{N},\ t \in \mathbb{T}. \qquad (7)$$

4.1 The Fixed Effects Panel (FEP) data model

For the FEP data model in table 1, which treats the individual effects $\{c_i,\ i \in \mathbb{N}\}$ as constants, it makes sense to consider
$$\mathcal{D}_{it} = \{X_{it} = x_{it},\ W_i = w_i\}$$
as the relevant conditioning information set. The resulting regression and skedastic functions take the form:
$$E(y_{it} \mid X_{it} = x_{it}, W_i = w_i) = \alpha_0 + x_{it}^{\top}\beta^{*} + w_i^{\top}\gamma, \qquad Var(y_{it} \mid X_{it} = x_{it}, W_i = w_i) = \sigma_{\varepsilon}^{2}, \qquad (8)$$
where the model parameters $\theta := (\alpha_0, \beta^{*}, \gamma, \sigma_{\varepsilon}^{2})$ take the form:
$$\alpha_0 = \mu_1 - \beta^{*\top}\mu_2 - \gamma^{\top}\mu_3, \quad \beta^{*} = \Sigma_{22.3}^{-1}(\sigma_{21} - \Sigma_{23}\Sigma_{33}^{-1}\sigma_{31}) = \beta - \Delta\gamma, \quad \gamma = \Sigma_{33.2}^{-1}(\sigma_{31} - \Sigma_{32}\Sigma_{22}^{-1}\sigma_{21}) = \Sigma_{33.2}^{-1}(\sigma_{31} - \Sigma_{32}\beta),$$
$$\sigma_{\varepsilon}^{2} = \sigma_u^{2} - (\sigma_{13} - \sigma_{12}\Sigma_{22}^{-1}\Sigma_{23})\,\Sigma_{33.2}^{-1}\,(\sigma_{13} - \sigma_{12}\Sigma_{22}^{-1}\Sigma_{23})^{\top}, \qquad (9)$$
where $\beta = \Sigma_{22}^{-1}\sigma_{21}$, $\Delta = \Sigma_{22}^{-1}\Sigma_{23}$, $\Sigma_{33.2} = \Sigma_{33} - \Sigma_{32}\Sigma_{22}^{-1}\Sigma_{23}$ and $\Sigma_{22.3} = \Sigma_{22} - \Sigma_{23}\Sigma_{33}^{-1}\Sigma_{32}$; see Spanos (2006) for the details. It is interesting to note that without the omitted latent variables $W_i$ the probabilistic structure of NIID for $\{Z_{it} := (y_{it}, X_{it}),\ i \in \mathbb{N},\ t \in \mathbb{T}\}$ gives rise to the PPD model in table 3, whose parameterization in (4) is different from that in (9), unless the latent vector $W_i$ is uncorrelated with both observable variables $(y_{it}, X_{it})$:
$$\sigma_{31} = 0 \quad \text{and} \quad \Sigma_{23} = 0. \qquad (10)$$
That is, $(\alpha_0, \beta^{*}, \gamma, \sigma_{\varepsilon}^{2})\big|_{\sigma_{31}=0\ \&\ \Sigma_{23}=0} = (\beta_0, \beta, 0, \sigma_u^{2})$. Note that when only $\Sigma_{23} = 0$, $(\beta^{*}, \gamma)\big|_{\Sigma_{23}=0} = (\beta,\ \Sigma_{33}^{-1}\sigma_{31})$ and $\sigma_{\varepsilon}^{2}\big|_{\Sigma_{23}=0} = \sigma_w^{2} := \sigma_u^{2} - \sigma_{13}\Sigma_{33}^{-1}\sigma_{31} \neq \sigma_u^{2}$.

There are two interesting special cases arising from (8), depending on whether $W_i$ is observed or not.

Case 1: $W_i$ is observed. When all the variables are observed, the least-squares estimators $\hat{\theta} := (\hat{\alpha}_0, \hat{\beta}^{*}, \hat{\gamma}, s_{\varepsilon}^{2})$:
$$\hat{\beta}^{*} = (X^{\top}M_{W}X)^{-1}X^{\top}M_{W}y, \quad \hat{\gamma} = (W^{\top}M_{X}W)^{-1}W^{\top}M_{X}y, \quad \hat{\alpha}_0 = \bar{y} - \bar{x}^{\top}\hat{\beta}^{*} - \bar{w}^{\top}\hat{\gamma},$$
$$s_{\varepsilon}^{2} = \tfrac{1}{NT-k-m-1}\sum_{i=1}^{N}\sum_{t=1}^{T}\big(y_{it} - \hat{\alpha}_0 - x_{it}^{\top}\hat{\beta}^{*} - w_i^{\top}\hat{\gamma}\big)^{2},$$
where $W$ denotes the $(NT \times m)$ matrix of stacked values $w_i$ and $M_W = I - W(W^{\top}W)^{-1}W^{\top}$, converge in probability to their respective parameterizations in (9). This case is of interest only in so far as it sheds light on the latent case discussed next.

Case 2: $W_i$ is unobserved. When $W_i$ is latent, $c_i = \gamma^{\top}w_i$, $i \in \mathbb{N}$, cannot be observed, giving rise to a regression function with latent individual fixed effects:
$$E(y_{it} \mid X_{it} = x_{it}, W_i = w_i) = \alpha_0 + x_{it}^{\top}\beta^{*} + w_i^{\top}\gamma = \alpha_0 + x_{it}^{\top}\beta^{*} + c_i.$$
This gives rise to the statistical model with unobserved fixed individual effects:
$$y_{it} = \alpha_0 + x_{it}^{\top}\beta^{*} + c_i + \varepsilon_{it}, \quad (\varepsilon_{it} \mid \mathcal{D}_{it}) \sim \text{NIID}(0, \sigma_{\varepsilon}^{2}), \quad \mathcal{D}_{it} := \{X_{it} = x_{it},\ W_i = w_i\}, \quad i \in \mathbb{N},\ t \in \mathbb{T}. \qquad (11)$$
The question that needs to be posed is whether the FEP model given in table 1 can be interpreted in the context of (11), which views the fixed effects factor $c_i$ as a linear combination of the observed values of the omitted latent variables $W_i$, i.e. $c_i = \gamma^{\top}w_i$, $i \in \mathbb{N}$. There are three basic problems with this interpretation. The first concerns the question: what is the fixed effects OLS (Dummy Variable) estimator $\hat{\beta} = (X^{\top}M_D X)^{-1}X^{\top}M_D y$ a consistent estimator of? It is not a consistent estimator of $\beta^{*} = (\Sigma_{22} - \Sigma_{23}\Sigma_{33}^{-1}\Sigma_{32})^{-1}(\sigma_{21} - \Sigma_{23}\Sigma_{33}^{-1}\sigma_{31})$, given in (9), because $M_D \neq M_W$. The second problem is that $s^{2} = \hat{\varepsilon}^{\top}\hat{\varepsilon}/(NT-k-N)$ is not a consistent estimator of $\sigma_{\varepsilon}^{2}$. The third problem concerns the conditioning on the observed value of an unobservable variable $(W_i = w_i)$, which is conceptually problematic; it constitutes an oxymoron even when contemplated as a hypothetical scenario. Hence, as it stands, the omitted latent factors interpretation of the FEP model given in (11) seems inappropriate.
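To make the parameterization in (9) concrete, the following numerical check (my own sketch; the covariance blocks are arbitrary assumptions) computes $(\beta^{*}, \gamma)$ from the partitioned formulas in (9) and confirms that, when $W_i$ is observed as in Case 1, least squares on $(X_{it}, W_i)$ reproduces them:

```python
# Sketch (assumed moments): verify the parameterization (9) for the regression of y on
# (x, w), both via the partitioned formulas and via OLS on a large simulated sample.
import numpy as np

rng = np.random.default_rng(3)
k, m = 2, 1
S22 = np.array([[1.0, 0.4], [0.4, 1.0]])     # Cov(X)
S33 = np.array([[1.0]])                       # Cov(w)
S23 = np.array([[0.5], [0.2]])                # Cov(X, w)
s21 = np.array([0.8, 0.3])                    # Cov(X, y)
s31 = np.array([0.6])                         # Cov(w, y)
s11 = 2.0                                     # Var(y)

# partitioned-regression formulas in (9)
S22_3 = S22 - S23 @ np.linalg.solve(S33, S23.T)
S33_2 = S33 - S23.T @ np.linalg.solve(S22, S23)
beta_star = np.linalg.solve(S22_3, s21 - S23 @ np.linalg.solve(S33, s31))
gamma = np.linalg.solve(S33_2, s31 - S23.T @ np.linalg.solve(S22, s21))

# simulate (y, x, w) with exactly these second moments and run the "long" regression
V = np.block([[np.array([[s11]]), s21[None, :], s31[None, :]],
              [s21[:, None], S22, S23],
              [s31[:, None], S23.T, S33]])
Z = rng.multivariate_normal(np.zeros(1 + k + m), V, size=400_000)
y, X, W = Z[:, 0], Z[:, 1:1 + k], Z[:, 1 + k:]
R = np.column_stack([np.ones(len(y)), X, W])
coef = np.linalg.lstsq(R, y, rcond=None)[0]
print("beta* (formula):", beta_star.round(3), " gamma (formula):", gamma.round(3))
print("OLS on (X, w)  :", coef[1:].round(3))
```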

4.2 The Random Effects Panel (REP) data model

For the REP model in table 2, which treats the individual effects $\{\alpha_i,\ i \in \mathbb{N}\}$ as random variables, it makes sense to replace conditioning on $W_i = w_i$ with conditioning on the $\sigma$-field generated by the latent variable $W_i$, say $\sigma(W_i)$. That is, the relevant conditioning information set is now:
$$\mathcal{D}_{it} = \{X_{it} = x_{it},\ \sigma(W_i)\}.$$
This conditioning makes more sense than conditioning on $\{W_i = w_i\}$ because the $\sigma$-field simply acknowledges the events associated with $W_i$, restricting the $\sigma$-field $\mathcal{F}$ underlying the universal probability space $(S, \mathcal{F}, P(\cdot))$ by conditioning on $\sigma(W_i)$, since $\sigma(W_i) \subset \mathcal{F}$. It also explains why conditioning on $\{W_i = w_i\}$ is problematic, since for any random variable $\xi$ defined on the same probability space $(S, \mathcal{F}, P(\cdot))$, the random variable $E(\xi \mid \sigma(W_i))$ does not depend on the actual values $w_i$ of $W_i$, $i \in \mathbb{N}$. This is because for any Borel function $h(\cdot)$ such that $h(w_i) \neq h(w_j)$ when $w_i \neq w_j$, for all $i, j \in \mathbb{N}$, i.e. one that preserves the distinctness of the values of $W_i$:
$$\sigma(W_i) = \sigma(h(W_i)) \ \Rightarrow\ E(\xi \mid \sigma(W_i)) = E(\xi \mid \sigma(h(W_i)));$$
see Renyi (1970), p. 259. As shown in Spanos (1986), p. 413, conditioning on $\mathcal{D}_{it}$ yields the stochastic linear regression and skedastic functions:
$$E(y_{it} \mid \mathcal{D}_{it}) = \alpha_0 + x_{it}^{\top}\beta^{*} + W_i^{\top}\gamma = \alpha_0 + x_{it}^{\top}\beta^{*} + \alpha_i, \qquad Var(y_{it} \mid \mathcal{D}_{it}) = \sigma_{\varepsilon}^{2}, \qquad (12)$$
where $W_i$ denotes the random vector itself. In contrast to $c_i = \gamma^{\top}w_i$, the term
$$\alpha_i = \gamma^{\top}W_i, \quad i \in \mathbb{N},$$
is now stochastic, with distribution $\alpha_i \sim N(\mu_{\alpha}, \sigma_{\alpha}^{2})$, where $\mu_{\alpha} = \gamma^{\top}\mu_3$, $\sigma_{\alpha}^{2} = \gamma^{\top}\Sigma_{33}\gamma$, and the parameterization $(\alpha_0, \beta^{*}, \gamma)$ coincides with that in (9). This gives rise to the statistical model with a latent random effect:
$$y_{it} = \alpha_0 + x_{it}^{\top}\beta^{*} + \alpha_i + \varepsilon_{it}, \quad (\varepsilon_{it} \mid \mathcal{D}_{it}) \sim \text{NIID}(0, \sigma_{\varepsilon}^{2}), \quad \alpha_i \sim \text{NIID}(\mu_{\alpha}, \sigma_{\alpha}^{2}), \quad i \in \mathbb{N},\ t \in \mathbb{T}, \qquad (13)$$
$$\mathcal{D}_{it} = \{X_{it} = x_{it},\ \sigma(W_i)\}, \quad i \in \mathbb{N},\ t \in \mathbb{T}. \qquad (14)$$
Taking the mean deviation $\alpha_i^{*} = \alpha_i - \gamma^{\top}\mu_3$, one can redefine the constant to $\alpha_0^{*} = \mu_1 - \beta^{*\top}\mu_2$, to yield the modified statistical model:
$$y_{it} = \alpha_0^{*} + x_{it}^{\top}\beta^{*} + \alpha_i^{*} + \varepsilon_{it}, \quad i \in \mathbb{N},\ t \in \mathbb{T}, \qquad (15)$$
$$(\varepsilon_{it} \mid \mathcal{D}_{it}) \sim \text{NIID}(0, \sigma_{\varepsilon}^{2}), \quad \alpha_i^{*} \sim \text{NIID}(0, \sigma_{\alpha}^{2}), \quad i \in \mathbb{N},\ t \in \mathbb{T}.$$

The question that naturally arises at this stage is whether the specifications in (14) or (15) can be viewed as providing a meaningful interpretation for the REP model in table 2. The surprising answer is no, because, as in the case of the fixed effects model, the GLS estimator $\widetilde{\beta} = (X^{\top}\Omega^{-1}X)^{-1}X^{\top}\Omega^{-1}y$ is not a consistent estimator of $\beta^{*} = (\Sigma_{22} - \Sigma_{23}\Sigma_{33}^{-1}\Sigma_{32})^{-1}(\sigma_{21} - \Sigma_{23}\Sigma_{33}^{-1}\sigma_{31})$. So, what is $\widetilde{\beta}$ a consistent estimator of? In what sense does $\widetilde{\beta}$ capture the potential effect of the omitted variables $W_i$? Does this imply that the omitted latent variables specification in (14) is also inappropriate for the REP model in table 2? These are crucial questions that need to be addressed. An interesting attempt to answer them was made by Mundlak (1978), which is considered next.

4.2.1 Mundlak's random effects formulation

Motivated by the fact that there is no reason to assume that $E(\alpha_i \mid X_{it} = x_{it}) = 0$ in general, Mundlak (1978) argued that the crucial difference between the fixed and random effects formulations is not the randomness of the latter, but the potential correlation between $\alpha_i$ and $X_{it}$; see also Wooldridge (2002). To capture that correlation he introduced the auxiliary regression:
$$\alpha_i = \sum_{t=1}^{T}\pi^{\top}x_{it} + v_i = \pi_1^{\top}\bar{x}_i + v_i, \quad v_i \sim \text{NIID}(0, \sigma_v^{2}), \qquad (16)$$
where $\bar{x}_i = \tfrac{1}{T}\sum_{t=1}^{T}x_{it}$, $i = 1, 2, \ldots, N$, $\pi_1 := T\pi$, and $E(\alpha_i \mid X_{i1} = x_{i1}, \ldots, X_{iT} = x_{iT}) = \sum_{t=1}^{T}\pi^{\top}x_{it}$. When (16) is substituted back into the original formulation it gives rise to the Mundlak formulation:
$$y_{it} = \pi_0 + x_{it}^{\top}\beta + \pi_1^{\top}\bar{x}_i + v_i + \varepsilon_{it}, \quad i \in \mathbb{N},\ t \in \mathbb{T}, \qquad (17)$$
$$(\varepsilon_{it} \mid x_{it}) \sim \text{NIID}(0, \sigma_{\varepsilon}^{2}), \quad (v_i \mid x_{it}) \sim \text{NIID}(0, \sigma_v^{2}), \quad i \in \mathbb{N},\ t \in \mathbb{T}.$$
As argued by Hsiao (2003), this formulation captures the correlation between $\alpha_i$ and $X_{it}$, but it raises other questions concerning the consistency and efficiency of the resulting estimators under different scenarios. In particular, the following issues arise naturally:
(i) is the GLS estimator $\widetilde{\beta}$ a consistent estimator of $\beta^{*} = (\Sigma_{22} - \Sigma_{23}\Sigma_{33}^{-1}\Sigma_{32})^{-1}(\sigma_{21} - \Sigma_{23}\Sigma_{33}^{-1}\sigma_{31})$? No!
(ii) is the GLS estimator $\widetilde{\pi}_1$ a consistent estimator of $\gamma = \Sigma_{33.2}^{-1}(\sigma_{31} - \Sigma_{32}\Sigma_{22}^{-1}\sigma_{21})$? No!
(iii) what are $(\widetilde{\beta}, \widetilde{\pi}_1)$ consistent estimators of?
(iv) what does the Mundlak formulation provide a solution to?
As shown below, the answers to (iii)-(iv) are rather surprising!

4.2.2 An ES latent random effects specification

To answer the latter question one needs a more systematic way of defining Mundlak's auxiliary regression. Let us return to the joint distribution $D(y_{it}, X_{it}, W_i; \psi)$ in (7) and consider the relationship between $X_{it}$ and $W_i$, as it pertains to their correlation in the context of the model in (14). When $\Sigma_{23} \neq 0$ there exists an auxiliary regression between $W_i$ and $X_{it}$ of the form:
$$W_i = \delta_0 + \Delta^{\top}x_{it} + v_i, \quad (v_i \mid x_{it}) \sim \text{NIID}(0, \Sigma_{33.2}), \qquad (18)$$
$$E(W_i \mid X_{it} = x_{it}) = \delta_0 + \Delta^{\top}x_{it}, \quad \delta_0 = \mu_3 - \Delta^{\top}\mu_2, \quad \Delta = \Sigma_{22}^{-1}\Sigma_{23}, \quad \Sigma_{33.2} := \Sigma_{33} - \Sigma_{32}\Sigma_{22}^{-1}\Sigma_{23}.$$
Hence, a convenient way to eliminate $W_i$ from (12) is to substitute (18) into (15), yielding:
$$y_{it} = \alpha_0 + x_{it}^{\top}\beta^{*} + W_i^{\top}\gamma + \varepsilon_{it} = \alpha_0 + x_{it}^{\top}\beta^{*} + (\delta_0 + \Delta^{\top}x_{it} + v_i)^{\top}\gamma + \varepsilon_{it} = (\alpha_0 + \delta_0^{\top}\gamma) + x_{it}^{\top}[\beta^{*} + \Delta\gamma] + v_i^{\top}\gamma + \varepsilon_{it}. \qquad (19)$$
Given $\alpha_0 = \mu_1 - \beta^{*\top}\mu_2 - \gamma^{\top}\mu_3$ and $\delta_0^{\top}\gamma = \mu_3^{\top}\gamma - \mu_2^{\top}\Delta\gamma$:
$$\alpha_0 + \delta_0^{\top}\gamma = \mu_1 - [\beta^{*} + \Delta\gamma]^{\top}\mu_2 = \mu_1 - \beta^{\top}\mu_2 = \beta_0, \qquad [\beta^{*} + \Delta\gamma] = \beta. \qquad (20)$$
These simplifications imply that:
$$y_{it} = \beta_0 + x_{it}^{\top}\beta + v_i^{\top}\gamma + \varepsilon_{it},$$
and give rise to a statistical model with latent random effects of the form:
$$y_{it} = \beta_0 + x_{it}^{\top}\beta + \eta_i + \varepsilon_{it}, \quad i \in \mathbb{N},\ t \in \mathbb{T}, \qquad (21)$$
$$(\varepsilon_{it} \mid \mathcal{D}_{it}) \sim \text{NIID}(0, \sigma_{\varepsilon}^{2}), \quad (\eta_i \mid x_{it}) \sim \text{NIID}(0, \sigma_{\eta}^{2}), \quad i \in \mathbb{N},\ t \in \mathbb{T},$$
where, in view of the fact that $\eta_i = \gamma^{\top}v_i$:
$$\sigma_{\eta}^{2} = \gamma^{\top}\Sigma_{33.2}\gamma = (\sigma_{31} - \Sigma_{32}\Sigma_{22}^{-1}\sigma_{21})^{\top}\Sigma_{33.2}^{-1}(\sigma_{31} - \Sigma_{32}\Sigma_{22}^{-1}\sigma_{21}) \neq \sigma_{\alpha}^{2} = \gamma^{\top}\Sigma_{33}\gamma.$$
This recasting brings out several important features of the REP model.
(a) In contrast to the fixed effects recasting, one can estimate $\sigma_{\eta}^{2}$ as well as $\sigma_{\varepsilon}^{2}$ by orthogonally decomposing $\sigma_u^{2}$: $\sigma_u^{2} = \sigma_{\eta}^{2} + \sigma_{\varepsilon}^{2} = \sigma_{11} - \sigma_{21}^{\top}\Sigma_{22}^{-1}\sigma_{21}$ (the identities behind this decomposition are checked numerically in the sketch following this list).
(b) The latent random effects term $\eta_i$ does not represent a linear combination of the omitted latent variables ($\eta_i \neq \alpha_i = \gamma^{\top}W_i$), but a linear combination of the errors $v_i$ from the auxiliary regression of $W_i$ on $X_{it}$, i.e. $\eta_i = (W_i - \delta_0 - \Delta^{\top}x_{it})^{\top}\gamma$.
(c) The parameterization $(\beta_0, \beta, \sigma_{\varepsilon}^{2}, \sigma_{\eta}^{2})$ is estimable, but $(\alpha_0, \beta^{*}, \gamma, \sigma_{\alpha}^{2})$ is not, unless $W_i$ is observable!
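The identities used in (19)-(21), namely $\beta = \beta^{*} + \Delta\gamma$ and $\sigma_u^{2} = \sigma_{\varepsilon}^{2} + \sigma_{\eta}^{2}$, can be checked numerically; the sketch below (my own, with assumed covariance blocks) does exactly that:

```python
# Sketch (assumed covariance blocks): check the identities behind (19)-(21),
#   beta = beta* + Delta @ gamma   and   sigma_u^2 = sigma_eps^2 + sigma_eta^2,
# where Delta = S22^{-1} S23 and sigma_eta^2 = gamma' S33.2 gamma.
import numpy as np

S22 = np.array([[1.0, 0.4], [0.4, 1.0]])
S33 = np.array([[1.0]])
S23 = np.array([[0.5], [0.2]])
s21 = np.array([0.8, 0.3])
s31 = np.array([0.6])
s11 = 2.0

S33_2 = S33 - S23.T @ np.linalg.solve(S22, S23)
S22_3 = S22 - S23 @ np.linalg.solve(S33, S23.T)
beta = np.linalg.solve(S22, s21)                               # short regression on X
beta_star = np.linalg.solve(S22_3, s21 - S23 @ np.linalg.solve(S33, s31))
gamma = np.linalg.solve(S33_2, s31 - S23.T @ np.linalg.solve(S22, s21))
Delta = np.linalg.solve(S22, S23)

sigma_u2 = s11 - s21 @ np.linalg.solve(S22, s21)               # Var(y | X)
M = np.block([[S22, S23], [S23.T, S33]])
b = np.concatenate([s21, s31])
sigma_eps2 = s11 - b @ np.linalg.solve(M, b)                   # Var(y | X, W)
sigma_eta2 = gamma @ S33_2 @ gamma                             # Var(eta_i)

print("beta              :", beta.round(4))
print("beta* + Delta gam :", (beta_star + Delta @ gamma).round(4))
print("sigma_u^2         :", round(float(sigma_u2), 4))
print("eps^2 + eta^2     :", round(float(sigma_eps2 + sigma_eta2), 4))
```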

In addition, the specification in (21) brings out the danger of attributing to latent terms an arbitrary interpretation one "wishes", without worrying whether it makes probabilistic sense or not. The complete set of assumptions comprising the model in (21) is given in table 4.

Table 4 - Random Effects Panel (REP) data model
Statistical GM: $y_{it} = \beta_0 + x_{it}^{\top}\beta + \eta_i + \varepsilon_{it}$, $i \in \mathbb{N}$, $t \in \mathbb{T}$.
[1] Normality: $(y_{it} \mid X_{it} = x_{it}, \sigma(W_i)) \sim N(\cdot, \cdot)$, $(\eta_i \mid X_{it} = x_{it}) \sim N(\cdot, \cdot)$,
[2] Linearity: $E(y_{it} \mid X_{it} = x_{it}, \sigma(W_i)) = \beta_0 + x_{it}^{\top}\beta + \eta_i$,
[3] Homoskedasticity: $Var(y_{it} \mid X_{it} = x_{it}, \sigma(W_i)) = \sigma_{\varepsilon}^{2}$, $Var(\eta_i \mid X_{it} = x_{it}) = \sigma_{\eta}^{2}$,
[4] Independence: $\{(y_{it} \mid X_{it} = x_{it}, \sigma(W_i)),\ i \in \mathbb{N},\ t \in \mathbb{T}\}$ is an independent process,
[5] $(i,t)$-invariance: $(\beta_0, \beta, \sigma_{\varepsilon}^{2}, \sigma_{\eta}^{2})$ are $(i,t)$-invariant.

The question now is whether the statistical model in table 4 provides a pertinent interpretation for the REP model, with $\mathcal{D}_{it} = \{X_{it} = x_{it}, \sigma(W_i)\}$ being the relevant conditioning information set.
(i) The statistical parameterization $\theta := (\beta_0, \beta, \sigma_{\varepsilon}^{2}, \sigma_{\eta}^{2})$, with $\beta_0 = \mu_1 - \beta^{\top}\mu_2$, $\beta = \Sigma_{22}^{-1}\sigma_{21}$ and $\sigma_{\eta}^{2} + \sigma_{\varepsilon}^{2} = \sigma_u^{2} = \sigma_{11} - \sigma_{21}^{\top}\Sigma_{22}^{-1}\sigma_{21}$, is clearly the relevant and estimable one.
(ii) $\widetilde{\beta} = (X^{\top}\Omega^{-1}X)^{-1}X^{\top}\Omega^{-1}y$ is consistent for $\beta = \Sigma_{22}^{-1}\sigma_{21}$ (not $\beta^{*}$).
(iii) $\widetilde{\beta}$ is also efficient because it takes account of the variance-covariance structure.
The primary difference between (15) and (21) is that in the latter case $X_{it}$ is orthogonal ($\perp$) to $(\eta_i, \varepsilon_{it})$ by construction, eliminating the original (potential) correlation between $X_{it}$ and $\alpha_i$. Indeed, this orthogonality is crucial in achieving the proper parameterization and the resulting consistency of the relevant estimators of $(\beta_0, \beta)$. Moreover, the same orthogonality is instrumental in being able to estimate consistently the relevant conditional variance $\sigma_{\varepsilon}^{2}$. In relation to the latter, it is important to emphasize that although one cannot estimate $\gamma$ consistently in the regression $E(y_{it} \mid X_{it} = x_{it}, \sigma(W_i)) = \alpha_0 + x_{it}^{\top}\beta^{*} + W_i^{\top}\gamma$ when the omitted variables $W_i$ are latent, one can estimate the relevant conditional variance $Var(y_{it} \mid X_{it} = x_{it}, \sigma(W_i)) = \sigma_{\varepsilon}^{2}$ by decomposing (orthogonally) $\sigma_u^{2}$, the conditional variance associated with $(\beta_0, \beta)$, into $\sigma_u^{2} = \sigma_{\eta}^{2} + \sigma_{\varepsilon}^{2}$, as noted in (a) above.
In summary, viewing the REP model in table 2 in terms of the underlying stochastic process $\{Z_{it}^{*} := (y_{it}, X_{it}, W_i),\ i \in \mathbb{N},\ t \in \mathbb{T}\}$ provides a coherent interpretation with:
(a) an unambiguous interpretation of the various terms in the statistical GM, including the errors,
(b) an explicit statistical parameterization for all the unknown parameters, and
(c) a complete set of testable assumptions (table 4) which pertain to the probabilistic structure of the observable process $\{Z_{it} := (y_{it}, X_{it}),\ i \in \mathbb{N},\ t \in \mathbb{T}\}$.

4.2.3 Revisiting the Mundlak formulation

Given the above discussion, it is interesting to revisit the Mundlak (1978) formulation, where he introduced the auxiliary regression (16) as follows:
"The properties of the various estimators to be considered depend on the existence and extent of the relations between the X's and the effects. In order to take an explicit account of such relationships we introduce the auxiliary regression (2.3) $\alpha_i = x_{it}^{\top}\pi + u_{it}$; averaging over $t$ for a given $i$: (2.4) $\alpha_i = \bar{x}_i^{\top}\pi + \bar{u}_i$, (2.5) $\bar{u}_i \sim (0, \sigma_{\bar{u}}^{2})$." (pp. 71-72) [the notation is changed]
He went on to note that: "$E(\alpha_i \mid x_{it})$ need not be linear. However, only the linear expression is pertinent for the present analysis." (p. 71)
Viewing the Mundlak auxiliary regression in light of the specification in (18), three issues arise. First, the linearity and homoskedasticity he imposed are (together) equivalent to assuming that $X_{it}$ and $\alpha_i$ are jointly Normal (see Spanos, 1995). This renders the original assumption that $D(y_{it}, X_{it}, W_i; \psi)$ is multivariate Normal a very reasonable working assumption. Second, the averaging is clearly unnecessary to secure the invariance of the error with respect to $t$, since, by definition, the error term is the non-systematic component of $\alpha_i$ not accounted for by conditioning on $X_{it}$, $u_{it} = \alpha_i - E(\alpha_i \mid x_{it})$, which does not change with $t$. Third, a minor point is that there is a missing constant term in (2.3) and (2.4), which captures the linear combination of the means of $\alpha_i$ and $X_{it}$. Putting all these pieces together, and noting that $\alpha_i = \gamma^{\top}W_i$, one can relate Mundlak's (2.3) to the auxiliary regression in (18) via:
$$\alpha_i = \gamma^{\top}W_i = \gamma^{\top}\delta_0 + x_{it}^{\top}\Delta\gamma + \gamma^{\top}v_i. \qquad (22)$$
Substituting (22) into (14) yields the model in (21), since:
$$y_{it} = \alpha_0 + x_{it}^{\top}\beta^{*} + \alpha_i + \varepsilon_{it} = \alpha_0 + x_{it}^{\top}\beta^{*} + \gamma^{\top}\delta_0 + x_{it}^{\top}\Delta\gamma + \gamma^{\top}v_i + \varepsilon_{it} = (\alpha_0 + \gamma^{\top}\delta_0) + x_{it}^{\top}[\beta^{*} + \Delta\gamma] + \gamma^{\top}v_i + \varepsilon_{it} = \beta_0 + x_{it}^{\top}\beta + \gamma^{\top}v_i + \varepsilon_{it},$$
where the simplification follows directly from (19)-(20). That is, when the Mundlak auxiliary regression is properly specified, it gives rise to the same formulation as in (21). Returning to the textbook version of Mundlak's formulation, it becomes clear that $\widetilde{\beta}$ is a consistent estimator of $\beta$ (not $\beta^{*}$), and $\widetilde{\pi}_1 = \hat{\beta}_b - \hat{\beta}_w$, where $\hat{\beta}_b$ and $\hat{\beta}_w$ are the between and within (GLS) estimators of $\beta$. This implies that, since $\text{plim}(\widetilde{\pi}_1) = \pi_1$, $\widetilde{\pi}_1$ estimates nothing of any intrinsic interest, since $\pi_1 \neq \gamma$!
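The algebra behind the last claim can be illustrated with a small simulation (my own sketch; I use OLS on the Mundlak regression rather than GLS, and the design values are assumptions): in a balanced panel, regressing $y_{it}$ on $(x_{it}, \bar{x}_i)$ returns the within estimator as the coefficient on $x_{it}$ and the between-minus-within difference as the coefficient on $\bar{x}_i$.

```python
# Sketch (assumed design; OLS used in place of GLS): in a balanced panel the Mundlak
# regression  y_it = pi0 + b1*x_it + b2*xbar_i + error  returns b1 = within estimator
# and b2 = (between - within), so the xbar_i coefficient carries no intrinsic parameter.
import numpy as np

rng = np.random.default_rng(4)
N, T = 400, 6
alpha = rng.normal(size=N)
x = rng.normal(size=(N, T)) + 0.8 * alpha[:, None]     # x correlated with the effect
y = 1.0 * x + alpha[:, None] + rng.normal(size=(N, T))

xbar, ybar = x.mean(axis=1), y.mean(axis=1)

# within estimator
xd, yd = (x - xbar[:, None]).ravel(), (y - ybar[:, None]).ravel()
b_within = (xd @ yd) / (xd @ xd)

# between estimator (group means, with intercept)
b_between = np.linalg.lstsq(np.column_stack([np.ones(N), xbar]), ybar, rcond=None)[0][1]

# Mundlak regression: y_it on intercept, x_it, xbar_i
Xm = np.column_stack([np.ones(N * T), x.ravel(), np.repeat(xbar, T)])
b_m = np.linalg.lstsq(Xm, y.ravel(), rcond=None)[0]

print("within:", round(b_within, 4), " between:", round(b_between, 4))
print("Mundlak coef on x_it :", round(b_m[1], 4))
print("Mundlak coef on xbar :", round(b_m[2], 4),
      " between - within:", round(b_between - b_within, 4))
```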

5 Revisiting the heterogeneity argument

Having rejected the omitted latent variables interpretation for the fixed effects (FEP) model as given in table 1, we proceed to explore the heterogeneity argument from the Error Statistical (ES) perspective, in an attempt to find out whether one can relate the fixed effects to the heterogeneity of the process $\{Z_{it},\ i \in \mathbb{N},\ t \in \mathbb{T}\}$.

5.1 Unrestricted heterogeneity and panel data modeling

Assume that the vector stochastic process $\{Z_{it},\ i \in \mathbb{N},\ t \in \mathbb{T}\}$, $Z_{it} := (y_{it}, X_{it}^{\top})^{\top}$, is Normal, Independent (NI) but non-Identically Distributed, i.e. heterogeneous with respect to both $(i,t)$. These probabilistic assumptions imply the reduction:
$$D(Z_{11}, Z_{21}, \ldots, Z_{NT}; \varphi) \overset{\text{NI}}{=} \prod_{i=1}^{N}\prod_{t=1}^{T} D_{it}(Z_{it}; \varphi(i,t)) = \prod_{i=1}^{N}\prod_{t=1}^{T} D(y_{it} \mid x_{it}; \varphi_1(i,t))\,D(X_{it}; \varphi_2(i,t)), \qquad (23)$$
where $D(y_{it}, X_{it}; \varphi_1(i,t))$ is multivariate Normal of the form:
$$\begin{pmatrix} y_{it} \\ X_{it} \end{pmatrix} \sim N\!\left(\begin{pmatrix} \mu_1(i,t) \\ \mu_2(i,t) \end{pmatrix},\ \begin{pmatrix} \sigma_{11}(i,t) & \sigma_{21}(i,t)^{\top} \\ \sigma_{21}(i,t) & \Sigma_{22}(i,t) \end{pmatrix}\right), \quad i \in \mathbb{N},\ t \in \mathbb{T}.$$
The regression and skedastic functions associated with $D(y_{it} \mid x_{it}; \varphi_1(i,t))$ are:
$$E(y_{it} \mid X_{it} = x_{it}) = \beta_0(i,t) + x_{it}^{\top}\beta(i,t), \qquad Var(y_{it} \mid X_{it} = x_{it}) = \sigma_u^{2}(i,t), \qquad (24)$$
where the model parameters take the form:
$$\beta_0(i,t) = \mu_1(i,t) - \beta(i,t)^{\top}\mu_2(i,t), \quad \beta(i,t) = \Sigma_{22}(i,t)^{-1}\sigma_{21}(i,t), \quad \sigma_u^{2}(i,t) = \sigma_{11}(i,t) - \sigma_{12}(i,t)\Sigma_{22}(i,t)^{-1}\sigma_{21}(i,t).$$
This gives rise to the non-estimable statistical model:
$$y_{it} = \beta_0(i,t) + x_{it}^{\top}\beta(i,t) + u_{it}, \quad (u_{it} \mid x_{it}) \sim \text{NIID}(0, \sigma_u^{2}(i,t)), \quad i \in \mathbb{N},\ t \in \mathbb{T}.$$
For $k = 4$, $N = 100$, $T = 25$, the unknown model parameters are $(\beta_0(i,t), \beta(i,t), \sigma_u^{2}(i,t))$, $i = 1, \ldots, N$, $t = 1, \ldots, T$, whose total number is $m = (k+2)NT = 6 \times 100 \times 25 = 15{,}000$! To render this model estimable one needs to impose certain restrictions on the form of the heterogeneity of $\{Z_{it},\ i \in \mathbb{N},\ t \in \mathbb{T}\}$.

5.2 Fixed Individual Effects Panel (FEP) data model

Consider the case where the vector stochastic process $\{Z_{it},\ i \in \mathbb{N},\ t \in \mathbb{T}\}$ is assumed to be Normal, Independent (NI), mean heterogeneous (but covariance homogeneous) with respect to $i \in \mathbb{N}$, and completely homogeneous with respect to $t \in \mathbb{T}$. These probabilistic assumptions imply the reduction:
$$D(Z_{11}, Z_{21}, \ldots, Z_{NT}; \varphi) \overset{\text{NI}}{=} \prod_{i=1}^{N}\prod_{t=1}^{T} D_{it}(Z_{it}; \varphi(i)) = \prod_{i=1}^{N}\prod_{t=1}^{T} D(y_{it} \mid x_{it}; \varphi_1(i))\,D(X_{it}; \varphi_2(i)), \qquad (25)$$
where $D(y_{it}, X_{it}; \varphi(i))$ is multivariate Normal of the form:
$$\begin{pmatrix} y_{it} \\ X_{it} \end{pmatrix} \sim N\!\left(\begin{pmatrix} \mu_1(i) \\ \mu_2(i) \end{pmatrix},\ \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \Sigma_{22} \end{pmatrix}\right), \quad i \in \mathbb{N},\ t \in \mathbb{T}.$$
The regression and skedastic functions associated with $D(y_{it} \mid x_{it}; \varphi_1(i))$ are:
$$E(y_{it} \mid X_{it} = x_{it}) = \beta_0(i) + x_{it}^{\top}\beta, \qquad Var(y_{it} \mid X_{it} = x_{it}) = \sigma_u^{2}, \qquad (26)$$
where the model parameters are:
$$\beta_0(i) = \mu_1(i) - \beta^{\top}\mu_2(i), \quad \beta = \Sigma_{22}^{-1}\sigma_{21}, \quad \sigma_u^{2} = \sigma_{11} - \sigma_{12}\Sigma_{22}^{-1}\sigma_{21}.$$
This gives rise to the (potentially) estimable statistical model:
$$y_{it} = \beta_0(i) + x_{it}^{\top}\beta + u_{it}, \quad (u_{it} \mid x_{it}) \sim \text{NIID}(0, \sigma_u^{2}), \quad i \in \mathbb{N},\ t \in \mathbb{T}.$$
The complete specification of this model in terms of the observable stochastic processes is given in table 5. This can be an estimable model because, for say $k = 4$, $N = 100$, $T = 25$ ($NT = 2500$), the unknown model parameters are $(\beta_0(i) = c_i,\ \beta,\ \sigma_u^{2})$, $i = 1, \ldots, N$, whose total number is $m = N + k + 1 = 100 + 4 + 1 = 105$.

Table 5 - Fixed Individual Effects Panel data model
Statistical GM: $y_{it} = \beta_0(i) + x_{it}^{\top}\beta + u_{it}$, $i \in \mathbb{N}$, $t \in \mathbb{T}$.
[1] Normality: $(y_{it} \mid X_{it} = x_{it}) \sim N(\cdot, \cdot)$,
[2] Linearity: $E(y_{it} \mid X_{it} = x_{it}) = \beta_0(i) + x_{it}^{\top}\beta$,
[3] Homoskedasticity: $Var(y_{it} \mid X_{it} = x_{it}) = \sigma_u^{2}$,
[4] Independence: $\{(y_{it} \mid X_{it} = x_{it}),\ i \in \mathbb{N},\ t \in \mathbb{T}\}$ is an independent process,
[5] (a) $(i,t)$-invariance: $(\beta, \sigma_u^{2})$ are $(i,t)$-invariant; (b) $t$-invariance: $\{\beta_0(i),\ i \in \mathbb{N}\}$ are $t$-invariant, but $i$-heterogeneous.

The important thing to emphasize about the above (heterogeneity-motivated) specification is that one can state unequivocally that the OLS estimators $\hat{\beta} = (X^{\top}M_D X)^{-1}X^{\top}M_D y$ and $\hat{c} = (D^{\top}M_X D)^{-1}D^{\top}M_X y$ of $(\beta, c)$ are consistent estimators of the relevant parameterizations. In particular,
(a) $\hat{\beta}$ is a consistent estimator of $\beta = \Sigma_{22}^{-1}\sigma_{21}$, where $\Sigma_{22} = E\big([X_{it} - \mu_2(i)][X_{it} - \mu_2(i)]^{\top}\big)$ and $\sigma_{21} = E\big([X_{it} - \mu_2(i)][y_{it} - \mu_1(i)]\big)$,
(b) $\hat{c}_i$ is a consistent estimator of $\beta_0(i) = \mu_1(i) - \beta^{\top}\mu_2(i)$, for all $i = 1, 2, \ldots, N$, as $T \to \infty$, and
(c) $s^{2} = \tfrac{1}{NT-N-k}\sum_{i=1}^{N}\sum_{t=1}^{T}(y_{it} - \hat{c}_i - x_{it}^{\top}\hat{\beta})^{2}$ is a consistent estimator of $\sigma_u^{2}$.
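A small simulation (my own sketch; design values are assumptions) illustrates points (a)-(b): the slope estimator $\hat{\beta}$ is accurate once $NT$ is moderately large, whereas the accuracy of each individual intercept estimator $\hat{c}_i$ is governed by $T$ alone.

```python
# Sketch (assumed design): in the fixed individual effects model the within/LSDV slope
# uses all NT observations, but each intercept c_i = beta0(i) is estimated from only T
# observations, so its error shrinks only as T grows (the incidental parameter issue).
import numpy as np

rng = np.random.default_rng(5)

def fit_fep(N, T, beta=1.0):
    c = rng.normal(0, 1.0, size=N)
    x = rng.normal(size=(N, T)) + 0.5 * c[:, None]
    y = c[:, None] + beta * x + rng.normal(size=(N, T))
    xd = x - x.mean(axis=1, keepdims=True)                 # within demeaning
    yd = y - y.mean(axis=1, keepdims=True)
    b_hat = (xd.ravel() @ yd.ravel()) / (xd.ravel() @ xd.ravel())
    c_hat = y.mean(axis=1) - b_hat * x.mean(axis=1)        # recovered intercepts
    return abs(b_hat - beta), np.sqrt(np.mean((c_hat - c) ** 2))

for N, T in [(100, 5), (100, 50), (1000, 5)]:
    b_err, c_rmse = fit_fep(N, T)
    print(f"N={N:4d}, T={T:3d}:  |beta_hat - beta| = {b_err:.4f},  RMSE(c_hat) = {c_rmse:.4f}")
```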

6 Fixed vs. Random Effects: a statistical adequacy issue

Having proposed two different interpretations for the fixed and random effects models in tables 1 and 2, respectively, based on specifying the assumptions comprising the two models (tables 4-5) in terms of the probabilistic structure of the observable stochastic process $\{Z_{it},\ i \in \mathbb{N},\ t \in \mathbb{T}\}$, it is important to bring out certain crucial advantages of the latter specifications.
At the specification stage (when choosing the appropriate statistical model), the specifications of the FEP and REP models given in tables 1 and 2 provide no basis for choosing between these models, because assumptions [i]-[iii] and [i]-[vii] cannot be related to the structure of the data. In contrast, the specifications given in tables 4 and 5, in terms of probabilistic assumptions concerning $\{Z_{it},\ i \in \mathbb{N},\ t \in \mathbb{T}\}$, are ascertainable vis-a-vis the data $Z_0 := (y : X)$ using a variety of graphical techniques. This is because reduction assumptions such as NIID, or NI but non-ID, are easy to assess directly using data plots. In assessing dependence and/or heterogeneity one needs to decide on a particular ordering of the data. The time dimension $t \in \mathbb{T}$ is naturally considered to provide the ordering of interest, but in the case of the cross-section dimension $i \in \mathbb{N}$ it is sometimes argued that the ordering does not matter; it does! In practice, there are several natural orderings that one needs to explore with respect to $i \in \mathbb{N}$, such as the size of the firm, the geographical position of the city, and so on. For each of those orderings of interest one could assess the presence or absence of $i$-dependence and/or $i$-heterogeneity by a variety of graphical techniques, the simplest of which is the t-plot; see Spanos (1999), ch. 5-6. This suggests that the choice between the FEP and REP models, as specified in tables 4 and 5, can be made on the basis of whether the process $\{Z_{it},\ i \in \mathbb{N},\ t \in \mathbb{T}\}$ exhibits $i$-heterogeneity or not.
At the Mis-Specification (M-S) testing stage the probabilistic assumptions [1]-[5] for both models in tables 4 and 5 are all testable vis-a-vis the data $Z_0$, and should be tested thoroughly for possible departures in order to secure the reliability of inference.
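As a simple illustration of how the choice of cross-section ordering can be probed graphically (my own sketch; the "firm size" ordering variable and the design are hypothetical), one can t-plot the same cross-section data under two different orderings of i and look for trends in mean or variance:

```python
# Sketch (assumed design): t-plots of the same cross-section slice under two different
# orderings of i.  Ordering by an economically meaningful variable (here a hypothetical
# "firm size") can reveal i-heterogeneity that an arbitrary ordering hides.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(6)
N = 300
size = rng.lognormal(mean=0.0, sigma=1.0, size=N)       # hypothetical firm size
y = 0.8 * np.log(size) + rng.normal(0, 1.0, size=N)     # mean shifts with size

fig, axes = plt.subplots(1, 2, figsize=(10, 3), sharey=True)
axes[0].plot(y, marker=".", linestyle="none")
axes[0].set_title("arbitrary ordering of i")
axes[1].plot(y[np.argsort(size)], marker=".", linestyle="none")
axes[1].set_title("i ordered by firm size")
for ax in axes:
    ax.set_xlabel("index i")
axes[0].set_ylabel("y_i")
plt.tight_layout()
plt.show()
```

Under the arbitrary ordering the plot looks patternless, while ordering by size reveals the mean heterogeneity that an $(i,t)$-invariance assumption such as [5] would rule out.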

Applying Hausman-type specification tests is no substitute for thorough M-S testing, because any departures from assumptions [1]-[4] of either model will undermine the reliability of such a test, in the sense that the nominal size and power will be very different from the actual ones, giving rise to misleading inferences; see Spanos (2006). The original preliminary data analysis based on graphical techniques that guided the specification is also useful in deciding the type of M-S tests one should apply in order to probe for potential misspecifications in a thorough manner.
Ensuring statistical adequacy (the validity of the model assumptions vis-a-vis the data $Z_0$) is of paramount importance for any form of statistical inference, because without it the reliability of inference is questionable. Hence, without statistical adequacy the primary advantages of panel data, including enhanced precision of inference and less restrictive models, are abnegated.
A potential criticism of the ES specifications of the FEP and REP models is that the probabilistic assumptions [1]-[5] are rather restrictive and unrealistic for many data sets in practice. As argued in Spanos (2008), such arguments are misplaced and highly misleading on a number of counts. In particular, these are not the only assumptions one can impose on the underlying process; the ES approach makes explicit the model assumptions that can give rise to reliable and precise inferences, and one should modify these assumptions to ensure their validity vis-a-vis the particular data. The traditional approach's reluctance to make explicit distributional assumptions carries a heavy price in terms of the precision of inference, without any guarantees of reliability. Similarly, nonparametric models forsake reliability by invoking non-testable assumptions and relying on asymptotic results whose validity is not established; see Spanos (2001). For reliable and precise inference there is no substitute for fully specified parametric statistical models whose statistical adequacy is secured by thorough M-S testing and respecification before any inferences are drawn. To paraphrase Peirce (1878), there is no royal road to learning from data about phenomena of interest, and really valuable ideas can only be had at the price of securing statistical adequacy.¹
Where the Error Statistical (ES) perspective shines is at the respecification stage, where one needs to respecify the original model when it is found to be wanting. Again, the graphical analysis can be very useful in suggesting alternative specifications that take account of systematic information the original model could not account for. Spanos (2007) provides a more comprehensive discussion that includes several additional statistical models of interest in panel data modeling that arise as respecifications of the FEP and REP models given in tables 4 and 5.
¹ The original phrase "there is no royal road to geometry" is attributed to Euclid, in reply to King Ptolemy's request for an easier way of learning mathematics. Charles S. Peirce's (1878) paraphrase was: "There is no royal road to logic, and really valuable ideas can only be had at the price of close attention."


7 Conclusion

The paper has proposed specifications of the Fixed Effects (FEP) and Random Effects Panel (REP) models (tables 4-5) in terms of the probabilistic structure of the observable stochastic processes $\{Z_{it},\ i \in \mathbb{N},\ t \in \mathbb{T}\}$ underlying the observed data. The results bring out the danger of choosing arbitrarily the interpretation one would like to attribute to the unobserved effect terms $(c_i, \alpha_i)$. The proposed specifications show that the interpretation of these terms is inextricably bound up with the statistical parameterizations of these models. Moreover, such specifications offer certain distinct advantages over the traditional specifications based on error terms (tables 1-2), in relation to the problems of specification, Mis-Specification (M-S) testing and respecification.
In contrast, the current textbook statistical modeling for panel data: (i) ignores the statistical adequacy problem, and (ii) does not take full advantage of the large sample size to enhance the reliability and precision of inference. This is primarily due to the ineffectiveness of the ways it deals with $(i,t)$-heterogeneity and dependence.
(a) The equal-correlation structure imposed by the REP model is not very realistic.
(b) Treating $(i,t)$-heterogeneity primarily as an incidental parameter problem, where parameters are changing (somehow) with $i$ and/or $t$, is highly unsatisfactory. This strategy squanders a lot of sample information by picking up this heterogeneity essentially using dummy variables. In addition, it renders asymptotic theory problematic, since the number of unknown parameters increases as $N \to \infty$ and/or $T \to \infty$.
(c) Treating $(i,t)$-dependence (including spatial dependence) as primarily generated by error autocorrelation is not such a good idea.
To take full advantage of panel data information one needs to construct $(i,t)$-heterogeneity and dependence concepts that would adequately model the observed regularities in panel data. That is, to make a genuine effort to model probabilistically, understand and explain $(i,t)$-heterogeneity and dependence.

References
[1] Arellano, M. (2003), Panel Data Econometrics, Cambridge University Press, Cambridge.
[2] Baltagi, B. H. (2005), Econometric Analysis of Panel Data, 3rd ed., Wiley, NY.
[3] Cameron, A. C. and P. K. Trivedi (2005), Microeconometrics: Methods and Applications, Cambridge University Press, Cambridge.
[4] Chamberlain, G. (1982), "Multivariate Regression Models for Panel Data," Journal of Econometrics, 18, 5-46.
[5] Chamberlain, G. (1984), "Panel Data," in Griliches, Z. and M. D. Intriligator (eds.), Handbook of Econometrics, vol. 2, Elsevier Science, Amsterdam.
[6] Hsiao, C. (2002), Analysis of Panel Data, 2nd ed., Cambridge University Press, Cambridge.
[7] Mundlak, Y. (1978), "On the Pooling of Time Series and Cross-Section Data," Econometrica, 46, 69-86.
[8] Peirce, C. S. (1878), "How to Make Our Ideas Clear," Popular Science Monthly, 286-302. Reprinted in Chance, Love, and Logic: Philosophical Essays, edited by M. R. Cohen, Bison Books, 1998.
[9] Renyi, A. (1970), Foundations of Probability, Holden-Day, San Francisco.
[10] Spanos, A. (1986), Statistical Foundations of Econometric Modelling, Cambridge University Press, Cambridge.
[11] Spanos, A. (1995), "On Normality and the Linear Regression Model," Econometric Reviews, 14, 195-203.
[12] Spanos, A. (1999), Probability Theory and Statistical Inference: Econometric Modeling with Observational Data, Cambridge University Press, Cambridge.
[13] Spanos, A. (2001), "Parametric versus Non-parametric Inference: Statistical Models and Simplicity," pp. 181-206 in Simplicity, Inference and Modelling, edited by A. Zellner, H. A. Keuzenkamp and M. McAleer, Cambridge University Press.
[14] Spanos, A. (2006), "Revisiting the Omitted Variables Argument: Substantive vs. Statistical Reliability of Inference," Journal of Economic Methodology, 13, 179-218.
[15] Spanos, A. (2007), "Revisiting the Statistical Foundations of Panel Data Models," Working Paper, Virginia Tech.
[16] Spanos, A. (2008), "Philosophy of Econometrics," forthcoming in Philosophy of Economics, Handbook of the Philosophy of Science, edited by D. Gabbay, P. Thagard and J. Woods, Elsevier.
[17] White, H. (2001), Asymptotic Theory for Econometricians, Academic Press, London.
[18] Wooldridge, J. M. (2002), Econometric Analysis of Cross Section and Panel Data, The MIT Press, Cambridge, MA.

