May 2009
What is the difference between EFA and CFA?
o In EFA (Exploratory Factor Analysis) we use the data to determine the underlying structure. EFA typically uses an orthogonal rotation, and cross-loadings are permitted as long as they are relatively small.
o In CFA (Confirmatory Factor Analysis) we specify the factor structure on the basis of a good theory and then use CFA to determine whether there is empirical support for the proposed theoretical factor structure. CFA also lets the constructs correlate (comparable to an oblique rotation) and assumes no (zero) cross-loadings.
CFA provides quantitative measures that assess the validity and reliability of the theoretical model:
1. Do the indicator variables measure the same concept?
   o Convergent validity: measured by shared variance
   o No (zero) cross-loadings: unidimensional
   o Uncorrelated errors
2. Are the constructs measuring distinctly different concepts?
   o Discriminant validity: the average variance extracted (AVE) must be larger than the squared interconstruct correlations
3. Are the constructs reliable?
   o Measured based on internal consistency (similar to Cronbach's alpha)
4. Are the interconstruct correlations consistent with your theory?
   o Nomological validity
[Figure: two constructs, each with indicator variables X1 to X4.]
Cross-Loadings = when indicator variables on one construct are assumed to be related to another construct. Congeneric measurement model = all cross-loadings are assumed to be 0. The assumption of no cross-loadings is based on the fact that the existence of significant cross-loadings is evidence of a lack of unidimensionality and therefore a lack of construct validity, i.e. discriminant validity.
Constructs
o Exogenous = a variable or construct that acts as a predictor for other constructs or variables in the model. Exogenous constructs only have arrows leading out of them and none leading into them.
o Endogenous = a variable or construct that is the outcome variable in at least one causal relationship. Endogenous constructs have one or more arrows leading into them.
(In the diagrams, C = construct.)
Relationships
o Recursive = arrow goes one way. o Nonrecursive = arrows go both ways. o Correlational = arrow is curved with points on both ends.
Indicators
o Formative = arrows go from observed indicator variables to unobserved construct.
V = Indicator variable
Two Latent Constructs and the Measured Variables that Represent Them
[Figure: an exogenous construct measured by X1 to X4 and an endogenous construct measured by Y1 to Y4.]
Loadings represent the relationships from constructs to variables as in factor analysis. Path estimates represent the relationships between constructs, similar to beta weights in regression analysis.
Definitions
Communality = the total amount of variance a measured variable has in common with the construct upon which it loads. Good measurement practice suggests that each measured variable should load on only one construct, so communality can be thought of as the variance explained in a measured variable by the construct. In CFA, the communality is referred to as the squared multiple correlation for a measured variable. It is similar to the idea of communality from EFA. Factor loadings are squared to get the communality of an indicator variable.
Congeneric measurement model = a model consisting of several unidimensional constructs with all cross-loadings assumed to be zero. Also, there is no covariance for between- or within-construct error variances, meaning they are all fixed at zero.
Estimated covariance matrix = a covariance matrix comprised of the predicted covariances between all indicator variables involved in a SEM, based on the equations that represent the hypothesized model. Typically abbreviated with Σk.
Fixed parameter = a parameter that has a value specified by the researcher. Most often the value is specified as zero, indicating no relationship, although there are instances in which an actual value (e.g., 1.0 or such) can be specified.
Free parameter = a parameter estimated by the structural equation program to represent the strength of a specified relationship. These parameters may occur in the measurement model (most often denoting loadings of indicators to constructs) as well as the structural model (relationships among constructs).
Goodness-of-fit (GOF) = a measure indicating how well a specified model reproduces the covariance matrix among the indicator variables.
Maximum likelihood estimation (MLE) = an estimation method commonly employed in structural equation models. An alternative to the ordinary least squares used in multiple regression, MLE is a procedure that improves parameter estimates in a way that minimizes the differences between the observed and estimated covariance matrices.
Definitions continued . . .
Observed sample covariance matrix = the typical input matrix for SEM estimation, comprised of the observed variances and covariances for each measured variable. Typically abbreviated with a bold, capital letter S.
Construct reliability (CR) = a measure of reliability and internal consistency based on the square of the total of factor loadings for a construct.
Construct validity = the extent to which a set of measured variables actually represents the theoretical latent construct they are designed to measure. It is made up of four components: convergent validity, discriminant validity, nomological validity and face validity.
Convergent validity = the extent to which indicators of a specific construct converge or share a high proportion of variance in common.
Discriminant validity = the extent to which a construct is truly distinct from other constructs.
Face validity = the extent to which the content of the items is consistent with the construct definition, based solely on the researcher's judgment.
Nomological validity = tested by examining whether or not the correlations between the constructs in the measurement theory make sense. The covariance matrix Phi (Φ) of construct correlations is useful in this assessment.
Parameter = a numerical representation of some characteristic of a population. In CFA/SEM, relationships are the characteristic of interest for which the modeling procedures generate estimates. Parameters are numerical characteristics of the SEM relationships, comparable to regression coefficients in multiple regression.
Average variance extracted (AVE) = a summary measure of convergence among a set of items representing a construct. It is the average percent of variation explained among the items.
[Figure: structural model with exogenous variables EP and AC and endogenous variables JS, OC and SI.]
Hypotheses (all relationships are positive):
H1: EP → JS (+)
H2: EP → OC (+)
H3: AC → JS (+)
H4: AC → OC (+)
H5: JS → OC (+)
H6: JS → SI (+)
H7: OC → SI (+)
Note: observable indicator variables are not shown to simplify the model.
Formative Measurement Theory: assumes the measured indicator variables cause the construct and that the error is a result of the inability of the measured indicators to fully explain the construct. Therefore, the arrows are drawn from the measured indicators to the constructs. In short, formative constructs are not considered latent.
Reflective measures = unable to walk in straight line or stumbling, slurred speech, talking loud, laughing, etc. Formative measures = alcohol/drugs combined with lack of sleep, how much you have eaten, how fast and how much you drink, etc.
Assessing Measurement Model Validity
Two Broad Approaches:
1. Examine the Goodness of Fit (GOF) indices.
2. Evaluate the construct validity and reliability of the specified measurement model.
Selecting a rigid cut-off for the fit indices is like selecting a minimum R² for a regression equation: there is no single magic value for the fit indices that separates good from poor models. The quality of fit depends heavily on model characteristics, including sample size and model complexity. Simple models with small samples should be held to very strict fit standards. More complex models with larger samples should not be held to the same strict standards.
o The χ² and the χ²/df (normed chi-square)
o One goodness-of-fit index (e.g., GFI, CFI, NFI, TLI)
o One badness-of-fit index (e.g., RMSEA, RMSR)
Covariances calculated for the sample: request Sample moments and look in Output under that subheading.
Covariances estimated by AMOS software: request Implied moments and look in Output under Estimates.
Residuals = difference between observed and estimated covariances: request Residual moments.
A negative sign indicates that the observed covariance (2.137) is smaller than the estimated covariance (2.229); here the residual is -.093.
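The residual calculation can be sketched in a few lines of plain Python. This is only an illustration, not AMOS output: the two 2x2 matrices are hypothetical, and only the off-diagonal covariances 2.137 and 2.229 come from the example above. The helper name `residual_matrix` is likewise made up for this sketch.

```python
# Hypothetical observed (sample) and estimated (implied) covariance matrices.
# Only the off-diagonal pair 2.137 / 2.229 is taken from the notes; the
# variances on the diagonal are invented for illustration.
observed = [[4.000, 2.137],
            [2.137, 5.000]]
estimated = [[4.000, 2.229],
             [2.229, 5.000]]


def residual_matrix(obs, est):
    """Return observed minus estimated covariances, element by element."""
    return [[o - e for o, e in zip(obs_row, est_row)]
            for obs_row, est_row in zip(obs, est)]


residuals = residual_matrix(observed, estimated)
for row in residuals:
    print([round(value, 3) for value in row])
```

With the rounded covariances shown in the notes, the difference works out to -.092; the -.093 in the output presumably reflects unrounded values.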
Standardized residuals: look for patterns of large residuals, generally those with an absolute value of 4.0 or more.
The CFI is 0.984; it exceeds the minimum (>0.90) for a model of this complexity and sample size.
This is the Model Fit portion of the output.
CMIN/DF: a value below 2 is preferred, but between 2 and 5 is considered acceptable.
Chi-square (χ²) = likelihood ratio chi-square
GFI = Goodness of Fit Index
AGFI = Adjusted Goodness of Fit Index
PGFI = Parsimonious Goodness of Fit Index
TLI = Tucker-Lewis Index
NFI = Normed Fit Index
Note: If you click on any of the Fit Indices it will give guidelines for interpretation and references supporting the guidelines.
RMSEA = Root Mean Square Error of Approximation: a value of 0.10 or less is considered acceptable (7e, p. 649).
Note that when we evaluate the measures we use the numbers for the default model.
Three Types of Models:
1. Default = your model, the relationships you propose and are testing.
2. Saturated model = a model that hypothesizes that everything is related to everything (just-identified).
3. Independence model = a model that hypothesizes that nothing is related to anything.
The RMSEA represents the degree to which lack of fit is due to misspecification of the model tested versus being due to sampling error.
The AGFI, a parsimony fit index, is .946. This value is above the .90 guideline for this model. The AGFI attempts to adjust for model complexity, but it penalizes more complex models.
The CFI, an incremental fit index, is 0.984, which exceeds the guidelines (>0.90) for a model of this complexity and sample size (7e, p. 650).
CFI represents the improvement of fit of the specified model over a baseline model in which all variables are constrained to be uncorrelated.
Other Indices
The NFI, RFI and IFI are other incremental fit indices. Our guidelines indicate the NFI should be >0.90 for a model of this complexity and sample size. For the RFI and IFI we indicate that larger values (on a scale from 0 to 1.0) are better.
The RMSEA, an absolute fit index, is 0.043. This value is quite low and well below the .08 guideline for a model with 12 measured variables and a sample size of 400. The RMSEA is also called a badness-of-fit index. The 90 percent confidence interval for the RMSEA is between a LO of .028 and a HI of .058. Thus, even the upper bound is not close to .08.
Using the RMSEA and the CFI satisfies our rule of thumb that both a badness-of-fit index and a goodness-of-fit index be evaluated. In addition, other index values also are supportive. For example, the GFI is 0.95, and the AGFI is 0.93. We therefore now move on to examine the construct validity of the model.
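The comparison of reported indices against the guidelines can be written out as a small check. This is a minimal sketch: the index values and the CFI, RMSEA and AGFI cutoffs are the ones quoted in these notes for this particular model (12 measured variables, N = 400), and the function name `check_fit` is hypothetical. As the notes stress, such cutoffs are rules of thumb that depend on model complexity and sample size, not universal thresholds.

```python
# Fit indices reported in the notes for the default model.
fit = {"CFI": 0.984, "RMSEA": 0.043, "GFI": 0.95, "AGFI": 0.93}

# Guidelines quoted in the notes for this model.
# "min": the index should exceed the cutoff (goodness-of-fit).
# "max": the index should fall below the cutoff (badness-of-fit).
guidelines = {
    "CFI": ("min", 0.90),
    "RMSEA": ("max", 0.08),
    "AGFI": ("min", 0.90),
}


def check_fit(fit_values, rules):
    """Return {index name: True/False} for every index that has a guideline."""
    results = {}
    for name, (direction, cutoff) in rules.items():
        value = fit_values[name]
        results[name] = value > cutoff if direction == "min" else value < cutoff
    return results


results = check_fit(fit, guidelines)
for name, passed in results.items():
    print(f"{name}: {fit[name]} -> {'meets' if passed else 'misses'} the guideline")
```

The GFI is listed but not checked because the notes do not state a cutoff for it.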
o Use multiple indices of differing types.
o Adjust the index cutoff values based on model characteristics, e.g., number of constructs and indicators, sample size.
o Simpler models and smaller sample sizes require stricter evaluation.
o Remove indicator variables that do not meet established criteria.
o Use GOF indices to compare models.
o The pursuit of better fit at the expense of testing a true model is not a good trade-off.
Three Criteria:
1. Goodness of Fit? (estimated covariance matrix = observed covariance matrix)
2. Validity and Reliability of Measurement Model?
3. Significant and Meaningful Structural Relationships?
2. Assessing the Measurement Model
   Construct Validity
   o Face
   o Convergent
   o Discriminant
   o Nomological
   Construct Reliability
3. Assessing the Structural Model
   Significant and Meaningful Structural Relationships
Validity
Before running the CFA model, assessments of validity are based on:
A major objective of applying CFA is to empirically estimate validity using more rigorous approaches; e.g., construct validity.
Convergent Validity
Convergent validity: there are three measures.
1. Factor loadings
2. Average Variance Extracted (AVE)
3. Reliability
Rules of Thumb: Convergent Validity
o Standardized loading estimates should be .5 or higher, and ideally .7 or higher.
o AVE should be .5 or greater to suggest adequate convergent validity.
o AVE estimates also should be greater than the square of the correlation between that factor and other factors to provide evidence of discriminant validity.
o Reliability should be .7 or higher to indicate adequate convergence or internal consistency.
[Figure: HBAT CFA path diagram showing the constructs Organizational Commitment, Job Satisfaction, Staying Intentions, Environmental Perceptions and AC with their measured variables (JS1 to JS5, SI1 to SI4, AC1 to AC4, EP1 to EP4).]
Note: Measured variables are shown as a box with labels corresponding to those shown in the HBAT questionnaire. Latent constructs are an oval. Each measured variable has an error term, but the error terms are not shown. Two headed connections indicate covariance between constructs. One headed connectors indicate a causal path from a construct to an indicator (measured) variable. In CFA all connectors between constructs are two-headed covariances / correlations.
The asterisks indicate statistical significance <= .05. We use this information to determine if the standardized regression weights are statistically significant.
Factor loadings are the first thing to look at in examining convergent validity. Our guidelines are that all loadings should be at least .5, and preferably .7 or higher. All loadings are significant, as required for convergent validity. The lowest is .592 (OC1), and only two others are below .70 (EP1 and OC3).
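The loading guidelines can be applied mechanically, as in the sketch below. It assumes the two-decimal standardized loadings that appear later in these notes (in the loadings table and the construct reliability calculations); the variable names are illustrative only.

```python
# Standardized loadings, rounded to two decimals as reported in the notes.
loadings = {
    "OC1": 0.59, "OC2": 0.87, "OC3": 0.67, "OC4": 0.84,
    "EP1": 0.69, "EP2": 0.81, "EP3": 0.77, "EP4": 0.82,
    "AC1": 0.82, "AC2": 0.82, "AC3": 0.84, "AC4": 0.82,
}

MINIMUM = 0.5    # loadings should be at least .5
PREFERRED = 0.7  # and preferably .7 or higher

# Items that fail the minimum guideline outright.
below_minimum = [item for item, value in loadings.items() if value < MINIMUM]

# Items that are acceptable but fall short of the preferred level.
below_preferred = [item for item, value in loadings.items()
                   if MINIMUM <= value < PREFERRED]

print("Below .5:", below_minimum)
print("Between .5 and .7:", below_preferred)
```

On these values no item falls below .5, and OC1, OC3 and EP1 are the three items below .7, which matches the discussion above.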
When examining convergent validity, we look at two additional measures: (1) Average Variance Extracted (AVE) by each construct. (2) Construct Reliabilities (CR). The AVE and CR are not provided by AMOS software so they have to be calculated.
HBAT CFA Three Factor Completely Standardized Factor Loadings, Variance Extracted, and Reliability Estimates

Item   Construct   Factor loading   Item reliability (squared loading, communality)   Delta
OC1    OC          0.59             0.349                                             0.65
OC2    OC          0.87             0.759                                             0.24
OC3    OC          0.67             0.448                                             0.55
OC4    OC          0.84             0.709                                             0.29
EP1    EP          0.69             0.477                                             0.52
EP2    EP          0.81             0.658                                             0.34
EP3    EP          0.77             0.596                                             0.40
EP4    EP          0.82             0.679                                             0.32
AC1    AC          0.82             0.676                                             0.32
AC2    AC          0.82             0.674                                             0.33
AC3    AC          0.84             0.699                                             0.30
AC4    AC          0.82             0.666                                             0.33

Construct   Sum of squared loadings   Variance extracted        Construct reliability
OC          2.264                     2.264 / 4 = 56.61%        0.84
EP          2.410                     60.25%                    0.86
AC          2.714                     67.86%                    0.89
The delta is calculated as 1 minus the item reliability, e.g., the AC4 delta is 1 - .666 = .33. The delta is also referred to as the standardized error variance.
The sum of the squared loadings is shown for each construct. For example, the squared loading for OC4 is .84² = .709 (using the unrounded loading).
$$\mathrm{VE} = \frac{\sum_{i=1}^{n} \lambda_i^{2}}{n}$$
Calculated Variance Extracted (AVE):
OC Construct: (.349 + .759 + .448 + .709) / 4 = 2.264 / 4 = .5661
EP Construct: (.477 + .658 + .596 + .679) / 4 = 2.410 / 4 = .6025
AC Construct: (.676 + .674 + .699 + .666) / 4 = 2.714 / 4 = .6786
In the formula above, λ represents the standardized factor loading and i indexes the items. So, for n items, AVE is computed as the sum of the squared standardized factor loadings divided by the number of items, as shown above. A good rule of thumb is that an AVE of .5 or higher indicates adequate convergent validity. An AVE of less than .5 indicates that, on average, there is more error remaining in the items than there is variance explained by the latent factor structure you have imposed on the measure. An AVE estimate should be computed for each latent construct in a measurement model.
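Because AMOS does not report the AVE, it has to be computed by hand or with a short script. The sketch below is one way to do it, assuming the item reliabilities (squared standardized loadings) listed in these notes; the function name `average_variance_extracted` is made up for the example.

```python
# Item reliabilities (squared standardized loadings) from the notes.
squared_loadings = {
    "OC": [0.349, 0.759, 0.448, 0.709],
    "EP": [0.477, 0.658, 0.596, 0.679],
    "AC": [0.676, 0.674, 0.699, 0.666],
}


def average_variance_extracted(item_reliabilities):
    """AVE = sum of squared standardized loadings / number of items."""
    return sum(item_reliabilities) / len(item_reliabilities)


ave = {construct: average_variance_extracted(values)
       for construct, values in squared_loadings.items()}

for construct, value in ave.items():
    verdict = "adequate" if value >= 0.5 else "inadequate"
    print(f"{construct}: AVE = {value:.4f} ({verdict} convergent validity)")
```

Since the inputs are already rounded to three decimals, the results can differ from the values quoted in the notes in the fourth decimal place.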
$$\mathrm{CR} = \frac{\left(\sum_{i=1}^{n} \lambda_i\right)^{2}}{\left(\sum_{i=1}^{n} \lambda_i\right)^{2} + \sum_{i=1}^{n} \delta_i}$$
CR (OC) = (.59 + .87 + .67 + .84)² / [(.59 + .87 + .67 + .84)² + (.65 + .24 + .55 + .29)] = 0.84
CR (EP) = (.69 + .81 + .77 + .82)² / [(.69 + .81 + .77 + .82)² + (.52 + .34 + .40 + .32)] = 0.86
CR (AC) = (.82 + .82 + .84 + .82)² / [(.82 + .82 + .84 + .82)² + (.32 + .33 + .30 + .33)] = 0.89
Construct reliability is computed from the squared sum of the factor loadings (λi) for each construct and the sum of the error variance terms for that construct (δi), using the above formula. Note: error variance is also referred to as delta. The rule of thumb for a construct reliability estimate is that .7 or higher suggests good reliability. Reliability between .6 and .7 may be acceptable provided that other indicators of a model's construct validity are good. A high construct reliability indicates that internal consistency exists. This means the measures all are consistently representing something.
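The construct reliability calculation can be reproduced with a few lines of Python. This is a sketch under the assumption that the two-decimal loadings and deltas reported in these notes are used as inputs; `construct_reliability` is a hypothetical helper name.

```python
# Standardized loadings and error variances (deltas) from the notes.
loadings = {
    "OC": [0.59, 0.87, 0.67, 0.84],
    "EP": [0.69, 0.81, 0.77, 0.82],
    "AC": [0.82, 0.82, 0.84, 0.82],
}
deltas = {
    "OC": [0.65, 0.24, 0.55, 0.29],
    "EP": [0.52, 0.34, 0.40, 0.32],
    "AC": [0.32, 0.33, 0.30, 0.33],
}


def construct_reliability(lambdas, errors):
    """CR = (sum of loadings)^2 / [(sum of loadings)^2 + sum of error variances]."""
    squared_sum = sum(lambdas) ** 2
    return squared_sum / (squared_sum + sum(errors))


cr = {construct: construct_reliability(loadings[construct], deltas[construct])
      for construct in loadings}

for construct, value in cr.items():
    print(f"CR ({construct}) = {value:.2f}")
```

Note that the same sum of loadings appears in the numerator and in the denominator; only the error variance term is added in the denominator.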
Discriminant Validity
Discriminant validity = the extent to which a construct is truly distinct from other constructs. Rule of Thumb: all construct average variance extracted (AVE) estimates should be larger than the corresponding squared interconstruct correlation estimates (SIC). If they are, this indicates the measured variables have more in common with the construct they are associated with than they do with the other constructs.
Correlations between the EP, AC and OC constructs. These are standardized covariances.
These are used in calculating discriminant validity.
In the table below we calculate the SIC (Squared Interconstruct Correlations) from the IC (Interconstruct Correlations) obtained from the correlations table on the AMOS printout (see previous slide):

Constructs   IC     SIC
EP and AC    .254   .0645
EP and OC    .500   .2500
AC and OC    .303   .0918
Discriminant validity compares the average variance extracted (AVE) estimates for each factor with the squared interconstruct correlations (SIC) associated with that factor, as shown below:
Construct      AVE     SIC
OC Construct   .5661   .2500, .0918
EP Construct   .6025   .0645, .2500
AC Construct   .6786   .0645, .0918
All variance extracted (AVE) estimates in the above table are larger than the corresponding squared interconstruct correlation estimates (SIC). This means the indicators have more in common with the construct they are associated with than they do with other constructs. Therefore, the HBAT three construct CFA model demonstrates discriminant validity.
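The AVE versus SIC comparison lends itself to a short script. The sketch below uses the AVE values and interconstruct correlations reported in these notes; the function name `discriminant_validity` and the data layout are assumptions made for the example.

```python
# AVE per construct and interconstruct correlations (IC) from the notes.
ave = {"OC": 0.5661, "EP": 0.6025, "AC": 0.6786}
correlations = {
    ("EP", "AC"): 0.254,
    ("EP", "OC"): 0.500,
    ("AC", "OC"): 0.303,
}


def discriminant_validity(ave_values, ic_values):
    """For each construct pair, return (SIC, True if both AVEs exceed the SIC)."""
    results = {}
    for (first, second), ic in ic_values.items():
        sic = ic ** 2  # squared interconstruct correlation
        passed = ave_values[first] > sic and ave_values[second] > sic
        results[(first, second)] = (sic, passed)
    return results


results = discriminant_validity(ave, correlations)
for (first, second), (sic, passed) in results.items():
    print(f"{first}-{second}: SIC = {sic:.4f}, AVE larger for both constructs: {passed}")
```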
Nomological Validity
Nomological validity is tested by examining whether the correlations between the constructs in the measurement model make sense. The construct correlations are used to assess this. To demonstrate nomological validity in the HBAT model, the constructs must be positively related based on our HBAT theory. For the HBAT three construct model all correlations are positive and significant (see next slide).
The interconstruct correlations are all positive and significant (see above Covariances table).
Error Variances (Unstandardized) To get the standardized error variances, subtract the squared standardized loadings shown below from 1 for each item.
These are the R-squared values (Squared Standardized Loadings). So subtract these from 1 to get the standardized error term estimate.
The Squared Multiple Correlations are also referred to as the squared loadings, i.e., they are calculated by squaring the standardized regression weights (loadings).
The squared loadings are used in calculating the average variance extracted (AVE) for each construct.
Path estimates = the completely standardized loadings (AMOS = standardized regression weights) that link the individual indicators to a particular construct. The recommended minimum is .7, but .5 is acceptable. Variables with insignificant or low loadings should be considered for deletion.
Standardized residuals = the individual differences between observed covariance terms and fitted covariance terms. The better the fit, the smaller the residual; these should not exceed |4.0|.
Modification indices = the amount the overall chi-square value would be reduced by freeing (estimating) any single particular path that is not currently estimated. That is, they indicate the impact on the chi-square of adding or deleting a path.
The largest residual is -2.0659 (EP3 & OC1), so no residuals exceed our guideline of |4.0|.
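Screening standardized residuals against the |4.0| guideline could look like the sketch below. Only the EP3/OC1 value of -2.0659 comes from these notes; the other two residuals are invented so that the example has something to iterate over, and `flag_large_residuals` is a hypothetical helper.

```python
# Standardized residuals keyed by the pair of indicators involved.
standardized_residuals = {
    ("EP3", "OC1"): -2.0659,  # largest residual reported in the notes
    ("EP1", "AC2"): 1.21,     # hypothetical value
    ("OC2", "AC4"): -0.48,    # hypothetical value
}

THRESHOLD = 4.0  # guideline from the notes: residuals should not exceed |4.0|


def flag_large_residuals(residuals, threshold=THRESHOLD):
    """Return the residuals whose absolute value exceeds the threshold."""
    return {pair: value for pair, value in residuals.items()
            if abs(value) > threshold}


flagged = flag_large_residuals(standardized_residuals)
largest_pair = max(standardized_residuals,
                   key=lambda pair: abs(standardized_residuals[pair]))

print("Largest residual:", largest_pair, standardized_residuals[largest_pair])
print("Residuals beyond the guideline:", flagged)
```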
No: refine measures and design a new study.
Yes: proceed to test the structural model with stages 5 and 6.
Note: this model is drawn by simply changing the relationships between EP and OC and between AC and OC to straight arrows. But the model does not run!
Copyright 2007 Prentice-Hall, Inc.
With the AMOS software you must add an error term on your endogenous variable.
This shows the change from the two-headed arrow to a single headed arrow.
Squared multiple correlation for the endogenous variable Organizational Commitment. It can be interpreted like the R² in multiple regression.
Standardized Regression Weights for indicator variables, also called Factor Loadings.
The unstandardized regression weights for the indicator variables are the same as with the CFA model. Interpretation is shown. To get this click on the estimate.
The new weights at the top are for the two new causal paths to the new endogenous variable Organizational Commitment. The standardized regression weights for the indicator variables are the same as with the CFA model.
Interpretation: When Environmental Perceptions go up by 1 standard deviation, Organizational Commitment goes up by .452 standard deviations.
Estimate of Squared Multiple Correlation It is estimated that the predictors of Organizational Commitment (constructs AC and EP) explain 28.3 percent of its variance (i.e., 71.7% of variance is unexplained).
Where Do We Go From Here? More AMOS Practice: Drawing the 5-Construct HBAT SEM Model