Suppose we have the following hypothesis about some variables from the World95.sav data set: a country's rate of male literacy (Y) is associated with a smaller rate of annual population increase (X1), a greater gross domestic product (X2), and a larger percentage of people living in cities (X3). First, let's look at the intercorrelations among these four variables. What we hope to find is that each of the three predictors has at least a moderate correlation with the Y variable, male literacy, but that the predictors are not too highly intercorrelated among themselves (avoiding multicollinearity). Let's check this by obtaining the zero-order correlations.
[Table: Correlations. Rows report Pearson Correlation, Sig. (2-tailed), and N for each variable, including People living in cities (%), Population increase (% per year)), and Gross domestic product / capita. Values not shown.]
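The zero-order correlations requested above are ordinary Pearson coefficients computed for each pair of variables. The following is a minimal sketch of that calculation, not the SPSS procedure itself: the function names and the toy values are hypothetical (they are not taken from World95.sav), and the code assumes complete cases, i.e. listwise exclusion has already been applied.

```python
from math import sqrt


def pearson_r(x, y):
    """Zero-order (Pearson) correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(columns):
    """Return {name: {name: r}} for every pair of columns (a dict of lists)."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical toy values for illustration only, NOT taken from World95.sav.
toy = {
    "literacy": [35.0, 50.0, 72.0, 90.0, 98.0],
    "pop_increase": [3.1, 2.8, 2.0, 1.1, 0.3],
    "urban": [20.0, 35.0, 48.0, 70.0, 77.0],
}
matrix = correlation_matrix(toy)
```

In a matrix like this, the row for the Y variable shows whether each predictor is at least moderately related to male literacy, and the predictor-by-predictor cells show how strongly the predictors overlap with one another.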
Now let's run a standard (simultaneous) multiple regression of Y (male literacy) on the three predictor variables.
a. Predictors: (Constant), Gross domestic product / capita, Population increase (% per year)), People living in cities (%)

ANOVA(b)
Model 1       Sum of Squares   df   Mean Square   F        Sig.
Regression    20478.670         3   6826.223      37.783   .000(a)
Residual      14634.107        81   180.668
Total         35112.776        84
a. Predictors: (Constant), Gross domestic product / capita, Population increase (% per year)), People living in cities (%)
b. Dependent Variable: Males who read (%)

Listwise exclusion: exclude a case if it is missing data on any one of the variables (more typical for multivariate analyses).
Pairwise exclusion: exclude a case only if it is missing x or y for the pair being analyzed (typical for bivariate analyses).
Multicollinearity Statistics
So far you have found partial, but not full, support for your hypothesis: given this analysis, it would have to be revised to leave out GDP as a predictor. What you appear to have found is that

Male literacy (in standard scores) = -.517 (Population increase in standard units) + .493 (People living in cities in standard units)

(not quite; the model needs to be rerun without GDP). But not so fast! First let's check to make sure we don't have any multicollinearity issues. Below are the collinearity statistics from the coefficients table. Recall that multicollinearity is a problem when tolerance approaches zero and VIF approaches 10. Everything below looks OK, so you can report that you have found modified support for your hypothesis, minus the effect of GDP.
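The two collinearity statistics mentioned here are linked by a simple identity: the tolerance of predictor j is 1 minus the R-squared obtained when predictor j is regressed on the other predictors, and VIF is the reciprocal of tolerance. A minimal sketch of that relationship follows; the function name and the R-squared values in the loop are hypothetical illustrations, not figures from the World95.sav output.

```python
def tolerance_and_vif(r_squared_j):
    """Collinearity statistics for one predictor.

    r_squared_j is the R-squared from regressing predictor j on all the
    other predictors in the model.
    """
    if not 0.0 <= r_squared_j < 1.0:
        raise ValueError("r_squared_j must be in [0, 1)")
    tolerance = 1.0 - r_squared_j   # share of predictor j's variance not shared with the others
    vif = 1.0 / tolerance           # variance inflation factor
    return tolerance, vif


# Hypothetical R-squared values, chosen only to show how the statistics move.
for r2 in (0.10, 0.50, 0.90):
    tol, vif = tolerance_and_vif(r2)
    print(f"R2_j = {r2:.2f}  tolerance = {tol:.2f}  VIF = {vif:.2f}")
```

As the shared variance grows, tolerance shrinks toward zero and VIF climbs toward the rule-of-thumb value of 10 cited above.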
Variables Entered/Removed(b)
Model   Variables Entered                       Variables Removed   Method
1       People living in cities (%)(a)          .                   Enter
2       Population increase (% per year))(a)    .                   Enter
3       Gross domestic product / capita(a)      .                   Enter

a. All requested variables entered.
b. Dependent Variable: Males who read (%)
Model Summary
                                                          Change Statistics
Model   R         R Square   Adjusted   Std. Error of   R Square   F Change   df1   df2   Sig. F
                             R Square   the Estimate    Change                            Change
1       .587(a)   .345       .337       16.649          .345       43.668     1     83    .000
2       .762(b)   .581       .571       13.397          .236       46.198     1     82    .000
3       .764(c)   .583       .568       13.441          .002       .457       1     81    .501

a. Predictors: (Constant), People living in cities (%)
b. Predictors: (Constant), People living in cities (%), Population increase (% per year))
c. Predictors: (Constant), People living in cities (%), Population increase (% per year)), Gross domestic product / capita
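The F Change column tests whether adding a block of predictors raises R Square by more than chance. A minimal sketch of the usual formula, F change = (change in R Square / df1) / ((1 - R Square of the larger model) / df2), is given below. The function name is hypothetical, and because the inputs are the three-decimal R Square values printed in the Model Summary, the result only approximates the F Change that SPSS reports.

```python
def f_change(r2_full, r2_reduced, df1, df2):
    """F test for the increase in R-squared when predictors are added.

    r2_full    : R-squared of the model with the added predictor(s)
    r2_reduced : R-squared of the model without them
    df1        : number of predictors added
    df2        : residual degrees of freedom of the full model (n - k - 1)
    """
    if not 0.0 <= r2_reduced <= r2_full < 1.0:
        raise ValueError("expected 0 <= r2_reduced <= r2_full < 1")
    return ((r2_full - r2_reduced) / df1) / ((1.0 - r2_full) / df2)


# Step from Model 1 to Model 2 in the Model Summary (rounded R Square values).
step_2 = f_change(r2_full=0.581, r2_reduced=0.345, df1=1, df2=82)
print(round(step_2, 2))
```

With the rounded inputs the result lands close to the reported F Change of 46.198 for Model 2; the small gap comes from rounding in the printed R Square values. The same calculation for Model 3 is dominated by rounding, since its R Square Change is only .002.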
ANOVA(d) (only the F and Sig. values for Models 2 and 3 are shown)
Model   F        Sig.
2       56.823   .000(b)
3       37.783   .000(c)

a. Predictors: (Constant), People living in cities (%)
b. Predictors: (Constant), People living in cities (%), Population increase (% per year))
c. Predictors: (Constant), People living in cities (%), Population increase (% per year)), Gross domestic product / capita
d. Dependent Variable: Males who read (%)
Variables Entered/Removed
Model   Variables Entered                    Variables Removed   Method
1       Population increase (% per year))                        Stepwise
2       People living in cities (%)                              Stepwise

Stepwise criteria (both steps): Probability-of-F-to-enter <= .005, Probability-of-F-to-remove >= .010.
Model Summary
                                                          Change Statistics
Model   R         R Square   Adjusted   Std. Error of   R Square   F Change   df1   df2   Sig. F
                             R Square   the Estimate    Change                            Change
1       .619(a)   .383       .376       16.155          .383       51.538     1     83    .000
2       .762(b)   .581       .571       13.397          .198       38.699     1     82    .000

a. Predictors: (Constant), Population increase (% per year))
b. Predictors: (Constant), Population increase (% per year)), People living in cities (%)
Overall F test
ANOVA(c)
Model                Sum of Squares   df   Mean Square   F        Sig.
1     Regression     13450.814         1   13450.814     51.538   .000(a)
      Residual       21661.963        83   260.988
      Total          35112.776        84
2     Regression     20396.129         2   10198.065     56.823   .000(b)
      Residual       14716.647        82   179.471
      Total          35112.776        84

a. Predictors: (Constant), Population increase (% per year))
b. Predictors: (Constant), Population increase (% per year)), People living in cities (%)
c. Dependent Variable: Males who read (%)
Here's your overall F test for the significance of the two-predictor model in explaining male literacy.
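The overall F is the regression mean square divided by the residual mean square, where each mean square is a sum of squares divided by its degrees of freedom. A minimal sketch using the Model 2 sums of squares from the ANOVA output above is shown here; the function names are hypothetical, and the code simply redoes the arithmetic rather than fitting a model.

```python
def overall_f(ss_regression, df_regression, ss_residual, df_residual):
    """Overall F statistic from the sums of squares in an ANOVA table."""
    ms_regression = ss_regression / df_regression   # mean square for the model
    ms_residual = ss_residual / df_residual         # mean square error
    return ms_regression / ms_residual


def r_squared(ss_regression, ss_total):
    """Proportion of variance in Y accounted for by the model."""
    return ss_regression / ss_total


# Model 2 (population increase + people living in cities), values from the ANOVA table.
f_model_2 = overall_f(20396.129, 2, 14716.647, 82)
r2_model_2 = r_squared(20396.129, 35112.776)
print(round(f_model_2, 3), round(r2_model_2, 3))
```

Both results agree with the printed output: F of about 56.82 on 2 and 82 degrees of freedom, and an R Square of about .581.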
The results provide partial confirmation for the research hypothesis: rate of male literacy is a linear function of the country's annual population increase and the percentage of people living in cities (R = .762, R² = .581). The overall F for the two-variable model was 56.823, df = 2, 82, p < .001. Standardized beta weights were -.502 for annual population increase and .460 for percentage of people living in cities.
In the final model, only population increase and people living in cities were retained; GDP was left out.