Table 7 shows the regression models for all VIs and nonresponse rates (NRRs)
that were checked for adequacy. The columns are as follows: (a) VI, (b)
NRR, (c) IC, (d) the prediction model, (e) the coefficient of determination (R2), and (f) the
F-statistic and its p-value. For each imputation class, the variables in the regression
model are defined as follows: y, the dependent variable, is the second visit VI
(TOTEX2 or TOTIN2), and LN(FV), the independent variable, is the ln-transformed first
visit VI (TOTEX1 or TOTIN1).
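As a rough illustration of the kind of model described here, the following minimal sketch fits a simple least-squares line to ln-transformed values for a single imputation class. It assumes the ln transformation is applied to both variables and that each class gets its own one-predictor model; the function names and the sample values are hypothetical and are not taken from Table 7 or from the survey data.

```python
import math

def fit_loglog(first_visit, second_visit):
    """Fit ln(second) = b0 + b1 * ln(first) by ordinary least squares."""
    x = [math.log(v) for v in first_visit]
    y = [math.log(v) for v in second_visit]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx            # slope on the ln scale
    b0 = y_bar - b1 * x_bar   # intercept on the ln scale
    return b0, b1

def predict_second_visit(b0, b1, first_visit_value):
    """Predict on the original scale by exponentiating the fitted ln value.

    This naive back-transformation is an assumption of the sketch; the
    paper does not say how predictions were returned to the original scale.
    """
    return math.exp(b0 + b1 * math.log(first_visit_value))

# Hypothetical first and second visit values for one imputation class.
totex1 = [1000.0, 2500.0, 4000.0, 8000.0]
totex2 = [1100.0, 2400.0, 4300.0, 7900.0]
b0, b1 = fit_loglog(totex1, totex2)
imputed = predict_second_visit(b0, b1, 3000.0)
```

In an imputation setting, `predict_second_visit` would be called for a nonrespondent whose first visit value is known, using the coefficients fitted on the respondents in the same class.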
To arrive at regression models with better results, a logarithmic (ln)
transformation was applied to both the dependent and independent variables. For all ICs
under the varying NRRs, the independent variable is the first visit VI (TOTEX1 or TOTIN1)
and the dependent variable is the second visit VI (TOTEX2 or TOTIN2).
After applying the transformation, the following results were obtained:
First, to determine the explanatory power of the first visit VI for the second visit VI, the
coefficient of determination (R2) was examined, since it indicates how well the model fits
the data. The highest R2 in Table 7 is 93.2%, for the third imputation class of the TOTEX2
variable under the highest NRR, while the lowest is 70.3%, for the first
imputation class of the TOTIN2 variable under the 20% NRR. For all NRRs and VIs, the third
IC generated the highest R2 while the first IC produced the lowest.
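For reference, the coefficient of determination is conventionally defined as below. This is the standard textbook definition rather than a formula quoted from the paper; the symbols $y_i$ (observed, transformed second visit value), $\hat{y}_i$ (fitted value), and $\bar{y}$ (sample mean) are notation assumed here.

```latex
R^2 \;=\; 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}
                   {\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2}
```

For a least-squares model with a single predictor and an intercept, this equals the squared sample correlation between the predictor and the response, so an R2 of 93.2% corresponds to a correlation of about 0.965 in absolute value.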
Second, the ANOVA tables presented in Section C of the Appendix were used to check
whether the models satisfy the linearity assumption. Results show that all
models satisfy this assumption. The p-values for all the models were less
than 0.0001, an indication that the linear relationship in each model is highly significant.
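The F-statistic reported in an ANOVA table for a one-predictor regression can be sketched as follows. This is a minimal illustration of the usual decomposition into regression and error sums of squares, assuming a model with one predictor and an intercept; the function name and the toy numbers are hypothetical and do not reproduce the appendix tables.

```python
def regression_f_statistic(x, y):
    """Return (F, (df_regression, df_error)) for a one-predictor OLS fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)   # total sum of squares
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ssr = sxy ** 2 / sxx                       # regression sum of squares
    sse = syy - ssr                            # error (residual) sum of squares
    df_reg, df_err = 1, n - 2
    f_stat = (ssr / df_reg) / (sse / df_err)
    return f_stat, (df_reg, df_err)

# Toy values on the transformed scale, for illustration only.
f_stat, dfs = regression_f_statistic([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0])
```

The p-value is the upper-tail probability of an F distribution with these degrees of freedom; if SciPy is available (an assumption of this note), `scipy.stats.f.sf(f_stat, *dfs)` is one way to obtain it.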
Third, to test the assumption of independence, the Durbin-Watson test was
implemented. Results in Section C of the Appendix show that all of the models satisfy the
assumption of independence. However, since the data in this paper are not time series
data, where the assumption of independence of the error terms is relatively important, the
Fourth, to determine whether the models meet the assumption of normality, the
normal probability plot (NPP) of the residuals was obtained. In all models, the NPP
moderately follows an S-shaped pattern, which indicates that the residuals are not normal
but rather lognormal. However, the shape of the NPP improved after the ln transformation
was applied, even though it is still not linear. Because the data used are complex, the models
were used even though normality of the residuals is not perfectly achieved.
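The points of a normal probability plot can be computed as in the sketch below, which pairs the sorted residuals with theoretical standard-normal quantiles. The plotting positions (i - 0.5)/n are one common convention and are an assumption here, as are the function names and the sample residuals; the paper does not state which software or convention produced its plots.

```python
from statistics import NormalDist

def normal_probability_points(residuals):
    """Pair each ordered residual with its theoretical standard-normal quantile."""
    n = len(residuals)
    ordered = sorted(residuals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n: one common convention among several.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

def plot_correlation(points):
    """Pearson correlation of the plotted points; values near 1 suggest a straight line."""
    n = len(points)
    t_bar = sum(t for t, _ in points) / n
    r_bar = sum(r for _, r in points) / n
    stt = sum((t - t_bar) ** 2 for t, _ in points)
    srr = sum((r - r_bar) ** 2 for _, r in points)
    s_tr = sum((t - t_bar) * (r - r_bar) for t, r in points)
    return s_tr / (stt * srr) ** 0.5

# Hypothetical residuals, for illustration only.
example_residuals = [-0.42, 0.10, 0.03, -0.15, 0.61, -0.08, 0.22, -0.31]
points = normal_probability_points(example_residuals)
straightness = plot_correlation(points)
```

Plotting the pairs in `points` gives the NPP; a pronounced S shape, or a correlation well below 1, points to departures from normality of the kind described above.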
Fifth, to check the assumption of constant variance (homoscedasticity), a
scatter plot of the residuals against the predicted values was obtained. Results showed
that there were no patterns evident in the scatter plot. The logarithmic transformation