Errors are assumed to be:
1. Independent,
2. Mean = 0,
3. Constant standard deviation, for all X, and
4. Normally distributed.

Check that the data are:
1. Linear: inspect the scatter-plot of the data.
2. Independent: look for the residuals to be scattered without pattern on a residual plot.

H0: B1 = 0, H0: B2 = 0, etc.
Ha: B1 != 0, Ha: B2 != 0, etc.

Option 2: Higher Order Models

Y = B0 + B1 X + B2 X^2 + err (Quadratic)
Y = B0 + B1 X + B2 X^2 + B3 X^3 + err (Cubic)

A transformation should be used wherever possible, instead of a higher order model.

Partial R2

Partial R2 = [(the R2 between Y and both predictors) - (the R2 between Y and just X1)] / [100% - (the R2 between Y and just X1)]
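
As a rough sketch of the partial R2 calculation, the Python snippet below fits Y on X1 alone and then on both predictors, and divides the gain in R2 by the share of variability that X1 alone left unexplained: (R2 with both predictors minus R2 with just X1) / (1 minus R2 with just X1). The helper names (`r_squared`, `partial_r_squared`) and the simulated data are illustrative assumptions, not part of the original notes.

```python
import numpy as np

def r_squared(X, y):
    """R2 of a least-squares fit of y on the columns of X (intercept added here)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    sse = float(residuals @ residuals)          # unexplained sum of squares
    sst = float(((y - y.mean()) ** 2).sum())    # total sum of squares
    return 1.0 - sse / sst

def partial_r_squared(x1, x2, y):
    """Share of the variability left over after X1 that adding X2 explains."""
    r2_x1 = r_squared(x1.reshape(-1, 1), y)
    r2_both = r_squared(np.column_stack([x1, x2]), y)
    return (r2_both - r2_x1) / (1.0 - r2_x1)

# Simulated data, purely for illustration (an assumption, not from the notes).
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(size=200)

print("partial R2 of X2 given X1:", partial_r_squared(x1, x2, y))
```

The result is a proportion between 0 and 1; multiply by 100 to express it as a percentage.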
Adjusted R2 is used for comparing best subsets: it adjusts for the number of variables, so it can indicate which of several models is better.

Non-adjusted R2 measures the predictive strength of a model.

Check for multicollinearity problems in addition to adjusted R2.

Including Categorical Variables in the Regression Model

So far we have used quantitative variables. Often we want to include categorical variables in the prediction model.

For a category with two levels, we can add a dummy variable coded 0 or 1.

The Beta coefficient shows how much higher or lower, on average, the response is for the category coded as 1 compared to the response for the category coded as 0.

A category can also have more levels. For three levels we must add two dummy variables, each of which is binary. The group represented by (0,0) is the baseline group.

Modeling Interaction

B3, the coefficient of the interaction term, can be read in two ways:

1. It is the difference in slopes. It measures how the quantitative effect depends on the group value, in the following sense: for every one unit increase in the quantitative variable X1, the change in Y is B3 greater/less in the category coded as 1 relative to the category coded as 0.

2. B3 captures how the vertical difference between the lines changes, which measures how the group effect depends on the quantitative value, in the following sense: for a particular value x1 of the quantitative variable, the difference in Y for the group coded as 1 compared to the group coded as 0 is more/less than it would be without the interaction term, by an amount equal to x1 times B3 (on average).

One-Way ANOVA

Regression models Q → Q (quantitative explanatory variable, quantitative response); ANOVA models C → Q (categorical explanatory variable, quantitative response).

H0: μ1 = μ2 = ... = μn
HA: The μ's are not all the same.

F statistic

F = (SSG/DFG)/(SSE/DFE) = MSG/MSE

R2 = sum of squares for groups / sum of squares total = SSG/SSTotal = SSG/(SSG+SSE)

R2 is the proportion of variability in Y attributable to (or predicted from) the explanatory variable(s), based on the model in question.

Tukey Pairwise Comparisons

If an interval for the difference doesn't contain zero, then we conclude that the two groups in that pair are different from each other.

Two-Way ANOVA

The Danger of Multicollinearity

In a multiple linear regression, multicollinearity is when the explanatory variables are perfectly or strongly linearly associated with each other.

Ideally we want to have weakly correlated predictors.

Predictors are weakly correlated if the coefficients are similar between the single-variable analysis and the multi-variable analysis, and the standard errors are the same or smaller.

Total multicollinearity means that one explanatory variable depends entirely on another. This will give us an error, because we'd divide by zero.

Strong multicollinearity, but not total, will give misleading results. It shows up as wildly different coefficients and standard errors. Also look for high correlations between predictors. A more precise method is VIF.

VIF

VIF is the reciprocal of a measure called tolerance, associated with each explanatory variable.

Tolerance is obtained by:
1. Produce a multiple linear regression with Xj as the response, versus all the other explanatory variables:
   Xj = B0 + B1 X1 + B2 X2 + ...
2. Obtain the R2 from this regression.
3. Tolerance = 1 - R2, so VIF = 1/(1 - R2).
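
The tolerance steps for VIF can be sketched in Python with NumPy as below. The sketch regresses each Xj on the other explanatory variables, takes the R2, and uses the standard definitions tolerance = 1 - R2 and VIF = 1/tolerance. The function name `vif` and the simulated predictors are assumptions made for illustration only.

```python
import numpy as np

def vif(X):
    """VIF for each column of X (rows are observations, columns are predictors)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    values = []
    for j in range(k):
        xj = X[:, j]                                   # Xj plays the role of the response
        others = np.delete(X, j, axis=1)               # all the other explanatory variables
        design = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(design, xj, rcond=None)
        resid = xj - design @ coef
        r2 = 1.0 - float(resid @ resid) / float(((xj - xj.mean()) ** 2).sum())
        tolerance = 1.0 - r2
        # Total multicollinearity gives tolerance 0; report that as an infinite VIF.
        values.append(np.inf if tolerance <= 0.0 else 1.0 / tolerance)
    return np.array(values)

# Simulated predictors (illustrative assumption): x2 is nearly a copy of x1, x3 is unrelated.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)
x3 = rng.normal(size=200)
X_demo = np.column_stack([x1, x2, x3])

print(vif(X_demo))
```

In this made-up data the first two predictors should show large VIF values, while the unrelated third predictor should stay close to 1.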
ANCOVA

When a repeated-measures design is not feasible, or would not be desirable, there is an alternative analysis. This is ANCOVA, so called because the added variables are covariates. We include other quantitative variables that are correlated with the response, but are not part of the main experimental manipulation.

ANCOVA's usefulness depends on a good correlation between the covariate and the response; otherwise nothing is gained by it.

For studies with human subjects, typical covariates are age, socioeconomic status, aptitude, or pre-study attitude.

Ordinal logistic regression

Used in cases when the response variable Y is a categorical variable with more than two levels which have a natural ordering.

Binary logistic regression

exp(B1) is the odds ratio: (odds favoring Y=1 when X = x+1) / (odds favoring Y=1 when X = x).

H0: B1 = 0
HA: B1 != 0

For binary logistic regression there is no equivalent measure of association with the nice mathematical properties of R2. Instead, look at the percent of concordant pairs.

Concordant: if X1 > X2, then Y1 > Y2.
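
For the binary logistic notes, here is a small Python sketch of the two ideas: that exp(B1) equals the ratio of the odds at X = x+1 to the odds at X = x, and how a percent of concordant pairs might be counted. It assumes the usual logistic model log(odds) = B0 + B1 X. It also assumes that pairs are formed between one observation with Y=1 and one with Y=0, and that a pair counts as concordant when the Y=1 observation has the higher predicted probability. The coefficient values, the data, and the function names are all hypothetical.

```python
import math

def odds(b0, b1, x):
    """Odds favoring Y=1 at X=x, assuming log(odds) = b0 + b1*x."""
    return math.exp(b0 + b1 * x)

def prob(b0, b1, x):
    """Predicted probability that Y=1 at X=x."""
    o = odds(b0, b1, x)
    return o / (1.0 + o)

def percent_concordant(probs, ys):
    """Percent of (Y=1, Y=0) pairs where the Y=1 case has the higher predicted probability."""
    events = [p for p, y in zip(probs, ys) if y == 1]
    non_events = [p for p, y in zip(probs, ys) if y == 0]
    pairs = len(events) * len(non_events)
    concordant = sum(1 for pe in events for pn in non_events if pe > pn)
    return 100.0 * concordant / pairs

# Hypothetical coefficients and data, for illustration only.
b0, b1 = -1.0, 0.7
ratio = odds(b0, b1, 3.0) / odds(b0, b1, 2.0)
print("odds ratio for a one unit increase:", ratio, "exp(B1):", math.exp(b1))

xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [0, 0, 1, 0, 1, 1]
probs = [prob(b0, b1, x) for x in xs]
print("percent concordant pairs:", percent_concordant(probs, ys))
```

With a single predictor and a positive B1, the predicted probability rises with X, so comparing predicted probabilities is the same as comparing X values, which matches the definition of concordant in the notes.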