There is little extra to know beyond regression with one explanatory variable.
The main addition is the F-test for overall fit.
This requires the Data Analysis Add-in: see Excel 2007: Access and Activating the Data Analysis Add-in
The data used are in carsdata.xls
We then create a new variable in cells C2:C6, cubed household size as a regressor.
Then in cell C1 give the heading CUBED HH SIZE.
(It turns out that for these data squared HH SIZE has a coefficient of exactly 0.0, so the cube is used instead.)
We have regression with an intercept and the regressors HH SIZE and CUBED HH SIZE
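As a rough illustration of the transformation described above, the sketch below cubes a column of household sizes and lays out the regressors row by row. The values in `hh_size` are hypothetical placeholders (the contents of carsdata.xls are not reproduced in this document), and all variable names are assumptions made for the example.

```python
# Hypothetical household sizes standing in for cells B2:B6;
# the actual values in carsdata.xls are not shown in this document.
hh_size = [1, 2, 3, 4, 5]

# Equivalent of filling C2:C6 with =B2^3, =B3^3, and so on.
cubed_hh_size = [x ** 3 for x in hh_size]

# One row per observation: intercept term, HH SIZE, CUBED HH SIZE.
design_rows = [(1, x, x3) for x, x3 in zip(hh_size, cubed_hh_size)]

for row in design_rows:
    print(row)
```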
…ucdavis.edu/…/ex61multipleregression… 1/7
10/21/2010 EXCEL Multiple Regression
We do this using the Data Analysis Add-in and Regression.
The only change over one-variable regression is to include more than one column in the Input X Range.
Note, however, that the regressors need to be in contiguous columns (here columns B and C).
If this is not the case in the original data, then columns need to be copied to get the regressors in contiguous
columns.
Hitting OK we obtain
Statistic            Value      Explanation
Multiple R           0.895828   R = square root of R2
R Square             0.802508   R2
Adjusted R Square    0.605016   Adjusted R2, used if more than one x variable
Standard Error       0.444401   Sample estimate of the standard deviation of the error u
Observations         5          Number of observations used in the regression (n)
The standard error here refers to the estimated standard deviation of the error term u.
It is sometimes called the standard error of the regression. It equals sqrt(SSE/(n-k)).
It is not to be confused with the standard error of y itself (from descriptive statistics) or with the standard errors
of the regression coefficients given below.
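The formula sqrt(SSE/(n-k)) can be checked by hand. The sketch below is a minimal illustration that takes SSE to be the Residual SS of 0.3950 reported in the ANOVA table, with n = 5 and k = 3; the variable names are assumptions. Because the inputs are rounded, the result agrees with the Excel value only to about four decimal places.

```python
import math

residual_ss = 0.3950   # Residual SS (SSE) as reported in the ANOVA table
n = 5                  # number of observations
k = 3                  # number of regressors including the intercept

# Standard error of the regression: sqrt(SSE / (n - k)).
se_regression = math.sqrt(residual_ss / (n - k))
print(round(se_regression, 4))
```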
R2 = 0.8025 means that 80.25% of the variation of yi around ybar (its mean) is explained by the regressors x2i
and x3i.
            df   SS       MS       F        Significance F
Regression  2    1.6050   0.8025   4.0635   0.1975
Residual    2    0.3950   0.1975
Total       4    2.0
The ANOVA (analysis of variance) table splits the sum of squares into its components.
For example:
R2 = 1 - Residual SS / Total SS (general formula for R2)
= 1 - 0.3950 / 2.0 (from data in the ANOVA table)
= 0.8025 (which equals R2 given in the regression Statistics table).
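The same arithmetic can be reproduced outside Excel. The sketch below recomputes R2 from the ANOVA sums of squares and, as an addition not spelled out in the text, applies the standard degrees-of-freedom adjustment 1 - (1 - R2)(n - 1)/(n - k) to obtain adjusted R2. Variable names are assumptions, and the inputs are the rounded values from the table.

```python
import math

residual_ss = 0.3950   # Residual SS from the ANOVA table
total_ss = 2.0         # Total SS from the ANOVA table
n, k = 5, 3            # observations; regressors including the intercept

# R2 = 1 - Residual SS / Total SS.
r_squared = 1 - residual_ss / total_ss

# Multiple R is the square root of R2.
multiple_r = math.sqrt(r_squared)

# Adjusted R2 penalizes additional regressors via degrees of freedom.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k)

print(round(r_squared, 4), round(multiple_r, 4), round(adj_r_squared, 4))
```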
The column labeled F gives the overall F-test of H0: β2 = 0 and β3 = 0 versus Ha: at least one of β2 and β3 does
not equal zero.
Aside: Excel computes this F as:
F = [Regression SS/(k-1)] / [Residual SS/(n-k)] = [1.6050/2] / [.39498/2] = 4.0635.
Note: Significance F in general = FDIST(F, k-1, n-k) where k is the number of regressors including the intercept.
Here FDIST(4.0635,2,2) = 0.1975.
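For readers who want to verify these two numbers without Excel, the sketch below recomputes F from the sums of squares and approximates Significance F as the upper-tail probability of the F distribution. It uses only the standard library: the tail probability is obtained from the identity P(F > f) = I_x(d2/2, d1/2) with x = d2/(d2 + d1·f), where I is the regularized incomplete beta function, evaluated here by a simple midpoint rule. The function name `f_upper_tail` and the step count are assumptions for this example, not part of any library.

```python
import math

def f_upper_tail(f, d1, d2, steps=100_000):
    """Approximate P(F > f) for an F(d1, d2) variable.

    Integrates the Beta(d2/2, d1/2) density from 0 to d2/(d2 + d1*f)
    with the midpoint rule (a rough numerical sketch, not a library routine).
    """
    a, b = d2 / 2.0, d1 / 2.0
    x = d2 / (d2 + d1 * f)
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    h = x / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += math.exp(log_norm + (a - 1) * math.log(t)
                          + (b - 1) * math.log(1 - t))
    return total * h

regression_ss = 1.6050
residual_ss = 0.39498      # unrounded value quoted in the text
n, k = 5, 3

f_stat = (regression_ss / (k - 1)) / (residual_ss / (n - k))
significance_f = f_upper_tail(f_stat, k - 1, n - k)

print(round(f_stat, 4), round(significance_f, 4))
```

With 2 and 2 degrees of freedom the tail probability reduces to 1/(1 + F), which gives a quick sanity check on the numerical result.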
The regression output of most interest is the following table of coefficients and associated output:
               Coefficient  Standard Error  t Stat   P-value  Lower 95%  Upper 95%
HH SIZE        0.33647      0.42270         0.7960   0.5095   -1.4823    2.1552
CUBED HH SIZE  0.00209      0.01311         0.1594   0.8880   -0.0543    0.0585
Let βj denote the population coefficient of the jth regressor (intercept, HH SIZE and CUBED HH SIZE).
Then the 95% confidence interval for the slope coefficient β2 is, from the Excel output, (-1.4823, 2.1552).
The coefficient of HH SIZE has estimated standard error of 0.4227, t-statistic of 0.7960 and p-value of 0.5095.
It is therefore statistically insignificant at significance level α = .05 as p > 0.05.
The coefficient of CUBED HH SIZE has estimated standard error of 0.0131, t-statistic of 0.1594 and p-value of
0.8880.
It is therefore statistically insignificant at significance level α = .05 as p > 0.05.
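The t-statistics, p-values and 95% confidence intervals quoted above can be recovered from the coefficients and standard errors alone. The sketch below does so using closed-form expressions for the t distribution that hold only when there are exactly 2 degrees of freedom (here n - k = 5 - 3 = 2); with other degrees of freedom a statistical library would be needed. The helper names are assumptions for this example.

```python
import math

def t2_two_sided_p(t):
    """Two-sided p-value for a t statistic with exactly 2 degrees of freedom."""
    return 1 - abs(t) / math.sqrt(2 + t * t)

def t2_critical(alpha):
    """Two-sided critical value t_{alpha/2}(2), valid only for 2 degrees of freedom."""
    p = 1 - alpha / 2
    return (2 * p - 1) / math.sqrt(2 * p * (1 - p))

# (coefficient, standard error) pairs taken from the Excel output.
rows = {
    "HH SIZE": (0.33647, 0.42270),
    "CUBED HH SIZE": (0.00209, 0.01311),
}

crit = t2_critical(0.05)
results = {}
for name, (b, se) in rows.items():
    t = b / se  # t-statistic for H0: coefficient = 0
    results[name] = (t, t2_two_sided_p(t), b - crit * se, b + crit * se)
    print(name, [round(v, 4) for v in results[name]])
```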
To test H0: β2 = 1.0 against Ha: β2 ≠ 1.0, compute
t = (b2 - H0 value of β2) / (standard error of b2 )
= (0.33647 - 1.0) / 0.42270
= -1.569.
We computed t = -1.569
The critical value is t_.025(2) = TINV(0.05,2) = 4.303. [Here n=5 and k=3 so n-k=2].
So do not reject the null hypothesis at level .05 since |t| = 1.569 < 4.303.
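The decision rule above can be written out as a short check. This is a minimal sketch using the coefficient and standard error from the Excel output and the critical value 4.303 quoted in the text; the variable names are assumptions.

```python
b2 = 0.33647        # estimated coefficient of HH SIZE
se_b2 = 0.42270     # its estimated standard error
beta2_null = 1.0    # value of β2 under H0
t_crit = 4.303      # TINV(0.05, 2), as quoted in the text

# t = (estimate - hypothesized value) / standard error
t_stat = (b2 - beta2_null) / se_b2

# Two-sided test: reject H0 only if |t| exceeds the critical value.
reject_h0 = abs(t_stat) > t_crit

print(round(t_stat, 3), reject_h0)
```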
We test H0: β2 = 0 and β3 = 0 versus Ha: at least one of β2 and β3 does not equal zero.
From the ANOVA table the F-test statistic is 4.0635 with p-value of 0.1975.
Since the p-value is not less than 0.05 we do not reject the null hypothesis that the regression parameters are
zero at significance level 0.05.
Conclude that the parameters are jointly statistically insignificant at significance level 0.05.
Note: Significance F in general = FDIST(F, k-1, n-k) where k is the number of regressors including the intercept.
Here FDIST(4.0635,2,2) = 0.1975.
Consider the case where x = 4, in which case CUBED HH SIZE = x^3 = 4^3 = 64.
EXCEL LIMITATIONS
Excel standard errors and t-statistics and p-values are based on the assumption that the error is independent with
constant variance (homoskedastic).
Excel does not provide alternatives, such as heteroskedasticity-robust or autocorrelation-robust standard errors and
t-statistics and p-values.
More specialized software such as STATA, EVIEWS, SAS, LIMDEP, PC-TSP, ... is needed.