Studenmund Ch03 v2

Chapter 3
Learning to Use
Regression
Analysis
Copyright 2011 Pearson Addison-Wesley. Slides by Niels-Hugo Blunch

All rights reserved. Washington and Lee University
Steps in Applied
Regression Analysis
The first step is choosing the dependent variable this step is

determined by the purpose of the research (see Chapter 11 for
details)
After choosing the dependent variable, its logical to follow the
following sequence:
1. Review the literature and develop the theoretical model
2. Specify the model: Select the independent variables and the
functional form
3. Hypothesize the expected signs of the coefficients
4. Collect the data. Inspect and clean the data
5. Estimate and evaluate the equation
6. Document the results
2011 Pearson Addison-Wesley. All rights reserved. 3-1

Step 1: Review the Literature and
Develop the Theoretical Model
Perhaps counter intuitively, a strong theoretical foundation

is the best start for any empirical project
Reason: main econometric decisions are determined by the
underlying theoretical model
Useful starting points:
Journal of Economic Literature or a business oriented publication of
abstracts
Internet search, including Google Scholar
EconLit, an electronic bibliography of economics literature (for more
details, go to www.EconLit.org)

Step 2: Specify the Model: Independent
Variables and Functional Form
After selecting the dependent variable, the

specification of a model involves choosing the
following components:
1. the independent variables and how they should be
measured,
2. the functional (mathematical) form of the variables,
and
3. the properties of the stochastic error term

Step 2: Specify the Model:
Independent Variables and
Functional Form (cont.)
A mistake in any of the three elements results in a specification error

For example, only theoretically relevant explanatory variables should
be included
Even so, researchers frequently have to make choices also denoted
imposing their priors
Example:
when estimating a demand equation, theory informs us that prices of
complements and substitutes of the good in question are important
explanatory variables
But which complementsand which substitutes?

Step 3: Hypothesize the Expected
Signs of the Coefficients
Once the variables are selected, its important to

hypothesize the expected signs of the regression
coefficients
Example: demand equation for a final consumption good
First, state the demand equation as a general function:
(3.2)
The signs above the variables indicate the hypothesized

sign of the respective regression coefficient in a linear
model
Step 4: Collect the Data & Inspect
and Clean the Data
A general rule regarding sample size is the more

observations the better
as long as the observations are from the same general
population!
The reason for this goes back to notion of degrees of
freedom (mentioned first in Section 2.4)
When there are more degrees of freedom:
Every positive error is likely to be balanced by a negative error
(see Figure 3.2)
The estimated regression coefficients are estimated with a
greater deal of precision

Figure 3.1 Mathematical Fit of a
Line to Two Points

Figure 3.2 Statistical Fit of a Line
to Three Points

Step 4: Collect the Data & Inspect
and Clean the Data (cont.)
Estimate model using the data in Table 2.2 to get:

Inspecting the dataobtain a printout or plot (graph)
of the data
Reason: to look for outliers
An outlier is an observation that lies outside the range of the rest of
the observations
Examples:
Does a student have a 7.0 GPA on a 4.0 scale?
Is consumption negative?

Step 5: Estimate and Evaluate
the Equation
Once steps 14 have been completed, the estimation part

is quick
using Eviews or Stata to estimate an OLS regression takes less
than a second!
The evaluation part is more tricky, however, involving
answering the following questions:
How well did the equation fit the data?
Were the signs and magnitudes of the estimated coefficients as
expected?
Afterwards may add sensitivity analysis (see Section 6.4
for details)

Step 6: Document the Results
A standard format usually is used to present estimated

regression results:
(3.3)
The number in parentheses under the estimated coefficient

is the estimated standard error of the estimated
coefficient, and the t-value is the one used to test the
hypothesis that the true value of the coefficient is different
from zero (more on this later!)

Case Study: Using Regression Analysis
to Pick Restaurant Locations
Background:
You have been hired to determine the best location
for the next Woodys restaurant (a moderately priced,
24-hour, family restaurant chain)
Objective:
How to decide location using the six basic steps of
applied regression analysis, discussed earlier?

Step 1: Review the Literature and
Develop the Theoretical Model
Background reading about the restaurant industry

Talking to various experts within the firm
All the chains restaurants are identical and located in
suburban, retail, or residential environments
So, lack of variation in potential explanatory variables to help
determine location
Number of customers most important for locational decision
Dependent variable: number of customers (measured by
the number of checks or bills)

Variables and Functional Form
More discussions with in-house experts

reveal three major determinants of sales:
Number of people living near the location
General income level of the location
Number of direct competitors near the location

Variables and Functional Form (cont.)
Based on this, the exact definitions of the independent

variables you decide to include are:
N = Competition: the number of direct competitors within a two-
mile radius of the Woodys location
P = Population: the number of people living within a three-mile
radius of the location
I = Income: the average household income of the population
measured in variable P
With no reason to suspect anything other than linear
functional form and a typical stochastic error term,
thats what you decide to use

Step 3: Hypothesize the Expected
Signs of the Coefficients
After talking some more with the in-house

experts and thinking some more, you
come up with the following:
(3.4)

Step 4: Collect the Data &
Inspect and Clean the Data
You manage to obtain data on the dependent and

independent variables for all 33 Woodys restaurants
Next, you inspect the data
The data quality is judged as excellent because:
Each manager measures each variable identically
All restaurants are included in the sample
All information is from the same year
The resulting data is as given in Tables 3.1 and 3.3 in the

book (using Eviews and Stata, respectively)

Step 5: Estimate and Evaluate
the Equation
You take the data set and enter it into the computer
You then run an OLS regression (after thinking the model over one
last time!)
The resulting model is:
(3.5)
Estimated coefficients are as expected and the fit is reasonable

Values for N, P, and I for each potential new location are then
obtained and plugged into (3.5) to predict Y

Step 6: Document the Results
The results summarized in Equation 3.5

meet our documentation requirements
Hence, you decide that theres no need to
take this step any further

Table 3.1a
Data for the Woodys Restaurants Example
(Using the Eviews Program)

Table 3.1b

Table 3.1c

Table 3.2a
Actual Computer Output

Table 3.2b

Table 3.3
(Using the Stata Program)

Table 3.3b

Table 3.4a

Table 3.4b

Key Terms from Chapter 3
The six steps in applied regression analysis

Dummy variable
Cross-sectional data set
Specification error
Degrees of freedom

Studenmund Ch03 v2

Încărcat de

Informații document

Titlu original

Drepturi de autor

Formate disponibile

Partajați acest document

Partajați sau inserați document

Opțiuni de partajare

Vi se pare util acest document?

Este necorespunzător acest conținut?

Drepturi de autor:

Formate disponibile

Studenmund Ch03 v2

Încărcat de

Drepturi de autor:

Formate disponibile

Chapter 3

Copyright 2011 Pearson Addison-Wesley. Slides by Niels-Hugo Blunch

The first step is choosing the dependent variable this step is

2011 Pearson Addison-Wesley. All rights reserved. 3-1

Perhaps counter intuitively, a strong theoretical foundation

2011 Pearson Addison-Wesley. All rights reserved. 3-2

After selecting the dependent variable, the

2011 Pearson Addison-Wesley. All rights reserved. 3-3

A mistake in any of the three elements results in a specification error

2011 Pearson Addison-Wesley. All rights reserved. 3-4

Once the variables are selected, its important to

The signs above the variables indicate the hypothesized

A general rule regarding sample size is the more

2011 Pearson Addison-Wesley. All rights reserved. 3-6

2011 Pearson Addison-Wesley. All rights reserved. 3-7

2011 Pearson Addison-Wesley. All rights reserved. 3-8

Estimate model using the data in Table 2.2 to get:

2011 Pearson Addison-Wesley. All rights reserved. 3-9

Once steps 14 have been completed, the estimation part

2011 Pearson Addison-Wesley. All rights reserved. 3-10

A standard format usually is used to present estimated

The number in parentheses under the estimated coefficient

2011 Pearson Addison-Wesley. All rights reserved. 3-11

2011 Pearson Addison-Wesley. All rights reserved. 3-12

Background reading about the restaurant industry

2011 Pearson Addison-Wesley. All rights reserved. 3-13

More discussions with in-house experts

2011 Pearson Addison-Wesley. All rights reserved. 3-14

Based on this, the exact definitions of the independent

2011 Pearson Addison-Wesley. All rights reserved. 3-15

After talking some more with the in-house

2011 Pearson Addison-Wesley. All rights reserved. 3-16

You manage to obtain data on the dependent and

The resulting data is as given in Tables 3.1 and 3.3 in the

2011 Pearson Addison-Wesley. All rights reserved. 3-17

Estimated coefficients are as expected and the fit is reasonable

2011 Pearson Addison-Wesley. All rights reserved. 3-18

The results summarized in Equation 3.5

2011 Pearson Addison-Wesley. All rights reserved. 3-19

2011 Pearson Addison-Wesley. All rights reserved. 3-20

2011 Pearson Addison-Wesley. All rights reserved. 3-21

2011 Pearson Addison-Wesley. All rights reserved. 3-22

2011 Pearson Addison-Wesley. All rights reserved. 3-23

2011 Pearson Addison-Wesley. All rights reserved. 3-24

2011 Pearson Addison-Wesley. All rights reserved. 3-25

2011 Pearson Addison-Wesley. All rights reserved. 3-26

2011 Pearson Addison-Wesley. All rights reserved. 3-27

2011 Pearson Addison-Wesley. All rights reserved. 3-28

The six steps in applied regression analysis

2011 Pearson Addison-Wesley. All rights reserved. 3-29

S-ar putea să vă placă și