
Class 2: Selection on Observables, Matching, and Propensity Score

Ricardo A Pasquini
IAE B.S. and MEU-UTDT
August 2013

Econometrics 2 - MEU

Selection on Observables and the Conditional Independence Assumption

Conditional Independence Assumption (CIA) - Selection on Observables
Review

We can think about causality using the potential outcomes framework:
A given individual faces two possible scenarios: either receiving the treatment or not.
In each scenario the individual would report an outcome we are interested in.
If the outcome were the same in both scenarios, the treatment would have no effect.
For a given individual we can only observe one scenario, so we use statistics to learn about average effects.

Simply comparing treated and untreated individuals will NOT generally give the effect we are interested in.
As we saw, one (ideal) way to evaluate potential outcomes is an experiment: randomization.

When no experiments (or natural experiments) are available, we look for other options. We will evaluate many of them in this course.
These alternatives come at a cost: they require making certain assumptions.
The first one we analyze is widely used: the Conditional Independence Assumption (CIA).
As we will see, the CIA assumes that once a certain group of variables is controlled for, individuals can be considered similar enough for comparison purposes.
This assumption is sometimes called selection on observables because the covariates to be held fixed are assumed to be known and observed.

CIA - Selection on Observables
Review

Let's recall the framework: treatment $D_i = \{0, 1\}$, and for a given individual $i$ the potential outcome is

$$\text{potential outcome} = \begin{cases} Y_{1i} & \text{if } D_i = 1 \\ Y_{0i} & \text{if } D_i = 0 \end{cases}$$

Previously, we obtained an expression for the difference in outcomes between program participants and non-participants.
This difference equals the average effect of the treatment on the treated (the quantity we are interested in) plus a selection bias:

$$E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0] = \underbrace{E[Y_{1i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 1]}_{\text{Average Effect of the Treatment on the Treated}} + \underbrace{E[Y_{0i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0]}_{\text{Selection Bias}}$$
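The decomposition can be checked by hand on a toy population. The sketch below is only an illustration: it assumes we could see both potential outcomes for everyone (which never happens in real data), and all numbers are invented.

```python
# Toy population where both potential outcomes are known (never true in practice).
# Each tuple is (D, Y0, Y1); all values are invented for illustration.
population = [
    (1, 10.0, 12.0),
    (1, 8.0, 11.0),
    (0, 4.0, 7.0),
    (0, 6.0, 8.0),
]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

treated = [p for p in population if p[0] == 1]
control = [p for p in population if p[0] == 0]

# Observed outcome is Y1 for the treated and Y0 for the untreated.
naive = mean(y1 for _, _, y1 in treated) - mean(y0 for _, y0, _ in control)

# Average effect of the treatment on the treated.
tot = mean(y1 - y0 for _, y0, y1 in treated)

# Selection bias: gap in untreated outcomes between the two groups.
selection_bias = mean(y0 for _, y0, _ in treated) - mean(y0 for _, y0, _ in control)

print(naive, tot, selection_bias)
```

Here the treated would have done better than the controls even without treatment, so the naive comparison overstates the effect on the treated by exactly the selection-bias term.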

CIA - Selection on Observables

Now the innovation: the CIA states that, conditional on a set of observed characteristics $X_i$, the potential outcomes are independent of treatment status (e.g., an individual has the same probability of a high or low potential outcome whether or not they took the treatment).

$$\{Y_{0i}, Y_{1i}\} \perp\!\!\!\perp D_i \mid X_i$$

Therefore, conditional on $X_i$, the selection bias disappears, and the difference between participants and non-participants gives the treatment effect:

$$E[Y_i \mid X_i, D_i = 1] - E[Y_i \mid X_i, D_i = 0] = E[Y_{1i} - Y_{0i} \mid X_i]$$
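A small numerical sketch may help to see what conditioning on $X_i$ buys. The data below are hypothetical and constructed so that the CIA holds by design: the untreated outcome depends only on a binary covariate, and treatment is more common where that covariate is high.

```python
# Hypothetical observed data (x, d, y), invented so that Y0 depends only on X
# (5 when x = 0, 10 when x = 1) and the treatment adds 2 for everyone.
data = [
    (0, 1, 7.0), (0, 0, 5.0), (0, 0, 5.0), (0, 0, 5.0),
    (1, 1, 12.0), (1, 1, 12.0), (1, 1, 12.0), (1, 0, 10.0),
]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Unconditional comparison of treated and untreated: contaminated by selection.
naive = mean(y for _, d, y in data if d == 1) - mean(y for _, d, y in data if d == 0)

# Comparison within each value of X: selection bias is gone under the CIA.
cell_diffs = {}
for cell in sorted({x for x, _, _ in data}):
    treated = [y for x, d, y in data if x == cell and d == 1]
    control = [y for x, d, y in data if x == cell and d == 0]
    cell_diffs[cell] = mean(treated) - mean(control)

print(naive, cell_diffs)
```

The unconditional difference mixes the treatment effect with the fact that treated units are concentrated in the high-outcome cell, while each within-cell difference equals the effect built into the data.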

CIA - Selection on Observables
Treatment takes more than two values

Now we extend the approach to a treatment variable that takes more than two values (think of the relationship between years of schooling and earnings). Call this variable $s_i$, and use the following notation for the potential outcome:

$$Y_{si} \equiv f_i(s)$$

$f_i(s)$ tells us how much individual $i$ would earn for any number of years of schooling $s$.
The CIA in this setup becomes:

$$Y_{si} \perp\!\!\!\perp s_i \mid X_i$$

Conditional on $X_i$, the average causal effect of a one-year increase in schooling is $E[f_i(s) - f_i(s-1) \mid X_i]$.
However, we only observe $Y_i = f_i(S_i)$, that is, $f_i(s)$ evaluated at $s = S_i$.
But given the CIA, this comparison has a causal interpretation:

$$E[Y_i \mid X_i, S_i = s] - E[Y_i \mid X_i, S_i = s-1] = E[f_i(s) - f_i(s-1) \mid X_i]$$

For example, we might want to compare the additional effect of one more year of school:

$$E[Y_i \mid X_i, S_i = 12] - E[Y_i \mid X_i, S_i = 11] = E[f_i(12) \mid X_i, S_i = 12] - E[f_i(11) \mid X_i, S_i = 11]$$

And because of the CIA, the selection bias disappears:

$$\begin{aligned}
E[f_i(12) \mid X_i, S_i = 12] - E[f_i(11) \mid X_i, S_i = 11]
&= E[f_i(12) \mid X_i, S_i = 12] - E[f_i(11) \mid X_i, S_i = 12] \\
&\quad + E[f_i(11) \mid X_i, S_i = 12] - E[f_i(11) \mid X_i, S_i = 11] \\
&= E[f_i(12) - f_i(11) \mid X_i, S_i = 12]
\end{aligned}$$

The second line of the decomposition is the selection bias, which is zero under the CIA because $E[f_i(11) \mid X_i, S_i = 12] = E[f_i(11) \mid X_i, S_i = 11]$.
In practice, this implies that we can obtain the TOT effect by taking the difference between graduates and dropouts for the same cohort.

Regression under the CIA also allows us to find causal effects.
We can do this by assuming that $f_i(s)$ is both linear in $s$ and the same for everyone except for an additive error term.
We would use a model like:

$$f_i(s) = \alpha + \rho s + \eta_i$$

where $s$ has no subscript $i$: the equation describes what individual $i$ would earn for any value of $s$, not only the realized one.
Even when we allow for random variation in $f_i(s)$ across people, and for non-linearity for a given person, regression can be thought of as a strategy for estimating a weighted average of the individual-specific differences $f_i(s) - f_i(s-1)$. In fact, regression can be seen as a particular sort of matching estimator.

CIA - Selection on Observables
Incorporating the CIA in the regression

If we replace $s$ with the realized $S_i$, we have a model that we can estimate:

$$Y_i = \alpha + \rho S_i + \eta_i$$

However, because of selection bias, we know that $S_i$ might be correlated with the potential outcomes $f_i(s)$, that is, with $\eta_i$.

Suppose that the CIA holds. We can decompose the error into a linear function of the observables and a residual:

$$\eta_i = X_i'\gamma + v_i$$

Because $\gamma$ is defined by the regression of $\eta_i$ on $X_i$, the residual $v_i$ and $X_i$ are uncorrelated by construction.

$$E[f_i(s) \mid X_i, S_i] = E[f_i(s) \mid X_i] = \alpha + \rho s + E[\eta_i \mid X_i] = \alpha + \rho s + X_i'\gamma$$

$$Y_i = \alpha + \rho S_i + X_i'\gamma + v_i$$
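To make the role of the controls concrete, here is a minimal sketch comparing the regression that omits $X_i$ with the one that includes it. The data are hypothetical, generated without residual noise from assumed values $\alpha = 1$, $\rho = 0.5$, $\gamma = 2$, and the helper `ols` is an illustrative name, not something from the course material.

```python
import numpy as np

# Hypothetical data, invented for illustration: schooling S rises with a covariate X,
# and the outcome is generated with alpha = 1, rho = 0.5, gamma = 2 and no noise.
X = np.array([0, 0, 1, 1, 2, 2], dtype=float)
S = np.array([10, 11, 12, 13, 14, 15], dtype=float)
Y = 1.0 + 0.5 * S + 2.0 * X

def ols(columns, y):
    """Least-squares coefficients for y on a constant plus the given columns."""
    design = np.column_stack([np.ones_like(y)] + list(columns))
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

rho_short = ols([S], Y)[1]     # omits X: also picks up the correlation between X and S
rho_long = ols([S, X], Y)[1]   # controls for X: recovers rho

print(f"short regression: {rho_short:.3f}, long regression: {rho_long:.3f}")
```

The short regression overstates the return to schooling because $S_i$ proxies for $X_i$; once $X_i$ enters the regression, the coefficient on $S_i$ returns to the value used to generate the data.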

Matching

Matching has been widely used in the last two decades.
As with regression, the CIA is the main assumption.
Angrist and Pischke show that matching and regression are closely related: both can be seen as weighted averages of individual effects, using different weights.

Under the CIA, the selection bias disappears after conditioning on $X_i$.
We can calculate the TOT by iterating expectations over the values of $X_i$:

$$\delta_{TOT} \equiv E[y_{1i} - y_{0i} \mid D_i = 1] = E_X\{E_Y[y_{1i} - y_{0i} \mid X_i, D_i = 1] \mid D_i = 1\}$$

The first equality is the definition of the TOT, and the second makes $X_i$ appear, using the law of iterated expectations.
By virtue of the CIA, we can substitute $E_Y[y_{0i} \mid X_i, D_i = 1] = E_Y[y_{0i} \mid X_i, D_i = 0]$:

$$\delta_{TOT} = E_X[E_Y[y_{1i} \mid X_i, D_i = 1] - E_Y[y_{0i} \mid X_i, D_i = 0] \mid D_i = 1] = E_X[\delta_X \mid D_i = 1]$$

where $\delta_X \equiv E_Y[y_{1i} \mid X_i, D_i = 1] - E_Y[y_{0i} \mid X_i, D_i = 0]$ is the random $X$-specific difference in outcomes between treated and non-treated at each value of $X_i$. For the discrete case we have:

$$E[y_{1i} - y_{0i} \mid D_i = 1] = \sum_x \delta_x P(X_i = x \mid D_i = 1)$$

where $P(X_i = x \mid D_i = 1)$ is the probability mass function of $X_i$ given that $D_i = 1$.
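The discrete formula translates almost line by line into code. The following is a sketch under the assumption that the data arrive as `(x, d, y)` records with a discrete covariate cell `x` and a binary treatment `d`; the function name `tot_by_cells` and the example numbers are hypothetical.

```python
from collections import defaultdict

def tot_by_cells(records):
    """Cell-matching estimate of the TOT from (x, d, y) records.

    Computes delta_x (treated mean minus untreated mean) within each cell x
    and averages it with weights P(X = x | D = 1).
    """
    # For each cell keep [sum of y, count] separately for d = 0 and d = 1.
    sums = defaultdict(lambda: {0: [0.0, 0], 1: [0.0, 0]})
    for x, d, y in records:
        sums[x][d][0] += y
        sums[x][d][1] += 1

    n_treated = sum(cell[1][1] for cell in sums.values())
    tot = 0.0
    for x, cell in sums.items():
        (sum0, n0), (sum1, n1) = cell[0], cell[1]
        if n1 == 0:
            continue  # no treated units here, so the cell gets zero weight
        if n0 == 0:
            raise ValueError(f"no untreated comparison units in cell {x!r}")
        delta_x = sum1 / n1 - sum0 / n0
        tot += delta_x * n1 / n_treated
    return tot

# Invented example: delta_0 = 2 with weight 1/4, delta_1 = 4 with weight 3/4.
records = [
    (0, 1, 3.0), (0, 0, 1.0),
    (1, 1, 10.0), (1, 1, 10.0), (1, 1, 10.0), (1, 0, 6.0),
]
estimate = tot_by_cells(records)
print(estimate)
```

Raising an error when a cell with treated units has no untreated units is a design choice for this sketch: such cells have no comparison group, so the estimator is not defined there.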

Matching
Angrist (1998): Effect of Military Service on Earnings

Angrist uses this last estimator to measure the effect of military service on earnings.
In this case, $X_i$ takes on values determined by all possible combinations of year of birth, test-score group, year of application to the military, and educational attainment at the time of application.
Up to this point we have analyzed the TOT effect (conditional on $D_i = 1$). But notice that we can just as easily derive the unconditional expression, the ATE (Average Treatment Effect):

$$\delta_{ATE} \equiv E_X[E_Y[y_{1i} \mid X_i, D_i = 1] - E_Y[y_{0i} \mid X_i, D_i = 0]] = \sum_x \delta_x P(X_i = x)$$

where the expectation over $X_i$ is not conditional on the value of $D_i$.

TOT tells us how much the typical soldier gained or lost as a consequence of military service, while ATE tells us how much the typical applicant to the military gained or lost (in Angrist (1998), the population consists of applicants).
The US military tends to be fairly picky about its soldiers, especially after downsizing at the end of the Cold War.
For the most part, the military now takes only high school graduates with test scores in the upper half of the test score distribution.
The resulting positive screening generates positive selection bias in naive comparisons of veteran and non-veteran earnings.

Angrist (1998)

Notice that once we control for the relevant covariates (as in the matching case), the difference turns negative. Recall that the estimates here correspond to the weighted average of the differences for each category of $X$.

Angrist (1998): Further Details

Finally, Column 4 reports the estimates from a (saturated) regression model:

$$Y_i = \sum_x d_{ix} \beta_x + \delta_R D_i + \varepsilon_i$$

where $d_{ix}$ is a dummy that indicates $X_i = x$, $\beta_x$ is a regression effect for $X_i = x$, and $\delta_R$ is the regression estimand.

Comparing Regression and Matching

The reason the regression and matching estimates are similar is that regression, too, can be seen as a sort of matching estimator:
the regression estimand differs from the matching estimands only in the weights used to combine the covariate-specific effects $\delta_X$ into a single effect.
In particular, matching uses the distribution of covariates among the treated to weight covariate-specific estimates into an estimate of the TOT, while regression produces a variance-weighted average of these effects.
For the proof, see the book.

The point is that the treatment-on-the-treated estimand puts the most weight on covariate cells containing those who are most likely to be treated. In contrast, regression puts the most weight on covariate cells where the conditional variance of treatment status is largest. As a rule, this variance is maximized when $P(D_i = 1 \mid X_i = x) = \frac{1}{2}$, in other words, for cells with equal numbers of treated and control observations.
Note that both estimators require common support; that is, they are limited to covariate values where both treated and control observations are found.
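The three weighting schemes can be compared side by side on a small invented dataset. This sketch assumes the variance weights take the form $P(D_i = 1 \mid X_i = x)\,(1 - P(D_i = 1 \mid X_i = x))\,P(X_i = x)$, as in Angrist and Pischke's derivation (the slides only describe them verbally), and checks that form against an actual saturated least-squares fit.

```python
import numpy as np

# Hypothetical micro-data (cell x, treatment D, outcome Y); all values invented.
rows = (
    [(0, 1, 3.0)] * 2 + [(0, 0, 1.0)] * 2      # cell 0: P(D=1|X=0) = 0.5, delta_0 = 2
    + [(1, 1, 10.0)] * 9 + [(1, 0, 6.0)] * 1   # cell 1: P(D=1|X=1) = 0.9, delta_1 = 4
)
x = np.array([r[0] for r in rows])
d = np.array([r[1] for r in rows], dtype=float)
y = np.array([r[2] for r in rows])

cells = np.unique(x)
delta = np.array([y[(x == c) & (d == 1)].mean() - y[(x == c) & (d == 0)].mean()
                  for c in cells])
n_x = np.array([(x == c).sum() for c in cells], dtype=float)
p_x = np.array([d[x == c].mean() for c in cells])

def weighted(weights):
    """Weighted average of the cell-specific effects delta_x."""
    return float(np.sum(weights * delta) / np.sum(weights))

tot = weighted(n_x * p_x)               # weights proportional to P(X=x | D=1)
ate = weighted(n_x)                     # weights proportional to P(X=x)
reg = weighted(n_x * p_x * (1 - p_x))   # variance weights (assumed form, see above)

# Saturated regression: one dummy per cell plus D; the last coefficient is delta_R.
design = np.column_stack([(x == c).astype(float) for c in cells] + [d])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
delta_r = float(coef[-1])

print(f"TOT={tot:.3f}  ATE={ate:.3f}  variance-weighted={reg:.3f}  OLS={delta_r:.3f}")
```

In this example the cell with the larger effect is also the one where almost everyone is treated, so it dominates the TOT but receives comparatively little weight in the regression.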

Propensity Score Matching (PSM)

Propensity score matching (PSM) constructs a statistical comparison group based on a model of the probability of participating in the treatment, using observed characteristics.
The idea is to match each individual in the treatment group with an individual in the control group who has a similar probability of taking up the program.

Propensity Score Matching (PSM)

This involves two assumptions:


CIA: program take-up is not driven by unobserved factors, so
self-selection bias is eliminated once we condition on the observables.
In this respect, the PSM technique is similar to simple regression and
matching.

Common support (overlap) condition: guarantees that a "similar"
individual can be found in the control group.

$0 < P(T_i = 1 \mid X_i) < 1$

If $P(T_i = 1 \mid X_i) = 1$, all individuals with those characteristics
would be takers, so no comparable non-participant would exist.
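The overlap condition can be checked once propensity scores have been estimated. The following is a minimal sketch, assuming a simple min/max rule for the common-support region; the function names `common_support` and `trim_to_support` and the example scores are hypothetical and do not come from the slides.

```python
def common_support(p_treated, p_control):
    """Return the (low, high) interval of scores covered by both groups.

    Min/max rule (an assumption of this sketch): the region runs from the
    larger of the two group minima to the smaller of the two group maxima.
    """
    low = max(min(p_treated), min(p_control))
    high = min(max(p_treated), max(p_control))
    if low > high:
        raise ValueError("no overlap between treated and control scores")
    return low, high


def trim_to_support(scores, low, high):
    """Keep only the scores that fall inside the common-support interval."""
    return [p for p in scores if low <= p <= high]


# Hypothetical estimated propensity scores, for illustration only.
p_treated = [0.35, 0.50, 0.72, 0.95]
p_control = [0.05, 0.20, 0.40, 0.60, 0.80]

low, high = common_support(p_treated, p_control)
kept_treated = trim_to_support(p_treated, low, high)
kept_control = trim_to_support(p_control, low, high)
```

In this example the treated observation with score 0.95 is dropped because no control has a score that high, which is the situation the overlap condition rules out.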

Common Support Example

In practice, estimation proceeds in three steps:
First, $p(T_i = 1 \mid X_i)$ is estimated using a parametric model,
such as a logit or probit.
Second, different approaches can be used to match participants and
nonparticipants on the basis of $p(T_i = 1 \mid X_i)$.
Third, estimates of the TOT are computed by averaging the differences
between each treated outcome and the weighted outcome of its matched
controls, where $w(i, j)$ are the weights used when there is more than
one control for each treated observation.
An alternative is to run a regression that uses $p(T_i = 1 \mid X_i)$ to
construct weights.
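The third step can be written compactly. The expression below is one standard way of writing a matching estimator and is offered only as a sketch: the symbols $N_T$ (number of treated observations), $T$ and $C$ (the treated and control sets), and $Y_i$, $Y_j$ (observed outcomes) are notation assumed here.

$$\widehat{TOT}_{PSM} = \frac{1}{N_T} \sum_{i \in T} \Big[ Y_i - \sum_{j \in C} w(i, j)\, Y_j \Big]$$

With one-to-one nearest-neighbor matching, $w(i, j) = 1$ for the single matched control and $0$ for every other control.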

Matching participants and non-participants:


Nearest-neighbor matching: take the n nearest neighbors on the basis of
$p(T_i = 1 \mid X_i)$.
Caliper or radius matching: the nearest neighbor may still be far away,
so impose a maximum distance (a radius or threshold) on the propensity
score.
Stratification or interval matching: partitions the common support
into strata (or intervals) and calculates the program's impact within
each interval.
Kernel matching: takes all possible matches into account and
weights them according to distance.
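The matching schemes above differ only in how the weights $w(i, j)$ are built. The sketch below illustrates nearest-neighbor, caliper, and kernel weights (stratification is omitted for brevity). The function names, the Gaussian kernel, the bandwidth of 0.1, and the example data are all assumptions made for illustration.

```python
import math


def nn_weights(p_i, p_controls, n=1, caliper=None):
    """Weights w(i, j) for n-nearest-neighbor matching, optionally with a caliper.

    Each selected control gets an equal share of the weight; controls farther
    than `caliper` from p_i are never used. Returns None if none qualifies.
    """
    dist = sorted((abs(p_i - p_j), j) for j, p_j in enumerate(p_controls))
    if caliper is not None:
        dist = [(d, j) for d, j in dist if d <= caliper]
    chosen = [j for _, j in dist[:n]]
    if not chosen:
        return None
    w = [0.0] * len(p_controls)
    for j in chosen:
        w[j] = 1.0 / len(chosen)
    return w


def kernel_weights(p_i, p_controls, bandwidth=0.1):
    """Weights w(i, j) for kernel matching, using a Gaussian kernel (assumed)."""
    k = [math.exp(-0.5 * ((p_i - p_j) / bandwidth) ** 2) for p_j in p_controls]
    total = sum(k)
    return [v / total for v in k]


def tot(y_t, p_t, y_c, p_c, weight_fn):
    """Average of treated outcome minus weighted matched control outcome."""
    diffs = []
    for y_i, p_i in zip(y_t, p_t):
        w = weight_fn(p_i, p_c)
        if w is None:  # treated unit has no acceptable match; skip it
            continue
        diffs.append(y_i - sum(w_j * y_j for w_j, y_j in zip(w, y_c)))
    return sum(diffs) / len(diffs)


# Hypothetical outcomes and propensity scores, for illustration only.
y_t, p_t = [10.0, 12.0, 15.0], [0.30, 0.52, 0.90]
y_c, p_c = [8.0, 9.0, 11.0, 12.0], [0.28, 0.50, 0.60, 0.65]

tot_nn = tot(y_t, p_t, y_c, p_c, lambda p, pc: nn_weights(p, pc, n=1))
tot_caliper = tot(y_t, p_t, y_c, p_c,
                  lambda p, pc: nn_weights(p, pc, n=1, caliper=0.1))
tot_kernel = tot(y_t, p_t, y_c, p_c, kernel_weights)
```

With the caliper of 0.1, the treated unit with score 0.90 has no control within range and is dropped, so the caliper estimate averages over two treated units instead of three.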

Estimating effects using Matching and PSM in Stata

nnmatch: matching estimation. Note that the keep(filename) option saves
the matched dataset so that it can be used afterward.

pscore: calculates the propensity score.

attnd: estimates program effects through nearest-neighbor matching.
atts: estimates program effects using stratification matching.
attk: estimates program effects using kernel matching.
attr: estimates program effects using radius matching.

Further Contents

Bootstrap calculation of standard errors.

Testing for the presence of selection bias due to unobserved factors:
the Sargan-Wu-Hausman test.
Using propensity scores as weights in a regression approach: Hirano,
Imbens, and Ridder (2003).
Estimate $Y_i = \alpha + \beta T_i + \gamma X_i + \varepsilon_i$
with weights of 1 for participants and weights of
$\hat{P}(X) / (1 - \hat{P}(X))$ for the control observations.
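The weighting scheme can be illustrated with a small sketch. It assumes the simplest case with no covariates $X$ in the regression: weighted least squares of $Y$ on a constant and $T$ then reduces to a weighted difference in means, which is what the code computes. The function names and data are hypothetical.

```python
def att_weights(treated, pscore):
    """Weight 1 for participants, p / (1 - p) for control observations."""
    return [1.0 if t == 1 else p / (1.0 - p) for t, p in zip(treated, pscore)]


def weighted_mean(values, weights):
    """Weighted average of `values`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_effect(y, treated, pscore):
    """Coefficient on T from weighted least squares of Y on a constant and T.

    Without covariates this equals the weighted mean outcome of participants
    minus the weighted mean outcome of controls.
    """
    w = att_weights(treated, pscore)
    y_t = [yi for yi, ti in zip(y, treated) if ti == 1]
    w_t = [wi for wi, ti in zip(w, treated) if ti == 1]
    y_c = [yi for yi, ti in zip(y, treated) if ti == 0]
    w_c = [wi for wi, ti in zip(w, treated) if ti == 0]
    return weighted_mean(y_t, w_t) - weighted_mean(y_c, w_c)


# Hypothetical data, for illustration only.
y = [12.0, 14.0, 9.0, 10.0, 11.0]
treated = [1, 1, 0, 0, 0]
pscore = [0.6, 0.8, 0.2, 0.5, 0.75]

weights = att_weights(treated, pscore)
effect = weighted_effect(y, treated, pscore)
```

Controls that resemble participants (high estimated propensity score) receive more weight, so the weighted control group mimics the covariate mix of the treated group.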

Jalan and Ravallion (2003)

They evaluate the impact of Trabajar, a workfare program in
Argentina, in 1997.
The context: macroeconomic crises hurt employment.
Dependent variable: income. Objective: measure the net income
gains.
The Trabajar program is targeted at the lowest-income segment of the
population. It sets a maximum wage rate of $200.

Advantages of the PSM approach in this setting:


1. Allows differing effects: an analysis of the distribution of effects
is possible.
2. A plausible alternative when neither an experimental design nor a
baseline survey is available.
3. Cost-efficient when data are available for a population that can
serve as a control group. In this case, a nationwide survey, the
Encuesta de Desarrollo Social (INDEC), is available.

Possible problems:
1. The well-known problem: unobservables might still drive
self-selection bias.
2. On page 12, the survey reports a significant number of
non-respondents (no address, respondent absent, or refusal to respond).
This might bias the results.
3. If the population-representative data used for the control group are
not cleaned carefully, they may include program participants:
contamination.

What kind of effect do we expect?

Fitting the PS model

Logit coefficients are reported, so their magnitudes cannot be read
directly as effects on the probability of participation (an alternative
would have been to report marginal effects).
Be careful! Many variables are included, some of which might
display strong collinearity. A table of correlations is recommended.
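To make the point about marginal effects concrete, here is a minimal sketch of how a logit coefficient translates into a change in the participation probability for a continuous covariate. The coefficients and the evaluation point are hypothetical, not values from Jalan and Ravallion (2003).

```python
import math


def logistic(z):
    """Logistic function mapping the linear index to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def marginal_effects(intercept, coefs, x):
    """Marginal effect of each continuous covariate on P(T=1|X) at point x.

    For a logit, dP/dx_k = beta_k * p * (1 - p), where p is the predicted
    probability at x, so the effect depends on where it is evaluated.
    """
    z = intercept + sum(b * v for b, v in zip(coefs, x))
    p = logistic(z)
    return [b * p * (1.0 - p) for b in coefs]


# Hypothetical coefficients and evaluation point, for illustration only.
effects = marginal_effects(intercept=0.0, coefs=[0.8, -0.4], x=[1.0, 2.0])
```

At this point the predicted probability is 0.5, where $p(1 - p)$ reaches its maximum of 0.25, so each marginal effect is one quarter of the corresponding coefficient; elsewhere the same coefficient implies a smaller change in probability.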

What about common support?

Be careful: the graph seems to suggest that there might be a
problem. However, the paper says that because the population survey
is so large, they did not have problems.
