Regression Discontinuity Designs in Stata

Matias D. Cattaneo
University of Michigan

July 30, 2015


Main goal: learn about treatment effect of policy or intervention.

If treatment randomization available, easy to estimate treatment effects.

If treatment randomization not available, turn to observational studies.

  - Instrumental variables.

  - Selection on observables.

Regression discontinuity (RD) designs.

  - Simple and objective. Requires little information, if design available.
  - Might be viewed as a local randomized trial.
  - Easy to falsify, easy to interpret.
  - Careful: very local!

Overview of RD packages

rdrobust package: estimation, inference and graphical presentation using local polynomials, partitioning, and spacings estimators.
  - rdrobust: RD inference (point estimation and CI; classic, bias-corrected, robust).
  - rdbwselect: bandwidth or window selection (IK, CV, CCT).
  - rdplot: plots data (with optimal block length).

rddensity package: discontinuity in density test at cutoff (a.k.a. manipulation testing) using novel local polynomial density estimator.
  - rddensity: manipulation testing using local polynomial density estimation.
  - rdbwdensity: bandwidth or window selection.

rdlocrand package: covariate balance, binomial tests, randomization inference methods (window selection & inference).
  - rdrandinf: inference using randomization inference methods.
  - rdwinselect: falsification testing and window selection.
  - rdsensitivity: treatment effect models over grid of windows, CI inversion.
  - rdrbounds: Rosenbaum bounds.

Randomized Control Trials

Notation: $(Y_i(0), Y_i(1), X_i)$, $i = 1, 2, \dots, n$.

Treatment: $T_i \in \{0, 1\}$, $T_i$ independent of $(Y_i(0), Y_i(1), X_i)$.

Data: $(Y_i, T_i, X_i)$, $i = 1, 2, \dots, n$, with

\[
Y_i = \begin{cases} Y_i(0) & \text{if } T_i = 0 \\ Y_i(1) & \text{if } T_i = 1 \end{cases}
\]

Average Treatment Effect:

\[
\tau_{ATE} = E[Y_i(1) - Y_i(0)] = E[Y_i \mid T_i = 1] - E[Y_i \mid T_i = 0]
\]

Experimental Design.
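
To make the identification result above concrete, here is a minimal Python sketch of the difference-in-means estimator under randomized treatment. The simulated data, the true effect of 2.0, and the function name `difference_in_means` are illustrative assumptions, not part of the original slides.

```python
import random
from statistics import mean


def difference_in_means(y, t):
    """Estimate the ATE as mean(Y | T = 1) - mean(Y | T = 0)."""
    treated = [yi for yi, ti in zip(y, t) if ti == 1]
    control = [yi for yi, ti in zip(y, t) if ti == 0]
    return mean(treated) - mean(control)


# Simulated experiment (assumption: constant treatment effect of 2.0).
rng = random.Random(0)
n = 2000
true_effect = 2.0
t = [rng.randint(0, 1) for _ in range(n)]        # randomized assignment
y0 = [rng.gauss(0.0, 1.0) for _ in range(n)]     # potential outcome Y(0)
y = [y0i + true_effect * ti for y0i, ti in zip(y0, t)]  # observed outcome

ate_hat = difference_in_means(y, t)
print(f"estimated ATE: {ate_hat:.3f}")
```

Because assignment is independent of the potential outcomes, the simple contrast of group means should land close to the true effect in a sample of this size.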
Sharp RD design

Notation: $(Y_i(0), Y_i(1), X_i)$, $i = 1, 2, \dots, n$, $X_i$ continuous.

Treatment: $T_i \in \{0, 1\}$, $T_i = 1(X_i \ge \bar{x})$.

Data: $(Y_i, T_i, X_i)$, $i = 1, 2, \dots, n$, with

\[
Y_i = \begin{cases} Y_i(0) & \text{if } T_i = 0 \\ Y_i(1) & \text{if } T_i = 1 \end{cases}
\]

Average Treatment Effect at the cutoff:

\[
\tau_{SRD} = E[Y_i(1) - Y_i(0) \mid X_i = \bar{x}] = \lim_{x \downarrow \bar{x}} E[Y_i \mid X_i = x] - \lim_{x \uparrow \bar{x}} E[Y_i \mid X_i = x]
\]

Quasi-Experimental Design: local randomization (more later).

[Figure: four panels plotting the outcome variable (Y) against the assignment variable (R); the last panel is titled "Local Random Assignment".]

Empirical Illustration: Cattaneo, Frandsen & Titiunik (2015, JCI)

Problem: incumbency advantage (U.S. Senate).

$Y_i$ = election outcome.
$T_i$ = whether incumbent.
$X_i$ = vote share in previous election ($\bar{x} = 0$).
$Z_i$ = covariates (demvoteshlag1, demvoteshlag2, dopen, etc.).

Potential outcomes:
$Y_i(0)$ = election outcome if had not been incumbent.
$Y_i(1)$ = election outcome if had been incumbent.

Causal Inference:
$Y_i(0) \neq Y_i \mid T_i = 0$ and $Y_i(1) \neq Y_i \mid T_i = 1$

Graphical and Falsification Methods

Always plot data: main advantage of RD designs!

Plot regression functions to assess treatment effect and validity.

Plot density of $X_i$ for assessing validity; test for continuity at cutoff and elsewhere.

Important: also use estimators that do not smooth out the data.

RD Plots (Calonico, Cattaneo & Titiunik, JASA):

  - Two ingredients: (i) smoothed global polynomial fit & (ii) binned discontinuous local-means fit.
  - Two goals: (i) detection of discontinuities, & (ii) representation of variability.
  - Two tuning parameters:
      * Global polynomial degree ($k_n$).
      * Location (evenly spaced, ES, or quantile-spaced, QS) and number of bins ($J_n$).
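
The binned local-means ingredient of an RD plot can be sketched in a few lines of Python. This is a minimal illustration that assumes evenly spaced bins with a user-supplied number of bins per side; it does not reproduce the data-driven bin selection of `rdplot`, and the name `binned_means` is hypothetical.

```python
def binned_means(x, y, cutoff, n_bins):
    """Return (left, right): lists of (bin midpoint, mean of y) on each side.

    Assumes evenly spaced bins and observations with some spread on both
    sides of the cutoff. Bins are never allowed to straddle the cutoff,
    so a discontinuity is not smoothed away.
    """
    def one_side(points, lo, hi):
        width = (hi - lo) / n_bins
        sums = [0.0] * n_bins
        counts = [0] * n_bins
        for xi, yi in points:
            # Clamp so the right endpoint falls in the last bin.
            j = min(int((xi - lo) / width), n_bins - 1)
            sums[j] += yi
            counts[j] += 1
        return [(lo + (j + 0.5) * width, sums[j] / counts[j])
                for j in range(n_bins) if counts[j] > 0]

    left = [(xi, yi) for xi, yi in zip(x, y) if xi < cutoff]
    right = [(xi, yi) for xi, yi in zip(x, y) if xi >= cutoff]
    return one_side(left, min(x), cutoff), one_side(right, cutoff, max(x))


# Tiny illustrative data set (made up for this example).
x = [-1.0, -0.75, -0.25, 0.0, 0.5, 1.0]
y = [1.0, 1.0, 2.0, 5.0, 5.0, 6.0]
left_bins, right_bins = binned_means(x, y, cutoff=0.0, n_bins=2)
print(left_bins, right_bins)
```

Plotting these bin means as points, together with a separate polynomial fit on each side, gives the two ingredients listed above.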
Manipulation Tests & Covariate Balance and Placebo Tests

Density tests near cutoff:

  - Idea: distribution of running variable should be similar at either side of cutoff.

  - Method 1: Histograms & Binomial count test.

  - Method 2: Density Estimator at boundary.

      * Pre-binned local polynomial method: McCrary (2008).
      * New tuning-parameter-free method: Cattaneo, Jansson and Ma (2015).
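
Method 1 above (the binomial count test) can be sketched as follows. This is a minimal illustration, assuming that under no manipulation each observation in a small window around the cutoff is equally likely to fall on either side (success probability 0.5); the function name `binomial_count_test` is hypothetical, and the exact-enumeration approach is only suitable for modest counts.

```python
from math import comb


def binomial_count_test(n_left, n_right, p=0.5):
    """Two-sided exact binomial p-value for the count just above the cutoff.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one. Intended for small windows (modest counts).
    """
    n = n_left + n_right
    pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    observed = pmf[n_right]
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, p_value)


# Illustrative counts (made up): 12 observations just below, 25 just above.
print(binomial_count_test(12, 25))
```

A small p-value suggests an imbalance in the number of observations on either side of the cutoff, which is what sorting or manipulation of the running variable would produce.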

Placebo tests on pre-determined/exogenous covariates.

  - Idea: zero RD treatment effect for pre-determined/exogenous covariates.

  - Methods: global polynomial, local polynomial, randomization-based.

Placebo tests on outcomes.

  - Idea: zero RD treatment effect for outcome at values other than cutoff.
  - Methods: global polynomial, local polynomial, randomization-based.
Estimation and Inference Methods

Global polynomial approach (not recommended).

Robust local polynomial inference methods.

  - Bandwidth selection.

  - Bias-correction.

  - Confidence intervals.

Local randomization and randomization inference methods.

  - Window selection.

  - Estimation and Inference methods.

  - Falsification, sensitivity and related methods.

Conventional Local-polynomial Approach

Idea: approximate regression functions for control and treatment units locally.

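Local-linear estimator (w/ weights $K(\cdot)$):

\[
-h_n \le X_i < \bar{x}: \quad Y_i = \alpha_- + (X_i - \bar{x})\,\beta_- + \varepsilon_{-,i}
\]
\[
\bar{x} \le X_i \le h_n: \quad Y_i = \alpha_+ + (X_i - \bar{x})\,\beta_+ + \varepsilon_{+,i}
\]

  - Treatment effect (at the cutoff): $\hat\tau_{SRD} = \hat\alpha_+ - \hat\alpha_-$

Can be estimated using linear models (w/ weights $K(\cdot)$):

\[
Y_i = \alpha + \tau_{SRD}\, T_i + (X_i - \bar{x})\,\beta_1 + T_i\,(X_i - \bar{x})\,\gamma_1 + \varepsilon_i, \qquad -h_n \le X_i \le h_n
\]

Once $h_n$ chosen, inference is standard: weighted linear models.

  - Details coming up next.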
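
The local-linear point estimator above can be sketched in plain Python. This is a minimal illustration, assuming a triangular kernel and a bandwidth `h` supplied by the user; it is not the `rdrobust` implementation, and the function names are hypothetical.

```python
def weighted_intercept(points, cutoff, h):
    """Kernel-weighted least squares of y on (x - cutoff); return the intercept.

    Uses triangular weights K(u) = max(0, 1 - |u| / h) (an assumption made
    for this sketch) and the closed-form solution for a weighted simple
    regression.
    """
    sw = swx = swy = swxx = swxy = 0.0
    for xi, yi in points:
        u = xi - cutoff
        w = max(0.0, 1.0 - abs(u) / h)
        sw += w
        swx += w * u
        swy += w * yi
        swxx += w * u * u
        swxy += w * u * yi
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx * swx)
    return (swy - slope * swx) / sw


def local_linear_rd(x, y, cutoff, h):
    """Sharp RD estimate: right-side intercept minus left-side intercept."""
    left = [(xi, yi) for xi, yi in zip(x, y) if cutoff - h <= xi < cutoff]
    right = [(xi, yi) for xi, yi in zip(x, y) if cutoff <= xi <= cutoff + h]
    return weighted_intercept(right, cutoff, h) - weighted_intercept(left, cutoff, h)


# Noise-free illustrative data with a jump of 3 at the cutoff 0.
x = [i / 10 for i in range(-10, 11)]
y = [1 + 2 * xi if xi < 0 else 4 + 0.5 * xi for xi in x]
tau_hat = local_linear_rd(x, y, cutoff=0.0, h=0.8)
print(f"tau_hat = {tau_hat:.6f}")
```

Fitting the two sides separately is numerically equivalent to the single interacted weighted regression, since that model has a separate intercept and slope on each side of the cutoff.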
Conventional Local-polynomial Approach
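How to choose $h_n$?

Imbens & Kalyanaraman (2012, ReStud): optimal plug-in,

\[
\hat h_{IK} = \hat C_{IK} \cdot n^{-1/5}
\]

Calonico, Cattaneo & Titiunik (2014, ECMA): refinement of IK,

\[
\hat h_{CCT} = \hat C_{CCT} \cdot n^{-1/5}
\]

Ludwig & Miller (2007, QJE): cross-validation,

\[
\hat h_{CV} = \arg\min_h \sum_{i=1}^n w(X_i)\,(Y_i - \hat\mu_1(X_i; h))^2
\]

Key idea: trade-off bias and variance of $\hat\tau_{SRD}(h_n)$. Heuristically:

\[
\uparrow \text{Bias}(\hat\tau_{SRD}) \implies \downarrow \hat h \qquad \text{and} \qquad \uparrow \text{Var}(\hat\tau_{SRD}) \implies \uparrow \hat h
\]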
Local-Polynomial Methods: Bandwidth Selection
Two main methods: plug-in & cross-validation. Both MSE-optimal in some sense.

Imbens & Kalyanaraman (2012, ReStud): propose MSE-optimal rule,

\[
h_{MSE} = C_{MSE}^{1/5} \cdot n^{-1/5}, \qquad C_{MSE} \propto \frac{\text{Var}(\hat\tau_{SRD})}{\text{Bias}(\hat\tau_{SRD})^2}
\]

  - IK implementation: first-generation plug-in rule.

  - CCT implementation: second-generation plug-in rule.

  - They differ in the way $\text{Var}(\hat\tau_{SRD})$ and $\text{Bias}(\hat\tau_{SRD})$ are estimated.
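
As a heuristic for where the $n^{-1/5}$ rate comes from, the following sketch assumes the standard local-linear approximations: squared bias of order $h^4$ and variance of order $1/(nh)$. The constants $\mathsf{B}$ and $\mathsf{V}$ are illustrative placeholders for the leading bias and variance constants, not the exact IK or CCT expressions.

```latex
\begin{align*}
\text{MSE}(h) &\approx h^4\, \mathsf{B}^2 + \frac{\mathsf{V}}{n h}, \\
\frac{d}{dh}\,\text{MSE}(h) &= 4 h^3\, \mathsf{B}^2 - \frac{\mathsf{V}}{n h^2} = 0
  \;\implies\; h^5 = \frac{\mathsf{V}}{4\, \mathsf{B}^2\, n}, \\
h_{MSE} &= \left( \frac{\mathsf{V}}{4\, \mathsf{B}^2} \right)^{1/5} n^{-1/5}.
\end{align*}
```

Under these assumptions, a larger bias constant shrinks the optimal bandwidth and a larger variance constant widens it, and plug-in rules replace the unknown constants with estimates.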

Imbens & Kalyanaraman (2012, ReStud): discuss cross-validation approach,

\[
\hat h_{CV} = \arg\min_h CV_\delta(h), \qquad CV_\delta(h) = \sum_{i=1}^n 1(X_{-,[\delta]} \le X_i \le X_{+,[\delta]})\,(Y_i - \hat\mu(X_i; h))^2,
\]

  - $\hat\mu_{+,p}(x; h)$ and $\hat\mu_{-,p}(x; h)$ are local polynomial estimates.

  - $\delta \in (0, 1)$; $X_{-,[\delta]}$ and $X_{+,[\delta]}$ denote the $\delta$-th quantile of $\{X_i : X_i < \bar{x}\}$ and $\{X_i : X_i \ge \bar{x}\}$.

  - Our implementation uses $\delta = 0.5$; but this is a tuning parameter!
Conventional Approach to RD

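Local-linear estimator (w/ weights $K(\cdot)$):

\[
-h_n \le X_i < \bar{x}: \quad Y_i = \alpha_- + (X_i - \bar{x})\,\beta_- + \varepsilon_{-,i}
\]
\[
\bar{x} \le X_i \le h_n: \quad Y_i = \alpha_+ + (X_i - \bar{x})\,\beta_+ + \varepsilon_{+,i}
\]

  - Treatment effect (at the cutoff): $\hat\tau_{SRD} = \hat\alpha_+ - \hat\alpha_-$

Construct usual t-test. For $H_0: \tau_{SRD} = 0$,

\[
\hat T(h_n) = \frac{\hat\tau_{SRD}}{\sqrt{\hat V_n}} = \frac{\hat\alpha_+ - \hat\alpha_-}{\sqrt{\hat V_{+,n} + \hat V_{-,n}}} \to_d N(0, 1)
\]

95% Confidence interval:

\[
\hat I(h_n) = \left[ \hat\tau_{SRD} \pm 1.96 \cdot \sqrt{\hat V_n} \right]
\]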
Bias-Correction Approach to RD

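Note well: for usual t-test,

\[
\hat T(h_{MSE}) = \frac{\hat\tau_{SRD}}{\sqrt{\hat V_n}} \to_d N(B, 1) \neq N(0, 1), \qquad B > 0
\]

  - Bias $B$ in RD estimator captures curvature of regression functions.

Undersmoothing/Small Bias Approach: Choose smaller $h_n$... Perhaps $\hat h_n = 0.5 \cdot \hat h_{IK}$?

$\implies$ Not clear guidance & power loss!

Bias-correction Approach:

\[
\hat T^{bc}(h_n, b_n) = \frac{\hat\tau_{SRD} - \hat B_n}{\sqrt{\hat V_n}} \to_d N(0, 1)
\]

$\implies$ 95% Confidence Interval: $\hat I^{bc}(h_n, b_n) = \left[ (\hat\tau_{SRD} - \hat B_n) \pm 1.96 \cdot \sqrt{\hat V_n} \right]$

How to choose $b_n$? Same ideas as before... $\hat b_n = \hat C \cdot n^{-1/7}$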
Robust Bias-Correction Approach to RD
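\[
\hat T(h_n) = \frac{\hat\tau_{SRD}}{\sqrt{\hat V_n}} \to_d N(0, 1) \qquad \text{and} \qquad \hat T^{bc}(h_n, b_n) = \frac{\hat\tau_{SRD} - \hat B_n}{\sqrt{\hat V_n}} \to_d N(0, 1)
\]

  - $\hat B_n$ is constructed to estimate leading bias $B$.

Robust approach:

\[
\hat T^{bc}(h_n, b_n) = \frac{\hat\tau_{SRD} - \hat B_n}{\sqrt{\hat V_n}} = \underbrace{\frac{\hat\tau_{SRD} - B_n}{\sqrt{\hat V_n}}}_{\to_d\, N(0, 1)} + \underbrace{\frac{B_n - \hat B_n}{\sqrt{\hat V_n}}}_{\to_d\, N(0, \gamma)}
\]

Robust bias-corrected t-test:

\[
\hat T^{rbc}(h_n, b_n) = \frac{\hat\tau_{SRD} - \hat B_n}{\sqrt{\hat V^{bc}_n}} = \frac{\hat\tau_{SRD} - \hat B_n}{\sqrt{\hat V_n + \hat W_n}} \to_d N(0, 1)
\]

$\implies$ 95% Confidence Interval:

\[
\hat I^{rbc}(h_n, b_n) = \left[ (\hat\tau_{SRD} - \hat B_n) \pm 1.96 \cdot \sqrt{\hat V^{bc}_n} \right], \qquad \hat V^{bc}_n = \hat V_n + \hat W_n
\]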
Local-Polynomial Methods: Robust Inference

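Approach 1: Undersmoothing/Small Bias.

\[
\hat I(h_n) = \left[ \hat\tau_{SRD} \pm 1.96 \cdot \sqrt{\hat V_n} \right]
\]

Approach 2: Bias correction (not recommended).

\[
\hat I^{bc}(h_n, b_n) = \left[ (\hat\tau_{SRD} - \hat B_n) \pm 1.96 \cdot \sqrt{\hat V_n} \right]
\]

Approach 3: Robust Bias correction.

\[
\hat I^{rbc}(h_n, b_n) = \left[ (\hat\tau_{SRD} - \hat B_n) \pm 1.96 \cdot \sqrt{\hat V_n + \hat W_n} \right]
\]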
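
The three intervals differ only in their centering and in the variance used for scaling, which a short Python sketch makes explicit. The inputs (point estimate, bias estimate, and the two variance terms) are assumed to be already computed elsewhere; the numbers below are made up for illustration and the function name is hypothetical.

```python
from math import sqrt

Z_975 = 1.96  # normal critical value for a 95% interval


def rd_confidence_intervals(tau_hat, bias_hat, v_hat, w_hat):
    """Return the conventional, bias-corrected and robust 95% intervals.

    tau_hat  : conventional RD point estimate
    bias_hat : estimated leading bias (B_n hat)
    v_hat    : estimated variance of tau_hat (V_n hat)
    w_hat    : extra variance from estimating the bias (W_n hat)
    """
    def interval(center, variance):
        half_width = Z_975 * sqrt(variance)
        return (center - half_width, center + half_width)

    return {
        "conventional": interval(tau_hat, v_hat),
        "bias_corrected": interval(tau_hat - bias_hat, v_hat),
        "robust": interval(tau_hat - bias_hat, v_hat + w_hat),
    }


# Illustrative inputs only (not taken from any data set).
cis = rd_confidence_intervals(tau_hat=1.0, bias_hat=0.2, v_hat=0.04, w_hat=0.05)
for name, (lo, hi) in cis.items():
    print(f"{name:>15}: [{lo:.3f}, {hi:.3f}]")
```

The robust interval is recentered like the bias-corrected one but is wider, because it accounts for the additional variability introduced by estimating the bias.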
Local-randomization approach and finite-sample inference

Popular approach: local-polynomial methods.

  - Approximates regression function and relies on continuity assumptions.

  - Requires: choosing weights, bandwidth and polynomial order.

Alternative approach: local-randomization + randomization-inference.

  - Gives an alternative that can be used as a robustness check.

  - Key assumption: exists window $W = [-h_n, h_n]$ around cutoff ($-h_n < \bar{x} < h_n$) where

    $T_i$ independent of $(Y_i(0), Y_i(1))$ (for all $X_i \in W$)

  - In words: treatment is randomly assigned within $W$.

  - Good news: if plausible, then RCT ideas/methods apply.
  - Not-so-good news: most plausible for very small windows (very few observations).
  - One solution: employ small window but use randomization-inference methods.
  - Requires: choosing randomization rule, window and statistic.
Local-randomization approach and finite-sample inference

Recall key assumption: exists $W = [-h_n, h_n]$ around cutoff ($-h_n < \bar{x} < h_n$) where

$T_i$ independent of $(Y_i(0), Y_i(1))$ (for all $X_i \in W$)

How to choose window?

  - Use balance tests on pre-determined/exogenous covariates.

  - Very intuitive, easy to implement.

How to conduct inference? Use randomization-inference methods.

  1. Choose statistic of interest. E.g., t-stat for difference-in-means.

  2. Choose randomization rule. E.g., number of treatments and controls given.
  3. Compute finite-sample distribution of statistics by permuting treatment assignments.
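
The three steps above can be sketched in Python as a Monte Carlo permutation test. This is a minimal illustration under stated assumptions: the statistic is the plain difference in means (rather than a t-statistic), the randomization rule holds the number of treated and control units fixed, and the data are made-up values for units assumed to lie inside the chosen window. It is not the `rdrandinf` implementation, and the function names are hypothetical.

```python
import random
from statistics import mean


def diff_in_means(y, t):
    """Step 1: the statistic of interest (difference in means)."""
    treated = [yi for yi, ti in zip(y, t) if ti == 1]
    control = [yi for yi, ti in zip(y, t) if ti == 0]
    return mean(treated) - mean(control)


def randomization_pvalue(y, t, n_draws=2000, seed=0):
    """Steps 2 and 3: permute assignments with fixed group sizes.

    Returns the observed statistic and the share of permuted statistics
    that are at least as large in absolute value.
    """
    rng = random.Random(seed)
    observed = diff_in_means(y, t)
    t_perm = list(t)
    extreme = 0
    for _ in range(n_draws):
        rng.shuffle(t_perm)  # keeps the number of treated units fixed
        if abs(diff_in_means(y, t_perm)) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / n_draws


# Illustrative outcomes for units inside the window (made up).
y = [0.1, 0.2, 0.0, 0.3, 0.2, 2.1, 2.0, 2.3, 1.9, 2.2]
t = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
observed, p_value = randomization_pvalue(y, t)
print(f"observed difference: {observed:.2f}, p-value: {p_value:.4f}")
```

Because the reference distribution is built from the assignment mechanism itself, the test does not rely on large-sample approximations, which is what makes it usable in very small windows.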
Local-randomization approach and finite-sample inference

Do not forget to validate & falsify the empirical strategy.

  1. Plot data to make sure local-randomization is plausible.

  2. Conduct placebo tests.

     (e.g., use pre-intervention outcomes or other covariates not used to select $W$)

  3. Do sensitivity analysis.

See Cattaneo, Frandsen and Titiunik (2015) for introduction.

See Cattaneo, Titiunik and Vazquez-Bare (2015) for further results and

