
Chapter 7, the Bootstrap

US President George W. Bush awards Bradley Efron the 2005 National Medal of Science in the East Room of the White House, July 27, 2007, in Washington, DC. President Bush awarded 27 National Medals of Science and Technology for 2005 and 2006 at the event.

Brad Efron at Stanford University.

The Bootstrap
Introduced by Brad Efron in 1979

B. Efron. Bootstrap methods: another look at the jackknife. Annals of Statistics, 7:1-26, 1979.

B. Efron. Nonparametric estimates of standard error: the jackknife, the bootstrap, and other methods. Biometrika, 68:589-599, 1981.

Bootstrap methods are a class of nonparametric Monte Carlo methods that estimate the distribution of a population by resampling.
2

The Bootstrap
Suppose the only things you have are a sample of n observations:

x1, x2, ..., xn.

You do not want to make an assumption about a parametric distribution for the data; for example, you do not want to assume they are Normally distributed or Student t-distributed.

In other words, you want to be nonparametric.

The MC methods in Ch. 6 of the book can be called the parametric bootstrap because simulation was performed using a parametric model, e.g. N(μ, σ²).
We want to repeat the same types of things in this chapter without using a parametric model.
3

Resampling on a discrete uniform

Assume x = (x1, x2, ..., xn) is an observed random sample from a distribution with unknown cdf F(x).
Choose X* at random from x.
Then P(X* = xi) = 1/n, for i = 1, ..., n.
In other words, X* is distributed as discrete uniform on x.

Resampling creates an i.i.d. sample X*1, X*2, ..., X*n from this discrete uniform on x distribution.
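Resampling with replacement is one line in R. A minimal sketch; the data vector x below is made up for illustration:

```r
# Resample n observations with replacement from the observed sample x.
# Each draw is discrete uniform on x: P(X* = x_i) = 1/n.
x <- c(2.1, 3.5, 1.7, 4.2, 2.8)           # any observed sample (made up here)
set.seed(1)
xstar <- sample(x, size = length(x), replace = TRUE)
# every resampled value must be one of the original observations
all(xstar %in% x)                          # TRUE
```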

Empirical cumulative distribution function (ecdf)

Recall, pg. 36 of Ch 2, repeated in the lecture of Ch 3:
The ecdf Fn(x) is an unbiased estimate of F(x) = P(X ≤ x), and is defined for an observed ordered sample x(1) ≤ x(2) ≤ ... ≤ x(n) by:

Fn(x) = 0,     for x < x(1),
        i/n,   for x(i) ≤ x < x(i+1), i = 1, ..., n − 1,
        1,     for x(n) ≤ x.

The standard error of Fn(x) is sqrt(F(x)[1 − F(x)]/n) ≤ 0.5/sqrt(n).
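The i/n staircase form can be checked directly with R's built-in ecdf(); a small sketch with a made-up sample:

```r
# The ecdf evaluated at the i-th order statistic equals i/n.
x <- c(3.1, 0.8, 2.4, 5.0, 1.6)    # made-up sample, n = 5
Fn <- ecdf(x)                       # step function with jumps of 1/n at each x_i
xs <- sort(x)
Fn(xs)                              # 0.2 0.4 0.6 0.8 1.0, i.e. i/n for i = 1..5
Fn(min(x) - 1)                      # 0: below the smallest observation
Fn(max(x))                          # 1: at and above the largest observation
```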

Resampling on a discrete uniform

So under our resampling scheme:
Assume x = (x1, x2, ..., xn) is an observed random sample from a distribution with unknown cdf F(x).
Choose X* at random from x.
Then P(X* = xi) = 1/n, for i = 1, ..., n.
In other words, X* is distributed as discrete uniform on x.
Resampling creates an i.i.d. sample X*1, X*2, ..., X*n from this discrete uniform on x distribution.
The ecdf Fn is therefore the cdf of X*.
6

Bootstrap sampling scheme

[Diagram: the unknown cdf FX generates the observed sample x = (x1, x2, ..., xn), summarized by its ecdf Fn(x); resampling from Fn(x) produces B bootstrap samples x*(1), x*(2), ..., x*(B), each with its own ecdf Fn*(b)(x*).]

Bootstrap sampling scheme

[Same diagram as the previous slide.]

Chain of convergence: Fn*(b)(x*) → Fn(x) and Fn(x) → F(x); in short, Fn*(b)(x*) ≈ Fn(x) ≈ F(x).

Bootstrap sampling scheme

[Same diagram as before.]

Fn*(b)(x*) ≈ Fn(x) ≈ F(x). Increasing n helps here: the approximation Fn(x) ≈ F(x) improves only as the sample size n grows.

Bootstrap sampling scheme

[Same diagram as before.]

Fn*(b)(x*) ≈ Fn(x) ≈ F(x). Increasing B only helps here: more replicates improve the approximation of the bootstrap distribution to Fn(x), but no value of B brings Fn(x) closer to F(x).

Example 7.1
n = 10, very small

11

For X ~ Poi(2), P(X = 0) = e^(−2) ≈ 0.135, not 0.
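The point of the example can be reproduced in a few lines; a sketch, where the seed and the simulated sample are arbitrary choices:

```r
# With n = 10 draws from Poi(2), the ecdf assigns probability 0 to any
# value that happens not to appear in the sample, even though, e.g.,
# P(X = 0) = exp(-2) is about 0.135 under the true distribution.
set.seed(7)
x <- rpois(10, lambda = 2)
table(x) / 10       # ecdf probabilities: nonzero only on observed values
exp(-2)             # true P(X = 0), about 0.1353
0 %in% x            # if FALSE, the bootstrap can never produce a 0
```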

Estimating a summary (parameter) of the nonparametric distribution

Assume x = (x1, x2, ..., xn) is an observed random sample from a distribution with unknown cdf F(x).
We are interested in some "parameter" θ of F(x), such as the mean of the distribution F(x) or the .975-quantile of F(x).
"Parameter" sounds strange, since it belongs to a nonparametric distribution. It is a naming convention only; instead of the word "parameter", think of θ as any "summary" of F(x).
We can also use bootstrapping to get the empirical distribution of θ̂, and this is a very powerful technique because we can do it for any type of θ and for any distribution F(x).
12

Bootstrap for a summary of interest

[Diagram: θ = θ(FX) is the summary of the true distribution; θ̂ = θ(Fn(x)) is its estimate from the observed sample x = (x1, x2, ..., xn); each bootstrap sample x*(b) yields a replicate θ̂(b) = θ(Fn*(b)).]

The bootstrap estimate of the cdf of θ̂ is the empirical cdf of (θ̂(1), θ̂(2), ..., θ̂(B)).

Bootstrap estimate of standard error

The bootstrap estimate of the standard error of θ̂ is the sample standard deviation of the replicates:

se_boot(θ̂) = sqrt( (1/(B − 1)) Σb (θ̂(b) − θ̄*)² ), where θ̄* = (1/B) Σb θ̂(b).

Note: θ̄*, and not θ̂, is typically used to estimate E[θ̂] in the centering.
14

Example 7.2

Question 1: Estimate the correlation between LSAT and GPA scores based on the random sample above.

Question 2: Write a general bootstrap algorithm for estimating the standard error of the estimate in Question 1.

15

Example 7.2
Question 1: Estimate the correlation between LSAT and GPA scores based on the random sample above.

Solution:

What is θ?
How do we estimate it in R?
16

Example 7.2

17

Question 1: Estimate the correlation between LSAT and GPA scores based on the random sample above.

[Scatterplot of lsat (y-axis, 560 to 660) against gpa (x-axis, 280 to 340).]

The book did not specify, but Pearson's correlation is the standard correlation measure; it measures only a linear relationship. The plot looks linear, so we will use it. If there is a non-linear relationship, it is better to use Spearman's rank correlation.

Example 7.2
Question 1: Estimate the correlation between LSAT and GPA scores based on the random sample above.

Solution:

What is θ? The population correlation coefficient between LSAT and GPA scores.
How do we estimate it in R? With the cor() function, with the default method="pearson" used.

18

> cor(gpa,lsat)
[1] 0.7763745 ANSWER

> cor(gpa,lsat,method="spearman") #for comparison:
[1] 0.7964286

Note we did not use the bootstrap!

Question 2: Inadequate solution

Question 2: Write a general bootstrap algorithm for estimating the standard error of the estimate in Question 1.

The solution from the book is not specific enough!

What is x1, x1*?

Important to note: the data are paired.

19

Question 2: Better solution


The data comprise 15 pairs: x = {xi = (GPAi, LSATi); i = 1, ..., 15}.
For each bootstrap replicate, indexed b = 1, ..., B:
  Generate a random sample of 15 pairs x*(b) = (x*1, x*2, ..., x*15), where x*i is the ith random pair, by sampling with replacement from x.
  Compute θ̂(b) as the sample correlation of the pairs of x*(b).
The bootstrap estimate of the standard error of the correlation is the sample standard deviation of θ̂(1), θ̂(2), ..., θ̂(B).

20
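The algorithm above can be sketched directly in R. The LSAT/GPA values below are transcribed from Efron's 15-school law data used in this example (verify them against the book); the seed and B are arbitrary choices:

```r
# Bootstrap estimate of the SE of the correlation, resampling PAIRS.
lsat <- c(576, 635, 558, 578, 666, 580, 555, 661, 651, 605,
          653, 575, 545, 572, 594)
gpa  <- c(339, 330, 281, 303, 344, 307, 300, 343, 336, 313,
          312, 274, 276, 288, 296)
n <- length(lsat)
theta.hat <- cor(gpa, lsat)              # about 0.776, as on the slide
set.seed(1)
B <- 2000                                # number of bootstrap replicates
theta.b <- numeric(B)
for (b in 1:B) {
  i <- sample(1:n, size = n, replace = TRUE)   # resample indices of pairs
  theta.b[b] <- cor(gpa[i], lsat[i])
}
se.boot <- sd(theta.b)                   # close to the reported 0.136
```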

Reporting standard errors


Question 2: Write a general bootstrap algorithm for estimating the standard error of the estimate in Question 1.

21

The bootstrap estimate of standard error (se) is 0.136. We report the correlation between the LSAT and GPA as 0.776 ± 0.136. ANSWER

Seeing what you are doing

In practice it is always a good idea to inspect the distribution of the B bootstrap estimates of θ.

[Figure: histogram of the B bootstrap replicates of the correlation.]

This distribution does not look normal, and will not look normal no matter how large you make B. That is because n = 15 is a small sample size, and a correlation coefficient from a sample of this size is not normal.
22

Recall θ̂ = 0.776.

The difference between bootstrap and asymptotic Normal intervals

Asymptotic normal theory, assuming a large n, is often used to estimate standard errors for correlation coefficients, even for small n.

The normal estimate of the standard error for this example is 0.115 (this matches (1 − θ̂²)/√(n − 3)).

Recall θ̂ = 0.776; bootstrap estimate of SE = 0.136.


23

Bootstrapping packages in R
Pg. 187, 188
boot (in package boot)
bootstrap (in package bootstrap)
Not covered for this class because here we learn to do this ourselves; you can use these packages later in your job.

24

Bias
The bias of an estimator θ̂ for θ is
bias(θ̂) = E[θ̂ − θ] = E[θ̂] − E[θ] = E[θ̂] − θ.
An estimator is unbiased if bias(θ̂) = 0.
Example 1: If X1, ..., Xn are i.i.d. from a distribution with population mean μ, then the sample mean X̄ = (1/n) Σ Xi is an unbiased estimator for μ.
Proof: E[X̄] = E[(1/n) Σ Xi] = (1/n) E[Σ Xi] = (1/n) Σ E(Xi) = (1/n) Σ μ = (1/n) nμ = μ.

25

E(cX) = cE(X); E(X1 + X2) = E(X1) + E(X2); the Xi are i.i.d. with mean μ.

Bias
The bias of an estimator θ̂ for θ is
bias(θ̂) = E[θ̂ − θ] = E[θ̂] − E[θ] = E[θ̂] − θ.
An estimator is unbiased if bias(θ̂) = 0.
Not every estimator is unbiased. For example, the maximum likelihood estimator of the variance σ² of a population,
σ̂² = (1/n) Σ (Xi − X̄)²,
has bias −σ²/n. That is why the unbiased estimator
s² = (1/(n − 1)) Σ (Xi − X̄)²
is traditionally used; it is also what is implemented in R. Proof next.

26
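The bias −σ²/n can be checked by simulation; a sketch (not from the book), with n, σ², and the replication count chosen arbitrarily:

```r
# Compare the MLE variance estimator (divide by n) with var() (divide by n-1).
set.seed(42)
n <- 5; sigma2 <- 4; m <- 100000
mle <- numeric(m); unb <- numeric(m)
for (j in 1:m) {
  x <- rnorm(n, mean = 0, sd = sqrt(sigma2))
  mle[j] <- mean((x - mean(x))^2)   # sigma-hat^2: biased, divides by n
  unb[j] <- var(x)                  # s^2: unbiased, divides by n - 1
}
mean(mle) - sigma2   # near the theoretical bias -sigma2/n = -0.8
mean(unb) - sigma2   # near 0
```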

Proof
Assume X1, X2, ..., Xn i.i.d. with mean μ and variance σ². Show σ̂² is biased for σ².
Proof: We just showed E(X̄) = μ. Next note that
Var(X̄) = Var((1/n) Σ Xi) = (1/n²) Var(Σ Xi) = (1/n²) Σ Var(Xi) = (1/n²) nσ² = σ²/n.
Also,
σ̂² = (1/n) Σ (Xi − X̄)² = (1/n) (Σ Xi² − nX̄²) = (1/n) Σ Xi² − X̄².
Hence
E(σ̂²) = (1/n) Σ E(Xi²) − E(X̄²) = (1/n) nE(X1²) − E(X̄²) = E(X1²) − E(X̄²).
Recall Var(X) = E(X²) − E(X)², so E(X²) = Var(X) + E(X)². Therefore
E(σ̂²) = Var(X1) + E(X1)² − [Var(X̄) + E(X̄)²] = Var(X1) + μ² − [Var(X̄) + μ²]
= Var(X1) − Var(X̄) = σ² − σ²/n = ((n − 1)/n) σ².
So bias(σ̂²) = E(σ̂²) − σ² = −σ²/n.

27

Bootstrap estimate of bias

The bootstrap estimate of bias is
bias_boot(θ̂) = θ̄* − θ̂, where θ̄* = (1/B) Σb θ̂(b).

In the Law example:

> cor(gpa,lsat)
[1] 0.7763745

28

Bootstrap estimate of bias

Here θ̂ = 0.7763745, and mean(R) gives θ̄*, the mean of the bootstrap replicates.

A new simulation with B = 2000 in Example 7.4, pg. 189, gives a bootstrap estimate of bias of −0.0058.
29
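The bias formula θ̄* − θ̂ is one extra line on top of the bootstrap loop; a self-contained sketch using the transcribed law data (verify against the book), with an arbitrary seed:

```r
# Bootstrap estimate of bias for the correlation in the law example.
lsat <- c(576, 635, 558, 578, 666, 580, 555, 661, 651, 605,
          653, 575, 545, 572, 594)
gpa  <- c(339, 330, 281, 303, 344, 307, 300, 343, 336, 313,
          312, 274, 276, 288, 296)
n <- length(lsat); B <- 2000
theta.hat <- cor(gpa, lsat)
set.seed(2)
R <- replicate(B, { i <- sample(1:n, replace = TRUE)
                    cor(gpa[i], lsat[i]) })
bias.boot <- mean(R) - theta.hat   # small in magnitude; the book reports -0.0058
```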

Bootstrap confidence intervals

[Diagram as earlier: θ = θ(FX), θ̂ = θ(Fn(x)); the bootstrap samples x*(1), x*(2), ..., x*(B) give replicates θ̂(1), θ̂(2), ..., θ̂(B).]

The bootstrap estimate of the cdf of θ̂ is the empirical cdf of (θ̂(1), θ̂(2), ..., θ̂(B)).

30

Now we need a bootstrap confidence interval (CI) for θ.

Bootstrap confidence intervals

The typical 95% confidence interval (CI) for an estimate θ̂ of θ that you have learned is:
θ̂ ± 1.96 se(θ̂),
where θ̂ is the estimate of θ based on a sample of n observations and se(θ̂) is the standard error of θ̂.
The 95% CI is a random interval that covers the true but unknown θ with probability 0.95.
The number 1.96 is the .975-quantile of the standard normal distribution (q.975). For a general 100(1 − α)% CI, q(1 − α/2) is used.

31

Standard normal = N(0,1)

Bootstrap confidence intervals

Note that the CI θ̂ ± 1.96 se(θ̂) is
symmetric,
based on an asymptotic Normal distribution for θ̂,
only applies where θ̂ is a sample average,
and where the sample size n is large.

For example, you could not apply this method directly when θ̂ is the .95 quantile of some distribution.
32

Bootstrap confidence intervals

There are many bootstrap confidence intervals. We will study the four most commonly used:

Standard normal bootstrap
Basic bootstrap
Bootstrap percentile
Bootstrap t

And then there are even methods to accelerate these intervals; see Sec 7.5 of the book (not covered in this course).

33

Standard normal bootstrap CI

The typical 100(1 − α)% confidence interval (CI) for an estimate θ̂ of θ is θ̂ ± q(1 − α/2) se(θ̂), where q(1 − α/2) is the 1 − α/2 quantile of the standard normal distribution.
Use the typical CI above, but just replace se(θ̂) by the bootstrap estimate of standard error that we previously learned.

34
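A sketch of the standard normal bootstrap CI for a sample mean; the simulated data, seed, B, and α are all arbitrary choices:

```r
# Standard normal bootstrap CI: theta.hat +/- q(1 - alpha/2) * se.boot.
set.seed(3)
x <- rexp(50, rate = 1)           # simulated data; true mean is 1
n <- length(x); B <- 2000; alpha <- 0.05
theta.hat <- mean(x)
theta.b <- replicate(B, mean(sample(x, n, replace = TRUE)))
se.boot <- sd(theta.b)
ci <- theta.hat + c(-1, 1) * qnorm(1 - alpha/2) * se.boot
ci                                 # symmetric about theta.hat by construction
```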

Basic bootstrap CI
The standard normal bootstrap CI is still symmetric and based on the assumption of a normal distribution.
The basic bootstrap CI is more flexible. Instead of using quantiles from the normal distribution, it estimates them from the bootstrap replicates. It is therefore non-symmetric.
The 100(1 − α)% basic bootstrap CI is given by
(2θ̂ − θ̂*(1 − α/2), 2θ̂ − θ̂*(α/2)),
where θ̂*(q) is the sample q-quantile from the ecdf of the bootstrap replicates θ̂*.
The specific form of this interval is complicated to derive, so it is omitted from the course.
35
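The basic bootstrap CI changes only the last step: the quantiles come from the replicates. A sketch with arbitrary simulated exponential data:

```r
# Basic bootstrap CI: (2*theta.hat - q*(1 - alpha/2), 2*theta.hat - q*(alpha/2)).
set.seed(3)
x <- rexp(50, rate = 1)
n <- length(x); B <- 2000; alpha <- 0.05
theta.hat <- mean(x)
theta.b <- replicate(B, mean(sample(x, n, replace = TRUE)))
q <- quantile(theta.b, c(alpha/2, 1 - alpha/2))   # replicate quantiles
ci <- c(2 * theta.hat - q[[2]], 2 * theta.hat - q[[1]])
ci                                 # need not be symmetric about theta.hat
```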

Percentile bootstrap CI
More intuitive, with theoretical advantages and better average properties than the previous intervals.
The 100(1 − α)% percentile bootstrap CI is given by
(θ̂*(α/2), θ̂*(1 − α/2)),
where θ̂*(q) is the sample q-quantile from the ecdf of the bootstrap replicates θ̂*.
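The percentile CI is the simplest of all: just read off the replicate quantiles. A sketch with arbitrary simulated exponential data:

```r
# Percentile bootstrap CI: the alpha/2 and 1 - alpha/2 replicate quantiles.
set.seed(3)
x <- rexp(50, rate = 1)
n <- length(x); B <- 2000; alpha <- 0.05
theta.b <- replicate(B, mean(sample(x, n, replace = TRUE)))
ci <- quantile(theta.b, c(alpha/2, 1 - alpha/2))
ci
```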

The boot package in R

Examples 7.10 and 7.11 compare the three CIs previously shown using the boot.ci function in the R boot package. For this course you should program these intervals yourself and not use the boot package.
36

Bootstrap t CI

37

Based on the idea that standard CIs for the sample mean when the variance is unknown are based on quantiles of a t statistic; the bootstrap t CI estimates those quantiles from studentized replicates.
Note: requires B × B bootstraps (an inner bootstrap inside each replicate, to studentize it).
Complicated, but it has a theoretical advantage over all prior intervals in that it has a statistical accuracy property.
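The slide's formula did not survive extraction, so here is a hedged sketch of the usual bootstrap t construction (a nested bootstrap; the data, names, and tuning values are assumptions, not the book's code):

```r
# Bootstrap t CI for the mean: studentize each replicate with its own
# (inner-bootstrap) standard error, then use the replicate t-quantiles.
set.seed(4)
x <- rexp(30, rate = 1)
n <- length(x); B <- 200; B.inner <- 100; alpha <- 0.05
theta.hat <- mean(x)
t.b <- numeric(B); theta.b <- numeric(B)
for (b in 1:B) {
  xb <- sample(x, n, replace = TRUE)
  theta.b[b] <- mean(xb)
  se.b <- sd(replicate(B.inner, mean(sample(xb, n, replace = TRUE))))
  t.b[b] <- (theta.b[b] - theta.hat) / se.b       # studentized replicate
}
se.boot <- sd(theta.b)
qt.b <- quantile(t.b, c(alpha/2, 1 - alpha/2))    # bootstrap t quantiles
ci <- c(theta.hat - qt.b[[2]] * se.boot,
        theta.hat - qt.b[[1]] * se.boot)
ci                                                # generally not symmetric
```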

End of Chapter 7

38
