The Bootstrap

Introduced by Brad Efron in 1979:

B. Efron. Bootstrap methods: another look at the jackknife. Annals of Statistics, 7:1-26, 1979.

B. Efron. Nonparametric estimates of standard error: the jackknife, the bootstrap, and other methods. Biometrika, 68:589-599, 1981.
Bootstrap methods are a class of nonparametric Monte Carlo methods that estimate the distribution of a population by resampling.
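For a first look at what "resampling" means in R, here is a minimal sketch; the data vector x is an arbitrary placeholder sample:

# a toy sample of n = 10 observations (placeholder values)
x <- c(2.1, 3.4, 1.7, 4.0, 2.8, 3.1, 0.9, 2.5, 3.7, 1.2)
n <- length(x)

# one bootstrap sample: n draws from x, with replacement
xstar <- sample(x, size = n, replace = TRUE)
xstar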
The Bootstrap

Suppose the only thing you have is a sample of $n$ observations: $x_1, x_2, \ldots, x_n$. You do not want to make an assumption about a parametric distribution for the data; for example, you do not want to assume they are normally distributed or Student t-distributed. In other words, you want to be nonparametric.

The MC methods in Ch. 6 of the book can be called the parametric bootstrap, because simulation was performed using a parametric model, e.g. $N(\mu, \sigma^2)$. We want to repeat the same types of things in this chapter without using a parametric model.
The Bootstrap

From the observed sample $x = (x_1, x_2, \ldots, x_n)$, whose ecdf $F_n(x)$ estimates the true cdf $F_X$, draw $B$ bootstrap samples by resampling with replacement:

$x^{*(1)} = (x_1^{*(1)}, x_2^{*(1)}, \ldots, x_n^{*(1)}), \quad x^{*(2)} = (x_1^{*(2)}, x_2^{*(2)}, \ldots, x_n^{*(2)}), \quad \ldots, \quad x^{*(B)} = (x_1^{*(B)}, x_2^{*(B)}, \ldots, x_n^{*(B)}),$

with corresponding ecdfs $F_n^{*(1)}(x^*), F_n^{*(2)}(x^*), \ldots, F_n^{*(B)}(x^*)$.

Chain of convergence: $F_n^{*(b)}(x^*) \to F_n(x)$ and $F_n(x) \to F(x)$.
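This chain can be seen empirically by overlaying the ecdfs of the bootstrap samples on the ecdf of the original sample. A sketch (the N(0,1) placeholder sample is an arbitrary choice):

set.seed(1)
x <- rnorm(30)            # placeholder sample; any data vector works
n <- length(x)
B <- 200

Fn <- ecdf(x)             # ecdf of the observed sample
plot(Fn, main = "ecdfs of bootstrap samples vs. original sample")
for (b in 1:B) {
  xstar <- sample(x, size = n, replace = TRUE)
  # ecdf of the b-th bootstrap sample, F_n^{*(b)}
  plot(ecdf(xstar), add = TRUE, col = "grey", do.points = FALSE, verticals = TRUE)
}
plot(Fn, add = TRUE, do.points = FALSE, verticals = TRUE, lwd = 2)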
Example 7.1

$n = 10$, very small.
The parameter is a functional of the true cdf, $\theta = \theta(F_X)$. From the sample $x = (x_1, x_2, \ldots, x_n)$ we compute the estimate $\hat{\theta} = \theta(F_n(x))$. From the bootstrap samples $x^{*(1)} = (x_1^{*(1)}, \ldots, x_n^{*(1)}), x^{*(2)} = (x_1^{*(2)}, \ldots, x_n^{*(2)}), \ldots$ we compute the bootstrap replicates

$\hat{\theta}^{*(1)} = \theta(F_n^{*(1)}), \quad \hat{\theta}^{*(2)} = \theta(F_n^{*(2)}), \quad \ldots, \quad \hat{\theta}^{*(B)} = \theta(F_n^{*(B)}).$
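In code, this scheme is one loop. A minimal sketch with the sample median playing the role of $\theta$ (placeholder data):

set.seed(1)
x <- rnorm(30)                 # placeholder sample
n <- length(x)
B <- 1000

theta.hat <- median(x)         # theta-hat = theta(F_n)
theta.star <- numeric(B)       # bootstrap replicates theta-hat*(b)
for (b in 1:B) {
  xstar <- sample(x, size = n, replace = TRUE)
  theta.star[b] <- median(xstar)
}
hist(theta.star)               # bootstrap distribution of theta-hat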
Example 7.2

Question 1: Estimate the correlation between LSAT and GPA scores based on the random sample above.

Solution: What is $\theta$? How do we estimate it in R?
Example 7.2

[Scatterplot of the sample: gpa (x-axis, 280 to 340) against lsat (y-axis, 560 to 660).]
Example 7.2

Question 1: Estimate the correlation between LSAT and GPA scores based on the random sample above.

Solution: What is $\theta$? The population correlation coefficient between LSAT and GPA scores. How do we estimate it in R? With the cor() function, with the default method = "pearson" used.
> cor(gpa, lsat)
[1] 0.7763745
ANSWER

> cor(gpa, lsat, method = "spearman")   # for comparison
[1] 0.7964286
What is $x_1$, $x_1^*$?
Recall $\hat{\theta} = 0.776$.
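A sketch of the bootstrap for this example, assuming the vectors gpa and lsat from the sample above. Note that here $x_1 = (\mathrm{lsat}_1, \mathrm{gpa}_1)$ is a pair, so $x_1^*$ must be a resampled pair: we resample indices so that each (lsat, gpa) pair stays together.

set.seed(1)
n <- length(lsat)                 # assumes lsat and gpa from the sample above
B <- 2000
theta.star <- numeric(B)          # bootstrap replicates of the correlation
for (b in 1:B) {
  i <- sample(1:n, size = n, replace = TRUE)   # resample pairs via indices
  theta.star[b] <- cor(gpa[i], lsat[i])
}
sd(theta.star)                    # bootstrap estimate of se(theta-hat)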
Bootstrapping packages in R

Pg. 187, 188:

boot(boot)
bootstrap(bootstrap)

Not covered in this class, because we learn to do this ourselves; you can use these packages later in your job.
Bias

The bias of an estimator $\hat{\theta}$ for $\theta$ is

$\mathrm{bias}(\hat{\theta}) = E[\hat{\theta} - \theta] = E[\hat{\theta}] - E[\theta] = E[\hat{\theta}] - \theta.$

An estimator is unbiased if $\mathrm{bias}(\hat{\theta}) = 0$.

Example 1: If $X_1, \ldots, X_n$ are i.i.d. from a distribution with population mean $\mu$, then the sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ is an unbiased estimator for $\mu$.

Proof: $E[\bar{X}] = E\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right] = \frac{1}{n}\,E\left[\sum_{i=1}^{n} X_i\right] = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n}\sum_{i=1}^{n} \mu = \frac{1}{n}\,n\mu = \mu.$
Bias

Not every estimator is unbiased. For example, the maximum likelihood estimator of the variance $\sigma^2$ of a population,

$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2,$

has bias $-\frac{\sigma^2}{n}$. That is why the unbiased estimator

$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$

is traditionally used; it is also the one implemented in R. Proof next.
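A quick Monte Carlo check of this bias, as a sketch; the N(0,1) population (so $\sigma^2 = 1$) and $n = 10$ are arbitrary choices:

set.seed(1)
n <- 10
m <- 100000                         # number of simulated samples
sigma2.hat <- replicate(m, {
  x <- rnorm(n)                     # population variance is 1
  mean((x - mean(x))^2)             # MLE: divides by n
})
mean(sigma2.hat)                    # close to (n-1)/n = 0.9, not 1
mean(sigma2.hat) - 1                # close to the bias -sigma^2/n = -0.1
var(rnorm(n))                       # R's var() uses the unbiased 1/(n-1) form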
Proof

Assume $X_1, X_2, \ldots, X_n$ i.i.d. with mean $\mu$ and variance $\sigma^2$. Show $\hat{\sigma}^2$ is biased for $\sigma^2$.

Proof: Just showed $E(\bar{X}) = \mu$. Next note that

$\mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\,\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\sum_{i=1}^{n}\mathrm{Var}(X_i) = \frac{1}{n^2}\,n\sigma^2 = \frac{\sigma^2}{n}.$

$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2 = \frac{1}{n}\left(\sum_{i=1}^{n} X_i^2 - n\bar{X}^2\right) = \frac{1}{n}\sum_{i=1}^{n} X_i^2 - \bar{X}^2.$

$E(\hat{\sigma}^2) = \frac{1}{n}\sum_{i=1}^{n} E(X_i^2) - E(\bar{X}^2) = \frac{1}{n}\,n E(X_1^2) - E(\bar{X}^2) = E(X_1^2) - E(\bar{X}^2)$

$= \mathrm{Var}(X_1) - \mathrm{Var}(\bar{X}) \quad \text{(since } E(X_1) = E(\bar{X}) = \mu\text{)}$

$= \sigma^2 - \frac{\sigma^2}{n} = \frac{(n-1)\,\sigma^2}{n} \ne \sigma^2,$

so $\mathrm{bias}(\hat{\sigma}^2) = E(\hat{\sigma}^2) - \sigma^2 = -\frac{\sigma^2}{n}$.
Recall the scheme: $\theta = \theta(F_X)$; from the sample $x = (x_1, x_2, \ldots, x_n)$, $\hat{\theta} = \theta(F_n(x))$; from the bootstrap samples $x^{*(1)}, x^{*(2)}, \ldots$, the replicates $\hat{\theta}^{*(1)} = \theta(F_n^{*(1)}), \hat{\theta}^{*(2)}, \ldots, \hat{\theta}^{*(B)}$.
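These replicates can be used to estimate the bias of $\hat{\theta}$ itself. The standard bootstrap bias estimate is the mean of the replicates minus $\hat{\theta}$; a sketch with placeholder data and the median as $\theta$:

set.seed(1)
x <- rnorm(30)                             # placeholder sample
B <- 2000
theta.hat <- median(x)
theta.star <- replicate(B, median(sample(x, replace = TRUE)))
mean(theta.star) - theta.hat               # bootstrap estimate of bias(theta-hat)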
Standard normal bootstrap CI

$\hat{\theta} \pm 1.96\,\mathrm{se}(\hat{\theta}),$

where $\hat{\theta}$ is the estimate of $\theta$ based on a sample of $n$ observations and $\mathrm{se}(\hat{\theta})$ is the standard error of $\hat{\theta}$.

The 95% CI is a random interval that covers the true but unknown $\theta$ with probability 0.95. The number 1.96 is the .975-quantile of the standard normal distribution ($q_{.975}$). For a general 100(1 − α)% CI, $q_{1-\alpha/2}$ is used.
For example, you could not apply this method directly when $\theta$ is the .95 quantile of some distribution.
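A sketch of this interval with the standard error estimated by the bootstrap (placeholder data; the sample mean as $\hat{\theta}$):

set.seed(1)
x <- rnorm(30)                             # placeholder sample
B <- 2000
theta.hat <- mean(x)
theta.star <- replicate(B, mean(sample(x, replace = TRUE)))
se.hat <- sd(theta.star)                   # bootstrap estimate of se(theta-hat)
theta.hat + c(-1.96, 1.96) * se.hat        # standard normal bootstrap 95% CI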
Basic bootstrap CI

The standard normal bootstrap CI is still symmetric and based on the assumption of a normal distribution.

The basic bootstrap CI is more flexible. Instead of using quantiles from the normal distribution, it estimates them from the bootstrap replications. It is therefore non-symmetric.

The 100(1 − α)% basic bootstrap CI is given by

$(2\hat{\theta} - \hat{\theta}^*_{1-\alpha/2},\ 2\hat{\theta} - \hat{\theta}^*_{\alpha/2}),$

where $\hat{\theta}^*_q$ is the sample q-quantile from the ecdf of the bootstrap replicates $\hat{\theta}^*$. The specific form of this interval is complicated to derive and is omitted from the course.
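Continuing the sketch above, the basic interval only needs quantiles of the replicates:

alpha <- 0.05
q <- quantile(theta.star, c(alpha/2, 1 - alpha/2))   # bootstrap quantiles
c(2 * theta.hat - q[2], 2 * theta.hat - q[1])        # basic bootstrap 95% CI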
Percentile bootstrap CI

More intuitive, with theoretical advantages and better average properties than the previous intervals.

The 100(1 − α)% percentile bootstrap CI is given by

$(\hat{\theta}^*_{\alpha/2},\ \hat{\theta}^*_{1-\alpha/2}),$

where $\hat{\theta}^*_q$ is the sample q-quantile from the ecdf of the bootstrap replicates $\hat{\theta}^*$.
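Again continuing the sketch, the percentile interval is just the two quantiles themselves:

quantile(theta.star, c(alpha/2, 1 - alpha/2))        # percentile bootstrap 95% CI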
Bootstrap t CI

Based on the idea that standard CIs for the sample mean when the variance is unknown are based on quantiles of the Student t distribution.

Note: requires B × B bootstraps. Complicated, but it has a theoretical advantage over all prior intervals in that it possesses a statistical property the others lack.
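A sketch of the idea under these assumptions (placeholder data; the sample mean as $\hat{\theta}$). Each replicate is studentized using a standard error from an inner bootstrap of its own resample, which is why roughly B × B resamples are needed:

set.seed(1)
x <- rnorm(30)                   # placeholder sample
n <- length(x)
B <- 500                         # outer bootstraps
B2 <- 100                        # inner bootstraps (se of each replicate)
theta.hat <- mean(x)

theta.star <- numeric(B)
t.star <- numeric(B)
for (b in 1:B) {
  xstar <- sample(x, size = n, replace = TRUE)
  theta.star[b] <- mean(xstar)
  # inner bootstrap: estimate se(theta-hat*(b)) from xstar itself
  inner <- replicate(B2, mean(sample(xstar, size = n, replace = TRUE)))
  t.star[b] <- (theta.star[b] - theta.hat) / sd(inner)   # studentized replicate
}

se.hat <- sd(theta.star)         # outer bootstrap se of theta-hat
q <- quantile(t.star, c(0.025, 0.975))
c(theta.hat - q[2] * se.hat, theta.hat - q[1] * se.hat)  # bootstrap t 95% CI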
End of Chapter 7