CVEN2002/2702
Week 8
This lecture
Additional reading: Sections 5.6 (pp. 234-235), 7.2, 7.3 (pp. 303-306),
7.4 in the textbook
CVEN2002/2702 (Statistics)
Dr Justin Wishart
2 / 36
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2, \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$$
$S_T = \mathbb{R}$
for y > 0
[Figure: Student's t distribution, pdf f(t) and cdf F(t)]
[Figure: densities f(t) of the Student's t distributions t_1, t_2, t_5, t_10 and t_50, compared with the N(0,1) density]
$$E(T) = 0 \qquad \text{and} \qquad \text{Var}(T) = \frac{\nu}{\nu - 2} \quad (\text{for } \nu > 2)$$
The Student's t distribution is similar in shape to the standard normal
distribution: both densities are symmetric, unimodal and bell-shaped,
and both reach their maximum at 0
However, the Student's t distribution has heavier tails than the normal,
so the random variable T is more likely to be found far away from 0
than Z is
This is more marked for small values of $\nu$
As the number of degrees of freedom increases, t-distributions look
more and more like the standard normal distribution
In fact, it can be shown that the Student's t distribution with $\nu$ degrees
of freedom approaches the standard normal distribution as $\nu \to \infty$
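The heavier tails can be checked numerically. The following is a minimal sketch, not part of the original slides: it evaluates the standard Student's t density formula using only the Python standard library and compares it with the N(0,1) density at t = 3 and t = 0. The function names `t_pdf` and `normal_pdf` are illustrative choices.

```python
import math


def t_pdf(t: float, nu: float) -> float:
    """Density of the Student's t distribution with nu degrees of freedom."""
    # log of the normalising constant Gamma((nu+1)/2) / (sqrt(nu*pi) * Gamma(nu/2))
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    # kernel (1 + t^2/nu)^(-(nu+1)/2), computed on the log scale for stability
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(t * t / nu))


def normal_pdf(z: float) -> float:
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)


# Compare tail (t = 3) and peak (t = 0) values for the degrees of freedom
# shown in the figure on the slides.
for nu in (1, 2, 5, 10, 50):
    print(f"t_{nu}: f(3) = {t_pdf(3.0, nu):.5f}, f(0) = {t_pdf(0.0, nu):.5f}")
print(f"N(0,1): f(3) = {normal_pdf(3.0):.5f}, f(0) = {normal_pdf(0.0):.5f}")
```

For each listed value of the degrees of freedom, the density at 3 is larger than the normal density and the density at 0 is smaller, and both gaps shrink as the degrees of freedom grow.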
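[Figure: density f(t) of $T \sim t_\nu$, with the quantile $t_{\nu;\alpha}$ marked on the t axis]
For $T \sim t_\nu$, the quantile $t_{\nu;\alpha}$ is the value such that
$$P(T > t_{\nu;\alpha}) = 1 - \alpha$$
By symmetry of the density, $t_{\nu;1-\alpha} = -t_{\nu;\alpha}$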
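$$P\left(\bar{X} - t_{n-1;1-\alpha/2}\,\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{n-1;1-\alpha/2}\,\frac{S}{\sqrt{n}}\right) = 1 - \alpha$$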
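As an illustration of the interval above, here is a minimal sketch that computes a two-sided t-confidence interval for the mean. The sample values are hypothetical (they do not come from the lecture), and the quantile $t_{9;0.975} = 2.262$ is supplied by hand from a standard t table because the Python standard library has no t-quantile function. The name `t_confidence_interval` is an illustrative choice.

```python
import math
import statistics


def t_confidence_interval(data, t_quantile):
    """Two-sided t-confidence interval for the mean of a normal population.

    t_quantile is t_{n-1;1-alpha/2}, looked up in a t table by the caller.
    """
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divisor n - 1)
    half_width = t_quantile * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# Hypothetical sample of n = 10 measurements (illustration only).
sample = [10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.0, 9.7, 10.3, 10.1]
T_9_0975 = 2.262  # t_{9;0.975} from a t table, for a 95% interval with n = 10

ci_low, ci_high = t_confidence_interval(sample, T_9_0975)
print(f"95% t-confidence interval for the mean: [{ci_low:.3f}, {ci_high:.3f}]")
```

The interval is centred at the sample mean, and its half-width shrinks at rate $1/\sqrt{n}$ as the sample grows.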
[Figure: normal quantile plot, Theoretical Quantiles against Sample Quantiles]
$$T \overset{a}{\sim} N(0, 1)$$
[Figure: density histogram of concentration, and Normal QQ Plot (Theoretical Quantiles against Sample Quantiles)]
$$E(X_{n+1}) = \mu, \qquad \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$
$$Z = \frac{X_{n+1} - \bar{X}}{\sigma\sqrt{1 + \frac{1}{n}}} \sim N(0, 1)$$
$$T = \frac{X_{n+1} - \bar{X}}{S\sqrt{1 + \frac{1}{n}}} \sim t_{n-1}$$
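Manipulating Z and T as we did previously for confidence intervals leads to the
$100 \times (1 - \alpha)\%$ z- and t-prediction intervals on the future observation:
$$\left[\bar{x} - z_{1-\alpha/2}\,\sigma\sqrt{1 + \frac{1}{n}},\ \bar{x} + z_{1-\alpha/2}\,\sigma\sqrt{1 + \frac{1}{n}}\right]$$
$$\left[\bar{x} - t_{n-1;1-\alpha/2}\,s\sqrt{1 + \frac{1}{n}},\ \bar{x} + t_{n-1;1-\alpha/2}\,s\sqrt{1 + \frac{1}{n}}\right]$$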
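The two prediction intervals can be sketched in code as follows. This is an illustrative example under stated assumptions: the sample is hypothetical, the function names are not from the lecture, and the t quantile $t_{9;0.975} = 2.262$ is entered by hand from a t table. The normal quantile comes from `statistics.NormalDist` in the Python standard library.

```python
import math
import statistics
from statistics import NormalDist


def z_prediction_interval(xbar, sigma, n, level=0.95):
    """z-prediction interval for a future observation (sigma known)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # z_{1-alpha/2}
    half_width = z * sigma * math.sqrt(1 + 1 / n)
    return xbar - half_width, xbar + half_width


def t_prediction_interval(data, t_quantile):
    """t-prediction interval for a future observation (sigma unknown).

    t_quantile is t_{n-1;1-alpha/2}, looked up in a t table by the caller.
    """
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divisor n - 1)
    half_width = t_quantile * s * math.sqrt(1 + 1 / n)
    return xbar - half_width, xbar + half_width


# Hypothetical sample of n = 10 measurements (illustration only).
sample = [10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.0, 9.7, 10.3, 10.1]
pi_low, pi_high = t_prediction_interval(sample, t_quantile=2.262)
print(f"95% t-prediction interval: [{pi_low:.3f}, {pi_high:.3f}]")
```

The only difference from the confidence interval on the mean is the factor $\sqrt{1 + 1/n}$ in place of $1/\sqrt{n}$, so the prediction interval is wider by a factor of $\sqrt{n + 1}$ and does not shrink to a point as n grows.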
Length of the z-prediction interval:
$$2\, z_{1-\alpha/2}\,\sigma\sqrt{1 + \frac{1}{n}}$$
Estimation of a proportion
In this situation, the random variable to study is
$$X = \begin{cases} 1 & \text{if the individual has the characteristic of interest} \\ 0 & \text{if not} \end{cases}$$
which is Bernoulli distributed (see Slide 9, Week 5), with parameter $\pi$
being the value of interest:
$$X \sim \text{Bern}(\pi)$$
The random sample $X_1, X_2, \ldots, X_n$ is a set of n independent $\text{Bern}(\pi)$
random variables, so the number Y of individuals of the sample with the characteristic is
$$Y = \sum_{i=1}^{n} X_i \sim \text{Bin}(n, \pi)$$
and the sample proportion is
$$\hat{P} = \frac{Y}{n}$$
Estimation of a proportion
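This sample proportion $\hat{P}$ is obviously a natural candidate for
estimating the population proportion $\pi$
From the properties of the Binomial distribution, we know that
$$E(Y) = n\pi \qquad \text{and} \qquad \text{Var}(Y) = n\pi(1 - \pi)$$
so that
$$E(\hat{P}) = \pi \qquad \text{and} \qquad \text{Var}(\hat{P}) = \frac{1}{n^2}\text{Var}(Y) = \frac{n\pi(1 - \pi)}{n^2} = \frac{\pi(1 - \pi)}{n}$$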
Sampling distribution
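We could make inference about $\pi$ from $\hat{p}$ using the Binomial
distribution of Y. However, it is probably easier to use the Central
Limit Theorem (Slides 33-34, Week 7). Indeed:
$$\hat{P} = \frac{Y}{n} = \frac{1}{n}\sum_{i=1}^{n} X_i,$$
so $\hat{P}$ is a sample mean and
$$\sqrt{n}\,\frac{\hat{P} - \pi}{\sqrt{\pi(1 - \pi)}} \overset{a}{\sim} N(0, 1)$$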
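Because
$$\sqrt{n}\,\frac{\hat{P} - \pi}{\sqrt{\pi(1 - \pi)}} \overset{a}{\sim} N(0, 1)$$
is just a particular case of $\sqrt{n}\,\frac{\bar{X} - \mu}{\sigma} \overset{a}{\sim} N(0, 1)$, we can use (almost)
directly the large-sample confidence interval we derived for a mean
Specifically, we have that
$$P\left(-z_{1-\alpha/2} \le \sqrt{n}\,\frac{\hat{P} - \pi}{\sqrt{\pi(1 - \pi)}} \le z_{1-\alpha/2}\right) \simeq 1 - \alpha$$
or
$$P\left(\hat{P} - z_{1-\alpha/2}\sqrt{\frac{\pi(1 - \pi)}{n}} \le \pi \le \hat{P} + z_{1-\alpha/2}\sqrt{\frac{\pi(1 - \pi)}{n}}\right) \simeq 1 - \alpha$$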
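The standard deviation of $\hat{P}$,
$$\text{sd}(\hat{P}) = \sqrt{\frac{\pi(1 - \pi)}{n}},$$
depends on the unknown $\pi$, so it is replaced by its estimate
$$\widehat{\text{sd}}(\hat{P}) = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
in the expression of the confidence interval
Consequently, if $\hat{p}$ is the sample proportion in an observed random
sample of size n, an approximate two-sided confidence interval of level
$100 \times (1 - \alpha)\%$ for $\pi$ is given by
$$\left[\hat{p} - z_{1-\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}},\ \hat{p} + z_{1-\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}\right]$$
As this is based on the CLT and requires n large, it is a large sample
confidence interval for $\pi$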
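Approximate one-sided confidence intervals of level $100 \times (1 - \alpha)\%$ for $\pi$:
$$\left[0,\ \hat{p} + z_{1-\alpha}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}\right]$$
and
$$\left[\hat{p} - z_{1-\alpha}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}},\ 1\right]$$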
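$$\widehat{\text{sd}}(\hat{P}) = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = \sqrt{\frac{0.118 \times (1 - 0.118)}{85}} = 0.035$$
and an approximate two-sided 95% confidence interval for $\pi$ is
$$\left[\hat{p} \pm z_{0.975}\,\widehat{\text{sd}}(\hat{P})\right] = [0.118 \pm 1.96 \times 0.035] = [0.049, 0.186]$$
So we are 95% confident that the true proportion of produced bearings
outside specifications is between 0.049 and 0.186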
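The calculation in this example can be reproduced with a short script. This is a minimal sketch using only the Python standard library. The function name `proportion_ci` is an illustrative choice, and the inputs are the rounded sample proportion 0.118 and the sample size 85 quoted above, so the printed limits may differ from the slide's values in the third decimal place.

```python
import math
from statistics import NormalDist


def proportion_ci(p_hat, n, level=0.95):
    """Large-sample two-sided confidence interval for a proportion."""
    alpha = 1 - level
    z = NormalDist().inv_cdf(1 - alpha / 2)      # z_{1-alpha/2}
    sd_hat = math.sqrt(p_hat * (1 - p_hat) / n)  # estimated sd of the sample proportion
    return p_hat - z * sd_hat, p_hat + z * sd_hat


# Rounded values taken from the worked example: p_hat = 0.118, n = 85.
prop_low, prop_high = proportion_ci(0.118, 85)
print(f"95% confidence interval: [{prop_low:.3f}, {prop_high:.3f}]")
```

Because the interval relies on the Central Limit Theorem, it should only be trusted for large n.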
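$$P\left(-z_{1-\alpha/2} \le \sqrt{n}\,\frac{\hat{P} - \pi}{\sqrt{\pi(1 - \pi)}} \le z_{1-\alpha/2}\right) \simeq 1 - \alpha$$
The three-way inequality is quadratic in $\pi$; solving it for $\pi$
gives the following confidence interval
$$(1 - w)\,\hat{p} + w\,\frac{1}{2} \pm z_{1-\alpha/2}\sqrt{(1 - w)\,\frac{\hat{p}(1 - \hat{p})}{n + z_{1-\alpha/2}^2} + w\,\frac{1}{4(n + z_{1-\alpha/2}^2)}}$$
where
$$w = \frac{z_{1-\alpha/2}^2}{n + z_{1-\alpha/2}^2}$$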
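The interval obtained by solving the quadratic can be computed directly from the weight w. The sketch below is an illustration, not the lecturer's code: the function name `quadratic_ci` is hypothetical, and the example inputs (0.118 and 85) are the rounded values from the bearings example earlier in the lecture.

```python
import math
from statistics import NormalDist


def quadratic_ci(p_hat, n, level=0.95):
    """Confidence interval for a proportion from the quadratic inequality."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # z_{1-alpha/2}
    denom = n + z ** 2
    w = z ** 2 / denom                             # weight given to 1/2
    centre = (1 - w) * p_hat + w * 0.5             # shrinks p_hat towards 1/2
    half_width = z * math.sqrt(
        (1 - w) * p_hat * (1 - p_hat) / denom + w / (4 * denom)
    )
    return centre - half_width, centre + half_width


# Rounded values from the bearings example: p_hat = 0.118, n = 85.
q_low, q_high = quadratic_ci(0.118, 85)
print(f"95% interval from the quadratic: [{q_low:.3f}, {q_high:.3f}]")
```

The centre is a weighted average of the sample proportion and 1/2, so unlike the simpler large-sample interval, this one is not centred at the sample proportion and always stays within [0, 1].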
Objectives
Now you should be able to:
Construct z- and t-confidence intervals on the mean of a normal
distribution, choosing appropriately between the normal distribution
and the Student's t distribution
Construct large sample confidence intervals on a mean of an
arbitrary distribution with unknown variance
Explain the difference between a confidence interval and a
prediction interval
Construct prediction intervals for a future observation in a normal
population
Construct confidence intervals on a population proportion
Recommended exercises: Q7, Q9 p.301, Q13, Q15 p.302, Q20
p.303, Q35 p.319, Q39 p.320, Q43(a-b) p.320, Q55 p.328, (optional)
Q71, Q73 p.340, Q55 p.238