COMP 233/2
Probability and Statistics
for Computer Science
Week 10 (slides courtesy Dr. T. Fevens)
S. P. Mudur
Parameter estimation
Point estimates
Interval estimates
Reading: §7.1-7.3
COMP233 Week 10 1
11/6/2014
Estimators
“Good” Estimator
2. It is consistent : lim An 0;
n
Estimating the
Mean and Variance
• Given a population of size N, we need to find a good
estimate for its mean μ and variance σ².
• With data for all N, we can calculate the exact
values for μ and σ².
• When N is too big, we pick a sample of
reasonable (?) size n.
• Find the sample mean and sample variance. So the
question:
– How good is X̄ as an estimate of μ?
– How good is S² as an estimate of σ²?
$\frac{1}{n}\sum_{i=1}^{n} X_i$ and $\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2$
Comment
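The estimator of σ² with denominator n differs from the sample variance
$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$
in that the denominator is n rather than n − 1.
However, for n of reasonable size, these two
estimators of σ² will be approximately equal.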
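The comment above can be illustrated with a minimal sketch. The function name `variance_estimators` and the simulated normal data (mean 10, standard deviation 2) are assumptions chosen for illustration, not part of the slides.

```python
import random

def variance_estimators(sample):
    """Return (sum of squares / n, sum of squares / (n - 1)) for a sample."""
    n = len(sample)
    mean = sum(sample) / n
    ss = sum((x - mean) ** 2 for x in sample)
    return ss / n, ss / (n - 1)

random.seed(0)  # fixed seed so the run is repeatable
for n in (5, 50, 5000):
    # Hypothetical data: normal with mean 10 and standard deviation 2.
    sample = [random.gauss(10.0, 2.0) for _ in range(n)]
    with_n, with_n_minus_1 = variance_estimators(sample)
    # The ratio of the two estimators is exactly n / (n - 1).
    print(n, round(with_n, 4), round(with_n_minus_1, 4),
          round(with_n_minus_1 / with_n, 4))
```

The two values differ by the factor n/(n − 1), which is 1.25 for n = 5 and nearly 1 for n = 5000.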
Reading: §7.3
Motivation
(X̄ − ε, X̄ + ε), where ε is the maximum error of estimation.
• That is, μ ∈ (X̄ − ε, X̄ + ε).
• How confident we are in this statement depends
on (1 − α), the confidence level of the interval.
Note
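$|\bar{X} - \mu| < \varepsilon \iff -\varepsilon < \bar{X} - \mu < \varepsilon \iff \bar{X} - \varepsilon < \mu < \bar{X} + \varepsilon$
Thus, $1 - \alpha = P\{\bar{X} - \varepsilon < \mu < \bar{X} + \varepsilon\}$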
Observations
Implications of the
Central Limit Theorem
• For large n,
– The distribution of the sum of independent
identically distributed random variables is
approximately normal, although the variables
themselves need not be normally distributed.
– The distribution of the sample means is
approximately normal, with mean μ and
variance σ²/n.
• In many practical examples a sample of size 30
or more will be sufficient for the normal
approximation to work well. In some cases the
Central Limit Theorem will work even if n < 30.
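A small simulation can make these implications concrete. This is a sketch under stated assumptions: the exponential population with rate 1 (so μ = 1 and σ² = 1), the sample size n = 30, the 20000 repetitions, and the helper name `sample_means` are all chosen here for illustration.

```python
import random
import statistics

def sample_means(n, trials, rng):
    """Draw `trials` samples of size n from Exp(1) and return their means."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(trials)
    ]

rng = random.Random(233)  # fixed seed for repeatability
n = 30
means = sample_means(n, trials=20000, rng=rng)

# The population is skewed, yet the sample means should centre on mu = 1
# with variance close to sigma^2 / n = 1/30.
print("mean of sample means    :", statistics.fmean(means))
print("variance of sample means:", statistics.variance(means))
print("sigma^2 / n             :", 1 / n)
```

A histogram of `means` would look roughly bell-shaped even though individual exponential draws are strongly skewed.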
Example
Solution
Solution
Solution
(X̄ − ε, X̄ + ε)
= (23.2 − 0.6, 23.2 + 0.6)
= (22.6, 23.8)
Summary
Example
Solution
where $z_{\alpha/2} = \varepsilon\sqrt{n}/\sigma = \sqrt{n}/3$.
Solution
• Thus, $\sqrt{n}/3 = 2.58$, or n = [(2.58)(3)]² = 59.9,
• which is rounded up to 60.
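The sample-size calculation above can be sketched as a short function. The name `required_sample_size` and its parameterisation by the ratio σ/ε are assumptions made here; the slides give only the ratio 3 and the table value 2.58.

```python
import math

def required_sample_size(z_crit, sigma_over_eps):
    """Smallest n with z_crit * sigma / sqrt(n) <= eps, given sigma / eps."""
    return math.ceil((z_crit * sigma_over_eps) ** 2)

# Values from the solution: z_{0.005} = 2.58 and sigma / eps = 3.
print((2.58 * 3) ** 2)                  # about 59.9
print(required_sample_size(2.58, 3))    # 60
```

Rounding up rather than to the nearest integer guarantees the error bound is met.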
One-Sided Confidence
Intervals
• Knowing that $Z = \sqrt{n}\,(\bar{X} - \mu)/\sigma$ is a standard normal
random variable, the identities
P{Z > zα} = α and P{Z < −zα} = α
yield one-sided confidence intervals of any
desired level of confidence. Specifically, we obtain
that
$\left(\bar{X} - z_\alpha \frac{\sigma}{\sqrt{n}},\ \infty\right)$ and $\left(-\infty,\ \bar{X} + z_\alpha \frac{\sigma}{\sqrt{n}}\right)$
are, respectively, 100(1 − α) percent one-sided
upper and 100(1 − α) percent one-sided lower
confidence intervals for μ.
Example
Example, cont.
• Suppose the successive values received are
5, 8.5, 12, 15, 7, 9, 7.5, 6.5, 10.5
• Construct the following confidence intervals:
a) A 95 percent two-sided confidence interval
estimate of μ
b) A 99 percent two-sided confidence interval
estimate of μ
c) A 95 percent one-sided upper confidence
estimate of μ
d) And a 95 percent one-sided lower confidence
estimate of μ .
Solution
a) Since $\bar{X} = 81/9 = 9$,
it follows, under the assumption that the values
received are independent, that a 95 percent
confidence interval for μ is
$\left(\bar{X} - 1.96\,\frac{2}{\sqrt{9}},\ \bar{X} + 1.96\,\frac{2}{\sqrt{9}}\right) = (7.69, 10.31)$
Hence, we are “95 percent confident” that the
true message value μ lies between 7.69 and 10.31.
Solution
b) Since $\bar{X} = 81/9 = 9$,
it follows, since z0.005 = 2.58, that a 99 percent
confidence interval for μ is
$\left(\bar{X} - 2.58\,\frac{2}{3},\ \bar{X} + 2.58\,\frac{2}{3}\right) = (7.28, 10.72)$
Hence, we are “99 percent confident” that the
true message value μ lies between 7.28 and 10.72.
Solution
Reading: §7.3.1
Confidence Intervals on μ
A Problem with
Estimating the Mean
• The above procedure only works if
1. σ is known, and the X_i are normally distributed
or n is at least 30, or
2. the variance is unknown, and n is at least 40 (we use
S instead of σ in the previous slide).
• But what if σ is unknown and n < 40?
– Which is often the case.
$\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}$
Reminder
• Hence,
$\frac{S^2}{\sigma^2} \sim \frac{\chi^2_{n-1}}{n-1}$
• Or,
$\frac{S}{\sigma} \sim \sqrt{\frac{\chi^2_{n-1}}{n-1}}$
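$\frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{(\bar{X} - \mu)/(\sigma/\sqrt{n})}{\sqrt{S^2/\sigma^2}} \sim \frac{Z}{\sqrt{\chi^2_{n-1}/(n-1)}} \sim T_{n-1}$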
Conclusion
$\frac{\bar{X} - \mu}{S/\sqrt{n}} \sim T_{n-1}$
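$1 - \alpha = P\left\{\bar{X} - t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}} < \mu < \bar{X} + t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}}\right\}$
• I.e.,
$1 - \alpha = P\{\bar{X} - \varepsilon < \mu < \bar{X} + \varepsilon\}$
where $\varepsilon = t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}}$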
Example
Solution
n = 10
$t_{\alpha/2,\,n-1} = t_{0.025,\,9} = 2.262$
S = 0.08
Solution
• Thus,
$\varepsilon = t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}} = 2.262 \cdot \frac{0.08}{\sqrt{10}} = 0.057$.
(X̄ − ε, X̄ + ε)
= (0.32 − 0.057, 0.32 + 0.057)
= (0.263, 0.377)
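The arithmetic in this solution can be checked with a minimal sketch. The function name `t_interval` is hypothetical, and the critical value is passed in from the t table (as the slides do) because the Python standard library has no t quantile function.

```python
import math

def t_interval(xbar, s, n, t_crit):
    """Return (lower, upper, eps) with eps = t_crit * s / sqrt(n)."""
    eps = t_crit * s / math.sqrt(n)
    return xbar - eps, xbar + eps, eps

# Values from the solution: xbar = 0.32, S = 0.08, n = 10, t_{0.025,9} = 2.262.
lower, upper, eps = t_interval(0.32, 0.08, 10, 2.262)
print(round(eps, 3), round(lower, 3), round(upper, 3))  # 0.057 0.263 0.377
```

The same function applies to any sample summarised by its mean, standard deviation and size, given the matching table value for n − 1 degrees of freedom.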
Example
Solution
X̄ = 7041.4
S = 1610.3
Solution
• α = 0.01, d.f. = n − 1 = 6.
• From the table, $t_{\alpha/2,\,n-1} = t_{0.005,\,6} = 3.707$, with n = 7 and S = 1610.3.
• Thus, $\varepsilon = t_{\alpha/2,\,n-1}\, S/\sqrt{n} = 2256.2$.
• The 99% confidence interval is
(X̄ − ε, X̄ + ε)
= (7041.4 − 2256.2, 7041.4 + 2256.2)
= (4785.2, 9297.6)
Reading: §7.3.2
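$\sum_{k=1}^{n} (X_k - \bar{X})^2 = \sum_{k=1}^{n} (X_k - \mu)^2 - n(\bar{X} - \mu)^2,$
so
$\sum_{k=1}^{n} \left(\frac{X_k - \mu}{\sigma}\right)^2 = \frac{1}{\sigma^2}\sum_{k=1}^{n} (X_k - \bar{X})^2 + \left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}\right)^2$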
$\sum_{k=1}^{n} Z_k^2, \qquad Z^2$
$(n-1)\frac{S^2}{\sigma^2} \sim \chi^2_{n-1}$
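Confidence Intervals on σ²
• Suppose $1 - \alpha = P\{\chi^2_{\text{left},\,n-1} \le \chi^2 \le \chi^2_{\text{right},\,n-1}\}$, where
$\chi^2_{\text{left},\,n-1}$ and $\chi^2_{\text{right},\,n-1}$ are the percentage points that
cut off area α/2 in the left and right tails, respectively.
[Figure: χ² density with 4 d.f.; area α/2 in each tail and area 1 − α between $\chi^2_{\text{left},\,n-1}$ and $\chi^2_{\text{right},\,n-1}$]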
Note
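$\chi^2_{\text{left},\,n-1} \le \chi^2_{n-1} \le \chi^2_{\text{right},\,n-1}$
$\chi^2_{\text{left},\,n-1} \le \frac{(n-1)S^2}{\sigma^2} \le \chi^2_{\text{right},\,n-1}$
$\frac{1}{\chi^2_{\text{left},\,n-1}} \ge \frac{\sigma^2}{(n-1)S^2} \ge \frac{1}{\chi^2_{\text{right},\,n-1}}$
$\frac{1}{\chi^2_{\text{right},\,n-1}} \le \frac{\sigma^2}{(n-1)S^2} \le \frac{1}{\chi^2_{\text{left},\,n-1}}$
$\frac{(n-1)S^2}{\chi^2_{\text{right},\,n-1}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{\text{left},\,n-1}}$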
Confidence Intervals on σ²
Example
Solution
• d.f. = n − 1 = 19
• α = 0.05
• Two relevant areas:
α/2 = 0.025 (for $\chi^2_{\text{right},\,n-1}$), and
1 − α/2 = 0.975 (for $\chi^2_{\text{left},\,n-1}$).
• From the table,
$\chi^2_{\text{right},\,n-1}$ = 32.852, and
$\chi^2_{\text{left},\,n-1}$ = 8.907
Solution
The 95% confidence interval for σ² is
$\left(\frac{19(1.6)^2}{32.852},\ \frac{19(1.6)^2}{8.907}\right) = (1.5, 5.5)$
And the 95% confidence interval for σ is
$(\sqrt{1.5},\ \sqrt{5.5}) = (1.2, 2.3)$
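The interval above follows directly from the inequality $\frac{(n-1)S^2}{\chi^2_{\text{right}}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{\text{left}}}$, which can be sketched as below. The function name `variance_interval` is hypothetical; S = 1.6 and n = 20 are read off the 19(1.6)² term, and the χ² percentage points are the table values quoted in the solution.

```python
import math

def variance_interval(s2, n, chi2_left, chi2_right):
    """Two-sided interval for sigma^2 from table values of chi-square."""
    df = n - 1
    return df * s2 / chi2_right, df * s2 / chi2_left

# S = 1.6, n = 20, chi2_{0.975,19} = 8.907, chi2_{0.025,19} = 32.852.
var_lo, var_hi = variance_interval(1.6 ** 2, 20, 8.907, 32.852)
print(round(var_lo, 1), round(var_hi, 1))                        # 1.5 5.5
print(round(math.sqrt(var_lo), 1), round(math.sqrt(var_hi), 1))  # 1.2 2.3
```

Note that the larger table value produces the lower limit, because it appears in the denominator.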
Example 2
59 54 53 52 51
39 49 46 49 48
Solution
• Find S² = 28.2.
• # of d.f. = n − 1 = 9.
• α = 0.1.
• Two relevant areas:
α/2 = 0.05 (for $\chi^2_{\text{right},\,n-1}$), and
1 − α/2 = 0.95 (for $\chi^2_{\text{left},\,n-1}$).
• From the table,
$\chi^2_{\text{right},\,n-1}$ = 16.919, and
$\chi^2_{\text{left},\,n-1}$ = 3.325.
Solution
The 90% confidence interval for σ² is
$\left(\frac{9(28.2)}{16.919},\ \frac{9(28.2)}{3.325}\right) = (15, 76.3)$
And the 90% confidence interval for σ is
$(\sqrt{15},\ \sqrt{76.3}) = (3.87, 8.73)$
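This example can also be reproduced from the ten raw observations. The sketch below assumes the data listed under Example 2 are the full sample and takes the χ² percentage points from the solution's table lookup. Because it keeps S² unrounded (254/9 rather than 28.2), the last digit of the limits can differ slightly from the slide.

```python
import math
import statistics

data = [59, 54, 53, 52, 51, 39, 49, 46, 49, 48]
n = len(data)

# Sample variance with the n - 1 denominator.
s2 = statistics.variance(data)

# Table values for 9 d.f.: chi2_{0.05,9} = 16.919 and chi2_{0.95,9} = 3.325.
chi2_right, chi2_left = 16.919, 3.325

var_lo = (n - 1) * s2 / chi2_right
var_hi = (n - 1) * s2 / chi2_left

print("S^2            :", round(s2, 1))
print("90% CI, sigma^2:", (round(var_lo, 2), round(var_hi, 2)))
print("90% CI, sigma  :", (round(math.sqrt(var_lo), 2), round(math.sqrt(var_hi), 2)))
```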
Example 3
Solution
• Find S² = 1.366 × 10⁻⁵.
• # of d.f. = n − 1 = 9.
• α = 0.05.
• The relevant area:
1 − α = 0.95 (for $\chi^2_{\text{left},\,n-1}$, since we want the one-sided
lower confidence interval, so the lower
bound is 0 and the upper bound is a function
of $1/\chi^2_{\text{left},\,n-1}$).
• From the table,
$\chi^2_{\text{left},\,n-1} = \chi^2_{1-\alpha,\,n-1} = \chi^2_{0.95,\,9} = 3.325$.
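With these values the upper bound of the one-sided interval can be computed as sketched below. This assumes the bound has the form $(n-1)S^2/\chi^2_{1-\alpha,\,n-1}$, consistent with the two-sided derivation earlier; the function name is hypothetical, and the slide's own final figure did not survive in this text, so no comparison is made.

```python
import math

def variance_upper_bound(s2, n, chi2_left):
    """Upper limit of the one-sided interval (0, (n - 1) * s2 / chi2_left)."""
    return (n - 1) * s2 / chi2_left

# S^2 = 1.366e-5, n = 10 (9 d.f.), chi2_{0.95,9} = 3.325 from the table.
upper_var = variance_upper_bound(1.366e-5, 10, 3.325)
print("upper bound on sigma^2:", upper_var)
print("upper bound on sigma  :", math.sqrt(upper_var))
```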
Solution
$\bar{X} = \sum_{i=1}^{n} X_i / n, \qquad S^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 / (n-1)$
Example
Future Plans