Documente Academic
Documente Profesional
Documente Cultură
Module- 5
ManavazhaganR@Bsnl.in
Estimation
ManavazhaganR@Bsnl.in
Sample , Population – Definition &
Notations
Population:
Population consists of the totality of the
observations with which we are concerned.
Population Parameter:
A numerical measure of a population is called
population parameter. Eg. Mean , standard deviation.
Sample:
A Sample is a subset of a Population.
Statistic:
A numerical measure of the sample is
called statistic
Population parameters are estimated
using the sample statistic.
E.G : Mean, Median,ManavazhaganR@Bsnl.in
Mode, S.D,etc
Sampling Distribution – Definition
& Notations
Sampling Distribution:
The probability distribution of a statistic
is called sampling distribution.
Sampling Distribution of mean:
The probability distribution of sample mean is
ManavazhaganR@Bsnl.in
Sampling Distribution – Definition
& Notations
Sampling Distribution:
The probability distribution of a statistic
is called sampling distribution.
Sampling Distribution of mean:
The probability distribution of sample mean is
called the sampling distribution of the
mean.
Population Sample Sample Sampling
Water in Statistic
10 – Gallon Mean Distribution
no. of Sampling
river
Basket ball containers
Group of 5 bacteria
Median / distribution
Sampling
team players million
height of mean
distribution
parts of median.
ManavazhaganR@Bsnl.in
Sampling Distribution –
Notations
ManavazhaganR@Bsnl.in
Sampling Distribution of mean
Im p o rta n ce o f S a m p lin g D istrib u tio n
on
, th e sa m p lin g d istrib u tio n o f m e a n , still w ill b e N o rm
C e n tra l L im it T h e o re m
ple of size n taken from a population with mean μ and S.D σ , the
bution.
ManavazhaganR@Bsnl.in
Sampling Distribution of Standard
Deviation
Estimation
om sample of size n taken from a population having v
ManavazhaganR@Bsnl.in
Estimation – Definition & Notations
Estimator:
Estimation
When a sample statistic is used to
estimate a population parameter, the
statistic is called an estimator of the
parameter. It is the statistical inference
about
Estimator the Populatio
--------- Population
( Sample Estimates nµ
xStatistic
) Estimates
---- Paramete
s
---
σ
r
ManavazhaganR@Bsnl.in
Estimation – Definition & Notations
Estimation
• The two types of estimators:
1. Point estimates – Mean, Median, S.D,
etc.
2. Interval estimates – Standard error of
the mean , t, z , F tests
ManavazhaganR@Bsnl.in
Point Estimates
Type Mean Standard Deviation
Sample X=
Sample
Frequency
Sampling -do- -do-
Distribution
distribution
Sampling =µ =
distribution ( Finite
(Infinite Population
Population )
) =µ = ( sampling
error ) ( sampling
error )
ManavazhaganR@Bsnl.in
Interval Estimates –
Confidence Interval
A C o n fid e n ce in te rv a l is a ra n g e o f
n u m b ManavazhaganR@Bsnl.in
e rs
Confidence Interval – Format
erval Estimate =
Point Estimate
±
Value of Confidence Level * Sampling E
ManavazhaganR@Bsnl.in
Confidence Interval :– σ known
e = P o in t E stim a te ± Z , t, f v a lu e o f C o n fid e n ce L e v e l *
Confidence
Level
90 % 1.65
95% 1.96
99% 2.58
ManavazhaganR@Bsnl.in
Confidence Interval :– Sample size > 30, σ
estimated by s
e = P o in t E stim a te ± Z , t, f v a lu e o f C o n fid e n ce L e v e l *
Confidence
Level
90 % 1.65
95% 1.96
99% 2.58
ManavazhaganR@Bsnl.in
Confidence Interval :– Sample size < 30, σ
estimated by s
e = P o in t E stim a te ± v a lu e o f C o n fid e n ce L e v e l
ManavazhaganR@Bsnl.in
Sampling Distribution – Definition
& Notations
Property Symbol
The sampling = µ
distribution
The samplingof mean
equal to the population
distribution has a
mean
standard deviation =
( Standard error )
equal to the population
standard deviation
divided by the square
root of the sample size
ManavazhaganR@Bsnl.in
Hypothesis
ManavazhaganR@Bsnl.in
Hypothesis
• Usually the Experiments are done to
test a specific question.
• This question is written in a specific
format called a hypothesis.
•
se t o u t to te st a specific qu
ManavazhaganR@Bsnl.in
Experiments set out to test a
specific question
Case - 1 :
A car manufacturer has
redesigned an existing engine.
The old model engine can be
used to travel 20 Km. per liter
of petrol. He wishes to test
that, the new design
increases the mean mileage
per liter. ManavazhaganR@Bsnl.in
Hypothesis Testing -
Procedure
Write Hypothesis
Format
H1 : The new design increases of the mean
mileage or ManavazhaganR@Bsnl.in
out to test a specific question
1. Write Hypothesis
2. Choose a Test Statistic & Type of Tail
3. Calculate Summary value from our sample using the Test Statistics
mileage. Ho <=20
H1 :The new design increases the mean
mileage.H1>20
Mean Mileage Mean is the
Format of
test statistic
1. ManavazhaganR@Bsnl.in
1. Write
t o u t to te st a sp e cific q u e stio n Hypothesis
2. Choose a Test Statistic & Type of Tail
3. Calculate Summary value from our sample using the Test Statistics
mileage. H1 >20
M e a n M ile a g e o f n e w e n g in e
1. N e w E n g inFormat
e ManavazhaganR@Bsnl.inof
Format of Hypothesis Exercise-1
ManavazhaganR@Bsnl.in
Format of Hypothesis Exercise-2
ManavazhaganR@Bsnl.in
What are we going to do in
Hypothesis Testing ?
Hypothesis Testing is
examining our data to
accept one of the two
questions :
1.Is the variation in our data due to
ManavazhaganR@Bsnl.in
Test Statistics - Formulae
Test Statistic Formula
Z – test
(Z-Single
test Mean )
(t Difference
– Test of two means )
(t –Single
Test Mean )
( Difference of two
means )
n - 1 degrees of freed
ManavazhaganR@Bsnl.in
Choosing the Tail - Quiz..
Case-1
A car manufacturer has
redesigned an existing engine. The
old model engine can be used to
travel 20 Km. per liter of petrol. He
wishes to test that, the new design
Upper
increases the mean mileage
tail per
liter.
Case-2
With the attendance data if a
Faculty wishes to find outUpper
whether
the attendance of students is high on
tail
ManavazhaganR@Bsnl.in
Choosing the Tail - Quiz..
C a se - 3 :
T h e M a n a g e r o f M T R H o te lsa ys
th a t th e m e a n b illfo r a cu sto m e r is
n o t R s. 3 0 0 . H e d e cid e s to te st th e
m a n a g e r’ s cla im . H e u se s a sa m p le o f
m o n th ly b ills to te st th e MTwoa n a g e r’ s
cla im . tail
ManavazhaganR@Bsnl.in
1. Write Hypothesis
2. Choose a Test Statistic
mmary value from our sample using the Test Statistics ---- C
calculated = 0 . 02 ManavazhaganR@Bsnl.in
Calculated Value of Test
Statistics
Test Statistic Formula
Z – test
(Z-Single
test Mean )
(t Difference
– Test of two means )
(t –Single
Test Mean )
( Difference of two
means )
n - 1 degrees of freed
1. Write Hypothesis
2. Choose a Test Statistic
mary value from our sample using the Test Statistics ----
3. Calculate Summary value from our sample using the Test Statistics ---- Calculated Value
Degrees of Freedom ( v )
ManavazhaganR@Bsnl.in
1. Write Hypothesis
2. Choose a Test Statistic
3. Calculate Summary value from our sample using the Test Statistics ---- Calculated Valu
P ro b a b ility V a lu e ( p )
D e g re e s o f F re e d o m ( v ) ManavazhaganR@Bsnl.in
Lower Critical Value Mean Upper
M Critical
ean= Val
15
Lower Tail Upper Tail
e je ctio n R e g io n R e je ctio n R e g io n
=0 . 0 0 1 p =0 . 0 0 1
e je ctio n R e g io n R e je ctio n R e g io n
=0 . 0 0 1 p =0 . 0 0 1
p - v a lu e ( S ig n ifica n ce le v e l ) is th e p ro b a b ility
o f in co rre ctly re je ctin g a tru e n u ll
h y p o th e sis . Ty p e - 1 e rro r.
Øp -V a lu e is ch o se n a rb itra rily . A cce p ta n ce
ØN o rm a lly p -V a lu e w ill b e ch o se n e ithR eerg 0io. n0 5
o r 0 .0 0 1
ManavazhaganR@Bsnl.in
P-Value & Confidence Level-
Quiz….
• P-Value = 0.02 Confidence Level 98 %
=?
• Sign.Level = 0.08 Confidence 92 %
Level= ? 82 %
5.
Type of Test ( t ) Accept Ho Decisio
Two tailed If the t calculated value
One tailed(L) tlies
calbetween
> t tablethe two critical
n
One Tailed ( R)
values of t found using t
t cal < t table Rule
table . Otherwise reject.
= 1.3
Step - 4 : Find the Z critical value : 5 % level of
significance . i . e α = 0 . 05
Critical values of Z for 2 tails = 1 . 96 , - 1 . 96
Step - 5 : Decision rule : Z value calculated in step - 3 lies
between the critical values + 1 . 96 and - 1 . 96 . Hence Ho has to be
accepted . That is this year ’ s absenteeism is not significantly
ManavazhaganR@Bsnl.in
different from the past .
• A Manager wishes to test whether the average
number of absence per worker has changed this
year compared with the average absence of 8 days
in the past . A random sample of 25 workers is
taken giving a mean of 1o.6 days of absence this
year and s.d of 14 days. It is decided to conduct
the
Step - 1 test at the 5%
: Formulate level of
Hypothesis : significance.
H o =8 ; H1 ≠ 8 ( Two tail
test )
Step - 2 : Choose test statistic : It is parametric
data & One sample & sample
size < 30 hence use t statistic
Step - 3 : Find the Test statistic ’ s calculated value :
= 0 . 928
Degrees of Freedom = Sample size - 1 = 25 - 1 = 24
Step - 4 : Find the t critical value : 5 % level of
significance . i . e α = 0 . 05
Critical values of t for 24 d . f = 1 . 711
Step - 5 : Decision rule : t value calculated in step - 3 < t –
table value . Hence Ho has to be accepted .
ManavazhaganR@Bsnl.in
Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
ManavazhaganR@Bsnl.in
ManavazhaganR@Bsnl.in
Chi-
Purpose Squared Types
Test
es and Expected value Chi - Squared Goodness of Fi
a have a particular distribution like Normal , Poisson & Binomi
Heterogeneity
er samples are similar or different from in
eachGoodnes
other
Test
variable
on Chi - Squared Goodness of Fi
has more than two categories of result
categories
Heterogeneity in Goodnes
samples are similar or different from each other
ent variables
Chi - Squared
ne variable has 2 categories Test for Ass
st for association between variables
ManavazhaganR@Bsnl.in
Chi-Squared Goodness-of-Fit
test
Items to be Checked Criteria
Purpose Whish to compare your observed values to those
predicted by an a priori expectation:
Treatment Variable One, Category variable
To find the good fit between Observed value &
ManavazhaganR@Bsnl.in
Procedure-1 for Chi-Squared Goodness-of-Fit test
to be tested
e is no difference between the Expected & Observed v
e is a difference between the Expected & Observed
lue ?
lated using the numerical formula you are expecting
How
3. to find Chi - squared Calculated ?
χ2 Calculated = ∑ ( Observed - expected ) 2
expected
ManavazhaganR@Bsnl.in
Procedure-1 for Chi-Squared Goodness-of-Fit test
ManavazhaganR@Bsnl.in
Chi-Squared Goodness-of-Fit test
Case-1:
A Company XYZ is accused of Sexual
discrimination in its employment practice. It has
200 employees .If any company has 40% female
employees , then that company will not be
accused of sexual discrimination. The Gender
breakup of the company XYZ is as follows:
Male Female Total
150 50 200
Examine the accusation at 5% level of
significance.
Expected value is 40 % Female employees:- a Priori
Observed value is 30 % Female employees
Treatment variable is : Gender
Treatment outcome is : Category variables : M/F
Data type : ManavazhaganR@Bsnl.in
Countable; Sample : one
tep - 1 : Form Hypothesis :
o : No difference between expected and observed value
1 : Both Observed and expected value are different .
Step-2 : Find the Expected Values:
Male Female Total
Observed value ( O) 150 50
200
Expected Value ( E ) 120 80
Step-3 : Find the chi-squared value:
= 18.75
4 : F in d th e ch i- sq u a re d critica l v a lu e :
es of freedom = no . of categories - 1 = 2 ( M / F ) - 1 = 1
ficance level = 0 . 05
uared critical value from table for 1 d . f and 0 . 05 S . L = 3 . 84
p - 5 : Decision rule :
culated value > Critical value , reject null hypothes
ManavazhaganR@Bsnl.in
df 0.995 0.99 0.975 0.95 0.90 0.10 0.05 0.025 0.01 0.005
1 --- --- 0.001 0.004 0.016 2.706 3.841 5.024 6.635 7.879
2 0.010 0.020 0.051 0.103 0.211 4.605 5.991 7.378 9.210 10.597
3 0.072 0.115 0.216 0.352 0.584 6.251 7.815 9.348 11.345 12.838
4 0.207 0.297 0.484 0.711 1.064 7.779 9.488 11.143 13.277 14.860
5 0.412 0.554 0.831 1.145 1.610 9.236 11.070 12.833 15.086 16.750
6 0.676 0.872 1.237 1.635 2.204 10.645 12.592 14.449 16.812 18.548
7 0.989 1.239 1.690 2.167 2.833 12.017 14.067 16.013 18.475 20.278
8 1.344 1.646 2.180 2.733 3.490 13.362 15.507 17.535 20.090 21.955
9 1.735 2.088 2.700 3.325 4.168 14.684 16.919 19.023 21.666 23.589
10 2.156 2.558 3.247 3.940 4.865 15.987 18.307 20.483 23.209 25.188
11 2.603 3.053 3.816 4.575 5.578 17.275 19.675 21.920 24.725 26.757
12 3.074 3.571 4.404 5.226 6.304 18.549 21.026 23.337 26.217 28.300
13 3.565 4.107 5.009 5.892 7.042 19.812 22.362 24.736 27.688 29.819
14 4.075 4.660 5.629 6.571 7.790 21.064 23.685 26.119 29.141 31.319
15 4.601 5.229 6.262 7.261 8.547 22.307 24.996 27.488 30.578 32.801
16 5.142 5.812 6.908 7.962 9.312 23.542 26.296 28.845 32.000 34.267
17 5.697 6.408 7.564 8.672 10.085 24.769 27.587 30.191 33.409 35.718
18 6.265 7.015 8.231 9.390 10.865 25.989 28.869 31.526 34.805 37.156
19 6.844 7.633 8.907 10.117 11.651 27.204 30.144 32.852 36.191 38.582
20 7.434 8.260 9.591 10.851 12.443 28.412 31.410 34.170 37.566 39.997
21 8.034 8.897 10.283 11.591 13.240 29.615 32.671 35.479 38.932 41.401
22 8.643 9.542 10.982 12.338 14.041 30.813 33.924 36.781 40.289 42.796
23 9.260 10.196 11.689 13.091 14.848 32.007 35.172 38.076 41.638 44.181
24 9.886 10.856 12.401 13.848 15.659 33.196 36.415 39.364 42.980 45.559
25 10.520 11.524 13.120 14.611 16.473 34.382 37.652 40.646 44.314 46.928
26 11.160 12.198 13.844 15.379 17.292 35.563 38.885 41.923 45.642 48.290
27 11.808 12.879 14.573 16.151 18.114 36.741 40.113 43.195 46.963 49.645
28 12.461 13.565 15.308 16.928 18.939 37.916 41.337 44.461 48.278 50.993
ypothesis to be tested
Ho: The variables are independent of each other .
H1 : The variables are not independent
lue ?
lated using the numerical formula you are expecting
How
3. to find Chi - squared Calculated ?
χ2 Calculated = ∑ ( Observed - expected ) 2
expected
ManavazhaganR@Bsnl.in
Procedure-1 for Chi-Squared Independence test
ManavazhaganR@Bsnl.in
ANOVA - One Way
sample
ManavazhaganR@Bsnl.in
ANOVA - One Way
within the samples :
F ratio calculated =
ManavazhaganR@Bsnl.in
ANOVA - One Way
cision Rule :
If F calculated < F Table value , accept Ho .
ManavazhaganR@Bsnl.in