Statistical Inference

Chapter Four
Statistical
Inference
1
Objectives
At the end of these session the students will be
able:
1. Explain the principles of estimation and
differentiate between point and interval
estimations
2. Compute appropriate confidence
intervals for population means and
proportions and interpret the findings
2
Statistical Inference
Is the process of drawing conclusions about an
entire population based on the data in a sample.
Methods of inference usually fall into one of two broad
categories: Estimation or Hypothesis Testing.
We will focus on using the observations in a sample to
estimate a population parameter.
The purpose of statistical inference is to make
decisions about population characteristics.
Parameters and Statistics
Parameters are summaries of the population, while

estimates are summaries of the sample.
Parameters are unknown; statistics are calculated.
Parameters are hypothetical; whereas statistics are
"real."
Parameters are constants; statistics are random
variables.
Estimation
Is concerned with estimating the values of

specific population parameters based on
sample statistics.
Is about using information in a sample to
make estimates of the characteristics
(parameters) of the source population.
Continued
Example:
A sample survey revealed:
Proportion of smokers among a certain group of
population aged 15 to 24.
Mean of SBP among sampled population
Prevalence of HIV-positive among people involved in the
study
The next question is what can we predict about the
characteristics of the population from which the sample was
drawn.
Estimation, Estimator and Estimates
Estimation is the computation of a statistic from

sample data, often yielding a value that is an
approximation (guess) of its target, an unknown true
population parameter value.
The statistic itself is called an estimator and can be
of two types point or interval.
The value or values that the estimator assumes are
called estimates.
Parameter Estimations
In parameter estimation, we generally
assume that the underlying (unknown)
distribution of the variable of interest is
adequately described by one or more
(unknown) parameters, referred as
Population Parameters.
As it is usually not possible to make
measurements on every individual in a
population, parameter cannot usually be
determined exactly.
Instead we estimate parameters by
calculating the corresponding characteristics
from a random sample estimates
.
8
Types of estimation
Two methods of estimation are
commonly used:
Point estimation involves the
calculation of a single number to
estimate the population parameter
Interval estimation specifies a
range of reasonable values for the
parameter
9
Continued
From a single sample we can calculate a sample
statistic to estimate a single parameter (a point
estimate).
A single numerical value used to estimate the
corresponding population parameter.
n
Point estimate for population
mean is
xi
x = i =1
n
Point estimate for
population proportion is given by
total number of
(events).
x
p=
n
, Where x is the
10
success
11
Confidence Interval estimate

However, the value of the sample statistic will vary from
sample to sample. Therefore, to simply obtain an estimate
of the single value of the parameter is not generally
acceptable.
We need also a measure of how precise our estimate is
likely to be
We need to take into account the sample to sample
variation of the statistic
A confidence interval defines an interval within which the
true population parameter is like to fall (interval estimate).
12
Interval estimation
13
Continued
Level of Confidence
Denoted by 100(1-)%.
A relative frequency interpretation
In the long run, 100(1-)% of all the
confidence intervals that can be
constructed will contain the unknown
parameter.
A specific interval will either contain or
not contain the parameters.
16
Continued
Confidence interval, therefore; takes into account the
sample to sample variation of the statistic and gives the
measure of precision.
The general formula used to calculate a Confidence interval
is Estimate K Standard Error, k is called reliability
coefficient
Confidence intervals express the inherent uncertainty in any
medical study by expressing upper and lower bounds for
anticipated true underlying population parameter.
Most commonly the 95% confidence intervals are
calculated, however 90% and 99% confidence intervals are
sometimes used.
17
18
Confidence interval
A (1-) 100% confidence interval for
unknown population mean and population
proportion is given as follows;
[ x z . , x z . ]
for estimating mean
n
n
2
2
if is unknown, it can be estimated by s.e
[ p z . p(1 p) / n , p z . p(1 p) / n ]
2
for estimating proportion
19
Confidence intervals
The 95% confidence interval is calculated in such a
way that, under the conditions assumed for
underlying distribution, the interval will contain true
population parameter 95% of the time.
Loosely speaking, you might interpret a 95%
confidence interval as one which you are 95%
confident contains the true parameter.
90% CI is narrower than 95% CI since we are only
90% certain that the interval includes the population
parameter.
On the other hand 99% CI will be wider than 95% CI;
the extra width meaning that we can be more
certain that the interval will contain the population
parameter. But to obtain a higher confidence from
the same sample, we must be willing to accept a
larger margin of error (a wider interval).20
For a given confidence level (i.e. 90%, 95%,
99%) the width of the confidence
interval depends on the standard error
of the estimate which in turn depends on
the
1.Sample size:-The larger the sample size,
the narrower the confidence interval (this
is to mean the sample statistic will
approach the population parameter) and
the more precise our estimate. Lack of
precision means that in repeated sampling
the values of the sample statistic
are
21
- To increase precision (of an SRS), use a
larger sample. You can make the
precision as high as you want by taking
a large enough sample. The margin of
error decreases asn increases.
2.Standard deviation:-The more the
variation among the individual values, the
wider the confidence interval and the less
precise the estimate. As sample size
increases SD decreases.
22
Confidence interval for single mean

If the population standard deviation is known
and the sample size is relatively large the
confidence interval for sample mean is: x
z(/n)
x is the sample mean
is the population standard deviation
n is the sample size
Z is the value from SND
90% CI, z=1.64
95% CI, z=1.96
99% CI, z=2.58
23
Continued
If the population standard deviation is unknown

and the sample size is small (<30), the
formula for the confidence interval for sample
mean is: x t (s/n)
x is the sample mean
s is the sample standard deviation
t is the value from the t-distribution with (n1) degrees of freedom
24
Degrees of Freedom (df)
df = Number of observations that

are free to vary after sample mean
has been calculated
df = n-1
25
Confidence interval for single proportion
For relatively large sample size the

confidence interval for sample p is
given by:
p z (p (1-p) /n)
p is the sample proportion (r/n)
Z is the value from SND
26
Mean Example
A uric acid level of 16 apparently healthy subjects yielded the
following values of urine excreted (milligram per day);
0.007, 0.03, 0.025, 0.008, 0.03, 0.038, 0.007, 0.005, 0.032, 0.04,
0.009, 0.014, 0.011, 0.022, 0.009, 0.008
Compute point estimate of the population mean
If x1 , x 2 , ..., x n are n observed values , then

n
x=
x
i =1
0.295
0.01844
16
Construct 90%, 95%, 98% confidence interval for the mean

(0.01844-1.65x0.0123/4, 0.01844+1.65x0.0123/4)=(0.0134, 0.0235)
(0.01844-1.96x0.0123/4, 0.01844+1.96x0.0123/4)=(0.0124, 0.0245)
(0.01844-2.33x0.0123/4, 0.01844+2.33x0.0123/4)=(0.0113, 0.0256)
27
Example2
28
Continued
29
30
Proportion example
In a survey of 300 automobile drivers in one city, 123
reported that they wear seat belts regularly. Estimate
the seat belt rate of the city and 95% confidence
interval for true population proportion.
Answer :p= 123/300 =0.41=41%
n=300,
Estimate of the seat belt of the city at 95%
CI = p z (p(1-p) /n) =(0.35,0.47)
31
Exercise for proportion
32
Thank you
33

Statistical Inference

Încărcat de

Informații document

Descriere originală:

Titlu original

Drepturi de autor

Formate disponibile

Partajați acest document

Partajați sau inserați document

Opțiuni de partajare

Vi se pare util acest document?

Este necorespunzător acest conținut?

Drepturi de autor:

Formate disponibile

Statistical Inference

Încărcat de

Drepturi de autor:

Formate disponibile

Chapter Four

Parameters and Statistics

Parameters are summaries of the population, while

Is concerned with estimating the values of

Estimation, Estimator and Estimates

Estimation is the computation of a statistic from

Confidence Interval estimate

for estimating proportion

Confidence interval for single mean

If the population standard deviation is unknown

Degrees of Freedom (df)

df = Number of observations that

Confidence interval for single proportion

For relatively large sample size the

If x1 , x 2 , ..., x n are n observed values , then

Construct 90%, 95%, 98% confidence interval for the mean

Exercise for proportion

S-ar putea să vă placă și