The material presented in this module forms the backbone for much of the
material on statistical inference to follow; namely, confidence intervals and
significance tests based on large samples. One cannot understand the
meaning of a P-value, for example, without understanding sampling
distributions. It may not be easy for you to grasp the idea that a statistic has
its own probability distribution, and concepts such as the mean of a sample
mean confuse many newcomers. But, time invested here will yield rewards
repeatedly as we continue to study.
It is in this module that we formally introduce the distinction between a
statistic and a parameter, for here we begin to turn toward inference from
sample to population. A common population parameter of interest is the
population mean μ. The sample mean x̄ is commonly used to estimate μ, so
its sampling distribution is important.
In this module we examine several important results concerning this
particular sampling distribution:
(1) The mean of the statistic x̄ is the same as the mean μ of the population
we're sampling from;
(2) the standard deviation of the statistic x̄ is σ/√n;
(3) because of this result, the statistic x̄ has less variability (greater
consistency) when based on a larger sample size, in line with the Law of
Large Numbers;
(4) the Central Limit Theorem states that the sampling distribution of x̄ will
be approximately normal, no matter what population the sample is taken
from, provided the sample size is large enough.
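Results (1) and (2) can be checked by simulation. The sketch below (not part of the original notes; the exponential population, its parameters, and the sample sizes are illustrative assumptions) draws many random samples from a right-skewed population and compares the mean and standard deviation of the sample means against μ and σ/√n.

```python
# Simulation sketch (illustrative assumptions): check that the sampling
# distribution of x-bar has mean mu and standard deviation sigma/sqrt(n),
# using an exponential population with mean mu = 2 and sd sigma = 2.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 2.0, 2.0, 50, 20_000

# Draw many SRSs of size n; record each sample's mean x-bar.
xbars = rng.exponential(scale=mu, size=(reps, n)).mean(axis=1)

print(round(xbars.mean(), 2))   # close to mu = 2
print(round(xbars.std(), 2))    # close to sigma/sqrt(n) = 2/sqrt(50), about 0.28
```

Even though the population here is strongly skewed, the mean of the sample means sits right at μ, and their spread matches σ/√n.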
Sampling Variability
Sampling variability: the value of a statistic varies in repeated random
sampling.
Sampling variability is inevitable no matter how careful we are with random
sampling.
Think of a statistic as a random variable because it takes on numerical
values that describe the outcome of a random sampling process.
The problem with this is that we may not always be able to get a large
enough sample size.
How can we make inferences about a population based on findings from a
smaller sample?
Create a sampling distribution.
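Repeated sampling can be sketched directly in code. In this toy example (the population of 10,000 integer "scores" is a hypothetical assumption, not from the notes), three different SRSs from the same population give three different values of the statistic x̄, which is exactly the sampling variability described above.

```python
# Sketch of sampling variability (hypothetical population of scores 0-100).
import numpy as np

rng = np.random.default_rng(1)
population = rng.integers(0, 101, size=10_000)  # hypothetical population

# Three different SRSs of size 25 yield three different sample means.
for _ in range(3):
    sample = rng.choice(population, size=25, replace=False)
    print(round(sample.mean(), 1))
```

Collecting the sample mean from many such repeated samples builds up the sampling distribution of x̄.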
Suppose that x̄ (x-bar) is the mean of an SRS of size n drawn from a large
population with mean μ (mu) and standard deviation σ (sigma). Then:
The mean of the sampling distribution of x̄ is μ_x̄ = μ.
The standard deviation of the sampling distribution of x̄ is
σ_x̄ = σ/√n (sigma over the square root of n).
These facts about the mean and standard deviation of x̄ (x-bar) are always
true, no matter what shape the population distribution has (left-skewed,
right-skewed, big spread, little spread)!
What this tells us: when we take a large enough random sample, we can
trust it to estimate the true population mean accurately. A large sample
gives a sample mean that is close to the parameter (law of large numbers),
and it also ensures a small standard deviation for the sampling distribution,
which means that all large samples will have means close to the population
parameter. (Slide C).
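The law-of-large-numbers half of this claim can be illustrated with a quick simulation (the exponential population with mean 5 and the particular sample sizes are illustrative assumptions): as n grows, x̄ settles down near μ.

```python
# Sketch of the law of large numbers (illustrative population: exponential
# with mean mu = 5). Larger samples give sample means closer to mu.
import numpy as np

rng = np.random.default_rng(2)
mu = 5.0
for n in (10, 1_000, 100_000):
    xbar = rng.exponential(scale=mu, size=n).mean()
    print(n, round(xbar, 2))  # x-bar drifts toward mu = 5 as n grows
```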
Why are all the sampling distributions seen so far normal? Is it because the
population distributions are normal? What if the population distributions
aren't normal?
Most population distributions are not Normal. What is the shape of the
sampling distribution of sample means when the population distribution isn't
Normal?
When the sample size is large, the sampling distribution of the sample
mean x̄ is approximately normal, with the mean of the sample means equal
to μ (the population parameter) and standard deviation σ/√n (sigma over
the square root of n).
If the population distribution is not Normal, the central limit theorem (CLT)
tells us that the sampling distribution of x-bar (sample means) will be
approximately Normal in most cases if n >= 30.
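One informal way to see the CLT at work is the 68-95-99.7 rule: if the sampling distribution of x̄ is roughly Normal, about 95% of sample means should fall within two standard errors of μ. The sketch below (the exponential(1) population and n = 30 are illustrative assumptions) checks this for a strongly right-skewed population.

```python
# CLT sketch (illustrative assumptions): exponential(1) population, which
# has mean mu = 1, sd sigma = 1, and is strongly right-skewed. With n = 30,
# the sampling distribution of x-bar should be approximately Normal.
import numpy as np

rng = np.random.default_rng(3)
mu = sigma = 1.0
n, reps = 30, 20_000

xbars = rng.exponential(scale=mu, size=(reps, n)).mean(axis=1)
se = sigma / np.sqrt(n)

# If x-bar is roughly Normal, about 95% of its values should lie
# within 2 standard errors of mu (the 68-95-99.7 rule).
inside = np.mean(np.abs(xbars - mu) < 2 * se)
print(round(inside, 2))  # close to 0.95
```

Despite the skewed population, the fraction of sample means within ±2 standard errors of μ comes out near the Normal benchmark of 95%, which is what the CLT predicts for n ≥ 30.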
Remember:
(1) Means of random samples are LESS VARIABLE than
individual observations.
(2) Means of random samples are MORE NORMAL than
individual observations.