Alex Nguyen
Mr. Reppenhagen
May 28, 2014
IB Mathematics
What is the Central Limit Theorem and what are its applications in statistics?
Table of Contents
1. Introduction
2. Definition of Terms
3. Explanation
4. History of Central Limit Theorem
5. Demonstration of Proof of Central Limit Theorem
6. Applications
a. Sampling
b. Polls
7. Weakness
8. Conclusion
9. Works Cited


The Central Limit Theorem has many versions, all of which describe the convergence of the means of a
probability distribution towards one single distribution: the normal distribution. The Classical Central
Limit Theorem states that the arithmetic mean of a large number of independent and identically
distributed random variables, each with a well-defined mean and variance, will be approximately normally
distributed. In simpler terms, if we repeatedly draw large random samples whose members are independent
of one another and calculate the mean of each sample, then the central limit theorem states that the
distribution of these means will be approximately bell-shaped, that is, close to a normal curve. The
Classical Central Limit Theorem allows one to describe the probability distribution of the outcome (the
mean) of a process (some random variables comprising an event) without knowing much about the nature of
the events themselves other than the fact that they are identically distributed and independent. Today,
the central limit theorem has been generalized and its hypotheses weakened to allow for some cases in
which the variables depend on one another, which widens the scope and applicability of the theorem.
The Central Limit Theorem is a surprising result in statistics and probability and is used constantly,
which elevates the importance of the normal distribution in both fields.
Definition of Key Terms
First, let us define our terms. By a random variable we mean a quantity whose value is determined by
chance. The possible values of a random variable form the domain of its probability distribution, which
assigns a probability to each value of a discrete variable, or a probability density to each value of a
continuous one. For example, the outcome of a die roll is a random variable with possible values 1 to 6.
A mathematical function that assigns an associated probability to every possible value of the random
variable is called a probability distribution. Every probability distribution has a total probability
(the area under its curve) of one, because it is certain that some outcome in the sample space will
occur. The probability distribution for a single roll of a fair die is a horizontal straight line: each
of the six faces has probability 1/6.
The mean or expected value of a random variable is the value we would expect to obtain if we repeated
the random experiment an infinite number of times and took the average of all the values. In a way,
the mean or expected value is a weighted average of all possible values, with the probabilities as
weights. The standard deviation is the root mean square, or quadratic mean, of the distances between
the values and the expected value.
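As a small illustration of these two definitions, the sketch below computes the mean and standard deviation of a single roll of a fair die. The function names are chosen here for illustration and are not part of the original essay.

```python
import math

# A fair six-sided die: each face has probability 1/6.
values = [1, 2, 3, 4, 5, 6]
probabilities = [1 / 6] * 6

def expected_value(values, probabilities):
    """Weighted average of the values, using the probabilities as weights."""
    return sum(v * p for v, p in zip(values, probabilities))

def standard_deviation(values, probabilities):
    """Root mean square of the distances between the values and the mean."""
    mu = expected_value(values, probabilities)
    variance = sum(p * (v - mu) ** 2 for v, p in zip(values, probabilities))
    return math.sqrt(variance)

die_mean = expected_value(values, probabilities)
die_sd = standard_deviation(values, probabilities)
print(die_mean)  # about 3.5
print(die_sd)    # about 1.708
```

Note that the mean of 3.5 is not itself a possible roll: it is the long-run average, not a typical outcome.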

The normal distribution is a very commonly occurring continuous probability distribution in which the
mean, median, and mode are exactly the same. A normal distribution with mean $\mu$ and standard
deviation $\sigma$ is given by the equation:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

The normal distribution is also called a Gaussian distribution. The density of a normal distribution is
practically zero several standard deviations away from the mean. For example, about 68% of values lie
within 1 standard deviation of the mean, about 95% within 2, and about 99.7% within 3. Therefore,
extreme events are predicted to have very little chance of occurring, due to the rapid decay of the
curve on both sides.
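The percentages for a normal distribution can be checked with the error function from Python's standard library. This is only a sketch, and `prob_within` is a name introduced here for illustration.

```python
import math

def prob_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

The result does not depend on the particular mean or standard deviation, because the distance from the mean is measured in units of the standard deviation.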

Figure: a normal distribution, with the probability of values lying between each standard deviation
from the mean.
Then, we need to define what being independent and identically distributed means. The phrase is
used to describe a collection of random variables. Random variables are independent of each other if
the value of one does not alter the probabilities of the others. Random variables are identically
distributed when they all have the same probability distribution. For example, repeated rolls of a
die are independent and identically distributed because they have the same probability distribution
(a horizontal straight line) and are independent of one another (one die roll does not affect the next).
By the mean in this essay we mean the arithmetic mean, and by variance we mean the square of the
standard deviation.
The Central Limit Theorem essentially describes the characteristics of a population of means created
from the means of an infinite number of random samples of size N, all drawn from a parent
population. The Central Limit Theorem specifically predicts that regardless of the distribution of the
parent population, as long as it has a finite mean and variance and the samples are random and
independent of one another,
1) The mean of the population of means is always equal to the mean of the parent population from
which the samples are taken.
2) The standard deviation of the population of means is always equal to the standard deviation of the
parent population divided by the square root of the sample size N.
3) The distribution of the means will increasingly approach a normal distribution as the sample size N
increases.
Different parts of the distribution converge to the normal distribution at different rates. The parts
close to the mean converge quickly, but the tails converge more slowly. Therefore, we say that the
central limit theorem gives an asymptotic distribution: a large number of observations is required
before the convergence extends to the tails.
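The three predictions of the theorem can be checked with a short simulation. The sketch below is not part of the original essay. It assumes an exponential parent population with mean 1 and standard deviation 1, chosen only because it is strongly skewed. The sample size of 50 and the 5000 repetitions are also arbitrary choices.

```python
import math
import random
import statistics

def sample_means(draw, sample_size, num_samples):
    """Mean of each of num_samples independent random samples of size sample_size."""
    return [
        statistics.fmean([draw() for _ in range(sample_size)])
        for _ in range(num_samples)
    ]

rng = random.Random(0)  # fixed seed so that the run is repeatable
N = 50                  # sample size

# Assumed parent population: exponential with mean 1 and standard deviation 1,
# which is strongly skewed and not at all bell-shaped.
means = sample_means(lambda: rng.expovariate(1.0), N, 5000)

mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)
within_one_sd = sum(abs(m - mean_of_means) < sd_of_means for m in means) / len(means)

print(mean_of_means)                   # prediction 1: should be near the parent mean, 1
print(sd_of_means, 1 / math.sqrt(N))   # prediction 2: the two numbers should be close
print(within_one_sd)                   # prediction 3: a normal curve would give about 0.68
```

A simulation uses a finite number of samples, so the agreement is approximate rather than exact.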
History of the Central Limit Theorem
Many natural and social scientists in the 19th century noticed a pattern in the means of
independent random variables. When the outcome (the mean) is affected by a large number of random
variables (a large sample size) and when each variable has only a slight effect on the outcome as a
whole, the mean is distributed in a certain way, regardless of the actual probability distribution of
the random variables. Mathematically, however, this is an important and seemingly daunting problem,
because it requires a mathematician to draw conclusions about the outcome (the mean) of a set of random
variables when little is known about the distributions of the variables themselves. The theorem has
been described as "one of the most remarkable results in all of mathematics" and "a dominating
personality in the world of probability and statistics" (Adams, 1974, p. 2). It is also one of the
earliest results of probability theory.
The central limit theorem was named by the mathematician George Pólya. Building on the foundations
laid in an 1810 paper by the French mathematician Laplace, Pólya recognized a number of theorems that
all lead to the appearance of the normal distribution, and he named them central limit theorems, a
term that is widely used today.
Proof of Central Limit Theorem
While the proof of the central limit theorem is too advanced for the scope of this exploration, we can
explore the heuristic behind the central limit theorem.
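The normal distribution satisfies a specific identity about itself: the sum of two independent normal
random variables is again normal,

$$N(\mu_1, \sigma_1^2) + N(\mu_2, \sigma_2^2) = N(\mu_1 + \mu_2, \; \sigma_1^2 + \sigma_2^2).$$

A random variable with a normal distribution of mean $\mu_1$ and standard deviation $\sigma_1$ and
another, independent random variable distributed normally with mean $\mu_2$ and standard deviation
$\sigma_2$ will have a sum that is distributed normally with mean $\mu_1 + \mu_2$ and standard
deviation $\sqrt{\sigma_1^2 + \sigma_2^2}$.
In essence, normal distributions when added together yield normal distributions up to a degree of
scaling. For a distribution $D$ with mean 0 and finite variance, the equation

$$\frac{D_1 + D_2}{\sqrt{2}} \sim D,$$

where $D_1$ and $D_2$ are independent copies of $D$, defines a normal distribution.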
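The addition rule for independent normal variables (the means add, and the standard deviations combine as $\sqrt{\sigma_1^2 + \sigma_2^2}$) can be checked by simulation. In the sketch below, the parameter values 2, 3, -1 and 4 are arbitrary choices made for illustration.

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed so that the run is repeatable

# Arbitrary illustrative parameters for the two independent normal variables.
mu1, sigma1 = 2.0, 3.0
mu2, sigma2 = -1.0, 4.0

sums = [rng.gauss(mu1, sigma1) + rng.gauss(mu2, sigma2) for _ in range(100_000)]

predicted_mean = mu1 + mu2                 # 1.0
predicted_sd = math.hypot(sigma1, sigma2)  # sqrt(3^2 + 4^2) = 5.0

print(statistics.fmean(sums), predicted_mean)  # the two numbers should be close
print(statistics.stdev(sums), predicted_sd)    # the two numbers should be close
```

The standard deviations do not simply add (3 + 4 = 7). It is the variances that add (9 + 16 = 25).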
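These properties help us understand why we expect normalized means to converge to a normal
distribution. Suppose the random variables $X_1, X_2, \dots$ are independent and identically
distributed with mean 0 and standard deviation 1, and suppose that the normalized means converge to a
hypothetical distribution D. We have

$$Z_n = \frac{X_1 + X_2 + \dots + X_n}{\sqrt{n}},$$

where $Z_n$ is simply the normalized sum. To find the mean we simply divide the sum by $n$, so $Z_n$ is
the sample mean multiplied by $\sqrt{n}$. Splitting the first $2n$ variables into two halves gives

$$Z_{2n} = \frac{Z_n + Z_n'}{\sqrt{2}},$$

where $Z_n'$ is an independent copy of $Z_n$ built from the second half. Letting $n$ grow, we would
expect that

$$D \sim \frac{D_1 + D_2}{\sqrt{2}},$$

where $D_1$ and $D_2$ are independent copies of $D$, so D must be normal.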
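The convergence that this heuristic points to can also be observed exactly, without simulation, for the total of n fair dice. The sketch below is not from the original essay. It builds the exact distribution of the total by repeated addition of one die, then measures the largest gap, taken at the possible totals, between its cumulative distribution and that of the normal distribution with the same mean and standard deviation.

```python
import math

def sum_of_dice_pmf(n):
    """Exact distribution of the total of n fair dice, as {total: probability}."""
    pmf = {0: 1.0}
    for _ in range(n):
        new_pmf = {}
        for total, p in pmf.items():
            for face in range(1, 7):
                new_pmf[total + face] = new_pmf.get(total + face, 0.0) + p / 6
        pmf = new_pmf
    return pmf

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def max_cdf_gap(n):
    """Largest gap, over the possible totals, between the exact and the normal CDF."""
    mu = 3.5 * n                    # mean of one die is 3.5
    sigma = math.sqrt(n * 35 / 12)  # variance of one die is 35/12
    gap, running = 0.0, 0.0
    for total, p in sorted(sum_of_dice_pmf(n).items()):
        running += p
        gap = max(gap, abs(running - normal_cdf(total, mu, sigma)))
    return gap

for n in (1, 2, 5, 30):
    print(n, round(max_cdf_gap(n), 3))
```

The printed gap should tend to shrink as n grows, which is the behavior the central limit theorem describes.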
Applications of Central Limit Theorem
Hypothesis testing draws strongly on the central limit theorem. The central limit theorem helps
scientists who want to make claims about a population that they are studying. For example, in areas of
knowledge where behavior is variable, such as psychology, experts need to formulate hypotheses about a
population and need to know the margin of error for their claims. They use statistical experiments to
obtain sample data from the population. Information from the data, such as the standard deviation, the
sample size, or the mean, can be used to test a specific hypothesis about the population. Hypothesis
testing that assumes the data themselves come from a normal distribution seems unrealistic, because
real-world data show outliers, skew towards one side, multiple peaks, and asymmetry. For example, if we
sample the world's wealth, we have outliers such as Bill Gates or Warren Buffett, who are not accounted
for by the normal distribution because they are so far from the mean as to be virtually impossible
under it. We also have a distribution skewed towards poverty (there are more moderately poor people
than there are moderately rich people). Therefore, it is unrealistic to treat all data as if they were
normal. However, if we take the means of samples of such data, assuming that all samples are drawn in
the same way from the same population and that the sample size is large enough, the distribution of
those means is approximately normal.

A sampling distribution of a statistic is the probability distribution of that statistic over repeated
samples. For example, if we have one sample we might want to know the mean of that one sample. When we
repeat the experiment, we take the mean of each sample as representative of that sample. This mean is
called the sample mean. The sampling distribution of the mean is the probability distribution of these
means, which arises because the mean might differ from one sample to another. For example,
realistically we might take each sample in one geographical area. Perhaps one mean would be inflated
because that area is a rich neighborhood and yields a high average income per household. However, this
is not representative of the actual population, so we need to take many samples. The sampling
distribution of the mean is the distribution of these various means.
This is important for statisticians because when they make claims such as 90% of individuals earn an
average of 30,000 dollars to 70,000 dollars 19 out of 20 times, they want to know specifically whether
repeated statistical surveys would report the same findings. Sampling distributions of the mean are
used to generate confidence intervals for survey reports and for significance testing (testing a
statistic to see if it actually describes the population). Therefore it is important to know how
variable our estimates are. The central limit theorem, by approximating these sampling distributions of
the mean with normal distributions, helps us figure out specifically what the variability of these
statistics is.
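A confidence interval built from the normal approximation can be sketched as follows. The income figures are invented purely for illustration, and `mean_confidence_interval` is a hypothetical helper, not a standard library function. The multiplier 1.96 corresponds to roughly 95% confidence, that is, 19 times out of 20.

```python
import math
import statistics

def mean_confidence_interval(sample, z=1.96):
    """Approximate confidence interval for the population mean.

    Relies on the central limit theorem: the sample mean is treated as normal
    with standard deviation s / sqrt(n), where s is the sample standard deviation.
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return mean - z * standard_error, mean + z * standard_error

# Hypothetical household incomes in dollars, made up for this example.
incomes = [32000, 45000, 51000, 38000, 70000, 29000,
           56000, 61000, 43000, 48000, 39000, 52000]

low, high = mean_confidence_interval(incomes)
print(round(low), round(high))
```

With only twelve values the normal approximation is rough. A real survey would use a much larger sample.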
An important consequence of the central limit theorem affects how we read polls. For example, during
election time we usually see polls that are taken to estimate the percentage of a population which
supports a certain candidate for the presidency. Since it is not possible to survey the entire
population, the pollsters have to survey only a sample of it. Suppose the pollsters survey a sample of
n people for their preferences. The preferences of the people in the sample can be represented as a
sequence of random variables which are independent and presumably identically distributed. The
pollsters report the mean of the sample. By the central limit theorem, this sample mean should be
approximately normally distributed. As the number of people surveyed increases, the sample mean, which
is the number reported in the polls, should be close to the population mean.
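For a poll, each answer can be coded as 1 (supports the candidate) or 0 (does not), so the sample mean is the reported percentage. Under the normal approximation, the margin of error at roughly 95% confidence can be sketched as below. The function name and the example figures are illustrative assumptions, not data from a real poll.

```python
import math

def poll_margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a reported proportion p_hat from n respondents."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# A hypothetical poll of 1000 people in which half support the candidate.
print(round(poll_margin_of_error(0.5, 1000), 3))  # 0.031, about 3 percentage points
```

Because of the square root, quadrupling the number of people surveyed only halves the margin of error.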

A weakness of the central limit theorem is its premise of independent and identically distributed
variables. However, versions of the theorem still hold when some of these assumptions (independent and
identically distributed variables with finite mean and standard deviation) are relaxed. If the
variables are weakly dependent on each other, the sampling distribution of the mean converges less
quickly to the normal distribution, which means that our estimates are less accurate than they would
have been had the random variables been independent.
The main thing to understand is that elementary mathematics is an elegant and simple starting point.
Assumptions such as independent and identically distributed random variables are not usually found in
the real world. In the same vein, assumptions in physics such as engines completely transforming heat
into work do not exist in real life. In the real world, nonstationary processes whose probability
distributions, mean, and variance shift as a function of time are commonly found. Therefore, the central
limit theorem does not strictly apply to these situations.
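The effect of weak dependence on accuracy can be illustrated with a simulation. The sketch below is not from the original essay. It assumes one particular model of dependence, in which each value equals a fraction rho of the previous value plus fresh random noise, and compares the spread of the sample mean with and without that dependence.

```python
import math
import random
import statistics

def dependent_sample_mean(n, rho, rng):
    """Mean of n values where each value is rho times the previous one plus noise.

    rho = 0 gives independent values. The noise is scaled so that every value
    has standard deviation 1 regardless of rho.
    """
    noise_sd = math.sqrt(1 - rho ** 2)
    x = rng.gauss(0, 1)
    total = x
    for _ in range(n - 1):
        x = rho * x + rng.gauss(0, noise_sd)
        total += x
    return total / n

rng = random.Random(2)  # fixed seed so that the run is repeatable
n, trials = 100, 2000

sd_independent = statistics.stdev(
    [dependent_sample_mean(n, 0.0, rng) for _ in range(trials)]
)
sd_dependent = statistics.stdev(
    [dependent_sample_mean(n, 0.5, rng) for _ in range(trials)]
)

print(sd_independent)  # near 1 / sqrt(100) = 0.1, as the classical theorem predicts
print(sd_dependent)    # noticeably larger, so the mean is a less accurate estimate
```

Under this assumed model the sample means still look roughly bell-shaped, but using the formula for independent variables would understate how much they vary.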
The central limit theorem is an important theorem in statistics and probability theory. Some
mathematicians have called it one of the fundamental theorems of statistics. The theorem tells us how
a population of means behaves as the sample size approaches infinity. This greatly helps with surveys
and helps us compute how accurately a survey reflects the population. However, due to the assumptions
that the central limit theorem makes (independent and identically distributed variables), we can only
use the theorem as a starting point. Therefore, it is important for students to know how to use the
central limit theorem, but also to know its limitations.


Works Cited
Adams, W.J. (1974). The life and times of the Central Limit Theorem.
New York: Kaedmon.
Blacher, René. "Central Limit Theorem by Moments." Statistics & Probability Letters 77.17
(2007): 1647-651. Dartmouth University. Web. 28 May 2014.
"The Central Limit Theorem." Intuitor. Web. 28 May 2014.
Clauset, Aaron. "Adapted Probability Distributions." Web. 28 May 2014.
"Distribution, Normal." HighBeam Research, 01 Jan. 2008. Web. 28 May
H., Krieger. "Proof of Central Limit Theorem." Harvey Mudd College, 2005. Web. 28 May 2014.
Lane, David M. "Sampling Distribution (1 of 3)." Sampling Distribution (1 of 3). Web. 28 May