
GEM2900 - Understanding Uncertainty and Statistical Thinking

David Chew and David Nott
Department of Statistics and Applied Probability
National University of Singapore

The normal distribution

The normal distribution, also called the Gaussian distribution, can be used to model continuous random variables.

The normal distribution has two parameters, $\mu$ and $\sigma^2$, called the (population) mean and (population) variance, respectively.

If the random variable $X$ follows a normal distribution with parameters $\mu$ and $\sigma^2$ (also denoted by $X \sim N(\mu, \sigma^2)$), then

$$E[X] = \mu \quad \text{and} \quad \mathrm{Var}[X] = \sigma^2.$$

Woolfson (2008, Chapter 10)

The normal distribution (cont.)

The probability density function (pdf) of the normal distribution is symmetric, with a smooth bell shape.

The formula for the pdf of the $N(\mu, \sigma^2)$ distribution is:

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

The pdf is largest at $\mu$, the expected value, and how spread out the density curve is will be controlled by $\sigma$, the standard deviation.

Areas of regions underneath the pdf represent probabilities.

[Figure: Normal pdfs with same mean but different spread; horizontal axis x, vertical axis f(x)]
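The pdf formula for the normal distribution can be evaluated directly. The sketch below is only an illustration: the function name `normal_pdf`, the mean of 10 and the two standard deviations are assumptions chosen for the example, not values from the slides.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of the N(mu, sigma^2) distribution at x, written out from the formula."""
    variance = sigma ** 2
    return math.exp(-((x - mu) ** 2) / (2 * variance)) / math.sqrt(2 * math.pi * variance)

# Same mean, different spread (illustrative values only).
mu = 10.0
peak_narrow = normal_pdf(mu, mu, 1.0)  # height of the curve at the mean when sigma = 1
peak_wide = normal_pdf(mu, mu, 2.0)    # height of the curve at the mean when sigma = 2

# Symmetry: points equally far either side of the mean have the same density.
left = normal_pdf(mu - 1.5, mu, 2.0)
right = normal_pdf(mu + 1.5, mu, 2.0)

print(peak_narrow, peak_wide, left, right)
```

Doubling the standard deviation halves the height of the peak, which is the "same mean but different spread" picture in numbers.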

The normal distribution (cont.)

As I mentioned, the normal distribution is sometimes also called the Gaussian distribution, named for the German mathematician Carl Friedrich Gauss.

The Gaussian distribution, however, was not first discovered by Gauss, an example of what mathematicians call Stigler's law. (This is the law that discoveries are not named after their original discoverers.) Stigler's law is, of course, itself an example of Stigler's law ...

Until conversion to the euro, the German ten mark bill had a picture of Gauss, a picture of the Gaussian density curve, and even the formula for the Gaussian density on it.

Brownian motion
One problem in which the Gaussian distribution arises is the description of the physical phenomenon of Brownian motion. Albert Einstein's mathematical description of Brownian motion is widely regarded as having convinced most physicists of the existence of atoms.

Brownian motion is named after Robert Brown, who observed the highly erratic motion of pollen grains in a drop of water.

The motion of the grains can be thought of as arising from a large number of collisions with the water molecules. Brownian motion is an idealized model for the coordinates of the motion.

For a particle starting at the origin and followed for a period of time $t$, the distribution of the $x$ or $y$ coordinate will be $N(0, \sigma^2 t)$ for some $\sigma^2 > 0$.

Brownian motion

It is possible to simulate paths of Brownian motion on a computer.

The Java applet at http://www.ms.uky.edu/~mai/java/stat/brmo.html simulates a Brownian motion in two dimensions.

The paths traced out by the process tend to be highly erratic (in fact, although they are continuous they are not differentiable anywhere, for those of you who have done enough math to know what that means).
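A rough way to simulate one coordinate of a Brownian path is to add up many small independent Gaussian steps. The sketch below is not the applet's method; the step count, seed, and the values of `t` and `sigma` are assumptions made for illustration. It checks that the end position has variance close to $\sigma^2 t$.

```python
import math
import random

def simulate_endpoint(t, sigma, n_steps, rng):
    """Position at time t of one coordinate, built from n_steps independent Gaussian steps."""
    dt = t / n_steps
    x = 0.0  # the particle starts at the origin
    for _ in range(n_steps):
        # Each small step is N(0, sigma^2 * dt), so the variances add up to sigma^2 * t.
        x += rng.gauss(0.0, sigma * math.sqrt(dt))
    return x

rng = random.Random(2900)  # fixed seed so the run is repeatable
t, sigma = 4.0, 1.5        # illustrative values only

endpoints = [simulate_endpoint(t, sigma, 50, rng) for _ in range(4000)]
sample_mean = sum(endpoints) / len(endpoints)
sample_var = sum((x - sample_mean) ** 2 for x in endpoints) / (len(endpoints) - 1)

print(sample_mean, sample_var, sigma ** 2 * t)
```

The sample mean should be near 0 and the sample variance near $\sigma^2 t = 9$, up to simulation noise.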

Brownian motion

Although Robert Brown wasn't the first to observe Brownian motion (Stigler's law again), he was certainly the most systematic experimenter to look at this phenomenon. He looked at not just pollen grains suspended in a water drop, but also many other things (including scrapings of particles from the sphinx, which he had access to in his work as a curator at the British Museum; there was some question about whether only living things were subject to Brownian motion, and apparently he regarded the sphinx as undeniably, certifiably dead).

Calculating with the normal distribution


A normal distribution is specified by its mean $\mu$ and its variance $\sigma^2$.

The standard normal distribution has mean $\mu = 0$ and variance $\sigma^2 = 1$.

If a random variable is distributed as standard normal, it is typically denoted by $Z$ rather than $X$, and the standard normal observations are known as z-values.

The standard normal distribution is symmetric about $\mu = 0$, hence $P(Z < -c) = P(Z > c)$ for all $c \in \mathbb{R}$.

Probabilities of (continuous) random variables are given by areas under pdf curves. For example, suppose $Z$ is standard normal. Then the probability $P(0 < Z < 1)$, i.e. the probability that $Z$ is between zero and one, is equal to the area of the shaded region in the next slide.
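To see that "probability equals area" is more than a figure of speech, the area under the standard normal pdf between 0 and 1 can be approximated by adding up thin strips. This is a sketch only: the midpoint rule and the number of strips are choices made for illustration, and `statistics.NormalDist` from the Python standard library is used as the reference.

```python
import math
from statistics import NormalDist

def standard_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def area_under_pdf(a, b, n_strips=10_000):
    """Midpoint-rule approximation to the area under the pdf between a and b."""
    width = (b - a) / n_strips
    heights = (standard_normal_pdf(a + (i + 0.5) * width) for i in range(n_strips))
    return sum(heights) * width

Z = NormalDist()  # defaults to mean 0 and standard deviation 1

area = area_under_pdf(0.0, 1.0)      # area of the region between 0 and 1
from_cdf = Z.cdf(1.0) - Z.cdf(0.0)   # the same probability from the cdf

# Symmetry about 0: P(Z < -c) = P(Z > c), here with c = 1.3.
lower_tail = Z.cdf(-1.3)
upper_tail = 1.0 - Z.cdf(1.3)

print(round(area, 4), round(from_cdf, 4), lower_tail, upper_tail)
```

Both routes give a probability of about 0.34 for $P(0 < Z < 1)$.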

Brownian motion

A century and a half before Brown, a draper from Delft, Antony van Leeuwenhoek, had observed Brownian motion.

Among other things, he had looked at scrapings of the unbrushed teeth of old men.

Einstein was the first scientist to really take an interest in Brownian motion who also had the mathematical ability to describe it analytically.

Other scientists had guessed the correct explanation but had not done any calculations that could be compared to experiments.

Calculating with the normal distribution (cont.)


How to find normal probabilities?

Use a computer. Most statistical software packages will calculate $P(X < c)$ for $X$ distributed normal with mean $\mu$ and standard deviation $\sigma$. (E.g. the free statistical package R has the command pnorm().)

If we wish to calculate a probability $P(a < X < b)$, then we can use a computer to calculate $P(X < a)$ and $P(X < b)$ and then use

$$P(a < X < b) = P(X < b) - P(X < a)$$
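The same calculation can be done in most languages. Here is a sketch in Python, where `NormalDist(...).cdf(c)` from the standard library plays the role of $P(X < c)$; the helper name `prob_between` and the numbers (mean 100, standard deviation 15, interval 90 to 120) are assumptions chosen for illustration.

```python
from statistics import NormalDist

def prob_between(a, b, mu, sigma):
    """P(a < X < b) for X ~ N(mu, sigma^2), as P(X < b) - P(X < a)."""
    X = NormalDist(mu, sigma)
    return X.cdf(b) - X.cdf(a)

# Illustrative values only.
p = prob_between(90, 120, 100, 15)
print(round(p, 3))
```

Note that `NormalDist` takes the standard deviation as its second argument, not the variance.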

Standardisation

Alternatively, the probabilities can be calculated by hand using a technique known as standardisation.

The standardised version of a random variable $X$ with (population) mean $\mu$ and (population) standard deviation $\sigma$ is

$$Z = \frac{X - \mu}{\sigma}$$

The mean of $Z$ is zero; the standard deviation of $Z$ is one.

If $X$ has a normal distribution, then $Z$ has a standard normal distribution! This result does not necessarily hold for other distributions.
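A quick numerical check of standardisation: a probability for $X$ computed directly should match the probability for $Z$ at the standardised cut-off. The mean of 50, standard deviation of 8 and cut-off of 62 below are made-up values for illustration.

```python
from statistics import NormalDist

mu, sigma = 50.0, 8.0   # illustrative values only
X = NormalDist(mu, sigma)
Z = NormalDist()        # standard normal: mean 0, standard deviation 1

c = 62.0
z = (c - mu) / sigma          # standardised cut-off

p_direct = X.cdf(c)           # P(X < c) using the N(mu, sigma^2) distribution
p_standardised = Z.cdf(z)     # P(Z < z) using the standard normal distribution

print(z, p_direct, p_standardised)
```

This is why a single table of standard normal probabilities is enough for hand calculation with any mean and standard deviation.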

Calculating with the normal distribution (cont.)

Example: Marilyn vos Savant's IQ

Earlier in the course I mentioned Marilyn vos Savant, who was listed for a time in the Guinness Book of Records as the person with the world's highest IQ.

She popularized the Monty Hall problem with her column in Parade magazine. Her presentation of the problem, and her (correct) solution, generated a lot of heated discussion at the time.

IQ scores are often assumed to follow a normal distribution, calibrated so that the mean is 100 and the standard deviation is 15.

Marilyn vos Savant's IQ is 228. What is the probability of someone randomly chosen from the general population having an IQ score larger than this?

Example: (cont.)

Let $X$ have a normal distribution with mean 100 and standard deviation 15. The question asks for $P(X > 228)$.

Method One: Using a computer

The free statistical package R (http://www.r-project.org) gives a value of 0 to machine precision ...

Suppose her IQ was a mere 150. Then R gives a value of 0.0004.
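A result of 0 is what one would expect if the probability is computed as one minus the cdf, because the cdf at 228 rounds to exactly 1 in double precision. The sketch below (in Python rather than R, and only one plausible way to do it) compares that with a direct upper-tail calculation using the complementary error function `math.erfc`. The helper name `upper_tail` is an assumption made for this example.

```python
import math
from statistics import NormalDist

def upper_tail(x, mu, sigma):
    """P(X > x) for X ~ N(mu, sigma^2), computed via the complementary error function."""
    z = (x - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))

iq = NormalDist(100, 15)

naive_228 = 1.0 - iq.cdf(228)        # the cdf rounds to 1.0, so the subtraction loses the tail
tail_228 = upper_tail(228, 100, 15)  # tiny, but not zero
tail_150 = upper_tail(150, 100, 15)

print(naive_228, tail_228, round(tail_150, 4))
```

The direct tail calculation shows the probability for 228 is extremely small rather than literally zero, and it reproduces the 0.0004 figure for an IQ of 150.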

Example: (cont.)

Method Two: using standardisation

We have:

$$P(X > 228) = P\left(\frac{X - \mu}{\sigma} > \frac{228 - \mu}{\sigma}\right) = P\left(Z > \frac{228 - 100}{15}\right) = P(Z > 8.53)$$

So the probability we need is $P(Z > 8.53)$, where $Z$ has a standard normal distribution. But how do we work out what this is? We have to look this probability up in a table, like the one given in the textbook. Actually 8.53 is well beyond the upper limit of values in the table. You won't be required to read normal tables for anything in this course.

The 68-95-99.7 rule

Consider a normally distributed random variable $X$ with mean $\mu$ and variance $\sigma^2$, and the random variable $Z = \frac{X - \mu}{\sigma}$, which is standard normal.

Then for $c > 0$,

$$P(\mu - c\sigma < X < \mu + c\sigma) = P(-c < Z < c)$$

For example,

$$P(\mu - \sigma < X < \mu + \sigma) = P(-1 < Z < 1) \approx 0.683$$

$$P(\mu - 2\sigma < X < \mu + 2\sigma) = P(-2 < Z < 2) \approx 0.954$$

$$P(\mu - 3\sigma < X < \mu + 3\sigma) = P(-3 < Z < 3) \approx 0.997$$

This is sometimes called the 68-95-99.7 rule.

If data are normally distributed, roughly 68% ($\approx \frac{2}{3}$) of observations should be within one standard deviation (SD) of the mean, and roughly 95% of observations should be within two SDs of the mean.
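The three percentages in the rule can be reproduced in a couple of lines. The sketch below relies on the standard identity $P(-c < Z < c) = \mathrm{erf}(c/\sqrt{2})$; the function name `prob_within` is an assumption made for this example.

```python
import math

def prob_within(c):
    """P(-c < Z < c) for a standard normal Z, using the error function."""
    return math.erf(c / math.sqrt(2.0))

# Probability of falling within 1, 2 and 3 standard deviations of the mean.
rule = {c: prob_within(c) for c in (1, 2, 3)}
for c, prob in rule.items():
    print(c, round(prob, 4))
```

Because of standardisation, the same three numbers apply to any normal distribution, whatever its mean and standard deviation.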

Why is the normal distribution so important?


Motivating example: sums of dice

The next five slides show the probabilities of the sums of n = 1, 2, 3, 4 and 5 dice.

What happens to the shape of the probabilities as the number of dice n increases?

Amazingly, the same thing happens with (almost) any random variable: as long as n is large enough, the probabilities for the sum (and mean) will start to follow a normal bell-shaped curve.
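The probabilities for the sum of n dice can be computed exactly by adding one die at a time. The following sketch assumes fair six-sided dice; the function name `sum_of_dice_pmf` and the crude text plot are choices made for illustration.

```python
from fractions import Fraction

def sum_of_dice_pmf(n):
    """Exact probabilities for the total shown by n fair six-sided dice."""
    pmf = {0: Fraction(1)}  # with no dice, the total is 0 with certainty
    for _ in range(n):
        new_pmf = {}
        for total, prob in pmf.items():
            for face in range(1, 7):
                # Each face of the extra die has probability 1/6.
                new_pmf[total + face] = new_pmf.get(total + face, Fraction(0)) + prob / 6
        pmf = new_pmf
    return pmf

# Crude text plot for n = 1, ..., 5: one row per possible total.
for n in range(1, 6):
    print(f"n = {n}")
    for total, prob in sorted(sum_of_dice_pmf(n).items()):
        print(f"{total:3d} {'#' * round(float(prob) * 200)}")
```

Reading the output from n = 1 to n = 5, the flat profile for a single die gives way to a triangle and then to something increasingly bell-shaped.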

[Figures: probabilities of the sum of n dice, one plot for each of n = 1, 2, 3, 4 and 5]

The Central Limit Theorem (CLT)

Consider a random variable, such as the number on a die roll.

Suppose we observe n realisations of the random variable; i.e. we roll the die n times.

If n is large enough, then the distribution of the sum (and mean) of the n values follows approximately a normal distribution.

NOTE: We assume here that the value obtained on one die roll does not affect the value obtained on another die roll; i.e. we are assuming independence. If we do not have independence, then the CLT may not hold. There are many different sets of assumptions under which the CLT holds.
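The CLT is not special to dice. As a rough check, the sketch below uses a deliberately skewed distribution (the exponential with mean 1 and standard deviation 1, chosen purely for illustration), takes the mean of n = 50 independent draws many times over, standardises each mean, and counts how often it lands within one and two standard deviations. The seed and sample sizes are arbitrary assumptions.

```python
import math
import random

rng = random.Random(2900)  # fixed seed so the run is repeatable
n, n_repeats = 50, 5000    # illustrative sizes only

def standardised_mean(n):
    """Mean of n independent exponential(1) draws, standardised to mean 0 and SD 1."""
    draws = [rng.expovariate(1.0) for _ in range(n)]
    mean = sum(draws) / n
    # A single draw has mean 1 and SD 1, so the mean of n draws has SD 1 / sqrt(n).
    return (mean - 1.0) / (1.0 / math.sqrt(n))

z_values = [standardised_mean(n) for _ in range(n_repeats)]
within_one = sum(abs(z) < 1 for z in z_values) / n_repeats
within_two = sum(abs(z) < 2 for z in z_values) / n_repeats

print(within_one, within_two)
```

If the normal approximation is working, the two proportions should be close to 0.68 and 0.95, even though a single exponential draw looks nothing like a bell curve.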

What does the CLT tell us?

It shows that a normally distributed random variable can be regarded as the sum (or mean) of a large number of small random contributions.

Often it can be argued that variables observed in the real physical world are subject to a large number of different sources of variability.

It is therefore not very surprising that many real-life variables are of the form "signal + noise" where the noise has an approximate normal distribution.

Why is the CLT important?

It explains why many real-life observed variables have a "signal + noise" form with the noise following a normal distribution.

In statistics, very commonly used quantities are sums or means of observations, so the CLT tells us that these quantities have approximate normal distributions.

Hence many methods in statistics rely on the normal distribution.

Connection between binomial and normal

Assume $X$ follows a binomial distribution with parameters $n$ and $p$. Then we can view $X$ as the sum of $n$ independent random variables, each having a Bernoulli distribution with parameter $p$. Hence, if $Y$ is a normally distributed random variable with mean $np$ and variance $np(1-p)$, then

$$P(X = x) \approx P\left(x - \frac{1}{2} \le Y \le x + \frac{1}{2}\right) = \int_{x - \frac{1}{2}}^{x + \frac{1}{2}} f_Y(u)\, du \approx f_Y(x),$$

where

$$f_Y(y) = \frac{1}{\sqrt{2\pi np(1-p)}} \exp\left(-\frac{(y - np)^2}{2np(1-p)}\right)$$

is the probability density function of $Y$.

NOTE: a typical rule of thumb is that this approximation is valid if $np \ge 5$ and $n(1-p) \ge 5$.
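To get a feel for how close the approximation is, the exact binomial probability can be set beside the two normal versions: the area from $x - \frac{1}{2}$ to $x + \frac{1}{2}$, and the density $f_Y(x)$. This is a sketch with made-up values $n = 40$ and $p = 0.5$ (so $np = n(1-p) = 20$, comfortably above the rule of thumb); the function names are assumptions.

```python
import math
from statistics import NormalDist

def binomial_pmf(x, n, p):
    """Exact P(X = x) for X ~ Binomial(n, p)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def matching_normal(n, p):
    """Normal distribution with the same mean np and variance np(1 - p)."""
    return NormalDist(n * p, math.sqrt(n * p * (1 - p)))

n, p = 40, 0.5  # illustrative values only
Y = matching_normal(n, p)

for x in (15, 20, 25):
    exact = binomial_pmf(x, n, p)
    area = Y.cdf(x + 0.5) - Y.cdf(x - 0.5)  # P(x - 1/2 <= Y <= x + 1/2)
    density = Y.pdf(x)                      # f_Y(x)
    print(x, round(exact, 4), round(area, 4), round(density, 4))
```

Widening the interval by a half on each side is often called a continuity correction: it accounts for a discrete value $x$ being represented by a whole unit-width strip under the continuous curve.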

Connection between binomial and normal

You find that your GEM2900 lecturer is becoming increasingly difficult, unreasonable and paranoid as the semester progresses.

He has just set a continuing assessment (CA) for you to do on the IVLE containing 100 multiple choice questions with 4 options each.

As you don't have time for this kind of thing, you decide to simply guess an answer at random for each question.

What is the probability that you pass the CA (that is, what is the probability that you get 50 or more questions correct)?

Connection between binomial and normal

Let $X$ be your score. Then if you are guessing each question randomly, clearly $X \sim \text{Binomial}(100, 0.25)$. Using the result that for a binomial with parameters $n$ and $p$ the mean is $np$ and the variance is $np(1-p)$, we have $E(X) = 25$ and $SD(X) = \sqrt{300/16}$.

Let $Y$ be a normal random variable with the same mean and variance as $X$, i.e. $Y \sim N(25, 300/16)$.

Then

$$P(X \ge 50) \approx P(Y \ge 49.5) = P\left(\frac{Y - \mu}{\sigma} \ge \frac{49.5 - \mu}{\sigma}\right) = P\left(Z \ge \frac{49.5 - 25}{\sqrt{300/16}}\right) = P(Z \ge 5.66) \approx 7.6 \times 10^{-9}$$
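The arithmetic above can be reproduced, and compared with the exact binomial answer, in a few lines. This sketch is an addition to the slide's calculation: the exact sum over the binomial probabilities is not part of the original example, and the variable names are assumptions.

```python
import math

n, p = 100, 0.25
mean = n * p                       # 25
sd = math.sqrt(n * p * (1 - p))    # sqrt(300/16)

# Normal approximation with the half-unit adjustment: P(Y >= 49.5).
z = (49.5 - mean) / sd
normal_tail = 0.5 * math.erfc(z / math.sqrt(2.0))

# Exact binomial probability of 50 or more correct guesses.
exact_tail = sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(50, n + 1))

print(round(z, 2), normal_tail, exact_tail)
```

This far out in the tail the normal approximation is fairly rough, and the exact probability comes out somewhat larger than the normal figure. Either way the practical conclusion is the same: guessing will not get you a pass.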

Connection between Poisson and normal

Assume $X$ follows a Poisson distribution with parameter $\lambda$. Then we can view $X$ as the sum of $n$ independent random variables, each having a Poisson distribution with parameter $\lambda/n$. Hence, if $Y$ is a normally distributed random variable with mean $\lambda$ and variance $\lambda$, then

$$P(X = x) \approx P\left(x - \frac{1}{2} \le Y \le x + \frac{1}{2}\right) = \int_{x - \frac{1}{2}}^{x + \frac{1}{2}} f_Y(u)\, du \approx f_Y(x),$$

where

$$f_Y(y) = \frac{1}{\sqrt{2\pi\lambda}} \exp\left(-\frac{(y - \lambda)^2}{2\lambda}\right)$$

is the probability density function of $Y$.
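As with the binomial, the quality of the approximation can be checked numerically. The sketch below uses a made-up value $\lambda = 30$; the function name `poisson_pmf` is an assumption, and the comparison points are arbitrary.

```python
import math
from statistics import NormalDist

def poisson_pmf(x, lam):
    """Exact P(X = x) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

lam = 30.0  # illustrative value only
Y = NormalDist(lam, math.sqrt(lam))  # normal with mean lam and variance lam

for x in (20, 30, 40):
    exact = poisson_pmf(x, lam)
    area = Y.cdf(x + 0.5) - Y.cdf(x - 0.5)  # P(x - 1/2 <= Y <= x + 1/2)
    density = Y.pdf(x)                      # f_Y(x)
    print(x, round(exact, 4), round(area, 4), round(density, 4))
```

Trying a much smaller value of `lam` in the same code is a simple way to see the approximation deteriorate when the Poisson distribution is still visibly skewed.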
