Lec21 2by2

GEM2900 - Understanding Uncertainty and Statistical Thinking

David Chew and David Nott
Department of Statistics and Applied Probability
National University of Singapore

Woolfson (2008, Chapter 10)

The normal distribution

The normal distribution, also called the Gaussian distribution, can be used to model continuous random variables.

The normal distribution has two parameters, $\mu$ and $\sigma^2$, called the (population) mean and (population) variance, respectively.

If the random variable $X$ follows a normal distribution with parameters $\mu$ and $\sigma^2$ (also denoted by $X \sim N(\mu, \sigma^2)$), then

$$E[X] = \mu \quad \text{and} \quad \mathrm{Var}[X] = \sigma^2.$$
The normal distribution (cont.)

The probability density function (pdf) of the normal distribution is symmetric, with a smooth bell shape.

The pdf is largest at $\mu$, the expected value, and how "spread out" the density curve is will be controlled by $\sigma$, the standard deviation.

Areas of regions underneath the pdf represent probabilities.

The formula for the pdf of the $N(\mu, \sigma^2)$ distribution is:

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

[Figure: Normal pdfs with same mean but different spread; horizontal axis x, vertical axis f(x).]
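To make the pdf formula concrete, here is a minimal Python sketch that evaluates it directly. The function name `normal_pdf` and the values chosen for the mean and standard deviations are illustrative assumptions, not part of the lecture.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x, following the pdf formula in the slide."""
    variance = sigma ** 2
    return math.exp(-(x - mu) ** 2 / (2 * variance)) / math.sqrt(2 * math.pi * variance)

# Same mean, different spread (illustrative values): the peak sits at mu
# and becomes lower as sigma grows.
mu = 10.0
for sigma in (1.0, 2.0, 3.0):
    print(sigma, normal_pdf(mu, mu, sigma))
```

Evaluating the function at points equally far either side of the mean gives the same value, which is the symmetry described above.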
As I mentioned, the normal distribution is sometimes also called the Gaussian distribution, named for the German mathematician Carl Friedrich Gauss.

The Gaussian distribution, however, was not first discovered by Gauss, an example of what mathematicians call Stigler's law. (This is the law that discoveries are not named after their original discoverers.) Stigler's law is, of course, an example of Stigler's law ...

Until conversion to the euro, the German ten-mark note had a picture of Gauss, a picture of the Gaussian density curve, and even the formula for the Gaussian density on it.
Brownian motion

One problem in which the Gaussian distribution arises is in the description of the physical phenomenon of Brownian motion. Albert Einstein's mathematical description of Brownian motion is widely regarded as having convinced most physicists of the existence of atoms.

Brownian motion is named after Robert Brown, who observed the highly erratic motion of pollen grains in a drop of water.

The motion of the grains can be thought of as arising from a large number of collisions with the water molecules. Brownian motion is an idealized model for the coordinates of the motion.

For a particle starting at the origin and followed for a period of time $t$, the distribution of the $x$ or $y$ coordinates will be $N(0, \sigma^2 t)$ for some $\sigma^2 > 0$.
Brownian motion

It is possible to simulate paths of Brownian motion on a computer.

The Java applet at http://www.ms.uky.edu/~mai/java/stat/brmo.html simulates a Brownian motion in two dimensions.

The paths traced out by the process tend to be highly erratic (in fact, although they are continuous, they are not differentiable anywhere, for those of you who have done enough math to know what that means).
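A rough idea of how such a simulation might work can be sketched in Python. This is not the applet's code: it is a simple discretisation that assumes independent normal increments in each coordinate, and the function name, step count and seed are all hypothetical choices.

```python
import math
import random

def simulate_brownian_path(total_time, n_steps, sigma=1.0, seed=None):
    """Approximate a 2-D Brownian path started at the origin.

    Each coordinate receives independent N(0, sigma^2 * dt) increments, so
    after time t a coordinate is distributed as N(0, sigma^2 * t).
    """
    rng = random.Random(seed)
    dt = total_time / n_steps
    step_sd = sigma * math.sqrt(dt)
    x, y = 0.0, 0.0
    path = [(x, y)]
    for _ in range(n_steps):
        x += rng.gauss(0.0, step_sd)
        y += rng.gauss(0.0, step_sd)
        path.append((x, y))
    return path

path = simulate_brownian_path(total_time=1.0, n_steps=1000, seed=1)
print(path[-1])  # final position of the particle
```

Plotting the points in `path` and joining them up gives the kind of jagged trajectory described above; increasing `n_steps` makes the path look rougher, not smoother.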
Brownian motion

Although Robert Brown wasn't the first to observe Brownian motion (Stigler's law again), he was certainly the most systematic experimenter to look at this phenomenon. He looked not just at pollen grains suspended in a water drop, but also at many other things (including scrapings of particles from the Sphinx, which he had access to in his work as a curator at the British Museum; there was some question about whether only living things were subject to Brownian motion, and apparently he regarded the Sphinx as undeniably, certifiably dead).
Calculating with the normal distribution

A normal distribution is specified by its mean $\mu$ and its variance $\sigma^2$.

The standard normal distribution has mean $\mu = 0$ and variance $\sigma^2 = 1$.

If a random variable is distributed as standard normal, it is typically denoted by $Z$ rather than $X$, and the standard normal observations are known as z-values.

The standard normal distribution is symmetric about $\mu = 0$, hence $P(Z < -c) = P(Z > c)$ for all $c \in \mathbb{R}$.

Probabilities of (continuous) random variables are given by areas under pdf curves. For example, suppose $Z$ is standard normal. Then the probability $P(0 < Z < 1)$, i.e. the probability that $Z$ is between zero and one, is equal to the area under the standard normal pdf between zero and one.
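As a small check of the two facts above, the sketch below uses Python's standard-library `statistics.NormalDist` (available from Python 3.8) to compute the area between zero and one and to confirm the symmetry of the tails. The value `c = 1.5` is an arbitrary illustrative choice.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)  # standard normal

# Area under the pdf between 0 and 1, i.e. P(0 < Z < 1).
p_0_to_1 = Z.cdf(1.0) - Z.cdf(0.0)
print(round(p_0_to_1, 4))  # 0.3413

# Symmetry about 0: the left tail below -c matches the right tail above c.
c = 1.5
left_tail = Z.cdf(-c)
right_tail = 1.0 - Z.cdf(c)
print(abs(left_tail - right_tail) < 1e-12)
```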
Brownian motion

A century and a half before Brown, a draper from Delft, Antony van Leeuwenhoek, had observed Brownian motion.

Among other things, he had looked at scrapings of the unbrushed teeth of old men.

Einstein was the first scientist to really take an interest in Brownian motion who also had the mathematical ability to describe it analytically.

Other scientists had guessed the correct explanation but had not done any calculations that could be compared to experiments.
Calculating with the normal distribution (cont.)

How to find normal probabilities?

Use a computer. Statistical software packages will calculate $P(X < c)$ for $X$ distributed normal with mean $\mu$ and standard deviation $\sigma$. (E.g. the free statistical package R has the command pnorm().)

If we wish to calculate a probability $P(a < X < b)$, then we can use a computer to calculate $P(X < a)$ and $P(X < b)$ and then use

$$P(a < X < b) = P(X < b) - P(X < a)$$
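The lecture uses R's pnorm(); the same subtraction can be sketched in Python, where `NormalDist(...).cdf(c)` from the standard library plays the role of $P(X < c)$. The helper name and the numbers in the example call are hypothetical.

```python
from statistics import NormalDist

def normal_interval_probability(a, b, mu, sigma):
    """P(a < X < b) for X ~ N(mu, sigma^2), computed as P(X < b) - P(X < a)."""
    X = NormalDist(mu=mu, sigma=sigma)
    return X.cdf(b) - X.cdf(a)

# Hypothetical numbers: mean 100, standard deviation 15, interval (85, 130).
print(normal_interval_probability(85, 130, mu=100, sigma=15))
```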
Standardisation

Alternatively, the probabilities can be calculated by hand using a technique known as standardisation.

The standardised version of a random variable $X$ with (population) mean $\mu$ and (population) standard deviation $\sigma$ is

$$Z = \frac{X - \mu}{\sigma}$$

The mean of $Z$ is zero; the standard deviation of $Z$ is one.

If $X$ has a normal distribution, then $Z$ has a standard normal distribution!

This result does not necessarily hold for other distributions.
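One way to see standardisation at work is to compute the same probability twice: once directly for $X$, and once for the standardised value using a standard normal. The sketch below does this in Python; the mean, standard deviation and cut-off are illustrative assumptions.

```python
from statistics import NormalDist

def standardise(x, mu, sigma):
    """Return the z-value (x - mu) / sigma."""
    return (x - mu) / sigma

mu, sigma = 100.0, 15.0  # illustrative values
c = 120.0

# P(X < c) computed directly from N(mu, sigma^2).
direct = NormalDist(mu, sigma).cdf(c)

# P(Z < (c - mu) / sigma) computed from the standard normal.
via_z = NormalDist(0.0, 1.0).cdf(standardise(c, mu, sigma))

print(direct, via_z)
```

The two printed numbers agree up to floating-point rounding, which is exactly what makes a single table of standard normal probabilities sufficient for hand calculation.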
Example: Marilyn vos Savant's IQ

Earlier in the course I mentioned Marilyn vos Savant, who was listed for a time in the Guinness Book of Records as the person with the world's highest IQ.

She popularized the Monty Hall problem with her column in Parade magazine. Her presentation of the problem and her (correct) solution generated a lot of heated discussion at the time.

IQ scores are often assumed to follow a normal distribution, calibrated so that the mean is 100 and the standard deviation is 15.

Marilyn vos Savant's IQ is 228. What is the probability of someone randomly chosen from the general population having an IQ score larger than this?
Example: (cont.)

Let $X$ have a normal distribution with mean 100 and standard deviation 15. The question asks for $P(X > 228)$.

Method One: Using a computer

The free statistical package R (http://www.r-project.org) gives a value of 0 to machine precision ...

Suppose her IQ was a mere 150. Then R gives a value of 0.0004.
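The same calculation can be sketched in Python rather than R. The naive route, one minus the cdf, loses the tail to rounding in much the same way as described above; computing the upper tail through the complementary error function `math.erfc` is one common way to keep such tiny probabilities. The helper name `upper_tail` is an assumption made for this illustration.

```python
import math
from statistics import NormalDist

def upper_tail(x, mu, sigma):
    """P(X > x) for X ~ N(mu, sigma^2), via erfc so that tiny tails survive."""
    z = (x - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))

iq = NormalDist(mu=100, sigma=15)

# Naive complement: the cdf is so close to 1 that the difference is lost to rounding.
print(1.0 - iq.cdf(228))

# Complementary error function: keeps the very small tail probability.
print(upper_tail(228, 100, 15))

# The "mere 150" case from the slide.
print(round(upper_tail(150, 100, 15), 4))  # 0.0004
```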
Example: (cont.)

Method Two: using standardisation

We have:

$$P(X > 228) = P\left(\frac{X - \mu}{\sigma} > \frac{228 - \mu}{\sigma}\right) = P\left(Z > \frac{228 - 100}{15}\right) = P(Z > 8.53)$$

So the probability we need is $P(Z > 8.53)$, where $Z$ has a standard normal distribution. But how do we work out what this is? We have to look this probability up in a table, like the one given in the textbook. Actually 8.53 is well beyond the upper limit of values in the table. You won't be required to read normal tables for anything in this course.

The 68-95-99.7 rule

Consider a normally distributed random variable $X$ with mean $\mu$ and variance $\sigma^2$, and the random variable $Z = \frac{X - \mu}{\sigma}$, which is standard normal.

Then for $c > 0$

$$P(\mu - c\sigma < X < \mu + c\sigma) = P(-c < Z < c)$$

For example

$$P(\mu - \sigma < X < \mu + \sigma) = P(-1 < Z < 1) \approx 0.683$$

$$P(\mu - 2\sigma < X < \mu + 2\sigma) = P(-2 < Z < 2) \approx 0.954$$

$$P(\mu - 3\sigma < X < \mu + 3\sigma) = P(-3 < Z < 3) \approx 0.997$$

This is sometimes called the 68-95-99.7 rule.

If data are normally distributed, roughly 68% ($\approx \frac{2}{3}$) of observations should be within one standard deviation (SD) of the mean; and roughly 95% of observations should be within two SDs of the mean.
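The three probabilities behind the rule are easy to reproduce. The following Python sketch computes $P(-c < Z < c)$ for $c = 1, 2, 3$ with the standard-library `NormalDist`; the helper name `within_c_sds` is a hypothetical choice.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def within_c_sds(c):
    """P(-c < Z < c), which equals P(mu - c*sigma < X < mu + c*sigma)."""
    return Z.cdf(c) - Z.cdf(-c)

for c in (1, 2, 3):
    print(c, round(within_c_sds(c), 4))
```

Because the calculation is done on the standardised scale, the same three numbers apply to any normal distribution, whatever its mean and standard deviation.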
Why is the normal distribution so important?

Motivating example: sums of dice

The next five slides show the probabilities of the sums of n = 1, 2, 3, 4 and 5 dice.

What happens to the shape of the probabilities as the number of dice n increases?

Amazingly, this will happen with (almost) any random variable; as long as n is large enough, the probabilities for the sum (and mean) will start to follow a normal bell-shaped curve.
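The probabilities for sums of dice can be computed exactly by adding one die at a time. The Python sketch below does this with exact fractions and prints a crude text histogram; the function name and the histogram scale are illustrative assumptions, and fair six-sided dice are assumed throughout.

```python
from fractions import Fraction

def dice_sum_pmf(n):
    """Exact pmf of the sum of n fair six-sided dice, as {sum: probability}."""
    pmf = {0: Fraction(1)}
    for _ in range(n):
        new_pmf = {}
        for total, prob in pmf.items():
            for face in range(1, 7):
                key = total + face
                new_pmf[key] = new_pmf.get(key, Fraction(0)) + prob * Fraction(1, 6)
        pmf = new_pmf
    return pmf

for n in (1, 2, 5):
    pmf = dice_sum_pmf(n)
    print(f"n = {n}")
    # Crude text histogram: one '#' per 1% of probability.
    for total in sorted(pmf):
        print(f"{total:3d} {'#' * round(float(pmf[total]) * 100)}")
```

For n = 1 the histogram is flat, for n = 2 it is a triangle, and by n = 5 the bars already trace out a recognisable bell shape.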
The Central Limit Theorem (CLT)

Consider a random variable, such as the number on a die roll.

Suppose we observe n realisations of the random variable; i.e. we roll the die n times.

If n is large enough, then the distribution of the sum (and mean) of the n values follows approximately a normal distribution.

NOTE: We assume here that the value obtained on one die roll does not affect the value obtained on another die roll; i.e. we are assuming independence. If we do not have independence, then the CLT may not hold. There are many different sets of assumptions under which the CLT holds.
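A quick simulation gives a feel for the theorem. The sketch below rolls a fair die n times, repeats this many times, and checks what fraction of the simulated sums fall within one and two standard deviations of the mean; if the sums are roughly normal, these fractions should be in the neighbourhood of 68% and 95%. The number of dice, the number of repeats and the seed are arbitrary assumptions, and the rolls are generated independently, as the note above requires.

```python
import math
import random

def simulate_dice_sums(n_dice, n_repeats, seed=0):
    """Simulate n_repeats independent sums of n_dice fair die rolls."""
    rng = random.Random(seed)
    return [sum(rng.randint(1, 6) for _ in range(n_dice)) for _ in range(n_repeats)]

n_dice = 30
sums = simulate_dice_sums(n_dice, n_repeats=5000)

# One roll has mean 3.5 and variance 35/12, so the sum of n independent
# rolls has mean 3.5 * n and variance n * 35/12.
mean_sum = 3.5 * n_dice
sd_sum = math.sqrt(n_dice * 35 / 12)

def fraction_within(k):
    """Fraction of simulated sums within k standard deviations of the mean."""
    return sum(abs(s - mean_sum) < k * sd_sum for s in sums) / len(sums)

print(fraction_within(1), fraction_within(2))
```

Since the sum of dice only takes whole-number values, the fractions will not match the normal figures exactly, but they should be close.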
What does the CLT tell us?

It shows that a normally distributed random variable can be regarded as the sum (or mean) of a large number of small random contributions.

Often it can be argued that variables observed in the real physical world are subject to a large number of different sources of variability.

It is therefore not very surprising that many real-life variables are of the form "signal + noise", where the noise has an approximate normal distribution.
Why is the CLT important?

It explains why many real-life observed variables have a "signal + noise" form, with the noise following a normal distribution.

In statistics, very commonly used quantities are sums or means of observations, so the CLT tells us that these quantities have approximate normal distributions.

Hence many methods in statistics rely on the normal distribution.
Connection between binomial and normal

Assume $X$ follows a binomial distribution with parameters $n$ and $p$. Then we can view $X$ as the sum of $n$ independent random variables, each having a Bernoulli distribution with parameter $p$.

Hence, if $Y$ is a normally distributed random variable with mean $np$ and variance $np(1-p)$, then

$$P(X = x) \approx P\left(x - \tfrac{1}{2} \le Y \le x + \tfrac{1}{2}\right) = \int_{x - \frac{1}{2}}^{x + \frac{1}{2}} f_Y(u)\,du \approx f_Y(x),$$

where

$$f_Y(y) = \frac{1}{\sqrt{2\pi np(1-p)}} \exp\left(-\frac{(y - np)^2}{2np(1-p)}\right)$$

is the probability density function of $Y$.

NOTE: a typical rule of thumb is that this approximation is valid if $np \ge 5$ and $n(1-p) \ge 5$.
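To see how close the approximation is, one can compare the exact binomial probability with the continuity-corrected normal probability for a few values of x. The Python sketch below does this; the function names and the choice n = 40, p = 0.5 are illustrative assumptions.

```python
import math
from statistics import NormalDist

def binomial_pmf(x, n, p):
    """Exact P(X = x) for X ~ Binomial(n, p)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def normal_approx_pmf(x, n, p):
    """Continuity-corrected approximation P(x - 1/2 <= Y <= x + 1/2)."""
    Y = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))
    return Y.cdf(x + 0.5) - Y.cdf(x - 0.5)

# Illustrative values: np = 20 and n(1 - p) = 20, so the rule of thumb is satisfied.
n, p = 40, 0.5
for x in (15, 20, 25):
    print(x, binomial_pmf(x, n, p), normal_approx_pmf(x, n, p))
```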
You find that your GEM2900 lecturer is becoming increasingly difficult, unreasonable and paranoid as the semester progresses.

He has just set a continuing assessment (CA) for you to do on the IVLE containing 100 multiple choice questions with 4 options each.

As you don't have time for this kind of thing, you decide to simply guess an answer at random for each question.

What is the probability that you pass the CA (that is, what is the probability that you obtain a score of 50 or more correct)?

Let $X$ be your score. Then if you are guessing each question randomly, clearly $X \sim \text{Binomial}(100, 0.25)$. Using the result that for a binomial with parameters $n$ and $p$ the mean is $np$ and the variance is $np(1-p)$, we have $E(X) = 25$ and $SD(X) = \sqrt{300/16}$.

Let $Y$ be a normal random variable with the same mean and variance as $X$, i.e. $Y \sim N(25, 300/16)$.

Then

$$P(X \ge 50) \approx P(Y \ge 49.5) = P\left(\frac{Y - \mu}{\sigma} \ge \frac{49.5 - \mu}{\sigma}\right) = P\left(Z \ge \frac{49.5 - 25}{\sqrt{300/16}}\right) = P(Z \ge 5.66) \approx 7.6 \times 10^{-9}$$
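The calculation can be checked on a computer, and compared with the exact binomial tail, in a few lines of Python. This is a sketch under the same assumptions as the example (100 questions, 4 options, pure guessing); the variable names are hypothetical, and the upper tail is computed with `math.erfc` so that the very small probability is not lost to rounding.

```python
import math

n, p = 100, 0.25
pass_mark = 50

# Exact binomial tail: P(X >= 50), summed term by term.
exact = sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(pass_mark, n + 1))

# Normal approximation with continuity correction: P(Y >= 49.5), Y ~ N(np, np(1 - p)).
mu = n * p
sigma = math.sqrt(n * p * (1 - p))
z = (pass_mark - 0.5 - mu) / sigma
approx = 0.5 * math.erfc(z / math.sqrt(2.0))

print(f"z = {z:.2f}")
print(f"normal approximation: {approx:.2e}")
print(f"exact binomial tail:  {exact:.2e}")
```

This far out in the tail the two answers can differ by a sizeable factor, even though both say the same thing in practice: guessing will essentially never get you a pass.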
Connection between Poisson and normal

Assume $X$ follows a Poisson distribution with parameter $\lambda$. Then we can view $X$ as the sum of $n$ independent random variables, each having a Poisson distribution with parameter $\lambda / n$.

Hence, if $Y$ is a normally distributed random variable with mean $\lambda$ and variance $\lambda$, then

$$P(X = x) \approx P\left(x - \tfrac{1}{2} \le Y \le x + \tfrac{1}{2}\right) = \int_{x - \frac{1}{2}}^{x + \frac{1}{2}} f_Y(u)\,du \approx f_Y(x),$$

where

$$f_Y(y) = \frac{1}{\sqrt{2\pi\lambda}} \exp\left(-\frac{(y - \lambda)^2}{2\lambda}\right)$$

is the probability density function of $Y$.
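As with the binomial case, the quality of the approximation can be inspected numerically. The Python sketch below compares exact Poisson probabilities with the continuity-corrected normal probabilities; the function names and the value of `lam` (standing in for $\lambda$) are illustrative assumptions.

```python
import math
from statistics import NormalDist

def poisson_pmf(x, lam):
    """Exact P(X = x) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def normal_approx_pmf(x, lam):
    """Continuity-corrected approximation using Y ~ N(lam, lam)."""
    Y = NormalDist(mu=lam, sigma=math.sqrt(lam))
    return Y.cdf(x + 0.5) - Y.cdf(x - 0.5)

lam = 50.0  # illustrative value; the approximation tends to improve as lam grows
for x in (40, 50, 60):
    print(x, poisson_pmf(x, lam), normal_approx_pmf(x, lam))
```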
