
Continuous Distributions

Continuous Distributions
Foundations for much of statistical inference.

• Normal Distribution
• Log Normal Distribution
• Gamma Distribution
• Chi Square Distribution
• F Distribution
• t Distribution
• Weibull Distribution
• Extreme Value Distribution (Type I and II)
• Exponential Distribution

Typical applications noted alongside the list: environmental variables; time to failure, radioactivity; basis for statistical tests; lifetime distributions; reaction kinetics.

Continuous random variables are defined for continuous numbers on the real line. Probabilities have to be computed for all possible sets of numbers.
Continuous Distributions
The distributions discussed so far have had only a discrete set of possible
outcomes (e.g. 0, 1, 2, ...).

Now we'll discuss continuous distributions, whose outcomes lie along the real line.

Strange Observation:

One interesting point about continuous probability distributions is that, because an infinite number of points lie on the real line, the probability of observing any particular point is zero. This means that the height of the curve does not represent the probability.
Continuous Distributions (PDF)
Continuous distributions are described by probability density functions, f(x).
What is the meaning of the probability density function f(x) when X is continuous?

First observe that it is meaningless to define events in terms of single continuous values. The probability of an event occurring at exactly 2.35678935465457348204945023983598459830923… is zero.

A continuous random variable has an infinite number of values.

Thus for a continuous random variable, an event must be defined in terms of an interval of values.
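The point above can be checked numerically. The following is a minimal sketch, assuming a standard normal distribution and the arbitrary point 2.35 (neither comes from the slides); the function name `interval_probability` is also made up for the illustration. It shows the probability of an interval shrinking towards zero as the interval narrows to a single point.

```python
from statistics import NormalDist

# Assumed example distribution: a standard normal (mean 0, standard deviation 1).
dist = NormalDist(mu=0.0, sigma=1.0)

def interval_probability(center, half_width):
    """P(center - half_width <= X <= center + half_width), from the CDF."""
    return dist.cdf(center + half_width) - dist.cdf(center - half_width)

# Narrow the interval around the arbitrary point 2.35 and watch the
# probability shrink with it.
for half_width in (0.5, 0.05, 0.005, 0.0005):
    print(half_width, interval_probability(2.35, half_width))
```

An interval of zero width has probability exactly zero, which is why events are stated as intervals.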

Continuous Distributions
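One can therefore find the probability that a random variable X will fall between two values a and b by integrating f(x) over the interval:

$$P(a \le X \le b) = \int_a^b f(x)\,dx$$

The total integral over the real line must equal one:

$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$

Any one point has zero probability of occurrence:

$$P(X = a) = \int_a^a f(x)\,dx = 0$$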
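These three statements can be illustrated with a simple numerical integration. This is a sketch under stated assumptions: the standard normal density is used only as a convenient example pdf, the trapezoidal rule stands in for exact integration, and the range −10 to 10 stands in for the whole real line. The helper names are hypothetical.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density f(x) of a normal distribution (used here only as an example pdf)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f from a to b with the trapezoidal rule."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

p_interval = integrate(normal_pdf, -1.0, 2.0)   # P(-1 <= X <= 2)
p_total = integrate(normal_pdf, -10.0, 10.0)    # stands in for the whole real line
p_point = integrate(normal_pdf, 1.0, 1.0)       # zero-width interval

print(p_interval, p_total, p_point)
```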

Continuous Distributions
Big difference between discrete and continuous distributions:

Discrete: the height is the probability (sum of heights = 1).
Continuous: the area is the probability (total area = 1).

Probability Density Function (PDF)
A function which integrates to 1 over its range and from
which event probabilities can be determined.

[Figure: a density curve f(x) plotted over the random variable's range; the area under the curve sums to one.]

Probability Density Function

The pdf does not have to be symmetric, nor be defined for all real numbers.

The shape of the curve is determined by one or more distribution parameters.

[Figure: Chi Square density functions, fX(x|b), with the horizontal axis running from 0 to 30.]
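As an illustration of a pdf whose shape depends on a parameter, here is a minimal sketch of the chi-square density. The parameter name `k` (degrees of freedom) and the function name are choices made for this example, not notation from the slides, and the density is simply treated as zero for x ≤ 0.

```python
import math

def chi_square_pdf(x, k):
    """Chi-square density with k degrees of freedom; treated as zero for x <= 0."""
    if x <= 0:
        return 0.0
    return x ** (k / 2 - 1) * math.exp(-x / 2) / (2 ** (k / 2) * math.gamma(k / 2))

# The same formula gives differently shaped curves as k changes.
for k in (2, 5, 10):
    print(k, [round(chi_square_pdf(x, k), 4) for x in (1, 5, 10, 20)])
```

The curve is not symmetric about its peak and has no density on the negative half of the real line.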

Normal Distribution
Or Gaussian Distribution
The Gaussian distribution, or Normal distribution, is probably the most commonly encountered continuous distribution. Each time you take a set of data, average it, and calculate its standard deviation, you are implicitly treating the underlying distribution as Gaussian.

The normal distribution is the distribution that is expected when measurements are made up of a large number of 'noise' components that are all distributed in the same way as each other.

Many biological and physical measurements have lots of sources of inaccuracy and noise, so the distributions of those measurements will be approximately normal, as long as the distributions of those components are similar. (They don't have to be normal!)
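A small simulation can make this plausible. The sketch below assumes each measurement is the sum of 30 noise components drawn uniformly from [-1, 1]; the component count, the range, and the seed are arbitrary choices for illustration. The components are clearly not normal, yet the fraction of measurements within one standard deviation of the mean comes out close to the 68% expected of a normal distribution.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def noisy_measurement(n_components=30):
    """One measurement built from many identically distributed noise terms.

    Each component is uniform on [-1, 1], which is clearly not normal.
    """
    return sum(rng.uniform(-1.0, 1.0) for _ in range(n_components))

measurements = [noisy_measurement() for _ in range(20_000)]
mean = statistics.fmean(measurements)
sd = statistics.pstdev(measurements)

# For a normal distribution, about 68% of values lie within one standard
# deviation of the mean.
within_one_sd = sum(abs(m - mean) <= sd for m in measurements) / len(measurements)
print(round(mean, 3), round(sd, 3), round(within_one_sd, 3))
```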
Normal Distribution - Properties
1. The mean, median, and mode are equal
2. The normal curve is bell-shaped and symmetric about the mean
3. The total area under the curve is equal to one
4. The normal curve asymptotically approaches zero on either side of
the mean.

Standard Normal Distribution: Z score
Rescales any normal distribution axis from its true units (time, weight, dollars, barrels, and so forth) to
the standard measure referred to as a z-value. Thus, any value of the normally distributed continuous
random variable can be represented by a unique z-value.

1. Moves the mean to zero.
2. Normalizes the standard deviation to one, so that the 68% mark is now at z = ±1.0.
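The two steps above can be sketched on a small data set. The weights below are invented for the illustration, and the sample mean and population standard deviation of the data stand in for the distribution's true mean and standard deviation, which is an assumption of this sketch.

```python
import statistics

def standardize(values):
    """Convert raw values to z-values: subtract the mean, divide by the SD."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    return [(v - mean) / sd for v in values]

# Hypothetical weights in kilograms, invented for this illustration.
weights_kg = [61.0, 64.5, 70.2, 72.8, 75.1, 80.4]
z_values = standardize(weights_kg)

print(statistics.fmean(z_values))    # approximately 0
print(statistics.pstdev(z_values))   # approximately 1
```

Whatever the original units, the rescaled values are unit-free and centered on zero.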
Standard Normal Distribution
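All normal random variables can be related back to the standard normal random variable.

A Standard Normal random variable has mean 0 and standard deviation 1.

[Figure: normal curve with the x axis marked at $\mu - 3\sigma$, $\mu - 2\sigma$, $\mu - \sigma$, $\mu$, $\mu + \sigma$, $\mu + 2\sigma$, $\mu + 3\sigma$ and the matching z axis marked at -3, -2, -1, 0, +1, +2, +3.]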

Illustration
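[Figure: three density curves: the density of $X$ (centered at $\mu$ with spread $\sigma$), the density of $X - \mu$ (centered at 0), and the density of $(X - \mu)/\sigma$ (centered at 0 with spread 1).]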
Normal Distribution - Properties
To the left of $\mu - \sigma$ and to the right of $\mu + \sigma$ the graph curves upwards. The graph curves downwards to the right of $\mu - \sigma$ and to the left of $\mu + \sigma$. The points at which the curvature changes are called the inflection points.

Inflection point: where the second derivative is zero and changes sign.
Normal Distribution
A symmetric distribution defined on the range $-\infty$ to $+\infty$ whose shape is defined by two parameters: the mean, denoted $\mu$, that centers the distribution, and the standard deviation, $\sigma$, that determines the spread of the distribution.

68% of the total area is between $\mu - \sigma$ and $\mu + \sigma$, the inflection points.

$$P(\mu - \sigma \le X \le \mu + \sigma) \approx 68\%$$
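The 68% figure can be verified from the normal cumulative distribution function, which can be written with the error function. This is a minimal sketch; the function names are made up for the example, and the mean 58 and standard deviation 10 in the second call are arbitrary.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative probability P(X <= x) for a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def prob_within(k, mu=0.0, sigma=1.0):
    """P(mu - k*sigma <= X <= mu + k*sigma)."""
    return normal_cdf(mu + k * sigma, mu, sigma) - normal_cdf(mu - k * sigma, mu, sigma)

# The answer does not depend on the particular mean and standard deviation.
print(round(prob_within(1), 4))            # 0.6827
print(round(prob_within(1, 58, 10), 4))    # 0.6827
```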
Normal Distribution

Notation
Suppose $X$ has a normal distribution with mean $\mu$ and standard deviation $\sigma$; we often denote this by $X \sim N(\mu, \sigma^2)$.

A new random variable defined as $Z = (X - \mu)/\sigma$ has the standard normal distribution, denoted $Z \sim N(0, 1)$.

To create a random variable with a specific mean and standard deviation, we start with a standard normal deviate, multiply it by the target standard deviation, and then add the target mean:

$$\sigma Z + \mu = X$$
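The scale-and-shift recipe can be sketched with Python's standard library. The target mean of 150 and standard deviation of 25, the seed, and the sample size are arbitrary illustration values, and `normal_variate` is a hypothetical helper name.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

def normal_variate(mu, sigma):
    """Scale and shift a standard normal deviate: X = sigma * Z + mu."""
    z = rng.gauss(0.0, 1.0)  # standard normal deviate
    return sigma * z + mu

# Target mean and standard deviation are arbitrary illustration values.
samples = [normal_variate(150.0, 25.0) for _ in range(50_000)]
print(round(statistics.fmean(samples), 1), round(statistics.stdev(samples), 1))
```

The sample mean and standard deviation should land close to the targets, with small differences due to sampling noise.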

Why is this important? Because in this way, the probability of any event on a normal
random variable with any given mean and standard deviation can be computed from
tables of the standard normal distribution.

Statistics textbooks often include pre-calculated tables that give the cumulative probability (the area under the standard normal curve) for each z-score.
Example
Assume that a set of test scores has a mean of 150 and a standard deviation of 25.

If a particular student had a score of 190, what is his/her z-score?

z = (190 - 150)/25 = 1.6

That is, the score is 1.6 standard deviations above the mean.

What percentage of students have scores above this?

Look up table
p-value = P(Z > +1.6)
= 1 - P(Z < 1.6)
= 1 - 0.9452
= 0.0548
≈ 5.5%
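The z-score and the table lookup can also be reproduced in code. This is a sketch that replaces the printed table with the complementary error function from Python's `math` module; the function names are chosen for the example.

```python
import math

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above the mean."""
    return (x - mu) / sigma

def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

z = z_score(190, 150, 25)
p = upper_tail(z)
print(z)              # 1.6
print(round(p, 4))    # 0.0548
```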

Look up table
Because of symmetry, we could also have looked up the area from -infinity to -1.6.

Exercises

1. If z = 2.15, what is the area beyond z?
2. Find the area below z.
3. What is the sum of the above two areas?
4. What is the area between the mean and 2.15 standard deviations above the mean?
5. What is the probability of obtaining a z score between −2.20 and 0.25 on the standard normal curve?
6. What z score is exceeded by 10% of all scores under the normal curve?

Example
After running repeated experiments, we find that the doubling time for a particular strain of E. coli has a mean of 58 minutes with a standard deviation of 10 minutes. Using z-scores, determine the range of expected doubling times at the 95% and 99% confidence levels.

Rearrange the z-score formula $z = (x - \mu)/\sigma$ to solve for x (both upper and lower):

$$x = \mu \pm z\sigma$$

Look up a standard table to find out what the z score is for 95%.

Example
From the table, the z score for 95% is 1.645.

Example
With z = 1.645 from the table:

$x_{\text{upper}} = 58 + 1.645 \times 10 = 74.45$ mins
$x_{\text{lower}} = 58 - 1.645 \times 10 = 41.55$ mins

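Note that z = 1.645 leaves 5% of the area in each tail, so the limits 41.55 and 74.45 mins enclose the central 90% of doubling times; a two-sided 95% range uses z = 1.96.

Class: you work out the 99% limits.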
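The limit calculation can be sketched as follows, using the example's mean of 58 minutes, standard deviation of 10 minutes, and z = 1.645. The helper name `limits` is made up for the illustration, and `NormalDist().inv_cdf` from Python's `statistics` module is used here as a stand-in for the printed table.

```python
from statistics import NormalDist

def limits(mu, sigma, z):
    """Return (lower, upper) = (mu - z*sigma, mu + z*sigma)."""
    return mu - z * sigma, mu + z * sigma

lower, upper = limits(58.0, 10.0, 1.645)
print(round(lower, 2), round(upper, 2))    # 41.55 74.45

# The table lookup can be replaced by the inverse CDF of the standard normal:
# this is the z-value with 95% of the area below it.
z_95 = NormalDist().inv_cdf(0.95)
print(round(z_95, 3))                      # 1.645
```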

Week 4: Exercise
A pharmaceutical company manufactures stocks of Ebola vaccine. The vaccine has a shelf life that is approximately normally distributed with a mean of 800 hours and a standard deviation of 40 hours. Find the probability that a random sample of 16 vials of vaccine will have an average shelf life of less than 775 hours.
