
Note: all definitions are from your textbook

CHAPTER 1
What is Statistics?
The discipline of statistics teaches us how to make intelligent judgments and informed decisions in the presence of uncertainty and variation.
Branches of Statistics

Descriptive Statistics: collecting data, then summarizing and describing the important features of the data.
Methods:
  Graphical
    Stem and leaf
    Histogram
    Boxplot
    Scatterplot
  Numerical Summary Measures
    Mean
    Median
    Variance
    Standard deviation
Inferential Statistics: generalizing the results from a sample to a population.
Methods:
  Point estimation
  Confidence interval
  Hypothesis testing

Some Definitions
A population is the collection of all objects of interest.
A sample is a subset of the population.
A variable is any characteristic whose value may change from one object to another in the population.
An experimental unit is the individual or object on which a variable is measured.
A discrete variable almost always results from counting; its possible values are 0, 1, 2, ... or some subset of these integers.
A continuous variable is one whose possible values consist of an entire interval on the number line; it arises from making measurements.
Univariate data consist of observations on a single variable.
Bivariate data arise when observations are made on each of two variables.
Multivariate data arise when observations are made on more than two variables.
Collecting Data
A technique for collecting the data has to be developed.
The simplest method is simple random sampling: any particular subset of the specified size (e.g., n = 100) has the same chance of being selected.
Visual Techniques
Stem and leaf
Steps for Constructing a Stem and Leaf Plot
1. Select one or more leading digits for the stem values. The trailing digits become the leaves.
2. List possible stem values in a vertical column.
3. Record the leaf for every observation beside the corresponding stem value.
4. Indicate the units for stems and leaves someplace in the display.
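The four steps above can be sketched in a few lines of Python. This is only an illustration: the function name `stem_and_leaf`, the `leaf_unit` parameter and the short data list are assumptions made for the example, and the sketch handles nonnegative values only.

```python
from collections import defaultdict

def stem_and_leaf(values, leaf_unit=0.1):
    """Return {stem: sorted leaves}; the leaf is the last digit in units of leaf_unit."""
    plot = defaultdict(list)
    for v in values:
        scaled = round(v / leaf_unit)      # e.g. 3.1 -> 31 when leaf_unit = 0.1
        stem, leaf = divmod(scaled, 10)    # leading digits = stem, trailing digit = leaf
        plot[stem].append(leaf)
    # Step 2: list every possible stem between the smallest and largest, even if empty.
    stems = range(min(plot), max(plot) + 1)
    return {s: sorted(plot.get(s, [])) for s in stems}

sample = [3.1, 4.9, 2.8, 3.6, 2.5, 4.5, 1.8, 6.2]   # small illustrative data set
plot = stem_and_leaf(sample)

for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
print("Stem: ones digit, leaf: tenths digit")       # Step 4: state the units
```

Sorting the leaves within each stem is optional, but it makes the smallest and largest observations easy to read off.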

Histogram
Constructing a Histogram for Discrete Data
First, determine the frequency or relative frequency of each X value. Then mark the possible X values on a horizontal scale, and above each value draw a rectangle whose height is the frequency or relative frequency of that value.

Frequency of a value = number of times the value occurs
$$\text{Relative frequency of a value} = \frac{\text{number of times the value occurs}}{\text{number of observations in the data set}}$$
Percentage = 100 × relative frequency
Constructing a Histogram for Continuous Data
Determine the frequency or relative frequency for each class. Mark the class boundaries
on a horizontal measurement axis, above each class interval draw a rectangle whose
height is the corresponding frequency or relative frequency.

$$\text{Number of classes} \approx \sqrt{\text{number of observations}}$$
$$\text{Class width} \approx \frac{\text{range}}{\text{number of classes}}$$
Range = largest value - smallest value
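As a rough sketch of how the class rules above turn into a frequency table, the following Python snippet splits the range into equal-width classes and counts the observations in each. The function name `class_frequencies` and the 16 ages are hypothetical, and the sketch assumes the data are not all equal (otherwise the class width would be zero).

```python
import math

def class_frequencies(data, k=None):
    """Split the range into k equal-width classes; return width, counts, relative frequencies."""
    n = len(data)
    if k is None:
        k = round(math.sqrt(n))            # rule of thumb: about sqrt(n) classes
    lo, hi = min(data), max(data)
    width = (hi - lo) / k                  # class width = range / number of classes
    counts = [0] * k
    for x in data:
        idx = min(int((x - lo) / width), k - 1)   # the largest value goes in the last class
        counts[idx] += 1
    rel = [c / n for c in counts]
    return width, counts, rel

data = [26, 31, 34, 35, 37, 42, 43, 43, 48, 50, 52, 53, 58, 62, 66, 70]  # hypothetical ages
width, counts, rel = class_frequencies(data)
print(width, counts, rel)
```

Each class here includes its left boundary and excludes its right boundary, except the last class, which includes both.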
Histogram Shapes
Unimodal: a histogram that rises to a single peak and then declines.
Bimodal: a histogram that has two different peaks.
Multimodal: a histogram that has more than two different peaks.
Symmetric: a histogram whose left half is a mirror image of its right half.
Positively skewed (skewed to the right): a histogram whose right or upper tail is stretched out compared with the left or lower tail.
Negatively skewed (skewed to the left): a histogram whose left or lower tail is stretched out compared with the right or upper tail.

Examples:

The ages of 50 tenured faculty at a state university are


34, 48, 70, 63, 52, 52, 35, 50, 37, 43, 53, 43, 52, 44, 42, 31, 36, 48, 43, 26, 58, 62, 49,
34, 48, 53, 39, 45, 34, 59, 34, 66, 40, 59, 36, 41, 35, 36, 62, 34, 38, 28, 43, 50, 30, 43,
32, 44, 58, 53
1. Draw a histogram with 6 classes.
2. Describe the distribution. Are there any outliers?
3. What proportion of the tenured faculty are younger than 41?

Construct a stem and leaf plot for these 50 measurements:
3.1, 4.9, 2.8, 3.6, 2.5, 4.5, 3.5, 3.7, 4.1, 4.9, 2.9, 2.1, 3.5, 4.0, 3.7, 2.7, 4.0, 4.4, 3.7, 4.2,
3.8, 6.2, 2.5, 2.9, 2.8, 5.1, 1.8, 5.6, 2.2, 3.4, 2.5, 3.6, 5.1, 4.8, 1.6, 3.6, 6.1, 4.7, 3.9, 3.9,
4.3, 5.7, 3.7, 4.6, 4.0, 5.6, 4.9, 4.2, 3.1, 3.9
1. Describe the shape of the data distribution. Do you see any outliers?
2. Use the stem and leaf plot to find the smallest observation.
3. Find the eighth and ninth largest observations.
Numerical Measures
Graphical methods may not always be sufficient for describing data. You can use the data to calculate a set of numbers that will convey a good mental picture of the frequency distribution.
Numerical descriptive measures associated with a population of measurements are called parameters; those computed from sample measurements are called statistics.
Measures of Location
Mean
This is the usual arithmetic mean or average, equal to the sum of the measurements divided by the number of measurements.
$$\text{Sample mean: } \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
$$\text{Population mean: } \mu = \frac{\sum_{i=1}^{N} X_i}{N}$$

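Median
This is the middle value of the measurements when they are ordered from smallest to largest.
The position of the median in the ordered sample is $\frac{n+1}{2}$. When n is even, this position falls halfway between the two middle observations and the median is their average.
The sample median is denoted by $\tilde{X}$ and the population median by $\tilde{\mu}$.
Note: The mean and median are equal when the distribution of the data is symmetric; the mean is greater than the median when the distribution is skewed to the right and less than the median when the distribution is skewed to the left.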

Example: The prices for 14 different brands of water-packed light tuna are 0.99, 1.92, 1.23, 0.85, 0.65, 0.53, 1.41, 1.12, 0.63, 0.67, 0.69, 0.60, 0.60, 0.66.

a. Find the average price for the 14 different brands of tuna.


b. Find the median price for the 14 different brands of tuna.
c. Based on your findings in parts a and b, do you think that the distribution of
prices is skewed?
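One way to check the arithmetic for this example is with Python's standard `statistics` module. The sketch below uses the 14 prices listed in the example; the variable names are illustrative only.

```python
from statistics import mean, median

prices = [0.99, 1.92, 1.23, 0.85, 0.65, 0.53, 1.41,
          1.12, 0.63, 0.67, 0.69, 0.60, 0.60, 0.66]

avg = mean(prices)      # sum of the measurements divided by n = 14
med = median(prices)    # n is even, so this averages the 7th and 8th ordered values

print(f"mean = {avg:.4f}, median = {med:.2f}")
if avg > med:
    print("mean > median, which is consistent with a distribution skewed to the right")
```

A mean noticeably larger than the median is what the note on skewness leads us to expect when a few high prices stretch the right tail.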

Percentiles
The pth percentile is the value that is greater than p% of the measurements in the ordered data.
Quartiles
Quartiles divide the data set into four equal parts: the first is the lower quartile, the second is the median, and the third is the upper quartile.
Box plot
A box plot describes the center of the data, the spread of the data, the extent and nature of any departure from symmetry, and the identification of outliers.
Order the n observations from smallest to largest and separate the smallest half from the largest half; the median $\tilde{X}$ is included in both halves if n is odd. Then the lower fourth is the median of the smallest half and the upper fourth is the median of the largest half. The fourth spread $f_s$ is
$$f_s = \text{upper fourth} - \text{lower fourth}$$
In general, a box plot is based on the five-number summary:
smallest value, lower fourth, median, upper fourth, largest value.

Constructing a box plot
1. Calculate the five-number summary and $f_s$.
2. Show the five numbers on a horizontal axis and draw a box above the axis from the lower fourth to the upper fourth; mark the median by a vertical line through the box.
3. Any observation farther than $1.5 f_s$ from the closest fourth is an outlier. An outlier is extreme if it is more than $3 f_s$ from the nearest fourth, and it is mild otherwise. Each mild outlier is represented by a closed circle and each extreme outlier by an open circle.
4. Draw two horizontal lines (whiskers) from the ends of the box to the largest and smallest observations that are not outliers.
Interpreting a box plot
Median line in center of box and whiskers of equal length: symmetric distribution
Median line left of center and long right whisker: skewed right
Median line right of center and long left whisker: skewed left
Measures of Variability
Data sets may have the same center but look different because of the way the numbers spread out from the center.
Range
R = largest measurement - smallest measurement


Variance
It measures the average squared deviation of the measurements about their mean.
$$\text{Population variance: } \sigma^2 = \frac{\sum_{i=1}^{N}(X_i-\mu)^2}{N} = \frac{\sum_{i=1}^{N}X_i^2 - \frac{\left(\sum_{i=1}^{N}X_i\right)^2}{N}}{N}$$
$$\text{Sample variance: } s^2 = \frac{\sum_{i=1}^{n}(X_i-\bar{X})^2}{n-1} = \frac{S_{xx}}{n-1} = \frac{\sum_{i=1}^{n}X_i^2 - \frac{\left(\sum_{i=1}^{n}X_i\right)^2}{n}}{n-1}$$
Note: $\sum X_i^2$ = sum of squares of the measurements, and $\left(\sum X_i\right)^2$ = square of the sum of the measurements.

Standard deviation
$$\text{Population standard deviation: } \sigma = \sqrt{\sigma^2}$$
$$\text{Sample standard deviation: } s = \sqrt{s^2}$$

If a constant c is added to or subtracted from every measurement, the variance does not change.
If every measurement is multiplied by a constant c, the variance changes to $c^2 s^2$ and the standard deviation changes to $|c| s$ (the same rule applies to division).
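The definition of the sample variance, the shortcut formula and the two rules about constants can be checked numerically. The sketch below is only an illustration; the function names and the six data values are hypothetical.

```python
def sample_variance(xs):
    """Definition: sum of squared deviations about the mean, divided by n - 1."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

def sample_variance_shortcut(xs):
    """Shortcut: S_xx = sum of squares minus (square of the sum) / n."""
    n = len(xs)
    sxx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    return sxx / (n - 1)

x = [2, 4, 4, 5, 7, 8]                                # hypothetical measurements
s2 = sample_variance(x)
s2_short = sample_variance_shortcut(x)
shifted = sample_variance([v + 10 for v in x])        # adding c leaves the variance unchanged
scaled = sample_variance([3 * v for v in x])          # multiplying by c gives c^2 * s^2

print(s2, s2_short, shifted, scaled)
```

The shortcut form is convenient for hand calculation, though with floating-point data the deviation form is usually the numerically safer of the two.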

Example: The amount of radiation received at a greenhouse plays an important role in determining the rate of photosynthesis. The accompanying observations on incoming solar radiation were read from a graph in the article "Radiation Components over Bare and Planted Soil in Greenhouse".
6.3, 6.4, 7.7, 8.4, 8.5, 8.8, 8.9, 9.0, 9.1, 10.0, 10.1, 10.2, 10.6, 10.6, 10.7, 10.7, 10.8, 10.9, 11.1, 11.2, 11.2, 11.4, 11.9, 11.9, 12.2, 13.1
Use some of the methods discussed in this chapter to describe and summarize these data.
Suggested Exercises for Chapter 1: 13, 21, 23, 27, 31, 33, 35, 39, 45, 51, 53, 57, 61, 65,
69

CHAPTER 2
PROBABILITY
Probability is used in inferential statistics as a tool for making statements about a population from sample information.
Experiment: a process for generating observations.
Sample space: the set of all possible outcomes of an experiment.
Event: a collection of one or more outcomes from the sample space, usually denoted by a capital letter.
Simple event: an event that cannot be decomposed.
Venn diagram: used to show the outcomes of an experiment; each simple event is shown as a point inside a box.
Tree diagram: used when the experiment is generated in several steps.
Some Relations Between Events
Union: The union of events A and B, denoted by $A \cup B$, is the event that contains all outcomes that are either in A or B or both.
Intersection: The intersection of events A and B, denoted by $A \cap B$, is the event that contains all outcomes that are in both A and B.
Complement: The complement of an event A, denoted by $A'$, is the event that contains all outcomes in the sample space S but not in A.
Two events are mutually exclusive or disjoint if they do not have any common outcome; when one event occurs, the other cannot, and vice versa.
Calculating Probability: P(A) is a measure of the chance that A will occur.

Calculating Probability by Using Relative Frequency:
$$P(A) = \lim_{n\to\infty} (\text{relative frequency}) = \lim_{n\to\infty} \frac{\text{frequency}}{n}$$
Frequency is the number of times that event A occurred.
n is the number of times that the experiment is repeated.

Calculating Probability by Using Sample Space:
First assign a probability to each simple event such that each probability is a number between 0 and 1 and the sum of all probabilities is 1; then the probability of event A is equal to the sum of the probabilities of the simple events contained in A.
Axioms of Probability
1. For any event A, $P(A) \ge 0$.
2. $P(S) = 1$.
3. If $A_1, A_2, A_3, \ldots$ is an infinite collection of disjoint events, then
$$P(A_1 \cup A_2 \cup A_3 \cup \cdots) = \sum_{i=1}^{\infty} P(A_i)$$

Properties of Probability
$P(A) = 1 - P(A')$ for any event A.
$P(A \cap B) = 0$ if A and B are mutually exclusive events.
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$ for any events A and B.
For any three events A, B, and C,
$$P(A \cup B \cup C) = P(A)+P(B)+P(C)-P(A\cap B)-P(A\cap C)-P(B\cap C)+P(A\cap B\cap C)$$
Example: Consider the following table

                             Used eyeglasses for reading
Judged to need eyeglasses    Yes       No
Yes                          0.44      0.14
No                           0.02      0.40

If an adult is selected at random from this large group, find the probability of each event:
a. The adult is judged to need eyeglasses.
b. The adult needs eyeglasses for reading but does not use them.
c. The adult uses eyeglasses for reading whether he or she needs them or not.

Counting Techniques
One method for computing probability uses simple events: when all simple events are equally likely,
$$P(A) = \frac{n(A)}{n}$$
n: number of simple events in the sample space.
n(A): number of simple events contained in A.
There are some rules for counting n and n(A), which are needed to calculate P(A).
The mn rule: If an experiment is performed in two stages, with m ways to accomplish the first stage and n ways to accomplish the second stage, then there are mn ways to accomplish the experiment.
If the experiment has k stages, then $n = n_1 n_2 \cdots n_k$, where $n_i$ is the number of ways to accomplish stage i.
Permutation: The number of ways to choose k objects in order from n distinct objects is
$$P_{k,n} = \frac{n!}{(n-k)!}$$
Combination: The number of ways to take r objects at a time from n distinct objects, without regard to order, is
$$C_{r,n} = \binom{n}{r} = \frac{n!}{r!(n-r)!}$$

Example: A university warehouse has received a shipment of 25 printers, of which 10 are laser printers and 15 are inkjet models. If 6 of these 25 are selected at random to be checked by a particular technician, what is the probability that exactly 3 of those selected are laser printers?
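The counting rules map directly onto `math.perm` and `math.comb` in Python 3.8 and later. The sketch below first evaluates one permutation and one combination count, then applies P(A) = n(A)/n to the printer example, assuming every set of 6 printers is equally likely to be selected; the variable names are illustrative.

```python
from math import comb, perm

ordered_pairs = perm(5, 2)     # P_{2,5}: ordered selections of 2 objects from 5
unordered_pairs = comb(5, 2)   # C_{2,5}: selections of 2 objects from 5, order ignored

# Printer example: 25 printers (10 laser, 15 inkjet), 6 selected at random.
n_total = comb(25, 6)                      # number of simple events in the sample space
n_event = comb(10, 3) * comb(15, 3)        # mn rule: 3 of 10 lasers AND 3 of 15 inkjets
p_three_lasers = n_event / n_total

print(ordered_pairs, unordered_pairs)
print(f"P(exactly 3 laser printers) = {p_three_lasers:.4f}")
```

The numerator uses the mn rule: choosing the lasers and choosing the inkjets are two stages of one selection.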
Conditional Probability
The conditional probability of A given that B has occurred is
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} \quad \text{if } P(B) \neq 0$$

Example: A new magazine publishes three columns entitled "Art" (A), "Book" (B), and "Cinema" (C). Reading habits of a randomly selected reader with respect to these columns are

Read regularly   A      B      C      A∩B    A∩C    B∩C    A∩B∩C
Probability      0.14   0.23   0.37   0.08   0.09   0.13   0.05

Find $P(A \mid B)$, $P(A \mid B \cup C)$, $P(A \cup B \mid C)$.

Multiplication Rule
$$P(A \cap B) = P(A \mid B)P(B) = P(B \mid A)P(A)$$
Law of Total Probability
If $A_1, A_2, \ldots, A_k$ are mutually exclusive and exhaustive events, then for any event B,
$$P(B) = P(B \mid A_1)P(A_1) + \cdots + P(B \mid A_k)P(A_k) = \sum_{i=1}^{k} P(B \mid A_i)P(A_i)$$
Bayes Rule
Let $A_1, A_2, \ldots, A_k$ be mutually exclusive and exhaustive events. If an event B with $P(B) > 0$ occurs, then
$$P(A_j \mid B) = \frac{P(A_j \cap B)}{P(B)} = \frac{P(B \mid A_j)P(A_j)}{\sum_{i=1}^{k} P(B \mid A_i)P(A_i)}, \quad j = 1, \ldots, k$$

Example: Only 1 in 1000 adults is afflicted with a rare disease for which a diagnostic test
has been developed. The test is such that when an individual actually has the disease, a
positive result will occur 99% of the time, whereas an individual without the disease will
show a positive test result only 2% of the time. If a randomly selected individual is tested
and the result is positive, what is the probability that the individual has the disease?
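The disease example is a direct application of the law of total probability followed by Bayes rule, with A1 = "has the disease" and A2 = "does not have the disease". A minimal sketch follows; the function name `bayes_posterior` is hypothetical and the numbers are the ones stated in the example.

```python
def bayes_posterior(priors, likelihoods, j):
    """P(A_j | B) from the priors P(A_i) and the likelihoods P(B | A_i)."""
    total = sum(p * l for p, l in zip(priors, likelihoods))   # law of total probability: P(B)
    return priors[j] * likelihoods[j] / total

priors = [0.001, 0.999]        # P(disease), P(no disease)
likelihoods = [0.99, 0.02]     # P(positive | disease), P(positive | no disease)

p_positive = sum(p * l for p, l in zip(priors, likelihoods))
p_disease_given_positive = bayes_posterior(priors, likelihoods, 0)

print(f"P(positive) = {p_positive:.5f}")
print(f"P(disease | positive) = {p_disease_given_positive:.4f}")
```

The posterior stays small even though the test is accurate, because the disease is rare and most positive results come from the much larger healthy group.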

Independence
Two events A and B are independent if
$$P(A \cap B) = P(A)P(B)$$
A, B and C are mutually independent if every pair of them is independent and
$$P(A \cap B \cap C) = P(A)P(B)P(C)$$

Example: Two cards are drawn from a deck of 52 cards. Calculate the probability that the draw includes an ace and a ten.

Suggested Exercises for Chapter 2: 3, 11, 13, 17, 21, 23, 25, 45, 47, 49, 51, 63, 71, 73,
77,79, 80, 83, 87, 91

CHAPTER 3
Random Variable
A random variable is a rule that associates a number with each outcome of an experiment (each outcome in the sample space S).
Bernoulli random variable: any random variable whose only possible values are 0 and 1.

Example: Give three examples of Bernoulli random variables.

There are two different types of random variables:
Discrete random variable: its possible values form a finite set or can be listed in an infinite sequence (typically integers).
Continuous random variable: its possible values consist of an entire interval on the number line.

Example: Three automobiles are selected at random, and each is categorized as having a diesel (S) or nondiesel (F) engine. If X = the number of cars among the three with diesel engines, list each outcome in the sample space and its associated X value.

Probability Distribution for Discrete Random Variables
The probability distribution of X determines how the total probability is distributed among the values of X. A probability distribution can be shown with a formula, a graph, or a table.
The probability distribution or probability mass function (pmf) of a discrete random variable, p(x) = P(X = x), satisfies two conditions:
1. $p(x) \ge 0$
2. $\sum_{\text{all possible } x} p(x) = 1$
Examples:
Airlines sometimes overbook flights. Suppose that for a plane with 50 seats, 55 passengers have tickets. Define the random variable Y as the number of ticketed passengers who actually show up for the flight. The probability mass function of Y appears in the accompanying table.

y      45    46    47    48    49    50    51    52    53    54    55
p(y)   .05   .10   .12   .14   .25   .17   .06   .05   .03   .02   .01

a. What is the probability that the flight will accommodate all ticketed passengers who show up?
b. What is the probability that not all ticketed passengers who show up can be accommodated?
An automobile service facility specializing in engine tune-ups knows that 45% of all tune-ups are done on four-cylinder automobiles, 40% on six-cylinder automobiles, and 15% on eight-cylinder automobiles. Let X = the number of cylinders on the next car to be tuned. What is the pmf of X?

A Parameter of a Probability Distribution
Suppose p(x) depends on a quantity that can be assigned any of a number of possible values, with each different value determining a different probability distribution. Such a quantity is called a parameter of the distribution. The collection of all probability distributions for different values of the parameter is called a family of probability distributions. For example,
$$p(x;\alpha) = \begin{cases} 1-\alpha & \text{if } x = 0 \\ \alpha & \text{if } x = 1 \\ 0 & \text{otherwise.} \end{cases}$$
Example: Starting at a fixed time, we observe the gender of each newborn child until a boy (B) is born. Let p = P(B), and define the random variable X by X = number of births observed. Then
$$p(x) = \begin{cases} (1-p)^{x-1}p & x = 1, 2, 3, \ldots \\ 0 & \text{otherwise.} \end{cases}$$

The Cumulative Distribution Function
The cumulative distribution function (cdf) F(x) of a discrete random variable X with pmf p(x) is defined for every number x by
$$F(x) = P(X \le x) = \sum_{y:\, y \le x} p(y)$$
For any number x, F(x) is the probability that the observed value of X will be at most x.
The cumulative distribution function for the random variable in the example above is
$$F(x) = \begin{cases} 1-(1-p)^{[x]} & x \ge 1 \\ 0 & x < 1, \end{cases}$$
where [x] denotes the largest integer less than or equal to x.

Example: The pmf of Y is

y      1     2     3     4
p(y)   .4    .3    .2    .1

Obtain the cdf of Y and graph it.

Based on the definition of the cdf, for any two numbers a and b with $a \le b$,
$$P(a \le X \le b) = F(b) - F(a^-)$$
where $a^-$ represents the largest possible X value that is strictly less than a. In particular, if the only possible values are integers and if a and b are integers, then
$$P(a \le X \le b) = F(b) - F(a-1)$$
Taking a = b yields $P(X = a) = F(a) - F(a-1)$.
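A discrete cdf is just a running sum of the pmf, which the following sketch makes concrete using the pmf of Y from the example (values 1 to 4 with probabilities .4, .3, .2, .1). The function names `cdf` and `prob_between` are illustrative, and `prob_between` assumes integer-valued X.

```python
pmf = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}

def cdf(x, pmf):
    """F(x) = P(X <= x): sum of p(y) over all possible values y <= x."""
    return sum(p for y, p in pmf.items() if y <= x)

def prob_between(a, b, pmf):
    """P(a <= X <= b) for integer-valued X, computed as F(b) - F(a - 1)."""
    return cdf(b, pmf) - cdf(a - 1, pmf)

for x in (0.5, 1, 2.7, 4):
    print(f"F({x}) = {cdf(x, pmf):.1f}")   # F is a step function: F(2.7) equals F(2)

p_2_to_3 = prob_between(2, 3, pmf)
p_equals_3 = prob_between(3, 3, pmf)       # taking a = b recovers the pmf
print(p_2_to_3, p_equals_3)
```

Because F jumps only at the possible values, its graph is a staircase with a jump of size p(y) at each y.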

Example: An insurance company offers its policyholders a number of different premium payment options. For a randomly selected policyholder, let X = the number of months between successive payments. The cdf of X is as follows:
$$F(x) = \begin{cases} 0 & x < 1 \\ 0.30 & 1 \le x < 3 \\ 0.40 & 3 \le x < 4 \\ 0.45 & 4 \le x < 6 \\ 0.60 & 6 \le x < 12 \\ 1 & 12 \le x \end{cases}$$
a. What is the pmf of X?
b. Using just the cdf, compute $P(3 \le X \le 6)$ and $P(4 \le X)$.
Expected Values of Discrete Random Variables
$$E(X) = \mu_X = \sum_{x \in D} x\,p(x)$$
where D is the set of possible values of X.
The expected value of a function h(X) is $E[h(X)] = \sum_{x \in D} h(x)p(x)$.
The expected value of a linear function is E(aX + b) = aE(X) + b; therefore, for any constants a and b,
E(aX) = aE(X)
E(X + b) = E(X) + b

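The Variance of a Random Variable
$$V(X) = \sigma_X^2 = \sum_{x \in D} (x-\mu)^2 p(x) = E[(X-\mu)^2]$$
Also
$$V(X) = E(X^2) - [E(X)]^2 = \Big[\sum_{x \in D} x^2 p(x)\Big] - \mu^2$$
The standard deviation of X is $\sigma_X = \sqrt{\sigma_X^2}$.
The variance of a function h(X) is
$$V[h(X)] = \sigma^2_{h(X)} = \sum_{x \in D} \big(h(x) - E[h(X)]\big)^2 p(x)$$
The variance of a linear function is $V(aX+b) = a^2\sigma_X^2$, and $\sigma_{aX+b} = |a|\sigma_X$.
Therefore
$$\sigma^2_{aX} = a^2\sigma_X^2, \qquad \sigma^2_{X+b} = \sigma_X^2$$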
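The expectation and variance formulas can be verified on a small pmf. The sketch below reuses the pmf with values 1 to 4 and probabilities .4, .3, .2, .1; the helper `expected` and the constants a = 3, b = 5 are hypothetical choices made for the illustration.

```python
pmf = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}

def expected(pmf, h=lambda x: x):
    """E[h(X)] = sum of h(x) * p(x) over the possible values x."""
    return sum(h(x) * p for x, p in pmf.items())

mu = expected(pmf)                                    # E(X)
var_def = expected(pmf, lambda x: (x - mu) ** 2)      # E[(X - mu)^2]
var_short = expected(pmf, lambda x: x * x) - mu ** 2  # E(X^2) - [E(X)]^2

a, b = 3, 5
pmf_lin = {a * x + b: p for x, p in pmf.items()}      # pmf of the linear function aX + b
mu_lin = expected(pmf_lin)
var_lin = expected(pmf_lin, lambda y: (y - mu_lin) ** 2)

print(mu, var_def, var_short)
print(mu_lin, var_lin)   # compare with a * mu + b and a**2 * var_def
```

Building the pmf of aX + b explicitly and recomputing its mean and variance is a direct check of the two linear-function rules.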
x+b

Example: The random variable X has following pmf


x
p(x)

0.08 0.15 0.45 0.27 0.05

Compute
a. E(X)
b. V (X)
c. The standard deviation of X.

Binomial Distribution
A binomial experiment is one that has these five characteristics:
1. The experiment consists of n identical trials.
2. Each trial results in one of two outcomes. One outcome is called a success, S, and the other a failure, F.
3. The probability of success on a single trial is equal to p and the probability of failure is equal to 1 - p = q.
4. The trials are independent.
5. We are interested in X, the number of successes observed during the n trials, for X = 0, 1, ..., n.

Example: Determine whether the following experiments are binomial:
Check 100 births to find the proportion of boys.
A shipment contains 30 computers, 2 of which are defective; a purchaser wants to check 3 of them to decide whether to accept or reject the shipment.
In a population there are 500,000 licensed drivers, of whom 400,000 are insured; a sample of 10 drivers is chosen without replacement.
When the sample comes from a large population, the probability of success p stays about the same from trial to trial.
Rule of thumb: When sampling without replacement, if the sample size n is at most 5% of the population size, the experiment can be treated as binomial provided it satisfies the other conditions.
The Binomial Probability Distribution
A binomial experiment consists of n identical trials with probability of success p on each trial. Because the pmf of a binomial rv X depends on the two parameters n and p, the pmf is denoted by b(x; n, p). The probability of x successes in n trials is equal to
$$b(x;n,p) = \begin{cases} C_x^n\, p^x q^{n-x} = \frac{n!}{x!(n-x)!}\, p^x (1-p)^{n-x} & x = 0, 1, 2, \ldots, n \\ 0 & \text{otherwise.} \end{cases}$$
Mean and Standard Deviation for the Binomial Probability Distribution
The random variable X, the number of successes in n trials, has a probability distribution with
Mean: $\mu = np$
Variance: $\sigma^2 = npq$
Standard deviation: $\sigma = \sqrt{npq}$

Example: A marksman hits a target 80% of the time. He fires five shots at the target.
What is the probability that exactly 3 shots hit the target? What is the probability that
more than 3 shots hit the target?
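The marksman example can be evaluated directly from the binomial pmf with n = 5 and p = 0.8. The sketch below is illustrative; the function name `binom_pmf` is an assumption, and it treats the five shots as independent trials, as the binomial model requires.

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """b(x; n, p) = C(n, x) * p^x * (1 - p)^(n - x) for x = 0, ..., n, else 0."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 5, 0.8
p_exactly_3 = binom_pmf(3, n, p)
p_more_than_3 = binom_pmf(4, n, p) + binom_pmf(5, n, p)

mean = n * p                      # mu = np
sd = sqrt(n * p * (1 - p))        # sigma = sqrt(npq)

print(f"P(X = 3) = {p_exactly_3:.4f}")
print(f"P(X > 3) = {p_more_than_3:.5f}")
print(f"mean = {mean}, sd = {sd:.4f}")
```

Summing `binom_pmf(x, n, p)` over x = 0, ..., n should give 1, which is a quick sanity check on any hand calculation.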
Cumulative Probability Tables
You can use the cumulative probability tables to find probabilities for selected binomial distributions.
Find the table for the correct value of n.
Find the column for the correct value of p.
The row marked x gives the cumulative probability, $P(X \le x) = P(X = 0) + \cdots + P(X = x)$.
Example: Let X be a binomial random variable with n = 20 and p = 0.1.
a. Calculate $P(X \le 4)$.
b. Calculate the mean and standard deviation of the random variable X.
c. Calculate the intervals $\mu \pm \sigma$, $\mu \pm 2\sigma$, and $\mu \pm 3\sigma$. Find the probability that an observation will fall into each of these intervals.
Hypergeometric Distribution
A bowl contains M red balls and N - M white balls, for a total of N balls in the bowl. Select n balls from the bowl and record x, the number of red balls. If a success is defined to be a red ball, then x is a hypergeometric random variable.
The Hypergeometric Probability Distribution
A population contains M successes and N - M failures. The probability of exactly x successes in a random sample of size n is
$$P(X = x) = h(x; n, M, N) = \frac{C_x^M\, C_{n-x}^{N-M}}{C_n^N}, \quad \max(0,\, n-N+M) \le x \le \min(n, M)$$
The mean and variance of a hypergeometric random variable X are
$$\mu = n\left(\frac{M}{N}\right), \qquad \sigma^2 = n\left(\frac{M}{N}\right)\left(\frac{N-M}{N}\right)\left(\frac{N-n}{N-1}\right)$$

Example: A candy dish contains five blue and three red candies. A child reaches up and
selects three candies without looking.
a. What is probability that there are two blue and one red candies in the selection?
b. What is the probability that the candies are all red?
c. What is the probability that the candies are all blue?
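For the candy example, one possible setup is to call a blue candy a success, so that M = 5, N = 8 and n = 3. The sketch below evaluates h(x; n, M, N) for the three parts; the function name `hypergeom_pmf` and the choice of blue as the success are assumptions made for illustration.

```python
from math import comb

def hypergeom_pmf(x, n, M, N):
    """h(x; n, M, N): probability of x successes in a sample of n from N objects, M of them successes."""
    if x < max(0, n - N + M) or x > min(n, M):
        return 0.0
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

n, M, N = 3, 5, 8                            # 3 candies drawn; 5 blue among 8 in total
p_two_blue_one_red = hypergeom_pmf(2, n, M, N)
p_all_red = hypergeom_pmf(0, n, M, N)        # no blue candies selected
p_all_blue = hypergeom_pmf(3, n, M, N)

mean = n * M / N
variance = n * (M / N) * ((N - M) / N) * ((N - n) / (N - 1))

print(p_two_blue_one_red, p_all_red, p_all_blue)
print(mean, variance)
```

Taking red as the success instead (M = 3) would give the same three probabilities with x replaced by 3 - x.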

The Negative Binomial Distribution
The negative binomial distribution is based on an experiment satisfying the following conditions:
1. The experiment consists of a sequence of independent trials.
2. Each trial can result in either a success (S) or a failure (F).
3. The probability of success is constant from trial to trial, so P(S on trial i) = p for i = 1, 2, 3, ....
4. The experiment continues (trials are performed) until a total of r successes have been observed, where r is a specified positive integer.
The random variable of interest is X = the number of failures that precede the rth success.
The pmf of the negative binomial rv X with parameters r = number of successes and p = P(success) is
$$nb(x; r, p) = C_{r-1}^{x+r-1}\, p^r (1-p)^x, \quad x = 0, 1, 2, \ldots$$
If X is a negative binomial rv with pmf nb(x; r, p), then
$$E(X) = \frac{r(1-p)}{p}, \qquad V(X) = \frac{r(1-p)}{p^2}$$
Examples:
An instructor who taught two sections of engineering statistics last term, the first with 20 students and the second with 30, decided to assign a term project. After all projects had been turned in, the instructor randomly ordered them before grading. Consider the first 15 graded projects.
a. What is the probability that exactly 10 of these are from the second section?
b. What is the probability that at least 10 of these are from the second section?
c. What is the probability that at least 10 of these are from the same section?
d. What are the mean value and standard deviation of the number among these 15 that are from the second section?
e. What are the mean value and standard deviation of the number of projects not among these first 15 that are from the second section?
A family decides to have children until it has three children of the same gender. Assuming P(B) = P(G) = 0.5, what is the pmf of X = the number of children in the family?

Poisson Distribution
The Poisson random variable X is a model for data that represent the number of occurrences of a specified event in a given unit of time or space.
Examples:
The number of calls received by a switchboard during a given period of time.
The number of machine breakdowns in a day.
The number of traffic accidents at a given intersection during a given time period.
The Poisson Probability Distribution
Let $\lambda$ be the average number of times that an event occurs in a certain period of time or space. A random variable X is said to have a Poisson distribution with parameter $\lambda$ ($\lambda > 0$) if the pmf of X is
$$p(x; \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \ldots$$
The mean and variance of the Poisson random variable X are
Mean: $E(X) = \lambda$
Variance: $V(X) = \lambda$
Example: Suppose pulses arrive at the counter at an average rate of six per minute, what
is the probability that in a 0.5-min interval at least one pulse is received?
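A sketch of the pulse example follows. It assumes, as is usual for this kind of problem, that the number of pulses in an interval of length t is Poisson with parameter equal to rate × t, so that here the parameter is 6 × 0.5 = 3; the function and variable names are illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """p(x; lambda) = lambda^x * e^(-lambda) / x!"""
    return lam ** x * exp(-lam) / factorial(x)

rate = 6            # average pulses per minute
t = 0.5             # interval length in minutes
lam = rate * t      # assumed Poisson parameter for the 0.5-minute interval

p_at_least_one = 1 - poisson_pmf(0, lam)   # complement of "no pulses"
print(f"lambda = {lam}, P(X >= 1) = {p_at_least_one:.4f}")
```

Using the complement avoids summing the infinite tail p(1) + p(2) + ....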
Cumulative Probability Tables
You can use the cumulative probability tables to find probabilities for selected Poisson distributions.
Find the column for the correct value of $\lambda$.
The row marked k gives the cumulative probability, $P(X \le k) = P(X = 0) + \cdots + P(X = k)$.
The Poisson Approximation to the Binomial Distribution
The Poisson probability distribution provides a simple, easy-to-compute, and accurate approximation to binomial probabilities when n is large and $\lambda = np$ is small, preferably with n > 50 and np < 5, i.e.
$$b(x; n, p) \approx p(x; \lambda) \quad \text{when } n \to \infty,\ p \to 0$$

Examples:
1. The number X of people entering the intensive care unit at a particular hospital on any one day has a Poisson probability distribution with mean equal to five persons per day.
a. What is the probability that the number of people entering the intensive care unit on a particular day is two? Less than or equal to two?
b. Is it likely that X will exceed 10? Explain.
2. Sporadic outbreaks of E. coli have occurred at a rate of 2.5 per 100,000 for a period of one year.
a. What is the probability that at most five cases of E. coli per 100,000 are reported in a given year?
b. What is the probability that more than five cases of E. coli per 100,000 are reported in a given year?

Suggested Exercises from Chapter 3: 7, 11, 13, 17, 23, 29, 39, 47, 49, 55, 57, 65,
69, 71, 73, 79, 81, 85, 95, 97, 101, 103, 109,

CHAPTER 4
Continuous Random Variables and Probability Distributions
Basic definitions and properties of continuous random variables
Continuous random variable: A random variable is continuous if its set of possible values is an entire interval of numbers.
Probability distribution for continuous variables: It is possible to construct a probability histogram (the same as a relative frequency histogram) for a continuous variable. As the variable is measured more and more finely, the resulting histogram approaches a smooth curve. The total area under this curve is 1, and the probability that the variable lies between two points is the area under the curve between those points. That is, the probability distribution or probability density function (pdf) of a continuous random variable X is a function f(x) such that
$$P(a \le X \le b) = \int_a^b f(x)\,dx$$
f(x) is a pdf if it satisfies the following two conditions:
$f(x) \ge 0$ for all x
$\int_{-\infty}^{\infty} f(x)\,dx = 1$ = area under the entire curve of f(x)

Example: Suppose that X has the following density function
$$f(x) = \begin{cases} 0.5x & 0 \le x \le 2 \\ 0 & \text{otherwise.} \end{cases}$$
Calculate
a. $P(X \le 1)$
b. $P(0.5 \le X \le 1.5)$
c. $P(1.5 \le X)$

Note: For a continuous random variable, P(X = c) = 0 for any number c, so
$$P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b)$$

Example: Consider the following function
$$f(x) = \begin{cases} 0.15\,e^{-0.15(x-0.5)} & x \ge 0.5 \\ 0 & \text{otherwise.} \end{cases}$$
a. Verify that f(x) is a pdf.
b. Compute $P(X \le 5)$.

Uniform Distribution
A continuous random variable X has a uniform distribution on the interval [a, b] if the pdf of X is
$$f(x; a, b) = \begin{cases} \frac{1}{b-a} & a \le x \le b \\ 0 & \text{otherwise.} \end{cases}$$
Cumulative Distribution Function
The cumulative distribution function (cdf) for a continuous random variable is
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy \quad \text{for every number } x$$
F(x) is the area under the density curve to the left of x.

Example: Find the cdf for the uniform distribution on [a, b] and then graph it.

As for a discrete random variable, the probabilities of intervals can be computed from F(x) as
$$P(X > a) = 1 - F(a), \qquad P(a \le X \le b) = F(b) - F(a)$$

Example: Solve example 1 by using the cdf.
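Assuming "example 1" refers to the density f(x) = 0.5x on [0, 2], integrating gives F(x) = x²/4 on that interval, and the three probabilities follow from the cdf alone. The sketch below is one way to carry this out; the function name `F` mirrors the notation of the notes.

```python
def F(x):
    """cdf of the density f(x) = 0.5x on [0, 2]: F(x) = x^2 / 4 there, 0 below, 1 above."""
    if x < 0:
        return 0.0
    if x > 2:
        return 1.0
    return x ** 2 / 4

p_a = F(1)                 # P(X <= 1)
p_b = F(1.5) - F(0.5)      # P(0.5 <= X <= 1.5) = F(b) - F(a)
p_c = 1 - F(1.5)           # P(1.5 <= X) = 1 - F(1.5)

print(p_a, p_b, p_c)
```

No integration is needed once F is known, which is the point of working with the cdf.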

If X is a continuous random variable with cdf F(x), then at every x at which the derivative F'(x) exists,
$$F'(x) = f(x)$$
The median $\tilde{\mu}$ of a continuous random variable satisfies $0.5 = F(\tilde{\mu})$.
In general, the (100p)th percentile of the distribution of a continuous random variable, denoted by $\eta(p)$, is defined by
$$p = F(\eta(p)) = \int_{-\infty}^{\eta(p)} f(y)\,dy$$

Expected Value for Continuous Random Variables
The expected value or mean of a continuous random variable X with pdf f(x) is
$$\mu_X = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$$
The expected value of a function h(X) is
$$E(h(X)) = \mu_{h(X)} = \int_{-\infty}^{\infty} h(x) f(x)\,dx$$

Variance of Continuous Random Variables
The variance of a continuous random variable X with pdf f(x) and mean value $\mu$ is
$$\sigma_X^2 = V(X) = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx = E[(X-\mu)^2]$$
and the standard deviation of X is
$$\sigma_X = \sqrt{V(X)}$$
An easier way to compute the variance is the formula
$$V(X) = E(X^2) - [E(X)]^2$$
Example: The pdf of X is
$$f(x) = \begin{cases} 90x^8(1-x) & 0 < x < 1 \\ 0 & \text{otherwise.} \end{cases}$$
a. Obtain the cdf of X.
b. What is $P(X \le 0.5)$?
c. What is $P(0.25 \le X < 0.5)$?
d. What is the 75th percentile of the distribution?
e. What is the probability that X is within 1 standard deviation of its mean value?

The Normal Distribution
The normal distribution is the most important distribution in probability and statistics, because it fits a large number of random variables, such as weights and heights. A continuous random variable X has a normal distribution with parameters $\mu$ and $\sigma^2$ if the pdf of X is
$$f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2/(2\sigma^2)}, \quad -\infty < x < \infty,\ \sigma > 0,\ -\infty < \mu < \infty$$
$\mu$ and $\sigma$ are the parameters of the normal distribution, and $X \sim N(\mu, \sigma^2)$ means that the random variable X has a normal distribution with parameters $\mu$ and $\sigma^2$.
The normal distribution is symmetric about $\mu$, so
$$\int_{-\infty}^{\mu} f(x)\,dx = \int_{\mu}^{\infty} f(x)\,dx = 0.5$$
When $X \sim N(\mu, \sigma^2)$, computing $P(a \le X \le b)$ requires evaluating
$$\int_a^b \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2/(2\sigma^2)}\,dx$$
To evaluate this expression, use the standard normal distribution ($\mu = 0$, $\sigma = 1$), which is tabulated.
A normal random variable with $\mu = 0$ and $\sigma = 1$ is said to have a standard normal distribution and is denoted by Z. The pdf of Z is
$$f(z; 0, 1) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$$
The cdf of Z is $P(Z \le z) = \int_{-\infty}^{z} f(y; 0, 1)\,dy$, which is denoted by $\Phi(z)$.

Example: Compute the following probabilities:
$P(0 \le Z \le 2.17)$, $P(-2.5 \le Z \le 0)$, $P(-2.5 \le Z \le 2.5)$, $P(1.5 \le Z)$, $P(|Z| \le 2.5)$

$z_\alpha$ Notation
$z_\alpha$ denotes the value on the z axis for which $\alpha$ of the area under the z curve lies to the right of $z_\alpha$. Thus $z_\alpha$ is the $100(1-\alpha)$th percentile of the standard normal distribution.

Examples:
1. Determine the value of c that makes the probability statement correct.
$\Phi(c) = 0.9838$, $P(c \le Z) = 0.121$, $P(c \le |Z|) = 0.016$
2. Find the following percentiles for the standard normal distribution: 75th, 9th
3. Determine $z_\alpha$ for the following: $\alpha = 0.0055$, $\alpha = 0.663$

Normal Distribution: If $X \sim N(\mu, \sigma^2)$, then $Z = \frac{X-\mu}{\sigma}$ has a standard normal distribution; thus
$$P(a \le X \le b) = P\left(\frac{a-\mu}{\sigma} \le Z \le \frac{b-\mu}{\sigma}\right) = \Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right)$$
If the population distribution of a variable is (approximately) normal, then
1. Roughly 68% of the values are within 1 SD of the mean
2. Roughly 95% of the values are within 2 SDs of the mean
3. Roughly 99.7% of the values are within 3 SDs of the mean
In general, the (100p)th percentile of any normal distribution is related to the (100p)th percentile of the standard normal distribution by
(100p)th percentile for $N(\mu, \sigma^2)$ = $\mu$ + [(100p)th percentile for standard normal] · $\sigma$
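Where a table of the standard normal cdf is not at hand, `statistics.NormalDist` from the Python standard library (3.8+) can stand in for it. The sketch below illustrates standardizing, the 68/95/99.7 rule and the percentile relation; the parameters mu = 100 and sigma = 15 are hypothetical values chosen for the illustration.

```python
from statistics import NormalDist

Z = NormalDist()          # standard normal: mu = 0, sigma = 1
phi = Z.cdf               # phi(z) plays the role of the tabulated cdf

p_0_to_217 = phi(2.17) - phi(0)          # P(0 <= Z <= 2.17)

def normal_prob(a, b, mu, sigma):
    """P(a <= X <= b) for X ~ N(mu, sigma^2), computed by standardizing."""
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)

mu, sigma = 100.0, 15.0   # hypothetical parameters
within = [normal_prob(mu - k * sigma, mu + k * sigma, mu, sigma) for k in (1, 2, 3)]

z75 = Z.inv_cdf(0.75)     # 75th percentile of the standard normal
x75 = mu + z75 * sigma    # 75th percentile of N(mu, sigma^2)

print(f"P(0 <= Z <= 2.17) = {p_0_to_217:.4f}")
print([round(w, 4) for w in within])
print(f"z75 = {z75:.4f}, x75 = {x75:.2f}")
```

The values in `within` do not depend on the particular mu and sigma, which is why the empirical rule can be stated for any normal population.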
The Normal Approximation to the Binomial Distribution
When $X \sim \text{Bin}(n, p)$, the mean and standard deviation are $\mu = np$ and $\sigma = \sqrt{npq}$, where $q = 1 - p$. If the probability histogram is not too skewed, X has approximately a normal distribution with the same mean and standard deviation. Because a discrete distribution is being approximated by a continuous one, finding probabilities from the normal distribution requires a continuity correction, for example:
$$P(X \le x) \approx (\text{area under the normal curve to the left of } x + 0.5) = \Phi\left(\frac{x + 0.5 - np}{\sqrt{npq}}\right)$$
The condition for this approximation is both $np \ge 10$ and $nq \ge 10$.
Example: Suppose only 40% of all drivers in a certain state wear a seatbelt.
A random sample of 500 drivers is selected, what is the probability that
a. Between 180 and 230 (inclusive) of the drivers in the sample wear a seatbelt?
b. Fewer than 170 of those in the sample wear a seatbelt? Fewer than 150?
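
The continuity-corrected approximation can be checked numerically for the seatbelt example. The following is a minimal sketch using Python's standard library; the helper name `binom_normal_cdf` is an assumption made for illustration, not something from the textbook.

```python
from math import sqrt
from statistics import NormalDist

def binom_normal_cdf(x, n, p):
    """Approximate P(X <= x) for X ~ Bin(n, p) with the continuity correction."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return NormalDist().cdf((x + 0.5 - mu) / sigma)

n, p = 500, 0.40  # np = 200 and nq = 300, so both conditions hold

# a. P(180 <= X <= 230) = P(X <= 230) - P(X <= 179)
p_between = binom_normal_cdf(230, n, p) - binom_normal_cdf(179, n, p)
# b. "Fewer than 170" means X <= 169; "fewer than 150" means X <= 149
p_fewer_170 = binom_normal_cdf(169, n, p)
p_fewer_150 = binom_normal_cdf(149, n, p)

print(round(p_between, 4), round(p_fewer_170, 4), p_fewer_150)
```

Note that "fewer than 170" is translated to $X \le 169$ before the 0.5 correction is applied.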
The Gamma Distribution
The normal distribution is bell shaped and symmetric, but many random
variables have skewed distributions. To model such variables, first define
the gamma function. For $\alpha > 0$, the gamma function $\Gamma(\alpha)$ is
$$\Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1} e^{-x}\,dx$$

This function has following properties:


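1. $\Gamma(\alpha) = (\alpha - 1)\,\Gamma(\alpha - 1)$ for $\alpha > 1$
2. $\Gamma(n) = (n - 1)!$ for any positive integer n
3. $\Gamma(\tfrac{1}{2}) = \sqrt{\pi}$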
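In general, a continuous random variable X has a gamma distribution if the pdf
of X is
$$f(x; \alpha, \beta) = \begin{cases} \dfrac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\, x^{\alpha-1} e^{-x/\beta} & x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$
where $\alpha$ and $\beta$ are the parameters of the distribution, with $\alpha > 0$ and $\beta > 0$.
For the standard gamma distribution $\beta = 1$, so the pdf of a standard gamma random variable is
$$f(x; \alpha) = \begin{cases} \dfrac{x^{\alpha-1} e^{-x}}{\Gamma(\alpha)} & x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$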

The mean and variance of a gamma random variable are
$$E(X) = \mu = \alpha\beta, \qquad V(X) = \sigma^2 = \alpha\beta^2$$

Computing probabilities for Gamma distribution


As with the normal distribution, probabilities for a gamma random variable are
found from a table of the standard gamma cdf. Dividing
x by $\beta$ converts any gamma distribution to the standard gamma, so
$$P(X \le x) = F(x; \alpha, \beta) = F\left(\frac{x}{\beta}; \alpha\right)$$

Examples:
1. Evaluate the following:
$\Gamma(6)$
$\Gamma(5/2)$
$F(5; 4)$
Let X have a standard gamma distribution with $\alpha = 7$. Evaluate
$P(X \le 5)$
$P(3 < X < 8)$
$P(X < 4 \text{ or } X > 6)$
2. Suppose the time taken by a homeowner to mow his lawn is a random
variable X having a gamma distribution with parameters $\alpha = 2$ and $\beta = \tfrac{1}{2}$.
What is the probability that it takes:
a. At most 1 hour to mow the lawn?
b. At least 2 hours to mow the lawn?
c. Between 0.5 and 1.5 hours to mow the lawn?
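
When $\alpha$ is a positive integer, the standard gamma cdf has the closed form $F(x; \alpha) = 1 - e^{-x}\sum_{k=0}^{\alpha-1} x^k/k!$, so no table is needed. This closed form is a standard result that the notes do not state. The sketch below uses it for the lawn-mowing example; the function names are assumptions for illustration.

```python
from math import exp, factorial, gamma

def std_gamma_cdf_int(x, alpha):
    """Standard gamma cdf F(x; alpha), valid only for positive integer alpha."""
    if x <= 0:
        return 0.0
    return 1.0 - exp(-x) * sum(x**k / factorial(k) for k in range(alpha))

def gamma_cdf_int(x, alpha, beta):
    """P(X <= x) = F(x / beta; alpha): rescale by beta, then use the standard cdf."""
    return std_gamma_cdf_int(x / beta, alpha)

# Gamma function values from Example 1
gamma_6 = gamma(6)      # (6 - 1)! = 120
gamma_5_2 = gamma(2.5)  # (3/2)(1/2) * sqrt(pi)

# Example 2: alpha = 2, beta = 1/2
alpha, beta = 2, 0.5
p_at_most_1 = gamma_cdf_int(1, alpha, beta)
p_at_least_2 = 1 - gamma_cdf_int(2, alpha, beta)
p_between = gamma_cdf_int(1.5, alpha, beta) - gamma_cdf_int(0.5, alpha, beta)

print(gamma_6, round(gamma_5_2, 4))
print(round(p_at_most_1, 3), round(p_at_least_2, 3), round(p_between, 3))
```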

The Exponential Distribution


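X has an exponential distribution with parameter $\lambda > 0$ if the pdf of X is
$$f(x; \lambda) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$
The exponential distribution is a special case of the gamma distribution with $\alpha = 1$
and $\beta = 1/\lambda$, so
$$\mu = \alpha\beta = \frac{1}{\lambda}, \qquad \sigma^2 = \alpha\beta^2 = \frac{1}{\lambda^2}$$
The cdf of an exponential random variable is
$$F(x; \lambda) = \begin{cases} 1 - e^{-\lambda x} & x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$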
The exponential distribution has two important applications:
1. It is a good model for the distribution of times between
the occurrence of two successive events. As before, the number of events
in a time interval follows a Poisson distribution. It can be shown that if
the number of events in an interval of length t is Poisson with parameter
$\mu = \alpha t$, where $\alpha$ is the average number of events in a unit of time,
then the distribution of time between two successive
events is exponential with parameter $\lambda = \alpha$.
2. Memoryless property:
$$P(X \ge t + t_0 \mid X \ge t_0) = \frac{P[(X \ge t + t_0) \cap (X \ge t_0)]}{P(X \ge t_0)} = \frac{P(X \ge t + t_0)}{P(X \ge t_0)} = \frac{1 - F(t + t_0; \lambda)}{1 - F(t_0; \lambda)} = e^{-\lambda t}$$
This property is useful for modeling component lifetimes. It
means that the distribution of additional lifetime is exactly the same as the original distribution of lifetime; in other words, the distribution of the
remaining lifetime is independent of the current age.
Example:
The exponential distribution with mean value 6 MPa is used as a model
for the distribution of stress range in certain bridge connections. Find
a) Probability that stress range is at most 10 MPa.
b) Probability that stress range is between 5 and 10 MPa.
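
Both parts follow directly from the exponential cdf with $\lambda = 1/6$. The sketch below also checks the memoryless property numerically; the values t = 4 and t0 = 3 are arbitrary and chosen only for illustration.

```python
from math import exp

def expon_cdf(x, lam):
    """Exponential cdf: F(x; lambda) = 1 - exp(-lambda * x) for x >= 0, else 0."""
    return 1.0 - exp(-lam * x) if x >= 0 else 0.0

lam = 1 / 6  # the mean is 1 / lambda = 6 MPa

p_at_most_10 = expon_cdf(10, lam)                   # part a
p_5_to_10 = expon_cdf(10, lam) - expon_cdf(5, lam)  # part b

# Memoryless check: P(X >= t + t0 | X >= t0) should equal P(X >= t)
t, t0 = 4.0, 3.0  # hypothetical values
conditional = (1 - expon_cdf(t + t0, lam)) / (1 - expon_cdf(t0, lam))
unconditional = 1 - expon_cdf(t, lam)

print(round(p_at_most_10, 4), round(p_5_to_10, 4))
print(round(conditional, 6), round(unconditional, 6))
```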

Suggested Exercises for Chapter 4: 3, 5, 9, 11, 13, 19, 21, 33, 35,
37, 41, 45, 47, 53, 55, 59, 63, 65, 67, 99, 101, 105, 107

CHAPTER 5
 Jointly Distributed Random Variable
In some situations an experiment involves more than one variable, and the researcher is
interested in studying the joint behavior of several variables at the same time.
Joint Probability Mass Function for Two Discrete Random Variables:
Let X and Y be discrete random variables. The joint pmf p(x, y) is defined for each pair of
numbers (x, y) by
$$p(x, y) = P(X = x \text{ and } Y = y)$$
Then the probability $P[(X, Y) \in A]$ is found by
$$P[(X, Y) \in A] = \sum\sum_{(x,y) \in A} p(x, y)$$

The marginal pmfs of X and Y are
$$p_X(x) = \sum_y p(x, y), \qquad p_Y(y) = \sum_x p(x, y)$$
X and Y are independent if, for every pair of x and y,
$$p(x, y) = p_X(x) \cdot p_Y(y)$$

Example: The joint pmf of X and Y appears in the accompanying tabulation.

                 y
p(x,y)     0      1      2
x    0    .10    .04    .02
     1    .08    .20    .06
     2    .06    .14    .30

a. What is P(X = 1 and Y = 1)?
b. Compute $P(X \le 1 \text{ and } Y \le 1)$.
c. Give a word description of the event $\{X \ne 0 \text{ and } Y \ne 0\}$ and compute the probability of
this event.
d. Compute the marginal pmf of X and of Y. What is $P(X \le 1)$?
e. Are X and Y independent rvs?
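
The marginal pmfs and the independence check are row and column sums over the table. A minimal sketch with the joint pmf stored as a nested dictionary (the variable names are assumptions for illustration):

```python
# Joint pmf from the tabulation: p[x][y]
p = {
    0: {0: 0.10, 1: 0.04, 2: 0.02},
    1: {0: 0.08, 1: 0.20, 2: 0.06},
    2: {0: 0.06, 1: 0.14, 2: 0.30},
}
xs = sorted(p)
ys = sorted(p[0])

# Marginal pmfs: sum the joint pmf over the other variable
p_x = {x: sum(p[x][y] for y in ys) for x in xs}
p_y = {y: sum(p[x][y] for x in xs) for y in ys}

# Event probabilities from parts (b) and (c)
p_both_le_1 = sum(p[x][y] for x in xs for y in ys if x <= 1 and y <= 1)
p_both_nonzero = sum(p[x][y] for x in xs for y in ys if x != 0 and y != 0)

# Independence requires p(x, y) = p_X(x) * p_Y(y) for EVERY pair
independent = all(
    abs(p[x][y] - p_x[x] * p_y[y]) < 1e-9 for x in xs for y in ys
)

print(p_x, p_y)
print(round(p_both_le_1, 2), round(p_both_nonzero, 2), independent)
```

A single pair that violates the product rule is enough to rule out independence; here p(0, 0) = .10 while $p_X(0)\,p_Y(0) = (.16)(.24) = .0384$.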

Joint Probability Density Function for Two Continuous Random Variables:
For two continuous random variables X and Y with joint pdf f(x, y), and for any two-dimensional set A,
$$P[(X, Y) \in A] = \iint_A f(x, y)\,dx\,dy$$
If A is a rectangle $\{(x, y) : a \le x \le b,\ c \le y \le d\}$, then
$$P[(X, Y) \in A] = P(a \le X \le b,\ c \le Y \le d) = \int_a^b \int_c^d f(x, y)\,dy\,dx$$
The marginal pdfs of X and Y are
$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy \quad \text{for } -\infty < x < \infty$$
$$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx \quad \text{for } -\infty < y < \infty$$
Two continuous random variables X and Y are independent if, for every pair of x and y,
$$f(x, y) = f_X(x) \cdot f_Y(y)$$

Example: Each front tire on a particular type of vehicle is supposed to be filled to a pressure
of 26 psi. Suppose the actual air pressure in each tire is a random variable, X for the right
tire and Y for the left tire, with joint pdf
$$f(x, y) = \begin{cases} K(x^2 + y^2) & 20 \le x \le 30,\ 20 \le y \le 30 \\ 0 & \text{otherwise} \end{cases}$$
a. What is the value of K?
b. What is the probability that both tires are underfilled?
c. What is the probability that the difference in air pressure between the two tires is at most
2 psi?
d. Determine the distribution of air pressure in the right tire alone.
e. Are X and Y independent rvs?

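For two continuous rvs X and Y, the conditional pdf of Y given that X = x is
$$f_{Y|X}(y \mid x) = \frac{f(x, y)}{f_X(x)}, \quad -\infty < y < \infty$$
If X and Y are discrete, the conditional pmf of Y given that X = x is
$$p_{Y|X}(y \mid x) = \frac{p(x, y)}{p_X(x)}, \quad -\infty < y < \infty$$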

Expected Values, Covariance, and Correlation


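The expected value of a function h(X, Y), denoted by $E[h(X, Y)]$ or $\mu_{h(X,Y)}$, is
$$E[h(X, Y)] = \begin{cases} \sum_x \sum_y h(x, y)\, p(x, y) & \text{if X and Y are discrete} \\ \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(x, y)\, f(x, y)\,dx\,dy & \text{if X and Y are continuous} \end{cases}$$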

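The covariance between two random variables X and Y is
$$\text{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = \begin{cases} \sum_x \sum_y (x - \mu_X)(y - \mu_Y)\, p(x, y) & \text{X and Y discrete} \\ \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x - \mu_X)(y - \mu_Y)\, f(x, y)\,dx\,dy & \text{X and Y continuous} \end{cases}$$
Also
$$\text{Cov}(X, Y) = E(XY) - \mu_X \mu_Y$$
The correlation coefficient of two random variables is
$$\text{Corr}(X, Y) = \rho_{X,Y} = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y}$$
and has the following properties:
Corr(aX + b, cY + d) = Corr(X, Y) if a and c have the same sign (both positive or both
negative).
$-1 \le \rho_{X,Y} \le 1$
$\rho_{X,Y} = 1$ or $-1$ if and only if $Y = aX + b$ for some $a \ne 0$.
If X and Y are independent, then $\rho_{X,Y} = 0$.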
Example: Consider the following joint pmf

                  y
p(x,y)     0      5      10     15
x    0    .02    .06    .02    .10
     5    .04    .15    .20    .10
    10    .01    .15    .14    .01

a. What is E(X + Y)?
b. What is the expected value of the maximum of X and Y?
c. Compute the covariance of X and Y.
d. Compute $\rho$ for X and Y.
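
Each part of this example is an expected value of some function h(x, y) under the joint pmf, so one helper covers all four. A minimal sketch (the helper name `expect` is an assumption for illustration):

```python
from math import sqrt

xs = [0, 5, 10]
ys = [0, 5, 10, 15]
table = [
    [0.02, 0.06, 0.02, 0.10],  # x = 0
    [0.04, 0.15, 0.20, 0.10],  # x = 5
    [0.01, 0.15, 0.14, 0.01],  # x = 10
]
pmf = {(x, y): table[i][j] for i, x in enumerate(xs) for j, y in enumerate(ys)}

def expect(h):
    """E[h(X, Y)] = sum over all (x, y) of h(x, y) * p(x, y)."""
    return sum(h(x, y) * pr for (x, y), pr in pmf.items())

e_sum = expect(lambda x, y: x + y)       # part a
e_max = expect(lambda x, y: max(x, y))   # part b

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
cov = expect(lambda x, y: x * y) - mu_x * mu_y   # part c, shortcut formula

var_x = expect(lambda x, y: x * x) - mu_x ** 2
var_y = expect(lambda x, y: y * y) - mu_y ** 2
rho = cov / sqrt(var_x * var_y)                  # part d

print(round(e_sum, 2), round(e_max, 2), round(cov, 4), round(rho, 3))
```

The negative covariance indicates that larger values of X tend to occur with smaller values of Y, though the correlation is weak.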

The Distribution of the Sample Mean

A statistic is any quantity that is calculated from sample data, such as the sample mean $\bar{X}$.
Random variables $X_1, X_2, \ldots, X_n$ form a random sample of size n if
1. The $X_i$s are independent random variables.
2. Every $X_i$ has the same probability distribution.
If $X_1, X_2, \ldots, X_n$ is a random sample from a distribution with mean $\mu$ and variance $\sigma^2$, then
1. $E(\bar{X}) = \mu_{\bar{X}} = \mu$ (so $\bar{X}$ is unbiased for $\mu$)
2. $V(\bar{X}) = \sigma^2_{\bar{X}} = \dfrac{\sigma^2}{n}$
Also, for $T = X_1 + X_2 + \cdots + X_n$ (the sample total)
1. $E(T) = n\mu$
2. $V(T) = n\sigma^2$
If $X_1, X_2, \ldots, X_n$ is a random sample from a normal distribution with mean $\mu$ and variance $\sigma^2$, then for
any n the sample mean is normally distributed with mean $\mu$ and variance $\sigma^2/n$, i.e.,
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$
also
$$T \sim N(n\mu, n\sigma^2)$$
The Central Limit Theorem
For a random sample $X_1, X_2, \ldots, X_n$ from a distribution with mean $\mu$ and variance $\sigma^2$, the sample mean has
approximately a normal distribution with mean $\mu$ and variance $\sigma^2/n$, if n is sufficiently large.
(The sample total also has approximately a normal distribution.)
If n > 30, the central limit theorem can be used.
Example: The inside diameter of a randomly selected piston ring is a random variable
with mean value 12 cm and standard deviation 0.04 cm.
a. If $\bar{X}$ is the sample mean for a random sample of n = 16 rings, where is the sampling
distribution of $\bar{X}$ centered, and what is the standard deviation of the $\bar{X}$
distribution?
b. Answer the questions of part (a) for a sample size of n = 64 rings.
c. For which of the two random samples is $\bar{X}$ more likely to be within 0.01 cm of 12 cm?
d. Calculate $P(11.99 \le \bar{X} \le 12.01)$ when n = 64.
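
The effect of n on the spread of $\bar{X}$ can be seen numerically. The sketch below treats $\bar{X}$ as normal; the CLT supports this for n = 64, while for n = 16 it is an extra assumption (for instance, that the diameters themselves are roughly normal). The function names are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 12.0, 0.04  # population mean and standard deviation (cm)

def sd_of_mean(n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

def prob_within(delta, n):
    """P(mu - delta <= X-bar <= mu + delta), treating X-bar as normal."""
    xbar = NormalDist(mu, sd_of_mean(n))
    return xbar.cdf(mu + delta) - xbar.cdf(mu - delta)

sd_16, sd_64 = sd_of_mean(16), sd_of_mean(64)
p_16 = prob_within(0.01, 16)
p_64 = prob_within(0.01, 64)   # part d

print(sd_16, sd_64)
print(round(p_16, 4), round(p_64, 4))
```

Quadrupling the sample size halves the standard deviation of $\bar{X}$, which is why the larger sample is more likely to land within 0.01 cm of 12 cm.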

The Distribution of a Linear Combination


In general, $a_1X_1 + a_2X_2 + \cdots + a_nX_n$ is a linear combination of the random variables $X_1, X_2, \ldots, X_n$.
If the $X_i$ have mean values $\mu_1, \mu_2, \ldots, \mu_n$ and variances $\sigma_1^2, \sigma_2^2, \ldots, \sigma_n^2$, respectively, then
$$E(a_1X_1 + a_2X_2 + \cdots + a_nX_n) = a_1E(X_1) + a_2E(X_2) + \cdots + a_nE(X_n) = a_1\mu_1 + a_2\mu_2 + \cdots + a_n\mu_n$$
$$V(a_1X_1 + a_2X_2 + \cdots + a_nX_n) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_i a_j\,\text{Cov}(X_i, X_j)$$
If the $X_i$ are independent, then $\text{Cov}(X_i, X_j) = 0$ for $i \ne j$; what is $V(a_1X_1 + a_2X_2 + \cdots + a_nX_n)$ in that case?
In particular, for the difference of two random variables
$$E(X_1 - X_2) = E(X_1) - E(X_2)$$
$$V(X_1 - X_2) = V(X_1) + V(X_2) \quad \text{if } X_1 \text{ and } X_2 \text{ are independent}$$
If $X_1, X_2, \ldots, X_n$ are independent and normally distributed, any linear combination of them
also has a normal distribution.

Example: Let $X_1, X_2, X_3, X_4, X_5$ be the observed numbers of miles per gallon for five
cars. Suppose these variables are independent and normally distributed with $\mu_1 = \mu_2 =
20$, $\mu_3 = \mu_4 = \mu_5 = 21$, and $\sigma^2 = 4$ for $X_1$ and $X_2$ and $\sigma^2 = 3.5$ for the others. Define Y as
$$Y = \frac{X_1 + X_2}{2} - \frac{X_3 + X_4 + X_5}{3}$$
Compute $P(0 \le Y)$ and $P(-1 \le Y \le 1)$.
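
Because the $X_i$ are independent and normal, Y is normal with mean $\sum a_i\mu_i$ and variance $\sum a_i^2\sigma_i^2$. A minimal sketch of the calculation, with the coefficients read off the definition of Y:

```python
from math import sqrt
from statistics import NormalDist

coeffs = [1/2, 1/2, -1/3, -1/3, -1/3]   # a_i in Y = sum of a_i * X_i
means = [20, 20, 21, 21, 21]
variances = [4, 4, 3.5, 3.5, 3.5]

# Independence makes every covariance term with i != j vanish
mean_y = sum(a * m for a, m in zip(coeffs, means))
var_y = sum(a * a * v for a, v in zip(coeffs, variances))

y = NormalDist(mean_y, sqrt(var_y))
p_nonneg = 1 - y.cdf(0)
p_within_1 = y.cdf(1) - y.cdf(-1)

print(round(mean_y, 4), round(var_y, 4))
print(round(p_nonneg, 4), round(p_within_1, 4))
```

Table-based answers may differ in the third decimal place because z values are rounded to two decimals before lookup.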

Suggested Exercises for Chapter 5: 3, 5, 11, 13, 15, 19, 25, 27, 31, 37, 39, 41, 47, 49,
51, 55, 59, 63, 65, 69, 73, 75

CHAPTER 6
 Point Estimate
The goal in this section is to estimate a population parameter based on a random sample
of size n. A single number used as the estimate of a parameter is called a point estimate.
A point estimate is therefore the value of a suitable statistic computed from the sample
data. For example:
$\bar{X}$, the sample mean, is a point estimator for the population mean $\mu$.
$\tilde{X}$, the sample median, is a point estimator for the population median $\tilde{\mu}$.
$\hat{p}$, the sample proportion, is a point estimator for the population proportion p.
$S^2$, the sample variance, is a point estimator for the population variance $\sigma^2$; an alternative
estimator for $\sigma^2$ is $\dfrac{\sum (X_i - \bar{X})^2}{n}$.
In general, if the parameter of interest is $\theta$, its point estimator is denoted by $\hat{\theta}$.


The best estimator is the estimator which is unbiased with minimum variance.
 Unbiased Estimator
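A point estimator $\hat{\theta}$ is an unbiased estimator of $\theta$ if
$$E(\hat{\theta}) = \theta$$
If $\hat{\theta}$ is not unbiased, then $E(\hat{\theta}) - \theta$ is called the bias of $\hat{\theta}$.
$\bar{X}$ is unbiased for $\mu$.
The sample proportion $\hat{p}$ is unbiased for p.
$S^2$ is unbiased for $\sigma^2$.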
 Estimators with Minimum Variance
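When we have two unbiased estimators, the spreads of their distributions about the true value may
be different; we choose the one with the smaller variance. Among all unbiased estimators, the one
with minimum variance is called the minimum variance unbiased estimator (MVUE).
For a random sample from a normal distribution, $\bar{X}$ is the MVUE for $\mu$.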

Example: Consider a sample of size 27:
5.9, 7.2, 7.3, 6.3, 8.1, 6.8, 7.0, 7.6, 6.8, 6.5, 7.0, 6.3, 7.9, 9.0, 8.2, 8.7, 7.8, 9.7, 7.4, 7.7, 9.7,
7.8, 7.7, 11.6, 11.3, 11.8, 10.7
a. Calculate the point estimate of the population mean. ($\sum x_i = 219.8$)
b. Calculate the point estimate of the value that separates the smallest 50% from the largest 50%. Which
estimator did you use?
c. Calculate the point estimate of the population standard deviation. ($\sum x_i^2 = 1860.94$)
d. Calculate the point estimate of the proportion of all values greater than 10.
e. Calculate the point estimate of the population coefficient of variation $\sigma/\mu$.
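
Each part asks for the value of a standard estimator on the data. A minimal sketch using Python's `statistics` module, where `stdev` uses the n − 1 divisor and so corresponds to s:

```python
from statistics import mean, median, stdev

data = [5.9, 7.2, 7.3, 6.3, 8.1, 6.8, 7.0, 7.6, 6.8, 6.5, 7.0, 6.3, 7.9, 9.0,
        8.2, 8.7, 7.8, 9.7, 7.4, 7.7, 9.7, 7.8, 7.7, 11.6, 11.3, 11.8, 10.7]

x_bar = mean(data)                                 # a. estimate of mu
x_tilde = median(data)                             # b. estimate of the population median
s = stdev(data)                                    # c. estimate of sigma (n - 1 divisor)
p_hat = sum(x > 10 for x in data) / len(data)      # d. sample proportion above 10
cv_hat = s / x_bar                                 # e. estimate of sigma / mu

print(round(x_bar, 3), x_tilde, round(s, 3), round(p_hat, 3), round(cv_hat, 3))
```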

 Maximum Likelihood Estimation


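One of the best methods for finding an estimator of a population parameter is the method of maximum likelihood. Maximum likelihood estimators have the invariance property and are asymptotically unbiased. Let
$X_1, X_2, \ldots, X_n$ have joint pmf or pdf $f(x_1, x_2, \ldots, x_n; \theta_1, \ldots, \theta_m)$, where the parameters
$\theta_1, \ldots, \theta_m$ have unknown values. When $x_1, x_2, \ldots, x_n$ are the observed sample values,
$f(x_1, x_2, \ldots, x_n; \theta_1, \ldots, \theta_m)$ regarded as a function of $\theta_1, \ldots, \theta_m$ is called the likelihood function.
The maximum likelihood estimates $\hat{\theta}_1, \ldots, \hat{\theta}_m$ are the values that maximize the likelihood
function, i.e.,
$$f(x_1, x_2, \ldots, x_n; \hat{\theta}_1, \ldots, \hat{\theta}_m) \ge f(x_1, x_2, \ldots, x_n; \theta_1, \ldots, \theta_m) \quad \text{for all } \theta_1, \ldots, \theta_m$$
If the likelihood is a differentiable function of the parameter, setting its derivative equal to zero locates the
maximum.
Suppose $X_1, X_2, \ldots, X_n$ is a random sample from an exponential distribution with parameter
$\lambda$. What is the mle (maximum likelihood estimator) of $\lambda$?
$$f(x_1, x_2, \ldots, x_n; \lambda) = (\lambda e^{-\lambda x_1}) \cdots (\lambda e^{-\lambda x_n}) = \lambda^n e^{-\lambda \sum x_i}$$
$$\ln[f(x_1, x_2, \ldots, x_n; \lambda)] = n \ln(\lambda) - \lambda \sum x_i$$
To find the maximum,
$$\frac{d}{d\lambda} \ln[f(x_1, x_2, \ldots, x_n; \lambda)] = \frac{n}{\lambda} - \sum x_i = 0 \;\Rightarrow\; \hat{\lambda} = \frac{n}{\sum X_i} = \frac{1}{\bar{X}}$$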
Examples:

A random sample of n bike helmets manufactured by a certain company is selected.
Let X = the number among the n that are flawed and let p = P(flawed). Assume that only
X is observed, rather than the sequence of S's and F's.
a. Derive the mle of p. If n = 20 and x = 3, what is the estimate?
b. Is the estimator of part (a) unbiased?
c. If n = 20 and x = 3, what is the mle of the probability $(1 - p)^5$ that none of the
next five helmets examined is flawed?

Suppose the pdf of X is
$$f(x; \theta) = \begin{cases} (\theta + 1)\,x^{\theta} & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$
where $-1 < \theta$. A random sample of 10 students yields data $x_1 = 0.92$, $x_2 = 0.79$, $x_3 =
0.90$, $x_4 = 0.65$, $x_5 = 0.86$, $x_6 = 0.47$, $x_7 = 0.73$, $x_8 = 0.97$, $x_9 = 0.94$, $x_{10} = 0.77$.
Obtain the maximum likelihood estimator of $\theta$ and then compute the estimate for the
given data.
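
For the pdf $(\theta + 1)x^{\theta}$, the same steps used for the exponential case give the log-likelihood $n\ln(\theta + 1) + \theta\sum \ln x_i$. Setting its derivative $n/(\theta + 1) + \sum \ln x_i$ to zero yields $\hat{\theta} = -n/\sum \ln x_i - 1$. This derivation is worked here as an illustration and is not spelled out in the notes. The sketch evaluates the estimate and checks that nearby values of $\theta$ give a smaller log-likelihood.

```python
from math import log

data = [0.92, 0.79, 0.90, 0.65, 0.86, 0.47, 0.73, 0.97, 0.94, 0.77]

def log_likelihood(theta, xs):
    """ln L(theta) = n * ln(theta + 1) + theta * sum(ln x_i), for theta > -1."""
    return len(xs) * log(theta + 1) + theta * sum(log(x) for x in xs)

def mle_theta(xs):
    """Closed-form maximizer: theta-hat = -n / sum(ln x_i) - 1."""
    return -len(xs) / sum(log(x) for x in xs) - 1

theta_hat = mle_theta(data)
ll_at_hat = log_likelihood(theta_hat, data)
ll_below = log_likelihood(theta_hat - 0.1, data)
ll_above = log_likelihood(theta_hat + 0.1, data)

print(round(theta_hat, 3))
print(ll_at_hat > ll_below and ll_at_hat > ll_above)
```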
Suggested Exercises for Chapter 6: 3, 5, 7, 13, 15, 21, 23, 25, 27, 29

CHAPTER 7
 Statistical Intervals Based on a Single Sample
A point estimate reports a single number and provides no information about the
precision and reliability of the estimate. An alternative is an interval estimate. A confidence interval is always calculated by first selecting a confidence level, which is a measure
of the degree of reliability of the interval.
For example, a confidence level of 95% implies that 95% of all samples would give an interval that
includes the parameter.
 Confidence Intervals for
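Suppose that the parameter of interest is the population mean $\mu$ and also that
The population distribution is normal.
The population standard deviation $\sigma$ is known.
Because the area under the standard normal curve between -1.96 and 1.96 is 0.95,
$$P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = 0.95$$
then
$$P\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$$
Substituting the observed sample mean $\bar{x}$ for $\bar{X}$, the interval
$$\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right)$$
is a 95% CI for $\mu$.
In general, a $100(1-\alpha)\%$ confidence interval for the mean $\mu$ of a normal population when
the value of $\sigma$ is known is given by
$$\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$$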

Choice of Sample Size: The width of the 95% interval is $2(1.96)\sigma/\sqrt{n}$, which specifies the
precision of the interval. For a desired width w, the required sample size is
$$n = \left(2 z_{\alpha/2} \cdot \frac{\sigma}{w}\right)^2$$
where w is the width of the interval.

Example: Consider a normal population with $\sigma = 3$.
a. Compute a 95% CI for $\mu$ when n = 25 and $\bar{x} = 58.3$.
b. Compute a 95% CI for $\mu$ when n = 100 and $\bar{x} = 58.3$.
c. Compute a 99% CI for $\mu$ when n = 100 and $\bar{x} = 58.3$.
d. Compute an 82% CI for $\mu$ when n = 100 and $\bar{x} = 58.3$.
e. How large must n be if the width of the 99% interval for $\mu$ is to be 1.0?
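
All five parts use the same two formulas, so they can be wrapped in small helpers. This is a sketch under the stated assumptions (normal population, known $\sigma$); the function names are chosen for illustration, and critical values come from the inverse normal cdf rather than a table.

```python
from math import ceil, sqrt
from statistics import NormalDist

def z_crit(conf):
    """z_{alpha/2} for a two-sided interval with confidence level conf."""
    return NormalDist().inv_cdf(1 - (1 - conf) / 2)

def z_interval(x_bar, sigma, n, conf):
    """(x_bar - z * sigma / sqrt(n), x_bar + z * sigma / sqrt(n))."""
    half_width = z_crit(conf) * sigma / sqrt(n)
    return x_bar - half_width, x_bar + half_width

def sample_size(sigma, width, conf):
    """Smallest integer n with interval width at most `width`."""
    return ceil((2 * z_crit(conf) * sigma / width) ** 2)

ci_a = z_interval(58.3, 3, 25, 0.95)
ci_b = z_interval(58.3, 3, 100, 0.95)
ci_c = z_interval(58.3, 3, 100, 0.99)
ci_d = z_interval(58.3, 3, 100, 0.82)
n_e = sample_size(3, 1.0, 0.99)

for ci in (ci_a, ci_b, ci_c, ci_d):
    print(tuple(round(v, 3) for v in ci))
print(n_e)
```

Comparing (a) with (b) shows the interval narrowing as n grows, while (b), (c) and (d) show it widening as the confidence level rises.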

 Large Sample Confidence Intervals for


Based on the CLT, for a large sample
$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$
has approximately a standard normal distribution, so
$$\bar{x} \pm z_{\alpha/2}\frac{s}{\sqrt{n}}$$
is a $100(1-\alpha)\%$ large-sample CI for $\mu$, which is valid regardless of the shape of the population
distribution.

Example: A random sample of 110 lightning flashes in a certain region resulted in a sample
average radar echo duration of 0.81 sec. and a sample standard deviation of 0.34 sec. Calculate a 99% CI for the true average echo duration $\mu$, and interpret the resulting interval.

 Large-Sample Confidence Interval for a Population Proportion


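If X is a binomial rv with $E(X) = np$ and $\sigma_X = \sqrt{np(1-p)}$, then for large n ($np \ge 10$, $nq \ge
10$) X has approximately a normal distribution.
$\hat{p} = X/n$ is an estimator of p that is a linear function of X, so $\hat{p}$ also has approximately
a normal distribution, with $E(\hat{p}) = p$ and $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$. Then
$$P\left(-z_{\alpha/2} < \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} < z_{\alpha/2}\right) \approx 1 - \alpha$$
Then a $100(1-\alpha)\%$ large-sample CI for p is
$$\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$$
If the width is $w = 2z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$, the approximate sample size is
$$n \approx 4z_{\alpha/2}^2\,\frac{\hat{p}\hat{q}}{w^2}$$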

 One-Sided Confidence Interval


A large-sample upper confidence bound for $\mu$ is
$$\mu < \bar{x} + z_{\alpha}\frac{s}{\sqrt{n}}$$
and a large-sample lower confidence bound for $\mu$ is
$$\mu > \bar{x} - z_{\alpha}\frac{s}{\sqrt{n}}$$
Example: When each football helmet in a random sample of 37 suspension-type helmets
was subjected to a certain impact test, 24 showed damage. Let p denote the proportion of
all helmets of this type that would show damage when tested in the prescribed manner.
a. Calculate a 99% CI for p.
b. What sample size would be required for the width of a 99% CI to be at most 0.10 irrespective of p?
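
The sketch below applies the large-sample interval $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$ given above to the helmet data. For part (b) it uses the fact that $\hat{p}\hat{q}$ can never exceed 1/4, so substituting 0.25 gives a sample size that works irrespective of p. Variable names are assumptions for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

n, x = 37, 24
p_hat = x / n
q_hat = 1 - p_hat
z = NormalDist().inv_cdf(1 - 0.01 / 2)   # z_{alpha/2} for 99% confidence

# a. Large-sample CI for p
half_width = z * sqrt(p_hat * q_hat / n)
lower, upper = p_hat - half_width, p_hat + half_width

# b. Worst case p-hat * q-hat = 0.25 gives a width bound valid for any p
w = 0.10
n_needed = ceil(4 * z**2 * 0.25 / w**2)

print(round(p_hat, 4), round(lower, 4), round(upper, 4))
print(n_needed)
```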
 Intervals Based on a Normal Population Distribution
Suppose $X_1, \ldots, X_n$ is a random sample from a normal distribution with both $\mu$ and $\sigma$ unknown.
For a small sample, $\dfrac{\bar{X} - \mu}{S/\sqrt{n}}$ does not have a standard normal distribution. This random
variable has a t distribution with n - 1 degrees of freedom (df).
Properties of the t distribution ($t_\nu$ denotes the t distribution with $\nu$ df)
1. $t_\nu$ is bell shaped and centered at 0.
2. $t_\nu$ is more spread out than the standard normal.
3. As $\nu$ increases, the spread of $t_\nu$ decreases.
4. As $\nu \to \infty$, $t_\nu$ approaches the standard normal.
 The One-Sample t Confidence Interval
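Suppose that $\bar{x}$ and s are the sample mean and sample standard deviation of a random sample from a
normal population. Then
$$\left(\bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}},\ \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}\right)$$
is a $100(1-\alpha)\%$ confidence interval for $\mu$.
An upper confidence bound for $\mu$ is
$$\bar{x} + t_{\alpha,\,n-1}\frac{s}{\sqrt{n}}$$
and
$$\bar{x} - t_{\alpha,\,n-1}\frac{s}{\sqrt{n}}$$
is a lower confidence bound for $\mu$.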


Example: A random sample of n = 8 E-glass fiber test specimens of a certain type yielded
a sample mean interfacial shear yield stress of 30.2 and a sample standard deviation of 3.1.
Assuming the interfacial shear yield stress is normally distributed, compute a 95% CI for
the true average stress.
Suggested Exercises for Chapter 7: 3, 5, 7, 9, 13, 15, 17, 19, 23, 29, 35,
37, 39, 49

CHAPTER 8
 Test of Hypotheses Based on a Single Sample
Hypothesis testing is a method for deciding which of two contradictory claims about a
parameter is correct. Here the parameters of interest are the population mean and the population proportion.
 Hypotheses and Test Procedures
In any hypothesis testing problem, there are:
Null Hypothesis, denoted by $H_0$, which is the claim about the population parameter that is initially assumed to be true. $H_0$ is an equality claim of the form $H_0: \theta = \theta_0$, where $\theta$ is the
parameter of interest.
Alternative Hypothesis, denoted by $H_a$, which contradicts $H_0$ and takes one of
the following forms:
$H_a: \theta > \theta_0$
$H_a: \theta < \theta_0$
$H_a: \theta \ne \theta_0$
Test of Hypotheses is a method for using sample data to decide whether $H_0$ should be rejected
or not.
Test Procedure is a rule, based on sample data, for deciding whether to reject $H_0$. It
contains:
Test Statistic, which is a function of the sample data on which the decision is based.
Rejection Region, which is the set of all test statistic values for which $H_0$ will be rejected.

Errors in Hypothesis Testing: There are two different types of error.
Type I error consists of rejecting $H_0$ when it is true; $\alpha = P(\text{type I error})$.
Type II error consists of not rejecting $H_0$ when it is false; $\beta = P(\text{type II error})$.

Example: The calibration of a scale is to be checked by weighing a 10-kg test specimen 25
times. Suppose that the results of different weighings are independent of one another and
that the weight on each trial is normally distributed with $\sigma = 0.2$ kg. Let $\mu$ denote the true
average weight reading on the scale.
a. What hypotheses should be tested?
b. Suppose the scale is to be recalibrated if either $\bar{x} \ge 10.1032$ or $\bar{x} \le 9.8968$. What is the
probability that recalibration is carried out when it is actually unnecessary?
c. What is the probability that recalibration is judged unnecessary when in fact $\mu = 10.1$?
When $\mu = 9.83$?
d. Let $z = (\bar{x} - 10)/(\sigma/\sqrt{n})$. For what value c is the rejection region of part (b) equivalent
to the two-tailed region either $z \ge c$ or $z \le -c$?
e. If the sample size were only 10 rather than 25, how should the procedure of part (d) be
altered so that $\alpha = 0.05$?
f. Using the test of part (e), what would you conclude from the following sample data:
9.981 10.006 9.857 10.107 9.888 9.728 10.439 10.214 10.190 9.793
g. Reexpress the test procedure of part (b) in terms of the standardized test statistic
$Z = (\bar{X} - 10)/(\sigma/\sqrt{n})$.

 Test About a Population Mean


For this test, consider three different cases:
1. A Normal Population with Known $\sigma$
Let $X_1, X_2, \ldots, X_n$ be a random sample of size n from a normal population with known variance
$\sigma^2$. Then
$$\bar{X} \sim N(\mu, \sigma^2/n)$$
Suppose the null value of the population mean is $\mu_0$. When $H_0$ is true, $\mu_{\bar{X}} = \mu_0$ and
$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$
has a standard normal distribution; this is the test statistic for $H_0: \mu = \mu_0$. First
consider $H_a: \mu > \mu_0$. The rejection region is determined by the type I error probability (the
significance level, denoted by $\alpha$). If $\alpha = 0.05$, then because the distribution of Z is
standard normal, the cutoff point c is 1.645 and $H_0$ is rejected if $z \ge 1.645$. The
same argument applies to the other kinds of alternatives.
In general
Null hypothesis: $H_0: \mu = \mu_0$
Test statistic value: $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$

Alternative hypothesis      Rejection region for level $\alpha$ test
$H_a: \mu > \mu_0$          $z \ge z_{\alpha}$ (upper-tailed test)
$H_a: \mu < \mu_0$          $z \le -z_{\alpha}$ (lower-tailed test)
$H_a: \mu \ne \mu_0$        $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$ (two-tailed test)
Example: The melting point of each of 16 samples of a certain brand of hydrogenated vegetable oil was determined, resulting in $\bar{x} = 94.32$. Assume that the distribution of
melting point is normal with $\sigma = 1.20$. Test $H_0: \mu = 95$ versus $H_a: \mu \ne 95$ using a
two-tailed level 0.01 test.
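
The two-tailed z test for this example can be sketched as follows. The function name and return layout are assumptions for illustration; the critical value comes from the inverse normal cdf rather than a table.

```python
from math import sqrt
from statistics import NormalDist

def z_test_two_sided(x_bar, mu0, sigma, n, alpha):
    """Return (z, z_crit, p_value, reject) for H0: mu = mu0 vs Ha: mu != mu0."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    reject = abs(z) >= z_crit   # same as z >= z_crit or z <= -z_crit
    return z, z_crit, p_value, reject

z, z_crit, p_value, reject = z_test_two_sided(
    x_bar=94.32, mu0=95, sigma=1.20, n=16, alpha=0.01
)
print(round(z, 3), round(z_crit, 3), round(p_value, 4), reject)
```

Here z falls inside (−2.576, 2.576), so $H_0$ is not rejected at level 0.01, which agrees with the P-value being larger than 0.01.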
2. Large-Sample Tests
When the population is not necessarily normal and $\sigma$ is unknown, then for large n, by the CLT,
$$Z = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$$
has approximately a standard normal distribution when $H_0$ is true, and the test proceeds as in the previous
case.
3. A Normal Population Distribution
As seen before, if $X_1, X_2, \ldots, X_n$ is a random sample from a normal distribution, then when $H_0$ is true
$$T = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$$
has a t distribution with n - 1 degrees of freedom. Therefore, instead of finding the critical
region from the normal distribution, find it from the t distribution. In
general
Null hypothesis: $H_0: \mu = \mu_0$
Test statistic value: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$

Alternative hypothesis      Rejection region for level $\alpha$ test
$H_a: \mu > \mu_0$          $t \ge t_{\alpha,\,n-1}$ (upper-tailed test)
$H_a: \mu < \mu_0$          $t \le -t_{\alpha,\,n-1}$ (lower-tailed test)
$H_a: \mu \ne \mu_0$        $t \ge t_{\alpha/2,\,n-1}$ or $t \le -t_{\alpha/2,\,n-1}$ (two-tailed test)


Example: The amount of shaft wear after a fixed mileage was determined for each of n = 8
internal combustion engines having copper lead as a bearing material, resulting in $\bar{x} = 3.72$
and s = 1.254. Assuming that the distribution of shaft wear is normal with mean $\mu$, use the
t test at level 0.05 to test $H_0: \mu = 3.5$ versus $H_a: \mu > 3.5$.

 Test Concerning a Population Proportion


Based on the CLT, the estimator $\hat{p} = X/n$ has approximately a normal distribution with mean p and variance
p(1 - p)/n. Then when n is large and $H_0: p = p_0$ is true,
$$Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$$
has approximately a standard normal distribution, and as before the critical value is
found from the normal distribution. In general
Null hypothesis: $H_0: p = p_0$
Test statistic value: $z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$

Alternative hypothesis      Rejection region for level $\alpha$ test
$H_a: p > p_0$              $z \ge z_{\alpha}$ (upper-tailed test)
$H_a: p < p_0$              $z \le -z_{\alpha}$ (lower-tailed test)
$H_a: p \ne p_0$            $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$ (two-tailed test)

These tests are valid if $np_0 \ge 10$ and $n(1 - p_0) \ge 10$.
Example: A random sample of 150 recent donations at a certain blood bank reveals that
92 were type A blood. Does this suggest that the actual percentage of type A donations differs
from 40%, the percentage of the population having type A blood? Carry out a test of the
appropriate hypotheses using a significance level of 0.01. Would your conclusion have been
different if a significance level of 0.05 had been used?
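
A sketch of the large-sample proportion test for the blood bank data, using the counts exactly as given in the example above. Both significance levels are checked against the same test statistic and P-value.

```python
from math import sqrt
from statistics import NormalDist

n, x, p0 = 150, 92, 0.40

# Validity check: n * p0 and n * (1 - p0) should both be at least 10
valid = n * p0 >= 10 and n * (1 - p0) >= 10

p_hat = x / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-tailed: Ha is p != 0.40

reject_01 = p_value <= 0.01
reject_05 = p_value <= 0.05

print(valid, round(p_hat, 4), round(z, 3), p_value)
print(reject_01, reject_05)
```

With these counts the P-value is far below both levels, so the conclusion is the same at 0.01 and at 0.05.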
The P-value is the smallest significance level at which $H_0$ would be rejected. Once the P-value has been determined, the conclusion at any particular level $\alpha$ results from comparing the P-value to $\alpha$:
P-value $\le \alpha$ $\Rightarrow$ reject $H_0$ at level $\alpha$
P-value $> \alpha$ $\Rightarrow$ do not reject $H_0$ at level $\alpha$


Suggested Exercises for Chapter 8: 3, 7, 15, 17, 19, 23, 27, 31, 35, 37, 39, 47, 49, 53,
57, 59, 61, 65

CHAPTER 9
Inference Based on Two Samples
z Test and Confidence Intervals for a Difference Between Two Population Means
Case 1: Normal populations with known variances
Assumptions:
1. $X_1, X_2, \ldots, X_m$ is a random sample from a population with mean $\mu_1$ and variance $\sigma_1^2$
2. $Y_1, Y_2, \ldots, Y_n$ is a random sample from a population with mean $\mu_2$ and variance $\sigma_2^2$
3. The X and Y samples are independent of one another

1. Null hypothesis: $H_0: (\mu_1 - \mu_2) = D_0$, where $D_0$ is some specific difference that you
wish to test.
2. Alternative hypothesis:
One-Tailed Test: $H_a: (\mu_1 - \mu_2) > D_0$ or $H_a: (\mu_1 - \mu_2) < D_0$
Two-Tailed Test: $H_a: (\mu_1 - \mu_2) \ne D_0$
3. Test statistic:
$$z = \frac{(\bar{x} - \bar{y}) - D_0}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}}$$
4. Rejection region: Reject $H_0$ when
One-Tailed Test: $z > z_{\alpha}$ when $H_a: (\mu_1 - \mu_2) > D_0$, or $z < -z_{\alpha}$ when $H_a: (\mu_1 - \mu_2) < D_0$
Two-Tailed Test: $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$
or when p-value $< \alpha$

The p-value is the smallest level of significance at which $H_0$ would be rejected.

$100(1-\alpha)\%$ Confidence Interval
$$\bar{x} - \bar{y} \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}$$
Case 2: Large Samples

1. Null hypothesis: $H_0: (\mu_1 - \mu_2) = D_0$, where $D_0$ is some specific difference that you
wish to test.
2. Alternative hypothesis:
One-Tailed Test: $H_a: (\mu_1 - \mu_2) > D_0$ or $H_a: (\mu_1 - \mu_2) < D_0$
Two-Tailed Test: $H_a: (\mu_1 - \mu_2) \ne D_0$
3. Test statistic:
$$z = \frac{(\bar{x} - \bar{y}) - D_0}{\sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}}$$
4. Rejection region: Reject $H_0$ when
One-Tailed Test: $z > z_{\alpha}$ when $H_a: (\mu_1 - \mu_2) > D_0$, or $z < -z_{\alpha}$ when $H_a: (\mu_1 - \mu_2) < D_0$
Two-Tailed Test: $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$
or when p-value $< \alpha$

Assumptions: The samples are randomly and independently selected from the two
populations and m > 30 and n > 30.
$100(1-\alpha)\%$ Confidence Interval
$$\bar{x} - \bar{y} \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}$$
Case 3: Small samples from normal populations

1. Null hypothesis: $H_0: (\mu_1 - \mu_2) = D_0$, where $D_0$ is some specific difference that you
wish to test.
2. Alternative hypothesis:
One-Tailed Test: $H_a: (\mu_1 - \mu_2) > D_0$ or $H_a: (\mu_1 - \mu_2) < D_0$
Two-Tailed Test: $H_a: (\mu_1 - \mu_2) \ne D_0$
3. Test statistic:
$$t = \frac{(\bar{x} - \bar{y}) - D_0}{\sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}}$$
4. Rejection region: Reject $H_0$ when
One-Tailed Test: $t > t_{\alpha}$ when $H_a: (\mu_1 - \mu_2) > D_0$, or $t < -t_{\alpha}$ when $H_a: (\mu_1 - \mu_2) < D_0$
Two-Tailed Test: $t > t_{\alpha/2}$ or $t < -t_{\alpha/2}$
or when p-value $< \alpha$

The critical values of t are based on (m + n - 2) df.

Assumptions: The samples are randomly and independently selected from normally
distributed populations.
$100(1-\alpha)\%$ Confidence Interval
$$\bar{x} - \bar{y} \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}$$

Examples:
1. Random samples of 50 recent college graduates in each major were selected and the following information was obtained:

Major     Education     Social science
Mean      40554         38348
SD        2225          2375

a. Do the data provide sufficient evidence to indicate a difference in average starting salaries
for college graduates who majored in education and the social sciences? Test using $\alpha = 0.05$.
b. Find a 95% confidence interval for the difference between means for the two groups in the
general population. Compare your result with part a.
2. A geologist collected the titanium contents of the samples, found using two different
methods:
Method 1: 0.011, 0.013, 0.013, 0.015, 0.014, 0.013, 0.010, 0.013, 0.011, 0.012
Method 2: 0.011, 0.016, 0.013, 0.012, 0.015, 0.012, 0.017, 0.013, 0.014, 0.015
a. Use an appropriate method to test for a significant difference in the average titanium
contents using the two different methods.
b. Determine a 95% confidence interval estimate for $(\mu_1 - \mu_2)$. Does your interval estimate
support your conclusion in part a?
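
Example 1 has m = n = 50, so the large-sample procedure of Case 2 applies. The sketch below computes the test statistic and the 95% interval from the summary statistics; the variable names are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics: education (sample 1) and social science (sample 2)
m, x_bar, s1 = 50, 40554, 2225
n, y_bar, s2 = 50, 38348, 2375
d0, alpha = 0, 0.05

se = sqrt(s1**2 / m + s2**2 / n)          # estimated standard error
z = ((x_bar - y_bar) - d0) / se
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject = abs(z) > z_crit                  # two-tailed test

lower = (x_bar - y_bar) - z_crit * se
upper = (x_bar - y_bar) + z_crit * se

print(round(z, 3), reject)
print(round(lower, 1), round(upper, 1))
```

The interval excludes 0, which matches the decision to reject $H_0$ at level 0.05: a two-tailed level $\alpha$ test and a $100(1-\alpha)\%$ interval always agree in this way.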

z Large-Sample Statistical Test for (p1 p2 )


Let $X \sim \text{Bin}(m, p_1)$ and $Y \sim \text{Bin}(n, p_2)$.
1. Null hypothesis: $H_0: (p_1 - p_2) = 0$, or alternatively $H_0: p_1 = p_2$.
2. Alternative hypothesis:
One-Tailed Test: $H_a: (p_1 - p_2) > 0$ or $H_a: (p_1 - p_2) < 0$
Two-Tailed Test: $H_a: (p_1 - p_2) \ne 0$
3. Test statistic:
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{p_1q_1}{m} + \dfrac{p_2q_2}{n}}} = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{pq}{m} + \dfrac{pq}{n}}}$$
where $\hat{p}_1 = x/m$ and $\hat{p}_2 = y/n$. Since the common value $p_1 = p_2 = p$ (used in the
standard error) is unknown, it is estimated by
$$\hat{p} = \frac{x + y}{m + n}$$
and the test statistic is
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{m} + \dfrac{1}{n}\right)}}$$
4. Rejection region: Reject $H_0$ when
One-Tailed Test: $z > z_{\alpha}$ or $z < -z_{\alpha}$
Two-Tailed Test: $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$
or when p-value $< \alpha$
Assumptions: Samples are selected in a random and independent manner from two
binomial populations and m and n are large enough.
Example: Independent random samples of 280 and 350 observations were selected from
binomial populations 1 and 2 respectively. Sample 1 had 132 successes, and sample 2 had
178 successes. Do the data present sufficient evidence to indicate that the proportion of
successes in population 1 is smaller than the proportion in population 2?
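
A sketch of the pooled two-proportion z test for this example. The alternative is $H_a: p_1 - p_2 < 0$, so the P-value is the lower-tail area. The significance level 0.05 is an assumption, since the example does not specify one.

```python
from math import sqrt
from statistics import NormalDist

m, x = 280, 132   # sample 1
n, y = 350, 178   # sample 2
alpha = 0.05      # assumed level; not stated in the example

p1_hat, p2_hat = x / m, y / n
p_pool = (x + y) / (m + n)       # pooled estimate of the common p under H0
q_pool = 1 - p_pool

z = (p1_hat - p2_hat) / sqrt(p_pool * q_pool * (1 / m + 1 / n))
p_value = NormalDist().cdf(z)    # lower-tailed test
reject = p_value < alpha

print(round(p1_hat, 4), round(p2_hat, 4), round(p_pool, 4))
print(round(z, 3), round(p_value, 4), reject)
```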

z Analysis of Paired Data


There are a number of experimental situations in which two samples are dependent. In this
case, the differences between the paired observations are used:
$$d_i = x_{1i} - x_{2i}, \qquad \bar{d} = \frac{\sum d_i}{n}, \qquad s_d^2 = \frac{\sum (d_i - \bar{d})^2}{n - 1}$$
Then the test is carried out on the $d_i$.
1. Null hypothesis: $H_0: \mu_d = 0$
2. Alternative hypothesis:
One-Tailed Test: $H_a: \mu_d > 0$ (or $H_a: \mu_d < 0$)
Two-Tailed Test: $H_a: \mu_d \ne 0$
3. Test statistic:
$$t = \frac{\bar{d} - 0}{s_d/\sqrt{n}} = \frac{\bar{d}}{s_d/\sqrt{n}}$$
where
n = Number of paired differences
$\bar{d}$ = Mean of the sample differences
$s_d$ = Standard deviation of the sample differences

4. Rejection region: Reject $H_0$ when
One-Tailed Test: $t > t_{\alpha}$ (or $t < -t_{\alpha}$ when the alternative hypothesis is $H_a: \mu_d < 0$)
Two-Tailed Test: $t > t_{\alpha/2}$ or $t < -t_{\alpha/2}$
or when p-value $< \alpha$

$(1 - \alpha)100\%$ Small-Sample Confidence Interval for $(\mu_1 - \mu_2) = \mu_d$ Based on
a Paired-Difference Experiment
$$\bar{d} \pm t_{\alpha/2}\frac{s_d}{\sqrt{n}}$$
Assumptions: The experiment is designed as a paired-difference test so that the n
differences represent a random sample from a normal population.

Example: An advertisement for Albertsons, a supermarket chain in the western United
States, claims that Albertsons has had consistently lower prices than four other full-service
supermarkets. As part of a survey conducted by an independent market basket price-checking
company, the average weekly total, based on the prices of approximately 95 items, is given
for two different supermarket chains recorded during 4 consecutive weeks in a particular
month.

Week     Albertsons     Ralphs
1        254.26         256.03
2        240.62         255.65
3        231.90         255.12
4        234.13         261.18

a. Is there a significant difference in the average prices for these two different supermarket
chains?
b. What is the approximate p-value for the test conducted in part a?
c. Construct a 99% confidence interval for the difference in the average prices for the two
supermarket chains. Interpret this interval.
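
The paired-difference quantities $\bar{d}$, $s_d$ and t for this example can be computed as follows. The sketch stops at the test statistic: Python's standard library has no t distribution, so the critical value and p-value still have to be read from a t table, here with n − 1 = 3 df.

```python
from math import sqrt
from statistics import mean, stdev

albertsons = [254.26, 240.62, 231.90, 234.13]
ralphs = [256.03, 255.65, 255.12, 261.18]

# Paired differences d_i = x_1i - x_2i (Albertsons minus Ralphs), week by week
d = [a - r for a, r in zip(albertsons, ralphs)]

n = len(d)
d_bar = mean(d)
s_d = stdev(d)                    # uses the n - 1 divisor
t = d_bar / (s_d / sqrt(n))       # test statistic for H0: mu_d = 0
df = n - 1

print([round(v, 2) for v in d])
print(round(d_bar, 4), round(s_d, 3), round(t, 3), df)
```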

Suggested Exercises for Chapter 9: 3, 7, 19, 25, 29, 37, 39, 41, 43(b,c), 47, 49,
51