Slides from Dr. Micheal Minnotte's University of North Dakota math class, Math 321 - Applied Statistical Methods, from the Summer 2015 semester. The class uses Principles of Statistics for Engineers and Scientists by William Navidi, first edition.

Jun 04, 2015

© All Rights Reserved


measurement and decision-making under

conditions of uncertainty, randomness,

and variability.

dealing with data.

Math 321 - Dr. Minnotte

collect information, to help make

decisions.

that sort of thing every day, in every field

of study, and in our everyday life.

process mathematically. This allows us to

recognize smaller differences than might

otherwise be found, and to make decisions

under conditions of greater uncertainty.


any bit of numerical information, like the 6.3%

unemployment rate in April, 2014 or the

15,143 students enrolled at UND in Fall, 2013.

every time we read the newspaper, or watch

TV news, or read a journal in our field.

understanding, so should statistics. If we

uncritically accept the numbers others give us,

we open ourselves to believing

misinformation.


every field. In this class, we'll look at examples like:

works?

How can irrigation engineers use past river flow

rates to predict future flows?

How can polltakers use responses from a few

thousand voters to predict the results of an

election in which more than a hundred million

people vote?

practice?


A Statistical Cautionary Tale

contributed to a tragedy: the explosion of the

space shuttle Challenger.

seven astronauts, including Christa

McAuliffe, a 37-year-old teacher selected to

be the first teacher in space, and set the U.S.

manned space program back several years.

space shuttles are shipped to the Kennedy

Space Center in four pieces. Large rubber

O-rings are used to seal the three joints

between the pieces.

of the O-rings failed to seal quickly enough to prevent hot gases from escaping from the rocket and igniting the large external fuel tank.

cold (for Florida) launch temperature of 29°F.

predicted a temperature of 31°F for the launch time.

between people at:

motors)

Marshall Space Flight Center (NASA center

for motor design control), and

Kennedy Space Center.

temperatures could lead to problems with the O-rings.

the launch until the temperature rose above 53°F, the lowest previous launch temperature, at which the greatest number of damaged O-rings occurred.

to launch on schedule, in part because of

the following plot.

damaged O-rings for the 7 affected

launches.


incidents occurred, the investigators left

out some important information!

plotted, a temperature dependence

becomes obvious.

All 4 launches below 66°F had damage.

Only 3 out of 16 flights above that temperature suffered damage.

that plot.

but unnecessary.

the complete data in such a format, they might well have convinced the decision-makers to delay the launch and prevented the tragedy.

to it later in the semester.

Definition: A population consists of all

potential observations from a distribution

of interest.

be tangible, real and finite, and might be

represented by a sampling frame listing the

members of the population.

corporations, or items in a shipment.

process, and the conceptual population is

infinite and simply a useful theoretical

construct. No sampling frame is possible.

or objects coming off an ongoing assembly line, or repeated measurements of the same underlying weight.

of flexibility in defining the population of interest.

UND students. What are some possible

relevant populations?

study the volume of milk in containers

coming off a production line. What are

possible populations?

incidence of obesity in preteen children.

What is an appropriate population?


take a sample from that population.

sample will be the observations which

make up the dataset we will analyze.


Experiments

to determine how the concentration of a

catalyst affects the yield of a process.

times, changing the concentration each

time and compare the yields that result.

controlled experiment because the values

of the concentration variable are under the

control of the experimenter.


Observational Studies

cannot control the variables of interest.

determine the effect of cigarette smoking on

the risk of lung cancer. In these studies,

rates of cancer among smokers are

compared with rates among nonsmokers.

smokes and who doesn't.

study.


sure it is representative of the population.

enumeration, of everyone in the

population. What are some problems with

this approach?

22

sample, choosing your sample with planned

probability methods.

The most basic such method is called a

simple random sample (SRS).

In a SRS, we draw individuals out of the

population with the equivalent of drawing

names out of a (well-mixed) hat.

Each subset of the population of the

appropriate size is equally likely to make up

the sample.

This is theoretically convenient, but often

hard to arrange in practice.


observations of a SRS should not show

any noticeable pattern or trend.

24

population perfectly.

each other; occasionally a sample is

substantially different from the population.

variation.

25

knowing the values of some of the items

does not help to predict the values of the

others.

treated as independent in most cases

encountered in practice. The exception

occurs when the population is finite and

the sample comprises a large fraction

(more than 5%) of the population.

26

Samples of Convenience

convenience, may be easier to collect, but

may be nonrepresentative in some

important ways.

making them worthless (or at least a whole

lot less trustworthy).

27

hometowns for all U.S. college students,

but only sample at UND.

students on math anxiety, and pick a class

to interview:

Math 321?

Upper-division English?

28

a new AIDS vaccine. We could give those who consent the vaccine, and leave those who don't alone to be the control group.

vaccinated group with past infection

rates)?


our sample, we are generally interested

only in a small number of characteristics.

called a variable, and assigned a letter

from the end of the alphabet.

30

types:

1) of several distinct groups.
   o X = Sex
   o T = Hair Color
   o W = Zip Code

2) where operations like averages make sense.
   o Y = Age
   o U = Rainfall
   o Z = Volume of milk

many variables we measure on each

individual.

we say the dataset is univariate.

(e.g. age and sex), we say it is bivariate.

trivariate, quadrivariate, and so on, or more

commonly, that it is multivariate.


name (letter) to indicate specific

observations in a dataset, such as X1, X2, …, Xn.

A subscript of i (occasionally j or k)

indicates a specific, but arbitrary,

observation.

number of observations (the sample size).


statistics:

1)

simplify and understand a dataset.

2)

something about the broader population or

distribution from which the data was

drawn.

start there.

Math 321 - Dr. Minnotte

34

11

sample statistics to summarize the dataset.

calculated from a dataset. A sample statistic

simply makes clear that it derives from a

sample.

understanding of the data, as well as make it

easier to communicate with others about it.

Math 321 - Dr. Minnotte

35

describe is generally its location, or the

location of its center.

is the familiar average, or sample mean.

…, Xn is

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

Example: Stocks:

37

12

suppose we were to take a very thin yardstick or similarly marked board, and place a small (equal) weight at the mark for each observation's value.

where this would balance.

38

Outliers

different from the rest of the sample. For

univariate data, this means it is much larger

or much smaller than the rest.

they are the result of measurement or

recording errors.

but unusual values, however, should be kept.


to outliers). Changing even one

observation can change the sample mean

as much as we want.

374 (instead of 37.4). What is the sample

mean now?

40

13

Measures of Variability

feature to describe a sample is its

variability, or spread.

41

range, the difference between the

maximum and minimum values.

R = max(X) − min(X)

of the data, and is maximally non-robust,

using only the two extreme data points, so

it is rarely used.


from the mean,

This removes the

effect of the mean (location), and looks

only at the variability around the mean.

from the mean.

negative ones, and the average deviation

from the mean is always 0.

Math 321 - Dr. Minnotte

43

14

deviations, but for a few theoretical reasons, it's better to look at the squared deviations instead.

measures the spread of a dataset.

s, is the square root of the sample

variance.

Math 321 - Dr. Minnotte

44

it requires finding and squaring each of the

n deviations from the mean.

the following computation formula.
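The computing formula itself did not survive in these slides. As a hedged sketch, the block below assumes the common shortcut s² = (ΣXi² − n·X̄²)/(n − 1) and checks it against the definitional form on a small hypothetical sample (not the course's stocks data).

```python
import math

# Hypothetical sample, for illustration only.
x = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
n = len(x)
mean = sum(x) / n

# Definitional form: sum of squared deviations from the mean, divided by n - 1.
s2_def = sum((xi - mean) ** 2 for xi in x) / (n - 1)

# Assumed shortcut form: (sum of squares - n * mean^2) / (n - 1).
s2_short = (sum(xi ** 2 for xi in x) - n * mean ** 2) / (n - 1)

# The sample standard deviation is the square root of the sample variance.
s = math.sqrt(s2_def)
print(s2_def, s2_short, s)
```

Both forms give the same variance; the shortcut only avoids computing each deviation separately.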

45

standard deviation of the stocks data?

46

15

deviation are measures of the spread of a

dataset, and estimates of the variance and

standard deviation of the underlying

population or distribution.

Example: Stocks, replace 37.4 with 374:

s² = ?

s=?

Math 321 - Dr. Minnotte

theoretically, the variance and standard

deviation are a little tricky intuitively.

47

About 95% of data should fall in $\bar{X} \pm 2s$.

Almost all data should fall in $\bar{X} \pm 3s$.

48

where a and b are constants, then

change units for our data.

49

16

temperatures measured in degrees Celsius, with $\bar{X}$ = 30. Let $Y_1, \ldots, Y_n$ be the same temperatures in degrees Fahrenheit, $Y_i = \frac{9}{5} X_i + 32$. What is $\bar{Y}$?

temperatures be $s_X^2$ = 25. What is the variance of the Fahrenheit temperatures? The s.d.?
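A quick numerical check of the unit-change rule may help here. The sketch assumes the standard result that Y = aX + b has mean a·X̄ + b and variance a²·s², and uses three hypothetical Celsius readings chosen only so that their mean is 30 and their variance is 25.

```python
import statistics

# Hypothetical Celsius readings with sample mean 30 and sample variance 25.
celsius = [25.0, 30.0, 35.0]
a, b = 9 / 5, 32  # Fahrenheit = a * Celsius + b

fahrenheit = [a * c + b for c in celsius]

mean_c = statistics.mean(celsius)
var_c = statistics.variance(celsius)      # divides by n - 1
mean_f = statistics.mean(fahrenheit)
var_f = statistics.variance(fahrenheit)
sd_f = statistics.stdev(fahrenheit)

# Compare the direct results with the rule: the shift b drops out of the spread.
print(mean_f, a * mean_c + b)
print(var_f, a ** 2 * var_c)
print(sd_f, abs(a) * var_c ** 0.5)
```

The mean is shifted and scaled, while the variance is only scaled by a² and the standard deviation by |a|.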

Math 321 - Dr. Minnotte

50

Measures of Center and Spread

ith smallest value when the Xs are sorted.

The minimum is X(1), the second smallest

X(2), and so on up to the maximum, X(n).

51

on.

or last few order statistics, values

computed from middle order statistics will

be very robust.

Math 321 - Dr. Minnotte

52

17

middle of the sorted data.

, is the

order statistic.

(n+2)/2th order statistics.

Example: Stocks:

=?

53

on either side of it.

changing one or a few observations won't change it much, if at all.

and the sample median remains 17.6

54

Quartiles

into quarters.

of the sample from the rest.

statistic.

If (n+1)/4 is not an integer, Q1 is the average of

the two order statistics on either side.

quarter from the rest.

Math 321 - Dr. Minnotte

55

18

Q1 = ?

Q3 = ?

is a robust measure of spread, found as the difference between the sample quartiles, IQR = Q3 − Q1.

change Q1, Q3, or IQR.

56

57

Percentiles

(roughly) p% of the data below it, and

(100-p)% above it.

integer, use that order statistic. If not,

average the two closest order statistics.

names for the 50th, 25th, and 75th

percentiles.
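The order-statistic rule described above can be sketched as a small function. The position formula (n + 1)p/100 is read from the quartile rule (n + 1)/4 in these slides; the function name, the clamping of positions that fall outside 1..n, and the sample values are assumptions made for this illustration.

```python
import math

def slide_percentile(data, p):
    """p-th percentile using the (n + 1) * p / 100 order-statistic position."""
    xs = sorted(data)
    n = len(xs)
    pos = (n + 1) * p / 100
    k = math.floor(pos)
    if k == pos and 1 <= k <= n:
        return xs[k - 1]            # position is an integer: use that order statistic
    k = min(max(k, 1), n - 1)       # assumed handling when the position leaves 1..n
    return (xs[k - 1] + xs[k]) / 2  # otherwise average the two closest order statistics

# Hypothetical samples, for illustration only.
odd_sample = [3, 7, 8, 12, 13, 14, 18, 21, 23, 27, 30]   # n = 11
even_sample = odd_sample[:10]                              # n = 10

q1, median, q3 = (slide_percentile(odd_sample, p) for p in (25, 50, 75))
q1_even = slide_percentile(even_sample, 25)
print(q1, median, q3, q3 - q1, q1_even)
```

Statistical packages use several slightly different interpolation rules, so their quartiles can differ a little from this hand rule.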

Math 321 - Dr. Minnotte

58

19

Variable        N    Mean   StDev  Variance  Minimum    Q1  Median     Q3  Maximum    IQR
Stock Returns  19   15.37   13.66    186.49    -7.20  5.48   17.60  28.90    37.40  23.43

59

for understanding a dataset are graphics

which we can use to look at our data.

large tables or long columns of numbers.

But the human eye is very good at picking

out patterns in pictures.

60

Bar Charts

plot available is usually a simple bar chart.

height proportional to the count

(frequency) or percentage found in that

category.

may also be compared.

Math 321 - Dr. Minnotte

61

20

Category       Count
Perfect           64
Good              47
Satisfactory      33
Fail               6
Total            150

63

categories.

(no truncation!). Otherwise, relative

heights get distorted.

64

21

65

(e.g. poor-fair-good-excellent; not

alphabetical), bars should be sorted in

ascending or descending order. This

makes comparisons between close values

much easier.

66

67

22

may be better served by horizontal bars.

68

69

clarity usually a bad idea.

70

23

categorical variable, but focuses on the

totals for the main category of the bars.

[Figure: bar chart "Individuals on the Titanic" (vertical axis 0 to 1000), showing Survived and Died for 1st Class, 2nd Class, 3rd Class, and Crew.]

counts of the specific combinations of

categories, and is useful for comparing the

distribution of one variable for different

values of the other.

[Figure: bar chart of Titanic counts (vertical axis 0 to 800), showing Died and Survived for 1st Class, 2nd Class, 3rd Class, and Crew.]

Pie Charts

data.

categories represent (all of the) parts of

some whole, and so should always plot

percentages.

76

25

equal to

than comparing heights or lengths. Bar

charts are almost always more effective.

(Probably worse than no chart.)

77

78

79

Minitab:

26

Dotplots

useful for looking at univariate numeric

data, especially when the sample size is

small or there are many ties in the data.

above an appropriate number line. If there

are ties, one dot is stacked for each tied

observation.

Math 321 - Dr. Minnotte

80

the first 25 space shuttle launches.

66, 70, 69, 80, 68, 67, 72, 73, 70, 57, 63, 78, 70, 67, 53, 75, 67, 70, 81, 76, 79, 75, 76, 58, 31

Histograms

data.

shape of the distribution of the data.

sample, the shape is also descriptive of

the population the sample was taken from.

plots, which are similar, but rarely used.

Math 321 - Dr. Minnotte

82

27

Constructing a Histogram

Find the minimum and maximum of the

data.

1)

2)

large samples, less for small ones.

A reasonable rule of thumb is

width.

Math 321 - Dr. Minnotte

3)

relative frequencies (fi = ni/n) in each

class.

4)

class whose height equals fi or ni.
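The construction steps can be sketched in a few lines of code. The data are the 25 shuttle launch temperatures listed earlier in these slides; the class width of 10 and the starting edge of 30 are assumptions chosen for this illustration, since the slide's rule of thumb did not survive extraction.

```python
temps = [66, 70, 69, 80, 68, 67, 72, 73, 70, 57, 63, 78, 70,
         67, 53, 75, 67, 70, 81, 76, 79, 75, 76, 58, 31]

width = 10   # assumed class width
start = 30   # assumed left edge of the first class (at or below the minimum)

# Count the observations n_i falling in each class [left, left + width).
counts = {}
for t in temps:
    left = start + width * ((t - start) // width)
    counts[left] = counts.get(left, 0) + 1

# Print frequency, relative frequency f_i = n_i / n, and a text "bar" per class.
n = len(temps)
for left in range(start, max(temps) + 1, width):
    n_i = counts.get(left, 0)
    print(f"[{left}, {left + width}): n_i = {n_i:2d}, f_i = {n_i / n:.2f}  " + "*" * n_i)
```

Rerunning with a different width or start shows how much the apparent shape depends on those choices.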

83

84

1976-1995):

85

28

the distribution. Some things to look for

include:

Symmetric?

Right-skewed?

Multimodal?

Math 321 - Dr. Minnotte

86

87

of bin width and location, as different

choices here can produce dramatically

different histograms.

are likely to be trustworthy; those that only

appear sometimes are less certain.

88

29

89

90

91

30

92

93

Boxplots

tool for displaying a sample:

94

31

quartile, with a line at the median.

as any values below Q1 − 1.5 IQR or above Q3 + 1.5 IQR.

least and greatest values among the non-outliers.
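As a sketch of the boxplot rules, the block below applies the 1.5 IQR fences to the 25 shuttle launch temperatures listed earlier in these slides. The quartile helper follows the (n + 1)/4 order-statistic rule from the quartiles slide; the function and variable names are assumptions for this illustration.

```python
import math

temps = [66, 70, 69, 80, 68, 67, 72, 73, 70, 57, 63, 78, 70,
         67, 53, 75, 67, 70, 81, 76, 79, 75, 76, 58, 31]

def quartile(sorted_xs, q):
    """q-th quartile (q = 1, 2, 3) at position q * (n + 1) / 4 in sorted data."""
    pos = (len(sorted_xs) + 1) * q / 4
    k = math.floor(pos)
    if k == pos:
        return sorted_xs[k - 1]
    return (sorted_xs[k - 1] + sorted_xs[k]) / 2

xs = sorted(temps)
q1, q3 = quartile(xs, 1), quartile(xs, 3)
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Outliers lie strictly beyond the fences; whiskers end at the extreme non-outliers.
outliers = [x for x in xs if x < low_fence or x > high_fence]
inside = [x for x in xs if low_fence <= x <= high_fence]
whiskers = (min(inside), max(inside))
print(q1, q3, iqr, low_fence, high_fence, outliers, whiskers)
```

Under this rule only the 31°F launch is flagged; the 53°F launch sits exactly on the lower fence and so is not counted as an outlier.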

Math 321 - Dr. Minnotte

95

histograms for a single distribution, so the

histogram is usually preferable.

is difficult, while comparing boxplots is

easy.

distributions.

96

97

32

98

relationships between variables.

at pairs of measurements made on the

same subjects, (x, y).

variables).

Math 321 - Dr. Minnotte

99

Examples:

ACT score and Freshman GPA for college

students.

January and April average temperatures for

many years at a specified location.

January and February inflows of the Nile river

at a location.

100

33

cause-and-effect relationship.

variable, x, is assumed to play some role

in determining the value of the response

(dependent) variable, y.

x

101

Scatterplots (2.1)

common graph for displaying bivariate

data. It consists of plotting each point at

(xi, yi), on a standard x-y graph.

describes the relationship between the

variables.

102

103

34

104

105

106

35

Minitab Scatterplot:

107

Correlation

and compute the sample means, and

product of the two deviations from the

means.

results in two quadrants where the product

is positive, and two where it is negative.

108

109

36

relationship, most of the products will have

a positive sign, and the sum will be

positive.

relationship, the sum of the products will

be negative.

depends on the units and spread (as

measured by standard deviation) of the

variables.

Math 321 - Dr. Minnotte

110

solves this issue.

Then

is a good, unitless

measure of the linear relationship between x

and y called the correlation coefficient.
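The formula for r did not survive extraction. The sketch below assumes the usual definition, the sum of products of standardized deviations divided by n − 1, and uses a small hypothetical dataset. It also checks the unit-free property by converting x from Celsius to Fahrenheit.

```python
import math

def correlation(x, y):
    """Sample correlation: average product of standardized deviations (n - 1 divisor)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    products = (((a - mean_x) / sd_x) * ((b - mean_y) / sd_y) for a, b in zip(x, y))
    return sum(products) / (n - 1)

# Hypothetical paired data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = correlation(x, y)
r_rescaled = correlation([9 / 5 * a + 32 for a in x], y)  # change the units of x
print(round(r, 4), round(r_rescaled, 4))
```

Because each deviation is divided by its own standard deviation, rescaling either variable leaves r unchanged.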

Math 321 - Dr. Minnotte

What is r?

Math 321 - Dr. Minnotte

111

112

37

Properties of r

1.

or y. We will not change r if we multiply all x's, all y's, or both by a positive constant or if we add any constant to all x's, all y's, or both.

2.

variable is labeled x.

3.

4.

between x and y is positive or negative.

Math 321 - Dr. Minnotte

113

Properties of r (continued)

5.

linear relationship between x and y. Roughly

speaking:

a.

b.

c.

d.

If 0.5 < |r| < 0.8, the association is moderate.

If 0.8 < |r| < 1.0, the association is strong.

If |r| = 1.0, the association is perfect. This occurs only

when all (x, y) points fall in a perfect line.

Note that strength is often context- and discipline-dependent. An engineer might find any correlation less than .95 to be weak, while a social scientist might find a correlation of .3 to be very strong.

114

115

38

116

Properties of r (continued)

6.

strength of a nonlinear (curved) relationship.

7.

117

118

39

association, not necessarily causality.

explanations:

1)

2)

3)

x determines y

y determines x

Some third value, z, (called a confounding

factor) determines both x and y.

119

capita chocolate consumption is strongly

correlated with traffic fatalities.

chocolate be outlawed?

Do people eat a lot of chocolate at funerals?

Is there a third explanation that makes more

sense?

120

Massachusetts are strongly correlated with

the price of rum in Havana. What is the

causal relationship here?

correlated with size of vocabulary. What is

the causal relationship?

121

40

randomized, controlled experiments is that

potential confounding factors should be

(roughly) balanced between levels of the

independent variable we are investigating,

so should be much less likely to produce a

spurious correlation.

122

and predicting the values of one response

variable, based on the observed values of

one or more other explanatory variables.

regression, where a straight line is fit to a

scatterplot of x and y.

123

uses the least squares fit, minimizing

124

41

125

126

and what should we predict the flow for February to be if January's was 3?

127

42

128

January value of 10?

(Recall, January's mean is about 4, and its standard deviation is about 1.)

is dangerous.

Math 321 - Dr. Minnotte

129

associated fitted regression model, the

fitted value for observation i is

the regression line are at predicting y.

130

43

computing formula:

Math 321 - Dr. Minnotte

131

132

measures the proportion of the total

variation of y which is explained by x:

the relationship is at explaining the

variation in y.

determination is the square of the

correlation coefficient.

Math 321 - Dr. Minnotte

133

44

Note: r = 0.933.

Math 321 - Dr. Minnotte

134

in Minitab output.

column of the Analysis of Variance table.

February Inflow = - 0.4698 + 0.8362 January Inflow

S = 0.330519   R-Sq = 87.1%   R-Sq(adj) = 87.0%

Analysis of Variance
Source       DF       SS       MS       F      P
Regression    1  83.3794  83.3794  763.25  0.000
Error       113  12.3444   0.1092
Total       114  95.7238
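The numbers in this Minitab output can be tied back to the earlier slides with a short calculation. The sketch uses the fitted equation and the sums of squares printed above, and assumes the usual identity R² = SS(Regression) / SS(Total); the function name is hypothetical.

```python
import math

# Coefficients and sums of squares copied from the Minitab output.
intercept, slope = -0.4698, 0.8362
ss_regression, ss_error, ss_total = 83.3794, 12.3444, 95.7238

def predict_february(january_inflow):
    """Fitted February inflow for a given January inflow."""
    return intercept + slope * january_inflow

prediction_at_3 = predict_february(3)

# Proportion of the variation in February inflow explained by January inflow.
r_squared = ss_regression / ss_total
r = math.sqrt(r_squared)  # positive root, since the fitted slope is positive

print(round(prediction_at_3, 4), round(r_squared, 3), round(r, 3))
```

A January value of 10 is far outside the observed range (mean about 4, standard deviation about 1), so the same equation should not be trusted there.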

135

Chapter 3: Probability

mathematics dealing with chance,

randomness, and uncertainty.

mathematical foundation for inferential

statistics.

136

45

outcome cannot be determined in advance

is called an experiment.

Examples:

The draw of a card.

The lifetime of an electronic component.

experiment is the set of all possible

outcomes.

Examples:

137

Die: S = {1, 2, 3, 4, 5, 6}

Card: S = ?

Component: S = ?

visually represented by a tree diagram:

138

139

46

Events

Definition: Set A is a subset of set B (A ⊂ B) if every element of A is also in B.

Example: S = {1, 2, 3, 4, 5, 6}

A = {1, 3, 5} ⊂ S

B = {1, 2, 6, 7} ⊄ S

elements, is a subset of every set.

Math 321 - Dr. Minnotte

sample space can be called an event.

Examples:

140

Card: B = ?

Component: C = ?

are sometimes called simple events.

141

Combining Events

1)

consisting of all elements found in A, B, or

both.

Keyword: or

Example: S = {1, 2, 3, 4, 5, 6}

A = {1, 3, 5} ⊂ S

B = {1, 2, 3} ⊂ S

A ∪ B = ?

Math 321 - Dr. Minnotte

142

47

set consisting of all elements found in both

A and B.

2)

Example: S = {1, 2, 3, 4, 5, 6}

A = {1, 3, 5}

B = {1, 2, 3}

A ∩ B = ?

143

consisting of all elements of S not found in

A.

3)

Keyword: not

Example: S = {1, 2, 3, 4, 5, 6}

A = {1, 3, 5}

Ac = ?

144

exclusive if there are no elements in both A and B. That is, if A ∩ B = ∅ (the empty set).

4)

Example: S = {1, 2, 3, 4, 5, 6}

A = {1, 3, 5}

C = {4, 6}

A ∩ C = ∅, so A and C are mutually exclusive.

Math 321 - Dr. Minnotte

145

48

B?

A or B?

Not A?

Math 321 - Dr. Minnotte

146

Definition: A probability function P(·) is a function from subsets of S (events) to the real numbers which satisfies the following axioms of probability:

1) P(S) = 1.

2) 0 ≤ P(A) ≤ 1 for all events A.

3) If A and B are mutually exclusive, P(A ∪ B) = P(A) + P(B).

Math 321 - Dr. Minnotte

147

P(5) = 1/6, P(6) = 1/6.

axiom 3:

P({1,3,5}) = ?

148

49

P(5) = 1/6, P(6) = 3/12 = 1/4.

Note:

P({1,3,5}) = ?

149

probability measures (long-term)

likelihood: if the experiment is repeated

many times, event A should occur roughly

P(A) fraction of the time.

150

additional properties:

1)

events rule, or the opposites rule.

Show:

Note: Since S^c = ∅, P(∅) = 0.

Math 321 - Dr. Minnotte

151

50

P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

2)

rule.

Show:

Note: if A and B are mutually exclusive, P(A ∩ B) = P(∅) = 0, so this is the same as axiom 3.

152

P(5) = 1/6, P(6) = 1/6.

A = {1, 3, 5}, P(A) = 3/6 = 1/2.

B = {1, 2}, P(B) = 2/6 = 1/3.

P(A^c) = ?

A ∩ B = {1}, P(A ∩ B) = 1/6.

P(A ∪ B) = ?
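The fair-die example can be checked by direct counting. This sketch represents events as Python sets and uses exact fractions; the helper name `prob` is an assumption for the illustration, and it relies on all six outcomes being equally likely.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # sample space for one roll of a fair die
A = {1, 3, 5}
B = {1, 2}

def prob(event):
    """Probability of an event when all outcomes in S are equally likely."""
    return Fraction(len(event & S), len(S))

p_not_a = prob(S - A)                                # complement rule: 1 - P(A)
p_union_direct = prob(A | B)                         # count the outcomes in A or B
p_union_rule = prob(A) + prob(B) - prob(A & B)       # general addition rule

print(p_not_a, p_union_direct, p_union_rule)
```

Subtracting P(A ∩ B) corrects for counting the outcome 1 twice, once in A and once in B.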

probability function to use these.

Suppose we know:

153

P(B) = P(40 ≤ T ≤ 80) = .34

P(A ∩ B) = P(40 ≤ T ≤ 60) = .26

Then:

P(T 60) = ?

P(lifetime no more than 80) = ?

Math 321 - Dr. Minnotte

154

51

integrated circuit chip has defective

etching is 0.12. The probability that the

chip has a crack defect is 0.29. And the

probability of both defects is 0.07.

have defective etching?

defect?

defect?

Math 321 - Dr. Minnotte

155

and event A consists of k of them,

P(A) = k/N.

standard deck (52 cards, 13 spades). What

is the probability of drawing a spade?

contains 6 which do not work. If we draw one

at random, what is the probability of selecting

a defective drive?

Math 321 - Dr. Minnotte

156

the outcome of an experiment. In

particular, suppose we know that the event

B has occurred.

probability of another event, A.

conditional probability, as it depends on

the condition of B being true.

Math 321 - Dr. Minnotte

157

52

A = {1, 3, 5}

P(A) = 3/6 = 1/2

B = {1, 2, 3}

P(B) = 3/6 = 1/2

P(A ∩ B) = P({1, 3}) = 2/6 = 1/3

If I roll the die and, without showing you, tell

you event B has occurred (I rolled no greater

than 3), now what is the probability of event

A?

reduces to B: {1, 2, 3}.

A), and the chances are still equal. So

P(A|B) = 2/3.

probability increases to 2/3 that it's odd.

Math 321 - Dr. Minnotte

158

159

given B is

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

(undefined if P(B) = 0).

has occurred, that event A has also

occurred.

Die:

Math 321 - Dr. Minnotte

160

53

P(crack defect) = 0.29.

P(etching and crack defects) = 0.07.

(conditional) probability that it also has

defective etching?

161

defect but satisfactory etching?

probability that it has satisfactory etching?

P(A) =1 - P(Ac).

Math 321 - Dr. Minnotte

162

probability that it also has a crack defect?
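The chip questions all follow from the three given probabilities. The sketch below assumes the usual definition P(A|B) = P(A ∩ B)/P(B) and uses exact fractions; the variable names are chosen for this illustration.

```python
from fractions import Fraction

p_etch = Fraction(12, 100)    # P(defective etching)
p_crack = Fraction(29, 100)   # P(crack defect)
p_both = Fraction(7, 100)     # P(etching and crack defects)

# Given a crack defect, probability of defective etching.
p_etch_given_crack = p_both / p_crack

# Crack defect but satisfactory etching: remove the overlap from the crack event.
p_crack_only = p_crack - p_both

# Given a crack defect, probability of satisfactory etching (complement rule).
p_ok_etch_given_crack = 1 - p_etch_given_crack

# Given defective etching, probability of a crack defect.
p_crack_given_etch = p_both / p_etch

print(float(p_etch_given_crack), float(p_crack_only),
      float(p_ok_etch_given_crack), float(p_crack_given_etch))
```

Note that P(etching | crack) and P(crack | etching) differ, because each divides the same overlap by a different conditioning probability.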

163

54

Independence

A and B are independent.

P(B)>0, then

this as the definition of independence.

Math 321 - Dr. Minnotte

164

P(A|B) = P(A)

P(B|A) = P(B)

165

well-shuffled deck. Define:

A = {draw a club}

B = {draw an ace}

C = {draw a red card}
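Independence of these card events can be checked by enumerating a standard 52-card deck and testing whether P(A ∩ B) = P(A)P(B). The deck representation and helper names are assumptions for this sketch.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(ranks, suits))   # 52 equally likely (rank, suit) cards

def prob(event):
    """Probability that a single random card satisfies the event."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_club(card):
    return card[1] == "clubs"

def is_ace(card):
    return card[0] == "A"

def is_red(card):
    return card[1] in ("diamonds", "hearts")

def independent(e1, e2):
    """True when P(e1 and e2) equals P(e1) * P(e2)."""
    return prob(lambda c: e1(c) and e2(c)) == prob(e1) * prob(e2)

print(independent(is_club, is_ace))   # club vs. ace
print(independent(is_club, is_red))   # club vs. red card
print(independent(is_ace, is_red))    # ace vs. red card
```

Clubs and red cards never occur together, so they are mutually exclusive and, having nonzero probabilities, cannot be independent.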

166

55

and their being independent is not the

same thing.

are mutually exclusive, they cannot be

independent!

167

calculate probabilities of intersections.

A = {red 6}

B = {black 6}

P(A) = 1/6

P(B) = 1/6

(fair dice)

other, so we assume independence.

= (1/6)(1/6) = 1/36.

Math 321 - Dr. Minnotte

168

events says that if events A1, A2, …, An are independent (that is, knowledge of any combination of the Ai's does not change the probabilities of the remainder), then

$$P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1)\,P(A_2) \cdots P(A_n)$$

events occur.

Math 321 - Dr. Minnotte

169

56

P(Ai) = 1/2,

i = 1, 2, 3, 4

Separate flips are independent. (Why?)

P(4 heads) = P(A1 ∩ A2 ∩ A3 ∩ A4)
= P(A1) P(A2) P(A3) P(A4)
= (1/2) (1/2) (1/2) (1/2)
= 1/16.

170

deck 3 times with replacement (replace

and reshuffle after each draw).

P(Ai) = 13/52 = 1/4, i = 1, 2, 3

Separate draws are independent. (Why?)

P(3 spades) = ?

Recall,

171

Math 321 - Dr. Minnotte

172

57

labeled 1, 2, 3, and 4. Suppose we

draw two at random without replacement.

What is the probability both cards are

odd?

173

random without replacement from a

standard deck. What is the probability

both cards are spades?
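Without replacement, the second draw depends on the first, so the general multiplication rule P(A ∩ B) = P(A)P(B|A) applies. The sketch below works both examples with exact fractions and verifies the four-card case by listing every ordered draw; the variable names are assumptions for this illustration.

```python
from fractions import Fraction
from itertools import permutations

# Two cards from a standard deck: 13 of 52 are spades, then 12 of the remaining 51.
p_two_spades = Fraction(13, 52) * Fraction(12, 51)

# Four cards labeled 1-4, two drawn without replacement.
p_both_odd_rule = Fraction(2, 4) * Fraction(1, 3)

# Brute-force check: enumerate all ordered pairs of distinct cards.
draws = list(permutations([1, 2, 3, 4], 2))
both_odd = [d for d in draws if d[0] % 2 == 1 and d[1] % 2 == 1]
p_both_odd_count = Fraction(len(both_odd), len(draws))

print(p_two_spades, p_both_odd_rule, p_both_odd_count)
```

With replacement the spade probability would instead be (1/4)(1/4) = 1/16, slightly larger than the value found here.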

174

number. It is obtained by assigning a

number to each outcome of an

experiment.

a random variable.

175

58

sequence of heads and tails a random

variable (Example: HHTHT)?

generate from 5 coin flips:

X = # H

Y = # H − # T

Z = # H before first T

capital letters from the end of the alphabet.

176

large colony. What are some possible

random variables?

177

variables: discrete and continuous.

only take on a specified (countable) list of

values. There is a gap between any two

elements in its sample space.

sort, and thus whole numbers.

Math 321 - Dr. Minnotte

178

59

may take any real number in some (set of)

interval(s).

discrete and continuous random variables.

Math 321 - Dr. Minnotte

179

Definition: The probability mass function

(p.m.f.) of a discrete random variable X is

a function p() from the support of X to the

real numbers, where

p(x) = P(X = x) .

Notation:

x: lowercase letter, indicates a specific value.

Math 321 - Dr. Minnotte

180

S = {1, 2, 3, 4, 5, 6}

p(1) = P(X = 1) = 1/6

p(2) = P(X = 2) = 1/6

and so on.

We might write

p(x) = 1/6, x ∈ {1, 2, 3, 4, 5, 6}

181

60

machines. The probability that X are

operating at a given random time may be

found from

x

p(x)

1) ? ≤ p(x) ≤ ?

2) $\sum_{x \in S} p(x) = ?$

182

183

184

61

equal to probabilities:

185

186

187

62

take any value in some real interval.

measurements (length, weight, lifetime,

etc.).

can't use a p.m.f. to find probabilities.

Instead:

(density, p.d.f.), f(x), is a function which

determines the probability properties of a

continuous random variable. If X ~ f(x),

then

188

189

If f(x) is a p.d.f.:

f(x) ?

Why?

Math 321 - Dr. Minnotte

190

63

has p.d.f.

191

probability that X will be between 0.5 and

1.0?

P(2.5 ≤ X ≤ 3.0) = ?

P(0.2 ≤ X ≤ 0.2) = ?

192

193

64

function (c.d.f.), F(x), of a random variable

is defined as

F(x) = P(X ≤ x).

If X is continuous,

194

1)

$\lim_{x \to -\infty} F(x) = 0$

2)

$\lim_{x \to \infty} F(x) = 1$

3)

4)

P(a ≤ X ≤ b) = P(X ≤ b) − P(X ≤ a)
= F(b) − F(a).

This is often easier than integrating f(x).

Math 321 - Dr. Minnotte

P(0.5 ≤ X ≤ 1.0) = ?

195

Math 321 - Dr. Minnotte

196

65

Definition: The population mean (expectation, expected value) of random variable X is

$$\mu = E(X) = \sum_{x} x\,p(x)$$

if X is discrete, and

$$\mu = E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx$$

if X is continuous.

It can be thought of as the long-term average

of X, or the mean of a sample that follows the

distribution of X perfectly.

197

=?

Example: Machines

x      0     1     2     3
p(x)   0.12  0.27  0.46  0.15

μ = ?

198

199

Example:

=?

Example:

=?

66

Expectations of Functions of

Random Variables

are really interested in a function, h(X).

The expected value of h(X) is

if X is discrete, and

if X is continuous.

Math 321 - Dr. Minnotte

Example: X ~ p(x) = , x = 1, 2.

What is E(X²)?

E(X)? [E(X)]²?

Is E(X²) = [E(X)]²?

Math 321 - Dr. Minnotte

200

201

Standard Deviation

measure of the center of a distribution, the population variance and standard deviation measure a distribution's spread.

202

67

mean μ. Then the population variance of X, σ², is

$$\sigma^2 = V(X) = E[(X - \mu)^2]$$

deviation, σ, of random variable X is the square root of the variance of X.

203

=?

E(X2) = ?

V(X) = ?

=?

=?

E(X2) = ?

V(X) = ?

=?

204

Example: Machines

x      0     1     2     3
p(x)   0.12  0.27  0.46  0.15

μ = ?

E(X²) = ?

V(X) = ?

σ = ?
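For the machines example, the mean, variance, and standard deviation follow directly from the p.m.f. table. The sketch assumes the shortcut V(X) = E(X²) − μ², which is suggested by the order of the questions on the slide; the variable names are chosen for this illustration.

```python
import math

# p.m.f. of X = number of machines operating, copied from the slide.
pmf = {0: 0.12, 1: 0.27, 2: 0.46, 3: 0.15}

# A valid p.m.f. has probabilities that sum to 1.
total = sum(pmf.values())

mu = sum(x * p for x, p in pmf.items())           # E(X): probability-weighted average
e_x2 = sum(x ** 2 * p for x, p in pmf.items())    # E(X^2)
variance = e_x2 - mu ** 2                         # assumed shortcut: V(X) = E(X^2) - mu^2
sigma = math.sqrt(variance)

print(round(total, 10), round(mu, 4), round(e_x2, 4), round(variance, 4), round(sigma, 4))
```

Note that E(X²) is not the same as [E(X)]²; the gap between them is exactly the variance.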

205

68

Example:

=?

E(X2) = ?

V(X) = ?

=?

206

Variables (3.4)

combination) of variables x1, x2, …, xn, is a function of the form

f(x1, x2, …, xn) = a1x1 + a2x2 + … + anxn + b

where b and all of the ai's are fixed constants.

Math 321 - Dr. Minnotte

207

and known constants a1, a2, …, an, and b, then

E(a1X1 + a2X2 + … + anXn + b) = a1E(X1) + a2E(X2) + … + anE(Xn) + b.

combination of random variables, we need

only know the constants and the

expectation of each random variable

individually.

Math 321 - Dr. Minnotte

208

69

measured in degrees Celsius, with E(X) =

10. Let Y be the same temperature in

degrees Fahrenheit, Y = 9/5 X + 32. What

is E(Y)?

fair die is 3.5. What is the expectation of

the sum of four such rolls?

Math 321 - Dr. Minnotte

209

if knowledge of one does not affect the

probability of the other.

independent if knowing the value of X does not affect probabilities of Y, no matter what value X takes (and vice versa).

Math 321 - Dr. Minnotte

210

involving X alone will be independent from

any event involving Y alone.

P(X ∈ A and Y ∈ B) = P(X ∈ A) P(Y ∈ B)

for any A and B.

independent, but may be treated as

though they are if the sample size is much

smaller than the population size.

Math 321 - Dr. Minnotte

211

70

then

$$V(a_1X_1 + a_2X_2 + \cdots + a_nX_n + b) = a_1^2 V(X_1) + a_2^2 V(X_2) + \cdots + a_n^2 V(X_n).$$

Notes:

The coefficients ai are squared.

Dependent random variables require a more

complex formula.

Math 321 - Dr. Minnotte

212

temperature X be V(X) = 25.

213

die is 35/12. What is the variance of the

sum of four such rolls?

what is the variance of the result? Why is

this different?
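The two dice questions can be compared numerically. The sketch assumes the rule V(a1X1 + … + anXn) = a1²V(X1) + … + an²V(Xn) for independent variables, and reads the second question as rolling once and multiplying the result by 4; that reading is an assumption, since part of the slide text is missing.

```python
from fractions import Fraction

faces = range(1, 7)

# Mean and variance of one fair die roll, from its p.m.f. p(x) = 1/6.
mean_one = sum(Fraction(x, 6) for x in faces)
var_one = sum(Fraction(x * x, 6) for x in faces) - mean_one ** 2

# Sum of four independent rolls: each coefficient is 1, so the variances add.
mean_sum4 = 4 * mean_one
var_sum4 = 4 * var_one

# One roll multiplied by 4 (assumed reading): the coefficient 4 is squared.
var_times4 = 4 ** 2 * var_one

print(mean_one, var_one, mean_sum4, var_sum4, var_times4)
```

Multiplying a single roll by 4 spreads the values over 4, 8, …, 24 with nothing in between, which is why its variance is four times that of the sum of four separate rolls.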

214

71

variance 4. What are the mean and

variance of Z = X Y?

215

the Sample Mean

sample mean of the Xis,

Note that

Xis.

216

random variables, each with E(Xi) = μ and V(Xi) = σ², then

$$E(\bar{X}) = \mu$$

and

$$V(\bar{X}) = \frac{\sigma^2}{n}.$$

Proof:

217

72

probability p of coming up heads. We flip

it and let X = 1 if heads, 0 if tails.

218

which represent entire families of

distributions.

constants (called parameters) which must be

specified to define a specific distribution.

important families, the binomial and normal

distributions.

Math 321 - Dr. Minnotte

219

important common named family of

discrete distributions.

by a probability mass function p(), where

p(0) = P(X = 0)

p(1) = P(X = 1)

and so on.

Math 321 - Dr. Minnotte

220

73

with only two possible outcomes.

with probability p.

occurs with probability (1 − p).

(after 17th-century probabilist James

Bernoulli).

number of independent identical Bernoulli

trials, and counts the number of successes.

Math 321 - Dr. Minnotte

221

are made in pairs, and that 30% of all

chips produced are defective.

independent of each other.

the second is defective in 30% of pairs.

This remains true for pairs in which the

first chip is defective.

Math 321 - Dr. Minnotte

222

chip. Out of those, 70% will also have a

good second chip. Overall, 70% of 70%, or

49% (.7*.7 = .49) will have two good chips.

(.7*.3 = .21) will have a good first chip and a

defective second chip.

defective first chip, and 70% of those (21%

overall) will have a good second chip.

chips defective.


represent a good chip, and F (for failure)

represent a defective one, we can

summarize as:

P(SS) = .7*.7 = .49

P(SF) = .7*.3 = .21

P(FS) = .3*.7 = .21

P(FF) = .3*.3 = .09

produced in a pair.


p(2) = P(X = 2) = P(SS) = .49

p(1) = P(X = 1) = P(SF or FS) = .21 + .21 = .42

4?

consisting of 2 good and 2 defective chips,

we can think about the case of SSFF: the

first and second chips are good, while the

third and fourth are defective.

will be .7*.7*.3*.3 = .0441 or 4.41%.


successes and two failures in 5 other

ways, in this case:

P(SFSF) = .7*.3*.7*.3 = .0441

P(SFFS) = .7*.3*.3*.7 = .0441

P(FSSF) = .3*.7*.7*.3 = .0441

P(FSFS) = .3*.7*.3*.7 = .0441

P(FFSS) = .3*.3*.7*.7 = .0441

=.2646.
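The count of six arrangements and the total of .2646 can be confirmed by brute-force enumeration. This is a small sketch using the slide's values P(S) = .7 and P(F) = .3; the function and variable names are chosen for illustration only.

```python
from itertools import product

P_GOOD = 0.7  # P(S): a chip is good (from the slides)
P_BAD = 0.3   # P(F): a chip is defective

def sequence_probability(seq):
    """Probability of one specific S/F sequence, assuming independent chips."""
    prob = 1.0
    for outcome in seq:
        prob *= P_GOOD if outcome == "S" else P_BAD
    return prob

# All length-4 sequences with exactly two good chips.
two_good = ["".join(s) for s in product("SF", repeat=4) if s.count("S") == 2]
p_two_good = sum(sequence_probability(seq) for seq in two_good)

print(two_good)              # the six arrangements, SSFF first
print(round(p_two_good, 4))  # 0.2646
```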


experiment consisting of n independent

Bernoulli trials.

wish to count are called successes, and

occur with probability p.

these occur with probability (1 − p).

full experiment.


the number of successes in the

experiment, has a binomial distribution

with parameters n and p.

X ~ Binomial(n, p) or X ~ Bin(n, p).

factorial.

arrangements, and is found as

n! = n × (n−1) × (n−2) × … × 2 × 1.

0 objects, we define 0! = 1.


produced in batches of 4. Let X be the

number of good chips in a batch.

What is p(2)?

will contain no more than one good chip?
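The two batch questions can be answered with the binomial mass function, C(n, x) p^x (1 − p)^(n − x). The sketch below assumes the chips are the same ones as in the earlier example, so that a chip is good with probability 0.7; that value is an assumption carried over, since the first line of this slide is missing.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_cdf(x, n, p):
    """P(X <= x) for X ~ Bin(n, p)."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))

n, p = 4, 0.7  # batch of 4 chips; p = 0.7 is assumed from the earlier slides

p_exactly_two = binomial_pmf(2, n, p)
p_at_most_one = binomial_cdf(1, n, p)

print(round(p_exactly_two, 4))  # 0.2646
print(round(p_at_most_one, 4))  # 0.0837
```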


pure yellow peas leads to pods where p =

P(yellow) = .

probability that a random pod will contain 6

yellow seeds?

will contain at least 6 yellow seeds?


calculations by providing probabilities of

P(X ≤ x) for n ≤ 20 and certain values of p.

from a standard deck, and let X = number

of spades drawn.


variance may generally be found as a

function of the parameters.

each pod contains 8 seeds, what is the mean

number of yellow seeds per pod?

Example: If we have 4 fair coins which we flip

as a batch, what is the mean number of

heads?


σ² = np(1 − p).

What are the variance and standard deviation

of X?

What are the variance and standard deviation

of X?


random samples) are not independent.

though they are independent (including

binomial calculations) as long as the

sample size is small (less than 5%)

compared to the population size.


components contains 7% defective. We

sample 8 at random.

components in our sample?

defective?

in our sample?


distribution has two parameters, μ and σ².

If X ~ N(μ, σ²),

and is also very important theoretically.


bell-shaped curve, symmetric around, and with its peak at, μ. E(X) = μ.

Its width is determined by σ²; large values of σ² imply a wide, low curve, while small values imply a narrow, tall one. V(X) = σ².

normal distribution, with μ = 0 and σ² = 1.

variables with the letter Z.

density of Z is f(z) = (1/√(2π)) e^(−z²/2).

normal probability density function, so we

can't find probabilities that way.

computer programs (which themselves

use numeric integration), or tables such as

Table A.2 (p. 521-522, and inside the front

cover of your book) of the standard normal

distribution.


Examples:

P(Z ≤ 1.00) = ?

P(−2.00 ≤ Z ≤ 0.75) = ?
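Software can stand in for Table A.2. The sketch below uses Python's standard-library `statistics.NormalDist` and assumes the two examples ask for P(Z ≤ 1.00) and P(−2.00 ≤ Z ≤ 0.75); the inequality signs did not survive in this copy of the slides, so that reading is an assumption.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# P(Z <= 1.00): the cumulative probability a z-table lists directly.
p_below_one = Z.cdf(1.00)

# P(-2.00 <= Z <= 0.75): the difference of two cumulative probabilities.
p_between = Z.cdf(0.75) - Z.cdf(-2.00)

print(round(p_below_one, 4))  # 0.8413
print(round(p_between, 4))    # 0.7506
```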


converting to standard units.

inequality the same way.


P(X 6.00) = ?


Normal Percentiles

distribution has p% of the probability below

it, and (100 − p)% above.

distribution using Table A.2 again, but

reading from the inside out.

table, start there.

Read to the outside to find the percentile.


percentile of Z?


find the desired percentile for the standard

normal, then use the fact that since

Z = (X − μ)/σ, therefore X = μ + σZ.

percentile of X?
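Reading the table "from the inside out" corresponds to the inverse of the cumulative distribution function. Here is a minimal sketch using `statistics.NormalDist.inv_cdf`; the example values (90th percentile, μ = 100, σ = 15) are hypothetical, since the slide's own numbers are missing.

```python
from statistics import NormalDist

def normal_percentile(p_percent, mu=0.0, sigma=1.0):
    """p-th percentile of N(mu, sigma^2): find the Z percentile, then use X = mu + sigma*Z."""
    z = NormalDist().inv_cdf(p_percent / 100)
    return mu + sigma * z

z_90 = normal_percentile(90)                     # 90th percentile of Z
x_90 = normal_percentile(90, mu=100, sigma=15)   # hypothetical X ~ N(100, 15^2)

print(round(z_90, 4))  # 1.2816
print(round(x_90, 2))
```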


distributions, there are a number of other

named families of distributions with useful

properties.

(Section 4.2) is useful for modeling random

counts in a fixed interval of time or space.

lognormal, exponential, gamma, and Weibull

distributions, which are useful for modeling

continuous histograms which are positively

skewed and unimodal.


some distribution f. (X ~ f)

random variables, X1, …, Xn, independently

from f.

from f.

Sometimes we say that X1, …, Xn are i.i.d.

(independent and identically distributed) from

f.


compute sample statistics such as the

mean,

is

and since it is a number, is itself a

random variable with a distribution.

sampling distribution of X̄ and plays a

large role in inferential statistics.

let X1 and X2 be independent draws from

pX(x).

Now let X̄ = (X1 + X2)/2 be the average of

X1 and X2.

Note that X̄ is also a discrete random

variable, and therefore has a probability

mass function.

What is the mass function (sampling

distribution) of X̄?

histogram of 1000 Xs looks like this:


get a histogram such as this:

Note that


Is centered on 50 (μ);

Is narrower than the solid normal curve for the

individual Xs: the variance and standard

deviation of X̄ are smaller than those of X.

Remains bell-shaped and (roughly?) normal.

statistics and their relationships to the

associated population parameters is the

basis of most of inferential statistics.


estimate a population parameter:

centered on (or at least near) the parameter.

The spread of the sampling distribution will

decrease as the sample size gets larger.

As the sample size gets larger, the shape of

the sampling distribution will usually get more

and more bell-shaped (normal).


Let X̄ be the sample mean of a random sample X1, X2, …, Xn, from a population or process with mean μ and standard deviation σ. Then (recall, Section 3.4):

The mean of the sampling distribution of X̄, μ_X̄, is μ, the population mean, regardless of sample size n.

The standard deviation of the sampling distribution of X̄, σ_X̄, is σ/√n, the population standard deviation divided by the square root of the sample size.

mean, σ_X̄, is often called the standard

error of the sample mean.

sampling distribution, not a population.


more information and can make better

estimates, so the standard error

decreases.

means we have diminishing returns; each

new observation provides less new

information than the previous one.)

X̄ is likely to be to μ.

distribution, the sampling distribution of X̄

is also normal, regardless of sample size.

fills soft drink cans with a volume that has

a normal distribution with σ = 0.05 ounces.

mean, what is the probability that X̄ will be

within 0.04 ounces of the population mean

μ?

important theorem in statistics.

distribution, and provides the justification

of many of the most fundamental statistical

methods.


has a normal distribution, we know that the

sampling distribution of X̄ will also be

normal. This allows us to compute useful

probabilities.

population distribution (or perhaps we

know that it is not normal).


number of independent random variables

has a sampling distribution which is

approximately normal, no matter what

distribution the original random variables

come from.

Theorem.

X2, …, Xn are independent random

variables, from a population or process

with mean μ and standard deviation σ,

then as long as n is sufficiently large,

X̄ is approximately N(μ, σ²/n), and the sum X1 + … + Xn is approximately N(nμ, nσ²).

sums or averages, without knowing the

distribution of the Xi's!
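A quick simulation makes the theorem concrete. The sketch below is an illustration rather than part of the course material: it draws samples from a strongly skewed exponential population (an arbitrary choice), and the sample size, number of repetitions, and seed are all assumed values.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(321)   # fixed seed so the run is repeatable

N = 40             # size of each sample (illustrative)
REPS = 5000        # number of simulated samples (illustrative)
MU = SIGMA = 1.0   # an exponential(1) population has mean 1 and sd 1

# One sample mean per simulated sample.
sample_means = [
    mean(random.expovariate(1.0) for _ in range(N)) for _ in range(REPS)
]

standard_error = SIGMA / sqrt(N)
within_196 = sum(abs(m - MU) <= 1.96 * standard_error for m in sample_means) / REPS

print(f"mean of sample means: {mean(sample_means):.3f} (theory {MU})")
print(f"sd of sample means:   {stdev(sample_means):.3f} (theory {standard_error:.3f})")
print(f"fraction within 1.96 standard errors: {within_196:.3f} (normal theory 0.95)")
```

Even though single observations are far from normal, the sample means cluster around μ with spread close to σ/√n, and roughly 95% fall within 1.96 standard errors.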


required for maintenance on an air-conditioning unit is 1 hour, and the

standard deviation is also 1 hour. A

company operates 50 such units.

maintenance on a single unit requires more

than 2 hours from the information given?

for maintenance will be more than 75

minutes?

maintenance will be less than 40 hours?
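The last two questions can be worked with the Central Limit Theorem. The sketch below assumes (since the first lines of the questions are missing) that they concern the average time over the 50 units and the total time for all 50 units. The first question, about a single unit, cannot be answered this way because only the mean and standard deviation are given, not the shape of the distribution.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1.0, 1.0, 50  # hours per unit, and number of units (from the slide)

# Average time: approximately N(mu, sigma^2 / n).
average_time = NormalDist(mu, sigma / sqrt(n))
p_average_over_75_min = 1 - average_time.cdf(75 / 60)

# Total time: approximately N(n * mu, n * sigma^2).
total_time = NormalDist(n * mu, sigma * sqrt(n))
p_total_under_40_hr = total_time.cdf(40)

print(round(p_average_over_75_min, 3))  # about 0.039
print(round(p_total_under_40_hr, 3))    # about 0.079
```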


enough that the Central Limit Theorem is

reasonable.

much less, often as few as 10, or even

fewer.

50 or more should be fairly safe in all but

the worst cases.


to the Binomial Distribution

V(X) = np(1-p).

binomial distribution which is not very

skewed, the N(np, np(1−p))

distribution can be a good approximation to

the Bin(n, p) distribution.

We usually require that np ≥ 10 and

n(1−p) ≥ 10.

the number of 6s rolled (X).


on the integers, but normal probabilities

are smeared out over the whole real line

(remember the probability histogram).

a continuity correction, by taking the

normal probability from (x - .5) to (x + .5)

to approximate the binomial P(X = x).

P(X ≥ 25) = P(X ≥ 24.5) =

normal approximation to estimate

P(15 < X < 25).
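The continuity correction can be compared against the exact binomial answer. In the sketch below the number of rolls, n = 120, is a hypothetical value (the slide's n is missing); p = 1/6 is the chance of a 6. Because X is an integer, P(15 < X < 25) is the same as P(16 ≤ X ≤ 24), which the normal curve approximates from 15.5 to 24.5.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def normal_approx(lo, hi, n, p):
    """Approximate P(lo <= X <= hi) for X ~ Bin(n, p), with a continuity correction."""
    approx_dist = NormalDist(n * p, sqrt(n * p * (1 - p)))
    return approx_dist.cdf(hi + 0.5) - approx_dist.cdf(lo - 0.5)

n, p = 120, 1 / 6  # n is an assumed, illustrative number of rolls

exact = sum(binomial_pmf(x, n, p) for x in range(16, 25))  # x = 16, ..., 24
approx = normal_approx(16, 24, n, p)

print(f"exact binomial:       {exact:.4f}")
print(f"normal approximation: {approx:.4f}")
```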


inferential statistics.

the distribution in question and wish to

calculate something about particular

outcomes or events.


and wish to use that information to say

something about the population or

distribution the sample was drawn from.

[Diagram: probability reasons from the population to the sample; inferential statistics reasons from the sample back to the population.]

quantity related to a population or

distribution.

be calculated from a dataset.

to tell us something about an unknown

parameter (what we wish we knew).


θ, is a statistic, θ̂, which represents a

best guess for θ.

distribution, X ~ f(x), and we wish to know

the unknown parameter μ = E(X). We

take a sample X1, X2, …, Xn, and estimate

μ with the known statistic X̄.

If X ~ Binomial(n, p) (n known, p

unknown), estimate p with p̂ = X/n.

(median, quartiles, etc.) are good

estimates of the corresponding population

or distribution parameters.


Properties of Estimates

see in a parameter estimate.

estimate should give the correct value for

the parameter. If the mean of the

sampling distribution of our estimate is the

parameter we are estimating, that is,

E(θ̂) = θ, we say that θ̂ is an unbiased

estimate of θ.

so X̄ is an unbiased estimate of μ.

Also, s² and p̂ (proof:) are unbiased estimates of the population variance and proportion.

n to find s².

deviation, s, has E(s) ≠ σ, so s is a biased estimate for σ.

The bias (E(s) − σ, or more generally, E(θ̂) − θ) is small, especially as n gets large.

unbiased, does not guarantee that it will

give you the exact parameter on this (or

possibly, any) sample.

Even though p̂ is unbiased for p, there is

no value of X that will give

unbiased estimate's distribution will be

centered correctly, but it will still have

some spread.


estimate measures that spread and is also

important in measuring how well it performs.

single measure, the mean squared error:

MSE(θ̂) = E[(θ̂ − θ)²] = (bias of θ̂)² + V(θ̂).

variance are small.

independent, with E(X1) = E(X2) = μ and

V(X1) = V(X2) = σ².

Let

Find:


Find:


in learning about a population parameter.

our estimate is likely to be to the

parameter.

error, remembering that we will usually be

within 2-3 standard errors of the parameter

(if we use an unbiased estimate).


know our estimate is incorrect. (We just

don't know by exactly how much.)

expanding our point estimate to an interval

estimate, providing a range of plausible

values for μ.

is that our interval includes μ.

the Central Limit Theorem to give us the

following.


population mean with probability 0.95.

interval.

that are consistent with the data.


body shops for cost to repair a particular

kind of damage have mean $472.36 and

standard deviation $62.35.

the mean of this population?
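A large-sample 95% interval has the form X̄ ± 1.96 s/√n. The sketch below applies it to the body-shop figures. The sample size is missing from this copy of the slide; n = 80 is an assumption, chosen because it reproduces the interval (458.70, 486.02) quoted a little later in the slides.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, s, n, level=0.95):
    """Large-sample confidence interval for a mean: xbar +/- z * s / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    half_width = z * s / sqrt(n)
    return xbar - half_width, xbar + half_width

# Mean and standard deviation from the slide; n = 80 is an inferred assumption.
low, high = z_confidence_interval(xbar=472.36, s=62.35, n=80)

print(f"{low:.2f}, {high:.2f}")  # 458.70, 486.02
```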


Is it correct to say

P(458.70 ≤ μ ≤ 486.02) = 0.95?

statement is random. Recall:

statistics.

parameter, μ.

intervals from independent datasets, we'd

get many different sample means and

sample standard deviations, and each

would lead to a different confidence

interval.

different confidence intervals would

contain the true parameter μ.

and the interval, not in the parameter!

level. We say we are 95% confident that

the population mean lies within the

computed interval.

desired, by replacing the critical value 1.96

with the Z-percentile that gives the

appropriate center probability.

common, but levels of 90% (1.645) and

99% (2.575) are also often used.


above which there is probability p in the

tail of the standard normal distribution.

the standard normal distribution.

use the critical value zα/2.

for an 80% confidence interval?
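Critical values for any confidence level come from the inverse normal cumulative distribution function. This is a minimal sketch; `z_critical` is an illustrative name. Note that software gives 2.576 for 99%, where the table-based value quoted above is 2.575.

```python
from statistics import NormalDist

def z_critical(confidence_level):
    """Two-sided critical value: leaves alpha/2 in each tail of N(0, 1)."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.80, 0.90, 0.95, 0.99):
    print(f"{level:.0%}: {z_critical(level):.3f}")
# 80%: 1.282
# 90%: 1.645
# 95%: 1.960
# 99%: 2.576
```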


of the confidence interval?

s: If s is bigger, X̄ is less accurate, and the

interval must be wider.

Confidence level: To be more confident of

including the true value, we must make the

interval wider.

n: As n gets bigger, the standard error of X̄

gets smaller, and the interval gets narrower.

width (interval half-width) no more than w, we

can compute a (rough) minimum sample size if

we have an estimate or upper bound for s.

critical value to find sample sizes for other

confidence levels.
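Solving z·s/√n ≤ w for n gives n ≥ (z·s/w)², rounded up. The sketch below uses s = 0.0711 from the milk-filling example and a hypothetical target half-width of w = 0.01; both the function name and the value of w are assumptions for illustration.

```python
from math import ceil
from statistics import NormalDist

def minimum_sample_size(s, w, confidence_level=0.95):
    """Smallest n with z * s / sqrt(n) <= w, given an estimate or bound s."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return ceil((z * s / w) ** 2)

n_needed = minimum_sample_size(s=0.0711, w=0.01)  # w = 0.01 is hypothetical
print(n_needed)  # 195
```

Because n grows with the square of 1/w, halving the half-width requires about four times as many observations.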


n = 50, X̄ = 2.0727, s = 0.0711

Find a 95% confidence interval for μ.

w = ?

Confidence Bounds

(or upper) bound on μ.

intervals, also called confidence bounds,

in a similar way to the usual two-sided

case.

replace 1.645 with 1.28, 2.33, or zα,

respectively.


measurements give a mean of

17.17 N/mm2 and a standard deviation of

3.28 N/mm2.

shear strength is great enough, find a 90%

lower bound on μ.

and level to be valid, we must know (or at

least assume) that:

population.

The sample size n is large enough that the

sample mean is approximately normally

distributed and that s is a good estimate of σ.

useful for providing an idea of the value of

a population parameter.

more specific question about a parameter.

For this purpose, we use the other major

branch of inferential statistics, hypothesis

testing.


2.04 L of milk. Recall, a sample of size 50

gave X̄ = 2.0727, s = 0.0711. Does the

machine need to be recalibrated?

machine is working properly, and see how

likely we are to get a sample mean as far

or further from the expected value as the

sample mean we actually saw (2.0727).


hypothesis, H0.

parameter (say, μ), generally that it is

equal to the value of interest (denoted μ0).

everything is as it should be, or nothing

interesting is happening.

Here: H0: μ = 2.04 (= μ0)

H1, that the null is incorrect.

null is incorrect, but this is often the more

interesting or important result.

Here: H1: μ ≠ 2.04

the assumption that H0 is correct.

mean, μ, we usually use the z-statistic:

z = (X̄ − μ0) / (s/√n)

Here: z = ?

If H0 is true,

distribution?

that a sample from the null distribution

would give a test statistic as or more

unusual as the one we just saw.

P-value: P = P(|z| ≥ 3.25) (z ~ N(0,1)).

P(|z| ≥ 3.25) = .0012.

1) unlucky to happen to get the (roughly) 1 in 800 chance to get X̄ ≥ 2.0727 (or the equally unusual X̄ ≤ 2.0073), or

2) H0 is wrong.

decide the filling machine does require

recalibration.
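The milk-machine calculation can be reproduced in a few lines. This sketch computes z = (X̄ − μ0)/(s/√n) and the two-sided P-value from the numbers on the slides; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(xbar, s, n, mu0):
    """Return (z, two-sided P-value) for H0: mu = mu0, large-sample z-test."""
    z = (xbar - mu0) / (s / sqrt(n))
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided

z, p_value = one_sample_z_test(xbar=2.0727, s=0.0711, n=50, mu0=2.04)

print(f"z = {z:.2f}")        # z = 3.25
print(f"P = {p_value:.4f}")  # P = 0.0011, which the z-table rounds to .0012
```

The small difference from .0012 arises because the table works with z rounded to 3.25, while the code keeps all the digits.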


pattern:

1) wish to decide if it reflects a true difference in the population.

2) Identify the null and alternative hypotheses.

3) Compute a test statistic which has a known distribution when the null hypothesis is true.

4) Find a P-value: the probability of a statistic as or more unusual than the one we observed, when the null hypothesis is true.

5) If P is small, reject the null hypothesis. Otherwise, fail to reject it.

tests on different parameters with different

assumptions.

for a single population, we often use the

one sample z-test demonstrated above.


We have a single population, and a

specific value, μ0, we wish to consider for

the population mean.

1)

some related population (see next example).

Or it may be a desired population mean

(example: milk data).

A sample from the population will give a

sample mean different from μ0, even if that is

the actual population mean.

2)

going on. It is usually what we wish to prove.

We should decide if we care about a one-sided or two-sided alternative, ideally before

we ever see data.

Two-sided: H0: μ = μ0 vs. H1: μ ≠ μ0.

One-sided: H0: μ ≤ μ0 vs. H1: μ > μ0

or: H0: μ ≥ μ0 vs. H1: μ < μ0

We always compute z and P using μ0, so

μ = μ0 is always part of H0.

says that college freshmen average 7.5

hours per week at parties.

college.


H0 = ?

H1 = ?


3)

with μ0: z = (X̄ − μ0) / (s/√n)

The average reported time spent at parties

is 6.6 hours, and the standard deviation is

9 hours.

z=?


4)

z* ~ N(0,1), depending on H1:

H1: μ ≠ μ0    P = P(|z*| ≥ |z|)
H1: μ > μ0    P = P(z* ≥ z)
H1: μ < μ0    P = P(z* ≤ z)

new sample would give a statistic

which

disagrees with H0 at least as much as the

statistic

we have.

X̄ = 6.6 hours, s = 9 hours.

z=?

P=?


5)

Values of 0.05 or 0.01 are most commonly

used.

H0, and we say we reject H0 (at the α level).

We have strong evidence of H1.

If P > α, our test statistic is pretty reasonable

under H0, so we say we fail to reject H0.

Note: a large P-value is not proof of H0; many

other hypotheses may also be reasonable.

This is why we do not say that we accept H0.


statistically significant, and any with P < 0.01

highly (statistically) significant.

Note that this is very artificial. A P-value of

0.049 is only slightly stronger than one of

0.051, yet we treat them very differently.

We should always report the P-value, to

provide full information.

You should always explain in words what your

conclusion implies for the situation.


our P-value suggest about our

hypotheses?

of freshmen at our university?

P(μ ≥ 7.5) = .16?

P(H0 is true) = P?

true, not a probability on H0 itself.


cylinders is set to make cylinders with

diameter 50mm. A random sample of 60

cylinders has X̄ = 49.9865 and s = 0.0524.

same as statistical significance.

significance for our machine's calibration,

it may be that the difference between

50mm and 49.9865mm is too small to

justify the expense of recalibration.


indicate statistical significance despite a

difference too small to be important.

large standard errors, so that a difference

which might be very important if confirmed

cannot be shown to be statistically

significant.


confidence interval, which will do a much

better job of indicating the size and

therefore importance of a potential

difference.

X̄ = 49.9865 and s = 0.0524.

interval for μ will include (exclude) μ0

exactly whenever a two-sided test of

H0: μ = μ0 fails to reject (rejects) at the α

level.

fall below (above) μ0 exactly whenever a

one-sided test of H0: μ ≤ μ0 fails to reject

(rejects) at the α level.

H0: μ ≥ μ0.

intervals for means we looked at in Sections

5.2 and 6.1 require that we know the

standard deviation, σ, for our population.

enough estimate for σ that we can use s

without harming our P-values or interval

coverage severely.

require an adjustment to our intervals that

takes into account this uncertainty.


The t-statistic

unknown, when n is small (n < 30) we

often use the following result:

For a sample from a normal population, (X̄ − μ) / (s/√n) has a Student's t distribution with n − 1 degrees of

freedom.

heavier tails than the normal distribution.

around 0.

heaviest tails). As ν gets larger, the tails

get lighter and the curve gets less spread

out.

standard normal distribution.


Table A.3 contains important percentiles (critical values).

Each row represents a different t distribution.

Example: T ~ t12

P(T < -1.356) = 0.10

Example: T ~ t9

P(T 1.833) = ?


one in Section 5.2 to justify a t-based

confidence interval when n < 30.

standard deviation from a sample of size n

from a normal population or process.

Then a confidence interval for the

population mean has the form

X̄ ± tn−1,α/2 · s/√n

z-interval. The only difference is the

replacement of the usual normal (z) critical

value (such as 1.96) with one found on

the t-table with (n−1) degrees of freedom.

found by taking only the appropriate limit

(+ or −) and choosing the one-sided t

critical value (tn−1,α).


life of a new rubber compound finds the

mileage to end-of-life. A sample of size 10

finds a mean of 61,492 miles and a standard

deviation of 3,035 miles. A normal model is

appropriate.

population mean tire life.

lower bound for population mean tire life.


determine the level of polyunsaturated fatty

acid. In 6 samples, the mean percent is

16.98%, and the s.d. is 0.32%. A normal

distribution is reasonable for this variable.

mean percent pfa.

percent pfa.


t-Tests (6.4)

conduct hypothesis tests when n is small,

and σ is unknown.

but can be used for any sample size.

of a normal population.


identical to conducting a z-test, except for

step 4, computation of the P-value.

1)

specific value, μ0, we wish to consider for

the population mean.

2)

3)


4) on H1:

H1: μ > μ0    P = P(t* ≥ t)
H1: μ < μ0    P = P(t* ≤ t)

and other software can make your P-values exact.

P still gives the probability that, if H0 is true, a new sample

would give an X̄ which is at least as unusual as the X̄ we

have.

5)

model gets 35 mpg. A consumer group

wishes to test this claim. We measure 14

cars, find X̄ = 34.271 mpg, s = 2.915 mpg.

H0?

H1?

t=?

d.f. = ?

P=?

Conclusion?


tests, intervals

One-Sample T: Cholesterol in mg/dL

Variable           N     Mean   StDev  SE Mean  95% CI                  T      P
Cholesterol in m  20  205.800  48.392   10.821  (183.152, 228.448)  2.85  0.010

when hypothesis testing, depending on

our decision and the actual (unknown)

truth:

                              Truth
Decision             H0 True            H1 True
Reject H0            Type I Error       Correct Decision
Fail to Reject H0    Correct Decision   Type II Error

serious. This may influence the choice of

H0 and H1.

of Type I error we are willing to accept

(when the null hypothesis is true).

error at all. The probability of Type II error

may be very large, especially for small n.


consequences of the different errors, by

considering the (usually nonstatistical)

example of a jury trial.

                                 Truth
Decision                      H0 True (Defendant Innocent)   H1 True (Defendant Guilty)
Reject H0 (Convict)           Type I Error                   Correct Decision
Fail to Reject H0 (Acquit)    Correct Decision               Type II Error

Power (6.7)

with significance level α is

power = P(reject H0 | H0 false)

= 1 − P(Type II Error | H0 false).

associated with H1, power will generally be

computed for a specific value of μ.

increase as the sample size increases.

desirable to have a power of at least 0.8 or

0.9 for a difference which is big enough to

be important.

conducting an experiment whenever

possible, to verify that the experiment will

probably show results if the difference you

desire or anticipate exists.

1.

2.

3.

Compute the rejection region, the set of

possible values of X̄ which would lead to

rejecting H0.

Compute the probability of finding X̄ in the

rejection region, given the specified value of

μ.
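The two surviving steps (find the rejection region, then find the probability of landing in it) can be sketched for an upper-tailed z-test. Everything numeric below is hypothetical: μ0 = 50, an alternative mean of 51, σ = 3, and the two sample sizes are illustrative values, and the function name is made up for this example.

```python
from math import sqrt
from statistics import NormalDist

def power_upper_tailed_z(mu0, mu_alt, sigma, n, alpha=0.05):
    """Power of the test of H0: mu <= mu0 vs. H1: mu > mu0 when the true mean is mu_alt."""
    se = sigma / sqrt(n)
    # Rejection region: sample means above this cutoff lead to rejecting H0.
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # Probability that the sample mean lands in the rejection region when mu = mu_alt.
    return 1 - NormalDist(mu_alt, se).cdf(cutoff)

power_n36 = power_upper_tailed_z(mu0=50, mu_alt=51, sigma=3, n=36)
power_n100 = power_upper_tailed_z(mu0=50, mu_alt=51, sigma=3, n=100)

print(f"n = 36:  power = {power_n36:.3f}")
print(f"n = 100: power = {power_n100:.3f}")
```

When the true mean equals μ0 the "power" is just α, and for a fixed difference it grows as n increases.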


power for a specified test, or the sample size necessary

to achieve a given power.

1-Sample t Test

Testing mean = null (versus > null)

Calculating power for mean = null + difference

Alpha = 0.05

Sample

Difference

Size

Power

0.786845


a 5% chance that the result of a test will

have a p-value of less than 0.05.

10, to be significant, even if the null is true

every time.

significant results should be reconfirmed

with an additional study with new data.


asbestos fibers in water and many cancer

rates. They did many tests, only a few of

which gave p-values less than 0.05, and

only one of which gave less than 0.01.

relationship between lung cancer rates

and asbestos concentration, even though

their own study suggested that a 100-fold

increase in asbestos was accompanied by

a 5% increase in lung cancer rate.


this with the Bonferroni correction.

so if the adjusted P is small, we may still

reject H0.


for mean yields higher than current standard

formulation.

Formulation A: P = 0.49

Formulation B: P = 0.24

Formulation C: P = 0.17

Formulation D: P = 0.003

Formulation E: P = 0.53

= .015.

improvement.
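The Bonferroni adjustment multiplies each P-value by the number of tests, capping the result at 1. Applied to the five formulations above, a minimal sketch looks like this; the 0.05 threshold is an assumed significance level.

```python
def bonferroni_adjust(p_values):
    """Multiply each P-value by the number of tests, capped at 1."""
    k = len(p_values)
    return {name: min(1.0, k * p) for name, p in p_values.items()}

raw_p = {"A": 0.49, "B": 0.24, "C": 0.17, "D": 0.003, "E": 0.53}
adjusted = bonferroni_adjust(raw_p)

ALPHA = 0.05  # assumed significance level
still_significant = [name for name, p in adjusted.items() if p <= ALPHA]

print(round(adjusted["D"], 3))  # 0.015
print(still_significant)        # ['D']
```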


reject H0. D might have been higher by

chance.

practical significance, rerun the

experiment for D, and collect new data.

convincing.


the specific value of the mean of a

population, so much as comparing the

means of two separate, but related,

populations.

investigate these sorts of questions.


μF = population mean of female salaries

Test H0: μM ≤ μF vs. H1: μM > μF

μ0 = average yield for a common current

treatment

Test H0: μ1 ≤ μ0 vs. H1: μ1 > μ0

Confidence interval on (μ1 − μ0)

processes.

standard deviation σX.

and standard deviation σY.

looking at the difference, μX − μY.

independent samples from each

population (of sizes nX and nY, which may

or may not be the same) and computing

the sample means (X̄ and Ȳ) and

standard deviations (sX and sY).

population means, μX − μY, by the

difference of the sample means, X̄ − Ȳ.

the sampling distribution of X̄ − Ȳ.

3.4 and 4.3:

1)

The mean of the difference is the

difference of the means.

2)

The variance of the difference is the sum

of the variances.

3)

is the difference.


already know about the sampling

distributions of X̄ and Ȳ

give us:

1) X̄ − Ȳ is an unbiased estimator of μX − μY.

Show:

2) The standard deviation of X̄ − Ȳ is √(σX²/nX + σY²/nY).

Show:

3) sampling distribution of X̄ − Ȳ

4) Limit Theorem tells us that the sampling

distribution of X̄ − Ȳ will be approximately

normal no matter what shapes the

population distributions have.

5) standard deviations from small samples

from normal populations, we should

continue to use a t-distribution.


construct a two-sample test in much the

same way as the one-sample version.

normal distribution as we do in the single-sample case.

testing.

wish to compare, and a sample from each

population.

1) deviation σX.

Likewise, population 2 (Y) has mean μY and

standard deviation σY.

The sample means X̄ and Ȳ will be different,

even if the population means μX and μY are the

same.

Are X̄ and Ȳ different enough to provide strong

evidence that μX and μY are different as well?

Chapin Social Insight Test) is given to a

large number of college students, with a

desire to see if there is a difference in how

men and women score.

college men and women have different

means on this test?


our hypotheses to describe a statement

about a difference in the population means.

2) Two-sided:

H0: μX − μY = Δ0 vs. H1: μX − μY ≠ Δ0.

One-sided:

H0: μX − μY ≤ Δ0 vs. H1: μX − μY > Δ0.

or:

H0: μX − μY ≥ Δ0 vs. H1: μX − μY < Δ0.

between the men's population mean (μX) and

the women's population mean (μY), so use a

two-sided alternative.

3) The test statistic is z = (X̄ − Ȳ − Δ0) / √(sX²/nX + sY²/nY), where Δ0 is the hypothesized value of μX − μY under H0.

Example: Students:

z=?

4)

probabilities on z* ~ N(0,1), depending on H1:

H1: μX − μY ≠ Δ0    P = P(|z*| ≥ |z|)
H1: μX − μY > Δ0    P = P(z* ≥ z)
H1: μX − μY < Δ0    P = P(z* ≤ z)

new sample would give a difference which is

at least as unusual as the one we have.

Example: z = 0.65. P = ?

5)

for any other hypothesis test.

If P ≤ α, the evidence is pretty strong against

H0, and we say we reject H0 (at the α level).

We have strong evidence of H1.

If P > α, our test statistic is pretty reasonable

under H0, so we fail to reject H0. H0 is

plausible (although probably so is H1).

populations of student men and women?


tensile strength should be at least 8

N/mm2 greater for 12mm-diameter steel

rods than for 10mm-diameter rods.

Samples of size 50 give:


test, we may desire a confidence interval

for the difference of the population means,

μX − μY.

a test also allow computation of this

interval.

μX − μY will be

exactly the same as in the one-sample

case.


Example: Students:

difference in population means between

men and women.


in population means.


one of the sample sizes is small, we run

into the same dangers for estimating the

standard error from the sample as we do

in the single-sample case.

as a two-sample z-test, but uses t

probabilities for P-values.


package such as Minitab, just as with the

one-sample t-test.

calculated as


an alternative to wood pulp in paper

production.

They wish to determine if adding the chemical

anthraquinone increases the pulp yield.

population mean yield? Conduct a two-sample t-test.

mean improvement.


of population differences by arranging to

collect the data in a paired fashion.

should be paired with an observation from

population 2 (Y).

that the pairs tend to be correlated.


reduction.

For samples of size 20, we could get 40

volunteers and divide them at random into two

groups to get independent samples.

If drug response varies substantially from

subject to subject, it may be better to give

both drugs to each subject (on different

occasions, in random order).

This reduces the effect of subject variability,

and is probably cheaper and easier as well!


Other examples:

Brand A on left front wheel and Brand B

on right front wheel (or vice-versa, at

random) on the same cars.

using one method on left wing, other on

right wing of the same planes.


simpler than dealing with two independent

samples.

difference Di = Xi − Yi, and then conduct a

one-sample z- or t-test or construct a one-sample z- or t-confidence interval on the

differences Di, depending on the number

of pairs.

μD = μX − μY.

standard drug for subject i, and let Yi be the

same for the new drug. Let Di = Xi − Yi.

Data:

Patient   Xi     Yi     Di
1         28.5   34.8   -6.3
2         26.6   37.3   -10.7
:         :      :      :
20        40.1   40.8   -0.7

the new drug is more effective at reducing

heart rate than the old one on average?

μD = μ1 − μ2.

Two-sample tests, intervals

Two-Sample T-Test and CI: With, Without

          N   Mean  StDev  SE Mean
With     25  44.18   3.99     0.80
Without  20  38.56   3.63     0.81

Estimate for difference: 5.62500
95% lower bound for difference: 3.71000
P-Value = 0.000   DF = 42
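The standard error, t statistic, and degrees of freedom in this output can be rebuilt from the summary statistics alone. The sketch below uses the unpooled (Welch) formulas; the assumption that the software used this approach is consistent with DF = 42, which the code reproduces after rounding down.

```python
from math import floor, sqrt

def welch_from_summary(mean_x, s_x, n_x, mean_y, s_y, n_y):
    """t statistic and approximate degrees of freedom from two-sample summary statistics."""
    var_x, var_y = s_x**2 / n_x, s_y**2 / n_y
    se = sqrt(var_x + var_y)            # the variance of a difference is the sum of variances
    t = (mean_x - mean_y) / se
    df = (var_x + var_y) ** 2 / (var_x**2 / (n_x - 1) + var_y**2 / (n_y - 1))
    return t, df

# Summary statistics from the "With" and "Without" rows of the output above.
t, df = welch_from_summary(44.18, 3.99, 25, 38.56, 3.63, 20)

print(f"t = {t:.2f}")                                # about 4.94
print(f"df = {df:.1f}, rounded down to {floor(df)}")  # about 42.2, rounded down to 42
```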


Paired T-Test and CI: StdDrug, NewDrug

             N      Mean    StDev  SE Mean
StdDrug     40   31.1825   4.8318   0.7640
NewDrug     40   33.8375   4.9379   0.7808
Difference  40  -2.65500  3.73012  0.58978

T-Test of mean difference = 0 (vs not = 0): T-Value = -4.50   P-Value = 0.000

means of samples from two populations.

than two populations, and we wish to test

whether all of the populations have the

same mean.

called Analysis of Variance (ANOVA).

important distribution, the F distribution.


commonly used in hypothesis tests.

positive real line.

and ν2.

If X ~ Fν1,ν2 (X has an F distribution with ν1 and

ν2 degrees of freedom),

for the t distributions.

combination of ν1 and ν2, we only get critical

points for a few values of α.

0.001.

generally give us what we need.

precise P-values.


Example: x ~ F5,7

5 and 7), what can we say about an uppertailed P-value?


The populations are often called levels.

a factor.

also may identify different treatment

groups in a controlled experiment.


independent sample of size Ji.

deviations are assumed to be equal.

N = J1 + J2 + … + JI.


We wish to test

H0: μ1 = μ2 = … = μI vs.

H1: Two or more of the μi are different.

population means as

and the common, or grand, mean (if H0 is

true) as


Stenosis                      Level 1   Level 2   Level 3
Flowrate (ml/s) at collapse   10.6      11.7      19.6
                              9.7       12.7      15.1
                              8.3       17.6      16.6
                              :         :         :
Ji                            11        14        10
Mean                          11.209    15.086    17.330

N = 11 + 14 + 10 = 35


by the treatment sum of squares.

the error sum of squares.


as the total sum of squares.

dataset.

Note that


two measures of variability in an F-statistic

statistic are called the mean square for

treatments and the mean square error,

respectively.


mean square for treatments and the mean

square error are both estimates of the

common variance, σ², and

means will lead to large values of F, so the
P-value is defined as

P = P(X ≥ F), where


F=?

P-value?

Conclusion?


One-way ANOVA: Collapse Flowrate versus Amount of Stenosis

Source            DF      SS      MS      F      P
Amount of Stenos   2  204.02  102.01  23.57  0.000
Error             32  138.47    4.33
Total             34  342.49

S = 2.080   R-Sq = 59.57%   R-Sq(adj) = 57.04%

Level     N    Mean  StDev
level 1  11  11.209  1.899
level 2  14  15.086  2.150
level 3  10  17.330  2.168

(Character plot of the interval for each level mean; axis ticks at 10.0, 12.5, 15.0, 17.5.)
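The sums of squares and the F statistic in the ANOVA table can be rebuilt from the per-level summary rows. This is a sketch using the rounded means and standard deviations printed above, so it agrees with Minitab only to rounding; the variable names are illustrative.

```python
# Per-level summaries from the Minitab output above.
J = [11, 14, 10]                    # sample sizes
means = [11.209, 15.086, 17.330]    # level means
sds = [1.899, 2.150, 2.168]         # level standard deviations

N = sum(J)
I = len(J)

# Grand mean: sample-size-weighted average of the level means.
grand_mean = sum(j * m for j, m in zip(J, means)) / N

# Treatment sum of squares: variation of level means around the grand mean.
sstr = sum(j * (m - grand_mean) ** 2 for j, m in zip(J, means))

# Error sum of squares: pooled variation within the levels.
sse = sum((j - 1) * s**2 for j, s in zip(J, sds))

mstr = sstr / (I - 1)   # mean square for treatments
mse = sse / (N - I)     # mean square error
F = mstr / mse

print(f"SSTr = {sstr:.2f}, SSE = {sse:.2f}, F = {F:.2f} on ({I - 1}, {N - I}) d.f.")
```

The P-value is the upper tail of the F distribution with (I − 1, N − I) = (2, 32) degrees of freedom, which needs a table or statistical software.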


that at least some of the level means are

different from one another, but does not

automatically identify which ones.

but doing many tests risks false positives
(Type I error; recall Section 6.8).

this is unnecessarily conservative.


multiple comparisons procedure, which

adjusts for the number of tests.

one or more Type I errors (false significant

differences) out of the full set.

individual probabilities of Type I error (and

the more conservative the individual tests)

must be.


sizes, we need an estimate of the common

variance, σ².

MSE = SSE/(N − I)

is an estimate of σ², so we use that.


Studentized range distribution, which adjusts for

the multiple comparisons.

We use qI,N−I,α.

difference between level means i and j is
different at level α if this interval does not
include 0.


q3,32,.05 ≈ 3.49

μ2 − μ1:

μ3 − μ1:

μ3 − μ2:
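The three intervals can be sketched as follows. The interval formula on the slide did not survive extraction, so this assumes the usual Tukey-Kramer form, difference ± q·sqrt(MSE/2 · (1/Ji + 1/Jj)), with q ≈ 3.49 and MSE = 4.33 taken from the slides; the helper `tukey_interval` is illustrative.

```python
import math

q = 3.49     # approximate q(3, 32, .05) from the slide
mse = 4.33   # mean square error from the ANOVA table
J = {1: 11, 2: 14, 3: 10}
means = {1: 11.209, 2: 15.086, 3: 17.330}

def tukey_interval(i, j):
    """Simultaneous interval for (mean of level i) - (mean of level j)."""
    diff = means[i] - means[j]
    half_width = q * math.sqrt(mse / 2 * (1 / J[i] + 1 / J[j]))
    return diff - half_width, diff + half_width

for i, j in [(2, 1), (3, 1), (3, 2)]:
    lo, hi = tukey_interval(i, j)
    verdict = "different" if lo > 0 or hi < 0 else "not distinguishable"
    print(f"level {i} - level {j}: ({lo:.3f}, {hi:.3f}) -> {verdict}")
```

None of the three intervals contains 0, so under these assumptions all three pairs of level means differ at the 5% family level. Small differences from software output are expected because q is rounded.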


All Pairwise Comparisons among Levels of Amount of Stenosis

Individual confidence level = 98.06%

Amount of Stenosis = level 1 subtracted from:

Amount of
Stenosis   Lower  Center  Upper
level 2    1.814   3.877  5.939
level 3    3.884   6.121  8.357

Amount of Stenosis = level 2 subtracted from:

Amount of
Stenosis   Lower  Center  Upper
level 3    0.125   2.244  4.364

(Character plots of the intervals; axis ticks at -3.5, 0.0, 3.5, 7.0.)


Model Assumptions

model of independent draws from normal

populations with a common variance.

common variance will not have a strong

effect, but large deviations will require

other techniques.

experience can all be useful guides.


of X̄ to construct confidence intervals for
its related parameter, μ.

sampling distribution of the sample
proportion, p̂, to construct tests and
confidence intervals for a population
proportion or probability, p.


enough that the Central Limit Theorem tells

us that p̂ is approximately normal as well.

hypothesis tests on p. Since we use the

normal table, these will be z-tests.


value, p0, we wish to consider for the

population probability of success or

population proportion of successes.

1)

sample proportion different from p0, even if

that is the actual population proportion.

Example: We have a possibly biased coin.

We wish to test whether or not


2)

interesting is going on, or what we wish to

prove.

Choose a one-sided or two-sided alternative,

depending on our purpose.

Two-sided: H0: p = p0 vs. H1: p ≠ p0.

One-sided: H0: p ≤ p0 vs. H1: p > p0
or:        H0: p ≥ p0 vs. H1: p < p0

Example: Coin: H0: p = 0.5 vs. H1: p ≠ 0.5.


3)

statistic

p̂ = ?

z=?


4)

probabilities on z* ~ N(0,1), depending on H1:

H1         P-value
p > p0     P(z* ≥ z)
p < p0     P(z* ≤ z)
p ≠ p0     2 P(z* ≥ |z|)

new sample would give a p̂ which disagrees
with H0 at least as much as the p̂ we have.

Example: z = -2.40. P = ?


We have strong evidence against the null

hypothesis.

5)

hypothesis is plausible.

Example:

z = -2.40, P = 0.0164.

If α = 0.05, what should we conclude?


author of The Canterbury Tales. Should

we believe that more than 1/3 of all

students know this?


binomial probabilities for a slightly more

accurate P-value.

Test of p = 0.5 vs p not = 0.5

                                                  Exact
Sample    X    N  Sample p                95% CI  P-Value
1       176  400  0.440000  (0.390707, 0.490187)    0.019

Test of p = 0.5 vs p not = 0.5

Sample    X    N  Sample p                95% CI  Z-Value  P-Value
1       176  400  0.440000  (0.391355, 0.488645)    -2.40    0.016
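The normal-approximation row of this output can be reproduced in a few lines. This is a sketch of the two-sided z-test for a proportion, using the counts shown above (176 successes in 400 trials, p0 = 0.5); variable names are illustrative.

```python
import math
from statistics import NormalDist

x, n, p0 = 176, 400, 0.5

p_hat = x / n
# Under H0 the standard error uses the hypothesized p0, not p_hat.
se_null = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se_null

# Two-sided P-value from the standard normal distribution.
p_value = 2 * NormalDist().cdf(-abs(z))

print(f"p_hat = {p_hat:.2f}, z = {z:.2f}, P = {p_value:.4f}")
```

With α = 0.05 the P-value of about 0.016 is small enough to reject H0: p = 0.5.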


power calculations for tests of proportions.

Test for One Proportion

Alpha = 0.05

Alternative  Sample  Target
Proportion     Size   Power  Actual Power
0.55            783     0.8      0.800239


Population Proportions (5.3)

We have:

trickier than in the case for μ, because it

appears in both the numerator and the

denominator.


unknown p's in the standard error with the
known p̂, so the 95% confidence interval
for p would be

format as the μ-interval:

recommended.


p in the interval can be well below 95% for

smaller n.

2 successes and 2 failures to our counts:

p will be

for all n.


choose between instant and fresh-brewed

coffee. Out of 40 subjects, 12 prefer the

instant coffee. If p is the probability that a

random person prefers instant coffee in a

blind test, find a 95% confidence interval

for p.
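A sketch of the coffee example, assuming the adjusted interval described earlier (add 2 successes and 2 failures to the counts, then use the usual normal-based interval on the adjusted proportion). The names `n_tilde` and `p_tilde` are illustrative.

```python
import math
from statistics import NormalDist

x, n = 12, 40            # 12 of 40 subjects prefer instant coffee

# Adjusted counts: two extra successes and two extra failures.
n_tilde = n + 4
p_tilde = (x + 2) / n_tilde

z = NormalDist().inv_cdf(0.975)   # about 1.96 for 95% confidence
half_width = z * math.sqrt(p_tilde * (1 - p_tilde) / n_tilde)

lower = max(0.0, p_tilde - half_width)
upper = min(1.0, p_tilde + half_width)

print(f"p_tilde = {p_tilde:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Under these assumptions the interval runs from roughly 0.18 to 0.46.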


biased) coin. Let p be the probability of a

head. If 100 tosses result in 45 heads,

find a 95% confidence interval for p. Is it

plausible that our coin could be fair?

is our confidence interval for p?


require a 95% confidence interval of error bound

(interval half-width) no more than w, we can

compute a minimum sample size.

critical value to find sample sizes for other

confidence levels.


the conservative value 0.5, and require a

minimum sample size of


polls generally report.


2% margin of error (w = 0.02), how big a

sample must we take?

margin of error?
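The sample-size formula on the slide was lost in extraction, so the sketch below assumes the classical conservative bound n ≥ z²·p(1 − p)/w² with p = 0.5. If the slide's version is based on the adjusted (plus-four) interval, the answer would be 4 smaller. The helper `min_sample_size` is illustrative.

```python
import math
from statistics import NormalDist

def min_sample_size(w, conf=0.95, p=0.5):
    """Smallest n whose normal-based interval has half-width at most w.

    p = 0.5 is the conservative choice because p(1 - p) is largest there.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil(z**2 * p * (1 - p) / w**2)

print(min_sample_size(0.02))   # 2% margin of error
print(min_sample_size(0.03))   # 3% margin of error
```

Halving the margin of error roughly quadruples the required sample size, since n grows like 1/w².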


confidence intervals to compare means of

two related populations, we may also use

them to compare two related binomial

probabilities or population proportions.


difference between two proportions, we

can use the fact that for independent X

and Y with large nX and nY,


interval on a single proportion, we used an

alternative estimate of p, adding two

successes and two failures to our counts:

between the two estimates:


to use a natural Christmas tree than urban

ones?

Rural: nX = 160, X = 64

Urban: nY = 261, Y = 89

Find a 95% confidence interval for pX − pY.


we must estimate the common null

proportion p with the pooled proportion:


Rural: nX = 160, X = 64

Urban: nY = 261, Y = 89

Can we conclude that rural households are

more likely to use a natural Christmas tree

than urban ones?


likely to be frequent binge drinkers than

female students?

classified as frequent binge drinkers.

Of 7180 college men surveyed, 1630 were

considered frequent binge drinkers.

Is there a significant difference in the

populations?


Minitab:

Sample     X     N  Sample p
1       1684  9916  0.169827
2       1630  7180  0.227019

Estimate for difference:  -0.0571930
-0.0469659
Z = -9.34  P-Value = 0.000
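The Z value can be checked with the pooled-proportion test described above. This sketch uses the counts from the output (sample 1: 1684 of 9916, sample 2: 1630 of 7180); the variable names are illustrative.

```python
import math
from statistics import NormalDist

x1, n1 = 1684, 9916
x2, n2 = 1630, 7180

p1 = x1 / n1
p2 = x2 / n2

# Under H0 both groups share one proportion, estimated by pooling the counts.
pooled = (x1 + x2) / (n1 + n2)
se_null = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se_null

# Two-sided P-value; it is far too small to show at three decimals.
p_value = 2 * NormalDist().cdf(-abs(z))

print(f"difference = {p1 - p2:.7f}, z = {z:.2f}, P = {p_value:.3g}")
```

A z statistic this far from 0 indicates a highly significant difference between the two population proportions.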


of the chi-squared family of distributions as

null distributions.

hypotheses involving categorical variables

with more than two categories.

comparing more than two populations.


with support on the positive real line.

.

If Y ~


Table A.5

contains

important critical

values,

If

then


Example: X ~ χ²

P(X ≥ 11.345) = 0.01

Example: X ~ χ²

P(X ≥ 9.236) = ?


Categorical Variable

down into multiple categories.

(success / failure), then we study the

probability of a success, p, and test using

the z-test.


interest, we analyze them in a different

way.

don't record each observation individually.

Instead, we record the number of times

each category occurs, in a contingency

table.


record the numbers rolled. The

contingency table might look like:

Roll     1   2   3   4   5   6   Total
Count   14  20  17  10   8  21   n = 90


recorded by day of the week.

Day      M   T   W   Th   F   Total
Count   65  43  48   41  73   n = 270

be any of three colors.

Color   Red  Pink  White   Total
Count    57    89     54   n = 200


Each trial may result in category 1 with

probability p1, category 2 with probability

p2, and so on up to category k with

probability pk. (Note: p1 + … + pk = 1.)

category i, i = 1, …, k.


was fair, or whether accidents were equally

likely each day of the week, or whether the

snapdragons satisfy standard genetic theory.

H0: p1 = p10, …, pk = pk0.

Example (Factory): H0: pi = 1/5, i = 1, …, 5.

Example (Snapdragons): H0: p1 = .25, p2 = .5,

p3 = .25


simply that at least one of the probabilities

is incorrect. This is often left implied.

the observed cell frequencies O1, …, Ok,
with expected cell frequencies E1, …, Ek.

the expected count in this category from N

trials is Ei = Npi0.


in our table.

Ei = 90/6 = 15, i = 1, …, 6.

Ei = 270/5 = 54, i = 1, …, 5.

will only be equal if the pi0's are.

E1 = E3 = 200*.25 = 50, E2 = 200*.5 = 100.


expected frequency for each cell in our

table, we compute a test statistic.

frequencies, with a larger value indicating

less similar sets.

We will compute P-values using the chi-square table, so these are referred to as

chi-square statistics (and tests).


cell, in either direction.

difference to contribute as much to X2.


a chi-square distribution with (k-1) degrees

of freedom (one less than the number of

cells).

statistic is larger than the critical point


Example (Die):

Roll    1   2   3   4   5   6   Total
Oi     14  20  17  10   8  21   N = 90
Ei     15  15  15  15  15  15   N = 90

d.f. = 5,  0.05 < P < 0.10
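The test statistic behind this example can be sketched as below. The formula on the slide was lost in extraction; this assumes the standard Pearson form, X² = Σ (Oi − Ei)² / Ei, and applies it to both the die counts and the snapdragon counts. The helper `chi_square_stat` is illustrative.

```python
def chi_square_stat(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Die: H0 says each face has probability 1/6.
die_obs = [14, 20, 17, 10, 8, 21]
die_exp = [sum(die_obs) / 6] * 6
x2_die = chi_square_stat(die_obs, die_exp)

# Snapdragons: H0 says red, pink, white have probabilities .25, .5, .25.
snap_obs = [57, 89, 54]
snap_exp = [sum(snap_obs) * p for p in (0.25, 0.5, 0.25)]
x2_snap = chi_square_stat(snap_obs, snap_exp)

print(f"Die:         X2 = {x2_die:.3f} with {len(die_obs) - 1} d.f.")
print(f"Snapdragons: X2 = {x2_snap:.3f} with {len(snap_obs) - 1} d.f.")
```

Each statistic is then compared with the chi-square table for k − 1 degrees of freedom to bracket the P-value.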


Example (Snapdragons):

Color   Red  Pink  White   Total
Oi       57    89     54   N = 200
Ei

X² = ?

P?


Independence

one categorical variable at once. We can

present the results for two such variables

in a two-way table, with one variable in

rows, the other in columns.

clustered bar charts.


               Men  Women  Total
Right-handed   934   1070   2004
Left-handed    113     92    205
Ambidextrous    20      8     28
Total         1067   1170   2237


                         Column
Row       1     2    …    j    …    J    Row Totals
1        O11   O12   …   O1j   …   O1J   O1.
2        O21   O22   …   O2j   …   O2J   O2.
:
i        Oi1   Oi2   …   Oij   …   OiJ   Oi.
:
I        OI1   OI2   …   OIj   …   OIJ   OI.
Column
Totals   O.1   O.2   …   O.j   …   O.J   O.. = N


from such two-way data is that the two

variables are independent. That is, the

probability of seeing one level in variable 1

does not depend on the level in variable 2

and vice-versa.

alternative of dependence using another

chi-square test.

mind here.


represents several related populations, our

null hypothesis is that these populations

are homogeneous with respect to the

remaining variable.

the same for each population.

as independence.


Oij.

independence are found as

all cells.

degrees of freedom to find our P-value.


               Men  Women  Total
Right-handed   934   1070   2004
Left-handed    113     92    205
Ambidextrous    20      8     28
Total         1067   1170   2237

X²?

d.f.?

P?


Minitab:

Expected counts are printed below observed counts
Chi-Square contributions are printed below expected counts

           Men    Women  Total
    1      934     1070   2004
        955.86  1048.14
         0.500    0.456

    2      113       92    205
         97.78   107.22
         2.369    2.160

    3       20        8     28
         13.36    14.64
         3.306    3.015

Total     1067     1170   2237
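The expected counts and chi-square contributions in this output can be recomputed from the observed table. This sketch assumes the usual independence formulas, Eij = (row total × column total) / N and (I − 1)(J − 1) degrees of freedom, and uses the fact that the chi-square upper tail has the closed form exp(−x/2) only when there are exactly 2 degrees of freedom, as here.

```python
import math

# Observed counts: rows = right-handed, left-handed, ambidextrous;
# columns = men, women.
observed = [[934, 1070], [113, 92], [20, 8]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
N = sum(row_totals)

# Expected count under independence: (row total * column total) / N.
expected = [[r * c / N for c in col_totals] for r in row_totals]

# Sum the (O - E)^2 / E contributions over every cell.
x2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Closed-form upper tail, valid only because df == 2 in this table.
p_value = math.exp(-x2 / 2)

print(f"X2 = {x2:.3f}, d.f. = {df}, P = {p_value:.4f}")
```

For other table sizes the P-value would come from a chi-square table or statistical software instead of the closed form.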


Recall our analyses of Chapter 2, where

we looked at bivariate data.

Example:

at a location.

predicting the values of one response

variable, based on the observed values of

one or more other explanatory variables.


The simple linear regression model fits a

straight line to a set of paired data

observations.

Formally:

yi = β0 + β1 xi + εi

β0 and β1 are (unknown) constants

ε1, …, εn are assumed to be independent draws
from a N(0, σ²) distribution.

yi ~ N(β0 + β1 xi, σ²)

E(yi) = β0 + β1 xi


471

uses the least squares fit, minimizing


associated fitted regression model, the fitted

value for observation i is

around the regression line is


and the regression sum of squares, SSR:

which give us the computing formula

SSE = SST − SSR.

the proportion of the total variation of y which

is explained by x:


usually focuses on β̂1, the estimate of the
slope parameter β1, which measures how

much y changes for a one-unit change in x.

sampling distribution.

and may be used to construct confidence

intervals and hypothesis tests.


has a


471),

with β10 in place of β1 and using a t table for
n − 2 degrees of freedom to find a P-value.

Most commonly, β10 = 0.


n = 115

β̂1 = .836

s = .331

Σ(xi − x̄)² = 119.25

=?

t=?

=?

P?


February = - 0.470 + 0.836 January

Predictor     Coef  SE Coef      T      P
Constant   -0.4698   0.1257  -3.74  0.000
January    0.83617  0.03027  27.63  0.000
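The T column is each coefficient divided by its standard error, and the same recipe tests any other hypothesized slope. This sketch uses the January row above with n = 115 from the earlier slide; it checks the printed T-value and also computes the statistic for a hypothesized slope of 0.8. Names such as `slope_t` are illustrative.

```python
n = 115
b1 = 0.83617       # estimated slope (January coefficient)
se_b1 = 0.03027    # its standard error

def slope_t(beta_null):
    """t statistic for H0: slope = beta_null, with n - 2 degrees of freedom."""
    return (b1 - beta_null) / se_b1

t_vs_zero = slope_t(0.0)   # the T-value Minitab prints
t_vs_08 = slope_t(0.8)     # statistic for a hypothesized slope of 0.8
df = n - 2

print(f"t (slope = 0)   = {t_vs_zero:.2f}")
print(f"t (slope = 0.8) = {t_vs_08:.2f} on {df} d.f.")
```

The second statistic is only about 1.2, so whether the slope exceeds 0.8 is a much closer call than whether it differs from 0; the P-value comes from the t table with 113 degrees of freedom.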


β1 is greater than 0.8?

construct a confidence interval for β1 as


Inference in Correlation

coefficient.

Then

.


H0: ρ = 0 vs. H1: ρ ≠ 0.

estimate our P-value.


r = -.488

U=?

d.f. = ?

P?


          Illiteracy  HS Grad
HS Grad       -0.657
               0.000

Murder         0.703   -0.488
               0.000    0.000

Cell contents: correlation, with P-Value below


check on a regression analysis.

(or ŷi) versus ei.

appears random around e = 0, everything

is probably fine.


not well fit by the model. Check for

explanations, and possibly remove those

points.


suggest a linear fit is inappropriate.


heteroscedastic (different scatter), meaning the

standard deviation of y is not constant; it
depends on x. Fitted values may still be
reasonable, but r² and s may not mean much.


Power Transformations

and heteroscedasticity can often all be

improved by the use of nonlinear

transformations on y, on x, or on both.

as logs, square roots, and reciprocals (1/x).

on this variable.


moisture content in % (y), shows a curved,

decreasing relationship.


linear plot.


x and y may require some experimentation and

patience.

transformed variables. Inverse transformations

may give us models for x and y.

us

ln(y) = 4.64 - 1.05 ln(x)

and taking antilogs on each side gives us

y = 103.4 x^(−1.05).


between a response variable and multiple

explanatory variables.

relationships with a matrix plot (or

scatterplot matrix).

scatterplots, one in each orientation.


Concord, New Hampshire, began a

campaign to encourage water

conservation.

usage (in cubic feet) in 1981 based on

1980 usage and a variety of other

household variables.


multiple regression.

linear regression, with additional x terms.

yi = β0 + β1 x1i + β2 x2i + … + βp xpi + εi.

expect y to change when increasing xj by
one unit, while holding all of the other x's

constant.


method of least squares.

complex than in the simple linear

regression case.

computed using matrix methods; difficult

by hand, but easy for a computer.


WATER81 = 412 + 0.489 WATER80 + 0.0193 INCOME - 43.7 EDUCATION
          + 235 PEOPLE81 + 96.6 CHPEOPLE

Predictor       Coef   SE Coef      T      P
Constant       412.0     189.0   2.18  0.030
WATER80      0.48885   0.02638  18.53  0.000
INCOME      0.019271  0.003368   5.72  0.000
EDUCATION     -43.65     13.23  -3.30  0.001
PEOPLE81      234.71     28.00   8.38  0.000
CHPEOPLE       96.56     80.76   1.20  0.232


S = 851.914   R-Sq = 67.5%   R-Sq(adj) = 67.1%

Analysis of Variance

Source           DF          SS         MS       F      P
Regression        5   737617962  147523592  203.27  0.000
Residual Error  490   355620748     725757
Total           495  1093238710
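Every summary figure in this output follows from the three sums of squares and their degrees of freedom. A sketch, assuming the usual definitions (R² = SSR/SST, adjusted R² = 1 − MSE/(SST/(n − 1)), S = √MSE, F = MSR/MSE); variable names are illustrative.

```python
import math

# Sums of squares and degrees of freedom from the ANOVA table above.
ssr, df_reg = 737617962, 5
sse, df_err = 355620748, 490
sst, df_tot = 1093238710, 495

msr = ssr / df_reg
mse = sse / df_err

F = msr / mse                         # overall model F statistic
r_sq = ssr / sst                      # proportion of variation explained
r_sq_adj = 1 - mse / (sst / df_tot)   # penalized for the number of predictors
s = math.sqrt(mse)                    # residual standard deviation

print(f"F = {F:.2f}, R-Sq = {r_sq:.1%}, R-Sq(adj) = {r_sq_adj:.1%}, S = {s:.3f}")
```

Because the adjustment depends on the error degrees of freedom, adjusted R² can fall when a weak predictor is added even though plain R² never does.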


in all respects except that one includes an

additional person, how much additional

water should we predict the larger

household will use?


contained 4 people in both 1980 and 1981,

had a household income of $25,000, and

had a head of household with 12 years of

education.

in 1981.


effectiveness of our model by

and

and combine them with the coefficient of

multiple determination


may be interpreted as the proportion of

variance in y explained by our model and

all of the x's.

more.

variability in 1981 water usage is

explained by our model?


another x to your model, even if it is

related to y only by chance.

adjusted R2 for multiple regression,

especially when comparing models with

different numbers of explanatory variables.


intervals on the individual coefficients in a

multiple regression model.

compute, but available in output from

Minitab and other packages.

be [n − (p + 1)].


on the entire model at once.

coefficients are 0 (so none of the x's are
useful in predicting y).

and [n − (p + 1)] degrees of freedom.


the ANOVA table in regression output from

Minitab and other statistical packages.

simple regression case, but is completely

equivalent to the t-test on β1 for this case.


utility say about our multiple regression of

1981 water usage?

individual terms in the model?


regression involves the use of interaction

(product) terms.

yi = β0 + β1 x1i + β2 x2i + β12 x1i x2i + εi.

on the value of x2.

relationships suggested by interaction

terms may be important and interesting.


WATER81 = -769 + 0.974 WATER80 + 0.0213 INCOME + 39.5 EDUCATION
          + 217 PEOPLE81 - 0.0336 WATER80*EDUCATION

Predictor               Coef   SE Coef      T      P
Constant              -768.9     313.4  -2.45  0.014
WATER80               0.9742    0.1090   8.93  0.000
INCOME              0.021263  0.003310   6.42  0.000
EDUCATION              39.55     22.25   1.78  0.076
PEOPLE81              216.57     27.52   7.87  0.000
WATER80*EDUCATION  -0.033617  0.007275  -4.62  0.000

S = 835.152   R-Sq = 68.7%   R-Sq(adj) = 68.4%


slope) on Water80 for a household whose

head has 8 years of education?

12 years?

16 years?
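With an interaction term, the slope on WATER80 is no longer one number: collecting the WATER80 terms in the fitted equation gives (0.9742 − 0.033617 × EDUCATION). A sketch using the coefficients from the output above; the helper name `water80_slope` is illustrative.

```python
b_water80 = 0.9742         # coefficient on WATER80
b_interaction = -0.033617  # coefficient on WATER80*EDUCATION

def water80_slope(education_years):
    """Slope of predicted WATER81 on WATER80 at a given education level."""
    return b_water80 + b_interaction * education_years

for years in (8, 12, 16):
    print(f"{years:2d} years of education: slope = {water80_slope(years):.3f}")
```

The slope shrinks as education rises, from about 0.71 at 8 years to about 0.44 at 16 years, so 1981 usage tracks 1980 usage less closely in households whose head has more education.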


regression.

yi = β0 + β1 xi + β2 xi² + εi.

scatterplots.

Example: Yield

(kg/ha) vs. time

to harvest (days

after flowering)

for paddy, a

grain from India.


The regression equation is
Yield = - 1070 + 293.5 Time - 4.536 Time**2

S = 203.883   R-Sq = 79.4%   R-Sq(adj) = 76.2%

Analysis of Variance

Source      DF       SS       MS      F      P
Regression   2  2084779  1042390  25.08  0.000
Error       13   540388    41568
Total       15  2625168


interaction terms is still considered linear

regression, not because it is linear in the

x's (it's not), but because it is linear in the
parameters being estimated, the β's.


Even most statisticians deal with pre-generated statistics in journals and the
news far more frequently than they are

called upon to compute them themselves.

statistics with a properly critical eye and

brain (statistical literacy) is one which

should be expected of any educated adult.


Afghanistan was $3.4 billion.

Three out of every 1,000 patients who have

their stomachs stapled will die within three

months.

Each year, about 1,100 suicides occur on

U.S. college campuses.

broken down into three parts:

randomness (unavoidable)

nonstatistical mistakes (what to watch for)

is the true value, and forget the other

components.


reporting standard errors,

using confidence intervals instead of point

estimates,

testing for statistical significance,

and so on.


Remember, our statistical methods of data

analysis all assume we've collected our

data through planned introduction of

chance.

This means:

Random samples.


or a nonrandomized one (such as

historical data)

showed its subjects survived longer than

historical controls.

The sickest subjects couldn't be given the
surgery (they likely wouldn't survive it), so

were excluded from the study.

A randomized controlled study showed only

minor survival differences between the

groups.


convenience.

participate in experiments.

Critics have suggested modern psychology be

renamed psychology of the college

sophomore.


drawn from a different population than

implied.

children of students. A sample of students of

the university asking if they require daycare to

attend classes is virtually certain to have a

small percentage (at most) saying yes. This

is not useful for determining if providing

daycare would allow other parents to attend

classes.


about how the sample was collected, take

the results with a few barrels of salt,

especially if they seem unreasonable

otherwise.


commonly reported statistical results, yet

they have some of the greatest dangers

associated with them.

to evaluating the results it reports.


finding it once it is defined is often difficult

or impossible.

a well-defined population, but it could be hard

to find a list.


often produce skewed results.

Democrat) was running for his second

term as president against Republican Alf

Landon.

questionnaires to 10 million people from

phone books, club membership lists, and

magazine subscription lists.


received, they predicted Landon would win

57% to 43%. On election day, Roosevelt

won 62% to 38%.


Report: Women and Love. This was a

study based on a long, essay-type survey

of women on love and sex.

to organizations like church groups,

political organizations, and counseling

centers, Hite received 4,500 back.


headlines!) like 70% of women married

five years or more are having sex outside

of their marriages.

large, and well-matched to census data in

factors such as race, income, and

geographic region, that her results can be

taken as representative of the country as a

whole.


a survey that will not provide the exact

questions asked.

on assistance to the poor.

44% think we spend too much on welfare.


care plan administered by the federal government that

would compete directly with private health insurance

companies?

o

feel it is to give people a choice of both a public plan

administered by the federal government and a private

plan for their health insurance extremely important,

quite important, not that important, or not at all

important?

o

77% extremely (58%) + quite (19%), 22% not that (7%) + not

at all (15%)


From Republican

congressman

John Culberson's

web page:


potentially investing billions to try to keep financial institutions

and markets secure. Do you think this is the right thing or the

wrong thing for the government to be doing?

use taxpayers' dollars to rescue ailing private financial firms

whose collapse could have adverse effects on the economy

and market?

steps the Federal Reserve and the Treasury Department

have taken to try to deal with the current situation involving

the stock market and major financial institutions?


may be confusing, and thus bias the

results.

impossible to you that the Nazi extermination

of the Jews never happened?

22% said possible.

1994: Does it seem possible to you that the

Nazi extermination of the Jews never

happened, or do you feel certain that it

happened?

1% said possible.


the answer.

black, surveyed Southern blacks during World

War II, asking if blacks would be treated better

or worse if Japan conquered the U.S.

Black interviewers: 9% better, 25% worse.

White interviewers: 3% better, 45% worse.


interviewer can affect the results no matter

who does the questioning.

if people brushed as much as they

claimed, toothpaste sales would be three

times higher than they actually were.


that 39% of Americans had an opinion about the

Simpson-Bowles deficit reduction plan.

Panetta-Burns plan, even though the latter didn't

exist!

voted on Proposition 19 to legalize, tax,

and regulate recreational marijuana use.

been conducted:


proposition being defeated by 1, 2, and 4

percentage points.

3 automated polls (robopolls) showed the

proposition passing by 10, 14, and 16 points.


Dangers of Inference

Garbage In, Garbage Out

there are issues with the data collection, there

will still be perfectly good-looking results from

the computer.

That's why checking the study design is so

critical.


Data snooping

conduct a test, we should not be surprised

when that test returns a significant result.

Ex: A town of 50,000 has very high voltage

power lines. One year, the rate of a particular

type of cancer is 3 times the national

average.

A test of significance gives a p-value of

0.0002 = 1/5,000. Are the power lines

causing cancer?


250,000,000 into sets of 50,000, there would

be more than 5,000 of them. By chance,

you'd expect at least one to have such a high
rate. Since the high rate led us to test, it's not

convincing yet.

If the high rate were to persist over several

years, it would suggest that something in the

town was causing it. (Remember, correlation

is not causation!)

Important studies should be replicated to be

truly convincing.
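The arithmetic behind the power-line example can be made concrete. This sketch treats the population as 5,000 groups of 50,000 and, as a simplifying assumption not stated on the slide, treats the groups as independent, each with a 1-in-5,000 chance of showing a rate this extreme purely by chance.

```python
population = 250_000_000
town_size = 50_000
p_value = 0.0002          # chance of such a high rate in any one group

groups = population // town_size

# Expected number of groups that look this extreme with no real effect.
expected_extreme = groups * p_value

# Chance that at least one group looks this extreme (independence assumed).
prob_at_least_one = 1 - (1 - p_value) ** groups

print(f"groups = {groups}")
print(f"expected extreme groups = {expected_extreme:.1f}")
print(f"P(at least one) = {prob_at_least_one:.3f}")
```

Under these assumptions a town like this turns up somewhere more often than not, which is why a test prompted by the data itself is weak evidence until the result is replicated.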


device called the Aquarius which chose one

of 4 targets, which the subject was supposed

to predict.

Out of 7,500 guesses from 15 clairvoyant

subjects, 2,006 were hits. Compared to a null

of p=1/4, this gives a p-value of 0.0002.

Did this prove ESP?
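The quoted P-value can be reproduced with a one-sided z-test for a proportion. The sketch below also runs the same test against a null of p = 1/3 to show how strongly the conclusion depends on the chance rate assumed for the experiment; the helper `upper_tail_test` is illustrative.

```python
import math
from statistics import NormalDist

hits, guesses = 2006, 7500
hit_rate = hits / guesses

def upper_tail_test(p0):
    """z statistic and upper-tail P-value for H0: p = p0 vs. H1: p > p0."""
    se_null = math.sqrt(p0 * (1 - p0) / guesses)
    z = (hit_rate - p0) / se_null
    return z, 1 - NormalDist().cdf(z)

z_quarter, p_quarter = upper_tail_test(1 / 4)
z_third, p_third = upper_tail_test(1 / 3)

print(f"hit rate = {hit_rate:.4f}")
print(f"null p = 1/4: z = {z_quarter:.2f}, P = {p_quarter:.4f}")
print(f"null p = 1/3: z = {z_third:.2f}, P = {p_third:.4f}")
```

The observed hit rate of about 0.267 is far above 1/4 in standard-error terms yet far below 1/3, so the test only rules out one particular model of chance.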


number generator almost never picked the

same target twice in a row.

By selecting a different choice for the next

guess after the target lit up, a subject could

almost have a 1/3 chance of a hit.

A replication with an improved r.n.g. showed

no significant results.

The results of the first experiment weren't due
to chance with p=1/4, but they weren't due to
ESP either.

Statistical tests won't check your experimental

design.


editorial. Google found a report on The

Afghan Opium Survey 2008 from the

United Nations Office on Drugs and Crime.

imagery and surveys of farmers, villagers,

and traders.


the International Bariatric Surgery

Registry. No information how computed.

American Foundation for Suicide

Prevention. Also no information.
