
Why Statistics?

Two Purposes
1. Descriptive
Finding ways to summarize the
important characteristics of a dataset
2. Inferential
How (and when) to generalize from a
sample dataset to the larger population
Descriptive
Statistics
Firsthand Impression   Secondhand Impression
3.88                   3.38
1.88                   .63
2.00                   3.13
3.88                   4.25
2.50                   .50
3.25                   3.75
3.13                   1.50
1.50                   1.88
3.75                   .88
2.00                   2.25
2.38                   1.13
3.25                   3.38
2.88                   1.00
.88                    -.25
3.50                   1.63
4.13                   1.50
.38                    2.00
4.63                   2.13
[Figure: three frequency histograms, titled Firsthand, Secondhand, and Difference; y-axis: Frequency]
[Figure: two scatterplots with Firsthand on the x-axis (0 to 5); y-axes: Change (-3.5 to 2) and Secondhand (-1 to 5)]
Characterizing a Distribution of Data

[Figure: frequency histogram of Voice (x-axis: 1.00 to 7.00; y-axis: frequency)]
Comparing Distributions of Data

How could you summarize the differences?
[Figure: frequency histograms of Voice (x-axis: 1.00 to 7.00; y-axis: frequency), shown separately for Men and Women]
Looking for Linear Relationships

[Figure: scatterplot of Anger during Conflict (y-axis, 2.0 to 7.0) against Conflict Significance (x-axis, 1 to 7)]
Looking for Linear Relationships

[Figure: scatterplot of Current relationship satisfaction (y-axis, 2 to 7) against Conflict Significance (x-axis, 1 to 7)]
Comparing Linear Relationships

How could you summarize the differences?
[Figure: two scatterplots against Conflict Significance (x-axis, 1 to 7): Anger during Conflict (y-axis, 2.0 to 7.0) and Current relationship satisfaction (y-axis, 2 to 7)]
Complex Linear Relationships

[Figure: scatterplot of Current relationship satisfaction (y-axis, 2 to 7) against Conflict Significance (x-axis, 1 to 7), with points grouped by theory: entity vs. incremental]
Descriptive Statistics
Provides graphical and numerical ways to
organize, summarize, and characterize a dataset.
Types of Studies
Experimental:
The predictor variable is manipulated by the
researcher.
Observational:
The predictor variables are merely observed and
recorded by the researcher.
Types of Variables
Predictor variable:
The antecedent conditions that are used to predict the
outcome of interest. In an experimental study, this is
called the independent variable.
Outcome variable:
The variable you want to be able to predict. In an
experimental study, this is called the dependent
variable.
Types of Variables
Continuous variable:
An infinite number of possible values fall
between any two observed values.
Discrete variable:
Consists of separate, indivisible categories.
Categorical: a set of categories that have
different names.
Ordinal: a set of categories that are
organized in an ordered sequence.
Summarizing Discrete Data

Name Eye Color
Janice brown
Tom blue
Danielle green
Ian brown
Eduardo brown
Emily brown
Anja blue
Cara brown
Adrian brown
Eric blue
Sarah brown
David brown
Frequency Tables
Eye Color   Frequency
Brown       33
Blue        14
Green       3
Frequency Tables
Eye Color   Frequency   Relative Frequency
Brown       33          66%
Blue        14          28%
Green       3           6%
Frequency Bar Graph
[Figure: bar graph of Frequency (y-axis, 0 to 35) by Eye Color (Brown, Blue, Green)]
Relative Frequency Bar Graph
[Figure: bar graph of Relative Frequency (y-axis, 0 to 100) by Eye Color (Brown, Blue, Green)]
Summarizing Continuous Data

Name Hours of Sleep / Night
Janice 6
Tom 7.5
Danielle 10.5
Ian 9
Eduardo 7
Emily 6
Anja 8
Cara 5
Adrian 8.5
Eric 6.5
Sarah 7.5
David 4
Frequency Tables
Hours of Sleep   Frequency
3 - 4 hrs        1
4 - 5 hrs        3
5 - 6 hrs        6
6 - 7 hrs        14
7 - 8 hrs        16
8 - 9 hrs        5
9 - 10 hrs       3
10 - 11 hrs      2
Histogram (Frequency)

[Figure: histogram of Nightly Hours of Sleep (x-axis, 3.5 to 11.5) by Frequency (y-axis, 0 to 16)]
Frequency Tables
Hours of Sleep   Frequency   Relative Frequency
3 - 4 hrs        1           2%
4 - 5 hrs        3           6%
5 - 6 hrs        6           12%
6 - 7 hrs        14          28%
7 - 8 hrs        16          32%
8 - 9 hrs        5           10%
9 - 10 hrs       3           6%
10 - 11 hrs      2           4%
Histogram (Relative Frequency)

[Figure: histogram of Nightly Hours of Sleep (x-axis, 3.5 to 11.5) by Relative Frequency (y-axis, 0 to 100)]
Frequency Tables
Hours of Sleep   Frequency   Relative Frequency   Cumulative Frequency
3 - 4 hrs        1           2%                   2%
4 - 5 hrs        3           6%                   8%
5 - 6 hrs        6           12%                  20%
6 - 7 hrs        14          28%                  48%
7 - 8 hrs        16          32%                  80%
8 - 9 hrs        5           10%                  90%
9 - 10 hrs       3           6%                   96%
10 - 11 hrs      2           4%                   100%
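The relative and cumulative columns follow directly from the raw counts. A minimal Python sketch, assuming the eight bin counts from the table; the function name `frequency_table` is illustrative, not from the slides:

```python
# Bin labels and counts copied from the Hours of Sleep frequency table.
bins = ["3 - 4", "4 - 5", "5 - 6", "6 - 7", "7 - 8", "8 - 9", "9 - 10", "10 - 11"]
counts = [1, 3, 6, 14, 16, 5, 3, 2]

def frequency_table(counts):
    """Return (relative, cumulative) percentages for a list of bin counts."""
    total = sum(counts)
    relative = [100 * c / total for c in counts]
    cumulative = []
    running = 0
    for c in counts:
        running += c  # accumulate raw counts to avoid float drift
        cumulative.append(100 * running / total)
    return relative, cumulative

relative, cumulative = frequency_table(counts)
for label, c, r, cum in zip(bins, counts, relative, cumulative):
    print(f"{label:>7} hrs  {c:>3}  {r:5.1f}%  {cum:5.1f}%")
```

Each relative frequency is the bin count divided by the total (50 people here); each cumulative value is the running total of counts divided by the same total.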
Stem and Leaf Plots

Name


Janice 54
Tom 59
Danielle 35
Ian 41
Eduardo 46
Emily 25
Anja 47
Cara 60
Adrian 41
Eric 34
Sarah 22
David 45
Stem | Leaves
2    | 2 5
3    | 4 5
4    | 1 1 5 6 7
5    | 4 9
6    | 0
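The plot can be built mechanically: the tens digit is the stem, the ones digit is the leaf. A small Python sketch, assuming the twelve two-digit scores listed above; `stem_and_leaf` is a hypothetical helper name:

```python
from collections import defaultdict

# Scores copied from the slide's table of twelve names.
scores = [54, 59, 35, 41, 46, 25, 47, 60, 41, 34, 22, 45]

def stem_and_leaf(values):
    """Map each tens digit (stem) to its sorted ones digits (leaves)."""
    plot = defaultdict(list)
    for v in sorted(values):
        plot[v // 10].append(v % 10)
    return dict(plot)

plot = stem_and_leaf(scores)
for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))
```

Unlike a histogram, the plot keeps every original value recoverable from its stem and leaf.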
Back-to-Back Stem and Leaf Plots

Name


Janice 54
Tom 59
Danielle 35
Ian 41
Eduardo 46
Emily 25
Anja 47
Cara 60
Adrian 41
Eric 34
Sarah 22
David 45
Women (leaves) | Stem | Men (leaves)
2 5            | 2    |
5              | 3    | 4
7              | 4    | 1 1 5 6
4              | 5    | 9
0              | 6    |
Visual Depictions of Distributions
Summary
Discrete Data: Frequency Tables, Bar Graphs
Continuous Data: Frequency Tables, Bar Graphs, Stem and Leaf Plots
Visual Depictions of Relationships
IV -- categorical; DV -- continuous
Charts
Feelings of Caring
                    With    With    With    With    With       With
                    peers   profs   women   men     familiar   unfamiliar   Average
Unfriendly Female   -2.61   1.60    -2.37   -1.60   1.83       -3.38        -1.62
Visual Depictions of Relationships
IV -- categorical; DV -- continuous
Bar Graphs
[Figure: bar graph of Voicing Response (y-axis, 0 to 5) by implicit theory (Entity, Incremental), with separate bars for Mild Conflict and Extreme Conflict; bar values 3.5, 4.9, 3.8, 2.5]
Visual Depictions of Relationships
IV -- categorical; DV -- continuous
Bar Graphs
[Figure: bar graph of Voicing Response (y-axis, 0 to 5) by conflict level (Mild Conflict, Extreme Conflict), with separate bars for Entity and Incremental; bar values 3.5, 3.8, 4.9, 2.5]
Look at the same graph differently!
Visual Depictions of Relationships
IV -- categorical; DV -- continuous
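Bar Graphs
[Figure: bar graph of Voicing Response by conflict level (Mild Conflict, Extreme Conflict), with separate bars for Entity and Incremental, and the y-axis truncated to run from 2 to 5; bar values 3.5, 3.8, 4.9, 2.5]
Look again!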
Visual Depictions of Relationships
IV -- categorical; DV -- continuous
Line Graphs
[Figure: line graph titled Intention-Reading Performance; x-axis: 1st to 4th Quartile; y-axis: Percentile (0 to 100); lines: Estimated and Actual]
Visual Depictions of Relationships
IV -- categorical; DV -- continuous
Box-plots
Implicit Theory
Voice
Visual Depictions of Relationships
IV -- categorical; DV -- continuous
Error-bar plots
Person
attributions
Implicit Theory
Visual Depictions of Relationships
IV -- continuous; DV -- continuous
Scatterplots
[Figure: scatterplot of Current relationship satisfaction (y-axis, 2 to 7) against Conflict Significance (x-axis, 1 to 7)]
Visual Depictions of Relationships
IV -- continuous; DV -- continuous
Scatterplots
[Figure: scatterplot of Current relationship satisfaction (y-axis, 2 to 7) against Conflict Significance (x-axis, 1 to 7), with points grouped by theory: entity vs. incremental]
Visual Depictions of Relationships
IV -- continuous; DV -- continuous
Scatterplots with regression lines
[Figure: scatterplot of Current relationship satisfaction (y-axis, 2 to 7) against Conflict Significance (x-axis, 1 to 7), with a regression line for each theory group: entity and incremental]
Visual Depictions of Relationships
IV -- categorical; DV -- categorical
Contingency table
Narcissists * Male Crosstabulation
Count
                     Male
                     .00    1.00   Total
Narcissists   .00    19     53     72
              1.00   15     51     66
Total                34     104    138
Visual Depictions of Relationships
IV -- categorical; DV -- categorical
Contingency table
IV -- continuous; DV -- continuous
Scatterplot (regression line)
IV -- categorical; DV -- continuous
Charts, bar graphs, line graphs, box plots, error
bar plots
Inferential
Statistics
Inferential Statistics
Population:
The set of all individuals of interest (e.g. all
women, all college students)
Sample:
A subset of individuals selected from the
population from whom data is collected
Inferential statistics
Are these sample differences simply
due to chance?

[Figure: histograms of Nightly Hours of Sleep (x-axis, 3.5 to 11.5; y-axis: No. of People, 0 to 8), shown separately for Women and Men]
Some important terms
Parameter:
A characteristic of the population. Denoted
with Greek letters such as μ or σ.
Statistic:
A characteristic of a sample. Denoted with
English letters such as X or S.
Sampling Error:
Describes the amount of error that exists
between a sample statistic and the
corresponding population parameter.
We want to know whether Joe is an above-average
free-throw shooter. We collect some data (B = basket, M = miss):
B B M B   (proportion of baskets = .75)
B M M B B B M B   (proportion of baskets = .63)
M B B M B M M B B B M B   (proportion of baskets = .58)
Would you bet $10.00 that he makes the next shot?
Chance is Lumpy

H H T H   (proportion of heads = .75)
H T T H H H T H   (proportion of heads = .63)
T H H T H T T H H H T H   (proportion of heads = .58)
So how do we decide?
H H T H
Sample proportion = .75
Inferential Statistics helps us answer the
question:
Given a fair coin tossed four times, how often
would we get the result 75% heads by chance
alone?
Answer: If we took a fair coin and repeated
this procedure many times, we'd get this result
one out of every four times. Pretty often!
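The "one out of every four" figure comes from the binomial formula: there are 4 ways to place the single tail among four tosses, and each specific sequence has probability (1/2)^4. A short Python check; the function name is illustrative, not from the slides:

```python
from math import comb

def binomial_probability(k, n, p=0.5):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Exactly 3 heads in 4 tosses of a fair coin.
p_three_of_four = binomial_probability(3, 4)
print(p_three_of_four)  # 0.25
```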
So differences we see between
samples might not be reliable
Inferential statistics can tell us
whether or not our results are likely to
be due to chance alone
(especially when the differences are small
or the samples are small)
Important Point of Clarification

Statistics asks: Was this observed effect caused
by (lumpy) chance alone?
Inferential statistics separates:
Random causes:
Fluctuations of chance
Non-random causes:
True differences in the population
Bias in the design of the study
A statistically significant result doesn't mean the results have to be
true, just that they are unlikely to be due to chance alone.
Inferential Statistics
Descriptive
Statistics
Probability
Theory
Types of Analyses
IV -- categorical (groups); DV -- continuous
One Sample T-test. Inferences about the mean of one group
Two Sample T-test. Differences between the means of two groups.
ANOVA. Differences between the means of three or more groups.
[Figure: bar graph of Score (y-axis, 0 to 50) for First Grade, Third Grade, and Fifth Grade]
Types of Analyses
IV -- continuous; DV -- continuous
Correlation. The linear association between two continuous
variables
Regression. The best fit line of prediction.
[Figure: scatterplot of Sleep (y-axis, 0 to 10) against Age (x-axis, 0 to 80)]
Types of Analyses
IV -- categorical (groups); DV -- categorical
Z-test for proportions. The difference between two sample
proportions.
Chi-square test. The distribution of counts in each category,
compared across groups.
Narcissists * Male Crosstabulation
Count
                     Male
                     .00    1.00   Total
Narcissists   .00    19     53     72
              1.00   15     51     66
Total                34     104    138
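As a rough sketch of the chi-square idea, the statistic can be computed by hand from the four cell counts in the crosstabulation. This uses the standard Pearson formula, the sum over cells of (observed - expected)^2 / expected; the function name is illustrative, not from the slides:

```python
# Cell counts from the Narcissists * Male crosstabulation (rows: Narcissists .00, 1.00).
observed = [[19, 53],
            [15, 51]]

def chi_square_statistic(table):
    """Pearson chi-square statistic for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (obs - expected) ** 2 / expected
    return chi2

chi2 = chi_square_statistic(observed)
print(round(chi2, 3))
```

The result is roughly 0.25, far below 3.84 (the .05 critical value for one degree of freedom), so these counts give little evidence of an association.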
Fallibility of
Everyday
Reasoning
Everyday Statistical Reasoning
1. Something out of nothing: the misperception of
random data.
2. Too much from too little: the misinterpretation of
incomplete data
3. Seeing what you expect: biased evaluation of
ambiguous data
Misperceiving Random Data

"The human understanding supposes a greater degree of
order and equality in things than it really finds; and
although many things in nature be most irregular, will yet
invent parallels and conjugates and relatives where no such
thing is." -Francis Bacon
The clustering illusion
People do not intuitively expect chance to be lumpy.
They reject the possibility that clustering can be
random.
Hot hand in basketball. Winning streak or hot
table in gambling.
Gilovich et al., 1985
Interviewed 100 basketball fans

91% thought a player has a better chance of making a shot after
having just made his last 2-3 shots than he does after having just
missed his last 2-3 shots.

They estimated that a player's shooting percentage would be
61% after having just made a shot and 42% after having just
missed a shot.

84% of the respondents thought that it is important to pass the
ball to someone who has just made several shots in a row.
Gilovich et al., 1985
On average, players made 51% of shots after making their
previous shot, 54% of shots after missing their previous shot.
They made 50% of shots after making their previous two shots,
53% after missing their previous two shots.
They made 46% of shots after making their previous three
shots, 56% of shots after missing three in a row.
There were no more streaks of 4, 5, or 6 hits in a row than
chance would have predicted.
The data
The players, however, believed that they tended to shoot in
streaks.
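The comparison behind these numbers is a pair of conditional hit rates: the share of shots made after a hit versus after a miss. A minimal Python sketch on simulated data, assuming a hypothetical shooter whose shots are independent with a 50% hit rate (not the original study's data):

```python
import random

def conditional_hit_rates(shots):
    """Return (P(hit | previous hit), P(hit | previous miss)) for a 0/1 sequence."""
    after_hit = [cur for prev, cur in zip(shots, shots[1:]) if prev == 1]
    after_miss = [cur for prev, cur in zip(shots, shots[1:]) if prev == 0]
    return sum(after_hit) / len(after_hit), sum(after_miss) / len(after_miss)

rng = random.Random(0)  # fixed seed so the run is repeatable
# Simulated shooter: every shot is an independent 50/50 event (an assumption).
shots = [1 if rng.random() < 0.5 else 0 for _ in range(100_000)]
p_after_hit, p_after_miss = conditional_hit_rates(shots)
print(round(p_after_hit, 3), round(p_after_miss, 3))
```

For an independent shooter the two rates should come out nearly equal, even though the sequence itself contains many streaks.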
Gilovich et al., 1985
A group of college basketball players were asked to take 100 shots.
Before each shot they chose either a risky or conservative bet on
their ability to make the shot.

They tended to make risky bets after hitting their previous shot
and conservative bets after missing their previous shot.

However, there was no correlation between the outcome of
consecutive shots. No correlation between bets and outcomes.
The data
Gilovich et al., 1985
The response
"Who is this guy? So he makes a study. I couldn't care less."
-Red Auerbach, Celtics
"There are so many variables involved in shooting the basketball
that a paper like this doesn't mean anything."
-Bobby Knight
Selective Attention
Post-hoc causal explanations
Dangers of Post-Hoc
theorizing!
LAW of LARGE NUMBERS
The correct proportion of heads and tails or
hits and misses will be present globally in a
long sequence.

It will NOT, however, always be present
locally, in each of its parts.
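A quick simulation illustrates the global/local contrast. This Python sketch assumes a fair coin and an arbitrary window size of four tosses; both choices are illustrative:

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable
flips = [rng.random() < 0.5 for _ in range(10_000)]  # True = heads

def proportion_heads(seq):
    """Share of heads in a sequence of True/False tosses."""
    return sum(seq) / len(seq)

# Globally: the proportion over the whole long sequence.
global_prop = proportion_heads(flips)
# Locally: the proportion within each consecutive block of four tosses.
window_props = [proportion_heads(flips[i:i + 4]) for i in range(0, len(flips), 4)]
print(round(global_prop, 3), min(window_props), max(window_props))
```

The overall proportion sits close to one half, while individual four-toss blocks range from no heads at all to all heads.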
Misinterpreting Incomplete Data

"They still cling stubbornly to the idea that the only good
answer is a yes answer. If they say, 'Is the number
between 5,000 and 10,000?' and I say yes, they cheer; if I
say no, they groan, even though they get exactly the same
amount of information in either case."
-John Holt
Are professors particularly likely to be
absent-minded?
                 Absent-Minded   Not Absent-Minded
Professors       600             400
Not Professors   300             200
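Answering the question requires all four cells, not just the 600 absent-minded professors. A small Python sketch that compares the rate of absent-mindedness in each row; the dictionary keys are illustrative names:

```python
# Counts copied from the 2 x 2 table.
table = {
    "Professors": {"absent_minded": 600, "not_absent_minded": 400},
    "Not Professors": {"absent_minded": 300, "not_absent_minded": 200},
}

def absent_minded_rate(row):
    """Share of a group that is absent-minded."""
    return row["absent_minded"] / (row["absent_minded"] + row["not_absent_minded"])

rates = {group: absent_minded_rate(row) for group, row in table.items()}
for group, rate in rates.items():
    print(f"{group}: {rate:.0%}")
```

Both groups come out at 60%, so in this table being a professor tells you nothing about absent-mindedness.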
Does carrying an umbrella make it less likely
to rain?
              Rain   No rain
Umbrella
No umbrella
Does the Cosmo horoscope predict the
future?
                              Event happens   Event doesn't happen
Cosmo predicts event
Cosmo doesn't predict event
Can alternative medical technique X help
cancer patients who have been diagnosed as
incurable?
                                       Patient recovers   Patient fails to recover
Patient gets alternative med           500                4000
Patient does not get alternative med   3800               700
Selective attention
Available information
Positive test strategy

A B 2 3
All cards with a vowel on one side have an even number
on the other.
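The logic of the card task can be spelled out in a few lines: only a card whose hidden side could break the rule is worth turning over. This Python sketch assumes, as in the usual version of the task, that every card has a letter on one side and a number on the other; the function name is illustrative:

```python
VOWELS = set("AEIOU")

def must_turn_over(face):
    """True if the hidden side of this card could falsify the rule."""
    if face.isalpha():
        # A vowel might hide an odd number; a consonant cannot break the rule.
        return face.upper() in VOWELS
    # An odd number might hide a vowel; an even number cannot break the rule.
    return int(face) % 2 == 1

cards = ["A", "B", "2", "3"]
to_turn = [card for card in cards if must_turn_over(card)]
print(to_turn)  # ['A', '3']
```

The positive test strategy tempts people to turn over A and 2, but the 2 can only confirm the rule; the 3 is the card that can disconfirm it.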
Selective attention
Available information
Positive test strategy
Under-appreciation of base rates


                        Event occurs   Event does not occur
Event hypothesized
No event hypothesized
Watch out for incomplete data!
III. Projecting onto Ambiguous Data

"I'll see it when I believe it."
-Thane Pittman
Illusory correlations
When people see an association that is not present in
the data.
Arthritis pain is influenced by the weather.
Most women get bad moods before their menstrual
periods.
Chapman et al., 1967
Why do clinical psychologists continue to use projective
tests even though dozens of studies have shown these
tests are not valid indicators of personality?

Showed clinicians a series of Rorschach cards as well as
the patient's response to each card and some information
describing the patient's characteristics (sometimes
including sexual orientation).

Examined the correlations that clinicians saw between
particular responses and homosexuality.
Chapman et al., 1967
In truth, there are some counter-intuitive relationships.
Homosexuals are more likely to see a monstrous figure on
one card and an ambiguous animal-human figure on
another card.

Many of the intuitive relationships do not hold.
Homosexuals are not more likely to see anal content,
feminine clothing, or humans of uncertain gender.
Chapman et al., 1967
In Study 1, researchers designed the materials so that
there was no correlation between any of the responses
and homosexuality.

Clinicians did, however, believe the highly intuitive -- but
invalid -- correlations.
Chapman et al., 1967
In follow-up studies, researchers designed the materials
so that there was a negative correlation between the
intuitive responses and homosexuality.

The size of the illusory correlation was not reduced.
Clinicians may see non-existent correlations between
test responses and diagnoses
Managers may see non-existent correlations between
employees' race or gender and performance
Parents may see non-existent correlations between
children's sugar consumption and unruly behavior
Students may see non-existent correlations between
their peers' college majors and personalities.
Much of what we learn from experience may
reflect our prior theories about reality rather
than the actual nature of reality.
Everyday Statistical Reasoning
1. Something out of nothing: the misperception of
random data.
- Drawing strong conclusions from small lumpy
samples
2. Too much from too little: the misinterpretation of
incomplete data
- Inadequate comparison groups
3. Seeing what you expect: biased evaluation of
ambiguous data
- Illusory correlation based on confirmation bias
But there's hope
Following training in probability and
statistics, people are less likely to make
these errors.
Fallibility of
Statistical
Reports
Everyday Reasoning
1. Something out of nothing: the
misperception of random data.
2. Too much from too little: the
misinterpretation of incomplete
data (~control groups)
3. Seeing what you expect: biased
evaluation of ambiguous data
Statistical Reports
1. One thing out of something else:
overgeneralization from biased samples
and measures
2. Too much from too little:
the misinterpretation of incomplete data
(~control groups)
3. Getting what you expect: biased
presentation of ambiguous data
Overgeneralizing from Biased Samples

- In 1936, the Literary Digest predicted that Alf Landon would beat
Franklin D. Roosevelt in the presidential election, based on approx. 2
million survey responses
- How could a study with such a large sample be so wrong? Selection bias?
But participants were selected randomly from phone books
- Other polling agencies with smaller samples but more representative
methods accurately predicted Roosevelt's win
1936 Election Poll
Overgeneralizing from Biased Samples

- In early 1996, media raised the alarm about declining sperm counts, as a
result of a book published by Colborn, an environmentalist
- The book relied heavily on a 1992 Danish meta-analysis reviewing 61
papers published between 1938 and 1991, in which a total of 14,947 men
had their sperm tested.
- Found a significant decline in sperm count: from 113 million sperm per ml in
1940 to 66 million sperm per ml in 1990
Sperm Study
Overgeneralizing from Biased Samples

- Sample (number of men tested, by period):
Pre-1950     596
1951         1000 (one study!)
1952-1970    184
1970-1991    13,167
-The entire decline was carried by the single 1951 sample
-From 1970-1991, sperm counts actually increased
Sperm Study
Misinterpretation of Incomplete Data

- Murders significantly fell in NYC in the last decade: from 2,245 in
1990 to 596 in 2003
presumed cause: Giuliani
- Murders significantly fell all across the country from 1990 to 2003
- Crime started dropping in NYC in 1990, four years before Giuliani
became mayor.
Crime Study
Misinterpretation of Incomplete Data

- In October 1996, NCHS issued data showing that the rate of births
to unwed mothers had declined from 46.9 per thousand in 1994 to
44.9 per thousand in 1995, the first decline in 20 years. Front-page
coverage in the New York Times and Los Angeles Times.
- Clinton trumpets the results as a success for his new welfare
policies (instituted in 1996)
- Not mentioned: from 1993-1994 there was the largest one-year
increase in out-of-wedlock births since national figures have been
kept
Unwed Mothers Study
Biased Presentation of Ambiguous Results

Selective presentation of results
-In 1996, media publicized the results of a study presented at an NICHD
conference claiming that the bond between mothers and babies is not
weakened when the child is placed in day care
-Study measured the presence or absence of secure attachment in
infants
-Overall no difference in day care versus home care babies
Day Care Study
Biased Presentation of Ambiguous Results

-What the media did not highlight: a more confusing picture emerges
when the averages are broken out
-Baby boys were most likely to be insecurely bonded when they were in
day care for more than thirty hours a week
-Baby girls were most likely to be insecurely bonded when they were in
day care for less than 10 hours a week
Day Care Study
Selective presentation of results
Biased Presentation of Ambiguous Results

- Researchers sometimes present significant results and fail to present
null (or opposing) results.
-Sometimes you can catch them: look at their methods section and see
how many tests they must have run and how many they reported.
Psychology Research
Selective presentation of results
Biased Presentation of Ambiguous Results

Spin or specialized emphasis of particular results
Mortgage Study
-In 1995 a study by the Federal Reserve Bank of Chicago, showed that
among people with bad credit ratings, 10% of white applicants are
denied mortgages while 20% of black and Hispanic applicants are
denied mortgages.
-In the same study, however, it was found that compared to past years,
approved mortgages rose by 55% for black applicants, 45% for
Hispanic applicants, and 16% for white applicants.
Biased Presentation of Ambiguous Results

Spin or specialized emphasis of particular results
Mortgage Study
-In 1995 a study by the Federal Reserve Bank of Chicago, showed that
among people with bad credit ratings, 90% of white applicants are
granted mortgages while 80% of black and Hispanic applicants are
granted mortgages.
-In the same study, however, it was found that compared to past years,
approved mortgages rose by 55% for black applicants, by 45% for
Hispanic applicants, and by only 16% for white applicants.
Biased Presentation of Ambiguous Results

Spin or specialized emphasis of particular results
Mortgage Study
-The New York Times did not report the second finding until the fourth
paragraph of the article
-They also reported denial rates rather than approval rates.
-In approval terms, the comparison is 90% versus 80%. In denial terms,
the comparison is 10% versus 20%.
-"Twice as likely to be denied" makes you think people of color were
half as likely to be accepted, but actually they were about 89% as likely
to be accepted (80% / 90%).
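The two framings are the same numbers seen from opposite ends. A short Python sketch computing both ratios from the approval percentages quoted on the slides; the variable names are illustrative:

```python
# Approval percentages among applicants with bad credit ratings, from the slides.
approved = {"white": 90, "black_hispanic": 80}
denied = {group: 100 - pct for group, pct in approved.items()}

# Ratio of rates for black and Hispanic applicants relative to white applicants.
approval_ratio = approved["black_hispanic"] / approved["white"]
denial_ratio = denied["black_hispanic"] / denied["white"]

print(f"approval ratio: {approval_ratio:.3f}")  # 0.889
print(f"denial ratio:   {denial_ratio:.1f}")    # 2.0
```

When the underlying rate is small (10% vs. 20% denied), a ratio looks dramatic; the complementary rate (90% vs. 80% approved) gives a ratio much closer to 1.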
Biased Presentation of Ambiguous Results

Spin or specialized emphasis of particular results
Psychology Research
- Sometimes a p-value of 0.10 is treated as "not significant" (especially
if the researcher did not predict the effect)
- Other times the same p-value is emphasized as "marginally
significant" (especially if the researcher predicted the effect)
Can't always trust intuition


Can't always trust statistical reports
- Learn more about possible pitfalls in
intuitive decision making
- Learn more about how to evaluate
statistical reports and research findings
Practice
Washington Post April 12, 2000
Government-funded medical surveys since 1960 have shown
higher rates of at least one type of cancer varying from
thyroid tumors to leukemia at most of the major facilities
that produced nuclear weapons.
What's problematic about this statement?
