Example
Number of car thefts in the past 12 days:
6, 3, 7, 11, 4, 3, 8, 7, 2, 6, 9, 15
Mean $= \frac{6 + 3 + 7 + 11 + 4 + 3 + 8 + 7 + 2 + 6 + 9 + 15}{12} = \frac{81}{12} = 6.75$
Sample means:
Odd-numbered days: $\bar{x} = \frac{6 + 7 + 4 + 8 + 2 + 9}{6} = \frac{36}{6} = 6$
Even-numbered days: $\bar{x} = \frac{3 + 11 + 3 + 7 + 6 + 15}{6} = \frac{45}{6} = 7.5$
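The mean of all 12 days and the two sample means can be checked with a short Python sketch. The variable names are illustrative, and the list assumes the counts are given in day order (day 1 first), as the odd/even split in the example implies.

```python
from statistics import mean

# Daily car-theft counts from the example, assumed to be in day order.
thefts = [6, 3, 7, 11, 4, 3, 8, 7, 2, 6, 9, 15]

overall_mean = mean(thefts)         # all 12 days
odd_day_mean = mean(thefts[0::2])   # days 1, 3, 5, 7, 9, 11
even_day_mean = mean(thefts[1::2])  # days 2, 4, 6, 8, 10, 12

print(overall_mean, odd_day_mean, even_day_mean)
```

The two sample means differ from each other and from the mean of all 12 days, which is the point of the example: a sample mean depends on which observations end up in the sample.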
State:      CO    FL    IA    IL    KS    LA    MO    MS    NE    OK    SD    TX
Tornadoes:  1113  2009  1374  1137  2110  1086  1166  1039  1673  2300  1139  5490
Mean $= \frac{21636}{12} = 1803$
Eliminate the outlier (TX, 5490) from the sample set:
Mean $= \frac{16146}{11} \approx 1468$
Trimmed Mean
A trimmed mean can be used to reduce the influence of outliers.
Calculate a trimmed mean by dropping a certain percentage
of values from each end of a ranked data set.
Example: The following data represent the number of
tornadoes that touched down during 1950-1994 in the 12
states that had the most tornadoes during that period.
State:      CO    FL    IA    IL    KS    LA    MO    MS    NE    OK    SD    TX
Tornadoes:  1113  2009  1374  1137  2110  1086  1166  1039  1673  2300  1139  5490
Rank data: 1039, 1086, 1113, 1137, 1139, 1166, 1374, 1673, 2009, 2110, 2300, 5490
Trim 10% from both sides. 10% of 12 is 1.2, so eliminate 1 point from each side
Trimmed data: 1086, 1113, 1137, 1139, 1166, 1374, 1673, 2009, 2110, 2300
Trimmed mean $= \frac{15107}{10} \approx 1511$
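The trimming steps above can be sketched in Python. The function name `trimmed_mean` is hypothetical, and rounding the number of dropped values down is an assumption taken from the example (10% of 12 is 1.2, so 1 value is dropped from each side).

```python
import math

def trimmed_mean(values, percent):
    """Mean after dropping `percent`% of the values from each end of the ranked data."""
    ranked = sorted(values)
    # Number of values to drop per side, rounded down as in the example.
    k = math.floor(len(ranked) * percent / 100)
    kept = ranked[k:len(ranked) - k]
    return sum(kept) / len(kept)

# Tornado counts for the 12 states, 1950-1994.
tornadoes = [1113, 2009, 1374, 1137, 2110, 1086, 1166, 1039, 1673, 2300, 1139, 5490]

print(round(trimmed_mean(tornadoes, 10)))
```

Trimming removes the smallest value (1039) as well as the outlier (5490), so the result differs from the mean obtained by dropping the outlier alone.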
Weighted Mean
Use when certain values of a data set are
considered more important than others
Weighted mean $= \frac{\sum xw}{\sum w}$
                     Weight   Actual Grade
Computer Practice    10%      95%
Computer Quizzes     15%      90%
In-Class Quizzes     10%      87%
In-Class Tests       65%      83%
Weighted mean $= \frac{10 \cdot 95 + 15 \cdot 90 + 10 \cdot 87 + 65 \cdot 83}{10 + 15 + 10 + 65} = \frac{8565}{100} = 85.65\%$
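The grade calculation can be reproduced with a small Python sketch. The function name `weighted_mean` is illustrative; the grades and weights are the four course components from the table, in table order.

```python
def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)

grades = [95, 90, 87, 83]    # actual grade (%) for each component
weights = [10, 15, 10, 65]   # weight (%) of each component

print(weighted_mean(grades, weights))
```

Because the in-class tests carry 65% of the weight, the result sits much closer to 83 than the unweighted average of the four grades would.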
Median
The value of the middle term in a data set
that has been ranked in increasing order
Gives the center point of the histogram
Not influenced by outliers
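As a minimal Python sketch, the median of the car-theft counts used earlier in these slides can be found by ranking the data and averaging the two middle terms. The modified list `with_outlier` is a hypothetical variation, added only to illustrate that an extreme value does not move the median.

```python
from statistics import median

thefts = [6, 3, 7, 11, 4, 3, 8, 7, 2, 6, 9, 15]

ranked = sorted(thefts)
mid = len(ranked) // 2
# With an even number of values (12 here), average the two middle terms.
manual_median = (ranked[mid - 1] + ranked[mid]) / 2

# Hypothetical variation: replace the largest value (15) with 150.
with_outlier = [150 if x == 15 else x for x in thefts]

print(manual_median, median(thefts), median(with_outlier))
```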
Mode
The value that occurs with the highest frequency in a
data set
A data set may have no mode (no observation occurs more
than once) or may have more than one mode
One mode - unimodal
Two modes - bimodal
More than two modes - multimodal
Example: Number of car thefts in the past 12 days
6, 3, 7, 11, 4, 3, 8, 7, 2, 6, 9, 15
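A short Python sketch of the definitions above, applied to the car-theft counts (6, 3, 7, 11, 4, 3, 8, 7, 2, 6, 9, 15). The helper names `modes` and `describe` are hypothetical; the sketch follows the convention stated above that a data set in which no value repeats has no mode.

```python
from collections import Counter

def modes(values):
    """Return the sorted list of modes, or [] if no value occurs more than once."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []
    return sorted(v for v, c in counts.items() if c == top)

def describe(values):
    """Label a data set by its number of modes."""
    labels = {0: "no mode", 1: "unimodal", 2: "bimodal"}
    return labels.get(len(modes(values)), "multimodal")

thefts = [6, 3, 7, 11, 4, 3, 8, 7, 2, 6, 9, 15]
print(modes(thefts), describe(thefts))
```

In this data set the values 3, 6, and 7 each occur twice, so there are three modes and the data set is multimodal.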
Measures of Dispersion
for Ungrouped Data
Section 3.2
Measures of Dispersion
State:      CO    FL    IA    IL    KS    LA    MO    MS    NE    OK    SD    TX
Tornadoes:  1113  2009  1374  1137  2110  1086  1166  1039  1673  2300  1139  5490
Standard Deviation
Population Variance: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
Sample Variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
Recall: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
Calculate $\mu$: $\mu = 2.875$
Calculate $\sum (x - \mu)^2 = 494.875$
$\sigma^2 = \frac{494.875}{8} = 61.859$
$\sigma = \sqrt{61.859} = 7.865$
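Short-cut Formulas:
$\sigma^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{N}}{N}$
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}$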
State:      CO    FL    IA    IL    KS    LA    MO    MS    NE    OK    SD    TX
Tornadoes:  1113  2009  1374  1137  2110  1086  1166  1039  1673  2300  1139  5490
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}$
Need to know: $\sum x^2$ and $(\sum x)^2$
$s = \sqrt{1549337.273} = 1244.724$
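The short-cut calculation for the tornado data can be sketched in Python as follows. The variable names are illustrative, and the cross-check against `statistics.variance` (which uses the n - 1 divisor) is an addition for verification, not part of the slides.

```python
import math
from statistics import variance

tornadoes = [1113, 2009, 1374, 1137, 2110, 1086, 1166, 1039, 1673, 2300, 1139, 5490]

n = len(tornadoes)
sum_x = sum(tornadoes)                    # sum of x
sum_x_sq = sum(x * x for x in tornadoes)  # sum of x squared

# Short-cut formula for the sample variance.
s2 = (sum_x_sq - sum_x ** 2 / n) / (n - 1)
s = math.sqrt(s2)

print(sum_x, sum_x_sq, round(s2, 3), round(s, 3))

# Cross-check against the definitional formula used by the standard library.
assert math.isclose(s2, variance(tornadoes))
```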
Parameter
o A numerical measure calculated for a population data
set
o e.g. mean, median, mode, range, variance, standard
deviation
o e.g. $\mu$, $\sigma$
Statistic
Number of Jugs (f)   Midpoint (m)   mf
5                    122            122*5 = 610
13                   124            124*13 = 1612
42                   126            126*42 = 5292
129                  128            128*129 = 16512
61                   130            130*61 = 7930
$\bar{x} = \frac{\sum mf}{n} = \frac{31956}{250} = 127.824$
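Population Variance: $\sigma^2 = \frac{\sum f(m - \mu)^2}{N}$
Sample Variance: $s^2 = \frac{\sum f(m - \bar{x})^2}{n - 1}$
Short-cut Formulas:
$\sigma^2 = \frac{\sum m^2 f - \frac{(\sum mf)^2}{N}}{N}$
$s^2 = \frac{\sum m^2 f - \frac{(\sum mf)^2}{n}}{n - 1}$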
Number of Jugs (f)   Midpoint (m), Ounces of Milk   mf                $m^2 f$
5                    122                            122*5 = 610       74420
13                   124                            124*13 = 1612     199888
42                   126                            126*42 = 5292     666792
129                  128                            128*129 = 16512   2113536
61                   130                            130*61 = 7930     1030900
$\sum mf = 31956$, $\sum m^2 f = 4085536$, $n = 250$
Recall: $s^2 = \frac{\sum m^2 f - \frac{(\sum mf)^2}{n}}{n - 1}$
$s^2 = \frac{4085536 - \frac{31956^2}{250}}{250 - 1} = 3.182$
$s = \sqrt{3.182} = 1.784$
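The grouped-data calculation can be sketched in Python using only the class frequencies and midpoints from the table. The variable names are illustrative, and the sample (n - 1) form of the short-cut formula is assumed, matching the divisor 250 - 1 in the worked example.

```python
import math

freqs = [5, 13, 42, 129, 61]           # number of jugs (f) in each class
midpoints = [122, 124, 126, 128, 130]  # class midpoints (m), ounces of milk

n = sum(freqs)
sum_mf = sum(m * f for m, f in zip(midpoints, freqs))
sum_m2f = sum(m * m * f for m, f in zip(midpoints, freqs))

mean = sum_mf / n
# Short-cut formula for the sample variance of grouped data.
s2 = (sum_m2f - sum_mf ** 2 / n) / (n - 1)
s = math.sqrt(s2)

print(n, sum_mf, sum_m2f, mean, round(s2, 3), round(s, 3))
```

Since every observation in a class is represented by the class midpoint, these values approximate the mean and standard deviation of the underlying ungrouped data.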
Use of Standard
Deviation
Section 3.4
Standard Deviation
Allows us to find the proportion or
percentage of total observations that fall
within a given interval about the mean
Chebyshev's Theorem
Gives a lower bound for the area under a
curve between two points on opposite sides
of the mean and at the same distance from
the mean.
For any number k greater than 1, at least
$1 - \frac{1}{k^2}$ of the data values lie within k standard
deviations of the mean.
Chebyshev's Theorem
If k = 2,
$1 - \frac{1}{k^2} = 1 - \frac{1}{2^2} = 1 - \frac{1}{4} = .75$, or 75%
At least 75% of the values in the data set lie within 2
standard deviations of the mean.
If k = 3,
$1 - \frac{1}{k^2} = 1 - \frac{1}{3^2} = 1 - \frac{1}{9} \approx .89$, or 89%
At least 89% of the values in the data set lie within 3
standard deviations of the mean.
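The bound can be evaluated for any k greater than 1 with a small Python sketch. The function name `chebyshev_lower_bound` is hypothetical.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2

for k in (2, 2.5, 3):
    print(k, f"{chebyshev_lower_bound(k):.0%}")
```

The result is a lower bound that holds for a distribution of any shape, so the actual percentage within k standard deviations may be higher.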
Solution
a.i. At least 84% of all students had loans between
$2,079.65 and $14,139.65.
a.ii. At least 75% of all students had loans between
$3,285.65 and $12,933.65.
b. The interval $873.65 to $15,345.65 contains
at least 89% of all students.
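The problem statement for this solution is not reproduced here, so the following Python sketch works from assumptions: the three intervals are consistent with a mean of $8,109.65 and a standard deviation of $2,412 (values inferred from the interval endpoints, not quoted from the source), with k = 2.5, 2, and 3. The function name `chebyshev_interval` is hypothetical.

```python
def chebyshev_interval(mean, sd, k):
    """Return (low, high, bound): the interval mean +/- k*sd and Chebyshev's lower bound."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    bound = 1 - 1 / k ** 2
    return mean - k * sd, mean + k * sd, bound

# Assumed values, inferred from the intervals in the solution.
mean_loan = 8109.65
sd_loan = 2412

for k in (2.5, 2, 3):
    low, high, bound = chebyshev_interval(mean_loan, sd_loan, k)
    print(f"k = {k}: ${low:,.2f} to ${high:,.2f}, at least {bound:.0%}")
```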
Empirical Rule
*Only applies to a bell-shaped (normal) distribution.
Solution
a.i. About 68% of all textbooks cost between $150
and $210.
a.ii. About 95% of all textbooks cost between $120
and $240.
b. About 99.7% of college textbooks cost between
$90 and $270.
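The three intervals in the solution are consistent with a mean of $180 and a standard deviation of $30. Both values are inferred from the interval endpoints, since the problem statement is not reproduced here. Under that assumption, a minimal Python sketch of the empirical rule is:

```python
# Approximate percentage of values within k standard deviations of the mean
# for a bell-shaped distribution.
EMPIRICAL_COVERAGE = {1: 68.0, 2: 95.0, 3: 99.7}

def empirical_interval(mean, sd, k):
    """Return the interval mean +/- k*sd."""
    return mean - k * sd, mean + k * sd

# Assumed values, inferred from the intervals in the solution.
mean_price, sd_price = 180, 30

for k, pct in EMPIRICAL_COVERAGE.items():
    low, high = empirical_interval(mean_price, sd_price, k)
    print(f"About {pct}% of textbooks cost between ${low} and ${high}")
```

Unlike Chebyshev's theorem, these percentages are approximations that apply only when the distribution is bell-shaped.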
Measures of Position
Section 3.5
Measure of Position
The measure of position determines the
position of a single value in relation to other
values in a sample or population data set.
Quartiles
Summary measures that divide a ranked data
set into four equal parts
The 2nd quartile, Q2, is the same as the median.
The 1st quartile, Q1, is the value of the middle
term among observations less than the median.
The 3rd quartile, Q3, is the value of the middle
term among observations greater than the
median.
Interquartile Range
The interquartile range (IQR) is the difference
between the third and first quartiles.
IQR = Q3 - Q1
Example: Number of car thefts in the past 12
days
Ranked data: 2, 3, 3, 4, 6, 6, 7, 7, 8, 9, 11, 15
Q1 = 3.5
Q2 = 6.5
Q3 = 8.5
IQR = 8.5 - 3.5 = 5
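The quartiles for the car-theft counts can be reproduced with a Python sketch of the median-of-halves method described above. The function name `quartiles` is hypothetical, and splitting the ranked data by position (rather than by comparing values to the median) is an assumption that matters only when values tie with the median.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median of each half of the ranked data."""
    ranked = sorted(values)
    n = len(ranked)
    half = n // 2
    lower = ranked[:half]          # terms below the median position
    upper = ranked[half + n % 2:]  # terms above it (middle term skipped if n is odd)
    return median(lower), median(ranked), median(upper)

thefts = [6, 3, 7, 11, 4, 3, 8, 7, 2, 6, 9, 15]
q1, q2, q3 = quartiles(thefts)
iqr = q3 - q1

print(q1, q2, q3, iqr)
```

Library routines such as `statistics.quantiles` use interpolation rules that can give slightly different quartiles, so the sketch implements the slide's method directly.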
Percentiles
Percentiles divide a ranked data set into 100
equal parts.
Calculating Percentiles
Example: Number of car thefts in the past 12
days
Ranked data: 2, 3, 3, 4, 6, 6, 7, 7, 8, 9, 11, 15
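Percentile Rank
The percentile rank of a value $x_i$ in a data set gives the
percentage of values in the data set that are less than $x_i$.
Percentile rank of $x_i = \frac{\text{number of values less than } x_i}{\text{total number of values in the data set}} \times 100\%$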
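The definition translates directly into a few lines of Python. The function name `percentile_rank` is hypothetical, and the value 9 is chosen only for illustration from the car-theft counts used earlier in these slides.

```python
def percentile_rank(values, x):
    """Percentage of values in the data set that are strictly less than x."""
    below = sum(1 for v in values if v < x)
    return below / len(values) * 100

thefts = [6, 3, 7, 11, 4, 3, 8, 7, 2, 6, 9, 15]

# Nine of the 12 values (2, 3, 3, 4, 6, 6, 7, 7, 8) are less than 9.
print(percentile_rank(thefts, 9))
```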
Box-and-Whisker Plot
Section 3.6
Box-and-Whisker Plot
A graphic presentation of data using five
measures: median, first quartile, third
quartile, and smallest and largest values in
the data set between the lower and upper
inner fences.
A visualization of the center, the spread, and
the skewness of a data set.
Can detect outliers
Lower inner fence = Q1 - 1.5 × IQR
Upper inner fence = Q3 + 1.5 × IQR
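The quantities needed to draw a box-and-whisker plot can be computed with a short Python sketch, shown here for the car-theft counts used earlier in these slides. The function name `box_plot_summary` is hypothetical, and the quartiles use the median-of-halves method with the ranked data split by position, which is an assumption about how ties with the median are handled.

```python
from statistics import median

def box_plot_summary(values):
    """Quartiles, inner fences, whisker ends, and outliers for a box-and-whisker plot."""
    ranked = sorted(values)
    n = len(ranked)
    half = n // 2
    q1 = median(ranked[:half])
    q2 = median(ranked)
    q3 = median(ranked[half + n % 2:])
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the smallest and largest values inside the inner fences.
    inside = [v for v in ranked if lower_fence <= v <= upper_fence]
    outliers = [v for v in ranked if v < lower_fence or v > upper_fence]
    return {
        "q1": q1, "median": q2, "q3": q3, "iqr": iqr,
        "lower_fence": lower_fence, "upper_fence": upper_fence,
        "whiskers": (inside[0], inside[-1]),
        "outliers": outliers,
    }

thefts = [6, 3, 7, 11, 4, 3, 8, 7, 2, 6, 9, 15]
print(box_plot_summary(thefts))
```

For this data set the fences are -4 and 16, so every value falls inside them and no outliers are flagged.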
Data: 69, 84, 112, 74, 104, 81, 90, 94, 144, 79, 98
Review Exercise
The following data give the time (in minutes)
that each of 20 students selected from a
university waited in line at their bookstore to
pay for their textbooks at the beginning of the
Fall 2012 semester. Create a box-and-whisker plot displaying the data set.
15   8  23  21   5   6   5  10  14  17
31  19  34   3  22  30  31  25  17  16