
Statistics for Business and

Economics
Thirteenth Edition

Chapter 2
Methods for
Describing Sets
of Data

Copyright © 2018, 2014, 2011 Pearson Education, Inc. All Rights Reserved Slide - 1
Contents (1 of 2)

1. Describing Qualitative Data


2. Graphical Methods for Describing Quantitative
Data
3. Numerical Measures of Central Tendency

4. Numerical Measures of Variability


5. Using the Mean and Standard Deviation to
Describe Data

Contents (2 of 2)

6. Numerical Measures of Relative Standing


7. Methods for Detecting Outliers: Box Plots and z-Scores
8. Graphing Bivariate Relationships
9. The Time Series Plot
10. Distorting the Truth with Descriptive Techniques

Learning Objectives

1. Describe data using graphs


2. Describe data using numerical measures
3. Describe quantitative data using numerical
measures
4. Describe the relationship between two
quantitative variables using graphs
5. Detect descriptive methods that distort the truth
Section 2.1 Describing Qualitative
Data

Key Terms
A class is one of the categories into which
qualitative data can be classified.
The class frequency is the number of
observations in the data set falling into a particular
class.
The class relative frequency is the class
frequency divided by the total number of
observations in the data set.
The class percentage is the class relative
frequency multiplied by 100.
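The three definitions above can be sketched in a few lines of Python. This is only an illustration: the `responses` list is invented for the example and the function name `summary_table` is a hypothetical helper, not something from the text.

```python
from collections import Counter


def summary_table(observations):
    """Return {class: (frequency, relative frequency, percentage)}."""
    n = len(observations)
    counts = Counter(observations)  # class frequency for each class
    return {
        cls: (freq, freq / n, 100 * freq / n)
        for cls, freq in counts.items()
    }


# Hypothetical qualitative responses, invented for illustration only.
responses = ["A", "B", "A", "C", "A", "B", "A", "C", "B", "A"]
table = summary_table(responses)

for cls, (freq, rel, pct) in sorted(table.items()):
    print(f"{cls}: frequency={freq}, relative frequency={rel:.2f}, percentage={pct:.0f}%")
```

The relative frequencies always sum to 1, and the percentages to 100, which is a quick check on any summary table.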
Data Presentation (1 of 8)

Data Presentation (2 of 8)

Summary Table

1. Lists categories and the number of elements in each category
2. Obtained by tallying responses in each category
3. May show frequencies (counts), percentages, or both

Data Presentation (3 of 8)

Bar Graph

Data Presentation (4 of 8)

Pie Chart

1. Shows breakdown of
total quantity into
categories
2. Useful for showing
relative differences
3. Angle size of a slice = $360^\circ \times (\text{class percentage} / 100)$
Data Presentation (5 of 8)

Pareto Diagram

Like a bar graph, but with the categories arranged
by height in descending order from left to right.

Section 2.1 Summary
Bar graph: The categories (classes) of the qualitative
variable are represented by bars, where the height of each
bar is either the class frequency, class relative frequency,
or class percentage.
Pie chart: The categories (classes) of the qualitative
variable are represented by slices of a pie (circle). The size
of each slice is proportional to the class relative frequency.
Pareto diagram: A bar graph with the categories (classes)
of the qualitative variable (i.e., the bars) arranged by height
in descending order from left to right.

Thinking Challenge

You’re an analyst for IRI. You want to show the market shares held by Web browsers in 2016. Construct a bar graph, pie chart, & Pareto diagram to describe the data.
Browser              Mkt. Share (%)
Firefox              14
Internet Explorer    81
Safari               4
Others               1
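Before drawing anything, the numbers behind the pie chart and the Pareto diagram can be worked out directly. The sketch below is one possible way to do it; the variable names are illustrative, and no plotting library is assumed.

```python
# Market shares (%) from the table above.
shares = {"Firefox": 14, "Internet Explorer": 81, "Safari": 4, "Others": 1}

# Pie chart: each slice's angle is 360 degrees times the class relative frequency.
angles = {name: 360 * pct / 100 for name, pct in shares.items()}

# Pareto diagram: the same bars as a bar graph, ordered tallest first.
pareto_order = sorted(shares, key=shares.get, reverse=True)

for name in pareto_order:
    print(f"{name:18s} {shares[name]:3d}%  slice angle = {angles[name]:.1f} degrees")
```

The slice angles add up to 360 degrees because the percentages add up to 100.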

Bar Graph Solution*

Pie Chart Solution*

Pareto Diagram Solution*

Section 2.2 Graphical Methods for
Describing Quantitative Data

Data Presentation (6 of 8)

Dot Plot

1. Horizontal axis is a scale for the quantitative variable, e.g., percent.
2. The numerical value of each measurement is located on the horizontal scale by a dot.

Data Presentation (7 of 8)

Stem-and-Leaf Display
1. Divide each observation into a stem value and a leaf value
• Stems are listed in order in a column
• Leaf value is placed in the corresponding stem row to the right of the bar
2. Data: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41
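The procedure above can be sketched as follows, assuming two-digit whole numbers so that the tens digit is the stem and the units digit is the leaf. The function name `stem_and_leaf` is a hypothetical helper.

```python
from collections import defaultdict


def stem_and_leaf(data):
    """Map each stem (tens digit) to its leaves (units digits) in increasing order."""
    rows = defaultdict(list)
    for value in sorted(data):
        stem, leaf = divmod(value, 10)
        rows[stem].append(leaf)
    return dict(rows)


data = [21, 24, 24, 26, 27, 27, 30, 32, 38, 41]
display = stem_and_leaf(data)

# List every possible stem in order, with its leaves to the right of the bar.
for stem in range(min(display), max(display) + 1):
    leaves = "".join(str(leaf) for leaf in display.get(stem, []))
    print(f"{stem} | {leaves}")
```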

Data Presentation (8 of 8)

Determining the Number of Classes in a Histogram
Number of Observations in Data Set    Number of Classes
Less than 25                          5-6
25-50                                 7-14
More than 50                          15-20

Histogram
Class Frequency
15.5-25.5 3
25.5-35.5 5
35.5-45.5 2
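The class frequencies in this table can be reproduced with a short tally. The sketch assumes the histogram summarizes the same ten observations listed on the stem-and-leaf slide (the counts agree), and `class_frequencies` is a hypothetical helper name.

```python
def class_frequencies(data, boundaries):
    """Count the observations falling in each class interval (lower, upper]."""
    counts = []
    for lower, upper in zip(boundaries, boundaries[1:]):
        counts.append(sum(1 for x in data if lower < x <= upper))
    return counts


# Assumed to be the data behind the histogram (taken from the stem-and-leaf slide).
data = [21, 24, 24, 26, 27, 27, 30, 32, 38, 41]
boundaries = [15.5, 25.5, 35.5, 45.5]  # equal-width classes, as in the table

frequencies = class_frequencies(data, boundaries)
relative = [f / len(data) for f in frequencies]

for (lower, upper), f, r in zip(zip(boundaries, boundaries[1:]), frequencies, relative):
    print(f"{lower}-{upper}: frequency={f}, relative frequency={r:.1f}")
```

Placing the class boundaries at half units (15.5, 25.5, ...) means that no whole-number observation can fall exactly on a boundary.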

Section 2.2 Summary (1 of 2)
Dot plot: The numerical value of each quantitative
measurement in the data set is represented by a dot on a
horizontal scale. When data values repeat, the dots are
placed above one another vertically.
Stem-and-leaf display: The numerical value of the
quantitative variable is partitioned into a “stem” and a “leaf.”
The possible stems are listed in order in a column. The leaf
for each quantitative measurement in the data set is placed
in the corresponding stem row. Leaves for observations
with the same stem value are listed in increasing order
horizontally.

Section 2.2 Summary (2 of 2)

Histogram: The possible numerical values of the quantitative variable are partitioned into class intervals, where each interval has the same width. These intervals form the scale of the horizontal axis. The frequency or relative frequency of observations in each class interval is determined. A vertical bar is placed over each class interval, with height equal to either the class frequency or class relative frequency.

Section 2.3 Numerical Measures of
Central Tendency

Two Characteristics (1 of 2)

The central tendency of the set of measurements, that is, the tendency of the data to cluster, or center, about certain numerical values.

Two Characteristics (2 of 2)

The variability of the set of measurements, that is, the spread of the data.

Mean

The mean of a set of quantitative data is the sum of the measurements divided by the number of measurements contained in the data set.
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

Example 1

Calculate the mean of the following six sample measurements:
10.3, 4.9, 8.9, 11.7, 6.3, 7.7
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{10.3 + 4.9 + 8.9 + 11.7 + 6.3 + 7.7}{6} = \frac{49.8}{6} = 8.3$
Symbols for the Sample and Population
Mean

In this text, we adopt a general policy of using Greek letters to represent population numerical descriptive measures and Roman letters to represent corresponding descriptive measures for the sample. The symbols for the mean are
Sample mean = $\bar{x}$
Population mean = $\mu$

Median

1. Measure of central tendency
2. Middle value in ordered sequence
If n is odd, middle value of sequence
If n is even, average of 2 middle values
3. Position of median in sequence
Positioning Point = $\frac{n + 1}{2}$
4. Not affected by extreme values


Median Example Odd-Sized Sample

Raw Data:  24.1  22.6  21.5  23.7  22.6

Ordered:  21.5  22.6  22.6  23.7  24.1


Position: 1  2  3  4 5
Median = 22.6

Median Example Even-Sized Sample

Raw Data: 10.3  4.9  8.9  11.7  6.3  7.7
Ordered: 4.9  6.3  7.7  8.9  10.3  11.7
Position: 1  2  3  4  5  6
Median = $\frac{7.7 + 8.9}{2} = 8.3$
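The positioning-point rule, (n + 1) / 2, handles both the odd and the even case. A minimal sketch, using the two data sets from the median examples (the function is written out by hand here purely to mirror the slides):

```python
def median(data):
    """Median via the positioning point (n + 1) / 2, counted from 1."""
    ordered = sorted(data)
    n = len(ordered)
    position = (n + 1) / 2
    if n % 2 == 1:
        # Odd n: the positioning point is a whole number, so take that value.
        return ordered[int(position) - 1]
    # Even n: the positioning point falls halfway between two positions,
    # so average the values on either side of it.
    below = ordered[int(position) - 1]
    above = ordered[int(position)]
    return (below + above) / 2


odd_sample = [24.1, 22.6, 21.5, 23.7, 22.6]
even_sample = [10.3, 4.9, 8.9, 11.7, 6.3, 7.7]

print(median(odd_sample))
print(median(even_sample))
```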

Skewed

A data set is said to be skewed if one tail of the distribution has more extreme observations than the other tail.

Shape
1. Describes how data are distributed
2. Measures of Shape
• Skew = Symmetry

Mode

1. Measure of central tendency


2. Value that occurs most often
3. Not affected by extreme values
4. May be no mode or several modes

5. May be used for quantitative or qualitative data

Mode Example

No Mode
Raw Data: 10.3  4.9  8.9  11.7  6.3  7.7
One Mode
Raw Data: 6.3  4.9  8.9  6.3  4.9  4.9
More Than 1 Mode
Raw Data: 21  28  28  41  43  43

Thinking Challenge 1 (1 of 3)

You’re a financial analyst for Prudential-Bache Securities. You have collected the following closing stock prices of new stock issues: 17, 16, 21, 18, 13, 16, 12, 11. Describe the stock prices in terms of central tendency.
Thinking Challenge 1 (2 of 3)
Solution
Mean
Raw Data: 17  16  21  18  13  16  12  11
$\bar{x} = \frac{17 + 16 + 21 + 18 + 13 + 16 + 12 + 11}{8} = \frac{124}{8} = 15.5$
Median
Ordered: 11  12  13  16  16  17  18  21
Median = $\frac{16 + 16}{2} = 16$
Thinking Challenge 1 (3 of 3)

Mode
Raw Data: 17  16  21  18  13  16  12  11
Mode = 16
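The three answers can be cross-checked with Python's standard `statistics` module. This is just one convenient way to verify the hand calculation; `multimode` requires Python 3.8 or later.

```python
import statistics

prices = [17, 16, 21, 18, 13, 16, 12, 11]

mean_price = statistics.mean(prices)      # sum of the prices divided by 8
median_price = statistics.median(prices)  # average of the two middle ordered values
modes = statistics.multimode(prices)      # every value tied for most frequent

print(mean_price, median_price, modes)
```

`multimode` returns a list, so it also covers the "several modes" case; when every value occurs equally often it returns all of them, which corresponds to the "no mode" case on the earlier slide.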

Section 2.4 Numerical Measures of
Variability

Range
1. Measure of dispersion
2. Difference between largest & smallest observations
Range = $x_{\text{largest}} - x_{\text{smallest}}$
3. Ignores how data are distributed
Example: two data sets with different distributions can share the same range, Range = 10 − 7 = 3.

Variance & Standard Deviation

1. Measures of dispersion
2. Most common measures
3. Consider how data are distributed
4. Show variation about mean ($\bar{x}$ or $\mu$)

Sample Variance Formula
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}$

Sample Standard Deviation Formula
$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}}$

Symbols for Variance and Standard Deviation

$s^2$ = Sample variance
$s$ = Sample standard deviation
$\sigma^2$ = Population variance
$\sigma$ = Population standard deviation

Example 2
Calculate the variance and standard deviation of 10.3, 4.9, 8.9, 11.7, 6.3, 7.7.
Solution
The first step is finding the mean, which we calculated earlier to be 8.3.
$s^2 = \frac{(10.3 - 8.3)^2 + (4.9 - 8.3)^2 + \cdots + (7.7 - 8.3)^2}{6 - 1} = 6.368$
$s = \sqrt{6.368} \approx 2.52$
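The same calculation can be written out step by step in code. The sketch below follows the sample formulas (divisor n − 1); the function names are illustrative rather than taken from the text.

```python
import math


def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def sample_std(data):
    """Sample standard deviation: the positive square root of the sample variance."""
    return math.sqrt(sample_variance(data))


measurements = [10.3, 4.9, 8.9, 11.7, 6.3, 7.7]
s2 = sample_variance(measurements)
s = sample_std(measurements)

print(f"s^2 = {s2:.3f}, s = {s:.2f}")
```

The standard library's `statistics.variance` and `statistics.stdev` use the same n − 1 divisor and can serve as a cross-check.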
Thinking Challenge 2 (1 of 2)

You’re a financial analyst for Prudential-Bache Securities. You have collected the following closing stock prices of new stock issues: 17, 16, 21, 18, 13, 16, 12, 11. What are the variance and standard deviation of the stock prices?
Thinking Challenge 2 (2 of 2)
Solution
Sample Variance
Raw Data: 17  16  21  18  13  16  12  11
The mean = 15.5
$s^2 = \frac{(17 - 15.5)^2 + (16 - 15.5)^2 + \cdots + (11 - 15.5)^2}{8 - 1} = \frac{78}{7} \approx 11.14$
$s = \sqrt{11.14} \approx 3.34$
Section 2.5 Using the Mean and
Standard Deviation to Describe
Data

Using the Mean and Standard Deviation
to Describe Data: Chebyshev’s Rule
a. No useful information is provided on the fraction of measurements that fall within 1 standard deviation of the mean [i.e., within the interval $(\bar{x} - s, \bar{x} + s)$ for samples and $(\mu - \sigma, \mu + \sigma)$ for populations].

b. At least 3/4 will fall within 2 standard deviations of the mean [i.e., within the interval $(\bar{x} - 2s, \bar{x} + 2s)$ for samples and $(\mu - 2\sigma, \mu + 2\sigma)$ for populations].

c. At least 8/9 of the measurements will fall within 3 standard deviations of the mean [i.e., within the interval $(\bar{x} - 3s, \bar{x} + 3s)$ for samples and $(\mu - 3\sigma, \mu + 3\sigma)$ for populations].

d. Generally, for any number $k$ greater than 1, at least $(1 - 1/k^2)$ of the measurements will fall within $k$ standard deviations of the mean [i.e., within the interval $(\bar{x} - ks, \bar{x} + ks)$ for samples and $(\mu - k\sigma, \mu + k\sigma)$ for populations].
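Chebyshev's Rule gives a guaranteed minimum, not an estimate, so the observed fraction is usually higher. The sketch below computes the bound $1 - 1/k^2$ and compares it with the closing stock prices from the earlier Thinking Challenge; the helper names are hypothetical.

```python
import statistics


def chebyshev_bound(k):
    """Minimum fraction of measurements within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's Rule gives useful information only for k > 1")
    return 1 - 1 / k**2


def fraction_within(data, k):
    """Observed fraction of the data inside (mean - k*s, mean + k*s)."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1 divisor)
    inside = [x for x in data if mean - k * s < x < mean + k * s]
    return len(inside) / len(data)


prices = [17, 16, 21, 18, 13, 16, 12, 11]  # closing prices from the Thinking Challenge
for k in (2, 3):
    print(f"k={k}: at least {chebyshev_bound(k):.3f}, observed {fraction_within(prices, k):.3f}")
```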


Interpreting Standard Deviation:
Chebyshev’s Theorem

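Within 1 standard deviation of the mean: no useful information
Within 2 standard deviations of the mean: at least 3/4 of the data
Within 3 standard deviations of the mean: at least 8/9 of the data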


Chebyshev’s Theorem Example (1 of 2)

Previously we found the mean closing stock price of new stock issues is 15.5 and the standard deviation is 3.34.
Use this information to form an interval that will contain at least 75% of the closing stock prices of new stock issues.
Chebyshev’s Theorem Example (2 of 2)
At least 75% of the closing stock prices of new stock issues will lie within 2 standard deviations of the mean.
$\bar{x} = 15.5, \quad s = 3.34$
$(\bar{x} - 2s, \bar{x} + 2s) = (15.5 - 2 \times 3.34,\ 15.5 + 2 \times 3.34) = (8.82, 22.18)$

Interpreting Standard Deviation: Empirical
Rule (1 of 2)

Applies to data sets that are mound shaped and symmetric
Approximately 68% of the measurements lie in the interval $\bar{x} - s$ to $\bar{x} + s$
Approximately 95% of the measurements lie in the interval $\bar{x} - 2s$ to $\bar{x} + 2s$
Approximately 99.7% of the measurements lie in the interval $\bar{x} - 3s$ to $\bar{x} + 3s$
Interpreting Standard Deviation: Empirical
Rule (2 of 2)

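Within 1 standard deviation of the mean: approximately 68% of the measurements
Within 2 standard deviations of the mean: approximately 95% of the measurements
Within 3 standard deviations of the mean: approximately 99.7% of the measurements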


Empirical Rule Example (1 of 2)

Previously we found the mean closing stock price of new stock issues is 15.5 and the standard deviation is 3.34. If we can assume the data are symmetric and mound shaped, calculate the percentage of the data that lie within the intervals $\bar{x} \pm s$, $\bar{x} \pm 2s$, $\bar{x} \pm 3s$.
Empirical Rule Example (2 of 2)

According to the Empirical Rule, approximately 68% of the data will lie in the interval $(\bar{x} - s, \bar{x} + s)$,
$(15.5 - 3.34,\ 15.5 + 3.34) = (12.16, 18.84)$
Approximately 95% of the data will lie in the interval $(\bar{x} - 2s, \bar{x} + 2s)$,
$(15.5 - 2 \times 3.34,\ 15.5 + 2 \times 3.34) = (8.82, 22.18)$
Approximately 99.7% of the data will lie in the interval $(\bar{x} - 3s, \bar{x} + 3s)$,
$(15.5 - 3 \times 3.34,\ 15.5 + 3 \times 3.34) = (5.48, 25.52)$
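The three Empirical Rule intervals follow one pattern, mean ± k·s, so they can be generated in a loop. A small sketch, using the rounded values 15.5 and 3.34 from the example (the names are illustrative):

```python
def interval(mean, s, k):
    """Interval extending k standard deviations on either side of the mean."""
    return (mean - k * s, mean + k * s)


# Approximate percentages from the Empirical Rule (mound-shaped, symmetric data only).
EMPIRICAL_PERCENT = {1: 68.0, 2: 95.0, 3: 99.7}

mean, s = 15.5, 3.34
intervals = {k: interval(mean, s, k) for k in EMPIRICAL_PERCENT}

for k, (low, high) in intervals.items():
    print(f"about {EMPIRICAL_PERCENT[k]}% of the data within ({low:.2f}, {high:.2f})")
```

These percentages are approximations that depend on the mound-shape assumption, unlike the Chebyshev bounds, which hold for any data set.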
Section 2.6 Numerical Measures of
Relative Standing

Numerical Measures of Relative
Standing: Percentiles

Descriptive measures of the relative location of a measurement compared to the rest of the data are called measures of relative standing.
The $p$th percentile is a number such that $p$% of the data fall below it and $(100 - p)$% fall above it.
Median = 50th percentile

Quartiles
Measure of noncentral tendency
Split ordered data into 4 quarters
Lower quartile ($Q_L$) is the 25th percentile.
Middle quartile ($Q_2$) is the median.
Upper quartile ($Q_U$) is the 75th percentile.

Percentile Example
You scored 560 on the GMAT exam. This score
puts you in the 58th percentile.
What percentage of test takers scored lower than
you did?
58% of test takers scored lower than 560.
What percentage of test takers scored higher than you did?
(100 − 58)% = 42% of test takers scored higher than 560.
Numerical Measures of Relative
Standing: z-Scores

Describes the relative location of a measurement compared to the rest of the data
Sample z-score: $z = \frac{x - \bar{x}}{s}$
Population z-score: $z = \frac{x - \mu}{\sigma}$
Measures the number of standard deviations away from the mean a data value is located

z-Score Example (1 of 2)

The mean time to assemble a product is 22.5 minutes with a standard deviation of 2.5 minutes.
Find the z-score for an item that took 20 minutes to assemble.
Find the z-score for an item that took 27.5 minutes to assemble.

z-Score Example (2 of 2)
$x = 20, \quad \mu = 22.5, \quad \sigma = 2.5$
$z = \frac{x - \mu}{\sigma} = \frac{20 - 22.5}{2.5} = -1.0$

$x = 27.5, \quad \mu = 22.5, \quad \sigma = 2.5$
$z = \frac{x - \mu}{\sigma} = \frac{27.5 - 22.5}{2.5} = 2.0$

Interpretation of z-Scores for Mound-
Shaped Distributions of Data

1. Approximately 68% of the measurements will have a z-score between −1 and 1.
2. Approximately 95% of the measurements will have a z-score between −2 and 2.
3. Approximately 99.7% of the measurements will have a z-score between −3 and 3.
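The assembly-time example and the interpretation above can be combined in a short sketch. The function `empirical_band` is a hypothetical helper that reports the smallest of the three bands (±1, ±2, ±3) containing a given z-score.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev


def empirical_band(z):
    """Smallest k in (1, 2, 3) with |z| <= k, or None when |z| exceeds 3."""
    for k in (1, 2, 3):
        if abs(z) <= k:
            return k
    return None


mean, std_dev = 22.5, 2.5  # assembly-time example
z_fast = z_score(20, mean, std_dev)
z_slow = z_score(27.5, mean, std_dev)

print(z_fast, empirical_band(z_fast))
print(z_slow, empirical_band(z_slow))
```

A negative z-score means the item was assembled faster than average; a positive one means it took longer.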

Interpretation of z-Scores

Section 2.7 Methods for Detecting
Outliers: Box Plots and z-Scores

Outlier
An observation (or measurement) that is unusually
large or small relative to the other values in a data set
is called an outlier. Outliers typically are attributable to
one of the following causes:
1. The measurement is observed, recorded, or
entered into the computer incorrectly.
2. The measurement comes from a different
population.
3. The measurement is correct but represents a rare
(chance) event.
Quartiles Review
Measure of noncentral tendency
Split ordered data into 4 quarters
Lower quartile ($Q_L$) is the 25th percentile.
Middle quartile ($Q_2$) is the median.
Upper quartile ($Q_U$) is the 75th percentile.
Interquartile range: IQR = $Q_U - Q_L$


Quartile ($Q_2$) Example

Raw Data: 10.3  4.9  8.9  11.7  6.3  7.7
Ordered: 4.9  6.3  7.7  8.9  10.3  11.7
Position: 1  2  3  4  5  6
$Q_2$ is the median, the average of the two middle scores: (7.7 + 8.9) / 2 = 8.3

Quartile ($Q_1$) Example

Raw Data: 10.3  4.9  8.9  11.7  6.3  7.7
Ordered: 4.9  6.3  7.7  8.9  10.3  11.7
Position: 1  2  3  4  5  6
$Q_L$ is the median of the bottom half = 6.3

Quartile ($Q_3$) Example

Raw Data: 10.3  4.9  8.9  11.7  6.3  7.7
Ordered: 4.9  6.3  7.7  8.9  10.3  11.7
Position: 1  2  3  4  5  6
$Q_U$ is the median of the top half = 10.3

Interquartile Range

1. Measure of dispersion
2. Also called midspread
3. Difference between upper and lower quartiles
Interquartile Range = $Q_U - Q_L$
4. Spread in middle 50%
5. Not affected by extreme values

Thinking Challenge 3
You’re a financial analyst for Prudential-Bache Securities. You have collected the following closing stock prices of new stock issues: 17, 16, 21, 18, 13, 16, 12, 11. What are the quartiles, $Q_1$ and $Q_3$, and the interquartile range?

Quartile Solution* (1 of 2)
$Q_1$
Raw Data: 17  16  21  18  13  16  12  11
Ordered: 11  12  13  16  16  17  18  21
Position: 1  2  3  4  5  6  7  8
$Q_L$ is the median of the bottom half, the average of the two middle scores: (12 + 13) / 2 = 12.5

Quartile Solution* (2 of 2)
$Q_3$
Raw Data: 17  16  21  18  13  16  12  11
Ordered: 11  12  13  16  16  17  18  21
Position: 1  2  3  4  5  6  7  8
$Q_U$ is the median of the top half, the average of the two middle scores: (17 + 18) / 2 = 17.5

Interquartile Range Solution*

Interquartile Range
Raw Data: 17  16  21  18  13  16  12  11
Ordered: 11  12  13  16  16  17  18  21
Position: 1  2  3  4  5  6  7  8
Interquartile Range = $Q_3 - Q_1$ = 17.5 − 12.5 = 5
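The "median of each half" method used in these solutions can be sketched as below. One assumption is labeled in the code: for an odd number of observations the middle value is left out of both halves, a case the slides do not cover. Software packages often use other quartile conventions, so their results may differ slightly.

```python
import statistics


def quartiles(data):
    """Return (Q_L, median, Q_U) using the median-of-halves method."""
    ordered = sorted(data)
    half = len(ordered) // 2
    # Assumption: when n is odd, the middle value belongs to neither half.
    lower_half = ordered[:half]
    upper_half = ordered[-half:]
    return (
        statistics.median(lower_half),
        statistics.median(ordered),
        statistics.median(upper_half),
    )


prices = [17, 16, 21, 18, 13, 16, 12, 11]
q_lower, q_middle, q_upper = quartiles(prices)
iqr = q_upper - q_lower

print(q_lower, q_middle, q_upper, iqr)
```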

Box Plot (1 of 3)

1. Graphical display of data using 5-number summary

Box Plot (2 of 3)
2. Draw a rectangle (box) with the ends (hinges) drawn at the lower and upper quartiles ($Q_L$ and $Q_U$). The median of the data is shown by a line or symbol (such as “+”).
3. The points at distances 1.5(IQR) from each hinge define the inner fences of the data set. Lines (whiskers) are drawn from each hinge to the most extreme measurements inside the inner fences.

Box Plot (3 of 3)

4. A second pair of fences, the outer fences, are defined at a distance of 3(IQR) from the hinges. One symbol (*) represents measurements falling between the inner and outer fences, and another (0) represents measurements beyond the outer fences.
5. Symbols that represent the median and extreme data points vary depending on software used. You may use your own symbols if you are constructing a box plot by hand.
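The fence construction can be sketched numerically. The example below uses the quartiles found for the closing stock prices ($Q_L$ = 12.5, $Q_U$ = 17.5); the function names and the classification labels are illustrative choices.

```python
def fences(q_lower, q_upper):
    """Inner fences at 1.5(IQR) and outer fences at 3(IQR) from the hinges."""
    iqr = q_upper - q_lower
    inner = (q_lower - 1.5 * iqr, q_upper + 1.5 * iqr)
    outer = (q_lower - 3 * iqr, q_upper + 3 * iqr)
    return inner, outer


def classify(x, inner, outer):
    """Label a measurement by where it falls relative to the fences."""
    if x < outer[0] or x > outer[1]:
        return "beyond outer fences"
    if x < inner[0] or x > inner[1]:
        return "between inner and outer fences"
    return "inside inner fences"


inner, outer = fences(12.5, 17.5)
prices = [17, 16, 21, 18, 13, 16, 12, 11]
labels = {x: classify(x, inner, outer) for x in prices}

print(inner, outer)
print(labels)
```

Every price in this data set lies inside the inner fences, so the whiskers would simply run to the smallest and largest observations.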
Shape & Box Plot

Detecting Outliers
Box Plots: Observations falling between the inner
and outer fences are deemed suspect outliers.
Observations falling beyond the outer fence are
deemed highly suspect outliers.
z-scores: Observations with z-scores greater than
3 in absolute value are considered outliers. (For
some highly skewed data sets, observations with
z-scores greater than 2 in absolute value may be
outliers.)

Section 2.8 Graphing Bivariate
Relationships

Graphing Bivariate Relationships

Describes a relationship between two quantitative variables
Plot the data in a scattergram (or scatterplot)

Scattergram Example (1 of 2)

You’re a marketing analyst for Hasbro Toys. You gather the following data:
Ad $ (x)    Sales (Units) (y)
1           1
2           1
3           2
4           2
5           4

Draw a scattergram of the data


Scattergram Example (2 of 2)

Section 2.9 The Time Series Plot

Time Series Plot

Used to graphically display data produced over time
Shows trends and changes in the data over time
Time recorded on the horizontal axis
Measurements recorded on the vertical axis
Points connected by straight lines

Time Series Plot Example (1 of 2)
The following data show the average retail price of regular gasoline for 8 weeks in 2006. Draw a time series plot for these data.
Date            Average Price
Oct 16, 2006    $2.219
Oct 23, 2006    $2.173
Oct 30, 2006    $2.177
Nov 6, 2006     $2.158
Nov 13, 2006    $2.185
Nov 20, 2006    $2.208
Nov 27, 2006    $2.236
Dec 4, 2006     $2.298

Time Series Plot Example (2 of 2)

Section 2.10 Distorting the Truth
with Descriptive Statistics

Errors in Presenting Data

1. Use area to equate to value


2. No relative basis in comparing data batches
3. Compress the vertical axis
4. No zero point on the vertical axis
5. Gap in the vertical axis
6. Use of misleading wording
7. Knowing central tendency without knowing
variability
Reader Equates Area to Value

Bad Presentation Good Presentation

Minimum Wage
1960: $1.00
1970: $1.60

1980: $3.10

1990: $3.80

No Relative Basis

Compressing Vertical Axis

No Zero Point on Vertical Axis

Gap in the Vertical Axis

Changing the Wording

Changing the title of the graph can influence the reader.
Example titles for the same graph: “We’re not doing so well.” versus “Still in prime years!”

Knowing only central tendency

Knowing only the central tendency might lead one to purchase Model A. Knowing the variability as well may change one’s decision!

Key Ideas (1 of 8)
Describing Qualitative Data
1. Identify category classes
2. Determine class frequencies
3. Class relative frequency = (class freq) / n

4. Graph relative frequencies

Key Ideas (2 of 8)

Graphing Quantitative Data

1 Variable
1. Identify class intervals
2. Determine class interval frequencies
3. Class interval relative frequency = (class interval frequency) / n
4. Graph class interval relative frequencies

Key Ideas (3 of 8)

2 Variables
Scatterplot
Time series plot

Key Ideas (4 of 8)

Numerical Description of Quantitative Data

Central Tendency
Mean
Median
Mode

Key Ideas (5 of 8)
Variation
Range
Variance
Standard Deviation

Interquartile range

Key Ideas (6 of 8)
Relative standing
Percentile score
z-score

Key Ideas (7 of 8)

Rules for Detecting Quantitative Outliers

Interval              Chebyshev’s Rule    Empirical Rule
$\bar{x} \pm s$       At least 0%         ≈ 68%
$\bar{x} \pm 2s$      At least 75%        ≈ 95%
$\bar{x} \pm 3s$      At least 89%        ≈ All

Key Ideas (8 of 8)

Method      Suspect                                  Highly Suspect
Box plot    Values between inner and outer fences    Values beyond outer fences
z-score     $2 < |z| < 3$                            $|z| > 3$

