
Section 2.2 - Graphical Methods for Describing Quantitative Data


Three methods for displaying quantitative data are:
1. Dot plot
2. Stem-and-leaf display
3. Histogram
Our objective is to describe quantitative data with respect to:
1. Center
2. Spread
3. Clusters
4. Gaps
5. Outliers (unusually high or low values)
6. Shape (e.g., symmetric, not symmetric, mound-shaped)
Dot plot
A dot plot condenses data by grouping all identical values.
Stem-and-leaf display
A stem-and-leaf display condenses data by grouping all values with the same "stem".
Histogram
The histogram condenses data by grouping similar data values in the same class.
Before constructing a histogram, make a table that divides the values into measurement classes of
equal width.
Comparison of three methods:
Dot plot
Advantage: All individual data values are apparent.
Disadvantage: Cumbersome for large data sets.
Stem-and-leaf display
Advantage: All individual data values are apparent.
Disadvantage: Cumbersome for large data sets. Bad choice of stems may obscure patterns.
Histogram
Advantage: Handles very large data sets.
Disadvantage: Individual data values are not apparent.

2.2 Graphical Methods for Describing Quantitative Data - 1

Example: EPA Gas Mileage Ratings on 100 Cars of a New Car Model
Table 2.3, p. 30, EPAGAS data set
36.3  41.0  36.9  37.1  44.9  36.8  30.0  37.2  42.1  36.7
32.7  37.3  41.2  36.6  32.9  36.5  33.2  37.4  37.5  33.6
40.5  36.5  37.6  33.9  40.2  36.4  37.7  37.7  40.0  34.2
36.2  37.9  36.0  37.9  35.9  38.2  38.3  35.7  35.6  35.1
38.5  39.0  35.5  34.8  38.6  39.4  35.3  34.4  38.8  39.7
36.3  36.8  32.5  36.4  40.5  36.6  36.1  38.2  38.4  39.3
41.0  31.8  37.3  33.1  37.0  37.6  37.0  38.7  39.0  35.8
37.0  37.2  40.7  37.4  37.1  37.8  35.9  35.6  36.7  34.5
37.1  40.3  36.7  37.0  33.9  40.1  38.0  35.2  34.8  39.5
39.9  36.9  32.9  33.8  39.8  34.0  36.8  35.0  38.1  36.9

Dotplot for EPA Mileage Ratings on 100 Cars
[Figure: dot plot of the 100 mileage ratings; horizontal axis MPG, marked from 30.0 to 45.0 in steps of 2.5]

Each dot represents a single observation.
For large data sets, a dot may represent several observations.
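The grouping of identical values that a dot plot performs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses ten of the EPA mileage values (36.3 occurs twice), the function name dot_plot_rows is made up for this example, and the plot is printed sideways as text, one row per distinct value, rather than drawn along a horizontal axis.

```python
from collections import Counter

# Ten of the EPA mileage ratings from the data set above (36.3 occurs twice).
mpg = [36.3, 32.7, 40.5, 36.2, 38.5, 36.3, 41.0, 37.0, 37.1, 39.9]


def dot_plot_rows(values):
    """Return one text row per distinct value, with one dot per observation."""
    counts = Counter(values)  # groups identical values together
    return [f"{value:5.1f} | {'.' * counts[value]}" for value in sorted(counts)]


for row in dot_plot_rows(mpg):
    print(row)
```

Every distinct value gets its own row, which is why the display keeps all individual values visible but becomes cumbersome as the data set grows.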

Stem-and-Leaf Display for EPA Mileage Ratings on 100 Cars


Count  Stem  Leaf
    1    30  0
    2    31  8
    6    32  5799
   12    33  126899
   18    34  024588
   29    35  01235667899
   49    36  01233445566777888999
 (21)    37  000011122334456677899   (median)
   30    38  0122345678
   20    39  00345789
   12    40  0123557
    5    41  002
    2    42  1
    1    43
    1    44  9

Explanation of columns:
Count: The row containing the median has its row count in parentheses. Counts for rows above the
median are cumulative from the top; counts for rows below the median are cumulative from the bottom.
Stem: The stem value represents the digit(s) immediately to the left of the leaf digit. A stem
value of 37 indicates that the values in the row are at least 37 and less than 38.
Leaf: Each leaf value represents the decimal digit of a single observation.
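The three columns can be reproduced from the raw data with a short Python sketch. It is a minimal illustration, not the statistical package's actual algorithm: the function name stem_and_leaf is hypothetical, the stem is taken to be the whole-number part and the leaf the first decimal digit, and the median row is assumed to be the first row whose cumulative count reaches (n + 1) / 2, which ignores the case where the median falls between two rows.

```python
from collections import defaultdict
from itertools import accumulate

# EPA mileage ratings for the 100 cars (EPAGAS data set).
mpg = [
    36.3, 41.0, 36.9, 37.1, 44.9, 36.8, 30.0, 37.2, 42.1, 36.7,
    32.7, 37.3, 41.2, 36.6, 32.9, 36.5, 33.2, 37.4, 37.5, 33.6,
    40.5, 36.5, 37.6, 33.9, 40.2, 36.4, 37.7, 37.7, 40.0, 34.2,
    36.2, 37.9, 36.0, 37.9, 35.9, 38.2, 38.3, 35.7, 35.6, 35.1,
    38.5, 39.0, 35.5, 34.8, 38.6, 39.4, 35.3, 34.4, 38.8, 39.7,
    36.3, 36.8, 32.5, 36.4, 40.5, 36.6, 36.1, 38.2, 38.4, 39.3,
    41.0, 31.8, 37.3, 33.1, 37.0, 37.6, 37.0, 38.7, 39.0, 35.8,
    37.0, 37.2, 40.7, 37.4, 37.1, 37.8, 35.9, 35.6, 36.7, 34.5,
    37.1, 40.3, 36.7, 37.0, 33.9, 40.1, 38.0, 35.2, 34.8, 39.5,
    39.9, 36.9, 32.9, 33.8, 39.8, 34.0, 36.8, 35.0, 38.1, 36.9,
]


def stem_and_leaf(values):
    """Return (count, stem, leaves) rows; the median row's count is '(n)'."""
    leaves = defaultdict(list)
    # Work in tenths so that 36.3 splits cleanly into stem 36 and leaf 3.
    for tenths in sorted(round(v * 10) for v in values):
        leaves[tenths // 10].append(str(tenths % 10))
    stems = range(min(leaves), max(leaves) + 1)  # keeps empty stems such as 43
    sizes = [len(leaves[s]) for s in stems]
    top = list(accumulate(sizes))                 # cumulative from the smallest stem
    bottom = list(accumulate(sizes[::-1]))[::-1]  # cumulative from the largest stem
    # Assumption: the median row is the first whose cumulative count reaches (n + 1) / 2.
    median_row = next(i for i, c in enumerate(top) if c >= (len(values) + 1) / 2)
    rows = []
    for i, stem in enumerate(stems):
        if i < median_row:
            count = str(top[i])
        elif i == median_row:
            count = f"({sizes[i]})"
        else:
            count = str(bottom[i])
        rows.append((count, stem, "".join(leaves[stem])))
    return rows


for count, stem, leaf_digits in stem_and_leaf(mpg):
    print(f"{count:>5}  {stem}  {leaf_digits}")
```

Choosing a different stem (for example, tens of mpg) would collapse almost every value into one or two rows, which is the sense in which a bad choice of stems can obscure patterns.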

Histogram
Measurement Class   Frequency   Relative Frequency
30.0 - 31.5              1             .01
31.5 - 33.0              5             .05
33.0 - 34.5              9             .09
34.5 - 36.0             14             .14
36.0 - 37.5             33             .33
37.5 - 39.0             18             .18
39.0 - 40.5             12             .12
40.5 - 42.0              6             .06
42.0 - 43.5              1             .01
43.5 - 45.0              1             .01
Totals                 100            1.00
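A class table like this one can be tallied with a short Python sketch. The names STEM_LEAVES and frequency_table are made up for this example, and the sketch assumes that each class includes its lower boundary and excludes its upper boundary (so 34.5 is counted in the class starting at 34.5); the notes do not state the convention, but this choice reproduces the frequencies listed. The data are entered compactly as stems and leaf digits, and the arithmetic is done in tenths of an mpg so that boundary values are compared exactly.

```python
# Stem -> leaf digits for the 100 EPA mileage ratings (36.3 is stem 36, leaf 3).
STEM_LEAVES = {
    30: "0", 31: "8", 32: "5799", 33: "126899", 34: "024588",
    35: "01235667899", 36: "01233445566777888999",
    37: "000011122334456677899", 38: "0122345678", 39: "00345789",
    40: "0123557", 41: "002", 42: "1", 44: "9",
}

# Work in tenths of an mpg so class boundaries such as 34.5 are exact integers.
tenths = [stem * 10 + int(leaf) for stem, leaves in STEM_LEAVES.items() for leaf in leaves]

LOWER, WIDTH, N_CLASSES = 300, 15, 10  # 30.0 mpg, width 1.5 mpg, ten classes


def frequency_table(values_tenths):
    """Return (lower, upper, frequency, relative frequency) for each class.

    Assumes a class includes its lower boundary and excludes its upper one.
    """
    freq = [0] * N_CLASSES
    for t in values_tenths:
        freq[(t - LOWER) // WIDTH] += 1
    n = len(values_tenths)
    return [
        ((LOWER + k * WIDTH) / 10, (LOWER + (k + 1) * WIDTH) / 10, f, f / n)
        for k, f in enumerate(freq)
    ]


for lower, upper, f, rel in frequency_table(tenths):
    print(f"{lower:4.1f} - {upper:4.1f}  {f:3d}  {rel:.2f}")
```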

Histogram for the EPA Mileage Ratings on 100 Cars
[Figure: frequency histogram; horizontal axis MPG with class boundaries from 30.0 to 45.0 in steps of 1.5, vertical axis Frequency from 0 to 35]

You can also make a relative frequency histogram by plotting relative frequencies instead of
counts on the y-axis.
Important points about histograms
1. The proportion of the total area under the histogram above a particular interval on the x-axis is the relative frequency of observations falling in that interval.
2. For very large data sets, a histogram can use very narrow intervals and begins to look
like a smooth curve. We can model such a distribution with a mathematical function.
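Point 1 can be checked numerically with the EPA mileage classes. The sketch below is a small illustration; the function name area_proportion is hypothetical, and the frequencies are those of the ten classes of width 1.5 starting at 30.0. For the interval from 36.0 to 39.0, the share of the bar area should match the sum of the two relative frequencies, .33 + .18 = .51.

```python
# Class frequencies for the EPA mileage histogram (ten classes of width 1.5 from 30.0).
FREQUENCIES = [1, 5, 9, 14, 33, 18, 12, 6, 1, 1]
WIDTH = 1.5


def area_proportion(first_class, last_class):
    """Share of the total bar area lying over classes first_class..last_class (inclusive)."""
    areas = [WIDTH * f for f in FREQUENCIES]  # bar area = class width x bar height
    return sum(areas[first_class:last_class + 1]) / sum(areas)


# Classes 4 and 5 (counting from 0) cover the interval from 36.0 to 39.0 mpg.
print(area_proportion(4, 5))
```

Because every class has the same width, the width cancels and the area proportion reduces to the proportion of observations, which is why equal-width classes are required for this reading of the histogram.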
Relative Frequency Histograms
[Figure: three relative frequency histograms of samples from a normal distribution with mean 100 and s.d. 15, based on 100 observations, 500 observations, and 10,000 observations]
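The effect of sample size can be explored with a small simulation. This is a rough sketch, not a reconstruction of how the figures in these notes were produced: the seed, the plotting range of 40 to 160, and the numbers of classes (8, 16, and 40) are arbitrary assumptions, and relative_frequencies is a name made up for this example. Values outside the range are simply left out of the tally.

```python
import random


def relative_frequencies(sample, lower, upper, n_bins):
    """Relative frequency of each of n_bins equal-width classes on [lower, upper)."""
    width = (upper - lower) / n_bins
    counts = [0] * n_bins
    for x in sample:
        if lower <= x < upper:
            # min() guards against floating-point rounding at the top edge.
            k = min(int((x - lower) // width), n_bins - 1)
            counts[k] += 1
    return [c / len(sample) for c in counts]


random.seed(1)  # arbitrary seed, only to make the run repeatable
for n, n_bins in [(100, 8), (500, 16), (10_000, 40)]:
    sample = [random.gauss(100, 15) for _ in range(n)]  # mean 100, s.d. 15
    rel = relative_frequencies(sample, 40, 160, n_bins)
    # Print a crude sideways histogram: one row per class, one '*' per 0.5%.
    print(f"n = {n}, classes = {n_bins}")
    for k, r in enumerate(rel):
        print(f"{40 + k * 120 / n_bins:6.1f} {'*' * round(r * 200)}")
```

With more observations, narrower classes can be used without the bars becoming ragged, and the outline of the bars moves closer to a smooth bell-shaped curve.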
