
CHAPTER 13
Statistics

Copyright © Cengage Learning. All rights reserved.


Section 13.4 Normal Distributions



Frequency Distributions and Histograms

Large sets of data are often displayed using a grouped frequency distribution or a histogram.

For instance, consider the following situation. An Internet service provider (ISP) has installed new computers. To estimate the new download times its subscribers will experience, the ISP surveyed 1000 of its subscribers to determine the time required for each subscriber to download a particular file from an Internet site.

The results of that survey are summarized in Table 13.7.

A grouped frequency distribution with 12 classes
Table 13.7

Table 13.7 is called a grouped frequency distribution. It shows how often (frequently) certain events occurred.

Each interval, 0–5, 5–10, and so on, is called a class. This distribution has 12 classes. For the 10–15 class, 10 is the lower class boundary and 15 is the upper class boundary.

Any data value that lies on a common boundary is assigned to the higher class.
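The boundary rule can be sketched in Python. This is a minimal illustration, assuming classes of width 5 s that start at 0, as in the text; the function name `class_for` and the sample times are hypothetical and are not survey data.

```python
def class_for(time_s, width=5):
    """Return the (lower, upper) class boundaries for a download time in seconds.

    A value that lies on a common boundary goes to the higher class,
    so 10 falls in the 10-15 class rather than the 5-10 class.
    """
    lower = int(time_s // width) * width
    return (lower, lower + width)


# Hypothetical download times, in seconds (not taken from the survey).
for t in [4.9, 5, 12.3, 15]:
    print(t, "->", class_for(t))
```

Floor division sends a boundary value such as 15 to the class that starts at 15, which matches the rule stated above.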

The graph of a frequency distribution is called a histogram. A histogram provides a pictorial view of how the data are distributed.

In Figure 13.2, the height of each bar of the histogram indicates how many subscribers experienced the download times shown by the class on the base of the bar.

A histogram for the frequency distribution in Table 13.7
Figure 13.2

Examine the distribution in Table 13.8 below. It shows the percent of subscribers that are in each class, as opposed to the frequency distribution in Table 13.7, which shows the number of subscribers in each class.

A relative frequency distribution
Table 13.8

The type of frequency distribution that lists the percent of data in each class is called a relative frequency distribution.
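A relative frequency distribution can be computed from class counts by dividing each count by the total. The sketch below is illustrative only: Table 13.7 is not reproduced in this text, so the counts are an assumption, chosen to total 1000 subscribers and to agree with the percents quoted in the text.

```python
# Illustrative class counts for 1000 subscribers. Table 13.7 itself is not
# reproduced here, so these values are an ASSUMPTION chosen to agree with
# the percents quoted in the text.
counts = {
    (0, 5): 6, (5, 10): 17, (10, 15): 43, (15, 20): 92,
    (20, 25): 151, (25, 30): 192, (30, 35): 190, (35, 40): 149,
    (40, 45): 90, (45, 50): 45, (50, 55): 15, (55, 60): 10,
}

total = sum(counts.values())

# Relative frequency: the percent of all data values that fall in each class.
relative = {cls: 100 * n / total for cls, n in counts.items()}

for (lower, upper), pct in relative.items():
    print(f"{lower}-{upper} s: {pct:.1f}%")
```

The percents sum to 100 because every data value belongs to exactly one class.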

The relative frequency histogram in Figure 13.3 was drawn by using the data in the relative frequency distribution. It shows the percent of subscribers along its vertical axis.

A relative frequency histogram
Figure 13.3

One advantage of using a relative frequency distribution instead of a grouped frequency distribution is that there is a direct correspondence between the percent values of the relative frequency distribution and probabilities.

For instance, in the relative frequency distribution in Table 13.8, the percent of the data that lie between 35 s and 40 s is 14.9%.

Thus, if a subscriber is chosen at random, the probability that the subscriber will require at least 35 s but less than 40 s to download the music file is 0.149.
Example 1 – Use a Relative Frequency Distribution

Use the relative frequency distribution in Table 13.8 to determine the
a. percent of subscribers who required at least 25 s to
download the file.
b. probability that a subscriber chosen at random will
require at least 5 s but less than 20 s to download the file.

Example 1 – Solution
a. The percent of data in all the classes with a lower boundary of 25 s or more is the sum of the percents printed in red in Table 13.9.

Table 13.9

Thus the percent of subscribers who required at least 25 s to download the file is 69.1%.

b. The percent of data in all the classes with a lower boundary of at least 5 s and an upper boundary of 20 s or less is the sum of the percents printed in blue in Table 13.9.

Thus the percent of subscribers who required at least 5 s but less than 20 s to download the file is 15.2%.

The probability that a subscriber chosen at random will require at least 5 s but less than 20 s to download the file is 0.152.
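The two sums in Example 1 can be checked with a short Python sketch. Tables 13.8 and 13.9 are not reproduced in this text, so the class percents below are an assumption, chosen to be consistent with the results stated in the example; `percent_between` is a hypothetical helper name.

```python
# ASSUMED relative frequencies (percent), keyed by class lower boundary.
# Table 13.8 is not reproduced here; these illustrative values are
# consistent with the results stated in Example 1.
percent_by_lower = {
    0: 0.6, 5: 1.7, 10: 4.3, 15: 9.2, 20: 15.1, 25: 19.2,
    30: 19.0, 35: 14.9, 40: 9.0, 45: 4.5, 50: 1.5, 55: 1.0,
}
WIDTH = 5  # class width in seconds


def percent_between(lower, upper=float("inf")):
    """Sum the percents of every class lying entirely within [lower, upper)."""
    return sum(
        pct for lo, pct in percent_by_lower.items()
        if lo >= lower and lo + WIDTH <= upper
    )


part_a = percent_between(25)      # at least 25 s
part_b = percent_between(5, 20)   # at least 5 s but less than 20 s

print(round(part_a, 1))           # percent for part a
print(round(part_b, 1))           # percent for part b
print(round(part_b / 100, 3))     # the same result expressed as a probability
```

Dividing a percent by 100 gives the corresponding probability, which is the correspondence the text relies on.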

Normal Distributions and the Empirical Rule

One of the most important statistical distributions of data is known as a normal distribution. This distribution occurs in a variety of applications.

Types of data that may demonstrate a normal distribution include the lengths of leaves on a tree, the weights of newborns in a hospital, the lengths of time of a student’s trip from home to school over a period of months, the SAT scores of a large group of students, and the life spans of light bulbs.

A normal distribution forms a bell-shaped curve that is symmetric about a vertical line through the mean of the data. A graph of a normal distribution with a mean of 5 is shown below.

In the normal distribution shown below, the area of the shaded region is 0.159 square unit. This region represents the fact that 15.9% of the data are greater than or equal to 10.

Because the area under the curve is 1, the unshaded region under the curve has area 1 – 0.159, or 0.841, representing the fact that 84.1% of the data are less than 10.
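The complement step can be illustrated with Python's standard-library `statistics.NormalDist`. This sketch assumes that the value 10 lies one standard deviation above the mean; the text does not state the standard deviation, so that assumption is ours.

```python
from statistics import NormalDist

# ASSUMPTION: the value 10 lies one standard deviation above the mean,
# so its z-score is 1. The text does not state the standard deviation.
z = 1.0

area_left = NormalDist().cdf(z)   # area under the curve to the left of z
area_right = 1 - area_left        # total area is 1, so the rest lies to the right

print(round(area_right, 3), round(area_left, 3))
```

Under that assumption the two areas round to 0.159 and 0.841, the values used in the text.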

The following rule, called the Empirical Rule, describes the percents of data that lie within 1, 2, and 3 standard deviations of the mean in a normal distribution.
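The Empirical Rule is commonly stated as approximately 68%, 95%, and 99.7% of the data lying within 1, 2, and 3 standard deviations of the mean. A minimal sketch using the standard library's `statistics.NormalDist` shows where those rounded percents come from; `percent_within` is a hypothetical helper name.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def percent_within(k):
    """Percent of a normal distribution within k standard deviations of the mean."""
    return 100 * (std_normal.cdf(k) - std_normal.cdf(-k))


for k in (1, 2, 3):
    print(k, round(percent_within(k), 1))
```

The computed values are slightly more precise than the rounded percents of the rule, which is why results obtained with the Empirical Rule are described as approximate.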

A normal distribution
Example 2 – Use the Empirical Rule to Solve an Application

A survey of 1000 U.S. gas stations found that the price charged for a gallon of regular gas could be closely approximated by a normal distribution with a mean of $3.10 and a standard deviation of $0.18. How many of the stations charge

a. between $2.74 and $3.46 for a gallon of regular gas?
b. less than $3.28 for a gallon of regular gas?
c. more than $3.46 for a gallon of regular gas?
Example 2 – Solution
a. The $2.74 per gallon price is 2 standard deviations
below the mean. The $3.46 price is 2 standard
deviations above the mean. In a normal distribution,
95% of all data lie within 2 standard deviations of the
mean. See Figure 13.4.

Figure 13.4

Therefore approximately

\[(95\%)(1000) = (0.95)(1000) = 950\]

of the stations charge between $2.74 and $3.46 for a gallon of regular gas.

b. The $3.28 price is 1 standard deviation above the mean. See Figure 13.5.

Figure 13.5

In a normal distribution, 34% of all data lie between the mean and 1 standard deviation above the mean.

Thus, approximately

\[(34\%)(1000) = (0.34)(1000) = 340\]

of the stations charge between $3.10 and $3.28 for a gallon of regular gas.

Half of the 1000 stations, or 500 stations, charge less than the mean.

Therefore about 340 + 500 = 840 of the stations charge less than $3.28 for a gallon of regular gas.

c. The $3.46 price is 2 standard deviations above the mean. In a normal distribution, 95% of all data are within 2 standard deviations of the mean.

This means that the other 5% of the data will lie either more than 2 standard deviations above the mean or more than 2 standard deviations below the mean. We are interested only in the data that are more than 2 standard deviations above the mean, which is \(\tfrac{1}{2}\) of 5%, or 2.5%, of the data.

See Figure 13.6.

Figure 13.6

Thus about

\[(2.5\%)(1000) = (0.025)(1000) = 25\]

of the stations charge more than $3.46 for a gallon of regular gas.
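The arithmetic in Example 2 can be sketched in Python. The percents are the Empirical Rule values used in the solution (95% within 2 standard deviations, 34% between the mean and 1 standard deviation above it); the variable names are ours.

```python
stations = 1000
mean, sd = 3.10, 0.18   # price per gallon, in dollars

# Confirm how many standard deviations each price lies from the mean.
for price in (2.74, 3.28, 3.46):
    print(price, round((price - mean) / sd, 2))

# Empirical Rule percents used in the example.
within_2_sd = 0.95          # between mean - 2 sd and mean + 2 sd
mean_to_plus_1_sd = 0.34    # between the mean and mean + 1 sd

part_a = round(within_2_sd * stations)                  # $2.74 to $3.46
part_b = round((0.50 + mean_to_plus_1_sd) * stations)   # less than $3.28
part_c = round((1 - within_2_sd) / 2 * stations)        # more than $3.46

print(part_a, part_b, part_c)
```

Part b adds the half of the stations below the mean to the 34% between the mean and $3.28, and part c takes half of the 5% that lies outside 2 standard deviations.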
The Standard Normal Distribution
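It is often helpful to convert data values x to z-scores by using the z-score formulas:

\[ \text{Population: } z_x = \frac{x - \mu}{\sigma} \qquad \text{Sample: } z_x = \frac{x - \bar{x}}{s} \]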

If the original distribution of x values is a normal distribution, then the corresponding distribution of z-scores will also be a normal distribution.

This normal distribution of z-scores is called the standard normal distribution.
The standard normal distribution has a mean of 0 and a standard deviation of 1. See Figure 13.7.
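A short Python sketch of the conversion follows. The function name `z_score` and the example distribution (mean 5, standard deviation 2) are hypothetical, chosen only to show that the mean maps to 0 and that each standard deviation maps to one unit.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


# Hypothetical normal distribution with mean 5 and standard deviation 2.
values = [1, 3, 5, 7, 9]
print([z_score(x, 5, 2) for x in values])
```

The data value equal to the mean has a z-score of 0, and values one standard deviation away have z-scores of 1 and -1.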

Conversion of a normal distribution to the standard normal distribution
Figure 13.7

Tables and calculators are often used to determine the area under a portion of the standard normal curve. We will refer to this type of area as an area of the standard normal distribution.

The normal curve table gives the approximate areas of the standard normal distribution between the mean 0 and z standard deviations from the mean.
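When a printed table is not at hand, a table entry can be approximated in Python. The sketch below uses the standard library's `statistics.NormalDist`; `table_entry` is a hypothetical helper name, and rounding to three decimal places is an assumption based on the entries quoted in this section.

```python
from statistics import NormalDist


def table_entry(z):
    """Area of the standard normal distribution between the mean 0 and z (z >= 0)."""
    return NormalDist().cdf(z) - 0.5


# z-values that appear in the examples of this section.
for z in (0.25, 1.25, 1.44, 1.5, 2.5):
    print(z, round(table_entry(z), 3))
```

The cumulative distribution function gives the area to the left of z, so subtracting the 0.5 that lies to the left of the mean leaves the area between 0 and z.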
Because the standard normal distribution is symmetrical about the mean of 0, we can also use the normal curve table to find the area of a region that is located to the left of the mean. This process is explained in Example 3.
Example 3 – Use Symmetry to Determine an Area

Find the area of the standard normal distribution between z = –1.44 and z = 0.

Solution:
Because the standard normal distribution is symmetrical
about the center line z = 0, the area of the standard normal
distribution between z = –1.44 and z = 0 is equal to the
area between z = 0 and z = 1.44.


See Figure 13.9.

Symmetrical region
Figure 13.9

The entry in the normal curve table associated with z = 1.44 is 0.425.

Thus the area of the standard normal distribution between z = –1.44 and z = 0 is 0.425 square unit.
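The symmetry argument in Example 3 can be checked numerically. This is a minimal sketch with the standard library's `statistics.NormalDist`; it computes both areas directly instead of reading them from a table.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

left_area = std_normal.cdf(0) - std_normal.cdf(-1.44)   # between z = -1.44 and z = 0
right_area = std_normal.cdf(1.44) - std_normal.cdf(0)   # between z = 0 and z = 1.44

print(round(left_area, 3), round(right_area, 3))
```

Both areas round to 0.425, in agreement with the table entry for z = 1.44.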
In Figure 13.10, the region to the right of z = 0.82 is called
a tail region. A tail region is a region of the standard
normal distribution to the right of a positive z-value or to the
left of a negative z-value.

Area of a tail region
Figure 13.10

To find the area of a tail region, we subtract the entry in the normal curve table from 0.500.
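The subtraction can be sketched in Python for the tail region to the right of z = 0.82 shown in Figure 13.10. The helper name `tail_area` is hypothetical, and the table entry is computed with the standard library's `statistics.NormalDist` instead of being read from a printed table.

```python
from statistics import NormalDist


def tail_area(z):
    """Area beyond z: right of a positive z-value or left of a negative one."""
    between_mean_and_z = NormalDist().cdf(abs(z)) - 0.5   # plays the role of the table entry
    return 0.5 - between_mean_and_z


print(round(tail_area(0.82), 3))    # right tail beyond z = 0.82
print(round(tail_area(-0.82), 3))   # left tail beyond z = -0.82, equal by symmetry
```

Using the absolute value of z lets the same function handle both kinds of tail region.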

Because the area of a portion of the standard normal distribution can be interpreted as a percentage of the data or as a probability that the variable lies in a particular interval, we can use the standard normal distribution to solve many application problems.
Example 5 – Solve an Application
A soda machine dispenses soda into 12-ounce cups. Tests
show that the actual amount of soda dispensed is normally
distributed, with a mean of 11.5 oz and a standard
deviation of 0.2 oz.
a. What percent of cups will receive less than 11.25 oz of
soda?
b. What percent of cups will receive between 11.2 oz and
11.55 oz of soda?
c. If a cup is filled at random, what is the probability that
the machine will overflow the cup?

Example 5 – Solution
a. We know that the formula for the z-score for a data value x is

\[ z_x = \frac{x - \mu}{\sigma} \]

The z-score for 11.25 oz is

\[ z_{11.25} = \frac{11.25 - 11.5}{0.2} = -1.25 \]

The normal curve table indicates that 0.394 (39.4%) of the data in a normal distribution are between z = 0 and z = 1.25. Because the data are normally distributed, and therefore symmetric about the mean, 39.4% of the data are also between z = 0 and z = –1.25.

The percent of data to the left of z = –1.25 is 50% – 39.4% = 10.6%. See Figure 13.11.

Portion of data to the left of z = –1.25
Figure 13.11

Thus 10.6% of the cups filled by the soda machine will receive less than 11.25 oz of soda.

b. The z-score for 11.55 oz is

\[ z_{11.55} = \frac{11.55 - 11.5}{0.2} = 0.25 \]

The normal curve table indicates that 0.099 (9.9%) of the data in a normal distribution are between z = 0 and z = 0.25. The z-score for 11.2 oz is

\[ z_{11.2} = \frac{11.2 - 11.5}{0.2} = -1.5 \]

The normal curve table indicates that 0.433 (43.3%) of the data in a normal distribution are between z = 0 and z = 1.5.

Because the data are normally distributed, 43.3% of the data are also between z = 0 and z = –1.5. See Figure 13.12.

Portion of data between two z-scores
Figure 13.12

Thus the percent of the cups that the vending machine will fill with between 11.2 oz and 11.55 oz of soda is 43.3% + 9.9% = 53.2%.

c. A cup will overflow if it receives more than 12 oz of soda. The z-score for 12 oz is

\[ z_{12} = \frac{12 - 11.5}{0.2} = 2.5 \]

The normal curve table indicates that 0.494 (49.4%) of the data in the standard normal distribution are between z = 0 and z = 2.5.

The percent of data to the right of z = 2.5 is determined by subtracting 49.4% from 50%.

See Figure 13.13.

Portion of data to the right of z = 2.5
Figure 13.13

Thus 0.6% of the time the machine produces an overflow, and the probability that a cup filled at random will overflow is 0.006.
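All three parts of Example 5 can be cross-checked in Python without a table. This is a sketch using the standard library's `statistics.NormalDist` with the mean and standard deviation given in the example; the variable names are ours.

```python
from statistics import NormalDist

# Amount of soda dispensed, in ounces, as described in Example 5.
soda = NormalDist(mu=11.5, sigma=0.2)

part_a = soda.cdf(11.25)                    # less than 11.25 oz
part_b = soda.cdf(11.55) - soda.cdf(11.2)   # between 11.2 oz and 11.55 oz
part_c = 1 - soda.cdf(12)                   # more than 12 oz (overflow)

print(f"{part_a:.1%}  {part_b:.1%}  {part_c:.3f}")
```

Working with the original distribution gives the same rounded answers as converting to z-scores first, because the z-score conversion does not change the areas involved.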