Business Analytics
Instructor: Daniyal Nawaz
Lecture # 03
Descriptive Statistics
Creating Distributions from Data
Cumulative Distributions
• Cumulative frequency distribution: A variation
of the frequency distribution that provides
another tabular summary of quantitative data
– Uses the number of classes, class widths, and class
limits developed for the frequency distribution
– Shows the number of data items with values less
than or equal to the upper class limit of each class
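As a rough illustration of the definition above, the following sketch builds cumulative frequency, cumulative relative frequency, and cumulative percent frequency columns. The data values and the upper class limits are hypothetical (they are not the audit time data), and the function name `cumulative_frequencies` is made up for this example.

```python
# Hypothetical sample data and upper class limits (illustration only).
values = [11, 13, 14, 16, 17, 18, 18, 21, 22, 24, 27, 31]
upper_limits = [14, 19, 24, 29, 34]


def cumulative_frequencies(data, limits):
    """Count the items less than or equal to each upper class limit."""
    return [sum(1 for x in data if x <= limit) for limit in limits]


cum_freq = cumulative_frequencies(values, upper_limits)
n = len(values)
cum_rel = [count / n for count in cum_freq]   # cumulative relative frequency
cum_pct = [100 * rel for rel in cum_rel]      # cumulative percent frequency

for limit, count, rel, pct in zip(upper_limits, cum_freq, cum_rel, cum_pct):
    print(f"<= {limit}: {count:2d}  {rel:.2f}  {pct:.0f}%")
```

The last class always has a cumulative relative frequency of 1.00, because every data item is less than or equal to the largest upper class limit.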
Cumulative Frequency, Cumulative Relative
Frequency, and Cumulative Percent Frequency
Distributions for the Audit Time Data
Sorting and Filtering Data in Excel
Conditional Formatting of Data in Excel
PRACTICE
Modifying Data in Excel
Top-Selling Automobiles Data Sorted by March 2010 Sales
Modifying Data in Excel
Sorting and Filtering Data in Excel
• Using Excel’s Filter function to see the sales of models made by Toyota
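Outside Excel, the same sort and filter operations can be sketched in a few lines. The rows below are hypothetical placeholders, not the figures from the Top-Selling Automobiles table, and the field names are assumptions made for this example.

```python
# Hypothetical rows; the sales numbers are invented for illustration.
rows = [
    {"manufacturer": "Toyota", "model": "Camry", "sales": 300},
    {"manufacturer": "Honda", "model": "Accord", "sales": 250},
    {"manufacturer": "Toyota", "model": "Corolla", "sales": 280},
    {"manufacturer": "Ford", "model": "Fusion", "sales": 270},
]

# Sort: largest sales first, similar to sorting a column in descending order.
sorted_rows = sorted(rows, key=lambda row: row["sales"], reverse=True)

# Filter: keep only the rows whose manufacturer is Toyota.
toyota_rows = [row for row in rows if row["manufacturer"] == "Toyota"]

for row in sorted_rows:
    print(row["model"], row["sales"])
print([row["model"] for row in toyota_rows])
```

Sorting changes the order of all rows, while filtering hides the rows that do not match the condition and leaves the original data unchanged.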
Using Excel’s Sort Function to Sort the Top-Selling Automobiles Data
Top-Selling Automobiles Data Filtered to Show Only Automobiles
Manufactured by Toyota
Modifying Data in Excel
Conditional Formatting of Data in Excel
• Makes it easy to identify data that satisfy certain conditions
in a data set
• To identify the automobile models in Table 2.2 for which
sales had decreased from March 2010 to March 2011:
– Step 1: Starting with the original data shown in Figure 2.3, select
cells F1:F21
– Step 2: Click on the Home tab in the Ribbon
– Step 3: Click Conditional Formatting in the Styles group
– Step 4: Select Highlight Cells Rules, and click Less Than from the
dropdown menu
– Step 5: Enter 0% in the Format cells that are LESS THAN: box
– Step 6: Click OK
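The rule applied in Steps 4 and 5 (highlight cells that are less than 0%) amounts to a simple comparison. A minimal sketch, using hypothetical model names and percent changes rather than the values in Table 2.2:

```python
# Hypothetical percent changes in sales (0.12 means +12%).
pct_change = {
    "Model A": 0.12,
    "Model B": -0.05,
    "Model C": 0.00,
    "Model D": -0.20,
}

# The "Less Than 0%" rule: flag every model whose change is negative.
declining = [model for model, change in pct_change.items() if change < 0]

for model, change in pct_change.items():
    marker = "  <-- declined" if change < 0 else ""
    print(f"{model}: {change:+.1%}{marker}")
```

A change of exactly 0% is not flagged, because the rule is strictly "less than".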
Using Conditional Formatting in Excel to Highlight Automobiles with
Declining Sales from March 2010
Using Conditional Formatting in Excel to Generate Data Bars for the
Top-Selling Automobiles Data
Modifying Data in Excel
Using Excel to Generate a Frequency Distribution for
Audit Times Data
Histogram
Histograms can be created in Excel using the Data
Analysis ToolPak.
The steps to create a histogram in Excel are as follows.
Step 1. Click the DATA tab in the Ribbon
Step 2. Click Data Analysis in the Analysis group
Step 3. When the Data Analysis dialog box opens, choose
Histogram from the list of Analysis Tools, and click OK
Step 4. In the Histogram dialog box:
– In the Input Range: box, enter A2:D6
– In the Bin Range: box, enter A10:A14
– Under Output Options:, select New Worksheet Ply:
– Select the check box for Chart Output
– Click OK
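The counting behind the Histogram tool can be sketched as follows. This assumes the usual bin rule for the tool: each bin value is an upper limit, a data value is counted in the first bin whose limit it does not exceed, and anything above the last bin goes into a "More" category. The 20 data values and the five bin limits are hypothetical stand-ins for the contents of A2:D6 and A10:A14.

```python
from bisect import bisect_left

# Hypothetical stand-ins for the input range (20 values) and bin range (5 limits).
data = [10, 12, 13, 14, 15, 16, 16, 17, 18, 19,
        19, 20, 21, 22, 23, 25, 26, 28, 30, 33]
bins = [14, 19, 24, 29, 34]


def histogram_counts(values, upper_limits):
    """Count values per bin; the extra last slot is the 'More' category."""
    counts = [0] * (len(upper_limits) + 1)
    for x in values:
        # bisect_left finds the first limit that is >= x.
        counts[bisect_left(upper_limits, x)] += 1
    return counts


counts = histogram_counts(data, bins)
labels = [str(limit) for limit in bins] + ["More"]
for label, count in zip(labels, counts):
    print(f"{label:>4} | {'#' * count} ({count})")
```

The printed bars are only a text approximation of the chart produced by the Chart Output option.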
Figure 2.13: Creating a Histogram for the Audit Time Data Using
Data Analysis ToolPak in Excel
Figure 2.14: Completed Histogram for the Audit Time Data Using
Data Analysis ToolPak in Excel
Measures of Location
Example
Solution
Measures of location (MEDIAN)
2. Median
• The median is another measure of central location:
it is the value in the middle when the data are
arranged in ascending order (smallest to largest value).
• With an odd number of observations, the
median is the middle value.
• An even number of observations has no single
middle value; the median is then the average of
the two middle values.
• Let us apply this definition to compute the
median class size for a sample of five college
classes.
MEDIAN
• Because n = 12 is even, the median is the
average of the middle two values, 199,500
and 208,000: (199,500 + 208,000)/2 = 203,750.
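Both cases of the median definition can be checked with a short sketch. It uses the five class sizes given in this lecture (32, 42, 46, 46, 54) for the odd case. For the even case only the two middle values appear in the slides, so the sketch averages those two rather than guessing the other ten observations. The function name `median_by_definition` is made up for this example.

```python
from statistics import median


def median_by_definition(data):
    """Middle value for odd n; average of the two middle values for even n."""
    ordered = sorted(data)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


class_sizes = [32, 42, 46, 46, 54]           # n = 5 (odd)
odd_median = median_by_definition(class_sizes)

middle_two = [199_500, 208_000]              # the two middle values when n = 12
even_median = sum(middle_two) / 2

print(odd_median, even_median)
print(median(class_sizes))                   # library function, same result
```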
Measure of location (MODE)
• A third measure of location, the mode, is the
value that occurs most frequently in a dataset.
To illustrate the identification of the mode,
consider the sample of five class sizes.
32 42 46 46 54
• The only value that occurs more than once is
46. Because this value, occurring with a
frequency of 2, has the greatest frequency, it
is the mode.
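A quick way to confirm the mode of the five class sizes is to count how often each value occurs. This is a small sketch using Python's standard library:

```python
from collections import Counter
from statistics import mode

class_sizes = [32, 42, 46, 46, 54]

frequencies = Counter(class_sizes)      # value -> number of occurrences
most_common_value, frequency = frequencies.most_common(1)[0]

print(most_common_value, frequency)
print(mode(class_sizes))                # library function, same value
```

If several values tie for the greatest frequency, the data have more than one mode; `statistics.multimode` returns all of them.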
Measure of location (Geometric mean)
• The geometric mean is often used in analyzing
growth rates in financial data. In these types
of situations, the arithmetic mean or average
value will provide misleading results.
• To illustrate the use of the geometric mean, consider Table
2.10, which shows the percentage annual returns, or growth
rates, for a mutual fund over the past 10 years
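• After the first year's return of -22.1%, a $100 investment is worth
$100 - 0.221($100) = $100(1 - 0.221) = $100(0.779) = $77.90
• Compounding the growth factors for all 10 years gives
$100(1.335) = $133.45 (the factor 1.335 is rounded)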
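The geometric mean of n growth factors is the n-th root of their product. The sketch below uses only the numbers shown above: the first-year growth factor 0.779 and the factor 1.335, which is assumed here to be the product of all 10 yearly growth factors. The individual factors for years 2 to 10 are in Table 2.10 and are not reproduced. The function name `geometric_mean` is chosen for this example.

```python
import math


def geometric_mean(factors):
    """n-th root of the product of n growth factors."""
    return math.prod(factors) ** (1 / len(factors))


# Balance after year 1 for a $100 investment with a -22.1% return.
balance_year1 = 100 * (1 - 0.221)

# Assumption: 1.335 is the product of the 10 yearly growth factors.
total_growth = 1.335
years = 10
mean_growth = total_growth ** (1 / years)   # about 1.029, i.e. roughly 2.9% per year

print(f"Balance after year 1: ${balance_year1:.2f}")
print(f"Geometric mean growth factor: {mean_growth:.3f}")

# Check: ten years at the mean growth factor reproduce the total growth.
print(f"{geometric_mean([mean_growth] * years):.3f}")
```

The geometric mean is the constant yearly growth factor that would lead to the same final balance as the actual sequence of returns.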
Measures of variability
In addition to measures of location, it is often desirable to
consider measures of variability.
For example, suppose that you are considering two financial funds.
Both funds require a $1,000 annual investment.
• Fund A has paid out exactly $1,100 each year for an
initial $1,000 investment.
• Fund B has had many different payouts, but the mean
payout over the previous 20 years is also $1,100.
But would you consider the payouts of Fund A and Fund B
to be equivalent?
Clearly, the answer is no.
The difference between the two funds is due to variability.
• Figure 2.18 shows a histogram for the payouts
received from Funds A and B.
• Although the mean payout is the same for the two
funds, their histograms differ in that the
payouts associated with Fund B have greater
variability.
• Sometimes the payouts are considerably
larger than the mean, and sometimes they are
considerably smaller.
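The point that two funds can share a mean and still differ in variability can be shown with a small sketch. The payouts below are hypothetical: only four years are shown instead of 20, and the Fund B values are invented so that their mean is $1,100.

```python
from statistics import mean, stdev

# Hypothetical payouts (illustration only, not the data behind Figure 2.18).
fund_a = [1100, 1100, 1100, 1100]   # the same payout every year
fund_b = [700, 1500, 900, 1300]     # varying payouts with the same mean

print(mean(fund_a), mean(fund_b))   # both means are 1100
print(stdev(fund_a))                # no variability at all
print(round(stdev(fund_b), 2))      # payouts spread around the mean
```

The mean alone cannot distinguish the two funds; a measure of variability such as the standard deviation can.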
Range
• Range = largest value - smallest value
• In Excel, for data in cells B2:B13: =MAX(B2:B13)-MIN(B2:B13)
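The same calculation in code, applied here to the five class sizes from this lecture rather than to the 12 values in cells B2:B13 (which are not listed in the slides):

```python
class_sizes = [32, 42, 46, 46, 54]

# Range = largest value minus smallest value.
data_range = max(class_sizes) - min(class_sizes)

print(data_range)
```

Because the range depends only on the two extreme values, a single unusually large or small observation can change it considerably.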
Variance
Standard deviation
Table 2.12
The sample variance for the sample of class sizes in five college
classes is s^2 = 64.
Thus, the sample standard deviation is s = √64 = 8.
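These two values can be reproduced from the five class sizes (32, 42, 46, 46, 54). The sketch below computes the sample variance step by step, dividing the sum of squared deviations by n - 1, and then compares it with the standard library functions.

```python
import math
from statistics import stdev, variance

class_sizes = [32, 42, 46, 46, 54]

n = len(class_sizes)
sample_mean = sum(class_sizes) / n                       # 44.0
squared_devs = [(x - sample_mean) ** 2 for x in class_sizes]
sample_variance = sum(squared_devs) / (n - 1)            # divide by n - 1, not n
sample_std = math.sqrt(sample_variance)

print(sample_mean, sample_variance, sample_std)
print(variance(class_sizes), stdev(class_sizes))         # library equivalents
```

Dividing by n - 1 rather than n is what makes this the sample variance; `statistics.pvariance` would divide by n instead.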
Coefficient of variation
• For the class size data, we found a sample mean of 44
and a sample standard deviation of 8.
• The coefficient of variation is (8/44 × 100)% = 18.2%.
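The arithmetic can be verified with a short sketch that uses the sample mean and sample standard deviation quoted above:

```python
sample_mean = 44
sample_std = 8

# Coefficient of variation: standard deviation as a percentage of the mean.
coefficient_of_variation = sample_std / sample_mean * 100

print(f"{coefficient_of_variation:.1f}%")
```

Because it is a relative measure, the coefficient of variation can be used to compare the variability of data sets that have different means or different units.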