
QUAN201: Introduction to Business Statistics

Presentation 1

Lecture Topics

Define Statistics
Applications of Statistics in Business
Difference between Descriptive and Inferential Statistics
Some Basic Concepts
Data Collection
Data Presentation via Tables and Graphs

What is Statistics?

Facts and figures
Branch of mathematics
Science of gathering, analyzing, interpreting, and presenting data

Application of Statistics in Business


Accounting: auditing and cost estimation
Economics: regional, national, and international economic performance
Finance: investments and portfolio management
Management: human resources and quality management
Management Information Systems: performance of systems that gather, summarize, and disseminate information to various managerial levels
Marketing: market analysis and consumer research
International Business: market and demographic analysis

Some Statistical Concepts


Population vs. Sample
Census
Descriptive vs. Inferential Statistics
Parameter vs. Statistic
Variables, Data

Population Versus Sample

Population: the whole; a collection of persons, objects, or items under study

Sample: a portion of the whole; a subset of the population

Census: gathering data from the entire population

Population

Population and Census Data


Identifier   Color
RD1          Red
RD2          Red
RD3          Red
RD4          Red
RD5          Red
BL1          Blue
BL2          Blue
GR1          Green
GR2          Green
GY1          Gray
GY2          Gray
GY3          Gray

Sample and Sample Data


Identifier   Color
RD2          Red
RD5          Red
GR1          Green
GY2          Gray
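The step from the population table to the sample table can be sketched in Python. This is only an illustration: the helper name `draw_sample` and the seed are assumptions, and the slides do not say how their four items were chosen, so a simple random sample without replacement is assumed here.

```python
import random

# Population from the slide: twelve identifiers and their colors.
population = {
    "RD1": "Red", "RD2": "Red", "RD3": "Red", "RD4": "Red", "RD5": "Red",
    "BL1": "Blue", "BL2": "Blue",
    "GR1": "Green", "GR2": "Green",
    "GY1": "Gray", "GY2": "Gray", "GY3": "Gray",
}

def draw_sample(pop, n, seed=None):
    """Return a simple random sample of n items, drawn without replacement.

    Hypothetical helper for illustration; not part of the lecture material.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(pop), n)
    return {identifier: pop[identifier] for identifier in chosen}

census = dict(population)                    # a census covers every item
sample = draw_sample(population, 4, seed=1)  # a sample covers only a subset

for identifier, color in sample.items():
    print(identifier, color)
```

A different seed gives a different sample, while the census stays the same.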

Descriptive vs. Inferential Statistics

Descriptive Statistics: using data gathered on a group to describe or reach conclusions about that same group only.
Inferential Statistics: using sample data to reach conclusions about the population from which the sample was taken.

Descriptive Statistics

Collect data, e.g. survey

Present data, e.g. tables and graphs

Characterize data, e.g. sample mean $\bar{X} = \frac{\sum X_i}{n}$

Descriptive Statistics

Descriptive statistics involves the arrangement, summary, and presentation of data to enable meaningful interpretation and to support decision making.
Descriptive statistics methods make use of graphical techniques and numerical descriptive measures.
The methods presented apply both to the entire population and to a sample from the population.

Inferential Statistics

Estimation, e.g. estimate the population mean weight using the sample mean weight

Hypothesis testing, e.g. test the claim that the population mean weight is 120 pounds

Drawing conclusions and/or making decisions concerning a population based on sample results.

Quiz

Which of the following statements involve descriptive statistics as opposed to inferential statistics?
1) The Alcohol, Tobacco and Firearms Department reported that Houston had 1,791 registered gun dealers in 2006.
2) Based on a survey of 400 magazine readers, the magazine reports that 45% of its readers prefer double column articles.
3) Based on a sample of 300 professional tennis players, a tennis magazine reported that 25% of the parents of all professional tennis players did not play tennis.

Parameter vs. Statistic

Parameter: descriptive measure of the population

Usually represented by Greek letters

Statistic: descriptive measure of a sample

Usually represented by Roman letters

Symbols for Population Parameters


$\mu$ denotes population mean

$\sigma^2$ denotes population variance

$\sigma$ denotes population standard deviation

Symbols for Sample Statistic


$\bar{x}$ denotes sample mean

$S^2$ denotes sample variance

$S$ denotes sample standard deviation

Process of Inferential Statistics


1) Start from the population (parameter $\mu$).
2) Select a random sample.
3) From the sample, calculate $\bar{x}$ (statistic).
4) Use $\bar{x}$ to estimate $\mu$.

Definitions
A variable is some characteristic of a population or sample that is of interest to us. E.g. student grades. Typically denoted with a capital letter: X, Y, Z

Data are the observed values of a variable. E.g. student marks: {67, 74, 71, 83, 93, 55, 48}
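As a small worked example, the student marks above can be stored as data and summarized with the sample mean (the sum of the observations divided by n). Treating these seven marks as a sample is an assumption made here for illustration.

```python
# Observed values of the variable X = student mark (from the slide).
marks = [67, 74, 71, 83, 93, 55, 48]

n = len(marks)                 # number of observations
sample_mean = sum(marks) / n   # sum of observations divided by n

print(n)                       # 7
print(round(sample_mean, 2))   # 70.14
```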

Why We Need Data

To provide input to a survey
To provide input to a study
To measure performance of a service or production process
To evaluate conformance to standards
To assist in formulating alternative courses of action
To satisfy curiosity

Data Sources
Primary (Data Collection): Observation, Survey, Experimentation

Secondary (Data Compilation): Print or Electronic

Types of Data

Knowing the type of data is necessary to properly select the technique to be used when analyzing data.


Types of Data
Data
Categorical (Qualitative)
Numerical (Quantitative): Discrete or Continuous

Types of data - examples


Quantitative data

Age   Income
55    75000
42    68000
.     .

Weight gain
+10
+5
.

Qualitative data

Person   Marital status
1        married
2        single
3        single
.        .

Computer   Brand
1          IBM
2          Dell
3          IBM
.          .

Data Presentation via Tables and Graphs

Organizing numerical/Quantitative data

The ordered array and stem-leaf display

Tabulating and graphing Univariate numerical/quantitative data

Frequency distributions: tables, histograms, polygons

Cumulative distributions: tables, the Ogive

Data Presentation via Tables and Graphs (continued)

Tabulating and graphing Univariate categorical/qualitative data

The frequency distribution table


Bar and pie charts, the Pareto diagram

Graphing Bivariate numerical data



Organizing Numerical/Quantitative Data


Numerical Data
41, 24, 32, 26, 27, 27, 30, 24, 38, 21

Ordered Array
21, 24, 24, 26, 27, 27, 30, 32, 38, 41

Stem and Leaf Display
2 | 144677
3 | 028
4 | 1

Frequency Distributions and Cumulative Distributions
Tables, Histograms, Polygons, Ogive

Ungrouped Versus Grouped Data

Ungrouped data: have not been summarized in any way; are also called raw data

Grouped data: have been organized into a frequency distribution

Example of Ungrouped Data


Ages of a Sample of Managers in the United Arab Emirates

42   26   32   34   57
30   58   37   50   30
53   40   30   47   49
50   40   32   31   40
52   28   23   35   25
30   36   32   26   50
55   30   58   64   52
49   33   43   46   32
61   31   30   40   60
74   37   29   43   54

Frequency Distribution of Managers' Ages (an example of grouped data)

Class Interval   Frequency
20-under 30      6
30-under 40      18
40-under 50      11
50-under 60      11
60-under 70      3
70-under 80      1

How to Construct a Frequency Distribution (or Tally) Table/Chart?

1) Find the range: (51)
2) Select the number of classes: (6)
3) Compute the class interval (width): (10)
4) Determine the class boundaries (limits): (20, 30, 40, 50, 60, 70, 80)
5) Count the observations and assign them to classes

Data Range
42 30 53 50 52 30 55 49 61 74 26 58 40 40 28 36 30 33 31 37 32 37 30 32 23 32 58 43 30 29 34 50 47 31 35 26 64 46 40 43 57 30 49 40 25 50 52 32 60 54

Smallest = 23
Largest = 74

Range = Largest - Smallest = 74 - 23 = 51

Number of Classes and Class Width

The number of classes should be between 5 and 15.
Fewer than 5 classes cause excessive summarization.
More than 15 classes leave too much detail.

Class Width

Divide the range by the number of classes for an approximate class width.
Round up to a convenient number.

$\text{Approximate Class Width} = \frac{51}{6} = 8.5$

$\text{Class Width} = 10$

Frequency Distribution of Managers' Ages

Class Interval   Frequency
20-under 30      6
30-under 40      18
40-under 50      11
50-under 60      11
60-under 70      3
70-under 80      1
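The construction steps (range, number of classes, class width, boundaries, counting) can be sketched in Python using the managers' ages. The variable names are illustrative, and the class width of 10 and the starting boundary of 20 are the choices made on the slides rather than values computed by the code.

```python
# Ages of the 50 managers (raw, ungrouped data from the slides).
ages = [42, 30, 53, 50, 52, 30, 55, 49, 61, 74,
        26, 58, 40, 40, 28, 36, 30, 33, 31, 37,
        32, 37, 30, 32, 23, 32, 58, 43, 30, 29,
        34, 50, 47, 31, 35, 26, 64, 46, 40, 43,
        57, 30, 49, 40, 25, 50, 52, 32, 60, 54]

data_range = max(ages) - min(ages)        # 74 - 23 = 51
num_classes = 6                           # chosen between 5 and 15
approx_width = data_range / num_classes   # 51 / 6 = 8.5
class_width = 10                          # rounded up to a convenient number
lower = 20                                # convenient boundary below the minimum

# Class boundaries: 20, 30, ..., 80.
boundaries = [lower + i * class_width for i in range(num_classes + 1)]

# Count observations per class; "20-under 30" includes 20 but excludes 30.
frequencies = [0] * num_classes
for age in ages:
    frequencies[(age - lower) // class_width] += 1

for lo, hi, freq in zip(boundaries, boundaries[1:], frequencies):
    print(f"{lo}-under {hi}: {freq}")
```

The integer division assigns a boundary value such as 30 to the class that starts at 30, which matches the "under" convention of the class labels.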

Tally Chart of Managers' Ages

Class Interval   Tallies
20-under 30      IIIII I
30-under 40      IIIII IIIII IIIII III
40-under 50      IIIII IIIII I
50-under 60      IIIII IIIII I
60-under 70      III
70-under 80      I

Class Midpoints, Relative Frequencies, and Cumulative Frequencies

$\text{Class Midpoint} = \frac{\text{beginning class endpoint} + \text{ending class endpoint}}{2} = \frac{30 + 40}{2} = 35$

Relative Frequency
Class Interval   Frequency   Relative Frequency
20-under 30      6           .12   (= 6/50)
30-under 40      18          .36   (= 18/50)
40-under 50      11          .22
50-under 60      11          .22
60-under 70      3           .06
70-under 80      1           .02
Total            50          1.00

Cumulative Frequency
Class Interval   Frequency   Cumulative Frequency
20-under 30      6           6
30-under 40      18          24   (= 18 + 6)
40-under 50      11          35   (= 11 + 24)
50-under 60      11          46
60-under 70      3           49
70-under 80      1           50
Total            50

Class Midpoints, Relative Frequencies, and Cumulative Frequencies


Class Interval   Frequency   Midpoint   Relative Frequency   Cumulative Frequency
20-under 30      6           25         .12                  6
30-under 40      18          35         .36                  24
40-under 50      11          45         .22                  35
50-under 60      11          55         .22                  46
60-under 70      3           65         .06                  49
70-under 80      1           75         .02                  50
Total            50                     1.00

Cumulative Relative Frequencies


Class Interval   Frequency   Relative Frequency   Cumulative Frequency   Cumulative Relative Frequency
20-under 30      6           .12                  6                      .12
30-under 40      18          .36                  24                     .48
40-under 50      11          .22                  35                     .70
50-under 60      11          .22                  46                     .92
60-under 70      3           .06                  49                     .98
70-under 80      1           .02                  50                     1.00
Total            50          1.00
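The columns of this table can be derived from the class boundaries and frequencies alone. The following is a minimal Python sketch; the variable names are illustrative.

```python
from itertools import accumulate

boundaries = [20, 30, 40, 50, 60, 70, 80]
frequencies = [6, 18, 11, 11, 3, 1]
total = sum(frequencies)  # 50 observations

# Midpoint: average of the two endpoints of each class.
midpoints = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]

# Relative frequency: class frequency divided by the total.
relative = [f / total for f in frequencies]

# Cumulative frequency: running total of the class frequencies.
cumulative = list(accumulate(frequencies))

# Cumulative relative frequency: running total divided by the total.
cumulative_relative = [c / total for c in cumulative]

for row in zip(midpoints, frequencies, relative, cumulative, cumulative_relative):
    print("{:5.1f} {:3d} {:5.2f} {:3d} {:5.2f}".format(*row))
```

Dividing the cumulative counts by the total, instead of summing the rounded relative frequencies, avoids accumulating rounding error.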

Histogram
Class Interval   Frequency
20-under 30      6
30-under 40      18
40-under 50      11
50-under 60      11
60-under 70      3
70-under 80      1

[Figure: histogram of the managers' ages; x-axis: Years (10 to 80), y-axis: Frequency (0 to 20)]
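Where a plotting tool is not at hand, a rough text version of the histogram can be printed from the frequency distribution. This is only a sketch: each `#` stands for one manager, and the bars run sideways instead of upward as in the slide's chart.

```python
classes = ["20-under 30", "30-under 40", "40-under 50",
           "50-under 60", "60-under 70", "70-under 80"]
frequencies = [6, 18, 11, 11, 3, 1]

# One text bar per class; bar length equals the class frequency.
lines = [f"{label:<12} {'#' * freq} ({freq})"
         for label, freq in zip(classes, frequencies)]

print("\n".join(lines))
```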


Shapes of Histograms

Symmetry
A histogram is said to be symmetric if, when we draw a vertical line down the center of the histogram, the two sides are identical in shape and size:

[Figure: three symmetric histograms; x-axis: Variable, y-axis: Frequency]

Shapes of Histograms

Skewness
A skewed histogram is one with a long tail extending to either the right or the left:
[Figure: two histograms, labeled Positively Skewed and Negatively Skewed; x-axis: Variable, y-axis: Frequency]

Shapes of Histograms

Bell Shape
A special type of symmetric histogram is one that is bell shaped.

[Figure: bell-shaped histogram; x-axis: Variable, y-axis: Frequency]

Many statistical techniques require that the population be bell shaped. Drawing the histogram helps verify the shape of the population in question.

Why do we need Histograms?


Suppose a cereal manufacturer wants to compare the performance of two plants, that is, whether each plant fills boxes with 500 grams of cereal accurately.
The manufacturer picks a sample of 100 cereal boxes from each plant and prepares a separate histogram for each.
These histograms can show how accurately each plant is working.

Histogram Comparison

Compare and contrast the following histograms, based on data from Example 2.6 and Example 2.7.

The two courses have very different histograms:
unimodal vs. bimodal
spread of the marks (narrower vs. wider)

Frequency Polygon
Class Interval   Frequency
20-under 30      6
30-under 40      18
40-under 50      11
50-under 60      11
60-under 70      3
70-under 80      1

[Figure: frequency polygon of the managers' ages; x-axis: Years (10 to 80), y-axis: Frequency (0 to 20)]

Ogive
Class Interval   Cumulative Frequency
20-under 30      6
30-under 40      24
40-under 50      35
50-under 60      46
60-under 70      49
70-under 80      50

[Figure: ogive of the managers' ages; x-axis: Years (0 to 80), y-axis: Frequency (0 to 60)]

Relative Frequency Ogive


Class Interval   Cumulative Relative Frequency
20-under 30      .12
30-under 40      .48
40-under 50      .70
50-under 60      .92
60-under 70      .98
70-under 80      1.00

[Figure: relative frequency ogive; x-axis: Years (0 to 80), y-axis: Cumulative Relative Frequency (0.00 to 1.00)]

Stem and Leaf Display

This is a graphical technique most often used in a preliminary analysis.
Stem and leaf diagrams show the actual values of the original observations, whereas a histogram does not.

Stem and Leaf Display


Split each observation into two parts. There are several ways of doing that:

Observation: 42.19, Stem: 42, Leaf: 19
Observation: 42.19, Stem: 4, Leaf: 2

Safety Examination Scores for Plant Trainees


Raw Data
86   77   91   60   55
76   92   47   88   67
23   59   72   75   83
77   68   82   97   89
81   75   74   39   67
79   83   70   78   91
68   49   56   94   81

Stem   Leaf
2      3
3      9
4      79
5      569
6      07788
7      0245567789
8      11233689
9      11247
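The display above can be reproduced with a short Python sketch. It assumes two-digit scores, with the tens digit as the stem and the units digit as the leaf; the function name `stem_and_leaf` is illustrative.

```python
from collections import defaultdict

# Safety examination scores for the plant trainees (raw data from the slide).
scores = [86, 77, 91, 60, 55, 76, 92, 47, 88, 67, 23, 59,
          72, 75, 83, 77, 68, 82, 97, 89, 81, 75, 74, 39,
          67, 79, 83, 70, 78, 91, 68, 49, 56, 94, 81]

def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted list of leaves (units digits)."""
    display = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        display[stem].append(leaf)
    return dict(display)

display = stem_and_leaf(scores)
for stem in sorted(display):
    print(stem, "|", "".join(str(leaf) for leaf in display[stem]))
```

Sorting the values first means the leaves within each stem come out in ascending order, as in the slide.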

Construction of Stem and Leaf Plot

Each safety examination score is split into a stem (the tens digit) and a leaf (the units digit). For example, the score 86 has stem 8 and leaf 6, and the score 23 has stem 2 and leaf 3. The leaves that share a stem are then listed in ascending order next to that stem.

Graphical Techniques for Qualitative data

When the raw data can be naturally categorized in a meaningful manner, we can display frequencies by

Pie charts: emphasize the proportion of occurrences of each category.
Bar charts: emphasize the frequency of occurrences of the different categories.

The Pie Chart

The pie chart is a circle, subdivided into a number of slices that represent the various categories.
The size of each slice is proportional to the percentage corresponding to the category it represents.

Second Quarter U.S. Truck Production (Example 1)


[Figure: pie chart with slices of 39%, 39%, 17%, 4%, and 1%]

Pie Chart Calculations for Company A


Company   2d Quarter Truck Production   Proportion   Degrees
A         357,411                       .388         140
B         354,936                       .386         139
C         160,997                       .175         63
D         34,099                        .037         13
E         12,747                        .014         5
Totals    920,190                       1.000        360

For Company A:

$\text{Proportion} = \frac{357{,}411}{920{,}190} = .388$

$\text{Degrees} = .388 \times 360 \approx 140$
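The same calculation can be applied to every company with a few lines of Python. This sketch follows the slide's order of operations, rounding the proportion to three decimals before converting it to degrees; the variable names are illustrative.

```python
# Second-quarter truck production by company (from the slide).
production = {"A": 357_411, "B": 354_936, "C": 160_997, "D": 34_099, "E": 12_747}
total = sum(production.values())

rows = {}
for company, units in production.items():
    proportion = round(units / total, 3)   # share of total production
    degrees = round(proportion * 360)      # slice angle out of 360 degrees
    rows[company] = (proportion, degrees)

for company, (proportion, degrees) in rows.items():
    print(company, proportion, degrees)
```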

The Pie Chart

Example 2

The student placement office at a university wanted to determine the general areas of employment of last year's graduates.
Data were collected, and the count of occurrences was recorded for each area. These counts were converted to proportions, and the results were presented as a pie chart and a bar chart.

Frequency and Relative Frequency Distributions for Example 2


Area                 Frequency   Relative Frequency
---------------------------------------------------
Accounting           73          28.9%
Finance              52          20.6%
General Management   36          14.2%
Marketing/Sales      64          25.3%
Other                28          11.1%
---------------------------------------------------
Total                253         100%

(Percentages are rounded to one decimal place, so they sum to 100.1.)

The Pie Chart

[Figure: pie chart of employment areas]

Accounting 28.9%
Finance 20.6%
General management 14.2%
Marketing 25.3%
Other 11.1%

Slice angle for Accounting: $(28.9 / 100)(360^\circ) \approx 104^\circ$

The Bar Chart

Rectangles represent each category.
The height of the rectangle represents the frequency.
The base of the rectangle is arbitrary.

[Figure: bar chart; x-axis: Area (categories 1 to 5), y-axis: Frequency (0 to 80); bar heights 73, 52, 36, 64, 28]

Graphing the Relationship Between Two Quantitative Variables

To explore this relationship, we use a scatter diagram, which plots two variables against one another. The independent variable, X, is usually placed on the horizontal axis, and the dependent variable, Y, on the vertical axis.

Scatter Diagram

Example 2.9: A real estate agent wanted to know to what extent the selling price of a home is related to its size.

1) Collect the data.
2) Determine the independent variable (X = house size) and the dependent variable (Y = selling price).
3) Use Excel to create a scatter diagram.

Scatter Diagram

It appears that in fact there is a relationship, that is, the greater the house size the greater the selling price


Patterns of Scatter Diagrams

Linearity and Direction are two concepts we are interested in

Positive Linear Relationship

Negative Linear Relationship

Weak or Non-Linear Relationship
