Presentation 1
Lecture Topics
Define Statistics
Applications of Statistics in Business
Difference between Descriptive and Inferential Statistics
Some Basic Concepts
Data Collection
Data Presentation via Tables and Graphs
What is Statistics?
Facts and figures
Branch of mathematics
Science of gathering, analyzing, interpreting, and presenting data
Applications of Statistics in Business
Accounting: auditing and cost estimation
Economics: regional, national, and international economic performance
Finance: investments and portfolio management
Management: human resources and quality management
Management Information Systems: performance of systems which gather, summarize, and disseminate information to various managerial levels
Marketing: market analysis and consumer research
International Business: market and demographic analysis
Population vs. Sample
Census
Descriptive vs. Inferential Statistics
Parameter vs. Statistic
Variables, Data
Population
Descriptive Statistics: using data gathered on a group to describe or reach conclusions about that same group only.
Inferential Statistics: using sample data to reach conclusions about the population from which the sample was taken.
Descriptive Statistics
Collect data
Present data
Characterize data (e.g., the sample mean, $\bar{X} = \sum X_i / n$)
Descriptive Statistics
Descriptive statistics involves the arrangement, summary, and presentation of data to enable meaningful interpretation and to support decision making.
Descriptive statistics methods make use of graphical techniques and numerical descriptive measures.
The methods presented apply to both the entire population and the population sample.
Inferential Statistics
Drawing conclusions and/or making decisions concerning a population based on sample results.
Estimation, e.g.: Estimate the population mean weight using the sample mean weight
Hypothesis testing, e.g.: Test the claim that the population mean weight is 120 pounds
Quiz
Which of the following statements involve descriptive statistics as opposed to inferential statistics?
1) The Alcohol, Tobacco and Firearms Department reported that Houston had 1,791 registered gun dealers in 2006.
2) Based on a survey of 400 magazine readers, the magazine reports that 45% of its readers prefer double column articles.
3) Based on a sample of 300 professional tennis players, a tennis magazine reported that 25% of the parents of all professional tennis players did not play tennis.
A sample statistic (such as x̄) is used to estimate the corresponding population parameter.
Definitions
A variable is some characteristic of a population or sample that is of interest to us. E.g. student grades. Typically denoted with a capital letter: X, Y, Z
Data are the observed values of a variable. E.g. student marks: {67, 74, 71, 83, 93, 55, 48}
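As a minimal sketch of these two definitions in Python (the names `marks`, `n`, and `x_bar` are illustrative only), the list below holds the data, that is, the observed values of the variable X = student marks, and a simple descriptive summary is computed from it:

```python
from statistics import mean

# Data: the observed values of the variable X (student marks).
marks = [67, 74, 71, 83, 93, 55, 48]

n = len(marks)       # number of observations
x_bar = mean(marks)  # arithmetic mean of the observed values

print(n, round(x_bar, 2))
```

The variable is the characteristic being measured; the list is one particular set of observations of it.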
Data Sources
Primary: data collection (for example, experimentation)
Secondary: data compilation
Types of Data
Knowing the type of data is necessary to properly select the technique to be used when analyzing data.
Data
  Categorical (Qualitative)
  Numerical (Quantitative)
    Discrete
    Continuous
Examples: weight gain (+10, +5, ...) is numerical; computer brand (computer 1: IBM, 2: Dell, 3: IBM, ...) is categorical.
Ordered Array
21, 24, 24, 26, 27, 27, 30, 32, 38, 41
Stem and leaf display
2 | 144677
3 | 028
4 | 1
Other presentations: histograms, tables
Ungrouped data
have not been summarized in any way
are also called raw data
Grouped data
have been organized into a frequency distribution
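Example of grouped data: the 50 raw observations (listed in full under Data Range) organized into a frequency distribution:
Class Interval    Frequency
20-under 30       6
30-under 40       18
40-under 50       11
50-under 60       11
60-under 70       3
70-under 80       1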
Data Range
42 30 53 50 52 30 55 49 61 74 26 58 40 40 28 36 30 33 31 37 32 37 30 32 23 32 58 43 30 29 34 50 47 31 35 26 64 46 40 43 57 30 49 40 25 50 52 32 60 54
Smallest = 23
Largest = 74
Range = Largest - Smallest = 74 - 23 = 51
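The range calculation can be checked with a short Python sketch; the variable names are illustrative, and the list reproduces the 50 observations given under Data Range:

```python
# The 50 observations listed under Data Range.
data = [
    42, 30, 53, 50, 52, 30, 55, 49, 61, 74,
    26, 58, 40, 40, 28, 36, 30, 33, 31, 37,
    32, 37, 30, 32, 23, 32, 58, 43, 30, 29,
    34, 50, 47, 31, 35, 26, 64, 46, 40, 43,
    57, 30, 49, 40, 25, 50, 52, 32, 60, 54,
]

smallest = min(data)
largest = max(data)
data_range = largest - smallest  # range = largest - smallest

print(smallest, largest, data_range)
```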
Fewer than 5 classes cause excessive summarization.
More than 15 classes leave too much detail.
Class Width
Divide the range by the number of classes for an approximate class width.
Round up to a convenient number.
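A rough sketch of the class width rule in Python follows. Two things are assumptions made for illustration: the choice of 6 classes, and reading "a convenient number" as the next multiple of 5. The range of 51 is the largest value (74) minus the smallest (23) in the data set.

```python
import math

def class_width(data_range, num_classes, step=5):
    """Return (approximate width, width rounded up to a multiple of `step`).

    `step` is an assumed notion of a "convenient number".
    """
    if not 5 <= num_classes <= 15:
        raise ValueError("rule of thumb: use between 5 and 15 classes")
    approx = data_range / num_classes
    return approx, math.ceil(approx / step) * step

approx, width = class_width(data_range=51, num_classes=6)
print(approx, width)
```

With these assumptions the approximate width of 8.5 rounds up to 10, which matches classes such as 20-under 30.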
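$\text{Class Midpoint} = \dfrac{\text{class beginning point} + \text{class endpoint}}{2}$
For example, the midpoint of the class 30-under 40 is $(30 + 40) / 2 = 35$.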
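As a small illustration, the midpoint of every class can be computed in Python by averaging its two boundaries. The `classes` list is an assumed representation of the intervals 20-under 30 through 70-under 80:

```python
# Each class is a (lower boundary, upper boundary) pair.
classes = [(20, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80)]

# Midpoint = average of the two class boundaries.
midpoints = [(lower + upper) / 2 for lower, upper in classes]

print(midpoints)
```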
Relative Frequency
Class Interval    Frequency    Relative Frequency
20-under 30       6            .12  (6/50)
30-under 40       18           .36  (18/50)
40-under 50       11           .22
50-under 60       11           .22
60-under 70       3            .06
70-under 80       1            .02
Total             50           1.00
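The relative frequency column is each class frequency divided by the total. A minimal Python sketch, with illustrative variable names:

```python
# Class frequencies for 20-under 30 through 70-under 80.
frequencies = [6, 18, 11, 11, 3, 1]

total = sum(frequencies)
relative = [f / total for f in frequencies]  # proportion of observations per class

print(total, [round(r, 2) for r in relative])
```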
Cumulative Frequency
Class Interval    Frequency    Cumulative Frequency
20-under 30       6            6
30-under 40       18           24  (18 + 6)
40-under 50       11           35  (11 + 24)
50-under 60       11           46
60-under 70       3            49
70-under 80       1            50
Total             50
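Cumulative frequencies are running totals of the class frequencies. One way to sketch this in Python is with `itertools.accumulate` from the standard library:

```python
from itertools import accumulate

# Class frequencies for 20-under 30 through 70-under 80.
frequencies = [6, 18, 11, 11, 3, 1]

# Running total: each entry adds the current class to all earlier classes.
cumulative = list(accumulate(frequencies))

print(cumulative)
```

The last cumulative frequency always equals the total number of observations.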
Histogram
Class Interval    Frequency
20-under 30       6
30-under 40       18
40-under 50       11
50-under 60       11
60-under 70       3
70-under 80       1
[Figure: histogram of the frequency distribution; x-axis: Years (0 to 80), y-axis: Frequency (0 to 20)]
Histogram Construction
[Figure: the histogram is drawn with one bar per class interval, each bar's height equal to the class frequency; x-axis: Years, y-axis: Frequency]
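The construction can be sketched in Python by counting how many raw observations fall in each class and drawing one bar per class. This is a text-only illustration, not the charting method used in the lecture; the starting point of 20 and the width of 10 are taken from the class intervals, and the data are the 50 observations listed under Data Range.

```python
data = [
    42, 30, 53, 50, 52, 30, 55, 49, 61, 74,
    26, 58, 40, 40, 28, 36, 30, 33, 31, 37,
    32, 37, 30, 32, 23, 32, 58, 43, 30, 29,
    34, 50, 47, 31, 35, 26, 64, 46, 40, 43,
    57, 30, 49, 40, 25, 50, 52, 32, 60, 54,
]

lower, width, num_classes = 20, 10, 6  # classes 20-under 30 ... 70-under 80

# Tally each observation into its class.
counts = [0] * num_classes
for x in data:
    counts[(x - lower) // width] += 1

# Draw one bar of asterisks per class.
for i, count in enumerate(counts):
    start = lower + i * width
    print(f"{start}-under {start + width}: {'*' * count} ({count})")
```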
Shapes of Histograms
Symmetry
A histogram is said to be symmetric if, when we draw a vertical line down the center of the histogram, the two sides are identical in shape and size:
[Figure: three examples of symmetric histograms; horizontal axis: Variable, vertical axis: Frequency]
Skewness
A skewed histogram is one with a long tail extending to either the right or the left:
[Figure: two histograms, one positively skewed (long tail to the right) and one negatively skewed (long tail to the left); horizontal axis: Variable, vertical axis: Frequency]
Bell Shape
A special type of symmetric histogram is one that is bell shaped:
[Figure: bell-shaped histogram; horizontal axis: Variable, vertical axis: Frequency]
Many statistical techniques require that the population be bell shaped. Drawing the histogram helps verify the shape of the population in question.
Frequency Polygon
[Figure: frequency polygon of the same frequency distribution (20-under 30: 6, 30-under 40: 18, 40-under 50: 11, 50-under 60: 11, 60-under 70: 3, 70-under 80: 1); x-axis: Years (0 to 80), y-axis: Frequency (0 to 20)]
Ogive
Class Interval    Cumulative Frequency
20-under 30       6
30-under 40       24
40-under 50       35
50-under 60       46
60-under 70       49
70-under 80       50
[Figure: ogive of the cumulative frequencies; x-axis: Years (0 to 80), y-axis: Frequency (0 to 60)]
[Figure: relative frequency ogive; x-axis: Years (0 to 80), y-axis: cumulative relative frequency (0.00 to 1.00)]
A stem and leaf display is a graphical technique most often used in a preliminary analysis. Stem and leaf diagrams use the actual values of the original observations, whereas the histogram does not.
Split each observation into two parts. There are several ways of doing that:
Observation    Stem    Leaf
42.19          42      19
42.19          4       2
Stem | Leaf
2    | 3
3    | 9
4    | 79
5    | 569
6    | 07788
7    | 0245567789
8    | 11233689
9    | 11247
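A stem and leaf display can be sketched in a few lines of Python. The version below assumes two-digit whole numbers, so the tens digit is the stem and the units digit is the leaf; the function name `stem_and_leaf` is illustrative, and the sample values are the ten observations from the Ordered Array example.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group sorted two-digit values by tens digit (stem); leaves are units digits."""
    display = defaultdict(list)
    for v in sorted(values):
        display[v // 10].append(v % 10)
    return dict(display)

observations = [21, 24, 24, 26, 27, 27, 30, 32, 38, 41]

for stem, leaves in stem_and_leaf(observations).items():
    print(stem, "|", "".join(str(leaf) for leaf in leaves))
```

Because every leaf is an actual digit of an observation, the original values can be read back from the display, which is not possible from a histogram.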
When the raw data can be naturally categorized in a meaningful manner, we can display frequencies by
Pie charts, which emphasize the proportion of occurrences of each category.
Bar charts, which emphasize the frequency of occurrences of the different categories.
The pie chart is a circle, subdivided into a number of slices that represent the various categories.
The size of each slice is proportional to the percentage corresponding to the category it represents.
[Figure: pie chart; the slices for companies A and B each cover about 39%]
Company    Value      Proportion    Degrees
A          357,411    .388          140
B          354,936    .386          139
C          160,997    .175          63
D          34,099     .037          13
E          12,747     .014          5
Totals     920,190    1.000         360
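The conversion from values to slice sizes can be sketched in Python: each proportion is the category value divided by the total, and each angle is that proportion of 360 degrees. One assumption to flag: the value used for company A (357,411) is inferred here as the stated total of 920,190 minus the other four companies' values.

```python
# Values per company; A's value is an inferred figure (total minus B..E).
values = {"A": 357_411, "B": 354_936, "C": 160_997, "D": 34_099, "E": 12_747}

total = sum(values.values())

proportions = {name: v / total for name, v in values.items()}
degrees = {name: round(p * 360) for name, p in proportions.items()}

for name in values:
    print(name, round(proportions[name], 3), degrees[name])
```

Rounded angles do not always add up to exactly 360; in this example they do.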
Example 2
The student placement office at a university wanted to determine the general areas of employment of last year's graduates.
Data were collected, and the count of occurrences was recorded for each area. These counts were converted to proportions, and the results were presented as a pie chart and a bar chart.
[Figure: pie chart of employment areas; labeled slices include Accounting 28.9%, Finance 20.6%, and Marketing 25.3%]
Bar Chart
Rectangles represent each category.
The height of the rectangle represents the frequency.
The base of the rectangle is arbitrary.
[Figure: bar chart; x-axis: Area (1, 2, 3, 4, 5, More), y-axis: Frequency (0 to 80); bar labels: 73, 52, 36, 64, 28]
To explore the relationship between two numerical variables, we employ a scatter diagram, which plots the two variables against one another. The independent variable is labeled X and is usually placed on the horizontal axis, while the dependent variable, Y, is mapped to the vertical axis.
Scatter Diagram
Example 2.9 A real estate agent wanted to know to what extent the selling price of a home is related to its size.
1) Collect the data
2) Determine the independent variable (X = house size) and the dependent variable (Y = selling price)
3) Use Excel to create a scatter diagram
It appears that there is in fact a relationship: the greater the house size, the greater the selling price.