Block 1 Slides
Module Layout
QBA 501
Descriptive Statistics
Probability & Distributions
Statistical Inference
Data, Tables & Graphs
Graphical Techniques
Regression Analysis
Types of Data
Pivot Tables
Summary Data
Central Location
Dispersion
Boxplots
through scatterplots Understand time series plots Manipulating data with pivot tables
Types of Data
Numerical Data
Continuous: infinite number of possible values; no gaps in possible values
Discrete: gaps in possible values
Categorical Data
Nominal: named category as variable; the name can be a number. Example: eye color (blue, brown, etc.)
Ordinal: category identifies ranked order of values. Example: size of customer (small, medium, large)
Observations
An observation is a member of the population or sample. Each row corresponds to an observation. In this data set, each person represents an observation.
Variables
A variable is a specific attribute being observed/measured. Each column represents a variable. In this data set, each of the six pieces of information about a person is a variable.
Code Gender (1 for male and 2 for female)
Uncode the Opinion variable
Categorize the Age variable as young (34 or younger), middle-aged (35 to 59), and elderly (60 or older)
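The coding steps above can be sketched outside Excel as well. The following is a minimal Python illustration, not the course's own method: the Gender codes and Age cutoffs come from the slide, while the Opinion labels, the function names, and the three sample rows are assumptions made up for the example.

```python
# Gender coding and age cutoffs are taken from the slide. The Opinion scale
# is a HYPOTHETICAL 1-5 agreement scale; the slide does not list its labels.
GENDER_CODES = {"male": 1, "female": 2}
OPINION_LABELS = {
    1: "strongly disagree",
    2: "disagree",
    3: "neutral",
    4: "agree",
    5: "strongly agree",
}


def code_gender(label):
    """Turn a text label into its numeric code (1 = male, 2 = female)."""
    return GENDER_CODES[label.strip().lower()]


def uncode_opinion(code):
    """Turn a numeric opinion code back into a readable label."""
    return OPINION_LABELS[code]


def age_group(age):
    """Categorize a numerical age into an ordinal group."""
    if age <= 34:
        return "young"
    if age <= 59:
        return "middle-aged"
    return "elderly"


# Made-up observations, one dict per row (illustration only).
people = [
    {"gender": "Male", "opinion": 4, "age": 28},
    {"gender": "Female", "opinion": 2, "age": 47},
    {"gender": "Female", "opinion": 5, "age": 63},
]

recoded = [
    {
        "gender": code_gender(p["gender"]),
        "opinion": uncode_opinion(p["opinion"]),
        "age_group": age_group(p["age"]),
    }
    for p in people
]

for row in recoded:
    print(row)
```

Note that categorizing Age turns a numerical variable into an ordinal one: the groups keep their order but lose the exact values.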
Cross-sectional data
All variables measured at one point in time (a snapshot). Coding.xls is cross-sectional; other examples?
Time-series data
Measure one or more variables at successive points in time. Time-series examples?
Frequency Distribution
Bins should be mutually exclusive
Bins should be collectively exhaustive
8-15 bins works best; with too many, or too few, we lose meaningful information in the data
Bins should have equal widths (round numbers are better, e.g. 5, 10, 100, etc.)
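The binning guidelines above can be made concrete with a small sketch. This is an illustrative Python version under stated assumptions: the function name `frequency_distribution` and the sample values are hypothetical, and the bins are half-open intervals `[low, high)` so that each value lands in exactly one bin.

```python
def frequency_distribution(values, start, width, n_bins):
    """Count values in n_bins equal-width bins of the form [low, low + width)."""
    counts = [0] * n_bins
    for v in values:
        idx = int((v - start) // width)
        if not 0 <= idx < n_bins:
            # A value outside every bin means the bins are not
            # collectively exhaustive for this data set.
            raise ValueError(f"{v} falls outside the bins")
        counts[idx] += 1
    return [
        (start + i * width, start + (i + 1) * width, count)
        for i, count in enumerate(counts)
    ]


# Made-up integer measurements (illustration only).
values = [2, 4, 7, 9, 11, 12, 13, 16, 18, 19, 21, 22, 24, 27, 31, 33, 38]

# Eight bins of width 5: a round width and a bin count in the 8-15 range.
table = frequency_distribution(values, start=0, width=5, n_bins=8)

for low, high, count in table:
    print(f"[{low:>2}, {high:>2}): {'*' * count}")
```

Because every value is assigned to one and only one bin, the counts add up to the number of observations, which is a quick check that the bins are mutually exclusive and collectively exhaustive.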
Otis Elevators
Data consists of the diameter (in inches) of 400 elevator rails measured by Otis Elevators. The diameters range from a low of approximately 0.449 inch to a high of approximately 0.548 inch.
Symmetric Histograms
A histogram is symmetric if it has a single peak and looks approximately the same to the left and right of the peak.
Data consists of the time between customer arrivals - called interarrival times - for all customers on a given day.
A histogram is positively skewed or skewed to the right if it has a single peak and the values of the distribution extend much farther to the right of the peak than to the left of the peak.
Data consists of the midterm grades for a large class of accounting students.
A histogram is negatively skewed or skewed to the left if its longer tail is on the left.
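The slides judge skewness by eye from the histogram. As a numeric cross-check, one common measure (not taken from the slides) is the moment coefficient of skewness, whose sign matches the direction of the longer tail. The sketch below assumes a cutoff of 0.5 for "approximately symmetric"; that cutoff, the function names, and the sample data are all hypothetical, and the code does not check that the data has a single peak.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def describe_shape(values, tol=0.5):
    """Label the shape; tol is an assumed cutoff, not a standard."""
    g = skewness(values)
    if g > tol:
        return "positively skewed"
    if g < -tol:
        return "negatively skewed"
    return "approximately symmetric"


# Made-up data (illustration only).
interarrival_minutes = [1, 1, 2, 2, 3, 3, 4, 6, 9, 15]    # long right tail
midterm_grades = [99, 99, 98, 98, 97, 97, 96, 94, 91, 85]  # long left tail
balanced = [1, 2, 3, 4, 5]

for name, data in [
    ("interarrival times", interarrival_minutes),
    ("midterm grades", midterm_grades),
    ("balanced", balanced),
]:
    print(f"{name}: {describe_shape(data)} (skewness = {skewness(data):.2f})")
```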
Otis Elevators 2
Data consists of the diameters of all elevator rails produced on a single day at Otis Elevators. Otis uses two machines to produce elevator rails.
Bimodal Distributions
Some histograms have two or more peaks. This often indicates that the data come from two or more distinct populations. With two peaks, the result is a bimodal distribution. Other multimodal distributions exist (trimodal, etc.).
Pie Chart
Cousin of the histogram. Visualizes frequency as a proportion of the entire data set.
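The proportions behind a pie chart are just relative frequencies. A minimal Python sketch, using made-up eye-color observations and a hypothetical helper name `proportions`:

```python
from collections import Counter


def proportions(observations):
    """Return each category's share of the whole data set."""
    counts = Counter(observations)
    total = len(observations)
    return {category: count / total for category, count in counts.items()}


# Made-up categorical data (illustration only).
eye_colors = ["blue", "brown", "brown", "green", "brown", "blue", "brown", "hazel"]

shares = proportions(eye_colors)
for category, share in sorted(shares.items()):
    print(f"{category}: {share:.1%}")
```

Each share corresponds to one slice of the pie, and the shares sum to 1.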
Scatterplots
We are often interested in the relationship between two variables. Plot a point for each observation, where the coordinates represent the values of the two variables. The resulting graph is a scatterplot.
After constructing a scatterplot, we can examine the scatter of points. We look for any relationship between the two variables.
Direction: Is there a tendency for one variable to move in concert with, or in opposition to, the other variable?
Strength: Are the points tightly clustered around an imaginary straight line? Or are they more broadly scattered?
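The slides assess direction and strength visually. One numeric summary that captures both for a straight-line relationship is the Pearson correlation coefficient: its sign gives the direction and its magnitude (between 0 and 1) gives the strength. The sketch below is an illustration under stated assumptions; the function name and the data are hypothetical.

```python
import math


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up paired observations (illustration only).
xs = [1, 2, 3, 4, 5, 6]
ys_noisy = [2, 1, 4, 3, 6, 5]       # moves with xs, with some scatter
ys_opposed = [12, 10, 8, 6, 4, 2]   # moves exactly opposite to xs

r_noisy = correlation(xs, ys_noisy)
r_opposed = correlation(xs, ys_opposed)

print(f"noisy:   r = {r_noisy:.3f}")
print(f"opposed: r = {r_opposed:.3f}")
```

A value near +1 or -1 means the points cluster tightly around a line; a value near 0 means no linear pattern, although a curved relationship could still be present, so the plot should always be inspected as well.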
When we need to forecast future values of a time series, it is helpful to create a time series plot. This is essentially a scatterplot, with the time series variable on the vertical axis and the time itself on the horizontal axis. Also, to make patterns in the data more apparent, the points are usually connected with lines.
When looking at a time series plot we usually look for two things:
Is there an observable trend or cycle? That is, do the values of the series tend to increase (an upward trend) or decrease (a downward trend) over time? Or cycle up and down?
Is there a seasonal pattern? For example, do the peaks or valleys for quarterly data tend to occur every fourth observation?
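Both questions above can be given a rough numeric counterpart. The sketch below is a simple illustration, not a forecasting method from the slides: it estimates the trend as the least-squares slope of the series against the observation index, and looks for seasonality by averaging every fourth observation. The function names and the quarterly sales figures are hypothetical.

```python
def trend_slope(series):
    """Least-squares slope of the series against time index 0, 1, 2, ..."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def seasonal_means(series, period):
    """Average of the observations at each position within the period."""
    return [
        sum(series[i::period]) / len(series[i::period])
        for i in range(period)
    ]


# Made-up quarterly sales for three years (illustration only).
sales = [10, 14, 12, 20, 12, 16, 14, 22, 14, 18, 16, 24]

slope = trend_slope(sales)
quarter_means = seasonal_means(sales, period=4)
peak_quarter = quarter_means.index(max(quarter_means)) + 1

print(f"slope per quarter: {slope:.2f}")
print(f"mean by quarter:   {quarter_means}")
print(f"peak quarter:      Q{peak_quarter}")
```

A positive slope suggests an upward trend, and a quarter whose mean stands well above the others suggests a seasonal peak every fourth observation. With a strong trend, the later quarters are inflated slightly by the trend itself, so this remains a first look rather than a full decomposition.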
Note that you can plot two time series on the same chart and use a different vertical scale for each variable. This can reveal time-based relationships between the variables, in a manner similar to a scatterplot for cross-sectional data.
Pivot Tables
One of Excel's most powerful tools. Pivot tables allow us to slice and dice the data. Statisticians often refer to the resulting tables as contingency tables or crosstabs.
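To show what a contingency table contains, here is a minimal Python sketch that counts observations for each combination of two categorical variables. It is an illustration of the idea only, not Excel's pivot table feature; the `crosstab` helper and the sample rows are hypothetical.

```python
from collections import Counter


def crosstab(rows, row_key, col_key):
    """Count observations for every combination of two categorical variables."""
    counts = Counter((r[row_key], r[col_key]) for r in rows)
    row_labels = sorted({r[row_key] for r in rows})
    col_labels = sorted({r[col_key] for r in rows})
    # Counter returns 0 for combinations that never occur.
    return {
        rl: {cl: counts[(rl, cl)] for cl in col_labels}
        for rl in row_labels
    }


# Made-up observations, one dict per row (illustration only).
people = [
    {"gender": "female", "opinion": "agree"},
    {"gender": "female", "opinion": "agree"},
    {"gender": "female", "opinion": "disagree"},
    {"gender": "male", "opinion": "agree"},
    {"gender": "male", "opinion": "disagree"},
    {"gender": "male", "opinion": "disagree"},
    {"gender": "male", "opinion": "neutral"},
]

table = crosstab(people, row_key="gender", col_key="opinion")

for gender, row in table.items():
    print(gender, row)
```

Swapping `row_key` and `col_key` pivots the table, which mirrors dragging fields between the row and column areas of a pivot table.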
next step.
names to the appropriate field area in the pivot table field list.
The best way to learn the full power of pivot tables is to get in and play. The automatic link to the charts is very powerful: you can jump back and forth to view the impact of your actions. Finally, you can manipulate pivot charts just like any other Excel charts.
Module Layout
QBA 501
Descriptive Statistics
Probability & Distributions
Statistical Inference
Data, Tables & Graphs
Graphical Techniques
Regression Analysis
Types of Data
Pivot Tables
Summary Data
Dispersion & Association
Central Location
Boxplots