QMT412
sanizah@tmsk.uitm.edu.my
Learning Outcomes
- What is statistics?
- Uses of statistics
- Types of statistics
- Common statistical terms
- Sources of data
- Types of variables
- Scales of measurement
What is Statistics?
Statistics is the science of:
1. Collecting data
2. Organizing data
3. Presenting data
4. Analyzing data
5. Interpreting data / decision making
Uses of Statistics
- Education: predict most favourite
- Medicine: effectiveness of drugs; predict diseases
- Business/Marketing: predict sales; consumer preferences; financial trends
TYPES OF STATISTICS
DESCRIPTIVE
- Describe and summarize the characteristics of data
- Consist of collecting, organizing, summarizing, and presenting data
- Ex: percentage, mean, median, bar chart, pie chart, frequency table, box plot
INFERENTIAL
- Make inferences about a population from a sample
- Involve statistical tests
- Results are used to draw conclusions
- Ex: estimation, hypothesis testing (t-test, z-test), forecasting, regression
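The contrast between the two types can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scores and the hypothesized population mean of 70 are made-up values, not data from the course. The first half summarizes the sample (descriptive); the second half computes a one-sample t statistic to test a claim about the population (inferential).

```python
import math
import statistics

# Hypothetical sample of exam scores (illustrative values, not from the slides).
scores = [62, 75, 58, 90, 71, 66, 84, 79]

# Descriptive statistics: summarize the sample itself.
n = len(scores)
mean = statistics.mean(scores)
median = statistics.median(scores)
pct_70_plus = 100 * sum(s >= 70 for s in scores) / n

# Inferential statistics: use the sample to test a claim about the population.
# One-sample t statistic for H0: population mean = 70 (hypothetical value).
mu0 = 70
std_dev = statistics.stdev(scores)  # sample standard deviation (n - 1 divisor)
t_stat = (mean - mu0) / (std_dev / math.sqrt(n))

print(f"mean={mean}, median={median}, % scoring 70+ = {pct_70_plus}")
print(f"t statistic = {t_stat:.3f} with {n - 1} degrees of freedom")
```

The t statistic would then be compared with a critical value from a t table to reach a conclusion about the population.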
Population: consists of ALL subjects (also called members or elements), human or otherwise, that are being studied
Sample: a group of subjects selected from the population
Sampling frame: the LIST of population members
Variable: a characteristic or attribute of interest in a population/sample. Ex: gender, marital status, age, weight, income
Data: the values that can be obtained from measurements or observations
Statistical Terms
POPULATION: all items of interest
CENSUS: a study carried out using the whole population
SAMPLE: a portion of the population
SAMPLE SURVEY: a study carried out on a subgroup (sample) chosen from the population
PARAMETER: a summary measure for the entire population
STATISTIC: a summary measure computed from sample data
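The difference between a parameter and a statistic can be illustrated with a small simulation. The population below is hypothetical (1,000 simulated ages), and the seed and sample size of 50 are arbitrary choices made for this sketch.

```python
import random
import statistics

# Hypothetical population: ages of 1,000 people (simulated, not real data).
rng = random.Random(412)  # fixed seed so the run is repeatable
population = [rng.randint(18, 65) for _ in range(1000)]

# Census: every member is measured, so the summary is a parameter.
parameter_mean = statistics.mean(population)

# Sample survey: only a subgroup is measured, so the summary is a statistic.
sample = rng.sample(population, 50)
statistic_mean = statistics.mean(sample)

print(f"parameter (population mean): {parameter_mean:.2f}")
print(f"statistic (sample mean):     {statistic_mean:.2f}")
```

The two means will usually be close but not identical; the statistic changes from sample to sample, while the parameter is fixed for a given population.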
SOURCE OF DATA
PRIMARY
- First-hand data collected by the investigator
- Ex: interviewing respondents, surveys, experiments
- Advantages: 1) more accurate and consistent, 2) the investigator is able to explain how the data were collected and their limitations
- Disadvantage: requires more time, manpower, and cost
SECONDARY
- Data taken from other investigators' collections of figures; data collected by other parties
- Ex: Bank Negara, Statistics Department
- Advantages: 1) easily accessible from the internet, journals, books, annual reports, etc., 2) inexpensive, less time to collect
- Disadvantages: 1) may lack accuracy because the method of data collection is not explained, 2) may be biased because the original purpose of data collection is not known
Example 1
Advance Co. has established a new service for its customers called a "help-line". Customers can call the help-line on any matter related to the company and its products. Advance Co. wants to investigate the effectiveness of the help-line among the customers who have purchased its products. The company intends to obtain a sample of 200 customers using the warranty cards. The warranty cards contain information about the products purchased, the telephone numbers, and the addresses of the customers. The categories of products and percentage of the warranty cards in each category are as follows:
Categories of product: Washing Machines, Refrigerators, Vacuum Cleaners, Food Processors
Example 1: questions
i) What is the population under study?
ii) What is the sampling frame?
iii) What is the variable to be measured?
Example 1: solution
i) What is the population under study?
A: ALL customers who have purchased Advance Co. products
ii) What is the sampling frame?
A: The LIST of Advance Co. customers in the warranty cards
iii) What is the variable to be measured?
A: Effectiveness of the help-line
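Drawing the 200 customers from the sampling frame could look like the sketch below. Several things here are assumptions: the example does not say how many warranty cards exist (5,000 is a made-up figure), the card IDs are hypothetical, and simple random sampling is used only for illustration since the example does not name the sampling method.

```python
import random

# Hypothetical sampling frame: one ID per warranty card.
# The frame size of 5,000 is an assumption; the example does not state it.
sampling_frame = [f"CARD-{i:04d}" for i in range(1, 5001)]

rng = random.Random(1)  # fixed seed so the draw is repeatable
# Simple random sample of 200 customers, drawn without replacement.
sample = rng.sample(sampling_frame, 200)

print(len(sample), "customers selected, e.g.", sample[:3])
```

Because the sample is drawn from the list of warranty cards, only customers who appear in that list can be selected, which is why the frame matters.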
Types of variables
Variable
- Quantitative
- Qualitative or categorical (e.g., make of a computer, hair color, gender)
Levels of Measurement
When we observe and record a variable, it has characteristics that influence the type of statistical analysis that we can perform on it.
An early step in analysis is to determine the level of measurement; it tells us which statistical tests can and cannot be performed.
From lowest to highest scale: Nominal, Ordinal, Interval, Ratio
Nominal Scales
Nominal:
- Represent observations that can be categorized but do not have a meaningful numeric value
- Examples: gender, religion, nationality, favorite colour, number on a football jersey
- Note: the values cannot be compared to see if one is larger than the other; the MEAN cannot be calculated
Ordinal Scales
Ordinal:
- Represent observations that can be categorized and rank ordered
- The values can be compared to see if one is larger or smaller than the other
- Example: grade (A, B, C, D, E, F)
- Note: we cannot assume that the differences between adjacent scale values are equal, even if the labels are numbers rather than words
Interval Scales
Interval:
- Represent observations that can be categorized, rank ordered, and have a unit of measure
- A unit of measure implies that the difference between any two successive values is identical
- Values can be added or subtracted (but cannot be meaningfully multiplied or divided)
- No true zero point (the value 0 does not represent the complete absence of the variable)
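A short numerical sketch shows why values on an interval scale can be subtracted but not divided. Temperature in degrees Celsius is used here as an assumed example (it is a common textbook case of an interval scale); the two readings are made-up values.

```python
# Temperature in degrees Celsius: an interval-scale variable (assumed example).
temp_a_c = 10.0
temp_b_c = 20.0

# Differences are meaningful on an interval scale.
difference = temp_b_c - temp_a_c  # 10 degrees warmer

# Ratios are not: the Celsius zero is arbitrary, so 20/10 = 2 does not mean
# "twice as hot". On the Kelvin scale, which has a true zero, the ratio differs.
ratio_celsius = temp_b_c / temp_a_c
ratio_kelvin = (temp_b_c + 273.15) / (temp_a_c + 273.15)

print(f"difference = {difference} degrees")
print(f"Celsius ratio = {ratio_celsius}, Kelvin ratio = {ratio_kelvin:.3f}")
```

The difference is the same whichever zero point is used, but the ratio changes with the choice of zero, so only the difference carries information.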
Ratio Scales
Ratio:
- Highest and most informative scale
- Observations that can be categorized, rank ordered, have a unit of measure, and have a true zero (an absolute zero point)
- The true zero implies that a value of zero represents the complete absence of the variable
- Examples: amount of money (zero money indicates the absence of money), weight, height, time
- Note: values can be multiplied or divided
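The four levels can be summarized by which summaries are meaningful at each one. The sketch below is illustrative only: the slides state just that the mean cannot be calculated for nominal data and that ratio values can be multiplied or divided, so the rest of the `PERMITTED` mapping follows the conventional textbook treatment and is an assumption here. All data values are made up.

```python
import statistics

# Which summaries are meaningful at each level (cumulative, lowest to highest).
PERMITTED = {
    "nominal": ["mode"],
    "ordinal": ["mode", "median"],
    "interval": ["mode", "median", "mean"],
    "ratio": ["mode", "median", "mean", "ratio of two values"],
}

# Nominal: only counting categories makes sense.
colours = ["red", "blue", "red", "green", "red"]
most_common_colour = statistics.mode(colours)

# Ordinal: grades can be ranked, so a median grade is meaningful.
grade_order = ["F", "E", "D", "C", "B", "A"]  # lowest to highest
grades = ["B", "A", "C", "B", "D"]
ranks = sorted(grade_order.index(g) for g in grades)
median_grade = grade_order[statistics.median_low(ranks)]

# Ratio: a true zero makes statements like "twice as much" valid.
fines = [100.0, 200.0]
times_larger = fines[1] / fines[0]

print(most_common_colour, median_grade, times_larger)
```

Each level keeps everything permitted at the levels below it and adds one more kind of comparison, which is why ratio is described as the most informative scale.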
Example 2
Traffic offence is a growing concern at Dewan Bandaraya in Kuala Lumpur. A study was conducted to determine the profile of these traffic offenders. A researcher from this office collected data on the age, gender, race, types of offence, the amount of fine paid and the years of driving experience from a sample of traffic offenders as they entered the building to pay their fines. The researcher also checked the office database to obtain the number of traffic offences by these drivers.
Example 2 (cont.)
i) State the population for the above study.
ii) Is the above study a census study or sample study?
iii) Was any secondary data used for the above study? If there was, please state the data.
iv) State the variable(s) and measurement scale from this study.
v) What is the most suitable data collection method? Give ONE (1) advantage and ONE (1) disadvantage of this method.
Example 2: solution
i) State the population for the above study.
A: ALL the traffic offenders in K.L.
ii) Is the above study a census study or sample study?
A: Sample study
iii) Was any secondary data used for the above study? If there was, please state the data.
A: Yes. The number of traffic offences by these drivers, obtained from the office database
Example 2: solution (cont.)
iv) State the variable(s) and measurement scale from this study.
A:
Variable: Level of measurement
Age: Ratio
Gender: Nominal
Race: Nominal
Types of offence: Nominal
Amount of fine paid: Ratio
Years of driving experience: Ratio
v) What is the most suitable data collection method? Give ONE (1) advantage and ONE (1) disadvantage of this method.
A:
Method: Personal interview
Advantage: Higher response rate
Disadvantage: Expensive