What is Statistics?

Introduction to Basic Terms
Comparison of Probability and Statistics

What is Statistics?
Statistics: The science of collecting, describing, and interpreting data.

Descriptive Statistics
Descriptive Statistics: The collection, presentation, and description of sample data.

Inferential Statistics
Inferential Statistics: Making decisions and drawing conclusions about populations from sample data.

Population and Sample
Population: A collection, or set, of individuals, objects, or events whose properties are to be analyzed.
Two kinds of populations: finite or infinite.
Sample: A subset of the population.

Population and Sample - Examples
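The definitions of population and sample can be sketched in a few lines of Python. This is only an illustration: the list of ages, the fixed seed, and the sample size of 3 are made-up values, not data from the slides.

```python
import random

# Hypothetical finite population: the age of every member of a small group.
population = [23, 25, 31, 28, 35, 41, 22, 30, 27, 33]

# Draw a simple random sample of 3 members without replacement.
rng = random.Random(0)  # fixed seed so the run is repeatable
sample = rng.sample(population, 3)

# A sample is a subset of the population.
print(len(population), len(sample))
print(all(age in population for age in sample))
```

Any other subset of the list would also be a valid sample; random selection is just one common way to choose it.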
The population is the ages of all members at Biglabs.
A sample is any subset of that population. For example, we might select 10 members and determine their ages.

Variables
Variable: A characteristic of each individual element of a population or sample.
Examples:
The variable is the “age” of each member at Biglabs.
The variable is the “height” of each member at Biglabs.
The variable is the “weight” of each member at Biglabs.
The variable is whether each member at Biglabs is “handsome”.

Two kinds of variables
Qualitative, or Attribute, or Categorical, Variable: A variable that categorizes or describes an element of a population.
Note: Arithmetic operations, such as addition and averaging, are not meaningful for data resulting from a qualitative variable.
Example:
The variable is whether each member at Biglabs is “handsome”.

Two kinds of variables
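The note about arithmetic operations can be illustrated with a short Python sketch. The blood-type labels and heights below are invented sample data: category labels can be counted and the most frequent one identified, while averaging only makes sense for numerical measurements such as height.

```python
from collections import Counter
from statistics import mean, mode

# Hypothetical qualitative data: one category label per member.
blood_types = ["A", "O", "B", "O", "AB", "O", "A"]

# Counting categories and finding the most common one are meaningful.
counts = Counter(blood_types)
most_common = mode(blood_types)

# Hypothetical numerical data: heights in centimetres.
heights_cm = [170, 165, 180, 175]

# Averaging is meaningful for numerical measurements.
average_height = mean(heights_cm)

print(counts)
print(most_common)
print(average_height)
```

There is no sensible "average blood type", which is why the sketch only counts the labels.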
Quantitative, or Numerical, Variable: A variable that quantifies an element of a population.
Note: Arithmetic operations, such as addition and averaging, are meaningful for data resulting from a quantitative variable.
Examples:
The variable is the “height” of each member at Biglabs.
The variable is the “weight” of each member at Biglabs.

Measuring Variables
To establish relationships between variables, researchers must observe the variables and record their observations. This requires that the variables be measured.
The process of measuring a variable requires a set of categories called a scale of measurement and a process that classifies each individual into one category.

Measuring Variables
Nominal Variable: Variables that are “named”, i.e. classified into one of two or more qualitative categories that describe the characteristic of interest.
There is no ordering of the different categories.
There is no measure of distance between values.
Categories can be listed in any order without affecting the relationship between them.
Examples:
Gender (male, female)
Blood type (A, B, AB, O)

Measuring Variables
Ordinal Variable: Variables that have an inherent order to the relationship among the different categories.
Note: A common scale of measurement for ordinal variables is the Likert scale.
Examples:
Education level (elementary, secondary, college)
Agreement level (strongly disagree, disagree, neutral, agree, strongly agree)

Measuring Variables
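Because ordinal categories have an inherent order, they can be sorted and a middle category can be found. The sketch below uses the education levels from the example; the five responses and the helper names `LEVELS` and `RANK` are hypothetical.

```python
# Education levels, listed from lowest to highest.
LEVELS = ["elementary", "secondary", "college"]
RANK = {level: i for i, level in enumerate(LEVELS)}

# Hypothetical responses from five people.
responses = ["college", "elementary", "secondary", "college", "secondary"]

# Sorting by rank is meaningful because the categories are ordered.
ordered = sorted(responses, key=RANK.get)

# The median category is the middle element of the sorted list.
median_level = ordered[len(ordered) // 2]

print(ordered)
print(median_level)
```

The ranks encode order only. A difference between ranks, such as college minus secondary, carries no meaning on an ordinal scale.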
Interval Variable: Variables that have constant, equal distances between values, but the zero point is arbitrary.
Ratio Variable: Variables that have equal intervals between values and a meaningful zero point, so the numerical relationships between numbers are meaningful.

Measuring Variables
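Temperature is a standard illustration of the difference between interval and ratio scales, although it is not an example taken from the slides. In the sketch below, Celsius plays the role of an interval scale (arbitrary zero) and Kelvin the role of a ratio scale (zero is absolute zero); the two temperatures are arbitrary.

```python
def celsius_to_kelvin(c):
    """Convert a Celsius temperature to Kelvin."""
    return c + 273.15

cold_c, warm_c = 10.0, 20.0

# Differences are meaningful on both scales and agree with each other.
diff_c = warm_c - cold_c
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cold_c)

# Ratios are only meaningful on the ratio scale.
ratio_c = warm_c / cold_c  # 2.0, yet 20 degrees C is not "twice as hot"
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cold_c)

print(diff_c, round(diff_k, 2))
print(ratio_c, round(ratio_k, 3))
```

The Celsius ratio changes if the zero point is moved, which is why only the Kelvin ratio can be interpreted.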
Discrete Variable: A variable that can assume a countable number of values. Intuitively, a discrete variable can assume values corresponding to isolated points along a line interval. That is, there is a gap between any two values.
Continuous Variable: A variable that can assume an uncountable number of values. Intuitively, a continuous variable can assume any value along a line interval, including every possible value between any two values.

Random Variables
A random variable is a function or rule that assigns a number to each outcome of an experiment. Basically, it is just a symbol that represents the numerical outcome of an experiment.
Examples:
C = the daily change in a stock price.
R = the number of miles per gallon your auto gets during a family vacation.
V = the speed of an auto registered on a radar detector used on I-20.

Random Variables
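The idea of a random variable as a function from outcomes to numbers can be made concrete with a small sketch. The two-dice experiment and the name `X` are assumptions chosen for illustration; the slides do not prescribe this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Experiment: roll two fair six-sided dice. Each outcome is a pair of faces.
outcomes = list(product(range(1, 7), repeat=2))

def X(outcome):
    """Random variable: assign the sum of the two faces to an outcome."""
    return outcome[0] + outcome[1]

# Distribution of X, assuming all 36 outcomes are equally likely.
counts = Counter(X(o) for o in outcomes)
distribution = {
    value: Fraction(n, len(outcomes)) for value, n in sorted(counts.items())
}

print(distribution[7])
```

Here `X` takes only the values 2 through 12, so it has a countable set of possible values.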
Discrete random variables have a countable number of outcomes.
Examples: dead/alive, treatment/placebo, dice, counts, etc.
Continuous random variables have an infinite continuum of possible values.
Examples: blood pressure, weight, the speed of a car, the real numbers from 1 to 6.

Comparison of Probability and Statistics
Probability: Properties of the population are assumed known. Answer questions about the sample based on these properties.
Statistics: Use information in the sample to draw a conclusion about the population.

Comparison of Probability and Statistics
Example: A jar of M&M’s contains 100 candy pieces, 15 are red. A handful of 10 is selected.
Probability question: What is the probability that 3 of the 10 selected are red?
Example: A handful of 10 M&M’s is selected from a jar containing 1000 candy pieces. Three M&M’s in the handful are red.
Statistics question: What is the proportion of red M&M’s in the entire jar?

Questions
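The two M&M questions can be worked through in a short sketch. For the probability question, it assumes the handful is drawn at random without replacement, so the count of red pieces follows a hypergeometric distribution. For the statistics question, it uses the sample proportion as a point estimate, which is one natural choice and not a method stated in the slides. The function name `hypergeom_pmf` is hypothetical.

```python
from math import comb

def hypergeom_pmf(k, population_size, successes, draws):
    """P(exactly k successes) when drawing without replacement."""
    return (
        comb(successes, k)
        * comb(population_size - successes, draws - k)
        / comb(population_size, draws)
    )

# Probability question: the jar is known (100 pieces, 15 red).
p_three_red = hypergeom_pmf(3, 100, 15, 10)
print(round(p_three_red, 4))

# Statistics question: only the handful is known (3 red out of 10).
estimated_proportion = 3 / 10
print(estimated_proportion)
```

Under these assumptions the probability of exactly three red pieces is about 0.13, and the handful suggests that roughly 30% of the jar is red. An estimate from only 10 pieces is quite uncertain.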