
1A.1
Appendix 1A
Review of Some Basic Statistical Concepts
1A.2
Random Variable

random variable:
A variable whose value is unknown until it is observed.
The value of a random variable results from an experiment.

The term random variable implies the existence of some
known or unknown probability distribution defined over
the set of all possible values of that variable.

In contrast, an arbitrary variable does not have a
probability distribution associated with its values.
1A.3

Controlled experiment: the values
of the explanatory variables are chosen
with great care in accordance with
an appropriate experimental design.

Uncontrolled experiment: the values
of the explanatory variables consist of
nonexperimental observations over
which the analyst has no control.
1A.4
Discrete Random Variable
discrete random variable:
A discrete random variable can take only a finite
number of values, which can be counted using
the positive integers.

Example: Prize money from the following
lottery is a discrete random variable:
first prize: $1,000
second prize: $50
third prize: $5.75
It has only four (a finite number,
count: 1, 2, 3, 4) possible outcomes:
$0.00; $5.75; $50.00; $1,000.00
1A.5
Continuous Random Variable
continuous random variable:
A continuous random variable can take
any real value (not just whole numbers)
in at least one interval on the real line.

Examples:
Gross national product (GNP)
money supply
interest rates
price of eggs
household income
expenditure on clothing
1A.6
Dummy Variable

A discrete random variable that is restricted
to two possible values (usually 0 and 1) is
called a dummy variable (also, binary or
indicator variable).

Dummy variables account for qualitative differences:
gender (0=male, 1=female),
race (0=white, 1=nonwhite),
citizenship (0=U.S., 1=not U.S.),
income class (0=poor, 1=rich).
1A.7
A list of all of the possible values taken
by a discrete random variable along with
their chances of occurring is called a probability
function or probability density function (pdf).

die x f(x)
one dot 1 1/6
two dots 2 1/6
three dots 3 1/6
four dots 4 1/6
five dots 5 1/6
six dots 6 1/6
1A.8
A discrete random variable X
has pdf, f(x), which is the probability
that X takes on the value x.

f(xi) = P(X = xi)

Therefore, 0 ≤ f(xi) ≤ 1.

If X takes on the n values x1, x2, . . . , xn,
then f(x1) + f(x2) + . . . + f(xn) = 1.
For example, in a throw of one die:
(1/6) + (1/6) + (1/6) + (1/6) + (1/6) + (1/6) = 1
1A.9
In a throw of two dice, the discrete random variable X
(the sum of the two dice) has pdf:

x    =   2     3     4     5     6     7     8     9     10    11    12
f(x) =  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

The pdf f(x) can be presented graphically, with probability shown by the height of f(x) at each value of x.

[Figure: bar chart of f(x) over the possible outcomes X = 2, . . . , 12 of two dice]
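As a check, this pmf can be reproduced by enumerating the 36 equally likely outcomes of two dice. The short Python sketch below (our illustration, not part of the original slides) does exactly that and confirms the probabilities sum to 1.

```python
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely outcomes of two dice and
# tabulate the pmf of their sum X.
pmf = {}
for d1, d2 in product(range(1, 7), repeat=2):
    s = d1 + d2
    pmf[s] = pmf.get(s, Fraction(0)) + Fraction(1, 36)

for x in sorted(pmf):
    print(x, pmf[x])          # e.g. 7 -> 6/36

print(sum(pmf.values()))      # probabilities sum to 1
```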


1A.10
A continuous random variable uses
area under a curve rather than the
height, f(x), to represent probability.

[Figure: pdf f(x) of per capita income, X, in the United States, with two shaded regions (areas 0.8676 and 0.1324) marked at $34,000 and $55,000]


1A.11
Since a continuous random variable has an
uncountably infinite number of values,
the probability of any one particular value occurring is zero.

P[X = a] = P[a ≤ X ≤ a] = 0

Probability is represented by area.
Height alone has no area.
An interval for X is needed to get
an area under the curve.
1A.12
The area under a curve is the integral of
the equation that generates the curve:

P[a < X < b] = ∫[a, b] f(x) dx

For continuous random variables it is the
integral of f(x), and not f(x) itself, which
defines the area and, therefore, the probability.
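As an illustration only, the sketch below evaluates such an integral numerically; the exponential density and the interval endpoints are assumptions made purely for the example, not values from the slides.

```python
from scipy.integrate import quad
import numpy as np

# Example density (an assumption for illustration): an exponential
# pdf f(x) = lam * exp(-lam * x) with lam = 1/40000, standing in
# for an income-like distribution.
lam = 1.0 / 40000.0
f = lambda x: lam * np.exp(-lam * x)

# P[a < X < b] is the area under f between a and b.
prob, _ = quad(f, 34000, 55000)
print(round(prob, 4))
```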
1A.13
Rules of Summation

Rule 1:  Σ_{i=1}^{n} xi = x1 + x2 + . . . + xn

Rule 2:  Σ_{i=1}^{n} k·xi = k Σ_{i=1}^{n} xi

Rule 3:  Σ_{i=1}^{n} (xi + yi) = Σ_{i=1}^{n} xi + Σ_{i=1}^{n} yi

Note that summation is a linear operator,
which means it operates term by term.
1A.14
Rules of Summation (continued)

Rule 4:  Σ_{i=1}^{n} (a·xi + b·yi) = a Σ_{i=1}^{n} xi + b Σ_{i=1}^{n} yi

Rule 5:  x̄ = (1/n) Σ_{i=1}^{n} xi = (x1 + x2 + . . . + xn) / n

The definition of x̄ as given in Rule 5 implies
the following important fact:

Σ_{i=1}^{n} (xi − x̄) = 0
1A.15
Rules of Summation (continued)

Rule 6:  Σ_{i=1}^{n} f(xi) = f(x1) + f(x2) + . . . + f(xn)

Notation:  Σ_x f(x) = Σ_i f(xi) = Σ_{i=1}^{n} f(xi)

Rule 7:  Σ_{i=1}^{n} Σ_{j=1}^{m} f(xi, yj) = Σ_{i=1}^{n} [ f(xi, y1) + f(xi, y2) + . . . + f(xi, ym) ]

The order of summation does not matter:

Σ_{i=1}^{n} Σ_{j=1}^{m} f(xi, yj) = Σ_{j=1}^{m} Σ_{i=1}^{n} f(xi, yj)
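These rules are easy to verify numerically. The sketch below (an illustration with arbitrary data, not from the slides) checks Rules 4, 5, and 7.

```python
import numpy as np

# Minimal numerical check of the summation rules on arbitrary data.
x = np.array([2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.0, 0.5, 2.0, 1.5, 3.0])
a, b = 2.0, -3.0

# Rule 4: summation is linear (operates term by term).
print(np.isclose(np.sum(a * x + b * y), a * x.sum() + b * y.sum()))

# Rule 5 implication: deviations from the sample mean sum to zero.
print(np.isclose(np.sum(x - x.mean()), 0.0))

# Rule 7: the order of a double summation does not matter.
f = np.outer(x, y)                   # f(xi, yj) = xi * yj, for illustration
print(np.isclose(f.sum(axis=1).sum(), f.sum(axis=0).sum()))
```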
1A.16

The Mean of a Random Variable

The mean or arithmetic average of a
random variable is its mathematical
expectation or expected value, E(X).
1A.17
Expected Value
There are two entirely different, but mathematically
equivalent, ways of determining the expected value:

1. Empirically:
The expected value of a random variable, X,
is the average value of the random variable in an
infinite number of repetitions of the experiment.

In other words, draw an infinite number of samples,
and average the values of X that you get.
1A.18
Expected Value
2. Analytically:
The expected value of a discrete random
variable, X, is determined by weighting all
the possible values of X by the corresponding
probability density function values, f(x), and
summing them up.

In other words:

E(X) = x1f(x1) + x2f(x2) + . . . + xnf(xn)


1A.19
Empirical vs. Analytical
As sample size goes to infinity, the
empirical and analytical methods
will produce the same value.

In the empirical case, when the
sample goes to infinity the values
of X occur with a frequency
equal to the corresponding f(x)
in the analytical expression.
1A.20
Empirical (sample) mean:

x̄ = (1/n) Σ_{i=1}^{n} xi

where n is the number of sample observations.

Analytical mean:

E(X) = Σ_{i=1}^{n} xi f(xi)

where n is the number of possible values of xi.

Notice how the meaning of n changes.
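The convergence of the empirical mean to the analytical mean can be seen in a small simulation. The sketch below uses the discrete pdf that appears on slide 1A.22 and a large (not literally infinite) number of draws.

```python
import numpy as np

# Empirical vs. analytical mean for the discrete pdf used a few
# slides below (x = 0,...,4 with f(x) = .1, .3, .3, .2, .1).
x = np.array([0, 1, 2, 3, 4])
f = np.array([0.1, 0.3, 0.3, 0.2, 0.1])

analytical = np.sum(x * f)                      # E(X) = 1.9

rng = np.random.default_rng(0)
sample = rng.choice(x, size=1_000_000, p=f)     # "infinite" repetitions, approximated
empirical = sample.mean()

print(analytical, empirical)                    # the two agree to a few decimals
```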


1A.21
The expected value of X:

E(X) = Σ_{i=1}^{n} xi f(xi)

The expected value of X-squared:

E(X²) = Σ_{i=1}^{n} xi² f(xi)

It is important to notice that f(xi) does not change!

The expected value of X-cubed:

E(X³) = Σ_{i=1}^{n} xi³ f(xi)
1A.22
E(X)  = 0(.1) + 1(.3) + 2(.3) + 3(.2) + 4(.1)
      = 1.9

E(X²) = 0²(.1) + 1²(.3) + 2²(.3) + 3²(.2) + 4²(.1)
      = 0 + .3 + 1.2 + 1.8 + 1.6
      = 4.9

E(X³) = 0³(.1) + 1³(.3) + 2³(.3) + 3³(.2) + 4³(.1)
      = 0 + .3 + 2.4 + 5.4 + 6.4
      = 14.5
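The same three expectations can be reproduced with a few lines of code: weight each power of x by the same f(x) and sum.

```python
import numpy as np

# E(X), E(X^2), E(X^3) for the pdf on this slide.
x = np.array([0, 1, 2, 3, 4])
f = np.array([0.1, 0.3, 0.3, 0.2, 0.1])

print(np.sum(x * f))        # 1.9
print(np.sum(x**2 * f))     # 4.9
print(np.sum(x**3 * f))     # 14.5
```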
1A.23
E[g(X)] = Σ_{i=1}^{n} g(xi) f(xi)

g(X) = g1(X) + g2(X)

E[g(X)] = Σ_{i=1}^{n} [ g1(xi) + g2(xi) ] f(xi)

E[g(X)] = Σ_{i=1}^{n} g1(xi) f(xi) + Σ_{i=1}^{n} g2(xi) f(xi)

E[g(X)] = E[g1(X)] + E[g2(X)]


1A.24
Adding and Subtracting
Random Variables

E(X+Y) = E(X) + E(Y)

E(X-Y) = E(X) - E(Y)


1A.25

Adding a constant to a variable will
add a constant to its expected value:

E(X + a) = E(X) + a

Multiplying a variable by a constant will multiply
its expected value by that constant:

E(bX) = b E(X)
1A.26
Variance

var(X) = average squared deviations
around the mean of X.

var(X) = expected value of the squared deviations
around the expected value of X.

var(X) = E[(X − E(X))²]


1A.27
var(X) = E[(X − EX)²]

       = E[X² − 2X·EX + (EX)²]
       = E(X²) − 2·EX·EX + (EX)²
       = E(X²) − 2(EX)² + (EX)²
       = E(X²) − (EX)²

var(X) = E(X²) − (EX)²
1A.28

variance of a discrete
random variable, X:

var(X) = Σ_{i=1}^{n} (xi − EX)² f(xi)

The standard deviation is the square root of the variance.


1A.29
Calculate the variance for a
discrete random variable, X:

xi    f(xi)    xi − EX            (xi − EX)² f(xi)
2     .1       2 − 4.3 = −2.3     5.29 (.1) = .529
3     .3       3 − 4.3 = −1.3     1.69 (.3) = .507
4     .1       4 − 4.3 =  −.3      .09 (.1) = .009
5     .2       5 − 4.3 =   .7      .49 (.2) = .098
6     .3       6 − 4.3 =  1.7     2.89 (.3) = .867

EX = Σ_{i=1}^{n} xi f(xi) = .2 + .9 + .4 + 1.0 + 1.8 = 4.3

var(X) = Σ_{i=1}^{n} (xi − EX)² f(xi) = .529 + .507 + .009 + .098 + .867
       = 2.01
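The same calculation in code, including the shortcut formula var(X) = E(X²) − (EX)²:

```python
import numpy as np

# Variance of the discrete random variable on this slide.
x = np.array([2, 3, 4, 5, 6])
f = np.array([0.1, 0.3, 0.1, 0.2, 0.3])

ex = np.sum(x * f)                       # EX = 4.3
var = np.sum((x - ex)**2 * f)            # 2.01
var_shortcut = np.sum(x**2 * f) - ex**2  # same answer via E(X^2) - (EX)^2

print(ex, var, var_shortcut)
```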
1A.30

Z = a + cX

var(Z) = var(a + cX)
       = E[ ((a + cX) − E(a + cX))² ]
       = c² var(X)

var(a + cX) = c² var(X)
1A.31
Covariance

The covariance between two random
variables, X and Y, measures the
linear association between them.

cov(X,Y) = E[(X − EX)(Y − EY)]

Note that variance is a special case of covariance:

cov(X,X) = var(X) = E[(X − EX)²]
1A.32
cov(X,Y) = E[(X − EX)(Y − EY)]

         = E[XY − X·EY − Y·EX + EX·EY]
         = E(XY) − EX·EY − EY·EX + EX·EY
         = E(XY) − 2·EX·EY + EX·EY
         = E(XY) − EX·EY

cov(X,Y) = E(XY) − EX·EY
1A.33
Joint pdf of X and Y:

              Y=1    Y=2
X=0           .45    .15    .60
X=1           .05    .35    .40
              .50    .50

EX = 0(.60) + 1(.40) = .40
EY = 1(.50) + 2(.50) = 1.50
EX·EY = (.40)(1.50) = .60

E(XY) = (0)(1)(.45) + (0)(2)(.15) + (1)(1)(.05) + (1)(2)(.35) = .75

covariance:
cov(X,Y) = E(XY) − EX·EY
         = .75 − (.40)(1.50)
         = .75 − .60
         = .15
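The covariance can be computed directly from the joint table:

```python
import numpy as np

# cov(X,Y) computed from the joint pdf table on this slide.
x_vals = np.array([0, 1])
y_vals = np.array([1, 2])
joint = np.array([[0.45, 0.15],     # rows: X=0, X=1; columns: Y=1, Y=2
                  [0.05, 0.35]])

fx = joint.sum(axis=1)              # marginal pdf of X: [.60, .40]
fy = joint.sum(axis=0)              # marginal pdf of Y: [.50, .50]

ex = np.sum(x_vals * fx)            # .40
ey = np.sum(y_vals * fy)            # 1.50
exy = np.sum(np.outer(x_vals, y_vals) * joint)   # E(XY) = .75

print(exy - ex * ey)                # cov(X,Y) = .15
```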
1A.34
Joint pdf

A joint probability density function,
f(x,y), provides the probabilities
associated with the joint occurrence
of all of the possible pairs of X and Y.
1A.35
Survey of College City, NY

joint pdf f(x,y)                   college grads in household
                                     Y=1              Y=2

vacation homes       X=0      f(0,1) = .45     f(0,2) = .15
owned                X=1      f(1,1) = .05     f(1,2) = .35
1A.36

Calculating the expected value of
functions of two random variables:

E[g(X,Y)] = Σ_i Σ_j g(xi, yj) f(xi, yj)

E(XY) = Σ_i Σ_j xi yj f(xi, yj)

E(XY) = (0)(1)(.45) + (0)(2)(.15) + (1)(1)(.05) + (1)(2)(.35) = .75
1A.37
Marginal pdf

The marginal probability density functions,
f(x) and f(y), for discrete random variables,
can be obtained by summing f(x,y)
over the values of Y to obtain f(x), and
over the values of X to obtain f(y).

f(xi) = Σ_j f(xi, yj)          f(yj) = Σ_i f(xi, yj)
1A.38
                          Y=1      Y=2       marginal pdf for X:

X=0                       .45      .15       .60 = f(X=0)
X=1                       .05      .35       .40 = f(X=1)

marginal pdf for Y:       .50      .50
                        f(Y=1)   f(Y=2)
1A.39
Conditional pdf

The conditional probability density
functions of X given Y=y, f(x|y),
and of Y given X=x, f(y|x),
are obtained by dividing f(x,y) by f(y)
to get f(x|y), and by f(x) to get f(y|x).

f(x|y) = f(x,y) / f(y)          f(y|x) = f(x,y) / f(x)
1A.40
conditional pdfs, from the joint table and its marginals:

              Y=1     Y=2
X=0           .45     .15     .60
X=1           .05     .35     .40
              .50     .50

Conditioning on X:
f(Y=1|X=0) = .45/.60 = .75      f(Y=2|X=0) = .15/.60 = .25
f(Y=1|X=1) = .05/.40 = .125     f(Y=2|X=1) = .35/.40 = .875

Conditioning on Y:
f(X=0|Y=1) = .45/.50 = .90      f(X=0|Y=2) = .15/.50 = .30
f(X=1|Y=1) = .05/.50 = .10      f(X=1|Y=2) = .35/.50 = .70
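The marginal and conditional pdfs above follow mechanically from the joint table; a short sketch:

```python
import numpy as np

# Marginal and conditional pdfs from the joint table on this slide.
joint = np.array([[0.45, 0.15],    # rows: X=0, X=1; columns: Y=1, Y=2
                  [0.05, 0.35]])

fx = joint.sum(axis=1)             # marginal of X: [.60, .40]
fy = joint.sum(axis=0)             # marginal of Y: [.50, .50]

cond_y_given_x = joint / fx[:, None]   # f(y|x): each row sums to 1
cond_x_given_y = joint / fy[None, :]   # f(x|y): each column sums to 1

print(cond_y_given_x)   # [[.75, .25], [.125, .875]]
print(cond_x_given_y)   # [[.90, .30], [.10, .70]]
```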
1A.41
Independence

X and Y are independent random
variables if their joint pdf, f(x,y),
is the product of their respective
marginal pdfs, f(x) and f(y).

f(xi, yj) = f(xi) f(yj)

For independence this must hold for all pairs of i and j.
1A.42
not independent

                      Y=1                Y=2              marginal pdf for X:
X=0                   .45                .15              .60 = f(X=0)
                  (vs. .50×.60=.30)  (vs. .50×.60=.30)
X=1                   .05                .35              .40 = f(X=1)
                  (vs. .50×.40=.20)  (vs. .50×.40=.20)

marginal pdf for Y:   .50                .50
                    f(Y=1)             f(Y=2)

The calculations in parentheses show the values f(x)f(y) that the joint
probabilities would have to equal for X and Y to be independent.
1A.43
Correlation

The correlation between two random
variables X and Y is their covariance
divided by the square roots of their
respective variances.

ρ(X,Y) = cov(X,Y) / sqrt( var(X) var(Y) )

Correlation is a pure number falling between −1 and 1.
1A.44

              Y=1    Y=2
X=0           .45    .15    .60
X=1           .05    .35    .40
              .50    .50

EX = .40
E(X²) = 0²(.60) + 1²(.40) = .40
var(X) = E(X²) − (EX)²
       = .40 − (.40)²
       = .24

EY = 1.50
E(Y²) = 1²(.50) + 2²(.50) = .50 + 2.0 = 2.50
var(Y) = E(Y²) − (EY)²
       = 2.50 − (1.50)²
       = .25

cov(X,Y) = .15

correlation:
ρ(X,Y) = cov(X,Y) / sqrt( var(X) var(Y) )
       = .15 / sqrt( (.24)(.25) )
       = .61
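The full correlation calculation from the joint table, in code:

```python
import numpy as np

# Correlation of X and Y from the joint pdf table on this slide.
x_vals = np.array([0, 1])
y_vals = np.array([1, 2])
joint = np.array([[0.45, 0.15],
                  [0.05, 0.35]])

fx, fy = joint.sum(axis=1), joint.sum(axis=0)
ex, ey = np.sum(x_vals * fx), np.sum(y_vals * fy)
var_x = np.sum(x_vals**2 * fx) - ex**2                      # .24
var_y = np.sum(y_vals**2 * fy) - ey**2                      # .25
cov = np.sum(np.outer(x_vals, y_vals) * joint) - ex * ey    # .15

print(cov / np.sqrt(var_x * var_y))                         # about .61
```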
1A.45
Zero Covariance & Correlation

Independent random variables
have zero covariance and,
therefore, zero correlation.

The converse is not true.
1A.46
Since expectation is a linear operator,
it can be applied term by term.

The expected value of the weighted sum
of random variables is the sum of the
expectations of the individual terms.

E[c1X + c2Y] = c1·EX + c2·EY

In general, for random variables X1, . . . , Xn :

E[c1X1 + . . . + cnXn] = c1·EX1 + . . . + cn·EXn


1A.47
The variance of a weighted sum of random
variables is the sum of the variances, each times
the square of the weight, plus twice the covariances
of all the random variables times the products of
their weights.

Weighted sum of random variables:

var(c1X + c2Y) = c1² var(X) + c2² var(Y) + 2 c1c2 cov(X,Y)

Weighted difference of random variables:

var(c1X − c2Y) = c1² var(X) + c2² var(Y) − 2 c1c2 cov(X,Y)
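A quick numerical check of the weighted-sum formula (the covariance matrix and weights below are illustrative choices, not values from the slides):

```python
import numpy as np

# Check var(c1*X + c2*Y) = c1^2 var(X) + c2^2 var(Y) + 2 c1 c2 cov(X,Y)
# on simulated correlated data.
rng = np.random.default_rng(1)
cov_matrix = [[2.0, 0.8],
              [0.8, 1.0]]
x, y = rng.multivariate_normal([0.0, 0.0], cov_matrix, size=500_000).T

c1, c2 = 3.0, -2.0
C = np.cov(x, y)                                  # sample variances and covariance
lhs = np.var(c1 * x + c2 * y, ddof=1)
rhs = c1**2 * C[0, 0] + c2**2 * C[1, 1] + 2 * c1 * c2 * C[0, 1]
print(lhs, rhs)                                   # agree up to floating-point rounding
```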


1A.48
The Normal Distribution

Y ~ N(β, σ²)

f(y) = (1 / sqrt(2πσ²)) exp( −(y − β)² / (2σ²) )

[Figure: bell-shaped pdf f(y), centered at β]
1A.49
The Standardized Normal

Z = (Y − β) / σ

Z ~ N(0,1)

f(z) = (1 / sqrt(2π)) exp( −z² / 2 )

1A.50
Y ~ N(β, σ²)

[Figure: normal pdf f(y), with the area to the right of a shaded]

P[Y > a] = P[ (Y − β)/σ > (a − β)/σ ] = P[ Z > (a − β)/σ ]
1A.51
Y ~ N(β, σ²)

[Figure: normal pdf f(y), with the area between a and b shaded]

P[a < Y < b] = P[ (a − β)/σ < (Y − β)/σ < (b − β)/σ ]
             = P[ (a − β)/σ < Z < (b − β)/σ ]
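In practice these probabilities are read from a table or computed with software. The sketch below uses scipy.stats.norm with illustrative values of β, σ, a, and b (our choices, not from the slides).

```python
from scipy.stats import norm

# P[a < Y < b] for Y ~ N(beta, sigma^2) via standardization.
beta, sigma = 3.0, 2.0
a, b = 2.0, 5.0

z_a = (a - beta) / sigma
z_b = (b - beta) / sigma
print(norm.cdf(z_b) - norm.cdf(z_a))                                  # P[z_a < Z < z_b]
print(norm.cdf(b, loc=beta, scale=sigma) - norm.cdf(a, loc=beta, scale=sigma))  # same, without standardizing
```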
1A.52
Linear combinations of jointly
normally distributed random variables
are themselves normally distributed.

Y1 ~ N(β1, σ1²), Y2 ~ N(β2, σ2²), . . . , Yn ~ N(βn, σn²)

W = c1Y1 + c2Y2 + . . . + cnYn

W ~ N[ E(W), var(W) ]
1A.53
Chi-Square

If Z1, Z2, . . . , Zm denote m independent
N(0,1) random variables, and
V = Z1² + Z2² + . . . + Zm², then V ~ χ²(m)

V is chi-square with m degrees of freedom.

mean:      E[V] = E[χ²(m)] = m

variance:  var[V] = var[χ²(m)] = 2m
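A simulation check of these two moments, building the chi-square variable from squared N(0,1) draws (m = 5 is an arbitrary choice for illustration):

```python
import numpy as np

# The sum of m squared N(0,1) draws has mean m and variance 2m.
rng = np.random.default_rng(2)
m, reps = 5, 200_000

z = rng.standard_normal((reps, m))
v = np.sum(z**2, axis=1)

print(v.mean(), v.var())    # close to 5 and 10
```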
1A.54
Student-t

If Z ~ N(0,1) and V ~ χ²(m), and if Z and V
are independent, then

t = Z / sqrt(V/m) ~ t(m)

t is Student-t with m degrees of freedom.

mean:      E[t] = E[t(m)] = 0    (symmetric about zero)

variance:  var[t] = var[t(m)] = m/(m − 2)


1A.55
F Statistic

If V1 ~ χ²(m1) and V2 ~ χ²(m2), and if V1 and V2
are independent, then

F = (V1/m1) / (V2/m2) ~ F(m1, m2)

F is an F statistic with m1 numerator
degrees of freedom and m2 denominator
degrees of freedom.
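Both definitions can be checked by construction: build t and F variates from independent normal and chi-square pieces and compare a tail probability with scipy's distributions. The degrees of freedom below are illustrative choices.

```python
import numpy as np
from scipy.stats import t as t_dist, f as f_dist

rng = np.random.default_rng(3)
m1, m2, reps = 4, 8, 200_000

z  = rng.standard_normal(reps)
v1 = np.sum(rng.standard_normal((reps, m1))**2, axis=1)   # chi-square(m1)
v2 = np.sum(rng.standard_normal((reps, m2))**2, axis=1)   # chi-square(m2)

t_stat = z / np.sqrt(v1 / m1)                              # ~ t(m1)
f_stat = (v1 / m1) / (v2 / m2)                             # ~ F(m1, m2)

# Simulated tail probabilities vs. the exact distributions.
print(np.mean(t_stat > 2.0), 1 - t_dist.cdf(2.0, df=m1))
print(np.mean(f_stat > 3.0), 1 - f_dist.cdf(3.0, dfn=m1, dfd=m2))
```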
