Outline
1 Definitions and Notation
What is Probability?
Notation and Definitions
Marginal, Joint and Conditional Probability
2 Random Variables and Distributions
What is a Random Variable?
Discrete and Continuous Distributions
Marginal, Joint, and Conditional Distributions
3 Expectation and Transformations
Expectation and Variance
Conditional Expectation and Variance
4 Elementary Asymptotics
Convergence of a Sequence
Convergence in Probability
Convergence in Distribution
5 Some Important Distributions
Intuitive Definition
While there are several interpretations of what probability is,
most modern (post 1935 or so) researchers agree on an
axiomatic definition of probability.
Subjective Interpretation
Frequency Interpretation
If you want to explore this debate further, check out this article
in the Stanford Encyclopedia of Philosophy.
http://plato.stanford.edu/entries/probability-interpret/
A = {a1, a2, a3}.
If A is a subset of B we write A ⊂ B.
The union of two sets A and B, written A ∪ B, is the set containing every element that is in A, in B, or in both.
Sample Spaces
Events
Events are subsets of the sample space.
For example, if
Ω = { {heads, heads}, {heads, tails}, {tails, heads}, {tails, tails} },
then each of the following is an event:
∅
{ {heads, heads}, {heads, tails}, {tails, tails} }
{ {heads, tails} }
Probability Function
Conditional Probability
P(A|B) = P(A, B) / P(B)
This implies that
P(A) = 4/52
P(B|A) = 3/51
P(A, B) = P(A) × P(B|A) = 4/52 × 3/51
Question: P(B) =?
a) 3/51
b) 4/52
c) 4/51
d) not enough information
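The numbers above suggest the reading A = "the first card drawn is an ace" and B = "the second card drawn is an ace" (that reading is an assumption here, not stated explicitly above). Under it, a quick sketch with exact fractions answers the question via the law of total probability:

```python
from fractions import Fraction

# Assumed reading: A = first card is an ace, B = second card is an ace.
p_A = Fraction(4, 52)             # P(A)
p_B_given_A = Fraction(3, 51)     # P(B|A): one ace already removed
p_B_given_Ac = Fraction(4, 51)    # P(B|A^c): all four aces still in the deck

# Law of total probability: P(B) = P(B|A)P(A) + P(B|A^c)P(A^c)
p_B = p_B_given_A * p_A + p_B_given_Ac * (1 - p_A)
print(p_B)  # 1/13, the same as 4/52
```

Perhaps surprisingly, P(B) = P(A): before anything is observed about the first card, the second card is equally likely to be any of the 52.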
With 2 Events:
Also, if P(A) > 0 and P(B) > 0, then we can write the following.

P(A|B) = P(A)P(B|A) / P(B)

P(A|B) = P(A)P(B|A) / [P(B|A) × P(A) + P(B|Aᶜ) × P(Aᶜ)]
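Plugging hypothetical numbers (chosen here purely for illustration, not taken from the slides) into Bayes' rule with the expanded denominator shows how the pieces fit together:

```python
from fractions import Fraction

# Hypothetical inputs: P(A) = 1/100, P(B|A) = 9/10, P(B|A^c) = 1/10.
p_A = Fraction(1, 100)
p_B_given_A = Fraction(9, 10)
p_B_given_Ac = Fraction(1, 10)

# Expanded denominator: P(B) = P(B|A)P(A) + P(B|A^c)P(A^c)
p_B = p_B_given_A * p_A + p_B_given_Ac * (1 - p_A)

# Bayes' rule: P(A|B) = P(A)P(B|A) / P(B)
p_A_given_B = p_A * p_B_given_A / p_B
print(p_A_given_B)  # 1/12
```

Even with a fairly accurate signal (P(B|A) = 9/10), the posterior stays modest because the prior P(A) is small.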
Gov2000: Quantitative Methodology for Political Science I
a) < 1/3
b) between 1/3 and 2/3
c) > 2/3
d) not enough information
Independence
Intuitive Definition
Events A and B are independent if knowing whether A occurred
provides no information about whether B occurred.
Formal Definition
P(AB) = P(A)P(B) =⇒ A ⊥ B
With all the usual > 0 restrictions, this implies
P(A|B) = P(A)
P(B|A) = P(B)
Conditional Independence
Intuitive Definition
Events A and B are conditionally independent given C, if
knowing whether C occurred and knowing whether A occurred
provides no information about whether B occurred.
Formal Definition
With P(C) > 0, we can write

P(A, B|C) = P(A, B, C) / P(C)

and we say that A is conditionally independent of B given C (A ⊥ B | C) if

P(A, B|C) = P(A|C)P(B|C).
Discrete Distributions
[Figure: PMF f(x) (top) and CDF F(x) (bottom) of a discrete random variable.]
a) F (1)
b) F (2)
c) F (1) − F (0)
d) F (2) − F (1)
Continuous Distributions
For example,

f(x) = 1/4 for 0 < x < 4, and 0 otherwise,

or, equivalently,

f(x) = 1/4 for 0 ≤ x ≤ 4, and 0 otherwise.
Think of densities as infinite data histograms.
For example,

F(x) = 0 for x < 0, x/4 for 0 ≤ x < 4, and 1 for x ≥ 4.

[Figure: the CDF F(x) plotted over the interval 0 to 4.]
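A small numerical sketch (not from the slides) checking that this CDF really is the integral of the density f(x) = 1/4 on its support:

```python
# Midpoint-rule check that F(x) agrees with the integral of f up to x.
def f(x):
    return 0.25 if 0 <= x <= 4 else 0.0

def F(x):
    if x < 0:
        return 0.0
    return x / 4 if x < 4 else 1.0

def integral_of_f(x, steps=100_000):
    lo = -1.0                      # start below the support, where f is 0
    dx = (x - lo) / steps
    return sum(f(lo + (i + 0.5) * dx) for i in range(steps)) * dx

for x in (0.5, 1.0, 2.0, 3.5, 5.0):
    assert abs(integral_of_f(x) - F(x)) < 1e-4
```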
Example:

                 Y = 1   Y = 2   Y = 3   fX(x)
        X = 1     0.22    0.04    0.09    0.35
        X = 2     0.15    0.10    0.20    0.45
        X = 3     0.01    0.07    0.12    0.20
        fY(y)     0.38    0.21    0.41    1.00
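The marginal row and column of the table can be recomputed from the interior cells by summing over the other variable; a sketch:

```python
# Joint PMF from the table above; keys are (x, y) pairs.
joint = {
    (1, 1): 0.22, (1, 2): 0.04, (1, 3): 0.09,
    (2, 1): 0.15, (2, 2): 0.10, (2, 3): 0.20,
    (3, 1): 0.01, (3, 2): 0.07, (3, 3): 0.12,
}

# Marginal PMFs: sum the joint PMF over the other variable.
f_X = {x: sum(joint[(x, y)] for y in (1, 2, 3)) for x in (1, 2, 3)}
f_Y = {y: sum(joint[(x, y)] for x in (1, 2, 3)) for y in (1, 2, 3)}

assert [round(f_X[x], 2) for x in (1, 2, 3)] == [0.35, 0.45, 0.20]
assert [round(f_Y[y], 2) for y in (1, 2, 3)] == [0.38, 0.21, 0.41]
```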
[Figure: 3-D bar plot of the joint PMF f(x, y).]
[Figure: the joint PMF f(x, y) and the marginal PMF f(x).]
fX|Y(x|y) = fX,Y(x, y) / fY(y)

where it is assumed that fY(y) > 0. It follows that
Joint PMF fX,Y(x, y):

                 Y = 1   Y = 2   Y = 3   fX(x)
        X = 1     0.22    0.04    0.09    0.35
        X = 2     0.15    0.10    0.20    0.45
        X = 3     0.01    0.07    0.12    0.20
        fY(y)     0.38    0.21    0.41    1.00

Conditional PMF fX|Y(x|y) (each column sums to 1):

                 Y = 1   Y = 2   Y = 3
        X = 1     0.58    0.19    0.22
        X = 2     0.39    0.48    0.49
        X = 3     0.03    0.33    0.29
                  1.00    1.00    1.00
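The conditional table can be reproduced by dividing each joint cell by its column's marginal; a sketch:

```python
# Joint PMF from the tables above.
joint = {
    (1, 1): 0.22, (1, 2): 0.04, (1, 3): 0.09,
    (2, 1): 0.15, (2, 2): 0.10, (2, 3): 0.20,
    (3, 1): 0.01, (3, 2): 0.07, (3, 3): 0.12,
}
f_Y = {y: sum(joint[(x, y)] for x in (1, 2, 3)) for y in (1, 2, 3)}

# f_{X|Y}(x|y) = f_{X,Y}(x, y) / f_Y(y)
f_X_given_Y = {(x, y): joint[(x, y)] / f_Y[y] for (x, y) in joint}

# Matches the rounded entries of the conditional table...
assert round(f_X_given_Y[(1, 1)], 2) == 0.58
assert round(f_X_given_Y[(2, 2)], 2) == 0.48
# ...and each column sums to 1.
for y in (1, 2, 3):
    assert abs(sum(f_X_given_Y[(x, y)] for x in (1, 2, 3)) - 1) < 1e-9
```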
fY|X(y|x) = fY,X(y, x) / fX(x)

where it is assumed that fX(x) > 0.
[Figure: a joint density f(x, y) over the unit square and the conditional densities f(y|x) it implies.]
Marginal Density

[Figure: the marginal density f(y) shown alongside conditional densities f(y|x) for several values of x.]
Expectation
The expected value of a random variable X is denoted by E[X ]
and is a measure of central tendency of X . Roughly speaking,
an expected value is like a weighted average.
The expected value of a discrete random variable X is defined
as
E[X] = ∑ x fX(x), where the sum runs over all x.
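As a sketch with a made-up PMF (not one from the slides), the weighted-average computation is just a sum over the support:

```python
# Hypothetical PMF: X takes the values 1, 2, 3.
pmf = {1: 0.2, 2: 0.5, 3: 0.3}

# E[X] = sum over all x of x * f_X(x)
E_X = sum(x * p for x, p in pmf.items())
assert abs(E_X - 2.1) < 1e-9   # 1(0.2) + 2(0.5) + 3(0.3) = 2.1
```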
[Figure: two PMFs f(x), illustrating the expected value as a weighted average of the support points.]
Example

[Figure: histograms of three small samples on the range 2 to 7, each with its mean.]
E[aX ] = aE[X ]
E[b] = b
E[aX + b] = aE[X ] + b
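These three properties can be checked numerically on any PMF; a sketch with a hypothetical PMF and hypothetical constants a and b:

```python
# Hypothetical PMF and constants, for illustration only.
pmf = {1: 0.2, 2: 0.5, 3: 0.3}
a, b = 3.0, -1.0

E_X = sum(x * p for x, p in pmf.items())
E_b = sum(b * p for p in pmf.values())
E_aX = sum(a * x * p for x, p in pmf.items())
E_aXb = sum((a * x + b) * p for x, p in pmf.items())

assert abs(E_aX - a * E_X) < 1e-9          # E[aX] = aE[X]
assert abs(E_b - b) < 1e-9                 # E[b] = b
assert abs(E_aXb - (a * E_X + b)) < 1e-9   # E[aX + b] = aE[X] + b
```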
Expectation Question
a) µ/n
b) nµ
c) µ
Variance
The expected value of a function g(X) of the random variable X is denoted by E[g(X)] and is a measure of central tendency of g(X). In particular, the variance of X is the expectation of g(X) = (X − E[X])²: V[X] = E[(X − E[X])²].
[Figure: two densities f(x) centered at the same value but with different variances.]
Sample Variance
[Figure: a histogram of a small sample on the range 2 to 6, illustrating its spread around the mean.]
Variance Question
a) σ²/n
b) nσ²
c) σ²
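A simulation sketch (standard normal draws, so σ² = 1, an assumption made here for concreteness) illustrating that the variance of the sample mean is σ²/n:

```python
import random
import statistics

random.seed(0)
n, reps = 25, 20_000

# Variance of the mean of n iid draws with sigma^2 = 1 should be near 1/n.
means = [statistics.fmean(random.gauss(0, 1) for _ in range(n))
         for _ in range(reps)]
var_of_mean = statistics.pvariance(means)
assert abs(var_of_mean - 1 / n) < 0.005    # 1/25 = 0.04
```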
Conditional Expectation
E[Y|x] = ∫ y fY|X(y|x) dy, with the integral running from −∞ to ∞.
Marginal Density

[Figure: the marginal density f(y) and conditional densities f(y|x); the mean of each conditional density is E[Y|x].]
[Figure: left, the single point (E[X], E[Y]); right, the conditional expectation E[Y|X] traced across values of x.]
Conditional Variance
Likewise, we can define the conditional variance of Y given
X = x (denoted V [Y |x]) to be the variance of Y under the
conditional distribution of Y given X = x.
Marginal Density

[Figure: the marginal density f(y) and conditional densities f(y|x); the spread of each conditional density is V[Y|x].]
We say a sequence of numbers cn converges to a limit c, and write cn → c, if the terms eventually get and stay arbitrarily close to c.
Example
If cn is 1 + 1/n, then cn → 1.
[Figure: the sequence cn = 1 + 1/n plotted for n = 1 to 100, approaching 1.]
P(|Xn − θ| > ε) → 0 as n → ∞, for every ε > 0
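A simulation sketch (the choice of Uniform(0, 4) draws is an assumption for illustration): take Xn to be the mean of n Uniform(0, 4) draws, which converges in probability to θ = 2 by the law of large numbers, so the exceedance probability P(|Xn − 2| > ε) should shrink as n grows.

```python
import random

random.seed(1)

def prob_outside(n, eps=0.25, reps=5_000):
    """Estimate P(|Xn - 2| > eps), where Xn is a mean of n Uniform(0,4) draws."""
    hits = 0
    for _ in range(reps):
        xbar = sum(random.uniform(0, 4) for _ in range(n)) / n
        if abs(xbar - 2) > eps:
            hits += 1
    return hits / reps

p10, p100 = prob_outside(10), prob_outside(100)
assert p100 < p10   # the probability of being far from 2 shrinks with n
```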
[Figure: densities f(Xn) for three increasing values of n, concentrating around a single point.]
Convergence Question
Question: Does Xn appear to be converging in probability to 2?
[Figure: densities f(Xn) for n = 1, 10, and 100.]
[Figure: densities f(Xn) over the range 0 to 8 for three values of n.]
[Figure: densities of the normal distributions N(0, 1), N(2, 1), and N(0, 0.25).]
fN(x|µ, Σ) = (2π)^(−d/2) |Σ|^(−1/2) exp[ −(1/2)(x − µ)′ Σ^(−1) (x − µ) ]
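A sketch implementing this formula directly for d = 2 in plain Python (no linear-algebra library); at x = µ with Σ equal to the identity, the density should be (2π)^(−1):

```python
import math

def mvn_density(x, mu, Sigma):
    """Bivariate normal density computed straight from the formula (d = 2)."""
    d = 2
    (a, b), (c, e) = Sigma
    det = a * e - b * c
    inv = [[e / det, -b / det], [-c / det, a / det]]   # 2x2 matrix inverse
    z = [x[0] - mu[0], x[1] - mu[1]]
    quad = sum(z[i] * inv[i][j] * z[j] for i in range(2) for j in range(2))
    return (2 * math.pi) ** (-d / 2) * det ** (-0.5) * math.exp(-0.5 * quad)

val = mvn_density([0.0, 0.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
assert abs(val - 1 / (2 * math.pi)) < 1e-12
```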
fχ²(x|ν) = [2^(−ν/2) / Γ(ν/2)] x^(ν/2 − 1) exp(−x/2) for x > 0,

where Γ(z) = ∫₀^∞ t^(z−1) exp(−t) dt (if z is a positive integer then Γ(z) = (z − 1)!).
The mean of a chi-square random variable is ν, its variance is
2ν, and (when ν ≥ 2) its modal value is ν − 2.
The parameter ν is referred to as the degrees of freedom.
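A simulation sketch of these moments, building χ²ν draws as sums of ν squared standard normals:

```python
import random
import statistics

random.seed(2)
nu, reps = 4, 50_000

# A chi-square(nu) variable is a sum of nu squared independent standard normals.
draws = [sum(random.gauss(0, 1) ** 2 for _ in range(nu)) for _ in range(reps)]

assert abs(statistics.fmean(draws) - nu) < 0.1          # mean is about nu
assert abs(statistics.pvariance(draws) - 2 * nu) < 0.5  # variance is about 2*nu
```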
[Figure: chi-square densities with ν = 1, 4, and 15 degrees of freedom.]
The t Distribution
ft(x|ν) = [Γ((ν + 1)/2) / (√(πν) Γ(ν/2))] × (1 + x²/ν)^(−(ν+1)/2)
[Figure: t densities with ν = 1, 4, and 15 degrees of freedom.]
If Z ~ N(0, 1) and Y ~ χ²ν are independent, then

X ≡ Z / √(Y/ν)

follows a tν distribution.
If a sample (X1 , . . . , Xn ) of any size n is taken from a
normal distribution with zero mean and unknown variance
then the sampling distribution of the sample mean divided
by the sample standard error will have the t distribution
with ν = n − 1.
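A simulation sketch of the ratio construction above; a tν variable has mean 0 and variance ν/(ν − 2) for ν > 2, which the simulated draws should roughly match:

```python
import math
import random
import statistics

random.seed(3)
nu, reps = 10, 50_000

draws = []
for _ in range(reps):
    z = random.gauss(0, 1)                               # Z ~ N(0, 1)
    y = sum(random.gauss(0, 1) ** 2 for _ in range(nu))  # Y ~ chi-square(nu)
    draws.append(z / math.sqrt(y / nu))                  # t_nu by construction

assert abs(statistics.fmean(draws)) < 0.03
assert abs(statistics.pvariance(draws) - nu / (nu - 2)) < 0.1
```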
The F Distribution
[Figure: F densities for F(1, 2), F(5, 5), F(30, 20), and F(500, 200).]