Analysis
Notations
Notes:
In the pLSA model, $\Theta$, $\theta_m$ and $\theta_{m,k}$ are parameters rather than random variables.
We can represent $\phi_1 \cdots \phi_k \cdots \phi_K$ as a $K \times V$ matrix $\Phi$.
Notes:
In the pLSA model, $\Phi$, $\phi_k$ and $\phi_{k,v}$ are parameters rather than random variables.
Based on the generative process of pLSA, we can represent the pLSA model using
"collapsed" plate notation.
For easier understanding, the corresponding "expanded" model is also shown.
EM for pLSA
If Maximum Likelihood Estimation (MLE) is used to estimate the parameters of the
pLSA model, we need to find the parameters that maximize the probability of the observed
data, that is, of all words $W$ in the documents. Equivalently, we maximize the
log-likelihood of $W$ with respect to the parameters $\Theta$ and $\Phi$:
$$
\ell(\Theta, \Phi) = \log p(W; \Theta, \Phi) = \log\Big(\prod_{m=1}^{M} \prod_{n=1}^{N_m} p(w_{m,n}; \Theta, \Phi)\Big)
$$
Here $w_{m,n}$ is the $n$-th word in the $m$-th document, so the pair $(m, n)$ identifies the
position of the word in the documents. Words in different positions (corresponding to
different values of $m$ and $n$) can be instances of the same word $v$ in the dictionary.
Let $c_{m,v}$ denote the number of words in the $m$-th document that are equal to $v$, so
that $\sum_{v=1}^{V} c_{m,v} = N_m$. The values $c_{m,v}$ can be stored in an $M \times V$
matrix $C$. The word $v$ is observed with probability $p(v; \Theta, \Phi)$, and the
corresponding topic of $v$ is $z$.
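$$
\begin{aligned}
\ell(\Theta, \Phi) &= \log\Big(\prod_{m=1}^{M} \prod_{v=1}^{V} p(v; \Theta, \Phi)^{c_{m,v}}\Big) \\
&= \sum_{m=1}^{M} \sum_{v=1}^{V} \Big(c_{m,v} \times \log p(v; \Theta, \Phi)\Big) \\
&= \sum_{m=1}^{M} \sum_{v=1}^{V} \Big(c_{m,v} \times \log \sum_{z} p(v, z; \Theta, \Phi)\Big) \\
&= \sum_{m=1}^{M} \sum_{v=1}^{V} \Big(c_{m,v} \times \log \sum_{k=1}^{K} p(v, z = k; \Theta, \Phi)\Big)
\end{aligned}
$$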
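The count matrix $C$ used in the sums above can be built directly from tokenized documents. The following is a minimal sketch in plain Python; the helper name `build_count_matrix` and the toy corpus are hypothetical and only serve as an illustration.

```python
from collections import Counter

def build_count_matrix(docs, vocab):
    """Return C as M rows of length V, where C[m][v] = c_{m,v}."""
    index = {word: v for v, word in enumerate(vocab)}
    C = []
    for doc in docs:
        # Count how often each dictionary index occurs in this document.
        counts = Counter(index[w] for w in doc)
        C.append([counts.get(v, 0) for v in range(len(vocab))])
    return C

# Hypothetical toy corpus: M = 2 documents over a dictionary of V = 3 words.
vocab = ["apple", "banana", "cherry"]
docs = [["apple", "banana", "apple"], ["cherry", "banana"]]
C = build_count_matrix(docs, vocab)
print(C)                      # [[2, 1, 0], [0, 1, 1]]
print([sum(r) for r in C])    # row sums equal the document lengths N_m: [3, 2]
```

Working with $C$ instead of individual word positions means every sum runs over $M \times V$ entries rather than over all tokens.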
Based on the generative process of pLSA, in the $m$-th document we have
$$
\begin{aligned}
p(z = k; \theta_m) &= \theta_{m,k} \\
p(v \mid z = k; \phi_k) &= \phi_{k,v} \\
p(v, z = k; \Theta, \Phi) &= \theta_{m,k} \times \phi_{k,v}
\end{aligned}
$$
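$$
\begin{aligned}
\ell(\Theta, \Phi) &= \sum_{m=1}^{M} \sum_{v=1}^{V} \Big(c_{m,v} \times \log \sum_{k=1}^{K} p(v, z = k; \Theta, \Phi)\Big) \\
&= \sum_{m=1}^{M} \sum_{v=1}^{V} \Big(c_{m,v} \times \log \sum_{k=1}^{K} \theta_{m,k} \times \phi_{k,v}\Big) \\
&= \sum_{m=1}^{M} \sum_{v=1}^{V} \Big(c_{m,v} \times \log \sum_{k=1}^{K} \Big(Q_{m,v}(z = k) \times \frac{\theta_{m,k} \times \phi_{k,v}}{Q_{m,v}(z = k)}\Big)\Big) \\
&\geq \sum_{m=1}^{M} \sum_{v=1}^{V} \Big(c_{m,v} \times \sum_{k=1}^{K} \Big(Q_{m,v}(z = k) \times \log \frac{\theta_{m,k} \times \phi_{k,v}}{Q_{m,v}(z = k)}\Big)\Big) \\
&= L(\Theta, \Phi)
\end{aligned}
$$
Here $Q_{m,v}(z = k)$ is any distribution over the $K$ topics, with $\sum_{k=1}^{K} Q_{m,v}(z = k) = 1$.
The inequality is Jensen's inequality applied to the concave function $\log$.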
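To make the bound concrete, the sketch below evaluates both $\ell(\Theta, \Phi)$ and $L(\Theta, \Phi)$ for a small example and checks that the lower bound does not exceed the log-likelihood. The function names and the numeric values of `C`, `theta`, `phi` and `Q` are illustrative assumptions, not taken from the text; `Q` is deliberately chosen as a uniform distribution rather than the optimal one.

```python
import math

def log_likelihood(C, theta, phi):
    """l(Theta, Phi) = sum_m sum_v c_{m,v} * log sum_k theta_{m,k} * phi_{k,v}."""
    total = 0.0
    for m, row in enumerate(C):
        for v, c in enumerate(row):
            if c > 0:  # words that do not occur contribute nothing
                p_v = sum(theta[m][k] * phi[k][v] for k in range(len(phi)))
                total += c * math.log(p_v)
    return total

def lower_bound(C, theta, phi, Q):
    """L(Theta, Phi) for a given Q[m][v][k]; terms with Q = 0 are skipped."""
    total = 0.0
    for m, row in enumerate(C):
        for v, c in enumerate(row):
            for k in range(len(phi)):
                q = Q[m][v][k]
                if c > 0 and q > 0:
                    total += c * q * math.log(theta[m][k] * phi[k][v] / q)
    return total

# Hypothetical values: M = 2 documents, V = 3 words, K = 2 topics.
C = [[2, 1, 0], [0, 1, 1]]
theta = [[0.7, 0.3], [0.2, 0.8]]            # each row sums to 1
phi = [[0.5, 0.3, 0.2], [0.1, 0.4, 0.5]]    # each row sums to 1
Q = [[[0.5, 0.5] for _ in range(3)] for _ in range(2)]  # uniform over topics

ll = log_likelihood(C, theta, phi)
lb = lower_bound(C, theta, phi, Q)
print(lb <= ll)  # Jensen's inequality says this should be True
```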
E-step
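In the E-step of the EM method, we compute the value of $Q_{m,v}(z = k)$ for which the
lower bound $L(\Theta, \Phi)$ is equal to $\ell(\Theta, \Phi)$, as follows:
$$
\begin{aligned}
Q_{m,v}(z = k) &= \frac{p(v, z = k; \Theta, \Phi)}{\sum_{k'=1}^{K} p(v, z = k'; \Theta, \Phi)} \\
&= \frac{\theta_{m,k} \times \phi_{k,v}}{\sum_{k'=1}^{K} \theta_{m,k'} \times \phi_{k',v}}
\end{aligned}
$$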
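A short check of why this choice makes the bound tight may help. Substituting the E-step value of $Q_{m,v}(z = k)$ into the ratio inside the logarithm gives a quantity that no longer depends on $k$, so the inner sum collapses (using only $\sum_{k} Q_{m,v}(z = k) = 1$):

```latex
\begin{aligned}
\frac{\theta_{m,k} \times \phi_{k,v}}{Q_{m,v}(z = k)}
  &= \sum_{k'=1}^{K} \theta_{m,k'} \times \phi_{k',v} \\
\sum_{k=1}^{K} Q_{m,v}(z = k) \times \log \frac{\theta_{m,k} \times \phi_{k,v}}{Q_{m,v}(z = k)}
  &= \Big(\sum_{k=1}^{K} Q_{m,v}(z = k)\Big) \times \log \sum_{k'=1}^{K} \theta_{m,k'} \times \phi_{k',v} \\
  &= \log \sum_{k'=1}^{K} \theta_{m,k'} \times \phi_{k',v}
\end{aligned}
```

Multiplying by $c_{m,v}$ and summing over $m$ and $v$ recovers $\ell(\Theta, \Phi)$ exactly.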
Notes:
$$
\frac{p(v, z = k; \Theta, \Phi)}{\sum_{k'=1}^{K} p(v, z = k'; \Theta, \Phi)} = \frac{p(v, z = k; \Theta, \Phi)}{p(v; \Theta, \Phi)} = p(z = k \mid v; \Theta, \Phi)
$$
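The E-step can be sketched in a few lines of plain Python. The function name `e_step` and the nested-list layout `Q[m][v][k]` are assumptions made for this illustration, and the parameter values are hypothetical.

```python
def e_step(theta, phi):
    """Return Q with Q[m][v][k] = p(z = k | v) in document m."""
    K = len(phi)
    V = len(phi[0])
    Q = []
    for theta_m in theta:
        Q_m = []
        for v in range(V):
            # Joint p(v, z = k) = theta_{m,k} * phi_{k,v} for each topic k.
            joint = [theta_m[k] * phi[k][v] for k in range(K)]
            p_v = sum(joint)  # marginal p(v); assumed strictly positive here
            Q_m.append([j / p_v for j in joint])
        Q.append(Q_m)
    return Q

# Hypothetical parameters: M = 2 documents, K = 2 topics, V = 3 words.
theta = [[0.7, 0.3], [0.2, 0.8]]
phi = [[0.5, 0.3, 0.2], [0.1, 0.4, 0.5]]
Q = e_step(theta, phi)
print(Q[0][0])  # posterior over the two topics for word 0 in document 0
```

For document 0 and word 0 the joint values are 0.35 and 0.03, so the first topic receives most of the posterior mass.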
M-step
In the M-step of the EM method, we fix $Q_{m,v}(z = k)$ at the value computed in
the E-step and maximize the lower bound $L(\Theta, \Phi)$ with respect to $\Theta$ and $\Phi$.
Notes:
Each $Q_{m,v}(z = k)$ in the M-step is a fixed value rather than a variable.
Because $\sum_{k=1}^{K} \theta_{m,k} = 1$ and $\sum_{v=1}^{V} \phi_{k,v} = 1$, we can use the
method of Lagrange multipliers.
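$$
\begin{aligned}
L(\Theta, \Phi) = {} & \sum_{m=1}^{M} \sum_{v=1}^{V} \Big(c_{m,v} \times \sum_{k=1}^{K} \Big(Q_{m,v}(z = k) \times \log \frac{\theta_{m,k} \times \phi_{k,v}}{Q_{m,v}(z = k)}\Big)\Big) \\
& + \sum_{k=1}^{K} \lambda_k \Big(1 - \sum_{v=1}^{V} \phi_{k,v}\Big) \\
& + \sum_{m=1}^{M} \rho_m \Big(1 - \sum_{k=1}^{K} \theta_{m,k}\Big)
\end{aligned}
$$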
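Setting the partial derivative with respect to $\phi_{k,v}$ to zero gives
$$
\frac{\partial L}{\partial \phi_{k,v}} = \frac{1}{\phi_{k,v}} \sum_{m=1}^{M} \big(c_{m,v} \times Q_{m,v}(z = k)\big) - \lambda_k = 0
$$
Multiplying both sides by $\phi_{k,v}$:
$$
\sum_{m=1}^{M} \big(c_{m,v} \times Q_{m,v}(z = k)\big) - \lambda_k \times \phi_{k,v} = 0
$$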
Hence
$$
\phi_{k,v} = \frac{1}{\lambda_k} \times \sum_{m=1}^{M} \big(c_{m,v} \times Q_{m,v}(z = k)\big)
$$
Because $\sum_{v=1}^{V} \phi_{k,v} = 1$, we get
$$
\sum_{v=1}^{V} \phi_{k,v} = \frac{1}{\lambda_k} \times \sum_{v=1}^{V} \sum_{m=1}^{M} \big(c_{m,v} \times Q_{m,v}(z = k)\big) = 1
$$
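So we obtain
$$
\lambda_k = \sum_{v=1}^{V} \sum_{m=1}^{M} \big(c_{m,v} \times Q_{m,v}(z = k)\big)
$$
$$
\phi_{k,v} = \frac{\sum_{m=1}^{M} \big(c_{m,v} \times Q_{m,v}(z = k)\big)}{\sum_{v'=1}^{V} \sum_{m=1}^{M} \big(c_{m,v'} \times Q_{m,v'}(z = k)\big)}
$$
where the summation index in the denominator is written $v'$ to distinguish it from the fixed $v$ in the numerator.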
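The update for $\Phi$ amounts to accumulating count-weighted responsibilities per topic and normalizing each row. The sketch below assumes the nested-list layout `Q[m][v][k]`; the name `update_phi` and the input values are hypothetical.

```python
def update_phi(C, Q, K):
    """M-step update: phi[k][v] = sum_m c_{m,v} Q_{m,v}(z=k) / lambda_k."""
    M, V = len(C), len(C[0])
    phi = []
    for k in range(K):
        # Numerator for every word v: sum over documents m.
        weights = [sum(C[m][v] * Q[m][v][k] for m in range(M)) for v in range(V)]
        lam_k = sum(weights)  # lambda_k, the normalizer over all words
        phi.append([w / lam_k for w in weights])
    return phi

# Hypothetical inputs: 2 documents, 3 words, uniform responsibilities over K = 2 topics.
C = [[2, 1, 0], [0, 1, 1]]
Q = [[[0.5, 0.5] for _ in range(3)] for _ in range(2)]
phi = update_phi(C, Q, K=2)
print(phi)  # [[0.4, 0.4, 0.2], [0.4, 0.4, 0.2]]
```

With uniform responsibilities both topics receive the same word distribution, which is simply the corpus-wide word frequency.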
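Similarly, setting the partial derivative with respect to $\theta_{m,k}$ to zero gives
$$
\frac{\partial L}{\partial \theta_{m,k}} = \frac{1}{\theta_{m,k}} \sum_{v=1}^{V} \big(c_{m,v} \times Q_{m,v}(z = k)\big) - \rho_m = 0
$$
Multiplying both sides by $\theta_{m,k}$:
$$
\sum_{v=1}^{V} \big(c_{m,v} \times Q_{m,v}(z = k)\big) - \rho_m \times \theta_{m,k} = 0
$$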
Hence
$$
\theta_{m,k} = \frac{1}{\rho_m} \times \sum_{v=1}^{V} \big(c_{m,v} \times Q_{m,v}(z = k)\big)
$$
Because $\sum_{k=1}^{K} \theta_{m,k} = 1$, we get
$$
\sum_{k=1}^{K} \theta_{m,k} = \frac{1}{\rho_m} \times \sum_{k=1}^{K} \sum_{v=1}^{V} \big(c_{m,v} \times Q_{m,v}(z = k)\big) = 1
$$
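Because $\sum_{k=1}^{K} Q_{m,v}(z = k) = 1$ and $\sum_{v=1}^{V} c_{m,v} = N_m$, we get
$$
\begin{aligned}
\rho_m &= \sum_{k=1}^{K} \sum_{v=1}^{V} \big(c_{m,v} \times Q_{m,v}(z = k)\big) \\
&= \sum_{v=1}^{V} \sum_{k=1}^{K} \big(c_{m,v} \times Q_{m,v}(z = k)\big) \\
&= \sum_{v=1}^{V} \Big(c_{m,v} \times \sum_{k=1}^{K} Q_{m,v}(z = k)\Big) \\
&= \sum_{v=1}^{V} (c_{m,v} \times 1) \\
&= N_m
\end{aligned}
$$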
Therefore
$$
\theta_{m,k} = \frac{\sum_{v=1}^{V} \big(c_{m,v} \times Q_{m,v}(z = k)\big)}{N_m}
$$
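Putting the E-step and both M-step updates together gives one full EM iteration. The following is a minimal sketch in plain Python, not a reference implementation: all function names, the random initialization, the toy count matrix and the fixed number of 20 iterations are assumptions for illustration. It records the log-likelihood before each iteration so that the expected non-decreasing behaviour of EM can be checked.

```python
import math
import random

def e_step(theta, phi):
    """Q[m][v][k] = theta[m][k] * phi[k][v] / sum_k' theta[m][k'] * phi[k'][v]."""
    K, V = len(phi), len(phi[0])
    Q = []
    for theta_m in theta:
        Q_m = []
        for v in range(V):
            joint = [theta_m[k] * phi[k][v] for k in range(K)]
            p_v = sum(joint)
            # Guard: if p(v) underflows to 0, fall back to a uniform posterior.
            Q_m.append([j / p_v if p_v > 0 else 1.0 / K for j in joint])
        Q.append(Q_m)
    return Q

def m_step(C, Q, K):
    """Closed-form updates for Theta and Phi derived above."""
    M, V = len(C), len(C[0])
    # theta[m][k] = sum_v c_{m,v} Q_{m,v}(z=k) / N_m, with N_m = sum_v c_{m,v}.
    theta = [[sum(C[m][v] * Q[m][v][k] for v in range(V)) / sum(C[m])
              for k in range(K)] for m in range(M)]
    phi = []
    for k in range(K):
        w = [sum(C[m][v] * Q[m][v][k] for m in range(M)) for v in range(V)]
        lam_k = sum(w)  # lambda_k
        phi.append([x / lam_k for x in w])
    return theta, phi

def log_likelihood(C, theta, phi):
    K = len(phi)
    return sum(c * math.log(sum(theta[m][k] * phi[k][v] for k in range(K)))
               for m, row in enumerate(C) for v, c in enumerate(row) if c > 0)

def random_distribution(rng, n):
    """Random strictly positive vector of length n that sums to 1."""
    x = [rng.random() + 0.1 for _ in range(n)]
    total = sum(x)
    return [xi / total for xi in x]

rng = random.Random(0)
C = [[2, 1, 0], [0, 1, 1], [3, 0, 2]]  # hypothetical counts: M = 3, V = 3
K = 2
theta = [random_distribution(rng, K) for _ in C]
phi = [random_distribution(rng, len(C[0])) for _ in range(K)]

history = []
for _ in range(20):
    history.append(log_likelihood(C, theta, phi))
    Q = e_step(theta, phi)
    theta, phi = m_step(C, Q, K)

# EM should never decrease the log-likelihood (up to floating-point error).
print(all(b >= a - 1e-9 for a, b in zip(history, history[1:])))
```

In practice one would stop when the change in log-likelihood falls below a tolerance instead of running a fixed number of iterations, and restart from several initializations because EM only finds a local maximum.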