Solutions To Problems of Chapter 13
13.1. Show Eq. (13.5).
Solution: The functional F(q) is defined as
$$F(q) = \int q(X_l, \theta) \ln\frac{p(X, X_l, \theta)}{q(X_l, \theta)}\, dX_l\, d\theta.$$
13.2. Show equation (13.38).
Solution: From Eq. (13.37) in the text we have
$$\ln q^{(j+1)}_\alpha(\alpha) = E_{q^{(j+1)}_\theta}\big[\ln p(\theta|\alpha)\big] + \ln p(\alpha) + \text{constant}$$
$$= \frac{1}{2}\sum_{k=0}^{K-1}\ln\alpha_k - \frac{1}{2}\sum_{k=0}^{K-1}\alpha_k E_{q^{(j+1)}_\theta}\big[\theta_k^2\big] + (a-1)\sum_{k=0}^{K-1}\ln\alpha_k - b\sum_{k=0}^{K-1}\alpha_k + \text{constant}, \quad (4)$$
hence
$$q^{(j+1)}_\alpha(\alpha) \propto \prod_{k=0}^{K-1}\alpha_k^{\,a-1+\frac{1}{2}}\exp\Big(-\Big(b + \frac{1}{2}E_{q^{(j+1)}_\theta}\big[\theta_k^2\big]\Big)\alpha_k\Big),$$
which is a product of gamma pdfs with parameters $\tilde a = a + \frac{1}{2}$ and $\tilde b_k = b + \frac{1}{2}E_{q^{(j+1)}_\theta}[\theta_k^2]$; this is Eq. (13.38).
13.3. Show equations (13.43)-(13.45).
Solution: From the text we have
$$\ln q^{(j+1)}_\beta(\beta) = E_{q^{(j+1)}_\theta}\big[\ln p(y|\theta, \beta)\big] + \ln p(\beta) + \text{constant}.$$
13.4. Show that if
$$p(x) \propto \frac{1}{x},$$
then the random variable $z = \ln x$ follows a uniform distribution.
Solution: We know that, under the transformation $z = \ln x$ (so that $x = e^z$ and $dx/dz = e^z$),
$$p_z(z) = p_x(x)\left|\frac{dx}{dz}\right| \propto \frac{1}{x}\, e^z = \frac{1}{e^z}\, e^z = 1,$$
which does not depend on $z$; hence $z$ follows a uniform distribution over the range implied by the support of $x$.
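A quick numerical sanity check (a sketch, not part of the original solution): sample $x$ with density proportional to $1/x$ on a finite support via inverse-transform sampling, then test $z = \ln x$ for uniformity.

```python
# Sample x with density proportional to 1/x on [a, b]; its CDF is
# F(x) = ln(x/a) / ln(b/a), so the inverse transform is x = a * (b/a)**u.
# Then z = ln(x) should be uniform on [ln(a), ln(b)].
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a, b = 1.0, 100.0
u = rng.uniform(size=100_000)
x = a * (b / a) ** u
z = np.log(x)

# Kolmogorov-Smirnov test against the uniform distribution on [ln(a), ln(b)].
ks = stats.kstest(z, "uniform", args=(np.log(a), np.log(b) - np.log(a)))
print(ks.pvalue)   # a large p-value indicates no evidence against uniformity
```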
13.5. Derive the lower bound after convergence of the variational Bayesian EM
for the linear regression task which is modeled as in Section 13.3.
Solution: The lower bound after convergence to $\tilde q_\theta(\theta)$, $\tilde q_\alpha(\alpha)$, $\tilde q_\beta(\beta)$, which are defined by $\tilde\mu_\theta, \tilde\Sigma_\theta$ for the Gaussian $\tilde q_\theta$, by $(\tilde a, \tilde b_i)$, $i = 1, 2, \ldots, l$, for the gamma $\tilde q_\alpha(\alpha)$, and by $(\tilde c, \tilde d)$ for the gamma $\tilde q_\beta(\beta)$, will be the sum of the expectations of the log-factors minus the expectations of the log-variational factors (the entropic terms). From the model, the required log-pdfs include:

(c) $\ln p(\alpha) = -K\ln\Gamma(a) + Ka\ln b + (a-1)\sum_{k=0}^{K-1}\ln\alpha_k - b\sum_{k=0}^{K-1}\alpha_k$.

(d) $\ln p(\beta) = -\ln\Gamma(c) + c\ln d + (c-1)\ln\beta - d\beta$.

Using identities from the Appendix of the chapter and the independence among $\tilde q_\theta, \tilde q_\alpha, \tilde q_\beta$, we get:
(a)
$$A_1 := E[\ln p(\theta|\alpha)] = \frac{1}{2}\sum_{k=0}^{K-1}E[\ln\alpha_k] - \frac{K}{2}\ln(2\pi) - \frac{1}{2}\sum_{k=0}^{K-1}E[\alpha_k]\, E\big[\theta_k^2\big]$$
$$= \frac{1}{2}\sum_{k=0}^{K-1}\big[\psi(\tilde a) - \ln\tilde b_k\big] - \frac{K}{2}\ln(2\pi) - \frac{1}{2}\sum_{k=0}^{K-1}\frac{\tilde a}{\tilde b_k}\big[\tilde\Sigma_\theta + \tilde\mu_\theta\tilde\mu_\theta^T\big]_{kk}.$$
(d)
$$A_4 := E_\beta[\ln p(\beta)] = -\ln\Gamma(c) + c\ln d + (c-1)E_\beta[\ln\beta] - d\,E_\beta[\beta]$$
$$= -\ln\Gamma(c) + c\ln d + (c-1)\big(\psi(\tilde c) - \ln\tilde d\big) - d\,\frac{\tilde c}{\tilde d}.$$
In the sequel, the respective entropies have to be computed:
(a)
$$E_{\tilde q_\theta}\big[\ln\tilde q_\theta(\theta)\big] = -\frac{1}{2}\ln|\tilde\Sigma_\theta| - \frac{K}{2}\ln(2\pi) - \frac{1}{2}E_{\tilde q_\theta}\big[(\theta - \tilde\mu_\theta)^T\tilde\Sigma_\theta^{-1}(\theta - \tilde\mu_\theta)\big]$$
$$= -\frac{1}{2}\ln|\tilde\Sigma_\theta| - \frac{K}{2}\ln(2\pi) + \frac{1}{2}\tilde\mu_\theta^T\tilde\Sigma_\theta^{-1}\tilde\mu_\theta - \frac{1}{2}\mathrm{Trace}\big\{I + \tilde\Sigma_\theta^{-1}\tilde\mu_\theta\tilde\mu_\theta^T\big\},$$
where Eq. (12.48) from the text has been used.
(b)
$$E_{\tilde q_\alpha}\big[\ln\tilde q_\alpha(\alpha)\big] = \sum_{k=0}^{K-1}\big[\tilde a\ln\tilde b_k - \ln\Gamma(\tilde a) + (\tilde a - 1)\big(\psi(\tilde a) - \ln\tilde b_k\big) - \tilde a\big].$$
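The individual terms are straightforward to evaluate numerically. Below is a minimal sketch, assuming illustrative values for the hyperparameters and the converged variational parameters (all concrete numbers and variable names here are assumptions, not from the text):

```python
# Evaluate A1, A4, and the entropy-type terms of the converged lower bound.
import numpy as np
from scipy.special import digamma, gammaln

K = 3
a, b, c, d = 1e-2, 1e-2, 1e-2, 1e-2                  # assumed hyperparameters
mu = np.array([0.8, -0.3, 0.1])                       # assumed mu_tilde
Sigma = np.diag([0.05, 0.02, 0.04])                   # assumed Sigma_tilde
a_t, b_t = a + 0.5, b + 0.5 * (np.diag(Sigma) + mu ** 2)
c_t, d_t = 2.0, 1.5                                   # assumed c_tilde, d_tilde

# A1 = E[ln p(theta | alpha)]
second_moment = np.diag(Sigma + np.outer(mu, mu))     # [Sigma + mu mu^T]_kk
A1 = (0.5 * np.sum(digamma(a_t) - np.log(b_t)) - 0.5 * K * np.log(2 * np.pi)
      - 0.5 * np.sum((a_t / b_t) * second_moment))

# A4 = E[ln p(beta)]
A4 = (-gammaln(c) + c * np.log(d) + (c - 1) * (digamma(c_t) - np.log(d_t))
      - d * c_t / d_t)

# E[ln q_theta] and E[ln q_alpha]; the first equals the expression above
# after simplification (the trace term reduces to K/2 plus the quadratic).
E_ln_q_theta = (-0.5 * np.linalg.slogdet(Sigma)[1]
                - 0.5 * K * np.log(2 * np.pi) - 0.5 * K)
E_ln_q_alpha = np.sum(a_t * np.log(b_t) - gammaln(a_t)
                      + (a_t - 1) * (digamma(a_t) - np.log(b_t)) - a_t)
print(A1, A4, E_ln_q_theta, E_ln_q_alpha)
```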
13.6. Consider the Gaussian mixture model
$$p(x) = \sum_{k=1}^{K} P_k\, N\big(x|\mu_k, Q_k^{-1}\big),$$
with priors
$$p(\mu_k) = N\big(\mu_k|0, \beta^{-1}I\big), \quad (10)$$
and
$$p(Q_k) = W(Q_k|\nu_0, W_0).$$
Given the set of observations $X = \{x_1, \ldots, x_N\}$, $x \in \mathbb{R}^l$, derive the respective variational Bayesian EM algorithm, using the mean field approximation for the involved posterior pdfs. Consider $P_k$, $k = 1, 2, \ldots, K$, as deterministic parameters and optimize the respective lower bound of the evidence with respect to the $P_k$'s.
Solution: Consider
$$q(Z, \mu_{1:K}, Q_{1:K}) = q(Z)\, q(\mu_{1:K})\, q(Q_{1:K}),$$
where the notation has been introduced in Section 13.4. From the theory we have:
Step 1a:
$$\ln q^{(j+1)}_z(Z) = E_{q^{(j)}_\mu q^{(j)}_Q}\big[\ln p(X, Z, \mu_{1:K}, Q_{1:K})\big] + \text{constant}$$
$$= E_{q^{(j)}_\mu q^{(j)}_Q}\big[\ln p(X|Z, \mu_{1:K}, Q_{1:K})\big] + \sum_{n=1}^{N}\sum_{k=1}^{K} z_{nk}\ln P^{(j)}_k + \text{constant}.$$
Hence, we have that
$$\ln q^{(j+1)}_z(Z) = \sum_{n=1}^{N}\sum_{k=1}^{K} z_{nk}\, E_{q^{(j)}_\mu q^{(j)}_Q}\Big[\ln P^{(j)}_k + \frac{1}{2}\ln|Q_k| - \frac{1}{2}(x_n - \mu_k)^T Q_k (x_n - \mu_k)\Big] + \text{constant},$$
or, if we set
$$\pi_{nk} = P^{(j)}_k\exp\Big(\frac{1}{2}E_{q^{(j)}_Q}\big[\ln|Q_k|\big] - \frac{1}{2}E_{q^{(j)}_\mu q^{(j)}_Q}\big[(x_n - \mu_k)^T Q_k (x_n - \mu_k)\big]\Big),$$
the normalized quantities are probabilities, hence
$$\rho_{nk} = \frac{\pi_{nk}}{\sum_{k'=1}^{K}\pi_{nk'}}.$$
Also note that $E_{q^{(j+1)}_z}[z_{nk}] = \rho_{nk}$, by the binary nature of $z_{nk}$.
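The responsibilities are conveniently computed in log space. A minimal numpy sketch (the function and variable names are assumptions, not from the text); it uses $E[(x_n - \mu_k)^T Q_k (x_n - \mu_k)] = (x_n - \tilde\mu_k)^T E[Q_k](x_n - \tilde\mu_k) + \mathrm{trace}\{E[Q_k]\tilde\Sigma_k\}$, which follows from the independence of $q_\mu$ and $q_Q$:

```python
# Log-space computation of rho_nk, stabilized with a max-shift before exp.
import numpy as np

def responsibilities(X, log_P, E_log_det_Q, E_Q, mu_t, Sigma_t):
    """X: (N, l) data; mu_t[k], Sigma_t[k]: Gaussian factor of mu_k;
    E_Q[k], E_log_det_Q[k]: Wishart moments of Q_k; log_P: log mixing weights."""
    N, K = X.shape[0], len(log_P)
    log_pi = np.empty((N, K))
    for k in range(K):
        diff = X - mu_t[k]                                   # (N, l)
        quad = np.einsum('ni,ij,nj->n', diff, E_Q[k], diff)  # (x-mu)^T E[Q] (x-mu)
        quad += np.trace(E_Q[k] @ Sigma_t[k])                # + trace(E[Q] Sigma)
        log_pi[:, k] = log_P[k] + 0.5 * E_log_det_Q[k] - 0.5 * quad
    log_pi -= log_pi.max(axis=1, keepdims=True)              # stabilized softmax
    rho = np.exp(log_pi)
    return rho / rho.sum(axis=1, keepdims=True)
```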
Step 1b:
$$\ln q^{(j+1)}_\mu(\mu_{1:K}) = E_{q^{(j+1)}_z q^{(j)}_Q}\big[\ln p(X|Z, \mu_{1:K}, Q_{1:K}) + \ln p(\mu_{1:K})\big] + \text{constants}$$
$$= \sum_{k=1}^{K}\Big(-\frac{1}{2}\sum_{n=1}^{N}\rho_{nk}\, E_{q^{(j)}_Q}\big[(x_n - \mu_k)^T Q_k (x_n - \mu_k)\big] - \frac{1}{2}\beta\mu_k^T\mu_k\Big) + \text{constants},$$
or
$$\ln q^{(j+1)}_\mu(\mu_{1:K}) = \sum_{k=1}^{K}\Big(-\frac{1}{2}\mu_k^T\tilde Q_k\mu_k + \mu_k^T E_{q^{(j)}_Q}[Q_k]\sum_{n=1}^{N}\rho_{nk} x_n\Big) + \text{constants},$$
that is, $q^{(j+1)}_\mu(\mu_{1:K}) = \prod_{k=1}^{K} N\big(\mu_k|\tilde\mu_k, \tilde Q_k^{-1}\big)$,
where
$$\tilde Q_k = \beta I + E_{q^{(j)}_Q}[Q_k]\sum_{n=1}^{N}\rho_{nk}$$
and
$$\tilde\mu_k = \tilde Q_k^{-1} E_{q^{(j)}_Q}[Q_k]\sum_{n=1}^{N}\rho_{nk} x_n.$$
Step 1c:
$$\ln q^{(j+1)}_Q(Q_{1:K}) = E_{q^{(j+1)}_z q^{(j+1)}_\mu}\big[\ln p(X|Z, \mu_{1:K}, Q_{1:K}) + \ln p(Q_{1:K})\big] + \text{constants}$$
$$= \sum_{k=1}^{K}\bigg(\frac{\nu_0 - l - 1 + \sum_{n=1}^{N}\rho_{nk}}{2}\ln|Q_k| - \frac{1}{2}\mathrm{trace}\Big\{\Big(W_0^{-1} + \sum_{n=1}^{N}\rho_{nk}\big(x_n x_n^T - \tilde\mu_k x_n^T - x_n\tilde\mu_k^T + E_{q^{(j+1)}_\mu}[\mu_k\mu_k^T]\big)\Big)Q_k\Big\}\bigg) + \text{constants},$$
that is,
$$q^{(j+1)}_Q(Q_{1:K}) = \prod_{k=1}^{K} W(Q_k|\tilde\nu_k, \tilde W_k),$$
where
$$\tilde\nu_k = \nu_0 + \sum_{n=1}^{N}\rho_{nk}, \qquad \tilde W_k^{-1} = W_0^{-1} + \sum_{n=1}^{N}\rho_{nk}\big(x_n x_n^T - \tilde\mu_k x_n^T - x_n\tilde\mu_k^T + E_{q^{(j+1)}_\mu}[\mu_k\mu_k^T]\big).$$
Moreover,
$$E_{q^{(j+1)}_Q}[Q_k] = \tilde\nu_k\tilde W_k, \qquad E_{q^{(j+1)}_Q}\big[\ln|Q_k|\big] = \sum_{i=1}^{l}\psi\Big(\frac{\tilde\nu_k + 1 - i}{2}\Big) + l\ln 2 + \ln|\tilde W_k|,$$
where ψ(·) is the digamma function defined in the text.
To obtain the $P_k$'s, the lower bound is maximized with respect to $P_k$ subject to the constraint
$$\sum_{k=1}^{K} P_k = 1. \quad (12)$$
Thus, introducing a Lagrange multiplier $\lambda$,
$$\frac{\partial}{\partial P_k}\Bigg[\sum_{k=1}^{K}\Big(\sum_{n=1}^{N}\rho_{nk}\Big)\ln P_k - \lambda\sum_{k=1}^{K} P_k\Bigg] = 0,$$
which gives $P_k = \frac{1}{\lambda}\sum_{n=1}^{N}\rho_{nk}$; summing over $k$ and invoking (12) results in $\lambda = N$, hence
$$P_k = \frac{1}{N}\sum_{n=1}^{N}\rho_{nk}.$$
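Putting the steps together, one full iteration might look as follows. This is a hedged sketch under assumed array shapes ($X$ is $N \times l$; `E_Q`, `mu_t`, `Sigma_t` are per-component lists), reusing the `responsibilities` helper from the previous sketch; it is not a verbatim implementation from the text:

```python
# One iteration of the variational Bayesian EM derived above (Steps 1a-1c
# plus the deterministic P_k update).
import numpy as np
from scipy.special import digamma

def vb_step(X, P, E_Q, E_log_det_Q, mu_t, Sigma_t, beta, nu0, W0):
    N, l = X.shape
    K = len(P)
    rho = responsibilities(X, np.log(P), E_log_det_Q, E_Q, mu_t, Sigma_t)
    Nk = rho.sum(axis=0)                               # sum_n rho_nk
    # Step 1b: Gaussian factors for the means
    for k in range(K):
        Q_tilde = beta * np.eye(l) + E_Q[k] * Nk[k]
        Sigma_t[k] = np.linalg.inv(Q_tilde)
        mu_t[k] = Sigma_t[k] @ E_Q[k] @ (rho[:, k] @ X)
    # Step 1c: Wishart factors for the precisions
    for k in range(K):
        nu_k = nu0 + Nk[k]
        # sum_n rho_nk (x x^T - mu_t x^T - x mu_t^T + E[mu mu^T])
        S = (rho[:, k, None, None] *
             (X[:, :, None] * X[:, None, :]
              - np.einsum('i,nj->nij', mu_t[k], X)
              - np.einsum('ni,j->nij', X, mu_t[k])
              + (Sigma_t[k] + np.outer(mu_t[k], mu_t[k])))).sum(axis=0)
        W_k = np.linalg.inv(np.linalg.inv(W0) + S)
        E_Q[k] = nu_k * W_k
        E_log_det_Q[k] = (digamma(0.5 * (nu_k + 1 - np.arange(1, l + 1))).sum()
                          + l * np.log(2) + np.linalg.slogdet(W_k)[1])
    P = Nk / N                                          # deterministic weights
    return P, E_Q, E_log_det_Q, mu_t, Sigma_t
```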
13.7. Consider the Gaussian mixture model of Problem 13.6, with the following priors imposed on $\mu$, $Q$, and $P$:
$$p(\mu, Q) = p(\mu|Q)\, p(Q) = \prod_{k=1}^{K} N\big(\mu_k|0, (\lambda Q_k)^{-1}\big)\, W(Q_k|\nu_0, W_0),$$
that is, a Gaussian-Wishart product, and
$$p(P) = \mathrm{Dir}(P|a) \propto \prod_{k=1}^{K} P_k^{a-1},$$
i.e., a Dirichlet prior. That is, $P$ is treated as a random vector. Derive the E algorithmic steps of the variational Bayesian approximation adopting the mean field approximation for the involved posterior pdfs. We have adopted the notation $\mu$ in place of $\mu_{1:K}$ and $Q$ in place of $Q_{1:K}$, for notational simplicity.
Solution: If $Z$ is the set of latent variables associated with the mixture indices, we have
$$q(Z, P, \mu, Q) = q(Z)\, q(P)\, q(\mu, Q).$$
Step 1a: We have that
$$\ln q^{(j+1)}_z(Z) = E_{q^{(j)}_P q^{(j)}_{\mu,Q}}\big[\ln p(X, Z, P, \mu, Q)\big] + \text{constant}$$
$$= \sum_{n=1}^{N}\sum_{k=1}^{K} z_{nk}\Big(E_{q^{(j)}_P}\big[\ln P_k\big] + \frac{1}{2}E_{q^{(j)}_{\mu,Q}}\big[\ln|Q_k|\big] - \frac{1}{2}E_{q^{(j)}_{\mu,Q}}\big[(x_n - \mu_k)^T Q_k (x_n - \mu_k)\big]\Big) + \text{constant}. \quad (14)$$
Setting
$$\pi_{nk} = \exp\Big(E_{q^{(j)}_P}\big[\ln P_k\big] + \frac{1}{2}E_{q^{(j)}_{\mu,Q}}\big[\ln|Q_k|\big] - \frac{1}{2}E_{q^{(j)}_{\mu,Q}}\big[(x_n - \mu_k)^T Q_k (x_n - \mu_k)\big]\Big), \quad (15)$$
the corresponding responsibilities are
$$\rho_{nk} = \frac{\pi_{nk}}{\sum_{k'=1}^{K}\pi_{nk'}}. \quad (16)$$
Step 1b:
$$\ln q^{(j+1)}_P(P) = E_{q^{(j+1)}_z}\Bigg[\sum_{n=1}^{N}\sum_{k=1}^{K} z_{nk}\ln P_k\Bigg] + \ln p(P) + \text{constants}$$
$$= \sum_{k=1}^{K}\Big(\sum_{n=1}^{N}\rho_{nk} + a - 1\Big)\ln P_k + \text{constants},$$
from which we obtain
$$q^{(j+1)}_P(P) \propto \prod_{k=1}^{K} P_k^{\tilde a_k - 1}, \qquad \tilde a_k = a + \sum_{n=1}^{N}\rho_{nk},$$
that is, a Dirichlet distribution.
Step 1c:
$$\ln q^{(j+1)}_{\mu,Q}(\mu, Q) = \sum_{n=1}^{N}\sum_{k=1}^{K} E_{q^{(j+1)}_z}[z_{nk}]\Big(\frac{1}{2}\ln|Q_k| - \frac{1}{2}(x_n - \mu_k)^T Q_k (x_n - \mu_k)\Big)$$
$$+ \frac{1}{2}\sum_{k=1}^{K}\ln(|\lambda Q_k|) - \frac{1}{2}\sum_{k=1}^{K}\mu_k^T(\lambda Q_k)\mu_k + \frac{\nu_0 - l - 1}{2}\sum_{k=1}^{K}\ln|Q_k| - \frac{1}{2}\sum_{k=1}^{K}\mathrm{trace}\{W_0^{-1}Q_k\} + \text{constants}.$$
Grouping the terms in $\ln|Q_k|$ (A), the terms quadratic and linear in $\mu_k$ (B), and the trace terms (C), we have
$$\sum_{k=1}^{K}\Bigg[\underbrace{\frac{\nu_0 + 1 + \sum_{n=1}^{N}\rho_{nk} - l - 1}{2}\ln|Q_k|}_{A}\ \underbrace{- \frac{1}{2}\mu_k^T\Big(\lambda + \sum_{n=1}^{N}\rho_{nk}\Big)Q_k\mu_k + \mu_k^T Q_k\sum_{n=1}^{N}\rho_{nk} x_n}_{B}\ \underbrace{- \frac{1}{2}\mathrm{trace}\Big\{\Big(W_0^{-1} + \sum_{n=1}^{N}\rho_{nk} x_n x_n^T\Big)Q_k\Big\}}_{C}\Bigg].$$
Combining all the terms A, B, C, respectively, together we obtain
$$A: \quad \frac{\nu_0 + 1 + \sum_{n=1}^{N}\rho_{nk} - l - 1}{2}\ln|Q_k|,$$
$$B: \quad -\frac{1}{2}\mu_k^T\big(\tilde\lambda_k Q_k\big)\mu_k + \mu_k^T Q_k\tilde\lambda_k\hat\mu_k, \qquad \tilde\lambda_k = \lambda + \sum_{n=1}^{N}\rho_{nk}, \qquad \hat\mu_k = \frac{1}{\tilde\lambda_k}\sum_{n=1}^{N}\rho_{nk} x_n,$$
$$C: \quad -\frac{1}{2}\mathrm{trace}\Big\{\Big(W_0^{-1} + \sum_{n=1}^{N}\rho_{nk} x_n x_n^T\Big)Q_k\Big\}.$$
Completing the square in $\mu_k$, $q^{(j+1)}_{\mu,Q}(\mu, Q)$ is recognized as a Gaussian-Wishart product,
$$q^{(j+1)}_{\mu,Q}(\mu, Q) = \prod_{k=1}^{K} N\big(\mu_k|\hat\mu_k, (\tilde\lambda_k Q_k)^{-1}\big)\, W(Q_k|\tilde\nu_k, \tilde W_k),$$
where
$$\tilde\nu_k = \nu_0 + \sum_{n=1}^{N}\rho_{nk} + 1,$$
and
$$\tilde W_k^{-1} = W_0^{-1} + \sum_{n=1}^{N}\rho_{nk} x_n x_n^T - \tilde\lambda_k\hat\mu_k\hat\mu_k^T.$$
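A compact sketch of these Gaussian-Wishart updates (all inputs and names are assumptions for illustration):

```python
# lam_t, mu_hat, nu_t, W_t as derived above, for data X (N x l),
# responsibilities rho (N x K), and hyperparameters lam, nu0, W0.
import numpy as np

def gauss_wishart_update(X, rho, lam, nu0, W0):
    N, l = X.shape
    K = rho.shape[1]
    Nk = rho.sum(axis=0)
    lam_t = lam + Nk                              # lambda_tilde_k
    mu_hat = (rho.T @ X) / lam_t[:, None]         # (1/lambda_tilde_k) sum_n rho_nk x_n
    nu_t = nu0 + Nk + 1                           # nu_tilde_k
    W0_inv = np.linalg.inv(W0)
    W_t = []
    for k in range(K):
        Sk = (rho[:, k, None, None]
              * X[:, :, None] * X[:, None, :]).sum(axis=0)  # sum_n rho_nk x_n x_n^T
        W_t.append(np.linalg.inv(
            W0_inv + Sk - lam_t[k] * np.outer(mu_hat[k], mu_hat[k])))
    return lam_t, mu_hat, nu_t, W_t
```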
13.8. If $\mu$ and $Q$ are distributed according to a Gaussian-Wishart product
$$p(\mu, Q) = N\big(\mu|\hat\mu, (\lambda Q)^{-1}\big)\, W(Q|\nu, W),$$
then compute the expectation $E[\mu^T Q\mu]$.
Solution: We have that
$$E[\mu^T Q\mu] = E\big[\mathrm{trace}\{Q\mu\mu^T\}\big] = E_Q\Big[E_{\mu|Q}\big[\mathrm{trace}\{Q\mu\mu^T\}\big]\Big] = E_Q\Big[\mathrm{trace}\big\{Q\big((\lambda Q)^{-1} + \hat\mu\hat\mu^T\big)\big\}\Big]$$
$$= E_Q\Big[\frac{l}{\lambda} + \mathrm{trace}\{Q\hat\mu\hat\mu^T\}\Big] = \frac{l}{\lambda} + \nu\hat\mu^T W\hat\mu,$$
where $E_{\mu|Q}[\mu\mu^T] = (\lambda Q)^{-1} + \hat\mu\hat\mu^T$ and $E_Q[Q] = \nu W$ have been used.
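The result is easy to verify by Monte Carlo. The following sketch (assumed values throughout) relies on scipy's Wishart parameterization, for which $E[Q] = \nu W$:

```python
# Monte Carlo check of E[mu^T Q mu] = l/lambda + nu * mu_hat^T W mu_hat.
import numpy as np
from scipy.stats import wishart

rng = np.random.default_rng(1)
l, lam, nu = 3, 2.0, 7.0
W = np.diag([0.5, 1.0, 2.0])
mu_hat = np.array([1.0, -0.5, 0.25])

Qs = wishart(df=nu, scale=W).rvs(size=5000, random_state=rng)
vals = []
for Q in Qs:
    mu = rng.multivariate_normal(mu_hat, np.linalg.inv(lam * Q))  # mu | Q
    vals.append(mu @ Q @ mu)

print(np.mean(vals))                         # Monte Carlo estimate
print(l / lam + nu * mu_hat @ W @ mu_hat)    # closed-form value
```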
13.9. Derive the Hessian matrix with respect to $\theta$ of the cost function
$$J(\theta) = \sum_{n=1}^{N}\big[y_n\ln\sigma(\phi^T(x_n)\theta) + (1 - y_n)\ln(1 - \sigma(\phi^T(x_n)\theta))\big] - \frac{1}{2}\theta^T A\theta,$$
where
$$\sigma(z) = \frac{1}{1 + \exp(-z)}.$$
Solution: Define $t = \sigma(z)$. Then we have
$$\frac{\partial t}{\partial z} = \frac{\partial}{\partial z}\frac{1}{1 + \exp(-z)} = \sigma(z)\big(1 - \sigma(z)\big) = t(1 - t).$$
Now let
$$J_n(t_n) = y_n\ln t_n + (1 - y_n)\ln(1 - t_n), \qquad t_n = \sigma(\phi^T(x_n)\theta). \quad (18)$$
We have
$$\frac{\partial J_n}{\partial t_n} = \frac{y_n}{t_n} - \frac{1 - y_n}{1 - t_n} = \frac{y_n - t_n}{t_n(1 - t_n)},$$
so that, by the chain rule,
$$\nabla J(\theta) = \sum_{n=1}^{N}(y_n - t_n)\phi(x_n) - A\theta = \sum_{n=1}^{N}\big(y_n - \sigma(\phi^T(x_n)\theta)\big)\phi(x_n) - A\theta.$$
Differentiating once more,
$$\nabla^2 J(\theta) = -\sum_{n=1}^{N}\sigma(\phi^T(x_n)\theta)\big(1 - \sigma(\phi^T(x_n)\theta)\big)\phi(x_n)\phi^T(x_n) - A.$$
13.10. Show that the marginal of a Gaussian pdf with a gamma prior on the inverse variance (the precision), after integrating out the precision, is the student's-t pdf, given by
$$\mathrm{st}(x|\mu, \lambda, \nu) = \frac{\Gamma\big(\frac{\nu+1}{2}\big)}{\Gamma\big(\frac{\nu}{2}\big)}\Big(\frac{\lambda}{\pi\nu}\Big)^{1/2}\Big[1 + \frac{\lambda(x - \mu)^2}{\nu}\Big]^{-\frac{\nu+1}{2}}. \quad (20)$$
Solution: From the text in the chapter and for the one-dimensional case, the marginal is
$$p(\theta; a, b) = \int_0^\infty N\big(\theta|\mu, \alpha^{-1}\big)\,\mathrm{Gamma}(\alpha; a, b)\, d\alpha \quad (21)$$
or
$$p(\theta; a, b) = \frac{b^a}{\Gamma(a)}\frac{1}{\sqrt{2\pi}}\int_0^\infty \alpha^{a - \frac{1}{2}}\exp\Big(-\Big(b + \frac{1}{2}(\theta - \mu)^2\Big)\alpha\Big)\, d\alpha. \quad (22)$$
Note that the quantity under the integral is an (unnormalized) gamma distribution with parameters $a + \frac{1}{2}$ and $b + \frac{1}{2}(\theta - \mu)^2$, so the integral equals $\Gamma\big(a + \frac{1}{2}\big)\big(b + \frac{1}{2}(\theta - \mu)^2\big)^{-(a + \frac{1}{2})}$. Substituting $a = \frac{\nu}{2}$ and $b = \frac{\nu}{2\lambda}$ and rearranging recovers Eq. (20).
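The identity can also be confirmed numerically by quadrature, using the fact that $\mathrm{st}(x|\mu, \lambda, \nu)$ coincides with scipy's t distribution with `df` $= \nu$, `loc` $= \mu$, `scale` $= \lambda^{-1/2}$ (a sketch with assumed parameter values):

```python
# Integrate the Gaussian-gamma product over the precision alpha and compare
# with the student's-t pdf of Eq. (20).
import numpy as np
from scipy import integrate, stats

mu, lam, nu = 0.5, 2.0, 3.0
a, b = nu / 2.0, nu / (2.0 * lam)

def marginal(x):
    integrand = lambda alpha: (stats.norm.pdf(x, loc=mu, scale=alpha ** -0.5)
                               * stats.gamma.pdf(alpha, a, scale=1.0 / b))
    return integrate.quad(integrand, 0.0, np.inf)[0]

for x in (-1.0, 0.5, 2.0):
    print(marginal(x), stats.t.pdf(x, df=nu, loc=mu, scale=lam ** -0.5))
```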
13.11. Derive the pair of recursions (13.62)-(13.63).
Solution: Our starting point is Eq. (13.60),
$$L(\alpha, \beta) = -\frac{N}{2}\ln(2\pi) - \frac{1}{2}\ln\big|\beta^{-1}I + \Phi A^{-1}\Phi^T\big| - \frac{1}{2}y^T\big(\beta^{-1}I + \Phi A^{-1}\Phi^T\big)^{-1}y.$$
By the matrix inversion lemma,
$$D := \big(\beta^{-1}I + \Phi A^{-1}\Phi^T\big)^{-1} = \beta I - \beta\Phi\big(A + \beta\Phi^T\Phi\big)^{-1}\Phi^T\beta. \quad (27)$$
Thus,
$$E := y^T\big(\beta^{-1}I + \Phi A^{-1}\Phi^T\big)^{-1}y = \beta y^T y - \beta y^T\Phi\Sigma\Phi^T\beta y = \beta y^T(y - \Phi\mu),$$
where $\Sigma = \big(A + \beta\Phi^T\Phi\big)^{-1}$ and $\mu = \beta\Sigma\Phi^T y$, as defined in the text.
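Both (27) and the expression for $E$ are quickly verified numerically (a sketch with assumed random data):

```python
# Check the Woodbury identity of Eq. (27) and E = beta * y^T (y - Phi mu).
import numpy as np

rng = np.random.default_rng(2)
N, K = 8, 3
Phi = rng.normal(size=(N, K))
y = rng.normal(size=N)
A = np.diag(rng.uniform(0.5, 2.0, size=K))
beta = 1.7

M = np.linalg.inv(Phi @ np.linalg.inv(A) @ Phi.T + np.eye(N) / beta)
Sigma = np.linalg.inv(A + beta * Phi.T @ Phi)
mu = beta * Sigma @ Phi.T @ y

# Matrix inversion lemma, Eq. (27)
D = beta * np.eye(N) - beta * Phi @ Sigma @ Phi.T * beta
print(np.max(np.abs(M - D)))                 # should be ~1e-12

# E = y^T D y = beta * y^T (y - Phi mu)
print(y @ M @ y, beta * y @ (y - Phi @ mu))  # the two values should agree
```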