Solutions to Problems of Chapter 12
12.1. Show that if
$$p(\mathbf{z}) = \mathcal{N}(\mathbf{z}\,|\,\boldsymbol{\mu}_z, \Sigma_z)$$
and
$$p(\mathbf{t}|\mathbf{z}) = \mathcal{N}(\mathbf{t}\,|\,A\mathbf{z}, \Sigma_{t|z}),$$
then
$$\mathbb{E}[\mathbf{z}|\mathbf{t}] = \left(\Sigma_z^{-1} + A^T\Sigma_{t|z}^{-1}A\right)^{-1}\left(A^T\Sigma_{t|z}^{-1}\mathbf{t} + \Sigma_z^{-1}\boldsymbol{\mu}_z\right).$$
Solution: We have shown in the Appendix of the chapter that
$$\mathbb{E}[\mathbf{z}|\mathbf{t}] := \boldsymbol{\mu}_{z|t} = \boldsymbol{\mu}_z + \left(\Sigma_z^{-1} + A^T\Sigma_{t|z}^{-1}A\right)^{-1}A^T\Sigma_{t|z}^{-1}(\mathbf{t} - A\boldsymbol{\mu}_z). \quad (1)$$
Hence, writing $\boldsymbol{\mu}_z = \left(\Sigma_z^{-1}+A^T\Sigma_{t|z}^{-1}A\right)^{-1}\left(\Sigma_z^{-1}+A^T\Sigma_{t|z}^{-1}A\right)\boldsymbol{\mu}_z$ and substituting into (1), the terms $\pm A^T\Sigma_{t|z}^{-1}A\boldsymbol{\mu}_z$ cancel and we are left with
$$\mathbb{E}[\mathbf{z}|\mathbf{t}] = \left(\Sigma_z^{-1}+A^T\Sigma_{t|z}^{-1}A\right)^{-1}\left(A^T\Sigma_{t|z}^{-1}\mathbf{t} + \Sigma_z^{-1}\boldsymbol{\mu}_z\right),$$
which is the required result.
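To make the equivalence of the two forms concrete, here is a minimal numerical check in Python (NumPy assumed; the dimensions, matrices, and vectors below are arbitrary test values, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
l, m = 3, 4   # dim(z), dim(t); arbitrary test sizes

def spd(n):
    """Random symmetric positive-definite matrix."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n*np.eye(n)

Sigma_z, Sigma_tz = spd(l), spd(m)
A = rng.standard_normal((m, l))
mu_z = rng.standard_normal(l)
t = rng.standard_normal(m)

P = np.linalg.inv(Sigma_z) + A.T @ np.linalg.inv(Sigma_tz) @ A

# Form (1): mu_z + P^{-1} A^T Sigma_{t|z}^{-1} (t - A mu_z)
form1 = mu_z + np.linalg.solve(P, A.T @ np.linalg.solve(Sigma_tz, t - A @ mu_z))

# Claimed form: P^{-1} (A^T Sigma_{t|z}^{-1} t + Sigma_z^{-1} mu_z)
form2 = np.linalg.solve(P, A.T @ np.linalg.solve(Sigma_tz, t)
                           + np.linalg.solve(Sigma_z, mu_z))
assert np.allclose(form1, form2)
```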
12.2. Let $\mathbf{x}\in\mathbb{R}^l$ be a random vector following the normal $\mathcal{N}(\mathbf{x}|\boldsymbol{\mu},\Sigma)$. Consider $\mathbf{x}_n,\ n=1,2,\ldots,N$, to be i.i.d. observations. If the prior for $\boldsymbol{\mu}$ follows $\mathcal{N}(\boldsymbol{\mu}|\boldsymbol{\mu}_0,\Sigma_0)$, show that the posterior $p(\boldsymbol{\mu}|\mathbf{x}_1,\ldots,\mathbf{x}_N)$ is normal, $\mathcal{N}(\boldsymbol{\mu}|\tilde{\boldsymbol{\mu}},\tilde{\Sigma})$, with
$$\tilde{\Sigma}^{-1} = \Sigma_0^{-1} + N\Sigma^{-1},$$
and
$$\tilde{\boldsymbol{\mu}} = \tilde{\Sigma}\left(\Sigma_0^{-1}\boldsymbol{\mu}_0 + N\Sigma^{-1}\bar{\mathbf{x}}\right),$$
where $\bar{\mathbf{x}} = \frac{1}{N}\sum_{n=1}^{N}\mathbf{x}_n$.
Solution: We have that
$$p(\boldsymbol{\mu}|\mathbf{x}_1,\ldots,\mathbf{x}_N) \propto p(\boldsymbol{\mu}|\boldsymbol{\mu}_0, Q_0^{-1})\prod_{n=1}^{N}p(\mathbf{x}_n|\boldsymbol{\mu};Q^{-1}) \quad (6)$$
$$= \frac{|Q|^{N/2}}{(2\pi)^{\frac{Nl}{2}}}\exp\left(-\frac{1}{2}\sum_{n=1}^{N}(\mathbf{x}_n-\boldsymbol{\mu})^TQ(\mathbf{x}_n-\boldsymbol{\mu})\right)\frac{|Q_0|^{1/2}}{(2\pi)^{\frac{l}{2}}}\exp\left(-\frac{1}{2}(\boldsymbol{\mu}-\boldsymbol{\mu}_0)^TQ_0(\boldsymbol{\mu}-\boldsymbol{\mu}_0)\right),$$
where $Q := \Sigma^{-1}$ and $Q_0 := \Sigma_0^{-1}$. Keeping only the terms that depend on $\boldsymbol{\mu}$, we get
$$p(\boldsymbol{\mu}|\mathbf{x}_1,\ldots,\mathbf{x}_N) \propto \exp\left(-\frac{1}{2}\boldsymbol{\mu}^T(Q_0+NQ)\boldsymbol{\mu} + \boldsymbol{\mu}^T\left(Q_0\boldsymbol{\mu}_0 + NQ\bar{\mathbf{x}}\right)\right),$$
which is the exponent of a Gaussian with precision $\tilde{\Sigma}^{-1} = Q_0 + NQ = \Sigma_0^{-1} + N\Sigma^{-1}$ and mean $\tilde{\boldsymbol{\mu}} = \tilde{\Sigma}\left(\Sigma_0^{-1}\boldsymbol{\mu}_0 + N\Sigma^{-1}\bar{\mathbf{x}}\right)$.
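As a sanity check, the following sketch compares the closed-form posterior mean with a brute-force grid posterior in the scalar case $l=1$ (Python/NumPy assumed; all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Scalar case (l = 1): prior N(mu0, s0^2), likelihood N(x | mu, s^2).
mu0, s0, s, N = -1.0, 2.0, 1.5, 20
x = rng.normal(0.7, s, size=N)
xbar = x.mean()

# Closed-form posterior from the solution above.
post_prec = 1/s0**2 + N/s**2                        # Sigma_tilde^{-1}
mu_tilde = (mu0/s0**2 + N*xbar/s**2) / post_prec    # mu_tilde

# Brute-force grid posterior for comparison.
grid = np.linspace(-5, 5, 20001)
log_post = (-0.5*(grid - mu0)**2/s0**2
            - 0.5*((x[:, None] - grid)**2).sum(axis=0)/s**2)
w = np.exp(log_post - log_post.max())
w /= w.sum()
assert abs((grid*w).sum() - mu_tilde) < 1e-3
```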
12.3. If $\mathcal{X}$ is the set of observed variables and $\mathcal{X}_l$ the set of the corresponding latent ones, show that
$$\frac{\partial \ln p(\mathcal{X};\boldsymbol{\xi})}{\partial \boldsymbol{\xi}} = \mathbb{E}\left[\frac{\partial \ln p(\mathcal{X},\mathcal{X}_l;\boldsymbol{\xi})}{\partial \boldsymbol{\xi}}\right],$$
where $\mathbb{E}[\cdot]$ is with respect to $p(\mathcal{X}_l|\mathcal{X};\boldsymbol{\xi})$ and $\boldsymbol{\xi}$ is an unknown vector parameter. Note that if one fixes the value of $\boldsymbol{\xi}$ in $p(\mathcal{X}_l|\mathcal{X};\boldsymbol{\xi})$, then one has obtained the M-step of the EM algorithm.
Solution: We have that
$$\frac{\partial \ln p(\mathcal{X};\boldsymbol{\xi})}{\partial \boldsymbol{\xi}} = \frac{\partial \ln \int_{-\infty}^{+\infty} p(\mathcal{X},\mathcal{X}_l;\boldsymbol{\xi})\,d\mathcal{X}_l}{\partial \boldsymbol{\xi}} = \frac{1}{\int_{-\infty}^{+\infty} p(\mathcal{X},\mathcal{X}_l;\boldsymbol{\xi})\,d\mathcal{X}_l}\,\frac{\partial \int_{-\infty}^{+\infty} p(\mathcal{X},\mathcal{X}_l;\boldsymbol{\xi})\,d\mathcal{X}_l}{\partial \boldsymbol{\xi}}$$
$$= \int_{-\infty}^{+\infty}\frac{1}{p(\mathcal{X};\boldsymbol{\xi})}\frac{\partial p(\mathcal{X},\mathcal{X}_l;\boldsymbol{\xi})}{\partial \boldsymbol{\xi}}\,d\mathcal{X}_l = \int_{-\infty}^{+\infty}\frac{p(\mathcal{X},\mathcal{X}_l;\boldsymbol{\xi})}{p(\mathcal{X};\boldsymbol{\xi})}\frac{\partial \ln p(\mathcal{X},\mathcal{X}_l;\boldsymbol{\xi})}{\partial \boldsymbol{\xi}}\,d\mathcal{X}_l$$
$$= \int_{-\infty}^{+\infty} p(\mathcal{X}_l|\mathcal{X};\boldsymbol{\xi})\frac{\partial \ln p(\mathcal{X},\mathcal{X}_l;\boldsymbol{\xi})}{\partial \boldsymbol{\xi}}\,d\mathcal{X}_l = \mathbb{E}\left[\frac{\partial \ln p(\mathcal{X},\mathcal{X}_l;\boldsymbol{\xi})}{\partial \boldsymbol{\xi}}\right].$$
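A minimal numerical illustration of this identity, using a two-component Gaussian mixture with a single observation and the latent component label as $\mathcal{X}_l$ (SciPy assumed; the mixture and all parameter values are hypothetical test choices):

```python
import numpy as np
from scipy.stats import norm

# Single observation x; latent k in {0, 1}; xi = mean of component 0.
x, xi, pi0 = 1.3, 0.4, 0.6
mu1, s = -2.0, 1.0

def marg(xi_):
    # Marginal likelihood p(x; xi), the latent label summed out.
    return pi0*norm.pdf(x, xi_, s) + (1 - pi0)*norm.pdf(x, mu1, s)

# Left-hand side: finite-difference gradient of ln p(x; xi).
eps = 1e-6
lhs = (np.log(marg(xi + eps)) - np.log(marg(xi - eps))) / (2*eps)

# Right-hand side: E over p(k|x; xi) of d/dxi ln p(x, k; xi).
# Only k = 0 depends on xi: d/dxi ln[pi0 N(x|xi, s^2)] = (x - xi)/s^2.
post0 = pi0*norm.pdf(x, xi, s) / marg(xi)
rhs = post0 * (x - xi)/s**2

assert abs(lhs - rhs) < 1e-6
```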
12.4. Show equation (12.42).
Solution: By the definition of Eq. (12.40), in the case where the hyperparameter vector is considered to be random, we have,
12.5. Let $\mathbf{y}\in\mathbb{R}^N$, $\boldsymbol{\theta}\in\mathbb{R}^l$ and $\Phi$ a matrix of appropriate dimensions. Derive the expected value of $\|\mathbf{y}-\Phi\boldsymbol{\theta}\|^2$ with respect to $\boldsymbol{\theta}$, given $\mathbb{E}[\boldsymbol{\theta}]$ and the corresponding covariance matrix $\Sigma_\theta$.
Solution: Let $\boldsymbol{\phi} = \mathbf{y} - \Phi\boldsymbol{\theta}$. By definition we have
$$\Sigma_\phi = \mathbb{E}\left[(\boldsymbol{\phi}-\mathbb{E}[\boldsymbol{\phi}])(\boldsymbol{\phi}-\mathbb{E}[\boldsymbol{\phi}])^T\right] = \mathbb{E}[\boldsymbol{\phi}\boldsymbol{\phi}^T] - \mathbb{E}[\boldsymbol{\phi}]\,\mathbb{E}[\boldsymbol{\phi}]^T, \quad (8)$$
with $\mathbb{E}[\boldsymbol{\phi}] = \mathbf{y} - \Phi\mathbb{E}[\boldsymbol{\theta}]$ and $\Sigma_\phi = \Phi\Sigma_\theta\Phi^T$. Hence,
$$\mathbb{E}\left[\|\mathbf{y}-\Phi\boldsymbol{\theta}\|^2\right] = \mathbb{E}[\boldsymbol{\phi}^T\boldsymbol{\phi}] = \mathrm{trace}\left\{\mathbb{E}[\boldsymbol{\phi}\boldsymbol{\phi}^T]\right\} = \|\mathbf{y}-\Phi\mathbb{E}[\boldsymbol{\theta}]\|^2 + \mathrm{trace}\left\{\Phi\Sigma_\theta\Phi^T\right\}.$$
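The result holds for any distribution of $\boldsymbol{\theta}$ with the given first two moments; the sketch below checks it by Monte Carlo, drawing $\boldsymbol{\theta}$ from a Gaussian purely for convenience (NumPy assumed; all values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
N, l = 5, 3
y = rng.standard_normal(N)
Phi = rng.standard_normal((N, l))
theta_mean = rng.standard_normal(l)
L = rng.standard_normal((l, l))
Sigma_theta = L @ L.T + np.eye(l)          # a random SPD covariance

# Closed form from the solution: ||y - Phi E[theta]||^2 + trace(Phi Sigma_theta Phi^T)
closed = np.sum((y - Phi @ theta_mean)**2) + np.trace(Phi @ Sigma_theta @ Phi.T)

# Monte Carlo estimate over theta ~ N(E[theta], Sigma_theta).
thetas = rng.multivariate_normal(theta_mean, Sigma_theta, size=200_000)
mc = np.mean(np.sum((y - thetas @ Phi.T)**2, axis=1))
print(closed, mc)   # the two agree up to Monte Carlo error
```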
12.6. Derive recursions (12.60)-(12.62).
Solution: Recall from Eq. (12.59) of the text that
$$Q(\Xi,P;\Xi^{(j)},P^{(j)}) = \sum_{n=1}^{N}\mathbb{E}\left[\ln\left(p(\mathbf{x}_n|k_n;\boldsymbol{\xi}_{k_n})P_{k_n}\right)\right] := \sum_{n=1}^{N}\sum_{k=1}^{K}P(k|\mathbf{x}_n;\Xi^{(j)},P^{(j)})\ln\left(p(\mathbf{x}_n|k;\boldsymbol{\xi}_k)P_k\right).$$
Recursion for the mean: Substituting the Gaussian $p(\mathbf{x}_n|k;\boldsymbol{\xi}_k) = \mathcal{N}(\mathbf{x}_n|\boldsymbol{\mu}_k,\sigma_k^2 I)$, taking the gradient with respect to $\boldsymbol{\mu}_k$ and equating to zero gives
$$\sum_{n=1}^{N}P(k|\mathbf{x}_n;\Xi^{(j)},P^{(j)})\,\frac{\mathbf{x}_n-\boldsymbol{\mu}_k}{\sigma_k^2} = \mathbf{0},$$
which leads to the recursion
$$\boldsymbol{\mu}_k^{(j+1)} = \frac{\sum_{n=1}^{N}P(k|\mathbf{x}_n;\Xi^{(j)},P^{(j)})\,\mathbf{x}_n}{\sum_{n=1}^{N}P(k|\mathbf{x}_n;\Xi^{(j)},P^{(j)})}.$$
Recursion for the variance: Eq. (12.59) is now differentiated with respect to $\sigma_k^2$; each point contributes the terms $-\frac{l}{2\sigma_k^2} + \frac{\|\mathbf{x}_n-\boldsymbol{\mu}_k\|^2}{2\sigma_k^4}$, and setting the responsibility-weighted sum to zero yields
$$\sigma_k^{2(j+1)} = \frac{\sum_{n=1}^{N}P(k|\mathbf{x}_n;\Xi^{(j)},P^{(j)})\,\|\mathbf{x}_n-\boldsymbol{\mu}_k^{(j+1)}\|^2}{l\sum_{n=1}^{N}P(k|\mathbf{x}_n;\Xi^{(j)},P^{(j)})}.$$
Recursion for the probabilities: The maximization of $Q$ with respect to the $P_k$'s must respect the constraint
$$\sum_{k=1}^{K}P_k = 1.$$
Thus, we have a constrained optimization task. Using Lagrange multipliers, we maximize $\sum_{n=1}^{N}\sum_{k=1}^{K}P(k|\mathbf{x}_n;\Xi^{(j)},P^{(j)})\ln P_k + \lambda\left(1-\sum_{k=1}^{K}P_k\right)$; setting the derivative with respect to $P_k$ to zero gives $P_k = \frac{1}{\lambda}\sum_{n=1}^{N}P(k|\mathbf{x}_n;\Xi^{(j)},P^{(j)})$, and summing over $k$ fixes $\lambda = N$, so that
$$P_k^{(j+1)} = \frac{1}{N}\sum_{n=1}^{N}P(k|\mathbf{x}_n;\Xi^{(j)},P^{(j)}).$$
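The three recursions are the familiar EM updates for a Gaussian mixture. Below is a minimal one-dimensional sketch (Python with NumPy/SciPy assumed; the synthetic data, initialization, and iteration count are illustrative choices, not from the text):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
# Synthetic 1-D data from a two-component mixture (illustrative values).
x = np.concatenate([rng.normal(-2, 0.8, 300), rng.normal(3, 1.2, 700)])
N, K = len(x), 2

mu = np.array([-1.0, 1.0])          # initial means
var = np.ones(K)                    # initial variances
P = np.full(K, 1/K)                 # initial mixing probabilities

for _ in range(100):
    # E-step: responsibilities P(k | x_n; current parameters).
    like = norm.pdf(x[:, None], mu, np.sqrt(var)) * P   # shape (N, K)
    g = like / like.sum(axis=1, keepdims=True)
    Nk = g.sum(axis=0)
    # M-step: the three recursions derived above (l = 1 here).
    mu = (g * x[:, None]).sum(axis=0) / Nk
    var = (g * (x[:, None] - mu)**2).sum(axis=0) / Nk
    P = Nk / N

print(mu, np.sqrt(var), P)   # recovers roughly (-2, 3), (0.8, 1.2), (0.3, 0.7)
```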
12.7. Show that the Kullback-Leibler divergence $KL(p\|q)$ is a nonnegative quantity.
Hint: Recall that $\ln(\cdot)$ is a concave function and use Jensen's inequality, that is,
$$f\left(\int g(x)p(x)\,dx\right) \leq \int f(g(x))\,p(x)\,dx,$$
where $p(x)$ is a pdf and $f$ is a convex function.
Solution: By the definition of the $KL(q\|p)$ divergence and the fact that $-\ln(\cdot)$ is a convex function, we have, using Jensen's inequality with $f = -\ln$ and $g = p/q$,
$$KL(q\|p) = \int q(x)\ln\frac{q(x)}{p(x)}\,dx = \int q(x)\left(-\ln\frac{p(x)}{q(x)}\right)dx \geq -\ln\int q(x)\frac{p(x)}{q(x)}\,dx = -\ln\int p(x)\,dx = -\ln 1 = 0.$$
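A quick numerical illustration with two random discrete pmfs (Python/NumPy assumed; the support size and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
# Random discrete pmfs q and p on 10 outcomes.
q = rng.random(10); q /= q.sum()
p = rng.random(10); p /= p.sum()

kl = np.sum(q * np.log(q / p))
assert kl >= 0                                      # nonnegativity
assert np.isclose(np.sum(q * np.log(q / q)), 0.0)   # KL(q||q) = 0
print(kl)
```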
12.8. Prove that the binomial and beta distributions are conjugate pairs with respect to the mean value.
Solution: The binomial distribution is given by
$$P(x|\mu) = \binom{N}{x}\mu^{x}(1-\mu)^{N-x}, \quad x = 0,1,\ldots,N.$$
Adopting a beta prior on $\mu$,
$$p(\mu) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\mu^{a-1}(1-\mu)^{b-1},$$
the posterior becomes
$$p(\mu|x) \propto P(x|\mu)\,p(\mu) \propto \mu^{x+a-1}(1-\mu)^{N-x+b-1},$$
which is again a beta distribution, with parameters $a+x$ and $b+N-x$; hence the pair is conjugate.
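The conjugate update can be cross-checked numerically against direct normalization of likelihood times prior (SciPy assumed; the hyperparameters and observation below are arbitrary):

```python
import numpy as np
from scipy.stats import beta, binom

# Beta(a, b) prior on mu; a single binomial observation: x successes in N trials.
a, b, N, x = 2.0, 3.0, 20, 14

# Conjugacy: the posterior is Beta(a + x, b + N - x).
post = beta(a + x, b + N - x)

# Cross-check the posterior mean by direct numerical normalization.
mu = np.linspace(1e-6, 1 - 1e-6, 100001)
unnorm = binom.pmf(x, N, mu) * beta.pdf(mu, a, b)
num_mean = (mu*unnorm).sum() / unnorm.sum()
assert abs(num_mean - post.mean()) < 1e-4
```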
12.9. Show that the normalizing constant $C$ in the Dirichlet pdf
$$\mathrm{Dir}(\mathbf{x}|\mathbf{a}) = C\prod_{k=1}^{K}x_k^{a_k-1}, \quad \sum_{k=1}^{K}x_k = 1,$$
is given by
$$C = \frac{\Gamma(a_1+a_2+\cdots+a_K)}{\Gamma(a_1)\Gamma(a_2)\cdots\Gamma(a_K)}.$$
Solution: We will integrate $x_{K-1}$ out. With $x_K = 1 - \sum_{k=1}^{K-1}x_k$ and $S := \sum_{k=1}^{K-2}x_k$, we get
$$p(x_1, x_2, \ldots, x_{K-2}) = C\prod_{k=1}^{K-2}x_k^{a_k-1}\int_0^{1-S}x_{K-1}^{a_{K-1}-1}\left(1 - S - x_{K-1}\right)^{a_K-1}dx_{K-1}. \quad (13)$$
Changing variables to $x_{K-1} = t(1-S)$, $t\in[0,1]$, this becomes
$$p(x_1, \ldots, x_{K-2}) = C\prod_{k=1}^{K-2}x_k^{a_k-1}\,(1-S)^{a_{K-1}+a_K-1}\int_0^1 t^{a_{K-1}-1}(1-t)^{a_K-1}\,dt. \quad (16)$$
The integral is the beta function $B(a_{K-1},a_K) = \frac{\Gamma(a_{K-1})\Gamma(a_K)}{\Gamma(a_{K-1}+a_K)}$, and what remains is a Dirichlet pdf over $K-1$ variables with parameters $(a_1,\ldots,a_{K-2},a_{K-1}+a_K)$; hence, by the induction assumption,
$$C\,\frac{\Gamma(a_{K-1})\Gamma(a_K)}{\Gamma(a_{K-1}+a_K)} = \frac{\Gamma(a_1+a_2+\cdots+a_{K-2}+a_{K-1}+a_K)}{\Gamma(a_1)\cdots\Gamma(a_{K-2})\Gamma(a_{K-1}+a_K)},$$
or
$$C = \frac{\Gamma(a_1+a_2+\cdots+a_K)}{\Gamma(a_1)\cdots\Gamma(a_K)}.$$
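As a numerical sanity check of the constant, the following sketch integrates the unnormalized Dirichlet density over the simplex for $K=3$ and compares against the gamma-function formula (Python/NumPy assumed; the parameter values are arbitrary):

```python
import numpy as np
from math import gamma

# K = 3: integrate x1^(a1-1) x2^(a2-1) x3^(a3-1) over the simplex,
# with x3 = 1 - x1 - x2, and compare with the gamma-function formula.
a = (2.0, 3.0, 1.5)
C = gamma(sum(a)) / np.prod([gamma(ak) for ak in a])

h = 1e-3
x1, x2 = np.meshgrid(np.arange(h/2, 1, h), np.arange(h/2, 1, h))
x3 = 1 - x1 - x2
mask = x3 > 0
integral = (x1[mask]**(a[0]-1) * x2[mask]**(a[1]-1)
            * x3[mask]**(a[2]-1)).sum() * h * h

assert abs(C*integral - 1) < 1e-2   # C normalizes the density to ~1
```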
12.10. Show that $\mathcal{N}(\mathbf{x}|\boldsymbol{\mu},\Sigma)$, for known $\Sigma$, is of an exponential form and that its conjugate prior is also Gaussian.
Solution:
$$p(\mathbf{x}|\boldsymbol{\mu}) = \frac{1}{(2\pi)^{l/2}|\Sigma|^{1/2}}\exp\left(-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^T\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\right) = \underbrace{\frac{\exp\left(-\frac{1}{2}\boldsymbol{\mu}^T\Sigma^{-1}\boldsymbol{\mu}\right)}{(2\pi)^{l/2}|\Sigma|^{1/2}}}_{g(\boldsymbol{\mu})}\,\underbrace{\exp\left(-\frac{1}{2}\mathbf{x}^T\Sigma^{-1}\mathbf{x}\right)}_{h(\mathbf{x})}\,\exp\left(\boldsymbol{\mu}^T\Sigma^{-1}\mathbf{x}\right),$$
which is of the exponential form with natural parameter $\boldsymbol{\eta} = \Sigma^{-1}\boldsymbol{\mu}$ and sufficient statistic $\mathbf{u}(\mathbf{x}) = \mathbf{x}$. The conjugate prior must then be of the form
$$p(\boldsymbol{\mu}) \propto g(\boldsymbol{\mu})^{\nu}\exp\left(\boldsymbol{\mu}^T\Sigma^{-1}\mathbf{m}\right) \propto \exp\left(-\frac{\nu}{2}\boldsymbol{\mu}^T\Sigma^{-1}\boldsymbol{\mu} + \boldsymbol{\mu}^T\Sigma^{-1}\mathbf{m}\right),$$
which, being the exponential of a quadratic function of $\boldsymbol{\mu}$, is itself a Gaussian.
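The factorization $g(\boldsymbol{\mu})h(\mathbf{x})\exp(\boldsymbol{\mu}^T\Sigma^{-1}\mathbf{x})$ can be verified directly (NumPy assumed; sizes and values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
l = 3
M = rng.standard_normal((l, l))
Sigma = M @ M.T + l*np.eye(l)     # random SPD covariance
Q = np.linalg.inv(Sigma)
mu = rng.standard_normal(l)
x = rng.standard_normal(l)

# Direct Gaussian density.
norm_const = (2*np.pi)**(l/2) * np.sqrt(np.linalg.det(Sigma))
direct = np.exp(-0.5*(x - mu) @ Q @ (x - mu)) / norm_const

# Exponential-family factorization g(mu) h(x) exp(mu^T Q x).
g = np.exp(-0.5*mu @ Q @ mu) / norm_const
h = np.exp(-0.5*x @ Q @ x)
assert np.isclose(direct, g*h*np.exp(mu @ Q @ x))
```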
12.11. Show that the conjugate prior of the multivariate Gaussian with respect
to the precision matrix, Q, is a Wishart distribution.
Solution: We have that
$$p(\mathbf{x}|Q) = \frac{|Q|^{1/2}}{(2\pi)^{l/2}}\exp\left(-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^TQ(\mathbf{x}-\boldsymbol{\mu})\right) = \frac{|Q|^{1/2}}{(2\pi)^{l/2}}\exp\left(-\frac{1}{2}\left[\mathbf{q}_1^T, \mathbf{q}_2^T, \ldots, \mathbf{q}_l^T\right]\mathbf{u}(\mathbf{x})\right),$$
where
$$Q := \begin{bmatrix}\mathbf{q}_1^T\\ \mathbf{q}_2^T\\ \vdots\\ \mathbf{q}_l^T\end{bmatrix}$$
and $\mathbf{u}(\mathbf{x})$ comprises the columns of $(\mathbf{x}-\boldsymbol{\mu})(\mathbf{x}-\boldsymbol{\mu})^T$, stacked in sequence one below the other. Hence $p(\mathbf{x}|Q)$ is of an exponential form. Its conjugate prior is therefore of the form
$$p(Q) \propto |Q|^{\frac{\nu-l-1}{2}}\exp\left(-\frac{1}{2}\mathrm{trace}\{W^{-1}Q\}\right),$$
which is a Wishart distribution, and the normalizing constant then necessarily becomes the Wishart one,
$$B(W,\nu) = |W|^{-\nu/2}\left(2^{\frac{\nu l}{2}}\,\pi^{\frac{l(l-1)}{4}}\prod_{j=1}^{l}\Gamma\left(\frac{\nu+1-j}{2}\right)\right)^{-1}.$$
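A quick way to check the conjugacy numerically is to verify that likelihood times prior, divided by the claimed Wishart posterior, is constant in $Q$ (the constant being the evidence). A sketch using SciPy's Wishart parameterization, which matches the form above; the data and the posterior hyperparameters $\nu+N$ and $(W^{-1}+S)^{-1}$, obtained by multiplying the $N$-sample likelihood into the prior, are illustrative assumptions:

```python
import numpy as np
from scipy.stats import wishart

rng = np.random.default_rng(6)
l, N = 2, 15
mu = np.zeros(l)                          # known mean
X = rng.standard_normal((N, l))           # synthetic data
S = (X - mu).T @ (X - mu)                 # scatter matrix

nu0, W0 = 4, np.eye(l)                    # Wishart prior on the precision Q
nu_post = nu0 + N
W_post = np.linalg.inv(np.linalg.inv(W0) + S)

def log_ratio(Q):
    # log[likelihood(Q) * prior(Q)] - log posterior(Q); Q-independent
    # constants of the likelihood are dropped, so only constancy matters.
    loglik = 0.5*N*np.log(np.linalg.det(Q)) - 0.5*np.trace(S @ Q)
    return (loglik + wishart.logpdf(Q, df=nu0, scale=W0)
            - wishart.logpdf(Q, df=nu_post, scale=W_post))

Q1 = wishart.rvs(df=5, scale=np.eye(l), random_state=rng)
Q2 = wishart.rvs(df=5, scale=np.eye(l), random_state=rng)
assert np.isclose(log_ratio(Q1), log_ratio(Q2))
```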
12.12. Show that the conjugate prior of the univariate Gaussian $\mathcal{N}(x|\mu,\sigma^2)$, with respect to the mean and the precision $\beta = \frac{1}{\sigma^2}$, is the Gaussian-gamma product
$$p(\mu,\beta;\lambda,\mathbf{v}) = \mathcal{N}\left(\mu\,\Big|\,\frac{v_2}{\lambda},(\lambda\beta)^{-1}\right)\mathrm{Gamma}\left(\beta\,\Big|\,\frac{\lambda+1}{2},\ \frac{v_1}{2}-\frac{v_2^2}{2\lambda}\right),$$
where $\mathbf{v} := [v_1, v_2]^T$.
Solution: We have shown in the respective section in the book that
$$p(\mu,\beta;\lambda,\mathbf{v}) = h(\lambda,\mathbf{v})\,\beta^{\frac{\lambda}{2}}\exp\left(-\frac{\lambda\beta\mu^2}{2}\right)\exp\left(\left[-\frac{\beta}{2},\ \beta\mu\right]\begin{bmatrix}v_1\\ v_2\end{bmatrix}\right).$$
Completing the square in the exponent,
$$-\frac{\lambda\beta}{2}\mu^2 + \beta\mu v_2 = -\frac{\lambda\beta}{2}\left(\mu - \frac{v_2}{\lambda}\right)^2 + \frac{\beta v_2^2}{2\lambda},$$
so that, splitting $\beta^{\lambda/2} = \beta^{1/2}\beta^{(\lambda-1)/2}$,
$$p(\mu,\beta;\lambda,\mathbf{v}) \propto \underbrace{\beta^{\frac{1}{2}}\exp\left(-\frac{\lambda\beta}{2}\left(\mu-\frac{v_2}{\lambda}\right)^2\right)}_{\propto\,\mathcal{N}\left(\mu\,|\,\frac{v_2}{\lambda},(\lambda\beta)^{-1}\right)}\ \underbrace{\beta^{\frac{\lambda+1}{2}-1}\exp\left(-\beta\left(\frac{v_1}{2}-\frac{v_2^2}{2\lambda}\right)\right)}_{\propto\,\mathrm{Gamma}\left(\beta\,|\,\frac{\lambda+1}{2},\ \frac{v_1}{2}-\frac{v_2^2}{2\lambda}\right)},$$
which is the claimed Gaussian-gamma product.
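The factorization can be checked numerically: the unnormalized prior and the Gaussian-gamma product must differ by a $(\mu,\beta)$-independent constant (SciPy assumed; the hyperparameters and test points are arbitrary, chosen so that $\frac{v_1}{2} > \frac{v_2^2}{2\lambda}$):

```python
import numpy as np
from scipy.stats import norm, gamma

lam, v1, v2 = 3.0, 4.0, 1.5   # arbitrary hyperparameters

def log_unnorm(mu, beta):
    # beta^(lam/2) exp(-lam*beta*mu^2/2) exp(-beta*v1/2 + beta*mu*v2)
    return (lam/2)*np.log(beta) - lam*beta*mu**2/2 - beta*v1/2 + beta*mu*v2

def log_factored(mu, beta):
    # Gaussian-gamma factorization obtained in the solution.
    return (norm.logpdf(mu, loc=v2/lam, scale=1/np.sqrt(lam*beta))
            + gamma.logpdf(beta, a=(lam + 1)/2, scale=1/(v1/2 - v2**2/(2*lam))))

# The two must differ by a mu- and beta-independent constant.
d1 = log_unnorm(0.7, 2.2) - log_factored(0.7, 2.2)
d2 = log_unnorm(-1.1, 0.9) - log_factored(-1.1, 0.9)
assert np.isclose(d1, d2)
```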
12.13. Show that the multivariate Gaussian $\mathcal{N}(\mathbf{x}|\boldsymbol{\mu},Q^{-1})$ has as a conjugate prior, with respect to the mean and the precision matrix $Q$, the Gaussian-Wishart product.
Solution: Let
$$p(\mathbf{x}|\boldsymbol{\mu},Q) = \frac{|Q|^{1/2}}{(2\pi)^{l/2}}\exp\left(-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^TQ(\mathbf{x}-\boldsymbol{\mu})\right),$$
and adopt as prior the Gaussian-Wishart product
$$p(\boldsymbol{\mu},Q) \propto |Q|^{1/2}\exp\left(-\frac{1}{2}(\boldsymbol{\mu}-\boldsymbol{\mu}_0)^T(\lambda Q)(\boldsymbol{\mu}-\boldsymbol{\mu}_0)\right)|Q|^{\frac{\nu-l-1}{2}}\exp\left(-\frac{1}{2}\mathrm{trace}\{W^{-1}Q\}\right).$$
Multiplying prior and likelihood and completing the square with respect to $\boldsymbol{\mu}$, the product, after some trivial manipulations, becomes
$$p(\boldsymbol{\mu},Q|\mathbf{x}) \propto |Q|^{1/2}\exp\left(-\frac{1}{2}(\boldsymbol{\mu}-\tilde{\boldsymbol{\mu}})^T(\tilde{\lambda}Q)(\boldsymbol{\mu}-\tilde{\boldsymbol{\mu}})\right)|Q|^{\frac{\tilde{\nu}-l-1}{2}}\exp\left(-\frac{1}{2}\mathrm{trace}\{\tilde{W}^{-1}Q\}\right),$$
with
$$\tilde{\lambda} = \lambda+1, \quad \tilde{\nu} = \nu+1, \quad \tilde{\boldsymbol{\mu}} = \frac{\lambda\boldsymbol{\mu}_0+\mathbf{x}}{\lambda+1}, \quad \tilde{W}^{-1} = W^{-1} + \frac{\lambda}{\lambda+1}(\mathbf{x}-\boldsymbol{\mu}_0)(\mathbf{x}-\boldsymbol{\mu}_0)^T.$$
That is, the posterior is again a Gaussian-Wishart product, which proves conjugacy.
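As in the previous problem, conjugacy can be verified by checking that likelihood times prior, divided by the posterior with the updated hyperparameters above, is constant in $(\boldsymbol{\mu},Q)$ (SciPy assumed; data and hyperparameters are arbitrary test values):

```python
import numpy as np
from scipy.stats import multivariate_normal as mvn, wishart

rng = np.random.default_rng(8)
l, lam, nu = 2, 2.0, 5.0
mu0, W = np.zeros(l), np.eye(l)
x = rng.standard_normal(l)                 # a single observation

# Posterior hyperparameters read off from the solution above.
lam_t, nu_t = lam + 1, nu + 1
mu_t = (lam*mu0 + x) / (lam + 1)
W_t = np.linalg.inv(np.linalg.inv(W)
                    + lam/(lam + 1)*np.outer(x - mu0, x - mu0))

def log_ratio(mu, Q):
    joint = (mvn.logpdf(x, mu, np.linalg.inv(Q))
             + mvn.logpdf(mu, mu0, np.linalg.inv(lam*Q))
             + wishart.logpdf(Q, df=nu, scale=W))
    post = (mvn.logpdf(mu, mu_t, np.linalg.inv(lam_t*Q))
            + wishart.logpdf(Q, df=nu_t, scale=W_t))
    return joint - post    # constant in (mu, Q) iff conjugacy holds

mu_a, Q_a = rng.standard_normal(l), wishart.rvs(5, np.eye(l), random_state=rng)
mu_b, Q_b = rng.standard_normal(l), wishart.rvs(5, np.eye(l), random_state=rng)
assert np.isclose(log_ratio(mu_a, Q_a), log_ratio(mu_b, Q_b))
```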
12.14. Show that the distribution
$$P(x|\mu) = \mu^x(1-\mu)^{1-x}, \quad x\in\{0,1\},$$
is of an exponential form and derive its conjugate prior with respect to $\mu$.
Solution: We have
$$P(x|\mu) = \exp\left(\ln\left(\mu^x(1-\mu)^{1-x}\right)\right) = \exp\left(x\ln\mu + (1-x)\ln(1-\mu)\right) = (1-\mu)\exp\left(x\ln\frac{\mu}{1-\mu}\right),$$
which is of the exponential form with natural parameter $\eta = \ln\frac{\mu}{1-\mu}$ and sufficient statistic $u(x) = x$. The conjugate prior is therefore of the form
$$p(\mu) \propto (1-\mu)^{\nu}\exp\left(\lambda\ln\frac{\mu}{1-\mu}\right) = \mu^{\lambda}(1-\mu)^{\nu-\lambda},$$
which is a beta distribution.
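The resulting Bernoulli-beta conjugacy can be exercised numerically (SciPy assumed; the prior, the true parameter, and the sample size are illustrative):

```python
import numpy as np
from scipy.stats import beta, bernoulli

rng = np.random.default_rng(7)
a, b = 2.0, 2.0                                    # Beta prior on mu
x = bernoulli.rvs(0.3, size=50, random_state=rng)  # synthetic 0/1 data

# Conjugate update: Beta(a + #ones, b + #zeros).
post = beta(a + x.sum(), b + len(x) - x.sum())

# Grid cross-check of the posterior mean.
mu = np.linspace(1e-6, 1 - 1e-6, 100001)
unnorm = beta.pdf(mu, a, b) * mu**x.sum() * (1 - mu)**(len(x) - x.sum())
assert abs((mu*unnorm).sum()/unnorm.sum() - post.mean()) < 1e-4
```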
12.15. Show that estimating an unknown pdf by maximizing the respective entropy, subject to a set of empirical expectations, results in a pdf that belongs to the exponential family.
Solution: The problem is cast as
$$\text{maximize w.r.t. } p(x): \quad -\int_{A_x} p(x)\ln p(x)\,dx,$$
subject to the constraints
$$\int_{A_x} p(x)\,dx = 1, \qquad \int_{A_x} p(x)f_i(x)\,dx = F_i, \quad i\in I,$$
where the $F_i$'s are the empirical expectations of the functions $f_i$. Introducing Lagrange multipliers $\lambda_0$ and $\lambda_i,\ i\in I$, the Lagrangian becomes
$$L(p) = -\int_{A_x}p(x)\ln p(x)\,dx + \lambda_0\left(\int_{A_x}p(x)\,dx - 1\right) + \sum_{i\in I}\lambda_i\left(\int_{A_x}p(x)f_i(x)\,dx - F_i\right).$$
Setting the functional derivative with respect to $p(x)$ equal to zero yields
$$-\ln p(x) - 1 + \lambda_0 + \sum_{i\in I}\lambda_i f_i(x) = 0,$$
or
$$p(x) = \exp\left(\lambda_0 - 1 + \sum_{i\in I}\lambda_i f_i(x)\right),$$
which is a member of the exponential family, with the multipliers fixed by the constraints.
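A small discrete instance makes the exponential form tangible: the maximum-entropy pmf on $\{1,\ldots,6\}$ with a single mean constraint is a Gibbs distribution $p(x)\propto e^{\lambda x}$, and $\lambda$ can be found by root finding (SciPy assumed; the target mean 4.5 is an arbitrary illustrative constraint):

```python
import numpy as np
from scipy.optimize import brentq

# Maximum-entropy pmf on {1,...,6} with the empirical constraint E[x] = 4.5.
# By the derivation above, the optimum is exponential: p(x) ∝ exp(lam * x).
xs = np.arange(1, 7)

def mean_under(lam):
    w = np.exp(lam * xs)
    return (xs * w).sum() / w.sum()

lam = brentq(lambda l: mean_under(l) - 4.5, -10, 10)
p = np.exp(lam * xs)
p /= p.sum()
print(lam, p, (p * xs).sum())   # Gibbs-form pmf meeting the constraint
```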