Tutorial 3 Problem Set
1. Verify that the quadratic function y = x^2 + x + 1 and the logarithmic function y = −log(x) are convex functions. Is the function y = 2x^3 − x^2 a convex function? Justify your answer.
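As a numerical cross-check (not a proof), one can evaluate the hand-derived second derivatives on a grid: a twice-differentiable function is convex exactly when its second derivative is non-negative on its domain. This sketch uses NumPy; the grid bounds are arbitrary choices.

```python
import numpy as np

# Second derivatives, derived by hand:
#   f(x) = x^2 + x + 1      -> f''(x) = 2          (>= 0 everywhere: convex)
#   g(x) = -log(x), x > 0   -> g''(x) = 1/x^2      (>= 0 on the domain: convex)
#   h(x) = 2x^3 - x^2       -> h''(x) = 12x - 2    (changes sign at x = 1/6: not convex)

xs = np.linspace(0.01, 5.0, 500)          # sample grid inside x > 0
assert np.all(2 * np.ones_like(xs) >= 0)  # f'' >= 0 everywhere
assert np.all(1.0 / xs**2 >= 0)           # g'' >= 0 on x > 0
print(bool(np.any(12 * xs - 2 < 0)))      # h'' is negative somewhere -> not convex
```

A grid check like this can refute convexity (one negative value of the second derivative suffices) but can only suggest, not prove, convexity; the proof itself is the algebra in the comments.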
2. Suppose we have a concave function ψ(x). Show that E[ψ(X)] ≤ ψ(E[X]). Recall the definition of a concave function: given ψ : Ω → R where Ω ⊆ R, for any pair of points x1, x2 ∈ Ω and any t ∈ [0, 1], a concave function satisfies ψ(t x1 + (1 − t) x2) ≥ t ψ(x1) + (1 − t) ψ(x2).
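A quick Monte Carlo illustration of the inequality (Jensen's inequality for concave ψ). The choices ψ = √· and X exponential are arbitrary assumptions made for the demo; any concave ψ and any distribution on its domain would do.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.exponential(scale=2.0, size=100_000)  # arbitrary non-negative distribution

psi = np.sqrt              # sqrt is concave on [0, inf)
lhs = psi(X).mean()        # Monte Carlo estimate of E[psi(X)]
rhs = psi(X.mean())        # psi applied to the sample mean, i.e. psi(E[X])
print(lhs <= rhs)          # True, as Jensen's inequality predicts
```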
3. Let p and q be the density functions of two probability distributions. Consider the Kullback–Leibler (KL) divergence of q from p,
DKL(p∥q) := Ep[log(p(X)/q(X))] = ∫ p(x) log(p(x)/q(x)) dx,
where Ep means the expectation with respect to p. Show that the KL divergence is always non-negative.
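The non-negativity (Gibbs' inequality) can be sanity-checked numerically on random discrete distributions. This sketch assumes finite discrete supports and natural logarithms:

```python
import numpy as np

rng = np.random.default_rng(1)

def kl(p, q):
    """Discrete KL divergence D_KL(p || q) = sum_x p(x) log(p(x) / q(x))."""
    return float(np.sum(p * np.log(p / q)))

for _ in range(1000):
    p = rng.random(10); p /= p.sum()   # random distribution on 10 points
    q = rng.random(10); q /= q.sum()
    assert kl(p, q) >= 0.0             # Gibbs' inequality
    assert abs(kl(p, p)) < 1e-12       # divergence of p from itself is zero
print("KL >= 0 held on 1000 random distribution pairs")
```

Note the divergence is asymmetric in general: kl(p, q) and kl(q, p) usually differ.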
4. Prove the monotonicity property of the Expectation–Maximisation (EM) algorithm. Suppose we have a pair of real-valued random variables (X, Z), where X can be observed and Z is the missing data, and (X, Z) has the joint pdf fX,Z(x, z | θ) parametrised by some θ. We want to estimate θ from an observation X = x using EM. Let L(θ; x) = fX(x | θ) = ∫ fX,Z(x, z | θ) dz denote the likelihood, l(θ; x) = ln L(θ; x) denote the log-likelihood, and l(θ; x, z) = ln fX,Z(x, z | θ) denote the joint log-likelihood. Recall that one iteration of EM is given as follows:
E-step: Suppose θ(t) was the output of the previous iteration. Define
Q(θ | θ(t)) := E [l(θ; x, Z) | x, θ(t)], (1)
where the random variable Z follows the conditional distribution with pdf fZ|X(z | x, θ(t)).
M-step: Compute the value θ(t+1) by maximising Q(θ | θ(t)),
θ(t+1) := arg maxθ Q(θ | θ(t)). (2)
Assuming the joint density fX,Z(x, z | θ) is sufficiently regular that we can apply Jensen's inequality, show that, in adjacent iterations, the log-likelihood satisfies
l(θ(t+1); x) ≥ l(θ(t); x).
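The monotonicity can also be observed empirically. The sketch below runs EM on a two-component Gaussian mixture with known unit variances (an assumption made purely to keep the demo compact; the mixture weight and the two means play the role of θ, and the component labels are the missing Z), and checks that the observed-data log-likelihood never decreases across iterations:

```python
import numpy as np

rng = np.random.default_rng(2)
# Observed data from a 2-component Gaussian mixture; component labels Z are "missing".
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1, 700)])

def norm_pdf(x, mu):
    """Density of N(mu, 1)."""
    return np.exp(-0.5 * (x - mu) ** 2) / np.sqrt(2 * np.pi)

def loglik(w, mu1, mu2):
    """Observed-data log-likelihood l(theta; x) of the mixture."""
    dens = w * norm_pdf(x, mu1) + (1 - w) * norm_pdf(x, mu2)
    return float(np.sum(np.log(dens)))

w, mu1, mu2 = 0.5, -1.0, 1.0   # arbitrary initial guess theta(0)
lls = []
for _ in range(30):
    # E-step: posterior responsibility of component 1 given the current parameters.
    r1 = w * norm_pdf(x, mu1)
    r2 = (1 - w) * norm_pdf(x, mu2)
    g = r1 / (r1 + r2)
    # M-step: closed-form maximisers of Q(theta | theta(t)) for this model.
    w = g.mean()
    mu1 = np.sum(g * x) / np.sum(g)
    mu2 = np.sum((1 - g) * x) / np.sum(1 - g)
    lls.append(loglik(w, mu1, mu2))

# Monotonicity: l(theta(t+1); x) >= l(theta(t); x), up to floating-point noise.
assert all(b >= a - 1e-9 for a, b in zip(lls, lls[1:]))
print("log-likelihood is non-decreasing over 30 EM iterations")
```

This demonstrates the claim on one model; the exercise asks for the general proof via Jensen's inequality.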
5. Inverse transform method. Consider a Pareto-distributed random variable X , with probability density
pX(x) = α x^(−(α+1)) for x ≥ 1, and pX(x) = 0 otherwise.
Let α = 3/2. Analytically derive the inverse transform method for sampling X .
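For reference, a sketch of the resulting sampler, assuming the standard Pareto with scale x_m = 1, whose CDF is F(x) = 1 − x^(−α) for x ≥ 1 and hence F^(−1)(u) = (1 − u)^(−1/α):

```python
import numpy as np

alpha = 1.5                       # the given alpha = 3/2; scale x_m = 1 assumed
rng = np.random.default_rng(3)

# Inverse transform: if U ~ Uniform(0, 1), then F^{-1}(U) has the Pareto law.
u = rng.random(200_000)
samples = (1.0 - u) ** (-1.0 / alpha)

# Sanity checks against known Pareto(3/2) facts:
#   support is [1, inf); median is F^{-1}(1/2) = 2^(1/alpha) = 2^(2/3) ~ 1.587.
print(samples.min() >= 1.0)
print(np.median(samples))
```

Since U and 1 − U have the same distribution, X = U^(−1/α) is an equivalent, slightly cheaper form.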
2023-05-30