Tutorial 3 Problem Set

1. Verify that the quadratic function y = x^2 + x + 1 and the logarithmic function y = −log(x) are convex functions. Is the function y = 2x^3 − x^2 a convex function? Justify your answer.
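
As a sanity check before writing the proof, the midpoint form of the convexity inequality, f((a + b)/2) ≤ (f(a) + f(b))/2, can be tested on a finite grid. The sketch below is only a numerical illustration, not a proof: the helper name `midpoint_convex_on_grid`, the grids, and the tolerance are all assumptions chosen for this example. A single violated pair is enough to show a function is not convex, whereas passing on a grid only suggests convexity.

```python
import math

def midpoint_convex_on_grid(f, xs, tol=1e-12):
    """Return True if f((a+b)/2) <= (f(a)+f(b))/2 holds for every pair of grid points."""
    for a in xs:
        for b in xs:
            if f((a + b) / 2) > (f(a) + f(b)) / 2 + tol:
                return False  # found a pair violating midpoint convexity
    return True

def quadratic(x):
    return x**2 + x + 1

def neg_log(x):
    return -math.log(x)

def cubic(x):
    return 2 * x**3 - x**2

# Hypothetical test grids: [-3, 3] for the polynomials, (0, 5] for -log(x).
grid = [i / 10 for i in range(-30, 31)]
positive_grid = [i / 10 for i in range(1, 51)]

print(midpoint_convex_on_grid(quadratic, grid))
print(midpoint_convex_on_grid(neg_log, positive_grid))
print(midpoint_convex_on_grid(cubic, grid))
```

The cubic fails the check on this grid, for example at a = −3, b = −2, which is consistent with its second derivative 12x − 2 being negative for x < 1/6.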

2. Suppose ψ(x) is a concave function. Show that E[ψ(X)] ≤ ψ(E[X]). Recall the definition of a concave function: given a function ψ : Ω → R where Ω ⊆ R, ψ is concave if, for any pair of points x_1, x_2 ∈ Ω and any t ∈ [0, 1], it satisfies ψ(t x_1 + (1 − t) x_2) ≥ t ψ(x_1) + (1 − t) ψ(x_2).
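
The inequality can be illustrated on a small discrete example before proving it in general. The sketch below assumes ψ = log (which is concave on the positive reals) and a made-up four-point distribution; the helper `expectation`, the values, and the probabilities are illustrative assumptions, not part of the problem.

```python
import math

def expectation(values, probs, g=lambda v: v):
    """E[g(X)] for a discrete X taking values[i] with probability probs[i]."""
    return sum(p * g(v) for v, p in zip(values, probs))

# Hypothetical discrete distribution on positive values.
values = [1.0, 2.0, 4.0, 8.0]
probs = [0.1, 0.2, 0.3, 0.4]

lhs = expectation(values, probs, math.log)  # E[psi(X)] with psi = log
rhs = math.log(expectation(values, probs))  # psi(E[X])

print(lhs, rhs, lhs <= rhs)
```

Here E[X] = 4.9, so the right-hand side is log(4.9), while the left-hand side is a weighted average of the logarithms and comes out smaller.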

3. Consider two probability distributions with density functions p and q, respectively, and the Kullback–Leibler (KL) divergence of q from p,

D_KL(p ∥ q) := E_p[ log( p(X) / q(X) ) ] = ∫ p(x) log( p(x) / q(x) ) dx,

where E_p denotes the expectation with respect to p. Show that the KL divergence is always non-negative.
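
A quick numerical check can make the claim concrete. The sketch below uses the discrete analogue of the definition, replacing the integral with a sum over probability vectors; this substitution, the helper name `kl_divergence`, and the two example distributions are assumptions made for illustration. It also assumes q_i > 0 wherever p_i > 0.

```python
import math

def kl_divergence(p, q):
    """Discrete D_KL(p || q) = sum_i p_i * log(p_i / q_i); terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical probability vectors on three outcomes.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]

print(kl_divergence(p, q))  # positive
print(kl_divergence(q, p))  # positive, and generally different from the line above
print(kl_divergence(p, p))  # zero when the two distributions coincide
```

Individual terms of the sum can be negative (here the terms with p_i < q_i), yet the total stays non-negative, which is exactly what the problem asks to prove.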

4. Prove the monotonicity property of the Expectation Maximisation (EM) algorithm. Suppose we have a pair of real-valued random variables (X, Z), where X is observed and Z is the missing data. Suppose (X, Z) has the joint pdf f_{X,Z}(x, z | θ), parametrised by some θ. We want to estimate θ from the observation X = x using EM. Let L(θ; x) = f_X(x | θ) = ∫ f_{X,Z}(x, z | θ) dz denote the likelihood, l(θ; x) = ln L(θ; x) denote the log-likelihood, and l(θ; x, z) = ln f_{X,Z}(x, z | θ) denote the joint log-likelihood. Recall that one iteration of EM is given as follows:

E-step: Suppose θ^(t) is the output of the previous iteration. Define

Q(θ | θ^(t)) := E[ l(θ; x, Z) | x, θ^(t) ], (1)

where the random variable Z follows the conditional distribution with pdf f_{Z|X}(z | x, θ^(t)).

M-step: Compute the value θ^(t+1) by maximising Q(θ | θ^(t)) over θ,

θ^(t+1) := arg max_θ Q(θ | θ^(t)). (2)

Assuming the joint density f_{X,Z}(x, z | θ) is sufficiently regular so that Jensen's inequality can be applied, show that, in adjacent iterations, the log-likelihood satisfies

l(θ^(t+1); x) ≥ l(θ^(t); x).
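
The monotonicity property can be observed numerically on a toy model. The sketch below is an illustration under several assumptions that are not part of the problem: the model is a two-component Gaussian mixture with unit variances and fixed equal weights, θ is the pair of component means, Z is the discrete component label (so the integral over z becomes a sum), and the data and starting point are made up.

```python
import math

def normal_pdf(x, mu):
    """Density of N(mu, 1) at x."""
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2 * math.pi)

def log_likelihood(data, mu):
    """Observed-data log-likelihood l(theta; x) for the equal-weight mixture."""
    return sum(
        math.log(0.5 * normal_pdf(x, mu[0]) + 0.5 * normal_pdf(x, mu[1]))
        for x in data
    )

def em_step(data, mu):
    """One EM iteration: returns the updated pair of means."""
    # E-step: responsibility of component 0 for each point, P(Z_i = 0 | x_i, mu).
    resp = []
    for x in data:
        a = normal_pdf(x, mu[0])
        b = normal_pdf(x, mu[1])
        resp.append(a / (a + b))
    # M-step: Q is maximised by the responsibility-weighted sample means.
    w0 = sum(resp)
    w1 = len(data) - w0
    mu0 = sum(r * x for r, x in zip(resp, data)) / w0
    mu1 = sum((1 - r) * x for r, x in zip(resp, data)) / w1
    return (mu0, mu1)

# Hypothetical observations and initial guess.
data = [-2.1, -1.7, -1.9, -0.4, 0.3, 1.8, 2.2, 2.6, 1.5, -2.4]
mu = (-0.5, 0.5)

history = [log_likelihood(data, mu)]
for _ in range(20):
    mu = em_step(data, mu)
    history.append(log_likelihood(data, mu))

print(mu)
print(history[0], history[-1])
```

The list `history` records l(θ^(t); x) after each iteration; up to floating-point rounding it should never decrease, which is the behaviour the proof is meant to establish.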

5. Inverse transform method. Consider a Pareto-distributed random variable X with probability density

pX (x) = {0(北)