ST323/ST412 Assignment 2 (2022-23)
Q1. Suppose that for $k = 1, \dots, K$ and $i = 1, \dots, n_k$ we observe independent $X_{ki} \sim N_p(\mu_k, I_p)$, where
$K, n_1, \dots, n_K \in \mathbb{N}$. In this question we consider the K-sample testing problem, where we want to test the null hypothesis $H_0 : \mu_1 = \dots = \mu_K$. Note that the observations here have identity covariance matrix, so this is a simplification of the setting found in the notes.
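(a) Write down the likelihood function $L(\mu_1, \dots, \mu_K)$ for this model, and prove that
\[ \frac{\sup_{\mu_1 = \dots = \mu_K} L(\mu_1, \dots, \mu_K)}{\sup_{\mu_1, \dots, \mu_K} L(\mu_1, \dots, \mu_K)} = \exp\Bigl( -\frac{1}{2} \sum_{k=1}^{K} n_k \|\bar{X}_k - \bar{X}\|^2 \Bigr), \]
where $\bar{X}_k = n_k^{-1} \sum_{i=1}^{n_k} X_{ki}$ and $\bar{X} = N^{-1} \sum_{k=1}^{K} n_k \bar{X}_k$ with $N = \sum_{k=1}^{K} n_k$. [3 marks]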
(b) In this part of the question we will find the distribution of
$\sum_{k=1}^{K} n_k \|\bar{X}_k - \bar{X}\|^2$ under $H_0$.
(i) Define the $Kp$-dimensional random vector $Y = (\sqrt{n_1}\,\bar{X}_1^T, \dots, \sqrt{n_K}\,\bar{X}_K^T)^T$. Give, with justification, the distribution of $Y$. [2 marks]
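(ii) Show that we may write
\[ \bigl(\sqrt{n_1}\,(\bar{X}_1 - \bar{X})^T, \dots, \sqrt{n_K}\,(\bar{X}_K - \bar{X})^T\bigr)^T = (I_{Kp} - N^{-1} V V^T)\,Y, \]
where $V$ is the $Kp \times p$ matrix given by $V^T = (\sqrt{n_1}\,I_p \ \dots \ \sqrt{n_K}\,I_p)$. [1 mark]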
(iii) Prove that $P = I_{Kp} - N^{-1} V V^T$ is an orthogonal projection matrix satisfying $PV = 0$, and calculate $q = \mathrm{Tr}(P)$. [2 marks]
(iv) Using Proposition 2.3.2 and the previous parts of this question, or otherwise, prove that
under $H_0$ we have $\sum_{k=1}^{K} n_k \|\bar{X}_k - \bar{X}\|^2 \sim \chi^2_q$. [2 marks]
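A numerical sanity check can help catch algebra slips in part (b)(iii) before attempting the proof. The sketch below builds $P = I_{Kp} - N^{-1} V V^T$ for hypothetical group sizes (the values of `ns` and `p` are illustrative assumptions, not taken from the question) and checks symmetry, idempotency and $PV = 0$. It is not a substitute for the proof.

```python
from math import isclose, sqrt


def build_v(ns, p):
    """Return the Kp x p matrix V whose k-th p x p block is sqrt(n_k) * I_p."""
    return [[sqrt(ns[r // p]) if r % p == c else 0.0 for c in range(p)]
            for r in range(len(ns) * p)]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Plain-Python product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def projection(ns, p):
    """Return P = I_{Kp} - V V^T / N with N = sum(ns)."""
    v = build_v(ns, p)
    n_total = sum(ns)
    vvt = matmul(v, transpose(v))
    d = len(v)
    return [[(1.0 if r == s else 0.0) - vvt[r][s] / n_total for s in range(d)]
            for r in range(d)]


def all_close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices of the same shape."""
    return all(isclose(x, y, abs_tol=tol)
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Hypothetical group sizes and dimension, chosen only for illustration.
ns, p = [4, 7, 9], 2
P = projection(ns, p)
V = build_v(ns, p)

is_symmetric = all_close(P, transpose(P))
is_idempotent = all_close(matmul(P, P), P)
kills_v = all_close(matmul(P, V), [[0.0] * p for _ in V])
trace_p = sum(P[r][r] for r in range(len(P)))

print(is_symmetric, is_idempotent, kills_v, trace_p)
```

Running it with a few different choices of `ns` and `p` gives a quick way to compare the printed trace against a conjectured formula for $q$.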
(c) Suppose that we have data on $K = 3$ groups, with $n_1 = n_2 = n_3 = 50$, and suppose that we observe
$\bar{X}_1 = (2, 0, 1)^T$, $\bar{X}_2 = (0, 1, 2)^T$, $\bar{X}_3 = (0, 0, 0)^T$.
Using part (b), carry out a test of $H_0$ at the 5% significance level. [2 marks]
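The arithmetic in part (c) can be cross-checked with a few lines of code. The sketch below computes $\sum_k n_k \|\bar{X}_k - \bar{X}\|^2$ from the group means and evaluates a chi-squared tail probability. Two assumptions are made here and should be verified against your own working: the degrees of freedom are taken to be $(K - 1)p$, and the tail probability uses the closed form $P(\chi^2_{2m} > x) = e^{-x/2} \sum_{j=0}^{m-1} (x/2)^j / j!$, which is valid only for even degrees of freedom. The function names are illustrative.

```python
from math import exp, factorial


def k_sample_statistic(ns, means):
    """Return sum_k n_k * ||mean_k - grand_mean||^2 for the given group means."""
    n_total = sum(ns)
    p = len(means[0])
    # Grand mean is the size-weighted average of the group means.
    grand = [sum(n * m[j] for n, m in zip(ns, means)) / n_total for j in range(p)]
    return sum(n * sum((m[j] - grand[j]) ** 2 for j in range(p))
               for n, m in zip(ns, means))


def chi2_sf_even_df(x, df):
    """Return P(chi^2_df > x); closed form that holds for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    return exp(-half) * sum(half ** j / factorial(j) for j in range(df // 2))


ns = [50, 50, 50]
means = [(2, 0, 1), (0, 1, 2), (0, 0, 0)]

stat = k_sample_statistic(ns, means)
df = (len(ns) - 1) * len(means[0])  # assumption: q = (K - 1) * p
p_value = chi2_sf_even_df(stat, df)

print(stat, df, p_value, p_value < 0.05)
```

For odd degrees of freedom a statistical library or printed tables would be needed instead of `chi2_sf_even_df`.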
Q2. In this question we consider a classification problem where $c_1 = c_2$, $\pi_1 = \pi_2$ and
$X \mid Y = 1 \sim N_2(0, I_2)$, $X \mid Y = 2 \sim N_2(\mu, \Sigma)$
for some $\mu \in \mathbb{R}^2$ and positive definite $2 \times 2$ matrix $\Sigma$.
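(a) In each of the following examples find the Bayes classifier and draw a diagram showing the
shape of the regions $R_1$ and $R_2$. You may use any result from the notes, provided it is clearly stated.
(i) $\mu = (1, 1)^T$ and $\Sigma = I_2$ [1 mark]
(ii) $\mu = 0$ and $\Sigma = 2 I_2$
(iii) $\mu = 0$ and $\Sigma = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}$
(iv) $\mu = (0, -1)^T$ and $\Sigma = \begin{pmatrix} 2 & \cdot \\ 0 & \cdot \end{pmatrix}$ [second column missing]
(v) $\mu = 0$ and $\Sigma =$ [matrix missing]
(vi) $\mu = 0$ and $\Sigma =$ [matrix missing]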
(b) Assuming that $\mu = 0$ and $\Sigma \neq I_2$, give a necessary and sufficient condition on $\Sigma$ for the region
$R_1$ to be bounded. You may wish to use the Spectral Decomposition Theorem and separately consider the cases $\lambda_2 > 1$, $\lambda_2 = 1$ and $\lambda_2 < 1$, where $\lambda_2$ is the smallest eigenvalue of $\Sigma$. [4 marks]
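When sketching the regions in Q2 it can help to evaluate the classifier at a few test points. The sketch below assumes the standard result that, with equal costs and equal priors, the Bayes classifier assigns $x$ to the class with the larger conditional density; ties are sent to class 1, which is an arbitrary choice. The function names are illustrative and the code handles only the $2 \times 2$ case.

```python
from math import log


def log_density_2d(x, mu, sigma):
    """Log density of N_2(mu, sigma) at x, omitting the common term -log(2*pi)."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    u, v = x[0] - mu[0], x[1] - mu[1]
    # Quadratic form (x - mu)^T sigma^{-1} (x - mu) via the explicit 2x2 inverse.
    quad = (d * u * u - (b + c) * u * v + a * v * v) / det
    return -0.5 * log(det) - 0.5 * quad


def bayes_classify(x, mu, sigma):
    """Return 1 or 2 under equal priors and costs; ties go to class 1."""
    identity = ((1.0, 0.0), (0.0, 1.0))
    score_1 = log_density_2d(x, (0.0, 0.0), identity)
    score_2 = log_density_2d(x, mu, sigma)
    return 1 if score_1 >= score_2 else 2


# Example (ii): mu = 0 and Sigma = 2 * I_2, evaluated at two test points.
sigma_ii = ((2.0, 0.0), (0.0, 2.0))
print(bayes_classify((1.0, 1.0), (0.0, 0.0), sigma_ii))
print(bayes_classify((2.0, 0.0), (0.0, 0.0), sigma_ii))
```

Evaluating `bayes_classify` over a grid of points gives a rough picture of $R_1$ and $R_2$ to compare with a hand-drawn diagram.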
Q3. Suppose that we have observations $X_1, \dots, X_n$ in $\mathbb{R}^p$ and wish to carry out a cluster analysis.
Since you have reason to believe that the clusters are equally sized and each cluster is spherically symmetric, you consider the model
$X_i \mid Z_i = k \sim N_p(\mu_k, I_p)$, $P(Z_i = k) = 1/K$,
where $K > 2$ is the number of clusters you think are in the data, and $\mu_1, \dots, \mu_K \in \mathbb{R}^p$ are unknown cluster means. Fitting this model to the data you obtain MLEs $\hat{\mu}_1, \dots, \hat{\mu}_K$.
(a) Simplifying your expressions as far as possible, give the condition under which this model-based
clustering approach would tell you to assign $X_i$ to cluster $k$. Which other clustering method from the module is this most similar to? You may assume that all $X_1, \dots, X_n$ and $\hat{\mu}_1, \dots, \hat{\mu}_K$ are distinct.
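The posterior cluster probabilities under the stated model can be computed directly, which is a useful check on a hand-simplified assignment rule. The sketch below evaluates $P(Z_i = k \mid X_i = x)$ for the mixture with equal weights $1/K$ and identity covariances, then picks the most probable cluster. The fitted means passed in are hypothetical values for illustration, and cluster labels are 0-based in the code.

```python
from math import exp


def responsibilities(x, means):
    """P(Z = k | X = x) when X | Z = k ~ N_p(mu_k, I_p) and P(Z = k) = 1/K."""
    # Log of each component's density, dropping terms shared by all k.
    log_w = [-0.5 * sum((xj - mj) ** 2 for xj, mj in zip(x, mu)) for mu in means]
    top = max(log_w)  # subtract the maximum for numerical stability
    w = [exp(v - top) for v in log_w]
    total = sum(w)
    return [v / total for v in w]


def assign(x, means):
    """Return the 0-based index of the cluster with the largest posterior probability."""
    probs = responsibilities(x, means)
    return max(range(len(means)), key=lambda k: probs[k])


# Hypothetical fitted means in R^2, for illustration only.
fitted_means = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
x = (3.5, 0.2)
print(responsibilities(x, fitted_means), assign(x, fitted_means))
```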
(b) Give an expression for $\hat{L}_K$, the maximised value of the likelihood in this model.
(c) Suppose that the model is correct, with $K$ being the correct number of clusters, and that $\min_{k \neq k'} \|\mu_k - \mu_{k'}\|$ is large. What will be the approximate value of $-2 \log(\hat{L}_K)$? You do not need to give rigorous mathematical proofs, but you should explain your reasoning. [3 marks]
(d) Show that for any $y_1, \dots, y_m \in \mathbb{R}^p$ we have
\[ \sum_{i,j=1}^{m} \|y_i - y_j\|^2 = 2m \sum_{i=1}^{m} \|y_i - \bar{y}_m\|^2, \]
where $\bar{y}_m = m^{-1} \sum_{i=1}^{m} y_i$. Starting from this equality and part (c), and giving a brief heuristic justification, explain which other method for choosing $K$ from the notes on clustering is similar to using the AIC or BIC to choose $K$.
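The identity in part (d) is easy to test numerically before proving it. The sketch below assumes the identity in its standard form, $\sum_{i,j=1}^{m} \|y_i - y_j\|^2 = 2m \sum_{i=1}^{m} \|y_i - \bar{y}_m\|^2$, and compares both sides on randomly generated points. The sample size, dimension and seed are arbitrary illustrative choices.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((s - t) ** 2 for s, t in zip(a, b))


def pairwise_sum(ys):
    """Sum of squared distances over all ordered pairs (i, j), including i = j."""
    return sum(sq_dist(a, b) for a in ys for b in ys)


def centred_sum(ys):
    """Sum of squared distances from each point to the sample mean."""
    m = len(ys)
    mean = [sum(y[j] for y in ys) / m for j in range(len(ys[0]))]
    return sum(sq_dist(y, mean) for y in ys)


random.seed(0)
# Arbitrary illustrative data: m = 7 points in R^3.
ys = [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(7)]

lhs = pairwise_sum(ys)
rhs = 2 * len(ys) * centred_sum(ys)
print(lhs, rhs)
```

Agreement on random data does not prove the identity, but a mismatch would point to a wrong constant in the hand derivation.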
2023-01-28