STAT7301 Mathematical Statistics Semester One Final Examinations, 2022
Question 1. [23 marks]
Let x1 , . . . , xn denote the observed values of n p-dimensional observations X 1 , . . . , Xn from a p.f. or p.d.f. f (x; θ), where θ is a d-dimensional parameter. If f (x; θ) belongs to the d-dimensional exponential family, then the likelihood function L(θ) for θ has the form
L(x1, . . . , xn; θ) = b(x1, . . . , xn) exp{c(θ)^T T(x1, . . . , xn)}/a(θ),
where c(θ) is a q × 1 vector function of the parameter vector θ, T(x1, . . . , xn) is a corresponding q × 1 vector of statistics, and a(θ) and b(x1, . . . , xn) are non-negative scalar functions.
(i) What is the condition on q and on the Jacobian of c(θ) for the likelihood function (or equivalently, f (x; θ)) to belong to the regular exponential family? [3 marks]
(ii) Discuss the uniqueness of the so-called natural or canonical parameter c(θ) and the corresponding choice of the sufficient statistic T. [3 marks]
(iii) Assuming henceforth that the likelihood function belongs to the regular exponential family, show that the likelihood equation
∂ log L(θ)/∂θ = 0
can be expressed as
T(x1, . . . , xn) = E_θ{T(X1, . . . , Xn)};
that is, the maximum likelihood (ML) estimate θ̂ of θ satisfies the equation
T(x1, . . . , xn) = E_θ̂{T(X1, . . . , Xn)},
where E_θ̂ denotes expectation with θ̂ substituted for θ. [7 marks]
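As a numerical illustration (not part of the question), the sufficient-statistic matching property can be checked for a Poisson sample, a standard member of the regular exponential family with T(x1, . . . , xn) = Σ xi; the sample size and rate below are illustrative choices.

```python
# Check T(x) = E_{theta_hat}{T(X)} for a Poisson sample (regular
# exponential family); the sample size and rate are illustrative.
import numpy as np

rng = np.random.default_rng(0)
x = rng.poisson(lam=3.0, size=50)

# For Poisson(theta): T = sum(x_i), E_theta{T} = n*theta, and the
# ML estimate is theta_hat = x_bar, so the two sides must agree.
theta_hat = x.mean()
T_obs = x.sum()
T_expected = len(x) * theta_hat

print(T_obs, T_expected)
```

The two printed values coincide because nθ̂ = n x̄ = Σ xi, which is exactly the likelihood equation in sufficient-statistic form.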
(iv) In the current case where L(θ) belongs to the regular exponential family, show that the observed information matrix I(θ̂) is equal to the estimated Fisher (expected) information matrix 𝓘(θ̂); that is,
I(θ̂) = 𝓘(θ̂). [10 marks]
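The identity can also be verified numerically for the same Poisson example (again illustrative, not part of the question): the observed information, obtained from the curvature of the log-likelihood at θ̂, matches the expected information n/θ evaluated at θ̂.

```python
# Check I(theta_hat) = estimated Fisher information for a Poisson
# sample; illustrative only.
import numpy as np

rng = np.random.default_rng(1)
x = rng.poisson(lam=2.5, size=200)
n, theta_hat = len(x), x.mean()

def loglik(theta):
    # The constant sum(log x_i!) is omitted; it has no effect on derivatives.
    return x.sum() * np.log(theta) - n * theta

# Observed information: minus the second derivative of log L at theta_hat,
# approximated by a central finite difference.
h = 1e-4
obs_info = -(loglik(theta_hat + h) - 2 * loglik(theta_hat)
             + loglik(theta_hat - h)) / h**2

# Expected (Fisher) information for Poisson is n/theta, evaluated at theta_hat.
fisher_info = n / theta_hat
print(obs_info, fisher_info)
```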
Question 2. [15 marks]
Let x1 , . . . , xn denote an observed random sample from a normal distribution with mean µ and variance σ2 .
Derive the likelihood ratio test statistic λ to test the null hypothesis
H0 : µ = σ2
versus the alternative hypothesis
H1 : µ ≠ σ².
Explain how an approximate P-value can be calculated for this test statistic.
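A hedged sketch of one way to carry this out numerically: the unconstrained maximum uses the usual normal MLEs, while the maximum under H0 (a single free parameter t = µ = σ²) is found by one-dimensional optimisation; the constrained equation can also be solved analytically, so the numerical route below is a convenience, and the data are simulated for illustration.

```python
# Likelihood-ratio test of H0: mu = sigma^2 for a normal sample, with
# the chi-square approximation for the P-value (illustrative data).
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import chi2

rng = np.random.default_rng(2)
x = rng.normal(loc=1.0, scale=1.0, size=100)  # H0 true here: mu = sigma^2 = 1
n = len(x)

def loglik(mu, var):
    return -0.5 * n * np.log(2 * np.pi * var) - np.sum((x - mu) ** 2) / (2 * var)

# Unconstrained MLEs (note x.var() uses ddof=0, i.e. the MLE of sigma^2).
mu_hat, var_hat = x.mean(), x.var()
l1 = loglik(mu_hat, var_hat)

# Under H0 both parameters equal a single t > 0; maximise over t numerically.
res = minimize_scalar(lambda t: -loglik(t, t), bounds=(1e-6, 50.0), method="bounded")
l0 = -res.fun

# -2 log(lambda) is approximately chi-square with 1 df (Wilks theorem),
# since H0 removes one free parameter.
stat = 2 * (l1 - l0)
p_value = chi2.sf(stat, df=1)
print(stat, p_value)
```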
Question 3. [25 marks]
We let Y denote a random variable distributed as
Y ~ N(θ, 1),
where the mean θ is unknown.
Let Y1 , . . . , Yn denote a random sample, where only values of Y greater than C were able to be measured by the recording machine.
The aim is to find the maximum likelihood (ML) estimate θ̂ on the basis of the observed data, y = (y1, . . . , yn)^T,
where yj denotes the observed value of Yj (j = 1, . . . , n).
It is proposed to use the expectation–maximization (EM) algorithm to compute θ̂ on the basis of the truncated sample so obtained.
(i) Write down the likelihood function L(θ) for θ that can be formed on the basis of the observed-data vector y. [2 marks]
(ii) Consider the EM framework where m is introduced as the “missing” number of unobservable observations yj (j = n + 1, . . . , n + m) less than C in the process undertaken to obtain the observed data. That is, the complete-data vector x is given by
x = (yT , zT )T ,
where z = (m, wT )T is the missing-data vector with w = (yn+1, . . . , yn+m)T .
Define a conditional distribution for m given the observed data y so that the complete-data likelihood Lc(θ) implies the incomplete-data likelihood L(θ). [8 marks]
(iii) Give the E-step on the (k + 1)th iteration of the EM algorithm; that is, calculate the so-called Q-function Q(θ; θ(k)). You may use without derivation the expectation of your choice for the conditional distribution of m given y. [5 marks]
(iv) Give the M-step to find the updated estimate θ(k+1) of the estimate of θ . [10 marks]
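As a numerical companion (not part of the question), the E- and M-steps for this truncated-normal problem can be sketched as follows. The iteration below uses the standard choices: the expected number of missing values m̂ = n Φ(C − θ)/{1 − Φ(C − θ)} and the conditional mean of a missing value, E(Y | Y < C) = θ − φ(C − θ)/Φ(C − θ); the truncation point, true mean, and starting value are illustrative.

```python
# EM iteration for theta from a N(theta, 1) sample observed only
# when Y > C; C, theta_true and the starting point are illustrative.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
C, theta_true = 0.0, 0.5
full = rng.normal(theta_true, 1.0, size=2000)
y = full[full > C]          # only values above C are recorded
n = len(y)

theta = y.mean()            # starting value (biased upward by truncation)
for _ in range(200):
    q = norm.cdf(C - theta)                 # P(Y < C) under current theta
    m_hat = n * q / (1 - q)                 # E-step: expected number missing
    w_bar = theta - norm.pdf(C - theta) / q # E(Y | Y < C) under current theta
    # M-step: mean of the completed data.
    theta_new = (y.sum() + m_hat * w_bar) / (n + m_hat)
    if abs(theta_new - theta) < 1e-10:
        theta = theta_new
        break
    theta = theta_new

print(theta)
```

At convergence the iterate satisfies the observed-data likelihood equation θ̂ = ȳ − φ(C − θ̂)/{1 − Φ(C − θ̂)}, and lies below the truncated sample mean as expected.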
Question 4. [15 marks]
Consider the following joint density for x and θ ,
f(x, θ) = (n choose x) θ^(x+α−1) (1 − θ)^(n−x+β−1)
for x = 0, 1, . . . , n, 0 < θ < 1, α > 0, and β > 0, where (n choose x) denotes the binomial coefficient.
(i) Find the conditional distribution of x given θ and name the distribution.
Note: It is sufficient to work out the kernel of the conditional distribution, that is, without specifying the normalising constant. [4 marks]
(ii) Find the conditional distribution of θ given x and name the distribution.
Note: As in part (i), you may omit specifying the normalising constant. [4 marks]
(iii) Using parts (i) and (ii), or otherwise, describe how you would construct a Gibbs sampler to sample from f (x, θ). [7 marks]
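A hedged sketch of the sampler described in part (iii): alternate between the two full conditionals, x | θ ~ Binomial(n, θ) and θ | x ~ Beta(x + α, n − x + β). The values of n, α, β below are illustrative; after burn-in, the draws of x target the beta-binomial marginal of f(x, θ), whose mean is nα/(α + β).

```python
# Gibbs sampler alternating the two full conditionals of f(x, theta);
# n, alpha, beta are illustrative values.
import numpy as np

rng = np.random.default_rng(4)
n, alpha, beta = 10, 2.0, 3.0
iters, burn = 20000, 1000

x, theta = 5, 0.5           # arbitrary starting values
xs = np.empty(iters)
for t in range(iters):
    theta = rng.beta(x + alpha, n - x + beta)   # draw theta | x
    x = rng.binomial(n, theta)                  # draw x | theta
    xs[t] = x

# Marginal mean of x is n * alpha / (alpha + beta) = 4 for these values.
print(xs[burn:].mean())
```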
Question 5. [22 marks]
In a study of a native mammal species in Australia, researchers need to capture m animals at each of the n locations across a national park. Let θ denote the mean rate of encountering an animal per hour. You may assume the same rate for all n locations. Then the waiting time (in hours) Xi until capturing the mth animal at location i can be modelled by a gamma distribution given by
f(xi | θ) = xi^(m−1) e^(−xi/θ) / {Γ(m) θ^m}, for xi > 0; i = 1, 2, . . . , n.
Consider a prior distribution on θ given by θ ~ InvGamma(α0 , β0 ) with density given by
f(θ) = {β0^(α0)/Γ(α0)} θ^(−α0−1) e^(−β0/θ),
where α0 > 0 and β0 > 0.
(i) State the prior mean. Note that no derivation is required. [1 mark]
(ii) Find an expression for the posterior distribution of θ given the data x = (x1 , x2 , . . . , xn ) and identify its distribution. [5 marks]
(iii) The ML estimator of θ is θ̂ = Σi xi/(nm). State, without derivation, the posterior mean. Then show that it can be expressed as a weighted sum of the prior mean and θ̂. [5 marks]
(iv) What happens to the posterior mean as m → ∞? [2 marks]
(v) What happens to the posterior mean if the prior hyperparameters α0 , β0 → 0? [2 marks]
(vi) Give an interpretation of the posterior mean by completing the following sentence: [2 marks]
“The effect of the prior hyperparameters α0 and β0 on the posterior mean of θ is like . . . ”
(vii) Describe how you would construct a posterior 100(1 - α)% credible interval for θ . [5 marks]
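A hedged numerical companion to parts (ii)–(vii). It assumes the scale parameterisation f(xi | θ) = xi^(m−1) e^(−xi/θ)/{Γ(m) θ^m}, under which the inverse-gamma prior is conjugate and the posterior is InvGamma(α0 + nm, β0 + Σ xi); all numerical values are illustrative.

```python
# Posterior parameters, posterior mean as a weighted sum of prior mean
# and ML estimate, and an equal-tailed credible interval; illustrative
# values throughout, assuming the conjugate scale parameterisation.
import numpy as np
from scipy.stats import invgamma, gamma

rng = np.random.default_rng(5)
n, m = 12, 5
alpha0, beta0 = 3.0, 4.0
theta_true = 2.0
x = gamma.rvs(a=m, scale=theta_true, size=n, random_state=rng)

# Posterior: theta | x ~ InvGamma(alpha0 + n*m, beta0 + sum(x)).
a_post, b_post = alpha0 + n * m, beta0 + x.sum()
post_mean = b_post / (a_post - 1)

# Weighted-sum form: w * (prior mean) + (1 - w) * theta_hat, with
# theta_hat = sum(x)/(n*m) and w = (alpha0 - 1)/(a_post - 1).
prior_mean = beta0 / (alpha0 - 1)
theta_hat = x.sum() / (n * m)
w = (alpha0 - 1) / (a_post - 1)

# Equal-tailed 95% credible interval from the posterior quantiles.
lo, hi = invgamma.ppf([0.025, 0.975], a_post, scale=b_post)
print(post_mean, w * prior_mean + (1 - w) * theta_hat, (lo, hi))
```

The weight w → 0 as m (or n) grows, so the posterior mean approaches θ̂, matching parts (iv) and (v).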