STAT3001/STAT7301 Mathematical Statistics Semester One Final Examinations, 2020

Published: 2023-06-10

1. Let X have a Beta(α, β) distribution for some α > 0 and β > 0. Thus, the pdf of X is given by

$$f(x; \alpha, \beta) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha, \beta)}, \quad x \in (0, 1),$$

where B is the beta function:

$$B(\alpha, \beta) = \frac{\Gamma(\alpha)\,\Gamma(\beta)}{\Gamma(\alpha+\beta)},$$

and Γ denotes the gamma function.

(a) Show that the family of Beta distributions forms an exponential family. [3 marks]

(b) Suppose $X_1, \ldots, X_m \overset{\text{iid}}{\sim} \text{Beta}(\alpha, \beta)$. Based on the exponential family representation, give sufficient statistics for the estimation of α and β. [2 marks]
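
As a hedged sketch of the kind of factorization part (a) asks for, the Beta pdf can be rewritten in the form $c(\theta)\exp\{\sum_i \eta_i(\theta)\,t_i(x)\}\,h(x)$. The particular split into $c$, $\eta$, $t$ and $h$ below is one possible choice, not necessarily the one intended by the examiners.

```latex
\begin{align*}
f(x;\alpha,\beta)
  &= \frac{1}{B(\alpha,\beta)}\, x^{\alpha-1}(1-x)^{\beta-1} \\
  &= \underbrace{\frac{1}{B(\alpha,\beta)}}_{c(\theta)}
     \exp\bigl\{\underbrace{\alpha}_{\eta_1}\,\underbrace{\ln x}_{t_1(x)}
              + \underbrace{\beta}_{\eta_2}\,\underbrace{\ln(1-x)}_{t_2(x)}\bigr\}
     \underbrace{\frac{1}{x(1-x)}}_{h(x)},
  \qquad x \in (0,1).
\end{align*}
% For an iid sample X_1, ..., X_m the joint pdf has the same form with
% t_1 = \sum_{i=1}^m \ln X_i and t_2 = \sum_{i=1}^m \ln(1 - X_i).
```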

2. Let X have pdf:

$$\mathring{f}(x; \theta) = \alpha\lambda\,(1 + \lambda x)^{-(\alpha+1)}, \quad x > 0,$$

where θ = (α, λ) and α, λ > 0.

(a) Derive the score function $\mathring{S}(\theta)$. [2 marks]

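(b) Using the relation

$$1 = E\left[\frac{1+\lambda X}{1+\lambda X}\right] = E\left[\frac{1}{1+\lambda X}\right] + \lambda\, E\left[\frac{X}{1+\lambda X}\right],$$

or otherwise, show that

$$E\left[\frac{X}{1+\lambda X}\right] = \frac{1}{\lambda(1+\alpha)}.$$ [2 marks]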

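Similarly, it can be shown that

$$E\left[\frac{X^2}{(1+\lambda X)^2}\right] = \frac{2}{\lambda^2(1+\alpha)(2+\alpha)}.$$

(c) Hence show that the information matrix is

$$\mathring{I}(\theta) = \begin{pmatrix} \dfrac{1}{\alpha^2} & \dfrac{1}{\lambda(1+\alpha)} \\[2ex] \dfrac{1}{\lambda(1+\alpha)} & \dfrac{\alpha}{\lambda^2(2+\alpha)} \end{pmatrix}.$$ [3 marks]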

(d) The following data

0.859844, 8.20097, 0.0439733, 0.961038, 1.03828,

0.0390542, 2.25154, 0.788615, 1.1421, 50.2556

were drawn from this distribution for a certain α and λ . Formulate the Fisher scoring method to estimate the parameters, and implement it in MATLAB (or other software language). Provide the code. [4 marks]
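
A minimal sketch of how the Fisher scoring iteration in part (d) might be set up, written here in Python using only the standard library. The score and information formulas in the code are derived from the pdf in this question and should be checked against your own answers; the starting values, tolerance and the step-halving safeguard (which keeps both parameters positive and the log-likelihood non-decreasing) are assumptions added for illustration and are not part of the question.

```python
import math

# Data from part (d).
x = [0.859844, 8.20097, 0.0439733, 0.961038, 1.03828,
     0.0390542, 2.25154, 0.788615, 1.1421, 50.2556]

def loglik(alpha, lam, x):
    """Log-likelihood for the pdf alpha*lam*(1 + lam*x)^(-(alpha+1))."""
    n = len(x)
    return (n * (math.log(alpha) + math.log(lam))
            - (alpha + 1) * sum(math.log1p(lam * xi) for xi in x))

def score(alpha, lam, x):
    """Gradient of the log-likelihood with respect to (alpha, lam)."""
    n = len(x)
    s_alpha = n / alpha - sum(math.log1p(lam * xi) for xi in x)
    s_lam = n / lam - (alpha + 1) * sum(xi / (1 + lam * xi) for xi in x)
    return s_alpha, s_lam

def fisher_info(alpha, lam, n):
    """Entries (I11, I12, I22) of n times the one-observation information."""
    i11 = n / alpha ** 2
    i12 = n / (lam * (alpha + 1))
    i22 = n * alpha / (lam ** 2 * (alpha + 2))
    return i11, i12, i22

def fisher_scoring(x, alpha=1.0, lam=1.0, tol=1e-10, max_iter=1000):
    """Iterate theta <- theta + t * I(theta)^{-1} S(theta), halving t if needed."""
    n = len(x)
    for _ in range(max_iter):
        s1, s2 = score(alpha, lam, x)
        i11, i12, i22 = fisher_info(alpha, lam, n)
        det = i11 * i22 - i12 ** 2
        # Direction d = I^{-1} S via the explicit 2x2 inverse.
        d1 = (i22 * s1 - i12 * s2) / det
        d2 = (i11 * s2 - i12 * s1) / det
        current = loglik(alpha, lam, x)
        t, accepted = 1.0, False
        while t > 1e-12:
            a_new, l_new = alpha + t * d1, lam + t * d2
            if a_new > 0 and l_new > 0 and loglik(a_new, l_new, x) >= current:
                accepted = True
                break
            t /= 2  # safeguard: shrink the step (an addition to plain Fisher scoring)
        if not accepted:
            break
        converged = max(abs(a_new - alpha), abs(l_new - lam)) < tol
        alpha, lam = a_new, l_new
        if converged:
            break
    return alpha, lam

alpha_hat, lam_hat = fisher_scoring(x)
print("alpha_hat =", alpha_hat, "lambda_hat =", lam_hat)
```

With a full step (t = 1) accepted at every iteration this reduces to the usual Fisher scoring update; the halving only matters when a full step would leave the parameter space or decrease the log-likelihood.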

3. Let $Y \sim N(X\beta, \sigma^2 I)$, where $X$ is the $5 \times 2$ matrix

X = [5 × 2 matrix garbled in the source; the legible entries are 1, 2, 3, 4 and 0, 1, 0, 1],

$\beta$ is a $2 \times 1$ vector of parameters, $\sigma^2 > 0$, and $I$ is the identity matrix (of dimension 5).

(a) Find the maximum likelihood estimates $\hat\beta = (\hat\beta_1, \hat\beta_2)^\top$ and $\hat\sigma$ of $\beta$ and $\sigma$, respectively (note: we intentionally mean the standard deviation $\sigma$, not the variance $\sigma^2$). [4 marks]

(b) What is the probability distribution of the estimator $\hat\beta$? [3 marks]

(c) Now suppose that $\sigma^2$ is known and is equal to 2. Explain how you would construct an exact 95% numerical confidence interval for $\beta_1$. [3 marks]

4. The highly influential paper by Gelman (2006)¹ argued against the use of Inverse-Gamma priors for variance parameters in favour of Half-t priors for the standard deviation. One of the main hindrances to early adoption of Gelman’s idea was that subsequent posterior distributions have no closed form, making it computationally more difficult to work with than the classical Inverse-Gamma prior. This question establishes a computationally efficient way to obtain posterior inferences when using the Half-t prior.

Let $x_1, x_2, \ldots, x_n$ be an i.i.d. sample from a $N(\mu, \sigma^2)$ distribution. For simplicity, assume that $\mu$ is known and that $\sigma^2$ is the only unknown parameter in the model. Following the result from Assignment 5 Bonus Question, the data model and the Half-t($A$, $\nu$) prior on $\sigma$ can be written as a hierarchical Bayesian model:

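$$x_1, x_2, \ldots, x_n \mid \sigma^2 \overset{\text{iid}}{\sim} N(\mu, \sigma^2), \qquad (1)$$

$$\sigma^2 \mid a \sim \text{Inverse-Gamma}(\alpha = \nu/2,\ \lambda = \nu/a), \qquad (2)$$

$$a \sim \text{Inverse-Gamma}(\alpha = 1/2,\ \lambda = 1/A^2), \qquad (3)$$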

where $A, \nu > 0$ are fixed hyperparameters.

(a) Show that the joint posterior distribution of $(\sigma^2, a)$ given the data $x = (x_1, x_2, \ldots, x_n)$ is of the form

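$$f(\sigma^2, a \mid x) \propto (\sigma^2)^{-n/2-\nu/2-1}\, a^{-1/2-\nu/2-1} \exp\left\{ -\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2 - \frac{\nu}{a\sigma^2} - \frac{1}{A^2 a} \right\}.$$ [3 marks]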

(b) Show that the conditional posterior distribution $f(\sigma^2 \mid a, x)$ is Inverse-Gamma with shape parameter $\alpha = n/2 + \nu/2$ and rate parameter $\lambda = 0.5\sum_{i=1}^{n}(x_i-\mu)^2 + \nu/a$. [2 marks]

(c) Show that the conditional posterior distribution $f(a \mid \sigma^2, x)$ is Inverse-Gamma with shape parameter $\alpha = 1/2 + \nu/2$ and rate parameter $\lambda = \nu/\sigma^2 + 1/A^2$. [2 marks]

(d) Using your results from parts (b) and (c), or otherwise, describe how you would generate:

(i) samples (approximately) from the joint posterior distribution $f(\sigma^2, a \mid x)$; [2 marks]

(ii) a 95% posterior credible interval for $\sigma^2$ given the data $x$. [2 marks]

Note that you are not asked to write any actual code for this question.

Results (b) and (c) show that the Half-t prior is conditionally conjugate when using the hierarchical representation (2)–(3), even though it is not directly conjugate for $\sigma^2$.
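
Although the question does not ask for code, the two conditional distributions in (b) and (c) suggest a Gibbs sampler, and a minimal sketch may help make the idea in (d) concrete. It assumes the Inverse-Gamma(α, λ) density is proportional to $y^{-\alpha-1}e^{-\lambda/y}$, so that a draw can be obtained as λ divided by a Gamma(α, 1) draw. The simulated data, the values `mu = 0`, `A = 10`, `nu = 1`, the chain length and the burn-in are all hypothetical choices made only so the example runs.

```python
import random

def gibbs_half_t(x, mu, A, nu, n_iter=5000, burn_in=1000, seed=0):
    """Alternate draws from f(sigma^2 | a, x) and f(a | sigma^2, x)."""
    rng = random.Random(seed)
    n = len(x)
    ss = sum((xi - mu) ** 2 for xi in x)
    a = 1.0  # arbitrary starting value for the auxiliary variable
    draws = []
    for it in range(n_iter):
        # sigma^2 | a, x ~ Inverse-Gamma(n/2 + nu/2, 0.5*ss + nu/a)
        sigma2 = (0.5 * ss + nu / a) / rng.gammavariate(n / 2 + nu / 2, 1.0)
        # a | sigma^2, x ~ Inverse-Gamma(1/2 + nu/2, nu/sigma^2 + 1/A^2)
        a = (nu / sigma2 + 1 / A ** 2) / rng.gammavariate(0.5 + nu / 2, 1.0)
        if it >= burn_in:
            draws.append((sigma2, a))
    return draws

def credible_interval(values, level=0.95):
    """Equal-tailed interval from the empirical quantiles of the draws."""
    s = sorted(values)
    lo = s[int((1 - level) / 2 * len(s))]
    hi = s[int((1 + level) / 2 * len(s)) - 1]
    return lo, hi

# Hypothetical data: 50 draws from N(0, 2^2), used only for illustration.
data_rng = random.Random(42)
x = [data_rng.gauss(0.0, 2.0) for _ in range(50)]

draws = gibbs_half_t(x, mu=0.0, A=10.0, nu=1.0)
lo, hi = credible_interval([s2 for s2, _ in draws])
print("95% credible interval for sigma^2:", (lo, hi))
```

The retained pairs are approximate draws from the joint posterior, and the interval in (ii) uses only their first component.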

5. The Weibull distribution is an example of a Generalized Extreme Value distribution and can be used for modelling nearest-neighbour distances between particles of ideal gases. Let x1 , x2 , . . . , xn be an i.i.d. sample from a Weibull distribution with shape parameter k > 0 and transformed scale parameter θ > 0, such that its pdf (in Bayesian notation) is

$$f(x \mid \theta, k) = \frac{k}{\theta}\, x^{k-1} e^{-x^k/\theta}, \quad x > 0. \qquad (4)$$

Suppose that k is known and θ is the unknown parameter of interest.

(a) Show that the Weibull distribution (4) with $k$ known forms a one-parameter exponential family. Identify the sufficient statistic $t(x)$, canonical parameter $\eta(\theta)$ and normalizing function $c(\theta)$ for this family. [2 marks]

(b) Using your results from part (a), or otherwise, construct a conjugate family of prior distributions for $\theta$. [2 marks]

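(c) Show that the resulting posterior distribution is of the form

$$f(\theta \mid x) \propto \theta^{-(\alpha+n)-1} \exp\left\{ -\frac{\lambda + \sum_{i=1}^{n} x_i^k}{\theta} \right\},$$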

where α, λ > 0 are freely-chosen hyperparameters, and identify this distribution. [2 marks]

(d) It is given to you that

$$\int_0^\infty \theta^{-a-1} e^{-b/\theta}\, d\theta = \frac{\Gamma(a)}{b^a},$$

for any a, b > 0. Using this result, or otherwise, show that the posterior mean is given by

$$E(\theta \mid x) = \frac{\lambda + \sum_{i=1}^{n} x_i^k}{\alpha + n - 1}.$$ [2 marks]

(e) What is the effect of the hyperparameters α and λ on the posterior mean “like”? Give an interpretation using the concept of equivalent prior observations. [1 mark]

(f) Using your result from (e), or otherwise, describe how you would set the hyperparameter values α, λ to reflect minimal prior knowledge of the underlying process. Is your subsequent prior a proper or improper distribution? [2 marks]

A sample of n = 12 nearest-neighbour distances ($\times 10^{-9}$ m) between nitrogen particles in a greenhouse gave the following measurements:

x = [4.42, 1.36, 4.61, 2.17, 4.98, 4.72, 5.20, 2.56, 6.41, 4.78, 2.86, 2.82]

Because nitrogen is (approximately) an ideal gas, we can follow Chandrasekhar (1943)² and model these inter-particle distances using a Weibull distribution with k = 3.

(g) Using your results from the previous parts, or otherwise, construct a posterior 95% credible interval for the scale parameter θ for nitrogen particles in this greenhouse. [3 marks]
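
A hedged sketch of one way to approach part (g): if the posterior is Inverse-Gamma with shape α + n and rate λ + Σ xᵢᵏ (density proportional to $\theta^{-(\alpha+n)-1}e^{-(\lambda+\sum x_i^k)/\theta}$), an equal-tailed interval can be approximated by Monte Carlo. The hyperparameter values `alpha = lam = 0.01`, the number of draws and the seed are hypothetical choices for illustration; exact quantiles could be used instead where an inverse-gamma quantile function is available.

```python
import random

# Data and known shape parameter from the question.
x = [4.42, 1.36, 4.61, 2.17, 4.98, 4.72, 5.20, 2.56, 6.41, 4.78, 2.86, 2.82]
k = 3

# Hypothetical, weakly informative hyperparameters (an assumption).
alpha, lam = 0.01, 0.01

n = len(x)
t = sum(xi ** k for xi in x)   # sufficient statistic: sum of x_i^k
post_shape = alpha + n         # posterior Inverse-Gamma shape
post_rate = lam + t            # posterior Inverse-Gamma rate

# If G ~ Gamma(shape, 1), then rate / G ~ Inverse-Gamma(shape, rate).
rng = random.Random(1)
draws = sorted(post_rate / rng.gammavariate(post_shape, 1.0)
               for _ in range(100000))

lo = draws[int(0.025 * len(draws))]
hi = draws[int(0.975 * len(draws)) - 1]
post_mean = post_rate / (post_shape - 1)

print("posterior mean of theta:", post_mean)
print("approximate 95% credible interval:", (lo, hi))
```

The interval is reported on the scale of θ, in the same units as xᵏ.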

6. A survey³ of 1174 Australian university students on their perceptions towards academic integrity returned the following counts for Item 15: “Have you copied information directly from a web site without referencing the source?”:

Often        Sometimes
(3+ times)   (1-2 times)   Never   Total
94           258           822     1174

Let $p_1$, $p_2$ and $p_3 = 1 - p_1 - p_2$ be the underlying proportion of Australian university students who often, sometimes, or never copy from a web site without referencing, respectively.

(a) Assuming that the data come from a multinomial distribution, show that the prior

(p1 , p2 ) ~ Dirichlet(α1 , α2 , α3 ) with hyperparameters α1 , α2 , α3 > 0 ,

is conjugate for this problem. [Recall: the joint pdf of (p1 , p2 ) ~ Dirichlet(α1 , α2 , α3 ) is

$$f(p_1, p_2) \propto p_1^{\alpha_1-1}\, p_2^{\alpha_2-1}\, (1 - p_1 - p_2)^{\alpha_3-1}, \quad 0 \le p_1, p_2 \le 1.$$] [2 marks]

(b) By differentiating the posterior density $f(p_1, p_2 \mid \text{data})$, or otherwise, show that the posterior modes for $p_1$ and $p_2$ are given respectively by

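$$\text{mode}(p_1 \mid \text{data}) = \frac{\alpha_1 + 93}{\alpha_1 + \alpha_2 + \alpha_3 + 1171},$$

$$\text{mode}(p_2 \mid \text{data}) = \frac{\alpha_2 + 257}{\alpha_1 + \alpha_2 + \alpha_3 + 1171}.$$ [3 marks]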

(c) Complete the following sentence: “The effect of the prior on the posterior mode of $p_1$ is like . . . ” [2 marks]

In a more recent survey⁴ a subset of 814 Australian university students were asked a similar question, returning the following counts:

Often        Sometimes
(3+ times)   (1-2 times)   Never   Total
267          155           390     814

(d) Explain how you would update your posterior belief about $p_1$, $p_2$ and $p_3 = 1 - p_1 - p_2$ now that you’ve seen these newer survey results. [2 marks]
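
One way to picture the sequential updating in (d) is the sketch below: under Dirichlet-multinomial conjugacy, observed category counts are added to the current Dirichlet parameters, so the posterior after the first survey can serve as the prior for the second. The uniform prior `(1, 1, 1)` and the function names are hypothetical choices for illustration, the three category counts are used exactly as printed in the tables, and the mode formula applies only when every parameter exceeds 1.

```python
def dirichlet_update(alpha, counts):
    """Posterior Dirichlet parameters: add each count to its parameter."""
    return tuple(a + c for a, c in zip(alpha, counts))

def dirichlet_mode(alpha):
    """Mode of a Dirichlet distribution (valid when every alpha_i > 1)."""
    k = len(alpha)
    total = sum(alpha)
    return tuple((a - 1) / (total - k) for a in alpha)

prior = (1.0, 1.0, 1.0)        # hypothetical uniform prior
first_survey = (94, 258, 822)  # often, sometimes, never
second_survey = (267, 155, 390)

post1 = dirichlet_update(prior, first_survey)
post2 = dirichlet_update(post1, second_survey)  # yesterday's posterior is today's prior

print("after first survey: ", post1, dirichlet_mode(post1))
print("after second survey:", post2, dirichlet_mode(post2))
```

Updating in two stages gives the same parameters as a single update with the pooled counts, which is the sense in which the order of the surveys does not matter under this model.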