关键词 > STAT3405/STAT4066

STAT3405/STAT4066 Department of Mathematics and Statistics

发布时间：2023-09-13

Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

Department of Mathematics and Statistics

STAT3405/STAT4066

Important: This assignment is assessed. Your work for this assignment must be submitted via LMS by 11:00pm on Monday, 11 September 2023.

The expectation for a submission are:

. The questions are answered in complete sentences. Marks will be awarded for the cor- rectness of the answers and that they are given in complete sentences.

The answers to the questions should be submitted via LMS.

. That numerical answers are rounded to an appropriate number of digits.

In particular, R code developed for Task 1(a) should be submitted as an attachment in your LMS submission. But so should be the BUGS or Stan code that you use to solve Task 3.

Submission of code must be in plaintext format, do not use any binary format. Preferably, the attachment should be a single R Notebook with ending *.Rmd.

Marks will be awarded for

– the correctness of the code, i.e. how easy it is to read it1 ; and

– how easy it is to get the code to run, if necessary.

You may receive comments on the efficiency of your code, but there are no marks for efficiency.

Unless special consideration is granted, any student failing to submit work by the deadline will receive a penalty for late submission (as described in the unit outline).

Task 1. This task explores the Metropolis-Hastings (MH) algorithm used in the second com- puter lab a bit further. However, for this assignment, we will use the likelihood based on the data, discussed in the sixth lecture, from a survey that was done on bicycle and other vehicular traffic in the neighbourhood of the campus of the University of California, Berkeley. As in the computer lab, we will use Zellner’s prior:

f(p) ∝ pp (1 − p)1−p

for the parameter p in our binomial model for the survey data..

In the second lab, we used an MH algorithm with a proposal distribution that was inde- pendent of the current location of the chain. While such MH algorithm can be used, they are typically quite inefficient if the proposal distribution is not “close” to the distribution one wants to sample from. Another popular way of implementing an MH algorithm is to use a random walk MH algorithm2 . In such an implementation the proposed new state is obtained by adding some random noise (typically using a normal distribution with mean zero) to the current state. In other words, the proposal distribution is a normal distribution with mean equal to the current state and some variance (which serves can serve as a tuning parameter for the MH algorithm).

An obvious difficulty arises when we try to use a random walk MH algorithm when modelling a probability, which is always between 0 and 1. Adding random noise to the current state could lead to a proposal that is outside the interval (0, 1)! We can circumvent this difficulty by first transforming the current state (a value between 0 and 1) via a transformation to a value

p(b)r(e)o(t)p(w)o(e)s(e)a(n)l on thi(−∞ a)s(n)t(d)ran(∞.)sform(We c)e(a)d(n)scale and th(then add so)en(me)tr(n)a(o)n(is)sf(e)o(t)r(o)m(t)t(h)h(e)is(tr)proposa(ansform)l(e)back to th(d current s)e(t)i(a)nt(te)erva(to o)l(b)0(a)1(n)

Specifically, in this exercise we will use the probit transformation, which is the inverse function of the cumulative distribution function (CDF) of the standard normal distribution:

probit(p) = Φ−1 (p). This transformation is readily implemented in R:

probit <- function(p) qnorm(p)

The inverse of the probit transformation is, of course, the CDF of the standard normal distri- bution, h(x) = Φ(x). This function takes as argument a value on the real line and transforms it (back) to the interval (0, 1). This function is also easily implemented in R:

invprobit <- function(x) pnorm(x)

Finally, it can be shown that the proposal distribution for this process (i.e. (1) transforming the current state p(t−1) to the real line using the probit transformation to obtain, say, u(t−1) , (2) adding random noise coming from a normal distribution with mean 0 and variance σ 2 to u(t−1) to obtain, say, w(t), and (3) transforming w(t) back to (0, 1) using the inverse probit transformation (i.e. the CDF of the standard normal distribution) to obtain the proposal x(t)) has the following PDF:

q(x|p) = 2πσ(1)2 φ(prob(1)it(x)) exp ( − 2σ(1)2 {probit(x) − probit(p)}2 ） o(if)th(0)erwise(< x <) 1 and 0 < p < 1

where

φ(z) = e2 , z ∈ R

is the probability density function of the standard normal distribution (implemented as dnorm() in R).

While this formula looks impressive, the corrective term for using this particular non- symmetric proposal distribution in the MH algorithm is actually quite simple:

q (p(t−1)|x(t),

q (x(t) |p(t−1),

φ(probit (x(t),,

φ(probit (p(t−1),,

Now

(a) Implement the random walk MH algorithm as described above. Appropriate code for this part of the task should be submitted.

(b) What is the acceptance rate of your MH algorithm if you use σ = σ 2 = How would you describe the Markov chain produced by this setting of σ?

(c) What is the acceptance rate of your MH algorithm if you use σ = 3? How would you describe the Markov chain produced by this setting of σ?

(d) For what value of σ do you obtain an acceptance rate of around 70%? How does the corresponding Markov chain look like?

Hint: To answer the last three question you may want to look at the trace plot of the chains and also look at a plot of their auto-correlation functions.

Task 2. In the sixth lecture we discussed data from a survey was done of bicycle and other vehicular traffic in the neighbourhood of the campus of the University of California, Berkeley.

Assume we model the observations yi , i = 1, . . . , 10, as being realisations from independent binomial distributed random variables Yi with parameters mi and p. Regard the mis as fixed and given. To finalise the specification of our Bayesian model assume that we use a Beta(α,β) prior for p, the parameter of interest.

(a) Determine the Bayesian estimates ˆ(p)k that you obtain when selecting the hyper-parameters

B(α)aye(an)s(d)ian(β)ies(n)ti(t)m(he)at(p)e(r)s(io)kaa(s)ga(α)ins(=)t(β)log(=)ly(k) th(=)ip(1)ot(0),h(1)s(2), t(.)o(.) .e(2)s(0)ubmitte(. Produ)d(c)e a plot of the

(b) Discuss the plot briefly (i.e. less than a paragraph). Which priors Beta(γk,γk) would you call non-informative? Which priors would you consider to be mildly informative? Which would you call strongly informative?

Task 3. The following table contains data on mortality rates from 12 English hospitals carrying out heart surgery on children under 1 year old between 1991 and 1995.

	Hospital	Operations mi	Deaths yi
1	Bristol	143	41
2	Leicester	187	25
3	Leeds	323	25
4	Oxford	122	23
5	Guys	164	25
6	Liverpool	405	42
7	Southampton	239	24
8	Great Ormond St	482	53
9	Newcastle	195	26
10	Harefield	177	25
11	Birmingham	581	58
12	Brompton	301	31

For these data:

(a) Fit a model that models yi , i = 1, . . . , 12, as being a realisation from a binomial dis- tributed random variable Yi with parameters mi and p. Regard the mis as fixed and

given, and put your favourite non-informative prior on p.

What is your Bayesian estimate for p?

(b) Fit a model that models yi , i = 1, . . . , 12, as being a realisation from a binomial dis- tributed random variable Yi with parameters mi and pi. Regard the mis as fixed and

given, and put your favourite non-informative prior on the pis.

What are your Bayesian estimate for the pis?

For this model, also look at the difference between the smallest of the pis to the largest of the pis, i.e. study the parameter

r = i12 pi − i12 pi

Include a plot of the posterior density of r in your submission. What is your Bayesian estimate for r? What is the standard deviation of the posterior distribution of r?

(c) Given the results of the previous question, which model do you think is more appropriate for these data? Discuss briefly (i.e. less than a paragraph).