闪电代写 -代写CS作业_CS代写_Finance代写_Economic代写_Statistics代写_代码代做_IT代写_加急帮助

Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

MATH6173 Statistical Computing — Coursework 2

Instructions for Coursework 2

1. This coursework assignment is worth 50% of the overall mark for the module.

2. The deadline is 1500 on Wednesday 11 January 2023 and your completed work must be submitted electronically via Blackboard.

3. Your submission must consist of exactly one written report, which contains the answers (including the required mathematical derivation and graphs) to the relevant questions, and one R script ﬁle, which contains all the R code you used to obtain your results all questions. In the report ﬁle, you must label clearly to which questions the answers are associated. In the R script, you must use comments to make it clear to which questions the R codes are answering. Penalties will be applied if any R codes required are missing from the R script. Please ensure the submitted ﬁles have the correct ﬁle format; otherwise it cannot be marked. Note an R markdown ﬁle is a diﬀerent format to an R script ﬁle.

4. Your written report should be in PDF format. Please name your report as .pdf. For example, if your student ID is 12345678, then your report must have the ﬁlename 12345678 .pdf.

5. Your R script ﬁle must have the ﬁlename .R. For example, if your student ID is 12345678, then your script must have the ﬁlename 12345678 .R.

6. Your entire R script ﬁle must be reproducible and run without any errors.

7. You must present elegant and informative plots in the written report, such as using appropriate plotting area, points/lines sizes and colours. Also please include meaningful labels, titles and legends.

8. You must informatively comment your R code including indication of where each question and task begins. When writing an R function, the 1) function and 2) argument names must be exactly the same as those speciﬁed in the questions; otherwise a heavy penalty will be applied.

● Please note that uppercase and lowercase of the same alphabet are not the same character, so if the question asks to write a function/argument called, apple, but a function/argument called

applE is presented in your answer, than that would be wrong.

9. You must not load any add-on packages, e.g., using function library or require. Otherwise, a penalty will incur.

10. The standard Faculty rules on late submissions apply: for every weekday, or part of a weekday, that the coursework is late, the mark will be reduced by a factor of 10%. No credit will be given for work which is more than one week late.

11. All coursework must be carried out independently. You are reminded of the University’s Academic Integrity Policy, see https://www.southampton.ac.uk/quality/assessment/academic_integrity.page.

12. In the interests of fairness, queries which relate directly to the coursework must be made via the MATH6173: Statistical Computing Discussion Forum on Blackboard. This ensures that every- body has access to the same information.

Total marks available: 100

Question 1: A-R Sampling

[23 marks]

Consider the target distribution π with the probability density function (pdf),

)

π(x) = (

)

(

asa

if s < x,

if s 乂 b s x s s,

otherwise (1)

where the parameters a > 0, b > 0 and s > 0.

The goal is to draw samples from π(x) using A-R sampling, that has a proposal density set to the double exponential density c(λ) with the pdf

g(x) = exp(乂λIx 乂 bI), (2)

where the rate parameter λ > 0, and b is the same as that in equation (1).

(a) [4 marks] Assume the parameter values a = 2, b = 3 and s = 1.25. There are two potential values of interest, 1 and 0.5, for the parameter λ in equation (2). In the report ﬁle, explain which of those two values for λ produces a proposal density that has a higher acceptance probability?

Note: Rigorous proof is not required for this question, but the reasoning must be clear and logical. Figures may be useful to support your explanation.

(b) [6 marks] Use the parameter values a = 2, b = 3, s = 1.25 and the chosen value of λ in part (a) to design an A-R sampling algorithm to generate random values from π(x). Also, select an appropriate

Important:

● In the description of the algorithm, any equations and mathematical expressions must be speciﬁc to the context of this question.

● You must present the description of the algorithm in the report ﬁle .

Arguments:

● n: a positive integer value representing the number of random values to be generated.

● a: a positive numeric value for parameter a in equation (1).

● b: a positive numeric value for parameter b in equation (1) and (2).

● s: a positive numeric value for parameter s in equation (1).

● lambda: a positive numeric value for parameter λ in equation (2).

● M: a positive numeric value.

Computation:

● This R function implements the algorithm designed in part (b), but allow a, b, s, λ and M to be speciﬁed through the arguments.

Returns:

● A numeric vector containing the sequence of n values drawn from π(x) deﬁned equation (1).

(d) [3 marks] Set the seed to 2022, and use the R function q1 .ar created in part (c) to generate 5000 random values with parameter values a = 2, b = 3, s = 1.25 and the chosen value of λ in part (a). In the report ﬁle, compare their empirical distribution with equation (1) by presenting the distribution of the simulated samples as a histogram, and on that same ﬁgure plot a line representing the true pdf ofπ(x) in equation (1).

Important: This question asks for the same type of ﬁgure as the solution to Questions 2 (c) of the

Week 9 Lab.

(e) [4 marks] The kth root of the kth moment of X is given by

(E [X k ]) = ┌.-o(o) xk π(x)dx┐ k . (3)

Write R commands to estimate the statistic speciﬁed by equation (3) for k = 2 and k = 3, using the 5000 values already sampled in part (d). Report your answers (to 3 d.p.) as comments below your R commands for this question in the R script.

Question 2: Importance Sampling

[ 19 marks] Measurements of precipitation are usually presented as percentages in weather forecasts, e.g., “The precipitation at 3 PM today is 3%”. Let {X/ , ..., Xd } be a sequence of d + 1 precipitation values (converted from percentage to proportion), where the Xi e [0, 1] is recorded at time ti with til1 2 ti . The logit of the precipitation value at time ti is deﬁned as

Ui = log ╱ 1 乂(X)X(i)i 、 for all i e {0, ..., d} . (4)

We model the sequence U = (U1 , ..., Ud ) by a continuous-time random walk conditioned on U/ = u/ , which

means

Ui ~ Normal(µ = Ui - 1 , σ 2 = ti 乂 ti - 1 ) for all i e {1, ..., d} , (5)

where µ and σ 2 are the mean and variance of a normal distribution.

(a) [4 marks] Let a = (a1 , ..., ad ) with 0 < ai < 1 for all i e {1, ..., d}. We are interested in estimating the probability

ptail (a) = Pr(X1 > a1 , X2 > a2 , ..., Xd > ad IX/ = x/ ) (6)

by the standard importance sampling.

Consider the shifted exponential distribution that has the pdf

f (x; λ, s) = λ exp [乂λ(x 乂 s)] for x 2 s, (7)

where the parameter s e R shifts the exponential distribution. In the report ﬁle, explain how should we use f (x; λ, s) in equation (7) to form a proposal distribution q for generating the random vector U , such that q avoids wasting simulations on sampling vectors that will not be used to calculate ptail (a)?

(b) [6 marks] Design an importance sampling algorithm that uses the proposal density q formed in part (a) to ﬁnd ptail in equation (6), given the initial precipitation value X/ = x/ .

Important:

● In the description of the algorithm, any equations and mathematical expressions must be speciﬁc to the context of this question.

● You must present the description of the algorithm in the report ﬁle .

Arguments:

● n: a positive integer representing the number of samples to be generated to calculate ptail .

● x0: a numeric value representing the observed value of X/ .

● time: a numeric vector representing the times {t/ , ..., td } when the d values of {X/ , ..., Xd } are observed.

● a: a numeric vector containing the d values of {a1 , ..., ad } in equation (6).

● lambda: a numeric value representing the value of the parameter λ in equation (7).

Computation:

● This R function implements the algorithm designed in part (b) to estimate ptail (a).

Returns:

● A numeric value representing the estimate of probability ptail (a) calculated in Computation.

(d) [3 marks] Set the seed to 2022, and use the R function created in part (c) to estimate Pr(X1 > 50, X2 > 70, X3 > 90IX/ = 30) with t/ = 0, t1 = 0.7, t2 = 1.2, t3 = 2.5 by generating 100000 samples and set λ = 0.5. Report your answers (to 3 signiﬁcant ﬁgures) as comments below your R commands for this question in the R script.

Question 3: Markov Chain Monte Carlo (MCMC)

[20 marks]

Recall the Markov model described in Question 4 of Coursework 1. Assume that data has been collected on

the weather states for n days, X = {X1 , ..., Xn } with Xi e X = {s, c, r} for all i e {1, ..., n}, where s = sunny, c = cloudy and r = rainy. However, the transition probabilities PT are unknown and are to be estimated in a Bayesian framework. Based on the Markov model, the likelihood of PT is given by

p(XIPT , π) = p(X1 ) ) pT (Xi IXi - 1 ), (8)

i=2

where pT (Xi = bIXi - 1 = a) with a, b e X is the entry pab in the transition probability matrix

PT = ╱、

(prs prc prr ) , (9)

and the initial probabilities for the weather state is given by π = (πs , πc , πr ), i.e., Pr(X1 = x) = πx for x e X.

The Dirichlet distribution is the multivariate extension of the Beta distribution and provides a distribution

over random vectors V = (V1 , ..., Vd ) with the constraint lj(d)=1 vj = 1. The pdf of a d-dimensional Dirichlet (a)

distribution is deﬁned as

f (v) = vj(a)j - 1 , (10)

which has the parameters a = (a1 , ..., ad ) with a1 , ..., ad > 0.

Let px = (pxs , pxc , pxr ) be a random vector with values in the row of PT conditioned on weather state x. For all x e X, we apply a Dirichlet(1, 1, 1) prior on px , i.e.,

px ~ Dirichlet(1, 1, 1) for all x e X, (11)

where Dirichlet(1, 1, 1) is a uniform distribution over all vectors V , and px is a priori independent of the values of π . Thus, given that π is known, the posterior distribution of PT follows

p(PT IX, π) x p(XIPT , π). (12)

(a) [5 marks] Design an algorithm that uses MCMC to estimate the posterior distribution of PT . At each step j of the MCMC, use Dirichlet ╱20 x px(/j) - 1户、as the proposal density q(pIpx(/j) - 1户 ), where p is the proposed vector for step j, while px(/j) - 1户 is the sampled vector for step j 乂 1.

Important:

● In the description of the algorithm, any equations and mathematical expressions must be speciﬁc to the context of this question.

● You must present the description of the algorithm in the report ﬁle .

(b) [3 marks] Write an R function called calcDirichletPDF with the following features.

Arguments:

● x: a non-negative numeric vector wherein the elements sum up to 1.

● a: a positive numeric vector that contains the values of a in equation (10) and has the same length as x.

Computation:

● Use equation (10) to evaluate the natural logarithm of the density for x under the Dirichlet distribution with parameters speciﬁed by a.

Hint: The lgamma function may be useful here.

Returns:

● A numeric vector representing the log pdf calculated as in Computation.

Arguments:

● a: a positive numeric vector containing the values of a in equation (10).

Computation:

● For each j in {1, ..., d}, where d is the number of elements in a, sample Uj

Uj ~ Gamma(aj , 1), (13)

where aj is the jth element of a.

● Compute the vector

V = (U1 , ..., Ud ), (14)

where V is a vector drawn from the Dirichlet distribution with parameters speciﬁed by a.

Returns:

● A numeric vector representing V (in equation 14) as generated in Computation.

(d) [6 marks] Write an R function called q3 .mcmc with the following features.

Arguments:

● X: a character vector, where each element must be one of the three character ,

● initStateProbs: a numeric vector containing three elements, where the ﬁrst, second and third elements respectively correspond to the initial probabilities πs , πc and πr .

● initPT: a numeric 3 x 3 matrix representing the initial values of PT .

● burnin: a numeric value represents the length of chain for the burnin period of the MCMC. Note: Here, the burnin period of the chain does not include the initial value of PT .

● n: a numeric value indicating the length of chain after the burnin period of the MCMC.

Computation:

● Using the calcDirichletPDF and simDirichletRV functions created in part (b) and (c), respec- tively, this R function implements the MCMC algorithm designed in part (a) to produce the posterior distribution of PT .

Returns:

● An n x 3 x 3 array object, which contains n posterior samples of PT generated after the burnin period in Computation.

(e) [3 marks] Set the seed to 2022, and use the q3 .mcmc function created in part (d) to produce 100000 samples from the posterior distribution of PT after removing the ﬁrst 10000 samples as burnin. Assume that π = (1/3, 1/3, 1/3) and that every element in the initial value of PT is 1/3. The value of X is recorded in the text ﬁle q3Weather .txt. Calculate the posterior mean of PT . Report the estimated posterior mean (to 3 d.p.) as comments below your R commands for this question in the R script.

Question 4: Mixture of Linear Regressions

[30 marks]

Let Y = (Y1 , ..., Yn ) be a vector of outcomes and x = (x1 , ..., xn ) be the values of the covariate X for n observations. Assume that Y is generated from a mixture model of K linear regressions, where K is a positive integer. That is the ith outcome Yi is generated from its covariate value xi and one of K linear regressions in the mixture model, such that

Yi IXi , Zi = k ~ Normal(ak + bk xi , σk(2)), (15)

where ak , bk and σk are the intercept, slope and standard deviation of the error terms for the kth linear regression. The random variable Zi indicates from which linear regression Yi is generated, we assume

Zi ~ Multinimial(π1 , ..., πK ), (16)

which means p(Zi = k) = πk .

Consider the incomplete data scenario, where the variables Zi ’s are hidden. Let θ be the parameters of a mixture of linear regression. Speciﬁcally, θ = {π, a, b, σ}, where π = (π1 , ..., πK ), a = (a1 , ..., aK ), b = (b1 , ..., bK ) and σ = (σ1 , ..., σK ).

Follow parts (a)– (d) outlined below to write a program in R that uses the expectation-maximisation (EM) algorithm to ﬁnd the maximum likelihood estimates (MLEs) of the parameters θ .

(a) [ 6 marks] Recall that

p(Zi = kIXi , θ cur ) = p(Zi = k)p(Xi IZi = k, θ cur )

where θ cur = {πcur , acur , bcur , σ cur } with the vectors π cur , acur , bcur and σ cur respectively representing the current values of π , a, b and σ .

Write an R function, called linRegMixEstep, which has the following features.

Arguments :

(1) piCur: a numeric vector that contains K elements representing values in π cur .

(2) aCur: a numeric vector that contains K elements representing values in acur .

(3) bCur: a numeric vector that contains K elements representing values in bcur .

(4) sCur: a numeric vector that contains K elements representing values in σ cur .

(5) x: a numeric vector of length n, containing the values of x.

(6) Y: a numeric vector of length n, containing the values of Y .

Computation :

● This R function calculates the conditional distribution p(Zi = kIxi , Yi , θ cur ) for all k e {1, ..., K} and all i e {1, ..., n}.

Returns :

● A numeric matrix with n rows and K columns, where the entry in the ith row and kth column is the value for

p(Zi = kIxi , Yi , θ cur ).

(b) [ 9 marks] Let θ new = (πnew , anew , bnew , σ new ) with π new = (π1(n)ew , ..., πK(new)), (a1(n)ew , ..., aK(ne)w ), (b1(n)ew , ..., bK(ne)w ) and (σ1(n)ew , ..., σK(new)). The expressions for elements of θ new that optimize the function Q(θ, θcur ) is given by

π k(new) = , (18)

╱b(a)nknke(e)w(w)、 = (xT wk x)- 1 xT wk Y, (19)

σ k(new) = 1 l p(Zi l(=)xi(乂),θ(a)nkc乂 bk(n)ew xi )2 , (20)

for all k e {1, ..., K}. The design matrix x has n rows and two columns. In x, every element in the ﬁrst column has the value 1, while the second column is the vector x. wk is an n x n diagonal matrix with ith diagonal entry being p(Zi = kIxi , θ cur ).

(i) Write an R function, called calcNewCoefs, which has the following features. Arguments :

(1) W: the n x K numeric matrix produced by the linRegMixEstep function in part (a).

(2) x: same as in part (a).

(3) Y: same as in part (a).

Computation :

● This R function calculates anew = (a1(n)ew , ..., aK(ne)w ) and bnew = (b1(n)ew , ..., bK(ne)w ) deﬁned by equation (19).

Returns :

● A list containing two objects:

(1) The ﬁrst object is the numeric vector representing anew , and

(2) the second object is the numeric vector representing bnew .

(ii) Write an R function, called calcNewSd, which has the following features.

Arguments :

(1) W: the n x K numeric matrix produced by the linRegMixEstep function in part (a).

(2) x: the same as in part (a).

(3) Y: the same as in part (a).

(4) a: a numeric vector containing K elements, which represent the appropriate intercept estimates for calculating σ new .

(5) b a numeric vector containing K elements, which represent the appropriate slope estimates for calculating σ new .

Computation :

● Calculate σ new , σK(new)) deﬁned by equation (20).

Returns :

● A vector of length K representing σ new .

(iii) Write an R function, called linRegMixMstep, which has the following features.

Arguments :

(1) W: the n x K numeric matrix produced by the linRegMixEstep function in part (a).

(2) x: same as in part (a).

(3) Y: same as in part (a).

Computation :

● This R function ﬁnds values of θ that optimise the function Q(θ, θcur ) by using equation (18) to calculate π new = (π1(n)ew , ..., πK(new)), while using the functions created in parts (i) and (ii) to

calculate anew bnew and σ new .