
Semester 2 Assessment, 2020

MAST30027 Modern Applied Statistics

Question 1 (7 marks)

Let X1 , ··· ,Xn  be independent samples from a Normal distribution N(1, ) with pdf

(x1)2  .

(a) What is the log-likelihood for this example?

(b) What is the Fisher information for this example?

(c) Find the MLE of τ and its asymptotic distribution.

Question 2 (9 marks)

The dvisits data in the faraway package comes from the Australian Health Survey of 1977–78 and consists of 5190 observations on single adults, where young and old have been oversampled. Here, we consider doctorco as a response and sex, age, income, levyplus, freepoor, freerepa, illness, actdays as predictor variables. The description of each variable is as follows.

• doctorco: number of consultations with a doctor or specialist in the past 2 weeks

• sex: 1 if female, 0 if male

• age: age in years divided by 100

• income: annual income in Australian dollars divided by 1000

• levyplus: 1 if covered by a private health insurance fund for private patients in a public hospital (with doctor of choice), 0 otherwise

• freepoor: 1 if covered by government because of low income, recent immigrant, or unemployed, 0 otherwise

• freerepa: 1 if covered by government because of old-age or disability pension, or because of invalid veteran or family of deceased veteran, 0 otherwise

• illness: number of illnesses in past 2 weeks, with 5 or more coded as 5

• actdays: number of days of reduced activity in past two weeks due to illness or injury

Examine the R code and output below, and then answer the questions that follow.

> library(faraway)
> data(dvisits)
> modelA <- glm(doctorco ~ sex + age + income + levyplus + freepoor
+               + freerepa + illness + actdays,
+               family=quasipoisson, data=dvisits)
> summary(modelA)

Call:
glm(formula = doctorco ~ sex + age + income + levyplus + freepoor +
    freerepa + illness + actdays, family = quasipoisson, data = dvisits)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.7696  -0.6865  -0.5773  -0.4906   5.5745

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.055666   0.115808 -17.751   <2e-16 ***
sex          0.163442   0.064454   2.536   0.0112 *
age          0.296311   0.186496   1.589   0.1122
income      -0.195493   0.098511  -1.984   0.0473 *
levyplus     0.143743   0.082153   1.750   0.0802 .
freepoor    -0.404611   0.206938  -1.955   0.0506 .
freerepa     0.118603   0.105656   1.123   0.2617
illness      0.211644   0.019482  10.864   <2e-16 ***
actdays      0.133576   0.005264  25.377   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for quasipoisson family taken to be 1.328231)

    Null deviance: 5634.8  on 5189  degrees of freedom
Residual deviance: 4394.3  on 5181  degrees of freedom
AIC: NA

Number of Fisher Scoring iterations: 6

> modelB <- glm(doctorco ~ sex + age + income,
+               family=quasipoisson, data=dvisits)
> anova(modelB, modelA, test="F")
Analysis of Deviance Table

Model 1: doctorco ~ sex + age + income
Model 2: doctorco ~ sex + age + income + levyplus + freepoor + freerepa +
    illness + actdays
  Resid. Df Resid. Dev Df Deviance      F    Pr(>F)
1      5186     5434.9
2      5181     4394.3  5   1040.5 156.68 < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

> modelC <- glm(doctorco ~ sex + age + income + levyplus + freepoor
+               + freerepa + illness + actdays,
+               family=poisson, data=dvisits)
> modelD <- glm(doctorco ~ sex + age + income,
+               family=poisson, data=dvisits)
> summary(modelD)

Call:
glm(formula = doctorco ~ sex + age + income, family = poisson,
    data = dvisits)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.0350  -0.8031  -0.6749  -0.6069   6.3695

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.71473    0.09118 -18.805  < 2e-16 ***
sex          0.21565    0.05589   3.859 0.000114 ***
age          1.23798    0.13013   9.514  < 2e-16 ***
income      -0.27726    0.07969  -3.479 0.000502 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 5634.8  on 5189  degrees of freedom
Residual deviance: 5434.9  on 5186  degrees of freedom
AIC: 7774.4

Number of Fisher Scoring iterations: 6

> (phiC <- sum(residuals(modelC, type="pearson")^2/modelC$df.residual))
[1] 1.328231
> (phiD <- sum(residuals(modelD, type="pearson")^2/modelD$df.residual))
[1] 2.027811

(a) Do you prefer modelA or modelB, and why?

(b) Show how the F-statistic has been calculated in the analysis of deviance. What are the degrees of freedom for the F-statistic?

(c) Give the estimate of the coefficient of illness and its standard error for modelC.

(d) For modelD, what is the log-likelihood of the fitted model, and the log-likelihood of the full (saturated) model?
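The F statistic reported in the analysis of deviance table can be reproduced from the printed deviances and the dispersion estimate of the larger model. The following is a minimal R sketch that uses only numbers copied from the output above rather than the data themselves; the variable names (`dev_B`, `phi_A`, and so on) are illustrative and not part of the original session.

```r
# Values copied from the printed output (not recomputed from dvisits).
dev_B <- 5434.9    # residual deviance of the smaller model (modelB)
dev_A <- 4394.3    # residual deviance of the larger model (modelA)
df_B  <- 5186      # residual degrees of freedom, modelB
df_A  <- 5181      # residual degrees of freedom, modelA
phi_A <- 1.328231  # dispersion estimate of the larger model

# F = (drop in deviance / number of extra parameters) / dispersion estimate
f_stat <- ((dev_B - dev_A) / (df_B - df_A)) / phi_A
f_stat

# Upper-tail probability under an F distribution with
# (df_B - df_A, df_A) degrees of freedom
p_value <- pf(f_stat, df1 = df_B - df_A, df2 = df_A, lower.tail = FALSE)
p_value
```

Because the deviances are printed to one decimal place, the result agrees with the tabulated value only up to rounding.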

Question 3 (10 marks)

We assume that the observed data, $X_1 = 1$, $X_2 = 2$, $X_3 = 4$, follow a mixture of two Poisson distributions. Specifically, for $i = 1, 2, 3$,

$$Z_i \sim \text{categorical}(\pi, 1 - \pi),$$

$X_i \mid Z_i = 1 \sim \text{Poisson}(\lambda_1 = 1.2)$ and $X_i \mid Z_i = 2 \sim \text{Poisson}(\lambda_2 = 2.5)$, where the Poisson distribution has the probability mass function

$$f(x; \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}.$$

Assume that we derived and implemented the EM algorithm to obtain the MLE of the parameter $\pi$.

(a) Assume we ran the EM algorithm twice with different initial values. The following table shows the two estimates returned by the different runs of the EM algorithm. Which estimate should we use as the MLE of the parameter $\pi$? Why?

 

         initial value    estimate for $\pi$
run 1    0.1              0.3
run 2    0.9              0.4

(b) If $P(Z_i = 1 \mid X_i) > 0.5$, we assign a sample $X_i$ to cluster 1, and to cluster 2 otherwise. Using the MLE from part (a), compute $P(Z_2 = 1 \mid X_2 = 2)$ and $P(Z_2 = 2 \mid X_2 = 2)$. Which cluster is $X_2 = 2$ assigned to?
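Both comparing candidate estimates and applying the cluster assignment rule come down to evaluating the mixture likelihood. A minimal R sketch under the model stated in this question is given below. The function names `loglik` and `resp1` and the vector `candidates` are illustrative assumptions; the weight passed in is whatever candidate value of the mixing proportion is under consideration.

```r
x      <- c(1, 2, 4)   # observed data, as given in the question
lambda <- c(1.2, 2.5)  # known component rates, as given in the question

# Observed-data log-likelihood of the mixing weight pi1 = P(Z_i = 1)
loglik <- function(pi1, x, lambda) {
  sum(log(pi1 * dpois(x, lambda[1]) + (1 - pi1) * dpois(x, lambda[2])))
}

# Posterior probability P(Z_i = 1 | X_i = x) by Bayes' rule
resp1 <- function(pi1, x, lambda) {
  num <- pi1 * dpois(x, lambda[1])
  num / (num + (1 - pi1) * dpois(x, lambda[2]))
}

candidates <- c(0.3, 0.4)  # illustrative candidate values of pi1
sapply(candidates, loglik, x = x, lambda = lambda)

# Membership probabilities for the observation x = 2 under one candidate
p1 <- resp1(candidates[1], x = 2, lambda = lambda)
c(cluster1 = p1, cluster2 = 1 - p1)
```

EM can stop at different values from different starting points, so comparing the observed-data log-likelihood at each returned value is a common way to choose between runs.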

Question 4 (18 marks)

Model: We assume that $x$ and $y$ are independent and follow normal distributions: $x \sim N(\mu_1, 1^2)$, $y \sim N(\mu_2, 1^2)$.

Prior:  We impose the following bivariate normal prior for the mean parameters:

$\begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix} \sim N(\mu, \Sigma)$ with $\mu = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$ and $\Sigma =$    .

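Recall that $\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \sim N(\mu, \Sigma)$ with $\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}$ and $\Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix}$, iff $\mathbf{x}$ has joint density

$$f_{\mu, \Sigma}(\mathbf{x}) = \frac{1}{2\pi\sqrt{|\Sigma|}} \exp\left\{-\frac{1}{2}(\mathbf{x} - \mu)^T \Sigma^{-1} (\mathbf{x} - \mu)\right\}.$$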

(a) We wish to perform posterior inference using Gibbs sampling. Derive the conditional distribution

$$p(\mu_1 \mid \mu_2, x, y).$$

If it is a known distribution, identify the distribution name and its parameters.


(b) We wish to perform posterior inference using the Metropolis-Hastings algorithm. For the current values of the parameters $(\mu_1^{(c)}, \mu_2^{(c)})$, we propose new values $(\mu_1^{(n)}, \mu_2^{(n)})$ as follows: $\mu_1^{(n)} \sim N(0, 1^2)$ and $\mu_2^{(n)} \sim N(\mu_2^{(c)}, 1^2)$. Compute the acceptance probability when $(\mu_1^{(c)}, \mu_2^{(c)}) = (2, 0)$, $(\mu_1^{(n)}, \mu_2^{(n)}) = (3, 1)$, $x = 1$, $y = 0$.

(c) We wish to perform posterior inference using variational inference with the mean-field variational family where $q(\mu_1, \mu_2) = q_1(\mu_1) q_2(\mu_2)$ and use the CAVI algorithm for optimisation. The CAVI algorithm iteratively optimises each factor as follows while holding the other factor fixed:

$$q_1^*(\mu_1) \propto \exp\{E_{\mu_2}[\log p(\mu_1, \mu_2, x, y)]\},$$

$$q_2^*(\mu_2) \propto \exp\{E_{\mu_1}[\log p(\mu_1, \mu_2, x, y)]\},$$

where the expectations $E_{\mu_2}$ and $E_{\mu_1}$ are taken with respect to $q_2^*(\mu_2)$ and $q_1^*(\mu_1)$, respectively. Derive $q_1^*(\mu_1)$ and $q_2^*(\mu_2)$, and identify the corresponding distribution names and their parameters.
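The coordinate-wise structure of CAVI can be illustrated numerically. The R sketch below assumes unit observation variances and a zero prior mean, and it uses a placeholder prior covariance `Sigma_example` that is hypothetical and not taken from the question. The function name `cavi` and the data values are likewise illustrative. Under these assumptions each optimal factor is a normal density, so only its mean and variance need to be tracked.

```r
# Hypothetical inputs: Sigma_example is a placeholder prior covariance,
# NOT the one from the question; x and y are illustrative data values.
x <- 1
y <- 0
Sigma_example <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)

cavi <- function(x, y, Sigma, n_iter = 50) {
  Lambda <- solve(Sigma)           # prior precision matrix
  m <- c(0, 0)                     # initial variational means
  v <- c(1 / (1 + Lambda[1, 1]),   # variational variances; these do not
         1 / (1 + Lambda[2, 2]))   # change across iterations
  for (k in seq_len(n_iter)) {
    # Update q1: normal with variance v[1]; mean uses E[mu2] = m[2]
    m[1] <- v[1] * (x - Lambda[1, 2] * m[2])
    # Update q2: normal with variance v[2]; mean uses E[mu1] = m[1]
    m[2] <- v[2] * (y - Lambda[1, 2] * m[1])
  }
  list(mean = m, var = v)
}

fit <- cavi(x, y, Sigma_example)
fit$mean
fit$var

# Exact posterior mean under the same assumptions, for comparison
exact_mean <- solve(diag(2) + solve(Sigma_example), c(x, y))
exact_mean
```

In this conjugate setting the variational means converge to the exact posterior mean, while the variational variances are generally smaller than the exact marginal posterior variances, a known feature of mean-field approximations.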