闪电代写 -代写CS作业_CS代写_Finance代写_Economic代写_Statistics代写_代码代做_IT代写_加急帮助

Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

MATHEMATICAL STATISTICS

Semester 2

Generalised Linear Models - Revision exercise

2021

1. Exponential family. The random variable X has an inverse Gaussian distribution if it

has density

f (x) = (2πx3 σ 2 )- exp ┌ 一 ┐ , x > 0

where σ 2 > 0 and µ > 0 are unknown parameters.

(a) Show that the density function can be written as

f (x) = exp ┌ xθ一a(φ(b)~~(θ)~~ + c(φ, x)┐ .

Identify θ , b(θ), a(φ) and c(φ, x).

(b) Calculate E(X) and Var(X).

(c) Derive an expression for the deviance of the inverse Gaussian distribution. (d) Derive an expression for the quasi-log likelihood function.

(e) What is the canonical link funciton for the inverse Gaussian distribution?

2. Linear model. In determining which of 4 factors A, B, C, or D might aﬀect the underpuﬃng of cereal in a plant, the following design was implemented. The response scores are recorded on the array.

A1 A2

B1 B2 B1 B2

C1 D1 96.6 125.7

D2 34.5 22.4

C2 D1 14.1 9.5

D2 28.8 52.5

(a) By running all 6 models with 1 interaction term (e.g. AB) using glm or lm, determine

which interaction term(s) are aliased in this design.

(b) Excluding aliased interaction terms in (a), run further models with two interaction

terms. State an appropriate linear model for this design and report all model-ﬁt mea- sures.

3. Estimation.

(a) Find a function g such that Var(g(X)) is approximately constant when X = Z/n, where Z |

Bin(n, p).

(b) Show that hii = p, where hii = zi(T)(X\ X)-1 zi and rank(X) = p.

i=1

(c) Let the correlation matrix for a set of exchangeable random variables be 5 = (1 一 α)I + α11\ . You are given that the inverse of the matrix 5 is

5-1 = 1 1~~一 α~~ [I 一 k11\],

where k = α/[1 + (n 一 1)α]. Consider the simple linear regression model Yi = β0 + β1 xi + ∈i , i = 1, 2, .., n

where the ∈i are assumed to be zero mean, exchangeable random variables. Show that the WLS estimate for the slope is just the simple ordinary least squares estimate. [Hint: use the Bartlett’s Identity: (A + uu\ )-1 = A-1 一 (A-1 u)(u\ A-1 )/(1 + u\ A-1 u) to ﬁnd the inverse.]

4. Count model.

(a) For the Poisson variables with link function µi(2) = ηi = β1 xi1 + ( ( ( + βp xip , prove that

(yi 一 i ) = 0.

i=1

Hence prove that the deviance is

2 yi ln(yi /i ). i=1

(b) Suppose that Y1 and Y2 are independent Poisson random variables such that Y1 | 之(µ)

and Y2 | 之(ρµ). Show that

Y1 + Y2 | 之(µ + ρµ)

Y1 μ Y1 + Y2 = m | 会(m, )

Show how you might use this result to test the composite null hypothesis H0 : ρ = 1 against the one-sided alternative H1 : ρ > 1.

5. Estimation. Let Y be a random variable with mean µ and variance given by Var(Y) = µ(µ + α)/α

for some α 达 0.

(a) Show that the quasi-likelihood for this case is

Q(µ, y) = y ln 一 (y + α) ln

(b) Hence show that the maximum quasi-likelihood estimator for µ based on a sample of

n independent observations is the sample mean.

6. Logit model. Consider the following data collected by Erickson (1987) as part of a study on the measurement of anaesthetic depth. The potency of an anaesthetic agent is measured in terms of the minimum alveolar concentration (MAC) of the agent at which 50% of patients exhibit no response to stimulation (i.e. do not move - moving means jerking or twisting, not twitching or grimacing - in response to a surgical incision). Thirty patients were adminstered an anaestheic agent which was maintained at a predetermined alveolar concentration (actu- ally, anaesthetists refer to concentration when they mean partial pressure - hence alveolar concentration is measured as a percentage of one atmosphere) for 15 minutes before a singel incision was made in each patient. For each patient, the alvecolar concentration (AC) of the anaesthetic agent and the patient’s response to incision (1 for ‘no move’ and 0 for ‘move’) was recorded.

Patient i	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15
Resp. yi	1	0	1	0	0	1	1	0	1	0	0	1	1	1	1
AC xi	1.0	1.2	1.4	1.4	1.2	2.5	1.6	0.8	1.6	1.4	0.8	1.6	2.5	1.4	1.6

Patient i	16	17	18	19	20	21	22	23	24	25	26	27	28	29	30
Resp. yi	1	1	0	1	1	0	0	0	0	0	1	0	1	0	1
AC xi	1.4	1.4	0.8	0.8	1.2	0.8	0.8	1.0	0.8	1.0	1.2	1.0	1.2	1.0	1.2

(a) Fit a logistic regression model to this data and assess its ﬁt, i.e. ﬁt ln[πi /(1 一 π)] = β0 + β1 xi where xi represents the alveolar concentration and πi is the probability of movement in the ith patient.

(b) Use the model to estimate the MAC value.

Var[h()] ← h\ ()T Cov()h\ ()

7. Survival analysis . Dellaportas and Smith (1993) analysed data from Grieve (1987) on photocarcinogenicity in four groups, each containing 20 mice, who have recorded a survival time and whether they died or were censored at that time. A portion of the data, giving survival times in weeks for two groups, Irradiated Control (IC) and Vehicle Control (VC) are shown below. A * indicates censoring.

IC 12 1 21 25 11 26 27 30 13 12 21 20 23 25 23 29 35 40* 31 36

VC 32 27 23 12 18 40* 40* 38 29 30 40* 32 40* 40* 40* 40* 25 30 37 27 A survival model with Weibull distribution for the lifetime T is ﬁtted as below.

> t1=c(12,1,21,25,11,26,27,30,13,12,21,20,23,25,23,29,35,40,31,36) > t2=c(32,27,23,12,18,40,40,38,29,30,40,32,40,40,40,40,25,30,37,27) > w1=c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1)

> w2=c(1,1,1,1,1,0,0,1,1,1,0,1,0,0,0,0,1,1,1,1)

> t=c(t1,t2)

> w=c(w1,w2)

> c=factor(c(rep(0,20),rep(1,20)))

> l=log(t)

> omega=sum(w)

> alpha=1

> c=as .numeric(c)

> iter=15

> result=matrix(0,iter,3)

> for (i in 1:iter) {

+ par=glm(w~offset(alpha*l)+c,family=poisson)$coeff

+ beta0=par[1]

+ beta1=par[2]

+ result[i,]=c(alpha,par[1],par[2])

+ mu=t^alpha*exp(beta0+c*beta1)

+ sum1=sum((mu-w)*log(t))

+ alpha=0 .5*(alpha+omega/sum1)

+ }

> colnames(result)=c("alpha","beta0","beta1")

> result

alpha beta0 beta1

[1,] 1 .000000 -2 .481399 -0 .7075598

[2,] 2 .326261 -6 .562449 -1 .0109130

[3,] 2 .510650 -7 .146599 -1 .0475881

[4,] 2 .567484 -7 .327335 -1 .0586655

[5,] 2 .585766 -7 .385543 -1 .0622067

[6,] 2 .591718 -7 .404499 -1 .0633572

[7,] 2 .593663 -7 .410694 -1 .0637329

[8,] 2 .594300 -7 .412721 -1 .0638558

[9,] 2 .594508 -7 .413385 -1 .0638960

[10,] 2 .594576 -7 .413602 -1 .0639092

[11,] 2 .594598 -7 .413673 -1 .0639135

[12,] 2 .594606 -7 .413696 -1 .0639149

[13,] 2 .594608 -7 .413704 -1 .0639153

[14,] 2 .594609 -7 .413706 -1 .0639155

[15,] 2 .594609 -7 .413707 -1 .0639155

(a) State the shape parameter α and the scale parameter λ for the Weibull distribution.

(b) Write down the hazard function h(t) and survival function S(t) for the IC group and

compare them with the VC group. Estimate the hazard and survival probability for the IC group when t = 20.

8. Survival analysis . Let ti , i = 1, . . . , n be a set of censoring or survival times and ωi the corresponding non-censoring indicators. Consider a composite hazard function h(t) = a + b(t 一 τ)2 I(t), that is,

h(t) = 一达(t) τ,

where a > 0 and b > 0 are unknown parameters, I(t) = I(t 达 τ) and τ is ﬁxed.

(a) Show that the cumulative hazard function is given by

Hence ﬁnd the survival function S(t) and the pdf f(t).

(b) Deﬁne the sum of squares as

SS(a, b) =z [H0 (ti ) 一 H(ti )]2

i=1

where the observed cumulative hazard H0 (ti ) = 一 ln S0 (ti ) is based on

S0 (ti ) = j:tj╱ 1 一、

when there are dj failures among the rj subjects at risk at time tj .

If we write SS(a, b) = i (yi 一 az1i 一 bz2i)2 for a suitable linear model yi = az1i + bz2i , identify yi , z1i and z2i in terms of H0 (ti ), observations and parameters.

(d) A sample of n = 49 Kelvar Epoxy Spherical pressure vessels were subjected to constant sustained pressure at the 70% stress level until all had failed, so that the complete data with exact failure time (in thousands hours) are given as below:

1.051	1.337	1.389	1.921	1.942	2.322	3.629
4.006	4.012	4.063	4.921	5.445	5.620	5.817
5.905	5.956	6.068	6.121	6.473	7.501	7.886
8.108	8.546	8.666	8.831	9.106	9.711	9.806
10.205	10.396	1<