Generalised Linear Models - Revision exercise 2021
Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit
MATHEMATICAL STATISTICS
Semester 2
Generalised Linear Models - Revision exercise
2021
1. Exponential family. The random variable X has an inverse Gaussian distribution if it
has density
f (x) = (2πx3 σ 2 )- exp ┌ 一 ┐ , x > 0
where σ 2 > 0 and µ > 0 are unknown parameters.
(a) Show that the density function can be written as
f (x) = exp ┌ xθ一a(φ(b)(θ) + c(φ, x)┐ .
Identify θ , b(θ), a(φ) and c(φ, x).
(b) Calculate E(X) and Var(X).
(c) Derive an expression for the deviance of the inverse Gaussian distribution. (d) Derive an expression for the quasi-log likelihood function.
(e) What is the canonical link funciton for the inverse Gaussian distribution?
2. Linear model. In determining which of 4 factors A, B, C, or D might affect the underpuffing of cereal in a plant, the following design was implemented. The response scores are recorded on the array.
A1 A2
B1 B2 B1 B2
C1 D1 96.6 125.7
D2 34.5 22.4
C2 D1 14.1 9.5
D2 28.8 52.5
(a) By running all 6 models with 1 interaction term (e.g. AB) using glm or lm, determine
which interaction term(s) are aliased in this design.
(b) Excluding aliased interaction terms in (a), run further models with two interaction
terms. State an appropriate linear model for this design and report all model-fit mea- sures.
3. Estimation.
(a) Find a function g such that Var(g(X)) is approximately constant when X = Z/n, where Z |
Bin(n, p).
n
(b) Show that hii = p, where hii = zi(T)(X\ X)-1 zi and rank(X) = p.
i=1
(c) Let the correlation matrix for a set of exchangeable random variables be 5 = (1 一 α)I + α11\ . You are given that the inverse of the matrix 5 is
5-1 = 1 1一 α [I 一 k11\],
where k = α/[1 + (n 一 1)α]. Consider the simple linear regression model Yi = β0 + β1 xi + ∈i , i = 1, 2, .., n
where the ∈i are assumed to be zero mean, exchangeable random variables. Show that the WLS estimate for the slope is just the simple ordinary least squares estimate. [Hint: use the Bartlett’s Identity: (A + uu\ )-1 = A-1 一 (A-1 u)(u\ A-1 )/(1 + u\ A-1 u) to find the inverse.]
4. Count model.
(a) For the Poisson variables with link function µi(2) = ηi = β1 xi1 + ( ( ( + βp xip , prove that
n
(yi 一 i ) = 0.
i=1
Hence prove that the deviance is
n
2 yi ln(yi /i ). i=1
(b) Suppose that Y1 and Y2 are independent Poisson random variables such that Y1 | 之(µ)
and Y2 | 之(ρµ). Show that
Y1 + Y2 | 之(µ + ρµ)
Y1 μ Y1 + Y2 = m | 会(m, )
Show how you might use this result to test the composite null hypothesis H0 : ρ = 1 against the one-sided alternative H1 : ρ > 1.
5. Estimation. Let Y be a random variable with mean µ and variance given by Var(Y) = µ(µ + α)/α
for some α 达 0.
(a) Show that the quasi-likelihood for this case is
Q(µ, y) = y ln 一 (y + α) ln
(b) Hence show that the maximum quasi-likelihood estimator for µ based on a sample of
n independent observations is the sample mean.
(c) Obtain the defining equation for the maximum quasi-lielihood estimation for α .
6. Logit model. Consider the following data collected by Erickson (1987) as part of a study on the measurement of anaesthetic depth. The potency of an anaesthetic agent is measured in terms of the minimum alveolar concentration (MAC) of the agent at which 50% of patients exhibit no response to stimulation (i.e. do not move - moving means jerking or twisting, not twitching or grimacing - in response to a surgical incision). Thirty patients were adminstered an anaestheic agent which was maintained at a predetermined alveolar concentration (actu- ally, anaesthetists refer to concentration when they mean partial pressure - hence alveolar concentration is measured as a percentage of one atmosphere) for 15 minutes before a singel incision was made in each patient. For each patient, the alvecolar concentration (AC) of the anaesthetic agent and the patient’s response to incision (1 for ‘no move’ and 0 for ‘move’) was recorded.
Patient i |
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
10 |
11 |
12 |
13 |
14 |
15 |
Resp. yi |
1 |
0 |
1 |
0 |
0 |
1 |
1 |
0 |
1 |
0 |
0 |
1 |
1 |
1 |
1 |
AC xi |
1.0 |
1.2 |
1.4 |
1.4 |
1.2 |
2.5 |
1.6 |
0.8 |
1.6 |
1.4 |
0.8 |
1.6 |
2.5 |
1.4 |
1.6 |
Patient i |
16 |
17 |
18 |
19 |
20 |
21 |
22 |
23 |
24 |
25 |
26 |
27 |
28 |
29 |
30 |
Resp. yi |
1 |
1 |
0 |
1 |
1 |
0 |
0 |
0 |
0 |
0 |
1 |
0 |
1 |
0 |
1 |
AC xi |
1.4 |
1.4 |
0.8 |
0.8 |
1.2 |
0.8 |
0.8 |
1.0 |
0.8 |
1.0 |
1.2 |
1.0 |
1.2 |
1.0 |
1.2 |
(a) Fit a logistic regression model to this data and assess its fit, i.e. fit ln[πi /(1 一 π)] = β0 + β1 xi where xi represents the alveolar concentration and πi is the probability of movement in the ith patient.
(b) Use the model to estimate the MAC value.
(c) Express your MAC estimator as a function of β0 and β1 , h(3). Recalling
Var[h()] ← h\ ()T Cov()h\ ()
7. Survival analysis . Dellaportas and Smith (1993) analysed data from Grieve (1987) on photocarcinogenicity in four groups, each containing 20 mice, who have recorded a survival time and whether they died or were censored at that time. A portion of the data, giving survival times in weeks for two groups, Irradiated Control (IC) and Vehicle Control (VC) are shown below. A * indicates censoring.
IC 12 1 21 25 11 26 27 30 13 12 21 20 23 25 23 29 35 40* 31 36
VC 32 27 23 12 18 40* 40* 38 29 30 40* 32 40* 40* 40* 40* 25 30 37 27 A survival model with Weibull distribution for the lifetime T is fitted as below.
> t1=c(12,1,21,25,11,26,27,30,13,12,21,20,23,25,23,29,35,40,31,36) > t2=c(32,27,23,12,18,40,40,38,29,30,40,32,40,40,40,40,25,30,37,27) > w1=c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1)
> w2=c(1,1,1,1,1,0,0,1,1,1,0,1,0,0,0,0,1,1,1,1)
> t=c(t1,t2)
> w=c(w1,w2)
> c=factor(c(rep(0,20),rep(1,20)))
> l=log(t)
> omega=sum(w)
> alpha=1
> c=as .numeric(c)
> iter=15
> result=matrix(0,iter,3)
> for (i in 1:iter) {
+ par=glm(w~offset(alpha*l)+c,family=poisson)$coeff
+ beta0=par[1]
+ beta1=par[2]
+ result[i,]=c(alpha,par[1],par[2])
+ mu=t^alpha*exp(beta0+c*beta1)
+ sum1=sum((mu-w)*log(t))
+ alpha=0 .5*(alpha+omega/sum1)
+ }
> colnames(result)=c("alpha","beta0","beta1")
> result
alpha beta0 beta1
[1,] 1 .000000 -2 .481399 -0 .7075598
[2,] 2 .326261 -6 .562449 -1 .0109130
[3,] 2 .510650 -7 .146599 -1 .0475881
[4,] 2 .567484 -7 .327335 -1 .0586655
[5,] 2 .585766 -7 .385543 -1 .0622067
[6,] 2 .591718 -7 .404499 -1 .0633572
[7,] 2 .593663 -7 .410694 -1 .0637329
[8,] 2 .594300 -7 .412721 -1 .0638558
[9,] 2 .594508 -7 .413385 -1 .0638960
[10,] 2 .594576 -7 .413602 -1 .0639092
[11,] 2 .594598 -7 .413673 -1 .0639135
[12,] 2 .594606 -7 .413696 -1 .0639149
[13,] 2 .594608 -7 .413704 -1 .0639153
[14,] 2 .594609 -7 .413706 -1 .0639155
[15,] 2 .594609 -7 .413707 -1 .0639155
(a) State the shape parameter α and the scale parameter λ for the Weibull distribution.
(b) Write down the hazard function h(t) and survival function S(t) for the IC group and
compare them with the VC group. Estimate the hazard and survival probability for the IC group when t = 20.
8. Survival analysis . Let ti , i = 1, . . . , n be a set of censoring or survival times and ωi the corresponding non-censoring indicators. Consider a composite hazard function h(t) = a + b(t 一 τ)2 I(t), that is,
h(t) = 一 达(t) τ,
where a > 0 and b > 0 are unknown parameters, I(t) = I(t 达 τ) and τ is fixed.
(a) Show that the cumulative hazard function is given by
b
3
Hence find the survival function S(t) and the pdf f(t).
(b) Define the sum of squares as
n
SS(a, b) =z [H0 (ti ) 一 H(ti )]2
i=1
where the observed cumulative hazard H0 (ti ) = 一 ln S0 (ti ) is based on
S0 (ti ) = j:tj
when there are dj failures among the rj subjects at risk at time tj .
If we write SS(a, b) = i (yi 一 az1i 一 bz2i)2 for a suitable linear model yi = az1i + bz2i , identify yi , z1i and z2i in terms of H0 (ti ), observations and parameters.
(c) Find the least square estimates of a and b and describe how their standard errors can be obtained.
(d) A sample of n = 49 Kelvar Epoxy Spherical pressure vessels were subjected to constant sustained pressure at the 70% stress level until all had failed, so that the complete data with exact failure time (in thousands hours) are given as below:
1.051 |
1.337 |
1.389 |
1.921 |
1.942 |
2.322 |
3.629 |
4.006 |
4.012 |
4.063 |
4.921 |
5.445 |
5.620 |
5.817 |
5.905 |
5.956 |
6.068 |
6.121 |
6.473 |
7.501 |
7.886 |
8.108 |
8.546 |
8.666 |
8.831 |
9.106 |
9.711 |
9.806 |
10.205 |
10.396 |
1< |
2022-11-07