MTH6134 / MTH6134P: Statistical Modelling II Main Examination period 2021
Published: 2023-12-29
Main Examination period 2021 – January – Semester A
MTH6134/MTH6134P: Statistical Modelling II
Question 1 [23 marks]. Suppose that $Y_i \sim N(\mu_i, \sigma^2)$ for $i = 1, 2, \ldots, n$, all independent, where $\mu_i = \beta_1 x_i + \beta_2 z_i$, $x_i$ and $z_i$ are known covariates, and $\sigma$ is known.
(a) Write down the likelihood for the data $y_1, \ldots, y_n$. [6]
(b) Find the maximum likelihood estimators $\hat\beta_1$ and $\hat\beta_2$ of $\beta_1$ and $\beta_2$. [12]
(c) Prove that $\hat\beta_1$ is an unbiased estimator of $\beta_1$. [4]
(d) Explain why $\hat\beta_1$ has a normal distribution. [1]
Question 2 [20 marks]. The numbers of babies surviving to discharge from a hospital (y) out of the number admitted to neonatal intensive care (r) for two epochs (w) and three gestational ages (x), in weeks, were recorded. Below are the data.
x |  23 |  23 |  24 |  24 |  25 |  25
w |   1 |   2 |   1 |   2 |   1 |   2
r |  81 |  65 | 165 | 198 | 229 | 225
y |  15 |  12 |  40 |  82 | 119 | 142
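The observed survival proportions y/r in the table above can be tabulated by epoch before any model is fitted. The following is a minimal Python sketch; the variable names are illustrative and not part of the exam paper.

```python
# Data from the table above: gestational age x (weeks), epoch w,
# number admitted r and number surviving to discharge y.
x = [23, 23, 24, 24, 25, 25]
w = [1, 2, 1, 2, 1, 2]
r = [81, 65, 165, 198, 229, 225]
y = [15, 12, 40, 82, 119, 142]

# Observed proportion surviving, grouped by epoch (illustrative structure:
# proportions[epoch][gestational age] = y / r).
proportions = {1: {}, 2: {}}
for xi, wi, ri, yi in zip(x, w, r, y):
    proportions[wi][xi] = yi / ri

for epoch in sorted(proportions):
    for age in sorted(proportions[epoch]):
        print(f"epoch {epoch}, x = {age}: {proportions[epoch][age]:.3f}")
```

Plotting these values against gestational age, with one line per epoch, gives the kind of display that part (a) of this question asks for.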
Let $Y_{jk}$ denote the number of babies surviving to discharge out of the $r_{jk}$ of gestational age $x_k$ admitted to neonatal intensive care for epoch $j$. Then it is assumed that $Y_{jk} \sim \mathrm{Bin}(r_{jk}, \pi_{jk})$ for $j = 1, 2$ and $k = 1, 2, 3$, all independent, where $\log\{\pi_{jk}/(1 - \pi_{jk})\} = a_j + \beta_j x_k$. This model was fitted to the data using R and the following output was obtained:
Call:
glm(formula = p ~ w + w:x, family = binomial, weights = r)
Deviance Residuals:
1 2 3 4 5 6
1.1557 -0.3945 -1.3118 0.3665 0.4957 -0.1753
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -22.9574     3.6704  -6.255 3.98e-10 ***
w2           -0.5081     5.1116  -0.099    0.921
w1:x          0.9188     0.1499   6.128 8.88e-10 ***
w2:x          0.9611     0.1459   6.587 4.47e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 109.1191 on 5 degrees of freedom
Residual deviance: 3.6228 on 2 degrees of freedom
AIC: 42.76
Number of Fisher Scoring iterations: 4
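One way to read the coefficient table, assuming that w was entered as a factor with R's default treatment contrasts: the intercept belongs to epoch 1, w2 is the difference between the epoch 2 and epoch 1 intercepts, and w1:x and w2:x are the epoch-specific slopes. The Python sketch below turns the estimates into fitted survival probabilities under that assumption; the function names are hypothetical.

```python
import math

# Estimates copied from the R output above.
intercept = -22.9574             # (Intercept): intercept for epoch 1
w2_shift = -0.5081               # w2: intercept shift for epoch 2 (assumes treatment contrasts)
slope = {1: 0.9188, 2: 0.9611}   # w1:x and w2:x

def linear_predictor(epoch, x):
    """Fitted logit for a given epoch and gestational age x (weeks)."""
    a = intercept + (w2_shift if epoch == 2 else 0.0)
    return a + slope[epoch] * x

def fitted_probability(epoch, x):
    """Inverse logit of the linear predictor."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(epoch, x)))

for epoch in (1, 2):
    for x in (23, 24, 25):
        print(epoch, x, round(fitted_probability(epoch, x), 4))
```

Comparing these fitted probabilities with the observed proportions y/r is an informal check that sits alongside the residual deviance reported in the output.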
(a) Plot the proportions of babies surviving to discharge against gestational age by epoch. What are your conclusions? [6]
(b) Write down the fitted logistic regression model for each epoch. [6]
(c) Use the above output to assess the goodness of fit of the model. [4]
(d) Give an approximate 95% confidence interval for $\beta_1 - \beta_2$. [4]
Question 3 [22 marks]. Suppose that $Y_i \sim \mathrm{Bin}(r_i, \pi_i)$ for $i = 1, 2, \ldots, n$, all independent, where the $r_i$ are known, $\log\{-\log(1 - \pi_i)\} = \beta_0 + \beta_1 x_i$ and $x_i$ is a known covariate.
(a) Explain why this is a generalised linear model. [4]
(b) Find the Fisher information matrix. [8]
(c) Obtain the asymptotic distribution of the maximum likelihood estimator $\hat\beta_1$ of $\beta_1$. [8]
(d) State the form of an approximate test for testing $H_0: \beta_1 = 0$ against $H_1: \beta_1 \neq 0$. [2]
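The link in this question, log{-log(1 - π)}, is commonly called the complementary log-log link. A minimal Python sketch of the link and its inverse follows; the function names are illustrative, and the coefficient values in the usage lines are made up purely for demonstration.

```python
import math

def cloglog(p):
    """Complementary log-log link: log(-log(1 - p)) for 0 < p < 1."""
    return math.log(-math.log(1.0 - p))

def inv_cloglog(eta):
    """Inverse link: p = 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))

# Hypothetical coefficients, chosen only to illustrate the mapping
# from the linear predictor beta0 + beta1 * x to a probability.
beta0, beta1 = -1.0, 0.5
for x in (0.0, 1.0, 2.0):
    eta = beta0 + beta1 * x
    print(x, inv_cloglog(eta))
```

The inverse link maps any real linear predictor into the interval (0, 1), which is what allows the probability to be modelled as a function of a linear predictor in the covariate.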
Question 4 [23 marks]. A study of 49 attending physicians and 71 surgical residents in training at a university hospital was carried out to investigate whether the two groups of surgeons were applying unnecessary blood transfusions at different rates. For each surgeon, the number of blood transfusions prescribed unnecessarily in one year was recorded. The contingency table below summarises the data.
          |      Unnecessary Blood Transfusion       |
Surgeon   | Frequent | Occasionally | Rarely | Never | Total
Attending |        2 |            3 |     31 |    13 |    49
Resident  |       15 |           28 |     23 |     5 |    71
Let $Y_{jk}$ denote the number of surgeons classified in row $j$ and column $k$. Then it is assumed that the $Y_{jk}$ for row $j$ have a multinomial distribution with parameters $y_{j\cdot}$ and $\theta_{jk}$ for $j = 1, 2$ and $k = 1, 2, 3, 4$, and that the rows are independent, where $y_{j\cdot} = \sum_{k=1}^{4} y_{jk}$ and $\theta_{jk}$ is the probability that a surgeon is classified in row $j$ and column $k$. The null hypothesis is that the distributions of unnecessary blood transfusions are the same for the two groups of surgeons.
(a) Briefly explain how you would enter these data into R. What commands would you use to fit a log-linear model to the data? [4]
(b) Explain why, under the null hypothesis, the expected frequency for cell $(j, k)$ is $e_{jk} = y_{j\cdot} y_{\cdot k}/n$, where $n = 120$. [4]
(c) Obtain the expected values under the null hypothesis. Compare these with the observed values. [5]
(d) Find the deviance and the value of Pearson’s goodness-of-fit test statistic. What is your conclusion about the numbers of unnecessary blood transfusions for the two groups of surgeons? [10]
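The quantities in parts (b) to (d) can be sketched numerically as follows. This is an illustrative Python sketch rather than the R workflow the paper has in mind: it applies the expected-frequency formula from part (b) and assumes the usual forms of Pearson's statistic, the sum of (y - e)^2 / e, and the deviance, 2 times the sum of y log(y / e). All variable names are hypothetical.

```python
import math

# Observed counts from the contingency table, in the column order
# Frequent, Occasionally, Rarely, Never.
observed = {
    "Attending": [2, 3, 31, 13],
    "Resident": [15, 28, 23, 5],
}

row_totals = {group: sum(counts) for group, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
n = sum(row_totals.values())

# Expected frequencies under the null hypothesis: e_jk = y_j. * y_.k / n.
expected = {
    group: [row_totals[group] * ck / n for ck in col_totals]
    for group in observed
}

# Pearson's statistic and the deviance (assumed standard forms; every
# observed count here is positive, so the logarithm is well defined).
pearson = sum(
    (o - e) ** 2 / e
    for group in observed
    for o, e in zip(observed[group], expected[group])
)
deviance = 2 * sum(
    o * math.log(o / e)
    for group in observed
    for o, e in zip(observed[group], expected[group])
)

print(expected)
print(pearson, deviance)
```

Both statistics would usually be compared with a chi-squared distribution on (2 - 1)(4 - 1) = 3 degrees of freedom.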
Question 5 [12 marks]. Suppose that $T_i \sim \mathrm{Exp}(\lambda_i)$ for $i = 1, 2, \ldots, n$, all independent, where $\lambda_i = \beta x_i$ and $x_i$ is a known covariate.
(a) Explain what link is being used here. [1]
(b) Write down the likelihood for the data $(t_i, \delta_i)$ for $i = 1, 2, \ldots, n$, where $\delta_i$ is a censoring variable. [4]
(c) Show that the maximum likelihood estimator of $\beta$ is $\hat\beta = \sum_{i=1}^{n} \delta_i \big/ \sum_{i=1}^{n} x_i t_i$. [5]
(d) Now assume that there is no censoring. Given that the vectors t and x in R contain the times and the covariate values, what commands would you use to obtain the details of the fitted model? [2]