Midterm Exam VERSION A
MATH 523: Generalized Linear Models
February 25, 2020
Instructions
• This is a closed book exam.
• Answer both questions in the examination booklets provided.
• Calculators and translation dictionaries are permitted.
• R output and statistical tables are provided.
Problem 1. Consider the Geometric family of distributions with parameter $\pi \in (0, 1)$ and
probability mass function
$f(y; \pi) = \pi^{y} (1 - \pi)$, $y \in \{0, 1, \ldots\}$.
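As a quick numerical sanity check on the probability mass function above, the following Python sketch evaluates it on a truncated support and accumulates the total mass, mean and variance. The function names, the truncation point `y_max` and the value 0.3 for the parameter are illustrative assumptions, not part of the exam.

```python
def geometric_pmf(y: int, pi: float) -> float:
    """Evaluate f(y; pi) = pi**y * (1 - pi) for y = 0, 1, 2, ..."""
    if not 0.0 < pi < 1.0:
        raise ValueError("pi must lie strictly between 0 and 1")
    if y < 0:
        return 0.0
    return pi ** y * (1.0 - pi)


def truncated_moments(pi: float, y_max: int = 2000):
    """Approximate total mass, mean and variance by summing over y = 0..y_max.

    The truncation point y_max is an illustrative choice; the neglected tail
    is negligible unless pi is very close to 1.
    """
    total = mean = second = 0.0
    for y in range(y_max + 1):
        p = geometric_pmf(y, pi)
        total += p
        mean += y * p
        second += y * y * p
    return total, mean, second - mean ** 2


# Hypothetical parameter value, chosen only for illustration.
total, mean, var = truncated_moments(0.3)
print(f"mass={total:.6f} mean={mean:.6f} var={var:.6f}")
```

A check of this kind does not replace the analytic derivation asked for below, but it can confirm a closed-form answer for a particular parameter value.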
(a) [4 marks] Show that the Geometric family is an exponential dispersion family.
Identify the functions $b(\cdot)$ and $c(\cdot)$, as well as the dispersion and canonical
parameters.
(b) [3 marks] Compute the mean and variance of the Geometric distribution and
identify the mean-variance relationship.
(c) [2 marks] Identify the canonical link for a Geometric GLM. Comment on its
suitability.
(d) [2 marks] What other link functions might be suitable and why?
(e) [1 mark] For what kind of data would a Geometric GLM be better suited than a
Poisson GLM (Hint: look at the mean-variance relationship)?
(f) [4 marks] Derive the likelihood (score) equations for the Geometric GLM when the
log link is used. Explain how the equations simplify when the canonical link is
used.
(g) [4 marks] Calculate the Fisher Information Matrix for the Geometric GLM when
the log link is used.
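For reference when reading parts (a), (f) and (g), the general forms involved can be written as follows. This is a sketch under an assumed notation: the symbols $\theta$, $\phi$, $a(\cdot)$, the link $g$, the means $\mu_i$, the covariates $x_{ij}$ and the coefficients $\beta_j$ are not defined in the exam itself and follow one common textbook convention, so the course may use a slightly different parameterisation.

```latex
% Exponential dispersion family (assumed parameterisation):
f(y; \theta, \phi) = \exp\left\{ \frac{y\theta - b(\theta)}{a(\phi)} + c(y, \phi) \right\},
\qquad
\mathrm{E}(Y) = b'(\theta), \quad \mathrm{Var}(Y) = a(\phi)\, b''(\theta).

% Generic score equations for a GLM with link g(\mu_i) = \sum_{j} x_{ij} \beta_j:
\sum_{i=1}^{n} \frac{(y_i - \mu_i)\, x_{ij}}{\mathrm{Var}(Y_i)\, g'(\mu_i)} = 0,
\qquad j = 1, \ldots, p.
```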
Problem 2. Consider the following data on ear infections in swimmers from the 1990
Pilot Surf/Health Study of the New South Wales Water Board.
NumInfec  the number of self-diagnosed ear infections;
Age       the age of the swimmer (with levels 15-19, 20-24 and 25-29);
Sex       the gender of the swimmer (with levels Female and Male);
Loc       the usual swimming location (with levels Beach and NonBeach);
Swim      the frequency of swims in the ocean (with levels Freq(uently) and Occas(ionally)).
(a) [5 marks] The data were first modeled with a GLM m1 whose output is given
on page 4, lines 1–23. From this output:
– Identify the response and the predictors;
– Identify the GLM that was used and the link function;
– Identify the sample size n;
– For each main effect, write down whether it is treated as a factor or a covariate
(continuous predictor).
(b) [3 marks] In model m1, quantify the effect of Age and Loc on the response.
(c) [3 marks] Fill in the values marked by XXX on line 12. Does the p-value allow you
to conclude that Age is not a significant predictor? Explain.
(d) [4 marks] What is the estimated mean number of self-diagnosed ear infections of a
swimmer aged 22 who prefers to swim far from the beach?
(e) [5 marks] A simpler model m2 whose output is given on lines 26–46 has been
fitted to the data. Test whether m2 is an adequate simplification of m1 at the 5%
significance level. Interpret the finding in terms of significance of Age and Loc.
Use the R output on page 4, and the $\chi^2_\nu$ table on page 5.
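The comparison in part (e) follows the usual deviance (likelihood ratio) test for nested GLMs. Below is a minimal Python sketch, assuming a family with known dispersion so that the difference in residual deviances is approximately chi-square distributed. The deviances and degrees of freedom are hypothetical placeholders, not values from the R output, and the name `deviance_test` is likewise illustrative.

```python
# Upper 5% points of the chi-square distribution for small degrees of freedom
# (standard table values).
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def deviance_test(dev_small, df_small, dev_big, df_big):
    """Compare two nested GLMs through their residual deviances.

    dev_small, df_small: residual deviance and residual df of the simpler model.
    dev_big, df_big:     residual deviance and residual df of the larger model.
    Returns the test statistic, its degrees of freedom, and whether the
    simpler model is rejected at the 5% level.
    """
    stat = dev_small - dev_big   # drop in deviance from adding the extra terms
    df = df_small - df_big       # number of parameters removed in the simpler model
    if df not in CHI2_CRIT_05:
        raise ValueError("no tabulated critical value for this df")
    return stat, df, stat > CHI2_CRIT_05[df]


# Hypothetical numbers, for illustration only.
stat, df, reject = deviance_test(dev_small=120.5, df_small=96,
                                 dev_big=110.2, df_big=93)
print(f"statistic={stat:.2f} on {df} df, reject simpler model: {reject}")
```

If the statistic exceeds the tabulated critical value, the simpler model is rejected, meaning at least one of the dropped terms contributes significantly.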