MATH 523: Generalized Linear Models Midterm Exam VERSION A 2020
Problem 1. Consider the Geometric family of distributions with parameter $\pi \in (0, 1)$ and probability mass function
$$f(y; \pi) = \pi^{y}(1 - \pi), \quad y \in \{0, 1, \ldots\}.$$
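For reference, one common way of writing an exponential dispersion family density is sketched here. The notation is an assumption, since the exam does not state the course's convention: $\theta$ denotes the canonical parameter, $\phi$ the dispersion parameter, and $b(\cdot)$ and $c(\cdot)$ are the two functions that characterize the family.

```latex
\[
  f(y; \theta, \phi) = \exp\left\{ \frac{y\theta - b(\theta)}{\phi} + c(y, \phi) \right\}
\]
```

Under this convention, $\mathrm{E}(Y) = b'(\theta)$ and $\mathrm{Var}(Y) = \phi\, b''(\theta)$.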
(a) [4 marks] Show that the Geometric family is an exponential dispersion family. Identify the functions b(.) and c(.), as well as the dispersion and canonical parameters.
(b) [3 marks] Compute the mean and variance of the Geometric distribution and identify the mean-variance relationship.
(c) [2 marks] Identify the canonical link for a Geometric GLM. Comment on its suitability.
(d) [2 marks] What other link functions might be suitable and why?
(e) [1 mark] For what kind of data would a Geometric GLM be better suited than a Poisson GLM (Hint: look at the mean-variance relationship)?
(f) [4 marks] Derive the likelihood (score) equations for the Geometric GLM when the
log link is used. Explain how the equations simplify when the canonical link is used.
(g) [4 marks] Calculate the Fisher Information Matrix for the Geometric GLM when the log link is used.
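Any closed-form mean and variance derived for Problem 1 can be checked numerically. The following is a minimal Python sketch, not part of the exam: it assumes the probability mass function is $\pi^{y}(1 - \pi)$ on $y = 0, 1, \ldots$, and the function names and the truncation point `y_max` are illustrative choices.

```python
def geometric_pmf(y: int, pi: float) -> float:
    """Assumed pmf f(y; pi) = pi**y * (1 - pi) for y = 0, 1, 2, ..."""
    return pi ** y * (1.0 - pi)


def numeric_moments(pi: float, y_max: int = 2000) -> tuple[float, float, float]:
    """Return (total mass, mean, variance) from a sum truncated at y_max."""
    total = mean = second = 0.0
    for y in range(y_max + 1):
        p = geometric_pmf(y, pi)
        total += p
        mean += y * p
        second += y * y * p
    return total, mean, second - mean ** 2


for pi in (0.2, 0.5, 0.8):
    total, mean, var = numeric_moments(pi)
    # var / mean shows how the variance compares with the mean.
    print(f"pi={pi}: mass={total:.6f}, mean={mean:.6f}, "
          f"var={var:.6f}, var/mean={var / mean:.6f}")
```

The printed ratio of variance to mean gives a quick way to compare the Geometric family with the Poisson family, for which that ratio is always 1.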
Problem 2. Consider the following data on ear infections in swimmers from the 1990 Pilot Surf/Health Study of the New South Wales Water Board.
Variable    Description
NumInfec    number of self-diagnosed ear infections
Age         age of the swimmer (levels 15-19, 20-24 and 25-29)
Sex         gender of the swimmer (levels Female and Male)
Loc         usual swimming location (levels Beach and NonBeach)
Swim        frequency of swims in the ocean (levels Freq(uently) and Occas(ionally))
(a) [5 marks] The data were first modeled with a GLM model m1 whose output is given
on page 4, lines 1–23. From this output:
– Identify the response and the predictors;
– Identify the GLM that was used and the link function;
– Identify the sample size n;
– For each main effect, write down whether it is treated as a factor or a covariate (continuous predictor).
(b) [3 marks] In model m1, quantify the effect of Age and Loc on the response.
(c) [3 marks] Fill in the values marked by XXX on line 12. Does the p-value allow you to conclude that Age is not a significant predictor? Explain.
(d) [4 marks] What is the estimated mean number of self-diagnosed ear infections of a swimmer aged 22 who prefers to swim far from the beach?
(e) [5 marks] A simpler model m2 whose output is given on lines 26–46 has been fitted to the data. Test whether m2 is an adequate simplification of m1 at the 5%
significance level. Interpret the finding in terms of significance of Age and Loc. Use the R output on page 4, and the $\chi^2$ table on page 5.
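The comparison of two nested Poisson GLMs in part (e) is usually based on the difference of their residual deviances. The following Python sketch, which is not part of the exam, shows that arithmetic with the values printed in the R output. It assumes the usual asymptotic chi-squared approximation for the deviance difference, and the variable names are illustrative.

```python
import math

# Residual deviances and degrees of freedom as printed in the R output.
dev_m1, df_m1 = 791.77, 283   # m1: NumInfec ~ Age + Loc
dev_m2, df_m2 = 800.36, 285   # m2: NumInfec ~ Loc

lr_stat = dev_m2 - dev_m1     # increase in deviance when Age is dropped
df_diff = df_m2 - df_m1       # number of parameters removed in m2

# For exactly 2 degrees of freedom the chi-squared survival function is
# exp(-x / 2), so no statistics library is needed in this special case.
assert df_diff == 2
p_value = math.exp(-lr_stat / 2)
critical_5pct = -2 * math.log(0.05)

print(f"statistic = {lr_stat:.2f} on {df_diff} df")
print(f"5% critical value = {critical_5pct:.3f}, p-value = {p_value:.4f}")
print("reject m2 at 5%" if lr_stat > critical_5pct else "do not reject m2 at 5%")
```

The closed-form survival function only holds for 2 degrees of freedom; for other values a table or a statistics library would be needed.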
 1  Call:
 2  glm(formula = NumInfec ~ Age + Loc, family = poisson)
 3
 4  Deviance Residuals:
 5      Min       1Q   Median       3Q      Max
 6  -1.9905  -1.5449  -1.2971   0.6723   7.3326
 7
 8  Coefficients:
 9               Estimate Std. Error z value Pr(>|z|)
10  (Intercept)   0.17675    0.09387   1.883  0.05972 .
11  Age20-24     -0.34968    0.12411  -2.817  0.00484 **
12  Age25-29     -0.17896    0.12982    XXXX     XXXX
13  LocNonBeach   0.50692    0.10430   4.860 1.17e-06 ***
14
15  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
16
17  (Dispersion parameter for poisson family taken to be 1)
18
19      Null deviance: 824.51  on 286  degrees of freedom
20  Residual deviance: 791.77  on 283  degrees of freedom
21  AIC: 1172.2
22
23  Number of Fisher Scoring iterations: 6
24
25
26  Call:
27  glm(formula = NumInfec ~ Loc, family = poisson)
28
29  Deviance Residuals:
30      Min       1Q   Median       3Q      Max
31  -1.8632  -1.4522  -1.4522   0.8182   6.8595
32
33  Coefficients:
34               Estimate Std. Error z value Pr(>|z|)
35  (Intercept)   0.05299    0.08032   0.660    0.509
36  LocNonBeach   0.49843    0.10280   4.849 1.24e-06 ***
37
38  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
39
40  (Dispersion parameter for poisson family taken to be 1)
41
42      Null deviance: 824.51  on 286  degrees of freedom
43  Residual deviance: 800.36  on 285  degrees of freedom
44  AIC: 1176.8
45
46  Number of Fisher Scoring iterations: 6
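The coefficient table of the first model supports a few hand calculations: a Wald statistic, multiplicative effects on the mean, and a fitted mean. The following Python sketch, which is not part of the exam, illustrates them with the printed estimates. It relies on two assumptions that should be checked against the intended reading of the questions: that a swimmer aged 22 falls in the 20-24 level, and that swimming "far from the beach" corresponds to the NonBeach level. The helper name `wald_test` is illustrative.

```python
import math

# Estimates and standard errors copied from the m1 coefficient table.
coef = {
    "(Intercept)": 0.17675,
    "Age20-24": -0.34968,
    "Age25-29": -0.17896,
    "LocNonBeach": 0.50692,
}
std_err = {
    "(Intercept)": 0.09387,
    "Age20-24": 0.12411,
    "Age25-29": 0.12982,
    "LocNonBeach": 0.10430,
}


def wald_test(name: str) -> tuple[float, float]:
    """Return the Wald z statistic and its two-sided normal p-value."""
    z = coef[name] / std_err[name]
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal cdf Phi.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


z_age, p_age = wald_test("Age25-29")
print(f"Age25-29: z = {z_age:.3f}, p = {p_age:.3f}")

# With the log link (the default for family = poisson in R's glm),
# exp(coefficient) is the multiplicative change in the mean relative
# to the baseline level of that factor.
rate_ratio = {k: math.exp(v) for k, v in coef.items() if k != "(Intercept)"}
for name, ratio in rate_ratio.items():
    print(f"{name}: mean multiplied by {ratio:.3f}")

# Assumption: age 22 maps to level 20-24, "far from the beach" to NonBeach.
eta = coef["(Intercept)"] + coef["Age20-24"] + coef["LocNonBeach"]
mu_hat = math.exp(eta)
print(f"fitted mean = {mu_hat:.3f}")
```

Applying `wald_test` to a row whose z value and p-value are printed in full, such as `Age20-24`, is a simple way to confirm that the sketch reproduces the R output up to rounding.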