STATS 330 Mid-Semester Test Semester 1, 2019
Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit
STATS 330
Semester 1, 2019
Statistical Modelling
STATS 330 Mid-Semester Test
Question 1
Consider the following model fitted to these data in R:
pois1.fit <- glm(Snapper ~ Reserve + Season + Year, family = "poisson" ,
data = snapper.df)
summary(pois1.fit)
##
## Call:
## glm(formula = Snapper ~ Reserve + Season + Year, family = "poisson", ## data = snapper.df)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -4.566 -1.925 -0.391 1.193 4.447
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.2369 0.0543 41.21 <2e-16 ***
## ReserveHahei -0.8235 0.0821 -10.03 <2e-16 ***
## ReserveTAWH -0.7181 0.0860 -8.35 <2e-16 ***
## SeasonSpring -0.7968 0.0668 -11.93 <2e-16 ***
## Year98-99 0.1072 0.0619 1.73 0.084 .
## ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## ## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 1048.12 on 203 degrees of freedom
## Residual deviance: 740.46 on 199 degrees of freedom
## AIC: 1304
##
## Number of Fisher Scoring iterations: 5
confint(pois1.fit)
## Waiting for profiling to be done...
## 2.5 % 97.5 %
## (Intercept) 2.129157 2.34196
## ReserveHahei -0.986987 -0.66492
## ReserveTAWH -0.889776 -0.55226
## SeasonSpring -0.928897 -0.66690
## Year98-99 -0.014105 0.22872
(a) Write equations to describe the fitted model pois1.fit.
[5 marks]
(b) Interpret the effect of season based on the model pois1.fit.
[10 marks]
Question 2
Also consider the following model:
pois2.fit <- glm(Snapper ~ Reserve * Season * Year, family = "poisson" , data = snapper.df)
anova(pois1.fit , pois2.fit , test = "Chisq")
## Analysis of Deviance Table
##
## Model 1: Snapper ~ Reserve + Season + Year
## Model 2: Snapper ~ Reserve * Season * Year
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 199 740
## 2 192 702 7 38.1 2.9e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(a) What is the null hypothesis being tested by the anova() function?
[10 marks]
(b) What do you conclude from this hypothesis test?
[5 marks]
Question 3
Consider the following R output:
deviance(pois1.fit)
## [1] 740.46
df.residual(pois1.fit)
## [1] 199
1 - pchisq(deviance(pois1.fit), df.residual(pois1.fit))
## [1] 0
deviance(pois2.fit)
## [1] 702.38
df.residual(pois2.fit)
## [1] 192
1 - pchisq(deviance(pois2.fit), df.residual(pois2.fit))
## [1] 0
(a) Based on the above output, do you think the model pois1.fit is approrpiate? Explain your answer.
[5 marks]
(b) Based on the above output, do you think the model pois2.fit is appropriate? Explain your answer.
[5 marks]
Question 4
Consider the following models. Note that the summary() output has been truncated to save space.
quasipois.fit <- glm(Snapper ~ Reserve + Season + Year, family = "quasipoisson") summary(quasipois.fit)
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -4.5659 -1.9249 -0.3913 1.1928 4.4473
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.23693 0.09779 22.874 < 2e-16 ***
## ReserveHahei -0.82349 0.14793 -5.567 8.32e-08 ***
## ReserveTawh -0.71813 0.15502 -4.633 6.52e-06 ***
## SeasonSpring -0.79681 0.12037 -6.620 3.27e-10 ***
## Year98-99 0.10718 0.11157 0.961 0.338
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ##
## (Dispersion parameter for quasipoisson family taken to be 3.245656) ## ...
nb.fit <- glm.nb(Snapper ~ Reserve + Season + Year)
summary(nb.fit)
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.5791 -1.1752 -0.2334 0.5523 1.9450
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.3240 0.1286 18.072 < 2e-16 ***
## ReserveHahei -0.9775 0.1603 -6.099 1.07e-09 ***
## ReserveTawh -0.7884 0.1693 -4.658 3.20e-06 ***
## SeasonSpring -0.9199 0.1338 -6.874 6.25e-12 ***
## Year98-99 0.1336 0.1327 1.006 0.314
## ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## ## (Dispersion parameter for Negative Binomial(1.5522) family taken to be 1)
## ...
##
##
##
Theta: 1.552
Std. Err.: 0.239
plot(predict(quasipois.fit), residuals(quasipois.fit , type = "deviance"), ylab = "Residual")
abline(h = 0 , lty = "dotted")
1.0
1.5
2.0
predict(quasipois.fit)
plot(predict(nb.fit), residuals(nb.fit , type = "deviance"), ylab = "Residual") abline(h = 0 , lty = "dotted")
0.5 1.0 1.5 2.0 2.5
predict(nb.fit)
plot(predict(nb.fit), qresiduals(nb.fit), ylab = "Residual")
abline(h = 0 , lty = "dotted")
0.5 1.0 1.5 2.0 2.5
predict(nb.fit)
Based on the output above, which model do you prefer out of quasipois.fit and nb.fit? Explain your answer.
[10 marks]
Question 5
The second observation in the data set has the following observed values:
snapper.df[2, ]
## Reserve Season Year Snapper
## 2 Hahei Spring 97-98 4
It has the following fitted expected values (2 ) under models pois1.fit and nb.fit:
fitted(pois1.fit)[2]
## 2
## 1.8527
fitted(nb.fit)[2]
## 2
## 1.5321
(a) What is the estimated variance of the response for this observation under the following
models? Calculate your answers to two decimal places, and show your working.
[15 marks]
(i) Model pois1.fit (from Question 1).
(ii) Model quasipois.fit (from Question 4).
(iii) Model nb.fit (from Question 4).
(b) What is this observation’s Pearson residual under the following models? Calculate your
answers to two decimal placees, and show your working.
[10 marks]
(i) Model pois1.fit (from Question 1).
(ii) Model nb.fit (from Question 4).
Question 6
Many of the models fitted above use the following simple additive equation within the model fitting function:
Snapper ~ Reserve + Season + Year
However, this does not account for any interactions that may exist between the variables.
Describe a strategy to find the model with the set of main effects and interaction effects that are best supported by the data.
[15 marks]
Question 7
A fisheries expert believes that snapper prefer one of the sites over the others in autumn, but prefer a different site in spring. Also, they believe that there were more snapper around for the 98–99 year than the 97–98 year, and that this effect was the same for all seasons and sites.
Which explanatory terms could you include in a model so that it is capable of estimating the effects that the fisheries expert believes to exist?
You can either provide your answer as a description in words or an R equation, for example, in the form
y ~ x * z
[10 marks]
Useful equations
If yi ~ Normal(ui , 72 ), then
E(yi ) = ui ,
Var(yi ) = 72 .
If yi ~ Binomial(ni , pi ), then
E(yi ) = nipi ,
Var(yi ) = nipi (1 _ pi ).
If yi ~ Poisson(ui ), then
E(yi ) = ui ,
Var(yi ) = ui .
If yi ~ Negative Binomial(ui , 9), then
E(yi ) = ui ,
Var(yi ) = ui +
2022-01-21