RClass-ExploringLinkFunctions
Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit
RClass-ExploringLinkFunctions
2022
Consider again the O-rings data example, recording the number of damaged O-rings in the rocket booster of a space shuttle, and temperature at launch.
library(faraway)
data(orings)
attach(orings)
head(orings)
## ## 1 ## 2 ## 3 ## 4 ## 5 ## 6 |
temp 53 57 58 63 66 67 |
damage 5 1 1 1 0 0 |
The response damage/6 is a binomial proportion, so we can fit a binomial GLM.
In this module, we will try out different link functions and explore their effect.
First, try the logit (canonical) link.
logitmod <- glm(cbind(damage,6-damage)~temp,family=binomial)
summary (logitmod)
##
## Call:
## glm(formula = cbind(damage, 6 - damage) ~ temp, family = binomial)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.9529 -0.7345 -0.4393 -0.2079 1.9565
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 11.66299 3.29626 3.538 0.000403 ***
## temp -0.21623 0.05318 -4.066 4.78e-05 ***
## ---
## Signif. codes: 0 '*** ' 0.001 '** ' 0.01 '* ' 0.05 ' . ' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 38.898 on 22 degrees of freedom
## Residual deviance: 16.912 on 21 degrees of freedom
## AIC: 33.675
##
## Number of Fisher Scoring iterations: 6
We can also try the probit and complementary log-log links.
probitmod <- glm(cbind(damage,6-damage)~temp,family=binomial(link=probit)) loglogmod <- glm(cbind(damage,6-damage)~temp,family=binomial(link=cloglog)) summary (logitmod)
##
## Call:
## glm(formula = cbind(damage, 6 - damage) ~ temp, family = binomial)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.9529 -0.7345 -0.4393 -0.2079 1.9565
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 11.66299 3.29626 3.538 0.000403 ***
## temp -0.21623 0.05318 -4.066 4.78e-05 ***
## ---
## Signif. codes: 0 '*** ' 0.001 '** ' 0.01 '* ' 0.05 ' . ' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 38.898 on 22 degrees of freedom
## Residual deviance: 16.912 on 21 degrees of freedom
## AIC: 33.675
##
## Number of Fisher Scoring iterations: 6
summary (probitmod)
##
## Call:
## glm(formula = cbind(damage, 6 - damage) ~ temp, family = binomial(link = probit)) ##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.0134 -0.7761 -0.4467 -0.1581 1.9983
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 5.59145 1.71055 3.269 0.00108 **
## temp -0.10580 0.02656 -3.984 6.79e-05 ***
## ---
## Signif. codes: 0 '*** ' 0.001 '** ' 0.01 '* ' 0.05 ' . ' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 38.898 on 22 degrees of freedom
## Residual deviance: 18.131 on 21 degrees of freedom
## AIC: 34.893
##
## Number of Fisher Scoring iterations: 6
summary (loglogmod)
##
## Call:
## glm(formula = cbind(damage, 6 - damage) ~ temp, family = binomial(link = cloglog)) ##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.9884 -0.7262 -0.4373 -0.2141 1.9520
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 10.86388 2.73668 3.970 7.20e-05 ***
## temp -0.20552 0.04561 -4.506 6.59e-06 ***
## ---
## Signif. codes: 0 '*** ' 0.001 '** ' 0.01 '* ' 0.05 ' . ' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 38.898 on 22 degrees of freedom
## Residual deviance: 16.029 on 21 degrees of freedom
## AIC: 32.791
##
## Number of Fisher Scoring iterations: 7
Which of these three GLMs is most appropriate? Let’s plot the estimated means as a function of temperature. summary (temp)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 53.00 67.00 70.00 69.57 75.00 81.00
x <- seq (from=25 ,to=85 ,by=1)
plot(damage/6~temp,xlim=c (25 ,85),ylim=c (0 ,1),xlab="Temperature" ,ylab="Prob of damage") coef(logitmod)
## (Intercept) temp
## 11.6629897 -0.2162337
lines(x,ilogit(coef(logitmod)[1]+coef(logitmod)[2]*x),col="red") lines(x,pnorm (coef(probitmod)[1]+coef(probitmod)[2]*x),col="darkgreen") lines(x,1-exp (-exp (coef(loglogmod)[1]+coef(loglogmod)[2]*x)),col="blue")
30 40 50 60 70 80
Temperature
We can also compare the predictions of what happens at 31 degrees F (this was the temperature at launch of the Challenger)
First, calculate the expected probabilities of failure.
ilogit(coef(logitmod)[1]+coef(logitmod)[2]*31)
## (Intercept)
## 0.9930342
pnorm (coef(probitmod)[1]+coef(probitmod)[2]*31)
## (Intercept)
## 0.9895983
1-exp (-exp (coef(loglogmod)[1]+coef(loglogmod)[2]*31))
## (Intercept)
## 1
You can also look at odds (ratios of probability of failure vs. probability of no failure) ilogit(coef(logitmod)[1]+coef(logitmod)[2]*31)/(1-ilogit(coef(logitmod)[1]+coef(logitmod)[2]*31))
## (Intercept)
## 142.5576
pnorm (coef(probitmod)[1]+coef(probitmod)[2]*31)/(1-pnorm (coef(probitmod)[1]+coef(probitmod)[2]*31))
## (Intercept)
## 95.13813
(1-exp (-exp (coef(loglogmod)[1]+coef(loglogmod)[2]*31)))/(exp (-exp (coef(loglogmod)[1]+coef(loglogmod)[2]*
## (Intercept)
## 6.402777e+38
And you can compare the models using AIC
logitmod$aic
## [1] 33.67479
probitmod$aic
## [1] 34.89308
loglogmod$aic
## [1] 32.79108
Based on the AIC, the preferred model is the one using the complementary log-log link.
Temperature has a negative effect, meaning that the lower the temperature, the more likely it is that the O-ring will become damaged.
2022-02-23