STAT 3690 Lecture 18
Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit
STAT 3690 Lecture 18
2022
Testing for nested models
● H0 : E(Y | X) = X(0) B(0) (nested model) vs. H1 : E(Y | X) = X(0) B(0) + X(1) B(1) (full model)
- When X(0) has only the column of ones, we are testing the empty model (i.e., only the intercept) against the full model.
- When X(1) only contains one column, we are testing for the significance of that variable.
● Likelihood ratio
λ = L(市)o \−n/2 = │det ┌(ML,市o - ML ) M(−)L(1) + I┐入 −n/2
● Alternatives to likelihood ration
- Suppose η 1 ≥ . . . ≥ ηb are eigenvalues of ( ML,市o - ML ) M(−)L(1)
- Wilks’ lambda: ' (1 + η' ) − 1
- Pillai’s trace: ' {η' (1 + η' ) − 1 }
- Hotelling-Lawley trace: ' η'
- Roy’s largest root: η 1 (1 + η1 ) − 1
- When X(1) has only one column, all four tests are equivalent; as n increases, all four tests give similar results.
options(digits = 4)
tear <- c (
6.5 , 6.2 , 5.8 , 6.5 , 6.5 , 6.9 , 7.2 , 6.9 , 6.1 , 6.3 ,
6.7 , 6.6 , 7.2 , 7.1 , 6.8 , 7.1 , 7.0 , 7.2 , 7.5 , 7.6
)
gloss <- c (
9.5 , 9.9 , 9.6 , 9.6 , 9.2 , 9.1 , 10.0 , 9.9 , 9.5 , 9.4 ,
9.1 , 9.3 , 8.3 , 8.4 , 8.5 , 9.2 , 8.8 , 9.7 , 10.1 , 9.2
)
opacity <- c (
4.4 , 6.4 , 3.0 , 4.1 , 0.8 , 5.7 , 2.0 , 3.9 , 1.9 , 5.7 ,
2.8 , 4.1 , 3.8 , 1.6 , 3.4 , 8.4 , 5.2 , 6.9 , 2.7 , 1.9
)
rate <- factor(gl(2 ,10 ,length=n), labels=c ( "Low" , "High"))
additive <- factor(gl(2 ,5 ,length=nrow (X)), labels=c ( "Low" , "High"))
# Testing the necessity of interaction
fit0 <- lm(cbind(tear, gloss, opacity) ~ rate+additive)
fit1 = lm(cbind(tear, gloss, opacity) ~ rate*additive)
anova (fit1, fit0, test= 'Wilks ')
anova (fit1, fit0, test= 'Pillai ')
anova (fit1, fit0, test= 'Hotelling')
anova (fit1, fit0, test= 'Roy')
Information criteria
● Akaike’s information criterion (AIC)
- ln Likelihood + 2 x number of parameters to estimate
- Number of parameters to estimate in B and Σ: p(q + 1) + p(p + 1)/2
- Smaller is better
● Bayesian information criterion (BIC)
- ln Likelihood + ln n x number of parameters to estimate
● Model selection using information criteria proceeds as follows
- Select models of interest M1 , . . . , M六 . They do not need to be nested.
* Candidate models should be selected using domain-specific expertise, if possible. Or, you can go through all possible models.
- Compute the specific information criterion for each model.
- Select the model with the smallest value of the information criterion.
options(digits = 4)
tear <- c (
6.5 , 6.2 , 5.8 , 6.5 , 6.5 , 6.9 , 7.2 , 6.9 , 6.1 , 6.3 ,
6.7 , 6.6 , 7.2 , 7.1 , 6.8 , 7.1 , 7.0 , 7.2 , 7.5 , 7.6
)
gloss <- c (
9.5 , 9.9 , 9.6 , 9.6 , 9.2 , 9.1 , 10.0 , 9.9 , 9.5 , 9.4 ,
9.1 , 9.3 , 8.3 , 8.4 , 8.5 , 9.2 , 8.8 , 9.7 , 10.1 , 9.2
)
opacity <- c (
4.4 , 6.4 , 3.0 , 4.1 , 0.8 , 5.7 , 2.0 , 3.9 , 1.9 , 5.7 ,
2.8 , 4.1 , 3.8 , 1.6 , 3.4 , 8.4 , 5.2 , 6.9 , 2.7 , 1.9
)
n = length(opacity)
rate <- factor(gl(2 ,10 ,length=n), labels=c ( "Low" , "High"))
additive <- factor(gl(2 ,5 ,length=n), labels=c ( "Low" , "High"))
fit0 <- lm(cbind(tear, gloss, opacity) ~ rate)
logLik(fit0)
AIC(fit0)
BIC(fit0)
logLik.mlm <- function(object, ...) {
resids <- residuals(object)
Sigma_ML <- crossprod(resids)/nrow (resids)
ans <- sum (mvtnorm ::dmvnorm(resids, sigma = Sigma_ML,log = TRUE))
df <- prod(dim(coef(object))) + choose(ncol(Sigma_ML) + 1 , 2)
attr(ans, "df") <- df
class(ans) <- "logLik"
return(ans)
}
logLik(fit0)
AIC(fit0)
BIC(fit0)
fit1 <- lm(cbind(tear, gloss, opacity) ~ additive)
AIC(fit1)
BIC(fit1)
2022-03-18