
Multiple Linear Regression Models - Part 2
Residual Diagnostics, Unusual observations
STAT3022
Applied linear models
Regression Diagnostics
Background
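Recall the MLR model
$y = X\beta + \varepsilon$, $E(y) = X\beta$, $\mathrm{Var}(y) = \mathrm{Var}(\varepsilon) = \sigma^2 I_n$.
Assume the design matrix $X$ has full column rank, so that $X^\top X$ is invertible and the OLS estimate is
$\hat{\beta} = (X^\top X)^{-1} X^\top y$.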
The vectors of fitted values and residuals are
$\hat{y} = X\hat{\beta} = X(X^\top X)^{-1}X^\top y = Hy$,
$e = y - \hat{y} = y - Hy = (I_n - H)y$,
where $H = X(X^\top X)^{-1}X^\top$ is the $n \times n$ hat matrix.
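As a minimal numerical sketch of these formulas, the snippet below builds the OLS estimate, the hat matrix, the fitted values and the residuals with NumPy. The data are a made-up toy example (n = 6, an intercept column plus two covariates) and are not taken from the course; the explicit inverse is used only to mirror the formulas, and a least-squares solver is generally preferable in practice.

```python
import numpy as np

# Hypothetical toy data (an assumption for illustration only):
# n = 6 observations, an intercept column and two covariates.
X = np.array([
    [1.0, 0.5, 2.0],
    [1.0, 1.5, 1.0],
    [1.0, 2.0, 4.0],
    [1.0, 3.5, 3.0],
    [1.0, 4.0, 6.0],
    [1.0, 5.5, 5.0],
])
y = np.array([2.1, 2.9, 5.2, 6.1, 8.8, 9.4])
n = X.shape[0]

# OLS estimate: beta_hat = (X'X)^{-1} X'y
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y

# Hat matrix H = X (X'X)^{-1} X', fitted values y_hat = H y, residuals e = (I - H) y
H = X @ XtX_inv @ X.T
y_hat = H @ y
e = (np.eye(n) - H) @ y

print("beta_hat:", beta_hat)
print("fitted  :", y_hat)
print("residual:", e)
```

The two routes to the fitted values, `H @ y` and `X @ beta_hat`, should agree up to floating-point error.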
Background
As with model diagnostics for SLR, diagnostics for MLR are based
on the residuals, which depend critically on the hat matrix $H$.
• $H$ is symmetric, i.e. $H^\top = H$. As a result, the matrix $I_n - H$
is also symmetric.
• Next, $HX = X$. As a result, $(I_n - H)X = X - X = 0$.
• Third, $H^2 = H$, so we say $H$ is idempotent. As a result, the
matrix $I_n - H$ is also idempotent, since
$(I_n - H)(I_n - H) = I_n I_n - H I_n - I_n H + HH = I_n - H - H + H = I_n - H$.
• Finally, as proved in Tutorial 4, $\mathrm{trace}(H) = \sum_{i=1}^{n} h_{ii} = p$.
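The four properties of the hat matrix can be checked numerically. The sketch below uses a randomly generated design matrix (an assumption for illustration, not course data), with p taken to be the number of columns of X including the intercept column, which is the convention under which trace(H) = p holds.

```python
import numpy as np

# Hypothetical design matrix: intercept plus (p - 1) random covariates.
rng = np.random.default_rng(0)
n, p = 8, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])

H = X @ np.linalg.inv(X.T @ X) @ X.T   # hat matrix
M = np.eye(n) - H                      # residual-maker matrix I_n - H

# Each entry compares the two sides of one identity up to floating-point error.
checks = {
    "H symmetric": np.allclose(H, H.T),
    "HX = X": np.allclose(H @ X, X),
    "(I - H)X = 0": np.allclose(M @ X, 0.0),
    "H idempotent": np.allclose(H @ H, H),
    "I - H idempotent": np.allclose(M @ M, M),
    "trace(H) = p": bool(np.isclose(np.trace(H), p)),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

A numerical check like this does not replace the proofs; it is only a quick way to catch algebra or coding mistakes.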
Residual vector
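• First, let's compute its expectation:
$E(e) = E\{(I_n - H)y\} = (I_n - H)E(y) = (I_n - H)X\beta = 0$.
• Second, let's compute the variance-covariance matrix:
$\mathrm{Var}(e) = \mathrm{Var}\{(I_n - H)y\} = (I_n - H)\,\mathrm{Var}(y)\,(I_n - H)^\top$
$= (I_n - H)\,\sigma^2 I_n\,(I_n - H) = \sigma^2 (I_n - H)(I_n - H)$
$= \sigma^2 (I_n - H)$,
i.e. $\mathrm{Var}(e_i) = \sigma^2(1 - h_{ii})$ and $\mathrm{Cov}(e_i, e_j) = -\sigma^2 h_{ij}$ for $i \neq j$.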
These computations tell us that (1) each residual $e_i$ has a
smaller variance than the true error $\varepsilon_i$ (whenever $h_{ii} > 0$), and (2) the residuals are
correlated (whenever $h_{ij} \neq 0$), even though the true errors are not.
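The result Var(e) = σ²(I_n − H) can be illustrated by simulation. The sketch below is a Monte Carlo check under assumed settings (a small simple-regression design, normal errors, σ = 2, 50,000 replicates). None of these values come from the course, and normality is used only to generate the errors; the variance formula itself does not require it.

```python
import numpy as np

# Assumed settings for the simulation (hypothetical, for illustration only).
rng = np.random.default_rng(1)
n, sigma = 5, 2.0
X = np.column_stack([np.ones(n), np.arange(n, dtype=float)])
beta = np.array([1.0, 0.5])

H = X @ np.linalg.inv(X.T @ X) @ X.T
M = np.eye(n) - H

# Simulate many response vectors y = X beta + eps; each row is one replicate.
n_rep = 50_000
eps = rng.normal(scale=sigma, size=(n_rep, n))
Y = X @ beta + eps

# Residual vector e = (I - H) y for every replicate (row-wise, so multiply by M').
E = Y @ M.T

emp_mean = E.mean(axis=0)              # should be close to the zero vector
emp_cov = np.cov(E, rowvar=False)      # empirical n x n covariance of residuals
theory_cov = sigma**2 * M              # sigma^2 (I_n - H)
max_abs_diff = np.max(np.abs(emp_cov - theory_cov))

print("empirical mean of residuals:", np.round(emp_mean, 3))
print("theoretical Var(e_i):", np.round(np.diag(theory_cov), 3))
print("empirical   Var(e_i):", np.round(np.diag(emp_cov), 3))
print("largest |empirical - theoretical| covariance entry:", max_abs_diff)
```

The diagonal of `theory_cov` lies below σ² = 4 for every observation, and its off-diagonal entries are the covariances −σ² h_ij between residuals.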
Residuals plots
We can use residual plots similar to those used in simple
linear regression for model diagnostics. Specifically,
• To check the constant variance assumption: use the plot of the
residuals $e_i$ vs. the fitted values $\hat{y}_i$, or the plot of the residuals vs. each
covariate. No news is good news: a plot with no systematic pattern is what we hope to see.
• To check the normality assumption: use a normal quantile-quantile
plot, or a normality test.
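The quantities behind these two plots can be computed as in the sketch below. It rests on several assumptions: the data are simulated (not course data), the Q-Q plotting positions (i − 0.5)/n are one common convention among several, and the arrays are only computed here, not drawn. They could be passed to any plotting library.

```python
import numpy as np
from statistics import NormalDist

# Simulated data (hypothetical): intercept, two covariates, normal errors.
rng = np.random.default_rng(2)
n = 40
x1 = rng.uniform(0, 10, size=n)
x2 = rng.uniform(0, 5, size=n)
X = np.column_stack([np.ones(n), x1, x2])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(scale=1.5, size=n)

# Fit by least squares and form fitted values and residuals.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat
e = y - y_hat

# Residuals vs fitted values: the points to plot are (y_hat[i], e[i]).
resid_vs_fitted = np.column_stack([y_hat, e])

# Normal Q-Q plot: sorted residuals against standard normal quantiles
# evaluated at the plotting positions (i - 0.5) / n.
probs = (np.arange(1, n + 1) - 0.5) / n
theoretical_q = np.array([NormalDist().inv_cdf(float(q)) for q in probs])
sample_q = np.sort(e)
qq_points = np.column_stack([theoretical_q, sample_q])

# Rough numerical summary of the Q-Q plot: correlation between the two axes.
qq_corr = np.corrcoef(theoretical_q, sample_q)[0, 1]
print("Q-Q correlation:", qq_corr)
```

A Q-Q correlation close to 1 corresponds to points lying near a straight line; it is only a rough summary and not a formal normality test.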