MATH 2697 Solutions
1. (a) These are
1 16.47 0
1 16.02 0
1 16.81 0
1 22.87 1
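(b) Denote $C = (X^\top X)^{-1}$ and $g = X^\top y$. Then
$$\hat\beta_j = c_j^\top g,$$
where $c_j$ is the $j$-th column of $C$, so that $\hat\beta_1 = -1.0097$, $\hat\beta_2 = -0.1009$, $\hat\beta_3 = -0.0028$.
Further, with $s = \sqrt{0.1338985} = 0.3659214$, one has
$$\mathrm{SE}(\hat\beta_j) = s\sqrt{C_{jj}},$$
so that $\mathrm{SE}(\hat\beta_1) = 0.4554$, $\mathrm{SE}(\hat\beta_2) = 0.0268$, $\mathrm{SE}(\hat\beta_3) = 0.2180$.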
(c) Let $\hat\beta = (\hat\beta_1, \hat\beta_2, \hat\beta_3)^\top = (-1.0097, -0.1009, -0.0028)^\top$.
It is a Tuesday, so $z_0 = 0$. Hence, one has $x_0 = (1, 16.5, 0)^\top$ and so
(i) $\hat y_0 = x_0^\top \hat\beta = -2.67455$.
(ii) It is $x_0^\top C x_0 = 1.54887 - 2 \times 0.08823 \times 16.5 + 0.00537 \times 16.5^2 = 0.0992625$,
and hence the CI is
$$\hat y_0 \pm t_{14-3,\,0.025} \times s\sqrt{x_0^\top C x_0} = -2.67455 \pm 2.201 \times 0.3659214 \times \sqrt{0.0992625} = [-2.928297, -2.420803].$$
(iii) Similarly, the PI is obtained as
$$\hat y_0 \pm t_{14-3,\,0.025} \times s\sqrt{1 + x_0^\top C x_0} = -2.67455 \pm 2.201 \times 0.3659214 \times \sqrt{1 + 0.0992625} = [-3.51897, -1.83013].$$
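The arithmetic in part (c) can be re-checked with a few lines of Python. This is only a sketch: every number is copied from the solution above (including the tabulated critical value 2.201 for 11 degrees of freedom), and the variable names are illustrative.

```python
import math

# Values copied from the solution above.
beta_hat = (-1.0097, -0.1009, -0.0028)
x0 = (1.0, 16.5, 0.0)          # intercept, predictor value 16.5, indicator z0 = 0
s = math.sqrt(0.1338985)       # residual standard error
t_crit = 2.201                 # t_{11, 0.025}, taken from the solution, not recomputed
q = 0.0992625                  # x0' C x0 as computed in (ii)

# Point prediction x0' beta_hat.
y0_hat = sum(b * x for b, x in zip(beta_hat, x0))

# Half-widths of the confidence and prediction intervals.
ci_half = t_crit * s * math.sqrt(q)
pi_half = t_crit * s * math.sqrt(1.0 + q)

ci = (y0_hat - ci_half, y0_hat + ci_half)
pi = (y0_hat - pi_half, y0_hat + pi_half)
print(y0_hat, ci, pi)
```

The prediction interval differs from the confidence interval only through the extra 1 under the square root, which accounts for the noise in a new observation.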
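2. (a) $HH^\top = X(X^\top X)^{-1}X^\top \left[X(X^\top X)^{-1}X^\top\right]^\top = X(X^\top X)^{-1}X^\top X(X^\top X)^{-1}X^\top = X(X^\top X)^{-1}X^\top = H.$
$\operatorname{Tr}(H) = \operatorname{Tr}\left(X(X^\top X)^{-1}X^\top\right) = \operatorname{Tr}\left(X^\top X(X^\top X)^{-1}\right) = \operatorname{Tr}(I_p) = p.$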
(b) Let $H = [h_{ij}]_{1 \le i,j \le n}$. Using part (a), and taking the $i$-th diagonal element of $HH^\top = H$, one has
$$\sum_{j} h_{ij}^2 = h_{ii},$$
i.e. $h_{ii} \ge 0$, and
$$h_{ii}^2 + \sum_{j \ne i} h_{ij}^2 = h_{ii},$$
from which we see that
$$h_{ii}^2 \le h_{ii},$$
hence
$$h_{ii}(1 - h_{ii}) \ge 0,$$
so that necessarily $0 \le h_{ii} \le 1$.
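The properties from parts (a) and (b) can be illustrated numerically. The sketch below assumes a simple linear regression (intercept plus one predictor, so p = 2) with made-up predictor values that are not from the exam. In that special case the hat matrix has the closed form h_ij = 1/n + (x_i - x̄)(x_j - x̄)/S_xx, which avoids any matrix inversion.

```python
# Hypothetical predictor values (not from the exam); intercept + slope, so p = 2.
x = [1.0, 2.0, 4.0, 7.0, 11.0]
n, p = len(x), 2
xbar = sum(x) / n
sxx = sum((xi - xbar) ** 2 for xi in x)

# Hat matrix of simple linear regression, entry by entry.
H = [[1.0 / n + (xi - xbar) * (xj - xbar) / sxx for xj in x] for xi in x]

# H H^T, computed entry by entry for comparison with H.
HHt = [[sum(H[i][k] * H[j][k] for k in range(n)) for j in range(n)]
       for i in range(n)]

leverages = [H[i][i] for i in range(n)]
trace = sum(leverages)
max_idempotence_error = max(abs(HHt[i][j] - H[i][j])
                            for i in range(n) for j in range(n))

print("trace:", trace)
print("leverages:", leverages)
print("max |HH^T - H|:", max_idempotence_error)
```

The trace comes out as p, every leverage lies between 0 and 1, and the point furthest from the mean of x carries the largest leverage.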
(c) (i) There is one observation which has a far larger leverage value than the others. Hence, we would say that observation '8' is a potentially influential observation.
(ii) No, it is not. The leverage values do not take the response into account; hence they do not tell us whether the observation is actually influential. This could be investigated using Cook’s Distance, for instance.
(iii) Using part (a), $\sum_i h_{ii} = \operatorname{Tr}(H) = p$, so $p = 46 \times 0.06521739 = 3$.
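3. (a) One has
$$E[\hat\beta] = E\left[(X^\top X)^{-1}X^\top Y\right] = (X^\top X)^{-1}X^\top E[Y] = (X^\top X)^{-1}X^\top X\beta = \beta$$
and
$$\operatorname{Var}[\hat\beta] = \operatorname{Var}\left[(X^\top X)^{-1}X^\top Y\right] = (X^\top X)^{-1}X^\top \operatorname{Var}[Y] \left((X^\top X)^{-1}X^\top\right)^\top = (X^\top X)^{-1}X^\top \left[\sigma^2 I_n\right] X(X^\top X)^{-1} = \sigma^2 (X^\top X)^{-1}.$$
Since $Y \sim N_n(X\beta, \sigma^2 I_n)$, these results imply that the sampling distribution of $\hat\beta$ is given by
$$\hat\beta \sim N_p\left(\beta, \sigma^2 (X^\top X)^{-1}\right).$$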
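(b)
$$(Y - X\beta)^\top \tfrac{1}{\sigma^2} I_n\, (Y - X\beta) \sim \chi^2(n)$$
(c)
$$(\hat\beta - \beta)^\top \tfrac{1}{\sigma^2} X^\top X\, (\hat\beta - \beta) \sim \chi^2(p)$$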
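(d) Using $\hat\beta = (X^\top X)^{-1}X^\top Y$ in step (*), one obtains
$$(Y - X\hat\beta)^\top (Y - X\hat\beta) + (\beta - \hat\beta)^\top X^\top X(\beta - \hat\beta)$$
$$= Y^\top Y - Y^\top X\hat\beta - \hat\beta^\top X^\top Y + \hat\beta^\top X^\top X\hat\beta + \beta^\top X^\top X\beta - \hat\beta^\top X^\top X\beta - \beta^\top X^\top X\hat\beta + \hat\beta^\top X^\top X\hat\beta$$
$$\overset{(*)}{=} Y^\top Y - Y^\top X\beta - \beta^\top X^\top Y + \beta^\top X^\top X\beta$$
$$= (Y - X\beta)^\top (Y - X\beta).$$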
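The identity in part (d) holds for any value of β, which is easy to confirm numerically. The sketch below again assumes a simple linear regression with made-up data (not from the exam) and an arbitrarily chosen β. The helper names `rss` and `quad_form` are illustrative.

```python
# Hypothetical data (not from the exam): intercept plus one predictor.
x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [2.1, 2.9, 5.2, 7.8, 12.5]
n = len(x)

# Least-squares estimate for X = [1, x] via the usual closed form.
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = ybar - b1 * xbar


def rss(a0, a1):
    """(Y - Xa)^T (Y - Xa) for a coefficient vector a = (a0, a1)."""
    return sum((yi - a0 - a1 * xi) ** 2 for xi, yi in zip(x, y))


def quad_form(d0, d1):
    """d^T X^T X d, i.e. the squared length of the vector X d."""
    return sum((d0 + d1 * xi) ** 2 for xi in x)


beta = (0.5, 1.3)  # arbitrary stand-in for the true beta
lhs = rss(b0, b1) + quad_form(beta[0] - b0, beta[1] - b1)
rhs = rss(*beta)
print(lhs, rhs)
```

The two printed numbers agree up to floating-point error, and this remains true if `beta` is replaced by any other pair.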
(e) From (d), one has
$$(n - p)s^2 = (Y - X\hat\beta)^\top (Y - X\hat\beta) = (Y - X\beta)^\top (Y - X\beta) - (\hat\beta - \beta)^\top X^\top X(\hat\beta - \beta).$$
Now, after dividing by $\sigma^2$, we know from (b) that the first of these terms follows a $\chi^2(n)$ distribution, and we know from (c) that the second of these terms is $\chi^2(p)$ distributed, justifying qualitatively that their difference is $\chi^2(n - p)$ distributed. [A formal proof would require showing that the pieces corresponding to $\chi^2(n - p)$ (that is, $s^2$) and to $\chi^2(p)$ (that is, $\hat\beta$) are independent, as only this ensures that actually $\chi^2(n - p) + \chi^2(p) = \chi^2(n)$.]
Hence, $c = (n - p)/\sigma^2$ and $k = n - p$.
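(f) It is $E(s^2) = \sigma^2$, and
$$\operatorname{Var}(s^2) = \operatorname{Var}\left(\frac{\sigma^2}{n - p}\,\chi^2(n - p)\right) = \frac{\sigma^4}{(n - p)^2}\operatorname{Var}\left(\chi^2(n - p)\right) = \sigma^4 \cdot \frac{1}{(n - p)^2} \cdot 2(n - p) = \frac{2\sigma^4}{n - p}.$$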
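The mean and variance of $s^2$ in part (f) can be checked by simulation. The sketch below is only illustrative: the values of σ and n - p are made up, and a χ²(k) variable is generated as a sum of k squared standard normal draws. A fixed seed keeps the run reproducible.

```python
import random

random.seed(0)            # fixed seed so the run is reproducible
sigma, k = 2.0, 5         # hypothetical sigma and degrees of freedom k = n - p
n_draws = 20000

# Draw s^2 = sigma^2 / k * chi^2(k), with chi^2(k) built from k squared N(0, 1) draws.
draws = [sigma ** 2 / k * sum(random.gauss(0.0, 1.0) ** 2 for _ in range(k))
         for _ in range(n_draws)]

mean_s2 = sum(draws) / n_draws
var_s2 = sum((d - mean_s2) ** 2 for d in draws) / (n_draws - 1)

theory_mean = sigma ** 2          # E(s^2) = sigma^2
theory_var = 2 * sigma ** 4 / k   # Var(s^2) = 2 sigma^4 / (n - p)

print("mean:", mean_s2, "theory:", theory_mean)
print("variance:", var_s2, "theory:", theory_var)
```

The simulated values should land close to, but not exactly on, the theoretical ones. The agreement improves as `n_draws` grows.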
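4. (a) One has
$$X^\top \hat\varepsilon = X^\top (Y - X\hat\beta) = X^\top Y - X^\top X\hat\beta = 0,$$
and with $\hat Y = X\hat\beta = HY$ it follows that
$$\hat Y^\top \hat\varepsilon = \hat Y^\top (Y - \hat Y) = \hat\beta^\top X^\top (Y - X\hat\beta) = \hat\beta^\top X^\top Y - \hat\beta^\top X^\top X\hat\beta = \hat\beta^\top X^\top Y - \hat\beta^\top X^\top Y = 0.$$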
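(b) In usual linear model notation,
$$\mathrm{SST} = \sum_{i=1}^{n} (y_i - \bar y)^2 = \sum_{i=1}^{n} (y_i - \hat y_i + \hat y_i - \bar y)^2$$
$$= \sum_{i=1}^{n} (y_i - \hat y_i)^2 + \sum_{i=1}^{n} (\hat y_i - \bar y)^2 + 2\sum_{i=1}^{n} (y_i - \hat y_i)(\hat y_i - \bar y)$$
$$= \mathrm{SSE} + \mathrm{SSR} + 2\sum_{i=1}^{n} \hat\varepsilon_i \hat y_i - 2\bar y \sum_{i=1}^{n} \hat\varepsilon_i \tag{1}$$
$$= \mathrm{SSE} + \mathrm{SSR},$$
where the last two terms in (1) vanish due to part (a). Specifically, since the first column of $X$ just consists of a vector of 1's, the first entry of the $2 \times 1$ vector $X^\top \hat\varepsilon$ corresponds just to $\sum_i \hat\varepsilon_i$, which is, hence, equal to 0. Exactly for this reason, the intercept is required; otherwise $\sum_i \hat\varepsilon_i \neq 0$ in general.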
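The role of the intercept in this decomposition can be seen on a small example. The sketch below uses made-up data (not from the exam) and fits a simple linear regression once with an intercept and once through the origin, using the standard closed-form estimates. The helper `sums_of_squares` is illustrative.

```python
# Hypothetical data (not from the exam).
x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [2.1, 2.9, 5.2, 7.8, 12.5]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n


def sums_of_squares(fitted):
    """Return (SST, SSE, SSR, sum of residuals) for a list of fitted values."""
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    sst = sum((yi - ybar) ** 2 for yi in y)
    sse = sum(r ** 2 for r in resid)
    ssr = sum((fi - ybar) ** 2 for fi in fitted)
    return sst, sse, ssr, sum(resid)


# Model with intercept: y = b0 + b1 * x.
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar
with_int = sums_of_squares([b0 + b1 * xi for xi in x])

# Model through the origin: y = b * x, with b = sum(x*y) / sum(x^2).
b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)
no_int = sums_of_squares([b * xi for xi in x])

for label, (sst, sse, ssr, rsum) in (("with intercept", with_int),
                                     ("through origin", no_int)):
    print(label, "SST:", sst, "SSE + SSR:", sse + ssr, "sum of residuals:", rsum)
```

With the intercept, the residuals sum to zero and SST equals SSE + SSR. Through the origin, the residual sum is nonzero for these data and the two sides no longer match.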
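(c) $R^2 = \mathrm{SSR}/\mathrm{SST}$. Hence,
$$\frac{R^2}{1 - R^2} = \frac{\mathrm{SSR}/\mathrm{SST}}{1 - \mathrm{SSR}/\mathrm{SST}} = \frac{\mathrm{SSR}/\mathrm{SST}}{(\mathrm{SSE} + \mathrm{SSR} - \mathrm{SSR})/\mathrm{SST}} = \frac{\mathrm{SSR}}{\mathrm{SSE}}.$$
Let $c = \frac{n - p}{p - 1}$. Then $F = c\,\frac{R^2}{1 - R^2}$ and so
$$F(1 - R^2) = R^2 c$$
$$F = (c + F)R^2$$
$$R^2 = \frac{F}{c + F}.$$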
(d) (i) $F = 7.539$ and $F_{p-1,\,n-p} = F_{2,11,0.01} = 7.21$, so $H_0$ is rejected; that is, there is evidence that the predictors do explain some variation in the response.
(ii) It is $c = \frac{11}{2} = 5.5$ and hence
$$R^2 = \frac{F}{c + F} = \frac{7.539}{5.5 + 7.539} = 0.57819.$$
[If they could not solve part (c), they can use the given value of SSR to compute SSE = 1.45835 from the formula given in (c), and then R2 = 1.999/(1.999 + 1.45835) = 0.57819.]
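Both routes to $R^2$ in part (d)(ii) can be reproduced in a few lines. The sketch below uses only numbers quoted in the solution (n = 14, p = 3, F = 7.539, SSR = 1.999). The function name `r2_from_f` is illustrative.

```python
def r2_from_f(f_stat, n, p):
    """R^2 = F / (c + F) with c = (n - p) / (p - 1), as derived in part (c)."""
    c = (n - p) / (p - 1)
    return f_stat / (c + f_stat)


n, p = 14, 3          # n - p = 11 and p - 1 = 2, matching F_{2,11}
f_stat = 7.539
ssr = 1.999           # SSR as quoted in the note above

# Route 1: directly from the F statistic.
r2_direct = r2_from_f(f_stat, n, p)

# Route 2: F = c * SSR / SSE, so SSE = c * SSR / F, then R^2 = SSR / (SSR + SSE).
c = (n - p) / (p - 1)
sse = c * ssr / f_stat
r2_via_sse = ssr / (ssr + sse)

print(round(r2_direct, 5), round(sse, 5), round(r2_via_sse, 5))
```

Both routes are algebraically the same formula, so they agree up to floating-point error.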
(iii) This value of $R^2$ is not interpretable: the ratio $\mathrm{SSR}/\mathrm{SST}$ is not meaningful in this scenario, since SST does not decompose into SSE and SSR here (the decomposition in part (b) relies on the intercept). In particular, the higher $R^2$ does not imply that the model fits better in any sense.
2022-05-16