MATH2697

Statistical Modelling II

2022

SECTION A

1. An air pollution monitoring station in the city of Munich has recorded daily average SO2 concentrations over a period of 14 consecutive days. We denote the logarithms of these daily averages by $y_1, \dots, y_{14}$; these form our response variable, which we refer to as "pollution" in what follows. We are interested in modelling the responses $y_i$ as a function of the daily average temperatures $x_1, \dots, x_{14}$ (recorded in degrees Celsius on the same 14 days), as well as an indicator $z_i$ which takes the value 0 if day $i$ is a weekday, and 1 if day $i$ is a Saturday or Sunday. The full data set is provided below.

$i$	1	2	3	4	5	6	7
$y_i$	-3.147	-2.830	-3.016	-3.079	-3.541	-2.976	-2.781
$x_i$	16.47	16.02	16.81	22.87	21.68	21.23	20.55
$z_i$	0	0	0	1	1	0	0

$i$	8	9	10	11	12	13	14
$y_i$	-3.352	-2.765	-1.897	-2.120	-2.453	-1.973	-2.235
$x_i$	18.32	15.96	15.36	12.47	12.46	11.77	11.72
$z_i$	0	0	0	1	1	0	0

(a) We are fitting the linear model $y_i = \beta_1 + \beta_2 x_i + \beta_3 z_i + \epsilon_i$. Write down the first four rows of the design matrix, $X$.

(b) Denote $C = (X^T X)^{-1}$ and $s^2$ the usual unbiased estimator of the error variance. You can use in what follows that

$$C = \begin{pmatrix} \cdot & \cdot & -0.01627 \\ \cdot & \cdot & -0.00510 \\ -0.01627 & -0.00510 & 0.35484 \end{pmatrix},$$

$s^2 = 0.1338985$, and $X^T (y_1, \dots, y_{14})^T = (-38.1650, -656.4754, -11.1930)^T$. Find $\hat\beta_j$, $j = 1, 2, 3$, and their standard errors $\mathrm{SE}(\hat\beta_j)$, $j = 1, 2, 3$.

(c) Assume that on a particular Tuesday one observes an average temperature $x_0 = 16.5\,^\circ\mathrm{C}$. We would like to predict the true, unknown pollution $y_0$ on this day, using the fitted model. Hence, find

(i) the predicted pollution, $\hat y_0$, on that day;

(ii) a 95% confidence interval for the expected pollution $E(y_0 \mid x_0, z_0 = 0)$ on that day;

(iii) a 95% prediction interval for the actual pollution $y_0$ on that day.
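The quantities in Question 1 can be recomputed directly from the data table. The following is a minimal sketch in plain Python, not an official solution: the helper name `inverse_3x3` is hypothetical, $s^2$ is taken as quoted in part (b), and the critical value `t_crit = 2.201` is an assumed table value for the 0.975 quantile of the t distribution with $n - p = 11$ degrees of freedom.

```python
import math

# Data transcribed from the table in Question 1.
y = [-3.147, -2.830, -3.016, -3.079, -3.541, -2.976, -2.781,
     -3.352, -2.765, -1.897, -2.120, -2.453, -1.973, -2.235]
x = [16.47, 16.02, 16.81, 22.87, 21.68, 21.23, 20.55,
     18.32, 15.96, 15.36, 12.47, 12.46, 11.77, 11.72]
z = [0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0]
n, p = len(y), 3

# Design matrix: one row (1, x_i, z_i) per day.
X = [[1.0, xi, float(zi)] for xi, zi in zip(x, z)]


def inverse_3x3(a):
    """Invert a 3x3 matrix via the cofactor (adjugate) formula."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = a
    cof = [
        [a22 * a33 - a23 * a32, -(a21 * a33 - a23 * a31), a21 * a32 - a22 * a31],
        [-(a12 * a33 - a13 * a32), a11 * a33 - a13 * a31, -(a11 * a32 - a12 * a31)],
        [a12 * a23 - a13 * a22, -(a11 * a23 - a13 * a21), a11 * a22 - a12 * a21],
    ]
    det = a11 * cof[0][0] + a12 * cof[0][1] + a13 * cof[0][2]
    # The inverse is the transposed cofactor matrix divided by the determinant.
    return [[cof[j][i] / det for j in range(3)] for i in range(3)]


XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
C = inverse_3x3(XtX)

# Part (b): least squares estimates and their standard errors.
beta_hat = [sum(C[i][j] * Xty[j] for j in range(p)) for i in range(p)]
s2 = 0.1338985  # error variance estimate quoted in part (b)
se = [math.sqrt(s2 * C[i][i]) for i in range(p)]

# Part (c): a Tuesday (z0 = 0) with average temperature 16.5.
x0 = [1.0, 16.5, 0.0]
y0_hat = sum(b * v for b, v in zip(beta_hat, x0))
q = sum(x0[i] * C[i][j] * x0[j] for i in range(p) for j in range(p))  # x0^T C x0
t_crit = 2.201  # assumed table value: 0.975 quantile of t with 11 df
half_conf = t_crit * math.sqrt(s2 * q)
half_pred = t_crit * math.sqrt(s2 * (1.0 + q))
conf_int = (y0_hat - half_conf, y0_hat + half_conf)
pred_int = (y0_hat - half_pred, y0_hat + half_pred)

print("first four rows of X:", X[:4])
print("beta_hat:", [round(b, 4) for b in beta_hat])
print("SE:", [round(v, 4) for v in se])
print("y0_hat:", round(y0_hat, 4))
print("95% confidence interval:", tuple(round(v, 4) for v in conf_int))
print("95% prediction interval:", tuple(round(v, 4) for v in pred_int))
```

The prediction interval uses $s^2(1 + x_0^T C x_0)$ instead of $s^2 x_0^T C x_0$, so it is always wider than the confidence interval for the mean response.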

2. For a linear model of type $Y = X\beta + \epsilon$, with $\beta \in \mathbb{R}^p$, the hat matrix is given by $H = X(X^T X)^{-1} X^T$.

(a) Show $HH^T = H$ and $\mathrm{Tr}(H) = p$.

(b) Of particular interest are the diagonal values of $H$, the so-called leverage values $h_i$, $i = 1, \dots, n$. Show $0 \le h_i \le 1$.

i. Give an interpretation of this plot.

ii. Is this plot useful to judge whether the linear model fit would change considerably if the observation labelled "8" were removed from the data set? If so, provide this judgement. Otherwise, suggest an alternative measure to deal with this question (no formulae necessary).

iii. The mean of the plotted leverage values is 0.06521739. Find p.

[Figure: leverage values $h_i$ plotted against the index $i$.]
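The properties in parts (a) and (b) of Question 2 can be checked numerically on a small example. The sketch below uses a hypothetical toy design (an intercept plus one covariate, so $n = 5$ and $p = 2$) and is not related to the data behind the plot; it illustrates the claims but does not prove them.

```python
# Hypothetical toy design: intercept plus one covariate (n = 5, p = 2).
x = [1.0, 2.0, 3.0, 4.0, 10.0]
X = [[1.0, xi] for xi in x]
n, p = len(X), 2

# (X^T X)^{-1} for the 2x2 case, written out explicitly.
a = sum(row[0] * row[0] for row in X)
b = sum(row[0] * row[1] for row in X)
d = sum(row[1] * row[1] for row in X)
det = a * d - b * b
C = [[d / det, -b / det], [-b / det, a / det]]

# Hat matrix H = X (X^T X)^{-1} X^T.
H = [[sum(X[i][k] * C[k][l] * X[j][l] for k in range(p) for l in range(p))
      for j in range(n)] for i in range(n)]

# H H^T, computed entry by entry.
HHt = [[sum(H[i][k] * H[j][k] for k in range(n)) for j in range(n)]
       for i in range(n)]
max_idempotency_error = max(abs(HHt[i][j] - H[i][j])
                            for i in range(n) for j in range(n))

leverages = [H[i][i] for i in range(n)]
trace = sum(leverages)

print("max |HH^T - H| entry:", max_idempotency_error)
print("Tr(H):", round(trace, 10), "with p =", p)
print("leverages:", [round(h, 4) for h in leverages])
print("mean leverage:", round(trace / n, 4), "and p/n:", p / n)
```

Because $\mathrm{Tr}(H) = p$, the mean leverage always equals $p/n$; the isolated point $x = 10$ carries by far the largest leverage in this toy design.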

SECTION B

3. We consider a linear model $Y = X\beta + \epsilon$, with $\beta \in \mathbb{R}^p$ and $\epsilon \sim N_n(0, \sigma^2 I_n)$, where $0$ denotes a vector of appropriate length consisting only of zeros, and $I_n$ is the $n \times n$ identity matrix. Denote by $\hat\beta = (X^T X)^{-1} X^T Y$ the least squares estimator of $\beta$, and by $s^2 = \frac{1}{n-p}(Y - X\hat\beta)^T (Y - X\hat\beta)$ the unbiased estimator of $\sigma^2$.

(a) Derive the expectation and variance of $\hat\beta$. Hence, give the sampling distribution of $\hat\beta$.

(b) Write down the expression for the (squared) Mahalanobis distance between $Y$ and $X\beta$, and give its distribution.

(d) Prove the decomposition

$$(Y - X\beta)^T (Y - X\beta) = (Y - X\hat\beta)^T (Y - X\hat\beta) + (\beta - \hat\beta)^T X^T X (\beta - \hat\beta).$$

(e) Using (b), (c), and (d), justify that, after appropriate standardization, the sampling distribution of $s^2$ is given by a $\chi^2$ distribution, that is

$$c s^2 \sim \chi^2_k, \qquad (1)$$

and give the constant $c$, as well as the degrees of freedom $k$. [Note: Please explain your line of reasoning, but no formal proof is required. In particular, you do not need to show that $\hat\beta$ and $s^2$ are independent.]

(f) Give $E(s^2)$, and develop a formula for $\mathrm{Var}(s^2)$. [Hint: You can use that $\mathrm{Var}(\chi^2_k) = 2k$, for $k \in \mathbb{Z}^+$. If you could not solve part (e), please work with equation (1) as displayed.]
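As a hedged sketch of where parts (e) and (f) lead: the standard result for this model, stated here as a known fact and not derived from the parts above, is that $(n-p)s^2/\sigma^2$ follows a $\chi^2$ distribution with $n-p$ degrees of freedom. Under that assumption, the moments of $s^2$ follow from $E(\chi^2_k) = k$ and the hint $\mathrm{Var}(\chi^2_k) = 2k$:

```latex
\begin{align*}
\frac{(n-p)\,s^2}{\sigma^2} &\sim \chi^2_{n-p}
  && \text{so } c = \frac{n-p}{\sigma^2},\quad k = n-p, \\
E(s^2) &= \frac{\sigma^2}{n-p}\,E\!\left(\chi^2_{n-p}\right) = \sigma^2, \\
\operatorname{Var}(s^2) &= \frac{\sigma^4}{(n-p)^2}\,\operatorname{Var}\!\left(\chi^2_{n-p}\right)
  = \frac{2\sigma^4}{n-p}.
\end{align*}
```

The variance shrinks at rate $1/(n-p)$, so $s^2$ becomes more precise as the residual degrees of freedom grow.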

4. We are given a multiple linear regression model in the form $y_i = x_i^T \beta + \epsilon_i$, $i = 1, \dots, n$, where $\beta \in \mathbb{R}^p$, $X \in \mathbb{R}^{n \times p}$, and $Y = (y_1, \dots, y_n)^T$.

(a) Show that, for models involving an intercept, one has $X^T \hat\epsilon = 0$ and $\hat Y^T \hat\epsilon = 0$, where $\hat Y$ and $\hat\epsilon$ are the vectors of fitted values and residuals, respectively, after the usual least squares fit.

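(b) Hence, for models involving an intercept, show that

$$\mathrm{SST} = \mathrm{SSR} + \mathrm{SSE} \qquad (2)$$

where $\mathrm{SST} = \sum_{i=1}^n (y_i - \bar y)^2$, $\mathrm{SSR} = \sum_{i=1}^n (\hat y_i - \bar y)^2$, and $\mathrm{SSE} = \sum_{i=1}^n (y_i - \hat y_i)^2$. Also explain why equation (2) is generally not correct if there is no intercept in the model.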

$$F = \frac{\mathrm{SSR}/(p - 1)}{\mathrm{SSE}/(n - p)}.$$

Define the coefficient of determination ($R^2$) in terms of the quantities introduced in part (b), and find an expression for $R^2$ which only depends on $F$, $n$ and $p$.

(d) We are given a real data set with $n = 14$, which after fitting the linear model with $p = 3$ (including the intercept) yields the value $F = 7.539$.

i. Carry out the overall F-test at the 0.01 level of significance.

ii. Compute $R^2$, and interpret the result. [Note: If you could not solve part (c), you can make use of the information $\mathrm{SSR} = 1.999$.]

iii. Assume that, for subject-matter considerations, the data analyst decides to remove the intercept. They refit the model using some statistical software, which reports a value $R^2 = 0.9803$ for the fitted model. Does this give evidence that the model without intercept is preferable to the model with intercept? Explain your answer carefully.
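Two ideas from Question 4 lend themselves to a quick numerical check: the link between $F$ and $R^2$ when equation (2) holds, and the failure of equation (2) without an intercept. The sketch below is illustrative only. The function names `r_squared_from_f` and `sums_of_squares` and the toy data are hypothetical, and the second function handles a single covariate only.

```python
def r_squared_from_f(f_stat, n, p):
    """R^2 implied by the overall F statistic for a model with intercept.

    Combines R^2 = SSR / SST, SST = SSR + SSE and
    F = (SSR / (p - 1)) / (SSE / (n - p)).
    """
    num = (p - 1) * f_stat
    return num / (num + (n - p))


def sums_of_squares(x, y, intercept=True):
    """Least squares fit of y on one covariate x; returns (SST, SSR, SSE)."""
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    if intercept:
        slope = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
                 / sum((a - x_bar) ** 2 for a in x))
        const = y_bar - slope * x_bar
    else:
        # Regression through the origin.
        slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
        const = 0.0
    fitted = [const + slope * a for a in x]
    sst = sum((b - y_bar) ** 2 for b in y)
    ssr = sum((f - y_bar) ** 2 for f in fitted)
    sse = sum((b - f) ** 2 for b, f in zip(y, fitted))
    return sst, ssr, sse


# Values from part (d): n = 14, p = 3, F = 7.539.
r2 = r_squared_from_f(7.539, 14, 3)
print(f"R^2 implied by F: {r2:.4f}")

# Hypothetical toy data, used only to illustrate equation (2).
x_toy = [1.0, 2.0, 3.0, 4.0, 5.0]
y_toy = [2.1, 2.9, 4.2, 4.8, 6.1]
gaps = {}
for flag in (True, False):
    sst, ssr, sse = sums_of_squares(x_toy, y_toy, intercept=flag)
    gaps[flag] = sst - (ssr + sse)
    print(f"intercept={flag}: SST - (SSR + SSE) = {gaps[flag]:.6f}")
```

With the intercept the gap is zero up to rounding error; without it the cross term $\sum (\hat y_i - \bar y)(y_i - \hat y_i)$ no longer vanishes, which is one reason an $R^2$ reported for a no-intercept fit is not directly comparable to the usual one.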