MAST30025 Linear Statistical Models Semester 1 Assessment, 2019
Question 1 (10 marks)
(a) [3 marks] Let $A_1, \ldots, A_m$ be a set of symmetric idempotent matrices whose sum is also idempotent. Show directly that
$r\left(\sum_i A_i\right) = \sum_i r(A_i)$.
(b) [3 marks] Let y be a random vector with var y = V . Show that V is positive semidefinite.
(c) [4 marks] Show directly that for any matrix $A$, we have $A = A(A^T A)^c A^T A$. (Hint: if $M^T M = 0$, then $M = 0$.)
Question 2 (12 marks) Let
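$y \sim \mathrm{MVN}\left( \begin{bmatrix} 4 \\ -1 \end{bmatrix}, \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix} \right)$, $A = {}$[matrix depending on $c$; missing from the source].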
(a) [3 marks] Let $c = 1$. Calculate $E[y^T A y]$.
(b) [3 marks] Let $c = -1$. Describe the distribution of $Ay$.
(c) [4 marks] Find all values of $c$ for which $y^T A y$ has a non-central $\chi^2$ distribution.
(d) [2 marks] Let c = . Determine if $y^T A y$ is independent of $y_1 + y_2$.
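For reference, parts (a) to (d) draw on the following standard results, stated here for a generic $y \sim \mathrm{MVN}(\mu, V)$ with $V$ nonsingular and $A$ symmetric. The matrix $B$ is notation introduced for this sketch and does not appear in the question; taking $B = [1 \;\; 1]$ gives $By = y_1 + y_2$. This is a reminder of the tools, not a worked solution.

```latex
\begin{align*}
E[y^T A y] &= \operatorname{tr}(AV) + \mu^T A \mu, \\
Ay &\sim \mathrm{MVN}(A\mu,\; A V A^T), \\
y^T A y \text{ is noncentral } \chi^2 \text{ with } r(A) \text{ d.f.} &\iff AV \text{ is idempotent}, \\
y^T A y \text{ and } By \text{ are independent} &\iff BVA = 0.
\end{align*}
```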
Question 3 (18 marks) In this question, we study a dataset of 125 countries, collected by the United Nations. This dataset contains the variables:
• ModernC: Percent of unmarried women using a modern method of contraception
• Change: Annual population growth rate, percent
• PPgdp: Per capita gross national product, US dollars
• Frate: Percent of females over age 15 economically active
• Pop: Total 2001 population, 1000s
• Fertility: Expected number of live births per female, 2000
• Purban: Percent of population that is urban, 2001
We wish to model the birth rate (Fertility) in terms of the other variables. The following R calculations are produced (with some removed):
> UN <- read.csv('UN3.csv', header=T)
> pairs(UN, cex=0.5)
[Figure: scatterplot matrix of the variables in UN, produced by pairs(UN, cex=0.5)]
> UN$Fertility <- log(UN$Fertility)
> UN$PPgdp <- log(UN$PPgdp)
> UN$Pop <- log(UN$Pop)
> fullmodel <- lm(Fertility ~ ., data=UN)
> deviance(fullmodel)
[1] 4.929815
> # Input removed
Start: AIC=-390.13
Fertility ~ ModernC + Change + PPgdp + Frate + Pop + Purban
          Df Sum of Sq     RSS     AIC
- Frate    1    0.0003  4.9301 -392.12
- PPgdp    1    0.0277  4.9575 -391.43
<none>                  4.9298 -390.13
- Pop      1    0.3084  5.2382 -384.54
- ModernC  1    0.3461  5.2759 -383.65
- Purban   1    0.5826  5.5124 -378.16
- Change   1   10.2407 15.1705 -251.62
Step: AIC=-392.12
Fertility ~ ModernC + Change + PPgdp + Pop + Purban
          Df Sum of Sq     RSS     AIC
- PPgdp    1    0.0285  4.9586 -393.40
<none>                  4.9301 -392.12
+ Frate    1    0.0003  4.9298 -390.13
- Pop      1    0.3102  5.2403 -386.49
- ModernC  1    0.3559  5.2860 -385.41
- Purban   1    0.6135  5.5436 -379.46
- Change   1   10.9186 15.8487 -248.15
Step: AIC=-393.4
Fertility ~ ModernC + Change + Pop + Purban
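As a rough consistency check on the trace above: assuming the reported AIC follows the convention of R's extractAIC for linear models, $n \log(\mathrm{RSS}/n) + 2p$ with $p$ the number of estimated coefficients including the intercept, the printed AIC values can be recovered from the printed RSS values with $n = 125$.

```latex
\begin{align*}
\mathrm{AIC} &= n \log(\mathrm{RSS}/n) + 2p, \\
\text{full model } (p = 7): \quad \mathrm{AIC} &= 125 \log(4.929815/125) + 14 \approx -390.13, \\
\text{without Frate } (p = 6): \quad \mathrm{AIC} &= 125 \log(4.9301/125) + 12 \approx -392.12, \\
\text{without Frate and PPgdp } (p = 5): \quad \mathrm{AIC} &= 125 \log(4.9586/125) + 10 \approx -393.40.
\end{align*}
```

Each removal lowers the AIC, which is consistent with the order in which Frate and then PPgdp are dropped.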
> summary(model)
Call:
lm(formula = Fertility ~ ModernC + Change + Pop + Purban, data = UN)
Residuals:
    Min      1Q  Median      3Q     Max
-0.4866 -0.1282  0.0084  0.1321  0.5862
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.285452   0.102858  12.497  < 2e-16 ***
ModernC     -0.003971   0.001121  -3.541 0.000567 ***
Change       0.323302   0.019679  16.429  < 2e-16 ***
Pop         -0.024369   0.009324  -2.613 0.010110 *
Purban      -0.006111   0.000985  -6.205 8.09e-09 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.2033 on 120 degrees of freedom
Multiple R-squared: 0.8462, Adjusted R-squared: 0.841
F-statistic: 165 on 4 and 120 DF, p-value: < 2.2e-16
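The summary quantities above can be cross-checked against one another. The sketch below assumes the usual definitions for a linear model with $n = 125$ observations and $p = 5$ coefficients, and takes $\mathrm{RSS} = 4.9586$ for this model from the stepwise trace; the inputs are rounded as printed, so agreement is only to the displayed precision.

```latex
\begin{align*}
\hat{\sigma} &= \sqrt{\frac{\mathrm{RSS}}{n - p}} = \sqrt{\frac{4.9586}{120}} \approx 0.2033, \\
F &= \frac{R^2 / (p - 1)}{(1 - R^2)/(n - p)} = \frac{0.8462 / 4}{0.1538 / 120} \approx 165, \\
R^2_{\mathrm{adj}} &= 1 - (1 - R^2)\,\frac{n - 1}{n - p} = 1 - 0.1538 \cdot \frac{124}{120} \approx 0.841.
\end{align*}
```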
> par(mfrow=c(2,2))
> plot(model)
[Figure: diagnostic plots from plot(model) in a 2 x 2 layout: Residuals vs Fitted (x-axis: Fitted values), Normal Q-Q (x-axis: Theoretical Quantiles), Scale-Location, Residuals vs Leverage]
2022-05-28