
Semester 1 Assessment, 2019

MAST30025 Linear Statistical Models

Question 1 (10 marks)

(a) [3 marks] Let $A_1, \ldots, A_m$ be a set of symmetric idempotent matrices whose sum is also idempotent. Show directly that

$$ r\Big(\sum_i A_i\Big) = \sum_i r(A_i). $$
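The rank identity in part (a) can be checked numerically before attempting the proof. The R sketch below is only a sanity check under an assumed construction: it builds two symmetric idempotent matrices as projections onto orthogonal subspaces, so that their sum is also idempotent. The names Q, A1, A2, S and rank_of are illustrative and not part of the question.

```r
set.seed(1)
# Orthonormal basis of R^5 from the QR decomposition of a random matrix
Q <- qr.Q(qr(matrix(rnorm(25), 5, 5)))

# Projections onto orthogonal subspaces: each is symmetric and idempotent
A1 <- Q[, 1:2] %*% t(Q[, 1:2])
A2 <- Q[, 3, drop = FALSE] %*% t(Q[, 3, drop = FALSE])
S  <- A1 + A2

# The sum is idempotent up to rounding error
max(abs(S %*% S - S))

# Rank of the sum versus the sum of the ranks
rank_of <- function(M) qr(M)$rank
c(rank_of_sum = rank_of(S), sum_of_ranks = rank_of(A1) + rank_of(A2))
```

A numerical check of this kind illustrates the claim for one example only; the question still asks for a direct argument.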

(b) [3 marks] Let y be a random vector with var y = V . Show that V is positive semidefinite.
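For intuition on part (b), the sample analogue of the statement can be inspected in R. This is a minimal sketch on simulated data, not a proof: Y, S and a are assumed example objects, with S = cov(Y) standing in for V.

```r
set.seed(2)
# 100 simulated observations of a 3-dimensional random vector with correlated components
mix <- matrix(c(1, 0.5, 0, 0, 1, 0.5, 0, 0, 1), 3, 3)
Y <- matrix(rnorm(300), 100, 3) %*% mix
S <- cov(Y)            # sample analogue of V
a <- c(1, -2, 0.5)     # an arbitrary fixed vector

# The sample variance of a^T y coincides with the quadratic form a^T S a
c(var(drop(Y %*% a)), drop(t(a) %*% S %*% a))

# The eigenvalues of S, all of which should be non-negative
eigen(S, symmetric = TRUE)$values
```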

(c) [4 marks] Show directly that for any matrix A, we have $A = A(A^T A)^c A^T A$. (Hint: $M^T M = 0$ implies $M = 0$.)
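The identity in part (c) can also be verified numerically. The sketch below assumes that the superscript c denotes a conditional (generalized) inverse and uses the Moore-Penrose inverse from MASS::ginv as one particular choice of it; the matrix A here is an arbitrary rank-deficient example, not taken from the exam.

```r
library(MASS)  # ginv(): Moore-Penrose inverse, one particular conditional inverse

set.seed(3)
A <- matrix(rnorm(12), 4, 3)
A[, 3] <- A[, 1] + A[, 2]   # force linear dependence so that A^T A is singular

G <- ginv(t(A) %*% A)       # plays the role of (A^T A)^c

# Largest absolute entry of A - A (A^T A)^c A^T A; should be at rounding-error level
max(abs(A - A %*% G %*% t(A) %*% A))
```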

Question 2 (12 marks) Let

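$$ y \sim MVN\left( \begin{bmatrix} 4 \\ -1 \end{bmatrix}, \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix} \right), \qquad A = \text{[matrix missing]}. $$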

(a) [3 marks] Let c = 1. Calculate $E[y^T A y]$.

(b) [3 marks] Let c = -1. Describe the distribution of Ay.

(c) [4 marks] Find all values of c for which $y^T A y$ has a non-central $\chi^2$ distribution.

(d) [2 marks] Let c = [value missing]. Determine if $y^T A y$ is independent of $y_1 + y_2$.

Question 3 (18 marks) In this question, we study a dataset of 125 countries, collected by the United Nations. This dataset contains the variables:

• ModernC: Percent of unmarried women using a modern method of contraception
• Change: Annual population growth rate, percent
• PPgdp: Per capita gross national product, US dollars
• Frate: Percent of females over age 15 economically active
• Pop: Total 2001 population, 1000s
• Fertility: Expected number of live births per female, 2000
• Purban: Percent of population that is urban, 2001

We wish to model the birth rate (Fertility) in terms of the other variables. The following R calculations are produced (with some removed):

> UN <- read.csv('UN3.csv', header=T)

> pairs(UN,  cex=0.5)

[Figure: scatterplot matrix of the UN variables, produced by pairs(UN)]

>  UN$Fertility  <- log(UN$Fertility)

>  UN$PPgdp  <- log(UN$PPgdp)

>  UN$Pop  <- log(UN$Pop)

>  fullmodel  <- lm(Fertility ~ ., data=UN)

>  deviance(fullmodel)

[1]  4.929815

>  #  Input  removed

Start:    AIC=-390.13

Fertility  ~  ModernC  +  Change  +  PPgdp  +  Frate  +  Pop  +  Purban

          Df Sum of Sq     RSS     AIC
- Frate    1    0.0003  4.9301 -392.12
- PPgdp    1    0.0277  4.9575 -391.43
<none>                  4.9298 -390.13
- Pop      1    0.3084  5.2382 -384.54
- ModernC  1    0.3461  5.2759 -383.65
- Purban   1    0.5826  5.5124 -378.16
- Change   1   10.2407 15.1705 -251.62

Step:    AIC=-392.12

Fertility  ~  ModernC  +  Change  +  PPgdp  +  Pop  +  Purban

          Df Sum of Sq     RSS     AIC
- PPgdp    1    0.0285  4.9586 -393.40
<none>                  4.9301 -392.12
+ Frate    1    0.0003  4.9298 -390.13
- Pop      1    0.3102  5.2403 -386.49
- ModernC  1    0.3559  5.2860 -385.41
- Purban   1    0.6135  5.5436 -379.46
- Change   1   10.9186 15.8487 -248.15

Step:    AIC=-393.4

Fertility  ~  ModernC  +  Change  +  Pop  +  Purban
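The AIC values in the stepwise output above are consistent with the form n log(RSS/n) + 2p that step() reports for linear models, where p counts the regression coefficients including the intercept. The following sketch recomputes them from n = 125 and the printed RSS values; the helper name step_aic is hypothetical and introduced here only for illustration.

```r
n <- 125  # number of countries in the dataset

# AIC in the form reported by step() for a linear model with p coefficients
step_aic <- function(rss, p) n * log(rss / n) + 2 * p

step_aic(4.929815, 7)  # full model: intercept plus 6 predictors
step_aic(4.9301, 6)    # after dropping Frate
step_aic(4.9586, 5)    # after dropping PPgdp
```

Because the RSS values are taken from the printed output, the results agree with the reported AIC values only up to rounding.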

>  summary(model)

Call:

lm(formula  =  Fertility  ~  ModernC  +  Change  +  Pop  +  Purban,  data  = UN)


Residuals:

Min            1Q   Median           3Q         Max

-0.4866  -0.1282    0.0084    0.1321    0.5862


Coefficients:

             Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.285452   0.102858  12.497  < 2e-16 ***
ModernC     -0.003971   0.001121  -3.541 0.000567 ***
Change       0.323302   0.019679  16.429  < 2e-16 ***
Pop         -0.024369   0.009324  -2.613 0.010110 *
Purban      -0.006111   0.000985  -6.205 8.09e-09 ***

---

Signif.  codes:    0  ‘***’  0.001  ‘**’  0.01  ‘*’  0.05  ‘.’  0.1  ‘  ’  1


Residual  standard  error:  0.2033  on  120  degrees  of  freedom

Multiple  R-squared:    0.8462,               Adjusted  R-squared:    0.841

F-statistic:      165  on  4  and  120  DF,   p-value:  < 2.2e-16
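The three summary lines above are linked to the residual sum of squares and to each other by standard formulas. As a minimal sketch, the values can be recovered from the printed RSS of the selected model (4.9586, from the stepwise output) and the printed multiple R-squared; the variable names are illustrative, and small discrepancies are expected because the inputs are rounded.

```r
n   <- 125      # observations
p   <- 5        # coefficients, including the intercept
rss <- 4.9586   # residual sum of squares of the selected model, as printed
r2  <- 0.8462   # multiple R-squared, as printed

sqrt(rss / (n - p))                      # residual standard error
(r2 / (p - 1)) / ((1 - r2) / (n - p))    # overall F statistic on p - 1 and n - p df
1 - (1 - r2) * (n - 1) / (n - p)         # adjusted R-squared
```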

> par(mfrow=c(2,2))

> plot(model)

[Figure: diagnostic plots from plot(model), with panels Residuals vs Fitted, Normal Q-Q, Scale-Location, and Residuals vs Leverage]