MTH316 Applied Multivariate Statistics

Coursework 2022/23

INSTRUCTIONS

1. There are 4 questions in total, and the coursework is marked out of 100. The mark for each question is indicated below.

2. You MUST submit both the R code and the output to get full marks. Use the compile report function in the File menu to export as PDF.

3. The coursework should be submitted online as a single PDF file on Learning Mall Online (LMO). Scans must be legible to receive marks. The coursework is due by 5pm on Friday 12th May 2023. Your submission should be a combination of written solutions and R reports. You can use MS Word, Adobe, or other software to combine the files and scans/screenshots.

Q 1. (20 marks) The housing data (boston.csv), which can be downloaded from the coursework section of the MTH316 Learning Mall page, comprises 506 observations, one for each census district of the Boston metropolitan area, and contains the following seven variables:

X1 = per capita crime rate,

X2 = average number of rooms per dwelling,

X3 = weighted distances to five Boston employment centers,

X4 = full-value property tax rate per $10,000,

X5 = pupil/teacher ratio,

X6 = % lower status of the population,

X7 = median value of owner-occupied homes in $1000s.

(You need to submit both the R code and the output for this part to get full marks. Use the compile report function in the File menu to export as PDF.)

Using R, fit a multiple linear regression model to predict the median value of owner-occupied homes, using only linear terms with no interactions.

(a) Print out a summary() of the fitted linear model and comment on the R² and estimate

(b) At a 1% significance level, comment on the significance of the beta coefficients. [5]

(c) Test for homogeneity of variance, normality of the standardized residuals, and whether there are influential points with high Cook’s distance. Comment on your results. [5]

(d) At a 1% significance level, perform a suitable subset test to remove X1 and X4. Clearly define the hypothesis test and show all calculations. [5]
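The arithmetic behind a subset test of this kind can be sketched as follows. This is a Python illustration only (the coursework itself asks for R), and it assumes the usual partial F-test is the intended "suitable subset test": H0: β1 = β4 = 0 against H1: at least one of β1, β4 is nonzero, with F = ((RSS_reduced − RSS_full) / q) / (RSS_full / (n − p − 1)). The function name and the residual sums of squares below are hypothetical and are not computed from boston.csv.

```python
def partial_f_statistic(rss_reduced, rss_full, q, n, p):
    """Partial F statistic for H0: the q dropped coefficients are all zero.

    rss_reduced: residual sum of squares of the model without the q terms
    rss_full:    residual sum of squares of the full model
    n:           number of observations
    p:           number of predictors in the full model (intercept excluded)
    """
    df_full = n - p - 1  # residual degrees of freedom of the full model
    return ((rss_reduced - rss_full) / q) / (rss_full / df_full)


# Hypothetical RSS values for illustration only (NOT from boston.csv).
# n = 506 observations, p = 6 predictors (X1..X6), q = 2 dropped (X1, X4).
f_stat = partial_f_statistic(rss_reduced=120.0, rss_full=100.0, q=2, n=506, p=6)
print(round(f_stat, 4))
```

Under H0 the statistic follows an F distribution with q and n − p − 1 degrees of freedom (here 2 and 499), so it would be compared with the upper 1% point of that distribution.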

Q 2. (35 marks) Suppose three random variables X1, X2 and X3 have the variance-covariance matrix

2 ) .

(a) Do PCA on the covariance matrix of X. Clearly state the principal components and the cumulative proportion of variance explained, and plot the original variables on a 2D plot where PC1 and PC2 are the main axes. [10]

(b) Using the correlation matrix of X, fit a one-factor model using the method of principal factors. Keep iterating the estimates of Q and Ψ, and explain why you stopped. State both the loadings Q and the specific variances ψ, and plot the loadings as in part (a). [15]

(c) Using the correlation matrix of X, fit a one-factor model using the method of principal components. State both the loadings Q and the specific variances ψ, and plot the loadings as in part (a). [10]

Q 3. (20 marks) Using the dissimilarity matrix D, construct the average linkage dendrogram for the objects A, B, C, D, and E. [15]

B 1 0

(E 2 3 5 4 0 )

Based on your dendrogram, deduce the ‘natural’ clusters of the objects. [5]
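The merging rule behind an average linkage dendrogram can be sketched in a few lines. This is a Python illustration only (the coursework itself asks for R). The objects P, Q, R, S and their dissimilarities are hypothetical and deliberately differ from the coursework matrix D. The function name `average_linkage` is likewise an assumption made for this sketch.

```python
from itertools import combinations


def average_linkage(labels, dist):
    """Agglomerative clustering with average linkage.

    dist maps frozenset({a, b}) to the dissimilarity between objects a and b.
    Returns a list of (cluster_1, cluster_2, merge_height) in merge order.
    """
    clusters = [(label,) for label in labels]
    merges = []

    def between(c1, c2):
        # Average of all pairwise dissimilarities between the two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Merge the pair of clusters with the smallest average dissimilarity.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: between(*pair))
        merges.append((c1, c2, between(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)]
        clusters.append(tuple(sorted(c1 + c2)))
    return merges


# Hypothetical dissimilarities (NOT the coursework matrix D).
example = {
    frozenset(("P", "Q")): 1, frozenset(("P", "R")): 4, frozenset(("P", "S")): 5,
    frozenset(("Q", "R")): 3, frozenset(("Q", "S")): 6, frozenset(("R", "S")): 2,
}
merges = average_linkage(["P", "Q", "R", "S"], example)
for left, right, height in merges:
    print(left, right, height)
```

The merge heights are the levels at which branches join in the dendrogram. A large jump between consecutive heights is one common informal cue for where to cut the tree into clusters.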

Q 4. (25 marks)

(a) For the three-class classification problem, we also allocate a new observation x0 to the population πi with the largest “posterior” probability P(πi | x0). By Bayes’ rule, or otherwise, obtain the posterior probabilities P(π1 | x0), P(π2 | x0) and P(π3 | x0), and state their corresponding classification regions R1, R2 and R3. [5]

(b) Recall the pdf of the exponential distribution for a random variable X is given by f(x) = λ exp(−λx), x ≥ 0.

Suppose that observations come from three distinct populations, T1, T2, and T3, defined by the following exponential distributions:

T1 : X ∼ Exp(1) with prior probability p1 = 0.5

T2 : X ∼ Exp(0.5) with prior probability p2 = 0.3

T3 : X ∼ Exp(0.2) with prior probability p3 = 0.2.

Using part (a) only, determine the classification regions R1, R2 and R3. Thus, classify the observations x = 2, x = 4, and x = 6. [20]
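The largest-posterior rule can be checked numerically with a short sketch. This is a Python illustration only (the coursework itself asks for R). It assumes that Exp(λ) is parameterised by the rate λ, as in the pdf recalled in part (b), and that the posterior for population i is p_i f_i(x) divided by the sum of p_j f_j(x) over all three populations. The names `POPULATIONS`, `exp_pdf`, `posteriors` and `classify` are hypothetical.

```python
import math

# (rate, prior) for T1, T2, T3 as stated in part (b).
POPULATIONS = {1: (1.0, 0.5), 2: (0.5, 0.3), 3: (0.2, 0.2)}


def exp_pdf(x, rate):
    """Exponential density with the given rate; zero for negative x."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0


def posteriors(x):
    """Posterior probability of each population given x (Bayes' rule)."""
    weighted = {k: prior * exp_pdf(x, rate)
                for k, (rate, prior) in POPULATIONS.items()}
    total = sum(weighted.values())
    return {k: w / total for k, w in weighted.items()}


def classify(x):
    """Index of the population with the largest posterior probability."""
    post = posteriors(x)
    return max(post, key=post.get)


for x in (2, 4, 6):
    print(x, classify(x))
```

Because the denominator is shared, comparing posteriors is the same as comparing p_i f_i(x), so the region boundaries are the points where two of these weighted densities are equal.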
