CMSE 820 Homework Assignment 1

This assignment is due on Jan. 24th at 11:59 pm.

Question 1: Assume that $Y = X^T \beta + \epsilon$, where $X \in \mathbb{R}^p$ is not random and $\epsilon \sim N(0, 1)$. Given i.i.d. data $\{(x_1, y_1), \ldots, (x_n, y_n)\}$, we would like to estimate $\beta \in \mathbb{R}^p$ through the maximum likelihood framework. Write down the joint log-likelihood and compare it with the least-squares method.
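As a starting point, the joint log-likelihood under the stated model can be sketched as follows. This assumes the observations are independent with $y_i \mid x_i \sim N(x_i^T \beta, 1)$, which is how the setup above is read here; the symbol $\ell(\beta)$ is introduced only for this illustration.

```latex
\ell(\beta)
  = \sum_{i=1}^{n} \log\left[ \frac{1}{\sqrt{2\pi}}
      \exp\left( -\frac{(y_i - x_i^T \beta)^2}{2} \right) \right]
  = -\frac{n}{2} \log(2\pi) - \frac{1}{2} \sum_{i=1}^{n} (y_i - x_i^T \beta)^2 .
```

The first term does not depend on $\beta$, so under these assumptions maximizing $\ell(\beta)$ amounts to minimizing $\sum_{i=1}^{n} (y_i - x_i^T \beta)^2$, which is the quantity to compare with the least-squares criterion.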

Question 2: Consider the usual linear regression setup, with response vector $y \in \mathbb{R}^n$ and predictor matrix $X \in \mathbb{R}^{p \times n}$. Let $x_1, \ldots, x_p$ be the rows of $X$. Suppose that $\hat{\beta} \in \mathbb{R}^p$ is a minimizer of the least-squares criterion,

$$\| y - X^T \beta \|^2 .$$

a. Show that if $v \in \mathbb{R}^p$ is a vector such that $X^T v = 0$, then $\hat{\beta} + c \cdot v$ is also a minimizer of the least-squares criterion, for any $c \in \mathbb{R}$.

b. If $x_1, \ldots, x_p \in \mathbb{R}^n$ are linearly independent, then what vectors $v \in \mathbb{R}^p$ satisfy $X^T v = 0$? We assume $p \le n$.

c. Suppose that $p > n$. Show that there exists a vector $v \neq 0$ such that $X^T v = 0$. Argue, based on part (a), that there are infinitely many linear regression estimates. Further argue that there is a variable $i \in \{1, \ldots, p\}$ such that the regression coefficient $\beta_{[i]}$ of that variable can have different signs, depending on which estimate we choose. Comment on this.
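The claims in parts (a) and (c) can be checked numerically before proving them. The sketch below is an illustration, not a proof: it assumes NumPy, uses a random $X$ of shape (p, n) with $p > n$, and the helper names `null_vector`, `criterion` and `sign_flipping_pair` are hypothetical.

```python
import numpy as np


def null_vector(X):
    """Return a unit vector v with X.T @ v = 0, for X of shape (p, n) with p > n."""
    # Full SVD of X.T (shape n x p). Singular values are sorted in decreasing
    # order, so when p > n the last row of vt lies in the null space of X.T.
    _, _, vt = np.linalg.svd(X.T, full_matrices=True)
    return vt[-1]


def criterion(X, y, beta):
    """Least-squares criterion ||y - X^T beta||^2."""
    residual = y - X.T @ beta
    return float(residual @ residual)


def sign_flipping_pair(beta_hat, v):
    """Return index i and two estimates whose i-th coefficients differ in sign."""
    i = int(np.argmax(np.abs(v)))
    # Choose c large enough that c * v[i] dominates beta_hat[i].
    c = (abs(beta_hat[i]) + 1.0) / abs(v[i])
    return i, beta_hat + c * v, beta_hat - c * v


rng = np.random.default_rng(0)
p, n = 5, 3                      # illustrative sizes with p > n
X = rng.standard_normal((p, n))  # rows are x_1, ..., x_p
y = rng.standard_normal(n)

# One particular minimizer (the minimum-norm least-squares solution).
beta_hat = np.linalg.lstsq(X.T, y, rcond=None)[0]
v = null_vector(X)
i, beta_plus, beta_minus = sign_flipping_pair(beta_hat, v)

print("criterion at beta_hat  :", criterion(X, y, beta_hat))
print("criterion at beta_plus :", criterion(X, y, beta_plus))
print("coefficient", i, "signs:", np.sign(beta_plus[i]), np.sign(beta_minus[i]))
```

All three estimates give the same fitted values $X^T \beta$, so the criterion cannot distinguish between them even though one coefficient changes sign.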

Question 3: Implement the following model (you can use any language)

$$Y = X^T \beta + \epsilon,$$

where $\epsilon \sim N(0, 1)$, $X \sim N(0, I_{p \times p})$ and $\beta \in \mathbb{R}^p$ with $\beta_{[1]} = 1$, $\beta_{[2]} = -2$ and the remaining $\beta_{[j]} = 0$. Based on this setting, let us start with $p = 5$, simulate $\{x_1, \ldots, x_{100}\}$ and store it. Then carry out the following experiments.

(1) Based on $\beta$ and $\{x_1, \ldots, x_{100}\}$, we first simulate the corresponding $Y$'s and calculate $\hat{\beta}^{ols}$.

(2) Using the same $\{x_1, \ldots, x_{100}\}$, we then simulate another set $\tilde{Y} = \{\tilde{y}_1, \ldots, \tilde{y}_{100}\}$ and calculate the in-sample prediction error ($PE_{in}$) using the $\hat{\beta}^{ols}$ calculated in (1). This is one realization of $PE_{in}$.

(3) Repeat (1)-(2) 5000 times and take the average of the 5000 calculated $PE_{in}$ values. This gives an approximation of $PE_{in}$.

(4) Repeat the same procedure for $p = 10, 40, 80$. What is the trend in $PE_{in}$ as $p$ grows? Comment on your findings.
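Steps (1) to (4) can be sketched as below. This is a minimal illustration, assuming NumPy and taking $PE_{in}$ to be the mean squared error $\frac{1}{n} \sum_i (\tilde{y}_i - x_i^T \hat{\beta}^{ols})^2$; the assignment does not spell out that formula, so treat it as an assumption. The function name `average_pe_in` and its parameters are hypothetical.

```python
import numpy as np


def average_pe_in(p, n=100, reps=5000, seed=0):
    """Monte Carlo average of the in-sample prediction error for OLS.

    Assumes PE_in = mean((y_tilde - X @ beta_ols)**2); this definition is an
    assumption, not something stated in the assignment.
    """
    rng = np.random.default_rng(seed)

    # True coefficients: beta[1] = 1, beta[2] = -2, all others 0.
    beta = np.zeros(p)
    beta[0], beta[1] = 1.0, -2.0

    # Simulate the design once and keep it fixed for every replicate.
    X = rng.standard_normal((n, p))   # row i is x_i
    X_pinv = np.linalg.pinv(X)        # maps y to the OLS estimate
    signal = X @ beta

    total = 0.0
    for _ in range(reps):
        # Step (1): simulate Y and compute the OLS estimate.
        y = signal + rng.standard_normal(n)
        beta_ols = X_pinv @ y
        # Step (2): simulate fresh responses at the same x_i and score them.
        y_tilde = signal + rng.standard_normal(n)
        total += np.mean((y_tilde - X @ beta_ols) ** 2)

    # Step (3): average over all replicates.
    return total / reps
```

Calling `average_pe_in(p)` for each of p = 5, 10, 40, 80 gives the four averages needed for step (4). Fixing `seed` makes a run reproducible, but the stored design then differs between values of p because its shape changes.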

Question 4: Implement the following model (you can use any language)

$$y_i = \beta_1^* x_{i[1]} + \beta_2^* x_{i[2]} + \epsilon_i ,$$

where $E(\epsilon_i) = 0$, $Var(\epsilon_i) = 1$, $Cov(x_i, x_j) = 0$ and $\beta^* = (-1, 2)^T$. We also assume $x_i \sim N(0, \Sigma_x)$ with

$$\Sigma_x = Cov(x_i) = \begin{pmatrix} 1 & 0.9999 \\ 0.9999 & 1 \end{pmatrix} .$$

We repeat the following 2000 times:

• Generate $y = (y_1, \ldots, y_{50})^T$ and $X = (x_1, \ldots, x_{50})$.

• Compute and record $\hat{\beta}^{ols}$ and $\hat{\beta}^{ridge}$ (for ridge regression, choose $\lambda = 0.005$).

Then report the following:

a. The histograms for $\hat{\beta}_1^{ols}$ and $\hat{\beta}_1^{ridge}$. What conclusion can you draw from these histograms?

b. For each of the 2000 replicates, compare $|\beta_1^* - \hat{\beta}_1^{ols}|$ with $|\beta_1^* - \hat{\beta}_1^{ridge}|$. How many times does ridge regression return a better estimate of $\beta_1^*$?
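The experiment can be sketched as below, assuming NumPy. Several choices here are assumptions rather than part of the assignment: the noise is drawn as Gaussian (only its mean and variance are specified), no intercept is fitted, and ridge is taken to minimize $\|y - X\beta\|^2 + \lambda \|\beta\|^2$ without rescaling by the sample size. The names `simulate_ols_vs_ridge` and `count_ridge_wins` are hypothetical.

```python
import numpy as np


def simulate_ols_vs_ridge(reps=2000, n=50, lam=0.005, rho=0.9999, seed=0):
    """Return (beta_star, ols, ridge); ols and ridge have shape (reps, 2)."""
    rng = np.random.default_rng(seed)
    beta_star = np.array([-1.0, 2.0])
    sigma_x = np.array([[1.0, rho], [rho, 1.0]])
    # Cholesky factor L with L @ L.T = sigma_x, used to correlate the columns.
    chol = np.linalg.cholesky(sigma_x)

    ols = np.empty((reps, 2))
    ridge = np.empty((reps, 2))
    for r in range(reps):
        # Rows of X are x_1, ..., x_n drawn from N(0, sigma_x).
        X = rng.standard_normal((n, 2)) @ chol.T
        # Assumption: Gaussian noise with mean 0 and variance 1.
        y = X @ beta_star + rng.standard_normal(n)

        gram = X.T @ X
        xty = X.T @ y
        ols[r] = np.linalg.solve(gram, xty)
        # Assumption: ridge solves (X^T X + lam * I) beta = X^T y.
        ridge[r] = np.linalg.solve(gram + lam * np.eye(2), xty)
    return beta_star, ols, ridge


def count_ridge_wins(beta_star, ols, ridge):
    """Count replicates where ridge is closer to beta_star[0] than OLS."""
    err_ols = np.abs(beta_star[0] - ols[:, 0])
    err_ridge = np.abs(beta_star[0] - ridge[:, 0])
    return int(np.sum(err_ridge < err_ols))
```

For part (a), the columns `ols[:, 0]` and `ridge[:, 0]` can be passed to any histogram routine, for example matplotlib's `plt.hist`. For part (b), `count_ridge_wins` returns the number of replicates in which ridge gives the smaller absolute error.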

2024-01-24
