IEOR E4525

Assignment 2

2022

All files referred to in this homework can be found on CourseWorks.

Your hand-in should be made on Gradescope.

For theory questions, you will need to write math. Please make sure to show your derivations. To see examples of how to write LaTeX code in Jupyter, see this link. You may also take pictures of handwritten solutions if you find that easier. In that case, it is your responsibility to ensure that the pictures and handwriting are sufficiently legible for grading.
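
As a small illustration of typing math in a Jupyter Markdown cell (a sketch that assumes the notebook's default math rendering; the formula is simply the objective from Section 3), display math goes between double dollar signs and inline math between single dollar signs:

```latex
$$
\max_{a} \; \frac{a^T B a}{a^T W a}
$$

Inline math looks like $a^T W a = 1$ inside a sentence.
```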

You must submit two files:

1. A file (i.e. a Jupyter notebook file or a Python script) with your data analysis for answering the questions.

2. A single PDF file with all your answers to all the questions, including the output of the above Jupyter notebook. You can create a PDF file of your notebook using the workflow described here, or you can insert screenshots of the notebook in another PDF. Alternatively, you can insert your answers to theoretical questions as part of your notebook.

1 ISLR Classification Lab

Complete the lab from Section 4.7 of ISLR. You can skip Part 4.7.7. Feel free to use the provided worked Jupyter notebook as inspiration.

2 Classification Models for Stock Market Data

Solve exercise 13 from Section 4.8 of ISLR.

The Weekly dataset can be found in the Data folder. A description of it can be found here on page 14.

3 Reduced-Rank LDA

Let $B$ and $W$ be symmetric positive definite matrices and consider the following problem of maximizing the Rayleigh quotient:

$$\max_{a} \; \frac{a^T B a}{a^T W a} \qquad (1)$$

1. Use the method of Lagrange multipliers to solve this problem. In particular, show that the optimal solution $a^*$ is an eigenvector of a certain matrix related to $B$ and $W$. What is this matrix, and which eigenvector does $a^*$ correspond to?

Hint: Use the scale invariance of the Rayleigh quotient to rewrite the unconstrained maximization as a constrained maximization problem where $B$ appears in the objective and $W$ appears in the constraint.
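
The scale invariance mentioned in the hint can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the matrices are random stand-ins for B and W, and the helper names `rayleigh_quotient` and `random_spd` are hypothetical, not part of the course material. It shows that rescaling a leaves the quotient unchanged, so a can always be normalized so that the denominator equals 1.

```python
import numpy as np


def rayleigh_quotient(a, B, W):
    """Return (a^T B a) / (a^T W a) for a nonzero vector a."""
    return (a @ B @ a) / (a @ W @ a)


def random_spd(d, rng):
    """Build a random symmetric positive definite d x d matrix (illustrative only)."""
    M = rng.standard_normal((d, d))
    return M @ M.T + d * np.eye(d)


rng = np.random.default_rng(0)
B = random_spd(3, rng)  # stand-in for B
W = random_spd(3, rng)  # stand-in for W
a = rng.standard_normal(3)

# The quotient is unchanged when a is multiplied by any nonzero scalar c.
q = rayleigh_quotient(a, B, W)
for c in (0.5, -2.0, 10.0):
    assert np.isclose(rayleigh_quotient(c * a, B, W), q)

# Rescale a so that the denominator a^T W a equals 1; the quotient
# then reduces to the numerator a^T B a alone.
a_unit = a / np.sqrt(a @ W @ a)
```

This only illustrates the property the hint relies on; the Lagrangian argument itself still has to be written out by hand.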

2. By identifying $B$ and $W$ with the between-class and within-class covariance matrices, we can interpret the problem in (1) as the problem of finding the linear combination $a^{*T} x$ that maximizes the between-class variance relative to the within-class variance. Show that $a^{*T} x$ is the first discriminant variable.

Hint: First note that $W = \Sigma$ from the lecture slides, and that $B^* = D^{-1/2} U^T B U D^{-1/2}$.
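
The transformation in the hint can be explored numerically. The sketch below assumes that U and D come from the eigendecomposition W = U D U^T (the hint does not define them, so this is an assumption based on the usual sphering construction) and uses random stand-in matrices. It checks that the same change of coordinates that produces B* turns W into the identity.

```python
import numpy as np


def random_spd(d, rng):
    """Build a random symmetric positive definite d x d matrix (illustrative only)."""
    M = rng.standard_normal((d, d))
    return M @ M.T + d * np.eye(d)


rng = np.random.default_rng(1)
W = random_spd(4, rng)  # stand-in for the within-class covariance
B = random_spd(4, rng)  # stand-in for the between-class covariance

# Assumed definition: W = U D U^T with orthogonal U and diagonal D.
eigvals, U = np.linalg.eigh(W)
D_inv_sqrt = np.diag(eigvals ** -0.5)

# T maps x to the transformed coordinates x* = D^{-1/2} U^T x.
T = D_inv_sqrt @ U.T

W_star = T @ W @ T.T  # equals the identity up to rounding
B_star = T @ B @ T.T  # equals D^{-1/2} U^T B U D^{-1/2}
```

In the transformed coordinates the within-class covariance is the identity, which is what makes B* the natural matrix to study.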

4 Logistic Regression

1. Show that binary classification using logistic regression yields a linear classifier.

Consider a naive Bayes classifier for a binary classification problem where all the class-conditional distributions are assumed to be Gaussian, with the variance of each feature $X_j$ being equal across the two classes. That is, we assume $(X_j \mid Y = k) \sim N(\mu_{jk}, \sigma_j^2)$.

2. Show that the decision boundary is a linear function of $X = (X_1, \ldots, X_d)$ and hence that it has the same parametric form as the decision boundary given by logistic regression.

3. Does the result of part (2) imply that in this case, Gaussian naive Bayes and logistic regression will find the same decision boundary? Justify your answer.

4. If indeed the class-conditional distributions are Gaussian with $(X_j \mid Y = k) \sim N(\mu_{jk}, \sigma_j^2)$ and the assumptions of naive Bayes are true, which classifier do you think will be "better": the naive Bayes classifier of part (2) or logistic regression? Justify your answer.
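
For experimenting with parts (3) and (4), it can help to simulate data from the model stated above. The following is a minimal sketch, assuming NumPy; the function name `sample_gaussian_nb`, the class prior, and all parameter values are hypothetical choices made for illustration only. Features are drawn independently given the class, with a standard deviation per feature that is shared by both classes.

```python
import numpy as np


def sample_gaussian_nb(n, prior1, mu, sigma, rng):
    """Draw n labelled points from the shared-variance Gaussian naive Bayes model.

    mu:    array of shape (2, d), mu[k, j] is the mean of X_j given Y = k
    sigma: array of shape (d,), sigma[j] is the std of X_j in both classes
    """
    # Class labels: Y = 1 with probability prior1, otherwise Y = 0.
    y = (rng.random(n) < prior1).astype(int)
    # Independent Gaussian features centred at the class mean.
    X = mu[y] + sigma * rng.standard_normal((n, mu.shape[1]))
    return X, y


rng = np.random.default_rng(0)
mu = np.array([[0.0, 0.0], [1.0, -1.0]])  # hypothetical class means
sigma = np.array([1.0, 2.0])              # hypothetical shared std devs
X, y = sample_gaussian_nb(5000, 0.5, mu, sigma, rng)

# Empirical class means should be close to the rows of mu.
mean0 = X[y == 0].mean(axis=0)
mean1 = X[y == 1].mean(axis=0)
```

Fitting both classifiers to samples of different sizes drawn this way is one way to build intuition before writing the justification.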

5 Bootstrap Probabilities