
ECOM151 Big Data Applications for Finance

1. Elements of Statistical Learning.

(a)  (5 points) We show in class how the Mean Squared Error $E(Y - \hat{Y})^2$ can be decomposed into a “reducible” and an “irreducible” error term. Under which circumstances is the reducible error zero?
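
For reference, a minimal statement of the decomposition (assuming the standard setup $Y = f(X) + \varepsilon$ with $E[\varepsilon] = 0$ and $\varepsilon$ independent of $X$, with $\hat{Y} = \hat{f}(X)$; the notation in the lecture slides may differ):

$$E\big[(Y - \hat{Y})^2\big] = \underbrace{\big[f(X) - \hat{f}(X)\big]^2}_{\text{reducible}} + \underbrace{\operatorname{Var}(\varepsilon)}_{\text{irreducible}}$$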

(b)  (5 points) Can the “irreducible” error term be zero? Why?

(c)  (5 points) Highlight the pros and cons of parametric vs non-parametric machine learning methods and the trade-off between accuracy vs interpretability. Give some examples of parametric and non-parametric machine learning methods.

(d)  (5 points) By using the definition of the Mean Squared Error, describe the bias-variance tradeoff. Provide equations where appropriate.
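
For reference, a standard form of the tradeoff at a test point $x_0$ (assuming $Y = f(X) + \varepsilon$ with $\operatorname{Var}(\varepsilon) = \sigma^2$):

$$E\big[\big(y_0 - \hat{f}(x_0)\big)^2\big] = \operatorname{Var}\big(\hat{f}(x_0)\big) + \big[\operatorname{Bias}\big(\hat{f}(x_0)\big)\big]^2 + \sigma^2$$

More flexible methods typically lower the bias term but raise the variance term, and vice versa.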

Total for Question 1: 20

2. Penalised regressions.

(a)  (5 points) What do we mean by “regularization”? When is it helpful for linear regression models? Why?

(b)  (5 points) Describe the difference between the Ridge and the Lasso penalized regression models. Provide equations where appropriate.
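
As a reference point, in standard notation (the parameterisation in the lecture slides may differ), the two estimators solve:

$$\hat{\beta}^{\,\text{ridge}} = \arg\min_{\beta}\; \sum_{i=1}^{n}\Big(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2 + \lambda \sum_{j=1}^{p} \beta_j^{2}$$

$$\hat{\beta}^{\,\text{lasso}} = \arg\min_{\beta}\; \sum_{i=1}^{n}\Big(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2 + \lambda \sum_{j=1}^{p} \lvert\beta_j\rvert$$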

(c)  (5 points) Give the main intuition behind LASSO model selection. Is the Ordinary Least Squares regression a special case of the LASSO? Provide equations where appropriate.
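
The $\lambda = 0$ case can be checked directly in code. Below is a minimal sketch (scikit-learn on synthetic data, not the course dataset; variable names are illustrative) showing that the LASSO drives some coefficients exactly to zero as the penalty grows, and that removing the penalty recovers plain OLS:

```python
# Minimal sketch: LASSO coefficients shrink to exactly zero as the penalty
# grows; with no penalty the problem reduces to ordinary least squares.
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=200)  # only 2 true predictors

for alpha in [0.0, 0.1, 1.0]:  # alpha is scikit-learn's name for lambda
    # alpha = 0 means no penalty, i.e. plain OLS
    model = LinearRegression() if alpha == 0 else Lasso(alpha=alpha)
    model.fit(X, y)
    print(f"alpha={alpha}: coefficients {np.round(model.coef_, 2)}")
```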

(d)  (5 points) What is the main difference between the “forward” and “backward” stepwise regressions?
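
A minimal sketch of the two search directions, using scikit-learn's SequentialFeatureSelector on synthetic data (note this selector scores subsets by cross-validation, whereas classical stepwise regression typically uses in-sample criteria such as AIC, BIC, or adjusted R²):

```python
# Forward selection starts from the empty model and adds one predictor at a
# time; backward selection starts from the full model and removes one at a time.
from sklearn.datasets import make_regression
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

X, y = make_regression(n_samples=200, n_features=8, n_informative=3, random_state=0)

for direction in ["forward", "backward"]:
    sfs = SequentialFeatureSelector(
        LinearRegression(), n_features_to_select=3, direction=direction
    )
    sfs.fit(X, y)
    print(direction, sfs.get_support())  # boolean mask of the selected predictors
```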

Total for Question 2: 20

3. Regression trees.

(a)  (5 points) We discussed in class two alternative loss functions for splitting in regression trees, in addition to the classification error rate. What are these methods?
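
Assuming the question refers to the standard alternatives presented alongside the classification error rate, these are the Gini index and the cross-entropy (deviance). With $\hat{p}_{mk}$ the proportion of class $k$ observations in region $m$:

$$G_m = \sum_{k=1}^{K} \hat{p}_{mk}\,(1 - \hat{p}_{mk}), \qquad D_m = -\sum_{k=1}^{K} \hat{p}_{mk}\,\log \hat{p}_{mk}$$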

(b)  (5 points) What is the main difference between “regression” and “classification” trees?

(c)  (5 points) Can we apply regularization penalties when calibrating a regression tree? If yes, which feature of the tree should we penalize? If not, why does penalization not help?
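
One concrete form discussed in the tree literature is cost-complexity pruning, where the penalty applies to the number of terminal nodes $|T|$ of a subtree $T \subset T_0$:

$$\min_{T}\; \sum_{m=1}^{|T|} \sum_{i:\, x_i \in R_m} \big(y_i - \hat{y}_{R_m}\big)^2 + \alpha\,|T|$$

Here a larger $\alpha$ trades fit for a smaller tree, in direct analogy with $\lambda$ in penalized regression.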

(d)  (5 points) Based on the evidence seen in class, do decision trees outperform the forecast from a simple rolling mean when using multiple predictors? Also, do trees outperform the LASSO when using multiple predictors?

Total for Question 3: 20