MATH 304 - Numerical Analysis and Optimization Project
2022
Topic: Least Squares Regression
Tasks:
Build a polynomial model to fit the given data sets.
$f(x) = a_n x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + \dots + a_2 x^2 + a_1 x + a_0$
When solving for the model coefficients, you can add a penalty term to the cost function using L2-norm regularization. In this case, the cost function takes the form $\|A\alpha - B\|_2^2 + \lambda \|\alpha\|_2^2$. Write MATLAB code that builds the over-determined linear system with the regularization term taken into account. You can solve for the model coefficients using the backslash operator “\” in MATLAB.
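One common way to fold the penalty into an over-determined system is to stack $\sqrt{\lambda}\,I$ under $A$ and a block of zeros under $B$, since $\|A\alpha - B\|_2^2 + \lambda\|\alpha\|_2^2$ equals the squared residual of that stacked system. The following is a minimal sketch of this idea, not a prescribed solution. The synthetic data, the degree, and the value of λ are assumptions chosen for illustration, because the handout does not describe the variables stored in the .mat files.

```matlab
% Sketch: regularized polynomial least squares via a stacked system.
% The data below is synthetic (an assumption); it stands in for the
% course data files, whose variable names are not given in the handout.
x = linspace(-1, 1, 20)';          % sample inputs (column vector)
B = 1 + 2*x - 3*x.^2;              % noiseless quadratic targets
n = 2;                             % polynomial degree
lambda = 1e-3;                     % regularization weight

% Design matrix with columns x.^n, ..., x.^1, x.^0, so that
% alpha = [a_n; ...; a_1; a_0].
A = zeros(numel(x), n + 1);
for k = 0:n
    A(:, n + 1 - k) = x.^k;
end

% Minimizing ||A*alpha - B||^2 + lambda*||alpha||^2 is the ordinary
% least-squares problem for the stacked system below.
A_aug = [A; sqrt(lambda) * eye(n + 1)];
B_aug = [B; zeros(n + 1, 1)];
alpha = A_aug \ B_aug;             % backslash gives the least-squares solution

% Cross-check against the regularized normal equations.
alpha_ne = (A' * A + lambda * eye(n + 1)) \ (A' * B);
disp(max(abs(alpha - alpha_ne)));
```

With λ = 0 the extra rows are all zero, so the same code reduces to plain least squares. The stacked form avoids forming A'A explicitly, which tends to be better conditioned for high polynomial degrees.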
(1) Try different models (n = 1, 2, 3, …, 9) without regularization (λ = 0) on the training data set “SmallData.mat”, and evaluate the error on “TestData.mat”.
(a) Fill in the following table with the training error and test error of each model. The error is the mean squared error (i.e., the sum of squared errors divided by the number of samples).
Model | n = 1 | n = 2 | n = 3 | n = 4 | n = 5 | n = 6 | n = 7 | n = 8 | n = 9 |
Training Error | | | | | | | | | |
Test Error | | | | | | | | | |
(b) Plot all data (training data and test data with different colors);
(c) Plot two fitted models in the same figure as (b): the model with the smallest training error and the model with the smallest test error. If they are the same model, plot only that model;
(d) Print the coefficients of the models plotted in task (c).
(2) Repeat task (1) using the training data set “LargeData.mat”, and show the results.
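The degree sweep in tasks (1) and (2) can be sketched as a loop that fits each model on the training set and scores it on both sets. This is only an illustration under stated assumptions: the data is synthetic, and the names `design`, `mse`, `trainErr`, and `testErr` are hypothetical helpers, not part of the handout.

```matlab
% Sketch: training and test mean squared error for degrees n = 1..9.
% Synthetic data is an assumption; the variables stored in the course
% .mat files are not described in the handout.
xTrain = linspace(-1, 1, 15)';
yTrain = sin(pi * xTrain) + 0.1 * cos(40 * xTrain);   % deterministic "noise"
xTest  = linspace(-0.95, 0.95, 40)';
yTest  = sin(pi * xTest);

% Column j of the design matrix holds x.^(n+1-j), so alpha = [a_n; ...; a_0].
design = @(x, n) bsxfun(@power, x, n:-1:0);
mse    = @(A, alpha, y) mean((A * alpha - y).^2);

degrees  = 1:9;
trainErr = zeros(size(degrees));
testErr  = zeros(size(degrees));
for i = 1:numel(degrees)
    n = degrees(i);
    alpha = design(xTrain, n) \ yTrain;     % lambda = 0: plain least squares
    trainErr(i) = mse(design(xTrain, n), alpha, yTrain);
    testErr(i)  = mse(design(xTest,  n), alpha, yTest);
end

[~, bestTrain] = min(trainErr);
[~, bestTest]  = min(testErr);
fprintf('n = %d: train %.3e, test %.3e\n', [degrees; trainErr; testErr]);
fprintf('smallest training error: n = %d, smallest test error: n = %d\n', ...
        degrees(bestTrain), degrees(bestTest));
```

Because the polynomial models are nested, the training error cannot increase with n, so the smallest training error is expected at the highest degree. The test error is the quantity that reveals overfitting.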
(3) Use the model (n = 9) from task (1) with different regularization weights ($\lambda = 10^{-6}$, $10^{-3}$, $1$, $10^{3}$, and $10^{6}$).
(a) Fill in the following table, and repeat parts (b), (c), and (d) of task (1).
Weight | $\lambda = 10^{-6}$ | $\lambda = 10^{-3}$ | $\lambda = 1$ | $\lambda = 10^{3}$ | $\lambda = 10^{6}$ |
Training Error | | | | | |
Test Error | | | | | |
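To see how the weight changes a degree-9 fit, one option is to sweep λ and record both errors together with the coefficient norm. The sketch below assumes synthetic data and a hypothetical `ridge` helper that solves the stacked system $[A; \sqrt{\lambda} I]\,\alpha = [B; 0]$ with backslash. It illustrates the trend and is not a reference solution.

```matlab
% Sketch: effect of the regularization weight on a degree-9 fit.
% Synthetic data is an assumption standing in for the course data files.
xTrain = linspace(-1, 1, 15)';
yTrain = sin(pi * xTrain) + 0.1 * cos(40 * xTrain);
xTest  = linspace(-0.95, 0.95, 40)';
yTest  = sin(pi * xTest);

n = 9;
design = @(x) bsxfun(@power, x, n:-1:0);
% Ridge solve through the stacked least-squares system.
ridge  = @(A, B, lambda) ...
    [A; sqrt(lambda) * eye(size(A, 2))] \ [B; zeros(size(A, 2), 1)];

lambdas  = [1e-6, 1e-3, 1, 1e3, 1e6];
trainErr = zeros(size(lambdas));
testErr  = zeros(size(lambdas));
coefNorm = zeros(size(lambdas));
for i = 1:numel(lambdas)
    alpha = ridge(design(xTrain), yTrain, lambdas(i));
    trainErr(i) = mean((design(xTrain) * alpha - yTrain).^2);
    testErr(i)  = mean((design(xTest)  * alpha - yTest).^2);
    coefNorm(i) = norm(alpha);
end
fprintf('lambda = %g: train %.3e, test %.3e, ||alpha|| = %.3e\n', ...
        [lambdas; trainErr; testErr; coefNorm]);
```

As λ grows, the coefficient norm shrinks and the training error rises. Very large weights push all coefficients toward zero, so the fit underfits both data sets.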
(4) Use 5-fold cross validation to choose the best regularization weight from the candidates ($\lambda = 10^{-6}$, $10^{-3}$, $1$, $10^{3}$, and $10^{6}$) for the model (n = 9) with the training data set “LargeData.mat”.
(a) Show the average validation error for each regularization weight. (In each round, 4 folds are used for training and the remaining fold is used for validation.)
(b) Using the regularization weight with the smallest validation error in (a), train the model again on all the training data.
(c) Print the model coefficients in (b), the best regularization weight, and the test error.
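The cross-validation procedure in task (4) can be sketched as two nested loops, one over the candidate weights and one over the folds. The data here is synthetic, the round-robin fold assignment is one possible choice among several, and `ridge`, `design`, and `fold` are hypothetical names introduced for the example.

```matlab
% Sketch: 5-fold cross validation over the regularization weight.
% Synthetic data and the round-robin fold assignment are assumptions.
x = linspace(-1, 1, 50)';
y = sin(pi * x) + 0.1 * cos(40 * x);

n = 9;
K = 5;
lambdas = [1e-6, 1e-3, 1, 1e3, 1e6];
design  = @(x) bsxfun(@power, x, n:-1:0);
ridge   = @(A, B, lambda) ...
    [A; sqrt(lambda) * eye(size(A, 2))] \ [B; zeros(size(A, 2), 1)];

% Assign samples to folds 1..K in round-robin order. If the samples are
% stored in a meaningful order, shuffling first (e.g. randperm) is usual.
fold = mod((0:numel(x) - 1)', K) + 1;

valErr = zeros(size(lambdas));
for i = 1:numel(lambdas)
    err = zeros(K, 1);
    for k = 1:K
        isVal = (fold == k);                      % 1 fold for validation
        alpha = ridge(design(x(~isVal)), y(~isVal), lambdas(i));  % 4 folds train
        err(k) = mean((design(x(isVal)) * alpha - y(isVal)).^2);
    end
    valErr(i) = mean(err);                        % average validation error
end

[~, best]  = min(valErr);
bestLambda = lambdas(best);
alphaFinal = ridge(design(x), y, bestLambda);     % retrain on all training data
fprintf('lambda = %g: average validation error %.3e\n', [lambdas; valErr]);
fprintf('best lambda = %g\n', bestLambda);
disp(alphaFinal');
```

The test set plays no role in choosing λ in this sketch. It would only be used once at the end, to report the error of the retrained model.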
Submission:
(1) The MATLAB code files (.m files), compressed as “MATH304-YourName-Proj-Code.rar” (or .zip):
(a) The least squares regression function, saved in a file named “LSR-YourNetId.m”; (3 marks)
(b) The main code for task 1, task 2 (3 marks), task 3 (2 marks), and task 4 (2 marks), saved under the file names “Main1-YourNetId.m”, “Main2-YourNetId.m”, “Main3-YourNetId.m”, and “Main4-YourNetId.m”, respectively.
(c) If you have any other functions, save each one separately as “FunctionName-YourNetId.m”.
(2) One final report, “MATH304-YourName-Proj-Report.pdf”, following the template. (10 marks)
(3) One set of presentation slides, “MATH304-YourName-Proj-Presentation.pdf”. (10 marks)
2022-10-07