STA 5208 Homework 1
1 Simulation example
Construct a numerical simulation example to illustrate the difference between the following two procedures for comparing OLS (ordinary least squares), PLS (partial least squares), or other estimators. In this simulation example, please first generate the data under the multiple linear regression model.
1.1 Procedure 1
1. Center the data. Replace $X_i$ by $X_i - \bar{X}$ and $Y_i$ by $Y_i - \bar{Y}$ for $i = 1, \ldots, n$, where $\bar{X} = n^{-1} \sum_{i=1}^{n} X_i$ and $\bar{Y} = n^{-1} \sum_{i=1}^{n} Y_i$.
2. Data splitting. Split the data into 50%-50% training and testing groups, with index sets labelled [Tr] and [Te] that form a partition of {1, 2, . . . , n}.
3. Estimation. Obtain $\hat{\beta}_{tr}$ from the training set.
4. Evaluation. Compute $\mathrm{PE} = \sum_{i \in [Te]} (y_i - \hat{y}_i)^2$, where the predictor $\hat{y}_i = \hat{\beta}_{tr}^T x_i$.
5. Comparison. Apply steps 3 and 4 to different methods (OLS, PLS, LASSO, etc.) and compare the prediction errors.
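The steps of Procedure 1 can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not a reference solution: the sample size, dimension, intercept, covariate shift, and random seed are arbitrary choices, and ridge regression with an untuned penalty `lam` stands in for the "other estimators" because it needs nothing beyond NumPy. The helper names `fit_ols` and `fit_ridge` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data from a multiple linear regression model.
# n, p, the intercept 2.0 and the covariate shift 3.0 are arbitrary choices.
n, p = 200, 5
beta = rng.normal(size=p)
X = rng.normal(size=(n, p)) + 3.0
y = 2.0 + X @ beta + rng.normal(size=n)

# Step 1: center with the full-sample means (before splitting).
Xc = X - X.mean(axis=0)
yc = y - y.mean()

# Step 2: random 50%-50% split into [Tr] and [Te].
perm = rng.permutation(n)
tr, te = perm[: n // 2], perm[n // 2:]


# Step 3: estimators fitted on the training half, without an intercept.
def fit_ols(X_tr, y_tr):
    coef, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)
    return coef


def fit_ridge(X_tr, y_tr, lam=1.0):
    # lam is a hypothetical, untuned penalty used only for illustration.
    k = X_tr.shape[1]
    return np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(k), X_tr.T @ y_tr)


# Steps 4-5: prediction error on the test half for each method.
pe = {}
for name, fit in [("OLS", fit_ols), ("ridge", fit_ridge)]:
    beta_tr = fit(Xc[tr], yc[tr])
    pe[name] = float(np.sum((yc[te] - Xc[te] @ beta_tr) ** 2))

print(pe)
```

Note that the centering in step 1 uses all n observations, so the test half contributes to the means that the training fit relies on.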
1.2 Procedure 2
1. Data splitting. Split the data into 50%-50% training and testing groups, with index sets labelled [Tr] and [Te] that form a partition of {1, 2, . . . , n}.
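2. Center the partitioned data. $\hat{\mu}_{x,tr} = n_{tr}^{-1} \sum_{i \in [Tr]} x_i$, $\hat{\mu}_{y,tr} = n_{tr}^{-1} \sum_{i \in [Tr]} y_i$. Center both the training and
4. Prediction error. Compute $\mathrm{PE} = \sum_{i \in [Te]} \{ y_i - \hat{\mu}_{y,tr} - \hat{\beta}_{tr}^T (x_i - \hat{\mu}_{x,tr}) \}^2$, where the predictor $\hat{y}_i = \hat{\beta}_{tr}^T x_i$.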
5. Comparison. Apply steps 3 and 4 to different methods and compare the prediction errors.
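A minimal sketch of Procedure 2 for OLS follows, assuming the same kind of simulated data as one might use for Procedure 1. The function name `procedure2_pe`, the simulation sizes, and the seed are illustrative assumptions. The point it shows is that only training means enter the centering, and that those same training means are subtracted from the test observations when the prediction error is computed.

```python
import numpy as np


def procedure2_pe(X, y, tr, te):
    """Prediction error when centering uses the training means only."""
    x_mean_tr = X[tr].mean(axis=0)
    y_mean_tr = y[tr].mean()
    # OLS without intercept on the training data centered by training means.
    beta_tr, *_ = np.linalg.lstsq(X[tr] - x_mean_tr, y[tr] - y_mean_tr, rcond=None)
    # Test residuals use the training means, never the test means.
    resid_te = (y[te] - y_mean_tr) - (X[te] - x_mean_tr) @ beta_tr
    return float(np.sum(resid_te ** 2))


rng = np.random.default_rng(0)
n, p = 200, 5  # arbitrary sizes
X = rng.normal(size=(n, p)) + 3.0
y = 2.0 + X @ rng.normal(size=p) + rng.normal(size=n)

# Split first, then center inside procedure2_pe.
perm = rng.permutation(n)
tr, te = perm[: n // 2], perm[n // 2:]

pe_ols = procedure2_pe(X, y, tr, te)
print(pe_ols)
```

For OLS specifically, this gives the same test predictions as fitting a model with an intercept on the uncentered training data, which makes a convenient sanity check for an implementation.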
2 Invariance of OLS
Assume that $X^T X$ is non-degenerate and $\Gamma$ is a $p \times p$ orthogonal matrix. Define $\tilde{X} = X\Gamma$. From the OLS fit of Y
$\hat{\beta} = \Gamma \tilde{\beta}$, $\hat{Y} = \tilde{Y}$, $\hat{\varepsilon} = \tilde{\varepsilon}$
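Before attempting a proof, it can help to check numerically what happens when Y is regressed on the transformed design matrix XΓ instead of X. The sketch below is an illustration only: the dimensions and seed are arbitrary, the random orthogonal Γ is taken from a QR decomposition, and the variable names (`X_rot`, `beta_rot`) are assumptions of this example. It compares the fitted values of the two regressions and checks how the two coefficient vectors are related through Γ.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 4  # arbitrary sizes
X = rng.normal(size=(n, p))
Y = X @ rng.normal(size=p) + rng.normal(size=n)

# A random orthogonal matrix: the Q factor of a Gaussian matrix.
Gamma, _ = np.linalg.qr(rng.normal(size=(p, p)))
X_rot = X @ Gamma

# OLS fits of Y on X and of Y on X @ Gamma.
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
beta_rot, *_ = np.linalg.lstsq(X_rot, Y, rcond=None)

# Same fitted values (hence same residuals) from both fits.
same_fit = np.allclose(X @ beta_hat, X_rot @ beta_rot)
# Coefficients from the original fit equal Gamma times the transformed ones.
same_coef = np.allclose(beta_hat, Gamma @ beta_rot)

print(same_fit, same_coef)
```

A numerical check like this does not replace the algebraic argument, but it catches a transposed or inverted Γ quickly.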
3 Multivariate Linear Model
For each unit $i = 1, \ldots, n$, we have multiple responses $y_i = (y_{i1}, \ldots, y_{iq})^T \in \mathbb{R}^q$ and multiple covariates $x_i = (x_{i1}, \ldots, x_{ip})^T \in \mathbb{R}^p$. Please carefully define the centered data matrices $Y \in \mathbb{R}^{n \times q}$ and $X \in \mathbb{R}^{n \times p}$, and the OLS objective function $L(B)$ to be minimized. Express the OLS estimator $\hat{B} = (\hat{\beta}_1, \ldots, \hat{\beta}_q)$ in terms of the observed
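As a numerical companion to this problem, the sketch below simulates a small multivariate linear model and computes a least-squares coefficient matrix in three ways. It rests on an assumption that the problem leaves to the reader: the objective is taken to be the squared Frobenius norm of Y - XB. The sizes, seed, and names (`B_true`, `B_normal`, `B_cols`) are likewise illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, q = 100, 3, 2  # arbitrary sizes
X_raw = rng.normal(size=(n, p))
B_true = rng.normal(size=(p, q))
Y_raw = 1.0 + X_raw @ B_true + rng.normal(size=(n, q))

# Centered data matrices: subtract column means.
X = X_raw - X_raw.mean(axis=0)
Y = Y_raw - Y_raw.mean(axis=0)


def loss(B):
    # Assumed objective: squared Frobenius norm of the residual matrix.
    return float(np.sum((Y - X @ B) ** 2))


# (a) Least squares with a matrix right-hand side.
B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
# (b) Solving the normal equations X'X B = X'Y.
B_normal = np.linalg.solve(X.T @ X, X.T @ Y)
# (c) Separate OLS fits, one response column at a time.
B_cols = np.column_stack(
    [np.linalg.lstsq(X, Y[:, j], rcond=None)[0] for j in range(q)]
)

print(np.allclose(B_hat, B_normal), np.allclose(B_hat, B_cols))
```

Under the assumed objective the three computations agree, which reflects that the criterion separates across the q response columns.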
4 Matrix derivatives
Under the same MLM as in Problem 3, let L(B) denote the objective function to be minimized.
4.1 Derivation
Calculate the derivative $D = \partial L(B) / \partial B$, which should be a $p \times q$ matrix.
4.2 Simulation
Simulate a data set under this MLM and numerically compute the derivative as $D_{ij} = \{L(B + \delta 1_{ij}) - L(B)\}/\delta$ for $i = 1, \ldots, p$ and $j = 1, \ldots, q$, for a small $\delta$, say 0.0001. Compare this answer to your analytical solution in the previous part of the problem.
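The finite-difference recipe above can be sketched as follows. Two assumptions are made here and should be checked against your own definitions: the objective is taken to be the squared Frobenius norm of Y - XB, whose gradient is then -2 X^T (Y - XB), and 1_ij is read as the p × q matrix with a 1 in entry (i, j) and 0 elsewhere. The simulation sizes, seed, and function names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, q = 100, 3, 2  # arbitrary sizes
X = rng.normal(size=(n, p))
Y = X @ rng.normal(size=(p, q)) + rng.normal(size=(n, q))
X = X - X.mean(axis=0)
Y = Y - Y.mean(axis=0)


def loss(B):
    # Assumed objective: squared Frobenius norm of Y - X B.
    return float(np.sum((Y - X @ B) ** 2))


def analytic_grad(B):
    # Gradient of the assumed objective with respect to B (p x q).
    return -2.0 * X.T @ (Y - X @ B)


def numeric_grad(B, delta=1e-4):
    # Forward differences, one entry of B at a time.
    D = np.zeros_like(B)
    base = loss(B)
    for i in range(B.shape[0]):
        for j in range(B.shape[1]):
            E = np.zeros_like(B)
            E[i, j] = 1.0  # the matrix 1_ij
            D[i, j] = (loss(B + delta * E) - base) / delta
    return D


delta = 1e-4
B0 = rng.normal(size=(p, q))  # arbitrary evaluation point
D_num = numeric_grad(B0, delta)
D_ana = analytic_grad(B0)
max_abs_diff = float(np.max(np.abs(D_num - D_ana)))
print(max_abs_diff)
```

Because the assumed objective is quadratic in B, the forward difference overshoots the analytical gradient by δ times the i-th diagonal entry of X^T X in every entry of row i, so a discrepancy of roughly δ·n is expected rather than a sign of error. A central difference would remove that term.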
2022-09-12