Empirical Finance: Methods and Applications Assignment 2 2022
Assignment 2 Resit
Empirical Finance: Methods and Applications
August 18, 2022
● You should submit a single PDF solution containing answers to all sub-parts of all problems. Typewritten solutions are preferred, but handwritten and scanned solutions are acceptable. You may use R Markdown, LaTeX, or any other software to prepare your solution, but please submit a PDF.
● Marks for each problem are listed below. Within each problem sub-parts are equally weighted.
● In addition, please submit code for problems 4-5 in the form of an R project. This should be a zipped folder that contains an R Project and a single R file with answers to all relevant parts of all problems. I should be able to download and run your R file directly. Please comment your code to make it as easy to interpret as possible.
● Your marks depend on clarity of exposition in solutions and code. This includes figures and regression results.
Problem 1 (20 Marks)
Consider the following model:
$$y_i = X_i'\beta + \varepsilon_i.$$
We defined the objective function for RIDGE as:
$$\hat{\beta}^{RIDGE} = \arg\min_{b} \sum_{i=1}^{N} (y_i - X_i' b)^2 \quad \text{subject to} \quad \sum_{k=1}^{K} b_k^2 \le c$$
or alternatively:
$$\hat{\beta}^{RIDGE} = \arg\min_{b} \sum_{i=1}^{N} (y_i - X_i' b)^2 + \lambda \sum_{k=1}^{K} b_k^2.$$
Derive the solution for $\hat{\beta}^{RIDGE}$ and show that $\hat{\beta}^{RIDGE}$ is a biased estimator of $\beta$.
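As a numerical companion to the derivation, the penalized form of the objective can be minimized directly to see how the estimates shrink as λ grows. This is only a minimal sketch: the sample size, the "true" coefficients, the seed, and the grid of λ values are all hypothetical choices made for illustration, and it does not replace the analytical derivation the problem asks for.

```r
# Minimal sketch: minimize the penalized ridge objective numerically.
# All data and parameter values below are hypothetical illustrations.
set.seed(1)                                  # arbitrary seed

N    <- 200
K    <- 3
X    <- matrix(rnorm(N * K), nrow = N)       # simulated regressors
beta <- c(1, -2, 0.5)                        # hypothetical true coefficients
y    <- as.vector(X %*% beta + rnorm(N))

# Sum of squared residuals plus lambda times the sum of squared coefficients
ridge_objective <- function(b, lambda) {
  sum((y - X %*% b)^2) + lambda * sum(b^2)
}

# Numerical minimizer of the objective for a given lambda
ridge_numeric <- function(lambda) {
  optim(par = rep(0, K), fn = ridge_objective,
        lambda = lambda, method = "BFGS")$par
}

lambdas   <- c(0, 10, 100, 1000)
estimates <- t(sapply(lambdas, ridge_numeric))   # one row per lambda
rownames(estimates) <- paste0("lambda=", lambdas)
print(round(estimates, 3))
```

With λ = 0 the minimizer is the OLS estimate; larger values of λ pull the coefficients toward zero, which is the source of the bias the problem asks you to show.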
Problem 2 (10 Marks)
Suppose we are interested in the relationship between $y_i$ and $x_i$, where
$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$$
However, we only observe the variable $y_i$ when $y_i$ is less than or equal to a threshold $c$ (that is, when $y_i \le c$), and we observe nothing if $y_i > c$. Suppose that $\varepsilon_i \mid x_i$ is a random variable with pdf $g(\cdot)$ and cdf $G(\cdot)$.
(a) What is the term for data that is restricted in this form?
(b) Suppose we observe $n$ independent draws of $(y_i, x_i)$ and that $\varepsilon_i \mid x_i \sim N(0, \sigma^2)$. Write the log-likelihood as a function of the observed data and the unknown parameters of the model.
Problem 3 (30 Marks)
In this problem you will simulate and estimate a censored regression model:
$$y_i^* = \beta_0 + \beta_1 x_i + v_i$$
$$y_i = \max(y_i^*, c_i)$$
$$v_i \mid x_i, c_i \sim N(0, \sigma^2)$$
(a) Set a seed in R using the following command: set.seed(123). Now simulate 1000 draws of the uncensored data $y_i^*$ using parameters $\beta_0 = -0.6$, $\beta_1 = -1.2$, and $\sigma = 1.2$. Draw the data $x_i$ as normal with mean 0 and variance 1. Create a scatter plot of the uncensored $y_i^*$ against $x_i$. Estimate an OLS regression of $y_i^*$ on $x_i$ and report $\hat{\beta}_0^{OLS}$ and $\hat{\beta}_1^{OLS}$.
(b) Now censor the data at $c = 0$. Create a scatter plot of the censored data $y_i$ against $x_i$. Estimate an OLS regression of $y_i$ on $x_i$. How do the coefficients compare to the values of $\beta_0$ and $\beta_1$ that generated the data?
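The simulation and censoring steps in parts (a) and (b) could be set up along the following lines. This is a rough sketch rather than a model answer: it uses the seed and parameter values given in part (a), applies the censoring rule exactly as written in the model definition for this problem ($y_i = \max(y_i^*, c_i)$ with $c_i = 0$), and all variable names are illustrative.

```r
# Sketch for parts (a)-(b): simulate, censor, and compare OLS fits.
set.seed(123)                       # seed specified in part (a)

n     <- 1000
beta0 <- -0.6
beta1 <- -1.2
sigma <- 1.2
c_val <- 0                          # censoring point from part (b)

x      <- rnorm(n, mean = 0, sd = 1)
v      <- rnorm(n, mean = 0, sd = sigma)
y_star <- beta0 + beta1 * x + v     # uncensored outcome
y      <- pmax(y_star, c_val)       # censored outcome, y_i = max(y_i^*, c_i)

ols_uncensored <- lm(y_star ~ x)
ols_censored   <- lm(y ~ x)

# Side-by-side comparison of true and estimated coefficients
coef_table <- rbind(true       = c(beta0, beta1),
                    uncensored = coef(ols_uncensored),
                    censored   = coef(ols_censored))
colnames(coef_table) <- c("beta0", "beta1")
print(round(coef_table, 3))

# Scatter plots are drawn only in an interactive session
if (interactive()) {
  plot(x, y_star, main = "Uncensored", xlab = "x", ylab = "y*")
  abline(ols_uncensored)
  plot(x, y, main = "Censored at c = 0", xlab = "x", ylab = "y")
  abline(ols_censored)
}
```

Comparing the rows of `coef_table` gives a quick view of how censoring distorts the OLS estimates relative to the data-generating values.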
(c) Write the log-likelihood function, and estimate the parameters of the model via MLE. Note that, in contrast to the probit, $\sigma$ is now a parameter to be estimated. I recommend using starting values [1; 1; 1]. Report your estimates of $\hat{\beta}_0^{MLE}$, $\hat{\beta}_1^{MLE}$, and $\hat{\sigma}^{MLE}$. Please use the following for your log-likelihood:
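$$l = \log(L(\beta_0, \beta_1, \sigma)) = \sum_{i=1}^{n} \log(f(y_i \mid x_i, c_i))$$
$$= \sum_{i=1}^{n} \mathbb{1}\{y_i \ge c_i\} \log\left[1 - \Phi\left(\frac{c_i - \beta_0 - \beta_1 x_i}{\sigma}\right)\right] + \mathbb{1}\{y_i < c_i\} \log\left[\frac{1}{\sigma}\phi\left(\frac{y_i - \beta_0 - \beta_1 x_i}{\sigma}\right)\right]$$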
(d) Re-simulate and re-estimate the model 100 times, storing your estimates of $\hat{\beta}_0^{MLE}$, $\hat{\beta}_1^{MLE}$, and $\hat{\sigma}^{MLE}$ each time. Report the mean and variance of these estimates. Create a histogram of your $\hat{\beta}_1^{MLE}$ estimates. Now simulate and estimate the model 1000 times and do the same.
Problem 4 (40 Marks)
Simulate data $x_{i,t}$ based upon the linear factor structure discussed in lecture:
$$x_{i,t} = \alpha_i + \beta_{1,i} f_{1,t} + \cdots + \beta_{K,i} f_{K,t} + \varepsilon_{i,t}.$$
Your simulation should satisfy the following criteria:
● m = 10: 10 different “assets.”
● T = 100: 100 periods.
● K = 3: Three factors.
● Set all αi to 0.
● Set the means of all factors $\mu_f$ to 0.
● The factor loadings for the first “asset” should be the first three non-zero digits of your CID divided by 10. For example, if your CID is 00946508, you would have loadings of $\beta_{1,1} = 0.9$, $\beta_{2,1} = 0.4$, and $\beta_{3,1} = 0.6$. You may choose the loadings for all other assets as you wish.
● All three factors must have non-zero variances and non-zero correlations with one another. In other words, $\Omega_f$ may not have any zero terms.
● The variances of $\varepsilon_{i,t}$ must not be the same for all $i$.
● All assumptions on $\mathrm{Cov}[\varepsilon_t] = \psi$ and the correlations of $\varepsilon_{i,t}$ over time must be as discussed in lecture.
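One way the criteria above might be turned into a simulation is sketched below. Several choices here are assumptions made purely for illustration: the seed, the entries of the factor covariance matrix, the random loadings for assets 2 to 10, the idiosyncratic variances, and in particular the treatment of ψ as diagonal with errors independent over time (the lecture assumptions are not restated in this sheet). The first asset uses the example CID loadings from the text.

```r
# Sketch of a factor-structure simulation; numeric choices are illustrative.
set.seed(2022)                      # arbitrary seed, not from the assignment

m     <- 10                         # number of assets
T_len <- 100                        # number of periods
K     <- 3                          # number of factors

# Factor covariance with no zero entries (illustrative values)
Omega_f <- matrix(c(1.0, 0.3, 0.2,
                    0.3, 0.8, 0.1,
                    0.2, 0.1, 0.5), nrow = K, byrow = TRUE)

# Loadings B (K x m); first asset uses the example CID 00946508
B <- matrix(runif(K * m, min = -1, max = 1), nrow = K)
B[, 1] <- c(0.9, 0.4, 0.6)

# Idiosyncratic variances differ across assets; psi assumed diagonal
psi_diag <- seq(0.1, 1.0, length.out = m)
Psi      <- diag(psi_diag)

# Mean-zero factors: rows of Z %*% chol(Omega_f) have covariance Omega_f
f   <- matrix(rnorm(T_len * K), nrow = T_len) %*% chol(Omega_f)

# Errors independent over time and across assets (assumed)
eps <- matrix(rnorm(T_len * m), nrow = T_len) %*% diag(sqrt(psi_diag))

# Simulated data (T x m) with all alphas equal to zero
x <- f %*% B + eps
```

Drawing the factors through the Cholesky factor avoids any package dependency; `MASS::mvrnorm` would be an equally reasonable choice.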
(a) Describe the parameters of your simulation, specifically $\Omega_f$, $\psi$, and $B$.
(b) Run time series regressions using your simulated values of $f_{k,t}$ and $x_{i,t}$ to estimate $\hat{B}$, the matrix of factor loadings for all assets. Show your estimated $\hat{B}$. How do your estimates compare to the loadings used to generate the data?
(c) Now use the matrix $B$ and your simulated $x_{i,t}$ to estimate the factor realizations $\hat{f}$ using the BARRA/GLS procedure discussed in lecture. For each factor, create a plot that includes the time series of both (i) your estimated $\hat{f}$ and (ii) the actual $f_{k,t}$ from your simulation.
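Parts (b) and (c) could be approached roughly as follows. The sketch is self-contained, so it re-simulates a small data set with illustrative parameter values (none of which come from the assignment apart from the example CID loadings). The helper `gls_factors` is a hypothetical name, and it implements the standard cross-sectional GLS formula $\hat{f}_t = (B \psi^{-1} B')^{-1} B \psi^{-1} x_t$ with diagonal ψ, which is assumed here to correspond to the BARRA/GLS procedure from lecture.

```r
# Sketch for parts (b)-(c); simulation values are illustrative assumptions.
set.seed(2022)                      # arbitrary seed
m <- 10; T_len <- 100; K <- 3

Omega_f  <- matrix(c(1.0, 0.3, 0.2,
                     0.3, 0.8, 0.1,
                     0.2, 0.1, 0.5), nrow = K, byrow = TRUE)
B        <- matrix(runif(K * m, -1, 1), nrow = K)   # K x m loadings
B[, 1]   <- c(0.9, 0.4, 0.6)                        # example CID loadings
psi_diag <- seq(0.1, 1.0, length.out = m)           # diagonal of psi (assumed)

f <- matrix(rnorm(T_len * K), nrow = T_len) %*% chol(Omega_f)
x <- f %*% B + matrix(rnorm(T_len * m), nrow = T_len) %*% diag(sqrt(psi_diag))

# (b) One time-series regression per asset; lm handles the matrix response
ts_fit <- lm(x ~ f)
B_hat  <- coef(ts_fit)[-1, , drop = FALSE]          # drop intercept row, K x m

# (c) Cross-sectional GLS in each period, written for all periods at once
gls_factors <- function(X, B, psi_diag) {
  Psi_inv <- diag(1 / psi_diag)
  X %*% Psi_inv %*% t(B) %*% solve(B %*% Psi_inv %*% t(B))
}
f_hat <- gls_factors(x, B, psi_diag)                # T x K

# Plots of actual versus estimated factors, drawn only interactively
if (interactive()) {
  for (k in 1:K) {
    plot(f[, k], type = "l", main = paste("Factor", k),
         xlab = "t", ylab = "factor realization")
    lines(f_hat[, k], col = "red", lty = 2)
    legend("topright", legend = c("actual", "estimated"),
           col = c("black", "red"), lty = 1:2)
  }
}
```

Comparing `B_hat` with `B` entry by entry, and `f_hat` with `f` period by period, gives the comparisons the two parts ask for.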
(d) Conduct principal components analysis on $\Sigma_x$, the covariance matrix of asset returns. Create a plot showing the proportion of the variance explained by the first five principal components. What fraction is explained by the first principal component? The fourth?
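The proportion of variance explained can be read off the eigenvalues of the sample covariance matrix. The sketch below assumes a placeholder data matrix generated from three uncorrelated factors with arbitrary loadings and noise, purely so that the example runs on its own; in the assignment the simulated $x_{i,t}$ would take its place.

```r
# Sketch for part (d): PCA via the eigenvalues of the sample covariance matrix.
set.seed(2022)                      # arbitrary seed
T_len <- 100; m <- 10; K <- 3

# Placeholder data (T x m); stands in for the simulated asset returns
f <- matrix(rnorm(T_len * K), nrow = T_len)
B <- matrix(runif(K * m, -1, 1), nrow = K)
x <- f %*% B + matrix(rnorm(T_len * m, sd = 0.5), nrow = T_len)

Sigma_x  <- cov(x)                               # m x m sample covariance
eig      <- eigen(Sigma_x, symmetric = TRUE)     # eigenvalues in decreasing order
prop_var <- eig$values / sum(eig$values)         # share of variance per component

print(round(prop_var[1:5], 3))

# Bar plot of the first five shares, drawn only interactively
if (interactive()) {
  barplot(prop_var[1:5], names.arg = paste0("PC", 1:5),
          ylab = "Proportion of variance explained")
}
```

`prcomp(x)` would give the same shares through `summary()`; working with `eigen` directly keeps the link to $\Sigma_x$ explicit.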
2022-08-24