IEOR 4630: Asset Allocation 2022
1 Data Generation Procedure
Throughout the project, we fix the risk-free rate r to be 0.002.
We first simulate the data that we need in order to apply the strategies. In the market, there is one risk-free asset with risk-free rate r and N risky assets. In particular, we adopt the K-factor model on
the excess return Rt at time t of N − K risky assets:
Rt = α + BRfactor,t + ϵt , (1)
where α ∈ R^(N−K) is the mispricing vector, B ∈ R^((N−K)×K) is the factor loading matrix, Rfactor,t ∈ R^K is the vector of factor excess returns at time t, and ϵt ∈ R^(N−K) is the idiosyncratic noise at time t. Note that in this factor model, the K factors stand for K different assets, and the excess returns of the remaining N − K risky assets are linear combinations of these K factors plus some noise.
For the sake of simplicity, we let K = 1 and let the factor be the excess return of one of the risky assets. Based on this model, follow the two procedures below to generate synthetic data.
• (Light tailed distribution) Suppose α = 0, Rfactor,t ∼ N(0.007, 0.002), ϵt ∼ N(0, Σϵ). We further assume Σϵ is diagonal, and each diagonal element Σϵii ∼ U[0.010,0.030] . Apply this 1-factor model to generate the excess return of N = 10 risky assets for T = 24000 time steps.
1. Simulate the loading vector B for all N − 1 risky assets. For each i, Bi ∼ U[0.50,1.50] .
2. Simulate the noise covariance matrix Σϵ for all N − 1 risky assets. For each i, Σϵii ∼ U[0.010,0.030] .
3. For t = 1 to T :
Simulate Rfactor,t ∼ N(0.007, 0.002), ϵˆt ∼ N(0, Σϵ). Then Rt = α + BRfactor,t + ϵˆt .
• (Heavy tailed distribution) Again suppose α = 0, but Rfactor,t ∼ √0.002 · tdf=5 + 0.007, where tdf=5 denotes a t-distribution with 5 degrees of freedom. This gives the factor risky asset a heavy tailed distribution. To obtain heavy tails for the excess returns of the other risky assets as well, we need to change the distribution of ϵt too. Let ϵt follow a multivariate t-distribution with ν degrees of freedom:
\[ \epsilon_t = \mu + \frac{Z}{\sqrt{U/\nu}}, \]
where Z ∼ N(0, Σ) and U ∼ χ²ν are independent, Σ is the covariance matrix of Z, and ν is the number of degrees of freedom. Then ϵt follows a multivariate tν(µ, Σ) distribution. To create a setting similar to the light tailed case above, we let µ = 0, ν = 5 and Σ = Σϵ (which is already generated above).
Adopt the following procedure to simulate the heavy-tailed noise, and apply it to the 1-factor model to generate the excess return of N = 10 assets for T = 24000 time steps.
1. Use the loading vector B sampled in the light tailed distribution case.
2. Use the noise ϵˆt sampled in the light tailed distribution case in Sec. 1.
3. For t = 1 to T:
Simulate rt ∼ tdf=5 . Then Rfactor,t = √0.002 × rt + 0.007.
4. For t = 1 to T :
Simulate Ut ∼ χ²ν . The heavy-tailed noise is ϵ′t = ϵˆt / √(Ut/ν), and R′t = α + B Rfactor,t + ϵ′t .
Pick one of the N − 1 risky assets, together with the factor risky asset, and plot the histograms of their excess returns. Include all N risky assets (the N − 1 risky assets generated by the factor model, and 1 factor risky asset) when applying the asset allocation methods.
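The two simulation procedures above can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not a reference implementation: the function names, the seeds, and the choice to store the factor asset in the last column are illustrative; the second argument of N(·, ·) is read as a variance; and the heavy-tailed noise is formed with the standard construction of a multivariate t variable, Gaussian noise divided by √(U/ν).

```python
import numpy as np


def simulate_light_tailed(N=10, T=24000, seed=0):
    """Light-tailed 1-factor data with alpha = 0; returns a (T, N) array."""
    rng = np.random.default_rng(seed)
    B = rng.uniform(0.50, 1.50, size=N - 1)             # loadings B_i
    sigma_eps = rng.uniform(0.010, 0.030, size=N - 1)   # diagonal of Sigma_eps
    # Assumption: the second argument of N(., .) is a variance.
    r_factor = rng.normal(0.007, np.sqrt(0.002), size=T)
    # A diagonal Sigma_eps means the noise is independent across assets.
    eps_hat = rng.normal(0.0, np.sqrt(sigma_eps), size=(T, N - 1))
    # Assumption: the factor asset is stored as the last column.
    R = np.column_stack([r_factor[:, None] * B + eps_hat, r_factor])
    return R, B, sigma_eps, eps_hat


def simulate_heavy_tailed(B, eps_hat, nu=5, seed=1):
    """Heavy-tailed counterpart that reuses B and eps_hat from the light case."""
    rng = np.random.default_rng(seed)
    T = eps_hat.shape[0]
    r_factor = np.sqrt(0.002) * rng.standard_t(5, size=T) + 0.007
    U = rng.chisquare(nu, size=T)                       # one draw per time step
    # Assumption: Gaussian noise divided by sqrt(U / nu) gives multivariate t noise.
    eps_heavy = eps_hat / np.sqrt(U / nu)[:, None]
    R = np.column_stack([r_factor[:, None] * B + eps_heavy, r_factor])
    return R, eps_heavy
```

A typical call would be `R_light, B, sigma_eps, eps_hat = simulate_light_tailed()` followed by `R_heavy, eps_heavy = simulate_heavy_tailed(B, eps_hat)`, so that both data sets share the same loadings and Gaussian noise.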
2 Methods Implemented
In this project, we will implement and compare 12 asset allocation methods:
Benchmark
1. 1/N with rebalancing (ew)
2. Market Portfolio/Factor Portfolio (mkt)
Classical approach that ignores estimation error
3. Sample-based mean-variance (mv)
4. Sample-based unbiased mean-variance (as in Kan and Zhou, 2007, see Lecture 5, page 25) (u-mv)
Bayesian approach to estimation error
5. Bayes-Stein (shrink to minimum variance portfolio, as in Jorion, 1986, see Lecture 5, page 50) (bs)
Moment restrictions
6. Minimum-variance with sample based mean-variance (min)
Portfolio constraints
7. Sample-based mean-variance with no-shortsale constraints (mv-c)
8. Bayes-Stein with no-shortsale constraints (bs-c)
9. Minimum-variance with no-shortsale constraints (min-c)
Robust portfolios
10. Uncertainty in mean with box uncertainty set (as in Garlappi et al., 2007, see Lecture 6, page 14-17) (r-m-1)
11. Uncertainty in mean with ellipsoid uncertainty set (as in Garlappi et al., 2007, see Lecture 6, page 14-17) (r-m-2)
12. Distributionally robust (as in Blanchet et al., 2020, see Lecture 6, page 28-36) (dr-mv)
We allow the user to input one tuning parameter to specify the uncertainty set, even though Blanchet et al. (2020) suggest how to choose such a parameter endogenously.
2.1 Performance Comparison of Strategies with Estimation
Implement all the strategies and test each of them on the simulated data generated in Sec. 1. These methods require the expected excess return vector and the covariance matrix, which we do not have access to in reality, so we use the rolling window technique to estimate them.
Estimation using rolling window technique: Suppose the window length is M. Then at each time step t, we use the excess returns of all N risky assets in steps t − M, t − M + 1, ..., t − 1 to generate estimates of the expected excess return vector and the covariance matrix.
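As an illustration of the rolling window, the sketch below computes the sample mean and sample covariance from the M observations that precede a given step. The function name `rolling_estimates` and the (T, N) row-per-time-step layout are assumptions made for this example; rows are 0-based, so row index t here corresponds to step t + 1 in the text.

```python
import numpy as np


def rolling_estimates(R, t, M):
    """Sample mean and covariance from the M rows of R that precede row t."""
    window = R[t - M:t]                        # rows t-M, ..., t-1
    mu_hat = window.mean(axis=0)               # estimated expected excess returns
    sigma_hat = np.cov(window, rowvar=False)   # sample covariance, divides by M - 1
    return mu_hat, sigma_hat
```

The individual strategies then differ only in how they turn `mu_hat` and `sigma_hat` into weights.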
Metrics: We are interested in three metrics:
• Out-of-sample Sharpe ratio (OSR):
\[ \mathrm{OSR} = \frac{\hat{\mu}_{\mathrm{OSR}} - r}{\hat{\sigma}_{\mathrm{OSR}}}, \]
where µ̂_OSR denotes the sample average of the out-of-sample return µt achieved by the portfolio over all time steps from t = M + 1 to T, and σ̂_OSR denotes the corresponding sample standard deviation. Specifically, the out-of-sample µt is measured in the following way: when we apply a strategy using estimation with the rolling window technique at step t, we use the excess returns of all N risky assets in steps t − M to t − 1 to estimate the inputs and solve for the optimal weight w(t). Then we calculate the out-of-sample return
\[ \mu_t = \sum_{i \,\in\, \text{all assets}} w_i^{(t)} \times (\text{return of asset } i \text{ at time } t). \]
• In-sample Sharpe ratio (ISR):
\[ \mathrm{ISR} = \frac{\hat{\mu}_{\mathrm{ISR}} - r}{\hat{\sigma}_{\mathrm{ISR}}}, \]
where µ̂_ISR denotes the sample average of the in-sample return µt achieved by the portfolio over all time steps from t = M to T, and σ̂_ISR denotes the corresponding sample standard deviation. Specifically, the in-sample µt is measured in the following way: when we apply a strategy using estimation with the rolling window technique at step t, we use the excess returns of all N risky assets in steps t − M + 1 to t to estimate the inputs and solve for the optimal weight w(t). Then we calculate the in-sample return
\[ \mu_t = \sum_{i \,\in\, \text{all assets}} w_i^{(t)} \times (\text{return of asset } i \text{ at time } t). \]
• Turnover:
\[ \mathrm{Turnover} = \frac{1}{T - M} \sum_{t=M+1}^{T} \sum_{i} \left| w_i^{(t+1)} - \tilde{w}_i^{(t+1)} \right|, \]
where w̃(t) and w(t) denote the weight vectors before and after rebalancing at time t, respectively.
Here is an example: for the 1/N strategy, suppose there are only two assets, and the asset prices are $1 and $1 at time t = 1. The 1/N strategy will place w1 = 0.5 and w2 = 0.5 of our wealth on them. At the next time step t = 2, the prices of the two assets become $2 and $0.5. Then, before rebalancing at time t = 2, our strategy will have weights w̃1 = (0.5 × 2)/(0.5 × 2 + 0.5 × 0.5) = 0.8 and w̃2 = (0.5 × 0.5)/(0.5 × 2 + 0.5 × 0.5) = 0.2. To maintain our 1/N strategy at time t = 2, we need to rebalance the weights to w1 = 0.5 and w2 = 0.5. Then the turnover at time t = 2 is:
|w̃1 − w1| + |w̃2 − w2| = |0.8 − 0.5| + |0.2 − 0.5| = 0.6 (2)
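The two-asset example can be checked numerically. In this small sketch the helper name `weights_before_rebalancing` is hypothetical, and the weights are assumed to be fully invested in the listed assets, so that drifted weights are the position values renormalized to sum to one.

```python
import numpy as np


def weights_before_rebalancing(w_prev, gross_returns):
    """Let last period's weights drift with prices, then renormalize."""
    value = w_prev * gross_returns
    return value / value.sum()


w_prev = np.array([0.5, 0.5])                  # 1/N weights chosen at t = 1
gross = np.array([2.0 / 1.0, 0.5 / 1.0])       # price ratios from t = 1 to t = 2
w_before = weights_before_rebalancing(w_prev, gross)   # equals [0.8, 0.2]
w_after = np.array([0.5, 0.5])                 # 1/N weights restored at t = 2
turnover_t2 = np.abs(w_after - w_before).sum() # 0.6 up to floating-point rounding
```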
Let’s set M = 120 in this part (also in Sec. 2.2). Report the out-of-sample Sharpe ratios, in-sample Sharpe ratios, and turnover for all methods.
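The out-of-sample evaluation can be organized as a single loop over time that is shared by all strategies. The sketch below is one possible layout, with assumptions stated: `weight_fn` is a hypothetical callable that maps an (M, N) window of past excess returns to a weight vector, `R` has one row per time step, and the Sharpe ratio subtracts r from the mean, mirroring the "− r" that appears in the in-sample expression.

```python
import numpy as np


def out_of_sample_returns(R, M, weight_fn):
    """Portfolio return at each step, using only the M preceding rows."""
    T = R.shape[0]
    mu = np.empty(T - M)
    for k, t in enumerate(range(M, T)):   # 0-based rows; steps M+1, ..., T in the text
        w = weight_fn(R[t - M:t])         # weights estimated from past data only
        mu[k] = w @ R[t]                  # realized return at the current step
    return mu


def sharpe_ratio(returns, r=0.002):
    """(sample mean - r) / sample standard deviation; subtracting r is an assumption."""
    return (returns.mean() - r) / returns.std(ddof=1)
```

For instance, `out_of_sample_returns(R, 120, lambda window: np.full(window.shape[1], 1.0 / window.shape[1]))` would evaluate the 1/N benchmark.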
2.2 Performance of Optimal Mean-variance
We still use the simulated data generated in Sec. 1. Since we know the true mean vector and covariance matrix in our simulation study, apply them to the mean variance problem, and compare with the methods 1-12 in terms of the metrics mentioned in Sec. 2.1.
Recall that in theory, any mean-variance efficient allocation achieves the same and the highest Sharpe ratio.
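For the light tailed case, the true moments follow directly from the 1-factor model. The sketch below assembles them under stated assumptions: α = 0, the value 0.002 is the factor variance, the factor asset is placed last, and the function names are illustrative. The ratio is computed directly on excess returns, without any further subtraction of r.

```python
import numpy as np


def true_moments(B, sigma_eps_diag, factor_mean=0.007, factor_var=0.002):
    """True mean vector and covariance of the N assets (factor asset last)."""
    n = B.size
    mu = np.append(B * factor_mean, factor_mean)
    Sigma = np.empty((n + 1, n + 1))
    # Non-factor assets: common factor variance plus idiosyncratic variance.
    Sigma[:n, :n] = factor_var * np.outer(B, B) + np.diag(sigma_eps_diag)
    # Covariance between each non-factor asset and the factor asset.
    Sigma[:n, n] = Sigma[n, :n] = factor_var * B
    Sigma[n, n] = factor_var
    return mu, Sigma


def max_sharpe_excess(mu, Sigma):
    """Largest attainable mean-to-volatility ratio, sqrt(mu' Sigma^{-1} mu)."""
    return np.sqrt(mu @ np.linalg.solve(Sigma, mu))
```

Under these assumptions, Σ⁻¹µ is nonzero only in the factor position, so the optimal mean-variance portfolio holds the factor asset alone and the attainable ratio is 0.007/√0.002 ≈ 0.157. In the heavy tailed case the variances would additionally scale by ν/(ν − 2) = 5/3, the variance of a t-distribution with 5 degrees of freedom.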
3 Tuning-parameter Ablation Study
3.1 The Effect of Window Length
To study the effect of the window length on the performance of the strategies, we fix N = 10, change M to 500, 1000, and 6000, and report the same metrics as in Sec. 2.1. For each case, regenerate the synthetic data for both the light tailed and the heavy tailed distributions by following the instructions in Sec. 1. Observe how the performance (in terms of the three metrics) changes as M increases, and compare the performance of all methods for each M.
3.2 The Effect of Asset Number
Similarly, to study how the number of assets affects the performance of the strategies, for each N in {25, 50, 100}, set M = 120 and report the same metrics as in Sec. 2.1. For each case, regenerate the synthetic data for both the light tailed and the heavy tailed distributions by following the instructions in Sec. 1. Observe how the performance (in terms of the three metrics) changes as N increases, and compare the performance of all methods for each N.
3.3 The Effect of α
In addition, we want to study how α in equation (1) affects the performance of the strategies. Suppose αi ∼ U[−0.01,0.01] for each risky asset i, and regenerate the synthetic data for the light tailed distribution by following the instructions in Sec. 1. Again, report the metrics and compare all the strategies.
4 Study on Intertemporal Correlation
In this part, we study the performance in another setting, where the excess returns Rt of the risky assets are correlated over time. That is,
Cov(Rt, Rt′) ≠ 0, t, t′ ∈ {1, 2, ..., T}
To realize this, consider an autoregressive model of order p (AR(p)):
\[ X_t = c + \sum_{i=1}^{p} \varphi_i X_{t-i} + \eta_t, \]
where φi’s are the parameters, c is a constant, and ηt follows a standard white noise process. To make the covariance nonzero, we can replace the factor risky asset (Rfactor,t) in the factor model (equation (1) in Sec. 1) with an AR(1) process. That is,
\[ R_{\text{factor},t} = c + \varphi_1 R_{\text{factor},t-1} + \eta_t. \qquad (3) \]
Then we will have
Cov(Rt, Rt−1) ≠ 0, t − 1 ∈ {1, 2, ..., T − 1}
Follow the simulation procedure below to apply the 1-factor model and generate the excess return of N = 10 assets for T = 24000 time steps.
(Intertemporally correlated process) Suppose α = 0, Rfactor,t ∼ AR(1), ϵt ∼ N(0, Σϵ), and ηt ∼ N(0, 0.032) in equation (3).
1. Use the loading vector B sampled in the light tailed distribution case in Sec. 1.
2. Use the noise ϵˆt sampled in the light tailed distribution case in Sec. 1.
3. For t = 2 to T: simulate ηt ∼ N(0, 0.032).
4. Rfactor,1 = 0.008. For t = 2 to T: Rfactor,t = 0.010 − 0.112 · Rfactor,t−1 + ηt .
5. For t = 1 to T: Rt = α + BRfactor,t + ϵˆt .
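Steps 3 to 5 can be sketched as follows. The function names, the seed, and the placement of the factor asset in the last column are assumptions made for this example. The value 0.032 is passed as a variance exactly as printed; if it was meant as 0.03², change `eta_var` accordingly.

```python
import numpy as np


def simulate_ar1_factor(T=24000, c=0.010, phi=-0.112, eta_var=0.032,
                        r_first=0.008, seed=2):
    """AR(1) factor excess returns, started at r_first."""
    rng = np.random.default_rng(seed)
    eta = rng.normal(0.0, np.sqrt(eta_var), size=T)   # eta[0] is never used
    r_factor = np.empty(T)
    r_factor[0] = r_first
    for t in range(1, T):
        r_factor[t] = c + phi * r_factor[t - 1] + eta[t]
    return r_factor


def correlated_returns(B, eps_hat, r_factor):
    """R_t = B * R_factor,t + eps_hat_t with alpha = 0; factor asset last."""
    return np.column_stack([r_factor[:, None] * B + eps_hat, r_factor])
```

Here `B` and `eps_hat` are meant to be the loadings and Gaussian noise already sampled for the light tailed case, with `eps_hat` of shape (T, N − 1).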
Redo the task in Sec. 2.1.
5 Conclusion
Based on the testing results in Sec. 2, 3, and 4, what can you conclude? For example, what will be your recommendation on various methods? Also specify the reason why you recommend them and in what conditions one should use them.
6 Deliverables
You are supposed to submit your code together with your report.
• The report should be a PDF file containing only text, formulas, tables and figures. It should contain the following: an introduction to the main findings, a brief summary of the formulas you used to implement each method, a clear and organized presentation of the numerical results under the various settings (e.g., in tables or figures), and a thorough discussion of the results you get, the comparisons you make, and the final conclusions you draw. Do not include code in the main text of the report.
• The code can be written in any programming language and you can use any existing packages for optimization. It should contain one main function that has the following form:
w = function(N,M,R,method,para),
where the input N is the number of assets, M is the number of observations used for training, R ∈ R^(N×M) is a matrix of training data containing the excess return of each asset at each time, method is a string that specifies the method to use, and para is a list of variables containing any additional parameters required to implement each method. The output w is the allocation weight computed by the specified method.
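One possible skeleton for such a main function is sketched below, covering only three of the twelve methods. Everything beyond the required signature is an assumption: the name `allocate`, the use of a dict for `para`, the hypothetical risk-aversion entry `"gamma"`, and the textbook formulas used for `"mv"` and `"min"`, which may differ in scaling from the versions given in the lectures.

```python
import numpy as np


def allocate(N, M, R, method, para=None):
    """Return the weight vector w for the requested method (partial sketch)."""
    para = para or {}
    R = np.asarray(R, dtype=float).reshape(N, M)   # rows are assets
    mu_hat = R.mean(axis=1)                        # sample mean excess returns
    sigma_hat = np.cov(R)                          # sample covariance (N x N)
    if method == "ew":
        return np.full(N, 1.0 / N)
    if method == "mv":
        gamma = para.get("gamma", 1.0)             # hypothetical risk-aversion parameter
        return np.linalg.solve(sigma_hat, mu_hat) / gamma
    if method == "min":
        w = np.linalg.solve(sigma_hat, np.ones(N))
        return w / w.sum()                         # weights sum to one
    raise NotImplementedError(f"method {method!r} is not covered by this sketch")
```

Keeping the estimation of `mu_hat` and `sigma_hat` at the top makes it straightforward to add the remaining methods as further branches.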
7 Rubrics
The simulation study counts for 30% of the final grade: 25% is based on the quality of your report and 5% on the quality of your code.
• Basic rules: since you are supposed to fix the random seed, the results you report should be reproducible by running your code. Unless you have the same UID, the results will not be identical. If you cheat and fabricate numerical results, you get 0 for this project and may be subject to disciplinary action.
• The report will mainly be judged by its completeness (whether you implement all methods and test under all settings as required), the clarity of presentation (whether it is easy to read and follow), the correctness of the numerical results (even though they may differ slightly due to different random seeds, they should not deviate much from the commonly agreed conclusions), and the thoroughness of your conclusions and discussions.
• The code will mainly be judged by its cleanness (whether it is easy to read) and correctness (whether the program is free of bugs and can be executed without error). The correctness of the implementation of each method will be reflected in and evaluated through the results you report, so it will not be evaluated here.
2022-04-20