QBUS3830 Advanced Analytics Semester 2, 2022 Homework 2
1 Instructions
(a) Required submissions:
i. ONE written report (Word or PDF format, submitted through Canvas - Assignments - Homework 2 report submission).
ii. One or more code files (MATLAB m-files, submitted through Canvas - Assignments - Homework 2 code submission).
(b) Due date/time: Thursday, 6th Oct 2022, 2 pm.
(c) Late submission: Deduction of 5% of the maximum mark for each calendar day after the due date. After ten calendar days late, a mark of zero will be awarded.
(d) Weight: 10% of the total mark of the unit.
(e) Length: The main text (excluding appendix) of your report should have a maximum of 5 pages. You do not need to include a cover page.
(f) Report and code file naming: SID123456789-HW2. Replace “123456789” with your student ID. If you submit more than one code file, the main function file should be named “SID123456789-HW2.m”. The other code files should be named according to their actual function names, so that the marker can run your code directly and replicate your results.
(g) You must show your implementation and calculation details as instructed in the question. Numbers with decimals should be reported to four decimal places. You can post your questions about Homework 2 in the Homework 2 Megathread on Ed.
2 Questions
Simulation studies are often used to examine how methodologies work. This question is about a simulation study that examines the performance of MCMC.
Consider a Poisson regression model
$$
y_i \sim \mathrm{Poisson}(\mu_i), \quad i = 1, \ldots, n,
$$
$$
\log(\mu_i) = \beta_0 + \sum_{j=1}^{5} x_{ij} \beta_j .
$$
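For reference, taking logs of the Poisson probability mass function and summing over the observations gives the log-likelihood of this model. This is standard Poisson regression algebra; the shorthand \eta_i for the linear predictor is introduced here for convenience and is not notation from the assignment.

```latex
\ell(\beta) = \sum_{i=1}^{n} \left[ y_i \eta_i - \exp(\eta_i) - \log(y_i!) \right],
\qquad \eta_i = \beta_0 + \sum_{j=1}^{5} x_{ij} \beta_j .
```

The term \log(y_i!) does not depend on \beta, so it cancels in any ratio of likelihoods evaluated on the same data.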
Let’s simulate a data set D from the above model as follows. Generate the covariates x_ij ~ U(0, 1), i = 1, 2, ..., n = 2000; j = 1, 2, 3, 4, 5. Set β = (1, -0.2, 0.4, 0.1, -0.7, 0.3) and generate a data set of n = 2000 observations according to the given model. Please fix the random seed using “rng(2022)” so that your results are reproducible. Employ the prior p(β) = N(0, m^2 I); you need to test and choose your own m here.
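The simulation and the two ingredients of the posterior can be sketched as follows. This is a minimal illustration in Python with NumPy, not the MATLAB m-file the submission requires. The function names are hypothetical, the prior scale m is left as an argument because the assignment asks you to choose it, and NumPy's `default_rng(2022)` does not reproduce the random stream of MATLAB's `rng(2022)`.

```python
import math
import numpy as np

def simulate_data(n=2000, seed=2022):
    """Simulate covariates and Poisson responses from the model in the text."""
    # NumPy's generator; it does NOT reproduce the stream of MATLAB's rng(2022).
    rng = np.random.default_rng(seed)
    beta_true = np.array([1.0, -0.2, 0.4, 0.1, -0.7, 0.3])
    x = rng.uniform(0.0, 1.0, size=(n, 5))   # x_ij ~ U(0, 1)
    X = np.column_stack([np.ones(n), x])     # leading column of ones for beta_0
    mu = np.exp(X @ beta_true)               # log(mu_i) = beta_0 + sum_j x_ij beta_j
    y = rng.poisson(mu)                      # y_i ~ Poisson(mu_i)
    return X, y, beta_true

def log_likelihood(beta, X, y):
    """Poisson log-likelihood: sum_i [y_i * eta_i - exp(eta_i) - log(y_i!)]."""
    eta = X @ beta
    log_factorial = np.array([math.lgamma(v + 1.0) for v in y])
    return float(np.sum(y * eta - np.exp(eta) - log_factorial))

def log_prior(beta, m):
    """Log density of N(0, m^2 I), including the normalising constant."""
    d = beta.size
    return float(-0.5 * d * math.log(2.0 * math.pi * m**2)
                 - beta @ beta / (2.0 * m**2))

X, y, beta_true = simulate_data()
```

The unnormalised log posterior is then `log_likelihood(beta, X, y) + log_prior(beta, m)` for whichever m is chosen.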
Perform Bayesian inference on β using the random walk Metropolis-Hastings (RWMH) algorithm:
• (5 marks) Generate the simulated data as instructed. Build and implement the likelihood correctly.
• (25 marks) Implement a RWMH algorithm to estimate the posterior distribution of β.
  - Describe your implementation details step by step in the report.
  - For the prior p(β) = N(0, m^2 I), you need to explain how you test and choose m.
  - For the proposal ε ~ N(0, Σ) in RWMH, you need to show how you test and choose Σ. You can also use an adaptive Σ, e.g., Σ can change across the RWMH iterations, as discussed in the lecture and demonstrated in our lecture example code “Lecture06 Example 02”. You might also read some research papers on adaptive Σ and test or employ such approaches to improve the performance of your RWMH algorithm. If an adaptive approach is used, its process also needs to be clearly presented in the report.
  - In the report, you can use small sections of code to facilitate your presentation of the implementation details if needed.
• (10 marks) Show the trace plots for each parameter β_j. Justify that your RWMH chains for each parameter have converged, and that they have converged to the correct values.
• (5 marks) Report estimates of the posterior mean and posterior standard deviation for each β_j. Show your calculation details.
• (3 marks) Given a future subject with covariates x* = (1.83, -2.26, 0.86, 0.32, 0.65), estimate the predictive mean E(y* | x*, D) based on your RWMH samples. Show your calculation details.
• (2 marks) Discuss future work that could further improve the performance of your RWMH algorithm.
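As a rough illustration of the workflow these tasks describe, the sketch below runs a plain RWMH chain and computes posterior summaries and the predictive mean from the retained draws. It is written in Python with NumPy, not the MATLAB required for submission, and it is not the lecture's adaptive scheme. It uses a fixed isotropic proposal Σ = step^2 I. The values m = 10, step = 0.02, 20,000 iterations and a burn-in of 5,000 are arbitrary placeholders rather than values from the assignment, and all function names are hypothetical. NumPy's seed does not reproduce MATLAB's `rng(2022)` stream.

```python
import numpy as np

def log_posterior(beta, X, y, m):
    """Unnormalised log posterior: Poisson log-likelihood without the
    beta-free term -sum(log y_i!), plus the N(0, m^2 I) log prior kernel."""
    eta = X @ beta
    return float(np.sum(y * eta - np.exp(eta)) - beta @ beta / (2.0 * m**2))

def rwmh(X, y, m, n_iter, step, rng):
    """Random walk Metropolis-Hastings with a fixed proposal N(0, step^2 I)."""
    d = X.shape[1]
    beta = np.zeros(d)                       # arbitrary starting point
    logp = log_posterior(beta, X, y, m)
    draws = np.empty((n_iter, d))
    accepted = 0
    for t in range(n_iter):
        proposal = beta + step * rng.standard_normal(d)
        logp_prop = log_posterior(proposal, X, y, m)
        # The proposal is symmetric, so the acceptance ratio is the posterior ratio.
        if np.log(rng.uniform()) < logp_prop - logp:
            beta, logp = proposal, logp_prop
            accepted += 1
        draws[t] = beta                      # store the current state either way
    return draws, accepted / n_iter

def summarise(draws, burn_in, x_star):
    """Posterior mean, posterior sd and predictive mean from post-burn-in draws."""
    kept = draws[burn_in:]
    post_mean = kept.mean(axis=0)
    post_sd = kept.std(axis=0, ddof=1)
    # E(y* | x*, D) is estimated by averaging exp(beta_0 + x*' beta_{1:5}) over draws.
    x_full = np.concatenate(([1.0], x_star))
    pred_mean = float(np.exp(kept @ x_full).mean())
    return post_mean, post_sd, pred_mean

# Simulated data (NumPy stream, not MATLAB's rng(2022)).
rng = np.random.default_rng(2022)
n = 2000
beta_true = np.array([1.0, -0.2, 0.4, 0.1, -0.7, 0.3])
X = np.column_stack([np.ones(n), rng.uniform(0.0, 1.0, size=(n, 5))])
y = rng.poisson(np.exp(X @ beta_true))

# Placeholder tuning values; these are assumptions, not recommendations.
draws, acc_rate = rwmh(X, y, m=10.0, n_iter=20000, step=0.02, rng=rng)
x_star = np.array([1.83, -2.26, 0.86, 0.32, 0.65])
post_mean, post_sd, pred_mean = summarise(draws, burn_in=5000, x_star=x_star)

print("acceptance rate:", round(acc_rate, 4))
print("posterior mean :", np.round(post_mean, 4))
print("posterior sd   :", np.round(post_sd, 4))
print("predictive mean:", round(pred_mean, 4))
```

Plotting each column of `draws` against the iteration index gives the trace plots. An isotropic proposal ignores the posterior correlation between the intercept and the slopes, which is one reason a proposal covariance tuned or adapted to the posterior may mix better.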
2022-10-06