Bayesian Data Analysis, 2022/2023, Semester 2

Assignment 2

IMPORTANT INFORMATION ABOUT THE ASSIGNMENT

In this paragraph, we summarize the essential information about this assignment. The format and rules for this assignment are different from your other courses, so please pay attention.

1) Deadline: The deadline for submitting your solutions to this assignment is 17 April, 12:00 noon Edinburgh time.

2) Format: You will need to submit your work as 2 components: a PDF report, and your R Markdown (.Rmd) notebook. There will be two separate submission systems on Learn: Gradescope for the report in PDF format, and a Learn assignment for the code in Rmd format. You need to write your solutions into this R Markdown notebook (code in R chunks and explanations in Markdown chunks), and then select Knit/Knit to PDF in RStudio to create a PDF report.

The compiled PDF needs to contain everything in this notebook, with your code sections clearly visible (not hidden), and the output of your code included. Reports without the code displayed in the PDF, or without the output of your code included in the PDF, will be marked as 0, with the only feedback “Report did not meet submission requirements”.

You need to upload this PDF in the Gradescope submission system, and your Rmd file in the Learn assignment submission system. You will be required to tag every sub-question on Gradescope.

Some key points that are different from other courses:

a) Your report needs to contain a written explanation for each question that you solve, and some numbers or plots showing your results. Solutions without a written explanation that clearly demonstrates that you understand what you are doing will be marked as 0, irrespective of whether the numerics are correct or not.

b) Your code must run for all questions using Run All in RStudio, and must reproduce all of the numerics and plots in your report (up to some small randomness due to the stochasticity of Monte Carlo simulations). Parts of the report containing material that is not reproduced by the code will not be marked (i.e. the score will be 0), and the only feedback in this case will be that the results are not reproducible from the code.

c) Multiple submissions are allowed BEFORE THE DEADLINE for both the report and the code.

However, multiple submissions are NOT ALLOWED AFTER THE DEADLINE.

YOU WILL NOT BE ABLE TO MAKE ANY CHANGES TO YOUR SUBMISSION AFTER THE DEADLINE.

Nevertheless, if you did not submit anything before the deadline, then you can still submit your work after the deadline, but late penalties will apply. The timing of the late penalties will be determined by the time you have submitted BOTH the report and the code (i.e. whichever was submitted later counts).

We illustrate these rules by some examples:

Alice has spent a lot of time and effort on her assignment for BDA. Unfortunately, before submission, she accidentally introduced a typo in her code in the first question, and it did not run using Run All in RStudio. - Alice will get 0 for the questions that do not run in her code (we will try to run each code block individually), with the only feedback “Results are not reproducible from the code”.

Bob has spent a lot of time and effort on his assignment for BDA. Unfortunately, he forgot to submit his code. - Bob will get no personal reminder to submit his code. Bob will get 0 for the whole assignment, with the only feedback “Results are not reproducible from the code, as the code was not submitted.”

Charles has spent a lot of time and effort on his assignment for BDA. He has submitted both his code and report in the correct formats. However, he did not include any explanations in the report. - Charles will get 0 for the whole assignment, with the only feedback “Explanation is missing.”

Denise has spent a lot of time and effort on her assignment for BDA. She has submitted her report in the correct format, but thought that she could include her code as a link in the report and upload it online (such as on GitHub or Dropbox). - Denise will get 0 for the whole assignment, with the only feedback “Code was not uploaded on Learn.”

3) Group work: This is an INDIVIDUAL ASSIGNMENT, like a 2-week exam for the course. Communication between students about the assignment questions is not permitted. Students who submit work that has not been done individually will be reported for Academic Misconduct, which can lead to serious consequences. Each problem will be marked by a single instructor, so we will be able to spot students who copy.

4) Piazza: During the periods of the assignments, the instructor will change Piazza to allow messaging the instructors only, i.e. students will not see each other's messages and replies. Only questions regarding clarification of the statement of the problems will be answered by the instructors. The instructors will not give you any information related to the solution of the problems; such questions will simply be answered with “This is not about the statement of the problem so we cannot answer your question.”

THE INSTRUCTORS ARE NOT GOING TO DEBUG YOUR CODE, AND YOU ARE ASSESSED ON YOUR ABILITY TO RESOLVE ANY CODING OR TECHNICAL DIFFICULTIES THAT YOU ENCOUNTER ON YOUR OWN.

5) Office hours: There will be two office hours per week (Mondays 14:00-15:00 and Wednesdays 15:00-16:00) during the 2 weeks for this assignment. The links are available on Learn / Course Information. I will be happy to discuss the course/workshop materials. However, I will only answer questions about the assignment that require clarifying the statement of the problems, and will not give you any information about the solutions. Students who ask for feedback on their assignment solutions during office hours will be removed from the meeting.

6) Late submissions and extensions: NO EXTENSIONS ARE ALLOWED FOR THIS ASSIGNMENT, AND THERE IS NO SUCH OPTION PROVIDED IN THE ESC SYSTEM. Students who have existing Learning Adjustments in Euclid will be allowed to have the same adjustments applied to this course as well, but they need to apply for this BEFORE THE DEADLINE on the website

https://www.ed.ac.uk/student-administration/extensions-special-circumstances

by clicking on “Access your learning adjustment”. This will be approved automatically.

Students who submit their work late will have late submission penalties applied by the ESC team automatically (this means that even if you are 1 second late because your internet connection was slow, the penalties will still apply). The penalties are 5% of the total mark deducted for every day of delay started (i.e. one minute of delay counts as 1 day). The course instructors do not have any role in setting these penalties, and we will not be able to change them.
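The "every day of delay started" rule amounts to rounding the delay up to whole days. The following is a minimal sketch of that arithmetic only; the function name `late_penalty_percent` is hypothetical, and it assumes the 5% is deducted per started day with no cap, which the text above does not spell out.

```r
# Hypothetical helper illustrating the "day of delay started" rule.
# Assumption: 5% per started day, no upper cap (not stated in the rules).
late_penalty_percent <- function(delay_minutes, rate_per_day = 5) {
  if (delay_minutes <= 0) {
    return(0)  # submitted on time: no penalty
  }
  # Any started day counts in full, so round the delay up to whole days.
  days_started <- ceiling(delay_minutes / (24 * 60))
  rate_per_day * days_started
}

late_penalty_percent(1)            # one minute late counts as one day
late_penalty_percent(24 * 60 + 1)  # one minute into the second day
```

The official penalty is whatever the ESC team applies; this only makes the rounding explicit.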

7) Please make sure to tag all pages in your submission on Gradescope, otherwise we may miss some of your work. Once your upload is complete, tagging does not count towards your submission time (i.e. you won’t get any late penalties for doing it).

rm(list = ls(all = TRUE))
# Do not delete this!
# It clears all variables to ensure reproducibility

Problem 1

In this problem, we study a dataset about car insurance. This data set is based on one-year vehicle insurance policies taken out in 2004 or 2005. In total, there are 67856 policies, of which 4624 have claims.
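The two counts quoted above already fix the overall claim rate, which is a useful reference point on both the probability and the logit scale. A short sketch of that arithmetic, using only the numbers stated in the text (the variable names are illustrative):

```r
# Counts quoted in the problem statement.
n_policies <- 67856
n_claims   <- 4624

# Overall proportion of policies with at least one claim (roughly 6.8%).
claim_rate <- n_claims / n_policies

# The same quantity on the logit (log-odds) scale.
claim_logit <- qlogis(claim_rate)

print(round(c(rate = claim_rate, logit = claim_logit), 3))
```

Because claims are rare, the response is strongly imbalanced, which is worth keeping in mind when interpreting an intercept on the logit scale.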

require(insuranceData)
## Loading required package: insuranceData
data(dataCar)
# You may need to set the working directory first before loading the dataset
# setwd("location of Assignment 1")
# The first 6 rows of the dataframe
print.data.frame(dataCar[1:6, ])

##   veh_value  exposure clm numclaims claimcst0 veh_body veh_age gender area
## 1      1.06 0.3039014   0         0         0    HBACK              F
## 2      1.03 0.6488706   0         0         0    HBACK              F
## 3      3.26 0.5694730   0         0         0      UTE              F
## 4      4.14 0.3175907   0         0         0    STNWG              F
## 5      0.72 0.6488706   0         0         0    HBACK              F
## 6      2.01 0.8542094   0         0         0    HDTOP              M
##   agecat   X_OBSTAT_
## 1      2 01101 0 0 0
## 2      4 01101 0 0 0
## 3      2 01101 0 0 0
## 4
## 5      2 01101
## 6      4 01101

Description of the columns.

veh_value: vehicle value in $10000s

exposure: maximum portion of the vehicle value the insurer may need to pay out in case of an incident

claimcst0: claim amount (0 if no claim)

clm: whether there was a claim during the 1 year duration

numclaims: number of claims during the 1 year duration

veh_body types: BUS = bus, CONVT = convertible, COUPE = coupe, HBACK = hatchback, HDTOP = hardtop, MCARA = motorized caravan, MIBUS = minibus, PANVN = panel van, RDSTR = roadster, SEDAN = sedan, STNWG = station wagon, TRUCK = truck, UTE = utility

gender: F - female, M - male

area: a factor with levels A, B, C, D, E, F

agecat: age category, 1 (youngest), 2, 3, 4, 5, 6

You can use either JAGS, Stan, or INLA for this question.

a) [10 marks] Fit a Bayesian logistic regression model on the dataset dataCar with

● clm as response,

● a link function of your choice,

● using veh_value, exposure, veh_body, veh_age, gender, area, and agecat as covariates (you can use categorical covariates by converting integers to factors if appropriate).

Center and scale the non-categorical covariates.
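As an illustration of what centering, scaling, and factor conversion mean in practice, here is a minimal base-R sketch on a small made-up data frame. The data frame `toy` and its values are hypothetical stand-ins for a few dataCar columns, not the real data, and this is one possible way to do the preprocessing rather than a required one.

```r
# Toy data frame with made-up values (hypothetical, not the real dataCar).
toy <- data.frame(
  veh_value = c(1.1, 1.0, 3.3, 4.1, 0.7, 2.0),
  exposure  = c(0.30, 0.65, 0.57, 0.32, 0.65, 0.85),
  agecat    = c(2L, 4L, 2L, 2L, 2L, 4L),
  gender    = c("F", "F", "F", "F", "F", "M")
)

# Center and scale the continuous covariates: mean 0, standard deviation 1.
num_cols <- c("veh_value", "exposure")
toy[num_cols] <- lapply(toy[num_cols], function(x) as.numeric(scale(x)))

# Integer-coded categories become factors, so they enter as dummy variables.
toy$agecat <- factor(toy$agecat)
toy$gender <- factor(toy$gender)

# Design matrix as a regression routine would build it (treatment coding).
X <- model.matrix(~ veh_value + exposure + agecat + gender, data = toy)
print(colnames(X))
```

After standardization, a coefficient describes the effect of a one-standard-deviation change in the covariate, which makes prior scales easier to compare across covariates.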

Choose your own prior distributions (do not use default priors), explain the rationale for your prior choices, and ensure that the posterior is not too sensitive to your prior choice. [Hint: look at the induced prior on the linear predictor and on the response.]

Compute the posterior means of the model parameters, and discuss the results.
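The hint about the induced prior can be made concrete by simulation: draw coefficients from a candidate prior, form the linear predictor for standardized covariates, and push it through the inverse link. The sketch below assumes a logit link, independent normal priors, and five standardized covariates drawn as N(0, 1); the function name `induced_prior_prob` and all prior scales are illustrative assumptions, not recommended choices.

```r
# Simulate the prior induced on the claim probability by normal priors on
# the intercept and slopes (all settings here are illustrative assumptions).
induced_prior_prob <- function(sd_int, sd_beta, n_draws = 10000, n_coef = 5) {
  beta0 <- rnorm(n_draws, mean = 0, sd = sd_int)
  beta  <- matrix(rnorm(n_draws * n_coef, 0, sd_beta), n_draws, n_coef)
  x     <- matrix(rnorm(n_draws * n_coef), n_draws, n_coef)  # standardized covariates
  eta   <- beta0 + rowSums(beta * x)  # linear predictor
  plogis(eta)                         # inverse logit link
}

set.seed(1)
p_wide   <- induced_prior_prob(sd_int = 10, sd_beta = 10)
p_narrow <- induced_prior_prob(sd_int = 1.5, sd_beta = 0.5)

# Share of prior mass on extreme probabilities (below 1% or above 99%).
extreme <- function(p) mean(p < 0.01 | p > 0.99)
print(c(wide = extreme(p_wide), narrow = extreme(p_narrow)))
```

Very wide priors on the coefficients tend to pile the induced prior on probabilities near 0 and 1, which is rarely what is intended; histograms of `p_wide` and `p_narrow` show the contrast directly.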