
School of Mathematics, Statistics & Physics

MAS8384

Bayesian Methodology

Project

Submission

Submit your solution via NESS by 4pm on Monday 27th February 2023. You should submit your work as a single electronic file in PDF format. This should include written work and any plots you have produced. Include your JAGS model code and the R code to run it. You do not need to include code for making plots. The page limit is 12 pages. You may include an appendix for supplementary R code, which does not count towards the page limit.

Data description

The project concerns data on cross-country race times for male athletes in the North East Harrier League (NEHL). The data have been taken from the NEHL website, www.harrierleague.com.

The data provide the race times (in minutes) for runners in NEHL races in the years 2016-18. In total there were 17 races over this period.

The data contain the following variables.

·  Number Athlete identifier.

· Age The age group of the athlete: Under 20, Senior, Veteran over 35, Veteran over 40, etc.

·  Pack The pack of the athlete: slow (S), medium (M) or fast (F). All athletes begin in the slow pack. If they finish in the top 10% of the field they are promoted to the medium pack for the rest of the year. If they finish in the top 10% of the field from the medium pack they are promoted to the fast pack for the rest of the year.

·  Course The location of the race.

· Year The year of the race.

· Temperature The temperature during the race in degrees Celsius.

· Windspeed The windspeed during the race in miles per hour.

·  Distance The distance of the race in miles.

·  Elevation The total metres of ascent during the race.

·  Response The time taken in minutes to complete the race.

The data contain almost 8000 rows. Fitting all of them would result in rather slow MCMC chains, so you will consider only a subset of 1000 rows from the data. To do so, run the following commands in R.

# Replace n with the value assigned to your student number
set.seed(n)
run = read.table("rundata.txt", header=TRUE)
# Keep a random subset of 1000 rows
run = run[sample(1:nrow(run), 1000), ]

The value for n you should use in your code is given in the table below.

Student number    n
190638344         1
180229374         2
210065657         3
220084633         4
190386674         5
190174842         6
190394543         7
220624107         8
190151430         9
190193669         10
220101620         11
190324012         12
220526111         13
210023967         14
220494935         15
220163640         16

Question


Perform a Bayesian analysis of the NEHL dataset, focusing in particular on how the different courses appear to affect the time taken to complete the races. The project should be written up as a coherent report on this problem. You will be marked on how well you apply Bayesian methods, including the interpretation of their results.

Your report should include:

·  Description of the various models you consider, your final model, and why it was selected.

· The diagnostics you use.

· Summaries of your posterior distributions for various models.

·  Conclusion. Give a brief non-technical summary of your findings.

This dataset is complicated enough that there is no single best model. There are many different ideas from the course which you can try including in your model. An initial model could include only random effects for the athlete and the course. You will receive credit for considering sensible models beyond this initial regression, even if they do not turn out to fit well.
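As a starting point, the initial model with random effects for athlete and course might be sketched as follows. This is only an illustration under stated assumptions, not a prescribed solution: the names model_string and make_jags_data are hypothetical, the vague normal and gamma priors are placeholders that you should justify or replace, and the sampler is run only if the rjags package and the sampled data frame run are available.

```r
# Sketch only: normal response with random intercepts for athlete and course.
# All priors below are illustrative assumptions, not recommended choices.
model_string <- "
model {
  for (i in 1:N) {
    # JAGS parameterises dnorm by mean and precision
    y[i] ~ dnorm(mu + a[athlete[i]] + b[course[i]], tau)
  }
  for (j in 1:n_athlete) { a[j] ~ dnorm(0, tau_a) }  # athlete effects
  for (k in 1:n_course)  { b[k] ~ dnorm(0, tau_b) }  # course effects
  mu    ~ dnorm(0, 1.0E-4)
  tau   ~ dgamma(0.01, 0.01)
  tau_a ~ dgamma(0.01, 0.01)
  tau_b ~ dgamma(0.01, 0.01)
  sigma <- 1 / sqrt(tau)
}
"

# Hypothetical helper: turn the data frame into the list JAGS expects,
# mapping athlete and course labels to consecutive integer indices.
make_jags_data <- function(run) {
  athlete <- as.integer(factor(run$Number))
  course  <- as.integer(factor(run$Course))
  list(
    y         = run$Response,
    athlete   = athlete,
    course    = course,
    N         = nrow(run),
    n_athlete = max(athlete),
    n_course  = max(course)
  )
}

# Fit only when rjags is installed and the sampled data frame exists.
if (requireNamespace("rjags", quietly = TRUE) &&
    exists("run") && is.data.frame(run)) {
  jm <- rjags::jags.model(textConnection(model_string),
                          data = make_jags_data(run), n.chains = 3)
  update(jm, 1000)  # burn-in
  post <- rjags::coda.samples(jm, c("mu", "b", "sigma"), n.iter = 5000)
}
```

If the model runs, post is a coda mcmc.list, so functions such as coda::gelman.diag(post) and coda::effectiveSize(post) can be used for diagnostics, and summary(post) gives posterior summaries of the course effects b. The burn-in and iteration counts here are arbitrary and should be chosen from your own diagnostics.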