MAS8384 Bayesian Methodology
School of Mathematics, Statistics & Physics
Project
Submission
Submit your solution via NESS by 4pm on Monday 27th February 2023. You should submit your work as a single electronic file in PDF format. This should include written work and any plots you have produced. Include your JAGS model code and R code to run it. You do not need to include code for making plots. The page limit is 12 pages. You may include an appendix for supplementary R code which does not count towards the page limit.
Data description
The project uses data on cross-country race times for male athletes in the North East Harrier League (NEHL). The data were taken from the NEHL website, www.harrierleague.com.
The data provide the race times (in minutes) for runners in NEHL races in the years 2016-18. In total there were 17 races over this period.
The data contains the following variables.
· Number Athlete identifier.
· Age The age group of the athlete: Under 20, Senior, Veteran over 35, Veteran over 40, etc.
· Pack The pack of the athlete: slow (S), medium (M) or fast (F). All athletes begin in the slow pack. If they finish in the top 10% of the field they are promoted to the medium pack for the rest of the year. If they finish in the top 10% of the field from the medium pack they are promoted to the fast pack for the rest of the year.
· Course The location of the race.
· Year The year of the race.
· Temperature The temperature during the race in degrees Celsius.
· Windspeed The wind speed during the race in miles per hour.
· Distance The distance of the race in miles.
· Elevation The total metres of ascent during the race.
· Response The time taken in minutes to complete the race.
The data contains almost 8000 rows. This will result in rather slow MCMC chains, and so you will consider only a subset of 1000 rows from the data. To do so, run the following commands in R.
set.seed(n)
run = read.table("rundata.txt", header=TRUE)
run = run[sample(1:nrow(run),1000),]
The value for n you should use in your code is given in the table below.
Student number | n
190638344 | 1
180229374 | 2
210065657 | 3
220084633 | 4
190386674 | 5
190174842 | 6
190394543 | 7
220624107 | 8
190151430 | 9
190193669 | 10
220101620 | 11
190324012 | 12
220526111 | 13
210023967 | 14
220494935 | 15
220163640 | 16
Question
Perform a Bayesian analysis of the NEHL dataset, focusing in particular on how the different courses appear to affect the time taken to complete the races. The project should be written up as a coherent report on this problem. You will be marked on how well you apply Bayesian methods, including your interpretation of their results.
Your report should include:
· A description of the various models you consider, your final model, and why it was selected.
· The diagnostics you use.
· Summaries of your posterior distributions for various models.
· Conclusion. Give a brief non-technical summary of your findings.
This dataset is complicated enough that there is no single best model. There are many different ideas from the course which you can try including in your model. An initial model could include only random effects for the athlete and the course. You will receive credit for considering sensible models beyond this initial model, even if they do not turn out to fit well.
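One possible way of writing down an initial model with random effects for athlete and course is sketched below. The notation is not part of the brief and is assumed for illustration only: y_i is the Response in row i, j[i] and k[i] index the athlete (Number) and course (Course) for that row, J and K are the numbers of distinct athletes and courses in the subset, and the normal likelihood is an assumption rather than a recommended choice.

```latex
\begin{align*}
y_i \mid \mu, a, c, \sigma &\sim \mathrm{N}\!\left(\mu + a_{j[i]} + c_{k[i]},\ \sigma^2\right), & i &= 1,\dots,1000,\\
a_j \mid \sigma_a &\sim \mathrm{N}\!\left(0,\ \sigma_a^2\right), & j &= 1,\dots,J,\\
c_k \mid \sigma_c &\sim \mathrm{N}\!\left(0,\ \sigma_c^2\right), & k &= 1,\dots,K.
\end{align*}
```

Priors for \mu, \sigma, \sigma_a and \sigma_c would still need to be chosen and justified. Note that JAGS parameterises dnorm by mean and precision, so each variance above would enter the model code as its reciprocal.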
2023-02-21