
MATH3026 & MATH4022

Published: 2021-04-01


Time Series 2020/21


MATH3026 students: Question 1 is ASSESSED and is worth 10% of the total assessment for MATH3026. Coursework should be submitted via the coursework submission area on the Moodle page by Thursday 29 April, 3pm. There is a grace period of 5 working days, so penalties will not apply if the work is submitted before Friday 7 May, 3pm. Work submitted after this time will receive zero. You should submit a single html file produced by R Markdown in RStudio. Please ensure you have comments at the start of the document stating the aims of the report, comments before and after chunks of R code, and comments at the end summarising your findings.


MATH4022 students: Questions 1 and 2 are both ASSESSED and are worth 10% each. Coursework should be submitted via the coursework submission area on the Moodle page by Thursday 29 April, 3pm. There is a grace period of 5 working days, so penalties will not apply if the work is submitted before Friday 7 May, 3pm. Work submitted after this time will receive zero. You should submit:

1. one html file that has been produced by R Markdown in R Studio for Question 1,

2. one html file produced by R Markdown for the Statistical Report in Question 2,

3. one pdf or Word file for the Executive Summary for Question 2.

    For each html file please ensure you have comments at the start of the document about what the aims of the report are, comments before and after chunks of R code, and comments at the end giving a summary of your findings.


    Plagiarism and Academic Misconduct. For all assessed coursework it is very important that you submit your own work. Some information about plagiarism is given on the Moodle webpage.


1. Analyse the Nile time series dataset which is available in R.

Write a short report in html from R Markdown describing the identification, estimation and checking of a suitable model for the Nile dataset. Provide appropriate graphs and clearly justify your choice of model. Note that there is no unique correct answer for this problem.
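As a starting point only, the identification, estimation and checking steps might be sketched in base R as below. The order c(0, 1, 1) and the list of alternative candidates are assumptions made purely for illustration, not a suggested answer; your own choice of model should follow from the plots and diagnostics you produce.

```r
# Illustrative sketch only: the orders used here are assumptions for
# demonstration, not the "correct" model for the Nile data.
data(Nile)  # annual Nile flow series that ships with R

# Identification: inspect the series and its sample ACF/PACF
plot(Nile, ylab = "Annual flow", main = "Nile")
acf(Nile)
pacf(Nile)
acf(diff(Nile))  # check whether first differencing removes slow decay

# Estimation: fit one candidate model by maximum likelihood
fit <- arima(Nile, order = c(0, 1, 1), method = "ML")
print(fit)

# Checking: residual diagnostics plus a Ljung-Box test
# (fitdf = 1 because one MA coefficient was estimated)
tsdiag(fit)
lb <- Box.test(residuals(fit), lag = 10, type = "Ljung-Box", fitdf = 1)
print(lb)

# Comparison: AIC for a few candidates that share the same differencing
# order, so that the AIC values are comparable with each other
candidates <- list(c(0, 1, 1), c(1, 1, 0), c(1, 1, 1))
aics <- sapply(candidates, function(ord) {
  AIC(arima(Nile, order = ord, method = "ML"))
})
names(aics) <- sapply(candidates, paste, collapse = ",")
print(aics)
```

In an R Markdown report each of these steps would normally sit in its own chunk, with a comment before and after explaining what the output shows.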

In case it is useful you can use LaTeX in the comment areas between R chunks (in the white areas), e.g.

$X_t - 0.8 X_{t-1} = Z_t - 0.2 Z_{t-1}$

becomes a typeset equation in the knitted html file.


2. (MATH4022 only) You work as a statistician for an environmental agency and your task is to provide a Statistical Report and an Executive Summary for an analysis of a dataset of lake levels for Harbor Beach. Harbor Beach is a town on Lake Huron, Michigan in the US, and these data are noisy versions of lake level data obtained from the National Ocean Service. The aim of the project is to forecast the lake level height above mean sea level (HMSL) for the next 24 months, and to provide uncertainty information. The data are available on the Moodle page in the file HarborBeach.txt and consist of monthly HMSL values starting in January 1900 and ending in January 2021. The HarborBeach.txt file contains a single time series of length n = 1453 presented in row order. In particular the first few values are 579.007 579.2817 578.9298 579.3969 578.7382 579.3947 578.6572 ...
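One way to load a file laid out like this is sketched below, assuming the values are separated by whitespace. The helper name read_lake_levels is hypothetical and introduced only for illustration. The sketch also checks that January 1900 to January 2021 inclusive does give 1453 monthly values.

```r
# Sketch: read a whitespace-separated series stored in row order and
# turn it into a monthly ts object. `read_lake_levels` is a hypothetical
# helper name, not something supplied with the coursework.
read_lake_levels <- function(path) {
  x <- scan(path, quiet = TRUE)              # scan() reads values row by row
  ts(x, start = c(1900, 1), frequency = 12)  # monthly, from January 1900
}

# Number of months from January 1900 to January 2021 inclusive
n_expected <- (2021 - 1900) * 12 + 1
print(n_expected)

# Usage, assuming HarborBeach.txt is in the working directory:
# hb <- read_lake_levels("HarborBeach.txt")
# stopifnot(length(hb) == n_expected)
# plot(hb, ylab = "HMSL")
```

Setting frequency = 12 lets R's time series functions treat the data as monthly, which matters for seasonal plots and for labelling the forecast horizon correctly.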

The Statistical Report should be produced in html from R Markdown and should be suitable for your peer statisticians to read and understand. Please ensure that you explain clearly what you are aiming to do and that you present, interpret and discuss your results in a professional manner. It is important to explain with comments how you (i) analysed the data (ii) fitted the models (iii) performed model selection (iv) checked the modelling assumptions and (v) calculated the forecasts.
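For step (v), forecasts with uncertainty information could be computed along the following lines. This is a minimal sketch: forecast_interval is a hypothetical helper, the built-in Nile series stands in for the lake level data, and the order c(0, 1, 1) is an assumption for demonstration. The intervals assume Gaussian errors and do not account for uncertainty in the estimated parameters.

```r
# Sketch: h-step-ahead forecasts with approximate prediction intervals.
# `forecast_interval` is a hypothetical helper. The intervals assume
# Gaussian errors and ignore parameter-estimation uncertainty.
forecast_interval <- function(fit, h = 24, level = 0.95) {
  p <- predict(fit, n.ahead = h)    # list with $pred and $se
  z <- qnorm(1 - (1 - level) / 2)   # e.g. about 1.96 for level = 0.95
  data.frame(
    time     = as.numeric(time(p$pred)),
    forecast = as.numeric(p$pred),
    lower    = as.numeric(p$pred - z * p$se),
    upper    = as.numeric(p$pred + z * p$se)
  )
}

# Usage with a stand-in series (the built-in Nile data) and an assumed order
fit_demo <- arima(Nile, order = c(0, 1, 1), method = "ML")
fc <- forecast_interval(fit_demo, h = 24)
print(head(fc))

# Plot the forecasts with their interval limits
matplot(fc$time, fc[, c("forecast", "lower", "upper")], type = "l",
        lty = c(1, 2, 2), col = 1, xlab = "Time", ylab = "Forecast")
```

Returning a data frame keeps the point forecasts and interval limits together, which makes it straightforward to produce a single plot or a short table rather than long numerical output.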

The Executive Summary should be a maximum of one page in length, consisting of text and at most one plot. It must be suitable for a broad set of non-statistician readers to understand, including the CEO of the agency.


    In your reports you should aim to convey the important details in a way which is easy to follow, but not excessively long. Think about someone reading it through and try to help make it easy for them. Make it clear, without too much repetition, and avoid long items of numerical output.


Grading. Each question will be marked out of 10:

• 5 marks for technical content, use of R and appropriate methods

• 5 marks for presentation and interpretation of results