MTHM505 Data Science And Statistical Modelling In Space And Time
Data Science And Statistical Modelling In Space And Time
REF/DEF Assessment - Practical Modelling exercises
Section A consists of spatial modelling questions, and Section B consists of time series modelling questions. Commented R code (and the outcomes/plots) should be part of your answers.
This assessment is worth 50% of the module mark.
You should submit a single PDF containing your answers to Sections A and B to the BART submission point.
The deadline for submission is 12 noon, 8th August.
A. Spatial modelling [100 marks]
You have just started work at an oceanographic consultancy. You are asked to interpolate a set of sea surface temperature data for one month in the Kuroshio region off the coast of Japan onto a grid with a resolution of 0.5° in both the E and N directions. We are going to assume a flat Earth!
The data are in the file kuroshio.csv. You are also provided with an R program to read the data (readkuro.R).
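As a rough illustration of what "a grid with a resolution of 0.5°" means under the flat-Earth assumption, the sketch below builds the prediction locations as plain Cartesian (E, N) pairs. The assessment itself expects R; this Python version only shows the logic. The function name `make_grid` and the bounding box are hypothetical and are not taken from kuroshio.csv; in practice the box would come from the range of the observed coordinates.

```python
def make_grid(e_min, e_max, n_min, n_max, step=0.5):
    """Return the (E, N) nodes of a regular grid covering a bounding box.

    Coordinates are treated as flat Cartesian values, so no map
    projection or great-circle correction is applied.
    """
    n_e = int(round((e_max - e_min) / step)) + 1
    n_n = int(round((n_max - n_min) / step)) + 1
    eastings = [e_min + i * step for i in range(n_e)]
    northings = [n_min + j * step for j in range(n_n)]
    # Row-major order: all E values for the first N, then the next N, ...
    return [(e, n) for n in northings for e in eastings]


# Hypothetical bounding box, used only to exercise the function.
grid = make_grid(140.0, 145.0, 30.0, 35.0, step=0.5)
print(len(grid), grid[0], grid[-1])
```

The resulting list of locations plays the role of the prediction grid at which the expected value and variance are later evaluated.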
Analyse the data and answer the following questions (indicative marks are given).
1. Produce numerical and graphical summaries of the data. Comment on your findings and highlight any potential outliers in the data. [10 marks]
2. Check for isotropy (the function variog4 in geoR may be useful). Do you need a trend in the model? [20 marks]
3. Decide what spatial model you want to fit. You may want to try several and see which one fits best. Estimate the parameters of your chosen model by Maximum Likelihood and plot the expected value and variance for the estimate on the required grid. Validate your model or models. [35 marks]
4. Repeat 3 but use Bayesian methods. Show your priors and the ensuing posteriors. Consider different priors and models and justify your choice of the final model. Illustrate your results by plotting the mean and variance fields as well as some samples from the posterior fields. [25 marks]
5. Comment on the differences between the two methods of estimation, and on their advantages and disadvantages. [10 marks]
Note: fitting Gaussian processes becomes significantly more expensive as the number of data points increases. You may want to consider fitting models to a subset of the data for computational efficiency (consider how you might want to split the data, and how you might use the left-out data).
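One possible reading of the note above is a random hold-out split: fit on a subset and score predictions against the left-out points. The sketch below is a minimal illustration in Python (the assessment itself expects R). The names `split_holdout` and `rmse`, the 70/30 split, and the synthetic rows are all assumptions made for the example, not part of the assessment or of kuroshio.csv.

```python
import math
import random


def split_holdout(rows, train_fraction=0.7, seed=1):
    """Randomly split rows into a training set and a held-out set."""
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    n_train = int(round(train_fraction * len(rows)))
    train = [rows[i] for i in indices[:n_train]]
    holdout = [rows[i] for i in indices[n_train:]]
    return train, holdout


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Synthetic placeholder rows (index, value); not real observations.
rows = [(i, 0.5 * i) for i in range(100)]
train, holdout = split_holdout(rows)
print(len(train), len(holdout))
```

The held-out rows can then be compared with model predictions at the same locations, for example through `rmse`, as one form of validation. A purely random split is only one choice; a spatially structured split may be more appropriate depending on how the observations are distributed.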
B. Time series modelling [100 marks]
1. The figures labelled A to E show five time series whose defining equations are given below.
i) $x_t = 0.8x_{t-1} + e_t$,
ii) $x_t = e_t - 0.5e_{t-1}$,
iii) $x_t = 2x_{t-1} - x_{t-2} + e_t + 0.5e_{t-1} + 0.4e_{t-2}$,
iv) $x_t = e_t + 0.1(250 - t)e_{t-1}$,
v) $x_t = x_{t-1} + 0.9e_{t-1} + e_t$.
In each case, $e_t \sim N(0, 1)$.
State, with reasons, which equation corresponds to which plot. [10 marks]
[Figures A to E: five time series plots, each drawn against Time from 0 to 250.]
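To get a feel for how processes of the kinds listed in i) to v) behave, one can simulate them from a shared noise sequence. The sketch below is a minimal Python illustration (the assessment itself expects R). It assumes that the garbled signs in the extracted equations are minus signs, that t runs from 1 to 250, and that all pre-sample values of the series are zero; none of these start-up choices is stated in the assessment. The function name `simulate` is hypothetical.

```python
import random


def simulate(n=250, seed=0):
    """Simulate the five processes i) to v) from one N(0, 1) noise sequence."""
    rng = random.Random(seed)
    # e[0] and e[1] are pre-sample noise values; e[s] is the noise at t = s - 1.
    e = [rng.gauss(0.0, 1.0) for _ in range(n + 2)]
    # Two zero pre-sample values per series (an assumed start-up condition).
    x = {key: [0.0, 0.0] for key in ("i", "ii", "iii", "iv", "v")}
    for s in range(2, n + 2):
        t = s - 1  # time index 1..n
        x["i"].append(0.8 * x["i"][s - 1] + e[s])
        x["ii"].append(e[s] - 0.5 * e[s - 1])
        x["iii"].append(2 * x["iii"][s - 1] - x["iii"][s - 2]
                        + e[s] + 0.5 * e[s - 1] + 0.4 * e[s - 2])
        x["iv"].append(e[s] + 0.1 * (250 - t) * e[s - 1])
        x["v"].append(x["v"][s - 1] + 0.9 * e[s - 1] + e[s])
    # Drop the two pre-sample values so each series has length n.
    return {key: values[2:] for key, values in x.items()}


series = simulate()
for key, values in series.items():
    print(key, round(min(values), 2), round(max(values), 2))
```

Plotting each simulated series and comparing its range and smoothness across a few seeds gives a sense of the qualitative differences between the five equations.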
2. The ACF and PACF are plotted below for 5 different series. Suggest appropriate ARMA models for each (A, B, C, D, E), giving reasons for your choice in each case. [10 marks]
[Figures: ACF and PACF plots for Series A, B, C, D and E, each drawn against Lag from 0 to 30.]
3. The data for this assignment are the measured strength of the overturning in the North Atlantic from moorings at 26°N between April 2004 and March 2014, found in the file overturning.csv.
a. Average the data to quarterly means. Produce numerical and graphical summaries of the averaged data, and comment on your findings and highlight any potential outliers. You might find it useful to convert the averaged data to a time series object ts(). [10 marks]
b. Fit an ARMA and an ARIMA model to the data. Choose the most appropriate model, and use this to predict the values for the six 3-month periods from April 2014 to September 2015. [30 marks]
c. Fit a DLM to the data (including both a trend and a seasonal component). Use your model to predict the values for April 2014 to September 2015. [30 marks]
d. Compare the results of parts b and c, and comment on any differences you may find. [10 marks]
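The quarterly averaging in part a, and the six forecast periods in parts b and c, can be sketched as follows. This is a minimal Python illustration (the assessment itself expects R). It assumes each record can be reduced to a (year, month, value) triple; the actual layout of overturning.csv is not described here, and the demo values are made up. The names `quarterly_means` and `next_quarters` are hypothetical. Calendar quarters are used, so April to June is quarter 2.

```python
from collections import defaultdict


def quarterly_means(records):
    """Average (year, month, value) records into calendar-quarter means."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for year, month, value in records:
        key = (year, (month - 1) // 3 + 1)  # months 4-6 map to quarter 2
        sums[key] += value
        counts[key] += 1
    return [(key, sums[key] / counts[key]) for key in sorted(sums)]


def next_quarters(last, n):
    """List the n (year, quarter) pairs that follow the quarter `last`."""
    year, quarter = last
    out = []
    for _ in range(n):
        quarter += 1
        if quarter > 4:
            year, quarter = year + 1, 1
        out.append((year, quarter))
    return out


# Made-up values purely for illustration; not from overturning.csv.
demo = [(2004, 4, 16.0), (2004, 5, 18.0), (2004, 6, 20.0),
        (2004, 7, 15.0), (2004, 8, 17.0)]
means = quarterly_means(demo)
print(means)  # [((2004, 2), 18.0), ((2004, 3), 16.0)]

# The last observed quarter is January to March 2014, i.e. (2014, 1).
horizon = next_quarters((2014, 1), 6)
print(horizon)
```

With data running from April 2004 to March 2014, this grouping yields 40 quarterly values, and the six-step horizon runs from the second quarter of 2014 to the third quarter of 2015.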
2022-07-23