Section B) Coursework Brief and Requirements

Overview

This coursework focuses on applying the methods learned during the course (weeks one to five only) to the time series data provided. You should structure your analysis according to the brief below. A copy of the data is available in the coursework section of the course Moodle page. The mark will be awarded by considering the methods, analysis, reasoning and conclusions as a whole (see assessment criteria). You should prepare a short report explaining the analysis you conducted in response to the brief, the results, and the conclusions you draw from them. Please note the following:

• There is a word limit of 2000 words for the report. A page of figures (graphs/tables) will count as 300 words, but you can include 3 figures without any contribution to the word limit. You are not required to reach the word limit.

• Robustness checks can be included in an appendix (excluded from the word limit) and referred to if need be.

• There is no particular mark for the individual elements of the coursework.

• You may use Stata or Python to conduct the analysis. If you wish to use alternative software, contact Corrado for permission and include a justification.

• You should provide code as part of your submission. Your code should be attached to the end of your report and annotated (i.e. the annotations explain what you are doing). If you use inbuilt commands in Stata rather than writing a script, save a log file to track what you have done. The code section does not count against the word limit. Your skill with coding is not being judged; we request code to verify the consistency between what you say you do in the report and what you actually did.

You are strongly advised to present the outputs from your analysis in a coherent and readable fashion in order to stay within the word limit. The efficient use of plots and tables to illustrate results concisely will be rewarded during marking.

Reminder:  This is an individual coursework. Any unsanctioned sharing of code or results is academic misconduct and you will be penalised. Similar submissions among students are obvious at the marking stage.

The exercise

The dataset accompanying the coursework contains two time series at a daily frequency from 30th July 2020 to 15th December 2022: (i) the number of bikes hired as part of Transport for London’s bike hire scheme and (ii) average daytime temperatures in London in degrees Celsius. The data on the number of bikes hired has been adjusted to take into account effects arising from the day of the week and the season.

You may take this data as given. To keep the coursework within a reasonable scope, there is no expectation that you should augment the data with additional sources. However, you may be asked to, or wish to, define additional variables based on this data (for example, a dummy variable for the month the observation arises from). Given this data, here is the brief:

• Take the natural logarithm of the number of bikes hired and explain why this transformation may make analysing the time series easier. Do you think the temperature and logged cycle hire series are covariance-stationary? Justify your answer and propose a transformation to render the series stationary if needed.
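As a purely illustrative sketch of the mechanics (on synthetic data, not the coursework data), the Python snippet below takes logs, takes first differences, and compares sample autocorrelations of the level and the difference. The helper `sample_acf` and all variable names are hypothetical, and the synthetic series is a random walk by construction, so it says nothing about whether the real series are stationary. A formal unit root test would normally complement this kind of visual or descriptive check.

```python
import numpy as np

def sample_acf(series, max_lag):
    """Sample autocorrelations at lags 1..max_lag."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[k:], x[:-k]) / denom for k in range(1, max_lag + 1)])

# Synthetic stand-in for daily bike hires (NOT the coursework data):
# a random walk in logs, so the level is non-stationary by construction.
rng = np.random.default_rng(0)
log_hires_true = 10.0 + np.cumsum(rng.normal(0.0, 0.05, size=800))
hires = np.exp(log_hires_true)

log_hires = np.log(hires)          # natural logarithm of the raw counts
d_log_hires = np.diff(log_hires)   # first difference, approximately a daily growth rate

acf_level = sample_acf(log_hires, 5)
acf_diff = sample_acf(d_log_hires, 5)
print("ACF of log level:       ", np.round(acf_level, 2))
print("ACF of first difference:", np.round(acf_diff, 2))
```

Autocorrelations that stay close to one and decay very slowly are a typical symptom of non-stationarity, while autocorrelations that die out quickly are consistent with a stationary series.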

Denote xt as the (potentially transformed) averaged temperature and yt as the (potentially transformed) logarithm of bike hires.

• What AR(p) model would you select to fit yt? Consider only p ≤ 10. Estimate your preferred model, comment on the parameter estimates and whether they affect your answer to the previous question.
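One possible way to organise the order selection is sketched below in Python, assuming the Bayesian information criterion (BIC) as the selection rule; the brief does not prescribe a criterion, and AIC or sequential testing are equally legitimate. The functions `fit_ar_ols` and `bic` are hypothetical helpers, and the series is a synthetic AR(2), not the coursework data. Every candidate order is estimated on the same observations so that the criteria are comparable.

```python
import numpy as np

def fit_ar_ols(y, p, max_p):
    """OLS fit of an AR(p) with intercept, using observations max_p..T-1 so
    that every candidate order is compared on the same sample."""
    y = np.asarray(y, dtype=float)
    target = y[max_p:]
    cols = [np.ones(len(target))]
    for lag in range(1, p + 1):
        cols.append(y[max_p - lag:len(y) - lag])   # y lagged by `lag` periods
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    return beta, resid

def bic(resid, n_params):
    """Gaussian BIC up to an additive constant."""
    n = len(resid)
    sigma2 = np.dot(resid, resid) / n
    return n * np.log(sigma2) + n_params * np.log(n)

# Synthetic AR(2) series (NOT the coursework data).
rng = np.random.default_rng(1)
T = 1000
y = np.zeros(T)
for t in range(2, T):
    y[t] = 0.5 * y[t - 1] + 0.3 * y[t - 2] + rng.normal()

max_p = 10
scores = {}
for p in range(1, max_p + 1):
    _, resid = fit_ar_ols(y, p, max_p)
    scores[p] = bic(resid, p + 1)   # p AR coefficients plus the intercept

best_p = min(scores, key=scores.get)
beta_best, _ = fit_ar_ols(y, best_p, max_p)
print("BIC-preferred order:", best_p)
print("intercept and AR coefficients:", np.round(beta_best, 3))
```

When commenting on the estimates, the sum of the AR coefficients is a useful summary of persistence: a sum close to one suggests behaviour close to a unit root.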

Now we turn to the link between temperatures (xt) and bike hiring (yt).

• Define a set of dummies zi,t where i = 1, . . . , 12 denotes months in the year and zi,t takes a value of 1 if observation t came from month i and zero otherwise. In other words, define twelve dummy variables, one for each month of the year. Regress xt on these twelve zi,t dummies and compute the residuals, which you can denote x̂t. What is the interpretation of x̂t?
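The construction of the dummies and the residuals can be sketched as follows in Python. The dates match the sample period stated above, but the temperature series is synthetic (not the coursework data) and all variable names are hypothetical. No intercept is included, since the twelve dummies already sum to one for every observation and adding a constant would make the regressors perfectly collinear.

```python
import numpy as np
from datetime import date, timedelta

# Daily dates matching the sample period stated in the brief.
start, end = date(2020, 7, 30), date(2022, 12, 15)
dates = [start + timedelta(days=d) for d in range((end - start).days + 1)]
months = np.array([d.month for d in dates])

# Synthetic temperatures (NOT the coursework data): seasonal cycle plus noise.
rng = np.random.default_rng(2)
day_of_year = np.array([d.timetuple().tm_yday for d in dates])
x = (12.0 - 8.0 * np.cos(2 * np.pi * (day_of_year - 20) / 365.25)
     + rng.normal(0.0, 3.0, len(dates)))

# Z[t, i-1] = 1 if observation t falls in month i, else 0.
Z = (months[:, None] == np.arange(1, 13)[None, :]).astype(float)

gamma, *_ = np.linalg.lstsq(Z, x, rcond=None)   # one coefficient per month
x_hat = x - Z @ gamma                            # residuals from the regression

# With a full set of dummies and no intercept, OLS returns the month means.
month_means = np.array([x[months == m].mean() for m in range(1, 13)])
print("coefficients equal month means:", np.allclose(gamma, month_means))
```

The comparison at the end is a quick sanity check on the dummy construction: each residual is the observation minus the sample mean of its calendar month.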

• Using a time series regression model, estimate the dynamic response, up to 7 days ahead, of the number of bikes hired to a 1 degree improvement in temperatures. You should use x̂t for the analysis. Add 90% confidence intervals to your estimates. You should consider the following:

– The appropriate controls, based on the lags of x̂t and yt, to include.

– The reasoning why your control set is weakly exogenous.

– The appropriate standard errors to use in the regression.

– How to interpret your regression coefficients in a manner that allows you to correctly compute the dynamic response of log bike hires to temperatures.

You should explain your choices and reasoning based on your judgement and the data.
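One possible approach, which the brief does not prescribe, is a set of local projections: for each horizon h = 0, . . . , 7, regress y at t + h on x̂t plus lags of x̂t and yt, and read the response off the coefficient on x̂t. The Python sketch below assumes this approach, uses Newey-West (Bartlett kernel) standard errors with a bandwidth of h + 1 (an arbitrary illustrative choice), two lags of each control (also arbitrary), and the normal critical value 1.645 for 90% intervals. The functions `ols_newey_west` and `local_projection` are hypothetical helpers, and the data are synthetic, not the coursework data.

```python
import numpy as np

def ols_newey_west(X, y, n_lags):
    """OLS coefficients with Newey-West (Bartlett kernel) standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ beta
    Xu = X * u[:, None]                    # rows are x_t * u_t
    S = Xu.T @ Xu                          # lag-0 term
    for j in range(1, n_lags + 1):
        w = 1.0 - j / (n_lags + 1.0)       # Bartlett weight
        G = Xu[j:].T @ Xu[:-j]             # lag-j autocovariance term
        S += w * (G + G.T)
    XtX_inv = np.linalg.inv(X.T @ X)
    cov = XtX_inv @ S @ XtX_inv            # sandwich covariance estimate
    return beta, np.sqrt(np.diag(cov))

def local_projection(y, x, horizon, n_controls):
    """Regress y[t+horizon] on x[t] plus n_controls lags of x and y.
    Returns the coefficient on x[t] and its standard error."""
    T = len(y)
    t_idx = np.arange(n_controls, T - horizon)
    cols = [np.ones(len(t_idx)), x[t_idx]]
    for lag in range(1, n_controls + 1):
        cols.append(x[t_idx - lag])
        cols.append(y[t_idx - lag])
    X = np.column_stack(cols)
    beta, se = ols_newey_west(X, y[t_idx + horizon], n_lags=horizon + 1)
    return beta[1], se[1]

# Synthetic data (NOT the coursework data): by construction the response of
# y to a unit change in x_hat at horizon h is 0.02 * 0.6**h.
rng = np.random.default_rng(3)
T = 1500
x_hat = rng.normal(0.0, 3.0, T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.6 * y[t - 1] + 0.02 * x_hat[t] + rng.normal(0.0, 0.1)

z90 = 1.645   # two-sided 90% normal critical value
responses = []
for h in range(8):
    b, se = local_projection(y, x_hat, h, n_controls=2)
    responses.append((h, b, b - z90 * se, b + z90 * se))
    print(f"h={h}: response {b:.4f}, 90% CI [{b - z90 * se:.4f}, {b + z90 * se:.4f}]")
```

Because yt is in logs, a coefficient b is approximately a 100 × b percent change in bike hires per degree, provided yt enters in levels of the log. If yt has been differenced, the horizon-specific coefficients need to be cumulated to recover the response of the log level.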

• Broadly consider the following question: Is the relationship between temperatures and bike hiring different between summer and winter? Use your data and the model from the previous question to assess this.
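One way to approach this is to interact x̂t with a season dummy, sketched below in Python for the contemporaneous response only. The season definitions (June to August as summer, December to February as winter) are an assumption, since the brief does not define them, and the synthetic data (not the coursework data) have a larger summer response by construction, so the output says nothing about the real relationship. All names are hypothetical. The same interaction can be carried through each horizon of the model from the previous question, with standard errors chosen on the same grounds.

```python
import numpy as np
from datetime import date, timedelta

start, end = date(2020, 7, 30), date(2022, 12, 15)
dates = [start + timedelta(days=d) for d in range((end - start).days + 1)]
months = np.array([d.month for d in dates])
T = len(dates)

# Assumed season definitions (hypothetical; not given in the brief).
summer = np.isin(months, [6, 7, 8])
winter = np.isin(months, [12, 1, 2])

# Synthetic data (NOT the coursework data) with a larger summer response.
rng = np.random.default_rng(4)
x_hat = rng.normal(0.0, 3.0, T)
effect = np.where(summer, 0.03, 0.01)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + effect[t] * x_hat[t] + rng.normal(0.0, 0.1)

# Build regressors on the full sample so that lags stay aligned with the
# calendar, then keep only the summer and winter observations.
t_idx = np.arange(1, T)
keep = (summer | winter)[t_idx]
s = summer[t_idx].astype(float)
X = np.column_stack([np.ones(T - 1), s, x_hat[t_idx], s * x_hat[t_idx], y[t_idx - 1]])
beta, *_ = np.linalg.lstsq(X[keep], y[t_idx][keep], rcond=None)

winter_response = beta[2]             # response when the summer dummy is 0
summer_response = beta[2] + beta[3]   # winter response plus the interaction
print(f"winter: {winter_response:.4f}, summer: {summer_response:.4f}")
```

A test of whether the interaction coefficient is zero is then a test of whether the two seasons share the same response.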