MDS5130 Project

Due date: May 9, 2023

• Outstanding projects will be invited to give a presentation on April 26, 2023. Students who give a presentation can receive a maximum of 10 bonus points on their final exam.

• Students who want to present their work need to submit the project by April 19, 2023. Submissions after April 19 will not be invited for presentation. All students can revise their work before May 9, 2023.

• The submitted code must be clearly written in an R file that outputs the MSE.

• A report describing your analysis is required.

1 Background

In this project, we will analyze a dataset about horse racing. We begin with a brief introduction to horse racing. In a particular race, there are 14 horses. Before a particular time $t_{\text{final}}$, people are allowed to bet on which horse will win the race. Let $b_i(t)$ be the total amount bet on horse $i$ at time $t$. Note that $b_i(t)$ is increasing before $t_{\text{final}}$. After the race, an amount $b_i(t_{\text{final}})$ has been bet on horse $i$ for $i = 1, \ldots, 14$. If horse $I$ wins the race, people who bet on horse $I$ receive the dividend

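$$d_I^{(f)} = d_I(t_{\text{final}}) = (1 - \nabla)\,\frac{\sum_{j=1}^{n} b_j(t_{\text{final}})}{b_I(t_{\text{final}})}$$

for each dollar bet, where $n = 14$ is the number of horses and $\nabla = 0.175$ is the percentage track-take. Note that the dividends

$$d_i(t) = (1 - \nabla)\,\frac{\sum_{j=1}^{n} b_j(t)}{b_i(t)}$$

for horse $i$, $i = 1, \ldots, 14$, are known to all bettors at time $t < t_{\text{final}}$. Since $b_i(t)$ varies over time, so does $d_i(t)$.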
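To make the dividend formula concrete, here is a minimal R sketch that computes the dividends at a single time point. The pool amounts in `b` are made-up numbers for a 4-horse toy race, not values from the project data, and the helper name `dividends` is hypothetical.

```r
# Track-take from the text.
take <- 0.175

# Hypothetical helper: dividend per dollar bet on each horse, given the
# amounts b currently bet on each horse.
dividends <- function(b, take = 0.175) {
  (1 - take) * sum(b) / b
}

# Made-up pool amounts for a 4-horse toy race (the project races have 14 slots).
b <- c(400, 300, 200, 100)
d <- dividends(b, take)
print(d)

# The implied probabilities 1/d sum to 1 / (1 - take), not to 1,
# because the track keeps a share of the pool.
print(sum(1 / d))
```

The less money a horse attracts, the larger its dividend: the horse with 100 in the pool pays four times as much per dollar as the horse with 400.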

Now suppose we have some insider information and believe that we know the “true” winning probability $\pi_i$ of each horse $i$. We will only bet on horse $i$ if the expected return per dollar, $\pi_i d_i^{(f)}$, is greater than 1, so one betting strategy is to bet on horse $i$ if $d_i^{(f)} > 1/\pi_i$. However, we do not know $d_i^{(f)}$ at the time we bet ($t_{\text{bet}}$). Let $b_i = b_i(t_{\text{bet}})$ and $d_i = d_i(t_{\text{bet}})$, let $f_i W$ be the amount we bet on horse $i$ at $t_{\text{bet}}$, and let $C_i$ be the amount bet on horse $i$ by other parties after $t_{\text{bet}}$. Then we have

$$d_i^{(f)} = (1 - \nabla)\,\frac{\sum_{j=1}^{n} (b_j + C_j + f_j W)}{b_i + C_i + f_i W}.$$
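As a rough illustration of how late money and our own bet change the final dividend, the following R sketch evaluates the final dividends and the betting rule $\pi_i d_i^{(f)} > 1$. All numbers are invented for a 4-horse toy race, and the names `final_dividends`, `fW` and `p_win` are hypothetical.

```r
take <- 0.175

# Hypothetical helper: final dividends when the pool at t_bet is b, other
# parties add C afterwards, and we add our own bets fW.
final_dividends <- function(b, C, fW, take = 0.175) {
  total <- b + C + fW                 # final amount bet on each horse
  (1 - take) * sum(total) / total
}

b     <- c(400, 300, 200, 100)        # made-up pool at t_bet
C     <- c(200, 150, 100,  50)        # made-up late money from other parties
fW    <- c(  0,   0,   0,  50)        # our own bet: 50 on horse 4
p_win <- c(0.40, 0.30, 0.10, 0.20)    # made-up "true" winning probabilities

d_final <- final_dividends(b, C, fW, take)

# A bet is worthwhile only if the expected return per dollar exceeds 1.
worth_betting <- p_win * d_final > 1
print(round(d_final, 4))
print(worth_betting)
```

In this toy race only horse 4 passes the rule. Its final dividend depends on the late money `C`, which is not observed at $t_{\text{bet}}$.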

The unknown quantities here are $C_i$ for $i = 1, \ldots, 14$. In this project, your task is to estimate $C_{\text{sum}} = \sum_{i=1}^{14} C_i$ before $t_{\text{bet}}$.

2 Data

The datasets “data20XX.RData” with XX=14,15,16,17,18 are given. They all have the same set of column names, which are

• ID: It has the form “yyyymmddrr”, which means year yyyy, month mm, day dd, race rr. Note that there is more than one race on each day, and the number of races can differ from day to day.

• WIN POOL.x: The total amount in the pool at time $t_{\text{bet}}$.

• WIN POOL.y: The total amount in the pool at time $t_{\text{final}}$. Hence $C_{\text{sum}}$ is the difference between WIN POOL.y and WIN POOL.x.

• WIN TAKE.x: $\nabla = 0.175$. It is the same as WIN TAKE.y.

• WIN ODDS i.x: $d_i = d_i(t_{\text{bet}})$. If it is 0, horse $i$ was not actually in the race.

• WIN ODDS i.y: $d_i^{(f)} = d_i(t_{\text{final}})$. If it is 0, horse $i$ was not actually in the race.

• WIN MODEL i.x: “True” winning probability $\pi_i$. If it is 0, horse $i$ was not actually in the race. It is the same as WIN MODEL i.y.

• WIN TIME.y: The “yyyymmdd” part of ID.

• WIN NUMBER.y: The “rr” part of ID.
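The column descriptions above can be sketched in R as follows. The data frame `races` is a made-up stand-in for one of the yearly datasets. The column names `WIN_POOL.x` and `WIN_POOL.y` are an assumption (the space in the handout is written here as an underscore), so check `names()` on the loaded data before relying on them.

```r
# Toy stand-in for one yearly data frame. Column names are assumed, and the
# pool amounts are invented for illustration.
races <- data.frame(
  ID = c("2017010101", "2017010102", "2017010401"),
  WIN_POOL.x = c(1000000, 1200000, 900000),   # pool at t_bet
  WIN_POOL.y = c(1500000, 1900000, 1300000),  # pool at t_final
  stringsAsFactors = FALSE
)

# Split the ID "yyyymmddrr" into its date part and its race-number part.
id <- as.character(races$ID)
races$date <- as.Date(substr(id, 1, 8), format = "%Y%m%d")
races$race <- as.integer(substr(id, 9, 10))

# C_sum is the money that entered the pool after t_bet.
races$C_sum <- races$WIN_POOL.y - races$WIN_POOL.x
print(races[, c("date", "race", "C_sum")])
```

Note that `C_sum` is built from a `.y` column, so it can serve as the target for training on past races but must not be used as a predictor for the race being forecast.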

In this project, you are required to forecast $C_{\text{sum}}$ for each race in data2018.RData. Note that you MUST use only the information available BEFORE $t_{\text{bet}}$ to forecast $C_{\text{sum}}$ for a particular race. Let $N$ be the total number of races in 2018, $R_{2018}$ be the set of all races in 2018, $x_r$ be the true $C_{\text{sum}}$ of race $r$, and $\hat{x}_r$ be your forecast.
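The following R sketch shows one way to respect the information constraint and score a forecast. It assumes the score is the usual mean squared error, $\frac{1}{N}\sum_{r \in R_{2018}} (x_r - \hat{x}_r)^2$. The ratio baseline is only an illustrative placeholder, not a recommended model, and the data frames and column names (`pool_bet`, `pool_final`) are invented.

```r
# Toy training data (stand-in for 2014-2017) and test data (stand-in for 2018).
train <- data.frame(pool_bet   = c(1000, 2000, 4000),   # pool at t_bet
                    pool_final = c(1500, 3000, 6000))   # pool at t_final
test  <- data.frame(pool_bet   = c(1000, 3000),
                    pool_final = c(1600, 4400))

# Fit on past races only: average ratio of late money to the pool at t_bet.
ratio <- mean((train$pool_final - train$pool_bet) / train$pool_bet)

# The forecast uses only pool_bet, which is already known when we bet.
x_hat <- ratio * test$pool_bet

# True C_sum for the test races, used for scoring only, never as an input.
x <- test$pool_final - test$pool_bet

# Mean squared error over the test races.
mse <- mean((x - x_hat)^2)
print(mse)
```

The key design point is that the final pool of a test race enters the code only when computing `x` for scoring, so the forecast itself never sees information from after $t_{\text{bet}}$.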