
May Examination Period 2021–22

ECOM193 Statistical Machine Learning in Finance

When writing formulas, please note the following:

· It is acceptable to use the standard alphabet in place of Greek letters. The following are recommended: a for α, b for β, d for δ, D for Δ, l (lowercase l) for λ, m for μ, v for ν, s for σ, S for Σ.

· Use + for addition, - for subtraction, * for multiplication and / for division.

· Where appropriate use an underscore to indicate a subscript, e.g. x_i for xᵢ.

· Use the ^ character for power, e.g. x^2 for x², x^0.5 for √x.

· When referring to the following functions use log(x) or ln(x) for the natural logarithm logₑ x, logb(x) for the base-b logarithm of x, exp(x) for eˣ, cos(x) for cos x.

· Use infty for ∞.

· Use Sum to denote summation of terms, e.g. Sum_i=1^n x_i for Σ xᵢ taken over i = 1, ..., n.

· Use Prod to denote product of terms, e.g. Prod_i=1^n x_i for Π xᵢ taken over i = 1, ..., n.

· Use D for derivative, e.g. D(x^2) = 2x.

· Use Int for integral, e.g. Int_a^b (x) dx for ∫ x dx taken from a to b.

· Use cap for ∩ and cup for ∪ when referring to sets.

· Where it is not obvious that an estimate is implied then state this in full, e.g. ‘a suitable estimate of b is 0.125’ or more simply (and equally acceptable) ‘est.b = 0.125’.

· Use brackets as necessary. To make your answer clearer use different types of bracket pairs where appropriate, e.g. (), [], {}.

Use obvious choices for any other mathematical symbols not listed above that you may require.

Question 1

a) Explain how the bootstrap procedure may be used with a suitable sample to estimate the bias and the standard error of a statistical estimator. [12 marks]

b) The following ordinal credit score data were obtained on eight individuals.

589 845 701 842 599 913 749 845

A bootstrap analysis of the data was thought to be sensible and the following four bootstrap samples were obtained from the above sample in the usual way.

Bootstrap Sample

  1     2     3     4
 589   589   589   599
 589   599   701   599
 599   749   749   701
 842   749   842   749
 864   842   864   842
 864   845   864   842
 864   845   913   864

From the data in the table above:

i. Calculate the bootstrap estimate of the mean.

ii. Calculate the bootstrap estimate of the median.

iii. What is the bootstrap estimate of the bias of the mean? What do you conclude?

iv. What is the bootstrap estimate of the bias of the median? How does this compare with your preceding answer for the bias of the mean?

v. What is the bootstrap estimate of the standard error of the mean? What do you conclude about this estimate? [13 marks]
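The calculations asked for in part b) can be illustrated with a short Python sketch. It is a minimal illustration rather than a model answer: the function name `bootstrap_summary` is hypothetical, the standard error uses the usual B - 1 divisor, and the figures are copied exactly as printed in the table above, which lists seven values per bootstrap sample and includes 864, a value that does not appear in the original sample.

```python
from statistics import mean, median, stdev

# Original sample and the four bootstrap samples (columns of the table),
# copied exactly as printed in the question.
original = [589, 845, 701, 842, 599, 913, 749, 845]
bootstrap_samples = [
    [589, 589, 599, 842, 864, 864, 864],
    [589, 599, 749, 749, 842, 845, 845],
    [589, 701, 749, 842, 864, 864, 913],
    [599, 599, 701, 749, 842, 842, 864],
]


def bootstrap_summary(estimator, sample, resamples):
    """Return (bootstrap estimate, bias estimate, standard error estimate).

    estimator: function mapping a list of numbers to a single number.
    sample:    the original data.
    resamples: list of bootstrap samples drawn from the original data.
    """
    # Evaluate the estimator on every bootstrap sample.
    replicates = [estimator(r) for r in resamples]
    # Bootstrap estimate: the average of the replicates.
    boot_estimate = mean(replicates)
    # Bias estimate: bootstrap estimate minus the estimate on the original data.
    bias = boot_estimate - estimator(sample)
    # Standard error estimate: sample standard deviation of the replicates.
    std_error = stdev(replicates)
    return boot_estimate, bias, std_error


mean_summary = bootstrap_summary(mean, original, bootstrap_samples)
median_summary = bootstrap_summary(median, original, bootstrap_samples)

for label, summary in (("mean", mean_summary), ("median", median_summary)):
    estimate, bias, std_error = summary
    print(f"{label}: estimate={estimate:.3f} bias={bias:.3f} se={std_error:.3f}")
```

With only four replicates the standard error estimate is itself very imprecise; in practice several hundred or more bootstrap samples would normally be drawn.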

Question 2

a) What is an ensemble method in machine learning? [5 marks]

b) What are the primary advantages of ensemble methods compared with more classical data modelling approaches? [6 marks]

c) Describe one ensemble method and its general procedural implementation. (You do not need to derive the particular method you choose from first principles but you must give a clear description of the method.) [8 marks]

d) Why are out-of-bag samples relevant in an ensemble method context? [6 marks]

Question 3

a) Describe the method of Principal Component Analysis (PCA), its general aims and its advantages. [10 marks]

b) Why is it advisable to ensure the input variables to PCA are generally measured on comparable scales? What would you do if this wasn’t the case? [4 marks]

c) The following correlation matrix is estimated from the daily returns of five currency pairs for the year 2020:

         USDAUD  USDBRL  USDCOP  USDMXN  USDMYR
USDAUD   1.0000  0.3429  0.4760  0.5764  0.3251
USDBRL   0.3429  1.0000  0.3993  0.5502  0.1394
USDCOP   0.4760  0.3993  1.0000  0.5316  0.2980
USDMXN   0.5764  0.5502  0.5316  1.0000  0.3316
USDMYR   0.3251  0.1394  0.2980  0.3316  1.0000

(USD base currency).

A PCA in R using this correlation matrix gives

Loadings (Principal Components):

         Comp.1  Comp.2  Comp.3  Comp.4  Comp.5
USDAUD    0.474   0.103   0.567   0.530   0.403
USDBRL    0.418  -0.546  -0.598           0.407
USDCOP    0.470           0.323  -0.818
USDMXN    0.526  -0.135           0.215  -0.811
USDMYR    0.322   0.820  -0.464

(small values are automatically suppressed)

Importance of components:

                        Comp.1  Comp.2  Comp.3  Comp.4  Comp.5
Standard deviation      1.6233  0.9397  0.7768  0.7268  0.5917
Proportion of Variance  0.5270  0.1766  0.1207  0.1057  0.0700
Cumulative Proportion   0.5270  0.7036  0.8243  0.9300  1.0000

i. What do you observe about the correlation structure?

ii. What do you deduce from the components or loadings themselves?

iii. How many components might you choose to describe the data? Justify your answer.

iv. Explain how PCA scores can be obtained from the above analysis.

[11 marks]
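The relationship between the correlation matrix, the loadings and the "Importance of components" table in part c) can be checked numerically. The sketch below, in plain Python, uses power iteration to approximate the first principal component of the printed correlation matrix, recomputes the proportion of variance from the printed standard deviations, and forms a first-component score for one standardised observation. The helper names and the observation `z` are hypothetical and chosen only for illustration; power iteration is used here for simplicity and is not a claim about how R performs the decomposition.

```python
from math import sqrt

# Correlation matrix as printed (order: USDAUD, USDBRL, USDCOP, USDMXN, USDMYR).
R = [
    [1.0000, 0.3429, 0.4760, 0.5764, 0.3251],
    [0.3429, 1.0000, 0.3993, 0.5502, 0.1394],
    [0.4760, 0.3993, 1.0000, 0.5316, 0.2980],
    [0.5764, 0.5502, 0.5316, 1.0000, 0.3316],
    [0.3251, 0.1394, 0.2980, 0.3316, 1.0000],
]


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def first_component(matrix, iterations=200):
    """Approximate the leading eigenvalue and unit eigenvector by power iteration."""
    v = [1.0] * len(matrix)
    for _ in range(iterations):
        w = mat_vec(matrix, v)
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v'Rv for a unit vector v gives the eigenvalue.
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(matrix, v)))
    return eigenvalue, v


eigenvalue_1, loadings_1 = first_component(R)

# Each component's variance is its squared standard deviation (its eigenvalue).
# For a correlation matrix the variances sum to the number of variables.
std_devs = [1.6233, 0.9397, 0.7768, 0.7268, 0.5917]
variances = [s * s for s in std_devs]
total_variance = sum(variances)
proportions = [v / total_variance for v in variances]

# A score is the inner product of a standardised observation with the loadings.
z = [0.5, -1.0, 0.2, 0.8, -0.3]  # hypothetical standardised daily returns
score_1 = sum(l * x for l, x in zip(loadings_1, z))

print("first eigenvalue:", round(eigenvalue_1, 4))
print("first loadings:", [round(l, 3) for l in loadings_1])
print("proportions of variance:", [round(p, 4) for p in proportions])
print("first-component score for z:", round(score_1, 4))
```

Because the analysis is based on a correlation matrix, scores are formed from returns that have first been centred and scaled to unit standard deviation.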

Question 4

a) Describe what is meant by a generalized linear model (GLM). [12 marks]

b) Why is such a framework of use? [6 marks]

c) Briefly outline how one might estimate the parameters of a GLM. [4 marks]

d) Name three probability distributions for the response variable that the GLM framework can handle. [3 marks]

Question 5

a) What is a maximal margin classifier and why is it a largely theoretical starting point for support vector machines? [4 marks]

b) Explain some of the advantages a support vector machine has over linear discriminant analysis. Where might linear discriminant analysis do better? [6 marks]

c) Explain what is meant by a kernel in a support vector machine context. [3 marks]

d) What role does a cost function play in fitting support vector machines? In what form is this incorporated into the model? [5 marks]

e) Give three real-world examples where you might apply a support vector machine. [3 marks]

f) You decide to fit a support vector machine using a polynomial kernel. How might you go about choosing its degree? What would you guard against? [4 marks]

Question 6

a) What is multiple logistic regression? [6 marks]

b) Explain how neural networks can be thought of as a flexible non-linear extension of multiple logistic regression. [6 marks]

c) What is the role of an activation function in a neural network? [3 marks]

d) Describe and compare two common neural network activation functions. [5 marks]

e) List the main advantages and disadvantages of neural network models compared with more classical parametrized statistical models. [5 marks]
