
SPCE0038:  Machine Learning with Big-Data

Exam 2021

Question 1

(a)  Explain how linear regression may be used to fit a polynomial model that is non-linear in the data features. [3 marks]
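As an illustrative sketch of this idea (the data, the quadratic truth, and all names here are hypothetical): expanding each feature into its powers turns a polynomial fit into an ordinary linear regression in the expanded features.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
z = rng.uniform(-3, 3, size=(100, 1))                      # single feature
y = 0.5 * z[:, 0]**2 + z[:, 0] + rng.normal(0, 0.1, 100)   # quadratic truth + noise

# Expand z into [z, z^2]; the model is non-linear in z but
# remains linear in its parameters, so ordinary least squares applies.
poly = PolynomialFeatures(degree=2, include_bias=False)
Z = poly.fit_transform(z)

model = LinearRegression().fit(Z, y)
```

The key point is that `LinearRegression` never sees anything non-linear: the non-linearity lives entirely in the feature map.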

(b)  Is a high-dimensional polynomial model likely to be a good model to use for a machine learning regression

problem?  Explain your reasoning. [3 marks]

(c)  How would you compute a "clean" model prediction on each data instance provided ("clean" in the sense that, when evaluating the model on a data instance, that instance has not been used in fitting the model)?  Illustrate your explanation with a diagram. [6 marks]
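One common way to obtain such predictions is k-fold cross-validation, which can be sketched as follows (the data here is synthetic and purely illustrative):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(0, 0.1, 60)

# Each prediction is made by a model fitted on the folds that exclude
# that instance, so no instance contributes to its own prediction.
y_clean = cross_val_predict(LinearRegression(), X, y, cv=5)
```

With `cv=5`, the data is split into five folds; each fold is predicted by a model trained on the other four.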

Consider the underlying (true) model

y = f(z) + e,

where f is the true model, to be approximated by h, z is an object feature vector, and e is noise with zero mean and variance σ².

(d)  Explain the three contributions to the mean square error. [3 marks]

Show that

E[(y − h(z))²] = Bias²[h(z)] + Var[h(z)] + σ².                        [5 marks]
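A sketch of the standard decomposition, using E[e] = 0, E[e²] = σ², and the independence of the noise e from h(z):

```latex
\begin{align}
\mathbb{E}\big[(y - h(z))^2\big]
  &= \mathbb{E}\big[(f(z) + e - h(z))^2\big] \\
  &= \mathbb{E}\big[(f(z) - h(z))^2\big]
     + 2\,\mathbb{E}\big[e\,(f(z) - h(z))\big]
     + \mathbb{E}\big[e^2\big] \\
  &= \mathbb{E}\big[(f(z) - h(z))^2\big] + \sigma^2
     \qquad (\mathbb{E}[e] = 0,\ e \text{ independent of } h(z)) \\
  &= \big(f(z) - \mathbb{E}[h(z)]\big)^2
     + \mathbb{E}\Big[\big(h(z) - \mathbb{E}[h(z)]\big)^2\Big] + \sigma^2 \\
  &= \mathrm{Bias}^2[h(z)] + \mathrm{Var}[h(z)] + \sigma^2 .
\end{align}
```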


Question 2

(a)  For a two-class supervised classification problem, explain conceptually (without any equations) how support vector machines (SVMs) classify data instances.  Include a discussion of both hard and soft margin classification.  Illustrate your explanation with diagrams. [10 marks] 

(b) What are the characteristics of machine learning problems for which SVMs are well-suited?    [2 marks]

(c)  Consider a trained linear SVM for two-dimensional data.

The decision boundary is given by wᵀz + b = 0 and the margins by wᵀz + b = ±1, where w is the weight vector, z is the data instance vector and b is the bias.

Alternatively, these expressions may be expanded for the two-dimensional setting to give w0x0 + w1x1 + b = 0 and w0x0 + w1x1 + b = ±1, respectively.

Derive an expression for the size of the margin (defined by the shortest distance between the lines defining the two edges of the margin, i.e. between the lines wᵀz + b = 1 and wᵀz + b = −1). [8 marks]
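A sketch of the usual derivation: take a point z₁ on the lower margin and move a distance λ along the unit normal w/‖w‖ to reach a point z₂ on the upper margin.

```latex
\begin{align}
w^\top z_1 + b &= -1, \qquad
w^\top z_2 + b = +1, \qquad
z_2 = z_1 + \lambda \frac{w}{\|w\|} , \\
w^\top z_2 - w^\top z_1 &= 2
  \;\Rightarrow\; \lambda \frac{w^\top w}{\|w\|} = 2
  \;\Rightarrow\; \lambda = \frac{2}{\|w\|} .
\end{align}
```

In the expanded two-dimensional notation this margin size is 2/√(w0² + w1²).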


Question 3

(a)  In a convolutional neural network, describe what a Max Pooling layer does, and why one may want to include such a layer. [4 marks]
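The operation can be sketched in plain NumPy (this helper and its names are illustrative, not a Keras API):

```python
import numpy as np

def max_pool2d(img, size, stride):
    """Max pooling over a 2-D array with no padding: each output
    element is the maximum of a size x size window of the input."""
    h, w = img.shape
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = img[i * stride : i * stride + size,
                            j * stride : j * stride + size].max()
    return out

a = np.array([[1, 2, 5, 6],
              [3, 4, 7, 8],
              [9, 1, 2, 3],
              [1, 1, 4, 4]])
# 2x2 pooling with stride 2 keeps the maximum of each 2x2 block.
pooled = max_pool2d(a, 2, 2)
```

Each window is reduced to a single number, shrinking the spatial dimensions while keeping the strongest activations.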

(b)  Describe what the stride is, with respect to a max pooling layer. [2 marks]

(c)  Describe the two different types of padding – "valid" and "same" – that one may use in a convolutional neural network. [4 marks]

(d)  Consider the following image, with pixel values shown as integers, as an input layer.  In all cases we consider the lower left corner of the image as being the point where any operation on the image begins. [10 marks]

1  2  3  4  5  6
3  5  6  7  8  9
4  4  4  4  6  6
2  6  7  8  9  0
1  6  8  9  0  8
1  3  7  3  5  8

(i)  If one uses a max pooling layer with a filter size of 6x6 pixels, no padding, and a stride length of 6 pixels, draw the resulting receptor layer.

(ii)  If one uses a max pooling layer with a filter size of 3x3 pixels and a stride length of 3 pixels, using no padding, draw the resulting receptor layer.

(iii)  If one uses a max pooling layer with a filter size of 2x2 pixels and a stride length of 2 pixels, using no padding, draw the resulting receptor layer.

(iv)  If one uses a max pooling layer with a filter size of 3x3 pixels and a stride length of 1 pixel, using no padding, draw the resulting receptor layer.

(v)  Consider the following code snippet, where image is the array under consideration in this question, and draw the resulting output array.

max_pool = keras.layers.MaxPool2D(pool_size=4, padding="valid")
output = max_pool(image)

(vi)  Consider the following code snippet, where image is the array under consideration in this question, and draw the resulting output array.

max_pool = keras.layers.MaxPool2D(pool_size=4, padding="same")
output = max_pool(image)
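For reference, a runnable version of both snippets, assuming TensorFlow's Keras; note that Keras pooling layers expect a 4-D (batch, height, width, channels) tensor, so the 6x6 image from the question must be reshaped first (the reshape is an assumption, not part of the question's code):

```python
import numpy as np
import tensorflow as tf

# The 6x6 image from the question, reshaped to the 4-D form
# (batch, height, width, channels) that Keras layers expect.
image = np.array([[1, 2, 3, 4, 5, 6],
                  [3, 5, 6, 7, 8, 9],
                  [4, 4, 4, 4, 6, 6],
                  [2, 6, 7, 8, 9, 0],
                  [1, 6, 8, 9, 0, 8],
                  [1, 3, 7, 3, 5, 8]], dtype=np.float32).reshape(1, 6, 6, 1)

# Stride defaults to pool_size, so both layers move in steps of 4.
valid_pool = tf.keras.layers.MaxPool2D(pool_size=4, padding="valid")
same_pool = tf.keras.layers.MaxPool2D(pool_size=4, padding="same")

out_valid = valid_pool(image)  # "valid": floor((6-4)/4)+1 = 1 window per axis
out_same = same_pool(image)    # "same": ceil(6/4) = 2 windows per axis
```

The output shapes (1x1 for "valid", 2x2 for "same") do not depend on whether one scans from the top-left or, as in the question, the lower-left corner.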

Question 4

You want to build a prediction model trained on a large volume of experimental measurements. The data will be provided by an international organisation, which has been coordinating the experiments and the reporting of their results according to a well-defined protocol and standard.  The protocol defines, for example, how many and what measurements an experiment may provide, or which values may be missing and under what conditions. A lot of care has been taken to ensure that the data is consistent and adheres to this standard. All experiments are now complete and the collection is made available as a relational database.

(a)  From the above description, why is a relational database a good choice for this dataset? What would

you gain by using a NoSQL database instead? [3 marks]

Once you connect to the database, you intend to extract and preprocess some of its contents into a CSV file. You will then fit an appropriate classification model on that data, and finally make the model available to others. You are not sure what preprocessing scheme or classifier you will use, and would like to try different options. Additionally, the dataset is very large, and the training takes a very long time on a single personal computer.

(b)  For each of the following tasks, suggest one appropriate technology (such as a tool or library) and briefly explain how it helps with that task:

● Switching between different preprocessing methods and classifiers.

● Training the classifier efficiently.

● Sharing the trained model.

(For example, the answer for a data storage task could be: "SQL/relational database: allows users to programmatically store, access and query data".) [6 marks]
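For the first of these tasks, one commonly cited option is scikit-learn's Pipeline, where swapping a preprocessing step or classifier is a one-line change (the data and parameter grid below are illustrative):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Named steps make each stage of the pipeline interchangeable.
pipe = Pipeline([("scale", StandardScaler()),
                 ("clf", LogisticRegression())])

# Step parameters (or whole steps) can be varied programmatically.
search = GridSearchCV(pipe, {"clf__C": [0.1, 1.0, 10.0]}, cv=3)
search.fit(X, y)
```

Replacing `LogisticRegression()` with any other estimator, or `StandardScaler()` with another transformer, requires no change to the surrounding code.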

(c)  For the pipeline mentioned above (preprocessing and training the classifier), give the contents of a file in the YAML format that describes this pipeline, such that it can be reproduced automatically using DVC. Assume that the connection to the database and the preprocessing is done in a file called preprocess.py, which produces the file data.csv, and that the model is fitted and saved in classify.py. The Python files can be run as e.g. python preprocess.py. The fitting can be configured through two parameters: "degree" and "bias" (you can assume that these are specified in another suitable file, but do not need to provide that in your answer). [8 marks]
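A sketch of the kind of dvc.yaml such a pipeline might use (the output name model.pkl is an assumption; the parameters are assumed to live in DVC's default params.yaml):

```yaml
stages:
  preprocess:
    cmd: python preprocess.py
    deps:
      - preprocess.py
    outs:
      - data.csv
  classify:
    cmd: python classify.py
    deps:
      - classify.py
      - data.csv
    params:
      - degree
      - bias
    outs:
      - model.pkl
```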

(d) Which DVC command can be used to run the whole analysis this file describes? Assume that you run the analysis this way, make a modification to the classifier parameters, and run that command again. Will the whole pipeline be rerun, and why (not)? [3 marks]