SPCE0038: Machine Learning with Big-Data Exam 2019
Question 1
(a) Describe three reasons why machine learning has improved in effectiveness so dramatically over recent years. [3 marks]
(b) Briefly describe batch gradient descent and stochastic gradient descent at a conceptual level. [2 marks]
(c) Describe the properties of batch and stochastic gradient descent. [3 marks]
(d) Write pseudocode (i.e. outline the algorithmic steps) defining a stochastic gradient descent algorithm to estimate parameters theta, given training feature matrix X_train and target vector y_train. Assume the functions compute_random_index, compute_gradient, and compute_learning_rate are available to you (i.e. you do not need to explicitly define them), although you should specify their inputs and outputs when you make use of them. Assume that the training set is already randomised. [5 marks]
(e) Fill in the four missing entries of the confusion matrix below for a binary classifier to show which entries correspond to true-positives (TP), false-positives (FP), true-negatives (TN), and false-negatives (FN). Copy the confusion matrix and complete it in your answer book. [2 marks]
                     Predicted
                     Negative    Positive
Actual   Negative    ____        ____
         Positive    ____        ____
(f) Define the true positive rate and the false positive rate in words and mathematically using the entries in your confusion matrix above. [2 marks]
(g) Explain how a ROC curve is constructed and how the threshold defining the ROC curve varies along the curve. Illustrate your explanation with a diagram. [3 marks]
Question 2
(a) For the logistic unit shown below, specify the equations that define the output $a$ of the logistic unit given the inputs $x_j$ and parameters $\theta_j$. [2 marks]
Fig. 1: Logistic unit.
(b) Specify the equations defining the logistic unit when a bias term is included. [2 marks]
(c) Explain how the softmax function may be used to adapt a neural network to perform multi-class classification. [2 marks]
(d) Define the softmax function mapping inputs $a_j$ to outputs $p_j$, for $j = 1, \ldots, n$. [1 mark]
(e) Show that the outputs of the softmax function $p_j$ satisfy the following properties:
(i) $\sum_{j=1}^{n} p_j = 1$;
(ii) $0 < p_j < 1$ for all $j$. [3 marks]
(f) The gradient of the cost function is often used when training neural networks. Describe two ways gradients may be computed efficiently in practice for typical neural network cost functions. [2 marks]
(g) Describe the vanishing gradient problem when training neural networks and its cause. [2 marks]
(h) Give mathematical expressions for the sigmoid and ReLU activation functions and plot them. Which activation function is generally better and why? [3 marks]
(i) Describe a shortcoming of the ReLU activation function. [1 mark]
(j) Describe an improved version of the ReLU activation function to mitigate this shortcoming. Give the equation defining the improved activation function and illustrate your description with a diagram. [2 marks]
Question 3
Consider a classification problem with features $x_j^{(i)}$ and targets $y^{(i)}$, where there are $n_{\text{features}}$ features and $n_{\text{objects}}$ objects (assume $i$ and $j$ are indexed from 1).
(a) Construct the feature matrix $X$ and target vector $y$ from features $x_j^{(i)}$ and targets $y^{(i)}$. [1 mark]
(b) Give the size of the feature matrix $X$ and target vector $y$. [1 mark]
(c) Explain the process of n-fold cross-validation and why it is useful. Use a diagram to illustrate your answer. [4 marks]
(d) What are the four key steps in training and applying a classification model in Scikit-Learn (assuming data are already set up appropriately)? [4 marks]
Consider the underlying (true) model
$y = f(z) + \epsilon$,
where $f$ is the true model, to be approximated by $h$, $z$ is an object feature vector, and $\epsilon$ is noise, with zero mean and variance $\sigma^2$.
(e) Explain the three contributions to the mean square error. [3 marks]
(f) Show that
$E[(y - h(z))^2] = \text{Bias}^2[h(z)] + \text{Var}[h(z)] + \sigma^2$. [3 marks]
(g) Explain the bias-variance trade-off and how it relates to model complexity. Illustrate your explanation with a diagram. [4 marks]
Question 4
(a) State which type of neural network would be appropriate for the following problems, stating your reasons.
(i) An image classification problem.
(ii) A language translation problem.
(iii) Sentiment score analysis. [6 marks]
(b) Describe what a pooling layer in a neural network is, and state some reasons why such a layer may be included. [4 marks]
(c) For a convolutional neural network, describe what the stride parameter defines. [2 marks]
(d) Consider the following piece of Python code using TensorFlow:
0 import tensorflow as tf
1 reset_graph()
2 n_inputs = 3
4 missing_line
5 n_outputs = n_inputs
6 learning_rate = 0.01
7 X = tf.placeholder(tf.float32, shape=[None, n_inputs])
8 hidden = tf.layers.dense(X, missing_variable)
9 outputs = tf.layers.dense(hidden, n_outputs)
10 reconstruction_loss = tf.reduce_mean(tf.square(outputs - X))
11 optimizer = tf.train.AdamOptimizer(learning_rate)
12 training_op = optimizer.minimize(reconstruction_loss)
(i) State what this code is doing in general terms.
(ii) Identify the missing variable on line 8.
(iii) Provide a value for the missing variable on line 4, justifying its value. If n_inputs = 4, how would this change your answer?
(iv) Describe in words what the learning rate on line 6 is. If this were set too high, how might the results change? [8 marks]
Question 5
This question focuses on data formats, normal forms, SQL, markup languages and semantic models.
(a) (i) What do we mean by the serialization of an object or data structure? [1 mark]
(ii) Provide at least two reasons why it might be beneficial to serialize. [1 mark]
(b) The Anonymous gallery possesses a physical archive where it stores information on all the purchases made by its customers over time. These purchases are recorded in files, an example of which is given in Fig. 2. The gallery also wishes to maintain records of data on customers, artists and objects. There might be several objects authored by a single artist and objects may be bought and sold several times over. In other words, the gallery may sell something, buy it back at a later date and sell it to another customer.
Fig. 2: Example of a customer’s file in the Anonymous gallery archives.
(i) Provide a database schema in unnormalized form (UNF). Use the following syntax: table_name [ column_1, (sub_column_1, sub_column_2), . . . ], grouping attributes which end up in the same column using parentheses. [2 marks]
(ii) Provide a database schema in first normal form (1NF). [2 marks]
(iii) Provide a database schema in second normal form (2NF). [2 marks]
(iv) Provide a database schema in third normal form (3NF). [4 marks]
(c) Write a query in SQL that lists all the works of art bought by John Brown (roughly as Fig. 2). You might need some of the following SQL syntax: SELECT (AS), FROM, WHERE, JOIN (ON), GROUP BY, ORDER BY. [3 marks]
(d) Clarify the differences between URI (Uniform Resource Identifier), URL (Locator), URN (Name) and namespaces. [2 marks]
(e) Use RDFS (Resource Description Framework Schema) to specify the schema of artists, their works and relations among them, as in Fig. 2. You might need to use some of the following components: RDFS:Class, RDF:Property, XSD:string, XSD:integer, RDFS:domain, RDFS:range. XSD stands for XML Schema. When defining your namespaces, feel free to ignore providing the correct URL. A partial solution is fine, provided you show an understanding of: namespaces, classes and properties, domains and ranges, data types, RDFS syntax. [3 marks]
2023-04-26