COMP5318/COMP4318 Machine Learning and Data Mining Semester 1, 2023

Published: 2023-06-07

Sample exam questions

Question 1. (Multiple choice question)

Select the correct answer.

1. Leave-one-out cross validation is suitable for large data sets.

a) True

b) False

2. The regression line minimizes the sum of the residuals.

a) True

b) False

3. A single perceptron can solve the XOR problem.

a) True

b) False

Question 2. (Short answer question)

1. Why do we need to apply normalization when using distance-based algorithms such as k-Nearest Neighbor?

2. In linear support vector machines, we use dot products both during training and during classification of a new example. What vectors are these products of?

During training:

During classification of new example:

3. List one disadvantage of applying a multi-layer perceptron neural network to perform handwritten digits image classification.

Calculation (problem solving) questions

Question 3. Decision tree

Given is the following training data where location, weather and expensive are the features and holiday is the class.

a)  What is the entropy of this set of training examples with respect to the class?

b)  We would like to build a decision tree using information gain. Which attribute will be selected as the root of the tree? Show your calculations.

You may use this table:
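As a study aid for parts a) and b), the entropy and information gain calculations can be sketched in Python. This is a minimal sketch only: the function names and the `toy_rows` data are hypothetical and are not the exam's training data, which is not reproduced here.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(rows, attribute, target):
    """Entropy of the class minus the weighted entropy after splitting on attribute."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in set(r[attribute] for r in rows):
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

# Hypothetical toy rows for illustration, NOT the exam's training data.
toy_rows = [
    {"weather": "sunny", "expensive": "Y", "holiday": "yes"},
    {"weather": "sunny", "expensive": "N", "holiday": "yes"},
    {"weather": "rainy", "expensive": "Y", "holiday": "no"},
    {"weather": "rainy", "expensive": "N", "holiday": "no"},
]

class_entropy = entropy([r["holiday"] for r in toy_rows])
gain_weather = information_gain(toy_rows, "weather", "holiday")
gain_expensive = information_gain(toy_rows, "expensive", "holiday")
print(class_entropy, gain_weather, gain_expensive)
```

In this toy data, weather separates the classes perfectly and expensive carries no information, so weather would be chosen as the root. The attribute with the largest gain is selected in the same way on the real table.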

Question 4. Naïve Bayes

Given is the following training data where location, weather, companion and expensive are the features and holiday is the class.

Use Naïve Bayes to predict the value of holiday for the following new example, showing your calculations: location=boring, weather=sunny, companion=annoying, expensive=Y.
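The Naïve Bayes calculation multiplies the class prior by the conditional probability of each feature value given the class, then picks the class with the larger product. A minimal sketch follows, assuming plain frequency estimates with no smoothing; the function name and the `toy_rows` data are hypothetical and are not the exam's table.

```python
from collections import Counter

def naive_bayes_scores(rows, target, query):
    """Unnormalized score per class: P(class) * product of P(feature=value | class)."""
    class_counts = Counter(r[target] for r in rows)
    scores = {}
    for cls, n_cls in class_counts.items():
        score = n_cls / len(rows)  # class prior
        for feature, value in query.items():
            matches = sum(1 for r in rows if r[target] == cls and r[feature] == value)
            score *= matches / n_cls  # frequency estimate, no smoothing (assumption)
        scores[cls] = score
    return scores

# Hypothetical toy rows for illustration, NOT the exam's training data.
toy_rows = [
    {"weather": "sunny", "expensive": "Y", "holiday": "yes"},
    {"weather": "sunny", "expensive": "N", "holiday": "yes"},
    {"weather": "rainy", "expensive": "Y", "holiday": "yes"},
    {"weather": "rainy", "expensive": "Y", "holiday": "no"},
    {"weather": "sunny", "expensive": "Y", "holiday": "no"},
]

scores = naive_bayes_scores(toy_rows, "holiday", {"weather": "sunny", "expensive": "Y"})
prediction = max(scores, key=scores.get)
print(scores, prediction)
```

Here the score for yes is 3/5 × 2/3 × 2/3 and the score for no is 2/5 × 1/2 × 1, so the sketch predicts yes for the toy query.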

Question 5. 1R

Given the training data in the table below where credit history, debt, deposit and income are attributes and risk is the class, predict the class of the following new example using the 1R algorithm: credit history=unknown, debt=low, deposit=none, income=average. If needed, settle ties by random selection. Show your calculations.
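1R builds one rule per attribute (each attribute value predicts its majority class), counts the training errors of each rule, and keeps the attribute with the fewest errors. The sketch below illustrates this under stated assumptions: the function name and `toy_rows` are hypothetical, and ties are broken by taking the first candidate encountered rather than by random selection.

```python
from collections import Counter, defaultdict

def one_r(rows, attributes, target):
    """Return (best_attribute, rule, errors); rule maps each value to its majority class."""
    best = None
    for attr in attributes:
        by_value = defaultdict(Counter)
        for r in rows:
            by_value[r[attr]][r[target]] += 1
        # Majority class per attribute value (first seen wins on a tie: an assumption).
        rule = {v: counts.most_common(1)[0][0] for v, counts in by_value.items()}
        errors = sum(sum(c.values()) - max(c.values()) for c in by_value.values())
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best

# Hypothetical toy rows for illustration, NOT the exam's training data.
toy_rows = [
    {"debt": "low", "income": "high", "risk": "low"},
    {"debt": "low", "income": "average", "risk": "low"},
    {"debt": "high", "income": "high", "risk": "high"},
    {"debt": "high", "income": "average", "risk": "high"},
    {"debt": "low", "income": "average", "risk": "high"},
]

best_attr, rule, errors = one_r(toy_rows, ["debt", "income"], "risk")
predicted = rule["low"] if best_attr == "debt" else rule["average"]
print(best_attr, rule, errors, predicted)
```

On the toy rows, the debt rule makes 1 error and the income rule makes 2, so debt is selected and a new example with debt=low is classified as low risk.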

Question 6. Perceptron

Given is the following training set:

a) Train a perceptron with a bias on this training set. Assume that all initial weights (including the bias of the neuron) are 0. Show the set of weights (including the bias) at the end of the first epoch. Apply the examples in the given order.

Recall that the perceptron uses a step function defined as:

step(n) = 1 if n >= 0, and step(n) = 0 otherwise.
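One epoch of perceptron training can be sketched as follows. The step function matches the definition above; the update rule w ← w + lr·(target − output)·x, the learning rate of 1, and the `toy_examples` data are assumptions for illustration, since the question does not state a learning rate and the exam's training set is not reproduced here.

```python
def step(n):
    """Step activation: 1 if n >= 0, otherwise 0."""
    return 1 if n >= 0 else 0

def train_one_epoch(examples, weights, bias, lr=1):
    """Apply the perceptron learning rule to each example once, in the given order."""
    for inputs, target in examples:
        n = sum(w * x for w, x in zip(weights, inputs)) + bias
        error = target - step(n)
        weights = [w + lr * error * x for w, x in zip(weights, inputs)]
        bias = bias + lr * error
    return weights, bias

# Hypothetical toy examples (inputs, target), NOT the exam's training set.
toy_examples = [((1, 0), 1), ((0, 1), 0), ((1, 1), 1)]

final_weights, final_bias = train_one_epoch(toy_examples, [0, 0], 0)
print(final_weights, final_bias)
```

With all-zero initial weights, the first toy example is already classified correctly (step(0) = 1), the second triggers an update to weights [0, -1] and bias -1, and the third brings the weights to [1, 0] and the bias to 0.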

Question 7. K-means clustering

Suppose that we are given 7 examples to cluster: A, B, C, D, E, F and G. The distance between them is given by the following matrix:

Run the k-means algorithm to group these examples into 2 clusters for 1 epoch. The initial centroids are A and B. Show the resulting clusters.
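When only pairwise distances are available, the assignment step of k-means places each example in the cluster of its nearest centroid. The sketch below shows that step alone; the function name and the four-point `toy_distance` matrix are hypothetical and are not the exam's matrix, and ties go to the centroid listed first (an assumption).

```python
def assign_to_centroids(distance, centroids):
    """Assign every example to the nearest centroid using a pairwise distance table."""
    clusters = {c: [] for c in centroids}
    for point in distance:
        # min() keeps the first centroid on a tie (assumption).
        nearest = min(centroids, key=lambda c: distance[point][c])
        clusters[nearest].append(point)
    return clusters

# Hypothetical symmetric distances among four examples, NOT the exam's matrix.
toy_distance = {
    "A": {"A": 0, "B": 6, "C": 2, "D": 5},
    "B": {"A": 6, "B": 0, "C": 5, "D": 1},
    "C": {"A": 2, "B": 5, "C": 0, "D": 4},
    "D": {"A": 5, "B": 1, "C": 4, "D": 0},
}

clusters = assign_to_centroids(toy_distance, ["A", "B"])
print(clusters)
```

In the toy matrix, C is closer to A and D is closer to B, so the clusters after the assignment step are {A, C} and {B, D}.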

Question 8. Markov models

Given is the following Markov model for the weather in Sydney:

a) Given that today the weather is Sunny, what is the probability that it will be Sunny tomorrow and Rainy the day after tomorrow, i.e. what is the probability P(π3 = Rainy, π2 = Sunny | π1 = Sunny)?

Hint: P(A,B|C) = P(A|B,C) P(B|C)

b) If the weather yesterday was Rainy, and today is Foggy, what is the probability that tomorrow it will be Sunny?

For both questions, briefly show your calculations.
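Assuming a first-order Markov model, each day depends only on the previous day, so the hint reduces part a) to P(Rainy | Sunny) × P(Sunny | Sunny), and part b) to the single transition P(Sunny | Foggy). A minimal sketch follows, with a hypothetical `toy_transition` table standing in for the model's transition probabilities, which are not reproduced here.

```python
def path_probability(transition, states):
    """Probability of a state sequence given its first state (first-order Markov chain)."""
    prob = 1.0
    for prev, nxt in zip(states, states[1:]):
        prob *= transition[prev][nxt]
    return prob

# Hypothetical transition probabilities transition[today][tomorrow], NOT the exam's model.
toy_transition = {
    "Sunny": {"Sunny": 0.7, "Rainy": 0.1, "Foggy": 0.2},
    "Rainy": {"Sunny": 0.3, "Rainy": 0.5, "Foggy": 0.2},
    "Foggy": {"Sunny": 0.25, "Rainy": 0.25, "Foggy": 0.5},
}

# Part a) pattern: Sunny today, Sunny tomorrow, Rainy the day after.
p_a = path_probability(toy_transition, ["Sunny", "Sunny", "Rainy"])

# Part b) pattern: yesterday's weather drops out; only today's state (Foggy) matters.
p_b = toy_transition["Foggy"]["Sunny"]
print(p_a, p_b)
```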

Question 9. Hidden Markov models

Julia tested positive for COVID and had to quarantine at home for several days. Her friend Nicole came to bring her food every day. We don’t know what the weather was on the quarantine days, but we know the type of clothing Nicole wore, which provides evidence about the weather.

The following Hidden Markov Model models the situation. The initial state probabilities are: A0(Sunny)=0.5 and A0(Cloudy)=0.5.

Suppose that on the first quarantine day Nicole wore a dress and on the second she wore a blazer.

a)  What is the probability of the observation sequence?

b)  What is the most likely sequence of hidden states?

Briefly show your calculations.
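Part a) is typically answered with the forward algorithm and part b) with the Viterbi algorithm. The sketch below implements both for a two-state model. The initial probabilities of 0.5 come from the question; the transition and emission values in `toy_trans` and `toy_emit` are hypothetical placeholders, since the model's parameters are not reproduced here.

```python
def forward(obs, states, start, trans, emit):
    """Probability of the observation sequence, summed over all hidden state paths."""
    alpha = {s: start[s] * emit[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: emit[s][o] * sum(alpha[p] * trans[p][s] for p in states)
                 for s in states}
    return sum(alpha.values())

def viterbi(obs, states, start, trans, emit):
    """Most likely hidden state sequence and its joint probability with the observations."""
    best = {s: (start[s] * emit[s][obs[0]], [s]) for s in states}
    for o in obs[1:]:
        new_best = {}
        for s in states:
            prob, path = max(((best[p][0] * trans[p][s], best[p][1]) for p in states),
                             key=lambda t: t[0])
            new_best[s] = (prob * emit[s][o], path + [s])
        best = new_best
    prob, path = max(best.values(), key=lambda t: t[0])
    return path, prob

states = ["Sunny", "Cloudy"]
start = {"Sunny": 0.5, "Cloudy": 0.5}  # from the question
# Hypothetical parameters for illustration, NOT the exam's model.
toy_trans = {"Sunny": {"Sunny": 0.5, "Cloudy": 0.5},
             "Cloudy": {"Sunny": 0.25, "Cloudy": 0.75}}
toy_emit = {"Sunny": {"dress": 0.75, "blazer": 0.25},
            "Cloudy": {"dress": 0.25, "blazer": 0.75}}

observations = ["dress", "blazer"]
p_obs = forward(observations, states, start, toy_trans, toy_emit)
best_path, best_prob = viterbi(observations, states, start, toy_trans, toy_emit)
print(p_obs, best_path, best_prob)
```

With these placeholder values the observation probability is 0.265625 and the most likely hidden sequence is Sunny followed by Cloudy; the exam's own parameters will give different numbers.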