COM 2004
DEPARTMENT OF COMPUTER SCIENCE
DATA DRIVEN COMPUTING
AUTUMN SEMESTER 2014–2015
1. a) A Gaussian distribution whose mean is 100 and whose standard deviation is 10 is used to generate N random samples. The samples are then used to estimate the mean, µ, and variance, S, of the Gaussian using the formulae below.
µ = (1/N) Σ_{i=1}^{N} x_i
S = (1/N) Σ_{i=1}^{N} (x_i − µ)²
(i) Use a series of sketches to show how the distribution of the mean estimate changes as the value of N increases. [20%]
(ii) Use a sketch to show how the distribution of the variance estimate changes as the value of N increases. [20%]
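The behaviour the sketches should capture can be checked numerically. The simulation below is a sketch, not part of the required written answer; the function name and trial counts are illustrative. It repeatedly draws N samples from N(100, 10²): the spread of the mean estimate shrinks like σ/√N, and the 1/N variance estimate concentrates near σ² = 100 but is biased low for small N.

```python
import random

random.seed(0)  # reproducible

def estimates(n, trials=1000):
    # Draw `trials` datasets of size n from N(100, 10^2) and return
    # (spread of the mean estimate, average of the 1/N variance estimate).
    means, variances = [], []
    for _ in range(trials):
        xs = [random.gauss(100, 10) for _ in range(n)]
        m = sum(xs) / n
        means.append(m)
        variances.append(sum((x - m) ** 2 for x in xs) / n)
    spread = (sum((m - 100) ** 2 for m in means) / trials) ** 0.5
    return spread, sum(variances) / trials

for n in (5, 50, 500):
    print(n, estimates(n))  # spread shrinks like 10/sqrt(n); variance -> 100
```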
b) Alice has two routes for travelling to work. Travel times on these routes are normally distributed.
- Route A takes on average 60 minutes with a standard deviation of 20 minutes.
- Route B takes on average 70 minutes with a standard deviation of 10 minutes.
Work starts at precisely 9:00 and Alice does not want to be late.
(i) Which route should Alice take if leaving home at 8:00? Justify your answer. [10%]
(ii) Which route should Alice take if leaving home at 7:30? Justify your answer. [10%]
(iii) At what departure time would both routes be equally likely to get Alice to work by 9:00? [10%]
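One way to check the reasoning in part (b) is to evaluate P(travel time ≤ available time) for each route with the standard normal CDF; the function name below is illustrative.

```python
from math import erf, sqrt

def p_on_time(minutes_available, mean, sd):
    # P(travel time <= minutes_available) for a normally distributed travel
    # time, via the standard normal CDF: Phi(z) = (1 + erf(z / sqrt 2)) / 2
    z = (minutes_available - mean) / sd
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Leaving at 8:00 allows 60 minutes of travel time:
print(p_on_time(60, 60, 20), p_on_time(60, 70, 10))  # route A more likely on time
# Leaving at 7:30 allows 90 minutes:
print(p_on_time(90, 60, 20), p_on_time(90, 70, 10))  # route B more likely on time
# The probabilities coincide when (t - 60)/20 = (t - 70)/10, i.e. t = 80,
# so departing at 7:40 makes the two routes equally likely to be on time.
print(p_on_time(80, 60, 20) == p_on_time(80, 70, 10))
```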
c) A process is generating independent uniformly distributed random numbers, x,
x ~ U(a, b)
You observe the sequence of samples: 5.1, 10.2, 8.4 and 3.1. Using maximum-likelihood parameter estimation, what would be the best estimates of the parameters a and b? Explain your reasoning.
[30%]
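The maximum-likelihood reasoning for part (c) is short enough to verify directly. A sketch, not the prescribed model answer: for U(a, b) the likelihood is (1/(b − a))^N when every sample lies in [a, b] and zero otherwise, so it is maximised by the tightest interval covering the data.

```python
samples = [5.1, 10.2, 8.4, 3.1]
# The likelihood (1/(b - a))^N grows as b - a shrinks, but is zero once any
# sample falls outside [a, b]; hence the ML estimates are the sample extremes.
a_hat, b_hat = min(samples), max(samples)
print(a_hat, b_hat)  # 3.1 10.2
```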
2. a) Consider two classes, ω1 and ω2, each containing two points as follows,
ω1 = {(−2, 1)^T, (−1, 1)^T}
ω2 = {(0, 0)^T, (1, 0)^T}
Consider a linear classifier with discriminant function g(x) = w1x1 + w2x2 + w0. Using initial weights of w1 = 1, w2 = 1 and w0 = −2 and a learning rate of 0.5, run the perceptron learning algorithm by hand until all the points are correctly classified (show all your workings). [50%]
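The hand workings for 2 (a) can be cross-checked with a small script. This is a sketch under one stated assumption: a sample is assigned to class 1 when g(x) > 0, and a sample lying exactly on the boundary (g(x) = 0) counts as misclassified.

```python
def perceptron(data, w, w0, eta):
    # data: list of (x, label) with label +1 for class 1, -1 for class 2.
    # Sweep the samples in order; on each mistake, nudge the weights by
    # eta * label * x (and the bias by eta * label); stop on a clean pass.
    changed = True
    while changed:
        changed = False
        for (x1, x2), label in data:
            g = w[0] * x1 + w[1] * x2 + w0
            if label * g <= 0:  # misclassified (boundary counts as wrong)
                w = [w[0] + eta * label * x1, w[1] + eta * label * x2]
                w0 += eta * label
                changed = True
    return w, w0

data = [((-2, 1), +1), ((-1, 1), +1), ((0, 0), -1), ((1, 0), -1)]
print(perceptron(data, [1.0, 1.0], -2.0, 0.5))  # ([-0.5, 2.0], -1.0)
```

With this update convention the algorithm converges after two corrections, both triggered by the class-1 points.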
b) Plot the points given in 2 (a) in feature space and plot the decision boundaries given by the initial and final weights. [10%]
c) Consider two classes, ω1 and ω2, each containing two points as follows,
ω1 = {(1, 0)^T, (1, 2)^T}
ω2 = {(0, 1)^T, (2, 1)^T}
When using this data the perceptron learning algorithm will never terminate regardless of what values are chosen for the starting weights. Explain why. [10%]
d) Draw a two-layer perceptron that can correctly classify the points in 2 (c). [30%]
3. a) Use pseudocode to show how samples are classified using a K-nearest neighbour classifier. [30%]
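A runnable rendering of the kind of pseudocode 3 (a) asks for, offered as a sketch: rank the training samples by distance to the query, take the K closest, and return the majority label (ties are broken arbitrarily here). The example at the end uses, for illustration, the training data from part (b).

```python
from collections import Counter

def knn_classify(train, x, k):
    # train: list of (point, label) pairs; x: query point.
    # Squared Euclidean distance preserves the ranking, so no sqrt is needed.
    by_dist = sorted(train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))
    votes = Counter(label for _, label in by_dist[:k])
    return votes.most_common(1)[0][0]

train = [((0, 0), 'w1'), ((2, 0), 'w1'),
         ((-2, 2), 'w2'), ((-2, -2), 'w2'), ((0, 2), 'w2'), ((0, -2), 'w2')]
print(knn_classify(train, (1, 0), 1))  # w1
```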
b) Consider a K-nearest neighbour classifier that uses a Euclidean distance measure, K = 1, and the following samples as training data,
ω1 = {(0, 0)^T, (2, 0)^T}
ω2 = {(−2, 2)^T, (−2, −2)^T, (0, 2)^T, (0, −2)^T}
A point is selected uniformly at random from the region defined by −2 < x1 < 2, −2 < x2 < 2. What is the probability that the point is classified as belonging to class ω1? [Hint: start by sketching the decision boundary.]
[40%]
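The sketched decision boundary of 3 (b) can be sanity-checked by brute force. A sketch, assuming tie-breaking on the boundary is irrelevant (the boundary has measure zero): evaluate, on a fine grid of cell centres over the square, the fraction of points whose nearest training sample belongs to ω1.

```python
w1 = [(0, 0), (2, 0)]
w2 = [(-2, 2), (-2, -2), (0, 2), (0, -2)]

def nearest_is_w1(x1, x2):
    # True when the nearest training sample (squared Euclidean) is in w1;
    # exact ties are counted as w2, which only affects a measure-zero set.
    d1 = min((x1 - a) ** 2 + (x2 - b) ** 2 for a, b in w1)
    d2 = min((x1 - a) ** 2 + (x2 - b) ** 2 for a, b in w2)
    return d1 < d2

n = 400  # grid resolution; cell centres avoid the boundary lines
hits = sum(
    nearest_is_w1(-2 + (i + 0.5) * 4 / n, -2 + (j + 0.5) * 4 / n)
    for i in range(n)
    for j in range(n)
)
print(hits / n ** 2)  # close to 0.5
```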
c) Consider the distance measure
d(x1, x2) = 1 − cos(θ)
where θ is the angle between vectors x1 and x2.
Two separate 1-nearest neighbour classifiers are constructed, one using the Euclidean distance and the other using the distance measure defined above. The following labelled samples are provided as training data,
ω1 = {(1, 0)^T, (0, 1)^T}
ω2 = {(1, 1)^T}
Sketch the region of feature space in which the two classifiers would be in agreement.
[30%]
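A quick way to explore 3 (c) before sketching: classify trial points with both 1-NN classifiers and see where they agree. The angular measure cares only about direction, so near the origin along the 45° line the two disagree; the helper names below are illustrative.

```python
from math import sqrt

def euclid(u, v):
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cos_dist(u, v):
    # 1 - cos(theta), theta the angle between u and v (undefined at the origin)
    dot = sum(a * b for a, b in zip(u, v))
    norm = lambda w: sqrt(sum(c * c for c in w))
    return 1 - dot / (norm(u) * norm(v))

train = [((1, 0), 'w1'), ((0, 1), 'w1'), ((1, 1), 'w2')]

def nn(x, dist):
    # 1-nearest-neighbour label under the given distance measure
    return min(train, key=lambda s: dist(s[0], x))[1]

# (0.1, 0.1) lines up angularly with (1,1) but is Euclidean-closer to class w1:
print(nn((0.1, 0.1), euclid), nn((0.1, 0.1), cos_dist))  # w1 w2
```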
4. a) Give an account of the different ways in which distances between clusters can be defined. [30%]
b) Run the agglomerative clustering algorithm on the following data and write down the cluster membership after each iteration.
(1, 1), (2, 1), (2, 4), (4, 4), (5, 0), (5, 5)
Your algorithm must use the minimum Euclidean point-to-point distance to measure the distance between clusters and you must iterate the algorithm until only one cluster remains.
[30%]
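The iterations asked for in 4 (b) can be checked with a short single-linkage sketch: repeatedly merge the two clusters whose closest pair of points is nearest, printing the membership after every merge.

```python
from itertools import combinations
from math import dist  # Python 3.8+

points = [(1, 1), (2, 1), (2, 4), (4, 4), (5, 0), (5, 5)]
clusters = [[p] for p in points]

while len(clusters) > 1:
    # single-linkage distance: minimum point-to-point distance between clusters
    i, j = min(
        combinations(range(len(clusters)), 2),
        key=lambda ij: min(dist(p, q)
                           for p in clusters[ij[0]] for q in clusters[ij[1]]),
    )
    clusters[i] += clusters[j]  # i < j, so deleting j leaves i valid
    del clusters[j]
    print(clusters)
```

Each printed line is one row of the answer table; the merge heights also give the dendrogram levels for part (c).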
c) Draw a dendrogram to illustrate the hierarchical clusterings generated in 4 (b).
[10%]
d) The data in 4 (b) is to be clustered into two clusters using the k-means clustering algorithm. By providing specific examples, show how the different initialisations of the cluster centres can lead to different results. [30%]
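One way to find the specific examples 4 (d) asks for is to run k-means (k = 2) from a few starting points and compare the converged clusterings. A sketch; the two initialisations below are illustrative choices, not prescribed by the question.

```python
from math import dist

points = [(1, 1), (2, 1), (2, 4), (4, 4), (5, 0), (5, 5)]

def kmeans(centres):
    # Standard two-step iteration: assign each point to its nearest centre,
    # then move each centre to the mean of its cluster, until nothing changes.
    # (With the initialisations used below no cluster ever becomes empty.)
    while True:
        groups = [[], []]
        for p in points:
            groups[min((0, 1), key=lambda i: dist(p, centres[i]))].append(p)
        new = [tuple(sum(c) / len(g) for c in zip(*g)) for g in groups]
        if new == centres:
            return groups
        centres = new

# Two initialisations that converge to different local optima:
print(kmeans([(2, 1), (4, 4)]))  # {(1,1),(2,1),(5,0)} vs {(2,4),(4,4),(5,5)}
print(kmeans([(1, 1), (2, 1)]))  # {(1,1),(2,1)} vs {(2,4),(4,4),(5,0),(5,5)}
```

Both runs converge, yet the final cluster memberships differ, which is exactly the behaviour the question asks you to demonstrate.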