COM 2004
DEPARTMENT OF COMPUTER SCIENCE
DATA DRIVEN COMPUTING
AUTUMN SEMESTER 2014–2015
1. a) A Gaussian distribution whose mean is 100 and whose standard deviation is 10 is used to generate N random samples. The samples are then used to estimate the mean, µ, and variance, S, of the Gaussian using the formulae below.
µ = (1/N) Σ_{i=1}^{N} x_i
S = (1/N) Σ_{i=1}^{N} (x_i − µ)²
(i) Use a series of sketches to show how the distribution of the mean estimate changes as the value of N increases. [20%]
(ii) Use a sketch to show how the distribution of the variance estimate changes as the value of N increases. [20%]
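The behaviour the sketches should capture can be checked numerically. The simulation below is a sketch, not part of the required written answer; the function name and trial counts are illustrative. It repeatedly draws N samples from N(100, 10²): the spread of the mean estimate shrinks like σ/√N, and the 1/N variance estimate concentrates near σ² = 100 but is biased low for small N.

```python
import random

random.seed(0)  # reproducible

def estimates(n, trials=1000):
    # Draw `trials` datasets of size n from N(100, 10^2) and return
    # (spread of the mean estimate, average of the 1/N variance estimate).
    means, variances = [], []
    for _ in range(trials):
        xs = [random.gauss(100, 10) for _ in range(n)]
        m = sum(xs) / n
        means.append(m)
        variances.append(sum((x - m) ** 2 for x in xs) / n)
    spread = (sum((m - 100) ** 2 for m in means) / trials) ** 0.5
    return spread, sum(variances) / trials

for n in (5, 50, 500):
    print(n, estimates(n))  # spread shrinks like 10/sqrt(n); variance -> 100
```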
b) Alice has two routes for travelling to work. Travel times on these routes are normally distributed.
- Route A takes on average 60 minutes with a standard deviation of 20 minutes.
- Route B takes on average 70 minutes with a standard deviation of 10 minutes.
Work starts at precisely 9:00 and Alice does not want to be late.
(i) Which route should Alice take if leaving home at 8:00? Justify your answer. [10%]
(ii) Which route should Alice take if leaving home at 7:30? Justify your answer. [10%]
(iii) At what departure time would both routes be equally likely to get Alice to work by 9:00? [10%]
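One way to check the reasoning in part (b) is to evaluate P(travel time ≤ available time) for each route with the standard normal CDF; the function name below is illustrative.

```python
from math import erf, sqrt

def p_on_time(minutes_available, mean, sd):
    # P(travel time <= minutes_available) for a normally distributed travel
    # time, via the standard normal CDF: Phi(z) = (1 + erf(z / sqrt 2)) / 2
    z = (minutes_available - mean) / sd
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Leaving at 8:00 allows 60 minutes of travel time:
print(p_on_time(60, 60, 20), p_on_time(60, 70, 10))  # route A more likely on time
# Leaving at 7:30 allows 90 minutes:
print(p_on_time(90, 60, 20), p_on_time(90, 70, 10))  # route B more likely on time
# The probabilities coincide when (t - 60)/20 = (t - 70)/10, i.e. t = 80,
# so departing at 7:40 makes the two routes equally likely to be on time.
print(p_on_time(80, 60, 20) == p_on_time(80, 70, 10))
```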
c) A process is generating independent uniformly distributed random numbers, x,
x ~ U(a, b)
You observe the sequence of samples: 5.1, 10.2, 8.4 and 3.1. Using maximum-likelihood parameter estimation, what would be the best estimates of the parameters a and b? Explain your reasoning.
[30%]
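The maximum-likelihood reasoning for part (c) is short enough to verify directly. A sketch, not the prescribed model answer: for U(a, b) the likelihood is (1/(b − a))^N when every sample lies in [a, b] and zero otherwise, so it is maximised by the tightest interval covering the data.

```python
samples = [5.1, 10.2, 8.4, 3.1]
# The likelihood (1/(b - a))^N grows as b - a shrinks, but is zero once any
# sample falls outside [a, b]; hence the ML estimates are the sample extremes.
a_hat, b_hat = min(samples), max(samples)
print(a_hat, b_hat)  # 3.1 10.2
```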
2. a) Consider two classes, ω1 and ω2, each containing two points as follows,
ω1 = {(−2, 1)^T, (−1, 1)^T}
ω2 = {(0, 0)^T, (1, 0)^T}
Consider a linear classifier with discriminant function g(x) = w1x1 + w2x2 + w0. Using initial weights of w1 = 1, w2 = 1 and w0 = −2 and a learning rate of 0.5, run the perceptron learning algorithm by hand until all the points are correctly classified (show all your workings). [50%]
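The hand workings for 2 (a) can be cross-checked with a small script. This is a sketch under one stated assumption: a sample is assigned to class 1 when g(x) > 0, and a sample lying exactly on the boundary (g(x) = 0) counts as misclassified.

```python
def perceptron(data, w, w0, eta):
    # data: list of (x, label) with label +1 for class 1, -1 for class 2.
    # Sweep the samples in order; on each mistake, nudge the weights by
    # eta * label * x (and the bias by eta * label); stop on a clean pass.
    changed = True
    while changed:
        changed = False
        for (x1, x2), label in data:
            g = w[0] * x1 + w[1] * x2 + w0
            if label * g <= 0:  # misclassified (boundary counts as wrong)
                w = [w[0] + eta * label * x1, w[1] + eta * label * x2]
                w0 += eta * label
                changed = True
    return w, w0

data = [((-2, 1), +1), ((-1, 1), +1), ((0, 0), -1), ((1, 0), -1)]
print(perceptron(data, [1.0, 1.0], -2.0, 0.5))  # ([-0.5, 2.0], -1.0)
```

With this update convention the algorithm converges after two corrections, both triggered by the class-1 points.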
b) Plot the points given in 2 (a) in feature space and plot the decision boundaries given by the initial and final weights. [10%]
c) Consider two classes, ω1 and ω2, each containing two points as follows,
ω1 = {(1, 0)^T, (1, 2)^T}
ω2 = {(0, 1)^T, (2, 1)^T}
When using this data the perceptron learning algorithm will never terminate regardless of what values are chosen for the starting weights. Explain why. [10%]
d) Draw a two-layer perceptron that can correctly classify the points in 2 (c). [30%]
3. a) Use pseudocode to show how samples are classified using a K-nearest neighbour classifier. [30%]
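A runnable rendering of the kind of pseudocode 3 (a) asks for, offered as a sketch: rank the training samples by distance to the query, take the K closest, and return the majority label (ties are broken arbitrarily here). The example at the end uses, for illustration, the training data from part (b).

```python
from collections import Counter

def knn_classify(train, x, k):
    # train: list of (point, label) pairs; x: query point.
    # Squared Euclidean distance preserves the ranking, so no sqrt is needed.
    by_dist = sorted(train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))
    votes = Counter(label for _, label in by_dist[:k])
    return votes.most_common(1)[0][0]

train = [((0, 0), 'w1'), ((2, 0), 'w1'),
         ((-2, 2), 'w2'), ((-2, -2), 'w2'), ((0, 2), 'w2'), ((0, -2), 'w2')]
print(knn_classify(train, (1, 0), 1))  # w1
```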
b) Consider a K-nearest neighbour classifier that uses a Euclidean distance measure, K = 1, and the following samples as training data,
ω1 = {(0, 0)^T, (2, 0)^T}
ω2 = {(−2, 2)^T, (−2, −2)^T, (0, 2)^T, (0, −2)^T}
A point is selected uniformly at random from the region defined by −2 < x1 < 2, −2 < x2 < 2. What is the probability that the point is classified as belonging to class ω1? [Hint: start by sketching the decision boundary.]
[40%]
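The sketched decision boundary of 3 (b) can be sanity-checked by brute force. A sketch, assuming tie-breaking on the boundary is irrelevant (the boundary has measure zero): evaluate, on a fine grid of cell centres over the square, the fraction of points whose nearest training sample belongs to ω1.

```python
w1 = [(0, 0), (2, 0)]
w2 = [(-2, 2), (-2, -2), (0, 2), (0, -2)]

def nearest_is_w1(x1, x2):
    # True when the nearest training sample (squared Euclidean) is in w1;
    # exact ties are counted as w2, which only affects a measure-zero set.
    d1 = min((x1 - a) ** 2 + (x2 - b) ** 2 for a, b in w1)
    d2 = min((x1 - a) ** 2 + (x2 - b) ** 2 for a, b in w2)
    return d1 < d2

n = 400  # grid resolution; cell centres avoid the boundary lines
hits = sum(
    nearest_is_w1(-2 + (i + 0.5) * 4 / n, -2 + (j + 0.5) * 4 / n)
    for i in range(n)
    for j in range(n)
)
print(hits / n ** 2)  # close to 0.5
```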
c) Consider the distance measure
d(x1, x2) = 1 − cos(θ)
where θ is the angle between vectors x1 and x2.
Two separate 1-nearest neighbour classifiers are constructed, one using the Euclidean distance and the other using the distance measure defined above. The following labelled samples are provided as training data,
ω1 = {(1, 0)^T, (0, 1)^T}
ω2 = {(1, 1)^T}
Sketch the region of feature space in which the two classifiers would be in agreement.
[30%]
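A quick way to explore 3 (c) before sketching: classify trial points with both 1-NN classifiers and see where they agree. The angular measure cares only about direction, so near the origin along the 45° line the two disagree; the helper names below are illustrative.

```python
from math import sqrt

def euclid(u, v):
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cos_dist(u, v):
    # 1 - cos(theta), theta the angle between u and v (undefined at the origin)
    dot = sum(a * b for a, b in zip(u, v))
    norm = lambda w: sqrt(sum(c * c for c in w))
    return 1 - dot / (norm(u) * norm(v))

train = [((1, 0), 'w1'), ((0, 1), 'w1'), ((1, 1), 'w2')]

def nn(x, dist):
    # 1-nearest-neighbour label under the given distance measure
    return min(train, key=lambda s: dist(s[0], x))[1]

# (0.1, 0.1) lines up angularly with (1,1) but is Euclidean-closer to class w1:
print(nn((0.1, 0.1), euclid), nn((0.1, 0.1), cos_dist))  # w1 w2
```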
4. a) Give an account of the different ways in which distances between clusters can be defined. [30%]
b) Run the agglomerative clustering algorithm on the following data and write down the cluster membership after each iteration.
(1, 1), (2, 1), (2, 4), (4, 4), (5, 0), (5, 5)
Your algorithm must use the minimum Euclidean point-to-point distance to measure the distance between clusters and you must iterate the algorithm until only one cluster remains.
[30%]
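The iterations asked for in 4 (b) can be checked with a short single-linkage sketch: repeatedly merge the two clusters whose closest pair of points is nearest, printing the membership after every merge.

```python
from itertools import combinations
from math import dist  # Python 3.8+

points = [(1, 1), (2, 1), (2, 4), (4, 4), (5, 0), (5, 5)]
clusters = [[p] for p in points]

while len(clusters) > 1:
    # single-linkage distance: minimum point-to-point distance between clusters
    i, j = min(
        combinations(range(len(clusters)), 2),
        key=lambda ij: min(dist(p, q)
                           for p in clusters[ij[0]] for q in clusters[ij[1]]),
    )
    clusters[i] += clusters[j]  # i < j, so deleting j leaves i valid
    del clusters[j]
    print(clusters)
```

Each printed line is one row of the answer table; the merge heights also give the dendrogram levels for part (c).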
c) Draw a dendrogram to illustrate the hierarchical clusterings generated in 4 (b).
[10%]
d) The data in 4 (b) is to be clustered into two clusters using the k-means clustering algorithm. By providing specific examples, show how the different initialisations of the cluster centres can lead to different results. [30%]
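One way to find the specific examples 4 (d) asks for is to run k-means (k = 2) from a few starting points and compare the converged clusterings. A sketch; the two initialisations below are illustrative choices, not prescribed by the question.

```python
from math import dist

points = [(1, 1), (2, 1), (2, 4), (4, 4), (5, 0), (5, 5)]

def kmeans(centres):
    # Standard two-step iteration: assign each point to its nearest centre,
    # then move each centre to the mean of its cluster, until nothing changes.
    # (With the initialisations used below no cluster ever becomes empty.)
    while True:
        groups = [[], []]
        for p in points:
            groups[min((0, 1), key=lambda i: dist(p, centres[i]))].append(p)
        new = [tuple(sum(c) / len(g) for c in zip(*g)) for g in groups]
        if new == centres:
            return groups
        centres = new

# Two initialisations that converge to different local optima:
print(kmeans([(2, 1), (4, 4)]))  # {(1,1),(2,1),(5,0)} vs {(2,4),(4,4),(5,5)}
print(kmeans([(1, 1), (2, 1)]))  # {(1,1),(2,1)} vs {(2,4),(4,4),(5,0),(5,5)}
```

Both runs converge, yet the final cluster memberships differ, which is exactly the behaviour the question asks you to demonstrate.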