MAT700 Mathematical Methods for Data Mining 2020/21
1. (a) i. What is supervised learning? Give an example.
ii. What is unsupervised learning? Give an example.
iii. What is semi-supervised learning? Give an example.
(b) Prove that the edit distance is indeed a distance measure.
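Though the proof itself is pen-and-paper, the metric axioms are easy to sanity-check numerically. Below is a minimal sketch, assuming the insertion/deletion-only definition of edit distance used in data-mining texts; if substitutions are also allowed, one extra case is needed in the recurrence.

```python
def edit_distance(s, t):
    # DP table: d[i][j] = edit distance between s[:i] and t[:j],
    # counting insertions and deletions only.
    d = [[0] * (len(t) + 1) for _ in range(len(s) + 1)]
    for i in range(len(s) + 1):
        d[i][0] = i                          # delete all of s[:i]
    for j in range(len(t) + 1):
        d[0][j] = j                          # insert all of t[:j]
    for i in range(1, len(s) + 1):
        for j in range(1, len(t) + 1):
            if s[i - 1] == t[j - 1]:
                d[i][j] = d[i - 1][j - 1]    # characters match, no cost
            else:
                d[i][j] = min(d[i - 1][j] + 1,   # delete s[i-1]
                              d[i][j - 1] + 1)   # insert t[j-1]
    return d[len(s)][len(t)]
```

With this in hand one can spot-check symmetry and the triangle inequality on arbitrary string triples before writing the formal proof.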
(c) Suppose there is a repository of 512 documents, and word w appears in 32 of them. In a particular document D, the maximum number of occurrences of any word is 15. What is the TF.IDF score for w in D if w appears six times in D?
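A quick numeric check, assuming the usual TF.IDF definition in which term frequency is normalised by the count of the most frequent word in the document and IDF uses a base-2 logarithm:

```python
import math

# TF = f / max_f, IDF = log2(N / n_w), score = TF * IDF.
N, n_w = 512, 32      # repository size; documents containing w
f, max_f = 6, 15      # occurrences of w in D; max occurrences of any word in D

tf = f / max_f                 # 6/15 = 0.4
idf = math.log2(N / n_w)       # log2(16) = 4
print(tf * idf)                # 1.6
```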
(d) i. Prove that the Jaccard similarity of bags is always less than or equal to 1/2.
ii. Compute the Jaccard similarities of the following two bags:
{a, b, b, c, c, c, d, d, d, d, e, e, e, e, e} and
{x, x, e, d, d, c, c, b, b, b, b, a, a, a, a, a}.
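The computation can be checked mechanically. This sketch assumes the bag conventions under which part i's bound holds: the intersection takes the minimum count of each element, and the union size is the sum of the two bag sizes.

```python
from collections import Counter

# The two bags from the question, as element -> count maps.
B1 = Counter({'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5})            # size 15
B2 = Counter({'x': 2, 'e': 1, 'd': 2, 'c': 2, 'b': 4, 'a': 5})    # size 16

inter = sum(min(B1[k], B2[k]) for k in set(B1) | set(B2))  # 1+2+2+2+1 = 8
union = sum(B1.values()) + sum(B2.values())                # 15 + 16 = 31
print(inter, union)   # Jaccard similarity = 8/31
```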
2. (a) Describe the naïve Bayesian classification.
(b) Explain how any hierarchical clustering algorithm works and how we choose
which two clusters to merge.
(c) During testing of a Spam/NotSpam classifier on a test set of 1000 e-mail messages with 200 spam messages and 800 non-spam messages, you found that 180 spam messages were correctly classified as spam, and 600 non-spam messages were correctly classified as not spam. Calculate the precision, recall and F-measure of this classifier.
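The confusion-matrix counts follow directly from the question once spam is taken as the positive class; a short check:

```python
# Confusion counts implied by the question (Spam = positive class).
tp = 180          # spam correctly flagged as spam
fn = 200 - 180    # spam missed (20)
tn = 600          # non-spam correctly passed
fp = 800 - 600    # non-spam wrongly flagged (200)

precision = tp / (tp + fp)                           # 180/380 = 9/19
recall = tp / (tp + fn)                              # 180/200 = 0.9
f1 = 2 * precision * recall / (precision + recall)   # 18/29
print(round(precision, 3), round(recall, 3), round(f1, 3))
```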
(d) In a class of 250 students, suppose that the birthdays are independent and uniformly distributed over the 365 days of the year. We are interested in the event that some students have their birthday on the same day. Calculate the expectation of the number of triples of students having
the same birthday.
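By linearity of expectation the answer is C(250, 3) · (1/365)²: each triple shares a birthday with probability (1/365)², and expectations add regardless of dependence between triples. A one-line check:

```python
from math import comb

n_triples = comb(250, 3)          # 2,573,000 triples of students
expectation = n_triples / 365**2  # each triple matches with prob (1/365)^2
print(expectation)                # ≈ 19.31
```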
3. (a) What are spider traps and what problems do they cause in PageRank calculations? Explain how the improved PageRank model:
ν′ = (1 − β)e/n + βMν
overcomes the problem of spider traps.
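To see the taxation term at work, here is a power-iteration sketch on a hypothetical two-page graph (not the graph of Fig. 1) in which page 1 is a spider trap; the graph and the choice β = 0.8 are illustrative assumptions.

```python
import numpy as np

# Power iteration for v' = (1-β) e/n + β M v.
# Column j of M holds the out-links of page j with equal shares.
beta = 0.8
M = np.array([[0.0, 0.0],   # page 0 links only to page 1
              [1.0, 1.0]])  # page 1 links only to itself (spider trap)
n = M.shape[0]
v = np.full(n, 1.0 / n)
for _ in range(100):
    v = (1 - beta) / n + beta * (M @ v)
print(v)  # taxation keeps page 0's rank above zero despite the trap
```

Without the (1 − β)e/n term, all rank would drain into the trap; with it, the iteration converges to ν = (0.1, 0.9) here.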
(b) Compute the PageRank of each page in Fig. 1, assuming β = 0.8.
(c) For a given data set of 3-D inputs, we apply the SVM (support vector machine) learning algorithm and obtain the optimal decision plane:
x1 + 2x2 + 2x3 + 3 = 0
i. What is the margin of this SVM?
ii. For three examples in the training set, (−1, −1, −2) with label −1, (2, −2, −1) with label −1, and (−2, −2, 2) with label +1, determine whether each is a support vector of the decision plane.
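Both checks are mechanical once one assumes the canonical SVM form, in which support vectors satisfy y(w · x + b) = 1 and the margin is 2/‖w‖; a sketch:

```python
import numpy as np

# Decision plane w.x + b = 0 with w = (1, 2, 2), b = 3 from the question.
w, b = np.array([1.0, 2.0, 2.0]), 3.0
print(2 / np.linalg.norm(w))            # margin = 2/3

for x, y in [((-1, -1, -2), -1), ((2, -2, -1), -1), ((-2, -2, 2), +1)]:
    fm = y * (w @ np.array(x) + b)      # functional margin y(w.x + b)
    print(x, "support vector" if np.isclose(fm, 1.0) else "not a support vector")
```

Here the second and third examples sit exactly on the margin (functional margin 1), while the first lies well outside it.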
4. (a) What is the unnormalized graph Laplacian operator L for a weighted graph?
(b) Prove that if a function f on a connected weighted graph belongs to the kernel of the unnormalized graph Laplacian operator, i.e. Lf = 0, then f is a constant function.
(c) Explain the kernel trick and Mercer’s condition. Show how to kernelize the nearest neighbour classifier.
(d) What is the kNN classification algorithm? Explain how to weight the contributions of the neighbours.
(e) Formulate the k-means algorithm. What is the objective function of the k-means algorithm? Prove the convergence of the k-means algorithm.
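A minimal sketch of Lloyd's k-means (random initial centroids are a choice made here, not part of the question). Each assignment step and each centroid-update step weakly decreases the within-cluster sum of squared distances, which is the core of the convergence argument the question asks for.

```python
import random

def kmeans(points, k, iters=100):
    # Lloyd's algorithm: alternate (1) assign each point to its nearest
    # centroid and (2) move each centroid to the mean of its cluster.
    centroids = random.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda j: sum((a - b) ** 2
                                      for a, b in zip(p, centroids[j])))
            clusters[j].append(p)
        # Recompute centroids; keep the old one if a cluster goes empty.
        centroids = [tuple(sum(c) / len(c) for c in zip(*cl)) if cl
                     else centroids[j]
                     for j, cl in enumerate(clusters)]
    return centroids, clusters

random.seed(1)  # for reproducibility of this illustration
pts = [(0.0, 0.0), (0.1, 0.0), (10.0, 10.0), (10.1, 10.0)]
centroids, clusters = kmeans(pts, 2)
print(centroids)
```

On this toy data the two tight groups are recovered regardless of initialisation, since neither step can increase the objective and there are finitely many partitions.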