COMPSCI 3DM3 Winter 2023 Introduction to Data Mining Assignment 1
Total marks: 100 (5% of the final grade)
Assignment 1.1 (40 marks)
a) Show a dataset of four 2-dimensional points with class labels “+” (i.e., positive) or “-” (i.e., negative) that are not linearly separable. (20 marks)
b) Provide a nonlinear transformation that would make the dataset proposed in part a) linearly separable. (20 marks)
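As an illustration of the idea only, the sketch below uses a hypothetical 1-dimensional toy dataset (not the 2-dimensional dataset the question asks for). The points, the feature map phi, and the weights w and b are all assumptions chosen for this example. It shows labels that no single threshold can separate becoming separable by a linear rule once each point is mapped to (x, x^2).

```python
# Hypothetical 1-D toy data; these values are assumptions for illustration
# and are NOT the 2-D dataset requested in part a).
points = [-2.0, -1.0, 1.0, 2.0]
labels = ["+", "-", "-", "+"]


def threshold_separable(xs, ys):
    """True if one threshold puts every '+' on one side and every '-' on the other."""
    pos = [x for x, y in zip(xs, ys) if y == "+"]
    neg = [x for x, y in zip(xs, ys) if y == "-"]
    return max(pos) < min(neg) or max(neg) < min(pos)


def phi(x):
    """Assumed nonlinear feature map: x -> (x, x^2)."""
    return (x, x * x)


def linear_classify(z, w, b):
    """Label a mapped point by the sign of w . z + b."""
    score = sum(wi * zi for wi, zi in zip(w, z)) + b
    return "+" if score > 0 else "-"


# Assumed separating line in the mapped space: x^2 - 2.5 = 0.
w, b = (0.0, 1.0), -2.5
mapped = [phi(x) for x in points]
predictions = [linear_classify(z, w, b) for z in mapped]

print("separable in original space:", threshold_separable(points, labels))
print("separable after mapping:", predictions == labels)
```

The same pattern (propose a map, then exhibit a separating line in the mapped space) carries over to 2-dimensional data, although the map itself will differ.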
Assignment 1.2 (60 marks)
Consider the following database of transactions:
TransID   Items
1         a, b, c, d
2         b, c, e, f
3         a, d, e, f
4         a, e, f
5         b, d, f
Assume an absolute minimum support level of 2. For each level-wise pass of the Apriori algorithm, show the candidate itemsets generated from the join step, the candidate itemsets remaining after pruning, and the frequent itemsets.
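The level-wise pass described above (join, prune, then count support) can be sketched in Python roughly as follows. This is a minimal sketch and not a reference solution: the function name apriori_levels, the dictionary keys, and the four-transaction toy database are all hypothetical, and the toy database deliberately differs from the one in this assignment. Itemsets are kept as sorted tuples so that the join can merge two frequent itemsets sharing the same prefix.

```python
from itertools import combinations


def apriori_levels(transactions, min_support):
    """Record, per level, the joined candidates, the candidates left after
    pruning, and the frequent itemsets with their absolute support counts."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    joined = [(item,) for item in items]  # level-1 candidates: every single item
    prev_frequent = None
    levels = []
    k = 1
    while joined:
        if prev_frequent is None:
            pruned = list(joined)  # nothing to prune at level 1
        else:
            # Prune step: keep a candidate only if all (k-1)-subsets are frequent.
            pruned = [c for c in joined
                      if all(s in prev_frequent for s in combinations(c, k - 1))]
        # Count step: absolute support of each surviving candidate.
        frequent = {}
        for c in pruned:
            count = sum(1 for t in transactions if t.issuperset(c))
            if count >= min_support:
                frequent[c] = count
        levels.append({"level": k, "joined": joined,
                       "pruned": pruned, "frequent": frequent})
        # Join step: merge frequent k-itemsets that agree on their first k-1 items.
        ordered = sorted(frequent)
        joined = [a + (b[-1],) for a, b in combinations(ordered, 2)
                  if a[:-1] == b[:-1]]
        prev_frequent = frequent
        k += 1
    return levels


# Hypothetical toy database (an assumption, NOT the assignment's transactions).
toy_db = [{"p", "q", "r"}, {"p", "q"}, {"p", "r"}, {"q", "s"}]
levels = apriori_levels(toy_db, min_support=2)
for lvl in levels:
    print("Level", lvl["level"])
    print("  joined  :", lvl["joined"])
    print("  pruned  :", lvl["pruned"])
    print("  frequent:", lvl["frequent"])
```

On this toy database the level-3 candidate (p, q, r) is produced by the join but removed by pruning, because its subset (q, r) is not frequent at level 2.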
Note: for each of the following questions, you can EITHER derive the answers by hand OR write a computer program in Python to produce the answers.
• If you choose to derive the answers by hand, please write down your answers.
• If you choose to write a computer program, please submit your program as a zip file to Avenue. The zip file should include a main.py file, which can be directly executed to produce the results (i.e., print the results on screen) for each of the questions in an easy-to-read manner. If the answers are not easy to read or understand, marks will be deducted.
a) Show level 1 candidate itemsets and frequent itemsets. (15 marks)
b) Show level 2 candidate itemsets and frequent itemsets. (30 marks)
c) Show candidate itemsets and frequent itemsets for the rest of the level(s). (15 marks)
2023-02-04