Group Project Report Instructions

Description:

· The group project accounts for 20% of the final mark.

· Students are divided into groups of 3-4.

· The project concerns classification analysis. Given a dataset, the goal is to build an accurate classifier and make predictions on unseen records.

· Deadline: 18th Dec, 2023.

Submission Requirement:

Upon completion, each student must submit the following materials:

1. Test data with its predictions

2. Code

a) You MUST implement the following models yourself (write your own R program rather than using existing packages): KNN, Naïve Bayes, and Perceptron.

b) You MUST adopt at least two models in addition to the three above for your classification task (for example: logistic regression, association rules, decision tree, Bayesian belief network, neural network, bagging, boosting, random forest, SVM, etc.). You do not need to implement these yourself; you may use existing packages/libraries in R or Python.

c) Your program must run without errors. It must be able to read the training data and report the performance of the models, and to read the test data and produce predictions.

d) Include a README file that describes your code and explains how to run it.

3. Implementation report

In the report, the following components should be included:

1. Cover Page (indicating group number and member list)

2. Abstract or the workflow.

3. Brief introduction to the models adopted (your own implementations as well as those provided by existing libraries)

4. Experimental results of the different models, e.g., macro-averaged/micro-averaged precision, recall, and F-score, and ROC curves.

5. Result analysis (model evaluation / model interpretation)

a) Which model achieves the BEST performance on this dataset? Why?

b) Interpret the trained model.

c) Conduct error analysis for the models that do not perform well.
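
For the experimental results in item 4, the difference between macro-averaging and micro-averaging can be illustrated with a short sketch. The Python code below is only an illustration under stated assumptions, not a required or official implementation: the function names (per_class_counts, macro_micro) are hypothetical, each record is assumed to carry exactly one label, and macro F-score is taken as the unweighted mean of the per-class F-scores (one common convention).

```python
from collections import Counter


def per_class_counts(y_true, y_pred):
    """Count true positives, false positives and false negatives per class."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # class p was predicted, but wrongly
            fn[t] += 1  # the true class t was missed
    labels = sorted(set(y_true) | set(y_pred))
    return labels, tp, fp, fn


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    return safe_div(2 * precision * recall, precision + recall)


def macro_micro(y_true, y_pred):
    """Return (macro, micro) dictionaries of precision, recall and F1."""
    labels, tp, fp, fn = per_class_counts(y_true, y_pred)

    # Macro: compute each metric per class, then take the unweighted mean,
    # so every class counts equally regardless of its size.
    precisions = [safe_div(tp[c], tp[c] + fp[c]) for c in labels]
    recalls = [safe_div(tp[c], tp[c] + fn[c]) for c in labels]
    f_scores = [f_score(p, r) for p, r in zip(precisions, recalls)]
    n = len(labels)
    macro = {
        "precision": safe_div(sum(precisions), n),
        "recall": safe_div(sum(recalls), n),
        "f1": safe_div(sum(f_scores), n),
    }

    # Micro: pool the counts over all classes first, then compute the
    # metric once, so every record counts equally.
    tp_all, fp_all, fn_all = sum(tp.values()), sum(fp.values()), sum(fn.values())
    micro_p = safe_div(tp_all, tp_all + fp_all)
    micro_r = safe_div(tp_all, tp_all + fn_all)
    micro = {"precision": micro_p, "recall": micro_r, "f1": f_score(micro_p, micro_r)}
    return macro, micro


if __name__ == "__main__":
    macro, micro = macro_micro(["a", "a", "a", "b"], ["a", "a", "b", "b"])
    print("macro:", macro)
    print("micro:", micro)
```

On imbalanced data the two averages can differ noticeably, because macro-averaging weights a rare class as heavily as a frequent one; reporting both makes that visible.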

Assessment:

1. Classifier implementation and performance: 50%

2. Code: 10%

3. Project report: 20%

4. Others: 20%

a) Use classification methods that were not covered in class (or were covered only as optional material).

b) Apply a novel strategy that improves classifier performance, e.g., data preprocessing methods or techniques for imbalanced classification.

c) Implement other models yourself.
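
As one possible illustration of the strategies mentioned in item b), the sketch below shows random oversampling, a simple way to handle imbalanced classes. It is an assumption-laden example rather than a recommended or required method: the function name random_oversample is hypothetical, records are assumed to be plain Python lists, and whether oversampling helps at all depends on the dataset.

```python
import random
from collections import defaultdict


def random_oversample(records, labels, seed=0):
    """Duplicate minority-class records at random until every class
    has as many records as the largest class."""
    if not records:
        return [], []
    rng = random.Random(seed)  # fixed seed so the result is reproducible

    # Group the records by their class label.
    by_class = defaultdict(list)
    for rec, lab in zip(records, labels):
        by_class[lab].append(rec)

    target = max(len(rows) for rows in by_class.values())

    out_records, out_labels = [], []
    for lab, rows in by_class.items():
        # Draw extra copies (with replacement) for classes below the target.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for rec in rows + extra:
            out_records.append(rec)
            out_labels.append(lab)
    return out_records, out_labels


if __name__ == "__main__":
    X = [[0], [1], [2], [3], [4]]
    y = ["neg", "neg", "neg", "neg", "pos"]
    X_bal, y_bal = random_oversample(X, y)
    print(len(X_bal), y_bal.count("neg"), y_bal.count("pos"))
```

Resampling of this kind should be applied to the training data only; the test data must be left unchanged so that the reported performance reflects unseen records.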