COMP10082 Machine Learning for Data Analytics
Coursework (Semester 1: 2023)
Component 1 – Portfolio of Practical Work (Weighting 50%)
Due Date 4th December 2023
Overview
This component consists of a portfolio of the coursework students have completed during the module.
Submission
Students are required to submit the following.
Certificates for the following Kaggle Courses
1. Intro to Machine Learning
2. Intermediate Machine Learning
3. Pandas
4. Data Cleaning
Completed Jupyter Notebooks
1. Week 4 Part 3: Multiple Regression Exercise from scratch
2. Week 6 Part 2: Classification with SVM
3. Week 8 Part 2: k-means clustering from scratch
4. Week 10 Part 3: Regression model with Keras (Creating your own models)
Component 2 – Malware Classification and Report (Weighting 50%)
Due Date 5th January 2024
Overview
Download the Drebin dataset from Moodle. The drebin-215-dataset contains 215 software attributes extracted from over 15,000 Android applications, along with a classification in the final column, ‘class’. The dataset-features-categories file describes each value in the feature vector and explains the class.
Students should read the data into a Jupyter notebook and apply a variety of classification models (minimum of 3) to the data, supported by a critical analysis of each approach taken. Your analysis should explain your approach and give an overview of each model you have applied, including its rationale and mathematical basis. You should support your findings with empirical observations from your models and appropriate data visualisations. Your analysis should identify the strengths and weaknesses of each approach, and you should conclude by recommending a classification technique based on the work you have conducted and your empirical observations.
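As a starting point only, the workflow described above (read the data, fit at least three classifiers, collect metrics for comparison) might be sketched as follows. Everything beyond the ‘class’ column name is an assumption made for illustration: the helper name `compare_classifiers`, the choice of logistic regression, an RBF-kernel SVM and a random forest, the 80/20 stratified split, the rule of coercing non-numeric feature values to 0, and the use of accuracy and macro F1 as metrics. None of these is prescribed by the coursework.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def compare_classifiers(df, label_col="class", test_size=0.2, seed=0):
    """Fit three classifiers on one train/test split and return a metrics table.

    Hypothetical helper: the models, split and cleaning rule are illustrative
    assumptions, not requirements of the coursework.
    """
    # Coerce every feature to numeric; entries that cannot be parsed become
    # NaN and are then filled with 0 (an assumed cleaning rule).
    X = df.drop(columns=[label_col]).apply(pd.to_numeric, errors="coerce").fillna(0)
    # Encode the labels as integer codes (categories are sorted alphabetically).
    y = df[label_col].astype("category").cat.codes

    # Stratify so both splits keep the same class proportions.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )

    models = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "svm_rbf": SVC(kernel="rbf"),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }

    rows = []
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        rows.append({
            "model": name,
            "accuracy": accuracy_score(y_test, pred),
            "macro_f1": f1_score(y_test, pred, average="macro"),
        })
    return pd.DataFrame(rows).set_index("model")
```

In a notebook this could be called as `compare_classifiers(pd.read_csv(path))`, where `path` points to the dataset file downloaded from Moodle. A single split gives only a rough comparison; cross-validation, confusion matrices and ROC curves would provide stronger evidence for the critical analysis.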
Submission
You should submit a single Jupyter notebook which contains both your models and analysis.
Marking Schemes
Component 1 - Portfolio
Kaggle Course Certificates
Each certificate is worth 10% of the marks for this component and will be marked as either a pass or fail. (Maximum of 40%).
Jupyter Notebooks
Each notebook is worth 15% of the marks for this component. It is expected that notebooks will be annotated with a description of and rationale for the steps taken and that results presented are fully explained. (Maximum of 60%).
Grade | Jupyter Notebook |
A1 | No improvement possible |
A2 | All notebooks are completed to outstanding standards reflecting a deep understanding of the models created. The code produced is professionally structured and makes full use of the techniques introduced on the module. |
A3 | Fully completed to a professional standard; a clear understanding of the models created is obvious. The code produced is well structured and makes appropriate use of the techniques introduced on the module. |
B1 | Completed to a high standard; an understanding of the models created is obvious. The code produced is structured and makes appropriate use of the techniques introduced on the module. |
B2 | Models are created and largely correct. Code is structured but may have omissions or lack clarity. |
C | Models are created and mainly correct. Code is structured but may have omissions or lack clarity. |
D | Models are created and all elements run, although some output may be incorrect. Code is structured but may have omissions or lack clarity. |
E | Models do not run in the notebook or output is largely incorrect. Code produced lacks structure. |
N | No attempt |
Component 2 - Malware Classification and Report
Grade | Jupyter Notebook | Analysis |
A1 | No improvement possible | |
A2 | All notebooks are completed to outstanding standards reflecting a deep understanding of the models created. The code produced is professionally structured and makes full use of the techniques introduced on the module. | Analysis is insightful and fully supported by metrics and data visualisations produced from the models created. A full understanding of the mathematical principles supporting the work is demonstrated. |
A3 | Fully completed to a professional standard; a clear understanding of the models created is obvious. The code produced is well structured and makes appropriate use of the techniques introduced on the module. | Analysis is clear and well supported by metrics and data visualisations produced from the models created. A good understanding of the mathematical principles supporting the work is demonstrated. |
B1 | Completed to a high standard; an understanding of the models created is obvious. The code produced is structured and makes appropriate use of the techniques introduced on the module. | Analysis is supported by metrics and data visualisations produced from the models created. An understanding of the mathematical principles supporting the work is demonstrated. |
B2 | Models are created and largely correct. Code is structured but may have omissions or lack clarity. | Analysis is supported by metrics and data visualisations produced from the models created. Some understanding of the mathematical principles supporting the work is demonstrated. |
C | Models are created and mainly correct. Code is structured but may have omissions or lack clarity. | Analysis is partially supported by metrics and data visualisations produced from the models created. Some understanding of the mathematical principles supporting the work is demonstrated. |
D | Models are created and all elements run, although some output may be incorrect. Code is structured but may have omissions or lack clarity. | Some metrics and data visualisations are produced from the models created. A limited understanding of the mathematical principles supporting the work is demonstrated. |
E | Models do not run in the notebook or output is largely incorrect. Code produced lacks structure. | Limited or no output. No theoretical support for the models is included. |
N | No attempt | |
|