COMP4131 Data Modelling and Analysis
This assignment requires you to work individually.
You will need to analyse a data set using all the data modelling and analysis steps you have learnt to create and compare your trained models.
You will write up your work as an academic paper (6 to 8 pages, including references and diagrams) that compares and analyses the results of your data modelling and analysis pathway, as set out in this coursework specification.
Coursework Deliverable Requirements:
Submit your academic paper following this naming convention:
Name_ID.docx or Name_ID.pdf.
For example: XiaomingChen_20712345.docx
Submit your Jupyter Notebook file following this naming convention:
Name_ID.ipynb.
For example: XiaomingChen_20712345.ipynb
Zip the following files:
Submit your zip file following this naming convention: Name_ID.zip.
For example: XiaomingChen_20712345.zip
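The naming conventions above can be checked before upload with a short script. The following is a minimal sketch, not an official checker: the function name `is_valid_submission_name` is hypothetical, and the pattern assumes that "Name" contains only letters and "ID" only digits, as in the example XiaomingChen_20712345. The specification itself does not define those parts more precisely.

```python
import re

# Assumption: "Name" is letters only and "ID" is digits only, mirroring the
# example XiaomingChen_20712345. The specification does not state this.
SUBMISSION_NAME = re.compile(r"[A-Za-z]+_\d+\.(docx|pdf|ipynb|zip)")


def is_valid_submission_name(filename: str) -> bool:
    """Return True if filename looks like Name_ID.<allowed extension>."""
    return SUBMISSION_NAME.fullmatch(filename) is not None


if __name__ == "__main__":
    for name in ["XiaomingChen_20712345.ipynb", "report_final.pdf"]:
        print(name, is_valid_submission_name(name))
```

Running the check on each file before zipping catches a misnamed file early.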
Issue Date: 16 March 2026
Submission Date: 6 May 2026, by 5.00 PM.
Submission Mechanism: Via Moodle.
Late Policy (University of Nottingham default will apply, if blank): The standard late submission policy applies, i.e. 5% deduction of the total mark for every 24 hours (including weekends and holidays).
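As an illustration of how the late deduction accumulates, the sketch below computes the penalty for a given submission time. It rests on two assumptions that the policy text does not confirm: that "5% of the total mark" means 5 marks on a 0 to 100 scale, and that every started 24-hour period after the deadline counts as a full period. The function names are hypothetical, and times are treated as local times without a time zone.

```python
import math
from datetime import datetime

DEADLINE = datetime(2026, 5, 6, 17, 0)  # 6 May 2026, 5.00 PM (from the specification)
PENALTY_PER_PERIOD = 5.0  # assumed: 5 marks on a 0-100 scale per 24 hours
SECONDS_PER_PERIOD = 24 * 60 * 60


def late_penalty(submitted_at: datetime) -> float:
    """Marks deducted for a submission made at submitted_at.

    Assumption: a part-period counts as a full 24-hour period; the
    specification does not say how part-days are treated.
    """
    seconds_late = (submitted_at - DEADLINE).total_seconds()
    if seconds_late <= 0:
        return 0.0
    periods = math.ceil(seconds_late / SECONDS_PER_PERIOD)
    return min(100.0, periods * PENALTY_PER_PERIOD)


def penalised_mark(raw_mark: float, submitted_at: datetime) -> float:
    """Apply the late penalty to a raw mark, never going below zero."""
    return max(0.0, raw_mark - late_penalty(submitted_at))


if __name__ == "__main__":
    # Submitted 16 hours after the deadline: one started period.
    print(penalised_mark(68.0, datetime(2026, 5, 7, 9, 0)))
```

Under these assumptions, a paper with a raw mark of 68 submitted 16 hours late would receive 63. Confirm the exact rule with the module convenor if a late submission is likely.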
Feedback Date: 12 June 2026
Instructions
For this coursework assignment, you will work individually to analyse a dataset of your choice. You may use any publicly available dataset from online sources such as the UCI Machine Learning Repository, Kaggle, or other reputable platforms. Your task is to apply the data modelling and analysis techniques you have learned in the course to preprocess the data, train models, and compare the performance of your trained models with state-of-the-art methods. Your work should demonstrate a clear understanding of the data modelling pipeline, from data preparation to evaluation, and include a critical analysis of your results.
You will write your work up as an academic paper, comparing and analysing your results from the different stages of the data preprocessing, analysis, and modelling pathway. You will also need to compare with different models or state-of-the-art methods.
You must present your paper in IEEE format, using a template available at: https://www.ieee.org/conferences/publishing/templates.html
Paper Structure and Mark Allocation
Your paper should be organised according to the following structure. The mark allocation for each section is shown in brackets.
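1. Title and Abstract (3%)
Provide a title and abstract that clearly reflect the content of the paper. The abstract should concisely summarise the problem, dataset, methodology, and key findings.
2. Introduction (5%)
Introduce the dataset and the research problem being investigated. Clearly state the research question(s) and the objectives of the analysis.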
3. Literature Review (5%)
Review relevant work from existing research that applies similar data analysis or modelling techniques, particularly studies that use the same or similar datasets. Highlight key methods and findings from the literature.
4. Methodology (20%)
Describe the proposed methodology for your data analysis and modelling pipeline. This should include:
5. Experimental Results and Analysis (20%)
Present the experimental process and results. This should include:
6. Discussion (15%)
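Interpret your results critically and discuss them in depth. Compare and analyse the performance of the different approaches, and link your findings to the research question and the literature where appropriate.
7. Conclusion and Future Work (10%)
Summarise the main findings of your work and reflect on the effectiveness of the proposed approaches. Acknowledge the limitations of the study and suggest possible improvements or directions for future research.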
8. References (2%)
All sources must be properly cited using an appropriate academic referencing style (i.e., IEEE format).
Code Submission
In addition to the written report, you must submit a single Jupyter Notebook (.ipynb) containing all code used in the analysis.
The notebook should:
• Demonstrate the complete data modelling and analysis workflow, including:
  • Data preparation and preprocessing
  • Exploratory data analysis (EDA)
  • Feature extraction or transformation
• Allow the results presented in the paper to be fully reproducible
Your submission will be evaluated based on whether the notebook can be successfully executed to reproduce the reported results.
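One common way to make a notebook reproducible is to fix every source of randomness with a single seed and to run the whole workflow from one entry point. The sketch below illustrates the idea using only the Python standard library. Everything in it is a placeholder assumption rather than a required design: the synthetic two-class data, the two toy models (a majority-class baseline and a one-nearest-neighbour classifier), and the names `SEED` and `run_experiment`. In a real notebook you would typically load your chosen dataset and use a modelling library, passing the same seed to its own randomness parameters (for example, `random_state` in scikit-learn).

```python
import random
from collections import Counter

SEED = 42  # fixed seed so that every run yields the same split and scores


def make_toy_data(n, rng):
    """Synthetic two-class data as ((x1, x2), label) rows. Placeholder only."""
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = 2.0 if label else 0.0
        rows.append(((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)), label))
    return rows


def train_test_split(rows, test_fraction, rng):
    """Shuffle a copy of the rows and split it into train and test parts."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def majority_baseline(train):
    """Model that always predicts the most frequent training label."""
    most_common = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda x: most_common


def nearest_neighbour(train):
    """Model that predicts the label of the closest training point."""
    def predict(x):
        def squared_distance(row):
            return (row[0][0] - x[0]) ** 2 + (row[0][1] - x[1]) ** 2
        return min(train, key=squared_distance)[1]
    return predict


def accuracy(model, test):
    """Fraction of test rows whose label the model predicts correctly."""
    return sum(model(x) == y for x, y in test) / len(test)


def run_experiment(seed=SEED):
    """Run data preparation, training, and evaluation from a single seed."""
    rng = random.Random(seed)
    data = make_toy_data(200, rng)
    train, test = train_test_split(data, 0.25, rng)
    return {
        "majority_baseline": accuracy(majority_baseline(train), test),
        "nearest_neighbour": accuracy(nearest_neighbour(train), test),
    }


if __name__ == "__main__":
    for name, score in run_experiment().items():
        print(f"{name}: {score:.3f}")
```

Because all randomness flows through one seeded generator, calling `run_experiment()` twice returns identical scores, which is the property a marker relies on when re-running the notebook.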
The aim of this coursework is to provide practical experience in working with a real-world dataset and applying the full data modelling and analysis pipeline, from initial data preparation through to model development and evaluation.
Assessment Criteria
The main assessment criteria for the paper are:
| Section | Weightage % | Criteria |
|---|---|---|
| Title and Abstract | 3 | The title and abstract clearly reflect the content of the paper. The abstract concisely summarises the problem, dataset, methodology, and key findings. |
| Introduction | 5 | The dataset and research problem are clearly introduced. The dataset is appropriately described, and the research question(s) are clearly stated and relevant to the context of the dataset. |
| Literature Review | 5 | Relevant and recent research papers are identified and discussed. The approaches and key findings of existing studies using similar datasets or methods are clearly summarised. |
| Methodology | 20 | Appropriate methods are selected for data preprocessing, analysis, and modelling. The methodological choices are clearly explained and justified. Any enhancements, modifications, or innovative aspects of the proposed approach are clearly described. |
| Experimental Results and Analysis | 20 | The techniques are implemented correctly and the experimental process is clearly presented. Multiple approaches are implemented and compared where appropriate. Results are presented clearly using suitable tables, charts, or visualisations. |
| Discussion | 15 | Results are interpreted critically and discussed in depth. The performance of different approaches is compared and analysed, and findings are linked to the research question and literature where appropriate. |
| Conclusion and Future Work | 10 | The work is clearly summarised and the main findings are highlighted. Limitations of the study are acknowledged, and reasonable suggestions for future improvements or research directions are provided. |
| References | 2 | Relevant academic references are included and cited correctly using an appropriate referencing style (e.g., IEEE). |
| Python Code Implementation | 20 | The submitted Jupyter Notebook is well structured, clearly commented, and easy to follow. Variable and function names are meaningful and consistent. The code demonstrates the full data modelling workflow, including data preprocessing, exploratory data analysis, model training, and evaluation. Evidence of appropriate modelling practices (e.g., preprocessing, parameter tuning, or model comparison) is expected. The code should reproduce the results reported in the paper. |
Students must ensure that all submitted work is their own original work. Academic integrity is a fundamental principle of the University of Nottingham, and all coursework must comply with the University of Nottingham Ningbo China (UNNC) Academic Integrity Policy and the School of Computer Science AI Use Policy.
Plagiarism, collusion, or copying work from other students or external sources is considered a serious academic offence and may result in a deduction of marks, a mark of zero, or further disciplinary action in accordance with University regulations.
If you use external sources such as academic papers, datasets, documentation, or online resources, you must clearly acknowledge and cite them appropriately.
Students must also ensure that their coursework does not overlap with work submitted for other modules or previous projects. In particular, you must not reuse the same problem formulation, dataset, or a substantial portion of work that has already been submitted for assessment in another module or project. Double submission (submitting the same or substantially similar work for multiple assessments) is strictly prohibited and will be treated as a breach of academic integrity.
The School of Computer Science recognises that Artificial Intelligence tools (e.g., generative AI systems) may be used in some contexts. However, students must ensure that any use of such tools complies with the UNNC and School of Computer Science AI Use Policy. Any permitted use of AI tools must be transparent and properly acknowledged, and students remain fully responsible for the accuracy, originality, and integrity of the submitted work.
Your coursework should demonstrate independent thinking, appropriate methodological design, and your own implementation of the data analysis and modelling pipeline.
If you are unsure whether your work complies with the academic integrity or AI use policies, you should consult the module convenor before submitting your work.
2026-04-02