

GEOG0178 Machine Learning for Social Sciences with Python

COURSEWORK INSTRUCTIONS

COURSEWORK: Machine learning-data analysis project (max. 2000 words)

Deadline – noon 27 April, 2026

The objective of this coursework is to analyse a topic/dataset of your choice and provide a data-based answer to a research question by applying machine learning modelling. Any topic can be analysed (topics do not require approval from the module convenor). The final output of this assessment is:

(1) a coherent PDF report submitted under the link below:

or for DAP, EC or SoRA students

 

and (2) a Jupyter notebook (.ipynb file) that contains the Python code for the analysis and a data file (e.g., .csv file), jointly submitted in a single .zip file under the link below (for all students regardless of the submission date):

 

Please note that you can find the above links on the course Moodle page, under the Assessment tab.
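As an illustration of the single .zip requirement in item (2), the following is a minimal sketch using only the Python standard library. The function name `bundle_submission` and the file names in the usage comment are hypothetical placeholders, not names prescribed by the module. Only the bare file names are stored in the archive, so folder paths (which can contain a user name) are not written into it.

```python
import zipfile
from pathlib import Path


def bundle_submission(notebook, data_file, archive):
    """Write one notebook and one data file into a single .zip archive.

    Returns the list of file names stored in the archive so the result
    can be checked before uploading.
    """
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in (Path(notebook), Path(data_file)):
            # arcname=path.name stores the bare file name, not the full path.
            zf.write(path, arcname=path.name)
    with zipfile.ZipFile(archive) as zf:
        return zf.namelist()


# Hypothetical usage (these file names are placeholders):
# bundle_submission("analysis.ipynb", "data.csv", "submission.zip")
```

Any other zipping tool gives an equivalent result; the point is only that the notebook and the data file end up together in one archive.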

Report structure:

The report should be coherent and may be roughly structured as follows:

1.   Introduction

a.   Background, context and research question

2.   Brief literature-based overview of the topic and the research question

i.   Cite a minimum of 10 academic and/or policy papers

3.   Data and method

a.   Variables

b.   Why the selected machine learning model(s) are appropriate for the selected dataset/question. Main model premises.

c.   Discuss data cleaning/wrangling (if applicable and relevant)

4.   Interpretation and discussion of machine learning modelling results

a.   Exploratory data analysis (EDA)

b.   Model results and performance

i.   Comparison of models - students are required to run at least two machine learning models in their analysis and compare them. Note that the two (or more) models do not necessarily need to be different machine learning techniques (e.g., SVM as the 1st model and decision tree as the 2nd model). They can be two or more model variations conducted with the same machine learning technique (e.g., SVM with 8 variables as the 1st model and SVM with 5 variables as the 2nd model).


c.   Limitations and implications

5.   Conclusion

a.   Summary of the main findings

b.   How could the analysis/model be improved?

c.   Suggestions for further research within the topic
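The model-comparison requirement in section 4(b)(i) of the structure above can be sketched in Python as follows. This is an illustration under stated assumptions, not a recommended approach: it assumes scikit-learn is installed, uses a synthetic dataset from `make_classification` in place of real data, picks the five-variable subset arbitrarily (the first five columns), and uses 5-fold cross-validated accuracy as one possible comparison metric.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a real dataset: 300 rows, 8 explanatory variables.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5, n_redundant=0, random_state=0
)


def mean_cv_accuracy(model, features):
    """Return the mean 5-fold cross-validated accuracy of `model`."""
    return cross_val_score(model, features, y, cv=5, scoring="accuracy").mean()


# Same technique with different variable sets, plus a different technique.
candidates = {
    "SVM, 8 variables": (make_pipeline(StandardScaler(), SVC()), X),
    # The first five columns are an arbitrary subset chosen for illustration.
    "SVM, 5 variables": (make_pipeline(StandardScaler(), SVC()), X[:, :5]),
    "Decision tree, 8 variables": (
        DecisionTreeClassifier(max_depth=4, random_state=0),
        X,
    ),
}

results = {
    name: mean_cv_accuracy(model, features)
    for name, (model, features) in candidates.items()
}

for name, score in results.items():
    print(f"{name}: mean accuracy = {score:.3f}")
```

Evaluating every candidate with the same folds and the same metric keeps the comparison like-for-like. Which models, variables and metrics suit a given research question remains for each report to justify.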

Submission format:

The report should start with the UCL Geography cover page you can download from Moodle.

The report must be submitted as a PDF document with text in font size 11 or 12, written fully in complete sentences (e.g. not using bullet points) and not including any Python code. The report’s maximum length is 2,000 words, which you are free to divide in any way between the sections and subsections. The following will NOT be included in the word count:

•    Assignment title

•    Author name/examination code

•    Page numbers

•    Reference lists

•    Footnotes, but only when used to reference primary source material.

•    Figure captions and table titles. A figure caption or table title should be restricted to a succinct description of the figure or table to which it refers. Figures and tables themselves are not counted as part of the assignment length.

•    Appendix (if applicable)

The maximum number of figures is 10 in total (multiple sub-figures used to make the same point are allowed) and the relevance of these figures should be explained in your write-up.

The submission deadline is noon on 27 April, 2026. Both the coursework report and the zipped notebook and data file must be submitted anonymously. The submission title should be your Exam Candidate ID, NOT your name, student number or essay title.

Queries:

All coursework-related queries can be posted on Moodle (“Ask a question” under the “Keep in touch” tab). Students' questions are likely to overlap, and posting them there means that all students benefit from any clarification that is given. Note that, as this is an assessed piece of work, you may not ask questions that pertain directly to the coursework itself, e.g. “Is model X the best way to answer question Y?” Other types of questions (e.g., regarding structure, datasets, etc.) are more than welcome. The rules for plagiarism apply. The deadline for any questions to be asked and answered is 22 April, 2026, i.e. 5 days before the submission deadline (27 April, 2026).