
Computer Science

Summative Assignment

Module code and title: COMP3517 Computational Modelling in the Humanities and Social Sciences
Academic year: 2023-24
Coursework title: Written Report
Coursework credits: 10 credits
% of module’s final mark: 100%
Submission date*: Tuesday, January 23, 2024 14:00
Estimated hours of work: 20 hours
Submission method: Turnitin
Additional coursework files: N/A
Required submission items and formats: Written report as a PDF or Word file via Turnitin; code used to carry out the analyses described in the report as a .zip or .tar.gz archive.

* This is the deadline for all submissions except where an approved extension is in place.

Late submissions received within 5 working days of the deadline will be capped at 40%. Late submissions received later than 5 days after the deadline will receive a mark of 0.

It is your responsibility to check that your submission has uploaded successfully and obtain a submission receipt.

Your work must be done by yourself (or your group, if there is an assigned groupwork component) and comply with the university rules about plagiarism and collusion. Students suspected of plagiarism, either of published or unpublished sources, including the work of other students, or of collusion will be dealt with according to University guidelines (https://www.dur.ac.uk/learningandteaching.handbook/6/2/4/).

Summative Assignment 2023

COMP3517: Computational Modelling in the Humanities and Social Sciences

The House of Commons SPARQL endpoint (https://api.parliament.uk/sparql/) provides access to structured data about Members of Parliament (MPs) and questions that have been asked in the House of Commons. Use this data together with data from Wikidata (where necessary) to answer the following questions, presenting your findings in the form of a report of up to 3000 words, based on data about questions asked by MPs between 1 January 2023 and 30 September 2023 inclusive:

Q1. To what extent do Members of Parliament (MPs) tend to ask questions that directly reference their own constituency or a location in it? You should answer this question by identifying named entities that refer to places or identifiable geographical features (e.g. “Dartford Crossing”, “Reading Gaol”, etc.) in asked questions, and determining whether or not these are located in the MP’s constituency using data from Wikidata. You are free to choose any reasonable method for doing so (e.g. relying on Wikidata property P131 “located in the administrative territorial entity”), even if it results in some false negatives; however, you should estimate how reliable you believe your chosen approach is. (N.B. it is not expected or required that your approach will result in 100% accuracy.)

Q2. By applying LDA topic modeling and analyzing the results, what (if any) identifiable regional differences are there in the types of questions asked? For example, do MPs representing constituencies located in the North of England tend to ask more questions about certain topics than those representing constituencies in Southeast England? In answering this question, you should start by aggregating data into regions larger than an electoral district, such as those denoted by the Wikidata item “region of England” (https://www.wikidata.org/wiki/Q48091). (For simplicity, you may treat Scotland and Northern Ireland as two separate regions without further subdivisions, or alternatively use any reasonable administrative subdivisions for these regions as you see fit.) Discuss the assumptions and limitations of your approach and analysis.
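As a purely illustrative sketch (not a required or recommended method), the two steps named in Q1 and Q2 could be prototyped offline as follows. The first helper walks “located in the administrative territorial entity” (P131) links upwards from a place to see whether a target area is reached; the second averages per-question topic proportions within each region. The identifiers, the `p131` mapping, and the topic weights are hypothetical toy data, and the topic weights are assumed to come from whichever LDA implementation you choose. Whether a constituency item is ever reached through P131 chains in Wikidata is something you would need to check, since this affects the false-negative rate.

```python
from collections import defaultdict


def located_in(place, target, p131, max_depth=10):
    """Return True if `target` equals `place` or is reachable via P131 links.

    `p131` maps an identifier to the list of areas it is recorded as being
    located in (hypothetical data, e.g. pre-fetched from Wikidata).
    """
    if place == target:
        return True
    frontier, seen = [place], {place}
    for _ in range(max_depth):  # depth cap guards against cyclic data
        next_frontier = []
        for node in frontier:
            for parent in p131.get(node, ()):
                if parent == target:
                    return True
                if parent not in seen:
                    seen.add(parent)
                    next_frontier.append(parent)
        if not next_frontier:
            return False
        frontier = next_frontier
    return False


def mean_topic_share_by_region(doc_topics, doc_region):
    """Average per-question topic proportions within each region."""
    sums = {}
    counts = defaultdict(int)
    for doc_id, weights in doc_topics.items():
        region = doc_region.get(doc_id)
        if region is None:  # asker could not be mapped to a region: skipped
            continue
        if region not in sums:
            sums[region] = [0.0] * len(weights)
        for k, weight in enumerate(weights):
            sums[region][k] += weight
        counts[region] += 1
    return {r: [s / counts[r] for s in sums[r]] for r in sums}


# Toy P131 chain: a bridge is in a town, which is in a county.
p131 = {"place:bridge": ["area:town"], "area:town": ["area:county"]}
print(located_in("place:bridge", "area:county", p131))
print(located_in("place:bridge", "area:elsewhere", p131))

# Toy topic proportions for three questions over two topics.
doc_topics = {"q1": [0.75, 0.25], "q2": [0.25, 0.75], "q3": [0.125, 0.875]}
doc_region = {"q1": "North", "q2": "North", "q3": "South"}
print(mean_topic_share_by_region(doc_topics, doc_region))
```

Questions that are skipped because no region could be found, and places with no P131 statement at all, are exactly the kind of missing data whose effect the report should try to quantify.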

The following SPARQL query can be used as a starting point:

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT *
WHERE {
  ?question <https://id.parliament.uk/schema/writtenQuestionIndexingAndSearchUin> ?qnum .
  ?person <https://id.parliament.uk/schema/askingPersonHasQuestion> ?question .
  ?question <https://id.parliament.uk/schema/questionText> ?text .
  ?question <https://id.parliament.uk/schema/questionAskedAt> ?date .
  FILTER (?date >= "2023-01-01+00:00"^^xsd:dateTime && ?date < "2023-10-01+00:00"^^xsd:dateTime)
}

This query returns the following data:

?question    Entity representing a question
?person      Entity representing the person who asked the question
?qnum        Numerical identifier for this question
?text        The text of the question
?date        The date the question was asked
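For illustration only, one way to collect these variables in Python is sketched below. It assumes that the endpoint accepts a standard SPARQL Protocol GET request with a `query` parameter and can return the SPARQL 1.1 JSON results format; neither has been verified here, so check the endpoint's own documentation. The helper names and the sample payload values are hypothetical placeholders, not real data.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://api.parliament.uk/sparql/"  # URL given in the assignment


def build_request(query, endpoint=ENDPOINT):
    """Build a GET request asking for JSON results (assumed to be supported)."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def bindings_to_rows(payload):
    """Flatten a SPARQL 1.1 JSON results document into a list of dicts."""
    variables = payload["head"]["vars"]
    return [
        {var: binding[var]["value"] for var in variables if var in binding}
        for binding in payload["results"]["bindings"]
    ]


def run_query(query):
    """Send the query and return flattened rows (requires network access)."""
    with urllib.request.urlopen(build_request(query), timeout=60) as resp:
        return bindings_to_rows(json.load(resp))


# Offline demonstration: a hand-written payload in the standard JSON shape.
# All values below are placeholders invented for this example.
sample = {
    "head": {"vars": ["question", "person", "qnum", "text", "date"]},
    "results": {
        "bindings": [
            {
                "question": {"type": "uri", "value": "urn:example:question"},
                "person": {"type": "uri", "value": "urn:example:person"},
                "qnum": {"type": "literal", "value": "0000"},
                "text": {"type": "literal", "value": "Example question text"},
                "date": {"type": "literal", "value": "2023-01-01+00:00"},
            }
        ]
    },
}
rows = bindings_to_rows(sample)
print(rows[0]["text"])
```

Keeping the parsing step separate from the network call makes it possible to save the raw responses once and rerun the analysis without querying the endpoint again.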

Your report must:

1.   Document how data was collected, and briefly state any relevant modules or toolkits that were used. If using SPARQL queries not contained in your code (other than the one provided), you must include these as separate files within your code submission.

2.   Describe your implementation and the models constructed.

3.   Summarize your results in appropriate ways, including use of suitable visualizations.

4.   Critically evaluate the models. This must involve critical analysis of the adequacy of the modelling (e.g. what are the assumptions the models rely on, and do they all hold; are there factors or biases that might invalidate conclusions drawn?), but should also involve some comparison with external data and/or published social/political science research on relevant subjects.

5.   Discuss what conclusions can be drawn from the model.

In this assignment, please note that:

- You are not expected to resolve issues of lack of recorded data (or incorrectly recorded data) in additional data sources such as Wikidata. However, you are expected to account for the potential effects of these in your analysis, and should attempt to quantify these where possible.

- You are allowed to reuse existing code – this includes open source modules and toolkits, Stack Overflow responses, code described in online tutorials, etc., provided that whenever you do this, you clearly indicate what you have reused and exactly where it came from (including a URL for an online source). You should document smaller instances of reuse (e.g. short code examples from Stack Overflow) as comments in your code, and longer ones both in your code and as citations to the source in your report. Marks will be awarded for the original parts of your work (i.e. your extensions to and adaptations or developments of any reused material).

- Copying any material (this includes code and text) in your submission from any source without clear acknowledgement is likely to constitute plagiarism, which can have serious consequences as described in the Teaching and Learning Handbook: https://www.dur.ac.uk/learningandteaching.handbook/6/2/4/

- Marks will not be awarded under the “Technical depth/clarity of implementation” sections for parts of your code that reimplement code available in widely used libraries or modules.

- You may use any programming language suitable for the task; use of Python 3.x is recommended where possible.

- In writing your report, you should assume that the reader is familiar with everything covered during the lectures for this module. Concepts and techniques important to your implementation which were not introduced during the module should be explained.

- The word limit is exclusive of figures and references. You may include a maximum of 20 figures (recommended: around 5-10 figures). Please use a 12pt font, single-spacing, and A4 pages. Your report should begin with an introduction, and does not require a separate abstract.

Submission (2 files)

- A written report as a PDF file.

- An archive (.zip / .tar.gz) containing the code for your implementation, including instructions for how to run it in a README.TXT file. If any data that your code relies on is over 10 MB in size, include instructions in this file for how to obtain it and do not include it in the archive. Your code is not marked separately, but serves as supporting evidence for the criteria indicated in the marking scheme.

Marks (total 100)

Criterion                                                               Marks   Assessed by
Q1. Appropriateness, technical depth and clarity of implementation      15      Report and code
Q1. Results and evaluation                                              15      Report and code
Q1. Conclusions and interpretation of results                           15      Report
Q2. Appropriateness, technical depth and clarity of implementation      15      Report and code
Q2. Results and evaluation                                              15      Report and code
Q2. Conclusions and interpretation of results                           15      Report
Presentation (including academic writing, structure, and referencing)   10      Report