
MTH765P (2021)

Question 1  [2 marks].

(a) Given the text below, for each of the given regular expressions, write down all the matches. You can assume there is a newline at the end of each line. You should consider this as though you were looking for matches in Python rather than grep (so it is not line based). [1]

Dark  Side  of  the Moon

4  seasons

37  steps    sarcastic  <--home-->

(i)  [a-z]\s[A-Z]

(ii)  \n[0-9].s

(iii)  .{2}-.{2,5}-

(iv)  <.*>

(v)  (s-z|0-9){2}(\s|-).*?\n
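The distinction between Python and grep matters because Python's `re.findall` scans the whole string at once, so whitespace and newline escapes can match across line boundaries. The following is a minimal sketch on made-up strings (not the exam text); the sample strings and variable names are assumptions chosen for illustration only.

```python
import re

# Hypothetical two-line sample, invented for illustration.
sample = "ab Cd\n7 sisters\n"

# \s matches any whitespace, including the newline between the two lines.
cross_line = re.findall(r"[a-z]\s[A-Z0-9]", sample)

# A pattern may start with \n, which a line-based tool would never see.
newline_prefixed = re.findall(r"\n[0-9].s", sample)

# Greedy versus lazy quantifiers on a second hypothetical string.
tags = "<a> and <b>"
greedy = re.findall(r"<.*>", tags)   # takes the longest possible match
lazy = re.findall(r"<.*?>", tags)    # stops at the first closing bracket

print(cross_line, newline_prefixed, greedy, lazy)
```

Note that `.` does not match a newline unless `re.DOTALL` is set, so `.*` never runs past the end of a line here.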

(b) Write down a regular expression which matches all web queries to UK servers in a log file. The structure of the queries is described by the criteria below. You are using grep, so you do not need to worry about lazy or greedy matches; each line is tested separately and there is at most one address per line. Provide a short explanation of the regular expression you write down. [1]

– The query is part of a web address, so it begins with http:// or https://

– The web address has co.uk at the end (you do not need to verify that it is a proper web address)

– Following the web address, there is a slash followed by the word query, followed by a question mark

– Then there is at least one search field given which contains =
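The criteria above can be sketched as one pattern tested line by line, which mimics how grep works. This is only one possible reading: the use of `\S*` assumes that addresses and search fields contain no whitespace, and the log lines are invented for illustration.

```python
import re

# One possible pattern built from the four criteria.
# Assumption: no whitespace inside the address or the search fields.
UK_QUERY = re.compile(r"https?://\S*co\.uk/query\?\S*=")

# Hypothetical log lines, invented for illustration.
log_lines = [
    "GET https://www.example.co.uk/query?term=maths 200",
    "GET http://example.com/query?term=maths 200",
    "GET https://shop.example.co.uk/query? 404",
]

# Test each line separately, as grep would.
matching = [line for line in log_lines if UK_QUERY.search(line)]
print(matching)
```

Only the first line satisfies all four criteria: the second is not a co.uk address, and the third has no search field containing `=`.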

Question 2 [4 marks].

(a) You are given a SQL database on Mathematicians with the following tables and columns:

Name              Columns
mathematicians    id, name, area, institute_id, birth_year
institutions      id, name, city, country
academics         institution_id, professor_id, starting_year

Write SQL queries which will return the results described below:

(i) The names of topologists born between 1900 and 1950. [1]

(ii) All institution names with a professor who started after 1980. [1]

(iii) The average birth year of geometers (geometry) at institutions in the UK. [1]

(iv) The earliest starting year for mathematicians born in 1980. [1]
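Queries of this kind can be tried out against an in-memory SQLite database. The sketch below assumes underscore column names (`institute_id`, `birth_year`, and so on), assumes that topologists are stored with `area = 'topology'`, and uses invented rows; it illustrates a filter and a join rather than a definitive answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mathematicians (id INTEGER, name TEXT, area TEXT,
                             institute_id INTEGER, birth_year INTEGER);
CREATE TABLE institutions (id INTEGER, name TEXT, city TEXT, country TEXT);
CREATE TABLE academics (institution_id INTEGER, professor_id INTEGER,
                        starting_year INTEGER);
-- Hypothetical rows, invented for illustration only.
INSERT INTO mathematicians VALUES
  (1, 'A. Example', 'topology', 1, 1920),
  (2, 'B. Sample',  'topology', 2, 1960),
  (3, 'C. Test',    'geometry', 1, 1930);
INSERT INTO institutions VALUES
  (1, 'Example University', 'London', 'UK'),
  (2, 'Sample Institute', 'Paris', 'France');
INSERT INTO academics VALUES (1, 1, 1950), (2, 2, 1990), (1, 3, 1965);
""")

# Filter on area and birth year; BETWEEN includes both endpoints.
topologists = conn.execute(
    "SELECT name FROM mathematicians "
    "WHERE area = 'topology' AND birth_year BETWEEN 1900 AND 1950"
).fetchall()

# Join institutions to academics; DISTINCT avoids repeated names.
recent_institutions = conn.execute(
    "SELECT DISTINCT i.name FROM institutions AS i "
    "JOIN academics AS a ON a.institution_id = i.id "
    "WHERE a.starting_year > 1980"
).fetchall()

print(topologists, recent_institutions)
```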

(b) Given the following two tables:

mathematicians

id    last_name       birth_year    institute_id
1     Gauss           1777          1
2     Riemann         1826          1
3     Serre           1926          5
4     Grothendieck    1928          3

institutions

id    name                       country
1     University of Gottingen    Germany
2     Princeton University       USA
3     IHES                       France
4     Oxford University          UK

Write down the returned table (note that you can abbreviate the text):

(i) mathematicians INNER JOIN institutions ON institutions.id=mathematicians.institute_id;

(ii) mathematicians OUTER JOIN institutions ON institutions.id=mathematicians.institute_id;

(iii) institutions LEFT JOIN mathematicians ON institutions.id=mathematicians.id;

(iv) institutions RIGHT JOIN mathematicians ON institutions.birth_year=mathematicians.name;
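The behaviour of the different join types can be explored with Python's built-in `sqlite3` module. The sketch below loads the two tables from part (b), assuming underscore column names, and compares an inner join with a left join taken from the mathematicians side (which is not one of the four joins listed above). Only two columns are selected to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mathematicians (id INTEGER, last_name TEXT,
                             birth_year INTEGER, institute_id INTEGER);
CREATE TABLE institutions (id INTEGER, name TEXT, country TEXT);
INSERT INTO mathematicians VALUES
  (1, 'Gauss', 1777, 1), (2, 'Riemann', 1826, 1),
  (3, 'Serre', 1926, 5), (4, 'Grothendieck', 1928, 3);
INSERT INTO institutions VALUES
  (1, 'University of Gottingen', 'Germany'), (2, 'Princeton University', 'USA'),
  (3, 'IHES', 'France'), (4, 'Oxford University', 'UK');
""")

# INNER JOIN keeps only rows with a match on both sides.
inner = conn.execute(
    "SELECT m.last_name, i.name FROM mathematicians AS m "
    "INNER JOIN institutions AS i ON i.id = m.institute_id ORDER BY m.id"
).fetchall()

# LEFT JOIN keeps every row of the left table; unmatched columns become NULL.
left = conn.execute(
    "SELECT m.last_name, i.name FROM mathematicians AS m "
    "LEFT JOIN institutions AS i ON i.id = m.institute_id ORDER BY m.id"
).fetchall()

print(inner)
print(left)
```

Serre's `institute_id` of 5 has no counterpart in institutions, so that row is dropped by the inner join and padded with `None` by the left join.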

Question 3 [4 marks]. You would like to visualise how you spend your time at home. You have used your phone to record where you are at every minute. You have a floor plan of your home (assume your phone has done the translation so that the x and y coordinates match up, and that the flat has only one floor). You create a 2D histogram of the points representing the probability distribution of your locations. Answer the questions below in a few sentences.

(a) What kind of information can this visualisation convey effectively?

(b) What are some potential problems or drawbacks of this visualisation? State and describe two.
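To make the construction concrete, a 2D histogram can be computed by binning each position into a square grid cell and normalising the counts. The sketch below uses only the standard library; the function name `histogram_2d`, the cell size, and the sample positions are assumptions for illustration.

```python
from collections import Counter

def histogram_2d(points, cell_size):
    """Bin (x, y) points into square cells; return cell -> probability."""
    counts = Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in points
    )
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}

# Hypothetical positions in metres, one per recorded minute.
points = [(0.5, 0.5), (0.7, 0.2), (3.1, 4.9), (0.9, 0.9)]
probabilities = histogram_2d(points, cell_size=1.0)
print(probabilities)
```

The choice of `cell_size` controls the trade-off between spatial detail and noise, and all timing information is discarded once the points are binned.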

You now create a second visualisation. Since you have many measurements, you first take only every 10th measurement. You create a scatter plot of the locations of the measurements, using the color of the points to represent the time of day of the measurement.

(c) What would be a good way of assigning colors to time of day? Explain why your solution would work and include in your answer how you would generate the colors, i.e. given a time of day, what is the corresponding color.

(d) Compare this visualisation with the one above in terms of the information it conveys.

[1]

[1]
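One possible approach to the color assignment in part (c) is a cyclic mapping, since time of day wraps around at midnight. The sketch below maps the fraction of the day elapsed to a hue on the HSV color wheel using the standard `colorsys` module; the function name `time_to_rgb` is hypothetical, and other cyclic color maps would serve equally well.

```python
import colorsys

def time_to_rgb(hour, minute=0):
    """Map a time of day to an RGB triple on a cyclic hue wheel."""
    # Fraction of the day elapsed, in [0, 1).
    fraction = ((hour * 60 + minute) % 1440) / 1440
    # Full saturation and brightness; only the hue varies with time.
    return colorsys.hsv_to_rgb(fraction, 1.0, 1.0)

midnight = time_to_rgb(0)
noon = time_to_rgb(12)
late = time_to_rgb(23, 59)
print(midnight, noon, late)
```

Because hue is periodic, 23:59 and 00:00 receive nearly identical colors, which a linear light-to-dark scale would not achieve. Hue is not perceptually uniform, though, so equal time differences do not always look equally different.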


Question 4  [4 marks].

(a) Given the following JSON file

{
  'test': [
    {
      'x': 4,
      'y': 20
    },
    {
      'x': 23,
      'y': 7
    },
  ],
  'debug': {
    {
      'x': 6,
      'y': 24
    },
  },
  'production': ['2', '3']
}

It is stored in a file called tmp.json. You read it in using:

X = json.load(open('tmp.json'))

Write down how you would query the following values (i.e. write the query which returns this as the answer, these are unique):

(i) 24 [1]

(ii) '2' [1]

(iii) 4 [1]
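Nested JSON is queried in Python by chaining dictionary keys and list indices. The sketch below uses a hypothetical, strictly valid variant of the file (double quotes, no trailing commas, and `debug` written as a list); those changes are assumptions, so the exact index expressions may differ for the file as printed.

```python
import json

# Hypothetical valid JSON with a similar shape; "debug" as a list is an assumption.
text = """
{
  "test": [{"x": 4, "y": 20}, {"x": 23, "y": 7}],
  "debug": [{"x": 6, "y": 24}],
  "production": ["2", "3"]
}
"""
X = json.loads(text)

# Dictionaries are indexed by key, lists by position.
first_x = X["test"][0]["x"]
debug_y = X["debug"][0]["y"]
first_production = X["production"][0]

print(first_x, debug_y, first_production)
```

Note that `json.loads` parses a string, whereas `json.load` expects an open file object rather than a file name.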

(b) Write out the following table as a JSON file, without including missing values. Give a brief explanation of how you encoded it (i.e. dictionaries, lists, etc.). There is more than one correct answer. [1]

month       temp    humidity    UV index

January     13
March       10    25    5
Nov.        5    20
Dec         0    3
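One common encoding is a list of dictionaries, one per row, where a key is simply left out when its value is missing. The sketch below shows this on invented rows (not the table above); the months and values are hypothetical, and `None` stands for a missing cell.

```python
import json

# Hypothetical rows; None marks a cell that is missing in the source table.
rows = [
    {"month": "April", "temp": 12, "humidity": None, "UV index": 4},
    {"month": "May", "temp": 16, "humidity": 30, "UV index": 6},
]

# A list keeps the row order; each dictionary holds only the known values.
records = [{k: v for k, v in row.items() if v is not None} for row in rows]
encoded = json.dumps(records, indent=2)
print(encoded)
```

An alternative is a dictionary keyed by month, which makes lookups by month direct but does not guarantee row order in every JSON consumer.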