
MTH765P (2022)

Question 1 [33 marks].

(a)  Consider the following text,

I needed a new heel for my shoe, so I decided to go to Morganville which is what they called Shelbyville in those days. So, I tied
an onion to my belt which was the style at the time. Now, to take the ferry cost $0.05. And in those days, $0.05 had pictures
of bumblebees on em. ‘Give me five bees for a quarter,’ youd say.

For each query below, write out all the matches. You can assume each line ends with \n. Write down the exact match, so the query ’an onion’ would return an onion, rather than the whole line. For long matches, you can indicate only the beginning and the end of the match, as long as it is clear where they occur in the text. Empty matches are possible.

(ii)  ’\’.+?\s’  [5]

(iii)  ’[a-zA-Z]{2,4}?[, |\ .]’  [5]
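
The matching convention described in part (a) can be tried out with Python's re module. The sketch below is illustrative only: it runs the example query from the instructions on one sentence of the text, and uses a made-up pattern x* on a made-up string to show what empty matches look like. The variable names are assumptions, and none of the examined queries is attempted.

```python
import re

# One sentence of the passage, ending with a newline as the question assumes.
line = "So, I tied an onion to my belt which was the style at the time.\n"

# findall returns the exact matched text, not the whole line.
exact = re.findall(r"an onion", line)

# A pattern that can match zero characters produces empty matches
# at every position where nothing longer can be matched.
empty = re.findall(r"x*", "ab")

print(exact)
print(empty)
```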

(b) For each pair of regular expressions, give an example for which the first query matches but the second does not (or vice versa). If they are equivalent, give a brief explanation why (a few sentences).

(ii)  [x|y|z|X|Y|Z] .+[1-3]{1} and  [x-zX-Z]  *[1 |2|3]  [5]

(c)  Create a regular expression that will match all integers where the digits are in numerical order. For example, 0137899 should match but 7289 should not. The empty string is allowed.  [8]
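
A candidate answer to part (c) can be checked against a plain, non-regex test of the same property. The sketch below assumes that "numerical order" means no digit is smaller than the one before it, which is consistent with 0137899 being accepted despite the repeated 9. The function names digits_in_order and agrees, and the sample list, are hypothetical.

```python
import re

def digits_in_order(s):
    """True if s contains only digits and no digit is smaller than its predecessor."""
    if not all(c in "0123456789" for c in s):
        return False
    # The empty string and single digits pass because there are no pairs to compare.
    return all(a <= b for a, b in zip(s, s[1:]))

def agrees(pattern, samples):
    """True if the candidate regex fully matches exactly the samples the reference test accepts."""
    compiled = re.compile(pattern)
    return all((compiled.fullmatch(s) is not None) == digits_in_order(s) for s in samples)

samples = ["0137899", "7289", "", "5", "10", "112233"]
print([digits_in_order(s) for s in samples])
```

Calling agrees with a candidate pattern and a larger sample list gives a quick sanity check, though it cannot prove a pattern correct.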

Question 2 [23 marks]. You are given the Vehicle-themed SQL database consisting of the three tables below:

Shows

id  Name                   Transport  Driver  Year
1   Fast and the Furious   2          2       2001
2   Smokey and the Bandit  1          4       1977
3   Knight Rider           1          3       1982
4   A-Team                 4          5       1983

Vehicles

id  Maker         Model
1   Pontiac       Trans Am
2   Dodge         Charger
3   Toyota        Supra
4   GMC           Vandura (Van)
5   Aston Martin  Vanquish

Characters

id  Name                         Actor
1   James Bond                   George Lazenby
2   Dom Toretto                  Vin Diesel
3   Michael Knight               David Hasselhoff
4   Bo “Bandit” Darville         Burt Reynolds
5   Bosco Albert “B.A.” Baracus  Mr. T

(a) What does the following query return? You may abbreviate the entries.

SELECT * FROM Shows
INNER JOIN Vehicles ON Vehicles.id = Transport;

(b) Write queries which return

(i) the make of the vehicle which transports the A-Team;

(ii) the actors who drive a Trans Am;

(iii) the make and model of the car that Dom Toretto drives.
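
To experiment with the queries in this question, the three tables can be loaded into an in-memory SQLite database. The sketch below is one possible setup rather than the environment the question assumes: the column types, the use of SQLite, and the variable names are all assumptions, and only the query from part (a) is run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Column types are assumed; the question only gives column names and values.
cur.execute("CREATE TABLE Shows (id INTEGER, Name TEXT, Transport INTEGER, Driver INTEGER, Year INTEGER)")
cur.execute("CREATE TABLE Vehicles (id INTEGER, Maker TEXT, Model TEXT)")
cur.execute("CREATE TABLE Characters (id INTEGER, Name TEXT, Actor TEXT)")

cur.executemany("INSERT INTO Shows VALUES (?, ?, ?, ?, ?)", [
    (1, "Fast and the Furious", 2, 2, 2001),
    (2, "Smokey and the Bandit", 1, 4, 1977),
    (3, "Knight Rider", 1, 3, 1982),
    (4, "A-Team", 4, 5, 1983),
])
cur.executemany("INSERT INTO Vehicles VALUES (?, ?, ?)", [
    (1, "Pontiac", "Trans Am"),
    (2, "Dodge", "Charger"),
    (3, "Toyota", "Supra"),
    (4, "GMC", "Vandura (Van)"),
    (5, "Aston Martin", "Vanquish"),
])
cur.executemany("INSERT INTO Characters VALUES (?, ?, ?)", [
    (1, "James Bond", "George Lazenby"),
    (2, "Dom Toretto", "Vin Diesel"),
    (3, "Michael Knight", "David Hasselhoff"),
    (4, 'Bo "Bandit" Darville', "Burt Reynolds"),
    (5, 'Bosco Albert "B.A." Baracus', "Mr. T"),
])
conn.commit()

# The query from part (a), executed as written.
rows = cur.execute(
    "SELECT * FROM Shows INNER JOIN Vehicles ON Vehicles.id = Transport"
).fetchall()
for row in rows:
    print(row)
```

The same cursor can be reused to try candidate queries for part (b).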

Question 3 [22 marks]. You need to choose colours in order to visualise multiple classes. However, there are too many classes to use hue alone to distinguish between them, i.e. the colours would look too similar. A solution has been suggested where both hue and saturation are used.

(a) Assuming we need N colours, one suggested approach is to first create a vector x of N equally spaced values from 0 to 1, e.g. if N = 6, then x would be [0, 0.2, 0.4, 0.6, 0.8, 1]. Using the vector, generate the i-th colour using the formulas:

hue(i) = x(i)

saturation(i) = x(i)

Sketch what colours are chosen on a diagram where the hue is given by the angle and the saturation is given as the radius.  [7]

(b) A second approach is to consider an N x N grid of equally spaced values from 0 to 1, which we denote by a pair (x1 , x2 ). Each pair of values occurs exactly once. For example, if N = 6, then (0, 0.6) and (0.6, 0) would both appear in the grid.

hue(i) = x1 (i)

saturation(i) = x2 (i)

(i) Sketch what colours are chosen on a diagram where the hue is given by the angle and the saturation is given as the radius for N = 4.  [7]

(ii) Describe an alternative approach to visualising many classes. A few sentences are sufficient. Hint: There are many possible answers.  [8]
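
The two colour-selection schemes in parts (a) and (b) can be written down in a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, the full hue range 0 to 1 is assumed to map onto one full turn of the diagram, and the brightness (value) is fixed at 1.0 because the question specifies only hue and saturation.

```python
import colorsys
import math
from itertools import product

def spaced(n):
    """n equally spaced values from 0 to 1 inclusive (requires n >= 2)."""
    return [i / (n - 1) for i in range(n)]

def diagonal_scheme(n):
    """Part (a): colour i uses hue(i) = saturation(i) = x(i)."""
    return [(x, x) for x in spaced(n)]

def grid_scheme(n):
    """Part (b): every (hue, saturation) pair of an n x n grid, each exactly once."""
    return list(product(spaced(n), repeat=2))

def to_polar(hue, saturation):
    """Diagram position: hue as an angle in radians (assumed full turn), saturation as radius."""
    return (2 * math.pi * hue, saturation)

def to_rgb(hue, saturation, value=1.0):
    """Convert to RGB; value = 1.0 is an assumption, not part of the question."""
    return colorsys.hsv_to_rgb(hue, saturation, value)

for h, s in diagonal_scheme(6):
    angle, radius = to_polar(h, s)
    print(f"hue={h:.1f} sat={s:.1f} angle={math.degrees(angle):.0f} deg radius={radius:.1f}")

print(len(grid_scheme(4)), "colours in the grid for N = 4")
```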

Question 4 [22 marks]. Consider the following dataset of measurements, which are denoted by X.

index  1   2    3  4   5   6   7  8  9   10  11   12
X      -6  -22  7  10  17  30  0  2  -1  5   -10  10

We define the lower and upper hinges as the third and tenth value out of 12 samples (with the index beginning with 1).

(a)  Compute the H-spread and the inner fences.  [6]

(b)  Compute the mean of the dataset after the following transformations:

(ii) The values outside of the hinges are set to the value of the hinge which is closest.  [5]

(c) How much can you change the H-spread by changing one value in the dataset? Write down which value you should change, what you should change it to, and the resulting difference in H-spread.  [6]
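
The quantities used in this question can be computed with a short script, which is handy for checking hand calculations or for exploring part (c) by editing one value. The sketch rests on two assumptions that the question does not state explicitly: the third and tenth values are taken from the data sorted in ascending order, and the inner fences lie 1.5 H-spreads beyond the hinges (the usual Tukey convention). All function names are hypothetical.

```python
X = [-6, -22, 7, 10, 17, 30, 0, 2, -1, 5, -10, 10]

def hinges(data, lower_rank=3, upper_rank=10):
    """Hinges as the values at the given 1-based ranks of the sorted data (assumed)."""
    ordered = sorted(data)
    return ordered[lower_rank - 1], ordered[upper_rank - 1]

def h_spread(data):
    """Distance between the upper and lower hinge."""
    lower, upper = hinges(data)
    return upper - lower

def inner_fences(data, step=1.5):
    """Fences placed step * H-spread beyond each hinge (step = 1.5 is assumed)."""
    lower, upper = hinges(data)
    spread = upper - lower
    return lower - step * spread, upper + step * spread

def clamp_to_hinges(data):
    """Set every value outside the hinges to the nearest hinge, as in part (b)(ii)."""
    lower, upper = hinges(data)
    return [min(max(v, lower), upper) for v in data]

def mean(data):
    return sum(data) / len(data)

print("hinges:", hinges(X))
print("H-spread:", h_spread(X))
print("inner fences:", inner_fences(X))
print("mean after clamping:", mean(clamp_to_hinges(X)))
```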