COMS30042 Advanced Algorithms 2021
Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit
January/February 2021 Examination period
FACULTY OF ENGINEERING
Third Year Examination for the Degree of
BacheIor and Master of Engineering and BacheIor of Science
COMS30042
Advanced AIgorithms
TIME ALLOWED:
2 Hours
Q1 . (a) consider an array of n integers whose values are in the range 0, . . . , u - 1. Give an algorithm that can sort the array in O(n log log u) time.
SoIution: Make a van Emde Boas tree from the integers taking O(n log log u) time. ln linear time ind the largest element and call predecessor and delete re- peatedly to get the array in reverse order, then reverse. or we can ind the smallest element and call successor repeatedly.
[3 marks]
(b) State the space usage of the FKS hashing scheme and state the time complexity of the construction.
SoIution: Linear for both.
[2 marks]
Q2 . Answer the following questions about Bloom ilters which we use to represent a set of elements. Bloom ilters do not support deletions natively. However, your friend Martha suggests a way round this problem. Her idea is as follows. create two Bloom ilters, one called 反add and one callld 反de/ . whenever we want to add an element e to the set we set the relevant places in 反add to 1. whenever we want to delete an element e we set the relevant places in 反de/ to 1. ln order to check if an element is in the set we have to look at both 反add and 反de/ . First check 反add . lf it returns false then e is not in the set. lf it returns true then check 反de/ . lf this returns true then e is not in the set. lf it returns false then e is in the set.
(a) Does Martha,s idea give us the desirable property that false positives may occur but false negatives are impossible. lf you think yes then explain why and if no, give an example that breaks this property.
SoIution: You can have false positives even if for example you haven,t deleted anything. You can also have false negatives by having a false positive in 反de/ .
[5 marks]
(b) what is the time complexity of a lookup using Martha,s idea?
SoIution: lf the Bloom ilter uses r hash functions then it is O(r) time because we need to do r hash function calls twice.
[5 marks]
(c) Another way to handle deletions is to store an m-bit number in the array positions. ln practice it turns out that 4 bits normally suices. Describe how insert, delete and lookup queries could be supported in such an extended Bloom ilter.
SoIution: The insert operation then increments the values at the relevant places by 1 and the delete operation decrements the value of each of the locations by 1. A lookup checks to see if all the values at the relevant locations equal 0.
[5 marks]
(d) Does your solution using m-bit numbers in the array positions have the desirable property that false positives may occur but false negatives are impossible? lf you think yes then explain why and if no, give an example that breaks this property.
SoIution: No. lf we insert x and y which have the same hash locations and then delete y twice and then query x we will get a false negative.
[5 marks]
(e) lmagine we want to use a Bloom ilter of size 800 bits for a web cache which stores
up to 100 URLs picked from a universe of 1000000 URLs. How would you calculate the optimal number of hash functions to use to minimise the false positive probability and what false positive probability does it give? You can give the probability to one decimal place.
SoIution: The number of hash functions is 800/(100e) 个 2.9. ln practice you should round this up to 3. The false positive probability is 心 0.05
[5 marks]
Q3 . A hash function family H = {h1 , h2 , . . . } with hi : U 一 T is pairwise independent if for randomly and uniformly chosen h e H, we have that for any distinct x, y e U and pair of values u, Ⅴ e T , pr(h(x) = u V h(y) = Ⅴ ) = 1/|T | 2 .
Let x1 , . . . , xn be diferent values chosen from U.
(a) pick at random a1 , . . . , an e {0, 1} and deine
ha1 ,...,an (x1 , . . . , xn ) = a1x1 + . . . + anxn mod 2
prove that this family is pairwise independent or give a counter-example.
SoIution:
This is not pairwise independent. consider the zero string input x1 = x2 = . . . = xn = 0. lt always maps to zero and any other string will map to zero with probability 1/2 . Hence the probability they are equal is 1/2 not 1/4 .
[5 marks]
(b) pick at random ao , . . . , an e {0, 1} and deine
hao ,...,an (x1 , . . . , xn ) = ao + a1x1 + . . . + anxn mod 2
Let u = u1 , . . . , un and Ⅴ = Ⅴ1 , . . . , Ⅴn . ui and Ⅴi must difer at at least one index i and we can ignore all other indices. we can regard a1 , . . . , an as selecting a subset of values in the two inputs u and Ⅴ. You can assume without proof that the probability of the size of the subset of mismatching indices being even or odd size is equal.
prove that this hash function family is pairwise independent.
SoIution: we only have to observe that if the size of the subset of mismatching indices is even then h(u) = h(v),if it is odd then h(u) h(v). The choice of value for ao lips parity of the subset size for both u and v. For the even case the probability they are either both 0 or both 1 each has equal probability of 1/4. For the odd case 0, 1 and 1, 0 each also have probability 1/4 as desired.
[5 marks]
Q4 . This question is about devising a fast algorithm to count the number of distinct substrings of a string of length n,including the empty string. For example, the string abb has ive distinct non-empty substrings which are (a, b, ab, bb, abb).
(a) i. Construct the atomic suix tree of abb. The suix tree you construct should have exactly three leaves as we do not need to add an edge from the root that points only to the sentinel.
SoIution:
b |
|
$ |
$
[5 marks]
ii. what property of an atomic suix tree gives you the number of distinct non- empty substrings for a string?
SoIution: The number of internal nodes excluding the root.
[5 marks]
iii. Give an algorithm that runs in O(n) time to count the number of distinct non- empty substrings of a string of length n. You should provide a short explanation in English of why your algorithm is correct and why its running time is O(n). You may assume anything given in the lectures as part of your solution. [Hint: The atomic tree has size O(n2 ).]
SoIution: Construct a (compacted) suix tree in O(n) and sum up the lengths of all the edges in the tree.
[5 marks]
(b) This part of the question is about the DC3 suix array construction algorithm. ln the DC3 algorithm indices 反1 = {i mod 3 = 1} and 反2 = {i mod 3 = 2} are deined. ln the irst stage of the algorithm it computes the suix array for indices 反1 n 反2 from a string of twice the length of the input. show the suix array that is obtained in this way by the DC3 algorithm in its irst stage using the input string bananas.
SoIution: (0, 3, 2, 1)
[5 marks]
Q5 . consider the following problem which we will call Maxlncome. we are given work values a1 , . . . , an and a bound b and incomes c1 , . . . , cn with bound k. work ai gives income ci . ls there a way of choosing work values so that their sum is at most b and the corresponding total income is at least k?
(a) show that Maxlncome is in NP. [5 marks]
(b) show that Maxlncome is NP-complete by reducing it from one of the NP-complete problems that has been covered in the lectures. [5 marks]
SoIution: To check if Maxlncome is in NP we check in linear time if the total of the work values is at most b and the sum of the incomes is at least k.
For the reduction, we reduce from subsetsum. create an instance of Maxlncome with
we now have:
{Σ(Σ)iGS(iGS) ci 三 k(a i 三 b) —(—)今(今) Σ(Σ)iGS(iGS) s i(s i) t(t) —今 si = t
A yes answer to the new problem gives a yes answer to the original problem. similarly a no answer to the new problem gives a new answer to the original problem.
Q6 . consider a 2D version of the RMQ problem. we are given a square n 根 n grid of integers. A query RMQ (y1 , X1 , y2 , X2 ) returns an index of a smallest value in the rectangle with bottom left coordinate (y1 , X1 ) and top right coordinate (y2 , X2 ) . we can assume that y1 < y2 and X1 < X2 .
consider a solution which precomputes RMQ (y1 , X1 , y1 + /y , X1 + /x ) for all y1 , X1 , /y e {21 , 22 , 23 , . . .} and /x e {21 , 22 , 23 , . . .} and store the results in a table.
(a) Explain a fast method for performing the precomputation and give its running time and gives its time complexity.
SoIution: we use the same idea as in the 1D case. That is for a given bottom left (y1 , X1 ) and ixed /y we ind all the RMQs for /x = k and then use these to get the RMQs for /x = 2k and so on doubling. This is O(log n) work in each dimension for all O(n2 ) bottom left corners, giving O(n2 log2 n) time overall. [5 marks]
(b) How can queries be performed in O(1) time once this precomputation has been inished?
SoIution: Any rectangular query can be cover by 4 overlapping rectangles with both dimensions being a power of 2. we can then make 4 queries to our precom- puted table and take the minimum of these.
[5 marks]
Q7 . construct the cartesian tree for the input 7 , 6, 1, 7, 5, 8, 3, 4.
SoIution:
1 / 6 / 7
|
3 / 5
6 |
|
[5 marks]
2023-08-09