关键词 > COMP3600/6466

COMP3600/6466 — Algorithms

发布时间:2021-09-27

Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit


COMP3600/6466 — Algorithms

Assignment 2


IMPORTANT NOTES (PLEASE READ BEFORE WORKING ON THE SOLUTION):

● Please read the entire of this question sheet before starting to work on the solution.

● Your assignment should consist of a single .pdf, named A2-[studentID].pdf and a single .cpp/.java file, named A2-[studentID].cpp or A2-[studentID].java. You should zip these two files together into a single file, named the .zip file A2-[studentID].zip and submit via Wattle before the due date.

● We provide 13 hours grace period. This means, there will be no penalty if you submit before the grace period ends. However, we will NOT accept assignment submission beyond this time (i.e., late penalty after the grace period ends is 100%).

– You can update your assignment until the grace period ends.

● Assignment marking:

– The total mark you can get in this assignment is 100 points.

– This assignment will contribute 20% to your overall mark.

– In each question, you need to provide a convincing argument (including mathematical derivation) that supports your answer. The argument is worth 80% of the marks, while the solution alone is 20%.

● Discussion with your colleagues are allowed and encouraged. However, you need to work on the assignment on your own AND write the names you discussed this assignment with.


In all the questions below, you can assume that all n are positive integers.

[30 pts] Part A

1. [10 pt] Given the following recurrence, please answer the questions below

where c is a constant.

(a) [2.5 pt] Can Master Theorem be used to solve this recurrence?

(b) [7.5 pt] Please find a tight asymptotic bound (i.e., ) to the solution of the above recurrence.

2. [5pt] Please solve the recurrence

3. [5 pt] As a programmer in a start-up company, you have been tasked to write a general sorting program (i.e., one that sort any type of data given a comparison function) that runs in time, where n is the number of data to be sorted. Is this task feasible? Please explain.

4. [10 pt] For each statement below, please identify if it is Always True, Always False, or sometimes True. Please provide an explanation for your answer.

(a) [5 pt] Any non-stable sorting algorithm can be made stable.

(b) [5 pt] Finding the minimum value in a MAX-HEAP takes time in the worst case, where h is the height of the tree.


[40 pts] Part B

5. [10 pt] The town pub is planning an end-of-lockdown celebration to be held the day the lockdown ends. All 100 tickets to this celebration has been sold out. To make things fun, the pub owner, Ms Pub, plans to serve drinks on different types of cups, where people born on the same month would be served with the same type of cups. He wonders how many of each of the 12 different types of cups should he prepare. Can you help Ms Pub? Please use the expected value of attendees born in each month to estimate the number of cups (per type) that Ms Pub needs to prepare. You can assume that everyone who has bought the ticket will come and the month each attendee were born is distributed uniformly at random.

6. [15 pt]


Figure 1: Some of the glasswares sent to Arrebnac’s Science Museum. Photos taken from Related Objects section of https://www.metmuseum.org/art/collection/search/249475.

The city Arrebnac has just received a new exhibit for its Science Museum. The exhibit consists of n artistic glasswares (some of them are shown in Figure 1), each with a unique volume. Together with these glasswares, comes n jars of pebbles. Each jar has a unique mixed of small pebbles and indicates the volume of exactly one of the glasswares, in the sense that when poured into the glassware, the entire pebbles in the jar will fill it exactly, no more and no less. In the exhibit, the glassware will be put side by side with the jar of pebbles that indicate its volume, and a robot mechanism has been built, such that with a touch of a button, one can see the jar of pebbles being poured gently into the glassware and vice versa.

All glasswares and the jars of pebbles have arrived in good conditions. However, information about which jar of pebbles should be associated with which glassware were nowhere to be found. Unfortunately, due to the odd geometry of the glasswares and the different mixed of pebbles in each jar, direct comparisons between the the glasswares’ volumes and direct comparisons between the jars of pebbles’ volumes are impossible. In fact, the only safe way to find the right association between the glassware and the jar of pebbles is to use the robot mechanism to pour the pebbles into the glassware until it is full and check if the entire pebbles in the jar fits exactly in the glassware.

The exhibition manager knows that he can resolve the problem, but with comparisons, and hence time, even in the best case. Considering the exhibition will start the next morning, he is looking for an asymptotically faster method, one that takes only time on average. Can such a method be devised? If it can, please provide the method as a pseudo-code and show that the time complexity of your method is indeed on average. Otherwise, please explain why such a method is impossible.

7. [15 pt] The Arrebnac Streaming Company (ASC) Pty. Ltd. needs a fast method to sort its entire video collections based on the number of views it has over the past week. These videos have been sorted based on the number of past week views, but they are only sorted within the same genre and released year. Considering ASC have a collection of almost any type of genre known in the film industry, dating back from the early 1900, efficiency of the sorting method matters a lot. Please help ASC by answering the following questions:

(a) [10 pt] Develop a sorting mechanism that utilises the sorted order of the videos within each group, such that the time complexity of sorting the entire video collection is in the worst case, where n is the number of videos in the entire collection, while m is the number of groups (genre and released year). Please provide a pseudo code.

(b) [5 pt] Show that the method you propose indeed have a worst case time complexity of .


[30 pts] Part C

8. In a galaxy far far away, the Planet Htrae is still fighting a health crisis over the spread of a new highly contagious and deadly disease, called DV. However, the Htrae’s scientists have now produced vaccines that can make the recipient becomes immune to DV. To vaccinate as many people as soon as possible, the Senate of Htrae has ordered each Local Area (LA) to setup a vaccination hub. The Senate requires that the hub be setup at a location that is as close as possible to everyone, especially to the vulnerables and essential workers who may have difficulties travelling far from their homes or offices. To help LA government decide the location, the Senate asked that a software be developed to automatically decide the optimal location for the vaccination hub.

The idea is the software will utilise information about the center coordinates (x, y) of clusters of homes, offices, nursing homes, hospitals, etc., as well as a weighting w ∈ (0, 1] that indicates the importance of each of those clusters. The weights are set such that the sum of the weights of the entire data would be 1. Based on these information, the software computes the position v = (xv, yv) of the vaccination hub that will minimise , where (xi, yi) and wi are the coordinate and the weight of cluster-i, while n is the number of clusters in a particular LA.

It is known that the position of v that will minimize Dv can be found by finding the weighted median for the X-axis and Y-axis independently. Suppose P = [(w1, p1), (w2, p2), · · · , (wn, pn)] is a weighted data points sorted in ascending order, first based on its data values (p) and second based on its weight ( w). Then, we compute the weighted median of P as pk, where . Note that although this definition relies on sorted arrays, given an unsorted data, the weighted median can be found without sorting the data first.

Your company, and eventually you as one of the company’s lead programmer, has been tasked to develop the software. For this purpose, you need to answer the following questions:

(a) [10 pt] Design a randomized algorithm to find the weighted median of a set of n weighted unsorted data. The algorithm should have an average running time of (n) and use (1) extra storage outside of the input arrays. Your answer should provide the pseudo-code and explanation to show that the average running time and extra storage requirement are satisfied. You can assume the input will be 2 arrays: An array, say XPos, which is a 1-dimensional position of the data (not necessarily sorted), and an array W, which contains the weights that indicate the importance of the data. Suppose n is the length of these arrays, then an example input would be XPos[i] = xi for i ∈ [1, n] while W[i] is the weight for xi . You can assume the data in XPos are independently and uniformly distributed in [XMin, XMax], where XMin and XMax refer to very large negative and positive numbers. The output is the weighted median of the input data.

(b) [10 pt] Please develop a program for deciding the vaccine hub position in each LA using the algorithm you have developed in 8a and named your program a2-[your UID].cpp or a2-[your UID].java. If you use C/C++, please use the template as provided in A2-prgInfo to ensure you use the right input and output format when run on the terminal / command prompt. If you use Java, please make sure that your program can run from terminal / command prompt, accept the input file, and output the solution to the terminal / command prompt, as per specification. The following are the program specifications

Input to the Program: The program will accept one argument, which is the name of the input file. The input file contains n + 1 lines, where n is the number of building clusters in the LA. The first line consists of one integer number, n. Each line in the next n lines consists of three integer numbers, separated by a white space. They represent the weight w along with the x and y coordinate of each cluster. You can assume x and y are integer values, where the minimum and maximum follows the bound of int data type, while w is of type float. Example:

3

0.25 1 2

0.4 3 1

0.35 4 5

The above input means the LA has 3 residential/office clusters that the vaccine hub must serve. The cluster centered at (1, 2) has a weight (importance) of 0.25, the one centered at (3, 1) has a weight of 0.4, and the one centered at (4, 5) has a weight of 0.35.

Output of the Program: A single line in the standard output, containing two integer numbers separated by a white space. The first number is xv and the second number is yv (aka., the position where the vaccine hub should be set up). The output of the example input is:

3 2

Program Marking: If your program compiles and runs, you will get 2 points. We will then run your program on 4 test cases: with increasingly many clusters (up to 100,000,000 clusters). For each test case, your program will be given CPU time to find a solution. The time limit will be rounded up to 2 decimal digit and is only for finding the position of the vaccination hub. It does not include the loading time. For marking purposes, we will run a test on the loading time for out test cases, and offset the loading time from the timing of your program. You can assume your program will have access to at most 8GB RAM. It will be run as a single thread process on a computer with Intel i7 3.4GHz processor. For each test case that your program solves correctly within the given time and memory limit, you will get 2 points.

Tips:

● Be careful not to use arrays allocated in the program stack.

● We provide additional test cases in A2-prgInfo. However, these are just examples. You should NOT extract specifications from these test cases. Specifications should follow the description in this question sheet.

(c) [10 pt] Experimental analysis of the time complexity of your algorithm and comparison against the theoretical analysis (8a). You will need to have sufficient data to fit a function well and will need to generate your own test cases for this purpose.