Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

COMP 4380 Sample Midterm 2

1.   Suppose you have a heap file containing student information (Student(sID, name, department, year)). The file contains 10,000 records and 128 records will fit on a page.

a.   What is the search cost of finding all students in the Computer Science department?

b.   What is the cost of sorting this file by department using the original 2-way external mergesort?

c.   Now that the file is sorted, assuming there are 500 computer science students, what is the worst-case cost of finding all computer science students?

2.   Consider a B+ tree with d = 3 that’s initially empty. Add the following entries in this order: 60*, 25*, 13*, 15*, 18*, 62*, 19*, 45*, 42*, 56*, 35*, 17*, 49*, 65*, 51*.

a.   How many index entries are there in the tree?

b.   Delete the following entries: 13*, 62*, 49*. How many index entries are there in the tree now?

3.   Is it possible have an extendible hash directory where there is exactly one bucket where the local depth = the global depth? If so, provide an example of one. If not, explain why not.

4.   We have a file which takes up P pages of disk space. We are using a variant of external merge-sort to sort the file, where we have 4 buffer pages available for Pass 0 and 6 available for all subsequent passes. The cost of the sort is 12P pages. What are the minimum and maximum possible values for P? Show your work.

5.   Consider a bit vector that is 120 bits long, all of which are zeros except for bits 7, 14, 28, 49, 87, and 96.

a.   What is the hexadecimal representation of the CBV for this bit vector using RLE?

6.   Consider a Player table in a sports database. Assume it is sorted by PlayerID (incremental, starting at 1). You have the following CBVs

i.   CBV(Goals > 30) = 0010100001110101

ii.   CBV(Assists > 40) = 11010000101010110100

iii.   CBV(PIM > 80) = 011101001101000100

b.   What is the largest PlayerID in the database?

c.   Assume that a Playmaker is defined as having more than 40 assists, fewer than 30 goals, and fewer than

80 penalty minutes. What are the IDs of the Playmakers in this table, if any?

7.   Suppose you have a file containing one million records, where a page can hold 210 records. We sort this file using a version of the general mergesort, and we manage to do so in less than than 5,000 page I/Os. What is the minimum number of buffer pages B we must have used?

a.   Note: we used the same number of buffer pages for all passes, including Pass 0.

8.   Suppose you are designing a new version of the general mergesort. You have X total buffer pages, which you      must split between those used in Pass 0 and those used in all other passes (that is, you will have Y buffer pages in Pass 0 and Z at all other passes, where Z + Y = X). Is it generally better to have Z > Y, Y > Z, or for them to be  (approximately) equal? Defend your choice in a few sentences.