关键词 > Database

HW2: Cost Estimation

发布时间:2025-09-24

Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

HW2: Cost Estimation

In the following, you are asked to estimate processing costs for different query plans. Use the cost formulas seen in class.

Also, use the simplifying assumptions seen in class to estimate cardinality and selectivity values. As seen in class, we do not count the cost of writing out the *final* result for any query plan.

All questions are Numeric, please fill in the calculated answer directly. Please ensure that the programming calculation process (query) for each question is shown and reasonably explained. Make sure to mark clearly which explanation refers to which question.

Q1. A table stores information on 200,000 flights (one row per flight). Each row consumes 50 bytes of storage and a page can store up to 1,000 bytes. We use an index to retrieve all information on flights associated with a specific airline. There are 20 airlines in total. The index is a clustered tree index with a fanout of 100 (it stores data, as opposed to references to data). Calculate the cost of this operation.

Q2. Consider the same scenario as in the previous question (flights). This time, however, the index is unclustered and stores references to data (as opposed to the data itself). You can assume that each entry referencing a flight consumes 50 bytes (same as for the flights themselves). Recalculate processing costs in this scenario.

Q3. We are joining two tables, R and S, with the following dimensions. R stores 100,000 rows and each row consumes 20 bytes of storage. S stores 50,000 rows and each row consumes 50 bytes of storage. Each page stores 1,000 bytes. Calculate the cost of a block-nested loops join, using R as outer and S as inner relation. Assume that blocks of size 10 pages are loaded from R.

Q4. Consider two tables T and U, T storing 1,000,000 rows with 10 bytes per row and U storing 500,000 rows with 50 bytes per row. Each page stores 1,000 bytes. How much main memory do we need to execute a hash join (with two phases, as seen in class) between T and U? You can use the rule of thumb seen in class.

Q5. We sort table V using the external sort algorithm seen in class (as sub-function of the sort-merge join algorithm). Table V consumes 100,000 pages on disk. Assume that we have allocated 100 pages of main memory for the sort algorithm. Calculate the cost of sorting the data.

Q6. We execute two consecutive hash joins (we assume that one partitioning pass over the data is sufficient). The first hash join consumes two input tables with sizes 10 and 20 pages respectively. Its output is materialized and has 10 pages. The second hash join takes the output of the first hash join as input and another table with 5 pages. What is the total processing cost for this query plan?

Q7. We execute two consecutive index nested loops joins. The first one has 100 rows in the outer input (which fit on ten pages) and we find 10 matching rows in the inner table for each row in the outer table. The output of the first join has 50 rows and is the outer input of the second join. For the second join, we find 1 matching row in the inner table for each row in the outer table. The results of the first join are pipelined to the second one. The inner inputs are indexed and we count one page read per index bucket page and one page read for each accessed entry.

Q8. Consider the query plan below referring to the example query seen in class (counting customers from Ithaca ordering a book about Ithaca). Predicates are annotated with selectivity values, all results are annotated with their row sizes and the corresponding number of pages. Calculate the processing costs for this query plan using the method seen in class (you have an error margin of 5% around the accurate cost value).

Q8. Picture:

Show Your Work:

In the following, describe the steps you used to come to your answers above. Make sure to mark clearly which explanation refers to which question.

Q1:

Q2:

Q3:

Q4:

Q5:

Q6:

Q7:

Q8: