CSI4142 Fundamentals of Data Science Midterm Examination 2020
1. Identify one (1) measure that you would include in the Fact table of the Opération Soleil conceptual design. (2)
2. Identify a role-playing dimension in the Opération Soleil conceptual design. (2)
3. Identify one concept hierarchy, other than the one in the Date dimension, in the Opération Soleil conceptual design. (2)
4. Detail the attributes you would include in a Visitor dimension and explain how you would incorporate demographic details into this dimension. (4)
5. Provide the SQL statement to create the Fact table. (6)
6. Recall that 95% of the members completed a questionnaire to collect demographic data. You notice that only 20% of the members disclosed their incomes. Some of the recorded values seem unrealistically high, which may imply that members have entered inflated numbers.
The data analyst at Opération Soleil asks you to find a way to assess, and to potentially improve, the quality of the income attribute.
Describe how you would handle his request. (4)
7. Opération Soleil is interested in tracking changes in members’ marital status over time. Describe the incremental load procedure you would implement to best handle such changes. (4)
8. Explain the following three concepts, with reference to examples in the Opération Soleil data mart:
a. lattice of cuboids
b. apex cuboid
c. base cuboid (6)
9. This question concerns iceberg cubes and iceberg queries.
a. Explain what an iceberg cube is and give an example of a potential iceberg cube for the Opération Soleil data mart. (2)
b. Give the SQL statement for an iceberg query executed against the cube you proposed in question 9(a). (2)
10. Suppose that we have a tiny subset of demographic data as shown in the following table:
Occupation | Age | Number-of-Trips | Average-Price
Professor | 35 | 3 | 3,400
Professor | 45 | 5 | 1,400
Dentist | 52 | 1 | 15,000
Pilot | 34 | 8 | 1,200
Professor | 43 | 6 | 3,400
Doctor | 35 | 6 | 8,100
Movie Star | 60 | 4 | 16,000
a. Determine whether there is a correlation between age and number-of-trips. (2)
b. Describe what Bessel’s correction is and explain why one should use it when calculating the variance of the average-price of the sample. (2)
c. Determine the central tendency of the occupation attribute. (2)
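The three quantities asked for in question 10 can be checked numerically. The following is a minimal sketch, not a model answer: it assumes the Pearson coefficient as the correlation measure and the mode as the central tendency of a nominal attribute, and the names `rows`, `pearson`, `sample_var`, `pop_var` and `mode_occ` are illustrative only. The n - 1 divisor used by `statistics.variance` is what is usually called Bessel's correction.

```python
from collections import Counter
from math import sqrt
from statistics import pvariance, variance

# Rows copied from the table in question 10:
# (occupation, age, number of trips, average price)
rows = [
    ("Professor", 35, 3, 3400),
    ("Professor", 45, 5, 1400),
    ("Dentist", 52, 1, 15000),
    ("Pilot", 34, 8, 1200),
    ("Professor", 43, 6, 3400),
    ("Doctor", 35, 6, 8100),
    ("Movie Star", 60, 4, 16000),
]
occupations = [row[0] for row in rows]
ages = [row[1] for row in rows]
trips = [row[2] for row in rows]
prices = [row[3] for row in rows]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# (a) correlation between age and number of trips
r = pearson(ages, trips)

# (b) sample variance divides by n - 1; population variance divides by n
sample_var = variance(prices)
pop_var = pvariance(prices)

# (c) occupation is nominal, so the mode is the applicable measure
mode_occ, mode_count = Counter(occupations).most_common(1)[0]

print(f"r(age, trips) = {r:.3f}")
print(f"sample variance = {sample_var:.1f}, population variance = {pop_var:.1f}")
print(f"mode of occupation = {mode_occ} ({mode_count} of {len(rows)})")
```

On these seven rows, r comes out moderately negative (about -0.53), and the sample variance is larger than the population variance by a factor of n / (n - 1) = 7/6. With so few observations, the correlation should be read as weak evidence at best.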
2022-03-10