COMP7095 Big Data Management

Assignment 2

Q1 (20 marks) Assume that 10% of a program is not parallelizable. The program is executed on a cluster consisting of 8 nodes. What is the parallel speedup compared with a single node? Assume that a single node runs at the same speed as each of the 8 nodes in the cluster. If a compiler further optimizes the program so that the non-parallelizable portion becomes 1%, what is the parallel speedup?
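
One way to check the arithmetic for Q1 is a short sketch based on Amdahl's law, speedup = 1 / (s + (1 - s) / n), where s is the non-parallelizable fraction and n is the number of nodes. The question does not name this formula, so treating it as the intended model is an assumption, as is the helper name `amdahl_speedup`. The sketch also ignores communication and scheduling overhead.

```python
def amdahl_speedup(serial_fraction: float, nodes: int) -> float:
    """Ideal speedup under Amdahl's law (assumed model, no overhead).

    serial_fraction: share of the runtime that cannot be parallelized (0..1).
    nodes: number of identical nodes running the parallel part.
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    # The serial part keeps its full cost; the parallel part is split evenly.
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / nodes)


if __name__ == "__main__":
    for s in (0.10, 0.01):
        print(f"serial fraction {s:.2f}, 8 nodes: {amdahl_speedup(s, 8):.2f}x")
```

With these inputs the sketch gives roughly 4.71x for a 10% serial portion and roughly 7.48x for a 1% serial portion, which shows how strongly the serial fraction limits the benefit of adding nodes.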

Q2 (20 marks) What is the difference between ACID and BASE? Give some examples of systems that follow ACID and BASE, respectively.

Q3 (20 marks) Please compare relational (SQL) databases and NoSQL databases from three different perspectives.

Q4 (20 marks) What is the CAP theorem? Give some examples of CA, AP, and CP databases, respectively.

Q5 (20 marks) Why can MongoDB ensure the uniqueness of _id?
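
As background for Q5, the sketch below imitates the 12-byte layout of a MongoDB ObjectId, the default type for _id: a 4-byte timestamp in seconds, a 5-byte random value generated once per process, and a 3-byte counter that starts at a random value and increments. It is an illustration only, not the driver's implementation, and the function name `make_object_id_like` is hypothetical. Separately from how the value is generated, MongoDB also maintains a unique index on _id in each collection, so a duplicate value is rejected on insert.

```python
import os
import struct
import threading
import time

# 5 random bytes chosen once per process (stands in for the per-process value).
_PROCESS_RANDOM = os.urandom(5)
# 3-byte counter, initialized to a random value.
_counter = int.from_bytes(os.urandom(3), "big")
_lock = threading.Lock()


def make_object_id_like() -> bytes:
    """Build a 12-byte value with an ObjectId-style layout (illustrative)."""
    global _counter
    with _lock:
        # Wrap around at 2**24 so the counter always fits in 3 bytes.
        _counter = (_counter + 1) % (1 << 24)
        count = _counter
    timestamp = int(time.time()) & 0xFFFFFFFF  # seconds since the Unix epoch
    return struct.pack(">I", timestamp) + _PROCESS_RANDOM + count.to_bytes(3, "big")


if __name__ == "__main__":
    print(make_object_id_like().hex())
```

Within one process the counter separates values created in the same second, and the random part makes collisions between processes unlikely. The unique index is what turns "unlikely" into a guarantee at the collection level.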