Mock Exam

Published: 2024-01-10

- Part 1: Multiple Choice Questions

1. Drew Conway Venn Diagram: Which of the following best explains the “Danger Zone” intersection in the Drew Conway Venn Diagram?

a. It describes people who can carry out end-to-end machine learning and report coefficients, but without understanding what they mean.

b. It describes people who are not sure what they are doing, although they can explain the meaning of the output of the coefficients.

c. It is an area that we should not enter as it is dangerous and can result in harmful analysis.

d. It is an area where people conduct trial and error experiments with data and report the best results.

Answer: a

2. What is the technological reason for the continuing loss of privacy?

a. the increase in cybercrime and terrorism makes it a necessity.

b. the flow of technology makes surveillance easier unless particular measures are set in place.

c. it follows from Koomey's Law.

d. the open internet and the cloud remove privacy.

Answer: b

3. Machine learning is useful when:

a. human expertise is not available

b. All of the options

c. humans cannot explain their expertise (as a set of rules)

d. humans are expensive to use for the work

Answer: b

4. Which of the following statements about Python is TRUE?

a. The first element of an array in Python has the index 1.

b. Python was designed by statisticians.

c. Python is a scripting language.

d. Python is a proprietary programming language.

Answer: c
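Option a is false because Python sequences are zero-indexed. A minimal illustration, using a built-in list (the variable names are made up for this example):

```python
# Python sequences start at index 0, not 1.
items = ["a", "b", "c"]

first = items[0]   # first element
last = items[-1]   # negative indices count from the end

print(first, last)
```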

5. Unix shell commands like “less” and “grep”:

a. are used to fit regression tree models

b. are examples of technology that is too old to be useful to a modern data scientist

c. can be used to manipulate large data files easily

d. are poorly documented

Answer: c
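Tools like “grep” cope with large files because they process the input one line at a time instead of loading it all into memory. The sketch below is a rough Python analogue of that idea, not a reimplementation of grep. The function name `grep_lines` and the sample data are assumptions made for illustration.

```python
import os
import re
import tempfile


def grep_lines(pattern, path):
    """Yield (line_number, text) for each line matching the regex.

    The file is read lazily, one line at a time, so memory use stays
    small even when the file is very large.
    """
    regex = re.compile(pattern)
    with open(path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if regex.search(line):
                yield number, line.rstrip("\n")


# Illustrative data only: a tiny stand-in for a large data file.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("id,city\n1,Melbourne\n2,Sydney\n3,Melbourne\n")
    sample_path = tmp.name

matches = list(grep_lines(r"Melbourne", sample_path))
os.remove(sample_path)
print(matches)
```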

6. Over the years, disk capacity is generally growing:

a. quadratically

b. linearly

c. exponentially

d. logarithmically

Answer: c

7. Which of the following are real-world applications of machine learning?

a. Self-driving cars

b. Spam filtering

c. Weather forecasting

d. All of the options

Answer: d

- Part 2: Short Answer Questions

8. Explain what big data is. Consider the four V’s of big data and explain veracity in a few words.

Answer: Big data: Data characterised by greater variety, larger volume, higher velocity, and uncertain veracity; more simply, data with any attribute that challenges the constraints of a system's capability or a business need.

Veracity:

Veracity relates to the uncertainty of the data. It describes how good the data quality is in terms of consistency, accuracy, trustworthiness, completeness, errors, outliers, etc.

9. List some types of metadata that might be associated with an image.

Answer: For example, time/date, owner of image data, size, resolution, location, etc.

10. Assume you are collecting data about traffic accidents in Melbourne in order to develop a predictive model. Would it be better to collect “more data” (e.g. the locations of accidents over many years) or “more types of data” (e.g. the types of vehicles involved, the weather conditions, etc.)? Give a brief justification.

Answer: Either choice can be correct, depending on your justification. For example:

More data: With many more years of data, a prediction model can potentially learn more patterns. For example, a predictive model trained on 10 years of data can potentially predict traffic accidents more accurately than a model trained on 1 year of data.

More types of data: Traffic accidents can have many causes, such as weather and vehicle type, so using these features can improve the performance of a prediction model.

Usually, more types of data help a predictive model more than simply collecting more of the same data, assuming there is enough data to build a predictive model in the first place.

11. Explain "sensitivity" as one of the classification metrics.

Answer: Check week 7 lecture slides.
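For reference, sensitivity (also called recall or the true positive rate) is conventionally defined as TP / (TP + FN): the fraction of actual positives that the classifier correctly identifies. The sketch below assumes binary labels with 1 as the positive class. The function name and the sample labels are illustrative and are not taken from the lecture slides.

```python
def sensitivity(y_true, y_pred, positive=1):
    """Return TP / (TP + FN) for the given positive class label."""
    pairs = list(zip(y_true, y_pred))
    # True positives: actual positives that were predicted positive.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False negatives: actual positives that were missed.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("y_true contains no positive examples")
    return tp / (tp + fn)


# Made-up labels: 4 actual positives, 3 of which are detected.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]

print(sensitivity(y_true, y_pred))  # 3 / (3 + 1) = 0.75
```

Note that the false positive in the example (an actual 0 predicted as 1) does not affect sensitivity; it would lower specificity and precision instead.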

12. Would you consider users' emails to be sensitive information? Why or why not?

Answer: Yes. They can contain all sorts of private information: addresses, credit card numbers, phone numbers, and details about individuals such as racial or ethnic origin, political opinions or associations, religious or philosophical beliefs, etc.

13. What is the difference between the random forest algorithm and a decision tree?

Answer: A decision tree is a single model built from a combination of rules/decisions, whereas a random forest is a combination of many decision trees whose predictions are aggregated. Random forest is an ensemble learning technique.
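The relationship can be sketched in a few lines of plain Python. This is a toy simplification under stated assumptions, not a production random forest: each "tree" here is only a depth-1 decision stump (a single rule), each stump is trained on a bootstrap sample using one randomly chosen feature, and the forest predicts by majority vote. The names `fit_stump` and `fit_forest` and the data are hypothetical.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature):
    """Fit a depth-1 decision tree: one threshold rule on one feature."""
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        # Each side of the rule predicts its majority label.
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left)
        errors += sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return lambda row: left_label if row[feature] <= threshold else right_label


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Fit many stumps on bootstrap samples and combine them by voting."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        # Bootstrap sample: draw training rows with replacement.
        idx = [rng.randrange(len(rows)) for _ in rows]
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        # Randomly pick the feature this tree is allowed to use.
        feature = rng.randrange(len(rows[0]))
        trees.append(fit_stump(sample_rows, sample_labels, feature))

    def predict(row):
        votes = Counter(tree(row) for tree in trees)
        return votes.most_common(1)[0][0]

    return predict


# Made-up, cleanly separable data with two features per row.
rows = [(1, 1), (2, 2), (3, 1), (8, 9), (9, 8), (7, 9)]
labels = [0, 0, 0, 1, 1, 1]

single_tree = fit_stump(rows, labels, feature=0)
forest = fit_forest(rows, labels)
print(single_tree((1, 1)), forest((1, 1)), forest((9, 9)))
```

The single tree depends entirely on the one rule it learned, while the forest averages out the quirks of individual trees by letting many of them vote.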