Assignment 1


Problem I (10 points)

List the characteristics of big data and explain each one in one or two sentences.

 

Problem II (50 points)

1) Follow Lab 1 to install Git and create a GitHub account.

2) Create a public repository, for example DataCrawl. (You are free to choose which program you commit to GitHub.)

3) Commit one of your Python programs to the above repository.

4) Create a Readme file that documents the functions in your Python file.

5) Provide a screenshot of your repository and its URL. The repository should contain the Python file and the Readme.md file.

6) Provide an activity screenshot showing that you have successfully committed your code to GitHub.
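
Step 4 asks for a Readme that documents the functions of the Python file. One possible way to draft it, sketched here on the assumption that each function has a docstring, is to read the function names and docstrings with Python's standard ast module. The names build_readme and SAMPLE_SOURCE, the sample functions, and the Markdown layout are illustrative only and are not a required format.

```python
import ast

# Hypothetical sample program; replace with the contents of your own file.
SAMPLE_SOURCE = '''
def fetch_page(url):
    """Download a page and return its text."""
    return ""

def parse_links(html):
    """Extract all hyperlinks from an HTML string."""
    return []
'''


def build_readme(source: str, title: str) -> str:
    """Return Markdown text listing each top-level function and its docstring."""
    tree = ast.parse(source)
    lines = [f"# {title}", "", "## Functions", ""]
    for node in tree.body:
        # Only top-level "def" statements are listed in this sketch.
        if isinstance(node, ast.FunctionDef):
            doc = ast.get_docstring(node) or "(no docstring)"
            summary = doc.splitlines()[0]  # first docstring line only
            lines.append(f"- `{node.name}`: {summary}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_readme(SAMPLE_SOURCE, "DataCrawl"))
```

The printed text can be saved as Readme.md and then edited by hand. A generated outline is only a starting point; the Readme should still explain what the program does in your own words.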

 

Problem III (40 points)

1) Install Cloudera VirtualBox on your machine.

2) Start Cloudera Express.

3) Take a screenshot of Cloudera Manager showing that all services have started.

 

Submission Notes:

   Please submit your file in Word or PDF format through Blackboard. DO NOT email your files to the instructor. Name the file in the form YourLastName_YourFirstName_Ass1.docx.