
CMDA 3654 Project Milestone 1 - Data for Course Project





This course will have a final project.

You are required to use the methods that you learned during this course on your data set. The first step is to actually find a data set.



You are required to look for data in one of the following areas:

1. Agriculture/Environment/Ecology/Climate

2. Crime

3. Economics/finance/consumer

4. Education/science/research

5. Engineering/manufacturing/energy

6. Entertainment

7. Health/Healthcare

8. Housing

9. Government

10. Sports



  Find a dataset that contains at least 15 variables and at least 5000 observations on those variables.


  Try to find a dataset that has a mixture of variable types, i.e., both numeric and categorical variables. It helps if you find a dataset in an area that you are passionate about.

  The dataset does not have to be clean, but it would be in your best interest if it were at least in a format that can easily be converted into a standard data frame format.

  Provide a description of the data to the best of your ability.

  You are not allowed to do analysis that copies the work of other people.  So do not pick a data project where you intend to copy the analysis of someone else, e.g. from someone on kaggle.com!
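The size and variable-type guidelines above can be checked before you commit to a dataset. The following is only a minimal sketch in Python using the standard library. It assumes the data is a CSV file with a header row; the helper names (`summarize_csv`, `meets_guidelines`) are made up for illustration, and the course itself may expect a different language or tool.

```python
import csv
import io

MIN_VARIABLES = 15       # threshold taken from the guidelines above
MIN_OBSERVATIONS = 5000  # threshold taken from the guidelines above


def is_number(text):
    """Return True if the text parses as a float (note: 'nan' and 'inf' also parse)."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def summarize_csv(handle):
    """Count rows and columns and roughly classify each column.

    Assumption: a column is 'numeric' if every non-empty cell parses as a
    float; all other columns (including entirely empty ones) are counted
    as 'categorical'. Real data often needs a closer look than this.
    """
    reader = csv.reader(handle)
    header = next(reader)
    numeric = [True] * len(header)
    seen = [False] * len(header)
    n_rows = 0
    for row in reader:
        n_rows += 1
        for i, cell in enumerate(row[:len(header)]):
            cell = cell.strip()
            if cell == "":
                continue  # treat empty cells as missing values
            seen[i] = True
            if not is_number(cell):
                numeric[i] = False
    n_numeric = sum(1 for i in range(len(header)) if seen[i] and numeric[i])
    return {
        "n_variables": len(header),
        "n_observations": n_rows,
        "n_numeric": n_numeric,
        "n_categorical": len(header) - n_numeric,
    }


def meets_guidelines(summary):
    """Check only the two size thresholds stated in the guidelines."""
    return (summary["n_variables"] >= MIN_VARIABLES
            and summary["n_observations"] >= MIN_OBSERVATIONS)


# Tiny synthetic example (not real data) to show the output shape.
sample = io.StringIO("id,city,price\n1,Blacksburg,10.5\n2,Roanoke,\n")
summary = summarize_csv(sample)
print(summary)
print(meets_guidelines(summary))
```

For a real file, you would pass an opened file object instead of the `io.StringIO` sample, e.g. `open("my_data.csv", newline="")`, where the file name is a placeholder. A dataset that passes the size check but shows zero categorical or zero numeric columns may not give you the mixture of variable types suggested above.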



1. Tell us where you got the data and provide a link so that we can obtain a copy for ourselves to inspect it.  You should be very clear about your source.

2. Provide a description of the data to the best of your ability.

3. Tell us why you chose the data.

4. Do you have an initial idea of what you would like to do with the data?

5. Additionally, you will be required to look over the submissions and data for two of your peers. Further instructions are found below. Consider how the evaluations will be carried out when writing up your brief report.


Assessment: 40 points possible

You will earn 20 points for submitting your brief report.

You must do 2 peer evaluations which will be randomly assigned.  You will earn an additional 10 points each for your 2 peer reviews.

Please provide constructive and polite peer reviews as we can see everything you write.


Looking Forward:

The next project milestone will require you to clean the data and perform basic Exploratory Data Analysis.  You may want to keep this in mind when choosing your data.


Instructions for Peer Evaluations:

Please leave feedback on the following items related to your classmates' submissions.

  Were you able to easily access the data?

  Is it clear what the data is related to? Additionally, are the variables clearly identified and discussed in the source material, in either a data dictionary or some other manner?

  Was the data already clean and provided in a data frame format, or will it take work to get it into a data frame format?

  Did the data meet the guidelines of having at least 15 variables and at least 5000 observations on those variables?

  Your classmate should have indicated which area of focus (1-10, e.g. education, healthcare, etc.) they intend to use the data for. Did they do this? Do you think the data they have collected is related to the area they have indicated?

  Do you have any concerns about their data set? In particular, do you see your classmate being successful if they were to do a thorough analysis of this data for their course project?