INFT6201 - BIG DATA

ASSESSMENT 3: DATA ANALYSIS REPORT

DESCRIPTION

Student groups analyse a real dataset using a professional statistical data analysis tool. The submission includes written text, figures, and tables. References and statistical results need to be formatted in APA style (http://apastyle.apa.org). The Python code is to be submitted as an appendix (Appendix A). Appendix B is optional and can be used to provide the reader with supplementary material (e.g., figures, tables, supporting documentation). The data analysis report includes:

- Title page
- Executive summary
- Introduction
- Dataset Description
- Descriptives
- Analysis & Results
- Discussion & Conclusions
- References
- Appendix A: Python code
- Appendix B: Supplementary Material (optional)

To submit your data analysis report, please log on to Canvas and look up the following folder:

Assignments → Assessment 3: Data Analysis Report

TOPIC

In this assignment, you will analyse a dataset containing information on traffic accidents that occurred in the state of New York in 2019 and 2020.¹ The dataset has many interesting aspects, and you will find a wealth of relationships between different variables that can be analysed. In your report, first introduce the topic (Section: Introduction) and then describe the dataset (Section: Dataset Description). The third section (Section: Descriptives) should give the reader an overview of the different variables and the relationships between them. This includes summary tables, correlation tables, and figures illustrating the data distribution. Then select a couple of relationships between variables in the dataset that you find most interesting and investigate them using the statistical techniques discussed in the course (e.g., t-test, ANOVA, linear regression). Finally, provide a discussion of the results and a conclusion for the data analysis report (Section: Discussion & Conclusions).

MARKING CRITERIA

Criterion

Description

1. Report format [5%]

A data analysis report has to be clearly structured and well organised. It has to be designed in a pleasant and appealing way (e.g., by using an eye-catcher on the title page, including page numbers, numbering sections, figures, and tables, adequately highlighting essential information, and using footnotes and appendices to add supplementary information).

2. Executive summary [5%]

An executive summary provides a concise summary of the most essential information of the data analysis report. This includes in particular the key findings of the statistical analysis as well as any other important discoveries. It is the first thing a reader will see after the title page.

3. Introduction [10%]

A data analysis report should provide a concise introduction to the topic (e.g., half a page) and an overview of the analyses that are conducted in the report.

4. Dataset Description [5%]

Provide a concise description of the dataset, including the source of the dataset, the number of observations, the number and types of variables, and missing values.

5. Descriptives (e.g., summary tables, correlation tables, figures) [15%]

The data analysis report includes a section that provides an overview of the data. This includes summary tables for key variables (e.g., mean, median, quartiles, standard deviation), correlation tables, and figures (e.g., correlation plots, box plots, and violin plots) to provide an understanding of the data distribution.

6. Analysis & Results (e.g., regressions) [25%]

The data analysis report has to provide a well-structured analysis & results section, which presents the analyses that are conducted (e.g., t-tests, ANOVAs, Tukey tests, linear regression) and their results. This includes written text, tables, and figures.

7. Discussion & conclusions [10%]

A data analysis report has to provide a general discussion of the results presented in the earlier sections of the report. Moreover, it has to provide clear conclusions at the end of the report.

8. Appendix A (Python code) [10%]

The Python code used to present the descriptives and to conduct the statistical analysis is to be submitted as an appendix (Appendix A). The Python code needs a clear structure and has to be free of errors. Use # comments to state which analysis is conducted and to structure the code. Marks will be deducted for poor documentation, structure, and programming style.

9. Writing and referencing [15%]

A major aspect of data analysis is the ability to communicate results and findings clearly and accurately. This criterion includes correctness of grammar and spelling, and appropriate use of references, direct quotes, paraphrasing, etc. However, students should note that there is a fine line between poor referencing and plagiarism, and