Final Project Part 3

Final Data Analysis Report

2022

Goal of the Assessment:

Part 3 of the Final Project is your opportunity to demonstrate all that you have learned throughout the course. This will be done by showing the teaching team that you can use the methods and techniques learned in the course appropriately. You can use the feedback that you have received on Parts 1 and 2, as well as on the video project, to write a report in a common research paper format (IMRD: Introduction, Methods, Results, Discussion). Writing these kinds of reports is likely something that, as a graduate student or a statistician working in industry, you will find yourself doing occasionally.

Since this assignment is used to assess how familiar you are with the use of the tools and methods from this course only, you should NOT use materials that were not covered in this course. Instead, focus on showing us how much you know about everything we have discussed throughout the term.

Your final report can also be used as part of a dossier when applying for jobs to showcase your abilities as a statistician and data analyst.

General Instructions:

Using only methods and techniques presented in the lecture slides throughout the term, you are tasked with answering your proposed research question by creating the ‘best’ linear regression model that meets the requirements of your research question. You will then need to write a report (details below) that (i) introduces your research question and presents some background, (ii) outlines the steps in your analysis that you followed to reach the ‘best’ model, (iii) presents the results of your analysis and describes and justifies the decisions you made, and finally (iv) discusses the final model, its interpretation and its limitations in terms of its ability to meet your research goals. It should be made clear whether you are aiming for a model that makes good predictions, or a model that is more descriptive and easier to interpret, or some combination of both.

The feedback and work you have put into Part 1 of the final project should help you structure your report in a professional and easy-to-read fashion, as well as provide you with a good beginning to your introduction section. You may want to consider adding some additional background research or more discussion about how your research question is important and different from the background you present. The EDA portion of Part 1 should be helpful in writing the beginning of the results section, where you display the characteristics of the data you will use to answer your question.

The feedback and work you have put into Part 2 of the final project should help you structure the methods section of your report, where you will outline the process you followed and the tools and methods you used to answer your research question. The feedback should also help you with how you approach the data analysis itself.

How to present your final report:

Once you have decided upon the ‘best’ model to fulfill the goal of the project, you must write up a short scientific report. Your report should have 4 main sections:

•   Introduction section: where you introduce the purpose and relevance/importance of the project and provide some relevant background information on the topic (no results or data should be presented here).

•   Methods section: where you describe and explain the methods, tools and techniques used to arrive at your final model (no results or data should be presented here, but you can tell us where you found your data and what variables it contains).

•   Results section: where you present a numerical/graphical description of your study sample and important results that led you to make crucial decisions in building your model (following the methods you outline in the earlier section), followed by the final model and any other important results.

•   Discussion section: where you interpret your final model and describe why it answers the research question and why it is important, as well as discuss any limitations that still exist based on your results.

You may use tables and plots to help present your results, but they must be relevant and well-thought-out to convey as much information as possible without being too overwhelming or confusing. When explaining your methods, try to avoid just stating that you used a specific method, but add an explanation for how it is used to achieve a specific task. When presenting your results, avoid repeating exactly what you wrote in your methods section. Instead, focus on the results of the process you described earlier, and use numerical values/graphical results to support the decisions you made in arriving at your final model. See the rubric for more information regarding the various report components.

If you want more information about how to structure your report and what should be contained in each section, see this cheat sheet and this outline for reports (you may ignore the abstract portion since you do not need one). Note that not all the elements in these resources need to be included in your report. But you can use these to better understand how to structure your submission.

Finally, if you use any external resources outside of the lecture slides, e.g. to provide background on your topic, you should include a reference section at the end of your report. You may follow APA citation style to help format your references. For some resources on how to cite, see the library page on citations.

What to do if you want to change your dataset or research question:

If you wish to change your dataset or research question from what was originally proposed in Part 1, you are allowed to do so. However, you will need to provide a written statement that proposes the change you wish to make. In order to change your dataset or research question, you will need to submit a 1-page document (to be submitted by December 4 at 11:59PM ET on Quercus) that answers the following two questions:

1.   Why are you changing your topic or dataset? Elaborate on what made your original dataset or topic not appropriate for the final project.

2.   What makes your new topic and/or dataset more appropriate than the previous one? Be sure to clearly state your new research question and provide a short, written description of where you located your dataset and what information it contains.

The instructor will then approve or provide suggestions to improve your new dataset/research question.

Technical Requirements of the Final Report:

Your report should be typed using whatever software you prefer but must be saved and submitted as a PDF or .docx file on Quercus. Your report must meet the following requirements:

•   Font: 12-point font in a style similar to Times New Roman (this is the default in R Markdown)

•   Spacing: single-spaced

•   Word count: up to a maximum of 1500 words in total. This does not include captions on figures and tables; however, captions should not be excessively long or contain information that isn’t mentioned in the main text. We will still accept a report that exceeds the word limit by no more than 150 words.

•   Number of tables/figures in the main report: 5 in total, but you may use any combination of tables and figures

•   Figure and table captions: every figure and table should include a caption that describes what is being presented (captions are not included in the word count).

o Captions should not contain information that is not also discussed in the main report

•   Figure properties:

o All plots should have an appropriate title and axis labels, avoiding the use of variable names as they appear in the dataset

o A figure may include multiple individual plots but they should be related to each other and make sense as to why they are being presented together

§ Avoid having too many plots in the same figure to ensure that they are legible and clear.

•   Reference list or bibliography at the end of the report (will not count towards word count), using appropriate citation style

•   Appendix: you may add an appendix at the end of your report to include some additional tables or figures that were not important enough to be part of the main report, but are still relevant to your analysis:

o up to 3 additional tables/figures, but they should only be included if they are relevant to the analysis and are referred to in the main text.

•   R code: In a separate file (i.e. RMD file), you should upload your cleaned and complete version of the R code that was used to conduct your analysis. The R code should be well-organized and commented appropriately to indicate what each line/section of code is doing.
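
As a rough illustration of the figure and R code requirements above, the following sketch draws a plot with a descriptive title and axis labels rather than raw variable names, and uses comments to mark what each section does. It assumes R's built-in `mtcars` data purely as a stand-in; the object names `cars` and `plot_labels` are hypothetical and not part of any course material.

```r
# ---- Load data ----
# mtcars ships with base R; it stands in for your own dataset here.
cars <- mtcars

# ---- Define readable labels ----
# Descriptive text replaces the raw variable names (mpg, wt).
plot_labels <- list(
  main = "Fuel Efficiency versus Vehicle Weight",
  xlab = "Weight (1000 lbs)",
  ylab = "Fuel efficiency (miles per US gallon)"
)

# ---- Figure: response against one predictor ----
plot(cars$wt, cars$mpg,
     main = plot_labels$main,
     xlab = plot_labels$xlab,
     ylab = plot_labels$ylab,
     pch  = 19)
```

Keeping the labels in one place makes it easier to reuse consistent wording across several related plots in the same figure.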

Checklist for submitting final project part 3:

1.   Your final written report which follows the requirements above.

2.   Your R code that shows your complete analysis (this will be used to verify the results displayed in your written report and will not be assessed for content).

Things to keep in mind while writing your final report:

o You do not need to write out the results of every step you took in your analysis as this will make your report too long.

o Instead, focus on summarizing the most important results, especially where a big decision was made. You need to justify any big decisions.

o For the rest of your results, very short mentions of the process with a brief piece of evidence provided are enough to allow your reader to follow your analysis and understand how you arrived at the final model.

o Rather than presenting the results of each step separately (e.g. creating separate tables for each), consider putting together one larger table that you can refer to in your discussion of many steps in your analysis so that you don’t use too much space.

o For example, if you are selecting between a few different models, you could consider presenting a table that includes many different summaries of the fit of each model and refer to each part as needed in the text, instead of making individual tables for each component.

o Avoid using R output taken directly from R/RStudio. Instead create your own tables where you select only the relevant pieces of the output to display.

o Generally, the methods and results sections tend to be the longest sections, while the introduction and discussion tend to be shorter.

o Keep this in mind when deciding how much background to provide in your introduction. Often just a paragraph or two is plenty, given the word limits in this project.

o However, make sure you leave yourself enough space for a solid discussion where you can discuss the impact of the limitations that may exist in your model.
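
The advice above about combining results into one table and avoiding raw R output could be sketched as follows. This is a minimal illustration only: it assumes R's built-in `mtcars` data, three hypothetical candidate models, and adjusted R-squared, AIC and BIC as the fit summaries. Substitute your own data and whichever criteria were actually covered in the lectures.

```r
# Candidate models fitted to the built-in mtcars data (illustrative only).
candidates <- list(
  "Weight only"                     = lm(mpg ~ wt, data = mtcars),
  "Weight + horsepower"             = lm(mpg ~ wt + hp, data = mtcars),
  "Weight + horsepower + cylinders" = lm(mpg ~ wt + hp + cyl, data = mtcars)
)

# Collect several fit summaries for every model into ONE table,
# instead of printing a separate summary() for each model.
comparison <- data.frame(
  Model      = names(candidates),
  Predictors = vapply(candidates, function(m) length(coef(m)) - 1,
                      numeric(1), USE.NAMES = FALSE),
  Adj_R2     = vapply(candidates, function(m) summary(m)$adj.r.squared,
                      numeric(1), USE.NAMES = FALSE),
  AIC        = vapply(candidates, AIC, numeric(1), USE.NAMES = FALSE),
  BIC        = vapply(candidates, BIC, numeric(1), USE.NAMES = FALSE)
)

# Round for presentation so the table stays compact.
comparison$Adj_R2 <- round(comparison$Adj_R2, 3)
comparison$AIC    <- round(comparison$AIC, 1)
comparison$BIC    <- round(comparison$BIC, 1)
print(comparison)

# For the chosen model, keep only the relevant columns of the
# coefficient table rather than pasting the full summary() output.
final_model <- candidates[["Weight + horsepower"]]
coef_table  <- summary(final_model)$coefficients[
  , c("Estimate", "Std. Error", "Pr(>|t|)")
]
print(round(coef_table, 4))
```

A single table like `comparison` can be referred to at several points in the results section, which helps stay within the limit on tables and figures.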