
Report

1 Overview

The first assignment is a report based around the survey data we collected in class. The raw data can be downloaded here and the original format of the survey can be found here. See below for the required questions. You should write your findings in a report style, as if you were analysing a data set for a client. The client is not a statistician, but they are interested in the details of your work.

You will submit an HTML document compiled using Quarto or R Markdown with code folding [1] enabled so that your client can see ALL the code you used. However, your report should not rely on the client understanding your code - the text should communicate everything that the client needs to know.

To help you think about this - one of your clients might be an analyst who understands the R code and might want to check the details, while the other is a manager who doesn't have R skills but still wants to have a good understanding of the data processing choices you've made and what you've done.

All of the standard writing expectations apply for statistical reports. For help with improving your writing see the Study Skills - Writing page.

 

 

If you only submit a .rmd or a .qmd file you will get a maximum of 2 out of 10. Your submission to Canvas must be a .html file.

You can work on this in your computer lab in consultation with others but you will need to submit your own report. Your tutor can also provide feedback on your approach.

 

2 Required questions

The following questions need to be addressed in the report.

 

 

You should address these questions in the introduction section of your report.

1. Is this a random sample of DATA2X02 students?

2. What are the potential biases? Which variables are most likely to be subjected to this bias?

3. Which questions needed improvement to generate useful data (e.g. in terms of the way the question was phrased or response validation)?


FYI there are 780 students in DATA2002 and 70 students in DATA2902.

 

 

 

 

You should address these questions in the results section of your report. The report will be more compelling if you can articulate a connection between the questions you select so that the report feels like a coherent body of work (rather than three unrelated tests).

Identify three questions you can answer from the data and perform a hypothesis test for each question. The hypotheses should be of the same form as what we have covered in lectures. Give a motivation for why you selected these questions. Be sure to report and interpret the results and mention any limitations in the data that may impact your findings. You may have mentioned this in general terms in the introduction, but be specific in the results section.

There needs to be some variety in the types of tests you implement:

at least one test from module 1

at least one test from module 2

at least one test needs to be based on a resampling method (e.g. Monte Carlo or permutation test).
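
As an illustration of the resampling requirement, a permutation test for a difference in means between two groups could be sketched as follows. The data here are simulated and the names `group` and `value` are hypothetical; they are not columns from the class survey, and the choice of test statistic is an assumption you would adapt to your own question.

```r
# Simulated stand-in data (hypothetical; replace with cleaned survey variables)
set.seed(2002)
group <- rep(c("A", "B"), times = c(12, 15))
value <- c(rnorm(12, mean = 7), rnorm(15, mean = 6))

# Observed test statistic: difference in group means
obs_diff <- mean(value[group == "A"]) - mean(value[group == "B"])

# Permutation distribution: shuffle the group labels and recompute the statistic
B <- 5000
perm_diff <- replicate(B, {
  shuffled <- sample(group)
  mean(value[shuffled == "A"]) - mean(value[shuffled == "B"])
})

# Two-sided p-value: proportion of permuted statistics at least as extreme
# as the observed one
p_value <- mean(abs(perm_diff) >= abs(obs_diff))
p_value
```

Setting a seed makes the reported p-value reproducible each time the report is compiled. In the report text, the hypotheses, the statistic, and the conclusion would still need to be stated in words.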

Additional requirements for DATA2902 students have been posted to the DATA2902 resources page.

 

3 Guides

  Guide on importing and cleaning the data

  Report writing guide

 

 

 

The two guides above provide essential information on how to succeed in this assessment task. You should read them carefully as you go about writing up your report.

 

 

 

The following YAML code can be used to make sure you meet the minimum criteria. The self-contained and code folding options are particularly important.

 

 

If your file ends in .rmd you can adapt this:
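
A minimal R Markdown header along these lines, mirroring the Quarto options given for .qmd files, might look like the sketch below. The keys are standard `html_document` options from the rmarkdown package, but the particular values (for example `code_folding: hide`) are assumptions rather than a prescribed template.

```yaml
---
title: "Your title here"
date: "`r Sys.Date()`"
author: "Your SID"
output:
  html_document:
    self_contained: true # Creates a single HTML file as output
    code_folding: hide # Code folding; chunks start hidden and can be shown
    code_download: true # Includes a button to download the code file
    toc: true # (Optional) Creates a table of contents
    number_sections: true # (Optional) Puts numbers next to headings
---
```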


 

 

If your file ends in .qmd you can adapt this:

 

---
title: "Your title here"
date: "`r Sys.Date()`"
author: "Your SID (don't put your name, so that we can respect the anonymous marking policy)"
format:
  html:
    self-contained: true # Creates a single HTML file as output
    code-fold: true # Code folding; allows you to show/hide code chunks
    code-tools: true # Includes a menu to download the code file
    table-of-contents: true # (Optional) Creates a table of contents!
    number-sections: true # (Optional) Puts numbers next to heading/subheadings
---

 

 

 

 

 

You should absolutely review and follow the advice in the guides above, but at a minimum, before submitting your report, check the following points.

Your assignment submission needs to be an HTML file that you have compiled using R Markdown or Quarto. Do not upload the rmd or qmd file (i.e. the code file).

You must use code folding (so we can see your code) and specify that the HTML file is self-contained (otherwise all the formatting and images won't be sent to Canvas). Code folding is also a handy check on whether your report is well written: your report should make sense and provide all the relevant information in the text when the code is hidden. Don't rely on the reader to understand your code.

Think about how your report is structured (e.g. introduction, results, conclusion).

Is there sufficient text explaining what is being presented or are you relying on the reader being able to understand and interpret the code and R output? Any output that you include needs to be explained in the text of the document.

Is it well presented (e.g. no unnecessary warnings or messages showing up)? If your code chunk generates unnecessary output, you need to suppress it using the chunk options [2].

Have you included sufficient and appropriate references? This includes software, data, and other reference material [3].
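
On the point about suppressing unnecessary output: one option in a Quarto document is to set the chunk defaults once in the YAML header instead of on every chunk. The fragment below is a sketch that assumes the knitr engine (the default for R code) and that you want warnings and messages hidden throughout; it would be merged into your existing header rather than replacing it.

```yaml
knitr:
  opts_chunk:
    message: false # Hide package start-up and other messages
    warning: false # Hide warnings in the rendered report
```

Hiding warnings globally can mask real problems, so it is worth rendering once with them visible and checking that none of them affect your results.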


 

1. Code folding in Quarto and R Markdown.

2. For details on how you can fine tune the settings in your R Markdown document see the R Markdown Cookbook. This includes formatting, tables, captions and chunk options. Much of this is directly transferable to Quarto documents too.

3. We are not prescriptive about the citation style that you use, but you should be consistent in whatever style you choose. See the Library website for more details about citations.