STA 112 Project 2 – Technical Report Guidelines

The final deliverable for your project is a technical report that describes the process you used to arrive at a “best model” for your response variable.

Deadline

Your report should be submitted by noon on Friday, May 5th.

Caution:  Be aware that I cannot accept work after Friday, May 5. This is a rule imposed by the University and cannot be changed.

Audience

In any type of writing, it is important to consider who your audience will be.  For this project, you may imagine that your audience is another technical specialist.

• You may assume that your audience is familiar with statistics at the level of STA 112 (i.e., they’re familiar with all the tools and techniques we’ve learned in this class).

• You should fully explain what decisions you made while building your model and why you made those decisions.

• Your report should stand on its own. The reader shouldn’t need to look anything up about your data or methods to understand what you did.

Organization

If you need some help organizing your report, I suggest you follow the outline given in the Best Subset Selection Example and Procedure Suggestions on Canvas, which is also reproduced in the next section. Note that there are many good ways you could organize your final report, so feel free to deviate from what is suggested here if appropriate.

Suggested Outline

1. Introduction:  Describe the data set that you’re studying and the variables it includes.  Give any background information that is needed to understand the rest of your report.

2. Methods:  Here, you should describe the process you went through to obtain your final model.  One suggested process is given below, but you should adapt it to suit your needs.  Your final report should describe the steps you took to build your model, supported with plots and model outputs where necessary. The reader should be able to follow the line of reasoning that led you to the model you selected.  Did you decide to apply a transformation?  Drop some predictors?  Address concerns about multicollinear predictors? Explain as you go why you are doing the things you’re doing.

3. Discussion:  Explain any issues you came across and limitations of your final model. The key is to be honest and transparent about limitations, without minimizing the ways in which your model succeeds. If your model has any interesting implications, describe them here.

Suggested Procedure

1. Plot the numerical variables against the response variable to see if the relationships between each predictor and the response look linear. If not, you should consider applying transformations.

2. Apply transformations, if necessary.

3. Perform best subset selection (BSS) and create plots as described above, using both multiple $R^2$ and $R^2_{\text{adj}}$ as metrics for the “best” model.

4. Although it’s possible that the two “best” models from the previous step are the same, it is likely that they’ll be different. Indeed, it’s likely that one model will be nested inside the other. Compare the two models against each other using a nested F-test.

5. Use the result of your test to choose a “base” model to start with.

6. Then, investigate your “base” model closely.

(a) Are all the predictors significant?

(b) What percentage of variability is explained by each predictor?

(c) Is multicollinearity a concern?

7. Answers to the above questions may suggest your next steps. Should some predictors be dropped from the model? Does your “best” model exclude a predictor you think is important, even though it is not statistically significant?

8. Consider whether or not you want to add any interaction terms to the model.

9. Once you’ve decided on a final model, you should describe it statistically as completely as you can (individual t-tests, ANOVA, percentage of variability explained by each variable, etc.).
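
As a rough illustration of steps 1 and 3 through 6, the sketch below runs the procedure in base R. It is not the required workflow, and it rests on several assumptions: the built-in mtcars data (with mpg as the response) is a hypothetical stand-in for your own data set, best subset selection is done by brute force instead of with a dedicated package, the full model stands in for the multiple R-squared winner, and the 0.05 cutoff is an assumed significance level. All object names (dat, bss, base_model, and so on) are made up for the example.

```r
# Hypothetical stand-in data: mtcars ships with R; mpg plays the role of the response.
dat <- mtcars[, c("mpg", "disp", "hp", "wt", "qsec")]
response <- "mpg"
predictors <- setdiff(names(dat), response)

# Step 1: plot the response against each numerical predictor to check linearity.
op <- par(mfrow = c(2, 2))
for (p in predictors) plot(dat[[p]], dat[[response]], xlab = p, ylab = response)
par(op)

# Step 3: brute-force best subset selection (fit every non-empty subset).
subsets <- unlist(
  lapply(seq_along(predictors), function(k) combn(predictors, k, simplify = FALSE)),
  recursive = FALSE
)
fits <- lapply(subsets, function(vars) lm(reformulate(vars, response), data = dat))
bss <- data.frame(
  size  = lengths(subsets),
  r2    = vapply(fits, function(f) summary(f)$r.squared, numeric(1)),
  adjr2 = vapply(fits, function(f) summary(f)$adj.r.squared, numeric(1))
)

# Step 4: nested F-test between the adjusted R-squared winner and the full model.
# Assumption: the full model stands in for the R-squared winner, since multiple
# R-squared never decreases when a predictor is added.
i_adj  <- which.max(bss$adjr2)
i_full <- which.max(bss$size)
f_test <- anova(fits[[i_adj]], fits[[i_full]])

# Step 5: keep the smaller model unless the extra predictors are significant
# (0.05 is an assumed cutoff).
p_value <- f_test[["Pr(>F)"]][2]
i_base  <- if (!is.na(p_value) && p_value < 0.05) i_full else i_adj
base_model <- fits[[i_base]]
base_vars  <- subsets[[i_base]]

# Step 6(a): individual t-tests for each coefficient.
coef_table <- summary(base_model)$coefficients

# Step 6(b): share of total variability per term, from sequential (Type I)
# sums of squares; these depend on the order of the predictors.
aov_table <- anova(base_model)
pct_var <- 100 * aov_table[["Sum Sq"]] / sum(aov_table[["Sum Sq"]])
names(pct_var) <- rownames(aov_table)

# Step 6(c): variance inflation factors, VIF_j = 1 / (1 - R_j^2), where R_j^2
# comes from regressing predictor j on the other predictors in the model.
vif <- vapply(base_vars, function(v) {
  others <- setdiff(base_vars, v)
  if (length(others) == 0) return(1)
  1 / (1 - summary(lm(reformulate(others, v), data = dat))$r.squared)
}, numeric(1))
```

Printing bss, f_test, coef_table, pct_var, and vif shows the results of each step. In a report, each of these outputs would be accompanied by a sentence or two explaining what it shows and what decision it supports.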

Length

There is no minimum or maximum length requirement: your report should be as long as it needs to be to both describe the process that you followed to arrive at your final model and justify why you made the decisions that you did.

Style

This project is a chance to practice professional writing about statistics, and so you should use an appropriate professional writing style throughout. You should write in complete sentences and paragraphs, and you should use words as you transition between figures and mathematical expressions. Your grammar and spelling should be good, and your final product should read like a carefully prepared technical report, not a hastily completed homework assignment.

R Code

Your final report may contain some R code, but big chunks of code should probably be suppressed (so that you can see their outputs, but not the code itself). If you’re writing your report in RMarkdown, to prevent the code in a code chunk from being visible when you knit your file, add echo = FALSE to the header of your code chunk:

```{r, echo = FALSE}
# Your code here
```

This will suppress the R code in your knitted file, but will still show outputs (so you’ll be able to see your plots, but not the code that generated those plots). Use this to create a cleaner look for your final report.

Note:  Don’t copy and paste the above code; it may not paste in the right format. Instead, type what you see above into R.
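
If most of your chunks should be hidden, one alternative is to set the option once for the whole document. The line below is a minimal sketch, assuming your report is knit with the knitr package (which RMarkdown uses to run code chunks); it would go in a chunk near the top of your file.

```r
# Place in the first chunk of the .Rmd file; later chunks inherit this default.
knitr::opts_chunk$set(echo = FALSE)
```

Any individual chunk can still override the default by putting echo = TRUE in its own header.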