ST221 Linear Statistical Modelling Assessed coursework 1 2022
Please read these instructions carefully!
This assignment counts for 15% of your final module mark. The maximum score for this coursework is 30 marks.
Your solutions must be produced using a word processor, R Markdown, or LaTeX. You may cut-and-paste R-output. Use a font size of 11pt or larger. Question sub-sections must be clearly labelled for ease of marking.
Solutions that are not submitted in a typed format will not be accepted as a submission.
You should convert your solutions into a single PDF file and submit it on the ST221 Moodle page. Please DO NOT add your name to your submission, so that marking can remain anonymous.
Please read Chapter 5 in the course guide, which gives details of the coursework procedures, including applying for extensions and lateness penalties. Please ensure that you submit in good time before the deadline. Penalties will apply if work is submitted more than 1 minute after the deadline unless an extension or waiver is granted. Coursework is not eligible for mitigating circumstances for the loss of work in progress. The penalty for late submission is 5% per 24-hour period encompassing a working day. However, no submission will be accepted more than 5 working days after the original deadline unless there is a pre-approved extension extending past the cut-off period.
If you have any queries about the coursework, please post them on the ST221 forum, but do not post any part of your solutions. You can also submit questions through the anonymous question form on Moodle.
Please be aware that your work will be submitted to TurnItIn, a piece of
plagiarism-detection software. Cases of suspected collusion or plagiarism will be followed
up as outlined in Section 5.3 of the course guide.
Make sure to read questions carefully. If asked to produce a plot, then please include the plot in your report. Make sure it is of appropriate scale and the axes are clearly labelled.
Include R code only if requested to do so.
Good luck with the assignment!
Download the file Nambe.csv from the module webpage and load it into R. The dataset consists of information about the production of various tableware products. After casting, each piece of tableware goes through a series of grinding and polishing steps.
The variables (adapted from the original dataset) are:
● Type: the type of product; a categorical variable with categories Bowl, Plate and Tray;
● Diam: the diameter of the product (in inches);
● Time: the total grinding and polishing time of the product (in minutes);
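As a rough illustration of the loading step, the sketch below reads the file and stores Type as a factor. It assumes that Nambe.csv has been saved in the current working directory and that its column names match the variable list above; the helper name load_nambe is hypothetical and not part of the module materials.

```r
# Hypothetical helper: read the coursework data and make Type categorical.
# Assumes the CSV has columns named Type, Diam and Time.
load_nambe <- function(path = "Nambe.csv") {
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  dat <- read.csv(path)
  dat$Type <- factor(dat$Type)  # treat product type as a categorical variable
  dat
}

# Only runs if the file is present in the working directory.
if (file.exists("Nambe.csv")) {
  nambe <- load_nambe()
  str(nambe)  # check variable names and types before modelling
}
```

Calling str() on the result is a quick way to confirm that Type is a factor and that Diam and Time are numeric.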
An engineer in the ceramic factory suggests that the total time it takes to grind and polish the product can be predicted from its diameter using an equation of the form
$$\text{time} = a \times \text{diameter}^{b} \qquad (1)$$
for some constants a and b.
(a) [4 marks] Produce a scatterplot of the grinding and polishing time against the diameter of the product. Use different colours and/or plotting symbols to show the type of the product. Your plot should be clearly labelled and contain a legend.
(b) [3 marks] Explain how the relationship in equation (1) as suggested by the engineer can be transformed into a relationship that can be modelled by a simple linear regression. What assumptions are you making about the errors in the original relationship, that is in the original scale of the response?
(c) [4 marks] Fit the simple linear regression model in (b), that is a model for the transformed relationship. Produce a ‘residuals versus fitted values’ plot and a scale-location plot of the fitted model and discuss whether linearity and homoscedasticity can be assumed to hold.
(d) [2 marks] Give a quantitative interpretation of the estimated slope parameter for the model in (c) in the original scale of the response variable Time.
(e) [2 marks] Predict the time (in minutes) that it will take to grind and polish a product
that has a diameter of 15 inches.
(f) [1 mark] Reproduce the plot from part (a) and add a curve that shows how, according to the model in (c), the predicted time for grinding and polishing changes with diameter.
(g) [3 marks] The engineer sees your plot and realises that the grinding and polishing time is proportional to the diameter of the product. They wonder whether a simple linear regression of time on diameter (possibly through the origin) would have sufficed. Discuss whether this would have been a suitable alternative, supporting your answer with appropriate evidence.
(h) [2 marks] The engineer then suggests that the constant a in equation (1) may depend on the type of the product. Explain how to modify the model in (b)-(c) to accommodate this.
(i) [2 marks] Write out the model equations for the new linear model suggested in (h).
(j) [2 marks] Give a description of the jth row of the design matrix for the model suggested in (h) using indicator variables.
(k) [2 marks] Fit the model in (h). Reproduce the plot from part (a) and add a curve for each product type that shows how, according to the fitted model, the predicted time for grinding and polishing changes with diameter.
Hint: If c is an array of observations for the variable C and d is an array of corresponding observations for the variable D, then to compute the predictions from a model m with
explanatory variables C and D, we use a command of the form
predict(m, list(C=c, D=d)).
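To illustrate the form of this command, here is a minimal sketch on simulated data that has nothing to do with the Nambe dataset. The data frame dat, the response y, and the names c_new and d_new are all made up for the example; c_new and d_new play the roles of c and d in the hint and are renamed only to avoid confusion with R's c() function.

```r
# Simulated data (illustrative only): a numeric variable C and a factor D.
set.seed(1)
dat <- data.frame(
  C = runif(30, 1, 10),
  D = factor(rep(c("u", "v", "w"), each = 10))
)
dat$y <- 2 + 0.5 * dat$C + as.numeric(dat$D) + rnorm(30, sd = 0.2)

# Linear model with explanatory variables C and D.
m <- lm(y ~ C + D, data = dat)

# New values at which to predict: a grid for C, with D held at level "v".
c_new <- seq(1, 10, length.out = 5)
d_new <- factor(rep("v", 5), levels = levels(dat$D))

# The names in the list must match the variable names used in the model formula.
pred <- predict(m, list(C = c_new, D = d_new))
print(pred)
```

The two arrays in the list need to have the same length, and the factor should use the same levels as in the data used to fit the model.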
(l) [3 marks] Judging from the plots in (f) and (k), do you think that the engineer was right to suggest that the constant a in equation (1) should depend on the type of the product? Justify your answer.