STAT4051 Fall 2022 Midterm II
Problem I. Short Answer (28 points total)
Show all work for full credit unless noted otherwise.
A homework assignment in STAT 4051 had the following problem:
“A study was conducted in guinea pigs to investigate the effect of dose of vitamin C and delivery method on the length of odontoblasts (cells responsible for tooth growth). Data were recorded for 51 guinea pigs. The variables are:
• dose of vitamin C (0.5, 1, or 2 mg/day)
• delivery method (orange juice or ascorbic acid)
The researcher was interested in these particular levels. Analyze the ToothGrowth2 dataset to determine what factors, dose and/or method, affect odontoblast length at α = 0.05. If a factor is statistically significant, then there is interest in determining which levels are statistically different.”
Here is a snapshot of the dataset:
> head(ToothGrowth2,2)
  dose length method
1  0.5    4.2     VC
2  0.5   11.5     VC
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Two students, Mickey and Minnie, were doing their homework together. They both fit a two-way ANOVA model to the above data, but got different results! See below.
Mickey’s Results:
> model.Mickey<-aov(length~method*dose,data=ToothGrowth2)
> summary(model.Mickey)
            Df Sum Sq Mean Sq F value   Pr(>F)
method       1   22.9    22.9   1.586    0.214
dose         2 2118.5  1059.2  73.483 6.68e-15 ***
method:dose  2   33.5    16.8   1.162    0.322
Residuals   45  648.7    14.4
Minnie’s Results:
> model.Minnie<-aov(length~dose*method,data=ToothGrowth2)
> summary(model.Minnie)
            Df Sum Sq Mean Sq F value   Pr(>F)
dose         2 1860.8   930.4  64.546 6.02e-14 ***
method       1  280.5   280.5  19.460 6.34e-05 ***
dose:method  2   33.5    16.8   1.162    0.322
Residuals   45  648.7    14.4
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Since Mickey and Minnie’s results did not agree, someone is wrong, or both are wrong.
1. (10 points) Complete the following ANOVA table with the correct results. Hint: You do not need to compute anything.
              Df    Sum Sq   Mean Sq   F value   Pr(>F)
dose          ___   ______   ______    _______   _______
method        ___   ______   ______    _______   _______
dose:method   2     33.5     16.8      1.162     0.322
Residuals     45    648.7    14.4
2. (4 points) Can the method:dose (or dose:method) interaction be dropped from the model? Explain.
3. (6 points) Using SS notation, show why Mickey’s method sum of squares does not equal Minnie’s method sum of squares.
4. (4 points) What type of model did the students fit? Fixed effects, random effects, or mixed model? Explain.
5. (4 points) What observations, if any, are correlated?
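The order dependence seen in Mickey's and Minnie's output can be reproduced in a minimal sketch. It assumes simulated, unbalanced data (the data frame `sim`, its cell counts, and its effect sizes are hypothetical and are not the ToothGrowth2 data) and relies on the fact that `aov()` reports sequential sums of squares, so each term is adjusted only for the terms listed before it.

```r
# Hypothetical unbalanced two-factor data; NOT the ToothGrowth2 dataset.
set.seed(1)
sim <- data.frame(
  dose   = factor(rep(c("0.5", "1", "2"), times = c(9, 6, 6))),
  method = factor(c(rep("OJ", 6), rep("VC", 3),   # dose 0.5
                    rep("OJ", 2), rep("VC", 4),   # dose 1
                    rep("OJ", 3), rep("VC", 3)))  # dose 2
)
sim$length <- 10 + 5 * as.integer(sim$dose) +
  3 * (sim$method == "OJ") + rnorm(nrow(sim), sd = 2)

# Same model, two term orders.
fit_md <- aov(length ~ method * dose, data = sim)  # method entered first
fit_dm <- aov(length ~ dose * method, data = sim)  # dose entered first

ss_md <- summary(fit_md)[[1]]
ss_dm <- summary(fit_dm)[[1]]
rownames(ss_md) <- trimws(rownames(ss_md))
rownames(ss_dm) <- trimws(rownames(ss_dm))

# Main-effect sums of squares depend on the order of entry ...
print(c(method_first  = ss_md["method", "Sum Sq"],
        method_second = ss_dm["method", "Sum Sq"]))
# ... while the interaction and residual rows do not.
print(c(interaction = ss_md[3, "Sum Sq"], residual = ss_md[4, "Sum Sq"]))
print(c(interaction = ss_dm[3, "Sum Sq"], residual = ss_dm[4, "Sum Sq"]))
```

In a balanced design the two orders would give identical main-effect rows, so a difference of this kind points to unequal cell sizes rather than to a coding mistake.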
Problem II. Short Answer (36 points total)
Grocery stores are always interested in how well the products on the store shelves sell. An experiment was designed to test whether the amount of discount given on products affected the amount of sales of that product. There are three levels of discount (5%, 10%, and 15%), and sales were held for a week. The total number of products sold during the week of the sale was recorded. The researchers also recorded the wholesale price of the items put on sale.
The variables that were collected in the experiment are:
• Discount: 5%, 10%, or 15%
• Price: wholesale price (in dollars)
• Sales: number sold during one week
1. (18 points) I fit several models to the data. Use the information in the Grocery Handout to determine your final model. Examine ALL models, model.0 - model.6, and select the model that best fits the data. Use α = 0.05. Discuss ALL models to justify your selection of your final model.
Your response should be a logical flow of the steps you take to arrive at your final model and what decision you made at each step based on the output. Simply writing down the final model will gain few points if any. Remember your goal is to address the question of interest which is whether the amount of discount given on the products affects the amount of sales of that product.
Summary Information:
> mean(Price)
[1] 8.524444
> tapply(Grocery$Price,Grocery$Discount,mean)
   5.00%   10.00%   15.00%
8.499167 8.592500 8.481667
> tapply(Grocery$Sales,Grocery$Discount,mean)
   5.00%   10.00%   15.00%
203.5000 217.7500 213.5833
2. (4 points) Is the ANOVA model a better model than your final ANCOVA model from question 1? Explain.
3. (4 points) For your final model, is the rate of change in Sales with respect to Price the same for all levels of Discount? Explain.
4. (10 points) Based on model.3, estimate the covariate-adjusted means for two of the three Discounts. Please note, model.3 may or may not be the best final model.
Note: A correct formula and correct plug-ins of observed data will get full credit. No need to perform calculations.
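One common way to write a covariate-adjusted mean under a common-slope ANCOVA model is the group's mean response minus the slope times the gap between the group's covariate mean and the overall covariate mean. The sketch below plugs in the printed Summary Information and the Price coefficient from the model.3 output; the variable names are illustrative, and the use of this particular formula is an assumption rather than something the exam states.

```r
# Values copied from the Summary Information and the model.3 coefficient table.
beta_price  <- 79.591     # common Price slope reported for model.3
grand_price <- 8.524444   # mean(Price)
price_means <- c("5.00%" = 8.499167, "10.00%" = 8.592500, "15.00%" = 8.481667)
sales_means <- c("5.00%" = 203.5000, "10.00%" = 217.7500, "15.00%" = 213.5833)

# Adjusted mean = group mean of Sales - slope * (group mean Price - overall mean Price)
adjusted_means <- sales_means - beta_price * (price_means - grand_price)
print(round(adjusted_means, 2))

# Differences from the 5% group; these should agree, up to rounding,
# with the Discount coefficients printed for model.3.
print(round(adjusted_means - adjusted_means["5.00%"], 3))
```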
Problem III. Short Answer (34 points total)
The following data come from an experiment to test whether paper brightness depends on the shift operator. Interest is not in any particular operator (Sheldon, 1960).
Sheldon, F. (1960) "Statistical techniques applied to production situations". Industrial and Engineering Chemistry, 52, 507-509.
> model.1<-aov(bright~operator,data=pulp)
> summary(model.1)
            Df Sum Sq Mean Sq F value Pr(>F)
operator     3   1.34  0.4467   4.204 0.0226 *
Residuals   16   1.70  0.1062
1. (6 points) What statistical assumptions do I need to assess for this model? Be specific.
2. (4 points) The operator effect is statistically significant. How do I interpret this effect?
3. (8 points) Estimate the variance of operator.
4. (4 points) Estimate the variance of brightness.
5. (4 points) Estimate the intraclass correlation.
6. (4 points) Interpret the intraclass correlation computed above.
7. (4 points) What type of model was fit to the data? Fixed effects, random effects, or mixed model? Explain.
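For a balanced one-way layout, the usual method-of-moments (ANOVA) estimators can be computed directly from the two mean squares in the table above. The sketch below assumes 5 observations per operator, inferred from the degrees of freedom (4 operators, 20 observations) and an assumption of balance; the output itself does not state the group sizes, and the variable names are illustrative.

```r
# Mean squares copied from the summary(model.1) output above.
ms_operator <- 0.4467
ms_residual <- 0.1062

# Assumption: balanced design, 20 observations / 4 operators = 5 per operator.
n_per_operator <- 5

# Method-of-moments estimate of the operator variance component.
var_operator <- (ms_operator - ms_residual) / n_per_operator

# Variance of a single brightness reading = operator component + error component.
var_bright <- var_operator + ms_residual

# Intraclass correlation: share of the total variance due to operators.
icc <- var_operator / var_bright

print(c(var_operator = var_operator, var_bright = var_bright, icc = icc))
```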
Problem IV. Short Answer (6 points total)
A randomized complete block experiment was conducted to investigate a drug added to the feed of chicks to promote growth. There were three levels of drug:
• standard feed (control)
• standard feed plus low dose of drug
• standard feed plus high dose of drug
The following table reports the weight (in pounds) for each chick after 6 weeks. There are 15 chicks in the study.
                   Drug Dose
Block   Control   Low Dose   High Dose
1       3.93      3.99       3.96
2       3.78      3.96       3.94
3       3.88      3.96       4.02
4       3.93      4.03       4.06
5       3.84      4.10       3.94
1. (6 points) Complete the Source and df columns of the ANOVA table for this dataset:
Source df
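As a way to check a degrees-of-freedom breakdown, the chick weights from the table above can be entered and fit as a randomized complete block design. This is a sketch only: the data frame name `chicks` and the column names are illustrative, and the additive `block + drug` model is an assumption about how the design would typically be analyzed.

```r
# Weights transcribed from the table above: one row per block,
# columns in the order Control, Low Dose, High Dose.
chicks <- data.frame(
  block  = factor(rep(1:5, each = 3)),
  drug   = factor(rep(c("Control", "Low", "High"), times = 5),
                  levels = c("Control", "Low", "High")),
  weight = c(3.93, 3.99, 3.96,
             3.78, 3.96, 3.94,
             3.88, 3.96, 4.02,
             3.93, 4.03, 4.06,
             3.84, 4.10, 3.94)
)

# Additive RCBD model: blocks and treatments, no block-by-treatment term
# (with one chick per cell, that term is the error).
fit_rcbd <- aov(weight ~ block + drug, data = chicks)
rcbd_table <- summary(fit_rcbd)[[1]]

# The Df column lists blocks, treatments, and error, in that order.
print(rcbd_table["Df"])
```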
Grocery Handout
> model.0<-lm(Sales~Price)
> summary(model.0)
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -466.384     24.631  -18.93   <2e-16 ***
Price         79.535      2.886   27.56   <2e-16 ***
---
Residual standard error: 6.954 on 34 degrees of freedom
Multiple R-squared: 0.9571, Adjusted R-squared: 0.9559
F-statistic: 759.3 on 1 and 34 DF, p-value: < 2.2e-16
> model.1<-aov(Price~Discount,data=Grocery)
> summary(model.1)
            Df Sum Sq Mean Sq F value Pr(>F)
Discount     2  0.085  0.0426   0.246  0.783
Residuals   33  5.719  0.1733
> adjusted.covariate<-resid(model.1)
> model.2<-aov(Sales~Discount,data=Grocery)
> summary(model.2)
            Df Sum Sq Mean Sq F value Pr(>F)
Discount     2   1288   644.2   0.573  0.569
Residuals   33  37074  1123.5
> model.3<-lm(Sales~Price + Discount,data=Grocery)
> summary(model.3)
Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)     -472.953     18.317 -25.820  < 2e-16 ***
Price             79.591      2.148  37.052  < 2e-16 ***
Discount10.00%     6.822      2.107   3.238   0.0028 **
Discount15.00%    11.476      2.098   5.471 5.04e-06 ***
---
Residual standard error: 5.137 on 32 degrees of freedom
Multiple R-squared: 0.978, Adjusted R-squared: 0.9759
F-statistic: 473.9 on 3 and 32 DF, p-value: < 2.2e-16
> anova(model.3)
Analysis of Variance Table
Response: Sales
          Df Sum Sq Mean Sq  F value    Pr(>F)
Price      1  36718   36718 1391.366 < 2.2e-16 ***
Discount   2    800     400   15.149 2.348e-05 ***
Residuals 32    844      26
---
> model.4<-lm(Sales~Price * Discount,data=Grocery)
> summary(model.4)
Coefficients:
                      Estimate Std. Error t value Pr(>|t|)
(Intercept)           -452.038     27.668 -16.338   <2e-16 ***
Price                   77.130      3.251  23.728   <2e-16 ***
Discount10.00%         -24.161     41.771  -0.578    0.567
Discount15.00%         -39.308     50.225  -0.783    0.440
Price:Discount10.00%     3.632      4.879   0.745    0.462
Price:Discount15.00%     5.982      5.913   1.012    0.320
---
Residual standard error: 5.204 on 30 degrees of freedom
Multiple R-squared: 0.9788, Adjusted R-squared: 0.9753
F-statistic: 277.3 on 5 and 30 DF, p-value: < 2.2e-16
> anova(model.4)
Analysis of Variance Table
Response: Sales
               Df Sum Sq Mean Sq   F value    Pr(>F)
Price           1  36718   36718 1355.9419 < 2.2e-16 ***
Discount        2    800     400   14.7636 3.436e-05 ***
Price:Discount  2     32      16    0.5926    0.5592
Residuals      30    812      27
---
> model.5<-lm(Sales~adjusted.covariate + Discount,data=Grocery)
> summary(model.5)
Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)         203.500      1.483 137.225  < 2e-16 ***
adjusted.covariate   79.591      2.148  37.052  < 2e-16 ***
Discount10.00%       14.250      2.097   6.795 1.11e-07 ***
Discount15.00%       10.083      2.097   4.808 3.47e-05 ***
---
Residual standard error: 5.137 on 32 degrees of freedom
Multiple R-squared: 0.978, Adjusted R-squared: 0.9759
F-statistic: 473.9 on 3 and 32 DF, p-value: < 2.2e-16
> anova(model.5)
Analysis of Variance Table
Response: Sales
                   Df Sum Sq Mean Sq F value    Pr(>F)
adjusted.covariate  1  36230   36230 1372.84 < 2.2e-16 ***
Discount            2   1288     644   24.41 3.648e-07 ***
Residuals          32    844      26
---
> model.6<-lm(Sales~adjusted.covariate * Discount,data=Grocery)
> summary(model.6)
Coefficients:
                                  Estimate Std. Error t value Pr(>|t|)
(Intercept)                        203.500      1.502 135.467  < 2e-16 ***
adjusted.covariate                  77.130      3.251  23.728  < 2e-16 ***
Discount10.00%                      14.250      2.124   6.708 1.97e-07 ***
Discount15.00%                      10.083      2.124   4.746 4.76e-05 ***
adjusted.covariate:Discount10.00%    3.632      4.879   0.745    0.462
adjusted.covariate:Discount15.00%    5.982      5.913   1.012    0.320
---
Residual standard error: 5.204 on 30 degrees of freedom
Multiple R-squared: 0.9788, Adjusted R-squared: 0.9753
F-statistic: 277.3 on 5 and 30 DF, p-value: < 2.2e-16
> anova(model.6)
Analysis of Variance Table
Response: Sales
                            Df Sum Sq Mean Sq   F value    Pr(>F)
adjusted.covariate           1  36230   36230 1337.8914 < 2.2e-16 ***
Discount                     2   1288     644   23.7888 6.468e-07 ***
adjusted.covariate:Discount  2     32      16    0.5926    0.5592
Residuals                   30    812      27
---
2023-03-29