闪电代写 -代写CS作业_CS代写_Finance代写_Economic代写_Statistics代写_代码代做_IT代写_加急帮助

Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

Biostatistics

Lab 5: Biological pest control

Background

• ANOVA is used to determine whether the means of several groups differ

• Post hoc comparisons via the Tukey-Kramer method are used to determine which groups are significantly different from each other without inflating the family-wise type I error

Objectives

• Use diagnostic plots to determine whether conducting ANOVA is appropriate

• Perform multiple post hoc comparisons while controlling for Type I error

Instructions

• Launch RStudio, open the R Markdown template file biostatistics-labTemplate.Rmd and save it as lab5-yourLastName .Rmd

• Complete the activities outlined below by adding the appropriate commands in R code chunks

• Embed your interpretation of the results below the relevant R code chunks

Code Examples

R has several datasets that are available internally (i.e. without needing to download them from any- where). Today, we’re going to use one of these datasets to investigate running ANOVAs. Begin by loading the ”mtcars”dataset and looking at the first few rows:

data(mtcars)

head(mtcars)

# mpg cyl disp hp drat wt qsec vs am gear carb

# Mazda RX4 21 .0 6 160 110 3 .90 2 .620 16 .46 0 1 4 4

# Mazda RX4 Wag 21 .0 6 160 110 3 .90 2 .875 17 .02 0 1 4 4

# Datsun 710 22 .8 4 108 93 3 .85 2 .320 18 .61 1 1 4 1

# Hornet 4 Drive 21 .4 6 258 110 3 .08 3 .215 19 .44 1 0 3 1

# Hornet Sportabout 18 .7 8 360 175 3 .15 3 .440 17 .02 0 0 3 2

# Valiant 18 .1 6 225 105 2 .76 3 .460 20 .22 1 0 3 1

Then, we’re going to visually inspect the miles per gallon for car models that have 4 cylinders versus cars that have 8 cylinders:

par(mfrow = c(1 , 1))

hist(mtcars$mpg[mtcars$cyl == 4], xlab = "Miles Per Gallon" ,

main = "MPG by Number of Cylinders" , col = adjustcolor("darkblue" ,

alpha .f = 0.3), xlim = c(10 , 34), ylim = c(0 , 4),

breaks = 10)

hist(mtcars$mpg[mtcars$cyl == 8], col = adjustcolor("darkgreen" ,

alpha .f = 0.3), breaks = 10 , add = T)

legend("topright" , legend = c("4 cylinders" , "8 cylinders"),

col = c(adjustcolor("darkblue" , alpha .f = 0.3), adjustcolor("darkgreen" ,

alpha .f = 0.3)), pch = 20 , pt .cex = 2)

MPG by Number of Cylinders

4 cylinders 8 cylinders

10 15 20 25 30

Miles Per Gallon

It looks like there’s a clear difference between the two groups, but we can test that using the aov function. It is often helpful to save the model as its own object, from which you have many further options (like using summary to look at the statistical significance or pulling out the coefficients).

# Keep only data to be analyzed in ANOVA

cars4 .8 <- mtcars[mtcars$cyl == 4 | mtcars$cyl == 8, ]

mpg .aov <- aov(mpg ~ cyl, data = cars4 .8)

summary(mpg .aov)

# Df Sum Sq Mean Sq F value Pr(>F)

# cyl 1 823 .7 823 .7 65 .65 3 .45e-08 ***

# Residuals 23 288 .6 12 .5

# ---

# Signif . codes: 0 '***' 0 .001 '**' 0 .01 '*' 0 .05 ' . ' 0 .1 ' ' 1

# str(mpg .aov) # Use this to look at all the

# attributes of the model

mpg .aov$coefficients

# (Intercept) # 38 .227273

cyl

-2 .890909

Remember that an ANOVA allows for the comparison of more than two groups, so we can easily change the data back to the full dataset to evaluate all the cars!

mpg .aov .all <- aov(mpg ~ cyl, data = mtcars)

summary(mpg .aov .all)

# Df Sum Sq Mean Sq F value Pr(>F)

# cyl 1 817 .7 817 .7 79 .56 6 .11e-10 ***

# Residuals 30 308 .3 10 .3

# ---

# Signif . codes: 0 '***' 0 .001 '**' 0 .01 '*' 0 .05 ' . ' 0 .1 ' ' 1

Activities

You are the chief scientist at a biotech company that is attempting to develop safe and effective products to combat agricultural pests, which cost the US economy millions of dollars annually. Your area of expertise is biological control via the use of bacteria that target pests without affecting other beneficial insects. The development of such effective biological control agents would be invaluable because they could potentially replace dangerous pesticides that kill indiscriminately and have a tendency to accumulate as they travel up to food chain, thus potentially harming humans.

Task 1: Identifying promising biological control agents

After two years of active research, your team has developed four biological control agents in the form of bacteria that target distinct biochemical pathways in pests controlling reproduction, metabolism, and immunity. To test their relative effectiveness, you decide to run an experiment whose results can be downloaded from the web:

# Download the data

d1 <- read .csv("http://faraway .neu .edu/biostats/lab5_dataset1 .csv")

1. To determine whether the biological control agent was successful in increasing the mortality of the pest, you decide to conduct an ANOVA. Formulate the null and alternative hypotheses.

2. Before conducting your analysis, plot the data to determine whether it adheres to the assumptions of ANOVA. Specifically, use function hist to determine whether mortality is normally distributed. Does the distribution of mortality seem normal?

3. If the data are not normally distributed, a log10 transformation can sometimes help. Plot a histogram of log10-transformed and non-transformed pest mortality on two separate panels. Is a transformation necessary in this case? When doing further analysis, would you use the raw data or the transformed data?

4. Now you decide to run an ANOVA to determine whether the mean mortality differed between your treatment groups. In R, ANOVA models are fit in the following way mod .aov=aov(y ∼ x, data=d), where y is the response variable in dataset d, x is the explanatory or grouping variable in dataset d, and mod is the variable in which the ANOVA aov object is stored. Starting with this example, replace x, y, mod, and d with the variables of interest in each case. To extract different parts of the mod object, R provides a number of important utility functions such as summary(mod), which provides the standard ANOVA table, coef(mod) which provides the coefficients, and resid(mod) which extracts the residuals. Other components of aov objects can be accessed directly via the $ operator (look at str(mod) for a list of these objects and consult the help file for additional details via ?aov). Use aov to fit an ANOVA model to your data and use the summary function to display the ANOVA table.

5. Use the ANOVA table to determine whether there is a difference in the mean mortality of the pest across the different groups.

6. To determine which groups have significantly different means, you decide to run post-hoc pairwise comparisons using the Tukey-Kramer method. Why not simply use t-tests between each pair of means?

7. Use function TukeyHSD to perform the post-hoc pairwise comparisons on the aov object and print the results to the screen. The p adj column contains the adjusted p-values for the comparisons described in the first column.

8. To visualize these differences, use the plot function to create a figure of the TukeyHSD object and interpret the results.

9. To improve upon the first figure, you decide to use a barplot to display the means and standard errors of mortality for each group, along with labels over each bar to identify means that are significantly different. Because the process of labeling each bar can be time-consuming and error- prone when dealing with many groups, download and use the package multcompView by issuing the following commands: install .packages("multcompView"); require(multcompView)

10. First, determine what labels go on each bar to distinguish which values are significantly different. To do this, you will have to use the multcompLetters function. Extract appropriate letters can be found in the TukeyHSD object you created earlier. In this case, the component would be mod .tukey$treatment[, "p adj"], if you named your TukeyHSD object mod .tukey. Modify the following example to work with your objects. Then, re-order the labels (control, t1, t2, t2, and then t4).

labels <- multcompLetters(mod .tukey$treatment[, "p adj"])$Letters

11. Now, use function aggregate to compute the mean and standard error of pest mortality for each group, and plot the results using a barplot. Don’t forget to add the text labels that you worked so hard to create in the previous step. Start with this template for making the barplot:

# Compute means

means <- aggregate(XXXX ~ XXXX, mean, data = XXXX)

# Compute standard errors

stderrs <- aggregate(XXXX ~ XXXX, FUN = function(x) XXXX,

data = XXXX)

# Create barplot figure

bp <- barplot(XXXX, names = XXXX, ylab = "Mean mortality of pest" ,

ylim = range(XXXX))

# Add standard error bars

arrows(x0 = XXXX, y0 = XXXX, y1 = XXXX, code = 3 , angle = 90 ,

length = 0.1)

box()

# Add text labels

text(x = XXXX, y = XXXX, XXXX, pos = 3)

12. Describe the results from the figure. Specifically, which treatment seems most effective in increas- ing pest mortality and why?

Task 2: Mitigating impacts on pest enemies

Although we have found some promising biological control agents, we must ensure that they do not interact antagonistically with natural enemies of pests such as predators. Indeed, if the biological control agents prevent the predators from consuming the pests, the net effect of the biological control agent could be either neutral, if the gains in pest mortality caused by the biological control agent are offset by the losses in mortality due to lower pest predation, or even negative! To ensure that the biological control agents do not affect pest mortality rates due to natural predators, you conduct a second experiment in which you place the the predators of the pests in isolated chambers and expose them to each treatment (control, t1-t4). Download the data from the web:

# Download the data

d2 <- read .csv("http://faraway .neu .edu/biostats/lab5_dataset2 .csv")

1. To ensure that the biological control agent does not inadvertently reduce the effectiveness of natural predators of the pests, you decide to fit an ANOVA model to the data (via function aov). Does the ANOVA table suggest that the mean mortality is different across the treatments?

2. To determine which groups are different from each other in terms of mean pest mortality, perform post-hoc pairwise comparisons using the Tukey-Kramer method by following the procedure outlined above.

3. Then, plot the mean and standard error of pest mortality for each group using a barplot and assign the appropriate label to each bar based on statistical significance. Note: use the same workflow as for the first labeled barplot.

4. Describe the results and determine which treatment groups experienced the lowest rates of mor- tality for the enemy of the pest.

5. Based on these results and those obtained in the last section, what treatment or biological control agent would you recommend and why?