Stat 311 Homework 7

This assignment mostly parallels HW6 and requires the same data sets.

•   Use the t distribution for all problems involving means. Use t.test for any problem with raw data. Use an appropriate custom function for any test based on summary data only.

•   For all hypothesis tests, be sure to state the null and alternative hypotheses using symbols, the value of the test statistic (including df if applicable), the p-value, the decision, and a conclusion in the context of the problem. You do not need to calculate or report critical cut off values.

•   When code is required, put all the code for a given problem in a chunk and then put all the writing AFTER the chunk. That means any stated hypotheses would go after the code chunk for the problem. We want all the writing in one place.

1.   Identifying H0 and Ha from Triola. First identify whether the given statement describes the null or the alternative hypothesis. Then write out the hypotheses using symbols. (0.5 points each)

a)  More than 25% of Internet users pay bills online.

b)  Most households have telephones.

c)   The mean weight of women who won Miss America titles is equal to 121 lb.

d)  The percentage of workers who got a job through their college is no more than 2%.

e)  Plain M&M candies have a mean weight that is at least 0.8535 g.

f)   The success rate with surgery is better than the success rate with splinting.

g)  Unsuccessful job applicants are from a population with a greater mean age than the mean age of successful applicants.

Iris Data Set

2.   Use R as a calculator to do the “by-hand” calculations (meaning, show the steps as if you were solving by hand) to test the claim that the population mean sepal length for iris species virginica is different from the observed sample mean sepal length for iris species versicolor, using a 5% significance level. Check your answer using an R function call. [Hint: you are using the mean sepal length for the versicolor species as a constant.] (1 hypotheses, 1 by-hand calculation, 1 R output, 1 reporting required numbers, 1 decision, 1 interpretation in context for a total of 6 points)

3.   Test the claim that the population mean sepal width for iris species versicolor is different from the population mean sepal width for the virginica species, using a 5% significance level. Assume the population variances are not equal. [Do this problem using an R function only; no by-hand calculations are necessary.] (1 hypotheses, 1 R output, 1 reporting required numbers, 1 decision, 1 interpretation in context for a total of 5 points)
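The workflow described in problems 2 and 3 can be sketched as follows. This is only an illustration under an assumption: that R's built-in iris data frame is the iris data set used in the course. The variable names (mu0, t_stat, and so on) are made up for the sketch. The first part computes a one-sample t statistic step by step and checks it against t.test; the second part shows a two-sample call that does not pool variances (Welch's test, which is also what t.test does by default).

```r
# Assumption: the built-in iris data frame stands in for the course data set.
virginica  <- iris$Sepal.Length[iris$Species == "virginica"]
versicolor <- iris$Sepal.Length[iris$Species == "versicolor"]

# "By-hand" one-sample t test: versicolor's sample mean is a fixed constant.
mu0  <- mean(versicolor)
n    <- length(virginica)
xbar <- mean(virginica)
s    <- sd(virginica)

se     <- s / sqrt(n)               # standard error of the sample mean
t_stat <- (xbar - mu0) / se         # test statistic
df     <- n - 1                     # degrees of freedom
p_val  <- 2 * pt(-abs(t_stat), df)  # two-sided p-value

c(t = t_stat, df = df, p.value = p_val)

# Check the by-hand numbers with a function call.
check <- t.test(virginica, mu = mu0, alternative = "two.sided")
check

# Two-sample test on sepal width without pooling variances (Welch).
welch <- t.test(
  iris$Sepal.Width[iris$Species == "versicolor"],
  iris$Sepal.Width[iris$Species == "virginica"],
  var.equal = FALSE
)
welch
```

The Welch output reports a fractional df, which is expected when the variances are not pooled.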

Hair/Eye Color Data Set

4.   Use the hair/eye color data set to complete parts (a) – (c).

a)  Use R as a calculator to do the “by-hand” calculations (meaning, show the steps as if you were solving by hand) to test the claim that the population proportion of green-eyed students having red hair is different from the proportion of hazel-eyed students having red hair, using a 5% significance level. Show your work in the R chunk. Assume large sample conditions are met. [Be sure to follow all steps for hypothesis testing for this part.] (1 hypotheses, 1 by-hand calculations, 1 reporting required numbers, 1 decision, 1 interpretation in context for a total of 5 points)

b)  Repeat the test in part a) using prop.test in R. You do not need to restate the hypotheses and  other information, as you should get comparable results and the same conclusion. Show that the    square root of the chi-square test statistic from prop.test is equal to the z-score you got in part (a), within rounding error. (1 point)

c)  In the context of this problem, what does it mean if a Type 2 error was committed? Do you think there are any significant consequences if a Type 2 error was made? (1 point)
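The relationship between the by-hand z-score and prop.test in parts (a) and (b) can be sketched as below. Two assumptions are made here: that R's built-in HairEyeColor table is the hair/eye color data set meant by the assignment, and that the by-hand test uses the pooled proportion in the standard error. Note that prop.test applies a continuity correction by default, so the square root of its chi-square statistic matches |z| only when correct = FALSE is set.

```r
# Assumption: the built-in HairEyeColor table (Hair x Eye x Sex) is the data set.
tab <- margin.table(HairEyeColor, c(1, 2))  # collapse over Sex: Hair x Eye

x <- tab["Red", c("Green", "Hazel")]        # red-haired counts per eye color
n <- colSums(tab)[c("Green", "Hazel")]      # group sizes per eye color

# "By-hand" two-proportion z test with a pooled standard error.
p_hat  <- x / n
p_pool <- sum(x) / sum(n)
se     <- sqrt(p_pool * (1 - p_pool) * sum(1 / n))
z      <- unname((p_hat[1] - p_hat[2]) / se)
p_val  <- 2 * pnorm(-abs(z))                # two-sided p-value

# Same test via prop.test, with the continuity correction turned off.
res <- prop.test(x = as.vector(x), n = as.vector(n), correct = FALSE)

c(z = z,
  sqrt_chisq  = sqrt(unname(res$statistic)),
  p_by_hand   = p_val,
  p_prop_test = res$p.value)
```

The square root of the chi-square statistic is always non-negative, so it should be compared with the absolute value of z.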

Cholesterol Data Set

5.   This problem was modified from here. This study used a cross-over trial experiment to investigate whether eating oat bran lowered serum cholesterol levels. Twelve individuals were randomly assigned a diet that included either oat bran or corn flakes. After two weeks on the initial diet, serum cholesterol (mmol/L) was measured and then participants were “crossed-over” to the other diet. After two weeks on the second diet, cholesterol levels were measured again.

a)  Using a 5% significance level, test the claim that a diet that includes oat bran decreases serum cholesterol using an R function only (by-hand calculations are not needed). [Be sure to follow all steps for hypothesis testing for this part.] (1 hypotheses, 1 R output, 1 reporting required numbers, 1 decision, 1 interpretation in context for a total of 5 points)

b)  Construct an appropriate confidence interval that is equivalent to the test in part (a) only using an R function. Explain your choice of interval, then report and interpret the interval. (2 points)

c)  In the context of this problem, what does it mean if a Type 1 error was committed? Do you think there are any significant consequences if a Type 1 error was made? (1 point)
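Because each participant is measured on both diets, a paired analysis is one natural reading of parts (a) and (b). The sketch below uses simulated placeholder values, not the study's measurements, and the vector names corn_flakes and oat_bran are hypothetical. It shows that a one-sided paired t.test also returns a one-sided confidence bound, and that the paired test is the same as a one-sample test on the differences.

```r
# Simulated placeholder data only; these are NOT the study's cholesterol values.
set.seed(311)
corn_flakes <- rnorm(12, mean = 4.4, sd = 0.8)
oat_bran    <- corn_flakes - rnorm(12, mean = 0.3, sd = 0.3)

# Differences are oat_bran - corn_flakes, so "oat bran lowers cholesterol"
# corresponds to a mean difference below zero.
paired <- t.test(oat_bran, corn_flakes,
                 paired = TRUE, alternative = "less", conf.level = 0.95)
paired

# With alternative = "less", the interval is one-sided: (-Inf, upper bound).
paired$conf.int

# Equivalent one-sample test on the within-person differences.
d <- oat_bran - corn_flakes
one_sample <- t.test(d, mu = 0, alternative = "less")
one_sample
```

A one-sided test at the 5% level rejects exactly when zero lies above the 95% upper confidence bound, which is why the one-sided interval pairs with the one-sided test.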

No Data Set

6.   In clinical experiments involving distinct groups of independent samples, it is important that the groups be similar in the important ways that affect the experiment. In an experiment designed to test the effectiveness of paroxetine for treating bipolar depression, subjects were measured using the Hamilton depression scale with the summary results given below (based on data from “Double-Blind, Placebo-Controlled Comparison of Imipramine and Paroxetine in the Treatment of Bipolar Depression,” by Nemeroff et al., American Journal of Psychiatry, Vol. 158, No. 6). [Lower scores indicate lower depression.]

 

n

 

s

Treatment

18

22.5

3.77

Placebo

25

25.2

3.85

a)  Use a 0.05 significance level to test the claim that the treatment and placebo groups come from populations with the same mean, using by-hand calculations shown in R. Assume equal population variances. (1 hypotheses, 1 by-hand calculations, 1 reporting required numbers, 1 decision, 1 interpretation in context for a total of 5 points)

b)  Do you agree or disagree with the variance assumption made in part (a)? Support your answer by considering the rule of thumb presented in lecture, combined with your understanding of the considerations regarding pooling or not pooling variances. (1 point)
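A pooled two-sample t test from summary statistics can be sketched with a small custom function. The function name pooled_t_summary and its argument names are hypothetical, and the sketch reads the three columns of the table above as n, sample mean, and s. The last lines compute the ratio of the larger to the smaller standard deviation; comparing that ratio with 2 is one common rule of thumb, and the version presented in lecture may differ.

```r
# Hypothetical helper: pooled (equal-variance) two-sample t test from summaries.
pooled_t_summary <- function(n1, xbar1, s1, n2, xbar2, s2) {
  df  <- n1 + n2 - 2
  sp2 <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df  # pooled variance
  se  <- sqrt(sp2 * (1 / n1 + 1 / n2))             # standard error of the difference
  t   <- (xbar1 - xbar2) / se                      # test statistic
  p   <- 2 * pt(-abs(t), df)                       # two-sided p-value
  list(statistic = t, df = df, pooled_sd = sqrt(sp2), p.value = p)
}

# Treatment vs. placebo, reading the table columns as n, mean, s.
res <- pooled_t_summary(n1 = 18, xbar1 = 22.5, s1 = 3.77,
                        n2 = 25, xbar2 = 25.2, s2 = 3.85)
res

# Ratio of larger to smaller sample SD (assumed rule of thumb: compare with 2).
sd_ratio <- max(3.77, 3.85) / min(3.77, 3.85)
sd_ratio
```

On raw data, the same statistic and p-value should come out of t.test(x, y, var.equal = TRUE), which gives a way to sanity-check the helper.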