Stat 311 Homework 7

This assignment requires the same data sets used in HW6.

•    Use the t distribution for all problems involving means. Use t.test for any problems with raw data.

•    For all hypothesis tests, be sure to state the null and alternative hypotheses using symbols, the value of the test statistic (including df if applicable), the p-value, the decision, and a conclusion in the context of the problem. You do not need to calculate or report critical cutoff values.

•    Round t-scores and CIs to two decimal places. Report p-values to four decimal places.

•    In Rmarkdown you can use $\mu$ to get μ, $\mu_{1}$ to get μ1, or $p_{T}$ to get pT, for example. You can use the same notation to specify the null and alternative hypotheses, such as $H_{0}: \mu_{1} = \mu_{2}$ versus $H_{a}: \mu_{1} \ne \mu_{2}$, where \ne gives ≠. You can also just use notation like mu[1], H0, Ha, != for not equal, etc.

•    When code is required, put all the code for a given problem in a chunk and then put ALL the writing AFTER the chunk. That means any stated hypotheses go after the code chunk for the problem. We want all the writing in one place and will not read writing before the code chunk.
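
As a rough illustration of the reporting conventions above, the sketch below runs t.test on a small made-up vector and pulls out the rounded pieces you would then write up after the chunk. The data, the null value of 5, and the object names (x, res, t_stat, ci, p_val) are assumptions chosen for illustration; nothing here comes from the HW6 data sets.

```r
# Hypothetical raw data, invented for illustration only
x <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.4, 5.2, 5.7)

# One-sample t test of H0: mu = 5 versus Ha: mu != 5 (5 is a made-up null value)
res <- t.test(x, mu = 5, alternative = "two.sided", conf.level = 0.95)

t_stat <- round(unname(res$statistic), 2)                # t-score, two decimals
df     <- unname(res$parameter)                          # degrees of freedom
ci     <- round(as.numeric(res$conf.int), 2)             # CI limits, two decimals
p_val  <- formatC(res$p.value, format = "f", digits = 4) # p-value, four decimals (as text)

res  # full t.test output, printed inside the chunk
```

The hypotheses, the reported numbers, the decision, and the conclusion in context would all be written below the chunk, not above it.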

1.   Identifying H0 and Ha, from Triola. First identify whether the given statement is a statement about the null or the alternative hypothesis. Then write out the hypotheses using symbols. (0.5 points each)

a)  More than 25% of Internet users pay bills online.

b)  Most households have telephones.

c)   The mean weight of women who won Miss America titles is equal to 121 lb.

d)  The percentage of workers who got a job through their college is no more than 2%.

e)  Plain M&M candies have a mean weight of at least 0.8535 g.

f)   The success rate with surgery is better than the success rate with splinting.

g)  Unsuccessful job applicants are from a population with a greater mean age than the mean age of successful applicants.

Iris Data Set

2.   Use R as a calculator to do the “by-hand” calculations (meaning, show the steps as if you were solving by hand) to test the claim that the population mean sepal width for iris species virginica is different than the observed sample mean sepal width for iris species versicolor, using a 5% significance level. Check your answer using an R function call. [Hint: you are using the mean sepal width for the versicolor species as a constant, so this is a one-sample test.] (1 hypotheses, 1 by-hand calculation, 1 R output, 1 reporting required numbers, 1 decision, 1 interpretation in context for a total of 6 points)
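
The general shape of a “by-hand” one-sample t calculation followed by a function-call check might look like the sketch below. It deliberately uses a different variable and species (setosa sepal length) and a made-up reference value mu0, and it assumes the course data matches R's built-in iris data frame; treat it as a pattern, not as the solution to this problem.

```r
# Assumption: R's built-in iris data frame; variable and species chosen only for illustration
x   <- iris$Sepal.Length[iris$Species == "setosa"]
mu0 <- 5   # hypothetical constant for H0: mu = mu0

# "By-hand" steps
n     <- length(x)
xbar  <- mean(x)
s     <- sd(x)
se    <- s / sqrt(n)               # standard error of the sample mean
t_obs <- (xbar - mu0) / se         # test statistic
df    <- n - 1
p_val <- 2 * pt(-abs(t_obs), df)   # two-sided p-value

# Check against the built-in function
check <- t.test(x, mu = mu0, alternative = "two.sided")
c(by_hand_t = t_obs, t_test_t = unname(check$statistic))
```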

3.   Test the claim that the population mean petal length for iris species versicolor is different than the population mean petal length for the virginica species, using a 5% significance level. Assume the population variances are not equal. [Do this problem using an R function only, meaning no by-hand calculations are necessary.] (1 hypotheses, 1 R output, 1 reporting required numbers, 1 decision, 1 interpretation in context for a total of 5 points)
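
A two-sample test without the equal-variance assumption can be requested from t.test with var.equal = FALSE, which gives the Welch version. The sketch below shows the call on a different pair (sepal length for setosa versus versicolor), again assuming R's built-in iris data frame; the object names are illustrative.

```r
# Assumption: R's built-in iris data frame; groups and variable chosen only for illustration
setosa     <- iris$Sepal.Length[iris$Species == "setosa"]
versicolor <- iris$Sepal.Length[iris$Species == "versicolor"]

# Welch two-sample t test: H0: mu1 = mu2 versus Ha: mu1 != mu2
welch <- t.test(setosa, versicolor,
                alternative = "two.sided",
                var.equal   = FALSE,   # do not pool the variances
                conf.level  = 0.95)

welch$statistic   # t-score
welch$parameter   # Welch (Satterthwaite) df, usually not a whole number
welch$p.value
```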

Diet and Health Study Data Set

4.   Use the diet/health data set to complete parts (a) – (c).

a)  Use R as a calculator to do the “by-hand” calculations (meaning, show the steps as if you were solving by hand) to test the claim that the population proportion of AHA diets showing fatal heart attacks is different than the proportion of Mediterranean diets having fatal heart attacks, using a 5% significance level. Show your work in the R chunk. Assume large sample conditions are met. [be sure to follow all steps for hypothesis testing for this part] (1 hypotheses, 1 by-hand calculations, 1 reporting required numbers, 1 decision, 1 interpretation in context for a total of 5 points)

b)  Repeat the test in part (a) using prop.test with argument correct=FALSE in R. You do not need to restate the hypotheses and other information, as you should get comparable results and the same conclusion. Show that the square root of the chi-square test statistic from prop.test is equal to the z-score you got in part (a), up to sign and within rounding error. (1 point) [Hint: if √χ² is not close to your z-score from part (a), then your z-score or your call to prop.test is incorrect.] (2 points)

c)   In the context of this problem, what does it mean if a Type 2 error was committed? Do you think there are any significant consequences if a Type 2 error was made? (1 point)
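
The relationship used in parts (a) and (b) can be sketched with made-up counts: a pooled two-proportion z statistic computed step by step, then prop.test with correct = FALSE, whose chi-square statistic should be the square of that z. The counts and object names below are hypothetical and are not taken from the diet/health data set.

```r
# Hypothetical counts, invented for illustration only
x1 <- 30; n1 <- 200   # "successes" and sample size, group 1
x2 <- 18; n2 <- 210   # "successes" and sample size, group 2

# "By-hand" pooled two-proportion z test of H0: p1 = p2 versus Ha: p1 != p2
p1_hat <- x1 / n1
p2_hat <- x2 / n2
p_pool <- (x1 + x2) / (n1 + n2)                         # pooled proportion under H0
se     <- sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z      <- (p1_hat - p2_hat) / se
p_val  <- 2 * pnorm(-abs(z))                            # two-sided p-value

# Same test via prop.test, without the continuity correction
pt_res <- prop.test(x = c(x1, x2), n = c(n1, n2), correct = FALSE)

c(abs_z = abs(z), sqrt_chisq = sqrt(unname(pt_res$statistic)))
```

The square root of the chi-square statistic is always nonnegative, so it matches the absolute value of z.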

Cholesterol Data Set

5.   This problem was modified from here. This study used a cross-over trial experiment to investigate whether eating oat bran lowered serum cholesterol levels. Twelve individuals were randomly assigned a diet that included either oat bran or corn flakes. After two weeks on the initial diet, serum cholesterol (mmol/L) was measured and then participants were “crossed-over” to the other diet. After two weeks on the second diet, cholesterol levels were measured again.

a)  Using a 5% significance level, test the claim that a diet that includes oat bran decreases serum cholesterol using an R function only (no by-hand needed). [be sure to follow all steps for hypothesis testing for this part] (1 hypotheses, 1 R output, 1 reporting required numbers, 1 decision, 1 interpretation in context for a total of 5 points)

b)  Construct an appropriate confidence interval that is equivalent to the test in part (a), using only an R function. Explain your choice of interval, then report and interpret the interval in the context of the problem. (2 points)

c)   In the context of this problem, what does it mean if a Type 1 error was committed? Do you think there are any significant consequences if a Type 1 error was made? (1 point)
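
For a design where each subject is measured under two conditions, t.test accepts paired = TRUE, and a one-sided alternative makes it report a one-sided confidence bound. The sketch below uses invented measurements and placeholder names (diet_a, diet_b); the direction of the alternative depends on the order of subtraction, so it has to be set to match your own hypotheses.

```r
# Hypothetical paired measurements (mmol/L), invented for illustration only
diet_a <- c(4.6, 5.1, 4.9, 5.4, 4.2, 5.0)
diet_b <- c(4.9, 5.3, 4.8, 5.9, 4.5, 5.2)

# Differences are taken as diet_a - diet_b, so Ha: mu_d < 0 is alternative = "less"
res <- t.test(diet_a, diet_b,
              paired      = TRUE,
              alternative = "less",
              conf.level  = 0.95)

res$statistic   # t-score on n - 1 = 5 df
res$p.value
res$conf.int    # one-sided interval: the lower limit is -Inf
```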

No Data Set

6.   In clinical experiments involving distinct groups of independent samples, it is important that the groups be similar in the important ways that affect the experiment. In an experiment designed to test the effectiveness of paroxetine for treating bipolar depression, subjects were measured using the Hamilton depression scale with the summary results given below (based on data from a “Double-Blind, Placebo-Controlled Comparison of Imipramine and Paroxetine in the Treatment of Bipolar Depression,” by Nemeroff et al., American Journal of Psychiatry, Vol. 158, No. 6). [lower scores indicate lower depression]

             n     x̄       s
Treatment    25    22.5    3.77
Placebo      18    25.2    3.85

a)  Use a 0.05 significance level to test the claim that the treatment and placebo groups come from populations with the same mean, using by-hand calculations shown in R. Assume equal population variances. (1 hypotheses, 1 by-hand calculations, 1 reporting required numbers, 1 decision, 1 interpretation in context for a total of 5 points)

b)  Check your answer for part (a), using tsum.test. Do the results match? (1 point)

c)  Do you agree or disagree with the variance assumption made in part (a)? Support your answer by considering the rule of thumb presented in lecture combined with your understanding of the considerations regarding whether or not to pool variances. (1 point)
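
When only summary statistics are available, a pooled two-sample t statistic can be built up in steps and then checked with tsum.test. The sketch below uses made-up summary values rather than the table above, and it assumes tsum.test comes from the BSDA package; the check is skipped if that package is not installed.

```r
# Hypothetical summary statistics, invented for illustration only
n1 <- 12; xbar1 <- 10.4; s1 <- 2.1
n2 <- 15; xbar2 <- 11.9; s2 <- 2.4

# "By-hand" pooled-variance t test of H0: mu1 = mu2 versus Ha: mu1 != mu2
df    <- n1 + n2 - 2
sp2   <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df   # pooled variance
se    <- sqrt(sp2 * (1 / n1 + 1 / n2))              # standard error of xbar1 - xbar2
t_obs <- (xbar1 - xbar2) / se
p_val <- 2 * pt(-abs(t_obs), df)                    # two-sided p-value

# Check with tsum.test (assumed to be provided by the BSDA package)
if (requireNamespace("BSDA", quietly = TRUE)) {
  print(BSDA::tsum.test(mean.x = xbar1, s.x = s1, n.x = n1,
                        mean.y = xbar2, s.y = s2, n.y = n2,
                        alternative = "two.sided", var.equal = TRUE))
}
```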