MTH6991/MTH791U/MTH791P Computational Statistics with R Exercise Sheet 3 Spring 2023

Published: 2023-03-03

Problems for handing in

1. (50 marks)

This question uses a dataset on QMPlus, which is not the same as the dataset for the first two exercise sheets. For each student, there should be a file called “exercise3 XYZ.txt”, where XYZ is your ID number (you need to be logged in to QMPlus). If you cannot see a file, please send me an email.

Hand in: an R script with all the R code used, plus a separate file with your solutions - write the solutions (briefly) in your own words rather than copying and pasting any console output from R.

The dataset contains two columns, labelled “before” and “after”. Assume these are the test marks for each student in a class before and after some extra teaching sessions. We want to find out whether there is a difference between each student's before and after marks, without assuming the data are normally distributed.

(a) Which type of data is this? Motivate your answer, also explaining the main differences between (two-sample) independent data and (one-sample) matched pair data.

(b) For this part, do the calculations by hand (and calculator). Using only the first three rows of the dataset, calculate the test statistic for the permutation test, which is the sum of the differences between the marks. Then find the null distribution for the permutation test of whether the before and after marks have the same distribution. Finally, calculate the p-value considering the alternative hypothesis of having different distributions.
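The by-hand enumeration for a matched-pairs permutation test can be sketched in R as follows. This is a minimal illustration, assuming three made-up pairs of marks (the vectors `before` and `after` below are hypothetical and not taken from any student's file). Under the null hypothesis each difference is equally likely to carry either sign, so the null distribution comes from all 2^3 = 8 sign assignments.

```r
# Hypothetical marks for three students (illustrative values only)
before <- c(52, 61, 47)
after  <- c(58, 59, 55)

d <- after - before     # paired differences: 6, -2, 8
t_obs <- sum(d)         # observed test statistic (sum of the differences)

# Enumerate all 2^3 = 8 ways of assigning a sign to each difference
signs <- as.matrix(expand.grid(rep(list(c(-1, 1)), length(d))))
null_stats <- as.vector(signs %*% abs(d))

# Null distribution: each possible value of the statistic with its probability
null_dist <- table(null_stats) / length(null_stats)

# Two-sided p-value: proportion of sign assignments at least as extreme as observed
p_value <- mean(abs(null_stats) >= abs(t_obs))
```

With only three pairs there are just eight equally likely outcomes, so the smallest attainable two-sided p-value is 2/8 = 0.25.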

(c) For this part, use the whole dataset. Use wilcox.test in R to test at the 5% significance level whether there is a difference in the before and after marks. Note that there are almost certainly tied values in the full dataset, so you will need to use a normal approximation.
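One possible shape for this call is sketched below. The data frame `marks` is a hypothetical stand-in with made-up values; with the real file it would presumably be read in with something like `read.table(..., header = TRUE)`, which is an assumption about the file layout. Setting `exact = FALSE` asks `wilcox.test` for the normal approximation.

```r
# Hypothetical data standing in for the real file (illustrative values only)
marks <- data.frame(
  before = c(52, 61, 47, 70, 55, 63, 49, 58, 66, 51),
  after  = c(58, 59, 55, 74, 55, 69, 53, 62, 64, 57)
)

# paired = TRUE gives the signed-rank test on the differences after - before;
# exact = FALSE requests the normal approximation, which is needed with ties
res <- wilcox.test(marks$after, marks$before,
                   paired = TRUE, alternative = "two.sided", exact = FALSE)

res$statistic                 # V: sum of the ranks of the positive differences
res$p.value                   # p-value from the normal approximation
reject <- res$p.value < 0.05  # TRUE if the null is rejected at the 5% level
```

Note that `wilcox.test` drops zero differences before ranking and applies a continuity correction by default (`correct = TRUE`).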

Additional problems

2) For this part, use the first six rows of the question 1 dataset. By hand, calculate the Wilcoxon signed-rank statistic W+ based on the differences between the before and after marks. Hence, using the dsignrank or psignrank functions in R, calculate the p-value for whether there is a difference in the marks.

Check this using wilcox.test.
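The link between W+, `psignrank` and `wilcox.test` can be sketched as below. This is an illustration under stated assumptions: the six pairs are made up, and they are chosen to have no ties and no zero differences so that the exact null distribution applies.

```r
# Hypothetical marks for six students (illustrative values only)
before <- c(52, 61, 47, 70, 55, 63)
after  <- c(58, 59, 55, 73, 56, 68)
d <- after - before          # 6, -2, 8, 3, 1, 5
n <- length(d)

# W+ is the sum of the ranks of |d| that belong to positive differences
r <- rank(abs(d))
w_plus <- sum(r[d > 0])

# Exact two-sided p-value: double the smaller tail probability, capped at 1
lower <- psignrank(w_plus, n)           # P(W+ <= w_plus)
upper <- 1 - psignrank(w_plus - 1, n)   # P(W+ >= w_plus)
p_value <- min(1, 2 * min(lower, upper))

# Cross-check: with no ties and small n, wilcox.test uses the exact distribution
check <- wilcox.test(after, before, paired = TRUE)
```

Here `w_plus` should equal `check$statistic` (reported as V) and `p_value` should equal `check$p.value`.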

3) Refer to Student's t-test paper mentioned in practical 4.

(a) What is the p-value reported for the sleep data, testing whether measurement 2 is greater than measurement 1? How does this compare with the p-value that you found with the permutation test at the end of practical 4?

(b) Carry out a one-sample, one-sided t-test for the same comparison. Does this agree with the paper? Also do a t-test of whether each set of measurements x and y is greater than 0 and see if these agree with the paper.
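These t-tests might be set up as in the sketch below. It assumes that the sleep data are R's built-in `sleep` data frame, and that x and y correspond to groups 1 and 2 respectively; both are assumptions about how practical 4 defines them.

```r
# Built-in sleep data: 'extra' is the increase in hours of sleep,
# 'group' identifies the drug; rows are ordered by patient within each group
x <- sleep$extra[sleep$group == 1]   # assumed to be measurement 1
y <- sleep$extra[sleep$group == 2]   # assumed to be measurement 2

# One-sample, one-sided t-test on the paired differences: H1 is mean(y - x) > 0
diff_test <- t.test(y - x, mu = 0, alternative = "greater")

# Separate one-sided tests of whether each set of measurements exceeds 0
x_test <- t.test(x, mu = 0, alternative = "greater")
y_test <- t.test(y, mu = 0, alternative = "greater")

# Collect the three p-values for comparison with the paper
p_values <- c(diff = diff_test$p.value, x = x_test$p.value, y = y_test$p.value)
p_values
```

Applying a one-sample test to the differences `y - x` is equivalent to calling `t.test(y, x, paired = TRUE, alternative = "greater")`.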