STATS 240: Design of Surveys And Experiments Assignment 3, 2022
TOTAL = 30 MARKS
The markers will mark based only on correct output and comments.
Questions 1 and 2 use data from the New Zealand Quality in Healthcare Study, which assessed the occurrence, impact and preventability of adverse events recorded in New Zealand public hospitals in 1998. Documentation can be found in New Zealand Quality of Healthcare Study.pdf.
Read the data set in NZqhs.240.csv into iNZight Lite.
First, specify a survey design with:
wt as the weighting variable,
stratum_id as the strata variable, and
cluster_id as the 1st stage clustering variable.
1. Associations 7 MARKS
(a) Produce the summary for the stay variable (length of hospital stay) by sex (male and female). (1 MARK)
(b) Use a t-test to test whether the length of hospital stay differs between males and females. Copy the output into your assignment. What do you conclude? (Comment on significance and, if the result is significant, on the direction and size of the effect.) (2 MARKS)
(c) Create a categorical variable for mdcgp – major diagnostic category.
NOTE: turn off survey design before creating this variable, then specify the survey design again.
Rename the levels according to the levels listed in New Zealand Quality of Healthcare Study.pdf, and name the final variable mdcgp_cat. Produce the summary for stay by mdcgp_cat. Which major diagnostic category has the longest mean length of stay? Which has the shortest? (2 MARKS)
(d) Use an ANOVA to test whether length of stay differs by major diagnostic category. Copy the output into your assignment. Is there evidence that length of stay differs by major diagnostic category? (2 MARKS)
2. Graphs 7 MARKS
Convert aey (adverse event) to a categorical variable.
NOTE: turn off survey design before converting this variable, then specify the survey design again.
Rename the levels according to the levels listed in New Zealand Quality of Healthcare Study.pdf, and name the final variable aey_cat.
(a) Create a bar graph of aey_cat (variable 1) by mdcgp_cat (variable 2). (2 MARKS)
From the graph answer:
(b) Which major diagnostic category is most likely to result in an adverse event? Which is least likely? (2 MARKS)
(c) Conduct a chi-squared test of the association between aey_cat and mdcgp_cat. Is there evidence of a difference in the likelihood of an adverse event between major diagnostic categories? (2 MARKS)
(d) What is the estimated difference in the proportion with an adverse event between musculoskeletal admissions and neonatal admissions? (1 MARK)
3. Stratified sampling 6 MARKS
The table below shows the population sizes for each of New Zealand’s 16 regions, as at the 2018 Census.
Region | Population
Northland | 179,076
Auckland | 1,571,718
Waikato | 458,202
Bay of Plenty | 308,499
Gisborne | 47,517
Hawke's Bay | 166,368
Taranaki | 117,561
Manawatu-Wanganui | 238,797
Wellington | 506,814
West Coast | 31,575
Canterbury | 599,694
Otago | 225,186
Southland | 97,467
Tasman | 52,389
Nelson | 50,880
Marlborough | 47,340
Suppose a survey sampled 1000 people from each region.
(a) Calculate the sampling fractions for Auckland and West Coast to five decimal places.
(b) What weight should Auckland and West Coast be given in analyses (one decimal place)?
(c) What is the total sample size?
(d) What is the sum of weights across all people in the sample?
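The quantities asked for in this question can be sketched in a few lines of Python. This is only an illustration, assuming the usual definitions for equal-probability sampling within a stratum: sampling fraction f = n / N and weight w = N / n. The stratum names and population sizes below are hypothetical and are not taken from the table above; the function names are likewise made up for this sketch.

```python
def sampling_fraction(sample_size: int, population_size: int) -> float:
    """Fraction of the stratum population that ends up in the sample (n / N)."""
    if not 0 < sample_size <= population_size:
        raise ValueError("need 0 < sample_size <= population_size")
    return sample_size / population_size


def design_weight(sample_size: int, population_size: int) -> float:
    """Number of population members each sampled person represents (N / n)."""
    if not 0 < sample_size <= population_size:
        raise ValueError("need 0 < sample_size <= population_size")
    return population_size / sample_size


# Hypothetical strata (illustrative values only, not the census figures).
strata = {"Stratum A": 250_000, "Stratum B": 40_000}
n_per_stratum = 1000  # same sample size drawn from every stratum

for name, population in strata.items():
    f = sampling_fraction(n_per_stratum, population)
    w = design_weight(n_per_stratum, population)
    print(f"{name}: sampling fraction = {f:.5f}, weight = {w:.1f}")

# Every sampled person in a stratum carries that stratum's weight,
# so the stratum contributes n * (N / n) to the sum of weights.
total_sample = n_per_stratum * len(strata)
total_weight = sum(
    n_per_stratum * design_weight(n_per_stratum, population)
    for population in strata.values()
)
print(f"total sample size = {total_sample}, sum of weights = {total_weight:.0f}")
```

Under these definitions a small stratum has a large sampling fraction and a small weight, and a large stratum has the reverse.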
4. Sample size calculations 10 MARKS
(a) A researcher wants to estimate the mean daily sugar intake among the 1,000 adults in their local town. They decide to take a random sample. In a small pilot study, the mean daily sugar intake from all sources was 36 grams and the standard deviation was 6 grams.
How large a sample of adults should be taken if they want the margin of error of their estimated mean to be no larger than 1 gram? Did the finite population correction adjustment make much difference? Comment on why you think it did or it didn’t. (5 MARKS)
Note: Use z = 1.96
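A minimal Python sketch of this kind of calculation follows. It assumes the textbook formula n0 = (z × s / e)² for a mean and one common form of the finite population correction, n = n0 / (1 + (n0 − 1) / N); your course notes may use the slightly different form n0 / (1 + n0 / N), so check which one is expected. The function names and the input values are hypothetical and are not the values from this question.

```python
import math


def sample_size_for_mean(sd: float, margin: float, z: float = 1.96) -> float:
    """Unrounded sample size for estimating a mean, ignoring population size."""
    if sd <= 0 or margin <= 0:
        raise ValueError("sd and margin must be positive")
    return (z * sd / margin) ** 2


def apply_fpc(n0: float, population_size: int) -> float:
    """Finite population correction (assumed form: n0 / (1 + (n0 - 1) / N))."""
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    return n0 / (1 + (n0 - 1) / population_size)


# Hypothetical inputs: pilot SD of 10, margin of error 2, population of 500.
n0 = sample_size_for_mean(sd=10, margin=2)
n_fpc = apply_fpc(n0, population_size=500)

# Round up so the margin of error is not exceeded.
print("without FPC:", math.ceil(n0))
print("with FPC:   ", math.ceil(n_fpc))
```

Comparing the two printed values shows how much the correction matters: the larger n0 is relative to the population size N, the more the corrected sample size shrinks.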
(b) The same researcher wants to estimate the prevalence of diabetes in the same town. In a similar town it was estimated that 10% of adults have diabetes. The researcher wants to determine the percentage of adults who have diabetes in their town by taking a simple random sample.
How large should this sample be if the margin of error of the estimate is to be no larger than 2 percentage points (0.02)? Did the finite population correction adjustment make much difference? Comment on why you think it did or it didn’t. (5 MARKS)
Note: use z = 1.96
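The corresponding calculation for a proportion can be sketched as below, assuming the textbook formula n0 = z² × p(1 − p) / e² and the same assumed form of the finite population correction, n = n0 / (1 + (n0 − 1) / N). The function names and the input values are hypothetical and are not the values from this question.

```python
import math


def sample_size_for_proportion(p: float, margin: float, z: float = 1.96) -> float:
    """Unrounded sample size for estimating a proportion, ignoring population size."""
    if not 0 < p < 1 or margin <= 0:
        raise ValueError("need 0 < p < 1 and margin > 0")
    return z ** 2 * p * (1 - p) / margin ** 2


def apply_fpc(n0: float, population_size: int) -> float:
    """Finite population correction (assumed form: n0 / (1 + (n0 - 1) / N))."""
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    return n0 / (1 + (n0 - 1) / population_size)


# Hypothetical inputs: anticipated proportion 0.30, margin 0.05, population 5000.
n0 = sample_size_for_proportion(p=0.30, margin=0.05)
n_fpc = apply_fpc(n0, population_size=5000)

# Round up so the margin of error is not exceeded.
print("without FPC:", math.ceil(n0))
print("with FPC:   ", math.ceil(n_fpc))
```

The margin must be entered on the proportion scale (0.05, not 5), matching the way p is expressed.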
2022-08-27