STAT1201 Analysis of Scientific Data Semester 1, 2022
Flu Vaccination
A new study to determine the effectiveness of winter flu vaccinations measures the benefits of a flu shot for school children. The study comprises two random samples of school children who were vaccinated. One sample is from a primary school and the other from a high school.
Question 1
According to previous studies, 50% of vaccinated primary school children contract winter flu. If a study takes a random sample of 17 vaccinated primary school children, the probability that fewer than 4 children in the sample contract flu is
0.0064
0.0182
0.0245
0.5738
Question 2
For the random sample in Question 1, the probability that at least 4 children but no more than 8 children contract flu is
0.0245
0.3082
0.4936
0.5984
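The probabilities in Questions 1 and 2 are sums of binomial terms. The following is a minimal Python sketch (standard library only) of how such values could be checked. The helper name `binom_cdf` is an illustrative choice, not part of the course material; in R the same cumulative probability is available as `pbinom`.

```python
from math import comb

def binom_cdf(x: int, n: int, p: float) -> float:
    """Return P(X <= x) for X ~ Binomial(n, p)."""
    if x < 0:
        return 0.0
    upper = min(x, n)
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(upper + 1))

n, p = 17, 0.5  # sample size and flu probability stated in Question 1

# "Fewer than 4" is the event X <= 3.
p_fewer_than_4 = binom_cdf(3, n, p)

# "At least 4 but no more than 8" is 4 <= X <= 8, i.e. P(X <= 8) - P(X <= 3).
p_4_to_8 = binom_cdf(8, n, p) - binom_cdf(3, n, p)

print(round(p_fewer_than_4, 4), round(p_4_to_8, 4))
```

Writing the interval as a difference of two cumulative probabilities avoids summing the individual terms by hand.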
Question 3
The new study claims that the probability of contracting the winter flu among vaccinated high school children is less than 0.5. Assume that p is the population proportion of vaccinated high school children who contract the flu. The appropriate framework to test the new study's claim is
H0 : p = 0 vs H1 : p ≠ 0
H0 : p > 0.5 vs H1 : p < 0.5
H0 : p = 0 vs H1 : p > 0
H0 : p = 0.5 vs H1 : p < 0.5
Question 4
The new study takes a random sample of 10 vaccinated high school children. Let x be the number of children in the sample who contract the flu. The p-value for the test can be calculated from a Binomial distribution using P(X ≤ x). The maximum number of children who can contract the flu to give evidence against the null hypothesis in Question 3 at the 5% level is
1
3
4
5
Question 5
Suppose the actual population proportion of vaccinated high school children who contract the flu is 30%. For a random sample of 10 vaccinated high school children and based on your answer to Question 4, the probability of making a Type II error is
0.3900
0.4472
0.8507
0.9718
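Questions 4 and 5 link the rejection cut-off of a lower-tailed binomial test to its Type II error. The sketch below shows one way to trace that reasoning in Python, assuming the null value p = 0.5 and "evidence at the 5% level" meaning a p-value of at most 0.05. The names `binom_cdf`, `cutoff` and `type_ii` are illustrative.

```python
from math import comb

def binom_cdf(x: int, n: int, p: float) -> float:
    """Return P(X <= x) for X ~ Binomial(n, p)."""
    if x < 0:
        return 0.0
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(min(x, n) + 1))

n = 10          # sample size from Question 4
p_null = 0.5    # assumed null value of p
alpha = 0.05    # significance level
p_true = 0.3    # actual proportion supposed in Question 5

# Largest count x whose lower-tail p-value P(X <= x | p_null) is still <= alpha.
cutoff = max(x for x in range(n + 1) if binom_cdf(x, n, p_null) <= alpha)

# Type II error: H0 is not rejected (X > cutoff) although p_true holds.
type_ii = 1 - binom_cdf(cutoff, n, p_true)

print(cutoff, round(type_ii, 4))
```

The Type II error depends on the cut-off chosen under the null hypothesis, so a different significance level would change both numbers.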
Question 6
The new study also carried out a test to determine whether the population proportion of unvaccinated school children contracting winter flu was higher than the population proportion of vaccinated school children. The Z test statistic to test this belief is found to be 1.702. The corresponding p-value is
0.0444
0.1721
0.4316
0.5415
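For a one-sided Z test such as the one in Question 6, the p-value is a tail area of the standard normal distribution. The sketch below assumes the statistic was formed so that large positive values support the "higher for unvaccinated children" alternative, which makes the p-value the upper-tail area.

```python
from statistics import NormalDist

z = 1.702  # test statistic reported in Question 6

# Assumption: the alternative is upper-tailed, so the p-value is P(Z >= z).
p_value = 1 - NormalDist().cdf(z)

print(round(p_value, 4))
```

In R the same tail area is available through `pnorm` with `lower.tail = FALSE`.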
Question 7
Suppose that the new study uses a level of significance of 0.05 to test the claim in Question 6. The probability of Type I error is
0.025
0.05
0.95
0.975
Question 8
Based on previous studies of school children who were vaccinated and contracted the flu, the time in hours that the flu symptoms last is assumed to follow a normal distribution with a mean of 20.2 hours and a standard deviation of 7.5 hours. The probability that a randomly selected school child has flu symptoms for more than 24 hours is
0.1531
0.3062
0.6938
0.8469
Question 9
Suppose that a random sample of 5 vaccinated school children is taken. Assuming the distribution in Question 8, the probability that the mean time with symptoms is less than 18 hours is
0.1964
0.2559
0.3846
0.7441
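Questions 8 and 9 differ only in which distribution is used: a single observation has standard deviation σ, while the mean of n observations has standard deviation σ/√n. The following Python sketch illustrates the two calculations with the values stated in those questions; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 20.2, 7.5  # mean and standard deviation (hours) from Question 8

# One child: symptom time ~ Normal(mu, sigma).
single = NormalDist(mu, sigma)
p_more_than_24 = 1 - single.cdf(24)

# Mean of n = 5 children: sample mean ~ Normal(mu, sigma / sqrt(n)).
n = 5
sample_mean = NormalDist(mu, sigma / sqrt(n))
p_mean_below_18 = sample_mean.cdf(18)

print(round(p_more_than_24, 4), round(p_mean_below_18, 4))
```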
Question 10
The researcher conducting the new study believes that the mean time with flu symptoms tends to be greater than 20.2 hours for unvaccinated school children. She takes a random sample of 10 unvaccinated children who contracted flu and records their times with symptoms. The appropriate non-parametric test the researcher should use is the
Sign test
F test
Chi-square test
Wilcoxon Rank-sum test
Cat Ownership and Psychotic Episodes
A study investigates whether cat ownership by mothers during pregnancy is associated with their children experiencing psychotic episodes. The study uses data from a hospital database. The data includes children aged from 10 to 19 years.
Download the CSV data using the link below and read it into RStudio:
Cats.csv
The data contains the following variables:
Ownership: Cat ownership by mother in pregnancy (Cat/NoCat)
Age: Age of the child (years)
PE: The child has experienced psychotic episodes (Yes/No)
Question 1
How would you describe this research?
Observational study
Randomised comparative study
Randomised comparative blind study
Randomised comparative double-blind study
Question 2
What type of variable is Ownership?
Nominal variable
Continuous variable
Ordinal variable
Discrete variable
Question 3
The average age of children who have experienced a psychotic episode is
13.48 years
14.27 years
14.85 years
15.00 years
Question 4
The probability that a randomly selected child whose mother owned a cat has had a psychotic episode is
0.350
0.650
0.765
0.770
Question 5
The appropriate statistical test to investigate whether psychotic episodes in children are related to their mothers owning a cat during the pregnancy is a
Two-way ANOVA
Chi-square test
Two-sample t test
One-way ANOVA
Question 6
Assuming no association, the expected number of children who experienced psychotic episodes and whose mother owned a cat during the pregnancy is
9.83
16.95
23.05
26.00
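Under the assumption of no association, each expected count is (row total × column total) / grand total, and the chi-square statistic sums (observed − expected)² / expected over all cells. The sketch below illustrates this on a hypothetical 2×2 table. The counts are made up for illustration and are not taken from Cats.csv.

```python
# Hypothetical 2x2 counts for illustration only; they are NOT taken from Cats.csv.
# Rows: Ownership (Cat, NoCat). Columns: PE (Yes, No).
observed = [
    [12, 18],  # Cat
    [8, 22],   # NoCat
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under no association: (row total * column total) / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square statistic, computed without a continuity correction.
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

print(expected[0][0], round(chi_sq, 3))
```

Note that R's `chisq.test` applies a continuity correction to 2×2 tables by default, so its statistic can differ from the uncorrected value computed here.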
Question 7
The test statistic for the statistical test in Question 5 is
1.150
1.424
1.907
2.665
Question 8
Based on this analysis, you can conclude that there is
no evidence to suggest that psychotic episodes in children are related to their mothers owning a cat during the pregnancy (p = 0.167)
no evidence to suggest that psychotic episodes in children are related to their mothers owning a cat during the pregnancy (p = 0.263)
no evidence to suggest that psychotic episodes in children are related to their mothers owning a cat during the pregnancy (p = 0.833)
weak evidence to suggest that psychotic episodes in children are related to their mothers owning a cat during the pregnancy (p = 0.084)
Question 9
Ignoring cat ownership, the estimated odds of observing psychotic episodes in children is
0.424
0.576
0.735
1.360
Question 10
The study uses a logistic regression model to estimate the probability of a child experiencing a psychotic episode. The model uses the age of the child and cat ownership by their mothers during the pregnancy as predictors. The coefficient estimate for the variable Age in this model is
-3.1322
0.2013
0.8825
0.8912
Question 11
Based on the estimated model in Question 10, the margin of error in a 95% confidence interval for the population Age coefficient is
0.1092
0.1796
0.2140
0.5891
Question 12
Based on the model in Question 10, the estimated probability of experiencing a psychotic episode for a 12-year-old child whose mother owned cats during the pregnancy is
0.0418
0.3281
0.5414
1.1805
Question 13
Based on the model in Question 10, the estimated odds ratio of experiencing a psychotic episode between a 16-year-old child and a 12-year-old child whose mothers did not own cats during the pregnancies is
0.805
1.185
2.237
3.035
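In a logistic regression the linear predictor gives the log-odds, so a fitted probability is obtained through the inverse logit, and an odds ratio for an age difference d (other predictors held fixed) reduces to exp(β_Age × d). The sketch below illustrates both ideas. The coefficients are hypothetical placeholders rather than estimates from Cats.csv, and the 0/1 coding of cat ownership is an assumption; R's default coding takes the alphabetically first level as the baseline.

```python
from math import exp

# Hypothetical coefficients for illustration only; NOT estimates from Cats.csv.
B0, B_AGE, B_CAT = -2.0, 0.1, 0.5

def probability(age: float, cat: int) -> float:
    """Estimated P(PE = Yes); cat is assumed coded 1 for Cat and 0 for NoCat."""
    log_odds = B0 + B_AGE * age + B_CAT * cat
    return 1 / (1 + exp(-log_odds))

def odds(p: float) -> float:
    """Convert a probability to odds."""
    return p / (1 - p)

# Odds ratio between a 16-year-old and a 12-year-old, both with cat = 0.
p12 = probability(12, cat=0)
p16 = probability(16, cat=0)
odds_ratio = odds(p16) / odds(p12)

# The same ratio from the Age coefficient alone.
shortcut = exp(B_AGE * (16 - 12))

print(round(odds_ratio, 4), round(shortcut, 4))
```

Because the intercept and the ownership term cancel in the ratio, only the Age coefficient and the age difference matter for the odds ratio.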
Sleep Quality and Academic Performance
A study investigated the impact of sleep deprivation on academic performance in university medical students. An individual's sleep habit was measured using a sleep quality index. A higher index number indicates poorer sleep quality. Academic performance was measured by the student's cumulative grade point average. In addition, the student's physical activity, as measured by the average time for exercise per week, and their behavioural risk factors such as alcohol use, smoking habit and coffee use were recorded.
Download the CSV data using the link below and read it into RStudio:
GPA.csv
The data contains the following variables:
Gender: Gender of the student (Male/Female)
Alcohol: Whether the student drinks alcoholic beverages (Yes/No)
Exercise: Average hours of exercise per week
SQI: Sleep Quality Index (a higher index number indicates poorer sleep quality)
GPA: Cumulative Grade Point Average
Question 1
The researchers claim that female students have better sleep quality than male students. The appropriate statistical test for their claim is a
Two-sample t-test
Chi-square test
Correlation test
Two-way ANOVA test
Question 2
Suppose μF and μM define the population mean sleep quality indices for females and males, respectively. The appropriate null and alternative hypotheses to test the claim in Question 1 are
H0 : μF = μM vs H1 : μF ≠ μM
H0 : μF = μM vs H1 : μF < μM
H0 : μF > μM vs H1 : μF ≤ μM
H0 : μF = μM vs H1 : μF > μM
Question 3
The test of the hypotheses in Question 2 indicates that there is
moderate evidence to suggest that female students have better sleep quality than male students (p = 0.025)
weak evidence to suggest that female students have better sleep quality than male students (p = 0.070)
no evidence to suggest that female students have better sleep quality than male students (p = 0.228)
no evidence to suggest that female students have better sleep quality than male students (p = 0.457)
Question 4
In general, if the study uses a smaller sample size, the probability of a Type II error will
stay the same
decrease
increase
either increase or decrease
Question 5
The association between academic performance and sleep quality is modelled using a simple linear regression. The appropriate population regression equation is
SQI = β0 + β1 GPA + U
GPA = β0 + β1 SQI + U
SQI = β0 + β1 GPA² + U
GPA = β0 + β1 SQI² + U
Question 6
The slope coefficient of the estimated linear regression equation in Question 5 is
-0.150
-0.124
-0.093
5.747
Question 7
The lower bound of a 95% confidence interval for the slope coefficient in Question 5 is
-0.245
-0.225
-0.196
Question 8
Using the model in Question 5, the researcher examines whether there is a statistically significant negative linear relationship between academic performance and sleep quality. The researcher tests
H0 : β1 = 0 vs H1 : β1 ≠ 0
H0 : β1 = 0 vs H1 : β1 < 0
H0 : β1 = 0 vs H1 : β1 > 0
H0 : β0 = 0 vs H1 : β0 > 0
Question 9
Using the model in Question 5, the p-value for the hypothesis test in Question 8 is
0.022
0.183
0.855
0.876
Question 10
The association between academic performance and sleep quality may be obscured by the students' engagement in physical activity and behavioural risk factors. Assuming there are no interactions, fit a multiple regression model for academic performance using sleep quality, exercise and alcohol usage as predictors. The estimated coefficient for the variable Exercise in this model is
0.0473
0.0532
0.0665
0.0700
Question 11
In the fitted regression model in Question 10, the degrees of freedom used to test the statistical significance of the population parameter corresponding to the variable Exercise is
50
51
52
53
Question 12
Based on the model in Question 10, you can conclude that there is
moderate evidence to suggest that academic performance is associated with exercise after taking into account the effect of sleep quality and alcohol usage (p = 0.043)
no evidence to suggest that academic performance is associated with exercise after taking into account the effect of sleep quality and alcohol usage (p = 0.414)
no evidence to suggest that academic performance is associated with exercise after taking into account the effect of sleep quality and alcohol usage (p = 0.584)
no evidence to suggest that academic performance is associated with exercise after taking into account the effect of sleep quality and alcohol usage (p = 0.734)
Question 13
Based on the fitted regression model in Question 10, the estimated cumulative grade point average of a student who drinks alcohol, has a sleep quality index of 8.9, and averages 9 hours of physical exercise per week is
4.06
4.27
4.66
4.79
Question 14
A plot of residuals against fitted values for the model in Question 10 suggests that
the errors are normally distributed
the errors have equal variance
the errors are right skewed
the errors are correlated
Question 15
The regression sum of squares of the fitted model in Question 10 is
7.93
9.33
98.41
107.74
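For a least-squares fit with an intercept, the total sum of squares splits into a regression part and an error part: SST = SSR + SSE. The sketch below illustrates this decomposition on a small hypothetical data set with a single predictor. The numbers are made up and are not from GPA.csv; the same identity applies to a multiple regression such as the one in Question 10.

```python
# Hypothetical data for illustration only; NOT taken from GPA.csv.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept for y = b0 + b1 * x.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                 # total sum of squares
ssr = sum((fi - y_bar) ** 2 for fi in fitted)            # regression sum of squares
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # error (residual) sum of squares

print(round(ssr, 2), round(sse, 2), round(sst, 2))
```

In an ANOVA table for a fitted model, the regression sum of squares is therefore the total sum of squares minus the residual sum of squares.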
Penguin Bills
A study measures the dimensions of adult penguin bills for three species (Adelie, Gentoo and Chinstrap) living on an island. Bill lengths and bill depths (millimetres) were taken, as shown in the following diagram:
Question 1
In the study it is believed that there is a negative linear relationship between bill lengths and bill depths for the penguins. Define ρ as the population correlation coefficient between bill length and bill depth. The appropriate null and alternative hypotheses to test are
H0 : ρ = 0 vs H1 : ρ > 0
H0 : ρ > 0 vs H1 : ρ < 0
H0 : ρ = 0 vs H1 : ρ < 0
2023-02-07