
Biostatistics. December 13th, 2022

Take-home exam

Deadline Friday December 16th at 5 pm

----------------------------------------------------------------------------------------------------------------

You must solve the questions individually. The 5 questions give a total of 38 points.

A total of at least 24 points is required to pass, and a total of at least 35 points is required to pass with distinction.

Short, concise, and correct answers are enough for the maximum score.

Please submit your answers as a Word document on QPS.

Paste a relevant screenshot from your output for each of your answers.

----------------------------------------------------------------------------------------------------------------

1. You are planning a new study to compare mean bone density between runners and non-runners and must determine the sample size of the study. You want equal group sizes and have decided on the desired power of the study (80%) as well as the significance level (5%) you will use for the result.

a. Describe the null hypothesis, and a suitable alternative hypothesis. (1p)

b. Suggest a suitable statistical test. (1p)

c. What additional information do you need to be able to perform the sample size calculation? (1p)

2. A study of risk factors for low birth weight was conducted. Use the dataset Birthweight.

The dataset has information on 500 mothers and their babies. The following variables were collected:

mother_id    Mother ID

age          Mother's age at childbirth (in years)

birthweight  Birthweight (in grams)

smoke        Smoked during pregnancy (0=No, 1=Yes)

male         Child is male (0=No, 1=Yes)

married      Mother is married (0=No, 1=Yes)

black        Mother is black (0=No, 1=Yes)

childorder   Order of child (1-3)

a. Which test (not a regression model) is suitable for assessing whether there is a significant difference in birthweight between smokers and non-smokers, and why? Briefly explain. (1p)

b. What are the null and alternative hypotheses? (1p)

c. Run the test and interpret the results. (1p)

d. What assumptions are required to be able to use this test? Check the assumptions and explain. (1p)

e. What would you report as results from this study? (1p)

f. What would you say about the statistical uncertainty when reporting these results? (1p)

3. Use the dataset Birthweight.

a. Which regression model is suitable for assessing whether there is a significant difference in birthweight between smokers and non-smokers, and why? Briefly explain. (1p)

b. What are the null and alternative hypotheses? (1p)

c. Run the regression model and interpret the coefficients. (2p)

d. What assumptions are required to be able to use this regression model? Check the assumptions. (1p)

e. Write down the equation of the regression model. (1p) 

f. How large a proportion of the original variation was explained by the model? (1p)

g. Adjust the model for the potential confounders. (1p)

h. Explain why you think that those variables are confounders. (1p)

i. Report the results from this study. (1p)

j. What would you say about the statistical uncertainty when reporting these results? (1p)

k. Explain your conclusions. (1p)

4. Use the dataset Birthweight.

Low birth weight has been defined by WHO as weight at birth of < 2500 grams.

a. Create a variable low_bw according to the definition above. (1p)

b. Summarize the variables age, smoke, male, married, black and childorder separately for each of the two birth weight categories. Make a proper table in Word with appropriate summary measures, not a screenshot of computer output. (2p)

c. Which regression model can you use now, with the new low_bw variable as the outcome, to see if there is a significant difference in birthweight between smokers and non-smokers? (1p)

d. What are the null and alternative hypotheses? (1p)

e. Run the regression model and interpret the results. (2p)

f. Is smoking statistically significantly associated with the outcome? Why? Don't use the P-value. (1p)

g. Adjust the model for the potential confounders. (1p)

h. Explain why you think that those variables are confounders. (2p)

i. Report the results from this study. (1p)

j. Is the whole model significant? Why? (1p)

k. What would you say about the statistical uncertainty when reporting these results? (1p)

l. Explain your conclusions. (1p)
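
As a minimal sketch of the definition used in 4a, the Python snippet below codes low_bw as 1 when the birthweight is strictly below 2500 grams and as 0 otherwise, mirroring the 0/1 coding of the other variables. The exam does not prescribe any software, so the function name, the constant name and the example values are assumptions made for illustration only.

```python
# Threshold from the WHO definition quoted in question 4: < 2500 grams.
LOW_BW_THRESHOLD_GRAMS = 2500


def low_bw(birthweight_grams):
    """Return 1 if the birthweight is strictly below 2500 grams, else 0."""
    return 1 if birthweight_grams < LOW_BW_THRESHOLD_GRAMS else 0


# Made-up birthweights for illustration; NOT taken from the Birthweight dataset.
example_birthweights = [2499, 2500, 3400]
example_low_bw = [low_bw(w) for w in example_birthweights]
print(example_low_bw)
```

Note that a birthweight of exactly 2500 grams is not classified as low, since the definition uses a strict inequality.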

5. A small randomized controlled trial was carried out in a rare type of brain cancer with poor prognosis. The standard treatment A was compared to a new promising treatment B. The results of the trial are summarized in the Kaplan-Meier graph below:

 

Which of the following 4 statements are true and which are false? Justify each answer briefly.

a. The difference in median survival between the two groups is between 4 and 5 months. (0.5p)

b. A two-sided two-sample t-test comparing the mean survival times in the two groups is an appropriate test to evaluate the null hypothesis of equal treatment effect. (0.5p)

c. Cox regression is an appropriate method for estimation of the relative treatment effect. (0.5p)

d. The hazard ratio for treatment B vs treatment A is above 1.0. (0.5p)

Good luck!!