MA 214: Applied Statistics

Fall 2023

Problem Set 8

To receive a Pass, make a good-faith effort to answer all questions. You may answer on a separate piece of paper. Be sure to answer the questions in order using legible writing. Unreadable solutions will not receive a Pass.

Question 1. In this question you will look at a “real example” of logistic regression from a paper by BU's own Professor Judith Lok:

Lok, J.J., Hunt, P.W., Collier, A.C., Benson, C.A., Witt, M.D., Luque, A.E., Deeks, S.G. and Bosch, R.J., 2013. The impact of age on the prognostic capacity of CD8+ T-cell activation during suppressive antiretroviral therapy. AIDS (London, England), 27(13), p.2101.

Please take a look at Table 2 below. Lok et al. predicted the risk of what is called a composite outcome: AIDS-defining and non-AIDS-defining events (Yes/No; the model predicts the probability of “Yes”, i.e., that an event took place). A total of 94 “uncensored” participants had events: 12 AIDS-defining events and 82 non-AIDS-defining events (see note 1 in the table). The predictor of primary interest was CD8+ T-cell activation, a continuous variable.

Recall that in logistic regression, the odds ratio (OR) for a covariate x with coefficient β is defined as e^β.
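As a reminder of how this definition is used, here is a minimal Python sketch of the relationship between a coefficient and its odds ratio. The function names, the coefficient 0.25, and the baseline probability 0.10 are hypothetical values chosen for illustration; none of them come from Table 2.

```python
import math

def odds_ratio(beta: float) -> float:
    """Odds ratio for a one-unit increase in x: OR = e^beta."""
    return math.exp(beta)

def coefficient(or_value: float) -> float:
    """Recover the coefficient from a reported odds ratio: beta = ln(OR)."""
    return math.log(or_value)

def prob_after_increase(baseline_prob: float, beta: float, delta: float = 1.0) -> float:
    """Probability after x increases by delta, other covariates held fixed.

    The odds are multiplied by e^(beta * delta); the probability is then
    recovered from the new odds.
    """
    odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = odds * math.exp(beta * delta)
    return new_odds / (1.0 + new_odds)

# Hypothetical numbers, for illustration only.
beta = 0.25
baseline = 0.10
print("OR:", odds_ratio(beta))
print("probability after a one-unit increase:", prob_after_increase(baseline, beta))
```

Note that the OR multiplies the odds, not the probability, so the change in probability depends on the baseline.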

(a)  Interpret the OR=1.22 of CD8+ T-cell activation in the univariate model.

(b)  Interpret the OR=1.14 of CD8+ T-cell activation in the multivariate model with CD8+ T-cell activation, age, and the CD4 count.

(c)  What is the main difference in the interpretation in parts (a) and (b)?

(d) Why aren't the confidence intervals for the ORs symmetric? How do you think they were constructed? Check whether your proposed method leads to (approximately) symmetric confidence intervals.

(e)  Test whether CD8+ T-cell activation significantly predicts the composite outcome in the univariate model. Use a significance level of 0.05.

(f)  Test whether CD8+ T-cell activation significantly predicts the composite outcome in the multivariate model with CD8+ T-cell activation, age, and the CD4 count. Use a significance level of 0.05.

(g)  What would be your conclusion regarding the prognostic value of CD8+ T-cell activation in the univariate model? And in the multivariate models including age?

(h)  Can you explain why the conclusions in parts (e) and (f) are different? Is that concerning? Useful? Do you think it’s useful to measure CD8+ T-cell activation if one is interested in predicting these serious events?

Question 2. In this question we will consider an ordinal regression model for predicting the size of a company (big, medium, small) based on its profits. Here is the output from JMP:

(a)  Interpret the coefficient for intercept[big] in terms of probabilities of the company size.

(b)  Interpret the coefficient for intercept[medium] in terms of probabilities of the company size.

(c)  Interpret the coefficient for Profits ($M) in terms of log cumulative odds.

(d)  Interpret the point marked with a red ⋆ in the logistic plot above.

(e)  Calculate the probability of a company with profits of $100M being small.

(f)  Calculate the probability of a company with profits of $100M being medium.
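
For reference, the arithmetic behind turning cumulative logits into category probabilities can be sketched in Python as follows. This is a minimal sketch under stated assumptions: it assumes the cumulative-logit form log[P(Y ≤ j) / (1 − P(Y ≤ j))] = intercept[j] + β·x with categories ordered big, medium, small, which should be checked against the JMP output; the function names and all numeric values are hypothetical and are not taken from that output.

```python
import math

def logistic(z: float) -> float:
    """Inverse logit: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def category_probabilities(intercepts, slope, x):
    """Category probabilities under an assumed cumulative-logit model.

    intercepts: one intercept per category except the last, in increasing
                order (e.g. [intercept_big, intercept_medium]).
    slope:      coefficient of the predictor (assumed sign convention:
                logit P(Y <= j) = intercept_j + slope * x).
    Returns one probability per category, in the same order.
    """
    # Cumulative probabilities P(Y <= j); the last category always reaches 1.
    cumulative = [logistic(a + slope * x) for a in intercepts] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(Y = j) = P(Y <= j) - P(Y <= j-1)
        previous = c
    return probs

# Hypothetical coefficients, for illustration only.
p_big, p_medium, p_small = category_probabilities([-2.0, 0.5], slope=0.01, x=100.0)
print(p_big, p_medium, p_small)
```

The middle category's probability is a difference of two cumulative probabilities, and the last category's probability is one minus the largest cumulative probability.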