STAT3405: Introduction to Bayesian Computing and Statistics

Published: 2022-10-08

STAT4066: Bayesian Computing and Statistics

Computer Lab Problems #4

Important: This assignment is assessed. Your work for this task must be submitted by 11:59pm on Sunday, 9 October 2022.

Your solution should be submitted via LMS.

The expectations for a submission are:

• The questions are answered in complete sentences in the “Text Submission” part (click on “Write Submission”) of the LMS assignment submission page. Marks will be awarded for the correctness of the answers and for giving them in complete sentences.

• Numerical answers are rounded to an appropriate number of digits.

• The code used to answer the questions is saved in a plain text file and attached to the submission (see “Attach Files”), so that it is easy to re-run and check the code. Code should be in Stan or BUGS. Please do not attach Word documents, PDFs, or other binary formats; the only exception is OpenBUGS’ *.odc format, but even that format should be avoided if the file only contains BUGS code. Marks will be awarded for

– the correctness of the code, i.e. how easy it is to read it; and

– how easy it is to get the code to run, if necessary.

You may receive comments on the efficiency of your code, but there are no marks for efficiency.

Unless special consideration is granted, any student failing to submit work by the deadline will receive a penalty for late submission (as described in the unit outline).

Plagiarism: You are encouraged to discuss assignments with other students and to solve problems together. However, the work that you submit must be your sole effort (i.e. not copied from anyone else). If you are found guilty of plagiarism you may be penalised. You are reminded of the University’s policy on ‘Academic Conduct’ and ‘Academic Misconduct’ (including plagiarism):

http://www.student.uwa.edu.au/learning/resources/ace/conduct

Various material at the following URL might be helpful too:

https://www.uwa.edu.au/students/study-success/studysmarter

The file cholestg.txt contains data on cholesterol levels after heart attacks. These data are sourced from http://www.statsci.org/data/general/cholest.html, where you can find a detailed description of the data. We can read in the data with the following command, replacing “PATH_TO_DATA” with the appropriate path:


dat <- read.table("PATH_TO_DATA/cholestg.txt", header = TRUE)


We will concentrate on the measurements taken on day 2 after their attack for the 28 heart-attack patients and the measurements taken on the 30 people in the control group. These data can be extracted with the following commands:


x <- dat$cholest[dat$group == 1 & dat$day == 2]

z <- dat$cholest[dat$group == 2]


The object x now contains the measurements taken on the 28 heart-attack patients, i.e. the sample $x = (x_1, x_2, \ldots, x_{28})$, and the object z contains the measurements on the control group, i.e. the sample $z = (z_1, z_2, \ldots, z_{30})$. We will also refer to all observations together as $y = (x, z) = (x_1, x_2, \ldots, x_{28}, z_1, z_2, \ldots, z_{30})$.

Assume our model is that

• the $x_i$, $i = 1, \ldots, 28$, are realisations of i.i.d. random variables $X_i \sim N(\mu_x, \sigma^2)$,

• the $z_j$, $j = 1, \ldots, 30$, are realisations of i.i.d. random variables $Z_j \sim N(\mu_z, \sigma^2)$; and

• the $X_i$’s are independent of the $Z_j$’s.

Note that we use the same measurement error variance $\sigma^2$ for the two groups.

Our question of interest is whether heart-attack patients have, on average, higher cholesterol levels than people without a heart attack. That is, we are interested in $\delta = \mu_x - \mu_z$.

Implement this model using your favourite non-informative priors for $\mu_x$, $\mu_z$ and $\sigma^2$ (or some transformation of $\sigma^2$).
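As a possible cross-check on Stan or BUGS output, the posterior of this model can be sampled directly when the prior is taken to be $p(\mu_x, \mu_z, \sigma^2) \propto 1/\sigma^2$, which is one common non-informative choice and an assumption here, not a prescribed answer. Under that prior, $\sigma^2 \mid y$ is the pooled sum of squares divided by a $\chi^2_{n+m-2}$ variable, and each mean is normal given $\sigma^2$. The R sketch below uses simulated stand-in data rather than the cholesterol file; the seed, the simulation settings, and all variable names are hypothetical.

```r
set.seed(1)

# Simulated stand-in data (NOT the cholesterol measurements); the means and
# standard deviation used here are arbitrary illustration values.
x <- rnorm(28, mean = 250, sd = 40)
z <- rnorm(30, mean = 195, sd = 40)

n <- length(x)
m <- length(z)
nu <- n + m - 2                                       # pooled degrees of freedom
ss <- sum((x - mean(x))^2) + sum((z - mean(z))^2)     # pooled sum of squares

n_draws <- 10000

# Assumed prior: p(mu_x, mu_z, sigma^2) proportional to 1 / sigma^2.
# Then sigma^2 | y equals ss / chi^2_nu, and given sigma^2 the two means are
# independent normals centred at the sample means.
sigma2 <- ss / rchisq(n_draws, df = nu)
mu_x <- rnorm(n_draws, mean = mean(x), sd = sqrt(sigma2 / n))
mu_z <- rnorm(n_draws, mean = mean(z), sd = sqrt(sigma2 / m))

# Derived quantity of interest and two Monte Carlo summaries.
delta <- mu_x - mu_z
post_mean_delta <- mean(delta)
prob_delta_gt_40 <- mean(delta > 40)
```

The same summaries computed from MCMC output should agree with these up to Monte Carlo error if the same prior is used; a different prior choice will generally give slightly different numbers.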

Task 1:   In your submission, you should answer the following questions:

1. Which priors did you use for the parameters $\mu_x$, $\mu_z$ and $\sigma^2$?

2. What are your Bayesian estimates for the parameters $\mu_x$, $\mu_z$ and $\sigma$?

3. What is your Bayesian estimate for $\delta$ and for $P[\delta > 40 \mid y]$, the posterior probability that $\delta > 40$? The answer to this question should be phrased in the context of this analysis.

4. Perform some model checking by producing replicate data sets $y^{\mathrm{rep}}$ using the posterior predictive distribution and the following three statistics:

$$T_1(y) = \bar{x} = \frac{1}{28} \sum_{i=1}^{28} x_i$$

$$T_2(y) = \bar{z} = \frac{1}{30} \sum_{i=1}^{30} z_i$$

$$T_3(y) = \frac{\mathrm{sd}(x)}{\mathrm{sd}(z)}$$

where $\mathrm{sd}(\cdot)$ is the sample standard deviation.

Provide for each of these statistics the Bayesian p-value, i.e. (estimates of) $P[T_i(y^{\mathrm{rep}}) \ge T_i(y) \mid y]$, $i = 1, 2, 3$.

5. Do you think that the inferences that you made in question 3 are still sound given the results from question 4? Comment briefly.
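The posterior predictive check in question 4 can be sketched in R as follows. This is an illustration under stated assumptions: the data are simulated stand-ins, the posterior draws are generated directly under an assumed $1/\sigma^2$ prior (in practice they would come from the fitted Stan or BUGS model), $T_1$ and $T_2$ are taken to be the two sample means, and $T_3$ is taken to be the ratio sd(x)/sd(z). All object names are hypothetical.

```r
set.seed(2)

# Simulated stand-in data and posterior draws (assumptions for illustration;
# real draws would come from the fitted Stan or BUGS model).
x <- rnorm(28, 250, 40)
z <- rnorm(30, 195, 40)
n_draws <- 4000
ss <- sum((x - mean(x))^2) + sum((z - mean(z))^2)
sigma <- sqrt(ss / rchisq(n_draws, df = length(x) + length(z) - 2))
mu_x <- rnorm(n_draws, mean(x), sigma / sqrt(length(x)))
mu_z <- rnorm(n_draws, mean(z), sigma / sqrt(length(z)))

# Test statistics, each written as a function of the two samples.
stats <- list(
  T1 = function(x, z) mean(x),
  T2 = function(x, z) mean(z),
  T3 = function(x, z) sd(x) / sd(z)
)

# One replicate data set per posterior draw; evaluate every statistic on it.
# The result is a 3 x n_draws matrix with rows named T1, T2, T3.
t_rep <- sapply(seq_len(n_draws), function(s) {
  x_rep <- rnorm(length(x), mu_x[s], sigma[s])
  z_rep <- rnorm(length(z), mu_z[s], sigma[s])
  sapply(stats, function(f) f(x_rep, z_rep))
})

# Observed values of the statistics.
t_obs <- sapply(stats, function(f) f(x, z))

# Bayesian p-value: share of replicates with T(y_rep) >= T(y).
p_values <- rowMeans(t_rep >= t_obs)
```

Values of `p_values` close to 0 or 1 would indicate that the model struggles to reproduce the corresponding feature of the data; values near 0.5 indicate no evidence of misfit for that statistic.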

Task 2: Assume you are tested for a disease called conditionitis, a medical condition that you believe afflicts a proportion p of the population, with you modelling p as being beta distributed with parameters α = 1.41 and β = 547.12.
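To get a feel for what this prior says about the prevalence, its mean and a central interval can be computed in base R. The sketch below only summarises the stated Beta(1.41, 547.12) distribution; the variable names and the choice of a 95% interval are assumptions.

```r
alpha <- 1.41
beta <- 547.12

# Mean of a Beta(alpha, beta) distribution: alpha / (alpha + beta).
prior_mean <- alpha / (alpha + beta)

# Central 95% prior interval for the prevalence p.
prior_interval <- qbeta(c(0.025, 0.975), shape1 = alpha, shape2 = beta)
```

The prior mean is roughly 0.0026, i.e. the prior belief is that about a quarter of one percent of the population has the disease.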

The test result is positive, i.e., the test claims that you have the disease.  You immediately perform a second, independent test.

The first test is a conditionitis Antigen Rapid Test (Oral Fluid) and its clinical performance is given as:

                  PCR confirmed sample number   Correctly identified   Rate
Positive sample   101                           91                     90.1% (Sensitivity)
Negative sample   305                           303                    99.3% (Specificity)
Total             406                           394                    97.0% (Total Accuracy)

The second test is a conditionitis Antigen Rapid Test (Nasal Swab) and its clinical performance is given as:

                  PCR confirmed sample number   Correctly identified   Rate
Positive sample   311                           291                    93.6% (Sensitivity)
Negative sample   763                           763                    >99.9% (Specificity)
Total             1074                          1054                   98.1% (Total Accuracy)

Analyse these data and answer the following questions:

1. Which priors did you use in your modelling of the sensitivity and the specificity of each test?

2. What is the probability that you have conditionitis if the first test comes back positive?

3. What is the probability that you have conditionitis if the second test comes back positive too?

4. What is the probability that you have conditionitis if the second test comes back negative (after the first test came back positive)?

5. You realise that the two tests are produced by the same company. Do you think that the assumption that the results of the two tests given your (unknown) disease status are independent is tenable? Comment briefly.
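For orientation, questions 2 to 4 rest on Bayes' theorem. The block below writes it out for fixed values of the prevalence $p$ and of the sensitivity $\mathrm{se}_k$ and specificity $\mathrm{sp}_k$ of test $k$, assuming the two test results are conditionally independent given disease status $D$. The notation ($D$, $T_k$, $\mathrm{se}_k$, $\mathrm{sp}_k$) is introduced here for illustration and is not part of the assignment text.

```latex
\begin{align*}
P(D \mid T_1 = +)
  &= \frac{p\,\mathrm{se}_1}
          {p\,\mathrm{se}_1 + (1-p)(1-\mathrm{sp}_1)}, \\[4pt]
P(D \mid T_1 = +,\, T_2 = +)
  &= \frac{p\,\mathrm{se}_1\,\mathrm{se}_2}
          {p\,\mathrm{se}_1\,\mathrm{se}_2 + (1-p)(1-\mathrm{sp}_1)(1-\mathrm{sp}_2)}, \\[4pt]
P(D \mid T_1 = +,\, T_2 = -)
  &= \frac{p\,\mathrm{se}_1\,(1-\mathrm{se}_2)}
          {p\,\mathrm{se}_1\,(1-\mathrm{se}_2) + (1-p)(1-\mathrm{sp}_1)\,\mathrm{sp}_2}.
\end{align*}
```

In this task $p$, $\mathrm{se}_k$ and $\mathrm{sp}_k$ are not known exactly, so a Bayesian analysis would account for their uncertainty rather than plug in point estimates. Question 5 asks whether the conditional independence assumption behind the second and third expressions is reasonable.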