闪电代写 -代写CS作业_CS代写_Finance代写_Economic代写_Statistics代写_代码代做_IT代写_加急帮助

Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

STAT 7100: Introduction to Advanced Statistical Inference

Examples and Illustrations for part B of learning unit 7

Example: Metropolis-Hastings calculation of Binomial counts analysis. Suppose XIθ ~ binomial(n, θ) counts the number of incidences in a binomial experiment with n trials. Prior uncertainty about the incidence probability is described by a beta distribution, P ~ beta(α, β), having written P for a random variable whose possible values are those of θ . The associated data-generating PMF and prior PDF are given by

fx-P(xIθ) = ╱ 、x(n) θ2 (1 - θ)ny2 and fP (θ) = θay] (1 - θ)β y]

Suppose further that x = 3“heads”are observed in n = 10 trials.

We examined this setup in a previous unit, wherein it was noted that the posterior dis- tribution is P IX = x ~ beta(α + x, β + n - x), which makes possible direct calculation of posterior quantiles using software with built-in functions. This calculation provides the following table of posterior quantiles across three settings of the prior parameters α and β .

quantile

(α, β)

(1/2, 1/2)

2.5%

25%

50%

75%

97.5%

mean

0.0740 0.1949 0.2860 0.3908 0.6012 0.3001

An alternative approach to calculating these quantiles is to sample from the posterior distribution using a Metropolis-Hastings algorithm. From the forms of the data-generating PMF, fx-P(xIθ) x θ2 (1 - θ)ny2 , and prior PDF, fP (θ) x θ ay] (1 - θ)β y] , the posterior PDF has the form

fP -x (θIx) x fx-P(xIθ)fP (θ) x θa+2y] (1 - θ)β+ny2y] .

In a Metropolis-Hastings algorithm, the acceptance probability of a proposed update γ of the incidence probability θ is therefore

ρ = min , , 1 ← = min 宁 ╱ 、 a+2y] ╱ 、β+ny2y] , 1 ( ,

where fQ-P(γIθ) is the PDF of the proposal distribution.

In this computational approach, one interesting procedure for generating a proposed up- date γ of θ is deﬁned by the following steps: First, simulate a value u from a uniform distribution between -c and c (i.e., -c < u < c). Next, assign the update according to

） - (θ + u) if θ + u < 0

γ = l θ + u if 0 < θ + u < 1

Interpreting these steps, the procedure works by simulating a symmetric shift of θ, fol- lowed by checking whether the value falls outside the unit interval; if it does, the value is reﬂected back into the interval by the same distance it fell outside of it. This type of proposal distribution is sometimes called a uniform, symmetric shift with reﬂecting bound- aries. What is especially valuable about it is that the corresponding conditional density is guaranteed to satisfy fQ-P(γIθ) = fQ-P(θIγ), which implies that the density cancels in the acceptance probability formula, and the formula simpliﬁes to

ρ = min , , 1 ← = min 宁 ╱ 、 a+2y] ╱ 、β+ny2y] , 1 ( .

An accompanying R code demonstrates how this algorithm can reproduce the table of posterior quantiles that appears above.

Example: Weibull sampling. Suppose k = (X] , . . . , Xn ) is an IID random sample such that each Xi has PDF

fx-P(xIξ, λ) = ξλx5 y] ey2s A .

This deﬁnes a Weibull distribution with parameters ξ and β = 1/λ. Suppose the prior distribution is formulated to that ξ and λ follow independent gamma distributions,

fP (ξ, λ) = ╱ ξay] eyb5 、 ╱ λcy] eydA 、.

● One possible proposal distribution speciﬁes the generation of a proposed update (ξ* , λ* ) from independent exponential distributions, with means ξ and λ. The PDF is

fQ-P (ξ* , λ* Iξ, λ) = ey85 * /5+A* /A〉,

This proposal does not deﬁne a random walk. The associated acceptance probabil- ity formula is somewhat complicated: ρ = min{e( , 1}, where

ζ = (n + a - 2) log(ξ* /ξ) + (n + c - 2) log(λ* /λ) +b(ξ - ξ * ) + d(λ - λ* ) + (ξ* /ξ - ξ/ξ * ) + (λ* /λ - λ/λ* )

n n n

+(ξ* - ξ)_ log xi + λ _xi(5) - λ* _ x

ie] ie] ie]

● Another possible proposal distribution is deﬁned according to log ξ* = log ξ + σ ] ∈ ] and log λ* = log λ + σ2 ∈2

where ∈ ] and ∈2 are independent with symmetric distributions. For instance, if each ∈i uniform(-σi , σi ), then the proposal density is

fQ-P(ξ* , λ* Iξ, λ) =

4σ] σ2 ξ * λ*

, for

ξey口1 < ξ * < ξe口1

λey口2 < λ* < λe口2 .

Although elements of this proposal distribution resemble those of a random walk, it does not deﬁne one. The complication, here, is that the additive perturbations log ξ* = log ξ + σ∈ ] and log λ* = log λ + σ∈2 are deﬁned on a logarithmic scale, and not directly on the parameters.

To deﬁne a random walk, one might re-parameterize the problem according to θ ] = log ξ and θ2 = log λ. The log-likelihood function and prior translate to

n n

l(k; m) = nθ ] + nθ2 + (e91 - 1)_ log xi - e92 _ xi(e)9 1

ie] ie]

and

fP (k) = ╱ ea91 ybe9 1 、 ╱ ec92 yde92 、,

having used the transformation theorem to identify the PDF of the re-parameterized prior.

Under this parameterization, the perturbations γi = θi + σi ∈i deﬁne a random-walk

proposal distribution. The acceptance probability is ρ = min ,e( , 1、, where

ζ = (n + a)(γ] - θ ] ) + (n + c)(γ2 - θ2 )

-(ey1 - e91) !n_ie] log xi + b! - ey2 n_ie] xi(e)y 1 + e92 n_ie] x

Example: Beta-binomial distributions. Suppose the random variable P follows a beta- binomial distribution with parameters n, α, and β. Recall that this distribution is deﬁned by the PDF

fP (θ) = ╱ 、θ(n) .

Despite that this PDF is written in a concise form, it can be difﬁcult to work with, especially when n, α, and β are large; thus, Monte Carlo solutions are appealing.

To simulate a beta-binomial distribution, it is sometimes helpful to recall the hierarchical formulation by which P IH = λ ~ binomial(n, λ) and H ~ beta(α, β) implies the beta- binomial distribution above for the marginal distribution of P . Thus, it becomes possible to generate a sample of beta-binomial values, (θ8]〉, . . . , θ 8k〉), by generating a sample of bivariate measurements, ((θ8]〉, λ 8]〉), . . . , (θ8k〉, λ 8k〉)), such that the pair (θ8i〉, λ 8i〉) is gener- ated by ﬁrst simulating λ 8i〉 from a beta(α, β) distribution and then simulating θ 8i〉 from a binomial(n, λ8i〉) distribution.

Notice that each pair of (θ, λ)-values is simulated independently of all other such pairs in this algorithm, so the Markov chain it deﬁnes is trivial. An alternative approach applies the property HIP = θ ~ beta(θ + α, n - θ + β). Here, the sequence ((θ8]〉, λ 8]〉), . . . , (θ8k〉, λ 8k〉)) is simulated as follows.

● (Initialization) Generate (θ8]〉, λ 8]〉) according to the previous algorithm: ﬁrst simulate λ 8]〉 from a beta(α, β) distribution; then simulate θ8]〉 from a binomial(n, λ8]〉) distribu- tion.

● For i > 1, simulate λ8i〉from a beta(θ8iy]〉+α, n - θ 8iy]〉+β) distribution; then simulate θ 8i〉 from a binomial(n, λ8i〉) distribution.

This alternative algorithm is an example of a Gibbs sampling algorithm. Notice here that the simulation of each λ 8i〉 is speciﬁed from a previous value θ8iy]〉, which implies that the random variable pairs (P8i〉, H8i〉) are not independent across i. The sequence of bivariate random variables ((P8]〉, H8]〉), . . . , (P8k〉, H8k〉)) is a nontrivial Markov chain.

Example: Gibbs sampling in a multinomial model. Suppose k = (X] , . . . , X5 ) is a multinomial random vector summarizing n multinomial trials in ﬁve categories, whose associated probabilities are parameterized as

p] = a] θ ] + b] , p2 = a2 θ ] + b2 , p3 = a3 θ2 + b3 , p4 = a4 θ2 + b4

and p5 = 1 - (p] +p2 +p3 +p4 ), in which the ai and bi are known constants, each 0 < θi < 1, and θ ] + θ2 < 1. Suppose the prior PDF has the form

fP (k) x θ ](a)1 y] θ2(a)2 y] (1 - θ ] - θ2 )a3 y] .

The posterior distribution associated with this model is challenging to work with directly. However, it is possible to formulate a tractable Gibbs sampling algorithm by augment- ing the multinomial data in ﬁve categories k = (X] , . . . , X5 ) to multinomial data in nine categories Y = (Y] , . . . , Y3 ) with associated probabilities

q] = a] θ ] , q2 = b] , p3 = a2 θ ] , q4 = b2 ,

q5 = a3 θ2 , qq = b3 , q之 = a4 θ2 qs = b4

and q3 = 1 - (q] + q2 + q3 + q4 + q5 + qq + q之 + qs ). The relationship between these vectors is

X] = Y] + Y2 , X2 = Y3 + Y4 , X3 = Y5 + Yq , X4 = Y之 + Ys ,

and X5 = Y3 . If Y were observed, one could work with the posterior distribution with PDF

fP -Y (kIy) x θ ](z)1 +z3 +a1 y] θ2(z)5 +z7 +a2 y] (1 - θ ] - θ2 )z9 +a3 y] .

Moreover, a sample from this distribution could be simulated using a Gibbs sampling algorithm based on the conditional distributions

P] I{Y = y, P2 = θ2 } ~ (1 - θ2 ) × beta(y] + y3 + α] , y3 + α3 )

P2 I{Y = y, P] = θ ] } ~ (1 - θ ] ) × beta(y5 + y之 + α2 , y3 + α3 ).

Yet, despite that Y is unobserved, it is possible to simulate from the conditional distribu- tions that connect it to the observed data k according to

Y] I{X] = x] , 扌 = k} ~ binomial ╱x] , 、 ,

Y3 I{X2 = x2 , 扌 = k} ~ binomial ╱x2 , 、 ,

Y5 I{X3 = x3 , 扌 = k} ~ binomial ╱x3 , 、 ,

Y之 I{X4 = x4 , 扌 = k} ~ binomial ╱x4 , 、 .

The associated Gibbs sampling algorithm treats the augmented data as parameters to be updated. At iteration i, the current value of relevant set of “parameters” is (θ](8)i〉, θ y8i〉), which are updated by simulating

θ ](8)i+]〉 ~ {1 - θ2(8)i〉} × beta(y](8)i〉 + y3(8)i〉 + α] , y3(8)i〉 + α3 )

θ2(8)i+]〉 ~ {1 - θ ](8)i〉} × beta(y5(8)i〉 + y之(8)i〉 + α2 , y3(8)i〉 + α3 ),

followed by

y ](8)i+]〉 ~ binomial !x] , ! ,

y5(8)i+]〉 ~ binomial !x3 , ! ,

y3(8)i+]〉 ~ binomial !x2 , ! ,

y之(8)i+]〉 ~ binomial !x4 , ! ,

and then setting y2(8)i+]〉 = x] - y y4(8)i+]〉 = x2 - y yq(8)i+]〉 = x3 - y ys(8)i+]〉 = x4 - y之(8)i+]〉 , and y x5 .

Example: Posterior maximization in a multinomial model using the EM algorithm. Consider the previous example in which k = (X] , . . . , X5 ) is a multinomial random vector summarizing n multinomial trials in ﬁve categories, with associated probabilities

p] = a] θ ] + b] , p2 = a2 θ ] + b2 , p3 = a3 θ2 + b3 , p4 = a4 θ2 + b4

and p5 = 1- (p] +p2 +p3 +p4 ). An augmented model has Y = (Y] , . . . , Y3 ), with probabilities

q] = a] θ ] , q2 = b] , p3 = a2 θ ] , q4 = b2 ,

q5 = a3 θ2 , qq = b3 , q之 = a4 θ2 qs = b4 ,

and q3 = 1 - (q] + q2 + q3 + q4 + q5 + qq + q之 + qs ). The connection between them is

X] = Y] + Y2 , X2 = Y3 + Y4 , X3 = Y5 + Yq , X4 = Y之 + Ys ,

and X5 = Y3 . The prior PDF

fP (k) x θ ](a)1 y] θ2(a)2 y] (1 - θ ] - θ2 )a3 y] .

implies the posterior PDF

fP -Y (kIy) x θ ](z)1 +z3 +a1 y] θ2(z)5 +z7 +a2 y] (1 - θ ] - θ2 )z9 +a3 y] .

The posterior PDF deﬁned conditionally on the augmentation, Y , makes feasible the use of the EM algorithm to ﬁnd the posterior mode. In particular, it provides that within the expectation step, the quantity Q(k; k8i〉) is

Q(k; k8i〉) = Eθ(i) *log fP -Y (kIY)Ik = ml

= (Eθ(i)[Y] + Y3 Ik = m] + α] - 1) log θ]

+(Eθ(i) [Y5 + Y之 Ik = m] + α2 - 1) log θ2

+(x5 + α3 - 1) log(1 - θ ] - θ2 ),

whose conditional expectations are readily evaluated using the properties

Y] I{X] = x] , 扌 = k} ~ binomial ╱x] , 、 ,

Y3 I{X2 = x2 , 扌 = k} ~ binomial ╱x2 , 、 ,

Y5 I{X3 = x3 , 扌 = k} ~ binomial ╱x3 , 、 ,

Y之 I{X4 = x4 , 扌 = k} ~ binomial ╱x4 , 、 ,

as in the previous example; that is,

Eθ(i) [Y] + Y3 Ik = m] = x] + x2

a3 θ2(8)i〉 a4 θ2(8)i〉

Eθ(i) [Y5 + Y之 2(8)i〉 2(8)i〉 + b4 .

Leaving the conditional expectations unevaluated, the relevant ﬁrst order partial deriva- tives are

Q(k; k8i〉) = -

∂ Q(k; k8i〉) = Eθ (i) [Y5 + Y之 Ik = m] + α2 - 1 - x5 + α3 - 1

It follows that the desired value of k8i〉 is the value of k at which

θ ] Eθ (i) [Y] + Y3 Ik = m] + α] - 1

1 - θ ] - θ2 x5 + α3 - 1

θ2 Eθ (i) [Y5 + Y之 Ik = m] + α2 - 1

1 - θ ] - θ2 x5 + α3 - 1 ,

which is solved by straightforward algebra. (For instance, if the right-hand sides of the formulas above are respectively denoted A] and A2 , and Bi = Ai /(1 + Ai ), the solution is θ ] = B] (1 - B2 )/(1 - B] B2 ) and θ2 = B2 (1 - B] )/(1 - B] B2 ).)