STAT 3690 Lecture 12
Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit
STAT 3690 Lecture 12
2022
(1 - a) x 100% CR for v = Aμ
● x1 , . . . , xn MVNp (μ, 五)
- Unknown 五
- n > p
● A is of q x p and rk(A) = q , i.e., A五AT > 0
● Then iid Axi ~ MVNq (v , A五AT )
● (1 - α) x 100% CR for v is {v : (A - v)T (ASAT )_ 1 (A - v) < F1 _α,q,n _q }
● Special case: A = aT e Rp
- (1 - α) x 100% confidence interval (CI) for scalar ν = aT μ is
{ν : n(aT -ν)2 (aT Sa)_ 1 < F1 _α,1,n _ 1 } = /aT -t1 _α/2,n _ 1 ′aT Sa/n, aT +t1 _α/2,n _ 1 ′aT Sa/n\
- Check the coverage probability of CI for each entry of μ
options(digits = 4)
install.packages(c ( "MASS"))
set.seed(1)
B = 5e3L
n = 5e2L
Mu = (1:10) ˆ2; (p = length (Mu))
(Sigma = diag(p)+ .5)
alpha <- .05
(A = diag(p))
cover = matrix(0 , ncol = p, nrow = B)
for (b in 1:B){
sample = MASS::mvrnorm(n, Mu, Sigma)
mu_hat = colMeans (sample)
sample_cov = cov (sample)
LB = A %*% mu_hat - qt(1-alpha/2, n-1)* sqrt(diag(A %*% sample_cov %*% t (A))/n) RB = A %*% mu_hat + qt(1-alpha/2, n-1)* sqrt(diag(A %*% sample_cov %*% t (A))/n)
cover[b,] = (LB < Mu) * (Mu < RB)
}
(cover_prob_indiv = colMeans (cover))
(cover_prob_simul = mean (apply(cover, 1 , prod)))
Simultaneous confidence intervals
● Interested in (1 - αk ) CIs for scalars a k(T)μ, say CIk , k = 1, . . . , m, simultaneously
● Make sure Pr( k {ak(T)μ e CIk }) > 1 - α
● Bonferroni correction
- Bonferroni inequality:
m m m m
Pr( ← {ak(T)μ e CIk }) = 1 - Pr( {ak(T)μ CIk }) > 1 - Pr(ak(T)μ CIk ) = 1 - αk
k=1 k=1 k=1 k=1
- Taking αk such that α = αk , e.g., αk = α/m, i.e.,
(ak(T) - t1 _α/(2m),n _ 1 ′a k(T)Sak /n, a k(T) + t1 _α/(2m),n _ 1 ′a k(T)Sak /n)
- Working for small m
● Scheffé’s method
- Let CIα = (aT - c ′aT Sa/n, aT + c ′aT Sa/n) for all a e Rp
- Derive that c = ′p(n - 1)(n - p)_ 1 F1 _α,p,n _p
* By Cauchy–Schwarz: {aT ( - μ)}2 = [(S1/2a)T {S_ 1/2( - μ)}]2 < {(aT Sa)T /n}{n( - μ)T S_ 1 ( - μ)} =
m
Pr( ← {ak(T)μ e CIk }) > Pr( ← {aT μ e CIα }) = 1 - Pr( {aT μ CIα })
k=1 αeRp αeRp
= 1 - Pr( [{aT ( - μ)}2 /{(aT Sa)T /n} > c2])
αeRp
> 1 - Pr({n( - μ)T S_ 1 ( - μ) > c2 })
* Pr({n( - μ)T S_ 1 ( - μ) > c2 }) = α = c = ′p(n - 1)(n - p)_ 1 F1 _α,p,n _p
- Working for large even infinite m
options(digits = 4)
install.packages(c ("dslabs"))
library(dslabs)
data("gapminder")
dataset = gapminder[
!is.na (gapminder$infant_mortality) &
gapminder$year == 2012 ,
c ( 'infant_mortality ' , "life_expectancy")]
dataset = as.matrix(dataset)
n = nrow (dataset); p = ncol(dataset)
alpha <- .05
a1 = c ( 1 ,0); a2 = c (0 , 1)
A = rbind (a1, a2)
(mu_hat <- colMeans (dataset))
(sample_cov <- cov (dataset))
# Simultaneous CIs without correction
c = qt(1-alpha/2, n-1)
(NOcorrection <- cbind(
A %*% mu_hat - c * sqrt(diag(A %*% sample_cov %*% t (A))/n),
A %*% mu_hat + c * sqrt(diag(A %*% sample_cov %*% t (A))/n)
))
# Simultaneous CIs with Bonferroni correction
m = nrow (A)
c = qt(1-alpha/2/m, n-1)
(Bonferroni <- cbind(
A %*% mu_hat - c * sqrt(diag(A %*% sample_cov %*% t (A))/n),
A %*% mu_hat + c * sqrt(diag(A %*% sample_cov %*% t (A))/n)
))
# Simultaneous CIs with Scheffe correction
c = sqrt(p* (n-1)/(n-p) * qf(1-alpha, p, n-p))
(Scheffe <- cbind(
A %*% mu_hat - c * sqrt(diag(A %*% sample_cov %*% t (A))/n),
A %*% mu_hat + c * sqrt(diag(A %*% sample_cov %*% t (A))/n)
))
● Report: CIs (21.82, 29.82) and (69.92, 72.70) cover the mean infant mortality and mean life expectancy, simultaneously, with probability at least 95%.
Comparing two multivariate means (J&W Sec. 6.3)
● Two independent samples of (potentially) different sizes from two distributions with equal covariance
- x11 , . . . , x1nY MVNp (μ1 , 五)
- x21 , . . . , x2n9 MVNp (μ2 , 五)
● Let i and Si be the sample mean and sample covariance for the ith sample
● Hypotheses H0 : μ 1 = μ2 v.s. H1 : μ 1 μ2
● Test statistic following LRT
(1 - 2 )T {(n1(_)1 + n2(_)1 )Spool }_ 1 (1 - 2 ) ~ F (p, n1 + n2 - p - 1)
- Spool
● Rejection region at level α
,x11 , . . . , x1nY , x21 , . . . , x2n9 : (1 -2 )T {(n1(_)1 +n2(_)1 )Spool }_ 1 (1 -2 ) > F1 _α,p,n Y +n9 _p _ 1 、
● p-value
1 - FFY − α,p,n Y+n9 −p − Y ┌ (1 - 2 )T {(n1(_)1 + n2(_)1 )Spool }_ 1 (1 - 2 )┐
options(digits = 4)
install.packages(c ("dslabs"))
library(dslabs)
data("gapminder")
dataset1 = gapminder[
!is.na (gapminder$infant_mortality) &
gapminder$continent == "Africa" &
gapminder$year == 2012 ,
c ( 'infant_mortality ' , "life_expectancy")]
dataset1 = as.matrix (dataset1)
dataset2 = gapminder[
!is.na (gapminder$infant_mortality) &
gapminder$continent == "Asia" &
gapminder$year == 2012 ,
c ( 'infant_mortality ' , "life_expectancy")]
dataset2 = as.matrix (dataset2)
n1 <- nrow (dataset1); n2 <- nrow (dataset2); p <- ncol (dataset1)
(mu_hat1 <- colMeans(dataset1))
(mu_hat2 <- colMeans(dataset2))
(S1 <- cov (dataset1))
(S2 <- cov (dataset2))
S_pool <- ((n1 - 1)*S1 + (n2 - 1)*S2)/ (n1+n2-2)
(lrt <- t(mu_hat1-mu_hat2) %*%
solve ((n1ˆ-1 + n2ˆ-1)*S_pool) %*%
(mu_hat1-mu_hat2))
alpha <- .05
(cri.val <- (n1+n2-2)*p/ (n1+n2-p-1)*qf(1-alpha, p, n1+n2-p-1))
lrt >= cri.val
(p.val = 1-pf ((n1+n2-p-1)/(n1+n2-2)/p*lrt, p, n1+n2-p-1))
● Report: Testing hypotheses H0 : In 2012 Asia and Africa shared the identical mean value in both infant mortality and life expectancy v.s. H1 : otherwise, we carried on the LRT and obtained 87.65 as the value of test statistic. The corresponding p-value (resp. rejection region) was 4.952e-14 (resp. [6.255, o)). So, at the .05 level, there was a strong statistical evidence against H0 , i.e., we rejected H0 and believed that in 2012 Asia and Africa didn’t share the identical mean value in either infant mortality or life expectancy.
2022-03-17