MATH3723 Statistical Theory
Session 2021–2022
Background. The Cauchy distribution is defined by its probability density function
\[
f(x) = \frac{1}{\pi(1+x^2)}, \qquad x \in \mathbb{R}.
\]
Here, the normalisation factor $1/\pi$ ensures that $\int_{\mathbb{R}} f(x)\,dx = 1$.
Indeed, recalling that $(\arctan x)' = \dfrac{1}{1+x^2}$, we have
\[
\int_{-\infty}^{\infty} \frac{dx}{1+x^2}
= 2\int_{0}^{\infty} \frac{dx}{1+x^2}
= 2\lim_{K\to\infty}\int_{0}^{K} \frac{dx}{1+x^2}
= 2\lim_{K\to\infty}\arctan K
= 2\cdot\frac{\pi}{2} = \pi.
\]
Note that the expected value of the Cauchy distribution is not defined, because $\mathsf{E}|X| = \infty$:
\[
\int_{-\infty}^{\infty} |x|\,f(x)\,dx
= 2\lim_{K\to\infty}\int_{0}^{K} \frac{x\,dx}{\pi(1+x^2)}
= \lim_{K\to\infty}\frac{1}{\pi}\,\ln(1+x^2)\Big|_{0}^{K}
= \lim_{K\to\infty}\frac{1}{\pi}\,\ln(1+K^2) = \infty.
\]
However, the median $M$ is well defined and, by symmetry, is given by $M = 0$, because
\[
\int_{0}^{\infty} f(x)\,dx = \int_{0}^{\infty} \frac{dx}{\pi(1+x^2)} = \frac{1}{\pi}\cdot\frac{\pi}{2} = \frac{1}{2}.
\]
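These facts can also be checked numerically in R (an optional sketch, not part of the background; dcauchy and pcauchy are R's built-in Cauchy density and distribution function):

# optional numerical checks of the background facts
integrate(dcauchy, -Inf, Inf)    # total probability mass: approximately 1
pcauchy(0)                       # P(X <= 0) = 0.5, so the median is 0
# the mean does not exist: the integral of |x| f(x) over [-K, K] equals log(1 + K^2) / pi,
# which grows without bound as the cutoff K increases
K = c(1e2, 1e4, 1e6)
log(1 + K^2) / pi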
Statistical model. Suppose we have an i.i.d. sample $X = (X_1,\dots,X_n)$ from the shifted Cauchy distribution with density $f(x\,|\,\theta) = f(x-\theta)$ ($\theta \in \mathbb{R}$), that is,
\[
f(x\,|\,\theta) = \frac{1}{\pi\bigl(1+(x-\theta)^2\bigr)}, \qquad x \in \mathbb{R}.
\]
We are interested in estimating the location parameter $\theta$. Arranging the observed values $X_1, X_2, \dots, X_n$ in ascending order, consider the order statistics
\[
X_{(1)} < X_{(2)} < \cdots < X_{(n)}.
\]
(Since the Cauchy distribution is continuous, with probability 1 there are no ties in the observations, so all the inequalities may be presumed to be strict.)
The sample median $M_n$ is defined as the value separating the order statistics into two “equal parts”; more precisely, if $n = 2k+1$ then $M_n := X_{(k+1)}$, so that
\[
X_{(1)} < \cdots < X_{(k)} < M_n = X_{(k+1)} < X_{(k+2)} < \cdots < X_{(2k+1)}.
\]
If $n = 2k$ then the usual convention is to set
\[
M_n := \frac{X_{(k)} + X_{(k+1)}}{2}.
\]
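For reference, R's built-in median function implements exactly this convention; a quick illustration with a toy sample of even size:

X = c(3.1, -0.4, 2.7, 5.0)   # toy sample with n = 2k = 4
sort(X)                      # order statistics: -0.4  2.7  3.1  5.0
median(X)                    # (X_(2) + X_(3))/2 = (2.7 + 3.1)/2 = 2.9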
In this practical, you will consider the following four estimators of the parameter $\theta$:
• $\hat\theta := \bar{X}_n = \dfrac{1}{n}\sum_{i=1}^{n} X_i$ (sample mean)
• $\hat\theta := M_n$ (sample median)
• $\hat\theta := M_n + \dfrac{2}{n}\,\ell'_\theta(\theta\,|\,X)\Big|_{\theta = M_n}$ (modified sample median)
• $\hat\theta :=$ maximum likelihood estimator (MLE)
The MLE $\hat\theta$ is the maximiser of the log-likelihood, i.e.
\[
\ell(\hat\theta\,|\,X) = \max_{\theta\in\mathbb{R}} \ell(\theta\,|\,X),
\qquad
\ell(\theta\,|\,X) = \ln f(X\,|\,\theta) = -n\ln\pi - \sum_{i=1}^{n} \ln\bigl(1+(X_i-\theta)^2\bigr).
\]
The coefficient $2/n$ in the estimator $\hat\theta$ (modified sample median) is explained by Theorem 8.3 in the lecture notes (see formula (8.23)) and by the fact that Fisher's information in this model is given by (see Appendix 1 below)
\[
I_n(\theta) = \mathsf{E}_\theta\bigl[(\ell'_\theta(\theta\,|\,X))^2\bigr] = \frac{n}{2}.
\]
The derivation of this formula is provided in the handout attached to the practical task.
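For orientation, here is a minimal R sketch computing all four estimators on one simulated sample (the variable names and the optim settings are arbitrary choices, and the modified sample median is taken to be the one-step correction $M_n + \frac{2}{n}\,\ell'_\theta(M_n\,|\,X)$ described above):

theta = 2; n = 100
X = rcauchy(n, location = theta)                    # simulated Cauchy sample

loglik = function(t) -n*log(pi) - sum(log(1 + (X - t)^2))   # log-likelihood l(theta|X)
score  = function(t) 2 * sum((X - t) / (1 + (X - t)^2))     # its derivative l'(theta|X)

theta1 = mean(X)                                    # sample mean
theta2 = median(X)                                  # sample median
theta3 = theta2 + (2/n) * score(theta2)             # modified sample median (one-step correction)
theta4 = optim(theta2, function(t) -loglik(t),      # MLE: optim minimises, hence the minus sign
               method = "Brent", lower = theta2 - 10, upper = theta2 + 10)$par
c(theta1, theta2, theta3, theta4)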
Task. The objective of the practical is to explore and compare the asymptotic properties of the four estimators above (where analytical calculations may not be possible).
To this end, use computer simulations to verify whether these estimators are consistent (i.e., $\hat\theta_n$ approaches the true value $\theta$ as $n \to \infty$) and also to assess their accuracy by evaluating, for different values of the sample size $n$, their mean square error $\mathrm{MSE}_\theta(\hat\theta_n) = \mathsf{E}_\theta(\hat\theta_n - \theta)^2$ and the coverage probability $\mathsf{P}_\theta(|\hat\theta_n - \theta| \le \varepsilon)$, say for $\varepsilon = 0.1$.
More specific guidelines:
1. Statistical software R should be used for simulations and statistical computations, including graphics. See http://www.maths.leeds.ac.uk/school/students/computing.html for some useful documentation on R. More help is available within R itself by clicking on “Help” in the menu and choosing “R functions” or “Html help”. In particular, you might find it useful to familiarise yourself with the entries Cauchy, plot, function, mean, median and sum, and to practise with these commands before starting your practical work.
2. Fix a certain value of $\theta$ and simulate your Cauchy samples using the command rcauchy.
3. To visually verify consistency of an estimator $\hat\theta_n = \hat\theta(X_n)$, one method is to sample the values $X_1, X_2, \dots, X_n$ sequentially and to plot the sequence of resulting values $\hat\theta_n$ as a (random) function of $n \in \mathbb{N}$. What behaviour of such a plot would you expect for a consistent estimator?
4. When assessing the quality of the estimators, the sequential approach may be too computationally demanding, so it is recommended to confine yourself to some representative (increasing) values of the sample size $n$, say $n = 10, 100, 500, 1000, \dots$
5. To evaluate numerically the mean square error and the coverage probability (for a given $n$), make use of the Monte Carlo method (based on the Law of Large Numbers), according to which the expected value $\mathsf{E}(Y)$ of a random variable $Y$ can be approximated by the sample mean $\bar{Y}_m = (Y_1 + \cdots + Y_m)/m$ of a (large) i.i.d. sample $Y = (Y_1, \dots, Y_m)$ from this distribution:
\[
\mathsf{E}(Y) \approx \frac{Y_1 + \cdots + Y_m}{m}.
\]
In practical applications, the number of replicates $m$ should be large enough so that any two different estimation runs yield reasonably close approximate values. (A minimal sketch of this approach is given after this list.)
6. To illustrate the maximum likelihood estimation (MLE) method, plot the log-likelihood function $\ell(\theta\,|\,X)$ as a function of the parameter $\theta$ for several different values of $n$ (and with the corresponding sample values $X_1, \dots, X_n$ fixed). This can be done using the R function plot, but first you will have to define the function $\ell(\theta\,|\,X)$ using suitable R commands (see the sketch after this list). Is there always a unique root of the likelihood equation $\ell'_\theta = 0$?
7. To calculate the MLE numerically, it is recommended to use the R command optim; note, however, that it minimises a given function.
8. Summarise your findings by drawing conclusions and giving recommendations.
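As mentioned in item 5, the following is a minimal Monte Carlo sketch for approximating the mean square error and the coverage probability; the sample median is used as an example, and the values of theta, n, m and eps are arbitrary choices.

theta = 2; n = 100           # true parameter value and sample size
m = 1000                     # number of Monte Carlo replicates
eps = 0.1                    # accuracy level for the coverage probability
Theta2 = numeric(m)
for (i in 1:m) {
  X = rcauchy(n, location = theta)   # a fresh Cauchy sample
  Theta2[i] = median(X)              # the estimator under study (here: the sample median)
}
MSE2 = mean((Theta2 - theta)^2)             # Monte Carlo estimate of the MSE
cover2 = mean(abs(Theta2 - theta) <= eps)   # Monte Carlo estimate of P(|theta_hat - theta| <= eps)
c(MSE = MSE2, coverage = cover2)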
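For items 6 and 7, here is a sketch of plotting the Cauchy log-likelihood and locating its maximum with optim; the plotting range, the starting point and the search interval are arbitrary choices.

theta = 2; n = 100
X = rcauchy(n, location = theta)
logL = function(t) -n*log(pi) - sum(log(1 + (X - t)^2))   # l(theta|X) for a single value of t
logLv = Vectorize(logL)                                   # vectorised version, needed for curve()
curve(logLv, from = theta - 10, to = theta + 10,
      xlab = "theta", ylab = "log-likelihood")
fit = optim(median(X), function(t) -logL(t),              # optim minimises, hence the minus sign
            method = "Brent", lower = -20, upper = 20)
abline(v = fit$par, lty = 2)                              # mark the location of the MLE
fit$par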
Appendix 1: Fisher’s information
Suppose $X = (X_1, \dots, X_n)$ is an i.i.d. sample from the shifted Cauchy distribution with density
\[
f(x\,|\,\theta) = \frac{1}{\pi\bigl(1+(x-\theta)^2\bigr)}, \qquad x \in \mathbb{R}. \tag{1}
\]
The likelihood is given by
\[
L(\theta\,|\,X) = f(X\,|\,\theta) = \prod_{i=1}^{n} f(X_i\,|\,\theta) = \prod_{i=1}^{n} \frac{1}{\pi\bigl(1+(X_i-\theta)^2\bigr)}.
\]
Hence, the log-likelihood and its partial derivative are easily calculated:
\[
\ell(\theta\,|\,X) = \ln L(\theta\,|\,X) = -n\ln\pi - \sum_{i=1}^{n} \ln\bigl(1+(X_i-\theta)^2\bigr),
\qquad
\ell'_\theta(\theta\,|\,X) = 2\sum_{i=1}^{n} \frac{X_i-\theta}{1+(X_i-\theta)^2}.
\]
Our aim is to calculate Fisher's information in this model by showing that
\[
I_n(\theta) = \frac{n}{2}. \tag{2}
\]
The fact that the information does not depend on the parameter $\theta$ should not be surprising in view of the shift nature of the distributional family (1).
By definition,
\[
I_n(\theta) = \mathsf{E}_\theta\bigl[(\ell'_\theta(\theta\,|\,X))^2\bigr].
\]
However, it is more convenient to use the information's additivity, $I_n(\theta) = n\,I(\theta)$ (see the lecture notes, Section 4.3, Theorem 4.4), where $I(\theta)$ is the information per observation,
\[
I(\theta) = \mathsf{E}_\theta\bigl[(\ell'_\theta(\theta\,|\,X_i))^2\bigr]
= \int_{-\infty}^{\infty} \left(\frac{2(x-\theta)}{1+(x-\theta)^2}\right)^{\!2} \frac{dx}{\pi\bigl(1+(x-\theta)^2\bigr)}.
\]
By a simple shift of the variable, $x \mapsto x - \theta$, this integral is rewritten as
\[
I(\theta) = \frac{4}{\pi}\int_{-\infty}^{\infty} \frac{x^2}{(1+x^2)^3}\,dx.
\]
Thus, we must evaluate the last integral; once it is shown to equal $\pi/8$, we get $I(\theta) = \frac{4}{\pi}\cdot\frac{\pi}{8} = \frac{1}{2}$ and hence $I_n(\theta) = n\,I(\theta) = \frac{n}{2}$, as claimed in (2). Namely, we have the following

Proposition.
\[
\int_{-\infty}^{\infty} \frac{x^2}{(1+x^2)^3}\,dx = \frac{\pi}{8}. \tag{3}
\]
Proof. Integrate in (3) by parts, using $\dfrac{x}{(1+x^2)^3} = \Bigl(-\dfrac{1}{4(1+x^2)^2}\Bigr)'$:
\[
\int_{-\infty}^{\infty} \frac{x^2}{(1+x^2)^3}\,dx
= \underbrace{\left[-\frac{x}{4(1+x^2)^2}\right]_{-\infty}^{\infty}}_{=0}
+ \frac{1}{4}\int_{-\infty}^{\infty} \frac{dx}{(1+x^2)^2}
= \frac{1}{4}\int_{-\infty}^{\infty} \frac{dx}{(1+x^2)^2}.
\]
Furthermore, decomposing and again integrating by parts, now using $\dfrac{x}{(1+x^2)^2} = \Bigl(-\dfrac{1}{2(1+x^2)}\Bigr)'$:
\[
\int_{-\infty}^{\infty} \frac{dx}{(1+x^2)^2}
= \int_{-\infty}^{\infty} \frac{dx}{1+x^2} - \int_{-\infty}^{\infty} \frac{x^2}{(1+x^2)^2}\,dx
= \pi - \underbrace{\left[-\frac{x}{2(1+x^2)}\right]_{-\infty}^{\infty}}_{=0}
- \frac{1}{2}\int_{-\infty}^{\infty} \frac{dx}{1+x^2}
= \pi - \frac{\pi}{2} = \frac{\pi}{2}.
\]
Hence
\[
\int_{-\infty}^{\infty} \frac{x^2}{(1+x^2)^3}\,dx = \frac{1}{4}\cdot\frac{\pi}{2} = \frac{\pi}{8},
\]
and (3) is proved.
Alternatively, using the substitution $x = \tan t$ ($-\frac{\pi}{2} < t < \frac{\pi}{2}$) and noting that
\[
dx = \frac{dt}{\cos^2 t}, \qquad 1+x^2 = \frac{1}{\cos^2 t},
\]
we obtain
\begin{align*}
\int_{-\infty}^{\infty} \frac{x^2}{(1+x^2)^3}\,dx
&= \int_{-\pi/2}^{\pi/2} \frac{\tan^2 t}{1/\cos^6 t} \cdot \frac{dt}{\cos^2 t}
= \int_{-\pi/2}^{\pi/2} \sin^2 t \cos^2 t\,dt \\
&= 2\int_{0}^{\pi/2} \sin^2 t \cos^2 t\,dt
&& [\,2\cos t \sin t = \sin 2t\,] \\
&= \frac{1}{2}\int_{0}^{\pi/2} \sin^2(2t)\,dt
&& [\text{change of variables } u = 2t\,] \\
&= \frac{1}{4}\int_{0}^{\pi} \sin^2 u\,du
&& [\,2\sin^2 u = 1 - \cos 2u\,] \\
&= \frac{1}{8}\int_{0}^{\pi} (1-\cos 2u)\,du
= \frac{1}{8}\cdot\pi = \frac{\pi}{8},
\end{align*}
as required.
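As an optional cross-check (not part of the derivation above), both the integral (3) and the value $I(\theta) = 1/2$ can be confirmed numerically in R:

# check of (3): the result should be close to pi/8 = 0.3926991
integrate(function(x) x^2 / (1 + x^2)^3, -Inf, Inf)
# check of the information per observation: E_theta[(l'_theta)^2] should be 1/2,
# whatever value of theta is used (here theta = 0)
theta = 0
integrate(function(x) (2*(x - theta) / (1 + (x - theta)^2))^2 * dcauchy(x, location = theta),
          -Inf, Inf)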
Appendix 2: Various hints on using R
##### Simulations:
> n = 10 # sample size
> theta = 2 # shift parameter
> X = rnorm(n, mean=theta, sd=1) # generating a random (normal) sample
# for Cauchy distribution, use rcauchy(n, location, scale)
> hist(X) # plotting the histogram of the data
> theta1 = mean(X) # estimation by the sample mean
> theta2 = median(X) # estimation by the median
##### How to plot the estimates dynamically? (consistency)
# This is simple for the mean:
> M1 = cumsum(X)/(1:n) # computes the sample means of the subsamples
# (X[1],...,X[k]) for k=1,...,n
> plot(M1, xlab="k", ylab="Estimator I", type="l") # plot of M1 ("l" stands for "line")
> abline(h=theta) # to compare with the true value
##### How to do the same for the median?
> Median = function(X) {z=vector(length=length(X));
for(k in 1:length(X)) z[k]=median(X[1:k]); z} # medians of the subsamples (X[1],...,X[k])
> M2=Median(X)
> plot(M2, xlab="k", ylab="Estimator II", type="l")
> abline(h=theta)
##### How to estimate the MSE?
> m = 1000 # number of replicates
> Theta1 = vector(length=m) # allocating space for iterations
> for(i in 1:m) {X = rnorm(n, mean=theta, sd=1); Theta1[i]=mean(X)}
# simulating m samples and computing the corresponding estimates of theta
> hist(Theta1) # plotting the histogram of Theta1
> MSE1 = mean((Theta1-theta)^2) # estimating MSE of the estimate theta1
> MSE1 # print MSE1
##### How to define the log-likelihood (for a given sample X) and plot it?
> logL = function(t) {ell=0; for (j in 1:n) ell=ell-(X[j]-t)^2; ell}
# (a stand-in based on the normal sample above; for the practical, replace the
#  body by the Cauchy log-likelihood formula)
> plot(logL, xlab="theta", ylab="log-likelihood", from=-20, to=20, type="l")
##### How to find the maximum of a function?
> theta = 2
> n = 100
> X = rnorm(n, mean=theta, sd=1)
> f = function(theta) sum((X-theta)^2)
> a = optim(0, f) # 0 = initial value for search (of min f)
> a$par # point of minimum
> a$value # minimum value
##### NOTES
# 1. ’optim’ searches for the MINIMUM value of the function.
# 2. The exact answer in the example above: a$par = mean(X).
# 3. There will be a warning message that the default method is unreliable;
#    indeed, accuracy may not be great.
# 4. Experiment further by specifying the method, e.g. "Brent", to see the
#    difference; but you will have to specify the search range, e.g.
#    > a = optim(0, f, method="Brent", lower=-10, upper=10)
# 5. Alternatively, you can use the function ’optimize’ instead of ’optim’;
#    this also needs specification of the range.
2022-04-27