Econ 114: Assignment 2

Fall 2022

1. Inference after model selection

Suppose you have to choose between two models:

$Y = \beta_0 + \beta_1 X_1 + U$, (1) and

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + U$. (2)

Let $\hat\beta_1^*$ and $\hat\beta_1$ be the OLS estimators of $\beta_1$ from models (1) and (2), respectively. If you reject the null hypothesis that $\beta_2 = 0$, you use $\hat\beta_1$. Otherwise, if you do not reject the null hypothesis that $\beta_2 = 0$, you use $\hat\beta_1^*$. In this exercise we try to assess the pitfalls of this procedure when it comes to inference.

Let n = 30 be the sample size, and let S = 1,000 be the number of replications. The true values of the parameters are $\beta_0 = 1$, $\beta_1 = 1$, and $\beta_2 = 0$. Note that this means that model (1) is the correct one. Moreover, let $X_1$ and U be independent standard normals, and let $X_2 = X_1 + V$, where V is a standard normal (independent of everything else).

(a) For each replication, generate Y according to model (1).

(b) For each replication, estimate models (1) and (2). Let $\hat\beta_1^*$ and $\hat\beta_1$ be the OLS estimators from models (1) and (2), respectively.

(c) For each replication, test the null hypothesis, at the 5% level, that $\beta_2 = 0$. If you reject, then the confidence interval is computed as $\hat\beta_1 \pm 1.96 \times se(\hat\beta_1)$. If you don't reject, then the confidence interval is computed as $\hat\beta_1^* \pm 1.96 \times se(\hat\beta_1^*)$. Use robust standard errors in each case. Note that the standard errors will likely be different in each model.

(d) Is the empirical coverage of your confidence intervals close to 95%?
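A minimal Python sketch of the simulation in steps (a) through (d) may help to fix ideas. It uses only the standard library. The function names (`residualize`, `slope_and_robust_se`, `post_selection_coverage`) are hypothetical, and the robust standard error is assumed to be the HC0 (White) version, computed by partialling out the other regressor (Frisch-Waugh); the assignment does not say which robust variant to use.

```python
import math
import random


def residualize(v, w=None):
    """Residuals from regressing v on a constant (and on w, if given)."""
    n = len(v)
    v_bar = sum(v) / n
    if w is None:
        return [vi - v_bar for vi in v]
    w_bar = sum(w) / n
    s_ww = sum((wi - w_bar) ** 2 for wi in w)
    s_wv = sum((wi - w_bar) * (vi - v_bar) for wi, vi in zip(w, v))
    slope = s_wv / s_ww
    return [(vi - v_bar) - slope * (wi - w_bar) for vi, wi in zip(v, w)]


def slope_and_robust_se(y, x, control=None):
    """OLS coefficient on x (intercept included, optional control) and its
    HC0 standard error, obtained by partialling out the other regressors."""
    ry = residualize(y, control)
    rx = residualize(x, control)
    s_xx = sum(r * r for r in rx)
    b = sum(a * c for a, c in zip(rx, ry)) / s_xx
    u = [a - b * c for a, c in zip(ry, rx)]  # residuals of the full regression
    se = math.sqrt(sum((c * e) ** 2 for c, e in zip(rx, u))) / s_xx
    return b, se


def post_selection_coverage(n=30, reps=1000, beta0=1.0, beta1=1.0, seed=0):
    """Share of post-selection 95% intervals that contain the true beta1."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(reps):
        x1 = [rng.gauss(0, 1) for _ in range(n)]
        x2 = [a + rng.gauss(0, 1) for a in x1]                 # X2 = X1 + V
        y = [beta0 + beta1 * a + rng.gauss(0, 1) for a in x1]  # true beta2 = 0
        b2, se2 = slope_and_robust_se(y, x2, control=x1)       # test beta2 = 0
        if abs(b2 / se2) > 1.96:
            b1, se1 = slope_and_robust_se(y, x1, control=x2)   # keep model (2)
        else:
            b1, se1 = slope_and_robust_se(y, x1)               # keep model (1)
        covered += (b1 - 1.96 * se1 <= beta1 <= b1 + 1.96 * se1)
    return covered / reps
```

Calling `post_selection_coverage()` returns the empirical coverage for the design in the exercise; comparing it with 0.95 is the point of part (d).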

2. IV vs OLS

Consider the following model:

$Y = \beta_0 + \beta_1 X + U$

$X = \pi_0 + \pi_1 Z + V$

where $Cov(X, U) = 0$. This means that X is exogenous. Also, Z is a valid IV: $Cov(Z, U) = 0$ and $Cov(Z, X) \neq 0$. Our target parameter is $\beta_1$.

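(a) Show that $\hat\beta_{1,OLS}$ and $\hat\beta_{1,IV}$ are consistent, where

$\hat\beta_{1,OLS} = \frac{\sum_{i=1}^n (Y_i - \bar Y)(X_i - \bar X)}{\sum_{i=1}^n (X_i - \bar X)^2}$

$\hat\beta_{1,IV} = \frac{\sum_{i=1}^n (Y_i - \bar Y)(Z_i - \bar Z)}{\sum_{i=1}^n (X_i - \bar X)(Z_i - \bar Z)}$.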

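(b) We saw that OLS should be preferred to IV on grounds of lower asymptotic variance. This means that

$\hat\beta_{1,OLS} \overset{a}{\sim} N(\beta_1, AVar(\hat\beta_{OLS}))$

$\hat\beta_{1,IV} \overset{a}{\sim} N(\beta_1, AVar(\hat\beta_{IV}))$

and $AVar(\hat\beta_{OLS}) < AVar(\hat\beta_{IV})$. Simulate S = 1,000 replications of $\hat\beta_{1,OLS}$ and $\hat\beta_{1,IV}$ with n = 500, and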

Y = X + U

X = Z - V

where U ~ N(0, 1), V ~ N(0, 1) and Z ~ N(0, 1). Note that once you specify Z and V, then you can construct X. And once you specify U and construct X, you can construct Y.

(c) Overlay the plots of the densities of $\hat\beta_{1,OLS}$ and $\hat\beta_{1,IV}$. What do you see? Is this consistent with $AVar(\hat\beta_{OLS}) < AVar(\hat\beta_{IV})$?

(d) Compute the mean and the standard deviation of $\hat\beta_{1,OLS}$ and $\hat\beta_{1,IV}$. Interpret this.
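The simulation in parts (b) and (d) could be sketched in Python as follows, using only the standard library. The function names are hypothetical, and the slopes are computed from the usual sample-covariance ratios, which is an assumption about the intended estimator formulas. Plotting the densities for part (c) is left to whatever plotting tool you prefer.

```python
import random
import statistics


def ols_and_iv_slopes(x, y, z):
    """Return (OLS slope, IV slope) for y on x, with z as the instrument."""
    n = len(x)
    x_bar, y_bar, z_bar = sum(x) / n, sum(y) / n, sum(z) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_zy = sum((c - z_bar) * (b - y_bar) for c, b in zip(z, y))
    s_zx = sum((c - z_bar) * (a - x_bar) for c, a in zip(z, x))
    return s_xy / s_xx, s_zy / s_zx


def simulate_ols_vs_iv(n=500, reps=1000, seed=0):
    """Draw `reps` samples from Y = X + U, X = Z - V and estimate both slopes."""
    rng = random.Random(seed)
    ols_draws, iv_draws = [], []
    for _ in range(reps):
        z = [rng.gauss(0, 1) for _ in range(n)]
        v = [rng.gauss(0, 1) for _ in range(n)]
        u = [rng.gauss(0, 1) for _ in range(n)]
        x = [zi - vi for zi, vi in zip(z, v)]  # X built from Z and V
        y = [xi + ui for xi, ui in zip(x, u)]  # Y built from X and U
        b_ols, b_iv = ols_and_iv_slopes(x, y, z)
        ols_draws.append(b_ols)
        iv_draws.append(b_iv)
    return ols_draws, iv_draws


def summarize(draws):
    """Mean and sample standard deviation of the simulated estimates."""
    return statistics.mean(draws), statistics.stdev(draws)
```

For example, `ols, iv = simulate_ols_vs_iv()` followed by `summarize(ols)` and `summarize(iv)` gives the two means and standard deviations to compare.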

3. Leave-one-out estimator

Suppose your model is $Y = \alpha + \beta X + U$, where $X \perp U$, and X and U are standard normals. OLS is the best linear unbiased estimator: it has the smallest variance among linear unbiased estimators. Here we investigate a non-linear alternative:

$\tilde\beta = \hat\beta_{OLS} + Y_i (Y_j - X_j \hat\beta_{-i})$.

Here, i and j are given integers (chosen by you) that can range from 1 through n, the sample size, and $\hat\beta_{-i}$ is the OLS estimator obtained by deleting the i-th observation (leave-one-out).

Using simulations, investigate the variance of $\tilde\beta$ relative to that of $\hat\beta_{OLS}$.
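One way to set up this comparison is sketched below in Python (standard library only). Everything not stated in the exercise is an assumption: the function names, the intercept and slope values (0 and 1), the sample size, and the reading of the alternative estimator as the OLS slope plus Y_i times (Y_j minus X_j times the leave-one-out slope). Indices are 0-based in the code, whereas the text counts from 1.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope from an OLS regression of y on a constant and x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    return s_xy / s_xx


def leave_one_out_variances(n=50, reps=1000, i=0, j=1,
                            alpha=0.0, beta=1.0, seed=0):
    """Simulated variances of the OLS slope and of the alternative estimator.

    alpha, beta, n, i and j are illustrative choices, not values from the text.
    """
    rng = random.Random(seed)
    ols_draws, alt_draws = [], []
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [alpha + beta * xi + rng.gauss(0, 1) for xi in x]
        b_ols = ols_slope(x, y)
        # OLS slope with observation i deleted (leave-one-out)
        b_loo = ols_slope(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        b_alt = b_ols + y[i] * (y[j] - x[j] * b_loo)
        ols_draws.append(b_ols)
        alt_draws.append(b_alt)
    return statistics.variance(ols_draws), statistics.variance(alt_draws)
```

The ratio of the two returned variances is a simple summary of how the alternative compares with OLS; repeating the call for several values of n shows whether the gap shrinks with the sample size.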

4. Heteroskedasticity-Robust Standard Errors

Consider the following model:

$Y = \beta_0 + \beta_1 X + (1 + \gamma X) U$

where $\gamma \neq 0$, $E[U|X] = 0$, and $E[U^2|X] = \sigma^2$. In this case, $\hat\beta_{1,OLS}$ will be consistent. Although $E[U^2|X] = \sigma^2$, the model is not homoskedastic because, in a regression of Y on a constant and X, the error term is V:

$Y = \beta_0 + \beta_1 X + V$

$V = (1 + \gamma X) U$

Here $E[V^2|X] = (1 + \gamma X)^2 \sigma^2$. Therefore, the model exhibits heteroskedasticity. Recall that¹

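$\hat\beta_{1,OLS} \overset{a}{\sim} N\left(\beta_1, \frac{AVar(\hat\beta_{1,OLS})}{n}\right)$

We defined the asymptotic variance of $\hat\beta_{1,OLS}$ to be

$AVar(\hat\beta_{1,OLS}) = \frac{E[(X - E[X])^2 V^2]}{Var(X)^2}$.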

For this exercise, you will run S = 100 regressions with sample size n = 1,000, assuming that $\beta_0 = 0$, $\beta_1 = 0$, and $\gamma = 1$. Also assume that $X \sim N(0, 1)$ and $U \sim N(0, 1)$.

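(a) Save the values of $\hat\beta_{1,OLS}$, and of

$\widehat{AVar}(\hat\beta_{1,OLS}) = \frac{\frac{1}{n}\sum_{i=1}^n \hat U_i^2}{\frac{1}{n}\sum_{i=1}^n (X_i - \bar X)^2}$

$\widehat{AVar}(\hat\beta_{1,OLS})_R = \frac{\frac{1}{n}\sum_{i=1}^n (X_i - \bar X)^2 \hat U_i^2}{\left[\frac{1}{n}\sum_{i=1}^n (X_i - \bar X)^2\right]^2}$

where $\hat U_i = Y_i - \hat\beta_{0,OLS} - \hat\beta_{1,OLS} X_i$. Here $\widehat{AVar}(\hat\beta_{1,OLS})$ is an estimator of $AVar(\hat\beta_{1,OLS})$ under the incorrect homoskedasticity assumption, and $\widehat{AVar}(\hat\beta_{1,OLS})_R$ is the (correct) heteroskedasticity-robust estimator of the asymptotic variance.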

(b) Construct 95% confidence intervals for $\beta_1$ using each of the estimators of the asymptotic variance:

$\hat\beta_{1,OLS} \pm 1.96 \times \sqrt{\widehat{AVar}(\hat\beta_{1,OLS}) / n}$

$\hat\beta_{1,OLS} \pm 1.96 \times \sqrt{\widehat{AVar}(\hat\beta_{1,OLS})_R / n}$

We refer to $\sqrt{\widehat{AVar}(\hat\beta_{1,OLS})_R / n}$ as robust standard errors. Plot the confidence intervals and compute the proportion of confidence intervals that contain the true value of $\beta_1$ in each case. This is called the empirical coverage of the confidence intervals. Comment on your findings.

(c) In the case where $\gamma = 0$, the assumption of homoskedasticity is correct. However, in general we still have $\widehat{AVar}(\hat\beta_{1,OLS}) \neq \widehat{AVar}(\hat\beta_{1,OLS})_R$. Which one should you use? Base your answer on the empirical coverage of the confidence intervals, now computed for $\gamma = 0$. (Before, we had assumed that $\gamma = 1$.)

(d) Plot the empirical coverage of both types of confidence intervals for a range of $\gamma$ from 0 to 2 in increments of 0.1: $\gamma = 0, 0.1, 0.2, \ldots, 1.9, 2$. Depending on the value of $\gamma$, which estimator should you use?
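The coverage comparison in parts (b) through (d) could be sketched in Python as below, with the standard library only. The function name `coverage_by_gamma` is hypothetical. The sketch assumes that the two variance estimators are the usual sample analogues (homoskedastic: mean squared residual over the sample variance of X; robust: mean of squared demeaned X times squared residual, over the squared sample variance of X) and that the standard error is the square root of the estimated asymptotic variance divided by n.

```python
import math
import random


def coverage_by_gamma(gamma, n=1000, reps=100, beta0=0.0, beta1=0.0, seed=0):
    """Empirical coverage of the homoskedastic and robust 95% intervals."""
    rng = random.Random(seed)
    hits_homo = hits_robust = 0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [beta0 + beta1 * xi + (1 + gamma * xi) * rng.gauss(0, 1) for xi in x]
        x_bar, y_bar = sum(x) / n, sum(y) / n
        dx = [xi - x_bar for xi in x]
        s_xx = sum(d * d for d in dx)
        b1 = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / s_xx
        b0 = y_bar - b1 * x_bar
        u = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]  # OLS residuals
        var_x = s_xx / n
        avar_homo = (sum(e * e for e in u) / n) / var_x
        avar_robust = (sum((d * e) ** 2 for d, e in zip(dx, u)) / n) / var_x ** 2
        half_homo = 1.96 * math.sqrt(avar_homo / n)
        half_robust = 1.96 * math.sqrt(avar_robust / n)
        hits_homo += abs(b1 - beta1) <= half_homo
        hits_robust += abs(b1 - beta1) <= half_robust
    return hits_homo / reps, hits_robust / reps


# Grid for part (d): gamma = 0, 0.1, ..., 2
GAMMA_GRID = [k / 10 for k in range(21)]
```

Evaluating `coverage_by_gamma(g)` for each `g` in `GAMMA_GRID` yields the two coverage curves to plot against gamma. With only 100 replications per point the curves are noisy, so it may be worth raising `reps` when interpreting small differences.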

5. Weak IV

Consider the following model:

$Y = \beta_0 + \beta_1 X + U$

$X = \pi_0 + \pi_1 Z + V$

where X is endogenous: $Cov(X, U) \neq 0$. However, Z is a valid IV: $Cov(Z, U) = 0$ and $Cov(X, Z) \neq 0$. Our target parameter is $\beta_1$. The IV estimator is

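$\hat\beta_{1,IV} = \frac{\sum_{i=1}^n (Y_i - \bar Y)(Z_i - \bar Z)}{\sum_{i=1}^n (X_i - \bar X)(Z_i - \bar Z)}$

and its asymptotic distribution is

$\hat\beta_{IV} \overset{a}{\sim} N\left(\beta_1, \frac{AVar(\hat\beta_{IV})}{n}\right)$,

where we denote

$AVar(\hat\beta_{IV}) = \frac{E[(Z - E[Z])^2 U^2]}{Cov(Z, X)^2}$.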

In the weak IV setting, we have that $Cov(X, Z) \approx 0$. Consequently, the normal approximation to the IV estimator will not be very reliable. In this exercise we will do S = 1,000 simulations of sample size n = 1,000, and we will set $\beta_0 = 0$, $\beta_1 = 2$, $\pi_0 = 0$ and $\pi_1 = 1/\sqrt{n} = 1/\sqrt{1{,}000}$. Also, let $U \sim N(0, 1)$ and $Z \sim N(0, 1)$. To generate the endogeneity, we need $Cov(U, V) \neq 0$. If you just generate V = rnorm(n, 0, 1), it will be independent of U. Instead, generate V as