Coursework

Applied Econometrics and Policy Analysis

This document contains the coursework task. You need to create a separate document with the report, containing your answers to the questions below. The maximum length of the report is 5 pages, including figures and tables. Extra pages will not be read. Stick to the word limit where specified; longer answers will receive a mark of zero. Be precise in your writing: vague and generic statements will be downgraded.

1 Estimating Causal Effects [70 Marks]

You are interested in estimating a causal effect of a training program called Job Corps: https://en.wikipedia.org/wiki/Job_Corps. The Job Corps is the largest U.S. labour market program targeting disadvantaged youths. It provides academic, vocational, and social training, as well as health care counselling and job search assistance, for an average duration of eight to nine months.

The dataset for this exercise is uploaded on Canvas. Load the data file job_corps_sample.csv, which contains a subsample of the individuals who participated in the program. The dataset contains the following variables:

Table 1: Variable definitions for Job Corps evaluation data

1. First, examine the data by reporting the following descriptive statistics:

Share of the sample, who participated in the program (in %)

Share of females in the sample (in %)

Number of people who smoked marijuana in last year

Number of people who live with spouse or partner and are parents
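The shares and counts in item 1 are simple functions of 0/1 indicator columns. The following is a minimal sketch on made-up toy data, not the Job Corps sample; the helper names `share_percent` and `count_all` are hypothetical, and the columns are assumed to be coded 0/1.

```python
def share_percent(indicator):
    """Share of observations with indicator == 1, in percent."""
    return 100.0 * sum(indicator) / len(indicator)


def count_all(*indicators):
    """Number of rows in which every 0/1 indicator equals 1."""
    return sum(1 for row in zip(*indicators) if all(v == 1 for v in row))


# Toy data, made up for illustration (NOT the Job Corps sample).
participation = [1, 0, 1, 1, 0, 0, 1, 0]
livespou = [1, 0, 0, 1, 1, 0, 0, 0]
haschld = [1, 0, 0, 0, 1, 0, 1, 0]

participation_share = share_percent(participation)
spouse_and_parent = count_all(livespou, haschld)

print(round(participation_share, 3))
print(spouse_and_parent)
```

With the real data, the same functions would be applied to the columns read from job_corps_sample.csv.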

2. We want to estimate the causal effect of participation in Job Corps on the weekly earnings of participants four years after the program. For this purpose, we estimate the univariate linear regression model,

EARNY4 = γ + δ · participation + u.                             (1.1)

a) What is the main assumption we have to make to interpret δ in a causal way? Write a maximum of 100 words. 2 MARKS

b) How credible is the identifying assumption in this application? Write a maximum of 100 words. 3 MARKS

3. To make the identifying assumption more plausible, we can select the control variables X using economic theory or intuition. For example, Job Corps participants usually live on a campus during the program. Accordingly, parents might have a lower participation probability, because it is difficult to live together with a child on the campus. Furthermore, the effect of having a child (haschld) on the participation probability might differ between mothers and fathers. Accordingly, the interaction between gender and children (haschld*female) might be relevant as well. Estimate the following multivariate model:

EARNY4 = γ + δ · participation + β₁ · haschld + β₂ · (haschld * female) + u.    (1.2)

What is the main assumption we have to make to interpret δ in a causal way? How credible is the identifying assumption in this model? Write a maximum of 100 words. 6 MARKS

4. Compare the observable characteristics of participants and non-participants. Characteristics that differ greatly between the two groups appear to be important control variables.

The standardized differences (SD) are defined as

(1.3)

where X1 and X0 are the means and σX1 and σX0 the standard deviations in the groups of participants and non-participants, respectively. Large standardized differences indicate a large imbalance between the characteristics of participants and non-participants.

Report the standardized differences defined in eq. (1.3) for livespou, publich, evarrst, badhlth, and job9_12, and interpret the numbers briefly. Report the numbers with 3 digits after the decimal point, in the format of the table below:

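| Variable | SD | Brief interpretation (max. 20 words) |
|----------|----|--------------------------------------|
| livespou |    |                                      |
| publich  |    |                                      |
| evarrst  |    |                                      |
| badhlth  |    |                                      |
| job9_12  |    |                                      |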

10 MARKS
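The computation in item 4 can be sketched as follows. Because the formula of eq. (1.3) is not reproduced in this text, the sketch assumes one commonly used definition, SD = 100 · |X1 − X0| / sqrt(0.5 · (σ²X1 + σ²X0)); if the course definition differs (for example, no factor of 100 or no absolute value), the function must be adapted. The function name and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def standardized_difference(x, d):
    """Standardized difference of x between d == 1 and d == 0.

    ASSUMPTION: uses 100 * |mean1 - mean0| / sqrt(0.5 * (var1 + var0)),
    a common definition; eq. (1.3) itself is not available here.
    """
    x1 = [xi for xi, di in zip(x, d) if di == 1]
    x0 = [xi for xi, di in zip(x, d) if di == 0]
    pooled_sd = sqrt(0.5 * (variance(x1) + variance(x0)))
    return 100.0 * abs(mean(x1) - mean(x0)) / pooled_sd


# Toy data, made up for illustration (NOT the Job Corps sample).
d = [1, 1, 1, 1, 0, 0, 0, 0]   # participation indicator
x = [1, 1, 1, 0, 1, 0, 0, 0]   # some 0/1 characteristic

sd_x = standardized_difference(x, d)
print(f"{sd_x:.3f}")
```

Note that `statistics.variance` is the sample variance (divisor n − 1); whether the course uses this or the population variance is also an assumption.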

5. Now we estimate the multivariate linear regression model,

EARNY4 = γ + δ · participation + β · X + u.                     (1.4)

where X consists of the following control variables: livespou, publich, evarrst, badhlth, job9_12.

Estimate Models 1 to 3, defined in equations (1.1), (1.2), and (1.4) respectively. For each model, report in the format of the table below: 1) the estimated effect δ̂ of participation in Job Corps on earnings, 2) the 95% confidence interval for this effect, and 3) the p-value for the one-sided test of H0: δ ≤ 0 vs. Ha: δ > 0. Report the numbers with 3 digits after the decimal point.

|            | Model 1 | Model 2 | Model 3 |
|------------|---------|---------|---------|
| 1) δ̂       |         |         |         |
| 2) 95% CI  |         |         |         |
| 3) p-value |         |         |         |

15 MARKS
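For the univariate Model 1, the three requested quantities can be sketched with the standard library alone. This sketch assumes homoskedastic standard errors and a normal approximation to the t distribution, neither of which is prescribed by the task; the function names and toy data are hypothetical. Models 2 and 3 need a multivariate OLS routine from a regression package, which is not shown here.

```python
from math import sqrt
from statistics import NormalDist, mean


def simple_ols(y, d):
    """Slope and its standard error for y = gamma + delta * d + u.

    ASSUMPTION: homoskedastic errors (classical OLS standard error).
    """
    n = len(y)
    d_bar, y_bar = mean(d), mean(y)
    sxx = sum((di - d_bar) ** 2 for di in d)
    sxy = sum((di - d_bar) * (yi - y_bar) for di, yi in zip(d, y))
    delta = sxy / sxx
    gamma = y_bar - delta * d_bar
    resid = [yi - gamma - delta * di for yi, di in zip(y, d)]
    s2 = sum(e ** 2 for e in resid) / (n - 2)   # residual variance
    return delta, sqrt(s2 / sxx)


def one_sided_p_value(delta, se):
    """p-value for H0: delta <= 0 vs. Ha: delta > 0 (normal approximation)."""
    return 1.0 - NormalDist().cdf(delta / se)


def confidence_interval(delta, se, level=0.95):
    """Two-sided confidence interval (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return delta - z * se, delta + z * se


# Toy data, made up for illustration (NOT the Job Corps sample).
earnings = [200, 220, 180, 240, 150, 170, 160, 140]
participation = [1, 1, 1, 1, 0, 0, 0, 0]

delta_hat, se_hat = simple_ols(earnings, participation)
ci_low, ci_high = confidence_interval(delta_hat, se_hat)
p_one_sided = one_sided_p_value(delta_hat, se_hat)
print(f"{delta_hat:.3f} [{ci_low:.3f}, {ci_high:.3f}] p={p_one_sided:.3f}")
```

When regression software reports only a two-sided p-value and the estimate is positive, the one-sided p-value for this hypothesis pair is half the two-sided one.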

6. Using a logit model, estimate propensity scores for all individuals in the sample. As controls, use haschld * female, haschld, livespou, publich, evarrst, badhlth, job9_12. Plot two histograms of the propensity scores, one for the treatment group and one for the control group, and discuss whether the overlap assumption holds. Write a maximum of 100 words. 10 MARKS
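The histograms in item 6 can be complemented by a simple numerical check of the common support. The sketch below takes already estimated propensity scores as given (the logit estimation itself is not shown) and reports the range of scores covered by both groups; the function name and the toy scores are hypothetical, and this min/max rule is only one of several possible overlap diagnostics.

```python
def common_support(pscore, d):
    """Return (lower, upper, n_outside) for the region of scores
    that is covered by both the treated and the control group."""
    treated = [p for p, di in zip(pscore, d) if di == 1]
    control = [p for p, di in zip(pscore, d) if di == 0]
    lower = max(min(treated), min(control))
    upper = min(max(treated), max(control))
    n_outside = sum(1 for p in pscore if p < lower or p > upper)
    return lower, upper, n_outside


# Toy propensity scores, made up for illustration.
pscore = [0.30, 0.55, 0.70, 0.90, 0.10, 0.35, 0.50, 0.65]
d = [1, 1, 1, 1, 0, 0, 0, 0]

lower, upper, n_outside = common_support(pscore, d)
print(lower, upper, n_outside)
```

Scores close to 0 or 1 deserve attention even inside this range, because they lead to very large weights in inverse probability weighting.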

7. Estimate the Average Treatment Effect (ATE) and the Average Treatment Effect on the Treated (ATET) using the IPW type I estimator with the propensity scores from the previous step. Report your results with 3 digits after the decimal point.

Explain why the ATE and ATET estimates differ from each other and from the OLS estimates. Write a maximum of 200 words. 20 MARKS
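The weighting step in item 7 can be sketched as follows. The text does not define "type I", so the sketch assumes it means the unnormalized inverse probability weighting estimators, in which the weights are not rescaled to sum to one; the course materials may define it differently. Function names and toy data are hypothetical.

```python
def ipw_ate(y, d, p):
    """Unnormalized IPW estimate of the ATE.

    ASSUMPTION: 'type I' is read as the unnormalized estimator
    (1/N) * sum( D*Y/p - (1-D)*Y/(1-p) ).
    """
    n = len(y)
    return sum(di * yi / pi - (1 - di) * yi / (1 - pi)
               for yi, di, pi in zip(y, d, p)) / n


def ipw_atet(y, d, p):
    """Unnormalized IPW estimate of the ATET:
    (1/N1) * sum( D*Y - (1-D) * p/(1-p) * Y ), N1 = number treated."""
    n1 = sum(d)
    return sum(di * yi - (1 - di) * pi / (1 - pi) * yi
               for yi, di, pi in zip(y, d, p)) / n1


# Toy data, made up for illustration (NOT the Job Corps sample).
y = [200, 220, 150, 170]       # outcome
d = [1, 1, 0, 0]               # treatment indicator
p = [0.5, 0.5, 0.5, 0.5]       # propensity scores (constant here)

ate_hat = ipw_ate(y, d, p)
atet_hat = ipw_atet(y, d, p)
print(f"{ate_hat:.3f} {atet_hat:.3f}")
```

With a constant propensity score of 0.5 and equally sized groups, both estimates reduce to the simple difference in means; with scores that vary across individuals, the two estimators weight the observations differently and generally give different results.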

2 Short Answers [30 Marks]

Each answer should be at most 150 words.

1. Suppose we estimate the effect of military service on earnings using the draft lottery number as an instrument. For which population does the IV estimator identify a causal effect, and under which assumptions? 10 MARKS

2. Explain the difference between AIC and BIC as model selection criteria. Which would you prefer in small samples, and why? 10 MARKS

3. Why might forecasts from a simple AR(1) model sometimes outperform those from a higher-order ARMA model, even if the latter has a better in-sample fit? 10 MARKS