EC3380

Summer Examinations 2021/22

Econometrics 2: Microeconometrics

Section A: Answer ALL SIX questions

1. The Warwickshire County Council introduced a policy to install ‘smart’ electricity meters in all dwellings in January of 2016. The intervention was universally adopted during the month of January (i.e. everyone was treated). The neighbouring counties of Worcestershire and West Midlands County still use traditional meters that don’t show ‘real-time’ electricity usage and expenditure. Ignore any potential interaction with the policies discussed in questions 2 & 3.

A researcher has a repeated cross-section sample of monthly electricity usage data for a representative sample of dwellings (i.e. houses) in Warwickshire, Worcestershire, and West Midlands County from January 2015 to December 2016. They estimate the following difference-in-differences model:

Yitc = αc + δt + βDct + εitc

where Yitc is monthly electricity consumption for dwelling i in county c in month t. Time is measured in month intervals, t ∈ {..., 11:2015, 12:2015, 01:2016, ...}. The dummy variable Dct equals 1 for dwellings in Warwickshire from January 2016 onwards, and αc is a county FE.

(a) Given the estimating equation, write down an expression for E[yitc(0) | t, c], where yitc(0) is the potential consumption under the counterfactual that the policy did not take place. In addition, comment on what this equation says about the identifying assumptions of the model. (4 marks)

(b) The researcher is concerned that the parallel trends assumption might not hold. They therefore decide to include a set of dwelling-specific control variables: Xty. Provide an expression for E[yitc(0) | t, c, Xt] and comment on the new identifying assumptions. (4 marks)

(c) Suppose the researcher had a panel dataset instead. Write down a comparable linear regression model which yields an estimator of the Average Treatment Effect on the Treated (ATT) with a lower variance. Provide a brief discussion of why the variance of this estimator is lower. (4 marks)

2. The Warwickshire County Council performed a randomized control trial that used behavioural ‘nudges’ to induce changes in energy consumption (electricity and gas). Ignore any potential interaction with the policies discussed in questions 1 & 3. They chose a random sample of 3,000 dwellings. The council randomly assigned the sample to 3 treatment groups: Gi = {0 “control”, 1 “treatment 1”, 2 “treatment 2”}. Treatment group 1 received a letter depicting the dwelling’s energy use over the last calendar year, as well as its estimated carbon footprint. Treatment group 2 received the same letter, but in addition was told the average energy use and carbon footprint of their neighbours.

The linear regression below is used to estimate the Average Treatment Effect of this intervention. The regression model is estimated using a single period of data, with the variable yi (total energy consumption in kilowatt hours) measured a year after the intervention using administrative records.

yi = a + g1 D1i + g2 D2i + ei

where D1i = 1{Gi = 1 or Gi = 2} and D2i = 1{Gi = 2}, with 1{·} denoting the indicator function. By randomization, we can assume that E[ei | D1i, D2i] = E[ei | Gi] = 0.

(a) Show that g2 = E[yi | Gi = 2] − E[yi | Gi = 1]. (2 marks)

(b) Let {yi(0), yi(1), yi(2)} denote the potential outcomes of energy consumption without treatment, with treatment 1, and with treatment 2. Using the expression above, show that g2 can be expressed as the difference between the ATE of treatments 2 and 1. (6 marks)

(c) Suppose the council wanted to learn about how the impact of the second treatment depended on whether the dwelling consumed more or less than their neighbours. Let Hi be a dummy variable that identifies households that use more energy than their neighbours. Provide a brief discussion explaining why it would be advantageous to stratify the sample by this dwelling characteristic in addition to the random assignment. (4 marks)

3. In a third policy, the Warwickshire County Council introduced a new energy saving policy across the county in a staggered manner. Ignore any potential interaction with the policies discussed in questions 1 & 2. Each dwelling gets its rubbish collected on a different day of the week (Monday through Friday). They decided to use this distinction to divide dwellings into 5 treatment cohorts that would receive a staggered treatment, adding a new group each year. The treatment subsidized investments into new, more efficient electric boilers.

To estimate the impact of this policy on house prices, a researcher estimates the model:

yitd = ad + ot + Σ_{j∈J} gj 1{t − sd = j} + uitd

using data from Warwickshire alone for years 2016-2021. The outcome, yitd, is (log of) the sale price of house i in year t and treatment cohort d = {1 “Monday”, 2 “Tuesday”, . . . , 5 “Friday”}. Included in the specification are cohort FE’s (ad) and time FE’s (ot), where time is measured in calendar years. The dynamic treatment effect is captured by a full set of event-time dummy variables. Event-time is t − sd, where sd ∈ {2017, 2018, 2019, 2020, 2021} for d = 1, ..., 5 is the year in which each cohort of houses is treated. The set of event-time dummy variables covers event-times

J = {−5, . . . , −2, 0, . . . , 4}.

(a) Express the following in terms of gj coefficients:

[E[yitd | d = 1, t = 2020] − E[yitd | d = 1, t = 2016]]

− [E[yitd | d = 5, t = 2020] − E[yitd | d = 5, t = 2016]]

assuming strict exogeneity. (4 marks)

(b) Suppose that the final treatment group was never treated. That is, sd ∈ {2017, 2018, 2019, 2020} for d = 1, ..., 4, and for the never-treated cohort we can set s5 = ∞. Solve for the double-difference in (a) with this new information. In addition, comment on the value of a never-treated control group in the context of an event-study design. (4 marks)

(c) Now, suppose again that all cohorts are treated (i.e. s5 = 2021). The researcher is aware that without a never-treated control group there is an under-identification problem in the initial estimating equation. They therefore decide to exclude cohort FE’s (ad) from the model. Discuss whether you believe this is a reasonable strategy. (4 marks)

4. Suppose you observe an outcome variable Yi for individual i, a binary treatment variable Di (Di = 1 if treated, Di = 0 if not treated) and a binary instrument Zi .

(a) Provide an interpretation for E[Di |Zi = 1] − E[Di |Zi = 0] and use the expression to characterise the set of compliers, always-takers, never-takers and defiers. (4 marks)

(b) Write out the reduced form regression equation. Suppose you run this regression. What assumption(s) need(s) to be satisfied to give the parameter in this equation a causal interpretation? (4 marks)

(c) Suppose you also observe a set of covariates Xi. You find no statistical difference in any of the covariates between the two groups of individuals in your sample characterised by (Zi = 1) and (Zi = 0), respectively. What do you take away from this? (4 marks)

5. Suppose you are interested in the effect of neighbourhood pollution on subjective health assessment. You have data on pollution levels pl by location l, and the health assessment Yil for a sample of individuals indexed by i (i = 1, ..., N), living in location l, where Yil = 0: poor health, Yil = 1: satisfactory health, Yil = 2: good health, Yil = 3: excellent health. Suppose that there is an underlying latent health variable Y*il which is given by Y*il = xilβ + eil, where we assume that eil ∼ N(0, σ²). xil is a vector of characteristics, including pollution levels pl.

Assume that there exists a set of cut-off parameters αj for the latent variable that determine the outcome Yil, where j ∈ {1, 2, 3} and α1 < α2 < α3.

(a) Using the information in the question, write down expressions for the probabilities of being in poor health (Yil = 0) and in good health (Yil = 2). (4 marks)

(b) Denote by βp the parameter associated with pollution levels pl . Derive the marginal effect of an increase in pollution levels on the probability of being in good health. (2 marks)

(c) Suppose you want to estimate the parameters of this model using maximum likelihood. Derive the log-likelihood function for this estimation. Make sure to write down all the necessary steps for your derivation. (6 marks)

6. An important question for criminal justice systems is how early release from a prison sentence affects recidivism (the tendency of a convicted criminal to reoffend). Suppose there is a policy in place that allows offenders to be released from prison early, but only offenders with a sentence of at least 1 year are eligible for the programme. A judge decides whether an offender can be released early, and not all offenders with a prison sentence of at least 1 year will be released early.

You have data on a sample of offenders (indexed by i), including their sentence si (in years), a continuous measure for recidivism once released from prison Ri, participation in the early-release programme Di (Di = 1 if released early, Di = 0 otherwise), and a set of controls Xi.

You want to estimate the following equation:

Ri  = a1 + ADi + g1 si + y1Xi + ii .

(a)  Briefly discuss the reason why an estimate of A from the above regression will likely not give you an estimate of the causal effect of interest. (2 marks)

(b) Describe how you would use the information on eligibility for the early-release programme to get a causal estimate of the treatment effect. In doing so, make sure to write out relevant equations, discuss any assumptions and briefly describe the implementation of your approach. (10 marks)

Section B: Answer ONE question

7. Consider a simplified version of the RCT described in Question A.2. The RCT has one treatment group and the treatment aims to reduce energy consumption (Yi). The treatment sends a letter with the dwelling’s consumption as well as the average of their neighbours’.

The researcher estimates two models:

Yi = α + βDi + εi

and:

Yi = αm + βmDi + γGi + υi

where Gi is a dummy variable identifying the main source of heating in the dwelling (=1 if gas; =0 if electricity). Both Yi and Gi are measured a year after the intervention. Assume that households are more likely to switch to electric heating after reading about their relative carbon footprint. That is:

Gi = δ + πDi + ξi

where E[ξi |Di ] = 0 and π < 0.

(a) Show that E[υi |Di = 1] − E[υi |Di = 0] = −γπ. [HINT: Use the fact that εi = γGi + υi and E[εi |Di ] = 0.] (10 marks)

(b) Suppose γ > 0: dwellings with gas boilers use more energy. Provide an intuitive explanation for why the difference in part (a) is positive. (6 marks)

(c) The Law of Iterated Expectations tells us that E[υi |Di ] = EG[E[υi |Di , Gi ]]. Thus, E[υi |Di , Gi ] = 0 ⇒ E[υi |Di ] = 0. [NOTE: this is not a “⇐⇒” relationship.] With this in mind, explain why we should not include Gi in the model. (6 marks)

(d) What does this demonstration teach us about “good” vs “bad” controls? What measure of ‘source of heating’ should we use instead? (6 marks)

8. One of the most explored questions in labour economics is how an additional year of schooling affects an individual’s earnings. Since education is likely an endogenous variable for a number of reasons, a range of methodologies have been employed to establish a causal link between schooling and earnings. Card (1995) uses information on college proximity as an instrumental variable (IV) for education. He finds that individuals growing up closer to a college have significantly higher levels of education, even after controlling for a range of characteristics including family background variables (parents’ education and family structure when growing up), regional characteristics (broad region and whether they live in a large city or not), and individual characteristics such as experience and race. You can assume for this question that we can measure college proximity using a binary variable that is equal to one if an individual lives within a certain radius of a college, and zero otherwise.

(a) State the IV assumptions, commenting on whether you think they are satisfied in the given context and whether they are testable. In discussing the assumptions, discuss whether you think additional data (and if so, what kind of data) may make your assumptions more credible. (10 marks)

(b) Imbens and Angrist (1994) show that IV identifies the Local Average Treatment Effect (LATE). In words, define the LATE and discuss the interpretation of the IV estimate as LATE in the given context. (8 marks)

(c) Discuss one alternative setting and/or method that has been or could be used to causally identify the returns to education (not necessarily the returns to attending college). Be specific in your example, clearly stating the data requirements and assumptions for your method. (10 marks)