PPHA 34600: Program Evaluation

Spring 2022

Problem Set 1

Instructions:

This problem set consists of two files: (1) this document with instructions and questions; and (2) a dataset which you will use to answer the questions below.

You can work in groups of up to three. Groups can share code, but each group member must turn in their own problem set, and must have separate written answers to the questions. You may not share any written work (including drafts) with other members of your group. You should submit both written answers -- which should be parsimonious -- and a file which contains your code and results for the data analysis. You must use R. If you know how to use them, I recommend that you use RMarkdown or knitr, which will allow you to intersperse your code and written answers (but this is not required). Note that you are primarily being graded on your written answers. Problem sets must be submitted in PDF format. Problem sets must be turned in via Canvas using Gradescope; no late submissions will be considered.

Questions:

You have been asked by a well-meaning NGO, Kenya Electricity Loss Lessening Experts, Registered (KELLER), to help them learn about the impacts of their flagship program. KELLER works with the Kenyan government to disconnect electricity users who do not pay their bills, and hypothesizes that these disconnections increase payment.

1.   KELLER would like to know about the payment impacts of their disconnections program. They say they’re interested in measuring the impact of their disconnections, but don’t exactly know what that means. Use the potential outcomes framework to describe the impact of treatment (defined as “disconnecting a household’s electricity”) for household i on electricity payments (measured in rupees) formally (in math) and in words.

2.   KELLER are extremely impressed. They want to know how they can go about measuring τ_i. Let them down gently, but explain to them why estimating τ_i is impossible.

3.   KELLER are on board with the idea that they can’t estimate individual-specific treatment effects. They suggest estimating the average treatment effect instead. They are willing to give you some of their early data on payments. They have data on households who did and didn’t get disconnected, and want you to compare the average payments across the two sets of households. Describe what this is actually measuring, and provide an example of why this may differ from the average treatment effect.

4.   KELLER have realized the error of their ways. Their CEO tells you, “Okay, we understand that our data won’t let us estimate the average treatment effect. But can’t we estimate the average treatment effect on the treated?” First formally (in math) define the ATT in this context, and then explain whether or not the KELLER data will allow you to estimate it. If so, describe how what you see in the data corresponds to the necessary components of the ATT. If not, explain why not, and describe what you can’t see in the data that you’d need to observe.

5.   KELLER forgot to tell you that they ran a randomized pilot study to estimate the effects of disconnections on payments. They’re happy to share those data with you: find them in ps1_data_22.csv. This experience has made you a little bit skeptical of KELLER’s skills, so start by checking (with a proper statistical test) that the treatment group and control group are balanced in pre-treatment payments, electricity usage, household size, and household head age. Use keller_trt as your treatment variable. Report your results. What do you find?

6.   Plot a histogram of pre-treatment payments for treated and control households. What do you see? Re-do your balance table to reflect any necessary adjustments. What does this table tell you about whether or not KELLER’s randomization worked? What assumption do we need to make on unobserved characteristics in order to be able to estimate the causal effect of keller_trt?

7.   Assuming that keller_trt is indeed randomly assigned, describe how to use it to estimate the average treatment effect, and then do so. Please describe your estimate: what is the interpretation of your coefficient (be clear about your units)? Is your result statistically significant? Is the effect you find large or small, relative to the mean in the control group?

8.   KELLER is convinced that the reason their disconnections are effective is because they are getting households to use less electricity. They want you to estimate the effects of the disconnections, but controlling for the endline amount of power consumed. Is this a good idea? Why or why not? Run this regression and describe your estimates. How do they differ from your results in (7)? What about controlling for baseline electricity consumption? Run this regression and describe your estimates. How do they differ from your results in (7)? How do the two estimates differ? What is driving any differences between them?

9.   One of the KELLER RAs (the real workforce!) informs you that not everybody who was supposed to be disconnected (keller_trt = 1) actually got disconnected. She tells you that the actual treatment indicator is keller_trt_yes. (Since disconnections are expensive, KELLER assures you that nobody in the control group got disconnected.) In light of this new information, what did you actually estimate in question (7)? How does this differ from what you thought you were estimating?

10. KELLER aren’t actually interested in the effect of assignment to treatment - they want to know about the actual effects of their disconnections. Describe (in math, and then in words) what you can estimate using the two treatment variables we observe, keller_trt and keller_trt_yes. Estimate this object (you can ignore standard errors just for this once). Interpret your findings. How does this compare to what you estimated in (7)?
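
As a starting point for the data questions, the mechanics of a balance test and a difference-in-means regression in R might look like the sketch below. It runs on simulated data so that it works without the course file. Only keller_trt comes from this problem set; baseline_payment and payment are hypothetical column names, and the built-in effect of 5 is arbitrary.

```r
# Simulated stand-in for ps1_data_22.csv (an assumption for illustration);
# with the real file one would start from read.csv("ps1_data_22.csv").
set.seed(34600)
n <- 1000
df <- data.frame(
  keller_trt = rbinom(n, 1, 0.5),        # assignment indicator (name from the problem set)
  baseline_payment = rnorm(n, 100, 20)   # hypothetical pre-treatment variable
)

# Hypothetical outcome with an arbitrary built-in effect of 5.
df$payment <- df$baseline_payment + 5 * df$keller_trt + rnorm(n, 0, 10)

# Balance check: Welch two-sample t-test of a pre-treatment variable
# across the two assignment groups.
balance <- t.test(baseline_payment ~ keller_trt, data = df)
print(balance$p.value)

# Difference in means via OLS: the coefficient on keller_trt equals the
# treated-minus-control difference in average payment.
fit <- lm(payment ~ keller_trt, data = df)
print(summary(fit)$coefficients)
```

The same two calls can be repeated for each pre-treatment variable in the real data. The actual column names should be checked with names() after loading the file.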