ECONOMICS 705

Econometrics II (First Half)

Fall 2023

PROBLEM SET 1

Writing up your answers to the empirical problems

Your write-up should consist of two portions. The first portion is just the answers to the questions, with whatever text is required to explain them. This portion must be typed!

The second portion, on separate pages, consists of a Stata log file that shows how you got the answers to the empirical questions. The log file must be clear and must include comments that will allow the reader to quickly see the command or commands leading to each answer. It should not include everything you tried – just the final set of commands employed to get the answers.

Submit completed problem sets on Canvas as a single PDF file with a filename of the form “LastnameFirstnamePS#.pdf” where “#” is 1, 2, or 3, depending on the problem set.

Problem sets not turned in on Canvas using the format just described will receive no credit.

Data for empirical problems

This problem set uses data from the National Supported Work Demonstration (NSW), one of the first major social experiments in the world.

A sequence of papers uses data on the male participants in the NSW to study the performance of alternative non-experimental identification strategies and estimators using the experimental impact estimates as a benchmark. These papers include LaLonde (1986), Heckman and Hotz (1989), Dehejia and Wahba (1999, 2002) and Smith and Todd (2005a,b).

LaLonde (1986) also studies the female participants in the NSW who were in the Aid to Families with Dependent Children (AFDC = “welfare”) target group, but his data on the AFDC women were lost. Calónico and Smith (2017) recreate his analysis file for the AFDC women. It is their recreation that we use for this problem set. The Calónico and Smith (2017) paper is available on Canvas.

File name on Canvas: “Economics 705 Fall 2023 NSW Women Data.dta”

The dataset contains the following variables:

treated: 1 for the experimental treatment group and 0 for the experimental control group

service: 1/0 received services

age: age in years

educ: years of schooling

black: 1/0 black

hisp: 1/0 Hispanic

married: 1/0 married

re74: real earnings in “1974”

re75: real earnings in 1975

re78: real earnings in 1978

Random assignment took place in 1976 and 1977, so real earnings in “1974” and real earnings in 1975 are conditioning variables while real earnings in 1978 is the outcome variable. Smith and Todd (2005) explain the shock quotes on “1974”.

Note that I made up the service variable for this problem set. In fact, the NSW experiment had very few no-shows and essentially no control group substitution.

Problems

1. (5 points) Drop observations with missing values of real earnings in 1978 (i.e. re78). In real life, one would worry more about this, and so do more about this, than we are doing here. There are missing values of re78 because many individuals did not respond to the follow-up survey that provides the information used to construct it (“unit non-response”), or they did respond to the survey but did not respond to the specific question about earnings (“item non-response”).

2. (5 points) In expectation, experiments balance the distributions of all variables not affected by the treatment, both observed and unobserved, between the experimental treatment group and the experimental control group. Summarize the data on the available baseline variables (i.e. age, educ, black, hisp, married, re74, and re75) separately for the experimental treatment group and the experimental control group. Do you see any large differences? Explain.

3. (5 points) Using Stata’s ttest command, formally test the null hypothesis of equal population means in the treatment and control groups for the seven baseline variables. Describe your findings. Explain why (or why not) this is an interesting null to test.

4. (5 points) Calculate the “standardized difference” for the re74 variable. Report and discuss; in particular, is it larger or smaller than the arbitrary cutoff of 20 defined by Rosenbaum and Rubin?
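As a rough illustration of the arithmetic only (in Python, not the Stata your log file must use), the sketch below computes a standardized difference in the form commonly attributed to Rosenbaum and Rubin: the difference in group means expressed as a percentage of the square root of the average of the two group variances. The function name and the toy numbers are made up for illustration and are not NSW data.

```python
from math import sqrt
from statistics import mean, variance


def standardized_difference(treated, control):
    """Difference in means as a percentage of the pooled standard deviation.

    Assumes the form 100 * (mean_t - mean_c) / sqrt((var_t + var_c) / 2),
    with sample variances (n - 1 in the denominator).
    """
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    return 100 * (mean(treated) - mean(control)) / pooled_sd


# Made-up values for illustration only; these are not NSW earnings.
toy_treated = [0.0, 1200.0, 800.0, 0.0, 2500.0]
toy_control = [0.0, 900.0, 1500.0, 300.0, 2000.0]
print(round(standardized_difference(toy_treated, toy_control), 2))
```

Because the measure is scaled by the standard deviation rather than the standard error, it does not shrink mechanically as the sample size grows, which is one reason it is often reported alongside t-tests.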

5. (5 points) Using Stata’s ttest command, estimate the mean difference in re78 between the treatment and control groups. Describe and interpret your findings; be sure to note whether (or not) the estimated impact is statistically and/or substantively different from zero.

6. (5 points) Using Stata’s regress command, calculate the experimental impact of assignment to the treatment group on re78. Do not include any other covariates in the model. What is the estimated impact? Is it statistically different from zero? How does it compare to the mean difference estimated in the preceding problem?

7. (5 points) Repeat the regression in the preceding problem but including the seven baseline covariates as conditioning variables. Discuss and interpret your findings, being sure to relate them to the estimates obtained in the preceding problem. What happens to the estimated standard errors? Why? Is this impact estimate unbiased? Why or why not?

8. (5 points) Did you choose to generate and report “robust” (to heteroscedasticity on the diagonal) standard errors in the two preceding questions? Explain why or why not. [Note that these can be obtained via the “, robust” option in Stata’s regress command.]

9. (5 points) What fraction of the treatment group receives services (i.e. has service = 1)? What fraction of the control group receives services from other programs?

10. (5 points) Calculate the instrumental variables estimate of the impact of receiving services using two-stage least squares with treated as an instrument for service. What is the interpretation of the instrumental variables estimate in a common effect world? What is the interpretation of the instrumental variables estimate in a heterogeneous treatment effect world? Be sure to note any assumptions required for the interpretations you describe.
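With a single binary instrument and no covariates, the two-stage least squares point estimate equals the Wald ratio: the difference in mean outcomes by assignment divided by the difference in service receipt rates by assignment. The Python sketch below shows only that arithmetic, on made-up numbers; the function and argument names are hypothetical, and it is not a substitute for the Stata commands your log file must show.

```python
from statistics import mean


def wald_estimate(outcome, received, assigned):
    """Ratio of the assignment effect on the outcome to its effect on receipt.

    outcome:  list of outcome values (e.g. earnings)
    received: list of 1/0 indicators for receiving services
    assigned: list of 1/0 indicators for random assignment (the instrument)
    """
    y1 = [y for y, z in zip(outcome, assigned) if z == 1]
    y0 = [y for y, z in zip(outcome, assigned) if z == 0]
    d1 = [d for d, z in zip(received, assigned) if z == 1]
    d0 = [d for d, z in zip(received, assigned) if z == 0]
    first_stage = mean(d1) - mean(d0)
    if first_stage == 0:
        raise ValueError("assignment does not shift receipt of services")
    return (mean(y1) - mean(y0)) / first_stage


# Made-up data for illustration only; these are not NSW observations.
toy_assigned = [1, 1, 1, 1, 0, 0, 0, 0]
toy_received = [1, 1, 1, 0, 1, 0, 0, 0]
toy_outcome = [10, 10, 10, 2, 6, 2, 2, 2]
print(wald_estimate(toy_outcome, toy_received, toy_assigned))
```

The denominator makes explicit why both no-shows in the treatment group and substitution in the control group matter: each shrinks the difference in receipt rates and so scales up the intention-to-treat difference.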

11. (5 points) Suppose (just for this problem) that treatment group members who receive services receive a full “dose” while control group members who receive services receive only half a “dose”. Provide and justify an estimate of the effect of a full “dose” of services.

12. (5 points) Provide and justify an estimate of the complier mean of the re74 variable.

13. (5 points) Create a binary indicator for “age less than or equal to 32” called agele32.

14. (5 points) Using the indicator constructed in the preceding problem, construct separate experimental impacts (“subgroup impacts”) for younger and older individuals (i.e. observations with agele32 = 1 and observations with agele32 = 0) using a linear regression model and the usual baseline covariates.

Do this two ways: first, by estimating completely separate experimental impact regressions for the younger and older individuals; second, by estimating a single experimental impact regression but including an interaction between the treatment indicator and the agele32 variable (as well as a main effect in agele32).

Describe and interpret your findings. In particular, discuss the evidence for differing impacts for older and younger individuals and discuss the evidence regarding whether the two ways of estimating the subgroup effects produce different estimates. Indicate which estimator you prefer and why.

15. (5 points) Compare the standard deviations of earnings in 1978 in the experimental treatment and control groups. Formally test the null of equal variances using the sdtest command. Describe and interpret your findings as they relate to the question of homogeneous or heterogeneous treatment effects.

16. (5 points) Estimate the quantile treatment effect at the 75th percentile of the distribution of earnings in 1978. Compare it to the average treatment effect.
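As a sketch of what the quantile treatment effect measures, the Python snippet below takes the difference between the q-th quantile of the treated outcome distribution and the q-th quantile of the control outcome distribution. The interpolation rule in `quantile` is an assumption made for this illustration; statistical software, including Stata, may use a different quantile convention and so give slightly different numbers. The data are made up and are not NSW earnings.

```python
def quantile(values, q):
    """Empirical quantile using linear interpolation between order statistics.

    The interpolation rule is an assumption of this sketch; other
    conventions exist and can differ in small samples.
    """
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


def quantile_treatment_effect(treated_outcomes, control_outcomes, q):
    """Difference between the q-th quantiles of the two outcome distributions."""
    return quantile(treated_outcomes, q) - quantile(control_outcomes, q)


# Made-up outcomes for illustration only.
toy_treated = [0, 2, 4, 6, 8]
toy_control = [0, 1, 2, 3, 4]
print(quantile_treatment_effect(toy_treated, toy_control, 0.75))
```

Note that the calculation compares two marginal distributions; it never pairs a treated observation with a control observation.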

17. (5 points) Discuss two alternative interpretations of the estimate obtained in the preceding problem and the assumptions required for each interpretation.

18. (5 points) Construct an outcome variable employ that equals one for observations with re78 greater than zero and that equals zero otherwise. Write down the marginal distributions of employ in the treatment and control groups.

19. (5 points) Construct the Fréchet-Höffding bounds for the (0, 0) cell in the joint distribution from the preceding problem. Describe your findings in words. In particular, do you think the bounds are wide or narrow? Justify your answer.
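For two binary outcomes with known marginals, the Fréchet-Höffding bounds on a joint cell probability are max(0, p + q - 1) from below and min(p, q) from above, where p and q are the two marginal probabilities for that cell. The Python sketch below applies this to the (0, 0) cell, reading it as "not employed in either state"; that reading, the function name, and the example probabilities are assumptions for illustration, not values from the NSW data.

```python
def frechet_hoeffding_bounds(p_treated_zero, p_control_zero):
    """Bounds on the joint probability of the (0, 0) cell.

    p_treated_zero: marginal probability of employ = 0 in the treatment group
    p_control_zero: marginal probability of employ = 0 in the control group
    Returns (lower, upper).
    """
    lower = max(0.0, p_treated_zero + p_control_zero - 1.0)
    upper = min(p_treated_zero, p_control_zero)
    return lower, upper


# Made-up marginals for illustration only.
low, high = frechet_hoeffding_bounds(0.6, 0.7)
print(round(low, 3), round(high, 3))
```

Once the (0, 0) cell is fixed at either bound, the other three cells of the 2x2 joint distribution follow from the marginals, which is what ties the bounds to an implied distribution of treatment effects.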

20. (5 points) Using your answers from the preceding problem, describe the distributions of treatment effects associated with the Fréchet-Höffding upper and lower bounding distributions.