
Applied Econometrics and Data Analysis

Published: 2021-10-22




Fall 2021

Problem Set 4

Due Date: Next Class.


        This problem set is based on the correspondence study of Agan and Starr (2017), “Ban the Box, Criminal Records, and Racial Discrimination: A Field Experiment.” In that study, they submitted approximately 15,000 fictitious online job applications to entry-level positions and examined which applications received a callback from the employer. Agan and Starr randomly assigned half of the resumes a stereotypically white name and half a stereotypically black name. They also randomly assigned half of the resumes a criminal record. Some of the employers had a “box”, i.e., asked about the applicant’s criminal record, and some did not. It is important for this problem set to note that while Agan and Starr randomized whether each applicant had a white or black name, and whether each applicant did or did not have a criminal record, they could not randomize whether the employer had a box, because that was determined by the employer. I’ve provided some background slides on discrimination that you could also review. This problem set will only use the data from the pre-period, before the policy change banning the box. In the next and final lecture, we will cover difference-in-differences, and I will cover the application of diff-in-diff and triple-diff to this study using both the pre- and post-policy change data.


    Part 1. Data Summary

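1. Import AganStarrQJEData.dta. Because this is a Stata file, use read.dta from the foreign package (or read_dta from haven) rather than read.table, which cannot parse the .dta format. Following Footnote 22 of the paper, use the subset command to drop observations with remover=1. Additionally, drop any observation from the “post” period, i.e., observations with post=1.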
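The filtering step can be sketched as follows. This is only an illustration: a small toy data frame stands in for the imported file, its values are invented, and the `callback` column name is hypothetical. The commented-out `read.dta` call is an assumption about how the real file would be loaded.

```r
# Toy stand-in for the imported data. With the real file, something like
#   library(foreign); dat <- read.dta("AganStarrQJEData.dta")
# would be used instead (assumed call, not run here).
dat <- data.frame(
  remover  = c(0, 1, 0, 0, 1, 0),
  post     = c(0, 0, 1, 0, 1, 0),
  callback = c(1, 0, 0, 1, 0, 0)   # hypothetical outcome column
)

# Drop remover = 1 observations and anything from the post period,
# exactly as the problem text instructs.
pre <- subset(dat, remover != 1 & post != 1)

# Number of observations that survive both filters.
nrow(pre)
```

Using `subset` with a single combined condition keeps the two drops in one place, which makes it easy to check afterwards that no `remover = 1` or `post = 1` rows remain.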

2. Use stargazer to create a summary table of descriptive statistics. In the summary table, include descriptive statistics for the callback rate as well as callback rate conditional on race; fraction of employers who had the box (remover=0); fraction of applicants who were white, who had a GED, had an employment gap, and had a criminal record; and fraction of stores in New York City and fraction that are part of a retail chain.
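One possible way to build such a table is sketched below. The data are simulated and every column name (`callback`, `white`, `crime`, `ged`, `empgap`) is a hypothetical stand-in for whatever the real data set uses. The conditional callback rates are handled by creating helper columns that are missing outside the relevant group, so that their means are the conditional means.

```r
set.seed(1)
n <- 200

# Simulated stand-in data; all column names are hypothetical.
pre <- data.frame(
  callback = rbinom(n, 1, 0.12),
  white    = rbinom(n, 1, 0.5),
  crime    = rbinom(n, 1, 0.5),
  ged      = rbinom(n, 1, 0.5),
  empgap   = rbinom(n, 1, 0.5)
)

# Callback conditional on race: NA outside the group, so the column mean
# (ignoring NAs) is the group-specific callback rate.
pre$callback_white <- ifelse(pre$white == 1, pre$callback, NA)
pre$callback_black <- ifelse(pre$white == 0, pre$callback, NA)

# The mean of a 0/1 indicator is the fraction of observations with value 1.
means <- colMeans(pre, na.rm = TRUE)
print(round(means, 3))

# stargazer formats the same statistics as a table, if it is installed.
if (requireNamespace("stargazer", quietly = TRUE)) {
  stargazer::stargazer(pre, type = "text",
                       summary.stat = c("n", "mean", "sd"))
}
```

The `requireNamespace` guard only exists so the sketch runs without stargazer installed; in the actual assignment the package would simply be loaded with `library(stargazer)`.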


    Part 2. Equivalence between Linear Regression with Fully Saturated Model and Conditional Means

    In the following questions, only use data from the "pre" period.

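1. Using only observations that had no “box” (remover=0), estimate the following model by OLS regression:

        Callback_i = α0 + α1 White_i + ε_i        (1)

    where White_i equals 1 if applicant i was assigned a stereotypically white name and 0 otherwise.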

(a) Interpret and discuss the estimated α1. Why can we use the OLS estimate of α1 as a consistent estimator of the effect of race on callback probability without worrying about omitted variable bias?

(b) Show that the estimate of α0 equals the sample mean of Callback_i among black applicants (for employers with no box).

(c) Show that the sum of the estimates of α0 and α1 equals the sample mean of Callback_i among white applicants (for employers with no box).
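The equivalence in parts (b) and (c) can be checked numerically. The sketch below uses simulated data and assumes the regression has a single 0/1 regressor, here called `white` (a hypothetical name), equal to 1 for white-sounding names; the callback probabilities are invented for illustration.

```r
set.seed(42)
n <- 500

# Simulated data: white is a 0/1 indicator, callback a 0/1 outcome.
white    <- rbinom(n, 1, 0.5)
callback <- rbinom(n, 1, 0.10 + 0.03 * white)

# OLS of the outcome on a constant and the single dummy.
fit <- lm(callback ~ white)
a0 <- unname(coef(fit)["(Intercept)"])
a1 <- unname(coef(fit)["white"])

# Conditional sample means by group.
mean_black <- mean(callback[white == 0])
mean_white <- mean(callback[white == 1])

# Intercept vs. mean in the white == 0 group,
# intercept + slope vs. mean in the white == 1 group.
all.equal(a0, mean_black)
all.equal(a0 + a1, mean_white)
```

With one binary regressor the model is saturated, so OLS has exactly as many parameters as there are group means to fit, which is why the two comparisons agree up to floating-point error.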


2. Using only observations that had a “box”, estimate the following model by OLS regression:

(a) Give a justification for including Criminal Record status in the regression here for employers with a box, but not in Question 1 for employers without a box.

(b) Interpret and discuss the estimated coefficients, and compare/contrast your result with your result from Question 1. Why can we use the corresponding coefficient estimates as consistent estimators of the effect of race on callback probability for those without and with a criminal record, without worrying about omitted variable bias?

(c) Summarize your results thus far. Is there evidence of discrimination? Is the discrimination different for employers that have a box vs. not? How does the discrimination interact with having a criminal record?
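For the box sample, a fully saturated specification in race and criminal record can be compared against cell means in the same way. The sketch below is an assumption about the form of the model, namely race, criminal record, and their interaction; the variable names `white` and `crime` are hypothetical and the data are simulated.

```r
set.seed(7)
n <- 2000

# Simulated 0/1 indicators and outcome; probabilities are invented.
white <- rbinom(n, 1, 0.5)
crime <- rbinom(n, 1, 0.5)
callback <- rbinom(n, 1,
                   0.14 + 0.01 * white - 0.05 * crime + 0.02 * white * crime)

# Assumed saturated model: white, crime, and white:crime.
fit <- lm(callback ~ white * crime)
b <- coef(fit)

# Sample mean of the outcome in a given race x record cell.
cell_mean <- function(w, r) mean(callback[white == w & crime == r])

# White-minus-black callback gap without and with a criminal record.
gap_no_record <- cell_mean(1, 0) - cell_mean(0, 0)
gap_record    <- cell_mean(1, 1) - cell_mean(0, 1)

# Coefficient on white vs. the gap among applicants without a record;
# white + interaction vs. the gap among applicants with a record.
all.equal(unname(b["white"]), gap_no_record)
all.equal(unname(b["white"] + b["white:crime"]), gap_record)
```

Under this assumed parameterization the race gap for applicants with a record is a sum of two coefficients, so its standard error would need the covariance between them rather than either standard error alone.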


3. Thus far, we have fit two different linear regression models on two different samples: the linear regression model of equation 1 on the sample with no box, and the linear regression model of equation 2 on the sample with a box. It is sometimes convenient to combine both regressions into one equivalent regression. (This is very useful for testing and inference: for example, if one wishes to test the null that α1 = λ1, it is convenient to estimate both parameters as part of one regression.) Now, using all observations, estimate the following model by OLS regression:

(a) Interpret each β coefficient.

(b) Conjecture a relationship between the estimated β coefficients and your estimated α and λ coefficients from Questions 1 and 2. Verify your conjecture.

(c) Optional: State in words (interpret) the null hypothesis α1 = λ1 from equations 1, 2 in Questions 1 and 2.

(d) Optional: In equation 3, testing the null α1 = λ1 corresponds to what null hypothesis about which β parameter(s)?

(e) Why is the estimated coefficient on the box indicator not necessarily a consistent estimator of the effect of having a box on callback probabilities? If the policy question of interest is the effect of banning the box, will that estimate be a convincing answer to that question?

(f) Summarize your results. Is there evidence of discrimination? Is the discrimination different for employers that have a box vs. not? How does the discrimination interact with having a criminal record?
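The idea of nesting the two sample-specific regressions in one pooled regression can be illustrated on simulated data. The pooled specification below is one possible parameterization, not necessarily the one intended in equation 3: it interacts a hypothetical `box` indicator with the race and criminal-record terms. All variable names and probabilities are assumptions made for the sketch.

```r
set.seed(11)
n <- 4000

# Simulated data; box, white, crime are hypothetical 0/1 indicators.
d <- data.frame(
  box   = rbinom(n, 1, 0.4),
  white = rbinom(n, 1, 0.5),
  crime = rbinom(n, 1, 0.5)
)
# Criminal record only matters to employers with a box.
p <- 0.10 + 0.03 * d$white +
  d$box * (0.03 - 0.02 * d$white - 0.05 * d$crime + 0.01 * d$white * d$crime)
d$callback <- rbinom(n, 1, p)

# Separate regressions on the no-box and box samples.
fit_nobox <- lm(callback ~ white, data = subset(d, box == 0))
fit_box   <- lm(callback ~ white * crime, data = subset(d, box == 1))

# Pooled regression with explicit product terms (assumed parameterization).
d$box_white       <- d$box * d$white
d$box_crime       <- d$box * d$crime
d$box_white_crime <- d$box * d$white * d$crime
pooled <- lm(callback ~ white + box + box_white + box_crime + box_white_crime,
             data = d)
bp <- coef(pooled)

# Race coefficient from each separate regression.
a1 <- unname(coef(fit_nobox)["white"])
l1 <- unname(coef(fit_box)["white"])

# Pooled coefficients reproduce the sample-specific race coefficients.
all.equal(unname(bp["white"]), a1)
all.equal(unname(bp["white"] + bp["box_white"]), l1)
```

In this parameterization the difference between the two race coefficients is carried by the single `box_white` term, so a hypothesis about their equality becomes a hypothesis about one pooled coefficient. A different but equivalent parameterization would shift which coefficient plays that role.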