Assignment 1 – EC356 Labor Economics Spring 2023


Upload your answers in a PDF document on Blackboard. If your assignment is uploaded in a non-PDF format or not submitted via Blackboard, we will deduct 10 points from your final score.

If your assignment is handwritten, please photograph or scan it and merge all images into a single PDF document.

If you type your assignment in a Word document, please save it as a PDF before uploading.

We will also deduct 10 points from your final score if your assignment is submitted past the due date.

Q1 (10 points). Suppose you have survey data about the 165 million people in the United States labor force. The survey data have 3 million observations. Based on the survey data, you calculate the average employment rate at the county level.

(1) Suppose you want to test whether the average employment rate is 60%. Write down the null hypothesis.

(2) Suppose , what other information (variable) do you need to calculate the t-statistic?

(3) Let’s call the answer to part (2) above variable X. Explain what X is.

(4) Suppose X=0.005. Calculate the t-statistic.

(5) Based on the value of the t-statistic, what is your conclusion regarding the null hypothesis?

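Q2 (10 points). Difference-in-differences (DID) estimates are often expressed as a regression of the following form:

$Y_{it} = \beta_0 + \beta_1 Treat_i + \beta_2 Post_t + \beta_3 (Treat_i \times Post_t) + \varepsilon_{it}$

·  $Treat_i = 1$ if unit i is treated (treatment group)

·  $Treat_i = 0$ if unit i is not treated (control group)

·  $Post_t = 1$ if time period t is post treatment

·  $Post_t = 0$ if time period t is before treatment

·  $\beta_0, \beta_1, \beta_2, \beta_3$ are coefficients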

Based on the regression

(1) Fill in the 6 blanks below

(2) Calculate the DID estimate




                                   Before treatment    After treatment
Treatment Group                    ______              ______
Control Group                      ______              ______
Treatment – Control Differences    ______              ______
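
As a rough illustration of the double difference behind a two-group, two-period comparison, the sketch below computes a DID estimate from four group means. The numbers and the names `means` and `did_estimate` are hypothetical and are not taken from the assignment or from any paper.

```python
# Hypothetical average outcomes by group and period (illustrative values only).
means = {
    ("treatment", "before"): 20.0,
    ("treatment", "after"): 26.0,
    ("control", "before"): 18.0,
    ("control", "after"): 21.0,
}


def did_estimate(group_means):
    """Return the difference-in-differences of four group-by-period means."""
    # Change over time within each group.
    change_treated = group_means[("treatment", "after")] - group_means[("treatment", "before")]
    change_control = group_means[("control", "after")] - group_means[("control", "before")]
    # The DID estimate is the treated group's change net of the control group's change.
    return change_treated - change_control


did = did_estimate(means)
print(did)
```

Taking the treatment-minus-control gap in each period first and then differencing across periods gives the same number, since the two orderings are algebraically equivalent.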



Q3 (30 points). Derenoncourt and Montialoux (2021) estimate the following DID regression.

(1) Explain in what ways the following regression is similar to the standard DID regression we have seen in Q2, and in what ways it is different.


(2) Write down two DID equations, one when t=1966 and the other when t=1968. Ignore other periods.

(3) What are the interpretations of  and  in the regressions you wrote down above?

(4) What does it mean if  is positive and significant?

(5) What does it mean if  is positive and significant?

Q4 (15 points). Card and Krueger (1994) run the DID regression with and without controls for “chain and restaurant ownership”. The results are shown in columns 1 and 2 of the table below.

(1) Explain why they need to control for restaurant ownership to estimate the causal effect of the minimum wage on employment.

(2) After controlling for restaurant ownership, does the DID estimate become more or less significant? What does this imply?


Q5 (15 points). We often use percentile differences in earnings as measures of inequality.

(1) Explain how the 90th-50th percentile difference and 50th-10th percentile difference are different measures of inequality.

(2) Suppose we calculate the 99th-90th percentile difference for each year from 1980 to 2015. Do you think the 99th-90th percentile difference has grown or shrunk over the years?

(3) Another way of measuring inequality is to calculate a percentile earnings ratio (instead of a difference). Give an example.
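
The percentile measures discussed in Q5 can be computed mechanically, as in the minimal sketch below. The earnings list is made up for illustration, the helper `percentile` is a hypothetical name, and the "inclusive" interpolation method is only one of several conventions for sample percentiles.

```python
from statistics import quantiles

# Hypothetical annual earnings (illustrative values only, not real data).
earnings = [12_000, 18_000, 22_000, 27_000, 31_000, 36_000, 42_000,
            50_000, 61_000, 75_000, 95_000, 140_000]


def percentile(data, p):
    """Return the p-th percentile of data, for an integer p from 1 to 99."""
    # quantiles(..., n=100) returns the 99 cut points between percentiles.
    cuts = quantiles(data, n=100, method="inclusive")
    return cuts[p - 1]


p10, p50, p90 = (percentile(earnings, p) for p in (10, 50, 90))

upper_gap = p90 - p50    # spread in the upper half of the distribution
lower_gap = p50 - p10    # spread in the lower half of the distribution
ratio_90_10 = p90 / p10  # a ratio measure, unaffected by the units of earnings

print(upper_gap, lower_gap, round(ratio_90_10, 2))
```

A difference is measured in dollars and so changes with inflation or currency, whereas a ratio is unit-free; which is more appropriate depends on the comparison being made.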

Q6 (10 points). Based on Farber et al. (2020), describe

(1) The relationship between union density and the top 10% income share

(2) How selection into union membership depends on years of schooling and race

Q7 (20 points). In Boone et al. (2021), the authors estimate the causal effect of longer unemployment insurance duration on aggregate employment rates. Specifically, the authors use a contiguous border county pair design.

(1) Explain how the authors implement the contiguous border county pair design in practice. Hint: There are two steps. The first step is related to sample selection and the second step is related to adding fixed effects in the regression.

(2) Suppose the regression without the contiguous border county pair design is written in the following way:


where c and t index county and time period.

Write down the new regression with the contiguous border county pair design, using notation similar to the above. Explain any new variables and notation.

(3) Explain which two groups the authors are comparing when using the contiguous border county pair design.

(4) What is the advantage of using a contiguous border county pair design as opposed to comparing counties across the nation?

Q8 (20 points). In Schmieder, von Wachter and Bender (2016), the authors use a regression discontinuity (RD) design to estimate the effect of longer unemployment insurance (UI) duration on future wages. The length of UI duration is determined by age cutoffs in the specific policy that the authors are studying.

(1) Assuming the age cutoff is 42 years old, explain which two groups the authors are comparing.

(2) Explain why the RD design estimates the causal effect of longer unemployment insurance duration on future wages.

(3) In Figure 1 of the paper (shown in lecture slides), the authors plot the number of UI claims by age. What is the purpose of this graph?

(4) The authors find that longer UI duration decreases future wages and future wage growth. Why do you think this is the case?