ECON 121, Applied Econometrics and Data Analysis Summer 2022 PROBLEM SET 4
PROBLEM SET 4: EFFECT OF MILITARY CONSCRIPTION ON CRIME
Instructions:
• Use the provided “pset4_submission.R” template file to complete this assignment. Do not modify the file name for your submission. The autograder requires this filename to grade your assignment.
• Use the “setwd()” command to read in the datafiles locally. Comment out the “setwd()” command before you submit to Gradescope. Do not modify the provided code in the template file that loads the data. This will cause an error with the autograder.
• Only use the packages loaded in “pset4_submission.R” when executing the tasks for the problem set. The autograder is only configured to use these packages and may not work if you use others.
Problem Set Introduction:
Many countries require young men to serve in the military. Proponents of these policies cite many benefits, including promoting national security and disciplining otherwise undisciplined young men. In this problem set, we will examine the second of these purported benefits by estimating the effect of military conscription on crime in Argentina.
Military service in Argentina was mandatory for young men throughout most of the twentieth century. The needs of the military varied over time, however, so it held an annual lottery to decide which newly eligible men would serve. The following paragraph (taken from a published source) describes the lottery in detail:
The eligibility of young males for military service was randomly determined, using the last three digits of their national IDs. Each year, for the cohort due to be conscripted the following year, a lottery assigned a number between 1 and 1,000 to each combination of the last three ID digits. The lottery system was run in a public session using a lottery drum filled with a thousand balls numbered 1 to 1,000. The first ball released from the lottery drum corresponded to ID number 000, the second released ball to ID number 001, and so on. The lottery was administered by the National Lottery and supervised by the National General Notary in a public session. Results were broadcasted over the radio and published in the main newspapers. After the lottery, individuals were called for physical and mental examinations. Later on, a cutoff number was announced. Individuals whose ID number had been assigned a lottery number higher than the cutoff number, and who had passed the medical examination, were mandatorily called to military service. Clerics, seminarians, novitiates, and any individual with family members dependent upon him for support were exempted from military service.
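The lottery mechanics described above can be sketched in a few lines of base R. This is only an illustration: the seed, the object names, and the cutoff value of 500 are hypothetical, and the sketch ignores the medical examination and the exemptions.

```r
# Minimal simulation of one year's lottery (illustrative only).
set.seed(1)

# The 1,000 possible endings of a national ID: "000", "001", ..., "999".
id_digits <- sprintf("%03d", 0:999)

# Draw the balls without replacement: the k-th ball drawn is assigned
# to the k-th ID ending (first ball -> "000", second ball -> "001", ...).
draft_number <- sample(1:1000)

lottery <- data.frame(id_digits, draft_number)

# Hypothetical cutoff; real cutoffs were announced after the lottery.
hypothetical_cutoff <- 500

# Following the quoted description, numbers higher than the cutoff are called
# (subject to the medical exam and exemptions, which are not modeled here).
lottery$called_by_lottery <- lottery$draft_number > hypothetical_cutoff

head(lottery)
```

Because every ball is drawn exactly once, each ID ending receives a distinct draft number, which is what makes the draft number as good as randomly assigned within a birth cohort.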
To produce the dataset (https://github.com/credpath/econ121/raw/main/crime_ps4.rda), researchers started with all men born in 1958-1962, divided them into cells by birth year and last three ID digits, and then calculated crime rates for each of these cells. Thus, each observation in the dataset represents a set of men with the same birth year and last three ID digits. (The data are aggregated in this way to ensure confidentiality.) The following table defines the variables in the dataset:
Variable name | Description |
birthyr | Birth year |
draftnumber | Draft number (1-1000) |
conscripted | Fraction conscripted |
crimerate | Fraction with criminal record by 2005 |
property | Fraction with property crime conviction in 2000-2005 |
murder | Fraction with murder conviction in 2000-2005 |
drug | Fraction with drug conviction in 2000-2005 |
sexual | Fraction with sex crime conviction in 2000-2005 |
threat | Fraction with threat conviction in 2000-2005 |
arms | Fraction with weapons-related conviction in 2000-2005 |
whitecollar | Fraction with white collar crime conviction in 2000-2005 |
argentine | Fraction non-indigenous Argentinean |
indigenous | Fraction indigenous Argentinean |
naturalized | Fraction naturalized citizens |
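To make the cell structure concrete, the following sketch shows how individual records could be collapsed into cells defined by birth year and last three ID digits. The individual-level data here are simulated and the names people, id_last3, and crime are hypothetical; the researchers' actual procedure is not described beyond the paragraph above.

```r
# Simulated individual-level records (purely illustrative).
set.seed(2)
n <- 50000
people <- data.frame(
  birthyr     = sample(1958:1962, n, replace = TRUE),
  id_last3    = sample(0:999, n, replace = TRUE),
  conscripted = rbinom(n, 1, 0.4),   # 1 if the person served
  crime       = rbinom(n, 1, 0.07)   # 1 if the person has a criminal record
)

# Collapse to one row per (birth year, ID ending) cell.
# The mean of a 0/1 indicator within a cell is the fraction in that cell.
cells <- aggregate(
  cbind(conscripted, crime) ~ birthyr + id_last3,
  data = people,
  FUN  = mean
)
names(cells)[names(cells) == "crime"] <- "crimerate"

head(cells)
```

Each row of cells is then a fraction between 0 and 1, which is why the variables in the table are described as fractions rather than indicators.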
Our main outcome variable will be crimerate, which reflects the probability of ever having a criminal record. We will not disaggregate by type of crime, although these data are also available for crimes committed starting in the year 2000. The OLS, IV, and RDD regressions can all be estimated with the feols() command from the fixest package.
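As a rough guide to the feols() formula syntax, the sketch below runs OLS, first-stage, reduced-form, and two-stage least squares regressions on simulated data. The data frame sim, its coefficients, and the object names ending in _sketch are all hypothetical; only the variable names mirror the dataset, and the specification shown is not necessarily the one the questions call for.

```r
library(fixest)

# Simulated cell-level data (illustrative values only).
set.seed(121)
n <- 2000
sim <- data.frame(
  birthyr   = sample(1958:1962, n, replace = TRUE),
  eligible  = rbinom(n, 1, 0.6),
  argentine = runif(n, 0.8, 1)
)
sim$conscripted <- 0.05 + 0.60 * sim$eligible + rnorm(n, sd = 0.10)
sim$crimerate   <- 0.07 + 0.01 * sim$conscripted + rnorm(n, sd = 0.02)

# OLS with a control and birth year fixed effects: y ~ x | fixed effects.
ols_sketch <- feols(crimerate ~ conscripted + argentine | birthyr,
                    data = sim, vcov = "HC1")

# First stage and reduced form, using the same controls and fixed effects.
fs_sketch <- feols(conscripted ~ eligible + argentine | birthyr,
                   data = sim, vcov = "HC1")
rf_sketch <- feols(crimerate ~ eligible + argentine | birthyr,
                   data = sim, vcov = "HC1")

# 2SLS: y ~ exogenous controls | fixed effects | endogenous ~ instrument.
tsls_sketch <- feols(crimerate ~ argentine | birthyr | conscripted ~ eligible,
                     data = sim, vcov = "HC1")

# With one instrument and one endogenous regressor, the ratio of the
# reduced form to the first stage reproduces the 2SLS point estimate.
wald_sketch <- coef(rf_sketch)[["eligible"]] / coef(fs_sketch)[["eligible"]]

wald_sketch
coef(tsls_sketch)[["fit_conscripted"]]
```

In fixest, the second-stage coefficient on the instrumented variable carries the prefix fit_, so it appears as fit_conscripted rather than conscripted.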
Problem Set Questions:
1. Are there differences in conscription rates or crime rates across birth years? (Answer in words only.)
2. Use the feols() command to estimate, by OLS, the relationship between conscription rates and crimerate, controlling for observable covariates and including birth year fixed effects. Assign the output to an object called ols. Does the result reflect the causal effect of conscription? Describe possible biases. (Answer in code and writing.)
3. The lottery assigned a draft number to each combination of the last three ID digits, and the military then set a cutoff based on its needs, such that all draft numbers at or above the cutoff were eligible for conscription. Based on the following cutoffs, code a variable called eligible that equals 1 if eligible, 0 if not:
Year: 1958 1959 1960 1961 1962
Cutoff: 175 320 341 350 320
(Answer with code only.)
4. Estimate the “first stage” effect of eligibility on conscription. Assign the output to an object called fs. Think carefully about the regression specification. Do you need to control for birth year indicators? Do you need to control for ethnic composition? Why or why not? (Answer in code and writing.)
5. Estimate the “reduced form” effect of eligibility on crimerate. Assign the output to an object called rf. Does this result reflect the causal effect of conscription? (Answer in code and writing.)
6. Based on your results for questions (4) and (5), calculate the instrumental variables estimate of the effect of conscription on crimerate. You need only calculate a point estimate, not standard errors. Assign this number to an object called iv. (Answer with code only.)
7. Confirm your calculations by running a two-stage least squares regression. Assign the output to an object called tsls. Are there differences between the 2SLS (question 7) and OLS (question 2) results? Why or why not? (Answer in code and writing.)
8. Given your knowledge of the Argentine draft (from the paragraph on page 1), assess the validity of eligibility as an instrument for conscription. Does it satisfy all the criteria for a valid instrument? (Answer in words only.)
9. Interpret the 2SLS result. Which sub-population’s average treatment effect does it estimate? Is it reasonable to call it a local average treatment effect? Is it reasonable to call it a treatment-on-the-treated effect? (Only the written answer will be graded, but some additional code could help find the answer.)
10. Suppose we are concerned that ID numbers (and therefore draft numbers) are correlated with characteristics that raise a person’s risk of committing crimes. Estimate the effect of conscription on crimerate in a fuzzy regression discontinuity design. Because the problem set is getting a little long, I will guide you through it.
(a) To motivate the regression discontinuity design, draw scatter plots of conscription rates against draft numbers by birth year:
    ggplot(
      data = crime,
      aes(
        draftnumber,
        conscripted
      )
    ) +
      geom_point() +
      facet_wrap(~ birthyr)
(Answer with code only.)
(b) Generate a new variable distance that measures the “distance” from the birth-year-specific cutoff, in terms of the draft number. Keep only observations with −150 ≤ distance ≤ 150, and assign this filtered dataframe to a new dataframe called crime_local. Use crime_local for the rest of the problem set. This step effectively restricts our analysis to a bandwidth of 150. (Hint: look at the table in Question 3 to see the cutoffs for each cohort.) (Answer with code only.)
(c) Draw a scatter plot of conscripted against distance. Do your results suggest that crossing the cutoff raises conscription? (Answer in code and writing.)
(d) Draw a scatter plot of crimerate against distance. Do your results suggest that crossing the cutoff raises crime? (Answer in code and writing.)
(e) Now run a two-stage least squares regression to estimate the effect of conscripted on crimerate in a regression discontinuity design. Use local linear regression with a bandwidth of 150. To allow the slopes to be different above and below the cutoff, include the interaction of distance with eligible. Then estimate:
    tsls_rdd <-
      feols(
        crimerate ~ distance + distance:eligible |
          conscripted ~ eligible,
        data = crime_local,
        vcov = "HC1"
      )
Interpret both the first- and second-stage results. Do the results confirm your interpretation of the scatter plots in parts (c) and (d) of the question? (Answer in code and writing.)
(f) Why do you think the regression discontinuity results are different from the results earlier in the problem set? (Answer in words only.)
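The eligibility indicator in question 3 and the distance variable in question 10(b) both rest on looking up the cutoff for each birth year. One possible way to do that lookup is sketched below in base R on a toy grid of every birth year and draft number. The objects cutoffs, toy, and toy_local are hypothetical names, the toy grid stands in for the real data, and other approaches (for example a join or case_when()) would work equally well.

```r
# Cutoffs from the table in question 3, stored as a named lookup vector.
cutoffs <- c("1958" = 175, "1959" = 320, "1960" = 341,
             "1961" = 350, "1962" = 320)

# Toy grid: one row per (birth year, draft number) combination.
toy <- expand.grid(birthyr = 1958:1962, draftnumber = 1:1000)

# Look up each row's cutoff by birth year.
toy$cutoff <- unname(cutoffs[as.character(toy$birthyr)])

# Eligible if the draft number is at or above the cutoff.
toy$eligible <- as.integer(toy$draftnumber >= toy$cutoff)

# Signed distance from the cutoff, measured in draft numbers.
toy$distance <- toy$draftnumber - toy$cutoff

# Bandwidth of 150: keep rows with -150 <= distance <= 150.
toy_local <- subset(toy, abs(distance) <= 150)

table(toy_local$birthyr)
```

Defining distance as draft number minus cutoff means eligible equals 1 exactly when distance is at least 0, so the discontinuity sits at zero for every cohort.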
2022-08-29