Homework #7

ECON 327-A: Introduction to Econometrics

due: 5/1/2023

1 R-Studio Problems

For the following problems, we will be using a subset of data from Angrist & Evans (1998). This is the instrumental variables paper we discussed in class, which investigates how childbirth affects the labor choices of women. As such, all observations in the data set are for married women with two or more children.

We suspect that how many children a woman has is endogenous. That is, generally women choose how many children they have and it is not a random event.

There are two possible instrumental variables for kidcount. These are intended to create exogenous variation in the endogenous choice of how many children to have. That is, the instrumental variables are not controlled by the mothers, but they could influence the mother’s choice of how many children to have. Variable names and descriptions are listed below.

Variable Name    Description
kidcount         number of kids
morekids         =1 if mom had more than 2 kids
boy1st           =1 if the 1st kid was a boy
boy2nd           =1 if the 2nd kid was a boy
samesex          =1 if the 1st two kids were same sex
multi2nd         =1 if the 2nd and 3rd kids are twins
agem1            age of mom at census
agefstm          mom's age when she 1st gave birth
black            =1 if mom is black
hispan           =1 if mom is hispanic
othrace          =1 if mom is of another race
workedm          did mom work for pay in 1979
weeksm1          mom's weeks worked in 1979
hourswm          hours of work per week in 1979
incomem          labor income per week
id               identification number

Note: You will need to install and call the following packages: AER, lmtest, readr, and sandwich.
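
As a minimal setup sketch, the packages named in the note can be installed once and then loaded at the top of the script. The file name hw7_data.csv and the object name fertility are placeholders chosen for illustration; the assignment does not state where the data come from, so substitute the path to the course data set.

```r
# Install once if needed (uncomment on first use):
# install.packages(c("AER", "lmtest", "readr", "sandwich"))

library(AER)       # ivreg() for instrumental variables estimation
library(lmtest)    # coeftest() for coefficient tests
library(sandwich)  # vcovHC() for heteroskedasticity-robust covariance matrices
library(readr)     # read_csv() for loading the data

# Hypothetical file name; replace with the path to the course data set.
data_path <- "hw7_data.csv"
if (file.exists(data_path)) {
  fertility <- read_csv(data_path)
  str(fertility)  # check that the variables listed above are present
}
```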

Questions:

1. Estimate the regression

hourswm_i = β_0 + β_1 kidcount_i + β_2 agem1_i + β_3 black_i + β_4 hispan_i + ε_i

by OLS and obtain heteroskedasticity-robust standard errors using White’s heteroskedasticity-consistent estimator. Interpret the coefficient on kidcount and discuss its statistical significance.
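
The mechanics of OLS with White standard errors can be sketched as follows. The data frame sim is simulated placeholder data that borrows the assignment's variable names; it is not the Angrist & Evans sample, so its estimates say nothing about the real answer. The sketch assumes that type = "HC0" in vcovHC() is the intended White estimator; some courses use "HC1" instead, which adds a degrees-of-freedom correction.

```r
library(lmtest)
library(sandwich)

set.seed(327)
n <- 500

# Simulated placeholder data, NOT the Angrist & Evans sample.
sim <- data.frame(
  kidcount = sample(2:5, n, replace = TRUE),
  agem1    = sample(21:35, n, replace = TRUE),
  black    = rbinom(n, 1, 0.1),
  hispan   = rbinom(n, 1, 0.1)
)
# The error spread grows with kidcount, so the errors are heteroskedastic.
sim$hourswm <- 30 - 3 * sim$kidcount + 0.3 * sim$agem1 +
  rnorm(n, sd = 2 * sim$kidcount)

ols <- lm(hourswm ~ kidcount + agem1 + black + hispan, data = sim)

# Coefficient table with White (HC0) heteroskedasticity-robust standard errors.
ols_robust <- coeftest(ols, vcov. = vcovHC(ols, type = "HC0"))
print(ols_robust)
```

On the real data, replace sim with the loaded data set; the kidcount row of the table gives the estimate, robust standard error, t statistic, and p-value to interpret.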

2. Angrist and Evans propose an instrumental variable, samesex, a binary variable equal to one if the first two children are the same biological sex. What do you think is the argument for why it is a relevant instrument for kidcount?

3. Estimate the regression

kidcount_i = β_0 + β_1 samesex_i + β_2 agem1_i + β_3 black_i + β_4 hispan_i + ε_i

and see if the reasoning from problem (2) holds. Interpret the coefficient on samesex and comment on statistical significance. Does this appear to be a relevant instrument?
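
A first-stage regression of this kind might be checked along the following lines. Again, sim is simulated placeholder data with the assignment's variable names, built so that samesex raises the chance of a third child; the numbers are not the real answer. The F > 10 benchmark in the comments is a common rule of thumb for instrument strength and is an addition here, not something the assignment states.

```r
library(lmtest)
library(sandwich)

set.seed(327)
n <- 2000

# Simulated placeholder data, NOT the Angrist & Evans sample.
sim <- data.frame(
  samesex = rbinom(n, 1, 0.5),
  agem1   = sample(21:35, n, replace = TRUE),
  black   = rbinom(n, 1, 0.1),
  hispan  = rbinom(n, 1, 0.1)
)
# By construction, same-sex first two kids raise the chance of a third child.
sim$kidcount <- 2 + rbinom(n, 1, 0.35 + 0.07 * sim$samesex)

first_stage <- lm(kidcount ~ samesex + agem1 + black + hispan, data = sim)
fs <- coeftest(first_stage, vcov. = vcovHC(first_stage, type = "HC0"))
print(fs)

# With a single instrument, the first-stage F statistic is the squared
# t statistic on that instrument. A common rule of thumb treats F > 10
# as evidence that the instrument is not weak.
fs_F <- unname(fs["samesex", "t value"]^2)
print(fs_F)
```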

4. Do you think the instrument is exogenous? In other words, can you think of any reason why samesex might be correlated with ε in problem (1)?

(Note that biological sex can be reasonably assumed to be randomly determined. However, would you expect a family’s finances to be affected based on whether they have two children of the same sex or two children of the opposite sex?)

5. Can we test for the exogeneity of samesex by adding it to the regression in problem (1) and testing its significance? Explain.

6. Use samesex as an instrumental variable for kidcount. Obtain the IV estimates for the regression in problem (1).

Note: To estimate using IV, you will need the AER package and you can use the following R code: ivreg(Y ~ X + W + V | Z + W + V, ...)

Endogenous variables (X) can only appear before the vertical line; instruments (Z) can only appear after the vertical line; exogenous regressors that are not instruments (W and V) must appear both before and after the vertical line.
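
The formula layout described above can be illustrated with a small sketch. The data are simulated placeholders, not the Angrist & Evans sample: the variable u is a hypothetical unobserved factor that drives both kidcount and hourswm, which is what makes kidcount endogenous in the simulation. The robust standard errors via vcovHC() are an optional addition, assuming the same White estimator as in problem (1).

```r
library(AER)  # also loads lmtest and sandwich

set.seed(327)
n <- 2000

# Simulated placeholder data, NOT the Angrist & Evans sample.
samesex <- rbinom(n, 1, 0.5)
agem1   <- sample(21:35, n, replace = TRUE)
black   <- rbinom(n, 1, 0.1)
hispan  <- rbinom(n, 1, 0.1)
u       <- rnorm(n)  # hypothetical unobserved factor behind the endogeneity

# kidcount depends on the instrument (samesex) and on u.
kidcount <- 2 + as.integer(0.8 * samesex + u + rnorm(n) > 1)
# hourswm depends on kidcount and on u, but not directly on samesex.
hourswm <- 30 - 5 * kidcount + 0.3 * agem1 + 4 * u + rnorm(n, sd = 5)

sim <- data.frame(hourswm, kidcount, samesex, agem1, black, hispan)

# Before the bar: endogenous kidcount plus exogenous controls.
# After the bar: instrument samesex plus the same exogenous controls.
iv <- ivreg(hourswm ~ kidcount + agem1 + black + hispan |
              samesex + agem1 + black + hispan, data = sim)

# IV coefficient table with White (HC0) robust standard errors.
iv_robust <- coeftest(iv, vcov. = vcovHC(iv, type = "HC0"))
print(iv_robust)
```

Comparing the kidcount row here with the matching row from an OLS fit on the same data shows how the point estimate and its standard error change under IV.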

How does the coefficient on kidcount compare with the initial OLS estimate? Is the IV estimate of the coefficient more or less precise than the OLS estimate?