ECON4004 - Econometrics 2 Solutions to Tutorial 3
Question 1.
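It is easiest to use (1), but after dropping $\bar{z}$. Remember, this is allowed because $\sum_{i=1}^{n} (z_i - \bar{z})(x_i - \bar{x}) = \sum_{i=1}^{n} z_i (x_i - \bar{x})$, and similarly when we replace $x$ with $y$. So, the numerator in the formula for $\hat{\beta}_1$ is
$$\sum_{i=1}^{n} z_i (y_i - \bar{y}) = \sum_{i=1}^{n} z_i y_i - \Big(\sum_{i=1}^{n} z_i\Big)\bar{y} = n_1 \bar{y}_1 - n_1 \bar{y},$$
where $n_1 = \sum_{i=1}^{n} z_i$ is the number of observations with $z_i = 1$, and we have used the fact that $\big(\sum_{i=1}^{n} z_i y_i\big)/n_1 = \bar{y}_1$, the average of the $y_i$ over the $i$ with $z_i = 1$. So far, we have shown that the numerator in $\hat{\beta}_1$ is $n_1(\bar{y}_1 - \bar{y})$. Next, write $\bar{y}$ as a weighted average of the averages over the two subgroups:
$$\bar{y} = (n_0/n)\,\bar{y}_0 + (n_1/n)\,\bar{y}_1,$$
where $n_0 = n - n_1$. Therefore,
$$\bar{y}_1 - \bar{y} = [(n - n_1)/n]\,\bar{y}_1 - (n_0/n)\,\bar{y}_0 = (n_0/n)(\bar{y}_1 - \bar{y}_0).$$
Therefore, the numerator of $\hat{\beta}_1$ can be written as $(n_0 n_1/n)(\bar{y}_1 - \bar{y}_0)$.
By simply replacing $y$ with $x$, the denominator in $\hat{\beta}_1$ can be expressed as $(n_0 n_1/n)(\bar{x}_1 - \bar{x}_0)$.
When we take the ratio of these, the terms involving $n_0$, $n_1$, and $n$ cancel, leaving $\hat{\beta}_1 = (\bar{y}_1 - \bar{y}_0)/(\bar{x}_1 - \bar{x}_0)$.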
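The algebra above can be checked numerically. The following is a minimal sketch on simulated data, assuming a binary instrument; the function names `iv_slope` and `wald_estimator`, the sample size, and all coefficients are made up for illustration and are not part of the tutorial.

```python
import random


def iv_slope(z, x, y):
    """Simple IV slope: sum (z_i - zbar)(y_i - ybar) / sum (z_i - zbar)(x_i - xbar)."""
    n = len(z)
    zbar, xbar, ybar = sum(z) / n, sum(x) / n, sum(y) / n
    num = sum((zi - zbar) * (yi - ybar) for zi, yi in zip(z, y))
    den = sum((zi - zbar) * (xi - xbar) for zi, xi in zip(z, x))
    return num / den


def group_mean(values, z, group):
    """Average of `values` over the observations with z_i == group."""
    selected = [v for v, zi in zip(values, z) if zi == group]
    return sum(selected) / len(selected)


def wald_estimator(z, x, y):
    """Ratio of group-mean differences, (y1bar - y0bar) / (x1bar - x0bar)."""
    dy = group_mean(y, z, 1) - group_mean(y, z, 0)
    dx = group_mean(x, z, 1) - group_mean(x, z, 0)
    return dy / dx


# Simulated data: every number here is an assumption chosen for illustration.
rng = random.Random(0)
n = 500
z = [rng.randint(0, 1) for _ in range(n)]          # binary instrument
u = [rng.gauss(0, 1) for _ in range(n)]            # structural error
x = [0.8 * zi + 0.5 * ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]  # endogenous regressor
y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]

beta_iv = iv_slope(z, x, y)
beta_wald = wald_estimator(z, x, y)
print(beta_iv, beta_wald)  # the two values agree up to floating-point rounding
```

The two functions compute the estimator in the two forms that the derivation shows to be identical, so any gap between them should be rounding error only.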
Question 2.
(i) A few examples include family income and background variables, such as parents’ education.
(ii) The population model is
$\text{score} = \beta_0 + \beta_1\,\text{girlhs} + \beta_2\,\text{faminc} + \beta_3\,\text{meduc} + \beta_4\,\text{feduc} + u_1$, where the variables are self-explanatory.
(iii) Parents who are supportive and motivated to have their daughters do well in school may also be more likely to enroll their daughters in a girls’ high school. It seems likely that girlhs and u1 are correlated.
(iv) Let numghs be the number of girls’ high schools within a 20-mile radius of a girl’s home. To be a valid IV for girlhs, numghs must satisfy two requirements: it must be uncorrelated with u1 and it must be partially correlated with girlhs. The second requirement probably holds and can be tested by estimating the reduced form
$\text{girlhs} = \delta_0 + \delta_1\,\text{faminc} + \delta_2\,\text{meduc} + \delta_3\,\text{feduc} + \delta_4\,\text{numghs} + v_2$
and testing numghs for statistical significance. The first requirement is more problematical. Girls’ high schools tend to locate in areas where there is a demand, and this demand can reflect the seriousness with which people in the community view education. Some areas of a state have better students on average for reasons unrelated to family income and parents’ education, and these reasons might be correlated with numghs. One possibility is to include community-level variables that can control for differences across communities.
(v) One would expect attendance at a girls’ high school to be positively associated with the number of girls’ high schools close to one’s home. Therefore, a negative first-stage association between the two is a sign of misspecification. It would also imply, using the compliers’ interpretation of IV estimation, that $\beta_1$ would measure the effect on scores of girls’ high school attendance for girls who attend such schools because there are few of them nearby, which is again a scarcely believable interpretation.
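The relevance check described in part (iv), estimating the reduced form and testing numghs for significance, can be sketched as follows. This is an illustration on simulated data, not real survey data: the data-generating numbers and the helper names `solve` and `ols_t_stat` are assumptions. The t statistic is the usual non-robust one; since girlhs is binary, a heteroskedasticity-robust version would be preferable in practice.

```python
import math
import random


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, k):
            factor = m[r][col] / m[col][col]
            for c in range(col, k + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * k
    for r in reversed(range(k)):
        x[r] = (m[r][k] - sum(m[r][c] * x[c] for c in range(r + 1, k))) / m[r][r]
    return x


def ols_t_stat(rows, y, j):
    """OLS of y on the columns in `rows`; return (coefficient j, its usual t statistic)."""
    n, k = len(rows), len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(bj * rj for bj, rj in zip(beta, r)) for r, yi in zip(rows, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    unit = [1.0 if a == j else 0.0 for a in range(k)]
    var_j = sigma2 * solve(xtx, unit)[j]  # sigma^2 * [(X'X)^{-1}]_{jj}
    return beta[j], beta[j] / math.sqrt(var_j)


# Simulated data: all magnitudes are invented for illustration.
rng = random.Random(1)
n = 1000
rows, girlhs = [], []
for _ in range(n):
    faminc = rng.gauss(50, 15)
    meduc = rng.gauss(13, 2)
    feduc = rng.gauss(13, 2)
    numghs = float(rng.randint(0, 5))
    latent = (-1.0 + 0.01 * (faminc - 50) + 0.05 * (meduc - 13)
              + 0.05 * (feduc - 13) + 0.4 * numghs + rng.gauss(0, 1))
    rows.append([1.0, faminc, meduc, feduc, numghs])
    girlhs.append(1.0 if latent > 0 else 0.0)

# Column 4 is numghs; its coefficient plays the role of the reduced-form slope on the instrument.
coef, t_stat = ols_t_stat(rows, girlhs, 4)
print(coef, t_stat)
```

A large t statistic with the expected positive sign supports the relevance requirement only; it says nothing about whether numghs is uncorrelated with $u_1$.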
Question 3.
(i) Better and more serious students tend to go to college, and these same kinds of students may be attracted to private and, in particular, Catholic high schools. The resulting correlation between u and CathHS is another example of a self-selection problem: students self-select toward Catholic high schools, rather than being randomly assigned to them.
(ii) A standardized score is a measure of student ability, so this can be used as a proxy variable in an OLS regression. Having this measure in an OLS regression should be an improvement over having no proxies for student ability.
(iii) The first requirement is that CathRel must be uncorrelated with unobserved student motivation and ability (whatever is not captured by any proxies) and other factors in the error term. This holds if growing up Catholic (as opposed to attending a Catholic high school) does not make you a better student. It seems reasonable to assume that Catholics do not have more innate ability than non-Catholics. Whether being Catholic is unrelated to student motivation, or preparation for high school, is a thornier issue.
The second requirement is that being Catholic has an effect on attending a Catholic high school, controlling for the other exogenous factors that appear in the structural model. This can be tested by estimating the first-stage equation, which has the form $\text{CathHS} = \pi_0 + \pi_1\,\text{CathRel} + (\text{other exogenous factors}) + (\text{reduced form error})$.
(iv) Evans and Schwab (1995, EV) find that being Catholic substantially increases the probability of attending a Catholic high school. Further, it seems that treating CathRel as exogenous in the main equation is reasonable. Among the checks EV perform to provide evidence for this claim, they use as an additional IV a variable denoting the percentage of Catholics in the county where a student attends school. The arguments for using this additional IV would be: a) that the percentage of Catholics in one’s school county should be correlated with the school being a Catholic one (this is true because Catholic schools tend to be founded in areas with a larger Catholic population); b) that the percentage of Catholics in the county where the school is located should not have a direct effect on students’ probability of attending college. The use of a second instrument allows EV to carry out a test of overidentifying restrictions, which fails to reject the null. This implies that if one accepts the additional IV as a valid instrument, then one cannot reject the null that the first instrument (namely Catholic religion) is also valid.
See EV (1995, pp. 964-969) for an in-depth analysis.
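The logic of an overidentification test in part (iv) can be illustrated with a minimal sketch. It is not EV's procedure or data. It uses simulated data, a continuous endogenous regressor for simplicity, no additional controls, and the textbook $nR^2$ form of the statistic (regress the 2SLS residuals on all instruments). The names `z1`, `z2`, `solve`, `ols`, and `fitted`, and all numbers, are assumptions made for illustration.

```python
import math
import random


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, k):
            factor = m[r][col] / m[col][col]
            for c in range(col, k + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * k
    for r in reversed(range(k)):
        x[r] = (m[r][k] - sum(m[r][c] * x[c] for c in range(r + 1, k))) / m[r][r]
    return x


def ols(rows, y):
    """Return OLS coefficients from regressing y on the columns in `rows`."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def fitted(rows, coefs):
    """Fitted values rows @ coefs."""
    return [sum(c * v for c, v in zip(coefs, r)) for r in rows]


# Simulated data: both instruments are valid by construction; all numbers are invented.
rng = random.Random(42)
n = 2000
z1 = [float(rng.randint(0, 1)) for _ in range(n)]  # stand-in for a binary instrument
z2 = [rng.uniform(0.0, 1.0) for _ in range(n)]     # stand-in for a share-type instrument
u = [rng.gauss(0, 1) for _ in range(n)]            # structural error
x = [a + b + 0.5 * e + rng.gauss(0, 1) for a, b, e in zip(z1, z2, u)]  # endogenous
y = [1.0 + 2.0 * xi + e for xi, e in zip(x, u)]

instruments = [[1.0, a, b] for a, b in zip(z1, z2)]

# First stage: regress x on the constant and both instruments.
x_hat = fitted(instruments, ols(instruments, x))

# Second stage: regress y on the constant and the first-stage fitted values.
b0, b1 = ols([[1.0, xh] for xh in x_hat], y)

# 2SLS residuals are built from the actual x, not the fitted values.
resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

# Auxiliary regression of the residuals on all instruments; the statistic is n * R^2.
aux_fit = fitted(instruments, ols(instruments, resid))
ssr = sum((e - f) ** 2 for e, f in zip(resid, aux_fit))
mean_resid = sum(resid) / n
sst = sum((e - mean_resid) ** 2 for e in resid)
overid_stat = n * (1.0 - ssr / sst)

# Two instruments, one endogenous regressor: one overidentifying restriction, chi-square(1).
# For one degree of freedom, P(chi2 > s) = erfc(sqrt(s / 2)).
p_value = math.erfc(math.sqrt(max(overid_stat, 0.0) / 2.0))
print(b1, overid_stat, p_value)
```

As in the discussion of EV, a failure to reject is informative only conditional on at least one instrument being valid: the test compares the instruments against each other and cannot validate both at once.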
2022-04-27