
ECON4004 - Econometrics 2

Solutions to Tutorial 3

Question 1.

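It is easiest to use (1) after dropping $\bar{z}$. Remember, this is allowed because $\sum_{i=1}^{n} (z_i - \bar{z})(x_i - \bar{x}) = \sum_{i=1}^{n} z_i (x_i - \bar{x})$, and similarly when we replace $x$ with $y$. So, the numerator in the formula for $\hat{\beta}_1$ is

$$\sum_{i=1}^{n} z_i (y_i - \bar{y}) = \sum_{i=1}^{n} z_i y_i - \left(\sum_{i=1}^{n} z_i\right) \bar{y} = n_1 \bar{y}_1 - n_1 \bar{y},$$

where $n_1 = \sum_{i=1}^{n} z_i$ is the number of observations with $z_i = 1$, and we have used the fact that $\left(\sum_{i=1}^{n} z_i y_i\right)/n_1 = \bar{y}_1$, the average of the $y_i$ over the $i$ with $z_i = 1$. So far, we have shown that the numerator in $\hat{\beta}_1$ is $n_1(\bar{y}_1 - \bar{y})$. Next, write $\bar{y}$ as a weighted average of the averages over the two subgroups:

$$\bar{y} = (n_0/n)\,\bar{y}_0 + (n_1/n)\,\bar{y}_1,$$

where $n_0 = n - n_1$. Therefore,

$$\bar{y}_1 - \bar{y} = [(n - n_1)/n]\,\bar{y}_1 - (n_0/n)\,\bar{y}_0 = (n_0/n)(\bar{y}_1 - \bar{y}_0).$$

Therefore, the numerator of $\hat{\beta}_1$ can be written as

$$(n_0 n_1/n)(\bar{y}_1 - \bar{y}_0).$$

By simply replacing $y$ with $x$, the denominator in $\hat{\beta}_1$ can be expressed as $(n_0 n_1/n)(\bar{x}_1 - \bar{x}_0)$.

When we take the ratio of these, the terms involving $n_0$, $n_1$, and $n$ cancel, leaving

$$\hat{\beta}_1 = \frac{\bar{y}_1 - \bar{y}_0}{\bar{x}_1 - \bar{x}_0}.$$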
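The equality between the IV slope with a binary instrument and the ratio of differences in group means can be checked numerically. The following is a minimal sketch on simulated data; the data-generating process, the coefficient values, and the function names `iv_slope` and `wald_slope` are illustrative assumptions, not part of the tutorial.

```python
import numpy as np

def iv_slope(y, x, z):
    """Simple IV slope: sum (z - zbar)(y - ybar) / sum (z - zbar)(x - xbar)."""
    zc = z - z.mean()
    return np.sum(zc * (y - y.mean())) / np.sum(zc * (x - x.mean()))

def wald_slope(y, x, z):
    """Ratio of differences in group means, valid for a binary z."""
    dy = y[z == 1].mean() - y[z == 0].mean()
    dx = x[z == 1].mean() - x[z == 0].mean()
    return dy / dx

# Hypothetical simulated data: z is a binary instrument, x is endogenous
# because the error u shares the component v with x.
rng = np.random.default_rng(0)
n = 500
z = rng.integers(0, 2, size=n)
v = rng.normal(size=n)
x = 1.0 + 0.8 * z + v
u = 0.5 * v + rng.normal(size=n)
y = 2.0 + 1.5 * x + u

b_iv = iv_slope(y, x, z)
b_wald = wald_slope(y, x, z)
print(b_iv, b_wald)  # the two numbers agree up to floating-point error
```

The agreement is algebraic, so it holds in every sample with both groups non-empty and different group means of x, not only on average.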

 

Question 2.

(i) A few examples include family income and background variables, such as parents’ education.

(ii) The population model is

$$\text{score} = \beta_0 + \beta_1\,\text{girlhs} + \beta_2\,\text{faminc} + \beta_3\,\text{meduc} + \beta_4\,\text{feduc} + u_1,$$

where the variables are self-explanatory.

(iii) Parents who are supportive and motivated to have their daughters do well in school may also be more likely to enroll their daughters in a girls’ high school.  It seems likely that girlhs and u1 are correlated.

(iv) Let numghs be the number of girls’ high schools within a 20-mile radius of a girl’s home. To be a valid IV for girlhs, numghs must satisfy two requirements: it must be uncorrelated with $u_1$ and it must be partially correlated with girlhs. The second requirement probably holds and can be tested by estimating the reduced form

$$\text{girlhs} = \pi_0 + \pi_1\,\text{faminc} + \pi_2\,\text{meduc} + \pi_3\,\text{feduc} + \pi_4\,\text{numghs} + v_2$$

and testing numghs for statistical significance. The first requirement is more problematic. Girls’ high schools tend to locate in areas where there is a demand, and this demand can reflect the seriousness with which people in the community view education. Some areas of a state have better students on average for reasons unrelated to family income and parents’ education, and these reasons might be correlated with numghs. One possibility is to include community-level variables that can control for differences across communities.
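The relevance check in part (iv) amounts to a t test on numghs in the reduced form for girlhs. Below is a minimal sketch on simulated data. The variable names follow the solution, but every number in the data-generating process is a made-up assumption, and the helper `ols` is hypothetical. It reports conventional standard errors; since girlhs is binary, heteroskedasticity-robust standard errors would be preferable in practice.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients and conventional (homoskedastic) standard errors."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    coef = xtx_inv @ X.T @ y
    resid = y - X @ coef
    sigma2 = resid @ resid / (n - k)
    se = np.sqrt(sigma2 * np.diag(xtx_inv))
    return coef, se

# Hypothetical simulated sample (all values are assumptions for illustration).
rng = np.random.default_rng(1)
n = 2000
faminc = rng.normal(50, 15, n)            # family income, in thousands
meduc = rng.normal(12, 2, n)              # mother's years of education
feduc = rng.normal(12, 2, n)              # father's years of education
numghs = rng.poisson(2, n).astype(float)  # girls' high schools nearby

# Attendance probability rises with numghs (assumed effect: 0.06 per school).
p = np.clip(0.05 + 0.001 * faminc + 0.005 * meduc + 0.005 * feduc
            + 0.06 * numghs, 0, 1)
girlhs = (rng.uniform(size=n) < p).astype(float)

# Reduced form: girlhs on a constant, the exogenous controls, and numghs.
X = np.column_stack([np.ones(n), faminc, meduc, feduc, numghs])
pi_hat, se = ols(girlhs, X)
t_numghs = pi_hat[4] / se[4]
print("coefficient on numghs:", pi_hat[4], "t statistic:", t_numghs)
```

A large t statistic on numghs supports the relevance requirement only; it says nothing about whether numghs is uncorrelated with $u_1$.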

(v) One would expect attendance at a girls’ high school to be positively associated with the number of girls’ high schools close to one’s home. Therefore, a negative first-stage association between the two is a sign of misspecification. It would also imply, using the compliers’ interpretation of IV estimation, that $\beta_1$ would measure the effect on scores of girls’ high school attendance for girls who attend such schools because there are few of them nearby, which is again a scarcely believable interpretation.

 

Question 3.

(i) Better and more serious students tend to go to college, and these same kinds of students may be attracted to private and, in particular, Catholic high schools. The resulting correlation between u and CathHS is another example of a self-selection problem: students self-select toward Catholic high schools, rather than being randomly assigned to them.

(ii) A standardized score is a measure of student ability, so this can be used as a proxy variable in an OLS regression. Having this measure in an OLS regression should be an improvement over having no proxies for student ability.

(iii) The first requirement is that CathRel must be uncorrelated with unobserved student motivation and ability (whatever is not captured by any proxies) and other factors in the error term. This holds if growing up Catholic (as opposed to attending a Catholic high school) does not make you a better student. It seems reasonable to assume that Catholics do not have more innate ability than non-Catholics. Whether being Catholic is unrelated to student motivation, or preparation for high school, is a thornier issue.

The second requirement is that being Catholic has an effect on attending a Catholic high school, controlling for the other exogenous factors that appear in the structural model. This can be tested by estimating the first-stage equation, which has the form

$$\text{CathHS} = \pi_0 + \pi_1\,\text{CathRel} + (\text{other exogenous factors}) + (\text{reduced form error}).$$

(iv) Evans and Schwab (1995, EV) find that being Catholic substantially increases the probability of attending a Catholic high school. Further, treating CathRel as exogenous in the main equation seems reasonable. Among the checks EV perform to provide evidence for this claim, they use as an additional IV a variable denoting the percentage of Catholics in the county where a student attends school. The arguments for using this additional IV would be: a) that the percentage of Catholics in one’s school county should be correlated with the school being a Catholic one (this is true because Catholic schools tend to be founded in areas with a larger Catholic population); b) that the percentage of Catholics in the county where the school is located should not have a direct effect on students’ probability of attending college. The use of a second instrument allows EV to carry out a test of overidentifying restrictions, which does not reject the null. This implies that if one accepts the additional IV as a valid instrument, then one cannot reject the null that the first instrument (namely Catholic religion) is a valid one.

See EV (1995, pp. 964-969) for an in-depth analysis.
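The logic of the overidentification test in part (iv) can be illustrated with a Sargan-type statistic: estimate the model by 2SLS using both instruments, regress the 2SLS residuals on all exogenous variables, and compare n times the R-squared with a chi-square critical value (one degree of freedom here, since there are two instruments for one endogenous regressor). The sketch below uses simulated data, not the EV data. The variables `z1` and `z2` are hypothetical stand-ins for a binary and a continuous instrument, and all coefficient values and function names are assumptions chosen for illustration. The statistic assumes homoskedastic errors.

```python
import numpy as np

CHI2_1_CRIT_5PCT = 3.841  # 5% critical value, chi-square with 1 df

def ols_coef(y, X):
    """Least-squares coefficients of y on X (y may have several columns)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def sargan_stat(y, X, Z):
    """n * R^2 from regressing 2SLS residuals on the instrument matrix Z."""
    X_hat = Z @ ols_coef(X, Z)           # first stage: project X on Z
    beta = ols_coef(y, X_hat)            # second stage: 2SLS coefficients
    u_hat = y - X @ beta                 # residuals use the original X
    fitted = Z @ ols_coef(u_hat, Z)
    ssr = np.sum((u_hat - fitted) ** 2)
    sst = np.sum((u_hat - u_hat.mean()) ** 2)
    return len(y) * (1.0 - ssr / sst)

def simulate(direct_effect, n=5000, seed=0):
    """Hypothetical data; direct_effect != 0 makes z2 an invalid instrument."""
    rng = np.random.default_rng(seed)
    z1 = rng.integers(0, 2, n).astype(float)   # binary instrument
    z2 = rng.uniform(0, 1, n)                  # continuous instrument
    ability = rng.normal(size=n)               # unobserved, enters x and y
    x = (z1 + z2 + 0.5 * ability + rng.normal(size=n) > 1.0).astype(float)
    y = 1.0 + 0.3 * x + ability + direct_effect * z2 + rng.normal(size=n)
    X = np.column_stack([np.ones(n), x])
    Z = np.column_stack([np.ones(n), z1, z2])
    return y, X, Z

stat_valid = sargan_stat(*simulate(direct_effect=0.0))
stat_invalid = sargan_stat(*simulate(direct_effect=1.0))
print("both instruments valid:", stat_valid)
print("z2 enters y directly:  ", stat_invalid)
print("reject at 5% when statistic exceeds", CHI2_1_CRIT_5PCT)
```

As in the solution, failing to reject is informative about one instrument only if the other is taken to be valid; the test cannot detect the case where both instruments are invalid in a way that leads to the same estimate.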