
ECON 2300: INTRODUCTORY ECONOMETRICS

Tutorial 5: Hypothesis Tests and Confidence Intervals in Multiple Regression, SW Ch7

E7.1 Use the Birthweight Smoking.csv data set introduced in E5.3 to answer the following questions. To begin, run three regressions:

(1) birthweight on smoker.

(2) birthweight on smoker, alcohol, and nprevist.

(3) birthweight on smoker, alcohol, nprevist, and unmarried.

(a) What is the value of the estimated effect of smoking on birth weight in each of the regressions?

(b) Construct a 95% confidence interval for the effect of smoking on birth weight, using each of the regressions.

(c) Does the coefficient on smoker in regression (1) suffer from omitted variable bias? Explain.

(d) Does the coefficient on smoker in regression (2) suffer from omitted variable bias? Explain.

(e) Consider the coefficient on unmarried in regression (3).

i. Construct a 95% confidence interval for the coefficient.

ii. Is the coefficient statistically significant? Explain.

iii. Is the magnitude of the coefficient large? Explain.

iv. A family advocacy group notes that the large coefficient suggests that public policies that encourage marriage will lead, on average, to healthier babies. Do you agree? [Hint: Review the discussion of control variables in Section 7.5. Discuss some of the various factors that unmarried may be controlling for and how this affects the interpretation of its coefficient.]

(f) Consider the various other control variables in the data set. Which do you think should be included in the regression? Using a table like Table 7.1, examine the robustness of the confidence interval you constructed in (b). What is a reasonable 95% confidence interval for the effect of smoking on birth weight?
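
As a starting point for parts (a), (b), and (f), the following is a minimal sketch of how a coefficient, its heteroskedasticity-robust standard error, and a large-sample 95% confidence interval (estimate ± 1.96 × SE) might be computed. It runs on synthetic stand-in data, not the real Birthweight Smoking file: the sample size, coefficients, and noise level are made up for illustration, and the helper names ols_robust and conf_int_95 are hypothetical. The HC1 degrees-of-freedom adjustment is an assumption about which robust variance formula is wanted.

```python
import numpy as np

def ols_robust(y, X):
    """OLS with heteroskedasticity-robust (HC1) standard errors.

    X must already contain a column of ones for the intercept.
    Returns (coefficients, standard errors).
    """
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Sandwich estimator: (X'X)^-1 [sum of u_i^2 x_i x_i'] (X'X)^-1,
    # scaled by n / (n - k) (the HC1 adjustment).
    meat = (X * resid[:, None] ** 2).T @ X
    cov = n / (n - k) * XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov))

def conf_int_95(beta, se):
    """Large-sample 95% interval: estimate +/- 1.96 * SE, one row per coefficient."""
    return np.column_stack([beta - 1.96 * se, beta + 1.96 * se])

# Synthetic stand-in data (assumption: NOT the real data set).
rng = np.random.default_rng(0)
n = 3000
smoker = rng.binomial(1, 0.2, n)
alcohol = rng.binomial(1, 0.02, n)
nprevist = rng.poisson(11, n)
birthweight = (3100 - 200 * smoker - 30 * alcohol + 30 * nprevist
               + rng.normal(0, 550, n))

# Regression (2): birthweight on smoker, alcohol, and nprevist.
X = np.column_stack([np.ones(n), smoker, alcohol, nprevist])
beta, se = ols_robust(birthweight, X)
ci = conf_int_95(beta, se)
print("coefficient on smoker:", beta[1])
print("95% confidence interval:", ci[1])
```

Repeating the last block with different columns in X gives one column of a robustness table in the style of Table 7.1.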

E7.2 In the empirical exercises on earnings and height from previous weeks, you estimated a relatively large and statistically significant effect of a worker’s height on his or her earnings. One explanation for this result is omitted variable bias: Height is correlated with an omitted factor that affects earnings. For example, Case and Paxson (2008) suggest that cognitive ability (or intelligence) is the omitted factor. The mechanism they describe is straightforward: Poor nutrition and other harmful environmental factors in utero and in early childhood have, on average, deleterious effects on both cognitive and physical development. Cognitive ability affects earnings later in life and thus is an omitted variable in the regression.

(a) Suppose that the mechanism described above is correct. Explain how this leads to omitted variable bias in the OLS regression of earnings on height. Does the bias lead the estimated slope to be too large or too small? [Hint: Review Equation (6.1) in SW.]
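
For reference, the omitted variable bias formula that the hint points to is usually written as follows. Here X is the included regressor (height), u is the regression error (which contains the omitted factor), ρ_Xu is the correlation between X and u, and σ_u and σ_X are their standard deviations. The notation is the standard textbook form and should be checked against your edition of SW.

```latex
\hat{\beta}_1 \xrightarrow{\;p\;} \beta_1 + \rho_{Xu}\,\frac{\sigma_u}{\sigma_X}
```

The sign of the second term depends only on the sign of ρ_Xu, since both standard deviations are positive.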

If the mechanism described above is correct, the estimated effect of height on earnings should disappear if a variable measuring cognitive ability is included in the regression. Unfortunately, there is not a direct measure of cognitive ability in the dataset, but the dataset does include “years of education” for each individual. Because students with higher cognitive ability are more likely to attend school longer, years of education might serve as a control variable for cognitive ability. In this case, including education in the regression will eliminate, or at least attenuate, the omitted variable bias problem.

Use the years of education variable, educ, to construct four indicator (dummy) variables for whether a worker has less than a high school diploma (lt hs = 1 if educ < 12, and 0 otherwise), a high school diploma (hs = 1 if educ = 12, and 0 otherwise), some college (some col = 1 if 12 < educ < 16, and 0 otherwise), or a bachelor’s degree or higher (college = 1 if educ ≥ 16, and 0 otherwise).
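
The construction above can be sketched as follows. The function name education_dummies and the underscore spellings lt_hs and some_col are assumptions made for this illustration, and the educ values are made-up test cases rather than data from the exercise.

```python
import numpy as np

def education_dummies(educ):
    """Build the four education indicators from years of education."""
    educ = np.asarray(educ)
    return {
        "lt_hs": (educ < 12).astype(int),                    # less than high school
        "hs": (educ == 12).astype(int),                      # high school diploma
        "some_col": ((educ > 12) & (educ < 16)).astype(int), # some college
        "college": (educ >= 16).astype(int),                 # bachelor's or higher
    }

# Made-up years of education covering each category and its boundaries.
educ = np.array([8, 12, 13, 15, 16, 20])
d = education_dummies(educ)

# Each worker should fall in exactly one category.
total = d["lt_hs"] + d["hs"] + d["some_col"] + d["college"]
print(total)
```

The final check confirms that the four categories are mutually exclusive and exhaustive for these values.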

(b) Focusing first on women only, run two regressions: (1) earnings on height, and (2) earnings on height, including lt hs, hs, and some col as control variables.

i. Compare the estimated coefficients on height in regressions (1) and (2). Is there a large change in the coefficient? Has it changed in a way consistent with the cognitive ability explanation? Explain.

ii. Regression (2) omits the control variable college. Why?

iii. Test the joint null hypothesis that the coefficients on the education variables are equal to zero.

iv. Discuss the values of the estimated coefficients on lt hs, hs, and some col. (Each of the estimated coefficients is negative, and the coefficient on lt hs is more negative than the coefficient on hs, which in turn is more negative than the coefficient on some col. Why? What do the coefficients measure?)

(c) Repeat (b), using data for men.
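
For the joint test in (b)iii, one possible approach is a heteroskedasticity-robust F-statistic, computed as the Wald statistic divided by the number of restrictions q. The sketch below uses synthetic data with made-up coefficients, not the earnings and height data set. The helper names ols_hc1 and robust_f_stat, the underscore variable names, and the HC1 variance adjustment are assumptions. The 5% critical value of 2.60 is the large-sample F critical value for q = 3 restrictions.

```python
import numpy as np

def ols_hc1(y, X):
    """OLS coefficients and HC1 robust covariance matrix (X includes the intercept)."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    u = y - X @ beta
    meat = (X * u[:, None] ** 2).T @ X
    cov = n / (n - k) * XtX_inv @ meat @ XtX_inv
    return beta, cov

def robust_f_stat(beta, cov, idx):
    """Robust F-statistic for H0: all coefficients at positions idx equal zero."""
    b = beta[idx]
    V = cov[np.ix_(idx, idx)]
    # Wald statistic b' V^-1 b, divided by the number of restrictions q.
    return float(b @ np.linalg.solve(V, b)) / len(idx)

# Synthetic stand-in data (assumption: NOT the real data set).
rng = np.random.default_rng(1)
n = 2000
educ = rng.integers(8, 19, n)          # years of education, 8 to 18
height = rng.normal(65, 3, n)
lt_hs = (educ < 12).astype(float)
hs = (educ == 12).astype(float)
some_col = ((educ > 12) & (educ < 16)).astype(float)
earnings = (20000 + 300 * height - 20000 * lt_hs - 12000 * hs
            - 6000 * some_col + rng.normal(0, 15000, n))

# Regression (2): earnings on height plus three education dummies
# (college is the omitted base category).
X = np.column_stack([np.ones(n), height, lt_hs, hs, some_col])
beta, cov = ols_hc1(earnings, X)

F = robust_f_stat(beta, cov, [2, 3, 4])
print("robust F-statistic:", F)
print("reject at the 5% level:", F > 2.60)
```

With a single restriction the same function returns the square of the robust t-statistic, which is a quick way to check the calculation.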