Homework #2

ECON 2311

Due at 11:59 PM on Sunday 11/05/2023. Points will be deducted if turned in late. No submissions will be accepted after solutions have been posted.

You will need to turn in your hand-written responses to questions 1-5. Sending a legible photo from your phone in a PC-readable format (.pdf, .png, .img, .jpg, .doc, or .docx) is acceptable, or you can send in a .doc or .pdf with typed answers and handwritten work included as an embedded image.

For Stata Questions, you will need to submit an error-free Stata .do file and .log file – you may include your typed up answers either within the same document as the solve-by-hand questions or include them as comments in the .do file.
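A minimal sketch of a .do file layout that also produces the required .log file is shown below. The file names `hw2.do` and `hw2.log` are hypothetical placeholders, not names the assignment specifies.

```stata
* hw2.do  (hypothetical file name; use whatever naming your instructor expects)
clear all
capture log close                  // close any log left open by an earlier run
log using hw2.log, replace text    // plain-text log; hw2.log is a placeholder name

* ... commands for Questions 6 and 7 go here ...
* Typed answers can be written as comments, like this line.

log close
```

Running the whole file from top to bottom before submitting is one way to confirm that it is error-free, since Stata stops at the first command that fails.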

Solve-by-hand & Regression Interpretation Questions

1. Ch. 4. Consider the regression model $\widehat{MJ} = \hat{\beta}_0 + \hat{\beta}_1 \times UR$, where MJ is the past-month marijuana-use rate for adults, age 20-25, in a state in 2020-21, UR is the average state unemployment rate for 2020-21, and there are 51 observations (50 states plus DC). The results are the following, with the standard errors in parentheses below the coefficient estimate:

$\widehat{MJ} = 8.971 + 0.244 \times UR$

(1.406)   (0.148)

a. Give the formal null and alternative hypotheses for whether the variable UR has a coefficient that is different from zero.

b. Give the formal null and alternative hypotheses for whether the variable UR has a positive coefficient.

c. Calculate the t-stat for the coefficient estimate on UR.

d. Determine the critical values for the two-sided test on that coefficient for tests at the 1%, 5%, and 10% levels. Is the coefficient estimate statistically significant at each of those levels?

e. Calculate the 95% confidence interval for the coefficient estimate on UR.
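As a reminder for parts (c) through (e), the t-statistic and confidence interval take the general form sketched below. The 1.96 multiplier assumes the large-sample (standard normal) critical value, which is an assumption here; with 51 observations, your course may instead expect Student-t critical values with 49 degrees of freedom.

```latex
\begin{align*}
t &= \frac{\hat{\beta}_1 - \beta_{1,0}}{SE(\hat{\beta}_1)}
   = \frac{0.244 - 0}{0.148} \approx 1.65
   && \text{($\beta_{1,0} = 0$ is the value under the null)} \\[4pt]
\text{95\% CI} &= \hat{\beta}_1 \pm 1.96 \times SE(\hat{\beta}_1)
   && \text{(1.96 assumes the normal approximation)}
\end{align*}
```

The coefficient is significant at a given level when $|t|$ exceeds the two-sided critical value for that level.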

2. Ch. 4. You have obtained measurements of height in inches of 29 female and 81 male students (Studenth) at your university. A regression of the height on a constant and a binary variable (BFemme), which takes a value of one for females and is zero otherwise, yields the following result:

$\widehat{Studenth} = 71.0 - 4.84 \times BFemme$, $R^2 = 0.40$, SER = 2.0

(0.3)    (0.57)

a. What is the interpretation of the intercept? What is the interpretation of the slope? How tall are females, on average?

b. Test the hypothesis that females, on average, are shorter than males, at the 1% level.

c. Is it likely that the error term is homoskedastic here?

3. Ch. 5. (Continuation from Chapter 4, number 5) You have learned in one of your economics courses that one of the determinants of per capita income (the "Wealth of Nations") is the population growth rate. Furthermore, you also found out that the Penn World Tables contain income and population data for 104 countries of the world. To test this theory, you regress the GDP per worker (relative to the United States) in 1990 (RelPersInc) on the difference between the average population growth rate of that country (n) and the U.S. average population growth rate ($n_{us}$) for the years 1980 to 1990. This results in the following regression output:

$\widehat{RelPersInc} = 0.518 - 18.831 \times (n - n_{us})$, $R^2 = 0.522$, SER = 0.197

(0.056)    (3.177)

a. Is there any reason to believe that the variance of the error terms is homoscedastic?

b. Is the relationship statistically significant?

4. Ch. 5. You recall from one of your earlier lectures in macroeconomics that the per capita income depends on the savings rate of the country: those who save more end up with a higher standard of living. To test this theory, you collect data from the Penn World Tables on GDP per worker relative to the United States (RelProd) in 1990 and the average investment share of GDP from 1980-1990 (SK), remembering that investment equals saving. The regression results in the following output:

$\widehat{RelProd} = -0.08 + 2.44 \times SK$, $R^2 = 0.46$, SER = 0.21

(0.04)  (0.38)

a. Interpret the regression results carefully.

b. Calculate the t-statistics to determine whether the two coefficients are significantly different from zero. Justify the use of a one-sided or two-sided test.

c. You accidentally forget to use the heteroskedasticity-robust standard errors option in your regression package and estimate the equation using homoskedasticity-only standard errors. This changes the results as follows:

$\widehat{RelProd} = -0.08 + 2.44 \times SK$, $R^2 = 0.46$, SER = 0.21

(0.04)   (0.26)

You are delighted to find that the coefficients have not changed at all and that your results have become even more significant. Why haven't the coefficients changed? Are the results really more significant? Explain.

d. Upon reflection you think about the advantages of OLS with and without homoskedasticity-only standard errors. What are these advantages? Is it likely that the error terms would be heteroskedastic in this situation?

5. Ch. 5. Carefully discuss the advantages of using heteroskedasticity-robust standard errors over standard errors calculated under the assumption of homoskedasticity. Give at least three examples where it is very plausible to assume that the errors display heteroskedasticity.

Stata Questions

6. Ch. 4. The data file growth.dta contains data on average growth rates from 1960 through 1995 for 65 countries, along with variables that are potentially related to growth. In this exercise, you will investigate the relationship between growth and trade.

a. Construct a scatterplot of average annual growth rate (Growth) on the average trade share (TradeShare). Does there appear to be a relationship between the variables?

b. One country, Malta, has a trade share much larger than the other countries. Find Malta on the scatterplot. Does Malta look like an outlier?

c. Using all observations, run a regression of Growth on TradeShare. What is the estimated slope? What is the estimated intercept? Use the regression to predict the growth rate for a country with a trade share of 0.5 and with a trade share equal to 1.0.

d. Estimate the same regression, excluding the data from Malta. Answer the same questions in (c).

e. Plot the estimated regression functions from (c) and (d). Using the scatterplot in (a), explain why the regression function that includes Malta is steeper than the regression function that excludes Malta.

f. Where is Malta? Why is the Malta trade share so large? Should Malta be included or excluded from the analysis?
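The steps in Question 6 can be sketched in Stata roughly as follows. The lowercase variable names `growth`, `tradeshare`, and `country_name` are assumptions about how the variables are stored in growth.dta; run `describe` first and adjust the names if yours differ. The `robust` option is also an assumption about which standard errors the course expects.

```stata
use growth.dta, clear
describe                          // confirm the actual variable names first

* (a)-(b) Scatterplot, labeling points so that Malta can be located
* (growth, tradeshare, country_name are assumed variable names)
twoway scatter growth tradeshare, mlabel(country_name)

* (c) Regression on all observations, then predictions at 0.5 and 1.0
regress growth tradeshare, robust
display _b[_cons] + _b[tradeshare]*0.5
display _b[_cons] + _b[tradeshare]*1.0

* (d) Same regression excluding Malta
regress growth tradeshare if country_name != "Malta", robust
display _b[_cons] + _b[tradeshare]*0.5
display _b[_cons] + _b[tradeshare]*1.0

* (e) Both fitted lines drawn over the scatterplot
twoway (scatter growth tradeshare)                                  ///
       (lfit growth tradeshare)                                     ///
       (lfit growth tradeshare if country_name != "Malta"),         ///
       legend(order(2 "All countries" 3 "Excluding Malta"))
```

The `_b[...]` expressions refer to the coefficients of the most recent `regress` command, so each pair of `display` lines must directly follow the regression it belongs to.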

7. Ch. 4. The data file Birthweight_Smoking.dta contains data for a random sample of babies born in Pennsylvania in 1989. The data include the baby’s birth weight together with various characteristics of the mother, including whether she smoked during the pregnancy. In this exercise you will investigate the relationship between birth weight and smoking during pregnancy.

a. In the sample:

i. What is the average value of Birthweight for all mothers?

ii. For mothers who smoke?

iii. For mothers who do not smoke?

Hint: Use the mean command in Stata or use the tabstat command.

b. i. Use the data in the sample to estimate the difference in average birth weight for smoking and nonsmoking mothers.

ii. What is the standard error for the estimated difference in (i)? (Hint: you will need the equation $SE(\bar{x}_{Smokers} - \bar{x}_{NonSmokers}) = \sqrt{SE(\bar{x}_{Smokers})^2 + SE(\bar{x}_{NonSmokers})^2}$.)

iii. Construct a 95% confidence interval for the difference in the average birth weight for smoking and nonsmoking mothers.

c. Ch. 5. Run a regression of Birthweight on the binary variable Smoker.

i. Explain how the estimated slope and intercept are related to your answers in parts (a) and (b).

ii. Explain how $SE(\hat{\beta}_1)$ is related to your answer in b(ii).

iii. Construct a 95% confidence interval for the effect of smoking on birth weight.

d. Do you think smoking is uncorrelated with other factors that cause low birth weight? That is, do you think that the regression error term, say $u_i$, has a conditional mean of zero, given Smoking ($X_i$)?
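A possible Stata outline for Question 7 is sketched below. The variable names `birthweight` and `smoker` are assumptions about how the variables are stored in the data file, and the `robust` option is an assumption about which standard errors the course expects.

```stata
use Birthweight_Smoking.dta, clear
describe                          // confirm the actual variable names first

* (a) Mean birth weight overall and by smoking status
* (birthweight and smoker are assumed variable names)
mean birthweight
mean birthweight, over(smoker)
tabstat birthweight, by(smoker) statistics(mean semean n)

* (b) Difference in means; the unequal-variance standard error matches
*     the square-root-of-summed-squares formula in the hint
ttest birthweight, by(smoker) unequal

* (c) Regression on the binary variable
regress birthweight smoker, robust
```

The `semean` statistic in `tabstat` reports the standard error of each group mean, which is the ingredient needed to apply the hint in b(ii) by hand and compare the result against the `ttest` output.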

Documentation for Growth Data

Growth contains data on average growth rates over 1960-1995 for 65 countries, along with variables that are potentially related to growth. These data were provided by Professor Ross Levine of Brown University and were used in his paper with Thorsten Beck and Norman Loayza, “Finance and the Sources of Growth,” Journal of Financial Economics, 2000, Vol. 58, pp. 261-300.

Variable Definitions

 

Documentation for Birthweight_Smoking Data

The datafile Birthweight_Smoking is from the 1989 linked National Natality-Mortality Detail files, which contain a census of infant births and deaths. The data in bw_smoking.dta are for births in Pennsylvania in 1989.

These data were provided by Professors Douglas Almond, Kenneth Chay, and David Lee and are a subset of the data used in their paper “The Costs of Low Birth Weight,” Quarterly Journal of Economics, August 2005, 120(3): 1031-1083. The file contains 3,000 observations on the variables described below.

Variable Definitions