MIEF Quantitative Methods I (Basic Econometrics) Summer 2022


Module 12: Problem Set 11


General Guidelines

· Submit your answers in the associated Canvas assignment. In your submission, upload a PDF answer sheet with screenshots of the Stata/Excel output placed under each question where Stata is required.

· You will need to use Stata and/or Microsoft Excel for your problem sets. Stata can be purchased or accessed remotely via Citrix; instructions are on the syllabus and posted to Canvas.

· When working on problem sets, be sure to show your work and carefully explain your answers. For questions that require some computation, answers that consist solely of a single number (even if correct) will be given less credit than those that show how the answer was deduced. For interpretation questions, answers that solely restate an estimate (for example, β̂1 = 0.5) will be given less credit than those that explain in words what the estimate means.

Exercise 1

Use the Wooldridge file: FERTIL2.DTA (data on women in Botswana)

children – number of living children

age – age of the woman

educ – number of years of education

electric – =1 if the woman lives in a residence with electricity

urban – =1 if the woman lives in a city

1. Estimate the linear function: children = f(age age2 educ electric urban), where age2 is age squared.

2. Interpret each of the slope coefficients in words. What would the predicted impact be for a group of 100 women?

3. Using the p-values, which of the coefficients are statistically significant at the 0.05 level of significance?
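
Note on the age terms: because age enters both linearly and as a square, its effect on children is not a single coefficient. The sketch below writes this out, assuming age2 denotes age squared; the symbols β0 through β5 and u are notation introduced here for illustration and do not appear in the assignment. The last line shows how a per-woman coefficient scales to a group of 100 women, using educ as the example.

```latex
\begin{align*}
\text{children} &= \beta_0 + \beta_1\,\text{age} + \beta_2\,\text{age}^2
                 + \beta_3\,\text{educ} + \beta_4\,\text{electric}
                 + \beta_5\,\text{urban} + u \\[4pt]
\frac{\partial\,\text{children}}{\partial\,\text{age}}
                &= \beta_1 + 2\beta_2\,\text{age} \\[4pt]
\Delta\,\text{children for 100 women}
                &\approx 100 \cdot \hat{\beta}_3 \cdot \Delta\,\text{educ}
\end{align*}
```

The effect of one more year of age therefore depends on the age at which it is evaluated, so an interpretation of the age coefficients should state that age.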

Exercise 2

For this exercise, use the Excel file “PS11_Data” available in the PS11 folder on Canvas.

2. a) Use the RESET test to check for non-linear relationships in a linear model:

internet_use = f(pop_density gdp_pcap rural access)

What are your conclusions?  Explain.

2. b) Generate log variables for all variables (every variable has a minimum value greater than zero, so the log is defined for all observations). Use the RESET test to check for non-linear relationships in the model:

log_internet_use = f(log_pop_density log_gdp_pcap log_rural log_access)

What are your conclusions?  Explain.
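
For reference, the mechanics of one common form of the RESET test can be sketched as follows: estimate the model, add the squared and cubed fitted values as extra regressors, and test them jointly with an F test. This is a minimal sketch on synthetic data, not the PS11 data; the helper names `ols` and `reset_f`, the single regressor, and the choice of powers (2 and 3) are assumptions made for illustration. Stata's built-in version may use a different set of powers, so its statistic need not match this one.

```python
import numpy as np

def ols(X, y):
    """Return OLS fitted values and the sum of squared residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    return fitted, float(np.sum((y - fitted) ** 2))

def reset_f(X, y):
    """F statistic for adding fitted^2 and fitted^3 to the original model."""
    n, k = X.shape
    fitted, ssr_restricted = ols(X, y)            # original model
    X_aug = np.column_stack([X, fitted**2, fitted**3])
    _, ssr_unrestricted = ols(X_aug, y)           # augmented model
    q = 2                                         # number of added terms
    df = n - k - q                                # residual df, augmented model
    f_stat = ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / df)
    return f_stat, q, df

# Synthetic data for illustration only (hypothetical, not the PS11 data).
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])              # constant plus one regressor
y_straight = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, size=n)
y_curved = 1.0 + 0.5 * x**2 + rng.normal(0.0, 1.0, size=n)

f_straight, q, df = reset_f(X, y_straight)
f_curved, _, _ = reset_f(X, y_curved)
print(f"linear truth: F({q}, {df}) = {f_straight:.2f}")
print(f"curved truth: F({q}, {df}) = {f_curved:.2f}")
```

A large F statistic (small p-value) is evidence against the functional form as specified; it does not by itself say which alternative form is correct.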

2. c) Carry out the Davidson-MacKinnon test to see which of the following two models is better:

internet_use = f(pop_density gdp_pcap rural access)

internet_use = f(log_pop_density log_gdp_pcap log_rural log_access)

What are your conclusions?  Explain.
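
The Davidson-MacKinnon comparison of two non-nested models with the same dependent variable can be sketched as follows: take the fitted values from one model, add them as a regressor to the other model, and look at the t statistic on those fitted values; then repeat with the roles reversed. This is a minimal sketch on synthetic data with one regressor, not the PS11 data; `ols_t`, `j_test`, and the variable names are hypothetical.

```python
import numpy as np

def ols_t(X, y):
    """Return OLS t statistics and fitted values (classical standard errors)."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    fitted = X @ beta
    resid = y - fitted
    sigma2 = resid @ resid / (n - k)
    se = np.sqrt(sigma2 * np.diag(xtx_inv))
    return beta / se, fitted

def j_test(X_a, X_b, y):
    """t statistic on model B's fitted values when they are added to model A."""
    _, fitted_b = ols_t(X_b, y)
    t_stats, _ = ols_t(np.column_stack([X_a, fitted_b]), y)
    return float(t_stats[-1])

# Synthetic data for illustration only: the true model is in logs.
rng = np.random.default_rng(1)
n = 200
x = rng.uniform(1.0, 50.0, size=n)
y = 2.0 + 3.0 * np.log(x) + rng.normal(0.0, 0.5, size=n)

const = np.ones(n)
X_level = np.column_stack([const, x])             # y = f(x)
X_log = np.column_stack([const, np.log(x)])       # y = f(log x)

t_log_in_level = j_test(X_level, X_log, y)        # significant: evidence against the level model
t_level_in_log = j_test(X_log, X_level, y)        # insignificant would mean no evidence against the log model
print(f"log fitted values added to level model: t = {t_log_in_level:.2f}")
print(f"level fitted values added to log model: t = {t_level_in_log:.2f}")
```

Both directions should be run, because the test can reject both models or neither; only when exactly one is rejected does it point to a preferred specification.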

Exercise 3

Use the Excel file: infmrt2007

infmort – number of deaths within the first year per 1,000 live births

pcinc – per capita income

physic – physicians per 100,000 members of the civilian population

popul – population in thousands

1. Estimate the linear function: infmort = f(pcinc, physic, popul). Note that this is similar to the model presented in class, but now the function is linear.

2. Generate and list the residuals and studentized residuals along with the name of the state.

3. Now add the dummy variable for DC as an additional independent variable. Compare the t-statistic on the DC coefficient with the studentized residual for DC from (2). Are they the same? Studentized residuals follow a t distribution, so large studentized residuals indicate outliers. Are any states besides DC outliers?

4. Run the regression excluding the observation for DC. How do the coefficients compare with the coefficients in (3)? Compute the predicted value for DC from this function. Compute the hypothetical residual (actual infmort for DC minus this predicted value). Compare this value with the coefficient on the DC variable from (3).
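
The three quantities compared in parts (2) to (4) can be computed side by side as in the sketch below. It uses synthetic data with one regressor and an artificial outlier in observation 0 as a stand-in for DC, not the infmrt2007 data. It assumes "studentized residual" means the externally studentized residual, where the error variance is estimated without the observation in question; `ols_fit` and all variable names are hypothetical.

```python
import numpy as np

def ols_fit(X, y):
    """Return coefficients, classical standard errors, residuals, and (X'X)^-1."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)
    se = np.sqrt(sigma2 * np.diag(xtx_inv))
    return beta, se, resid, xtx_inv

# Synthetic data for illustration only; observation 0 plays the role of DC.
rng = np.random.default_rng(2)
n = 51
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
y[0] += 6.0                                        # artificial outlier
X = np.column_stack([np.ones(n), x])
k = X.shape[1]

# Part (2): externally studentized residuals from the full-sample regression.
beta, se, resid, xtx_inv = ols_fit(X, y)
h = np.einsum("ij,jk,ik->i", X, xtx_inv, X)        # leverage h_ii = x_i'(X'X)^-1 x_i
ssr = resid @ resid
s2_deleted = (ssr - resid**2 / (1.0 - h)) / (n - k - 1)
studentized = resid / np.sqrt(s2_deleted * (1.0 - h))

# Part (3): add a dummy equal to 1 only for observation 0.
dummy = np.zeros(n)
dummy[0] = 1.0
beta_d, se_d, _, _ = ols_fit(np.column_stack([X, dummy]), y)
t_dummy = beta_d[-1] / se_d[-1]

# Part (4): drop observation 0, then predict it from the remaining data.
beta_drop, _, _, _ = ols_fit(X[1:], y[1:])
hypothetical_resid = y[0] - X[0] @ beta_drop

print(f"studentized residual, obs 0: {studentized[0]:.4f}")
print(f"t statistic on dummy:        {t_dummy:.4f}")
print(f"coefficient on dummy:        {beta_d[-1]:.4f}")
print(f"hypothetical residual:       {hypothetical_resid:.4f}")
```

Printing the values next to each other makes the comparisons asked for in parts (3) and (4) direct; the same side-by-side layout can be used with the actual regression output.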