Quantifying Effects Using Regression Models (25 points)
Question 1: What is the effect of educational television on literacy in young children? (5 points)
A group of researchers wanted to determine whether exposure to a series of educational television shows leads to improvements in children’s literacy. The researchers conducted an experiment in which one group of young children, the control group, watched a series of traditional (non-educational) cartoon shows, and a second group of young children, the treatment group, watched a series of educational shows focused on literacy. All the children then completed a literacy test and received a score ranging from 0 to 100. Here is the sample regression function estimating the treatment effect and the R² value:

In the above model, Educational TV Group is a binary variable that equals 1 for children in the treatment group and 0 for children in the control group.
● Interpret the intercept.
● Interpret the slope coefficient.
● Interpret the R² value.
● What can you conclude from this model in real-world terms?
● In two or three sentences, discuss one concern you have with this analysis.
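As a minimal sketch of how a regression on a binary treatment indicator relates to group means, the R snippet below uses simulated scores, not the study's data. The variable names `score` and `tv_group` and every numeric value are assumptions made purely for illustration.

```r
# Simulated data only: these values are invented, not taken from the study.
set.seed(1)
n <- 100
tv_group <- rep(c(0, 1), each = n / 2)           # 0 = control, 1 = treatment
score <- 60 + 5 * tv_group + rnorm(n, sd = 10)   # hypothetical literacy scores

fit <- lm(score ~ tv_group)

# With a single binary regressor, the intercept equals the control-group mean
# and the slope equals the difference between the two group means.
control_mean   <- mean(score[tv_group == 0])
treatment_mean <- mean(score[tv_group == 1])

print(coef(fit))
print(c(control_mean = control_mean,
        mean_difference = treatment_mean - control_mean))

# R-squared: share of the variance in score explained by group membership.
print(summary(fit)$r.squared)
```

Comparing the two printed vectors shows the correspondence between the fitted coefficients and the group means, which is the basis for interpreting the intercept and slope in the model above.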
Question 2: What is the effect of having children on women’s wages? (8 points)
Economists hypothesize that there is a “motherhood wage penalty,” meaning women who have children tend to be paid less than women who do not have children. There are a few theories to support this argument. Some argue that women with children have less time to devote to their jobs, which is reflected in their earnings. Others argue that there is discrimination against women with children, which is why they are paid less. Using the wagedata.csv dataset, you will explore whether the effect is observable in the data. The dataset includes survey data from 1,157 women.
Table 1: Variable Definitions for wagedata.csv
| Variable | Definition |
| ID | ID of the woman |
| age | Age in years |
| numChildren | Number of children |
| child.bin | =1 if the woman has any children, 0 otherwise |
| educ.level | Level of education (1=no HS degree, 2=HS degree, 3=some college, 4=college degree) |
| tenure | Current job tenure in years |
| fullTime | Employment status (full time=TRUE, otherwise=FALSE) |
| marstat | Marital status |
| region | Name of region (1=N. East, 2=N. Central, 3=South, 4=West) |
| Urban | Geographical classification (1=urban, otherwise=0) |
| Industry | Job industry type |
| Wage.dollars | Hourly wage in dollars |
● Run a bivariate regression model in R where wage.dollars is the dependent variable and child.bin is the independent variable. Report the sample regression function.
● What is the mean difference in expected hourly wage between women who have children and women who don’t have children?
● What is the expected hourly wage of women who do not have children?
● What is the expected hourly wage of women who have children?
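One possible way to set up the bivariate model is sketched below. It assumes that `wagedata.csv` sits in the working directory and that the columns are named exactly `wage.dollars` and `child.bin`; capitalization may differ in the actual file, so it is worth checking `names(wages)` first. If the file is not found, the hypothetical helper `load_wage_data` falls back to a small simulated data frame with invented values so the snippet still runs.

```r
# Hypothetical helper: read the assignment data if present; otherwise build
# a simulated stand-in with invented values so the sketch still runs.
load_wage_data <- function(path = "wagedata.csv") {
  if (file.exists(path)) {
    return(read.csv(path))
  }
  set.seed(42)
  n <- 200
  child.bin <- rbinom(n, 1, 0.5)
  data.frame(
    child.bin = child.bin,
    wage.dollars = 15 - 1 * child.bin + rnorm(n, sd = 4)  # invented numbers
  )
}

wages <- load_wage_data()

# Bivariate regression: hourly wage on the has-children indicator.
fit_bin <- lm(wage.dollars ~ child.bin, data = wages)
print(summary(fit_bin))

# Expected hourly wage for each group, taken from the fitted model.
pred <- predict(fit_bin, newdata = data.frame(child.bin = c(0, 1)))
names(pred) <- c("no_children", "has_children")
print(pred)
```

The difference between the two predicted values equals the coefficient on `child.bin`, so the three quantities requested above can all be read from the same fitted model.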
Question 3: What is the effect of having children on women’s wages, controlling for education? (12 points)
As this was not an experimental study, it is important to control for possible confounders.
● In two or three sentences, explain why level of education might be a confounding variable.
● Run a regression in R where wage.dollars is the dependent variable, the independent variable of interest is numChildren, and the control variable is educ.level. Report the sample regression function.
● Interpret the coefficient on numChildren.
● Interpret the coefficient on educ.level.
● Report and interpret the adjusted R² value.
● Is there another variable in the dataset that you would recommend adding in as a control variable? State the variable and explain why it would be useful to add as a control.
● Given this analysis, what can you conclude about the effect of having children on women’s wages?
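The controlled model could be fitted along the following lines. This is a sketch under stated assumptions: the column names `wage.dollars`, `numChildren`, and `educ.level` are taken from the question text, and when `wagedata.csv` is not available the snippet simulates a stand-in data set whose values are entirely invented.

```r
# Use the assignment data if available; otherwise simulate a stand-in
# (all simulated values are invented for illustration).
if (file.exists("wagedata.csv")) {
  wages <- read.csv("wagedata.csv")
} else {
  set.seed(7)
  n <- 300
  educ.level <- sample(1:4, n, replace = TRUE)
  numChildren <- rpois(n, lambda = 2.5 - 0.4 * educ.level)
  wage.dollars <- 8 + 3 * educ.level - 0.5 * numChildren + rnorm(n, sd = 4)
  wages <- data.frame(numChildren, educ.level, wage.dollars)
}

# Multiple regression: numChildren is the variable of interest;
# educ.level enters as a numeric control (one slope per one-level step).
fit_ctrl <- lm(wage.dollars ~ numChildren + educ.level, data = wages)
print(coef(fit_ctrl))

# Adjusted R-squared penalizes R-squared for the number of predictors.
print(summary(fit_ctrl)$adj.r.squared)

# Alternative (an assumption, not required by the question): treat education
# as categorical so that each level gets its own coefficient.
fit_factor <- lm(wage.dollars ~ numChildren + factor(educ.level), data = wages)
print(coef(fit_factor))
```

Entering `educ.level` as a number assumes each step up in education shifts the expected wage by the same amount; the `factor()` version relaxes that assumption at the cost of two extra coefficients.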
R Code (3 points)
Paste your R code here
2021-03-14