BC 3018 Econometrics (Fall 2023)

Problem Set Three

Instructions: Please read all questions carefully and in their entirety. Your problem set should be typed up with the appropriate formatting and submitted through Courseworks under the Problem Set Three assignment.  This problem set is exclusively a computer exercise and requires use of R. The first few lines of your script should provide the following information in this order (comment out each line with #): YOUR NAME, Problem Set Three, Econometrics (Fall 2023), and the Date.

Your script must be clean and neatly annotated. Please save your script as an R Markdown file.

Part I: Answer the following questions.

Each of the problems below requires the Beauty dataset, which is based on the original study by Hamermesh and Biddle (1994) in the American Economic Review on the returns to beauty in the labor market.

Variable definitions within the Beauty dataset are as follows:

•  wage is the hourly wage

•  Below_Average is a dummy variable derived from a variable called Looks ranging from 1 to 5 (average looks = 3). Below_Average equals one if individual i has below average (i.e., ≤ 2) looks and zero otherwise.

•  Above_Average is also a dummy variable based on Looks, equal to one if individual i has above average looks (i.e., ≥ 4) and zero otherwise.

•  Female is a dummy variable equal to one if the worker is a woman and zero otherwise

•  Education captures years of schooling

•  Experience denotes years of workforce experience

•  Married is a dummy variable equal to one if the worker is married and zero otherwise

•  south is a dummy variable equal to one if the worker lives in the south and zero otherwise

•  Health is a dummy variable equal to one if the worker reports being in good health and zero otherwise

1.  First, let’s inspect our data using basic descriptive statistics!

(a)  What are average wages in these data for workers with “below average” and “above average” looks? (Hint: You’ll need to use mean(·) and attach the appropriate condition.)

(b) Now I would like you to confirm that both “beauty” dummy variables were properly constructed by forming a contingency table using table(·) (use Beauty$Below_Average and Beauty$Above_Average as the inputs). How many workers were rated with “above average” and “below average” looks?
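The two hints above can be sketched on a small made-up data frame. This is only an illustration: `toy` and its values are hypothetical stand-ins, and only the column names are taken from the variable list above.

```r
# Hypothetical toy data; the real Beauty dataset uses the same column names.
toy <- data.frame(
  wage          = c(4.0, 6.0, 5.0, 9.0, 7.0, 8.0),
  Below_Average = c(1,   1,   0,   0,   0,   0),
  Above_Average = c(0,   0,   0,   1,   1,   0)
)

# Conditional means: subset wage with a logical condition inside mean().
mean_below <- mean(toy$wage[toy$Below_Average == 1])
mean_above <- mean(toy$wage[toy$Above_Average == 1])

# Contingency table of the two dummies. If the dummies were built correctly,
# the (1, 1) cell is empty, since nobody can be both below and above average.
looks_tab <- table(toy$Below_Average, toy$Above_Average)
print(looks_tab)
```

With the real data, the same pattern applies after replacing `toy` with `Beauty`.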

2.  In contrast to previous problem sets, I would like you to construct your table output first (again using stargazer(·)), and you will use this table to answer the remaining questions in this section. You must report robust standard errors for each regression. The OLS regression specifications are given as follows:

•  Specification 1: Mincer eqn. ($\log(\text{wages}) = \beta_0 + \beta_1 Educ + \beta_2 Exper + \beta_3 Exper^2 + \varepsilon$)

•  Specification 2: Same as (1), adding Female, Below_Average, and Above_Average

•  Specification 3: Same as (2), adding Female × Above_Average
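As a rough sketch of what "robust standard errors" means for Specification 1, the block below computes heteroskedasticity-robust (HC1) standard errors by hand in base R. The data are simulated (the data frame `sim` and every coefficient in it are hypothetical), and the choice of the HC1 variant is an assumption, since the problem set does not say which robust estimator is expected.

```r
# Simulated stand-in data (hypothetical); swap in the real Beauty data.
set.seed(1)
n <- 200
sim <- data.frame(Education  = sample(8:17, n, replace = TRUE),
                  Experience = sample(0:40, n, replace = TRUE))
sim$wage <- exp(0.5 + 0.08 * sim$Education + 0.03 * sim$Experience -
                0.0005 * sim$Experience^2 + rnorm(n, sd = 0.4))

# Specification 1: Mincer equation. I() protects the squared term.
m1 <- lm(log(wage) ~ Education + Experience + I(Experience^2), data = sim)

# HC1 covariance: (n / (n - k)) * (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}
X <- model.matrix(m1)
e <- residuals(m1)
k <- ncol(X)
bread <- solve(crossprod(X))
meat  <- crossprod(X * e)   # each row of X scaled by its residual
vcov_hc1  <- (n / (n - k)) * bread %*% meat %*% bread
robust_se <- sqrt(diag(vcov_hc1))

# Robust t-statistics and two-sided p-values.
t_stat <- coef(m1) / robust_se
p_val  <- 2 * pt(abs(t_stat), df = n - k, lower.tail = FALSE)
print(cbind(estimate = coef(m1), robust_se, t_stat, p_val))
```

In practice the same matrix is usually obtained with `sandwich::vcovHC(m1, type = "HC1")`, and the resulting standard errors can be handed to stargazer through its `se` argument as a list with one vector per model.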

3. Let’s focus exclusively on column (1) (i.e., Specification 1).

(a)  What are the estimated returns to education? Assess the statistical significance of this estimate using a t-test based on your table output.

(b) Interpret the effects of Experience on wages.

(c) If we wanted to statistically assess the joint significance of Experience and Experience$^2$, what test should we use? State the null and alternative hypotheses.

4.  Using column (2) (i.e., Specification 2), interpret the estimated coefficient on Above_Average and assess its statistical significance using the p-value from your table output.

5. Let’s focus exclusively on column (3) (i.e., Specification 3).

(a)  Interpret the estimated interaction term for Female × Above_Average and assess its statistical significance using a p-value from your table output.

(b)  How should we interpret the estimated OLS coefficient for Below_Average? Be careful.

(c)  Suppose that we are worried that our “beauty” dummy variables suffer from non-classical measurement error. Why might this be the case, and how might this measurement error affect our OLS estimates?

(d)  Why did we request robust standard errors?  Perform the relevant test based on your response.

Part II: Answer the following questions.

Each of the problems below once again requires the Beauty dataset, and the variables are defined as in Part I. For this section only, you do not need to produce a stargazer table!

All OLS regression specifications within this section will be based on:

$Health = \beta_0 + \beta_1 Educ + \beta_2 Black + \beta_3 Married + \beta_4 south + \varepsilon$                    (1)

1. Let’s begin by constructing linear probability model estimates with robust standard errors.

(a)  Interpret the estimate corresponding to Education and assess its statistical significance.

(b)  Interpret the estimate for Black and assess its statistical significance.

(c)  Construct predicted probabilities based on our linear probability model estimates. Do these predicted probabilities make sense? Why or why not?
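One way to inspect linear probability model predictions is sketched below on simulated data. The data frame `sim`, its two regressors, and the coefficients used to generate it are hypothetical and much simpler than equation (1).

```r
# Simulated stand-in data (hypothetical), not the Beauty dataset.
set.seed(2)
n <- 500
sim <- data.frame(Education = sample(5:17, n, replace = TRUE),
                  Married   = rbinom(n, 1, 0.6))
sim$Health <- rbinom(n, 1, pnorm(-0.5 + 0.15 * sim$Education + 0.3 * sim$Married))

# Linear probability model: OLS with a binary outcome.
lpm   <- lm(Health ~ Education + Married, data = sim)
p_hat <- fitted(lpm)

# OLS does not constrain fitted values to the unit interval,
# so check the range and count any predictions outside [0, 1].
print(summary(p_hat))
n_outside <- sum(p_hat < 0 | p_hat > 1)
print(n_outside)
```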

2.  Now let’s produce probit estimates based on these data.

(a) First produce probit estimates using the glm(·) function. How should we interpret the estimate corresponding to Black? Is it statistically significant?

(b)  Now construct a marginal effects estimate for Black based on our probit model using the AME computation. Is it statistically significant? More generally, how do these estimates differ from what you produced using the glm(·) function?
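A minimal sketch of a probit fit and an average marginal effect (AME) for a dummy regressor follows, again on simulated data (`sim` and its coefficients are hypothetical). Two AME conventions are shown because the problem set does not say which one is intended: the average discrete change from switching the dummy from 0 to 1, and the derivative-based approximation.

```r
# Simulated stand-in data (hypothetical), not the Beauty dataset.
set.seed(3)
n <- 1000
sim <- data.frame(Education = sample(5:17, n, replace = TRUE),
                  Black     = rbinom(n, 1, 0.2))
sim$Health <- rbinom(n, 1, pnorm(0.2 + 0.08 * sim$Education - 0.4 * sim$Black))

probit <- glm(Health ~ Education + Black,
              family = binomial(link = "probit"), data = sim)

# Raw probit coefficients are on the latent-index scale, not probabilities.
print(coef(summary(probit)))

# AME as an average discrete change: predict with Black = 1 and Black = 0
# for every observation, then average the difference.
d1 <- transform(sim, Black = 1)
d0 <- transform(sim, Black = 0)
ame_discrete <- mean(predict(probit, newdata = d1, type = "response") -
                     predict(probit, newdata = d0, type = "response"))

# Derivative-based AME: average normal density at the fitted index times beta.
xb <- predict(probit, type = "link")
ame_deriv <- mean(dnorm(xb)) * coef(probit)[["Black"]]

print(c(discrete = ame_discrete, derivative = ame_deriv))
```

Neither number comes with a standard error here. Assessing significance requires one, typically from the delta method, which packages such as margins compute.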

3.  Now let’s turn to the probit and logit estimates based on these data.

(a)  Produce and interpret a marginal effects estimate for Education based on our logit model using the PEA computation. Is it statistically significant?

(b) Finally, what is the pseudo-$R^2$ for this model?
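For reference, the quantities named in this question can be written out as follows. This assumes that PEA denotes the partial effect at the average of the regressors and that the pseudo-$R^2$ is McFadden's version. Both are common conventions, but the course may define them differently.

```latex
% Logistic CDF used by the logit model
\Lambda(z) = \frac{e^{z}}{1 + e^{z}}

% Partial effect at the average (PEA) for a continuous regressor x_j
\mathrm{PEA}_j = \Lambda(\bar{x}'\hat{\beta})\,
  \bigl[1 - \Lambda(\bar{x}'\hat{\beta})\bigr]\,\hat{\beta}_j

% Average marginal effect (AME), shown for comparison
\mathrm{AME}_j = \frac{1}{n}\sum_{i=1}^{n} \Lambda(x_i'\hat{\beta})\,
  \bigl[1 - \Lambda(x_i'\hat{\beta})\bigr]\,\hat{\beta}_j

% McFadden's pseudo-R^2: fitted model vs. intercept-only model
R^2_{\mathrm{McF}} = 1 - \frac{\ln \hat{L}_{\mathrm{full}}}{\ln \hat{L}_{\mathrm{null}}}
```

The PEA evaluates the logistic density once at the sample means $\bar{x}$, whereas the AME evaluates it at each observation and then averages.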

References

Hamermesh, D. S. and J. E. Biddle (1994, December).  Beauty and the labor market.  The American Economic Review 84(5), 1174–1194.