ECMT2150 INTERMEDIATE ECONOMETRICS

Week 13 Tutorial - Panel Data and Panel Data Models

Q1. Wooldridge Chp 14 Q5

Suppose that, for one semester, you can collect the following data on a random sample of college juniors and seniors for each class taken: a standardized final exam score, percentage of lectures attended, a dummy variable indicating whether the class is within the student’s major, cumulative grade point average prior to the start of the semester, and SAT score.

a)   Why would you classify this data set as a cluster sample? Roughly, how many observations would you expect for the typical student?

b)   Write a model that explains final exam performance in terms of attendance and the other characteristics. Use s to subscript student and c to subscript class. Which variables do not change within a student?

c)   If you pool all of the data and use OLS, what are you assuming about unobserved student characteristics that affect performance and attendance rate? What roles do SAT score and prior GPA play in this regard?

d)   If you think SAT score and prior GPA do not adequately capture student ability, how would you estimate the effect of attendance on final exam performance?

Stata 1 (Wooldridge Chp 13 Computer Exercise 13)

Use the data in WAGEPAN.dta for this exercise. Use standard errors that are robust to clustering, arbitrary heteroskedasticity and serial correlation in the FD errors, $\Delta u_{it}$.

a)   Consider the unobserved effects model

$$lwage_{it} = \beta_0 + \delta_1 d81_t + \dots + \delta_7 d87_t + \beta_1 educ_i + \gamma_1 d81_t\, educ_i + \dots + \gamma_7 d87_t\, educ_i + \beta_2 union_{it} + a_i + u_{it}$$

where $a_i$ is allowed to be correlated with $educ_i$ and $union_{it}$ (notice the subscripts). Which parameters can you estimate using first differencing?

b)   Estimate the equation from part (a) by FD. Remember to either force Stata to estimate it with no intercept (use the noconstant option) or use the year dummies for 1982-87 only. Test the null hypothesis that the return to education does not vary over time.

c)   Now allow the union differential to change over time (along with education) and estimate the equation by FD. What is the estimated union differential in 1980? What about in 1987? Is the difference in the differential between 1980 and 1987 statistically significant?

d)  Test the null hypothesis that the union differential has not varied over time, and discuss your results in light of your answer to part (c).

Stata 2 (Wooldridge Chp 14 Computer Exercise 10)

Use the data in AIRFARE.dta for this exercise.

We are interested in estimating the model:

$$\log(fare_{it}) = \theta_t + \beta_1 concen_{it} + \beta_2 \log(dist_i) + \beta_3 [\log(dist_i)]^2 + a_i + u_{it}, \quad t = 1, \dots, 4$$

where $\theta_t$ means that we allow for different year intercepts.

Use cluster robust standard errors.

a)   Estimate the above equation by pooled OLS, being sure to include year dummies. If $\Delta concen_{it} = 0.10$, what is the estimated percentage increase in fare?

b)   What is the cluster robust 95% confidence interval for $\beta_1$ in the pooled OLS model? Why do we prefer it to the usual confidence interval? Re-estimate the model without using the cluster robust standard errors and find the standard/non-robust 95% CI for $\beta_1$. Compare the two CIs and comment.

c)   Describe what is happening with the quadratic in log(dist). In particular, for what value of dist does the relationship between log(fare) and dist become positive? [Hint: First figure out the turning point value for log(dist), and then exponentiate.] Is the turning point outside the range of the data?
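The hint in part (c) can be written out as a short derivation. This is only a sketch in the model's notation: $\hat\beta_2$ and $\hat\beta_3$ stand for whatever estimates your regression produces, and no numerical values are assumed here.

```latex
\begin{align*}
\frac{\partial \log(fare)}{\partial \log(dist)} &= \hat\beta_2 + 2\hat\beta_3 \log(dist), \\
\log(dist)^{*} &= -\frac{\hat\beta_2}{2\hat\beta_3}, \\
dist^{*} &= \exp\!\left(-\frac{\hat\beta_2}{2\hat\beta_3}\right).
\end{align*}
```

If $\hat\beta_3 > 0$, the slope is negative below $dist^{*}$ and positive above it, so comparing $dist^{*}$ with the smallest and largest values of dist in the sample shows whether the turning point lies inside the data.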

d)   Now estimate the equation using random effects. How does the estimate of $\beta_1$ change?

e)   Now estimate the equation using fixed effects. What is the FE estimate of $\beta_1$? Why is it fairly similar to the RE estimate?

Hint: What is $\lambda$ for RE estimation?

Recall $\lambda = 1 - \left[\sigma_u^2 / (\sigma_u^2 + T\sigma_a^2)\right]^{1/2}$.
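To see why this parameter matters, it may help to recall the textbook's quasi-demeaned equation, on which RE estimation is pooled OLS. The block below is a sketch in generic notation: $y$ and $x_1, \dots, x_k$ are placeholders for the dependent and explanatory variables, $\lambda$ is the RE parameter referred to in the hint, and overbars denote time averages for unit $i$.

```latex
\begin{align*}
y_{it} - \lambda \bar{y}_i &= \beta_0 (1 - \lambda) + \beta_1 (x_{it1} - \lambda \bar{x}_{i1}) + \dots
  + \beta_k (x_{itk} - \lambda \bar{x}_{ik}) + (v_{it} - \lambda \bar{v}_i), \\
v_{it} &= a_i + u_{it}.
\end{align*}
```

Setting $\lambda = 0$ gives pooled OLS, while $\lambda = 1$ gives the within (fixed effects) transformation, so the size of the estimated $\lambda$ indicates which of the two the RE estimates should resemble.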

When estimating a model by random effects or fixed effects, Stata automatically reports what it calls rho, $\rho = \sigma_a^2 / (\sigma_a^2 + \sigma_u^2)$, the fraction of the variance due to $a_i$.

[NB: Stata and our textbook/slides use different notation. We call the unobserved heterogeneity $a_i$, while Stata calls it u_i. Stata's e_i is our idiosyncratic error $u_{it}$.]

To get Stata to report the estimate of $\lambda$, you need to add the option theta to the command for random effects:

xtreg y x1 x2 … xk, re theta

theta is what Stata calls $\lambda$.
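As an illustration of the generic command above, a minimal sketch for this exercise might look as follows. The variable names (id, year, lfare, concen, ldist, ldistsq, y98, y99, y00) are assumptions about how AIRFARE.dta is laid out, so run describe first and adjust them if your copy differs.

```stata
* Sketch only: variable names below are ASSUMED to match AIRFARE.dta.
use AIRFARE.dta, clear
describe

* Declare the panel structure (route identifier and year).
xtset id year

* Random effects with cluster-robust standard errors; the theta option
* asks Stata to report the estimated quasi-demeaning parameter.
xtreg lfare concen ldist ldistsq y98 y99 y00, re theta vce(cluster id)
```

The reported theta can then be compared with 0 and 1 when answering the hint for part (e).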

f)     Name two characteristics of a route (other than distance between stops) that are captured by $a_i$. Might these be correlated with $concen_{it}$?

g)    Are you convinced that higher concentration on a route increases airfares? What is your best estimate?