EC33092 Econometrics

Lab session 1

Heteroscedasticity

The data file “lab_session2.dta” provides data on a sample of 114 workers in an industrial town in Southern India in 1990. The variables are defined as follows:

WI: weekly wage income in rupees.

Age: age in years.

Female: 1 for female workers and 0 for male workers.

Primary: a dummy variable taking a value of 1 for workers with primary education.

Secondary: a dummy variable taking a value of 1 for workers with secondary education.

College: a dummy variable taking a value of 1 for workers with college education.

Permanent: a dummy variable taking a value of 1 for workers with permanent jobs and a value of 0 for temporary workers.

a) Our interest is in finding out how weekly wages relate to age, gender, level of education and job tenure.

$\ln WI_i = \beta_1 + \beta_2 Age_i + \beta_3 Female_i + \beta_4 Primary_i + \beta_5 Secondary_i + \beta_6 College_i + \beta_7 Permanent_i + u_i$

Interpret the results.

b) Is heteroscedasticity an issue in this specification?

c) Do male workers with higher education earn higher weekly wages than female workers with higher education?

d) Is there a significant difference between workers with primary school and those who are illiterate? Assess this with respect to both the education dummy variable and the interaction term and explain the results. What about the difference between workers with secondary level of education and illiterate workers? What about the difference between those with college education, compared to the illiterate workers?
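
For part b), one common check is the Breusch-Pagan test, which regresses the squared OLS residuals on the regressors and uses LM = n * R-squared; the usual and heteroscedasticity-robust (White, HC0) standard errors can then be compared. The following is a minimal Python sketch of that idea, not the lab's own procedure. It assumes a single regressor and simulated data rather than the full wage equation, and every name and number in it is illustrative.

```python
import random
from statistics import mean

def ols_simple(x, y):
    """OLS of y on a constant and one regressor; returns (intercept, slope, residuals)."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return intercept, slope, resid

def r_squared(x, y):
    """R-squared from the simple regression of y on x."""
    _, _, resid = ols_simple(x, y)
    ybar = mean(y)
    sst = sum((yi - ybar) ** 2 for yi in y)
    return 1.0 - sum(e * e for e in resid) / sst

def breusch_pagan_lm(x, y):
    """LM = n * R^2 from regressing squared OLS residuals on x (chi-square, 1 df here)."""
    _, _, resid = ols_simple(x, y)
    return len(x) * r_squared(x, [e * e for e in resid])

def slope_standard_errors(x, y):
    """Return (usual SE, White HC0 robust SE) for the slope coefficient."""
    _, _, resid = ols_simple(x, y)
    n, xbar = len(x), mean(x)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sigma2 = sum(e * e for e in resid) / (n - 2)
    usual = (sigma2 / sxx) ** 0.5
    robust = sum((xi - xbar) ** 2 * e * e for xi, e in zip(x, resid)) ** 0.5 / sxx
    return usual, robust

# Simulated data (illustrative only): the error spread grows with x.
random.seed(0)
x = [random.uniform(20, 60) for _ in range(500)]
y = [1.0 + 0.02 * xi + random.gauss(0, 0.02 * xi) for xi in x]

lm = breusch_pagan_lm(x, y)
usual_se, robust_se = slope_standard_errors(x, y)
print(f"BP LM statistic: {lm:.2f} (5% critical value, chi-square with 1 df: 3.84)")
print(f"usual SE: {usual_se:.4f}, robust SE: {robust_se:.4f}")
```

With several regressors the same logic applies, but the auxiliary regression includes all of them and the LM statistic has as many degrees of freedom as there are slope coefficients.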

Lab session 2

Probit model

The data file “lab_session3.dta” provides the data on mortgage approvals. The variables are defined as follows:

Approve: "1 if mortgage is approved, 0 otherwise"

White: "1 if applicant white, 0 otherwise"

Hrat: "housing exp, % total inc"

Obrat: "other oblgs, % total inc"

Unem: "unemployment rate by industry"

Male: "1 if applicant male, 0 otherwise"

Married: "1 if applicant married, 0 otherwise"

Dep: "number of dependents"

Sch: "1 if > 12 years schooling, 0 otherwise"

Cosign: "1 if there is a cosigner, 0 otherwise"

Chist: "0 if account delinquent >= 60 days, 1 otherwise"

Pubrec: "1 if filed bankruptcy, 0 otherwise"

Mortlat1: "1 if one or two late payments, 0 otherwise"

Mortlat2: "1 if > 2 late payments, 0 otherwise"

a) Our interest is in testing whether there is discrimination against minorities. Estimate a probit model of approve on white:

$approve_i = \beta_0 + \beta_1 white_i + u_i$

Interpret the results.

b) As controls, add the variables hrat, obrat, loanprc, unem, male, married, dep, sch, cosign, chist, pubrec, mortlat1, mortlat2. What happens to the coefficient on white? Is there still evidence of discrimination against nonwhites?

c) How do these results compare with the linear probability estimates?

d) Use a linear probability model and allow the effect of race to interact with the variable measuring other obligations as a percentage of income (obrat). Is the interaction term significant? Comment on the results.
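
When comparing probit and linear probability estimates as in parts a) and c), it helps to remember that with a single binary regressor both models simply reproduce the approval rate in each group, so the coefficients differ in scale but not in the fitted probabilities. The Python sketch below illustrates this with hypothetical approval rates (they are assumptions, not values from the lab data); the symbols b0, b1, g0 and g1 are illustrative names.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: cdf is Phi, inv_cdf is its inverse

# Hypothetical approval rates, NOT taken from the lab data.
p_nonwhite = 0.70
p_white = 0.90

# Linear probability model: approve = b0 + b1 * white
lpm_b0 = p_nonwhite
lpm_b1 = p_white - p_nonwhite

# Probit: P(approve = 1 | white) = Phi(g0 + g1 * white)
probit_g0 = std_normal.inv_cdf(p_nonwhite)
probit_g1 = std_normal.inv_cdf(p_white) - probit_g0

# Both models imply the same fitted probability for each group.
lpm_fit_white = lpm_b0 + lpm_b1
probit_fit_white = std_normal.cdf(probit_g0 + probit_g1)

print(f"LPM:    b0 = {lpm_b0:.3f}, b1 = {lpm_b1:.3f}")
print(f"Probit: g0 = {probit_g0:.3f}, g1 = {probit_g1:.3f}")
print(f"Fitted P(approve | white): LPM {lpm_fit_white:.3f}, probit {probit_fit_white:.3f}")
```

Once continuous controls are added, the two models no longer coincide, and probit coefficients have to be converted to marginal effects before they can be compared with LPM coefficients.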

Binary DV and Heteroskedasticity Exercise (adapted from Wooldridge Ch. 17, Ex. C1):

For this exercise we are using the PNTSPRD dataset from Wooldridge. The data are for the 1994-1995 men’s college basketball seasons. The spread is for the day before the game was played.

The variable favwin is a binary variable equal to 1 if the team favoured by the Las Vegas point spread wins. The variable spread is the expected difference between the winner’s and loser’s scores, taking all available information into consideration, as presented by Las Vegas gambling institutions.

553 observations on 12 variables:

• favscr: favored team’s score

• undscr: underdog’s score

• spread: Las Vegas spread

• favhome: =1 if favored team at home

• neutral: =1 if neutral site

• fav25: =1 if favored team in top 25

• und25: =1 if underdog in top 25

• fregion: favorite’s region of country

• uregion: underdog’s region of country

• scrdif: favscr - undscr

• sprdcvr: =1 if spread covered

• favwin: =1 if favored team wins

A linear probability model to estimate the probability that the favoured team wins is: $P(favwin = 1 \mid spread) = \beta_0 + \beta_1 spread$

(i) If the spread incorporates all relevant information, what value would we expect $\beta_0$ to equal, and why? Call this the hypothesised value of $\beta_0$.

(ii) Estimate the model from part (i) by OLS.

a. Do you think that heteroskedasticity might be a problem? If so, why?

b. Test for heteroskedasticity.

c. Use both the usual and heteroskedasticity-robust standard errors and discuss the differences.

d. Test $H_0$: $\beta_0$ equals the value expected in part (i), against a two-sided alternative.

(iii) Is spread statistically significant? What is the estimated probability that the favoured team wins when spread = 10?

(iv) Now, estimate a probit model for the same specification as above.

a. To carry out the same test as in part (ii) d, what value of the constant in the probit output should you test against? Conduct this test.

b. Use the probit model to estimate the probability that the favored team wins when spread = 10. Compare this with the LPM estimate from part (iii).

c. What is the average marginal effect of a one-unit increase in the spread on the likelihood of winning? Compare to the LPM.

(v) Add the variables favhome, fav25, and und25 to the probit model and test the joint significance of these variables using the likelihood ratio test. Interpret this result. Do you think the spread incorporates all observable information prior to a game?
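
The mechanics behind parts (iii) to (v) can be sketched in a few lines of Python. The LPM predicts a probability directly, the probit passes the linear index through the standard normal CDF, the average marginal effect averages the normal density times the slope over the sample, and the likelihood ratio statistic is twice the gap in log-likelihoods. All coefficients, spreads and log-likelihoods below are hypothetical placeholders, not estimates from PNTSPRD, and the function names are illustrative.

```python
import math
from statistics import NormalDist, mean

phi = NormalDist()  # standard normal distribution

# All numbers below are hypothetical placeholders, not estimates from PNTSPRD.
lpm_b0, lpm_b1 = 0.55, 0.02
probit_g0, probit_g1 = 0.05, 0.09
spreads = [1.5, 3.0, 5.5, 8.0, 10.0, 14.5, 20.0]

def lpm_prob(spread):
    """Predicted probability from the linear probability model."""
    return lpm_b0 + lpm_b1 * spread

def probit_prob(spread):
    """Predicted probability from the probit model."""
    return phi.cdf(probit_g0 + probit_g1 * spread)

def probit_ame(xs):
    """Average marginal effect: mean of pdf(g0 + g1 * x) * g1 over the sample."""
    return mean(phi.pdf(probit_g0 + probit_g1 * x) * probit_g1 for x in xs)

def chi2_sf_df3(stat):
    """Upper-tail probability of a chi-square with 3 df (closed form)."""
    return (math.erfc(math.sqrt(stat / 2))
            + math.sqrt(2 * stat / math.pi) * math.exp(-stat / 2))

# Likelihood ratio test for three added regressors (hypothetical log-likelihoods).
loglik_restricted, loglik_unrestricted = -250.0, -248.5
lr_stat = 2 * (loglik_unrestricted - loglik_restricted)

print(f"LPM    P(favwin | spread = 10): {lpm_prob(10):.3f}")
print(f"Probit P(favwin | spread = 10): {probit_prob(10):.3f}")
print(f"Probit AME of spread: {probit_ame(spreads):.4f} (LPM slope: {lpm_b1})")
print(f"LR statistic: {lr_stat:.2f}, p-value: {chi2_sf_df3(lr_stat):.3f}")
```

The closed-form tail probability is specific to 3 degrees of freedom, which matches the three added variables in part (v); a different number of restrictions would need a general chi-square routine.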

Time Series:

For this exercise load the time_series.dta dataset from Blackboard. Feel free to look at the notes from Lab 3 to complete the tasks.

1. Declare the data to be time series with t as the time variable.

2. Create a time series graph of y1. What type of process do you think underlies it?

3. Conduct appropriate analysis to determine the underlying process.

4. Create a time series graph of y2. What type of process do you think underlies it?

5. Conduct appropriate analysis to determine the underlying process. Is y2 a unit root process?

6. Do what is necessary to make y2 a stationary process and confirm that this is now the case with the appropriate analysis.
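
The logic of tasks 5 and 6 can be illustrated with a simulated random walk: a unit root series typically fails to reject in a Dickey-Fuller-type regression, while its first difference rejects clearly. The Python sketch below is a simplified version under stated assumptions: it uses simulated data (not y2), a regression with no constant, trend or lagged differences, and illustrative function names. An applied analysis would normally use an augmented test with suitable deterministic terms.

```python
import random

def difference(series):
    """First difference: y_t - y_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]

def dickey_fuller_t(series):
    """t statistic on rho in  dy_t = rho * y_{t-1} + e_t  (no constant, no lags)."""
    dy = difference(series)
    lagged = series[:-1]
    sxx = sum(v * v for v in lagged)
    rho = sum(l * d for l, d in zip(lagged, dy)) / sxx
    resid = [d - rho * l for l, d in zip(lagged, dy)]
    sigma2 = sum(e * e for e in resid) / (len(dy) - 1)
    return rho / (sigma2 / sxx) ** 0.5

# Simulated random walk (illustrative only): cumulative sum of white noise.
random.seed(1)
shocks = [random.gauss(0, 1) for _ in range(300)]
walk = []
level = 0.0
for shock in shocks:
    level += shock
    walk.append(level)

t_level = dickey_fuller_t(walk)
t_diff = dickey_fuller_t(difference(walk))
print(f"DF t statistic, levels:      {t_level:.2f}")
print(f"DF t statistic, differences: {t_diff:.2f}")
print("Approximate 5% critical value (no constant): -1.95")
```

The t statistic must be compared with Dickey-Fuller critical values rather than the usual normal ones, because its distribution is non-standard under the unit root null.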

Lab session 5

Panel data

The data file “lab_session5.dta” provides the data on rental prices and other variables for college towns for the years 1980 and 1990. The idea is to see whether a stronger presence of students affects rental rates.

2. year 80 or 90

3. pop city population

4. enroll # college students enrolled

5. rent average rent

6. rnthsg renter occupied units

7. tothsg occupied housing units

8. avginc per capita income

9. lenroll log(enroll)

10. lpop log(pop)

11. lrent log(rent)

12. ltothsg log(tothsg)

13. lrnthsg log(rnthsg)

14. lavginc log(avginc)

21. pctstu percent of population students

23. y90 =1 if year == 90

a) The unobserved effects model is:

$\log(rent_{it}) = \beta_0 + \delta_0 y90_t + \beta_1 \log(pop_{it}) + \beta_2 \log(avginc_{it}) + \beta_3 pctstu_{it} + a_i + u_{it}$

Estimate the equation by pooled OLS and report the results in standard form. What do you make of the estimate on the 1990 dummy variable? What do you get for $\beta_3$?

b) Are the standard errors you report in part a) valid? Explain.

c) Now, difference the equation and estimate by OLS. Compare your estimate of $\beta_3$ with that from part a). Does the relative size of the student population appear to affect rental prices?

d) Estimate the model by fixed effects to verify that you get identical estimates and standard errors to those in part c).
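
The equivalence in part d) holds because, with only two time periods, demeaning within each city and taking first differences remove the unobserved effect in the same way. The Python sketch below checks this numerically for the coefficients on a small hypothetical panel with one regressor; the data and variable names are invented for illustration and have nothing to do with the rental data.

```python
from statistics import mean

# Hypothetical two-period panel: (y in 1980, y in 1990, x in 1980, x in 1990) per city.
panel = [
    (5.2, 5.9, 10.0, 14.0),
    (5.0, 5.8, 22.0, 21.0),
    (5.5, 6.0, 35.0, 41.0),
    (5.1, 5.9, 8.0, 15.0),
    (5.4, 6.3, 18.0, 30.0),
]

# First differences: regress dy on a constant and dx.
dy = [y90 - y80 for y80, y90, _, _ in panel]
dx = [x90 - x80 for _, _, x80, x90 in panel]
dx_bar, dy_bar = mean(dx), mean(dy)
fd_slope = (sum((a - dx_bar) * (b - dy_bar) for a, b in zip(dx, dy))
            / sum((a - dx_bar) ** 2 for a in dx))
fd_const = dy_bar - fd_slope * dx_bar  # plays the role of the y90 coefficient

# Fixed effects: subtract each city's time mean from y, x and the y90 dummy,
# then regress demeaned y on demeaned x and demeaned y90 (no constant).
rows = []
for y80, y90, x80, x90 in panel:
    y_mean, x_mean, d_mean = (y80 + y90) / 2, (x80 + x90) / 2, 0.5
    rows.append((y80 - y_mean, x80 - x_mean, 0 - d_mean))
    rows.append((y90 - y_mean, x90 - x_mean, 1 - d_mean))

# Solve the 2x2 normal equations by Cramer's rule.
sxx = sum(x * x for _, x, _ in rows)
sxd = sum(x * d for _, x, d in rows)
sdd = sum(d * d for _, _, d in rows)
sxy = sum(x * y for y, x, _ in rows)
sdy = sum(d * y for y, _, d in rows)
det = sxx * sdd - sxd ** 2
fe_slope = (sxy * sdd - sxd * sdy) / det
fe_y90 = (sxx * sdy - sxd * sxy) / det

print(f"First-difference slope: {fd_slope:.6f}, constant: {fd_const:.6f}")
print(f"Fixed-effects slope:    {fe_slope:.6f}, y90 coef: {fe_y90:.6f}")
```

The sketch compares point estimates only; matching the standard errors additionally requires the fixed effects degrees of freedom to account for the estimated city effects.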

Lab session 4

Autocorrelation

Question A

Consider the following regression model of the determinants of the U.S. domestic price of copper (1951-1980):

where:

C= 12-month average U.S. domestic price of copper (cents per pound).

G=annual gross national product ($, billions).

I=12-month average index of industrial production.

L=12-month average of London Metal Exchange price of copper (pounds sterling).

H=number of housing starts per year (thousands of units).

A=12-month average price of aluminium (cents per pound).

Interpret the results.

Question B

Obtain the residuals from the preceding regression.

1. Plot the residuals over time.

2. Plot the residual against the lagged residual.

What can you surmise about the presence of autocorrelation in the residuals?

Question C

Compute the Durbin-Watson statistic and comment on the nature of autocorrelation present in the data.

How would you estimate the model?
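
For Question C, the Durbin-Watson statistic is the sum of squared changes in the residuals divided by the sum of squared residuals, and it is approximately 2(1 - rho), where rho is the first-order autocorrelation of the residuals. The Python sketch below computes it for a made-up residual series (an assumption for illustration, not the copper regression residuals); the function names are illustrative.

```python
def durbin_watson(residuals):
    """DW = sum_t (e_t - e_{t-1})^2 / sum_t e_t^2; values lie between 0 and 4."""
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e * e for e in residuals)
    return num / den

def implied_rho(dw):
    """Approximate first-order autocorrelation from DW ~ 2 * (1 - rho)."""
    return 1 - dw / 2

# Made-up residuals with long runs of the same sign (illustrative only).
residuals = [0.8, 0.6, 0.5, 0.1, -0.3, -0.6, -0.4, -0.1, 0.3, 0.5]

dw = durbin_watson(residuals)
print(f"Durbin-Watson: {dw:.3f}, implied rho: {implied_rho(dw):.3f}")
```

Values well below 2 point to positive autocorrelation and values well above 2 to negative autocorrelation; the formal decision uses the tabulated lower and upper bounds for the given sample size and number of regressors.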

Lab session 8

IV Estimation

Exercise (Wooldridge, 4th edition)

Use the data file “lab_session8.dta”. The variables are defined as follows:

Sibs: "number of siblings"

Educ: "years of education"

Brthord: "birth order"

Log(wage): "natural log of wage"

a)

i. Estimate the return to education for men. Interpret the results.

ii. Consider the variable sibs (number of siblings) as an IV for educ. Discuss whether the IV conditions are satisfied. Estimate the model using sibs as an IV for educ, including the first stage. Interpret the results.

b) Reduced form: To convince yourself that using sibs as an IV is not the same as just plugging sibs in for educ and running an OLS regression, run a regression of log(wage) on sibs and explain your findings. Is the variable sibs a valid instrument? Discuss.

c) The variable brthord is birth order (one for a first-born child, two for a second-born child, and so on). Explain why educ and brthord might be negatively correlated. Estimate the first stage regression of educ on brthord to determine whether there is a statistically significant negative correlation.

d) Use brthord as an IV for educ. Report and interpret the results. Discuss whether you think that brthord is a valid instrument.

e) Now, suppose that we include the number of siblings as an explanatory variable in the wage equation, in order to control for family background. Use brthord as an IV for educ, assuming that sibs is exogenous. State and test the identification assumption.
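
The relationship between the first stage, the reduced form and the IV estimate in parts a) and b) can be seen in a small simulation. With one instrument and one endogenous regressor, the IV estimate is cov(z, y) / cov(z, x), which equals the reduced-form slope divided by the first-stage slope. The Python sketch below uses simulated data with an assumed true return of 0.10 and an unobserved "ability" term that biases OLS. Everything here is hypothetical, including the positive sign of the first stage, and nothing comes from the lab data.

```python
import random
from statistics import mean

def cov(a, b):
    """Sample covariance."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b)) / (len(a) - 1)

def slope(x, y):
    """Slope from the simple OLS regression of y on x (with a constant)."""
    return cov(x, y) / cov(x, x)

# Simulated data (illustrative only).
random.seed(42)
n = 5000
true_return = 0.10
z = [random.gauss(0, 1) for _ in range(n)]        # instrument
ability = [random.gauss(0, 1) for _ in range(n)]  # unobserved, drives the OLS bias
educ = [zi + ai + random.gauss(0, 1) for zi, ai in zip(z, ability)]
lwage = [true_return * e + 0.5 * a + random.gauss(0, 1)
         for e, a in zip(educ, ability)]

ols = slope(educ, lwage)            # biased upward by omitted ability
first_stage = slope(z, educ)        # educ on z
reduced_form = slope(z, lwage)      # log(wage) on z
iv = cov(z, lwage) / cov(z, educ)   # IV estimate

print(f"OLS: {ols:.3f}, IV: {iv:.3f}, true value: {true_return}")
print(f"reduced form / first stage: {reduced_form / first_stage:.3f}")
```

The reduced-form slope on its own measures the effect of the instrument on log(wage), not the return to education, which is why it has to be rescaled by the first-stage slope.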