ECMT6007/ECON4954: Analysis of Panel Data Semester 1, 2022

Published: 2023-05-29

Sample Questions of Final Exam

• These sample questions are not meant to be exhaustive or to cover every aspect of the unit; rather, they illustrate the types of questions to expect in the exam.

• The appendix contains tables of critical values for the t, F, and χ² distributions, which you may use in the final exam (in case you don't have them).

1. Consider the following dynamic linear panel data model:

$y_{it} = \lambda y_{i,t-1} + \beta x_{it} + \eta_i + \varepsilon_{it}$ (1.1)

where $\eta_i$ and $\varepsilon_{it}$ are independently and identically distributed (i.i.d.) with zero means and variances $\sigma_\eta^2$ and $\sigma_\varepsilon^2$, respectively. We are interested in estimating $\beta$ using a random sample $\{y_{it}, x_{it}\}$, $i = 1, \dots, N$, $t = 1, \dots, T$, with $N > T > 2$. Suppose that the explanatory variable $x_{it}$ is predetermined by:

$x_{it} = \gamma x_{i,t-1} + \mu_i + e_{it}$ (1.2)

where $0 < \gamma < 1$, and $\mu_i$ and $e_{it}$ are i.i.d. with zero means and variances $\sigma_\mu^2$ and $\sigma_e^2$, respectively. Suppose that $\eta_i$ and $\mu_i$ are uncorrelated with each other. Both $\varepsilon_{it}$ and $e_{it}$ are i.i.d. noise terms, and each is uncorrelated with every other random variable in the model.

(a) Does the Pooled OLS estimator applied to model (1.1) with $\{y_{it}, x_{it}\}$ typically overestimate or underestimate $\lambda$? Explain briefly.

(b) Is the FD estimator of $\lambda$ consistent, and why? What happens if $x_{it}$ is strictly exogenous?

(d) If we resort to estimation with instrumental variables (either IV or GMM estimators), propose two instruments and briefly illustrate the estimation procedure. Remember to check that the variables you propose are indeed valid instruments.
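
For intuition about model (1.1) and (1.2), here is a minimal simulation sketch. It is not part of the exam. The parameter values (λ = 0.5, β = 1, γ = 0.5, unit variances, normal draws), the burn-in length, and the function names `simulate_panel` and `pooled_ols` are all assumptions chosen for illustration. The sketch generates a panel from the two equations and computes the pooled OLS coefficients, so they can be compared with the true values used in the simulation.

```python
import numpy as np

def simulate_panel(N=2000, T=6, lam=0.5, beta=1.0, gamma=0.5, burn=50, seed=0):
    """Simulate (1.1)-(1.2). All parameter values are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    eta = rng.normal(size=N)   # individual effect in (1.1)
    mu = rng.normal(size=N)    # individual effect in (1.2)
    y = np.zeros(N)
    x = np.zeros(N)
    ys, xs = [], []
    for t in range(burn + T + 1):
        x = gamma * x + mu + rng.normal(size=N)            # equation (1.2)
        y = lam * y + beta * x + eta + rng.normal(size=N)  # equation (1.1)
        if t >= burn:              # discard the burn-in periods
            ys.append(y)
            xs.append(x)
    return np.column_stack(ys), np.column_stack(xs)  # each is N x (T + 1)

def pooled_ols(y, x):
    """Regress y_it on a constant, y_{i,t-1} and x_it, pooling all i and t."""
    Y = y[:, 1:].ravel()
    X = np.column_stack([np.ones(Y.size), y[:, :-1].ravel(), x[:, 1:].ravel()])
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return coef  # [intercept, lambda_hat, beta_hat]

y, x = simulate_panel()
print(pooled_ols(y, x))
```

Changing `seed`, `N` or the variance of the individual effect shows how sensitive the pooled estimate of λ is to the presence of $\eta_i$.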

2. Consider the estimation of the Cobb-Douglas production function of an industry:

$Y_{it} = A_{it} K_{it}^{\beta_1} L_{it}^{\beta_2} M_{it}^{\beta_3}$

where $A_{it}$, $K_{it}$, $L_{it}$ and $M_{it}$ are, respectively, firm $i$'s productivity, capital, labor and intermediate inputs at time $t$. After taking logarithms, we get

$y_{it} = a_{it} + \beta_1 k_{it} + \beta_2 l_{it} + \beta_3 m_{it}$

where lowercase letters denote the logarithms of the corresponding capital letters. We collect a sample of output and input data for $N$ firms, $\{Y_{it}, K_{it}, L_{it}, M_{it}\}$ (or equivalently $\{y_{it}, k_{it}, l_{it}, m_{it}\}$), where $i = 1, \dots, N$ and $t = 1, \dots, T$. Now suppose that $a_{it}$ can be decomposed:

$a_{it} = \beta_0 + a_i + e_{it}$

where $a_i$ is firm $i$'s intrinsic productivity, which does not change over time, and $e_{it}$ is the temporary productivity shock; $\beta_0$ is a global intercept term reflecting the industry average productivity. Firm $i$ cannot observe $e_{it}$ at time $t$, so it makes its input decisions based on $a_i$ and then produces output $y_{it}$. We end up with the regression model (2):

$y_{it} = \beta_0 + \beta_1 k_{it} + \beta_2 l_{it} + \beta_3 m_{it} + a_i + e_{it}$ (2)

(a) Interpret the coefficients $\beta_1$, $\beta_2$ and $\beta_3$ one by one.

(b) Suppose that we want to test whether returns to scale are constant. Write down the null and alternative hypotheses, the test statistic to be used and its distribution under the null (including the degrees of freedom), and the decision rule.

(c) Outline the procedure of the First Difference (FD) estimator for this model. List the assumptions for the FD estimator to be consistent.

(d) Outline the procedure of the Fixed Effects (FE) estimator for this model. List the assumptions for the FE estimator to be consistent.

(e) Under what conditions on the error term $u_{it} = a_i + e_{it}$ will the Random Effects (RE) estimator provide consistent estimates of $(\beta_1, \beta_2, \beta_3)$? Are these assumptions likely to be valid in this model? Explain briefly.

(f) One by one, calculate the degrees of freedom of the Pooled OLS, FD, FE, and RE estimators for model (2) with $T > 2$.
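
As a study aid for parts (c) and (d), the FD and FE procedures for model (2) can be sketched on simulated data. This is a rough illustration under stated assumptions, not a model answer: the coefficient values, the way the inputs are made to depend on $a_i$, the noise scales and the helper name `ols` are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 500, 5
beta = np.array([0.3, 0.5, 0.2])      # assumed (beta1, beta2, beta3)
a = rng.normal(size=(N, 1))           # firm effect a_i

# Inputs (k, l, m) depend on a_i, so they are correlated with the firm effect.
X = 0.8 * a[:, :, None] + rng.normal(size=(N, T, 3))
y = 1.0 + X @ beta + a + 0.5 * rng.normal(size=(N, T))   # model (2), beta0 = 1

def ols(X2d, y1d):
    """Least-squares coefficients of y1d on the columns of X2d."""
    coef, *_ = np.linalg.lstsq(X2d, y1d, rcond=None)
    return coef

# First differences: regress (y_it - y_i,t-1) on (x_it - x_i,t-1), no intercept.
b_fd = ols(np.diff(X, axis=1).reshape(-1, 3), np.diff(y, axis=1).ravel())

# Fixed effects: subtract each firm's time average, then run pooled OLS.
Xw = X - X.mean(axis=1, keepdims=True)
yw = y - y.mean(axis=1, keepdims=True)
b_fe = ols(Xw.reshape(-1, 3), yw.ravel())

# Pooled OLS for comparison (ignores a_i); drop the intercept from the output.
b_pols = ols(np.column_stack([np.ones(N * T), X.reshape(-1, 3)]), y.ravel())[1:]

print("FD:  ", b_fd)
print("FE:  ", b_fe)
print("POLS:", b_pols)
```

Both transformations remove $a_i$ before estimation, which is why the comparison with pooled OLS is informative when the inputs are correlated with $a_i$.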

3. We are interested in analyzing the effect of the government building a new hospital on housing prices in the suburb of Sydenham. Rumors that a new hospital would be built in Sydenham began after 2006, and the hospital was built and began operating in 2008. We have data on the prices of houses sold in Sydenham in 2006 and another sample of houses sold in Sydenham in 2010. The hypothesis we wish to test is that the price of houses located near the site of the new hospital would rise above the price of more distant houses. The data for each year include the dummy variable near, which is equal to one if the house is located within 2 kilometers of the new hospital. House prices, for both years of data, were measured in 2010 prices. The variable rprice denotes the real house price (scaled by $100,000). The following simple regression model was estimated using only the year 2010 sample of data

$\widehat{rprice} = \underset{(0.309)}{10.131} + \underset{(0.788)}{2.688}\,near$ (3.1)

$n = 96$, $R^2 = 0.199$

while the following was estimated using only the 2006 sample of data

$\widehat{rprice} = \underset{(0.265)}{9.252} + \underset{(0.671)}{1.412}\,near$ (3.2)

$n = 105$, $R^2 = 0.106$

(a) Interpret, one by one, the estimates in model (3.2).

(b) Based on the estimates in (3.1) and (3.2), from 2006 to 2010, what is the average price change for all houses in Sydenham?

(c) Explain why we cannot infer from the estimates in (3.1) that the location of the hospital caused the price of nearby houses to increase. What evidence from model (3.2) supports this conclusion?

(d) Using the information from models (3.1) and (3.2), calculate the difference-in-differences estimate of the impact of the new hospital on the price of nearby houses.

(e) Propose a linear regression model that can directly estimate the effect of the new hospital on housing prices.
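
For revision, the link between parts (d) and (e) can be sketched numerically. The first line uses only the two slope coefficients on near reported in (3.1) and (3.2). Everything after it is an assumption made for illustration: the synthetic data, the coefficient values used to generate them, and the names `y2010`, `ols` and `slope`. The sketch checks that the interaction coefficient in one pooled regression coincides with the difference between the two year-by-year slopes.

```python
import numpy as np

# DiD computed from the two reported slopes on `near` (units of $100,000).
did_reported = 2.688 - 1.412
print(round(did_reported, 3))

# Synthetic check; all numbers below are made up for the illustration.
rng = np.random.default_rng(0)
n = 400
near = rng.integers(0, 2, size=n)     # 1 if within 2 km of the hospital site
y2010 = rng.integers(0, 2, size=n)    # 1 if sold in 2010, 0 if sold in 2006
rprice = 9 + 1.0 * y2010 + 1.5 * near + 1.2 * y2010 * near + rng.normal(size=n)

def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

ones = np.ones(n)

# Pooled regression on a constant, year dummy, near, and their interaction.
pooled = ols(np.column_stack([ones, y2010, near, y2010 * near]), rprice)

def slope(mask):
    """Slope on `near` from a simple regression on the selected subsample."""
    return ols(np.column_stack([ones[mask], near[mask]]), rprice[mask])[1]

did_split = slope(y2010 == 1) - slope(y2010 == 0)
print(pooled[3], did_split)
```

Because the pooled model is saturated in the two dummies, the two printed numbers agree up to floating-point error; the pooled form has the practical advantage of also delivering a standard error for the estimate.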

4. Consider a linear panel data model

$y_{it} = x_{it}'\beta + \alpha_i + \gamma_t + e_{it}$

where $\alpha_i$ is a fixed effect and $\gamma_t$ is a time effect.

(a) Write down the model after the within transformation.

(b) State the assumptions for FE to be consistent and show FE is consistent under those conditions.

(c) Show that FE is a generalized least squares (GLS) estimator. In particular, show the transformed regression model (in matrix form) for FE.
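
One common way to handle both $\alpha_i$ and $\gamma_t$ in a balanced panel is two-way demeaning; whether the unit intends this version or time dummies is not stated here, so treat the following as a sketch under that assumption. The function name `two_way_within`, the scalar regressor, the true slope of 2 and the simulated data are all hypothetical.

```python
import numpy as np

def two_way_within(z):
    """Two-way demeaning of a balanced (N, T) array:
    z_it - (mean over t for unit i) - (mean over i for period t) + grand mean."""
    return (z - z.mean(axis=1, keepdims=True)
              - z.mean(axis=0, keepdims=True) + z.mean())

rng = np.random.default_rng(2)
N, T = 200, 4
alpha = rng.normal(size=(N, 1))       # unit effects alpha_i
gamma = rng.normal(size=(1, T))       # time effects gamma_t
effects = alpha + gamma

# Scalar regressor correlated with both effects; the true slope is assumed 2.
x = effects + rng.normal(size=(N, T))
y = 2.0 * x + effects + rng.normal(size=(N, T))

# OLS on the transformed data (single regressor, no intercept needed).
xw, yw = two_way_within(x), two_way_within(y)
b_hat = (xw * yw).sum() / (xw ** 2).sum()

# The transformation wipes out alpha_i + gamma_t up to rounding error.
print(np.abs(two_way_within(effects)).max(), b_hat)
```

The first printed number shows that the additive effects are removed by the transformation, and the second is the resulting slope estimate on the demeaned data.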