
SEMESTER 1 ASSESSMENT, 2022

ECOM30001: Basic Econometrics


Let $y_i$ denote the number of patents applied for by firm $i$ during a given calendar year. The probability density of $y_i$ follows a Poisson distribution:

$$f(y_i \mid \lambda_i) = \frac{\exp(-\lambda_i)\,\lambda_i^{y_i}}{y_i!}$$

with $E(y_i) = \lambda_i$ and $VAR(y_i) = \lambda_i$.
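As a numerical illustration of the equal mean and variance property stated above, the following minimal Python sketch sums the Poisson probabilities directly. The value `lam = 2.5`, the truncation point `y_max`, and the function names are illustrative assumptions, not part of the question.

```python
import math


def poisson_pmf(y, lam):
    """Poisson probability f(y | lam) = exp(-lam) * lam**y / y!."""
    return math.exp(-lam) * lam ** y / math.factorial(y)


def poisson_moments(lam, y_max=100):
    """Approximate the mean and variance by summing the pmf over 0..y_max."""
    probs = [poisson_pmf(y, lam) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var


# lam = 2.5 is a hypothetical value chosen only for illustration.
mean, var = poisson_moments(lam=2.5)
print(mean, var)  # both should be numerically very close to lam
```

Truncating the sum at `y_max` is harmless here because the Poisson probabilities decay very quickly for small values of lambda.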

Assume the conditional expectation of patents is given by:

$$\lambda_i = E[y_i \mid X_i] = \exp\{\beta_0 + \beta_1\, sales_i + \beta_2\, RD_i + \beta_3\, RD_i^2\}$$

where:

y = number of patents applied for by firm i during the calendar year

sales = annual sales of firm i during the calendar year

RD = total expenditure on research & development by firm i during the calendar year

a) What is meant by the term count data? What are the important characteristics of count data?

b) What is the interpretation of the parameter $\beta_1$?

c) Derive an expression for the marginal effect of RD on the conditional mean function E [yi|Xi].

Consider the following simple econometric model:

$$y_t = \beta_0 + \beta_1 X_t + \varepsilon_t$$

What is meant by the term autocorrelation in the random error $\varepsilon_t$? Clearly explain the consequences for the OLS estimator if you ignore the presence of autocorrelation in the random error. Briefly outline how you would test for first order AR(1) autocorrelation in the random error. Your answer should clearly state the null and alternative hypotheses, the test statistic, and its distribution.

Question 3 [5 marks]

Consider the following regression:

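$$\Delta inf_t = \beta_0 + \beta_1\, inf_{t-1} + \beta_2\, \Delta inf_{t-1} + \beta_3\, \Delta inf_{t-2} + \beta_4\, \Delta inf_{t-3} + \beta_5\, \Delta inf_{t-4} + \varepsilon_t$$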

where inf represents the quarterly inflation rate. This econometric model was estimated using the method of Ordinary Least Squares (OLS) for the period 1960:Q4 to 2015:Q4 and the results are presented in Figure 1.

 

Figure 1: Question 3: OLS Regression Results

Outline how you would test whether the series for the quarterly inflation rate was stationary or not. Your answer should clearly state the null and alternative hypotheses, the test statistic and its distribution. Using the results in Figure 1, what is the value of the Augmented Dickey-Fuller test statistic? At the 5% level of significance, explain whether the sample evidence is consistent with the null hypothesis. Based upon the estimation results presented in Figure 1, the p-value associated with the Augmented Dickey-Fuller test is p = 0.0072.

Consider the following econometric model:

$$y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$

What is meant by the term heteroskedasticity? What are the consequences for the OLS estimator if you ignore heteroskedasticity in the random error $\varepsilon_i$? Briefly outline how you would test for the presence of heteroskedasticity using White's test. Your answer should clearly state the null and alternative hypotheses, the test statistic and its distribution.

Consider the following labour demand equation for married women:

$$\ln wage_i = \beta_0 + \beta_1\, hours_i + \beta_2\, educ_i + \beta_3\, exper_i + \beta_4\, exper_i^2 + \varepsilon_{wi} \qquad (1)$$

where ln wage is the natural logarithm of the hourly wage for individual i and:

hours = annual hours of work

educ = years of education

exper = years of labour market experience

Consider the following labour supply equation for married women:

$$hours_i = \alpha_0 + \alpha_1 \ln wage_i + \alpha_2\, kidsl6_i + \alpha_3\, kids618_i + \alpha_4\, faminc_i + \varepsilon_{hi} \qquad (2)$$

where:

kidsl6 = number of children less than 6 in household

kids618 = number of children aged 6-18 in household

faminc = household income from all sources excluding employment income of individual i

The reduced form equations for this demand-supply system are given by:

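$$\ln wage_i = \pi_{w0} + \pi_{w1}\, educ_i + \pi_{w2}\, exper_i + \pi_{w3}\, exper_i^2 + \pi_{w4}\, kidsl6_i + \pi_{w5}\, kids618_i + \pi_{w6}\, faminc_i + v_{wi}$$

and:

$$hours_i = \pi_{h0} + \pi_{h1}\, educ_i + \pi_{h2}\, exper_i + \pi_{h3}\, exper_i^2 + \pi_{h4}\, kidsl6_i + \pi_{h5}\, kids618_i + \pi_{h6}\, faminc_i + v_{hi}$$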

a) [4 marks] Consider the econometric model (1). Do you think the condition:

$$COV(hours, \varepsilon_w \mid educ, exper) = 0$$

is likely to be satisfied? Explain why or why not. Outline three possible reasons why this condition might not be satisfied. Explain the consequences for the OLS estimator if this condition is not satisfied.

b) [5 marks] Clearly explain whether the labour demand equation (1) satisfies the necessary condition for identification. Why or why not? At the 5% level, test the hypothesis that the necessary condition(s) for identification of the labour demand function are satisfied. Your answer should clearly state the null and alternative hypotheses, the distribution of the test statistic, and your conclusion.

c) [5 marks] Clearly explain whether the labour supply equation (2) satisfies the necessary condition for identification. Why or why not? At the 5% level, test the hypothesis that the necessary condition(s) for identification of the labour supply function are satisfied. Your answer should clearly state the null and alternative hypotheses, the distribution of the test statistic, and your conclusion.

Linear hypothesis test

Hypothesis:
educ = 0
exper = 0
expersq = 0

Model 1: restricted model
Model 2: lnwage ~ educ + exper + expersq + kidsl6 + kids618 + faminc

Note: Coefficient covariance matrix supplied.

  Res.Df Df      F             Pr(>F)
1    424
2    421  3 22.364 0.0000000000001872 ***

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Figure 2: Wald Test of Hypothesis: $H_0: \pi_{w1} = \pi_{w2} = \pi_{w3} = 0$

Linear hypothesis test

Hypothesis:
kidsl6 = 0
kids618 = 0
faminc = 0

Model 1: restricted model
Model 2: hours ~ educ + exper + expersq + kidsl6 + kids618 + faminc

Note: Coefficient covariance matrix supplied.

  Res.Df Df      F Pr(>F)
1    424
2    421  3 1.6272 0.1824

Figure 3: Wald Test of Hypothesis: $H_0: \pi_{h4} = \pi_{h5} = \pi_{h6} = 0$

d) [3 marks] Clearly explain what is meant by the Weak Instruments problem. Explain the consequences for statistical inference using the method of Two-Stage Least Squares (2SLS) with weak instruments. In light of your answers to parts b) and c) above, do you think there is a weak instrument problem associated with the estimation of the labour demand and labour supply system? Clearly explain why or why not.

e) [3 marks] The econometric model (2) was estimated by the method of Two-Stage Least Squares (2SLS) and the results are reported in Figure 4.

 

Figure 4: 2SLS Regression Results (with robust standard errors) for Model (2)

At the 5% level of significance, test the hypothesis that a 10% increase in the hourly wage increases annual hours of work by more than 50 hours. Your answer should clearly state the null and alternative hypotheses, the distribution of the test statistic, and your decision.

Consider the following econometric model:

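$$\ln fare_{it} = \beta_0 + \beta_1\, concen_{it} + \beta_2\, lndist_i + \beta_3\, lndist_i^2 + \beta_4\, YEAR98_t + \beta_5\, YEAR99_t + \beta_6\, YEAR00_t + \varepsilon_{it} \qquad (3)$$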

where:

fare = average fare on route i in period t, in dollars

concen = market concentration on route i in period t, measured by the market share of the largest carrier

dist = average distance of route i, in miles

YEAR98 = 1 if year = 1998, 0 otherwise

YEAR99 = 1 if year = 1999, 0 otherwise

YEAR00 = 1 if year = 2000, 0 otherwise

Suppose you have a dataset of 4,228 observations on 1,057 different airline routes over 4 years.

a) [2 marks] Suppose you estimate model (3) using Ordinary Least Squares (OLS). Do you think that the standard errors are valid? Clearly explain why or why not.

b) Consider the following alternative econometric model:

$$\ln fare_{it} = \beta_0 + \beta_1\, concen_{it} + \beta_2\, lndist_i + \beta_3\, lndist_i^2 + \beta_4\, YEAR98_t + \beta_5\, YEAR99_t + \beta_6\, YEAR00_t + u_i + \varepsilon_{it} \qquad (4)$$

where $u_i$ represents an unobserved time invariant random variable.

i) [4 marks] Suppose you estimate this econometric model using the Random Effects (RE) estimator. Clearly explain the assumption about the relationship between $u_i$ and $concen_{it}$ that is imposed when estimating the model using the Random Effects (RE) estimator. Clearly explain, with an example, whether you think that this is a realistic assumption.

ii) [3 marks] The Random Effects (RE) estimator nests the Pooled OLS model (3) when $\sigma_u^2 = 0$. Test the hypothesis that the pooled OLS model is the most appropriate model for the data. Your answer should clearly state the null and alternative hypotheses, the distribution of the test statistic, and your decision.

Lagrange Multiplier Test - (Breusch-Pagan) for balanced panels

data:  lnfare ~ concen + lndist + lndistsq + year1998 + year1999 + year2000
chisq = 5087.5, df = 1, p-value < 0.00000000000000022
alternative hypothesis: significant effects

Figure 5: LM Test for Random Effects Model (4)

iii) [8 marks] Clearly outline the important differences between the Random Effects (RE) estimator and the Fixed Effects (FE) estimator. Your answer should clearly explain the variation in the data that is used to identify the parameters of interest.

c) [3 marks] Consider the following alternative Correlated Random Effects (CRE) model:

$$\ln fare_{it} = \beta_0 + \beta_1\, concen_{it} + \beta_2\, lndist_i + \beta_3\, lndist_i^2 + \beta_4\, YEAR98_t + \beta_5\, YEAR99_t + \beta_6\, YEAR00_t + \theta_6\, \overline{concen}_i + \eta_i + \varepsilon_{it} \qquad (5)$$

where $\overline{concen}_i$ represents the mean value for market concentration on route i and $\eta_i$ represents an unobserved time invariant random variable that satisfies the restriction $COV(\eta_i, X_{it}) = 0$ for each of the explanatory variables in model (5).

The estimation results for model (5) are presented in Figure 6.

 

Figure 6: CRE Results (with cluster-robust standard errors) for Model (5)

Using the results in Figure 6, test the hypothesis that the Random Effects (RE) estimator is the most appropriate model using a t-test at the 5% level of significance. Your answer should clearly state the null and alternative hypotheses, the distribution of the test statistic, and your decision.

Consider a dataset of 3,580 workers with the following variables:

wage = hourly wage, in dollars

educ = years of education

exper = years of labour market experience

union = 1 if member of a union, 0 otherwise

non-metro = 1 if live in a non-metropolitan area, 0 otherwise

Consider the following latent variable model determining the choice to join a union:

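$$union_i^* = \beta_0 + \beta_1\, educ_i + \beta_2\, exper_i + \beta_3\, exper_i^2 + \beta_4 \ln wage_i + \beta_5\, nonmetro_i + \varepsilon_i \qquad (6)$$

where $\varepsilon_i \mid X \sim N(0, \sigma_\varepsilon^2)$.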

This suggests the following econometric model:

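$$union_i = \beta_0 + \beta_1\, educ_i + \beta_2\, exper_i + \beta_3\, exper_i^2 + \beta_4 \ln wage_i + \beta_5\, nonmetro_i + \varepsilon_i \qquad (7)$$

where:

$$union_i = \begin{cases} 1 & \text{if } union_i^* \geq 0 \\ 0 & \text{if } union_i^* < 0 \end{cases}$$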

The parameters of model (7) were estimated as a Probit Model and the results are presented in Figure 7.

Note that the probability density function for a standard normal variable Z is given by:

 

$$\phi(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right)$$

a) [4 marks] Let $\hat{p}_i$ represent the predicted probability that an individual is a member of a union, based upon their observed characteristics. Consider the following decision rule:

$$\widehat{union}_i = \begin{cases} 1 & \text{if } \hat{p}_i \geq 0.5 \\ 0 & \text{if } \hat{p}_i < 0.5 \end{cases}$$

true     predicted    frequency
0        0            2,618
1        0              936
0        1               16
1        1               10
TOTAL                 3,580

Table 1: Predicted Probability Threshold $\hat{p}_i \geq 0.5$

Based upon the information in Table 1, calculate the percentage of outcomes that are correctly predicted. Using Table 1, comment on the usefulness of the model in predicting union = 1 or union = 0.


Figure 7: Probit Estimation Results for Model (7)

b) [5 marks] Calculate the marginal effect for years of education (educ) for a worker living in a metropolitan area with 13 years of education, 10 years of labour market experience and an hourly wage of $7 per hour.

c) [5 marks] Explain how you would calculate the average marginal effect (AME) for living in a non-metropolitan area (non-metro) for an individual with 13 years of education, 10 years of labour market experience, and an hourly wage of $7 per hour. You have not been provided with enough information to actually calculate this marginal effect so do not attempt to calculate the marginal effect. Instead, your answer should clearly explain how you would calculate this marginal effect.

d) [6 marks] An alternative to the Probit model is the Linear Probability Model (LPM). Provide a brief outline of the Linear Probability Model. Explain the main advantages and disadvantages of using this linear (OLS) specification, compared to an alternative non-linear procedure, such as the Probit model.

Some Useful Formulas

Variance of the Sum of Two Random Variables

$$VAR(aX + bY) = a^2\, VAR(X) + b^2\, VAR(Y) + 2ab\, COV(X, Y)$$

$$VAR(aX - bY) = a^2\, VAR(X) + b^2\, VAR(Y) - 2ab\, COV(X, Y)$$
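The two identities above also hold exactly for sample variances and covariances computed with the usual N - 1 divisor. A minimal Python sketch of a numerical check follows; the data and the constants `a` and `b` are hypothetical values chosen only for illustration.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def sample_var(x):
    """Sample variance with the N - 1 divisor."""
    xbar = mean(x)
    return sum((xi - xbar) ** 2 for xi in x) / (len(x) - 1)


def sample_cov(x, y):
    """Sample covariance with the N - 1 divisor."""
    xbar, ybar = mean(x), mean(y)
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (len(x) - 1)


# Hypothetical data and constants, for illustration only.
x = [1.0, 2.0, 4.0, 7.0]
y = [3.0, 1.0, 5.0, 2.0]
a, b = 2.0, 3.0

# Left-hand sides: variance of the combined series.
lhs_sum = sample_var([a * xi + b * yi for xi, yi in zip(x, y)])
lhs_diff = sample_var([a * xi - b * yi for xi, yi in zip(x, y)])

# Right-hand sides: the formulas for the sum and the difference.
common = a ** 2 * sample_var(x) + b ** 2 * sample_var(y)
rhs_sum = common + 2 * a * b * sample_cov(x, y)
rhs_diff = common - 2 * a * b * sample_cov(x, y)

print(lhs_sum, rhs_sum)
print(lhs_diff, rhs_diff)
```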

Sample Variance

$$\widehat{VAR}(X) = \frac{\sum (X_i - \bar{X})^2}{N - 1}$$

Sample Covariance

$$\widehat{COV}(X, Y) = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{N - 1}$$

Multiple Linear Regression Model

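$$y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_K X_{Ki} + \varepsilon_i$$

OLS Residuals

$$e_i = y_i - (b_0 + b_1 X_{1i} + b_2 X_{2i} + \dots + b_K X_{Ki})$$

Estimator of Error Variance

$$\hat{\sigma}^2 = \frac{\sum e_i^2}{N - K - 1} = \frac{RSS}{N - K - 1}$$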

where N denotes the sample size, and (K + 1) the number of estimated parameters in the model, including the intercept.

Sample t Statistic

$$t = \frac{b_k - \beta_k}{se(b_k)}$$

where $\beta_k$ is the hypothesised value under the null hypothesis.

Sample F-statistic

$$F = \frac{(RSS_r - RSS_{ur})/M}{RSS_{ur}/(N - K - 1)} = \frac{(R^2_{ur} - R^2_r)/M}{(1 - R^2_{ur})/(N - K - 1)}$$

when the dependent variable in both the restricted and unrestricted model are the same. Here M denotes the number of restrictions, N denotes the sample size, and (K + 1) the number of estimated parameters in the unrestricted model, including the intercept term.
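The equivalence of the RSS form and the R-squared form of the F-statistic can be checked numerically. The sketch below is illustrative only: the values of TSS, RSS, M, N and K are hypothetical and are not taken from any figure in this paper.

```python
def f_stat_from_rss(rss_r, rss_ur, m, n, k):
    """F-statistic from restricted and unrestricted residual sums of squares."""
    return ((rss_r - rss_ur) / m) / (rss_ur / (n - k - 1))


def f_stat_from_r2(r2_r, r2_ur, m, n, k):
    """F-statistic from restricted and unrestricted R-squared values."""
    return ((r2_ur - r2_r) / m) / ((1 - r2_ur) / (n - k - 1))


# Hypothetical inputs: both models share the same dependent variable,
# so they share the same TSS and R^2 = 1 - RSS / TSS in each model.
tss, rss_r, rss_ur = 200.0, 120.0, 100.0
m, n, k = 3, 428, 6

f_rss = f_stat_from_rss(rss_r, rss_ur, m, n, k)
f_r2 = f_stat_from_r2(1 - rss_r / tss, 1 - rss_ur / tss, m, n, k)
print(f_rss, f_r2)  # the two forms should agree
```

The two forms coincide only because TSS cancels, which is why the formula requires the same dependent variable in both models.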


Sample F-statistic for Test of Overall Significance

$$F = \frac{(TSS - RSS)/K}{RSS/(N - K - 1)} = \frac{R^2/K}{(1 - R^2)/(N - K - 1)}$$

with:

$$TSS = (N - 1) \times \hat{\sigma}_y^2$$

where $\hat{\sigma}_y^2$ is the sample variance of the dependent variable y.

Here N denotes the sample size, and K the number of estimated parameters in the model, excluding the intercept term.

Goodness of Fit

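$$R^2 = 1 - \frac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2} = 1 - \frac{RSS}{TSS}$$

and:

$$\bar{R}^2 = 1 - \frac{RSS/(N - K - 1)}{TSS/(N - 1)} = 1 - \frac{\hat{\sigma}^2}{\hat{\sigma}_y^2}$$

where $\hat{\sigma}^2$ is the estimator of the error variance and $\hat{\sigma}_y^2$ is the sample variance of the dependent variable y.

Here N denotes the sample size, and (K + 1) the number of estimated parameters in the model, including the intercept term.

Note:

$$\bar{R}^2 = 1 - \left(\frac{(1 - R^2)(N - 1)}{N - K - 1}\right)$$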
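The two expressions for the adjusted R-squared are algebraically identical, which a short numerical check can confirm. This is a minimal sketch; the values of RSS, TSS, N and K are hypothetical.

```python
def r_squared(rss, tss):
    """R^2 = 1 - RSS / TSS."""
    return 1 - rss / tss


def adj_r2_from_rss(rss, tss, n, k):
    """Adjusted R^2 as one minus the ratio of the two variance estimators."""
    return 1 - (rss / (n - k - 1)) / (tss / (n - 1))


def adj_r2_from_r2(r2, n, k):
    """Adjusted R^2 written in terms of R^2, N and K."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical values, for illustration only.
rss, tss, n, k = 100.0, 250.0, 50, 4

r2 = r_squared(rss, tss)
adj_a = adj_r2_from_rss(rss, tss, n, k)
adj_b = adj_r2_from_r2(r2, n, k)
print(r2, adj_a, adj_b)
```

With K of at least 1 and R-squared below 1, the adjusted value is strictly smaller than R-squared, reflecting the degrees-of-freedom penalty.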

White Test for Heteroskedasticity

Econometric Model of Interest

 

$$y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_K X_{Ki} + \varepsilon_i$$

Auxiliary Regression

$$e_i^2 = \gamma_0 + \gamma_1 Z_{1i} + \gamma_2 Z_{2i} + \dots + \gamma_M Z_{Mi} + u_i$$

where $e_i$ denote the OLS residuals from the Econometric Model of Interest above.

The test statistic is $N R^2 \sim \chi^2(M)$ where N is the sample size, $R^2$ is the $R^2$ from this auxiliary regression, and M + 1 is the number of parameters in the auxiliary regression (including the intercept).
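The mechanics of the White test can be sketched in plain Python for the special case of a single regressor, where the auxiliary regressors are X and X squared (so M = 2). The helper functions `solve`, `ols` and `white_statistic`, and the data, are hypothetical illustrations; in practice a statistics package would normally be used.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(rows, y):
    """OLS with an intercept; returns (coefficients, residuals, R^2)."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[j] * d[k] for d in design) for k in range(p)] for j in range(p)]
    xty = [sum(d[j] * yi for d, yi in zip(design, y)) for j in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * dj for bj, dj in zip(beta, d)) for d in design]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    ybar = sum(y) / len(y)
    tss = sum((yi - ybar) ** 2 for yi in y)
    rss = sum(e ** 2 for e in resid)
    return beta, resid, 1 - rss / tss


def white_statistic(x, y):
    """N * R^2 from regressing squared OLS residuals on x and x^2."""
    _, resid, _ = ols([[xi] for xi in x], y)
    e2 = [e ** 2 for e in resid]
    _, _, r2_aux = ols([[xi, xi ** 2] for xi in x], e2)
    return len(y) * r2_aux


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [2.1, 3.9, 6.5, 7.2, 11.0, 10.9, 16.3, 13.8, 21.0, 17.1]

stat = white_statistic(x, y)
CHI2_5PCT_DF2 = 5.991  # 5% critical value of the chi-square(2) distribution
print(stat, stat > CHI2_5PCT_DF2)
```

The null hypothesis of homoskedasticity is rejected at the 5% level when the statistic exceeds the chi-square critical value with M degrees of freedom.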


Probit Model

Latent Variable Formulation

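$$y_i^* = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_K X_{Ki} + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma_\varepsilon^2)$$

Response Probability

$$\Pr(Y_i = 1) = \Phi\left(\frac{\beta_0}{\sigma_\varepsilon} + \frac{\beta_1}{\sigma_\varepsilon} X_{1i} + \frac{\beta_2}{\sigma_\varepsilon} X_{2i} + \dots + \frac{\beta_K}{\sigma_\varepsilon} X_{Ki}\right)$$

where $\Phi(\cdot)$ is the cumulative distribution function for the standard normal distribution.

Marginal Effect if $X_j$ is a continuous variable

$$\frac{\partial \Pr(Y_i = 1)}{\partial X_{ji}} = \phi\left(\frac{\beta_0}{\sigma_\varepsilon} + \frac{\beta_1}{\sigma_\varepsilon} X_{1i} + \frac{\beta_2}{\sigma_\varepsilon} X_{2i} + \dots + \frac{\beta_K}{\sigma_\varepsilon} X_{Ki}\right) \frac{\beta_j}{\sigma_\varepsilon}$$

where $\phi(\cdot)$ is the probability density function for the standard normal distribution.

Marginal Effect if $X_j$ is an indicator (dummy) variable

$$\Delta \Pr(Y_i = 1) = \Phi\left(\frac{\beta_0}{\sigma_\varepsilon} + \frac{\beta_1}{\sigma_\varepsilon} X_{1i} + \dots + \frac{\beta_j}{\sigma_\varepsilon} X_{ji} + \dots + \frac{\beta_K}{\sigma_\varepsilon} X_{Ki}\right)\Bigg|_{X_{ji} = 1} - \Phi\left(\frac{\beta_0}{\sigma_\varepsilon} + \frac{\beta_1}{\sigma_\varepsilon} X_{1i} + \dots + \frac{\beta_j}{\sigma_\varepsilon} X_{ji} + \dots + \frac{\beta_K}{\sigma_\varepsilon} X_{Ki}\right)\Bigg|_{X_{ji} = 0}$$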
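The two marginal effect formulas above can be sketched in Python using only the standard library. The coefficient vector and covariate values below are hypothetical and are not the estimates from any figure in this paper; the coefficients are assumed to be already scaled by the error standard deviation, as Probit estimates are.

```python
import math


def norm_pdf(z):
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def index(beta, x):
    """Linear index; beta[0] is the intercept, beta[1:] pairs with x."""
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))


def me_continuous(beta, x, j):
    """Marginal effect of continuous regressor j (0-based position in x)."""
    return norm_pdf(index(beta, x)) * beta[j + 1]


def me_dummy(beta, x, j):
    """Change in probability when dummy regressor j switches from 0 to 1."""
    x_on = list(x)
    x_on[j] = 1.0
    x_off = list(x)
    x_off[j] = 0.0
    return norm_cdf(index(beta, x_on)) - norm_cdf(index(beta, x_off))


# Hypothetical scaled coefficients: intercept, one continuous regressor, one dummy.
beta = [-0.5, 0.1, 0.3]
x = [2.0, 0.0]

effect_continuous = me_continuous(beta, x, 0)
effect_dummy = me_dummy(beta, x, 1)
print(effect_continuous, effect_dummy)
```

Both effects depend on the values at which the other regressors are held, so they must be evaluated at stated characteristics or averaged over the sample.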


LM Test for First Order AR(1) Autocorrelation

Econometric Model of Interest

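$$y_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + \dots + \beta_K X_{Kt} + \varepsilon_t$$

Auxiliary Regression

$$e_t = \gamma_0 + \gamma_1 X_{1t} + \gamma_2 X_{2t} + \dots + \gamma_K X_{Kt} + \rho\, e_{t-1} + u_t$$

The test statistic is $T R^2 \sim \chi^2(1)$ where T is the sample size, and $R^2$ is the $R^2$ from this auxiliary regression.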

Augmented Dickey-Fuller test

With intercept, no trend:

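$$\Delta y_t = \alpha + \gamma\, y_{t-1} + \sum_{s=1}^{m} \delta_s\, \Delta y_{t-s} + u_t$$

With intercept and trend:

$$\Delta y_t = \alpha + \lambda t + \gamma\, y_{t-1} + \sum_{s=1}^{m} \delta_s\, \Delta y_{t-s} + u_t$$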



Critical Values for the 5% Upper Tail Probabilities of the F Distribution