ECOM30001: Basic Econometrics 2022
SEMESTER 1 ASSESSMENT, 2022
ECOM30001: Basic Econometrics
Let $y_i$ denote the number of patents applied for by firm $i$ during a given calendar year. The probability density of $y_i$ follows a Poisson distribution:
$$f(y_i \mid \lambda_i) = \frac{\exp(-\lambda_i)\,\lambda_i^{y_i}}{y_i!}$$
with $E(y_i) = \lambda_i$ and $VAR(y_i) = \lambda_i$.
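As a quick numerical illustration of the Poisson property that the mean equals the variance, the following minimal sketch sums the probability function over a truncated support. The value of the rate parameter (2.5) and the truncation point are illustrative assumptions, not values from the exam.

```python
import math

def poisson_pmf(y, lam):
    """Poisson probability exp(-lam) * lam**y / y! for a non-negative integer y."""
    return math.exp(-lam) * lam ** y / math.factorial(y)

def poisson_moments(lam, y_max=100):
    """Approximate the mean and variance by summing over y = 0, ..., y_max."""
    probs = [poisson_pmf(y, lam) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

# lam = 2.5 is a hypothetical rate chosen only for illustration
mean, var = poisson_moments(2.5)
print(round(mean, 6), round(var, 6))
```

Both numbers should agree with the chosen rate up to truncation and rounding error.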
Assume the conditional expectation of patents is given by:
$$\lambda_i = E[y_i \mid X_i] = \exp\{\beta_0 + \beta_1\,\text{sales}_i + \beta_2\,RD_i + \beta_3\,RD_i^2\}$$
where:
y = number of patents applied for by firm i during the calendar year
sales = annual sales of firm i during the calendar year
RD = total expenditure on research & development by firm i during the calendar year
a) What is meant by the term count data? What are the important characteristics of count data?
b) What is the interpretation of the parameter $\beta_1$?
c) Derive an expression for the marginal effect of RD on the conditional mean function E [yi|Xi].
Consider the following simple econometric model:
$$y_t = \beta_0 + \beta_1 x_t + \varepsilon_t$$
What is meant by the term autocorrelation in the random error $\varepsilon_t$? Clearly explain the consequences for the OLS estimator if you ignore the presence of autocorrelation in the random error. Briefly outline how you would test for first order AR(1) autocorrelation in the random error. Your answer should clearly state the null and alternative hypotheses, the test statistic, and its distribution.
Question 3 [5 marks]
Consider the following regression:
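$$\Delta \text{inf}_t = \beta_0 + \beta_1\,\text{inf}_{t-1} + \beta_2\,\Delta \text{inf}_{t-1} + \beta_3\,\Delta \text{inf}_{t-2} + \beta_4\,\Delta \text{inf}_{t-3} + \beta_5\,\Delta \text{inf}_{t-4} + \varepsilon_t$$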
where inf represents the quarterly inflation rate. This econometric model was estimated using the method of Ordinary Least Squares (OLS) for the period 1960:Q4 to 2015:Q4 and the results are presented in Figure 1.
Figure 1: Question 3: OLS Regression Results
Outline how you would test whether the series for the quarterly inflation rate was stationary or not. Your answer should clearly state the null and alternative hypotheses, the test statistic and its distribution. Using the results in Figure 1, what is the value of the Augmented Dickey-Fuller test statistic? At the 5% level of significance, explain whether the sample evidence is consistent with the null hypothesis. Based upon the estimation results presented in Figure 1, the p-value associated with the Augmented Dickey-Fuller test is p = 0.0072.
Consider the following econometric model:
$$y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$
What is meant by the term heteroskedasticity? What are the consequences for the OLS estimator if you ignore heteroskedasticity in the random error $\varepsilon_i$? Briefly outline how you would test for the presence of heteroskedasticity using White's test. Your answer should clearly state the null and alternative hypotheses, the test statistic and its distribution.
Consider the following labour demand equation for married women:
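$$\ln \text{wage}_i = \beta_0 + \beta_1\,\text{hours}_i + \beta_2\,\text{educ}_i + \beta_3\,\text{exper}_i + \beta_4\,\text{exper}_i^2 + \varepsilon_{wi} \qquad (1)$$
where $\ln \text{wage}_i$ is the natural logarithm of the hourly wage for individual $i$ and: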
hours = annual hours of work
educ = years of education
exper = years of labour market experience
Consider the following labour supply equation for married women:
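$$\text{hours}_i = \alpha_0 + \alpha_1 \ln \text{wage}_i + \alpha_2\,\text{kidsl6}_i + \alpha_3\,\text{kids618}_i + \alpha_4\,\text{faminc}_i + \varepsilon_{hi} \qquad (2)$$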
where:
kidsl6 = number of children less than 6 in household
kids618 = number of children aged 6-18 in household
faminc = household income from all sources excluding employment income of individual i
The reduced form equations for this demand-supply system are given by:
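$$\ln \text{wage}_i = \pi_{w0} + \pi_{w1}\,\text{educ}_i + \pi_{w2}\,\text{exper}_i + \pi_{w3}\,\text{exper}_i^2 + \pi_{w4}\,\text{kidsl6}_i + \pi_{w5}\,\text{kids618}_i + \pi_{w6}\,\text{faminc}_i + v_{wi}$$
and:
$$\text{hours}_i = \pi_{h0} + \pi_{h1}\,\text{educ}_i + \pi_{h2}\,\text{exper}_i + \pi_{h3}\,\text{exper}_i^2 + \pi_{h4}\,\text{kidsl6}_i + \pi_{h5}\,\text{kids618}_i + \pi_{h6}\,\text{faminc}_i + v_{hi}$$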
a) [4 marks] Consider the econometric model (1). Do you think the condition:
$$COV(\text{hours}, \varepsilon_w \mid \text{educ}, \text{exper}) = 0$$
is likely to be satisfied? Explain why or why not. Outline three possible reasons why this condition might not be satisfied. Explain the consequences for the OLS estimator if this condition is not satisfied.
b) [5 marks] Clearly explain whether the labour demand equation (1) satisfies the necessary condition for identification. Why or why not? At the 5% level, test the hypothesis that the necessary condition(s) for identification of the labour demand function are satisfied. Your answer should clearly state the null and alternative hypotheses, the distribution of the test statistic, and your conclusion.
c) [5 marks] Clearly explain whether the labour supply equation (2) satisfies the necessary condition for identification. Why or why not? At the 5% level, test the hypothesis that the necessary condition(s) for identification of the labour supply function are satisfied. Your answer should clearly state the null and alternative hypotheses, the distribution of the test statistic, and your conclusion.
Linear hypothesis test
Hypothesis:
educ = 0
exper = 0
expersq = 0
Model 1: restricted model
Model 2: lnwage ~ educ + exper + expersq + kidsl6 + kids618 + faminc
Note: Coefficient covariance matrix supplied.
  Res.Df Df      F             Pr(>F)
1    424
2    421  3 22.364 0.0000000000001872 ***
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Figure 2: Wald Test of Hypothesis: $H_0: \pi_{w1} = \pi_{w2} = \pi_{w3} = 0$
Linear hypothesis test
Hypothesis:
kidsl6 = 0
kids618 = 0
faminc = 0
Model 1: restricted model
Model 2: hours ~ educ + exper + expersq + kidsl6 + kids618 + faminc
Note: Coefficient covariance matrix supplied.
  Res.Df Df      F Pr(>F)
1    424
2    421  3 1.6272 0.1824
Figure 3: Wald Test of Hypothesis: $H_0: \pi_{h4} = \pi_{h5} = \pi_{h6} = 0$
d) [3 marks] Clearly explain what is meant by the Weak Instruments problem. Explain the consequences for statistical inference using the method of Two-Stage Least Squares (2SLS) with weak instruments. In light of your answers to parts b) and c) above, do you think there is a weak instrument problem associated with the estimation of the labour demand and labour supply system? Clearly explain why or why not.
e) [3 marks] The econometric model (2) was estimated by the method of Two-Stage Least Squares (2SLS) and the results are reported in Figure 4.
Figure 4: 2SLS Regression Results (with robust standard errors) for Model (2)
At the 5% level of significance, test the hypothesis that a 10% increase in the hourly wage increases annual hours of work by more than 50 hours. Your answer should clearly state the null and alternative hypotheses, the distribution of the test statistic, and your decision.
Consider the following econometric model:
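$$\ln \text{fare}_{it} = \beta_0 + \beta_1\,\text{concen}_{it} + \beta_2\,\text{lndist}_i + \beta_3\,\text{lndist}_i^2 + \beta_4\,YEAR98_t + \beta_5\,YEAR99_t + \beta_6\,YEAR00_t + \varepsilon_{it} \qquad (3)$$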
where:
fare = average fare on route i in period t, in dollars
concen = market concentration on route i in period t, measured by the market share of the largest carrier
dist = average distance of route i, in miles
YEAR98 = 1 if year = 1998, 0 otherwise
YEAR99 = 1 if year = 1999, 0 otherwise
YEAR00 = 1 if year = 2000, 0 otherwise
Suppose you have a dataset of 4,228 observations on 1,057 different airline routes over 4 years.
a) [2 marks] Suppose you estimate model (3) using Ordinary Least Squares (OLS). Do you think that the standard errors are valid? Clearly explain why or why not.
b) Consider the following alternative econometric model:
$$\ln \text{fare}_{it} = \beta_0 + \beta_1\,\text{concen}_{it} + \beta_2\,\text{lndist}_i + \beta_3\,\text{lndist}_i^2 + \beta_4\,YEAR98_t + \beta_5\,YEAR99_t + \beta_6\,YEAR00_t + u_i + \varepsilon_{it} \qquad (4)$$
where $u_i$ represents an unobserved time invariant random variable.
i) [4 marks] Suppose you estimate this econometric model using the Random Effects (RE) estimator. Clearly explain the assumption about the relationship between $u_i$ and $\text{concen}_{it}$ that is imposed when estimating the model using the Random Effects (RE) estimator. Clearly explain, and provide an example, whether you think that this is a realistic assumption.
ii) [3 marks] The Random Effects (RE) estimator nests the Pooled OLS model (3) when $\sigma_u^2 = 0$. Test the hypothesis that the pooled OLS model is the most appropriate model for the data. Your answer should clearly state the null and alternative hypotheses, the distribution of the test statistic, and your decision.
Lagrange Multiplier Test - (Breusch-Pagan) for balanced panels
data: lnfare ~ concen + lndist + lndistsq + year1998 + year1999 + year2000
chisq = 5087.5, df = 1, p-value < 0.00000000000000022
alternative hypothesis: significant effects
Figure 5: LM Test for Random Effects Model (4)
iii) [8 marks] Clearly outline the important differences between the Random Effects (RE) estimator and the Fixed Effects (FE) estimator. Your answer should clearly explain the variation in the data that is used to identify the parameters of interest.
c) [3 marks] Consider the following alternative Correlated Random Effects (CRE) model:
$$\ln \text{fare}_{it} = \beta_0 + \beta_1\,\text{concen}_{it} + \beta_2\,\text{lndist}_i + \beta_3\,\text{lndist}_i^2 + \beta_4\,YEAR98_t + \beta_5\,YEAR99_t + \beta_6\,YEAR00_t + \theta\,\overline{\text{concen}}_i + \eta_i + \varepsilon_{it} \qquad (5)$$
where $\overline{\text{concen}}_i$ represents the mean value for market concentration on route $i$ and $\eta_i$ represents an unobserved time invariant random variable that satisfies the restriction $COV(\eta_i, X_{it}) = 0$ for each of the explanatory variables in model (5).
The estimation results for model (5) are presented in Figure 6.
Figure 6: CRE Results (with cluster-robust standard errors) for Model (5)
Using the results in Figure 6, test the hypothesis that the Random Effects (RE) estimator is the most appropriate model using a t-test at the 5% level of significance. Your answer should clearly state the null and alternative hypotheses, the distribution of the test statistic, and your decision.
Consider a dataset of 3,580 workers with the following variables:
wage = hourly wage, in dollars
educ = years of education
exper = years of labour market experience
union = 1 if member of a union, 0 otherwise
non-metro = 1 if live in a non-metropolitan area, 0 otherwise
Consider the following latent variable model determining the choice to join a union:
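$$\text{union}_i^* = \beta_0 + \beta_1\,\text{educ}_i + \beta_2\,\text{exper}_i + \beta_3\,\text{exper}_i^2 + \beta_4 \ln \text{wage}_i + \beta_5\,\text{non-metro}_i + \varepsilon_i \qquad (6)$$
where $\varepsilon_i \mid X \sim N(0, \sigma_\varepsilon^2)$.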
This suggests the following econometric model:
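$$\text{union}_i = \beta_0 + \beta_1\,\text{educ}_i + \beta_2\,\text{exper}_i + \beta_3\,\text{exper}_i^2 + \beta_4 \ln \text{wage}_i + \beta_5\,\text{non-metro}_i + \varepsilon_i \qquad (7)$$
where:
$$\text{union}_i = \begin{cases} 1 & \text{if } \text{union}_i^* \ge 0 \\ 0 & \text{if } \text{union}_i^* < 0 \end{cases}$$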
The parameters of model (7) were estimated as a Probit Model and the results are presented in Figure 7.
Note that the probability density function for a standard normal variable Z is given by:
$$\phi(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right)$$
a) [4 marks] Let $\hat{p}_i$ represent the predicted probability that an individual is a member of a union, based upon their observed characteristics. Consider the following decision rule: if $\hat{p}_i \ge 0.5$ then predict that $\text{union}_i = 1$, otherwise $\text{union}_i = 0$:
$$\widehat{\text{union}}_i = \begin{cases} 1 & \text{if } \hat{p}_i \ge 0.5 \\ 0 & \text{if } \hat{p}_i < 0.5 \end{cases}$$
true | predicted | frequency
0 | 0 | 2,618
1 | 0 | 936
0 | 1 | 16
1 | 1 | 10
TOTAL | | 3,580
Table 1: Predicted Probability Threshold $\hat{p}_i \ge 0.5$
Based upon the information in Table 1, calculate the percentage of outcomes that are correctly predicted. Using Table 1, comment on the usefulness of the model in predicting union = 1 or union = 0.
Figure 7: Probit Estimation Results for Model (7)
b) [5 marks] Calculate the marginal effect for years of education (educ) for a worker living in a metropolitan area with 13 years of education, 10 years of labour market experience and an hourly wage of $7 per hour.
c) [5 marks] Explain how you would calculate the average marginal effect (AME) for living in a non-metropolitan area (non-metro) for an individual with 13 years of education, 10 years of labour market experience, and an hourly wage of $7 per hour. You have not been provided with enough information to actually calculate this marginal effect so do not attempt to calculate the marginal effect. Instead, your answer should clearly explain how you would calculate this marginal effect.
d) [6 marks] An alternative to the Probit model is the Linear Probability Model (LPM). Provide a brief outline of the Linear Probability Model. Explain the main advantages and disadvantages of using this linear (OLS) specification, compared to an alternative non-linear procedure, such as the Probit model.
Some Useful Formulas
Variance of the Sum of Two Random Variables
$$VAR(aX + bY) = a^2\,VAR(X) + b^2\,VAR(Y) + 2ab\,COV(X, Y)$$
$$VAR(aX - bY) = a^2\,VAR(X) + b^2\,VAR(Y) - 2ab\,COV(X, Y)$$
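The two variance identities can be checked numerically with sample moments that use the N - 1 divisor, for which they hold exactly. The sketch below is only an illustration: the data and the constants a and b are made-up values, and `sample_cov` is a helper written for this example.

```python
from statistics import variance

def sample_cov(x, y):
    """Sample covariance with the N - 1 divisor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)

# Hypothetical data and constants, chosen only for illustration
x = [1.0, 2.0, 4.0, 7.0]
y = [3.0, 1.0, 5.0, 2.0]
a, b = 2.0, 3.0

# Left-hand sides: variance of the combined series
var_sum = variance([a * xi + b * yi for xi, yi in zip(x, y)])
var_diff = variance([a * xi - b * yi for xi, yi in zip(x, y)])

# Right-hand sides: built from the separate variances and the covariance
rhs_sum = a ** 2 * variance(x) + b ** 2 * variance(y) + 2 * a * b * sample_cov(x, y)
rhs_diff = a ** 2 * variance(x) + b ** 2 * variance(y) - 2 * a * b * sample_cov(x, y)

print(var_sum, rhs_sum)
print(var_diff, rhs_diff)
```

Note that the constants enter the variance terms squared; only the covariance term changes sign between the sum and the difference.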
Sample Variance
$$\widehat{VAR}(X) = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1}$$
Sample Covariance
$$\widehat{COV}(X, Y) = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{N - 1}$$
Multiple Linear Regression Model
$$y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_K X_{Ki} + \varepsilon_i$$
OLS Residuals
$$e_i = y_i - (b_0 + b_1 X_{1i} + b_2 X_{2i} + \dots + b_K X_{Ki})$$
Estimator of Error Variance
$$\hat{\sigma}^2 = \frac{\sum_{i=1}^{N} e_i^2}{N - K - 1} = \frac{RSS}{N - K - 1}$$
where N denotes the sample size, and (K + 1) the number of estimated parameters in the model, including the intercept.
Sample t Statistic
$$t = \frac{b_k - \beta_k}{se(b_k)}$$
where $\beta_k$ is the hypothesised value under the null hypothesis.
Sample F-statistic
$$F = \frac{(RSS_R - RSS_{UR})/M}{RSS_{UR}/(N - K - 1)} = \frac{(R^2_{UR} - R^2_R)/M}{(1 - R^2_{UR})/(N - K - 1)}$$
when the dependent variable in both the restricted and unrestricted model is the same. Here M denotes the number of restrictions, N denotes the sample size, and (K + 1) the number of estimated parameters in the unrestricted model, including the intercept term.
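The F-statistic can be computed from either residual sums of squares or R-squared values, and the two routes agree when both models share the same dependent variable. The following minimal sketch shows both; all numbers (TSS, RSS values, M, N, K) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def f_from_rss(rss_r, rss_ur, m, n, k):
    """F-statistic from restricted and unrestricted residual sums of squares."""
    return ((rss_r - rss_ur) / m) / (rss_ur / (n - k - 1))

def f_from_r2(r2_r, r2_ur, m, n, k):
    """Same statistic written in terms of restricted and unrestricted R-squared."""
    return ((r2_ur - r2_r) / m) / ((1.0 - r2_ur) / (n - k - 1))

# Hypothetical values: M = 2 restrictions, N = 53 observations, K = 3 slopes
tss, rss_r, rss_ur = 200.0, 120.0, 100.0
m, n, k = 2, 53, 3

f_rss = f_from_rss(rss_r, rss_ur, m, n, k)
f_r2 = f_from_r2(1.0 - rss_r / tss, 1.0 - rss_ur / tss, m, n, k)
print(f_rss, f_r2)
```

With these inputs the numerator is 20/2 = 10 and the denominator is 100/49, so both versions give approximately 4.9.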
Sample F-statistic for Test of Overall Significance
$$F = \frac{(TSS - RSS)/K}{RSS/(N - K - 1)} = \frac{R^2/K}{(1 - R^2)/(N - K - 1)}$$
with:
$$TSS = (N - 1)\,\hat{\sigma}_y^2$$
where $\hat{\sigma}_y^2$ is the sample variance of the dependent variable y.
Here N denotes the sample size, and K the number of estimated parameters in the model, excluding the intercept term.
Goodness of Fit
$$R^2 = 1 - \frac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2} = 1 - \frac{RSS}{TSS}$$
and:
$$\bar{R}^2 = 1 - \frac{RSS/(N - K - 1)}{TSS/(N - 1)} = 1 - \frac{\hat{\sigma}^2}{\hat{\sigma}_y^2}$$
where $\hat{\sigma}^2$ is the estimator of the error variance and $\hat{\sigma}_y^2$ is the sample variance of the dependent variable y.
Here N denotes the sample size, and (K + 1) the number of estimated parameters in the model, including the intercept term.
Note:
$$\bar{R}^2 = 1 - \frac{(1 - R^2)(N - 1)}{N - K - 1}$$
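The goodness-of-fit measures can be sketched as small functions, which also makes it easy to confirm that the two expressions for the adjusted R-squared coincide. The RSS, TSS, N and K values below are hypothetical.

```python
def r_squared(rss, tss):
    """R-squared as one minus the ratio of residual to total sum of squares."""
    return 1.0 - rss / tss

def adj_r_squared(rss, tss, n, k):
    """Adjusted R-squared from degrees-of-freedom-corrected variances."""
    return 1.0 - (rss / (n - k - 1)) / (tss / (n - 1))

def adj_r_squared_from_r2(r2, n, k):
    """Adjusted R-squared written in terms of the unadjusted R-squared."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# Hypothetical values: N = 53 observations, K = 3 slope parameters
rss, tss, n, k = 100.0, 200.0, 53, 3
r2 = r_squared(rss, tss)
adj_a = adj_r_squared(rss, tss, n, k)
adj_b = adj_r_squared_from_r2(r2, n, k)
print(r2, adj_a, adj_b)
```

The adjusted measure is below the unadjusted one here because the degrees-of-freedom correction penalises the K estimated slopes.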
White Test for Heteroskedasticity
Econometric Model of Interest
$$y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_K X_{Ki} + \varepsilon_i$$
Auxiliary Regression
$$e_i^2 = \gamma_0 + \gamma_1 Z_{1i} + \gamma_2 Z_{2i} + \dots + \gamma_M Z_{Mi} + u_i$$
where $e_i$ denote the OLS residuals from the Econometric Model of Interest above.
The test statistic is $N R^2 \sim \chi^2(M)$ where N is the sample size, $R^2$ is the $R^2$ from this auxiliary regression, and M + 1 is the number of parameters in the auxiliary regression (including the intercept).
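The White test steps can be sketched in plain Python for the simplest case of one regressor, where the auxiliary regression uses the regressor and its square (so M = 2). This is a minimal illustration under stated assumptions: the data are simulated with an error standard deviation proportional to x, and `solve`, `ols`, `r_squared` and `white_test` are helper names invented for this example.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(rows, y):
    """OLS via the normal equations; each row starts with 1.0 for the intercept."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    return beta, resid

def r_squared(y, resid):
    """R-squared as 1 - RSS/TSS."""
    ybar = sum(y) / len(y)
    return 1.0 - sum(e * e for e in resid) / sum((yi - ybar) ** 2 for yi in y)

def white_test(x, y):
    """N * R^2 from regressing squared OLS residuals on x and x^2."""
    _, e = ols([[1.0, xi] for xi in x], y)
    e2 = [ei * ei for ei in e]
    _, u = ols([[1.0, xi, xi * xi] for xi in x], e2)
    return len(y) * r_squared(e2, u)

# Simulated (hypothetical) data: error standard deviation grows with x
random.seed(0)
x = [i / 10 for i in range(1, 201)]
y = [1.0 + 2.0 * xi + xi * random.gauss(0.0, 1.0) for xi in x]
stat = white_test(x, y)
print(f"N*R^2 = {stat:.2f}; the 5% critical value of chi-square(2) is 5.991")
```

A statistic above the critical value leads to rejection of the null hypothesis of homoskedasticity. With more regressors the auxiliary regression would also include cross-products.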
Probit Model
Latent Variable Formulation
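$$y_i^* = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_K X_{Ki} + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma_\varepsilon^2)$$
Response Probability
$$\Pr(Y_i = 1) = \Phi\left(\frac{\beta_0}{\sigma_\varepsilon} + \frac{\beta_1}{\sigma_\varepsilon} X_{1i} + \frac{\beta_2}{\sigma_\varepsilon} X_{2i} + \dots + \frac{\beta_K}{\sigma_\varepsilon} X_{Ki}\right)$$
where $\Phi(\cdot)$ is the cumulative distribution function for the standard normal distribution.
Marginal Effect if $X_j$ is a continuous variable
$$\frac{\partial \Pr(Y_i = 1)}{\partial X_{ji}} = \phi\left(\frac{\beta_0}{\sigma_\varepsilon} + \frac{\beta_1}{\sigma_\varepsilon} X_{1i} + \frac{\beta_2}{\sigma_\varepsilon} X_{2i} + \dots + \frac{\beta_K}{\sigma_\varepsilon} X_{Ki}\right) \frac{\beta_j}{\sigma_\varepsilon}$$
where $\phi(\cdot)$ is the probability density function for the standard normal distribution.
Marginal Effect if $X_j$ is an indicator (dummy) variable
$$\Delta \Pr(Y_i = 1) = \Phi\left(\frac{\beta_0}{\sigma_\varepsilon} + \frac{\beta_1}{\sigma_\varepsilon} X_{1i} + \dots + \frac{\beta_j}{\sigma_\varepsilon} X_{ji} + \dots + \frac{\beta_K}{\sigma_\varepsilon} X_{Ki}\right)\Bigg|_{X_{ji} = 1} - \Phi\left(\frac{\beta_0}{\sigma_\varepsilon} + \frac{\beta_1}{\sigma_\varepsilon} X_{1i} + \dots + \frac{\beta_j}{\sigma_\varepsilon} X_{ji} + \dots + \frac{\beta_K}{\sigma_\varepsilon} X_{Ki}\right)\Bigg|_{X_{ji} = 0}$$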
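The two marginal-effect formulas can be sketched with the standard library alone, using `math.erf` for the normal distribution function. The coefficients and covariate values below are hypothetical placeholders, not estimates from any figure in this paper, and the coefficients are assumed to be already scaled by the error standard deviation (as Probit output reports them).

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def index(coefs, x):
    """Linear index: coefs[0] is the intercept, coefs[1:] match the entries of x."""
    return coefs[0] + sum(b * v for b, v in zip(coefs[1:], x))

def me_continuous(coefs, x, j):
    """Marginal effect of continuous regressor j: density at the index times its coefficient."""
    return norm_pdf(index(coefs, x)) * coefs[j + 1]

def me_dummy(coefs, x, j):
    """Marginal effect of dummy regressor j: change in probability from 0 to 1."""
    x1, x0 = list(x), list(x)
    x1[j], x0[j] = 1.0, 0.0
    return norm_cdf(index(coefs, x1)) - norm_cdf(index(coefs, x0))

# Hypothetical scaled coefficients: intercept, one continuous regressor, one dummy
coefs = [-1.0, 0.05, 0.3]
x = [12.0, 0.0]  # hypothetical covariate values at which to evaluate the effects

print(me_continuous(coefs, x, 0))
print(me_dummy(coefs, x, 1))
```

Both effects depend on where the index is evaluated, which is why the covariate values must be stated whenever a Probit marginal effect is reported.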
LM Test for First Order AR(1) Autocorrelation
Econometric Model of Interest
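$$y_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + \dots + \beta_K X_{Kt} + \varepsilon_t$$
Auxiliary Regression
$$e_t = \gamma_0 + \gamma_1 X_{1t} + \gamma_2 X_{2t} + \dots + \gamma_K X_{Kt} + \rho\,e_{t-1} + u_t$$
The test statistic is $T R^2 \sim \chi^2(1)$ where T is the sample size, and $R^2$ is the $R^2$ from this auxiliary regression.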
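A minimal sketch of the LM test for the one-regressor case follows. Two assumptions are made here that the formula sheet does not specify: the first observation is dropped because the lagged residual is unavailable for it, so the statistic uses the number of observations in the auxiliary regression; and the data are simulated with an AR(1) coefficient of 0.7. The helper names are invented for this example.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(rows, y):
    """OLS via the normal equations; each row starts with 1.0 for the intercept."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    return beta, resid

def r_squared(y, resid):
    """R-squared as 1 - RSS/TSS."""
    ybar = sum(y) / len(y)
    return 1.0 - sum(e * e for e in resid) / sum((yi - ybar) ** 2 for yi in y)

def lm_ar1(x, y):
    """LM statistic: regress e_t on a constant, x_t and e_{t-1}, then scale R^2."""
    _, e = ols([[1.0, xt] for xt in x], y)
    rows = [[1.0, x[t], e[t - 1]] for t in range(1, len(y))]
    dep = e[1:]  # first observation dropped (assumption, see lead-in)
    _, u = ols(rows, dep)
    return len(dep) * r_squared(dep, u)

# Simulated (hypothetical) data with AR(1) errors, rho = 0.7
random.seed(1)
T, rho = 200, 0.7
x = [random.gauss(0.0, 1.0) for _ in range(T)]
eps = [0.0] * T
for t in range(1, T):
    eps[t] = rho * eps[t - 1] + random.gauss(0.0, 1.0)
y = [1.0 + 0.5 * x[t] + eps[t] for t in range(T)]
stat = lm_ar1(x, y)
print(f"LM = {stat:.2f}; the 5% critical value of chi-square(1) is 3.841")
```

A statistic above the critical value leads to rejection of the null hypothesis of no first order autocorrelation.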
Augmented Dickey-Fuller test
With intercept, no trend:
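$$\Delta y_t = \alpha + \gamma\,y_{t-1} + \sum_{s=1}^{m} \delta_s\,\Delta y_{t-s} + v_t$$
With intercept and trend:
$$\Delta y_t = \alpha + \lambda t + \gamma\,y_{t-1} + \sum_{s=1}^{m} \delta_s\,\Delta y_{t-s} + v_t$$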
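To make the structure of the Augmented Dickey-Fuller regression concrete, the sketch below assembles the dependent variable and the regressor rows from a series. It only builds the design; estimation and critical values are left out. The function name `adf_design`, the toy series, and the use of the zero-based position in the list as the trend variable are all assumptions made for illustration.

```python
def adf_design(y, m, trend=False):
    """Build the ADF regression: dependent variable and regressor rows.

    Each row is [1, (t), y_{t-1}, dy_{t-1}, ..., dy_{t-m}], with the trend
    term t included only when trend=True.
    """
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]  # dy[t - 1] holds y_t - y_{t-1}
    dep, rows = [], []
    for t in range(m + 1, len(y)):
        row = [1.0]
        if trend:
            row.append(float(t))  # zero-based position used as the time index
        row.append(y[t - 1])  # lagged level
        row.extend(dy[t - 1 - s] for s in range(1, m + 1))  # lagged differences
        dep.append(dy[t - 1])
        rows.append(row)
    return dep, rows

# Hypothetical toy series with m = 2 lagged differences
series = [1, 3, 6, 10, 15, 21]
dep, rows = adf_design(series, m=2)
print(dep)
print(rows)
```

The test statistic is then the usual t-ratio on the lagged-level coefficient from an OLS fit of `dep` on `rows`, compared with Dickey-Fuller rather than standard t critical values.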
Critical Values for the 5% Upper Tail Probabilities of the F Distribution
2022-06-11