ECO00037M Statistics and Econometrics 2021-2
ECO00037M
MSc Degree Examinations 2021-2
Statistics and Econometrics
Section A
1.
(a) [20%]
(i) [15%] We have event sets A = {1,3,5,7}, B = {3,6,9}, C = {2,4,8}. A, B, and C cover all possible outcomes. With a brief explanation, show the elements of the following sets:
A. Union of A and B.
B. Intersection of B and C.
C. Intersection of and , where signifies the complement of an event set E.
(ii) [5%] We know Pr(B) = 0.4 and Pr(A|B) = 0.3 . Find Pr( ∪ ).
(b) [25%]
(i) [15%] The amount of time in minutes a person shops online for a birthday card can be modeled by an exponential distribution with the average amount of time equal to five minutes.
A. Calculate the probability that a person spends between five and ten minutes shopping online for a birthday card.
B. Two people are shopping online (independently) for birthday cards. Compute the probability that both spend more than eight minutes.
C. Five people are shopping online (independently) for birthday cards. What is the probability that only two people spend more than six minutes?
(ii) [10%] $X$ and $Y$ are independent Poisson random variables whose means are $\mu_X$ and $\mu_Y$. Then $X + Y$ is a Poisson random variable with mean $\mu = \mu_X + \mu_Y$. Express $\Pr(X = x \mid X + Y = n)$ in terms of $\mu_X$, $\mu_Y$, $x$ and $n$.
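The three probabilities in part (b)(i) can be checked numerically. The following is a minimal sketch, not a model answer. It assumes that "average amount of time equal to five minutes" means an exponential distribution with mean 5 (rate 1/5), and that "only two people" means exactly two out of five. The function names are illustrative.

```python
import math

MEAN_MINUTES = 5.0  # mean shopping time stated in the question


def survival(t, mean=MEAN_MINUTES):
    """P(T > t) for an exponential distribution with the given mean."""
    return math.exp(-t / mean)


def prob_between(a, b, mean=MEAN_MINUTES):
    """P(a < T < b), written as a difference of survival probabilities."""
    return survival(a, mean) - survival(b, mean)


def prob_exactly_k(k, n, p):
    """Binomial probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


# A. One person spends between five and ten minutes.
p_a = prob_between(5, 10)
# B. Two independent shoppers both spend more than eight minutes.
p_b = survival(8) ** 2
# C. Exactly two of five independent shoppers spend more than six minutes
#    (assumption: "only two" is read as "exactly two").
p_c = prob_exactly_k(2, 5, survival(6))

print(f"A: {p_a:.4f}  B: {p_b:.4f}  C: {p_c:.4f}")
```

Part C reduces to a binomial calculation because each shopper independently exceeds six minutes with the same probability.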
(c) [25%] You are trying to develop a strategy for investing in two different stocks. The expected annual return for a £1,000 investment in each stock under four different economic conditions has the following probability distribution:
Probability | Economic Condition | Returns (in £): Stock A | Returns (in £): Stock B
0.1 | Recession | -100 | 50
0.3 | Slow growth | 0 | 150
0.4 | Moderate growth | 80 | -20
0.2 | Fast growth | 150 | 100
(i) Compute the expected return by investing £1,000 into stock A and into stock B, respectively.
(ii) Compute the standard deviation of the return by investing £1,000 into stock A and into stock B, respectively.
(iii) Compute the covariance between the return for investing £1,000 into stock A and the return for investing £1,000 into stock B.
Now you consider investing some portion of the £1,000 in both stocks A and B.
(iv) Compute the expected return and the standard deviation of the return by investing £300 in stock A and £700 in stock B.
(v) Compute the expected return and the standard deviation of the return by investing £700 in stock A and £300 in stock B.
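The calculations asked for in part (c) can be sketched in a few lines. This is an illustrative check rather than a model answer. It uses the probabilities and returns from the table in the question and assumes that returns scale linearly with the amount invested, so that putting a share w of the £1,000 into stock A gives the return w·A + (1 − w)·B. The helper names are hypothetical.

```python
import math

# Data from the question: probability and return (in £) per £1,000 invested.
probs = [0.1, 0.3, 0.4, 0.2]
stock_a = [-100, 0, 80, 150]
stock_b = [50, 150, -20, 100]


def expectation(values):
    """Probability-weighted mean of a discrete random variable."""
    return sum(p * v for p, v in zip(probs, values))


def covariance(xs, ys):
    """Probability-weighted covariance of two discrete random variables."""
    mx, my = expectation(xs), expectation(ys)
    return sum(p * (x - mx) * (y - my) for p, x, y in zip(probs, xs, ys))


mean_a, mean_b = expectation(stock_a), expectation(stock_b)
var_a, var_b = covariance(stock_a, stock_a), covariance(stock_b, stock_b)
cov_ab = covariance(stock_a, stock_b)


def portfolio(weight_a):
    """Mean and standard deviation when a share weight_a goes into stock A.

    Assumption: returns are proportional to the amount invested.
    """
    w, v = weight_a, 1 - weight_a
    mean = w * mean_a + v * mean_b
    var = w**2 * var_a + v**2 * var_b + 2 * w * v * cov_ab
    return mean, math.sqrt(var)


print("E[A], E[B]:", mean_a, mean_b)
print("sd[A], sd[B]:", math.sqrt(var_a), math.sqrt(var_b))
print("Cov(A, B):", cov_ab)
print("£300 in A, £700 in B:", portfolio(0.3))
print("£700 in A, £300 in B:", portfolio(0.7))
```

A negative covariance between the two stocks is what lets the mixed portfolios have a lower standard deviation than either stock held alone.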
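(d) [30%] We have $X_i \sim \text{i.i.d.}(\mu_x, \sigma_x^2)$ and $Y_i \sim \text{i.i.d.}(\mu_y, \sigma_y^2)$, $i = 1, \dots, n$ $(n > 1)$, and $X_i$ and $Y_i$ are independent of each other. Answer the following questions with an explanation.
(i) [7%] Is $\bar{X} - \bar{Y}$ an unbiased estimator of $\mu_x - \mu_y$, where $\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i$ and $\bar{Y} = \frac{1}{n}\sum_{i=1}^n Y_i$?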
(ii) [7%] Is $\bar{X}^2$ an unbiased estimator of $\mu_x^2$?
(iii) [16%] Is $\bar{X}\bar{Y}$ a consistent estimator of $\mu_x \mu_y$?
2.
(a) [25%] A 10% significance level test about the population mean $\mu$, for $H_0: \mu = 0$ against $H_1: \mu > 0$, was implemented based on a statistic, which was a standardised sample average. The p-value was 0.075. Under the null hypothesis, the statistic has a t-distribution with 10 degrees of freedom. Answer the following questions with a brief explanation.
(i) Do you reject the null hypothesis?
(ii) Explain what a Type I error is. What is the probability of making a Type I error using this test?
(iii) Using the same sample, what is the p-value for the test $H_0: \mu = 0$ against $H_1: \mu \neq 0$?
(iv) Using the same sample, do you reject the null hypothesis $H_0: \mu = 0$ if the alternative is $H_1: \mu \neq 0$?
(v) Using the same sample, what is the value of the test statistic?
(b) [15%] A fertilizer company claims that its product improves average crop yields by more than 400 kg per acre. A random sample of 20 observations shows an average increase of 418.65 kg per acre with a sample standard deviation of 48.60 kg per acre. Stating any assumptions you need to make, test the company's claim at the 5% significance level.
(c) [20%] When the sample size is increased, does the power of the test increase or decrease? Discuss using the example of testing $H_0: \mu = 10$ against $H_1: \mu \neq 10$, where $\mu$ is the population mean, with a sample $X_i \sim \text{i.i.d. } N(8, 36)$, $i = 1, \dots, n$, at the 5% level of significance. Your discussion must include the numerical values of the powers of the test associated with the sample sizes $n = 16$ and $n = 25$.
(d) [20%] Suppose $X_i \sim \text{i.i.d. } N(\mu_x, \sigma^2)$ and $Y_i \sim \text{i.i.d. } N(\mu_y, \sigma^2)$, $i = 1, \dots, n$, and $X_i$ and $Y_i$ are independent of each other. The parameters $\mu_x$, $\mu_y$ and $\sigma^2$ are unknown. We are interested in testing $H_0: \mu_x = \mu_y$ against $H_1: \mu_x \neq \mu_y$. Show that the t-test based on $\bar{X} - \bar{Y}$ and the ANOVA test are equivalent, where $\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i$ and $\bar{Y} = \frac{1}{n}\sum_{i=1}^n Y_i$.
(e) [20%] A supervisor of several shops in an area is monitoring the reputation of these shops. The supervisor suspects that one of these shops may be receiving too many complaints. The manager of this shop claims that the average number of complaints received per day has historically been two, which is the same as the average of other shops. Therefore, the supervisor monitored the number of complaints per day for a certain period and found that it had never been below five. Assuming that the number of complaints per day follows a Poisson process, test the shop manager's claim using the p-value method at a significance level of 10%.
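For part (c) of this question, the power at the two sample sizes can be sketched numerically. The code below is a rough illustration that assumes the population variance of 36 is treated as known, so the test is a two-sided z-test on the standardised sample mean. If the variance were estimated instead, a noncentral t calculation would be needed and the numbers would differ. The function name is illustrative.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def power_two_sided_z(n, mu_true=8.0, mu_null=10.0, sigma=6.0, alpha=0.05):
    """Power of the two-sided z-test of H0: mu = mu_null when mu = mu_true.

    Assumption: sigma is known, so the standardised sample mean is
    N(delta, 1) under the true mean, with delta = sqrt(n)(mu_true - mu_null)/sigma.
    """
    critical = STD_NORMAL.inv_cdf(1 - alpha / 2)
    delta = math.sqrt(n) * (mu_true - mu_null) / sigma
    reject_lower = STD_NORMAL.cdf(-critical - delta)
    reject_upper = 1 - STD_NORMAL.cdf(critical - delta)
    return reject_lower + reject_upper


power_16 = power_two_sided_z(16)
power_25 = power_two_sided_z(25)
print(f"n = 16: power = {power_16:.4f}")
print(f"n = 25: power = {power_25:.4f}")
```

Under these assumptions the shift of the test statistic grows with the square root of n, which is why the larger sample gives the higher rejection probability.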
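3. Consider a linear model
$y_i = \beta_1 + x_i \beta_2 + u_i, \quad i = 1, 2, \dots, n,$
where $u_i \sim \text{i.i.d. } N(0, \sigma^2)$. We observe $(y_i, x_i)$, $i = 1, 2, \dots, n$. The OLS estimators of $\beta_1$ and $\beta_2$ are $b_1 = \bar{y} - \bar{x} b_2$ and $b_2 = \sum_{i=1}^n a_i y_i$ with $a_i = (x_i - \bar{x}) / \sum_{j=1}^n (x_j - \bar{x})^2$, $\bar{y} = \sum_{i=1}^n y_i / n$, $\bar{x} = \sum_{i=1}^n x_i / n$. It is assumed that $x_i$ is non-stochastic. We define $e_i = y_i - \hat{y}_i$, $\hat{y}_i = b_1 + x_i b_2$.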
(a) [20%]
(i) [10%] Show that $\sum_{i=1}^n y_i^2 = \sum_{i=1}^n \hat{y}_i^2 + \sum_{i=1}^n e_i^2$.
(ii) [10%] Show that $\sum_{i=1}^n \hat{y}_i^2 - n\bar{y}^2 = \sum_{i=1}^n (\hat{y}_i - \bar{\hat{y}})^2$, where $\bar{\hat{y}} = \sum_{i=1}^n \hat{y}_i / n$.
(b) [50%]
(i) Is $b_2$ an unbiased estimator of $\beta_2$? Your discussion should start from the equality $b_2 = \sum_{i=1}^n a_i y_i$.
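(ii) Show that $\mathrm{Var}(b_2) = \sigma^2 / \sum_{i=1}^n (x_i - \bar{x})^2$.
(iii) $b_1 = \sum_{i=1}^n c_i y_i$ where $c_i = \frac{1}{n} - \bar{x} a_i$. Show that $\mathrm{Var}(b_1) = \sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{\sum_{i=1}^n (x_i - \bar{x})^2} \right)$.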
(iv) Suppose $\beta_2 \neq 0$. Derive the bias of the estimator of $\beta_1$, $\tilde{\beta}_1 = \bar{y}$. Show a condition under which this estimator is unbiased.
(v) Derive $E(b_2 b_1)$. Is $b_2 b_1$ an unbiased estimator of $\beta_2 \beta_1$? Explain the reason for your answer.
(c) [30%] The regression model was estimated using $n > 5$ observations. Suppose $n$ is even. After the estimation, it was found that the last $n/2$ observations $\{y_i, x_i\}$, $i = n/2 + 1, \dots, n$, are an identical repeat of the first $n/2$ observations, $\{y_i, x_i\}$, $i = 1, \dots, n/2$.
(i) Rewrite the estimators, $b_1 = \bar{y} - \bar{x} b_2$ and $b_2 = \sum_{i=1}^n a_i y_i$, in terms of the first $n/2$ observations, $\{y_i, x_i\}$, $i = 1, \dots, n/2$. Comment on the properties of the estimators $b_1$ and $b_2$.
(ii) Consider R2 = 1 − . Rewrite R2 in terms of the first n/2 observations, {yi , xi }, i = 1, … , n/2. Is it R2 = 1 − ? Explain your answer.
(iii) Consider the estimator of Var(b2 ), , where 2 = . Rewrite this
variance estimator in terms of the first n/2 observations, {yi , xi }, i = 1, … , n/2. Is it
with 2 = ? Explain your answer.
Section B
4.
1000-word limit
(a) [20%]
A researcher estimates the following regression model to predict the price of houses
lprice = β0 + β1 llotsize + β2 bathrms + β3 bd3 + β4 bd4 + β5 bd5 + u,    (4.1)
where lprice is the natural logarithm of house price, llotsize is the natural logarithm of lot size; bathrms is the number of bathrooms, bd3 is a dummy variable taking value 1 for houses with 3 bedrooms and 0 otherwise, bd4 is a dummy variable taking value 1 for houses with 4 bedrooms and 0 otherwise, bd5 is a dummy variable taking value 1 for houses with 5 or more bedrooms and 0 otherwise, and u is an error term. Table 4.1 reports the ordinary least squares estimation results using Stata.
Provide an economic and statistical interpretation of the coefficients. Discuss all the tests reported, defining the null hypothesis of each and stating whether the null hypothesis is accepted or rejected.
Table 4.1
. reg lprice llotsize bathrms bd3 bd4 bd5

      Source |       SS       df       MS              Number of obs =     546
-------------+------------------------------           F(  5,   540) =  118.90
       Model |  39.5180877     5  7.90361754           Prob > F      =  0.0000
    Residual |  35.8950825   540  .066472375           R-squared     =  0.5240
-------------+------------------------------           Adj R-squared =  0.5196
       Total |  75.4131702   545  .138372789           Root MSE      =  .25782

------------------------------------------------------------------------------
      lprice |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    llotsize |   .4501384   .0284825    15.80   0.000     .3941883    .5060885
     bathrms |   .2428136   .0241509    10.05   0.000     .1953723    .2902549
         bd3 |   .1920954   .0270597     7.10   0.000     .1389402    .2452506
         bd4 |   .1985782   .0370555     5.36   0.000     .1257876    .2713689
         bd5 |   .1566038   .0793499     1.97   0.049     .0007315     .312476
       _cons |   6.791724   .2367439    28.69   0.000     6.326672    7.256776
------------------------------------------------------------------------------
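As a reading aid for Table 4.1, the reported t-statistics, R-squared and F-statistic can be recomputed from the other entries of the output. The sketch below copies the figures from the table. It only illustrates how the columns relate to each other (t = coefficient / standard error, R-squared = model SS / total SS, F = model MS / residual MS) and is not an interpretation of the results.

```python
# Figures copied from Table 4.1: (coefficient, standard error, reported t).
coefficients = {
    "llotsize": (0.4501384, 0.0284825, 15.80),
    "bathrms": (0.2428136, 0.0241509, 10.05),
    "bd3": (0.1920954, 0.0270597, 7.10),
    "bd4": (0.1985782, 0.0370555, 5.36),
    "bd5": (0.1566038, 0.0793499, 1.97),
    "_cons": (6.791724, 0.2367439, 28.69),
}
ss_model, ss_total = 39.5180877, 75.4131702
ms_model, ms_residual = 7.90361754, 0.066472375

# Each t-statistic tests H0: coefficient = 0 and equals coef / std. err.
t_stats = {name: coef / se for name, (coef, se, _) in coefficients.items()}

# R-squared is the share of the total sum of squares explained by the model.
r_squared = ss_model / ss_total

# The F-statistic tests that all five slope coefficients are jointly zero.
f_stat = ms_model / ms_residual

for name, t in t_stats.items():
    print(f"{name:>9}: t = {t:.2f} (reported {coefficients[name][2]:.2f})")
print(f"R-squared = {r_squared:.4f}, F = {f_stat:.2f}")
```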
(b) [10%]
Define under which conditions the OLS (Ordinary Least Squares) estimator of the above model is the best linear unbiased estimator, treating the explanatory variables as stochastic (non-fixed) variables drawn randomly from a well-defined population.
(c) [35%] Assuming that all conditions described in point (4.b) are satisfied except that the error term is not independent of llotsize, is the OLS estimator unbiased and efficient? Explain what can potentially cause a dependence between llotsize and the error term. Discuss what type of estimator you would adopt if the error term was correlated with llotsize. Provide full details on the estimation procedure, adapting formulas to the specific example and defining the conditions under which the proposed estimator is consistent. How would you test for these conditions?
(d) [35%] Suppose that you can observe a dummy variable that takes value 1 for urban areas and 0 otherwise. Suppose also that lprice is on average much larger in urban than in rural areas, and that the effects of llotsize, bd3, bd4 and bd5 on lprice are also larger in urban than in rural areas. Explain how you would modify model (4.1) to allow for such dependence between lprice and urban and rural areas. Write down the new extended model and provide an interpretation of the coefficients. Explain a test procedure for the null hypothesis that the coefficients of all the additional variables in the extended model are equal to zero.
5.
1000-word limit
(a) [30%]
Discuss an empirical example of a probit model different from the ones discussed in the lectures, seminars, computer practical sessions and past exams. Write down the probit model and provide details on all variables included and the interpretation of the coefficients. The variables included should be sensible determinants of the dependent variable and you must include at least two explanatory variables. Define the likelihood of the probit model and explain how the maximum likelihood estimator of the coefficients is computed. Make sure to adapt formulas and explanation to your empirical example.
(b) [25%]
Provide the formula for the marginal effect of two of the explanatory variables in the model defined in point (5.a). Compute the ratio between the marginal effect of the first variable and the marginal effect of the second variable and show that this is equal to the ratio between the coefficients of these two variables. Given this result, provide an interpretation for the ratio between the coefficients of two variables in the model defined in point (5.a).
(c) [20%]
Consider a linear probability model rather than a probit model for the example discussed in point (5.a). Write down the linear probability model and provide an interpretation of the coefficients. Explain in detail an estimation procedure for the linear probability model. Make sure to adapt formulas and discussion to this specific example.
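The ratio result referred to in point (5.b) can be sketched in generic notation. The following assumes a probit model with a linear index $x_i'\beta$ in which the two chosen regressors $x_{ij}$ and $x_{ik}$ are continuous and enter only linearly. The symbols $\Phi$ and $\phi$ (standard normal cdf and pdf) and the indices $j$, $k$ are notation introduced here, not taken from the question, and would need adapting to the chosen empirical example.

```latex
\begin{align*}
\Pr(y_i = 1 \mid x_i) &= \Phi(x_i'\beta), \\
\frac{\partial \Pr(y_i = 1 \mid x_i)}{\partial x_{ij}} &= \phi(x_i'\beta)\,\beta_j,
\qquad
\frac{\partial \Pr(y_i = 1 \mid x_i)}{\partial x_{ik}} = \phi(x_i'\beta)\,\beta_k, \\
\frac{\partial \Pr(y_i = 1 \mid x_i)/\partial x_{ij}}
     {\partial \Pr(y_i = 1 \mid x_i)/\partial x_{ik}}
&= \frac{\phi(x_i'\beta)\,\beta_j}{\phi(x_i'\beta)\,\beta_k}
 = \frac{\beta_j}{\beta_k}, \qquad \beta_k \neq 0 .
\end{align*}
```

The common factor $\phi(x_i'\beta)$ is strictly positive and cancels, so under these assumptions the ratio does not depend on the point $x_i$ at which the marginal effects are evaluated.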
2022-08-09