ECMT2150 S2 2022 Canvas Quiz

PART B [30 marks]

Empirical Exercise Using STATA: Estimation, Interpretation, and Inference

Questions related to this part will be worth 30 marks. There will be a mixture of numerical and short answer questions.

REMINDER: I encourage you to work through the following analysis of the data in STATA or another software package before doing the quiz. There are no trick questions, so if you have completed each of the following questions, kept a copy of your output and made a note of your answers, there will be no surprises when you are taking the quiz. You should not need to use STATA during the quiz at all. The quiz is not timed, so you can leave and come back to the quiz if you need to.

Accessing the data: You can download the data from the ‘Canvas Quiz Instructions’ page where you found these instructions. Go to the ‘Assignments’ area on our Canvas site:

https://canvas.sydney.edu.au/courses/44341/assignments

Note: There are multiple versions of the dataset, and each student will have a link to just one of them. I have edited the data in each version to make them different enough to prevent collusion. You need to answer the questions using your own data. If you answer questions using a classmate’s dataset, you will answer questions in the quiz incorrectly, and you will lose marks or be referred to the academic integrity office.

Data and research question description

We are interested in assessing the effect of school resources, as measured by school spending per enrolled student, on student reading ability, as measured by the proportion of Year 6 students who pass a reading comprehension test. To investigate this, we will analyse a random sample of schools across New South Wales and Victoria in 2014. The dataset, ‘reading_literacy.dta’, contains 933 observations on the following four variables:

•   Read6YR: per cent of Year 6 students who pass a reading comprehension test

•   SchoolExpend: Total school expenditure (in dollars)

•   Enroll: Number of students enrolled at the school

•   Poverty: per cent of the student population from a low socio-economic background

Questions:

1)   Generate the following extra variables that you will need in subsequent analysis: lnSchoolExpendPP, which is the natural logarithm of school expenditure per student enrolled at the school, and lnNumEnroll, which is the natural logarithm of the total number of students enrolled at the school.
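A minimal sketch of how the variables in question 1 could be generated in STATA. It assumes the dataset file sits in your current working directory; the intermediate variable SchoolExpendPP is not in the raw data and is created here only because question 2 refers to it by that name.

```stata
* Load the data (assumes reading_literacy.dta is in the working directory)
use "reading_literacy.dta", clear

* Expenditure per enrolled student (intermediate variable, in dollars per student)
generate SchoolExpendPP = SchoolExpend / Enroll

* Natural logarithms used as regressors in models (1) and (2)
generate lnSchoolExpendPP = ln(SchoolExpendPP)
generate lnNumEnroll = ln(Enroll)

* Sanity check: ln() returns missing for zero or negative arguments
count if missing(lnSchoolExpendPP) | missing(lnNumEnroll)
```

If the final count is not zero, inspect those observations before moving on to the regressions.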

2)   [2 marks] Compute the mean, standard deviation, minimum and maximum values for Read6YR, SchoolExpendPP (school expenditure per enrolled student, in levels), NumEnroll (number of students enrolled, in levels) and Poverty.
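One possible way to produce the four summary statistics in question 2 in a single table. This is a sketch under two assumptions: that SchoolExpendPP is total expenditure divided by enrolment, and that NumEnroll is simply the Enroll variable under another name.

```stata
use "reading_literacy.dta", clear

* Assumption: per-student expenditure is total expenditure over enrolment
generate SchoolExpendPP = SchoolExpend / Enroll

* Assumption: NumEnroll is the raw Enroll variable under another name
generate NumEnroll = Enroll

* Mean, standard deviation, minimum and maximum, one row per variable
tabstat Read6YR SchoolExpendPP NumEnroll Poverty, ///
    statistics(mean sd min max) columns(statistics)
```

The `summarize` command with the same variable list reports the same four statistics along with the number of observations.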

3)   [3 marks] Estimate the following multiple linear regression model:

$$\text{Read6YR} = \beta_0 + \beta_1\,\text{lnSchoolExpendPP} + \beta_2\,\text{lnNumEnroll} + u \tag{1}$$

Report the results (i.e. estimates, standard errors and model fit).
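Model (1) can be estimated by ordinary least squares with `regress`. The sketch below assumes the same file location and variable construction as in question 1; the `display` lines simply echo stored results that also appear in the header of the regression output.

```stata
use "reading_literacy.dta", clear
generate lnSchoolExpendPP = ln(SchoolExpend / Enroll)
generate lnNumEnroll = ln(Enroll)

* OLS estimation of model (1): coefficients, standard errors, t statistics
regress Read6YR lnSchoolExpendPP lnNumEnroll

* Model fit and sample size from the stored results
display "N = " e(N)
display "R-squared = " %6.4f e(r2)
display "Adjusted R-squared = " %6.4f e(r2_a)
display "Root MSE = " %9.4f e(rmse)
```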

4)   [3 marks] What is the interpretation of $\beta_1$? What would you expect the sign of this coefficient to be? Explain your answer.
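As a reminder of the general rule for a level-log specification (dependent variable in levels, regressor in logs), the following approximation may help when wording your interpretation. It holds lnNumEnroll fixed and is accurate for small percentage changes only.

```latex
\Delta \text{Read6YR}
  = \beta_1 \, \Delta \ln(\text{SchoolExpendPP})
  \approx \frac{\beta_1}{100} \, \bigl(\%\Delta \text{SchoolExpendPP}\bigr)
```

Because Read6YR is measured in per cent, a change in Read6YR is a change in percentage points.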

5)   [3 marks] Test the hypothesis that $\beta_1 = 0$ against the alternative that $\beta_1 \neq 0$ using a 10% significance level. What do you conclude? Make sure to follow all the steps when doing your hypothesis test.
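The steps of the two-sided t test can be reproduced by hand from the stored results, as in the sketch below. It assumes the usual classical OLS standard errors; the scalar names tstat, crit and pval are illustrative only.

```stata
use "reading_literacy.dta", clear
generate lnSchoolExpendPP = ln(SchoolExpend / Enroll)
generate lnNumEnroll = ln(Enroll)

* Re-estimate model (1), reporting 90% confidence intervals
regress Read6YR lnSchoolExpendPP lnNumEnroll, level(90)

* t statistic for H0: beta1 = 0
scalar tstat = _b[lnSchoolExpendPP] / _se[lnSchoolExpendPP]

* Two-sided 10% critical value: 5% in each tail, n - k - 1 degrees of freedom
scalar crit = invttail(e(df_r), 0.05)

* Two-sided p-value
scalar pval = 2 * ttail(e(df_r), abs(tstat))

display "t = " tstat "   critical value = " crit "   p-value = " pval

* Built-in equivalent: reports an F statistic equal to the squared t statistic
test lnSchoolExpendPP = 0
```

Reject the null hypothesis if the absolute value of the t statistic exceeds the critical value, or equivalently if the p-value is below 0.10.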

6)   [3 marks] Now estimate a new multiple regression model that includes an extra explanatory variable to account for the socio-economic status of the school’s student population:

$$\text{Read6YR} = \beta_0 + \beta_1\,\text{lnSchoolExpendPP} + \beta_2\,\text{lnNumEnroll} + \beta_3\,\text{Poverty} + u \tag{2}$$

What share of the variation in the percentage of Year 6 students who pass a reading comprehension test does the model explain? What does this measure of goodness-of-fit tell us about the reliability of our results?
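A sketch of how model (2) could be estimated and its goodness-of-fit read off, under the same assumptions about file location and variable construction as before.

```stata
use "reading_literacy.dta", clear
generate lnSchoolExpendPP = ln(SchoolExpend / Enroll)
generate lnNumEnroll = ln(Enroll)

* OLS estimation of model (2), which adds Poverty as a control
regress Read6YR lnSchoolExpendPP lnNumEnroll Poverty

* Share of the sample variation in Read6YR explained by the regressors
display "R-squared = " %6.4f e(r2)
display "Adjusted R-squared = " %6.4f e(r2_a)
```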

7)   [4 marks] What happened to the coefficient on lnSchoolExpendPP when you added the additional explanatory variable Poverty to the model? Explain why there might be a difference between the two estimates for $\beta_1$ in models (1) and (2).
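To compare the two sets of estimates side by side, one option is to store each model and tabulate them. The sketch also runs an auxiliary regression of Poverty on the other two regressors: in-sample, the model (1) coefficient equals the model (2) coefficient plus the Poverty coefficient multiplied by the auxiliary slope, provided both models use the same observations. The names model1, model2, b1_long, b3 and d1 are illustrative.

```stata
use "reading_literacy.dta", clear
generate lnSchoolExpendPP = ln(SchoolExpend / Enroll)
generate lnNumEnroll = ln(Enroll)

* Model (1): without Poverty
quietly regress Read6YR lnSchoolExpendPP lnNumEnroll
estimates store model1

* Model (2): with Poverty
quietly regress Read6YR lnSchoolExpendPP lnNumEnroll Poverty
estimates store model2
scalar b1_long = _b[lnSchoolExpendPP]
scalar b3 = _b[Poverty]

* Side-by-side coefficients and standard errors
estimates table model1 model2, b(%9.4f) se(%9.4f) stats(N r2)

* Auxiliary regression: how Poverty co-moves with the included regressors
regress Poverty lnSchoolExpendPP lnNumEnroll
scalar d1 = _b[lnSchoolExpendPP]

* Should reproduce the model (1) coefficient on lnSchoolExpendPP
display "b1_long + b3*d1 = " b1_long + b3 * d1
```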

8)   [4 marks] What happened to the standard error of the coefficient on lnSchoolExpendPP when you added the additional explanatory variable Poverty to the model? Explain why there might be a difference between the two standard error estimates for $\beta_1$ in models (1) and (2). [Hint: Review the formula for the variance of an MLR estimator.]
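For reference, the standard textbook expression for the sampling variance of an OLS slope estimator under the classical assumptions (including homoskedasticity) is reproduced below. Here $\sigma^2$ is the error variance, $SST_j$ is the total sample variation in regressor $j$, and $R_j^2$ is the R-squared from regressing regressor $j$ on all the other regressors in the model.

```latex
\operatorname{Var}(\hat{\beta}_j)
  = \frac{\sigma^2}{SST_j \,(1 - R_j^2)},
\qquad
\operatorname{se}(\hat{\beta}_j)
  = \frac{\hat{\sigma}}{\sqrt{SST_j \,(1 - R_j^2)}}
```

Adding a regressor can move the standard error in either direction: the estimate $\hat{\sigma}$ changes with the fit of the model, while $R_j^2$ cannot fall when another regressor is added.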

9)   [4 marks] In your opinion, which model, (1) or (2), is more likely to measure the causal effect of school funding on student reading ability? Explain your reasoning.

10) [4 marks] Create a file in STATA or in Word that documents your work (commands and output) in STATA. This should be no more than 4 pages long. You will need to upload this file (as a PDF) at the end of the quiz in the Assignment dropbox. Quizzes that are missing this file will be penalized.

There are various ways to do this:

•   You could use a STATA log file that you could later save as a PDF to upload

•   You can copy and paste from the log file into a Word document and convert it to a PDF

•   You can highlight the relevant output in the Results window, then right-click to copy it either as text or as a picture. Then paste it into a Word document and save as a PDF.

Whatever method you use, save the PDF document somewhere you can find it, so you can easily upload it while you are taking your quiz.
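For the log-file route, a minimal sketch is shown below. The file name ecmt2150_quiz is only an example, and whether `translate` can write a PDF directly depends on your STATA version and operating system; if it cannot, open the log and print or export it to PDF instead.

```stata
* Close any log that is already open, then start a new one (SMCL format by default)
capture log close
log using "ecmt2150_quiz", replace

use "reading_literacy.dta", clear
* ... run your commands for questions 1 to 9 here ...

* Stop logging
log close

* Convert the log to PDF (availability depends on version and platform)
translate "ecmt2150_quiz.smcl" "ecmt2150_quiz.pdf", replace
```

Check the page count of the resulting PDF against the 4-page limit before uploading.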