School of Economics

ASSIGNMENT

Semester 1 - 2021


ECMT1010 Introduction to Economic Statistics

Due: 6.00PM Friday 4 June 2021


Academic Dishonesty and Plagiarism

Academic honesty is a core value of the University, so all students are required to act honestly, ethically and with integrity. This means that the University is opposed to and will not tolerate academic dishonesty or plagiarism, and will treat all allegations of academic dishonesty and plagiarism seriously. The consequences of engaging in plagiarism and academic dishonesty, along with the process by which they are determined and applied, are set out in the Academic Honesty in Coursework Policy 2015. You can find these documents in the University Policy Register at http://sydney.edu.au/policies (enter ‘Academic Honesty’ in the search field).


Instructions

i. Enter your answers using the Word template available under the ECMT1010 Canvas module ‘Assignment’.

ii. The assignment is anonymously marked. Save your answers in a .DOCX or .PDF file named 123456789.docx where 123456789 is your 9-digit University of Sydney SID. Do not put your name on your answers. Do not include a cover sheet.

iii. Submit the electronic copy of your answers through Turnitin under the Canvas module ‘Assignment’. Work not submitted on or before the due date is subject to a penalty of 5% per calendar day late. Work submitted more than 10 days after the due date, or after the return date, will receive a mark of 0.

iv. Use your assigned data set (available under the Canvas module ‘Assignment’). See ‘Specific instructions’ below. Write your data set number (#) using the box provided in the Word template. Use of the wrong data set will be reviewed as a potential case of Academic Dishonesty.

v. The assignment has a maximum of 20 marks and accounts for 15% of your final grade. Maximum marks are indicated for each question.

Aim: This assignment illustrates the use of various statistical techniques in an economic application. You will use software (e.g., Excel, StatKey) to analyze student survey data.

Data description: You are assigned a data set consisting of randomly-selected undergraduate university student survey responses. Your data extract contains information on grade point average (GPA), weekly hours watching TV, and preferred award.


Specific instructions:

● Your allocated data set is available in the Excel spreadsheet Students#.xlsx (where # is the last digit of your SID). It contains 3 columns and 101-110 rows (depending on your sample).

● The first row contains the variable names; the remaining rows contain the individual survey responses. The Award column identifies the student’s preference (chosen from Olympic gold medal, Academy award, and Nobel prize), TV is time each student spends watching TV (in hours per week), and GPA is the student’s grade point average (i.e., a measure of overall academic performance at university).

● Answer all questions. Show all numerical answers to 3 decimal places. Carry out all tests using a 5% level of significance.

● IMPORTANT: When communicating statistical results, it is important that your descriptions are concise as well as accurate. Keep your comments, conclusions, comparisons, etc., to one or two sentences. Excessively long responses indicate a lack of understanding and will be penalised accordingly.

● HINT: If you convert your Excel data file into csv format, you can upload it to StatKey using ‘Upload File’.


QUESTIONS

1. You are curious whether, in terms of GPA, students who express a preference to win a Nobel prize perform better than the other students, on average. Set up the null and alternative hypotheses, taking care to define your notation clearly. [2 marks]

2. Test the hypothesis using StatKey with the ‘reallocate groups’ randomization method. Produce a dotplot of the randomization distribution (with at least 2,000 samples) of the appropriate sample statistic. Carry out the hypothesis test using the randomization distribution and state your conclusion. [2 marks]

3. Verify that the Central Limit Theorem applies in this case, then carry out the same hypothesis test using the appropriate approximation and state your conclusion. Briefly compare these results to your findings in question 2. [2 marks]
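The ‘reallocate groups’ idea in question 2 can be illustrated outside StatKey. The following is a minimal Python sketch, not StatKey's own implementation: it pools the two groups, reshuffles the group labels many times, and records how often the reshuffled difference in means is at least as large as the observed one. The function name `reallocation_p_value`, the seed, and the GPA values are all made up for illustration and are not taken from any Students#.xlsx file.

```python
import random
from statistics import mean


def reallocation_p_value(group_a, group_b, n_samples=2000, seed=0):
    """One-sided randomization test of mean(group_a) - mean(group_b) > 0.

    Returns the observed difference in means and the proportion of
    reallocated samples whose difference is at least as large.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    at_least_as_extreme = 0
    for _ in range(n_samples):
        # Reallocate: shuffle all values, then split into groups of the
        # original sizes, as if group labels were assigned at random.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / n_samples


# Hypothetical GPA values, for illustration only.
nobel = [3.4, 3.6, 3.1, 3.8, 3.5, 3.3]
other = [3.0, 3.2, 2.9, 3.4, 3.1, 3.3, 2.8]

observed, p_value = reallocation_p_value(nobel, other)
print(f"observed difference = {observed:.3f}, p-value = {p_value:.3f}")
```

The p-value here is the right-tail proportion of the randomization distribution, which is then compared with the 5% significance level. Your own answers should still be produced with your assigned data set and the software named in the question.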

You now move on to investigate the relationship between GPA and TV hours.

4. Using appropriate software, produce a scatterplot of GPA against TV hours using your sample. Compute the sample correlation and comment on the degree of association between the two variables. (There is no need to show your computational steps.) [2 marks]

5. Set up the null and alternative hypotheses to test whether there is a statistically significant linear association between GPA and TV hours, taking care to define your notation clearly. [2 marks]

6. Test whether there is a statistically significant linear association between GPA and TV hours, showing all your steps and clearly stating your conclusion. [2 marks]
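As a rough illustration of the quantities involved in questions 4 to 6, the sketch below computes a sample correlation r and the usual test statistic for a linear association, t = r·sqrt(n − 2) / sqrt(1 − r²), which is compared with a t distribution with n − 2 degrees of freedom. The function names and the TV and GPA values are hypothetical, chosen only to make the example run; the critical value or p-value would still come from tables or statistical software.

```python
from math import sqrt
from statistics import mean


def sample_correlation(x, y):
    """Pearson sample correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """t statistic for H0: no linear association, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


# Hypothetical weekly TV hours and GPA values, for illustration only.
tv = [2, 5, 8, 10, 14, 20]
gpa = [3.6, 3.4, 3.5, 3.1, 3.0, 2.7]

r = sample_correlation(tv, gpa)
t_stat = correlation_t_statistic(r, len(tv))
print(f"r = {r:.3f}, t = {t_stat:.3f}, df = {len(tv) - 2}")
```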

To evaluate whether GPA is a linear function of TV hours, you decide to set up an appropriate regression model.

7. Write down your regression model taking care to define your notation clearly. Using appropriate software, estimate the regression model and report your results. [2 marks]

8. Use your regression results to give a one-sentence interpretation of the regression slope estimate. [2 marks]

9. Test whether TV hours is an effective predictor of GPA in the regression model you estimated in question 7. Make sure you report your null and alternative hypotheses, the test statistic, decision rule, and conclusion to the test. [2 marks]

10. Using your results above, briefly comment on whether your data provide evidence that watching more TV causes a reduction in academic performance. [2 marks]
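For orientation, the least-squares quantities behind questions 7 to 9 can be sketched in a few lines of Python. This assumes a simple linear regression of GPA on TV hours with an intercept; the function name `fit_simple_regression` and the data values are hypothetical and used only for illustration. The slope's t statistic is the estimate divided by its standard error, compared with a t distribution with n − 2 degrees of freedom.

```python
from math import sqrt
from statistics import mean


def fit_simple_regression(x, y):
    """Least-squares fit of y = b0 + b1 * x.

    Returns (intercept, slope, standard error of the slope).
    """
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))

    slope = sxy / sxx
    intercept = my - slope * mx

    # Residual variance uses n - 2 degrees of freedom (two estimated coefficients).
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    se_slope = sqrt((sse / (n - 2)) / sxx)
    return intercept, slope, se_slope


# Hypothetical weekly TV hours and GPA values, for illustration only.
tv = [2, 5, 8, 10, 14, 20]
gpa = [3.6, 3.4, 3.5, 3.1, 3.0, 2.7]

b0, b1, se_b1 = fit_simple_regression(tv, gpa)
t_slope = b1 / se_b1
print(f"intercept = {b0:.3f}, slope = {b1:.3f}, SE(slope) = {se_b1:.3f}, t = {t_slope:.3f}")
```

In a simple regression with one predictor, this slope t statistic equals the t statistic for the correlation between the same two variables, so the two tests lead to the same decision.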

