TSTA101 Introductory Statistics, Term 1, 2023

PROJECT REPORT

General Information:

▪   The objective of this project is to analyse the provided information on young employees in Tasmania and to discuss issues related to the relationships between employees’ wages and abilities, among other factors.

▪   You need to work in a group (no groups of more than THREE members will be permitted) to complete the assignment. You will need to complete the assignment group work sheet (available on Moodle) and attach it to your assignment. For a group submission, each student in the group needs to write a brief statement of his/her contribution on the cover sheet. All students in a group must sign this work sheet. It is NOT acceptable for one student to sign for all group members.

▪   An electronic copy of the assignment needs to be submitted to the Assignment Folder on Moodle on Monday 29th May at 5.00 pm.

▪   You will be required to use an appropriate technique or method to evaluate or present data. DO NOT use every technique you can think of, as this only shows that you do not understand what is required. Use the most appropriate one, although you should also remember that different techniques/tests/graphs may provide you with different types of information. Use your judgement carefully.

▪   Use Microsoft Excel to generate graphs and calculate numerical measures for describing random variables.

▪   Your explanations must be clear, concise and complete.

▪   Wherever calculation is required, please show your work in detail, as partial marks are given for each step.

▪   Ensure that you analyse data thoroughly and present results carefully. Make sure that you interpret results in the context of the initial problem in order to show your understanding. You can also make recommendations about further research that should be conducted in order to provide a better answer.

▪   All tables and diagrams should be accurately labelled, referenced and referred to in the text.

▪   It is recommended that you type your assignment using a computer.

▪   All hypothesis tests must be performed at the 5% level of significance. Students are NOT allowed to directly use the test function in MS Excel to conduct hypothesis tests. All hypothesis tests must follow the structure below:

1.   State the null and alternative hypotheses.

2.   Show how to construct the test statistic and what the distribution is under the null hypothesis.

3.   Calculate the test statistic.

4.   State the significance level of the test.

5.   State the rejection rule.

6.   State the conclusion of the test, expressed in terms of the aim of the test.

▪   Please note that teaching staff may provide advice but are not responsible for resolving personal difficulties you may have when working with peers.

Assessment Criteria:

You should be aware that assignments will be checked for plagiarism. Plagiarism is punishable by reduction or cancellation of marks, and in the most serious cases, exclusion from a unit, a course or the University. This warning applies to a case where a student submits somebody else’s work as their own, AND to a student who willingly allows another to copy and submit their work. I expect your plagiarism match with any source (internet or other groups) to be less than 15%. Plagiarism policy is described in the TSTA101 unit outline.

Submission and Request for extension.

▪   Submit your assignment, including a cover sheet, through TURNITIN found on Moodle under the ‘Major Assignment’ icon. The electronic copy must include a signed cover sheet with your name and student ID on it. Please remember that you are responsible for lodging the assignment on or before the due date.

▪   If you have problems submitting your assignment, you MUST contact your lecturer immediately by email, explaining the situation and attaching your assignment, before the due time. In the title of your email, you must clearly identify that you are experiencing a problem in TSTA101 Introductory Statistics. In the body of the email, explain the specific problem.

▪   The Late Assessment and Extension Policy applies. Please refer to this policy in the Unit Outline.

DATA DESCRIPTION

(A FICTITIOUS DATASET DESIGNED FOR THE ASSIGNMENT ONLY)

A consulting firm randomly selected 150 young employees in Tasmania. These selected employees answered questions and undertook a standard IQ test and a KW test. The KW test examines respondents’ knowledge about the duties in their workplaces and their knowledge about the Australian and Tasmanian labour markets. Respondents’ answers are entered into a spreadsheet where each column represents a variable. These variables include:

1. wage: monthly earnings in dollars

2. hours: average weekly working hours

3. IQ: IQ score

4. KW: knowledge of work score

5. educ: years of education

6. exper: years of work experience

7. tenure: years with the current employer

8. age: age in years

9. marriage: marriage status

10. gender: female or male

11. urban: =Y if lives in urban areas, =N if lives in rural areas

12. sibs: the number of siblings

13. brthord: birth order, e.g. =2 means he/she is the second child in the family.

14. meduc: mother’s education

15. feduc: father’s education

The missing values are shown by a “.” in the cells.
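The “.” convention above can be illustrated with a minimal sketch in Python. This is only an illustration of how the marker identifies incomplete records: the assignment itself requires Microsoft Excel, the three data rows are invented, and the name `split_complete_rows` is hypothetical. Setting incomplete rows aside is one possible way to manage missing data, not a prescribed one.

```python
import csv
import io

# Hypothetical three-row extract; the values are invented for illustration
# and are NOT taken from the assignment dataset.
SAMPLE_CSV = """wage,hours,IQ,KW,meduc
950,40,102,35,12
.,38,98,30,10
1100,45,110,.,.
"""

MISSING = "."  # marker used in the spreadsheet cells for a missing value


def split_complete_rows(text):
    """Return (complete_rows, incomplete_rows) parsed from CSV text."""
    complete, incomplete = [], []
    for row in csv.DictReader(io.StringIO(text)):
        # A row is incomplete if any cell holds only the missing marker.
        has_missing = any((value or "").strip() == MISSING for value in row.values())
        (incomplete if has_missing else complete).append(row)
    return complete, incomplete


complete_rows, incomplete_rows = split_complete_rows(SAMPLE_CSV)
print(f"complete rows: {len(complete_rows)}, rows with missing values: {len(incomplete_rows)}")
```

Keeping the flagged rows in a separate list, rather than discarding them silently, makes it easy to report how many respondents were affected and which variables were missing.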

Questions:

1.   Read the provided raw data carefully to check whether all respondents have provided information for each variable. Explain what you have done to manage the missing data. Clearly indicate the final number of observations (respondents) you will use in the following analysis. Submit an electronic copy of the Excel spreadsheet of the final dataset together with your assignment. All your following analysis should be based on this final dataset. [15 marks]

2.   Pick two numerical variables and two categorical variables and then describe each of them one by one. Use appropriate tables/graphs and numerical measures to help you describe the distribution of the variables. [15 marks]

3.   It is often asked what factors relate to IQ score and KW score. Look through your data and first pick one numerical variable that you think may relate to IQ score. Explain why you picked this variable. Then use an appropriate graph and an appropriate numerical measure to discuss the empirical relationship between IQ score and this numerical variable. Repeat the same exercise for the relationship between KW score and a numerical variable to which you think KW may relate. [15 marks]

4.   You want to look at the relationship between gender and wages. However, you notice that gender is a categorical variable and wage is a numerical variable. One way to work with two different types of variables is to transform one variable to the type of the other. You decide to generate a categorical variable based on the level of wage, and this categorical variable has two values, “high” and “low”. For example, you choose a threshold value for wage, and if a respondent’s wage is no less than the threshold value, you enter “high”, and enter “low” otherwise.

a.   Describe in detail how you have decided the threshold value for generating the new categorical variable for the level of wage. Then use an appropriate graph to present this variable. (Hint: you may choose to use an appropriate numerical measure of wage as the threshold value). [6 marks]

b.   Present these two categorical variables together using an appropriate graph, and then discuss what the graph shows. [5 marks]

c.   Produce a contingency table to present these two categorical variables. Based on the contingency table, calculate the related (empirical) joint and marginal probabilities. You may find it helpful to produce another contingency table to show your calculated probabilities. (Hint: you may need Excel skills, e.g. commands such as “sort” or “countif”, to count the relevant frequencies, or use the PivotTable function.) [8 marks]

d.   Based on the sample information, calculate the probability of either being a female or getting a low wage level, and calculate the probability of being a female conditional on getting a low wage level. [5 marks]

e.   Examine whether the statement “Males tend to receive higher wages than females” is true, false or inconclusive based on the sample information. Explain your response. [6 marks]  [Total Marks 30]

5.   Suppose that the population average (monthly) wage of young employees in Tasmania in the year before this survey was conducted was $900.

a.   Conduct a hypothesis test that the population average wage of young employees in Tasmania during the year of survey remains the same as in the previous year. [10 marks]

b.   Construct a 95% confidence estimate for the population average wage, and comment whether the population average wage in the year of survey remains the same as in the previous year.

[15 marks] [Total Marks 25]
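The six-step test structure required in the General Information section can be sketched for a test of a population mean against $900. The sketch below is in Python purely for illustration (the assignment requires Excel, with calculations shown by hand), and it rests on stated assumptions: the ten wage values are invented, the population standard deviation is treated as unknown so a t statistic with n - 1 degrees of freedom is used, and the critical value 2.262 is read from a standard t table for 9 degrees of freedom rather than computed.

```python
import math
import statistics

# Hypothetical sample of monthly wages; invented for illustration only.
wages = [850, 920, 1010, 880, 940, 990, 870, 1050, 900, 960]

# Step 1: H0: mu = 900 versus H1: mu != 900 (two-sided).
MU_0 = 900

# Step 2: with unknown population standard deviation, the statistic
# t = (sample mean - mu_0) / (s / sqrt(n)) follows a t distribution
# with n - 1 degrees of freedom under H0 (assumption of this sketch).
n = len(wages)
sample_mean = statistics.mean(wages)
sample_sd = statistics.stdev(wages)  # uses the n - 1 divisor
std_error = sample_sd / math.sqrt(n)

# Step 3: calculate the test statistic.
t_stat = (sample_mean - MU_0) / std_error

# Step 4: significance level of the test.
ALPHA = 0.05

# Step 5: rejection rule: reject H0 if |t| exceeds the critical value.
# 2.262 is the two-sided 5% critical value for 9 degrees of freedom,
# taken from a t table; it must change if the sample size changes.
T_CRITICAL = 2.262
reject_h0 = abs(t_stat) > T_CRITICAL

# Step 6: conclusion in terms of the aim of the test.
decision = "reject H0" if reject_h0 else "do not reject H0"
print(f"t = {t_stat:.3f}, decision at the 5% level: {decision}")

# A 95% confidence interval built from the same ingredients.
margin = T_CRITICAL * std_error
ci_low, ci_high = sample_mean - margin, sample_mean + margin
print(f"95% CI for the mean wage: ({ci_low:.2f}, {ci_high:.2f})")
```

The two-sided test at the 5% level and the 95% confidence interval use the same standard error and critical value, so they lead to the same conclusion: the null hypothesis is not rejected exactly when $900 lies inside the interval.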