
ETM5900

Business Statistics [S2 2023]

Assignment 2

INSTRUCTIONS

1. This is an individual assignment worth 50 marks. It forms 25% of your final mark for this unit.

2. Assignment presentation format:

• Use default format, paragraph, and margin settings.

• Font: Times New Roman, size 12.

• At least 1.15 spacing between lines.

3. Students must always uphold Academic Integrity.

4. Submission of the assignment is to be done via Moodle. Please submit your answers as a PDF document, together with your Excel working file.

5. Answer all 5 questions.

6. The assignment must represent your own work and not extracts from your colleagues. Do not introduce irrelevant information.

7. Ensure the assignment, including a cover page, is completed by the date and time specified.

Question 1 [Total 10 marks]

Explain what happens to the width of a confidence interval estimate of µ (the true mean) when each of the following occurs:

(i)     The confidence level increases from 95% to 99%. [3 marks]

(ii)    The sample size decreases. [2 marks]

(iii)   The value of s (the sample standard deviation) increases. [2 marks]

(iv)   Explain the fundamental differences between point and interval estimates, stating clearly which is preferable, with intuitive reasoning. [3 marks]
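As a rough illustration of parts (i) to (iii), the sketch below computes the width of a large-sample interval, 2 × z × s / √n, for a few settings. The baseline values (s = 10, n = 100) are hypothetical and chosen only for illustration, and the normal critical value z is used as an assumption in place of the t critical value.

```python
from math import sqrt
from statistics import NormalDist


def ci_width(s, n, level):
    """Width of a large-sample confidence interval for the mean: 2 * z * s / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    return 2 * z * s / sqrt(n)


# Hypothetical baseline values, not taken from any assignment data file.
print("baseline (95%, n=100, s=10):", round(ci_width(10, 100, 0.95), 3))
print("confidence level 99%:       ", round(ci_width(10, 100, 0.99), 3))
print("smaller sample (n=25):      ", round(ci_width(10, 25, 0.95), 3))
print("larger s (s=20):            ", round(ci_width(20, 100, 0.95), 3))
```

Comparing each printed width with the baseline shows the direction of the change; the explanation of why it changes is still expected in your written answer.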

Question 2 [Total 12 marks]

Previously, it has been established that the average score of high school seniors across five subjects (i.e. reading, writing, math, science, and social studies) was 50 marks. The Education Department claims that, owing to experienced teaching staff and improved teaching and learning methods, students are now likely to obtain an average score greater than 50. Using the HSSTraining.sav data file:


You must use your subsample of the HSSTraining.sav data. Your sample should consist of 200 observations starting from the patient whose ID is the same as the last three digits of your student number. For example, if your student number is 20275749, you would use individuals 49 to 248.

Column A: the patient ID

Column B: gender [0=Male and 1=Female]

Column C: ethnic group [1=Hispanic, 2=Asian, 3=African-American and 4=White]

Column D: socioeconomic group [1=low, 2=middle and 3=high]

Column E: type of school [1=public and 2=private]

Column F: type of program [1=general, 2=academic and 3=vocational]

Column G: reading score

Column H: writing score

Column I: Maths score

Column J: science score

Column K: social studies score

(i)      State the appropriate null and alternative hypotheses based on the above scenario (claim). [2 marks]

(ii)     Test this claim made by the Education Department at 1% level of significance. [4 marks]

(iii)    Draw your conclusion explaining the results. [2 marks]

(iv)    Obtain and interpret the 97% confidence interval for the true mean. [2 marks]

(v)     Obtain and interpret the 95% confidence interval for the true mean. [2 marks]
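The mechanics of a right-tailed one-sample test of the mean, as in part (ii), might be sketched as follows. The function name and the short list of scores are hypothetical placeholders and are not taken from HSSTraining.sav; how the per-student score is formed from the five subjects is left to you. The sketch assumes a large sample, so it uses the normal critical value as an approximation to the t critical value with n − 1 degrees of freedom.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def right_tailed_mean_test(values, mu0, alpha):
    """Test H0: mu <= mu0 against H1: mu > mu0 from a list of sample values."""
    n = len(values)
    t_stat = (mean(values) - mu0) / (stdev(values) / sqrt(n))
    # Assumption: n is large (200 in the assignment), so the normal critical
    # value approximates the t critical value with n - 1 degrees of freedom.
    critical = NormalDist().inv_cdf(1 - alpha)
    return t_stat, critical, t_stat > critical


# Placeholder scores for illustration only; kept short for readability.
scores = [52, 47, 55, 61, 49, 58, 50, 53, 56, 48]
t_stat, critical, reject = right_tailed_mean_test(scores, mu0=50, alpha=0.01)
print(f"test statistic = {t_stat:.3f}, critical value = {critical:.3f}, reject H0: {reject}")
```

With a sample as small as this placeholder list, the exact t critical value from tables or statistical software would be the appropriate choice.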

Question 3 [Total 10 marks]

According to the organisers of a weight loss program, participants lose 5kg, on average, over the course of two months. Using the Losingweight.sav data file, answer the following questions.

You must use your subsample of the Losingweight.sav survey data. Your sample should consist of 100 observations starting from the respondent whose ID is the same as the last three digits of your student number. For example, if your student number is 20275749, you would use individuals 49 to 148.

Column A: the patient ID

Column B: diet assigned to participant [1=None, 2=Atkins and 3=vegetarian]

Column C: exercise level assigned to participant [1=None, 2= 30 minutes per day and 3= 60 minutes per day]

Column D: observed weight loss in kilos over last 2 months

Column E: Gender

(i)     Obtain the 95% confidence interval for the mean weight loss of all participants. [4 marks]

(ii)    Test the organisers' claim based on the above results, stating any necessary assumptions. [6 marks]
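One way to connect parts (i) and (ii) is to check whether the claimed mean of 5 kg lies inside the confidence interval. A minimal sketch follows, assuming a large-sample normal approximation; the function name and the weight-loss values are hypothetical placeholders, not values from Losingweight.sav.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_ci(values, level):
    """Approximate confidence interval for the mean: x_bar +/- z * s / sqrt(n)."""
    n = len(values)
    centre = mean(values)
    # Assumption: normal critical value used in place of the t critical value.
    margin = NormalDist().inv_cdf(0.5 + level / 2) * stdev(values) / sqrt(n)
    return centre - margin, centre + margin


# Placeholder weight losses in kg, for illustration only.
losses = [4.2, 5.1, 3.8, 6.0, 4.9, 5.5, 4.4, 3.9, 5.2, 4.7]
low, high = mean_ci(losses, 0.95)

# A two-sided test at the 5% level does not reject the claimed mean
# when that value lies inside the 95% interval.
claim_consistent = low <= 5 <= high
print(f"95% CI: ({low:.2f}, {high:.2f}); claim of 5 kg inside interval: {claim_consistent}")
```

The equivalence between the interval and the test holds only for a two-sided test at the matching significance level, which is itself an assumption about how the claim is to be tested.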

Question 4 [Total 10 marks]

A group of researchers are interested in studying the prevalence of obesity, diabetes, and other cardiovascular risk factors in Subang Jaya, Selangor. To gain more insight into this question, 1150 subjects were interviewed and some of the results obtained are compiled in the data file A1 S2 2023.xls. The columns provide the following information:

Column A: the patient ID

Column B: the level of stabilised glucose

Column C: The total level of cholesterol

Column D: the level of high-density-lipoprotein (“good” cholesterol)

Column E: the weight of the patient

Column F: the gender of the patient

Column G: the type of body frame (small, medium, large)

The data is available in the “A1 S2 2023.xls” file on Moodle. You must use your subsample of the survey data. Your sample will consist of 200 observations starting from the respondent whose ID is the same as the last three digits of your student number. For example, if your student number is 20275749, you would use individuals 749 to 948.
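The subsample rule for this question can be expressed as a small helper. This is only a sketch: the function name is hypothetical, and it does not handle cases the instructions leave open, such as a range that runs past the last ID in the file.

```python
def subsample_ids(student_number, size):
    """Return (first_id, last_id) for a subsample that starts at the
    last three digits of the student number and contains `size` IDs."""
    start = int(str(student_number)[-3:])
    return start, start + size - 1


# Reproduces the worked example: student number 20275749, 200 observations.
print(subsample_ids(20275749, 200))  # (749, 948)
```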

All tables, graphs and comments for this question should be placed in the designated spaces in the Worksheet Results.

Construct a 95% confidence interval for:

(i)     The true mean total level of cholesterol for males. [2 marks]

(ii)    The true mean total level of cholesterol for females. [2 marks]

(iii)   The true mean level of cholesterol for each type of body frame. [3 marks]

(iv)   What conclusions can you derive from these confidence intervals? [3 marks]

Question 5 [Total 8 marks]

A study claims that the average height of young adult females in 2022 (that is, females between 18 and 24 years of age) in Kuala Lumpur is greater than 20 years ago. The study claims that the average height of the young adult female population 20 years ago was 160 cm. Your firm manufactures female clothing targeted at the young adult female market, and hence the current average height of this population is of vital interest to you. To investigate the study's claim, an investigator randomly selected 100 young adult females, measured their heights, and found the mean height to be 168.5 cm. From the last census, the standard deviation is known to be 27.5 cm.

a)  Use the 5% significance level to test whether the average height of the young adult female population in 2022 is greater than 20 years ago. [4 marks]

b)  How would you explain your findings in part (a) to a non-statistician on your firm's board? [2 marks]

c)  What is the p-value for the test performed in part (a)? Interpret this value. [2 marks]
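Because the population standard deviation is stated as known, parts (a) and (c) fit a right-tailed z-test. The arithmetic might be sketched as below, using only the figures given in the question; the variable names are illustrative, and the hypotheses H0: µ ≤ 160 against H1: µ > 160 are an assumed reading of the claim.

```python
from math import sqrt
from statistics import NormalDist

# Figures stated in the question.
mu0 = 160       # average height 20 years ago (cm)
x_bar = 168.5   # sample mean height (cm)
sigma = 27.5    # known population standard deviation (cm)
n = 100         # sample size
alpha = 0.05    # significance level

# Test statistic for H0: mu <= mu0 against H1: mu > mu0 (assumed hypotheses).
z = (x_bar - mu0) / (sigma / sqrt(n))

# Right-tailed critical value and p-value from the standard normal distribution.
critical = NormalDist().inv_cdf(1 - alpha)
p_value = 1 - NormalDist().cdf(z)

print(f"z = {z:.4f}, critical value = {critical:.4f}, p-value = {p_value:.5f}")
print("Reject H0" if z > critical else "Do not reject H0")
```

The interpretation of the decision and of the p-value, in plain language for part (b), remains part of the written answer.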