ETC1000 Business and Economic Statistics

Exam

Semester 1 2022

Background: Measuring Numeracy among Secondary School Students in Timor-Leste

This exam uses data from a numeracy test performed with Secondary School students in Timor-Leste in 2019.

More than 1,800 students were tested across 60 schools, with around 30 students per school. The schools were randomly chosen, and then 30 students were randomly chosen from the class lists of Grade 8 and 9 students enrolled at the school. Independent researchers then supervised the tests with those 30 students.

The numeracy test comprised 19 questions.  A range of other information was also collected from each child, including family income and parents’ education.

Instructions

Attempt all questions. The exam is worth 80 marks in total.

Section A (25 marks)

First we consider the data on family income.  Here are the descriptive statistics, histogram and Box-and-Whisker plot for the monthly family income per person for the full dataset.

Monthly Family Income per person (US$)

Statistic                   Value
Mean                        70.9588735
Standard Error              0.629781504
Median                      63.21
Mode                        65.942
Standard Deviation          26.99993104
Sample Variance             728.9962762
Kurtosis                    1.097971785
Skewness                    1.00512952
Range                       183.834
Minimum                     6.6865
Maximum                     190.5205
Sum                         130422.4095
Count                       1838
Confidence Level (95.0%)    1.235162881

1. Compare the median and mean incomes.  What does this tell you about skewness in the dataset?  Describe the concept of skewness in everyday language. (3 marks)

The mean ($71) exceeds the median ($63), which indicates positive skewness. This means there are a few very large values, much bigger than the “centre” of the data, which are pulling up the mean.

2. Calculate a 95% confidence interval for the mean income per person.  Interpret that interval in everyday language.  (3 marks)

70.96 - 1.24, to 70.96 + 1.24.  This is $69.72 to $72.20.  We are 95% sure that the true average monthly family income per person is between $69.72 and $72.20.
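As a check on the arithmetic, the interval can be recomputed from the unrounded table values. This is only a sketch: it takes the "Confidence Level (95.0%)" row as the margin of error, which is how Excel reports it, and the variable names are illustrative. Working from unrounded values puts the upper limit nearer $72.19; the answer above rounds the mean and margin first, which gives $72.20.

```python
# Values copied from the descriptive statistics table.
mean_income = 70.9588735
margin_95 = 1.235162881  # "Confidence Level (95.0%)" row, taken as the margin of error

# Confidence interval = point estimate +/- margin of error.
lower = mean_income - margin_95
upper = mean_income + margin_95

print(f"95% CI for mean income per person: ${lower:.2f} to ${upper:.2f}")
```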

3. In this case, the confidence interval you calculated will turn out to be very narrow. What does this say about the accuracy of your estimate? Knowing how confidence intervals are calculated, what is the main reason the interval is so narrow in this case? (3 marks)

It is a very accurate estimate. The width of the interval depends on the standard deviation (s) and the sample size (n). The standard deviation is not that small ($27), but the sample is very large (n = 1838), so the standard error is small and the estimate is very accurate.
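The role of sample size can be seen by computing the standard error, s divided by the square root of n, directly. The sketch below uses s and n from the table; the smaller sample sizes in the loop are hypothetical and are included only to show how the standard error would grow if fewer students had been surveyed.

```python
import math

s = 26.99993104  # sample standard deviation from the table
n = 1838         # sample size (Count) from the table


def standard_error(sd, size):
    """Standard error of the mean: sd / sqrt(size)."""
    return sd / math.sqrt(size)


se = standard_error(s, n)
print(f"n = {n}: standard error = {se:.4f}")

# Hypothetical smaller samples with the same spread, for comparison only.
for smaller_n in (30, 100, 500):
    print(f"n = {smaller_n}: standard error = {standard_error(s, smaller_n):.4f}")
```

The value for n = 1838 should agree with the "Standard Error" row of the table.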

4. Using the Box-and-Whisker plot, report the first and third quartile of the incomes, and explain what these values mean. (3 marks)

1st quartile: $52.35. 25% of children have family income per person below $52.35.

3rd quartile: $87.52. 25% of children have family income per person above $87.52.

5. Explain what the following values in the Box-and-whisker plot mean: 6.69, 70.95. (2 marks)

$6.69 is the smallest value - the poorest household in the sample.

$70.95 is the mean income.

6. The Official Poverty Line for Timor-Leste is $57 per person per month. A politician recently announced that “more than half the population live below the poverty line”. Use one statistic from all the information above, and explain in plain language why the politician must be wrong. (2 marks)

The median is $63.21, which means that half of the sample have income per person below this amount. Since $57 is smaller than the median, fewer than half can be below the poverty line.

7. An expert in a university responded to the politician’s comment with the following: “the poverty rate is actually much lower: we estimate it is 35.9%, with a 95% confidence interval of 33.7% to 39.1%”. The University expert has made an error in their calculation. How do we know that? Explain. (2 marks)

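A confidence interval should be symmetric around the point estimate (here, the estimated rate of 35.9%). This one is not symmetric: the lower limit is 2.2 percentage points below the estimate, but the upper limit is 3.2 percentage points above it.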

8. Let’s assume the university expert’s confidence interval is correct. Using just that confidence interval information alone, perform a hypothesis test of whether the rate is below what the politician is claiming. Explain the steps of the hypothesis test, and explain how you can reach a conclusion even though you don’t have a p-value. (5 marks)

H0: rate = 50%, H1: rate < 50%. The whole 95% confidence interval (33.7% to 39.1%) lies well below 50%, so the null hypothesis should be rejected. If the interval does not contain the null hypothesis value, the p-value must be small (below 5% for a 95% interval), so H0 is rejected even without calculating a p-value.
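The decision rule used here can be written as a one-line check. This is a minimal sketch, and the helper name `rejects_null` is hypothetical. It encodes the usual link between a 95% confidence interval and a two-sided test at the 5% level; because the interval here lies entirely below 50%, the one-sided test against H1: rate < 50% also rejects.

```python
def rejects_null(ci_low, ci_high, null_value):
    """Return True when the null value lies outside the confidence interval."""
    return not (ci_low <= null_value <= ci_high)


# The expert's 95% confidence interval for the poverty rate, in percent.
ci_low, ci_high = 33.7, 39.1
null_rate = 50.0  # the politician's claim sits at (or above) 50%

decision = rejects_null(ci_low, ci_high, null_rate)
print("Reject H0" if decision else "Do not reject H0")
```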

9. Is it valid to use a poverty rate calculated from this particular sample as an estimate of the poverty rate for the whole country?  Explain your logic - think carefully about who the sample used in this study includes. (2 marks)

Not strictly valid, as this sample only comprises households with children in Grade 8 or 9. So, for example, households with only young children would not be included. The sample therefore does not represent all households in the country.

Section B. (8 marks)

The following table shows the percentage of parents with the different levels of education.

For example, the table shows that in 1.25% of cases, the father has primary education or less, and the mother has secondary education or higher.

                                        Father's Education
Mother's Education           Primary School or below   Secondary School or higher   Total
Primary School or below      57.04%                    30.34%                       87.38%
Secondary School or higher   1.25%                     11.36%                       12.62%
Total                        58.29%                    41.71%                       100.00%

10.  What proportion of mothers have secondary education or higher? (1 mark)

12.62%

11. What can we learn from comparing the values 12.62% and 41.71%? (2 marks)

Many more fathers than mothers had access to secondary school - more than triple the proportion!

12. If the Father has a Secondary education or higher, what are the chances of the mother also having that level of education? (2 marks)

11.36 / 41.71 = 0.272, or about 27%.

13. It is widely believed that the level of mother’s education and father’s education are NOT independent.  Does this data support that belief?  Explain your reasoning. (3 marks)

If they were independent, Pr(mother having secondary education, given father does not) would be the same as Pr(mother having secondary education, given the father does). The first probability is 1.25 / 58.29 (about 2%), clearly much smaller than 11.36 / 41.71 (about 27%). That is, if the father doesn’t have secondary education, the mother is much less likely to have it. The two are definitely NOT independent.
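The two conditional probabilities can be reproduced from the four cells of the table. The sketch below is illustrative only: the dictionary layout and function name are assumptions, and column totals are summed from the cells, so they can differ from the printed totals by 0.01 because of rounding in the table.

```python
# Joint percentages from the table, keyed by (mother's level, father's level).
joint = {
    ("primary", "primary"): 57.04,
    ("primary", "secondary"): 30.34,
    ("secondary", "primary"): 1.25,
    ("secondary", "secondary"): 11.36,
}


def p_mother_secondary_given_father(father_level):
    """Pr(mother has secondary or higher | father's level)."""
    father_total = sum(v for (_, f), v in joint.items() if f == father_level)
    return joint[("secondary", father_level)] / father_total


p_given_primary = p_mother_secondary_given_father("primary")
p_given_secondary = p_mother_secondary_given_father("secondary")
print(f"Pr(mother secondary | father primary)   = {p_given_primary:.3f}")
print(f"Pr(mother secondary | father secondary) = {p_given_secondary:.3f}")

# Under independence the joint proportion would equal the product of the marginals.
p_mother = (joint[("secondary", "primary")] + joint[("secondary", "secondary")]) / 100
p_father = (joint[("primary", "secondary")] + joint[("secondary", "secondary")]) / 100
p_both = joint[("secondary", "secondary")] / 100
print(f"Product of marginals = {p_mother * p_father:.3f}, observed joint = {p_both:.3f}")
```

The large gap between the two conditional probabilities, and between the product of the marginals and the observed joint proportion, is what the answer above relies on.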

Section C. (8 marks)

Consider the following two regression outputs. The dependent variable in both cases is the child’s score in the numeracy test.  The independent variables are:

mother = 1 if mother has secondary education or higher, =0 if primary or lower

father = 1 if father has secondary education or higher, =0 if primary or lower