
Semester Two 2019

ETF1100

BUSINESS STATISTICS PAPER 1 OF 1

Question 1 (18 marks)

The Census data reports information about weekly incomes of households in the format shown in the table below. This is data for single-parent households in Melbourne in 2016.

| Income Range | Number of Households | Cumulative Number of Households | Cumulative % of Households |
|---|---|---|---|
| Negative/Nil income | 4,979 | 4,979 | 3.23% |
| $1-$149 | 1,921 | 6,900 | 4.47% |
| $150-$299 | 5,254 | 12,154 | 7.88% |
| $300-$399 | 5,733 | 17,887 | 11.59% |
| $400-$499 | 8,952 | 26,839 | 17.39% |
| $500-$649 | 14,475 | 41,314 | 26.77% |
| $650-$799 | 15,715 | 57,029 | 36.96% |
| $800-$999 | 17,688 | 74,717 | 48.42% |
| $1,000-$1,499 | 33,628 | 108,345 | 70.21% |
| $1,500-$1,999 | 20,354 | 128,699 | 83.40% |
| $2,000-$2,499 | 12,629 | 141,328 | 91.59% |
| $2,500-$2,999 | 5,312 | 146,640 | 95.03% |
| $3,000-$3,999 | 5,014 | 151,654 | 98.28% |
| $4,000 or more | 2,654 | 154,308 | 100.00% |
| Grand Total | 154,308 | | |


(a). If you want to calculate the mean weekly income for these households, what are some of the problems you encounter when income is presented as a range? What approximations would you need to make? (2 marks)
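As an illustration of the approximations part (a) refers to, the sketch below estimates a mean from the grouped table by replacing each range with its midpoint. The values chosen for the two classes without usable bounds (`ASSUMED_NIL_VALUE` and `ASSUMED_TOP_VALUE`) are assumptions made purely for illustration; they are not given in the paper, and changing them changes the result.

```python
# Grouped weekly-income data from the table: (lower bound, upper bound, households).
# None marks a bound that the table does not give.
income_classes = [
    (None, 0, 4979),     # Negative/Nil income
    (1, 149, 1921),
    (150, 299, 5254),
    (300, 399, 5733),
    (400, 499, 8952),
    (500, 649, 14475),
    (650, 799, 15715),
    (800, 999, 17688),
    (1000, 1499, 33628),
    (1500, 1999, 20354),
    (2000, 2499, 12629),
    (2500, 2999, 5312),
    (3000, 3999, 5014),
    (4000, None, 2654),  # $4,000 or more (open-ended)
]

ASSUMED_NIL_VALUE = 0.0     # assumption: treat negative/nil income as $0
ASSUMED_TOP_VALUE = 5000.0  # assumption: representative value for "$4,000 or more"


def class_value(lower, upper):
    """Return the single income value used to represent one class."""
    if lower is None:
        return ASSUMED_NIL_VALUE
    if upper is None:
        return ASSUMED_TOP_VALUE
    return (lower + upper) / 2  # midpoint of a closed range


def grouped_mean(classes):
    """Approximate the mean as a household-weighted average of class values."""
    total_income = sum(class_value(lo, hi) * n for lo, hi, n in classes)
    total_households = sum(n for _, _, n in classes)
    return total_income / total_households


approx_mean = grouped_mean(income_classes)
print(f"Approximate mean weekly income: ${approx_mean:,.0f}")
```

The midpoint rule assumes incomes are spread evenly within each range, which is itself only an approximation.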

(b). Now we want to calculate the median income for these households.

i. What is the definition of the median? (1 mark)

ii. Based on the table above, describe how you would calculate the median income for these households. (2 marks)

iii. Give an estimate of the median using the method outlined in the previous question. (1 mark)
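One possible approach to part (b), sketched here under stated assumptions, is to locate the class containing the middle household using the cumulative counts and then interpolate linearly within that class. The sketch assumes households are spread evenly within a class and treats each class as running up to the lower limit of the next one (for example, $1,000 up to $1,500); neither assumption comes from the paper.

```python
# (label, lower limit, exclusive upper limit, households) taken from the table.
# None marks a limit that the table does not give.
classes = [
    ("Negative/Nil income", None, 1, 4979),
    ("$1-$149", 1, 150, 1921),
    ("$150-$299", 150, 300, 5254),
    ("$300-$399", 300, 400, 5733),
    ("$400-$499", 400, 500, 8952),
    ("$500-$649", 500, 650, 14475),
    ("$650-$799", 650, 800, 15715),
    ("$800-$999", 800, 1000, 17688),
    ("$1,000-$1,499", 1000, 1500, 33628),
    ("$1,500-$1,999", 1500, 2000, 20354),
    ("$2,000-$2,499", 2000, 2500, 12629),
    ("$2,500-$2,999", 2500, 3000, 5312),
    ("$3,000-$3,999", 3000, 4000, 5014),
    ("$4,000 or more", 4000, None, 2654),
]


def grouped_median(rows):
    """Return (class label, interpolated median) for grouped data."""
    total = sum(count for _, _, _, count in rows)
    target = total / 2  # position of the middle household
    cumulative = 0
    for label, lower, upper, count in rows:
        if cumulative + count >= target:
            if lower is None or upper is None:
                raise ValueError("median falls in an open-ended class")
            # Fraction of the way through the median class.
            fraction = (target - cumulative) / count
            return label, lower + fraction * (upper - lower)
        cumulative += count
    raise ValueError("no data")


median_class, median_estimate = grouped_median(classes)
print(median_class, round(median_estimate, 2))
```

Reading the cumulative percentages directly gives the same median class: 48.42% of households fall below $1,000 and 70.21% below $1,500, so the 50% point lies in the $1,000-$1,499 range.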

(c). The mean for this data has been found to be approximately $1,302 per week. The median you found in part (b) should be somewhat smaller than the mean of $1,302. What does this tell us about the shape of the distribution of the data set? Give some intuition for why this shape produces a mean that is larger than the median. (3 marks)

(d). The sample standard deviation of income across households has been calculated for this set of data and is $924.

i. Write down the formula for the sample standard deviation. (2 marks)

ii. Explain the sample standard deviation formula in words and how it measures the spread of the data. (3 marks)

iii. Another measure of the spread/dispersion of the data is the range. Explain how the range is calculated and why the standard deviation is generally a better measure of spread/dispersion than the range. (2 marks)

iv. It is common to construct a confidence interval based on the mean +/- 2 standard deviations. This usually covers 95% of the data if the data is normally distributed. For our data, with a mean of $1,302, a standard deviation of $924 and the median you estimated earlier, what is problematic about this approach? (2 marks)
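The interval described in the last part can be computed directly from the two figures quoted in the question. The short sketch below only does that arithmetic; the variable names are illustrative.

```python
mean_income = 1302.0  # mean weekly income quoted in the question ($)
std_income = 924.0    # sample standard deviation quoted in the question ($)

# Interval of two standard deviations either side of the mean.
lower_bound = mean_income - 2 * std_income
upper_bound = mean_income + 2 * std_income

print(f"Lower bound: {lower_bound:,.0f}")  # -546
print(f"Upper bound: {upper_bound:,.0f}")  # 3,150
```

The lower bound is well below zero, while the table puts only 3.23% of households in the negative/nil income class.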

Question 2 (18 marks)

In order to analyse the data on poverty rates amongst households in the Melbourne region in 2016 we have created the data set shown in the snapshot below.

The snapshot shows the first 10 data points. The variables are defined as follows:

• Household Type = either “Single Parent” or “Two Parents”.

• Number of Children = the number of children in the household.

• Number of Households = the number of households in the Melbourne region in 2016 that fall into the category.

• Poverty Status = “Yes” means the household is in poverty and “No” means it is not in poverty.

(a). From the data above we have produced two pivot tables.

The first pivot table, shown below, illustrates the total number of Melbourne households in each category (household type and number of children) in 2016.

The second pivot table, shown below, shows the number of Melbourne households which are in poverty in each category (household type and number of children) in 2016.

i. Using these two pivot tables, show how you could calculate the poverty rate in this period for Melbourne households with a single parent and two children. (2 marks)

ii. Using these two pivot tables, show how you could calculate the overall poverty rate in Melbourne in this period. (2 marks)

iii. Below we report certain probabilities:

P( Poverty = Yes | Household Type = Single Parent ∩ Number of Children = 1 ) = 0.36

P( Poverty = Yes | Household Type = Single Parent ∩ Number of Children = 4 ) = 0.85

P( Poverty = Yes | Household Type = Two Parents ∩ Number of Children = 1 ) = 0.13

P( Poverty = Yes | Household Type = Two Parents ∩ Number of Children = 4 ) = 0.66

Interpret these probabilities and discuss how they vary with the number of children and the household type. Give some intuition for these patterns. (4 marks)

(b). The pivot table below focuses just on households in poverty and is presented as “% of Total”.

i. What does the value 11.94% mean in the pivot table? (1 mark)

ii. What is the probability that a household in poverty has two parents and one child? (1 mark)

iii. What is the probability that a household in poverty has more than two children? (2 marks)

(c). The pivot table below focuses just on households in poverty and is presented as “% of Column”.

i. What does the value 57.70% mean in the pivot table? (1 mark)

ii. What does the value 22.76% mean in the pivot table? (1 mark)

iii. Suppose you are interested in investigating whether the number of children is independent of whether the household has one or two parents, using the data in the pivot table above. If these two variables were independent, write down a probability statement that you would expect to hold. (2 marks)

iv. Use data from the pivot table above to show that number of parents and number of children are NOT independent. (2 marks)

Question 3 (20 marks)

(a). The following table shows trends in the composition of households in Melbourne over the 5-yearly censuses from 1981 to 2016.

| Census Year | Number of One-Parent Households | Number of Two-Parent Households | Total Number of Households | % of One-Parent Households |
|---|---|---|---|---|
| 1981 | 83,968 | 276,580 | 360,549 | 24.88 |
| 1986 | 94,668 | 308,105 | 402,773 | 24.43 |
| 1991 | 106,145 | 339,477 | 445,622 | 24.54 |
| 1996 | 117,683 | 375,428 | 493,111 | 24.80 |
| 2001 | 131,898 | 413,677 | 545,576 | 24.61 |
| 2006 | 146,220 | 462,517 | 608,737 | 24.02 |
| 2011 | 161,212 | 505,943 | 667,155 | 24.16 |
| 2016 | 174,769 | 563,155 | 737,924 | 23.68 |

i. Show how to calculate the average annual growth in the number of families in Melbourne between 1981 and 2016. (2 marks)

ii. Calculate the percentage of two-parent households in Melbourne in 2016. (1 mark)
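The two calculations in part (a) can be sketched from the table as follows. The sketch assumes that "number of families" refers to the "Total Number of Households" column and reports both a compound annual growth rate and a simple average increase per year, since the question does not say which is intended.

```python
households_1981 = 360_549   # Total Number of Households, 1981
households_2016 = 737_924   # Total Number of Households, 2016
two_parent_2016 = 563_155   # Number of Two-Parent Households, 2016

years = 2016 - 1981  # 35 years between the two censuses

# Compound annual growth rate: the constant yearly rate that turns the
# 1981 total into the 2016 total over 35 years.
compound_growth = (households_2016 / households_1981) ** (1 / years) - 1

# Simple average increase in the number of households per year.
average_increase = (households_2016 - households_1981) / years

# Share of two-parent households in 2016.
two_parent_share = two_parent_2016 / households_2016

print(f"Compound annual growth: {compound_growth:.2%}")
print(f"Average increase per year: {average_increase:,.0f} households")
print(f"Two-parent share in 2016: {two_parent_share:.2%}")
```

The two-parent share can also be obtained as 100% minus the 23.68% one-parent share reported for 2016.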

(b). It is claimed that there has been a fall in the number of one-parent families in Melbourne since 1981. In order to investigate this we have estimated a regression model with a linear time trend in “Census Year”, with the dependent variable being “% of One-Parent Households”. The results are shown below:

i. Should seasonal dummy variables have been included in this time series regression model? (2 marks)

ii. Interpret the coefficient for the intercept in this regression model and discuss whether it is meaningful. (2 marks)

iii. Interpret the coefficient for Census Year in this regression model. Is the sign consistent with the claim in the question? (2 marks)

iv. Undertake a hypothesis test at the 5% significance level of whether there has been a change in the proportion of one-parent households over the time span examined. (4 marks)

v. How would your conclusion have differed if you had chosen a 1% significance level? (2 marks)

vi. Show how to calculate the predicted value for the proportion of one-parent households in 2050. (2 marks)

vii. How reliable do you think your prediction in the previous question will be? Justify your answer. (3 marks)
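The regression output is not reproduced in this copy of the paper, so the sketch below refits a linear trend by ordinary least squares to the "% of One-Parent Households" column of the table in part (a) and then evaluates the fitted line at 2050. The coefficients it produces come from that refit and may not match the paper's regression output exactly; the function name `fit_linear_trend` is illustrative.

```python
census_years = [1981, 1986, 1991, 1996, 2001, 2006, 2011, 2016]
one_parent_pct = [24.88, 24.43, 24.54, 24.80, 24.61, 24.02, 24.16, 23.68]


def fit_linear_trend(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = fit_linear_trend(census_years, one_parent_pct)

# Prediction: substitute the year into the fitted equation.
predicted_2050 = intercept + slope * 2050

print(f"Intercept: {intercept:.3f}")
print(f"Slope per year: {slope:.5f}")
print(f"Predicted % of one-parent households in 2050: {predicted_2050:.2f}")
```

The year 2050 lies 34 years beyond the last census in the table, so the prediction extrapolates well outside the range of the eight observations used in the fit.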