ETF1100 Business Statistics Semester Two 2020
Question 1
(a) Let us begin by looking at some statistics on covid-19 for Australia and the United States
shown in Exhibit 1. This data is for 2 October 2020.
Exhibit 1
Country | Population | Total Cases | Total Deaths | Total Cases per Million | Total Deaths per Million
Australia | 25,499,881 | 27,136 | 894 | 1,064.162 | 35.059
United States | 331,002,647 | 7,417,845 | 209,794 | 22,410.229 | 633.814
(i). Is it useful to compare total cases in Australia with those in the United States? Explain your answer and how the countries could be better compared. (2 marks)
(ii). With reference to Exhibit 1, outline how “Total Cases per Million” is calculated for
Australia from the other data in the table. (2 marks)
(iii). Explain what “Total Deaths per Million” measures and compare this statistic for
Australia and the United States. (2 marks)
(iv). A further important statistic, which is not shown in Exhibit 1, is the death rate per
covid-19 case. Using the numbers in Exhibit 1, calculate this figure for Australia and the United States and compare the numbers. (2 marks)
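The per-million and per-case figures discussed in part (a) are simple ratios. The following is a minimal Python sketch of that arithmetic using the Exhibit 1 values. The function names `per_million` and `death_rate_percent` are illustrative assumptions and are not part of the exam.

```python
# Illustrative sketch: ratios behind the Exhibit 1 statistics.
# Function names are hypothetical; the input figures are copied from Exhibit 1.

def per_million(count, population):
    """Scale a raw count to a rate per one million people."""
    return count / population * 1_000_000


def death_rate_percent(deaths, cases):
    """Deaths as a percentage of recorded cases."""
    return deaths / cases * 100


exhibit_1 = {
    "Australia": {"population": 25_499_881, "cases": 27_136, "deaths": 894},
    "United States": {"population": 331_002_647, "cases": 7_417_845, "deaths": 209_794},
}

for country, d in exhibit_1.items():
    print(
        country,
        round(per_million(d["cases"], d["population"]), 3),   # cases per million
        round(per_million(d["deaths"], d["population"]), 3),  # deaths per million
        round(death_rate_percent(d["deaths"], d["cases"]), 2),  # death rate per case (%)
    )
```

Dividing by population before scaling puts both countries on the same per-person basis, which is what makes the rates comparable where the raw totals are not.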
(b) In Exhibit 2 we outline some summary statistics across all countries, for a single day (2
October 2020), on total deaths and total deaths per million persons.
Exhibit 2
Statistic | total_deaths | total_deaths_per_million
Mean | 4899.516746 | 126.5259139
Standard Error | 1399.483337 | 13.90254442
Median | 138 | 37.485
Mode | 0 | 0
Standard Deviation | 20232.0959 | 200.9867531
Sample Variance | 409337704.6 | 40395.67491
Kurtosis | 61.68438692 | 7.071985387
Skewness | 7.319112388 | 2.496156331
Range | 207808 | 1237.551
Minimum | 0 | 0
Maximum | 207808 | 1237.551
Sum | 1023999 | 26443.916
Count | 209 | 209
(i). How many countries are included in the data? (1 mark)
(ii). Are the two variables skewed and if so in what direction? Provide reasons for your
answers. (2 marks)
(iii). What is the mode for total deaths? Interpret what this value means in the context of the data and whether you think it is informative. (2 marks)
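Several entries in Exhibit 2 are tied together by simple identities, which can help when reading a table of descriptive statistics. The sketch below, with values copied from Exhibit 2 and hypothetical helper names, checks that the mean equals the sum divided by the count and that the range equals the maximum minus the minimum. It also compares the mean with the median, a common rule of thumb (not a proof) for spotting a long tail alongside the skewness coefficient.

```python
# Illustrative sketch: how some Exhibit 2 entries relate to one another.
# Values are copied from Exhibit 2; helper names are hypothetical.

exhibit_2 = {
    "total_deaths": {
        "mean": 4899.516746, "median": 138, "minimum": 0, "maximum": 207808,
        "range": 207808, "sum": 1023999, "count": 209, "skewness": 7.319112388,
    },
    "total_deaths_per_million": {
        "mean": 126.5259139, "median": 37.485, "minimum": 0, "maximum": 1237.551,
        "range": 1237.551, "sum": 26443.916, "count": 209, "skewness": 2.496156331,
    },
}


def mean_from_sum(stats):
    """The mean is the sum of the observations divided by their count."""
    return stats["sum"] / stats["count"]


def range_from_extremes(stats):
    """The range is the maximum minus the minimum."""
    return stats["maximum"] - stats["minimum"]


def mean_exceeds_median(stats):
    """Heuristic only: a mean above the median often accompanies a long right tail."""
    return stats["mean"] > stats["median"]


for name, stats in exhibit_2.items():
    print(name, mean_from_sum(stats), range_from_extremes(stats),
          mean_exceeds_median(stats), stats["skewness"] > 0)
```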
(c) In Exhibit 3 we provide descriptive statistics across countries on 2 October 2020 for the variables population and GDP per capita.
Exhibit 3
Statistic | population | gdp_per_capita
Mean | 37083651.23 | 19284.98379
Standard Error | 9880221.384 | 1459.349887
Median | 6871287 | 13031.5265
Mode | #N/A | #N/A
Standard Deviation | 142836703.6 | 19687.70634
Sample Variance | 2.04023E+16 | 387605781.1
Kurtosis | 81.75228927 | 4.107205715
(i). What is the mean population and GDP per capita of the countries in our data? Also report the units of measurement for each of these values. (2 marks)
(ii). The standard error for the population variable is 9880221.384 while the standard
deviation is 142836703.6. Write a formula which shows the relationship between these two values. (1 mark)
(iii). The value of the mode for GDP per capita is “#N/A”. Explain what this means and why
you think this has occurred. (2 marks)
(iv). Some countries have missing values for GDP per capita. How can we tell this? (1 mark)
(v). How do you think these missing values bias mean GDP per capita in terms of accurately measuring the income level of countries in our data? (2 marks)
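The standard error of a mean and the sample standard deviation are linked through the number of observations: the standard error is the standard deviation divided by the square root of the count. The sketch below applies this to the population figures quoted in part (c). The helper names are hypothetical, and the count of 209 is an assumption taken from Exhibit 2 (it presumes the population variable has no missing values).

```python
import math

# Illustrative sketch: standard error of the mean versus standard deviation.
# Helper names are hypothetical; sd and se values are quoted in part (c).


def standard_error(sd, n):
    """Standard error of the mean: sample standard deviation over sqrt(n)."""
    return sd / math.sqrt(n)


def implied_sample_size(sd, se):
    """Rearranging the same relation gives the number of observations used."""
    return (sd / se) ** 2


population_sd = 142836703.6
population_se = 9880221.384
assumed_n = 209  # assumption: the Count reported in Exhibit 2

print(standard_error(population_sd, assumed_n))
print(implied_sample_size(population_sd, population_se))
```

Because the relation can be rearranged to recover the count, the same two numbers reported for any variable indicate how many non-missing observations went into its statistics.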
(d) Previously we saw that a number of countries had apparently recorded zero total deaths from covid-19. Exhibit 4 lists these countries and some additional statistics.
Exhibit 4
Location | Total Deaths | Total Deaths per Million | Population | GDP per Capita
Anguilla | 0 | 0 | 15002 |
Bhutan | 0 | 0 | 771612 | 8708.597
(i). What are some of the features of these countries? (2 marks)
(ii). Data can be accurate or inaccurate. With reference to the data in Exhibit 4, identify a
country where you think the report of zero deaths is more likely to be accurate and identify a country where you think it is more likely to be inaccurate. In both cases provide reasons for your answers. (2 marks)
Question 2
(a) I have created two categorical variables using data from 2 October 2020. Each variable takes
three values:
IncomeGroup:
Low: if a country’s GDP per capita is below $5,000.
Middle: if a country’s GDP per capita is from $5,000 to below $15,000.
High: if a country’s GDP per capita is $15,000 or more.
CovidImpact:
Mild: if a country’s total covid-19 cases per million was less than 500.
Moderate: if a country’s total covid-19 cases per million was from 500 to less than 5,000.
Severe: if a country’s total covid-19 cases per million was 5,000 or more.
Using these variables, I have constructed a pivot table shown in Exhibit 5. The pivot table reports the number of countries in each of the categories on 2 October 2020.
Exhibit 5
date | 2020-10-02
CovidImpact \ IncomeGroup | High | Low | Middle | Grand Total
Mild | 8 | 27 | 13 | 48
Moderate | 31 | 35 | 21 | 87
Severe | 43 | 13 | 18 | 74
Grand Total | 82 | 75 | 52 | 209
(i). Which of the nine combinations of these two categorical variables is most common and how many countries does it include? (2 marks)
(ii). What proportion of the countries in our data are middle income countries? (1 mark)
(iii). If I note that 8/209 countries have high incomes and mild covid-19 impact, what sort
of probability am I describing? (1 mark)
(iv). Calculate the following three probabilities:
P( CovidImpact=Severe | IncomeGroup=High )
P( CovidImpact=Severe | IncomeGroup=Middle )
P( CovidImpact=Severe | IncomeGroup=Low ) (3 marks)
(v). Give an intuitive definition of the concept of statistical independence. Using the results in the previous question, outline whether you think a country’s income level is independent from the severity with which it is affected by covid-19. (2 marks)
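Joint, marginal and conditional probabilities can all be read off a pivot table of counts. The sketch below is one way to do this in Python with the Exhibit 5 counts. Assigning the three rows to Mild, Moderate and Severe is an assumption, based on the alphabetical ordering of the columns and the 8/209 figure quoted in (iii); the helper names are hypothetical.

```python
# Illustrative sketch: probabilities from a two-way table of counts.
# Row labels are an assumption (see lead-in); helper names are hypothetical.

counts = {
    "Mild": {"High": 8, "Low": 27, "Middle": 13},
    "Moderate": {"High": 31, "Low": 35, "Middle": 21},
    "Severe": {"High": 43, "Low": 13, "Middle": 18},
}

grand_total = sum(sum(row.values()) for row in counts.values())


def column_total(income):
    """Number of countries in one income group."""
    return sum(row[income] for row in counts.values())


def joint(impact, income):
    """P(CovidImpact = impact and IncomeGroup = income)."""
    return counts[impact][income] / grand_total


def marginal(impact):
    """P(CovidImpact = impact), ignoring income group."""
    return sum(counts[impact].values()) / grand_total


def conditional(impact, income):
    """P(CovidImpact = impact | IncomeGroup = income)."""
    return counts[impact][income] / column_total(income)


for income in ("High", "Middle", "Low"):
    print(income, round(conditional("Severe", income), 4))
print("Marginal", round(marginal("Severe"), 4))
```

If two variables were independent, each conditional probability would be close to the corresponding marginal probability, so comparing the printed values is one informal check.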
(b) Let us now investigate the covid-19 death rate. This is the ratio of persons who died from
covid-19 relative to the number of persons who had the disease, expressed as a percentage.
Exhibit 6
(i). In Exhibit 6 I used Excel to automatically construct a histogram of the covid-19 death rate across countries. Comment on three problems, or things that could be improved, with regard to this histogram. (3 marks)
2022-06-14