Basic Statistics Assignment #2
Individual Assignment
1. You work for a popular pizzeria in suburban Rochester that has recently considered providing a guarantee on delivery times (e.g. “Guaranteed delivery within 50 minutes, or your pizza is free!”). However, you are quite concerned about just how many pizzas you will be giving away for free under such a policy. Your shift manager claims that delivery times are uniformly distributed between 30 minutes and an hour, but you are not so sure about this. Luckily, you have been recording the actual delivery times for the past several years and have assembled a dataset with 1400 observed delivery times (DeliveryTimes.xlsx).
a. Using the included dataset, compute the sample average, the sample standard deviation and the sample variance of delivery time. You only need to report these three numbers here.
Mean=45.106
Var=6.099
Sd=2.469
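The three statistics in part a) could be reproduced with Python's standard library along the lines sketched here. The list `times` is a small hypothetical stand-in, not the contents of DeliveryTimes.xlsx; loading the real file is left out because its column layout is not given.

```python
import statistics

# Hypothetical stand-in values (minutes); NOT the real DeliveryTimes.xlsx data.
times = [44.2, 47.9, 43.1, 45.6, 46.3, 42.8, 48.0, 44.7]

def summarize(xs):
    """Return (sample mean, sample variance, sample standard deviation).

    variance() and stdev() divide by n - 1, i.e. they are the sample versions.
    """
    return statistics.mean(xs), statistics.variance(xs), statistics.stdev(xs)

mean, var, sd = summarize(times)
print(f"Mean={mean:.3f}  Var={var:.3f}  Sd={sd:.3f}")
```

The standard deviation is always the square root of the variance, which is a quick consistency check on the reported numbers (2.469 squared is about 6.10).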
b. Based on what you found in part a), are these findings consistent with delivery times being distributed uniformly between 30 and 60 minutes? Why or why not? Note: there is no need for a formal test here, just an informal discussion (for which the formulas for the mean and variance of a uniform distribution that you can look up on Wikipedia will be useful).
If X ~ U(30, 60), the mean is (30 + 60)/2 = 45, which is close to the sample mean of 45.106. However, the variance is (60 − 30)^2/12 = 75, which is far from the sample variance of 6.099. Therefore, the findings do not support the manager's claim.
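The comparison in part b) uses the textbook formulas for a uniform distribution on [a, b]: mean (a + b)/2 and variance (b − a)^2/12. A minimal sketch of the check, with the function names chosen here for illustration:

```python
def uniform_mean(a, b):
    """Mean of a uniform distribution on [a, b]."""
    return (a + b) / 2

def uniform_variance(a, b):
    """Variance of a uniform distribution on [a, b]."""
    return (b - a) ** 2 / 12

# Sample statistics reported in part a).
sample_mean, sample_var = 45.106, 6.099

theory_mean = uniform_mean(30, 60)      # 45.0
theory_var = uniform_variance(30, 60)   # 75.0

print(f"mean: theory {theory_mean} vs sample {sample_mean}")
print(f"variance: theory {theory_var} vs sample {sample_var}")
print(f"theoretical variance is {theory_var / sample_var:.1f} times the sample variance")
```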
c. Suppose you assume instead that delivery times are normally distributed (rather than uniform) with the mean and standard deviation you found in part a). Using the estimates from part a), construct an interval into which you expect 95% of delivery times to fall. Repeat this exercise, but replace 95% with 80%. How do these intervals compare with what you would have concluded had you just assumed that delivery times were distributed uniformly between 30 and 60 minutes?
95% interval= [40.27, 49.95]
80% interval= [41.94, 48.28]
Uniform distribution, X ~ U[30, 60]:
95% interval: cut 0.025 (1/40) of the probability from each tail. Lower: 30 + 30/40 = 30.75; Upper: 60 − 30/40 = 59.25.
[30.75, 59.25]
80% interval: cut 0.1 (1/10) of the probability from each tail. Lower: 30 + 30/10 = 33; Upper: 60 − 30/10 = 57.
[33, 57]
The normal intervals are much narrower: the 95% interval is about 9.7 minutes wide under the normal assumption versus 28.5 minutes under the uniform assumption.
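Both kinds of interval in part c) come from cutting an equal share of probability off each tail. A sketch using `statistics.NormalDist` from the standard library, with helper names chosen here for illustration; because it uses the rounded inputs from part a), the last decimal may differ slightly from the hand-computed values:

```python
from statistics import NormalDist

def normal_interval(mean, sd, coverage):
    """Central interval holding `coverage` of a Normal(mean, sd) distribution."""
    tail = (1 - coverage) / 2
    dist = NormalDist(mean, sd)
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)

def uniform_interval(a, b, coverage):
    """Central interval holding `coverage` of a Uniform[a, b] distribution."""
    tail = (1 - coverage) / 2
    return a + tail * (b - a), b - tail * (b - a)

for coverage in (0.95, 0.80):
    n_lo, n_hi = normal_interval(45.106, 2.469, coverage)
    u_lo, u_hi = uniform_interval(30, 60, coverage)
    print(f"{coverage:.0%}: normal [{n_lo:.2f}, {n_hi:.2f}]  "
          f"uniform [{u_lo:.2f}, {u_hi:.2f}]")
```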
d. Continuing to assume a normal distribution, what is the probability that a given pizza is delivered in 45 minutes or less? How about 40 minutes or less? 50 minutes or more? Compare these answers to what you would conclude under the uniform assumption.
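Normal distribution (mean 45.106, standard deviation 2.469)
P (X ≤ 45) ≈ 0.48;
P (X ≤ 40) ≈ 0.02;
P (X ≥ 50) ≈ 0.02
Uniform distribution
P (X ≤ 45) = (45-30)/(60-30) = 0.5;
P (X ≤ 40) = (40-30)/(60-30) = 0.33;
P (X ≥ 50) = (60-50)/(60-30) = 0.33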
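The probabilities in part d) can be checked under both assumptions with a short script. This is a sketch that plugs in the rounded mean and standard deviation from part a); `uniform_cdf` is a helper written here for illustration.

```python
from statistics import NormalDist

# Normal model with the sample mean and standard deviation from part a).
normal = NormalDist(mu=45.106, sigma=2.469)

def uniform_cdf(x, a=30.0, b=60.0):
    """P(X <= x) for X ~ U[a, b], clamped to [0, 1] outside the range."""
    return min(1.0, max(0.0, (x - a) / (b - a)))

probabilities = {
    "P(X <= 45)": (normal.cdf(45), uniform_cdf(45)),
    "P(X <= 40)": (normal.cdf(40), uniform_cdf(40)),
    "P(X >= 50)": (1 - normal.cdf(50), 1 - uniform_cdf(50)),
}

for label, (p_normal, p_uniform) in probabilities.items():
    print(f"{label}: normal={p_normal:.3f}  uniform={p_uniform:.3f}")
```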
e. If you were to implement a policy of only charging for pizzas that are delivered in under 50 minutes, would it matter which distribution was the correct one? Why or why not? (Note: you do not need to compute anything here, just answer the question from an intuitive standpoint).
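Yes, it matters. Under the uniform distribution P (X ≥ 50) is about 0.33, so roughly a third of pizzas would be given away for free, whereas under the normal distribution P (X ≥ 50) is only about 0.02. The sample statistics from part a) are far more consistent with the normal distribution, so the expected share of free pizzas should be close to 2%.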
f. Again using the information from part a), construct a 95% confidence interval for the population mean. How does this compare to the 95% interval you constructed in part c (for the Normal case)? If it is different, what is the reason for this?
95% confidence interval: [x̄ − 1.96 × s/√n, x̄ + 1.96 × s/√n] = [45.11 − 0.13, 45.11 + 0.13] = [44.98, 45.24]
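This interval is much narrower than the one found in part c). The interval in part c) describes where individual delivery times fall, so its width is governed by the standard deviation s. The confidence interval describes how uncertain we are about the population mean, so its width is governed by the standard error s/√n, which shrinks as the sample size grows (here n = 1400).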
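The confidence interval in part f) is the sample mean plus or minus a normal critical value times the standard error s/√n. A minimal sketch using the rounded values from part a), with the function name chosen here for illustration:

```python
import math
from statistics import NormalDist

def mean_confidence_interval(mean, sd, n, level=0.95):
    """Large-sample confidence interval for a population mean."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    margin = z * sd / math.sqrt(n)                 # z times the standard error
    return mean - margin, mean + margin

lo, hi = mean_confidence_interval(45.106, 2.469, 1400)
print(f"[{lo:.2f}, {hi:.2f}]")  # prints [44.98, 45.24]
```

Dividing by √1400 (about 37.4) is what makes this interval so much narrower than an interval for individual delivery times.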
g. Would the confidence interval you constructed in part f) change if you assumed the population distribution was uniform instead of Normal? Why or why not?
It will not change. The confidence interval relies on the law of large numbers and the central limit theorem: with a large sample (n = 1400 is large here), the sample mean is approximately normally distributed whatever the population distribution is, so we use the same formula.
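The claim in part g) can be illustrated by simulation. The sketch below draws repeated samples of 1400 from a uniform distribution on [30, 60], builds the usual 95% interval each time, and counts how often it contains the true mean of 45. The number of repetitions and the seed are arbitrary choices made here, and the simulated data are not the real delivery times.

```python
import math
import random

def ci_covers(rng, n, a=30.0, b=60.0, z=1.96):
    """Draw n uniform values and report whether the 95% CI contains the true mean."""
    xs = [rng.uniform(a, b) for _ in range(n)]
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)  # sample variance
    margin = z * math.sqrt(var / n)                   # z times the standard error
    true_mean = (a + b) / 2
    return mean - margin <= true_mean <= mean + margin

def coverage(reps=300, n=1400, seed=0):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    return sum(ci_covers(rng, n) for _ in range(reps)) / reps

print(f"coverage: {coverage():.3f}")
```

If the central limit theorem argument holds, the printed coverage should land near 0.95 even though the underlying data are uniform rather than normal.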
h. Using the information from part a), test the null hypothesis that the population average delivery time is 50 minutes against the two-sided alternative hypothesis that it is not 50 minutes at both the 1% and 5% levels. What do you conclude?
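Null hypothesis H0: µ = 50; alternative hypothesis H1: µ ≠ 50
t = (x̄ − µ0) / (s/√n) = −74.08, so |t| = 74.08
74.08 > 1.96 and 74.08 > 2.58, so we reject the null hypothesis at both the 5% and the 1% level. The population average delivery time is not 50 minutes.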
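The test in part h) compares the absolute t statistic with the two-sided normal critical values. The sketch below uses the rounded values from part a), so its t statistic may differ slightly from the one reported above; the function name is chosen here for illustration.

```python
import math
from statistics import NormalDist

def one_sample_t(mean, sd, n, mu0):
    """t statistic for H0: population mean equals mu0."""
    return (mean - mu0) / (sd / math.sqrt(n))

t = one_sample_t(45.106, 2.469, 1400, 50)
print(f"t = {t:.2f}")

for alpha in (0.05, 0.01):
    # Two-sided critical value from the standard normal (large-sample case).
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    decision = "reject H0" if abs(t) > critical else "do not reject H0"
    print(f"alpha={alpha}: critical value {critical:.2f} -> {decision}")
```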
2022-09-26