Assignment #2

Basic Statistics

Individual Assignment

1.   You work for a popular pizzeria in suburban Rochester that has recently considered providing a guarantee on delivery times (e.g. “Guaranteed delivery within 50 minutes, or your pizza is free!”). However, you are quite concerned about just how many pizzas you will be giving away for free under such a policy. Your shift manager claims that delivery times are uniformly distributed between 30 minutes and an hour, but you are not so sure about this. Luckily, you have been recording the actual delivery times for the past several years and have assembled a dataset with 1400 observed actual delivery times (DeliveryTimes.xlsx).

a.   Using the included dataset, compute the sample average, the sample standard deviation and the sample variance of delivery time. You only need to report these three numbers here.

Mean=45.106

Var=6.099

Sd=2.469
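As a rough sketch of how the three numbers in part a) could be computed, the snippet below uses Python's standard `statistics` module. The short `times` list is made-up illustrative data, not the contents of DeliveryTimes.xlsx, and loading the spreadsheet is left out.

```python
import statistics

# Hypothetical delivery times in minutes (illustrative only, NOT the real dataset).
times = [44.0, 46.5, 43.2, 47.1, 45.0, 42.8, 48.3]

sample_mean = statistics.mean(times)
sample_var = statistics.variance(times)  # sample variance, divides by n - 1
sample_sd = statistics.stdev(times)      # square root of the sample variance

print(f"Mean={sample_mean:.3f}")
print(f"Var={sample_var:.3f}")
print(f"Sd={sample_sd:.3f}")
```

Note that the sample variance and standard deviation divide by n − 1 rather than n; with 1400 observations the difference is negligible.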

b.   Based on what you found in part a), are these findings consistent with delivery times being distributed uniformly between 30 and 60 minutes? Why or why not? Note: there is no need for a formal test here, just an informal discussion (for which the formulas for the mean and variance of a uniform distribution that you can look up on Wikipedia will be useful).

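X ~ U(30, 60), so the mean = (30+60)/2 = 45, which is close to the sample mean of 45.106. However, the variance = [(60-30)^2]/12 = 75, which is far from the sample variance of 6.099. Therefore, the findings do not support the claim that delivery times are uniformly distributed between 30 and 60 minutes.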
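The comparison in part b) can be checked with a few lines of Python. This is only a sketch that plugs the endpoints 30 and 60 into the standard uniform formulas and sets the result against the part a) estimates; the variable names are chosen here for illustration.

```python
# Endpoints of the uniform distribution claimed by the shift manager.
a, b = 30.0, 60.0

# Standard formulas for a continuous uniform distribution on [a, b].
uniform_mean = (a + b) / 2
uniform_var = (b - a) ** 2 / 12

# Sample estimates reported in part a).
sample_mean, sample_var = 45.106, 6.099

print(f"Uniform mean = {uniform_mean}, sample mean = {sample_mean}")
print(f"Uniform variance = {uniform_var}, sample variance = {sample_var}")
print(f"Variance ratio (uniform / sample) = {uniform_var / sample_var:.1f}")
```

The means nearly coincide, but the uniform variance is more than ten times the sample variance, which is the mismatch the answer points to.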

c.   Suppose you assume instead that delivery times are normally distributed (rather than uniform) with the mean and standard deviation you found in part a). Using the estimates from part a), construct an interval into which you expect 95% of delivery times to fall. Repeat this exercise, but replace 95% with 80%. How do these intervals compare with what you would have concluded had you just assumed that delivery times were distributed uniformly between 30 and 60 minutes?

95% interval= [40.27, 49.95]

 

80% interval= [41.94, 48.28]
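A minimal sketch of how the two normal intervals could be reproduced, assuming the part a) estimates and using `statistics.NormalDist` from the Python standard library. The helper name `central_interval` is introduced here for illustration. The output may differ from the reported figures in the last digit because of rounding.

```python
from statistics import NormalDist

# Estimates from part a).
mean, sd = 45.106, 2.469
dist = NormalDist(mu=mean, sigma=sd)

def central_interval(dist, coverage):
    """Return the interval that leaves equal probability in each tail."""
    tail = (1 - coverage) / 2
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)

for coverage in (0.95, 0.80):
    lower, upper = central_interval(dist, coverage)
    print(f"{coverage:.0%} interval = [{lower:.2f}, {upper:.2f}]")
```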

 

Uniform distribution

95% predictive interval: X ~ U [30, 60]; left 0.025 (1/40) and right 0.025 (1/40); Lower: 30 + 30/40 = 30.75, Upper: 60 − 30/40 = 59.25

[30.75, 59.25]

80% predictive interval: X ~ U [30, 60]; left 0.1 (1/10) and right 0.1 (1/10); Lower: 30 + 30/10 = 33, Upper: 60 − 30/10 = 57

[33, 57]
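The uniform intervals above follow from trimming an equal share of the range from each end. A short sketch of that arithmetic, with a helper name (`uniform_central_interval`) chosen here for illustration:

```python
def uniform_central_interval(a, b, coverage):
    """Central interval of U[a, b] that contains the given share of probability."""
    tail = (1 - coverage) / 2   # probability cut from each end
    width = b - a
    return a + tail * width, b - tail * width

for coverage in (0.95, 0.80):
    lower, upper = uniform_central_interval(30.0, 60.0, coverage)
    print(f"{coverage:.0%} interval = [{lower:.2f}, {upper:.2f}]")
```

Both uniform intervals are far wider than their normal counterparts, since the uniform assumption spreads probability evenly over the full 30-minute range.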

d.   Continuing to assume a normal distribution, what is the probability that a given pizza is delivered in 45 minutes or less? How about 40 minutes or less? 50 minutes or more? Compare these answers to what you would conclude under the uniform assumption.

 

Uniform distribution

P (X > 50) = (60-50)/(60-30) = 0.33;

P (X ≤ 40) = (40-30)/(60-30) = 0.33;

P (X ≤ 45) = (45-30)/(60-30) = 0.5
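For comparison, the same three probabilities can be computed under both assumptions. This is a sketch only: the normal-case values are calculated here from the part a) estimates with `statistics.NormalDist` and are not taken from the original answer, and `uniform_cdf` is a helper written for this example.

```python
from statistics import NormalDist

mean, sd = 45.106, 2.469   # estimates from part a)
a, b = 30.0, 60.0          # uniform endpoints

normal = NormalDist(mu=mean, sigma=sd)

def uniform_cdf(x):
    """P(X <= x) for X ~ U[a, b], clipped to [0, 1]."""
    return min(max((x - a) / (b - a), 0.0), 1.0)

p_normal = {
    "P(X <= 45)": normal.cdf(45),
    "P(X <= 40)": normal.cdf(40),
    "P(X >= 50)": 1 - normal.cdf(50),
}
p_uniform = {
    "P(X <= 45)": uniform_cdf(45),
    "P(X <= 40)": uniform_cdf(40),
    "P(X >= 50)": 1 - uniform_cdf(50),
}

for event in p_normal:
    print(f"{event}: normal = {p_normal[event]:.3f}, uniform = {p_uniform[event]:.3f}")
```

Under the normal assumption both tail probabilities come out at roughly 2%, against one third under the uniform assumption.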

e.   If you were to implement a policy of only charging for pizzas that are delivered in under 50 minutes, would it matter which distribution was the correct one? Why or why not? (Note: you do not need to compute anything here, just answer the question from an intuitive standpoint).

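Yes, it matters. Under the uniform distribution, P(X ≥ 50) is 0.33, while under the normal distribution it is about 0.02, so the uniform assumption implies giving away roughly a third of all pizzas, compared with about 2% under the normal assumption. Given the sample variance found in part a), the normal distribution is the more plausible one.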

f.    Again using the information from part a), construct a 95% confidence interval for the population mean. How does this compare to the 95% interval you constructed in part c (for the Normal case)? If it is different, what is the reason for this?

95% confidence interval: [x̄ − 1.96 × s/√n, x̄ + 1.96 × s/√n] = [45.11 − 0.13, 45.11 + 0.13] = [44.98, 45.24]

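It is much narrower than the interval we found in part c. The interval in part c describes where individual delivery times fall, so its width depends on the standard deviation, whereas the confidence interval describes how confident we are in our estimate of the population mean, so its width depends on the standard error (the standard deviation divided by the square root of the sample size). The larger the sample size, the narrower the confidence interval becomes.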
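The confidence interval arithmetic can be sketched as follows, assuming the part a) estimates, n = 1400, and the usual critical value of 1.96. The last lines contrast its half-width with that of the 95% interval for individual delivery times.

```python
import math

mean, sd, n = 45.106, 2.469, 1400   # estimates from part a) and sample size
z = 1.96                            # critical value for 95% confidence

standard_error = sd / math.sqrt(n)  # standard deviation of the sample mean
margin = z * standard_error
ci = (mean - margin, mean + margin)

print(f"Standard error = {standard_error:.4f}")
print(f"95% confidence interval = [{ci[0]:.2f}, {ci[1]:.2f}]")

# Half-width of the part c interval for individual delivery times.
individual_margin = z * sd
print(f"Half-width for individual times = {individual_margin:.2f}")
print(f"Half-width for the mean = {margin:.2f}")
```

The two half-widths differ by a factor of √1400, about 37, which is why the confidence interval is so much narrower.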

g.   Would the confidence interval you constructed in part f) change if you assumed the population distribution was uniform instead of Normal? Why or why not?

It will not change. The confidence interval relies on the LLN and the CLT, so with a large sample (1400 can be considered large in this case) we use the same formula no matter what the population distribution is.
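One way to see this informally is a small simulation: draw many samples of size 1400 from a uniform population, build the same 95% interval each time, and count how often it contains the true mean. The sketch below is an illustration only; the seed, the number of trials, and the helper name `ci_covers_mean` are arbitrary choices made here.

```python
import math
import random

def ci_covers_mean(rng, n=1400, a=30.0, b=60.0, z=1.96):
    """Draw one U[a, b] sample and report whether its 95% CI covers the true mean."""
    sample = [rng.uniform(a, b) for _ in range(n)]
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)   # sample variance
    margin = z * math.sqrt(var / n)
    true_mean = (a + b) / 2
    return mean - margin <= true_mean <= mean + margin

rng = random.Random(0)   # fixed seed so the run is repeatable
trials = 200
coverage = sum(ci_covers_mean(rng) for _ in range(trials)) / trials
print(f"Empirical coverage over {trials} trials: {coverage:.3f}")
```

The empirical coverage should land close to 95% even though the population is not normal, which is what the CLT argument predicts for a sample this large.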

h.   Using the information from part a), test the null hypothesis that the population average delivery time is 50 minutes against the two-sided alternative hypothesis that it is not 50 minutes at both the 1% and 5% levels. What do you conclude?

Null: µ = 50; alternative: µ ≠ 50

t = (x̄ − 50)/(s/√n) = −74.08, |t| = 74.08

74.08 > 1.96 and 74.08 > 2.58, so we reject the null hypothesis at both the 5% and the 1% level.
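The test can be sketched in a few lines, assuming the part a) estimates, n = 1400, and the large-sample critical values 1.96 and 2.58 used above. With these rounded inputs the statistic comes out near −74.2 rather than exactly −74.08, presumably because of rounding in the intermediate steps; the conclusion is unaffected.

```python
import math

mean, sd, n = 45.106, 2.469, 1400   # estimates from part a) and sample size
mu0 = 50.0                          # population mean under the null hypothesis

standard_error = sd / math.sqrt(n)
t_stat = (mean - mu0) / standard_error
print(f"t = {t_stat:.2f}")

# Two-sided test: reject when |t| exceeds the critical value.
decisions = {}
for level, critical in ((0.05, 1.96), (0.01, 2.58)):
    decisions[level] = abs(t_stat) > critical
    verdict = "reject" if decisions[level] else "fail to reject"
    print(f"At the {level:.0%} level: {verdict} the null hypothesis")
```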