Practice Exam #1

Some parts of the exam will require you to interpret provided R output. Other parts will require you to use R and a provided dataset.

You will use the Packages.csv dataset for this practice exam.  

You are the COO of FedEx Ground. Based on customer reviews and your own data, you’ve seen a surge in damaged packages. You are tasked with understanding the issues and reducing damaged packages over the next twelve months. The dataset is described below.

Variable                     Description
driversworking               total number of drivers employed by this firm who are delivering packages on this date
weekend                      1 indicates this observation is on a weekend, 0 otherwise
expectedpackagesdelivered    total number of packages planned for delivery on this date
extrahands                   1 if an additional 1,000 workers should have been hired temporarily for this day, 0 otherwise
weatherconditions            100% indicates perfect weather, 0% indicates bad weather (perhaps a big snow, no packages delivered)
pctdamaged                   portion of packages delivered that day that were damaged in transit
pctwithinsurance             percent of packages for which customers had purchased additional damage insurance through FedEx
communicationsystems         percent of systems functioning with no difficulties that day (0-100%)
pctoversized                 percent of packages that are oversized on this date

1. Create four simple regression models in R to explain/predict damaged packages, using one predictor at a time: weather conditions, percent of packages that are oversized, communication systems functionality, and percent of packages with insurance. Write out the four model equations using the coefficients from the R output.
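
One possible way to set up Question 1 is sketched here. It assumes Packages.csv is in the working directory, that its column names match the variable table, and that the percentage columns are read in as numeric values. The helper name fit_simple_models is hypothetical and not part of the exam.

```r
# Hypothetical helper: fit one simple regression of pctdamaged per predictor.
fit_simple_models <- function(packages) {
  predictors <- c("weatherconditions", "pctoversized",
                  "communicationsystems", "pctwithinsurance")
  models <- lapply(predictors, function(p) {
    # reformulate() builds the formula pctdamaged ~ <predictor>
    lm(reformulate(p, response = "pctdamaged"), data = packages)
  })
  names(models) <- predictors
  models
}

# Assumed usage: the file location and column names are assumptions.
if (file.exists("Packages.csv")) {
  packages <- read.csv("Packages.csv")
  simple_models <- fit_simple_models(packages)
  # Intercept and slope for each of the four model equations
  print(lapply(simple_models, coef))
}
```

Each fitted equation then has the form pctdamaged = intercept + slope * predictor, with the two numbers taken from the corresponding coef() output.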

2. Check each model for statistical significance of both the intercept and the independent variable.
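
The significance check in Question 2 can be read directly from summary(), or collected into a small table as in this sketch. The helper name significance_table and the 0.05 cutoff are assumptions; the exam does not state a significance level.

```r
# Hypothetical helper: estimate, p-value, and a significance flag for each
# term (intercept and predictors) of a fitted lm object.
significance_table <- function(model, alpha = 0.05) {
  coefs <- summary(model)$coefficients
  data.frame(
    term = rownames(coefs),
    estimate = coefs[, "Estimate"],
    p_value = coefs[, "Pr(>|t|)"],
    significant = coefs[, "Pr(>|t|)"] < alpha,  # alpha is an assumed cutoff
    row.names = NULL
  )
}

# Assumed usage: the file location and column names are assumptions.
if (file.exists("Packages.csv")) {
  packages <- read.csv("Packages.csv")
  weather_model <- lm(pctdamaged ~ weatherconditions, data = packages)
  print(significance_table(weather_model))
}
```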

3. Interpret the intercepts and coefficients of each model (meaning in words using the variables, not Xs and Ys).

4. Interpret the R-squared values for each linear model you created in Question 1. Which is the best model for explaining variability in damaged packages?  
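
For Question 4, it may help to recall that R-squared is the share of variation in the response explained by the model, 1 - SSE/SST. The sketch below computes it from that definition so it can be compared with the value summary() reports; the helper name r_squared_by_hand is hypothetical, and the file and column names are assumptions.

```r
# Hypothetical helper: R-squared from its definition, 1 - SSE/SST,
# for a model that includes an intercept.
r_squared_by_hand <- function(model) {
  y <- model.response(model.frame(model))
  sse <- sum(residuals(model)^2)   # unexplained variation
  sst <- sum((y - mean(y))^2)      # total variation in the response
  1 - sse / sst
}

# Assumed usage: compare the four simple models by R-squared.
if (file.exists("Packages.csv")) {
  packages <- read.csv("Packages.csv")
  predictors <- c("weatherconditions", "pctoversized",
                  "communicationsystems", "pctwithinsurance")
  r2 <- sapply(predictors, function(p) {
    r_squared_by_hand(lm(reformulate(p, response = "pctdamaged"),
                         data = packages))
  })
  print(sort(r2, decreasing = TRUE))
}
```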

5. Create a single model of damaged packages including all four of the above predictors simultaneously.  

a. Check statistical significance, and take appropriate action if any variables are not statistically significant.

b. Write out the final model equation.  

c. Interpret the coefficient associated with each predictor, and the intercept (meaning in words using the variables, not Xs and Ys).

d. How does R-squared change? How does your interpretation of R-squared change?

e. How much variation in damaged packages is left unexplained?
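
A rough sketch of how Question 5 might be approached in R follows. It assumes a 0.05 cutoff and removes all non-significant predictors in a single pass; removing them one at a time and refitting after each removal is a common alternative. The helper name fit_reduced_model is hypothetical, and the file and column names are assumptions.

```r
# Hypothetical helper: fit the four-predictor model, then refit once
# without predictors whose p-value is at or above alpha.
fit_reduced_model <- function(packages, alpha = 0.05) {
  predictors <- c("weatherconditions", "pctoversized",
                  "communicationsystems", "pctwithinsurance")
  full <- lm(reformulate(predictors, response = "pctdamaged"),
             data = packages)
  p_values <- summary(full)$coefficients[predictors, "Pr(>|t|)"]
  keep <- predictors[p_values < alpha]
  if (length(keep) == 0) {
    final <- lm(pctdamaged ~ 1, data = packages)  # intercept-only fallback
  } else {
    final <- lm(reformulate(keep, response = "pctdamaged"), data = packages)
  }
  list(full = full,
       final = final,
       r_squared = summary(final)$r.squared,
       adj_r_squared = summary(final)$adj.r.squared,
       unexplained = 1 - summary(final)$r.squared)  # share left unexplained
}

# Assumed usage: the file location and column names are assumptions.
if (file.exists("Packages.csv")) {
  packages <- read.csv("Packages.csv")
  result <- fit_reduced_model(packages)
  print(summary(result$final))
  print(result$unexplained)
}
```

With several predictors in the model, adjusted R-squared is often reported alongside R-squared because plain R-squared does not decrease when predictors are added.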