STU22005: Applied Probability II 2021
STU22005: Applied Probability II
Trinity Term 2021
1. A large company introduced a range of new initiatives to improve employee satisfaction. They selected 11 employees at random from the company and measured their change in satisfaction ratings before and after the initiatives were introduced (change = after - before). The mean change was 9.36, with standard deviation 6.727. The values are displayed in the histogram.
[Figure: histogram of change in rating (after − before); horizontal axis from −5 to 25.]
(a) i. Briefly explain the commonly used notation µ , , , σ , and s in the context of this example. [3 marks]
ii. Construct and interpret a 99% confidence interval. [7 marks]
iii. List any assumptions that are made in the construction of the confidence interval, and comment on how they could be assessed. [6 marks]
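For part (a) ii, one way to check the arithmetic is the usual one-sample t interval, x̄ ± t · s/√n. The sketch below is only an illustration: it assumes the changes are approximately normal and hardcodes the table value t(0.995, 10 df) ≈ 3.169, which is not given in the question. The variable names are hypothetical.

```python
import math

n = 11            # sample size (from the question)
xbar = 9.36       # sample mean change in rating
s = 6.727         # sample standard deviation
t_crit = 3.169    # assumed table value: 0.995 quantile of t with n - 1 = 10 df

se = s / math.sqrt(n)             # estimated standard error of the mean
margin = t_crit * se              # half-width of the interval
lower, upper = xbar - margin, xbar + margin
print(f"99% CI: ({lower:.2f}, {upper:.2f})")
```

Under these assumptions the interval comes out at roughly 2.9 to 15.8, which lies entirely above zero.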
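(b) Let $U_1, U_2, \ldots, U_{40}$ be uniformly distributed random variables from $a$ to $b$. The probability density function is:
$$f(x) = \begin{cases} \dfrac{1}{b - a}, & a < x < b \\ 0, & \text{otherwise} \end{cases}$$
$E[U_i] = \frac{1}{2}(b + a)$ and $\mathrm{Var}(U_i) = \frac{1}{12}(b - a)^2$. Let $T = \sum_{i=1}^{40} U_i$.
i. Show $E[T] = 20(a + b)$ and $\mathrm{Var}(T) = \frac{10}{3}(b - a)^2$. [8 marks]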
ii. Let a = 5 and b = 15. Use the central limit theorem (CLT) to find P (410 < T < 420), approximately. Explain in your own words (2-3 sentences) why the CLT is appropriate to use here. [9 marks]
(33 marks)
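For part (b) ii, the normal approximation can be checked numerically. This is a minimal sketch, assuming the Ui are independent (needed for the variance of the sum), using E[T] = 40(a + b)/2 and Var(T) = 40(b − a)²/12, and applying no continuity correction since T is continuous. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

a, b, n = 5, 15, 40
mean_T = n * (a + b) / 2          # E[T] = 20(a + b)
var_T = n * (b - a) ** 2 / 12     # Var(T) = (10/3)(b - a)^2, assuming independence

# CLT: T is approximately normal with this mean and variance.
approx_T = NormalDist(mu=mean_T, sigma=sqrt(var_T))
prob = approx_T.cdf(420) - approx_T.cdf(410)
print(round(prob, 3))
```

With a = 5 and b = 15 this gives a mean of 400, a standard deviation of about 18.26, and a probability of about 0.155.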
© Trinity College Dublin, The University of Dublin 2021
2. The functionality of an electrical component of a machine was tested under a range of temperatures (temp = 0, 1, 2, ..., 14 °C) in an experiment. A scatter plot of the data is shown; there were 30 data points in total.
[Figure: scatter plot of the data; horizontal axis: Temperature (degrees Celsius), from 0 to 14.]
(a) A simple linear regression model was fitted to this dataset using R. Here is some of the output:

            Estimate Std. Error t value Pr(>|t|)
(Intercept)  16.8625     1.3906  12.126 1.16e-12
temp          0.8625     0.1691   5.102 2.10e-05

i. Write out the estimated equation of the line and interpret the parameter estimates. [7 marks]
ii. Interpret the hypothesis test for the temp parameter. [7 marks]
iii. The minimum acceptable functionality of the component is 20. Test whether the average functionality when the temperature equals 0 °C is lower than 20, using α = 0.05. [7 marks]
iv. State the model assumptions and identify if the assumptions are reasonably met using the following residual plots. [6 marks]
[Figure: two residual plots. One has horizontal axis "Predicted values" (18 to 28); the other is a quantile plot with horizontal axis "Theoretical Quantiles" (−2 to 2).]
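For part (a) iii, the mean functionality at 0 °C is the intercept, so one possible approach is a lower-tailed t test of H0: β0 = 20 against H1: β0 < 20 using the intercept row of the R output. The sketch below assumes n − 2 = 28 residual degrees of freedom and hardcodes the table value t(0.05, 28 df) ≈ −1.701, which is not given in the question. The variable names are hypothetical.

```python
b0_hat = 16.8625   # intercept estimate from the R output
se_b0 = 1.3906     # standard error of the intercept
mu0 = 20           # minimum acceptable functionality
df = 30 - 2        # residual degrees of freedom, n - 2
t_crit = -1.701    # assumed table value: 0.05 quantile of t with 28 df

t_stat = (b0_hat - mu0) / se_b0
reject = t_stat < t_crit   # lower-tailed test: reject H0 for small t
print(f"t = {t_stat:.3f} on {df} df, reject H0: {reject}")
```

Here the statistic is about −2.26, which falls below the critical value, so under these assumptions H0 would be rejected at the 5% level.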
(b) Some new information came to light during discussions between the statistician and the person who carried out the experiment: there were two types of components used in the experiment and each was tested once under each temperature 0, 1, 2, ..., 14 °C. The scatter plot shows the data with the points for component A shown in empty circles and for component B in filled circles.
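[Figure: scatter plot of the data by component type (A: empty circles, B: filled circles); horizontal axis: Temperature (degrees Celsius), from 0 to 14.]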
The dataset was re-analysed by fitting this model:
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i1} x_{i2} + \epsilon_i$$
where $y_i$ is the functionality value for the $i$th experimental unit, $x_{i1}$ is equal to 1 if the $i$th experimental unit was a type A component and 0 for a type B component, and $x_{i2}$ is the temperature for the $i$th experimental unit.
Write out this model in matrix notation, clearly showing the structure of each matrix. [7 marks]
(34 marks)
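For part (b), the matrix form is Y = Xβ + ε with a 30 × 4 design matrix X. The sketch below builds one version of X as a list of rows. The row ordering (all type A units first, then all type B) is an assumption made purely for illustration; the question does not say how the data are ordered.

```python
temps = list(range(15))                       # temperatures 0, 1, ..., 14

rows = []
for is_a in (1, 0):                           # assumed ordering: type A rows, then type B
    for t in temps:
        # columns: intercept, x_i1 (type A indicator), x_i2 (temp), x_i1 * x_i2
        rows.append([1, is_a, t, is_a * t])

X = rows                                      # 30 x 4 design matrix
print(len(X), len(X[0]))
print(X[0], X[14], X[15], X[29])
```

For type B rows the second and fourth columns are zero, so β1 and β3 act as the differences in intercept and slope for type A relative to type B.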
3. (a) Let $X_1, X_2, \ldots, X_n$ be an independent and identically distributed sample from a distribution with probability density function:
$$f(x) = \lambda^2 x e^{-\lambda x}$$
Derive the maximum likelihood estimator (MLE) of $\lambda$. [12 marks]
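A possible outline of the derivation for part (a) is sketched below. It assumes the density is f(x) = λ² x e^{−λx} on x > 0 with λ > 0 (the support is not stated in the question), and it follows the standard route of maximising the log-likelihood.

```latex
\begin{align*}
L(\lambda) &= \prod_{i=1}^{n} \lambda^2 x_i e^{-\lambda x_i}
            = \lambda^{2n} \Bigl(\prod_{i=1}^{n} x_i\Bigr) e^{-\lambda \sum_{i=1}^{n} x_i} \\
\ell(\lambda) &= 2n \log \lambda + \sum_{i=1}^{n} \log x_i - \lambda \sum_{i=1}^{n} x_i \\
\ell'(\lambda) &= \frac{2n}{\lambda} - \sum_{i=1}^{n} x_i = 0
  \quad\Longrightarrow\quad
  \hat{\lambda} = \frac{2n}{\sum_{i=1}^{n} x_i} = \frac{2}{\bar{x}} \\
\ell''(\lambda) &= -\frac{2n}{\lambda^2} < 0
  \quad \text{(so the stationary point is a maximum)}
\end{align*}
```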
(b) Independent Bernoulli trials were performed until a success was observed.
Let X be the number of trials until the first success.
i. What distribution does X follow? [2 marks]
ii. On the 4th trial, the first success was observed. The likelihood function was constructed for this data, with a graph of it shown below. Briefly explain (1-2 sentences) what is on the x and y axes. In your own words (1-2 sentences), explain how the graph can aid finding the MLE. [8 marks]
[Figure: graph of the likelihood function; horizontal axis from 0.0 to 1.0.]
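The likelihood graph in part (b) ii can be reproduced numerically. This is a minimal sketch, assuming X is geometric with success probability p, so that observing the first success on trial 4 gives L(p) = (1 − p)³ p. The function name and the grid resolution are illustrative choices.

```python
def likelihood(p, x=4):
    """Geometric likelihood: x - 1 failures followed by one success."""
    return (1 - p) ** (x - 1) * p

grid = [i / 1000 for i in range(1001)]    # candidate values of p in [0, 1]
p_hat = max(grid, key=likelihood)         # grid value where the curve peaks
print(p_hat, likelihood(p_hat))
```

The value of p at the peak of the curve is the maximum likelihood estimate; a grid search like this only approximates it to the grid spacing, whereas calculus gives it exactly.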
(c) A sample of data of size 10 was collected at random from a population and the median was calculated. Sampling from the original data, with replacement, 1000 bootstrap samples were found and the median computed for each.
i. If you had the vector of 1000 bootstrap medians, describe how you would use it to construct a 90% confidence interval. [7 marks]
ii. Suppose a histogram was generated of the vector of bootstrap medians. Explain (2-3 sentences) what the histogram is approximating. [4 marks]
(33 marks)
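The percentile approach to part (c) i can be sketched as follows. The sample values here are made up purely for illustration (the question gives no data), the seed is fixed only for reproducibility, and taking the 50th and 950th ordered values is one of several percentile conventions.

```python
import random
import statistics

random.seed(1)                                      # reproducibility only
sample = [12, 15, 9, 21, 18, 14, 11, 25, 17, 13]    # hypothetical data, n = 10

# Resample with replacement and record the median of each bootstrap sample.
boot_medians = [
    statistics.median(random.choices(sample, k=len(sample)))
    for _ in range(1000)
]

# Percentile method: cut off 5% in each tail to keep the middle 90%.
boot_medians.sort()
lower = boot_medians[49]     # 50th ordered value, about the 5th percentile
upper = boot_medians[949]    # 950th ordered value, about the 95th percentile
print(lower, upper)
```

A histogram of `boot_medians` would give a picture of the approximate sampling distribution of the sample median, which is what part (c) ii asks about.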
2022-08-26