
Stats 401 Final Exam question pool

Questions copyright Miles Chen. For personal use only. Do not share these questions with anyone who is not currently enrolled in the class. Do not post or distribute this document in any way. Violations will be reported to the Dean of Students.

Be sure to bring your calculator.

Week 1:

1)   Define a p-value. Explain the reasoning for why a small p-value leads to the rejection of the null hypothesis. Explain why in NHST (null hypothesis significance testing) we never use the phrase “accept the null hypothesis.”

2)   What assumptions must be true about our data for us to use CLT based tests (e.g. the t-test)?

3)   What are some reasons for using the Wilcoxon rank-sum test over a Student’s t-test or Welch’s test?

Week 2:

4)   What assumptions must be true about our data for us to use a randomization test? What assumptions must be true about our data for us to use the bootstrap?

5)   Explain how to conduct a randomization test to compare the means of two different randomly assigned treatments (call them treatA and treatB). Write some pseudo-code to show how this is done.
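One acceptable approach, sketched here as runnable Python rather than pseudo-code (the treatment values and number of shuffles in the usage below are made up for illustration): pool the two groups, repeatedly shuffle, re-split into groups of the original sizes, and count how often the shuffled difference in means is at least as extreme as the observed one.

```python
import random

def randomization_test(treatA, treatB, n_perm=2000, seed=1):
    """Two-sided randomization test for a difference in means:
    shuffle the pooled values, re-split into groups of the original
    sizes, and count differences at least as extreme as observed."""
    observed = sum(treatA) / len(treatA) - sum(treatB) / len(treatB)
    pooled = list(treatA) + list(treatB)
    nA = len(treatA)
    rng = random.Random(seed)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = sum(pooled[:nA]) / nA - sum(pooled[nA:]) / (len(pooled) - nA)
        if abs(diff) >= abs(observed):
            count += 1
    return count / n_perm          # approximate two-sided p-value

# illustrative data: clearly separated groups should give a small p-value
p = randomization_test([5, 6, 7, 8, 9], [1, 2, 3, 2, 1])
```

The returned proportion approximates the two-sided p-value; more permutations give a more precise approximation.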

6)   Write some pseudo-code to calculate the LOOCV error rate to evaluate the performance of two competing models (call them mod1 and mod2).

7)   Write pseudo-code to perform a bootstrap estimation of the standard error of the sample mean.
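A minimal Python sketch of this bootstrap, using only the standard library (the data and the number of resamples B in the usage are arbitrary): resample the data with replacement B times, compute the mean of each resample, and take the standard deviation of those means.

```python
import random

def bootstrap_se_mean(x, B=2000, seed=1):
    """Bootstrap standard error of the sample mean: resample x with
    replacement B times and take the SD of the resampled means."""
    rng = random.Random(seed)
    n = len(x)
    boot_means = []
    for _ in range(B):
        resample = [rng.choice(x) for _ in range(n)]
        boot_means.append(sum(resample) / n)
    mbar = sum(boot_means) / B
    return (sum((m - mbar) ** 2 for m in boot_means) / (B - 1)) ** 0.5

# illustrative data: the result should be close to s / sqrt(n)
se = bootstrap_se_mean(list(range(1, 21)))
```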

Week 3:

8)   How does a regression decision tree decide to make splits?
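One way to see the splitting rule concretely: a regression tree greedily chooses the split that minimizes the summed squared error of the two resulting groups, each predicted by its own mean. A minimal single-feature sketch (the function name and exhaustive search over observed values are my own illustration, not a prescribed implementation):

```python
def best_split(x, y):
    """Find the threshold on x minimizing the total squared error
    when each side of the split is predicted by its own mean."""
    def sse(ys):
        if not ys:
            return 0.0
        m = sum(ys) / len(ys)
        return sum((v - m) ** 2 for v in ys)

    best = (None, float('inf'))
    for t in sorted(set(x))[:-1]:          # candidate thresholds
        left  = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi >  t]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (t, total)
    return best
```

A full tree applies this search recursively to each resulting region, over every feature.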

9)   Describe bagging for decision trees. How does bagging improve decision tree performance?

10) Describe the creation of a random forest. How does a random forest improve decision tree performance?

Week 4:

11) How is a cubic basis spline created, and what ensures that the result is smooth?

12) What is the difference between a natural cubic spline and a smoothing spline? Under what conditions would fitting a natural cubic spline and smoothing spline produce equal results?

13) Briefly explain how local regression fits the data.

Week 5:

14) In a logistic regression model with one predictor variable, how can we interpret the slope parameter B1?

15) A univariate Bayes classifier. The pdf of the normal distribution is

f(x) = 1/sqrt(2 pi sigma^2) * exp( -(x - mu)^2 / (2 sigma^2) )

Let’s say X comes from a mixture of two distributions A and B. X is composed of 80% A and 20% B. Distribution A is normal with mean 0 and sd 1. Distribution B is normal with mean 2 and sd 2. (These are just example values.)

Let’s say an observation has a value of 1.5. What are the Bayes classifier probabilities of belonging to class A vs class B?
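A worked check of this example in Python (the mixture weights 80%/20%, the distributions N(0, 1) and N(2, 2), and x = 1.5 come straight from the question):

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal density: exp(-(x - mu)^2 / (2 sigma^2)) / sqrt(2 pi sigma^2)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

x = 1.5
wA = 0.8 * normal_pdf(x, 0.0, 1.0)   # prior weight 80%, A ~ N(0, 1)
wB = 0.2 * normal_pdf(x, 2.0, 2.0)   # prior weight 20%, B ~ N(2, 2)
pA = wA / (wA + wB)                  # Bayes probability of class A
pB = wB / (wA + wB)                  # Bayes probability of class B
```

Here pA works out to roughly 0.73, so x = 1.5 would be classified as A even though 1.5 is closer to B’s mean: the 80% prior weight on A dominates.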

16) What is the naïve part of the Naïve Bayes classifier?

17) Write some pseudo-code for a K-nearest neighbors classifier.
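One possible answer, written here as runnable Python for concreteness (Euclidean distance and majority vote are one common choice, not the only acceptable one):

```python
from collections import Counter

def knn_predict(train_X, train_y, x_new, k=3):
    """Classify x_new by majority vote among its k nearest
    training points, using squared Euclidean distance."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(xi, x_new)), yi)
        for xi, yi in zip(train_X, train_y)
    )
    top_k = [label for _, label in dists[:k]]
    return Counter(top_k).most_common(1)[0][0]
```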

Week 6:

18) Write Y-hat as the result of forward-propagation for a neural network with one hidden layer. Let W(1) be your first weight matrix, B(1) be the first set of biases, etc. Let X be your data. Let f be the sigmoid activation function.
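A small sketch of the forward pass for a single input vector, with plain Python lists standing in for the matrices (the shapes in the usage below are illustrative): Y-hat = f( W(2) f( W(1) X + B(1) ) + B(2) ).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def forward(x, W1, b1, W2, b2):
    """Y-hat = f( W2 @ f( W1 @ x + b1 ) + b2 ), f = sigmoid elementwise."""
    hidden = [sigmoid(a + b) for a, b in zip(matvec(W1, x), b1)]
    return [sigmoid(a + b) for a, b in zip(matvec(W2, hidden), b2)]
```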

19) When using a neural network for classification, in the output layer we create a node for each possible class. For the MNIST handwritten digit classification, we had 10 nodes in the output layer. Why couldn’t we simply use one node to predict the numeric digit?

Week 7:

20) How does K-means clustering decide which points belong to a cluster?

21) Briefly (in two or three sentences max) explain the big-picture concept of Principal component analysis.

22) Let’s say we want to project two-dimensional data down to one dimension. How would we calculate the first principal component of a two-dimensional dataset using eigenvectors?

Week 8:

23) Binomial example Maximum Likelihood Estimation: Let’s say there is a carnival game where you draw marbles from a mystery bag. The bag has red and blue marbles.

To play the game, you draw 5 marbles with replacement, and you lose if you get red 4 or 5 times out of 5.

You observe other players play the game. You have seen 20 draws and you observe red marbles 16 times out of 20. Based on the observed data, what is your MLE estimate of the proportion of red in the bag? Based on that estimate, what is the probability of losing the game? (example numbers)
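Working through the example numbers in Python (`math.comb` is in the standard library from Python 3.8): the MLE for a binomial proportion is the sample proportion, and losing means 4 or 5 reds in 5 draws.

```python
from math import comb

# MLE for a binomial proportion is the sample proportion
p_hat = 16 / 20                      # 16 red in 20 observed draws -> 0.8

def binom_pmf(k, n, p):
    """P(exactly k successes in n independent draws with success prob p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# losing means 4 or 5 reds in 5 draws
p_lose = binom_pmf(4, 5, p_hat) + binom_pmf(5, 5, p_hat)
```

With p_hat = 0.8 this gives 5(0.8^4)(0.2) + 0.8^5 = 0.4096 + 0.32768 = 0.73728.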

24) Bayesian statistics: beta-binomial. Using the previous example and data, let’s say you had a prior belief that the proportion red p comes from a Beta distribution with parameters alpha = beta = 5. After seeing the same data (16 red in 20 draws), what would the posterior distribution of p be? What is the integral we would calculate to get the probability of losing the game? (You don’t have to actually find the numeric value, but write out the integral with all relevant terms.)
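Because the Beta prior is conjugate to the binomial likelihood, the posterior is Beta(5 + 16, 5 + 4) = Beta(21, 9), and the probability of losing is the integral of P(4 or 5 reds in 5 draws | p) against that posterior density. A Monte Carlo sketch of that integral, using only the standard library (the sample size of 100,000 is arbitrary):

```python
import random
from math import comb

random.seed(0)

# conjugate update: Beta(alpha, beta) prior + binomial data (16 red, 4 blue)
a_post, b_post = 5 + 16, 5 + 4      # posterior is Beta(21, 9)

def p_lose_given(p):
    """P(4 or 5 reds in 5 draws | proportion red = p)."""
    return comb(5, 4) * p ** 4 * (1 - p) + p ** 5

# Monte Carlo approximation of  integral_0^1 P(lose | p) Beta(21, 9) pdf(p) dp
draws = [random.betavariate(a_post, b_post) for _ in range(100_000)]
p_lose = sum(p_lose_given(p) for p in draws) / len(draws)
```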

Week 9:

25) Imagine a Markov chain with two states: A, B. If you are at state A, you have a 50% probability of remaining at state A and a 50% probability of moving to state B. If you are at state B, you have a 10% probability of remaining at B and a 90% probability of moving to A. Draw a diagram with nodes and arrows to represent this. Write the transition matrix that represents the chain. If your current distribution is [A = 0.2, B = 0.8], what will the distribution be after two iterations of the Markov chain?
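The arithmetic of this example as a small sketch, using the convention that row i of the transition matrix holds the probabilities of moving out of state i:

```python
# transition matrix: row = current state, column = next state, order [A, B]
P = [[0.5, 0.5],
     [0.9, 0.1]]

def step(dist, P):
    """One iteration of the chain: new_dist = dist @ P (row vector times matrix)."""
    return [sum(dist[i] * P[i][j] for i in range(len(dist)))
            for j in range(len(P[0]))]

dist = [0.2, 0.8]
dist = step(dist, P)    # after one iteration: approximately [0.82, 0.18]
dist = step(dist, P)    # after two iterations: approximately [0.572, 0.428]
```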

26) MCMC - Markov Chain Monte Carlo. The Metropolis algorithm is used to approximate an unknown distribution by following rules about sampling from a proposal distribution and whether to accept the proposed value. What is the probability used to decide whether one should accept the proposed value? What value is recorded if the proposed value is not accepted?
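A minimal Metropolis sketch, with the two quantities the question asks about marked in comments (working on the log scale for numerical safety; the uniform proposal, step size, iteration count, and standard-normal target are all illustrative choices of mine):

```python
import math
import random

def metropolis(log_f, n_iter=50_000, step=1.0, x0=0.0, seed=2):
    """Metropolis sampler with a symmetric (uniform) proposal.
    Accepts a proposal with probability min(1, f(proposed)/f(current));
    if it is rejected, the CURRENT value is recorded again."""
    rng = random.Random(seed)
    x, out = x0, []
    for _ in range(n_iter):
        prop = x + rng.uniform(-step, step)
        # acceptance probability min(1, f(prop)/f(x)), computed in logs
        accept_prob = math.exp(min(0.0, log_f(prop) - log_f(x)))
        if rng.random() < accept_prob:
            x = prop          # accept: move to the proposed value
        out.append(x)         # on rejection, x is unchanged and repeats
    return out

# illustrative target: standard normal density, up to a constant
samples = metropolis(lambda x: -x * x / 2)
```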


How detailed does the pseudo-code need to be?

Here’s an example of pseudo-code for LOOCV that would earn full marks:

    # data exists in environment. Has n rows
    n = rows in data
    for (i in 1:n) {
        test = row i in data
        train = all rows in data except i
        mod1 = fit model 1 to train
        yhat1 = use mod1 to predict for test
        error1[i] = yhat1 - y for test
        mod2 = fit model 2 to train
        yhat2 = use mod2 to predict for test
        error2[i] = yhat2 - y for test
    }
    loocv1 = mean(error1^2)
    loocv2 = mean(error2^2)
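For comparison, here is one way that same pseudo-code could be made runnable in Python, with two made-up models (an intercept-only mean model and a simple least-squares line) and toy data standing in for mod1, mod2, and the real dataset:

```python
def fit_mean(train):
    """'mod1': intercept-only model, predicts the training mean of y."""
    ybar = sum(y for _, y in train) / len(train)
    return lambda x: ybar

def fit_line(train):
    """'mod2': simple least-squares regression line."""
    n = len(train)
    xbar = sum(x for x, _ in train) / n
    ybar = sum(y for _, y in train) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in train)
    sxx = sum((x - xbar) ** 2 for x, _ in train)
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return lambda x: b0 + b1 * x

def loocv_mse(data, fit):
    """Leave-one-out CV: hold out each row, refit, record squared error."""
    errors = []
    for i in range(len(data)):
        x_test, y_test = data[i]
        train = data[:i] + data[i + 1:]
        model = fit(train)
        errors.append((model(x_test) - y_test) ** 2)
    return sum(errors) / len(errors)

data = [(x, 2 * x + 1) for x in range(6)]   # toy data that is exactly linear
```

On this toy data the line model should have essentially zero LOOCV error and the mean model should not, so LOOCV correctly prefers mod2.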