STA 142A: Homework 2

1. Poisson Classifier for Multiclass Classification. Let $X \in \mathbb{R}$ be a univariate random variable representing the input data, and let $Y$ be the random variable representing the output data. In binary classification with logistic regression, the conditional distribution of a binary output $Y$ (assuming no intercept term, for simplicity) is assumed to be:

$$P(Y = 1 \mid X) = \frac{1}{1 + e^{-\beta X}}$$

Now suppose we are required to perform multi-class classification where the output $Y$ can take any non-negative integer value 0, 1, 2, ... (for example, the number of daily hits on a web server). A standard model for $Y$ is the Poisson distribution. Recalling the probability mass function of the Poisson distribution, we now assume the conditional distribution of the output $Y$ given $X$ to be:

$$P(Y = k \mid X) = \frac{(\beta X)^k \, e^{-\beta X}}{k!}, \quad \text{for } k = 0, 1, 2, \ldots$$
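To make the model concrete, the following minimal sketch evaluates the Poisson probability mass function with rate $\beta x$ for a given input. The function name `poisson_pmf` and the example values of `beta` and `x` are hypothetical, and the sketch assumes $\beta x > 0$ so that the rate is valid.

```python
import math


def poisson_pmf(k: int, beta: float, x: float) -> float:
    """P(Y = k | X = x) under a Poisson model with rate beta * x (assumed > 0)."""
    rate = beta * x
    if rate <= 0:
        raise ValueError("the Poisson rate beta * x must be positive")
    # Work on the log scale to avoid overflow in rate**k and k! for large k.
    return math.exp(k * math.log(rate) - rate - math.lgamma(k + 1))


# Hypothetical values chosen only for illustration.
beta, x = 0.5, 3.0
probs = [poisson_pmf(k, beta, x) for k in range(5)]
print(probs)
```

The probabilities over all k = 0, 1, 2, ... sum to one, so truncating at a moderately large k gives a quick check that the function is implemented correctly.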

(a) Is the above approach to multi-class classification a generative or a discriminative approach?

(b) Given $n$ training samples $(x_1, y_1), \ldots, (x_n, y_n)$, how would you estimate the parameter $\beta$ via maximum likelihood estimation (MLE)?
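One way to sanity-check a derived estimator is to compare it with a numerical maximizer of the log-likelihood on simulated data. Under the Poisson model with rate $\beta x_i$, the log-likelihood is $\sum_i [y_i \log(\beta x_i) - \beta x_i - \log y_i!]$, and setting its derivative with respect to $\beta$ to zero suggests the candidate $\hat{\beta} = \sum_i y_i / \sum_i x_i$. The sketch below assumes all $x_i > 0$; the true value `beta_true`, the input distribution, and the grid range are hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)            # arbitrary seed for reproducibility
beta_true = 2.0                           # hypothetical ground-truth parameter
x = rng.uniform(0.5, 2.0, size=1000)      # positive inputs so beta * x is a valid rate
y = rng.poisson(beta_true * x)            # outputs drawn from the assumed model


def log_likelihood(beta, x, y):
    """Poisson log-likelihood up to the constant -sum(log(y_i!)), which is free of beta."""
    return np.sum(y * np.log(beta * x) - beta * x)


# Candidate closed form from the first-order condition.
beta_closed = y.sum() / x.sum()

# Brute-force check: maximize the log-likelihood over a fine grid of beta values.
grid = np.linspace(0.01, 5.0, 5000)
beta_grid = grid[np.argmax([log_likelihood(b, x, y) for b in grid])]

print(beta_closed, beta_grid)
```

The two values should agree up to the grid spacing, and both should be close to `beta_true` for a sample of this size.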

2. Logistic Regression versus LDA. In this question, we will compare the performance of logistic regression and LDA through a simulation. Let $X$ represent the input random variable and $Y$ represent the output random variable for binary classification. Let the conditional distributions be as follows:

$X \mid (Y = 1)$ follows a t-distribution with 1 degree of freedom and mean $\mu_1$,
$X \mid (Y = -1)$ follows a t-distribution with 1 degree of freedom and mean 0,

and let $P(Y = 1) = 0.5$. Details about the t-distribution can be found on Wikipedia. You can use np.random.standard_t and np.random.binomial for this question.
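The data-generating model above can be sketched as follows. This is only one possible implementation: it uses NumPy's `Generator` interface (`rng.standard_t`, `rng.binomial`) rather than the legacy `np.random` functions, and the helper name `sample_data` and the seed are hypothetical.

```python
import numpy as np


def sample_data(n, mu1, rng):
    """Draw n pairs (x, y) with y in {+1, -1} and x | y a shifted t variable (1 d.o.f.)."""
    # Labels: Bernoulli(0.5) draws mapped from {0, 1} to {-1, +1}.
    y = 2 * rng.binomial(1, 0.5, size=n) - 1
    # Inputs: standard t noise with 1 degree of freedom, shifted by mu1 when y = +1.
    x = rng.standard_t(df=1, size=n) + mu1 * (y == 1)
    return x, y


rng = np.random.default_rng(0)  # arbitrary seed for reproducibility
x_train, y_train = sample_data(100, mu1=1.0, rng=rng)
print(x_train[:5], y_train[:5])
```

Shifting a standard t draw by $\mu_1$ is the assumed reading of "t-distribution with mean $\mu_1$" here.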

(a) Repeat the following procedure for 100 trials: Set $\mu_1 = 1$ and generate $n = 100$ training samples $(x_1, y_1), \ldots, (x_{100}, y_{100})$ from the above model. Train a logistic regression classifier and an LDA classifier on this training data. Generate $n = 100$ test samples from the same model. Note that you will know the true labels of the test data because you generated it. Draw a box plot of the test errors of logistic regression and LDA across all 100 trials. What are the mean and variance of the test errors of logistic regression and LDA? (Here, for each trial, the test error is defined as the number of misclassified samples in the test data.)

(b) Repeat the above procedure with $\mu_1 = 2$ and $\mu_1 = 3$. Comment on what you observe.
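The trial loop in parts (a) and (b) might be organized along the following lines. This is a rough sketch, not a reference solution: it assumes scikit-learn is available and uses `LogisticRegression` and `LinearDiscriminantAnalysis` with their default settings (the former applies L2 regularization by default), and the helper names `sample_data` and `run_trials` are hypothetical.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression


def sample_data(n, mu1, rng):
    """Labels in {+1, -1} with P(Y = 1) = 0.5; inputs are t(1) noise shifted by mu1 if y = +1."""
    y = 2 * rng.binomial(1, 0.5, size=n) - 1
    x = rng.standard_t(df=1, size=n) + mu1 * (y == 1)
    return x.reshape(-1, 1), y  # scikit-learn expects a 2-D feature array


def run_trials(mu1, n_trials=100, n=100, seed=0):
    """Return the per-trial number of misclassified test samples for each classifier."""
    rng = np.random.default_rng(seed)
    errors = {
        "logistic": np.empty(n_trials, dtype=int),
        "lda": np.empty(n_trials, dtype=int),
    }
    for t in range(n_trials):
        x_tr, y_tr = sample_data(n, mu1, rng)  # fresh training set
        x_te, y_te = sample_data(n, mu1, rng)  # fresh test set from the same model
        models = {
            "logistic": LogisticRegression(),
            "lda": LinearDiscriminantAnalysis(),
        }
        for name, model in models.items():
            model.fit(x_tr, y_tr)
            errors[name][t] = np.sum(model.predict(x_te) != y_te)
    return errors


errors = run_trials(mu1=1.0)
for name, errs in errors.items():
    # errs.var() is the population variance (ddof=0); pass ddof=1 for the sample variance.
    print(name, errs.mean(), errs.var())
```

The two arrays in `errors` can be passed to a plotting routine such as matplotlib's `boxplot` to draw the requested box plot, and calling `run_trials` with `mu1=2.0` and `mu1=3.0` covers part (b).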

3. Question 3 in pages 398-399 of the textbook.