CS 4340-5340 Practice Questions
1. Consider a perceptron learning algorithm (PLA) used for a binary classification problem.
(i) What will be the algorithm’s output if the training data are not linearly
separable?
(ii) Would you call the PLA supervised or unsupervised learning?
2. Obtain the linear regression of Y on X1, X2 (by hand-calculation or writing code, but do
not use an off-the-shelf linear regression function):
X1   X2   Y
0    0    2
0    1    2
1    0    10
1    1    10
(i) Write the regression equation.
(ii) If we use the result of this regression as a classifier, what will be the equation of
the classifier?
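For the "writing code" route in Question 2, the following is a minimal sketch of one possible approach, not a prescribed solution: it solves the normal equations (X^T X) w = X^T y with Gauss-Jordan elimination, using only the Python standard library. The function names (transpose, matmul, solve, fit_least_squares) are illustrative assumptions, and the sketch assumes X^T X is invertible.

```python
# Sketch: least-squares fit via the normal equations (X^T X) w = X^T y.
# Standard library only; all helper names here are hypothetical.

def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve a w = b by Gauss-Jordan elimination with partial pivoting.

    Assumes a is square and non-singular.
    """
    n = len(a)
    # Build the augmented matrix [a | b] with float entries.
    aug = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Pick the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]

def fit_least_squares(rows, targets):
    """Return [w0, w1, ..., wd], where w0 is the intercept."""
    # Prepend a constant 1 to each input so w0 acts as the intercept.
    X = [[1.0] + list(map(float, r)) for r in rows]
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(x * y for x, y in zip(col, targets)) for col in Xt]
    return solve(XtX, Xty)

# Data from the table in Question 2.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [2, 2, 10, 10]

w0, w1, w2 = fit_least_squares(rows, targets)
print(f"Y = {w0:.3f} + {w1:.3f}*X1 + {w2:.3f}*X2")
```

Comparing the printed weights against a hand calculation is a useful check on both.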
3. Exercises 3.6 and 3.7 from the textbook (page 92).
4. In least squares linear regression, we obtain the solution (the weights) analytically. In
logistic regression, why don’t we analytically solve for the weights by setting the partial
derivatives of the (log-)likelihood expression (or the negative of that expression) to
zero?
5. Explain the concepts of “training data,” “test data,” “training error,” and “test error.” Is
it good to have as low a training error as possible?
6. What does “i.i.d.” stand for? Why is it important in the machine learning / statistics
literature?
7. True or false:
(i) The decision (classification) boundary in PLA is linear.
(ii) The decision (classification) boundary in logistic regression is non-linear.
(iii) The loss (error) function in linear regression is convex and therefore guarantees
the existence of a unique (global) optimum solution.
8. Justify or refute: Gradient descent/ascent guarantees convergence to the global optimal
solution.
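One way to explore Question 8 empirically is sketched below. It runs plain gradient descent on a non-convex one-dimensional function from two different starting points. The function f(x) = x^4 - 3x^2 + x, the learning rate, the step count, and the starting points are all assumptions chosen for illustration; they do not come from the textbook.

```python
# Sketch: gradient descent on a non-convex function from two starting points.
# The objective, learning rate, and start values are illustrative assumptions.

def f(x):
    """Non-convex objective with two separate local minima."""
    return x**4 - 3 * x**2 + x

def grad_f(x):
    """Derivative of f."""
    return 4 * x**3 - 6 * x + 1

def gradient_descent(x0, lr=0.01, steps=2000):
    """Run fixed-step gradient descent from x0 and return the final x."""
    x = x0
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x

x_right = gradient_descent(2.0)    # start to the right of both minima
x_left = gradient_descent(-2.0)    # start to the left of both minima

print(f"start  2.0 -> x = {x_right:.4f}, f(x) = {f(x_right):.4f}")
print(f"start -2.0 -> x = {x_left:.4f}, f(x) = {f(x_left):.4f}")
```

Comparing the two final values of f(x) shows whether both runs reached the same point, which bears directly on the claim to be justified or refuted.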
Advice: Please read the textbook carefully. If you can devote further time, read the references,
particularly the ISLR book.
2025-12-24