
CSEN 240 Machine Learning

Homework #1

1.   (5 points) Please hand-write a copy of the following statement and sign it.

“I am committed to being a person of integrity. I pledge, as a member of the Santa Clara University community, to abide by and uphold the standards of academic integrity contained in the Student Conduct Code.”

2.   [Level 1: high-level understanding] (3pt) AI vs. ML

a.   Give 1 example of an AI application that does not use machine learning. Because the same application may also be implemented using machine learning, please describe the algorithm used in your example to avoid confusion.

b.   Give 2 examples of AI applications that use machine learning.

3.   [Level 1: high-level understanding] (12pt)

a.   Which machine learning approach ((a) classification, (b) regression, (c) clustering, or (d) dimension reduction) is the best solution for each of the following problems?

(one each)

i.   Segmenting customers into groups based on their purchasing behaviors.

ii.   Visualizing high-dimensional data in a lower-dimensional space.

iii.   Predicting the stock price of a company based on historical data.

iv.   Identifying fraudulent transactions in a credit card dataset.

b.   Which of the following can describe PLA? (select all that apply)

i.   It is a supervised learning algorithm.

ii.   It is an unsupervised learning algorithm.

iii.   It can be used to solve a classification problem.

iv.   The algorithm is based on the concept of iteratively updating its weights in response to errors in its predictions.

c.   Which of the following problems is PLA best suited for? (select one)

i.   Predict housing prices based on location, size, and other factors.

ii.   Classify images, such as identifying the contents of a photograph

iii.   Group customers according to their purchasing behavior

iv.   Extract important features from a dataset, allowing for more accurate analysis

4.   [Level 2: manual exercise] (20pt)

o Given the following dataset of labeled points in two dimensions

. Positive samples: {(3,3), (4,4)}

. Negative samples: {(1,2), (2,1)}

o If the current weight is (-1, 0.3, 0), what should the new weight be if we use PLA to update it?

. Note that $g(\vec{x}) = \mathrm{sign}(w_0 + w_1 x_1 + w_2 x_2)$

o After the update, how many samples does the new weight classify correctly?

. Note: please just update once. No need to complete the whole epoch or the whole training in this manual exercise.
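As an illustration of the mechanics only, a single PLA update can be sketched as below. This is a minimal sketch under stated assumptions: the function names `predict` and `pla_update` are hypothetical, the weight, point, and label are made-up values that are not the assignment's data, and a score of exactly 0 is treated as negative, which is one common convention rather than something the assignment specifies.

```python
import numpy as np

def predict(w, x):
    """Return sign(w0 + w1*x1 + w2*x2) as +1 or -1.

    Assumption: a score of exactly 0 is mapped to -1.
    """
    x_aug = np.concatenate(([1.0], x))  # prepend the constant 1 for the bias w0
    return 1 if np.dot(w, x_aug) > 0 else -1

def pla_update(w, x, y):
    """Apply one PLA step: if (x, y) is misclassified, add y * [1, x1, x2] to w."""
    x_aug = np.concatenate(([1.0], x))
    if predict(w, x) != y:
        return w + y * x_aug
    return w  # correctly classified samples leave the weight unchanged

# Hypothetical values for illustration (not the homework data).
w = np.array([0.0, 1.0, -1.0])
x = np.array([1.0, 2.0])
y = 1

w_new = pla_update(w, x, y)  # score is 0 + 1 - 2 = -1, so the sample is misclassified
print(w_new)
```

Counting how many samples a weight classifies correctly then amounts to calling `predict` on every labeled point and comparing the result with its label.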

5.   [Level 3: Computer-based exercise] (30 points)

o Implement a PLA (using HW1-5.ipynb as the starting point)

o Given the following dataset of labeled points in two dimensions

. Positive samples: {(3,3), (4,4)}

. Negative samples: {(1,2), (2,1)}

o And its initial weight: (-1, 0.3, 0)

o Show the weights after each update

6.   [Level 3: Computer-based exercise] (30 points)

o (Use HW1-6.ipynb as the starting point, together with the PLA routine that you implemented in HW1-5.)

o Assuming (0, 1, 1) is the ground truth of the decision boundary, create 40 unique samples (20 positive and 20 negative).

o First, evenly split the 40 samples into two sets: the training samples and the testing samples.

o Second, train a PLA boundary using 100% of the training samples, and test its accuracy on the unseen testing samples. (Repeat 10 times and report the average accuracy.)

o Third, train a PLA boundary using 60% of the training samples (e.g., 6 positive and 6 negative samples), and test its accuracy on the unseen testing samples. (Repeat 10 times and report the average accuracy.)

o What do you observe?  Can you explain?
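The data-preparation steps above (generating labeled samples from the ground-truth boundary and splitting them evenly) could be set up along the following lines. This is a rough sketch, not the notebook's code: the helper names `make_samples` and `split_half`, the integer grid from -5 to 5, and the fixed seeds are all assumptions. The split is kept class-balanced (10 positive and 10 negative per set), which is inferred from the "6 positive and 6 negative" example for the 60% case.

```python
import numpy as np

def make_samples(w_true, n_per_class=20, low=-5, high=5, seed=0):
    """Draw unique integer points and label them by the sign of w_true . [1, x1, x2].

    Assumptions: points come from an integer grid [low, high]^2, and points
    lying exactly on the boundary (score == 0) are discarded.
    """
    rng = np.random.default_rng(seed)
    pos, neg = set(), set()  # sets guarantee uniqueness
    while len(pos) < n_per_class or len(neg) < n_per_class:
        x1, x2 = (int(v) for v in rng.integers(low, high + 1, size=2))
        score = w_true[0] + w_true[1] * x1 + w_true[2] * x2
        if score > 0 and len(pos) < n_per_class:
            pos.add((x1, x2))
        elif score < 0 and len(neg) < n_per_class:
            neg.add((x1, x2))
    return sorted(pos), sorted(neg)

def split_half(pos, neg, seed=0):
    """Shuffle each class and split it in half, giving balanced train/test sets."""
    rng = np.random.default_rng(seed)
    pos = [pos[i] for i in rng.permutation(len(pos))]
    neg = [neg[i] for i in rng.permutation(len(neg))]
    hp, hn = len(pos) // 2, len(neg) // 2
    train = [(p, 1) for p in pos[:hp]] + [(n, -1) for n in neg[:hn]]
    test = [(p, 1) for p in pos[hp:]] + [(n, -1) for n in neg[hn:]]
    return train, test

pos, neg = make_samples((0, 1, 1))
train, test = split_half(pos, neg)
print(len(pos), len(neg), len(train), len(test))
```

Changing the `seed` passed to `split_half` on each of the 10 repetitions is one way to obtain different splits to average over.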