DS 310 Machine Learning

Fall 22

Problem Set 2: Nearest Neighbor Classifier

1. (25 pts.) Suppose a 5-nearest-neighbor search for a query sample q returns {7, 8, 7, 8, 9} as the values of the function at the 5 nearest data samples in the training set. Assuming prediction using 5-nearest-neighbor interpolation, what is the predicted value of the function for the query sample?
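
For reference, k-nearest-neighbor interpolation is often taken to mean the unweighted average of the function values at the k nearest samples, although distance-weighted variants also exist. The sketch below assumes the unweighted form. The function name `knn_interpolate` and the sample values are hypothetical and chosen only for illustration.

```python
from statistics import fmean


def knn_interpolate(neighbor_values):
    """Predict by averaging the function values at the k nearest neighbors.

    Assumption: unweighted mean, so every neighbor counts equally.
    """
    if not neighbor_values:
        raise ValueError("need at least one neighbor value")
    return fmean(neighbor_values)


# Hypothetical neighbor values, not taken from the problem statement.
print(knn_interpolate([2.0, 4.0, 6.0]))
```

A distance-weighted variant would replace the plain mean with a weighted mean whose weights shrink as the neighbor's distance to the query grows.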

2. (25 pts.) Consider the following database of patients with specific symptoms diagnosed by a physician.

Training Example	Fever	Nausea	Diarrhea	Chills
1	no	no	no	no
2	average	no	no	no
3	high	no	no	yes
4	high	yes	yes	no
5	average	no	yes	no
6	no	yes	yes	no
7	average	yes	yes	no

Classification: Healthy, Flu, Salmonella, IBD

(a) Design a suitable representation for this data set.

(b) Define a suitable distance measure.

(d) Classify the preceding query instance using a 3-nearest-neighbor classifier trained on the training data.
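
One possible way to approach parts (a), (b), and (d) is sketched below. It is not the only reasonable design. It assumes Fever is treated as an ordinal attribute scaled to [0, 1], the yes/no attributes are encoded as 0/1, and distance is measured with the Manhattan (L1) metric. The names `encode`, `distance`, and `knn_classify`, the records, and the placeholder labels "A" and "B" are all hypothetical and are not the problem's data.

```python
from collections import Counter

# Assumed encoding: fever is ordinal (no < average < high); the rest are binary.
FEVER = {"no": 0.0, "average": 0.5, "high": 1.0}
YES_NO = {"no": 0.0, "yes": 1.0}


def encode(fever, nausea, diarrhea, chills):
    """Map one patient record to a numeric vector in [0, 1]^4."""
    return (FEVER[fever], YES_NO[nausea], YES_NO[diarrhea], YES_NO[chills])


def distance(u, v):
    """Manhattan (L1) distance; on 0/1 features it equals Hamming distance."""
    return sum(abs(a - b) for a, b in zip(u, v))


def knn_classify(query, examples, k=3):
    """Return the majority label among the k examples nearest to query.

    examples is a list of (vector, label) pairs. Distance ties are broken by
    training order because sorted() is stable.
    """
    nearest = sorted(examples, key=lambda ex: distance(query, ex[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical records with placeholder labels, for illustration only.
train = [
    (encode("no", "no", "no", "yes"), "A"),
    (encode("average", "no", "no", "yes"), "A"),
    (encode("high", "yes", "yes", "yes"), "B"),
    (encode("no", "yes", "no", "no"), "B"),
]
query = encode("high", "no", "no", "no")
print(knn_classify(query, train, k=3))
```

Scaling Fever to [0, 1] keeps it from outweighing the binary attributes. A one-hot encoding with Hamming distance would be another defensible choice if the ordering of fever levels is not considered meaningful.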

3. (25 pts.) Suppose you are given a data set consisting of text documents that have been labeled as belonging to one of two classes (+ denoting "interesting" and − denoting "not interesting"). Suppose the vocabulary used in the document consists of V distinct words w_1, ..., w_V. Suppose you want to construct a nearest-neighbor classifier from this data set.

(a) Design a suitable representation for documents that is amenable to use with a nearest-neighbor classifier.

(b) Define a suitable distance measure to use with a nearest-neighbor classifier.
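
A common choice here, shown as a minimal sketch rather than a prescribed answer, is a bag-of-words count vector of length V paired with cosine distance. The tokenization (lowercasing and splitting on whitespace), the function names, and the four-word vocabulary are assumptions made for illustration.

```python
import math
from collections import Counter


def bag_of_words(text, vocabulary):
    """Represent a document as a length-V vector of word counts."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocabulary]


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # assumption: an empty document is maximally distant
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical vocabulary and documents.
vocabulary = ["learning", "machine", "weather", "rain"]
d1 = bag_of_words("machine learning machine", vocabulary)
d2 = bag_of_words("rain weather", vocabulary)
print(d1, d2, cosine_distance(d1, d2))
```

Cosine distance ignores document length, which is usually desirable for text. Euclidean distance on raw counts would tend to separate long and short documents even when they discuss the same topic.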

4. (25 pts.) Suppose you are given a document collection where each document consists of both text and pictures. Suppose each document has been labeled as belonging to one of two classes (+ denoting "interesting" and − denoting "not interesting"). Suppose the vocabulary used in the document consists of V distinct words w_1, ..., w_V. Suppose the pictures are made of a visual vocabulary of F distinct identifiable features. Suppose you want to construct a nearest-neighbor classifier from this data set. How would you go about designing a nearest-neighbor classifier in this case? Explain with pseudo-code.
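
One way to structure such a classifier, offered as a sketch under stated assumptions, is to keep a length-V word-count vector and a length-F visual-feature-count vector per document, compute a distance on each part separately, and blend the two with a mixing weight. The weight `alpha`, the dictionary keys `words` and `features`, the function names, and the toy collection below are all hypothetical.

```python
import math
from collections import Counter


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # assumption: an empty part is maximally distant
    return 1.0 - dot / (norm_u * norm_v)


def combined_distance(doc_a, doc_b, alpha=0.5):
    """Blend text and picture distances; alpha is an assumed mixing weight."""
    text_d = cosine_distance(doc_a["words"], doc_b["words"])
    pict_d = cosine_distance(doc_a["features"], doc_b["features"])
    return alpha * text_d + (1.0 - alpha) * pict_d


def classify(query, training_set, k=3, alpha=0.5):
    """Majority vote over the k training documents nearest to query."""
    nearest = sorted(
        training_set,
        key=lambda ex: combined_distance(query, ex[0], alpha),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy collection with V = 3 words and F = 2 visual features.
training_set = [
    ({"words": [2, 0, 1], "features": [1, 0]}, "+"),
    ({"words": [1, 0, 2], "features": [2, 0]}, "+"),
    ({"words": [0, 3, 0], "features": [0, 1]}, "-"),
    ({"words": [0, 1, 0], "features": [0, 2]}, "-"),
]
query = {"words": [1, 0, 1], "features": [1, 0]}
print(classify(query, training_set, k=3))
```

Computing the two distances separately keeps either modality from dominating simply because V and F differ in size. In practice `alpha` and `k` would typically be chosen by cross-validation on the labeled collection.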