Algorithms & Applications (ENG-160)

Spring 2023

Lab #12

Due: April 28, 2023 @ 7 PM

The purpose of this lab is to allow you to submit your final project code and team worksheet for early feedback.

You can use the feedback to improve your final project submission due on May 5th, 2023.

1. You must submit your work to Brightspace for grading.

2. Please make sure the filenames for the function and script (.m) files are the same as described in the problem statement.

3. Upload all the required files to Brightspace.

k-Means Clustering Excerpt

The k-Means Clustering algorithm is an unsupervised learning algorithm that classifies data into k distinct clusters. Clustering is the task of grouping data together based on some measure of similarity. For example, we might cluster based on Euclidean distance:

Figure 1. Clusters of Data

In Figure 1, we see three clusters of data points centered at different locations. The closer together two points are, the more similar they are. Typically, clustering is used to look for possibly unknown patterns in a dataset. For the final project, you are to implement the k-means algorithm and apply it to a couple of example datasets. For additional information regarding k-means, you can refer to the final project specification in Brightspace or the following sources:

K-means: A Complete Introduction. K-means is an unsupervised clustering… | by Alan Jeffares | Towards Data Science

Image Compression using K-Means Clustering | by Satyam Kumar | Towards Data Science

Clear, Visual Explanation of K-Means for Image Compression with GIFs | by Sebastian Charmot | Towards Data Science
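The grouping by Euclidean distance described above can be sketched as a single k-means iteration on toy data. This is only an illustrative sketch, not the required kMeansCompute implementation: the data X, the starting centroids C, and all variable names are made up for this example, and it assumes MATLAB R2016b or later for implicit expansion.

```matlab
% Toy data: six 2-D points forming two obvious groups (made up for illustration).
X = [1 1; 1 2; 2 1; 8 8; 8 9; 9 8];
% Two hypothetical starting centroids, one per row.
C = [0 0; 10 10];

% Squared Euclidean distance from every point to every centroid.
% D(i, j) is the squared distance between point i and centroid j.
D = zeros(size(X, 1), size(C, 1));
for j = 1:size(C, 1)
    D(:, j) = sum((X - C(j, :)).^2, 2);
end

% Assignment step: each point joins its nearest centroid.
[~, idx] = min(D, [], 2);

% Update step: move each centroid to the mean of its assigned points.
for j = 1:size(C, 1)
    C(j, :) = mean(X(idx == j, :), 1);
end
```

A full k-means run would repeat these two steps until the centroids move less than a tolerance or an iteration limit is reached. This sketch also ignores the case where a cluster ends up with no points.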


Part I (4 pts)

For the final project, you need to form a team with a fellow student and document a collaboration plan using the TeamWorksheet.xlsx template provided in Brightspace. Report the progress your team has already made and the work that remains using the project plan spreadsheet. Upload the project plan spreadsheet to Brightspace for grading.

Part II (16 pts)

Submit the two functions, kMeansCompute.m and labelsKMeans.m, and the test script that you created to call them and perform the experiments described on page 5 of the project specification.

For the experiment, it is suggested that you create a script file that reads in the data, sets the k-means parameters, runs the k-means algorithm, relabels the clusters, and then computes the accuracy. A snippet of the script is as follows:

%% Run the k-means algorithm on the training dataset

[centroids, idx, itr] = kMeansCompute(dataTrain, K, distFunc, tol, max_iters);

%% Relabel the clusters

meanLabels = mapLabels(centroids, idx, labelsTrain);

idxTrain = labelsKMeans(centroids, meanLabels, dataTrain, distFunc);

%% Prediction accuracy on the training dataset (repeat for the test dataset)

trainAccuracy = sum(idxTrain == labelsTrain)/length(idxTrain);
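The relabeling and accuracy steps can be illustrated on toy data. One common way to relabel clusters is a majority vote, where each cluster takes the most frequent true label among its members. Whether mapLabels in the project specification works this way is an assumption, and the toy data and the names clusterLabels and predicted below are hypothetical.

```matlab
% Toy cluster assignments and true labels (made up for illustration).
idx         = [1; 1; 1; 2; 2; 3; 3; 3];   % cluster index per sample
labelsTrain = [5; 5; 7; 7; 7; 9; 9; 5];   % true class label per sample
K = 3;

% Majority vote (an assumed relabeling rule): each cluster takes the
% most frequent true label among its members.
clusterLabels = zeros(K, 1);
for j = 1:K
    clusterLabels(j) = mode(labelsTrain(idx == j));
end

% Translate cluster indices into predicted class labels.
predicted = clusterLabels(idx);

% Fraction of samples whose predicted label matches the true label.
accuracy = sum(predicted == labelsTrain) / length(labelsTrain);
fprintf('Accuracy: %.2f\n', accuracy);
```

Here clusters 1, 2, and 3 map to labels 5, 7, and 9, and six of the eight samples are labeled correctly, so the accuracy is 0.75. The same comparison applies to a held-out test set once its samples have been assigned to the nearest centroid.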

Submit TeamWorksheet.xlsx, kMeansCompute.m, labelsKMeans.m, the test script, and the published PDF of the test script showing initial experimental results. You must upload all five files to Brightspace for full credit.