
MLDS 421: Data Mining

Assignment 3

Individual Assignment (100 points)

Instructions:

• Submit the paper review as a Word or PDF file.

• Submit code as a Python notebook (.ipynb) file along with the HTML version.

• Write elegant code with substantial comments. If you have referred to or reused code from a website, add the links as references.

1. Paper Review – Following the guidelines, review any one of the technical papers from Group 3. (30)

2. Design and build a movie recommendation engine on the MovieLens dataset. (70)

• Exploratory Data Analysis (EDA): (10)

1. Perform exploratory data analysis to understand the structure and characteristics of the dataset.

2. Visualize key statistics such as the distribution of movie ratings, user preferences, etc.

3. Explore relationships between variables (e.g., user ratings, movie genres).
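
As a starting point for the EDA steps above, the following minimal sketch computes a ratings distribution, per-genre mean ratings, and per-user activity. It assumes rows shaped like the `userId`, `movieId`, `rating` columns and pipe-separated genre strings found in common MovieLens releases; the toy data and variable names are illustrative only, and a real notebook would load the actual files and plot the results.

```python
from collections import Counter, defaultdict

# Toy rows in an assumed (userId, movieId, rating) shape; not real MovieLens data.
ratings = [
    (1, 10, 4.0), (1, 20, 3.0), (2, 10, 5.0),
    (2, 30, 2.0), (3, 20, 3.0), (3, 30, 4.0),
]
# movieId -> pipe-separated genres (assumed layout of the movies file).
genres = {10: "Action|Comedy", 20: "Drama", 30: "Comedy"}

# Ratings distribution: how often each rating value occurs.
rating_counts = Counter(r for _, _, r in ratings)

# Mean rating per genre: a movie with several genres counts toward each one.
by_genre = defaultdict(list)
for _, movie_id, r in ratings:
    for g in genres[movie_id].split("|"):
        by_genre[g].append(r)
genre_means = {g: sum(v) / len(v) for g, v in by_genre.items()}

# Ratings per user: a quick view of how active each user is.
user_activity = Counter(u for u, _, _ in ratings)

print(sorted(rating_counts.items()))
print(genre_means)
print(user_activity.most_common())
```

The same three aggregates map directly onto a histogram of ratings, a bar chart of genre means, and a histogram of ratings per user.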

• Building the Recommendation Engine: (30)

4. Implement two collaborative filters: user-based and item-based.

5. Implement collaborative filters based on two different matrix factorization techniques.

6. Split the dataset into training and testing sets.

7. Train the recommendation engine on the training set.

8. Generate movie recommendations for users based on their historical ratings or preferences.

9. Evaluate the performance of the recommendation engine using appropriate metrics.
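
The steps above can be sketched end to end on toy data: a random train/test split, an item-based neighbourhood filter using cosine similarity, a matrix factorization model trained with stochastic gradient descent, and RMSE as one possible evaluation metric. Everything here is an assumption made for illustration: the class names `ItemKNN` and `MatrixFactorization`, the hyperparameter values, the 1 to 5 rating scale, and the toy ratings are hypothetical, and this is only one of several valid ways to implement each step.

```python
import math
import random
from collections import defaultdict

# Toy (user, item, rating) triples standing in for MovieLens ratings.
ratings = [
    ("u1", "m1", 5.0), ("u1", "m2", 4.0), ("u1", "m3", 1.0),
    ("u2", "m1", 4.0), ("u2", "m2", 5.0), ("u2", "m4", 2.0),
    ("u3", "m1", 1.0), ("u3", "m3", 5.0), ("u3", "m4", 4.0),
    ("u4", "m2", 2.0), ("u4", "m3", 4.0), ("u4", "m4", 5.0),
]

def split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]

def cosine(a, b):
    """Cosine similarity of two sparse {user: rating} vectors."""
    dot = sum(r * b[u] for u, r in a.items() if u in b)
    if dot == 0:
        return 0.0
    norm_a = math.sqrt(sum(r * r for r in a.values()))
    norm_b = math.sqrt(sum(r * r for r in b.values()))
    return dot / (norm_a * norm_b)

class ItemKNN:
    """Item-based filter: score an item from the user's ratings of similar items."""

    def fit(self, rows):
        self.by_item = defaultdict(dict)  # item -> {user: rating}
        self.by_user = defaultdict(dict)  # user -> {item: rating}
        for user, item, r in rows:
            self.by_item[item][user] = r
            self.by_user[user][item] = r
        self.mean = sum(r for _, _, r in rows) / len(rows)
        return self

    def predict(self, user, item, k=2):
        target = self.by_item.get(item, {})
        sims = [(cosine(target, self.by_item[j]), r)
                for j, r in self.by_user.get(user, {}).items() if j != item]
        top = [(s, r) for s, r in sorted(sims, reverse=True)[:k] if s > 0]
        if not top:  # unseen user or item, or no overlapping neighbours
            return self.mean
        return sum(s * r for s, r in top) / sum(s for s, _ in top)

class MatrixFactorization:
    """Latent-factor model fitted by SGD on the observed ratings only."""

    def fit(self, rows, n_factors=2, epochs=200, lr=0.01, reg=0.05, seed=0):
        rng = random.Random(seed)
        self.mean = sum(r for _, _, r in rows) / len(rows)
        users = sorted({u for u, _, _ in rows})
        items = sorted({i for _, i, _ in rows})
        self.P = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
        self.Q = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}
        for _ in range(epochs):
            for u, i, r in rows:
                p, q = self.P[u], self.Q[i]
                err = r - (self.mean + sum(pf * qf for pf, qf in zip(p, q)))
                for f in range(n_factors):
                    pf, qf = p[f], q[f]
                    p[f] += lr * (err * qf - reg * pf)  # regularized gradient step
                    q[f] += lr * (err * pf - reg * qf)
        return self

    def predict(self, user, item):
        if user not in self.P or item not in self.Q:
            return self.mean  # fall back to the global mean for unseen ids
        raw = self.mean + sum(pf * qf for pf, qf in zip(self.P[user], self.Q[item]))
        return min(5.0, max(1.0, raw))  # clip to the toy 1-5 scale

def rmse(model, rows):
    """Root mean squared error of a model's predictions on held-out rows."""
    return math.sqrt(sum((model.predict(u, i) - r) ** 2 for u, i, r in rows) / len(rows))

train, test = split(ratings)
knn = ItemKNN().fit(train)
mf = MatrixFactorization().fit(train)
knn_rmse, mf_rmse = rmse(knn, test), rmse(mf, test)
print(f"item-kNN RMSE: {knn_rmse:.3f}  MF RMSE: {mf_rmse:.3f}")
```

A user-based filter follows the same pattern with the roles of users and items swapped. The second matrix factorization technique asked for above could, for example, use alternating least squares instead of SGD. With only twelve toy ratings the RMSE values carry no meaning; they only show how the pieces connect.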

• Fine-tuning and Optimization: (20)

10. Experiment with the algorithms above and fine-tune parameters (e.g., similarity measures, neighborhood size) to optimize the recommendation engine's performance.

11. Fine-tune the recommendation engine based on evaluation metrics.

12. What would be your approach to handling cold starts for users?

13. Design and implement a hybrid recommender using the top two models. Use an architecture diagram to illustrate your hybrid model.
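
One possible shape for the hybrid and cold-start items above is a weighted blend of two rating predictors, with a popularity ranking as the fallback for users who have no history. The sketch below is only an illustration of that idea: `TableModel` is a hypothetical stand-in for any trained model exposing `predict(user, item)`, and the blend weight and minimum rating count are arbitrary values that would need tuning.

```python
from collections import defaultdict

# Toy (user, item, rating) triples; not real MovieLens data.
ratings = [
    ("u1", "m1", 5.0), ("u1", "m2", 4.0),
    ("u2", "m1", 4.0), ("u2", "m3", 2.0),
    ("u3", "m1", 5.0), ("u3", "m2", 3.0),
]

class TableModel:
    """Hypothetical stand-in for a trained model: looks scores up in a table."""

    def __init__(self, scores, default):
        self.scores, self.default = scores, default

    def predict(self, user, item):
        return self.scores.get((user, item), self.default)

class WeightedHybrid:
    """Blend two predictors; weight_a is the share given to the first model."""

    def __init__(self, model_a, model_b, weight_a=0.5):
        self.model_a, self.model_b, self.weight_a = model_a, model_b, weight_a

    def predict(self, user, item):
        a = self.model_a.predict(user, item)
        b = self.model_b.predict(user, item)
        return self.weight_a * a + (1 - self.weight_a) * b

def popularity_ranking(rows, min_count=2):
    """Rank items by mean rating, skipping items with too few ratings."""
    by_item = defaultdict(list)
    for _, item, r in rows:
        by_item[item].append(r)
    scored = [(sum(v) / len(v), item)
              for item, v in by_item.items() if len(v) >= min_count]
    return [item for _, item in sorted(scored, reverse=True)]

def recommend(model, user, rows, n=2):
    """Top-n unseen items for a known user; popular items for a new one."""
    seen = {i for u, i, _ in rows if u == user}
    if not seen:  # cold start: no history to personalize on
        return popularity_ranking(rows)[:n]
    candidates = {i for _, i, _ in rows} - seen
    return sorted(candidates, key=lambda i: model.predict(user, i), reverse=True)[:n]

model_a = TableModel({("u2", "m2"): 4.0}, default=3.0)
model_b = TableModel({("u2", "m2"): 2.0}, default=3.0)
hybrid = WeightedHybrid(model_a, model_b, weight_a=0.75)

print(hybrid.predict("u2", "m2"))          # blended score for a known user
print(recommend(hybrid, "u2", ratings))    # personalized list
print(recommend(hybrid, "u9", ratings))    # cold-start fallback
```

A fixed weight is the simplest blending choice. Alternatives include learning the weights on a validation set or switching between models depending on how many ratings a user has.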

• Presentation and Documentation: (10)

14. Summarize your approach, findings, and implementation details.

15. Present the recommendation engine's performance metrics and any optimizations made.