
QTM 347: Machine Learning

Spring 2024

Course Description

This course is designed to introduce students to the field of machine learning, an essential toolset for making sense of the vast and complex datasets that have emerged in fields ranging from biology to finance to marketing to astrophysics in the past twenty years. This class will present a number of important modeling and prediction techniques that are staples in the fields of machine learning, artificial intelligence, and, more broadly, data science. In addition, this course will cover the statistical underpinnings of common methods. The tentative list of topics includes:

1.  Regression as a predictive task and general model fitting: a review of linear regression, cross-validation (including leave-one-out cross-validation), bootstrapping, alternative prediction methods for continuous outcomes, regularized regression (Ridge, LASSO, and Elastic Net), non-linear methods, and tree-based methods

2.  Classification methods: K-nearest neighbors, naive Bayes classifiers, linear discriminants, logistic regression, and deep learning

3.  Unsupervised methods: principal components analysis and clustering, including the k-means algorithm

Grading

You are responsible for keeping up with all announcements made in class and for all changes in the schedule that are posted on the class website.

The grade will be based on the following:

   Homework 30%

   Exam 30%

   Final project 35%

   Participation 5%
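The weights above sum to 100%, so the overall score is a weighted average of the four components. A minimal sketch of that arithmetic follows. It assumes each component is first expressed on a 0-100 scale; the syllabus does not say how the three homework scores are combined into the homework component, so that step is left out. The names `WEIGHTS` and `course_score` are illustrative only.

```python
# Weights taken from the grading breakdown (percent of the final grade).
WEIGHTS = {"homework": 30, "exam": 30, "final_project": 35, "participation": 5}


def course_score(scores):
    """Return the weighted course score on a 0-100 scale.

    `scores` maps each component name to a score on a 0-100 scale
    (an assumption; the syllabus only states the weights).
    """
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must contain exactly the components in WEIGHTS")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical component scores, for illustration only.
example = {"homework": 90, "exam": 80, "final_project": 85, "participation": 100}
print(course_score(example))  # 85.75
```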

Homework

There will be a total of three homework assignments. The homework assignments consist of both theoretical and empirical questions. The main programming language used in class is Python.

The homework assignments are done in groups of up to four students. You should keep the same group for all homework assignments. The homework will be graded on a 100-point scale. Each group has a total of three free late days across all homework assignments, and can use at most two late days on any one assignment.
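The late-day policy combines two limits: a total across all assignments and a cap per assignment. A minimal sketch of a check for both limits follows. It assumes late days are counted in whole days per group; the function and constant names are illustrative, and the syllabus does not say what happens once the free late days are used up.

```python
TOTAL_FREE_LATE_DAYS = 3  # per group, across all homework assignments
MAX_LATE_DAYS_PER_HW = 2  # cap for any single assignment


def late_days_allowed(days_per_homework):
    """Return True if the group's late-day usage fits the stated policy.

    `days_per_homework` lists the whole late days used on each assignment
    (whole-day counting is an assumption).
    """
    if any(d < 0 for d in days_per_homework):
        raise ValueError("late days cannot be negative")
    within_cap = all(d <= MAX_LATE_DAYS_PER_HW for d in days_per_homework)
    within_total = sum(days_per_homework) <= TOTAL_FREE_LATE_DAYS
    return within_cap and within_total


print(late_days_allowed([2, 1, 0]))  # True: 3 days total, none above 2
print(late_days_allowed([3, 0, 0]))  # False: exceeds the per-assignment cap
print(late_days_allowed([2, 2, 0]))  # False: exceeds the 3-day total
```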

Homework is assigned and due as follows:

●   Homework 1 handed out on Feb 4, 2024, due on Feb 25, 2024

●   Homework 2 handed out on Feb 25, 2024, due on Mar 17, 2024

●   Homework 3 handed out on Mar 17, 2024, due on Apr 7, 2024

Exam

The exam will be a take-home exam. It will be handed out on Apr 12, 2024 at 12:00 am and is due on Apr 15, 2024 at 11:59 pm. You can choose any 24-hour period within that window to complete it. There is no make-up exam.

Course Project

The goal of the course project is to give you project experience in machine learning. By the end of the project, we hope that you will have gained some hands-on experience in applying ML to a real-world problem, or explored a research frontier in machine learning. We will provide a sample list of datasets and papers. You have two options to complete the project. The first option is to pick a dataset that interests you and apply what we have learned this semester to analyze it.

The second option is to replicate a research paper and explore the possible extensions/improvements of the paper.

The course project is done in the same group as the homework.

There is a project proposal presentation on Mar 6, 2024. Each group needs to prepare a five-minute presentation that names all the group members (up to four students) and the topic of the group's project.

There are final project presentations on Apr 24, 2024 and Apr 29, 2024. Each group needs to prepare a fifteen-minute presentation that includes the motivation, setup, and results of the project. Before the full project presentation, we ask that you set up a publicly available GitHub repository about your work, along with detailed documentation about how to use the code repository and what findings you currently have about the project.

When each group presents, we expect the other groups to provide critical feedback, which will count toward participation in this course.

Finally, by May 8, 2024, refine the GitHub repository and the accompanying documentation.

Prerequisites

QTM 220 Regression Analysis or equivalent.

The class does not assume specific knowledge of models for causal analysis and machine learning. Hence the class will cover the relevant models in the two areas in addition to the combination of the two.

Textbook

You will only be tested on the material presented in lectures and covered in the homework. Some of the problems and supplementary material will be drawn from:

●   James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. An Introduction to Statistical Learning. Vol. 112. New York: Springer, 2013.

●   Hastie, Trevor, Robert Tibshirani, and Jerome H. Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Vol. 2. New York: Springer, 2009.

Honor Code

All students enrolled at Emory are expected to abide by the Emory College Honor Code. Academic misconduct of any type is not allowed. This includes 1) receiving or giving information about the content or conduct of an examination knowing that the release of such information is not allowed and 2) plagiarizing, whether intentionally or unintentionally, in any assignment. For the activities that are considered to be academically dishonest, refer to the Honor Code:

http://catalog.college.emory.edu/academic/policies-regulations/honor-code.html.

Accessibility and Accommodations

As the instructor of this course, I endeavor to provide an inclusive learning environment. I want every student to succeed. The Department of Accessibility Services (DAS) works with students who have disabilities to provide reasonable accommodations. It is your responsibility to request accommodations. In order to receive consideration for reasonable accommodations, you must register with the DAS at http://accessibility.emory.edu/students. Accommodations cannot be applied retroactively, so contact DAS as early as possible. Please also contact me as early as possible in the semester to discuss the plan for implementing your accommodations.

For additional information about accessibility and accommodations, please contact the Department of Accessibility Services at (404) 727-9877 or accessib[email protected].