INST414: Data Science Techniques

Course Syllabus, v1.0, 28 January 2026

Spring 2026

Instructors

Instructor:

Prof. Cody Buntain

Instructor Email:

[email protected]

Instructor Office Hours:

TBD

Instructor Office Location:

HBK 2117B

Instructional TA and Course Aide:

Udaya Sha Krishnarajapuram Radhakrishna <[email protected]>

Jian Zheng <[email protected]>

Amaan Mohammed <[email protected]>

Yixin Bai <[email protected]>

Undergraduate TA:

TBD

TA Email List:

[email protected]

Meeting Time and Spaces

Classroom:

Lecture: ESJ 2204

Discussion: HBK 0115

Meeting Times:

Lecture: Wednesday 3:00 PM-4:15 PM

Practical (0101): Friday 11:00 AM - 12:15 PM

Practical (0102): Friday 12:30 PM - 1:45 PM

First Day of Class:

28 January 2026

Last Day of Class:

18 May 2026

Course Description

This course explores the application of data science techniques to unstructured, real-world datasets, including social media and open data sources. It focuses on techniques and approaches for extracting information relevant to experts and non-experts in a wide range of areas, including smart cities, transportation, and public safety, and on extracting insights from large-scale datasets. The course covers the complete analytical funnel, from data extraction and cleaning to data analysis, interpretation, and visualization. The data analysis component focuses on supervised and unsupervised learning techniques for extracting information from datasets. Topics include clustering, classification, and regression. Through homework assignments, a project, exams, and in-class activities, students will practice using these techniques and tools to extract relevant information from structured and unstructured data.

Required Background

Prerequisite Courses: 1 course with a minimum grade of C- from (INST201, INST301); and minimum grade of C- in INST126, INST314, STAT100, MATH115, and PSYC100.

Students are expected to have prior experience with and competency in computer programming. Proficiency in the Python language is preferred but not essential. Course assignments will primarily be written in Python and built on the Jupyter notebook framework. Jupyter does not come standard on most platforms and is often installed via the command line, so familiarity with console applications is also preferred.

Student Learning Outcomes

This course’s main goal is to expose students to the collection and analysis of large, web-scale data from online sources, so that they can extract actionable insights through applied exercises and hands-on projects. Over this course, students will:

1.  Collect and clean large-scale datasets.

2.  Articulate the math behind supervised and unsupervised techniques.

3.  Execute supervised and unsupervised machine learning techniques.

4.  Select and evaluate various types of machine learning techniques.

5.  Explain the results coming out of the models.

6.  Critically evaluate the accuracy of different algorithms and the appropriateness of a given approach.

Textbooks

The textbooks below provide useful background and reference material. Both are freely available to UMD students.

Introduction to Machine Learning with Python: A Guide for Data Scientists (IMLP)

by Andreas C. Müller and Sarah Guido, ebook available at UMD library

Python Data Science Handbook: Essential Tools for Working with Data

by Jake VanderPlas, available at: https://jakevdp.github.io/PythonDataScienceHandbook/

Software

Jupyter notebooks written in Python 3 will be used for all in-class examples and assignments. The Anaconda distribution of Python 3 is strongly recommended to provide all of these programs and other libraries. If students wish to use an alternative data analysis environment (R, Matlab, Julia, etc.) they are welcome to do so, but instructional support is only guaranteed for Python.

Jupyter also provides a ready-made Docker container for data science-style notebooks, available here:

https://jupyter-docker-stacks.readthedocs.io/

You can also use Google's Colab to run notebooks without a local installation of Jupyter.
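
Whichever environment you choose, it may help to confirm that it is ready before the first assignment. The following is a minimal sketch, not an official course tool. The package list is an assumption for illustration only, since this syllabus does not name the specific libraries the assignments will require.

```python
import importlib.util
import sys

# Hypothetical package list: the syllabus does not specify required libraries,
# so adjust these names to match what the assignments actually use.
PACKAGES = ["notebook", "numpy", "pandas", "sklearn", "matplotlib"]


def check_environment(packages):
    """Return a dict mapping each top-level package name to True if it is importable."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    # Report the interpreter version, then the status of each package.
    print("Python", sys.version.split()[0])
    for name, found in check_environment(PACKAGES).items():
        print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

Run it in a notebook cell or from the command line. Any package reported as MISSING can usually be installed through your distribution's package manager.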