198:462 Introduction to Deep Learning Spring 2023

Posted: 2023-02-08

Regular class time:

Lectures: Monday and Wednesday, 2:00-3:20 pm, WL-AUD (Wright Rieman Laboratories, Chemistry Building, Busch Campus)

Recitations: Monday, 4:05-5:00 pm, WL-AUD

Instructor office hours: Tuesday 6-7 pm (online via Zoom)

https://rutgers.zoom.us/my/elgammal?pwd=TXI4ZlVPN0NYV01nV2JYbjBscjJCQT09

Class TAs/Graders: TBA

Class Web page: Canvas page

The Canvas site for the class is where assignments, announcements, grades, and other resources will be posted.

Please upload your photo to Canvas so I know who is who.

Overview:

This is a basic undergraduate-level course that intends to cover a variety of fundamental deep learning topics to get you acquainted with the field.

Topics:

•     Intro to Machine Learning

•     From Linear Regression to Perceptron

•     Multilayer Perceptron, forward and backward Propagation

•     Convolutional Neural Network Models

•     Recurrent Neural Networks, LSTMs

•     Encoder-Decoder Architectures, Sequence to Sequence learning

•     Attention and Transformer

•    Application domains: Computer Vision and Natural Language Processing
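The list above starts from the perceptron, which is nothing more than a weighted sum passed through a step function. As a one-bite taste of what the early topics look like in code, here is a plain-Python sketch (the function name, weights, and the AND-gate example are purely illustrative, not course material):

```python
# Minimal perceptron forward pass: weighted sum of inputs plus a bias,
# followed by a hard threshold (step) activation.
def perceptron(weights, bias, inputs):
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if s >= 0 else 0

# Example: with these (hand-picked, illustrative) weights the perceptron
# computes the logical AND of two binary inputs.
def and_gate(x1, x2):
    return perceptron([1.0, 1.0], -1.5, [x1, x2])
```

Later weeks replace the hand-picked weights with weights learned by gradient descent, and the step function with differentiable activations.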

Textbooks:

We will mainly follow:

Dive into Deep Learning by Aston Zhang, Zachary C. Lipton, Mu Li, and Alex Smola (https://d2l.ai/)

Other useful textbooks:

§ Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville. (https://www.deeplearningbook.org)

§ For advanced machine learning topics: Pattern Recognition and Machine Learning (PRML) by Christopher M. Bishop (https://www.microsoft.com/en-us/research/uploads/prod/2006/01/Bishop-Pattern-Recognition-and-Machine-Learning-2006.pdf)

Required Background:

Linear algebra (250) and basic probability and statistics (206).

Formal Prerequisites: (01:640:250 INTRO LINEAR ALGEBRA and 01:198:112 DATA STRUCTURES and 01:198:206 INTRODUCTION TO DISCRETE STRUCTURES II)

OR

(01:640:250 INTRO LINEAR ALGEBRA and 01:198:112 DATA STRUCTURES and 01:640:477 MATHEMATICAL THEORY OF PROBABILITY)

OR

(01:640:250 INTRO LINEAR ALGEBRA and 01:198:112 DATA STRUCTURES and 01:960:379 BASIC PROB AND STAT)

Python programming is a must. We will mainly use PyTorch, a Python library. If you do not know Python, try to learn it in the first two weeks of the class to catch up. Some helpful resources will be provided.
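To gauge the programming level expected: one of the first things covered is linear regression trained by gradient descent, which in plain Python (no PyTorch; the data, learning rate, and step count below are illustrative) reduces to a loop like this:

```python
# Fit a one-variable linear model y ≈ w*x + b by gradient descent on
# mean squared error. In the course this is done with PyTorch tensors
# and automatic differentiation; here the gradients are written by hand.
def fit(xs, ys, lr=0.05, steps=500):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of (1/n) * sum((w*x + b - y)^2) w.r.t. w and b.
        dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * dw
        b -= lr * db
    return w, b

# Toy data generated by y = 2x + 1; fit should recover roughly w=2, b=1.
w, b = fit([0, 1, 2, 3], [1, 3, 5, 7])
```

If this sketch is readable to you, you have the Python background the assignments assume.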

Course Load:

Homework Assignments (50%): 4-5 individual assignments involving small programming projects and other problems.

Quizzes (30%): 4-6 quizzes, online or in class (TBA).

Class Project (20%): groups of 2; will require a proposal, presentations, and a report.

Tentative Course Outline and Schedule:

| Week | Topic | Reading |
|------|-------|---------|
| 1 | Intro to Deep Learning | Ch 1 |
| 2 | Machine Learning Basics; Preliminaries: Tensors, Linear Algebra, Auto Differentiation | Ch 2.1-2.3, 2.4-2.6 |
| 3 | Linear Neural Networks for Regression: Closed Form, Gradient Descent | Ch 3 |
| 4 | Linear Regression Implementation; Generalization, Weight Decay | Ch 3.6, 3.7 |
| 5 | Linear Neural Networks for Classification: Softmax Regression, Image Classification Example, Generalization in Classification | Ch 4 |
| 6 | Multilayer Perceptron (MLP): Forward and Backward Propagation | Ch 5.1-5.3 |
| 7 | Numerical Stability and Initialization; Vanishing and Exploding Gradients; Generalization in DL, Dropout | Ch 5.4-5.6 |
| 8 | Going Deeper and Larger: Layers and Modules; Lazy Initialization, Custom Layers | Ch 6 |
| 9 | Convolutional Neural Networks (CNNs) | |
| 10 | Modern CNNs: AlexNet, VGGNet, ResNet, Multi-branch Networks | Ch 8 |
| 11 | Recurrent Neural Networks; Language Models; Backpropagation Through Time | Ch 9 |
| 12 | Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU) | Ch 10 |
| 13 | Encoder-Decoder Architectures; Sequence-to-Sequence Learning; U-Net | |
| 14 | Attention and Transformers | Ch 11 |
| 15 | Generative Models | |