Probabilistic Modeling and Statistical Computing

Semester: Fall 2023

Course number: DSAN-5100

Course description

Probabilistic models are essential for the understanding of data that are affected by uncertainty. This course introduces students to the fundamentals of probabilistic modeling and then covers computational techniques for the analysis of such data. After introducing basic concepts and approaches such as probability distributions, random variables, and conditioning, the course covers basic probability distributions that are frequently used in practice and some of their properties, such as Laws of Large Numbers. In the second half, students will learn about computational techniques for the use of probabilistic models. This includes methods for faithful simulation of random variables (Monte Carlo), the extraction of condensed models from observed data (maximum likelihood, Bayesian models), methods for models with hidden or partially observed variables (latent variables, expectation-maximization, hidden Markov models), and some general data science techniques that incorporate probabilistic models (graphical models, stochastic optimization).

Objective: Statistics is the language in which data is analyzed and interpreted, and thus any serious data scientist must have a firm understanding of the mathematical principles of probability and statistics. A hardworking student of this course will build this critical foundation.

Details: This course is a self-contained introduction to probability and statistics with a focus on data science. The topics covered span the fundamentals of probability theory and statistical inference: probabilistic models, random variables, useful distributions, expectations, the law of large numbers, the central limit theorem, point and confidence interval estimation, maximum likelihood methods, hypothesis tests, and linear regression (as time permits).

Prerequisites: This course assumes a good knowledge of R, completion of a calculus-based introductory probability and statistics class, and familiarity with linear algebra (mainly matrix algebra) and multivariable calculus (partial derivatives, integration, optimization with several variables).

Technology details: The course will use statistical software (R) throughout. It is available as a free download at https://cran.r-project.org. I recommend that you also obtain RStudio, which is available for free at https://www.rstudio.com. This software is also installed in all Georgetown University Information Services (UIS) computer labs. You are expected to be proficient in R and to be able to use it on weekly assignments. Any laptop with 2GB or more of RAM running Windows 7 or higher, Mac OS X, or a recent distribution of Linux is sufficient.

Course details

Instructional team

The following are the email addresses for the instructional team.

Professors

Professor James Hickman: webpage

Professor Jeffrey Jacobs: webpage

Professor Benjamin Houghton: webpage

Teaching assistants

Matthew Moriarty

Xulu Wang

Kefan Yu

Yiming Chen

Shihong Zhou

Ziyue Li

Xin Xiang

Linlin Wang

Katherine Nunez

Xinyang Liu

Canvas

Most of the content for the course will be hosted on this website. However, Canvas will also be used for deliverable submissions, grading, quizzes, and due dates.

The Canvas site for the class can be found at the following link: Canvas page

Textbooks

Primary textbook(s):

The course does not follow a specific textbook; however, content will be drawn from the following books, all of which can be found online.

• M. H. DeGroot and M. J. Schervish, Probability and Statistics, Fourth Edition. Pearson 2012. ISBN-13: 978-0321500465. (pdf)

• Peter Dalgaard, Introductory Statistics with R, 2nd edition. Springer 2008. Softcover ISBN 978-0-387-79053-4. (pdf)

Extra textbook(s):

• George Casella and Roger Berger, Statistical Inference, Second Edition. (pdf)

• Laura Chihara and Tim Hesterberg, Mathematical Statistics with Resampling and R. Wiley 2011. ISBN: 978-1-118-02985-5. (pdf)

For students not familiar with elementary probability or statistical inference at the introductory undergraduate level, the following book is a useful supplement to the textbooks above:

• Probability and Statistics for Engineering and the Sciences. 9th Edition, Jay L. Devore. Publisher: Cengage Learning. ISBN: 1305251806. Chapters 1-9. (pdf)

Zoom link

Almost all meetings of this class will be in person. Occasionally, however, we may have remote lectures; these will be held at the following Zoom link:

https://georgetown.zoom.us/j/97534516609?pwd=dHVSNlcwaWpsSVZMQjlUTkMyYUJPZz09

SharePoint folder

The SharePoint tab in the navigation bar contains IMPORTANT information and shared resources for the course.

The links will take you to cloud-based directories (folders) with various sub-folders such as “slides”, “labs”, “codes”, “homeworks”, etc.

The files in the SharePoint folder are also built into the course website, so you can access them from the website download links OR from the SharePoint folders, whichever you prefer.

New content and slides will be periodically added to the folders as the semester progresses.

If something appears to be missing then notify your professor ASAP!

Canvas navigation

Canvas can be organized in a number of different ways. Watch the 5-minute Canvas video tutorial to learn more about how Canvas navigation works. We hope this gives you a sense of what Canvas offers and how you can navigate it.

Canvas section names

The following information only applies to multi-section classes. It can be ignored if your class only has a single section.

Often, we combine all lecture sections into one large Canvas “super-section”. This is done to make course management easier for the instructional team.

This change only affects the course name that you see in Canvas. It DOES NOT change which section you are registered in or the section you need to attend during lecture.

Your “true” section is defined in MyAccess. For example, if MyAccess says you are in 5000-04 then you are in 5000-04, even if the Canvas course name is 5000-01.

Another way to check your “true” section is to click the “people” tab on the left of Canvas. From there you can see the section in which you are registered.

Course structure

The following is the typical format for most DSAN courses; however, professors may choose to deviate from this format.

Modules

The course runs for 14 weeks and is divided into 6 modules. Each module is two weeks long and focuses on a particular topic.

The first and last week of the class will be standalone (i.e. not included in any module).

Generally, each module includes an introductory lecture (first week) followed by a more in-depth lecture (second week).

Lectures (1.5 hr)

Each week will contain a 1.5-hour lecture, followed by a 1-hour lab period.

The lecture covers a large amount of information, with Q&A at the end (~10 minutes).

Lab (1.0 hr)

General Q&A (~10 minutes)

Hands-on coding demonstration and assignment (.ipynb or .rmd) (~40 minutes)

Student presentations (when applicable) + Q&A (~10-15 minutes)

Outside of class

This is a graduate-level class; you should expect to spend around 10 to 20 hours per week on it outside of class.

Meeting details

Location

Unless otherwise stated, meetings are in person with the following details:

Note: Meeting times & locations may be subject to change; refer to MyAccess for official information.

5100-01: (Professor Hickman): Monday 3:30-4:45 PM, St. Marys Room: 107

5100-02 (lab): (Professor Houghton): Monday 4:46-6:00 PM, St. Marys Room: 111

5100-03: (Professor Jacobs): Thursday 12:30-3:00 PM, Car Barn Room: 201

5100-04: (Professor Hickman & Houghton): Wednesday 6:30-9:00 PM, Car Barn Room: 203

5100-05 (lab): (Professor Hickman): Monday 4:46-6:00 PM, Reiss Room: 502

Calendar

This is the tentative meeting schedule for the class. If changes occur, you will be notiØed via Canvas.

Note: All times below are in Eastern Time (ET)

Meeting  Monday Sections             Wednesday Sections          Thursday Sections
1        Aug-24, 12:30 PM - 3:00 PM  Aug-24, 12:30 PM - 3:00 PM  Aug-24, 12:30 PM - 3:00 PM
2        Aug-28                      Aug-30                      Aug-31

Modified schedule (meeting 1): For the first class, all sections (lecture & lab) will be absorbed into one remote section meeting. If you can’t attend during this time slot, then email in advance and watch the recording afterwards (zoom link).

Regular schedule

All sections meet in person at their regular time & place.

Meeting  Monday Sections  Wednesday Sections  Thursday Sections
3        Sept-5 (T)       Sept-6              Sept-7
4        Sept-11          Sept-13             Sept-14
5        Sept-18          Sept-20             Sept-21
6        Sept-25          Sept-27             Sept-28
7        Oct-2            Oct-4               Oct-5

8        Oct-11 (W)       Oct-11              Sept-6

Modified schedule (meeting 8): To accommodate the holiday, certain sections follow a modified schedule this week. Wednesday sections: regular meeting. Sections 1, 2, 3 and 5 meet in a combined remote meeting this week on Wednesday Oct-11 from 6:30-9:00 PM. If you can’t attend during this time slot, then email your professor in advance & watch the recording