
MSBX 5410: Fundamentals of Data Analytics

2022

Course Description: This course will expose students to fundamental concepts for working with data programmatically. We will cover basic techniques for making sense of raw data, including subsetting, aggregating, and summarizing; data cleaning and manipulation; and visualization. We will also cover basic and more advanced techniques for analyzing data with statistical models. Sessions will consist of both lecture and class exercises. The programming and statistical concepts will be taught primarily using R (a programming language designed for statistical computing), but are broadly applicable to other languages and data analysis tools.

Textbook: There is no official textbook for this class. If you are looking for a “textbook” to augment my lecture material and notes, Hadley Wickham has written a great book for beginners (https://r4ds.had.co.nz/) and one for more advanced users (https://adv-r.hadley.nz/). Other suggestions for reference material will be provided as the course progresses.

Lecture Slides: I will post lecture slides after class.

Lecture Capture: The afternoon section (205) will be recorded and made available to all students. Distance students (sections 205B and 574) should try to watch the videos as soon as possible to stay in sync with the in-person sections. Lectures will be posted here:

https://leedscapture.mediasite.com/Mediasite/Channel/msbx-5410-reinholtz

Computer Policy: If it wasn’t obvious, you’ll need your computer for each class. You’ll need Excel for the first session. After that, you’ll need to have R installed. I prefer, and will use during class, “base” R and not RStudio. You can download R from any “CRAN mirror”, but here is a link (Iowa State’s mirror): https://mirror.las.iastate.edu/CRAN/

Grading:

15%   Homework Assignments
15%   Project
35%   Exam 1
35%   Exam 2

Homework Assignments: There will be four homework assignments (see schedule for due dates). You may talk to other students in the class about the homework. However, you may not copy another student’s work or data. For example, you can ask another student “Why isn’t this code working?” and show that student your code. But you cannot ask another student “Can I see your solution?” In short, you can get/provide help with debugging, but you cannot just share/copy the solutions. You may only submit homework once. Late homework will not be accepted. I will drop your lowest homework grade.
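As a rough illustration of how the grading weights and the drop-the-lowest-homework rule combine, here is a minimal sketch. It is not an official grade calculator: the function name `course_grade` is hypothetical, every score is assumed to be a percentage from 0 to 100, and the remaining homework scores are assumed to be averaged with equal weight. It is written in Python only to keep the arithmetic easy to check; the same calculation carries over directly to R.

```python
def course_grade(homework, project, exam1, exam2):
    """Weighted course grade on a 0-100 scale.

    Assumptions (not stated in the syllabus): all scores are percentages,
    and the homework scores kept after dropping the lowest are averaged
    with equal weight.
    """
    if len(homework) < 2:
        raise ValueError("need at least two homework scores to drop one")
    kept = sorted(homework)[1:]  # drop the single lowest homework score
    homework_avg = sum(kept) / len(kept)
    # Weights from the Grading section: 15% / 15% / 35% / 35%.
    return 0.15 * homework_avg + 0.15 * project + 0.35 * exam1 + 0.35 * exam2


# Hypothetical scores, for illustration only.
example = course_grade([90, 70, 100, 80], project=85, exam1=88, exam2=92)
print(round(example, 2))
```

With these made-up scores the 70 is dropped, so the homework component is based on the average of 80, 90, and 100.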

Project: The project can be done as a group (up to 5 team members) or individually. There are two options for the project topic, which I will discuss in class:

Option 1: Airbnb project. I have collected Airbnb data (from InsideAirbnb.com) and have provided a series of questions that the data can inform. I’ve designed the questions so that not all of them have exact “correct” answers; you’ll have to think about them and what you can learn/say from the data.

Option 2: Choose your own project. This option is intended for students with prior programming/analysis experience and a desire to learn skills/techniques not directly covered in my course. You need my permission if you want to pursue this option! The topic must be data or programming oriented but, beyond this, can be about whatever you want. In the past, students have worked on projects involving data scraping, machine learning, predictive modeling, sports analytics, etc. If you want to pursue this option, please email me a proposal (listing the team members) that describes the goal of your “choose your own project” before the first midterm.

The final deliverable for this project is a short report that conveys your analysis and conclusions. You should limit the report to five (or fewer!) pages of text. Well-commented code and supplementary figures can be included as an appendix.

Exams: Exams will be administered in person during class time (i.e., 8:30am–11:45am or 12:45pm–4:00pm). (Instructions for distance students will be sent before the exams. Please keep the evenings of the exam days free.) Exams will require application of course material to novel problems and/or data sets. Exams will be “open book” and “open notes”; students may also use the internet passively (i.e., you can search for existing information on the internet, but you cannot request help from others using the internet). Exams are independent work. You are not allowed to communicate about course or exam content with other students (or anyone other than me) during the exams. All questions about the exam should be directed to me. More details will be provided during class.

Class Schedule:

Week  Day            Tentative Topics
1     Tues. (7/5)    Course Introduction, Excel Lab
      Thurs. (7/7)   Introduction to R, Vectors, Functions, Indexing
2     Tues. (7/12)   Iteration (for Loops), Conditional Logic (if Statements)
                     Homework #1 Due at 11:59pm on Monday (7/11)
      Thurs. (7/14)  Tabular Data (data frames)
3     Tues. (7/19)   Chipotle Data in R
                     Homework #2 Due at 11:59pm on Monday (7/18)
      Thurs. (7/21)  Exam #1
4     Tues. (7/26)   Statistical Inference, t-tests, chi-squared test
      Thurs. (7/28)  Regression and Visualization
5     Tues. (8/2)    Regression and Visualization (part 2)
                     Homework #3 Due at 11:59pm on Monday (8/1)
      Thurs. (8/4)   Regression and Visualization (part 3)
                     Project Due at 11:59pm on Sunday (8/7)
6     Tues. (8/9)    Logistic Regression, k-means clustering
                     Homework #4 Due at 11:59pm on Monday (8/8)
      Thurs. (8/11)  Exam #2

University Policies:

Classroom Behavior

Both students and faculty are responsible for maintaining an appropriate learning environment in all instructional settings, whether in person, remote or online. Those who fail to adhere to such behavioral standards may be subject to discipline. Professional courtesy and sensitivity are especially important with respect to individuals and topics dealing with race, color, national origin, sex, pregnancy, age, disability, creed, religion, sexual orientation, gender identity, gender expression, veteran status, political affiliation or political philosophy. For more information, see the policies on classroom behavior and the Student Conduct & Conflict Resolution policies.

Requirements for COVID-19

As a matter of public health and safety, all members of the CU Boulder community and all visitors to campus must follow university, department and building requirements and all public health orders in place to reduce the risk of spreading infectious disease. CU Boulder currently requires COVID-19 vaccination and boosters for all faculty, staff and students. Students, faculty and staff must upload proof of vaccination and boosters or file for an exemption based on medical, ethical or moral grounds through the MyCUHealth portal.

The CU Boulder campus is currently mask-optional. However, if public health conditions change and masks are again required in classrooms, students who fail to adhere to masking requirements will be asked to leave class, and students who do not leave class when asked or who refuse to comply with these requirements will be referred to Student Conduct and Conflict Resolution. For more information, see the policy on classroom behavior and the Student Code of Conduct. If you require accommodation because a disability prevents you from fulfilling these safety measures, please follow the steps in the “Accommodation for Disabilities” statement on this syllabus.

If you feel ill and think you might have COVID-19, if you have tested positive for COVID-19, or if you are unvaccinated or partially vaccinated and have been in close contact with someone who has COVID-19, you should stay home and follow the further guidance of the Public Health Office ([email protected]). If you are fully vaccinated and have been in close contact with someone who has COVID-19, you do not need to stay home; rather, you should self-monitor for symptoms and follow the further guidance of the Public Health Office ([email protected]).