Statistics 101A

Introduction to Regression Analysis Using R

DEPARTMENT OF STATISTICS

COURSE SYLLABUS (TENTATIVE)


Course: STAT 101A: INTRODUCTION TO DATA ANALYSIS AND REGRESSION

Lecture Meeting: Lecture 1: MWF 11:00 – 11:50 AM, through Zoom

Quarter: Spring 2021

Professor: Akram Almohalwas

Office: (Mathematical Science Bldg.) MS 8919

E-mail: [email protected] (best way to get in touch with me)

Phone: 310-794-4004 (Office)

Office Hours: MF 10:00 AM - 10:50 AM; other times by appointment if necessary.

Textbook: "A Modern Approach to Regression with R", S. J. Sheather (2009).

TA: FAN, LIFENG, [email protected]


Course Description:

Lecture, three hours; discussion, one hour. Requisites: course 10 or 12 or 13 or Economics 41 or score of 4 or higher on Advanced Placement Statistics Examination, and course 20. Recommended: course 102A. Applied regression analysis, with emphasis on general linear model (e.g., multiple regression) and generalized linear model (e.g., logistic regression). Special attention to modern extensions of regression, including regression diagnostics, graphical procedures, and bootstrapping for statistical influence. P/NP or letter grading.


Course overview:

This class is an introduction to statistical modeling with a (very strong) emphasis on regression. Regression is perhaps the most commonly used statistical tool in different disciplines including economics, business, medicine, social science, psychology, and education. In this course we will teach statistical modeling, including graphical and descriptive techniques, model fitting, and model evaluation. I expect that we will be covering seven to eight chapters of the assigned book.


Course objectives:

● Introducing you to the concepts, strategies, and mathematical underpinnings that help to clarify the understanding of regression, model fitting, and model evaluation.

● Using R for conducting regression analysis with emphasis on analyzing and interpreting the relevant outcomes.

● Written and oral communication of statistical findings to statistical and non-statistical audiences.

● Providing you with examples and sometimes refereed journal articles that use regression techniques, thus helping you develop a better understanding of how statistics is used in scientific research.


How we will achieve the above goals:

We plan to achieve the above goals by introducing you to some statistical knowledge, teaching you how to use R-Studio (an environment for the statistical language R) to perform data analysis, and having you engage in problem solving, application, data cleansing, and synthesis of statistical information through online and in-class quizzes, homework, article reviews (maybe), and in-class exams.


Statistical Software:

I will be using R-Studio to demonstrate data analysis, interpretation of the data, and writing the results within context. We are going to use R-Studio for our assignments.


Statistics Background:

We will spend the first week reviewing simple linear regression from Stat 10, confidence intervals and hypothesis tests, and covering the basics of R. Although these are not complex methods, the conceptual underpinnings are rather subtle, and you are expected to be familiar with these concepts. (Review your Stat 10 and Stat 20 notes.)


Math Background:

Familiarity with linear and matrix algebra will be very helpful. The first couple of chapters of the textbook provide a pretty good sense of the level of math required.


COURSE POLICIES:

Please remember to turn off your cell phones. You are expected to adhere to the honor code of conduct. Late submissions are not going to be accepted under any circumstances.


Course Management System

All course materials, announcements, and assignments will be posted on CCLE.

Please log on to http://ccle.ucla.edu using your Bruin logon ID.

You should print a copy of the relevant material for each week, including lecture notes, articles, etc., and bring it to lecture with you.


General Methodology Used in Conducting the Course:

Lectures, homework, quizzes and problem solving, a midterm, “articles or a project”, data analysis in R-Studio, and a final exam.


Important Dates:

Midterm: Week 5: Friday, April 30th, 2021. (in-class during zoom lecture)

Final Examination: Week 11: Wednesday, June 9, 2021, 3:00 PM - 6:00 PM


The final exam date cannot be changed and I cannot schedule other times.

Holidays (no class meeting)

Memorial Day    Monday, May 31

Instruction ends    Friday, June 4th 2021.


References and Resources:

● Assigned book: Sheather, S. J. (2009). A Modern Approach to Regression with R. Springer Texts in Statistics. ISBN: 978-0-387-09608-7

● You do not need to purchase the book; you can access it online through: http://www.springerlink.com/content/978-0-387-09607-0

● You can see the R commands used in the different chapters by going to the author's page: http://gatton.uky.edu/faculty-research/faculty/sheather-simon

● The direct links for the data sets and R commands should then be:

● http://gattonweb.uky.edu/sheather/book/data_sets.php

● http://gattonweb.uky.edu/sheather/book/r_code.php


R-Studio References:

● http://math.illinoisstate.edu/dhkim/rstuff/rtutor.html

● http://www.gardenersown.co.uk/Education/Lectures/R/graphs.htm

● https://www.datacamp.com/courses/free-introduction-to-r


Downloading R-Studio:

http://rstudio.org/


Attendance: Students are expected to attend all scheduled classes. It is the student's responsibility to find out what was discussed in a missed class; I'm not your mailman, and I don't deliver concepts to your homes. Attendance will be used to determine grades in borderline cases.


Incomplete: To receive an incomplete grade, you must have completed work worth at least 75% of the total grade with at least a C average.


Grading Scheme:

Quizzes (online quizzes through CCLE)    3%

Online quizzes on CCLE and unannounced in-class quizzes help you keep up with the material. The times of the online quizzes will be posted on CCLE. The objective of the quizzes is to give you a chance to do your own learning, construct your own knowledge, and stay on top of things. To reach these objectives, you should do these quizzes on your own and revisit the concepts underlying the items that you get wrong.

● If you forget or fail to take a quiz at the specified time, it cannot be made up and it will be counted as zero. Please do not email me about missing a quiz.


Kaggle/CCLE project    12%

A simple, friendly individual competition in which you apply the statistical concepts learned throughout the course to real-world data and submit your predictions on Kaggle or CCLE. A score is calculated based on your MSE, adjusted R2, or AIC. Students’ grades are based on the ranking of the scores from best to worst. All scores are compared to a baseline set by the professor's or TA's submission. Students are not allowed to use techniques or packages that are not discussed in class or that are not directly related to the course content.

The project grade has three equal parts:

● One third is based on the Kaggle competition scores (TBD).

● One third is based on the simplicity of the final model (k, the number of betas; k value TBD).

● One third is based on the final paper write-up.
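
The three candidate scores named in the project description (MSE, adjusted R2, and AIC) can be sketched as follows. This is only an illustration, written in Python for brevity rather than in R, and it is not the official scoring code. The function names are hypothetical, k is taken to be the number of predictors excluding the intercept, and the AIC uses the Gaussian form n log(RSS/n) + 2(k + 1) with additive constants dropped. That form is an assumption, since the definition used for the competition is not stated here.

```python
import math


def rss(y, y_hat):
    """Residual sum of squares between observed and predicted values."""
    return sum((obs - pred) ** 2 for obs, pred in zip(y, y_hat))


def mse(y, y_hat):
    """Mean squared error: RSS divided by the number of observations."""
    return rss(y, y_hat) / len(y)


def adjusted_r2(y, y_hat, k):
    """Adjusted R-squared for a model with k predictors plus an intercept."""
    n = len(y)
    y_bar = sum(y) / n
    tss = sum((obs - y_bar) ** 2 for obs in y)  # total sum of squares
    return 1 - (rss(y, y_hat) / (n - k - 1)) / (tss / (n - 1))


def aic(y, y_hat, k):
    """Gaussian AIC up to an additive constant (assumed form, see lead-in)."""
    n = len(y)
    return n * math.log(rss(y, y_hat) / n) + 2 * (k + 1)


# Made-up toy data, for illustration only.
y = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.1, 1.9, 3.2, 3.8]
print(mse(y, y_hat), adjusted_r2(y, y_hat, k=1), aic(y, y_hat, k=1))
```

Lower MSE and AIC indicate a better score, while a higher adjusted R2 is better. Both adjusted R2 and AIC penalize extra predictors, which fits the weight the project places on a simple final model.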


Homework    18%

● A total of six assignments throughout this quarter. Most of these assignments are questions from the assigned textbook and the rest are from assigned articles or related subjects studied in lecture. You are encouraged to discuss the homework with your classmates. However, you are required to do your coding and write-up independently and turn in an electronic copy in PDF format on CCLE using R Markdown. Your homework must include your name, your student ID, lecture number, and section number.

● Late homework is not accepted. Do not email me about missing a submission deadline.

● The major concepts discussed in the articles will be a part of your midterm and final examination.


Midterm    30%

Our midterm exam will be administered in the middle of the quarter as listed above. You should expect questions from the lectures, articles, and overall questions about the project. You are allowed to create your own cheat sheet, one 8.5 by 11 inch (letter-size, or A4) page, written or typed on both sides. You need to have your ID to take the test. You cannot have access to a cell phone or computer during the test.


Final    37%

The final exam is cumulative. On the final exam you should expect questions from the lectures, articles, and overall questions about the project. You need to have your ID to take the test. You cannot have access to a cell phone, computer, or the Internet. The midterm and final will include material discussed in the book, lecture, and possibly articles. The midterm and final exams are NOT BASED ON GROUP WORK, and each student needs to turn in their own paper.


Letter Grades: The final letter grades will be based on the following:

A : 93% or higher
A- : 90% to 92.99%
B+ : 87% to 89.99%
B : 83% to 86.99%
B- : 80% to 82.99%
C+ : 77% to 79.99%
C : 73% to 76.99%
C- : 70% to 72.99%
D+ : 67% to 69.99%
D : 63% to 66.99%
D- : 60% to 62.99%
F : 59.99% or lower
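
As a rough illustration of how the grading weights and the letter-grade cutoffs above combine, here is a minimal sketch in Python. It assumes every component is scored on a 0 to 100 scale and that no rounding is applied before the cutoffs. The names WEIGHTS, CUTOFFS, course_percentage, and letter_grade are hypothetical, and the sketch does not model the use of attendance in borderline cases.

```python
# Component weights in percentage points, taken from the grading scheme.
WEIGHTS = {
    "quizzes": 3,
    "project": 12,
    "homework": 18,
    "midterm": 30,
    "final": 37,
}

# Lower bounds for each letter grade, highest first.
CUTOFFS = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]


def course_percentage(scores):
    """Weighted average of component scores (each on a 0-100 scale)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(percentage):
    """Map a course percentage to a letter using the cutoffs above."""
    for lower_bound, letter in CUTOFFS:
        if percentage >= lower_bound:
            return letter
    return "F"


# Made-up example scores, for illustration only.
example = {"quizzes": 100, "project": 80, "homework": 90,
           "midterm": 85, "final": 88}
pct = course_percentage(example)
print(pct, letter_grade(pct))
```

The weights sum to 100, so the midterm and final together account for 67% of the course grade.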


Academic Integrity:

From http://www.deanofstudents.ucla.edu/Academic-Integrity

“With its status as a world-class research institution, it is critical that the University uphold the highest standards of integrity both inside and outside the classroom. As a student and member of the UCLA community, you are expected to demonstrate integrity in all of your academic endeavors. Accordingly, when accusations of academic dishonesty occur, The Office of the Dean of Students is charged with investigating and adjudicating suspected violations. Academic dishonesty, includes, but is not limited to, cheating, fabrication, plagiarism, multiple submissions or facilitating academic misconduct.”

NOTICE: If you have a disability which will make it difficult for you to carry out the work for this course or if you anticipate a need for special assistance or accommodation due to a disability, please contact the Office of Services for Students with Disabilities as soon as possible. Efforts will be made to arrange appropriate and/or suitable accommodation.