STA302/STA1001: Methods of Data Analysis I Summer 2021



Nnenna Asidianya

Course Description

This course covers theory and applications of regression analysis. We will develop the theory of regression models and study how to analyse data when such models are appropriate. Topics to be covered include: simple and multiple regression models using least squares, analysis of variance, inference for regression parameters when the errors are normally distributed, confidence and prediction intervals, geometry of least squares, multicollinearity, regression models for quantitative and qualitative predictors, model selection and validation, and diagnostics.


Pre-requisites: If you do not have the equivalent pre-requisites, you will be un-enrolled from the course. Students should have a second-year statistics course, such as STA238, STA248, STA255, or STA261; a computer science course, such as CSC108, CSC120, CSC121, or CSC148; and a mathematics course, such as MAT221(70%), MAT223, or MAT240, or equivalent preparation as determined by the department. Students who have questions about course prerequisites should contact our undergraduate office at [email protected].

Email Policy

Important announcements, lecture notes, additional material, and other course info will be posted on Quercus. Check it regularly. You are responsible for keeping up with announcements from instructors on Quercus and via e-mail.

To ensure your email gets to me, please ensure the following:

• Use your academic email, for example [email protected]

• Use the following format in the subject of your email: CourseName/LASTNAME. For example: STA302S/ASIDIANYA

• Be very clear and concise.

Required Materials

• This course requires the following textbooks:

1. Kutner, M. H., Nachtsheim, C. J., and Neter, J. Applied Linear Regression Models. McGraw-Hill, 5th edition.

2. Sheather, S.J. A Modern Approach to Regression with R. (Springer).

• We will be using RStudio for performing statistical analyses. R is free software that can either be downloaded onto your personal computer or used in the cloud. If you choose to work with R on your personal computer, then installation will be a two-step process:

1. The base R framework is available for download at R framework

2. RStudio is a good integrated development environment for R (it makes it simpler to work in R) and can also be downloaded for free at RStudio.

Students who are unable to download R may be able to use the following local cloud: Jupyter. You will need to use your UTORID and password.

Course Components

Class Lectures: Two three-hour classes a week will be used to cover important course materials. It is important that you attend these classes in order to keep up with the topics, and gain a deeper understanding of the applications of statistics in social sciences.

Office Hours: We will hold office hours through Bb Collaborate in the Quercus course page. The office hour schedule will be posted on Quercus. It is recommended that you visit office hours whenever you have a question about the material. This is very important since this is an online accelerated class, so the onus is on the student to have material clarified as quickly as possible. Don’t wait until the last minute to ask your questions.

Course Assessment

Assignments: Over the course of the semester students will be given a mini assignment to work on that is related to the course material up until that point. This may include both an R component and a written component. More details, such as the content and deadline, will be communicated later. No late report will be accepted.

Undergraduate students will be evaluated in the following way:

Graduate students will be evaluated in the following way:

There will be one hour at the end of each Monday lecture period to submit your online quiz via Quercus. It will be based on the material from the previous week. No late submissions will be accepted.

Weekly Quizzes

There will be 5 “weekly” online quizzes that will be open during the last hour of each Wednesday lecture. Quizzes will begin on July 14th and continue until the last lecture period.

• Because only the best 4 out of 5 quiz marks will be counted, there will not be any accommodations for missed quizzes. A missed quiz will receive a mark of 0, but will be dropped as the lowest quiz mark. Therefore, you may miss one quiz without penalty.

• Each of the 4 best quizzes will be worth 10% of the overall course grade (slightly less for graduate students).

• The quizzes will be multiple choice and cover material from the previous set of lectures. You may wish to have a calculator available at this time to aid in any calculations.

• Quizzes can be found under Quercus Quizzes in the navigation bar. Quizzes must be done individually.

• Missed quiz: Because only the best 4 out of 5 quiz marks will be counted, a missed quiz receives a mark of 0 and is dropped as the lowest quiz mark. You may therefore miss one quiz, and only one, without penalty.

There are no make-up quizzes. Any missed quiz beyond the one that is dropped will receive a mark of zero.
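The best-4-of-5 rule can be illustrated with a short calculation. This is only a sketch in Python, assuming quiz marks are recorded as percentages and using the undergraduate weight of 10% per counted quiz. The function name `quiz_contribution` and the sample marks are hypothetical, and the graduate weight (stated only as slightly less) is not modelled.

```python
def quiz_contribution(marks, keep=4, weight_each=10.0):
    """Return the course-grade points earned from quizzes.

    marks: quiz marks as percentages (0-100); a missed quiz is entered as 0.
    keep: number of best quizzes counted (4 in this syllabus).
    weight_each: grade weight per counted quiz (assumed 10, undergraduate).
    """
    best = sorted(marks, reverse=True)[:keep]  # drop the lowest mark(s)
    return sum(best) * weight_each / 100.0


# Hypothetical student who missed the second quiz.
marks = [80, 0, 90, 70, 100]
print(quiz_contribution(marks))  # 34.0 out of a possible 40
```

With two missed quizzes, only one zero is dropped, so the second zero counts toward the total.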

Mini Assignment

You will be given one mini assignment in the term. The purpose of this mini assignment is to develop skills that will be useful for the final project at the end of the term. The mini assignment will have a heavy focus on the use of statistical software (R), and will involve applying the theoretical and applied methods discussed in class.

Final Project

The course project is broken down into two steps over the duration of the course.

• Step 1: Starting from week 1 (July 5th), and for the next four weeks, you will track and report the number of hours you spend studying for STA302 to contribute to the project dataset. (Due: 11 PM EST each Sunday)

• Step 1 continued: You will also track and record the number of hours you spend thinking about COVID-19.

• Step 2: At the end of the term you will take the average of the four weeks and record this on a Google Form that will be distributed.

To do this, I will release a Quercus Quiz called "Week n Data", where you will record the number of hours you spent on each task in Step 1.
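The Step 2 average can be sketched as follows. This is a minimal illustration in Python, assuming one reported value per week for each task. The function name `four_week_average` and the hours shown are hypothetical, not real data.

```python
def four_week_average(weekly_hours):
    """Average the weekly hours reported over the four tracking weeks."""
    if len(weekly_hours) != 4:
        raise ValueError("expected exactly four weekly values")
    return sum(weekly_hours) / len(weekly_hours)


# Hypothetical weekly reports for the two tracked tasks.
study_hours = [6, 8, 5, 9]   # hours spent studying for STA302
covid_hours = [2, 1, 3, 2]   # hours spent thinking about COVID-19

print(four_week_average(study_hours))  # 7.0
print(four_week_average(covid_hours))  # 2.0
```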

Students will be required to demonstrate their understanding of the methods taught in lecture by developing a reasonable regression model using the techniques taught in class. The students will be responsible for choosing the correct methods to apply and providing appropriate justifications where necessary. This is a formal report and therefore it must contain the following sections:

• Introduction section: provides details regarding the question you wish to address, why the model is being developed, how you intend to go about developing the model, and finally how the model meets the purpose mentioned earlier.

• Exploratory data analysis section: a detailed description of the variables in the data set with appropriate tables or figures that highlight certain characteristics of your variables that you deem important to mention.

• Model development section: a detailed discussion of the process used to come to the final model. Justifications may be both statistical and empirical in nature. You should also include in-depth diagnostics to illustrate the ‘goodness’ of the model.

• Conclusion section: restate why the model is useful in the context of the data, provide an interpretation of the final model in non-technical language (i.e., explain how the variables work, discuss predictions), and discuss any limitations/problems remaining with the model and how they might impact its use in the real world.

The timelines for the project are as follows:

Missed Assessment Policy

Students are responsible for completing all of the assessments detailed in the previous section. If a student is sick and needs to request an extension or accommodation on a mini project, they must send an email to their instructor. The following rules apply:

• There are no accommodations for missed quizzes other than the flexibility already built into the grading scheme (i.e., the best 4 of 5 quizzes).

• For a missed mini assignment, notification about an extension must be received at least 48 hours before the mini project or term test is due.

• For an extension pertaining to the final project, notification of the accommodation or extension must be provided within 48 hours. You may receive an extension of up to 72 hours.

In order for the request to be considered, the email:

• the subject line must be written in the format shown in Email Policy

• must include your full name and student number in the body of the email

• must specify for which project the extension/accommodation is being requested

• must include the following sentence:

– “I affirm that I am experiencing an illness or personal emergency and I understand that to falsely claim so is an offence under the Code of Behaviour on Academic Matters.”

Remark Policy

Any requests to have marked work re-evaluated must be made in writing to the email: [email protected] within one week of the date the work was returned to the class. The request must contain a valid justification for consideration. You are responsible for checking that your scores are entered correctly on Quercus. Any requests concerning a mark that was not entered correctly in Quercus must be made in writing within one week of the date the mark was entered in Quercus.

“Note that your entire assessment may be remarked and your assessment grade may remain the same, go up, or go down.”

Intellectual Property

Course materials provided on Quercus, such as lecture slides, assignments, tests and solutions are the intellectual property of your instructor and are for the use of students currently enrolled in this course only.

Providing course materials to any person or company outside of the course is unauthorized use. This includes providing materials to predatory tutoring companies, or to friends who are not officially enrolled in this course this term.

Accessibility Statement

Students with diverse learning styles and needs are welcome in this course. The University of Toronto offers academic accommodations for students with disabilities. If you require accommodations, or have any accessibility concerns about the course, the classroom, or course materials, please contact Accessibility Services as soon as possible: [email protected] or http://accessibility.utoronto.ca

Academic Integrity Statement

Academic integrity is essential to the pursuit of learning and scholarship in a university, and to ensuring that a degree from the University of Toronto is a strong signal of each student’s individual academic achievement. As a result, the University treats cases of cheating and plagiarism very seriously. The University of Toronto’s Code of Behaviour on Academic Matters (http://www.governingcouncil.utoronto.ca/policies/behaveac.htm) outlines the behaviours that constitute academic dishonesty and the processes for addressing academic offences. Potential offences include, but are not limited to:

IN PAPERS AND ASSIGNMENTS: Using someone else’s ideas or words without appropriate acknowledgement. Submitting your own work in more than one course without the permission of the instructor. Making up sources or facts. Obtaining or providing unauthorized assistance on any assignment.

ON TESTS AND EXAMS: Using or possessing unauthorized aids. Sharing, posting, or discussing questions or answers with anyone in or outside the course. Misrepresenting your identity.

IN ACADEMIC WORK: Falsifying institutional documents or grades. Falsifying or altering any documentation required by the University, including (but not limited to) doctor’s notes.

All suspected cases of academic dishonesty will be investigated following procedures outlined in the Code of Behaviour on Academic Matters. If you have questions or concerns about what constitutes appropriate academic behaviour or appropriate research and citation methods, you are expected to seek out additional information on academic integrity from your instructor or from other institutional resources (see http://academicintegrity.utoronto.ca/).