Stats 102A Summer 2021, Lecture 1 with Miles Chen


Logistics

The midterm exam is timed. You will have 2 hours to complete the exam.

You can take the midterm exam anytime between Friday, July 9 at 12 noon and Saturday, July 10 at 11:59 PM (California time), when the exam closes. Once you begin the exam, you must complete it within 2 hours. Make sure you start the exam before Saturday at 9:59 PM, or you will have less than two hours to complete it. I know the date and time are slightly different from what is listed in the syllabus, and I apologize for the change.

The midterm exam will be taken on Gradescope as an “online assignment.” The following video provides details on how Gradescope’s online assignments work.

https://www.youtube.com/watch?v=pgklq6JDatA&t=0s

The exam is open book and open note. You are allowed to use R. You may read documentation.

**You are not allowed to communicate with other students or people during the exam. You are not allowed to share questions from the exam with anyone else. Incidents of cheating will be reported.**

Exam content will cover material from Lecture 1-1, Lecture 1-2, Lecture 2-2, and Lecture 3-2. You should be familiar with the material needed to complete homework 1 to homework 3. (You do not need to have finished HW3 before the midterm.)


Topics:

There will be several questions showing you some code and asking you *what* will happen when we run the code and *why* R produces the output. You are allowed to use R to figure out what the code will output. If the question asks why, be sure to answer why.

For example, a question covering material from Lecture 1-1 could be:

> x <- c(NA, NULL)

> y <- c(NA, character(0))

What will the following code return? If they return different results, why?

> length(x)

> length(y)

Using the same values of x and y from above, what will the following code return? If they return different results, why?

> is.logical(x)

> is.logical(y)
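One way to work through a question like this is to inspect each object's length and type directly. The sketch below only illustrates that workflow and is not an official answer key. It relies on two base R behaviors: `NULL` and zero-length vectors contribute no elements to `c()`, and `c()` coerces its arguments to the most general type present.

```r
x <- c(NA, NULL)          # NULL contributes nothing; NA stays logical
y <- c(NA, character(0))  # character(0) adds no elements but forces character type

length(x)      # 1
length(y)      # 1

typeof(x)      # "logical"
typeof(y)      # "character"

is.logical(x)  # TRUE
is.logical(y)  # FALSE
```

The lengths agree because neither `NULL` nor `character(0)` adds an element. The types differ because `character(0)` still takes part in coercion, so the `NA` in `y` becomes `NA_character_`.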

There will be several questions asking you to write a few lines of code to perform a task. Questions will be simple enough that a solution can be written in six lines of code or fewer. This is not a restriction: you can write twenty lines of code if you need to. It is simply a description of the level of complexity of the problems.

For these questions, you will write your code in a textbox. You should write the code in RStudio first and, after you get it working, copy it over to the textbox. Make sure you copy only the code. When we grade your exam, we will copy the code from the textbox into R and run it to evaluate it. Make sure you do not include stray symbols like + or > or # that sometimes precede lines when you copy them. If your code produces errors, you will not get points for your submission.

Example problem: I might show you the head of a dataset and ask you to use tidyr and/or dplyr to reshape the data frame and provide summary statistics by group.
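As a rough sketch of what such a solution could look like, the code below reshapes a small data frame from wide to long and then summarizes by group. The `scores` data frame and all of its column names are made up for illustration; the actual exam data will differ.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide data: one row per student, one column per quiz
scores <- data.frame(
  student = c("a", "b", "c", "d"),
  section = c("1A", "1A", "1B", "1B"),
  quiz1   = c(8, 6, 9, 7),
  quiz2   = c(7, 9, 10, 5)
)

summary_by_group <- scores %>%
  # wide to long: one row per student-quiz combination
  pivot_longer(cols = c(quiz1, quiz2), names_to = "quiz", values_to = "score") %>%
  # one summary row per section-quiz combination
  group_by(section, quiz) %>%
  summarise(mean_score = mean(score), n = n(), .groups = "drop")

summary_by_group
```

Note that the pipeline itself is six lines, which matches the level of complexity described above.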

Another example: I might provide the URL of a website and ask you for the code needed to extract certain items.
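A minimal sketch of this kind of task, assuming the rvest package (version 1.0.0 or later) is what you would use. To keep the example self-contained, it parses a made-up HTML string instead of a live page; with a real site you would pass the URL to `read_html()` instead. The page structure and CSS selector are hypothetical.

```r
library(rvest)

# Hypothetical page contents; with a real site, use read_html("<the given URL>")
page <- read_html('
  <html><body>
    <h1>Course list</h1>
    <ul>
      <li class="course"><a href="/stats102a">Stats 102A</a></li>
      <li class="course"><a href="/stats102b">Stats 102B</a></li>
    </ul>
  </body></html>
')

# Select every link inside a list item with class "course"
links <- page %>% html_elements("li.course a")

course_names <- html_text2(links)         # visible link text
course_paths <- html_attr(links, "href")  # value of each href attribute
```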

Another example: I could provide a series of text strings and ask you to write a line of code that uses regular expressions to extract a particular portion or pattern.
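For instance, the following sketch pulls a date in YYYY-MM-DD form out of each string using base R. The input strings and the pattern are invented for illustration; an exam question would supply its own.

```r
# Hypothetical input strings
x <- c("Order 1234 shipped on 2021-07-09",
       "Order 88 shipped on 2021-07-10",
       "No date recorded")

# Four digits, a hyphen, two digits, a hyphen, two digits
pattern <- "[0-9]{4}-[0-9]{2}-[0-9]{2}"

# regexpr() locates the first match in each string; regmatches() extracts it.
# Strings without a match are dropped from the result.
dates <- regmatches(x, regexpr(pattern, x))
dates
```

Here `dates` has two elements because the third string contains no match. If you need one result per input string, `stringr::str_extract(x, pattern)` returns `NA` for non-matching strings instead of dropping them.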

Another example: I could ask you to write a vectorized function that will accept a vector of values and can calculate the corresponding outputs from a piece-wise function.
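As one possible approach, nested `ifelse()` calls evaluate a piecewise function over an entire vector at once. The function below is a made-up example: f(x) = -x for x < 0, x^2 for 0 <= x <= 2, and 4 for x > 2.

```r
# Hypothetical piecewise function, vectorized with ifelse()
f <- function(x) {
  ifelse(x < 0, -x,               # piece 1: x < 0
         ifelse(x <= 2, x^2, 4))  # piece 2: 0 <= x <= 2; piece 3: x > 2
}

vals <- f(c(-3, 0, 1.5, 2, 10))
vals  # 3, 0, 2.25, 4, 4
```

Because `ifelse()` works element by element, `f()` returns a vector of the same length as its input without an explicit loop.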

These are just some ideas of problems that I could put on the exam.