
Lecture 1: Stats 217

Summer 2022

Section 1 serves as a recapitulation of basic facts from probability (at the level of Stats 116). It is by no means exhaustive; the main purpose is to remind you of what you have seen before so that you can go back and review it. Courses in probability (at the level of Stats 116), calculus (at the level of Math 19 and Math 20) and linear algebra (at the level of Math 51) are prerequisites.

1    Recap of probability concepts

1.1    Random variables

We start with a probability triple (Ω, F, P), where Ω is the sample space, F is a collection of events and P is a probability measure. A random variable X is a mapping X : Ω → R with cumulative distribution function (cdf) F_X : R → [0, 1] defined by F_X(x) = P(X ≤ x).

Random variables that take values in a countable set are called discrete random variables; examples include the Binomial, Poisson, Geometric, etc. We describe them by specifying their probability mass function (pmf) p_X(x) = P(X = x). The cdf of a discrete random variable is generally discontinuous. Random variables whose cdf is continuous are called continuous random variables.
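As a quick numerical illustration of a pmf and cdf for a discrete random variable, here is a minimal Python sketch; the Binomial(10, 0.3) parameters are an illustrative choice and scipy is assumed to be available.

from scipy.stats import binom

n, p = 10, 0.3                 # Binomial(10, 0.3); parameters are an illustrative choice
print(binom.pmf(4, n, p))      # p_X(4) = P(X = 4)
print(binom.cdf(4, n, p))      # F_X(4) = P(X <= 4)
print(sum(binom.pmf(k, n, p) for k in range(5)))   # the cdf is the running sum of the pmf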

Continuous random variables we will encounter in this course will have a probability density function (pdf) f_X(x) = (d/dx) F_X(x).
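A small numerical sanity check of this derivative relation, as a Python sketch assuming scipy; the standard normal and the step size are illustrative choices.

from scipy.stats import norm

x, h = 1.0, 1e-5
# finite-difference derivative of the cdf of a standard normal at x
numerical_pdf = (norm.cdf(x + h) - norm.cdf(x - h)) / (2 * h)
print(numerical_pdf, norm.pdf(x))   # the two values agree closely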

1.2    Joint distributions

A collection of events {A_i}_{i∈I} (here I denotes any indexing set) is a collection of mutually independent events if for any finite collection A_{i_1}, ..., A_{i_k} (with i_1, ..., i_k all distinct), we have

P(⋂_{r=1}^k A_{i_r}) = ∏_{r=1}^k P(A_{i_r}).
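A Monte Carlo sketch of this product rule for three independent fair-coin flips (illustrative only; numpy is assumed):

import numpy as np

rng = np.random.default_rng(0)
heads = rng.random((100_000, 3)) < 0.5        # three independent fair-coin flips per row
lhs = np.mean(heads[:, 0] & heads[:, 1] & heads[:, 2])    # estimate of P(A_1 ∩ A_2 ∩ A_3)
rhs = heads[:, 0].mean() * heads[:, 1].mean() * heads[:, 2].mean()
print(lhs, rhs)                               # both are close to 1/8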

For a finite collection of random variables X_1, ..., X_k, we describe their inter-dependence by specifying their joint cdf F_{X_1,...,X_k}(x_1, ..., x_k) := P(X_1 ≤ x_1, ..., X_k ≤ x_k). Equivalently, we can also specify their joint pmf p_{X_1,...,X_k}(x_1, ..., x_k) = P(X_1 = x_1, ..., X_k = x_k) or joint pdf

f_{X_1,...,X_k}(x_1, ..., x_k) = ∂^k/(∂x_1 ⋯ ∂x_k) F_{X_1,...,X_k}(x_1, ..., x_k).

Random variables X_1, ..., X_k are called mutually independent if their joint cdf splits into the product of the individual cdfs: F_{X_1,...,X_k}(x_1, ..., x_k) = ∏_{i=1}^k F_{X_i}(x_i). For discrete random variables, this is equivalent to splitting of the joint pmf: p_{X_1,...,X_k}(x_1, ..., x_k) = ∏_{i=1}^k p_{X_i}(x_i), and for continuous random variables, independence is equivalent to splitting of the joint pdf: f_{X_1,...,X_k}(x_1, ..., x_k) = ∏_{i=1}^k f_{X_i}(x_i).
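A minimal simulation sketch of the pmf factorization for two independent discrete random variables; the two fair dice are an illustrative choice and numpy is assumed.

import numpy as np

rng = np.random.default_rng(1)
x = rng.integers(1, 7, size=200_000)          # two independent fair dice
y = rng.integers(1, 7, size=200_000)
joint = np.mean((x == 3) & (y == 5))          # estimate of p_{X,Y}(3, 5)
product = np.mean(x == 3) * np.mean(y == 5)   # p_X(3) * p_Y(5)
print(joint, product)                         # both are close to 1/36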

A collection of random variables {X_i}_{i∈I} is a collection of mutually independent random variables if any finite sub-collection is mutually independent.

1.3    Conditional distributions

For events A and B with P(B) > 0, the conditional probability P(A|B) := P(A ∩ B)/P(B). Using the law of total probability and the definition of conditional probability, we obtain Bayes' rule:

P(B|A) = P(A|B)P(B) / P(A), where P(A) = P(A|B)P(B) + P(A|B^c)P(B^c) by the law of total probability.
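A small worked example of Bayes' rule combined with the law of total probability; the probabilities below are illustrative numbers, not from the course.

p_B = 0.01                       # P(B); illustrative numbers
p_A_given_B = 0.95               # P(A | B)
p_A_given_Bc = 0.05              # P(A | B^c)
p_A = p_A_given_B * p_B + p_A_given_Bc * (1 - p_B)   # law of total probability
print(p_A_given_B * p_B / p_A)   # P(B | A) ≈ 0.161 by Bayes' rule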

Take a collection of random variables X_1, ..., X_k, with joint pmf p_{X_1,...,X_k} or joint pdf f_{X_1,...,X_k}. Fix sets I, J ⊂ {1, ..., k}. Then the conditional distribution of {X_i}_{i∈I} given {X_j}_{j∈J} is specified by the conditional pmf p_{X_I | X_J}(x_I | x_J) := p_{X_I, X_J}(x_I, x_J) / p_{X_J}(x_J) in the discrete case (with p replaced by f in the continuous case). Here, we abuse notation a bit by identifying X_I = {X_i}_{i∈I} and x_I = {x_i}_{i∈I}.
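A minimal sketch of conditioning in the discrete case, starting from a small hypothetical joint pmf table and dividing by the marginal (numpy assumed):

import numpy as np

# a hypothetical joint pmf of (X_1, X_2): rows index x_1 in {0, 1}, columns x_2 in {0, 1, 2}
joint = np.array([[0.10, 0.20, 0.10],
                  [0.15, 0.25, 0.20]])
p_x2 = joint.sum(axis=0)          # marginal pmf of X_2
cond = joint / p_x2               # p_{X_1 | X_2}(x_1 | x_2), one column per value of x_2
print(cond[:, 1])                 # conditional pmf of X_1 given X_2 = 1
print(cond.sum(axis=0))           # each column sums to 1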

Note: Independent random variables with the same distribution will be referred to as "i.i.d." (independent and identically distributed).

1.4    Expectation

For a random variable X, the expected value E(X) is defined as ∑_x x p_X(x) in the discrete case and ∫ x f_X(x) dx in the continuous case. The sum and integral are taken over the possible values X can take.
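Two minimal sketches of these definitions, one discrete sum and one integral; the Binomial and Exponential choices are illustrative and scipy is assumed.

import numpy as np
from scipy.stats import binom
from scipy.integrate import quad

# discrete: E(X) for X ~ Binomial(10, 0.3), a sum over the support {0, ..., 10}
n, p = 10, 0.3
print(sum(x * binom.pmf(x, n, p) for x in range(n + 1)))        # ≈ 3.0 = n p

# continuous: E(X) for X with pdf f(x) = 2 exp(-2 x) on [0, ∞) (Exponential with rate 2)
ex, _ = quad(lambda x: x * 2 * np.exp(-2 * x), 0, np.inf)
print(ex)                                                       # ≈ 0.5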

The law of the unconscious statistician states that for any random variable X and any function g, E(g(X)) = ∑_x g(x) p_X(x) in the discrete case and ∫ g(x) f_X(x) dx in the continuous case. In particular, this helps us compute E(X^2), which in turn enables us to compute Var(X) := E(X^2) − (E(X))^2.
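A short sketch applying LOTUS with g(x) = x^2 and then the variance identity; Binomial(10, 0.3) is again an illustrative choice and scipy is assumed.

from scipy.stats import binom

n, p = 10, 0.3                   # X ~ Binomial(10, 0.3), an illustrative choice
pmf = [binom.pmf(x, n, p) for x in range(n + 1)]
ex = sum(x * w for x, w in zip(range(n + 1), pmf))        # E(X)
ex2 = sum(x**2 * w for x, w in zip(range(n + 1), pmf))    # E(X^2) via LOTUS with g(x) = x^2
print(ex2 - ex**2)               # Var(X) ≈ 2.1 = n p (1 - p)
print(binom.var(n, p))           # scipy's built-in variance agrees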