DSC 212 — Probability and Statistics for Data Science
Lecture 3
January 17, 2023
Example 1 (Poisson distribution). For a parameter λ ∈ (0, ∞), a Poisson(λ) distributed random variable X has the PMF,
P_X({X = k}) = e^{−λ} λ^k / k!,   k ∈ {0, 1, 2, . . .}
Notice that the PMF satisfies,
∑_{k=0}^{∞} P_X({X = k}) = e^{−λ} ∑_{k=0}^{∞} λ^k / k! = e^{−λ} · e^{λ} = 1.
Exercise 1. Calculate the expected value of the Poisson distribution.
Solution: Observe that,
E(X) = ∑_{k=0}^{∞} k · P_X({X = k}) = ∑_{k=1}^{∞} k · e^{−λ} λ^k / k! = λ e^{−λ} ∑_{k=1}^{∞} λ^{k−1} / (k−1)! = λ e^{−λ} · e^{λ} = λ,
which proves the claim.
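As a quick numerical check, here is a minimal Python sketch (assuming numpy and scipy are installed; λ = 3.5 and the truncation point k < 200 are arbitrary choices) that verifies both the normalization and E(X) = λ:

import numpy as np
from scipy.stats import poisson

lam = 3.5
ks = np.arange(0, 200)            # truncate the infinite support; the tail beyond k = 200 is negligible
pmf = poisson.pmf(ks, mu=lam)     # e^{-lam} * lam^k / k!

print(pmf.sum())                  # ~ 1.0, the normalization
print((ks * pmf).sum())           # ~ 3.5, the expected value lam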
3.1 Continuous Random Variables
The CDF of a continuous random variable is defined as
P_X({X ≤ u}) = F_X(u)
A random variable is continuous if there exists a function f_X such that:

F_X(u) = ∫_{−∞}^{u} f_X(t) dt
Here f_X is called the probability density function (PDF) of X. Notice that f_X satisfies,

f_X(t) = ∂F_X(t)/∂t,   and   ∫_{−∞}^{∞} f_X(t) dt = 1   (normalization)
Example 2 (Exponential distribution). The PDF of an Exp(λ) random variable X is given by,
f_X(t) = λ e^{−λt} for t ≥ 0, and f_X(t) = 0 for t < 0, where λ ∈ (0, ∞).   (3.1)
Observe that
∫_{−∞}^{∞} f_X(t) dt = ∫_{0}^{∞} λ e^{−λt} dt = λ · [−e^{−λt}/λ]_{0}^{∞} = 1,

which verifies the normalization condition of f_X. If instead we plug in u as the upper limit, we get the CDF of the exponential random variable,
F_X(u) = 1 − e^{−λu},   u ≥ 0.
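A minimal sketch comparing this closed form against scipy (an assumption here is the library's parametrization: scipy.stats.expon takes scale = 1/λ; the values λ = 2 and u = 1.3 are arbitrary):

import numpy as np
from scipy.stats import expon

lam, u = 2.0, 1.3
print(expon.cdf(u, scale=1/lam))   # library CDF; scipy uses scale = 1/lambda
print(1 - np.exp(-lam * u))        # closed form 1 - e^{-lam u}; the two agree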
Example 3 (Uniform distribution). The PDF f_X for a Uniform[a, b] random variable is,

f_X(t) = 1/(b − a) for a ≤ t ≤ b, and f_X(t) = 0 otherwise,
where a < b.
We obtain the CDF by integrating the PDF,
F_X(u) = 0 for u < a,   F_X(u) = (u − a)/(b − a) for a ≤ u ≤ b,   and F_X(u) = 1 for u > b.
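A similar sketch for the uniform CDF (an assumption here is scipy's parametrization: scipy.stats.uniform takes loc = a and scale = b − a; the values of a, b, u below are arbitrary):

from scipy.stats import uniform

a, b, u = 2.0, 5.0, 3.2                     # arbitrary choices with a <= u <= b
print(uniform.cdf(u, loc=a, scale=b - a))   # scipy's Uniform[a, b] uses loc = a, scale = b - a
print((u - a) / (b - a))                    # closed form for a <= u <= b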
Example 4 (Gaussian distribution). The PDF f_X for a Gaussian random variable N(µ, σ²) is,

f_X(t) = (1/(√(2π) σ)) exp( −(t − µ)² / (2σ²) ),   where µ ∈ R and σ² > 0.   (3.2)
On integrating the PDF, we get the CDF,
F_X(u) = Φ( (u − µ)/σ ),
where Φ is the CDF of the standard normal distribution. It has no closed form, but it is closely related to the error function through Φ(x) = (1 + erf(x/√2))/2, and erf is available in most standard numerical computing libraries, for example scipy.special.erf in Python.
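A minimal sketch of this relation, evaluating Φ((u − µ)/σ) through scipy.special.erf and cross-checking with scipy.stats.norm.cdf (the parameter values are arbitrary):

import numpy as np
from scipy.special import erf
from scipy.stats import norm

mu, sigma, u = 1.0, 2.0, 2.5              # arbitrary parameters
z = (u - mu) / sigma
print(0.5 * (1 + erf(z / np.sqrt(2))))    # Phi(z) via the error function
print(norm.cdf(u, loc=mu, scale=sigma))   # same value from the normal CDF directly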
Definition 1 (Dirac delta function). The Dirac delta function δ is defined informally by,

δ(x) = ∞ for x = 0 and δ(x) = 0 for x ≠ 0, together with ∫_{−∞}^{∞} δ(x) dx = 1.
The Dirac delta satisfies,
∫_{−∞}^{∞} g(t) δ(t − t₀) dt = g(t₀).
Example 5 (Mixed random variable). A PDF can combine a point mass with a density. Consider the random variable X with

f_X(t) = 0 for t < a,   f_X(t) = (1/2) δ(t − a) + 1/(2(b − a)) for a ≤ t ≤ b,   and f_X(t) = 0 for t > b,

which places probability 1/2 on the point a and spreads the remaining 1/2 uniformly over [a, b], so that ∫_{−∞}^{∞} f_X(t) dt = 1/2 + (b − a)/(2(b − a)) = 1.
3.2 Expectation of Continuous Random Variables
If X is a continuous random variable with PDF f_X, the expected value of a function g : R → R is defined as,

E g(X) = ∫_{−∞}^{∞} g(t) f_X(t) dt
Example 6 (Expectation: Exponential random variable). Using the definition of expectation above with g(t) = t, we get,
E X = ∫_{0}^{∞} t · λ e^{−λt} dt
Integrating by parts, we get:
E X = [−t e^{−λt}]_{0}^{∞} + ∫_{0}^{∞} e^{−λt} dt = 0 + [−e^{−λt}/λ]_{0}^{∞} = 1/λ.
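The same value can be obtained by numerical integration; a minimal sketch with scipy.integrate.quad (λ = 2 is an arbitrary choice):

import numpy as np
from scipy.integrate import quad

lam = 2.0                                            # arbitrary rate
val, _ = quad(lambda t: t * lam * np.exp(-lam * t), 0, np.inf)
print(val, 1 / lam)                                  # both ~ 0.5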
Example 7 (Expectation: Gaussian random variable). Recalling the PDF f_X from (3.2), we get

E X = ∫_{−∞}^{∞} x · (1/(√(2π) σ)) exp( −(x − µ)² / (2σ²) ) dx.

Substituting u = x − µ, we get x = u + µ and du = dx. Integrating the modified equation, we get

E X = ∫_{−∞}^{∞} (u + µ) · (1/(√(2π) σ)) exp( −u² / (2σ²) ) du
    = ∫_{−∞}^{∞} u · (1/(√(2π) σ)) exp( −u² / (2σ²) ) du + µ ∫_{−∞}^{∞} (1/(√(2π) σ)) exp( −u² / (2σ²) ) du.

The first integral is 0 because its integrand is an odd function, so

E X = µ · ∫_{−∞}^{∞} (1/(√(2π) σ)) exp( −u² / (2σ²) ) du = µ ∫_{−∞}^{∞} f_Y(t) dt = µ,
where Y is a Gaussian random variable with mean parameter µ = 0 and variance σ².
Property 1 (Linearity of Expectation). For any two functions f and g : R → R, and any two numbers α and β ∈ R, the expected value of the sum of the scaled functions is equal to the sum of the expected values of the scaled functions. Mathematically, this means
E[(αf + βg)(X)] = E[αf(X) + βg(X)] = α E f(X) + β E g(X).
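A minimal Monte Carlo sketch of this property (the distribution of X, the test functions f, g, and the constants α, β are all arbitrary choices):

import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=1_000_000)     # samples of X ~ Exp(1)

f = lambda t: t**2                                 # arbitrary test functions
g = lambda t: np.sin(t)
alpha, beta = 2.0, -3.0

lhs = np.mean(alpha * f(x) + beta * g(x))          # empirical E(alpha f + beta g)(X)
rhs = alpha * np.mean(f(x)) + beta * np.mean(g(x))
print(lhs, rhs)                                    # agree up to floating-point rounding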
3.3 Multiple random variables: Joint and Marginal distributions
The function F_{XY}(u, v) = P_{XY}({X ≤ u} ∩ {Y ≤ v}) is called the joint CDF of (X, Y).
Example 8 (Joint PMF). The joint PMF for a pair of discrete random variables is expressed as:
P_{XY}({X = u} ∩ {Y = v}).
The marginal distribution of X is the distribution of X irrespective of the value of Y. Mathematically, the PMF for X = u is given by,
P_X({X = u}) = ∑_v P_{XY}({X = u} ∩ {Y = v}),
where the sum is over all possible values v that the random variable Y can take.

Definition 2. A pair of random variables is said to be independent, denoted X ⊥⊥ Y, if,
P_{XY}({X = u} ∩ {Y = v}) = P_X({X = u}) · P_Y({Y = v}).
Example 9. Consider the pair of random variables (X, Y) whose joint PMF P_{XY} and marginals P_X, P_Y are tabulated below.

P_{XY} | Y = 0 | Y = 1 | P_X
X = 0  |  0.1  |  0.2  | 0.3
X = 1  |  0.4  |  0.3  | 0.7
P_Y    |  0.5  |  0.5  |

Since P_{XY}({X = 0} ∩ {Y = 0}) = 0.1 ≠ 0.15 = P_X({X = 0}) · P_Y({Y = 0}), the pair (X, Y) is dependent.
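The same check can be run in numpy: the pair is independent exactly when the joint PMF equals the outer product of its marginals. A minimal sketch using the table above:

import numpy as np

# joint PMF from the table: rows index X in {0, 1}, columns index Y in {0, 1}
P = np.array([[0.1, 0.2],
              [0.4, 0.3]])

p_x = P.sum(axis=1)   # marginal of X: [0.3, 0.7]
p_y = P.sum(axis=0)   # marginal of Y: [0.5, 0.5]

# X and Y are independent iff the joint equals the outer product of the marginals
print(np.allclose(P, np.outer(p_x, p_y)))   # False, so (X, Y) are dependent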
Definition 3. Three random variables X, Y, Z are said to be independent if,

P_{XYZ}({X = u} ∩ {Y = v} ∩ {Z = t}) = P_X({X = u}) · P_Y({Y = v}) · P_Z({Z = t}),

where the marginal distribution of X is

P_X({X = u}) = ∑_{v,t} P_{XYZ}({X = u} ∩ {Y = v} ∩ {Z = t}),

and P_Y and P_Z are defined similarly.
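A minimal numpy sketch of this three-way marginalization (the joint PMF below is a randomly generated, hypothetical example, not data from the lecture):

import numpy as np

rng = np.random.default_rng(1)
P = rng.random((2, 2, 2))    # a hypothetical joint PMF over (X, Y, Z), each binary
P /= P.sum()                 # normalize so the entries sum to 1

p_x = P.sum(axis=(1, 2))     # sum out Y and Z to get the marginal of X
p_y = P.sum(axis=(0, 2))     # sum out X and Z
p_z = P.sum(axis=(0, 1))     # sum out X and Y
print(p_x, p_y, p_z)         # each marginal sums to 1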