Probability Density Function (pdf) (概率密度函数)
The pdf is defined as the derivative of the cdf:
\[f_{X}(x)=\frac{\text{d}F_{X}(x)}{\text{d}x}=\lim_{h\rightarrow 0}\frac{F_{X}(x+h)-F_{X}(x)}{h}
\]
Note that the pdf is NOT the probability that \(X=x\); rather, it is proportional to the probability that \(X\) lies close to \(x\):
\[\begin{aligned}
P[x<X\leq x+h]&=F_{X}(x+h)-F_{X}(x)\\\\
&=\frac{F_{X}(x+h)-F_{X}(x)}{h}\cdot h\\\\
&\approx f_{X}(x)\cdot h \text{ (if } h \text{ is small)}
\end{aligned}
\]
Properties
-
\(f_{X}(x)\geq 0\)
-
\(F_{X}(x)=\int_{-\infty}^xf_{X}(t)\text{d}t\)
-
\(\int_{-\infty }^{\infty}f_{X}(t)\text{d}t=1\)
-
\(P[a<X\leq b]=\int_{a^{+}}^bf_{X}(x)\text{d}x\)
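As a numeric sanity check of \(P[x<X\leq x+h]\approx f_{X}(x)h\), the sketch below assumes a standard normal \(X\) for illustration (its cdf is expressed through `math.erf`) and compares the exact cdf increment with the pdf-times-\(h\) estimate:

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """cdf of N(mu, sigma^2), written via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def normal_pdf(x, mu=0.0, sigma=1.0):
    """pdf of N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# P[x < X <= x+h] ~= f_X(x) * h for small h
x, h = 0.5, 1e-4
exact = normal_cdf(x + h) - normal_cdf(x)
approx = normal_pdf(x) * h
assert abs(exact - approx) / exact < 1e-3
```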
Expectation
\[E[X]=\int_{-\infty}^{\infty}xf_{X}(x)\text{d}x
\]
Variance
\[\text{VAR}[X]=E[(X-E[X])^2]
\]
Exponential (指数分布) Random Variable
cdf:
\[F_{X}(x)=
\begin{cases}
0, &x<0\\\\
1-e^{-\lambda x}, &x\geq 0
\end{cases}
\]
pdf:
\[f_{X}(x)=
\begin{cases}
0, &x<0\\\\
\lambda e^{-\lambda x}, &x\geq 0
\end{cases}
\]
- \(E[X]=\frac{1}{\lambda}\)
Proof
\[\begin{aligned}
E[X]&=\int_{0}^{\infty} xf_{X}(x)\text{d}x\\\\
&=\int_{0}^{\infty} x\lambda e^{-\lambda x}\text{d}x\\\\
&=[-xe^{-\lambda x}]|_{0}^{\infty}+\int_{0}^{\infty} e^{-\lambda x}\text{d}x &\text{(integration by parts)}\\\\
&=0+[-\frac{1}{\lambda}e^{-\lambda x}]|_{0}^{\infty}\\\\
&=\frac{1}{\lambda}
\end{aligned}
\]
- \(\text{VAR}[X]=\frac{1}{\lambda^2}\)
Proof
\[\begin{aligned}
E[X^2]&=\int_{0}^{\infty} x^2f_{X}(x)\text{d}x\\\\
&=\int_{0}^{\infty} x^2\lambda e^{-\lambda x}\text{d}x\\\\
&=[(-x^2-\frac{2x}{\lambda}-\frac{2}{\lambda^2})e^{-\lambda x}]|_{0}^{\infty}=\frac{2}{\lambda^2}
\end{aligned}
\]
\[\text{VAR}[X]=E[X^2]-(E[X])^2=\frac{2}{\lambda^2}-(\frac{1}{\lambda})^2=\frac{1}{\lambda^2}
\]

- The exponential random variable is the limiting form of the geometric random variable.
i.e. for a geometric RV \(M\) with success probability \(p\) and an exponential RV \(X\), fix \(\lambda=np\) (so \(p=\lambda/n\)); then
\(P[X\leq t]=P[M\leq nt]=1-P[M>nt]=1-(1-p)^{nt}\stackrel{n\rightarrow \infty}{\longrightarrow} 1-e^{-\lambda t}\)
so the approximation holds when \(n\) is sufficiently large.
- Memoryless Property: \(P[X>t+h|X>t]=P[X>h]\)
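The mean, variance, and memoryless property can all be checked by simulation. The sketch below draws exponential samples by inverse-cdf sampling; the choices \(\lambda=2\), the sample size, and the test points \(t, h\) are arbitrary assumptions:

```python
import math
import random

random.seed(0)
lam = 2.0
n = 200_000
# Inverse-cdf sampling: if U ~ Uniform(0,1), then -ln(1-U)/lam ~ Exponential(lam)
samples = [-math.log(1.0 - random.random()) / lam for _ in range(n)]

mean = sum(samples) / n
var = sum((s - mean) ** 2 for s in samples) / n
assert abs(mean - 1 / lam) < 0.01       # E[X] = 1/lambda
assert abs(var - 1 / lam ** 2) < 0.01   # VAR[X] = 1/lambda^2

# Memoryless property: P[X > t+h | X > t] == P[X > h]
t, h = 0.5, 0.3
tail = [s for s in samples if s > t]
p_cond = sum(1 for s in tail if s > t + h) / len(tail)
p_h = sum(1 for s in samples if s > h) / n
assert abs(p_cond - p_h) < 0.02
```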

Gaussian (高斯分布) Random Variable
\[X\sim \mathscr{N}(\mu, \sigma^2)
\]
Here \(\mu\triangleq E[X]\), \(\sigma^2\triangleq \text{VAR}[X]\)
pdf:
\[f_{X}(x)=\frac{1}{\sqrt{2\pi}\sigma}e^{\frac{-(x-\mu)^2}{2\sigma^2}}
\]
- Special values
-
\(P[\mu-\sigma<X\leq \mu+\sigma]=68\%\)
-
\(P[\mu-2\sigma<X\leq \mu+2\sigma]=95\%\)
-
\(P[\mu-3\sigma<X\leq \mu+3\sigma]=99.7\%\)
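These 68-95-99.7 values can be reproduced from the error function; a minimal sketch:

```python
import math

def normal_cdf(z):
    """cdf of the standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# P[mu - k*sigma < X <= mu + k*sigma] reduces to P[-k < Z <= k] for Z ~ N(0,1)
for k, target in [(1, 0.6827), (2, 0.9545), (3, 0.9973)]:
    p = normal_cdf(k) - normal_cdf(-k)
    assert abs(p - target) < 5e-4
```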
Normalized Gaussian
-
If \(X\sim\mathscr{N}(\mu, \sigma^2)\) and \(Y\triangleq\frac{X-\mu}{\sigma}\), then \(Y\sim\mathscr{N}(0, 1)\).
-
cdf of the Normalized Gaussian
\[\Phi(x)=P[Y\leq x]=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^xe^{\frac{-t^2}{2}}\text{d}t
\]
-
cdf of \(X\sim\mathscr{N}(\mu, \sigma^2)\)
\[F_{X}(x)=\Phi(\frac{x-\mu}{\sigma})
\]
Q-function for the Normalized Gaussian
\[Q(x)=P[Y>x]
\]
where \(Y\) is the normalized Gaussian.
Q-function for \(X\sim\mathscr{N}(\mu, \sigma^2)\)
\[P[X>x]=Q(\frac{x-\mu}{\sigma})
\]
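In code, the Q-function is conveniently computed from the complementary error function. A small sketch (the values \(\mu=3\), \(\sigma=2\), and the threshold are arbitrary assumptions):

```python
import math

def Q(z):
    """Q(z) = P[Z > z] for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# For Y ~ N(mu, sigma^2): P[Y > y] = Q((y - mu) / sigma)
mu, sigma, y = 3.0, 2.0, 5.0
p = Q((y - mu) / sigma)          # = Q(1)
assert abs(p - 0.15866) < 1e-4   # 1 - Phi(1) ~= 0.1587
```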
Recurrence for the raw moments of a normal distribution
Let the mean be \(\mu\) and the variance \(\sigma^2\):
\[E[X^{n+1}]=\mu E[X^{n}]+n\sigma^2 E[X^{n-1}]
\]
Proof
\[\begin{aligned}
E[X^{n-1}]&=\int_{-\infty}^{+\infty}x^{n-1}\frac{1}{\sqrt{2\pi}\sigma}e^{\frac{-(x-\mu)^2}{2\sigma^2}}\text{d}x\\\\
&=\int_{-\infty}^{+\infty}\frac{1}{\sqrt{2\pi}\sigma}e^{\frac{-(x-\mu)^2}{2\sigma^2}}\text{d}\frac{x^{n}}{n}\\\\
&=[\frac{x^{n}}{n}\frac{1}{\sqrt{2\pi}\sigma}e^{\frac{-(x-\mu)^2}{2\sigma^2}}]|_{x=-\infty}^{+\infty}-\int_{-\infty}^{+\infty}\frac{x^{n}}{n}\text{d}\frac{1}{\sqrt{2\pi}\sigma}e^{\frac{-(x-\mu)^2}{2\sigma^2}}\\\\
&=0-\int_{-\infty}^{+\infty}\frac{x^{n}}{n}\frac{1}{\sqrt{2\pi}\sigma}e^{\frac{-(x-\mu)^2}{2\sigma^2}}\cdot\frac{\mu-x}{\sigma^2}\text{d}x\\\\
&=-\frac{\mu}{n\sigma^2}\int_{-\infty}^{+\infty}x^{n}\frac{1}{\sqrt{2\pi}\sigma}e^{\frac{-(x-\mu)^2}{2\sigma^2}}\text{d}x+\frac{1}{n\sigma^2}\int_{-\infty}^{+\infty}x^{n+1}\frac{1}{\sqrt{2\pi}\sigma}e^{\frac{-(x-\mu)^2}{2\sigma^2}}\text{d}x\\\\
&=-\frac{\mu E[X^n]}{n\sigma^2}+\frac{E[X^{n+1}]}{n\sigma^2}\\\\
\end{aligned}
\]
Thus
\[n\sigma^2E[X^{n-1}]=-\mu E[X^n]+E[X^{n+1}]
\]
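The recurrence can be verified against the known closed-form raw moments of a normal distribution; a minimal sketch (the values of \(\mu\) and \(\sigma^2\) are arbitrary):

```python
mu, sigma2 = 1.5, 0.8

# Raw moments from the recurrence E[X^{n+1}] = mu E[X^n] + n sigma^2 E[X^{n-1}],
# seeded with E[X^0] = 1 and E[X^1] = mu.
m = [1.0, mu]
for n in range(1, 4):
    m.append(mu * m[n] + n * sigma2 * m[n - 1])

# Compare with the standard closed forms for the normal raw moments
assert abs(m[2] - (mu**2 + sigma2)) < 1e-9
assert abs(m[3] - (mu**3 + 3 * mu * sigma2)) < 1e-9
assert abs(m[4] - (mu**4 + 6 * mu**2 * sigma2 + 3 * sigma2**2)) < 1e-9
```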
Joint Cumulative Distribution Function
\[F_{X, Y}(x, y)=P[\lbrace X\leq x\rbrace\cap\lbrace Y\leq y\rbrace]
\]
Marginal cdf
\[F_{X}(x)=F_{X, Y}(x, \infty)
\]
\[F_{Y}(y)=F_{X, Y}(\infty, y)
\]
Jointly Continuous Random Variables
We say that two random variables are jointly continuous if their joint cumulative
distribution function is continuous and differentiable.
Joint Probability Density Function
\[f_{X, Y}(x, y)=\frac{\partial^2 F_{X, Y}(x, y)}{\partial x\partial y}
\]
- Note that the probability density function is not necessarily continuous.
Marginal Densities
\[f_{X}(x)=\int_{-\infty}^{\infty}f_{X, Y}(x, \beta)\text{d}\beta
\]
\[f_{Y}(y)=\int_{-\infty}^{\infty}f_{X, Y}(\alpha, y)\text{d}\alpha
\]
Joint Distributions (\(X\) Discrete, \(Y\) Continuous)
Joint CDF: \(F_{X, Y}(x, y)=P[X\leq x, Y\leq y]\)
PMF in \(X\) and CDF in \(Y\): \(p_{X, Y}(x, y)=P[X=x, Y\leq y]\)
PMF in \(X\) and PDF in \(Y\): \(f_{X, Y}(x, y)=\frac{\text{d}}{\text{d}y}P[X=x, Y\leq y]\)
Conditional cdf (Continuous \(X\) and \(Y\))
Define the conditional cdf of \(Y\) given \(X=x\) by
\[F_{Y|X}(y|x)=\frac{\int_{-\infty}^yf_{X, Y}(x, \beta)\text{d}\beta}{f_{X}(x)}
\]
Proof
\[\begin{aligned}
F_{Y|X}(y|x)&=\lim_{h\rightarrow 0}\frac{P[\lbrace Y\leq y\rbrace\cap \lbrace x\leq X\leq x+h\rbrace]}{P[\lbrace x\leq X\leq x+h\rbrace]}\\\\
&=\lim_{h\rightarrow 0}\frac{\int_{-\infty}^y\int_{x}^{x+h}f_{X, Y}(\alpha, \beta)\text{d}\alpha\text{d}\beta}{\int_{x}^{x+h}f_{X}(\alpha)\text{d}\alpha}\approx\lim_{h\rightarrow 0}\frac{h\int_{-\infty}^yf_{X, Y}(x, \beta)\text{d}\beta}{f_{X}(x)h}\\\\
&=\frac{\int_{-\infty}^yf_{X, Y}(x, \beta)\text{d}\beta}{f_{X}(x)}
\end{aligned}
\]
Conditional pdf (Continuous \(X\) and \(Y\))
Define the conditional pdf of \(Y\) given \(X=x\) by
\[f_{Y|X}(y|x)=\frac{f_{X, Y}(x, y)}{f_{X}(x)}
\]
Properties
\[f_{X, Y}(x, y)=f_{X|Y}(x|y)f_{Y}(y)
\]
\[f_{X, Y}(x, y)=f_{Y|X}(y|x)f_{X}(x)
\]
- Total Probability Theorem
\[f_{Y}(y)=\int_{-\infty}^{\infty}f_{Y|X}(y|x)f_{X}(x)\text{d}x
\]
\[f_{X|Y}(x|y)=\frac{f_{Y|X}(y|x)f_{X}(x)}{\int_{-\infty}^{\infty}f_{Y|X}(y|t)f_{X}(t)\text{d}t}
\]
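The continuous Bayes rule above can be exercised numerically. The sketch below assumes a hypothetical model, \(X\sim\text{Uniform}(0,1)\) and \(Y|X=x\sim\mathscr{N}(x, 0.1^2)\), and checks that the resulting posterior density integrates to 1 and peaks at the observation:

```python
import math

# Hypothetical model (for illustration): X ~ Uniform(0,1), Y | X = x ~ N(x, 0.1^2)
def f_y_given_x(y, x, s=0.1):
    return math.exp(-0.5 * ((y - x) / s) ** 2) / (s * math.sqrt(2 * math.pi))

y_obs = 0.4
N = 10_000
xs = [i / N for i in range(N + 1)]
vals = [f_y_given_x(y_obs, x) for x in xs]   # f_{Y|X}(y|x) * f_X(x), with f_X = 1

# Trapezoid rule for the evidence f_Y(y) = \int f_{Y|X}(y|x) f_X(x) dx
evidence = sum((vals[i] + vals[i + 1]) / 2 / N for i in range(N))
posterior = [v / evidence for v in vals]     # f_{X|Y}(x|y) by Bayes rule

# The posterior density integrates to 1 and peaks at the observation
total = sum((posterior[i] + posterior[i + 1]) / 2 / N for i in range(N))
assert abs(total - 1.0) < 1e-9
assert abs(xs[posterior.index(max(posterior))] - y_obs) < 1e-3
```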
Conditional pmf or pdf (\(X\) discrete, \(Y\) continuous)
Let \(f_{X, Y}(x, y)=\frac{\text{d}}{\text{d}y}P[X=x, Y\leq y]\) (joint pmf in \(X\) and pdf in \(Y\)), with \(p_{X}(x)=P[X=x]\) and \(f_{Y}(y)=\sum_{x}f_{X, Y}(x, y)\).
- Conditional pmf of \(X\) given \(Y=y\):
\[p_{X|Y}(x|y)=
\begin{cases}
\frac{f_{X, Y}(x, y)}{f_{Y}(y)}, &f_{Y}(y)>0\\\\
\text{undefined}, &\text{otherwise}
\end{cases}
\]
- Conditional pdf of \(Y\) given \(X=x\):
\[f_{Y|X}(y|x)=
\begin{cases}
\frac{f_{X, Y}(x, y)}{p_{X}(x)}, &p_{X}(x)>0\\\\
\text{undefined}, &\text{otherwise}
\end{cases}
\]
Gamma function
\[\Gamma(z)=\int_{0}^{\infty}x^{z-1}e^{-x}\text{d}x
\]
\(\Gamma(z)=(z-1)!\) for any positive integer \(z\); in general, \(\Gamma(z+1)=z\Gamma(z)\).
[Zhihu] A Beginner's Guide to Special Functions: The Gamma Function (Part 1) - fell
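Python's standard library exposes the Gamma function as `math.gamma`; a quick check of the factorial identity and the recurrence:

```python
import math

# Gamma(z) = (z-1)! for positive integers, and Gamma(1/2) = sqrt(pi)
assert math.isclose(math.gamma(5), math.factorial(4))   # both are 24
assert math.isclose(math.gamma(0.5), math.sqrt(math.pi))

# Recurrence Gamma(z+1) = z * Gamma(z), at an arbitrary non-integer point
z = 2.7
assert math.isclose(math.gamma(z + 1), z * math.gamma(z))
```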
Independence (Continuous Random Variables)
The following three statements are equivalent:
-
\(X\) and \(Y\) are independent.
-
\(F_{X, Y}(x, y)=F_{X}(x)F_{Y}(y)\) for all \(x\) and \(y\)
-
\(f_{X, Y}(x, y)=f_{X}(x)f_{Y}(y)\) for all \(x\) and \(y\)
Moments and Central Moments
\[E[X^jY^k]=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}x^jy^kf_{X, Y}(x, y)\text{d}x\text{d}y
\]
- Central moment: \(E[(X-E[X])^j(Y-E[Y])^k]\)
Conditional Expectation
\[E[Y|x]=
\begin{cases}
\int_{-\infty}^{\infty} yf_{Y|X}(y|x)\text{d}y, & Y \text{ is continuous}\\\\
\sum_{y} yp_{Y|X}(y|x), & Y \text{ is discrete}\\\\
\end{cases}
\]
Two Joint Gaussian RVs
Joint pdf
If \(X_1\) and \(X_2\) are jointly Gaussian, their joint pdf is given by
\[f_{X_1, X_2}(x_1, x_2)=\frac{1}{2\pi\sigma_{1}\sigma_2 \sqrt{1-\rho^2} }\text{exp}\left\lbrace-\frac{\frac{(x_1-m_1)^2}{\sigma_1^2}-2\rho\frac{(x_1-m_1)(x_2-m_2)}{\sigma_1\sigma_2}+\frac{(x_2-m_2)^2}{\sigma_2^2}}{2(1-\rho^2)}\right\rbrace
\]
where \(m_i=E[X_i], \sigma_i^2=\text{VAR}[X_i]\), \(\rho\) is the correlation coefficient.

Vector Notation
\[\vec{X}=
\left[
\begin{matrix}
X_1\\\\
X_2\\\\
\vdots\\\\
X_n
\end{matrix}
\right]
\]
\[\begin{aligned}
C&=E[(\vec{X}-E[\vec{X}])(\vec{X}-E[\vec{X}])^{T}]\\\\
&=
\left[
\begin{matrix}
E[(X_1-E[X_1])(X_1-E[X_1])] &E[(X_1-E[X_1])(X_2-E[X_2])] &\cdots &E[(X_1-E[X_1])(X_n-E[X_n])]\\\\
E[(X_2-E[X_2])(X_1-E[X_1])] &E[(X_2-E[X_2])(X_2-E[X_2])] &\cdots &E[(X_2-E[X_2])(X_n-E[X_n])]\\\\
\vdots &\vdots &\ddots &\vdots\\\\
E[(X_n-E[X_n])(X_1-E[X_1])] &E[(X_n-E[X_n])(X_2-E[X_2])] &\cdots &E[(X_n-E[X_n])(X_n-E[X_n])]\\\\
\end{matrix}
\right]
\end{aligned}
\]
\[f_{\vec{X}}(\vec{x})=\frac{1}{(2\pi)^{n/2}\sqrt{\det(C)}}\text{exp}\left\lbrace -\frac{1}{2}(\vec{x}-E[\vec{X}])^{T}C^{-1}(\vec{x}-E[\vec{X}])\right\rbrace
\]
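For \(n=2\), the vector form should reduce to the two-variable pdf above. The sketch below compares the two formulas at a point, with the \(2\times 2\) determinant and inverse written out by hand (the parameter values and the evaluation point are arbitrary assumptions):

```python
import math

m1, m2 = 1.0, -0.5
s1, s2, rho = 1.5, 0.8, 0.6

def pdf_scalar(x1, x2):
    """Two-variable jointly Gaussian pdf, written with sigmas and rho."""
    q = ((x1 - m1)**2 / s1**2
         - 2 * rho * (x1 - m1) * (x2 - m2) / (s1 * s2)
         + (x2 - m2)**2 / s2**2)
    return math.exp(-q / (2 * (1 - rho**2))) / (2 * math.pi * s1 * s2 * math.sqrt(1 - rho**2))

def pdf_vector(x1, x2):
    """Vector form with C = [[s1^2, rho s1 s2], [rho s1 s2, s2^2]], n = 2."""
    c11, c12, c22 = s1**2, rho * s1 * s2, s2**2
    det = c11 * c22 - c12**2
    d1, d2 = x1 - m1, x2 - m2
    quad = (c22 * d1 * d1 - 2 * c12 * d1 * d2 + c11 * d2 * d2) / det  # (x-m)^T C^{-1} (x-m)
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))

assert math.isclose(pdf_scalar(1.3, 0.2), pdf_vector(1.3, 0.2))
```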

Properties
Assume that \(X_1\) and \(X_2\) are jointly Gaussian.
-
The marginal densities of \(X_1\) and \(X_2\) are Gaussian.
-
If \(X_1\) and \(X_2\) are uncorrelated, they are also independent.
-
The conditional density of \(X_1\) given \(X_2\) is Gaussian.
-
Any affine combination of \(X_1\) and \(X_2\) is Gaussian.
Markov Inequality
Let \(X\) be a non-negative RV, i.e. \(X\geq 0\). Then
\[P(X\geq a)\leq \frac{E(X)}{a}
\]
Proof
\[\begin{aligned}
E(X)&=\int_{0}^{\infty} tf_{X}(t)\text{d}t\\\\
&\geq\int_{a}^{\infty} tf_{X}(t)\text{d}t\\\\
&\geq \int_{a}^{\infty} af_{X}(t)\text{d}t=aP(X\geq a)
\end{aligned}
\]
Chebyshev Inequality
Let \(X\) be a RV. with mean \(m\) and variance \(\sigma^2\).
\[P[|X-m|\geq a]\leq \frac{\sigma^2}{a^2}
\]
In other words, for any \(a\) large compared with the standard deviation, the probability that \(X\) is further than \(a\) from the mean is negligible (可忽略). This is very useful for bounding errors.
Proof
\[\begin{aligned}
P(|X-m|\geq a)&=P((X-m)^2\geq a^2)\\\\
&\leq \frac{E[(X-m)^2]}{a^2}=\frac{\sigma^2}{a^2}
\end{aligned}
\]
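Both the Markov and the Chebyshev bounds can be checked empirically. The sketch below uses Exponential(1) samples as an arbitrary non-negative test distribution (mean 1, variance 1):

```python
import math
import random

random.seed(1)
n = 100_000
# Exponential(1) samples via inverse-cdf sampling; X >= 0, so Markov applies
xs = [-math.log(1.0 - random.random()) for _ in range(n)]
mean = sum(xs) / n
var = sum((x - mean) ** 2 for x in xs) / n

for a in (2.0, 3.0, 5.0):
    p_markov = sum(1 for x in xs if x >= a) / n
    assert p_markov <= mean / a        # Markov: P[X >= a] <= E[X]/a
    p_cheb = sum(1 for x in xs if abs(x - mean) >= a) / n
    assert p_cheb <= var / a ** 2      # Chebyshev: P[|X-m| >= a] <= sigma^2/a^2
```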
Independent and Identically Distributed (i.i.d.) Random Variables
In probability theory and statistics, a collection of random variables is independent and identically distributed (i.i.d.) if each random variable has the same probability distribution as the others and all are mutually independent.
Central Limit Theorem
Suppose \(X_i\) for \(i\in\lbrace 1, 2, \cdots, n\rbrace\) are i.i.d. random variables with mean \(\mu\) and variance \(\sigma^2\).
Define \(S_n=\sum_{i=1}^n X_i, Z_n=\frac{S_n-n\mu}{\sigma\sqrt n}\), then
\[\lim_{n\rightarrow \infty}P[Z_n<z]=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{z}\text{exp}(-\frac{x^2}{2})\text{d}x
\]
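A simulation sketch of the theorem, assuming \(X_i\sim\text{Uniform}(0,1)\) (so \(\mu=1/2\), \(\sigma^2=1/12\)); the empirical cdf of \(Z_n\) is compared with the standard normal cdf at a few points:

```python
import math
import random

random.seed(0)

def phi(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# X_i ~ Uniform(0,1): mu = 1/2, sigma^2 = 1/12
mu, sigma = 0.5, math.sqrt(1.0 / 12.0)
n, trials = 50, 20_000
zs = []
for _ in range(trials):
    s = sum(random.random() for _ in range(n))        # S_n
    zs.append((s - n * mu) / (sigma * math.sqrt(n)))  # Z_n

# The empirical cdf of Z_n should be close to the standard normal cdf
for z in (-1.0, 0.0, 1.0):
    emp = sum(1 for v in zs if v < z) / trials
    assert abs(emp - phi(z)) < 0.02
```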