Applied Statistics Notes: 2 Discrete Random Variables
Random Variables
- A random variable \(X\) is a function that \(\underline{\text{assigns a number to every outcome}}\) of an experiment.
- A discrete random variable assumes values from a \(\underline{\text{countable}}\) set \(S_X=\{x_1, x_2, \cdots\}\).
- A discrete random variable is finite if \(\underline{\text{its range is finite}}\).
Probability Mass Function (pmf)
Consider a discrete random variable \(X\) that assumes values from a finite or countable set \(S_X=\{x_1, x_2, \cdots, x_k\}\). The pmf of \(X\) is defined as:
\[p_X(x)=P[X=x],\quad x\in S_X\]
Properties of Probability Mass Functions
- \(p_X(x)\geq 0\)
- \(\sum_{x\in S_X}p_X(x)=1\)
- \(P[X\in B]=\sum_{x\in B}p_{X}(x)\), where \(B\subseteq S_X\)
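These properties can be checked numerically. A minimal sketch; the four-point pmf below is a hypothetical example, not from the notes:

```python
from fractions import Fraction

# Hypothetical pmf for a fair four-sided die: S_X = {1, 2, 3, 4}.
pmf = {x: Fraction(1, 4) for x in [1, 2, 3, 4]}

# Property 1: p_X(x) >= 0 for every x in S_X.
assert all(p >= 0 for p in pmf.values())

# Property 2: the probabilities sum to 1.
assert sum(pmf.values()) == 1

# Property 3: P[X in B] is the sum of p_X(x) over x in B.
B = {2, 4}
prob_B = sum(p for x, p in pmf.items() if x in B)
print(prob_B)  # 1/2
```

Using `Fraction` keeps the arithmetic exact, so the sum-to-one check holds without floating-point tolerance.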
Cumulative Distribution Function (cdf)
The cumulative distribution function (cdf) of a random variable (r.v.) \(X\) is defined as
\[F_X(a)=P[X\leq a]\]
- For a discrete r.v., \(F_X(a)=\sum_{x\leq a}p_X(x)\)
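The discrete cdf can be computed directly by summing the pmf. A sketch, using a hypothetical fair-die pmf:

```python
from fractions import Fraction

def cdf(pmf, a):
    """F_X(a) = sum of p_X(x) over all x <= a (discrete r.v.)."""
    return sum((p for x, p in pmf.items() if x <= a), Fraction(0))

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # fair six-sided die
print(cdf(pmf, 3))  # 1/2
print(cdf(pmf, 0))  # 0   (below the range of X)
print(cdf(pmf, 6))  # 1   (at the top of the range)
```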
Expected Value
- The expected value of a discrete r.v. is defined by
\[E[X]=\sum_{x\in S_X}x\,p_X(x)\]
- The expected value is defined only if the sum converges absolutely: \(\sum_{x}|x|p_X(x)<\infty\).
- If this sum does not converge, then the expected value does not exist (DNE).
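A minimal illustration of the definition; the die pmf is a hypothetical example, and exact `fractions` arithmetic avoids floating-point noise:

```python
from fractions import Fraction

def expectation(pmf):
    """E[X] = sum of x * p_X(x); assumes the sum converges absolutely."""
    return sum((Fraction(x) * p for x, p in pmf.items()), Fraction(0))

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # fair six-sided die
print(expectation(pmf))  # 7/2
```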
Properties of Expected Value
In short, these are the linearity properties of expectation.
- \(E[aX]=aE[X]\)
- \(E[c]=c\)
- \(E[X+c]=E[X]+c\)
- \(E[\sum_i X_i]=\sum_i E[X_i]\)
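These identities can be verified on any concrete pmf; the three-point pmf and the constants below are arbitrary choices for illustration:

```python
from fractions import Fraction

pmf = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}

def E(g):
    """E[g(X)] under the pmf above."""
    return sum((Fraction(g(x)) * p for x, p in pmf.items()), Fraction(0))

a, c = 3, 5
mean = E(lambda x: x)
assert E(lambda x: a * x) == a * mean  # E[aX] = a E[X]
assert E(lambda x: c) == c             # E[c] = c
assert E(lambda x: x + c) == mean + c  # E[X+c] = E[X] + c
print(mean)  # 3/4
```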
Variance
Define \(D=X-E[X]\), the difference between the value of the r.v. and its mean.
The variance of a r.v., \(\text{VAR}[X]\) (often denoted by \(\sigma^2\)), is defined as \(\underline{\text{the expected value of the squared difference}}\):
\[\text{VAR}[X]=E[D^2]=E[(X-E[X])^2]\]
Properties of Variance
Broadly, these are the corresponding shift and scaling properties for variance.
- \(\text{VAR}[c]=0\)
- \(\text{VAR}[X+c]=\text{VAR}[X]\)
- \(\text{VAR}[cX]=c^2\text{VAR}[X]\)
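These properties can also be checked numerically; the pmf and the constant \(c\) below are arbitrary illustrative choices:

```python
from fractions import Fraction

pmf = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}

def E(g):
    return sum((Fraction(g(x)) * p for x, p in pmf.items()), Fraction(0))

def VAR(g):
    """VAR[g(X)] = E[(g(X) - E[g(X)])^2]."""
    m = E(g)
    return E(lambda x: (Fraction(g(x)) - m) ** 2)

c = 7
v = VAR(lambda x: x)
assert VAR(lambda x: c) == 0              # VAR[c] = 0
assert VAR(lambda x: x + c) == v          # VAR[X+c] = VAR[X]
assert VAR(lambda x: c * x) == c * c * v  # VAR[cX] = c^2 VAR[X]
print(v)  # 11/16
```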
Moments & Central Moments
- The \(n^{th}\) moment of a r.v. \(X\) is defined as
\(E[X^n]=\sum_{x}x^np_X(x)\)
- The \(n^{th}\) central moment of a r.v. \(X\) is defined as
\(E[(X-E[X])^n]=\sum_{x}(x-E[X])^np_X(x)\)
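Both definitions translate directly into code. A sketch using a hypothetical fair-die pmf; note that the second central moment is exactly the variance:

```python
from fractions import Fraction

def moment(pmf, n):
    """n-th moment: E[X^n] = sum of x^n * p_X(x)."""
    return sum((Fraction(x) ** n * p for x, p in pmf.items()), Fraction(0))

def central_moment(pmf, n):
    """n-th central moment: E[(X - E[X])^n]."""
    m = moment(pmf, 1)
    return sum(((Fraction(x) - m) ** n * p for x, p in pmf.items()), Fraction(0))

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # fair six-sided die
print(moment(pmf, 1))          # 7/2   (the mean)
print(moment(pmf, 2))          # 91/6
print(central_moment(pmf, 2))  # 35/12 (the variance)
```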
Distributions
Bernoulli
An indicator r.v.: \(P[X=1]=p\), \(P[X=0]=1-p\).
- \(E[X]=p\)
- \(\text{VAR}[X]=p(1-p)\)
Binomial
\(X\) counts the successes in \(n\) independent Bernoulli trials: \(P[X=k]=C_{n}^kp^k(1-p)^{n-k}\), \(k=0, 1, \cdots, n\).
- \(E[X]=np\)
- \(\text{VAR}[X]=np(1-p)\)
Geometric
\(M\) is the trial number of the first success: \(P[M=k]=(1-p)^{k-1}p\), \(k=1, 2, \cdots\).
- \(E[M]=\frac{1}{p}\)
- \(\text{VAR}[M]=\frac{1-p}{p^2}\)
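The closed-form means and variances above can be cross-checked against the standard pmfs directly. A sketch with exact arithmetic; the geometric sum is truncated at a large index, so that check is approximate:

```python
from fractions import Fraction
from math import comb

def mean_var(pmf):
    """Exact mean and variance computed from a pmf given as a dict."""
    m = sum((Fraction(x) * p for x, p in pmf.items()), Fraction(0))
    v = sum(((Fraction(x) - m) ** 2 * p for x, p in pmf.items()), Fraction(0))
    return m, v

p, n = Fraction(1, 3), 5  # arbitrary illustrative parameters

# Bernoulli(p): P[X=1] = p, P[X=0] = 1 - p
bern = {0: 1 - p, 1: p}
assert mean_var(bern) == (p, p * (1 - p))

# Binomial(n, p): P[X=k] = C(n, k) p^k (1-p)^(n-k)
binom = {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}
assert mean_var(binom) == (n * p, n * p * (1 - p))

# Geometric(p): P[M=k] = (1-p)^(k-1) p, truncated at k = 199
geom = {k: (1 - p) ** (k - 1) * p for k in range(1, 200)}
m, v = mean_var(geom)
assert abs(m - 1 / p) < 1e-9 and abs(v - (1 - p) / p**2) < 1e-9
print(float(m), float(v))
```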
Poisson
The Poisson pmf is \(P[N=k]=\frac{\alpha^k}{k!}e^{-\alpha}\), \(k=0, 1, 2, \cdots\) (\(\alpha\) is a constant).
- \(E[N]=\alpha\)
Proof
\[E[N]=\sum_{k=0}^{\infty}k\frac{\alpha^k}{k!}e^{-\alpha}=\alpha e^{-\alpha}\sum_{k=1}^{\infty}\frac{\alpha^{k-1}}{(k-1)!}=\alpha e^{-\alpha}e^{\alpha}=\alpha\]
- \(VAR[N]=\alpha\)
Proof
First compute \(E[N(N-1)]=\sum_{k=2}^{\infty}k(k-1)\frac{\alpha^k}{k!}e^{-\alpha}=\alpha^2e^{-\alpha}\sum_{k=2}^{\infty}\frac{\alpha^{k-2}}{(k-2)!}=\alpha^2\). Then \(E[N^2]=\alpha^2+\alpha\), so \(\text{VAR}[N]=E[N^2]-E[N]^2=\alpha\).
- \(P[N=k]\) achieves its maximum value at \(k=\lfloor\alpha\rfloor\).
- The Poisson pmf is the limit of the binomial pmf as \(n\rightarrow \infty\) with \(\alpha\triangleq np\) held fixed:
\[\lim_{n\rightarrow \infty}C_{n}^kp^k(1-p)^{n-k}=\frac{\alpha^k}{k!}e^{-\alpha}\]
Proof
Consider \(p_0\), the probability of no successes:
\[p_0=(1-p)^n=\left(1-\frac{\alpha}{n}\right)^n\rightarrow e^{-\alpha}\]
Consider the ratio of consecutive probabilities:
\[\frac{p_k}{p_{k-1}}=\frac{C_{n}^kp^k(1-p)^{n-k}}{C_{n}^{k-1}p^{k-1}(1-p)^{n-k+1}}=\frac{(n-k+1)p}{k(1-p)}\]
Since \((n-k+1)p=\left(1-\frac{k-1}{n}\right)\alpha\rightarrow\alpha\)
And \(1-p=1-\frac{\alpha}{n}\rightarrow 1\)
We can obtain \(\frac{p_k}{p_{k-1}}\rightarrow\frac{\alpha}{k}\), and hence, starting from \(p_0\rightarrow e^{-\alpha}\) and applying the ratio \(k\) times,
\[p_k\rightarrow\frac{\alpha^k}{k!}e^{-\alpha}\]
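The limit can also be observed numerically: with \(p=\alpha/n\), the binomial pmf approaches the Poisson pmf as \(n\) grows. The values of \(\alpha\), \(k\), and \(n\) below are arbitrary:

```python
from math import comb, exp, factorial

alpha, k = 2.0, 3

def binom_pmf(n, k, p):
    """Binomial pmf C(n, k) p^k (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

poisson = alpha**k / factorial(k) * exp(-alpha)  # alpha^k e^{-alpha} / k!

for n in [10, 100, 10000]:
    print(n, binom_pmf(n, k, alpha / n))
print("poisson:", poisson)
```

Each printed binomial value gets closer to the Poisson value as \(n\) increases.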
Joint probability mass function (joint pmf)
\[p_{X,Y}(x, y)=P[X=x, Y=y]\]
Joint moment
The \((j, k)^{th}\) joint moment of \(X\) and \(Y\) is
\[E[X^jY^k]=\sum_{x}\sum_{y}x^jy^kp_{X,Y}(x, y)\]
Covariance
\[\text{COV}[X, Y]=E[(X-E[X])(Y-E[Y])]=E[XY]-E[X]E[Y]\]
- When the covariance is positive: if \(X\) is greater than its mean, \(Y\) is also usually greater than its mean.
- When the covariance is negative: if \(X\) is greater than its mean, \(Y\) is usually less than its mean.
- \(X\) and \(Y\) are uncorrelated if \(\text{COV}[X, Y]=0\).
- \(X\) and \(Y\) are uncorrelated if and only if \(E[XY]=E[X]\cdot E[Y]\).
Proof
Since \(\text{COV}[X, Y]=E[XY]-E[X]E[Y]\), we have \(\text{COV}[X, Y]=0\) if and only if \(E[XY]=E[X]\cdot E[Y]\).
- If \(X\) and \(Y\) are uncorrelated, then \(\text{VAR}[X+Y]=\text{VAR}[X]+\text{VAR}[Y]\)
Correlation coefficient
\[\rho_{X, Y}=\frac{\text{COV}[X, Y]}{\sigma_X\sigma_Y}\]
where \(\sigma^2_X=\text{VAR}[X]\), \(\sigma^2_Y=\text{VAR}[Y]\).
Range: \(\rho_{X, Y}\in[-1, 1]\).
(Comment: \(\rho_{X, Y}\) measures the strength of the linear relationship between \(X\) and \(Y\), the idea behind linear regression.)
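Covariance and the correlation coefficient follow directly from the joint pmf. A sketch; the joint pmf below is a hypothetical example chosen so that \(X\) and \(Y\) are positively correlated:

```python
from fractions import Fraction
from math import sqrt

# Hypothetical joint pmf p_{X,Y}(x, y) on a small support.
joint = {
    (0, 0): Fraction(1, 4), (0, 1): Fraction(1, 4),
    (1, 0): Fraction(1, 8), (1, 1): Fraction(3, 8),
}

def E(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum((Fraction(g(x, y)) * p for (x, y), p in joint.items()), Fraction(0))

mx, my = E(lambda x, y: x), E(lambda x, y: y)
cov = E(lambda x, y: x * y) - mx * my  # COV[X,Y] = E[XY] - E[X]E[Y]
var_x = E(lambda x, y: x * x) - mx**2
var_y = E(lambda x, y: y * y) - my**2
rho = float(cov) / sqrt(float(var_x) * float(var_y))
print(cov, rho)  # cov is 1/16, so rho is positive
```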
