Applied Statistics Notes for [4 Parameter Estimation]
Sample Mean (as a R.V.)
Given \(n\) i.i.d. copies \(X_1, X_2, \cdots, X_n\) of a R.V. \(X\), the sample mean \(\bar{X}\) is defined as
\[\bar{X}=\frac{1}{n}\sum_{i=1}^{n}X_i \]
As a R.V., the sample mean \(\bar{X}\) is an estimation method (ESTIMATOR) obtained by averaging \(X_1, X_2, \cdots, X_n\) when \(n\) data points are observed.
Features of sampling:
- Independent: the \(X_i\) do not affect each other.
- Representative: every \(X_i\) has the same distribution as the population: \(X_i\sim F(x, \theta)\)
Some commonly used statistics

Sample Moment vs Population Moment
The \(k\)-th population (theoretical) moment is \(E[X^k]\); the corresponding sample moment is \(\bar{X^k}=\frac{1}{n}\sum_{i=1}^{n}X_i^k\).
Numerical Description of Sample Mean and Sample Variance
- \[E[\bar{X}]=\mu \]
  Proof: \(E[\bar{X}]=E\left[\frac{1}{n}\sum_{i=1}^{n}X_i\right]=\frac{1}{n}\sum_{i=1}^{n}E[X_i]=\frac{n\mu}{n}=\mu\)
- \[\text{VAR}[\bar{X}]=\frac{\sigma^2}{n} \]
  Proof: by independence, \(\text{VAR}[\bar{X}]=\frac{1}{n^2}\sum_{i=1}^{n}\text{VAR}[X_i]=\frac{n\sigma^2}{n^2}=\frac{\sigma^2}{n}\)
- \[E[S_{n-1}^2]=\sigma^2 \]
  Proof: via Lemma 1 below.
- Lemma 1: \(E(X^2)=\sigma^2+\mu^2\)
Proof:
Let \(X\) be a random variable with probability density function \(P(x)\), so that
\(\mu:=\int xP(x)\,\text{d} x\)
\(\sigma^2:=\int (x-\mu)^2P(x)\,\text{d} x\)
Expanding the definition of the variance:
\[\sigma^2=\int (x^2-2\mu x+\mu^2)P(x)\,\text{d} x=E(X^2)-2\mu^2+\mu^2=E(X^2)-\mu^2 \]
That is:
\[E(X^2)=\sigma^2+\mu^2 \]
- Definition 1: a variance function \(f(X)\) defined on a sample of a random variable is said to be good if and only if \(E[f(X)]=\sigma^2(X)\).
Returning to the original problem, we want to show that \(f(X)=\frac{1}{n-1}\sum_{i=1}^{n}(X_{i}-\bar X)^2\) is good.
Then:
\[E\left[\sum_{i=1}^{n}(X_{i}-\bar X)^2\right]=E\left[\sum_{i=1}^{n}X_i^2-n\bar X^2\right]=\sum_{i=1}^{n}E[X_i^2]-nE[\bar X^2] \]
Since the trials are independent random trials, Lemma 1 gives \(E[X_i^2]=\sigma^2+\mu^2\) and \(E[\bar X^2]=\text{VAR}[\bar X]+\mu^2=\frac{\sigma^2}{n}+\mu^2\), so:
\[E\left[\sum_{i=1}^{n}(X_{i}-\bar X)^2\right]=n(\sigma^2+\mu^2)-n\left(\frac{\sigma^2}{n}+\mu^2\right)=(n-1)\sigma^2 \]
Hence \(E[S_{n-1}^2]=\sigma^2\).
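The identities above can be checked numerically. A minimal Monte Carlo sketch (standard library only; the normal population, its parameters, and all variable names are illustrative assumptions, not from the notes):

```python
import random
import statistics

# Draw many samples of size n from N(mu, sigma^2) and check that the
# sample means average to mu and have variance close to sigma^2 / n.
random.seed(0)
mu, sigma, n, reps = 5.0, 2.0, 10, 20000

means = []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(sum(sample) / n)

print(statistics.mean(means))      # close to mu = 5.0
print(statistics.variance(means))  # close to sigma^2 / n = 0.4
```

With 20000 replications the Monte Carlo error is a few thousandths, so both printed values land very near their theoretical targets.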
Unbiasedness
\(\bar{X}\) is an unbiased estimator of \(\mu\).
\(S_{n-1}^2\) is an unbiased estimator of \(\sigma^2\).
Point Estimation
Let the pdf (or pmf) of a R.V. \(X\) be \(f(x; \vec{\theta})\) with an unknown parameter vector \(\vec{\theta}=(\theta_1, \theta_2, \cdots, \theta_m)^{T}, \vec{\theta}\in\Omega\subseteq \mathbb R^{m}\), where \(\Omega\) is the corresponding parameter space, and \(m\geq 1\) represents the number of unknown parameters to be estimated.
Point Estimator vs Point Estimate
- We take a random sample \(X_1, X_2, \cdots, X_n\) from a population with the pdf/pmf \(f(x; \vec{\theta})\), where \(n\) is the sample size.
- If a statistic \(Y=T(X_1, X_2, \cdots, X_n)\) is used to estimate the parameter \(\vec{\theta}\), then the statistic is called a point estimator of \(\vec{\theta}\), where \(Y\) is a random variable.
- If the observations of \(X_1, X_2, \cdots, X_n\) are \(x_1, x_2, \cdots, x_n\), then \(y=T(x_1, x_2, \cdots, x_n)\) is called a point estimate of \(\vec{\theta}\), where \(y\) is a real number.
- Examples:
  - \(\bar{X}=\frac{1}{n}\sum_{i=1}^nX_i\) is a point estimator of \(\mu=E[X]\), and \(\bar{x}=\frac{1}{n}\sum_{i=1}^nx_i\) is a point estimate of \(\mu\).
  - \(S_{n-1}^2=\frac{1}{n-1}\sum_{i=1}^n(X_i-\bar{X})^2\) is a point estimator of \(\sigma^2=\text{VAR}[X]\), and \(s_{n-1}^2=\frac{1}{n-1}\sum_{i=1}^n(x_i-\bar{x})^2\) is a point estimate of \(\sigma^2\).
Properties
- Point estimators are NOT unique (different methods or different moments may be used).
- Unbiased/Biased
  - Definition: If the expectation of the estimator \(\hat{\theta}\) exists, and for any \(\theta\in \Omega\), \(E[\hat{\theta}]=\theta\), then \(\hat{\theta}\) is called an unbiased estimator of \(\theta\). Otherwise, it is called a biased estimator.
  - \(b_n(\hat{\theta})\triangleq E[\hat{\theta}]-\theta\) is called the bias of the estimator \(\hat{\theta}\).
  - If \(b_n(\hat{\theta})\neq0\), then \(\hat{\theta}\) is a biased estimator of \(\theta\).
  - If \(b_n(\hat{\theta})\neq0\) and \(\lim_{n\rightarrow \infty} b_n(\hat{\theta})=0\), then \(\hat{\theta}\) is an asymptotically unbiased estimator of \(\theta\).
- Efficient
  - Definition: Let \(\hat{\theta}_1\) and \(\hat{\theta}_2\) be two unbiased estimators of \(\theta\). If for any \(\theta\in\Omega\) we have \(\text{Var}[\hat{\theta}_1]<\text{Var}[\hat{\theta}_2]\), then \(\hat{\theta}_1\) is said to be more efficient than \(\hat{\theta}_2\).
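A simple illustration of efficiency (a hypothetical comparison, not from the notes): both the sample mean \(\bar{X}\) and the single observation \(X_1\) are unbiased for \(\mu\), but \(\bar{X}\) has variance \(\sigma^2/n\) while \(X_1\) has variance \(\sigma^2\), so \(\bar{X}\) is more efficient.

```python
import random
import statistics

# Compare the sampling variability of two unbiased estimators of mu:
# the sample mean Xbar and the single observation X1.
random.seed(1)
mu, sigma, n, reps = 0.0, 1.0, 25, 20000

xbar_vals, x1_vals = [], []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar_vals.append(sum(sample) / n)   # unbiased, variance sigma^2 / n
    x1_vals.append(sample[0])           # unbiased, variance sigma^2

# Xbar should show much smaller spread than X1 (0.04 vs 1 here).
print(statistics.variance(xbar_vals) < statistics.variance(x1_vals))  # True
```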
Method of Moments Estimation
Definition
Suppose that there are \(m\) unknown parameters \(\theta_1, \theta_2, \cdots, \theta_m\), where the unknown parameters are functions of \(m\) or more theoretical moments, i.e.,
\[\theta_j=g_j(E[X], E[X^2], \cdots, E[X^m]),\quad j=1, 2, \cdots, m, \]
where \(E[X^k]\) is the theoretical/population moment. Then the method of moments estimator (MME), denoted by \((\hat{\theta}_1, \hat{\theta}_2, \cdots, \hat{\theta}_m)\), of \((\theta_1, \theta_2, \cdots, \theta_m)\) is
\[\hat{\theta}_j=g_j(\bar{X}, \bar{X^2}, \cdots, \bar{X^m}),\quad j=1, 2, \cdots, m, \]
where \(\bar{X^k}=\frac{1}{n}\sum_{i=1}^{n}X_{i}^k\) is the sample moment.
Procedure
- Calculate low order moments, finding expressions for the population moments in terms of the parameters. Typically, the number of low order moments needed will be the same as the number of parameters.
- Invert the expressions found in the preceding step, finding new expressions for the parameters in terms of the moments.
- Insert the sample moments into the expressions obtained in the second step, thus obtaining estimators of the parameters in terms of the sample moments.
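The three steps above can be sketched for \(X\sim\text{Uniform}(0, \theta)\) (a hypothetical worked example; the parameter value and names are illustrative): step 1 gives \(E[X]=\theta/2\), step 2 inverts to \(\theta=2E[X]\), and step 3 plugs in the sample moment, \(\hat{\theta}=2\bar{X}\).

```python
import random

# Method of moments for X ~ Uniform(0, theta):
#   step 1: E[X] = theta / 2
#   step 2: theta = 2 * E[X]
#   step 3: theta_hat = 2 * Xbar
random.seed(2)
theta = 6.0
sample = [random.uniform(0, theta) for _ in range(5000)]

xbar = sum(sample) / len(sample)   # first sample moment
theta_hat = 2 * xbar               # MME of theta
print(theta_hat)                   # lands near the true theta = 6.0
```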

Properties
- MME defined above are NOT unique, because a parameter can be written as different functions of the moments, e.g., the parameter of a Poisson distribution, which equals both the mean and the variance:
  - \(X\sim \text{Poisson}(\lambda), E[X]=\text{Var}[X]=\lambda\)
- MME obtained are often biased, e.g., the MME of the variance
  - \(S_{n}^2=\frac{1}{n}\sum_{i=1}^n(X_i-\bar{X})^2\), for which \(E[S_n^2]=\frac{n-1}{n}\sigma^2\neq\sigma^2\).
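The bias of \(S_n^2\) can be seen by simulation. A sketch (distribution, parameters, and names are illustrative assumptions): averaging both estimators over many samples, \(S_n^2\) settles near \(\frac{n-1}{n}\sigma^2\) while \(S_{n-1}^2\) settles near \(\sigma^2\).

```python
import random

# Compare the biased S_n^2 with the unbiased S_{n-1}^2 over many samples.
random.seed(3)
mu, sigma, n, reps = 0.0, 1.0, 5, 40000

sn2_sum, sn1_sum = 0.0, 0.0
for _ in range(reps):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)
    sn2_sum += ss / n          # biased estimator S_n^2
    sn1_sum += ss / (n - 1)    # unbiased estimator S_{n-1}^2

print(sn2_sum / reps)  # near (n-1)/n * sigma^2 = 0.8
print(sn1_sum / reps)  # near sigma^2 = 1.0
```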
Interval Estimation
Definition
Assume that the population \(X\sim f(x; \theta)\) (\(\theta\in \Omega\)). For any \(\alpha\in(0, 1)\), if there exist two statistics
\[T_1=T_1(X_1, X_2, \cdots, X_n), \quad T_2=T_2(X_1, X_2, \cdots, X_n) \]
s.t. \(\forall \theta\in \Omega\)
\[P[T_1\leq \theta\leq T_2]\geq 1-\alpha \]
Then the random interval \([T_1, T_2]\) is the interval desired.
\(T_1\) and \(T_2\) are called the lower and upper confidence limits/bounds, respectively.
A value \([t_1, t_2]\) of the random interval \([T_1, T_2]\) is also called a \(100(1-\alpha)\%\) confidence interval for \(\theta\).
The quantity \(1-\alpha\) is called the confidence level associated with the confidence interval.
\(\chi^2\) (chi-squared)-Distribution
Let \(X_1, X_2, \cdots, X_n\) be a random sample (i.e., i.i.d.) from the population \(X\sim\mathscr{N}(0, 1)\), and \(Y=X_1^2+X_2^2+\cdots+X_n^2\). Then \(Y\) follows the \(\chi^2\)-distribution with \(n\) degrees of freedom, denoted as \(Y\sim \chi^2(n)\).
- Degrees of freedom: the number of variables that can change independently.
p.d.f.:
\[f_{Y}(y)=\frac{1}{2^{n/2}\Gamma(\frac{n}{2})}y^{\frac{n}{2}-1}e^{-\frac{y}{2}},\quad y>0 \]
Properties
- Additivity
If \(V\sim \chi^2(n_1), W\sim \chi^2(n_2)\), \(V\) and \(W\) are independent, then \(V+W\sim \chi^2(n_1+n_2)\).
- Expectation and Variance
If \(Y\sim \chi^2(n)\), then \(E[Y]=n\), \(\text{Var}[Y]=2n\)
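The construction and the moment formulas can be checked together in a short simulation (the degrees of freedom and sample counts below are illustrative choices):

```python
import random
import statistics

# Build chi-square(n) draws as sums of n squared N(0,1) variables and
# check E[Y] = n and Var[Y] = 2n.
random.seed(4)
n, reps = 8, 40000

ys = []
for _ in range(reps):
    ys.append(sum(random.gauss(0.0, 1.0) ** 2 for _ in range(n)))

print(statistics.mean(ys))      # near n = 8
print(statistics.variance(ys))  # near 2n = 16
```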
\(t\)-distribution
If \(Z\sim \mathscr{N}(0, 1), Y\sim\chi^2(n)\), \(Z\) and \(Y\) are independent. Let \(W=\frac{Z}{\sqrt{\frac{Y}{n}}}\), then \(W\) follows the \(t\)-distribution with \(n\) degrees of freedom, denoted as \(W\sim t(n)\).
p.d.f.
\[f_{W}(w)=\frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\,\Gamma\left(\frac{n}{2}\right)}\left(1+\frac{w^2}{n}\right)^{-\frac{n+1}{2}},\quad w\in\mathbb{R} \]
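The construction \(W=Z/\sqrt{Y/n}\) can be simulated directly; a sketch (all parameter choices illustrative) checks the standard facts \(E[W]=0\) and \(\text{Var}[W]=\frac{n}{n-2}\) for \(n>2\):

```python
import random
import statistics

# Build t(n) draws from independent Z ~ N(0,1) and Y ~ chi^2(n).
random.seed(5)
n, reps = 10, 100000

ws = []
for _ in range(reps):
    z = random.gauss(0.0, 1.0)
    y = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(n))
    ws.append(z / (y / n) ** 0.5)

print(statistics.mean(ws))      # near 0
print(statistics.variance(ws))  # near n / (n - 2) = 1.25
```

The variance exceeding 1 reflects the heavier tails of the \(t\)-distribution relative to \(\mathscr{N}(0,1)\).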
3 Theorems in Sampling Distribution
Theorem 1
Let \(X_1, X_2, \cdots, X_n\) be a random sample from the population \(X\sim \mathscr{N}(\mu_X, \sigma_X^2)\), then
\[\bar{X}\sim \mathscr{N}\left(\mu_X, \frac{\sigma_X^2}{n}\right) \]
i.e.
\[\frac{\bar{X}-\mu_X}{\frac{\sigma_{X}}{\sqrt{n}}}\sim \mathscr{N}(0, 1) \]
Proof
By the property of the normal distribution, a linear combination of independent normal R.V.s still follows a normal distribution; combining this with \(E[\bar{X}]=\mu_X\) and \(\text{VAR}[\bar{X}]=\frac{\sigma_X^2}{n}\) gives the result.
Theorem 2
Let \(X_1, X_2, \cdots, X_n\) be a random sample from the population \(X\sim \mathscr{N}(\mu_X, \sigma_X^2)\), then
\[\frac{(n-1)S_{n-1}^2}{\sigma_{X}^2}\sim \chi^2(n-1) \]
Proof
Write \(X_i-\mu_X=(X_i-\bar{X})+(\bar{X}-\mu_X)\); the cross terms vanish because \(\sum_{i=1}^{n}(X_i-\bar{X})=0\). Thus
\[\sum_{i=1}^{n}\left(\frac{X_i-\mu_{X}}{\sigma_{X}}\right)^2=\frac{(n-1)S_{n-1}^2}{\sigma_{X}^2}+\left(\frac{\bar{X}-\mu_X}{\frac{\sigma_{X}}{\sqrt{n}}}\right)^2 \]
Since \(\frac{\bar{X}-\mu_X}{\frac{\sigma_{X}}{\sqrt{n}}}\sim\mathscr{N}(0, 1)\), \(\left(\frac{\bar{X}-\mu_X}{\frac{\sigma_{X}}{\sqrt{n}}}\right)^2 \sim \chi^2(1)\).
Since \(\frac{X_i-\mu_{X}}{\sigma_{X}}\sim \mathscr{N}(0, 1)\), \(\sum_{i=1}^{n}\left(\frac{X_i-\mu_{X}}{\sigma_{X}}\right)^2\sim \chi^2(n)\).
Thus \(\frac{(n-1)S_{n-1}^2}{\sigma_{X}^2}\sim \chi^2(n-1)\).
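Theorem 2 can be sanity-checked by simulation; a sketch under assumed parameters (all names and values illustrative), verifying that \(\frac{(n-1)S_{n-1}^2}{\sigma_X^2}\) has the \(\chi^2(n-1)\) mean \(n-1\) and variance \(2(n-1)\):

```python
import random
import statistics

# Simulate (n-1) * S^2 / sigma^2 for normal samples and compare its
# mean and variance with those of chi^2(n-1).
random.seed(6)
mu, sigma, n, reps = 3.0, 2.0, 6, 40000

stats_vals = []
for _ in range(reps):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)   # S_{n-1}^2
    stats_vals.append((n - 1) * s2 / sigma ** 2)

print(statistics.mean(stats_vals))      # near n - 1 = 5
print(statistics.variance(stats_vals))  # near 2(n-1) = 10
```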
Remark
The last step of the proof tacitly uses the fact that, for normal samples, \(\bar{X}\) and \(S_{n-1}^2\) are independent, which is what allows the degrees of freedom to be subtracted.
Theorem 3
Let \(X_1, X_2, \cdots, X_n\) be a random sample from the population \(X\sim \mathscr{N}(\mu_X, \sigma_X^2)\), then
\[\frac{\bar{X}-\mu_X}{\frac{S_{n-1}}{\sqrt{n}}}\sim t(n-1) \]
Proof
By Theorem 1, \(Z=\frac{\bar{X}-\mu_X}{\frac{\sigma_{X}}{\sqrt{n}}}\sim\mathscr{N}(0, 1)\); by Theorem 2, \(Y=\frac{(n-1)S_{n-1}^2}{\sigma_{X}^2}\sim\chi^2(n-1)\), and \(Z\) and \(Y\) are independent. By the definition of the \(t\)-distribution,
\[\frac{Z}{\sqrt{\frac{Y}{n-1}}}=\frac{\bar{X}-\mu_X}{\frac{S_{n-1}}{\sqrt{n}}}\sim t(n-1) \]
Some Notations
\(z\) notation
For \(Z\sim\mathscr{N}(0, 1)\), \(z_a\) denotes its upper \(a\)-th quantile s.t.
\[P[Z>z_a]=a \]
for \(a\in[0, 1]\). In particular,
\[P[-z_{\frac{\alpha}{2}}\leq Z\leq z_{\frac{\alpha}{2}}]=1-\alpha \]
\(t\) notation
For \(T_{n-1}\sim t(n-1)\), \(t_{n-1, a}\) denotes the upper \(a\)-th quantile of \(t(n-1)\) s.t.
\[P[T_{n-1}>t_{n-1, a}]=a \]
for \(a\in[0, 1]\)
\(\chi^2\) notation
For \(U_{n-1}\sim \chi^2(n-1)\), \(\chi^2_{n-1, a}\) denotes the upper \(a\)-th quantile of \(\chi^2(n-1)\) s.t.
\[P[U_{n-1}>\chi^2_{n-1, a}]=a \]
for \(a\in[0, 1]\)
Examples
Random Interval for \(\mu_{X}\) with Known \(\sigma^2_{X}\)
By Theorem 1, if the population \(X\sim\mathscr{N}(\mu_{X}, \sigma^2_{X})\), then \(\bar{X}\sim\mathscr{N}(\mu_{X}, \frac{\sigma^2_{X}}{n})\).
Thus
\[\left[\bar{X}-z_{\frac{\alpha}{2}}\frac{\sigma_{X}}{\sqrt{n}},\ \bar{X}+z_{\frac{\alpha}{2}}\frac{\sigma_{X}}{\sqrt{n}}\right] \]
is the random interval for \(\mu_{X}\) with probability \(1-\alpha\).
Confidence Interval for \(\mu_{X}\) with Known \(\sigma^2_{X}\)
The difference between a confidence interval and a random interval is that the former is obtained by substituting concrete observed values into the latter:
\[\left[\bar{x}-z_{\frac{\alpha}{2}}\frac{\sigma_{X}}{\sqrt{n}},\ \bar{x}+z_{\frac{\alpha}{2}}\frac{\sigma_{X}}{\sqrt{n}}\right] \]
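A numerical sketch of the known-\(\sigma\) interval (the data are made up for illustration; \(z_{0.025}=1.96\) is the usual table value):

```python
import math

# 95% confidence interval for mu with known sigma:
#   xbar +/- z_{alpha/2} * sigma / sqrt(n)
data = [4.8, 5.1, 5.3, 4.9, 5.2, 5.0, 4.7, 5.4]   # made-up observations
sigma = 0.3          # assumed known population standard deviation
z_alpha_2 = 1.96     # upper 2.5% quantile of N(0, 1)

n = len(data)
xbar = sum(data) / n                         # point estimate of mu
half_width = z_alpha_2 * sigma / math.sqrt(n)
lo, hi = xbar - half_width, xbar + half_width
print(f"95% CI for mu: [{lo:.3f}, {hi:.3f}]")  # [4.842, 5.258]
```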
Random/Confidence Interval for \(\mu_{X}\) with Unknown \(\sigma^2_{X}\)
By Theorem 3, \(\frac{\bar{X}-\mu_{X}}{\frac{S_{n-1}}{\sqrt{n}}}\sim t(n-1)\).
Thus
\[\left[\bar{X}-t_{n-1, \frac{\alpha}{2}}\frac{S_{n-1}}{\sqrt{n}},\ \bar{X}+t_{n-1, \frac{\alpha}{2}}\frac{S_{n-1}}{\sqrt{n}}\right] \]
is the random interval for \(\mu_{X}\) with probability \(1-\alpha\).
C.I.:
\[\left[\bar{x}-t_{n-1, \frac{\alpha}{2}}\frac{s_{n-1}}{\sqrt{n}},\ \bar{x}+t_{n-1, \frac{\alpha}{2}}\frac{s_{n-1}}{\sqrt{n}}\right] \]
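The same computation with unknown \(\sigma\), using \(S_{n-1}\) and a \(t\) quantile (made-up data; \(t_{9, 0.025}\approx 2.262\) is the usual table value):

```python
import math
import statistics

# 95% confidence interval for mu with unknown sigma:
#   xbar +/- t_{n-1, alpha/2} * S_{n-1} / sqrt(n)
data = [10.2, 9.8, 10.5, 10.1, 9.9, 10.3, 10.0, 9.7, 10.4, 10.1]
t_9_025 = 2.262      # upper 2.5% quantile of t(9), from tables

n = len(data)
xbar = statistics.mean(data)
s = statistics.stdev(data)                 # S_{n-1}
half_width = t_9_025 * s / math.sqrt(n)
lo, hi = xbar - half_width, xbar + half_width
print(f"95% CI for mu: [{lo:.3f}, {hi:.3f}]")  # [9.915, 10.285]
```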
Random/Confidence Interval for \(\sigma^2_{X}\) with Unknown \(\mu_{X}\)
By Theorem 2, \(\frac{(n-1)S_{n-1}^2}{\sigma_{X}^2}\sim \chi^2(n-1)\)
Thus
\[\left[\frac{(n-1)S_{n-1}^2}{\chi^2_{n-1, \frac{\alpha}{2}}},\ \frac{(n-1)S_{n-1}^2}{\chi^2_{n-1, 1-\frac{\alpha}{2}}}\right] \]
is the random interval for \(\sigma^2_{X}\) with probability \(1-\alpha\).
C.I.:
\[\left[\frac{(n-1)s_{n-1}^2}{\chi^2_{n-1, \frac{\alpha}{2}}},\ \frac{(n-1)s_{n-1}^2}{\chi^2_{n-1, 1-\frac{\alpha}{2}}}\right] \]
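A numerical sketch of the variance interval (same made-up data as above; \(\chi^2_{9, 0.025}\approx 19.023\) and \(\chi^2_{9, 0.975}\approx 2.700\) are the usual table values for 9 degrees of freedom):

```python
import statistics

# 95% confidence interval for sigma^2 with unknown mu:
#   [ (n-1) S^2 / chi2_{n-1, alpha/2},  (n-1) S^2 / chi2_{n-1, 1-alpha/2} ]
data = [10.2, 9.8, 10.5, 10.1, 9.9, 10.3, 10.0, 9.7, 10.4, 10.1]
chi2_hi, chi2_lo = 19.023, 2.700   # chi^2(9) quantiles, from tables

n = len(data)
s2 = statistics.variance(data)     # S_{n-1}^2
lo = (n - 1) * s2 / chi2_hi        # dividing by the LARGER quantile gives the lower limit
hi = (n - 1) * s2 / chi2_lo
print(f"95% CI for sigma^2: [{lo:.4f}, {hi:.4f}]")
```

Note that the interval is not symmetric about \(s_{n-1}^2\), because the \(\chi^2\) distribution is skewed.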
