1. Least squares in simple linear regression
\[\begin{align*}
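& \text{Least squares is used to estimate the parameters } \beta_0,\beta_1.\\
& \text{Sum of squared deviations: } Q(\beta_0,\beta_1)=\sum_{i=1}^n[y_i-E(y_i)]^2=\sum_{i=1}^n(y_i-\beta_0-\beta_1x_i)^2\\
& \text{Least squares: find the } \beta_0,\beta_1 \text{ that minimize this sum of squared deviations, i.e.}\\
& Q(\hat{\beta_0},\hat{\beta_1})=\sum_{i=1}^n(y_i-\hat{\beta_0}-\hat{\beta_1}x_i)^2=
\min\limits_{\beta_0,\beta_1}\sum_{i=1}^n(y_i-\beta_0-\beta_1x_i)^2\\
& \hat{y_i}=\hat{\beta_0}+\hat{\beta_1}x_i \text{ is the fitted value of } y_i\ (i=1,2,\dots,n).\\
& e_i=y_i-\hat{y_i} \text{ is the residual of } y_i\ (i=1,2,\dots,n).\\
& \sum_{i=1}^n e_i^2=\sum_{i=1}^n(y_i-\hat{\beta_0}-\hat{\beta_1}x_i)^2 \text{ is the residual sum of squares.}\\
& \text{Finding } \hat{\beta_0},\hat{\beta_1} \text{ is an extremum problem: } Q \text{ is a non-negative quadratic function of } \beta_0,\beta_1,
\text{ so a minimum always exists.}\\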
& \begin{cases}
\frac{\partial Q}{\partial \beta_0}|_{\beta_0=\hat{\beta_0}}=-2\sum\limits_{i=1}^n(y_i-\hat{\beta_0}-\hat{\beta_1}x_i)=0 \\[2ex]
\frac{\partial Q}{\partial \beta_1}|_{\beta_1=\hat{\beta_1}}=-2\sum\limits_{i=1}^n(y_i-\hat{\beta_0}-\hat{\beta_1}x_i)x_i=0
\end{cases} \quad (2.18)\\[2ex]
& \begin{cases}
n\hat{\beta_0}+(\sum\limits_{i=1}^nx_i)\hat{\beta_1}=\sum\limits_{i=1}^ny_i \\[2ex]
(\sum\limits_{i=1}^nx_i)\hat{\beta_0}+(\sum\limits_{i=1}^nx_i^2)\hat{\beta_1}=\sum\limits_{i=1}^nx_iy_i
\end{cases} \quad (2.19)\\
& \text{Solving these equations gives} \\
& \begin{cases}
\hat{\beta_0}=\overline{y}-\hat{\beta_1}\overline{x}\\[2ex]
\hat{\beta_1}=\frac{\sum\limits_{i=1}^n(x_i-\overline{x})(y_i-\overline{y})}
{\sum\limits_{i=1}^n(x_i-\overline{x})^2}
\end{cases} \quad (2.20)
\end{align*}
\]
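The closed form (2.20) can be checked numerically against the normal equations (2.19). The following is a minimal sketch in plain Python; the data values and the helper name `ols_fit` are hypothetical and chosen only for illustration.

```python
# Toy data (hypothetical, for illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def ols_fit(x, y):
    """Return (beta0_hat, beta1_hat) computed from the closed form (2.20)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta1 = s_xy / s_xx
    beta0 = y_bar - beta1 * x_bar
    return beta0, beta1


beta0_hat, beta1_hat = ols_fit(x, y)

# Both sides of the normal equations (2.19), evaluated at the estimates.
n = len(x)
lhs1 = n * beta0_hat + sum(x) * beta1_hat
rhs1 = sum(y)
lhs2 = sum(x) * beta0_hat + sum(xi * xi for xi in x) * beta1_hat
rhs2 = sum(xi * yi for xi, yi in zip(x, y))

print(beta0_hat, beta1_hat)
print(abs(lhs1 - rhs1) < 1e-9, abs(lhs2 - rhs2) < 1e-9)
```

For this data the estimates are approximately 2.2 and 0.6, and both normal equations hold up to floating-point rounding.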
\[\begin{align*}
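& \text{Substituting } \hat{\beta}_{0}=\bar{y}-\hat{\beta}_{1}\bar{x} \text{ into the second equation of (2.19) gives}\\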
& \left(\sum_{i=1}^{n} x_{i}\right)\left(\bar{y}-\hat{\beta}_{1} \bar{x}\right)+\left(\sum_{i=1}^{n} x_{i}^{2}\right) \hat{\beta}_{1}=\sum_{i=1}^{n} x_{i} y_{i} \\
& \hat{\beta}_{1}=\frac{\sum\limits_{i=1}^n x_i {y}_{i}-\left(\sum\limits_{i=1}^{n} x_{i}\right) \bar{y}}{\left(\sum\limits_{i=1}^{n} x_{i}^{2}\right)-\left(\sum\limits_{i=1}^{n} x_{i}\right) \bar{x}}=\frac{\sum\limits_{i=1}^{n} x_{i} y_{i}-\sum\limits_{i=1}^{n} x_{i} \bar{y}-\sum\limits_{i=1}^{n} \bar{x} y_{i}+\sum\limits_{i=1}^{n} \bar{x} \bar{y}}{\sum\limits_{i=1}^{n} x_{i}^{2}-2 \sum\limits_{i=1}^{n} x_{i} \bar{x}+\sum\limits_{i=1}^{n} \bar{x}^{2}}\\
& =\frac{\sum\limits_{i=1}^{n}\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)}{\sum\limits_{i=1}^{n}\left(x_{i}-\bar{x}\right)^{2}}\\
& \text{Define } L_{xx}=\sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)^{2}=\sum_{i=1}^{n} x_{i}^{2}-n(\bar{x})^{2} \quad (2.21)\\
& L_{xy}=\sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)=\sum_{i=1}^{n} x_{i} y_{i}-n \bar{x} \bar{y} \quad (2.22)\\
& \text{With this notation, (2.20) can be written compactly as }
\begin{cases}
{\hat{\beta}_{0}=\bar{y}-\hat{\beta}_{1} \bar{x}} \\
{\hat{\beta}_{1}}=\frac{L_{xy}}{L_{xx}}
\end{cases} \quad(2.23)\\
& \begin{aligned}
\hat{\beta}_{1}=& \frac{\sum\limits_{i=1}^{n}\left(x_{i}-\bar{x}\right) y_{i}}{\sum\limits_{i=1}^{n}\left(x_{i}-\bar{x}\right)^{2}}
= \frac{\sum\limits_{i=1}^{n} x_{i} y_{i}-n \bar{x} \bar{y}}{\sum\limits_{i=1}^{n} x_{i}^{2}-n(\bar{x})^{2}}
\end{aligned}
\end{align*}
\]
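The identities (2.21) and (2.22), together with the form of the numerator that uses only \(y_i\), can be confirmed on a small example. This is a sketch with hypothetical toy data; all variable names are illustrative.

```python
# Toy data (hypothetical, for illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Centered (definition) forms of L_xx and L_xy.
l_xx_centered = sum((xi - x_bar) ** 2 for xi in x)
l_xy_centered = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Shortcut forms from the right-hand sides of (2.21) and (2.22).
l_xx_short = sum(xi * xi for xi in x) - n * x_bar ** 2
l_xy_short = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar

# Numerator with y_i left uncentered: sum((x_i - x_bar) * y_i).
l_xy_uncentered_y = sum((xi - x_bar) * yi for xi, yi in zip(x, y))

beta1_hat = l_xy_centered / l_xx_centered
print(l_xx_centered, l_xx_short)
print(l_xy_centered, l_xy_short, l_xy_uncentered_y)
print(beta1_hat)
```

All three expressions for the numerator agree because \(\sum_{i=1}^n(x_i-\bar{x})\bar{y}=0\). In floating-point arithmetic the centered forms are usually the safer choice, since the shortcut forms subtract two large, nearly equal numbers when the mean is large relative to the spread of the data.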
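The regression line \(\hat{y}=\hat{\beta_0}+\hat{\beta_1}x\) passes through the point \((\bar{x},\bar{y})\), i.e. through the centroid of the sample; this follows directly from \(\hat{\beta_0}=\bar{y}-\hat{\beta_1}\bar{x}\).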
\[\text{From (2.18), the residuals satisfy }\begin{cases}
\sum\limits_{i=1}^ne_i=0\\
\sum\limits_{i=1}^nx_ie_i=0
\end{cases}
\]
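Both residual identities, and the fact that the fitted line passes through \((\bar{x},\bar{y})\), can be observed numerically. The sketch below uses hypothetical toy data and recomputes the estimates from the closed-form solution so that it is self-contained.

```python
# Toy data (hypothetical, for illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares estimates from the closed-form solution.
beta1_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
beta0_hat = y_bar - beta1_hat * x_bar

# Residuals e_i = y_i - y_hat_i.
residuals = [yi - (beta0_hat + beta1_hat * xi) for xi, yi in zip(x, y)]

sum_e = sum(residuals)                                   # expected: ~0
sum_xe = sum(xi * ei for xi, ei in zip(x, residuals))    # expected: ~0
y_hat_at_x_bar = beta0_hat + beta1_hat * x_bar           # expected: ~y_bar

print(sum_e, sum_xe, y_hat_at_x_bar, y_bar)
```

The two sums are zero only up to floating-point rounding, so any comparison should use a small tolerance rather than exact equality.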
2. Properties of the least squares estimators in simple linear regression
1. Linearity
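\[\begin{aligned}
& \hat{\beta_0},\hat{\beta_1} \text{ are linear functions of the random variables } y_i.\\
& \hat{\beta_1}=\frac{\sum\limits_{i=1}^n(x_i-\bar{x})y_i}{\sum\limits_{i=1}^n(x_i-\bar{x})^2}=\sum\limits_{i=1}^n\frac{x_i-\bar{x}}{\sum\limits_{j=1}^n(x_j-\bar{x})^2}y_i \quad (2.37)\\
& \text{The weights } \frac{x_i-\bar{x}}{\sum\limits_{j=1}^n(x_j-\bar{x})^2} \text{ are constants, because the } x_i \text{ are non-random.}\\
& \hat{\beta_0}=\bar{y}-\hat{\beta_1}\bar{x}=\sum\limits_{i=1}^n\left[\frac{1}{n}-\frac{(x_i-\bar{x})\bar{x}}{\sum\limits_{j=1}^n(x_j-\bar{x})^2}\right]y_i \text{ is therefore also linear in the } y_i.
\end{aligned}
\]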
2. Unbiasedness
Since \(x_i\) is non-random and \(y_i=\beta_0+\beta_1x_i+\epsilon_i\) with \(E(\epsilon_i)=0\), we have \(E(y_i)=\beta_0+\beta_1x_i\quad (2.38)\)
Then, by (2.37), \(E(\hat{\beta_1})=\sum\limits_{i=1}^n\frac{x_i-\bar{x}}{\sum\limits_{j=1}^n(x_j-\bar{x})^2}E(y_i)=\sum\limits_{i=1}^n\frac{x_i-\bar{x}}{\sum\limits_{j=1}^n(x_j-\bar{x})^2}(\beta_0+\beta_1x_i)=\beta_1\)
The last equality uses \(\sum\limits_{i=1}^n(x_i-\bar{x})\beta_0=0\) and \(\sum\limits_{i=1}^n(x_i-\bar{x})x_i=\sum\limits_{i=1}^n(x_i-\bar{x})^2\).
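The two identities behind the unbiasedness argument say that the weights in (2.37) sum to 0 and that their weighted sum with \(x_i\) equals 1. The sketch below checks both and then runs a small simulation; the true parameter values, the noise level, the design points, and the seed are all assumptions made for illustration.

```python
import random

# Fixed (non-random) design points, chosen arbitrarily.
x = [float(i) for i in range(1, 11)]
n = len(x)
x_bar = sum(x) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)

# Weights k_i = (x_i - x_bar) / sum_j (x_j - x_bar)^2 from (2.37).
k = [(xi - x_bar) / s_xx for xi in x]

sum_k = sum(k)                                   # expected: ~0
sum_kx = sum(ki * xi for ki, xi in zip(k, x))    # expected: ~1

# Hypothetical true parameters and noise standard deviation.
beta0_true, beta1_true, sigma = 1.0, 0.5, 1.0

rng = random.Random(0)  # fixed seed so the run is reproducible
reps = 2000
estimates = []
for _ in range(reps):
    # Simulate y_i = beta0 + beta1 * x_i + eps_i with eps_i ~ N(0, sigma^2).
    y = [beta0_true + beta1_true * xi + rng.gauss(0.0, sigma) for xi in x]
    # beta1_hat as a linear function of the y_i.
    estimates.append(sum(ki * yi for ki, yi in zip(k, y)))

mean_estimate = sum(estimates) / reps
print(sum_k, sum_kx, mean_estimate)
```

The average of the simulated estimates should lie close to the assumed true slope of 0.5. Individual estimates still scatter around that value, since unbiasedness is a statement about the expectation only.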
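3. Maximum likelihood estimation in simple linear regression
Maximum likelihood estimation: a method that estimates unknown parameters from the expression of the population's density (or probability distribution) together with the information provided by the sample.
Intuition: suppose the probability that event A occurs is either 0.01 or 0.1. If A occurs in a single trial, it is natural to take the probability of A to be 0.1.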
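The intuition above amounts to comparing the likelihood of the observed outcome under each candidate value. A minimal sketch follows; the function name `likelihood` is hypothetical.

```python
# Candidate values for P(A), as in the example above.
candidates = [0.01, 0.1]


def likelihood(p, occurred):
    """Probability of the observed outcome of one trial when P(A) = p."""
    return p if occurred else 1.0 - p


# A was observed in the single trial: pick the candidate that makes this most likely.
p_hat = max(candidates, key=lambda p: likelihood(p, True))
print(p_hat)  # 0.1
```

Had A not occurred, the same rule would pick 0.01, because 0.99 is larger than 0.9.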
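When the population X has a continuous distribution, let its family of densities be \(\{f(x;\theta),\theta \in \varTheta\}\) and let \(x_1,x_2,...,x_n\) be an independent, identically distributed sample from X. The likelihood function is
\(L(\theta;x_1,x_2,...,x_n)=\prod\limits_{i=1}^n f(x_i;\theta)\quad (2.29)\)
Maximum likelihood estimation chooses, among all \(\theta\), the value \(\hat{\theta}\) that maximizes the probability that the random sample \((X_1,X_2,...,X_n)\) falls near the observed point \((x_1,...,x_n)\), and takes it as the estimate of the true value of \(\theta\); that is, \(\hat{\theta}\) satisfies
\(L(\hat{\theta};x_1,x_2,...,x_n)=\max\limits_{\theta}L(\theta;x_1,x_2,...,x_n) \quad (2.30)\)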
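Continuous random variables: the likelihood function is the joint density function of the sample.
Discrete random variables: the likelihood function is the joint probability function of the sample.
Maximum likelihood estimation can be used whenever the form of the joint density is known.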
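In simple linear regression, assume \(\epsilon_i \sim N(0,\sigma^2)\); then \(y_i \sim N(\beta_0+\beta_1x_i,\sigma^2)\), so the density of \(y_i\) is \(f_i(y_i)=\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{1}{2\sigma^2}[y_i-(\beta_0+\beta_1x_i)]^2},\ i=1,2,...,n\)
The likelihood function of \(y_1,y_2,...,y_n\) is the product of the individual densities (this requires the observations to be independent):
\(L(\beta_0,\beta_1,\sigma^2)=\prod\limits_{i=1}^nf_i(y_i)=(2\pi\sigma^2)^{-\frac{n}{2}}e^{-\frac{1}{2\sigma^2}\sum\limits_{i=1}^n[y_i-(\beta_0+\beta_1x_i)]^2}\)
Maximizing \(L\) means choosing the parameters under which the observed sample is most probable. It is equivalent to maximizing \(\ln(L)\), so take the log-likelihood:
\(\ln(L)=-\frac{n}{2}\ln(2\pi\sigma^2)-\frac{1}{2\sigma^2}\sum\limits_{i=1}^n[y_i-(\beta_0+\beta_1x_i)]^2\). For any fixed \(\sigma^2\), maximizing \(\ln(L)\) over \(\beta_0,\beta_1\) is equivalent to minimizing \(\sum\limits_{i=1}^n[y_i-(\beta_0+\beta_1x_i)]^2\).
Hence the maximum likelihood estimates of \(\beta_0,\beta_1\) coincide with the least squares estimates. The equivalence rests on the assumption \(\epsilon_i \sim N(0,\sigma^2)\).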
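The equivalence can be illustrated numerically: with \(\sigma^2\) held fixed, the Gaussian log-likelihood evaluated at the least squares estimates should exceed its value at any perturbed pair \((\beta_0,\beta_1)\). The sketch below assumes hypothetical toy data, an arbitrary fixed value of \(\sigma^2\), and an arbitrary grid of perturbations.

```python
import math

# Toy data (hypothetical, for illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def log_likelihood(beta0, beta1, sigma2, x, y):
    """Gaussian log-likelihood ln(L) of the simple linear regression model."""
    n = len(x)
    rss = sum((yi - (beta0 + beta1 * xi)) ** 2 for xi, yi in zip(x, y))
    return -0.5 * n * math.log(2.0 * math.pi * sigma2) - rss / (2.0 * sigma2)


# Least squares estimates from the closed-form solution.
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
beta1_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
beta0_hat = y_bar - beta1_hat * x_bar

sigma2 = 1.0  # held fixed; the maximizing (beta0, beta1) does not depend on it
ll_ols = log_likelihood(beta0_hat, beta1_hat, sigma2, x, y)

# Log-likelihood at perturbed parameter values around the least squares fit.
perturbed = [
    log_likelihood(beta0_hat + d0, beta1_hat + d1, sigma2, x, y)
    for d0 in (-0.5, 0.0, 0.5)
    for d1 in (-0.2, 0.0, 0.2)
    if (d0, d1) != (0.0, 0.0)
]
is_max = all(ll_ols > ll for ll in perturbed)
print(ll_ols, is_max)
```

A finite grid is only a spot check and proves nothing by itself; the general statement follows from the form of \(\ln(L)\), in which \(\beta_0,\beta_1\) enter only through the sum of squared deviations.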