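A Practical Guide: Optimal Estimation Criteria and Methods (2): Linear Minimum Variance Estimation (LMMSE), Study Notes
Preface
Linear minimum variance estimation (Linear Minimum Variance Estimation, LMVE) is a special case of minimum variance estimation (MMSE). As the foundation of Kalman filtering, it holds a central place in optimal estimation theory. This article describes the estimation criterion and the method in detail.
Optimal Estimation Problem
Most practical engineering problems involve estimating multidimensional random vectors, set up as follows: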
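Let $X$ be an $n$-dimensional random vector, $Z$ an $m$-dimensional random vector that observes $X$, $\hat{X}$ the estimate of $X$ (a function of $Z$), and $\tilde{X}$ the estimation error of $\hat{X}$. The goal is to build an optimal estimate of $X$. Let $A$ be an $n \times m$ matrix and $B$ an $n$-dimensional deterministic vector, so that $\hat{X}$ is a linear function of $Z$; $A$ and $B$ are then chosen by optimizing $\hat{X}$ under the optimal estimation criterion.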
$$
\begin{align*}
\hat{X}&=AZ+B \tag{1}\\
\tilde{X}&=X-\hat{X}=X-(AZ+B) \tag{2}
\end{align*}
$$
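Linear Minimum Variance Estimation (Linear Minimum Mean Squared Error, LMMSE)
The linear minimum variance criterion is the same as the minimum variance criterion: the estimate is chosen to minimize the ensemble average of the squared estimation error [1]. It is commonly abbreviated as LMMSE [2].
The cost function is: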
$$
J =E[\tilde{X}^{T}\tilde{X}]=\mathrm{Tr}(E[\tilde{X}\tilde{X}^{T}])=E[(X-(AZ+B))^{T}(X-(AZ+B))]=\min \tag{3}
$$
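The two forms of the cost in (3), the expected squared norm and the trace of the error second-moment matrix, can be checked numerically. The following is a minimal sketch that replaces expectations with sample averages; the function name `empirical_cost`, the synthetic data, and the arbitrary `A` and `B` are assumptions made for illustration only.

```python
import numpy as np

def empirical_cost(X, Z, A, B):
    """Sample-average version of J in (3) for the linear estimate A Z + B.

    X: (N, n) samples of the state, Z: (N, m) samples of the observation,
    A: (n, m) matrix, B: (n,) vector.
    """
    err = X - (Z @ A.T + B)                      # rows are samples of X - (A Z + B)
    j_inner = np.mean(np.sum(err**2, axis=1))    # E[err^T err]
    j_trace = np.trace(err.T @ err / len(err))   # Tr(E[err err^T])
    return j_inner, j_trace

# Hypothetical data: 2-D state, 3-D observation, arbitrary (non-optimal) A and B.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
Z = X @ rng.normal(size=(2, 3)) + 0.1 * rng.normal(size=(1000, 3))
A = rng.normal(size=(2, 3))
B = rng.normal(size=2)

j_inner, j_trace = empirical_cost(X, Z, A, B)
print(j_inner, j_trace)   # the two values agree up to rounding
```

The trace form is the one that connects the scalar cost to the error covariance matrix discussed later in the article.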
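Minimizing
Minimum variance estimation (MMSE) requires the conditional probability density function $P(X|Z)$, the joint probability density function $P(X,Z)$, and the conditional expectation $E[X|Z]$ of $X$ and $Z$ to be known during minimization. These demanding prior conditions severely limit its use in engineering [1]. Linear minimum variance estimation (LMMSE), by contrast, only requires the first and second moments of the random vectors $X$ and $Z$, namely $E[X]$, $E[Z]$, $Var(X)$, $Var(Z)$, $Cov(X,Z)$, and $Cov(Z,X)$, which makes engineering application considerably easier. The minimization is derived as follows: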
$$
\begin{align*}
J&=E[(X-(AZ+B))^{T}(X-(AZ+B))] \tag{4}
\end{align*}
$$
First, take the partial derivative of $J$ in (4) with respect to $B$ and set it to zero:
$$
\begin{align*}
\frac{\partial J}{\partial B}&=0\\
\frac{\partial J}{\partial (X-(AZ+B))}\frac{\partial (X-(AZ+B))}{\partial B}&=0\\
-2E[X-(AZ+B)] &= 0\\
E[X-AZ-B] &= 0\\
E[X]-AE[Z]-B &= 0\\
B &= E[X]-AE[Z] \tag{5}
\end{align*}
$$
Next, take the partial derivative of $J$ in (4) with respect to $A$, substitute $B$ from (5), and set it to zero:
$$
\begin{align*}
\frac{\partial J}{\partial A}&=0\\
\frac{\partial J}{\partial (X-(AZ+B))}\frac{\partial (X-(AZ+B))}{\partial A}&=0 \\
-2E[(X-(AZ+B))Z^{T}] &= 0 \tag{6} \\
E[(X-B)Z^{T}] -E[AZZ^{T}]&= 0\\
E[XZ^{T}]-BE[Z^{T}] -E[AZZ^{T}]&= 0\\
E[XZ^{T}]-(E[X]-AE[Z])E[Z^{T}] -E[AZZ^{T}]&= 0\\
E[XZ^{T}]-E[X]E[Z^{T}]+AE[Z]E[Z^{T}]-AE[ZZ^{T}]&= 0\\
(E[XZ^{T}]-E[X]E[Z^{T}])-A(E[ZZ^{T}]-E[Z]E[Z^{T}])&= 0\\
E[(X-E[X])(Z-E[Z])^{T}]-AE[(Z-E[Z])(Z-E[Z])^{T}]&= 0\\
Cov(X,Z)-AVar(Z) &= 0\\
A &= Cov(X,Z)Var(Z)^{-1} \tag{7}
\end{align*}
$$
Substitute $A$ into (5) to obtain $B$:
$$
\begin{align*}
B &= E[X]-AE[Z] \\
&= E[X]-Cov(X,Z)Var(Z)^{-1}E[Z] \tag{8}
\end{align*}
$$
Substitute $A$ and $B$ into (1) to obtain $\hat{X}$:
$$
\begin{align*}
\hat{X}&=AZ+B\\
&=Cov(X,Z)Var(Z)^{-1}Z+E[X]-Cov(X,Z)Var(Z)^{-1}E[Z]\\
&=E[X]+Cov(X,Z)Var(Z)^{-1}(Z-E[Z]) \tag{9}
\end{align*}
$$
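Equations (7), (8), and (9) translate directly into a few lines of NumPy once the moments are replaced by sample moments. The sketch below is one possible way to do that, not a reference implementation: the function names `lmmse_fit` and `lmmse_estimate`, and the linear-Gaussian test model $Z = HX + V$ with the particular `mu`, `P`, `H`, and `R` values, are assumptions chosen only so that the fitted $A$ can be compared with the value implied by the model.

```python
import numpy as np

def lmmse_fit(X, Z):
    """Return (A, B) of equations (7) and (8), using sample moments.

    X: (N, n) samples of the state, Z: (N, m) samples of the observation.
    """
    mx, mz = X.mean(axis=0), Z.mean(axis=0)
    Xc, Zc = X - mx, Z - mz
    cov_xz = Xc.T @ Zc / len(X)          # Cov(X, Z), shape (n, m)
    var_z = Zc.T @ Zc / len(X)           # Var(Z), shape (m, m)
    # A = Cov(X,Z) Var(Z)^{-1}, obtained by solving Var(Z) A^T = Cov(Z,X).
    A = np.linalg.solve(var_z, cov_xz.T).T
    B = mx - A @ mz                      # equation (5)
    return A, B

def lmmse_estimate(A, B, Z):
    """Apply equation (1) to each row of Z."""
    return Z @ A.T + B

# Hypothetical linear-Gaussian model: Z = H X + V.
rng = np.random.default_rng(0)
N = 50000
mu = np.array([1.0, -2.0])
P = np.array([[2.0, 0.5], [0.5, 1.0]])              # Var(X)
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
R = 0.25 * np.eye(3)                                # Var(V)
X = rng.multivariate_normal(mu, P, size=N)
Z = X @ H.T + 0.5 * rng.normal(size=(N, 3))         # noise std 0.5 matches R

A, B = lmmse_fit(X, Z)
X_hat = lmmse_estimate(A, B, Z)

# For this model Cov(X,Z) = P H^T and Var(Z) = H P H^T + R.
A_model = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
print(np.max(np.abs(A - A_model)))   # small, shrinking as N grows
```

Solving the linear system instead of forming $Var(Z)^{-1}$ explicitly is a numerical design choice; it assumes $Var(Z)$ is nonsingular, as the derivation of (7) already does.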
Unbiasedness
The mathematical expectation of the linear minimum variance estimate $\hat{X}$ is:
$$
\begin{align*}
E[\hat {X}] &= E[E[X]+Cov(X,Z)Var(Z)^{-1}(Z-E[Z])] \\
&= E[X]+Cov(X,Z)Var(Z)^{-1}E[Z-E[Z]] \\
&= E[X] \tag{10}
\end{align*}
$$
Clearly, the linear minimum variance estimate is unbiased:
$$
E[\tilde{X}]=E[X-\hat{X}]=0 \tag{11}
$$
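Unbiasedness in (10) and (11) does not rely on $X$ and $Z$ being Gaussian or linearly related. The following sketch illustrates this on a deliberately nonlinear, non-Gaussian example; the data model is hypothetical, and the moments are sample moments taken from the blocks of the joint covariance of the stacked vector $[X; Z]$.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 20000
# Hypothetical nonlinear relation between a scalar X and a 2-D observation Z.
X = rng.uniform(0.0, 2.0, size=(N, 1))
Z = np.hstack([X**2, np.sin(X)]) + 0.1 * rng.normal(size=(N, 2))

# Joint sample covariance of [X; Z], then its blocks.
joint = np.cov(np.hstack([X, Z]), rowvar=False)   # shape (3, 3)
cov_xz = joint[:1, 1:]                            # Cov(X, Z), shape (1, 2)
var_z = joint[1:, 1:]                             # Var(Z), shape (2, 2)

A = cov_xz @ np.linalg.inv(var_z)                 # equation (7)
X_hat = X.mean(axis=0) + (Z - Z.mean(axis=0)) @ A.T   # equation (9), row-wise

mean_of_estimate = X_hat.mean(axis=0)   # sample version of E[X_hat]
mean_of_error = (X - X_hat).mean(axis=0)  # sample version of E[X - X_hat]
print(mean_of_estimate, X.mean(axis=0), mean_of_error)
```

Because the same samples are used both to form the moments and to evaluate the estimate, the sample mean of the error is zero up to rounding; with moments taken from a separate source it would only be close to zero.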
Covariance Matrix
The covariance matrix of the estimation error $\tilde{X}$ of the linear minimum variance estimate is [1]:
$$
\begin{align*}
E[\tilde{X}\tilde{X}^{T}] &= E[(X-\hat{X})(X-\hat{X})^{T}] \\
&= E[(X-(E[X]+Cov(X,Z)Var(Z)^{-1}(Z-E[Z])))(X-(E[X]+Cov(X,Z)Var(Z)^{-1}(Z-E[Z])))^{T}] \\
&= E[((X-E[X])-Cov(X,Z)Var(Z)^{-1}(Z-E[Z]))((X-E[X])-Cov(X,Z)Var(Z)^{-1}(Z-E[Z]))^{T}] \\
&= C_{1}+C_{2}+C_{3} \tag{12}
\end{align*}
$$
where
$$
\begin{align*}
C_{1} &= E[(X-E[X])(X-E[X])^{T}] \\
&= Var(X) \tag{13}
\end{align*}
$$
$$
\begin{align*}
C_{2} &= - E[(X-E[X])(Z-E[Z])^{T}][Var(Z)^{-1}]^{T}Cov(X,Z)^{T} \\
&\quad - Cov(X,Z)Var(Z)^{-1}E[(Z-E[Z])(X-E[X])^{T}] \\
&= - Cov(X,Z)Var(Z)^{-1}Cov(Z,X) - Cov(X,Z)Var(Z)^{-1}Cov(Z,X) \\
&= - 2Cov(X,Z)Var(Z)^{-1}Cov(Z,X) \tag{14}
\end{align*}
$$
$$
\begin{align*}
C_{3} &= Cov(X,Z)Var(Z)^{-1}E[(Z-E[Z])(Z-E[Z])^{T}][Var(Z)^{-1}]^{T}Cov(X,Z)^{T} \\
&= Cov(X,Z)Var(Z)^{-1}E[(Z-E[Z])(Z-E[Z])^{T}]Var(Z)^{-1}Cov(Z,X) \\
&= Cov(X,Z)Var(Z)^{-1}Var(Z)Var(Z)^{-1}Cov(Z,X) \\
&= Cov(X,Z)Var(Z)^{-1}Cov(Z,X) \tag{15}
\end{align*}
$$
Substituting (13), (14), and (15) into (12) gives:
$$
\begin{align*}
E[\tilde{X}\tilde{X}^{T}] &=Var(X)- Cov(X,Z)Var(Z)^{-1}Cov(Z,X) \tag{16}
\end{align*}
$$
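Formula (16) can be compared against the error covariance measured directly from the estimation errors. The sketch below does this with sample moments on a hypothetical model (a 2-D state observed through a noisy 3-D linear measurement); the mixing matrices and noise level are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 5000
# Hypothetical model: correlated 2-D state X, noisy 3-D linear observation Z.
X = rng.normal(size=(N, 2)) @ np.array([[1.0, 0.3], [0.0, 0.8]])
Z = X @ np.array([[1.0, 0.0, 1.0], [0.0, 1.0, -1.0]]) + 0.5 * rng.normal(size=(N, 3))

Xc, Zc = X - X.mean(axis=0), Z - Z.mean(axis=0)
var_x = Xc.T @ Xc / N        # Var(X)
var_z = Zc.T @ Zc / N        # Var(Z)
cov_xz = Xc.T @ Zc / N       # Cov(X, Z)

A = cov_xz @ np.linalg.inv(var_z)    # equation (7)
err = Xc - Zc @ A.T                  # error of the estimate (9), row-wise

empirical = err.T @ err / N          # sample E[err err^T]
formula = var_x - cov_xz @ np.linalg.inv(var_z) @ cov_xz.T   # equation (16)
print(np.max(np.abs(empirical - formula)))   # agreement up to rounding
```

The subtracted term $Cov(X,Z)Var(Z)^{-1}Cov(Z,X)$ is positive semidefinite, so the error covariance in (16) is never larger than $Var(X)$: observing $Z$ cannot make the linear estimate worse than using $E[X]$ alone.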
Linear Transformation
From (2), (6), and (11):
$$
\begin{align*}
E[(X-(AZ+B))Z^{T}] &= E[\tilde{X}Z^{T}] = 0 \tag{17}\\
E[(X-(AZ+B))] &=E[\tilde{X}] = 0 \tag{18}
\end{align*}
$$
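By the orthogonal projection theorem [2], the estimate $\hat{X}$ can be viewed geometrically as the projection of $X$ onto the space of all linear functions $AZ+B$ of $Z$ (the plane spanned by $Z$ and the constant term $B$). Equation (17) says that the estimation error $\tilde{X}$ is orthogonal to $Z$, and equation (18) says that its mean is zero, that is, it is also orthogonal to the constant direction, which is why the linear minimum variance estimate is unbiased. A zero-mean error does not mean that the error itself vanishes: $\hat{X}$ coincides with $X$ only when $X$ lies exactly in that plane, and in general the size of the perpendicular is given by the error covariance (16). The linear transformation can be understood as follows: the observation $Z$ cannot be adjusted, but $A$ and $B$ can. $B$ shifts the plane so that the means match, and $A$ selects the point in the plane closest to $X$, which gives the optimal unbiased linear estimate.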
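The orthogonality conditions (17) and (18), and the fact that they mark the minimum of the cost, can be illustrated numerically. The sketch below uses a hypothetical scalar example with a cubic relation between $X$ and $Z$, chosen so that no linear function of $Z$ can reproduce $X$ exactly; the perturbation size 0.05 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 5000
# Hypothetical scalar example with a nonlinear relation, so the error cannot vanish.
X = rng.normal(size=(N, 1))
Z = X**3 + 0.2 * rng.normal(size=(N, 1))

mx, mz = X.mean(axis=0), Z.mean(axis=0)
# Equation (7) with sample moments (the 1/N factors cancel).
A = ((X - mx).T @ (Z - mz)) @ np.linalg.inv((Z - mz).T @ (Z - mz))
B = mx - A @ mz                                   # equation (5)

def cost(A_, B_):
    """Sample-average cost J for the linear estimate A_ Z + B_."""
    e = X - (Z @ A_.T + B_)
    return np.mean(np.sum(e**2, axis=1))

err = X - (Z @ A.T + B)
cross = err.T @ Z / N          # sample version of E[err Z^T], equation (17)
mean_err = err.mean(axis=0)    # sample version of E[err], equation (18)

j_opt = cost(A, B)
j_perturbed_A = cost(A + 0.05, B)   # move away from the projection along A
j_perturbed_B = cost(A, B + 0.05)   # move away from the projection along B
print(cross, mean_err, j_opt, j_perturbed_A, j_perturbed_B)
```

Both orthogonality conditions hold up to rounding, any perturbation of $A$ or $B$ raises the cost, and the optimal cost itself stays well above zero, which is the "perpendicular" left over after projection.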
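References
[1] Optimal Estimation Criteria and Methods (1): Minimum Variance Estimation (MMSE), Study Notes
https://blog.csdn.net/jimmychao1982/article/details/149478176
[2] Liu Sheng and Zhang Hongmei, 《最优估计理论》 (Optimal Estimation Theory), Higher Education Press, 2011.
[3] 3-2 正交定理 (Orthogonality Theorem), Yandld
https://www.bilibili.com/video/BV1wj411H7j7?t=2.7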