Fundamentals of Optimization (III)[1]

Differentiability and Expansions of Functions

Definition: Let \(f(x)\) be a real-valued function of \(n\) variables, where \(x=(x_1,\cdots,x_n)^T\in\mathbb{R}^n\). The vector

\[\nabla f(x) = \left ( \frac{\partial f(x)}{\partial x_1 } , \frac{\partial f(x)}{\partial x_2},\cdots, \frac{\partial f(x)}{\partial x_n}\right )^T \]

is called the first derivative, or gradient, of \(f(x)\) at \(x\). The matrix

\[\nabla^2 f(x) = \left( \begin{array}{cccc} \frac{\partial ^2f(x)}{\partial x_1^2} & \frac{\partial ^2f(x)}{\partial x_1\partial x_2} & \cdots & \frac{\partial ^2f(x)}{\partial x_1\partial x_n}\\ \frac{\partial ^2f(x)}{\partial x_2\partial x_1} & \frac{\partial ^2f(x)}{\partial x_2^2} & \cdots & \frac{\partial ^2f(x)}{\partial x_2\partial x_n}\\ \vdots & \vdots &\vdots & \vdots\\ \frac{\partial ^2f(x)}{\partial x_n\partial x_1} & \frac{\partial ^2f(x)}{\partial x_n\partial x_2} & \cdots & \frac{\partial ^2f(x)}{\partial x_n^2}\\ \end{array} \right) \]

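is called the second derivative, or Hesse (Hessian) matrix, of \(f(x)\) at \(x\). If every component function of the gradient \(\nabla f(x)\) is continuous at \(x\), then \(f\) is said to be continuously differentiable at \(x\); if every entry of the Hesse matrix \(\nabla^2 f(x)\) is continuous at \(x\), then \(f\) is said to be twice continuously differentiable at \(x\).

If \(f\) is continuously differentiable at every point of an open set \(D\), then \(f\) is said to be continuously differentiable on \(D\); if \(f\) is twice continuously differentiable at every point of the open set \(D\), then \(f\) is said to be twice continuously differentiable on \(D\).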
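The definitions above can be checked numerically. The following is a minimal sketch, not taken from the referenced book: the sample function `f` and the helper names `numerical_gradient` and `numerical_hessian` are assumptions made for illustration, and the derivatives are approximated by central differences using only the Python standard library.

```python
def f(x):
    """Sample function (assumed for illustration): f(x) = x1^2 + 3*x1*x2 + 2*x2^2."""
    x1, x2 = x
    return x1 ** 2 + 3.0 * x1 * x2 + 2.0 * x2 ** 2


def numerical_gradient(func, x, eps=1e-6):
    """Central-difference approximation of the gradient of func at x."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += eps
        x_minus[i] -= eps
        grad.append((func(x_plus) - func(x_minus)) / (2.0 * eps))
    return grad


def numerical_hessian(func, x, eps=1e-4):
    """Central-difference approximation of the Hesse matrix of func at x."""
    n = len(x)
    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Evaluate func at the four points x +/- eps*e_i +/- eps*e_j.
            values = {}
            for si in (1.0, -1.0):
                for sj in (1.0, -1.0):
                    point = list(x)
                    point[i] += si * eps
                    point[j] += sj * eps
                    values[(si, sj)] = func(point)
            hess[i][j] = (
                values[(1.0, 1.0)] - values[(1.0, -1.0)]
                - values[(-1.0, 1.0)] + values[(-1.0, -1.0)]
            ) / (4.0 * eps * eps)
    return hess


x0 = [1.0, 2.0]
grad = numerical_gradient(f, x0)
hess = numerical_hessian(f, x0)
print("gradient:", grad)
print("Hesse matrix:", hess)
```

For this sample function the analytic gradient is \((2x_1+3x_2,\ 3x_1+4x_2)^T\) and the Hesse matrix is constant with rows \((2,3)\) and \((3,4)\), so the numerical values at \(x=(1,2)^T\) should be close to \((8,11)^T\) and to that matrix, up to rounding error.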

Taylor Expansion

Let \(f:\mathbb{R}^n \rightarrow \mathbb{R}\) be continuously differentiable. Then

\[\begin{aligned} f(x+h) &=f(x)+\int_0^1\nabla f(x+\tau h)^Th \mathrm{d}\tau\\ &=f(x)+\nabla f(x+\xi h)^Th,\quad \xi \in(0,1)\\ &=f(x)+\nabla f(x)^T h+o(\|h\|) \end{aligned} \]
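The meaning of the \(o(\|h\|)\) term in the last line can be illustrated numerically: the remainder \(f(x+h)-f(x)-\nabla f(x)^Th\), divided by \(\|h\|\), should tend to zero as \(h\to 0\). The sketch below assumes a sample function and a fixed direction chosen only for illustration; the names `f`, `grad_f` and `first_order_ratio` are hypothetical.

```python
import math


def f(x):
    """Sample function (assumed for illustration): f(x) = exp(x1) + x1 * x2^2."""
    x1, x2 = x
    return math.exp(x1) + x1 * x2 ** 2


def grad_f(x):
    """Analytic gradient of the sample function."""
    x1, x2 = x
    return [math.exp(x1) + x2 ** 2, 2.0 * x1 * x2]


def first_order_ratio(x, h):
    """Return |f(x+h) - f(x) - grad_f(x)^T h| / ||h||."""
    x_new = [xi + hi for xi, hi in zip(x, h)]
    linear_term = sum(gi * hi for gi, hi in zip(grad_f(x), h))
    norm_h = math.sqrt(sum(hi * hi for hi in h))
    return abs(f(x_new) - f(x) - linear_term) / norm_h


x0 = [0.0, 1.0]
direction = [1.0, 1.0]
steps = [1e-1, 1e-2, 1e-3]

# Shrink h = t * direction and watch the scaled remainder shrink with it.
ratios = [first_order_ratio(x0, [t * d for d in direction]) for t in steps]
for t, r in zip(steps, ratios):
    print(f"t = {t:g}, remainder / ||h|| = {r:.3e}")
```

Because this sample function is smooth, the ratio shrinks roughly in proportion to \(t\); for a function that is only continuously differentiable, the definition guarantees convergence to zero but no particular rate.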

Furthermore, if \(f\) is twice continuously differentiable, then

\[\begin{aligned} f(x+h) &=f(x)+\nabla f(x)^Th+\int_0^1(1-\tau)h^T\nabla^2 f(x+\tau h)h \mathrm{d}\tau\\ &=f(x)+\nabla f(x)^Th+\frac{1}{2}h^T \nabla^2 f(x+\xi h)h,\quad \xi \in(0,1)\\ &=f(x)+\nabla f(x)^T h+\frac{1}{2}h^T\nabla^2 f(x)h+o(\|h\|^2) \end{aligned} \]

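and, for the gradient,

\[\begin{aligned} \nabla f(x+h) &=\nabla f(x)+\int_0^1\nabla^2f(x+\tau h)^Th \mathrm{d}\tau\\ &=\nabla f(x)+\nabla^2f(x)^Th+o(\|h\|) \end{aligned} \]

Note that, unlike the scalar case, a mean-value form \(\nabla f(x+h)=\nabla f(x)+\nabla^2 f(x+\xi h)^Th\) with a single \(\xi\in(0,1)\) does not hold in general for the vector-valued \(\nabla f\); the mean value theorem applies only componentwise, with a possibly different \(\xi_i\) for each component.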
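The two \(o(\cdot)\) statements in the second-order expansions can be checked in the same spirit: the remainder of the quadratic model divided by \(\|h\|^2\), and the remainder of the linearised gradient divided by \(\|h\|\), should both tend to zero. This is a minimal sketch under the assumption of a sample function with hand-derived derivatives; all function names are hypothetical.

```python
import math


def f(x):
    """Sample function (assumed for illustration): f(x) = exp(x1) + x1 * x2^2."""
    x1, x2 = x
    return math.exp(x1) + x1 * x2 ** 2


def grad_f(x):
    """Analytic gradient of the sample function."""
    x1, x2 = x
    return [math.exp(x1) + x2 ** 2, 2.0 * x1 * x2]


def hess_f(x):
    """Analytic Hesse matrix of the sample function."""
    x1, x2 = x
    return [[math.exp(x1), 2.0 * x2], [2.0 * x2, 2.0 * x1]]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def mat_vec(a, v):
    return [dot(row, v) for row in a]


def norm(v):
    return math.sqrt(dot(v, v))


def second_order_ratios(x, h):
    """Return (value remainder / ||h||^2, gradient remainder / ||h||)."""
    x_new = [xi + hi for xi, hi in zip(x, h)]
    g = grad_f(x)
    hess_h = mat_vec(hess_f(x), h)
    # Quadratic model f(x) + g^T h + 0.5 * h^T H h.
    model = f(x) + dot(g, h) + 0.5 * dot(h, hess_h)
    value_ratio = abs(f(x_new) - model) / norm(h) ** 2
    # Residual of the linearised gradient: grad f(x+h) - grad f(x) - H h.
    residual = [gn - gi - hi for gn, gi, hi in zip(grad_f(x_new), g, hess_h)]
    grad_ratio = norm(residual) / norm(h)
    return value_ratio, grad_ratio


x0 = [0.0, 1.0]
direction = [1.0, 1.0]
steps = [1e-1, 1e-2, 1e-3]
results = [second_order_ratios(x0, [t * d for d in direction]) for t in steps]
for t, (value_ratio, grad_ratio) in zip(steps, results):
    print(f"t = {t:g}, value ratio = {value_ratio:.3e}, gradient ratio = {grad_ratio:.3e}")
```

Both ratios should decrease as \(t\) decreases. The analytic derivatives are used here so that the check isolates the Taylor remainder rather than finite-difference error.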

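Let \(F=(F_1,F_2,\cdots,F_m)^T:\mathbb{R}^n\rightarrow \mathbb{R}^m\) be a vector-valued function. If every component function \(F_i\) is (continuously) differentiable, then \(F\) is said to be (continuously) differentiable. The derivative \(F'(x)\in\mathbb{R}^{m\times n}\) of \(F\) at \(x\) is its Jacobi matrix at \(x\), denoted \(F'(x)\) or \(J_F(x)\), that is,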

\[F'(x):=J_F(x):= \left( \begin{array}{cccc} \frac{\partial F_1(x)}{\partial x_1} & \frac{\partial F_1(x)}{\partial x_2} & \cdots & \frac{\partial F_1(x)}{\partial x_n}\\ \frac{\partial F_2(x)}{\partial x_1} & \frac{\partial F_2(x)}{\partial x_2} & \cdots & \frac{\partial F_2(x)}{\partial x_n}\\ \vdots & \vdots &\vdots & \vdots\\ \frac{\partial F_m(x)}{\partial x_1} & \frac{\partial F_m(x)}{\partial x_2} & \cdots & \frac{\partial F_m(x)}{\partial x_n}\\ \end{array} \right) \]

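By analogy with the definition of the gradient of a scalar function, the transpose of the Jacobi matrix of the vector-valued function \(F\) is sometimes called the gradient of \(F\) at \(x\), denoted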

\[\nabla F(x)=J_F(x)^T=(\nabla F_1(x),\nabla F_2(x),\cdots,\nabla F_m(x)) \]
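To make the shapes concrete, the sketch below approximates the Jacobi matrix of a sample map \(F:\mathbb{R}^2\rightarrow\mathbb{R}^3\) by central differences and then forms \(\nabla F(x)\) as its transpose. The map `F` and the helpers `numerical_jacobian` and `transpose` are assumptions introduced for illustration only.

```python
import math


def F(x):
    """Sample map F: R^2 -> R^3 (assumed for illustration)."""
    x1, x2 = x
    return [x1 * x2, x1 + x2 ** 2, math.sin(x1)]


def numerical_jacobian(func, x, eps=1e-6):
    """Central-difference approximation of the m-by-n Jacobi matrix of func at x."""
    m = len(func(x))
    n = len(x)
    jac = [[0.0] * n for _ in range(m)]
    for j in range(n):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        f_plus = func(x_plus)
        f_minus = func(x_minus)
        # Column j holds the partial derivatives with respect to x_j.
        for i in range(m):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * eps)
    return jac


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


x0 = [1.0, 2.0]
jac = numerical_jacobian(F, x0)   # 3 rows, 2 columns: row i is grad F_i(x)^T
grad_F = transpose(jac)           # 2 rows, 3 columns: column i is grad F_i(x)
print("Jacobi matrix:", jac)
print("gradient of F:", grad_F)
```

For this sample map the analytic Jacobi matrix at \(x=(1,2)^T\) has rows \((2,1)\), \((1,4)\) and \((\cos 1,0)\), so row \(i\) of `jac` and column \(i\) of `grad_F` should both approximate \(\nabla F_i(x)\).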


  1. Ma Changfeng (马昌凤). 最优化方法及其Matlab程序设计 [Optimization Methods and Their Matlab Programming] [M]. Science Press (科学出版社), 2010.

posted @ 2017-09-27 11:46  main_c