Optimization Fundamentals (3)[1]
Differentiability and Expansions of Functions
Definition: Let \(f(x)\) be a real-valued function of \(n\) variables, where \(x=(x_1,\cdots,x_n)^T\in\mathbb{R}^n\). The vector
\[\nabla f(x) = \left ( \frac{\partial f(x)}{\partial x_1 } , \frac{\partial f(x)}{\partial x_2},\cdots, \frac{\partial f(x)}{\partial x_n}\right )^T
\]
is called the first derivative, or gradient, of \(f\) at \(x\). The matrix
\[\nabla^2 f(x) = \left(
\begin{array}{cccc}
\frac{\partial ^2f(x)}{\partial x_1^2} & \frac{\partial ^2f(x)}{\partial x_1\partial x_2} & \cdots & \frac{\partial ^2f(x)}{\partial x_1\partial x_n}\\
\frac{\partial ^2f(x)}{\partial x_2\partial x_1} & \frac{\partial ^2f(x)}{\partial x_2^2} & \cdots & \frac{\partial ^2f(x)}{\partial x_2\partial x_n}\\
\vdots & \vdots & \ddots & \vdots\\
\frac{\partial ^2f(x)}{\partial x_n\partial x_1} & \frac{\partial ^2f(x)}{\partial x_n\partial x_2} & \cdots & \frac{\partial ^2f(x)}{\partial x_n^2}\\
\end{array}
\right)
\]
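is called the second derivative, or Hesse (Hessian) matrix, of \(f\) at \(x\). If every component of the gradient \(\nabla f(x)\) is continuous at \(x\), then \(f\) is said to be continuously differentiable at \(x\); if every entry of the Hesse matrix \(\nabla^2 f(x)\) is continuous at \(x\), then \(f\) is said to be twice continuously differentiable at \(x\).
If \(f\) is continuously differentiable at every point of an open set \(D\), then \(f\) is said to be continuously differentiable on \(D\); if \(f\) is twice continuously differentiable at every point of \(D\), then \(f\) is said to be twice continuously differentiable on \(D\).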
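To make the two definitions concrete, here is a minimal sketch that compares a hand-derived gradient and Hesse matrix with central finite differences. The test function \(f(x_1,x_2)=x_1^3+x_1x_2^2\), the helper names `fd_gradient` and `fd_hessian`, and the step size `eps` are illustrative assumptions, not taken from the reference.

```python
# Hypothetical test function (not from the source): f(x1, x2) = x1^3 + x1 * x2^2.
def f(x):
    x1, x2 = x
    return x1**3 + x1 * x2**2


def grad_f(x):
    """Hand-derived gradient: (3*x1^2 + x2^2, 2*x1*x2)."""
    x1, x2 = x
    return [3 * x1**2 + x2**2, 2 * x1 * x2]


def hess_f(x):
    """Hand-derived Hesse matrix: [[6*x1, 2*x2], [2*x2, 2*x1]]."""
    x1, x2 = x
    return [[6 * x1, 2 * x2], [2 * x2, 2 * x1]]


def fd_gradient(func, x, eps=1e-6):
    """Central-difference approximation of the gradient of a scalar function."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        g.append((func(xp) - func(xm)) / (2 * eps))
    return g


def fd_hessian(grad, x, eps=1e-6):
    """Central differences of the gradient.

    Entry [i][j] approximates the derivative of the i-th gradient
    component with respect to x_j.
    """
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        gp, gm = grad(xp), grad(xm)
        for i in range(n):
            H[i][j] = (gp[i] - gm[i]) / (2 * eps)
    return H


x0 = [1.0, 2.0]
print("analytic gradient:", grad_f(x0))
print("numeric gradient: ", fd_gradient(f, x0))
print("analytic Hesse matrix:", hess_f(x0))
print("numeric Hesse matrix: ", fd_hessian(grad_f, x0))
```

The numeric Hesse matrix comes out symmetric up to rounding, as expected for a twice continuously differentiable function.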
Taylor Expansion
Let \(f:\mathbb{R}^n \rightarrow \mathbb{R}\) be continuously differentiable. Then
\[\begin{aligned}
f(x+h) &=f(x)+\int_0^1\nabla f(x+\tau h)^Th \mathrm{d}\tau\\
&=f(x)+\nabla f(x+\xi h)^Th,\quad \xi \in(0,1)\\
&=f(x)+\nabla f(x)^T h+o(\|h\|)
\end{aligned}
\]
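The integral form and the mean-value form can be checked numerically on a small example. The sketch below assumes the hypothetical function \(f(x_1,x_2)=x_1^3+x_1x_2^2\) with \(x=(1,2)^T\) and \(h=(1,1)^T\); the midpoint rule and the bisection search are illustrative choices, and the bisection relies on \(\nabla f(x+\tau h)^Th\) being increasing in \(\tau\) for this particular example.

```python
# Hypothetical test function (not from the source): f(x1, x2) = x1^3 + x1 * x2^2.
def f(x):
    x1, x2 = x
    return x1**3 + x1 * x2**2


def grad_f(x):
    x1, x2 = x
    return [3 * x1**2 + x2**2, 2 * x1 * x2]


def directional_derivative(x, h, tau):
    """Return grad f(x + tau*h)^T h."""
    point = [xi + tau * hi for xi, hi in zip(x, h)]
    return sum(g * hi for g, hi in zip(grad_f(point), h))


def integral_form(x, h, n=1000):
    """Midpoint-rule approximation of the integral of grad f(x + tau*h)^T h over [0, 1]."""
    return sum(directional_derivative(x, h, (k + 0.5) / n) for k in range(n)) / n


def mean_value_xi(x, h, iters=60):
    """Bisection for xi in (0, 1) with grad f(x + xi*h)^T h = f(x + h) - f(x).

    Assumes the directional derivative is increasing in tau on [0, 1],
    which holds for the x and h used below but not in general.
    """
    target = f([xi + hi for xi, hi in zip(x, h)]) - f(x)
    a, b = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (a + b)
        if directional_derivative(x, h, mid) < target:
            a = mid
        else:
            b = mid
    return 0.5 * (a + b)


x, h = [1.0, 2.0], [1.0, 1.0]
increment = f([xi + hi for xi, hi in zip(x, h)]) - f(x)
xi_star = mean_value_xi(x, h)
print("f(x+h) - f(x)        =", increment)
print("integral form        =", integral_form(x, h))
print("xi from bisection    =", xi_star)
print("grad f(x+xi*h)^T h   =", directional_derivative(x, h, xi_star))
```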
Furthermore, if \(f\) is twice continuously differentiable, then
\[\begin{aligned}
f(x+h) &=f(x)+\nabla f(x)^Th+\int_0^1(1-\tau)h^T\nabla^2 f(x+\tau h)h \mathrm{d}\tau\\
&=f(x)+\nabla f(x)^Th+\frac{1}{2}h^T \nabla^2 f(x+\xi h)h,\quad \xi \in(0,1)\\
&=f(x)+\nabla f(x)^T h+\frac{1}{2}h^T\nabla^2 f(x)h+o(\|h\|^2)
\end{aligned}
\]
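The remainder terms \(o(\|h\|)\) and \(o(\|h\|^2)\) say that the model error shrinks faster than \(\|h\|\) and \(\|h\|^2\) respectively. A rough numerical illustration follows, again assuming the hypothetical function \(f(x_1,x_2)=x_1^3+x_1x_2^2\); the helper name `model_errors` and the direction \(d=(1,1)^T\) are made up for this sketch.

```python
import math


# Hypothetical test function (not from the source): f(x1, x2) = x1^3 + x1 * x2^2.
def f(x):
    x1, x2 = x
    return x1**3 + x1 * x2**2


def grad_f(x):
    x1, x2 = x
    return [3 * x1**2 + x2**2, 2 * x1 * x2]


def hess_f(x):
    x1, x2 = x
    return [[6 * x1, 2 * x2], [2 * x2, 2 * x1]]


def model_errors(x, d, t):
    """Return (e1 / ||h||, e2 / ||h||^2) for h = t*d.

    e1 and e2 are the absolute errors of the first- and second-order
    Taylor models of f at x, evaluated at x + h.
    """
    h = [t * di for di in d]
    n = len(h)
    norm_h = math.sqrt(sum(hi * hi for hi in h))
    exact = f([xi + hi for xi, hi in zip(x, h)])
    g, H = grad_f(x), hess_f(x)
    linear = f(x) + sum(gi * hi for gi, hi in zip(g, h))
    quad_term = 0.5 * sum(h[i] * H[i][j] * h[j] for i in range(n) for j in range(n))
    e1 = abs(exact - linear)
    e2 = abs(exact - (linear + quad_term))
    return e1 / norm_h, e2 / norm_h**2


for t in (1e-1, 1e-2, 1e-3):
    r1, r2 = model_errors([1.0, 2.0], [1.0, 1.0], t)
    print(f"t={t:g}  e1/||h||={r1:.3e}  e2/||h||^2={r2:.3e}")
```

Both ratios should decrease as `t` shrinks, which is what the little-o statements predict.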
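and
\[\begin{aligned}
\nabla f(x+h) &=\nabla f(x)+\int_0^1\nabla^2f(x+\tau h)^Th \mathrm{d}\tau\\
&=\nabla f(x)+\nabla^2f(x)^Th+o(\|h\|)
\end{aligned}
\]
Unlike the scalar case, a mean-value form with a single \(\xi\) does not hold in general for the vector-valued \(\nabla f\); it holds only componentwise: for each \(i\) there exists \(\xi_i\in(0,1)\) such that
\[\frac{\partial f(x+h)}{\partial x_i}=\frac{\partial f(x)}{\partial x_i}+\left(\nabla^2 f(x+\xi_i h)^Th\right)_i
\]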
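The integral form and the linearization of the gradient can also be checked numerically. This is only a sketch under assumptions: the function \(f(x_1,x_2)=x_1^3+x_1x_2^2\) is a hypothetical example, the integral is approximated with a midpoint rule, and the helper names are invented for illustration.

```python
import math


# Gradient and Hesse matrix of the hypothetical function f(x1, x2) = x1^3 + x1 * x2^2.
def grad_f(x):
    x1, x2 = x
    return [3 * x1**2 + x2**2, 2 * x1 * x2]


def hess_f(x):
    x1, x2 = x
    return [[6 * x1, 2 * x2], [2 * x2, 2 * x1]]


def hess_T_times(x, h):
    """Return the vector (Hesse matrix at x)^T h."""
    H = hess_f(x)
    n = len(h)
    return [sum(H[j][i] * h[j] for j in range(n)) for i in range(n)]


def integral_term(x, h, n=1000):
    """Midpoint-rule approximation of the integral of hess(x + tau*h)^T h over [0, 1]."""
    total = [0.0] * len(h)
    for k in range(n):
        tau = (k + 0.5) / n
        point = [xi + tau * hi for xi, hi in zip(x, h)]
        total = [s + v / n for s, v in zip(total, hess_T_times(point, h))]
    return total


def linearization_ratio(x, d, t):
    """Return ||grad f(x+h) - grad f(x) - hess(x)^T h|| / ||h|| for h = t*d."""
    h = [t * di for di in d]
    g_new = grad_f([xi + hi for xi, hi in zip(x, h)])
    residual = [a - b - c for a, b, c in zip(g_new, grad_f(x), hess_T_times(x, h))]
    norm = lambda v: math.sqrt(sum(vi * vi for vi in v))
    return norm(residual) / norm(h)


x, h = [1.0, 2.0], [1.0, 1.0]
diff = [a - b for a, b in zip(grad_f([xi + hi for xi, hi in zip(x, h)]), grad_f(x))]
print("grad f(x+h) - grad f(x) =", diff)
print("integral term           =", integral_term(x, h))
for t in (1e-1, 1e-2, 1e-3):
    print(f"t={t:g}  residual/||h||={linearization_ratio(x, [1.0, 1.0], t):.3e}")
```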
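Let \(F=(F_1,F_2,\cdots,F_m)^T:\mathbb{R}^n\rightarrow \mathbb{R}^m\) be a vector-valued function. If every component function \(F_i\) is (continuously) differentiable, then \(F\) is said to be (continuously) differentiable. The derivative \(F'(x)\in\mathbb{R}^{m\times n}\) of \(F\) at \(x\) is its Jacobi (Jacobian) matrix at \(x\), denoted \(F'(x)\) or \(J_F(x)\), namely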
\[F'(x):=J_F(x):= \left(
\begin{array}{cccc}
\frac{\partial F_1(x)}{\partial x_1} & \frac{\partial F_1(x)}{\partial x_2} & \cdots & \frac{\partial F_1(x)}{\partial x_n}\\
\frac{\partial F_2(x)}{\partial x_1} & \frac{\partial F_2(x)}{\partial x_2} & \cdots & \frac{\partial F_2(x)}{\partial x_n}\\
\vdots & \vdots & \ddots & \vdots\\
\frac{\partial F_m(x)}{\partial x_1} & \frac{\partial F_m(x)}{\partial x_2} & \cdots & \frac{\partial F_m(x)}{\partial x_n}\\
\end{array}
\right)
\]
By analogy with the gradient of a scalar function, the transpose of the Jacobi matrix of \(F\) is sometimes called the gradient of \(F\) at \(x\), denoted
\[\nabla F(x)=J_F(x)^T=(\nabla F_1(x),\nabla F_2(x),\cdots,\nabla F_m(x))
\]
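The relation between the Jacobi matrix and the gradient of a vector-valued function is easy to see on a small example. The sketch below assumes a hypothetical map \(F:\mathbb{R}^2\rightarrow\mathbb{R}^3\), \(F(x)=(x_1x_2,\ x_1^2+x_2,\ x_2^3)^T\), and approximates \(J_F(x)\) with central differences; `jacobian_fd` and `transpose` are illustrative helper names.

```python
# Hypothetical map F: R^2 -> R^3 (not from the source).
def F(x):
    x1, x2 = x
    return [x1 * x2, x1**2 + x2, x2**3]


def jacobian_fd(func, x, eps=1e-6):
    """Central-difference Jacobi matrix: row i holds the partial derivatives of F_i."""
    m, n = len(func(x)), len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        Fp, Fm = func(xp), func(xm)
        for i in range(m):
            J[i][j] = (Fp[i] - Fm[i]) / (2 * eps)
    return J


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


x = [1.0, 2.0]
J = jacobian_fd(F, x)    # 3 x 2; the hand-derived value at x is [[2, 1], [2, 1], [0, 12]]
grad_F = transpose(J)    # 2 x 3; column i approximates the gradient of F_i at x
print("J_F(x)  =", J)
print("grad F(x) =", grad_F)
```

Storing the gradient as the transposed Jacobi matrix keeps the scalar convention: for \(m=1\), `grad_F` reduces to a single column, the gradient of \(F_1\).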
[1] Ma Changfeng. 最优化方法及其Matlab程序设计 (Optimization Methods and Their Matlab Programming)[M]. Science Press, 2010.
