Differential Calculus of Multivariable Functions
Matrix norms
Definition
A map \(||\cdot||:\cup_{n,m\in N} M_{n,m}\mapsto [0,+\infty[\) is called a matrix norm if
- \(||A||\ge 0,||A||=0\Leftrightarrow A=0\).
- \(||aA||=|a|\cdot||A||,\forall a\in R\).
- \(||A+B||\le||A||+||B||\) for \(A,B\in M_{n,m}\).
- \(||AB||\le||A||\cdot ||B||,A\in M_{n,m},B\in M_{m,k}\).
Common matrix norms
- The entrywise 2-norm of a matrix (the Hilbert-Schmidt norm, also known as the Frobenius norm): \(||A||_2=\sqrt{\sum_{i=1}^{n}\sum_{j=1}^{m}|a_{ij}|^2},\forall A\in M_{n,m}\).
- The operator norm of a matrix: \(||A||=\sup_{|x|=1}|Ax|,\forall A\in M_{n,m}\).
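As a quick numerical sanity check of the submultiplicative property \(||AB||\le||A||\cdot||B||\) for the Hilbert-Schmidt norm, here is a minimal sketch in plain Python. The two matrices and the helper names `frobenius` and `matmul` are illustrative assumptions, not part of the notes.

```python
import math

def frobenius(A):
    # Hilbert-Schmidt (Frobenius) norm: square root of the sum of squared entries.
    return math.sqrt(sum(a * a for row in A for a in row))

def matmul(A, B):
    # Plain matrix product of an n x m matrix A with an m x k matrix B.
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

# Two arbitrarily chosen 2 x 2 matrices (hypothetical example data).
A = [[1.0, 2.0], [3.0, 4.0]]
B = [[0.0, 1.0], [1.0, 1.0]]

lhs = frobenius(matmul(A, B))        # ||AB||
rhs = frobenius(A) * frobenius(B)    # ||A|| * ||B||
print(lhs, rhs, lhs <= rhs)
```

A single example does not prove the inequality, of course; it only illustrates what the fourth axiom says for one concrete pair of matrices.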
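Equivalence of norms on \(R^m\)
Two norms \(||\cdot||\) and \(||\cdot||_2\) are called equivalent if and only if \(\exist c_1,c_2>0\) such that \(\forall x,c_1||x||\le ||x||_2\le c_2||x||\) (that is, each norm controls the other). Any two norms on \(R^m\) are equivalent.
Equivalence of matrix norms
For two matrix norms \(||\cdot||\) and \(||\cdot||_*\), \(\forall n,m\in N,\exist a=a(m,n)>0,b=b(m,n)>0\) such that \(\forall A\in M_{n,m},a||A||\le ||A||_*\le b||A||\).
\(M_{n,m}\) is isomorphic to \(R^{n\times m}\) \(\Rightarrow ||\cdot||\) and \(||\cdot||_*\) are equivalent on \(M_{n,m}\).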
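Differentials of multivariable functions
Definition
Cone points of a set
Let \(E\sube R^m\). A point \(x\in E\) is called a cone point of \(E\) if there exist a matrix \(A\in GL_m(R)\) and \(\delta>0\) such that \(\forall t\in [0,\delta]^m,x+At\in E\) (that is, there is a box with a vertex at \(x\) which, after a deformation such as stretching and rotation, is contained in \(E\)).
Clearly, if \(x\in \mathring{E}\), then \(x\) is a cone point of \(E\) (take \(A=I_m\)).
Boundary points of a box \(\prod_{i=1}^{m}[a_i,b_i]\) are also cone points.
Working with cone points makes it possible to study differentials at points of \(\partial E\).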
Higher-order infinitesimals
-
Let \(\phi:E\sube R^m\mapsto R^n\) with \(0\in E'\cap E\). We say that \(\phi(h)\) is a higher-order infinitesimal as \(E \ni h\to 0\) if
\[\forall \varepsilon>0,\exist \delta>0,\forall h\in B(0,\delta)\cap E,|\phi(h)|\le \varepsilon|h| \]written \(\phi(h)=o(h)(h\to 0)\).
-
Let \(f,g:E\sube R^m\mapsto R^n\) with \(x_0\in E'\cap E\). We say that \(f(x)\) is a higher-order infinitesimal relative to \(g(x)\) as \(E\ni x\to x_0\) if
\[\forall \varepsilon>0,\exist \delta>0,\forall x\in B(x_0,\delta)\cap E,|f(x)|\le \varepsilon|g(x)| \]written \(f(x)=o(g(x))(x\to x_0)\).
- \(o(h)\) is vector-valued, each of its components is \(o(|h|)\), and \(|o(h)|=o(|h|)\).
- If \(\phi:E\mapsto M_{n,m}\) satisfies \(\sup_{h\in E}||\phi(h)||<+\infty\), where \(||\cdot||\) is the operator norm, then \(\phi(h)o(h)=o(h)(h\to 0)\).
- \(o(Ah+o(h))=o(h)\).
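Differential of a map
Let \(E\sube R^m\) and let \(x\) be a cone point of \(E\). A map \(f:E\mapsto R^n\) is said to be differentiable at \(x\) if there exists a linear map (matrix) \(A_x\in M_{n,m}\) such that \(\forall h\in E-x(:=\{z-x|z\in E\}),f(x+h)-f(x)=A_xh+o(h),(h\to 0)\). In this case \(A_x\) is called the differential (or tangent map, or derivative map) of \(f\) at \(x\), written \(A_x=df(x)=Df(x)=f'(x)\). If every point of \(E\) is a cone point and \(f\) is differentiable at every point of \(E\), then \(f\) is said to be differentiable on \(E\).
Proof of the uniqueness of \(A_x\):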
\[\begin{array}{l} ◂\ \text{Take }x_0\in E\text{ and suppose that for all }h\in E-x_0\\ \left\{ \begin{array}{c} f(x_0+h)-f(x_0)=A_{x_0}h+o(h)\\ f(x_0+h)-f(x_0)=B_{x_0}h+o(h)\\ \end{array} \right. \\ \Rightarrow(A_{x_0}-B_{x_0})h=o(h).\\ \text{Since }x_0\text{ is a cone point of }E,\ \exist C\in GL_m(R),\eta>0\text{ such that }x_0+C[0,\eta]^m\sube E.\\ \text{Write }C=(\vec{c_1},\vec{c_2}\dots \vec{c_m})\Rightarrow \forall\ 0\le t\le \eta,1\le i\le m ,\ t\vec{c_i}\in E-x_0.\\ \text{Moreover, }\forall \varepsilon>0,\exist \delta>0\text{ such that }\forall h\in B(0,\delta)\cap (E-x_0),|(A_{x_0}-B_{x_0})h|\le \varepsilon |h|.\\ \text{Take }t=\min\{\eta,\frac{\delta}{2|\vec{c_i}|}\}>0\Rightarrow t\vec{c_i}\in B(0,\delta)\cap (E-x_0)\Rightarrow |(A_{x_0}-B_{x_0})(t\vec{c_i})|\le \varepsilon |t\vec{c_i}|\Rightarrow |(A_{x_0}-B_{x_0})\vec{c_i}|\le \varepsilon |\vec{c_i}|.\\ \text{Since }\varepsilon\text{ is arbitrary, }(A_{x_0}-B_{x_0})\vec{c_i}=0 \Rightarrow (A_{x_0}-B_{x_0})C=0\Rightarrow A_{x_0}=B_{x_0}\text{ because }C\text{ is invertible}. \ ▸ \end{array} \]
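From now on we only consider differentiability at interior points of \(E\); the conclusions also hold at cone points.
Geometric meaning of the differential
Consider a function \(f:E\sube R^m\mapsto R^n\).
Graph of a function
Denote by \(S=\{(x,y)|x\in E,y=f(x)\}\) the graph of \(f\).
Generalized planes
For \(b\in R^n,A\in M_{n,m}(R)\), the surface determined by the equation \(y=Ax+b\), namely \(P_b:=\{(x,y)\in R^{m+n}|(-A,I)\begin{pmatrix} x\\ y \\ \end{pmatrix}=b\}\), is called a generalized plane.
- When \(b=0\), \(P_b\) is a linear space, namely the solution space of \((-A,I)\begin{pmatrix} x\\ y \\ \end{pmatrix}=0\).
- When \(b\ne 0\), let \(z_0\in R^{m+n}\) satisfy \((-A,I)z_0=b\); then \(P_b=P_0+z_0\) is an affine subspace.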
Tangent plane and tangent space
If \(x_0\in \mathring{E}\) and \(f\) is differentiable at \(x_0\), then \(f(x)=f(x_0)+Df(x_0)(x-x_0)+o(x-x_0),E\ni x\to x_0\).
Write \(Df(x_0)=\begin{pmatrix} \vec{a_1}\\ \vec{a_2} \\ \vdots\\ \vec{a_n}\end{pmatrix},P=\{(x,y)|y-f(x_0)=Df(x_0)(x-x_0),x\in R^m\}\).
Then \(\forall 1\le i\le n,y_i-f(x_0)_i=\vec{a_i}(x-x_0)\). This equation defines a hyperplane \(H_i\) in \(R^{m+1}\) with \((x_0,f(x_0)_i)\in H_i\). \(P\) is an affine subspace with \((x_0,f(x_0))\in P\); it is called the tangent plane of \(S\) at \((x_0,f(x_0))\in S\). Write \(y_0=f(x_0)\). We call \(TS_{(x_0,y_0)}=P-(x_0,y_0)\) the tangent space of \(S\) at \((x_0,y_0)\); \(TS_{(x_0,y_0)}\) is a linear space, and \(TS_{(x_0,y_0)}=\{(h,Df(x_0)h)|h\in R^m\}\).
We now prove that the tangent space of \(S\) at \((x_0,y_0)\) consists exactly of the tangent vectors at \((x_0,y_0)\) of all parametrized curves \(\sigma\) in \(S\) through that point, i.e. \(TS_{(x_0,y_0)}=\{\sigma'(0)|\exist \delta>0,\sigma:[-\delta,\delta]\mapsto S,\sigma(0)=(x_0,y_0)\text{ and }\sigma\text{ is differentiable at }0\}\).
Write \(W=\{\sigma'(0)|\exist \delta>0,\sigma:[-\delta,\delta]\mapsto S,\sigma(0)=(x_0,y_0)\text{ and }\sigma\text{ is differentiable at }0\}.\)
-
Proof that \(TS_{(x_0,y_0)}\sube W\).
\[\begin{array}{l} ◂\ \text{Since }x_0\in \mathring{E},\ \forall h\in R^m,\exist \delta>0\text{ such that }\forall t\in [-\delta,\delta],x_0+th\in E\\ \Rightarrow \sigma (t)=(x_0+th,f(x_0+th))\in S\text{ and }\sigma(0)=(x_0,y_0)\\ \Rightarrow \sigma'(0)=(h,\frac{d}{dt}f(x_0+th)|_{t=0})=(h,Df(x_0)h),\text{ so every element of }TS_{(x_0,y_0)}\text{ has the form }\sigma'(0)\\ \Rightarrow TS_{(x_0,y_0)}\sube W. \ ▸ \end{array} \]
-
Proof that \(W\sube TS_{(x_0,y_0)}\).
\[\begin{array}{l} ◂\ \text{Let }\sigma:[-\delta,\delta]\mapsto S\text{ be differentiable at }0\text{ with }\sigma(0)=(x_0,y_0).\\ \text{Then }\sigma(t)=(\sigma_x(t),\sigma_y(t))\in S,\ \sigma_y(t)=f(\sigma_x(t)),\ \sigma_x(0)=x_0.\\ \sigma'(0)=(\frac{d}{dt}\sigma_x(t)|_{t=0},\frac{d}{dt}(f\circ \sigma_x)(t)|_{t=0})=(\sigma_x'(0),Df(x_0)\sigma_x'(0))\in TS_{(x_0,y_0)}\\ \Rightarrow W\sube TS_{(x_0,y_0)}. \ ▸ \end{array} \]
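Tangent space at a point \(P\) of a general surface \(S\) in \(R^m\)
If \(\sigma:[-\delta,\delta]\mapsto S\) satisfies \(\sigma(0)=P\) and is differentiable at \(0\), then \(\sigma'(0)\) is called a tangent vector of the surface \(S\) at \(P\); the set of all tangent vectors is called the tangent space of \(S\) at \(P\). That is, \(TS_P:=\{\sigma'(0)|\exist \delta>0,\sigma :[-\delta,\delta]\mapsto S\text{ with }\sigma(0)=P\text{ and }\sigma\text{ differentiable at }0\}\).
Moreover, if \(\sigma:[-\delta,\delta]\mapsto E\) is differentiable at \(0\) with \(\sigma(0)=x_0\) and \(f\) is differentiable at \(x_0\), then \(f\circ\sigma\) is a curve in \(f(E)\) through \(f(x_0)\) and \((f\circ \sigma)'(0)=Df(x_0)\sigma'(0)\).
That is, \(Df(x_0)\) maps the tangent space of \(E\) at \(x_0\) into the tangent space of \(f(E)\) at \(f(x_0)\). For this reason \(Df(x_0)\) is also called the tangent map.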
Computing differentials
Partial derivatives
Let \(f:E\sube R^m \mapsto R^n,x^*\in \mathring{E}\). Then \(\forall 1\le i\le m,\exist \delta>0,\forall x_i\in [x^*_i-\delta,x^*_i+\delta],(x^*_1,x^*_2,\cdots,x_i,\cdots,x^*_m)\in E\). If \(x_i\mapsto f(x^*_1,x^*_2,\cdots,x_i,\cdots,x^*_m)\) is differentiable at \(x^*_i\), then \(\frac{d}{dx_i}f(x^*_1,x^*_2,\cdots,x_i,\cdots,x^*_m)|_{x_i=x^*_i}\) is called the first-order partial derivative of \(f(x)\) at \(x^*\) with respect to \(x_i\), written \(\frac{\partial}{\partial x_i}f(x^*)\); \(\frac{\partial}{\partial x_i}\) is also called a partial derivative operator. Other notations are \(\partial_{x_i}f(x^*),D_if(x^*),f'_{x_i}(x^*)\).
It is easy to see that \(\frac{\partial}{\partial x_i}f(x^*)\) exists \(\Leftrightarrow t\mapsto f(x^*+t\vec{e_i})\) is differentiable at \(t=0\).
Computing the differential of a real-valued function
-
Let \(E\sube R^m\) and let \(f:E\mapsto R\) be differentiable at \(x\in \mathring{E}\). Then \(\forall 1\le i\le m\), \(\frac{\partial}{\partial x_i}f(x)\) exists and \(Df(x)=(\frac{\partial}{\partial x_1}f(x),\frac{\partial}{\partial x_2}f(x),\cdots,\frac{\partial}{\partial x_m}f(x))\).
-
(Definition of the gradient) If \(f:E\sube R^m\mapsto R\) is differentiable at \(x\), then \(\nabla f(x)=(Df(x))^T\) is called the gradient of \(f\) (a column vector).
In this case, writing \(S=\{(x,f(x))\in R^{m+1}|x\in E\}\), the tangent plane of \(S\) at the point \((x_0,y_0=f(x_0))\) is \(P_{(x_0,y_0)}=\{(x,y)|y-y_0=\nabla f(x_0)^T(x-x_0)\}\). That is, a normal vector of \(P_{(x_0,y_0)}\) is \(\begin{pmatrix} \nabla f(x_0)\\ -1 \\ \end{pmatrix}\).
Computing the differential of a vector-valued function
Consider a vector-valued function \(f:E\sube R^m\mapsto R^n\), \(f(x)=(f_1(x),f_2(x),\cdots f_n(x))\).
-
(Definition of the Jacobi matrix) Suppose that all components of \(f\) have all first-order partial derivatives at \(x\in E\). Then
\[J_f(x):=(a_{i,j}=\frac{\partial f_i}{\partial x_j}(x))_{n\times m}=\begin{pmatrix} Df_1\\ Df_2 \\ \vdots \\ Df_n\\ \end{pmatrix}=\begin{pmatrix} \frac{\partial f_1}{\partial x_1}(x) & \frac{\partial f_1}{\partial x_2}(x) & \cdots & \frac{\partial f_1}{\partial x_m}(x)\\ \frac{\partial f_2}{\partial x_1}(x) & \frac{\partial f_2}{\partial x_2}(x) & \cdots & \frac{\partial f_2}{\partial x_m}(x) \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f_n}{\partial x_1}(x) & \frac{\partial f_n}{\partial x_2}(x) & \cdots & \frac{\partial f_n}{\partial x_m}(x)\end{pmatrix} \]is called the Jacobi matrix (or Jacobian) of \(f\) at \(x\).
-
Let \(f:E\sube R^m \mapsto R^n,f=(f_1,f_2,\cdots,f_n),x\in \mathring{E}\). Then
- \(f\) is differentiable at \(x\) \(\Leftrightarrow \forall 1\le j\le n\), \(f_j\) is differentiable at \(x\).
- If \(f\) is differentiable at \(x\), then \(Df(x)=J_f(x)\).
-
Let \(f:E\sube R^m \mapsto R^n,f=(f_1,f_2,\cdots,f_n),x\in \mathring{E}\). If \(f\) is differentiable at \(x\), then \(f\) is continuous at \(x\).
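The identity \(Df(x)=J_f(x)\) can be checked numerically by comparing a hand-computed Jacobi matrix with central finite differences. The sketch below is only an illustration: the map `f`, the evaluation point and the helper `numerical_jacobian` are assumptions chosen for the example.

```python
import math

def f(x):
    # Illustrative map f: R^2 -> R^2 (an assumption, not taken from the notes).
    return [x[0] ** 2 * x[1], 5.0 * x[0] + math.sin(x[1])]

def exact_jacobian(x):
    # Rows are Df_1 and Df_2, i.e. entry (i, j) is d f_i / d x_j.
    return [[2.0 * x[0] * x[1], x[0] ** 2],
            [5.0, math.cos(x[1])]]

def numerical_jacobian(func, x, eps=1e-6):
    # Central differences, one column per input coordinate x_j.
    n, m = len(func(x)), len(x)
    J = [[0.0] * m for _ in range(n)]
    for j in range(m):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = func(xp), func(xm)
        for i in range(n):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * eps)
    return J

point = [1.0, 2.0]
J_exact = exact_jacobian(point)
J_numeric = numerical_jacobian(f, point)
max_error = max(abs(J_exact[i][j] - J_numeric[i][j])
                for i in range(2) for j in range(2))
print(max_error)
```

Finite differences only approximate partial derivatives, so agreement here supports the formula but says nothing about differentiability itself.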
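Partial derivatives and differentiability
-
Differentiability implies the existence of the partial derivatives, but the existence of the partial derivatives does not imply differentiability. An example is
\[f(x,y)= \left\{ \begin{array}{ll} 0 & xy=0\\ 1 & xy\ne 0\\ \end{array} \right. \]Both \(\frac{\partial}{\partial x}f(0,0)\) and \(\frac{\partial}{\partial y}f(0,0)\) exist (and equal \(0\)), but \(f\) is not continuous at \((0,0)\).
The existence of a tangent plane is essentially different from the existence of \(m\) tangent vectors.
-
Let \(f:E\sube R^m\mapsto R^n\), \(f=(f_1,f_2,\cdots f_n),x^*\in \mathring{E}\).
- (Bounded partial derivatives \(\Rightarrow\) continuity) If \(\exist \delta>0\) such that \(B(x^*,\delta)\sube E\) and \(\forall 1\le i\le n,1\le j\le m, \sup_{x\in B(x^*,\delta)}|\frac{\partial f_i}{\partial x_j} (x)|\le M < +\infty\), then \(f\) is continuous at \(x^*\). In other words, if every partial derivative of \(f\) is bounded near \(x^*\), then \(f\) is continuous at \(x^*\).
- (Continuous Jacobian \(\Rightarrow\) differentiability) If \(\exist \delta>0\) such that \(B(x^*,\delta)\sube E\), \(J_f(x)\) exists on \(B(x^*,\delta)\) and is continuous at \(x^*\), then \(f\) is differentiable at \(x^*\). (Continuity of \(J_f(x)\) at \(x^*\) means \(||J_f(x)-J_f(x^*)||\to 0\) as \(x\to x^*\).)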
The function class \(C^1\)
If \(f:E\sube R^m \mapsto R^n\) satisfies \(\forall 1\le i\le n,1\le j\le m,\frac{\partial f_i}{\partial x_j}\in C(E;R)\), then we write \(f\in C^1(E;R^n)\).
That is, \(C^1(E;R^n)=\{f:E\mapsto R^n|J_f\in C(E;M_{n,m})\}\).
If \(E\) is an open set in \(R^m\), then \(f\in C^1(E;R^n)\Leftrightarrow f\) is differentiable on \(E\) and \(J_f(x)\) is continuous on \(E\).
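Basic rules of differentiation
Linear operators
If \(T:(X,+,\cdot,\mathbb{F})\mapsto (Y,+,\cdot,\mathbb{F})\) satisfies
- \(T(x+y)=Tx+Ty, \forall x,y\in X\)
- \(\forall x\in X,a\in \mathbb{F},T(ax)=aTx\)
then \(T\) is called a linear operator from \(X\) to \(Y\).
Differentials and arithmetic operations
If \(f:E\sube R^m \mapsto R^{n_1},g:E\sube R^m \mapsto R^{n_2}\) are both differentiable at \(x\in \mathring{E}\), then
- when \(n_1=n_2\): \(D(\alpha f+\beta g)(x)=\alpha Df(x)+\beta Dg(x)\) for \(\alpha,\beta\in R\).
- when \(n_2=1\): \(D(fg)(x)=g(x)Df(x)+f(x)Dg(x)\); if \(g(x)\ne 0\), then \(D(\frac{f}{g})(x)=\frac{g(x)Df(x)-f(x)Dg(x)}{g(x)^2}\).
Overall the rules are the same as for functions on \(R\); one only has to respect the rules of matrix multiplication.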
Differentiation of composite maps
Let \(f:X\sube R^m \mapsto Y\sube R^n\) be differentiable at a point \(x\in \mathring{X}\) with \(f(x)\in \mathring{Y}\), and let \(g:Y\mapsto R^k\) be differentiable at \(y=f(x)\). Then \(g\circ f:X\mapsto R^k\) is differentiable at \(x\) and \(D(g\circ f)(x)=Dg(f(x))Df(x)\).
From this we obtain
-
If \(g(y)=(g_1(y),g_2(y),\cdots,g_k(y))^T,y=f(x)=(f_1(x),f_2(x),\cdots,f_n(x))^T\), then
\[\frac{\partial}{\partial x_j}(g_i\circ f)(x)=\sum_{l=1}^{n}\frac{\partial}{\partial y_l}g_i(y)\frac{\partial}{\partial x_j}f_l(x) \]
-
(Chain rule for real-valued functions) If \(f:X\sube R^m\mapsto R^n,g:Y\mapsto R,x\in \mathring{X},f(x)\in \mathring{Y}\), with \(f\) differentiable at \(x\) and \(g\) differentiable at \(f(x)\), then
\[D(g\circ f)(x)=(\frac{\partial}{\partial x_1}(g\circ f)(x),\frac{\partial}{\partial x_2}(g\circ f)(x),\cdots,\frac{\partial}{\partial x_m}(g\circ f)(x))\\ \frac{\partial}{\partial x_i}(g\circ f)(x)=\sum_{l=1}^{n}\frac{\partial}{\partial y_l}g(y)\frac{\partial}{\partial x_i}f_l(x)|_{y=f(x)} \]
-
Let \(E\sube R^m\) be an open set, let \(f:E\mapsto R^n\) be differentiable everywhere, and let \(\sigma :I\mapsto E\) be differentiable everywhere (where \(I\) is an interval). Then \((f\circ \sigma):I\mapsto R^n\) is differentiable everywhere, and
\[(f\circ \sigma)'(t)=Df(\sigma(t))\sigma '(t),\forall t\in I \]
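The curve version of the chain rule, \((f\circ \sigma)'(t)=Df(\sigma(t))\sigma '(t)\), can be illustrated numerically. In this minimal sketch the curve `sigma` (the unit circle) and the map `f` are hypothetical choices made only for the example; the right-hand side is computed from hand-written derivatives and compared with a central finite difference of \(f\circ\sigma\).

```python
import math

def sigma(t):
    # Illustrative curve in R^2: the unit circle.
    return [math.cos(t), math.sin(t)]

def sigma_prime(t):
    return [-math.sin(t), math.cos(t)]

def f(p):
    # Illustrative map f: R^2 -> R^2.
    x, y = p
    return [x * y, x + y]

def Df(p):
    # Jacobi matrix of f, computed by hand.
    x, y = p
    return [[y, x], [1.0, 1.0]]

def chain_rule(t):
    # Matrix-vector product Df(sigma(t)) * sigma'(t).
    J, v = Df(sigma(t)), sigma_prime(t)
    return [sum(J[i][j] * v[j] for j in range(2)) for i in range(2)]

def finite_difference(t, eps=1e-6):
    # Central difference approximation of (f o sigma)'(t).
    fp, fm = f(sigma(t + eps)), f(sigma(t - eps))
    return [(a - b) / (2.0 * eps) for a, b in zip(fp, fm)]

t0 = 0.7
exact = chain_rule(t0)
approx = finite_difference(t0)
print(exact, approx)
```

For this particular choice the first component of \(f\circ\sigma\) is \(\cos t\sin t\), whose derivative is \(\cos 2t\), which gives an independent check of the first entry.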
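The mean value inequality
Line segments in Euclidean space
For \(x,y\in R^m\), define \([x,y]=[y,x]=\{tx+(1-t)y|t\in[0,1]\},(x,y)=(y,x)=\{tx+(1-t)y|t\in(0,1)\}\) to be the closed and the open line segment in \(R^m\) with endpoints \(x\) and \(y\). In particular, \([x,y]=\{x\}\) when \(x=y\).
The mean value inequality
Let \(f:E\sube R^m \mapsto R^n\) and let \(x,y\in R^m\) satisfy \([x,y]\sube E,(x,y)\sube \mathring{E}\). If \(f\) is continuous on \([x,y]\) and differentiable at every point of \((x,y)\), then \(\exist \xi \in (x,y)\) such that \(|f(x)-f(y)|\le ||Df(\xi)||\cdot|x-y|\),
where \(||\cdot||\) is the operator norm of a matrix.
This yields the corollary:
- Let \(\Omega\sube R^m\) be a domain (a connected open set) and let \(f:\Omega \mapsto R^n\) be differentiable everywhere on \(\Omega\). If \(\forall x\in \Omega,Df(x)=0\), then \(f(x)\equiv C\).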
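Directional derivatives and the gradient
Directional derivatives
Let \(x_0\in R^m,f:B(x_0,\delta) \mapsto R\), and let \(\vec{v}\in R^m\) be a given nonzero vector. If the limit \(D_{\vec{v}}f(x_0):=\lim_{t\to 0}\frac{f(x_0+t\vec{v})-f(x_0)}{t}\)
exists and is finite, then \(D_{\vec{v}}f(x_0)\) is called the derivative of \(f\) at \(x_0\) along \(\vec{v}\); when \(|\vec{v}|=1\), \(D_{\vec{v}}f(x_0)\) is called the directional derivative of \(f\) at \(x_0\) in the direction \(\vec{v}\).
Clearly partial derivatives are directional derivatives, and the existence of directional derivatives does not imply that \(f\) is differentiable.
Setting \(\sigma(t)=x_0+t\vec{v}\), we have \(D_{\vec{v}}f(x_0)=\frac{d}{dt}(f\circ \sigma)|_{t=0}\).
By the chain rule, if \(f\) is differentiable at \(x_0\), then \(D_{\vec{v}}f(x_0)=Df(\sigma(0))\sigma'(0)=Df(x_0)\vec{v}\). Hence \(D_{\vec{v}}f(x_0)=(\nabla f(x_0),\vec{v})\le |\nabla f(x_0)|\cdot|\vec{v}|\),
and, when \(\nabla f(x_0)\ne 0\), \(f\) increases fastest in the direction of the gradient (\(\vec{v}=\frac{\nabla f(x_0)}{|\nabla f(x_0)|}\)) and decreases fastest in the direction of the negative gradient (\(\vec{v}=-\frac{\nabla f(x_0)}{|\nabla f(x_0)|}\)).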
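To make the relation between directional derivatives and the gradient concrete, the following sketch compares a finite-difference directional derivative with the inner product \((\nabla f(x_0),\vec{v})\), and scans unit directions to see where the directional derivative is largest. The function `f`, the point and the 1-degree sampling grid are assumptions made for illustration.

```python
import math

def f(x, y):
    # Illustrative real-valued function on R^2.
    return x * x + 3.0 * y * y

def grad_f(x, y):
    # Gradient computed by hand.
    return (2.0 * x, 6.0 * y)

def directional_derivative(x, y, v, eps=1e-6):
    # Central difference approximation of d/dt f(x0 + t v) at t = 0.
    return (f(x + eps * v[0], y + eps * v[1])
            - f(x - eps * v[0], y - eps * v[1])) / (2.0 * eps)

x0, y0 = 1.0, 1.0
g = grad_f(x0, y0)
g_norm = math.hypot(g[0], g[1])

# Compare with the inner product formula for one unit direction.
v = (0.6, 0.8)
numeric = directional_derivative(x0, y0, v)
formula = g[0] * v[0] + g[1] * v[1]

# Scan unit directions on a 1-degree grid and keep the steepest one.
best_value, best_direction = -float("inf"), None
for k in range(360):
    angle = 2.0 * math.pi * k / 360.0
    u = (math.cos(angle), math.sin(angle))
    value = g[0] * u[0] + g[1] * u[1]
    if value > best_value:
        best_value, best_direction = value, u

print(numeric, formula, best_value, g_norm)
```

The maximum over the sampled directions is close to, and never exceeds, \(|\nabla f(x_0)|\), in line with the Cauchy-Schwarz bound.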
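Differential calculus of real-valued functions of several variables
Mean value theorem
Let \(f:E\sube R^m \mapsto R,[x,y]\sube E,(x,y)\sube \mathring{E}\). If \(f\) is continuous on \([x,y]\) and differentiable on \((x,y)\), then \(\exist \theta\in (0,1)\) such that \(f(x)-f(y)=Df(y+\theta(x-y))(x-y)\),
or, equivalently, \(f(x)-f(y)=(\nabla f(\xi),x-y)\) with \(\xi=y+\theta(x-y)\in (x,y)\).
Corollary: let \(\Omega\sube R^m\) be a domain (a connected open set) and let \(f:\Omega \mapsto R\) be differentiable everywhere on \(\Omega\). If \(\forall x\in \Omega,\nabla f(x)=0\), then \(f(x)\equiv C\).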
Higher-order partial derivatives
(Below we write \(D_i=\frac{\partial}{\partial x_i}\).)
-
(Definition) Let \(\Omega\) be an open set in \(R^m\) and suppose that the first-order partial derivative \(D_if(x)\) of \(f:\Omega \mapsto R\) exists. If \(x\mapsto D_if(x)\) has a first-order partial derivative \(\frac{\partial}{\partial x_j}(D_if)\) at \(x_0\in \Omega\), then we write
\[D_jD_if(x_0):=\frac{\partial^2}{\partial x_j\partial x_i}f(x_0):=\frac{\partial}{\partial x_j}(\frac{\partial}{\partial x_i}f)(x_0) \]and call it a second-order partial derivative of \(f\) at \(x_0\). If all second-order partial derivatives exist at every point of \(\Omega\), we say that \(f\) has second-order partial derivatives in \(\Omega\).
-
More generally,
\[ D_{i_n}D_{i_{n-1}}\dots D_{i_1}f(x_0):=\frac{\partial^n}{\partial x_{i_n}\partial x_{i_{n-1}}\dots \partial x_{i_1}}f(x_0)=\frac{\partial}{\partial x_{i_n}}\frac{\partial}{\partial x_{i_{n-1}}}\dots\frac{\partial}{\partial x_{i_1}}f(x_0) \]is called an \(n\)-th order partial derivative of \(f\) at \(x_0\).
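Interchanging the order of partial derivatives
-
(The function class \(C^k\)): if \(\Omega\) is an open set in \(R^m\), then
\[C^k(\Omega;R)=\{f:\Omega \mapsto R|\text{all partial derivatives of }f\text{ of order at most }k\text{ exist and are continuous on }\Omega\} \]Clearly:
- \(C(\Omega;R)\supe C^1(\Omega;R) \supe \cdots \supe C^k(\Omega;R)\supe C^{k+1}(\Omega;R)\supe \cdots\)
- \(C^{\infty}(\Omega;R):=\cap_{k\in N} C^k(\Omega;R)\)
-
(Partial derivative operators): if \(\alpha=(\alpha_1,\alpha_2,\cdots ,\alpha_m)\in \mathbb{N}^m\) (with \(|\alpha|=\sum_{j=1}^{m}\alpha_j\)), then
\[D^\alpha:=\partial^{\alpha_1}_{x_1}\partial^{\alpha_2}_{x_2}\cdots\partial^{\alpha_m}_{x_m}=(\frac{\partial}{\partial x_1})^{\alpha_1}(\frac{\partial}{\partial x_2})^{\alpha_2}\cdots (\frac{\partial}{\partial x_m})^{\alpha_m} \]is called a partial derivative operator of order \(|\alpha|\). In particular, when \(\alpha_i=0\) we adopt the convention \(\partial^{\alpha_i}_{x_i}=\partial^{0}_{x_i}:=Id\) (the identity operator).
-
Let \(m\ge 2\), let \(\Omega\) be an open set in \(R^m\), and let \(f\in C^r(\Omega;R)\). Then \(\forall 2\le k\le r\), every \(k\)-th order partial derivative \(\frac{\partial^k}{\partial x_{i_k} \cdots \partial x_{i_2}\partial x_{i_1}}f(x)\) of \(f\) is independent of the order of differentiation, i.e. the differentiations with respect to \(x_{i_1},x_{i_2}\cdots x_{i_k}\) may be interchanged.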
-
(Derivative formula for monomials): let \(\alpha=(\alpha_1,\alpha_2,\cdots ,\alpha_m),\beta=(\beta_1,\beta_2,\cdots ,\beta_m),\alpha,\beta\in \mathbb{N}^m\). Define
\[|\alpha|=\sum_{j=1}^{m}\alpha_j\\ \alpha^\beta=\alpha_1^{\beta_1}\alpha_2^{\beta_2}\cdots \alpha_m^{\beta_m}\\ \beta !=\beta_1 !\beta_2!\cdots\beta_m! \]By a combinatorial argument (the multinomial theorem), one finds
\[(\sum_{i=1}^{m}\alpha_i)^n=\sum_{|\beta|=n}\frac{n!}{\beta !}\alpha^\beta \]This identity extends to any commutative algebra \(\mathcal{A}\) (\(\mathcal{A}\) is a linear space equipped with a multiplication, where addition and multiplication are commutative, associative and satisfy the distributive laws).
For an open set \(\Omega\) in \(R^m\), \(f,g\in C^k(\Omega;R)\) and \(c_1,c_2\in R\), \(\forall 1\le n\le k,h=(h_1,h_2,\cdots ,h_m),t\in[0,1]\) we have
-
\[ (\sum_{i=1}^{m}h_iD_i)^n(c_1f+c_2g)=\sum_{|\alpha|=n}\frac{n!}{\alpha !}h^\alpha D^\alpha(c_1f+c_2g) \]
\[\frac{d^n}{dt^n}(f(x_0+th))=(\sum_{i=1}^{m}h_i \frac{\partial}{\partial x_i})^nf(x_0+th)=\sum_{|\alpha|=n}\frac{n!}{\alpha !}h^\alpha(D^\alpha f)(x_0+th) \]We also have the derivative formula for monomials:
\[D^\beta x^\alpha= \left\{ \begin{array}{ll} \frac{\alpha !}{(\alpha-\beta)!} x^{\alpha-\beta} & \alpha-\beta \in \mathbb{N}^m\\ 0 & \alpha-\beta\notin \mathbb{N}^m\\ \end{array} \right. \]
Taylor's formula for multivariable functions
Let \(\Omega\) be an open set in \(R^m\), \(n\in \mathbb{N},f\in C^{n+1}(\Omega;R),x_0\in \Omega\). For every \(x\in \Omega\) with \([x_0,x]\sube \Omega\) we have
-
(Taylor's formula with Lagrange remainder)
\[f(x)=\sum_{k=0}^{n}\sum_{|\alpha|=k}\frac{D^\alpha f(x_0)}{\alpha !}(x-x_0)^\alpha+\sum_{|\alpha|=n+1}\frac{D^\alpha f(\xi)}{\alpha !}(x-x_0)^\alpha \]where \(\xi \in (x_0,x)\). The sum \(\sum_{k=0}^{n}\sum_{|\alpha|=k}\frac{D^\alpha f(x_0)}{\alpha !}(x-x_0)^\alpha\) is called the Taylor polynomial.
-
(Taylor's formula with integral remainder)
\[f(x)=\sum_{k=0}^{n}\sum_{|\alpha|=k}\frac{D^\alpha f(x_0)}{\alpha !}(x-x_0)^\alpha+(n+1)\sum_{|\alpha|=n+1}\frac{(x-x_0)^\alpha}{\alpha !}\int_{0}^{1}(1-t)^n(D^\alpha f)(x_0+t(x-x_0))dt. \]
-
(Taylor's formula with remainder of type \(o(|x-x_0|^n)\))
\[f(x)=\sum_{k=0}^{n}\sum_{|\alpha|=k}\frac{D^\alpha f(x_0)}{\alpha !}(x-x_0)^\alpha+o(|x-x_0|^n) \]
As a mnemonic, compare with the one-variable formulas:
\[\varphi(x)=\sum_{i=0}^{n}\frac{\varphi^{(i)}(x_0)}{i!}(x-x_0)^i+\frac{\varphi^{(n+1)}(\xi)}{(n+1)!}(x-x_0)^{n+1},\xi=tx_0+(1-t)x,t\in(0,1)\\ \varphi(x)=\sum_{i=0}^{n}\frac{\varphi^{(i)}(x_0)}{i!}(x-x_0)^i+\frac{1}{n!}\int_{x_0}^{x}(x-t)^n\varphi^{(n+1)}(t)dt\\ \varphi(x)=\sum_{i=0}^{n}\frac{\varphi^{(i)}(x_0)}{i!}(x-x_0)^i+o(|x-x_0|^n) \]
- (Uniqueness of the Taylor formula)
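The Hessian matrix
For \(f:\Omega\sube R^m\mapsto R\) with all second-order partial derivatives at \(x\), the matrix \(H_f(x):=(\frac{\partial^2 f}{\partial x_i\partial x_j}(x))_{m\times m}\)
is called the Hessian matrix of \(f\) at \(x\).
The Hessian matrix and the second-order Taylor formula
Let \(\Omega\) be an open set in \(R^m\), \(f\in C^2(\Omega;R)\) and \(x_0\in \Omega\). Then for every \(x\in \Omega\) with \([x_0,x]\sube \Omega\) there is \(\xi\in (x_0,x)\) such that \(f(x)=f(x_0)+Df(x_0)(x-x_0)+\frac{1}{2}(x-x_0)^TH_f(\xi)(x-x_0)\); in particular, by the continuity of \(H_f\), \(f(x)=f(x_0)+Df(x_0)(x-x_0)+\frac{1}{2}(x-x_0)^TH_f(x_0)(x-x_0)+o(|x-x_0|^2)\).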
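The second-order Taylor approximation \(f(x_0)+Df(x_0)h+\frac{1}{2}h^TH_f(x_0)h\) should leave a remainder that is \(o(|h|^2)\). The sketch below checks this for one hypothetical example, \(f(x,y)=e^x\cos y\) at the origin, where the gradient and Hessian are entered by hand; the function, the direction and the scaling factors are all assumptions made for illustration.

```python
import math

def f(x, y):
    # Illustrative smooth function; expansion point is (0, 0).
    return math.exp(x) * math.cos(y)

# Values at (0, 0), computed by hand for this particular f.
f0 = 1.0
grad0 = (1.0, 0.0)
hess0 = ((1.0, 0.0), (0.0, -1.0))

def taylor2(h):
    # Second-order Taylor polynomial f0 + grad0.h + (1/2) h^T hess0 h.
    linear = grad0[0] * h[0] + grad0[1] * h[1]
    quadratic = sum(hess0[i][j] * h[i] * h[j]
                    for i in range(2) for j in range(2))
    return f0 + linear + 0.5 * quadratic

def remainder_ratio(h):
    # |f(h) - T_2(h)| / |h|^2, which should tend to 0 as h -> 0.
    norm_squared = h[0] ** 2 + h[1] ** 2
    return abs(f(h[0], h[1]) - taylor2(h)) / norm_squared

# Shrink h along a fixed direction and watch the ratio decrease.
ratios = [remainder_ratio((0.1 * s, 0.2 * s)) for s in (1.0, 0.1, 0.01)]
print(ratios)
```

The ratios shrink roughly in proportion to \(|h|\), which is what a third-order remainder would produce for a smooth function.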
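Extremum problems for functions
Definition of local extrema
Let \(f:E\sube R^m\mapsto R,x_0\in \mathring{E}\). If \(\exist \delta>0\) with \(B(x_0,\delta)\sube E\) such that \(\forall x\in B(x_0,\delta),f(x)\ge f(x_0)\) (or \(f(x)\le f(x_0)\)), then \(x_0\) is called a local minimum point (or local maximum point) of \(f\), and \(f(x_0)\) is called a local minimum (or local maximum) of \(f\).
Replacing \(\le\) by \(<\) and \(\ge\) by \(>\) above for all \(x\ne x_0\) gives the definition of a strict local minimum (maximum).
A global minimum (maximum) is defined similarly, i.e. \(f(x_0)=\min_{x\in E}f(x)\) (or \(f(x_0)=\max_{x\in E}f(x)\)).
Definition of critical points
Let \(f:E\sube R^m\mapsto R^n,x_0\in \mathring{E}\). If \(f\) is differentiable at \(x_0\) and \(r(Df(x_0))<\min(n,m)\), then \(x_0\) is called a critical point of \(f\), and \(f(x_0)\) is called a critical value of \(f\).
In particular, when \(n=1\), i.e. \(f\) is real-valued, \(r(Df(x_0))<1 \Leftrightarrow \frac{\partial f}{\partial x_1}(x_0)=\frac{\partial f}{\partial x_2}(x_0)=\cdots =\frac{\partial f}{\partial x_m}(x_0)=0\).
Fermat's principle for extrema
Let \(f:E\sube R^m\mapsto R,x_0\in \mathring{E}\), with \(f\) differentiable at \(x_0\). If \(x_0\) is a local extremum point of \(f\), then \(x_0\) is a critical point of \(f\).
That is, at interior points of differentiability a local extremum can only be attained at a critical point.
Rolle's theorem
Let \(\Omega\) be a bounded open set in \(R^m\) and let \(f:\overline{\Omega}\mapsto R\) be continuous and differentiable in \(\Omega\). If \(f|_{\partial \Omega}\equiv C\), then \(\exist \xi\in \Omega,\nabla f(\xi)=0\).
From Rolle's theorem and Fermat's principle one obtains: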
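The Hessian test
Let \(\Omega\) be an open set in \(R^m\), \(f\in C^2(\Omega;R)\), and let \(x_0\in \Omega\) be a critical point of \(f\), i.e. \(\nabla f(x_0)=0\). Then:
-
If \(H_f(x_0)\) is positive definite, then \(x_0\) is a strict local minimum point of \(f\).
-
If \(H_f(x_0)\) is negative definite, then \(x_0\) is a strict local maximum point of \(f\).
-
If \(x_0\) is a local minimum point of \(f\), then \(H_f(x_0)\) is positive semidefinite.
-
If \(x_0\) is a local maximum point of \(f\), then \(H_f(x_0)\) is negative semidefinite.
-
If \(H_f(x_0)\) is indefinite, then \(x_0\) is not an extremum point of \(f\).
A critical point need not be an extremum point, for example:
\[f(x,y)=x^2-y^2 \]The graph of this function resembles a saddle surface. It is easy to see that \((0,0)\) is a critical point of \(f\) but not an extremum point.
A critical point that is not an extremum point is called a "saddle point" of \(f\).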
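For \(m=2\) the Hessian test can be turned into a small classifier based on the two eigenvalues of the symmetric matrix \(H_f(x_0)\). The sketch below is an illustration under stated assumptions: the function names `eigenvalues_sym2` and `classify`, the tolerance, and the sample Hessians are all made up for the example, and the semidefinite case is reported as inconclusive because the test gives no verdict there.

```python
import math

def eigenvalues_sym2(H):
    # Eigenvalues of a symmetric 2 x 2 matrix [[a, b], [b, c]] in closed form.
    a, b, c = H[0][0], H[0][1], H[1][1]
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    return mean - radius, mean + radius

def classify(H, tol=1e-12):
    # Apply the Hessian test at a critical point with Hessian H.
    smallest, largest = eigenvalues_sym2(H)
    if smallest > tol:
        return "strict local minimum"      # positive definite
    if largest < -tol:
        return "strict local maximum"      # negative definite
    if smallest < -tol and largest > tol:
        return "saddle point"              # indefinite
    return "inconclusive"                  # semidefinite: test is silent

# Hessians at (0, 0) of x^2 - y^2, x^2 + y^2 and y^2, entered by hand.
saddle = classify([[2.0, 0.0], [0.0, -2.0]])
minimum = classify([[2.0, 0.0], [0.0, 2.0]])
degenerate = classify([[0.0, 0.0], [0.0, 2.0]])
print(saddle, minimum, degenerate)
```

The first case reproduces the saddle example \(f(x,y)=x^2-y^2\) from the text.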
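Differential properties of convex and concave functions
Differentiability of convex functions
If \(\Omega \sube R^m\) is a convex open set and \(f:\Omega \mapsto R\) is a convex function, then
-
\(\forall x\in \Omega\), \(f\) is differentiable at \(x\) \(\Leftrightarrow \exist !V(x)\) such that \(\forall y\in \Omega,f(y)\ge f(x)+(V(x),y-x)\), i.e. the supporting plane is unique. In this case \(V(x)=\nabla f(x)\).
-
If \(f\in C^2(\Omega;R)\), then \(f\) is convex \(\Leftrightarrow \forall x\in \Omega,H_f(x)\) is positive semidefinite.
It follows that for a general convex function \(f:\Omega \mapsto R\) and a point \(x_0 \in \mathring{\Omega}\) at which \(f\) is differentiable, \(x_0\) is a critical point \(\Leftrightarrow x_0\) is a global minimum point.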
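Directional derivatives of convex functions
Definition
For \(\vec{v}\in R^m\), the one-sided directional derivative is \(D_{\vec{v}}^+f(x):=\lim_{t\to 0^+}\frac{f(x+t\vec{v})-f(x)}{t}\).
If \(\Omega\) is a convex open set in \(R^m\), \(f:\Omega \mapsto R\) is convex and \(B(x,\delta)\sube \Omega\), then
-
\(\forall \vec{v}\in R^m,D_{\vec{v}}^+f(x)\) exists.
-
\(D_{a\vec{v}}^+f(x)=aD_{\vec{v}}^+f(x),\forall a>0\) and \(D_{a\vec{v}+b\vec{u}}^+f(x)\le aD_{\vec{v}}^+f(x)+bD_{\vec{u}}^+f(x),\forall a,b>0,a+b=1\).
-
\(|D_{\vec{v}}^+f(x)|\le L|\vec{v}|\) and \(|D_{\vec{v}}^+f(x)-D_{\vec{u}}^+f(x)|\le L|\vec{v}-\vec{u}|\),
where \(L\) is the Lipschitz constant of \(f\) on \(B(x,\delta)\) (convex functions are locally Lipschitz continuous).
-
\(\forall h\in B(0,\delta),f(x+h)-f(x)-D_h^+f(x)=o(h)\).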
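For a convex function \(f\) (which is locally Lipschitz), \(f\) has all partial derivatives at \(x\) \(\Leftrightarrow f\) is differentiable at \(x\). Convexity cannot be dropped here: the Lipschitz function \(\frac{xy}{\sqrt{x^2+y^2}}\) (extended by \(0\) at the origin) has both partial derivatives at \((0,0)\) but is not differentiable there.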
Implicit and inverse function theorems
The implicit function problem
Suppose \(F(x_0,y_0)=0,x_0\in R^m,y_0\in R^n\). Find \(B(x_0,\delta_x)\) (or \(B(y_0,\delta_y)\)) and a function \(y=f(x)\) (or \(x=g(y)\)) on it such that \(F(x,f(x))=0\) (or \(F(g(y),y)=0\)).
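Consider the simplest case, the linear equation \(F(x,y)=Ay+Bx-b=0\).
If \(detA\ne 0\), then \(y=A^{-1}(b-Bx)\) (and similarly when \(detB\ne 0\)).
When \(A\) is not of full rank, the equation for \(y\) has either multiple solutions or no solution. It is easy to see that \(F(x_0,y_0)=0\) rules out the case of no solution at \(x=x_0\).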
The function class \(C^k(\Omega;M_{n,l})\)
Let \(\Omega\) be an open set in \(R^m\) and \(A:x\in \Omega\mapsto A(x)= (a_{ij}(x))_{n\times l}\in M_{n,l}\). We write \(A\in C^k(\Omega;M_{n,l})\) if \(\forall 1\le i\le n,1\le j\le l,a_{ij}\in C^k(\Omega;R)\).
That is, every entry of \(A\) is a \(C^k\) function on \(\Omega\).
Fixed-point functions of contraction maps
Let \(U\sube R^m,V\sube R^n\) be open sets and let \(\Phi=(\Phi_1,\Phi_2,\cdots,\Phi_n)^T:U\times \overline{V}\mapsto R^n\) satisfy
- \(\Phi(U\times \overline{V})\sube V\)
- \(\exist 0<q<1,\forall x\in U,y,z\in \overline{V},|\Phi(x,y)-\Phi(x,z)|\le q|y-z|\). (We say that \(\Phi\) is uniformly contractive on \(\overline{V}\).)
Then
-
\(\forall x\in U,\exist !y\in V,\Phi(x,y)=y\). (This defines \(f:U\mapsto V,x\mapsto y,\forall x\in U\).)
-
If \(\Phi \in C(U\times \overline{V};V)\), then \(f\in C(U;V)\).
-
\(\forall 1\le k\le +\infty\), if \(\Phi \in C^k(U\times \overline{V};V)\), then \(f\in C^k(U;V)\). Moreover \((I-D_y\Phi)\) is invertible at every point of \(U\times V\), and
\[Df(x)=(I-(D_y\Phi)(x,y))^{-1}D_x\Phi(x,y)|_{y=f(x)} \]
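A one-dimensional illustration of the fixed-point function and of the formula \(Df(x)=(I-D_y\Phi)^{-1}D_x\Phi\): the sketch below uses the hypothetical map \(\Phi(x,y)=x+\frac{1}{2}\cos y\) with \(U=V=R\), which is uniformly contractive with \(q=\frac{1}{2}\). The map, the iteration count and the helper names are assumptions chosen for the example.

```python
import math

def phi(x, y):
    # Illustrative uniform contraction in y: |d phi / d y| <= 1/2.
    return x + 0.5 * math.cos(y)

def fixed_point(x, y_start=0.0, iterations=200):
    # Picard iteration y <- phi(x, y); converges geometrically with rate 1/2.
    y = y_start
    for _ in range(iterations):
        y = phi(x, y)
    return y

def df_formula(x):
    # (1 - D_y phi)^(-1) * D_x phi with D_y phi = -sin(y)/2 and D_x phi = 1.
    y = fixed_point(x)
    return 1.0 / (1.0 + 0.5 * math.sin(y))

def df_numeric(x, eps=1e-6):
    # Central difference of the fixed-point function f(x).
    return (fixed_point(x + eps) - fixed_point(x - eps)) / (2.0 * eps)

x0 = 0.3
y0 = fixed_point(x0)
print(y0, df_formula(x0), df_numeric(x0))
```

The derivative obtained from the formula and the one obtained by differencing the iterated fixed point should agree up to discretization error.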
The local implicit function theorem
Let \(\Omega\sube R^{m+n}\) be open and \(F\in C^k(\Omega;R^n)\). Suppose \(\vec{p_0}=(x_0,y_0)\in \Omega\) satisfies
- \(F(\vec{p_0})=0\)
- \(r((DF)(p_0))=n\) (i.e. \((DF)(p_0)\in M_{n,n+m}\) has full row rank); after reordering the variables if necessary, we assume that \((D_yF)(p_0)\) is invertible.
Then \(\exist \delta>0,\eta>0\) such that \(\overline{B}(x_0;\delta)\times \overline{B}(y_0,\eta)\sube \Omega\) and
-
\(\forall (x,y)\in \overline{B}(x_0;\delta)\times \overline{B}(y_0,\eta),det(D_yF)(x,y)\ne 0\)
-
\(\forall x\in B(x_0,\delta),\exist ! y\in B(y_0,\eta)\) such that \(F(x,y)=0\). This defines \(f:B(x_0,\delta)\mapsto B(y_0,\eta),f(x)=y\Rightarrow F(x,f(x))=0,\forall x\in B(x_0,\delta)\) and \(f(x_0)=y_0\).
In this case \(f\) is called the implicit function determined by the equation \(F(x,y)=0\) near \((x_0,y_0)\) and satisfying \(f(x_0)=y_0\).
-
\(f\in C^k(B(x_0,\delta);B(y_0,\eta))\) and \(Df(x)=-(D_yF(x,y))^{-1}(D_xF)(x,y)|_{y=f(x)}=-(D_yF)(x,f(x))^{-1}(D_xF)(x,f(x)),\forall x\in B(x_0,\delta)\).
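The derivative formula \(Df(x)=-(D_yF)^{-1}(D_xF)\) can be tried out on the unit circle. In this sketch \(F(x,y)=x^2+y^2-1\) and the base point \((0.6,0.8)\) are hypothetical choices; the implicit function is computed by Newton's method in \(y\), which is one convenient numerical way to solve \(F(x,y)=0\) and is not claimed to be the method behind the theorem's proof.

```python
def F(x, y):
    # Illustrative equation F(x, y) = 0: the unit circle.
    return x * x + y * y - 1.0

def Fx(x, y):
    return 2.0 * x

def Fy(x, y):
    return 2.0 * y

def implicit_y(x, y_guess=0.8, steps=50):
    # Solve F(x, y) = 0 for y near y_guess with Newton's method in y.
    y = y_guess
    for _ in range(steps):
        y -= F(x, y) / Fy(x, y)
    return y

def derivative_formula(x):
    # Df(x) = -(D_y F)^(-1) D_x F evaluated at y = f(x).
    y = implicit_y(x)
    return -Fx(x, y) / Fy(x, y)

def derivative_numeric(x, eps=1e-6):
    # Central difference of the implicit function itself.
    return (implicit_y(x + eps) - implicit_y(x - eps)) / (2.0 * eps)

x0 = 0.6
y0 = implicit_y(x0)
print(y0, derivative_formula(x0), derivative_numeric(x0))
```

Near \((0.6,0.8)\) the implicit function is \(f(x)=\sqrt{1-x^2}\), so the exact derivative at \(x_0=0.6\) is \(-0.75\).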
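The inverse function theorem and its applications
Definitions
Homeomorphism
Let \(A\sube R^n,B\sube R^m\). If \(f:A\mapsto B\) is a bijection such that both \(f\) and \(f^{-1}\) are continuous, i.e. \(f\in C(A;B),f^{-1}\in C(B;A)\), then \(f:A\mapsto B\) is called a homeomorphism. If there exists a homeomorphism between \(A\) and \(B\), then \(A\) and \(B\) are said to be homeomorphic.
Diffeomorphism
Let \(U,V\) be open sets in \(R^m\) and let \(f:U\mapsto V\) be a homeomorphism. If \(f\) and \(f^{-1}\) are both differentiable, then \(f:U\mapsto V\) is called a diffeomorphism. Furthermore, if \(f\in C^k(U;V),f^{-1}\in C^k(V;U)\), then \(f\) is called a diffeomorphism of class \(C^k\) (a smooth diffeomorphism when \(k=+\infty\)). In this case \(U\) and \(V\) are said to be \(C^k\) diffeomorphic.
Open maps
Let \(\Omega\) be an open set in \(R^m\). If \(f:\Omega\mapsto R^m\) maps every open subset of \(\Omega\) to an open set in \(R^m\), then \(f\) is called an open map.
Domains
A set \(G\sube R^m\) is called a domain if \(G\) is a connected open set.
Neighborhoods
\(U(x_0)\) is called a neighborhood of \(x_0\) if \(\exist \delta>0\) such that \(B(x_0,\delta)\sube U(x_0)\). Since an open set can be written as a union of pairwise disjoint connected components, we may assume that \(U(x_0)\) is a domain.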
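The open mapping theorem for differentiable maps
Let \(\Omega\) be an open set in \(R^m\) and let \(f:\Omega \mapsto R^m\) be differentiable with \(\forall x\in \Omega,det(Df(x))\ne 0\). Then \(f\) is an open map.
The condition \(\forall x\in \Omega,det(Df(x))\ne 0\) above is not necessary; in fact we have:
Brouwer's invariance of domain theorem
If \(\Omega\) is an open set in \(R^m\) and \(f:\Omega \mapsto R^m\) is continuous and locally one-to-one, then \(f\) is an open map.
Indeed, by the inverse function theorem below, for \(f\in C^1\), \(det(Df(x_0))\ne 0 \Rightarrow\) \(f\) is one-to-one near \(x_0\).
Differentiability and differentiation of the inverse function
Let \(\Omega\) be an open set in \(R^m\) and let \(f:\Omega \mapsto R^m\) be injective and differentiable with \(\forall x\in \Omega,det(Df(x))\ne 0\). Then \(f(\Omega)\) is open, \(f^{-1}:f(\Omega)\mapsto \Omega\) is differentiable, and \(D(f^{-1})(y)=(Df(f^{-1}(y)))^{-1},\forall y\in f(\Omega)\),
that is, \(D(f^{-1})(f(x))=(Df(x))^{-1},\forall x\in \Omega\).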
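The local inverse function theorem
Let \(\Omega\) be an open set in \(R^m\) and \(f\in C^k(\Omega;R^m),1\le k\le +\infty\). If \(x_0\in \Omega\) is such that \((Df)(x_0)\) is invertible, i.e. \(det(Df(x_0))\ne 0\), then there exist a neighborhood \(U(x_0)\sube \Omega\) of \(x_0\) and a neighborhood \(V(f(x_0))\sube f(\Omega)\) of \(f(x_0)\) such that \(f:U\mapsto V\) is a diffeomorphism of class \(C^k\). In addition, \(D(f^{-1})(y)=(Df(f^{-1}(y)))^{-1},\forall y\in V\).
The global inverse function theorem
Let \(\Omega\) be an open set in \(R^m\) and \(f\in C^k(\Omega;R^m),k\ge 1\). If \(f\) satisfies
- \(f\) is injective on \(\Omega\)
- \(\forall x\in \Omega,det(Df(x))\ne 0\)
then \(f:\Omega\mapsto f(\Omega)\) is a diffeomorphism of class \(C^k\) and \(D(f^{-1})(y)=(Df(f^{-1}(y)))^{-1},\forall y\in f(\Omega)\).
Applications of the inverse function theorem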
Spherical polar coordinates (spherical coordinates)
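\(x=(x_1,x_2\cdots x_m)=\Psi(r,\theta_1,\theta_2\cdots \theta_{m-1})\), where \(m\ge 2,r\ge 0,\forall 1\le i\le m-2,\theta_i\in [0,\pi],\theta_{m-1}\in [0,2\pi]\).
In the usual convention, the explicit definition is: \(x_1=r\cos\theta_1\), \(x_j=r\sin\theta_1\cdots\sin\theta_{j-1}\cos\theta_j\) for \(2\le j\le m-1\), and \(x_m=r\sin\theta_1\cdots\sin\theta_{m-2}\sin\theta_{m-1}\).
We have: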
-
When \(m=2\), \((x,y)=\Psi(r,\theta)\); \(\Psi\) is surjective on \([0,+\infty)\times [0,2\pi)\) and injective on \((0,+\infty)\times (0,2\pi)\), and
\[\frac{\partial(x,y)}{\partial(r,\theta)}=r \]
-
When \(m=3\), \(\Psi\) is surjective on \([0,+\infty)\times [0,\pi]\times [0,2\pi]\) and injective on \((0,+\infty)\times (0,\pi)\times (0,2\pi)\), and
\[\frac{\partial(x,y,z)}{\partial(r,\theta_1,\theta_2)}=r^2\sin{\theta_1} \]
-
When \(m\ge 4\), \(\Psi\) is surjective on \([0,+\infty)\times [0,\pi]^{m-2}\times [0,2\pi]\) and injective on \((0,+\infty)\times (0,\pi)^{m-2}\times (0,2\pi)\), and
\[\frac{\partial(x_1,x_2\cdots x_m)}{\partial(r,\theta_1,\theta_2\cdots \theta_{m-1})}=r^{m-1}\sin^{m-2}{\theta_1}\sin^{m-3}{\theta_2}\cdots \sin{\theta_{m-2}} \]
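The Jacobian determinants for \(m=2\) and \(m=3\) can be checked numerically. The sketch below assumes the common convention \(x=r\cos\theta,y=r\sin\theta\) in the plane and \(x=r\cos\theta_1,y=r\sin\theta_1\cos\theta_2,z=r\sin\theta_1\sin\theta_2\) in space; that convention, the sample points and the helper names are assumptions, since the notes do not spell out \(\Psi\) at this point.

```python
import math

def polar(p):
    # Assumed convention for m = 2.
    r, t = p
    return [r * math.cos(t), r * math.sin(t)]

def spherical(p):
    # Assumed convention for m = 3.
    r, t1, t2 = p
    return [r * math.cos(t1),
            r * math.sin(t1) * math.cos(t2),
            r * math.sin(t1) * math.sin(t2)]

def numerical_jacobian(func, p, eps=1e-6):
    # Central differences, one column per input coordinate.
    n, m = len(func(p)), len(p)
    J = [[0.0] * m for _ in range(n)]
    for j in range(m):
        pp, pm = list(p), list(p)
        pp[j] += eps
        pm[j] -= eps
        fp, fm = func(pp), func(pm)
        for i in range(n):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * eps)
    return J

def det(M):
    # Cofactor expansion along the first row (fine for tiny matrices).
    if len(M) == 1:
        return M[0][0]
    total = 0.0
    for j in range(len(M)):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

det_polar = det(numerical_jacobian(polar, [2.0, 0.5]))
det_spherical = det(numerical_jacobian(spherical, [2.0, 0.7, 1.1]))
print(det_polar, det_spherical, 4.0 * math.sin(0.7))
```

With these sample points the expected values are \(r=2\) and \(r^2\sin\theta_1=4\sin 0.7\).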
Homeomorphisms
- Any \(m\)-dimensional open box in \(R^m\) is \(C^{\infty}\) diffeomorphic to \(R^m\)
- Any \(m\)-dimensional open ball in \(R^m\) is \(C^{\infty}\) diffeomorphic to \(R^m\)
Using homeomorphisms to determine the dimension of a surface
Let \(P_0=(x^*,y^*)\in R^k\times R^{n-k}(1\le k<n)\). Write \(I(x^*):=\prod_{i=1}^{k}(x_i^*-\delta,x^*_i+\delta),J(y^*):=\prod_{j=1}^{n-k}(y^*_j-\delta,y^*_j+\delta)\) and suppose \(y^*=f(x^*)\).
Consider the surface \(S(P_0)=\{(x,f(x))|x\in I(x^*)\}\).
There exists a \(C^k\) diffeomorphism \(\varphi:U(P_0)(:=I(x^*)\times J(y^*)) \mapsto (-1,1)^n\) such that:
- \(\varphi(P_0)=0\)
- \(\varphi(S(P_0))=(-1,1)^k\times \{0\}\)
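The local flattening technique
For \(f:\Omega\sube R^m \mapsto R^n\) with graph \(S(f)=\{(x,f(x))|x\in \Omega\}\), the map \(\psi:\Omega\times R^n\mapsto \Omega\times R^n,\psi(x,y)=(x,y-f(x))\) satisfies \(\psi(S(f))=\Omega\times\{0\}\), and:
if \(f\in C^k(\Omega;R^n)\), then \(\psi\) is a diffeomorphism of class \(C^k\).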
Rank of a map and functional dependence
The rank theorem
Consider \(f:R^m \mapsto R^n\) and permutation matrices \(P\in M_n,Q\in M_m\).
If we define \(\hat{f}(x):=(P\circ f\circ Q)(x)\),
then \(f(x)=(P^{-1}\circ \hat{f}\circ Q^{-1})(x)\).
So, if necessary, the components of \(x\) and the components of \(f\) may be permuted.
Let \(P_0\in R^m\) and let \(U(P_0)\) be a neighborhood of \(P_0\). Let \(f\in C^k(U(P_0);R^n),k\ge 1\) satisfy \(\forall p\in U(P_0),rank(Df(p))\equiv r\).
Then:
-
If \(r=m<n\) (full column rank): there exist a neighborhood \(O(P_0)\sube U(P_0)\) of \(P_0\), an \(m\)-dimensional open box \(I\), and a \(C^k\) diffeomorphism \(\varphi:O(P_0)\mapsto I\). There also exist a neighborhood \(O(f(P_0))\sube R^n\) of \(f(P_0)\) and a \(C^k\) diffeomorphism \(\psi:O(f(P_0))\mapsto \psi(O(f(P_0)))\sube R^n\) on it, such that \(f(O(P_0))\sube O(f(P_0))\) and:
\[\psi\circ f \circ \varphi^{-1}(u)\equiv \begin{pmatrix} u \\ 0 \\ \end{pmatrix}\in R^n,\forall u\in I \]
-
If \(r=n<m\) (full row rank): there exist a neighborhood \(O(P_0)\sube U(P_0)\) of \(P_0\), an \(n\)-dimensional open box \(I\) centered at \(f(P_0)\), an \((m-n)\)-dimensional open box \(J\), and a \(C^k\) diffeomorphism \(\varphi:O(P_0)\mapsto I\times J\) such that:
\[f \circ \varphi^{-1} \begin{pmatrix} u \\ v \\ \end{pmatrix}\equiv u,\forall (u,v)\in I\times J \]
-
If \(r<\min(m,n)\): there exist a neighborhood \(O(P_0)\sube U(P_0)\) of \(P_0\), an \(r\)-dimensional open box \(I\), an \((m-r)\)-dimensional open box \(J\), a \(C^k\) diffeomorphism \(\varphi:O(P_0)\mapsto I\times J\), a neighborhood \(O(f(P_0))\sube R^n\) of \(f(P_0)\), and a \(C^k\) diffeomorphism \(\psi:O(f(P_0))\mapsto \psi(O(f(P_0)))\sube R^n\), such that \(f(O(P_0))\sube O(f(P_0))\) and:
\[\psi\circ f \circ \varphi^{-1}\begin{pmatrix} u \\ v \\ \end{pmatrix}\equiv \begin{pmatrix} u \\ 0 \\ \end{pmatrix}\in R^n,\forall (u,v)\in I\times J \]
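Functional dependence and independence
Definition
Let \(U\) be an open set in \(R^m\) and let \(f_1,f_2\cdots f_n\in C(U;R)\) be a family of functions. Write \(f=(f_1,f_2\cdots f_n)\). The functions \(f_1,f_2\cdots f_n\) are said to be functionally independent on \(U(x_0)\) if
Otherwise, \(f_1,f_2\cdots f_n\) are said to be functionally dependent on \(U(x_0)\).
Theorems on functional dependence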
TBC
