Convex Sets and Convex Functions
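Definitions and Basic Properties
- If \(E\subseteq R^m\) satisfies \((1-t)x+ty\in E\) for all \(x,y\in E\) and all \(t\in [0,1]\), then \(E\) is called a convex set in \(R^m\).
- Let \(E\) be a convex set in \(R^m\). A function \(f:E\mapsto R\) is called convex (concave) if \(f((1-t)x+ty)\le(\ge)(1-t)f(x)+tf(y),\forall x,y\in E,\forall t\in [0,1]\).
- Let \(E\) be a convex set in \(R^m\). Then \(f\) is a convex function \(\Leftrightarrow \{(x,y)|y\ge f(x),x\in E\}\) is a convex set in \(R^{m+1}\).
- By the definition of a convex function and induction on \(n\): for all \(x_1,\dots,x_n\in E\) and all \(\{t_k\}^{n}_{k=1}\) with \(t_k\ge 0,\sum_{k=1}^{n}t_k=1\), we have \(\sum_{k=1}^{n}t_kx_k\in E\) and \(f(\sum_{k=1}^{n}t_kx_k)\le \sum_{k=1}^{n}t_kf(x_k)\).
- If \(E\) is a convex set, then its closure \(\overline{E}\) is also a convex set.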
- If \(\vec{a_1},\vec{a_2}\dots \vec{a_n}\in R^m\), then any sum \(\sum_{i=1}^{n}t_i\vec{a_i}\) with \(t_i\ge 0\) for all \(1\le i \le n\) and \(\sum_{i=1}^{n}t_i=1\) is called a convex combination of \(\vec{a_1},\vec{a_2}\dots \vec{a_n}\). This leads to the notion of the convex hull.
- For \(E\subseteq R^m\), the convex hull \(Conv(E)\) of \(E\) is $$Conv(E):=\{\sum_{i=1}^{n}t_ix_i|\forall 1\le i\le n,x_i\in E,t_i\ge 0,\sum_{i=1}^{n}t_i=1,n\in N\}.$$ It is not hard to verify that \(Conv(E)\) is the smallest convex set containing \(E\).
If \(\vec{a_1},\vec{a_2}\dots \vec{a_n}\in R^m\), then \(Conv(\{\vec{a_1},\vec{a_2}\dots \vec{a_n}\})\) is a compact convex set.
Indeed, \(T=\{\vec{t}=(t_1,t_2\dots t_n)|t_i\ge 0,\sum_{i=1}^{n}t_i=1\}\) is a compact convex set. Define \(\Phi(\vec{t})=\sum_{i=1}^{n}t_i\vec{a_i}\); then \(\Phi:T\mapsto Conv(\{\vec{a_1},\vec{a_2}\dots \vec{a_n}\})\) is a continuous surjection, and the continuous image of a compact set is compact.
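- Let \(E\) be a convex set. A point \(x\in E\) is called an extreme point of \(E\) if and only if \(x\) cannot be written as a convex combination of other points of \(E\), that is, \(\forall t\in ]0,1[,\forall a,b\in E\) with \(a\ne b\), \(x\ne ta+(1-t)b\).
By definition, if \(x\in \mathring{E}\), then \(x\) cannot be an extreme point of \(E\). Hence extreme points can only lie on \(\partial E\).
- If \(K\subseteq R^m\) is a nonempty compact convex set, then \(K\) has an extreme point; if \(K\) is not a singleton, then \(K\) has at least two extreme points.
- If \(K\) is a compact convex set whose extreme points are exactly \(\vec{a_1},\vec{a_2}\dots \vec{a_n}\), then \(K=Conv(\{\vec{a_1},\vec{a_2}\dots \vec{a_n}\})\).
- The extreme points of \(\prod_{i=1}^{m}[a_i,b_i]\) are exactly its vertices.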
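The convex-combination form of the convexity inequality listed above can be checked numerically. The following is a minimal sketch, not a proof: the function `f` (squared Euclidean norm), the sample points, and all helper names are illustrative assumptions chosen for this example.

```python
import random


def convex_combination(weights, points):
    """Return sum_k t_k * x_k for points given as tuples of floats."""
    dim = len(points[0])
    return tuple(sum(t * p[i] for t, p in zip(weights, points)) for i in range(dim))


def random_weights(n, rng):
    """Draw t_1..t_n with t_k >= 0 and sum t_k = 1 (up to rounding)."""
    raw = [rng.random() + 1e-12 for _ in range(n)]
    total = sum(raw)
    return [r / total for r in raw]


def f(x):
    """Illustrative convex function (an assumption of this sketch): |x|^2."""
    return sum(c * c for c in x)


def jensen_gap(weights, points):
    """Right side minus left side of the inequality; >= 0 for convex f."""
    lhs = f(convex_combination(weights, points))
    rhs = sum(t * f(p) for t, p in zip(weights, points))
    return rhs - lhs


rng = random.Random(0)
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 3.0)]
gaps = [jensen_gap(random_weights(len(points), rng), points) for _ in range(1000)]
min_gap = min(gaps)  # expected to be nonnegative up to rounding error
```

Swapping in a non-convex `f` (for example the negative of the one above) should make `min_gap` clearly negative, which is a quick way to see that the check is not vacuous.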
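Supremum of Convex Functions
Intersection Lemma
Let \(K\subseteq R^m\) be a compact set, \(x\in K,y\in R^m,x\ne y\). Then the ray from \(y\) through \(x\) meets \(\partial K\) at a point different from \(y\). More precisely, \(\rho:=\max\{t\ge 1|y+t(x-y)\in K\}\) is well defined and \(y+\rho(x-y)\in \partial K\). In particular, if \(x\in \mathring K\), then \(\rho\ge 1+\frac{r}{2|x-y|}>1\), where \(r>0\) satisfies \(B(x,r)\subseteq K\).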
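As a concrete instance of the lemma, take \(K\) to be the closed unit disk in \(R^2\) (an assumption made only for this sketch). Then \(\rho\) is the larger root of a quadratic in \(t\). If \(B(x,r)\subseteq K\), the point \(x+\frac{r}{2}\frac{x-y}{|x-y|}\) still lies in \(K\), which gives the lower bound \(\rho\ge 1+\frac{r}{2|x-y|}\) that the code compares against. The helper name `exit_parameter` is hypothetical.

```python
import math


def exit_parameter(x, y, radius=1.0):
    """Largest t with y + t*(x - y) in the closed ball |p| <= radius.

    Assumes x lies in the ball and x != y, so that t = 1 is admissible.
    """
    d = tuple(a - b for a, b in zip(x, y))
    # |y + t d|^2 = radius^2  <=>  qa t^2 + qb t + qc = 0
    qa = sum(c * c for c in d)
    qb = 2.0 * sum(p * c for p, c in zip(y, d))
    qc = sum(p * p for p in y) - radius * radius
    disc = qb * qb - 4.0 * qa * qc
    return (-qb + math.sqrt(max(disc, 0.0))) / (2.0 * qa)


x = (0.5, 0.0)   # interior point of the unit disk
y = (3.0, 0.0)   # start of the ray, outside the disk
rho = exit_parameter(x, y)
hit = tuple(q + rho * (p - q) for p, q in zip(x, y))  # y + rho*(x - y)

r = 1.0 - math.hypot(*x)  # B(x, r) is contained in the unit disk
lower_bound = 1.0 + r / (2.0 * math.hypot(x[0] - y[0], x[1] - y[1]))
```

Here `hit` should land on the unit circle and `rho` should be at least `lower_bound`.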
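Extremum Problem
Let \(K \subseteq R^m\) be a compact convex set and \(f:K\mapsto R\) a convex function. Then:
- \(\sup_{x\in K} f(x)=\sup_{x\in \partial K}f(x)\).
- In particular, if \(f\) is continuous on \(K\), then \(\max_{x\in K} f(x)=\max_{x\in \partial K}f(x)\), i.e. the maximum is attained on the boundary.
- If \(\exists x_0\in \mathring{K}\) with \(f(x_0)=\max_{x\in K}f(x)\), then \(f\) is constant on \(K\): \(f(x)\equiv c\).
For concave functions, replace \(\sup\) by \(\inf\) and \(\max\) by \(\min\) in the statements above.
The compactness assumption cannot be dropped. A counterexample is \(K=\{(x_1,x_2\dots x_m)\in R^m|x_i\ge 0,\forall 1\le i\le m,\sum_{i=1}^{m}x_i<1\}\), \(f(x)=\frac{1}{1-\sum_{i=1}^{m}x_i}\): here \(K\) is bounded and convex but not closed, and the convex function \(f\) is unbounded above on \(K\).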
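The unboundedness in the counterexample can be seen by evaluating \(f\) along points of \(K\) that approach the missing face \(\sum_i x_i=1\). The sketch below fixes \(m=2\) and uses points with equal coordinates; both choices are assumptions made for illustration.

```python
def f(x):
    """The counterexample function 1 / (1 - sum_i x_i), defined where sum_i x_i < 1."""
    return 1.0 / (1.0 - sum(x))


m = 2
samples = []
values = []
for k in range(1, 8):
    s = 1.0 - 10.0 ** (-k)             # coordinate sum, strictly below 1
    x = tuple(s / m for _ in range(m))  # nonnegative coordinates, so x is in K
    samples.append(x)
    values.append(f(x))                 # roughly 10**k
```

The list `values` grows without bound as the sample points approach the face that was removed from \(K\), so no maximum exists.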
Extrema on Compact Convex Polytopes
If \(K=Conv(\{\vec{a_1},\vec{a_2}\dots \vec{a_n}\})\subseteq R^m\) and \(f:K\mapsto R\) is a convex function, then \(f\) is bounded on \(K\). More precisely, for all \(x\in K\), \(\max_{1\le i\le n}f(\vec{a_i})+n\min_{1\le i\le n}[f(\vec{c})-f(\vec{a_i})]\le f(x)\le \max_{1\le i\le n}f(\vec{a_i})\), where \(\vec{c}=\frac{1}{n}\sum_{i=1}^{n}\vec{a_i}\). In particular, \(\max_{x\in K}f(x)=\max_{1\le i\le n}f(\vec{a_i})\).
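The two-sided bound can be sanity-checked by sampling random convex combinations of the vertices. This is a rough sketch under stated assumptions: the quadrilateral `vertices` and the convex function `f` (a maximum of affine functions plus a square) are made up for the example and are not taken from the text.

```python
import random


def f(p):
    """Illustrative convex function on R^2: max of affine functions plus x^2."""
    x, y = p
    return max(x - y, 2.0 * y - 1.0, -x) + x * x


vertices = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.5, 2.0)]
n = len(vertices)
centroid = tuple(sum(v[i] for v in vertices) / n for i in range(2))

# Upper and lower bounds exactly as in the statement above.
upper = max(f(v) for v in vertices)
lower = upper + n * min(f(centroid) - f(v) for v in vertices)

rng = random.Random(1)
values = []
for _ in range(2000):
    raw = [rng.random() + 1e-12 for _ in range(n)]
    total = sum(raw)
    t = [r / total for r in raw]  # weights t_i >= 0 summing to 1
    point = tuple(sum(ti * v[i] for ti, v in zip(t, vertices)) for i in range(2))
    values.append(f(point))
```

Every sampled value should fall between `lower` and `upper`. The lower bound is usually far from tight, which is fine: the statement only claims boundedness.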
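Lower Bound of Convex Functions
Let \(f:E\mapsto R\) be a convex function, where \(E\) is a bounded convex set with \(\mathring{E}\ne \emptyset\). Then \(\inf_{x\in E}f(x)>-\infty\).
The assumption \(\mathring{E}\ne \emptyset\) above can be removed.
Consider the case \(\mathring{E}= \emptyset\):
- \(E\) is a singleton: the claim holds.
- \(E\) is not a singleton: \(\exists x\in E\) and a linear subspace \(W\) of dimension \(1\le k<m\) such that \(E\subseteq x+W\) and \(E-x\) has nonempty interior relative to \(W\). The previous result, applied inside \(W\), then gives the claim.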
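Convex Functions Are Locally Lipschitz
Let \(f:E\subseteq R^m\mapsto R\) be a convex function. For \(x_0=(x_1,\dots,x_m)\in E\), define \(I(x_0,\delta)=\prod_{i=1}^{m}[x_i-\delta,x_i+\delta]\).
- If \(I(x_0,2\delta)\subseteq E\), then \(f\) is bounded on \(I(x_0,\delta)\) and satisfies a Lipschitz condition there.
Take two points \(x\ne y\) in \(I(x_0,\delta)\) and the point \(z=x+\frac{\delta(x-y)}{\sqrt{m}|x-y|}\in I(x_0,2\delta)\), and let \(M=\sup_{w\in I(x_0,2\delta)}|f(w)|\), which is finite by the polytope bound above. Since \(x\) lies on the segment from \(y\) to \(z\), convexity bounds \(f(x)-f(y)\) in terms of \(f(z)-f(y)\), which yields \(f(x)-f(y)\le \frac{2M\sqrt{m}}{\delta}|x-y|\), an estimate depending only on \(|x-y|\). By the symmetry between \(x\) and \(y\), \(|f(x)-f(y)|\) satisfies the Lipschitz condition.
- If \(E\) is open and \(K\subset E\) is compact, then \(f\) satisfies a Lipschitz condition on \(K\).
Combining with the previous item: for each \(x\in K\), \(\exists \delta_x>0\) with \(I(x,2\delta_x)\subseteq E\); the interiors \(\mathring{I}(x,\delta_x)\) form an open cover of \(K\), from which a finite subcover can be extracted.
- \(f\) is continuous on \(\mathring{E}\).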
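The local Lipschitz estimate can be made explicit and tested on an example. With \(z=x+\frac{\delta}{\sqrt{m}}\frac{x-y}{|x-y|}\) and \(M=\sup_{I(x_0,2\delta)}|f|\), convexity along the segment from \(y\) to \(z\) gives \(f(x)-f(y)\le \frac{2M\sqrt{m}}{\delta}|x-y|\). The sketch below checks this constant for an assumed function \(f(x_1,x_2)=x_1^2+|x_2|\) with \(x_0=0\) and \(\delta=1\); the function and all names are illustrative choices.

```python
import math
import random


def f(p):
    """Illustrative convex function on R^2 (an assumption of this sketch)."""
    x1, x2 = p
    return x1 * x1 + abs(x2)


m = 2
delta = 1.0

# f is convex and nonnegative, so sup |f| over the box I(0, 2*delta)
# is attained at one of its four vertices.
corners = [(a, b) for a in (-2.0 * delta, 2.0 * delta) for b in (-2.0 * delta, 2.0 * delta)]
M = max(abs(f(c)) for c in corners)
lipschitz_bound = 2.0 * M * math.sqrt(m) / delta

rng = random.Random(3)
worst_ratio = 0.0  # largest observed |f(x) - f(y)| / |x - y| on I(0, delta)
for _ in range(5000):
    x = (rng.uniform(-delta, delta), rng.uniform(-delta, delta))
    y = (rng.uniform(-delta, delta), rng.uniform(-delta, delta))
    dist = math.hypot(x[0] - y[0], x[1] - y[1])
    if dist > 0.0:
        worst_ratio = max(worst_ratio, abs(f(x) - f(y)) / dist)
```

The observed `worst_ratio` should stay well below `lipschitz_bound`; the constant from the argument is valid but not sharp.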
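Convex Projection Theorem and Supporting Planes of Convex Functions
Convex Projection Theorem
Let \(K\subseteq R^m\) be a nonempty closed convex set. Then:
- \(\forall x\in R^m,\exists x^*\in K\) such that \(dist(x,K)=|x-x^*|\), and \(\forall y\in K,(x-x^*,y-x^*)\le 0\), i.e. the angle between \(x-x^*\) and \(y-x^*\) is at least \(\frac{\pi}{2}\).
- Introduce the projection map \(P:R^m\mapsto K,P(x)=x^*,\forall x\in R^m\). Then \(\forall x,y\in R^m,|P(x)-P(y)|\le |x-y|.\)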
Without loss of generality, first assume \(x\notin K\), so that \(dist(x,x^*)\ne 0\).
First we show that \(P\) is well defined. Suppose \(\exists x\in R^m,x_1^*,x_2^*\in K\) with \(dist(x,K)=dist(x,x^*_1)=dist(x,x^*_2)\).
Take \(z=\frac{1}{2}(x^*_1+x^*_2)\in K\). By the triangle inequality \(|x-z|\le |x-x^*_1|\), while \(|x-z|\ge dist(x,K)=|x-x^*_1|\) because \(z\in K\); hence \(|x-z|=|x-x^*_1|\).
By the parallelogram law, \(2(|x-x^*_1|^2+|x-x^*_2|^2)=|x^*_1-x^*_2|^2+4|x-z|^2\).
\(\Rightarrow|x^*_1-x^*_2|^2=0 \Rightarrow x^*_1=x^*_2\).
Next, fix \(x\in R^m,y\in K\) and define \(f:[0,1]\mapsto R,f(t)=|x-x^*-t(y-x^*)|^2,\forall t\in[0,1]\).
\(\forall t\in ]0,1]\), \(x^*+t(y-x^*)\in K\), so by the minimality of \(|x-x^*|\) we have \(f(t)\ge f(0)\).
\(\Rightarrow f'_+(0)\ge 0 \Rightarrow -2(x-x^*,y-x^*)\ge 0 \Rightarrow (x-x^*,y-x^*)\le 0\).
Finally, \(\forall x,y\in R^m,|P(x)-P(y)|^2=(P(x)-P(y),P(x)-P(y))=(P(x)-x,P(x)-P(y))+(x-y,P(x)-P(y))+(y-P(y),P(x)-P(y))\), where \((P(x)-x,P(x)-P(y))\le 0,(y-P(y),P(x)-P(y))\le 0\) by the inequality just proved, applied with the points \(P(y)\in K\) and \(P(x)\in K\) respectively.
\(\Rightarrow |P(x)-P(y)|^2\le (x-y,P(x)-P(y))\le |x-y||P(x)-P(y)|\) by the Cauchy-Schwarz inequality.
\(\Rightarrow |P(x)-P(y)|\le |x-y|\).
- Let \(E\subseteq R^m\) be a closed convex set and \(f\in C(E;R^n)\). Setting \(F=f\circ P\), we have \(F|_E=f\) and \(F\in C(R^m;R^n)\).
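- Separation
Let \(K\subseteq R^m\) be a nonempty closed convex set and \(x\notin K\). Then \(\exists s\in R^m,s\ne 0,(s,x)>\sup_{y\in K}(s,y)\). That is, in the direction \(s\), \(x\) lies above \(K\).
Let \(s=x-P(x)\ne 0\). Then \(\forall y\in K,(s,y-P(x))\le 0 \Rightarrow (s,y-x+s)\le 0 \Rightarrow (s,x)\ge |s|^2+(s,y)\), and hence \((s,x)\ge |s|^2+\sup_{y\in K}(s,y)>\sup_{y\in K}(s,y)\).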
- Corollary of separation
If \(K\) is a convex set and \(x\in \partial K\), then \(\exists s\in R^m,s \ne 0\) such that \((s,x)\ge (s,y),\forall y\in K\).
Since \(K\) is convex, \(\partial K=\partial \overline{K} \Rightarrow \exists \{x_n\}\subseteq \overline{K}^c,x_n\to x\).
By separation applied to \(\overline{K}\), \(\exists s_n,|s_n|=1\) such that \((s_n,x_n)\ge (s_n,y),\forall y\in K\).
\(\{s_n\}\) has a subsequence converging to some \(s\) with \(|s|=1\) \(\Rightarrow (s,x)\ge (s,y),\forall y\in K\).
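The projection and separation statements above can be tried out on a set where \(P\) has a closed form. The sketch below assumes \(K=[0,1]^3\), for which the Euclidean projection is coordinate-wise clamping. It checks the obtuse-angle inequality, the estimate \(|P(x)-P(z)|\le |x-z|\), and the separating direction \(s=x-P(x)\). All function names are hypothetical helpers. The supremum of \((s,y)\) over the box is taken over its vertices, which suffices because a linear function is convex and so attains its maximum over a polytope at a vertex.

```python
import itertools
import math
import random


def project_box(x, lo=0.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^m via coordinate-wise clamping."""
    return tuple(min(max(c, lo), hi) for c in x)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


rng = random.Random(2)
m = 3
worst_inner = -math.inf  # largest observed (x - P(x), y - P(x)); expected <= 0
worst_ratio = 0.0        # largest observed |P(x) - P(z)| / |x - z|; expected <= 1
for _ in range(2000):
    x = tuple(rng.uniform(-3.0, 4.0) for _ in range(m))
    z = tuple(rng.uniform(-3.0, 4.0) for _ in range(m))
    y = tuple(rng.random() for _ in range(m))  # a point of K = [0, 1]^m
    px, pz = project_box(x), project_box(z)
    worst_inner = max(worst_inner, dot(sub(x, px), sub(y, px)))
    if norm(sub(x, z)) > 0.0:
        worst_ratio = max(worst_ratio, norm(sub(px, pz)) / norm(sub(x, z)))

# Separation: s = x0 - P(x0) separates x0 from K, with margin |s|^2.
x0 = (2.0, -1.0, 0.5)
s = sub(x0, project_box(x0))
sup_K = max(dot(s, v) for v in itertools.product((0.0, 1.0), repeat=m))
gap = dot(s, x0) - sup_K  # equals |s|^2 here, hence strictly positive
```

For this box the margin `gap` coincides with `dot(s, s)`, matching the inequality \((s,x)\ge |s|^2+\sup_{y\in K}(s,y)\) with equality.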
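Supporting Planes of Convex Functions
If \(f:E\subseteq R^m\mapsto R\) is a convex function, then \(\forall x\in \mathring{E},\exists v(x)\in R^m\) such that \(\forall y\in E,f(y)\ge f(x)+v(x)\cdot (y-x)\). Here \(v(x)\) determines the supporting plane of \(f\) at \(x\).
In particular, when \(m=1\), \(v(x)\) lies between \(f'_-(x)\) and \(f'_+(x)\), and \(v(x)\) is unique \(\Leftrightarrow f\) is differentiable at \(x\).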
Let \(K=\{(x,z)|z\ge f(x)\}\), which is a convex set in \(R^{m+1}\) \(\Rightarrow (x,f(x))\in \partial K\). By the corollary of separation, \(\exists (s,\alpha)\in R^{m+1},(s,\alpha)\ne 0\) such that \((s,x)+\alpha f(x)\ge (s,y)+\alpha r,\forall (y,r)\in K\).
If \(\alpha >0\), letting \(r\to +\infty\) gives a contradiction.
If \(\alpha =0\), then \(s\ne 0\). Since \(x\in \mathring{E},\exists \delta >0\) with \(x+\delta s\in E\).
\(\Rightarrow (s,x)+\alpha f(x)\ge (s,x+\delta s)+\alpha f(x+\delta s) \Rightarrow 0<\delta|s|^2\le \alpha (f(x)-f(x+\delta s)) \Rightarrow \alpha\ne 0\), a contradiction.
Hence \(\alpha<0\). Taking \(r=f(y)\) gives \(f(y)\ge \frac{1}{|\alpha|}(s,y-x)+f(x)=f(x)+(\frac{s}{|\alpha|},y-x)\).
Since \((s,\alpha)\) depends on \(x\), we set \(v(x)=\frac{s}{|\alpha|}\).
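For \(m=1\) the supporting inequality is easy to test on a grid. The sketch below assumes \(f(t)=|t|\), which has \(f'_-(0)=-1\) and \(f'_+(0)=1\), so every slope \(v\in[-1,1]\) should support \(f\) at \(0\), while at the differentiable point \(t=2\) only \(v=1\) should. The helper `supports` and the sample grid are illustrative choices.

```python
def f(t):
    """Illustrative convex function on R (an assumption of this sketch)."""
    return abs(t)


def supports(v, x, ys, tol=1e-12):
    """True if f(y) >= f(x) + v*(y - x) holds at every sample point in ys."""
    return all(f(y) >= f(x) + v * (y - x) - tol for y in ys)


ys = [k / 10.0 for k in range(-50, 51)]  # grid on [-5, 5]

# At the kink x = 0, every slope between the one-sided derivatives works.
inside = [supports(v, 0.0, ys) for v in (-1.0, -0.3, 0.0, 0.7, 1.0)]
# Slopes outside [-1, 1] violate the inequality somewhere on the grid.
outside = [supports(v, 0.0, ys) for v in (-1.2, 1.2)]
# At x = 2 the function is differentiable and only the slope f'(2) = 1 works.
at_two = [supports(v, 2.0, ys) for v in (0.9, 1.0, 1.1)]
```

This illustrates both halves of the one-dimensional remark: non-uniqueness of \(v(x)\) at a kink and uniqueness where \(f\) is differentiable.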
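Geometric Interpretation of Separation and Supporting Planes
- Hyperplanes in \(R^m\): if \(\xi\in R^m,\xi\ne 0,t\in R\), then the set \(H:=\{x\in R^m|x\cdot \xi =t\}\) is called a hyperplane in \(R^m\). Relative to this hyperplane, define \(H^+=\{x\in R^m|x\cdot \xi\ge t\},H^-=\{x\in R^m|x\cdot \xi\le t\}\), called the two half-spaces of the hyperplane. Here \(\xi\) is a normal vector of \(H\).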
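The two results above can be interpreted in terms of hyperplanes:
- \(K\) is a closed convex set and \(x\notin K\); let \(\xi=x-P(x)\). By the convex projection theorem, \(\exists t\in R,(x,\xi)>t>\sup_{y\in K}(y,\xi)\Rightarrow x\in H^+,K\subset H^-\).
- \(K\) is a convex set and \(x_0\in \partial K\); then \(\exists \xi\in R^m,\xi\ne 0\) such that \((\xi,x_0)\ge (\xi,y),\forall y\in K\).
Setting \(H=\{x\in R^m|(\xi,x)=(\xi,x_0)\}\), we have \(x_0\in \partial K,x_0\in H\) and \(K\subseteq H^-\). A plane \(H\) satisfying these two conditions is called a supporting plane of \(K\).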
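- Supporting planes of convex functions from the geometric point of view.
- When \(m=1\), we have \(f(y)\ge f(x)+v(x)(y-x),\forall y\in E\). In this case the supporting plane is a line \(L\), determined by the normal vector \((v(x),-1)\) and the point \((x,f(x))\). Looking along \((v(x),-1)\), we have \(\{(y,f(y))|y\in E\}\subseteq L^-\).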
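- When \(m\ge 2\), write \(R^{m+1}=\{(x,z)|x\in R^m,z\in R\}\).
By the supporting plane of a convex set, there exist \(\vec{\xi}\ne 0\) and a hyperplane \(H\) through \((x,f(x))\) such that \(\{(y,f(y))|y\in E\}\subseteq H^-\).
Since the set \(K=\{(y,z)|y\in E,z\ge f(y)\}\) lies in \(H^-\) and is unbounded in the direction \(\vec{e_{m+1}}\), we have \(\vec{\xi}\cdot \vec{e_{m+1}}\le 0\), where \(\vec{e_{m+1}}\) is the unit vector of the last coordinate.
Since the last component of \(\vec{\xi}\) is nonzero for \(x\in \mathring{E}\) (as shown in the proof above), we can normalize it, so that \(\vec{\xi}=(v,-1)\).
Thus \(H\) is determined by \((v,-1)\) and the point \((x,f(x))\), i.e. \(\forall (y,z)\in R^{m+1},(y,z)\in H\Rightarrow (y-x,z-f(x))\cdot (v,-1)=0\Rightarrow z=f(x)+v\cdot (y-x)\).
Therefore \(\{(y,f(y))|y\in E\}\subseteq H^-\Rightarrow f(y)\ge f(x)+v\cdot (y-x)\).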