Inequalities

Notation

Matrix \(A \in \mathbb{R}^{m \times n}\)
\(\|A\|\): the spectral norm of \(A\)
\(\|A\|_*\): the nuclear norm of \(A\)
\(\|A\|_F\): the Frobenius norm of \(A\)
\(\mathrm{rank}(\cdot)\): the rank of a matrix.

[Jensen’s inequality]

\(f(\theta x + (1-\theta)y) \le \theta f(x)+(1-\theta)f(y)\)

If \(f:\mathbb{R}^n \rightarrow \mathbb{R}\) is convex, \(\theta \in [0, 1]\), and \(x,y\in \mathrm{dom}f\), then:

\[f(\theta x + (1-\theta)y) \le \theta f(x)+(1-\theta)f(y) \]

In fact, this is just the definition of a convex function; it is the most basic form of Jensen’s inequality.

\(f(\theta_1 x_1 + \ldots + \theta_k x_k) \le \theta_1 f(x_1)+\ldots+ \theta_k f(x_k)\)

If \(f:\mathbb{R}^n \rightarrow \mathbb{R}\) is convex, \(\theta_i \in [0, 1], \sum \limits_{i=1}^k \theta_i =1\), and \(x_1, \ldots, x_k \in \mathrm{dom}f\), then:

\[f(\theta_1 x_1 + \ldots + \theta_k x_k) \le \theta_1 f(x_1)+\ldots+ \theta_k f(x_k) \]

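Proof: by induction on \(k\); the case \(k=2\) is the definition of convexity. When \(\theta_1 = 1\) the inequality is an identity, and when \(\theta_1 = 0\) it reduces to the case of \(k-1\) points, so assume \(\theta_1 \in (0,1)\).
Let \(\theta = \theta_1, \theta' = 1-\theta\), \(x = x_1, \theta'y = \theta_2 x_2 + \ldots + \theta_k x_k\). By the definition of a convex function:

\[f(\theta x + \theta' y) \le \theta f(x) + \theta' f(y) \]

Since \(\sum \limits_{i=2}^k \theta_i / \theta'=1\), the point \(y\) is a convex combination of \(x_2, \ldots, x_k\) satisfying the same conditions, so \(f(y) \le \sum \limits_{i=2}^k (\theta_i/\theta') f(x_i)\) by the induction hypothesis, and the inequality above follows by mathematical induction.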
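As a quick numerical sanity check of the finite form, the following sketch evaluates the gap \(\sum_i \theta_i f(x_i) - f(\sum_i \theta_i x_i)\) for two convex functions. The helper name `jensen_gap`, the random points and the choice of \(f\) are illustrative assumptions, not part of the original note.

```python
import math
import random

def jensen_gap(f, xs, thetas):
    """Return sum(theta_i * f(x_i)) - f(sum(theta_i * x_i)).

    For a convex f and weights theta_i >= 0 summing to 1 this should be >= 0.
    """
    mean_point = sum(t * x for t, x in zip(thetas, xs))
    return sum(t * f(x) for t, x in zip(thetas, xs)) - f(mean_point)

rng = random.Random(0)                       # fixed seed (illustrative choice)
xs = [rng.uniform(-2.0, 2.0) for _ in range(5)]
raw = [rng.random() for _ in range(5)]
thetas = [r / sum(raw) for r in raw]         # normalise so the weights sum to 1

gap_exp = jensen_gap(math.exp, xs, thetas)          # f(x) = e^x is convex
gap_sq = jensen_gap(lambda x: x * x, xs, thetas)    # f(x) = x^2 is convex
print(gap_exp, gap_sq)
```

For \(f(x)=x^2\) the gap is exactly the weighted variance of the points, which is one way to see why it cannot be negative.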

\(f(\int_S p(x)x \: \mathrm{d}x) \le \int_S f(x)p(x) \mathrm{d}x\)

If \(p(x) \ge 0\) on \(S \subseteq \mathrm{dom} f\) and \(\int_S p(x) \: \mathrm{d}x=1\), then, provided the corresponding integrals exist:

\[f(\int_S p(x)x \: \mathrm{d}x) \le \int_S f(x)p(x) \mathrm{d}x \]

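Proof sketch (note: only a sketch):
Let \(\theta_i = p(x_i) \Delta x_i, i=1,2,\ldots,k\), with \(\sum \limits_{i=1}^k \theta_i =1\) (this can be arranged at least when \(p(x)\) is continuous, e.g. by choosing each \(x_i\) via the mean value theorem for integrals). Then the second (finite) form of Jensen’s inequality gives:

\[f(\sum \limits_{i=1}^k p(x_i)\Delta x_i x_i) \le \sum \limits_{i=1}^k p(x_i)\Delta x_i f(x_i) \]

Letting \(\max |\Delta x_i| \rightarrow 0\) yields the integral form of the inequality (this step exchanges a limit with the function \(f\); since a convex function \(f:\mathbb{R}^n \rightarrow \mathbb{R}\) is continuous, the exchange is allowed, if I am not mistaken).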
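The Riemann-sum argument can be mimicked numerically. The sketch below assumes a uniform density \(p(x)=1\) on \([0,1]\) and \(f=\exp\) (both illustrative choices) and compares midpoint sums for the two sides; `riemann_jensen` is a hypothetical helper name.

```python
import math

def riemann_jensen(f, p, a, b, k):
    """Midpoint Riemann sums for f(int p(x) x dx) and int f(x) p(x) dx on [a, b]."""
    dx = (b - a) / k
    mids = [a + (i + 0.5) * dx for i in range(k)]
    lhs = f(sum(p(x) * x * dx for x in mids))   # f applied to the approximate mean
    rhs = sum(f(x) * p(x) * dx for x in mids)   # approximate mean of f
    return lhs, rhs

# Assumption: uniform density on [0, 1], convex f = exp.
lhs, rhs = riemann_jensen(math.exp, lambda x: 1.0, 0.0, 1.0, 1000)
print(lhs, rhs)
```

Here the left side approximates \(e^{1/2}\) and the right side approximates \(e-1\), consistent with the inequality.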

\(f(\mathrm{E}x) \le \mathrm{E}(f(x))\)

If \(x\) is a random variable with \(x \in \mathrm{dom}f\) with probability 1, \(f\) is convex, and the corresponding expectations exist, then:

\[f(\mathrm{E}x) \le \mathrm{E}(f(x)) \]

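Proof:
Let \(S = \mathrm{dom}f\) and let \(p(x)\) be the probability density function of the random variable \(x\), so that \(\int_S p(x) \: \mathrm{d}x=1\). The integral form of Jensen’s inequality then gives: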

\[f(\mathrm{E}x) \le \mathrm{E}(f(x)) \]

[Young's inequality] \(ab \le \frac{a^p}{p} + \frac{b^q}{q}\)

See: Young's inequality (Wikipedia)

Let \(p,q \in (1, +\infty)\) be real numbers satisfying:

\[\frac{1}{p} + \frac{1}{q} = 1 \]

and let \(a, b \ge 0\) be real numbers. Then:

\[ab \le \frac{a^p}{p} + \frac{b^q}{q} \]

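Proof 1:

For \(x \in \mathbb{R}^+, \alpha \in (0, 1)\), we have \(x^{\alpha} \le 1 + \alpha (x-1)\) (because \(x^{\alpha}\) is concave and the right-hand side is its tangent line at the point \((1, 1)\)). Setting \(x = b/a, \alpha = 1/q\) with \(a, b > 0\) and multiplying both sides by \(a\) gives:

\[a^{1/p}b^{1/q} \le \frac{a}{p} + \frac{b}{q} \]

Substituting \(a:=a^p, b:=b^q\) gives the claim. The inequality clearly also holds when \(a=0\) or \(b=0\), which completes the proof.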

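Proof 2:
Consider the curve in the \(Oxy\) plane defined by \(y=x^{p-1}\), which can also be written as \(x=y^{\frac{1}{p-1}}=y^{q-1}\). Integrating gives:

\[S_1 = \int_0^a y \mathrm{d}x = \int_0^a x^{p-1} \mathrm{d}x = \frac{a^p}{p} \\ S_2 = \int_0^b x \mathrm{d}y = \int_0^b y^{q-1} \mathrm{d}y = \frac{b^q}{q} \]

The two regions together cover the rectangle \([0,a] \times [0,b]\), so clearly:

\[ab \le S_1 + S_2 = \frac{a^p}{p} + \frac{b^q}{q} \]

Equality holds only when \(b^q = a^p\) (that is, \(b = a^{p-1}\)). This completes the proof.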
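A small numerical check of Young's inequality, including the equality case \(b = a^{p-1}\) from the area argument. The function name `young_gap` and the sampling ranges are illustrative assumptions.

```python
import random

def young_gap(a, b, p):
    """Return a^p/p + b^q/q - a*b, where q = p/(p-1) is the conjugate exponent (p > 1)."""
    q = p / (p - 1.0)
    return a ** p / p + b ** q / q - a * b

rng = random.Random(1)  # fixed seed (illustrative choice)
gaps = [
    young_gap(rng.uniform(0.0, 3.0), rng.uniform(0.0, 3.0), rng.uniform(1.5, 4.0))
    for _ in range(1000)
]
min_gap = min(gaps)     # expected to be non-negative up to rounding

# Equality case: b = a^(p-1), i.e. a^p = b^q.
a, p = 2.0, 3.0
equality_gap = young_gap(a, a ** (p - 1.0), p)
print(min_gap, equality_gap)
```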

[Holder's inequality] \(\|xy\|_1 \le \|x\|_p \|y\|_q\)

See: Hölder's inequality (Wikipedia)

Discrete form

Let \(p, q \in (1, +\infty)\) with \(\frac{1}{p}+\frac{1}{q}=1\), and let \(x, y \in C^{n}\), where \(C\) denotes the field of complex numbers. Then:

\[\|xy\|_1 = \sum \limits_{i=1}^n |x_iy_i| \le (\sum \limits_{i=1}^n |x_i|^p)^{\frac{1}{p}}(\sum \limits_{i=1}^n |y_i|^q)^{\frac{1}{q}} = \|x\|_p \|y\|_q \]

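Note that an \(m \times n\) matrix can be regarded as an \(mn\)-dimensional vector.

Proof: if \(x=0\) or \(y=0\) the inequality is trivial; otherwise let

\[a_k = \frac{|x_k|}{(\sum \limits_{i=1}^{n} |x_i|^p)^{\frac{1}{p}}}, b_k = \frac{|y_k|}{(\sum \limits_{i=1}^{n} |y_i|^q)^{\frac{1}{q}}} \]

Then \(\sum \limits_{k=1}^n a_k^p = 1, \sum_{k=1}^n b_k^q = 1\). Summing Young's inequality \(a_kb_k \le \frac{a_k^p}{p} + \frac{b_k^q}{q}\) over \(k\) gives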

\[\sum \limits_{k=1}^n a_k b_k \le \frac{\sum \limits_{k=1}^{n}a_k^p}{p} + \frac{\sum \limits_{k=1}^{n}b_k^q}{q} = \frac{1}{p} + \frac{1}{q}=1 \]

\[\frac{\sum \limits_{i=1}^n |x_i||y_i|}{(\sum \limits_{i=1}^n |x_i|^p)^{\frac{1}{p}}(\sum \limits_{i=1}^n |y_i|^q)^{\frac{1}{q}}} \le1 \]

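Multiplying through by the denominator proves the claim.
It is also worth mentioning that if the two series on the right-hand side converge as \(n \rightarrow + \infty\), then the inequality also holds in the limit \(n \rightarrow +\infty\).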
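The discrete form is easy to test on random complex vectors. In this sketch `p_norm` and `holder_sides` are hypothetical helper names, and the exponents tried are arbitrary.

```python
import random

def p_norm(v, p):
    """l^p norm of a sequence of (possibly complex) numbers."""
    return sum(abs(z) ** p for z in v) ** (1.0 / p)

def holder_sides(x, y, p):
    """Return (sum |x_i y_i|, ||x||_p * ||y||_q) with 1/p + 1/q = 1 and p > 1."""
    q = p / (p - 1.0)
    lhs = sum(abs(a * b) for a, b in zip(x, y))
    return lhs, p_norm(x, p) * p_norm(y, q)

rng = random.Random(2)  # fixed seed (illustrative choice)
x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(8)]
y = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(8)]
results = [holder_sides(x, y, p) for p in (1.5, 2.0, 3.0, 7.0)]
for lhs, rhs in results:
    print(lhs, "<=", rhs)
```

With \(p=q=2\) this is the Cauchy-Schwarz inequality.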

Integral form

Let \(p, q \in (1, +\infty)\) with \(\frac{1}{p}+\frac{1}{q}=1\), and let \(x(t), y(t)\) be functions of \(t\in [t_0, t_1]\) such that

\[\int_{t_0}^{t_1}|x(t)y(t)|\mathrm{d}t,\: [\int_{t_0}^{t_1}|x(t)|^p\mathrm{d}t]^{\frac{1}{p}}, \:[\int_{t_0}^{t_1}|y(t)|^q\mathrm{d}t]^{\frac{1}{q}} \]

all exist. Then

\[\int_{t_0}^{t_1}|x(t)y(t)|\mathrm{d}t \le [\int_{t_0}^{t_1}|x(t)|^p\mathrm{d}t]^{\frac{1}{p}} [\int_{t_0}^{t_1}|y(t)|^q\mathrm{d}t]^{\frac{1}{q}} \]

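Proof: assume the integrals of \(|x|^p\) and \(|y|^q\) are both non-zero (otherwise the claim is trivial) and, for each \(t\), let

\[a = \frac{|x(t)|}{[\int_{t_0}^{t_1}|x(t)|^p\mathrm{d}t]^{\frac{1}{p}}}, \quad b = \frac{|y(t)|}{[\int_{t_0}^{t_1}|y(t)|^q\mathrm{d}t]^{\frac{1}{q}}} \]

Then \(\int_{t_0}^{t_1}a^p \mathrm{d}t=1, \: \int_{t_0}^{t_1}b^q \mathrm{d}t=1\), and integrating Young's inequality \(ab\le \frac{a^p}{p} + \frac{b^q}{q}\) over \([t_0, t_1]\) gives: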

\[\int_{t_0}^{t_1}ab \mathrm{d}t \le 1 \]

\[\int_{t_0}^{t_1}|x(t)y(t)|\mathrm{d}t \le [\int_{t_0}^{t_1}|x(t)|^p\mathrm{d}t]^{\frac{1}{p}} [\int_{t_0}^{t_1}|y(t)|^q\mathrm{d}t]^{\frac{1}{q}} \]

This completes the proof.
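The integral form can be illustrated with midpoint sums. The sketch assumes \(x(t)=\sin t\), \(y(t)=e^{-t}\) on \([0,\pi]\) and \(p=3\); these choices and the helper name `integral_holder_sides` are illustrative only. Note that the midpoint sums are themselves finite sums obeying the discrete inequality, so the check cannot fail due to discretisation.

```python
import math

def integral_holder_sides(x, y, t0, t1, p, k=2000):
    """Midpoint-rule approximations of both sides of the integral Hölder inequality."""
    q = p / (p - 1.0)                      # conjugate exponent, requires p > 1
    dt = (t1 - t0) / k
    ts = [t0 + (i + 0.5) * dt for i in range(k)]
    lhs = sum(abs(x(t) * y(t)) for t in ts) * dt
    norm_x = (sum(abs(x(t)) ** p for t in ts) * dt) ** (1.0 / p)
    norm_y = (sum(abs(y(t)) ** q for t in ts) * dt) ** (1.0 / q)
    return lhs, norm_x * norm_y

# Assumption: x(t) = sin t, y(t) = exp(-t) on [0, pi], p = 3.
lhs, rhs = integral_holder_sides(math.sin, lambda t: math.exp(-t), 0.0, math.pi, p=3.0)
print(lhs, "<=", rhs)
```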

[trace-nuclear] \(\mathrm{Tr}(A^TB) \le \|A\|\|B\|_*\)

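Proof:
By the dual characterization of \(\|B\|_*\), for any \(\alpha \ge 0\):

\[\|B\|_* = \sup \{\mathrm{Tr}(A^TB)| \|A\| \le 1\} = \sup \{\mathrm{Tr}(A^TB)| \|A\| = 1\} \\ \Rightarrow \alpha \|B\|_* \ge \alpha\mathrm{Tr}(A^TB), \quad \|A\| =1 \]

Replacing \(A\) by \(\alpha A\) (so that \(\|A\| = \alpha\)) gives

\[\|A\|\|B\|_* \ge \mathrm{Tr}(A^TB) \]

Since \(B\) is arbitrary, the inequality holds for all \(A,B\) (provided, of course, that the matrix product is defined).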
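For real \(2 \times 2\) matrices the singular values have a closed form (\(\sigma_1^2+\sigma_2^2 = \|A\|_F^2\) and \(\sigma_1\sigma_2 = |\det A|\)), which allows a dependency-free check of the trace inequality. This is only a sketch for the \(2 \times 2\) case; the helper names are hypothetical, and a general check would need an SVD routine.

```python
import math
import random

def norms_2x2(m):
    """Spectral and nuclear norms of a real 2x2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    fro_sq = a * a + b * b + c * c + d * d
    det = abs(a * d - b * c)
    s_sum = math.sqrt(fro_sq + 2.0 * det)               # sigma_1 + sigma_2
    s_diff = math.sqrt(max(fro_sq - 2.0 * det, 0.0))    # sigma_1 - sigma_2
    return (s_sum + s_diff) / 2.0, s_sum                # (spectral, nuclear)

def trace_atb(a, b):
    """Tr(A^T B), which equals the sum of entrywise products."""
    return sum(a[i][j] * b[i][j] for i in range(2) for j in range(2))

rng = random.Random(3)  # fixed seed (illustrative choice)

def rand_mat():
    return [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(2)]

# Smallest value of ||A|| * ||B||_* - Tr(A^T B) over random pairs; should be >= 0.
worst = min(
    norms_2x2(A)[0] * norms_2x2(B)[1] - trace_atb(A, B)
    for A, B in ((rand_mat(), rand_mat()) for _ in range(1000))
)
print(worst)
```

Taking \(A=B=I\) gives \(\mathrm{Tr}(A^TB)=2=\|A\|\|B\|_*\), so the bound is attained.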

[Arithmetic-geometric mean inequality] \(a^{\theta}b^{1-\theta} \le \theta a +(1-\theta)b\)

If \(a,b\ge 0\) and \(\theta \in [0, 1]\), then

\[a^{\theta}b^{1-\theta} \le \theta a +(1-\theta)b \]

When \(\theta = 1/2\), this reads \(\sqrt{ab} \le (a+b)/2\).

Proof 1: since \(-\log x\) is convex on \((0, +\infty)\), [Jensen’s inequality] gives, for \(a, b > 0\) (the cases \(a=0\) or \(b=0\) are immediate):

\[-\log (\theta a + (1-\theta)b) \le -\theta \log(a) -(1-\theta) \log(b) \]

Exponentiating both sides gives:

\[\big(\theta a+(1-\theta)b\big)^{-1} \le (a^{\theta}b^{(1-\theta)})^{-1} \]

Hence

\[a^{\theta}b^{1-\theta} \le \theta a +(1-\theta)b \]

Proof 2:
By [Young's inequality]:

\[ab \le \frac{a^p}{p} + \frac{b^q}{q} \]

Substituting \(a := a^{\theta}, b := b^{1-\theta}\), \(p = 1/\theta,q=1/(1-\theta)\) with \(\theta \in (0,1)\) (so that \(p,q\) satisfy the required conditions) gives:

\[a^{\theta}b^{1-\theta} \le \theta a +(1-\theta)b \]
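A minimal numerical check of the weighted inequality, with the familiar case \(\theta=1/2\) worked explicitly. The name `am_gm_gap` and the sampled ranges are illustrative assumptions.

```python
import random

def am_gm_gap(a, b, theta):
    """Return theta*a + (1-theta)*b - a^theta * b^(1-theta); non-negative for a, b >= 0."""
    return theta * a + (1.0 - theta) * b - a ** theta * b ** (1.0 - theta)

rng = random.Random(4)  # fixed seed (illustrative choice)
min_gap = min(
    am_gm_gap(rng.uniform(0.0, 10.0), rng.uniform(0.0, 10.0), rng.random())
    for _ in range(1000)
)

# theta = 1/2 with a = 4, b = 9: (4 + 9)/2 - sqrt(4 * 9) = 6.5 - 6 = 0.5
half_case = am_gm_gap(4.0, 9.0, 0.5)
print(min_gap, half_case)
```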

[Gibbs' inequality] \(-\sum \limits_{i=1}^np_i \log p_i \le -\sum \limits_{i=1}^n p_i\log q_i\)

Suppose \(P=\{p_1, \ldots, p_n\}\) and \(Q=\{q_1, \ldots, q_n\}\) are probability distributions. Then the following inequality holds:

\[-\sum \limits_{i=1}^np_i \log p_i \le -\sum \limits_{i=1}^n p_i\log q_i \]

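Equivalently:

\[\sum \limits_{i=1}^np_i \log p_i \ge \sum \limits_{i=1}^n p_i\log q_i \]

which is in turn equivalent to:

\[-\sum \limits_{i=1}^n p_i \log \frac{p_i}{q_i} \le 0 \]

Equality holds if and only if \(p_i=q_i\) for all \(i\).

This says that the KL divergence is non-negative:

\[D(P\|Q)=-\sum_{i=1}^n p_i\ln \frac{q_i}{p_i} \ge 0 \]

See: Gibbs' inequality (Wikipedia)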

Proof 1:

Since \(\log a = \frac{\ln a}{\ln 2}\), it suffices to prove the inequality for \(\ln\).
Let \(I\) denote the set of indices with \(p_i > 0\). Since \(\ln x \le x-1\) for \(x>0\):

\[-\sum \limits_{i \in I} p_i \ln \frac{q_i}{p_i} \ge -\sum \limits_{i \in I} p_i (\frac{q_i}{p_i}-1) =-\sum \limits_{i \in I} q_i +1 \ge 0 \]

With the convention \(0\ln0=0\), the inequality holds over all indices. Moreover, \(\ln x = x-1\) only when \(x=1\), so equality requires \(p_i=q_i, i\in I\). Since \(\sum_{i\in I} p_i=1\), this gives \(\sum_{i\in I} q_i=1\), hence \(p_i=q_i=0, i \not \in I\), and therefore \(p_i =q_i, i=1,2,\ldots, n\).

Proof 2:

Since \(-\log\) is strictly convex, [Jensen’s inequality] gives:

\[\sum_i p_i \log \frac{q_i}{p_i} \le \log \sum_i p_i \frac{q_i}{p_i} = 0 \]

By the equality condition of [Jensen’s inequality], equality requires:

\[\frac{p_1}{q_1} = \frac{p_2}{q_2} =\cdots =\frac{p_n}{q_n} \]

Since \(\sum_i q_i=\sum_i p_i =1\), equality holds exactly when \(p_i=q_i\); the case \(p_i=0\) is handled as in Proof 1.

Naturally, the inequality extends to the integral form:

\[D(P\| Q)=-\int p(x) \log \frac{q(x)}{p(x)} \mathrm{d}x \ge 0 \]
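The discrete KL divergence with the convention \(0\ln 0=0\) can be computed directly. In this sketch the distributions `P` and `Q` are made-up examples, and the code assumes \(q_i>0\) wherever \(p_i>0\).

```python
import math

def kl_divergence(p, q):
    """D(P||Q) = sum_i p_i * ln(p_i / q_i), skipping terms with p_i = 0 (0 ln 0 = 0).

    Assumes q_i > 0 whenever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

P = [0.5, 0.25, 0.25, 0.0]      # illustrative distribution with a zero entry
Q = [0.25, 0.25, 0.25, 0.25]    # uniform distribution

d_pq = kl_divergence(P, Q)      # only the first term contributes: 0.5 * ln 2
d_pp = kl_divergence(P, P)      # equality case P = Q
print(d_pq, d_pp)
```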

[Gronwall's inequality] \(u(t) \le f(t)e^{\int_0^th(s)\mathrm{d}s}\)

Suppose \(f\) is non-negative and monotonically increasing (non-decreasing) on \([0, +\infty)\), \(h, u \in \mathrm{C}[0, +\infty)\), \(h\) is non-negative, and:

\[u(t) \le f(t) + \int_{0}^th(s)u(s) \mathrm{d}s, \quad t\ge 0, \]

Then:

\[u(t) \le f(t)e^{\int_0^th(s)\mathrm{d}s}. \]

Note:
If

\[u(t) = f(t) + \int_{0}^th(s)u(s) \mathrm{d}s, \]

it does not follow that:

\[u(t) = f(t)e^{\int_0^th(s)\mathrm{d}s}. \]

However, when \(f(t)\equiv C_0 \ge 0\), equality does carry over (this can be shown by a method similar to Proof 1).
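A concrete instance of this note, assuming \(f(t)=1+t\) and \(h\equiv 1\) (an illustrative choice): the integral equation then has the solution \(u(t)=2e^t-1\), which is strictly below the bound \((1+t)e^t\) for \(t>0\). The function names are hypothetical.

```python
import math

def u_exact(t):
    """Solution of u(t) = (1 + t) + int_0^t u(s) ds, i.e. f(t) = 1 + t, h = 1."""
    return 2.0 * math.exp(t) - 1.0

def gronwall_bound(t):
    """f(t) * exp(int_0^t h(s) ds) for f(t) = 1 + t, h = 1."""
    return (1.0 + t) * math.exp(t)

def residual(t, k=2000):
    """u(t) - f(t) - int_0^t u(s) ds, with the integral taken by the midpoint rule."""
    ds = t / k
    integral = sum(u_exact((i + 0.5) * ds) for i in range(k)) * ds
    return u_exact(t) - (1.0 + t) - integral

ts = [0.0, 0.5, 1.0, 2.0]
slack = [gronwall_bound(t) - u_exact(t) for t in ts]   # equals (t - 1) e^t + 1
print(slack, residual(1.0))
```

The slack vanishes only at \(t=0\), so equality in the hypothesis does not give equality in the conclusion when \(f\) is not constant.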

Proof 1:

Let \(w(t)=\int_0^t h(s)u(s) \mathrm{d}s\). Then \(w(0)=0\), \(w'(t)=h(t)u(t)\), and so:

\[w'(t)=h(t)u(t)\le h(t) f(t)+h(t)w(t). \]

That is:

\[w'(t)-h(t)w(t)\le h(t)f(t). \]

Let \(H(t)=\int_0^t h(s)\mathrm{d}s\), so that \(H(0)=0, H'(t)=h(t)\).
Multiplying both sides by \(e^{-H(t)} > 0\) preserves the direction of the inequality:

\[e^{-H(t)}(w'(t)-h(t)w(t))=(e^{-H(t)}w(t))'\le e^{-H(t)}h(t)f(t), \]

Integrating both sides over \([0, t]\) gives:

\[w(t)\le e^{H(t)} \int_0^t e^{-H(s)}h(s)f(s)\mathrm{d}s. \]

Note that (since \(f(t)\) is increasing and the integrand is non-negative):

\[\int_0^t e^{-H(s)}h(s)f(s)\mathrm{d}s\le \int_0^t e^{-H(s)}h(s)\mathrm{d}s \: f(t)=-e^{-H(s)}|_0^t \: f(t)=(1-e^{-H(t)})f(t), \]

Hence:

\[u(t) \le f(t)+w(t) \le e^{H(t)}f(t). \]

This completes the proof.

Proof 2 (requires \(u\) to be non-negative):

\[\begin{array}{ll} u(t) &\le f(t) + \int_{0}^th(s)u(s) \mathrm{d}s \\ & \le f(t) +\epsilon + \int_{0}^th(s)u(s) \mathrm{d}s, \epsilon > 0. \end{array} \]

The right-hand side is strictly positive, so dividing by it and multiplying by \(h(t) \ge 0\) gives:

\[\frac{h(t)u(t)}{f(t)+\epsilon + \int_{0}^th(s)u(s) \mathrm{d}s} \le h(t) \]

Integrating both sides over \([0,t]\):

\[\int_0^t \frac{h(s)u(s)}{f(s)+\epsilon + \int_{0}^sh(\tau)u(\tau) \mathrm{d}\tau} \mathrm{d}s\le \int_0^t h(s)\mathrm{d}s \]

Note that since \(f(t)\) is increasing, for \(s\in[0, t]\):

\[\frac{h(s)u(s)}{f(s)+\epsilon + \int_{0}^sh(\tau)u(\tau) \mathrm{d}\tau} \ge \frac{h(s)u(s)}{f(t)+\epsilon + \int_{0}^sh(\tau)u(\tau) \mathrm{d}\tau} \ge 0, \]

Hence:

\[\int_0^t \frac{h(s)u(s)}{f(t)+\epsilon + \int_{0}^sh(\tau)u(\tau) \mathrm{d}\tau} \mathrm{d}s=\ln \frac{f(t)+\epsilon+\int_0^t h(s)u(s)\mathrm{d}s}{f(t)+\epsilon}\le \int_0^t h(s)\mathrm{d}s, \]

and so:

\[f(t)+\epsilon + \int_0^t h(s)u(s)\mathrm{d}s \le e^{H(t)}(f(t)+\epsilon), \]

where \(H(t)=\int_0^t h(s) \mathrm{d}s\).
Letting \(\epsilon \rightarrow0\) on both sides gives:

\[u(t) \le f(t)+ \int_0^t h(s)u(s)\mathrm{d}s \le e^{H(t)}f(t). \]

This completes the proof.

Proof 3:

Let \(M(T)=\max \limits_{0\le t\le T} \int_0^t h(s)u(s)\mathrm{d}s\).
Then, for \(t \in [0, T]\):

\[u(t)\le f(t)+M(T) \\ \Rightarrow h(t)u(t) \le h(t)f(t) + M(T)h(t), \]

Hence:

\[u(t) \le f(t)+\int_0^t \big(h(s)f(s)+M(T)h(s)\big)\mathrm{d}s. \]

Since \(f(t)\) is increasing:

\[\int_0^t h(s)f(s)\mathrm{d}s\le f(t) \int_0^th(s)\mathrm{d}s. \]

With \(H(t)=\int_0^t h(s)\mathrm{d}s\), this gives:

\[u(t)\le f(t)(1+H(t))+H(t)M(T) \\ \Rightarrow h(t)u(t)\le f(t)h(t)(1+H(t))+h(t)H(t)M(T). \]

Hence:

\[u(t) \le f(t)+\int_0^t \big(f(s)h(s)(1+H(s))+h(s)H(s)M(T)\big) \mathrm{d} s. \]

Note that:

\[\int_0^t H(s)h(s) \mathrm{d}s=\frac{H^2(t)-H^2(0)}{2}=\frac{H^2(t)}{2}. \]

So:

\[u(t) \le f(t)(1+H(t)+ \frac{H^2(t)}{2!})+\frac{H^2(t)M(T)}{2!}. \]

Repeating this procedure gives:

\[u(t) \le f(t)(1+H(t)+ \frac{H^2(t)}{2!} + \ldots+\frac{H^n(t)}{n!})+\frac{H^n(t)M(T)}{n!}. \]

Letting \(n\rightarrow + \infty\):

\[u(t) \le f(t)e^{H(t)}+0. \]

This completes the proof.
Note:
The last step can also be shown using:

\[1+t+\ldots+\frac{t^n}{n!}\le e^t, t\ge0 \]

but I feel that by taking limits on both sides one need not worry about the sign of \(t\); it is an unnecessary extra here, but it is cooler.

[\(C_p\) inequality] \((|a|+|b|)^p \le C_p(|a|^p+|b|^p)\)

Suppose \(a, b\) are real numbers and \(p>0\). Then

\[(|a|+|b|)^p \le C_p(|a|^p+|b|^p), \]

where

\[C_p = \left \{ \begin{array}{ll} 1, & 0<p \le 1, \\ 2^{p-1}, & p>1. \end{array} \right. \]

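Proof:

Case \(0<p\le1\): consider the function \(f(x) = (1+x)^p-x^p-1, x \ge 0\), whose derivative for \(x > 0\) is

\[f'(x) = p[(x+1)^{p-1}-x^{p-1}]\le 0, \]

so \(f(x)\) is decreasing on \([0,+\infty)\); since \(f(0)=0\), we get \(f(x)\le0\). Substituting \(x = |b|/|a|\) \((a\not =0)\) and multiplying by \(|a|^p\) gives:

\[(|a|+|b|)^p \le C_p(|a|^p+|b|^p), \]

Clearly, the inequality also holds when \(a=0\).

Case \(p>1\): the convexity of \(|x|^p\) gives:

\[(\frac{|a|+|b|}{2})^{p} \le \frac{1}{2}(|a|^p+|b|^p), \]

and multiplying both sides by \(2^p\) gives the claim with \(C_p = 2^{p-1}\). This completes the proof.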
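Both cases of the \(C_p\) inequality can be checked numerically. The helper names `c_p` and `cp_gap` and the sampled ranges are illustrative assumptions; the last line shows that the constant \(2^{p-1}\) is attained when \(|a|=|b|\).

```python
import random

def c_p(p):
    """The constant C_p: 1 for 0 < p <= 1, and 2^(p-1) for p > 1."""
    return 1.0 if p <= 1.0 else 2.0 ** (p - 1.0)

def cp_gap(a, b, p):
    """Return C_p * (|a|^p + |b|^p) - (|a| + |b|)^p; expected to be non-negative."""
    return c_p(p) * (abs(a) ** p + abs(b) ** p) - (abs(a) + abs(b)) ** p

rng = random.Random(5)  # fixed seed (illustrative choice)
min_gap = min(
    cp_gap(rng.uniform(-3.0, 3.0), rng.uniform(-3.0, 3.0), rng.uniform(0.1, 4.0))
    for _ in range(1000)
)

# p = 3, a = b = 1: 2^2 * (1 + 1) - 2^3 = 0, so the constant cannot be improved.
tight = cp_gap(1.0, 1.0, 3.0)
print(min_gap, tight)
```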

posted @ 2019-04-28 20:51  馒头and花卷