BoydC9pt1



Unconstrained minimization problems

Terminology and assumptions

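\(\mu\)-strongly convex (\(\mu\)-SC), where \(\mu>0\): \(f\) is \(\mu\)-strongly convex if property 1 below holds for all \(x,y\). Properties 2 to 5 then hold under the stated differentiability assumptions.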

  1. \(f(\theta x+(1-\theta)y)\leq \theta f(x)+(1-\theta)f(y)-\frac{\mu}{2}\theta(1-\theta)\|x-y\|^2\), \(\theta\in[0,1]\)
  2. If \(f\) is differentiable, then

    \[f(y)\geq f(x)+\nabla f(x)^T(y-x)+\frac{\mu}{2}\|y-x\|^2 \]

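  3. If \(f\) is differentiable, then

    \[f(y)\leq f(x)+\nabla f(x)^T(y-x)+\frac{1}{2\mu}\|\nabla f(y)-\nabla f(x)\|^2 \]

    In particular, taking \(x=x^*\) (the minimizer of \(f\), where \(\nabla f(x^*)=0\)) and then renaming \(y\) as \(x\),

    \[f(x)-f(x^*)\leq \frac{1}{2\mu} \|\nabla f(x)\|^2 \]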

  4. If \(f\) is differentiable, then

    \[(\nabla f(y)-\nabla f(x))^T(y-x) \geq \mu\|y-x\|^2 \]

    Thus, by the Cauchy-Schwarz inequality, \(\|\nabla f(y)-\nabla f(x)\|\geq\mu \|y-x\|\).
  5. If \(f\) is twice differentiable, then

    \[\nabla^2 f(x) \succeq \mu I \]
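The differentiable properties above can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: the test function \(f(x)=\sum_i \log\cosh(x_i)+\frac{\mu}{2}\|x\|^2\), the value `MU = 0.5`, and the helper names (`f`, `grad`, `dot`, `check_strong_convexity`) are choices made here, not part of the notes. Its Hessian is \(\mathrm{diag}(1/\cosh^2(x_i)+\mu)\succeq\mu I\), and its minimizer is \(x^*=0\) with \(f(x^*)=0\).

```python
import math
import random

MU = 0.5  # assumed strong-convexity parameter of the test function


def f(x):
    # f(x) = sum_i log(cosh(x_i)) + (MU/2) * ||x||^2, minimized at x* = 0, f(x*) = 0.
    return sum(math.log(math.cosh(v)) + 0.5 * MU * v * v for v in x)


def grad(x):
    # Gradient of f: tanh(x_i) + MU * x_i in each coordinate.
    return [math.tanh(v) + MU * v for v in x]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def check_strong_convexity(trials=1000, dim=3, seed=0, tol=1e-9):
    """Return True if properties 2, 3, 4 hold at every sampled pair (x, y)."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-3.0, 3.0) for _ in range(dim)]
        y = [rng.uniform(-3.0, 3.0) for _ in range(dim)]
        gx, gy = grad(x), grad(y)
        d = [b - a for a, b in zip(x, y)]     # y - x
        dg = [b - a for a, b in zip(gx, gy)]  # grad f(y) - grad f(x)
        lin = f(x) + dot(gx, d)               # first-order model at x
        checks = [
            f(y) >= lin + 0.5 * MU * dot(d, d) - tol,    # property 2
            f(y) <= lin + dot(dg, dg) / (2 * MU) + tol,  # property 3
            f(x) <= dot(gx, gx) / (2 * MU) + tol,        # special case, f(x*) = 0
            dot(dg, d) >= MU * dot(d, d) - tol,          # property 4
        ]
        if not all(checks):
            return False
    return True


print(check_strong_convexity())
```

A passing check is not a proof. It only confirms that the inequalities, with the signs and constants as written, are consistent on this one example.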

proof.

Lemma 3. Minimize both sides of property 2 over \(y\):

\[\begin{aligned} &&f(y)\geq& f(x)+\nabla f(x)^T(y-x)+\frac{\mu}{2}\|y-x\|^2\\ \implies&& \inf_y f(y)\geq& \inf_y \{f(x)+\nabla f(x)^T(y-x)+\frac{\mu}{2}\|y-x\|^2\}\\ \implies&& f(x^*) \geq & f(x)-\frac{1}{2\mu} \|\nabla f(x)\|^2 \end{aligned} \]

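The infimum on the right-hand side is attained at \(y=x-\frac{1}{\mu}\nabla f(x)\), which gives the last line and proves the special case. For the general inequality, fix \(y\) and let \(g_{y}(x)=f(x)-\nabla f(y)^T x\). Then \(g_y\) is also \(\mu\)-strongly convex with \(\nabla g_y(x)=\nabla f(x)-\nabla f(y)\), so \(\nabla g_y(y)=0\) and \(g_y\) is minimized at \(x=y\), with minimum value \(g_y(y)=f(y)-\nabla f(y)^T y\). Applying the special case to \(g_y\) gives \(g_y(x)-g_y(y)\leq\frac{1}{2\mu}\|\nabla f(x)-\nabla f(y)\|^2\), thus (with the roles of \(x\) and \(y\) exchanged relative to the statement)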

\[f(x)\leq f(y)+\nabla f(y)^T(x-y)+\frac{1}{2\mu}\|\nabla f(x)-\nabla f(y)\|^2 \]

\(L\)-smooth: a differentiable \(f:\mathbb{R}^n\to\mathbb{R}\) is \(L\)-smooth if, for all \(x,y\in \mathbb{R}^n\),

\[\|\nabla f(y)-\nabla f(x)\|\leq L \|y-x\| \]

where \(\|\cdot\|\) denotes \(\|\cdot\|_2\). For such \(f\), the following lemmas hold:

  1. Gradient \(L\)-Lipschitz continuity (the definition above).
  2. Gradient inner product inequality:

    \[ \left<\nabla f(y)-\nabla f(x),y-x \right> \leq L\|y-x\|^2 \]

  3. Descent lemma:

    \[f(y)\leq f(x)+\nabla f(x)^T(y-x)+\frac{L}{2}\|y-x\|^2 \]

  4. \(f(\theta x + (1 - \theta)y) \geq \theta f(x) + (1 - \theta)f(y) - \frac{L}{2} \theta(1 - \theta)\|x - y\|^2\), where \(\theta\in [0,1]\).
  5. If \(f\) is twice differentiable, then

    \[\nabla^2 f(x) \preceq L I \]

If \(f\) is also convex, then

  6. Lower bound with the gradient difference:

    \[ f(y)\geq f(x)+\nabla f(x)^T(y-x)+\frac{1}{2L}\|\nabla f(y) - \nabla f(x)\|^2 \]

  7. Gradient inner product inequality (co-coercivity):

    \[\frac{1}{L}\|\nabla f(y)-\nabla f(x)\|^2 \leq \left<\nabla f(y)-\nabla f(x),y-x \right> \]
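As a quick numerical illustration of the smoothness lemmas and the two convex-case inequalities, here is a minimal sketch. It assumes the test function \(f(x)=\sum_i \log\cosh(x_i)+\frac{\mu}{2}\|x\|^2\) with `MU = 0.5`, which is convex and has Hessian bounded above by \((1+\mu)I\), so `L = 1 + MU` is a valid smoothness constant. The function and all helper names are illustrative choices, not taken from the notes.

```python
import math
import random

MU = 0.5      # assumed curvature term of the test function
L = 1.0 + MU  # Hessian diag(1/cosh(x_i)^2 + MU) is bounded above by (1 + MU) * I


def f(x):
    return sum(math.log(math.cosh(v)) + 0.5 * MU * v * v for v in x)


def grad(x):
    return [math.tanh(v) + MU * v for v in x]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def check_smoothness(trials=1000, dim=3, seed=1, tol=1e-9):
    """Return True if every listed inequality holds at all sampled points."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-3.0, 3.0) for _ in range(dim)]
        y = [rng.uniform(-3.0, 3.0) for _ in range(dim)]
        theta = rng.random()
        gx, gy = grad(x), grad(y)
        d = [b - a for a, b in zip(x, y)]     # y - x
        dg = [b - a for a, b in zip(gx, gy)]  # grad f(y) - grad f(x)
        z = [theta * a + (1 - theta) * b for a, b in zip(x, y)]
        lin = f(x) + dot(gx, d)
        checks = [
            dot(dg, dg) <= L * L * dot(d, d) + tol,    # Lipschitz gradient (squared)
            dot(dg, d) <= L * dot(d, d) + tol,         # inner product upper bound
            f(y) <= lin + 0.5 * L * dot(d, d) + tol,   # descent lemma
            f(z) >= theta * f(x) + (1 - theta) * f(y)
            - 0.5 * L * theta * (1 - theta) * dot(d, d) - tol,  # bound along a segment
            dot(dg, dg) / L <= dot(dg, d) + tol,       # convex case: co-coercivity
            f(y) >= lin + dot(dg, dg) / (2 * L) - tol, # convex case: lower bound
        ]
        if not all(checks):
            return False
    return True


print(check_smoothness())
```

Replacing `L` by a value smaller than the true smoothness constant (for example `0.6`) should make the check fail for this function, which is a simple way to see that the constant matters.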

proof.

Lemma 2. By the Cauchy-Schwarz inequality and \(L\)-smoothness,

\[\begin{aligned} \left<\nabla f(y)-\nabla f(x),y-x \right> \leq &\|\nabla f(y)-\nabla f(x)\|\cdot\|y-x\|\\ \leq & L \|y-x\|^2 \end{aligned} \]

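Lemma 3. Write \(f(y)-f(x)\) as an integral of the directional derivative along the segment from \(x\) to \(y\), then apply Lemma 2 to the pair \(x\), \(x+t(y-x)\), which gives \(\left<\nabla f(x+t(y-x))-\nabla f(x),y-x\right>\leq tL\|y-x\|^2\):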

\[\begin{aligned} &f(y)-f(x)\\ =& \int_0^1 \left<\nabla f(x+t(y-x)),y-x\right> dt\\ =& \int_0^1 \left<\nabla f(x+t(y-x))-\nabla f(x),y-x\right> dt+\left<\nabla f(x),y-x\right>\\ \leq& \int_0^1 tL \|y-x\|^2 dt+\left<\nabla f(x),y-x\right>\\ =& \nabla f(x)^T(y-x)+\frac{L}{2}\|y-x\|^2 \end{aligned} \]
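One standard consequence of the descent lemma, not stated in these notes, explains its name: substituting \(y=x-\frac{1}{L}\nabla f(x)\) gives \(f(y)\leq f(x)-\frac{1}{2L}\|\nabla f(x)\|^2\), so a gradient step of size \(1/L\) never increases \(f\). The sketch below checks this on an assumed test function, \(f(x)=\sum_i\log\cosh(x_i)+\frac{\mu}{2}\|x\|^2\) with `MU = 0.5` and `L = 1 + MU`. The helper names `gradient_step` and `decrease_holds` are hypothetical.

```python
import math

MU = 0.5      # assumed curvature term of the test function
L = 1.0 + MU  # valid smoothness constant for this f


def f(x):
    return sum(math.log(math.cosh(v)) + 0.5 * MU * v * v for v in x)


def grad(x):
    return [math.tanh(v) + MU * v for v in x]


def gradient_step(x):
    # One gradient step with step size 1/L.
    g = grad(x)
    return [v - gv / L for v, gv in zip(x, g)]


def decrease_holds(x, tol=1e-12):
    # Checks f(x - grad/L) <= f(x) - ||grad||^2 / (2L), as implied by the descent lemma.
    g = grad(x)
    sq_norm = sum(gv * gv for gv in g)
    return f(gradient_step(x)) <= f(x) - sq_norm / (2 * L) + tol


x = [2.0, -1.5, 0.7]
for _ in range(5):
    assert decrease_holds(x)
    x = gradient_step(x)
```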

Lemma 4. For \(\theta \in[0,1]\), let \(\tilde{x} = \theta x+(1-\theta)y\), so that \(y-\tilde{x}=\theta(y-x)\) and \(x-\tilde{x}=-(1-\theta)(y-x)\). Applying the descent lemma at \(\tilde{x}\) gives

\[\begin{align} f(y)\leq& f(\tilde{x})+\nabla f(\tilde{x})^T(y-\tilde{x})+\frac{L}{2}\|y-\tilde{x}\|^2\nonumber\\ =& f(\tilde{x})+\theta\nabla f(\tilde{x})^T(y-x)+\frac{\theta^2L}{2}\|y-x\|^2\\ f(x)\leq& f(\tilde{x})+\nabla f(\tilde{x})^T(x-\tilde{x})+\frac{L}{2}\|x-\tilde{x}\|^2\nonumber\\ =& f(\tilde{x})-(1-\theta)\nabla f(\tilde{x})^T(y-x)+\frac{(1-\theta)^2L}{2}\|y-x\|^2 \end{align} \]

Taking \((1-\theta)\cdot(1)+\theta\cdot(2)\), the gradient terms cancel and \((1-\theta)\theta^2+\theta(1-\theta)^2=\theta(1-\theta)\), so

\[\theta f(x) + (1 - \theta)f(y) \leq f(\theta x + (1 - \theta)y)+ \frac{L}{2} \theta(1 - \theta)\|x - y\|^2 \]

Lemma 6. Fix \(x_1,x_2\). For all \(x\),

\[\begin{aligned} f(x) &\geq f(x_1)+\nabla f(x_1)^T(x-x_1)\triangleq g_1(x)\quad && \text{convex}\\ f(x) &\leq f(x_2)+\nabla f(x_2)^T(x-x_2) + \frac{L}{2} \|x-x_2\|^2\triangleq g_2(x) && \text{descent lemma}\\ \end{aligned} \]

Since \(g_2(x)\geq f(x)\geq g_1(x)\) for all \(x\),

\[\begin{aligned} &\inf (g_2(x)-g_1(x))\\ =&\inf \{\frac{L}{2} \|x-x_2\|^2+(\nabla f(x_2)-\nabla f(x_1))^Tx\\ &+f(x_2)-f(x_1)-\nabla f(x_2)^Tx_2+\nabla f(x_1)^Tx_1\}\\ \geq &0 \end{aligned} \]

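Furthermore, \(g_2-g_1\) is a convex quadratic in \(x\), so its minimizer \(x^*\) is found by setting the gradient to zero:

\[\begin{aligned} &\nabla (g_2(x)-g_1(x))=L(x-x_2)+\nabla f(x_2)-\nabla f(x_1)=0\\ \implies & x^* = x_2+\frac{\nabla f(x_1)-\nabla f(x_2)}{L} \end{aligned} \]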

Hence

\[\begin{aligned} &\inf (g_2(x)-g_1(x)) = g_2(x^*)-g_1(x^*)\\ =& -\frac{1}{2L}\|\nabla f(x_2)-\nabla f(x_1)\|^2-\nabla f(x_1)^T(x_2-x_1)+ f(x_2)-f(x_1) \\ \geq& 0 \end{aligned} \]

which means

\[f(x_2)\geq f(x_1)+\nabla f(x_1)^T(x_2-x_1)+\frac{1}{2L}\|\nabla f(x_2)-\nabla f(x_1)\|^2 \]
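The closed form for \(\inf(g_2-g_1)\) used above is easy to get wrong by a sign, so a short numerical cross-check may help. The sketch below evaluates \(g_2-g_1\) directly at \(x^*=x_2+\frac{1}{L}(\nabla f(x_1)-\nabla f(x_2))\) and compares it with the closed-form expression. It assumes the convex test function \(f(x)=\sum_i\log\cosh(x_i)+\frac{\mu}{2}\|x\|^2\) with `MU = 0.5` and `L = 1 + MU`. The function `gap_values` and the other helpers are hypothetical names.

```python
import math

MU = 0.5      # assumed curvature term of the test function
L = 1.0 + MU  # valid smoothness constant for this f


def f(x):
    return sum(math.log(math.cosh(v)) + 0.5 * MU * v * v for v in x)


def grad(x):
    return [math.tanh(v) + MU * v for v in x]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def sub(a, b):
    return [p - q for p, q in zip(a, b)]


def gap_values(x1, x2):
    """Return (g2(x*) - g1(x*), closed-form value) for the pair (x1, x2)."""
    grad1, grad2 = grad(x1), grad(x2)
    # Minimizer of the quadratic g2 - g1.
    x_star = [b + (p - q) / L for b, p, q in zip(x2, grad1, grad2)]
    # g1: linear lower bound from convexity; g2: quadratic upper bound from the descent lemma.
    lower = f(x1) + dot(grad1, sub(x_star, x1))
    dx = sub(x_star, x2)
    upper = f(x2) + dot(grad2, dx) + 0.5 * L * dot(dx, dx)
    dg = sub(grad2, grad1)
    closed = f(x2) - f(x1) - dot(grad1, sub(x2, x1)) - dot(dg, dg) / (2 * L)
    return upper - lower, closed


direct, closed = gap_values([1.0, -2.0], [0.3, 0.8])
print(abs(direct - closed) < 1e-9, direct >= 0.0)
```

Agreement of the two values checks the algebra. Nonnegativity of the common value is exactly the inequality being proved.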

Lemma 7. Applying Lemma 6 to the pair \((x_1,x_2)\) and then to the pair \((x_2,x_1)\),

\[\begin{aligned} f(x_2)\geq& f(x_1)+\nabla f(x_1)^T(x_2-x_1)+\frac{1}{2L}\|\nabla f(x_2)-\nabla f(x_1)\|^2\\ f(x_1)\geq& f(x_2)+\nabla f(x_2)^T(x_1-x_2)+\frac{1}{2L}\|\nabla f(x_2)-\nabla f(x_1)\|^2\\ \end{aligned} \]

Adding the two inequalities and cancelling \(f(x_1)+f(x_2)\) gives

\[(\nabla f(x_2)-\nabla f(x_1))^T(x_2-x_1)\geq \frac{1}{L}\|\nabla f(x_2)-\nabla f(x_1)\|^2 \]

posted @ 2025-08-28 15:39  p0q