BoydC3pt2



Previous part: BoydC3pt1


Operations that preserve convexity

There are some practical methods for establishing convexity of a function.

  1. verify definition (often simplified by restricting to a line);
  2. for twice differentiable functions, show \(\nabla^2 f(x) \succeq 0\);
  3. show that \(f\) is obtained from simple convex functions by operations that preserve convexity;
  4. show that \(\mathbf{epi}~f\) is convex, which turns a question about functions into a question about sets.

Positive weighted sum & composition with affine function

Thm 2.1. The following operations preserve convexity:

  1. nonnegative multiple: \(f\) is convex \(\Rightarrow\) \(\alpha f\) is convex for any \(\alpha\geq 0\).
  2. sum: \(f_1,f_2\) are convex \(\Rightarrow\) \(f_1+f_2\) is convex.

    This extends to infinite sums and integrals. For example, if \(f(x,y)\) is convex in \(x\) for each \(y\in\mathcal{A}\) and \(w(y)\geq 0\) for \(y\in\mathcal{A}\), then

    \[g(x) = \int_{\mathcal{A}}w(y)f(x,y)~dy \]

    is convex in \(x\).

  3. composition with affine function: \(f\) is convex \(\Rightarrow\) \(f(Ax+b)\) is convex.

Pointwise maximum

Thm 2.2. If \(f_k\) are convex \((k\in\left[m\right])\), their pointwise maximum

\[f(x) = \max\{f_1(x),f_2(x),\cdots,f_m(x)\} \]

is convex.

Ex 2.1. Sum of \(r\) largest components. For \(x\in\mathbb{R}^n\) we denote by \(x_{\left[i\right]}\) the \(i\)-th largest component of \(x\), \(i.e.\)

\[x_{\left[1\right]}\geq x_{\left[2\right]}\geq \cdots \geq x_{\left[n\right]} \]

Then the function

\[f(x) = \sum_{i=1}^r x_{\left[i\right]} \]

is convex.
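The convexity in Ex 2.1 follows from Thm 2.2, because \(f\) is the pointwise maximum of the \(\binom{n}{r}\) linear functions \(x_{i_1}+\cdots+x_{i_r}\). The following is a minimal Python sketch (the function names are illustrative and not part of the original notes) that checks the two expressions agree and spot-checks the convexity inequality at random points. A numerical check like this is only evidence, not a proof.

```python
import itertools
import random

def sum_r_largest(x, r):
    """Sum of the r largest components of x (sort-based)."""
    return sum(sorted(x, reverse=True)[:r])

def max_over_subsets(x, r):
    """Same value written as a pointwise maximum of linear functions:
    one linear function x_{i1} + ... + x_{ir} per index subset of size r."""
    return max(sum(x[i] for i in idx)
               for idx in itertools.combinations(range(len(x)), r))

def convexity_gap(f, x, y, theta):
    """theta*f(x) + (1-theta)*f(y) - f(theta*x + (1-theta)*y); >= 0 if f is convex."""
    z = [theta * a + (1 - theta) * b for a, b in zip(x, y)]
    return theta * f(x) + (1 - theta) * f(y) - f(z)

random.seed(0)
n, r = 6, 3
worst_gap = float("inf")
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in range(n)]
    y = [random.uniform(-5, 5) for _ in range(n)]
    theta = random.random()
    # Both formulas should give the same value (up to rounding).
    assert abs(sum_r_largest(x, r) - max_over_subsets(x, r)) < 1e-9
    gap = convexity_gap(lambda v: sum_r_largest(v, r), x, y, theta)
    worst_gap = min(worst_gap, gap)
```

If the smallest observed gap `worst_gap` stays nonnegative up to rounding, the sampled points are consistent with convexity.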

Thm 2.3. If \(f(x,y)\) is convex in \(x\) for each \(y\in\mathcal{A}\), then the function

\[g(x) =\sup_{y\in\mathcal{A}} f(x,y) \]

is convex. The domain of \(g\) is

\[\mathbf{dom}~g=\{x~|~(x,y)\in\mathbf{dom}~f ~\text{for all}~ y\in\mathcal{A},~\sup_{y\in\mathcal{A}}f(x,y)<\infty\} \]

proof. Let \(f_y(x)=f(x,y)\) for each fixed \(y\in\mathcal{A}\). Then \(\mathbf{epi}~g=\cap_{y\in\mathcal{A}}~\mathbf{epi}~f_y\) is an intersection of convex sets, hence convex.

Composition

Thm 2.4. Composition with scalar functions: \(g:\mathbb{R}^n\to\mathbb{R}\) and \(h:\mathbb{R}\to\mathbb{R}\),

\[f(x)=h(g(x)) \]

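is convex if \(\begin{cases} &g~\text{convex},~h~\text{convex},~\tilde{h} ~\text{nondecreasing}\\ & g~\text{concave},~h~\text{convex},~\tilde{h} ~\text{nonincreasing} \end{cases}\), where \(\tilde{h}\) is the extended-value extension of \(h\) (equal to \(h\) on \(\mathbf{dom}~h\) and \(\infty\) elsewhere).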
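To see where these conditions come from, consider the special case \(n=1\) with \(g\) and \(h\) twice differentiable and defined on all of \(\mathbb{R}\). This is an assumption made only for this sketch; the theorem itself does not require differentiability. The chain rule gives

\[f''(x) = h''(g(x))\,g'(x)^2 + h'(g(x))\,g''(x) \]

In the first case, \(h''\geq 0\), \(h'\geq 0\) and \(g''\geq 0\), so both terms are nonnegative. In the second case, \(h''\geq 0\), \(h'\leq 0\) and \(g''\leq 0\), so again both terms are nonnegative. Either way \(f''\geq 0\), \(i.e.\) \(f\) is convex.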

Thm 2.5. Vector composition: \(g:\mathbb{R}^n\to\mathbb{R}^k\) and \(h:\mathbb{R}^k\to\mathbb{R}\),

\[f(x)=h(g(x))=h(g_1(x),g_2(x),\cdots,g_k(x)) \]

is convex if \(\begin{cases} &g_i~\text{convex},~h~\text{convex},~\tilde{h} ~\text{nondecreasing in each argument}\\ &g_i~\text{concave},~h~\text{convex},~\tilde{h} ~\text{nonincreasing in each argument} \end{cases}\), again with \(\tilde{h}\) the extended-value extension of \(h\).

Minimization

Thm 2.6. If \(f(x, y)\) is convex in \((x, y)\) and \(C\) is a convex set, then

\[g(x) = \inf_{y\in C} f(x,y) \]

is convex.

Rmk. Thm 2.6 resembles Thm 2.3, with one difference: Thm 2.6 requires \(f\) to be jointly convex in \((x, y)\), whereas Thm 2.3 only requires \(f\) to be convex in \(x\) for each fixed \(y\).

proof. \(\mathbf{epi}~f = \{(x,y,t)~|~t\geq f(x,y)\}\) is convex. Assume for simplicity that the infimum over \(y\in C\) is attained for each \(x\); then \(\mathbf{epi}~g\) \(=\) \(\{(x,t)~|~(x,y,t)\in \mathbf{epi}~f~\text{for some}~y\in C\}\) \(=\) \(\cup_{y\in C}\) \(\{(x,t)~|~(x,y,t)\in \mathbf{epi}~f\}\). Take any \((x_1,t_1),(x_2,t_2)\in\mathbf{epi}~g\). There exist \(y_1,y_2\in C\) \(s.t.\) \((x_1,y_1,t_1)\) and \((x_2,y_2,t_2)\in\mathbf{epi}~f\). For \(0\leq\theta\leq 1\), convexity of \(\mathbf{epi}~f\) gives \(\theta(x_1,y_1,t_1)+(1-\theta)(x_2,y_2,t_2)\in \mathbf{epi}~f\), and convexity of \(C\) gives \(\theta y_1+(1-\theta)y_2\in C\). Hence \(\theta(x_1,t_1)+(1-\theta)(x_2,t_2)\in\mathbf{epi}~g\), so \(\mathbf{epi}~g\) is convex and therefore \(g\) is convex.

In fact, \(\mathbf{epi}~g\) is the projection of \(\mathbf{epi}~f\) onto its \((x,t)\) components, with \(y\) restricted to \(C\). This projection differs slightly from thm 3.4 of BoydC2 (\(y\in\mathbb{R}^n\) in thm 3.4 but \(y\in C\) here), so it can be regarded as a generalization of thm 3.4.
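As a concrete instance of minimization over a convex set, take \(f(x,y)=(x-y)^2\), which is jointly convex in \((x,y)\), and \(C=[-1,1]\). Then \(g(x)\) is the squared distance from \(x\) to \(C\). The Python sketch below uses this illustrative choice (the function, the interval, and all names are assumptions for the example, not from the original notes), compares the closed form with a brute-force grid search, and spot-checks convexity of \(g\).

```python
import random

def f(x, y):
    """Jointly convex in (x, y): a convex function of the affine map x - y."""
    return (x - y) ** 2

def g(x, lo=-1.0, hi=1.0):
    """g(x) = inf over y in C = [lo, hi] of f(x, y).
    The infimum is attained at the projection of x onto C."""
    y_star = min(max(x, lo), hi)
    return f(x, y_star)

def g_grid(x, lo=-1.0, hi=1.0, steps=2001):
    """Brute-force approximation of the same infimum on a grid over C."""
    return min(f(x, lo + (hi - lo) * k / (steps - 1)) for k in range(steps))

random.seed(1)
worst_gap = float("inf")
for _ in range(1000):
    x1, x2 = random.uniform(-4, 4), random.uniform(-4, 4)
    theta = random.random()
    xm = theta * x1 + (1 - theta) * x2
    # Convexity inequality for g at a random pair of points.
    gap = theta * g(x1) + (1 - theta) * g(x2) - g(xm)
    worst_gap = min(worst_gap, gap)
```

Replacing \(f\) with a function that is convex in \(x\) and in \(y\) separately but not jointly (for example \(f(x,y)=xy\)) would break the argument, which is the difference from the pointwise-supremum rule.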

Perspective of a function

Def 2.1. The perspective of a function \(f:\mathbb{R}^n\to \mathbb{R}\) is the function \(g:\mathbb{R}^{n}\times\mathbb{R}\to \mathbb{R}\):

\[g(x,t) = tf(x/t),\quad\mathbf{dom}~g=\{(x,t)~|~x/t\in\mathbf{dom}~f,~t>0\} \]

\(g\) is convex if \(f\) is convex.
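For example, the perspective of \(f(x)=x^2\) is the quadratic-over-linear function \(g(x,t)=x^2/t\) on \(t>0\). The Python sketch below builds the perspective of a scalar function and spot-checks joint convexity in \((x,t)\) at random points; the helper name `perspective` and the sampling ranges are illustrative assumptions.

```python
import random

def perspective(f):
    """Return g(x, t) = t * f(x / t), defined for t > 0 (scalar x for simplicity)."""
    def g(x, t):
        if t <= 0:
            raise ValueError("perspective is only defined for t > 0")
        return t * f(x / t)
    return g

def square(x):
    return x * x

g = perspective(square)  # g(x, t) = x**2 / t

random.seed(2)
worst_gap = float("inf")
for _ in range(1000):
    x1, t1 = random.uniform(-3, 3), random.uniform(0.1, 3)
    x2, t2 = random.uniform(-3, 3), random.uniform(0.1, 3)
    theta = random.random()
    xm = theta * x1 + (1 - theta) * x2
    tm = theta * t1 + (1 - theta) * t2
    # Joint convexity inequality in (x, t).
    gap = theta * g(x1, t1) + (1 - theta) * g(x2, t2) - g(xm, tm)
    worst_gap = min(worst_gap, gap)
```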

The conjugate function

Definition and examples

Def 3.1. The conjugate of a function \(f\) is

\[f^*(y)=\sup_{x\in\mathbf{dom}~f} (y^Tx-f(x)) \]

\(f^*\) is convex even when \(f\) is not, since it is a pointwise supremum of affine functions of \(y\) (Thm 2.3).

Ex 3.1. Norm: the conjugate of \(f(x)=\|x\|\).

\[\begin{align*} f^*(y)&=\sup_{x} \left(y^Tx-\|x\|\right)\\ &=\sup_{t\geq 0}~t\sup_{\|u\| =1}(y^Tu-1)\\ &=\sup_{t\geq 0}~t(\|y\|_*-1)\\ &=\begin{cases} 0\quad& \|y\|_*\leq 1\\ \infty & \|y\|_*>1 \end{cases} \end{align*} \]

where \(\|v\|_*=\sup_{\|u\|\leq 1}u^Tv=\sup_{\|u\|= 1}u^Tv\) is the dual norm of \(\|\cdot\|\).
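To make Ex 3.1 concrete, take \(\|\cdot\|=\|\cdot\|_1\), whose dual norm is \(\|\cdot\|_\infty\). The Python sketch below (the vectors and helper names are illustrative assumptions) shows the two regimes: when \(\|y\|_\infty\leq 1\) the objective \(y^Tx-\|x\|_1\) never exceeds \(0\), and when \(\|y\|_\infty>1\) it grows linearly along a suitable ray.

```python
import random

def objective(y, x):
    """y^T x - ||x||_1, the quantity whose supremum over x defines f*(y)."""
    return sum(yi * xi for yi, xi in zip(y, x)) - sum(abs(xi) for xi in x)

def ray_value(y, t):
    """Objective along x = t * sign(y_j) * e_j, where j maximizes |y_j|.
    This equals t * (||y||_inf - 1), so it is unbounded in t iff ||y||_inf > 1."""
    j = max(range(len(y)), key=lambda i: abs(y[i]))
    x = [0.0] * len(y)
    x[j] = t if y[j] >= 0 else -t
    return objective(y, x)

random.seed(4)
y_in = [0.9, -0.3, 0.5]   # ||y||_inf <= 1
y_out = [1.5, -0.3, 0.5]  # ||y||_inf > 1

# Largest objective value seen over random x for the bounded case.
max_inside = max(objective(y_in, [random.uniform(-10, 10) for _ in y_in])
                 for _ in range(1000))
# Objective along the ray for growing t in the unbounded case.
growth = [ray_value(y_out, t) for t in (1.0, 10.0, 100.0)]
```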

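Ex 3.2. Negative entropy: \(f(x)=\sum_{i=1}^n x_i\log x_i\) with \(\mathbf{dom}~f=\mathbb{R}^n_{++}\).

\[\begin{align*} f^*(y) &= \sup_{x} \left(y^Tx - \sum_{i=1}^n x_i\log x_i\right)\\ &=\sup_{x}\sum_{i=1}^n (y_i-\log x_i)x_i \end{align*} \]

The sum is separable and each term is concave in \(x_i\), so set the derivative of each term to zero:

\[\frac{d (y_i-\log x_i)x_i}{d x_i} = y_i-\log x_i-1=0 \]

This gives \(x_i=e^{y_i-1}\), at which the \(i\)-th term equals \(e^{y_i-1}\).
Thus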

\[f^*(y) = \sum_{i=1}^n e^{y_i-1} \]
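The closed form can be sanity-checked numerically in the scalar case \(n=1\). The Python sketch below approximates \(\sup_{x>0}(yx-x\log x)\) by brute force on a grid and compares it with \(e^{y-1}\); the grid range, resolution, and function names are assumptions chosen for illustration.

```python
import math

def conj_neg_entropy_grid(y, x_max=20.0, steps=20000):
    """Approximate sup_{x>0} (y*x - x*log x) by brute force on a grid."""
    best = 0.0  # limit of the objective as x -> 0+
    for k in range(1, steps + 1):
        x = x_max * k / steps
        best = max(best, y * x - x * math.log(x))
    return best

def conj_neg_entropy_closed(y):
    """Closed form derived above: f*(y) = exp(y - 1)."""
    return math.exp(y - 1)

# Absolute error between grid search and closed form at a few slopes y.
errors = [abs(conj_neg_entropy_grid(y) - conj_neg_entropy_closed(y))
          for y in (-1.0, 0.0, 0.5, 1.0, 2.0)]
```

The grid must cover the maximizer \(x=e^{y-1}\), so larger values of \(y\) would need a larger `x_max`.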

Basic properties

Thm 3.1. Fenchel’s inequality:

\[f(x)+f^*(y)\geq x^Ty \]
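Fenchel's inequality follows directly from Def 3.1, since \(f^*(y)\geq y^Tx-f(x)\) for every \(x\in\mathbf{dom}~f\). As an illustration, the Python sketch below checks it for the scalar pair \(f(x)=x\log x\), \(f^*(y)=e^{y-1}\) from Ex 3.2, and evaluates the equality case \(y=f'(x)\); the sampling ranges and variable names are assumptions for the example.

```python
import math
import random

def f(x):
    """Negative entropy term x*log x for x > 0."""
    return x * math.log(x)

def f_conj(y):
    """Its conjugate exp(y - 1), taken from Ex 3.2."""
    return math.exp(y - 1)

random.seed(3)
min_slack = float("inf")
for _ in range(1000):
    x = random.uniform(0.01, 10.0)
    y = random.uniform(-3.0, 3.0)
    # Slack f(x) + f*(y) - x*y should be nonnegative.
    min_slack = min(min_slack, f(x) + f_conj(y) - x * y)

# Equality case: x0 = exp(y0 - 1), i.e. y0 = f'(x0) = log(x0) + 1.
y0 = 0.7
x0 = math.exp(y0 - 1)
equality_slack = f(x0) + f_conj(y0) - x0 * y0
```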

Thm 3.2. If \(f\) is convex and closed, then \(f^{**}=f\).

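proof 1. Using the minimax theorem (not rigorous, since the exchange of \(\sup\) and \(\inf\) is not justified here): \(f^{**}(z)\) \(=\) \(\sup_y (z^Ty-f^{*}(y))\) \(=\) \(\sup_y (z^Ty-\sup_x(y^Tx-f(x)))\) \(=\) \(\sup_y \inf_x(z^Ty-y^Tx+f(x))\) \(=\) \(\inf_x\sup_y((z-x)^Ty+f(x))\). The inner supremum is unbounded unless \(x=z\):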

\[\sup_y((z-x)^Ty+f(x)) = \begin{cases} & \infty \quad & x\neq z\\ & f(z) & x=z \end{cases} \]

Thus \(\inf_x\sup_y((z-x)^Ty+f(x))=f(z)\) \(i.e.\) \(f^{**}(z)=f(z)\) .

proof 2. Use the following lemma: for a convex function \(f:\mathbb{R}^n\to\mathbb{R}\), define

\[\tilde{f}(x)=\sup\{g(x)~|~g~\text{affine},~g(z)\leq f(z)~\text{for all}~z\} \]

Then \(\tilde{f}=f\) if \(f\) is closed.

Let

\[G=\bigcup_y \{g~|~g(x)=y^Tx+b,~b\leq -f^*(y)\} \]

Then \(f^{**}(x)=\sup\{g(x)~|~g\in G\}\), and \(G\subseteq\) \(\{g~|~g~\text{affine},~g(x)\leq f(x)~\text{for all}~x\}\) by Fenchel's inequality.

For the reverse inclusion, assume for contradiction that there is an affine \(g(x)=a^Tx+b\) with \(g(x)\leq f(x)\) for all \(x\) and \(g\notin G\). Then \(g\) \(\notin\) \(\{g~|~g(x)=a^Tx+b,~b\leq -f^*(a)\}\), so \(b> -f^*(a)=\inf_x (f(x)- a^Tx)\). Hence there is some \(x\) with \(b>f(x)-a^Tx\), \(i.e.\) \(g(x)>f(x)\), which contradicts \(g(x)=a^Tx+b\leq f(x)\). Thus \(\{g~|~g~\text{affine},~g(x)\leq f(x)~\text{for all}~x\}\) \(\subseteq\) \(G\).

So \(f^{**}(x)=\sup\{g(x)~|~g~\text{affine},~g(z)\leq f(z)~\text{for all}~z\}=\tilde{f}(x)\), and by the lemma \(f^{**}=f\).

Rmk. The above derivation clarifies the meaning of \(-f^*(y)\): it is the largest intercept \(b\) for which the affine function \(y^Tx+b\) stays below \(f\). When the supremum in Def 3.1 is attained, \(y^T x - f^*(y)\) is a supporting (tangent) hyperplane to \(f\) with slope \(y\).

Thm 3.3. If \(f\) is convex and differentiable, \(f^*\) is also called the Legendre transform of \(f\), and

\[f^*(y) = x^{*T}\nabla f(x^*)-f(x^*) \]

where \(x^*\) is any point satisfying \(\nabla f(x^*)=y\).

Thm 3.4. If \(g(x)=f(Ax+b)\) with \(A\in\mathbb{R}^{n\times n}\) nonsingular, then

\[g^*(y) = f^*(A^{-T}y)-b^TA^{-T}y \]
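The formula above can be derived by a change of variables, assuming \(A\) is nonsingular so that \(A^{-1}\) exists. Substituting \(z=Ax+b\), \(i.e.\) \(x=A^{-1}(z-b)\), into the definition of the conjugate gives

\[\begin{align*} g^*(y) &= \sup_{x}\left(y^Tx-f(Ax+b)\right)\\ &= \sup_{z}\left(y^TA^{-1}(z-b)-f(z)\right)\\ &= \sup_{z}\left((A^{-T}y)^Tz-f(z)\right)-b^TA^{-T}y\\ &= f^*(A^{-T}y)-b^TA^{-T}y \end{align*} \]

The second line uses that \(x\mapsto Ax+b\) is a bijection, and the third uses \(y^TA^{-1}b=b^TA^{-T}y\).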


Next part: BoydC3pt3

posted @ 2025-08-23 01:06  p0q