Convex Optimization - L5 Duality
1. Lagrange dual problem
1.1 Lagrangian
Standard form problem (not necessarily convex):
\[
\begin{aligned}
\min_{\boldsymbol{x}} \quad & f_0(\boldsymbol{x}) \\
\text{s.t.} \quad & f_i(\boldsymbol{x}) \leq 0, \quad i = 1, 2, \cdots, m \\
& h_i(\boldsymbol{x}) = 0, \quad i = 1, 2, \cdots, p
\end{aligned}
\]
where \(\boldsymbol{x} \in \mathbb{R}^{n}\), and its domain is \(\mathcal{D} = \left(\cap_{i=0}^{m} \text{dom} \, f_i \right) \cap \left(\cap_{i=1}^{p} \text{dom} \, h_i \right) \neq \varnothing\). The optimal value is denoted \(p^*\).
Lagrangian function: \(L: \mathbb{R}^{n} \times \mathbb{R}^{m} \times \mathbb{R}^{p} \rightarrow \mathbb{R}\), with \(\text{dom} \, L = \mathcal{D} \times \mathbb{R}^{m} \times \mathbb{R}^{p}\):
\[
L(\boldsymbol{x}, \boldsymbol{\lambda}, \boldsymbol{\nu}) = f_0(\boldsymbol{x}) + \sum_{i=1}^{m} \lambda_i f_i(\boldsymbol{x}) + \sum_{i=1}^{p} \nu_i h_i(\boldsymbol{x})
\]
It is a weighted sum of the objective and constraint functions:
- \(\lambda_i\) is Lagrange multiplier associated with \(f_i(\boldsymbol{x}) \leq 0\) and
- \(\nu_i\) is Lagrange multiplier associated with \(h_i(\boldsymbol{x}) = 0\)
1.2 Lagrange dual function
Lagrange dual function: \(g : \mathbb{R}^{m} \times \mathbb{R}^{p} \rightarrow \mathbb{R}\):
\[
g(\boldsymbol{\lambda}, \boldsymbol{\nu}) = \inf_{\boldsymbol{x} \in \mathcal{D}} L(\boldsymbol{x}, \boldsymbol{\lambda}, \boldsymbol{\nu}) = \inf_{\boldsymbol{x} \in \mathcal{D}} \left( f_0(\boldsymbol{x}) + \sum_{i=1}^{m} \lambda_i f_i(\boldsymbol{x}) + \sum_{i=1}^{p} \nu_i h_i(\boldsymbol{x}) \right)
\]
\(g\) can be \(-\infty\) for some \(\boldsymbol{\lambda}\), \(\boldsymbol{\nu}\). As a pointwise infimum of affine functions of \((\boldsymbol{\lambda}, \boldsymbol{\nu})\), \(g\) is concave even if the primal problem is not convex.
Lower bound property : If \(\boldsymbol{\lambda} \succeq \boldsymbol{0}\), then \(g(\boldsymbol{\lambda}, \boldsymbol{\nu}) \leq p^*\)
Proof: If \(\tilde{\boldsymbol{x}}\) is feasible and \(\boldsymbol{\lambda} \succeq \boldsymbol{0}\), then
\[
f_0(\tilde{\boldsymbol{x}}) \geq L(\tilde{\boldsymbol{x}}, \boldsymbol{\lambda}, \boldsymbol{\nu}) \geq \inf_{\boldsymbol{x} \in \mathcal{D}} L(\boldsymbol{x}, \boldsymbol{\lambda}, \boldsymbol{\nu}) = g(\boldsymbol{\lambda}, \boldsymbol{\nu})
\]
(the first inequality holds because \(\lambda_i f_i(\tilde{\boldsymbol{x}}) \leq 0\) and \(\nu_i h_i(\tilde{\boldsymbol{x}}) = 0\)); minimizing over all feasible \(\tilde{\boldsymbol{x}}\) gives \(p^* \geq g(\boldsymbol{\lambda}, \boldsymbol{\nu})\).
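The lower bound property can be illustrated numerically on a small problem. The instance below (minimize \(x^2\) subject to \(x \geq 1\)) is a made-up toy example, not one from the lecture; here \(p^* = 1\) and the dual function works out to \(g(\lambda) = \lambda - \lambda^2/4\).

```python
# Toy illustration of the lower bound property (example not from the lecture):
# minimize f0(x) = x^2  subject to  f1(x) = 1 - x <= 0, so p* = 1 at x = 1.
# L(x, lam) = x^2 + lam * (1 - x) is minimized over x at x = lam / 2, giving
# the dual function g(lam) = lam - lam^2 / 4.

def g(lam):
    """Lagrange dual function of: minimize x^2 subject to x >= 1."""
    return lam - lam ** 2 / 4.0

p_star = 1.0

# g(lam) <= p* for every lam >= 0, with equality at lam* = 2
for lam in [0.0, 0.5, 1.0, 2.0, 5.0, 10.0]:
    assert g(lam) <= p_star + 1e-12

print(g(2.0))  # 1.0 -- the bound is attained: g(2) = 1 = p*
```

Every nonnegative \(\lambda\) certifies a lower bound on \(p^*\); the best such bound (here, at \(\lambda = 2\)) is exactly what the dual problem in the next section maximizes.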
2. The Lagrange dual problem
The Lagrange dual problem is
\[
\begin{aligned}
\max_{\boldsymbol{\lambda}, \boldsymbol{\nu}} \quad & g(\boldsymbol{\lambda}, \boldsymbol{\nu}) \\
\text{s.t.} \quad & \boldsymbol{\lambda} \succeq \boldsymbol{0}
\end{aligned}
\]
- finds the best lower bound on \(p^*\) obtainable from the Lagrange dual function
- a convex optimization problem; optimal value denoted \(d^*\)
- \(\boldsymbol{\lambda}\), \(\boldsymbol{\nu}\) are dual feasible if \(\boldsymbol{\lambda} \succeq \boldsymbol{0}\) and \((\boldsymbol{\lambda}, \boldsymbol{\nu}) \in \text{dom} \, g\)
- often simplified by making the implicit constraint \((\boldsymbol{\lambda}, \boldsymbol{\nu}) \in \text{dom} \, g\) explicit
2.1 Weak and strong duality
Weak duality : \(d^* \leq p^*\)
- always holds (for convex and nonconvex problems)
- can be used to find nontrivial lower bounds for difficult problems
Strong duality: \(d^* = p^*\)
- does not hold in general
- (usually) holds for convex problems
- conditions that guarantee strong duality in convex problems are called constraint qualifications
2.2 Slater's constraint qualification
Strong duality holds for a convex problem
\[
\begin{aligned}
\min_{\boldsymbol{x}} \quad & f_0(\boldsymbol{x}) \\
\text{s.t.} \quad & f_i(\boldsymbol{x}) \leq 0, \quad i = 1, 2, \cdots, m \\
& \boldsymbol{A} \boldsymbol{x} = \boldsymbol{b}
\end{aligned}
\]
if it is strictly feasible, i.e., there exists \(\boldsymbol{x} \in \text{int} \, \mathcal{D}\) with \(f_i(\boldsymbol{x}) < 0\) for \(i = 1, 2, \cdots, m\) and \(\boldsymbol{A} \boldsymbol{x} = \boldsymbol{b}\).
- also guarantees that the dual optimum is attained (if \(p^* > -\infty\))
- can be sharpened: e.g., \(\text{int} \, \mathcal{D}\) can be replaced with \(\text{relint} \, \mathcal{D}\) (interior relative to affine hull); linear inequalities do not need to hold with strict inequality, ...
There exist many other types of constraint qualifications
2.3 Examples
(1) Inequality form LP
Primal problem:
\[
\begin{aligned}
\min_{\boldsymbol{x}} \quad & \boldsymbol{c}^{\top} \boldsymbol{x} \\
\text{s.t.} \quad & \boldsymbol{A} \boldsymbol{x} \preceq \boldsymbol{b}
\end{aligned}
\]
The dual function \(g(\boldsymbol{\lambda})\) is
\[
g(\boldsymbol{\lambda}) = \inf_{\boldsymbol{x}} \left( \boldsymbol{c}^{\top} \boldsymbol{x} + \boldsymbol{\lambda}^{\top} (\boldsymbol{A} \boldsymbol{x} - \boldsymbol{b}) \right) = \begin{cases} -\boldsymbol{b}^{\top} \boldsymbol{\lambda}, & \boldsymbol{A}^{\top} \boldsymbol{\lambda} + \boldsymbol{c} = \boldsymbol{0} \\ -\infty, & \text{otherwise} \end{cases}
\]
The dual problem is:
\[
\begin{aligned}
\max_{\boldsymbol{\lambda}} \quad & -\boldsymbol{b}^{\top} \boldsymbol{\lambda} \\
\text{s.t.} \quad & \boldsymbol{A}^{\top} \boldsymbol{\lambda} + \boldsymbol{c} = \boldsymbol{0}, \quad \boldsymbol{\lambda} \succeq \boldsymbol{0}
\end{aligned}
\]
- from Slater's condition: \(p^* = d^*\) if \(\boldsymbol{A} \tilde{\boldsymbol{x}} \prec \boldsymbol{b}\) for some \(\tilde{\boldsymbol{x}}\)
- in fact, \(p^* = d^*\) except when both primal and dual are infeasible
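LP duality can be checked numerically. The sketch below, assuming `scipy` is available, uses a made-up instance (the box \(0 \preceq \boldsymbol{x} \preceq \boldsymbol{1}\) written as \(\boldsymbol{A}\boldsymbol{x} \preceq \boldsymbol{b}\)) and solves both the primal and the dual with `scipy.optimize.linprog`:

```python
# Numerical check of LP duality on a toy instance (data not from the slides).
import numpy as np
from scipy.optimize import linprog

# Primal: minimize c^T x  s.t.  A x <= b  (here the box 0 <= x <= 1)
c = np.array([-1.0, -1.0])
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([1.0, 1.0, 0.0, 0.0])
primal = linprog(c, A_ub=A, b_ub=b, bounds=[(None, None)] * 2)

# Dual: maximize -b^T lam  s.t.  A^T lam + c = 0, lam >= 0.
# linprog minimizes, so minimize b^T lam and negate the result.
dual = linprog(b, A_eq=A.T, b_eq=-c, bounds=[(0.0, None)] * 4)

p_star, d_star = primal.fun, -dual.fun
print(p_star, d_star)  # both -2.0: p* = d*, as strong duality predicts
```

Note the explicit `bounds=[(None, None)]` for the primal: `linprog` defaults to \(x \succeq 0\), which would silently change the problem.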
(2) Quadratic program
The primal problem is (assume \(\boldsymbol{P} \in \mathbb{S}^{n}_{++}\), i.e., \(\boldsymbol{P}\) is a symmetric positive definite matrix):
\[
\begin{aligned}
\min_{\boldsymbol{x}} \quad & \boldsymbol{x}^{\top} \boldsymbol{P} \boldsymbol{x} \\
\text{s.t.} \quad & \boldsymbol{A} \boldsymbol{x} \preceq \boldsymbol{b}
\end{aligned}
\]
The dual function \(g(\boldsymbol{\lambda})\) is:
\[
g(\boldsymbol{\lambda}) = \inf_{\boldsymbol{x}} \left( \boldsymbol{x}^{\top} \boldsymbol{P} \boldsymbol{x} + \boldsymbol{\lambda}^{\top} (\boldsymbol{A} \boldsymbol{x} - \boldsymbol{b}) \right) = -\frac{1}{4} \boldsymbol{\lambda}^{\top} \boldsymbol{A} \boldsymbol{P}^{-1} \boldsymbol{A}^{\top} \boldsymbol{\lambda} - \boldsymbol{b}^{\top} \boldsymbol{\lambda}
\]
Note that \(h(\boldsymbol{x}) = \boldsymbol{x}^{\top} \boldsymbol{P} \boldsymbol{x} + \boldsymbol{\lambda}^{\top} (\boldsymbol{A} \boldsymbol{x} - \boldsymbol{b})\) is a quadratic function w.r.t. \(\boldsymbol{x}\). Thus, setting \(\dfrac{\partial h(\boldsymbol{x})}{\partial \boldsymbol{x}} = \boldsymbol{0}\) gives \(\boldsymbol{x} = -\dfrac{1}{2} \boldsymbol{P}^{-1} \boldsymbol{A}^{\top} \boldsymbol{\lambda}\).
The dual problem is:
\[
\begin{aligned}
\max_{\boldsymbol{\lambda}} \quad & -\frac{1}{4} \boldsymbol{\lambda}^{\top} \boldsymbol{A} \boldsymbol{P}^{-1} \boldsymbol{A}^{\top} \boldsymbol{\lambda} - \boldsymbol{b}^{\top} \boldsymbol{\lambda} \\
\text{s.t.} \quad & \boldsymbol{\lambda} \succeq \boldsymbol{0}
\end{aligned}
\]
- from Slater's condition: \(p^* = d^*\) if \(\boldsymbol{A} \tilde{\boldsymbol{x}} \prec \boldsymbol{b}\) for some \(\tilde{\boldsymbol{x}}\)
- in fact, \(p^* = d^*\) always
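The closed-form minimizer and dual function above can be sanity-checked with a few lines of numpy (the random instance data is an assumption for illustration, not from the slides):

```python
# Sanity check of the QP dual derivation on random data.
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((3, 3))
P = M.T @ M + np.eye(3)             # symmetric positive definite
A = rng.standard_normal((2, 3))
b = rng.standard_normal(2)
lam = np.array([0.7, 1.3])          # any lam >= 0

# minimizer of the Lagrangian: x(lam) = -(1/2) P^{-1} A^T lam
x = -0.5 * np.linalg.solve(P, A.T @ lam)

# gradient of h(x) = x^T P x + lam^T (A x - b) vanishes at x(lam)
grad = 2 * P @ x + A.T @ lam
assert np.allclose(grad, 0)

# closed-form g(lam) equals the Lagrangian evaluated at its minimizer
g_closed = -0.25 * lam @ A @ np.linalg.solve(P, A.T @ lam) - b @ lam
L_at_x = x @ P @ x + lam @ (A @ x - b)
assert np.isclose(g_closed, L_at_x)
```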
(3) A nonconvex problem with strong duality
The trust region problem: the primal problem is:
\[
\begin{aligned}
\min_{\boldsymbol{x}} \quad & \boldsymbol{x}^{\top} \boldsymbol{A} \boldsymbol{x} + 2 \boldsymbol{b}^{\top} \boldsymbol{x} \\
\text{s.t.} \quad & \boldsymbol{x}^{\top} \boldsymbol{x} \leq 1
\end{aligned}
\]
where \(\boldsymbol{A} \in \mathbb{S}^{n}\) (\(\boldsymbol{A}\) is a symmetric matrix) and \(\boldsymbol{A} \nsucceq \boldsymbol{0}\); thus, this is not a convex problem.
The dual function \(g(\lambda)\) is:
\[
g(\lambda) = \inf_{\boldsymbol{x}} \left( \boldsymbol{x}^{\top} (\boldsymbol{A} + \lambda \boldsymbol{I}) \boldsymbol{x} + 2 \boldsymbol{b}^{\top} \boldsymbol{x} - \lambda \right) = \begin{cases} -\boldsymbol{b}^{\top} (\boldsymbol{A} + \lambda \boldsymbol{I})^{\dagger} \boldsymbol{b} - \lambda, & \boldsymbol{A} + \lambda \boldsymbol{I} \succeq \boldsymbol{0}, \ \boldsymbol{b} \in \mathcal{R}(\boldsymbol{A} + \lambda \boldsymbol{I}) \\ -\infty, & \text{otherwise} \end{cases}
\]
where \((\boldsymbol{A} + \lambda \boldsymbol{I})^{\dagger}\) is the pseudo-inverse of \(\boldsymbol{A} + \lambda \boldsymbol{I}\) and \(\mathcal{R}\) denotes the range. The Lagrangian is minimized over \(\boldsymbol{x}\) by \(\boldsymbol{x} = -(\boldsymbol{A} + \lambda \boldsymbol{I})^{\dagger} \boldsymbol{b}\).
The dual problem and an equivalent semidefinite program (SDP) are:
\[
\begin{aligned}
\max_{\lambda} \quad & -\boldsymbol{b}^{\top} (\boldsymbol{A} + \lambda \boldsymbol{I})^{\dagger} \boldsymbol{b} - \lambda \\
\text{s.t.} \quad & \boldsymbol{A} + \lambda \boldsymbol{I} \succeq \boldsymbol{0}, \quad \boldsymbol{b} \in \mathcal{R}(\boldsymbol{A} + \lambda \boldsymbol{I})
\end{aligned}
\qquad
\begin{aligned}
\max_{t, \lambda} \quad & -t - \lambda \\
\text{s.t.} \quad & \begin{bmatrix} \boldsymbol{A} + \lambda \boldsymbol{I} & \boldsymbol{b} \\ \boldsymbol{b}^{\top} & t \end{bmatrix} \succeq \boldsymbol{0}
\end{aligned}
\]
- strong duality holds although the primal problem is not convex (not easy to show)
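While proving strong duality here is not easy, observing it numerically is. The sketch below uses a made-up 2-by-2 indefinite instance (an assumption for illustration): it brute-forces the primal over the unit circle and maximizes \(g(\lambda)\) over a grid, and the two values agree.

```python
# Numerical illustration that p* = d* for a toy trust region instance.
import numpy as np

A = np.diag([1.0, -1.0])   # eigenvalues +1 and -1: A is indefinite
b = np.array([1.0, 1.0])

# A is indefinite, so the quadratic has no interior minimizer and the
# primal minimum lies on the boundary x^T x = 1; sample it densely.
theta = np.linspace(0.0, 2 * np.pi, 200001)
X = np.stack([np.cos(theta), np.sin(theta)], axis=1)
p_est = np.min(np.einsum('ij,jk,ik->i', X, A, X) + 2 * X @ b)

# Dual: g(lam) = -b^T (A + lam I)^{-1} b - lam, finite for lam > 1 here
# (A + lam I is diagonal, so the quadratic form is a sum of two terms).
lam = np.linspace(1.0 + 1e-6, 5.0, 400001)
g = -(b[0] ** 2 / (1.0 + lam) + b[1] ** 2 / (lam - 1.0)) - lam
d_est = np.max(g)

assert d_est <= p_est + 1e-6          # weak duality
assert abs(p_est - d_est) < 1e-3      # strong duality, numerically
```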
3. Geometric interpretation
We can give a simple geometric interpretation of the dual function in terms of the set
\[
\mathcal{G} = \{ (f_1(\boldsymbol{x}), \cdots, f_m(\boldsymbol{x}), h_1(\boldsymbol{x}), \cdots, h_p(\boldsymbol{x}), f_0(\boldsymbol{x})) \mid \boldsymbol{x} \in \mathcal{D} \} \subseteq \mathbb{R}^{m} \times \mathbb{R}^{p} \times \mathbb{R}
\]
which is the set of values taken on by the constraint and objective functions.
Let \(f_i(\boldsymbol{x}) = u_i\), \(h_i(\boldsymbol{x}) = v_i\), \(f_0(\boldsymbol{x}) = t\); then the optimal value of the primal problem \(p^*\) can be expressed in terms of \(\mathcal{G}\) as:
\[
p^* = \inf \{ t \mid (\boldsymbol{u}, \boldsymbol{v}, t) \in \mathcal{G}, \ \boldsymbol{u} \preceq \boldsymbol{0}, \ \boldsymbol{v} = \boldsymbol{0} \}
\]
To evaluate the dual function \(g(\boldsymbol{\lambda}, \boldsymbol{\nu})\) at \((\boldsymbol{\lambda}, \boldsymbol{\nu})\), we can minimize the affine function
\[
(\boldsymbol{\lambda}, \boldsymbol{\nu}, 1)^{\top} (\boldsymbol{u}, \boldsymbol{v}, t) = \sum_{i=1}^{m} \lambda_i u_i + \sum_{i=1}^{p} \nu_i v_i + t
\]
over \((\boldsymbol{u}, \boldsymbol{v}, t) \in \mathcal{G}\), i.e.,
\[
g(\boldsymbol{\lambda}, \boldsymbol{\nu}) = \inf \{ (\boldsymbol{\lambda}, \boldsymbol{\nu}, 1)^{\top} (\boldsymbol{u}, \boldsymbol{v}, t) \mid (\boldsymbol{u}, \boldsymbol{v}, t) \in \mathcal{G} \}
\]
Therefore, if the infimum is finite, we have
\[
(\boldsymbol{\lambda}, \boldsymbol{\nu}, 1)^{\top} (\boldsymbol{u}, \boldsymbol{v}, t) \geq g(\boldsymbol{\lambda}, \boldsymbol{\nu}) \quad \text{for all } (\boldsymbol{u}, \boldsymbol{v}, t) \in \mathcal{G}
\]
that is, \((\boldsymbol{\lambda}, \boldsymbol{\nu}, 1)\) and \(g(\boldsymbol{\lambda}, \boldsymbol{\nu})\) define a nonvertical supporting hyperplane to \(\mathcal{G}\).
4. Optimality conditions
4.2 Complementary slackness
Assume strong duality holds, \(\boldsymbol{x}^*\) is primal optimal, and \((\boldsymbol{\lambda}^*, \boldsymbol{\nu}^*)\) is dual optimal. Then
\[
\begin{aligned}
f_0(\boldsymbol{x}^*) = g(\boldsymbol{\lambda}^*, \boldsymbol{\nu}^*) &= \inf_{\boldsymbol{x}} \left( f_0(\boldsymbol{x}) + \sum_{i=1}^{m} \lambda_i^* f_i(\boldsymbol{x}) + \sum_{i=1}^{p} \nu_i^* h_i(\boldsymbol{x}) \right) \\
&\leq f_0(\boldsymbol{x}^*) + \sum_{i=1}^{m} \lambda_i^* f_i(\boldsymbol{x}^*) + \sum_{i=1}^{p} \nu_i^* h_i(\boldsymbol{x}^*) \\
&\leq f_0(\boldsymbol{x}^*)
\end{aligned}
\]
Hence, the two inequalities hold with equality:
- \(\boldsymbol{x}^*\) minimizes \(L(\boldsymbol{x}, \boldsymbol{\lambda}^*, \boldsymbol{\nu}^*)\)
- Complementary slackness: \(\lambda_i^* f_i(\boldsymbol{x}^*) = 0\) for \(i = 1, 2, \cdots, m\), that is:
\[
\lambda_i^* > 0 \Rightarrow f_i(\boldsymbol{x}^*) = 0, \qquad f_i(\boldsymbol{x}^*) < 0 \Rightarrow \lambda_i^* = 0
\]
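Complementary slackness is easy to see on a small worked example (a made-up instance, not from the lecture): minimize \(x^2\) subject to \(f_1(x) = 1 - x \leq 0\) and \(f_2(x) = x - 3 \leq 0\). The active constraint carries a positive multiplier; the inactive one carries zero.

```python
# Complementary slackness on a toy 1-D problem: minimize x^2 s.t. 1 <= x <= 3.
# Optimum: x* = 1, with lam1* = 2 (constraint 1 - x <= 0 is active)
# and lam2* = 0 (constraint x - 3 <= 0 is inactive).
x_star, lam1, lam2 = 1.0, 2.0, 0.0

f1 = 1.0 - x_star          # = 0: active, so lam1 may be positive
f2 = x_star - 3.0          # = -2 < 0: inactive, so lam2 must be 0

# lam_i * f_i(x*) = 0 for every i
assert lam1 * f1 == 0.0
assert lam2 * f2 == 0.0

# and x* minimizes the Lagrangian:
# d/dx [x^2 + lam1 * (1 - x) + lam2 * (x - 3)] = 2x - lam1 + lam2 = 0 at x*
assert 2 * x_star - lam1 + lam2 == 0.0
```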
4.3 Karush-Kuhn-Tucker (KKT) conditions
(1) KKT conditions for nonconvex problem
The following four conditions are called the KKT conditions (for a problem with differentiable objective and constraint functions \(f_i\), \(h_i\)):
- (1) primal constraints: \(f_i(\boldsymbol{x}) \leq 0, \ i = 1, 2, \cdots, m\), \(h_i(\boldsymbol{x}) = 0, \ i = 1, 2, \cdots, p\)
- (2) dual constraints: \(\boldsymbol{\lambda} \succeq \boldsymbol{0}\)
- (3) complementary slackness: \(\lambda_i f_i(\boldsymbol{x}) = 0, \ i = 1, 2, \cdots, m\)
- (4) gradient of Lagrangian with respect to \(\boldsymbol{x}\) vanishes:
\[
\nabla f_0(\boldsymbol{x}) + \sum_{i=1}^{m} \lambda_i \nabla f_i(\boldsymbol{x}) + \sum_{i=1}^{p} \nu_i \nabla h_i(\boldsymbol{x}) = \boldsymbol{0}
\]
If strong duality holds and \(\boldsymbol{x}\), \(\boldsymbol{\lambda}\), \(\boldsymbol{\nu}\) are optimal, then they must satisfy the KKT conditions.
To summarize, for any optimization problem with differentiable objective and constraint functions for which strong duality obtains, any pair of primal and dual optimal points must satisfy the KKT conditions.
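All four conditions can be verified numerically for a small convex problem whose solution is known by symmetry. The instance below (minimize \(\|\boldsymbol{x}\|^2\) subject to \(1 - \boldsymbol{1}^{\top}\boldsymbol{x} \leq 0\)) is a made-up illustration, not an example from the lecture; its solution is \(\boldsymbol{x}^* = \frac{1}{n}\boldsymbol{1}\) with \(\lambda^* = 2/n\).

```python
# Checking the four KKT conditions for: minimize ||x||^2 s.t. 1 - 1^T x <= 0.
import numpy as np

n = 4
x = np.full(n, 1.0 / n)    # candidate primal point (by symmetry)
lam = 2.0 / n              # candidate multiplier

f1 = 1.0 - x.sum()
assert f1 <= 1e-12                      # (1) primal feasibility
assert lam >= 0                         # (2) dual feasibility
assert abs(lam * f1) < 1e-12            # (3) complementary slackness
grad_L = 2 * x + lam * (-np.ones(n))    # (4) gradient of the Lagrangian:
assert np.allclose(grad_L, 0)           #     grad f0 + lam * grad f1 = 0
```

Since the problem is convex, these conditions are also sufficient, so this check certifies that the candidate pair really is primal-dual optimal.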
(2) KKT conditions for convex problems
If \(\tilde{\boldsymbol{x}}\), \(\tilde{\boldsymbol{\lambda}}\), \(\tilde{\boldsymbol{\nu}}\) satisfy the KKT conditions for a convex problem, then they are optimal:
- from complementary slackness: \(f_0(\tilde{\boldsymbol{x}}) = L(\tilde{\boldsymbol{x}}, \tilde{\boldsymbol{\lambda}}, \tilde{\boldsymbol{\nu}})\)
- from the 4th KKT condition (and convexity): \(\tilde{\boldsymbol{x}}\) minimizes \(L(\boldsymbol{x}, \tilde{\boldsymbol{\lambda}}, \tilde{\boldsymbol{\nu}})\), so \(g(\tilde{\boldsymbol{\lambda}}, \tilde{\boldsymbol{\nu}}) = L(\tilde{\boldsymbol{x}}, \tilde{\boldsymbol{\lambda}}, \tilde{\boldsymbol{\nu}})\)
Hence, \(f_0(\tilde{\boldsymbol{x}}) = g(\tilde{\boldsymbol{\lambda}}, \tilde{\boldsymbol{\nu}})\), which together with weak duality shows optimality.
If Slater's condition is satisfied, then \(\boldsymbol{x}\) is optimal if and only if there exist \(\boldsymbol{\lambda}\), \(\boldsymbol{\nu}\) that satisfy the KKT conditions:
- Slater's condition implies strong duality, and the dual optimum is attained
- this generalizes the optimality condition \(\nabla f_0(\boldsymbol{x}) = \boldsymbol{0}\) for unconstrained problems
5. Perturbation and sensitivity analysis
Reference
[1] S. Boyd and L. Vandenberghe, *Convex Optimization*; edX course (YouTube videos and slides)
