MIT 6.7220/15.084 — Nonlinear Optimization Thu, Feb 29th 2024
Lecture 6
Conic optimization
Instructor: Prof. Gabriele Farina ( gfarina@mit.edu)★
★These notes are class material that has not undergone formal peer review. The TAs and I are grateful for any
reports of typos.
In Lectures 2 and 4, we considered normal cones at the intersection of linear equality and inequality
constraints. In Exercise 1 of Homework 1, we were also able to give closed formulas for the normal
cone in certain sets, such as a ball.
In Lecture 5, we turned our attention to more general feasible sets. There, we saw how KKT condi-
tions define necessary conditions for feasible sets defined by functional constraints by approximating
the feasible set with the linearization of the active constraints and considering the normal cone to
the intersection of those linearizations.
Today, we look at a class of convex feasible sets that are not defined by linear equality or inequality constraints, nor via functional constraints more generally, making the KKT machinery inapplicable.
1 Conic optimization
A conic optimization problem is a nonlinear optimization problem whose feasible set is the intersec-
tion between an affine subspace (that is, a system of linear equalities) and a nonempty closed convex
cone 𝒦:
    min_𝑥     𝑓(𝑥)
    s.t.      𝐴𝑥 = 𝑏
              𝑥 ∈ 𝒦
This class of problems captures linear programming, semidefinite programming, second-order cone
programming, copositive programming, and more, depending on the specific cone 𝒦 that is selected.
1.1 Nonnegative cone ⟷ Linear programming
The first example is the nonnegative cone ℝ𝑛≥0. In this case, the conic problem takes the more familiar form

    min_𝑥     𝑓(𝑥)
    s.t.      𝐴𝑥 = 𝑏
              𝑥 ≥ 0
which is a linear program. So, for the specific choice of the nonnegative cone, conic programming
corresponds to linear programming.
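To make the correspondence concrete, here is a minimal sketch (not from the notes: it assumes the cvxpy package and uses made-up data 𝐴, 𝑏, 𝑐) of the conic form above instantiated with 𝒦 = ℝ𝑛≥0 and a linear objective:

```python
# Minimal sketch: a conic problem over the nonnegative cone is just an LP.
# Assumes cvxpy is installed; A, b, c are synthetic data for illustration.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))
b = A @ rng.random(5)            # built from a nonnegative point, so feasible
c = rng.random(5)                # nonnegative objective => bounded below

x = cp.Variable(5)
prob = cp.Problem(cp.Minimize(c @ x), [A @ x == b, x >= 0])  # x in the cone
prob.solve()
print(prob.status, prob.value)
```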
1.2 Ice-cream (aka. Lorentz) cone ⟷ Second-order conic programming
Definition 1.1.
The ice-cream cone, or Lorentz cone, is defined as
ℒ𝑛 ≔ {(𝑥, 𝑧) ∈ ℝ𝑛−1 × ℝ : 𝑧 ≥ ‖𝑥‖2}.
The figure on the right shows the shape of ℒ3.
Conic problems for the specific choice of the Lorentz
cone are usually called second-order cone programs,
or SOCP for short. Several problems of interest in engineering and physics can be modeled as SOCPs, including the design of antenna arrays, the positions of spring systems at rest, and grasp planning in robotics.

[Figure: the Lorentz cone ℒ3, plotted in (𝑥1, 𝑥2, 𝑧)-space.]
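As a quick illustration (again a sketch with synthetic data, assuming cvxpy), the Lorentz-cone constraint 𝑧 ≥ ‖𝑥‖2 can be imposed directly with the cp.SOC atom:

```python
# Minimal SOCP sketch: minimize the cone "height" z subject to Ax = b and
# (x, z) in the Lorentz cone. Synthetic data; assumes cvxpy is installed.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 4))
b = A @ rng.random(4)

x = cp.Variable(4)
z = cp.Variable()
prob = cp.Problem(cp.Minimize(z), [A @ x == b, cp.SOC(z, x)])  # z >= ||x||_2
prob.solve()
print(prob.value, np.linalg.norm(x.value))   # equal at the optimum
```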
1.3 Semidefinite cone ⟷ Semidefinite programming
Definition 1.2. The semidefinite cone 𝒮𝑛 is the set of
positive semidefinite 𝑛 × 𝑛 matrices:
𝒮𝑛 ≔ {𝑋 ∈ 𝕊𝑛 : 𝑋 ≽ 0}
= {𝑋 ∈ 𝕊𝑛 : 𝑎⊤𝑋𝑎 ≥ 0 ∀𝑎 ∈ ℝ𝑛},
where 𝕊𝑛 is the set of symmetric 𝑛 × 𝑛 real matrices.
The figure on the right shows the cone of values (𝑥, 𝑦, 𝑧) corresponding to the semidefinite cone of 2 × 2 matrices (𝑥 𝑦; 𝑦 𝑧) ≽ 0.
In this simple 2 × 2 case, the positive semidefiniteness condition is equivalent to
• 𝑥 + 𝑧 ≥ 0 (the sum of the eigenvalues [trace] is nonnegative), and
• 𝑥𝑧 − 𝑦² ≥ 0 (the product of the eigenvalues [determinant] is nonnegative).

[Figure: the cone of 2 × 2 positive semidefinite matrices, plotted in (𝑥, 𝑦, 𝑧)-space.]
These conditions are equivalent to 𝑥, 𝑧 ≥ 0 ∧ 𝑥𝑧 ≥ 𝑦². So, slices of the cone with planes orthogonal to the 𝑧-axis are shaped like parabolas 𝑥 ≥ 𝑦²/𝑧. Similarly, slices with planes orthogonal to the 𝑥-axis are shaped like parabolas 𝑧 ≥ 𝑦²/𝑥. Note how the curvature of the parabolas decreases as the slicing plane gets further away from the origin.
Remark 1.1. Semidefinite programming subsumes both linear programming and second-order
cone programming. This is because
    𝑥 ∈ ℝ𝑛≥0  ⟺  diag(𝑥1, …, 𝑥𝑛) ≽ 0,  and
    𝑧 ≥ ‖𝑥‖2  ⟺  (𝑧𝐼 𝑥; 𝑥⊤ 𝑧) ≽ 0.
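The second equivalence (a Schur-complement fact) is easy to sanity-check numerically; here is a small sketch using only numpy, with random data:

```python
# Sampling check of: z >= ||x||_2  <=>  [[z*I, x], [x^T, z]] is PSD.
import numpy as np

rng = np.random.default_rng(2)
for _ in range(1000):
    x = rng.standard_normal(3)
    z = 2 * rng.standard_normal()
    M = np.block([[z * np.eye(3), x[:, None]],
                  [x[None, :],    np.array([[z]])]])
    psd = np.linalg.eigvalsh(M).min() >= -1e-9
    assert psd == (z >= np.linalg.norm(x))
print("ok")
```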
Semidefinite programming is an extremely powerful tool with applications in all disciplines. I have
noted resources that treat semidefinite programming extensively at the end of this document.
1.4 Copositive cone ⟷ Copositive programming
Definition 1.3. The copositive cone 𝒞𝑛 is the set of symmetric 𝑛 × 𝑛 real matrices whose quadratic form is nonnegative on the nonnegative orthant:

    𝒞𝑛 ≔ {𝑋 ∈ 𝕊𝑛 : 𝑎⊤𝑋𝑎 ≥ 0 ∀𝑎 ∈ ℝ𝑛≥0}.
The difference with the positive semidefinite cone 𝒮𝑛 is that we need 𝑎⊤𝑋𝑎 ≥ 0 only for nonnegative vectors 𝑎 ∈ ℝ𝑛≥0, and not for all 𝑎 ∈ ℝ𝑛.
The figure on the right shows the cone of values (𝑥, 𝑦, 𝑧) corresponding to the copositive cone of 2 × 2 matrices, that is, (𝑥 𝑦; 𝑦 𝑧) ∈ 𝒞2.

[Figure: the copositive cone of 2 × 2 matrices, plotted in (𝑥, 𝑦, 𝑧)-space.]
It follows immediately from the definition that 𝒞𝑛 is a superset of both the cone of nonnegative symmetric matrices 𝒩𝑛 ≔ 𝕊𝑛 ∩ ℝ𝑛×𝑛≥0 and the cone of semidefinite matrices. Hence,

    𝒩𝑛 + 𝒮𝑛 ⊆ 𝒞𝑛   for all 𝑛 ∈ ℕ.
Curiously, the reverse inclusion is known to hold, but only up to dimension 𝑛 ≤ 4 [Dia62]:

    𝒞𝑛 = 𝒮𝑛 + 𝒩𝑛   if 𝑛 ≤ 4.

Beyond dimension 4, there exist copositive matrices that cannot be expressed as the sum of a positive semidefinite matrix and a nonnegative one.
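A classical certificate of this (not discussed further in these notes) is the 5 × 5 Horn matrix; the sketch below, using only numpy, verifies two cheap facts numerically: its quadratic form is nonnegative on sampled nonnegative vectors (consistent with copositivity), while the matrix itself is not positive semidefinite:

```python
# The Horn matrix: copositive, but not PSD (and, famously, not in S^5 + N^5).
# The sampling below is only a necessary check of copositivity, not a proof.
import numpy as np

H = np.array([[ 1, -1,  1,  1, -1],
              [-1,  1, -1,  1,  1],
              [ 1, -1,  1, -1,  1],
              [ 1,  1, -1,  1, -1],
              [-1,  1,  1, -1,  1]], dtype=float)

rng = np.random.default_rng(3)
V = rng.random((100_000, 5))                 # nonnegative test vectors
vals = np.einsum('ij,jk,ik->i', V, H, V)     # quadratic forms a^T H a
print((vals >= -1e-12).all())                # True: no violation found
print(np.linalg.eigvalsh(H).min())           # ~ -1.24 < 0: H is not PSD
```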
Remark 1.2. One important fact to know about the copositive cone is that despite being a closed
convex cone, optimization of even linear functions over the copositive cone is computationally
intractable!
For example, the size of the maximum independent set of a simple graph 𝐺 = (𝑉 , 𝐸), that is, the
size of the largest set 𝑆 ⊆ 𝑉 of vertices with the property that no two vertices in 𝑆 are connected
by an edge in 𝐸, can be computed as the solution to the following copositive problem. Let 𝐴 ∈
{0, 1}𝑉 ×𝑉 denote the adjacency matrix of 𝐺, that is,
    𝐴𝑖𝑗 = 1 if (𝑖, 𝑗) ∈ 𝐸, and 𝐴𝑖𝑗 = 0 otherwise.
The size of the maximum independent set of 𝐺 can then be found by solving the copositive opti-
mization problem
    min_{𝑤,𝑡∈ℝ,𝑄}    𝑤
    s.t.             𝑄 = 𝑤𝐼 + 𝑡𝐴 − 𝟙𝟙⊤
                     𝑄 ∈ 𝒞𝑛                    (MIS)
where 𝟙 ∈ ℝ𝑉 is the vector of all ones. More information is available in the paper by Klerk, E. de, & Pasechnik, D. V. [KP02]. Since deciding whether a graph admits an independent set of size 𝑘 is NP-complete, solving copositive optimization problems must take more than polynomial time in the worst case unless P = NP. The hardness applies already for linear objectives, as the previous example shows. So, we conclude—for example—that one cannot construct an efficient separation oracle for 𝒞𝑛.
Several techniques are known to “relax” the copositive cone by, for example, approximating it with
a (higher-dimensional) semidefinite cone. The monographs of Gärtner, B., & Matoušek, J. [GM12]
and Klerk, E. de. [Kle02] do an excellent job at explaining approximation techniques that involve
semidefinite cones.
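For instance, replacing 𝒞𝑛 in (MIS) with its inner approximation 𝒮𝑛 + 𝒩𝑛 shrinks the feasible set, so the optimal value can only increase: one obtains a tractable semidefinite upper bound on the independence number. A sketch (assuming cvxpy with an SDP solver such as SCS; the 5-cycle instance is mine, for illustration):

```python
# SDP relaxation of (MIS): require Q = S + N with S PSD and N entrywise
# nonnegative, instead of Q copositive. Sketch with cvxpy; made-up instance.
import cvxpy as cp
import numpy as np

n = 5
A = np.zeros((n, n))
for i in range(n):                       # adjacency matrix of the 5-cycle
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1
J = np.ones((n, n))

w, t = cp.Variable(), cp.Variable()
S = cp.Variable((n, n), symmetric=True)
N = cp.Variable((n, n), symmetric=True)
prob = cp.Problem(cp.Minimize(w),
                  [w * np.eye(n) + t * A - J == S + N, S >> 0, N >= 0])
prob.solve()
print(w.value)    # an upper bound on the independence number alpha = 2
```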
2 Optimality conditions and conic duality
In general, the feasible set Ω of conic optimization problems cannot be easily captured via functional
constraints. Because of that, we cannot use the KKT conditions we discussed in the previous lecture
to compute the normal cone at each point of the feasible set. In this situation, we need to compute
the normal cone from first principles.
To simplify the treatment and avoid the machinery of relative interiors (interiors in the topology induced by the smallest affine subspace that contains a set), we will assume in the following that 𝒦 is a cone with nonempty interior. This condition is satisfied by all the cases discussed above.
In order to compute the normal cone at a point at the intersection of {𝐴𝑥 = 𝑏} and 𝒦, we will
use the following general result (again based on separation), which enables us to consider the affine
subspace and the cone separately and sum their normal cones.
Theorem 2.1. Let 𝐻 ≔ {𝑥 ∈ ℝ𝑛 : 𝐴𝑥 = 𝑏}, where 𝐴 ∈ ℝ𝑚×𝑛, be an affine subspace, and let 𝑆 be a closed convex set (not necessarily a cone) such that 𝑆⚬ ∩ 𝐻 is nonempty, where 𝑆⚬ is the interior of 𝑆. For any 𝑥 ∈ 𝐻 ∩ 𝑆,
𝒩𝐻∩𝑆 (𝑥) = 𝒩𝐻 (𝑥) + 𝒩𝑆 (𝑥).
So, using the fact (see Lecture 2 or Lecture 5) that 𝒩𝐻 (𝑥) = colspan(𝐴⊤), we have
𝒩𝐻∩𝑆 (𝑥) = {𝐴⊤𝜆 + 𝑧 : 𝜆 ∈ ℝ𝑚, 𝑧 ∈ 𝒩𝑆 (𝑥)}.
Remark 2.1. The condition of existence of a feasible point in the interior is sometimes referred to
as Slater’s condition, strict feasibility condition, or strong feasibility condition. The term Slater’s
condition is appropriate in light of the constraint qualification condition for the KKT theory we
discussed last time. In both cases, the insistence on an interior solution is required to rule out
pathological behavior.
We know from Lectures 2 and 5 that Slater’s condition is not necessary for the intersection of halfspaces, which are flat surfaces. The condition becomes, however, crucial when considering sets that can have curvature. For example, consider the intersection of a two-dimensional ball of radius 1 centered at (𝑥, 𝑦) = (1, 0) with the line 𝑥 = 0. The intersection is the single point (0, 0), so the normal cone to the intersection there is ℝ2, but this is not equal to the sum of the normal cones of the two sets at that point (which evaluates to ℝ × {0}).
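This failure is easy to see numerically via the projection characterization of normal cones (for a closed convex set 𝑆, 𝑑 ∈ 𝒩𝑆(𝑥) if and only if proj𝑆(𝑥 + 𝑑) = 𝑥). A numpy-only sketch of the example above:

```python
# The ball S (radius 1, center (1, 0)) meets the line H = {x = 0} only at the
# origin, so every direction is normal to H ∩ S there; yet d = (0, 1) fails
# the projection test for both H and S individually.
import numpy as np

p = np.zeros(2)                          # the unique intersection point
d = np.array([0.0, 1.0])
c = np.array([1.0, 0.0])                 # center of the ball

proj_H = lambda v: np.array([0.0, v[1]])             # projection onto {x = 0}
def proj_S(v):                                       # projection onto the ball
    u = v - c
    return c + u / max(1.0, np.linalg.norm(u))

print(np.allclose(proj_H(p + d), p))     # False: d is not normal to H at p
print(np.allclose(proj_S(p + d), p))     # False: d is not normal to S at p
```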
Remark 2.2. The previous result can be generalized, without much effort or substantially new ideas, to the case of the intersection between two generic convex sets whose relative interiors have a nonempty intersection. We do not need such a generalization for this lecture; however, you might find it useful to know that—provided mild hypotheses hold—the normal cone to the intersection of convex sets is equal to the sum of the normal cones to the individual sets.
2.1 The normal cone at a point in 𝒦
With Theorem 2.1 in our toolbox, the only obstacle to writing first-order optimality conditions for conic problems is being able to compute the normal cone at points in 𝒦. Luckily, we can do that in closed form in all the abovementioned cases.
In particular, it turns out that the normal cone at any point in 𝒦 can be defined starting from
the normal cone at 0 (remember that 0 is always part of a nonempty cone, as a straightforward
consequence of the definition of cone). The normal cone of a cone 𝒦 at 0 ∈ 𝒦 is such an important object that several names for it and its related quantities are used in the literature:
• The polar cone of 𝒦, often denoted 𝒦⟂, is exactly the normal cone at 0:
𝒦⟂ ≔ 𝒩𝒦(0) = {𝑑 : ⟨𝑑, 𝑦⟩ ≤ 0 ∀𝑦 ∈ 𝒦}.
• The dual cone of 𝒦, often denoted 𝒦∗, is the opposite of the normal cone at 0:
𝒦∗ ≔ −𝒦⟂ = −𝒩𝒦(0) = {𝑑 : ⟨𝑑, 𝑦⟩ ≥ 0 ∀𝑦 ∈ 𝒦}.
For all the important cones considered earlier, we have a very precise idea of what their polar and
dual cones look like:
Theorem 2.2.
1. The nonnegative cone, the Lorentz cone, and the semidefinite cones are self-dual cones,
that is, 𝒦∗ = 𝒦.
2. The dual cone to the copositive cone 𝒞𝑛 is the completely positive cone, defined as

    𝒫𝑛 ≔ {𝑧1𝑧1⊤ + ⋯ + 𝑧𝑘𝑧𝑘⊤ : 𝑧𝑖 ∈ ℝ𝑛≥0, 𝑘 ∈ ℕ} = {𝐵𝐵⊤ : 𝐵 ∈ ℝ𝑛×𝑚≥0, 𝑚 ∈ ℕ}.
[▷ Try to prove the previous theorem.] Since (𝒦∗)∗ = 𝒦 for a closed convex cone 𝒦, it follows that the dual to the completely positive cone is the copositive cone. [▷ You should also try to prove that (𝒦∗)∗ = 𝒦.]
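For a quick numerical consistency check of point 1 (a sketch, numpy only, and only of the inclusion ℒ𝑛 ⊆ (ℒ𝑛)∗): any two points of the Lorentz cone should have a nonnegative inner product.

```python
# Sample points (x, z) with z >= ||x||_2 and check <p, q> >= 0 pairwise.
import numpy as np

rng = np.random.default_rng(6)
def sample_lorentz(k, n=4):
    x = rng.standard_normal((k, n - 1))
    z = np.linalg.norm(x, axis=1) + rng.random(k)    # z >= ||x||_2
    return np.hstack([x, z[:, None]])                # coordinates (x, z)

P, Q = sample_lorentz(1000), sample_lorentz(1000)
print((P @ Q.T >= -1e-9).all())                      # True: all pairings >= 0
```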
Once the normal cone at 0 is established, the normal cone at any point 𝑥 ∈ 𝒦 can be recovered as
a function of 𝒩𝒦(0).
Theorem 2.3. Let 𝒦 ⊆ ℝ𝑛 be a nonempty, closed convex cone. The normal cone at any point
𝑥 ∈ 𝒦 is given by
𝒩𝒦(𝑥) = {𝑑 ∈ ℝ𝑛 : ⟨𝑑, 𝑥⟩ = 0, 𝑑 ∈ 𝒩𝒦(0)}.
Proof. Fix a generic 𝑥 ∈ 𝒦. We prove the two inclusions separately.
(⊆) Let 𝑑 ∈ ℝ𝑛 be a direction in 𝒩𝒦(𝑥), that is,
⟨𝑑, 𝑦 − 𝑥⟩ ≤ 0 ∀𝑦 ∈ 𝒦.
Since 𝒦 is a cone, we have that both 2𝑥 and (1/2)𝑥 belong to 𝒦. So,
    ⟨𝑑, 2𝑥 − 𝑥⟩ ≤ 0 ⟹ ⟨𝑑, 𝑥⟩ ≤ 0, and
    ⟨𝑑, (1/2)𝑥 − 𝑥⟩ ≤ 0 ⟹ ⟨𝑑, 𝑥⟩ ≥ 0,
showing that ⟨𝑑, 𝑥⟩ = 0 is necessary. But then, from the condition that ⟨𝑑, 𝑦 − 𝑥⟩ ≤ 0 for
all 𝑦 ∈ 𝒦 we deduce that necessarily ⟨𝑑, 𝑦⟩ ≤ 0 for all 𝑦 ∈ 𝒦, which means that 𝑑 ∈ 𝒦⟂ =
𝒩𝒦(0).
(⊇) Vice versa, let 𝑑 ∈ {𝑑 ∈ ℝ𝑛 : ⟨𝑑, 𝑥⟩ = 0, 𝑑 ∈ 𝒩𝒦(0)}. Then, for any 𝑦 ∈ 𝒦,
⟨𝑑, 𝑦 − 𝑥⟩ = ⟨𝑑, 𝑦⟩ − ⟨𝑑, 𝑥⟩
= ⟨𝑑, 𝑦⟩ (since ⟨𝑑, 𝑥⟩ = 0)
≤ 0 (since 𝑑 ∈ 𝒦⟂). □
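Theorem 2.3 is also easy to check numerically in the prototypical case 𝒦 = ℝ𝑛≥0, where it says 𝒩𝒦(𝑥) = {𝑑 ≤ 0 : ⟨𝑑, 𝑥⟩ = 0}. The numpy-only sketch below cross-checks this against the standard projection characterization 𝑑 ∈ 𝒩𝒦(𝑥) ⟺ proj𝒦(𝑥 + 𝑑) = 𝑥:

```python
# Directions built from the characterization pass the projection test; making
# <d, x> nonzero breaks it. Synthetic points; numpy only.
import numpy as np

rng = np.random.default_rng(4)
proj = lambda v: np.maximum(v, 0.0)      # Euclidean projection onto the orthant

for _ in range(1000):
    x = proj(rng.standard_normal(6))                 # point of K (some zeros)
    d = -rng.random(6) * (x == 0)                    # d <= 0 and <d, x> = 0
    assert np.allclose(proj(x + d), x)               # d is a normal direction
    if (x > 0).any():
        e = d.copy()
        e[np.argmax(x)] = -1.0                       # now <e, x> < 0
        assert not np.allclose(proj(x + e), x)       # e is no longer normal
print("ok")
```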
2.2 Conic duality
We now have all the ingredients to write first-order optimality conditions for a conic problem
    min_𝑥     𝑓(𝑥)
    s.t.      𝐴𝑥 = 𝑏
              𝑥 ∈ 𝒦.        (2)
Specifically, by combining Theorem 2.1 with Theorem 2.3 and the sufficiency of first-order optimality
conditions seen in Lecture 3, we obtain the following result.
Theorem 2.4. If 𝑓 is convex and differentiable, and the conic optimization (2) is strictly feasible,
that is, there exists 𝑥 ∈ 𝒦⚬ such that 𝐴𝑥 = 𝑏, then a point 𝑥∗ ∈ Ω ≔ 𝒦 ∩ {𝑥 : 𝐴𝑥 = 𝑏} is a
minimizer of 𝑓 on Ω if and only if
−∇𝑓(𝑥∗) ∈ {𝐴⊤𝜆 + 𝑧 : 𝜆 ∈ ℝ𝑚, 𝑧 ∈ 𝒩𝒦(𝑥∗)},
that is, expanding 𝒩𝒦(𝑥∗) using Theorem 2.3, if and only if
−∇𝑓(𝑥∗) ∈ {𝐴⊤𝜆 + 𝑧 : 𝜆 ∈ ℝ𝑚, 𝑧 ∈ 𝒦⟂, ⟨𝑧, 𝑥∗⟩ = 0}
⟺ −∇𝑓(𝑥∗) ∈ {𝐴⊤𝜆 − 𝑧 : 𝜆 ∈ ℝ𝑚, 𝑧 ∈ 𝒦∗, ⟨𝑧, 𝑥∗⟩ = 0}.
In particular, when 𝑓(𝑥) = ⟨𝑐, 𝑥⟩ is a linear objective, then optimality of 𝑥∗ for (2) is equivalent to
the existence of a solution for
−𝑐 = −𝐴⊤𝜆 − 𝑧 subject to 𝜆 ∈ ℝ𝑚, 𝑧 ∈ 𝒦∗, ⟨𝑧, 𝑥∗⟩ = 0.
This suggests that whenever the condition of strict feasibility holds, a duality theory for conic problems can be derived following the exact same steps as in Lecture 2. In particular, we find that the dual problem is
    max_{𝜆,𝑧}    ⟨𝑏, 𝜆⟩
    s.t.         𝑧 = 𝑐 − 𝐴⊤𝜆
                 𝜆 ∈ ℝ𝑚, 𝑧 ∈ 𝒦∗.
After showing that the value of the dual is always upper bounded by the value of the primal (weak duality), using the same steps as in Lecture 2, we conclude the following. [▷ Try working out the details; it should only take a minute.]
Theorem 2.5. If the primal problem
    min_𝑥     ⟨𝑐, 𝑥⟩
    s.t.      𝐴𝑥 = 𝑏
              𝑥 ∈ 𝒦
is strictly feasible and admits an optimal solution 𝑥∗, then the dual problem
    max_{𝜆,𝑧}    ⟨𝑏, 𝜆⟩
    s.t.         𝑧 = 𝑐 − 𝐴⊤𝜆
                 𝜆 ∈ ℝ𝑚, 𝑧 ∈ 𝒦∗
admits an optimal solution (𝜆∗, 𝑧∗) such that:
• the values of the two problems coincide: ⟨𝑐, 𝑥∗⟩ = ⟨𝑏, 𝜆∗⟩; and
• the solution 𝑧∗ satisfies the complementary slackness condition ⟨𝑥∗, 𝑧∗⟩ = 0.
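As a concrete check (a numpy-only sketch on an instance of my choosing, not from the notes), consider the strictly feasible SDP min ⟨𝐶, 𝑋⟩ s.t. tr 𝑋 = 1, 𝑋 ≽ 0, whose optimal value is the smallest eigenvalue of 𝐶; the point 𝑋 = 𝐼/𝑛 is strictly feasible, and both conclusions of Theorem 2.5 can be verified by hand:

```python
# Primal optimum: X* = v v^T for a unit eigenvector v of the smallest
# eigenvalue of C. Dual optimum: lam* = lambda_min(C), slack Z* = C - lam* I.
import numpy as np

rng = np.random.default_rng(5)
C = rng.standard_normal((4, 4)); C = (C + C.T) / 2

evals, evecs = np.linalg.eigh(C)
v = evecs[:, 0]                          # eigenvector of smallest eigenvalue
X_star = np.outer(v, v)                  # tr = 1, PSD, rank one
lam_star = evals[0]
Z_star = C - lam_star * np.eye(4)        # PSD by construction: z = c - A^T(lam)

print(np.trace(C @ X_star), lam_star)    # equal values: <c, x*> = <b, lambda*>
print(np.trace(Z_star @ X_star))         # ~0: complementary slackness
```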
Failure of duality when the constraint qualification is not satisfied. It is essential to remember that
duality might fail when the strict feasibility constraint qualification is not satisfied. Specifically, the
primal problem might be feasible and yet the dual might not have an optimal solution.
The failure mode can be of different kinds:
• the primal has an optimal solution, the supremum of the dual matches the value of the primal,
and yet the dual does not have a maximizer (Example 2.1); or
• the primal has an optimal solution, the dual has an optimal solution, but the values of the
problems differ (Example 2.2).
Example 2.1 ([Kle02]). Consider the problem
    min_𝑋     2𝑋12 + 𝑋22
    s.t.      𝑋11 = 0
              𝑋22 = 1
              𝑋 ≽ 0.
The only point in the feasible set is (0 0; 0 1); therefore, the optimal value is 1. Since this matrix is not in the interior of the semidefinite cone (one eigenvalue is 0), strict feasibility does not hold.
The dual problem in this case is

    max_{𝜆,𝑍}    𝜆2
    s.t.         𝑍 = (0 1; 1 1) − 𝜆1 (1 0; 0 0) − 𝜆2 (0 0; 0 1)
                 𝜆 ∈ ℝ2, 𝑍 ≽ 0.
For 𝜆 = (−1/𝜀, 1 − 𝜀) with 𝜀 > 0, we obtain the feasible point 𝑍 = (1/𝜀 1; 1 𝜀) ≽ 0 with objective value 1 − 𝜀. Hence, the supremum of the dual matches the value of the primal, but the dual has no optimal solution.
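A short numpy trace of the example (a sketch; the matrices are exactly those above) confirms that 𝑍 stays positive semidefinite while the objective creeps up to, but never reaches, 1:

```python
import numpy as np

C  = np.array([[0., 1.], [1., 1.]])
A1 = np.array([[1., 0.], [0., 0.]])
A2 = np.array([[0., 0.], [0., 1.]])

for eps in [1e-1, 1e-2, 1e-3]:
    lam = (-1 / eps, 1 - eps)
    Z = C - lam[0] * A1 - lam[1] * A2              # = [[1/eps, 1], [1, eps]]
    print(np.linalg.eigvalsh(Z).min() >= -1e-9,    # True: Z is PSD
          lam[1])                                  # objective 1 - eps -> 1
```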
Example 2.2 ([BV04]). Consider the primal problem
    min_{𝑥1,𝑥2}    𝑥2
    s.t.           𝑋 = ( 𝑥2 + 1    0     0
                            0      𝑥1    𝑥2
                            0      𝑥2    0 )
                   𝑋 ≽ 0.
Since 𝑋 ≽ 0 requires the bottom-right 2 × 2 principal minor, 𝑥1 · 0 − (𝑥2)², to be nonnegative, it follows immediately that 𝑥2 = 0 for all feasible points, and so the optimal value is 0. The dual problem is (after some manipulations)
    max_𝑍     −𝑍11
    s.t.      𝑍11 + 2𝑍23 = 1
              𝑍22 = 0
              𝑍 ≽ 0.
Since 𝑍 is positive semidefinite, so must be the submatrix (𝑍22 𝑍23; 𝑍23 𝑍33). But since 𝑍22 = 0, then necessarily 𝑍23 = 0. So, for any feasible point in the dual, the constraint 𝑍11 + 2𝑍23 = 1 is equivalent to 𝑍11 = 1. So, the dual attains an optimal point, but its value at optimality (−1) is different from the value at optimality of the primal problem (0).
3 Further readings and bibliography
Hiriart-Urruty, J.-B., & Seeger, A. [HS10] wrote an excellent survey of the properties of the copos-
itive cone.
For further reading on semidefinite programming, I find the short books by Klerk, E. de. [Kle02]
and Gärtner, B., & Matoušek, J. [GM12] very well-written and approachable. The book by Boyd,
S., & Vandenberghe, L. [BV04] is a great reference too.
The monograph by Ben-Tal, A., & Nemirovski, A. [BN01] contains an extensive list of applications
of second-order cone problems and semidefinite problems.
Bibliography
[Dia62] P. H. Diananda, “On non-negative forms in real variables some or all of which are non-
negative,” Mathematical Proceedings of the Cambridge Philosophical Society, vol. 58, no. 1,
pp. 17–25, 1962, doi: 10.1017/S0305004100036185.
[KP02] E. de Klerk and D. V. Pasechnik, “Approximation of the stability number of a graph via copositive programming,” SIAM J. Optim., vol. 12, no. 4, pp. 875–892, 2002, doi: 10.1137/S105262340138324.
[GM12] B. Gärtner and J. Matoušek, Approximation Algorithms and Semidefinite Programming.
Berlin, Germany: Springer, 2012. [Online]. Available: https://link.springer.com/book/
10.1007/978-3-642-22015-9
[Kle02] E. de Klerk, Aspects of Semidefinite Programming. Springer US, 2002. [Online]. Available:
https://link.springer.com/book/10.1007/b105286
[BV04] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
[Online]. Available: https://web.stanford.edu/~boyd/cvxbook/
[HS10] J.-B. Hiriart-Urruty and A. Seeger, “A Variational Approach to Copositive Matrices,”
SIAM Review, vol. 52, no. 4, pp. 593–629, 2010, [Online]. Available: http://www.jstor.
org/stable/41062014
[BN01] A. Ben-Tal and A. Nemirovski, Lectures on modern convex optimization: analysis, algo-
rithms, and engineering applications. SIAM, 2001.