MIT 6.7220/15.084 — Nonlinear Optimization (Spring ‘25) Thu, Mar 6th 2025
Lecture 9
Conic optimization
Instructor: Prof. Gabriele Farina (gfarina@mit.edu)★
In Lecture 3, we considered normal cones at the intersection of linear equality and
inequality constraints. In Exercise 1 of Homework 1, we were also able to give closed formulas
for the normal cone of certain sets, such as a ball. In Lecture 7, we turned our attention to
more general feasible sets. There, we saw how the KKT conditions give necessary optimality
conditions for feasible sets defined by functional constraints, by approximating the feasible
set with the linearization of the active constraints and considering the normal cone to the
intersection of those linearizations.
Today, we look at a class of convex feasible sets that are not intersections of linear equality
and inequality constraints. Moreover, they are not defined via functional constraints, making
the KKT machinery inapplicable.
L9.1 Conic optimization
A conic optimization problem is a nonlinear optimization problem whose feasible set is
the intersection between an affine subspace (that is, a system of linear equalities) and a
nonempty closed convex cone 𝒦:
min_𝑥   𝑓(𝑥)
s.t.   𝐴𝑥 = 𝑏
       𝑥 ∈ 𝒦
This class of problems captures linear programming, semidefinite programming, second-
order cone programming, copositive programming, and more, depending on the specific
cone 𝒦 that is selected.
L9.1.1 Nonnegative cone ⟷ Linear programming
The first example is the nonnegative cone ℝ𝑛≥0. In this case, the conic problem takes the
more familiar form
min_𝑥   𝑓(𝑥)
s.t.   𝐴𝑥 = 𝑏
       𝑥 ≥ 0
which is a linear program. So, for the specific choice of the nonnegative cone, conic
programming corresponds to linear programming.
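To make this concrete, here is a minimal sketch of the correspondence, assuming the CVXPY
package (the data is illustrative, not from the notes):

```python
# Minimal sketch (assuming CVXPY): an LP written in the conic form
#   min <c, x>  s.t.  Ax = b,  x in the nonnegative cone.
import cvxpy as cp
import numpy as np

A = np.array([[1.0, 1.0, 1.0]])        # illustrative data
b = np.array([1.0])
c = np.array([1.0, 2.0, 3.0])

x = cp.Variable(3)
problem = cp.Problem(cp.Minimize(c @ x), [A @ x == b, x >= 0])
problem.solve()
print(problem.value, x.value)          # optimal value 1, at x = (1, 0, 0)
```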
L9.1.2 Ice-cream (aka. Lorentz) cone ⟷ Second-order conic programming
Definition L9.1.
The ice-cream cone, or Lorentz cone, is defined
as
ℒ𝑛 ≔ {(𝑥, 𝑧) ∈ ℝ𝑛−1 × ℝ : 𝑧 ≥ ‖𝑥‖2}.
The figure on the right shows the shape of ℒ3.
Conic problems for the specific choice of the
Lorentz cone are usually called second-order
cone programs, or SOCPs for short. Several problems of interest in engineering and physics
can be modeled as SOCPs, including the design of antenna arrays, the positions of spring
systems at rest, and grasp planning in robotics.
[Figure: the ice-cream cone ℒ3, plotted over axes 𝑥1, 𝑥2, 𝑧.]
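As a small modeling sketch (again assuming CVXPY; the affine constraint is illustrative),
the Lorentz-cone membership (𝑥, 𝑧) ∈ ℒ3 can be imposed directly; minimizing 𝑧 then
computes the distance from the origin to the line 𝑥1 + 𝑥2 = 1:

```python
# Sketch (assuming CVXPY): a tiny SOCP. cp.SOC(z, x) encodes z >= ||x||_2,
# i.e., (x, z) in the ice-cream cone L^3.
import cvxpy as cp

x = cp.Variable(2)
z = cp.Variable()
problem = cp.Problem(cp.Minimize(z), [cp.SOC(z, x), x[0] + x[1] == 1])
problem.solve()
print(problem.value)   # ~0.7071 = 1/sqrt(2), the distance from the origin
```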
L9.1.3 Semidefinite cone ⟷ Semidefinite programming
Definition L9.2. The semidefinite cone 𝒮𝑛 is the set of
positive semidefinite 𝑛 × 𝑛 matrices:
𝒮𝑛 ≔ {𝑋 ∈ 𝕊𝑛 : 𝑋 ⪰ 0}
= {𝑋 ∈ 𝕊𝑛 : 𝑎⊤𝑋𝑎 ≥ 0 ∀𝑎 ∈ ℝ𝑛},
where 𝕊𝑛 is the set of symmetric 𝑛 × 𝑛 real matrices.
The figure on the right shows the cone of values
(𝑥, 𝑦, 𝑧) corresponding to the semidefinite cone of
2 × 2 matrices
(𝑥 𝑦; 𝑦 𝑧) ⪰ 0.
[Figure: the semidefinite cone 𝒮2, plotted over axes 𝑥, 𝑦, 𝑧.]
In this simple 2 × 2 case, the positive semidefiniteness condition is equivalent to
𝑥 + 𝑧 ≥ 0 (the sum of the eigenvalues [trace] is nonnegative), and
𝑥𝑧 − 𝑦² ≥ 0 (the product of the eigenvalues [determinant] is nonnegative).
These conditions are equivalent to 𝑥, 𝑧 ≥ 0 ∧ 𝑥𝑧 ≥ 𝑦². So, slices of the cone with planes
orthogonal to the 𝑧-axis are shaped like parabolas 𝑥 ≥ 𝑦²/𝑧. Similarly, slices with planes
orthogonal to the 𝑥-axis are shaped like parabolas 𝑧 ≥ 𝑦²/𝑥. Note how the curvature of the
parabolas decreases as the slicing plane gets further away from the origin.
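This characterization is easy to sanity-check numerically; the following sketch (numpy only,
random sampling, so a check rather than a proof) compares it against an eigenvalue test:

```python
# Numerical check: [x y; y z] is PSD  iff  x, z >= 0 and xz >= y^2.
import numpy as np

rng = np.random.default_rng(0)
for _ in range(10_000):
    x, y, z = rng.uniform(-1.0, 1.0, size=3)
    psd = np.linalg.eigvalsh(np.array([[x, y], [y, z]])).min() >= -1e-12
    characterization = x >= 0 and z >= 0 and x * z >= y * y - 1e-12
    assert psd == characterization
```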
Remark L9.1. Semidefinite programming subsumes both linear programming and
second-order cone programming. This is because
𝑥 ∈ ℝ𝑛≥0 ⟺ diag(𝑥1, …, 𝑥𝑛) ⪰ 0,   and
𝑧 ≥ ‖𝑥‖2 ⟺ (𝑧𝐼 𝑥; 𝑥⊤ 𝑧) ⪰ 0.
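The second equivalence can also be checked numerically. A small sketch (numpy only): the
eigenvalues of the arrow matrix are 𝑧, with multiplicity 𝑛 − 2, and 𝑧 ± ‖𝑥‖2, which makes
the equivalence transparent.

```python
# Numerical check: z >= ||x||_2  iff  [[z I, x], [x^T, z]] is PSD.
import numpy as np

rng = np.random.default_rng(1)
n = 4
for _ in range(10_000):
    x = rng.uniform(-1.0, 1.0, size=n - 1)
    z = rng.uniform(-1.0, 1.0)
    M = np.block([[z * np.eye(n - 1), x[:, None]],
                  [x[None, :],        np.array([[z]])]])
    psd = np.linalg.eigvalsh(M).min() >= -1e-9
    assert psd == (z >= np.linalg.norm(x) - 1e-9)
```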
Semidefinite programming is an extremely powerful tool with applications in all disciplines.
I have noted resources that treat semidefinite programming extensively at the end of this
document; we will see connections to optimization of polynomials in the next lecture.
L9.1.4 Copositive cone ⟷ Copositive programming
Definition L9.3. The copositive cone 𝒞𝑛 is the set of symmetric 𝑛 × 𝑛 real matrices that
are nonnegative as quadratic forms on the nonnegative orthant:
𝒞𝑛 ≔ {𝑋 ∈ 𝕊𝑛 : 𝑎⊤𝑋𝑎 ≥ 0 ∀𝑎 ∈ ℝ𝑛≥0}.
The difference with the positive semidefinite cone 𝒮𝑛 is that we require 𝑎⊤𝑋𝑎 ≥ 0 only
for nonnegative vectors 𝑎 ∈ ℝ𝑛≥0, and not for all 𝑎 ∈ ℝ𝑛.
The figure on the right shows the cone of values
(𝑥, 𝑦, 𝑧) corresponding to the copositive cone of 2 ×
2 matrices, that is
(𝑥 𝑦; 𝑦 𝑧) ∈ 𝒞2.
[Figure: the copositive cone 𝒞2, plotted over axes 𝑥, 𝑦, 𝑧.]
It follows immediately from the definition that 𝒞𝑛 is a superset of both the cone of
nonnegative symmetric matrices 𝒩𝑛 ≔ 𝕊𝑛 ∩ ℝ𝑛×𝑛≥0 and the cone of semidefinite matrices.
Hence, 𝒩𝑛 + 𝒮𝑛 ⊆ 𝒞𝑛 for all 𝑛 ∈ ℕ.
Curiously, the reverse inclusion is known to hold, but only up to dimension 𝑛 ≤ 4 [Dia62]:
𝒞𝑛 = 𝒮𝑛 + 𝒩𝑛 if 𝑛 ≤ 4.
Beyond dimension 4, some matrices are copositive, and yet they are not expressible as the
sum of 𝒮𝑛 with 𝒩𝑛.
Remark L9.2. One important fact to know about the copositive cone is that despite
being a closed convex cone, optimization of even linear functions over the copositive
cone is computationally intractable!
For example, the size of the maximum independent set of a simple graph 𝐺 = (𝑉 , 𝐸), that
is, the size of the largest set 𝑆 ⊆ 𝑉 of vertices with the property that no two vertices in 𝑆
are connected by an edge in 𝐸, can be computed as the solution to the following copositive
problem. Let 𝐴 ∈ {0, 1}𝑉 ×𝑉 denote the adjacency matrix of 𝐺, that is,
𝐴𝑖𝑗 = {1 if (𝑖, 𝑗) ∈ 𝐸
0 otherwise.
The size of the maximum independent set of 𝐺 can then be found by solving the copositive
optimization problem
min_{𝑤,𝑡∈ℝ, 𝑄}   𝑤
s.t.   𝑄 = 𝑤𝐼 + 𝑡𝐴 − 1 1⊤
       𝑄 ∈ 𝒞𝑛          (MIS)
where 1 ∈ ℝ𝑉 is the vector of all ones. More information is available in the paper by de
Klerk, E., & Pasechnik, D. V. [dP02]. Since deciding whether a graph admits an independent
set of size 𝑘 cannot be done in polynomial time unless P = NP, solving copositive
optimization problems must take more than polynomial time in the worst case unless P =
NP. The hardness applies already to linear objectives, as the previous example shows. So,
we conclude—for example—that one cannot construct an efficient separation oracle for 𝒞𝑛.
Several techniques are known to “relax” the copositive cone by, for example, approximating
it with a (higher-dimensional) semidefinite cone. The monographs of Gärtner, B., &
Matoušek, J. [GM12] and de Klerk, E. [de 02] do an excellent job of explaining approximation
techniques that involve semidefinite cones.
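As a concrete sketch of this idea (assuming CVXPY with an SDP-capable solver such as SCS;
the 5-cycle is an illustrative choice), one can replace the intractable constraint 𝑄 ∈ 𝒞𝑛 in
(MIS) with the inner approximation 𝑄 ∈ 𝒮𝑛 + 𝒩𝑛. Shrinking the feasible set can only
increase the minimum, so the resulting SDP upper-bounds the size of the maximum
independent set.

```python
# Sketch (assuming CVXPY + an SDP solver): relax (MIS) by writing Q = P + N
# with P PSD and N entrywise nonnegative, i.e., Q in S^n + N^n (a subset of C^n).
import cvxpy as cp
import numpy as np

n = 5                                   # 5-cycle: independence number is 2
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0

w, t = cp.Variable(), cp.Variable()
P = cp.Variable((n, n), PSD=True)
N = cp.Variable((n, n), symmetric=True)
constraints = [w * np.eye(n) + t * A - np.ones((n, n)) == P + N, N >= 0]
problem = cp.Problem(cp.Minimize(w), constraints)
problem.solve()
print(problem.value)   # an upper bound on the independence number (here 2)
```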
L9.2 Optimality conditions and conic duality
In general, the feasible set Ω of conic optimization problems cannot be easily captured via
functional constraints. Because of that, we cannot use the KKT conditions we discussed in
the previous lecture to compute the normal cone at each point of the feasible set. In this
situation, we need to compute the normal cone from first principles.
To simplify the treatment and avoid the machinery of relative interiors (interiors in the
topology induced by the smallest affine subspace that contains a set), we will assume in
the following that 𝒦 is a cone with nonempty interior. This condition is satisfied in all the
cases discussed above.
In order to compute the normal cone at a point at the intersection of {𝐴𝑥 = 𝑏} and 𝒦,
we will use the following general result (again based on separation), which enables us to
consider the affine subspace and the cone separately and sum their normal cones.
Theorem L9.1. Let 𝐻 ≔ {𝑥 ∈ ℝ𝑛 : 𝐴𝑥 = 𝑏, with 𝐴 ∈ ℝ𝑚×𝑛} be an affine subspace, and
𝑆 be a closed convex set (not necessarily a cone) such that 𝑆⚬ ∩ 𝐻 is nonempty, where
𝑆⚬ is the interior of 𝑆. For any 𝑥 ∈ 𝐻 ∩ 𝑆,
𝒩𝐻∩𝑆 (𝑥) = 𝒩𝐻 (𝑥) + 𝒩𝑆 (𝑥).
So, using the fact (see Lecture 2) that 𝒩𝐻 (𝑥) = colspan(𝐴⊤), we have
𝒩𝐻∩𝑆 (𝑥) = {𝐴⊤𝜇 + 𝑧 : 𝜇 ∈ ℝ𝑚, 𝑧 ∈ 𝒩𝑆 (𝑥)}.
Remark L9.3. The condition of existence of a feasible point in the interior is sometimes
referred to as Slater’s condition, strict feasibility condition, or strong feasibility condi-
tion. The term Slater’s condition is appropriate in light of the constraint qualification
condition for the KKT theory we discussed in Lecture 7. In both cases, the insistence
on an interior solution is required to rule out pathological behavior.
We know from Lecture 3 that Slater’s condition was not required for the intersection
of halfspaces, which are flat surfaces. The condition becomes, however, crucial when
considering sets that can have curvature. For example, consider the intersection of a two-
dimensional ball of radius 1 centered at (𝑥, 𝑦) = (1, 0) with the line 𝑥 = 0. The intersection
is the single point (0, 0), whose normal cone is ℝ2; but this is not equal to the sum of the
normal cones of the two sets at that point (which evaluates to ℝ × {0}).
Remark L9.4. The previous result can be generalized, without much effort or substantially
new ideas, to the intersection of two generic convex sets whose relative interiors have a
nonempty intersection. In other words, the normal cone to the intersection of convex sets
is equal to the sum of the normal cones to the individual sets whenever the relative
interiors of the two sets intersect. This also explains why the strict feasibility condition
imposed by Slater is sufficient.
L9.2.1 The normal cone at a point in 𝒦
With Theorem L9.1 in our toolbox, the only obstacle to writing first-order optimality
conditions for conic problems is being able to compute the normal cone at points in 𝒦.
Luckily, we can do that in closed form in all the abovementioned cases.
In particular, it turns out that the normal cone at any point in 𝒦 can be defined starting
from the normal cone at 0 (remember that 0 is always part of a nonempty cone, as a
straightforward consequence of the definition of cone). The normal cone for a cone 𝒦 at
0 ∈ 𝒦 is such an important object that several names for it and its related quantities are
used in the literature:
• The polar cone of 𝒦, often denoted 𝒦⟂, is exactly the normal cone at 0:
𝒦⟂ ≔ 𝒩𝒦(0) = {𝑑 : ⟨𝑑, 𝑦⟩ ≤ 0 ∀𝑦 ∈ 𝒦}.
• The dual cone of 𝒦, often denoted 𝒦∗, is the opposite of the normal cone at 0:
𝒦∗ ≔ −𝒦⟂ = −𝒩𝒦(0) = {𝑑 : ⟨𝑑, 𝑦⟩ ≥ 0 ∀𝑦 ∈ 𝒦}.
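As a quick worked example, take 𝒦 = ℝ𝑛≥0. A direction 𝑑 satisfies ⟨𝑑, 𝑦⟩ ≤ 0 for all 𝑦 ≥ 0
if and only if 𝑑 ≤ 0 (test the coordinate vectors 𝑦 = 𝑒1, …, 𝑒𝑛). Hence
(ℝ𝑛≥0)⟂ = ℝ𝑛≤0   and   (ℝ𝑛≥0)∗ = ℝ𝑛≥0,
consistent with the first self-duality claim of the theorem below.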
For all the important cones considered earlier, we have a very precise idea of what their
polar and dual cones look like:
Theorem L9.2.
1. The nonnegative cone, the Lorentz cone, and the semidefinite cones are self-dual
cones, that is, 𝒦∗ = 𝒦.
2. The dual cone to the copositive cone 𝒞𝑛 is called the completely positive cone, defined
as
𝒫𝑛 ≔ {∑_{𝑖=1}^{𝑘} 𝑧𝑖𝑧𝑖⊤ : 𝑧𝑖 ∈ ℝ𝑛≥0, 𝑘 ∈ ℕ} = {𝐵𝐵⊤ : 𝐵 ∈ ℝ𝑛×𝑚≥0, 𝑚 ∈ ℕ}.
[▷ Try to prove the previous theorem.] Since for a closed convex cone 𝒦, (𝒦∗)∗ = 𝒦, it
follows that the dual to the completely positive cone is the copositive cone. [▷ You should also
try to prove that (𝒦∗)∗ = 𝒦.]
Once the normal cone at 0 is established, the normal cone at any point 𝑥 ∈ 𝒦 can be
recovered as a function of 𝒩𝒦(0).
Theorem L9.3. Let 𝒦 ⊆ ℝ𝑛 be a nonempty, closed convex cone. The normal cone at
any point 𝑥 ∈ 𝒦 is given by
𝒩𝒦(𝑥) = {𝑑 ∈ 𝒦⊥ : ⟨𝑑, 𝑥⟩ = 0}.
Proof. Fix a generic 𝑥 ∈ 𝒦. We break the proof into a proof of the two separate inclusions.
(⊆) Let 𝑑 ∈ ℝ𝑛 be a direction in 𝒩𝒦(𝑥), that is,
⟨𝑑, 𝑦 − 𝑥⟩ ≤ 0 ∀𝑦 ∈ 𝒦.
Since 𝒦 is a cone, we have that both 2𝑥 and 𝑥/2 belong to 𝒦. So,
⟨𝑑, 2𝑥 − 𝑥⟩ ≤ 0 ⟹ ⟨𝑑, 𝑥⟩ ≤ 0, and
⟨𝑑, 𝑥/2 − 𝑥⟩ ≤ 0 ⟹ ⟨𝑑, 𝑥⟩ ≥ 0,
showing that ⟨𝑑, 𝑥⟩ = 0 is necessary. But then, from the condition that ⟨𝑑, 𝑦 − 𝑥⟩ ≤ 0 for
all 𝑦 ∈ 𝒦 we deduce that ⟨𝑑, 𝑦⟩ ≤ 0 for all 𝑦 ∈ 𝒦, which means that 𝑑 ∈ 𝒦⟂ = 𝒩𝒦(0).
(⊇) Vice versa, let 𝑑 ∈ {𝑑 ∈ 𝒦⊥ : ⟨𝑑, 𝑥⟩ = 0}. Then, for any 𝑦 ∈ 𝒦,
⟨𝑑, 𝑦 − 𝑥⟩ = ⟨𝑑, 𝑦⟩ − ⟨𝑑, 𝑥⟩
= ⟨𝑑, 𝑦⟩ (since ⟨𝑑, 𝑥⟩ = 0)
≤ 0 (since 𝑑 ∈ 𝒦⟂). □
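For instance, for 𝒦 = ℝ𝑛≥0 we have 𝒦⟂ = ℝ𝑛≤0, so Theorem L9.3 recovers the familiar
normal cone to the nonnegative orthant:
𝒩ℝ𝑛≥0(𝑥) = {𝑑 ≤ 0 : 𝑑𝑖 = 0 for every 𝑖 with 𝑥𝑖 > 0}.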
L9.2.2 First-order optimality conditions
We now have all the ingredients to write first-order optimality conditions for a conic problem
min_𝑥   𝑓(𝑥)
s.t.   𝐴𝑥 = 𝑏
       𝑥 ∈ 𝒦.        (2)
Specifically, by combining Theorem L9.1 with Theorem L9.3 and the sufficiency of first-
order optimality conditions seen in Lecture 3, we obtain the following result.
Theorem L9.4. If 𝑓 is convex and differentiable, and the conic optimization problem (2) is
strictly feasible, that is, there exists 𝑥 ∈ 𝒦⚬ such that 𝐴𝑥 = 𝑏, then a point 𝑥∗ ∈ Ω ≔ 𝒦 ∩
{𝑥 : 𝐴𝑥 = 𝑏} is a minimizer of 𝑓 on Ω if and only if
−∇𝑓(𝑥∗) ∈ {𝐴⊤𝜇 + 𝑧 : 𝜇 ∈ ℝ𝑚, 𝑧 ∈ 𝒩𝒦(𝑥∗)},
that is, expanding 𝒩𝒦(𝑥∗) using Theorem L9.3, if and only if
−∇𝑓(𝑥∗) ∈ {𝐴⊤𝜇 + 𝑧 : 𝜇 ∈ ℝ𝑚, 𝑧 ∈ 𝒦⟂, ⟨𝑧, 𝑥∗⟩ = 0}
⟺ −∇𝑓(𝑥∗) ∈ {𝐴⊤𝜇 − 𝑧 : 𝜇 ∈ ℝ𝑚, 𝑧 ∈ 𝒦∗, ⟨𝑧, 𝑥∗⟩ = 0}.
Remark L9.5. The condition ⟨𝑧, 𝑥∗⟩ = 0 is called the conic complementary slackness
condition. It is a generalization of the complementary slackness condition we saw
in Lecture 3 for linear programming. In the special case where 𝒦 = ℝ𝑛≥0, the conic
complementary slackness condition reduces to the complementary slackness condition
for linear programming.
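Indeed, for 𝒦 = ℝ𝑛≥0 we have 𝒦∗ = ℝ𝑛≥0, so 𝑧 and 𝑥∗ are both nonnegative and the inner
product is a sum of nonnegative terms:
⟨𝑧, 𝑥∗⟩ = ∑_{𝑖=1}^{𝑛} 𝑧𝑖𝑥∗𝑖 = 0 ⟺ 𝑧𝑖𝑥∗𝑖 = 0 for all 𝑖.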
L9.2.3 Conic duality for linear objectives
In particular, when 𝑓(𝑥) = ⟨𝑐, 𝑥⟩ is a linear objective, then optimality of 𝑥∗ for (2) is
equivalent to the existence of 𝜇, 𝑧 such that
−𝑐 = 𝐴⊤𝜇 − 𝑧, where 𝜇 ∈ ℝ𝑚, 𝑧 ∈ 𝒦∗, ⟨𝑧, 𝑥∗⟩ = 0.
This suggests that whenever the strict feasibility condition holds, a duality theory
for conic problems can be derived following the exact same steps as we did in Lecture 3. In
particular, we find that the dual problem is
max_{𝜇,𝑧}   ⟨𝑏, 𝜇⟩
s.t.   𝑧 = 𝑐 − 𝐴⊤𝜇
       𝜇 ∈ ℝ𝑚
       𝑧 ∈ 𝒦∗.
After showing that the value of the dual is always upper bounded by the value of the primal
(weak duality) using the same steps as in Lecture 3, we conclude the following. [▷ Try
working out the details, it should only take a minute.]
Theorem L9.5. If the primal problem
min_𝑥   ⟨𝑐, 𝑥⟩
s.t.   𝐴𝑥 = 𝑏
       𝑥 ∈ 𝒦
is strictly feasible and admits an optimal solution 𝑥∗, then the dual problem
max_{𝜇,𝑧}   ⟨𝑏, 𝜇⟩
s.t.   𝑧 = 𝑐 − 𝐴⊤𝜇
       𝜇 ∈ ℝ𝑚
       𝑧 ∈ 𝒦∗
admits an optimal solution (𝜇∗, 𝑧∗) such that:
• the values of the two problems coincide: ⟨𝑐, 𝑥∗⟩ = ⟨𝑏, 𝜇∗⟩; and
• the solution 𝑧∗ satisfies the complementary slackness condition ⟨𝑥∗, 𝑧∗⟩ = 0.
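As a numerical illustration (a sketch assuming CVXPY; the 2 × 2 data is illustrative), take
𝒦 = 𝒮2 with the single, strictly feasible equality constraint tr(𝑋) = 1, so that ⟨𝑏, 𝜇⟩ = 𝜇:

```python
# Sketch (assuming CVXPY): strong duality and conic complementary slackness
# for the SDP  min <C, X>  s.t.  tr(X) = 1,  X PSD  (strictly feasible at I/2).
import cvxpy as cp
import numpy as np

C = np.array([[2.0, 1.0], [1.0, 2.0]])

X = cp.Variable((2, 2), PSD=True)
primal = cp.Problem(cp.Minimize(cp.trace(C @ X)), [cp.trace(X) == 1])
primal.solve()

mu = cp.Variable()                     # dual: max mu  s.t.  C - mu*I is PSD
Z = C - mu * np.eye(2)                 # dual slack, in K* = S^2 by self-duality
dual = cp.Problem(cp.Maximize(mu), [Z >> 0])
dual.solve()

print(primal.value, dual.value)        # both ~1 = lambda_min(C)
print(np.trace(X.value @ Z.value))     # ~0: conic complementary slackness
```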
▶ Lagrangian duality in the conic case. Alternatively, it is possible to adapt the Lagrangian
ideas we saw in Lecture 8 (which, after all, were nothing but a rewrite of the first-order
optimality conditions) to conic problems.
The key difference is that the constraint 𝑥 ∈ 𝒦 is not a functional constraint, so it cannot
be easily included in the Lagrangian. To handle this, we consider an intermediate Lagrangian
problem in which we only penalize violation of the constraint 𝐴𝑥 = 𝑏, but otherwise retain
the constraint 𝑥 ∈ 𝒦 as part of the domain. In other words, we consider the Lagrangian
ℒ : 𝒦 × ℝ𝑚 → ℝ,   ℒ(𝑥; 𝜇) ≔ ⟨𝑐, 𝑥⟩ − ⟨𝜇, 𝐴𝑥 − 𝑏⟩.
The conditions in Theorem L9.4 guarantee that for any solution 𝑥∗ of (2), there exists 𝜇∗ ∈
ℝ𝑚 such that
−𝑐 − 𝐴⊤𝜇∗ ∈ {𝑧 ∈ 𝒦⊥ : ⟨𝑧, 𝑥∗⟩ = 0} = 𝒩𝒦(𝑥∗),
and therefore 𝑥∗ is a minimizer of ℒ(𝑥; 𝜇∗) over 𝒦. Hence, using the same steps as in
Lecture 8, we obtain the strong Lagrangian duality statement
max_{𝜇∈ℝ𝑚} inf_{𝑥∈𝒦} ℒ(𝑥; 𝜇) = ⟨𝑐, 𝑥∗⟩ = min_{𝑥∈𝒦} sup_{𝜇∈ℝ𝑚} ℒ(𝑥; 𝜇).
One can easily check that the Lagrangian dual problem on the left is the same as the dual
problem we derived earlier in Theorem L9.5. Indeed,
inf_{𝑥∈𝒦} ℒ(𝑥; 𝜇) = inf_{𝑥∈𝒦} ⟨𝑐, 𝑥⟩ − ⟨𝜇, 𝐴𝑥 − 𝑏⟩
                  = ⟨𝜇, 𝑏⟩ + inf_{𝑥∈𝒦} ⟨𝑥, 𝑐 − 𝐴⊤𝜇⟩.
The infimum on the right-hand side is equal to 0 if (𝑐 − 𝐴⊤𝜇) ∈ 𝒦∗, and −∞ otherwise [▷
Show this!]. Hence, the dual problem is indeed equivalent to
max_𝜇   ⟨𝑏, 𝜇⟩
s.t.   𝑐 − 𝐴⊤𝜇 ∈ 𝒦∗.
L9.2.4 Failure of duality when the constraint qualification is not satisfied
It is essential to remember that duality might fail when the strict feasibility constraint
qualification is not satisfied. Specifically, the primal problem might be feasible and yet the
dual might not have an optimal solution.
The failure can be of different kinds:
• the primal has an optimal solution, the supremum of the dual matches the value of
the primal, and yet the dual does not have a maximizer (Example L9.1); or
• the primal has an optimal solution, the dual has an optimal solution, but the values
of the problems differ (Example L9.2).
Example L9.1 ([de 02]). Consider the problem
min_𝑋   2𝑋12 + 𝑋22
s.t.   𝑋11 = 0
       𝑋22 = 1
       𝑋 ⪰ 0.
The only point in the feasible set is (0 0; 0 1), since 𝑋 ⪰ 0 together with 𝑋11 = 0 forces
det 𝑋 = −𝑋12² ≥ 0, that is, 𝑋12 = 0; therefore, the optimal value is 1.
Since the matrix (0 0; 0 1) is not in the interior of the semidefinite cone (one eigenvalue is
0), strict feasibility does not hold.
The dual problem in this case is
max_{𝜆,𝑍}   𝜆2
s.t.   𝑍 = (0 1; 1 1) − 𝜆1 (1 0; 0 0) − 𝜆2 (0 0; 0 1)
       𝜆 ∈ ℝ2
       𝑍 ⪰ 0.
For 𝜆 = (−1/𝜖, 1 − 𝜖) with 𝜖 > 0, we obtain the feasible point 𝑍 = (1/𝜖 1; 1 𝜖) ⪰ 0 with
objective value 1 − 𝜖. Hence, the supremum of the dual matches the value of the primal,
but the dual has no optimal solution.
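The non-attainment is easy to see numerically with a quick sweep (numpy only; the 𝜖
values are illustrative): the dual objective 1 − 𝜖 approaches 1 while the slack matrix 𝑍
stays on the boundary of the semidefinite cone.

```python
# Sketch (numpy): dual feasible points of Example L9.1 with objective 1 - eps
# exist for every eps > 0 (so sup = 1), but no feasible point achieves 1.
import numpy as np

C  = np.array([[0.0, 1.0], [1.0, 1.0]])
A1 = np.array([[1.0, 0.0], [0.0, 0.0]])
A2 = np.array([[0.0, 0.0], [0.0, 1.0]])

for eps in [1e-1, 1e-2, 1e-3, 1e-4]:
    lam = (-1.0 / eps, 1.0 - eps)
    Z = C - lam[0] * A1 - lam[1] * A2      # equals [[1/eps, 1], [1, eps]]
    # det Z = 0 and tr Z > 0: Z is PSD but always singular (on the boundary).
    print(eps, np.linalg.eigvalsh(Z).min(), lam[1])
```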
Example L9.2 ([BV04]). Consider the primal problem
min_{𝑥1,𝑥2}   𝑥2
s.t.   𝑋 = (𝑥2 + 1   0    0
            0        𝑥1   𝑥2
            0        𝑥2   0)
       𝑋 ⪰ 0.
Since the bottom-right 2 × 2 principal minor (𝑥1 𝑥2; 𝑥2 0) must have nonnegative
determinant, −𝑥2² ≥ 0, it follows immediately that 𝑥2 = 0 for all feasible points, and so the
objective value is 0. The dual problem is (after some manipulations)
max_𝑍   −𝑍11
s.t.   𝑍11 + 2𝑍23 = 1
       𝑍22 = 0
       𝑍 ⪰ 0.
Since 𝑍 is positive semidefinite, so must be the submatrix (𝑍22 𝑍23; 𝑍23 𝑍33). But since
𝑍22 = 0, necessarily 𝑍23 = 0. So, for any feasible point in the dual, the constraint
𝑍11 + 2𝑍23 = 1 is equivalent to 𝑍11 = 1. So, the dual attains an optimal point, but its
value at optimality, −1, differs from the value at optimality of the primal problem.
L9.3 Conic solvers and software tools
Conic programming is a powerful tool that can be used to solve a wide range of optimization
problems. Several excellent software packages are available that can solve conic programs
efficiently. Here are just a few:
• CVX [GB08] is a modeling language for convex optimization problems that can be used
with several solvers, including MOSEK [AA00] and SCS [O'D+16]. Its Python counterpart,
CVXPY, is compatible with Python and numpy.
• JuMP [Lub+23] is a modeling language for mathematical optimization problems in
Julia. Like CVX, it can be used with several solvers, including MOSEK and SCS.
• MOSEK [AA00] is a commercial solver that can handle a wide range of optimization
problems, including linear, quadratic, conic, and semidefinite programs.
L9.4 Further readings and bibliography
Hiriart-Urruty, J.-B., & Seeger, A. [HS10] wrote an excellent survey of the properties of the
copositive cone.
For further reading on semidefinite programming, I find the short books by de Klerk, E.
[de 02] and Gärtner, B., & Matoušek, J. [GM12] very well-written and approachable. The
book by Boyd, S., & Vandenberghe, L. [BV04] is a great reference too.
The monograph by Ben-Tal, A., & Nemirovski, A. [BN01] contains an extensive list of
applications of second-order cone problems and semidefinite problems.
[Dia62] Diananda, P. H. (1962). On non-negative forms in real variables some or all of
which are non-negative. Mathematical Proceedings of the Cambridge Philosophical
Society, 58(1), 17–25. https://doi.org/10.1017/S0305004100036185
[dP02] de Klerk, E., & Pasechnik, D. V. (2002). Approximation of the stability number
of a graph via copositive programming. SIAM J. Optim., 12(4), 875–892.
https://doi.org/10.1137/S105262340138324
[GM12] Gärtner, B., & Matoušek, J. (2012). Approximation Algorithms and Semidefinite
Programming. Springer. https://link.springer.com/book/10.1007/978-3-642-22015-9
[de 02] de Klerk, E. (2002). Aspects of Semidefinite Programming. Springer US.
https://link.springer.com/book/10.1007/b105286
[BV04] Boyd, S., & Vandenberghe, L. (2004). Convex Optimization. Cambridge Univer-
sity Press. https://web.stanford.edu/~boyd/cvxbook/
[GB08] Grant, M., & Boyd, S. (2008). CVX: Matlab Software for Disciplined Convex
Programming.
[AA00] Andersen, E. D., & Andersen, K. D. (2000). The MOSEK interior point optimizer
for linear programming: an implementation of the homogeneous algorithm. High
Performance Optimization, 197–232.
[O'D+16] O'Donoghue, B., Chu, E., Parikh, N., & Boyd, S. (2016). Conic Optimization
via Operator Splitting and Homogeneous Self-Dual Embedding. Journal of
Optimization Theory and Applications, 169(3), 1042–1068.
[Lub+23] Lubin, M., Dowson, O., Dias Garcia, J., Huchette, J., Legat, B., & Vielma, J.
P. (2023). JuMP 1.0: Recent improvements to a modeling language for mathe-
matical optimization. Mathematical Programming Computation.
https://doi.org/10.1007/s12532-023-00239-3
[HS10] Hiriart-Urruty, J.-B., & Seeger, A. (2010). A Variational Approach to Copositive
Matrices. SIAM Review, 52(4), 593–629. http://www.jstor.org/stable/41062014
[BN01] Ben-Tal, A., & Nemirovski, A. (2001). Lectures on modern convex optimization:
analysis, algorithms, and engineering applications. SIAM.
Changelog
• Mar 6, 2025: Fixed typo in conic dual problem (thanks Jonathan Huang!)
★These notes are class material that has not undergone formal peer review. The TAs and I are grateful
for any reports of typos.