Convex Sets

TagsBasics

Proof Tips

Showing convexity

A set is convex if and only if for every θ[0,1]\theta \in [0, 1] and x1,x2Sx_1, x_2 \in S, we have θx1+(1θ)x2S\theta x_1 + (1-\theta)x_2 \in S. To show convexity, you can

Showing something is not convex

Avoid applying definitions as they are often messy

General patterns

Showing something in a set

A set typically has certain rules. You want to show that something (like a convex combination) can be written in a similar format and satisfies these rules. This is sufficient to showing that something belongs in some set.

Affine and Convex Sets

Lines and line segments 🐧

If you have any points x1,x2x_1, x_2, then you can define a line as

y=θx1+(1θ)x2y = \theta x_1 + (1-\theta)x_2

If θ[0,1]\theta \in [0, 1],l then this is a line segment. If you don’t constrain it, then you get a line.

Affine Sets 🐧

A set CC is affine if for all x1,x2Cx_1, x_2 \in C, then θx1+(1θ)x2C\theta x_1 + (1-\theta)x_2 \in C. Visually, these are typically lines or planes or larger subspaces. We can extend this to any arbitrary number of coeficients, as long as the coeficients add to 1 (no constraints about the negativity of the coeficients, so the coeficients are not bounded).

If we have a set of points, the affine hull is the set of possible affine combinations of these points. Often, they look like a hyperplane or a subspace containing the points, shifted away from the origin by some amount. We denote this as aff(C)aff(C).

Affine sets and subspaces 🚀

By definition, subspaces are affine sets. But are all subspaces vector spaces? Well, not really. So vector spaces needs to contain zero, and we don’t guarantee this in CC. But what if you had some x0Cx_0 \in C? Can we say something about the space Cx0C - x_0? Actually, you can! In fact, we claim that V=Cx0V = C - x_0 is a subspace.

The critical upshot is that C=V+x0C = V + x_0, which means that any affine set is just a shifted version of a subspace!

The solution set to Ax=bAx =b is an affine set and conversely, every affine set can be expressed as the solution set to linear equations.

Affine dimension, relative interior (extra)

The affine dimension of a set is the dimension of the affine hull.

💡
Careful! This may be counterintuitive in certain cases. In R2, three non-colinear points will have an affine dimension of 2.

We define the relative interior of an affine hull intuitively as the space enclosed by the points. Don’t worry too much about this rigorous definition for now.

Convex Sets & Convex Combinations 🐧

A convex set is basically an affine set, except that θ[0,1]\theta \in [0, 1],and for the multi-coefficient condition, we have iθi=1\sum_i \theta_i = 1. The point iθixi\sum_i\theta_i x_i is the convex combination of x1,,xkx_1, …, x_k. Like with affine, a set is convex if and only if it contains every convex combination of points.

Convex combinations are generalizable to infinite sums, integrals, and probability distributions. This is helpful for many cases, like generalizing properties to expectations.

The convex hull is the set formed by all possible convex combinations of the points in a set. These convex combinations may not be unique. It is denoted by conv(C)conv(C).

Intuitively, convex hulls look like you put cling wrap around the shape or set of points.

💡
Any set that is affine must be convex; the affine constraint is more strict. Convex sets are very often not affine. In fact, if they are finite, they are not affine.

When you are showing if a set is convex but not affine, you might arrive at some non-negativity property of (1θ)(1-\theta) or θ\theta.

Cones 🐧

A set CC is a cone if for every xCx\in C and θ0\theta \geq 0, we have θxC\theta x \in C. Note that this could just be a bunch of lines spreading out from the origin. If you wanted an actual “cone” shaped space, you need the additional condition of convexity. If we have convexity, we have a convex cone, meaning that θ1x1+θ2x2C\theta_1 x_1 + \theta_2 x_2 \in C.

A conic hull is just all iθixi\sum_i \theta_i x_i where θi0\theta_i\geq 0. This is also the smallest convex cone that contains CC.

A cone is just a restricted subspace where the coefficients must be positive. Any subspace can be a cone; just put two copies of the vectors facing in opposite directions.

Examples of Convex Sets

Some degenerate examples

Convex hull

This just creates a convex hull that surrounds a set of points

Hyperplanes, Halfspaces

A hyperplane is the solution to

This is the set of points with a constant inner product to aa. But what does this mean? Well, it’s much easier to see it as

where aTx0=ba^Tx_0 = b. In this case, it’s clear that we want the set of points xx such that xx0x - x_0 is perpendicular to aa. You can also write this as x0+ax_0 + a^\perp. Note how this x0x_0 can be multiple things but the easiest is the point colinear to aa

A hyperplane divides RnR^n into halfspaces, which is the solution to

Halfspaces are convex but definitely not affine.

Euclidean balls and ellipsoids

We define a euclidian ball as literally a sphere in R2.

And the generalization is the ellipsoid

Note that PS++nP \in S_{++}^n, i.e. symmetric positive definite. The lengths of the semi-axes are λi\sqrt{\lambda_i}

If PP is semidefinite, there may be degenerate axes, but it is still convex.

Norm Balls, Norm Cones

Norms are anything that satisfies non-negativity, scaling, and triangle inequality

You can define a norm ball as just the ball using this norm

And the norm cone as basically a norm ball that varies with one extra axis, the radius. If the norm is euclidian, it’s called the second order cone or the Lorentz cone.

It’s called a cone because it can look like a cone. Think of sweeping a circle upwards and larger.

Polyhedra / Polytope

A polyhedron is just a solution set to a finite number of linear equalities and inequalities. Polyhedra include affine sets and other infinite sizes, although if we get examples that are bounded on all sides, then we have a polytope.

A convex hull of a finite set is a polyhedron, but it’s not easy to express it in the inequality form. You may need more points to describe a polyhedron as a convex hull, as compared to using inequalities.

Positive Semidefinite Cone

Now we’re getting into some weird places. If SnS^n is the set of symmetric n×nn\times n matrices, then the set S+nS_+^n is a convex cone. Meaning, that if A,BS+nA, B \in S_{+}^n, then θ1A+θ2BS+n\theta_1 A + \theta_2 B \in S_+^n.

Operations that Preserve Convexity

Forward and inverse images

Let’s clear up image and reverse image a bit more.

Proof tips

You always want to map something back to simple land

Intersection

Convexity is preserved under intersection, and the proof is really simple. Just use the convex definition and the intersection definition.

This is true for any intersections, including infinite ones. The converse is also true: every closed convex set is an intersection of halfspaces. It is also the intersection of all the halfspaces that contain it.

A common use of intersection: you have some a=f(t)a = f(t), where tt is some variable and aa is a vector, and your constraint is aTxβa^Tx\leq \beta. In this case, you take the intersection across all values of tt to get you the final convex set.

Affine functions

If we transform a space by an affine function f(x)=Ax+bf(x) = Ax + b, then it is still convex. This is also true for the inverse image of a convex set under ff, i.e. some xx such that f(x)Sf(x) \in S.

Examples

Perspective Function

The perspective function P:Rn+1RnP: R^{n+1} → R^n is something that mimics the pinhole perception of an object. A good example is a pinhole camera that projects onto the plane x3x_3.

The perspective function is explicitly defined as P(x,t)=x/t,t0P(x, t) = x/t, t\geq 0, where tt is the dimension that you’re projecting on. Therefore, in R3, we have the projection operator P([x1,x2],x3)=(x1/x3,x2/x3,1)P([x_1, x_2], x_3) = (x_1/x_3, x_2/x_3, 1). We drop this last component because it is always 1.

To use this function, try to look for opportunities to arrange a set in the format x/tx / t, and show that this xx is convex. Or, you can try having a set xx and compute x/tx/t and showing that this x/tx/t is convex. It depends on which direction is simpler.

Linear fractional functions

This is the composition of a perspective function with an affine function. You can write the affine function as a block matrix

and we do this because the last component is special. When we compose g(x)g(x) with PP, we get

f(x)=P(g(x))=Ax+bcTx+df(x) = P(g(x)) = \frac{Ax + b}{c^Tx + d}

and this is the linear-fractional or projective function. You can also interpret linear-fractional function in the form of a matrix QQ

that you apply to vectors (x,1)(x, 1) and normalize the last component before dropping it. Geometrically, it looks like you’re viewing something at an angle. But more importantly, a lot of different expressions are linear-fractional. They can be helpful for later!

Images and inverse images under linear fractional functions are also convex.

Generalized Inequalities

Proof tips

TODO

Proper Cones

We define a cone to be proper if it is

Proper cones and generalized inequalities

You can define an inequality with respect to a proper cone, such that

This just means that the partial ordering has a difference that is in a cone.

Note that this is nuanced; it is only a partial ordering. It may be possible to have something such that xyx \npreceq y  and yxy \npreceq x, because if yxKy - x \notin K, it doesn’t imply that xyKx - y \in K.

Many properties are consistent with normal inequalities

Special cases of generalized inequalities

Maximum and Minimum elements

As mentioned above, generalized inequalities do not impose a linear ordering, which would mean that any two points are comparable. But there is the concept of maximum and minimums, but it gets weird.

xSx \in S is the minimum element of SS with respect to some K\preceq_K if for every ySy \in S, we have xKyx \preceq_K y. Vice versa for the maximum element. These points are unique, and it means that Sx+KS \subseteq x + K.

xSx \in S is the minimal element of SS with respect to some K\preceq_K if for every ySy \in S, we have yKxy \preceq_K x if and only if y=xy = x. Vice versa for the maximum element. This is a slightly different definition, and in certain cases it is equivalent to the first definition. However, a key technicality emerges when we realize that certain yy is not comparable to xx and therefore doesn’t matter. In contrast, minimum elements are comparable to all ySy \in S.

Under this insight, we realize that minimal elements are not unique. A point XSX \in S is minimal if and only if (xK)S={x}(x - K)\cap S = \{x\}.

The easiest way of understanding this nuanced difference is through an example of the R+2R_+^2 cone.

x1x_1 is the minimum element (the lighter shade i the cone), and x2x_2 is one of many minimal elements.

Separating and Supporting hyperplanes

Separating Hyperplane Theorem

If C,DC, D are non-empty and disjoint convex sets, there exists some a0,ba \neq 0, b such that

This is not true for non-convex sets. Can you think of an easy example?

Supporting Hyperplane Theorem

Given some x0x_0 as a boundary point of set CC, a supporting hyperplane is defined as {xaTx=aTx0}\{x | a^Tx = a^Tx_0\} such that aTxaTx0a^Tx \leq a^Tx_0 for all xCx \in C. Or, another way to put it: aT(xx0)0a^T(x - x_0) \leq 0, which means that the hyperplane inequality contains the whole set. The equation is satisfied by all points in the hyperplane.

The supporting hyperplane theorem states that if a set is convex, there exists a supporting hyperplane at every boundary point.

Dual Cones

This section is more developed in the textbook but for now, we’ll just use a sneak peek

Dual cone definition

If KK is a cone, then

is the dual cone of KK. Interstingly, KK^* is a cone and convex, even if the original cone is not. Geometrically, we can understand xTyx^Ty as a halfspace. If the halfspace contains the cone, then yy is allowed. In the picture below, the left is valid and the right is not

Here are some properties

There’s some deeper truth, and when we reach function space we’ll see a nice analogy in function land.