Duality in linear programming relates a pair of linear programs:
The primal program: \begin{align*} \text{maximize: } & \mathbf{c}^\top \mathbf{x} \\ \text{subject to: } & A \mathbf{x} \leq \mathbf{b} \text{ and } \mathbf{x} \geq 0 \end{align*}
The dual program: \begin{align*} \text{minimize: } & \mathbf{b}^\top \mathbf{y} \\ \text{subject to: } & A^\top \mathbf{y} \geq \mathbf{c} \text{ and } \mathbf{y} \geq 0 \end{align*}
The weak duality theorem says that the objective value \mathbf{b}^\top \mathbf{y} of any feasible solution to the dual is an upper bound on the objective value \mathbf{c}^\top \mathbf{x} of any feasible solution to the primal; in particular, the minimum of the dual is an upper bound on the maximum of the primal. The strong duality theorem says that the two optimal values are in fact equal. In other words, the upper bound given by the weak duality theorem is tight. In this post I will give a geometric interpretation of the weak duality theorem.
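Before getting to the geometry, here is a quick numerical sanity check of both theorems. This is a minimal sketch using scipy.optimize.linprog, with a tiny LP whose numbers I made up purely for illustration (note that linprog minimizes, so the primal maximization is expressed by negating the objective):

```python
import numpy as np
from scipy.optimize import linprog

# A small made-up LP, just to check the two theorems numerically:
#   maximize 3*x1 + 2*x2
#   subject to 2*x1 + x2 <= 10,  x1 + 3*x2 <= 15,  x >= 0.
c = np.array([3.0, 2.0])
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([10.0, 15.0])

# Primal: linprog minimizes, so maximize c^T x by minimizing -c^T x.
primal = linprog(c=-c, A_ub=A, b_ub=b, bounds=[(0, None)] * 2)

# Dual: minimize b^T y subject to A^T y >= c, y >= 0
# (flip signs to express the >= constraints as <=).
dual = linprog(c=b, A_ub=-A.T, b_ub=-c, bounds=[(0, None)] * 2)

print(-primal.fun, dual.fun)           # both should come out to 17.0
assert -primal.fun <= dual.fun + 1e-9  # weak duality: the dual bounds the primal
```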
Let’s first visualize a set of linear constraints. The boundary of each linear constraint is a (hyper)plane that cuts the space into two halves, and the constraint keeps one of them. In 3D space this boundary is a plane; in 2D, it is a line. The figure below shows 3 linear constraints: x_1 \leq 2, x_1 + x_2 \leq 1.5, and x_2 \leq 2. Here we also assume x_1 \geq 0 and x_2 \geq 0. Given a set of constraints, the feasible region consists of every point that satisfies all constraints in the set. This is shown by the shaded area in the figure.
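If you want to reproduce a figure like this yourself, here is a small matplotlib sketch of the three constraint boundaries and the shaded feasible region (the styling is my own guess, not the original figure):

```python
import numpy as np
import matplotlib.pyplot as plt

x1 = np.linspace(0, 2.5, 200)
fig, ax = plt.subplots()

# Constraint boundaries: x1 <= 2, x1 + x2 <= 1.5, x2 <= 2 (with x >= 0).
ax.axvline(2.0, color="tab:blue", label=r"$x_1 \leq 2$")
ax.plot(x1, 1.5 - x1, color="tab:orange", label=r"$x_1 + x_2 \leq 1.5$")
ax.axhline(2.0, color="tab:green", label=r"$x_2 \leq 2$")

# The feasible region is the triangle with vertices (0,0), (1.5,0), (0,1.5);
# the constraints x1 <= 2 and x2 <= 2 are not active anywhere on it.
ax.fill([0, 1.5, 0], [0, 0, 1.5], color="gray", alpha=0.3, label="feasible region")

ax.set_xlim(0, 2.5)
ax.set_ylim(0, 2.5)
ax.set_xlabel(r"$x_1$")
ax.set_ylabel(r"$x_2$")
ax.legend()
plt.show()
```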
We can therefore think of the primal program as picking a point in the feasible region of the constraints that maximizes \mathbf{c}^\top \mathbf{x}. Note that for a given value p, the set of points with \mathbf{c}^\top \mathbf{x} = p is itself a line (in 2D), so we are basically trying to push this line as far up (and to the right) as possible, as long as it still touches the feasible region. For example, suppose in our 2D case \mathbf{c}^\top \mathbf{x} is just x_1 + x_2; then the best we can do is x_1 + x_2 = 1.5, a line that coincides with the boundary of our second constraint.
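We can confirm that 1.5 really is the best achievable value by handing this small primal to scipy.optimize.linprog (again negating the objective, since linprog minimizes); this is just a quick check, not part of the geometric argument:

```python
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 0.0],   # x1      <= 2
              [1.0, 1.0],   # x1 + x2 <= 1.5
              [0.0, 1.0]])  # x2      <= 2
b = np.array([2.0, 1.5, 2.0])
c = np.array([1.0, 1.0])    # objective: maximize x1 + x2

primal = linprog(c=-c, A_ub=A, b_ub=b, bounds=[(0, None)] * 2)
print(-primal.fun)  # expected: 1.5
```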
A crucial concept for understanding the dual program is the linear combination of constraints. To linearly combine constraints, we multiply both sides of each constraint by some nonnegative number, then add all the left sides together to get a new left side, and similarly add the right sides. For example, adding all 3 of our constraints above with weight 1 each gives 2x_1 + 2x_2 \leq 5.5, which is the same as x_1 + x_2 \leq 2.75. Note that the feasible region of this new constraint contains the entire feasible region of the old constraints, and this very important fact is true in general: let F be the feasible region of a set of constraints C, and let c' be a linear combination of the constraints in C. Then the feasible region of c' is a superset of F.

The proof is simple: consider any point \mathbf{x} in the feasible region F. Then \mathbf{x} must satisfy every constraint c \in C. Multiplying both sides of a constraint by a nonnegative number k preserves the inequality (a negative k would flip its direction, which is exactly why we only allow nonnegative weights), so \mathbf{x} also satisfies kc for any k \geq 0. If \mathbf{x} satisfies two constraints c_1 and c_2, it also satisfies c_1 + c_2, because if two inequalities l_1 \leq r_1 and l_2 \leq r_2 both hold, then l_1 + l_2 \leq r_1 + r_2 also holds. Therefore, \mathbf{x} must satisfy any linear combination of the constraints in C, including c'.
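Here is the same idea in a few lines of Python: a tiny helper (my own, purely for illustration) that combines the constraints A\mathbf{x} \leq \mathbf{b} with nonnegative weights and returns the new constraint, which every feasible point still satisfies:

```python
import numpy as np

# Constraints from the running example, written as A x <= b with x >= 0.
A = np.array([[1.0, 0.0],   # x1      <= 2
              [1.0, 1.0],   # x1 + x2 <= 1.5
              [0.0, 1.0]])  # x2      <= 2
b = np.array([2.0, 1.5, 2.0])

def combine(y):
    """Linearly combine A x <= b with nonnegative weights y.

    Returns (coeffs, rhs) such that coeffs @ x <= rhs holds for every x
    in the feasible region of the original constraints.
    """
    y = np.asarray(y, dtype=float)
    assert (y >= 0).all(), "weights must be nonnegative to preserve <="
    return A.T @ y, b @ y

# Weights (1, 1, 1) reproduce 2*x1 + 2*x2 <= 5.5 from the text.
coeffs, rhs = combine([1, 1, 1])
print(coeffs, rhs)  # [2. 2.] 5.5

# Sanity check: some feasible points all satisfy the combined constraint.
for x in [(0.0, 0.0), (1.5, 0.0), (0.75, 0.75)]:
    assert coeffs @ np.array(x) <= rhs
```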
Now let’s focus on the dual program. We can understand it as trying to derive an upper bound on the maximum of the primal program, i.e. on \mathbf{c}^\top \mathbf{x}. The key idea is that we can derive such an upper bound from a linear combination of the primal constraints! This is a natural idea, because our primal constraints already look like upper bounds, so all we need to do is combine them into an upper bound on \mathbf{c}^\top \mathbf{x}. This is where the linear combination of constraints comes in: we assign a nonnegative weight y_i to each constraint and add them together, so that the coefficient of each variable on the new left hand side is at least the corresponding entry of \mathbf{c}. But this is exactly the dual constraint A^\top \mathbf{y} \geq \mathbf{c}! Since \mathbf{x} \geq 0, the left hand side is then at least \mathbf{c}^\top \mathbf{x}, and on the right hand side we get \mathbf{b}^\top \mathbf{y}, which we can use as our upper bound. Written out as a chain of inequalities: \mathbf{c}^\top \mathbf{x} \leq (A^\top \mathbf{y})^\top \mathbf{x} = \mathbf{y}^\top A \mathbf{x} \leq \mathbf{y}^\top \mathbf{b} = \mathbf{b}^\top \mathbf{y}. It is a valid upper bound because of our little lemma above: the feasible region of a linear combination of C over-approximates the feasible region of C itself. In addition, we want this upper bound to be as tight as possible, which is why we minimize the objective \mathbf{b}^\top \mathbf{y}.
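To make this concrete, here is a sketch that solves the dual of our 2D example with scipy.optimize.linprog. If I did the algebra right, the optimal weights should put everything on the second constraint x_1 + x_2 \leq 1.5, giving the bound 1.5 that matches the primal optimum:

```python
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
b = np.array([2.0, 1.5, 2.0])
c = np.array([1.0, 1.0])  # primal objective: maximize x1 + x2

# Dual: minimize b^T y subject to A^T y >= c, y >= 0.
# linprog only handles <= constraints, so flip the sign of A^T y >= c.
dual = linprog(c=b, A_ub=-A.T, b_ub=-c, bounds=[(0, None)] * 3)

print(dual.x)    # expected: [0. 1. 0.] -- all weight on x1 + x2 <= 1.5
print(dual.fun)  # expected: 1.5, matching the primal optimum
```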
In short, the primal program tries to pick a point in the feasible region of the constraints that maximizes the objective. The dual program tries to derive the tightest upper bound on the objective that can be obtained by linearly combining the primal constraints. So far we have only focused on the weak duality theorem. The strong duality theorem involves a bit more work (so it is even more fun), and hopefully we’ll get to it in a future post.