The Max Flow-Min Cut Theorem via Linear Programming Duality

The Max Flow-Min Cut Theorem states that the maximum flow from the source node to the sink node through a capacitated network is equal to the capacity of the minimum cut that separates that source node from the sink node.  In this post we’ll show that this result, which can be proved using only network concepts, also follows from the strong duality theorem for linear programming.

Here’s a linear programming formulation of the maximum flow problem.

\displaystyle    \begin{aligned} \text{Maximize } f & \\ \text{subject to } &\sum_{j:(i,j) \in A} x_{ij} - \sum_{k:(k,i) \in A} x_{ki} = \begin{cases} f, & i = s; \\ -f, & i = t; \\ 0, &\text{otherwise;} \end{cases} \text{ for all } i \in N; \\ &0 \leq x_{ij} \leq u_{ij}, \text{ for all } (i,j) \in A. \end{aligned}

Here, f is the total amount of flow, x_{ij} is the amount of flow sent along arc (i,j)A is the set of arcs, N is the set of nodes, s is the (single) source, t is the (single) sink, and u_{ij} is the capacity (upper bound) of arc (i,j).

The constraints involving the summations are the flow-balance constraints.  They enforce the idea that the total amount of flow into a node must equal the total amount of flow out of that node.  The two exceptions are the source and the sink, which emit and absorb, respectively, all the flow through the network.

The dual of this problem is

\displaystyle    \begin{aligned} \text{Minimize } &\sum_{(i,j) \in A} u_{ij} z_{ij} \\ \text{subject to } &y_i - y_j + z_{ij} \geq 0, \:\: \text{ for all } (i,j) \in A; \\ &-y_s + y_t = 1; \\ &z_{ij} \geq 0, \text{ for all } (i,j) \in A. \end{aligned}

Here, z_{ij} is the dual variable associated with the upper bound constraint on arc (i,j), and y_i is the dual variable associated with the flow-balance constraint on node i.

Next, let’s take a look at why the dual problem corresponds to the problem of finding a minimum cut.  In a search for an optimal solution we will want to set z_{ij} = 0 as much as possible.  This means that we will want to set y_i = y_j for as many arcs (i,j) as possible. (*)  However, we cannot set y_i = y_j for all arcs (i,j), as the requirement -y_s + y_t = 1 means y_t will have a value one unit larger than y_s.  So in a search for an optimal solution to the dual we can consider only those cases in which we break the node set into two pieces: Those nodes i satisfying y_i = y_s and those nodes j satisfying y_j = y_t.  Thus this set of cases in which the optimal solution must be found corresponds to the set of s-t cuts that place the source and sink in different subsets.

Does this set of cases leave out any s-t cuts, though?  No.  To see this, note that every cut (S,\bar{S}) with s \in S, t \in \bar{S} corresponds to a feasible solution to the dual in the following manner.  Let y_s = 0, y_t = 1.  Then let y_i = \begin{cases} 0, & i \in S; \\ 1, & i \in \bar{S}. \end{cases}

Finally, let z_{ij} = \begin{cases} 1, & (i,j) \in (S, \bar{S}); \\ 0, & (i,j) \not\in (S, \bar{S}). \end{cases}

This assignment satisfies the dual problem, and its corresponding objective function value is exactly the capacity of the cut (S, \bar{S}).

Since the assignment of zero flow through the network is feasible for the original problem, both the original problem and its dual are feasible.  By the strong duality theorem, then, the original problem and its dual have optimal solutions whose objective function values are equal.  Therefore, the maximum flow through the network must equal the capacity of the minimum cut that separates the source node s from the sink node t.


(*)  Well, the constraint actually just requires y_i \geq y_j, but we know from complementary slackness that if there’s positive flow on an arc (i,j) in the optimal solution to the maximum flow problem then the corresponding dual constraint  y_i - y_j + z_{ij} \geq 0 must be satisfied at equality.  So for the dual constraints that actually constrain the solution and thus actually matter we’ll have y_i - y_j + z_{ij} = 0 in the dual optimal.

Posted in linear programming, optimization | Leave a comment

Solving Second-Order Difference Equations with Constant Coefficients: Part II

In this post we’ll discuss the missing case from last month’s derivation of the solution for second-order difference equations with constant coefficients; namely, the case in which the characteristic equation has a repeated root rather than two distinct roots.

Once again, suppose we have the difference equation a_{n+2} = A a_{n+1} + B a_n.  The characteristic equation is still r^2 = Ar + B.  Let’s suppose now that the latter equation has the single root r_0 rather than distinct roots r_1 and r_2.

Much of the derivation goes through as before with r_1 = r_2 = r_0, so we’ll pick up the derivation where it starts to differ.  This occurs with the partial fractions decomposition.  Instead of using partial fractions decomposition to rewrite

\displaystyle S = \frac{a_0 + (a_1 - a_0)x}{(1-r_1 x)(1-r_2 x)}


\displaystyle S = \frac{C}{1-r_1 x} + \frac{D}{1-r_2 x},

having repeated roots means we must do the decomposition as

\displaystyle S = \frac{C}{1-r_0 x} + \frac{D}{(1-r_0 x)^2}.

The question now is how to expand the second fraction into an infinite series.  In the previous derivation we used the geometric series formula

\displaystyle \frac{1}{1-rx} = \sum_{n=0}^{\infty} r^n x^n.

Differentiating both sides of this formula, we obtain

\displaystyle \frac{r}{(1-rx)^2} = \sum_{n=1}^{\infty} n r^n x^{n-1}.

We’ll rewrite this and shift indices to get

\displaystyle \frac{1}{(1-rx)^2} = \sum_{n=1}^{\infty} n r^{n-1} x^{n-1} = \sum_{n=0}^{\infty} (n+1) r^n x^n.

Now we can express S as

\displaystyle S = C \sum_{n=0}^{\infty} r_0^n x^n + D \sum_{n=0}^{\infty} (n+1) (r_0^n x^n) = \sum_{n=0}^{\infty} ((C +D) r_0^n + D n r_0^n)x^n.

Letting E = C+D, this gives us

\displaystyle S = \sum_{n=0}^{\infty} (E r_0^n + D n r_0^n)x^n.

As before, if we remember that \displaystyle S = \sum_{n=0}^{\infty} a_n x^n and that the only way two formal power series can be equal is if their coefficients are equal, we get the solution in the repeated roots case to be

\displaystyle a_n = E r_0^n + D n r_0^n,

where E and D depend on the initial conditions a_0 and a_1.

Posted in generating functions, recurrence relations | Leave a comment

Solving Second-Order Difference Equations with Constant Coefficients

Suppose you have a linear, homogeneous second-order difference equation with constant coefficients of the form a_{n+2} = A a_{n+1} + B a_n.  The solution procedure is as follows: “Guess” a solution of the form a_n = r^n.  Then substitute this guess into the difference equation to obtain r^{n+2} = A r^{n+1} + Br^n.  Divide both sides by r^n (assuming r \neq 0), yielding r^2 = Ar + B.  Solve this equation to obtain the two solutions r = r_1 and r = r_2.  This gives two solutions to the original difference equation: a_n = r_1^n and a_n = r_2^n.  Assuming r_1 \neq r_2, and after arguing that the solutions to the original equation form a vector space of dimension 2 (we’ll leave that aside for now), your two solutions to the original equation form a basis for the solution space.  This means that all solutions to the difference equation can be represented as a linear combination of the two basis solutions r_1^n and r_2^n, which means that the general solution to a_{n+2} = A a_{n+1} + B a_n is a_n = C r_1^n + D r_2^n, where C and D are determined by the initial conditions a_0 and a_1.

Many of my students do not find this argument satisfying (although it is valid).  They don’t like the guessing aspect near the beginning.  How do you know the right guess?  How would you come up with that guess?  In this post we’ll do a direct derivation that a_n = C r_1^n + D r_2^n is actually the correction solution (with no guesses!).

The derivation uses generating functions.  We’ll start by reindexing the original recurrence into the form a_n = A a_{n-1} + B a_{n-2}.  Then multiply this equation by x^n and then sum as n goes from 2 to infinity to obtain

\displaystyle \sum_{n=2}^{\infty} a_n x^n = A \sum_{n=2}^{\infty} a_{n-1} x^n + B \sum_{n=2}^{\infty} a_{n-2} x^n.

We want to reindex this equation so that each summation has the same bounds and the same subscripts for a_n.  Doing this yields

\displaystyle \sum_{n=0}^{\infty} a_n x^n - a_0 - a_1 x= A \sum_{n=1}^{\infty} a_n x^{n+1} + B \sum_{n=0}^{\infty} a_n x^{n+2}, or

\displaystyle \sum_{n=0}^{\infty} a_n x^n - a_0 - a_1 x= A x \sum_{n=0}^{\infty} a_n x^n - a_0 x + B x^2 \sum_{n=0}^{\infty} a_n x^n.

For simplicity’s sake, let’s let \displaystyle S = \sum_{n=0}^{\infty} a_n x^n.  This substitution and some rearrangement produces

\displaystyle S - Ax S - Bx^2 S = a_0 + a_1 x - a_0 x, which implies

\displaystyle S = \frac{a_0 + (a_1 - a_0)x}{1 - Ax - Bx^2}.

Now, let’s factor the denominator.  We already know that x^2 - Ax - B = (x-r_1)(x-r_2).  Replacing x with 1/x produces 1/x^2 - A/x - B = (1/x-r_1)(1/x-r_2).  Then multiply by x^2 to obtain 1 - Ax - Bx^2 = (1-r_1 x)(1 - r_2 x).  Therefore,

\displaystyle S = \frac{a_0 + (a_1 - a_0)x}{(1-r_1 x)(1-r_2 x)}.

Applying partial fractions decomposition, we can write this as

\displaystyle S = \frac{C}{1-r_1 x} + \frac{D}{1-r_2 x},

for some constants and D.  We could figure out what C and D are in terms of a_0 and a_1, but for the purposes of this derivation we don’t need to do that.

Now we’re going to pull a property of generating functions.  They can be thought of as formal power series, with the consequence that you can expand them as power series without worrying about questions of convergence.  (See, for example, Wilf’s text generatingfunctionology.)  Applying the geometric series expansion, we have

\displaystyle S = C \sum_{n=0}^{\infty} (r_1 x)^n + D \sum_{n=0}^{\infty} (r_2 x)^n = \sum_{n=0}^{\infty} (C r_1^n + D r_2^n)x^n.

Remembering that \displaystyle S = \sum_{n=0}^{\infty} a_n x^n, we now have

\displaystyle \sum_{n=0}^{\infty} a_n x^n = \sum_{n=0}^{\infty} (C r_1^n + D r_2^n)x^n.

Since the only way two formal power series can be equal is if their coefficients are equal, we get the solution to the original difference equation to be

a_n = C r_1^n + D r_2^n.

Remember, this is for the case when the characteristic polynomial r^2 - Ar - B has two distinct roots r_1 and r_2.  Next month we’ll take a look at the repeated root case, where r_1 = r_2.

Posted in generating functions, recurrence relations | Leave a comment

Adventures in Fine Hall

The Princeton Alumni Weekly has a great article up about the mathematicians at Princeton and the Institute for Advanced Study in the 1930s and 1940s.  I’ve heard a lot of anecdotes about mathematicians over the years, but I had not heard any of the ones in this article before.  Check it out!

Posted in mathematicians | Leave a comment

A Quasiperfect Number Must Be an Odd Perfect Square

Let \sigma be the sum of divisors function.  For example, \sigma(5) = 1 + 5 = 6, as the divisors of 5 are 1 and 5.  Similarly, \sigma(6) = 1 + 2 +3 + 6 = 12, as the divisors of 6 are 1, 2, 3, and 6.

A perfect number is one that is equal to the sum of its divisors, excluding itself.  Thus 6 is a perfect number, as 6 = 1 + 2 + 3.  In terms of the divisor function, a perfect number is one such that \sigma(n) = 2n.  An interesting theorem about perfect numbers is that an even number is perfect if and only if it is of the form 2^{p-1}(2^p-1), where both p and 2^p-1 are prime.

Similarly, a quasiperfect number is a number n such that \sigma(n) = 2n+1.  We don’t know whether there are any quasiperfect numbers, but we do know that if they exist there are certain properties they must have.  In this post we will work through the following result of Cattaneo [1].  (I gave the same argument in this post on math.SE.)

Theorem: If \sigma(n) = 2n+1, then n is an odd perfect square.

Proof: If n = \prod_{i=1}^r p_i^{a_i} is the prime factorization of n, then we know that
\displaystyle \sigma(n) = \prod_{i=1}^r (1 + p_i + p_i^2 + \cdots + p_i^{a_i}).
Since \sigma(n) = 2n+1 is odd, though, each factor 1 + p_i + p_i^2 + \cdots + p_i^{a_i} must also be odd.

For each odd prime factor p_i, then, there must be an odd number of terms in the sum 1 + p_i + p_i^2 + \cdots + p_i^{a_i}. Thus a_i must be even. This means that if p_i is odd, p_i^{a_i} is an odd perfect square. The product of odd perfect squares is another odd perfect square, and therefore n = 2^s m^2, where m is odd.

Now we have \sigma(n) = (2^{s+1}-1)M, where

\displaystyle M = \prod_{p_i \text{ odd}} (1 + p_i + p_i^2 + \cdots + p_i^{a_i}).

Since \sigma(n) = 2n+1,
\begin{aligned}  &(2^{s+1}-1)M = 2^{s+1}m^2 + 1 \\  \implies &(2^{s+1}-1)(M - m^2)-1 = m^2.  \end{aligned}
This means that -1 is a quadratic residue for each prime factor of 2^{s+1}-1. A consequence of the quadratic reciprocity theorem, though, is that -1 is a quadratic residue of an odd prime p if and only if p \equiv 1 \bmod 4. Thus all prime factors of 2^{s+1}-1 are congruent to 1 \bmod 4. The product of numbers congruent to 1 \bmod 4 is still congruent to 1 \bmod 4, so 2^{s+1}-1 \equiv 1 \bmod 4. However, if s > 0, then 2^{s+1}-1 \equiv 3 \bmod 4. Thus s must be 0. Therefore, n = m^2, where m is odd.


  1.  Paolo Cattaneo, “Sui numeri quasiperfetti,” Bollettino dell’Unione Matematica ItalianoSerie 3, Vol. 6 (1951), no. 1, p. 59-62.
Posted in number theory | Leave a comment

Determinant of a Symmetric Pascal Matrix

The infinite symmetric Pascal matrix Q is given by

\displaystyle Q = \begin{bmatrix} 1 & 1 & 1 & 1 & \cdots  \\ 1 & 2 & 3 & 4 & \cdots \\ 1 & 3 & 6 & 10 & \cdots \\ 1 & 4 & 10 & 20 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix},

where entry (i,j) in Q is \binom{i+j}{i}.  (Note that we begin indexing the matrix with 0, not 1, in keeping with the way Pascal’s triangle is usually indexed.)

The purpose of this post is to show that the determinant of Q_n, the (n+1) \times (n+1) upper-left submatrix of Q, is 1.

For example, here’s Q_4:

\displaystyle Q_4 = \begin{bmatrix} 1 & 1 & 1 & 1 & 1\\ 1 & 2 & 3 & 4 & 5\\ 1 & 3 & 6 & 10 & 15\\ 1 & 4 & 10 & 20 & 35 \\ 1 & 5 & 15 & 35 & 70 \end{bmatrix}.

The proof starts with Vandermonde’s identity,

\displaystyle \sum_{k=0}^n \binom{n}{k} \binom{m}{r-k} = \binom{n+m}{r}.

Let r = m in Vandermonde, and then use the symmetry property of the binomial coefficients, \binom{m}{m-k} = \binom{m}{k}.  This gives us

\displaystyle \sum_{k=0}^n \binom{n}{k} \binom{m}{k} = \binom{n+m}{n}.

Now, let’s create matrix P_n, where the (i,j) entry in P_n is \binom{i}{j}.  For example, P_4 is given by

\displaystyle P_4 = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 \\ 1 & 2 & 1 & 0 & 0 \\ 1 & 3 & 3 & 1 & 0 \\ 1 & 4 & 6 & 4 & 1 \end{bmatrix}.

Now, let’s consider the product P_n P^T_n.  Since the (i,j) entry in P^T_n is \binom{j}{i}, entry (n,m) in P_n P^T_n is given by

\displaystyle \sum_{k=0}^n \binom{n}{k} \binom{m}{k},

which, according to the corollary to Vandermonde’s identity that we just proved above, is \binom{n+m}{n}.

Since \binom{n+m}{n} is entry (n,m) in Q_n, this means that P_n P^T_n = Q_n.  For example,

\displaystyle \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 \\ 1 & 2 & 1 & 0 & 0 \\ 1 & 3 & 3 & 1 & 0 \\ 1 & 4 & 6 & 4 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 & 4 \\ 0 & 0 & 1 & 3 & 6 \\ 0 & 0 & 0 & 1 & 4 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 1 & 1\\ 1 & 2 & 3 & 4 & 5\\ 1 & 3 & 6 & 10 & 15\\ 1 & 4 & 10 & 20 & 35 \\ 1 & 5 & 15 & 35 & 70 \end{bmatrix}.

Since it is clearly the case that \det P_n = \det P^T_n = 1, and \det (AB) = \det A \det B in general, we have that

\displaystyle \det Q_n = 1.

We’ve done more than prove that \det Q_n = 1, actually; we’ve also found the LU decomposition of Q_n!

For further reading I recommend Alan Edelman and Gilbert Strang’s article “Pascal Matrices” (American Mathematical Monthly 111(3): March 2004, pp. 189-197).  They give four proofs (including this one) that \det Q_n = 1.  Each proof describes a different way of establishing the LU decomposition given here.

Posted in binomial coefficients, matrices | 2 Comments

A Sequence of Right-Hand Riemann Sums Whose Limit Does Not Exist

Recently in my advanced calculus class one of my students asked a (perhaps the) key question when we hit the integration material: Why can’t you just use the limit of the right-hand Riemann sums formed from equally-spaced partitions to define the integral?  I gave the usual answer involving the 0-1 Dirichlet function:  If you choose a rational point in each subinterval to evaluate the function you get 1 for the limit of the Riemann sums, but if you choose an irrational point in each subinterval you get 0 for the limit.  Thus the integral would not be well-defined.

I realized later, though, that I had kind of sidestepped his question.  He asked specifically about right-hand Riemann sums using equally-spaced partitions, and if you force yourself to work under those constraints on the standard interval [0,1], then the partition points are always rational.  Thus you’re taking 1 for the function value on each subinterval, yielding \lim_{n \to \infty} R_n = 1.  The limit of the right-hand Riemann sums using equally-spaced partitions on [0,1] does in fact exist for the Dirichlet function.

So, I wondered, is there a simple example of a bounded function on [0,1] such that the limit of the right-hand Riemann sums based on equally-spaced partitions does not exist?  (A simple unbounded example is f(x) = \frac{1}{x}[x > 0].)  It took me a while to come up with one, but here it is.

Let \displaystyle f(x) = \begin{cases} 1, &\text{ if } x= p/q \text{ in lowest terms and } q \text{ is even}; \\ 0, &\text{otherwise}. \end{cases}

Let \displaystyle R_n = \sum_{i=1}^n f(1/n) \frac{1}{n}, the right-hand Riemann sum on [0,1] using n equally-spaced subintervals.

If n is odd, then R_n = 0.  This is because n has no even factors, and so i/n cannot reduce to a fraction with an even denominator.

However, if n is a power of 2, then R_n = \frac{n-1}{n}.  To see this, first observe that i/n for any i \in \{1, 2, \ldots, n-1\} will have more powers of 2 in its denominator than in its numerator, leaving i/n with an even denominator when reduced to lowest terms.  However, f(n/n) = f(1) = 0.  Thus \displaystyle R_n = \sum_{i=1}^n f(i/n) \frac{1}{n} = \sum_{i=1}^{n-1} \frac{1}{n} = \frac{n-1}{n}.

Since R_n \geq 0 for each n and R_n = 0 when n is odd, we have \liminf_{n \to \infty} R_n = 0.  Similarly, since R_n \leq 1 for each n and R_n = \frac{n-1}{n} when n is a power of 2, \limsup_{n \to \infty} R_n = 1.  Thus \lim_{n \to \infty}R_n does not exist.

Posted in calculus, real analysis | Leave a comment