## The Max Flow-Min Cut Theorem via Linear Programming Duality

The Max Flow-Min Cut Theorem states that the maximum flow from the source node to the sink node through a capacitated network is equal to the capacity of the minimum cut that separates that source node from the sink node.  In this post we’ll show that this result, which can be proved using only network concepts, also follows from the strong duality theorem for linear programming.

Here’s a linear programming formulation of the maximum flow problem.

\displaystyle \begin{aligned} \text{Maximize } f & \\ \text{subject to } &\sum_{j:(i,j) \in A} x_{ij} - \sum_{k:(k,i) \in A} x_{ki} = \begin{cases} f, & i = s; \\ -f, & i = t; \\ 0, &\text{otherwise;} \end{cases} \text{ for all } i \in N; \\ &0 \leq x_{ij} \leq u_{ij}, \text{ for all } (i,j) \in A. \end{aligned}

Here, f is the total amount of flow, $x_{ij}$ is the amount of flow sent along arc $(i,j)$A is the set of arcs, N is the set of nodes, s is the (single) source, t is the (single) sink, and $u_{ij}$ is the capacity (upper bound) of arc $(i,j)$.

The constraints involving the summations are the flow-balance constraints.  They enforce the idea that the total amount of flow into a node must equal the total amount of flow out of that node.  The two exceptions are the source and the sink, which emit and absorb, respectively, all the flow through the network.

The dual of this problem is

\displaystyle \begin{aligned} \text{Minimize } &\sum_{(i,j) \in A} u_{ij} z_{ij} \\ \text{subject to } &y_i - y_j + z_{ij} \geq 0, \:\: \text{ for all } (i,j) \in A; \\ &-y_s + y_t = 1; \\ &z_{ij} \geq 0, \text{ for all } (i,j) \in A. \end{aligned}

Here, $z_{ij}$ is the dual variable associated with the upper bound constraint on arc $(i,j)$, and $y_i$ is the dual variable associated with the flow-balance constraint on node i.

Next, let’s take a look at why the dual problem corresponds to the problem of finding a minimum cut.  In a search for an optimal solution we will want to set $z_{ij} = 0$ as much as possible.  This means that we will want to set $y_i = y_j$ for as many arcs $(i,j)$ as possible. (*)  However, we cannot set $y_i = y_j$ for all arcs $(i,j)$, as the requirement $-y_s + y_t = 1$ means $y_t$ will have a value one unit larger than $y_s$.  So in a search for an optimal solution to the dual we can consider only those cases in which we break the node set into two pieces: Those nodes i satisfying $y_i = y_s$ and those nodes j satisfying $y_j = y_t$.  Thus this set of cases in which the optimal solution must be found corresponds to the set of $s-t$ cuts that place the source and sink in different subsets.

Does this set of cases leave out any $s-t$ cuts, though?  No.  To see this, note that every cut $(S,\bar{S})$ with $s \in S, t \in \bar{S}$ corresponds to a feasible solution to the dual in the following manner.  Let $y_s = 0, y_t = 1$.  Then let $y_i = \begin{cases} 0, & i \in S; \\ 1, & i \in \bar{S}. \end{cases}$

Finally, let $z_{ij} = \begin{cases} 1, & (i,j) \in (S, \bar{S}); \\ 0, & (i,j) \not\in (S, \bar{S}). \end{cases}$

This assignment satisfies the dual problem, and its corresponding objective function value is exactly the capacity of the cut $(S, \bar{S})$.

Since the assignment of zero flow through the network is feasible for the original problem, both the original problem and its dual are feasible.  By the strong duality theorem, then, the original problem and its dual have optimal solutions whose objective function values are equal.  Therefore, the maximum flow through the network must equal the capacity of the minimum cut that separates the source node s from the sink node t.

(*)  Well, the constraint actually just requires $y_i \geq y_j$, but we know from complementary slackness that if there’s positive flow on an arc $(i,j)$ in the optimal solution to the maximum flow problem then the corresponding dual constraint  $y_i - y_j + z_{ij} \geq 0$ must be satisfied at equality.  So for the dual constraints that actually constrain the solution and thus actually matter we’ll have $y_i - y_j + z_{ij} = 0$ in the dual optimal.

## Solving Second-Order Difference Equations with Constant Coefficients: Part II

In this post we’ll discuss the missing case from last month’s derivation of the solution for second-order difference equations with constant coefficients; namely, the case in which the characteristic equation has a repeated root rather than two distinct roots.

Once again, suppose we have the difference equation $a_{n+2} = A a_{n+1} + B a_n$.  The characteristic equation is still $r^2 = Ar + B$.  Let’s suppose now that the latter equation has the single root $r_0$ rather than distinct roots $r_1$ and $r_2$.

Much of the derivation goes through as before with $r_1 = r_2 = r_0$, so we’ll pick up the derivation where it starts to differ.  This occurs with the partial fractions decomposition.  Instead of using partial fractions decomposition to rewrite

$\displaystyle S = \frac{a_0 + (a_1 - a_0)x}{(1-r_1 x)(1-r_2 x)}$

as

$\displaystyle S = \frac{C}{1-r_1 x} + \frac{D}{1-r_2 x}$,

having repeated roots means we must do the decomposition as

$\displaystyle S = \frac{C}{1-r_0 x} + \frac{D}{(1-r_0 x)^2}$.

The question now is how to expand the second fraction into an infinite series.  In the previous derivation we used the geometric series formula

$\displaystyle \frac{1}{1-rx} = \sum_{n=0}^{\infty} r^n x^n$.

Differentiating both sides of this formula, we obtain

$\displaystyle \frac{r}{(1-rx)^2} = \sum_{n=1}^{\infty} n r^n x^{n-1}$.

We’ll rewrite this and shift indices to get

$\displaystyle \frac{1}{(1-rx)^2} = \sum_{n=1}^{\infty} n r^{n-1} x^{n-1} = \sum_{n=0}^{\infty} (n+1) r^n x^n$.

Now we can express S as

$\displaystyle S = C \sum_{n=0}^{\infty} r_0^n x^n + D \sum_{n=0}^{\infty} (n+1) (r_0^n x^n) = \sum_{n=0}^{\infty} ((C +D) r_0^n + D n r_0^n)x^n.$

Letting $E = C+D$, this gives us

$\displaystyle S = \sum_{n=0}^{\infty} (E r_0^n + D n r_0^n)x^n.$

As before, if we remember that $\displaystyle S = \sum_{n=0}^{\infty} a_n x^n$ and that the only way two formal power series can be equal is if their coefficients are equal, we get the solution in the repeated roots case to be

$\displaystyle a_n = E r_0^n + D n r_0^n$,

where E and D depend on the initial conditions $a_0$ and $a_1$.

## Solving Second-Order Difference Equations with Constant Coefficients

Suppose you have a linear, homogeneous second-order difference equation with constant coefficients of the form $a_{n+2} = A a_{n+1} + B a_n$.  The solution procedure is as follows: “Guess” a solution of the form $a_n = r^n$.  Then substitute this guess into the difference equation to obtain $r^{n+2} = A r^{n+1} + Br^n$.  Divide both sides by $r^n$ (assuming $r \neq 0$), yielding $r^2 = Ar + B$.  Solve this equation to obtain the two solutions $r = r_1$ and $r = r_2$.  This gives two solutions to the original difference equation: $a_n = r_1^n$ and $a_n = r_2^n$.  Assuming $r_1 \neq r_2$, and after arguing that the solutions to the original equation form a vector space of dimension 2 (we’ll leave that aside for now), your two solutions to the original equation form a basis for the solution space.  This means that all solutions to the difference equation can be represented as a linear combination of the two basis solutions $r_1^n$ and $r_2^n$, which means that the general solution to $a_{n+2} = A a_{n+1} + B a_n$ is $a_n = C r_1^n + D r_2^n$, where C and D are determined by the initial conditions $a_0$ and $a_1$.

Many of my students do not find this argument satisfying (although it is valid).  They don’t like the guessing aspect near the beginning.  How do you know the right guess?  How would you come up with that guess?  In this post we’ll do a direct derivation that $a_n = C r_1^n + D r_2^n$ is actually the correction solution (with no guesses!).

The derivation uses generating functions.  We’ll start by reindexing the original recurrence into the form $a_n = A a_{n-1} + B a_{n-2}$.  Then multiply this equation by $x^n$ and then sum as n goes from 2 to infinity to obtain

$\displaystyle \sum_{n=2}^{\infty} a_n x^n = A \sum_{n=2}^{\infty} a_{n-1} x^n + B \sum_{n=2}^{\infty} a_{n-2} x^n$.

We want to reindex this equation so that each summation has the same bounds and the same subscripts for $a_n$.  Doing this yields

$\displaystyle \sum_{n=0}^{\infty} a_n x^n - a_0 - a_1 x= A \sum_{n=1}^{\infty} a_n x^{n+1} + B \sum_{n=0}^{\infty} a_n x^{n+2},$ or

$\displaystyle \sum_{n=0}^{\infty} a_n x^n - a_0 - a_1 x= A x \sum_{n=0}^{\infty} a_n x^n - a_0 x + B x^2 \sum_{n=0}^{\infty} a_n x^n$.

For simplicity’s sake, let’s let $\displaystyle S = \sum_{n=0}^{\infty} a_n x^n$.  This substitution and some rearrangement produces

$\displaystyle S - Ax S - Bx^2 S = a_0 + a_1 x - a_0 x$, which implies

$\displaystyle S = \frac{a_0 + (a_1 - a_0)x}{1 - Ax - Bx^2}$.

Now, let’s factor the denominator.  We already know that $x^2 - Ax - B = (x-r_1)(x-r_2)$.  Replacing x with $1/x$ produces $1/x^2 - A/x - B = (1/x-r_1)(1/x-r_2)$.  Then multiply by $x^2$ to obtain $1 - Ax - Bx^2 = (1-r_1 x)(1 - r_2 x)$.  Therefore,

$\displaystyle S = \frac{a_0 + (a_1 - a_0)x}{(1-r_1 x)(1-r_2 x)}$.

Applying partial fractions decomposition, we can write this as

$\displaystyle S = \frac{C}{1-r_1 x} + \frac{D}{1-r_2 x}$,

for some constants and D.  We could figure out what C and D are in terms of $a_0$ and $a_1$, but for the purposes of this derivation we don’t need to do that.

Now we’re going to pull a property of generating functions.  They can be thought of as formal power series, with the consequence that you can expand them as power series without worrying about questions of convergence.  (See, for example, Wilf’s text generatingfunctionology.)  Applying the geometric series expansion, we have

$\displaystyle S = C \sum_{n=0}^{\infty} (r_1 x)^n + D \sum_{n=0}^{\infty} (r_2 x)^n = \sum_{n=0}^{\infty} (C r_1^n + D r_2^n)x^n.$

Remembering that $\displaystyle S = \sum_{n=0}^{\infty} a_n x^n$, we now have

$\displaystyle \sum_{n=0}^{\infty} a_n x^n = \sum_{n=0}^{\infty} (C r_1^n + D r_2^n)x^n.$

Since the only way two formal power series can be equal is if their coefficients are equal, we get the solution to the original difference equation to be

$a_n = C r_1^n + D r_2^n$.

Remember, this is for the case when the characteristic polynomial $r^2 - Ar - B$ has two distinct roots $r_1$ and $r_2$.  Next month we’ll take a look at the repeated root case, where $r_1 = r_2$.

The Princeton Alumni Weekly has a great article up about the mathematicians at Princeton and the Institute for Advanced Study in the 1930s and 1940s.  I’ve heard a lot of anecdotes about mathematicians over the years, but I had not heard any of the ones in this article before.  Check it out!

## A Quasiperfect Number Must Be an Odd Perfect Square

Let $\sigma$ be the sum of divisors function.  For example, $\sigma(5) = 1 + 5 = 6$, as the divisors of 5 are 1 and 5.  Similarly, $\sigma(6) = 1 + 2 +3 + 6 = 12$, as the divisors of 6 are 1, 2, 3, and 6.

A perfect number is one that is equal to the sum of its divisors, excluding itself.  Thus 6 is a perfect number, as $6 = 1 + 2 + 3$.  In terms of the divisor function, a perfect number is one such that $\sigma(n) = 2n$.  An interesting theorem about perfect numbers is that an even number is perfect if and only if it is of the form $2^{p-1}(2^p-1)$, where both p and $2^p-1$ are prime.

Similarly, a quasiperfect number is a number n such that $\sigma(n) = 2n+1$.  We don’t know whether there are any quasiperfect numbers, but we do know that if they exist there are certain properties they must have.  In this post we will work through the following result of Cattaneo [1].  (I gave the same argument in this post on math.SE.)

Theorem: If $\sigma(n) = 2n+1$, then n is an odd perfect square.

Proof: If $n = \prod_{i=1}^r p_i^{a_i}$ is the prime factorization of n, then we know that
$\displaystyle \sigma(n) = \prod_{i=1}^r (1 + p_i + p_i^2 + \cdots + p_i^{a_i}).$
Since $\sigma(n) = 2n+1$ is odd, though, each factor $1 + p_i + p_i^2 + \cdots + p_i^{a_i}$ must also be odd.

For each odd prime factor $p_i$, then, there must be an odd number of terms in the sum $1 + p_i + p_i^2 + \cdots + p_i^{a_i}$. Thus $a_i$ must be even. This means that if $p_i$ is odd, $p_i^{a_i}$ is an odd perfect square. The product of odd perfect squares is another odd perfect square, and therefore $n = 2^s m^2$, where m is odd.

Now we have $\sigma(n) = (2^{s+1}-1)M$, where

$\displaystyle M = \prod_{p_i \text{ odd}} (1 + p_i + p_i^2 + \cdots + p_i^{a_i})$.

Since $\sigma(n) = 2n+1$,
\begin{aligned} &(2^{s+1}-1)M = 2^{s+1}m^2 + 1 \\ \implies &(2^{s+1}-1)(M - m^2)-1 = m^2. \end{aligned}
This means that -1 is a quadratic residue for each prime factor of $2^{s+1}-1$. A consequence of the quadratic reciprocity theorem, though, is that -1 is a quadratic residue of an odd prime p if and only if $p \equiv 1 \bmod 4$. Thus all prime factors of $2^{s+1}-1$ are congruent to $1 \bmod 4$. The product of numbers congruent to $1 \bmod 4$ is still congruent to $1 \bmod 4$, so $2^{s+1}-1 \equiv 1 \bmod 4$. However, if $s > 0$, then $2^{s+1}-1 \equiv 3 \bmod 4$. Thus s must be $0$. Therefore, $n = m^2$, where m is odd.

References

1.  Paolo Cattaneo, “Sui numeri quasiperfetti,” Bollettino dell’Unione Matematica ItalianoSerie 3, Vol. 6 (1951), no. 1, p. 59-62.

## Determinant of a Symmetric Pascal Matrix

The infinite symmetric Pascal matrix Q is given by

$\displaystyle Q = \begin{bmatrix} 1 & 1 & 1 & 1 & \cdots \\ 1 & 2 & 3 & 4 & \cdots \\ 1 & 3 & 6 & 10 & \cdots \\ 1 & 4 & 10 & 20 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix},$

where entry $(i,j)$ in Q is $\binom{i+j}{i}$.  (Note that we begin indexing the matrix with 0, not 1, in keeping with the way Pascal’s triangle is usually indexed.)

The purpose of this post is to show that the determinant of $Q_n$, the $(n+1) \times (n+1)$ upper-left submatrix of Q, is 1.

For example, here’s $Q_4$:

$\displaystyle Q_4 = \begin{bmatrix} 1 & 1 & 1 & 1 & 1\\ 1 & 2 & 3 & 4 & 5\\ 1 & 3 & 6 & 10 & 15\\ 1 & 4 & 10 & 20 & 35 \\ 1 & 5 & 15 & 35 & 70 \end{bmatrix}.$

The proof starts with Vandermonde’s identity,

$\displaystyle \sum_{k=0}^n \binom{n}{k} \binom{m}{r-k} = \binom{n+m}{r}$.

Let $r = m$ in Vandermonde, and then use the symmetry property of the binomial coefficients, $\binom{m}{m-k} = \binom{m}{k}$.  This gives us

$\displaystyle \sum_{k=0}^n \binom{n}{k} \binom{m}{k} = \binom{n+m}{n}$.

Now, let’s create matrix $P_n$, where the $(i,j)$ entry in $P_n$ is $\binom{i}{j}$.  For example, $P_4$ is given by

$\displaystyle P_4 = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 \\ 1 & 2 & 1 & 0 & 0 \\ 1 & 3 & 3 & 1 & 0 \\ 1 & 4 & 6 & 4 & 1 \end{bmatrix}$.

Now, let’s consider the product $P_n P^T_n$.  Since the $(i,j)$ entry in $P^T_n$ is $\binom{j}{i}$, entry $(n,m)$ in $P_n P^T_n$ is given by

$\displaystyle \sum_{k=0}^n \binom{n}{k} \binom{m}{k}$,

which, according to the corollary to Vandermonde’s identity that we just proved above, is $\binom{n+m}{n}$.

Since $\binom{n+m}{n}$ is entry $(n,m)$ in $Q_n$, this means that $P_n P^T_n = Q_n$.  For example,

$\displaystyle \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 \\ 1 & 2 & 1 & 0 & 0 \\ 1 & 3 & 3 & 1 & 0 \\ 1 & 4 & 6 & 4 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 & 4 \\ 0 & 0 & 1 & 3 & 6 \\ 0 & 0 & 0 & 1 & 4 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 1 & 1\\ 1 & 2 & 3 & 4 & 5\\ 1 & 3 & 6 & 10 & 15\\ 1 & 4 & 10 & 20 & 35 \\ 1 & 5 & 15 & 35 & 70 \end{bmatrix}.$

Since it is clearly the case that $\det P_n = \det P^T_n = 1$, and $\det (AB) = \det A \det B$ in general, we have that

$\displaystyle \det Q_n = 1$.

We’ve done more than prove that $\det Q_n = 1$, actually; we’ve also found the LU decomposition of $Q_n$!

For further reading I recommend Alan Edelman and Gilbert Strang’s article “Pascal Matrices” (American Mathematical Monthly 111(3): March 2004, pp. 189-197).  They give four proofs (including this one) that $\det Q_n = 1$.  Each proof describes a different way of establishing the LU decomposition given here.

Posted in binomial coefficients, matrices | 2 Comments

## A Sequence of Right-Hand Riemann Sums Whose Limit Does Not Exist

Recently in my advanced calculus class one of my students asked a (perhaps the) key question when we hit the integration material: Why can’t you just use the limit of the right-hand Riemann sums formed from equally-spaced partitions to define the integral?  I gave the usual answer involving the 0-1 Dirichlet function:  If you choose a rational point in each subinterval to evaluate the function you get 1 for the limit of the Riemann sums, but if you choose an irrational point in each subinterval you get 0 for the limit.  Thus the integral would not be well-defined.

I realized later, though, that I had kind of sidestepped his question.  He asked specifically about right-hand Riemann sums using equally-spaced partitions, and if you force yourself to work under those constraints on the standard interval [0,1], then the partition points are always rational.  Thus you’re taking 1 for the function value on each subinterval, yielding $\lim_{n \to \infty} R_n = 1$.  The limit of the right-hand Riemann sums using equally-spaced partitions on [0,1] does in fact exist for the Dirichlet function.

So, I wondered, is there a simple example of a bounded function on [0,1] such that the limit of the right-hand Riemann sums based on equally-spaced partitions does not exist?  (A simple unbounded example is $f(x) = \frac{1}{x}[x > 0]$.)  It took me a while to come up with one, but here it is.

Let $\displaystyle f(x) = \begin{cases} 1, &\text{ if } x= p/q \text{ in lowest terms and } q \text{ is even}; \\ 0, &\text{otherwise}. \end{cases}$

Let $\displaystyle R_n = \sum_{i=1}^n f(1/n) \frac{1}{n},$ the right-hand Riemann sum on $[0,1]$ using n equally-spaced subintervals.

If n is odd, then $R_n = 0$.  This is because n has no even factors, and so $i/n$ cannot reduce to a fraction with an even denominator.

However, if n is a power of 2, then $R_n = \frac{n-1}{n}$.  To see this, first observe that $i/n$ for any $i \in \{1, 2, \ldots, n-1\}$ will have more powers of 2 in its denominator than in its numerator, leaving $i/n$ with an even denominator when reduced to lowest terms.  However, $f(n/n) = f(1) = 0$.  Thus $\displaystyle R_n = \sum_{i=1}^n f(i/n) \frac{1}{n} = \sum_{i=1}^{n-1} \frac{1}{n} = \frac{n-1}{n}$.

Since $R_n \geq 0$ for each n and $R_n = 0$ when n is odd, we have $\liminf_{n \to \infty} R_n = 0$.  Similarly, since $R_n \leq 1$ for each n and $R_n = \frac{n-1}{n}$ when n is a power of 2, $\limsup_{n \to \infty} R_n = 1$.  Thus $\lim_{n \to \infty}R_n$ does not exist.