Equivalence of Two Binomial Sums

This month’s post entails proving the following equivalence:

Identity 1: \displaystyle \sum_{k=r}^n \binom{k-1}{r-1} p^r (1-p)^{k-r} = \sum_{k=r}^n \binom{n}{k} p^k (1-p)^{n-k}.

Although these two sums are over the same range and are equal for every value of n, they are not equal term by term.

If you think about the two sides probabilistically, it’s not too hard to prove that they are equal.  For example, \binom{k-1}{r-1} p^r (1-p)^{k-r} is the probability of obtaining the rth head on the kth flip when repeatedly flipping a coin whose probability of heads is p.  Thus the left side of Identity 1 is the probability of flipping such a coin n times and obtaining at least r heads.  Similarly, \binom{n}{k} p^k (1-p)^{n-k} is the probability of obtaining exactly k heads when flipping the coin n times.  Thus the right side of Identity 1 is also the probability of flipping the coin n times and obtaining at least r heads.  Therefore, the two sides must be equal.
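
Before diving into the algebra below, it’s easy to sanity-check Identity 1 numerically.  Here’s a quick sketch in Python; the function names are mine:

```python
from math import comb

def left_side(n, r, p):
    # Probability the r-th head occurs on flip r, r+1, ..., or n
    return sum(comb(k - 1, r - 1) * p**r * (1 - p)**(k - r) for k in range(r, n + 1))

def right_side(n, r, p):
    # Probability of at least r heads in n flips
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(r, n + 1))

for n in range(1, 12):
    for r in range(1, n + 1):
        for p in (0.2, 0.5, 0.9):
            assert abs(left_side(n, r, p) - right_side(n, r, p)) < 1e-12
print("Identity 1 holds numerically")
```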

A while back I was also interested in proving algebraically that the two sides are equal.  This turned out to be harder than it looked, and I thought I would record my derivation here.  It’s a bit heavy on the algebra, I’m afraid.  The overall goal will be to isolate the coefficient of p^m on both sides of Identity 1, as each side can be thought of as a polynomial in p.  If the coefficients agree for every m, then the two sides must be equal.

The left side first.  Expand (1-p)^{k-r} using the binomial theorem to get

\displaystyle \sum_{k=r}^n \binom{k-1}{r-1} p^r (1-p)^{k-r} = p^r \sum_{k=r}^n \binom{k-1}{r-1} \sum_{j=0}^{k-r} \binom{k-r}{j} (-p)^j.

Now, which terms in this double sum contain a p^m term?  Those in which j = m-r.  Thus the coefficient of p^m in this double sum is

\displaystyle \sum_{k=r}^n \binom{k-1}{r-1} \binom{k-r}{m-r} (-1)^{m-r} = \sum_{k=m}^n \binom{k-1}{r-1} \binom{k-r}{m-r} (-1)^{m-r}, \text{ since } \binom{k-r}{m-r} = 0 \text{ when } k < m,

\displaystyle = (-1)^{m-r} \sum_{k=m}^n \frac{r}{k} \binom{k}{r} \binom{k-r}{m-r}, \text{ by the absorption identity}

\displaystyle = (-1)^{m-r} r \sum_{k=m}^n \frac{1}{k} \binom{k}{m} \binom{m}{r}, \text{ by trinomial revision}

\displaystyle = (-1)^{m-r} r \binom{m}{r} \sum_{k=m}^n \binom{k}{m} \frac{1}{k}

\displaystyle = (-1)^{m-r} r \binom{m}{r} \sum_{k=m}^n \binom{k-1}{m-1} \frac{1}{m}, \text{ by the absorption identity (again)}

\displaystyle = (-1)^{m-r} \frac{r}{m} \binom{m}{r} \sum_{k=m-1}^{n-1} \binom{k}{m-1}

\displaystyle = (-1)^{m-r} \binom{m-1}{r-1} \binom{n}{m}, \text{ by, once more, the absorption identity, plus the hockey stick identity}.

Now for the right side of Identity 1.  Expand (1-p)^{n-k} using the binomial theorem to get

\displaystyle \sum_{k=r}^n \binom{n}{k} p^k (1-p)^{n-k} = \sum_{k=r}^n \binom{n}{k} p^k \sum_{j=0}^{n-k} \binom{n-k}{j} (-p)^j.

As before, which terms in this contain a p^m term?  Those in which j = m-k.  Thus the coefficient of p^m in this double sum is

\displaystyle \sum_{k=r}^n \binom{n}{k} \binom{n-k}{m-k} (-1)^{m-k} = \sum_{k=r}^m \binom{n}{k} \binom{n-k}{m-k} (-1)^{m-k}, \text{ since } \binom{n-k}{m-k} = 0 \text{ when } k > m,

\displaystyle = \sum_{k=r}^m \binom{n}{m} \binom{m}{k} (-1)^{m-k}, \text{ by trinomial revision}

\displaystyle = \binom{n}{m} \sum_{k=0}^{m-r} \binom{m}{m-k} (-1)^k, \text{ reindexing the sum}

\displaystyle = \binom{n}{m} \sum_{k=0}^{m-r} \binom{m}{k} (-1)^k

\displaystyle = \binom{n}{m} (-1)^{m-r} \binom{m-1}{r-1},

where the last expression comes from the formula for the partial alternating row sum of the binomial coefficients.
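
Since the algebra is fiddly, it’s reassuring that both coefficient extractions can be checked by machine.  Here’s a sketch in Python (the helper name poly_coeffs is mine) that expands each side of Identity 1 as a polynomial in p and compares the coefficients of p^m with the closed form just derived:

```python
from math import comb

def poly_coeffs(terms, n):
    """Coefficients of p^0, ..., p^n for a sum of terms a * p^s * (1 - p)^t."""
    c = [0] * (n + 1)
    for a, s, t in terms:
        for j in range(t + 1):
            c[s + j] += a * comb(t, j) * (-1) ** j
    return c

n, r = 9, 4
left = poly_coeffs([(comb(k - 1, r - 1), r, k - r) for k in range(r, n + 1)], n)
right = poly_coeffs([(comb(n, k), k, n - k) for k in range(r, n + 1)], n)
assert left == right
for m in range(r, n + 1):
    assert left[m] == (-1) ** (m - r) * comb(m - 1, r - 1) * comb(n, m)
print("coefficients agree for n = 9, r = 4")
```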

Posted in binomial coefficients, probability

A Sum of Ratios of Binomial Coefficients

In this post we evaluate the sum \displaystyle \sum_{k=0}^m \frac{\binom{m}{k}}{\binom{n}{k}}.  Then we’ll generalize it and evaluate \displaystyle \sum_{k=0}^m \frac{\binom{m}{k}\binom{k}{r}}{\binom{n}{k}}.

The key tools we need for the first sum are the trinomial revision identity, \binom{n}{m} \binom{m}{k} = \binom{n}{k} \binom{n-k}{m-k}, and the parallel summation identity, \sum_{k=0}^m \binom{n+k}{k} = \binom{n+m+1}{m}.  Using trinomial revision, we have

\displaystyle \sum_{k=0}^m \frac{\binom{m}{k}}{\binom{n}{k}} = \sum_{k=0}^m \frac{\binom{n-k}{m-k}}{\binom{n}{m}} = \frac{1}{\binom{n}{m}} \sum_{k=0}^m \binom{n-k}{m-k}.

We can evaluate the remaining sum using parallel summation, but we need to pull a variable switch first.  Replace k with m-k and then apply parallel summation to obtain

\displaystyle \frac{1}{\binom{n}{m}} \sum_{k=0}^m \binom{n-m+k}{k} = \frac{\binom{n+1}{m}}{\binom{n}{m}}.

We thus have the identity

\displaystyle \sum_{k=0}^m \frac{\binom{m}{k}}{\binom{n}{k}} = \frac{\binom{n+1}{m}}{\binom{n}{m}} = \frac{n+1}{n+1-m},

where the last step follows thanks to some algebra simplification with factorials.
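
The whole chain can be checked exactly with rational arithmetic.  A quick sketch (the function name is mine):

```python
from math import comb
from fractions import Fraction

def ratio_sum(m, n):
    """Evaluate sum_{k=0}^m C(m,k)/C(n,k) exactly, for m <= n."""
    return sum(Fraction(comb(m, k), comb(n, k)) for k in range(m + 1))

for n in range(1, 12):
    for m in range(n + 1):
        assert ratio_sum(m, n) == Fraction(n + 1, n + 1 - m)
print("identity verified for all m <= n <= 11")
```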

To evaluate the second sum we’ll need an identity that’s like an upper-index version of Vandermonde’s convolution: \sum_{k=-m}^{n} \binom{m+k}{r} \binom{n-k}{s} = \binom{m+n+1}{r+s+1}.  Otherwise, the derivation for the second sum proceeds just as the first one does up through the point where we replace k with m-k.  Continuing from there and applying the upper Vandermonde identity, we have

\displaystyle \sum_{k=0}^m \frac{\binom{m}{k}\binom{k}{r}}{\binom{n}{k}} = \frac{1}{\binom{n}{m}} \sum_{k=0}^m \binom{n-m+k}{k} \binom{m-k}{r} = \frac{1}{\binom{n}{m}} \sum_{k=0}^m \binom{n-m+k}{n-m} \binom{m-k}{r} = \frac{1}{\binom{n}{m}} \binom{n+1}{n-m+r+1}.

We can rewrite this and perform some more algebra simplification to get our generalization:

\displaystyle \sum_{k=0}^m \frac{\binom{m}{k}\binom{k}{r}}{\binom{n}{k}} = \frac{\binom{n+1}{m-r}}{\binom{n}{m}} = \frac{(n+1) \, m^{\underline{r}}}{(n+1-m+r)^{\underline{r+1}}}.
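
As with the first sum, the binomial-coefficient form of the generalization can be verified exactly (again the function name is mine):

```python
from math import comb
from fractions import Fraction

def gen_sum(m, n, r):
    """Evaluate sum_{k=0}^m C(m,k) C(k,r) / C(n,k) exactly, for r <= m <= n."""
    return sum(Fraction(comb(m, k) * comb(k, r), comb(n, k)) for k in range(m + 1))

for n in range(1, 11):
    for m in range(n + 1):
        for r in range(m + 1):
            assert gen_sum(m, n, r) == Fraction(comb(n + 1, m - r), comb(n, m))
print("generalization verified for all r <= m <= n <= 10")
```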


Posted in binomial coefficients

An Alternating Convolution Identity via Sign-Reversing Involutions

This month’s post is on the combinatorial proof technique of sign-reversing involutions.  It’s a really clever idea that can often be applied to identities that feature alternating sums.  We’ll illustrate the technique on the following identity.

\displaystyle \sum_{k=0}^m \binom{n}{k} \binom{n}{m-k} (-1)^k = (-1)^{m/2} \binom{n}{m/2} [ m \text{ is even}].

(Here, we use the Iverson bracket notation [A], where [A] is 1 if A is true and 0 if A is false.)

This identity is interesting because there’s a lot going on here.  For one, it’s an alternating version of the famous Vandermonde identity.  But what’s happening with the right side?  Why is it zero when m is odd, and why does the parity of m/2 matter?  The sign-reversing involution technique will help us make sense of all of that.

First, \binom{n}{k} \binom{n}{m-k} can be thought of as counting ordered pairs (A,B) of subsets of \{1, 2, \ldots, n\} with |A| = k and |B| = m-k.  The alternating sum on the left side of the identity is thus taken over all pairs (A,B) with |A| + |B| = m, where the sign of a particular pair (A,B) is determined by the parity of |A|.

We now have a combinatorial interpretation of the left side.  Now, let’s define the sign-reversing involution.  Given (A,B), let x denote the largest element in the symmetric difference A \oplus B of A and B; i.e., A \oplus B = (A-B) \cup (B-A).  Let

\displaystyle \phi(A,B) = \begin{cases} (A-\{x\}, B \cup \{x\}), & x \in A; \\ (A \cup \{x\}, B - \{x\}), & x \in B.\end{cases}

Note that \phi takes an ordered pair (A,B) such that A, B \subseteq \{1, 2, \ldots, n\} and |A| + |B| = m and maps it to another such ordered pair.  Moving the largest element x in the symmetric difference of A and B to the other set changes the number of elements in A by one and thus the parity of |A|.  This means \phi is sign-reversing.  In addition, \phi(A,B) has the same largest element x in its symmetric difference that (A,B) does; thus applying \phi to \phi(A,B) just sends x back to its original set.  Thus \phi(\phi(A,B)) = (A,B), which makes \phi an involution.

These two properties mean that \phi pairs up each ordered pair (A,B) of the kind we’re examining with another such ordered pair of opposite parity.  Each of these pairs cancels its partner in the alternating sum on the left side of the identity we’re trying to prove.  The left side therefore equals the sum of the signs of the pairs (A,B) on which \phi is not defined.  Let’s count those leftover pairs now.

The only way for \phi to be undefined is for the largest element x in A \oplus B not to exist.  But that can only happen if A-B = \emptyset = B-A; in other words, A = B.  In this case, though, since |A| + |B| = m, we have |A| = |B| = m/2.  Thus when m is odd, there are no pairs (A,B) on which \phi is undefined; hence the sum is 0 when m is odd.  When m is even, there are \binom{n}{m/2} ways to choose the elements of A, and since B = A, this choice also determines B.  Since each of these leftover pairs has |A| = m/2, each contributes the sign (-1)^{m/2} to the sum, which accounts for the right side of the identity.
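
It’s straightforward to confirm the identity numerically for small cases (the function name is mine; note that Python’s math.comb returns 0 when the lower index exceeds the upper one, which is exactly the convention we want):

```python
from math import comb

def alt_vandermonde(m, n):
    """Evaluate sum_{k=0}^m (-1)^k C(n,k) C(n,m-k)."""
    return sum((-1) ** k * comb(n, k) * comb(n, m - k) for k in range(m + 1))

for n in range(10):
    for m in range(2 * n + 1):
        expected = (-1) ** (m // 2) * comb(n, m // 2) if m % 2 == 0 else 0
        assert alt_vandermonde(m, n) == expected
print("alternating Vandermonde identity verified for n < 10")
```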

In general, this is how a sign-reversing involution argument works: Find a combinatorial interpretation of the terms in the left side of the alternating sum, then find a sign-reversing involution that maps entities with odd parity to entities with even parity and vice versa.  Counting the entities that are left over – i.e., those not paired up by the involution – will yield the right side of the identity.

For more sign-reversing involution arguments, see this post and this post.  Chapter 6 of Benjamin and Quinn’s Proofs that Really Count [1] and Chapter 5 of Aigner’s A Course in Enumeration [2] have examples as well.

(For what it’s worth, the identity can be proved more easily using generating functions, but I don’t think that approach gives as much insight into why the identity is true.)


  1. Arthur T. Benjamin and Jennifer J. Quinn.  Proofs that Really Count.  MAA, 2003.
  2. Martin Aigner.  A Course in Enumeration.  Springer, 2010.


Posted in binomial coefficients, combinatorics

A Mathematical Riddle

This past weekend I attended the wedding of a former student, Jake Linenthal.  On the dinner tables at the reception were sheets of paper containing a Mad Libs-style story of how Jake and his wife Abby got engaged, as well as a wedding riddle.  The riddle went something like this:

An evil king has captured the bride and groom for a wedding that is about to take place.  The king says that he will not let the wedding go on unless you can complete a task for him.  First, he blindfolds you.  Then he hands you a standard deck of 52 cards and tells you that 10 of the cards are face up and the rest are face down.  Finally, he says that you have to divide the 52 cards into two stacks such that each stack has the same number of face-up cards.  You cannot remove the blindfold, you cannot tell whether a card is face up or face down simply by touching it, and the cards are indestructible.

At first I could not see how it would be possible to complete such a task.  There simply didn’t seem to be enough to work with here.

Then, after realizing that the rules do allow you to turn cards over, and working through an example where you have a deck of four cards with two of them face up, I found the solution.

Solution: Divide the deck into a stack of 42 cards and a stack of 10 cards.  Then turn over every card in the 10-card stack.

Why does this work?  You know that 10 cards are face up.  Suppose x of the face-up cards are in the stack of 42 cards.  That means that the stack of 10 cards has 10-x cards that are face up and x cards that are face down.  Turning over every card in the 10-card stack then gives you 10-x cards that are face down and x cards that are face up.  With x face-up cards in the 42-card stack and x face-up cards in the 10-card stack, you’ve completed the king’s task and saved the wedding.

The form of the solution means that we can easily generalize the problem.  Suppose you have an n-card deck in which m cards are face up, and you have to divide all the cards into two stacks with the same number of face-up cards in each.  Then the solution is to (1) Divide the cards into a stack of n-m cards and a stack of m cards, and then (2) Turn over every card in the stack of m cards.
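
The argument above is easy to test by simulation.  Here’s a quick sketch (function name mine) that shuffles the deck blindly, performs the split-and-flip, and checks that the two stacks always match:

```python
import random

def split_and_flip(deck, m):
    """Split a shuffled deck into an m-card stack and the rest, then turn
    over every card in the m-card stack.  True means face up."""
    random.shuffle(deck)
    small, rest = deck[:m], deck[m:]
    small = [not card for card in small]   # flip every card in the small stack
    return small.count(True), rest.count(True)

random.seed(2024)
for _ in range(1000):
    deck = [True] * 10 + [False] * 42      # 10 face-up cards in a 52-card deck
    up_small, up_rest = split_and_flip(deck, 10)
    assert up_small == up_rest             # the two stacks always match
print("the trick works every time")
```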

Interestingly enough, with this solution you won’t know how many face-up cards are in each stack.  But, fortunately, the task doesn’t require that.

Posted in arithmetic, puzzles

Infinite Series Expressions for Pi, E, Phi, and Gamma

Recently I was trying to find infinite series expressions for some famous mathematical constants, and I thought I would record what I found here.  This post considers \pi, Euler’s number e, the golden ratio \Phi, and the Euler-Mascheroni constant \gamma.

(Of course, you can construct an infinite series for any of these constants by simply adding the next significant digit in the constant’s infinite decimal representation with each successive term in the series, but that’s kind of boring.)

First, \pi.  An infinite series representation for \pi starts with the geometric series formula \displaystyle \frac{1}{1-t} = 1 + t + t^2 + t^3 + \cdots, valid for |t| < 1.  If you replace t by -t^2, you get

\displaystyle \frac{1}{1+t^2} = 1 - t^2 + t^4 - t^6 \pm \cdots.

Integrating both sides of this expression from 0 to x yields

\displaystyle \arctan x = x - \frac{1}{3}x^3 + \frac{1}{5}x^5 - \frac{1}{7}x^7 \pm \cdots,

as long as the infinite series on the right converges (which it will when |x| \leq 1, thanks to the alternating series test).

Substituting 1 for x yields

\displaystyle \frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} \pm \cdots.

Then, multiplying both sides by 4 produces the Leibniz series for \pi:

\displaystyle \pi = \frac{4}{1} - \frac{4}{3} + \frac{4}{5} - \frac{4}{7} \pm \cdots.
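
Here’s a quick sketch of the Leibniz series in Python.  The convergence is famously slow: by the alternating series test, the error after n terms is bounded by the first omitted term, roughly 4/(2n).

```python
import math

def leibniz_pi(terms):
    """Partial sum 4/1 - 4/3 + 4/5 - 4/7 + ... with the given number of terms."""
    return sum((-1) ** k * 4 / (2 * k + 1) for k in range(terms))

# Even 100,000 terms only give about four correct decimal places.
assert abs(leibniz_pi(100000) - math.pi) < 1e-4
print(leibniz_pi(100000))
```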

Second, e.  We’ll take as our definition of e that e is the value of a such that \frac{d}{dx} a^x = a^x.  Constructing the Maclaurin series for e^x and substituting 1 for x will give us an infinite series expression for e.

Since our definition of e^x is such that \frac{d^n}{dx^n} (e^x) = e^x regardless of the value of n, the Maclaurin series for e^x is simply

\displaystyle e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots.

Substituting 1 for x gives us

\displaystyle e = \frac{1}{0!} + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \cdots.

(The fact that the Maclaurin series for e^x converges for any real value of x is an easy application of the ratio test, and the fact that the Maclaurin series for e^x actually converges to e^x follows from a slightly more involved application of the Lagrange Remainder Theorem.)
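
By contrast with the Leibniz series, the factorial denominators make this series converge extremely quickly, as a quick sketch shows:

```python
import math

def e_partial(terms):
    """Partial sum 1/0! + 1/1! + ... + 1/(terms-1)!."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term /= k + 1          # term is now 1/(k+1)!
    return total

# Eighteen terms already give full double precision.
assert abs(e_partial(18) - math.e) < 1e-12
print(e_partial(18))
```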

Third, \Phi.  The first two infinite series are well-known, but the one we give here for the golden ratio \Phi is (I believe) much less so.  It starts with a fairly well-known property of \Phi, namely, that the ratio of the (n+1)st Fibonacci number to the nth Fibonacci number approaches \Phi as n approaches infinity:

\displaystyle \Phi = \lim_{n \to \infty} \frac{F_{n+1}}{F_n}.

To obtain an infinite series expression for \Phi, then, we just need to represent \displaystyle \frac{F_{n+1}}{F_n} as the partial sum of an infinite series.  Well, this telescoping sum works, although it’s rather boring:

\displaystyle \frac{F_{n+1}}{F_n} = \frac{F_{n+1}}{F_n} - \frac{F_n}{F_{n-1}} + \frac{F_n}{F_{n-1}} \mp \cdots - \frac{F_2}{F_1} + \frac{F_2}{F_1}.

However, grouping terms differently gives us something that is perhaps interesting.  For example, we have

\displaystyle \frac{F_{n+1}}{F_n} - \frac{F_n}{F_{n-1}} = \frac{F_{n+1} F_{n-1} - F_n^2}{F_n F_{n-1}} = \frac{(-1)^n}{F_n F_{n-1}}, where the last step is due to Cassini’s identity.

Grouping terms this way leaves out \dfrac{F_2}{F_1} = 1, so we have

\displaystyle \frac{F_{n+1}}{F_n} = 1 + \frac{1}{F_2 F_1} - \frac{1}{F_3 F_2} + \frac{1}{F_4 F_3} \mp \cdots + \frac{(-1)^n}{F_n F_{n-1}}.

Taking the limit as n \to \infty then yields

\displaystyle \Phi = 1 + \frac{1}{F_1 F_2} - \frac{1}{F_2 F_3} + \frac{1}{F_3 F_4} \mp \cdots.

(I found this one from Thomas Andrews’s answer here on Math.SE.)
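
A short sketch confirms that the partial sums of this series converge to \Phi (the function name is mine; the partial sum after n terms is exactly F_{n+2}/F_{n+1}, so the error shrinks like 1/(F_{n+1} F_{n+2})):

```python
def phi_partial(terms):
    """Partial sum 1 + 1/(F_1 F_2) - 1/(F_2 F_3) + 1/(F_3 F_4) - ..."""
    total, sign = 1.0, 1
    a, b = 1, 1            # F_1, F_2
    for _ in range(terms):
        total += sign / (a * b)
        sign = -sign
        a, b = b, a + b
    return total

phi = (1 + 5 ** 0.5) / 2
assert abs(phi_partial(40) - phi) < 1e-12
print(phi_partial(40))
```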

Finally, \gamma.  Our infinite series expression for \gamma is derived in a manner similar to our series for \Phi: We’ll take a well-known expression for \gamma and turn it into an infinite series.

The usual definition of \gamma is as the limiting difference between the nth harmonic number \displaystyle H_n = \sum_{k=1}^n \frac{1}{k} and \ln n; that is, \gamma = \lim_{n \to \infty} (H_n - \ln n).  Our goal, then, is to turn -\ln n into the partial sum of an infinite series.  Here’s one such series, although (as in our example with \Phi) it’s rather boring:

\displaystyle -\ln n = -\ln n + \ln (n-1) - \ln (n-1) \pm \cdots + \ln 1 - \ln 1.

Thanks to properties of logarithms, though, this can be expressed more interestingly as

\displaystyle -\ln n = \ln \left( \frac{n-1}{n} \right) + \ln \left( \frac{n-2}{n-1} \right) + \cdots + \ln \left(\frac{1}{2} \right).

By interspersing the terms in this expression with the terms in H_n and taking the limit, we have the following infinite series expression for \gamma:

\displaystyle \gamma = 1 + \ln \left(\frac{1}{2}\right) + \frac{1}{2} + \ln \left(\frac{2}{3}\right) + \frac{1}{3} + \ln \left(\frac{3}{4}\right) + \frac{1}{4} + \cdots.
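
A quick sketch of the partial sums, which telescope back to H_n - \ln(n+1) and so converge to \gamma, with error on the order of 1/(2n):

```python
import math

def gamma_partial(n):
    """Partial sum 1/1 + ln(1/2) + 1/2 + ln(2/3) + ... + 1/n + ln(n/(n+1))."""
    return sum(1 / k + math.log(k / (k + 1)) for k in range(1, n + 1))

# gamma = 0.5772156649...; convergence is slow, like the Leibniz series.
assert abs(gamma_partial(200000) - 0.5772156649015329) < 1e-5
print(gamma_partial(200000))
```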

Posted in sequences and series

Minimum Number of Wagons to Connect Every City in Ticket to Ride – Europe

How many wagons would you need to create a route network in the board game Ticket to Ride – Europe so that every city is connected to every other city through the network?  If you could do this you could potentially earn every destination ticket in the game.

To answer this question, we first need to make a graph out of the Ticket to Ride – Europe map: represent each city by a vertex and each city-to-city route by an arc whose weight is the number of wagons it takes to claim that route.  The question then amounts to finding a minimum spanning tree that connects all the vertices in this graph.  The total weight of the minimum spanning tree will be the answer to our question.

Fortunately, there are some fast algorithms for finding minimum spanning trees, and some of them are simple enough that we can easily perform them by hand.  For this post we’ll use Prim’s algorithm, which involves the following steps:

  1. Start at any vertex to initialize the tree.
  2. Select a minimum-weight arc that is not in the tree but that connects a vertex in the tree with one not in the tree.  Add that arc and the new vertex to the tree.
  3. Repeat Step 2 until all vertices are in the tree.
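
The three steps above can be sketched in a few lines of Python.  This toy example uses just the six-city corner of the map discussed below, with the route lengths quoted in the text; a heap stands in for the by-hand search, and weight ties happen to be broken by city name:

```python
import heapq

def prim_mst_weight(graph, start):
    """graph: dict mapping city -> list of (wagons, neighbor) pairs."""
    visited = {start}
    frontier = list(graph[start])
    heapq.heapify(frontier)
    total = 0
    while frontier:
        w, city = heapq.heappop(frontier)   # minimum-weight arc out of the tree
        if city in visited:
            continue
        visited.add(city)
        total += w
        for edge in graph[city]:
            heapq.heappush(frontier, edge)
    return total

routes = [("Edinburgh", "London", 4), ("London", "Amsterdam", 2),
          ("London", "Dieppe", 2), ("Amsterdam", "Bruxelles", 1),
          ("Bruxelles", "Dieppe", 2), ("Bruxelles", "Paris", 2),
          ("Dieppe", "Paris", 1)]
graph = {}
for a, b, w in routes:
    graph.setdefault(a, []).append((w, b))
    graph.setdefault(b, []).append((w, a))

print(prim_mst_weight(graph, "Edinburgh"))   # 4 + 2 + 1 + 2 + 1 = 10
```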

Edinburgh is a logical city in which to start, since from there the only choice of arc is the 4-wagon route to London.  Add London and the route from Edinburgh to London to our tree.

Now, there are two minimum-weight arcs out of Edinburgh or London to new cities.  These are the two-wagon routes from London to Amsterdam and from London to Dieppe.  It actually doesn’t matter which one we pick; the proof of correctness for Prim’s algorithm tells us that either would lead to a minimum spanning tree, even if we don’t eventually choose the other route for our tree.  Let’s take alphabetical order as the tiebreaker and choose the London to Amsterdam route for two wagons.

In the next iteration the shortest route available to us isn’t the two-wagon route from London to Dieppe; it’s the one-wagon route from Amsterdam to Bruxelles.  So we must take that one.

Next iteration: There are multiple two-wagon routes available now: London to Dieppe, Bruxelles to Dieppe, Bruxelles to Paris, Bruxelles to Frankfurt, and Amsterdam to Frankfurt.  Alphabetical order for the new city gives us Dieppe, and alphabetical order again would select Bruxelles over London as the city to connect Dieppe to.  (Again, any of these choices would give us a minimum spanning tree at the end, so all we need is some tiebreaker criterion.)

Next choice is Dieppe to Paris for one wagon.

In the next iteration there are multiple two-wagon routes available again: Dieppe to Brest, Bruxelles to Frankfurt, and Amsterdam to Frankfurt.  We do not include the London-to-Dieppe route or the Bruxelles-to-Paris route because we only consider arcs from cities already in our tree to new cities.  Alphabetical order as the tiebreaker criterion means we select Dieppe to Brest as our new arc.

From here, I’ll just list the remaining arcs that are added, in order of selection (again using our alphabetical order tiebreaker).  For completeness I’ll include the arcs and vertices we’ve already selected.

  1. Edinburgh (start)
  2. Edinburgh to London (4 wagons)
  3. London to Amsterdam (2)
  4. Amsterdam to Bruxelles (1)
  5. Bruxelles to Dieppe (2)
  6. Dieppe to Paris (1)
  7. Dieppe to Brest (2)
  8. Amsterdam to Frankfurt (2)
  9. Frankfurt to Essen (2)
  10. Essen to Berlin (2)
  11. Frankfurt to Munchen (2)
  12. Munchen to Venezia (2)
  13. Venezia to Roma (2)
  14. Roma to Brindisi (2)
  15. Venezia to Zagrab (2)
  16. Zagrab to Budapest (2)
  17. Budapest to Wien (1)
  18. Munchen to Zurich (2)
  19. Zurich to Marseille (2)
  20. Essen to Kobenhavn (3)
  21. Brindisi to Palermo (3)
  22. Budapest to Sarajevo (3)
  23. Sarajevo to Sofia (2)
  24. Sofia to Bucuresti (2)
  25. Sofia to Athina (3)
  26. Athina to Smyrna (2)
  27. Smyrna to Constantinople (2)
  28. Constantinople to Angora (2)
  29. Angora to Erzurum (3)
  30. Erzurum to Sochi (3)
  31. Sochi to Rostov (2)
  32. Rostov to Kharkov (2)
  33. Sochi to Sevastopol (2)
  34. Kobenhavn to Stockholm (3)
  35. Marseille to Barcelona (4)
  36. Barcelona to Madrid (2)
  37. Barcelona to Pamplona (2)
  38. Madrid to Cadiz (3)
  39. Cadiz to Lisbon (2)
  40. Berlin to Danzig (4)
  41. Danzig to Warszawa (2)
  42. Danzig to Riga (3)
  43. Warszawa to Wilno (3)
  44. Wilno to Kyiv (2)
  45. Kyiv to Smolensk (3)
  46. Smolensk to Moskva (2)
  47. Moskva to Petrograd (4)
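
As a quick arithmetic check on the list above (the wagon counts are copied from items 2 through 47):

```python
# Wagon counts for the 46 routes in the spanning tree listed above.
wagons = [4, 2, 1, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 3, 3, 3, 2, 2,
          3, 2, 2, 2, 3, 3, 2, 2, 2, 3, 4, 2, 2, 3, 2, 4, 2, 3, 3, 2, 3, 2, 4]
assert len(wagons) == 46          # 47 cities means 46 tree edges
print(sum(wagons))                # 108
```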

This gives us a total of 108 required wagons.  Since you only have 45 wagons to play in a game of Ticket to Ride – Europe, you can’t get anywhere close to connecting all of the cities in this game.  Even two complete sets (i.e., 90 wagons) wouldn’t be enough.  (We haven’t considered stations in this analysis, but even if you play all three of your stations to substitute for three of the four-wagon routes in the minimum spanning tree, you still need 96 wagons to complete the rest of the tree.)

So, even if it were possible in the game to hold all the destination tickets, you cannot actually earn them all.  (That should not surprise anyone who has played Ticket to Ride – Europe, I imagine.)  What may be a bit more surprising is what our answer of 108 implies about just how far from possible it is to earn every destination ticket.

A couple of fun observations: (1) At one point Zagrab actually wins the alphabetical order criterion!  This is because the alternative choice is Zurich.  (2)  For quite a long time after selecting Edinburgh to London we can avoid choosing a route that requires more than two wagons.  This is because many of the cities in northwestern and central Europe on the map are connected by only one- or two-wagon routes.

Posted in combinatorial optimization, games, graph theory

The Max Flow-Min Cut Theorem via Linear Programming Duality

The Max Flow-Min Cut Theorem states that the maximum flow from the source node to the sink node through a capacitated network is equal to the capacity of the minimum cut that separates that source node from the sink node.  In this post we’ll show that this result, which can be proved using only network concepts, also follows from the strong duality theorem for linear programming.

Here’s a linear programming formulation of the maximum flow problem.

\displaystyle    \begin{aligned} \text{Maximize } f & \\ \text{subject to } &\sum_{j:(i,j) \in A} x_{ij} - \sum_{k:(k,i) \in A} x_{ki} = \begin{cases} f, & i = s; \\ -f, & i = t; \\ 0, &\text{otherwise;} \end{cases} \text{ for all } i \in N; \\ &0 \leq x_{ij} \leq u_{ij}, \text{ for all } (i,j) \in A. \end{aligned}

Here, f is the total amount of flow, x_{ij} is the amount of flow sent along arc (i,j), A is the set of arcs, N is the set of nodes, s is the (single) source, t is the (single) sink, and u_{ij} is the capacity (upper bound) of arc (i,j).

The constraints involving the summations are the flow-balance constraints.  They enforce the idea that the total amount of flow into a node must equal the total amount of flow out of that node.  The two exceptions are the source and the sink, which emit and absorb, respectively, all the flow through the network.

The dual of this problem is

\displaystyle    \begin{aligned} \text{Minimize } &\sum_{(i,j) \in A} u_{ij} z_{ij} \\ \text{subject to } &y_i - y_j + z_{ij} \geq 0, \:\: \text{ for all } (i,j) \in A; \\ &-y_s + y_t = 1; \\ &z_{ij} \geq 0, \text{ for all } (i,j) \in A. \end{aligned}

Here, z_{ij} is the dual variable associated with the upper bound constraint on arc (i,j), and y_i is the dual variable associated with the flow-balance constraint on node i.

Next, let’s take a look at why the dual problem corresponds to the problem of finding a minimum cut.  In a search for an optimal solution we will want to set z_{ij} = 0 as much as possible.  This means that we will want to set y_i = y_j for as many arcs (i,j) as possible. (*)  However, we cannot set y_i = y_j for all arcs (i,j), as the requirement -y_s + y_t = 1 means y_t will have a value one unit larger than y_s.  So in a search for an optimal solution to the dual we can consider only those cases in which we break the node set into two pieces: Those nodes i satisfying y_i = y_s and those nodes j satisfying y_j = y_t.  Thus this set of cases in which the optimal solution must be found corresponds to the set of s-t cuts that place the source and sink in different subsets.

Does this set of cases leave out any s-t cuts, though?  No.  To see this, note that every cut (S,\bar{S}) with s \in S, t \in \bar{S} corresponds to a feasible solution to the dual in the following manner.  Let y_s = 0, y_t = 1.  Then let y_i = \begin{cases} 0, & i \in S; \\ 1, & i \in \bar{S}. \end{cases}

Finally, let z_{ij} = \begin{cases} 1, & (i,j) \in (S, \bar{S}); \\ 0, & (i,j) \not\in (S, \bar{S}). \end{cases}

This assignment satisfies the dual problem, and its corresponding objective function value is exactly the capacity of the cut (S, \bar{S}).

Since the assignment of zero flow through the network is feasible for the original problem, both the original problem and its dual are feasible.  By the strong duality theorem, then, the original problem and its dual have optimal solutions whose objective function values are equal.  Therefore, the maximum flow through the network must equal the capacity of the minimum cut that separates the source node s from the sink node t.
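
To see the theorem in action on a small example, here is a sketch that computes a maximum flow with the Edmonds-Karp algorithm (breadth-first augmenting paths) rather than by solving the LP, and then checks the flow value against the capacity of the cut formed by the nodes still reachable from s in the final residual network:

```python
from collections import defaultdict, deque

def max_flow_min_cut(capacity, s, t):
    """Edmonds-Karp max flow on capacity dict {(i, j): u_ij}.
    Returns (flow value, source side S of a minimum cut)."""
    residual = defaultdict(int)
    adj = defaultdict(set)
    for (i, j), u in capacity.items():
        residual[(i, j)] += u
        adj[i].add(j)
        adj[j].add(i)          # reverse arcs carry residual capacity
    value = 0
    while True:
        parent = {s: None}     # BFS for an augmenting path
        queue = deque([s])
        while queue:
            i = queue.popleft()
            for j in adj[i]:
                if j not in parent and residual[(i, j)] > 0:
                    parent[j] = i
                    queue.append(j)
        if t not in parent:
            return value, set(parent)   # nodes reachable from s form a min cut
        path, j = [], t
        while parent[j] is not None:    # recover the path, find its bottleneck
            path.append((parent[j], j))
            j = parent[j]
        push = min(residual[e] for e in path)
        for i, j in path:               # push flow along the path
            residual[(i, j)] -= push
            residual[(j, i)] += push
        value += push

# A toy network: s -> {a, b} -> t with a cross arc a -> b.
cap = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
value, S = max_flow_min_cut(cap, "s", "t")
cut_capacity = sum(u for (i, j), u in cap.items() if i in S and j not in S)
assert value == cut_capacity
print(value)   # 5
```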


(*)  Well, the constraint actually just requires y_i \geq y_j, but we know from complementary slackness that if there’s positive flow on an arc (i,j) in the optimal solution to the maximum flow problem then the corresponding dual constraint  y_i - y_j + z_{ij} \geq 0 must be satisfied at equality.  So for the dual constraints that actually constrain the solution and thus actually matter we’ll have y_i - y_j + z_{ij} = 0 in the dual optimal.

Posted in linear programming, optimization