## A Combinatorial Proof for the Power Sum

Probably the most common formula for the power sum $\displaystyle \sum_{k=0}^n k^m$ is the one involving binomial coefficients and Bernoulli numbers $B_k$, sometimes called Faulhaber’s formula:

$\displaystyle \sum_{k=0}^n k^m = \frac{1}{m+1} \sum_{k=0}^m \binom{m+1}{k} B_k n^{m+1-k}.$

Historically, this, or a variant of it, was the first general formula for the power sum.

There are other ways to express the power sum, though.  The subject of this post is to give a combinatorial proof for the following one involving Stirling numbers:

$\displaystyle \sum_{k=0}^n k^m = \sum_{k=1}^m \binom{n+1}{k+1} \left\{ m \atop k \right\} k! .$

Proof: How many functions f are there from the set $\{1, 2, \ldots, m+1 \}$ to the set $\{1, 2, \ldots, n+1\}$ such that $f(1) > f(j)$ for any $j \in \{2, \ldots, m+1\}$?

Answer 1: Condition on the value of $f(1)$. If $f(1) = k+1$, then there are k choices for $f(2)$, k choices for $f(3)$, etc., for a total of $k^m$ functions. Summing over all possible values of k gives the left side.

Answer 2: Condition on the number of elements in the range of f. (This is not the codomain, which is $\{1, 2, \ldots, n+1\}$, but the subset consisting of elements from the codomain that have something from $\{1, 2, \ldots, m+1\}$ mapped to them.  The range may be the entire codomain or a proper subset of it.) If the range of f has $k+1$ elements, there are $\binom{n+1}{k+1}$ ways to choose exactly which elements will comprise the range. The largest of these must be $f(1)$, and the remaining m elements in the domain must be mapped to the other k elements in the range. There are $\left\{ m \atop k \right\}$ ways to assign the remaining m elements to k nonempty groups, and then k! ways to assign these groups to the actual k remaining elements in the range. (More generally, the number of onto functions from an m-set to a k-set is $\left\{ m \atop k \right\} k!$.) Summing over all possible values of k gives the right side.

(I first gave this proof as an answer here on Math.SE.)

## Yet Another Nice Proof of the Derangements Formula

A couple of posts from a few years back give proofs of the formula for $D_n$, the number of derangements on n elements.  (A derangement is a permutation with no fixed points.)  Both proofs avoid the use of the principle of inclusion-exclusion, which is the standard method of proving the derangements formula.

This post contains another proof of the derangements formula, and I think this one is my favorite of the three.  It starts with a proof of the following identity:

$\displaystyle n! = \sum_{k=0}^n \binom{n}{k} D_{n-k} = \sum_{k=0}^n \binom{n}{k} D_k.$

Proof.  Both sides of the identity count the number of permutations on n elements.  The left side is straightforward.  For the middle formula, condition on the number of fixed points in the permutation.  To count the number of permutations with k fixed points, first choose which k of the n elements are to be mapped to themselves.  This can be done in $\binom{n}{k}$ ways.  The remaining n-k elements cannot be mapped to themselves; i.e., they must be deranged.  There are $D_{n-k}$ ways to do this.  Multiply these together and sum over all possible values of k to get the middle formula.  The right side just reverses the summation on the middle formula (i.e., replacing k with n-k), while using the binomial coefficient symmetry property $\binom{n}{k} = \binom{n}{n-k}$.

Next, apply the binomial inversion formula.  This comes in different forms.  The one we will use says that for sequences $f(n)$ and $g(n)$ we have

$\displaystyle f(n) = \sum_{k=0}^n \binom{n}{k} g(k) \iff g(n) = \sum_{k=0}^n \binom{n}{k} f(k) (-1)^{n-k}.$

Binomial inversion can be proved multiple ways.  My favorite uses exponential generating functions, but there are more direct proofs, too (see, for example, p. 193 of Concrete Mathematics [1]).

Applying the binomial inversion formula to Equation (1), with $f(n) = n!$ and $g(k) = D_k$, we get

$\displaystyle D_n = \sum_{k=0}^n \binom{n}{k} k! (-1)^{n-k} = \sum_{k=0}^n \binom{n}{k} (n-k)! (-1)^k = n! \sum_{k=0}^n \frac{(-1)^k}{k!},$

where the second step reverses the summation.

References

1.  Ronald L. Graham, Donald E. Knuth, and Oren Patashnik.  Concrete Mathematics (2nd ed.), Addison-Wesley, 1994.

## Solving Two-Term Combinatorial Recurrence Relations, Part II

A few years ago I published a paper [2] giving a partial answer to the following question in Graham, Knuth, and Patashnik’s Concrete Mathematics (Problem 6.92):

Develop a general theory of the solutions to the two-parameter recurrence

$\displaystyle \left| n \atop k \right| = (\alpha n + \beta k + \gamma) \left| n-1 \atop k \right| + (\alpha' n + \beta' k + \gamma') \left| n-1 \atop k-1 \right| + [n=k=0].$

A post from a few years ago discusses the results in my paper.  I also asked a question on Math Overflow (my first MO question!) asking for other known results.

Well, a recent paper by Barbero, Salas, and Villaseñor, “Bivariate Generating Functions for a Class of Linear Recurrences: General Structure” [1], gives a complete solution to this recurrence in the form of exponential generating functions.  I’m not going to try to summarize their paper here, as it’s too technical, although I will mention that they do have to break the recurrence into cases to obtain their results.  (Given my earlier work on the problem this does not surprise me.)  If you’re interested, take a look.  They do some very nice work to get their results.

References

1.  Barbero, Salas, and Villaseñor, “Bivariate Generating Functions for a Class of Linear Recurrences: General Structure,” Journal of Combinatorial Theory, Series A, 125 (2014) 146-165.

2.  Spivey, Michael Z.,  “On Solutions to a General Combinatorial Recurrence,” Journal of Integer Sequences, 14 (9): Article 11.9.7, 2011.

Posted in combinatorics, recurrence relations | 1 Comment

## A Probabilistic Proof of a Binomial Coefficient Identity

I’m a big fan of combinatorial proofs, in which you show that both sides of an identity count the same entity in two different ways.  Probabilistic proofs – in which you show that both sides of an identity express the probability that an event occurs in two different ways – are a close relative of combinatorial proofs.  In fact, many probabilistic proofs can be translated directly into combinatorial proofs by clearing denominators.  Sometimes, though, the probabilistic proof is more natural than the corresponding combinatorial one.  Today’s post is one of these.

One of the basic binomial coefficient identities that involves a fraction is the following:

$\displaystyle \sum_{k=0}^n \binom{n}{k} \frac{(-1)^k}{k+1} = \frac{1}{n+1}$.

Interpreting this probabilistically, the left-hand side looks like it uses inclusion-exclusion, and the right looks like the probability of selecting one special item out of n+1 choices. With that in mind, we can further interpret the left-hand side as having something to do with selecting items out of k+1 choices, as k varies from 0 to n. Probabilists like balls and jars, so let’s construct a scenario in those terms.

Suppose there are balls numbered 1 through n placed in a jar along with one black ball. We select the balls, one-by-one, and remove them from the jar. What is the probability that the black ball is the first ball chosen? It’s pretty clear that this is $\frac{1}{n+1}$, the right-hand side of our identity.

To use inclusion-exclusion for the left-hand side we have to interpret “the black ball is the first ball chosen” as the union or intersection of a collection of other events. Moreover, we have to do this in such a way as to work in the binomial coefficient and unit fraction $\frac{1}{k+1}$ as part of counting up the probabilities when these events are intersected. From the intersection standpoint, the black ball does have to come before ball 1 and before ball 2 and before ball 3 and so forth. So let’s let $A_i$, $1 \leq i \leq n$, be the event that the black ball is drawn before ball i and use the intersection version of inclusion-exclusion to find an expression for $P\left( \bigcap_{i=1}^n A_i \right)$. We have that $P(\bar{A}_i) = \frac{1}{2}$ for any i. Similarly, $P(\bar{A}_i \cap \bar{A}_j) = \frac{1}{3}$ because the probability that the black ball does not come before ball i and does not come before ball j is the probability that it comes last of those three, which is $\frac{1}{3}$. In general, the probability that the black ball comes as the last of any k+1 specific balls is $\frac{1}{k+1}$, and there are $\binom{n}{k}$ collections of $k+1$ specific balls that include the black ball. By inclusion-exclusion, then, the probability that the black ball is the first ball chosen is given by

$\displaystyle 1 - \sum_{k=1}^n \binom{n}{k} \frac{(-1)^{k+1}}{k+1} = \sum_{k=0}^n \binom{n}{k} \frac{(-1)^k}{k+1}.$

Thus both sides of the identity represent the probability that the black ball is the first ball chosen out of n+1 specific balls.

(The are, of course, other proofs of this identity.  Two of the quicker ones are to use the absorption identity for binomial coefficients, as in my post here, and integrating the function $(1-x)^n$ in two different ways, as in my post here.)

## Euler Sums, Part II: A Symmetry Formula

A post from a few months ago gave a proof that $\displaystyle \sum_{x=1}^\infty \sum_{y = 1}^x \frac{1}{x^2 y} = 2 \zeta(3).$  In today’s post I’d like to prove a general symmetry formula for Euler sums like this one.  Define

$\displaystyle \zeta_N(a) = \sum_{x=1}^N \frac{1}{x^a},$

$\displaystyle \zeta(a) = \sum_{x=1}^{\infty} \frac{1}{x^a},$

$\displaystyle \zeta_N(a,b) = \sum_{x=1}^N \sum_{y=1}^{x-1} \frac{1}{x^a y^b},$ and

$\displaystyle \zeta(a,b) = \sum_{x=1}^{\infty} \sum_{y=1}^{x-1} \frac{1}{x^a y^b}.$

(Notice the upper index on the second sum is $x-1$, not $x$ like we used in the previous post.)  The formula we will be proving is

$\displaystyle \zeta(a,b) + \zeta(b,a) = \zeta(a) \zeta(b) - \zeta(a+b)$.

We will show this by proving a more precise result:

$\displaystyle \zeta_N(a,b) + \zeta_N(b,a) = \zeta_N(a) \zeta_N(b) - \zeta_N(a+b).$

In other words, the symmetry result is true in both the finite and infinite versions; there’s no error term when you truncate the sum at a fixed N.

Here we go:

$\displaystyle \zeta_N(a,b) + \zeta_N(b,a) = \sum_{x=1}^N \sum_{y=1}^{x-1} \frac{1}{x^a y^b} + \sum_{x=1}^N \sum_{y=1}^{x-1} \frac{1}{x^b y^a}$

$= \displaystyle \sum_{y=1}^N \sum_{x=y+1}^N \frac{1}{x^a y^b} + \sum_{x=1}^N \sum_{y=1}^{x-1} \frac{1}{x^b y^a}$, by swapping the order of summation on the first sum

$= \displaystyle \sum_{x=1}^N \sum_{y=x+1}^N \frac{1}{x^b y^a} + \sum_{x=1}^N \sum_{y=1}^{x-1} \frac{1}{x^b y^a}$, by relabeling variables on the first sum

$\displaystyle = \sum_{x=1}^N \sum_{y=x}^N \frac{1}{x^b y^a} - \sum_{x=1}^N \frac{1}{x^{a+b}} + \sum_{x=1}^N \sum_{y=1}^{x-1} \frac{1}{x^b y^a},$

$= \displaystyle \sum_{x=1}^N \sum_{y=1}^N \frac{1}{x^b y^a} - \zeta_N(a+b)$

$= \displaystyle \zeta_N(a) \zeta_N(b) - \zeta_N(a+b).$

Taking the limit as $N \to \infty$, we get

$\displaystyle \zeta(a,b) + \zeta(b,a) = \zeta(a) \zeta(b) - \zeta(a+b)$.

(This argument also appears in my post on the evaluation of a triple Euler sum on math.SE.)

## A Math Error in the New Yorker

I normally use this blog to talk about mathematical questions that interest me.  However, I saw a math error in the New Yorker yesterday that I think is worth commenting on.

James Surowiecki has an article (The Mobility Myth) arguing that income mobility (the ability of groups to change their economic status) in the United States isn’t very high.  At one point he makes the following statement: “The middle class isn’t all that mobile, either: only twenty per cent of people born into the middle quintile ever make it into the top one.”

This sounds pretty bad at first glance: Twenty percent isn’t that large of a percentage.  However, the top quintile is, by definition, the top one-fifth; i.e., the top twenty percent.  If the United States were to have perfect income mobility (in the sense that a parent’s economic status has absolutely no effect on that of their children) then we would see exactly twenty percent of children born into the middle class end up in the top quintile (as well as twenty percent in the bottom quintile and twenty percent in each of the other three quintiles).  In other words, the argument that “twenty percent of people born into the middle quintile make it into the top one” is actually an argument in favor of high income mobility in the United States, not against it!

Surowiecki’s misunderstanding about what quintiles actually mean causes some of his other claims not to be as drastic as they sound.  For example, “fewer than ten per cent [of the bottom quintile] get into the top quintile,” while still not that great, isn’t quite so bad once you realize that twenty percent is the goal.  Similarly with “In San Jose, just thirteen per cent of people in the bottom quintile make it to the top.”  (However, I should point out that, given the number of people we’re talking about here, the difference between twenty percent and thirteen percent is quite large in absolute terms.)

I’m not going to comment on the merits of the rest of Surowiecki’s column, as to do so would be to step too far from my area of expertise.  I will add, though, that his data appear to come from the study “Where is the Land of Opportunity?: Intergenerational Mobility in the United States,” by Chetty, Hendren, Kline, and Saez.  This document contains of wealth of information to interest data guys like me.  See especially Table III and Online Appendix Table IV, which appear to be the sources of some of the numbers Surowiecki cites.

Posted in journalism, statistics | 2 Comments

## The Maximum Value of Spearman’s Footrule Distance

Given a permutation $\sigma$ on n elements, Spearman’s Footrule Distance $F(\sigma)$ is the sum of the absolute differences between i and $\sigma(i)$ over all values of i:

$\displaystyle F(\sigma) = \sum_{i=1}^n |\sigma(i) - i|.$

Spearman’s Footrule Distance can be thought of as a measure of the disarray of a permutation.  Another such measure is Kendall’s Tau, which uses the sum of squared deviations rather than the sum of absolute deviations.

In this post we prove that $\displaystyle \max_{\sigma} F(\sigma) = \begin{cases} 2m^2, & n = 2m; \\ 2m^2+2m, & n = 2m+1; \end{cases}$
where the maximum is taken over all permutations $\sigma$ on n elements.  The argument is that given by user4140 in this post on Math.SE with some details added.

First, let’s look at the case where n is even.  Let $n = 2m$.  Let $\sigma$ be any permutation with property P; namely, that $\sigma(i) > m$ if $i \leq m$ and $\sigma(i) \leq m$ if $i > m$.  Then

$\displaystyle F(\sigma) = \sum_{i=1}^{2m} |\sigma(i) - i| = \sum_{i=1}^m (\sigma(i) - i) + \sum_{i=m+1}^{2m} (i - \sigma(i))$.

Now, in the first sum $\sigma(i)$ ranges over the values m+1 through 2m, hitting each exactly once.  In the second sum $\sigma(i)$ does the same with the values 1 through m.  Therefore,

$\displaystyle F(\sigma) = 2 \sum_{i=m+1}^{2m} i - 2 \sum_{i=1}^m i = 2m(2m+1) - m(m+1) - m(m+1)$
$\displaystyle = 2m(2m+1) - 2m(m+1) = 2m(2m+1-m-1) = 2m^2.$

Now, suppose we have a permutation $\sigma$ that does not have property P.  Suppose, without loss of generality, that $\sigma(i) \leq m$ for some $i \leq m$.  Then, by the pigeonhole principle, there exists some $j > m$ such that $\sigma(j) > m$.  Swap $\sigma(i)$ and $\sigma(j)$ to create a new permutation $\sigma'$.  Place the four points $i, \sigma(i), j,\sigma(j)$ on a number line according to their numerical values.  Let A, B, and C denote, respectively, the distances between consecutive points.  Then $|\sigma(i) - i| + |\sigma(j) - j| = A+C$.  Regardless of the order of $i, \sigma(i), j$, and $\sigma(j)$, we have $|\sigma'(i) - i| + |\sigma'(j) - j| = A + C + 2B$.  This is because the A to C distances must both be covered under the $\sigma'$ permutation and the B distance must be covered twice.

Therefore, any permutation that does not have property P cannot be a permutation that achieves the maximum value of Spearman’s Footrule Distance.  Since those permutations $\sigma$ that do have property P have $F(\sigma) = 2m^2$, this must be the maximum value of Spearman’s Footrule Distance for permutations satisfying $n = 2m$.

For $n = 2m+1$ the argument is similar.  We need property Q; namely, that $\sigma(i) > m+1$ if $i \leq m$, $\sigma(m+1) = m+1$, and $\sigma(i) \leq m$ if $i > m+1$.  Then

$\displaystyle F(\sigma) = \sum_{i=1}^{2m+1} |\sigma(i) - i| = \sum_{i=1}^m (\sigma(i) -i) + \sum_{i=m+2}^{2m+1} (i - \sigma(i))$
$\displaystyle = (2m+1)(2m+2) - (m+1)(m+2) - m(m+1) = 2m^2+2m.$

The argument for permutations $\sigma$ that do not satisfy property Q is similar.  This proves the result.

For more information on Spearman’s Footrule Distance, see Diaconis and Graham, “Spearman’s Footrule as a Measure of Disarray,” Journal of the Royal Statistical Society, Series B, Vol. 39, No. 2 (1977) , pp. 262-268.