A Probabilistic Proof of a Binomial Coefficient Identity, Generalized

In a post from a couple of years ago I gave a probabilistic proof of the binomial coefficient identity

\displaystyle \sum_{k=0}^n \binom{n}{k} \frac{(-1)^k}{k+1} = \frac{1}{n+1}.

In this post I modify and generalize this proof to establish the identity

\displaystyle (1) \:\:\:\:\:\: \sum_{k=0}^n \frac{(-1)^k}{m+k} = \frac{(m-1)! \, n!}{(m+n)!}.

As in the original proof, we use a balls-and-jars interpretation.  Suppose we have a jar that contains a black ball, n numbered blue balls, and m-1 numbered red balls.  Suppose we choose the balls one-by-one from the jar.  Let A_i be the event that the black ball is chosen after blue ball i.  Let B be the event that the black ball is chosen before all the red balls.  We argue that Identity (1) gives in two different ways the probability of B \cap \bigcap_{i=1}^n A_i, the event that all the blue balls are chosen, then the black ball, and then all the red balls.

The right side of Identity (1) is easier to explain.  There are (m+n)! ways to arrange all m+n balls.  There are (m-1)! ways to order the blue balls, one way to order the black ball, and n! ways to order the red balls.  This gives \displaystyle \frac{(m-1)! \, n!}{(m+n)!} as the probability that all the blue balls are selected, then the black ball, and then all of the red balls.

For the left side of Identity (1) we need the principle of inclusion/exclusion.  A slight generalization of one version of it says that

\displaystyle P \left(B \cap \bigcap_{i=1}^n A_i \right) = P(B) - \sum_{i=1}^n P(B \cap \bar{A}_i) + \sum_{1 \leq i < j \leq n} P(B \cap \bar{A}_i \cap \bar{A}_j) \\ - \sum_{1 \leq i < j < k \leq n} P(B \cap \bar{A}_i \cap \bar{A}_j \cap \bar{A}_k) + \cdots + (-1)^n P(B \cap \bar{A}_1 \cap \cdots \cap \bar{A}_n).

The event \bar{A}_i is the event that the black ball is chosen before blue ball i.  Thus, the intersection of any k of these \bar{A}_i‘s with B is the event that the black ball is chosen first out of m+k specific balls: the black one, the m-1 red ones, and k of the blue ones.  This has probability 1/(m+k).  There are \binom{n}{k} ways to select k of the \bar{A}_i‘s.  By this generalized inclusion/exclusion principle, then, the probability that the black ball is chosen after the blue balls but before any of the red balls is also given by

\displaystyle \sum_{k=0}^n \binom{n}{k} \frac{(-1)^k}{m+k}.

This argument is taken from my paper [1], where I give a probabilistic proof of the binomial inverse of Identity (1), as well as proofs of generalizations of both (1) and its binomial inverse.


  1.  Michael Z. Spivey, “Probabilistic proofs of a binomial identity, its inverse, and generalizations.” American Mathematical Monthly, 123 (2): 175-180, 2016.
Posted in binomial coefficients, probability | Leave a comment

A Proof of Dobinski’s Formula for Bell Numbers

Dobinski’s formula entails the following infinite series expression for the nth Bell number \varpi_n:

\displaystyle \varpi_n = \frac{1}{e} \sum_{k=0}^{\infty} \frac{k^n}{k!}.

In this post we’ll work through a proof of Dobinski’s formula.  We’ll need four formulas:

  1. The Maclaurin series for e^x: \displaystyle e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}.
  2. The formula for converting normal powers to falling factorial powers: \displaystyle x^n = \sum_{k=0}^n \left\{ n \atop k \right\} x^{\underline{k}}, where \left\{ n \atop k \right\} is a Stirling number of the second kind.
  3. The formula relating Bell numbers and Stirling numbers of the second kind: \displaystyle \varpi_n = \sum_{k=0}^n \left\{ n \atop k \right\}.  (This follows directly from the combinatorial definitions of the Bell numbers and the Stirling numbers of the second kind.)
  4. The representation of falling factorial powers as factorials: \displaystyle k^{\underline{m}} = \frac{k!}{(k-m)!}, valid when k \geq m and k, m \in \mathbb{N}.

Here we go… Starting with the infinite series expression \displaystyle \sum_{k=0}^{\infty} \frac{k^n}{k!}, convert to falling factorial powers to get

\displaystyle \sum_{k=0}^{\infty} \frac{k^n}{k!} = \sum_{k=0}^{\infty} \frac{1}{k!} \sum_{j=0}^n \left\{n \atop j \right\} k^{\underline{j}} = \sum_{j=0}^n \left\{ n \atop j \right\} \sum_{k=0}^{\infty} \frac{k^{\underline{j}}}{k!} = \sum_{j=0}^n \left\{n \atop j \right\} \sum_{k=j}^{\infty} \frac{k^{\underline{j}}}{k!} = \sum_{j=0}^n \left\{ n \atop j \right\} \sum_{k=j}^{\infty} \frac{1}{(k-j)!} = \sum_{j=0}^n \left\{ n \atop j \right\} \sum_{i=0}^{\infty} \frac{1}{i!} = \sum_{j=0}^n \left\{ n \atop j \right\} e.

Thus \displaystyle \frac{1}{e} \sum_{k=0}^{\infty} \frac{k^n}{k!} = \frac{1}{e} \sum_{j=0}^n \left\{ n \atop j \right\} e = \varpi_n.

(I made a similar argument in this Math.SE question.)

Posted in Bell numbers, sequences and series, Stirling numbers | Leave a comment

Operations Research Makes List of Top 5 STEM Professions Employing Women

My February 2016 issue of ORMS Today says that “operations research analyst” is the STEM (Science, Technology, Engineering, and Mathematics) profession with the third-highest percentage of women.  The percentage of OR analysts who are women is 55.4%.  I’m glad to see that, and I’m not that surprised, given what I’ve seen in the profession.

The sentence that caught my eye, though, was this one: “This field only has 55.4 percent female workers, but that is still a considerable amount when looking at women in STEM.”  Only 55.4%?  What I think they mean is that for the STEM profession with the third-highest percentage of women, that’s a relatively small number.  But it sure looks funny to say “only,” which seems to imply that 55.4% is somehow not a large enough percentage of women.  (Would 44.6% men be too many?)

ORMS Today gives the source as the January 12, 2016, edition of USA Today.  I couldn’t find that online, but I did find this site that gives the same text quoted in ORMS Today.

A couple of pages later, in the “Newsmakers” section, ORMS Today says that, “According to [MIT Professor Richard] Larson, people can expect to spend one to two years of their lives waiting line, most of it stuck in traffic.”  Yikes!


Posted in optimization, statistics, operations research, gender | Leave a comment

An Explicit Solution to the Fibonacci Recurrence

The Fibonacci sequence is a famous sequence of numbers that starts 1, 1, 2, 3, 5, 8, 13, 21, and continues forever.  Each number in the sequence is the sum of the two previous numbers in the sequence.  It’s easy to generate the sequence this way, but if you want, say, the 1000th Fibonacci number, it’s going to take you a while to get there by hand.  In this post I’m going to discuss how to come up with a formula for the nth Fibonacci number.

Let’s let F_n denote the nth Fibonacci number.  Let’s start counting with n = 0, and say F_0 = 0.  Then F_1 = 1.  These are known as the initial conditions. For n \geq 2, F_n = F_{n-1} + F_{n-2}.  This is known as the recurrence.

The first thing I want to prove is that the set of solutions to the Fibonacci recurrence F_n = F_{n-1} + F_{n-2} (ignoring the initial conditions for now) form what’s called a vector space (under function addition and scalar multiplication).  To show that a set is a vector space we need to show that it satisfies a particular collection of axioms.  Suppose the sequence \{s_n\} is a solution to the Fibonacci recurrence.  Most of the vector space axioms are satisfied simply because s_n is a function; the ones we still need to check are that (1) adding two solutions gives us another solution, and (2) multiplying a solution by a constant (a scalar) gives us another solution.  First, if s_n and t_n are solutions, then substituting s_n + t_n for F_n in the right side of the recurrence yields (s_{n-1} + t_{n-1}) + (s_{n-2} + t_{n-2}) = (s_{n-1} + s_{n-2}) + (t_{n-1} + t_{n-2}) = s_n + t_n, which proves that s_n + t_n is also a solution to the Fibonacci recurrence.  Second, if s_n is a solution and c is a constant then substituting cs_n into the right side of the recurrence yields cs_{n-1} + cs_{n-2} = c(s_{n-1} + s_{n-2}) = cs_n, which means that c s_n is also a solution to the recurrence.  Thus (1) and (2) are satisfied, which means that the set of solutions to the Fibonacci recurrence is a vector space.  (Again, we’re ignoring the initial conditions.)

This may seem like a lot of overhead, but there’s a good reason. Vector spaces have something called a dimension.  (Example: Real Euclidean three-space \mathbb{R}^3 has dimension 3.)  The vector space consisting of the solutions to the Fibonacci recurrence turns out to have dimension 2.  (See below.)  Because of this, if we can find two solutions s_n and t_n to the Fibonacci recurrence that are not simply multiples of each other (more generally, are not linearly independent), then every solution to the Fibonacci recurrence must be of the form p s_n + q t_n, where p and q are some constants that depend on the initial conditions.  The set \{s_n, t_n\} is called a basis of the solution space.

Let’s now see why the dimension of the solution space to the Fibonacci recurrence is 2. We can represent the Fibonacci recurrence in matrix form like so: \displaystyle \begin{bmatrix} F_n \\ F_{n-1} \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} F_n \\ F_{n-1} \end{bmatrix}.  If we let X_n = (F_n, F_{n-1}) and A denote the matrix, this equation becomes X_n = A X_{n-1}.  Since the transition matrix A has dimension 2, so does the vector space of solutions to the Fibonacci recurrence.

Now we just need to find two solutions to the Fibonacci recurrence that aren’t multiples of each other.  Going simpler, the recurrence S_n = 2 S_{n-1} has S_n = 2^n as a solution.  Since the Fibonacci recurrence can be represented just like this, X_n = A X_{n-1}, with the matrix A instead of 2 and a vector X_n instead of S_n, maybe there is also a solution to the Fibonacci recurrence of the form F_n = r^n.  We just need to figure out the value of r.  So let’s substitute F_n = r^n into the Fibonacci recurrence and see if there are any values of r that work.  We get r^n = r^{n-1} + r^{n-2}. Dividing by r^{n-2}, we end up with r^2 = r + 1.  Solving this equation with the quadratic formula, we get not one but two values for r: \displaystyle r_1 = \frac{1 + \sqrt{5}}{2}, and \displaystyle r_2 = \frac{1 - \sqrt{5}}{2}.  This means that s_n = r_1^n and t_n = r_2^n are both solutions to the Fibonacci recurrence.  And since these solutions are not multiples of each other, they can be our set of basis solutions!

This means that every solution to the Fibonacci recurrence must be of the form  F_n = p r_1^n + q r_2^n, where r_1 = \frac{1 + \sqrt{5}}{2} and r_2 = \frac{1 - \sqrt{5}}{2}.  The specific solution we want is the one that has values for p and q such that the initial conditions F_0 = 0 and F_1 = 1 are satisfied.  Plugging n = 0 and n = 1 into F_n = p r_1^n + q r_2^n, we have 0 = p + q and 1 = p r_1 + q r_2.  Using the fact that r_1 + r_2 = 1, we can solve this system of equations to find out that p = 1/\sqrt{5}, q = -1/\sqrt{5}.  Therefore, the explicit solution to the Fibonacci recurrence is

\displaystyle F_n = \frac{1}{\sqrt{5}} \left(\frac{1 + \sqrt{5}}{2}\right)^n - \frac{1}{\sqrt{5}} \left(\frac{1 - \sqrt{5}}{2}\right)^n.

This formula is known as Binet’s formula.

Additional comments:

  1. The number r_1 = \frac{1+\sqrt{5}}{2} is known as the golden ratio.
  2. The number r_2 = \frac{1-\sqrt{5}}{2} is between -1 and 0, which means that as n gets larger and larger r_2^n gets closer and closer to 0.  Thus, for large values of n, F_n \approx \frac{1}{\sqrt{5}} r_1^n.
  3. The technique described here can be used to solve any linear homogeneous recurrence relation with constant coefficients.
Posted in Fibonacci sequence, recurrence relations, sequences and series | Leave a comment

Calculating SECA Offsets for Clergy

Recently my pastor posed a problem to me: How to calculate his SECA offset?  The answer can be found using algebra or using infinite series.  (For the non-math inclined, skip to the bottom of this post for an answer.)

Background: Most of us who work for an organization in the United States are charged a 15.3% Social Security FICA tax.  (This is 15.3% of our salaries.)  Half (7.65%) of that tax is paid by our employers, and the other half is paid by us (in the form of a deduction from our paychecks).  However, clergy are considered self-employed for Social Security tax purposes, which means that they have to pay all 15.3% themselves.  To counteract this, many congregations pay their clergy an extra 7.65% of their compensation to cover the Social Security tax that is normally covered by employers.  This is called a SECA offset.  However, this extra bit is itself considered compensation, so it is also taxed by Social Security!  This means that the SECA offset payment, if it is calculated this way, is actually less than needed to cover the intended part of the clergy’s Social Security tax.  The congregation could then pay an extra part to make up that difference, but that extra part would also be taxed, which means they’re still underpaying… and so on forever, in a perpetual cycle.

Let’s take an example.  Suppose a pastor makes a nice, round $50,000 in compensation.  On top of this, the congregation pays the pastor a SECA offset of 7.65%, which is $3825.  This means the pastor’s actual taxable income from the church is $53,825.  The 7.65% Social Security tax on $53,825 is $4117.61, which means the congregation has actually underpaid by $4117.61 minus $3825, which is $292.61.  The congregation could then give the extra $292.61 to the pastor, but that would itself count as income and thus raise the pastor’s Social Security tax even further.  Again, this process could continue forever.  Fortunately, looking at this problem the right way mathematically can solve it for us.

Let’s formulate the problem like this: How much should a congregation pay its clergy in SECA offset if it wants that payment to be exactly 7.65% of the clergy’s total compensation?

(Infinite Series Solution.)  Turning to math, let’s let A denote a pastor’s original compensation.  The first-order attempt at covering the SECA offset is 0.0765A.  This, though, is taxable, so the congregation needs to include 7.65% of this amount, or (0.0765)^2 A.  But then this extra is also taxable at 7.65%, so the congregation needs to include an additional (0.0765)^3 A, and so on.  Continuing in this vein, we have what is known as an infinite series for the amount the congregation should pay the pastor in total compensation:

\displaystyle A + 0.0765 A + (0.0765)^2 A + (0.0765)^3 A + (0.0765)^4 A + (0.0765)^5 A + \cdots.

The shorthand way to write an infinite series like this one is with sigma notation:

\displaystyle \sum_{i=0}^{\infty} (0.0765)^i A = A \sum_{i=0}^{\infty} (0.0765)^i.

This infinite series is a special class of infinite series known as a geometric series, \displaystyle \sum_{i=0}^{\infty} r^i.  In our example, r = 0.0765.  Provided -1 < r < 1, geometric series satisfy the nice formula

\displaystyle \sum_{i=0}^{\infty} r^i = \frac{1}{1-r}.

This gives us our answer: The congregation should pay the pastor a total of

\displaystyle A \sum_{i=0}^{\infty} (0.0765)^i = \frac{A}{1-0.0765} = \frac{A}{0.9235} = 1.082837A.  Subtracting the original compensation of A from this, we’re left with 0.082837A for the SECA offset.

This means the SECA offset should actually be 8.2837% of the original compensation if that payment is to be exactly 7.65% of the clergy’s total compensation.

(Algebra Solution.)  While the infinite series solution is the straightforward way of tackling the problem, the math is much easier if we set up an algebraic equation to solve the problem.  Let’s let x denote the desired amount of SECA offset, with A the pastor’s original compensation.  We want x to be exactly 7.65% of the pastor’s total compensation.  The pastor’s total compensation, though, is A+x.  That gives us the following equation:

x = 0.0765(A+x) .

Solving this for x, we get x = 0.0765A + 0.0765x, or 0.9235x = 0.0765A, or x = \frac{0.0765}{0.9235}A = 0.082837A, the same answer as before – but with less work.

(Of course, in practice there are other issues to consider, such as the fact that raising the pastor’s compensation by 8.2837% will raise his/her income tax and benefit payments as well.  The answer to the problem as posed, though, which just focuses on the SECA offset, is 8.2837% of the original compensation amount.)

Posted in applications, sequences and series | Leave a comment

Generalized Binomial Coefficients from Multiplicative and Divisible Functions

Given a function f from the natural numbers to the natural numbers, one way to generalize the binomial coefficient is via

\displaystyle \binom{n+m}{n}_f = \frac{ \prod_{i=1}^{n+m} f(i)}{ \left( \prod_{i=1}^n f(i) \right) \left( \prod_{i=1}^m f(i) \right)}.

The usual binomial coefficient \binom{n+m}{n} of course has f as the identity function f(n) = n.

Question: What kinds of functions f guarantee that \binom{m+n}{n}_f is an integer for all natural values of n and m?  A famous paper by Knuth and Wilf [2] includes a proof that strongly divisible functions (i.e., functions satisfying \gcd(f(m), f(n)) = f(\gcd (m,n)) guarantee this property.  The most famous strongly divisible function is probably the Fibonacci sequence F_n, so the binomial coefficient \binom{n+m}{n}_F is always an integer.

Another answer to this question comes from a paper [1] Tom Edgar and I just published.  We prove that functions that are both multiplicative and divisible also guarantee that \binom{m+n}{n}_f is an integer for all natural values of n and m.  (Multiplicative functions have f(mn) = f(m)f(n) when m and n are relatively prime.  Divisible functions have f(m)|f(n) whenever m|n.)  The class of multiplicative and divisible functions includes Euler’s totient, Dedekind’s psi, as well as many others.  (See Appendix A in our paper for a long list of some of these.)  It also includes the class of completely multiplicative functions (i.e., f(mn) = f(m)f(n) for all natural numbers m and n).  The class of multiplicative and divisible functions has overlap with the strongly divisible class of functions, but it is neither a subset nor a superset of it.

Tom and I start by proving the following formula for the generalized binomial coefficient when f is multiplicative:

\displaystyle \binom{n+m}{n}_f = \prod_p \left( \prod_{i \geq 0} \left( \frac{f(p^{i+1})}{f(p^i)}\right)^{\epsilon^p_i}\right),

where the outer product is taken over all primes p.  Also, \epsilon^p_i is defined to be 1 if there is a carry in the ith position when m and n are added in base p and 0 otherwise.  Since the condition “f(p^i) divides f(p^{i+1}) for all primes p and nonnegative integers i” turns out to be equivalent to f being divisible, the integrality result follows.

Our formula also allows us to give a quick proof of Kummer’s Theorem on the prime factorization of the binomial coefficients, as f(p^{i+1})/f(p^i) = p when f is the identity function.

In addition, we prove that generalized Catalan numbers C_f = \frac{1}{f(n+1)} \binom{2n}{n}_f are always integers when f is both multiplicative and divisible.  (Actually, we prove this for generalized Fuss-Catalan numbers and then derive the usual Catalan numbers as a special case.)


  1. Tom Edgar and Michael Z. Spivey, “Multiplicative Functions, Generalized Binomial Coefficients, and Generalized Catalan Numbers,” Journal of Integer Sequences 19 (1): Article 16.1.6, 2016.
  2. Donald E. Knuth and Herbert S. Wilf, “The Power of a Prime That Divides a Generalized Binomial Coefficient,” Journal für die reine und angewandte Mathematik 396: 212-219, 1989.


Posted in binomial coefficients, elementary number theory, number theory | 2 Comments

Integrality of the Catalan Numbers via Kummer’s Theorem

Why is the nth Catalan number, \binom{2n}{n}/(n+1), an integer?  If you know one of its combinatorial interpretations, then the answer is clear, but how do you get integrality strictly from this formula?  In this post I’m going to discuss how one can use Kummer’s Theorem to prove that the Catalan numbers are integers.

Kummer’s Theorem states that the highest power of a prime p dividing the binomial coefficient \binom{n+m}{n} is the number of carries when m and n are added in base p.

Proof.  Assuming we already know that the binomial coefficient \binom{2n}{n} is an integer, we only need to consider primes p that divide n+1 in order to prove \binom{2n}{n}/(n+1) is an integer.  So, suppose p is a prime that divides n+1.  It may be that p^2, or p^3, or some higher power of p divides n+1 as well.  So let’s let p^r be the highest power of p that divides n+1.  Since p^r | (n+1), the base p representation of n+1 has r zeros at the end.  (Similarly, the fact that nine thousand, seven hundred has 10^2 as the highest power of 10 that divides it means that its base 10 representation, 9700, has two zeros at the end.)  This means that the base p representation of n has r instances of p-1 at the end.  (In the previous example, we see this with the fact that 9699 has two copies of 9 at the end.)  When n is added to itself in base p, then, we get carries in the ones place, the p‘s place, and so forth, up through at least the p^{r-1} place, for a total of at least r carries.  Thus, by Kummer’s Theorem, \binom{2n}{n} is divisible by p^r.  Since we can make a similar argument for all primes p that divide n+1, \binom{2n}{n} is divisible by n+1.  Therefore, \binom{2n}{n}/(n+1) is an integer.  \square

Pomerance makes this argument in his paper [1] and extends it to divisors n+k and n-k instead of n+1.  Tom Edgar and I have a paper that will soon be appearing in the Journal of Integer Sequences that generalizes it in another way – to generalized Fuss-Catalan numbers that are formed from generalized binomial coefficients.


  1.  Carl Pomerance, “On numbers related to Catalan numbers,” preprint.
Posted in binomial coefficients, Catalan numbers, number theory | Leave a comment