I’m a big fan of combinatorial proofs, in which you show that both sides of an identity count the same entity in two different ways. Probabilistic proofs – in which you show that both sides of an identity express the probability that an event occurs in two different ways – are a close relative of combinatorial proofs. In fact, many probabilistic proofs can be translated directly into combinatorial proofs by clearing denominators. Sometimes, though, the probabilistic proof is more natural than the corresponding combinatorial one. Today’s post is one of these.

One of the basic binomial coefficient identities that involves a fraction is the following:

.

Interpreting this probabilistically, the left-hand side looks like it uses inclusion-exclusion, and the right looks like the probability of selecting one special item out of *n*+1 choices. With that in mind, we can further interpret the left-hand side as having something to do with selecting items out of *k*+1 choices, as *k* varies from 0 to *n*. Probabilists like balls and jars, so let’s construct a scenario in those terms.

Suppose there are balls numbered 1 through *n* placed in a jar along with one black ball. We select the balls, one-by-one, and remove them from the jar. What is the probability that the black ball is the first ball chosen? It’s pretty clear that this is , the right-hand side of our identity.

To use inclusion-exclusion for the left-hand side we have to interpret “the black ball is the first ball chosen” as the union or intersection of a collection of other events. Moreover, we have to do this in such a way as to work in the binomial coefficient and unit fraction as part of counting up the probabilities when these events are intersected. From the intersection standpoint, the black ball does have to come before ball 1 and before ball 2 and before ball 3 and so forth. So let’s let , , be the event that the black ball is drawn before ball *i* and use the intersection version of inclusion-exclusion to find an expression for . We have that for any *i*. Similarly, because the probability that the black ball does not come before ball *i* and does not come before ball *j* is the probability that it comes last of those three, which is . In general, the probability that the black ball comes as the last of any *k*+1 specific balls is , and there are collections of specific balls that include the black ball. By inclusion-exclusion, then, the probability that the black ball is the first ball chosen is given by

Thus both sides of the identity represent the probability that the black ball is the first ball chosen out of *n*+1 specific balls.

(The are, of course, other proofs of this identity. Two of the quicker ones are to use the absorption identity for binomial coefficients, as in my post here, and integrating the function in two different ways, as in my post here.)