Pascal Matrices and Binomial Inversion

In this post we’ll look at the relationship between a Pascal matrix, its inverse, and binomial inversion.  It turns out that these are the same concept viewed from two different angles.


The Pascal matrix P_m is the (m+1) \times (m+1) matrix containing Pascal’s triangle through row m.  For example,

\displaystyle P_3 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 2 & 1 & 0 \\ 1 & 3 & 3 & 1 \end{bmatrix}.

The inverse of a Pascal matrix turns out to be a slight variation on that Pascal matrix – one in which some of the entries are negative.  For example,

\displaystyle P_3^{-1} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 \\ 1 & -2 & 1 & 0 \\ -1 & 3 & -3 & 1 \end{bmatrix}.

More precisely, P_m is the (m+1) \times (m+1) matrix whose (i,j) entry (starting with row 0 and column 0) is \binom{i}{j}, and P_m^{-1} is the (m+1) \times (m+1) matrix whose (i,j) entry (again starting with row 0 and column 0) is \binom{i}{j}(-1)^{i-j}.
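As a quick sanity check, the matrix and its claimed inverse can be multiplied in a few lines of Python.  This is just a sketch using the standard library; the function names are my own.

```python
from math import comb

def pascal(m):
    # (m+1) x (m+1) Pascal matrix: entry (i, j) is C(i, j)
    # (math.comb returns 0 when j > i, giving the lower-triangular shape)
    return [[comb(i, j) for j in range(m + 1)] for i in range(m + 1)]

def pascal_inv(m):
    # claimed inverse: entry (i, j) is C(i, j) * (-1)^(i - j)
    return [[comb(i, j) * (-1) ** (i - j) for j in range(m + 1)]
            for i in range(m + 1)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

m = 3
identity = [[int(i == j) for j in range(m + 1)] for i in range(m + 1)]
assert matmul(pascal(m), pascal_inv(m)) == identity  # P_3 P_3^{-1} = I
```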


Binomial inversion is the following property: Given two sequences \{a_n\}, \{b_n\}, we have

\displaystyle b_n = \sum_{k=0}^n \binom{n}{k} a_k \iff a_n = \sum_{k=0}^n \binom{n}{k} b_k (-1)^{n-k}.

Binomial inversion is useful for proving sum identities involving the binomial coefficients: If you can prove one such identity, you get a second identity immediately from binomial inversion.
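The equivalence is easy to test numerically.  Here is a small Python sketch (function names are mine) that applies the forward sum and then the inverted sum, recovering the original sequence:

```python
from math import comb

def binomial_transform(a):
    # b_n = sum_{k=0}^{n} C(n, k) a_k
    return [sum(comb(n, k) * a[k] for k in range(n + 1))
            for n in range(len(a))]

def inverse_binomial_transform(b):
    # a_n = sum_{k=0}^{n} C(n, k) (-1)^(n - k) b_k
    return [sum(comb(n, k) * (-1) ** (n - k) * b[k] for k in range(n + 1))
            for n in range(len(b))]

a = [3, 1, 4, 1, 5, 9]  # arbitrary starting sequence
assert inverse_binomial_transform(binomial_transform(a)) == a
```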


Let’s look at the connection between Pascal matrices and binomial inversion now.  Suppose you take a sequence \{a_n\} and turn its first m+1 entries into a column vector {\bf a}_m.  For example,

\displaystyle {\bf a}_3 = \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix}.

Now, multiply {\bf a}_m by P_m.  This produces another vector that we can call {\bf b}_m.  In other words, P_m {\bf a}_m = {\bf b}_m.  For example,

\displaystyle P_3 {\bf a}_3 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1& 1 & 0 & 0 \\ 1 & 2 & 1 & 0 \\ 1 & 3 & 3 & 1 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix} =  \begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{bmatrix} = {\bf b}_3.

If we multiply both sides of this equation by P_m^{-1}, we get {\bf a}_m = P^{-1}_m {\bf b}_m.  For example,

\displaystyle {\bf a}_3 = \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ -1& 1 & 0 & 0 \\ 1 & -2 & 1 & 0 \\ -1 & 3 & -3 & 1 \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{bmatrix} = P_3^{-1} {\bf b}_3.

Let’s take a closer look at how {\bf b}_m is calculated.  Entry n, namely b_n, is the inner product of row n of P_m and the vector {\bf a}_m.  In other words,

\displaystyle b_n = \sum_{k=0}^n \binom{n}{k} a_k.

In addition, if we look at how entry n is calculated in {\bf a}_m via the {\bf a}_m = P^{-1}_m {\bf b}_m equation, we have that a_n is the inner product of row n of P_m^{-1} and the {\bf b}_m vector.  In other words,

\displaystyle a_n = \sum_{k=0}^n \binom{n}{k} (-1)^{n-k} b_k.

Thus the relationship between the \{a_n\} and \{b_n\} sequences expressed via the Pascal matrix and its inverse is that

\displaystyle b_n = \sum_{k=0}^n \binom{n}{k} a_k \iff a_n = \sum_{k=0}^n \binom{n}{k} (-1)^{n-k} b_k.

This is exactly binomial inversion.

Thus binomial inversion simply expresses what the inverse of a Pascal matrix is, and knowing the inverse of a Pascal matrix immediately gives you the binomial inversion formula.


Counting Poker Hands

For this post I’m going to go through a classic exercise in combinatorics and probability; namely, proving that the standard ranking of poker hands is correct.

First, here are the standard poker hands, in ranked order.

  1. Straight flush: Five cards of the same suit, in sequential numerical order.
  2. Four-of-a-kind: Four cards of the same denomination and one card of a different denomination.
  3. Full house: Three cards of one denomination and two cards of a second, different denomination.
  4. Flush: Five cards of the same suit.
  5. Straight: Five cards in sequential numerical order.
  6. Three-of-a-kind: Three cards of one denomination and one card from each of two other, different denominations.
  7. Two pair: Two cards of one denomination, two cards of a second denomination, and one card of a third denomination.
  8. One pair: Two cards of one denomination and one card from each of three other denominations.
  9. High card: None of the above.

Let’s count the number of ways each of these hands can occur.

First, there are \binom{52}{5} = 2,598,960 possible poker hands, as we’re choosing five cards from fifty-two.

  1. Straight flush:
    1. Choose the suit, in \binom{4}{1} = 4 ways.
    2. Then choose the denomination of the smallest card in the straight, in \binom{10}{1} = 10 ways.
    3. Once you choose the denomination of the smallest card in the straight, the rest of the denominations in the straight are determined.
    4. Thus there are 4(10) = 40 straight flushes.
    5. Probability: 40/2,598,960 = 0.0000154.
  2. Four-of-a-kind:
    1. Choose the denomination, in \binom{13}{1} = 13 ways.
    2. Then choose the four cards in that denomination, in \binom{4}{4} = 1 way.
    3. Finally, choose the off card, in \binom{48}{1} = 48 ways.
    4. Thus there are 13(1)(48) = 624 four-of-a-kind hands.
    5. Probability: 624/2,598,960 = 0.000240.
  3. Full house:
    1. Choose the denomination for three cards, in \binom{13}{1} = 13 ways.
    2. Next choose three of four cards in that denomination in \binom{4}{3} = 4 ways.
    3. Then choose the denomination for two cards in \binom{12}{1} = 12 ways.
    4. Finally, choose two of four cards in the second denomination in \binom{4}{2} = 6 ways.
    5. Thus there are 13(4)(12)(6) = 3744 full house hands.
    6. Probability: 3744/2,598,960 = 0.00144.
  4. Flush:
    1. Choose the suit in \binom{4}{1} = 4 ways.
    2. Then choose five cards from that suit in \binom{13}{5} = 1287 ways.
    3. This gives 4(1287) = 5148.
    4. However, we’ve included the straight flushes in this count, so we must subtract them off to get 5148 - 40 = 5108 flush hands.
    5. Probability: 5108/2,598,960 = 0.00197.
  5. Straight:
    1. Choose the denomination of the smallest card in the straight, in \binom{10}{1} = 10 ways.
    2. This fixes the denominations in the straight.
    3. Next, for each of the five denominations in the straight, we have four choices for the suit, giving 4(4)(4)(4)(4) = 1024 choices for the suits.
    4. This gives 10(1024) = 10,240.
    5. However, we’ve included the straight flushes in this count, so we must subtract them off to get 10,240 - 40 = 10,200 straight hands.
    6. Probability: 10,200/2,598,960 = 0.00392.
  6. Three-of-a-kind:
    1. Choose the denomination for the three-of-a-kind in \binom{13}{1} = 13 ways.
    2. Next, choose three cards from that denomination in \binom{4}{3} = 4 ways.
    3. Then choose two different denominations from the remaining 12 in \binom{12}{2} = 66 ways.
    4. Then select one card from the larger of the two off-denominations in \binom{4}{1} = 4 ways.
    5. Finally, select one card from the smaller of the two off-denominations in \binom{4}{1} = 4 ways.
    6. All together, there are 13(4)(66)(4)(4) = 54,912 three-of-a-kind hands.
    7. Probability: 54,912/2,598,960 = 0.0211.
  7. Two pair:
    1. Choose the two denominations for the pairs in \binom{13}{2} = 78 ways.
    2. Choose two cards from the larger of the chosen denominations in \binom{4}{2} = 6 ways.
    3. Choose two cards from the smaller of the chosen denominations in \binom{4}{2} = 6 ways.
    4. Choose the fifth card from the 44 cards not in one of the two chosen denominations in \binom{44}{1} = 44 ways.
    5. Thus there are 78(6)(6)(44) = 123,552 two-pair hands.
    6. Probability: 123,552/2,598,960 = 0.0475.
  8. One pair:
    1. Choose the denomination for the pair in \binom{13}{1} = 13 ways.
    2. Choose two cards from that denomination in \binom{4}{2} = 6 ways.
    3. Choose three different denominations for the remaining three cards in \binom{12}{3} = 220 ways.
    4. Choose one card from each of those three different denominations in \binom{4}{1} \binom{4}{1} \binom{4}{1} = 4(4)(4) = 64 ways.
    5. Thus there are 13(6)(220)(64) = 1,098,240 one-pair hands.
    6. Probability: 1,098,240/2,598,960 = 0.423.
  9. High card:
    1. The sum of all the previous counts is 1,296,420.
    2. Thus there are 2,598,960 - 1,296,420 = 1,302,540 high card hands.
    3. Probability: 1,302,540/2,598,960 = 0.501.
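All of the counts above can be reproduced with a short Python script.  This is a sketch (the dictionary layout is mine); the final assertion checks the point of the exercise, namely that the counts increase as we move down the standard ranking:

```python
from math import comb

total = comb(52, 5)  # 2,598,960 five-card hands
counts = {
    "straight flush":  4 * 10,
    "four of a kind":  13 * comb(4, 4) * 48,
    "full house":      13 * comb(4, 3) * 12 * comb(4, 2),
    "flush":           4 * comb(13, 5) - 40,   # subtract the straight flushes
    "straight":        10 * 4 ** 5 - 40,       # subtract the straight flushes
    "three of a kind": 13 * comb(4, 3) * comb(12, 2) * 4 * 4,
    "two pair":        comb(13, 2) * comb(4, 2) ** 2 * 44,
    "one pair":        13 * comb(4, 2) * comb(12, 3) * 4 ** 3,
}
counts["high card"] = total - sum(counts.values())

# rarer hands rank higher: the counts increase down the standard ranking
assert list(counts.values()) == sorted(counts.values())
assert sum(counts.values()) == total
```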

An Expected Value Connection Between Order Statistics from a Discrete and a Continuous Distribution

Years ago, in the course of doing some research on another topic, I ran across the following result relating the expected values of the order statistics from a discrete and a continuous distribution.  I found it rather surprising.

Theorem: Fix n, and let 1 \leq i \leq n. Let X_1 < X_2 < \cdots < X_i be the order statistics from a random sample of size i from the continuous uniform [0,n+1] distribution. Let Y_1 < Y_2 < \cdots < Y_i be the order statistics from a random sample of size i, chosen without replacement, from the discrete uniform \{1, \ldots, n\} distribution. Then, for each j, 1 \leq j \leq i, E[X_j] = E[Y_j].

Why would the expected values of the order statistics from a discrete distribution and a continuous distribution with different ranges match up exactly?  Particularly when the values from the discrete distribution are chosen without replacement?  It’s not too hard to prove, using basic order statistic properties, that E[X_j] = j(n+1)/(i+1) and that E[Y_j] = j(n+1)/(i+1) as well.  I didn’t find that very satisfying, though.  It took me a little while, but eventually I found a proof that uses a transformation from one sample to the other.  Here it is.
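Before the proof, the theorem is easy to check by simulation.  The following Python sketch (function names and the particular parameter values are mine) estimates both expected values by Monte Carlo and compares them to j(n+1)/(i+1):

```python
import random

def mean_continuous(n, i, j, trials=100_000):
    # Monte Carlo mean of the j-th order statistic of i uniforms on [0, n+1]
    return sum(sorted(random.uniform(0, n + 1) for _ in range(i))[j - 1]
               for _ in range(trials)) / trials

def mean_discrete(n, i, j, trials=100_000):
    # Monte Carlo mean of the j-th smallest of i values drawn
    # without replacement from {1, ..., n}
    return sum(sorted(random.sample(range(1, n + 1), i))[j - 1]
               for _ in range(trials)) / trials

n, i, j = 9, 4, 2
exact = j * (n + 1) / (i + 1)  # = 4.0 for these parameters
print(mean_continuous(n, i, j), mean_discrete(n, i, j), exact)
```

Both estimates land close to the common exact value 4.0.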

First, though, I need a lemma (which is fairly intuitive and which I won’t prove):

Lemma: Let X_1 \leq X_2 \leq \cdots \leq X_n be the order statistics from a random sample of size n from a distribution F (continuous or discrete). Let \sigma be a random permutation of the values \{1, \ldots, n\}, where \sigma is independent of the values of X_1, X_2, \ldots, X_n. Then the values X_{\sigma(1)}, X_{\sigma(2)}, \ldots, X_{\sigma(i)} are a random sample of size i from distribution F.

Now the proof.

Proof: Select a random sample of i values from the continuous uniform [0,n+1] distribution in the following manner: Randomly select n values from [0,n+1] and order them W_1 < W_2 < \cdots < W_n. Then randomly select i of these n values, independently of the values of the W_j's. By the lemma, this produces a random sample of i values; let X_1 < X_2 < \cdots < X_i denote these values in increasing order. In addition, the selection step is equivalent to randomly selecting i of the n indices; i.e., to selecting i values without replacement from the discrete uniform \{1, \ldots, n\} distribution. Thus we have, for each k \in \{1, \ldots, n\}, P(X_j = W_k) = P(Y_j = k). In addition, E[W_k] = k(n+1)/(n+1) = k, since W_k is the k^{th} smallest value in a random sample of size n from the continuous uniform [0,n+1] distribution. Therefore, by the law of total expectation,

\displaystyle E[X_j] =  \sum_{k=1}^n E[X_j | X_j = W_k] \; P(X_j = W_k) = \sum_{k=1}^n E[W_k] \; P(Y_j = k) \\ = \sum_{k=1}^n k \; P(Y_j = k) = E[Y_j].


Two Methods for Proving an Alternating Binomial Identity

Recently I gave an optional homework problem in which I asked students to prove the following binomial identity:

\displaystyle \sum_{k=0}^n \binom{n}{k} \frac{2^k}{k+1} (-1)^k = \frac{1}{n+1} [ n \text{ is even}].

(Here, I’m using the Iverson bracket notation [P] in which [P] = 1 if P is true and [P] =  0 if P is false.)
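Before proving the identity, it can be verified for small n with exact rational arithmetic.  Here is a quick Python sketch (function names are mine):

```python
from fractions import Fraction
from math import comb

def lhs(n):
    # sum_{k=0}^{n} C(n, k) * 2^k * (-1)^k / (k + 1), computed exactly
    return sum(Fraction(comb(n, k) * 2 ** k * (-1) ** k, k + 1)
               for k in range(n + 1))

def rhs(n):
    # 1/(n+1) if n is even, else 0 (the Iverson bracket)
    return Fraction(1, n + 1) if n % 2 == 0 else Fraction(0)

assert all(lhs(n) == rhs(n) for n in range(20))
```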

I intended for students to use the following approach.

Method 1.

First, prove the identity

\displaystyle \sum_{k=0}^n \binom{n}{2k}\frac{1}{2k+1} = \frac{2^n}{n+1}.

This was an earlier problem in the exercises, so the students could have just cited it.  But let’s prove it with the absorption identity, \displaystyle \binom{n}{k} \frac{n+1}{k+1} = \binom{n+1}{k+1}.

Rewriting the absorption identity, we have \displaystyle \binom{n}{2k} \frac{n+1}{2k+1} = \binom{n+1}{2k+1}.  Then

\displaystyle \sum_{k=0}^n \binom{n}{2k}\frac{1}{2k+1} = \frac{1}{n+1} \sum_{k=0}^n \binom{n+1}{2k+1} = \frac{2^{n+1}}{2(n+1)} = \frac{2^n}{n+1}.

(Here, the second summation is just adding up the odd-index numbers in row n+1 of Pascal’s triangle.  This is known to be half the value of the sum of the numbers in row n+1; i.e., half of 2^{n+1}.)
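This intermediate identity can also be checked exactly for small n.  Another quick sketch using Fraction (the function name is mine):

```python
from fractions import Fraction
from math import comb

def even_index_sum(n):
    # sum over k of C(n, 2k) / (2k + 1): the even lower indices of row n
    return sum(Fraction(comb(n, 2 * k), 2 * k + 1) for k in range(n // 2 + 1))

assert all(even_index_sum(n) == Fraction(2 ** n, n + 1) for n in range(25))
```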

Now that we’ve proved this identity, let’s rewrite it as

\displaystyle \sum_{k=0}^n \binom{n}{k} \frac{(-1)^k}{k+1} [k \text{ is even}] = \frac{2^n}{n+1}.

(The left side here equals the left side of the identity we just proved: the Iverson bracket zeroes out the odd-index terms, and (-1)^k = 1 when k is even.)  Now we can apply binomial inversion to get

\displaystyle \sum_{k=0}^n \binom{n}{k} \frac{2^k}{k+1} (-1)^k = \frac{1}{n+1} [n \text{ is even}].

This was a fairly hard problem to do this way, in the sense that it is probably not obvious from the first identity that it can be rewritten in a form to which binomial inversion applies.  However, my student Jiawen (Christine) Li found an easier way to prove the original identity by using the absorption identity directly.  Christine’s proof comprises Method 2.

Method 2.

Applying the absorption identity, we have

\displaystyle \sum_{k=0}^n \binom{n}{k} \frac{2^k}{k+1} (-1)^k = \frac{1}{n+1} \sum_{k=0}^n \binom{n+1}{k+1} (-2)^k = \frac{1}{n+1} \sum_{k=1}^{n+1} \binom{n+1}{k} (-2)^{k-1} \\ = \frac{-1}{2(n+1)} \sum_{k=1}^{n+1} \binom{n+1}{k} (-2)^k = \frac{-1}{2(n+1)} \left( \sum_{k=0}^{n+1} \binom{n+1}{k} (-2)^k - 1 \right) \\ = \frac{-1}{2(n+1)} \left( (-1)^{n+1} - 1\right) = \frac{1}{2(n+1)} \left( (-1)^n + 1 \right) = \frac{2}{2(n+1)} [n \text{ is even}] = \frac{1}{n+1} [ n \text{ is even}].

For more examples with the absorption identity, see this post.


A Bonus Question on Convergent Series

Occasionally when teaching the sequences and series material in second-semester calculus I’ve included the following question as a bonus:

Question: Suppose \displaystyle \sum_{n=1}^{\infty} a_n is absolutely convergent.  Does that imply anything about the convergence of \displaystyle \sum_{n=1}^{\infty} a^2_n?

The answer is that \displaystyle \sum_{n=1}^{\infty}a^2_n converges.  I’m going to give two proofs.  The first is the more straightforward approach, but it is somewhat longer.  The second is clever and shorter.

First method: If \displaystyle \sum_{n=1}^{\infty} a_n is absolutely convergent, then, by the divergence test, \displaystyle \lim_{n \to \infty} |a_n| = 0.  Thus there exists some N > 0 such that if n \geq N then |a_n| \leq 1.  This means that, for n \geq N, a^2_n \leq |a_n|.  By the direct comparison test, then, \displaystyle \sum_{n=1}^{\infty} a^2_n converges.

Second method: As we argued above, \displaystyle \lim_{n \to \infty} |a_n| = 0.  (We may discard any terms with a_n = 0, since they contribute nothing to either series.)  Thus \displaystyle \lim_{n \to \infty} \frac{a^2_n}{|a_n|} = \lim_{n \to \infty} |a_n| = 0.  By the limit comparison test, then, \displaystyle \sum_{n=1}^{\infty} a^2_n converges.

(The first method is David Mitra’s answer and the second method is my answer to this Math.SE question.)
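As a concrete numeric illustration (not a proof): take a_n = 1/n^2, which gives an absolutely convergent series, and note that the squared series \sum 1/n^4 converges as well.  The known limits \pi^2/6 and \pi^4/90 are used below only as reference points:

```python
import math

N = 100_000
s = sum(1 / n ** 2 for n in range(1, N + 1))   # partial sum of sum a_n
s2 = sum(1 / n ** 4 for n in range(1, N + 1))  # partial sum of sum a_n^2

# both partial sums are already very close to their limits
print(abs(s - math.pi ** 2 / 6), abs(s2 - math.pi ** 4 / 90))
```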


My Experiences on a Post-Election Panel

No mathematics post this month.  Instead, I’m just going to link to an article I published last week in Inside Higher Ed.  In this article I describe my experiences as the conservative voice on a panel held on my campus on November 9.  This was the day after the U.S. Presidential election, and the purpose of the panel was to help the mostly progressive campus make sense of Donald Trump’s win.


Relations That Are Symmetric and Antisymmetric

When I teach relations and their properties, the question of whether a relation can be both symmetric and antisymmetric always seems to arise.  This post addresses that question.

First, a reminder of the definitions here: A relation \rho on a set S is symmetric if, for all x,y \in S, (x,y) \in \rho implies that (y,x) \in \rho.  A relation \rho on a set S is antisymmetric if, for all x,y \in S, (x,y) \in \rho and (y,x) \in \rho implies that x = y.

At first glance the definitions look a bit strange, in the sense that we would expect antisymmetric to mean “not symmetric.”  But that’s not quite what antisymmetric means. In fact, it is possible for a relation to be both symmetric and antisymmetric.

The boring (trivial) example is the empty relation \rho on a set.  In this case, the antecedents in the definitions of symmetric ((x,y) \in \rho) and antisymmetric ((x,y) \in \rho and (y,x) \in \rho) are never satisfied.  Thus the conditional statements in the definitions of the two properties are vacuously true, and so the empty relation is both symmetric and antisymmetric.

More interestingly, though, there are nontrivial examples.  Let’s think through what those might be. Suppose (x,y) \in \rho.  If \rho is symmetric, then we require (y,x) \in \rho as well.  If \rho is antisymmetric, then, since (x,y) \in \rho and (y,x) \in \rho, we must have x = y.  And this gives us a characterization of relations that are both symmetric and antisymmetric:

If a relation \rho on a set S is both symmetric and antisymmetric, then the only ordered pairs in \rho are of the form (x,x), where x \in S.  (Conversely, any relation whose ordered pairs are all of this form is clearly both symmetric and antisymmetric.)

Thus, for example, the equality relation on the set of real numbers is both symmetric and antisymmetric.

Perhaps the sense in which antisymmetric opposes symmetric is clearer now.  Symmetric means that there cannot be one-way relationships between two different elements.  Antisymmetric means that there cannot be two-way relationships between two different elements.  Both definitions allow a relationship between an element and itself.
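The definitions and the characterization above translate directly into code.  Here is a small Python sketch, representing a relation as a set of ordered pairs (function names are mine):

```python
def is_symmetric(rho):
    # (x, y) in rho implies (y, x) in rho
    return all((y, x) in rho for (x, y) in rho)

def is_antisymmetric(rho):
    # (x, y) and (y, x) both in rho implies x == y
    return all(x == y for (x, y) in rho if (y, x) in rho)

equality = {(x, x) for x in range(5)}  # diagonal pairs only
assert is_symmetric(equality) and is_antisymmetric(equality)

leq = {(x, y) for x in range(5) for y in range(5) if x <= y}
assert is_antisymmetric(leq) and not is_symmetric(leq)

empty = set()  # vacuously both symmetric and antisymmetric
assert is_symmetric(empty) and is_antisymmetric(empty)
```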
