## Pascal Matrices and Binomial Inversion

In this post we’ll look at the relationship between a Pascal matrix, its inverse, and binomial inversion.  It turns out that the matrix inverse and binomial inversion are the same concept viewed from two different angles.

The Pascal matrix $P_m$ is the $(m+1) \times (m+1)$ matrix containing Pascal’s triangle through row m.  For example,

$\displaystyle P_3 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 2 & 1 & 0 \\ 1 & 3 & 3 & 1 \end{bmatrix}.$

The inverse of a Pascal matrix turns out to be a slight variation on that Pascal matrix – one in which some of the entries are negative.  For example,

$\displaystyle P_3^{-1} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 \\ 1 & -2 & 1 & 0 \\ -1 & 3 & -3 & 1 \end{bmatrix}.$

More precisely, $P_m$ is the $(m+1) \times (m+1)$ matrix whose $(i,j)$ entry (starting with row 0 and column 0) is $\binom{i}{j}$, and $P_m^{-1}$ is the $(m+1) \times (m+1)$ matrix whose $(i,j)$ entry (again starting with row 0 and column 0) is $\binom{i}{j}(-1)^{i-j}$.
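As a quick sanity check, here is a short Python sketch (the function names are my own) that builds $P_m$ and the claimed inverse from these entry formulas and verifies that their product is the identity:

```python
from math import comb

def pascal_matrix(m):
    """Lower-triangular Pascal matrix P_m: entry (i, j) is C(i, j)."""
    return [[comb(i, j) for j in range(m + 1)] for i in range(m + 1)]

def pascal_inverse(m):
    """Claimed inverse of P_m: entry (i, j) is C(i, j) * (-1)^(i - j),
    with zeros above the diagonal."""
    return [[comb(i, j) * (-1) ** (i - j) if j <= i else 0
             for j in range(m + 1)] for i in range(m + 1)]

def matmul(A, B):
    """Product of two square matrices of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# P_3 matches the example above, and P_3 times its claimed inverse is I.
assert pascal_matrix(3) == [[1, 0, 0, 0], [1, 1, 0, 0], [1, 2, 1, 0], [1, 3, 3, 1]]
identity = [[int(i == j) for j in range(4)] for i in range(4)]
assert matmul(pascal_matrix(3), pascal_inverse(3)) == identity
```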

Binomial inversion is the following property: Given two sequences $\{a_n\}, \{b_n\}$, we have

$\displaystyle b_n = \sum_{k=0}^n \binom{n}{k} a_k \iff a_n = \sum_{k=0}^n \binom{n}{k} b_k (-1)^{n-k}$.

Binomial inversion is useful for proving sum identities involving the binomial coefficients: If you can prove one such identity, you get a second identity immediately from binomial inversion.

Let’s look at the connection between Pascal matrices and binomial inversion now.  Suppose you take a sequence $\{a_n\}$ and turn its first $m+1$ entries into a column vector ${\bf a}_m$.  For example,

$\displaystyle {\bf a}_3 = \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix}$.

Now, multiply ${\bf a}_m$ by $P_m$.  This produces another vector that we can call ${\bf b}_m$.  In other words, $P_m {\bf a}_m = {\bf b}_m$.  For example,

$\displaystyle P_3 {\bf a}_3 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1& 1 & 0 & 0 \\ 1 & 2 & 1 & 0 \\ 1 & 3 & 3 & 1 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{bmatrix} = {\bf b}_3$.

If we multiply both sides of this equation by $P_m^{-1}$, we get ${\bf a}_m = P^{-1}_m {\bf b}_m$.  For example,

$\displaystyle {\bf a}_3 = \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ -1& 1 & 0 & 0 \\ 1 & -2 & 1 & 0 \\ -1 & 3 & -3 & 1 \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{bmatrix} = P_3^{-1} {\bf b}_3$.

Let’s take a closer look at how ${\bf b}_m$ is calculated.  Entry $n$, which is $b_n$, is the inner product of row $n$ of $P_m$ and the ${\bf a}_m$ vector.  In other words,

$\displaystyle b_n = \sum_{k=0}^n \binom{n}{k} a_k$.

In addition, if we look at how entry n is calculated in ${\bf a}_m$ via the ${\bf a}_m = P^{-1}_m {\bf b}_m$ equation, we have that $a_n$ is the inner product of row n of $P_m^{-1}$ and the ${\bf b}_m$ vector.  In other words,

$\displaystyle a_n = \sum_{k=0}^n \binom{n}{k} (-1)^{n-k} b_k$.

Thus the relationship between the $\{a_n\}$ and $\{b_n\}$ sequences expressed via the Pascal matrix and its inverse is that

$\displaystyle b_n = \sum_{k=0}^n \binom{n}{k} a_k \iff a_n = \sum_{k=0}^n \binom{n}{k} (-1)^{n-k} b_k$.

This is exactly binomial inversion.

Thus binomial inversion simply expresses what the inverse of a Pascal matrix is, and knowing the inverse of a Pascal matrix gives you the binomial inversion formula immediately.
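The equivalence can also be checked numerically.  Here is a minimal Python sketch (function names are mine) of the two transforms written as sums rather than matrix products:

```python
from math import comb

def binomial_transform(a):
    """b_n = sum_{k=0}^n C(n, k) a_k, i.e. multiplication by the Pascal matrix."""
    return [sum(comb(n, k) * a[k] for k in range(n + 1)) for n in range(len(a))]

def inverse_binomial_transform(b):
    """a_n = sum_{k=0}^n C(n, k) (-1)^(n-k) b_k, i.e. multiplication by the inverse."""
    return [sum(comb(n, k) * (-1) ** (n - k) * b[k] for k in range(n + 1))
            for n in range(len(b))]

# With a_n = 1 for all n, the transform gives b_n = 2^n, and inverting recovers a.
a = [1, 1, 1, 1, 1]
b = binomial_transform(a)
assert b == [1, 2, 4, 8, 16]
assert inverse_binomial_transform(b) == a
```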


## Counting Poker Hands

For this post I’m going to go through a classic exercise in combinatorics and probability; namely, proving that the standard ranking of poker hands is correct.

First, here are the standard poker hands, in ranked order.

1. Straight flush: Five cards of the same suit, in sequential numerical order.
2. Four-of-a-kind: Four cards of the same denomination and one card of a different denomination.
3. Full house: Three cards of one denomination and two cards of a second denomination.
4. Flush: Five cards of the same suit.
5. Straight: Five cards in sequential numerical order.
6. Three-of-a-kind: Three cards of one denomination and one card each from two other, distinct denominations.
7. Two pair: Two cards of one denomination, two cards of a second denomination, and one card of a third denomination.
8. One pair: Two cards of one denomination and one card each from three other, distinct denominations.
9. High card: None of the above.

Let’s count the number of ways each of these hands can occur.

First, there are $\binom{52}{5} = 2,598,960$ possible poker hands, as we’re choosing five cards from fifty-two.

1. Straight flush:
1. Choose the suit, in $\binom{4}{1} = 4$ ways.
2. Then choose the denomination of the smallest card in the straight, in $\binom{10}{1} = 10$ ways.
3. Once you choose the denomination of the smallest card in the straight, the rest of the denominations in the straight are determined.
4. Thus there are 4(10) = 40 straight flushes.
5. Probability: 40/2,598,960 = 0.0000154.
2. Four-of-a-kind:
1. Choose the denomination, in $\binom{13}{1} = 13$ ways.
2. Then choose the four cards in that denomination, in $\binom{4}{4} = 1$ way.
3. Finally, choose the off card, in $\binom{48}{1} = 48$ ways.
4. Thus there are 13(1)(48) = 624 four-of-a-kind hands.
5. Probability: 624/2,598,960 = 0.000240.
3. Full house:
1. Choose the denomination for three cards, in $\binom{13}{1} = 13$ ways.
2. Next choose three of four cards in that denomination in $\binom{4}{3} = 4$ ways.
3. Then choose the denomination for two cards in $\binom{12}{1} = 12$ ways.
4. Finally, choose two of four cards in the second denomination in $\binom{4}{2} = 6$ ways.
5. Thus there are 13(4)(12)(6) = 3744 full house hands.
6. Probability: 3744/2,598,960 = 0.00144.
4. Flush:
1. Choose the suit in $\binom{4}{1} = 4$ ways.
2. Then choose five cards from that suit in $\binom{13}{5} = 1287$ ways.
3. This gives 4(1287) = 5148.
4. However, we’ve included the straight flushes in this count, so we must subtract them off to get $5148 - 40 = 5108$ flush hands.
5. Probability: 5108/2,598,960 = 0.00197.
5. Straight:
1. Choose the denomination of the smallest card in the straight, in $\binom{10}{1} = 10$ ways.
2. This fixes the denominations in the straight.
3. Next, for each of the five denominations in the straight, we have four choices for the suit, giving 4(4)(4)(4)(4) = 1024 choices for the suits.
4. This gives 10(1024) = 10,240.
5. However, we’ve included the straight flushes in this count, so we must subtract them off to get $10,240 - 40 = 10,200$ straight hands.
6. Probability: 10,200/2,598,960 = 0.00392.
6. Three-of-a-kind:
1. Choose the denomination for the three-of-a-kind in $\binom{13}{1} = 13$ ways.
2. Next, choose three cards from that denomination in $\binom{4}{3} = 4$ ways.
3. Then choose two different denominations from the remaining 12 in $\binom{12}{2} = 66$ ways.
4. Then select one card from the larger of the two off-denominations in $\binom{4}{1} = 4$ ways.
5. Finally, select one card from the smaller of the two off-denominations in $\binom{4}{1} = 4$ ways.
6. All together, there are 13(4)(66)(4)(4) = 54,912 three-of-a-kind hands.
7. Probability: 54,912/2,598,960 = 0.0211.
7. Two pair:
1. Choose the two denominations for the pairs in $\binom{13}{2} = 78$ ways.
2. Choose two cards from the larger of the chosen denominations in $\binom{4}{2} = 6$ ways.
3. Choose two cards from the smaller of the chosen denominations in $\binom{4}{2} = 6$ ways.
4. Choose the fifth card from the 44 cards not in one of the two chosen denominations in $\binom{44}{1} = 44$ ways.
5. Thus there are 78(6)(6)(44) = 123,552 two-pair hands.
6. Probability: 123,552/2,598,960 = 0.0475.
8. One pair:
1. Choose the denomination for the pair in $\binom{13}{1} = 13$ ways.
2. Choose two cards from that denomination in $\binom{4}{2} = 6$ ways.
3. Choose three different denominations for the remaining three cards in $\binom{12}{3} = 220$ ways.
4. Choose one card from each of those three different denominations in $\binom{4}{1} \binom{4}{1} \binom{4}{1} = 4(4)(4) = 64$ ways.
5. Thus there are 13(6)(220)(64) = 1,098,240 one-pair hands.
6. Probability: 1,098,240/2,598,960 = 0.423.
9. High card:
1. The sum of all the previous counts is 1,296,420.
2. Thus there are $2,598,960 - 1,296,420 = 1,302,540$ high card hands.
3. Probability: 1,302,540/2,598,960 = 0.501.
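The arithmetic above is easy to mis-transcribe, so here is a short Python sketch that recomputes each count directly from the formulas (the dictionary keys are mine).  The final check confirms the ranking itself: rarer hands rank higher, and the counts strictly increase as we move down the list.

```python
from math import comb

total = comb(52, 5)                       # 2,598,960 possible hands
counts = {
    "straight flush":  4 * 10,
    "four of a kind":  13 * comb(4, 4) * 48,
    "full house":      13 * comb(4, 3) * 12 * comb(4, 2),
    "flush":           4 * comb(13, 5) - 40,      # subtract straight flushes
    "straight":        10 * 4 ** 5 - 40,          # subtract straight flushes
    "three of a kind": 13 * comb(4, 3) * comb(12, 2) * 4 * 4,
    "two pair":        comb(13, 2) * comb(4, 2) ** 2 * 44,
    "one pair":        13 * comb(4, 2) * comb(12, 3) * 4 ** 3,
}
counts["high card"] = total - sum(counts.values())

assert total == 2598960
assert counts["high card"] == 1302540
# The standard ranking is correct: counts strictly increase down the list.
ranked = list(counts.values())
assert all(x < y for x, y in zip(ranked, ranked[1:]))
```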

## An Expected Value Connection Between Order Statistics from a Discrete and a Continuous Distribution

Years ago, in the course of doing some research on another topic, I ran across the following result relating the expected values of the order statistics from a discrete and a continuous distribution.  I found it rather surprising.

Theorem: Fix n, and let $1 \leq i \leq n$. Let $X_1 < X_2 < \cdots < X_i$ be the order statistics from a random sample of size i from the continuous uniform $[0,n+1]$ distribution. Let $Y_1 < Y_2 < \cdots < Y_i$ be the order statistics from a random sample of size i, chosen without replacement, from the discrete uniform $\{1, \ldots, n\}$ distribution. Then, for each j, $1 \leq j \leq i$, $E[X_j] = E[Y_j]$.

Why would the expected values of the order statistics from a discrete distribution and a continuous distribution with different ranges match up exactly?  Particularly when the values from the discrete distribution are chosen without replacement?  It’s not too hard to prove, using basic order statistic properties, that $E[X_j] = j(n+1)/(i+1)$ and that $E[Y_j] = j(n+1)/(i+1)$ as well.  I didn’t find that very satisfying, though.  It took me a little while, but eventually I found a proof that uses a transformation from one sample to the other.  Here it is.

First, though, I need a lemma (which is fairly intuitive and which I won’t prove):

Lemma: Let $X_1 \leq X_2 \leq \cdots \leq X_n$ be the order statistics from a random sample of size n from a distribution F (continuous or discrete). Let $\sigma$ be a random permutation of the values $\{1, \ldots, n\}$, where $\sigma$ is independent of the values of $X_1, X_2, \ldots, X_n$. Then the values $X_{\sigma(1)}, X_{\sigma(2)}, \ldots, X_{\sigma(i)}$ are a random sample of size i from distribution F.

Now the proof.

Proof: Select a random sample of i values from the continuous uniform $[0,n+1]$ distribution in the following manner: Randomly select n values from $[0,n+1]$ and order them $W_1 < W_2 < \cdots < W_n$. Then randomly select i of these n values independently of the values of the $W_j$’s. By the lemma, this produces a random sample of i values. In addition, the previous step is equivalent to randomly selecting i of the n indices; i.e., selecting i values without replacement from the discrete uniform $\{1, \ldots, n\}$ distribution. Thus we have, for each $k \in \{1, \ldots, n\}$, $P(X_j = W_k) = P(Y_j = k)$. In addition, $E[W_k] = k(n+1)/(n+1) = k$, since $W_k$ is the $k^{th}$ smallest value of a random sample of size n from the continuous uniform $[0,n+1]$ distribution. Therefore, by the law of total expectation,

$\displaystyle E[X_j] = \sum_{k=1}^n E[X_j | X_j = W_k] \; P(X_j = W_k) = \sum_{k=1}^n E[W_k] \; P(Y_j = k) \\ = \sum_{k=1}^n k \; P(Y_j = k) = E[Y_j].$
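As a sanity check, the discrete side of the theorem can be verified by exhaustive enumeration for small cases.  A Python sketch (exact arithmetic via `fractions`; the function name is mine):

```python
from fractions import Fraction
from itertools import combinations

def discrete_order_stat_mean(n, i, j):
    """Exact E[Y_j] for the j-th smallest of i values drawn without
    replacement from {1, ..., n}: average over all equally likely samples."""
    samples = list(combinations(range(1, n + 1), i))   # tuples come out sorted
    return Fraction(sum(s[j - 1] for s in samples), len(samples))

# E[Y_j] = j(n+1)/(i+1), matching E[X_j] for the continuous uniform on [0, n+1].
n, i = 7, 3
for j in range(1, i + 1):
    assert discrete_order_stat_mean(n, i, j) == Fraction(j * (n + 1), i + 1)
```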

## Two Methods for Proving an Alternating Binomial Identity

Recently I gave an optional homework problem in which I asked students to prove the following binomial identity:

$\displaystyle \sum_{k=0}^n \binom{n}{k} \frac{2^k}{k+1} (-1)^k = \frac{1}{n+1} [ n \text{ is even}]$.

(Here, I’m using the Iverson bracket notation $[P]$ in which $[P] = 1$ if P is true and $[P] = 0$ if P is false.)

I intended for students to use the following approach.

Method 1.

First, prove the identity

$\displaystyle \sum_{k=0}^n \binom{n}{2k}\frac{1}{2k+1} = \frac{2^n}{n+1}$.

This was an earlier problem in the exercises, so the students could have just cited it.  But let’s prove it with the absorption identity, $\displaystyle \binom{n}{k} \frac{n+1}{k+1} = \binom{n+1}{k+1}$.

Rewriting the absorption identity, we have $\displaystyle \binom{n}{2k} \frac{n+1}{2k+1} = \binom{n+1}{2k+1}$.  Then

$\displaystyle \sum_{k=0}^n \binom{n}{2k}\frac{1}{2k+1} = \frac{1}{n+1} \sum_{k=0}^n \binom{n+1}{2k+1} = \frac{2^{n+1}}{2(n+1)} = \frac{2^n}{n+1}$.

(Here, the second summation is just adding up the odd-index numbers in row $n+1$ of Pascal’s triangle.  This is known to be half the value of the sum of the numbers in row $n+1;$ i.e., half of $2^{n+1}$.)
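This intermediate identity is also easy to confirm with exact rational arithmetic; a small Python sketch (the function name is mine):

```python
from fractions import Fraction
from math import comb

def odd_reciprocal_sum(n):
    """Left side: sum over k of C(n, 2k) / (2k + 1), using exact fractions."""
    return sum(Fraction(comb(n, 2 * k), 2 * k + 1) for k in range(n // 2 + 1))

# Check against the right side, 2^n / (n + 1), for small n.
for n in range(12):
    assert odd_reciprocal_sum(n) == Fraction(2 ** n, n + 1)
```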

Now that we’ve proved this identity, let’s rewrite it as

$\displaystyle \sum_{k=0}^n \binom{n}{k} \frac{(-1)^k}{k+1} [k \text{ is even}] = \frac{2^n}{n+1}$.

(Notice how the left side is the same as the left side in the identity we just proved: the odd-index terms are zero because of the Iverson bracket, and $(-1)^k = 1$ for the even values of $k$ that remain.)  Now we can apply binomial inversion to get

$\displaystyle \sum_{k=0}^n \binom{n}{k} \frac{2^k}{k+1} (-1)^k = \frac{1}{n+1} [n \text{ is even}].$

This was a fairly hard problem to do this way in the sense that it’s likely not clear by looking at the first identity we proved that you can rewrite it in such a way that binomial inversion can be used.  However, my student Jiawen (Christine) Li found an easier way to prove the original identity by using the absorption property directly.  Christine’s proof comprises Method 2.

Method 2.

Applying the absorption identity, we have

$\displaystyle \sum_{k=0}^n \binom{n}{k} \frac{2^k}{k+1} (-1)^k = \frac{1}{n+1} \sum_{k=0}^n \binom{n+1}{k+1} (-2)^k = \frac{1}{n+1} \sum_{k=1}^{n+1} \binom{n+1}{k} (-2)^{k-1} \\ = \frac{-1}{2(n+1)} \sum_{k=1}^{n+1} \binom{n+1}{k} (-2)^k = \frac{-1}{2(n+1)} \left( \sum_{k=0}^{n+1} \binom{n+1}{k} (-2)^k - 1 \right) = \frac{-1}{2(n+1)} \left( (-1)^{n+1} - 1\right) \\ = \frac{1}{2(n+1)} \left( (-1)^n + 1 \right) = \frac{2}{2(n+1)} [n \text{ is even}] = \frac{1}{n+1} [ n \text{ is even}].$
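Either way, the result is easy to spot-check with exact arithmetic; a short Python sketch (the function name is mine):

```python
from fractions import Fraction
from math import comb

def alternating_sum(n):
    """Left side: sum over k of C(n, k) 2^k (-1)^k / (k + 1), exactly."""
    return sum(Fraction(comb(n, k) * 2 ** k * (-1) ** k, k + 1)
               for k in range(n + 1))

# Right side is 1/(n+1) when n is even and 0 when n is odd.
for n in range(12):
    assert alternating_sum(n) == (Fraction(1, n + 1) if n % 2 == 0 else 0)
```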

For more examples with the absorption identity, see this post.

## A Bonus Question on Convergent Series

Occasionally when teaching the sequences and series material in second-semester calculus I’ve included the following question as a bonus:

Question: Suppose $\displaystyle \sum_{n=1}^{\infty} a_n$ is absolutely convergent.  Does that imply anything about the convergence of $\displaystyle \sum_{n=1}^{\infty} a^2_n$?

The answer is that $\displaystyle \sum_{n=1}^{\infty}a^2_n$ converges.  I’m going to give two proofs.  The first is the more straightforward approach, but it is somewhat longer.  The second is clever and shorter.

First method: If $\displaystyle \sum_{n=1}^{\infty} a_n$ is absolutely convergent, then, by the divergence test, $\displaystyle \lim_{n \to \infty} |a_n| = 0$.  Thus there exists some $N > 0$ such that if $n \geq N$ then $|a_n| \leq 1$.  This means that, for $n \geq N$, $a^2_n \leq |a_n|$.  By the direct comparison test, then, $\displaystyle \sum_{n=1}^{\infty} a^2_n$ converges.

Second method: As we argued above, $\displaystyle \lim_{n \to \infty} |a_n| = 0$.  Thus $\displaystyle \lim_{n \to \infty} \frac{a^2_n}{|a_n|} = \lim_{n \to \infty} |a_n| = 0$.  By the limit comparison test, then, $\sum_{n=1}^{\infty} a^2_n$ converges.

(The first method is David Mitra’s answer and the second method is my answer to this Math.SE question.)
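A quick numerical illustration of the comparison at work, using the absolutely convergent series $a_n = (-1)^n/n^2$ (here every $|a_n| \leq 1$, so the bound $a_n^2 \leq |a_n|$ holds from the very first term):

```python
# a_n = (-1)^n / n^2: sum |a_n| converges, so sum a_n^2 should too.
terms = [(-1) ** n / n ** 2 for n in range(1, 10001)]

# Every |a_n| <= 1, so the comparison a_n^2 <= |a_n| holds termwise...
assert all(t * t <= abs(t) for t in terms)

# ...and the partial sums of a_n^2 are bounded by those of |a_n|.
abs_partial = sum(abs(t) for t in terms)   # approaches pi^2 / 6
sq_partial = sum(t * t for t in terms)     # approaches pi^4 / 90
assert sq_partial <= abs_partial
```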


## My Experiences on a Post-Election Panel

No mathematics post this month.  Instead, I’m just going to link to an article I published last week in Inside Higher Ed.  In this article I describe my experiences as the conservative voice on a panel held on my campus on November 9.  This was the day after the U.S. Presidential election, and the purpose of the panel was to help the mostly progressive campus make sense of Donald Trump’s win.


## Relations That Are Symmetric and Antisymmetric

When I teach relations and their properties, the question of whether a relation can be both symmetric and antisymmetric always seems to arise.  This post addresses that question.

First, a reminder of the definitions here: A relation $\rho$ on a set S is symmetric if, for all $x,y \in S$, $(x,y) \in \rho$ implies that $(y,x) \in \rho$.  A relation $\rho$ on a set S is antisymmetric if, for all $x,y \in S$, $(x,y) \in \rho$ and $(y,x) \in \rho$ implies that $x = y$.

At first glance the definitions look a bit strange, in the sense that we would expect antisymmetric to mean “not symmetric.”  But that’s not quite what antisymmetric means. In fact, it is possible for a relation to be both symmetric and antisymmetric.

The boring (trivial) example is the empty relation $\rho$ on a set.  In this case, the antecedents in the definitions of symmetric ($(x,y) \in \rho$) and antisymmetric ($(x,y) \in \rho$ and $(y,x) \in \rho$) are never satisfied.  Thus the conditional statements in the definitions of the two properties are vacuously true, and so the empty relation is both symmetric and antisymmetric.

More interestingly, though, there are nontrivial examples.  Let’s think through what those might be. Suppose $(x,y) \in \rho$.  If $\rho$ is symmetric, then we require $(y,x) \in \rho$ as well.  If $\rho$ is antisymmetric, then, since $(x,y) \in \rho$ and $(y,x) \in \rho$, we must have $x = y$.  And this gives us a characterization of relations that are both symmetric and antisymmetric:

If a relation $\rho$ on a set $S$ is both symmetric and antisymmetric, then the only ordered pairs in $\rho$ are of the form $(x,x)$, where $x \in S$.

Thus, for example, the equality relation on the set of real numbers is both symmetric and antisymmetric.

Perhaps it’s clearer now the sense in which antisymmetric opposes symmetric.  Symmetric means that there cannot be one-way relationships between two different elements.  Antisymmetric means that there cannot be two-way relationships between two different elements.  Both definitions allow for a relationship between an element and itself.
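The definitions translate directly into code.  A small Python sketch (the predicate names are mine) checking both properties on a few relations over a finite set:

```python
def is_symmetric(rel):
    """(x, y) in rel implies (y, x) in rel."""
    return all((y, x) in rel for (x, y) in rel)

def is_antisymmetric(rel):
    """(x, y) and (y, x) both in rel implies x == y."""
    return all(x == y for (x, y) in rel if (y, x) in rel)

S = {1, 2, 3}
equality = {(x, x) for x in S}                         # both properties hold
less_equal = {(x, y) for x in S for y in S if x <= y}  # antisymmetric only
two_way = {(1, 2), (2, 1)}                             # symmetric only

assert is_symmetric(equality) and is_antisymmetric(equality)
assert is_antisymmetric(less_equal) and not is_symmetric(less_equal)
assert is_symmetric(two_way) and not is_antisymmetric(two_way)

# A relation with both properties contains only pairs of the form (x, x).
assert all(x == y for (x, y) in equality)
```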