Some Corrections to My Paper “A Combinatorial View of Sums of Powers”

A few months ago Mathematics Magazine published a paper of mine, “A Combinatorial View of Sums of Powers.” In it I give a combinatorial interpretation for the power sum \sum_{k=1}^n k^m, together with combinatorial proofs of two formulas for this power sum. (An earlier version of some of the results in this paper actually appeared in a blog post from several years ago.)

Recently, however, I received an email from Michael Maltenfort at Northwestern University, pointing out a couple of errors in the paper. I’m recording those errors here.


The most serious mistake Maltenfort pointed out to me is the paper’s claim that m is a nonnegative integer. The formulas actually require m to be a positive integer. For example, in the paper I state the following.

Theorem 1 (Original). The power sum \sum_{k=1}^n k^m is the number of functions f: [m+1] \to [n+1] such that, for all i \in [m], f(i) < f(m+1).

However, when m = 0 this doesn’t work. The power sum as I stated it entails adding up n copies of 1 and so evaluates to n. On the other hand, the set [m] is the set \{1, 2, \ldots, m\}, which is empty when m = 0. This means that all n+1 functions from [1] to [n+1] vacuously satisfy the condition in the theorem. The theorem as stated in the paper is therefore off by 1 in the case m = 0.

Maltenfort suggests changing the lower index on the sum to start at k = 0, so that we have the following.

Theorem 1 (Updated). The power sum \sum_{k=0}^n k^m is the number of functions f: [m+1] \to [n+1] such that, for all i \in [m], f(i) < f(m+1).

As long as we are fine with the convention that 0^0 = 1 (and I am a fan of this convention in a combinatorial setting), this allows the formula to hold in the case m = 0 as well.
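For readers who want to check the updated theorem numerically, here is a quick brute-force sketch in Python (the function names are mine; it relies on Python's own convention that 0**0 == 1):

```python
from itertools import product

def power_sum(m, n):
    # sum_{k=0}^n k^m; Python already uses the convention 0**0 == 1
    return sum(k**m for k in range(n + 1))

def count_functions(m, n):
    # Count functions f: [m+1] -> [n+1] with f(i) < f(m+1) for all i in [m],
    # representing f as the tuple (f(1), ..., f(m+1)) with values in {1, ..., n+1}.
    return sum(1 for f in product(range(1, n + 2), repeat=m + 1)
               if all(f[i] < f[m] for i in range(m)))

# The two counts agree for every small case, including m = 0.
for m in range(4):
    for n in range(5):
        assert power_sum(m, n) == count_functions(m, n), (m, n)
```

In particular, for m = 0 the condition is vacuous, so count_functions(0, n) returns n+1, matching the updated sum.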

Maltenfort also points out that the two formulas for the power sum need to be updated in order for them to hold for the m = 0 case. My paper has

Theorem 2 (Original): \displaystyle \sum_{k=1}^n k^m = \sum_{k=0}^m \binom{n+1}{k+1} \left\{ m \atop k \right\} k!, and

Theorem 3 (Original): \displaystyle \sum_{k=1}^n k^m = \sum_{k=0}^{m-1} \left\langle m \atop k \right\rangle \binom{n+k+1}{m+1}.

These should be updated to

Theorem 2 (Updated): \displaystyle \sum_{k=0}^n k^m = \sum_{k=0}^m \binom{n+1}{k+1} \left\{ m \atop k \right\} k!, and

Theorem 3 (Updated): \displaystyle \sum_{k=0}^n k^m = \sum_{k=0}^m \left\langle m \atop k \right\rangle \binom{n+k+1}{m+1}.
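Both updated formulas are easy to verify for small cases. Here is a Python sketch; the recurrences for the Stirling subset numbers and the Eulerian numbers are the standard ones, and the helper names are mine.

```python
from functools import lru_cache
from math import comb, factorial

@lru_cache(maxsize=None)
def stirling2(m, k):
    # Stirling subset number {m over k}: S(m,k) = k*S(m-1,k) + S(m-1,k-1)
    if m == 0:
        return 1 if k == 0 else 0
    if k <= 0:
        return 0
    return k * stirling2(m - 1, k) + stirling2(m - 1, k - 1)

@lru_cache(maxsize=None)
def eulerian(m, k):
    # Eulerian number <m over k>: A(m,k) = (k+1)*A(m-1,k) + (m-k)*A(m-1,k-1)
    if m == 0:
        return 1 if k == 0 else 0
    if k < 0 or k >= m:
        return 0
    return (k + 1) * eulerian(m - 1, k) + (m - k) * eulerian(m - 1, k - 1)

# Check Theorems 2 and 3 (updated) against the power sum, including m = 0.
for m in range(5):
    for n in range(6):
        power_sum = sum(k**m for k in range(n + 1))   # 0**0 == 1 in Python
        thm2 = sum(comb(n + 1, k + 1) * stirling2(m, k) * factorial(k)
                   for k in range(m + 1))
        thm3 = sum(eulerian(m, k) * comb(n + k + 1, m + 1)
                   for k in range(m + 1))
        assert power_sum == thm2 == thm3, (m, n)
```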


The other mistake in the paper is in the definition of the induced permutation \pi. The definition for \pi in the paper has the requirement

\displaystyle f(\pi(i)) = f(\pi(j)) \implies \pi(i) < \pi(j).

That should instead be

\displaystyle f(\pi(i)) = f(\pi(j)) \text{ and } i < j  \implies \pi(i) < \pi(j).

Thanks to Professor Maltenfort for these corrections!


References

Spivey, Michael Z. “A Combinatorial View of Sums of Powers,” Mathematics Magazine, 90 (2): 125-131, 2021.


A Connection Between the Riemann Zeta Function and the Power Sum

The Riemann zeta function \zeta(s) can be expressed as \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}, for complex numbers s whose real part is greater than 1. By analytic continuation, \zeta(s) can be extended to all complex numbers except where s = 1.

The power sum S_a(M) is given by S_a(M) = \sum_{n=1}^{M-1} n^a. (Sometimes M is given as the upper bound, but for this post it’s more convenient to use M-1.)

Recently I learned of a really nice relationship between the zeta function \zeta(-a) and the power sum S_a(M) for a \in \mathbb{N}. It’s due to Mináč [1], and I reproduce his proof here.

Theorem: For positive integers a, \displaystyle \zeta(-a) = \int_0^1 S_a(x) \, dx.

Now, of course M in S_a(M) is a positive integer, so what do we mean by integrating S_a(x) from 0 to 1? Well, it’s also known that S_a(M) is a polynomial of degree a+1 in M. For example,

\displaystyle S_1(M) = \sum_{n=1}^{M-1} n = \frac{M(M-1)}{2},

\displaystyle S_2(M) = \sum_{n=1}^{M-1} n^2 = \frac{M(M-1)(2M-1)}{6}.

So what we mean when we’re integrating S_a(x) from 0 to 1 is that we’re integrating the polynomial representation of S_a(x) from 0 to 1.

Proof. First, we need a few known facts about the Bernoulli numbers and Bernoulli polynomials:

  1. The known representation of the power sum in terms of Bernoulli polynomials: \displaystyle S_a(M) = \frac{B_{a+1}(M) - B_{a+1}(1)}{a+1}.
  2. The derivative of a Bernoulli polynomial: \displaystyle \frac{d}{dx} B_m(x) = m B_{m-1}(x).
  3. Some specific values of Bernoulli polynomials: \displaystyle B_m(1) = (-1)^m B_m(0) = (-1)^m B_m.
  4. All the odd Bernoulli numbers (except B_1) are 0.
  5. A relationship between the zeta function and Bernoulli numbers: \displaystyle \zeta(-a) = (-1)^a \frac{B_{a+1}}{a+1}.

Now, using these facts, we have

\displaystyle \int_0^1 S_a(x) \, dx =\int_0^1 \frac{B_{a+1}(x) - B_{a+1}(1)}{a+1} dx = \frac{B_{a+2}(1) - B_{a+2}(0)}{(a+1)(a+2)} + (-1)^a \frac{B_{a+1}}{a+1} = 0 + (-1)^a \frac{B_{a+1}}{a+1} = \zeta(-a). \blacksquare

For example, the theorem gives us that

\displaystyle 1 + 2 + 3 + \cdots = \zeta(-1) = \int_0^1 \left(\frac{x^2}{2} - \frac{x}{2}\right) \, dx = \left. \frac{x^3}{6} - \frac{x^2}{4} \right|_0^1 = \frac{1}{6} - \frac{1}{4} = - \frac{1}{12},

\displaystyle 1^2 + 2^2 + 3^2 + \cdots = \zeta(-2) = \int_0^1 \left(\frac{x^3}{3} - \frac{x^2}{2} + \frac{x}{6}\right) \, dx = \left. \frac{x^4}{12} - \frac{x^3}{6} + \frac{x^2}{12} \right|_0^1 = \frac{1}{12} - \frac{1}{6} + \frac{1}{12} = 0.
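These two integrals are easy to check with exact rational arithmetic. Here is a small Python sketch (the polynomial coefficients are taken straight from the examples above; the helper name is mine):

```python
from fractions import Fraction

def integrate_0_1(coeffs):
    # Integrate a polynomial sum(c_k x^k) over [0, 1]: the result is sum(c_k / (k+1)).
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(coeffs))

# S_1(x) = x^2/2 - x/2 and S_2(x) = x^3/3 - x^2/2 + x/6,
# listed by ascending power of x.
S1 = [0, Fraction(-1, 2), Fraction(1, 2)]
S2 = [0, Fraction(1, 6), Fraction(-1, 2), Fraction(1, 3)]

assert integrate_0_1(S1) == Fraction(-1, 12)   # zeta(-1)
assert integrate_0_1(S2) == 0                  # zeta(-2)
```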

References

[1] Ján Mináč, “A remark on the values of the Riemann zeta function,” Expositiones Mathematicae, 12: 459-462, 1994.


Intuition for the Dual in Linear Programming

One of the most important theoretical results in linear programming is that every LP has a corresponding dual program. Where, exactly, this dual comes from can often seem mysterious. Several years ago I answered a question on a couple of Stack Exchange sites giving an intuitive explanation for where the dual comes from. Those posts seem to have been appreciated, so I thought I would reproduce my answer here.


Suppose we have a primal problem as follows.

Primal = \begin{Bmatrix} \max &5x_1 - 6x_2 \\ s.t. &2x_1 - x_2 = 1\\ &x_1 + 3x_2 \leq 9 \\  &x_1 \geq 0 \end{Bmatrix}


Now, suppose we want to use the primal’s constraints as a way to find an upper bound on the optimal value of the primal. If we multiply the first constraint by 9, the second constraint by 1, and add them together, we get 9(2x_1 - x_2) + 1(x_1 +3 x_2) for the left-hand side and 9(1) + 1(9) for the right-hand side. Since the first constraint is an equality and the second is an inequality, this implies

19x_1 - 6x_2 \leq 18.

But since x_1 \geq 0, it’s also true that 5x_1 \leq 19x_1, and so

\displaystyle 5x_1 - 6x_2 \leq 19x_1 - 6x_2 \leq 18.

Therefore, 18 is an upper bound on the optimal value of the primal problem.

Surely we can do better than that, though. Instead of just guessing 9 and 1 as the multipliers, let’s let them be variables. Thus we’re looking for multipliers y_1 and y_2 to force

\displaystyle 5x_1 - 6x_2 \leq y_1(2x_1-x_2) + y_2(x_1 + 3x_2) \leq y_1(1) + y_2(9).

Now, in order for this pair of inequalities to hold, what has to be true about y_1 and y_2? Let’s take the two inequalities one at a time.


The first inequality: 5x_1 - 6x_2 \leq y_1(2x_1-x_2) + y_2(x_1 + 3x_2)

We have to track the coefficients of the x_1 and x_2 variables separately. First, we need the total x_1 coefficient on the right-hand side to be at least 5. Getting exactly 5 would be great, but since x_1 \geq 0, anything larger than 5 would also satisfy the inequality for x_1. Mathematically speaking, this means that we need 2y_1 + y_2 \geq 5.

On the other hand, to ensure the inequality for the x_2 variable we need the total x_2 coefficient on the right-hand side to be exactly -6. Since x_2 could be positive, we can’t go lower than -6, and since x_2 could be negative, we can’t go higher than -6 (as the negative value for x_2 would flip the direction of the inequality). So for the first inequality to work for the x_2 variable, we’ve got to have -y_1 + 3y_2 = -6.


The second inequality: y_1(2x_1-x_2) + y_2(x_1 + 3x_2) \leq y_1(1) + y_2(9)

Here we have to track the y_1 and y_2 variables separately. The y_1 variable comes from the first constraint, which is an equality constraint. It doesn’t matter whether y_1 is positive or negative; the equality constraint still holds. Thus y_1 is unrestricted in sign. However, the y_2 variable comes from the second constraint, which is a less-than-or-equal-to constraint. If we were to multiply the second constraint by a negative number, that would flip its direction and change it to a greater-than-or-equal constraint. To keep with our goal of upper-bounding the primal objective, we can’t let that happen. So the y_2 variable can’t be negative. Thus we must have y_2 \geq 0.

Finally, we want to make the right-hand side of the second inequality as small as possible, as we want the tightest upper bound possible on the primal objective. So we want to minimize y_1 + 9y_2.


Putting all of these restrictions on y_1 and y_2 together we find that the problem of using the primal’s constraints to find the best upper bound on the optimal primal objective entails solving the following linear program:

\begin{matrix} \text{Minimize } &y_1 + 9y_2 \\ \text{ subject to } &2y_1 + y_2 \geq 5 \\ &-y_1 + 3y_2 = -6\\ &y_2 \geq 0 \end{matrix}

And that’s the dual.
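To see weak duality in action numerically, here is a pure-Python sketch (no LP solver; the sampling scheme is mine). It parameterizes the feasible sets of both problems using their equality constraints and confirms that every sampled primal objective value is at most every sampled dual objective value, with both bounds meeting at the common optimum of 6.

```python
# Sample primal-feasible points: the equality forces x2 = 2*x1 - 1, and
# x1 + 3*x2 <= 9 then becomes x1 <= 12/7 (together with x1 >= 0).
primal_objs = []
for i in range(100):
    x1 = (12 / 7) * i / 99
    x2 = 2 * x1 - 1
    assert abs(2 * x1 - x2 - 1) < 1e-9 and x1 + 3 * x2 <= 9 + 1e-9
    primal_objs.append(5 * x1 - 6 * x2)

# Sample dual-feasible points: the equality forces y1 = 3*y2 + 6, with y2 >= 0,
# and 2*y1 + y2 >= 5 then holds automatically.
dual_objs = []
for i in range(100):
    y2 = 5 * i / 99
    y1 = 3 * y2 + 6
    assert 2 * y1 + y2 >= 5 - 1e-9 and y2 >= 0
    dual_objs.append(y1 + 9 * y2)

# Weak duality: every primal objective value is bounded by every dual one.
assert max(primal_objs) <= min(dual_objs) + 1e-9
# Here both bounds are attained (strong duality): optimum 6 at x = (0, -1), y = (6, 0).
assert abs(max(primal_objs) - 6) < 1e-9 and abs(min(dual_objs) - 6) < 1e-9
```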


It’s probably worth summarizing the implications of this argument for all possible forms of the primal and dual. The following table is taken from p. 214 of Introduction to Operations Research, 8th edition, by Hillier and Lieberman. They refer to this as the SOB method, where SOB stands for Sensible, Odd, or Bizarre, depending on how likely one would find that particular constraint or variable restriction in a maximization or minimization problem.

             Primal Problem                           Dual Problem
             (or Dual Problem)                        (or Primal Problem)

             Maximization                             Minimization

Sensible     <= constraint            paired with     nonnegative variable
Odd          =  constraint            paired with     unconstrained variable
Bizarre      >= constraint            paired with     nonpositive variable

Sensible     nonnegative variable     paired with     >= constraint
Odd          unconstrained variable   paired with     = constraint
Bizarre      nonpositive variable     paired with     <= constraint

The Sum of Cubes is the Square of the Sum

It’s fairly well-known, to those who know it, that

\displaystyle \left(\sum_{k=1}^n k \right)^2 = \frac{n^2(n+1)^2}{4} =  \sum_{k=1}^n k^3 .

In other words, the square of the sum of the first n positive integers equals the sum of the cubes of the first n positive integers.

It’s probably less well-known that a similar relationship holds for \tau, the function that counts the number of divisors of an integer:

\displaystyle \left(\sum_{d|n} \tau(d) \right)^2 = \sum_{d|n} \tau(d)^3 .

In this post we’re going to prove this formula for \tau.

First, it’s pretty easy to see what the value of \tau is for prime powers; i.e., integers of the form p^e, where p is prime. Since the only divisors of p^e are 1, p,  p^2, \ldots, p^e, the number of divisors of p^e is given by \tau(p^e) = e+1.

Let’s check the identity we’re trying to prove when n = p^e. We have

\displaystyle \left(\sum_{d|p^e} \tau(d) \right)^2 = \left(\tau(1) + \tau(p) + \tau(p^2) + \cdots + \tau(p^e)\right)^2 \\ = \left(1 + 2 + 3 + \cdots + (e+1)\right)^2 = \left( \frac{(e+1)(e+2)}{2}\right)^2.

We also have

\displaystyle \sum_{d|p^e} \tau(d)^3 = \tau(1)^3 + \tau(p)^3 +\tau(p^2)^3 + \cdots + \tau(p^e)^3 \\ = 1^3 + 2^3 + 3^3 + \cdots + (e+1)^3 = \left( \frac{(e+1)(e+2)}{2}\right)^2.

Clearly, the two expressions are equal, so the identity we’re trying to prove holds in the prime-power case. (And, in fact, the derivation uses the identity about the sum of the first several positive integers mentioned at the beginning of the post!)

Let f(n) = \sum_{d|n} \tau(d) and F(n) = \sum_{d|n} \tau(d)^3. What we’ve shown thus far is that f(p^e)^2 = F(p^e).

Now, we’re going to pull some elementary number theory. One fact about \tau is that it is multiplicative; i.e., \tau(mn) = \tau(m) \tau(n) when m and n are relatively prime. This is one of the first properties you learn about \tau once you learn its definition, and we’re not going to prove it here.

It turns out that both f and F are multiplicative as well! First, the product of two multiplicative functions is also multiplicative. (This is a one-line proof using the definition of multiplicative.) So \tau(d)^3 is multiplicative.

Another property of multiplicative functions is that if g is multiplicative, then \sum_{d|n} g(d) is also multiplicative. This is a special case of the more general result that the Dirichlet convolution of two multiplicative functions is multiplicative. (In the Dirichlet convolution g \star h, take h to be the constant function h(n) = 1.) This means that our functions f and F defined above are both multiplicative, and so is f^2, being a product of multiplicative functions.

Therefore, with the prime-power factorization of n given by n = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}, we have

\displaystyle \left(\sum_{d|n} \tau(d) \right)^2 = f(n)^2 = \left( f(p_1^{e_1}) f(p_2^{e_2}) \cdots f(p_k^{e_k}) \right)^2 = f(p_1^{e_1})^2 f(p_2^{e_2})^2 \cdots f(p_k^{e_k})^2 \\= F(p_1^{e_1}) F(p_2^{e_2}) \cdots F(p_k^{e_k}) = F(p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}) = F(n) = \sum_{d|n} \tau(d)^3.
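A direct numerical check of the identity is reassuring. Here is a short Python sketch (brute-force divisor counting; the function name is mine):

```python
def tau(n):
    # Number of divisors of n, by brute force.
    return sum(1 for d in range(1, n + 1) if n % d == 0)

# Verify (sum of tau over divisors)^2 == sum of tau^3 over divisors.
for n in range(1, 201):
    divisors = [d for d in range(1, n + 1) if n % d == 0]
    assert sum(tau(d) for d in divisors) ** 2 == sum(tau(d) ** 3 for d in divisors), n
```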

For an even further generalization, see Barbeau and Seraj [1]. (The proof we give in this post follows the same basic argument as the proof of Proposition 1 in Barbeau and Seraj’s paper.)

  1. Barbeau, Edward, and Samer Seraj, “Sum of cubes is square of sum,” Notes on Number Theory and Discrete Mathematics 19(1), 2013, pages 1–13.


Happy Birthday, Benoit Mandelbrot

Today’s Google doodle honors mathematician Benoit Mandelbrot. He would have been 96 today. If you’re interested in learning more about his life and work, the Google doodle link contains a short summary. If you want to go deeper, you can read the Wikipedia article on him.

I only have one small thing to add. A few years ago I wrote an interactive text-based game, A Beauty Cold and Austere. You can read some reviews of it and play it here, and you can read some of my thoughts about it here. The connection to Mandelbrot is that I used part of the Mandelbrot set for the cover art for the game, as you can see below.



No Integer Solutions to a Mordell Equation

Equations of the form x^3 = y^2 + k are called Mordell equations.  In this post we’re going to prove that the equation x^3 = y^2 -7 has no integer solutions, using (with one exception) nothing more complicated than congruences.

Theorem: There are no integer solutions to the equation x^3 = y^2 -7.

Proof.

Case 1.  First, suppose that there is a solution in which x is even.  Since 2|x, 8|x^3.  Looking at the equation modulo 8, then, we have

\displaystyle 0 \equiv y^2 - 7 \pmod{8} \implies y^2 \equiv 7 \pmod{8}.

However, if we square the residues \{0, 1, 2, 3, 4, 5, 6, 7\} and reduce them modulo 8 we have \{0, 1, 4, 1, 0, 1, 4, 1\}.  So there is no integer y that when squared is congruent to 7 modulo 8.  Therefore, there is no solution to x^3 = y^2 -7 when x is even.

Case 2.  Now, suppose there is a solution in which x is odd.  If so, then we have x \equiv 3 \pmod{4} or x \equiv 1 \pmod{4}.

If x \equiv 3 \pmod{4} then we have y^2 \equiv 27 + 7 \equiv 34 \equiv 2 \pmod{4}.  However, if we square the residues \{0, 1, 2, 3\} and reduce them modulo 4 we get \{0, 1, 0, 1\}.  So there is no integer y that when squared is congruent to 2 modulo 4.  Thus there is no solution to x^3 = y^2 - 7 when x \equiv 3 \pmod{4}.

Now, suppose x \equiv 1 \pmod{4}.  Applying some algebra to the original equation, we have

\displaystyle y^2 = x^3 + 7 \implies y^2 + 1 = x^3 + 8 = (x+2)(x^2 - 2x + 4).

By assumption, x + 2 \equiv 3 \pmod{4}.  If the prime factors of x + 2 were all congruent to 1 modulo 4, then their product (i.e., x+2) would be congruent to 1 modulo 4.  Thus x + 2 has at least one prime factor q that is congruent to 3 modulo 4.

However, if q|x+2, then q|y^2+1.   This means that y^2 \equiv -1 \pmod{q}.  However, thanks to Euler’s criterion, only odd primes q such that q \equiv 1 \pmod{4} can have a solution to y^2 \equiv -1 \pmod{q}.  Since q \equiv 3 \pmod{4}, y^2 \equiv -1 \pmod{q} has no solution.  Thus there is no solution to x^3 = y^2 - 7 when x \equiv 1 \pmod{4}, either.

Since we’ve covered all cases, there can be no solution in integers to the Mordell equation x^3 = y^2 - 7.
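The residue computations in the proof are easy to verify in Python, and a brute-force search over a small window corroborates the theorem (the window is arbitrary, of course; only the proof covers all integers):

```python
from math import isqrt

# Squares modulo 8 and modulo 4, as used in Cases 1 and 2.
assert {y * y % 8 for y in range(8)} == {0, 1, 4}   # 7 is not a square mod 8
assert {y * y % 4 for y in range(4)} == {0, 1}      # 2 is not a square mod 4

# Brute-force check: x^3 + 7 is never a perfect square for small x.
for x in range(-10, 31):
    t = x ** 3 + 7
    assert t < 0 or isqrt(t) ** 2 != t, x
```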


Some Video Learning Suggestions

During this COVID-tide many of us have been seeking out online learning resources.  I’ve done so quite a bit in the past few months, and I thought I would do a post to recommend some of these.  They are all sites that feature videos.  I’ll start with the math ones and then go on to other subjects.

Math

  • Khan Academy.  This one should be obvious to anyone who knows anything about learning math online.  The videos on this site are quite good – among the best online math videos I’ve seen.  The site also features regular self-assessments ranging from quick quizzes to end-of-course tests so that you can gauge your learning.  From what I can tell, Khan Academy is the gold standard for online learning, and I have no problems recommending it to my children (ages 9, 9, and 12) or to my college students.  (While Khan Academy got their start with math, they’ve branched out to many other subjects as well.)
  • Numberphile.  This is a YouTube channel sponsored by the Mathematical Sciences Research Institute.  It features a wide variety of mathematical topics generally aimed at those who know something about math and are curious to learn more.  I’ve only watched two Numberphile videos – both on Möbius strips – but they were both quite good, and I even learned something from them!  (I also showed them to my children.)  I wouldn’t normally recommend a YouTube channel based on just two videos, but their quality plus the fact that my college students have often spoken highly of Numberphile is sufficient to make me comfortable recommending this channel.
  • Prodigy.  This is an online role-playing game in which players have to solve math problems in order to defeat monsters in battle.  While I have some pedagogical criticisms of Prodigy, the game has good production values for its target audience (mostly elementary school children), it successfully ramps up the difficulty level as players demonstrate greater math knowledge, and all three of my kids have had fun with it.

Geography

  • Geography Now.  Paul Barbato deserves an award for this YouTube channel, in which he covers all the countries in the world in alphabetical order.  He’s funny and informative, giving an overview of the physical geography, the people, the culture, and the political relations of each country he discusses.  The production values are much better than any of the other general-interest geography channels I’ve seen, too.  He’s only up to the Seychelles at this point, so if you’re interested in countries like Spain or the UK you’ll have to wait a bit, but this has been the go-to site for geography homeschooling with my kids the past several months.  (“Hast du gluten-free?”  “Nein!” still brings a chuckle in my household.)
  • Touropia.  This is the best YouTube travel channel I’ve found.  I generally pair a country’s Touropia “Top Ten Places to Visit” video with the country’s corresponding Geography Now video for two different perspectives on the same country.

Science

  • National Geographic.  As you can imagine, this venerable institution has some high-quality videos out there.  I’m recommending them under “science” because we’ve only watched their science videos.
  • Mystery Science.  This is a series of hands-on video lessons for K-5.  My children’s teachers used them in the classroom pre-COVID, and I’ve used several as well for homeschooling.  The science is solid and at the appropriate level, and the hands-on activities help engage the children.  When I was using them more regularly back in the spring, some of the video lessons were available for free and others you had to purchase.  As of right now they are offering a limited number of free memberships that presumably allow you to access all the videos.

History

  • Crash Course in World History.  This YouTube series features an irreverent overview of world history that should appeal to people of all ages.  I haven’t watched any of their videos all the way through, but I’ve seen segments, my wife has used them when she’s homeschooling, and my kids have enjoyed them.  This link is just to the first video in the world history series, and they have other series as well (including U.S. history).  “Except for the Mongols!” has become a catchphrase in my house.

Philosophy

  • Philosophy of the Humanities.  These YouTube videos simply feature Leiden University professor Victor Gijsbers giving a series of lectures on the philosophy of science, history, knowledge, and the humanities in general.  There’s nothing fancy here, video-wise, but Victor is an amazingly clear lecturer, elucidating intricate philosophical concepts with clarity and easy-to-understand examples.  This is the only recommendation in this post that I haven’t used with my kids; this is advanced high school level material at the very least.  Favorite quote: “Hegel is evidently a comic historian.  Which, by the way, doesn’t mean we will laugh a lot when we read Hegel.  Because, believe me, we don’t.”  (I actually know Victor in another context: He and I have both written interactive fiction!)

English

  • Schoolhouse Rock.  If you were an American child in the late 70s or early 80s you will know exactly what this video series is, as it was ubiquitous during prime Saturday-morning cartoon-watching.  My kids and I viewed their videos on the eight parts of speech, but they have some history videos and even a solid video on how a bill becomes law as well.  Enjoy the 70s-style animation and music, and expect to emerge with a few extra earworms.  (“Conjunction junction, what’s your function?” and half of the interjections video continue to be repeated in my house.)  The link is to the YouTube channel, which doesn’t have all the videos.  However, they are easy to find online.

French

  • Get Started with French Like a Boss!.  I am not anywhere near fluent in French, but I did study it in high school and college, and I decided homeschooling would be a great opportunity to introduce my kids to a foreign language.  Lya, the host of this video that introduces basic French words and phrases, is funny and a little goofy in an endearing way.  She’s really good about using the vocabulary in different contexts, too, to the point that my 12-year-old was able to start picking up basic French sentence structure just from Lya’s examples.  This video is part of a larger series, but it’s the only one from that series I’ve seen.
  • Rock ‘N Learn.  This brightly-animated series aimed at young children pairs French vocabulary with corresponding English vocabulary.  My kids are older than the target audience for the series, but the vocabulary is right at their level, and they’re old enough that they can laugh about how goofy it is.  Rock ‘N Learn actually covers a wide variety of subjects, but we’ve only watched it for French.  (Also, they appear to have a series of French language videos for teens and adults, but I haven’t watched those yet.)

A Lesson on Converting Between Different Bases

We’re in the time of COVID-19, and that has meant taking far more direct responsibility for my children’s learning than I ever have before.  It’s been a lot of work, but it’s also been fun.  In fact, I’ve been surprised at how much I’ve enjoyed it.

One of these enjoyable aspects has been introducing my children to some mathematical concepts that are more advanced than they would normally get in third or sixth grade.  My sixth grader in particular is ready for some basic number theory, such as the representation of numbers in bases other than 10.

Here’s a problem I posed for him a few weeks ago, after making sure he understood the conversion concept.

Take the number 4217_8 and convert it to base 2.

Dutifully, he began converting 4217_8 to base 10.  It took him a minute or two, but he got the correct answer of 2191_10.  Then he started working on the conversion from base 10 to base 2.  I told him to tell me when he finished the calculation but not tell me what the answer is.  After another couple of minutes, he did so.  I then quickly wrote down the answer of 100010001111_2 off the top of my head.  His eyes bugged and his jaw dropped – a response that is always gratifying to see from a middle-schooler. 🙂

I didn’t keep him in suspense long, though.  Since 8 is a power of 2, there’s a fast way to convert between those two bases.  In particular, 2^3 = 8, so each digit in the base-8 representation of a number converts to a group of three binary digits.  For the example of 4217_8, we have 4_8 = 100_2, 2_8 = 010_2, 1_8 = 001_2, and 7_8 = 111_2.  (All of these base-2 representations I had in my head.)  String those four together to get

4217_8 = 100010001111_2.

This process goes in the other direction, too.  And let’s convert from binary to base 16, just to work with a base other than 8.  Thus, for example,

110101000111_2 = D47_16,

as 0111_2 = 7_16, 0100_2 = 4_16, and 1101_2 = D_16.  (Note that we have to do the conversion starting with the least significant digit; i.e., from right to left.)

This process works when converting between any two bases where one base is a positive integer power of the other.
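Here is a short Python sketch of the digit-grouping trick (function names mine).  Each octal digit expands to three bits, and each group of four bits collapses to one hex digit:

```python
def octal_to_binary(octal_str):
    # Convert each octal digit to a 3-bit group (since 2^3 = 8).
    bits = "".join(format(int(d, 8), "03b") for d in octal_str)
    return bits.lstrip("0") or "0"

def binary_to_hex(binary_str):
    # Pad on the left so that grouping effectively starts from the least
    # significant digit, then convert each 4-bit group (2^4 = 16) to a hex digit.
    padded = binary_str.zfill(-(-len(binary_str) // 4) * 4)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(format(int(g, 2), "X") for g in groups).lstrip("0") or "0"

assert octal_to_binary("4217") == "100010001111"
assert binary_to_hex("110101000111") == "D47"
```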


A Coin-Flipping Problem

One problem that I’ve assigned when discussing Markov chains is to calculate the expected number of flips required for a particular pattern to appear.  (Here I mean a pattern such as heads, heads, heads, or HHH.)  In this post I’m going to discuss another approach – one that doesn’t use Markov chains – to solve this problem.

Suppose we want to find the expected number of flips required for the pattern HHH to appear.  Call this X.  We can calculate X by conditioning on the various patterns we might achieve that do or don’t give us HHH.  For example, for our first few flips we could observe T, HT, HHT, or HHH.  These cover the entire outcome space and have probability 1/2, 1/4, 1/8, and 1/8, respectively.

  • If we observe T, then we effectively have to start the entire process over, and we’ve used one flip to get to that point.  So, if we observe T, then the average number of flips required is 1+X.
  • If we observe HT, then we also have to start the entire process over, and we’ve used two flips to get there.  For this outcome, the average number of flips required is 2+X.
  • If we observe HHT, then we have to start the entire process over, and we’ve used three flips.  The average number of flips required for this scenario is 3+X.
  • Finally, if we observe HHH, then we have achieved our goal.  The number of flips required is 3.

All together, then, the average number of flips required satisfies the equation

\displaystyle X = 1/2(1+X) + 1/4(2+X) + 1/8(3+X) + 1/8(3).

Solving for X, we obtain X = 14.  So it takes 14 flips, on average, to obtain three consecutive heads.


What if we have a more complicated pattern, though?  Let’s look at HHT as an example.  Let X be the expected number of flips required for HHT to appear for the first time.

Once again, we can condition on the various patterns we might achieve that do or don’t give us HHT.  These are T, HT, HHH, and HHT.  As in the previous example, these cover the entire outcome space and have probability 1/2, 1/4, 1/8, and 1/8, respectively.

  • If we observe T, then we have to start over.  The average number of flips required is 1+X.
  • If we observe HT, then we have to start over.  The average number of flips required is 2+X.
  • If we observe HHH, then we don’t have to start over.  It’s entirely possible that the HH at the end of HHH could be followed by a T, and then we would have achieved our pattern!  Mathematically, this means that things get a bit more complicated.
    • Let E_{HH} denote the expected number of flips required for HHT to appear given that we currently have HH.
    • Thus if we start with HHH, then the average number of flips required to obtain HHT is 3 + E_{HH}.
    • Now we need to determine E_{HH}.
      • If we currently have HH, then the next flip is either a T, which completes our pattern, or an H, which means we must start over in our quest to complete HHT given that we currently have HH.
      • Thus E_{HH} = 1/2(1) + 1/2(1 + E_{HH}).
      • Solving this yields E_{HH} = 2.
    • Thus if we observe HHH, then the average number of flips required to obtain HHT is 3 + 2 = 5.
  • Finally, if we observe HHT, then we have achieved our goal in 3 flips.

Therefore, X = 1/2(1 + X) + 1/4(2 + X) + 1/8(5) + 1/8(3).  Solving this equation for X gives us X = 8 flips.

Notice that it takes fewer flips, on average, to achieve HHT than it does HHH.  This is because we don’t always have to start over every time the sequence fails to match our goal sequence.

The interested reader is invited to find the expected number of flips for the other sequences.
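If you'd like to check these answers (or explore the other patterns), a quick Monte Carlo simulation in Python works well.  The function name and the tolerance are mine, and the estimates are random, so expect them only to land near 14 and 8 rather than hit them exactly.

```python
import random

def average_flips(pattern, trials=100_000, seed=2021):
    # Estimate the expected number of fair-coin flips until `pattern` appears.
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        recent, flips = "", 0
        while not recent.endswith(pattern):
            # Keep only the most recent len(pattern) flips.
            recent = (recent + rng.choice("HT"))[-len(pattern):]
            flips += 1
        total += flips
    return total / trials

est_hhh = average_flips("HHH")   # should be near 14
est_hht = average_flips("HHT")   # should be near 8
assert abs(est_hhh - 14) < 0.5 and abs(est_hht - 8) < 0.5
```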


A Request for a Proof of a Binomial Identity

A few weeks ago I received an email from Professor Steve Drekic at the University of Waterloo. He asked if I knew of a way to prove the following binomial identity:

\displaystyle \sum_{k=1}^n \frac{1}{2k-1} \binom{2k}{k} \binom{2n-2k}{n-k} = \binom{2n}{n}.

(He told me later that he wanted it to help prove that the random walk on the integers with transition probability p = 1/2 is null recurrent.)

It’s an interesting binomial identity, in that it has a nontrivial but not overly complicated sum on the left side and a simple expression on the right. Yet I had not seen it before. I was able to find a proof that uses generating functions, and I thought I would reproduce it here.

Lemma 1.

\displaystyle \binom{1/2}{k} = \frac{-1}{2k-1} \binom{-1/2}{k}.

Proof.
By Identity 17 in my book The Art of Proving Binomial Identities [1],

\displaystyle \binom{1/2}{k} = \frac{(1/2)^{\underline{k}}}{k!} = \frac{(1/2) (-1/2)^{\underline{k}}}{(-1/2-k+1) k!} = \frac{(-1/2)^{\underline{k}}}{-(2k-1) k!} = \frac{-1}{2k-1} \binom{-1/2}{k},

where in the second step we use properties of the falling factorial x^{\underline{k}}.

Next, we need the following generating function.

Lemma 2.

\displaystyle - \sqrt{1-4x} = \sum_{k=0}^{\infty} \frac{1}{2k-1} \binom{2k}{k} x^k.

Proof.

By Newton’s binomial series (Identity 18 in my book [1]),
\displaystyle  -\sqrt{1-4x} = -(1-4x)^{1/2} = -\sum_{k=0}^{\infty} \binom{1/2}{k} (-4x)^k \\ = -\sum_{k=0}^{\infty} \frac{-1}{2k-1} \binom{-1/2}{k} (-4x)^k, \text{ by Lemma 1} \\ = \sum_{k=0}^{\infty} \frac{1}{2k-1} \left(\frac{-1}{4}\right)^k \binom{2k}{k} (-4x)^k, \text{ by Identity 30 in [1]} \\ = \sum_{k=0}^{\infty} \frac{1}{2k-1} \binom{2k}{k} x^k.
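As a sanity check on Lemma 2, we can compare the exact Taylor coefficients of -(1-4x)^{1/2}, computed directly from Newton's binomial series, against \binom{2k}{k}/(2k-1).  Here is a Python sketch using exact rational arithmetic (the helper name is mine):

```python
from fractions import Fraction
from math import comb, factorial

def binom_half(k):
    # Generalized binomial coefficient C(1/2, k), computed exactly.
    acc = Fraction(1)
    for j in range(k):
        acc *= Fraction(1, 2) - j
    return acc / factorial(k)

# The coefficient of x^k in -(1-4x)^{1/2} is -C(1/2, k) * (-4)^k, which
# Lemma 2 says equals C(2k, k) / (2k - 1).  (Note the k = 0 term is -1.)
for k in range(12):
    assert -binom_half(k) * (-4) ** k == Fraction(comb(2 * k, k), 2 * k - 1), k
```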

Finally, we’re ready to prove the identity.

Identity 1.

\displaystyle \sum_{k=1}^n \frac{1}{2k-1} \binom{2k}{k} \binom{2n-2k}{n-k} = \binom{2n}{n}.

Proof.
Let a_k = \binom{2k}{k}/(2k-1), and let b_k = \binom{2k}{k}. Since the generating function for (a_k) is -\sqrt{1-4x} (Lemma 2), and the generating function for (b_k) is 1/\sqrt{1-4x} (Identity 150 in [1]), the generating function for their convolution,

\displaystyle c_n = \sum_{k=0}^n \frac{1}{2k-1} \binom{2k}{k} \binom{2n-2k}{n-k},

is, by the convolution property for generating functions (see Theorem 13 in [1]),

\displaystyle - \frac{\sqrt{1-4x}}{\sqrt{1-4x}} = -1.

Since -1 just generates the sequence -1, 0, 0, 0, \ldots, this means that

\displaystyle \sum_{k=0}^n \frac{1}{2k-1} \binom{2k}{k} \binom{2n-2k}{n-k} = \begin{cases} -1, & \: n = 0; \\ 0, & \: n \geq 1.\end{cases}

Therefore, when n \geq 1, we have
\displaystyle  \sum_{k=0}^n \frac{1}{2k-1} \binom{2k}{k} \binom{2n-2k}{n-k} = 0 \\ \implies \sum_{k=1}^n \frac{1}{2k-1} \binom{2k}{k} \binom{2n-2k}{n-k} - \binom{2n}{n} = 0 \\ \implies \sum_{k=1}^n \frac{1}{2k-1} \binom{2k}{k} \binom{2n-2k}{n-k} = \binom{2n}{n}.

References

1. Michael Z. Spivey, The Art of Proving Binomial Identities, CRC Press, 2019.
