An arcsecant/arctangent integral

Recently in my integral calculus class I assigned the problem of evaluating

\displaystyle \int \frac{dx}{x \sqrt{x^4-1}}.

I intended for the students to recognize the integrand as similar to the derivative of arcsecant:

\displaystyle \frac{d}{dx} \sec^{-1} x = \frac{1}{x \sqrt{x^2-1}}.

This leads to the substitution u = x^2, with du = 2x dx.  The original integral then becomes

\displaystyle \frac{1}{2} \int \frac{2x dx}{x^2 \sqrt{x^4-1}} = \frac{1}{2} \int \frac{du}{u \sqrt{u^2-1}} = \frac{1}{2} \sec^{-1} u + C = \frac{1}{2} \sec^{-1} (x^2) + C.
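(As a quick sanity check, independent of the algebra above, one can verify numerically that differentiating \frac{1}{2} \sec^{-1} (x^2) recovers the integrand.  The short Python sketch below uses the identity \sec^{-1} u = \arccos(1/u) for u \geq 1 and a central-difference approximation of the derivative.)

```python
import math

def integrand(x):
    # The integrand 1 / (x * sqrt(x^4 - 1)), defined for x > 1
    return 1.0 / (x * math.sqrt(x**4 - 1))

def antiderivative(x):
    # (1/2) arcsec(x^2), using arcsec(u) = arccos(1/u) for u >= 1
    return 0.5 * math.acos(1.0 / x**2)

# Compare a central-difference approximation of the antiderivative's
# derivative against the integrand at a few points x > 1.
h = 1e-6
max_err = max(
    abs((antiderivative(x + h) - antiderivative(x - h)) / (2 * h) - integrand(x))
    for x in [1.2, 2.0, 3.5]
)
```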

However, one of my students, Max, tried the substitution u = \sqrt{x^4-1} (on the general principle that letting u be the most complicated part of the integrand is a good idea).  This gives du = 2x^3 dx/\sqrt{x^4-1}.  Max then got stuck with this substitution.  The problem was in the inverse trig functions section of the text, though, and so he guessed \arctan (\sqrt{x^4-1}) as an antiderivative.  Checking this, he obtained

\displaystyle \frac{d}{dx} \arctan (\sqrt{x^4-1}) = \left(\frac{1}{x^4-1+1} \right) \left(\frac{1}{2}\right) \left(\frac{1}{\sqrt{x^4-1}}\right) \left(4x^3\right) = \frac{2}{x \sqrt{x^4-1}}.

Since this is only off by a factor of 2, he concluded that the answer is

\displaystyle \int \frac{dx}{x \sqrt{x^4-1}} = \frac{1}{2} \arctan (\sqrt{x^4-1}) + C!

Why are both of these answers correct?  Let’s complete the evaluation of the integral using Max’s substitution, and then we’ll show that the two answers are actually equal to each other.

With u = \sqrt{x^4-1}, du = 2x^3 dx/\sqrt{x^4-1}, and x^4 = u^2+1, we can rewrite the original integral and evaluate it via

\displaystyle \frac{1}{2} \int \frac{2x^3 dx}{x^4 \sqrt{x^4-1}} = \frac{1}{2} \int \frac{du}{u^2+1} = \frac{1}{2} \arctan u + C = \frac{1}{2} \arctan (\sqrt{x^4-1}) + C.

It’s easy to get stuck here because the correct evaluation doesn’t actually substitute u into the integrand (at least not by itself).  Instead, the radical becomes part of the du, and the remaining expression in x gets replaced by u^2+1.  The rest of the evaluation is surprisingly simple, but getting to here is the trick.

The reason the two answers are equal has to do with trigonometry.  If \theta = \sec^{-1} (x^2), then \sec \theta = x^2.  Using the trig identity \tan^2 \theta + 1 = \sec^2 \theta, we have \tan \theta = \sqrt{\sec^2 \theta - 1} = \sqrt{x^4-1}.  Thus \theta can also be expressed as \theta = \arctan (\sqrt{x^4-1}).  (This can also be seen by setting x^2 and 1 as the hypotenuse and adjacent side of a right triangle; the Pythagorean theorem plus the triangle definition of tangent gets you the rest of the way.)
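(For the skeptical, the trigonometric argument is easy to confirm numerically: the two antiderivatives agree pointwise for x > 1, not merely up to an additive constant.  A minimal Python sketch, again using \sec^{-1} u = \arccos(1/u):)

```python
import math

def via_arcsec(x):
    # (1/2) sec^{-1}(x^2), with arcsec(u) = arccos(1/u) for u >= 1
    return 0.5 * math.acos(1.0 / x**2)

def via_arctan(x):
    # Max's answer: (1/2) arctan(sqrt(x^4 - 1))
    return 0.5 * math.atan(math.sqrt(x**4 - 1))

# The two expressions agree at every tested point x > 1.
max_diff = max(abs(via_arcsec(x) - via_arctan(x)) for x in [1.1, 1.5, 2.0, 5.0])
```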

I’ve come across different-looking answers to the same integral evaluation before, but this is probably the most complicated different-looking pair of answers I’ve seen.

Posted in calculus

Tiling Proofs for the Sum of Even- or Odd-Indexed Binomial Coefficients

Two basic identities for the binomial coefficients are contained in the equation \displaystyle \sum_{k \geq 0} \binom{n}{2k} = \sum_{k \geq 0} \binom{n}{2k+1} = 2^{n-1}.   There are multiple ways to prove these identities (even if you restrict yourself to combinatorial proofs).  In this post I’m going to discuss a tiling or coloring-based combinatorial proof that I find aesthetically satisfying.

Proof: Suppose you have a 1 \times (n-1) board.  Since each square in the board can be colored either black or white, there are 2^{n-1} ways to color the board.

There are also n grooves in this board: n-2 grooves between the squares, plus a groove (numbered 0) before the first square and a groove after the last square.  Suppose we select an even number 2k of those grooves: a_1, a_2, \ldots, a_{2k}.  We can then construct a coloring of the board by coloring black the squares between grooves [a_{2i-1}, a_{2i}], 1 \leq i \leq k, and coloring the remaining squares white.  This produces a coloring with k runs of black tiles.  For example, if n = 10, selecting grooves 1, 3, 4, and 5 (so that k = 2) yields the coloring WBBWBWWWW.  There are \binom{n}{2k} ways to choose 2k of the n grooves.  Summing over all possible values of k, we have the number of ways to color the board as \displaystyle \sum_{k \geq 0} \binom{n}{2k}.

Instead of choosing an even number of grooves, we could also choose an odd number 2k+1 of grooves: a_0, a_1, \ldots, a_{2k}.  Now we color black the squares between grooves [a_{2i-1}, a_{2i}] as before, but we also color black the squares between grooves 0 and a_0.  Color the remaining squares white.  This selection might not have k+1 runs of black tiles because a_0 might be 0.  However, it will have exactly k transitions from a white run to a black run.  For example, selecting grooves 0, 1, 3, 4, and 5 produces the same coloring WBBWBWWWW as in the previous example.  It has two white-to-black transitions, and k = 2.  There are \binom{n}{2k+1} ways to choose 2k+1 of the n grooves.  Summing over all possible values of k, we also have the number of ways to color the board as \displaystyle \sum_{k \geq 0} \binom{n}{2k+1}.

Therefore, \displaystyle \sum_{k \geq 0} \binom{n}{2k} = \sum_{k \geq 0} \binom{n}{2k+1} = 2^{n-1}.
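(The correspondence in the proof can be confirmed by brute force for a small board.  The Python sketch below, my own verification rather than part of the proof, builds the coloring for every even-sized groove selection and checks that the map hits all 2^{n-1} colorings, using the same n = 10 example as above with squares indexed from 0.)

```python
from itertools import combinations
from math import comb

n = 10  # the board has n - 1 squares; grooves are numbered 0 through n - 1

def coloring_from_grooves(grooves):
    # Pair up the selected grooves (a_1, a_2), (a_3, a_4), ... and color
    # black every square lying between the two grooves of each pair.
    colors = ['W'] * (n - 1)
    for a, b in zip(grooves[0::2], grooves[1::2]):
        for square in range(a, b):  # square i sits between grooves i and i + 1
            colors[square] = 'B'
    return ''.join(colors)

# All colorings arising from even-sized groove selections
colorings = {
    coloring_from_grooves(sel)
    for k in range(n // 2 + 1)
    for sel in combinations(range(n), 2 * k)
}
even_sum = sum(comb(n, 2 * k) for k in range(n // 2 + 1))
odd_sum = sum(comb(n, 2 * k + 1) for k in range(n // 2 + 1))
```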

There are simpler combinatorial proofs of these identities.  For example, one can count the number of ways to form, out of n people, a committee with an even number of members (or, for the other identity, an odd number).  One can also construct a one-to-one mapping between even-sized committees and odd-sized committees, showing that there must be an equal number of each.  Proofs That Really Count [1] (p. 65 and p. 81, respectively) discusses both of these proofs.


  1.  Arthur Benjamin and Jennifer Quinn, Proofs That Really Count, MAA, 2003.
Posted in binomial coefficients, combinatorics

Combinatorial Proofs of Two Hagen-Rothe Identities in Concrete Mathematics

There are two binomial coefficient identities in Concrete Mathematics that have bothered me for years:  They look like they should have combinatorial proofs, but I couldn’t for the life of me think of what those might be.  (In Concrete Mathematics they are proved with generating functions.)  Well, I just recently found the combinatorial proofs I was looking for.

Here are the identities. [2, p. 202]

(1)  \displaystyle \sum_k \binom{tk+r}{k} \binom{tn-tk+s}{n-k} \frac{r}{tk+r} = \binom{tn+r+s}{n}.

(2)  \displaystyle \sum_k \binom{tk+r}{k} \binom{tn-tk+s}{n-k} \frac{r}{tk+r} \cdot \frac{s}{tn-tk+s} = \binom{tn+r+s}{n} \frac{r+s}{tn+r+s}.

The key idea I didn’t have before is the general ballot theorem, which states that the number of lattice paths from (0,0) to (n,m), m > rn, that never touch the line y = rx after (0,0) is given by \frac{m-rn}{m+n} \binom{m+n}{m}. This can be proved combinatorially; see, for example, Renault’s paper [5].
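(The ballot formula itself can be checked directly for small parameters by dynamic programming over the lattice.  This is a verification sketch of my own in Python, not Renault's proof; it counts up/right lattice paths that never touch the line after the origin.)

```python
from fractions import Fraction
from math import comb

def paths_avoiding(n, m, r):
    # Count lattice paths of unit up/right steps from (0,0) to (n, m)
    # that never touch the line y = r*x at a lattice point after (0,0).
    grid = [[0] * (m + 1) for _ in range(n + 1)]
    grid[0][0] = 1
    for x in range(n + 1):
        for y in range(m + 1):
            if (x, y) == (0, 0) or y == r * x:
                continue  # skip the seeded origin and forbidden line points
            grid[x][y] = ((grid[x - 1][y] if x > 0 else 0)
                          + (grid[x][y - 1] if y > 0 else 0))
    return grid[n][m]

def ballot_formula(n, m, r):
    # (m - r n) / (m + n) * C(m + n, m), valid for m > r n
    return Fraction(m - r * n, m + n) * comb(m + n, m)

# Spot-check the formula on a few (n, m, r) triples with m > r n
checks = [(3, 7, 2), (2, 5, 2), (4, 5, 1), (1, 2, 1)]
agree = all(paths_avoiding(n, m, r) == ballot_formula(n, m, r)
            for n, m, r in checks)
```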

Identity (1) counts the number of lattice paths from (0,0) to (n, (t-1)n + r + s).  The right-hand side is straightforward, since \binom{n+m}{n} counts the number of lattice paths from (0,0) to (n,m).  For the left side, condition on the number of right steps k a lattice path takes after touching the line y = (t-1)x+s for the last time.  By the general ballot theorem, the number of paths from (n-k, (t-1)(n-k)+s) to (n, (t-1)n + r + s) that never touch the line y = (t-1)x + s after the initial point is \frac{r}{tk+r} \binom{tk+r}{k}.  (This is because there are k right steps and (t-1)k + r up steps.)  Also, the number of lattice paths from (0,0) to (n-k, (t-1)(n-k) + s) is \binom{t(n-k)+s}{n-k}.  Summing over all possible values of k gives the left side of Identity (1).

Identity (2) is similar, although it counts the number of lattice paths from (0,0) to (n, (t-1)n + r + s) that never touch the line y = (t-1)x after (0,0).  The right side follows from the general ballot theorem.  For the left side, condition on the last place (k, (t-1)k + r) a lattice path touches the line y = (t-1)x + r.  By the general ballot theorem, there are \frac{r}{tk+r} \binom{tk+r}{k} paths from (0,0) to (k, (t-1)k + r) that never touch the line y = (t-1)x after (0,0).  Also by the general ballot theorem, there are \frac{s}{tn-tk+s} \binom{tn-tk+s}{n-k} lattice paths from (k, (t-1)k + r) to (n, (t-1)n + r + s) that never touch the line y = (t-1)x + r after the initial point.  Multiplying these together and summing over all values of k gives the left side.
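(Both identities are easy to confirm numerically for small positive integer parameters.  Here is a Python sketch using exact rational arithmetic; the parameter values are chosen arbitrarily.)

```python
from fractions import Fraction
from math import comb

def lhs1(t, r, s, n):
    # Left side of identity (1); terms outside 0 <= k <= n vanish
    return sum(Fraction(r, t * k + r)
               * comb(t * k + r, k) * comb(t * (n - k) + s, n - k)
               for k in range(n + 1))

def lhs2(t, r, s, n):
    # Left side of identity (2)
    return sum(Fraction(r, t * k + r) * Fraction(s, t * (n - k) + s)
               * comb(t * k + r, k) * comb(t * (n - k) + s, n - k)
               for k in range(n + 1))

def rhs1(t, r, s, n):
    return Fraction(comb(t * n + r + s, n))

def rhs2(t, r, s, n):
    return comb(t * n + r + s, n) * Fraction(r + s, t * n + r + s)

# Spot-check identities (1) and (2) over a few positive parameter choices
params = [(1, 1, 1, 1), (2, 3, 1, 4), (3, 2, 5, 3), (1, 4, 2, 6)]
ok1 = all(lhs1(*p) == rhs1(*p) for p in params)
ok2 = all(lhs2(*p) == rhs2(*p) for p in params)
```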

I don’t think these proofs are new; for example, Chu [1] mentions lattice path proofs of these identities in books by Mohanty [3] and Narayana [4].


  1. Wenchang Chu, “Elementary Proofs for Convolution Identities of Abel and Hagen-Rothe,” Electronic Journal of Combinatorics 17, Note #N24, 2010.
  2. Ronald L. Graham, Donald E. Knuth, and Oren Patashnik.  Concrete Mathematics, 2nd ed., Addison-Wesley, 1994.
  3. S. G. Mohanty, Lattice Path Counting and Applications, Academic Press, 1979.
  4. T. V. Narayana, Lattice Path Combinatorics with Statistical Applications, Univ. of Toronto Press, 1979.
  5. Marc Renault, “Lost (and Found) in Translation: André’s Actual Method and Its Application to the Generalized Ballot Problem,” American Mathematical Monthly 115 (4), 2008, pp. 358-363.
Posted in combinatorics, lattice paths

Vandermonde’s Identity from the Generalized Product Rule

Vandermonde’s Identity, \displaystyle \sum_{k=0}^r \binom{n}{k} \binom{m}{r-k} = \binom{n+m}{r}, is one of the more famous identities involving the binomial coefficients.  A standard way to prove it is with the following combinatorial argument.

How many ways are there to choose a committee of size r from a group of n men and m women?  By definition of the binomial coefficient, the answer is \binom{n+m}{r}.  Alternatively, condition on the number k of men on the committee.  There are \binom{n}{k} ways to choose the men, and then there are \binom{m}{r-k} ways to choose the remaining r-k members of the committee from the m women.  Summing over all possible values of k gives the left side.

In this post we’re going to give a much less standard proof of Vandermonde’s Identity: one via the generalized product rule for derivatives.  In addition, while the combinatorial proof requires n and m to be natural numbers, our proof holds for all real n and m (with r still a nonnegative integer).

The generalized product rule, sometimes known as Leibniz’s Generalized Rule, gives a formula for the nth derivative of the product of two functions.  Here it is:

\displaystyle (fg)^{(n)}(x) = \sum_{k=0}^n \binom{n}{k} f^{(k)}(x) g^{(n-k)}(x).

This formula can be easily proved by induction.
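(A direct way to convince yourself of the rule, short of writing out the induction, is to check it on polynomials, where every derivative can be computed exactly.  A Python sketch of my own, representing a polynomial by its list of coefficients:)

```python
from math import comb

def poly_mul(p, q):
    # Multiply polynomials given as coefficient lists (index = power of x)
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_deriv(p, n=1):
    # n-th derivative of a coefficient list
    for _ in range(n):
        p = [i * c for i, c in enumerate(p)][1:] or [0]
    return p

def poly_add(p, q):
    out = [0] * max(len(p), len(q))
    for i, c in enumerate(p):
        out[i] += c
    for i, c in enumerate(q):
        out[i] += c
    return out

def leibniz_deriv(f, g, n):
    # sum over k of C(n,k) * f^{(k)} * g^{(n-k)}
    total = [0]
    for k in range(n + 1):
        term = [comb(n, k) * c
                for c in poly_mul(poly_deriv(f, k), poly_deriv(g, n - k))]
        total = poly_add(total, term)
    return total

def trim(p):
    # Drop trailing zero coefficients so representations compare equal
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

f = [1, 2, 0, 3]   # 1 + 2x + 3x^3
g = [5, -1, 4]     # 5 - x + 4x^2
agree = all(trim(poly_deriv(poly_mul(f, g), n)) == trim(leibniz_deriv(f, g, n))
            for n in range(7))
```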

Let f(x) = x^n, and let g(x) = x^m.  We know that the rth derivative of a power function like x^n is n^{\underline{r}} x^{n-r}, where n^{\underline{r}} is the falling factorial n(n-1) \cdots (n-r+1).  Now, let’s look at the rth derivative of f(x) g(x) = x^{n+m}.  As we just argued, this is (n+m)^{\underline{r}} x^{n+m-r}.  By the generalized product rule, then, we have

\displaystyle (n+m)^{\underline{r}} x^{n+m-r} = \sum_{k=0}^r \binom{r}{k} n^{\underline{k}} x^{n-k} m^{\underline{r-k}} x^{m-r+k}

\displaystyle \implies (n+m)^{\underline{r}} x^{n+m-r} = \sum_{k=0}^r \frac{r!}{(r-k)! k!} n^{\underline{k}} m^{\underline{r-k}} x^{n+m-r}

\displaystyle \implies \frac{(n+m)^{\underline{r}}}{r!} x^{n+m-r} = \left(\sum_{k=0}^r \frac{n^{\underline{k}}}{k!} \frac{m^{\underline{r-k}}}{(r-k)!}\right) x^{n+m-r}.

The coefficients of the two expressions here must be equal in order for the expressions themselves to be equal.  In addition, using the definition of the generalized binomial coefficient, we have \binom{n}{k} = \frac{n^{\underline{k}}}{k!}.  (This allows binomial coefficients to be defined for values of n that are not natural numbers.)  All of this gives us Vandermonde’s Identity,

\displaystyle \binom{n+m}{r} = \sum_{k=0}^r \binom{n}{k} \binom{m}{r-k}.
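(Since the argument never used integrality of n and m, the identity can even be spot-checked at non-integer values.  A short Python sketch using the falling-factorial definition of the generalized binomial coefficient and exact rational arithmetic; the particular values n = 5/2 and m = -13/10 are arbitrary.)

```python
from fractions import Fraction
from math import factorial

def falling(x, k):
    # Falling factorial: x (x - 1) (x - 2) ... (x - k + 1)
    out = Fraction(1)
    for i in range(k):
        out *= x - i
    return out

def gbinom(x, k):
    # Generalized binomial coefficient C(x, k) = falling(x, k) / k!
    return falling(x, k) / factorial(k)

n = Fraction(5, 2)      # a non-integer "n"
m = Fraction(-13, 10)   # a negative non-integer "m"
r = 4
lhs = sum(gbinom(n, k) * gbinom(m, r - k) for k in range(r + 1))
rhs = gbinom(n + m, r)
```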

Posted in binomial coefficients, calculus, combinatorics

Combinatorial Proofs for the Binomial and Alternating Binomial Power Sums

In this post I give combinatorial proofs of formulas for the binomial and alternating binomial power sums: \displaystyle \sum_{k=0}^n \binom{n}{k} k^m and \displaystyle \sum_{k=0}^n \binom{n}{k} k^m (-1)^k.

Here’s the first.

Identity 1.

\displaystyle \sum_{k=0}^n \binom{n}{k} k^m = \sum_{j=0}^m \left\{ {m \atop j} \right\} \binom{n}{j} j! 2^{n-j}.

Proof.  Both sides count the number of functions from \{1, 2, \ldots, m\} to subsets of \{1, 2, \ldots, n\}.  For the left side, condition on the number of elements in the codomain.   If there are k elements in the codomain, there are \binom{n}{k} ways to choose those k elements from the n possible.  Once the k elements in the codomain are chosen, there are k choices for f(1), k choices for f(2), and so forth, for a total of k^m choices for the function values from f(1) to f(m).  Summing over all possible values of k gives the left side.

For the right side, condition on the number of elements in the image.  If there are j elements in the image, there are \binom{n}{j} ways of choosing these j from the n possible.  Then there are 2^{n-j} ways to complete the codomain, as any of the remaining n-j elements in \{1, 2, \ldots, n\} could either be in the codomain or not.  In addition, there are \left\{ {m \atop j } \right\} j! onto functions from a set with m elements to a set with j elements.  Finally, sum over all possible values of j.    \square
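(Identity 1 is easy to check by brute force for small n and m, with the Stirling numbers computed from their standard recurrence.  This is a verification sketch of my own, not part of the proof.)

```python
from functools import lru_cache
from math import comb, factorial

@lru_cache(maxsize=None)
def stirling2(m, j):
    # Stirling number of the second kind, via the recurrence
    # {m, j} = j * {m-1, j} + {m-1, j-1}
    if m == j:
        return 1
    if j == 0 or j > m:
        return 0
    return j * stirling2(m - 1, j) + stirling2(m - 1, j - 1)

def lhs(n, m):
    return sum(comb(n, k) * k**m for k in range(n + 1))

def rhs(n, m):
    # Terms with j > min(n, m) vanish, so the range can stop there
    return sum(stirling2(m, j) * comb(n, j) * factorial(j) * 2**(n - j)
               for j in range(min(n, m) + 1))

agree = all(lhs(n, m) == rhs(n, m) for n in range(1, 8) for m in range(1, 6))
```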

And the second.

Identity 2.

\displaystyle \sum_{k=0}^n \binom{n}{k} k^m (-1)^k = (-1)^n \left\{ m \atop n \right\} n!.

Proof.  We use the same interpretation as in the first proof.  The left side is the difference between the number of functions from \{1, 2, \ldots, m\} to subsets of \{1, 2, \ldots, n\} with an even-sized codomain and the number of such functions with an odd-sized codomain.  We construct a sign-reversing involution on this set of functions.

Given a function f: \{1, 2, \ldots, m\} \mapsto S, where S is a subset of \{1, 2, \ldots, n\}, let y be the largest-numbered element of \{1, 2, \ldots, n\} that is not in the image of f.  Define \sigma(f) to be the function g: \{1, 2, \ldots, m\} \mapsto T, where T \subseteq \{1, 2, \ldots, n\} such that f(x) = g(x) for all x \in \{1, 2, \ldots, m\} but that T = S \cup \{y\} if y \not\in S and T = S - \{y\} if y \in S.  In other words, f and g are the same function except that the codomain of exactly one of them contains the largest-numbered element not in their common image.  From this definition we can see that \sigma is both sign-reversing (as it maps functions with even-sized codomains to those with odd-sized codomains and vice versa) and an involution.  Thus the functions for which \sigma is defined are canceled out in the sum on the left.  The only functions for which \sigma is not defined are those for which y does not exist.  These are the onto functions from \{1, 2, \ldots, m\} to \{1, 2, \ldots, n\}.  There are \left\{ {m \atop n } \right\} n! of these, and their sign is determined by whether n is even or odd.  \square
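(Identity 2 can be verified the same brute-force way; again a sketch of my own, with the Stirling numbers computed from their standard recurrence.)

```python
from functools import lru_cache
from math import comb, factorial

@lru_cache(maxsize=None)
def stirling2(m, j):
    # Stirling number of the second kind, via its standard recurrence
    if m == j:
        return 1
    if j == 0 or j > m:
        return 0
    return j * stirling2(m - 1, j) + stirling2(m - 1, j - 1)

def alt_lhs(n, m):
    # The alternating binomial power sum
    return sum(comb(n, k) * k**m * (-1)**k for k in range(n + 1))

def alt_rhs(n, m):
    return (-1)**n * stirling2(m, n) * factorial(n)

agree = all(alt_lhs(n, m) == alt_rhs(n, m)
            for n in range(1, 8) for m in range(1, 8))
```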

For proofs of these two identities by using the formulas for \displaystyle \sum_{k=0}^n \binom{n}{k} k^{\underline{m}} and \displaystyle \sum_{k=0}^n \binom{n}{k} k^{\underline{m}} (-1)^k and then using the Stirling numbers of the second kind to convert from falling factorial powers to ordinary powers, see my paper [1].


[1]  Michael Z. Spivey.  “Combinatorial Sums and Finite Differences.”  Discrete Mathematics 307 (24): 3130-3146, 2007.

Posted in binomial coefficients, combinatorics, Stirling numbers

Independence of the Range and Minimum of a Sample from an Exponential Distribution

A few years ago I answered a question on math.SE about the distribution of the sample range from an exponential (1) distribution.  In my answer I claim that the range and the minimum of the sample are independent, thanks to the memoryless property of the exponential distribution.  Here’s a direct derivation of the independence of the range and minimum, for those (including the graduate student who contacted me about this last month!) for whom the memoryless justification is not sufficient.

Let Y_{(1)}, Y_{(2)}, \ldots, Y_{(n)} be the order statistics from a sample of size n from an exponential (1) distribution.  The joint distribution of Y_{(1)} and Y_{(n)}  is given by

\begin{aligned} f_{Y_{(1)}, Y_{(n)}}(y,z) &= \frac{n!}{(n-2)!} e^{-y} e^{-z} \left( (1 - e^{-z}) - (1-e^{-y}) \right)^{n-2} \\ &= n(n-1) e^{-y} e^{-z} (e^{-y} - e^{-z})^{n-2}, \text{ for } 0 < y < z. \end{aligned}

Now, let’s do a double variable transformation.  Let R = Y_{(n)} - Y_{(1)}; i.e., R is the sample range.  Let S = Y_{(1)}, the minimum.  Then Y_{(n)} = R + S and Y_{(1)} = S.  The Jacobian of the transformation is \begin{vmatrix} 1 & 1 \\ 0 & 1 \end{vmatrix} = 1.  Substituting, we have

\begin{aligned} f_{R,S}(r,s) &= n(n-1) e^{-s} e^{-r-s} (e^{-s} - e^{-r-s})^{n-2} \\ &= n(n-1) e^{-2s} e^{-r} (e^{-s})^{n-2} (1 - e^{-r})^{n-2} \\ &= (n-1) e^{-r} (1-e^{-r})^{n-2} n e^{-ns}, \text{ for } r, s > 0.\end{aligned}

Since f_{R,S}(r,s) factors into a function of r times a function of s, R and S are independent.  (We have f_R(r) = (n-1) e^{-r} (1-e^{-r})^{n-2}, r > 0; and f_S(s) = f_{Y_{(1)}}(s) = n e^{-ns}, s > 0.)
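(Simulation can’t establish independence, but it can check a necessary consequence: the correlation between R and S should be near zero, and the mean of S should be near 1/n.  A seeded Monte Carlo sketch in Python; the sample size and thresholds are my own arbitrary choices.)

```python
import math
import random

random.seed(2024)
n, trials = 5, 20000
r_vals, s_vals = [], []
for _ in range(trials):
    sample = sorted(random.expovariate(1.0) for _ in range(n))
    s_vals.append(sample[0])                # the minimum Y_(1)
    r_vals.append(sample[-1] - sample[0])   # the range Y_(n) - Y_(1)

def corr(a, b):
    # Sample correlation coefficient
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

c = corr(r_vals, s_vals)        # near 0, consistent with independence
mean_s = sum(s_vals) / trials   # E[Y_(1)] = 1/n = 0.2 for n = 5
```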

Posted in probability, statistics

A Double Euler-Mascheroni Sequence

What is the value of the sum \displaystyle \sum_{i=1}^n \sum_{j=1}^n \frac{1}{i+j}?  Furdui poses this as Question 1.28 in his book Limits, Series, and Fractional Part Integrals [1, p. 5], where he calls the sum a “double Euler-Mascheroni sequence.”  His solution converts the double sum into an expression containing an integral and an infinite series, and then he evaluates those.  In this post I’ll give a different derivation of Furdui’s expression for the value of the double sum – one that uses only standard summation manipulations.

Let \displaystyle H_n denote the nth harmonic number: \displaystyle H_n = \sum_{k=1}^n \frac{1}{k}.  The following lemma is proved in Concrete Mathematics [2, pp. 40-41] and is a straightforward application of summation by parts, but I’ll give a proof here anyway.

Lemma: \displaystyle \sum_{k=1}^n H_k = (n+1) H_{n+1} - (n+1).

Proof:  Swapping the order of summation, we have \displaystyle \sum_{k=1}^n H_k = \sum_{k=1}^n \sum_{j=1}^k \frac{1}{j} = \sum_{j=1}^n \sum_{k=j}^n \frac{1}{j} = \sum_{j=1}^n \frac{n+1-j}{j} = (n+1) \sum_{j=1}^n \frac{1}{j} - \sum_{j=1}^n 1 \\ = (n+1) H_n - n = (n+1) H_n - n + \frac{n+1}{n+1} - 1 = (n+1)H_{n+1} - (n+1).
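(The lemma is easy to spot-check with exact rational arithmetic; a short Python sketch:)

```python
from fractions import Fraction

def H(n):
    # n-th harmonic number as an exact fraction
    return sum(Fraction(1, k) for k in range(1, n + 1))

# Check sum_{k=1}^{n} H_k = (n+1) H_{n+1} - (n+1) for small n
lemma_holds = all(
    sum(H(k) for k in range(1, n + 1)) == (n + 1) * H(n + 1) - (n + 1)
    for n in range(1, 12)
)
```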

With the lemma in place, we’re now ready to derive an expression for this double Euler-Mascheroni sequence.

Theorem: \displaystyle \sum_{i=1}^n \sum_{j=1}^n \frac{1}{i+j} = (2n+1) H_{2n+1} - (2n+2) H_{n+1} + 1.

Proof: We have \displaystyle \sum_{i=1}^n \sum_{j=1}^n \frac{1}{i+j} = \sum_{i=1}^n (H_{n+i} - H_i) = \sum_{i=1}^n H_{n+i} - \sum_{i=1}^n H_i = \sum_{i=1}^{2n} H_i - 2 \sum_{i=1}^n H_i.

Applying the lemma, we obtain \displaystyle \sum_{i=1}^n \sum_{j=1}^n \frac{1}{i+j} = (2n+1) H_{2n+1} - (2n+1) - 2 \left( (n+1)H_{n+1} - (n+1)\right) \\ = (2n+1) H_{2n+1} - (2n+2)H_{n+1} + 1.
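(The closed form can likewise be verified for small n with exact rational arithmetic; a short Python sketch:)

```python
from fractions import Fraction

def H(n):
    # n-th harmonic number as an exact fraction
    return sum(Fraction(1, k) for k in range(1, n + 1))

def double_sum(n):
    # The double Euler-Mascheroni sum, computed directly
    return sum(Fraction(1, i + j)
               for i in range(1, n + 1) for j in range(1, n + 1))

def closed_form(n):
    return (2 * n + 1) * H(2 * n + 1) - (2 * n + 2) * H(n + 1) + 1

theorem_holds = all(double_sum(n) == closed_form(n) for n in range(1, 10))
```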


  1. Ovidiu Furdui, Limits, Series, and Fractional Part Integrals, Springer, 2013.
  2. Ronald L. Graham, Donald E. Knuth, and Oren Patashnik, Concrete Mathematics (2nd ed.), Addison-Wesley, 1994.
Posted in harmonic numbers