Independence of the Range and Minimum of a Sample from an Exponential Distribution

A few years ago I answered a question on math.SE about the distribution of the sample range from an exponential (1) distribution.  In my answer I claim that the range and the minimum of the sample are independent, thanks to the memoryless property of the exponential distribution.  Here’s a direct derivation of the independence of the range and minimum, for those (including the graduate student who contacted me about this last month!) for whom the memoryless justification is not sufficient.

Let Y_{(1)}, Y_{(2)}, \ldots, Y_{(n)} be the order statistics from a sample of size n from an exponential (1) distribution.  The joint distribution of Y_{(1)} and Y_{(n)}  is given by

\begin{aligned} f_{Y_{(1)}, Y_{(n)}}(y,z) &= \frac{n!}{(n-2)!} e^{-y} e^{-z} \left( (1 - e^{-z}) - (1-e^{-y}) \right)^{n-2} \\ &= n(n-1) e^{-y} e^{-z} (e^{-y} - e^{-z})^{n-2}, \text{ for } 0 < y < z. \end{aligned}

Now, let’s do a double variable transformation.  Let R = Y_{(n)} - Y_{(1)}; i.e., R is the sample range.  Let S = Y_{(1)}, the minimum.  Then Y_{(n)} = R + S and Y_{(1)} = S.  The Jacobian of the transformation is \begin{vmatrix} 1 & 1 \\ 0 & 1 \end{vmatrix} = 1.  Substituting, we have

\begin{aligned} f_{R,S}(r,s) &= n(n-1) e^{-s} e^{-r-s} (e^{-s} - e^{-r-s})^{n-2} \\ &= n(n-1) e^{-2s} e^{-r} (e^{-s})^{n-2} (1 - e^{-r})^{n-2} \\ &= (n-1) e^{-r} (1-e^{-r})^{n-2} n e^{-ns}, \text{ for } r, s > 0.\end{aligned}

Since f_{R,S}(r,s) factors into a function of r times a function of s, R and S are independent.  (We have f_R(r) = (n-1) e^{-r} (1-e^{-r})^{n-2}, r > 0; and f_{Y_{(1)}}(y) = n e^{-ny}, y > 0.)
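For empirical reassurance, here's a quick Monte Carlo sketch (my own addition; the sample size n = 5, seed, and trial count are arbitrary choices): it estimates the mean range separately on trials where the minimum falls below and above its median, which should agree under independence, and also checks that E[R] = H_{n-1} and E[S] = 1/n.

```python
import random

random.seed(1)
n, trials = 5, 200_000

mins, ranges = [], []
for _ in range(trials):
    sample = [random.expovariate(1.0) for _ in range(n)]
    mins.append(min(sample))
    ranges.append(max(sample) - min(sample))

# Independence check: the mean range on trials with a small minimum
# vs. a large minimum; both should estimate E[R] if R and S are independent.
median_s = sorted(mins)[trials // 2]
low  = [r for r, s in zip(ranges, mins) if s <= median_s]
high = [r for r, s in zip(ranges, mins) if s > median_s]
mean_low, mean_high = sum(low) / len(low), sum(high) / len(high)

H_nm1 = sum(1 / k for k in range(1, n))   # H_{n-1} = E[R] for Exp(1)
print(mean_low, mean_high, H_nm1, sum(mins) / trials, 1 / n)
```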

Posted in probability, statistics | Leave a comment

A Double Euler-Mascheroni Sequence

What is the value of the sum \displaystyle \sum_{i=1}^n \sum_{j=1}^n \frac{1}{i+j}?  Furdui poses this as Question 1.28 in his book Limits, Series, and Fractional Part Integrals [1, p. 5], where he calls the sum a “double Euler-Mascheroni sequence.”  His solution converts the double sum into an expression containing an integral and an infinite series, and then he evaluates those.  In this post I’ll give a different derivation of Furdui’s expression for the value of the double sum – one that uses only standard summation manipulations.

Let \displaystyle H_n denote the nth harmonic number: \displaystyle H_n = \sum_{k=1}^n \frac{1}{k}.  The following lemma is proved in Concrete Mathematics [2, pp. 40-41] and is a straightforward application of summation by parts, but I’ll give a proof here anyway.

Lemma: \displaystyle \sum_{k=1}^n H_k = (n+1) H_{n+1} - (n+1).

Proof:  Swapping the order of summation, we have \displaystyle \sum_{k=1}^n H_k = \sum_{k=1}^n \sum_{j=1}^k \frac{1}{j} = \sum_{j=1}^n \sum_{k=j}^n \frac{1}{j} = \sum_{j=1}^n \frac{n+1-j}{j} = (n+1) \sum_{j=1}^n \frac{1}{j} - \sum_{j=1}^n 1 \\ = (n+1) H_n - n = (n+1) H_n - n + \frac{n+1}{n+1} - 1 = (n+1)H_{n+1} - (n+1).
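The lemma is easy to check exactly with rational arithmetic; here's a short sketch using Python's fractions module (the range of n tested is an arbitrary choice):

```python
from fractions import Fraction

def H(m):
    # m-th harmonic number as an exact rational
    return sum(Fraction(1, k) for k in range(1, m + 1))

for n in range(1, 30):
    lhs = sum(H(k) for k in range(1, n + 1))
    rhs = (n + 1) * H(n + 1) - (n + 1)
    assert lhs == rhs
print("lemma verified for n = 1..29")
```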

With the lemma in place, we’re now ready to derive an expression for this double Euler-Mascheroni sequence.

Theorem: \displaystyle \sum_{i=1}^n \sum_{j=1}^n \frac{1}{i+j} = (2n+1) H_{2n+1} - (2n+2) H_{n+1} + 1.

Proof: We have \displaystyle \sum_{i=1}^n \sum_{j=1}^n \frac{1}{i+j} = \sum_{i=1}^n (H_{n+i} - H_i) = \sum_{i=1}^n H_{n+i} - \sum_{i=1}^n H_i = \sum_{i=1}^{2n} H_i - 2 \sum_{i=1}^n H_i.

Applying the lemma, we obtain \displaystyle \sum_{i=1}^n \sum_{j=1}^n \frac{1}{i+j} = (2n+1) H_{2n+1} - (2n+1) - 2 \left( (n+1)H_{n+1} - (n+1)\right) \\ = (2n+1) H_{2n+1} - (2n+2)H_{n+1} + 1.
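As a sanity check (my addition; the cutoff n = 24 is arbitrary), we can brute-force the double sum with exact rational arithmetic and compare it with the closed form:

```python
from fractions import Fraction

def H(m):
    # m-th harmonic number as an exact rational
    return sum(Fraction(1, k) for k in range(1, m + 1))

for n in range(1, 25):
    double_sum = sum(Fraction(1, i + j)
                     for i in range(1, n + 1)
                     for j in range(1, n + 1))
    closed_form = (2*n + 1) * H(2*n + 1) - (2*n + 2) * H(n + 1) + 1
    assert double_sum == closed_form
print("identity verified for n = 1..24")
```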


  1. Ovidiu Furdui, Limits, Series, and Fractional Part Integrals, Springer, 2013.
  2. Ronald L. Graham, Donald E. Knuth, and Oren Patashnik, Concrete Mathematics (2nd ed.), Addison-Wesley, 1994.
Posted in harmonic numbers | Leave a comment

Picking Random Points on a Circle (Not a Disk)

How do you select points at random on a circle?  By “circle” I mean the outside of a disk, not its interior.  In this post I’m going to discuss two methods: (1) Selecting an angle at random on [0, 2\pi] and taking the point that corresponds to that angle, and (2) Selecting an x-coordinate at random and taking the point that corresponds to that x-value.  (Actually, there are two possible points in Case (2), but we can subsequently choose either of those with a 50% probability.)  My main purpose is to show that the second method does not actually select points at random on the circle; instead, that method chooses points more likely to be near the top and bottom of the circle than near the sides.

First, we have to ask what it means, in a precise mathematical sense, to pick points at random on a circle.  I’m going to take this to mean the following:

A process that selects a point at random on a circle is one such that the probability of picking a point from an arc of the circle is proportional to the length of that arc.

For example, there should be a probability of 1/4 of picking a point from the upper-right quadrant of the circle, a probability of 1/2\pi of picking a point from an arc of length 1, and in general a probability of L/2\pi of picking a point from an arc of length L.

For simplicity’s sake, let’s use a unit circle.  Suppose we pick an angle at random from [0, 2\pi] and take the corresponding point on the circle.  The probability of selecting an angle from a particular interval of length L is L/2\pi.  Since the length of an arc of the unit circle is just \theta, where \theta is the angle subtended by that arc, the probability of this process selecting a point from an arc of length L is also L/2\pi.  Thus selecting an angle at random and taking the point that corresponds to that angle is a process that selects a random point on the circle.

Why doesn’t selecting an x-coordinate do the same thing?  Let’s focus on just the upper-right quarter of the circle and select an x value at random from [0,1].  Then we take the point (x, \sqrt{1-x^2}) on the circle.  Since x is related to \theta by the relationship x = \cos \theta, and \cos \theta is decreasing on [0, \pi/2], the probability of the point (x, \sqrt{1-x^2}) ending up in the range [0, \pi/4] of \theta values is the probability that x is chosen from the interval [ \cos(\pi/4), \cos(0)] = [\sqrt{2}/2, 1].  This probability is not 1/2 (as it would be if this process selected a point at random on the unit quarter-circle) but 1 - \sqrt{2}/2 \approx 0.292.  Thus choosing a random x from [0,1] and taking the corresponding point on the unit quarter-circle tends to produce more points near the top of the circle than near the side.

Generalizing from the example, the probability that the x-coordinate procedure produces a point on the circle with an angle in the range [0, \theta] is 1 - \cos \theta.  This means that the probability of the x-coordinate procedure yielding a value in a small range of width \Delta \theta around a fixed \theta value is approximately \frac{d}{d \theta} (1 - \cos \theta) \, \Delta \theta = \sin \theta \, \Delta \theta.  Since the sine function increases on [0, \pi/2], the closer \theta is to \pi/2, the more likely it is that the x-coordinate procedure will produce a value near \theta.
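A small simulation makes the bias concrete (my addition; quarter-circle version, with an arbitrary seed and trial count): the angle method lands in [0, \pi/4] about half the time, while the x-coordinate method does so only about 29% of the time, matching 1 - \sqrt{2}/2.

```python
import math
import random

random.seed(2)
trials = 200_000

# Method 1: pick an angle uniformly on [0, pi/2].
hits_angle = sum(random.uniform(0, math.pi / 2) <= math.pi / 4
                 for _ in range(trials))

# Method 2: pick x uniformly on [0, 1]; the point (x, sqrt(1 - x^2))
# has angle arccos(x).
hits_x = sum(math.acos(random.uniform(0, 1)) <= math.pi / 4
             for _ in range(trials))

p_angle = hits_angle / trials   # should be near 1/2
p_x = hits_x / trials           # should be near 1 - sqrt(2)/2 ≈ 0.293
print(p_angle, p_x)
```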

Posted in probability, simulation, statistics | 2 Comments

Proof of the Recursive Identity for the Bernoulli Numbers

The Bernoulli numbers B_n satisfy the recursive relationship \displaystyle \sum_{k=0}^n \binom{n+1}{k} B_k = [n=0].  (Here, [P] is the Iverson bracket, where [P] evaluates to 1 if P is true and 0 if P is false.)  The relationship can be used to calculate the Bernoulli numbers fairly easily.  This post gives a proof of this relationship.

The exponential generating function for the Bernoulli numbers is given by \displaystyle \frac{x}{e^x-1} = \sum_{n=0}^{\infty} B_n \frac{x^n}{n!}.  The exponential generating function for the sequence consisting of 0 followed by an infinite number of 1’s is given by \displaystyle e^x-1 = \sum_{n=1}^{\infty} \frac{x^n}{n!}.  Multiplying these together, we obtain, thanks to the multiplication principle for exponential generating functions (see, for example, Concrete Mathematics [1], p. 365)

\displaystyle x = \left(\sum_{n=1}^{\infty} \frac{x^n}{n!} \right) \left( \sum_{n=0}^{\infty} B_n \frac{x^n}{n!}\right) = \sum_{n=0}^{\infty} \left(\sum_{k=0}^{n-1} \binom{n}{k} B_k \right) \frac{x^n}{n!}.

But x has generating function \displaystyle \sum_{n=0}^{\infty} [n=1] \frac{x^n}{n!}.  Thus we have \displaystyle \sum_{k=0}^{n-1} \binom{n}{k} B_k = [n=1].  Replacing n-1 with n, we have the formula we were trying to prove:

\displaystyle \sum_{k=0}^n \binom{n+1}{k} B_k = [n=0].
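Since the recurrence can be used to calculate the Bernoulli numbers, here's a sketch doing exactly that with exact rational arithmetic (my addition; the cutoff n = 8 is arbitrary).  At each step the recurrence is solved for B_n, whose coefficient is \binom{n+1}{n} = n+1.

```python
from fractions import Fraction
from math import comb

def bernoulli(n_max):
    # Solve sum_{k=0}^{n} C(n+1, k) B_k = [n = 0] for B_n at each n.
    B = []
    for n in range(n_max + 1):
        rhs = Fraction(1 if n == 0 else 0)
        s = sum(comb(n + 1, k) * B[k] for k in range(n))
        B.append((rhs - s) / (n + 1))   # coefficient of B_n is n + 1
    return B

B = bernoulli(8)
print(B)
```

This reproduces the familiar values B_0 = 1, B_1 = -1/2, B_2 = 1/6, B_4 = -1/30, with the odd-index Bernoulli numbers past B_1 equal to zero.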

(As a side note, the Concrete Mathematics reference, p. 365, derives the exponential generating function for the Bernoulli numbers from the recurrence relation.)


1.  Ronald L. Graham, Donald E. Knuth, and Oren Patashnik.  Concrete Mathematics, 2nd edition.  Addison-Wesley, 1994.

Posted in Bernoulli numbers, real analysis, recurrence relations | 1 Comment

Does Pointwise Convergence to a Continuous Function Imply Uniform Convergence?

Recently in my advanced calculus class we discussed how the uniform limit of a sequence of continuous functions is itself continuous.  One of my students turned the question around: If the pointwise limit of a sequence of continuous functions is continuous, does that mean that the convergence is uniform?

The answer is “No,” although I couldn’t think of a counterexample off the top of my head.  I found one (not surprisingly) on Math.SE.  Although there is some good discussion and there are multiple good answers to the question there, I’m going to discuss Ilmari Karonen’s answer.

Let \displaystyle f_n: [0,1] \to \mathbb{R} be given by

\displaystyle f_n(x) = \begin{cases} nx, & 0 \leq x \leq 1/n; \\ 2 - nx, & 1/n < x \leq 2/n; \\ 0, & \text{ otherwise}. \end{cases}

Each f_n is continuous (it is defined piecewise, but the pieces agree at the breakpoints 1/n and 2/n).  For any x \in (0,1], there exists N such that x > 2/n for all n \geq N, so f_n(x) = 0 for all n \geq N.  Since also f_n(0) = 0 for all n, this means that f_n(x) \to 0 for all x \in [0,1].  The zero function is clearly continuous.

However, for each n \in \mathbb{N} we have that f_n(1/n) = 1.  This means that when \epsilon < 1 there cannot be an N \in \mathbb{N} such that |f_n(x)| < \epsilon for all x \in [0,1] and n \geq N.  Thus the convergence cannot be uniform.
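Here's a small numerical sketch of the counterexample (my addition; the sample points and values of n are arbitrary): each fixed x is eventually past the bump, yet every f_n still attains the value 1 at x = 1/n.

```python
def f(n, x):
    # The triangular bump: slope n up on [0, 1/n], down on (1/n, 2/n], 0 after.
    if 0 <= x <= 1 / n:
        return n * x
    if 1 / n < x <= 2 / n:
        return 2 - n * x
    return 0.0

# Pointwise convergence to 0: for any fixed x, f_n(x) is eventually 0.
values = [f(1000, x) for x in (0.0, 0.01, 0.3, 1.0)]
print(values)

# But sup |f_n| = f_n(1/n) = 1 for every n, so the convergence is not uniform.
peaks = [f(n, 1 / n) for n in (1, 10, 100, 1000)]
print(peaks)
```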

Posted in real analysis | 2 Comments

A Simple Proof That the p-Series Converges for p > 1

Proving that the p-series \displaystyle \sum_{n=1}^{\infty} \frac{1}{n^p} converges for p > 1 is a standard exercise in second-semester calculus.  It’s also an important property to know when establishing the convergence of other series via a comparison test.  The usual way I do this in class is with the integral test.  However, I saw a different, simple proof by joriki on Math.SE a few years back.  That proof is the subject of this post.

Let S_n be the nth partial sum \displaystyle S_n = \sum_{k=1}^n \frac{1}{k^p}.  Clearly, \{S_n\} is increasing.  We have

\displaystyle S_{2n+1} = \sum_{k=1}^{2n+1} \frac{1}{k^p} = 1 + \sum_{i=1}^n \left( \frac{1}{(2i)^p} + \frac{1}{(2i+1)^p} \right) < 1 + \sum_{i=1}^n \frac{2}{(2i)^p}

\displaystyle = 1 + 2^{1-p} S_n < 1 + 2^{1-p} S_{2n+1}.

Solving for S_{2n+1} yields

\displaystyle S_{2n+1} < \frac{1}{1-2^{1-p}}.

Thus the sequence \{S_n\} is bounded above.  By the Monotone Convergence Theorem, then, \{S_n\} converges.  And, of course, since the sequence of partial sums converges, \displaystyle \sum_{n=1}^{\infty} \frac{1}{n^p} converges.
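A quick numerical sketch for p = 2 (my addition; the choice of p and of partial-sum lengths is arbitrary): the partial sums stay below the bound 1/(1 - 2^{1-p}) = 2 while creeping up toward \pi^2/6 \approx 1.6449.

```python
def partial_sum(p, N):
    # N-th partial sum of the p-series
    return sum(1 / k**p for k in range(1, N + 1))

p = 2.0
bound = 1 / (1 - 2**(1 - p))   # equals 2 for p = 2
sums = [partial_sum(p, N) for N in (10, 100, 10_000)]
print(sums, bound)
```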

One use of this proof, as a commenter on joriki’s post mentions, is when doing the series material at the beginning of a Calculus II course rather than at the end.  The only piece of content from the rest of Calculus II that you need for the series material is the integral test, and the primary use of the integral test is to prove convergence of the p-series when p > 1.  With this proof one can establish this convergence without knowing improper integration or the integral test.

Posted in calculus, sequences and series | Leave a comment

Proof of the Irrationality of e

In a previous post I proved that \log_{10} 5 is irrational.  In this post I prove the irrationality of e.

A proof of the irrationality of e must start by defining e.  There are several ways to do that.  We’ll take e to be the unique positive number b such that \dfrac{d}{dx} b^x = b^x.  (This property itself can be proved from other ways to define e.  See, for example, Fitzpatrick’s Advanced Calculus [1], Section 5.2.)  We’ll also make the assumptions that (1) e < 3 and (2) e^x is an increasing function.

Under this definition of e, the Lagrange Remainder Theorem says that, given x > 0 and n \in \mathbb{N}, there exists c \in (0,x) such that

\displaystyle e^x = 1 + x + \frac{x^2}{2} + \cdots + \frac{x^n}{n!} + \frac{e^c x^{n+1}}{(n+1)!}.

Taking x = 1, we obtain, for some c \in (0,1),

\displaystyle e = 1 + 1 + \frac{1}{2} + \cdots + \frac{1}{n!} + \frac{e^c}{(n+1)!}.

\displaystyle \implies 0 < e - \left[ 2 + \frac{1}{2} + \cdots + \frac{1}{n!}\right] = \frac{e^c }{(n+1)!} \leq \frac{e}{(n+1)!} \leq \frac{3}{(n+1)!}.

Now, suppose e is rational.  Then there exist m_0, n_0 \in \mathbb{N} with e = m_0/n_0.  Therefore,

\displaystyle 0 < \frac{m_0}{n_0} - \left[ 2 + \frac{1}{2} + \cdots + \frac{1}{n!} \right] \leq \frac{3}{(n+1)!}.

Multiplying by n!, this becomes

\displaystyle 0 < n! \frac{m_0}{n_0} - n! \left[ 2 + \frac{1}{2} + \cdots + \frac{1}{n!} \right] \leq \frac{3}{n+1}.

Since the above inequality is true for all n \in \mathbb{N}, it is true for n = \max \{n_0, 3\}.  For this value of n,  \displaystyle n!\frac{m_0}{n_0} - n!\left[ 2 + \frac{1}{2} + \cdots + \frac{1}{n!}\right] is an integer, and 3/(n+1) \leq 3/4.

This means that there is an integer in the interval (0,3/4).  However, we know that there is no integer in the interval (0,1).  Contradiction; thus e is irrational.
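As a numerical sketch of the key inequality (my addition; floating point, so only a sanity check, and the range of n is arbitrary): multiplying the remainder by n! indeed leaves a number trapped in (0, 3/(n+1)], which shrinks below 1 as n grows.

```python
import math

# Check 0 < n!*e - n!*(2 + 1/2! + ... + 1/n!) <= 3/(n+1) for small n.
for n in range(1, 15):
    truncated = sum(1 / math.factorial(k) for k in range(n + 1))
    gap = math.factorial(n) * (math.e - truncated)
    print(n, gap, 3 / (n + 1))
    assert 0 < gap <= 3 / (n + 1)
```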

[1] Patrick M. Fitzpatrick, Advanced Calculus, American Mathematical Society, 2006.

Posted in irrational numbers, real analysis | Leave a comment