The Expected Maximum of Independent and Identically Distributed Geometric Random Variables

In my last post I derived the formula for the expected maximum of n independent and identically distributed (iid) exponential random variables.  If each random variable has rate parameter \lambda, then the expected maximum is \frac{1}{\lambda} \sum_{i=1}^n \frac{1}{i} = \frac{H_n}{\lambda}, where H_n is the nth harmonic number.

This post considers the discrete version of the same problem: the expected maximum of n independent and identically distributed geometric random variables.  Unfortunately, there is no known nice, closed-form expression for the expected maximum in the discrete case.  However, the answer for the corresponding exponential random variables turns out to be a good approximation.

Let X_1, X_2, \ldots X_n be n iid geometric(p) random variables.  Let T_n denote the maximum of the X_i variables.  Let q = 1- p.  Let the rate parameter for the corresponding exponential distribution be \lambda = - \log(1-p).  Now, we have

\displaystyle E[T_n] = \sum_{k=0}^{\infty} P(T_n > k) = \sum_{k=0}^{\infty} (1 - P(T_n \leq k))
\displaystyle = \sum_{k=0}^{\infty} (1 - P(X_1 \leq k) P(X_2 \leq k) \cdots P(X_n \leq k)) = \sum_{k=0}^{\infty} (1 - (1-q^k)^n),

as the cumulative distribution function of a geometric(p) random variable is 1 - (1-p)^k.

By viewing the infinite sum as right- and left-hand Riemann sum approximations of the corresponding integral we obtain

\displaystyle \int_0^{\infty} (1 - (1 - q^x)^n) dx \leq E[T_n] \leq 1 + \int_0^{\infty} (1 - (1 - q^x)^n) dx.

Now, let’s take a closer look at the integral that occurs on both sides of this inequality. With the variable switch u = 1 - q^x we have

\displaystyle \int_0^{\infty} (1 - (1 - q^x)^n) dx = -\frac{1}{\log q} \int_0^1 \frac{1 - u^n}{1-u} du = -\frac{1}{\log q} \int_0^1 \left(1 + u + \cdots + u^{n-1}\right) du
\displaystyle = -\frac{1}{\log q} \left(1 + \frac{1}{2} + \cdots + \frac{1}{n}\right) = -\frac{1}{\log q} H_n,
proving the fairly tight bounds

\displaystyle \frac{1}{\lambda} H_n \leq E[T_n] \leq 1 + \frac{1}{\lambda} H_n.

We can obtain a more precise approximation by using the Euler-Maclaurin summation formula for approximating a sum by an integral.  Up to a first-order error term, it says that

\displaystyle E[T_n] = \sum_{k=0}^{\infty} (1 - (1-q^k)^n) \approx \int_0^{\infty} (1 - (1 - q^x)^n) dx + \frac{1}{2},
yielding the approximation

\displaystyle E[T_n] \approx -\frac{1}{\log q} H_n + \frac{1}{2},

with error term given by
\displaystyle \int_0^{\infty} n (\log q) q^x (1 - q^x)^{n-1} \left(x - \lfloor x \rfloor - \frac{1}{2}\right) dx.
One can verify that this is quite small unless n is also small or q is extreme.

See also Bennett Eisenberg’s paper “On the expectation of the maximum of IID geometric random variables,” Statistics and Probability Letters 78 (2008), 135-143.

(I first posted this argument online as my answer to “Expectation of the maximum of IID geometric random variables” on Math Stack Exchange.)

Advertisements
This entry was posted in probability. Bookmark the permalink.

2 Responses to The Expected Maximum of Independent and Identically Distributed Geometric Random Variables

  1. unsampled says:

    Really enjoyed this and the similar article on the exponential. Do you have a similar derivation for the expected maximum of iid Poisson rvs?

    • mzspivey says:

      Thanks! I don’t have a corresponding derivation for Poisson random variables, and I don’t think I’ve ever seen one, either. An interesting idea for a future post, perhaps.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s