## The Secretary Problem With the Two Best

The secretary problem is the following: Suppose a manager wants to hire the best person for his secretary out of a group of n candidates.  He interviews the candidates one by one.  After interviewing a particular candidate he must either (1) hire the candidate he just interviewed, rejecting all other candidates, or (2) reject the current candidate and interview the next candidate in the list.  (A manager cannot change his mind and hire a previously-rejected candidate.)

What’s the best strategy for the manager?  The best strategy turns out to be to reject the first $n/e$ candidates (in other words, reject about the first 37% of candidates) and then hire the next one who is better than all candidates seen thus far.  Somewhat surprisingly, this strategy gives the manager a probability of $1/e$ (about 37%) of hiring the best candidate out of the entire group!

But what if the manager is a little more flexible and changes his goal to hiring one of the two best candidates?  The analysis is a little more complicated, so let’s simplify things somewhat by assuming that he has 100 candidates to interview.  In this case, using a similar strategy, the manager can get the probability of hiring one of the two best candidates above 50%.  The rest of this post contains the derivation of this fact.

Suppose the manager’s strategy is (like in the original version of the secretary problem) to reject the first k candidates and then hire the next candidate who is better than all candidates seen thus far.  Let’s condition on the absolute ranking of the best candidate among this first k.

If the #1 candidate is the best among the first k, the manager will never see a better candidate and thus will reject all the candidates.  The probability of success here is 0.

If the #2 candidate is the best among the first k, the next candidate that the manager sees who is better than any of the first k must be the #1 candidate.  The manager will end up hiring the #1 person.  This scenario happens with probability $\frac{k}{100} \cdot \frac{100-k}{99}$, as the probability that the #2 candidate falls in the first k is $\frac{k}{100}$, and, given this, the probability that the #1 candidate falls in the last $100-k$ is $\frac{100-k}{99}$.

If the #3 candidate is the best among the first k, the next candidate that the manager sees who is better than any of the first k must be the #1 or the #2 candidate.  The manager will thus end up hiring one of the two best people.  This scenario happens with probability $\frac{k}{100} \cdot \frac{100-k}{99} \cdot \frac{99-k}{98}$.  To see this, the probability that the #3 candidate falls in the first k is $\frac{k}{100}$.  Given this, the probability that the #1 candidate does not fall in the first k is $\frac{100-k}{99}$, and given all of that, the probability that the #2 candidate does not fall in the first k is $\frac{99-k}{98}$.

If the #4 candidate is the best among the first k, then the manager is not guaranteed to end up hiring one of the two best candidates.  There’s a 2/3 chance that the manager interviews the #1 or #2 candidate before the #3 candidate after rejecting the first k.  The manager’s probability of success is thus $\frac{2}{3} \cdot \frac{k}{100} \cdot \frac{100-k}{99} \cdot \frac{99-k}{98} \cdot \frac{98-k}{97}$.

Let’s jump to the general case.  If the #j candidate is the best among the first k, then the manager will end up hiring one of the two best candidates with probability $\frac{2}{j-1}$.  The manager’s probability of success is thus $\frac{2}{j-1} \cdot \frac{k}{100} \cdot \frac{100-k}{99} \cdot \frac{99-k}{98} \cdots \frac{100-k-(j-2)}{100-(j-1)}$.

The largest possible value for j is $100 - k + 1$.

Putting this all together, then, the probability that the manager hires one of the two best candidates using the strategy of rejecting the first k and then hiring the next candidate who is better than all seen before is

$\displaystyle \frac{k}{100} \cdot \frac{100-k}{99} + \sum_{j=3}^{100-k+1} \frac{k}{100} \cdot \frac{(100-k)(99-k) \cdots (100-k-(j-2))}{100 \cdot 99 \cdot 98 \cdots (100-(j-1))}$.

The value of this expression for different values of k is given below.  As you can see, the optimal strategy of this type is for the manager to reject the first 30 candidates outright and then take the best candidate he next sees.  Doing so gives him a greater than 50% probability of selecting one of the two best candidates.  Any value of k from 22 to 39, though, also gives him a better than 50% probability.  And even taking k to be anything from 12 to 56 gives him a better than 40% change of hiring one of the two best candidates.  This type of strategy is thus rather robust.

k    Probability
1     0.0935476
2     0.147297
3     0.191249
4     0.228736
5     0.261425
6     0.290316
7     0.316075
8     0.33918
9     0.359986
10   0.378773
11   0.395761
12   0.411133
13   0.425041
14   0.437612
15   0.448957
16   0.45917
17   0.468335
18   0.476526
19   0.483808
20   0.490239
21   0.495872
22   0.500755
23   0.504931
24   0.508439
25   0.511316
26   0.513595
27   0.515306
28   0.516479
29   0.51714
30   0.517313
31   0.517021
32   0.516287
33   0.515129
34   0.513567
35   0.511619
36   0.509302
37   0.506631
38   0.503622
39   0.500288
40   0.496643
41   0.492701
42   0.488473
43   0.48397
44   0.479205
45   0.474186
46   0.468926
47   0.463433
48   0.457716
49   0.451784
50   0.445647
51   0.439311
52   0.432786
53   0.426077
54   0.419194
55   0.412142
56   0.404928
57   0.39756
58   0.390042
59   0.382382
60   0.374584
61   0.366656
62   0.358601
63   0.350426
64   0.342136
65   0.333735
66   0.325228
67   0.31662
68   0.307916
69   0.29912
70   0.290236
71   0.281268
72   0.272221
73   0.263097
74   0.253902
75   0.244639
76   0.235311
77   0.225922
78   0.216475
79   0.206973
80   0.197421
81   0.187821
82   0.178175
83   0.168488
84   0.158762
85   0.149
86   0.139204
87   0.129378
88   0.119524
89   0.109645
90   0.0997434
91   0.0898213
92   0.0798815
93   0.0699263
94   0.0599581
95   0.0499792
96   0.0399917
97   0.0299979
98   0.02
99   0.01
100 0.0