First-Server Advantage in Tennis Matches

Iain MacPhee and Jonathan Rougier
Department of Mathematical Sciences, University of Durham, U.K.

Corresponding author: Department of Mathematical Sciences, University of Durham, Science Site, South Road, Durham DH1 3LE, U.K.; tel +44(0)191 334 3111; fax +44(0)191 334 3051; e-mail J.C.Rougier@durham.ac.uk

Abstract

We show that the advantage that can accrue to the server in tennis does not necessarily imply that serving first increases the probability of winning the match. We demonstrate that the outcome of tie-breaks, sets and matches can be independent of who serves first. These are corollaries of a more general result that we prove by considering invariances across certain permutations of the order in which the players serve. Our proof is non-algebraic and self-contained.

Keywords: Tennis, Tie-break, n-point win-by-k Games

Primary classification: 91A60 (Probabilistic games; gambling), 91A05 (2-person games).
Secondary classification: 60J20 (Applications of discrete Markov processes)

1 Introduction

In professional lawn tennis it is usually better to serve than to return. It is tempting to infer from this that it is therefore advantageous to serve
first, i.e. that winning the toss and electing to serve improves the probability of winning the first set, and the match. In this paper we show that this inference does not necessarily follow. We show that in a standard class of models where serving confers an advantage, the probability that a given player wins is independent of who serves first. We first show that this is true for a tie-break, and then we generalise our result to sets and to the match.

The analysis of tennis matches using simple probabilistic models is well known (e.g., Kemeny and Snell, 1960). An interesting history of scoring systems in tennis, and some combinatorial calculations on outcomes, is given in Riddle (1988). Our tie-break results are not new, having been given in Pollard (1983), and, implicitly, in Haigh (1996). These proofs use standard probabilistic methods (e.g., hitting times on binomial trees, combinatorial analysis) relying on algebraic equalities. By contrast, our approach provides a simple and completely self-contained proof without any algebra at all, based on a new stronger result concerning invariances across certain permutations of the order in which the players serve.

2 The tie-break

The tennis tie-break is won by the first player to seven points, or, if the score reaches six-all, the first player to go two points ahead. It is an example of an n-point win-by-k game, with n = 7 and k = 2, with additional structure provided by the pattern of serves. On entering a tie-break the player to serve first is pre-determined by the initial coin toss and the total number of games played so far in the match. This player serves once, and then the players alternate serving two points each.
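The rules just described can be written down in a few lines. The following is a minimal sketch, not part of the original analysis; the function names `server` and `tiebreak_winner` and the 0-based point index are assumptions made for illustration.

```python
def server(point, first="A"):
    """Server of the given 0-based point when `first` serves the opening point."""
    other = "B" if first == "A" else "A"
    # Point 0 goes to `first`; afterwards the players alternate in blocks of
    # two, so points 1-2 go to `other`, points 3-4 to `first`, and so on.
    return first if ((point + 1) // 2) % 2 == 0 else other


def tiebreak_winner(a, b, n=7, k=2):
    """Return "A" or "B" if the score [a, b] ends an n-point win-by-k game, else None."""
    if a >= n and a - b >= k:
        return "A"
    if b >= n and b - a >= k:
        return "B"
    return None


# The two possible service patterns over the first 14 points.
pattern_a = "".join(server(p, "A") for p in range(14))
pattern_b = "".join(server(p, "B") for p in range(14))
print(pattern_a)
print(pattern_b)
```

Writing the win condition as "at least n points and at least k ahead" covers both clauses of the rule, because a score of seven points to six can only be reached from six-all.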
We will make the following assumption, which defines the class of models we study.

Assumption 1. The points of the tie-break comprise independent Bernoulli trials with fixed probability of success depending only on who is serving.

We label our players A and B. We will prove that under Assumption 1 the probability of player A winning the tie-break is independent of who serves first. This is a simple corollary to a more general result, for which we require the following definition.

Definition 1. A pairwise service ordering (pso) is a concatenation of the tuples AB and BA repeatedly according to some rule.

Then the theorem is as follows.

Theorem 1. Suppose we are in a tie-break with a generalised serving pattern that conforms to a pso fixed by rule R. Then under Assumption 1, the probability of player A winning the tie-break is independent of the choice for R.

For an actual tennis tie-break, the two possible service patterns are

ABBAABB...   (A serves first)
BAABBAA...   (B serves first)

Both of these are psos, and so it follows from Theorem 1 that the probability of player A winning is invariant to who serves first.

To prove Theorem 1 we make repeated use of the following lemma, which summarises the invariance structure of the probability of attaining certain scores with respect to the service ordering.
Lemma 2. Under the same conditions as Theorem 1, the probability of any score [i, j] with i + j even is invariant to R, excepting those in the set $\{ [7, j] \text{ or } [i, 7] : i, j \in \{1, 3, 5\} \}$.

Proof. The tie-break may be represented on a binomial tree. We write any given path through the tree with any given pso rule R in the following way (ignoring, briefly, the restrictions on the score):

1 0 0 1 1 1 ...
A B B A A B ...   (say)

Each vertical pair represents a single point, with the first line showing the indicator variable of A winning and the second line showing the server; in this example, the score goes with service for the first five points, before A wins a point on B's serve. Clearly the role of the rule R is to assign a probability to the chosen path.

The first point to note is that from a given pso we can reach every other pso by interchanging A and B in positions 2i + 1 and 2i + 2 for $i \in \mathbb{N}$, one pair at a time, two pairs at a time, and so on. The second point is that if we also interchange the indicator variables in those positions then we alter neither the final score of the path, nor its probability.

If we want to know the probability of a score [i, j] under a pso with rule $R$, then we sum the probabilities over all possible paths to [i, j]. If we take any other pso with rule $R'$ the paths to [i, j] will be different, but the sum of their probabilities will be the same. This is because each path to [i, j] with rule $R$ can be bijectively associated with a path to [i, j] which has the same probability under rule $R'$, using the interchanges described above. Thus the probability of [i, j] is the same under $R$ and $R'$, and since $R$ and $R'$ are completely general, it is invariant to whatever rule is chosen for the pso.
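Lemma 2 can also be checked numerically. The sketch below is an illustration only, not part of the proof: it enumerates all 64 psos covering the first 12 points and compares the exact probability of reaching each score. The function name `score_probs` and the serve-win probabilities `pa` and `pb` are hypothetical choices; any values with `pa + pb != 1` should show the same pattern.

```python
from fractions import Fraction
from itertools import product


def score_probs(servers, pa, pb, n=7, k=2):
    """Probability of reaching each score [a, b] when servers[t] serves point t.

    pa and pb are hypothetical probabilities that A and B win a point on
    their own serve (Assumption 1). A path stops once a player has at least
    n points and leads by at least k.
    """
    reach = {(0, 0): Fraction(1)}
    frontier = {(0, 0): Fraction(1)}
    for s in servers:
        win_a = pa if s == "A" else 1 - pb   # chance that A wins this point
        nxt = {}
        for (a, b), pr in frontier.items():
            if max(a, b) >= n and abs(a - b) >= k:
                continue                      # tie-break already over
            for score, q in (((a + 1, b), win_a), ((a, b + 1), 1 - win_a)):
                nxt[score] = nxt.get(score, 0) + pr * q
        reach.update(nxt)                     # a score is only reached at time a + b
        frontier = nxt
    return reach


pa, pb = Fraction(7, 10), Fraction(3, 5)
psos = ["".join(pairs) for pairs in product(("AB", "BA"), repeat=6)]
tables = [score_probs(rule, pa, pb) for rule in psos]

# Scores covered by Lemma 2: i + j even, outside the excepted set.
excluded = {(7, j) for j in (1, 3, 5)} | {(i, 7) for i in (1, 3, 5)}
even_scores = [s for s in tables[0] if sum(s) % 2 == 0 and s not in excluded]

lemma_holds = all(t[s] == tables[0][s] for t in tables for s in even_scores)
seven_five = {t[(7, 5)] for t in tables}      # an excepted score, for contrast
print(lemma_holds, len(seven_five))
```

Exact rational arithmetic is used so that equality of probabilities can be tested directly rather than up to rounding error. The excepted score [7, 5] takes more than one value across the rules, which is why it has to be left out of the lemma.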
The reason that Lemma 2 only works for even scores is that we need to know what happens on both of the points that we interchange in each tuple in order to associate paths bijectively between $R$ and $R'$. The reason it does not work for scores of the form [7, j] or [i, 7] with $i, j \in \{1, 3, 5\}$ is that in this case the winning player must win the final point. Unless he or she also wins the penultimate point, we end up interchanging into a different terminating score.

To prove Theorem 1 it is sufficient to show that the probability of player A winning the tie-break is invariant to the service ordering. Lemma 2 shows that the probability of reaching the score [6, 6] is invariant to the pso rule R. Now we condition on whether the score passes through [6, 6], and consider the two cases separately.

If the score does pass through [6, 6], then the terminal scores (for A winning) will be of the form [6 + i + 2, 6 + i] for $i \in \mathbb{N}$. Scores of this form satisfy Lemma 2, noting that A must have won both of the final two points, and so we conclude that the probability of A winning after passing through [6, 6] is invariant to the pso rule R.

What if the score does not pass through [6, 6]? The terminating scores in this case comprise the set $S_1 = \{ [7, j] : j = 0, 1, \ldots, 5 \}$. Several of the scores in this set do not satisfy Lemma 2. We can finesse this by considering an extension of the tie-break to 12 points, regardless of whether A has 7 points or not. Thus we define a second set, $S_2 = \{ [i, j] : i > j, \ i + j = 12 \}$. Note that $S_1 \cap S_2 = \{ [7, 5] \}$, and that [7, 5] is the only element of $S_2$ that does not satisfy Lemma 2. However, in the extended game to 12 points the probability of the score [7, 5] is invariant to the pso rule, because the game can end either ..., 0, 1 or ..., 1, 0. That is, in the 12-point game player B can legitimately win the last point, while in the tie-break he or she cannot. Therefore the probability of every score in $S_2$ is invariant to the pso rule R.
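The 12-point extension can be explored with a short calculation. The following sketch is illustrative only: `final_scores`, `pa` and `pb` are hypothetical names and values. For every pso over 12 points it computes the probability of each score in $S_2$ in the extended game, the total probability on $S_1$ in the actual tie-break, and the overall probability that A wins the tie-break.

```python
from fractions import Fraction
from itertools import product


def final_scores(servers, pa, pb, n=None):
    """Distribution of the score at which each path stops.

    With n=None every point in `servers` is played (the extended game).
    With n=7 a path stops as soon as a player holds n points, which within
    the first 12 points coincides with the tie-break rule.
    """
    done, live = {}, {(0, 0): Fraction(1)}
    for s in servers:
        win_a = pa if s == "A" else 1 - pb   # chance that A wins this point
        nxt = {}
        for (a, b), pr in live.items():
            for score, q in (((a + 1, b), win_a), ((a, b + 1), 1 - win_a)):
                bucket = done if n is not None and max(score) >= n else nxt
                bucket[score] = bucket.get(score, 0) + pr * q
        live = nxt
    done.update(live)                         # paths still running at the end
    return done


pa, pb = Fraction(7, 10), Fraction(3, 5)      # hypothetical serve-win probabilities
s1 = [(7, j) for j in range(6)]
s2 = [(i, 12 - i) for i in range(7, 13)]

results = set()
for pairs in product(("AB", "BA"), repeat=6):
    rule = "".join(pairs)
    tiebreak = final_scores(rule, pa, pb, n=7)
    extended = final_scores(rule, pa, pb)
    p_s1 = sum(tiebreak[s] for s in s1)
    p_s2 = sum(extended[s] for s in s2)
    # From six-all each further pair has one serve apiece: A takes both points
    # with probability pa*(1-pb), B with (1-pa)*pb, otherwise it is level again.
    a_pair, b_pair = pa * (1 - pb), (1 - pa) * pb
    p_win = p_s1 + tiebreak[(6, 6)] * a_pair / (a_pair + b_pair)
    results.add((p_s1, p_s2, p_win, tuple(extended[s] for s in s2)))

print(len(results))
```

If all 64 rules give the same tuple, the set `results` collapses to a single element. The closed form used after six-all is a standard geometric-series argument introduced here for convenience; the paper's own proof handles that case through Lemma 2 instead.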
Now consider the relationship between $S_1$ and $S_2$ on the binomial tree. Every path to $S_2$ must pass through $S_1$, and every path through $S_1$ must terminate in $S_2$. Therefore for any given pso rule R the probability on $S_1$ is the same as that on $S_2$. As the probability on $S_2$ is invariant to R, it follows immediately that the probability on $S_1$ is too. That is, the probability that A wins without passing through [6, 6] is invariant to the pso rule R. This completes the proof of Theorem 1.

3 Tennis generalisations

Lemma 2 clearly generalises to all $n \geq 2$, with appropriate modification to $S_1$ and $S_2$. Therefore we can extend our result from tie-breaks to sets without tie-breaks, noting the following:

1. Assumption 1 is sufficient to ensure that the games themselves comprise independent Bernoulli trials with fixed probability of success depending only on who is serving.
2. The scoring for a set without a tie-break is exactly the same as for the tie-break itself, except n = 6 rather than 7.
3. The two possible serving arrangements are ABAB... and BABA..., both of which are psos.

For sets with tie-breaks we can condition our argument on the score [6, 6] in games. Clearly this satisfies Lemma 2 when n = 6: players A and B must win one apiece of the final two games, and so these two games are interchangeable as described in Lemma 2. If the score in games gets to [6, 6] then a tie-break is played, and we have already shown that the outcome of the tie-break is invariant to who serves first. But if the score in games does not get to [6, 6] we can apply exactly the
same reasoning as before, with $S_1 = \{ [6, j] : j = 0, 1, \ldots, 4 \} \cup \{ [7, 5] \}$ and $S_2 = \{ [i, j] : i > j, \ i + j = 12 \}$.

So we have shown that under Assumption 1 the probability of player A winning a set is independent of who serves first, regardless of whether the set is terminated with a tie-break or with two-ahead. It then follows that, under the same Assumption, the probability of player A winning the match is independent of who serves first, i.e., independent of who wins the toss.

Finally, we note some recent evidence (Klaassen and Magnus, 2001) suggesting that the simple Markov structure implied by our Assumption 1 is in fact a reasonable model. Klaassen and Magnus find that with their sample of 86,298 points from Wimbledon 1992–1995 they reject the hypothesis that points are identically and independently distributed (iid), but they note that the deviations from the iid are small and ... the assumption of iid in specific applications (such as forecasting) could be relatively harmless (p. 506). This finding, which suggests that our Assumption may be taken as approximately true, implies that in practice the impact of serving first on the outcome of a tie-break, a set, or a match, will be small and could be negligible.

References

Haigh, J. (1996), More on n-point, win-by-k games, Journal of Applied Probability, 33, 382–387.

Kemeny, J.G. and Snell, J.L. (1960), Finite Markov Chains, Princeton N.J., Van Nostrand Reinhold.

Klaassen, F.J.G.M. and Magnus, J.R. (2001), Are points in tennis independently and identically distributed? Evidence from a dynamic binary panel data model, Journal of the American Statistical Association, 96, 500–509.

Pollard, G.H. (1983), An analysis of classical and tie-breaker tennis, Australian Journal of Statistics, 25, 496–505.

Riddle, L.H. (1988), Probability models for tennis scoring, Applied Statistics, 37, 63–75.