A time to win? Patterns of outcomes for football teams in the English Premier League

Nigel Smeeton, MSc, Honorary Lecturer
King's College London, Division of Imaging Sciences & Biomedical Engineering, London, United Kingdom
Email: nigel.smeeton@kcl.ac.uk

Summary

It is a widely held view that football teams can have periods of good, poor or indifferent performance characterised by a preponderance of wins, defeats or draws respectively. The multiple runs test can be used to detect clustering in a sequence of events in which several outcomes are possible. The development of the study of runs is briefly outlined. A method for simulating multiple runs distributions using the statistical package Stata is applied to the results of the English Premier League for the 2002-2003 season. For these data there is little evidence for clustering of outcomes.

NOTE

This paper reports on an analysis performed by the author in 2003. Following unsuccessful submission for publication to a refereed magazine (the analysis was considered to be too simplistic) it has remained filed away and was recently rediscovered by the author when sorting out old documents. I believe that these findings may now be of historic interest to those who follow British football.
1 Introduction

The English Premier League involves the top twenty football clubs; during a season each pair of clubs plays twice, once at each ground. Clubs are ranked on the points gained from wins and draws (see Smeeton (2003) for a more detailed description). On the basis of the clubs' end-of-season positions, the most successful teams can participate in European competitions during the following season, whereas the bottom three clubs are relegated.

It is not unknown for a football club to perform poorly for most of a season but be saved from relegation by a run of wins close to the end. In a similar manner, a team that is consistently at the top of the table can be overtaken because of a series of poor results towards the finish. Inspection of sequences of results can give the impression that some teams experience good or bad patches; Smeeton (2003) showed evidence of clustering in ten consecutive match results for Leeds United. However, as far as I am aware, an investigation of the results obtained by Premier League teams over a complete season has not yet been conducted.

2 Statistical background

For a series of events where several outcomes are possible, a run is defined as a sequence of one or more consecutive observations of the same type. Under the null hypothesis that the different outcomes occur randomly, a relatively small value for the total number of runs indicates possible clustering, whereas an unexpectedly large value suggests alternation between outcomes of different types. If it is clustering that is under investigation, the relevant quantity is the one-tailed probability of the number of runs being no more than a stated low number. For data that have already been collected it is the conditional distribution of the number of runs that is considered (Schuster and Gu, 1997).
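The definition of a run given above can be illustrated with a minimal Python sketch. This is not the author's code (the analysis in this paper used Stata); the function name `count_runs` and the example sequence are hypothetical, chosen only for illustration.

```python
from itertools import groupby


def count_runs(outcomes):
    """Count runs: maximal blocks of identical consecutive outcomes."""
    # groupby yields one group per block of equal neighbouring items,
    # so the number of groups is the number of runs.
    return sum(1 for _ in groupby(outcomes))


# Hypothetical sequence of nine results (W = win, D = draw, L = defeat).
example = "WWDLLLWDD"
print(count_runs(example))  # 5 runs: WW, D, LLL, W, DD
```

A sequence with strong clustering, such as WWWDDDLLL, has few runs (3), whereas a strictly alternating sequence of the same length, such as WDLWDLWDL, has the maximum possible number (9).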
Interest in patterns of runs, often based around outcomes in gambling games such as roulette, blossomed during the latter part of the nineteenth century in both the popular and scientific literature. Thomas Hardy's novel A Laodicean, originally published in 1881, relates the advice given to a young man who loses significant sums of money at the roulette tables of Monte Carlo: "These runs of luck will be your ruin, as I have told you before ... You will be for repeating and repeating your experiments and will end by blowing your brains out."

Whitworth (1886) gives the number of ways in which m indifferent (i.e. indistinguishable) black balls and n indifferent white balls can be arranged in a row for a particular number of black/white contacts, although he evidently felt that the proof was straightforward, as this is left as an exercise for the reader! Karl Pearson (1897) thought along similar lines in his statement that "the theory of runs is a very simple one". Only in 1940 did Wald and Wolfowitz demonstrate how the two-sample runs distribution could be obtained by summing individual probabilities.

For three or more outcomes, probabilities relating to the number of runs were derived by Mood (1940). No illustrative example is given in what, from an algebraic point of view, is a very difficult paper. Barton and David (1957) extended Mood's work and applied their theory to falls in share prices on the London Stock Exchange. Their article contains tables for the distribution of the number of runs for samples of up to 12 observations, for both 3 and 4 different outcomes. Shaughnessy (1981) tested for randomness in time-ordered residuals from regression analyses by applying multiple runs distributions. The tables of critical values in the paper are mainly for groups of similar size and so are of limited application. Schuster
and Gu (1997) developed algorithms for calculating exact distributions for the number of runs for multiple outcomes using the software system Mathematica. A recent comprehensive text by Balakrishnan and Koutras (2002) explores the theoretical aspects of a wide range of run problems.

3 Simulation of the multiple run distributions

Suppose that in a complete season a club plays $n$ (= 38) fixtures with $n_1$ wins, $n_2$ draws and $n_3$ defeats. Using a local macro developed within the statistical package Stata (StataCorp, 2001) a dataset of size $n$ is created, which contains one observation for each event, the three outcomes win, draw and defeat being indicated by different letters in the ratio $n_1 : n_2 : n_3$ (Smeeton and Cox, 2003). The data are then randomly shuffled repeatedly, and the number of runs after each permutation is calculated to generate the conditional distribution. Probabilities in the lower tail of the distribution indicate the degree of clustering in the observed sample.

Consecutive Premier League match results for 2002-2003 were obtained from the Internet site football.guardian.co.uk (accessed 19th May 2003). A runs distribution was simulated for each of the 20 clubs using the appropriate numbers of wins, draws and defeats in each case. One million permutations were used for each distribution, giving a confidence interval of approximately ± 0.0005 for a tail probability of 0.05.

4 Results

Table 1 shows the league positions of the clubs at the end of the season, the numbers of wins, draws and defeats and the number of runs over the season. The one-tailed probability under a random pattern of at most the observed number of runs is also given. For almost all of the teams there is no evidence of clustering. However, Liverpool showed a strong pattern of
clustered results and there was a weaker tendency towards clustering for Manchester United. Both teams attracted the attention of the media for their inconsistent performances.

Two other teams were frequently in the news and merit mention. West Ham United produced mediocre results for most of the season but turned themselves around during the last few weeks; despite their valiant efforts they just failed to avoid relegation. Sunderland, after a reasonable start to the season, had a disastrous run of defeats. When the season is taken as a whole, however, West Ham and Sunderland show no real tendency for clustering.

Table 1 here

5 Discussion

The lack of evidence for clustering might come as a surprise to those who pay close attention to the game. However, even with a completely random sequence of events, runs of outcomes of the same type can occur by chance, and in a league of 20 clubs followed over 38 matches some observed clustering is inevitable. Lowest p-values of 0.009 and 0.056 for particular teams seem perfectly reasonable.

Although it is tempting to brush aside the apparent clustering of the results for Liverpool and Manchester United, they do deserve closer inspection. Liverpool had a run of seven wins in the early part of the season, putting them at the top of the table, but towards the end of 2002 there was a run of four defeats followed by three draws. Not many football supporters would be convinced that chance is the true explanation; factors such as the injury or suspension of a key player should be considered. Some closely involved with football believe that once a few poor results have been obtained a team can enter a period of low morale during which winning is difficult even when playing supposedly weaker teams. For the 2002-2003 season,
Leeds United was a possible example. Similarly, the development of a high level of confidence might explain the long runs of wins sometimes experienced by the most successful teams.

The evidence of clustering with Manchester United is more difficult to explain. A quick scan of the results for the whole season does not give a particularly strong impression of clustering, but closer inspection shows that of the 13 non-win outcomes, ten occurred as pairs of the same type (e.g. DD or LL). Put another way, Manchester United were generally consistent winners during 2002-2003, but if a less favourable result was obtained there was not an immediate return to winning form, possibly indicating that even the most successful teams can experience a temporary loss in confidence.

Overall, it is unclear whether the findings for these two clubs represent genuine short-term shifts in their levels of performance. Inspection of the results for following seasons might help to clarify the truth; the chance repetition of a sequence of clustered events by the same team would be highly unlikely. As it stands, the jury on clustering is still out!

Acknowledgements

I am grateful for the helpful comments on this manuscript received from Obi Ukoumunne.
References

Balakrishnan, N. and Koutras, M.V. (2002) Runs and Scans with Applications. New York: John Wiley & Sons.

Barton, D.E. and David, F.N. (1957) Multiple runs. Biometrika, 44, 168-178.

Hardy, T. (1912) A Laodicean: a Story of Today. London: Macmillan.

Mood, A.M. (1940) The distribution theory of runs. Annals of Mathematical Statistics, 11, 367-392.

Pearson, K. (1897) The Chances of Death and Other Studies in Evolution. Vol. 1. London: Edward Arnold.

Schuster, E.F. and Gu, X. (1997) On the conditional and unconditional distributions of the number of runs in a sample from a multisymbol alphabet. Communications in Statistics: Simulation and Computation, 26, 423-442.

Shaughnessy, P.W. (1981) Multiple runs distributions: recurrences and critical values. Journal of the American Statistical Association, 76, 732-736.

Smeeton, N. (2003) Do football teams have clusters of wins, draws and defeats? Teaching Statistics, 25, 90-92.

Smeeton, N. and Cox, N. (2003) Do-it-yourself shuffling and the number of runs under randomness for a sample consisting of several categories. Stata Journal, 3, 270-277.

StataCorp (2001) Stata Statistical Software: Release 7.0. College Station, TX: Stata Corporation.

Wald, A. and Wolfowitz, J. (1940) On a test whether two samples are from the same population. Annals of Mathematical Statistics, 11, 147-162.

Whitworth, W.A. (1886) Choice and Chance, 4th edn. Cambridge: Deighton Bell and Co.
Table 1. Runs in football match outcomes for the English Premier League: 2002-2003 season

Team                    Results (W, D, L)   Number of runs   P-value (one-tailed)
Manchester United       25, 8, 5            16               0.056
Arsenal                 23, 9, 6            25               0.928
Newcastle United        21, 6, 11           25               0.802
Chelsea                 19, 10, 9           26               0.739
Liverpool               18, 10, 10          18               0.009
Blackburn Rovers        16, 12, 10          26               0.585
Everton                 17, 8, 13           22               0.160
Southampton             13, 13, 12          25               0.379
Manchester City         15, 6, 17           27               0.858
Tottenham Hotspur       14, 8, 16           27               0.769
Middlesbrough           13, 10, 15          29               0.896
Charlton Athletic       14, 7, 17           22               0.188
Birmingham City         13, 9, 16           27               0.738
Fulham                  13, 9, 16           28               0.844
Leeds United            14, 5, 19           24               0.612
Aston Villa             12, 9, 17           28               0.864
Bolton Wanderers        10, 14, 14          27               0.691
West Ham United         10, 12, 16          23               0.199
West Bromwich Albion    6, 8, 24            19               0.230
Sunderland              4, 7, 27            16               0.213
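As an illustration of how the one-tailed probabilities in Table 1 arise, the shuffling procedure of Section 3 can be sketched in Python. This is an assumption-laden sketch, not the Stata macro of Smeeton and Cox (2003): the function names, the default of 20,000 permutations (far fewer than the one million used in the paper) and the fixed seed are all hypothetical choices made so that the example runs quickly. The final lines assume that the quoted confidence interval is a 95% interval based on the normal approximation to the binomial, which the paper does not state explicitly.

```python
import math
import random
from itertools import groupby


def count_runs(outcomes):
    """Count maximal blocks of identical consecutive outcomes."""
    return sum(1 for _ in groupby(outcomes))


def lower_tail_probability(wins, draws, defeats, observed_runs,
                           n_perm=20_000, seed=1):
    """Estimate P(number of runs <= observed_runs) by random shuffling.

    The season is represented as a list of letters in the ratio
    wins : draws : defeats, which is shuffled n_perm times.
    n_perm and seed are illustrative assumptions, not the paper's settings.
    """
    rng = random.Random(seed)
    sequence = list("W" * wins + "D" * draws + "L" * defeats)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(sequence)
        if count_runs(sequence) <= observed_runs:
            hits += 1
    return hits / n_perm


# Liverpool in Table 1: 18 wins, 10 draws, 10 defeats, 18 observed runs.
p_liverpool = lower_tail_probability(18, 10, 10, observed_runs=18)
print(f"Liverpool: estimated P(runs <= 18) = {p_liverpool:.3f}")

# Simulation error for a tail probability of 0.05 with one million
# permutations, assuming a 95% normal-approximation interval.
half_width = 1.96 * math.sqrt(0.05 * 0.95 / 1_000_000)
print(f"Approximate 95% half-width: {half_width:.4f}")
```

With so few permutations the estimate for Liverpool will only roughly agree with the value of 0.009 reported in Table 1; increasing `n_perm` narrows the simulation error at the cost of run time. Under the stated assumption, the half-width works out at about 0.0004, which is in line with the figure of approximately ± 0.0005 quoted in Section 3.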