Grand Slams are short-changing women's tennis


Sports reporters are often quick to dismiss women's tennis as unpredictable when compared to the men's game. But Stephanie Kovalchik finds match format is to blame for inconsistencies between genders

Sports fans the world over love a good upset, when the underdog upends the competition and triumphs against the favourite. In July, at the Wimbledon Championships in southwest London, tennis delivered one such scenario: Rafael Nadal, a player ranked 10th in the world, lost in the second round to Dustin Brown, who was ranked 102nd. These sorts of upset in men's tennis, when they occur, lead commentators to praise the depth and competitive balance [1] of the game.

But when top-ranked female players are unexpectedly defeated, the tone of the coverage shifts. Words such as "inconsistent" [2] and "unpredictable" [3] are used, often as euphemisms for inferior play. Tennis World exemplified this with its sceptical outlook for the women's game in the wake of several upsets at the 2014 US Open: "These upsets, when looked [at] from the perspective of the sport overall, raise a lot of questions about the direction that women's tennis is taking." [4]

Commentators have been echoing these sentiments for a number of years. Ironically, some of the severest fault-finding has come from women like writer Caroline Cameron, who summarily concluded in 2012 that the women's game "hasn't been able to keep up" [5] with the incredible talent and competitiveness of the men's game.

Past studies have shown that female tennis players receive less press coverage than men, and the coverage they do receive is more likely to be off topic or negative [6], and discussions about consistency in performance are a case in point. But is it accurate to describe women tennis players as less consistent than men? The verdict from the tennis media has been a resounding yes.
Yet, in what follows, I will show that performance statistics in professional tennis strongly argue for the conclusion that inconsistency in tennis is a problem of match format, not gender.

© 2015 The Royal Statistical Society

Measures of consistency

Although consistency is a popular concept in tennis, the sport lacks formal metrics for consistent performance. In fact, the very nature of the sport complicates the measurement of consistency. For time or distance sports, where all performance is judged on a common continuous scale, the standard measure of consistency is variation in performance over time [7]. For combat sports like tennis there is no single equivalent metric. The discrete nature of win-or-lose outcomes and the wide variety of conditions in tennis (each tournament is held in a different place, with its own surface and unique set of match-ups) make the measurement of consistent performance a challenge.

A good metric of consistency should measure the extent to which a player plays to his or her expected level. In this article, I will consider a number of measures that aim to do this and will examine what each has to tell us about differences in the consistency of men's and women's performance in singles tennis.

The performance data presented here will focus on singles matches (i.e. one player versus one opponent) on the Women's Tennis Association (WTA) tour, and the men's equivalent, the Association of Tennis Professionals (ATP) tour, between 2010 and 2014, the five most recently completed seasons. The tournaments that make up the season are grouped into different tiers according to their difficulty. The Grand Slams (the Australian Open, the French Open, the Wimbledon Championships, and the US Open) represent the highest tier for both tours. The next highest tier for the WTA combines the Premier Mandatory and Premier 5 tournaments, which I will refer to as the Premier 5+. For the ATP, the analogous tier is the nine tournaments that make up the Masters 1000. Fewer of the top-ranked players compete in lower-tiered tournaments, so those tournaments will not be discussed.

Figure 1. Frequency of upsets in tennis matches, 2010–2014. A match is considered an upset when a lower-ranked player beats a higher-ranked player

Upsets

A match win is considered an upset when the lower-ranked player prevails, and some upsets are bigger than others.
For example, Roger Federer's loss (when ranked 7th) at the 2013 Wimbledon Championships to Sergiy Stakhovsky, a player ranked 116th in the world at that time, was more shocking than Federer's loss two years earlier (when ranked 3rd) to 19th-ranked Jo-Wilfried Tsonga. As this example suggests, expectations about winning in tennis are typically judged by player seeding or rank. We can therefore measure the degree of an upset by the rank differential between the winning player ($R_W$) and the losing player ($R_L$):

$$\text{rank differential} = R_W - R_L$$

An upset occurs whenever the difference in rank is positive (as the better player's rank is a lower number), and the larger and more positive the differential, the bigger the upset.

Between 2010 and 2014, the overall frequency of higher-ranked players losing to lower-ranked players (i.e. upsets) was 31.5% for the WTA and 29.6% for the ATP, suggesting that high-ranked female players were slightly more likely to lose to a lower-ranked player than high-ranked male players. However, the frequency of upsets by tournament tier shows that this difference held only at the Grand Slams, though even there, huge upsets with players differing by more than 70 ranking points were similarly rare for both tours (Figure 1). At lower-tier tournaments, the distribution of upsets was nearly identical for men and women.

Streaks

Consistency can also be measured by consecutive match wins, or streaks. By this standard, Novak Djokovic's 41-match winning streak in 2011 was the most impressive display of consistency in recent history. But are males more streaky than females, in general? The characteristics of maximum streaks for the top 100 male and female players between 2010 and 2014 suggest not. Figure 2 shows that the distributions of consecutive wins for the two tours line up almost exactly.
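The rank differential and the streak measure described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the two Federer examples use the ranks quoted in the text, and the season of results is made up to show how a maximum streak would be counted.

```python
from typing import Iterable


def rank_differential(winner_rank: int, loser_rank: int) -> int:
    """Rank differential R_W - R_L; a positive value indicates an upset."""
    return winner_rank - loser_rank


def is_upset(winner_rank: int, loser_rank: int) -> bool:
    """An upset occurs when the winner has the numerically larger (worse) rank."""
    return rank_differential(winner_rank, loser_rank) > 0


def longest_streak(results: Iterable[bool]) -> int:
    """Length of the longest run of consecutive wins (True = win)."""
    best = current = 0
    for won in results:
        current = current + 1 if won else 0
        best = max(best, current)
    return best


# Ranks as quoted in the article.
federer_2013 = rank_differential(winner_rank=116, loser_rank=7)  # Stakhovsky beat Federer
federer_2011 = rank_differential(winner_rank=19, loser_rank=3)   # Tsonga beat Federer

# Hypothetical sequence of match results, for illustration only.
hypothetical_season = [True, True, False, True, True, True, False]

print(federer_2013, federer_2011, longest_streak(hypothetical_season))
```

The larger differential for the 2013 match matches the article's reading that it was the bigger upset of the two.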
Figure 2. Distribution of consecutive match wins, or streaks, 2010–2014

Letdowns

A letdown is the phenomenon of an early round loss following a big win, which can be regarded as an extreme type of upset, such as when Nadal (this time ranked 5th), fresh from his eighth title win at the French Open, lost in the first round of the 2013 Wimbledon Championships to Steve Darcis, who was ranked 135th. Defining an early round loss as an exit in the first or second round, the likelihood that a finalist or winner of a big tournament between 2010 and 2014 would exit early was greater for female players than for male players, but only at the Grand Slams (Figure 3). The frequency of letdowns was statistically similar at the other major tournaments of the season.

Figure 3. Frequency of letdowns, 2010–2014. A letdown is an early round loss following a big win

Variation in match win percentage

In time and distance sports, where the quality of a performance is measured by a single number on a continuous scale, assessing consistency is straightforward. We simply look at deviations in performance from one competition to the next. For tennis, performance is a binary outcome: win or loss. The trouble with binary outcomes for the study of consistency is the basic fact that their variance (which tells us about consistency) is completely determined by their mean (i.e. expected wins). What we would like to know is how the win expectation varies with each match; that is, how far off a player's chance of winning can be from one match to the next. Unfortunately, a win expectation, like the inner thoughts that run through an athlete's mind in the heat of competition, is not something we can observe.

However, the strong relationship between rank and win ability suggests a possible solution. If rank is the main factor determining a player's win expectation, we can consider the win frequencies in each season for the players with the same rank as repeated observations of win expectations for players with the same level of ability. For example, we would take the season win percentage across multiple seasons for the number 1 ranked player to determine season-to-season variation in win expectation for the top-ranked player in the world.
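Two points in this argument can be illustrated with a short sketch. First, for a win-or-loss outcome with win probability p the variance is p(1 - p), so it carries no information beyond the mean. Second, season win percentages for players holding the same rank can be treated as repeated observations, and their spread summarised with a standard deviation. The win percentages below are invented for illustration and are not the article's data; the choice of the sample standard deviation is also an assumption, since the article does not state which summary it used.

```python
from statistics import stdev


def bernoulli_variance(p: float) -> float:
    """Variance of a single win/loss outcome with win probability p."""
    return p * (1.0 - p)


# The variance is a fixed function of the mean, so it cannot separate
# a consistent player from an inconsistent one with the same win rate.
for p in (0.5, 0.7, 0.9):
    print(f"p={p:.1f}  variance={bernoulli_variance(p):.2f}")

# Hypothetical season win percentages, keyed by rank, one value per season.
hypothetical_win_pct = {
    1: [0.90, 0.88, 0.91, 0.87, 0.89],
    50: [0.55, 0.40, 0.62, 0.48, 0.35],
}

# Season-to-season variability of the win percentage at each rank.
variability = {rank: stdev(pcts) for rank, pcts in hypothetical_win_pct.items()}
print(variability)
```

In this made-up example the rank 1 values vary far less from season to season than the rank 50 values, which is the pattern the rank-based approach is designed to detect.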
Using this approach to measure stability in win expectations for the 2010–2014 Masters/Premier 5+ tournaments, men and women were found to have a comparable pattern of stability: the highest-ranked players have the most stable (least variable) win expectations and lower-ranked players the least stable (most variable). The overlapping regression lines in Figure 4a not only confirm that the best players distinguish themselves by their greater consistency, but also show that the consistency of win expectations for female and male players of equal rank is statistically identical at the Masters/Premier 5+ tournaments. At the Grand Slams, the gap in the regression lines in Figure 4b indicates that win expectations for a female player are more variable than for a male player of equal rank. In other words, women have been less consistent in the number of matches won than men in recent seasons, but only at the Grand Slams.

Up-and-down matches

How a player performs during a match can also provide insight into his or her consistency. Lopsided sets are an extreme example of inconsistent play during a match. Consider, for example, Serena Williams's round of 16 match in the 2014 China Open. After a lapse in the second set, Williams defeated Lucie Safarova 6–1, 1–6, 6–2.

To capture the up-and-down nature of matches, I introduce the metric of average game spread reversals. A game spread is the difference in games won in a set between the higher-ranked player and the lower-ranked player (e.g. a 6–1 set win would be a spread of 5). A reversal is the absolute difference in game spreads between consecutive sets, and a higher reversal reflects a more up-and-down match. So a player who wins 6–1, 6–1 would have a reversal of 0, indicating perfectly consistent play in each set; whereas a 6–1, 1–6 performance in two sets would have a reversal of 10. The average of reversals is used to correct for differences in the number of sets played (the topsy-turvy Williams–Safarova match had an average reversal in game spread of 9.5).
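The average game spread reversal can be reproduced with a small sketch. The function names are hypothetical, and each set is assumed to be written as (games won by the higher-ranked player, games won by the lower-ranked player).

```python
from typing import List, Tuple

SetScore = Tuple[int, int]  # (higher-ranked player's games, lower-ranked player's games)


def game_spreads(sets: List[SetScore]) -> List[int]:
    """Game spread per set, from the higher-ranked player's point of view."""
    return [favourite - underdog for favourite, underdog in sets]


def average_reversal(sets: List[SetScore]) -> float:
    """Mean absolute change in game spread between consecutive sets."""
    spreads = game_spreads(sets)
    reversals = [abs(later - earlier) for earlier, later in zip(spreads, spreads[1:])]
    if not reversals:
        raise ValueError("at least two sets are needed to compute a reversal")
    return sum(reversals) / len(reversals)


# Williams d. Safarova 6-1, 1-6, 6-2: spreads are 5, -5, 4, so reversals are 10 and 9.
williams_safarova = [(6, 1), (1, 6), (6, 2)]
print(average_reversal(williams_safarova))
```

Taking the absolute difference is what makes the three-set example average to 9.5, the value quoted in the text.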

Over the past five seasons, the occurrence of lopsided matches was very similar for both tours. Overall, the median game spread reversal for each tour was 2, and at both the Grand Slams and Masters/Premier 5+ tournaments, the difference in average game spread reversals between tours was 0.1 games or less. Current top female players appear as likely to have an up-and-down match as male players.

Figure 4. Variation in match win probabilities, 2010–2014: (a) Masters/Premier 5+; (b) Grand Slams

Format advantage

Of the five measures of consistency considered here, there were no gender differences in performance for two of the measures (streaks and up-and-down matches). For three of the measures (upsets, letdowns and variation in win percentage) differences were found that suggest female players have had less consistent performance in recent years. However, these differences were only observed at the Grand Slams, where female players play a different match format than males: women compete in a best-of-three format, while men play a more taxing best-of-five. Could match format explain the tour differences in consistency observed in recent years?

Logic dictates that an underdog will have a harder time winning three sets in a match than two sets. Even without a complicated mathematical argument, we can conclude that the best-of-five format favours higher-ranked players more than a best-of-three. How much of an edge it offers is the trickier question. This is where a mathematical analysis is helpful and, thanks to the hierarchical structure of tennis, also tractable.

The most studied and discussed mathematical model in tennis is the IID model. IID stands for independent and identically distributed, which refers to the basic assumption that the probabilities of winning a point on serve or return are treated as constant throughout the match.
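The intuition that a longer format favours the stronger player can be checked with a minimal sketch. It assumes a simplified, set-level version of the constant-probability idea: every set is won independently by the higher-ranked player with the same probability p. The function name and the values of p are illustrative only.

```python
from math import comb


def match_win_probability(p: float, sets_to_win: int) -> float:
    """Chance of winning a match that ends when one player has `sets_to_win` sets,
    assuming each set is won independently with constant probability p."""
    q = 1.0 - p
    # The winner takes the final set after losing k sets along the way,
    # where k ranges from 0 to sets_to_win - 1.
    return sum(
        comb(sets_to_win - 1 + k, k) * p**sets_to_win * q**k
        for k in range(sets_to_win)
    )


for p in (0.5, 0.6, 0.7, 0.8):
    best_of_3 = match_win_probability(p, sets_to_win=2)
    best_of_5 = match_win_probability(p, sets_to_win=3)
    print(f"p={p:.1f}  best-of-3={best_of_3:.3f}  best-of-5={best_of_5:.3f}")
```

For any p above 0.5 the best-of-five value is the larger of the two, while at p = 0.5 both formats are a coin toss, so the format only matters when one player is the favourite.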
Under this assumption, point outcomes are treated like the outcomes of coin tosses, and the probability of a success (a point won) on any toss is a constant that is determined by the player's underlying ability against his or her particular opponent. The IID model seems implausible, but it has been shown to be remarkably accurate for describing outcomes in tennis [8].

One of the reasons for the IID model's popularity is that it simplifies much of the mathematics of tennis. Assuming that the probability of winning a set follows the IID model, it is possible to write down exact formulae for winning a match under a variety of formats. Suppose that $p$ is the probability that the higher-ranked player in a particular match-up wins a set. As was previously shown in this magazine [9], the chance that this player will win a best-of-three match under the IID model is

$$M_3 = p^2\,[1 + 2(1 - p)].$$

For a best-of-five match it is

$$M_5 = p^3\,[1 + 3(1 - p) + 6(1 - p)^2].$$

Given a match win probability of $M$, the gap in a player's chance of winning versus losing, what can be called the player's win advantage, is $2M - 1$. We are interested in how match format influences win advantage, all else being equal. The edge that match format provides to the higher-ranked player is the difference in win advantage between a best-of-five match and a best-of-three, and is equal to the following polynomial:

$$\text{edge} = 2(M_5 - M_3) = 2p^2\,\{p\,[1 + 3(1 - p) + 6(1 - p)^2] - [1 + 2(1 - p)]\}.$$

Figure 5. Edge in win advantage (edge in % against the probability of winning a set)

Figure 5 shows the magnitude of the edge for a range of hypothetical probabilities of winning a set. The parabolic shape shows that the edge is largest when the higher-ranked player's chance of winning the set is neither a toss-up (50%) nor a certainty (100%), and the edge falls off more steeply as the likelihood of winning a set becomes more certain than when it becomes less certain. When the chance of winning a set is between 0.6 and 0.8, this plot suggests that the boost in win advantage for the higher-ranked player playing a best-of-five match is 7–10 percentage points.

Although Figure 5 suggests that match format can have a huge impact on win advantage, it is based on theoretical values for the chance of winning a set, which might not reflect their probabilities on the professional tours. To determine the actual edge that match format has given the tours in recent years, I computed the edge using actual set win probabilities for each tour from 2010 to 2014. The results, shown in Figure 6, show the win advantage by rank, aggregating the match outcomes across seasons for players of the same rank. Compared to women, the men's game has a slightly larger advantage among the highest-ranked players under either match format. However, the tour differences are minuscule in contrast to the advantage provided by the best-of-five format, which gives an edge of 9 percentage points overall.

Figure 6. Win advantage of higher-ranked player given observed set win probabilities, 2010–2014 (match win advantage in % against rank, for best-of-three and best-of-five formats, ATP and WTA)

Can the edge afforded by best-of-five matches explain gender differences at the Grand Slams? The observed win advantage of higher-ranked players at the 2010–2014 Grand Slams was 50% for men and 42% for women, an 8 percentage point difference that is entirely consistent with the expected edge of 9 percentage points with a best-of-five match. If women played best-of-five matches like men, we would expect them to perform as consistently as men.

These findings shine a spotlight on two biases in the way tennis and comparisons between the men's and women's game are reported. There is both an overemphasis on performance at Grand Slams and a discounting of differences between the tours that extend beyond gender.

Conclusion

Despite praiseworthy efforts to close the gender gap in tournament prize money, the Grand Slams are inadvertently short-changing the women's game by having men and women play a different match format and using a format for women that makes the outcomes for their tour less predictable than the men's. Having both tours play the longer best-of-five format (currently reserved for the men) would be the surest way to eliminate this source of disparity and help the women's game to enjoy the benefits of having more of its top players (not just Serena Williams) dominate at the biggest tournaments. Although the tennis world would probably resist such a change, the evidence presented in this article shows that equality in match format would be an important step towards true gender equality in tennis.

Acknowledgements

All of the data used in this paper are in the public domain and were collected using the author's R package deuce.

References

1. McGrogan, E. (2015). Men's depth on delightful display as Murray, Kyrgios dig deep to reach QFs. Tennis, 5 January. Retrieved from bit.ly/1osyuyb
2. W. S. (2012). Where there's a Williams. The Economist, 31 May. Retrieved from econ.st/1euz0bu
3. Sujith, K. (2012). The problem with women's tennis. Roar, 14 June. Retrieved from bit.ly/1gddk0b
4. Iyer, S. (2014). A vicious cycle of inconsistency: The ailments plaguing WTA tennis. Tennis World, September. Retrieved from bit.ly/1p89fwk
5. Cameron, C. (2012). Cameron on Tennis: Inconsistency in the WTA. Sportsnet, 11 October. Retrieved from bit.ly/1osyrlm
6. Tuggle, C. A. (1997). Differences in television sports reporting of men's and women's athletics: ESPN SportsCenter and CNN Sports Tonight. Journal of Broadcasting & Electronic Media, 41(1), 14–24.
7. Currell, K. and Jeukendrup, A. E. (2008). Validity, reliability and sensitivity of measures of sporting performance. Sports Medicine, 38(4), 297–316.
8. Klaassen, F. J. and Magnus, J. R. (2001). Are points in tennis independent and identically distributed? Evidence from a dynamic binary panel data model. Journal of the American Statistical Association, 96(454), 500–509.
9. Gray, C. (2015). Game, set and stats. Significance, 12(1), 28–31.

Stephanie Kovalchik is a statistician at the RAND Corporation, tennis analyst, and the 2016 Program Chair-Elect for the Section on Statistics in Sports of the American Statistical Association. You can follow her work on tennis at on-the-t.com and on Twitter @StatsOnTheT