Returns to Skill in Professional Golf: A Quantile Regression Approach

International Journal of Sport Finance, 2010, 5, 167-180, © 2010 West Virginia University

Leo H. Kahane¹

¹Providence College

Leo H. Kahane is an associate professor of economics at Providence College. His research interests include sport economics, international trade, and political economy. He is also the editor of the Journal of Sports Economics.

Abstract

There have been a host of empirical papers studying the returns to skill in professional golf (e.g., Alexander & Kern, 2005; Callan & Thomas, 2007; Moy & Liaw, 1998; Rishe, 2001; Shmanske, 1992, 2000, 2008). None of these studies, however, carefully considers the skewed distribution of earnings in professional golf. This paper uses quantile regression to better handle the skewness and outlier values found in PGA earnings data. Using data from the PGA for the years 2004 to 2007, results of quantile regressions show that the returns to skills such as putting and driving accuracy have statistically different impacts on earnings at different points on the conditional earnings distribution.

Keywords: golf, PGA, earnings, quantile regression

Introduction

The returns to skill in professional sports have been the focus of a growing body of research in sport economics. Scully (1974) was the first paper to attempt to link player skills with compensation in Major League Baseball. Jones and Walsh (1988) studied salaries in the National Hockey League, and Kahn and Sherer (1988) examined salaries in the National Basketball Association. Kahn (1992) studied performance and pay in the National Football League. These papers represent only a small sample of research on salary determination in sports, and the field has made great strides since these early papers were published. In addition to the above sports, researchers have studied salary determination in professional golf.
For example, one of the earliest papers (discussed in greater detail in the next section) is by Shmanske (1992), who used a cross-section of data from 1986 to estimate the relationship between various golf skills and tournament earnings. One of the issues ignored by this paper is the fact that golf earnings are highly positively skewed. Subsequent papers (e.g., Moy & Liaw, 1998; Shmanske, 2000; Nero, 2001) partially address this problem by transforming earnings into natural logs before regressing them on skills. While employing a log transformation may decrease skew, this comes at a price, as it ignores some potentially interesting characteristics of the earnings distribution that are captured by its skew. An alternative approach, which is the focus of this paper, is to examine the linkage between professional golfers' earnings and their skills with the use of quantile regression. Quantile regression not only is better equipped to deal with skewed data, but it also allows us to more fully explore the returns to skill by considering non-central points on the conditional earnings distribution.

Previous Research on Golf Earnings

There have been a handful of published papers focusing on the relationship between skills and earnings in professional golf.¹ One is the aforementioned paper by Shmanske (1992), who uses a cross-section of the 60 top money winners from the 1986 Professional Golfers Association (PGA) Tour to study how practice improves the marginal product of golfers' skills, which in turn affects their earnings. Based on his empirical findings he notes, among other things, that the value of the marginal product from putting may be in the range of $500 per hour of practice.

Later research by Moy and Liaw (1998) uses cross-sectional data from 1993 for golfers on the PGA, the Ladies Professional Golf Association (LPGA), and the Senior PGA Tours to estimate the relationship between earnings and various golfing skills. They find that long driving, good putting, and iron play are all important for success in the PGA.² In comparison, iron play and short game skills are more important for players in the LPGA and the Senior PGA.

On a related topic, Shmanske (2000) considers the earnings differential of players in the PGA and LPGA by examining a cross-section of data for each group from 1998. He notes that while men tend to play for bigger purses in the PGA than do women competing in the LPGA, men also generally play longer courses and more rounds than do women. He finds that, controlling for skill levels, women in the LPGA are not underpaid in comparison to the men in the PGA.
A follow-up by Rishe (2001) examines the same issue of earnings differentials, but in this case between PGA and Senior PGA players. He finds that the primary reason for the greater earnings of PGA golfers is that they earn a greater rate of return on their skills than equally skilled golfers in the Senior PGA. He proposes that this difference may be attributable to differing demand conditions (e.g., television viewership) for golfing skills, but age discrimination may also be a reason.

Work by Alexander and Kern (2005) is aimed at testing the adage "drive for show, putt for dough." The adage implies that driving distance is less important in determining the performance (and hence earnings) of professional golfers than are putting and other short game skills. The authors employ a panel dataset for PGA players covering the period 1992-2001 to test this hypothesis, and to see if changes in equipment over the years (e.g., the increased size of drivers) have affected the importance of driving versus putting. They conclude that there is some limited support for the view that the importance of driving has increased over the years, but that putting remains the most important skill.

Recent research on the linkage between skill and earnings in golf focuses on the fact that skills do not directly determine earnings. For example, Callan and Thomas (2007), motivated by Scully (2002), develop a structural model where skills determine score, which in turn determines rank performance, which ultimately determines earnings. Using cross-sectional data from the 2002 PGA Tour, they find that their structural equation approach produces somewhat different estimates for the marginal product of various skills in comparison to reduced-form, single-equation studies.³

Volume 5 Number 3 2010 IJSF

Earnings, Skewness, and the Advantage of Quantile Regression

One of the features that nearly all of the papers described above share is the use of linear regression models that focus on the behavior of the conditional mean of the dependent variable, the ordinary least-squares (OLS) estimation method being the one most commonly employed.⁴ This paper follows a different approach by utilizing quantile regression for estimating the returns to skill in golf.⁵ Quantile regression has a number of features that may make it a better choice of estimation methods in this context. For example, one key advantage of quantile regression is that it is better equipped to handle cases where skewness and outlier effects are present in the dependent variable. In the case of golf tournament earnings, it is abundantly clear that the data are strongly positively skewed.

[Figure 1: Distributional Graphs for Real PGA Earnings per Event, 2004-2007. Figure 1A: Box Plot. Figure 1B: Kernel Density Estimate (with Normal Overlay).]

Evidence of skewness and the presence of outliers are provided graphically in Figures 1A and 1B. The first is a box plot of real earnings per event ("earnings" hereafter) for PGA Tour golfers during the 2004 to 2007 seasons. The graph clearly demonstrates a strong positive skew with many outliers. Figure 1B shows the kernel density estimate for the same data and includes an overlay of a normal distribution. Again, the data appear to be strongly positively skewed and the shape of the kernel density estimate appears to be non-normal. In addition to these figures, Table 1 provides empirical tests for skewness and non-normality of earnings. The tests, which are based on skewness, kurtosis, and a joint test for both, strongly reject normality.⁶

Table 1: Normality Tests for Real and Log Earnings per Event

Variable                            Skewness  Kurtosis  Pr(Skewness)  Pr(Kurtosis)  Joint test χ²(2)  Prob > χ²
real earnings per event, thousands  4.698     37.769    0.000         0.000         791.400           0.000
Ln (real earnings per event)        -0.111    3.374     0.200         0.050         5.480             0.065

As for the causes of this skewness in PGA earnings, two reasons emerge. First, the payout structure in PGA tournaments is nonlinear. This point is made clearly in Scully (2002, p. 336), who notes that prize money in golf is based on the rank-order finish of tournament participants with a nonlinear payout structure.
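The skewness and kurtosis columns in Table 1 are moment-based statistics. As a rough illustration only (not the paper's data or software), the sketch below computes both for a small, made-up right-skewed sample and for its natural log. The sample values and the helper name `skew_kurt` are assumptions introduced here, and kurtosis is on the scale where a normal distribution scores 3, which appears to be the scale used in Table 1.

```python
import math


def skew_kurt(values):
    """Return (skewness, kurtosis) from central moments; a normal gives (0, 3)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Hypothetical earnings (thousands): exponentials of an evenly spaced, symmetric
# grid, so the raw values are right-skewed while their logs are symmetric.
grid = [-2.0 + 0.25 * k for k in range(17)]
earnings = [50.0 * math.exp(z) for z in grid]
log_earnings = [math.log(e) for e in earnings]

raw_skew, raw_kurt = skew_kurt(earnings)
log_skew, log_kurt = skew_kurt(log_earnings)
print(f"raw: skewness={raw_skew:.3f}, kurtosis={raw_kurt:.3f}")
print(f"log: skewness={log_skew:.3f}, kurtosis={log_kurt:.3f}")
```

In this constructed sample the log transform removes the skew entirely. In the paper's data it removes most of it (skewness falls from 4.698 to -0.111), although the joint test in Table 1 remains borderline (p = 0.065).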
Golfers in a tournament who make the cut (i.e., those who finish in the top 50% of the field after the first two rounds of the tournament) are eligible for some share of the purse. Of those making the cut, the typical payout structure upon the completion of the tournament is such that the first place finisher receives 18% of the purse, the second place finisher receives 10.8%, the third place finisher receives 6.8%, the fourth place finisher receives 4.8%, and so on. This nonlinear, convex payout structure (with respect to rank-order finish) contributes to the skewness in PGA earnings.

A second reason for the skewness in per event earnings is the presence of some extraordinarily talented golfers. Even with the nonlinear payout structure noted above, per event earnings could still be non-skewed across golfers if tournament wins were spread across a large number of golfers. The fact is, however, that tournament wins are clustered among a small group of highly talented golfers. For example, during the 2004 to 2007 period there were a total of 135 first place wins across all PGA Tour events. Of these 135 first place wins, three golfers (Tiger Woods, Vijay Singh, and Phil Mickelson) collectively took 49 (or 36%) of them.⁷ The clustering of winning, together with the nonlinear payout structure, leads to skewed per event earnings in professional golf.

If we proceed with a simple conditional mean regression estimation (such as OLS) when the data are strongly skewed, then the non-normality of the errors will create difficulties with inference tests based on the usual standard errors and t-statistics. Furthermore, the results may not describe the experience of the typical golfer, since the regression coefficients may be strongly influenced by the skewness and outlier effects (an illustration of this fact will be provided later). As noted earlier, one common solution to reducing the skew in the dependent variable is to transform the data into natural logs and then estimate a conditional mean regression of the natural logs of the dependent variable on the covariates. However, the normality tests presented in Table 1 for the natural log of earnings also show evidence of non-normality, and thus estimating a semi-log form does not entirely solve the problem of non-normality in the errors in this case. Quantile regression, however, is well equipped to deal with such a problem.

Perhaps the greatest advantage of quantile regression is that it allows us to consider non-central points on the conditional distribution function of the dependent variable. That is, conditional mean regression only gives us information about how various covariates, such as driving distance, may affect the conditional mean earnings for golfers.⁸ Quantile regression allows us to explore the possibility that, say, changes in driving distance may affect golfers differently at different points on the conditional earnings distribution. Simply put, quantile regression results may provide a more complete understanding of the effects of various covariates on earnings.

Model, Estimation Approach and Data

The model used in this paper to estimate earnings of PGA golfers is similar to those used by others (e.g., Shmanske, 1992; Moy & Liaw, 1998; Alexander & Kern, 2005), and contains measures of various skills, overall professional experience, and physical characteristics of the golfer.
Equation (1) presents the general form:

$$y_i = \alpha^{(q)} + \beta^{(q)} x_i + \varepsilon_i^{(q)} \qquad (1)$$

The dependent variable, $y_i$, is measured as real earnings per PGA event (in thousands of 2007 dollars).⁹ The vector $x_i$ contains the covariates expected to explain golf earnings, $\beta$ is the vector of coefficients to be estimated, and $\varepsilon_i$ is the error term. Note that the superscript $(q)$ denotes the specific quantile associated with equation (1).¹⁰ The vector $x_i$ contains five measures of golfing skill that contribute to lower scores and ultimately greater earnings. The definitions of these variables and their expected impacts on earnings are as follows:

greens in regulation: the percent of time a player was able to hit the green in regulation (greens hit in regulation / holes played × 100). A green is considered hit in regulation when the golfer has two putts from the green to make par. For example, on a par-5 hole, a green is hit in regulation if the ball is on the putting green by the third shot, leaving two putts to earn par. This measure represents a golfer's skill in iron play. A positive coefficient is expected for this variable.¹¹

putting average: the average number of putts needed to finish a hole per green hit in regulation. Other things equal, the fewer the putts, the lower the score, and thus a negative coefficient is expected.

save percentage: the percent of time a golfer was able to get the ball in the hole in two shots or fewer following landing in a greenside sand bunker (regardless of score). This skill captures the golfer's ability to salvage his/her score by accurately chipping out of the sand, and as such we expect a positive coefficient.

yards per drive: the average number of yards per measured drive.¹² Other things equal, including accuracy, longer drives leave the ball closer to the hole and should generally lead to reduced scores, and thus a positive coefficient is expected.

driving accuracy: the percentage of time a tee shot comes to rest in the fairway. All else equal, greater accuracy in driving should result in lower scores, and as such a positive coefficient is expected.

In addition to the skill measures described above, vector $x_i$ in equation (1) also contains controls for experience. The first is years pro, which is the number of years that have passed since the player debuted as a professional golfer on the PGA Tour. The second is simply the square of years pro. It is expected that earnings will increase with greater experience on the PGA Tour, but with a diminishing effect. Thus a positive coefficient is expected for years pro and a negative coefficient is expected for its square.¹³

Lastly, vector $x_i$ includes two measures of the physical characteristics of golfers. The first is weight (measured in pounds) and the second is height (measured in feet). These measures are included to control for the possibility that physical characteristics of golfers may affect performance and, hence, earnings. For example, it may be the case that taller players require less effort when driving the ball and as such may be more consistent with their drives (i.e., they may have less variance in their driving distance and accuracy). Or it may be the case that, other things equal, heavier players become more fatigued during a tournament and this could affect their consistency as well.

Data on earnings and the above noted measures for players were collected for the 2004 through 2007 PGA Tours; Table 2 provides summary statistics.
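Quantile regression fits equation (1) separately for each quantile q by minimizing a sum of asymmetrically weighted absolute residuals (the "check" or "pinball" loss), rather than squared residuals. As a minimal sketch of that idea, the code below drops the covariates and fits only a constant. The sample values and the function names are hypothetical and are not taken from the paper.

```python
def check_loss(residual, q):
    """Check (pinball) loss for one residual at quantile q, with 0 < q < 1."""
    return q * residual if residual >= 0 else (q - 1) * residual


def best_constant(values, q):
    """Among the observed values, return the constant with the lowest total check loss."""
    return min(values, key=lambda c: sum(check_loss(v - c, q) for v in values))


# Hypothetical earnings with one extreme value (think of a dominant player).
earnings = [1.0, 2.0, 3.0, 4.0, 100.0]

mean_fit = sum(earnings) / len(earnings)   # least-squares constant: pulled up by the outlier
median_fit = best_constant(earnings, 0.5)  # q = 0.5 recovers the sample median
q10_fit = best_constant(earnings, 0.1)
q90_fit = best_constant(earnings, 0.9)

print(mean_fit, median_fit, q10_fit, q90_fit)
```

The mean is dragged far above four of the five observations, while the q = 0.5 fit stays at the middle value. The full estimator applies the same loss to the residuals of equation (1), with the covariates included.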
The median real earnings per event in thousands (not shown in Table 2) is 34.71. The median is much less than the reported mean of 52.02, which illustrates the strong positive skew in the earnings data.

Table 2: Summary Statistics (n = 778)

Variable                            Mean     Std. Dev.  Min      Max
real earnings per event, thousands  52.02    62.10      0.94     682.65
greens in regulation                64.55    2.92       54.30    74.10
putting average                     1.78     0.02       1.71     1.86
save percentage                     48.95    5.93       31.80    68.10
yards per drive                     288.57   8.78       258.70   319.60
driving accuracy                    63.40    5.31       41.90    78.40
years pro                           13.38    6.47       0.00     34.00
weight                              181.94   19.97      136.00   265.00
height (in feet)                    5.96     0.19       5.25     6.42

Empirical Results

As was noted, one of the disadvantages of conditional mean estimation methods, such as OLS, is that the regression results may be strongly influenced by outlier effects. As a means of illustrating this problem in the current context, Table 3 presents the top five most influential observations for a simple levels OLS regression estimate of equation (1). The table reports statistics for DFBETAS for each of the five skill measures included in the regression.¹⁴ The results reveal a very interesting pattern. Namely, Tiger Woods alone strongly affects the estimated coefficients on four of the five skill measures. As an example, including Tiger Woods' performance in 2006 increases the estimated coefficient for greens in regulation by nearly one full standard deviation! Clearly, the use of a simple conditional mean estimation method such as OLS would likely produce misleading values for the estimated coefficients. Fortunately, quantile regression is, by its nature, immune to such outlier effects, and as such it is likely to give a more realistic estimation of the returns to various golfing skills for the typical golfer.

Table 3: DFBETA Results for Top 5 Most Influential Observations*

Greens in Regulation
Player          Year  DFBETA
Tiger Woods     2006  0.999
Tiger Woods     2007  0.669
Vijay Singh     2004  0.473
Sergio Garcia   2004  0.218
Tiger Woods     2005  0.196

Putting Average
Player          Year  DFBETA
Tiger Woods     2007  -0.648
Tiger Woods     2005  -0.473
Ernie Els       2004  -0.315
Tiger Woods     2004  -0.247
Jim Furyk       2006  -0.228

Save Percentage
Player          Year  DFBETA
Tiger Woods     2006  0.495
Tiger Woods     2005  0.236
Vijay Singh     2005  0.190
Phil Mickelson  2004  0.163
Adam Scott      2004  0.161

Yards per Drive
Player          Year  DFBETA
Tiger Woods     2005  0.625
Tiger Woods     2006  0.277
Tiger Woods     2007  0.223
Jose Coceres    2007  0.148
Geoff Ogilvy    2006  0.136

Driving Accuracy
Player          Year  DFBETA
Jose Coceres    2007  0.260
Zach Johnson    2007  0.194
Fred Funk       2007  0.187
Jim Furyk       2006  0.151
Geoff Ogilvy    2006  0.120

* Based on OLS regressions in levels. The cutoff value for DFBETAs that may present problems is |DFBETA| > 2/(N^0.5).

Table 4 presents the quantile regression estimates for equation (1). The first column contains, for comparison purposes, the simple OLS estimation with robust standard errors reported in parentheses.
Columns 2 through 6 report quantile regression estimates for the 10th, 25th, 50th, 75th, and 90th conditional quantiles. The standard errors for these estimated coefficients are computed using a bootstrap method with 2,000 replications.¹⁵ Finally, column 7 shows the Wald statistic for a test of equivalence between the estimated coefficients for the five quantiles. For all regressions, the covariates are centered at their means.
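The bootstrap idea behind those standard errors can be sketched in a few lines: resample the observations with replacement, recompute the statistic on each resample, and take the standard deviation of the replicates. For brevity, the statistic below is a sample median rather than a quantile regression coefficient. The data, the seed, and the function name `bootstrap_se` are assumptions made for illustration; the paper's exact resampling scheme is not described in this section.

```python
import random
import statistics


def bootstrap_se(sample, stat, reps=2000, seed=0):
    """Bootstrap standard error: the std. dev. of `stat` over resamples drawn with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    replicates = []
    for _ in range(reps):
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        replicates.append(stat(resample))
    return statistics.stdev(replicates)


# Hypothetical per-event earnings in thousands (right-skewed on purpose).
earnings = [5.2, 9.8, 14.1, 18.7, 22.3, 27.9, 34.7, 41.0, 55.6, 80.2, 140.5, 682.6]

se_median = bootstrap_se(earnings, statistics.median)
se_mean = bootstrap_se(earnings, statistics.mean)
print(f"bootstrap SE of median: {se_median:.2f}")
print(f"bootstrap SE of mean:   {se_mean:.2f}")
```

In a regression setting the resampled units would presumably be whole golfer-season rows, with the quantile regression re-estimated on each resample.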

Table 4: OLS and Quantile Regression Results for Real Earnings per Event

                            (1)          (2)          (3)          (4)          (5)          (6)          (7)
VARIABLES                   OLS          q10          q25          q50          q75          q90          H0: equivalent coefficients
greens in regulation        7.485***     1.486***     2.158***     4.111***     6.487***     6.377***     7.69***
                            (1.219)      (0.270)      (0.328)      (0.619)      (0.998)      (2.114)
putting average             -700.082***  -182.121***  -286.970***  -374.041***  -498.630***  -717.563***  5.71***
                            (97.118)     (23.157)     (37.512)     (47.718)     (95.964)     (204.680)
save percentage             2.148***     0.396***     0.662***     1.074***     2.276***     3.375***     6.30***
                            (0.332)      (0.090)      (0.102)      (0.210)      (0.415)      (0.997)
yards per drive             0.926**      0.257**      0.462***     0.407*       0.070        1.602*       1.99*
                            (0.371)      (0.105)      (0.121)      (0.228)      (0.455)      (0.898)
driving accuracy            -0.972*      0.257        0.463**      -0.107       -1.593**     -2.086*      2.56**
                            (0.501)      (0.166)      (0.186)      (0.364)      (0.761)      (1.242)
years pro                   1.460*       -0.103       -0.847*      -0.474       0.595        2.554        1.25
                            (0.873)      (0.331)      (0.507)      (0.684)      (1.140)      (2.316)
years pro squared           -0.036       0.005        0.031*       0.022        -0.019       -0.064       1.10
                            (0.028)      (0.010)      (0.018)      (0.024)      (0.036)      (0.079)
weight                      -0.107       -0.062**     -0.044       -0.131       -0.032       -0.115       0.57
                            (0.089)      (0.030)      (0.044)      (0.090)      (0.118)      (0.191)
height (in feet)            15.093       0.983        0.587        4.824        4.775        5.992        0.10
                            (9.175)      (3.252)      (4.670)      (8.284)      (16.093)     (24.889)
Constant                    51.207***    15.527***    22.544***    38.391***    64.458***    102.736***
                            (1.802)      (0.565)      (0.852)      (1.465)      (3.079)      (6.359)
Observations                778          778          778          778          778          778
R-squared/Pseudo R-squared  0.299        0.130        0.128        0.139        0.165        0.229

*** p<0.01, ** p<0.05, * p<0.1. Robust standard errors in parentheses for OLS. Bootstrapped standard errors for quantile regressions (2,000 replications). Column (7) shows Wald test statistics for the equivalence of estimated coefficients.

Starting with the OLS results we see that, with the exception of driving accuracy, all of the skill and experience measures have the predicted signs. The negative sign on driving accuracy is interesting and may reflect a tradeoff between driving accuracy and distance.¹⁶ As for statistical significance, four of the five coefficients on the skill measures (greens in regulation, putting average, save percentage, and yards per drive) are statistically significant at the 5% level or smaller. The coefficients on driving accuracy and years pro achieve a significance level of 10%, and the coefficients on years pro squared, weight, and height fail to achieve an acceptable level of significance.

Turning to the quantile regression estimates, the estimated coefficients for greens in regulation, putting average, and save percentage have the expected signs and are statistically significant at the 1% level in all regressions. The variable yards per drive is statistically significant at the 5% level or less for the 10th and 25th quantile regressions, but tends to be less significant for the higher quantile regressions. This suggests that driving distance is important for those at the lower end of the earnings distribution, but becomes less important for those at the upper end. Or, in other words, the adage "drive for show, putt for dough" seems to apply to the elite golfers, but not necessarily to golfers at the lower end of the earnings distribution. Finally, the Wald tests for statistical equivalence of the estimated coefficients across the quantiles reject the null hypothesis at the 5% level of significance for all of the skill measures with the exception of yards per drive, for which the null is rejected at the 10% level of significance.

Rather than discussing the size of each coefficient separately (there are a total of 60), I will discuss several prominent results.
Notice that the coefficient on greens in regulation for the OLS column suggests that, other things equal, an increase of one percentage point in this measure increases earnings by approximately $7,485. Comparing this result to the median quantile estimation (column 4), we see that an increase of one percentage point in greens in regulation is expected to increase earnings by about $4,111. This is a considerable difference between the two predictions and reflects the effects of the skewness (and the effects of outliers) in the earnings measure. Even more interesting is the fact that the estimated coefficient for greens in regulation increases substantially as we move from the 10th quantile (1.486) to the 90th quantile (6.377). This implies that the effect of an increase in greens in regulation is quite different for golfers with low earnings in comparison to golfers with high earnings, other things equal. The Wald test in column 7 confirms that the difference is statistically significant. Thus we can conclude that not only does an increase in greens in regulation positively affect expected earnings (i.e., producing a positive "location shift" of the conditional earnings distribution), but the increase in expected earnings is also greater at higher points on the conditional distribution function and thus produces a widening in expected earnings (i.e., producing a positive "scale shift").

We find similar results with respect to putting skill. The OLS predicted effect of reducing putting average by one stroke ($700,082) is nearly twice that for the median quantile regression result ($374,041). In addition, notice that the effects of this variable become greater for each successive conditional quantile.
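Because earnings are measured in thousands of dollars, each Table 4 coefficient converts to a dollar effect by multiplying by 1,000, and the gap between the 90th and 10th quantile effects gives a rough sense of the "scale shift." The sketch below tabulates this for greens in regulation and putting average using the coefficients reported in Table 4; the helper name `dollars` and the layout are illustrative choices, not the author's.

```python
QUANTILES = [0.10, 0.25, 0.50, 0.75, 0.90]

# Quantile regression coefficients copied from Table 4
# (dependent variable: real earnings per event, thousands of 2007 dollars).
GREENS_IN_REGULATION = [1.486, 2.158, 4.111, 6.487, 6.377]
PUTTING_AVERAGE = [-182.121, -286.970, -374.041, -498.630, -717.563]


def dollars(coef_thousands, change=1.0):
    """Dollar effect on per-event earnings of changing the covariate by `change` units."""
    return coef_thousands * 1000.0 * change


# One more percentage point of greens in regulation.
gir_effects = [dollars(b) for b in GREENS_IN_REGULATION]
# One fewer putt per green in regulation (change = -1).
putt_effects = [dollars(b, change=-1.0) for b in PUTTING_AVERAGE]

# Difference between the top and bottom quantile effects (widening of the distribution).
gir_spread = gir_effects[-1] - gir_effects[0]
putt_spread = putt_effects[-1] - putt_effects[0]

for q, g, p in zip(QUANTILES, gir_effects, putt_effects):
    print(f"q{int(q * 100)}: greens +1 pt = ${g:,.0f}; putting -1 stroke = ${p:,.0f}")
print(f"q90 minus q10: greens ${gir_spread:,.0f}, putting ${putt_spread:,.0f}")
```

Two caveats follow from the tables themselves. The greens in regulation coefficient at the 75th quantile (6.487) is slightly above the one at the 90th (6.377), so the rise is not strictly monotone. Also, a full one-stroke change in putting average is far outside the observed range in Table 2 (1.71 to 1.86); passing `change=-0.02`, one standard deviation, to `dollars` gives roughly $7,500 at the median.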
The putting estimates imply, for example, that a one-stroke reduction in putting average would increase expected earnings by about $182,000 for golfers at the 10th quantile of earnings, while the same reduction would increase expected earnings by more than $717,000 for golfers at the 90th quantile. In sum, improvements in putting skill increase expected earnings and at the same time tend to widen them, the latter result being due in part to the nonlinear payout structure in PGA tournaments noted earlier.

Figure 2 provides an alternative means of considering the effects of the covariates on both the location and scale of earnings. Each graph in the figure tracks how the estimated coefficient of a covariate changes across successively higher quantiles. The shaded area shows the 95% confidence envelope for the quantile regression estimates, and the dashed line shows the estimated OLS coefficient for the variable. Generally speaking, a plot for a quantile coefficient (and confidence envelope) that lies above the zero axis indicates that earnings increase with an increase in the covariate (i.e., the covariate produces a positive location shift of the conditional earnings distribution), while a plot lying below the zero axis implies the opposite.[17] Furthermore, plots that slope upward indicate that the scale of the conditional distribution tends to widen as the covariate increases, whereas downward-sloping plots indicate that the scale tends to narrow. Thus, as an example, we see that the plot for greens in regulation in Figure 2 is above the zero axis and has a positive slope. This reflects our previous discussion of this coefficient: earnings are increasing with greens in regulation, and the dispersion or scale of earnings is increasing as well. We can see from Figure 2 that save percentage has a qualitative effect similar to that of greens in regulation. The plot for putting average is below the zero axis and has a negative slope. This tells us that decreases in the average number of putts per hole increase earnings and also tend to increase the scale of earnings, all else equal.

Conclusion

The returns to skill in professional golf have been considered in a number of academic research papers.
All of the previous research, however, has employed some version of a conditional mean estimation procedure that may be inappropriate and misleading in light of the fact that earnings in the PGA are strongly skewed. This paper has taken a different approach: that of conditional quantile estimation. Our findings from the estimated quantile regressions indicate not only that the effects of various skills differ from those obtained with conditional mean estimation, but also that the impact of changes in several key skill measures is more complex than what is implied by simple OLS estimates. As in previous studies, improvements in iron play, putting ability, and sand saves all serve to increase expected earnings. The results presented in this paper, however, go further and show that improvements in these skills contribute to a widening of the conditional earnings distribution.

In addition, to the extent that professional golfers practice various skills with an eye on increasing earnings, the results presented in this study may provide better guidance as to how much time to spend practicing various skills. For example, consider a golfer in the 25th percentile of real earnings per event. According to the above results, if this golfer could reduce his/her putting average by one standard deviation (i.e., by 0.02), then his/her earnings would be expected to increase by an estimated $5,739, which represents approximately a 32% increase in per-event earnings. However, a golfer in the 75th percentile of earnings per event would see an increase of approximately $9,973, or 16%, for the same reduction in putting average.[18] This kind of differential impact of improved putting (and of other skills) may affect the time individual golfers spend on developing specific skills.
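The percentage gains quoted above follow from a short calculation, reproduced here as a minimal sketch. The coefficients (-286.97 and -498.63, in thousands of dollars per stroke) and the per-event earnings (17.963 and 62.161, in thousands of dollars) are the values given in endnote 18; the function name `relative_gain` is hypothetical.

```python
def relative_gain(coefficient, change, base_earnings):
    """Expected earnings gain and its share of current earnings.

    coefficient   -- quantile regression coefficient on putting average,
                     in thousands of dollars per stroke
    change        -- change in putting average (negative = improvement)
    base_earnings -- real earnings per event, in thousands of dollars
    """
    gain = coefficient * change
    return gain, gain / base_earnings


# Values taken from endnote 18: 25th and 75th percentile golfers,
# each reducing putting average by one standard deviation (0.02).
gain25, share25 = relative_gain(-286.97, -0.02, 17.963)
gain75, share75 = relative_gain(-498.63, -0.02, 62.161)

print(f"25th percentile: ${gain25 * 1000:,.0f} ({share25:.1%} of earnings)")
print(f"75th percentile: ${gain75 * 1000:,.0f} ({share75:.1%} of earnings)")
```

The dollar gain is larger for the higher-earning golfer, but it is a smaller share of that golfer's per-event earnings, which is the differential impact the paragraph describes.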

Figure 2: Coefficients Graph for Skill and Experience Variables for the Levels Quantile Regressions. (The shaded areas represent the 95 percent confidence intervals for the quantile estimates. The dashed line shows the OLS estimated coefficient.)

There are several possible extensions to the above work. One is to consider alternative skill measures in the earnings function. For example, measures such as birdie conversions or hitting out of the rough may be tested. In addition, the above estimates used pooled cross sections of data for professional golfers. It may be desirable to control for unobserved heterogeneity for individual golfers with a random- or fixed-effects quantile regression. This type of estimation procedure, however, is currently unavailable for quantile regression (see Koenker, 2005). Perhaps future developments in the field will make such an estimation approach possible.

References

Alexander, D. L., & Kern, W. (2005). Drive for show and putt for dough? An analysis of the earnings of PGA Tour golfers. Journal of Sports Economics, 6(1), 46-60.
Callan, S. J., & Thomas, J. M. (2007). Modeling the determinants of a professional golfer's tournament earnings: A multiequation approach. Journal of Sports Economics, 8(4), 394-411.
Ehrenberg, R. G., & Bognanno, M. L. (1990a). Do tournaments have incentive effects? Journal of Political Economy, 98(6), 1307-1324.
Ehrenberg, R. G., & Bognanno, M. L. (1990b). The incentive effects of tournaments revisited: Evidence from the European PGA Tour. Industrial and Labor Relations Review, 43(3), S74-S88.
Hamilton, B. (1997). Racial discrimination and professional basketball salaries in the 1990s. Applied Economics, 29(3), 287-296.
Hao, L., & Naiman, D. Q. (2007). Quantile regression. Thousand Oaks, CA: Sage Publications, Inc.
Jones, J. C. H., & Walsh, W. D. (1988). Salary determination in the National Hockey League: The effects of skills, franchise characteristics, and discrimination. Industrial and Labour Relations Review, 4, 592-604.
Kahn, L. M. (1992). The effects of race on professional football players' compensation. Industrial & Labor Relations Review, 45(2), 295-310.
Kahn, L. M., & Shere, P. D. (1988). Racial differences in professional basketball players' compensation. Journal of Labor Economics, 6(1), 40-61.
Koenker, R. (2005). Quantile regression. Cambridge, UK: Cambridge University Press.
Koenker, R., & Hallock, K. F. (2001). Quantile regression. Journal of Economic Perspectives, 15(4), 143-156.
Moy, R. L., & Liaw, T. (1998). Determinants of professional golf tournament earnings. American Economist, 42(1), 65-70.
Nero, P. (2001). Relative salary efficiency of PGA Tour golfers. American Economist, 45(2), 51-56.
Rishe, P. J. (2001). Differing rates of return to performance: A comparison of the PGA and senior golf tours. Journal of Sports Economics, 2(3), 285-296.
Scully, G. W. (1974). Pay and performance in Major League Baseball. American Economic Review, 64(6), 915-930.
Scully, G. W. (2002). The distribution of performance and earnings in a prize economy. Journal of Sports Economics, 3(3), 235-245.
Shmanske, S. (1992). Human capital formation in professional sports: Evidence from the PGA Tour. Atlantic Economic Journal, 20(3), 66-80.
Shmanske, S. (1998). Price discrimination at the links. Contemporary Economic Policy, 6(3), 368-378.
Shmanske, S. (2000). Gender, skill, and earnings in professional golf. Journal of Sports Economics, 1(4), 385-400.

Shmanske, S. (2008). Skills, performance and earnings in the tournament compensation model: Evidence from PGA Tour microdata. Journal of Sports Economics, 9(6), 644-662.
Vincent, C., & Eastman, B. (2009). Determinants of pay in the NHL: A quantile regression approach. Journal of Sports Economics, 10(3), 256-277.

Endnotes

1 Other topics of research on golf have included such things as the pricing for a round of golf (e.g., Shmanske, 1998) and the incentive effects that tournament-style payoffs have on participants' effort (e.g., Ehrenberg and Bognanno, 1990a, 1990b).
2 See Shmanske (1992) for a brief explanation of the object of playing golf and the kinds of skills needed to be successful in playing the game.
3 Shmanske (2008) also considers a structural equation approach, but uses individual tournament data (as opposed to year-long tournament averages). The use of individual tournament data allows him to take into account the specific aspects of various tournaments (e.g., overall course distance differences). In addition, the microdata employed allow him to incorporate measures of variance and skew in player performance measures in addition to the typical mean measures.
4 The Callan and Thomas (2007) paper employs a two-stage least-squares estimation method. Alexander and Kern (2005) employ a GLS estimator with random effects.
5 Others who have employed quantile regression in the context of sports include Hamilton (1997) for professional basketball and Vincent and Eastman (2009) for professional ice hockey. See Koenker and Hallock (2001) for an introduction to the basics of quantile regression. For a more comprehensive presentation of quantile regression see Koenker (2005).
6 The tests shown in Table 1 are computed using Stata's sktest.
7 Tiger Woods had 22 first-place finishes, Vijay Singh had 16, and Phil Mickelson had 11 (http://www.pgatour.com).
8 And even this may be of little value when the mean value of earnings is not centrally located due to the skewness in the data.
9 All data for the PGA Tour were gleaned from the ESPN website: http://espn.go.com/golf/. Variable definitions were obtained from the PGA Tour website: http://pgatour.com/r/stats/.
10 In the case of a simple conditional mean estimation (e.g., OLS) of equation (1), the superscripts would not appear in the equation.
11 Alexander and Kern (2005) note that this measure may not be a pure measure of ability, as a golfer's greens in regulation value may be dependent on their driving ability. This is also true for putting average and save percentage.
12 As noted on the website http://www.pgatour.com, measurements are taken on two holes per round. The two holes that are selected face in opposite directions to counteract the effect of wind. Distance is measured to the point at which the ball comes to rest, regardless of whether it is in the fairway or not.
13 The age of the player (and its square) were also considered as controls for experience. The results were essentially unchanged when these measures were used in place of years pro (and its square). Attempts to use both measures in the same regression led to clear signs of multicollinearity. This is expected, as the simple linear correlation between age and years pro is approximately 0.97.
14 The DFBETA for golfer i for covariate j is computed as: $\mathrm{DFBETA}_{ij} = (b_j - b_{j(i)}) / \mathrm{se}(b_{j(i)})$. That is, it is the difference between the estimated coefficient $b_j$ including player i and the same coefficient estimated when player i is excluded, $b_{j(i)}$, divided by the standard error of the estimated coefficient when the player is excluded. The larger the absolute value of the DFBETA, the greater the influence the observation has on the estimated coefficient. Observations where the absolute value of the DFBETA is greater than $2/\sqrt{n}$, equal to 0.072 in this case, are considered highly influential observations.
15 Using a bootstrap estimation for the standard errors reduces the problems associated with inference using asymptotic-based standard errors and t-statistics whose validity depends on the normality assumption. See Hao and Naiman (2007) for discussion on this issue.
16 Thanks are due to an anonymous referee for pointing out this possibility. The idea here is that players who attempt to drive the ball further may do so at the expense of accuracy. Indeed, the correlation between yards per drive and driving accuracy for the sample used in this study is -0.63.
17 A plot that generally falls on the horizontal axis, such as that for years pro squared, implies the covariate does not have a statistically significant impact on earnings.
18 Real earnings per event for golfers in the 25th and 75th percentiles are approximately $17,963 and $62,161, respectively. Using the coefficient for putting average shown in Table 4 for golfers in the 25th percentile, the expected relative gain from reducing their putting average by 0.02 is: $(-0.02) \times (-286.97) / 17.963 = 0.319$. Similarly, for golfers in the 75th percentile, we have: $(-0.02) \times (-498.63) / 62.161 = 0.160$.

Author's Note

This paper was prepared for the Western Economic Association International Conference in Vancouver, Canada, June 29-July 3.
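As an illustration of the DFBETA diagnostic defined in endnote 14, the following minimal Python sketch computes it for the slope of a one-covariate regression on made-up data. The data, the planted outlier, and the names `fit_line`, `slope_se`, and `slope_dfbetas` are assumptions for illustration only; the regressions in the paper have several covariates and a much larger sample.

```python
from math import sqrt


def fit_line(xs, ys):
    """Least-squares fit with one covariate; returns (intercept, slope)."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def slope_se(xs, ys):
    """Conventional OLS standard error of the slope coefficient."""
    n = len(xs)
    intercept, slope = fit_line(xs, ys)
    ssr = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sqrt(ssr / (n - 2) / sxx)


def slope_dfbetas(xs, ys):
    """DFBETA for the slope: (b - b_(i)) / se(b_(i)) for each observation i."""
    _, b_full = fit_line(xs, ys)
    values = []
    for i in range(len(xs)):
        xs_i = xs[:i] + xs[i + 1:]   # sample with observation i excluded
        ys_i = ys[:i] + ys[i + 1:]
        _, b_drop = fit_line(xs_i, ys_i)
        values.append((b_full - b_drop) / slope_se(xs_i, ys_i))
    return values


# Made-up data: roughly y = 2x, with the last observation replaced by
# a planted outlier.
xs = [float(x) for x in range(1, 11)]
noise = [0.5, -0.5] * 5
ys = [2.0 * x + e for x, e in zip(xs, noise)]
ys[-1] = 40.0

dfbetas = slope_dfbetas(xs, ys)
threshold = 2 / sqrt(len(xs))  # the 2/sqrt(n) rule of thumb from endnote 14
flagged = [i for i, d in enumerate(dfbetas) if abs(d) > threshold]

print("threshold:", round(threshold, 3))
print("flagged observations:", flagged)
```

The planted outlier has by far the largest absolute DFBETA and exceeds the cutoff. As a side check on the rule of thumb, a cutoff of 0.072 is what $2/\sqrt{n}$ gives for a sample of roughly 770 observations; that sample size is inferred here from the cutoff, not quoted from the text.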
