Statistical Analysis of PGA Tour Skill Rankings USGA Research and Test Center June 1, 2007

Similar documents
Select Boxplot -> Multiple Y's (simple) and select all variable names.

One-way ANOVA: round, narrow, wide

Example 1: One Way ANOVA in MINITAB

Analysis of Variance. Copyright 2014 Pearson Education, Inc.

Introduction to Analysis of Variance (ANOVA) The Structural Model, The Summary Table, and the One- Way ANOVA

Name May 3, 2007 Math Probability and Statistics

Stats 2002: Probabilities for Wins and Losses of Online Gambling

Data Set 7: Bioerosion by Parrotfish Background volume of bites The question:

ANOVA - Implementation.

Running head: DATA ANALYSIS AND INTERPRETATION 1

One-factor ANOVA by example

A few things to remember about ANOVA

Unit4: Inferencefornumericaldata 4. ANOVA. Sta Spring Duke University, Department of Statistical Science

Legendre et al Appendices and Supplements, p. 1

Announcements. Lecture 19: Inference for SLR & Transformations. Online quiz 7 - commonly missed questions

Unit 4: Inference for numerical variables Lecture 3: ANOVA

Driv e accu racy. Green s in regul ation

Section I: Multiple Choice Select the best answer for each problem.

Chapter 12 Practice Test

Week 7 One-way ANOVA

Case Processing Summary. Cases Valid Missing Total N Percent N Percent N Percent % 0 0.0% % % 0 0.0%

save percentages? (Name) (University)

Changes in speed and efficiency in the front crawl swimming technique at 100m track

Youngs Creek Hydroelectric Project

Is lung capacity affected by smoking, sport, height or gender. Table of contents

ASTERISK OR EXCLAMATION POINT?: Power Hitting in Major League Baseball from 1950 Through the Steroid Era. Gary Evans Stat 201B Winter, 2010

1wsSMAM 319 Some Examples of Graphical Display of Data

Confidence Interval Notes Calculating Confidence Intervals

PLANNED ORTHOGONAL CONTRASTS

The Reliability of Intrinsic Batted Ball Statistics Appendix

DIRECTION DEPENDENCY OF OFFSHORE TURBULENCE INTENSITY IN THE GERMAN BIGHT

Lower Columbia River Dam Fish Ladder Passage Times, Eric Johnson and Christopher Peery University of Idaho

Biostatistics & SAS programming

Minimal influence of wind and tidal height on underwater noise in Haro Strait

DEPARTMENT OF FISH AND GAME

Robert Jones Bandage Report

Safety at Intersections in Oregon A Preliminary Update of Statewide Intersection Crash Rates

23 RD INTERNATIONAL SYMPOSIUM ON BALLISTICS TARRAGONA, SPAIN APRIL 2007

Youngs Creek Hydroelectric Project (FERC No. P 10359)

Chapter 5: Methods and Philosophy of Statistical Process Control

Appendix A: Before-and-After Crash Analysis of a Section of Apalachee Parkway

1 Hypothesis Testing for Comparing Population Parameters

Procedia Engineering Procedia Engineering 2 (2010)

Stat 139 Homework 3 Solutions, Spring 2015

Factorial Analysis of Variance

A Statistical Analysis of the Factors that Potentially Affect the Price of A Horse

Sample Final Exam MAT 128/SOC 251, Spring 2018

Denise L Seman City of Youngstown

SCIENTIFIC COMMITTEE SEVENTH REGULAR SESSION August 2011 Pohnpei, Federated States of Micronesia

CHAPTER ANALYSIS AND INTERPRETATION Average total number of collisions for a try to be scored

y ) s x x )(y i (x i r = 1 n 1 s y Statistics Lecture 7 Exploring Data , y 2 ,y n (x 1 ),,(x n ),(x 2 ,y 1 How two variables vary together

5.1 Introduction. Learning Objectives

Announcements. % College graduate vs. % Hispanic in LA. % College educated vs. % Hispanic in LA. Problem Set 10 Due Wednesday.

Distancei = BrandAi + 2 BrandBi + 3 BrandCi + i

Laboratory Activity Measurement and Density. Average deviation = Sum of absolute values of all deviations Number of trials

On the association of inrun velocity and jumping width in ski. jumping

WHAT CAN WE LEARN FROM COMPETITION ANALYSIS AT THE 1999 PAN PACIFIC SWIMMING CHAMPIONSHIPS?

Accident data analysis using Statistical methods A case study of Indian Highway

Table 4.1: Descriptive Statistics for FAAM 26-Item ADL Subscale

Row / Distance from centerline, m. Fan side Distance behind spreader, m 0.5. Reference point. Center line

Was John Adams more consistent his Junior or Senior year of High School Wrestling?

Figure 1 Location of the ANDRILL SMS 2006 mooring site labeled ADCP1 above.

TECHNICAL REPORT STANDARD TITLE PAGE

Chapter 9: Hypothesis Testing for Comparing Population Parameters

MEMORANDUM. Investigation of Variability of Bourdon Gauge Sets in the Chemical Engineering Transport Laboratory

A Depletion Compensated Wet Bath Simulator For Calibrating Evidential Breath Alcohol Analyzers

Innovative Golf Putter

Factorial ANOVA Problems

PGA Tour Scores as a Gaussian Random Variable

Fishery Resource Grant Program Final Report 2010

Monitoring Population Trends of White-tailed Deer in Minnesota Marrett Grund, Farmland Wildlife Populations and Research Group

Economic Value of Celebrity Endorsements:

Fluid Flow. Link. Flow» P 1 P 2 Figure 1. Flow Model

Math SL Internal Assessment What is the relationship between free throw shooting percentage and 3 point shooting percentages?

An Empirical Comparison of Regression Analysis Strategies with Discrete Ordinal Variables

Major League Baseball Offensive Production in the Designated Hitter Era (1973 Present)

A Statistical Evaluation of the Decathlon Scoring Systems

The Normal Distribution, Margin of Error, and Hypothesis Testing. Additional Resources

Total Ionization Dose (TID) Test Results of the RH1021BMH-10 Precision 10V Low Dose Rate (LDR) LDR = 10 mrads(si)/s

Empirical Example II of Chapter 7

Confidence Intervals with proportions

Real Life Turbulence and Model Simplifications. Jørgen Højstrup Wind Solutions/Højstrup Wind Energy VindKraftNet 28 May 2015

Total Ionization Dose (TID) Test Results of the RH1028MW Ultralow Noise Precision High Speed Operational Low Dose Rate (LDR)

In addition to reading this assignment, also read Appendices A and B.

Taking Your Class for a Walk, Randomly

That pesky golf game and the dreaded stats class

Psychology - Mr. Callaway/Mundy s Mill HS Unit Research Methods - Statistics

MRI-2: Integrated Simulation and Safety

Smart Water Application Technologies (SWAT)

Tutorial for the. Total Vertical Uncertainty Analysis Tool in NaviModel3

Atmospheric Rossby Waves in Fall 2011: Analysis of Zonal Wind Speed and 500hPa Heights in the Northern and Southern Hemispheres

Universitas Sumatera Utara

Gender, Skill, and Earnings PGA vs. LPGA

Setting up group models Part 1 NITP, 2011

PRACTICAL EXPLANATION OF THE EFFECT OF VELOCITY VARIATION IN SHAPED PROJECTILE PAINTBALL MARKERS. Document Authors David Cady & David Williams

Supercritical CO2 Power Cycles: Design Considerations for Concentrating Solar Power

WIND DATA REPORT. Swan s Island, ME

On the Challenges of Analysis and Design of Turret-Moored FPSOs in Squalls

NMSU Red Light Camera Study Update

b

Transcription:

Statistical Analysis of PGA Tour Skill Rankings 198-26 USGA Research and Test Center June 1, 27 1. Introduction The PGA Tour has recorded and published Tour Player performance statistics since 198. All of this information is published for each year on the PGA Tour website (www.pgatour.com). Among a number of parameters listed on the PGA Tour website, these published statistics include four key skills. These are: - driving distance - driving accuracy - greens in regulation - putting average (data not available prior to 1986) In addition, the PGA Tour records and publishes the amount of money each player has won during each year. The players annual performance for money won and all measured skills are ranked from the highest to the lowest. Correlating the money won rankings to skill rankings can be used to determine the relative importance of key skills to winning and how the relative importance has changed on the PGA Tour over the period from 198 to 26. 2. Analysis method The USGA has compared player rankings in the skills listed above to their ranking for money won. This was done with a Spearman Rank Correlation study, wherein the ranking for money won is correlated to the ranking for a particular skill. This study is done with only those players who have participated in enough events to qualify to be listed in the skills ranking. Every year there are a number of players who have earned money on Tour, but have not played enough Tour event rounds to be listed in the skill rankings. These players are not considered in the Spearman Rank Correlation, and are removed from the money won ranking for this analysis. The remaining players are re-ranked according to their results so that the money ranking has an equal quantity of players as each skill ranking. The ranks for tied players in a Spearman Rank Correlation are obtained by averaging the ranks that the tied players would occupy, although it was found to be immaterial to the results obtained in this study. The correlation coefficient (r) is calculated for each year s comparison of money won ranking to skill ranking. These correlation coefficients have been plotted for all years from 198 until 26. They are also presented in Appendix I. 1

3. Results The summary of the Spearman Rank Correlation study from 198-26 is presented in Table 1. Plots of the year-by-year rank correlations are given in Figures 1 4. GIR Putting Avg Driving Distance Driving Accuracy 198-26 Mean.54.56.21.31 198-26 Std. Dev..94.11.97.177 Table 1: Correlation coefficients between Money winning rank and skill rank - means and standard deviations. (Putting Average 1986-26) 1.9 GIR.8 Correlation Coefficient.7.6.5.4.3.54.2.1 Figure 1: Correlation coefficient between money winning rank and greens-inregulation (GIR) rank on the PGA Tour from 198 to 26 2

1 Driving Distance.9.8.7 Correlation Coefficient.6.5.4.3.2.21.1 Figure 2: Correlation coefficient between money winning rank and driving distance rank on the PGA Tour from 198 to 26 1 Putting Average.9.8 Correlation Coefficient.7.6.5.4.3.56.2.1 Figure 3: Correlation coefficient between money winning rank and putting average rank on the PGA Tour from 1986 to 26. (Putting average was not published for years prior to 1986.) 3

1.9 Driving Accuracy.8.7 Correlation Coefficient.6.5.4.3.2.1 -.1 -.2 Figure 4: Correlation coefficient between money winning rank and driving accuracy rank on the PGA Tour from 198 to 26. 4. Analysis of Rank Correlation 4.1. Greens In Regulation and Putting Average rankings have relatively strong correlations to money won rankings. Over time, GIR has shown more stability because Putting Average correlation coefficient has decreased in the two most recent years and bears further observation to determine if this is an anomaly or an ongoing trend. 4.2 Compared to GIR and Putting Average, Driving Distance ranking has a relatively low correlation to Money ranking. It has remained fairly stable over the period studied. 4.3 Of these key skills, Driving Accuracy has changed the most over the time period studied. The overall standard deviation of the results is higher than the other skills. During the 198 s Driving Accuracy was as nearly as strongly correlated to money ranking as GIR and Putting Average. This changed in 1992 and again in 23. Rather than a general ongoing decline, two sudden declines followed by stability have resulted in three stable eras at three distinct levels of correlation. (see Fig. 5) These changes are further evaluated in Appendix II. For the current era (23-6), the level of correlation 4

between Money ranking and Driving Accuracy rank has nearly reached the lowest level absolute value it can attain. 1.9 Driving Accuracy.8.7 Correlation Coefficient.6.5.4.3.2.47.24.1 -.1.1 -.2 Figure 5: Correlation coefficient of money winning rank to driving accuracy rank on the PGA Tour shown as three separate eras. 5

5. Top 1 Money Winners Rank Analysis In addition to the Spearman Rank Correlation Study, the top ten money winners from each year were evaluated for their rankings in the key skills of driving distance, driving accuracy, GIR, and putting average. The average ranking for each skill for the top ten money winners has been calculated. The best ranking players have the lowest numerical rank, so the lower average rank is a superior average rank. 6. Results The summary of the top ten money winners key skills ranking from 198-26 is presented in Table 1. Figures 6-1 are the plots of the average ranking for driving distance, driving accuracy, greens in regulation, and putting average for the top ten money winners from each year since 198. These results are also presented in Appendix III. GIR Putting Avg. Driving Distance Driving Accuracy Mean 4.5 38.9 6.6 73 Std. Dev. 1.8 12.4 14.2 27.5 Table 2: Average rank of the top 1 money winners 198-26 (putting average 1986-26) - means and standard deviations. GIR 2 4 4.5 Average Rank 6 8 1 12 14 16 Figure 6: Average GIR rank of top ten money winners including average value 198-26. 6

Driving Distance 2 4 Average Rank 6 8 1 6.6 12 14 16 Figure 7: Average driving distance rank of top ten money winners 198-26. 2 Putting Average 4 38.9 6 Average Rank 8 1 12 14 16 Figure 8: Average putting average rank of top ten money winners 1986-26. 7

. Driving Accuracy 2. 4. Average Rank 6. 8. 1. 12. 14. 16. Figure 9: Average driving accuracy rank of top ten money winners 198-26.. 2. Driving Accuracy 4. 48.3 Average Rank 6. 8. 1. 77.5 12. 12.4 14. 16. Figure 1: Three eras of average driving accuracy rank of top ten money winners 198-26 8

7. Analysis of Top Ten Rank 7.1 The Putting Average of the top ten money winners has the lowest average rank over the period studied. 7.2 GIR is the most stable of all the skills for average rank of the top ten money winners. 7.3 Driving accuracy has changed the most over the period studied in a similar fashion to the change shown in the Spearman Rank Correlation Study. Three separate, relatively stable eras have occurred with the changes taking place at a similar time that the driving accuracy correlation changed (top ten: 199 / 23; correlation: 1992 / 23). These changes are further evaluated in Appendix IV. The current level (23-26) demonstrates a numerically high average ranking that is different from the other skills. This difference from the other skills is similar to the results obtained in the rank correlation studies. 9

Appendix I Correlation Annual Coefficient between PGA Tour Money Winning Rank and Skill Rank Driving Distance Accuracy GIR Putting Average 198.12.53.37 1981.12.39.65 1982.21.38.59 1983.1.48.6 1984.12.51.7 1985.34.4.61 1986.18.52.4.68 1987.16.54.6.64 1988.29.53.57.63 1989.6.52.42.48 199.12.42.57.57 1991.25.47.61.61 1992.3.29.66.65 1993.36.24.63.64 1994.31.33.59.6 1995.39.18.41.55 1996.24.27.47.62 1997.26.23.62.42 1998.28.18.55.6 1999.2.21.52.63 2.35.28.58.59 21.16.23.65.58 22.1.19.44.6 23.15..45.47 24.12..5.52 25.25 -.6.44.22 26.8.1.44.37 1

Appendix II An examination of the data that appear in Figure 5 indicates that the correlation coefficient between accuracy and money earned in 1992 decreases by more than three standard deviations from its mean value over the time period from 198 to 1991. The coefficient maintains this decrease of more than three standard deviations until 22. In 23 the coefficient decreases again by more than three standard deviations from its average value over the period from 1992 to 22. It continues this same decrease for another 2 years with a smaller decrease in 26. This behavior suggests that one can pool the data into three separate time spans: 198 to 1991, 1992 to 22, and 23 to 26, and test whether the correlation coefficient varies over these three time periods. An ANOVA is applied to the data to test the hypothesis of a uniform value of the correlation coefficient. If the hypothesis is rejected, a Tukey Multiple Comparison Test can be made to determine confidence intervals for the difference associated with each of the correlation coefficients over each time span. If one lets mu(1), mu(2), and mu(3) be the value of the correlation between money and accuracy over the time intervals 198 to 1991, 1992 to 22, and 23 to 26, respectively, and tests the null hypothesis that mu(1)=mu(2)=mu(3) against the alternative that some are not equal, one finds from the analysis below that one must reject the null hypothesis. In addition one obtains the following 95% simultaneous confidence intervals for the difference between these mu s:.17 < mu(1) mu(2) <.3.14 < mu(2) mu(3) <.32; i.e., the decreases are somewhere between.17 and.3 for the first jump and.14 and.32 for the second jump simultaneously at the 95% level of confidence. The MINITAB output of the analysis that leads to the above conclusion appears below. One-way ANOVA: Correlation between Money and Accuracy versus Time Span Analysis of Variance for Correlation between Money and Accuracy Source DF SS MS F P Time Span 2.73971.36985 113.92. Error 24.7792.325 Total 26.81763 Individual 95% CIs For Mean Based on Pooled StDev Level N Mean StDev ----+---------+---------+---------+-- mu(1) 12.47442.692 (-*-) mu(2) 11.2399.4888 (-*-) mu(3) 4.1.6633 (---*--) ----+---------+---------+---------+-- Pooled StDev =.5698..16.32.48 Tukey's pairwise comparisons Family error rate =.5; Individual error rate =.2; Critical value = 3.53.17595 < mu(1) mu(2) <.2947.3823 < mu(1) mu(3) <.54653.1465 < mu(2) mu(3) <.31213 11

Appendix III Average Annual Skill Rank of Top 1 PGA Tour Money Winners Driving Distance Accuracy GIR 198 62.5 55.4 55.8 1981 73.6 4.2 23.2 1982 68.4 4.6 31.6 1983 69.5 42.6 32.3 1984 46.3 64.3 33.4 1985 6.2 47.4 23. Putting Average 1986 65.2 51.3 41.6 24.1 1987 81.4 51.4 5.3 2.2 1988 67.6 51.5 51.9 23.7 1989 85.8 37.8 47.1 31.7 199 58.9 75.3 42.4 49.7 1991 45.3 69.3 46.4 45.3 1992 67.7 82.3 41.7 27.6 1993 46.7 78.1 35.1 27.1 1994 75.6 35.7 35.1 34.6 1995 45. 83.8 69.8 64.3 1996 56.9 85.5 42.7 44.4 1997 86.6 77.9 5.8 5.4 1998 47.9 84. 45.6 53.7 1999 57. 7.1 37.2 26.2 2 41.2 85.6 34.5 33.9 21 56.8 87.7 3.3 36.6 22 67. 92.3 42.9 47.3 23 36.9 17. 26.3 52.4 24 34.6 15.7 45.2 34.6 25 69.5 11.7 28.3 55. 26 61.2 113. 49.7 34.8 12

Appendix IV In an analysis similar to the one that appears in Appendix 1, it is possible to show that the data of the mean driving accuracy rank of the top 1 money winners in Figure 13 can be partitioned into three time spans: 198 to 1989, 199 to 22, and 23 to 26. This behavior suggests that one can pool the data into these three separate time spans, and test whether the mean rank varies over these three periods. An ANOVA is applied to the data to test the hypothesis of a uniform value of the mean rank. If the hypothesis is rejected, a Tukey Multiple Comparison Test can be made to determine confidence intervals for the difference associated with each of the mean ranks over each time span. If one lets mu(1), mu(2), and mu(3) be the value of the mean rank of the top 1 money winners driving accuracy over the time intervals 198 to 1989, 199 to 22, and 23 to 26, respectively, and tests the null hypothesis that mu(1)=mu(2)=mu(3) against the alternative that some are not equal, one finds from the analysis below that one must reject the null hypothesis. In addition one obtains the following 95% simultaneous confidence intervals for the difference between these mu s: 15 < mu(2) mu(1) < 44 23 < mu(3) mu(2) < 62; i.e., the decreases are somewhere between 15 and 44 for the first jump and 23 and 62 for the second jump simultaneously at the 95% level of confidence. The MINITAB output of the analysis that leads to the above conclusion appears below. One-way ANOVA: Top 1 Driving Accuracy Rank versus Time Span Analysis of Variance for Top 1 Money Winners Driving Accuracy Rank Source DF SS MS F P Time Span 2 15358 7679 43.3. Error 24 4282 178 Total 26 1964 Individual 95% CIs For Mean Based on Pooled StDev Level N Mean StDev -------+---------+---------+--------- mu(1) 1 48.25 8.19 (--*--) mu(2) 13 77.51 14.24 (--*-) mu(3) 4 12.35 2.38 (---*----) -------+---------+---------+--------- Pooled StDev = 13.36 6 9 12 Tukey's pairwise comparisons Family error rate =.5; Individual error rate =.2; Critical value = 3.53 15.23 < mu(2) - mu(1) < 43.28 52.37 < mu(3) - mu(1) < 91.83 23.78 < mu(3) - mu(2) < 61.91 13