Chapter. 1 Who s the Best Hitter? Averages

Similar documents
Major League Baseball Offensive Production in the Designated Hitter Era (1973 Present)

Additional On-base Worth 3x Additional Slugging?

Correction to Is OBP really worth three times as much as SLG?

The Rise in Infield Hits

GUIDE TO BASIC SCORING

Baseball Scorekeeping for First Timers

Salary correlations with batting performance

George F. Will, Men at Work

2014 Tulane Baseball Arbitration Competition Eric Hosmer v. Kansas City Royals (MLB)

Fairfax Little League PPR Input Guide

BABE: THE SULTAN OF PITCHING STATS? by. August 2010 MIDDLEBURY COLLEGE ECONOMICS DISCUSSION PAPER NO

AP Stats Chapter 2 Notes

Lorenzo Cain v. Kansas City Royals. Submission on Behalf of the Kansas City Royals. Team 14

Lesson 1 Pre-Visit Batting Average

Matt Halper 12/10/14 Stats 50. The Batting Pitcher:

One could argue that the United States is sports driven. Many cities are passionate and

When Should Bonds be Walked Intentionally?

Relative Value of On-Base Pct. and Slugging Avg.

2014 NATIONAL BASEBALL ARBITRATION COMPETITION ERIC HOSMER V. KANSAS CITY ROYALS (MLB) SUBMISSION ON BEHALF OF THE CLUB KANSAS CITY ROYALS

Rating Player Performance - The Old Argument of Who is Bes

2013 National Baseball Arbitration Competition

2015 NATIONAL BASEBALL ARBITRATION COMPETITION

Table of Contents. Pitch Counter s Role Pitching Rules Scorekeeper s Role Minimum Scorekeeping Requirements Line Ups...

CHAPTER 2 Modeling Distributions of Data

1: MONEYBALL S ECTION ECTION 1: AP STATISTICS ASSIGNMENT: NAME: 1. In 1991, what was the total payroll for:

The pth percentile of a distribution is the value with p percent of the observations less than it.

2014 Tulane Baseball Arbitration Competition Josh Reddick v. Oakland Athletics (MLB)

2014 National Baseball Arbitration Competition

2018 Winter League N.L. Web Draft Packet

Draft - 4/17/2004. A Batting Average: Does It Represent Ability or Luck?

Lesson 2 Pre-Visit Slugging Percentage

Dexter Fowler v. Colorado Rockies (MLB)

NUMB3RS Activity: Is It for Real? Episode: Hardball

Index. r (correlation), 127 s and u, 51, 55 s (steps), 103 SF, 4 SLG, 3 SLOB, 11 TB, 3 u and s, 51, 55

Chapter 1 The official score-sheet

2015 Winter Combined League Web Draft Rule Packet (USING YEARS )

2013 Tulane National Baseball Arbitration Competition

Since the National League started in 1876, there have been

Minnesota Twins (7-10) 8, Seattle Mariners (7-10) 5 April 25, 2015

Triple Lite Baseball

Do Clutch Hitters Exist?

OAKLAND ATHLETICS MATHLETICS MATH EDUCATIONAL PROGRAM. Presented by ROSS Dress for Less and Comcast SportsNet California

IHS AP Statistics Chapter 2 Modeling Distributions of Data MP1

2015 NATIONAL BASEBALL ARBITRATION COMPETITION. Lorenzo Cain v. Kansas City Royals (MLB) SUBMISSION ON BEHALF OF KANSAS CITY ROYALS BASEBALL CLUB

CS 221 PROJECT FINAL

Offensive & Defensive Tactics. Plan Development & Analysis

How percentages are used in sports

Figure 1. Winning percentage when leading by indicated margin after each inning,

Scorekeeping Guide Book

Averages. October 19, Discussion item: When we talk about an average, what exactly do we mean? When are they useful?

2015 National Baseball Arbitration Competition

Chapter 2: Modeling Distributions of Data

2014 National Baseball Arbitration Competition

An average pitcher's PG = 50. Higher numbers are worse, and lower are better. Great seasons will have negative PG ratings.

A PRIMER ON BAYESIAN STATISTICS BY T. S. MEANS

TOP OF THE TENTH Instructions

In my left hand I hold 15 Argentine pesos. In my right, I hold 100 Chilean

2017 B.L. DRAFT and RULES PACKET

2011 COMBINED LEAGUE (with a DH) DRAFT / RULES PACKET

Greatest Show on the Diamond. As April rolls in, the majority of us are thinking some of the same things; finals and

MONEYBALL. The Power of Sports Analytics The Analytics Edge

Package mlbstats. March 16, 2018

2015 NATIONAL BASEBALL ARBITRATION COMPETITION

2017 International Baseball Tournament. Scorekeeping Hints

Jonathan White Paper Title: An Analysis of the Relationship between Pressure and Performance in Major League Baseball Players

It s conventional sabermetric wisdom that players

Department of Economics Working Paper Series

Team 10. Texas Rangers v. Nelson Cruz. Brief in support of Nelson Cruz

#35 CODY BELLINGER #58 EDWARD PAREDES

2010 Boston College Baseball Game Results for Boston College (as of Feb 19, 2010) (All games)

An Analysis of the Effects of Long-Term Contracts on Performance in Major League Baseball

HMB Little League Scorekeeping

Defining Greatness A Hall of Fame Handbook

Scorekeeping Clinic Heather Burton & Margarita Yonezawa &

A New Chart for Pitchers and My Top 10 Pitching Thoughts Cindy Bristow - Softball Excellence

Lesson 3 Pre-Visit Teams & Players by the Numbers

There are three main pillars of behavior consistently found in successful baseball players and teams:

Inside Baseball Take cues from successful baseball strategies to improve your game in business. By Bernard G. Bena

"That's Interference! Time!" Dear Coach,

DISTRICT 53 SCOREKEEPER CLINIC

KRISTOPHER NEGRÓN (45) POSITION: AGE: BORN: BATS: THROWS: HEIGHT:

Average Runs per inning,

Rare Play Booklet, version 1

2013 National Baseball Arbitration Competition Tulane University Law School

DLL Scorekeeping Guide. Compiled by Kathleen DeLaney and Jill Rebiejo

Baseball Basics for Brits

JULY 2012 GREYSCALE Mathletics_Workbook_6-8.indd 1

Fair Ball! Why Adjustments Are Needed. Best Batters and the Offensive Events Used to Determine Them

Southern U. Baseball 2017 Overall Statistics for Southern U. (as of Apr 01, 2017) (All games Sorted by Batting avg)

Chasing Hank Aaron s Home Run Record. Steven P. Bisgaier, Benjamin S. Bradley, Peter D. Harwood and Paul M. Sommers. June, 2002

B. AA228/CS238 Component

New York Yankees (34-31) 6, Seattle Mariners (34-32) 3 June 12, 2014

Seattle Mariners (52-45) 3, Los Angeles Angels (58-38) 2 July 19, 2014

EDGAR MARTINEZ: PRAISE FROM HIS PEERS

A Remark on Baserunning risk: Waiting Can Cost you the Game

Correlation and regression using the Lahman database for baseball Michael Lopez, Skidmore College

Chapter 3 Displaying and Describing Categorical Data

TULANE UNIVERISTY BASEBALL ARBITRATION COMPETITION NELSON CRUZ V. TEXAS RANGERS BRIEF FOR THE TEXAS RANGERS TEAM # 13 SPRING 2012

A Markov Model of Baseball: Applications to Two Sluggers

TULANE BASEBALL ARBITRATION COMPETITION. Isaac Benjamin Davis. New York Metropolitan Baseball Club, Inc. ARBITRATION BRIEF FOR ISAAC BENJAMIN DAVIS

Transcription:

Chapter 1 Who s the Best Hitter? Averages The box score, being modestly arcane, is a matter of intense indifference, if not irritation, to the non-fan. To the baseball-bitten, it is not only informative, pictorial, and gossipy but lovely in aesthetic structure. ROGER ANGELL, AUTHOR You don t need to be a baseball fan to have heard that a certain player is having a good season and hitting.300. Or perhaps a bad season and not even batting.230. These numbers are a kind of average. In baseball, a player s batting average, AVG, over a particular period of time is the number of hits, H, divided by the number of official atbats, AB, during that period. That is, AVG = H AB. For example, Babe Ruth s best season, from the point of view of batting average, was in 1923 when he got 205 hits in 522 official at-bats. Thus H = 205, AB = 522, and 1

2 A Mathematician at the Ballpark AVG = H AB = 205 522 L.393. I will use whenever an equality is just an approximation. To seven decimal places, this average is.3927203, but it is traditional to give batting averages to three places. So, the average.393 is Babe Ruth s batting average for the 1923 season. Babe Ruth s lifetime batting average isn t so shabby either. It is.342 because, for his lifetime, AB = 8399, H = 2873, and AVG = H AB = 2873 8399 L.342. By the way, the phrase batting average is an unfortunate choice. It should be hitting proportion because it gives the proportion of official at-bats that were hits. Thus throughout his career, about.342 of Babe Ruth s official at-bats were hits. Or, if you prefer percentages, about 34.2 percent of his official at-bats were hits. Of course, batting average is absolutely standard and all baseball fans know what it means, so I won t fight it. There are other numbers that we can think of as averages that many baseball strategists feel more accurately measure hitting effectiveness. In their book [36, Baseball Dynasties, page 146], Rob Neyer and Eddie Epstein are emphatic: Why do people have such a hard time letting go of batting average? On-base percentage and slugging percentage are more important, more significant, more meaningful, more everything than batting average. Batting average is a red herring. Let go of it, friends. It s not that important. In case you missed that, they later say, on page 282: Dear Reader, if you learn nothing else from this

Who s the Best Hitter? 3 book, please learn the fact that batting average is not the most important offensive number. Let s look more closely at some of these other sorts of averages, which came into prominence in the 1980s. Slugging percentage, SLG, takes into account the total number of bases, TB, that is, TB = 1B + 2 * 2B + 3 * 3B + 4 * HR, where 1B is the number of singles, 2B is the number of doubles, 3B is the number of triples, and HR is the number of home runs. Then the slugging percentage is SLG = TB AB. Here s another formula for SLG: SLG = H + 2B + 2 * 3B + 3 * HR, AB which is easier to use when you re not provided the number 1B of singles. In 1923, Babe Ruth had 106 singles, 45 doubles, 13 triples, and 41 home runs. Therefore he had TB = 106 + 2 * 45 + 3 * 13 + 4 * 41 = 399 total bases, and his slugging percentage was SLG = TB AB = 399 522 L.764.

4 A Mathematician at the Ballpark This is spectacular, though his slugging percentage was.847 in 1920, which was a record up to the year 2001, when Barry Bonds finally broke it. For Babe Ruth s career, 1B = 1517, 2B = 506, 3B = 136, and HR = 714, so TB = 1517 + 2 * 506 + 3 * 136 + 4 * 714 = 5793, and therefore Babe Ruth s lifetime slugging percentage was which is by far the highest in Major League history, assuming at least 3000 plate appearances. Let s compare the batting average, AVG, and the slugging percentage, SLG. Because the number of hits is the total of the numbers of singles, doubles, triples, and home runs, we can write H = 1B + 2B + 3B + HR. Thus we have SLG = SLG = TB AB = 5793 8399 L.690, AVG = 1B + 2B + 3B + HR, AB 1B + 2 * 2B + 3 * 3B + 4 * HR. AB The only difference between these formulas is that SLG gives doubles double-the-value, triples triple-the-value, and home runs four times the value. Recall that AB represents the official at-bats. This does not take into account the number of bases on balls, BB, the number of times the batter is hit by a pitched ball, HBP, or the number of times the batter hits a sacrifice fly, SF. It is sometimes convenient to calculate batter effective-

ness in terms of all plate appearances. The number of all plate appearances is PA = AB + BB + HBP + SF. Who s the Best Hitter? 5 We might regard the added terms, BB, HBP, and SF as the unofficial at-bats. Another popular measure of batter effectiveness is the on-base percentage, OBP, defined by OBP = H + BB + HBP. PA This is the proportion of plate appearances where the player gets on base. Note that home runs count as on base, even though the batter just touches the bases on his way back to home plate. The numbers HBP and SF are not readily available for players, like Babe Ruth, who played before statisticians and mathematicians really started analyzing the game. Let s calculate the lifetime on-base percentage, OBP, for someone a little closer to our era, Pete Rose. His career numbers are AB = 14,053, H = 4256, TB = 5752, BB = 1566, HBP = 107, and SF = 79. So PA = AB + BB + HBP + SF = 14,053 + 1566 + 107 + 79 = 15,805 and Pete Rose s lifetime on-base percentage is OBP = = H + BB + HBP PA 4256 + 1566 + 107 15,805 L.375.

6 A Mathematician at the Ballpark His number of hits, H = 4256, is the all-time record. So is his number of at-bats AB = 14,053; next is Hank Aaron who had 12,364 lifetime at-bats. Pete s lifetime batting average was AVG = 4256 14,053 L.303. This is pretty good, but as Ted Williams put it, Baseball is the only field of endeavor in which a man can succeed three times out of ten and be considered a good performer. Each of AVG, SLG, and OBP is an interesting statistic that provides useful information about a batter s effectiveness. But none of these statistics provides a single number that summarizes batters overall effectiveness. AVG is the oldest established statistic and is invariably the one used to determine batting champions. But AVG doesn t take into account extra-base hits, much less home runs, runsbatted-in, bases on balls, and other important measures of batter effectiveness. SLG certainly takes into account extrabase hits, but it gives too much credit for them. Is a triple really three times as valuable as a single? For that matter, is a home run four times as valuable as a single? Getting on base is crucial, so AVG does measure an important statistic. But there are other ways to get on base, most notably bases on balls, so the on-base percentage OBP is also a useful statistic. In other words, AVG, SLG, and OBP are all useful statistics, but none of them is close to reflecting the whole picture. Another measure of a batter s offensive contribution is On-Base Plus Slugging (OPS). The formula is very simple: OPS = SLG + OBP.

Who s the Best Hitter? 7 This is one of the statistics championed by Pete Palmer, one of the two gurus of baseball statistics over the past 20 years; the other one is Bill James. OPS is becoming increasingly popular because it s easy to calculate. Moreover, though it may appear as an artificial statistic, it correlates quite well with some lesser known, but more complicated, statistics that are more realistic measures of overall hitting prowess than AVG, SLG, or OBP. Let s return to the enigmatic Pete Rose. To calculate his lifetime slugging percentage SLG, we need the breakdown of his 14,053 hits: 1B = 3215, 2B = 746, 3B = 72, and HR = 223. Then Pete s TB = 3215 + 2 * 746 + 3 * 72 + 4 * 223 = 5815, so SLG = TB AB = 5815 14,053 L.414. Now Pete s lifetime OPS = SLG + OBP =.414 +.375 =.789. It s an interesting fact that Pete Rose does not rank in the top 100 of all ballplayers in any of the categories AVG, SLG, OBP, or OPS. We now turn our attention to Barry Bonds, regarded by many as the best hitter of our time. In 2001, his statistics were AB = 476, H = 156, 1B = 49, 2B = 32, 3B = 2, HR = 73, BB = 177, HBP = 9, and SF = 2. Bonds 73 home runs established a new season record. We have TB = 49 + 2 * 32 + 3 * 2 + 4 * 73 = 411,

8 A Mathematician at the Ballpark so AVG = H AB = 156 476 L.328, SLG = TB AB = 411 476 L.863, PA = AB + BB + HBP + SF = 476 + 177 + 9 + 2 = 664, H + BB + HBP 156 + 177 + 9 664 = 342 OBP = PA = 664 L.515, OPS = SLG + OBP L.863 +.515 L 1.379. In that last line, it looks like I can t add. But, the approximation 1.379 is correct, because SLG L.863445 and OBP L.515060, and so OPS L 1.378505, which rounds up to 1.379. Bonds slugging percentage,.863, for the season exceeds the older record of.847 set by Babe Ruth in 1920. His OPS of 1.379 just beats Babe Ruth s 1.378 set in 1920. Of course, the Babe never heard of the concept OPS. In 2002, Barry Bonds statistics were AB = 403, H = 149, 1B = 70, 2B = 31, 3B = 2, HR = 46, BB = 198, HBP = 9, and SF = 2. This time TB = 70 + 2 * 31 + 3 * 2 + 4 * 46 = 322, so AVG = H AB = 149 403 L.370, SLG = TB AB = 322 403 L.799, PA = AB + BB + HBP + SF = 403 + 198 + 9 + 2 = 612, H + BB + HBP 149 + 198 + 9 612 = 356 OBP = PA = 612 L.582, OPS = SLG + OBP L.799 +.582 L 1.381. In 2002, Bonds AVG =.370 led the majors. More impressive, his OBP =.582 exceeds Ted Williams.553, a

Who s the Best Hitter? 9 record that goes back to 1941. His OPS is also a new record, just beating out his own record of 2001. Bonds is a fantastic player, but it isn t fair to say that he is as great as Babe Ruth because Ruth was the dominant player for nearly 20 years and changed the character of the game. So much for household names. Let s consider some players who haven t yet ascended to the pantheon. Consider the 2003 regular-season offensive numbers for the two outstanding Japanese outfielders: the Yankee Hideki Matsui and the Mariner Ichiro Suzuki. Some of their numbers are in the next table. Hideki Matsui and Ichiro Suzuki AB H 2B 3B HR BB HBP SF H. Matsui 623 179 42 1 16 63 3 6 Ichiro S. 679 212 29 8 13 36 6 1 For Matsui, and SLG = = AVG = H AB = 179 623 L.287 H + 2B + 2 * 3B + 3 * HR AB 179 + 42 + 2 + 48 623 Because his number of plate appearances is = 271 623 L.435. PA = AB + BB + HBP + SF = 623 + 63 + 3 + 6 = 695,

10 A Mathematician at the Ballpark Matsui s OBP = H + BB + HBP PA = 179 + 63 + 3 695 L.353. Finally, Matsui s OPS = SLG + OBP L.435 +.353 =.788. We list these numbers in the next table, as well as Matsui s number R of runs scored, number RBI of runs batted in, number SO of strike-outs, number SB of stolen bases, number CS of times caught trying to steal a base, and number E of fielding errors. Exactly the same formulas were used to calculate the numbers for Ichiro, as shown in the following table. Hideki Matsui and Ichiro Suzuki AVG SLG OBP OPS R RBI SO SB CS E H. Matsui.287.435.353.788 82 106 86 2 2 8 Ichiro S..312.436.352.788 111 62 69 34 8 2 On the basis of the ever-popular AVG, Ichiro had a substantially better offensive season than Matsui. But their more significant numbers SLG, OBP, and OPS are amazingly close to each other. These numbers suggest that they are equally effective offensively, though one can always argue over related issues. For example, Matsui had more walks than Ichiro, and Matsui had an outstanding postseason. But the walks were already taken into account in calculating OBP and hence OPS. Matsui fans might point out that Matsui had significantly more RBIs, but that s not

surprising since Ichiro was a lead-off batter. One offensive statistic that s been overlooked so far is stolen bases. Ichiro stole 34 in 42 attempts, while Matsui stole two in four attempts, so Ichiro wins out here. Also, Ichiro s fielding was better with two errors compared with Matsui s eight. In fact, Ichiro had only six errors in the three years 2001 2003. In their important book [42, The Hidden Game of Baseball], John Thorn and Pete Palmer prefer the product of OBP and SLG, which is called the Batter s Run Average or BRA = OBP * SLG. They argue that BRA is a better measure of offensive power than the sum OPS = OBP + SLG because it correlates better with the number of runs produced. I don t disagree, but OPS and BRA are closely correlated and, moreover, OPS is easier to calculate, easier to relate to, and more popular. Some authors use SLOB in place of BRA. See, for example, [7]. There s actually some merit to SLOB. This is a catchy, if not neat, acronym, and it s easy to remember that SLOB = SLG * OBP. Who s the Best Hitter? 11 However, it is generally a bad idea to fight established notation, and BRA is the standard. By the way, some authors (e.g., [40]) call SLG and OBP the slugging average and the on-base average, because they are not percentages. I agree, but they aren t averages either. In much of the nonbaseball world, average refers to what you get if you sum a bunch of numbers and divide by the number of such numbers. Thus the average salary of players on a baseball team can be calculated by summing

12 A Mathematician at the Ballpark the players salaries and dividing by the number of players. For example, if there were 26 players and the annual player payroll is $43,500,000, then the average salary would be 43,500,000 $ 26 Figuring out who is the best hitter is not a simple task, even when one understands what the various averages really mean. But some statistical phenomena can make the task seem impossible. We end this chapter with one such phenomenon, called Simpson s paradox, which was first observed in the context of economics and that pops up in unexpected places. Let s consider two recent ballplayers, Derek Jeter and David Justice. In 1995, Jeter s AVG = H AB = 12 =.250 and 48 Justice s AVG = H AB = 104 411 L.253. Justice also had a higher batting average in 1996: Jeter s AVG = H AB = 183 L.314 and 582 Justice s AVG = H AB = 45 140 L.321. It seems reasonable to conclude that, just based on batting averages, Justice did better than Jeter over these two years. However, if we combine the two years, we find that Jeter s AVG = L $1,673,000. 12 + 183 48 + 582 = 195 630 L.310,

Who s the Best Hitter? 13 while This bizarre phenomenon seems to occur for some pair of interesting ballplayers about once a year. To make this particular story even more delicious, observe that Justice beat out Jeter again in 1997: But if we combine all three years, we get while Justice s AVG = Jeter s AVG = Justice s AVG = 104 + 45 411 + 140 = 149 551 L.270. Jeter s AVG = 190 L.291 and 654 Justice s AVG = 163 495 L.329. 12 + 183 + 190 48 + 582 + 654 = 385 1284 L.300, 104 + 45 + 163 411 + 140 + 495 = 312 1046 L.298. You could say that there s no justice for Justice! Simpson s paradox isn t a true paradox, but it s called a paradox because it s counterintuitive and even distressing. When batting averages are combined, one adds both the numerators and the denominators because that s the way to get the total number of hits and at-bats for the longer period of time. This is different from adding the actual numerical fractions. Since Jeter s 1995 and 1996 averages,.250 and.314, are less than Justice s averages,.253

14 A Mathematician at the Ballpark and.321, Jeter s sum.564 is certainly less than Justice s sum.574. But these sums are meaningless in this context, and one doesn t get the longer-term averages in this way. Who s the best hitter? One reason that this question causes so many arguments is that there s so many ways to measure hitting excellence. Which is the better measure depends in part on what are, in your judgment, the most important aspects of hitting. None of our averages will give you a simple absolute answer as to who is the best hitter, but they do have the power to reveal the truth about why a hitter is good.