The Rise in Infield Hits

Similar documents
Infield Hits. Infield Hits. Parker Phillips Harry Simon PARKER PHILLIPS HARRY SIMON

When Should Bonds be Walked Intentionally?

Salary correlations with batting performance

Matt Halper 12/10/14 Stats 50. The Batting Pitcher:

2014 Tulane Baseball Arbitration Competition Eric Hosmer v. Kansas City Royals (MLB)

2014 NATIONAL BASEBALL ARBITRATION COMPETITION ERIC HOSMER V. KANSAS CITY ROYALS (MLB) SUBMISSION ON BEHALF OF THE CLUB KANSAS CITY ROYALS

2013 National Baseball Arbitration Competition

Major League Baseball Offensive Production in the Designated Hitter Era (1973 Present)

A Database Design for Selecting a Golden Glove Winner using Sabermetrics

Team 10. Texas Rangers v. Nelson Cruz. Brief in support of Nelson Cruz

Additional On-base Worth 3x Additional Slugging?

2013 Tulane National Baseball Arbitration Competition

Figure 1. Winning percentage when leading by indicated margin after each inning,

JEFF SAMARDZIJA CHICAGO CUBS BRIEF FOR THE CHICAGO CUBS TEAM 4

The Influence of Free-Agent Filing on MLB Player Performance. Evan C. Holden Paul M. Sommers. June 2005

2014 National Baseball Arbitration Competition

Lorenzo Cain v. Kansas City Royals. Submission on Behalf of the Kansas City Royals. Team 14

NBA TEAM SYNERGY RESEARCH REPORT 1

4-3 Rate of Change and Slope. Warm Up. 1. Find the x- and y-intercepts of 2x 5y = 20. Describe the correlation shown by the scatter plot. 2.

CS 221 PROJECT FINAL

Dexter Fowler v. Colorado Rockies (MLB)

Chapter. 1 Who s the Best Hitter? Averages

PREDICTING the outcomes of sporting events

2017 B.L. DRAFT and RULES PACKET

2015 Winter Combined League Web Draft Rule Packet (USING YEARS )

2018 Winter League N.L. Web Draft Packet

An Analysis of the Effects of Long-Term Contracts on Performance in Major League Baseball

Team Number 6. Tommy Hanson v. Atlanta Braves. Side represented: Atlanta Braves

It s conventional sabermetric wisdom that players

2014 National Baseball Arbitration Competition

2011 COMBINED LEAGUE (with a DH) DRAFT / RULES PACKET

2014 National Baseball Arbitration Competition

Percentage. Year. The Myth of the Closer. By David W. Smith Presented July 29, 2016 SABR46, Miami, Florida

Predicting the Total Number of Points Scored in NFL Games

Simulating Major League Baseball Games

Average Runs per inning,

2013 National Baseball Arbitration Competition. Tommy Hanson v. Atlanta Braves. Submission on behalf of Atlanta Braves. Submitted by Team 28

A LOOK BACK AT A brief recap of the 2013 campaign follows.

Pitching Performance and Age

2013 National Baseball Arbitration Competition Tulane University Law School

Department of Economics Working Paper Series

Predicting Season-Long Baseball Statistics. By: Brandon Liu and Bryan McLellan

A V C A - B A D G E R R E G I O N E D U C A T I O N A L T I P O F T H E W E E K

Draft - 4/17/2004. A Batting Average: Does It Represent Ability or Luck?

MONEYBALL. The Power of Sports Analytics The Analytics Edge

2015 NATIONAL BASEBALL ARBITRATION COMPETITION. Mark Trumbo v. Arizona Diamondbacks. Submission on Behalf of Mark Trumbo. Midpoint: $5,900,000

TOP OF THE TENTH Instructions

How Effective is Change of Pace Bowling in Cricket?

Running head: DATA ANALYSIS AND INTERPRETATION 1

BABE: THE SULTAN OF PITCHING STATS? by. August 2010 MIDDLEBURY COLLEGE ECONOMICS DISCUSSION PAPER NO

OFFICIAL RULEBOOK. Version 1.08

Stats 2002: Probabilities for Wins and Losses of Online Gambling

4-3 Rate of Change and Slope. Warm Up Lesson Presentation. Lesson Quiz

An average pitcher's PG = 50. Higher numbers are worse, and lower are better. Great seasons will have negative PG ratings.

Do Clutch Hitters Exist?

Old Age and Treachery vs. Youth and Skill: An Analysis of the Mean Age of World Series Teams

A Simple Visualization Tool for NBA Statistics

Regression to the Mean at The Masters Golf Tournament A comparative analysis of regression to the mean on the PGA tour and at the Masters Tournament

WHAT CAN WE LEARN FROM COMPETITION ANALYSIS AT THE 1999 PAN PACIFIC SWIMMING CHAMPIONSHIPS?

save percentages? (Name) (University)

HSA Park Rules 2011 Girls 8 & Under League

Pitching Performance and Age

a) List and define all assumptions for multiple OLS regression. These are all listed in section 6.5

2014 National Baseball Arbitration Competition

2014 Tulane Baseball Arbitration Competition Josh Reddick v. Oakland Athletics (MLB)

The probability of winning a high school football game.

2014 Tulane Baseball Arbitration Competition Eric Hosmer v. Kansas City Royals

OFFICIAL RULEBOOK. Version 1.16

2015 NATIONAL BASEBALL ARBITRATION COMPETITION. Lorenzo Cain v. Kansas City Royals (MLB) SUBMISSION ON BEHALF OF KANSAS CITY ROYALS BASEBALL CLUB

Membership Package Year 1. Practice Plans & Drills

Lesson 2 Pre-Visit Slugging Percentage

2015 National Baseball Arbitration Competition

Indoor Wiffle Ball League Rule Book

Fairfax Little League PPR Input Guide

How percentages are used in sports

Baseball and Softball

Machine Learning an American Pastime

SAP Predictive Analysis and the MLB Post Season

Relative Value of On-Base Pct. and Slugging Avg.

A Brief Shutdown Innings Study. Bob Mecca

2015 NATIONAL BASEBALL ARBITRATION COMPETITION

Grade 5 Lesson 1. Lesson Plan Page 2. Student Article Page 5. Opinion Practice Activity Handout Page 9

Raymond HV Gallucci, PhD, PE;

Building an NFL performance metric

Jonathan White Paper Title: An Analysis of the Relationship between Pressure and Performance in Major League Baseball Players

An examination of try scoring in rugby union: a review of international rugby statistics.

Defensive Observations. II. Build your defense up the middle. By Coach Jack Dunn I. Defensive Observations

Regression Analysis of Success in Major League Baseball

The MLB Language. Figure 1.

y ) s x x )(y i (x i r = 1 n 1 s y Statistics Lecture 7 Exploring Data , y 2 ,y n (x 1 ),,(x n ),(x 2 ,y 1 How two variables vary together

Table 1. Average runs in each inning for home and road teams,

Baseball Strategies By Bob Bennett, Jack Stallings READ ONLINE

Season Plans. We hope you have learned a lot from this book: what your responsibilities. chapter 9

A Markov Model of Baseball: Applications to Two Sluggers

Offensive & Defensive Tactics. Plan Development & Analysis

When you think of baseball, you think of a game that never changes, right? The

George F. Will, Men at Work

MANAGER WHEN IS A MANAGER DESIGNATED?

Correction to Is OBP really worth three times as much as SLG?

SOFTBALL STANDARDS GRADE LEVEL STANDARD DESCRIPTION PE.2.C.2.5 PE.2.C.2.9 PE.2.L.3.3 PE.3.M.1.7 PE.2.C.2.8 PE.3.L.3.3 PE.3.L.3.4 PE.3.L.4.

Transcription:

The Rise in Infield Hits Parker Phillips Harry Simon December 10, 2014 Abstract For the project, we looked at infield hits in major league baseball. Our first question was whether or not infield hits have been on the rise since 2005. We found through analyzation of the percentage of infield hits, number of infield hits, and percentage of ground balls, we can definitively say that infield hits have become more common in major league baseball. From there, we looked at why this has been occurring. Our findings suggest that one important aspect of why this has happened is due to a decrease in the overall ERA of the league. From there, we looked at if IFH% is at all an important statistic in evaluating the offensive worth of a player. This result was inconclusive, but suggest that IFH% is more important that it has been accredited for. 1 Introduction First, why is a player s Infield Hit% (IFH%) important? IFH% is calculated by taking the total number of hits that do not leave the infield divided by the total number of ground balls. A player with a high IFH% tells us that the player is turning would-be-groundball-outs into singles. A player with a low IFH% tells us that out of all their ground balls, the player is generating outs (potentially double plays) instead of singles. The first question we had to address is the following: Has the number of infield hits been increasing since 2005? To look at this we needed to look at both the infield hits statistic, and any statistic related to infield hits to isolate whether or not it has in fact been increasing on its own, and not as a result of something else. All of the data we worked with was obtained from fan graphs.com We grabbed the information that pertained to the entire MLB in relation to IFH%. After graphing IFH% versus the year (see Figure 1), it became quite clear that the percentage of infield hits has been on a definite upward trend since 2005. From there, to make sure that this trend was not a result of there being less ground balls, we analyzed the number of ground balls in that time as well. Figure 2 also has an upward trend. These two figures together proved to us that the number of infield hits in the major leagues has definitely been increasing since 2005. 1

Figure 1: IFH% over time Figure 2: GB% over time 2

The data holds that IFH% has been on a rise in the past years. In the following sections, we discuss different methods of evaluating players by their IFH%. 2 Over Time Once we found that IFH% has definitely been increasing, we asked if IFH% is an important tool in evaluating the worth of a player. To do this, we wanted to see any correlation between IFH% and WAR. We chose WAR because it is a subjective, widely accepted tool used to evaluate the worth of a player. If IFH% has a strong correlation with WAR, we would be able to say that it is an important statistic. However, since it is not a strong correlation, we created a new offensive statistic to show that although IFH% by itself is not strong in predicting WAR, adding it to other stats increases its value. We look at data for ERA over time. Because ERA is a statistic where the higher the number is, the worse the pitcher is, we changed it to negative ERA in order to clearly visualize the relationship it has with other statistics. First, we graphed it against time. If pitching has been either improving or declining over time, then it would be possible for it to be affecting IFH%. (See Figure 3 Figure 3: Negative ERA over time In Figure eraovertime, it is easy to see that pitching has definitely been improving over time. The R 2 value between season and ERA is the IFH% statistic is 0.83. With this information, we could then directly compare IFH% and ERA and know that any correlation would be matched over time as well which would support our question (see Figure 4). After graphing the two against each other, it became clear of a definite correlation. The R 2 between the two is 0.7365, which is a strong correlation. To see if this correlation is unique, ERA was compared to other hitting relating statistics, including Average, R 2 =.87, with a negative slope. Intuitively, this statistic makes sense. As pitching 3

Figure 4: IFH% versus Negative ERA increases, the batting average of players should be decreasing. When compared to slugging average, the R 2 value equals.83 with a negative slope, and for number of RBIs, it is also.83 with a negative slope. What these comparisons showed us is that hitting statistics normally correlated with stronger hitters have been getting worse as ERA gets better. Although this makes sense, it made us question why then, would Infield Hit Percentage increase with better pitching. 3 Hitting versus Pitching Hypothesis: As pitching gets better, players are hitting solid hits less and less, including top tier players. These top tier players however, have speed, and are still able to beat out the throw to first and get the infield hit. Infield hits are far less effective tools that a normal base hit, which explains why this is happening as ERA decreases. To see if this was true, we created a statistic based on IFH% and SLG% to see if IFH does play any role in predicting WAR (see Figure 5). SLG% was chosen because, when compared with WAR, it had the highest R 2 value of.4. When added together in a 5 to 4 ratio (5*SLG% + 4*IFH%) the combination was better at predicting WAR than either by itself, with an R 2 value of.44 (see Figure 6). By adding even more unlikely statistics that relate to IFH%, such as GB% and LD%, we were even able to push the R 2 value up to.477 (see Figure 7). This shows that players who have both a high SLG% and a high IFH% tend to have higher WARs than players who simply have a high SLG%. A simple example would be to compare Andrew McCutchen to Chris Davis. McCutchen has an 8.1 IFH% and a.508 SLG%. Davis has a 3.1 IFH% and a whopping.634 SLG%. However, of the two, McCutchen has a better WAR value by 1.4 (8.2 to 6.8). 4

Figure 5: IFH% as a predictor of WAR Figure 6: IFH% and SLG% as predictors of WAR 5

Figure 7: IFH% + SLG% + GB% + LD% as predictors of WAR Figure 8: SLG% + GB% + LD% as predictors of WAR 6

4 Speedy Players Currently there are not many statistics that predict how valuable a player s speed is to the game of baseball. In the 1970 s, Bill James formulated a statistic called Speed Score or Spd. On Fangraphs, Spd is calculated using Stolen Base Percentage (SB%), Frequency of Stolen Base Attempts, Percentage of Triples, and Runs Scored Percentage. Moreover, Fangraphs has a Base-Running Value statistic (BsR), which they consider to be an all encompassing base running statistic that turns stolen bases, caught stealings, and other base running plays into runs above and below average. However, neither statistic incorporates IFH% into their calculations. Stolen bases and triples are rather clear indicators of speed. From a stolen base, a speedy player turns a single into a double based off of speed alone. A speedy player can turn a double into a triple perchance (and even sometimes a single into a double). However, a player who can turn a would-be-groundball-out into a single should also be deemed as a speedy player. From this, it is more than reasonable to believe that IFH% should be incorporated in determining the speed of a player. The new statistic we formulated takes into account IFH%, the player s Base Running value, Stolen Base %, and Triples. We call this statistic NewSpeed. It is defined as: IF H% NewSpeed = c 1 LeagueAverageIF H% + c BsR 2 LeagueAverageBsR + c 3 3B LeagueAverage3B SB% LeagueAverageSB% + c 4 Consider Table 1 comparing outfielders Starling Marte and Carlos Gomez. Both players have nearly identical stats, except Starling Marte has a 2% lead in IFH% and nearly twice the value in Base Running. The average for IFH% was 7.9% among outfielders. Spd rates Marte as the speedier player, which NewSpeed accounts for as well. Yet, Marte s and Gomez s Spd s are nearly identical, despite the gaps in Base Running value and IFH%. Our stat NewSpeed suggests Marte is a much speedier player, accounting for Marte s abilities on the base paths. Player Offense Base Running IFH IFH% SB SB% 3B Spd NewSpeed Starling Marte 25.3 5.8 24 14.5 % 30 73.2% 6 7.1 6.09 Carlos Gomez 26.8 3.3 20 12.7% 34 73.9% 4 6.5 4.58 Who wins? C.G. S.M. S.M. S.M C.G C.G S.M. S.M. S.M. Table 1: Comparing Players 4.1 Analysis In order to assess our new statistic NewSpeed, we assessed how it measured up to the other speed statistics in predicting WAR. Figure 9 shows Spd as a predictor of WAR. However, it possesses the lowest R 2 value. Spd is not a good predictor of WAR. 7

Figure 10 shows Base Running Value as a predictor of WAR and possesses a R 2 value of 0.1309, double that of Spd. Figure 11 shows an unweighted NewSpeed as a predictor of WAR and possesses the greatest R 2 value of 0.17. This suggests of the three statistics, it is the best predictor of WAR. When NewSpeed is weighted with the following weights: c 1 = 5, c 2 = 2, c 3 = 5, c 4 = 0 The weighted NewSpeed has an R 2 value of.28. Figure 13 factors out IFH% as part of the NewSpeed statistic. The R 2 value decreases here significantly, almost identical with that of Base Running value. Figure 13 shows that IFH% does better help predict WAR values. Our new speed statistic shows that players who possess good IFH%, SB%, general Base Running Value, and are able to generate Triples will be the more valuable speedier players. Figure 9: SpD Rating x WAR 5 Conclusion Although our findings show that IFH% has potential to be a useful statistic for predicting value, the best statistic we could create with IFH% was still fairly poor at predicting the WAR of a player with an R 2 value of only 0.47. What we did find to be useful is that IFH% is a good statistic for predicting ERA of a pitcher. If a team keeps track of the number of infield hits given up by a pitcher, they can have a pretty good idea of what the ERA of the pitcher might be. 8

Figure 10: BaseRunning x WAR Figure 11: NewSpeed x WAR 9

Figure 12: weightednewspeed x WAR Figure 13: NewSpeedMinus x WAR 10

Overall, given more time and resources, we believe that it is definitely possible to create a statistic utilizing IFH% that accurately reflect either the value of a position player, or the value of a pitcher, and that IFH% can be used as an important statistic in the future. 11