The Intrinsic Value of a Pitch

Size: px
Start display at page:

Download "The Intrinsic Value of a Pitch"

Transcription

1 The Intrinsic Value of a Pitch Glenn Healey (ghealey@uci.edu) Electrical Engineering and Computer Science University of California, Irvine, CA Research Paper for 2017 SABR Analytics Conference 1 Introduction Traditional statistics that are employed to measure and predict the performance of pitchers are based on observed outcomes. These statistics are affected by a number of confounding variables that are beyond the control of the pitcher such as the defense, the ballpark, the umpire, and the catcher. Approaches that reduce the dependence of statistics on these variables include selectively removing plate appearances as in FIP [12] or attempting to adjust for the effect of variables as in DRA [25]. Both of these approaches have important limitations. Removing plate appearances restricts the scope of a statistic while the use of inexact adjustments introduces distortion in a statistic s computed value. As baseball games have been recorded with increasing detail, analysts have developed methods for measuring pitcher performance at the level of an individual pitch. In his pioneering work, Burley [6] used a linear weights model [38] to define the observed value of a pitch as the change in run expectancy given the pitch outcome. Using PITCHf/x measurements, Walsh [39] extended the approach by using a pitch classification scheme to assign linear weight values to different pitch types. This inspired further research [1] [28] [29] and linear weight pitch values are readily accessible on the internet [2]. Since these pitch values depend on observed outcomes, however, they are affected by numerous variables that are independent of the quality of a pitch. Individual factors such as framing, for example, have been shown to have a large effect on observed pitch values [32]. Due to these confounding variables, observed pitch values have a low degree of repeatability [33] which limits their utility for prediction. The deployment of sensors [11] [24] [26] that characterize the trajectory of pitches and batted balls in three dimensions provides the opportunity to assign an intrinsic value to a pitch that depends on its physical properties and not on its observed outcome. We exploit 1

2 this opportunity by utilizing a Bayesian framework to map five-dimensional PITCHf/x velocity, movement, and location vectors to pitch intrinsic values. HITf/x data is used by the model to obtain intrinsic quality-of-contact values [17] [18] for batted balls that are invariant to the defense, ballpark, and atmospheric conditions. Separate mappings are built to accommodate the effects of count and batter/pitcher handedness. A kernel method is used to generate nonparametric estimates for the component probability density functions in Bayes theorem while cross-validation enables the model to adapt to the size and structure of the data. The methodology does not suffer the loss of information that is inherent with schemes that rely on pitch classification and is sufficiently general to support the use of additional variables such as spin rate. The new model is efficient and supports the real-time dissemination of intrinsic pitch values during games. We use the Cronbach s alpha [8] estimate of reliability [7] [40] to show that intrinsic pitch values have a significantly higher internal consistency than outcome-based pitch values which enables more accurate predictive models. We further develop a method to combine intrinsic values at the individual pitch level into a statistic that captures the value of a pitcher s collection of pitches over a period of time. We use this statistic to show that pitchers who outperform their intrinsic values during a season tend to perform worse the following year. Since intrinsic values are based on physical measurements, the new statistics can be used to predict how a pitcher s collection of pitches will translate from other levels(amateur, minors, foreign leagues) to the MLB environment. By directly relating the physical properties of a pitch to expected performance, this approach also promises to improve our understanding of how pitcher skill varies with age. 2 Computing Pitch Intrinsic Values 2.1 Sensor Data PITCHf/x [11] is a system that uses two cameras to capture a set of images of a pitch. The PITCHf/x images can be used to estimate the three-dimensional path of a pitch and to derive information about its speed and movement. Our analysis of PITCHf/x data considers several of the reported attributes for each pitch. The pair (l x,l z ) specifies the location of a pitch as it crosses home plate where l x is the horizontal coordinate and l z is the vertical 2

3 coordinate relative to an origin at the back vertex of home plate. The positive x-axis points to the right from the catcher s perspective, the positive y-axis points toward second base, and the positive z-axis points up. The coordinates l x and l z are typically reported in feet. Themovement ofapitch(b x,b z )isdefinedasthedifference betweenthepitchlocation(l x,l z ) and the theoretical location of a pitch thrown at the same speed that does not deviate from a straight path due to spin [30]. The movement parameters b x and b z are typically reported in inches. The value s is an estimate of pitch speed in three dimensions near the release point in miles per hour. Since each batter has a unique strike zone in the z dimension, we transform l z before processing into a coordinate system with a standard strike zone. The transform maps a pitch at the top or bottom of the batter s individual strike zone to the top or bottom of the standard strike zone. The batter s individual zone is stretched accordingly to map to the standard zone while pitches above or below the batter s individual zone by a vertical distance z are mapped to a transformed z coordinate that is z above or below the standard zone. The HITf/x system [24] uses the PITCHf/x images to estimate the initial speed and direction of batted balls in three dimensions. The direction is specified by two angles. The vertical launch angle is the angle that the batted ball s initial velocity vector makes with the plane of the playing field and the horizontal spray angle specifies the direction of the projection of the batted ball s initial velocity vector onto the plane of the playing field. The woba cube model [17] [18] is used to specify an intrinsic value for batted balls at contact using the measured exit speed, vertical angle, and horizontal angle as depicted in figure 1. This intrinsic value is independent of variables that include the defense, ballpark, and weather conditions. 3

4 2.2 Bayesian Foundation Figure 1: woba Cube We develop a method for learning the dependence of a pitch s intrinsic value on its measured parameters. Using Bayes theorem, the posterior probability of an outcome R j given a measured pitch vector v = (s,b x,b z,l x,l z ) is given by P(R j v) = p(v R j)p(r j ) p(v) where p(v R j ) is the conditional probability density function for v given outcome R j, P(R j ) is the prior probability of outcome R j, and p(v) is the probability density function for v. We consider the six possible outcomes R 0 = ball in play, R 1 = called ball, R 2 = called strike, R 3 = swinging strike, R 4 = foul ball, and R 5 = batter hit-by-pitch where foul tips that are caught for strikeouts are classified as R 3 and not R 4. Our analysis will model the dependence of each of the factors in (1) on the count and the platoon configuration. We show in section 2.6 that a weighted sum of the P(R j v) values over outcomes provides a measure of the intrinsic value of a pitch. (1) 4

5 2.3 Kernel Density Estimation The goal of density estimation for our application is to recover the conditional probability density functions p(v R j ) in equation (1) from a set of measured pitch vectors and their outcomes. Let v i for i = 1,2,...,n be a set of n five-dimensional measured pitch vectors with outcome R j. Kernel methods [36] which are also known as Parzen-Rosenblatt [31] [35] window methods are widely used for nonparametric density estimation. A kernel density estimate for p(v R j ) is given by p(v R j ) = 1 n K(v v i ) (2) n i=1 where K( ) is a kernel probability density function that is typically unimodal and centered at zero. A standard kernel for approximating a d dimensional density is the zero-mean Gaussian K(v) = [ 1 (2π) d/2 Σ exp 1 ] 1/2 2 vt Σ 1 v where Σ is the d d covariance matrix. For this kernel, p(v R j ) at any v is the average of a sum of Gaussians centered at the sample points v i and the covariance matrix Σ determines the amount and orientation of the smoothing. Σ is often chosen to be the product of a scalar and an identity matrix which results in equal smoothing in every direction. To recover a more accurate approximation p(v R j ) the covariance matrix should allow different amounts of smoothing in different directions. We enable this goal while also reducing the number of unknown parameters by adopting a diagonal model for Σ with variance elements (σ 2 s,σ2 b x,σ 2 b z,σ 2 l x,σ 2 l z ). For our five-dimensional data, this allows K(v) to be written as a product of five one-dimensional Gaussians [ 1 K(v) = exp 1 ( )] s 2 + b2 x + b2 z + l2 x + l2 z (4) (2π) 5/2 σ s σ bx σ bz σ lx σ lz 2 σs 2 σb 2 x σb 2 z σl 2 x σl 2 z which depends on the five unknown bandwidth parameters σ s,σ bx,σ bz,σ lx, and σ lz. Optimal bandwidth parameters are learned using the process described in the next section. (3) 5

6 2.4 Bandwidth Selection for Kernel Density Estimation The accuracy of the kernel density estimate p(v R j ) is highly dependent on the choice of the bandwidth vector σ = (σ s,σ bx,σ bz,σ lx,σ lz ) [9]. The recovered p(v R j ) will be spiky for small values of the parameters and, in the limit, will tend to a sum of Dirac delta functions centered at the v i data points as the bandwidths approach zero. Large bandwidths, on the other hand, can induce excessive smoothing which causes the loss of important structure in the estimate of p(v R j ). A number of bandwidth selection techniques have been proposed and a recent survey of methods and software is given in [16]. Many of these techniques are basedonmaximumlikelihoodestimatesforp(v R j )whichselect σ sothat p(v R j )maximizes the likelihood of the observed v i data samples. Applying these techniques to the full set of observed data, however, yields a maximum at σ = (0,0,0) which corresponds to the sum of delta functions result. To avoid this difficulty, maximum likelihood methods for bandwidth selection have been developed that are based on leave-one-out cross-validation [36]. The computational demands of leave-one-out cross-validation techniques are excessive for our data set. Therefore, we have adopted a cross-validation method which requires less computation. From the set of n observed v i vectors for outcome R j, we generate M disjoint subsets S k of fixed size n v to be used for validation. For each validation set S k, we construct the estimate p(v R j ) using the n n v vectors that are not in S k as a function of the bandwidth vector σ = (σ s,σ bx,σ bz,σ lx,σ lz ). The optimal bandwidth vector σ k = (σ sk,σ b xk,σ b zk,σ l xk,σ l zk ) for S k is the choice that maximizes the pseudolikelihood [10] [16] according to σ k = argmax σ v i S k p(v i R j ) (5) wheretheproductisover then v vectorsinthevalidationsets k.theoverall optimizedbandwidth vector σ (j) = (σ s(j),σ b x (j),σ b z (j),σ l x (j),σ l z (j)) for the p(v R j ) density estimate is obtained by averaging the M vectors σ k. 2.5 Estimating the Posterior Probability An estimate for P(R j v) can be derived from estimates of the quantities on the right side of equation (1). The density estimate p(v R j ) for each p(v R j ) is obtained using the kernel 6

7 method defined by equations (2) and (4). Each prior probability P(R j ) is estimated as the fraction P(R j ) of the pitches in the full data set with outcome R j. The density p(v) is estimated using 5 p(v) = p(v R j ) P(R j ) (6) j=0 where the sum is over the six possible outcomes given in section 2.2. The estimate for P(R j v) is then constructed by combining the estimates for p(v R j ),P(R j ), and p(v) according to (1). 2.6 Intrinsic Values In this section we present a method to compute the intrinsic value of a pitch. Define the context during a plate appearance (PA) as the platoon configuration and the number of balls b and strikes s on the batter. The context value is the league average woba [37] for PAs completed after the context. Let the pre-pitch value be the context value before a pitch. A pitch either completes a PA giving a post-pitch value defined by the woba coefficient for the PA result or the pitch causes a transition to a new context whose value defines the post-pitch value. The observed value O of a pitch is the difference between the post-pitch value and the pre-pitch value. This approach is used [2] to compute pitch values based on observed outcomes. Statistics that are based on observed pitch values depend on factors such as the catcher, the umpire, the defense, and the ballpark and have been shown to have a low degree of repeatability [33]. The posterior probabilities P(R j v) can be used to define pitch intrinsic values. Let weight w j denotethepost-pitchvalueminus thepre-pitch valueforapitch withoutcomer j. The weights w 1,w 2,w 3,w 4, and w 5 depend on the count (b,s) and the platoon configuration. Since batted balls can take a range of values, the weight w 0 also depends on the vector v of pitch parameters. We describe a method for estimating the batted ball weight function w 0 (b,s,v) in section 2.7. We define the intrinsic value of a pitch for a platoon configuration as 5 I(b,s,v) = w 0 (b,s,v)p(r 0 v)+ w j (b,s)p(r j v) (7) j=1 7

8 which measures pitch value as a function of the physical pitch parameters, the count, and the platoon configuration. Positive values of I(b, s, v) favor the batter while negative values favor the pitcher. I(b,s,v) is the expected value of a pitch with parameter vector v on count (b,s) for a given platoon configuration and is not dependent on factors such as the catcher, umpire, defense, or ballpark associated with the pitch. 2.7 Estimating the Batted Ball Weight Function The batted ball weight function w 0 (b,s,v) for a platoon configuration is estimated using nonparametric regression [5]. Let v i for i = 1,2,...,n b be a set of n b five-dimensional pitch vectors on count (b,s) for a platoon configuration that result in a batted ball (R 0 ) outcome. For each v i, let y i be the expected woba for the batted ball minus the pre-pitch value for the pitch that resulted in the batted ball. The expected woba is computed using the woba cube method [17] [18] from the batted ball exit speed, vertical angle, and horizontal angle as measured by the HITf/x system. The estimated w 0 (b,s,v) function is given by nb ŵ 0 (b,s,v) = i=1k(v v i )y i nb (8) i=1k(v v i ) where K( ) is a kernel function. For this work, we use the Gaussian kernel specified by (4) with cross-validation used for bandwidth selection. The n b observed vectors v i with a batted ball outcome and the associated y i values are used to generate S k validation sets. In this case, the optimal bandwidth vector σk is the choice that minimizes the sum of the absolute errors σ k = argmin σ where the sum is over the vectors in the validation set. v i S k y i ŵ 0 (b,s,v i ) (9) 3 Visualizing Intrinsic Values Data acquired by Sportvision s PITCHf/x and HITf/x systems during every regular-season MLB game in 2014 were used for this study. Figures 2 through 9 demonstrate properties and implications of pitch intrinsic values. Since I is a function of the count and the five 8

9 pitch parameters, we can examine the variation of intrinsic values along various dimensions while keeping other variables fixed. Figure 2 displays pitch intrinsic values for an 0-0 count on the (l x,l z ) plane as viewed from the catcher s perspective for a pitch speed of s = 90 mph, horizontal movement b x = 3 inches, and vertical movement b z = 6 inches. Pitches with parameters near these values are typically classified as four-seam fastballs. We see that the locations with the smallest run value for these pitches are down-and-away within the strike zone. We also see that the locations with the largest run values are for pitches that are out of the strike zone which are often taken for balls. Figure 2: Pitch Intrinsic Value for 0-0 count, s = 90,b x = 3,b z = 6 Figures 3 through 6 illustrate how intrinsic values depend on the count. In these figures, we consider the pitch depicted in figure 2 for a vertical location near the center of the strike zone at l z = 2.5. Figure 3 plots the probability of a swing as a function of the horizontal 9

10 location l x for 0-0 and 0-1 counts. We see that the swing probability is higher for all values of l x on 0-1 counts and that swing probability tends to be highest for pitches near the center of the strike zone for both counts. From equation (7), a pitch intrinsic value is the sum of components associated with the six R j outcomes. Figure 4 plots the value of the ball in play I bip = w 0 (b,s,v)p(r 0 v) component as a function of l x for 0-0 and 0-1 counts. We see that I bip is largest for pitches in the middle/inside part of the strike zone for pitches with these parameters. I bip tends to be larger for an 0-1 count because there are more swings (figure 3) which increases P(R 0 v) and also because w 0 (b,s,v) tends to be larger for balls in play on 0-1 since the pre-pitch value is smaller for an 0-1 count. Figure 5 plots the I ball = w 1 (b,s)p(r 1 v) component as a function of l x for 0-0 and 0-1 counts. We see that as we move away from the middle of the zone I ball is larger for an 0-0 count because there are fewer swings (figure 3) on 0-0 which leads to larger values of the ball probability P(R 1 v) and because the value of a ball (w 1 (0,0) =.038) on 0-0 is larger than the value of a ball (w 1 (0,1) =.028) on 0-1 for this platoon configuration. Figure 6 plots pitch intrinsic value I for 0-0 and 0-1 counts. The shape of these curves is largely determined by the structure of the I bip and I ball functions plotted in figures 4 and 5. The pitch locations that minimize run value are near the edges of the strike zone whereas the pitch locations that maximize run value are near the middle/inside part of the strike zone or for pitches that are well outside the strike zone. The minima on both edges of the plate for the 0-1 count are more distant from the center of the zone than for the 0-0 count since the batter is more likely to swing at borderline pitches on an 0-1 count as shown in figure 3. Figure 7 plots the density of pitches thrown with the parameters considered in figure 2 as a function of l x for 0-0 and 0-1 counts. Since the batter is more likely to swing at a given pitch on 0-1 and the cost of a ball is less on 0-1, pitchers throw fewer pitches near the center of the plate and more pitches out of the zone on 0-1. Figure 8 illustrates the dependence of I on the pitch speed s for the previously considered movement parameters (b x = 3,b z = 6) with l z = 2.5 on an 0-0 count. We see that for pitches in the strike zone the pitcher benefits from increased velocity. There is an inversion region where increased velocity benefits the batter near the inner edge of the strike zone which is due to umpires calling a smaller zone for higher velocity pitches with these parameters near the inner edge of the plate. 10

11 Figure 9 shows the dependence of I on pitch vertical movement b z for s = 90 mph with l z = 2.5 and b x = 3 for an 0-0 count. We see that increasing vertical movement lowers the run value across the range of l x P(swing) Figure 3: Dependence of swing probability on l x for s = 90,l z = 2.5,b x = 3,b z = 6 l x 11

12 I bip Figure 4: Dependence of BIP value component on l x for s = 90,l z = 2.5,b x = 3,b z = 6 l x I ball Figure 5: Dependence of ball value component on l x for s = 90,l z = 2.5,b x = 3,b z = 6 l x 12

13 I Figure 6: Dependence of pitch intrinsic value I on l x for s = 90,l z = 2.5,b x = 3,b z = 6 l x p Figure 7: Dependence of pitch density on l x for s = 90,l z = 2.5,b x = 3,b z = 6 l x 13

14 mph 91 mph 92 mph 93 mph 0.02 I Figure 8: Dependence of I on s and l x for 0-0 count, l z = 2.5,b x = 3,b z = 6 l x inches 7 inches 8 inches 9 inches 0.02 I Figure 9: Dependence of I on b z and l x for 0-0 count, s = 90,l z = 2.5,b x = 3 l x 14

15 4 Reliability We use reliability estimates to demonstrate that the new intrinsic pitch values have a higher degree of repeatability than outcome-based pitch values. Reliability [40] is based on the premise that the measurement of an attribute is equal to true talent, which is the player s expected value for the measurement, plus random error. In the context of assessing a pitcher s skill level on pitches, the reliability of a measurement M over a sample of N pitches is defined by R(N) = σ2 t σ 2 o(n) where σ 2 t is the variance of true talent across pitchers for M and σ 2 o(n) is the variance of the observed values across pitchers for M as a function of the sample size. R(N) quantifies the degree to which the measurement is repeatable and, therefore, is inversely related to the amount of random error in the measurement. In the context of forecasting, reliability determines how much the observed value of a measurement should be regressed in the direction of the mean to estimate true talent [18]. Measurements with a higher reliability require less regression and provide more accurate forecasts [19]. Split-half methods are a popular way to estimate reliability. These methods partition a data set into two halves and compute the correlation of the player measurements across the halves. A limitation of using split-half methods is that the estimated R(N) can change depending on how the data is partitioned. An alternative approach is to compute Cronbach s alpha [8] which is anestimate of R(N) that isanapproximation tothe averageof all possible split-half correlations that would be computed from a full data set with 2N pitches for each player. We used Cronbach s alpha (α(n)) to estimate reliability for measurements defined by the average of the I pitch values and the average of the O pitch values over a set of N pitches. Figure 10 plots α(n) for these measurements for pitches thrown on an 0-0 count by a RHP to a RHB. The analysis considers the 116 pitchers who threw at least 200 tracked pitches in this configuration in For values of N ranging from 20 to 200 we computed α(n) for the I and O measurements using the first N of these pitches for each of the 116 pitchers. Figure11isthecorresponding plotforthe112rhpwhothrew atleast 200tracked (10) 15

16 I O 0.4 α Figure 10: α(n) reliability estimate for RHP vs. RHB, 0-0 count N I O 0.4 α Figure 11: α(n) reliability estimate for RHP vs. LHB, 0-0 count N 16

17 pitches on an 0-0 count to LHB. We see that the intrinsic pitch measurements have a significantly higher reliability than the observed pitch measurements. The intrinsic I measurement reaches 0.5 at 135 pitches for the configuration in figure 10 and at 65 pitches for the configuration in figure 11. The observed O measurement has relatively small values of α(n) and obtains negative values which can occur when α(n) is computed using small samples for measurements with low internal consistency. Table 1 summarizes the reliability estimates for configurations involving right-handed pitchers for the second pitch of a plate appearance. For these cases, we considered pitchers who threw at least 100 pitches per count within a platoon configuration in The third column of the table indicates the number of pitchers who satisfied this criterion. The last two columns of the table give α(n) for a sample size of N = 100 pitches for the I and O measurements. We see that in each case, the estimated reliability for the I measurement is between 0.40 and 0.45 and is significantly larger than the corresponding reliability for the O measurement. Table 1: α(n) for N = 100 for the I and O measurements Configuration count # pitchers α(100) for I α(100) for O RHP vs. RHB RHP vs. RHB RHP vs. LHB RHP vs. LHB Intrinsic Pitch Statistics In section 2.6 we described a method for computing the observed O and intrinsic I value of an individual pitch. In this section, we define statistics that summarize the observed and intrinsic value of the set of pitches thrown by a pitcher over a period of time. Consider a right-handed pitcher P who faces B R right-handed batters and throws n i pitches to the ith of these batters. Let O R (i,j) be the observed value of the jth pitch to the ith batter. For the plate appearance by the ith batter the sum of the observed pitch values 17

18 n i O R (i,j) (11) j=1 is equal to the woba coefficient for the outcome of the plate appearance minus the league average woba for the RHP vs RHB platoon configuration. We define pitcher P s observed pitch statistic O R versus RHB as the sum of the O R (i,j) over all batters and pitches thrown within the platoon configuration divided by the number of batters faced O R = 1 B R B R n i O R (i,j). (12) i=1 j=1 Thus, O R is equal to the woba allowed by pitcher P against RHB minus the league average woba for RHP vs RHB. We will also find it convenient to write O R = 1 B R c i N R (c i )O R (c i ) (13) where N R (c i ) is the number of pitches thrown by pitcher P in count c i to RHB and O R (c i ) is the average observed value of these pitches in count c i. The sum is over the twelve possible counts c i. If we repeat the process for left-handed batters (LHB) to obtain O L for pitcher P, we can define the overall observed pitch statistic O P for pitcher P as O P = B RO R +B L O L B R +B L (14) where B L is the number of LHB faced by pitcher P. O P can also be computed as the sum of the observed pitch values against all batters divided by the total number of batters faced. In a similar way, we can use intrinsic pitch values to compute the intrinsic pitch statistic I P for pitcher P. Unlike observed values, intrinsic values do not depend on factors such as the catcher, the umpire, the fielders, or the ballpark. Thus, intrinsic values should be more indicative of pitch quality than observed values. Let I R (c i ) be the average intrinsic value of pitches thrown by P to RHB on count c i. Instead of using the actual number N R (c i ) of pitches thrown in each count which depends on variables such as the catcher s framing ability, we will use the expected number of pitches thrown in each count. In sections 2.2, 2.3, and 2.5 we showed how to compute the probability P(R j v) of each outcome R j for a pitch with measured parameter vector v. Thus, for a given pitch, we can compute the 18

19 probability that the plate appearance ends on that pitch as well as the probability that the count transitions to a given new count or remains the same in the case of a two-strike foul ball. For each 0-0 pitch, for example, we can compute the probability that the plate appearance ends, the count moves to 1-0, or the count moves to 0-1. Considering all 0-0 pitches, we compute N R (1,0) and N R (0,1) which are the expected number of 1-0 and 0-1 pitches for the pitcher against RHB. Starting from N R (1,0) and N R (0,1) we continue the process to compute the expected number of pitches N R(c i ) for each count c i for pitcher P against RHB. Following (13), we then define the intrinsic pitch statistic for P against RHB as I R = 1 B R c i N R(c i )I R (c i ) (15) where I R (c i ) is the average intrinsic value of the pitches thrown by pitcher P to RHB on count c i. If we repeat the process for LHB to obtain I L, we can define the overall intrinsic pitch statistic I P for pitcher P as I P = B RI R +B L I L B R +B L. (16) Given only the 2014 data set, the five-dimensional space of v vectors is too sparse to compute accurate kernel density estimates for the LHP vs LHB configuration and for deep counts involving right-handed pitchers. Thus, Table 2 presents the RHP with the lowest I P for 2014 after restricting the analysis to the first two pitches of plate appearances (c i = {(0,0),(0,1),(1,0)}). We see that Phil Hughes easily posted the lowest I P which is not surprising given that he also enjoyed the highest strikeout-to-walk ratio ever recorded by a major league ERA qualifier. Table 3 presents the five pitchers with the smallest O P I P differences. These pitchers significantly outperformed the intrinsic value of their pitches. We see that each of these pitchers had a much higher ERA the following year except for Adam Wainwright who was limited to 28 innings in 2015 after suffering an injury in April. In addition, the 2014 ERA for each of these pitchers is the best of their career through 2016 except for Wainwright s small-sample ERA in

20 Table 2: RHP with the lowest I P over at least 400 batters faced, 2014 Pitcher I P 1000 Phil Hughes John Lackey Zack Greinke Jordan Zimmermann Anibal Sanchez Jacob degrom Colby Lewis Hisashi Iwakuma Bartolo Colon Adam Wainwright -9.8 Table 3: RHP with the lowest O P I P over at least 400 batters faced, 2014 Pitcher (O P I P ) ERA 2015 ERA David Buchanan Carlos Carrasco Edinson Volquez Felix Hernandez Adam Wainwright Future Work While we have shown that statistics based on pitch intrinsic values have a number of desirable properties, a pitcher s success depends on several additional factors. Pitch diversity affects performance as experiments have shown, for example, that contact rates degrade significantly as pitches are drawn from a wider range of speeds [14]. Other studies have shown that major league strikeout rates increase as a pitcher s number of distinct pitch types increases [3] and that pitchers who throw a high fraction of fastballs suffer a larger decline in performance when they face batters multiple times in a game [27]. Studies [4] [13] [15] [23] [34] have also shown that effective pitch sequencing can be used to obtain an advantage. Another important aspect of pitching is the use of a game plan that accounts for each batter s strengths/weaknesses and the computation of pitch intrinsic values is an important first step in the automated generation of matchup models [20] [21]. In addition to 20

21 a pitch s physical parameters, a pitcher s delivery can also affect results if, for example, he hides the ball well or inadvertently provides clues about the identity of an upcoming pitch. By accurately quantifying pitch intrinsic values, we have a framework that will enable the careful study of other factors that affect pitching success. Acknowledgment I am grateful to Sportvision and MLB Advanced Media for providing the HITf/x data which made this work possible. I also thank Tom Tango and Mitchel Lichtman for helpful comments on a previous draft of this manuscript. I am happy to acknowledge the assistance of Qi Shi in the preparation of this document. References [1] D. Allen. (Mar. 16, 2009). Run value by pitch location [Online]. Available: baseballanalysts.com/archives/2009/03/run value by pi.php. [2] D. Appelman. (May 20, 2009). Pitch type linear weights [Online]. Available: www. fangraphs.com/blogs/pitch-type-linear-weights. [3] R. Arthur. (Feb. 6, 2014). Entropy and the eephus [Online]. Available: [4] P. Bonney. (Mar. 6, 2015). Defining the pitch sequencing question [Online]. Available: [5] A. Bowman and A. Azzalini. Applied Smoothing Techniques for Data Analysis. Clarendon Press, Oxford, [6] C. Burley. (Oct. 15, 2004). The importance of strike one (and two, and three...), part 2 [Online]. Available: [7] R. Carleton. (Apr. 20, 2011). 525,600 minutes: how do you measure a player in a year? [Online]. Available: 21

22 [8] L. Cronbach. Coefficient alpha and the internal structure of tests. Psychometrika, 16(3): , [9] R. Duda, P. Hart, and D. Stork. Pattern Classification. Wiley-Interscience, New York, [10] R. Duin. On the choice of smoothing parameters for Parzen estimators of probability density functions. IEEE Transactions on Computers, C-25(11): , [11] M. Fast. What the heck is PITCHf/x? In J. Distelheim, B. Tsao, J. Oshan, C. Bolado, and B. Jacobs, editors, The Hardball Times Baseball Annual, 2010, pages The Hardball Times, [12] Fielding Independent Pitching (FIP) [Online]. Available: pitching/fip/. [13] C. Glaser. (Mar. 4, 2010). The influence of batters expectations on pitch perception [Online]. Available: [14] R. Gray. Behavior of college baseball players in a virtual batting task. Journal of Experimental Psychology: Human perception and performance, 28(5): , [15] J. Greenhouse. (May 20, 2010). Lidge s pitches [Online]. Available: baseballanalysts.com/ archives/2010/05/brad lidges out.php. [16] A.C. Guidoum. Kernel estimator and bandwidth selection for density and its derivatives. The kedd package, version 1.03, October [17] G. Healey. (Mar. 17, 2016). The intrinsic value of a batted ball [Online]. Available: [18] G. Healey. (Aug. 2, 2016). The reliability of intrinsic batted ball statistics [Online]. Available: [19] G. Healey. (July 26, 2016). The reliability of intrinsic batted ball statistics: Appendix[Online]. Available: vixra.org/pdf/ v1.pdf. [20] G. Healey. Modeling the probability of a strikeout for a batter/pitcher matchup. IEEE Transactions on Knowledge and Data Engineering, 27(9): , September

23 [21] G. Healey. Matchup models for the probability of a ground ball and a ground ball hit. Journal of Sports Analytics, September [22] G. Healey. Appendix to The Intrinsic Value of a Pitch, [23] G. Healey and S. Zhao. Using PITCHf/x to model the dependence of strikeout rate on the predictability of pitch sequences. Journal of Sports Analytics, [24] P. Jensen. (Jun. 30, 2009). Using HITf/x to measure skill [Online]. Available: [25] J. Judge, H. Pavlidis, and D. Turkenkopf. (Apr. 29, 2015). Introducing deserved run average DRA and all its friends [Online]. Available: [26] J. Keri. (Mar. 4, 2014). Q&A: MLB Advanced Media s Bob Bowman discusses revolutionary new play-tracking system [Online]. Available: grantland.com/the-triangle/mlb-advancedmedia-play-tracking-bob-bowman-interview. [27] M. Lichtman. (Nov. 15, 2013). Pitch types and the times through the order penalty [Online]. Available: [28] M. Marchi. (Dec. 4, 2009). Pitch run value and count [Online]. Available: [29] D. Meyer. (May 6, 2015). Dynamic run value of throwing a strike (instead of a ball) [Online]. Available: [30] A. Nathan. (Oct. 21, 2012). Determining pitch movement from PITCHf/x data [Online]. Available: baseball.physics.illinois.edu/movement.pdf. [31] E. Parzen. On estimation of a probability density function and mode. Annals of Mathematical Statistics, 33(3): , [32] H. Pavlidis and D. Brooks. (Mar. 3, 2014). Framing and blocking pitches: a regressed probabilistic model[online]. Available: [33] Pitch type linear weights [Online]. Available: linearweights. 23

24 [34] J. Roegele. (Nov. 24, 2014). The effects of pitch sequencing [Online]. Available: www. hardballtimes.com/the-effects-of-pitch-sequencing. [35] M. Rosenblatt. Remarks on some nonparametric estimates of a density function. Annals of Mathematical Statistics, 27(3): , [36] S. Sheather. Density estimation. Statistical Science, 19(4): , [37] T. Tango, M. Lichtman, and A. Dolphin. The Book: Playing the Percentages in Baseball. Potomac Books, Dulles, Virgina, [38] J. Thorn and P. Palmer. The Hidden Game of Baseball. Doubleday and Company, New York, [39] J. Walsh. (Feb. 26, 2008). Searching for the game s best pitch [Online]. Available: [40] R. Zeller and E. Carmines. Measurement in the Social Sciences: The Link Between Theory and Data. Cambridge University Press, Biography Glenn Healey is Professor of Electrical Engineering and Computer Science at the University of California, Irvine. Before joining UC Irvine, he worked at IBM Research. Dr. Healey received the B.S.E. in Computer Engineering from the University of Michigan and the M.S. in computer science, the M.S. in mathematics, and the Ph.D. in computer science from Stanford University. He is director of the Computer Vision Laboratory at UC Irvine. Dr. Healey has served on the editorial boards of IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE Transactions on Image Processing, and the Journal of the Optical Society of America A. He has been elected a Fellow of IEEE and SPIE. Dr. Healey has received several awards for outstanding teaching and research. 24

The Intrinsic Value of a Batted Ball Technical Details

The Intrinsic Value of a Batted Ball Technical Details The Intrinsic Value of a Batted Ball Technical Details Glenn Healey, EECS Department University of California, Irvine, CA 9617 Given a set of observed batted balls and their outcomes, we develop a method

More information

The Reliability of Intrinsic Batted Ball Statistics Appendix

The Reliability of Intrinsic Batted Ball Statistics Appendix The Reliability of ntrinsic Batted Ball Statistics Appendix Glenn Healey, EECS Department University of California, rvine, CA 92617 Given information about batted balls for a set of players, we review

More information

Simulating Major League Baseball Games

Simulating Major League Baseball Games ABSTRACT Paper 2875-2018 Simulating Major League Baseball Games Justin Long, Slippery Rock University; Brad Schweitzer, Slippery Rock University; Christy Crute Ph.D, Slippery Rock University The game of

More information

When you think of baseball, you think of a game that never changes, right? The

When you think of baseball, you think of a game that never changes, right? The The Strike Zone During the PITCHf/x Era by Jon Roegele When you think of baseball, you think of a game that never changes, right? The rules are the same as they were over 100 years ago, right? The bases

More information

PREDICTING the outcomes of sporting events

PREDICTING the outcomes of sporting events CS 229 FINAL PROJECT, AUTUMN 2014 1 Predicting National Basketball Association Winners Jasper Lin, Logan Short, and Vishnu Sundaresan Abstract We used National Basketball Associations box scores from 1991-1998

More information

Trouble With The Curve: Improving MLB Pitch Classification

Trouble With The Curve: Improving MLB Pitch Classification Trouble With The Curve: Improving MLB Pitch Classification Michael A. Pane Samuel L. Ventura Rebecca C. Steorts A.C. Thomas arxiv:134.1756v1 [stat.ap] 5 Apr 213 April 8, 213 Abstract The PITCHf/x database

More information

THE BOOK--Playing The Percentages In Baseball

THE BOOK--Playing The Percentages In Baseball So as a baseball flies towards home plate, the moment when it passes from central to peripheral vision could THE BOOK ARTICLES BLOG ABOUT US The Book Tom M. Tango, Mitc... Buy New $14.93 Privacy Information

More information

Evaluating and Classifying NBA Free Agents

Evaluating and Classifying NBA Free Agents Evaluating and Classifying NBA Free Agents Shanwei Yan In this project, I applied machine learning techniques to perform multiclass classification on free agents by using game statistics, which is useful

More information

INTRODUCTION TO PATTERN RECOGNITION

INTRODUCTION TO PATTERN RECOGNITION INTRODUCTION TO PATTERN RECOGNITION 3 Introduction Our ability to recognize a face, to understand spoken words, to read handwritten characters all these abilities belong to the complex processes of pattern

More information

Wind Regimes 1. 1 Wind Regimes

Wind Regimes 1. 1 Wind Regimes Wind Regimes 1 1 Wind Regimes The proper design of a wind turbine for a site requires an accurate characterization of the wind at the site where it will operate. This requires an understanding of the sources

More information

Clutch Hitters Revisited Pete Palmer and Dick Cramer National SABR Convention June 30, 2008

Clutch Hitters Revisited Pete Palmer and Dick Cramer National SABR Convention June 30, 2008 Clutch Hitters Revisited Pete Palmer and Dick Cramer National SABR Convention June 30, 2008 Do clutch hitters exist? More precisely, are there any batters whose performance in critical game situations

More information

Mixed Models as the Basis for Catcher Evaluation and Forecasting. Harry Pavlidis Baseball Prospectus & Pitch Info

Mixed Models as the Basis for Catcher Evaluation and Forecasting. Harry Pavlidis Baseball Prospectus & Pitch Info Mixed Models as the Basis for Catcher Evaluation and Forecasting Harry Pavlidis Baseball Prospectus & Pitch Info Introduction Primary Research Team Harry Pavlidis: Director of Technology, Baseball Prospectus

More information

Calculation of Trail Usage from Counter Data

Calculation of Trail Usage from Counter Data 1. Introduction 1 Calculation of Trail Usage from Counter Data 1/17/17 Stephen Martin, Ph.D. Automatic counters are used on trails to measure how many people are using the trail. A fundamental question

More information

Do Clutch Hitters Exist?

Do Clutch Hitters Exist? Do Clutch Hitters Exist? David Grabiner SABRBoston Presents Sabermetrics May 20, 2006 http://remarque.org/~grabiner/bosclutch.pdf (Includes some slides skipped in the original presentation) 1 Two possible

More information

Baseball Scouting Reports via a Marked Point Process for Pitch Types

Baseball Scouting Reports via a Marked Point Process for Pitch Types Baseball Scouting Reports via a Marked Point Process for Pitch Types Andrew Wilcox and Elizabeth Mannshardt Department of Statistics, North Carolina State University November 14, 2013 1 Introduction Statistics

More information

Effects of Incentives: Evidence from Major League Baseball. Guy Stevens April 27, 2013

Effects of Incentives: Evidence from Major League Baseball. Guy Stevens April 27, 2013 Effects of Incentives: Evidence from Major League Baseball Guy Stevens April 27, 2013 1 Contents 1 Introduction 2 2 Data 3 3 Models and Results 4 3.1 Total Offense................................... 4

More information

ROSE-HULMAN INSTITUTE OF TECHNOLOGY Department of Mechanical Engineering. Mini-project 3 Tennis ball launcher

ROSE-HULMAN INSTITUTE OF TECHNOLOGY Department of Mechanical Engineering. Mini-project 3 Tennis ball launcher Mini-project 3 Tennis ball launcher Mini-Project 3 requires you to use MATLAB to model the trajectory of a tennis ball being shot from a tennis ball launcher to a player. The tennis ball trajectory model

More information

ScienceDirect. Rebounding strategies in basketball

ScienceDirect. Rebounding strategies in basketball Available online at www.sciencedirect.com ScienceDirect Procedia Engineering 72 ( 2014 ) 823 828 The 2014 conference of the International Sports Engineering Association Rebounding strategies in basketball

More information

Richard S. Marken The RAND Corporation

Richard S. Marken The RAND Corporation Fielder s Choice: A Unified Theory of Catching Fly Balls Richard S. Marken The RAND Corporation February 11, 2002 Mailing Address: 1700 Main Street Santa Monica, CA 90407-2138 Phone: 310 393-0411 ext 7971

More information

Reports and Glossary of Terms

Reports and Glossary of Terms Reports and Glossary of Terms Glossary of Terms Terms are listed by the default order in which they appear upon exporting a.csv file from the TrackMan

More information

Major League Baseball Offensive Production in the Designated Hitter Era (1973 Present)

Major League Baseball Offensive Production in the Designated Hitter Era (1973 Present) Major League Baseball Offensive Production in the Designated Hitter Era (1973 Present) Jonathan Tung University of California, Riverside tung.jonathanee@gmail.com Abstract In Major League Baseball, there

More information

Bayesian Optimized Random Forest for Movement Classification with Smartphones

Bayesian Optimized Random Forest for Movement Classification with Smartphones Bayesian Optimized Random Forest for Movement Classification with Smartphones 1 2 3 4 Anonymous Author(s) Affiliation Address email 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29

More information

Procedia Engineering Procedia Engineering 2 (2010)

Procedia Engineering Procedia Engineering 2 (2010) Available online at www.sciencedirect.com Procedia Engineering Procedia Engineering 2 (2010) 002681 2686 (2009) 000 000 Procedia Engineering www.elsevier.com/locate/procedia 8 th Conference of the International

More information

Draft - 4/17/2004. A Batting Average: Does It Represent Ability or Luck?

Draft - 4/17/2004. A Batting Average: Does It Represent Ability or Luck? A Batting Average: Does It Represent Ability or Luck? Jim Albert Department of Mathematics and Statistics Bowling Green State University albert@bgnet.bgsu.edu ABSTRACT Recently Bickel and Stotz (2003)

More information

AggPro: The Aggregate Projection System

AggPro: The Aggregate Projection System Gore, Snapp and Highley AggPro: The Aggregate Projection System 1 AggPro: The Aggregate Projection System Ross J. Gore, Cameron T. Snapp and Timothy Highley Abstract Currently there exist many different

More information

Project 1 Those amazing Red Sox!

Project 1 Those amazing Red Sox! MASSACHVSETTS INSTITVTE OF TECHNOLOGY Department of Electrical Engineering and Computer Science 6.001 Structure and Interpretation of Computer Programs Spring Semester, 2005 Project 1 Those amazing Red

More information

Which On-Base Percentage Shows. the Highest True Ability of a. Baseball Player?

Which On-Base Percentage Shows. the Highest True Ability of a. Baseball Player? Which On-Base Percentage Shows the Highest True Ability of a Baseball Player? January 31, 2018 Abstract This paper looks at the true on-base ability of a baseball player given their on-base percentage.

More information

Uncovering the Mysteries of the Knuckleball

Uncovering the Mysteries of the Knuckleball Uncovering the Mysteries of the Knuckleball by Alan M. Nathan T he knuckleball is perhaps the most mysterious of baseball pitches. It is thrown slower than normal pitches, it is barely spinning, and it

More information

Building an NFL performance metric

Building an NFL performance metric Building an NFL performance metric Seonghyun Paik (spaik1@stanford.edu) December 16, 2016 I. Introduction In current pro sports, many statistical methods are applied to evaluate player s performance and

More information

Title: 4-Way-Stop Wait-Time Prediction Group members (1): David Held

Title: 4-Way-Stop Wait-Time Prediction Group members (1): David Held Title: 4-Way-Stop Wait-Time Prediction Group members (1): David Held As part of my research in Sebastian Thrun's autonomous driving team, my goal is to predict the wait-time for a car at a 4-way intersection.

More information

Honest Mirror: Quantitative Assessment of Player Performances in an ODI Cricket Match

Honest Mirror: Quantitative Assessment of Player Performances in an ODI Cricket Match Honest Mirror: Quantitative Assessment of Player Performances in an ODI Cricket Match Madan Gopal Jhawar 1 and Vikram Pudi 2 1 Microsoft, India majhawar@microsoft.com 2 IIIT Hyderabad, India vikram@iiit.ac.in

More information

Analysis of Shear Lag in Steel Angle Connectors

Analysis of Shear Lag in Steel Angle Connectors University of New Hampshire University of New Hampshire Scholars' Repository Honors Theses and Capstones Student Scholarship Spring 2013 Analysis of Shear Lag in Steel Angle Connectors Benjamin Sawyer

More information

A Game Theoretic Study of Attack and Defense in Cyber-Physical Systems

A Game Theoretic Study of Attack and Defense in Cyber-Physical Systems The First International Workshop on Cyber-Physical Networking Systems A Game Theoretic Study of Attack and Defense in Cyber-Physical Systems ChrisY.T.Ma Advanced Digital Sciences Center Illinois at Singapore

More information

Application of Bayesian Networks to Shopping Assistance

Application of Bayesian Networks to Shopping Assistance Application of Bayesian Networks to Shopping Assistance Yang Xiang, Chenwen Ye, and Deborah Ann Stacey University of Guelph, CANADA Abstract. We develop an on-line shopping assistant that can help a e-shopper

More information

A Novel Approach to Predicting the Results of NBA Matches

A Novel Approach to Predicting the Results of NBA Matches A Novel Approach to Predicting the Results of NBA Matches Omid Aryan Stanford University aryano@stanford.edu Ali Reza Sharafat Stanford University sharafat@stanford.edu Abstract The current paper presents

More information

Hitting with Runners in Scoring Position

Hitting with Runners in Scoring Position Hitting with Runners in Scoring Position Jim Albert Department of Mathematics and Statistics Bowling Green State University November 25, 2001 Abstract Sportscasters typically tell us about the batting

More information

Looking at Spacings to Assess Streakiness

Looking at Spacings to Assess Streakiness Looking at Spacings to Assess Streakiness Jim Albert Department of Mathematics and Statistics Bowling Green State University September 2013 September 2013 1 / 43 The Problem Collect hitting data for all

More information

#19 MONITORING AND PREDICTING PEDESTRIAN BEHAVIOR USING TRAFFIC CAMERAS

#19 MONITORING AND PREDICTING PEDESTRIAN BEHAVIOR USING TRAFFIC CAMERAS #19 MONITORING AND PREDICTING PEDESTRIAN BEHAVIOR USING TRAFFIC CAMERAS Final Research Report Luis E. Navarro-Serment, Ph.D. The Robotics Institute Carnegie Mellon University November 25, 2018. Disclaimer

More information

Pitching Performance and Age

Pitching Performance and Age Pitching Performance and Age By: Jaime Craig, Avery Heilbron, Kasey Kirschner, Luke Rector, Will Kunin Introduction April 13, 2016 Many of the oldest players and players with the most longevity of the

More information

Introduction to Pattern Recognition

Introduction to Pattern Recognition Introduction to Pattern Recognition Jason Corso SUNY at Buffalo 12 January 2009 J. Corso (SUNY at Buffalo) Introduction to Pattern Recognition 12 January 2009 1 / 28 Pattern Recognition By Example Example:

More information

A Nomogram Of Performances In Endurance Running Based On Logarithmic Model Of Péronnet-Thibault

A Nomogram Of Performances In Endurance Running Based On Logarithmic Model Of Péronnet-Thibault American Journal of Engineering Research (AJER) e-issn: 2320-0847 p-issn : 2320-0936 Volume-6, Issue-9, pp-78-85 www.ajer.org Research Paper Open Access A Nomogram Of Performances In Endurance Running

More information

Siła-Nowicka, K. (2018) Analysis of Actual Versus Permitted Driving Speed: a Case Study from Glasgow, Scotland. In: 26th Annual GIScience Research UK Conference (GISRUK 2018), Leicester, UK, 17-20 Apr

More information

What New Technologies Are Teaching Us About the Game of Baseball

What New Technologies Are Teaching Us About the Game of Baseball What New Technologies Are Teaching Us About the Game of Baseball Alan M. Nathan University of Illinois at Urbana-Champaign a-nathan@illinois.edu October 15, 2012 Abstract The trajectory of a baseball moving

More information

Using Multi-Class Classification Methods to Predict Baseball Pitch Types

Using Multi-Class Classification Methods to Predict Baseball Pitch Types Using Multi-Class Classification Methods to Predict Baseball Pitch Types GLENN SIDLE, HIEN TRAN Department of Applied Mathematics North Carolina State University 2108 SAS Hall, Box 8205 Raleigh, NC 27695

More information

y ) s x x )(y i (x i r = 1 n 1 s y Statistics Lecture 7 Exploring Data , y 2 ,y n (x 1 ),,(x n ),(x 2 ,y 1 How two variables vary together

y ) s x x )(y i (x i r = 1 n 1 s y Statistics Lecture 7 Exploring Data , y 2 ,y n (x 1 ),,(x n ),(x 2 ,y 1 How two variables vary together Statistics 111 - Lecture 7 Exploring Data Numerical Summaries for Relationships between Variables Administrative Notes Homework 1 due in recitation: Friday, Feb. 5 Homework 2 now posted on course website:

More information

Biomechanics and Models of Locomotion

Biomechanics and Models of Locomotion Physics-Based Models for People Tracking: Biomechanics and Models of Locomotion Marcus Brubaker 1 Leonid Sigal 1,2 David J Fleet 1 1 University of Toronto 2 Disney Research, Pittsburgh Biomechanics Biomechanics

More information

A Chiller Control Algorithm for Multiple Variablespeed Centrifugal Compressors

A Chiller Control Algorithm for Multiple Variablespeed Centrifugal Compressors Purdue University Purdue e-pubs International Compressor Engineering Conference School of Mechanical Engineering 2014 A Chiller Control Algorithm for Multiple Variablespeed Centrifugal Compressors Piero

More information

Estimation and Analysis of Fish Catches by Category Based on Multidimensional Time Series Database on Sea Fishery in Greece

Estimation and Analysis of Fish Catches by Category Based on Multidimensional Time Series Database on Sea Fishery in Greece Estimation and Analysis of Fish Catches by Category Based on Multidimensional Time Series Database on Sea Fishery in Greece Georgios Tegos 1, Kolyo Onkov 2, Diana Stoyanova 2 1 Department of Accounting

More information

SHOT ON GOAL. Name: Football scoring a goal and trigonometry Ian Edwards Luther College Teachers Teaching with Technology

SHOT ON GOAL. Name: Football scoring a goal and trigonometry Ian Edwards Luther College Teachers Teaching with Technology SHOT ON GOAL Name: Football scoring a goal and trigonometry 2006 Ian Edwards Luther College Teachers Teaching with Technology Shot on Goal Trigonometry page 2 THE TASKS You are an assistant coach with

More information

Average Runs per inning,

Average Runs per inning, Home Team Scoring Advantage in the First Inning Largely Due to Time By David W. Smith Presented June 26, 2015 SABR45, Chicago, Illinois Throughout baseball history, the home team has scored significantly

More information

CALIBRATION OF THE PLATOON DISPERSION MODEL BY CONSIDERING THE IMPACT OF THE PERCENTAGE OF BUSES AT SIGNALIZED INTERSECTIONS

CALIBRATION OF THE PLATOON DISPERSION MODEL BY CONSIDERING THE IMPACT OF THE PERCENTAGE OF BUSES AT SIGNALIZED INTERSECTIONS CALIBRATION OF THE PLATOON DISPERSION MODEL BY CONSIDERING THE IMPACT OF THE PERCENTAGE OF BUSES AT SIGNALIZED INTERSECTIONS By Youan Wang, Graduate Research Assistant MOE Key Laboratory for Urban Transportation

More information

Competitive Performance of Elite Olympic-Distance Triathletes: Reliability and Smallest Worthwhile Enhancement

Competitive Performance of Elite Olympic-Distance Triathletes: Reliability and Smallest Worthwhile Enhancement SPORTSCIENCE sportsci.org Original Research / Performance Competitive Performance of Elite Olympic-Distance Triathletes: Reliability and Smallest Worthwhile Enhancement Carl D Paton, Will G Hopkins Sportscience

More information

Author s Name Name of the Paper Session. Positioning Committee. Marine Technology Society. DYNAMIC POSITIONING CONFERENCE September 18-19, 2001

Author s Name Name of the Paper Session. Positioning Committee. Marine Technology Society. DYNAMIC POSITIONING CONFERENCE September 18-19, 2001 Author s Name Name of the Paper Session PDynamic Positioning Committee Marine Technology Society DYNAMIC POSITIONING CONFERENCE September 18-19, 2001 POWER PLANT SESSION A New Concept for Fuel Tight DP

More information

Using Spatio-Temporal Data To Create A Shot Probability Model

Using Spatio-Temporal Data To Create A Shot Probability Model Using Spatio-Temporal Data To Create A Shot Probability Model Eli Shayer, Ankit Goyal, Younes Bensouda Mourri June 2, 2016 1 Introduction Basketball is an invasion sport, which means that players move

More information

ScienceDirect. Relating baseball seam height to carry distance

ScienceDirect. Relating baseball seam height to carry distance Available online at www.sciencedirect.com ScienceDirect Procedia Engineering 112 (2015 ) 406 411 7th Asia-Pacific Congress on Sports Technology, APCST 2015 Relating baseball seam height to carry distance

More information

Legendre et al Appendices and Supplements, p. 1

Legendre et al Appendices and Supplements, p. 1 Legendre et al. 2010 Appendices and Supplements, p. 1 Appendices and Supplement to: Legendre, P., M. De Cáceres, and D. Borcard. 2010. Community surveys through space and time: testing the space-time interaction

More information

The below identified patent application is available for licensing. Requests for information should be addressed to:

The below identified patent application is available for licensing. Requests for information should be addressed to: DEPARTMENT OF THE NAVY OFFICE OF COUNSEL NAVAL UNDERSEA WARFARE CENTER DIVISION 1176 HOWELL STREET NEWPORT Rl 02841-1708 IN REPLY REFER TO Attorney Docket No. 300170 20 March 2018 The below identified

More information

Minimum Mean-Square Error (MMSE) and Linear MMSE (LMMSE) Estimation

Minimum Mean-Square Error (MMSE) and Linear MMSE (LMMSE) Estimation Minimum Mean-Square Error (MMSE) and Linear MMSE (LMMSE) Estimation Outline: MMSE estimation, Linear MMSE (LMMSE) estimation, Geometric formulation of LMMSE estimation and orthogonality principle. Reading:

More information

Machine Learning an American Pastime

Machine Learning an American Pastime Nikhil Bhargava, Andy Fang, Peter Tseng CS 229 Paper Machine Learning an American Pastime I. Introduction Baseball has been a popular American sport that has steadily gained worldwide appreciation in the

More information

THEORY OF TRAINING, THEORETICAL CONSIDERATIONS WOMEN S RACE WALKING

THEORY OF TRAINING, THEORETICAL CONSIDERATIONS WOMEN S RACE WALKING THEORY OF TRAINING, THEORETICAL CONSIDERATIONS WOMEN S RACE WALKING Prof.Corina ȚIFREA Ph.D General characteristics of race walking Sport performance is multiply determined, but we can t definitely settle

More information

Pitching Performance and Age

Pitching Performance and Age Pitching Performance and Age Jaime Craig, Avery Heilbron, Kasey Kirschner, Luke Rector and Will Kunin Introduction April 13, 2016 Many of the oldest and most long- term players of the game are pitchers.

More information

Special Topics: Data Science

Special Topics: Data Science Special Topics: Data Science L Linear Methods for Prediction Dr. Vidhyasaharan Sethu School of Electrical Engineering & Telecommunications University of New South Wales Sydney, Australia V. Sethu 1 Topics

More information

Modeling Pitch Trajectories in Fastpitch Softball

Modeling Pitch Trajectories in Fastpitch Softball Noname manuscript No. (will be inserted by the editor) Modeling Pitch Trajectories in Fastpitch Softball Jean M. Clark Meredith L. Greer Mark D. Semon Received: date / Accepted: date Abstract The fourth-order

More information

The Vertical Illusions of Batters

The Vertical Illusions of Batters From The Baseball Research Journal, 32: 26-30 The Vertical Illusions of Batters Terry Bahill 1 and David G. Baldwin 2 1 Systems and Industrial Engineering, University of Arizona, Tucson, AZ 85721-0020,

More information

Initial Velocity (IV) Test Replacement

Initial Velocity (IV) Test Replacement October 18, 2017 Initial Velocity (IV) Test Replacement 1 Rationale of Equipment and Method Change The Rules of Golf Appendix III, Rule 5, refers to the Initial Velocity Standard ( IV is tested according

More information

A point-based Bayesian hierarchical model to predict the outcome of tennis matches

A point-based Bayesian hierarchical model to predict the outcome of tennis matches A point-based Bayesian hierarchical model to predict the outcome of tennis matches Martin Ingram, Silverpond September 21, 2017 Introduction Predicting tennis matches is of interest for a number of applications:

More information

EEC 686/785 Modeling & Performance Evaluation of Computer Systems. Lecture 6. Wenbing Zhao. Department of Electrical and Computer Engineering

EEC 686/785 Modeling & Performance Evaluation of Computer Systems. Lecture 6. Wenbing Zhao. Department of Electrical and Computer Engineering EEC 686/785 Modeling & Performance Evaluation of Computer Systems Lecture 6 Department of Electrical and Computer Engineering Cleveland State University wenbing@ieee.org Outline 2 Review of lecture 5 The

More information

A Bayesian Hierarchical Model of Pitch Framing in Major League Baseball

A Bayesian Hierarchical Model of Pitch Framing in Major League Baseball A Bayesian Hierarchical Model of Pitch Framing in Major League Baseball Sameer K. Deshpande and Abraham J. Wyner Statistics Department, The Wharton School University of Pennsylvania 1 August 2016 Introduction

More information

Using Markov Chains to Analyze a Volleyball Rally

Using Markov Chains to Analyze a Volleyball Rally 1 Introduction Using Markov Chains to Analyze a Volleyball Rally Spencer Best Carthage College sbest@carthage.edu November 3, 212 Abstract We examine a volleyball rally between two volleyball teams. Using

More information

By Shane T. Jensen, Kenneth E. Shirley and Abraham J. Wyner. University of Pennsylvania

By Shane T. Jensen, Kenneth E. Shirley and Abraham J. Wyner. University of Pennsylvania The Annals of Applied Statistics 2009, Vol. 3, No. 2, 491 520 DOI: 10.1214/08-AOAS228 c Institute of Mathematical Statistics, 2009 BAYESBALL: A BAYESIAN HIERARCHICAL MODEL FOR EVALUATING FIELDING IN MAJOR

More information

Outline. Terminology. EEC 686/785 Modeling & Performance Evaluation of Computer Systems. Lecture 6. Steps in Capacity Planning and Management

Outline. Terminology. EEC 686/785 Modeling & Performance Evaluation of Computer Systems. Lecture 6. Steps in Capacity Planning and Management EEC 686/785 Modeling & Performance Evaluation of Computer Systems Lecture 6 Department of Electrical and Computer Engineering Cleveland State University wenbing@ieee.org Outline Review of lecture 5 The

More information

Finding your feet: modelling the batting abilities of cricketers using Gaussian processes

Finding your feet: modelling the batting abilities of cricketers using Gaussian processes Finding your feet: modelling the batting abilities of cricketers using Gaussian processes Oliver Stevenson & Brendon Brewer PhD candidate, Department of Statistics, University of Auckland o.stevenson@auckland.ac.nz

More information

The 2015 MLB Season in Review. Using Pitch Quantification and the QOP 1 Metric

The 2015 MLB Season in Review. Using Pitch Quantification and the QOP 1 Metric 1 The 2015 MLB Season in Review Using Pitch Quantification and the QOP 1 Metric Dr. Jason Wilson 2 and Wayne Greiner 3 1. Introduction The purpose of this paper is to document new research in the field

More information

Golf s Modern Ball Impact and Flight Model

Golf s Modern Ball Impact and Flight Model Todd M. Kos 2/28/201 /2018 This document reviews the current ball flight laws of golf and discusses an updated model made possible by advances in science and technology. It also looks into the nature of

More information

The Effect of Driver Mass and Shaft Length on Initial Golf Ball Launch Conditions: A Designed Experimental Study

The Effect of Driver Mass and Shaft Length on Initial Golf Ball Launch Conditions: A Designed Experimental Study Available online at www.sciencedirect.com Procedia Engineering 34 (2012 ) 379 384 9 th Conference of the International Sports Engineering Association (ISEA) The Effect of Driver Mass and Shaft Length on

More information

An Application of Signal Detection Theory for Understanding Driver Behavior at Highway-Rail Grade Crossings

An Application of Signal Detection Theory for Understanding Driver Behavior at Highway-Rail Grade Crossings An Application of Signal Detection Theory for Understanding Driver Behavior at Highway-Rail Grade Crossings Michelle Yeh and Jordan Multer United States Department of Transportation Volpe National Transportation

More information

The pth percentile of a distribution is the value with p percent of the observations less than it.

The pth percentile of a distribution is the value with p percent of the observations less than it. Describing Location in a Distribution (2.1) Measuring Position: Percentiles One way to describe the location of a value in a distribution is to tell what percent of observations are less than it. De#inition:

More information

Bayesball: A Bayesian Hierarchical Model for Evaluating Fielding in Major League Baseball

Bayesball: A Bayesian Hierarchical Model for Evaluating Fielding in Major League Baseball University of Pennsylvania ScholarlyCommons Statistics Papers Wharton Faculty Research 2009 Bayesball: A Bayesian Hierarchical Model for Evaluating Fielding in Major League Baseball Shane T. Jensen University

More information

QUALITYGOLFSTATS, LLC. Golf s Modern Ball Flight Laws

QUALITYGOLFSTATS, LLC. Golf s Modern Ball Flight Laws QUALITYGOLFSTATS, LLC Golf s Modern Ball Flight Laws Todd M. Kos 5/29/2017 /2017 This document reviews the current ball flight laws of golf and discusses an updated and modern version derived from new

More information

Table 1. Average runs in each inning for home and road teams,

Table 1. Average runs in each inning for home and road teams, Effect of Batting Order (not Lineup) on Scoring By David W. Smith Presented July 1, 2006 at SABR36, Seattle, Washington The study I am presenting today is an outgrowth of my presentation in Cincinnati

More information

Nature Neuroscience: doi: /nn Supplementary Figure 1. Visual responses of the recorded LPTCs

Nature Neuroscience: doi: /nn Supplementary Figure 1. Visual responses of the recorded LPTCs Supplementary Figure 1 Visual responses of the recorded LPTCs (a) The mean±sd (n=3 trials) of the direction-selective (DS) responses (i.e., subtracting the null direction, ND, from the preferred direction,

More information

Relative Vulnerability Matrix for Evaluating Multimodal Traffic Safety. O. Grembek 1

Relative Vulnerability Matrix for Evaluating Multimodal Traffic Safety. O. Grembek 1 337 Relative Vulnerability Matrix for Evaluating Multimodal Traffic Safety O. Grembek 1 1 Safe Transportation Research and Education Center, Institute of Transportation Studies, University of California,

More information

CS 221 PROJECT FINAL

CS 221 PROJECT FINAL CS 221 PROJECT FINAL STUART SY AND YUSHI HOMMA 1. INTRODUCTION OF TASK ESPN fantasy baseball is a common pastime for many Americans, which, coincidentally, defines a problem whose solution could potentially

More information

Variable Face Milling to Normalize Putter Ball Speed and Maximize Forgiveness

Variable Face Milling to Normalize Putter Ball Speed and Maximize Forgiveness Proceedings Variable Face Milling to Normalize Putter Ball Speed and Maximize Forgiveness Jacob Lambeth *, Dustin Brekke and Jeff Brunski Cleveland Golf, 5601 Skylab Rd. Huntington Beach, CA 92647, USA;

More information

Queue analysis for the toll station of the Öresund fixed link. Pontus Matstoms *

Queue analysis for the toll station of the Öresund fixed link. Pontus Matstoms * Queue analysis for the toll station of the Öresund fixed link Pontus Matstoms * Abstract A new simulation model for queue and capacity analysis of a toll station is presented. The model and its software

More information

An Engineering Approach to Precision Ammunition Development. Justin Pierce Design Engineer Government and International Contracts ATK Sporting Group

An Engineering Approach to Precision Ammunition Development. Justin Pierce Design Engineer Government and International Contracts ATK Sporting Group An Engineering Approach to Precision Ammunition Development Justin Pierce Design Engineer Government and International Contracts ATK Sporting Group 1 Background Federal Premium extensive experience with

More information

DECISION MODELING AND APPLICATIONS TO MAJOR LEAGUE BASEBALL PITCHER SUBSTITUTION

DECISION MODELING AND APPLICATIONS TO MAJOR LEAGUE BASEBALL PITCHER SUBSTITUTION DECISION MODELING AND APPLICATIONS TO MAJOR LEAGUE BASEBALL PITCHER SUBSTITUTION Natalie M. Scala, M.S., University of Pittsburgh Abstract Relief pitcher substitution is an integral part of a Major League

More information

B. AA228/CS238 Component

B. AA228/CS238 Component Abstract Two supervised learning methods, one employing logistic classification and another employing an artificial neural network, are used to predict the outcome of baseball postseason series, given

More information

An average pitcher's PG = 50. Higher numbers are worse, and lower are better. Great seasons will have negative PG ratings.

An average pitcher's PG = 50. Higher numbers are worse, and lower are better. Great seasons will have negative PG ratings. Fastball 1-2-3! This simple game gives quick results on the outcome of a baseball game in under 5 minutes. You roll 3 ten-sided dice (10d) of different colors. If the die has a 10 on it, count it as 0.

More information

Chapter 12 Practice Test

Chapter 12 Practice Test Chapter 12 Practice Test 1. Which of the following is not one of the conditions that must be satisfied in order to perform inference about the slope of a least-squares regression line? (a) For each value

More information

JPEG-Compatibility Steganalysis Using Block-Histogram of Recompression Artifacts

JPEG-Compatibility Steganalysis Using Block-Histogram of Recompression Artifacts JPEG-Compatibility Steganalysis Using Block-Histogram of Recompression Artifacts Jan Kodovský, Jessica Fridrich May 16, 2012 / IH Conference 1 / 19 What is JPEG-compatibility steganalysis? Detects embedding

More information

Safety Systems Based on ND Pressure Monitoring

Safety Systems Based on ND Pressure Monitoring ECNDT 2006 - oster 65 Safety Systems Based on ND ressure Monitoring Davor ZVIZDIC; Lovorka GRGEC BERMANEC, FSB-LM, Zagreb, Croatia Abstract. This paper investigates the scope of ND pressure monitoring

More information

IEEE RAS Micro/Nano Robotics & Automation (MNRA) Technical Committee Mobile Microrobotics Challenge 2016

IEEE RAS Micro/Nano Robotics & Automation (MNRA) Technical Committee Mobile Microrobotics Challenge 2016 IEEE RAS Micro/Nano Robotics & Automation (MNRA) Technical Committee Mobile Microrobotics Challenge 2016 OFFICIAL RULES Version 2.0 December 15, 2015 1. THE EVENTS The IEEE Robotics & Automation Society

More information

Contingent Valuation Methods

Contingent Valuation Methods ECNS 432 Ch 15 Contingent Valuation Methods General approach to all CV methods 1 st : Identify sample of respondents from the population w/ standing 2 nd : Respondents are asked questions about their valuations

More information

Baseball Scorekeeping for First Timers

Baseball Scorekeeping for First Timers Baseball Scorekeeping for First Timers Thanks for keeping score! This series of pages attempts to make keeping the book for a RoadRunner Little League game easy. We ve tried to be comprehensive while also

More information

When Should Bonds be Walked Intentionally?

When Should Bonds be Walked Intentionally? When Should Bonds be Walked Intentionally? Mark Pankin SABR 33 July 10, 2003 Denver, CO Notes provide additional information and were reminders to me for making the presentation. They are not supposed

More information

2014 National Baseball Arbitration Competition

2014 National Baseball Arbitration Competition 2014 National Baseball Arbitration Competition Jeff Samardzija v. Chicago Cubs Submission on Behalf of Chicago Cubs Midpoint: $4.9 million Submission by: Team 26 Table of Contents I. Introduction and Request

More information

Fastball Baseball Manager 2.5 for Joomla 2.5x

Fastball Baseball Manager 2.5 for Joomla 2.5x Fastball Baseball Manager 2.5 for Joomla 2.5x Contents Requirements... 1 IMPORTANT NOTES ON UPGRADING... 1 Important Notes on Upgrading from Fastball 1.7... 1 Important Notes on Migrating from Joomla 1.5x

More information

WIND DATA REPORT. Paxton, MA

WIND DATA REPORT. Paxton, MA WIND DATA REPORT Paxton, MA July 1, 2011 September 30, 2011 Prepared for Massachusetts Clean Energy Center 55 Summer Street, 9th Floor Boston, MA 02110 by Eric Morgan James F. Manwell Anthony F. Ellis

More information

1999 On-Board Sacramento Regional Transit District Survey

1999 On-Board Sacramento Regional Transit District Survey SACOG-00-009 1999 On-Board Sacramento Regional Transit District Survey June 2000 Sacramento Area Council of Governments 1999 On-Board Sacramento Regional Transit District Survey June 2000 Table of Contents

More information