The Intrinsic Value of a Batted Ball Technical Details

Glenn Healey, EECS Department
University of California, Irvine, CA 92617

Given a set of observed batted balls and their outcomes, we develop a method for learning the dependence of a batted ball's intrinsic value on its measured initial speed $s$, vertical angle $v$, and horizontal angle $h$.

1 HITf/x Data

The HITf/x data used for this study was provided by SportVision and includes measurements from every regular-season MLB game during 2014. We consider all balls in play with a horizontal angle in fair territory ($h \in [-45^\circ, 45^\circ]$) that were tracked by the system, excluding bunts. This results in a set of 124364 batted balls; the distributions for $s$, $v$, and $h$ are shown in Figures 1 and 2. The peak of the speed distribution is near 93 mph, and the peaks of the vertical and horizontal angle distributions are near zero. Since HITf/x tracks batted balls over a portion of their trajectory that occurs after the ball has slowed due to air drag and gravity, the estimated speeds are a few miles per hour less than the speeds recorded by other systems. Since this effect is systematic, these offsets will not have a significant impact on the batted ball statistics computed in this work.

2 Learning Algorithm

2.1 Bayesian Foundation

Using Bayes' theorem, the probability of a batted ball outcome $R_j$ given a measured vector $x = (s, v, h)$ is given by

$$P(R_j \mid x) = \frac{p(x \mid R_j)\, P(R_j)}{p(x)} \tag{1}$$

[Figure 1: Distribution $p(s)$ of initial speeds $s$ (mph) for batted balls in 2014]

[Figure 2: Distribution of vertical and horizontal angles (degrees) for batted balls in 2014]

where $p(x \mid R_j)$ is the conditional probability density function for $x$ given outcome $R_j$, $P(R_j)$ is the prior probability of outcome $R_j$, and $p(x)$ is the probability density function for $x$. Linear combinations of the $P(R_j \mid x)$ probabilities for different outcomes can be used to model the expected value of statistics such as batting average, wOBA, and slugging percentage as a function of the batted ball vector $x$. For a given batted ball, therefore, these statistics provide a measure of value that is separate from the batted ball's particular outcome.

2.2 Kernel Density Estimation

The goal of density estimation for this application is to recover the underlying probability density functions $p(x \mid R_j)$ and $p(x)$ in equation (1) from the set of observed batted ball vectors and their outcomes. Given the typical positioning of defenders on a baseball field and the various ways that an outcome such as a single can occur, we expect a conditional density $p(x \mid R_j)$ to have a complicated multimodal structure. Thus, we use a nonparametric technique for density estimation.

We first consider the task of estimating $p(x)$. Let $x_i = (s_i, v_i, h_i)$ for $i = 1, 2, \ldots, n$ be the set of $n$ observed batted ball vectors. Kernel methods [6], which are also known as Parzen-Rosenblatt [4] [5] window methods, are widely used for nonparametric density estimation. A kernel density estimate for $p(x)$ is given by

$$\hat{p}(x) = \frac{1}{n} \sum_{i=1}^{n} K(x - x_i) \tag{2}$$

where $K(\cdot)$ is a kernel probability density function that is typically unimodal and centered at zero. A standard kernel for approximating a $d$-dimensional density is the zero-mean Gaussian

$$K(x) = \frac{1}{(2\pi)^{d/2}\, |\Sigma|^{1/2}} \exp\left[-\frac{1}{2}\, x^T \Sigma^{-1} x\right] \tag{3}$$

where $\Sigma$ is the $d \times d$ covariance matrix. For this kernel, $\hat{p}(x)$ at any $x$ is the average of $n$ Gaussians centered at the sample points $x_i$, and the covariance matrix $\Sigma$ determines the amount and orientation of the smoothing. $\Sigma$ is often chosen to be the product of a scalar and an identity matrix, which results in equal smoothing in every direction.
However, we see from Figures 1 and 2 that the distribution for $v$ has detailed structure while the distributions for $s$ and $h$ are significantly smoother. Thus, to recover an accurate approximation $\hat{p}(x)$, the covariance matrix should allow different amounts of smoothing in different directions. We enable this while also reducing the number of unknown parameters by adopting a diagonal model for $\Sigma$ with variance elements $(\sigma_s^2, \sigma_v^2, \sigma_h^2)$. For our three-dimensional data, this allows $K(x)$ to be written as a product of three one-dimensional Gaussians

$$K(x) = \frac{1}{(2\pi)^{3/2}\, \sigma_s \sigma_v \sigma_h} \exp\left[-\frac{1}{2}\left(\frac{s^2}{\sigma_s^2} + \frac{v^2}{\sigma_v^2} + \frac{h^2}{\sigma_h^2}\right)\right] \tag{4}$$

which depends on the three unknown bandwidth parameters $\sigma_s$, $\sigma_v$, and $\sigma_h$.

2.3 Cross-Validation for Bandwidth Selection

The accuracy of the kernel density estimate $\hat{p}(x)$ is highly dependent on the choice of the bandwidth vector $\sigma = (\sigma_s, \sigma_v, \sigma_h)$ [1]. The recovered $\hat{p}(x)$ will be spiky for small values of the parameters and, in the limit, will tend to a sum of Dirac delta functions centered at the $x_i$ data points as the bandwidths approach zero. Large bandwidths, on the other hand, can induce excessive smoothing which causes the loss of important structure in the estimate of $p(x)$. A number of bandwidth selection techniques have been proposed, and a recent survey of methods and software is given in [3]. Many of these techniques are based on maximum likelihood estimates for $p(x)$, which select $\sigma$ so that $\hat{p}(x)$ maximizes the likelihood of the observed $x_i$ data samples. Applying these techniques to the full set of observed data, however, yields a maximum at $\sigma = (0, 0, 0)$, which corresponds to the sum of delta functions result. To avoid this difficulty, maximum likelihood methods for bandwidth selection have been developed that are based on leave-one-out cross-validation [6]. The computational demands of leave-one-out cross-validation techniques are excessive for our HITf/x data set. Therefore, we have adopted a cross-validation method which requires less computation.
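As an illustration of equations (2) and (4), the following is a minimal NumPy sketch of the diagonal-bandwidth kernel estimate. The function name `kde_diag`, the synthetic Gaussian samples, and the bandwidth values tried are assumptions made for this example; they are not HITf/x data or the study's code. The last lines evaluate the in-sample log-likelihood for shrinking bandwidths, which shows why fitting the bandwidth on the full data set drives it toward zero.

```python
import numpy as np

def kde_diag(query, samples, sigma):
    """Evaluate equation (2) with the product kernel of equation (4).

    query:   (m, 3) points (s, v, h) at which to evaluate the estimate
    samples: (n, 3) observed batted ball vectors x_i
    sigma:   length-3 bandwidth vector (sigma_s, sigma_v, sigma_h)
    """
    query = np.atleast_2d(np.asarray(query, dtype=float))
    samples = np.asarray(samples, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # Scaled differences (x - x_i) / sigma, shape (m, n, 3)
    z = (query[:, None, :] - samples[None, :, :]) / sigma
    # Normalizing constant (2*pi)^(3/2) * sigma_s * sigma_v * sigma_h
    norm = (2.0 * np.pi) ** (sigma.size / 2.0) * np.prod(sigma)
    kernel = np.exp(-0.5 * np.sum(z ** 2, axis=2)) / norm
    # Average of the n kernels centered at the sample points
    return kernel.mean(axis=1)

# Synthetic stand-in for (s, v, h) vectors; hypothetical, not HITf/x data.
rng = np.random.default_rng(0)
samples = rng.normal(loc=[90.0, 10.0, 0.0],
                     scale=[12.0, 25.0, 25.0], size=(500, 3))

# Density estimate at two query points for one bandwidth vector
density = kde_diag([[93.0, 0.0, 0.0], [60.0, 40.0, -30.0]],
                   samples, sigma=[2.0, 1.5, 2.2])

# In-sample log-likelihood keeps growing as the bandwidth shrinks,
# because each sample sits exactly on the peak of its own kernel.
in_sample_ll = [
    float(np.sum(np.log(kde_diag(samples, samples, [b, b, b]))))
    for b in (2.0, 0.5, 0.1, 0.01)
]
print(in_sample_ll)
```

The broadcasted `(m, n, 3)` array is convenient for small examples; for a data set of the size used here, the queries would need to be processed in chunks to bound memory.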
From the full set of $n$ observed $x_i$ vectors, we generate $M$ disjoint subsets $S_j$ of fixed size $n_v$ to be used for validation. For each validation set $S_j$, we construct the estimate $\hat{p}(x)$ using the $n - n_v$ vectors that are not in $S_j$ as a function of the bandwidth vector $\sigma = (\sigma_s, \sigma_v, \sigma_h)$. The optimal bandwidth vector $\sigma_j^* = (\sigma_{sj}^*, \sigma_{vj}^*, \sigma_{hj}^*)$ for $S_j$ is the choice that maximizes the pseudolikelihood [2] [3] according to

$$\sigma_j^* = \arg\max_{\sigma} \prod_{x_i \in S_j} \hat{p}(x_i) \tag{5}$$

where the product is over the $n_v$ vectors in the validation set $S_j$. The overall optimized bandwidth vector $\sigma^*$ is obtained by averaging the $M$ vectors $\sigma_j^*$.

For our data set, we used five validation sets $S_1$, $S_2$, $S_3$, $S_4$, and $S_5$ to select the optimized bandwidth vector $\sigma^*$ for the $\hat{p}(x)$ estimate. Set $S_i$ includes $n_v$ batted balls that were hit on day $6i - 5$ of a calendar month. Set $S_2$, for example, includes only batted balls hit on the 7th day of a month. The size $n_v$ = 380 was taken to be the largest value so that each set $S_i$ includes the same number of elements. The decision to use six days of separation for the validation sets was made with the goal of maximizing the independence of the sets. A regular-season series of consecutive games between the same pair of teams always lasts fewer than six days. In addition, major league teams in 2014 tended to use a rotation of starting pitchers that repeats every five days so that, if this tendency is followed, each starting pitcher will appear once per calendar month in each of the five validation sets.

For each validation set $S_j$, a three-dimensional search was conducted with a step size of 0.1 in $\sigma_s$, $\sigma_v$, and $\sigma_h$ to find the optimized $\sigma_j^*$ in equation (5). For each $S_j$ and $\sigma$ vector under consideration, we removed the twenty $x_i$ batted ball vectors with the smallest value of $\hat{p}(x_i)$ to prevent outliers from influencing the optimization. The vectors $\sigma_j^*$ for each $S_j$ are given in Table 1 and, after averaging, yielded an optimized $\sigma^* = (\sigma_s^*, \sigma_v^*, \sigma_h^*)$ of (2.02, 1.50, 2.20). We see that vertical angle has the smallest smoothing parameter ($\sigma_v^* = 1.50$), which is consistent with the observation from Figures 1 and 2 that vertical angle has more detailed structure in its density than batted ball speed or horizontal angle.

Table 1: Optimal bandwidths $\sigma_j^*$ for validation sets $S_j$

| $S_1$ | $S_2$ | $S_3$ | $S_4$ | $S_5$ |
|---|---|---|---|---|
| (2.0, 1.5, 2.2) | (1.9, 1.5, 2.3) | (2.0, 1.6, 2.0) | (2.0, 1.6, 2.3) | (2.2, 1.3, 2.2) |
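The validation procedure around equation (5) can be sketched as follows. This is a hedged illustration rather than the study's implementation: the helper names (`kde_diag`, `trimmed_log_pseudolikelihood`, `select_bandwidth`), the synthetic data, the coarse grid, and the trim count of 2 are all assumptions chosen so the example runs quickly. The sketch maximizes the sum of log densities, which has the same maximizer as the product in equation (5) and avoids numerical underflow.

```python
import itertools
import numpy as np

def kde_diag(query, samples, sigma):
    """Product-Gaussian kernel density estimate (equations (2) and (4))."""
    z = (query[:, None, :] - samples[None, :, :]) / sigma
    norm = (2.0 * np.pi) ** 1.5 * np.prod(sigma)
    return (np.exp(-0.5 * np.sum(z ** 2, axis=2)) / norm).mean(axis=1)

def trimmed_log_pseudolikelihood(train, valid, sigma, n_trim):
    """Log of the product in equation (5) after dropping the n_trim
    validation vectors with the smallest estimated density."""
    p = np.sort(kde_diag(valid, train, sigma))[n_trim:]
    # Floor at the smallest positive float so log() never sees zero
    return float(np.sum(np.log(np.maximum(p, np.finfo(float).tiny))))

def select_bandwidth(data, valid_sets, grid, n_trim):
    """Grid search per validation set, then average the per-set optima."""
    best = []
    for idx in valid_sets:
        mask = np.ones(len(data), dtype=bool)
        mask[idx] = False                      # train on vectors not in S_j
        train, valid = data[mask], data[idx]
        scores = {
            sig: trimmed_log_pseudolikelihood(train, valid,
                                              np.array(sig), n_trim)
            for sig in itertools.product(grid, repeat=3)
        }
        best.append(max(scores, key=scores.get))
    best = np.array(best)
    return best, best.mean(axis=0)

# Synthetic stand-in for (s, v, h) vectors; hypothetical, not HITf/x data.
rng = np.random.default_rng(1)
data = rng.normal(size=(600, 3)) * [12.0, 8.0, 20.0] + [90.0, 10.0, 0.0]

# Five disjoint validation sets of equal size (random here; the text
# instead groups batted balls by day 6i - 5 of each calendar month).
valid_sets = rng.permutation(len(data))[:300].reshape(5, 60)

grid = (1.0, 3.0, 5.0, 7.0, 9.0)               # coarse illustrative grid
per_set, averaged = select_bandwidth(data, valid_sets, grid, n_trim=2)
print(per_set)
print(averaged)
```

An exhaustive grid keeps the sketch simple; the cost grows with the cube of the grid size, so a finer step such as the 0.1 used in the text would call for a restricted search range or a coarse-to-fine strategy.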

2.4 Constructing the Estimate for $P(R_j \mid x)$

An estimate for $P(R_j \mid x)$ can be derived from estimates of the quantities on the right side of equation (1). The density estimate $\hat{p}(x)$ for $p(x)$ is obtained using the kernel method defined by equations (2) and (4) with the optimized bandwidth vector $\sigma^*$ learned using the process described in section 2.3. Each conditional probability density function $p(x \mid R_j)$ is estimated in the same way, except that the training set is defined by the subset of the $x_i$ vectors with outcome $R_j$. Since reduced sample sizes for specific outcomes $R_j$ preclude the learning of individual bandwidth vectors for each $p(x \mid R_j)$, we use the $\sigma^*$ derived for $p(x)$ for each case. This approach also has the desirable effect of providing the same smoothing to a batted ball vector in the numerator and denominator of (1), which prevents a probability $P(R_j \mid x)$ from exceeding one. Each prior probability $P(R_j)$ is estimated by the fraction of the $n$ batted balls in the full training set with outcome $R_j$. The estimate for $P(R_j \mid x)$ is then constructed by combining the estimates for $p(x \mid R_j)$, $P(R_j)$, and $p(x)$ according to Bayes' theorem.

2.5 Batter Handedness

We repeated the process described in the previous sections to obtain separate densities for left-handed and right-handed batters. The $n$ = 124364 batted balls were first partitioned into the 54948 for left-handed batters and 69416 for right-handed batters. The method described in section 2.3 was then used to build five validation sets for each case, which resulted in a validation set size $n_v$ of 1680 for left-handed batters and 190 for right-handed batters. The optimal bandwidth vectors $\sigma_j^*$ for each validation set and batter handedness are given in Table 2. After averaging, we arrive at an optimized $\sigma^* = (\sigma_s^*, \sigma_v^*, \sigma_h^*)$ of (2.18, 1.72, 2.50) for left-handed batters and (2.16, 1.56, 2.30) for right-handed batters. We note that, as seen in section 2.3, $\sigma_v^*$ is the smallest for each case while $\sigma_h^*$ is the largest. In addition, the bandwidth increases for each variable to provide more smoothing as the number of samples decreases.
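The construction in section 2.4 can be sketched in a few lines. The example below is a hedged illustration under stated assumptions: the function names, the synthetic samples, and the toy rule that assigns outcomes are hypothetical and carry no baseball meaning. It checks the property noted above, that sharing one bandwidth vector between numerator and denominator keeps every estimated $P(R_j \mid x)$ at or below one, and it forms an expected total-bases value as a linear combination of the outcome probabilities.

```python
import numpy as np

def kde_diag(query, samples, sigma):
    """Product-Gaussian kernel density estimate (equations (2) and (4))."""
    z = (query[:, None, :] - samples[None, :, :]) / sigma
    norm = (2.0 * np.pi) ** 1.5 * np.prod(sigma)
    return (np.exp(-0.5 * np.sum(z ** 2, axis=2)) / norm).mean(axis=1)

def posterior(query, samples, outcomes, sigma):
    """Estimate P(R_j | x) from equation (1) with one shared bandwidth.

    Returns (labels, P) where P[k, j] is the estimate for query k and
    outcome labels[j]. Queries far from all samples can give p(x) = 0.
    """
    labels = np.unique(outcomes)
    p_x = kde_diag(query, samples, sigma)            # denominator p(x)
    columns = []
    for label in labels:
        subset = samples[outcomes == label]          # vectors with outcome R_j
        prior = len(subset) / len(samples)           # P(R_j)
        columns.append(kde_diag(query, subset, sigma) * prior / p_x)
    return labels, np.column_stack(columns)

# Synthetic stand-in for (s, v, h) vectors; hypothetical, not HITf/x data.
rng = np.random.default_rng(2)
samples = rng.normal(size=(2000, 3)) * [12.0, 25.0, 25.0] + [90.0, 10.0, 0.0]
s, v = samples[:, 0], samples[:, 1]

# Toy labeling rule, invented purely for illustration
outcomes = np.where((s > 100) & (v > 20) & (v < 35), "home_run",
                    np.where((v > 5) & (v < 20), "single", "out"))

sigma = np.array([2.0, 1.5, 2.2])                    # illustrative bandwidths
query = np.array([[105.0, 27.0, 0.0], [88.0, 12.0, 10.0], [80.0, -20.0, 5.0]])
labels, P = posterior(query, samples, outcomes, sigma)

# Expected total bases per batted ball: a linear combination of P(R_j | x)
bases = {"out": 0.0, "single": 1.0, "home_run": 4.0}
weights = np.array([bases[label] for label in labels])
expected_bases = P @ weights
print(labels, P.round(3), expected_bases.round(3))
```

Because each prior is $n_j / n$ and each conditional estimate averages over $n_j$ kernels, the weighted conditional estimates sum exactly to $\hat{p}(x)$ when the bandwidth is shared, so the rows of `P` sum to one up to floating-point error.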

Table 2: Optimal bandwidths $\sigma_j^*$ for validation sets $S_j$ by batter handedness

| Batter | $S_1$ | $S_2$ | $S_3$ | $S_4$ | $S_5$ |
|---|---|---|---|---|---|
| L | (2.0, 1.5, 3.1) | (2.2, 2.1, 2.2) | (2.3, 1.6, 2.1) | (2.4, 1.9, 2.3) | (2.0, 1.5, 2.8) |
| R | (1.9, 1.8, 2.1) | (2.1, 1.7, 2.2) | (2.4, 1.4, 2.2) | (2.2, 1.5, 2.6) | (2.2, 1.4, 2.4) |

References

[1] R. Duda, P. Hart, and D. Stork. Pattern Classification. Wiley-Interscience, New York, 2001.
[2] R. Duin. On the choice of smoothing parameters for Parzen estimators of probability density functions. IEEE Transactions on Computers, C-25(11):1175-1179, 1976.
[3] A.C. Guidoum. Kernel estimator and bandwidth selection for density and its derivatives. The kedd package, version 1.03, October 2015.
[4] E. Parzen. On estimation of a probability density function and mode. Annals of Mathematical Statistics, 33(3):1065-1076, 1962.
[5] M. Rosenblatt. Remarks on some nonparametric estimates of a density function. Annals of Mathematical Statistics, 27(3):832-837, 1956.
[6] S. Sheather. Density estimation. Statistical Science, 19(4):588-597, 2004.