A One-Parameter Markov Chain Model for Baseball Run Production


Winthrop University, April 13, 2013

Baseball is an ideal candidate for mathematical modelling, as it has these features: a relatively small number of configurations, a relatively small number of events that can happen, and a discrete nature. These factors also suggest using a discrete-time stochastic process as a model. For the last 50 years, intermittent attempts have been made to model baseball using a type of stochastic process called a Markov chain, and to derive measures of players' offensive abilities from an analysis of that model.

Markov Chains
A Markov chain is a mathematical model for movement between states. A process starts in one of these states and moves from state to state. The moves between states are called steps or transitions. The chain is said to be at a state (or in a state) after a certain number of steps. The state of the chain at any given step is not known with certainty, but the probability that the chain is at a given state after n steps depends only on the state of the chain after n - 1 steps and on the probabilities that the chain moves from one state j to another state i in one step.

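The Transition Matrix P
These probabilities are called transition probabilities for the Markov chain. The transition probabilities are placed in a matrix called the transition matrix P for the chain by entering the probability of a transition from state j to state i at the (i, j)-entry of P. So if there were m states named 1, 2, ..., m, the transition matrix would be the m × m matrix

$$
P = \begin{pmatrix}
p_{11} & \cdots & p_{1j} & \cdots & p_{1m} \\
\vdots &        & \vdots &        & \vdots \\
p_{i1} & \cdots & p_{ij} & \cdots & p_{im} \\
\vdots &        & \vdots &        & \vdots \\
p_{m1} & \cdots & p_{mj} & \cdots & p_{mm}
\end{pmatrix}
$$

in which the columns are indexed by the state the chain moves from (j = 1, ..., m) and the rows by the state it moves to (i = 1, ..., m).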
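As a small illustration of the column convention described above (from state j, to state i), here is a minimal sketch of a toy three-state chain. The matrix entries and the helper names `step` and `distribution_after` are invented for this example and are not part of the baseball model.

```python
# Toy 3-state chain using the column convention:
# P[i][j] is the probability of moving FROM state j TO state i,
# so every column of P sums to 1. The numbers are made up for illustration.
P = [
    [0.5, 0.2, 0.0],
    [0.5, 0.3, 0.0],
    [0.0, 0.5, 1.0],  # column 2 is (0, 0, 1): state 2 is absorbing
]

def step(P, x):
    """One step of the chain: returns the matrix-vector product P x."""
    m = len(P)
    return [sum(P[i][j] * x[j] for j in range(m)) for i in range(m)]

def distribution_after(P, x0, n):
    """Probability distribution over states after n steps, starting from x0."""
    x = list(x0)
    for _ in range(n):
        x = step(P, x)
    return x

x0 = [1.0, 0.0, 0.0]                 # start in state 0 with certainty
x1 = distribution_after(P, x0, 1)    # distribution after one step
x2 = distribution_after(P, x0, 2)    # distribution after two steps
print(x1)
print(x2)
```

With this convention the distribution after n steps is obtained by multiplying the starting column vector by P on the left n times.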

The Big Idea
Use a Markov chain to model run production in baseball. The states will be the different possible offensive configurations. The transition probabilities will be calculated from actual data.

States of the Markov Chain

Transient States
Bases Occupied    Outs
None              0
1st               0
2nd               0
3rd               0
1st,2nd           0
1st,3rd           0
2nd,3rd           0
1st,2nd,3rd       0
None              1
1st               1
2nd               1
3rd               1
1st,2nd           1
1st,3rd           1
2nd,3rd           1
1st,2nd,3rd       1
None              2
1st               2
2nd               2
3rd               2
1st,2nd           2
1st,3rd           2
2nd,3rd           2
1st,2nd,3rd       2

Absorbing States
Left on Base      Outs
0                 3
1                 3
2                 3
3                 3
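The state space above can be enumerated mechanically. The following is a minimal sketch; the names `base_configs`, `TRANSIENT`, `ABSORBING`, and `INDEX`, and the tuple encoding of states, are choices made for this example only.

```python
from itertools import combinations

BASES = (1, 2, 3)

def base_configs():
    """All 8 sets of occupied bases, in the order used in the table:
    none, 1st, 2nd, 3rd, 1st+2nd, 1st+3rd, 2nd+3rd, bases loaded."""
    return [c for k in range(4) for c in combinations(BASES, k)]

# Transient states: (occupied bases, outs) with 0, 1, or 2 outs.
TRANSIENT = [(bases, outs) for outs in range(3) for bases in base_configs()]

# Absorbing states: number of runners left on base when the third out is made.
ABSORBING = [("left_on_base", k) for k in range(4)]

# Position of each transient state, useful when filling a transition matrix.
INDEX = {state: i for i, state in enumerate(TRANSIENT)}

print(len(TRANSIENT), len(ABSORBING))
```

This gives 24 transient states and 4 absorbing states, 28 in all.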

Transition Probabilities
Only the following events are allowed for the player at bat: single, double, triple, home run, walk, hit batsman, out. No base stealing, errors, sacrifices, sacrifice flies, double plays, triple plays, or other ever-more-bizarre events are allowed. For the following analysis to work, each batter must have the same probability of singling, doubling, etc.; that is, every batter is treated as identical.

Outcomes of Batting Events

Event: Outcome
Walk or Hit Batsman: The batter advances to first base. Runners advance only if forced.
Single: The batter advances to first base. Runners on first base or third base advance one base. A runner on second base advances to third base with probability 1 - p and scores with probability p. (This parameter is chosen to minimize model error.)
Double: The batter advances to second base. All runners advance two bases.
Triple: The batter advances to third base. All runners advance three bases.
Home Run: The batter scores. All runners score.
Out: No runners advance. The number of outs increases by one.
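The rules in this table can be written as a single function from a base configuration and an event to the possible results. This is a minimal sketch under the stated rules only; the function name `apply_event`, the event labels, and the return format are assumptions made for illustration.

```python
def apply_event(bases, event, p):
    """Apply one batting event to the set of occupied bases.

    bases: iterable of occupied bases drawn from {1, 2, 3}
    event: "walk" (includes hit batsman), "single", "double",
           "triple", "home_run", or "out"
    p:     probability that a runner on second scores on a single
    Returns a list of (probability, new_bases, runs_scored) tuples.
    The out count is not tracked here; the caller handles it.
    """
    b = set(bases)
    if event == "out":
        return [(1.0, frozenset(b), 0)]          # no runners advance
    if event == "walk":
        # Forced advances only: the lowest empty base becomes occupied.
        for base in (1, 2, 3):
            if base not in b:
                return [(1.0, frozenset(b | {base}), 0)]
        return [(1.0, frozenset(b), 1)]          # bases loaded: one run is forced in
    if event == "single":
        new = {1} | ({2} if 1 in b else set())   # batter to 1st, runner on 1st to 2nd
        runs = 1 if 3 in b else 0                # runner on 3rd scores
        if 2 in b:                               # runner on 2nd: scores w.p. p, else stops at 3rd
            return [(p, frozenset(new), runs + 1),
                    (1.0 - p, frozenset(new | {3}), runs)]
        return [(1.0, frozenset(new), runs)]
    if event == "double":
        new = {2} | ({3} if 1 in b else set())   # runner on 1st reaches 3rd
        return [(1.0, frozenset(new), len(b & {2, 3}))]
    if event == "triple":
        return [(1.0, frozenset({3}), len(b))]   # every runner scores
    if event == "home_run":
        return [(1.0, frozenset(), len(b) + 1)]  # runners and batter score
    raise ValueError(f"unknown event: {event}")

print(apply_event({2}, "single", 0.63))
```

Only the single has a random outcome, and only when second base is occupied; every other event moves the runners deterministically.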

Calculation of Transition Probabilities
The transition probabilities are calculated from the above rules and from aggregate data. For example, the data from the 2012 MLB season are:

Batting Event          Occurrences   Probability
Walk or Hit Batsman    16203         0.089
Single                 27941         0.154
Double                 8261          0.046
Triple                 927           0.005
Home Run               4934          0.027
Out                    123128        0.679

Example: The probability of a transition from runner on first, 0 out to runners on first and second, 0 out is the probability of a walk or a hit batsman or a single, which is 0.089 + 0.154 = 0.243.
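The probabilities in the table are each event's share of all plate appearances counted. A short sketch that recomputes them from the 2012 counts given above follows; the dictionary keys and variable names are chosen for this example.

```python
# 2012 MLB event counts, copied from the table above.
COUNTS_2012 = {
    "walk": 16203,       # walk or hit batsman
    "single": 27941,
    "double": 8261,
    "triple": 927,
    "home_run": 4934,
    "out": 123128,
}

total = sum(COUNTS_2012.values())
PROBS_2012 = {event: count / total for event, count in COUNTS_2012.items()}

for event, prob in PROBS_2012.items():
    print(f"{event:10s} {prob:.3f}")

# Runner on first, 0 out -> runners on first and second, 0 out:
# this happens on a walk, a hit batsman, or a single.
p_first_to_first_and_second = PROBS_2012["walk"] + PROBS_2012["single"]
print(f"{p_first_to_first_and_second:.3f}")
```

Working from the counts rather than the rounded table values avoids accumulating rounding error when the probabilities are combined.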

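The Transition Matrix P
The transition probabilities are calculated as above and placed in the transition matrix P. We order the states by listing the absorbing states first, then the transient states. The probability of a transition from an absorbing state to itself is 1. The probability of a transition from an absorbing state to any other state is 0. The probability of a transition from a transient state is governed by the above probabilities. In block form,

$$
P = \begin{pmatrix} I & S \\ O & Q \end{pmatrix}
$$

where I is the identity matrix on the absorbing states, O is a zero matrix, S holds the transitions from transient states to absorbing states, and Q holds the transitions between transient states.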

The Fundamental Matrix M
The fundamental matrix for the Markov chain is $M = (I - Q)^{-1}$. Suppose that the chain starts at transient state i. Then:
- The expected number of visits to state j before being absorbed is the (j, i)-element of M.
- The expected number of steps until absorption is the sum of the (j, i)-elements of M, where j ranges over all transient states.
In terms of baseball, the sum of the i-th column of M is the expected number of batters that will come to bat during the remainder of the half-inning starting from state i. In particular, the sum of the first column of M (bases empty, no outs) is the expected number of batters E(B) that will bat in a half-inning.

The Matrix A = SM
The matrix A = SM is important to our analysis. Suppose that the chain starts at the transient state i. Then the (j, i)-element of the matrix A = SM is the probability that the Markov chain is eventually absorbed at state j. In the baseball model, the i-th column of A contains the probabilities that 0, 1, 2, or 3 runners are left on base at the end of the half-inning starting at state i. Thus the first column of A can be used to compute the expected number of runners E(L) left on base:

$$
E(L) = 0 \cdot P(0 \text{ left}) + 1 \cdot P(1 \text{ left}) + 2 \cdot P(2 \text{ left}) + 3 \cdot P(3 \text{ left})
$$
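To make the roles of M and A concrete, here is a sketch on a deliberately tiny absorbing chain with two transient and two absorbing states. The matrices Q and S below are invented for the example and have nothing to do with baseball; exact fractions are used so the results can be checked by hand.

```python
from fractions import Fraction as F

# Column convention: entry [i][j] is the probability of moving from j to i.
# Q: transient -> transient, S: transient -> absorbing (made-up numbers).
Q = [[F(0),    F(1, 4)],
     [F(1, 2), F(0)]]
S = [[F(1, 2), F(1, 4)],
     [F(0),    F(1, 2)]]

def inverse_2x2(a):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def matmul(a, b):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

I_minus_Q = [[(1 if i == j else 0) - Q[i][j] for j in range(2)] for i in range(2)]
M = inverse_2x2(I_minus_Q)   # fundamental matrix: expected visits
A = matmul(S, M)             # absorption probabilities

# Column sums of M: expected number of steps before absorption from each start.
expected_steps = [M[0][j] + M[1][j] for j in range(2)]
print(M, A, expected_steps)
```

Each column of A sums to 1, since the chain is eventually absorbed somewhere. In the baseball model the same computation is carried out with the 24 × 24 matrix Q and the 4 × 24 matrix S.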

The Conservation Law of Baseball
Every batter who comes to the plate makes an out, scores, or is left on base. Let B be the number of batters who come to the plate in a half-inning, R be the number of (earned) runs scored, and L be the number left on base. Then B = 3 + R + L, or R = B - L - 3. Taking expected values,

$$
E(R) = E(B) - E(L) - 3
$$

and since we can calculate E(B) and E(L), we can calculate E(R).

Does this Work?
We set p equal to the value that would predict exactly the number of earned runs scored in MLB since 1955. This value is p = 0.63. For the 2012 MLB season, the model predicts that 0.439618 earned runs will score in a single half-inning. Since 43,355 half-innings were played during the season, the model predicts that 19,059.6 earned runs would score during the season. In reality, 19,302 earned runs scored during the season, or 0.445208 per half-inning. The model under-predicted by 242.4 earned runs for the season, a relative error of 1.26% in magnitude.
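The season-level figures quoted above follow from the per-half-inning rate by simple arithmetic, which can be checked as follows. The variable names are illustrative; the error is taken as prediction minus actual, so a negative value means the model under-predicted.

```python
predicted_rate = 0.439618   # model's earned runs per half-inning, 2012
half_innings = 43355        # half-innings played in 2012
actual_runs = 19302         # earned runs actually scored in 2012

predicted_runs = predicted_rate * half_innings
error = predicted_runs - actual_runs      # prediction minus actual
relative_error = error / actual_runs
actual_rate = actual_runs / half_innings

print(f"predicted runs: {predicted_runs:.1f}")
print(f"error:          {error:.1f}")
print(f"relative error: {100 * relative_error:.2f}%")
print(f"actual rate:    {actual_rate:.6f}")
```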

Does this Work?
Error is the prediction minus the number of earned runs scored.

Year    Earned Runs Scored   Prediction   Error     Relative Error
2000    22874                23207.0      333.0     1.46%
2001    21215                21363.2      148.2     0.70%
2002    20529                20701.2      172.2     0.84%
2003    21154                21114.0      -40.0     -0.19%
2004    21510                21724.7      214.7     1.00%
2005    20562                20634.3      72.34     0.35%
2006    21723                21950.6      227.6     1.05%
2007    21529                21407.8      -121.2    -0.56%
2008    20790                20808.4      18.4      0.09%
2009    20731                20904.0      173.0     0.83%
2010    19595                19411.1      -183.9    -0.94%
2011    19032                18866.9      -165.1    -0.87%
2012    19302                19059.6      -242.4    -1.26%
Total   270546               271152.9     606.9     0.22%

[Figure: Earned Runs Scored and Predictions, 1955-2012. Vertical axis: earned runs, 10,000 to 24,000.]

Player Rating
Consider the same player batting over and over. Use the player's batting statistics to generate transition probabilities. Run the model to find the expected number of earned runs the player would score in a single half-inning or in a nine-inning game.
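The whole procedure can be sketched end to end as below. This is an illustrative reading of the model as described in this talk, not the author's code: the function names (`outcomes`, `build_Q_S`, `expected_runs`) and the state encoding are assumptions. One implementation shortcut is also used: instead of inverting I - Q, the first column of M is accumulated from the series M = I + Q + Q^2 + ..., which gives the same column as long as the out probability is positive.

```python
from itertools import combinations

EVENTS = ("walk", "single", "double", "triple", "home_run", "out")
CONFIGS = [frozenset(c) for k in range(4) for c in combinations((1, 2, 3), k)]
STATES = [(b, o) for o in range(3) for b in CONFIGS]   # 24 transient states
INDEX = {s: i for i, s in enumerate(STATES)}

def outcomes(bases, event, p):
    """(probability, new_bases) pairs for a non-out event."""
    b = set(bases)
    if event == "walk":                      # forced advances only
        for base in (1, 2, 3):
            if base not in b:
                return [(1.0, frozenset(b | {base}))]
        return [(1.0, frozenset(b))]         # bases loaded: a run is forced in
    if event == "single":
        new = {1} | ({2} if 1 in b else set())
        if 2 in b:                           # runner on 2nd scores w.p. p
            return [(p, frozenset(new)), (1.0 - p, frozenset(new | {3}))]
        return [(1.0, frozenset(new))]
    if event == "double":
        return [(1.0, frozenset({2} | ({3} if 1 in b else set())))]
    if event == "triple":
        return [(1.0, frozenset({3}))]
    return [(1.0, frozenset())]              # home run clears the bases

def build_Q_S(probs, p):
    """Q[i][j]: transient j -> transient i.  S[k][j]: transient j -> k left on base."""
    n = len(STATES)
    Q = [[0.0] * n for _ in range(n)]
    S = [[0.0] * n for _ in range(4)]
    for (bases, outs), j in INDEX.items():
        for event in EVENTS:
            if event == "out":
                if outs == 2:                # third out ends the half-inning
                    S[len(bases)][j] += probs[event]
                else:
                    Q[INDEX[(bases, outs + 1)]][j] += probs[event]
            else:
                for w, new in outcomes(bases, event, p):
                    Q[INDEX[(new, outs)]][j] += probs[event] * w
    return Q, S

def expected_runs(probs, p, tol=1e-13):
    """E(R) = E(B) - E(L) - 3 per half-inning; requires probs['out'] > 0."""
    Q, S = build_Q_S(probs, p)
    n = len(STATES)
    x = [0.0] * n
    x[0] = 1.0                               # start: bases empty, no outs
    m0 = [0.0] * n                           # accumulates the first column of M
    while sum(x) > tol:
        m0 = [a + b for a, b in zip(m0, x)]
        x = [sum(Q[i][j] * x[j] for j in range(n)) for i in range(n)]
    expected_batters = sum(m0)
    expected_left = sum(k * sum(S[k][j] * m0[j] for j in range(n)) for k in range(4))
    return expected_batters - expected_left - 3.0

# Usage: the talk gives no per-player event counts, so the 2012 league
# totals stand in for a player's batting line here.
counts = {"walk": 16203, "single": 27941, "double": 8261,
          "triple": 927, "home_run": 4934, "out": 123128}
total = sum(counts.values())
probs = {e: c / total for e, c in counts.items()}
per_half_inning = expected_runs(probs, p=0.63)
print(per_half_inning, 9 * per_half_inning)  # half-inning and nine-inning figures
```

To rate an individual player, the `counts` dictionary would be replaced by that player's own event counts, and the nine-inning figure is nine times the half-inning expectation.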

Example: MLB 2012
The 10 MLB players with the highest batting averages in the 2012 season were studied using this model. The p value was set at the value that would produce the exact number of earned runs for the entire 2012 MLB season. This value is p = 0.695.

Name         Markov   Runs Created 27   OPS
Posey        8.06     7.90              .987
Cabrera      8.22     7.98              .999
McCutchen    7.77     7.99              .953
Trout        7.91     8.85              .963
Beltre       6.73     7.45              .921
Braun        8.01     8.46              .987
Mauer        6.90     6.66              .861
Jeter        5.01     5.37              .791
Molina       6.33     6.87              .874
Fielder      7.97     7.82              .940

MLB 2012 Rankings

Name         Markov   Runs Created 27   OPS
Posey        2        5                 3
Cabrera      1        4                 1
McCutchen    6        3                 5
Trout        5        1                 4
Beltre       8        7                 7
Braun        3        2                 2
Mauer        7        9                 9
Jeter        10       10                10
Molina       9        8                 8
Fielder      4        6                 6