"Part of baseball's charm is the illusion it offers that all aspects of it can be completely reduced to numerical expressions and printed in agate type in the sports section."
George F. Will, Men at Work
Tom O'Brien
The Official Rules of Major League Baseball
1.02 The objective of each team is to win by scoring more runs than their opponent.
1.03 The winner of the game shall be that team which shall have scored, in accordance with these rules, the greater number of runs at the conclusion of a regulation game.
What are the important factors in winning games? Suppose our team outdoes the opposition in category X. What is the probability that we will win the game?
Event            Probability, %    W/L
Runs created          82.6         4.75
Hits + Walks          79.6         3.90
Total Bases           79.5         3.88
Hits                  76.3         3.22
Walks                 66.1         1.95
(Errors)              63.9         1.77
Stolen Bases          61.1         1.58
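The W/L column appears to be the win probability restated as odds, p / (1 - p). That reading is an assumption, but it agrees with the tabulated figures to within rounding. A minimal sketch (the function name `win_loss_ratio` is illustrative, not from the slides):

```python
def win_loss_ratio(prob_pct: float) -> float:
    """Convert a win probability in percent to a won/lost ratio p / (1 - p)."""
    p = prob_pct / 100.0
    return p / (1.0 - p)


# Probabilities copied from the table above.
probabilities = {
    "Runs created": 82.6,
    "Hits + Walks": 79.6,
    "Total Bases": 79.5,
    "Hits": 76.3,
    "Walks": 66.1,
}

for event, pct in probabilities.items():
    print(f"{event:<14} {pct:5.1f}%  W/L = {win_loss_ratio(pct):.2f}")
```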
The Society for American Baseball Research (SABR)
SABeRmetrics
"Sabermetrics is the mathematical and statistical analysis of baseball records" - Bill James
Offense 50%, Defense 50%
Offense: Runs Created; Linear Weights; OPS Modified
Defense: Pitching ~40%; Fielding ~10%
Wins above Replacement Player (WAR) [flowchart]
Team data: runs scored, runs allowed; ratio or difference; predict number of games won; calculate "runs per win"
Individual data: hits, walks, extra bases, outs; runs created; defensive stats, pitching stats, other stuff
Define "replacement player" (R. P.); runs above R. P.; WAR
The Importance of Felix Hernandez
Cy Young Award, 2001-2012: Won-Lost Percentages of Starting Pitchers (starters have won 23 of 24 awards)
2010 AL: Felix Hernandez, Seattle Mariners
Won 13, Lost 12 (others average 20-6)
Team record 61-101 (worst in AL)
Felix produced 4.24 wins above team, 6.8 WAR
ERA 2.27; best in league
Pitched 250 innings; 230 K's
2009: Felix went 19-5
Chart values (won-lost percentages): .880, .870, .828, .828, .821, .818, .808, .808, .800, .783, .778, .769, .760, .760, .759, .731, .731, .724, .682, .677, .667, .667, .520
Position    Player
1B          Albert Pujols, Mark Teixeira
2B          Chase Utley
3B          (Scott Rolen), Adrian Beltre *, Evan Longoria
SS          Brendan Ryan
LF          Brett Gardner
CF          Michael Bourn, (Andruw Jones)
RF          Jason Heyward, Ichiro Suzuki *
( ) Retired 2012
* Beyond peak/dh
$RC = \dfrac{(H + W) \times TB}{AB + W}$
Rearranged: $RC = \dfrac{H + W}{AB + W} \times TB$
$RC = OBP \times (SP \times AB)$
Quality: $OBP \times SP$
Quantity: At Bats
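The rearrangement above can be checked numerically. The sketch below assumes OBP = (H + W) / (AB + W) and SP = TB / AB, as the slide's algebra implies. The stat line is hypothetical, chosen only to make the arithmetic easy to follow.

```python
def runs_created_basic(h: int, w: int, tb: int, ab: int) -> float:
    """Basic Runs Created: (H + W) * TB / (AB + W)."""
    return (h + w) * tb / (ab + w)


def runs_created_from_rates(h: int, w: int, tb: int, ab: int) -> float:
    """Same quantity written as OBP * SP * AB (quality times quantity)."""
    obp = (h + w) / (ab + w)   # on-base percentage, as defined on the slide
    sp = tb / ab               # slugging percentage
    return obp * sp * ab


# Hypothetical season line (not a real player): 150 H, 60 W, 250 TB, 540 AB.
direct = runs_created_basic(150, 60, 250, 540)
via_rates = runs_created_from_rates(150, 60, 250, 540)
print(direct, via_rates)
```

Both forms return the same value because SP x AB is simply TB.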
$RC = \dfrac{(H + W - CS) \times (TB + 0.55 \times SB)}{AB + W} = \dfrac{A \times B}{C}$
A: Reaching base
B: Advancement
C: Opportunities
A factor: $H + W - CS$ becomes $H + W - CS + HBP - GIDP$
B factor: $TB + 0.55 \times SB$ becomes $TB + 0.26(TBB - IBB + HBP) + 0.52(SH + SF + SB)$
C factor: $AB + W$ becomes $AB + W + HBP + SH + SF$
Bill James has about 15 different technical versions; use depends on availability of various stats
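As an illustration, the stolen-base and technical versions can be written as two functions over one stat line. This is a sketch under assumptions: the `BattingLine` container and its field names are invented here, TBB is treated as the same walk total as W, and the sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Hypothetical container for the counting stats the formulas need."""
    ab: int
    h: int
    w: int        # walks (the slide's W, also used for TBB)
    tb: int
    sb: int = 0
    cs: int = 0
    hbp: int = 0
    gidp: int = 0
    ibb: int = 0
    sh: int = 0
    sf: int = 0


def rc_stolen_base(s: BattingLine) -> float:
    """Stolen-base version: A * B / C."""
    a = s.h + s.w - s.cs            # reaching base
    b = s.tb + 0.55 * s.sb          # advancement
    c = s.ab + s.w                  # opportunities
    return a * b / c


def rc_technical(s: BattingLine) -> float:
    """Technical version with the expanded A, B and C factors."""
    a = s.h + s.w - s.cs + s.hbp - s.gidp
    b = s.tb + 0.26 * (s.w - s.ibb + s.hbp) + 0.52 * (s.sh + s.sf + s.sb)
    c = s.ab + s.w + s.hbp + s.sh + s.sf
    return a * b / c


# Hypothetical stat line, for illustration only.
line = BattingLine(ab=540, h=150, w=60, tb=250, sb=20, cs=5,
                   hbp=5, gidp=10, ibb=5, sh=2, sf=3)
print(round(rc_stolen_base(line), 1), round(rc_technical(line), 1))
```

Defaulting the extra fields to zero lets the same container be used when only the basic stats are available, which mirrors the point that the choice of version depends on which stats exist.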
The Runs Created Formulas: 1964 AL
Team                     Runs    Basic   SB version   Tech
Boston                    688     735       732        725
Detroit                   699     690       691        707
New York                  730     695       698        705
Minnesota                 737     764       763        775
Baltimore                 679     667       667        671
Cleveland                 689     658       653        666
Chicago                   642     614       614        643
Los Angeles               544     559       555        552
Kansas City               621     644       643        645
Washington                578     559       556        566
Average                   661     658       657        665
Std Error of Estimate    60.3    26.9      26.5       22.1
Analysis of Variance: Basic 80.1, Tech 6.5, Residue 13.4
Runs Created: 1985 NL
Team            A       B       C      RC      R
StL          1863    2432    6182     733    747
NY           1807    2390    6248     691    695
Mtl          1671    2295    6053     634    633
Chi          1809    2426    6177     710    686
Phl          1720    2340    6122     657    667
Pgh          1677    2135    6099     587    568
East subtotal                        4012   3996
LA           1838    2382    6222     704    682
Cin          1778    2322    6143     672    677
Hou          1774    2386    6192     684    706
SD           1774    2238    6150     646    650
Atl          1728    2234    6206     622    632
SF           1612    2123    6063     564    556
West subtotal                        3892   3903
Total of team values                 7904   7899
Division        A       B       C      RC      R
East        10547   14018   36881    4009   3996
West        10504   13685   36976    3888   3903
Total of division values             7897   7899
League      21051   27703   73857    7896   7899
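The RC column can be reproduced from the A, B and C columns with RC = A x B / C. The sketch below copies the team rows from the table and rounds to the nearest run; the variable names are illustrative.

```python
# (team, A, B, C, RC as tabulated, actual runs R), copied from the 1985 NL table.
NL_1985 = [
    ("StL", 1863, 2432, 6182, 733, 747),
    ("NY",  1807, 2390, 6248, 691, 695),
    ("Mtl", 1671, 2295, 6053, 634, 633),
    ("Chi", 1809, 2426, 6177, 710, 686),
    ("Phl", 1720, 2340, 6122, 657, 667),
    ("Pgh", 1677, 2135, 6099, 587, 568),
    ("LA",  1838, 2382, 6222, 704, 682),
    ("Cin", 1778, 2322, 6143, 672, 677),
    ("Hou", 1774, 2386, 6192, 684, 706),
    ("SD",  1774, 2238, 6150, 646, 650),
    ("Atl", 1728, 2234, 6206, 622, 632),
    ("SF",  1612, 2123, 6063, 564, 556),
]

# Recompute RC for each team and pair it with the actual runs scored.
computed = {team: round(a * b / c) for team, a, b, c, _, _ in NL_1985}
actual = {team: r for team, _, _, _, _, r in NL_1985}

for team in computed:
    print(f"{team:<4} RC = {computed[team]:4d}   R = {actual[team]:4d}   "
          f"error = {computed[team] - actual[team]:+d}")

print("League:", sum(computed.values()), "created vs", sum(actual.values()), "scored")
```

Summed over the twelve teams, the formula lands within a handful of runs of the league's actual total, even though individual teams miss by as much as 24 runs.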
1998 New York Yankees
Record: 114-48, finished 1st in AL East
Manager: Joe Torre (114-48)
Scored 965 runs, allowed 656 runs
Pythagorean W-L: 108-54
Ballparks: Yankee Stadium II & Shea Stadium
Attendance: 2,955,193 (3rd of 14)
Park factors (over 100 favors batters, under 100 favors pitchers):
Multi-year: Batting - 97, Pitching - 95
One-year: Batting - 100, Pitching - 97
Postseason:
Won World Series (4-0) over San Diego Padres
Won AL Championship Series (4-2) over Cleveland Indians
Won AL Division Series (3-0) over Texas Rangers
Percentage = (runs scored)^2 / [(runs scored)^2 + (runs allowed)^2]
$P = \dfrac{R^2}{R^2 + S^2}$
Has found its way into popular culture
Variations:
$P = \dfrac{r^2}{r^2 + 1}$ (where $r = R/S$)
$W/L = R^2 / S^2$ (or $W/L = r^2$)
$P = \dfrac{R^2}{R^2 + S^2}$
Pythagorean theorem involves an assumption. The best exponent may not be 2.00000000.
A generalization: $P = \dfrac{R^n}{R^n + S^n}$
By trial, the best value of n is 1.80-1.85; 1.83 is usually assumed.
Advantage: (Slightly) more accurate
Disadvantages: More complex; can't call it Pythagorean
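A short sketch of the generalized formula, applied to the run totals quoted on the team-summary slide (965 scored, 656 allowed, 162 games). The function name and the choice to pass the exponent as a parameter are illustrative.

```python
def pythagorean_pct(runs_scored: float, runs_allowed: float,
                    exponent: float = 2.0) -> float:
    """Expected winning percentage P = R^n / (R^n + S^n)."""
    rn = runs_scored ** exponent
    sn = runs_allowed ** exponent
    return rn / (rn + sn)


GAMES = 162
R, S = 965, 656   # run totals from the team-summary slide

for n in (2.0, 1.83):
    p = pythagorean_pct(R, S, n)
    print(f"n = {n:<4}  P = {p:.3f}  expected wins = {GAMES * p:.1f}")
```

With n = 1.83 the expected win total rounds to 108, matching the 108-54 Pythagorean record shown on that slide; with n = 2 it comes out nearer 111.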
Linear: Results shown by straight line
Percentage: Coordinates are scoring percentage and winning percentage
Neutrality: Line must go through (0.500, 0.500)
Slope: The best value for the slope of the line is close to 1.8
$\dfrac{W}{W + L} - 0.5 = 1.8\left[\dfrac{R}{R + S} - 0.5\right]$
$P = \dfrac{W}{W + L} = \dfrac{1.4R - 0.4S}{R + S}$
$\Delta = R - S$
$P = \dfrac{R + 0.4\Delta}{R + S}$
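The three forms of the linear formula are algebraically the same: expanding 0.5 + 1.8[R/(R + S) - 0.5] gives (1.4R - 0.4S)/(R + S), and substituting Δ = R - S gives (R + 0.4Δ)/(R + S). The sketch below checks this numerically; the function names are illustrative, and the run totals are the 965 and 656 quoted on the team-summary slide.

```python
def linear_pct(r: float, s: float, slope: float = 1.8) -> float:
    """Line through (0.5, 0.5): P - 0.5 = slope * (R / (R + S) - 0.5)."""
    return 0.5 + slope * (r / (r + s) - 0.5)


def linear_pct_simplified(r: float, s: float) -> float:
    """Same line with slope 1.8, collected into one fraction."""
    return (1.4 * r - 0.4 * s) / (r + s)


def linear_pct_delta(r: float, s: float) -> float:
    """Same line written with the run differential delta = R - S."""
    delta = r - s
    return (r + 0.4 * delta) / (r + s)


r, s = 965, 656
print(linear_pct(r, s), linear_pct_simplified(r, s), linear_pct_delta(r, s))
```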
The Beer-Mat Test
Situation                                  % of variance
                                            AL      NL
Proportional (recognize value of runs)     75.1    67.1
Sabermetric (weight value of runs)         22.3    24.0
Residue                                     2.5     8.9
It depends on the average scoring level
Let's call the average number of runs, by both teams, ρ
Tabulate the ratio of runs required to the average level ρ
Method               r = 1    r = 1.1   r = 1.25*
Pythagorean          1        1.007     1.038
James Index          1.093    1.099     1.125
Linear Percentage    1.111    1.111     1.111
(* or 0.8)
(Values are runs required for an incremental win, divided by ρ)
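The r = 1 column can be reproduced with a numerical derivative, under three assumptions that the slides do not state outright: ρ is the combined runs per game of both teams, "James Index" means the exponent-1.83 formula, and the runs needed for one incremental win are 1 / (dP/dR) evaluated for an evenly matched team. The r = 1.1 and r = 1.25 columns depend on how ρ is defined for unequal teams, so this sketch does not attempt them.

```python
def pythagorean(r: float, s: float, n: float = 2.0) -> float:
    """P = R^n / (R^n + S^n)."""
    return r ** n / (r ** n + s ** n)


def linear(r: float, s: float, slope: float = 1.8) -> float:
    """P = 0.5 + slope * (R / (R + S) - 0.5)."""
    return 0.5 + slope * (r / (r + s) - 0.5)


def runs_per_win_ratio(pct, rho: float, h: float = 1e-5) -> float:
    """Runs needed for one extra win, divided by rho, at R = S = rho / 2.

    Uses a central difference to estimate dP/dR with S held fixed.
    """
    r = s = rho / 2.0
    dp_dr = (pct(r + h, s) - pct(r - h, s)) / (2.0 * h)
    return (1.0 / dp_dr) / rho


RHO = 9.0  # hypothetical scoring level; the ratio at r = 1 does not depend on it

print(round(runs_per_win_ratio(pythagorean, RHO), 3))
print(round(runs_per_win_ratio(lambda r, s: pythagorean(r, s, 1.83), RHO), 3))
print(round(runs_per_win_ratio(linear, RHO), 3))
```

Under these assumptions the three ratios come out near 1, 2/1.83 and 2/1.8, which agree with the first column of the table.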
OPS RPG Correlation
Base-Stealing Performance 1950-2010
1) Reduce prep time (simple windup now prevalent)
2) Reduce time to get the ball to the fielder (quick turnaround):
Time (sec)    Success Rate (%)
<3.25              61.4
3.25-3.4           68.7
3.4-3.55           73.9
>3.55              77.1
Thank You Slides by Kathleen Jelliffe
Alex Rodriguez           115.5
Albert Pujols             91.8
Derek Jeter               72.3
[2012 Hall-of-Famers]    [70.2-70.6]
Carlos Beltran            65.7
Roy Halladay              65.2
Adrian Beltre             65.0
Todd Helton               61.6
Andy Pettitte             58.6
Ichiro Suzuki             57.1
Tim Hudson                56.5
Above first line: in top 50, all-time
Above second line: in top 100, all-time
(through April 2013)