NFL Player Evaluation Using Expected Points Added with nflscrapr

Ron Yurko, Sam Ventura, Max Horowitz
Department of Statistics, Carnegie Mellon University
Great Lakes Analytics in Sports Conference, 2017
Ron Yurko (@Stat_Ron)

Game of Yards?

Tale of Two Runs
- Arizona Cardinals line up on 3rd-and-20 on their own 10 yard line: David Johnson gains 11 yards, team punts next play
- Pittsburgh Steelers line up on 3rd-and-1 on their own 10 yard line: Le'Veon Bell gains 4 yards, converts first down
- Johnson gained nearly 3 times more yards than Bell...
NOT ALL YARDS ARE CREATED EQUAL

How to Value a Play? Enter the Statistician
How do we properly evaluate a play?
  3rd-and-20, CLE 10, (10:48) (Shotgun) D.Johnson up the middle to CLE 21 for 11 yards (R.McLeod).
Consider a play-by-play dataset:
- Down: 4 downs to advance the ball 10 (or more) yards
- Yards to go: distance in yards to convert first down
- Yard line: distance in yards away from opponent's endzone (100 to 0) - the field position
- Time remaining: seconds remaining in half, each half is 1800 seconds long (overtime is 900)
Value of a play depends on the situation
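
As a rough illustration of the play-by-play variables listed above, the sketch below stores one situation and converts a quarter clock into seconds remaining in the half. It is written in Python purely for illustration (the talk itself works in R), the names `Situation` and `seconds_remaining_in_half` are hypothetical and not part of nflscrapr, and the quarter of the example play is an assumption because the slide does not state it.

```python
from dataclasses import dataclass


def seconds_remaining_in_half(quarter: int, clock: str) -> int:
    """Convert a quarter number and an "MM:SS" game clock to seconds left in the half."""
    minutes, seconds = clock.split(":")
    on_clock = int(minutes) * 60 + int(seconds)
    # Quarters 1 and 3 are followed by another full 900-second quarter in the same half.
    if quarter in (1, 3):
        return on_clock + 900
    # Quarters 2 and 4 end the half; any other period (overtime) is treated on its own.
    return on_clock


@dataclass(frozen=True)
class Situation:
    down: int
    yards_to_go: int
    yard_line: int          # yards from the opponent's end zone (100 to 0)
    seconds_remaining: int  # seconds left in the half


# 3rd-and-20 from the offense's own 10 is 90 yards from the opponent's end zone.
# The quarter is not given on the slide; quarter 1 is assumed here.
example = Situation(3, 20, 90, seconds_remaining_in_half(1, "10:48"))
```

Keeping the situation in one immutable record makes it easy to pass the same four inputs to any expected points model.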

We Know This Already
Operations Research on Football (Carter and Machol, 1971)
- Former BYU and NFL quarterback Virgil Carter
- Took 2852 1st-and-10 plays, turned the field into 10 yard buckets, averaged the value of the next scoring event
The Hidden Game of Football (Carroll et al., 1988)
- Classic work and first deep dive into football statistics
- Play's success is a function of down and yards to go
- Linear expected points model from -2 on team's goal line to +6 on opponent's, every 25 yards leads to 2 more expected points
Recent developments by Aaron Schatz at Football Outsiders, Brian Burke at ESPN, and Keith Goldner at numberfire
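
The linear model described above can be written down directly. The following Python sketch is only an illustration of the stated endpoints (-2 and +6) and slope (2 points per 25 yards); the function name is hypothetical, and it uses this deck's yard line convention (yards from the opponent's end zone, 100 to 0).

```python
def linear_expected_points(yard_line: float) -> float:
    """Linear expected points as a function of yards from the opponent's end zone."""
    if not 0 <= yard_line <= 100:
        raise ValueError("yard_line must be between 0 and 100")
    # -2 at the team's own goal line (yard_line = 100) and +6 at the opponent's
    # (yard_line = 0): 8 points over 100 yards, i.e. 2 points per 25 yards.
    return 6.0 - 0.08 * yard_line


own_goal_line = linear_expected_points(100)
midfield = linear_expected_points(50)
opponent_goal_line = linear_expected_points(0)
```

Because this model depends on field position alone, it values every yard equally regardless of down and distance, which is the limitation the rest of the talk addresses.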

Reproducible with nflscrapr
Recent work in football analytics is not easily reproducible:
- Reliance on proprietary and costly data sources
- Data quality relies on potentially biased human judgement
nflscrapr: R package created by Maksim Horowitz to enable easy data access and promote reproducible NFL research
- Collects play-by-play data from NFL.com and formats into R data frames
- Data is available for all games starting in 2009
- Available on GitHub, install with: devtools::install_github(repo = "maksimhorowitz/nflscrapr")

How to Model Expected Points
Developed a novel multinomial logistic regression model built on nflscrapr data from the 2009-2016 seasons, using the nnet package

Expected Points Model
Model is generating probabilities, agnostic of value associated with each next score type
- Next Score: $Y \in \{$Touchdown (7), Field Goal (3), Safety (2), No Score (0), -Safety (-2), -Field Goal (-3), -Touchdown (-7)$\}$
- Situation: $X \in \{$down, yards to go, field position, time remaining$\}$
- Outcome probabilities: $P(Y = y \mid X)$
- Expected Points (EP) $= E(Y \mid X) = \sum_y y \cdot P(Y = y \mid X)$
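
The expectation above is a probability-weighted sum over the seven next score types. The Python sketch below shows that arithmetic only; the event labels and the example probabilities are made up for illustration and are not output from the authors' fitted model.

```python
from typing import Mapping

# Point value of each next score type, from the offense's point of view.
NEXT_SCORE_VALUES = {
    "touchdown": 7,
    "field_goal": 3,
    "safety": 2,
    "no_score": 0,
    "opp_safety": -2,
    "opp_field_goal": -3,
    "opp_touchdown": -7,
}


def expected_points(probabilities: Mapping[str, float]) -> float:
    """EP = sum over y of y * P(Y = y | X) for one game situation."""
    if set(probabilities) != set(NEXT_SCORE_VALUES):
        raise ValueError("need exactly one probability per next score type")
    if abs(sum(probabilities.values()) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(NEXT_SCORE_VALUES[event] * p for event, p in probabilities.items())


# Hypothetical probabilities for a single situation (not real model output).
example_probabilities = {
    "touchdown": 0.30,
    "field_goal": 0.20,
    "safety": 0.01,
    "no_score": 0.10,
    "opp_safety": 0.01,
    "opp_field_goal": 0.13,
    "opp_touchdown": 0.25,
}
example_ep = expected_points(example_probabilities)
```

Separating the probabilities from the point values reflects the slide's remark that the model is agnostic of the value attached to each next score type.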

Expected Points Added (EPA)
Expected Points Added (EPA) estimates a play's value based on the change in situation, providing a point value
$\text{EPA}_{\text{play}_i} = \text{EP}_{\text{play}_{i+1}} - \text{EP}_{\text{play}_i}$
- David Johnson's 11 yards leading to 4th down? 0.0096 EPA
- Le'Veon Bell's 4 yards converting 1st down? 0.9225 EPA
- His 4 yards are almost 100 times more valuable!
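
A minimal Python sketch of the EPA difference, followed by the ratio of the two EPA values quoted above. The sequence of EP values is hypothetical; only 0.0096 and 0.9225 come from the slide.

```python
def expected_points_added(ep_values):
    """EPA for play i is EP at the start of play i+1 minus EP at the start of play i."""
    return [after - before for before, after in zip(ep_values, ep_values[1:])]


# Hypothetical EP at the start of four consecutive plays (not real data).
drive_ep = [0.5, 1.25, 1.0, 3.0]
drive_epa = expected_points_added(drive_ep)

# EPA values quoted on the slide for the two runs.
johnson_epa = 0.0096
bell_epa = 0.9225
ratio = bell_epa / johnson_epa  # roughly 96, i.e. "almost 100 times"
```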

EPA-based Metrics
Using EPA, we can calculate new metrics to evaluate players
The following slides will go through various EPA-based metrics:
- Total EPA
- Dividing by attempts for rate measure
- Introduce new weighted metrics

Quarterbacks: Total EPA
$\text{Total EPA} = \sum_{\text{Attempts}} \text{EPA}$

Quarterbacks: EPA per Attempt
$\text{EPA per Attempt} = \frac{\text{Total EPA}}{\text{\# of Attempts}}$

Quarterbacks: Success Rate
$\text{Success Rate} = \frac{\text{\# plays with EPA} > 0}{\text{\# plays}}$ (Berri & Burke, 2012)
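
The three quarterback metrics defined so far (Total EPA, EPA per Attempt, Success Rate) reduce to simple aggregations over a player's per-play EPA values. The Python sketch below assumes those values are already available as a list; the function names and the sample numbers are hypothetical.

```python
def total_epa(epa_values):
    """Sum of EPA over all of a player's attempts."""
    return sum(epa_values)


def epa_per_attempt(epa_values):
    """Total EPA divided by the number of attempts."""
    return total_epa(epa_values) / len(epa_values)


def success_rate(epa_values):
    """Share of plays with strictly positive EPA."""
    return sum(1 for epa in epa_values if epa > 0) / len(epa_values)


# Made-up per-attempt EPA values for one quarterback.
attempts = [0.5, -0.25, 1.5, -1.0, 0.25]
qb_total = total_epa(attempts)
qb_rate = epa_per_attempt(attempts)
qb_success = success_rate(attempts)
```

The same functions would apply to carries or targets, since only the set of plays being aggregated changes.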

Quarterbacks: Weighted Completion %
$\text{Weighted Completion \%} = \frac{\sum_{\text{Completions}} \text{EPA}}{\sum_{\text{Attempts}} \text{EPA}}$

Receivers: Weighted Reception %
$\text{Weighted Reception \%} = \frac{\sum_{\text{Receptions}} \text{EPA}}{\sum_{\text{Targets}} \text{EPA}}$

Rushers: EPA per Carry
$\text{EPA per Carry} = \frac{\text{Total EPA}}{\text{\# of Carries}}$

Difference Between Passing and Rushing
Rushing is costly, need to evaluate players relative to average

Peyton Manning Heatfield Comparison

Receiving Heatfield

Stabilization
Important to understand when stats stabilize to trust them, e.g. how many pass attempts does Dak Prescott need before we can believe the results?
Follow approach by Tom Tango for baseball stats:
- For given number N of attempts (or targets, carries, etc.), find the players with at least that number
- For each player, randomly select two samples of size N/2 and calculate the stat of interest, repeat
- Compute the correlation r and repeat the process for each considered N, identify when correlation stabilizes
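
The split-half procedure above can be sketched as follows. This is a Python illustration on simulated data, not the authors' implementation: the function names, the choice of a per-play mean as the stat of interest, the number of repeats, and every number in the simulation are assumptions.

```python
import random
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mean_x, mean_y = mean(xs), mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def split_half_correlation(plays_by_player, n, rng, repeats=20):
    """Average split-half correlation of a per-play mean at sample size n."""
    # Keep only players with at least n plays.
    eligible = [v for v in plays_by_player.values() if len(v) >= n]
    half = n // 2
    correlations = []
    for _ in range(repeats):
        first, second = [], []
        for values in eligible:
            # Draw n plays without replacement and split them into two halves.
            sample = rng.sample(values, 2 * half)
            first.append(mean(sample[:half]))
            second.append(mean(sample[half:]))
        correlations.append(pearson(first, second))
    return mean(correlations)


# Simulated data only: 30 hypothetical players, each with a "true" per-play
# value plus noise. These numbers are made up for illustration.
rng = random.Random(0)
plays_by_player = {}
for player in range(30):
    skill = rng.gauss(0.0, 0.5)
    plays_by_player[player] = [rng.gauss(skill, 1.0) for _ in range(400)]

r_small = split_half_correlation(plays_by_player, 10, rng)
r_large = split_half_correlation(plays_by_player, 400, rng)
```

Running the function over a grid of N values and plotting the resulting correlations would show where the curve flattens, which is the stabilization point the slide refers to.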

Receiving Stabilization

Passing Stabilization

Rushing Stabilization

Consistency Across Seasons
Accurate metrics of player ability should be consistent over time
Check the correlations between player seasons in two ways:
- Same team in both seasons
- Different team in each season
Ideally, measure of player's ability should be independent of team

Receiving Correlations
High level of consistency for receiving stats

Passing Correlations
Drop between 2014-2015 led by Peyton Manning and Tony Romo

Rushing Correlations
Clearly team dependent (O-line?), e.g. DeMarco Murray:
- DAL in 2014 - Success Rate = 43.16%
- PHI in 2015 - Success Rate = 34.38%

Recap
- Traditional metrics do not always properly evaluate a play
- Using nflscrapr we can calculate EPA based metrics (as well as WPA) and view player production across the field
- Passing is more efficient than rushing from an EPA view
- Receiving: EPA per Rec and Yards per Rec are highly consistent
- Passing: Success Rate and Comp % are more appropriate than yards
- Rushing: Success Rate is most consistent
  - Most difficult to evaluate

Future Work: Isolate Player Contribution
Obviously one player is not solely responsible for EPA
Need to account for the situation (down, yards to go, etc.) and also the teams and players involved
Mixed model approach (Judge et al., 2015):
- Fixed effects for the situation
- Random effects for the individual players, teams
This work will be presented at NESSIS 2017

Future Work: Probability Measures
Multinomial logistic regression model generates probabilities for each of the 7 next score events - can create new statistics for events, e.g. Probability of Touchdown Added (PTDA)

Tartan Sports Analytics
Website application: Developing reports and shiny apps to host on Tartan Sports Analytics Club website https://tartansportsanalytics.com/
nflscrapr development on GitHub: https://github.com/maksimhorowitz/nflscrapr
Follow us on twitter:
- Tartan Sports Analytics - @CMUAnalytics
- Ron - @Stat_Ron
- Sam - @stat_sam
- Max - @bklynmaks

References
Berri, D. and Burke, B. (2012). Measuring productivity of NFL players. In Quinn, K., editor, The Economics of the National Football League, pages 137-158. Springer, New York, New York.
Burke, B. Advanced Football Analytics.
Carroll, B., Palmer, P., Thorn, J., and Pietrusza, D. (1998). The Hidden Game of Football: The Next Edition. Total Sports, Inc., New York, New York.
Carter, V. and Machol, R. (1971). Operations research on football. Operations Research, 19(2):541-544.
Football Outsiders.
Goldner, K. (2017). Situational success: Evaluating decision-making in football. In Albert, J., Glickman, M. E., Swartz, T. B., and Koning, R. H., editors, Handbook of Statistical Methods and Analyses in Sports, pages 183-198. CRC Press, Boca Raton, Florida.
Hastie, T., Tibshirani, R., and Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, New York, New York.
numberfire.
Pasteur, R. D. and David, J. A. (2017). Evaluation of quarterbacks and kickers. In Albert, J., Glickman, M. E., Swartz, T. B., and Koning, R. H., editors, Handbook of Statistical Methods and Analyses in Sports, pages 165-182. CRC Press, Boca Raton, Florida.

Expected Points Model
Combined two ideas for weighting plays:
- Score differential - more weight for close score games
- Distance in drives away from next score