Package nhlscrapr. R topics documented: March 8, Type Package

Similar documents
Package rdpla. August 13, 2017

Package STI. August 19, Index 7. Compute the Standardized Temperature Index

Package mrchmadness. April 9, 2017

Package pinnacle.data

Package jackstraw. August 7, 2018

TOURNAMENT TEAM REGISTRATION INSTRUCTIONS:

dplyr & Functions stat 480 Heike Hofmann

Package ExactCIdiff. February 19, 2015

How to Setup and Score a Tournament. May 2018

Working with Marker Maps Tutorial

A GUIDE TO THE LOOSE ENDS HOCKEY LEAGUE WEBSITE PAGE

Microsoft Windows Software Manual for FITstep Stream Version 4

Here is a step-by-step guideline you may follow for signing your child up for a swim meet on Swim Connection.

Package BAwiR. January 25, 2018

[CROSS COUNTRY SCORING]

[CROSS COUNTRY SCORING]

Club s Homepage Welcome Club Calendar Logout Add a Request Play Date Requested Time Hole Selection # of Tee Times Break Link

Package mregions. October 17, 2017

FireHawk M7 Interface Module Software Instructions OPERATION AND INSTRUCTIONS

WinPicks 2017 Reference Manual

A. Login. 3. After validation a Welcome screen is displayed.

THE STATCREW SYSTEM For Basketball - What's New Page 1

Learning Locomotion Test 3.1 Results

WEST POINT GOLF CLUB USING THE GOLFSOFTWARE PROGRAM FOR THE DRAW AND SCORING

Mac Software Manual for FITstep Pro Version 2

Club s Homepage Use this feature to return the club s website.

SWIM MEET MANAGER 5.0 NEW FEATURES

Horse Farm Management s Report Writer. User Guide Version 1.1.xx

Registering Club players in Whole Game Club Official Training Guide

RSKtools for Matlab processing RBR data

Package macleish. January 3, 2018

Package gvc. R topics documented: August 29, Version Title Global Value Chains Tools

The ICC Duckworth-Lewis-Stern calculator. DLS Edition 2016

How to Download a Red App

Oracle 11g Secure Files Overview Inderpal S. Johal. Inderpal S. Johal, Data Softech Inc.

2017 Census Reporting To access the SOI s Census Reporting web site go to:

NCSS Statistical Software

Software Manual for FITstep Pro Version 2

Working with Object- Orientation

Formula 1 Statistics Feeds

Golf Genius Software

Division Data Coordinator Tasks

Fencing Time Version 4.3

MP-70/50 Series Scoreboard Controller User Guide

FIBA Europe Coaching Website. Manual. Practice Section

January 2007, Number 44 ALL-WAYS TM NEWSLETTER

Gaia ARI. ARI s Gaia Workshop. Grégory Mantelet. 20 th June 2018 CDS / ARI

SIDRA INTERSECTION 6.1 UPDATE HISTORY

SCW Web Portal Instructions

TRAP MOM FUN SHOOT 2011

USER MANUAL April 2016

Formula E Statistics Feeds

Apple Device Instruction Guide- High School Game Center (HSGC) Football Statware

Integrated Sports Systems (ISS) Inc. Meet Management Suite

Excel Solver Case: Beach Town Lifeguard Scheduling

Viva TPS. TS11/15 Total Stations Check and Adjust Procedure. October Summary

Registering Your Team for the First Time In a Seasonal Year

ExcelDerby User's Manual

NHL Statistics Feeds

BVIS Beach Volleyball Information System

User Help. Fabasoft Scrum

PRIME TIME HOCKEY S Youth Development Program YDP

Inspection User Manual

INTERNATIONAL SKATING UNION SPECIAL REGULATIONS & TECHNICAL RULES SPEED SKATING and SHORT TRACK SPEED SKATING 2016

RUNNING A MEET WITH HY-TEK MEET MANAGER

The ICC Duckworth-Lewis Calculator. Professional Edition 2008

Blackwave Dive Table Creator User Guide

Database of Winds and Waves. -internet version-

Maestro 3 rd Party Golf User Guide

The PyMca Application and Toolkit V.A. Solé - European Synchrotron Radiation Facility 2009 Python for Scientific Computing Conference

SBSC CARDING INSTRUCTIONS

Track B: Particle Methods Part 2 PRACE Spring School 2012

Chapter 6 Handicapping

IWR PLANNING SUITE II PCOP WEBINAR SERIES. Laura Witherow (IWR) and Monique Savage (MVP) 26 July

Tournament Rules. 1. USA Hockey rules will apply except as noted herein.

MyCricket User Manual

Scoreboard Operator s Instructions MPC Control

Pegas 4000 MF Gas Mixer InstructionManual Columbus Instruments

REMOTE CLIENT MANAGER HELP VERSION 1.0.2

Quintic Automatic Putting Report

Fastball Baseball Manager 2.5 for Joomla 2.5x

Full-Time People and Registrations Version 5.0

26 th ANNUAL CHALLENGE CUP TOURNAMENT

JUNIOR CRUSADERS HOCKEY

Team Manager's Manual

Diver Training Options

Learning Locomotion Test 3.2 Results

MyNetball Club Training Manual

Handbook for Captains Updated August 2018

SQL LiteSpeed 3.0 Installation Guide

3.0. GSIS 3.1 Release Notes. NFL GSIS Support: (877) (212)

League Of Legends Statistics Feeds

Open Research Online The Open University s repository of research publications and other research outputs

FIG: 27.1 Tool String

TECHNICAL NOTE HOW TO USE LOOPERS. Kalipso_TechDocs_Loopers. Revision: 1.0. Kalipso version: Date: 16/02/2017.

Overview. 2 Module 13: Advanced Data Processing

Introducing the Stats Tracker Premium v6

Inspection User Manual This application allows you to easily inspect equipment located in Onix Work.

Milking center performance and cost

New User. This is the sign-in screen for the registration system. If this is the first time entering the system, click on NEW USER.

Transcription:

Type Package Package nhlscrapr March 8, 2017 Title Compiling the NHL Real Time Scoring System Database for easy use in R Version 1.8.1 Date 2014-10-19 Author A.C. Thomas, Samuel L. Ventura Maintainer ORPHANED Depends R (>= 2.15.0) Imports RCurl, rjson, biglm, bitops Compiling the NHL Real Time Scoring System Database for easy use in R. License GPL (>= 3) LazyData yes NeedsCompilation yes Repository CRAN Date/Publication 2017-03-08 09:13:41 X-CRAN-Original-Maintainer X-CRAN-Comment Orphaned on 2017-03-08 as mail to the maintainer is undeliverable, and this needed correction for R-devel. R topics documented: nhlscrapr-package...................................... 2 fold.frames......................................... 3 full.game.database..................................... 4 nhlscrapr-data........................................ 5 NP.score........................................... 6 player.summary....................................... 7 process.games........................................ 7 Index 10 1

2 nhlscrapr-package nhlscrapr-package nhlscrapr: Retrieve and process play-by-play game data from the NHL Real Time Scoring System Details This package contains routines for extracting play-by-play game data for regular-season and playoff NHL games, particularly for analyses that depend on which players are on the ice. This package processes data files of the type expanded summary (ES) and play-by-play (PL) for all possible games between 2002 and 2013, visual shift charts (SCH and SCV) for games between 2002 and 2007, and shot locations from 2008 until present. The purpose of this routine is to produce tables for players, games and on-ice events specifically to look at the impact of a player s contribution to how events unfold. The examples below illustrate how these tools can be used to produce these tables. The command process.games() will download the appropriate game files and, for each game selected produce a table of events with all players on the ice and relevant information for each particular event, such as shot distance and type, goals and assists, and the zone in which the event took place (relative to the home team). The command augment.game() takes a single game file and makes it more useful for the purpose of statistical modelling. It replaces player names with a unique player ID number, separates goaltenders from skaters, and obtains the event interval time. This routine is called for each game in the table by merge.to.mega.file() as it combines all games into one single R object, saving this object to disk along with the game table and unique player roster. A user only need run one command, compile.all.games(), in order to get everything from the site, though this may take hours to download and compile. Examples ## Not run: #What are all games that can/should be downloaded? #valid=false implies previously screened problems. #Just a subset for testing. game.table <- full.game.database()[301:303,] #Get the details for these 20 games. print(game.table)

fold.frames 3 #Takes HTML files and (possibly) GIF images and produces event and player tables for each game. process.games (game.table) #Give me a single game record. sample.game <- retrieve.game (season="20022003", gcode="20301") #Augment all games and put them all in one big database. compile.all.games (output.file="mynhlscrapes.rdata") ################################################################# # Extras: #Process games in parallel! library(domc) registerdomc(4) res <- foreach (kk=1:dim(game.table)[1]) %dopar% { message (paste(kk, game.table[kk,1], game.table$gcode[kk])) item <- process.single.game( game.table[kk,1], game.table$gcode[kk], save.to.file=true) } ## End(Not run) fold.frames fold.frames Collapse a series of data frames with the same columns into one data frame. fold.frames(frame.list) Arguments frame.list A list of data frames. Details Most useful with compile.all.games().

4 full.game.database Value A single data frame with the same columns as the originals. full.game.database Create a shell database for NHL games from 2002 to the present, then download them Creates a shell database for NHL games from 2002 to present, unfilled with game information. Uses this to collect and scrape NHL data, but not process them immediately. full.game.database(extra.seasons=0) download.single.game (season="20122013", gcode="20001", rdata.folder="nhlr-data", verbose=true, wait=20) download.games (games=full.game.database(), rdata.folder="nhlr-data",...) Arguments extra.seasons season gcode rdata.folder verbose wait games Beyond 20142015, adds data for additional seasons (assuming nhlscrapr is not supported and updated before the season begins. A character string for the two years specifying an NHL season. The five-digit ID number for a particular NHL game. The location within the current directory to which to save the downloaded files. Will be created if it does not exist. Report additional messages. Amount of time in seconds to wait between game downloads. A game database, such as the one produced by full.game.database().... Arguments to pass to download.single.game().

nhlscrapr-data 5 Details full.game.database() gives ID numbers for all regular-season and playoff games played between 2002 and 2013, with indicators for whether any particular game is known to be unavailable. download.single.game() retrieves the relevant files for a single game from NHL.com. download.games() retrieves the files for all games in the table. Value full.game.database: a data frame with columns including season, session (Regular or Playoffs), game number/gcode (which game in the season?). Placeholders for teams, score and date of game are included to be filled in later. valid indicates whether a full record of the game is available for download. download.single.game: returns a single Boolean value indicating if no errors were recorded during the download. Saves the game to disk, particularly the PL, ES, SCH and SCV files, along with JSON data for x-y events. download.games: returns the input game database with valid changed to FALSE for any failed downloads. Examples #Select a part of the history. game.table <- full.game.database()[201:220,] #Download one game. download.single.game(game.table$season[1], game.table$gcode[1], wait=1) #Download all games. ## Not run: game.table.updated <- download.games (games=game.table) ## End(Not run) nhlscrapr-data nhlscrapr: Included Data Sets Data sets included with the nhlscrapr package.

6 NP.score quadsarray pushes pulls zone.adjust.prefab date201415 Format quadsarray: a series of quadrilaterals that define distinct areas of the offensive zone. pushes, pulls: the possible transitions from one area subzone to another, constrained to be distance lengthening (push) or shortening (pull). zone.adjust.prefab: When fixing the rink biases, this is used to assign the transition probabilities. This can be calculated by the user if desired. date201415: A predownloaded version of the timing of games for the 2014-2015 season. None of these is ever required for the user. NP.score Calculate NP-tau score Calculates the Net Probability score from a game table. NP.score(grand.data, seconds=20) Arguments grand.data seconds A single data frame of all events to be considered. The amount of time in seconds after the event with which we measure net goals. Value A table containing the goal counts for each team following the particular events and zones, as well as the calculated net probability.

player.summary 7 References Michael Schuckers and James Curro (2013) Total Hockey Rating (THoR): A comprehensive statistical rating of National Hockey League forwards and defensemen based upon all on-ice events. Sloan Sports Conference. player.summary Player summaries Summarizes the event count for each player in the data frame. player.summary (grand.data, roster.unique) Arguments grand.data roster.unique A single data frame of all events to be considered. A list of all player IDs to consider. Value An array with event counts for each player in the data frame. process.games Given downloaded NHL games, produce event tables with players on ice. Produces game tables from the NHL source files.

8 process.games process.single.game (season="20122013", gcode="20001", rdata.folder="nhlr-data", override.download=false, save.to.file=true,...) process.games (games=full.game.database(), rdata.folder="nhlr-data", override.download=false) retrieve.game (season="20122013", gcode="20001", rdata.folder="nhlr-data", force=true,...) compile.all.games (rdata.folder="nhlr-data", output.folder="source-data", new.game.table=null, seasons=null, verbose=false, override.days.back=null, date.check=false,...) Arguments season gcode rdata.folder A character string for the two years specifying an NHL season. The five-digit ID number for a particular NHL game. The location within the current directory to which to save the downloaded files. Will be created if it does not exist. output.folder The location within the current directory where compiled files will be saved. Will be created if it does not exist. games, new.game.table A game database, such as the one produced by full.game.database(). override.download Re-download the game files whether or not they are currently locally available. save.to.file force Save the game object to file. If TRUE, reprocesses the single game file. seasons If selected, defaults to a game table with only these seasons, of the form "20142015". date.check Try and download files without known dates. override.days.back Force download for those days previous to this one. verbose Print extra output information.... Arguments to pass to download.single.game().

process.games 9 Details Value This group of functions takes the downloaded HTML, image and JSON files and produces an event table for each game, particularly focusing on the players on the ice during each event. process.single.game() produces an object called game.info that contains several pieces of information, primarily the data.frame playbyplay, with each column representing an event. Columns are as follows: event: The event number as recorded in the game. period, seconds: The period of the game and the elapsed time in seconds. a1,...,a6: The numbers and names of the away team players on the ice. h1,...,h6: The numbers and names of the home team players on the ice. ev.team: The team that registered the event in question. ev.player.1, ev.player.2, ev.player.3: The players involved in the event. distance: The distance in feet from the relevant goal for a shot on goal, missed shot or goal. type: Further information on the event, either shot type or penalty type. homezone: The zone in which the event took place, from the perspective of the home team. xcoord, ycoord: If available, the (x,y) location of a shot on goal or hit. process.games() runs this routine and saves all game files to disk. retrieve.game() retrieves the processed file from disk if it exists, and if not, it runs process.single.game() first. compile.all.games() does everything that s needed to construct the entire record. This will be fastest if all games have already been downloaded and processed, possibly in parallel.

Index Topic datasets nhlscrapr-data, 5 Topic package nhlscrapr-package, 2 compile.all.games (process.games), 7 date201415 (nhlscrapr-data), 5 download.games (full.game.database), 4 download.single.game (full.game.database), 4 fold.frames, 3 full.game.database, 4 nhlscrapr (nhlscrapr-package), 2 nhlscrapr-data, 5 nhlscrapr-package, 2 NP.score, 6 player.summary, 7 process.games, 7 process.single.game (process.games), 7 pulls (nhlscrapr-data), 5 pushes (nhlscrapr-data), 5 pushpull (nhlscrapr-data), 5 quadsarray (nhlscrapr-data), 5 retrieve.game (process.games), 7 zone.adjust.prefab (nhlscrapr-data), 5 10