More on Defensive Regression (or Runs) Analysis


This appendix has three primary objectives: first, to disclose aspects of DRA not disclosed in chapter two; second, to address aspects of the model that raise issues related less to baseball per se than to statistical modeling in general; and third, to drive home the fundamental point that DRA is not an answer, but a method. Included in this appendix are certain alternative models I tried, and suggestions for further improvements, which should provide some sense of the range of alternative approaches that are possible.

DRA POST-1951

Overview

There are essentially two DRA models: post-1951 and pre-1952. The post-1951 model uses a subset of Retrosheet play-by-play data currently available for seasons after 1951, and was almost completely described in chapter two. The pre-1952 model must make do with considerably less data, which renders it more primitive for infielders and unavoidably more complicated for outfielders.

When we first began explaining DRA, we took a bottom-up approach, starting from the shortstop position and gradually building up until we had a team model. Here we'll take a top-down approach, revealing the entire post-1951 team model all at once, and then discussing its components. Likewise, we'll start with a top-down discussion of the pre-1952 model. The following page presents the entire post-1951 model on one page, with a glossary of defined terms on the facing page.

DRA Model

Team defensive runs saved above or below the league rate, given innings pitched (DR.ip), is estimated as the sum of pitching, catching, infield, and outfield defensive runs:

Pitching = .27*SO.bfp - .34*BB.bfp - 1.49*HR.bh + .42*A1.bip + .44*IFO.bip - .56*WP.ip
Catching = .59*CS.sba + .59*GO2.bip
Infield = .52*rgo3 + [...]*ra4 + [...]*ra5 + .44*ra6
Outfield = .53*rpo7 + [...]*rpo8 + [...]*rpo9 + .61*A7.ip + .61*A8.ip + .61*A9.ip

All plain variables are team seasonal totals. See definitions on facing page. All variables with a dot, for example, A6.bip, are calculated in the same way:

A6.bip = A6 - [league A6 * (BIP / league BIP)]

A6.bip equals total A6 recorded by the team above (if negative, below) the league-average rate that year, given total team BIP opportunities. The "opportunities" variable following the dot is always in lowercase letters.

All variables beginning with an r are residual team plays that year; that is, estimated net plays taking into account available predictors, using regression analysis:

rgo3 = GO3.bip + .09*RBIP.bip
ra4 = A4.bip + .08*RBIP.bip + .15*RFO.rbip + .32*LFO.lbip + .18*HR.bh + .20*WP.ip + .19*SH.bip
ra6 = A6.bip - .06*RBIP.bip + .29*RFO.rbip + .15*LFO.lbip + .12*HR.bh + .56*WP.ip + .43*SH.bip
ra5 = A5.bip - .10*RBIP.bip + .21*RFO.rbip + .10*LFO.lbip + .15*A1.bip + .13*rgo3 + [...]*IBB.pa
rpo7 = PO7.bip + .03*RBIP.bip + .21*RGO.rbip + .10*LGO.lbip
rpo8 = PO8.bip - .01*RBIP.bip + .27*RGO.rbip + .24*LGO.lbip + .07*IFO.bip + .20*SH.bip
rpo9 = PO9.bip - .03*RBIP.bip + .22*RGO.rbip + .22*LGO.lbip + .12*IFO.bip

Example of allocation of team fielding runs to individual (lowercase i) fielders:

ia6 runs = .44*ra6*(iip / IP) + .44*[ia6 - A6*(iip / IP)]
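The centering rule shared by every dotted variable can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample totals are hypothetical, and the pitching weights are those printed in the model, with the terms for walks, home runs, and wild pitches read as negative.

```python
def net_given(team_count, team_opps, league_count, league_opps):
    """Team plays above (+) or below (-) the league rate, given team opportunities."""
    expected = league_count * (team_opps / league_opps)
    return team_count - expected


def pitching_runs(so_bfp, bb_bfp, hr_bh, a1_bip, ifo_bip, wp_ip):
    """Pitching line of the team model; the negative signs are an assumed reading."""
    return (0.27 * so_bfp - 0.34 * bb_bfp - 1.49 * hr_bh
            + 0.42 * a1_bip + 0.44 * ifo_bip - 0.56 * wp_ip)


# Hypothetical totals: a team with 520 shortstop assists on 4,400 BIP,
# in a league with 7,000 shortstop assists on 63,000 BIP.
a6_bip = net_given(520, 4400, 7000, 63000)
print(round(a6_bip, 1))  # net shortstop assists, given BIP
```

A team that matches the league rate exactly comes out at zero, which is what lets the later regressions be run without a meaningful intercept.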

Definitions of Team-Level Variables for DRA Model

Abbrev. | Definition | Formula or Source
1-9 | Pitcher ... Right Fielder |
A | Assists (total, if not followed by a number) |
BB | Unintentional BB + HBP | UBB + HBP
BFP | Batters Faced by Pitchers | PA - IBB
BH | Balls Hit | BFP - SO - BB
BIP | Balls In Play | BH - HR
CS | Caught Stealing |
FO | Fly Outs (total) | RFO + LFO
GO | Ground Outs (total) | RGO + LGO
GO2 | GO at catcher | A2 - CS
GO3 | GO at first base | A3 + UGO3
HBP | Hit By Pitch |
HR | Home Runs |
IA | Infielder-only Assists | sum(A1, A2, ..., A6)
IBB | Intentional Bases on Balls | BB (traditional) - UBB
IFO | Infielder-only FO | FO - OPO
IP | Innings Pitched (or Played) |
IPO | Infielder-only PO | sum(PO1, PO2, ..., PO6)
LFO | Left-handed batter FO | play-by-play data
LGO | Left-handed batter GO | play-by-play data
OA | Outfielder-only A | sum(A7, A8, A9)
OPO | Outfielder-only PO | sum(PO7, PO8, PO9)
PA | Plate Appearances |
PB | Passed Balls |
PO | Putouts (total, if not followed by a number) |
RBIP | Right-handed batter BIP | play-by-play data
RFO | Right-handed batter FO | play-by-play data
RGO | Right-handed batter GO | play-by-play data
SBA | Stolen Base (SB) Attempts | SB + CS
SH | Sacrifice Hits |
SO | Strikeouts |
UBB | Unintentional BB | BB (traditional) - IBB
UGO3 | Unassisted GO3 | avg(UGO3e1, UGO3e2)
UGO3e1 | UGO3 estimate #1 | IPO - A - IFO
UGO3e2 | UGO3 estimate #2 | GO - IA - CS - GIDP
WP | Wild Pitches (includes PB) | WP (traditional) + PB

The previous two pages are a bit much to take in all at once. But I do not believe that any other comprehensive system for team and individual defense remotely as accurate as DRA can be summarized as concisely. Before addressing the new points, let's quickly recap in a few pages the basic approach under DRA as described in chapter two. You might find it helpful to flip back to the preceding two pages as you read both the recap and the discussion of new issues.

DRA is essentially a forced-zero-intercept, "two-stage" multivariable least-squares regression analysis model. I'm using the "two-stage" terminology informally; as we shall see, the DRA model is not an instrumental variables model, otherwise known as a two-stage least-squares model.

The forced zero intercept merely means that we center the ultimate outcome being predicted (team runs allowed), each "play made" outcome (each pitching and fielding play that is made) used to predict expected team runs allowed, and each variable used to predict expected pitching and fielding plays, so that all outcomes and their respective predictors are net numbers, above or below the league-average rate. Furthermore, each outcome or predictor is centered by reference to its appropriate "denominator" of opportunities (the denominators are not literally used as denominators in the arithmetical sense; hence the quotation marks).

The first stage of regression analysis involves regressing centered fielding variables onto centered variables not under the control of the fielding position being evaluated (and ideally not influenced by the quality of other fielders) that tend to be associated with more or fewer fielding plays at that position. The residual left over from each first-stage regression at each position is treated as an estimate of the skill plays made at that position above or below expectation.

The second-stage regression involves regressing net team runs allowed onto net pitching and (first-stage-regression-adjusted) fielding plays in order to reveal the number of runs associated with each net pitching and (first-stage-regression-adjusted) fielding outcome.

To rate a team at a position, you simply apply the run weight determined in the second-stage regression to the net plays (which, again, are negative half the time) to determine defensive runs at that position. Finally, you allocate team defensive runs at that position to each player in two steps: first pro rata, based on his innings played at that position; then by his net plays compared to the team rate, given his percentage of team innings played. Each net play is credited with the same run weight used for the team rating at that position.
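The two-part allocation to individual fielders can be sketched as follows. This is a minimal illustration of the shortstop allocation formula from the model page; the function name and all sample totals are hypothetical.

```python
def individual_runs(run_weight, team_residual, player_plays, team_plays,
                    player_ip, team_ip):
    """Allocate team defensive runs at one position to one fielder."""
    share = player_ip / team_ip                      # fraction of team innings
    pro_rata = run_weight * team_residual * share    # share of the team rating
    net_vs_team = run_weight * (player_plays - team_plays * share)
    return pro_rata + net_vs_team


# Hypothetical team: ra6 = +20 residual plays, 500 shortstop assists, 1,440 innings.
starter = individual_runs(0.44, 20, player_plays=400, team_plays=500,
                          player_ip=1080, team_ip=1440)
backup = individual_runs(0.44, 20, player_plays=100, team_plays=500,
                         player_ip=360, team_ip=1440)
print(round(starter, 2), round(backup, 2), round(starter + backup, 2))
```

Because the net-versus-team terms cancel across teammates, the individual ratings add back up to the team rating at the position (here .44 times 20 residual plays).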

Centering The Variables By Their Respective Denominators

We center all the team variables by their respective "denominators" of opportunities. Centering in this way is the first step towards making each variable less correlated with the others, so that its "independent" net impact in runs may be better estimated. The little quotation marks are there because we will not achieve true independence in a mathematically precise sense.

The best denominator of opportunities for the ultimate outcome we're trying to model (actual total team runs allowed per season) is innings pitched, so we calculate team runs allowed above or below the league-average rate given the team's innings played, that is, net runs allowed given innings played, or RA.ip. In some sense this is just "denominating" net runs allowed by total outs, as innings are defined by outs. This is correct, because the ultimate limit on the number of runs a team can score in an inning is defined by outs.

The best denominator of opportunities for pitchers to record strikeouts (SO) or unintentional walks (including batters hit by pitch, BB) is the number of batters they face, or batters facing pitcher (BFP); hence net strikeouts given batters facing the team's pitchers (SO.bfp) and net unintentional walks and batters hit by pitch (BB.bfp).[1] The best denominator for home runs allowed (HR) is any BFP not ending in an SO or BB, or balls hit (BH); hence HR.bh, which tracks home runs allowed, given that the batter has made contact.

The number of balls in play (BH minus HR, or BIP) is the primary denominator of opportunities for plays involving getting the batter out on a batted ball not hit out of the park. By initially denominating batted ball outcomes by BIP, we begin the process of measuring net plays independent of the pitching staff's SO.bfp, BB.bfp, and HR.bh.
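The chain of nested denominators described above can be sketched as a small helper. It follows the glossary formulas (BFP = PA - IBB, BB = UBB + HBP, BH = BFP - SO - BB, BIP = BH - HR); the function name and the season totals are hypothetical.

```python
def opportunity_denominators(pa, ibb, so, ubb, hbp, hr):
    """Derive the nested opportunity counts used to center pitching and fielding plays."""
    bb = ubb + hbp        # unintentional walks plus hit batters
    bfp = pa - ibb        # batters faced, excluding intentional walks
    bh = bfp - so - bb    # balls hit: the batter made contact
    bip = bh - hr         # balls in play: contact that stayed in the park
    return {"BFP": bfp, "BH": bh, "BIP": bip}


# Hypothetical team season.
denoms = opportunity_denominators(pa=6200, ibb=40, so=1000, ubb=480, hbp=40, hr=150)
print(denoms)
```

Each denominator removes the outcomes already accounted for one level up, which is why strikeouts, walks, and home runs do not leak into the BIP-based fielding variables.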
Infield fly outs, that is, fly balls caught by infielders (IFO), are almost always weakly hit balls that could be caught by two or more fielders. Since they are nearly automatic outs, analogous to SO, we credit the pitchers with IFO relative to the league, given total BIP, resulting in the IFO.bip variable appearing among pitching runs. Likewise, we credit the pitcher if he records an assist (A1), which will almost always be on a ground ball he has fielded (A1.bip). BIP is also the best denominator for ground out fielding plays at catcher (GO2) and first (GO3), assists at second, third, and short, and putouts at each outfield position. The simplest denominator for runners caught stealing (CS) is the number of stolen base attempts (SBA), hence CS.sba. Finally, wild pitches (defined here to include passed balls, WP) and outfielder assists (A7, A8, and A9) are denominated by innings played (IP), not because that is optimal, but because it is simple. An alternative approach is addressed further below.

[1] In this version of DRA, I tried treating intentional walks separately; for reasons discussed shortly below it didn't make any difference, though it should have. The BFP denominator for SO.bfp and BB.bfp excludes plate appearances ending in an intentional walk.

For the pitching, catching, and outfielder assists variables, centering is the only adjustment that has to be made. (The coefficients for A7.ip, A8.ip, and A9.ip are the same because I combined all three into one variable, A789.ip, when running the second-stage regression.) Furthermore, with the exception of IFO.bip and GO2.bip, we have the exact counts of denominators per pitcher (their BFP, BH, and BIP) and catcher (their SBA), so the individual formulas are the same as the team formula, and the sum of individual results equals the team results.

There is one variable that is truly a combination of a pitching and catching variable: WP.ip, and not just because it includes passed balls. We credit or debit the pitchers with total WP.ip, because by far the largest source of variance in both wild pitches and passed balls is knuckleball pitching and sheer pitcher wildness. However, to give catchers some credit for being better or worse at preventing wild pitches and passed balls, we credit each catcher with the number of his net passed balls, given innings played, relative to his team (which would control somewhat for the effect of pitchers), and multiplied by three, because there have been roughly two wild pitches per passed ball throughout major league history. Thus, for every passed ball he records in a season above or below his team's rate, we charge or credit the catcher with effectively two wild pitches and one passed ball.
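The catcher adjustment just described can be sketched as follows. The function name, the sign convention (positive means pitches "saved"), and the sample totals are assumptions made for illustration; the text does not state which run weight is applied to the result, so the sketch stops at event counts.

```python
def catcher_wild_pitch_equivalents_saved(catcher_pb, catcher_ip, team_pb, team_ip):
    """Net passed balls versus the team rate, scaled by three (1 PB plus about 2 WP)."""
    expected_pb = team_pb * (catcher_ip / team_ip)  # team rate applied to his innings
    net_pb = catcher_pb - expected_pb               # negative means fewer than expected
    return -3 * net_pb                              # assumed convention: positive = saved


# Hypothetical catcher: 4 passed balls in 900 innings on a team with 12 in 1,440.
saved = catcher_wild_pitch_equivalents_saved(4, 900, 12, 1440)
print(saved)
```

Comparing the catcher with his own team, rather than the league, is what partially controls for a wild or knuckleballing pitching staff.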
It's an admittedly crude measure of the impact catchers have on passed balls and wild pitches, but it is probably reasonable, because catchers miss so much playing time that the set of their catching teammates, at least over the course of a career, probably approaches league-average performance. And, as emphasized in our catcher chapter, all of the traditional methods for evaluating catchers are very suspect, because the biggest impact catchers may have is on pitcher effectiveness, more specifically, SO.bfp and BB.bfp, rather than on base runner defense.

Adjusting Net Fielding Plays Made Using Proxy BIP Distribution Variables

Second, we refine the estimate of true skill plays made on BIP by backing out, using regression analysis, the estimated effects pitchers and batters have on the distribution of BIP throughout the field. The key items of information gleaned from Retrosheet used to make these adjustments are the number of total BIP hit by opponent right-handed batters (Right-handed opponent batter BIP, or RBIP), the number of fly outs (FO) and ground outs (GO) recorded against opponent right-handed batters (Right-handed opponent batter FO and GO, or RFO and RGO), and the number of FO and GO recorded against opponent left-handed batters (Left-handed opponent batter FO and GO, or LFO and LGO).

The denominator for RBIP is total BIP, yielding RBIP.bip (you have to have a BIP to have an RBIP), which is negative when the team has faced proportionally more BIP from left-handed opponent batters. The denominator for RFO and RGO is RBIP (you have to have an RBIP to have either an RFO or an RGO), yielding RFO.rbip and RGO.rbip. The denominator for LFO and LGO is total BIP hit by opponent left-handed batters, which is merely BIP minus RBIP, or LBIP, yielding LFO.lbip and LGO.lbip. Notice that these variables have all been constructed so that they are at least arithmetically independent of each other.

These five key variables (RBIP.bip, RFO.rbip, RGO.rbip, LFO.lbip, and LGO.lbip) are the Proxy BIP Distribution Variables. They are good, if imperfect, proxies for whatever perfect information could theoretically be obtained regarding the actual distribution of expected BIP fielding plays. As we showed in our Bill Mazeroski, Buddy Bell, and Mickey Mantle examples in chapter two, regression analysis reveals that they have the kind of statistical relationships with net second base assists (A4) given total BIP (A4.bip), net third base assists (A5) given total BIP (A5.bip), and net center field putouts (PO8) given total BIP (PO8.bip) that one would expect.
When RBIP.bip is positive (that is, when there is an above-average number of BIP hit by opponent right-handed batters, given total BIP), there are more ground outs recorded on the left side of the infield (third and short) and more fly outs recorded on the right side of the outfield (center and right). When RBIP.bip is negative (in other words, when there is an above-average number of BIP hit by opponent left-handed batters, given total BIP), there are more ground outs recorded on the right side of the infield (first and second) and more fly outs recorded on the left side of the outfield (left field). In both cases, that's because hitters tend to pull the ball when they ground out and tend to be behind the ball when they fly out. (Fly balls and line drives to the outfield that are pulled tend to be hit harder and drop in as clean hits.)

The coefficients for RBIP.bip are much bigger (positive or negative) in the infield than in the outfield. That's because batter-handedness has a much greater effect on the direction of ground outs than fly outs. You can see this by watching how infields and outfields shift. For the several left-handed batters these days for whom a Williams-type shift is put on, especially Ryan Howard, you'll frequently see the third baseman playing between third and second, and the shortstop playing behind second, but the outfielders playing practically straightaway.

RFO.rbip and LFO.lbip are used to adjust ground out plays in the infield for fly ball and ground ball pitching. By using relative FO to estimate relative opportunities to record infield assists, we avoid using the assists made by the fielder being evaluated to account for his own relative opportunities to make plays. By splitting fly outs by opponent batter-handedness, we capture to a significant extent cases in which (i) a team's left-handed pitchers (who would face proportionately more right-handed batters) tend to induce RGO or RFO and (ii) a team's right-handed pitchers (who would face proportionately more left-handed batters) tend to induce LGO or LFO. Right- and left-handed opponent batters also have their own impact on whether BIP are hit on the ground or in the air, which is also reflected in RFO.rbip and LFO.lbip. However, RFO.rbip and LFO.lbip are controlled more by a team's pitchers, who would tend to have much more extreme ground ball or fly ball tendencies than the league's batters as a whole (excluding, of course, the team's own hitters), though this is less true for more recent seasons, which feature less-balanced schedules.

If RFO.rbip is positive, that suggests there will be fewer GO recorded against those right-handed batters, and particularly fewer GO on the left side of the infield. (If RFO.rbip is negative, there will be more GO, particularly on the left side.) If LFO.lbip is positive, that suggests there will be fewer GO, and particularly fewer GO on the right side of the infield. (If LFO.lbip is negative, there will be more GO, particularly on the right side.) The coefficients at second, third, and shortstop in the chart at the beginning of this appendix all reflect that expectation. We'll address first base further below.

RGO.rbip and LGO.lbip are used to adjust fly out plays in the outfield for fly ball and ground ball pitching by left- and right-handed pitchers, respectively.
By using relative GO to estimate relative outfield putout opportunities, we avoid having the actual putouts recorded by each outfielder being used to estimate how many putouts he should have made. If RGO.rbip is positive, that suggests there will be fewer FO recorded against those right-handed batters, and particularly fewer FO on the right side of the outfield (and vice versa). If LGO.lbip is positive, that suggests there will be fewer FO, and particularly fewer fly outs on the left side of the outfield (and again, vice versa).

Notice again that batter-handedness has less of an impact in the outfield than in the infield, as shown by the fact that the coefficients for RGO.rbip and LGO.lbip are nearly equal in the outfield, whereas the coefficients for RFO.rbip and LFO.lbip are significantly different at each infield position. The obvious case, mentioned in the Mantle example, is center field, which is, well, in the center of the field, where the impact of left- and right-handed batters (and pitchers) is approximately equal. But in right field, the coefficients for RGO.rbip and LGO.lbip are also nearly the same. Only in left is there a meaningful difference between the RGO.rbip and LGO.lbip coefficients, but even so, the difference is not as great as the differences between the coefficients for RFO.rbip and LFO.lbip at second, third, and short. The bottom line seems to be that opponent batter handedness, and the interaction between opponent batter handedness and pitcher handedness, has a much, much greater impact on the direction of ground balls than fly balls.

Adjusting Net Plays For The Impact Of Base Runners

The Proxy BIP Distribution Variables attempt to account for where batted balls are hit, that is, whether they are hit on the ground or in the air, and on the left or right side of the field. But there are other factors that were not discussed in chapter two that impact the likelihood that fielders at each position will make plays.

One obvious factor for infielders is the presence of runners at first base. This increases double play assist opportunities for middle infielders but also forces the first baseman to play close to the bag, which reduces his chance of fielding ground balls in the hole between first and second. Taking this into account using regression analysis is a little tricky. If you create a variable for estimated runners at first, this would include not only walks but also hits allowed. But hits allowed are partly a function of net plays made at first, second, and short. Any statistical association revealed by regression analysis between, say, shortstop assists and runners on first could reflect either the shortstop's impact on the number of runners at first (by allowing or preventing hits) or the impact of the runners at first on shortstop assists (by increasing or decreasing double play assist opportunities).

There are a few candidates for variables that get around this circularity problem, at least for middle infielder double play assists, because they are not influenced by infielder fielding: SO.bfp, BB.bfp, HR.bh, WP.ip, and perhaps SH.bip (net sacrifice hits given BIP).
The more SO.bfp, the fewer hits and runners at first. The more BB.bfp, the more runners on first. HR clear the base paths, which obviously prevents double plays. WP and SH allow runners on first to reach second, thus preventing a double play. At both shortstop and second base these variables have, at least directionally, the impact one would expect, though the particular coefficients are not very stable from sample to sample, and since WP.ip and SH.bip have relatively little variation from team to team, they are probably not practically significant and could have been dropped from the model. In addition, SH.bip might also belong more with the category of Proxy BIP Distribution Variables, because by definition they are ground balls that can only be fielded in a particular area of the infield (say, approximately anywhere within sixty feet of home plate).
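As a worked illustration of how these base-runner proxies enter a residual, here is a short sketch of the shortstop formula (ra6) from the model page. The helper name and the team values are hypothetical, and the negative sign on the RBIP.bip term is an assumed reading of the printed formula.

```python
# Adjustment weights for ra6 as printed in the model; the -.06 sign is assumed.
RA6_WEIGHTS = {
    "RBIP.bip": -0.06, "RFO.rbip": 0.29, "LFO.lbip": 0.15,
    "HR.bh": 0.12, "WP.ip": 0.56, "SH.bip": 0.43,
}


def residual_plays(net_plays, predictors, weights):
    """Net plays plus the weighted adjustments for context the fielder does not control."""
    return net_plays + sum(weights[name] * predictors[name] for name in weights)


# Hypothetical team: +10 net shortstop assists, given BIP, before adjustment.
team = {"RBIP.bip": 50, "RFO.rbip": 20, "LFO.lbip": -10,
        "HR.bh": 5, "WP.ip": 2, "SH.bip": -4}
ra6 = residual_plays(10, team, RA6_WEIGHTS)
print(round(ra6, 2))
```

In this made-up case the fly ball tendency against right-handed batters adds the most back to the shortstop, while the surplus of right-handed BIP takes some away.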

Net intentional walks (IBB) given total plate appearances (PA), or IBB.pa (note that PA equals IBB plus BFP in the post-1951 model), have a negative impact on third base plays, probably because they reduce sacrifice bunts that should be added back. In any event, that variable has little practical impact and could have been dropped from the model.

Adjusting Net Plays For The Impact Of Ball Hogging

A fielder might make more plays not by preventing more BIP from going through for hits, but by taking more easy chances that could have been fielded by other fielders and were more or less guaranteed outs anyway. By far the most important example of this is FO fieldable by infielders. Ninety to ninety-five percent of fly balls and pop-ups caught by infielders can usually be taken by at least two, and sometimes three, different fielders (two infielders and an outfielder). Centerfielders who have played very shallow, especially Andruw Jones, have tended to hog some of these chances.

Regressions of PO8.bip onto IFO.bip throughout history consistently show that the more IFO.bip, the fewer PO8.bip, and vice versa. Therefore, if IFO.bip has been reduced by centerfielder ball hogging, a portion of those negative hogged plays is added to expected PO8.bip, thus reducing the centerfielder rating, and vice versa.

At times there is an impact for corner outfielders as well. I was somewhat surprised that IFO.bip was so important in right field. Perhaps the fact that most pop-ups are hit to the right side of the field (as most batters are right-handed, and most pop-ups are hit to the opposite side of the field, for reasons we've already discussed) explains this result. Right fielders may take more discretionary pop flies from first basemen (some of whom are the slowest players in baseball) than left fielders take from third basemen.

A batted ball category similar to infield fly outs is SH.
The three fielders who field SH are the pitcher, first baseman, and third baseman (and, to a very small extent, the catcher). There is probably some bunt hogging, depending on the fielding quality of pitchers. A great fielding pitcher, such as Greg Maddux, probably fielded some bunts that might otherwise have been fielded by Chipper Jones or Fred McGriff. In contrast, someone like Randy Johnson probably relied more on others to handle sacrifice bunts. The third baseman formula above reflects this factor by backing out a portion of A1.bip when calculating ra5. (So, if the pitcher is taking bunt opportunities from the third baseman, estimated hogged bunts are added back to the third baseman, and vice versa.)

Third basemen and first basemen don't fight over bunt opportunities; rather, bunt opportunities are "gifts" from the batter. Presumably, hitters playing against Brooks Robinson aimed their bunts toward Boog Powell, and hitters playing against Keith Hernandez aimed their bunts toward Howard Johnson. Regression analysis indicates that the more rgo3 (which is already adjusted for batter-handedness), the fewer ra5, and vice versa.

Another similarity between SH.bip and IFO.bip is that both are essentially guaranteed outs. All that is at stake with a sacrifice hit attempt is whether the lead runner advances, and the value of that is only about .20 runs. As a practical matter, no fielder should be getting any credit for fielding a sacrifice bunt and getting the runner out at first. Given the total number of SH attempts fielded, the fielder should be given credit for the net number of lead runners taken out relative to the league rate, given those total opportunities, multiplied by .20 runs. I doubt any contemporary third or first baseman would earn more than a couple of runs a season for any such skill. Given total SH attempts fielded, the fielder should be charged for the net number of times he went for the out at second and failed to get either the lead runner or the batter out, multiplied by the free out lost and the hit given up, or about 0.75 runs.

Any new DRA model I will develop will take more complete advantage of play-by-play data, will exclude SH from BIP altogether, and will subtract SH assists from each fielder's total. This will also make it unnecessary to back out SH.bip from positions that never have the opportunity to field SH, such as middle infielders and outfielders. Therefore, any future DRA model would not have the SH.bip factor in the rpo8 formula (it wasn't statistically significant in left or right), nor in the ra4 or ra6 formulas (except if significant in limiting double play opportunities).

First Base

About ninety-eight to ninety-nine percent of ground outs result in an assist for the fielder who fields the ball, with one exception: first base.
First basemen record assists for only about half of the ground balls they convert into outs; the rest of the time they just run to the bag to record the putout unassisted. Traditional statistics don't differentiate between ground ball putouts and fly ball putouts, but Retrosheet play-by-play data after 1951 does, so it is possible to count the exact number of ground balls a first baseman fields. Unfortunately, I had neither individual nor team totals of unassisted ground outs at first base (UGO3) when I first developed the post-1951 model. However, a reasonably good estimate of the team total can be obtained indirectly, as shown in the charts at the beginning of this appendix. In English, the three rows below say that estimated UGO3 is simply the average of two estimates.

UGO3 | Unassisted GO3 | avg(UGO3e1, UGO3e2)
UGO3e1 | Unassisted GO3 estimate #1 | IPO - A - IFO
UGO3e2 | Unassisted GO3 estimate #2 | GO - IA - CS - GIDP

The first estimate (UGO3e1) is the estimated number of infield putouts that were not due to catching fly balls: total infield putouts, minus total team assists (including outfield assists, which always result in an infielder putout), minus FO recorded by infielders. I had the exact count for the latter variable, because my data provider gave me the Retrosheet count for total fly outs; all you need to do is subtract outfield putouts from that total to arrive at infielder fly outs. This estimate will overestimate UGO3 by the number of unassisted ground ball putouts at infield positions other than first, which are, in total, only about one-third the total at first.

The second estimate is the estimated number of GO that were not in the form of infield assists. I had a total Retrosheet count of GO (at all infield positions); infield assists from fielding ground balls are estimated as total infield assists less CS and double play assists. This estimate underestimates total infield unassisted ground outs by the number of infield assists on relays.

The noise in the above estimates is not inconsiderable, but the estimates are probably not biased either. We are not ultimately concerned with getting the exact total of UGO3, but net UGO3, given BIP, or UGO3.bip. Unassisted ground ball putouts at second and third, as well as infielder relay assists, are rare and random events that should merely create random noise, whereas first base unassisted ground ball putouts are routine and reflect to a large degree the systematic preference of the first baseman to run to the bag or to toss to the pitcher covering the bag.

The sum of first base assists (A3) and UGO3 is estimated GO at first base (GO3). Here is the formula for residual, or regression-adjusted, GO3:

rgo3 = GO3.bip + .09*RBIP.bip.
Regression analysis would also include +.06*RFO.rbip and +.13*LFO.lbip, but we need to sacrifice some accuracy at first base by deleting these variables to ensure that the "global" regression of RA.ip onto our fully adjusted pitching, fielding, and base-running variables generates correct run weights for infield and outfield plays. Here's why.

The Proxy BIP Distribution Variables have a couple of important limitations. One is that in order to obtain in the second-step regression run weights in the infield and outfield that make sense (are approximately equal or slightly higher in the outfield), it is usually desirable that the sums of the RFO.rbip and LFO.lbip regression weights for adjusting infielder positions be approximately equal to the sums of the RGO.rbip and LGO.lbip regression weights, respectively, for adjusting outfielder positions. In other words, we do not want each infielder assist to be discounting each outfield putout more than each outfielder putout is discounting each infielder assist. The sum of RFO.rbip coefficients is .71 with an adjustment included at first base (rgo3) (.65 without); the sum of RGO.rbip coefficients is .70. The sum of LFO.lbip coefficients is .70 with an adjustment at first base (.57 without). But the sum of LGO.lbip coefficients is only .56:

rgo3 = GO3.bip [+ .06*RFO.rbip + .13*LFO.lbip]
ra4 = A4.bip + (...) + .15*RFO.rbip + .32*LFO.lbip + (...)
ra6 = A6.bip + (...) + .29*RFO.rbip + .15*LFO.lbip + (...)
ra5 = A5.bip + (...) + .21*RFO.rbip + .10*LFO.lbip + (...)
rpo7 = PO7.bip + (...) + .21*RGO.rbip + .10*LGO.lbip
rpo8 = PO8.bip + (...) + .27*RGO.rbip + .24*LGO.lbip + (...)
rpo9 = PO9.bip + (...) + .22*RGO.rbip + .22*LGO.lbip + (...)

With the first base adjustment included, the RFO.rbip and RGO.rbip coefficient sums would be balanced (.71 versus .70), but the LFO.lbip and LGO.lbip sums would not (.70 versus .56). That imbalance leads to run-weight coefficients for the outfield positions being lower than for the infield positions, because the marginal outfield plays are associated with a reduction in ground out plays greater than the reduction in outfield plays that is associated with marginal infield plays. We'll address issues related to this further below, when we discuss modeling issues, based on statistical theory, apart from baseball.

Examples Of First Stage Regression And Diagnostics

There would be little point to showing every single regression analysis and its output, but a couple of illustrative examples should convey the issues involved in variable selection.
If one regresses A4.bip onto the Proxy BIP Distribution Variables applicable to infielders ( RBIP.bip, RFO.rbip, LFO.lbip, and SH.bip ) and variables that may impact the number of runners on first base and thus double play pivot opportunities ( IBB.pa, SO.bfp, BB.bfp, HR.bh, and WP.ip ), we obtain the following output (I imported my Excel spreadsheet of centered variables into the statistical software package S-PLUS in order to run the regressions):

Call: lm(formula = A4.bip ~ IBB.pa + SO.bfp + BB.bfp + HR.bh + WP.ip + RBIP.jbip + RFO.rbip + LFO.lbip + SH.bip, data = DRAsept07sansNL69, na.action = na.exclude)
Residuals: Min 1Q Median 3Q Max
Coefficients: Value Std. Error t value Pr(>|t|)
(Intercept)
IBB.pa
SO.bfp
BB.bfp
HR.bh
WP.ip
RBIP.bip
RFO.rbip
LFO.lbip
SH.bip
Residual standard error: [...] on 1218 degrees of freedom
Multiple R-Squared: [...]

Generally, we will eliminate from consideration variables with a Pr(>|t|) greater than .05. It is quite common for statisticians to restrict model variables to those with p-values of less than .05. When we eliminate variables with p-values greater than .05 from the above regression, we obtain the following result:

Call: lm(formula = A4.bip ~ HR.bh + WP.ip + RBIP.bip + RFO.rbip + LFO.lbip + SH.bip, data = DRAsept07sansNL69, na.action = na.exclude)
Residuals: Min 1Q Median 3Q Max
Coefficients: Value Std. Error t value Pr(>|t|)
(Intercept)
HR.bh
WP.ip
RBIP.bip
RFO.rbip
LFO.lbip
SH.bip
Residual standard error: [...] on 1221 degrees of freedom
Multiple R-Squared: [...]
F-statistic: 192 on 6 and 1221 degrees of freedom, the p-value is 0

The above output, rearranged and rounded, says that a good estimate of Expected A4.bip = -.08 *RBIP.bip -.15 *RFO.rbip -.32 *LFO.lbip -.18 *HR.bh -.20 *WP.ip -.19 *SH.bip. Since we are looking for net plays, we subtract expected A4.bip from actual A4.bip to obtain the following formula for residual (or regression-adjusted) plays at second: ra4 = A4.bip +.08 *RBIP.bip +.15 *RFO.rbip +.32 *LFO.lbip +.18 *HR.bh +.20 *WP.ip +.19 *SH.bip. We round to two decimal places not only for the sake of readability, but also because the standard errors in the estimates of the coefficients (see the Std. Error column in the regression output) are generally greater than .01 and actually tend to be about .05. Reporting extra decimal places would be a classic case of false precision.

There are some interesting additional details in the final output. Notice that the data is DRAsept07sansNL69. I developed the model in September 2007 from Retrosheet data then only available from 1957 through [...]. Also, because of some data anomalies at the time in the 1969 National League data set, I excluded that year and league from the sample. Having developed the model from that data, I applied it out of sample to [...], 1969 (National League), and [...] when finalizing this book. We'll discuss the out-of-sample output shortly below.

The Multiple R-Squared of .4855 indicates that approximately 49 percent, or about half, of the variance in A4.bip can be explained by the model. The remaining residual is what we call ra4 and treat as reflecting the true skill of the team's second baseman. The spread of ra4 is still too wide: the worst team at second base had -89 ra4; the best, +83 ra4. The quartiles are fairly reasonable: -17 ra4 and +15 ra4. The Residual standard error is the standard deviation in ra4, which is 25.
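The step from expected A4.bip to ra4 described above can be sketched as a small function. The coefficients are the rounded values quoted in the text; the function names and the team values in the example are hypothetical and serve only to show the arithmetic.

```python
# Rounded coefficients quoted in the text for the second base adjustment.
RA4_WEIGHTS = {
    "RBIP.bip": 0.08,
    "RFO.rbip": 0.15,
    "LFO.lbip": 0.32,
    "HR.bh": 0.18,
    "WP.ip": 0.20,
    "SH.bip": 0.19,
}


def expected_a4(team):
    """Expected A4.bip: every predictor enters with a negative coefficient."""
    return -sum(weight * team[name] for name, weight in RA4_WEIGHTS.items())


def ra4(team):
    """Residual plays at second base: actual A4.bip minus expected A4.bip."""
    return team["A4.bip"] - expected_a4(team)


# Hypothetical centered team totals (plays above or below the league rate).
example_team = {
    "A4.bip": 10.0,
    "RBIP.bip": 50.0,
    "RFO.rbip": -20.0,
    "LFO.lbip": 10.0,
    "HR.bh": 5.0,
    "WP.ip": -5.0,
    "SH.bip": 0.0,
}

if __name__ == "__main__":
    print(round(expected_a4(example_team), 2))
    print(round(ra4(example_team), 2))
```

Because every input is already centered on the league rate, a team that is exactly average in every predictor has an expected A4.bip of zero, and its ra4 is simply its A4.bip.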
Though the ra4 do not follow a so-called normal distribution exactly, due to an excessive number of extreme outcomes, it is still approximately correct to say that the middle half of teams have between -17 and +15 ra4, and the middle two-thirds have approximately -25 to +25 ra4. This spread is probably too high, based on batted ball data, which indicates that the model is not perfectly capturing all the factors that can give or take away chances from second basemen. But the second-stage regression will

discount ra4 (and other such residual estimated skill plays at other positions) to adjust for this. I have not included the usual diagnostic plots of residuals. There is absolutely no non-linearity in the residuals, at any position. The scatter plots of residuals against fitted values show no change in the spread of residuals. While the residuals in both the first- and second-stage regressions were unimodal and symmetric, it must be said that the tails were fatter than one would like, thus falling short of the ideal in regression modeling of normally distributed residuals.

Recall that the presence of runners at first base should reduce GO3.bip, because the first baseman has to play closer to the bag. Regression analysis suggests that the typical impact is either not statistically significant or not practically significant over the course of a season.

Call: lm(formula = GO3.bip ~ IBB.pa + SO.bfp + BB.bfp + HR.bh + WP.ip + RBIP.jbip + RFO.rbip + LFO.lbip + SH.bip, data = DRAsept07sansNL69, na.action = na.exclude)
Residuals: Min 1Q Median 3Q Max
Coefficients: Value Std. Error t value Pr(>|t|) [1-std impact]
(Intercept)
IBB.pa
SO.bfp
BB.bfp [...] runs
HR.bh
WP.ip [...] runs
RBIP.jbip
RFO.rbip
LFO.lbip
SH.bip [...] runs
Residual standard error: 23.7 on 1218 degrees of freedom
Multiple R-Squared: [...]
F-statistic: [...] on 9 and 1218 degrees of freedom, the p-value is 0

I've highlighted the variables not under the control of fielders that would impact the number of runners at first base. The only one with a p-value below .05 was WP.ip, and, given the standard deviation of WP.ip, that impact in runs per season would typically be only plus or minus three runs. For reasons explained shortly above, we excluded RFO.rbip and LFO.lbip from the model for rgo3.

Second-Stage Regression And Diagnostics

Set forth below is the regression output from the second-stage, global regression, in which we regress actual team runs allowed above or below the league rate that year, RA.ip, onto all of the estimated net skill plays at all positions, including net pitcher plays such as BB.bfp, SO.bfp, HR.bh, IFO.bip, A1.bip, WP.ip, and net residual fielder plays such as ra4, ra6, rpo8, etc.

Call: lm(formula = R.ip ~ IBB.pa + SO.bfp + BB.bfp + HR.bh + IFO.bip + A1.bip + WP.ip + CS.sba + GO2.bip + A789.ip + rgo3 + ra4 + ra5 + ra6 + rpo7 + rpo8 + rpo9, data = DRA, na.action = na.exclude)
Residuals: Min 1Q Median 3Q Max
Coefficients: Value Std. Error t value Pr(>|t|)
(Intercept)
IBB.pa
SO.bfp
BB.bfp
HR.bh
IFO.bip
A1.bip
WP.ip
CS.sba
GO2.bip
OA.ip
rgo3
ra4
ra5
ra6
rpo7
rpo8
rpo9
Residual standard error: [...] on 1210 degrees of freedom
Multiple R-Squared: [...]
F-statistic: [...] on 17 and 1210 degrees of freedom, the p-value is 0

The standard error of a little over 22 runs is similar to the standard errors for the twenty or so well-known formulas for estimating team runs scored, as demonstrated by John Jarvis on his website. Generally this means that

the DRA estimate of runs allowed per team is within plus or minus 22 runs about two-thirds of the time. The worst matches, with the greatest errors, are -63 runs and +67 runs. I would imagine that almost all of the many well-known offensive models would have similar outliers in a fifty- or sixty-year sample. The Multiple R-Squared is not as high as I would like. When separate DRA models are developed for the Modern Era ([...]) and Contemporary Era (1993 to present), such models tend to have multiple r-squareds of approximately ninety-five percent, which is approximately the same as is found in the better models of team offense, as reported by John Jarvis at his website (three were as high as ninety-six percent). Part of the art of developing regression models is balancing accuracy and simplicity. In this case I felt it would dramatically simplify this book to have one model for all seasons since the early 1950s.

I have not included the usual diagnostic plots of residuals. There is absolutely no non-linearity in the residuals. We've dealt with multi-collinearity among the predictor variables via centering and first-stage regressions, so the variables all have correlations with each other between -.1 and +.1, down from -.6 and +.6 for the simple seasonal totals. The scatter plot of residuals against fitted values shows no change in the spread of residuals. The Durbin-Watson statistic did not indicate any meaningful correlation in team residuals over time. While the residuals in both the first- and second-stage regressions were unimodal and symmetric, it must be said again that the tails were fatter than one would like, though closer to a normal distribution than in the case of the first-stage regressions. However, due to the large sample sizes, no residual in the first-stage regression was remotely large enough to impact the coefficient estimates in the second-stage regression.
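For readers unfamiliar with the Durbin-Watson statistic mentioned above, the following is a minimal sketch of the standard formula: the sum of squared differences between consecutive residuals, divided by the sum of squared residuals. The text does not say how team residuals were ordered for this test, so the sketch simply assumes a single series already sorted in time order (for example, one franchise's residuals by season). The function name is illustrative.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals that are already in time order.

    Values near 2 suggest no first-order serial correlation, values near 0
    suggest positive correlation, and values near 4 suggest negative correlation.
    """
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum((later - earlier) ** 2
                    for earlier, later in zip(residuals, residuals[1:]))
    denominator = sum(e ** 2 for e in residuals)
    if denominator == 0:
        raise ValueError("all residuals are zero")
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical run residuals for one team across consecutive seasons.
    print(durbin_watson([12.0, -8.0, 5.0, -15.0, 9.0, 3.0]))
```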
One of the typical diagnostic tests for a regression model is to apply it out of sample to see how well it works. When I was finishing this book and had to apply the model to the 1969 National League and [...] seasons for both leagues, the standard error was 23 runs and the r-squared was .90, virtually identical to the in-sample values. Unfortunately, the [...] standard error was 36 runs, with a .89 r-squared. However, that is easily explained. First, the play-by-play data for the early-to-mid 1950s is not nearly as complete as it is for the late 1950s; some teams are missing up to 40 games of data per season. This results in significant data errors in the Proxy BIP Distribution Variables, CS.sba, GO3.bip, and IFO.bip. Second, as we will see in our discussion of the pre-1952 model(s), there was a dramatic change during the 1950s in the impact of pitchers on batted ball outcomes. The run weights for the so-called Three True Outcomes (BB.bfp, SO.bfp, and HR.bh) are remarkably consistent with those determined under a

variety of rigorous offensive models, though the weight for HR.bh is about one-tenth of a run too high. The run weight for CS.sba is almost precisely right, for it equals the sum of the typical increase in run expectation if a base is stolen (approximately .15 to .20 runs) and the typical decrease in run expectation if a runner on base is taken out (approximately .45 to .40 runs). Similarly, the run weight for A789.ip is almost precisely right, for it equals the sum of the typical increase in run expectation if a base runner gains the extra base (approximately .15 to .20 runs) and the typical decrease in run expectation if a runner on base is taken out by the outfielder (approximately .45 to .40 runs).

The run weight for WP.ip should be .27 runs, not .56, the excess being due to the fact that WP.ip carries the higher run expectation of the state of there already being one or more runners on base. In other words, positive WP.ip is strongly correlated with runs allowed not only because a WP increases runs allowed, but also because having runners on base already is obviously even more correlated with allowing runs; the WP.ip variable cannot separate out these two effects. We'll get to an imperfect fix in one of our alternative DRA models. IBB.pa is also too high, for the same reason; the average intentional walk increases expected runs by only .16, rather than .33.

Jim Albert and Jay Bennett's Curve Ball: Baseball, Statistics, and the Role of Chance in the Game (see pages 187 through 189 of the current paperback edition) has an excellent discussion about how regression variables can carry information of omitted variables (here, the existence of base runners) in both a good and a bad way. WP.ip and IBB.pa are examples where the omitted variables (the fact that there are runners on base already, which correlates with allowing runs) have a bad effect on the estimates. We will shortly see examples of variables carrying useful information.
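The way an omitted variable can inflate a run weight, as described above for WP.ip, can be illustrated with a small simulation. Everything here is an assumption made for illustration: the data are synthetic, the "runners on base" variable is a standardized stand-in, and the only number borrowed from the text is the .27 true run value of a wild pitch. It is a sketch of the general mechanism, not a reconstruction of the DRA regression.

```python
import random


def simple_slope(x, y):
    """Ordinary least-squares slope of y regressed on x alone."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulated_wp_weight(true_wp_runs=0.27, n=20000, seed=0):
    """Regress runs on wild pitches while leaving out runners on base."""
    rng = random.Random(seed)
    wild_pitches, runs = [], []
    for _ in range(n):
        runners = rng.gauss(0.0, 1.0)  # omitted variable (hypothetical scale)
        # Wild pitches are only recorded with runners on base, so the two move together.
        wp = 0.5 * runners + rng.gauss(0.0, 1.0)
        # Runs depend on both, but only wp will be visible to the regression.
        r = true_wp_runs * wp + 1.0 * runners + rng.gauss(0.0, 1.0)
        wild_pitches.append(wp)
        runs.append(r)
    return simple_slope(wild_pitches, runs)


if __name__ == "__main__":
    print("true run value:", 0.27)
    print("estimated weight:", round(simulated_wp_weight(), 2))
```

With these made-up settings the estimated weight tends toward .27 + 1.0 * .5 / 1.25 = .67, well above the true .27, because the wild pitch variable absorbs part of the effect of the runners it travels with.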
THEORETICAL QUESTIONS REGARDING THE PROXY BIP DISTRIBUTION VARIABLES

We now come to perhaps the most interesting issue in the DRA model from the standpoint of general statistical modeling: the role of the Proxy BIP Distribution Variables and the run weights for the residual fielding plays (ra4, ra6, rpo7, etc.). The Proxy BIP Distribution Variables are good proxy variables under standard multivariable regression theory, for two reasons. First, they are strongly correlated with the true distribution of ground balls and fly balls hit by right- and left-handed batters. As explained in chapter two, the .80 correlations between RFO.rbip and RGO.rbip, and between LFO.lbip and LGO.lbip, suggest they explain about two-thirds of the variance in true ground

balls versus fly balls generated by right- and left-handed batters respectively. That's because the correlations should otherwise be zero, since the quality of team outfields and infields should be uncorrelated over large samples. The fact that the correlations are nevertheless approximately .80 suggests that the square of that number (64%) is the amount of variation between FO given BIP and GO given BIP that must be controlled by the pitchers. Second, the chosen proxies are uncorrelated (or very weakly correlated) with the theoretical error term in a perfectly specified model (in other words, the true skill plays of the position being evaluated), and uncorrelated with any other predictors used to predict skill plays at such position, such as the baserunner variables and the ball-hogging variables.

When we get to the second, global regression, the residuals from the first set of regressions (rgo3, ra4, ra5, ra6, rpo7, rpo8, and rpo9) can be viewed as explanatory variables in predicting RA.ip that are either proxy variables for true skill plays or estimates of true skill plays that are subject to measurement error. If we view them as proxy variables, they have the problem that they are correlated somewhat with the error term in modeling RA.ip. For example, ra6 is too high (overestimates true skill net A6 (tsa6)) if the team's outfielders are above average in true skill, which would be associated with more runs prevented. Seen instead as simply measurement error in explanatory variables, this results in classical errors-in-variables, which can be shown to result in attenuation bias (see note 2), which causes the coefficients to be too small. This is exactly what happens in the DRA model, where the true run value of a true skill play (about .75 to .85 runs, depending on the position) is attenuated to something closer to .50 runs.
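Attenuation bias under classical errors-in-variables can be seen in a short simulation. The sketch below is illustrative only: the data are synthetic, and the standard deviations for true skill plays, measurement noise, and unexplained runs are hypothetical values chosen so that a true run value of .80 is attenuated to roughly .50, as in the discussion above. Under these assumptions the estimated weight tends toward the true weight times var(skill) / (var(skill) + var(noise)).

```python
import random


def simple_slope(x, y):
    """Ordinary least-squares slope of y regressed on x alone."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def reliability(skill_sd, noise_sd):
    """Share of the variance in measured plays that reflects true skill."""
    return skill_sd ** 2 / (skill_sd ** 2 + noise_sd ** 2)


def attenuated_weight(true_weight=0.80, skill_sd=15.0, noise_sd=12.0,
                      n=20000, seed=0):
    """Estimate the run weight when skill plays are measured with noise."""
    rng = random.Random(seed)
    measured_plays, runs_saved = [], []
    for _ in range(n):
        skill = rng.gauss(0.0, skill_sd)              # true skill plays (unobserved)
        measured = skill + rng.gauss(0.0, noise_sd)   # residual plays, e.g. ra6
        saved = true_weight * skill + rng.gauss(0.0, 20.0)
        measured_plays.append(measured)
        runs_saved.append(saved)
    return simple_slope(measured_plays, runs_saved)


if __name__ == "__main__":
    print("predicted weight:", round(0.80 * reliability(15.0, 12.0), 2))
    print("simulated weight:", round(attenuated_weight(), 2))
```

The smaller coefficient is not an error in the regression; it is the best linear weight for a noisy measurement, which is why it can still serve when the goal is to estimate total runs from the measured plays.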
Though that results in a mis-estimation of the true run value of a true net skill play, it is ultimately helpful in the DRA model because we are interested more in estimating the total defensive runs per position per team. Attenuation is an appropriate haircut for an estimate of skill plays with too much noise in it.

When I first published an article in 2003 about the basic approach of DRA, one of the readers suggested that it was an instrumental variables regression model, also known as a two-stage least-squares model. I do not believe that is the case. In the first-stage regressions for each position, the Proxy BIP Distribution Variables are serving simply as good (because they are independent of the position being evaluated) if imperfect predictors in an ordinary least-squares estimate of net plays at each position, given total BIP, for example, A6.bip, A4.bip, PO8.bip, A5.bip, etc.

2. See Jeffrey M. Wooldridge, Introductory Econometrics: A Modern Approach (South-Western, 2009).

Perhaps the reader was thinking of the first-stage, per-position regressions as the first stage in a formal two-stage (that is, instrumental variables) model. Seen in that light, the Proxy BIP Distribution Variables are attempting to function in some sense like instrumental variables, but without satisfying all the requirements that an instrumental variable should: most importantly, exogeneity, or independence from the error term in the second-stage regression.

For example, RFO.rbip is in some sense acting as an instrumental variable to purge estimates of net skill plays at each infield position of the effect of fly ball versus ground ball pitching to right-handed batters. And about two-thirds of RFO.rbip probably does reflect the tendency of opponent right-handed batters to hit the ball on the ground or in the air, which has a very minor impact on ultimate runs allowed. (The expected run value of a ground ball is close to that of a ball hit in the air; more ground balls go through for hits, but more balls hit in the air go for extra bases.) However, RFO.rbip also reflects to some extent the skill of the outfielders in preventing hits, which does have an impact on runs allowed and would impact the error term in the second-stage regression.

Another way in which the two-stage DRA model is inconsistent with the two-stage instrumental variables regressions that I have seen is that the number of instrumental variables is less than the number of predictor (pitching and fielding) variables, and that a different set of instrumental variables is used for each predictor. Finally, in the examples of two-stage instrumental variables regression that I have seen, the fitted variables from the first stage are included in the second-stage regression; here, the residuals from the first-stage regression are included in the second-stage regression.
Though the Proxy BIP Distribution Variables used in DRA are not ideal, they make the model much better than it would be without them. Furthermore, the ultimate validation of DRA is less whether it passes all the standard diagnostic tests for a regression model than whether it generates (i) fielder defensive runs estimates that match well with batted ball data systems, and (ii) team defensive runs estimates that match actual team runs allowed. On the basis of many tests I've conducted over the years, DRA defensive runs estimates for individual fielders match estimates derived from batted ball data about as well as the latter match each other. And DRA defensive runs estimates for teams match actual team runs allowed about as well as the best offensive runs models based on team seasonal totals of various offensive events match actual team runs scored.

Most importantly, the Proxy BIP Distribution Variables can be replaced with better Proxy BIP Distribution Variables in future versions of DRA that can be developed by exploiting the Retrosheet play-by-play database (currently available after 1951) to its maximum extent. When I say better,


More information

COACH PITCH DIVISION

COACH PITCH DIVISION COACH PITCH DIVISION These WLALL Coach Pitch rules are in addition to Little League Official Rules. To the extent the following rules are inconsistent with the Little League Official Rules, these rules

More information

Simulating Major League Baseball Games

Simulating Major League Baseball Games ABSTRACT Paper 2875-2018 Simulating Major League Baseball Games Justin Long, Slippery Rock University; Brad Schweitzer, Slippery Rock University; Christy Crute Ph.D, Slippery Rock University The game of

More information

DOE Golfer Experiment

DOE Golfer Experiment DOE Golfer Experiment A Design of Experiments Report Travis Anderson Jake Munger Deshun Xu 11/11/2008 INTRODUCTION We used Response Surface Methodology to optimize a golf putter. A face centered Central

More information

OFFICIAL RULEBOOK. Version 1.16

OFFICIAL RULEBOOK. Version 1.16 OFFICIAL RULEBOOK Version.6 3. Types of Cards Player Cards...4 Strategy Cards...8 Stadium Cards...9 2. Deck Building Team Roster...0 Strategy Deck...0 Stadium Selection... 207 CLUTCH BASEBALL ALL RIGHTS

More information
