FREE AGENCY AND CONTRACT OPTIONS: HOW MAJOR LEAGUE BASEBALL TEAMS VALUE PLAYERS. May 11, 2007

Michael Dinerstein
mdiner@stanford.edu
Stanford University, Department of Economics
Advisor: Prof. Bob Hall

Abstract

When evaluating and signing players, Major League Baseball teams face incomplete information regarding a player's true value. This paper explores how teams deal with such uncertainty and whether their approaches toward high-risk free agent signings and low-risk contract option decisions differ. I use seemingly unrelated regression to estimate the relationships between past performance and future performance and between past performance and free agent salaries. I find that in determining free agent salary offers, teams undervalue past performance relative to its power in predicting future performance. For the low-risk option decisions, I use the Wilcoxon rank-sum test and a logit regression to determine that teams are much less cautious and follow no discernible pattern in exercising options. Teams thus use very different approaches in making free agent and contract option decisions.

Keywords: sports economics, salary determination, free agent, labor, management

I am grateful to my advisor, Professor Bob Hall, for his guidance and thoughtful comments. Working with him has been an invaluable learning experience. I would also like to thank Professor Luigi Pistaferri for his econometrics advice. Any mistakes are my own.

1. Introduction

In 2004 Adrian Beltre, a third baseman for the Los Angeles Dodgers, had a fantastic year and received considerable support as the possible Most Valuable Player of the National League. Beltre's performance proved somewhat unexpected, however, because in the years prior to 2004 his statistics were relatively average compared to the rest of the league. Critics pointed out that Beltre's contract situation likely explained his unexpected performance. Beltre's contract was set to expire after the 2004 season, at which point he could become a free agent and negotiate a new contract with any baseball team. Perhaps Beltre then exerted a particularly strong effort during 2004, his contract year, that he had not shown during previous seasons. Or Beltre could have always performed to his full capacity and may have just reached a level of skill in 2004 that allowed him to develop into a very productive player. The Seattle Mariners seemed to assume the latter explanation was true (or at least that the former explanation would not predict a future devolution into prior habits) by offering Beltre a 5-year contract for $64 million, an amount that would hardly be justified by his performance prior to 2004. Beltre's subsequent performance proved disappointing relative to the expectations his new contract created.

The case of Adrian Beltre highlights the difficulty in discerning players' true marginal values with incomplete information and the consequences to teams and players of contract decisions. The goal of this paper is to understand how Major League Baseball (MLB) general managers and their teams navigate this process of deciding whether to keep and sign players. This paper will examine both the typically high-risk decisions of signing free agents to new contracts and the typically low-risk decisions of exercising team options that MLB front offices must make.
Is there a statistically significant pattern of soon-to-be free agents performing above expectations in the final year of their contracts? If so, are teams fooled by this unexpected performance or do they adjust their salary offers? While many papers have addressed the former question, none has asked whether teams fall into the trap of overemphasizing the most recent performance, or even whether they create the incentive in the first place for players to try harder during their contract year.

A related question of team rationality in the contract process is whether teams exercise their team options rationally. Options, which will be explained later in the introduction, represent a less significant commitment to a player because options are rarely guaranteed and are for just one year of service. I will address the degree to which teams exercise options rationally and then compare the valuation of player options with the more significant commitments in lucrative free agent deals.

1.1 Uniqueness of Major League Baseball

Baseball, due to the rules of the game and the nature of player contracts, affords researchers advantages that other sports cannot provide. While baseball is naturally a team game, individual performance, particularly by hitters, is reasonably independent from the actions of others. When a player hits the ball, his fate is nearly always independent of how his teammates act and depends on his opponents primarily through their defense. But given the high quality of defense at the major league level, most plays are either routine or quite difficult, and I assume that the variation in defenders is relatively limited. Whether the batter hits the ball is often quite dependent on the quality of the opposing pitcher, but because teams play 162 games and have fairly balanced schedules, I make the reasonable assumption that all hitters face approximately the same quality of pitchers over the course of the season.

For most at-bats, a batter attempts to reach base without making an out, regardless of whether the previous hitter made an out or reached base.
Certain situations can arise, however, that require the batter to aim for a specific outcome that would not necessarily be the goal for each at-bat. For instance, if a teammate has reached third base before two outs have been recorded in an inning, most hitters will attempt to hit a sacrifice fly, which sacrifices the current hitter to help the base runner score. While a hit is nearly always superior to a sacrifice fly in any situation, the risk involved in seeking a hit can be large enough that hitters aim for sacrifice flies. But even in such specific situations when the batter's goal changes, baseball statistics usually reward, or at least do not penalize, the batter for helping the team.

Furthermore, the statistics are general enough that they can accommodate hitters with distinct strengths. The power hitter may frequently hit home runs but also strikes out more than another hitter who rarely hits more than a double but reaches base often. Statistics like slugging percentage reward both types of hitters by giving more points for home runs than singles but also accounting for the number of times a batter reaches base. Baseball statistics thus usually measure actions that are highly correlated with individual and team goals and are flexible enough to apply to the full spectrum of hitter types.

Whether pitchers' performance lends itself so easily to research is debatable. While most prior studies have focused on hitters, Krautmann, Gustafson, and Hadley (2003) built a model that predicted pitchers' salaries based on past performance. Their findings indicate that pitchers cannot all be treated as members of the same population; rather, starting pitchers, middle and long relief pitchers, and closing pitchers require separate analyses. Furthermore, no single performance measure emerges as explaining variation in pitcher contracts. Measuring pitching performance is thus more challenging.

The rules for player contracts also overcome some of the difficulties of analyzing athlete contracts in major team sports.
Unlike in the National Football League (NFL), MLB contracts are guaranteed, so that if a team releases the player, the team still has an obligation to pay the player the full contract amount. Contract length, as well as salaries in later years of long-term contracts, should be close to a baseball player's projected value, whereas a football team might offer a six-year contract without expecting the player to be on the roster for the last few years. Player and team options can be exceptions to the rule of guaranteed contracts, but, as this paper will show, their relatively simple nature still allows for analysis.

The other unique aspect of Major League Baseball contracts is that teams do not have to remain under a salary cap. In the National Basketball Association (NBA), teams can spend only a limited sum on players each year, whereas no such restriction exists in MLB. The salary cap has several effects. First, players may not receive their marginal value to a team because the team has an upper limit it can offer. Therefore, a model that uses past performance to predict salary may run into difficulties caused by truncation. Second, to ensure that teams have some flexibility despite the salary cap, the NBA includes several salary exceptions that allow a team to sign a player for a certain amount regardless of salary cap implications. Such exceptions can force player salaries into slots that make it difficult to fit a continuous distribution to salaries and again create a system where players may not receive their marginal value. Third, to circumvent the salary cap, basketball teams often offer contracts that are back-loaded or include large signing bonuses that can be distributed over the length of the contracts in years where the team may be well under the salary cap. The large variation in yearly salaries can prove difficult to model. The relative consistency of baseball contracts makes analysis more straightforward.

Baseball's system of free agency also more closely resembles a free market.
Unlike the NBA, MLB does not have rules that allow a player's current team to offer a contract amount that no other team can match. Unlike the NFL, MLB does not feature franchise tags that allow teams to designate future free agents as franchise players and restrict them from entering the free agent market. The salary determination process is therefore easier to model because the bidding is more competitive and players have less uncertainty about when they will become free agents.

1.2 Large Commitments: Free Agent Decisions

Because the contract determination process can be complex and involve many factors that are hard to separate, this paper will focus on the specific decision of how teams value most recent performance when signing free agents. An understanding of this aspect will offer insight into how teams generally deal with large financial commitments.

The team's goal is some combination of winning a championship and maximizing its profits. A player's on-field production is an important component of both of these goals, and all else being equal an increase in on-field production increases the probability of winning and the team's profits. When offering a player a contract, a team must predict the player's future on-field production, but this task can be quite difficult, as highlighted by the Adrian Beltre example.

Changes in player performance could be a result of two factors. First, a player's possible performances form a distribution from which each year is one draw. Even if the distribution remains unchanged, player performance can vary between years. Second, a player's distribution may change. For instance, if a player becomes more accustomed to major league pitching, his distribution of outcomes may change to favor higher performance. Distinguishing between these factors is the team's challenge.

Adding to this variation in on-field performance is a player's ability to control his effort. Following Krautmann (1990), a worker's marginal production equation is

$$MP_j = f(E_j, X, \varepsilon_j) \tag{1}$$

where $MP_j$ is the jth worker's marginal product, $E_j$ is the jth worker's effort, $X$ is a vector of other inputs, and $\varepsilon_j$ is a random variable. I define effort broadly to include any measures that improve performance, whether they are expected of the worker or not. For instance, taking steroids is illegal and many teams may frown upon such action, but if players can increase their production by taking steroids, then doing so counts as a measure of effort. Effort can be decomposed, as Krautmann (1990) outlines, as

$$E_j = h(CT_j, Z_j) \tag{2}$$

where $CT_j$ is the time remaining on the jth worker's contract and $Z_j$ is a vector of other factors affecting his effort. The question is why might

$$\frac{\partial MP_j}{\partial CT_j}, \tag{3}$$

the marginal effect of time remaining on a contract on marginal product, not equal zero.

If baseball players anticipate they will receive higher salaries for certain actions, they will engage in such behavior. Players at the beginning of long-term contracts know that they have locked in specific salaries for the following years and may expect that their current performance will have no impact on their compensation. Players in the final year of their contracts, or the contract year, can expect to negotiate new contracts in a short time frame and may believe that increased performance now will translate to higher salaries in a year. For this reason, players may exert more effort in the final year of their contract because they expect it will have significant benefits.

Whether this expectation is rational is unclear. Teams presumably evaluate players on their performance over several years. If this pattern is established, players have little incentive to adjust their effort for only the end of their current contracts. Instead, production should be more consistent. But teams may have outside pressures that preclude them from taking such an approach toward evaluation. For instance, if a baseball team's fans or sports writers are tired of losing and see that a player who just finished a productive season is available, they may call for their team to sign the player. Such pressure could cause the team to alter how it values players.

Teams have turned to incentive mechanisms to protect themselves from the risk associated with signing a player whose future production is highly uncertain. For instance, many contracts include bonuses if a player participates in a certain number of games or makes an All-Star team. The size of these bonuses, however, is small compared to the base salaries, and as Harder (1989) notes, large incentive mechanisms are highly correlated with high base salaries. These incentive mechanisms thus may not provide the team with much insurance. Therefore, teams continue to face the problem of how to predict future performance. These decisions can have large impacts because free agent contracts are often multi-year commitments to players for large sums of money.

1.3 Small Commitments: Options

Options can be team options, player options, or mutual options. When a player and team initially negotiate a contract, they can add an option to the end of the contract for a pre-specified amount. Then, when the player completes the non-option years of the contract, the holder of the option chooses whether to exercise it. If the holder exercises the option, the player's contract is extended by one year for the amount that was specified when the contract was originally signed. For a mutual option, both the team and the player must exercise the option for the previous contract to extend. Typically the team or player can exercise the option at any point during the contract, although only the teams seem to exercise options early. Team options usually include buyouts that the player receives if the team chooses not to exercise the option, though these buyouts are significantly smaller than the option amount.
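The mechanics of a team option with a buyout can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a model used in this paper: the class `TeamOption`, the function names, and the decision rule (a risk-neutral team exercises whenever the player's expected one-year value covers the option salary net of the buyout it would owe anyway) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TeamOption:
    """Terms fixed when the original contract was signed (in dollars)."""
    salary: float  # paid for the option year if the team exercises
    buyout: float  # paid to the player if the team declines


def net_cost_of_exercising(option: TeamOption) -> float:
    """Extra money committed by exercising rather than declining.

    The buyout is owed if the team declines, so only the difference
    between the option salary and the buyout is truly at stake.
    """
    return option.salary - option.buyout


def should_exercise(option: TeamOption, expected_value: float) -> bool:
    """Assumed rule: exercise when expected value covers the net cost."""
    return expected_value >= net_cost_of_exercising(option)
```

Under this assumed rule, a hypothetical $8 million option with a $1 million buyout would be exercised for any player whose expected value to the team in the option year is at least $7 million, which is one reason buyouts can make exercising look cheaper than the headline option amount suggests.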

Sometimes the contract will delineate conditions under which a team option will vest, or automatically become guaranteed.

Some form of option actually predates free agency, as during the reserve era the team always had an option to renew the previous one-year contract if the two parties could not negotiate a new deal. But once the free agency era began, options became much less frequent and have only recently regained their popularity. Interestingly, options are so entrenched in the business of baseball that the players' union had an option to extend the 1997-2000 Collective Bargaining Agreement for one year, which it exercised.

When exercising options, teams are thus making smaller commitments to players because options are only one-year extensions of contracts. Furthermore, options only apply to players previously under contract with the team, so the team likely is aware of the player's work ethic and interactions with teammates. Because option choices are relatively simple binary decisions to model (the holder either exercises the option or does not), they lend themselves to salary models and offer an interesting point of comparison to modeling free agent contracts. This paper will consider only team options.

2. Literature Review

There has not been a study of player or team options in baseball. I hypothesize that this dearth of literature is partly a result of the challenge of finding comprehensive options data. While downloadable databases of salaries and lists of free agents are readily accessible, there is no organized public source of option data, and until the growth of Internet blogs, researchers lacked a way to compile option data. Furthermore, options seem to have become much more popular in recent years, and therefore their significance has only arisen recently. My attempt to collect option data and analyze it thus appears to open the door into understanding how players and teams use options and whether these low-risk commitments differ from decisions regarding free agent contracts.

As for the question of whether players allow their effort to fluctuate over the course of a contract, baseball researchers have offered several models but have yet to settle on a conclusive answer. Harder (1989) collected data from four years: 1976, 1977, 1987, and 1988. Harder used two unconventional statistics, runs created and total average, as his measures of performance. He claimed that these statistics could best measure a player's contribution to a team's success, and their correlations with conventional statistics were very high. Harder then identified the log of salary as the dependent variable and ran a linear regression with variables for players' career production, experience, contract status, team performance, ethnicity, and All-Star status the previous season. The paper concluded that being in a contract year had no effect on total average or runs created in 1976 or 1977 while in 1989 the effect was negative on runs created but not statistically significant on total average. Harder concluded that players did not anticipate that statistical gains in their contract years would lead to higher salaries.

The study's years were so unique in baseball history that these results likely do not apply to the current environment. During 1976 and 1977, the first two years of data, the league was transitioning into free agency. Players likely could not predict how their production would affect their future salaries. During the later years, 1987 and 1988, owners were found to be colluding. Again, players may have expected to be low-balled in contract negotiations regardless of performance and thus saw no incentive to try harder during their contract year. Because today's league more closely models a free market, players' incentives have likely changed.
Without using the standard regression, Krautmann (1990) attempted to answer the related question of whether shirking (exerting less than maximal effort) occurs in the first year of a long-term contract. Krautmann's data ran from 1976 to 1983 and included free agents who signed contracts with lengths of at least five years. He constructed forecast intervals for players' slugging averages based on past data and then defined super-par and sub-par performance as the top and bottom 5% of the interval, respectively. Krautmann found that only 4.5% of the players had super-par performances and only 1.8% had sub-par performances. He concluded that shirking did not occur.

Scroggins (1993) challenged Krautmann's result by claiming that slugging percentages would not pick up shirking but that using total bases, a measure that could account for time spent injured, proved that shirking occurred. Scroggins used a linear regression with Krautmann's data and found that the coefficient on whether a player had just signed a long-term contract was significantly negative. Krautmann (1993) defended his choice of slugging percentage and then applied his method of analysis to the total bases statistic. He found that the results were nearly identical to those of his initial paper.

This paper will add to the existing literature in several ways. First, the discussion of team options fills in a gap that research has not yet covered. Their emergence in recent years and their uniqueness as non-guaranteed contract years allow options to offer insight into how players react to a more uncertain contract situation and how teams value players. Second, the analysis of effort during contract years will make use of more recent data than previous studies. Particularly in a labor market that has fundamentally changed over the last 20 years, recent data is essential in explaining current patterns of behavior. Finally, previous studies have focused on whether players react to an incentive to alter behavior in the contract year. This paper will delve into the team's side of this issue by asking whether teams create the incentive to alter behavior and how they value players in the context of possible shirking.
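The forecast-interval idea attributed to Krautmann (1990) earlier in this section can be illustrated with a minimal Python sketch. The details here are assumptions made for illustration, not Krautmann's actual procedure: the interval is a normal-approximation prediction interval built from a player's own past slugging averages, and the function names and the "par" label are hypothetical. A 90% interval leaves 5% in each tail, matching the top and bottom 5% cutoffs described above.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def forecast_interval(past, coverage=0.90):
    """Central prediction interval for the next season's statistic.

    Assumes seasons are independent draws from one normal distribution;
    the sqrt(1 + 1/n) factor reflects uncertainty in the estimated mean.
    """
    n = len(past)
    if n < 2:
        raise ValueError("need at least two past seasons")
    center = mean(past)
    spread = stdev(past) * sqrt(1 + 1 / n)
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    return center - z * spread, center + z * spread


def classify(past, observed, coverage=0.90):
    """Label a season relative to the player's own forecast interval."""
    low, high = forecast_interval(past, coverage)
    if observed > high:
        return "super-par"
    if observed < low:
        return "sub-par"
    return "par"
```

If players did not shirk and the interval were well calibrated, roughly 5% of first-year performances would be expected to fall in each tail, which is the benchmark against which the reported 4.5% and 1.8% shares can be read.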

Michael Dinerstein, May 11, 2007, Page 11 3. Data As discussed above, pitching statistics are not general enough to cover all types of pitchers. I choose to drop pitchers from my analysis for two reasons. First, splitting up the pitchers into starters, long and middle relievers, and closers and analyzing them separately causes my conclusions to suffer from small sample sizes. Second, because the different pitching roles can require different mentalities and skill sets, a player s past performance in one role may not accurately predict his future performance if he changes roles. Hitters, on the other hand, can be aggregated, and while they may switch positions, this change should not affect their hitting performance. My sample consists of player-years from 2001 to 2004. Players who signed free agent contracts beginning in 2005 are also included. 1 While the full population of hitters from these years numbers 2111, my sample includes only 1330 of the player-years, or 62.9%. This partial coverage is a result of incomplete contract information. My data consists of three main subsets: player on-field performance, off-field player characteristics, and contracts. On-field performance is publicly available for all players and consistent across sources because baseball designates an official scorer. 2 Off-field player characteristics are also widely available and do not depend on the source. 3 Contract information is much harder to find and the level of detail can depend on the source. 4 1 Because I evaluate free agents based on variables observed prior to their new contract, these observations use data from only 2001-2004. These players are included to increase the sample size of free agents. 2 For on-field performance, I used version 5.4 of Sean Lahman s baseball database, The Baseball Archive. The database is accessible at http://www.baseball1.com/statistics/. I checked a random sample of these statistics with ESPN.com (http://espn.go.com), and found no inaccuracies. 
3 I again used version 5.4 of The Baseball Archive as well as Baseball-Reference.com (http://www.baseballreference.com/).

4 For the salary amounts, I used version 5.4 of The Baseball Archive. For lists of free agents, I consulted Associated Press articles. For contract lengths and option data, I used two online blogs and checked their data against Associated Press articles. The first blog, MLB Contracts (http://www.bluemanc.demon.co.uk/baseball/mlbcontracts.htm), is no longer accessible directly, so I used The WayBack Machine (http://www.archive.org/web/web.php) to access the site. The other blog, Cot's Baseball Contracts (http://mlbcontracts.blogspot.com/), proved to be an excellent source.

Michael Dinerstein, May 11, 2007, Page 12

There is no consensus on how to measure yearly on-field batter performance. The most popular statistics in newspapers and casual fan conversations are home runs, runs batted in, and batting average. Home runs give a sense of a player's power, but they shed little light on his ability to reach base through other hits. Furthermore, they are highly dependent on the number of games a batter plays. Runs batted in (RBIs) may be a strong measure of a hitter's contribution to a team's production. The drawback, however, is that a hitter's opportunities for RBIs are very dependent on his teammates. An RBI is much easier to obtain if teammates have reached base. Thus, hitters who bat after good hitters have more opportunities for RBIs. Batting average (BA) gives the percentage of hits a player earns for his total at-bats:

\[ \mathrm{BA} = \frac{\text{Total Hits}}{\text{Total At-Bats}} \tag{4} \]

At-bats are the total number of times a player bats, though walks, sacrifices, and hit-by-pitches are not included. Batting average offers a strong measure of a player's ability to reach base, but it fails to distinguish between different types of hits. For instance, a home run is more valuable than a single, but batting average counts each as one hit. The most common statistics thus are not sufficient for an analysis that attempts to find a player's full independent value.

Other hybrid statistics more closely measure a player's value to a team. Most previous studies have used slugging percentage (SLG), where

\[ \mathrm{SLG} = \frac{\text{Total Bases}}{\text{Total At-Bats}} \tag{5} \]

The total bases statistic, another performance measure, refers to the number of bases a player has earned through hits. For instance, a single equals one base and a double equals two, but a non-hit like a walk does not count even though the player advances a base. Slugging percentage thus conveys a combination of how often a player gets hits and how valuable the hits are.

On-base percentage (OBP) improves upon batting average by accounting for non-hits, such as walks, that lead to the batter reaching base without causing an out. OBP is the ratio of the number of times a batter reaches base without making an out to his total plate appearances. OBP has gained popularity recently, as recounted in Michael Lewis's Moneyball (2003), which described the approach taken by Oakland Athletics General Manager Billy Beane in evaluating players. The problem with OBP is the same as that of BA: OBP fails to differentiate between different types of hits. In response to this deficiency, baseball researchers have turned to OPS, the sum of SLG and OBP. OPS combines the advantages of SLG and OBP.

Bill James and Thomas Boswell also have argued for their own complicated statistics. James (1988) constructed runs created, which accounts not only for hits and walks but also for a player's ability to steal a base. Boswell (1988) created total average, a measure that also included base-running statistics.

The real test of which performance measure to employ is whether the teams actually use it when predicting a player's value. Even if one statistic is the best predictor of wins or team profit, its usefulness in this analysis is limited because I am testing how teams weigh the most recent performance versus less recent performance in determining salary offers. Therefore, I take the statistic, or combination of statistics, that the team uses as given. But since teams do not make such information public in most cases, I assume that teams have learned from experience and now use a performance measure that best accounts for a player's contribution. The problem returns to determining the performance measure that best leads to wins and profits.
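The rate statistics defined above can be made concrete with a small sketch. The code below is illustrative only: the `BattingLine` class, its field names, and the two hypothetical hitters are assumptions introduced here, and plate appearances are approximated as at-bats plus walks, hit-by-pitches, and sacrifices, which is a simplification of the official scoring rules.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """One batter-season of counting statistics (hypothetical container)."""
    at_bats: int
    singles: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int
    sacrifices: int

    @property
    def hits(self) -> int:
        return self.singles + self.doubles + self.triples + self.home_runs

    @property
    def total_bases(self) -> int:
        # A single earns one base, a double two, and so on; walks earn none.
        return (self.singles + 2 * self.doubles
                + 3 * self.triples + 4 * self.home_runs)

    @property
    def plate_appearances(self) -> int:
        # Assumption: at-bats plus the events that are excluded from at-bats.
        return self.at_bats + self.walks + self.hit_by_pitch + self.sacrifices


def batting_average(b: BattingLine) -> float:
    return b.hits / b.at_bats                      # equation (4)


def slugging(b: BattingLine) -> float:
    return b.total_bases / b.at_bats               # equation (5)


def on_base(b: BattingLine) -> float:
    # Times on base without making an out, over plate appearances.
    return (b.hits + b.walks + b.hit_by_pitch) / b.plate_appearances


def ops(b: BattingLine) -> float:
    return on_base(b) + slugging(b)


# Two hypothetical hitters with identical hit and walk totals.
contact_hitter = BattingLine(500, 150, 0, 0, 0, 50, 0, 0)
power_hitter = BattingLine(500, 100, 30, 0, 20, 50, 0, 0)

for name, b in [("contact", contact_hitter), ("power", power_hitter)]:
    print(name, round(batting_average(b), 3), round(on_base(b), 3),
          round(slugging(b), 3), round(ops(b), 3))
```

Both hypothetical hitters have the same BA and OBP, while SLG and OPS separate them, which is the limitation of BA and OBP described in the text.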
The combination of statistics should account for both total production during a season and average production during the games a batter plays. Thus, the statistics should distinguish between two players who produce equal cumulative totals but play in different numbers of games. For the reasons already discussed, home runs, RBIs, BA, and OBP do not account for players' full value in ways that better statistics do. Runs created and total average are too obscure to assume that all teams use them to evaluate players. Instead, I use a combination of slugging percentage and total bases. While OPS includes SLG, the results hardly change when I substitute OPS for SLG, and SLG is easier to interpret than OPS, which double-counts hits (hits appear in both the SLG and OBP components). Also, most studies have used SLG. Therefore, I prefer SLG to OPS.

In order to predict future slugging percentage (SLG_t) and a player's value to a team, I will use lagged slugging percentages from the three previous seasons (SLG_{t-1}, SLG_{t-2}, SLG_{t-3}).⁵ In order to distinguish between players who are efficient but often hurt and players who are efficient and play many games, I will include a three-year average of total bases (AVGBASES). This combination is simple but takes into account a player's rate of production and total contribution.⁶

5 I choose to predict slugging percentage rather than total bases because total bases are less consistent across years as they depend largely on injuries that occur randomly. Therefore, lagged slugging percentages are better predictors of future slugging percentage, so the player model will predict slugging percentage but still make use of total bases.

6 In the second model used in evaluating players with options, I use lagged values of total bases (BASES_{t-1}, BASES_{t-2}, BASES_{t-3}) as well as slugging percentages because the dependent variable is now a salary amount, not a performance statistic.

In addition to a batter's hitting skill, his value to a team includes fielding abilities. Fielding can be difficult to measure because a high number of errors may not necessarily indicate a poor fielder but rather a fielder with wider range who can reach many balls. Therefore, his errors might have been hits if other fielders were playing. Furthermore, general managers rarely mention players' fielding when signing free agents, so I am inclined to believe that only the best fielders receive more money for their fielding abilities. Thus, I include a dummy variable (GOLDGLOVE) for whether the position player (non-pitcher) has received a Gold Glove Award, which designates each year the best fielder at a certain position in each league.

Off-field player characteristics are player-specific characteristics that do not depend on performance. The most obvious variable choice is AGE. I expect that as players age, they become accustomed to the grueling 162-game seasons, they learn what lifestyle will allow them to perform at a high level, and they make other changes driven by experience to maximize their performance. Batters may also adapt to Major League pitching over time, but conversely pitchers may adapt to the batters' tendencies, which makes the total effect ambiguous. In this sense, deviations in performance between years could be a result of a player's underlying ability changing rather than random draws each year from the same distribution. Thus, unless the effect of pitchers adapting to batters is particularly strong, I expect older players to have higher statistics and higher value to teams. But at a certain point older players face disadvantages that can affect their value. Injuries can accumulate, hitters' eyes may become less sharp, and other physical ailments can limit a player's performance. To model this quadratic relationship, I include an AGE^2 variable. Unfortunately, data on how long players have been in the minor leagues, which could affect their major league performance, is nearly impossible to collect.

Players also contribute value to teams beyond their on-field performance. Players with magnetic personalities or characteristics that inspire the community have marketing potential that can increase team revenues. Because a player's marketability is nearly impossible to measure precisely, I include a dummy variable (ALLSTAR) for whether a player was an All-Star in any of the three previous seasons.
All-Stars are often the best-known players with the best marketing potential, so All-Star status serves as a proxy for a player's marketability.
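As a minimal sketch of how the regressors described above might be assembled for one player-season, consider the function below. The function name, the dictionary keys, and the sample inputs are hypothetical. The sketch also assumes that the Gold Glove indicator is supplied directly as a yes/no value, since the text does not specify the window over which the award is counted.

```python
def build_regressors(slg_lags, bases_lags, age, all_star_flags, gold_glove):
    """Assemble one observation's regressors (illustrative names only).

    slg_lags, bases_lags, all_star_flags: values for seasons t-1, t-2, t-3,
    in that order. gold_glove: True if the player counts as a Gold Glove
    winner (assumption: the caller decides the relevant window).
    """
    if not (len(slg_lags) == len(bases_lags) == len(all_star_flags) == 3):
        raise ValueError("expected exactly three prior seasons")
    return {
        "SLG_t-1": slg_lags[0],
        "SLG_t-2": slg_lags[1],
        "SLG_t-3": slg_lags[2],
        # Three-year average of total bases separates durable players
        # from efficient but often-injured ones.
        "AVGBASES": sum(bases_lags) / 3,
        "AGE": age,
        "AGE2": age ** 2,                      # quadratic age term
        # 1 if an All-Star in any of the three previous seasons.
        "ALLSTAR": int(any(all_star_flags)),
        "GOLDGLOVE": int(bool(gold_glove)),
    }


# Hypothetical player: efficient in all three seasons, hurt in season t-3.
row = build_regressors(
    slg_lags=(0.450, 0.430, 0.410),
    bases_lags=(250, 230, 120),
    age=31,
    all_star_flags=(False, True, False),
    gold_glove=False,
)
print(row)
```

Here the injury-shortened season barely moves the lagged slugging percentages but pulls AVGBASES down, which is the distinction the three-year average is meant to capture.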

I also include dummy variables for the observation's year (2001, 2002, 2003, 2004). Even though the period studied is short, the labor market or general performance may change between the years. In 2002, the owners and players association agreed to a new Collective Bargaining Agreement. This agreement included for the first time revenue sharing among teams, a luxury tax on teams with high payrolls, and testing for steroids. Because these changes could affect player performance and team decisions, yearly dummy variables are necessary. Furthermore, rumors that the composition of the baseballs has changed over the years demand that I differentiate between years.

To relate these variables to labor market decisions, I require contract data. Because such data is reported in a haphazard manner, I can rely on only the most significant and basic parts of the contract. For instance, the reporting of performance incentive clauses varies between sources as well as between players. Details of contracts of high-profile players are more often reported because the public's interest in these players is high. More marginal players are less noteworthy and may even be playing with minor league contracts that escape the media's attention. Because most players' base salaries appear publicly, I will focus my analysis on base salaries (SALARY). A player's salary serves as a proxy for the player's value to a team and is a scarce resource for the team. For most players, their base salary dominates potential bonuses. This may not be true, however, for some players, especially those who are injury-prone. The difficulty in collecting data on incentives and bonuses unfortunately precludes a more comprehensive analysis. I convert all salaries into millions of dollars to avoid large numbers.

I also collect the length of contracts (LENGTH) but do not explicitly place it in my models.
Contract length can represent the size of a team's commitment and is thus relevant. But since it is endogenous in the teams' decision-making, I use it only to sort observations and determine which observations appear in each model. Similarly, NEWCONTRACT is a dummy variable for whether the player has signed a new free agent contract for the current season, but it does not appear in the equations. Instead, it filters old contracts from entering the free agent salary equations. Finally, to test whether a player outperforms expectations prior to becoming a free agent, I include a dummy variable (CONTRACTYEAR) for whether the player will be a free agent after the current season.⁷

7 Some players will become free agents unexpectedly if they are released or an option is not exercised. Only those players who are guaranteed to become free agents, unless an extension is signed before the season ends, have a 1 for CONTRACTYEAR.

The descriptive statistics, reported in Table 2, offer some basic trends. Slugging percentage falls in the sample from year to year whereas total bases trends upward. The average age is about 31 years old and the average salary is $3.357 million. Almost 25% of observations involve newly signed free agent contracts, and 25% are in the final year of a contract before becoming free agents.

Because this sample only covers 62.9% of player-years from 2001 to 2004, the degree to which the sample represents the population is at issue. Since on-field performance and off-field player characteristics data are available for all players, I can test whether the players included in the sample are characteristically different from those not included. A test of whether means are equal shows that the players in the sample are older, more experienced, have higher base salaries (when reported), are less often switch hitters, and have more at-bats, higher slugging percentages, and higher on-base percentages in the prior year. All of these differences, except the proportion of switch hitters, appear significant in all four years as well as in the aggregated sample.

While the following analysis suffers from a censoring problem, the bias in the sample favors older and more experienced players, who are exactly those players more likely to be eligible for free agency and to have options in their contracts. This paper attempts to answer questions that revolve around free agency, so the sample's bias toward older and more experienced players is mitigated. This paper also focuses on the difficulty that teams have in evaluating players and making contract decisions. These decisions are more important for players with higher salaries because the team's commitment is larger. The censoring problem is thus less significant than it first appears.

4. Model

For the question asking how teams make large commitments, I have player and team regression equations. On the player side, my dependent variable is the player's on-field performance, measured in terms of slugging percentage, in the current year. To predict such performance, I use off-field player characteristics and past on-field performance. I also include the dummy variable CONTRACTYEAR. On the team side, my dependent variable is the first-year salary given to a newly signed free agent. I use the same off-field player characteristics and past on-field performance variables in addition to dummy variables for ALLSTAR and GOLDGLOVE. These variables appear only in the team equation because they are relevant for a player's value to a team but they fail to predict future slugging percentage beyond the inclusion of the lagged SLG. Similarly, CONTRACTYEAR is only relevant to the player, who may alter his performance when in the contract year, whereas every free agent the team signs just finished a contract year.

The player equation makes use of all observations in the sample. The team equation, because it applies to team decisions on free agents, only uses observations for which the player has just signed a new free agent contract. The dataset is thus unbalanced between the equations, as players register slugging percentages each year but sign free agent contracts less frequently.
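The sample rule just described, with every observation entering the player equation and only new free agent signings entering the team equation, can be sketched in a few lines. The records and field names below are invented for illustration and are not drawn from the paper's data.

```python
# Hypothetical player-season records; only the NEWCONTRACT flag matters here.
observations = [
    {"player": "A", "year": 2001, "NEWCONTRACT": 1},
    {"player": "A", "year": 2002, "NEWCONTRACT": 0},
    {"player": "A", "year": 2003, "NEWCONTRACT": 0},
    {"player": "B", "year": 2002, "NEWCONTRACT": 0},
    {"player": "B", "year": 2003, "NEWCONTRACT": 1},
]


def split_samples(rows):
    """Return (player_sample, team_sample) under the rule in the text."""
    player_sample = list(rows)                                # all observations
    team_sample = [r for r in rows if r["NEWCONTRACT"] == 1]  # new signings only
    return player_sample, team_sample


player_sample, team_sample = split_samples(observations)
# The two equations see different numbers of observations (unbalanced data).
print(len(player_sample), len(team_sample))
```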

Because the equations model similar processes, I expect that the regression error terms could be correlated. Instead of running two separate OLS equations, I use seemingly unrelated regression to determine the coefficients jointly. I want to test whether coefficients on the same independent variables are equal across the two equations, so I need both dependent variables to lie on the same scale. With an OLS regression, I estimate that each marginal slugging percentage point (.001) equates to a marginal value of $29,500.⁸ After rescaling SLG_t into dollar amounts, the equations appear as follows:

\[
\begin{aligned}
\mathrm{SLG}^{*}_{t} ={}& \beta_0 + \beta_1 \mathrm{SLG}_{t-1} + \beta_2 \mathrm{SLG}_{t-2} + \beta_3 \mathrm{SLG}_{t-3} + \beta_4 \mathrm{AVGBASES} \\
&+ \beta_5\,\text{2001} + \beta_6\,\text{2002} + \beta_7\,\text{2003} + \beta_8\,\text{2004} \\
&+ \beta_9 \mathrm{AGE} + \beta_{10} \mathrm{AGE}^2 + \beta_{11} \mathrm{CONTRACTYEAR} + \varepsilon
\end{aligned}
\tag{6}
\]

\[
\begin{aligned}
\mathrm{SALARY} ={}& \gamma_0 + \gamma_1 \mathrm{SLG}_{t-1} + \gamma_2 \mathrm{SLG}_{t-2} + \gamma_3 \mathrm{SLG}_{t-3} + \gamma_4 \mathrm{AVGBASES} \\
&+ \gamma_5\,\text{2001} + \gamma_6\,\text{2002} + \gamma_7\,\text{2003} + \gamma_8\,\text{2004} \\
&+ \gamma_9 \mathrm{AGE} + \gamma_{10} \mathrm{AGE}^2 + \gamma_{11} \mathrm{ALLSTAR} + \gamma_{12} \mathrm{GOLDGLOVE} + \eta
\end{aligned}
\tag{7}
\]

where SLG*_t is the rescaled slugging percentage in the current year and 2001 through 2004 denote the year dummy variables.

8 By convention, one point of a baseball percentage refers to 0.001. I derive the estimate for the marginal value of a point of slugging percentage from an OLS regression of salary on three lags of slugging percentage and a constant. The estimate of 29.5 ($29,500) is the sum of the coefficients on the lags of slugging percentage.

Because STATA's seemingly unrelated regression command sureg cannot handle unbalanced datasets where the number of observations differs between equations, I follow the method outlined in McDowell (2004). I scale the equations so that the error terms have equal variance and then combine the data into one panel with a variable indicating which equation the observation enters. I use the command xtgee to produce results equal to those of seemingly unrelated regression for unbalanced data.

For the question of how teams make smaller commitments, I predict how much players with upcoming options can expect to receive as free agents. I use this prediction as a proxy for the player's value to the team with the team option. To predict this value, I run the following regression to determine coefficients on key variables:

\[
\begin{aligned}
\mathrm{SALARY} ={}& \delta_0 + \delta_1 \mathrm{SLG}_{t-1} + \delta_2 \mathrm{SLG}_{t-2} + \delta_3 \mathrm{SLG}_{t-3} \\
&+ \delta_4 \mathrm{BASES}_{t-1} + \delta_5 \mathrm{BASES}_{t-2} + \delta_6 \mathrm{BASES}_{t-3} \\
&+ \delta_7\,\text{2001} + \delta_8\,\text{2002} + \delta_9\,\text{2003} + \delta_{10}\,\text{2004} \\
&+ \delta_{11} \mathrm{AGE} + \delta_{12} \mathrm{AGE}^2 + \delta_{13} \mathrm{ALLSTAR} + \nu
\end{aligned}
\tag{8}
\]

Because options only extend a player's contract for one year, in determining the coefficients I only include new free agents who sign one-year contracts. I then use the coefficient estimates and the data from players with options to predict their salaries, which I label PREDICTSAL. Once I have a player's predicted salary, I subtract from it the cost to the team of exercising the option, OPTIONCOST (the option amount minus any pre-negotiated buyout if the option is declined). This amount, if positive, predicts that the player's value to the team is higher than the cost of exercising the option. A negative outcome means the team's costs are higher than its predicted returns from keeping the player.

To test whether a higher value of PREDICTSAL - OPTIONCOST predicts that a team is more likely to exercise the option, I use two methods. First, I use the nonparametric Wilcoxon Rank-Sum Test. I order the players by PREDICTSAL - OPTIONCOST from lowest to highest and then sum the ranks of the contract options that were exercised. Because the numbers of options exercised and declined both exceed 10 in the sample, I use a normal approximation to find a p-value for the test of the null hypothesis that higher values of PREDICTSAL - OPTIONCOST have no relationship with whether a team exercises an option. Second, I use a logit regression to find the relationship between the PREDICTSAL - OPTIONCOST variable and a variable for whether the team exercises the option. My logit regression equation appears as follows:

\[
\mathrm{EXERCISE} = \alpha_0 + \alpha_1 (\mathrm{PREDICTSAL} - \mathrm{OPTIONCOST}) \tag{9}
\]

where EXERCISE is a binary variable that takes the value 1 if an option is exercised and 0 if the option is declined. I expect the Wilcoxon Rank-Sum Test and the logit regression to yield similar results.

5. Results and Discussion

The results from the seemingly unrelated regressions appear in Table 4 at the end of the paper. I will analyze the results from the player equation first and then the team equation results. Finally, I will compare the coefficients across equations.

As I would expect, lagged observations of slugging percentage strongly predict future slugging percentage, and the predictive power increases for the most recent observations. A 0.010 increase in SLG_{t-1} predicts, ceteris paribus, a 0.0047 increase in SLG_t. Similarly, 0.010 increases in SLG_{t-2} and SLG_{t-3} predict 0.0012 and 0.0005 increases in SLG_t, respectively. With p-values below 0.01, the first two lagged slugging percentages are clearly strong predictors of future performance. The ratio of the coefficients reveals that the slugging percentage of the previous year accounts for 72.8% of the variation explained by lagged slugging percentages. Even before considering the team's equation, this ratio's high value could explain why players might expect a jump in statistics in the contract year to lead to much larger free agent contracts.

Players indeed seem to outperform predicted performance levels when they are due to become free agents after the season. The coefficient on the dummy variable CONTRACTYEAR is 0.2795 (on the dollar scale), so being in the contract year predicts an increase in slugging percentage of 0.0095. Since the p-value for the coefficient is 0.115, I cannot claim with confidence that the relationship between the contract year and slugging percentage is real, but the estimate offers suggestive evidence that anecdotes of players trying harder in the contract year have an empirical basis.
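The conversion between the dollar scale and slugging percentage can be checked with a short calculation. The sketch below assumes, following the paper's footnote on the marginal value of a slugging point, that one full unit of slugging percentage corresponds to $29.5 million, so that one point (0.001) is worth $29,500. The function names are hypothetical, and the lag coefficients are the rounded values quoted in the text rather than the unrounded estimates in Table 4.

```python
MILLIONS_PER_SLG_UNIT = 29.5   # $29,500 per 0.001 of SLG, salaries in $ millions


def slg_change_from_dollar_coef(coef_millions: float) -> float:
    """Convert a coefficient on the dollar scale back to SLG units."""
    return coef_millions / MILLIONS_PER_SLG_UNIT


def lag_share(most_recent: float, *older: float) -> float:
    """Share of the summed lag coefficients due to the most recent lag."""
    return most_recent / (most_recent + sum(older))


# CONTRACTYEAR coefficient of 0.2795 on the dollar scale.
contract_year_effect = slg_change_from_dollar_coef(0.2795)
print(round(contract_year_effect, 4))

# Rounded player-equation effects of a 0.010 increase in each lag.
recent_share = lag_share(0.0047, 0.0012, 0.0005)
print(round(recent_share, 3))
```

With the rounded coefficients the most recent lag's share comes out near 73%, close to the 72.8% reported in the text; the small gap is presumably due to rounding of the quoted coefficients.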
This result agrees with several studies, including Scroggins (1993), Maxcy, Fort, and Krautmann (2002), and Marburger (2003), which find evidence of players altering behavior in anticipation of free agency. The behavior change found in this paper favors higher performance, while Scroggins (1993) concluded that a player's production falls during the same period. The result also differs from those of Harder (1989) and Krautmann (1993), which found no changes in behavior.

A player's total production, measured in total bases, over the previous three seasons also has a positive correlation with SLG_t. An increase of 10 total bases on average over the three previous seasons (equal to 10 additional singles, 5 additional doubles, or other combinations of hits summing to 10 total bases) predicts a 0.0016 increase in SLG_t. Because the lagged slugging percentages should already account for a batter's efficiency, this result seems to indicate that players who have more at-bats on average will have higher slugging percentages in future seasons. These players may be more durable and less affected by lingering injuries that could depress slugging percentages.

While these predicted effects seem like very small increases in SLG_t, even changes of a fraction of a percentage point can affect a player's value appreciably. For instance, in 2004, the mean slugging average for players with at least 50 at-bats was 0.396. This corresponded to the 46th percentile of the distribution. An increase of 0.010 to 0.406 would have raised the player to the 51st percentile. The effects of lagged slugging averages and being in a contract year therefore have a discernible economic impact.

The key off-field player characteristic, AGE, exhibits unexpected effects on SLG_t. I hypothesized that players would improve as they age but at a decreasing rate. The coefficient on AGE, however, is negative and large enough that its impact cannot be ignored. An increase in age by one year corresponds to a decrease in SLG_t of 0.0208. The p-value for the coefficient's difference from 0 is 0.016.
The positive coefficient for AGE^2, 0.0075 (0.0003 after scaling back to slugging percentages), is also the opposite sign of my expectation, with a p-value of 0.056. The effect of the linear AGE variable is stronger because the break-even point, at which a change in age does not predict any change in slugging percentage, occurs between 40 and 41 years old, which lies at the very top end of the age distribution of players. The improvement in skill due to experience and becoming accustomed to the Major Leagues may then be overrated or possibly counteracted by other effects associated with aging, such as the accumulation of injuries. Those players who last into their late 30s and early 40s may be the healthiest players. Because the injury-prone players usually retire earlier, the oldest players are not representative of the rest of the population, and their return to experience may be higher or their vulnerability to injury may be lower.

The last variables in the player equation are the year dummy variables. The coefficients on 2001 and 2002 are negative while those on 2003 and 2004 are positive. The only small p-value corresponds to the 2004 coefficient. The yearly effects appear to be large, as a player in 2003 can expect to have a slugging percentage 0.0142 higher than a player in 2002. As discussed previously, there could be several reasons that the yearly averages fluctuate, including the composition of the balls and the prevalence of steroid use. These results fail to distinguish among many possible reasons, but they do indicate that hitting averages are susceptible to the different yearly playing environments.

The signs of the coefficients in the team equation are similar, which signifies that general trends in the variables are well-established enough to predict both future performance and salaries. A 0.010 increase in slugging percentage in the most recent year predicts a salary increase of $138,813.
Equal increases in SLG_{t-2} and SLG_{t-3} generate predicted salary increases of $38,868 and $18,915, respectively, holding all other variables constant. Only the coefficient on SLG_{t-3} has a p-value above 0.01, and its p-value of 0.245 is low enough that its effect on free agent salaries could carry some weight. These predicted salary increases only apply to new free agent contracts, not to players who are under long-term contracts. The figures also represent the amount the player signed for and give no conclusive prediction on what the player was offered by other teams except that other offers likely fell below the predicted salary. The ratio of the coefficients on the lagged slugging averages shows that the slugging percentage from the most recent year explains 63.0% of the salary variation due to past slugging averages. This ratio is lower than the equivalent ratio in the player equation, a relationship I will explore below.

A player's durability, as measured by his average total bases in the previous three years while holding slugging averages constant, has a large impact on his free agent salary. An increase of 10 total bases on average over the three previous seasons yields a predicted salary increase of $128,598. These predicted changes in salary can have a large impact on a team's financial structure, although the changes are smaller in scale than those predicted in the player equation. An increase in free agent salary of $500,000 moves the mean salary from the 72nd to the 76th percentile.

Accounting for a player's value to a team beyond his batting statistics is more difficult, though the coefficients on ALLSTAR and GOLDGLOVE may capture some of this value. Having been an All-Star in at least one of the three previous seasons increases a player's predicted salary by $1.393 million. Because the lagged slugging averages and total bases average should cover much of a player's hitting value, this large salary increase due to All-Star status may indicate that a player's marketability is very important in determining free agent offers.
The effect of defensive ability, measured by number of Gold Gloves, is smaller. An additional Gold Glove corresponds to a salary increase of $64,161, and the coefficient's p-value of 0.343