Millersville University of Pennsylvania April 17, 2014
Project Objectives

- Model the horse racing process to predict the outcome of a race.
- Use the win and exacta betting pools to estimate probabilities in the trifecta and superfecta pools.
- Determine the optimal amounts to wager to maximize the utility of the optimal bettor's wealth.
Terminology

- Pari-mutuel wagering: a form of betting on the outcomes of an event in which the payoffs associated with each outcome are determined by the aggregated bets.
- Track take: the percentage of all bets placed that the race track keeps to cover its expenses and to furnish a profit for the track owners (typical values are 16–24%).
- Breakage: the practice of the tracks of rounding payoffs down to the next lowest multiple of $0.10 or $0.05.
Mathematical Model

Suppose horses 1, 2, ..., N will race.

- \(w_n\): amount bet on horse \(n \in \{1, 2, \ldots, N\}\).
- \(W\): total amount wagered by the public on the race,
  \[ W = \sum_{n=1}^{N} w_n. \]
- \(p_n\): the public's estimate of the probability that horse \(n\) will win,
  \[ p_n = \frac{w_n}{W}. \]
- \(T\): track take.
- \(O_n\): public odds on horse \(n \in \{1, 2, \ldots, N\}\),
  \[ O_n = \frac{W(1 - T) - w_n}{w_n} \text{ to } 1. \]
Return on a Wager

If the odds are i : j on horse n, then a $j bet will return $(i + j) if horse n wins.

Example: Let the track take be 1/6 and suppose that in a 3-horse race the following amounts are wagered by the public.

Horse   w_n    O_n
1       $100   4 : 1
2       $200   3 : 2
3       $300   2 : 3
W       $600

If $3 is bet on horse 3 and horse 3 wins, the bettor receives $5.
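The odds formula and the example can be checked with a short sketch. The function name `public_odds` is an illustrative choice, not something from the slides, and breakage (rounding of payoffs) is ignored here.

```python
def public_odds(wagers, take):
    """Return (public probabilities, odds-to-1) implied by a win pool.

    wagers: amounts w_n bet on each horse; take: track take T as a fraction.
    """
    total = sum(wagers)                      # W
    probs = [w / total for w in wagers]      # p_n = w_n / W
    odds = [(total * (1 - take) - w) / w for w in wagers]  # O_n
    return probs, odds


probs, odds = public_odds([100, 200, 300], take=1 / 6)
for n, (p, o) in enumerate(zip(probs, odds), start=1):
    print(f"horse {n}: p = {p:.4f}, odds = {o:.4f} to 1")

# A $3 bet on horse 3 returns stake times (O_3 + 1) if horse 3 wins.
print(f"return on $3 bet on horse 3: ${3 * (odds[2] + 1):.2f}")
```

Odds of 4, 1.5, and 0.6667 to 1 correspond to the 4 : 1, 3 : 2, and 2 : 3 entries in the table.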
Private Probabilities of Race Outcomes

Let \(\pi_n\) denote a private estimate of the probability that horse \(n \in \{1, 2, \ldots, N\}\) wins the race. The expected payoff of a unit bet on horse \(n\) is \(\pi_n (O_n + 1)\).

Suppose the public and private probability estimates are the same. Then
\[ \pi_n (O_n + 1) = \frac{w_n}{W} \left( \frac{W(1 - T) - w_n}{w_n} + 1 \right) = 1 - T. \]

Regardless of the amount bet, the bettor expects to lose the track take.
Positive Returns

A bettor can expect positive returns when \(\pi_n \ne p_n\) for some \(n \in \{1, 2, \ldots, N\}\) and when \(\pi_n (O_n + 1) > 1\). The public's estimate of which horse will win a race is usually very accurate. Bettors hoping to beat the track must develop their own set of probabilities for the outcomes of the races.
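The condition \(\pi_n (O_n + 1) > 1\) can be illustrated with a small sketch. The helper names are hypothetical, the pool is the 3-horse example with track take 1/6, and the private probabilities 0.27, 0.33, 0.40 are illustrative inputs. The sketch assumes the bettor's own wager is too small to move the odds.

```python
def expected_payoffs(private_probs, wagers, take):
    """Expected payoff of a unit bet on each horse: pi_n * (O_n + 1)."""
    total = sum(wagers)
    # From the odds formula, O_n + 1 = W (1 - T) / w_n.
    return [pi * total * (1 - take) / w for pi, w in zip(private_probs, wagers)]


def overlays(private_probs, wagers, take):
    """Horses (numbered from 1) whose expected payoff per unit bet exceeds 1."""
    payoffs = expected_payoffs(private_probs, wagers, take)
    return [n for n, value in enumerate(payoffs, start=1) if value > 1]


wagers = [100, 200, 300]
take = 1 / 6
public = [w / sum(wagers) for w in wagers]

# With private = public probabilities every unit bet returns 1 - T on average.
print(expected_payoffs(public, wagers, take))
# With differing private probabilities some horses may clear the threshold.
print(overlays([0.27, 0.33, 0.40], wagers, take))
```

With these inputs only horse 1 has an expected payoff above 1 (0.27 × 5 = 1.35).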
Stochastic Utility Model

Assumptions:

- A horse race is a probabilistic event.
- We seek to model the utility \(U_h\) of horse \(h\).
- Horse \(h\) possesses a vector of observable attributes, \(\mathbf{x}_h = \langle x_{h,1}, x_{h,2}, \ldots, x_{h,k} \rangle\).
- Horse \(h\) is ridden by a jockey having a vector of observable attributes, \(\mathbf{y}_h = \langle y_{h,1}, y_{h,2}, \ldots, y_{h,m} \rangle\).

Utility function: \(U_h \equiv U_h(\mathbf{x}_h, \mathbf{y}_h)\)
Decomposition of the Utility Function

The observable attributes of the horse and jockey may fail to adequately describe the utility of the horse. There may be errors in the observations of the attributes.

\[ U_h = V_h(\mathbf{x}_h, \mathbf{y}_h) + E_h(\mathbf{x}_h, \mathbf{y}_h) \]

- \(V_h\): deterministic part of the utility model.
- \(E_h\): random error portion of the utility model.
Assigning Probabilities

Assumption: Nature is rational in the sense that the horse with the greatest utility will win the race.

The probability that horse \(h \in \{1, 2, \ldots, N\}\) wins is then
\[ \pi_h = P(U_h \ge U_n \text{ for } n = 1, 2, \ldots, N) = P(V_h + E_h \ge V_n + E_n \text{ for } n = 1, 2, \ldots, N). \]

The random error terms of the model are assumed to follow the double exponential distribution with
\[ P(E \le x) = e^{-e^{-x}}. \]
Multinomial Logit Model

Assuming the error terms are i.i.d. with the double exponential distribution, then
\[ \pi_h = \frac{e^{V_h}}{\sum_{n=1}^{N} e^{V_n}} \quad \text{for } h \in \{1, 2, \ldots, N\}. \]

This enables the development of a tractable, multinomial logistic regression model.
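As a minimal sketch, the logit formula can be evaluated as follows. The function name and the sample utilities are illustrative assumptions. Subtracting the largest utility before exponentiating is a standard numerical safeguard and does not change the result.

```python
import math


def win_probabilities(v):
    """Multinomial logit: pi_h = exp(V_h) / sum_n exp(V_n)."""
    shift = max(v)  # guards against overflow; cancels in the ratio
    weights = [math.exp(x - shift) for x in v]
    total = sum(weights)
    return [x / total for x in weights]


# Hypothetical deterministic utilities for a 4-horse race.
print(win_probabilities([1.2, 0.4, 0.0, -0.5]))
```

Adding the same constant to every \(V_h\) leaves the probabilities unchanged, so only differences in utility matter.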
Deterministic Component of Utility

Assume that \(V_h\) is a linear function of the observable attributes of horse \(h\):
\[ V_h = \sum_{k=1}^{K} \theta_k Z_{h,k}(\mathbf{x}_h, \mathbf{y}_h). \]

The parameters \(\theta_k\) measure the relative importance of attribute \(Z_{h,k}\) in the determination of the winner of a race. The values of \(\theta_k\) for \(k \in \{1, 2, \ldots, K\}\) can be estimated from historical data.
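One common way to estimate the \(\theta_k\) is maximum likelihood under the multinomial logit model. The slides do not say which estimation procedure was used, so the sketch below is an assumption: plain gradient ascent on the log-likelihood, with made-up toy data (two attributes, four 3-horse races) and hypothetical function names.

```python
import math


def utilities(theta, race):
    """V_h = sum_k theta_k * Z_{h,k} for every horse in one race."""
    return [sum(t * z for t, z in zip(theta, row)) for row in race]


def softmax(v):
    shift = max(v)
    weights = [math.exp(x - shift) for x in v]
    total = sum(weights)
    return [x / total for x in weights]


def log_likelihood(theta, races, winners):
    """Sum over races of log(pi_winner) under the logit model."""
    return sum(math.log(softmax(utilities(theta, race))[win])
               for race, win in zip(races, winners))


def fit_theta(races, winners, n_attr, step=0.5, iters=2000):
    """Gradient ascent; gradient per race is Z_winner - E_pi[Z]."""
    theta = [0.0] * n_attr
    for _ in range(iters):
        grad = [0.0] * n_attr
        for race, win in zip(races, winners):
            probs = softmax(utilities(theta, race))
            for k in range(n_attr):
                expected = sum(p * row[k] for p, row in zip(probs, race))
                grad[k] += race[win][k] - expected
        theta = [t + step * g / len(races) for t, g in zip(theta, grad)]
    return theta


# Toy data (invented): each row is one horse's attribute vector Z_h.
races = [
    [[0.9, 0.2], [0.5, 0.1], [0.1, 0.4]],
    [[0.3, 0.3], [0.8, 0.2], [0.4, 0.1]],
    [[0.6, 0.1], [0.2, 0.5], [0.7, 0.3]],
    [[0.4, 0.2], [0.9, 0.3], [0.5, 0.4]],
]
winners = [0, 1, 2, 0]  # index of the winning horse in each race

theta_hat = fit_theta(races, winners, n_attr=2)
print(theta_hat, log_likelihood(theta_hat, races, winners))
```

In practice a statistics package would normally be used, since it also reports the standard errors needed for the hypothesis tests on each attribute.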
Model Attributes

The deterministic utility may include attributes such as:

- % won: percentage of races won by the horse during the past two years.
- post: post position of the horse.
- winnings/race: winnings per race during the current year.
- speed rating: average speed rating of the horse's last four races.
- weight: amount of extra weight the horse carries.
- jockey % won: percentage of races won by the jockey during the past two years.
Race Data

Data are available online, for example from the Daily Racing Form. Data can be "exploded" to create more race data: ignore the winner in an N-horse race and we have an (N − 1)-horse race. Races can be exploded to a depth of 3 and still give significant data. Hypothesis testing is conducted to determine which variables are predictive of utility.

"Searching for Positive Returns at the Track," Bolton & Chapman, Management Science, Vol. 32, No. 8 (1986), is a good introduction to this process.
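A minimal sketch of the explosion step follows. It assumes that "depth 3" means three races in total (the original race plus two derived ones). That reading, and the function name, are assumptions rather than something the slides spell out.

```python
def explode_race(finish_order, depth=3):
    """Turn one finishing order into up to `depth` (field, winner) pairs.

    finish_order lists horse identifiers from first place to last.
    Each derived race drops the previous winner from the field.
    """
    races = []
    for d in range(depth):
        field = list(finish_order[d:])
        if len(field) < 2:  # a one-horse "race" carries no information
            break
        races.append((field, field[0]))
    return races


# Horse 3 won, then 1, then 4, then 2.
for field, winner in explode_race([3, 1, 4, 2], depth=3):
    print(f"field {field} -> winner {winner}")
```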
Exotic Bets

The public's opinion of the probability of an outcome in the win pool is usually quite accurate. Therefore it is difficult to make a profit by placing bets to win. The public's estimates of probabilities associated with exotic bets are less accurate.

- exacta: first two finishers in order (minimum bet $1).
- trifecta: first three finishers in order (minimum bet $0.50).
- superfecta: first four finishers in order (minimum bet $0.25).

Odds for the trifecta, superfecta, and other exotic bets are not posted.
Henery Model (1 of 2)

Assume finishing times are normally distributed with unit variance, i.e., \(T_i \sim N(\theta_i, 1)\) for \(i = 1, 2, \ldots, n\). The Henery model estimates the probability that horse \(i\) wins as
\[ \pi_i = \int_{-\infty}^{\infty} \prod_{j \ne i}^{n} \left[ 1 - \Phi(u - \theta_j) \right] \phi(u - \theta_i) \, du \quad \text{for } i = 1, 2, \ldots, n. \]

For details see "Permutation Probabilities as Models for Horse Races," R. Henery, in Efficiency of Racetrack Betting Markets, Hausch, Lo, and Ziemba (eds.), (1994).
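The win-probability integral has no closed form but is easy to approximate numerically. The sketch below uses a plain trapezoid rule on a truncated interval. The grid size, the truncation width, and the function names are assumptions chosen for illustration.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def henery_win_prob(theta, i, steps=2000, width=8.0):
    """Approximate pi_i = int prod_{j != i} [1 - Phi(u - theta_j)] phi(u - theta_i) du."""
    lo = min(theta) - width
    hi = max(theta) + width
    h = (hi - lo) / steps
    total = 0.0
    for s in range(steps + 1):
        u = lo + s * h
        term = phi(u - theta[i])
        for j, t in enumerate(theta):
            if j != i:
                term *= 1.0 - Phi(u - t)  # horse j finishes after time u
        weight = 0.5 if s in (0, steps) else 1.0
        total += weight * term
    return total * h


theta = [-0.5, 0.0, 0.5]  # hypothetical mean finishing times
print([henery_win_prob(theta, i) for i in range(len(theta))])
```

Because \(T_i\) is a finishing time, the horse with the smallest \(\theta_i\) gets the largest win probability.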
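Henery Model (2 of 2)

The values of the improper integrals remain unchanged if the same constant \(\theta\) is added to each \(\theta_i\). The win probabilities \(\pi_i\) for \(1 \le i \le n\) can be estimated, so estimate \(\theta_i\) for \(1 \le i \le n\) by minimizing
\[ F(\boldsymbol{\theta}) = \sum_{i=1}^{n} \left( \int_{-\infty}^{\infty} \prod_{j \ne i}^{n} \left[ 1 - \Phi(u - \theta_j) \right] \phi(u - \theta_i) \, du - \pi_i \right)^{2} \]
where \(\boldsymbol{\theta} = \langle \theta_1, \theta_2, \ldots, \theta_n \rangle\), subject to
\[ \sum_{j=1}^{n} \theta_j = 0. \]

Trifecta probabilities are then
\[ \pi_{ijk} = \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{u + \theta_k - \theta_j} \Phi(v + \theta_j - \theta_i) \, \phi(v) \, dv \right] \prod_{t \ne i,j,k}^{n} \left[ 1 - \Phi(u + \theta_k - \theta_t) \right] \phi(u) \, du. \]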
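The trifecta integral can be approximated in a single pass by accumulating the inner integral while stepping through the outer one. This is a sketch under assumptions: trapezoid rule, a truncated grid, and hypothetical function names. Here \(v\) is the standardized time of the second-place horse \(j\) and \(u\) that of the third-place horse \(k\).

```python
import math
from itertools import permutations


def phi(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def henery_trifecta_prob(theta, i, j, k, steps=2000, width=8.0):
    """Approximate P(horse i first, j second, k third) under the Henery model."""
    c = theta[k] - theta[j]           # inner upper limit is v = u + c
    lo = -width - abs(c)
    hi = width + abs(c)
    h = (hi - lo) / steps
    others = [t for idx, t in enumerate(theta) if idx not in (i, j, k)]
    inner = 0.0                       # running inner integral up to v
    prev = Phi(lo + theta[j] - theta[i]) * phi(lo)
    total = 0.0
    for s in range(steps + 1):
        v = lo + s * h
        if s > 0:
            cur = Phi(v + theta[j] - theta[i]) * phi(v)
            inner += 0.5 * h * (prev + cur)
            prev = cur
        u = v - c                     # outer variable matching this v
        outer = inner * phi(u)
        for t in others:
            outer *= 1.0 - Phi(u + theta[k] - t)  # the rest finish after k
        weight = 0.5 if s in (0, steps) else 1.0
        total += weight * outer
    return total * h


theta = [-0.3, 0.1, 0.2]  # hypothetical values satisfying sum(theta) = 0
trifectas = {perm: henery_trifecta_prob(theta, *perm)
             for perm in permutations(range(3))}
print(trifectas)
```

In a 3-horse race the six trifecta probabilities should sum to 1, which gives a quick sanity check on the numerical scheme.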
Estimating Superfecta Probabilities

Let \(p_{ijkl}\) be the probability that entrants \(i\), \(j\), \(k\), and \(l\) finish in that order in 1st, 2nd, 3rd, and 4th place in a race.
\[ p_{ijkl} \approx p_{ij} \cdot \frac{p_k^{\tau}}{\sum_{s \ne i,j}^{n} p_s^{\tau}} \cdot \frac{p_l^{\lambda}}{\sum_{t \ne i,j,k}^{n} p_t^{\lambda}}. \]

- \(p_{ij}\) is the exacta probability that entrants \(i\) and \(j\) finish first and second in the race (available from the wager board).
- \(p_i\) is the probability that the \(i\)th entrant wins the race (available from the wager board).

Fit the parameters \(\tau\) and \(\lambda\) using historical data. The parameters appear to depend on the field size, N.
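The approximation is straightforward to evaluate once the win and exacta probabilities are in hand. In the sketch below the win probabilities, the exacta probability, and the values of \(\tau\) and \(\lambda\) are all invented for illustration. Real inputs would come from the wager board and from a fit to historical data.

```python
def superfecta_prob(i, j, k, l, win, exacta, tau, lam):
    """p_ijkl ~ p_ij * p_k^tau / sum_{s != i,j} p_s^tau
                     * p_l^lam / sum_{t != i,j,k} p_t^lam."""
    den_k = sum(p ** tau for s, p in enumerate(win) if s not in (i, j))
    den_l = sum(p ** lam for t, p in enumerate(win) if t not in (i, j, k))
    return exacta[(i, j)] * (win[k] ** tau / den_k) * (win[l] ** lam / den_l)


# Hypothetical 5-horse field (indices 0..4).
win = [0.40, 0.30, 0.15, 0.10, 0.05]
exacta = {(0, 1): 0.18}          # assumed exacta probability for 0 then 1
tau, lam = 0.8, 0.7              # assumed fitted parameters

print(superfecta_prob(0, 1, 2, 3, win, exacta, tau, lam))

# Summing over every possible 3rd and 4th finisher recovers p_ij.
rest = [h for h in range(len(win)) if h not in (0, 1)]
total = sum(superfecta_prob(0, 1, k, l, win, exacta, tau, lam)
            for k in rest for l in rest if l != k)
print(total)
```

The two ratio factors are each normalized over the horses still available, so the superfecta estimates for a fixed exacta add up to that exacta's probability.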
Placing Optimal Bets

Once private estimates of the winning probabilities for the horses have been made, the bettor must decide which bets to place.

- The bettor will be changing the payout odds by placing additional bets.
- Assume the bettor is risk-averse and therefore the utility function of the bettor is logarithmic.
- Assume no more bets are placed after the optimal bettor wagers.
- If a positive expected return results from placing a particular bet, the bet is called an overlay.
Example

Suppose our private estimate of the outcome probabilities is as follows.

Horse   w_n    O_n     π_n
1       $100   4 : 1   0.27
2       $200   3 : 2   0.33
3       $300   2 : 3   0.40
W       $600

Question: will a positive return result from placing a bet on horse 1?
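Placing a $2 Bet on Horse 1

After the bet is added to the pool, the odds become
\[ O_1 = \frac{(600 + 2)(1 - \tfrac{1}{6}) - (100 + 2)}{100 + 2} = 3.9183 \]
\[ O_2 = \frac{(600 + 2)(1 - \tfrac{1}{6}) - 200}{200} = 1.5083 \]
\[ O_3 = \frac{(600 + 2)(1 - \tfrac{1}{6}) - 300}{300} = 0.6722 \]

Our expected profit is
\[ E[b_1] = (0.27)(2)(3.9183) + (1 - 0.27)(-2) = \$0.655882. \]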
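The same arithmetic can be reproduced in a few lines. The function names are illustrative, and the sketch assumes, as the slides do, that no further bets enter the pool after ours.

```python
def odds_after_bet(wagers, n, bet, take):
    """Odds on horse n (0-based index) once our own bet joins the pool."""
    total = sum(wagers) + bet
    on_n = wagers[n] + bet
    return (total * (1 - take) - on_n) / on_n


def expected_profit(wagers, n, bet, private_prob, take):
    """Win `bet * odds` with probability pi_n, lose `bet` otherwise."""
    odds = odds_after_bet(wagers, n, bet, take)
    return private_prob * bet * odds - (1 - private_prob) * bet


wagers = [100, 200, 300]
print(round(odds_after_bet(wagers, 0, 2, 1 / 6), 4))
print(round(expected_profit(wagers, 0, 2, 0.27, 1 / 6), 6))
```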
Optimal Wager (1 of 2)

Question: what is the optimal wager to place on horse 1?

Answer: Let \(f(x)\) be the expected profit from a wager of \(x\) dollars,
\[ f(x) = (0.27x) \left( \frac{(600 + x)(1 - \tfrac{1}{6}) - (100 + x)}{100 + x} \right) - (1 - 0.27)x, \]
and maximize for \(x \ge 2\).
Optimal Wager (2 of 2)

[Figure: expected profit \(E[b_1]\) plotted against the wager \(b_1\) for \(0 \le b_1 \le 100\).]

\(E[b_1]\) is maximized at \(b_1 \approx \$20.48\) with a maximum of approximately $3.25. The optimal bet makes up 17% of the total bet on horse 1.
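The maximizer can be located numerically. The sketch below uses a ternary search, which is one simple option because the expected profit is concave on the interval shown in the plot. The search routine and the interval [2, 100] are choices made here, not necessarily the method used for the slides.

```python
def f(x, w_n=100.0, total=600.0, pi_n=0.27, take=1 / 6):
    """Expected profit of wagering x dollars on horse 1."""
    odds = ((total + x) * (1 - take) - (w_n + x)) / (w_n + x)
    return pi_n * x * odds - (1 - pi_n) * x


def maximize(func, lo, hi, tol=1e-8):
    """Ternary search for the maximum of a unimodal function on [lo, hi]."""
    while hi - lo > tol:
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if func(a) < func(b):
            lo = a
        else:
            hi = b
    return 0.5 * (lo + hi)


x_star = maximize(f, 2.0, 100.0)
print(f"optimal wager: ${x_star:.2f}")
print(f"expected profit: ${f(x_star):.2f}")
print(f"share of money on horse 1: {x_star / (100 + x_star):.0%}")
```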
Multiple Bets

The bettor has a bankroll of size \(B\). If there are N horses in a race, the bettor can place a vector of wagers \(\mathbf{b} = \langle b_1, b_2, \ldots, b_N \rangle\) subject to the constraints:
\[ 0 \le b_i \text{ for } i = 1, \ldots, N, \qquad \sum_{i=1}^{N} b_i \le B. \]

If the outcome of the race is \(k\), then the bankroll becomes
\[ \hat{B} = B + b_k (O_k + 1) - \sum_{i=1}^{N} b_i. \]
Maximizing the Bettor's Expected Wealth Utility

The bettor can place bets on as many outcomes as desired, so the general problem of maximizing the bettor's expected utility of wealth can be stated as
\[ \max_{b_1, \ldots, b_N} \sum_{i=1}^{N} \pi_i \, U\!\left( B + \frac{(1 - T)\, b_i}{w_i + b_i} \sum_{j=1}^{N} (w_j + b_j) - \sum_{j=1}^{N} b_j \right) \]
subject to the constraints
\[ 0 \le b_i \text{ for } i = 1, 2, \ldots, N, \qquad \sum_{i=1}^{N} b_i \le B. \]
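To make the objective concrete, the sketch below evaluates it for logarithmic utility, \(U = \ln\). It only evaluates the objective for a given bet vector and does not solve the optimization. The bankroll of $1000 and the function names are assumptions, and the pool is the 3-horse example.

```python
import math


def final_bankroll(bets, wagers, k, bankroll, take):
    """Bankroll after outcome k: B + b_k (O_k + 1) - sum(b), odds include our bets."""
    pool = sum(w + b for w, b in zip(wagers, bets))
    payoff = (1 - take) * pool * bets[k] / (wagers[k] + bets[k])
    return bankroll + payoff - sum(bets)


def expected_log_utility(bets, wagers, private_probs, bankroll, take):
    """Objective sum_i pi_i * ln(wealth if outcome i occurs)."""
    return sum(pi * math.log(final_bankroll(bets, wagers, k, bankroll, take))
               for k, pi in enumerate(private_probs))


wagers = [100, 200, 300]
private = [0.27, 0.33, 0.40]
B, T = 1000.0, 1 / 6

no_bet = expected_log_utility([0, 0, 0], wagers, private, B, T)
some_bet = expected_log_utility([20, 0, 0], wagers, private, B, T)
print(no_bet, some_bet)
```

With these numbers a $20 wager on horse 1 gives a slightly higher expected log-wealth than not betting at all.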
Remarks

- A good introduction to solving such problems is "Optimal bets in pari-mutuel systems," N. Levin, in Efficiency of Racetrack Betting Markets, Hausch, Lo, and Ziemba (eds.), (1994).
- The problem can be solved in closed form under the assumption of a fairly general class of utility functions.
- The problem can be generalized to the situation in which the bettor is not placing the last wagers on the outcomes (and still be solved).
Merit Order

Many algorithms for solving the problem are based on the concept of merit order.

Lemma: If there exist \(i\) and \(j\) in \(\{1, 2, \ldots, N\}\) for which
\[ \frac{\pi_i}{w_i} \ge \frac{\pi_j}{w_j} \]
and it is optimal to wager on outcome \(j\), then it is also optimal to wager on outcome \(i\).

Re-number the outcomes in decreasing merit:
\[ \frac{\pi_1}{w_1} \ge \frac{\pi_2}{w_2} \ge \cdots \ge \frac{\pi_N}{w_N}. \]
Optimal Solution

Lemma: Assuming the outcomes are numbered in decreasing order of merit, then for the optimal solution there exists \(0 \le r \le N\) such that
\[ b_i > 0 \text{ for } 1 \le i \le r, \qquad b_i = 0 \text{ for } r < i \le N. \]

If \(r = 0\), place no bets. If \(r = N\), bet on all outcomes.
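A short sketch of the ordering step: sort the outcomes by \(\pi_i / w_i\) and list the N + 1 candidate bet sets that the lemma allows. The function names are hypothetical, and choosing the best \(r\) and the bet sizes is left to the optimization itself.

```python
def merit_order(private_probs, wagers):
    """Outcome indices (0-based) sorted by decreasing merit pi_i / w_i."""
    merit = [p / w for p, w in zip(private_probs, wagers)]
    return sorted(range(len(merit)), key=lambda i: merit[i], reverse=True)


def candidate_bet_sets(order):
    """The lemma restricts the support of the optimal bets to a prefix of the order."""
    return [order[:r] for r in range(len(order) + 1)]


order = merit_order([0.27, 0.33, 0.40], [100, 200, 300])
print(order)
print(candidate_bet_sets(order))
```

Only N + 1 supports need to be examined instead of all \(2^N\) subsets of outcomes.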
Additional Constraints

At smaller tracks and in exotic bets, the bettor's optimal wager should not make up too high a proportion of the betting pool or of the total of the wagers on any outcome. Possible additional constraints:

- Restrict bets to only overlays.
- Bet on only one outcome per race.
- Place only the minimum bet on an outcome.
Numerical Solution

The problem is a nonlinear, constrained, global optimization. For a superfecta wager on a field of N = 14 horses, optimization takes place over
\[ \frac{14!}{(14 - 4)!} = 24024 \]
variables, so sometimes only local maxima are computed.
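The variable count is the number of ordered selections of 4 horses from 14, which can be confirmed directly. This is a sketch for checking the arithmetic only.

```python
import math
from itertools import permutations

# Ordered ways to fill 1st through 4th place from a field of 14.
n_outcomes = math.perm(14, 4)
print(n_outcomes)

# The same count by brute-force enumeration of the superfecta outcomes.
enumerated = sum(1 for _ in permutations(range(14), 4))
print(enumerated)
```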