Measuring Match Importance in the Malaysian Super League Based on Win-Lose-Draw Results

N. ABDUL HAMID (1), M. R. NORAZAN (2), G. KENDALL (3) & N. MOHD-HUSSIN (4)

(1) Centre for Information Technology, Universiti Teknologi MARA, 40450 Shah Alam, Selangor
(2) Centre for Statistical Studies and Decision Sciences, Universiti Teknologi MARA, 40450 Shah Alam, Selangor
(3) School of Computer Science, University of Nottingham, Jalan Broga, 43500 Semenyih, Selangor and University of Nottingham, UNITED KINGDOM
(4) Centre for Information Technology, Universiti Teknologi MARA Perlis, 02600 Arau, Perlis

nhayati@tmsk.uitm.edu.my (1), norazan@tmsk.uitm.edu.my (2), Graham.Kendall@nottingham.edu.my (3), Graham.Kendall@nottingham.ac.uk (3), naimahmh@perlis.uitm.edu.my (4)

Abstract: - Malaysian football is witnessing a decrease in stadium attendance, so scheduling fixtures into timeslots in a way that maximises the number of supporters is critical to the league administrators. In this study, we propose a simple match importance measure based on a final ranking score. Comparing our results with the fixtures previously generated using a Weighted Goal Score procedure, we find that the fixtures generated by the proposed approach are similar to those of our previous work, while using a simpler model.

Key-Words: - match importance, ranking scores, scheduling fixtures

1 Introduction
This work is a continuation of our previous work, which aims to define the level of importance of each fixture in the Malaysian Super League. Malaysian football is witnessing a decrease in stadium attendance, and scheduling fixtures into timeslots so as to maximise the number of supporters is critical to the league administrators. In our previous work [1] we applied WGS (Weighted Goal Score) in order to calculate the priority of each fixture.

According to Dobson and Goddard [2], there are two distinct strands of empirical literature on modelling the outcome of matches in football (soccer): modelling the number of goals scored and conceded, and modelling win-lose-draw results. The difference between the forecasting abilities of these two models appears to be relatively small. This paper tests this hypothesis for Malaysian football.

The aim of this study is to investigate a different method of measuring match importance. We propose to measure match importance using win-lose-draw results from the previous season. The results are used as an input to produce the fixtures for the next season, with the aim of maximising stadium attendance.

ISBN: 978-1-61804-106-7 91
2 Background
Modelling match outcomes using win-lose-draw results has been reported in [3], [4], and [5]. In [5] a comparison was undertaken that considered two models: goals scored and conceded, and win-draw-lose match results. It concluded that the difference between the forecasting ability of the two models was relatively small.

Lebovic and Sigelman [6] use a ranking methodology to forecast games in American college football. Their method analyses week-to-week changes in the ranking, utilising a logistic regression model. In [7] a ranking method based on a Bradley-Terry type model is used to predict the winners of men's professional tennis matches. In our work, we use the final ranking of the league to derive the match importance for each pair of teams, and use this measure of importance to produce the fixtures for the next season.

In [8], the performance of a Bayesian network is compared with other machine learning techniques for predicting match results for Tottenham Hotspur Football Club. This is, however, not applicable to this research, in that we use the final ranking (win-lose-draw) to measure match importance. The use of ranking methods is also reported in [9] for professional tennis, where the authors test whether differences in ranking between individual players are good predictors of Grand Slam tennis outcomes.

3 Methodology
We utilise a win-lose-draw model, drawing on the results from the previous season, to determine whether the fixtures produced are similar to those produced when using a Weighted Goal Score model. We use the data for seasons 2007-2008, 2009 and 2010. The win-lose-draw (WLD) model is as follows:

i. $P_t$ = points achieved by each team, t, throughout the season, calculated following the FIFA rule (win = 3 points, draw = 1 point, and lose = 0)
ii. $T$ = total number of points for all teams
iii. $W_t = P_t / T$

Once we have the weight for each team, we multiply the weights of team a and team b to get the level of importance, l, of the match between team a and team b. In order to distinguish between home and away games, we add 10% of the weight of the home team. For example, if team a is playing at home against team b, the calculation is $l = (W_{ta} \times W_{tb}) + (W_{ta} \times 0.1)$. We use l as the measure of match importance in the same mathematical model as in our previous work [1], in order to generate schedules based on the final ranking of the season.

4 Results and Discussion

4.1 Season 2007-2008
Table 1 presents the calculation of weights for each team in season 2007-2008. The level of importance of each match is shown in Table 2.

Table 1: Weights for each team in Season 2007-2008

Table 2: Level of importance using final ranking (season 2007-2008)

The levels of importance of the fixtures obtained using win-lose-draw were sorted in descending order and compared to the ranking obtained using WGS (Table 3). We found that the results are similar. In fact, for the match between Kedah and Perak, both methods ranked it fifth. Similarly, the match between Penang and DPMM is placed 148th by both methods, among the unimportant matches. We would expect the schedules produced using the two measures to be similar, as we are able to solve the problem using a deterministic, optimal approach.

Table 3: Comparison of ranking using win-lose-draw and using WGS for Season 2007-2008.

No.   Fixture                   Level of importance (WLD)   Rank using WLD   Rank using WGS
1     KEDAH-N.SEMBILAN NAZA     0.0268                      1                2
2     KEDAH-JOHOR               0.0262                      2                11
3     KEDAH-SELANGOR            0.0259                      3                1
4     N.SEMBILAN NAZA-KEDAH     0.0250                      4                20
5     KEDAH-PERAK               0.0248                      5                5
6     JOHOR-KEDAH               0.0239                      6                25
7     KEDAH-TERENGGANU          0.0236                      7                10
8     SELANGOR-KEDAH            0.0234                      8                42
9     KEDAH-PERLIS              0.0233                      9                57
10    N.SEMBILAN NAZA-JOHOR     0.0225                      10               45
...
135   PENANG-PERLIS             0.0071                      135              133
136   UPB MYTEAM-PENANG         0.0070                      136              112
137   DPMM-PENANG               0.0070                      137              113
138   SARAWAK-N.SEMBILAN NAZA   0.0067                      138              37
139   UPB MYTEAM-SARAWAK        0.0066                      139              149
140   DPMM-SARAWAK              0.0066                      140              150
141   SARAWAK-JOHOR             0.0066                      141              85
142   PENANG-PAHANG             0.0065                      142              109
143   SARAWAK-SELANGOR          0.0065                      143              32
144   SARAWAK-PERAK             0.0062                      144              69
145   PENANG-PDRM               0.0060                      145              126
146   SARAWAK-TERENGGANU        0.0059                      146              83
147   PENANG-UPB MYTEAM         0.0058                      147              103
148   PENANG-DPMM               0.0058                      148              148
149   SARAWAK-PERLIS            0.0058                      149              128
150   N.SEMBILAN NAZA-PENANG    0.0055                      150              81
151   SARAWAK-PAHANG            0.0054                      151              104
152   PENANG-SARAWAK            0.0051                      152              154
153   SARAWAK-PDRM              0.0049                      153              122
154   SARAWAK-UPB MYTEAM        0.0048                      154              96
155   SARAWAK-DPMM              0.0048                      155              145
156   SARAWAK-PENANG            0.0044                      156              115

4.2 Season 2009
Table 4 shows the weights for each team in season 2009, and Table 5 shows the level of importance of each match.
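As a minimal sketch of how team weights and levels of importance of this kind follow from the WLD definitions in Section 3, the Python fragment below computes P_t, T, W_t and l for a toy league. The team names, the win-draw-loss records and all function names are hypothetical and are not taken from the paper's data; only the 3-1-0 points rule, the weight W_t = P_t / T and the home adjustment of 0.1 come from the text.

```python
from itertools import permutations

# Hypothetical (wins, draws, losses) records; NOT data from the paper.
RECORDS = {
    "Team A": (6, 2, 0),
    "Team B": (4, 1, 3),
    "Team C": (2, 1, 5),
}

HOME_BONUS = 0.1  # the 10% home adjustment described in Section 3


def season_points(wins, draws, losses):
    """P_t: 3 points per win, 1 per draw, 0 per loss."""
    return 3 * wins + 1 * draws + 0 * losses


def team_weights(records):
    """W_t = P_t / T, where T is the total points of all teams.

    Assumes at least one team has scored a point (T > 0).
    """
    points = {team: season_points(*wdl) for team, wdl in records.items()}
    total = sum(points.values())
    return {team: p / total for team, p in points.items()}


def match_importance(weights, home, away, home_bonus=HOME_BONUS):
    """l = (W_home * W_away) + (W_home * home_bonus)."""
    return weights[home] * weights[away] + weights[home] * home_bonus


def all_fixtures_by_importance(records):
    """Return every ordered (home, away) pair, most important first."""
    weights = team_weights(records)
    levels = {
        (home, away): match_importance(weights, home, away)
        for home, away in permutations(weights, 2)
    }
    return sorted(levels.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for (home, away), level in all_fixtures_by_importance(RECORDS):
        print(f"{home} v {away}: {level:.4f}")
```

Because the home adjustment is applied only to the home team's weight, the two fixtures between the same pair of teams receive different levels of importance, and the fixture hosted by the higher-ranked team scores higher.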
As for season 2007-2008, the levels of importance of the fixtures obtained using WLD are sorted in descending order and then compared to the ranking obtained using WGS in Table 6. We found that the results are again similar. In fact, for the match between Selangor and Terengganu, both methods ranked it 5th. Similarly, Kelantan and Kedah is placed 40th, and Pahang and N.Sembilan is placed 161st, by both methods. Again, we would expect the schedules resulting from the two importance measures to be very similar.

Table 4: Weights for each team in Season 2009
Table 5: Level of importance using final ranking (season 2009)

Table 6: Comparison of ranking using win-lose-draw and using WGS (Season 2009)

4.3 Season 2010
Table 7 shows the weights for each team in season 2010, and Table 8 shows the level of importance of each match.

Table 7: Weights for each team for Season 2010

As for season 2009 above, the levels of importance of the fixtures obtained using WLD are sorted in descending order and compared to the ranking obtained using WGS, as shown in Table 9. The two models, again, produce similar results. The match between Perlis and Kuala Lumpur is ranked 139th in both models. Similarly, the match between Johor and Pahang is ranked 164th in both models. As in the other seasons, we would expect our deterministic, optimal procedure to generate similar fixtures from the two match importance measures.
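The sort-and-compare step applied in each season can be sketched as follows. This is an illustrative fragment only: the fixture names and importance values are hypothetical, the function names are our own, and the mean absolute rank difference is an assumed summary statistic that the paper itself does not report (the paper compares the two rankings by inspecting the tables).

```python
def rank_fixtures(importance):
    """Map each fixture to its 1-based rank, highest importance first.

    Ties keep their insertion order because Python's sort is stable.
    """
    ordered = sorted(importance, key=importance.get, reverse=True)
    return {fixture: pos for pos, fixture in enumerate(ordered, start=1)}


def compare_rankings(rank_a, rank_b):
    """Return fixtures ranked identically and the mean absolute rank gap."""
    common = rank_a.keys() & rank_b.keys()
    gaps = {f: abs(rank_a[f] - rank_b[f]) for f in common}
    same = sorted(f for f, gap in gaps.items() if gap == 0)
    mean_gap = sum(gaps.values()) / len(gaps)
    return same, mean_gap


# Hypothetical importance levels under two measures; NOT from the paper.
wld = {("A", "B"): 0.2125, ("B", "A"): 0.195,
       ("A", "C"): 0.1375, ("C", "A"): 0.105}
wgs = {("A", "B"): 0.30, ("A", "C"): 0.25,
       ("B", "A"): 0.20, ("C", "A"): 0.10}

same, mean_gap = compare_rankings(rank_fixtures(wld), rank_fixtures(wgs))

if __name__ == "__main__":
    print("Ranked identically:", same)
    print("Mean absolute rank difference:", mean_gap)
```

A summary of this kind would complement the individual matches quoted from the comparison tables, since it takes every fixture into account rather than a few selected rows.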
Table 8: Level of importance using final ranking (season 2010)

Table 9: Comparison of ranking using win-lose-draw and using WGS (Season 2010)

5 Conclusion
We have proposed WLD as a measure of match importance. The objective was to find out whether it would produce similar results to the WGS method. We provided a comparison across three seasons (2007-2008, 2009, and 2010) of the Malaysian football league and found that the results are very similar. This is consistent with previous work [5], which found only a small difference between models based on goals scored and conceded and models based on win-lose-draw results.

Acknowledgments
The authors would like to thank the Ministry of Higher Education (MOHE) Malaysia and Universiti Teknologi MARA Malaysia for supporting this research with the Research University Grant No. 600-RMI/ST/FRGS 5/3/FST (240/2010).
References:
[1] Abdul-Hamid, N., et al., A Statistical Method to Measure Match Importance in the Malaysian Super League, in 8th International Conference on the Practice and Theory of Automated Timetabling. 2010. Belfast, Ireland.
[2] Dobson, S. and J. Goddard, Forecasting scores and results and testing the efficiency of the fixed-odds betting market in Scottish league football, in Statistical Thinking in Sports, J.H. Albert and R. Koning, Editors. 2008, Chapman & Hall/CRC: Boca Raton. p. 91-109.
[3] Koning, R.H., Balance in competition in Dutch soccer. Statistician, 2000. 49: p. 419-431.
[4] Goddard, J. and I. Asimakopoulos, Forecasting football results and the efficiency of fixed-odds betting. Journal of Forecasting, 2004. 23(1): p. 51-66.
[5] Goddard, J., Regression models for forecasting goals and match results in association football. International Journal of Forecasting, 2005. 21(2): p. 331-340.
[6] Lebovic, J.H. and L. Sigelman, The forecasting accuracy and determinants of football rankings. International Journal of Forecasting, 2001. 17(1): p. 105-120.
[7] McHale, I. and A. Morton, A Bradley-Terry type model for forecasting tennis match results. International Journal of Forecasting, 2011. 27(2): p. 619-630.
[8] Joseph, A., N.E. Fenton, and M. Neil, Predicting football results using Bayesian nets and other machine learning techniques. Knowledge-Based Systems, 2006. 19(7): p. 544-553.
[9] del Corral, J. and J. Prieto-Rodríguez, Are differences in ranks good predictors for Grand Slam tennis matches? International Journal of Forecasting, 2010. 26(3): p. 551-563.