As time goes by - Using time series based decision tree induction to analyze the behaviour of opponent players

As ime goes by - Using ime series based decision ree inducion o analyze he behaviour of opponen players Chrisian Drücker, Sebasian Hübner, Ubbo Visser, Hans-Georg Weland TZI - Cener for Compuing Technologies Universiy of Bremen D-28334 Bremen, Germany grp-rcinern@zi.de WWW home page: hp://www.virualwerder.de/ Absrac. Wih he more sophisicaed abiliies of eams wihin he simulaion league level online funcions become more and more aracive. Las year we proposed an approach o recognize he opponens sraegy and developed he online coach accordingly. The coach was able o deec heir sraegy and hen passed his informaion ogeher wih appropriae counermeasures o his eam. However, his approach gives only informaion abou he enire eam and is no able o deec significan siuaions (e.g. double pass, sandard siuaions, repeaed paerns beween wo or hree players). In his paper we describe a new mehod for ime series which is able o describe he ime series by qualiaive absracion and produces samples which hen can be used for inducive learning mehods such as decision rees. 1 Inroducion As he RoboCup 2000 in Melbourne showed, he differences beween he echnical abiliies of he major eams aren as big as in previous world-cups. Due o eams which share heir progress wih he RoboCup communiy by releasing heir source code shorly afer he even - a big hanks o CMU a his poin -, every eam is able o easily arrange eleven prey good agens. Fuure eams will have o develop a superior eam behaviour o win a ournamen. This means a big sep forward in erms of muliagen-sysem research. Version 7.0 of he soccer server exended he abiliies of he online coach furher, so ha is use becomes even more ineresing. I is sill he mos effecive insrumen o analyze he opponen, because i possesses all he informaion abou he simulaed environmen. Therefore, i was imporan for us o coninue he developmen of he online coach which we used a he RoboCup 2000. In [Visser e al., 2001] we describe how he old coach deermines he opponen acical formaion wih a neural nework and how i is able o change our eam formaion during a mach.

This year our main arge is he collecion of more informaion abou he opposing eam. On one hand we wan o recognize more aspecs of he opponens global acics. Beside of heir formaions, we are ineresed in heir defense behaviour (e.g. he use of offside raps; cover or zone defense) and heir preferred offensive plays (e.g. wing play or cener breakhroughs; quick passing or long dribbling), bu also in heir abiliy o cope wih such behaviours when hey are used by our eam. On he oher hand we wan o analyze an individual opponen agen o deec which ask in heir sysem i fulfills. To deermine his, we have o observe how a cerain player reacs o cerain siuaions. Under which circumsances does an agen pass he ball o a fellow player? Why does i pass he ball in his siuaion o jus his player? When does an opponen player shoo a our goal? When does a defense player aack our ball carrier? If we know how an opponen agen reacs in a cerain siuaion, we can use his informaion o be always one sep ahead of he opponen. For example, if we realize ha he opponen forwards only ry o score if hey have enough space (as i seemed o be performed by he Brainsormers agens), we could preven hem from shooing a he goal by placing our defenders close enough o hem. Similar work has been done over he las couple of years by [Wünsel e al., 2000], [Frank e al., 2000], and [Raines e al., 1999]. Our objecive is o find a mehod ha can improve he behaviour of our eam during he game and ha no only shows us how we could do beer nex ime. Every year new eams debu in RoboCup whose agens differ in heir behaviours from all ha exised before. Therefore, off-line analysis can prepare our eam for exising agens only, whose binaries have been released during he year. Unforunaely only a few eams did ha early enough in he pas. Bu even if all would, we couldn assume ha our opponens didn made major changes in heir behaviour. 2 Qualiaive absracion of ime series A new mehod for he qualiaive absracion of muliple ime series has been developed a he Cener for Compuing Technologies [Boronowsky, 2001]. This mehod seems especially suiable o us in order o deliver paerns which we hen use as a basis for learning algorihms. The mehod has been developed o analyze coninuous valued ime series. I uses decision ree inducion and herefore generaes rules abou he analyzed ime series, in our case he observed players. The special feaure of his mehod is he way he coninuous valued aribues are discreized. Cerain rules, generaed by he decision ree, can be evaluaed auomaically by our coach. This informaion can be used o improve our agens behaviour in he running game. Addiionally, all rules can can be evaluaed by hand afer he game has ended. The resuls of he evaluaion by hand can be used o improve our agens basic behaviour. A big advanage of he used mehod is, ha he discreizaion of coninuous valued aribues is done auomaically by our sysem.

f i hreshold y f () i Classes B A q() = λ( ) r, r qualiaive absracion λ( r, ) r() sar end Fig. 1. Qualiaive absracion and horizonal spliing Thus, here will be no problem wih he decision ree inducion no generaing a proper resul, if an adverse discreizaion has been chosen by he designer. The discreizaions resuling from he mehod paricularly hold valuable informaion abou he opposing eam. Discreizaion of coninuous valued ime series is he main ask of he mehod. The mehod uses an very efficien algorihm o solve his problem. To analyze our opponen we record several ime series F. Supplemenary mahemaical ransformed ime series can be added in order o improve he resuls. One of he ime series r F mus be qualiaively absraced. This special series gives he classes o be learned, e.g. a ime series conaining: goalie leaves goal, goalie says in goal, and goalie reurns o goal. By he qualiaively absraced ime series ime slices on all ime series are assigned o he classes. One of he ime series f i is spli horizonally a a hreshold y (fig. 1). To deermine his hreshold all possible spli poins mus be evaluaed wih a special heurisic. The bes spli poin is he one which separaes he differen classes he bes. All he oher ime series are spli verically (fig. 2) a all poins a which he ime series f i crosses he hreshold y. Wih hese separaed graphs he process is recursively repeaed.

f 1 hreshold y f 1 () f 2 f 2 () f 3 f 3 () sar end Fig. 2. verical spliing To apply he uniform disribuion heorem of he enropy minimalizaion heurisic he ime series mus be approximaed by parially linear funcions. The used mehod is based on he enropy minimalizaion heurisic as used in ID3 [Quinlan, 1986] and C4.5 [Quinlan, 1993]. An aribue is spli a he poin a which he heurisic is minimal. Fayyad and Irani [Fayyad and Irani, 1992] have proved ha his can only happen a class boundaries, we call hese poins boundary poins (BP). By using his knowledge an imporan increase in efficiency can be reached. Bu i only works properly for non-overlapping classes. When wo classes are overlapping he number of BPs can increase dramaically. In he wors case here can be BP beween all examples in he overlapping area. The mehod we use in his work allows an efficien spliing of coninuous valued aribues in hese cases. Boronowsky [Boronowsky, 2001] shows ha his problem can be solved by using inervals of uniform disribuion. This can be achieved by a linear approximaion of he real disribuion. He also explains ha he enropy minimalizaion heurisic can only be minimal a he joins of he approximaion.

So far he mehod was used in he sysem ExraKT. ExraKT is a sysem o aid humans in analyzing ime series. I generaes hypoheses abou coherences in ime series which can hen be inerpreed by a human analys. The user can also manipulae he decision ree inducion o ener his own knowledge ino he process. The sysem was used o analyze simulaed echnical sysems. In hese ess also sudden occurring failures in he echnical sysems were simulaed. ExraKT didn know abou hese errors beforehand, bu hey were represened in he decision ree and could easily be found by he user. This aspec is very ineresing for he use in he RoboCup compeiion. In a RoboCup mach changes in an agen s behaviour can no only happen due o an error in he agen, bu also because he eam has changed is acic. 3 New Coach In order o improve our abiliies o analyze he opposing eam we inegraed he sysem ino he online coach. The coach coninuously racks cerain aspecs of he game, e.g. he posiions of he players, heir disance o each oher and heir disance o he ball. They are sored in memory forming several ime series which are analyzed in regular inervals by he sysem menioned above. As a resul we receive rules abou he opponens behaviour. Some of hese rules can be used insanly wihin he game o improve he behaviour of our players and o adap o he opponen. To achieve his he coach mus have some a priori knowledge abou he rules o expec and he suiable changes o our behaviour or playing syle. Addiionally, here is he possibiliy o exrac he colleced daa and rules a he end of he game. This informaion can be revised and correced by a human exper. They can hen be used for furher developmen of he eam or o improve he auomaic exracion of rules by he coach. In he firs sep we use he described mehod o analyze he opponens goal keeper. The goalie is pariculary suiable, because his role in he eam is fixed. He is he only player who s funcion is known from he beginning of he game; all he oher players can only be classified as forwards, mid-fielders or defenders by heir behaviour in he game, so we focus on he keeper for our iniial work. Addiionally, he goalie is a very imporan player. The knowledge of his srenghs and weaknesses can be used o opimize he behaviour and configuraion of our forwards, leading o beer scoring abiliies. We also wan o use he mehod o analyze our own goal keeper. This way we can es he qualiy of our sysem by cross checking he resuls wih he deails of our implemenaion. Besides ha we may find some aspecs in he behaviour of our goalie which could be improved furher. To prepare he analysis our coach sores some ime series of game aspecs which are relaed o he goal and he goal keeper. This includes he disance beween he ball and he goal,

he disance beween he ball and he goal keeper, he disance beween he goal keeper and he goal, he number of opponens and eam maes wihin he penaly area, he number of eam maes which may inercep he ball when kicked owards he goal. For us he mos imporan ask is o deermine, under which circumsances he goal keeper sars o leave he goal o ge he ball and when he sars o reurn o i. According o his problem we have chosen a suiable qualiaive absracion. This absracion uses he change in he ime series corresponding o he disance of he goal keeper o he goal. This is used o deermine wheher he goal keeper is leaving he goal, is saying in he goal or is reurning o he goal The siuaions in which he keeper is leaving he goal or is reurning o i are of special ineres for us, because in hese momens he can be aken by surprise more easily. We plan o modify and exend he mehods used o analyze he goal keeper o he oher pars of he eam in he nex sep. Then we will be able o improve he abiliy of all players of our eam o adap o he behaviour of he opponen. The forwards appear o be paricular ineresing o us. By finding special rules of aacks we can srenghen our defense by using suiable formaions and behaviours. 4 Usage of he colleced informaion Afer collecing hose informaion, we use hem o improve our performance. The off-line usage is raher simple, if we sore he daa in a human readable form. During a mach he coach has o broadcas is recogniions. As some of he informaion surely can be expressed in he sandardized language of he coaches, we ll sill have o wai for an inerrupion of he game, o send a message. The communicaion works jus like he change of our formaion, which we described in Visser e al. [2001]. The coach generaes an announcemen, which every agen can parse. If he coach recognizes ha he opponen goalie leaves his goal every ime when here is none of his defenders beween our ball carrier and him, he could insruc our forwards o pull him o he wings and pass he ball o an unguarded eammae in he cener. If he goalie only leaves he goal if he forward achieves a cerain disance o his goal, our agens should ry o score from abou his disance. If he goalie exremely shifs owards he ball during our aacks, quick wing plays should be effecive. Fig. 3 shows an 2-vs-1 siuaion (he hree defenders beneah don really affec he play). The goalie decides o aack he ball carrier, so as he forward passes he ball o his eammae, he ges a big opporuniy o score. If we assume ha he decision of he goalie is deerminisic, he should reac in he same way

Fig. 3. 2-vs-1 siuaion during he nex 2-vs-1 siuaion. The coach should insruc he agens, ha he supporing player should depar a lile more, o ge in an even beer scoring posiion. If he goalie had suspeced and inerceped he pass, he coach would have old he agens ha he ball carrier should dribble a lile longer nex ime. The couner acions of he coach have o be augh by a human insrucor. They have o be parameerizable, so no every possible siuaion has o be augh manually. E.g. a goalie, which leaves his goal a a lile differen disance o he ball, shouldn cause a complee differen behaviour of our agens. 5 Resuls As he described approach is a brand new mehod and has jus been released we have o admi ha here are only a few resuls so far. Currenly, we are developing an inerface for he RoboCup environmen such ha he mehod is provided wih he ime series informaion during a game. Firs es have shown ha he approach is very promising, especially because of he abiliy o discreize auomaically. We expec he mehod o provide he online coach wih informaion such as proposiional rules, e.g. if he forward is closer han 10m and he forward has he ball hen he keeper is approaching he ball. In he final version of his paper we will show he resuls and discuss hem in deail. Acknowledgemen We would like o hank Michael Boronowsky for he work he has done helping us geing sared wih his mehod. References [Boronowsky, 2001] Boronowsky, M. (2001). Diskreisierung reellweriger aribue mi gemischen koninuierlichen gleichvereilungen und ihre anwendung bei der zeireihenbasieren enscheidungsbaumindukion. In DISKI, volume 246, S. Augusin. infix Verlag.

[Fayyad and Irani, 1992] Fayyad, U. and Irani, K. (1992). On he handling of coninuous-valued aribues in decision ree generaion. In Machine Learning, volume 8, pages 87 102. [Frank e al., 2000] Frank, I., Tanaka-Ishii, K., Arai, K., and Masubara, H. (2000). The saisics proxy server. In The Fourh Inernaional Workshop on RoboCup, pages S.199 204. [Quinlan, 1986] Quinlan, J. (1986). Inducion of decision rees. In Machine Learning, volume 1. [Quinlan, 1993] Quinlan, J. (1993). C4.5 Programs for Machine Learning. Morgan Kaufmann. [Raines e al., 1999] Raines, T., Tambe, M., and Marsella, S. (1999). Auomaed assisans o aid humans in undersanding eam behabiors. In Proceedings of he Third Inernaional Workshop on Robocup, pages S.85 104. [Visser e al., 2001] Visser, U., Drcker, C., Hbner, S., Schmid, E., and Weland, H.-G. (2001). Recognizing formaions in opponen eams. In RoboCup-00, Robo Soccer World Cup IV, Lecure Noes in Compuer Science, Melbourne, Ausralia. Springer- Verlag. o appear. [Wünsel e al., 2000] Wünsel, M., Polani, D., Uhmann, T., and Perl, J. (2000). Behavior classificaion wih self-organizing maps. In The Fourh Inernaional Workshop on RoboCup, pages S.179 188.