Offensive Strategy in the 2D Soccer Simulation League Using Multi-group Ant Colony Optimization


International Journal of Advanced Robotic Systems

ARTICLE

Regular Paper

Shengbing Chen*, Gang Lv and Xiaofeng Wang

Key Lab of Network and Intelligent Information Processing, Department of Computer Science and Technology, Hefei University, Hefei, China
*Corresponding author(s) E-mail: shbchen@hfuu.edu.cn

Received 9 December 2014; Accepted 6 December 2015

DOI: 10.5772/6267

© 2016 Author(s). Licensee InTech. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

The 2D soccer simulation league is one of the best test beds for the research of artificial intelligence (AI). It has achieved great successes in the domain of multi-agent cooperation and machine learning. However, the problem of integral offensive strategy has not been solved because of the dynamic and unpredictable nature of the environment. In this paper, we present a novel offensive strategy based on multi-group ant colony optimization (MACO-OS). The strategy uses the pheromone evaporation mechanism to count the preference value of each attack action in different environments, and saves the values of success rate and preference in an attack information tree in the background. The decision module of the attacker then selects the best attack action according to the preference value. The MACO-OS approach has been successfully implemented in our 2D soccer simulation team in RoboCup competitions. The experimental results have indicated that the agents developed with this strategy, along with related techniques, delivered outstanding performances.

Keywords 2D Soccer Simulation, Multi-agent Cooperation, Offensive Strategy, Multi-group Ant Colony Optimization

1. Introduction

As one of the oldest leagues in RoboCup, the 2D soccer simulation league has achieved great success and inspired many researchers all over the world to engage in this game each year [1]. Without the necessity to handle any low-level hardware problems, the 2D soccer simulation league focuses on high-level functions, such as cooperation and learning.
Furthermore, the 2D soccer simulation league also provides an important experimental platform, which is a fully distributed, multi-agent stochastic domain with continuous state, action and observation space [2]. AI researchers have been using this platform to pursue research in a wide variety of domains, including cooperation, coordination and negotiation in real-time multi-agent systems.

As the environment of 2D soccer simulation is highly dynamic and unpredictable, it is very difficult to train agents' actions and generate a team strategy. Researchers have done a lot of work in this domain. Fang presented the Monte Carlo method to develop a team's defensive strategy [3]. Amr presented a reinforcement learning method to find a better way to dribble the ball and to break through opponents who are blocking the way to the goal [4]. Shi employed the Markov decision process to solve a proximal dribble problem [5]. Zhang presented a value strategy to improve the success rate of passing [6]. Illobre used inductive logic programming to anticipate the shooting action of opponent players [7]. Cai employed mathematical analysis and a BP neural network to improve the interception success probability [8]. Lu proposed a Support Vector Regression method to predict the distance that the ball will cover when the agent successfully intercepts the ball [9]. These methods have greatly improved the efficiency of the agent's actions. Given that the problem of multi-agent cooperation and the team's overall strategy has yet to be solved, this issue continues to provoke significant interest [10].

Using the powerful optimization capability of swarm intelligence methods, this paper presents an offensive strategy based on multi-group ant colony optimization (MACO-OS). Combined with the piecewise characteristics of attack action selection, the strategy algorithm dynamically selects the best attack action according to the real-time scene-information of the competition environment.

This paper is organized as follows: In section 2, we introduce the background to the 2D soccer simulation league. Section 3 introduces three primary attack actions and analyses the major influencing factors of these actions in competition. Section 4 defines the parameters of ant colony optimization when it is applied to the offensive strategy in the RoboCup 2D soccer simulation league, and then we introduce MACO-OS in detail. In section 5, we run the algorithm on the RoboCup 2D soccer simulation league platform to verify the effectiveness of MACO-OS. Finally, section 6 concludes the paper with a discussion about future work.

Int J Adv Robot Syst, 2016, 13:25 | doi: 10.5772/6267

2. Background

In the 2D soccer simulation league, a central soccer server simulates a two-dimensional virtual soccer field in real-time. Two teams of independently moving software players (called agents) connect to the server via network sockets in order to play soccer on a virtual field inside a computer [11]. Each team can have up to 12 clients including 11 players (10 fielders and one goalkeeper) and a coach. Each agent receives a set of scene-information (visual, acoustic and physical) from the soccer server, interprets this relative information and generates a world model with the interpretation module. The agent then selects an action and sends it to the server in real-time, with each of the agents having a chance to act 10 times per second.
Each of these chances is called a "cycle". Visual information is sent six or seven times per second. During a standard 10-minute game, this gives 6,000 action chances and 4,000 receipts of visual information [12].

Based on the scene-information, each agent selects an action with the following steps: 1) evaluating the subjective state described in the world model; 2) making a decision using the strategy algorithm; 3) generating and refining basic commands (such as dashing in a given direction with certain power, turning the body or neck at an angle, kicking the ball at an angle with specified power, or slide-tackling the ball). The observation for each player contains noisy local geometric information, such as the distance and angle to other players [13, 14], such that the state of the world model may have a little deviation from that of the real world. The structure of 2D soccer simulation is depicted below:

[Figure 1 diagram. Blocks: Soccer Server (states S_{t-1}, S_t, S_{t+1}); Agent, containing World Model Interpretation and a Decision Module (Evaluator, Strategy Algorithm, Action Generator). Arrows: Scene-information, Subjective-state, Commands, Approximation (S'_{t-1}, S'_t, S'_{t+1}).]

Figure 1. Structure of 2D soccer simulation

The task in the 2D soccer simulation league is to find the best action to execute from all possible world states (derived from sensor input by calculating a sight on the world as absolute and noise-free as possible). Each agent has to interpret the scene-information as a subjective state and select the best action to influence the environment. The state space and action space can be described as follows:

State space - A sequence of scene-information and observations, which include the current positions of the ball and all players (teammates and opponents), visions, physics and so on. The state of 2D soccer simulation can be represented as a fixed-length vector, containing state variables that totally cover 23 distinct objects (10 teammates, 11 opponents, the ball and the agent itself).

Action space - All actions that the agent may perform, i.e., shoot, pass, dribble, position, intercept, block, trap, mark and formation. The former three actions (shoot, pass, dribble) belong to the attack actions of the ball-handler, and the last four actions (block, trap, mark, formation) belong to defence actions, while the middle two actions (position, intercept) can be used in both attack and defence for the player without the ball. Each action is defined as follows [15]:

1. Shoot - to kick out the ball to score
2. Pass - to kick the ball to a proper teammate
3. Dribble - to take the ball forward with repeated slight touches or bounces in an appropriate direction
4. Position - to arrive in a particular place or manner to get a better chance of attacking or defending
5. Intercept - to get the ball as fast as possible
6. Block - to make the movement of an opponent (who controls the ball) difficult or impossible
7. Trap - to hassle the ball-handler and try to steal the ball
8. Mark - to be assigned to a specific opponent with the aim of denying them possession of the ball
9. Formation - how the players in a team are positioned for defence

The main challenge in the 2D soccer simulation league comprises two aspects: the scene-information sent to each agent is relative and noisy, and the subsequent positions of every teammate and opponent are unpredictable because of the influences between agents.
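The state and action spaces described in the Background section can be sketched in code. This is only an illustration under stated assumptions, not the authors' implementation: the names `Action`, `ObjectState` and `flatten_state` are hypothetical, and the choice of four variables per object (position and velocity) is an assumption; the text only specifies a fixed-length vector covering 23 objects and the nine listed actions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Action(Enum):
    """The nine actions listed in the text."""
    SHOOT = 1
    PASS = 2
    DRIBBLE = 3
    POSITION = 4
    INTERCEPT = 5
    BLOCK = 6
    TRAP = 7
    MARK = 8
    FORMATION = 9


# Grouping taken from the text: attack, both attack and defence, defence.
ATTACK_ACTIONS = {Action.SHOOT, Action.PASS, Action.DRIBBLE}
OFF_BALL_ACTIONS = {Action.POSITION, Action.INTERCEPT}
DEFENCE_ACTIONS = {Action.BLOCK, Action.TRAP, Action.MARK, Action.FORMATION}

N_TEAMMATES = 10
N_OPPONENTS = 11
N_OBJECTS = N_TEAMMATES + N_OPPONENTS + 2  # plus the ball and the agent itself


@dataclass
class ObjectState:
    """Assumed per-object variables; the paper does not list them."""
    x: float
    y: float
    vx: float
    vy: float


def flatten_state(objects: List[ObjectState]) -> List[float]:
    """Pack all 23 objects into one fixed-length state vector."""
    if len(objects) != N_OBJECTS:
        raise ValueError(f"expected {N_OBJECTS} objects, got {len(objects)}")
    vector: List[float] = []
    for obj in objects:
        vector.extend((obj.x, obj.y, obj.vx, obj.vy))
    return vector
```

With four assumed variables per object, the resulting vector always has 92 entries, whatever the positions on the field.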
As to reach shown determne 7) am of denyng to m possesson of try ba to stea ba wher to ps or not. As shown n Fgure 3, tme 6) Trap Bock - to - hse to make ba-her movement of an try opponent to stea (who ba n Fgure cut-off wher 3, pont. to tme ps Accordng taken not. for to As ths shown ba success n rato, Fgure defender 3, attacker to tme 8) contros Mark - to to ba) be be sgned dffcut to or to a mpossbe specfc specfc opponent opponent wth wth reach can taken taken seect for cut-off for a prme pont ba ba s psng determned defender drecton defender once to to reach reach ntercept determne cut-off cut-off 9. Formaton - how payers n a of team ba are postoned am 7) Trap of denyng - to hse m possesson ba-her of ba try to stea ba dstance wher pont pont s to s psng determned ps determned or anges not. once As are once shown o determned. ntercept ntercept Fgure Thus, dstance 3, dstance we tme for defence 9) 8) Formaton Mark - to - be how sgned payers to a specfc n n a a team team opponent are are postoned wth ook taken psng to psng for anges anges ba are are o o ntercept determned. defender determned. dstance to reach Thus, Thus, we major we cut-off ook ook to to The for am chaenge defence of denyng m 2D soccer possesson smuaton of eague ba comprses factors. pont psng For s convenence, anges determned anges ntercept ntercept once psng dstance dstance ntercept anges dstance major ntercept major factors. factors. two The 9) pects: Formaton chaenge - scene-nformaton how payers 2D soccer n sent a team to smuaton each are agent postoned eague s dstance psng For For are convenence, anges denoted are o α ps determned. psng psng d ntp, respectvey. 
anges anges Thus, we ook ntercept ntercept to reatve comprses for defence nosy, two pects: subsequent scene-nformaton postons sent of every to to each each teammate agent The s chaenge reatve opponent n nosy, are unpredctabe 2D soccer subsequent smuaton because of postons eague psng dstance anges are are denoted ntercept ps dstance ps d ntp d ntp, respectvey., major factors. For convenence, psng anges ntercept nfuences of comprses every between teammate two pects: agents. scene-nformaton opponent are unpredctabe sent to each dstance are denoted ps d ntp, respectvey. Target pont because agent s of reatve nfuences nosy, between agents. subsequent postons Defender 3. Anayss of every of teammate man attack s opponent are unpredctabe Target Cut-off pont pont Defender Intercept dstance 3. because Anayss of of nfuences man attack between s agents. In 2D smuaton eague, re are three prncpa attack Cut-off pont Intercept dstance s In 3. Anayss n 2D reaton smuaton of to man eague, attack ba-her: s re shoot, are three ps prncpa Psng anges Attacker drbbe. attack In s ths secton, n reaton we anayse to ba-her: man factors shoot, of eachps In, drbbe. 2D n turn, smuaton In ths create secton, approprate eague, we anayse re modes. are three 3. anges man prncpa Fgure Fgure 3. Intercept Psng 3. dstance anges Intercept psng dstance Attacker anges psng anges factors of of attack s n reaton to ba-her: shoot, ps each, n turn, create approprate modes. Fgure 3.3 3.3 Drbbng 3. Intercept mode mode dstance psng anges 3. Shootng drbbe. mode In ths secton, we anayse man factors of 3.3 Drbbng mode 3. each Shootng, mode n turn, create approprate modes. The shootng pont must fuf two man condtons: ) t In order In 3.3 In order to Drbbng order ncree to mode to ncree drbbng drbbng speed, we speed, do speed, not requre we we do do not not must The 3. 
3.3 Dribbling mode

In order to increase the dribbling speed, we do not require the ball to be in the control of the attacker when he is in the penalty area. The agent may kick the ball away in a certain range, which may be beyond the control of the agent, and then use the intercept technology to catch the ball before the defender does. This method needs to consider the attacker's dribbling distance and the defender's intercept distance after two cycles or longer, which are denoted as $d_{drib}$ and $d_{intd}$, respectively. A diagram of the dribbling mode is shown in Figure 4.

Figure 4. Dribbling distance and intercept distance
a of soccer a That soccer Chen, game, s to Gang game, say, Lv attackng attackng Xaofeng attackng strategy strategy Wang: strategy s 3a In order to change attack Offensve azmuth Strategy qucky, n 2D Soccer vares pecewse vares Smuaton wth wth functon: League state. state. Usng That Mut-group That best s to attack s say, to Ant say, attackng Coony attackng may Optmzaton strategy be strategy dfferent s a s a In In attacker order must to to ps change attack ba to azmuth hs teammate. qucky, The psng pecewse f pecewse functon: fed state functon: changes. best best attack Furrmore, attack may we may be dfferent must be dentfy dfferent attacker attacker target pont must must may ps be ba to poston to hs hs teammate. The where The psng teammate psng f f fed best fed state state changes. for changes. Furrmore, each state. Furrmore, we must In ths paper, we we must dentfy set a dentfy path target target sts, pont pont or may may space be be behnd poston poston where defenders. where Before teammate teammate psng for best each best for n for each each each state. state, state. In ths In found ths paper, paper, we set best we a set path a n path

equvaent states by a process of dscretzaton. In course of a soccer game, attackng strategy vares wth state. That s to say, attackng strategy s a pecewse functon: best attack may be dfferent f fed state changes. Furrmore, we must dentfy best Author(s) for each Name(s): state. In Shengbng ths paper, we Chen set a path for each n each state, found best n a gven state va preferences of paths. Accordng to ory of ACO, we shoud regard each attack a Optmzaton path n souton space, consder each agent s tranng n same way we woud foragng behavour of an ant, treat an effectve attack success foragng. In order to sove probem of "a souton per state", we set an ant Page Lne No. group No. for each equvaent state (an equvaent Deete state s consdered an nterva n pecewse functon) for purposes authors Insttute of Integent Machnes, of ths paper. Wthn ths settng, each ant ays down a tra affaton pheromones Chnese on a path Academy (correspondng of Scences, to Hefe preference vaue of an attack Unversty, ), Hefe, each Anhu, ant Chna group may have a path that h hghest pheromone eve (correspondng to best preference vaue best attack ). Hence, we coud get best preference vaues for every state va mutpe ant groups. For ths paper, an tranng agorthm w devsed bed on mut-group 6 Agorthm- ant coony th optmzaton row [6]. The agorthm created an ant group for each state, nteracted pheromones wth smar groups Fnd produced smar a data groups set to save (SGs) by preference vaues of s for every state. Usng Equaton 5 method of ncreng pheromone mechansm of ACO, agorthm formed a mechansm of postve feedback between affecton pheromones. In process of 2D soccer smuaton competton, agent receves a arger? set of scene-nformaton (vsua, acoustc physca) transates ths nformaton nto equvaent states. The agent n searches data set of state- preference vaues, seectng an that h most smar state hghest preference vaue. 
Finally, the agent makes the selected action, evaluates the result of the action after a given time (set as five cycles in this paper) and sends the result to the training data set as a new sample. The flow sheet of action training and selection is demonstrated below:

[Figure 5: the training data set feeds Group 1 for state 1, Group 2 for state 2, ..., Group g for state g, each with pheromone updating. The resulting state-action preference values form the MACO training result, which is searched with the current state to produce the attack action.]

Figure 5. The flow sheet of action training and selection

According to the parameters, such as the pheromone level, the value of each action in each state could be calculated, while the best attacking strategy could be selected for each state. To avoid the problem of state sparseness, we created an Attack Information Tree (which will be described fully in subsection 4.2) to update pheromones in real time. Thus, the offensive strategy we selected has a dynamic character. Subsection 4.1 will describe the main parameters of MACO.

4.1 Parameters of MACO

By depicting attack actions as paths of foraging, the parameters (pheromone level, heuristic information and preference function) are defined as follows:

Definition 1 - Pheromone level (denoted $\tau_k^i(t)$) represents the intention of the k-th group of ants to choose path i, which decreases with time and increases with the ants' success on the path.

$$\tau_k^i(t+1) = (1-\rho)\,\tau_k^i(t) + \delta\,\Delta\tau_k^i \qquad (1)$$

where $\rho$ is the coefficient of pheromone evaporation, $\delta$ is the constant associated with the pheromone level, and $\Delta\tau_k^i$ refers to the added pheromones of the ants in the k-th group on path i.

$$\Delta\tau_k^i = \sum_{l=1}^{m_k} \Delta\tau_{k,l}^i \qquad (2)$$

where $m_k$ is the number of ants in the k-th group, and $\Delta\tau_{k,l}^i$ refers to the pheromones added on path i by the l-th ant in the k-th group.

$$\Delta\tau_{k,l}^i = \begin{cases} 1 & \text{Success} \\ 0 & \text{Failure} \end{cases} \qquad (3)$$

where success and failure are the foraging results on path i (each path corresponds to an action in a state), respectively. If an action has achieved the expectation of its executive agent (e.g., the passing agent expects that the target teammate will get the ball, the shooting agent wants to score), we call it "success"; otherwise, we call it "failure".

Definition 2 - Heuristic information (denoted $\eta_k^i$) is the experience of the ants in the k-th group in choosing path i (corresponding to the i-th attack action selected by the agent); $\eta_k^i$ may be expressed by the access frequency of the path.

$$\eta_k^i = \frac{C_k^i}{\sum_{j=1}^{n} C_k^j} \qquad (4)$$
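The pheromone update of Equations 1-3 and the access frequency of Equation 4 can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the class name `AntGroup`, the function names, the zero initial pheromone level, the default values `rho=0.1` and `delta=1.0`, and the choice to count every recorded trial as one access of the path are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class AntGroup:
    """Pheromone levels and access counters of one ant group (one equivalent state)."""
    n_paths: int                                  # number of paths (attack actions)
    tau: list = field(default_factory=list)       # pheromone level per path
    count: list = field(default_factory=list)     # access count per path

    def __post_init__(self):
        # Assumption: pheromones and counters start at zero.
        if not self.tau:
            self.tau = [0.0] * self.n_paths
        if not self.count:
            self.count = [0] * self.n_paths


def update_pheromone(group, path, outcomes, rho=0.1, delta=1.0):
    """Apply Equations 1-3 to one path.

    outcomes holds one boolean per ant: True for success, False for failure.
    rho and delta are illustrative values, not taken from the paper.
    """
    added = sum(1 for ok in outcomes if ok)                 # Eq. 2 over the Eq. 3 terms
    group.tau[path] = (1.0 - rho) * group.tau[path] + delta * added   # Eq. 1
    group.count[path] += len(outcomes)                      # assumption: one access per trial


def heuristic(group):
    """Equation 4: access frequency of every path within the group."""
    total = sum(group.count)
    if total == 0:
        return [0.0] * group.n_paths
    return [c / total for c in group.count]
```

For example, three trials on path 0 with two successes raise its pheromone level from 0 to 2.0; a later update first lets that level evaporate by the factor (1 - rho) before the new successes are added.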

where n is the total number of paths, and $C_k^i$ is the access number of the ants in the k-th group on path i.

Definition 3 - Group similarity (denoted $S_{k,j}$) is the similarity degree between two groups, which may be calculated by the similarities of the field states of the two groups (k and j). Suppose that f is an h-dimensional vector, $f_k$ is the field state (including various relative distances and angles amongst offence, defence and ball, such as αshoot, dshoot, αpass) of group k, and $f_{k,i}$ is the i-th dimension of $f_k$; then the value of $S_{k,j}$ is:

$$S_{k,j} = \sum_{i=1}^{h} \left(f_{k,i} - f_{j,i}\right)^2 \qquad (5)$$

When an ant finds an optimum solution, it will share its solution with similar groups, which have the minimum value of $S_{k,j}$. For example, when an ant from the k-th group earns a success, it will add pheromones to its own path (as described in Definition 1), and also add pheromones to similar groups (denoted SG). Given that the group number of SG is g, the pheromones of each group in SG may be added as:

$$\tau_j^i(t) = \tau_j^i(t) + \Delta\tau_k^i, \quad j \in SG \qquad (6)$$

Definition 4 - Preference function (denoted $p_{k,l}^i(t)$) is the tendency of the l-th ant in the k-th group to choose path i, which is affected by the pheromones ($\tau_k^i(t)$) and the heuristics ($\eta_k^i$). The expression of the preference function is defined as:

$$p_{k,l}^{i}(t) = \frac{\left[\tau_k^i(t)\right]^{\alpha}\left[\eta_k^i\right]^{\beta}}{\sum_{j=1}^{n}\left[\tau_k^j(t)\right]^{\alpha}\left[\eta_k^j\right]^{\beta}} \qquad (7)$$

where α and β are the adjustment factors of pheromones and heuristics, respectively. In the case of ignoring heuristic information, the value of β may be zero. According to the value of $p_{k,l}^i(t)$, which may be calculated by Equation 7, we can achieve the best attack action for the current field environment.

4.2 Offensive strategy algorithm based on MACO

We can simulate different attacking environments by setting agents at different points on the field. Regarding the three attack actions (shooting, passing and dribbling) as three different paths for foraging, we can obtain the best attack action by using the preference values of the ants' foraging behaviours. The following two steps should be made in order to find the best attack action: 1) train the preference values of actions in each environment, 2) select the best attack action based on the preference value. The first step runs offline in the background for a long time to obtain the preference values; the second step is designed as the decision module of the attacker and runs online to select the best action.

The state of the attacking environment is a continuous space. In order to deal with state data quickly and effectively, we should transfer them into many discrete equivalent states. The discretization function is defined as:

$$x' = \begin{cases} \mathrm{truncate}\left(x / (Max/T)\right) + 1, & x < (T-1) \cdot (Max/T) \\ T, & x \ge (T-1) \cdot (Max/T) \end{cases} \qquad (8)$$

where $x'$ is the discretized value of x, T is the equivalent state number of each state in the discretization process (interval number), and Max is the original maximum of the state data x.

The format of the training data is (αshoot, dshoot, αpass, dintp, ddrib, dintd, aa, suc), where aa is the attack action and suc is the result of the attack action (failure: 0, success: 1). Assuming that the maximum for each state (αshoot, dshoot, αpass, dintp, ddrib, dintd) is defined as (π, 3, π, 3, 3, 3), respectively, and the equivalent state number of each state (T) is defined as , then the data (.6, .5, .5, 3.6, .9, .2, shooting, ) will be discretized as (2, 4, 2, 2, 3, 4, , ), according to Equation 8. Likewise, the data (.2, 4.3, .3, .7, .7, .8, passing, ) will be discretized as (, 5, , , 3, 3, 2, ).

In order to save the success rate and preference values of the three attack actions for each state, we created an Attack Information Tree (AIT), based on a multi-branch tree model (as shown in Figure 6). For example, the discretized data (2, 4, 2, 2, 3, 4, , ) are stored in Node 24 (the leaf on the path whose αshoot=2 and dshoot=4) of the first tree (root1), because the value of (αshoot, dshoot, aa) is (2,4,). Likewise, the data (, 5, , , 3, 3, 2, ) are stored in Node of the second tree (root2), because the value of (αpass, dintp, aa) is (,,2). Each leaf of the tree has four-dimensional data (fail:int, succ:int, sucrate:float, preference:float), where fail is the amount of failure, succ is the amount of success, sucrate is the success rate counted by $\tau_k^i(t)$, and preference is the preference value denoted $p_{k,l}^i(t)$, per Definition 6.

For every attacking environment (state), we generated a group of ants and used the pheromone concentration to record the dynamic success rate of each attack action. When we obtained the training data, we counted the pheromone level and heuristic information using Equations 1-4.
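The discretization of Equation 8 and the preference function of Equation 7 can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, state data are assumed to be non-negative, and the maximum of 30 with T = 10 used in the usage note is an arbitrary choice for the example. The defaults α=2 and β=4 follow the settings reported for this paper.

```python
import math


def discretize(x, max_value, t):
    """Equation 8: map a continuous state value x onto one of t equivalent states (1..t).

    Assumes x >= 0; values at or above (t - 1) * (max_value / t) fall into state t.
    """
    step = max_value / t
    if x >= (t - 1) * step:
        return t
    return math.trunc(x / step) + 1


def preference(tau, eta, alpha=2.0, beta=4.0):
    """Equation 7: preference of every path in one group.

    tau and eta are per-path pheromone levels and heuristic values of the same group.
    """
    weights = [(t ** alpha) * (e ** beta) for t, e in zip(tau, eta)]
    total = sum(weights)
    if total == 0:
        # No pheromone has been laid down yet, so no path is preferred.
        return [0.0] * len(weights)
    return [w / total for w in weights]
```

With an assumed maximum of 30 and T = 10, `discretize(3.6, 30.0, 10)` falls into the second interval and returns 2. Any value beyond the defined maximum is mapped to the last state, so `discretize(2 * math.pi / 3, math.pi / 2, 30)` returns 30. Setting `beta=0.0` in `preference` ignores the heuristic information, as described in Definition 4.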
According to the ant colony algorithm, the preference value $p_{k,l}^i(t)$ can be counted using Equation 7. Based on multi-group ant colony optimization, the algorithm for training with regard to the AIT (MACO-AIT) is as follows:

[Figure 6: three trees, one per attack action. The shooting tree (root1) branches on αshoot and then dshoot, the passing tree (root2) branches on αpass and then dintp, and the dribbling tree (root3) branches on ddrib and then dintd. Each leaf stores fail, succ, sucrate and preference.]

Figure 6. The structure of the Attack Information Tree for three actions

Algorithm-1. MACO-AIT algorithm
Input: Pre-processed training data set D (m records)
Output: Attack Information Tree (AIT)
Step 1: Initialize, Treeshoot = Treepass = Treedrib = NULL;
        AIT = {Treeshoot, Treepass, Treedrib};
        Initialize t, τ_k(t), η_k, C_k;
Step 2: FOR (i = 1 to m)
          Find Tree in AIT by aa of the i-th record;
          Seek the i-th record in Tree;
          IF found THEN
            Update τ_k(t) by Definition 1;
            Update C_k by Definition 2;
            Find similar groups (SGs) by Equation 5;
            FOR each group in SG
              Update τ(t) by Equation 6;
            ENDFOR
          ELSE
            Create a sub-tree for the remaining parameters;
            Initialize the value of C_k, τ_k(t);
            Link the new sub-tree to its parent node;
          ENDIF
        ENDFOR
Step 3: Calculate η_k for every leaf by Equation 4;
Step 4: Initialize adjustment factors (α, β);
        Initialize preference function p_k(t);
        FOR each Tree in AIT
          FOR each path in Tree
            Calculate the value of p_k(t) by Equation 7;
            Save p_k(t) to the preference of the leaf of the path;
          ENDFOR
        ENDFOR
Step 5: RETURN AIT;

Step 4 initializes the adjustment of the pheromone (α) and heuristic (β) factors. Under normal conditions, the reasonable definition ranges are: α ∈ [, 2], β ∈ [, ] (we set α=2, β=4, respectively, in this paper).

For a specific state, the fitness level of attack actions is positively related to its preference value. The most appropriate action will have the highest preference value in training. Therefore, we can find the most appropriate action via the preference value. Using the AIT trained by multi-group ant colony optimization, we can search the state in every tree in the AIT and obtain the preference values of every attack action. In turn, the maximum preference value (corresponding to the best attack action for the current state) should be selected. The MACO-OS algorithm is as follows:

Algorithm-2. MACO-OS algorithm
Input: Discretized real-time field data (X), trained AIT
Output: Attack action (act)
Step 1: Load the AIT trained by the MACO-AIT algorithm;
        Initialize, pref = bstpref = 0, fail = actfail = 0, act = random(action_list);
Step 2: FOR each tree in AIT
          Find path l1 corresponding to X;
          IF found THEN
            pref = l1.preference;
            fail = l1.fail;
          ELSE
            Find two paths l1, l2 closest to X;
            pref = (l1.preference + l2.preference)/2;
            fail = (l1.fail + l2.fail)/2;
          ENDIF
          IF pref > bstpref THEN
            bstpref = pref;
            actfail = fail;
            act = this.action;
          ENDIF
        ENDFOR
Step 3: IF bstpref < threshold AND actfail > THEN
          act = back-pass;
        ENDIF
Step 4: RETURN act;

Step 2 searches the data X in the AIT and results in the best attack action. If there is no path matched with X, the algorithm will find the two closest paths to X by the Euclidean distance in the tree and obtain the average preference value of these paths. If there are three or more paths with the same smallest distance, the algorithm will select two of them randomly as the two closest paths. For a trained AIT, the preference of each leaf is not less than zero. Therefore, Step 2 can identify an action whose preference has the biggest value in the current state or similar states. In Step 3, when all of the preference values of the attack actions are rather low, the algorithm will select back-pass, which passes the ball backwards. The reason is that all of the attack actions would not have good preferences in this case. According to competition experiences, this is because the defence is very crowded. Therefore, we should pass the ball backwards to disperse the defenders, then attack again when the chance arises.

The space complexities of Algorithm-1 and Algorithm-2 are both O(s), where s is the size of the AIT. Supposing each state data are discretized into T elements, the size of the AIT (s) should be O(3*T^2). The time complexities of Algorithm-1 and Algorithm-2 are O(I*n^2*m*g*lg(g)) and O(n*lg(g)), respectively, where n, g, m and I are the number of actions, groups, ants and iterations, respectively. Algorithm-1 seems to have a high time complexity. But, in the case of attack action selection, n is equal to 3 (the action number) and (I*m) is equal to the training times. Therefore, Algorithm-1 should run well in a practical application.

5. Experimental evaluation

After being created, the 2D soccer simulation team with MACO-OS will run on the RoboCup simulation platform to evaluate our new algorithm. Firstly, every agent took random attack actions. The results of these actions have been recorded and used for the next training session. The underlying code and underlying database are Agent2d-3.. and Librcsc, respectively. The positions of all players (two attackers, two defenders and one goalkeeper) were set randomly by an offline coach. After five cycles of competition, the results of the two teams are recorded in the format of "αshoot, dshoot, αpass, dintp, ddrib, dintd, aa, suc". The training will stop after 5, times. The training scene is shown in Figure 7. In order to show the situations of the attack and defence sides more clearly, only the penalty area is depicted.

Figure 7. Screenshot of the training scene arranged randomly

In the following experiments, the maximum value for each state data (αshoot, dshoot, αpass, dintp, ddrib, dintd) was defined as (π/2, 5, π/2, 5, 3, 3), respectively, with each state discretized into 30 elements. If the state data are larger than the defined maximum, the discretized data will be the maximum value according to Equation 8 (e.g., if the value of αshoot is equal to 2π/3, then the discretized value of αshoot will be 30). The number of states was 2,700 (3*30^2), the number of groups was 2,700 and the number of ants was 8, in total. The adjustment factors for pheromones and heuristics were set as follows: α=2, β=4. In order to keep the previous experiences obtained during the pre-event training, we set the coefficient of pheromone evaporation at a low level (ρ=.). All of the experiments ran on the same computer (memory: DDR3 8G, CPU: i5-4570).

5.1 An example of an attack process

In order to illustrate the application of the AIT, a simplified example of an attack process is described in this section. Firstly, we trained the recorded data and obtained the AIT by multi-group ant colony optimization. Then, the MACO-OS algorithm was used in the attacker decision module. In a simulation of a common attacking environment, we placed two attackers, two defenders and one goalkeeper. The attack process is depicted in Figure 8 (a, b, c, d): two attackers (No. 9 and No. ) have selected three attack actions (passing, dribbling and shooting), according to the AIT and the scene-information. Here, we assume that the first selection starts at the t0 cycle.

a) The first selection (t=t0). In this case, passing.1.preference=.395, dribbling.1.preference=.82, shooting.1.preference=.. Therefore, passing is selected.
b) The second selection (t=t0+5). In this case, passing.2.preference=., dribbling.2.preference=.6724, shooting.2.preference=.. Therefore, dribbling is selected.
c) The third selection (t=t0+9). In this case, passing.3.preference=., dribbling.3.preference=.96, shooting.3.preference=.7238. Therefore, shooting is selected.
d) The scene of the ball passing the goalkeeper (t=t0+2).

Figure 8. An example of an attack process using the AIT

In this example, the ball-handler selects the action that has the maximal preference value of the path in the AIT (which consists of three trees named passing, dribbling and shooting). The path corresponds to the scene-information, while the value of the preference is counted by the success times of the action in training.

5.2 Competitions with the baseline strategy

The application effect in competition is the sole criterion for evaluating an offensive strategy. This section concerns our creation of a 2D soccer team named MACO, which employs the MACO-OS algorithm as its offensive strategy. Meanwhile, using the underlying code Agent2d-3.., which is an open source by Helios (former world champion in the RoboCup 2D simulation league), we created another team (named BASE team), which adopted the searching-space offensive strategy. The only difference between the MACO team and the BASE team is the attacking action selection strategy. There were ten games in total in the test, with each game running 6,000 standard clock cycles. The results of the games are presented below:

Games Played   Possession   Passing Success   Goal Scored   Goal Against   Result   Goal Difference
1              6.5%         85.2%             5             2              Win      3
2              62.3%        82.8%             4             1              Win      3
3              6.2%         85.3%             5             2              Win      3
4              63.9%        86.%              5             0              Win      5
5              6.2%         83.3%             4             1              Win      3
6              59.3%        8.7%              4             2              Win      2
7              66.5%        86.%              6             1              Win      5
8              6.%          85.8%             5             3              Win      2
9              63.4%        87.6%             6             2              Win      4
10             62.%         82.9%             4             0              Win      4
Average        62.%         84.7%             4.8           1.4            /        3.4

Table 1. The results of games between MACO and BASE

As listed in Table 1, the MACO team dominated in the games with 62.% possession and 84.7% passing success on average. As a result, the MACO team enjoyed a 100% winning advantage with 3.4 goal differences per game, as well as scoring three times as many goals as its opponent on average.

5.3 Competitions with other strategies

In recent RoboCup 2D soccer simulation leagues, there have been two popular offensive strategies: the Value strategy and the Reinforcement Learning approach. Most of the top teams have used them or their variants, and have secured good results in turn. For this experiment, we created two 2D soccer teams (named V team and RL team):

V team - The Value strategy was employed to select the attack action. The field was divided into many small regions, which were assigned a value according to the threat of each region. A BP-neural network was trained to calculate the value of each action to a region, so that the attack action could be selected based on its value and success probability.

RL team - The Reinforcement Learning approach was used to select the attack action. A binary search tree (BST) was created to save the information of states, actions and the values of rewards. At each time t, the agent received an observation $o_t$ from the world model, with which we estimated the value of the reward $r_t$ using the following two steps [, 5]: 1) calculating the heuristic-value of the state ($v_t$) by the observation $o_t$ (which includes the ball position, teammates, opponents and ball speed); 2) assigning the value to $r_t$ by the difference between $v_t$ and $v_{t-1}$. The agent was then able to find its former states and actions in the BST and added $r_t$ to it. According to the BST value, the agent selected an action $a_t$, which was subsequently sent to the server and affected the environment. The environment then moved to a new state $s_{t+1}$.

We created two teams, which respectively used the Value strategy and the Reinforcement Learning approach as the offensive strategy. The other modules of these teams used the same code as the MACO team described in subsection 5.2. Moreover, all of the teams (the V team, the RL team and the MACO team) possess the same training samples and the same training times, and run on the same computer (having the same memory capacity and computing capability). The MACO team played five games with the V team and the RL team, respectively. The results of each game are as follows:

Games Played: MACO vs V (games 1 to 5) | MACO vs RL (games 1 to 5)
GS:     2 3 3 2 3 | 3 3 4 2 4
GA:     2 2 3 2 3 2 3
GD:     2 2
Result: win win win draw win | draw win win draw win

Table 2. The results of games between different teams

In Table 2, GS, GA and GD respectively denote goal scored, goal against and goal difference. In order to analyse the performance of the MACO team, a sign test was introduced. First, we selected the sign for each game ('+' for a win, '#' for a draw and '-' for a loss). Then, the number for each sign was calculated as follows:

Games Played: 1 2 3 4 5 | Total (+, -)
MACO vs. V    GD:   2 2
              Sign: + + + # + | 4, 0
MACO vs. RL   GD:
              Sign: # + + # + | 3, 0

Table 3. The signs for games between different teams
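The tallying of signs in Table 3 can be sketched as follows. This is a minimal illustration with hypothetical function names: the sign string uses '+', '#' and '-' as defined above. The exact binomial tail probability is an assumption added for the example, since the paper itself relies on a table of critical values for the sign test.

```python
from math import comb


def sign_counts(signs):
    """Count wins and losses in a sign string and return (n_plus, n_minus, N, r).

    Draws ('#') are ties and are excluded from the sample size N.
    """
    n_plus = signs.count("+")
    n_minus = signs.count("-")
    n = n_plus + n_minus
    return n_plus, n_minus, n, min(n_plus, n_minus)


def one_sided_tail(r, n):
    """P(X <= r) for X ~ Binomial(n, 1/2): the chance of r or fewer minority signs."""
    return sum(comb(n, k) for k in range(r + 1)) / 2 ** n


# Signs of the five games between MACO and the V team, as listed in Table 3.
n_plus, n_minus, n, r = sign_counts("+++#+")
tail = one_sided_tail(r, n)   # 1/16 = 0.0625 for r = 0, N = 4
```

Excluding draws before counting keeps N equal to the number of decided games, which matches the definition of the sample size used in the sign test.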

Table 3 describes the sign of each game. Suppose $n_+$ and $n_-$ are the totals of the sign '+' and the sign '-', respectively, and N is the sample size excluding ties between samples; the sign tests are taken as follows: First, we tested the results of the games between the MACO team and the V team. Table 3 shows that $n_+ = 4$ and $n_- = 0$, then $N = n_+ + n_- = 4$ and $r = \min(n_+, n_-) = 0$. According to the critical values of the sign test, $r \le r_{0.1}$. Hence, the MACO team is better than the V team at a significance level of 0.1. Likewise, having tested the results of the games between MACO and RL in Table 3, we can also conclude that the former is better than the latter, also at a significance level of 0.1.

In order to enable MACO-OS to compete more strategically, we created another 2D soccer team, named MDP team, which was developed based on Markov Decision Processes (MDP), as used by the WrightEagle team (a five-time world champion in the 2D league). Compared to the RL team, the MDP team uses the searching forward approach to select the best action. Firstly, a binary search tree (BST) was created to record the information of states, state values (SV), actions, expected action values (AV) and transition probabilities (TP). Then, for a state s, the agent computed the AV and TP of each possible action by searching forward in the BST. According to paper [5], the search in the BST is processed in a depth-first fashion until a goal state or a certain pre-determined fixed depth is reached. When it reached the depth, the heuristic-values of SV were evaluated and sent to the state s as the long term value of AV [5]. With the same memory capacity and computing capability, the MACO team played ten games with the MDP team. The results of these games are presented below:

Games Played   GS   GA   Result
1              3    2    Win
2              4    3    Win
3              3    4    Lose
4              2    2    Draw
5              3    2    Win
6              4    3    Win
7              4    2    Win
8              3    3    Draw
9              4    3    Win
10             2    2    Draw
Total          32   26   6/3/1

Table 4. The results of games between MACO and MDP

Table 4 shows that the MACO team performed better than the MDP team, with six wins, three draws and one loss. This indicates that the MACO team has a 75% chance of beating the MDP team, which is satisfying since the environment is unpredictable, while actions may sometimes be affected by accidental factors. As far as action selection is concerned, the MACO-OS algorithm performed better than the MDP algorithm. This is because the MDP algorithm uses the searching forward approach, which needs a great deal of memory and computing time.

5.4 The scale of training times

In this stage, we analysed the training scale by simulations. The team described in the above experiments was trained 5, times. Had the team been trained only a few hundred times, the result may have been quite different. In order to identify the appropriate range of training times, we used different scales of training data and tested the effect of the corresponding AIT.

Firstly, we divided the training data collected using the above experiments into 10 parts, denoted $d_1, d_2, \ldots, d_{10}$, and set $T_j = \bigcup_{k=1}^{j} d_k$ to create the training data sets with different scales. Then, using the MACO-AIT algorithm to train those data sets, the AITs were generated and the teams were created corresponding to these AITs. Meanwhile, after creating the previously mentioned BASE team, we made it compete a number of times with each of the teams described above, respectively. The losing percentage, drawing percentage, winning percentage and goal difference on average are presented below:

[Figure 9, panel a): the mean curves of the losing percentage, drawing percentage and winning percentage; x-axis: Scale, y-axis: Result (%). Panel b): the mean curve of the goal difference; x-axis: Scale, y-axis: Goal Difference.]

Figure 9. The curve of the result varied with the training scale when the equivalent state number of each state (T) is 30 (2,700 states in total)

Figure 9 shows that the winning percentage and goal difference increased gradually with the increasing scale of training in the early stages. However, when the training scale reached T9, the winning percentage and goal difference reached the highest level. Besides, the losing percentage and drawing percentage both reach zero, the lowest level. It can be concluded that the offensive strategy was mature after about 35, training times when each state is discretized into 30 elements (there were 2,700 states in total).

In order to test whether the state number has an influence on the scale of training time, we discretized each state