Meta-Knowledge as an engine in Classifier Combination

Meta-Knowledge as an engine in Classifier Combination

Fredrick Edward Kitoogo
Department of Computer Science, Faculty of Computing and IT, Makerere University

Venansius Baryamureeba
Department of Computer Science, Faculty of Computing and IT, Makerere University

The use of classifier combination has taken center stage in machine learning research: the outputs of different classifiers are combined in various ways to achieve better performance than that of any of the base classifiers involved in the combination. Much research, however, has not empirically justified the choice of the participating classifiers in a combination. This work looks at using meta-knowledge that expresses the performance of each learning method on diverse domains to choose the learning algorithms best suited for combination on particular domains. The meta-knowledge considered in this work is limited to the characteristics of the data involved. The approach works by having a learning algorithm learn and describe how the data characteristics and the combined learning algorithms relate. Given a new learning domain, the domain characteristics are measured and, from the induced relationship, the most suitable base algorithms for combination are selected. The results of this work provide a fundamental step towards a cohesive framework for classifier combination.

Categories and Subject Descriptors: I.5.2 [Pattern Recognition]: Design Methodology, Classifier design and evaluation; I.5.3 [Pattern Recognition]: Clustering, Algorithms; I.2.7 [Artificial Intelligence]: Natural Language Processing, language parsing and understanding

General Terms: Computer Science, Language Processing

Additional Key Words and Phrases: classifier combination, clustering, machine learning, meta-knowledge

IJCIR Reference Format: Fredrick Edward Kitoogo and Venansius Baryamureeba, Meta-Knowledge as an engine in Classifier Combination. International Journal of Computing and ICT Research, Vol. 1, No. 2, pp. 74-86. http://www.ijcir.org/volume1-number2/article8.pdf.

Authors' addresses: Fredrick Edward Kitoogo, Department of Computer Science, Faculty of Computing and IT, Makerere University, P.O. Box 7062, Kampala, Uganda, fkitoogo@cit.mak.ac.ug, www.cit.ac.ug; Venansius Baryamureeba, Department of Computer Science, Faculty of Computing and IT, Makerere University, P.O. Box 7062, Kampala, Uganda, barya@cit.mak.ac.ug, www.cit.ac.ug

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than IJCIR must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. International Journal of Computing and ICT Research 2007. International Journal of Computing and ICT Research, ISSN 1818-1139 (Print), ISSN 1996-1065 (Online), Vol. 1, No. 2, pp. 74-86, December 2007.

1. INTRODUCTION

Classification is the process of grouping information into predetermined classes and categories of the same type. A classifier, a computer-based agent, is what performs the classification. Classifiers fall into two broad categories: rule-based classifiers and computational-intelligence-based classifiers. Whereas rule-based classifiers are in general constructed by the designer, who defines the rules for the interpretation of detected inputs,

in computational-intelligence-based classifiers the designer provides only a basic framework for the interpretation of data, and the learning algorithms within such a system are responsible for generating the rules [Ranawana and Palade 2006]. Many computing arenas, such as text categorization, named entity recognition, data mining, and word sense disambiguation, employ computational intelligence classification in one way or another. There are various machine learning algorithms that can be used for classification; a list of the most commonly used algorithms, with brief descriptions, can be obtained from the Wikipedia website [2007]. The trend that has emerged in classification is classifier combination, where different classifiers are integrated in various ways to create a system that outperforms the best individual classifiers [Kozareva et al. 2005]. The motivation for combining classifiers is that the combinations are often much more accurate than the individual classifiers that make them up [Dzeroski and Zenko 2004; Bennett et al. 2005]. Much of the research on classifier combination has achieved good performance; however, little has been explained about the choice of the participants in a combination for a particular dataset or domain. Some research on cataloging metadata for machine learning research and applications has been attempted by Cunningham [1997], but it is limited to single algorithms; there is a desire to have such research extended to classifier combination.
This work, which tackles the discovery of meta-knowledge for classifier combination, is motivated by the no free lunch theorem [Wolpert and Macready 1997], which implies that if one algorithm A is better than another algorithm B over a certain problem space, then algorithm B can be better than algorithm A over another problem space; this theorem backs up the intuition that there is some meta-knowledge about specific problem spaces that could assist in improving performance when classifiers are combined. The work of Kitoogo and Barya [2007], in which the question of determining the choice of base classifiers to participate in a combination was left open, is another motivator for this work. It is desirable to know beforehand which machine learning algorithms (algorithm types) combine best for a particular task. Besides the fact that different types of tasks call for different types of solutions, it is difficult to determine beforehand which type of combination solution best suits a particular task. These observations point to the need for an empirical approach. In this work we take an empirical approach and perform a range of experiments to investigate this question. The work handles the task of associating algorithm type combinations with data set characteristics by first categorizing the learning algorithm combinations and building a list of datasets, their characteristics, and the combination performance of the different sets, then (1) applying the different learning algorithm combinations on the different categorized data sets to generate a meta data set from the outputs, and (2) applying unsupervised clustering analysis to the meta data set to analyze the generated clusters and determine whether any patterns exist. The remainder of the paper is structured as follows: in Section 2, related work is briefly reviewed; in Section 3, we outline the proposed method.
We describe the supervised learning and unsupervised learning procedures of the method; in Section 4, the results of the conducted experiments are presented. Finally, Section 5 closes with a conclusion and an outlook on future work.

2. RELATED WORK

Various researchers have investigated the use of multiple classifiers together with different combination methods for classification problems. Some research on classifier combination generates the combination using a single learning algorithm, an approach often referred to as homogeneous [Dzeroski and Zenko 2004]. This is normally done by manipulating the training set (as in boosting or bagging), manipulating the input features, manipulating the output targets, or injecting randomness into the learning algorithms. The generated classifiers are then typically combined by majority or weighted voting. Other approaches, like the heterogeneous one, which apply different learning algorithms to a single dataset, have been proposed by Merz [1999]. In his work on techniques for combining multiple classifiers, Alpaydin [1998] employed the strategy of varying the input representation for the different base classifiers so as to build complementarity among them; experiments conducted on some real-world data sets indicated that improved accuracy can be achieved without necessarily increasing overall computer resource usage. In their work on data-driven part-of-speech tagging, De Pauw et al. [2006] experimented with improving the performance of individual taggers for the Kiswahili language by combining them into a committee of taggers using, among others, different voting schemes: plurality voting, majority voting, and weighted voting. Their experiments revealed that plurality voting outperformed the more elaborate scheme of weighted voting.
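The voting schemes compared above can be sketched as follows. This is a minimal illustration, not the taggers' actual implementation; the labels and weights are hypothetical:

```python
from collections import Counter

def plurality_vote(predictions):
    """Each classifier casts one vote; the label with the most votes wins
    (ties are broken arbitrarily by Counter ordering)."""
    return Counter(predictions).most_common(1)[0][0]

def weighted_vote(predictions, weights):
    """Each classifier's vote counts in proportion to its weight,
    e.g. its accuracy on a held-out set."""
    totals = Counter()
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return totals.most_common(1)[0][0]

# Three hypothetical base classifiers predicting a class label:
preds = ["NOUN", "VERB", "NOUN"]
print(plurality_vote(preds))                   # NOUN (2 votes to 1)
print(weighted_vote(preds, [0.6, 0.9, 0.5]))   # NOUN (1.1 vs. 0.9)
```

Note that weighted voting can overturn a plurality result when a single highly weighted classifier disagrees with the majority, which is exactly the behavior the cited comparison puts to the test.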

Seldom have researchers examined the use of meta-knowledge to determine which classifier combination sets perform best for particular domains or problem settings. Cornelson et al. [2002] conducted some experiments on combining families of information retrieval algorithms using meta-learning; the different algorithms were obtained by varying the normalization and similarity functions. Bennett et al. [2005] introduce a probabilistic method for combining classifiers that considers the context-sensitive reliabilities of the contributing classifiers. Their method harnesses variables that provide signal about the performance of classifiers in different situations, which they refer to as reliability indicators. Some work on finding associations between classification algorithms and characteristics of data sets was done by Ibrahim [1999]; that work was, however, limited to single individual classification algorithms. Okamoto and Yugami [2003] also investigated the effect of domain characteristics on instance-based learning algorithms; their work specifically targeted k-nearest neighbor classification accuracy as a function of the domain characteristics and, again, considered only a certain class of algorithms. Ranawana and Palade [2006] conducted an extensive review of different methods and approaches of multi-classifier systems, concluding that the success of a multi-classifier system depends extensively on three main features: proper selection of the participating classifiers, topology, and combinatorial methodology. The proper selection of the participating classifiers strengthens the desire for investigation into meta-knowledge.

3. THE MODEL APPROACH

Fig. 1. The model

The model, as shown in Figure 1, aims at finding associations between combined classification algorithms and the characteristics of different data sets in a three-step process:

(1) Build a file of data set names and their characteristics, and a file of algorithm names indicating the algorithm types and the different possible algorithm combination types.

(2) Subject the different data sets to the range of classifier combination sets and measure the performance on each of the data sets, ultimately building a meta data set out of the different algorithm combination types, the different data characteristics, and the related performance rates.

(3) Apply an unsupervised clustering algorithm to the meta data set built in step 2 and perform clustering analysis to determine any significant patterns.

3.1 Data Sets and Characteristics

We use a set of benchmark data sets (as shown in Table I) from the UCI Machine Learning Repository [2006] that are used frequently in machine learning research.

3.2 Supervised Learning Algorithms

Supervised techniques involve the participation of an expert, who presents the algorithm with a set of inputs and informs it of the category or class to which each should be assigned; in this work, the data sets are already in the desired format for training. Supervised learning algorithms can broadly be grouped into two categories, eager learners and lazy learners: eager learners put most of their time and effort into the learning phase, while lazy learners divert their effort to the classification phase [Aha 1997]. Table I shows the different base algorithms that were used in the classifier combination strategy.

3.3 Classifier Combination

Table I also shows the different classifier combination sets of three that were generated out of the five base learning algorithms. Each of the different combinations is used to generate a classifier over the different data sets using plurality voting, a choice mainly motivated by the work of De Pauw et al. [2006] and that of Kitoogo and Barya [2007], in which it is shown that although plurality voting is a simple combination methodology, it achieves considerably high accuracy.

3.4 Clustering Data Generation

Once the data sets, learning algorithms, and combination methodology for the experiments have been decided, the clustering data generation is conducted. This is done by running the different algorithms together with the combination algorithm on all the data sets. Default settings are used for all the algorithms. 10-fold cross-validation is used for testing the different combination sets.

3.5 Clustering Analysis

After the data for clustering has been generated, the data is grouped into three groups: (i) high accuracy, (ii) moderate accuracy, and (iii) low accuracy. For the high-accuracy group, an unsupervised learning algorithm (k-means clustering) is used to group the data into clusters of records (depending on the useful patterns).

4. EXPERIMENTS AND RESULTS

Running the various plurality-voting classifiers of three base learning algorithms, generated as combinations from the five algorithms, on the data sets resulted in a data set that was split into three groups (bad, medium, and good accuracy performance, shown in Tables IV-VIII); this is done by dividing the range between the lowest accuracy figure and the highest accuracy figure into three equal intervals. Since the main goal is to attach good accuracy performance to the characteristics of specific data set types, the good-accuracy data is the one on which cluster analysis is performed in the experiments. The results from the bad-accuracy data indicate that all algorithm combinations performed badly with a high number of classes; this is further exposed even in the medium- and good-accuracy data sets: generally, as the number of classes decreased, the accuracy performance improved.

4.0.1 Results from the k-means clustering.
Preliminary experimental runs of the k-means clustering algorithm suggested between 3 and 6 clusters, so the number of clusters was set to 4. The first run of clustering clearly indicates that the number of instances was the only prominent variable used in determining the clusters, as shown in the round-1 clustering results; summary statistics of the clustering data are given in Table II.

Table II. Summary statistics of the clustering data

First Run
Variable Name | Obs | Mean  | Min | Max  | Std Dev | Signf
Accuracy      | 112 | 0.748 | 0   | 1    | 0.072   | 0
Instances     | 112 | 799   | 106 | 2201 | 616.155 | 1
Attributes    | 112 | 18    | 3   | 57   | 14.366  | 0
Classes       | 112 | 2     | 2   | 4    | 0.738   | 0

Second Run
Variable Name | Obs | Mean   | Min | Max | Std Dev | Signf
Accuracy      | 112 | 0.748  | 0   | 1   | 0.072   | 0
Attributes    | 112 | 17.911 | 3   | 57  | 14.366  | 1
Classes       | 112 | 2.321  | 2   | 4   | 0.738   | 0
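The banding and clustering procedure of Sections 3.5 and 4 can be sketched as follows. This is a minimal illustration with a hypothetical record layout (`acc`, `attr`, `cls` keys) and a toy pure-Python k-means; the actual experiments would use a full statistical package:

```python
import random

def band(records):
    """Split records into low/medium/high accuracy groups by dividing the
    accuracy range into three equal-width intervals, as in Section 4."""
    accs = [r["acc"] for r in records]
    lo, hi = min(accs), max(accs)
    step = (hi - lo) / 3
    groups = {"low": [], "medium": [], "high": []}
    for r in records:
        if r["acc"] < lo + step:
            groups["low"].append(r)
        elif r["acc"] < lo + 2 * step:
            groups["medium"].append(r)
        else:
            groups["high"].append(r)
    return groups

def kmeans(points, k, iters=20, seed=0):
    """A tiny k-means for small point sets (illustrative only)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[i].append(p)
        # Recompute each center as the mean of its cluster.
        centers = [
            tuple(sum(xs) / len(xs) for xs in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical meta-records: accuracy plus data characteristics.
records = [
    {"combo": "Bayes C45Tree KNN", "acc": 0.93, "attr": 16, "cls": 2},
    {"combo": "Bayes KNN Rule",    "acc": 0.60, "attr": 9,  "cls": 6},
    {"combo": "C45Tree Rule Tree", "acc": 0.35, "attr": 17, "cls": 21},
]
good = band(records)["high"]
centers, _ = kmeans([(r["attr"], r["cls"]) for r in good], k=1)
```

The cluster centers then play the role of the "central attributes" and "central classes" reported for each cluster in the results below.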

Cluster | Type | Accuracy | Central Attribs | Central Classes
1 | Bayes C45Tree KNN | 0.928 | 16 | 2
2 | Bayes C45Tree Rule | 0.858 | 6 | 4
3 | Bayes KNN Rule | 0.762 | 57 | 2
4 | Bayes C45Tree KNN | 0.923 | 32 | 2

Summary statistics are shown in Table II. The significance of the other variables can be exposed by excluding the prominent variable (the number of instances) from the subsequent clustering runs. These next-level runs expose the significance of the other variables, which were overshadowed by the number of instances in the first run. The second run of clustering reveals that the number of attributes is the greatest determinant of the clusters, as shown in Tables IX, X, XI and XII. Analysis shows that Bayes combines well with KNN for a moderately low number of classes; combination with the rule-based algorithm brings down the accuracy performance whether the number of attributes is low or high, and combination with decision trees generally has no big influence on a combination's accuracy performance. As seen in Tables IX, X, XI and XII, most of the observations fall into clusters 1 and 2, dominated by the Bayes/KNN combinations; the highest accuracies and central accuracies also reside in these clusters. It can also be observed that the Bayes/KNN combination is brought down when the number of attributes increases and also when combined with the rule-based algorithm. It can further be noted that a combination including C45Tree/Tree does not bring any significant change in the accuracy performance.

5. CONCLUSION AND FUTURE WORK

The first discovery from the division of the generated clustering data into groups was that, irrespective of the combination set, the worst accuracy performance was exhibited on data sets with a large number of classes for prediction. The implication is that the higher the number of classes for prediction in a data set, the more difficult it is to learn from that data set.
Although the number of instances was the only prominent variable before its exclusion from the clustering experiments, the other variables emerged as significant once it was removed; the number of instances was therefore not genuinely useful in determining the eventual clusters, rendering it not very relevant meta-knowledge for classifier combination. The main finding is that the number of attributes in a data set comes up as the most significant determinant of performance for the various classifier combinations. The results suggest that the combination of Bayesian and k-nearest neighbor algorithms yields substantially good performance; however, the performance drops as the number of classes grows. The results further expose that combinations that included the rule-based algorithm performed rather badly on average. Another finding was that combinations with the two decision tree methods did not significantly change the resultant performance of a combination set. For future work, an evaluation of the findings from the clustering analysis will be conducted and compared with other combination sets without a priori clustering information. In search of more pragmatic results, future experiments will use domain-specific real-world data sets, and the range of data characteristics will be extended in order to discover more meaningful patterns.

References

AHA, D. 1997. Special issue on lazy learning. Artificial Intelligence Review 11, 1-5.
ALPAYDIN, E. 1998. Techniques for combining multiple learners. In Proceedings of Engineering of Intelligent Systems 2, 6-12.
BENNETT, P., DUMAIS, S., AND HORVITZ, E. 2005. The combination of text classifiers using reliability indicators. Information Retrieval 8, 67-100.
CORNELSON, M., GROSSMAN, R. L., KARIDI, G. R., AND SHNIDMAN, D. 2002. Combining families of information retrieval algorithms using meta-learning. Survey of Text Mining: Clustering, Classification, and Retrieval, 159-169.
CUNNINGHAM, S. J. 1997. Dataset cataloging metadata for machine learning applications and research. In Proceedings of the Sixth International Workshop on AI and Statistics.
DE PAUW, G., DE SCHRYVER, G. M., AND WAGACHA, P. W. 2006. Data-driven part-of-speech tagging of Kiswahili. In Text, Speech and Dialogue, Lecture Notes in Computer Science 4188, Springer Berlin/Heidelberg, 197-204.
DZEROSKI, S. AND ZENKO, B. 2004. Is combining classifiers with stacking better than selecting the best one? Machine Learning 54, 255-273.
IBRAHIM, R. S. 1999. Data mining of machine learning performance data. Master's thesis, RMIT University.
KITOOGO, F. E. AND BARYAMUREEBA, V. 2007. On classifier combination for better named entity. In Proceedings of the First International Computer Science and ICT Conference (COSCIT 2007), Nairobi, Kenya, Feb 2007.
KOZAREVA, Z., FERRANDEZ, O., MONTOYO, A., MUNOZ, R., AND SUAREZ, A. 2005. Combining data-driven systems for improving named entity recognition. In Proceedings of NLDB 3513, 80-90.
MERZ, C. J. 1999. Using correspondence analysis to combine classifiers. Machine Learning 36, 33-58.
OKAMOTO, S. AND YUGAMI, N. 2003. Effects of domain characteristics on instance-based learning algorithms. Theoretical Computer Science 298(1), 207-233.
RANAWANA, R. AND PALADE, V. 2006. Multi-classifier systems: Review and a roadmap for developers. International Journal of Hybrid Intelligent Systems 3(1), 35-61.
THE MACHINE LEARNING REPOSITORY. 2006. http://www.cs.uci.edu/mlearn/MLRepository.html, accessed 04/05/2006.
WIKIPEDIA. 2007. The free encyclopedia. http://en.wikipedia.org/wiki/Category:Classification_algorithms, accessed 29/04/2007.
WOLPERT, D. H. AND MACREADY, W. G. 1997. No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation 1, 67.

APPENDIX I

Table IV. Split generated data for clustering: Bad accuracy performance

Combination | Acc | # Inst | # Attr | # Class
Bayes C45Tree KNN | 0.472 | 339 | 17 | 21
Bayes C45Tree Rule | 0.354 | 339 | 17 | 21
Bayes C45Tree Tree | 0.484 | 339 | 17 | 21
Bayes KNN Rule | 0.354 | 339 | 17 | 21
Bayes KNN Tree | 0.448 | 339 | 17 | 21
Bayes Rule Tree | 0.354 | 339 | 17 | 21
C45Tree KNN Rule | 0.351 | 339 | 17 | 21

C45Tree KNN Tree | 0.436 | 339 | 17 | 21
C45Tree KNN Tree | 0.551 | 368 | 20 | 3
C45Tree Rule Tree | 0.351 | 339 | 17 | 21
KNN Rule Tree | 0.351 | 339 | 17 | 21

Table V. Split generated data for clustering: Good accuracy performance

Combination | Acc | # Inst | # Attr | # Class
Bayes C45Tree KNN | 0.771 | 148 | 18 | 4
Bayes C45Tree KNN | 0.781 | 296 | 13 | 2
Bayes C45Tree KNN | 0.825 | 2201 | 3 | 2
Bayes C45Tree KNN | 0.834 | 977 | 14 | 2
Bayes C45Tree KNN | 0.862 | 690 | 15 | 2
Bayes C45Tree KNN | 0.876 | 106 | 57 | 2
Bayes C45Tree KNN | 0.891 | 958 | 9 | 2
Bayes C45Tree KNN | 0.923 | 351 | 32 | 2
Bayes C45Tree KNN | 0.928 | 1728 | 6 | 4
Bayes C45Tree KNN | 0.928 | 435 | 16 | 2
Bayes C45Tree KNN | 0.944 | 569 | 20 | 2
Bayes C45Tree KNN | 0.961 | 699 | 9 | 2
Bayes C45Tree Rule | 0.748 | 148 | 18 | 4
Bayes C45Tree Rule | 0.752 | 106 | 57 | 2
Bayes C45Tree Rule | 0.761 | 690 | 15 | 2
Bayes C45Tree Rule | 0.784 | 977 | 14 | 2
Bayes C45Tree Rule | 0.789 | 2201 | 3 | 2
Bayes C45Tree Rule | 0.858 | 1728 | 6 | 4
Bayes C45Tree Rule | 0.885 | 958 | 9 | 2
Bayes C45Tree Rule | 0.912 | 351 | 32 | 2
Bayes C45Tree Rule | 0.936 | 435 | 16 | 2
Bayes C45Tree Rule | 0.937 | 569 | 20 | 2
Bayes C45Tree Rule | 0.941 | 699 | 9 | 2
Bayes C45Tree Tree | 0.777 | 296 | 13 | 2
Bayes C45Tree Tree | 0.783 | 2201 | 3 | 2
Bayes C45Tree Tree | 0.827 | 977 | 14 | 2
Bayes C45Tree Tree | 0.858 | 106 | 57 | 2
Bayes C45Tree Tree | 0.862 | 690 | 15 | 2
Bayes C45Tree Tree | 0.866 | 958 | 9 | 2
Bayes C45Tree Tree | 0.926 | 351 | 32 | 2
Bayes C45Tree Tree | 0.933 | 1728 | 6 | 4
Bayes C45Tree Tree | 0.94 | 435 | 16 | 2
Bayes C45Tree Tree | 0.944 | 569 | 20 | 2
Bayes C45Tree Tree | 0.954 | 699 | 9 | 2

Table VI. Split generated data for clustering: Medium accuracy performance

Combination | Acc | # Inst | # Attr | # Class
Bayes C45Tree KNN | 0.663 | 368 | 20 | 3
Bayes C45Tree KNN | 0.687 | 345 | 6 | 2
Bayes C45Tree KNN | 0.724 | 286 | 9 | 2
Bayes C45Tree KNN | 0.737 | 214 | 9 | 6

Bayes C45Tree Rule | 0.566 | 214 | 9 | 6
Bayes C45Tree Rule | 0.59 | 368 | 20 | 3
Bayes C45Tree Rule | 0.623 | 345 | 6 | 2
Bayes C45Tree Rule | 0.644 | 286 | 9 | 2
Bayes C45Tree Rule | 0.659 | 296 | 13 | 2
Bayes C45Tree Tree | 0.668 | 286 | 9 | 2
Bayes C45Tree Tree | 0.674 | 368 | 20 | 3
Bayes C45Tree Tree | 0.687 | 345 | 6 | 2
Bayes C45Tree Tree | 0.714 | 214 | 9 | 6
Bayes C45Tree Tree | 0.723 | 148 | 18 | 4
Bayes KNN Rule | 0.575 | 214 | 9 | 6
Bayes KNN Rule | 0.59 | 368 | 20 | 3
Bayes KNN Rule | 0.623 | 345 | 6 | 2
Bayes KNN Rule | 0.644 | 286 | 9 | 2
Bayes KNN Rule | 0.659 | 296 | 13 | 2
Bayes KNN Tree | 0.655 | 368 | 20 | 3
Bayes KNN Tree | 0.675 | 286 | 9 | 2
Bayes KNN Tree | 0.676 | 345 | 6 | 2
Bayes KNN Tree | 0.706 | 214 | 9 | 6
Bayes Rule Tree | 0.575 | 214 | 9 | 6
Bayes Rule Tree | 0.59 | 368 | 20 | 3
Bayes Rule Tree | 0.623 | 345 | 6 | 2
Bayes Rule Tree | 0.644 | 286 | 9 | 2
Bayes Rule Tree | 0.645 | 296 | 13 | 2
C45Tree KNN Rule | 0.571 | 214 | 9 | 6
C45Tree KNN Rule | 0.59 | 368 | 20 | 3
C45Tree KNN Rule | 0.623 | 345 | 6 | 2
C45Tree KNN Rule | 0.644 | 286 | 9 | 2
C45Tree KNN Rule | 0.669 | 296 | 13 | 2
C45Tree KNN Tree | 0.678 | 286 | 9 | 2
C45Tree KNN Tree | 0.693 | 345 | 6 | 2
C45Tree KNN Tree | 0.729 | 148 | 18 | 4
C45Tree KNN Tree | 0.733 | 214 | 9 | 6
C45Tree Rule Tree | 0.571 | 214 | 9 | 6
C45Tree Rule Tree | 0.59 | 368 | 20 | 3
C45Tree Rule Tree | 0.623 | 345 | 6 | 2
C45Tree Rule Tree | 0.644 | 286 | 9 | 2
C45Tree Rule Tree | 0.659 | 296 | 13 | 2
KNN Rule Tree | 0.58 | 214 | 9 | 6
KNN Rule Tree | 0.59 | 368 | 20 | 3
KNN Rule Tree | 0.623 | 345 | 6 | 2
KNN Rule Tree | 0.644 | 286 | 9 | 2
KNN Rule Tree | 0.659 | 296 | 13 | 2

Table VII. Split generated data for clustering: Good accuracy (cont.)

Combination | Acc | # Inst | # Attr | # Class
Bayes KNN Rule | 0.748 | 148 | 18 | 4
Bayes KNN Rule | 0.762 | 106 | 57 | 2
Bayes KNN Rule | 0.768 | 690 | 15 | 2
Bayes KNN Rule | 0.783 | 977 | 14 | 2
Bayes KNN Rule | 0.789 | 2201 | 3 | 2
Bayes KNN Rule | 0.858 | 1728 | 6 | 4
Bayes KNN Rule | 0.875 | 958 | 9 | 2
Bayes KNN Rule | 0.912 | 351 | 32 | 2
Bayes KNN Rule | 0.938 | 435 | 16 | 2

Bayes KNN Rule | 0.938 | 569 | 20 | 2
Bayes KNN Rule | 0.946 | 699 | 9 | 2
Bayes KNN Tree | 0.756 | 148 | 18 | 4
Bayes KNN Tree | 0.804 | 977 | 14 | 2
Bayes KNN Tree | 0.818 | 296 | 13 | 2
Bayes KNN Tree | 0.825 | 2201 | 3 | 2
Bayes KNN Tree | 0.858 | 106 | 57 | 2
Bayes KNN Tree | 0.867 | 690 | 15 | 2
Bayes KNN Tree | 0.887 | 958 | 9 | 2
Bayes KNN Tree | 0.933 | 435 | 16 | 2
Bayes KNN Tree | 0.936 | 1728 | 6 | 4
Bayes KNN Tree | 0.943 | 351 | 32 | 2
Bayes KNN Tree | 0.947 | 569 | 20 | 2
Bayes KNN Tree | 0.96 | 699 | 9 | 2
Bayes Rule Tree | 0.748 | 148 | 18 | 4
Bayes Rule Tree | 0.761 | 690 | 15 | 2
Bayes Rule Tree | 0.762 | 106 | 57 | 2
Bayes Rule Tree | 0.781 | 977 | 14 | 2
Bayes Rule Tree | 0.789 | 2201 | 3 | 2
Bayes Rule Tree | 0.858 | 1728 | 6 | 4
Bayes Rule Tree | 0.871 | 958 | 9 | 2
Bayes Rule Tree | 0.914 | 351 | 32 | 2
Bayes Rule Tree | 0.935 | 569 | 20 | 2
Bayes Rule Tree | 0.936 | 435 | 16 | 2
Bayes Rule Tree | 0.941 | 699 | 9 | 2

Table VIII. Split generated data for clustering: Good accuracy (cont.)

Combination | Acc | # Inst | # Attr | # Class
C45Tree KNN Rule | 0.752 | 106 | 57 | 2
C45Tree KNN Rule | 0.767 | 690 | 15 | 2
C45Tree KNN Rule | 0.768 | 148 | 18 | 4
C45Tree KNN Rule | 0.786 | 977 | 14 | 2
C45Tree KNN Rule | 0.789 | 2201 | 3 | 2
C45Tree KNN Rule | 0.858 | 1728 | 6 | 4
C45Tree KNN Rule | 0.885 | 958 | 9 | 2
C45Tree KNN Rule | 0.909 | 351 | 32 | 2
C45Tree KNN Rule | 0.938 | 435 | 16 | 2
C45Tree KNN Rule | 0.938 | 569 | 20 | 2
C45Tree KNN Rule | 0.943 | 699 | 9 | 2
C45Tree KNN Tree | 0.774 | 296 | 13 | 2
C45Tree KNN Tree | 0.797 | 2201 | 3 | 2
C45Tree KNN Tree | 0.83 | 977 | 14 | 2
C45Tree KNN Tree | 0.849 | 106 | 57 | 2
C45Tree KNN Tree | 0.87 | 690 | 15 | 2
C45Tree KNN Tree | 0.892 | 958 | 9 | 2
C45Tree KNN Tree | 0.926 | 351 | 32 | 2
C45Tree KNN Tree | 0.933 | 1728 | 6 | 4
C45Tree KNN Tree | 0.944 | 569 | 20 | 2
C45Tree KNN Tree | 0.954 | 699 | 9 | 2
C45Tree KNN Tree | 0.961 | 435 | 16 | 2
C45Tree Rule Tree | 0.752 | 106 | 57 | 2
C45Tree Rule Tree | 0.754 | 148 | 18 | 4

84 C45Tree Rule Tree 0.761 690 15 2 C45Tree Rule Tree 0.782 977 14 2 C45Tree Rule Tree 0.789 2201 3 2 C45Tree Rule Tree 0.858 1728 6 4 C45Tree Rule Tree 0.885 958 9 2 C45Tree Rule Tree 0.912 351 32 2 C45Tree Rule Tree 0.933 435 16 2 C45Tree Rule Tree 0.933 569 20 2 C45Tree Rule Tree 0.934 699 9 2 KNN Rule Tree 0.761 148 18 4 KNN Rule Tree 0.762 106 57 2 KNN Rule Tree 0.767 690 15 2 Table IX. Model clusters-round 2 Cluster 1 Clust Acc # # Attribs Cls 1 Bayes C45Tree KNN 0.771 18 4 1 Bayes C45Tree KNN 0.781 13 2 1 Bayes C45Tree KNN 0.834 14 2 1 Bayes C45Tree KNN 0.862 15 2 1 Bayes C45Tree KNN 0.928 16 2 1 Bayes C45Tree KNN 0.944 20 2 1 Bayes C45Tree Rule 0.748 18 4 1 Bayes C45Tree Rule 0.761 15 2 1 Bayes C45Tree Rule 0.784 14 2 1 Bayes C45Tree Rule 0.936 16 2 1 Bayes C45Tree Rule 0.937 20 2 1 Bayes C45Tree Tree 0.777 13 2 1 Bayes C45Tree Tree 0.827 14 2 1 Bayes C45Tree Tree 0.862 15 2 1 Bayes C45Tree Tree 0.94 16 2 1 Bayes C45Tree Tree 0.944 20 2 1 Bayes KNN Rule 0.748 18 4 1 Bayes KNN Rule 0.768 15 2 1 Bayes KNN Rule 0.783 14 2 1 Bayes KNN Rule 0.938 16 2 1 Bayes KNN Rule 0.938 20 2 1 Bayes KNN Tree 0.756 18 4 1 Bayes KNN Tree 0.804 14 2 1 Bayes KNN Tree 0.818 13 2 1 Bayes KNN Tree 0.867 15 2 1 Bayes KNN Tree 0.933 16 2 1 Bayes KNN Tree 0.947 20 2 1 Bayes Rule Tree 0.748 18 4 1 Bayes Rule Tree 0.761 15 2 1 Bayes Rule Tree 0.781 14 2 1 Bayes Rule Tree 0.935 20 2 1 Bayes Rule Tree 0.936 16 2 1 C45Tree KNN Rule 0.767 15 2 1 C45Tree KNN Rule 0.768 18 4 1 C45Tree KNN Rule 0.786 14 2 1 C45Tree KNN Rule 0.938 16 2 1 C45Tree KNN Rule 0.938 20 2

85 1 C45Tree KNN Tree 0.774 13 2 1 C45Tree KNN Tree 0.83 14 2 1 C45Tree KNN Tree 0.87 15 2 1 C45Tree KNN Tree 0.944 20 2 1 C45Tree KNN Tree 0.961 16 2 1 C45Tree Rule Tree 0.754 18 4 1 C45Tree Rule Tree 0.761 15 2 1 C45Tree Rule Tree 0.782 14 2 1 C45Tree Rule Tree 0.933 16 2 1 C45Tree Rule Tree 0.933 20 2 1 KNN Rule Tree 0.761 18 4 1 KNN Rule Tree 0.767 15 2 1 KNN Rule Tree 0.783 14 2 1 KNN Rule Tree 0.938 16 2 1 KNN Rule Tree 0.938 20 2 Table X. Model clusters-round 2 Cluster 2 Clust Acc # # Attribs Cls 2 Bayes C45Tree KNN 0.825 3 2 2 Bayes C45Tree KNN 0.891 9 2 2 Bayes C45Tree KNN 0.928 6 4 2 Bayes C45Tree KNN 0.961 9 2 2 Bayes C45Tree Rule 0.789 3 2 2 Bayes C45Tree Rule 0.858 6 4 2 Bayes C45Tree Rule 0.885 9 2 2 Bayes C45Tree Rule 0.941 9 2 2 Bayes C45Tree Tree 0.783 3 2 2 Bayes C45Tree Tree 0.866 9 2 2 Bayes C45Tree Tree 0.933 6 4 2 Bayes C45Tree Tree 0.954 9 2 2 Bayes KNN Rule 0.789 3 2 2 Bayes KNN Rule 0.858 6 4 2 Bayes KNN Rule 0.875 9 2 2 Bayes KNN Rule 0.946 9 2 2 Bayes KNN Tree 0.825 3 2 2 Bayes KNN Tree 0.887 9 2 2 Bayes KNN Tree 0.936 6 4 2 Bayes KNN Tree 0.96 9 2 2 Bayes Rule Tree 0.789 3 2 2 Bayes Rule Tree 0.858 6 4 2 Bayes Rule Tree 0.871 9 2 2 Bayes Rule Tree 0.941 9 2 2 C45Tree KNN Rule 0.789 3 2 2 C45Tree KNN Rule 0.858 6 4 2 C45Tree KNN Rule 0.885 9 2 2 C45Tree KNN Rule 0.943 9 2 2 C45Tree KNN Tree 0.797 3 2 2 C45Tree KNN Tree 0.892 9 2 2 C45Tree KNN Tree 0.933 6 4 2 C45Tree KNN Tree 0.954 9 2 2 C45Tree Rule Tree 0.789 3 2 2 C45Tree Rule Tree 0.858 6 4 2 C45Tree Rule Tree 0.885 9 2 2 C45Tree Rule Tree 0.934 9 2 2 KNN Rule Tree 0.789 3 2

86 2 KNN Rule Tree 0.858 6 4 2 KNN Rule Tree 0.875 9 2 2 KNN Rule Tree 0.944 9 2 Table XI. Model clusters-round 2 Cluster 3 Cluster Accuracy # Attributes # Classes 3 Bayes C45Tree KNN 0.876 57 2 3 Bayes C45Tree Rule 0.752 57 2 3 Bayes C45Tree Tree 0.858 57 2 3 Bayes KNN Rule 0.762 57 2 3 Bayes KNN Tree 0.858 57 2 3 Bayes Rule Tree 0.762 57 2 3 C45Tree KNN Rule 0.752 57 2 3 C45Tree KNN Tree 0.849 57 2 3 C45Tree Rule Tree 0.752 57 2 Table XII. Model Clusters-Round 2 Cluster 4 Cluster Accuracy # Attributes # Classes 4 Bayes C45Tree KNN 0.923 32 2 4 Bayes C45Tree Rule 0.912 32 2 4 Bayes C45Tree Tree 0.926 32 2 4 Bayes KNN Rule 0.912 32 2 4 Bayes KNN Tree 0.943 32 2 4 Bayes Rule Tree 0.914 32 2 4 C45Tree KNN Rule 0.909 32 2 4 C45Tree KNN Tree 0.926 32 2 4 C45Tree Rule Tree 0.912 32 2 4 KNN Rule Tree 0.912 32 2