Package multisom. May 23, 2017

Similar documents
Package ExactCIdiff. February 19, 2015

Package STI. August 19, Index 7. Compute the Standardized Temperature Index

Package jackstraw. August 7, 2018

Evaluating and Classifying NBA Free Agents

Package pinnacle.data

Package mrchmadness. April 9, 2017

Neural Networks II. Chen Gao. Virginia Tech Spring 2019 ECE-5424G / CS-5824

RUGBY is a dynamic, evasive, and highly possessionoriented

Two Machine Learning Approaches to Understand the NBA Data

ORF 201 Computer Methods in Problem Solving. Final Project: Dynamic Programming Optimal Sailing Strategies

Package gvc. R topics documented: August 29, Version Title Global Value Chains Tools

Special Topics: Data Science

Tokyo: Simulating Hyperpath-Based Vehicle Navigations and its Impact on Travel Time Reliability

Package rdpla. August 13, 2017

Package simmr. August 29, 2016

The Intrinsic Value of a Batted Ball Technical Details

A definition of depth for functional observations

Using Spatio-Temporal Data To Create A Shot Probability Model

Mixture Models & EM. Nicholas Ruozzi University of Texas at Dallas. based on the slides of Vibhav Gogate

The Regional Power Grid Team

World Leading Traffic Analysis

Cricket umpire assistance and ball tracking system using a single smartphone camera

Package physiology. July 16, 2018

Credibility theory features of actuar

Fun Neural Net Demo Site. CS 188: Artificial Intelligence. N-Layer Neural Network. Multi-class Softmax Σ >0? Deep Learning II

Folding Reticulated Shell Structure Wind Pressure Coefficient Prediction Research based on RBF Neural Network

Nadiya Afzal 1, Mohd. Sadim 2

knn & Naïve Bayes Hongning Wang

Combined impacts of configurational and compositional properties of street network on vehicular flow

How to measure the average pedestrian access of a place to a group of important locations?

Package BAwiR. January 25, 2018

Guidelines for Applying Multilevel Modeling to the NSCAW Data

Algorithms and Software for the Golf Director Problem

Robust specification testing in regression: the FRESET test and autocorrelated disturbances

Simulation of Pedestrian Flow Based on Multi-Agent

Pressure Vessel Calculation for Yavin Thruster

swmath - Challenges, Next Steps, and Outlook

2 When Some or All Labels are Missing: The EM Algorithm

Information Technology for Monitoring of Municipal Gas Consumption, Based on Additive Model and Correlated for Weather Factors

Replay using Recomposition: Alignment-Based Conformance Checking in the Large

Introduction to topological data analysis

Workshop 1: Bubbly Flow in a Rectangular Bubble Column. Multiphase Flow Modeling In ANSYS CFX Release ANSYS, Inc. WS1-1 Release 14.

Taking Your Class for a Walk, Randomly

PREDICTING the outcomes of sporting events

Wireless Networks Fall Part #4: The Cellular Principle Capacity, Interference, and Traffic Engineering

Exposure at Default models with and without the Credit Conversion Factor

Lecture 10. Support Vector Machines (cont.)

Package nhlscrapr. R topics documented: March 8, Type Package

MA PM: Memetic algorithms with population management

Application of Dijkstra s Algorithm in the Evacuation System Utilizing Exit Signs

Opleiding Informatica

Bayesian Optimized Random Forest for Movement Classification with Smartphones

ASABE Robotics Student Design Competition Committee P-127

CC-Log: Drastically Reducing Storage Requirements for Robots Using Classification and Compression

Diver Training Options

This file is part of the following reference:

Modeling Planned and Unplanned Store Stops for the Scenario Based Simulation of Pedestrian Activity in City Centers

ENHANCED PARKWAY STUDY: PHASE 2 CONTINUOUS FLOW INTERSECTIONS. Final Report

Robust Imputation of Missing Values in Compositional Data Using the R-Package robcompositions

Estimating the Probability of Winning an NFL Game Using Random Forests

Minimum Mean-Square Error (MMSE) and Linear MMSE (LMMSE) Estimation

Track B: Particle Methods Part 2 PRACE Spring School 2012

Chapter 6. Analysis of the framework with FARS Dataset

AIR-X COMPACT AIR-X ON-LINE LUBRICANT AERATION MEASUREMENT. Technical Brochure ON RUNNING ENGINES

Paul Burkhardt. May 19, 2016

User Help. Fabasoft Scrum

Uninformed Search (Ch )

Supplementary Figures

Development of Fluid-Structure Interaction Program for the Mercury Target

Cricket Team Selection and Analysis by Using DEA Algorithm in Python

Package macleish. January 3, 2018

Why We Should Use the Bullpen Differently

The Effect of Von Karman Vortex Street on Building Ventilation

Singapore Mosaic PRODUCT SPECIFICATIONS

A new Decomposition Algorithm for Multistage Stochastic Programs with Endogenous Uncertainties

AGA Swiss McMahon Pairing Protocol Standards

#19 MONITORING AND PREDICTING PEDESTRIAN BEHAVIOR USING TRAFFIC CAMERAS

Ocean Wave Forecasting

Navigate to the golf data folder and make it your working directory. Load the data by typing

Autodesk Moldflow Communicator Process settings

Applied Econometrics with. Time, Date, and Time Series Classes. Motivation. Extension 2. Motivation. Motivation

Numerical Simulation for the Internal Flow Analysis of the Linear Compressor with Improved Muffler

A Chiller Control Algorithm for Multiple Variablespeed Centrifugal Compressors

Running WIEN2K on Ranger

Transposition Table, History Heuristic, and other Search Enhancements

Cluster Dendrogram. SolincoTour8. WilsonJuice108 VolklOrganixV1Midplus. YonexEZone98 DonnayProOne. HeadPrestigeMid. HeadPrestigeMidplus.

Chapter 13. Factorial ANOVA. Patrick Mair 2015 Psych Factorial ANOVA 0 / 19

Basketball field goal percentage prediction model research and application based on BP neural network

DECISION MODELING AND APPLICATIONS TO MAJOR LEAGUE BASEBALL PITCHER SUBSTITUTION

Ivan Suarez Robles, Joseph Wu 1

Lossless Comparison of Nested Software Decompositions

Formula 1 Statistics Feeds

Black Box testing Exercises. Lecturer: Giuseppe Santucci

A HYBRID METHOD FOR CALIBRATION OF UNKNOWN PARTIALLY/FULLY CLOSED VALVES IN WATER DISTRIBUTION SYSTEMS ABSTRACT

ECE 697B (667) Spring 2003

APPLICATION OF PUSHOVER ANALYSIS ON EARTHQUAKE RESPONSE PREDICATION OF COMPLEX LARGE-SPAN STEEL STRUCTURES

Highly Cited Psychometrika Articles:

FMSN60/MASM18 Financial Statistics Lecture 1, Introduction and stylized facts. Magnus Wiktorsson

STRAVA - DATA ACCESS AND USES. March 9, 2016 Shaun Davis + Dewayne Carver Florida Dept. of Transportation

STRESS ANALYSIS OF BELL CRANK LEVER

Transcription:

Package multisom May 23, 2017 Type Package Title Clustering a Data Set using Multi-SOM Algorithm Version 1.3 Author Sarra Chair and Malika Charrad Maintainer Sarra Chair <sarra.chair@gmail.com> Description Implements two versions of the algorithm namely: stochastic and batch. The package determines also the best number of clusters and offers to the user the best clustering scheme from different results. License GPL-2 Depends R (>= 3.1.3), class Imports kohonen URL https://sites.google.com/site/malikacharrad/research/multisom-package NeedsCompilation no Repository CRAN Date/Publication 2017-05-23 17:28:23 UTC R topics documented: BatchSOM......................................... 2 multisom.batch....................................... 3 multisom.stochastic..................................... 6 Index 8 1

2 BatchSOM BatchSOM Self-Organizing Map: Batch version Description This function implements the batch version of the kohonen algorithm Usage BatchSOM(data,grid = somgrid(),min.radius=0.0001, max.radius=0.002,maxit=1000, init=c("random","sample","linear"), radius.type=c("gaussian","bubble","cutgauss","ep")) Arguments data grid min.radius max.radius maxit init data to be used a grid for the representatives.the numbers of nodes should be approximately equal to 5*sqrt(n), which n denotes the number of sample. the minimum neighbourhood radius the maximum neighbourhood radius the maximum number of iterations to be done the method to be used to initialize the prototypes.the following are permitted: "random" uses random draws from N(0,1); "sample" uses a radom sample from the data; "linear" uses the linear grids upon the first two principle components direction.see package som. radius.type the neighborhood function type. The following are permitted: "gaussian" "bubble" "cutgauss" "ep" Value classif codes grid a vector of integer indicating to which unit each observation has been assigned a matrix of code vectors the grid, an object of class "somgrid" Author(s) Sarra Chair and Malika Charrad

multisom.batch 3 References Kohonen, T. (1995) Self-Organizing Maps. Springer-Verlag. Brian Ripley, William Venables (2015), class: Functions for Classification, URL https://cran.rproject.org/package=class. Jun Yan (2010), som: Self-Organizing Map, URL https://cran.r-project.org/package=som. Examples data<-iris[,-c(5)] BatchSOM(data,grid = somgrid(7,7,"hexagonal"),min.radius=0.0001, max.radius=0.002,maxit=1000,"random","gaussian") multisom.batch MultiSOM for batch version Description Usage This function implements the batch version of MultiSOM algorithm. multisom.batch(data= NULL,xheight,xwidth,topo=c("rectangular", "hexagonal"),min.radius,max.radius,maxit=1000, init=c("random","sample","linear"),radius.type= c("gaussian","bubble","cutgauss","ep"),index="all") Arguments data xheight xwidth topo min.radius max.radius maxit init data to be used the x-dimension of the map the y-dimension of the map the topology used to build the grid.the following are permitted: "hexagonal" "rectangular" the minimum neighbourhood radius the maximum neighbourhood radius the maximum number of iterations to be done the method to be used to initialize the prototypes.the following are permitted: "random" uses random draws from N(0,1); "sample" uses a radom sample from the data; "linear" uses the linear grids upon the first two principle components direction.

4 multisom.batch Details radius.type the neighborhood function type. The following are permitted: "gaussian" "bubble" "cutgauss" "ep" index vector of the index to be calculated. This should be one of : "db", "dunn", "silhouette", "ptbiserial", "ch", "cindex", "ratkowsky", "mcclain", "gamma", "gplus", "tau", "ccc", "scott", "marriot", "trcovw", "tracew", "friedman", "rubin", "ball", "sdbw", "dindex", "hubert", "sv", "xie-beni", "hartigan", "ssi", "xu", "rayturi", "pbm", "banfeld", "all" (all indices will be used) Index Optimal number of clusters 1. "db" or "all" Minimum value of the index (Davies and Bouldin 1979) 2. "dunn" or "all" Maximum value of the index (Dunn 1974) 3. "silhouette" or "all" Maximum value of the index (Rousseeuw 1987) 4. "ptbiserial" or "all" Maximum value of the index (Milligan 1980, 1981) 5. "ch" or "all" Maximum value of the index (Calinski and Harabasz 1974) 6. "cindex" or "all" Minimum value of the index (Hubert and Levin 1976) 7. "ratkowsky" or "all" Maximum value of the index (Ratkowsky and Lance 1978) 8. "mcclain" or "all" Minimum value of the index (McClain and Rao 1975) 9. "gamma" or "all" Maximum value of the index (Baker and Hubert 1975) 10. "gplus" or "all" Minimum value of the index (Rohlf 1974) (Milligan 1981) 11. "tau" or "all" Maximum value of the index (Rohlf 1974) (Milligan 1981) 12. "ccc" or "all" Maximum value of the index (Sarle 1983) 13. "scott" or "all" Max. difference between hierarchy (Scott and Symons 1971) levels of the index 14. "marriot" or "all" Max. value of second differences (Marriot 1971) between levels of the index 15. "trcovw" or "all" Max. difference between hierarchy (Milligan and Cooper 1985) levels of the index 16. "tracew" or "all" Max. value of absolute second (Milligan and Cooper 1985) differences between levels of the index 17. "friedman" or "all" Max. difference between hierarchy (Friedman and Rubin 1967) levels of the index 18. "rubin" or "all" Min. value of second differences (Friedman and Rubin 1967) between levels of the index 19. "ball" or "all" Max. difference between hierarchy

multisom.batch 5 (Ball and Hall 1965) levels of the index 20. "sdbw" or "all" Minimum value of the index (Halkidi and Vazirgiannis 2001) 21. "dindex" or "all" Graphical method (Lebart et al. 2000) 22. "hubert" or "all" Graphical method (Hubert and Arabie 1985) 23. "sv" or "all" Maximum value of the index (Zalik and Zalik, 2011) 24. "xie-beni" or "all" Minimum value of the index (Xie and Beni 1991) 25. "hartigan" or "all" Maximum difference between (Hartigan 1975) hierarchy levels of the index 26. "ssi" or "all" Maximum value of the index (Dolnicar,Grabler and Mazanec 1999) 27. "xu" or "all" Max. value of second differences (Xu 1997) between levels of the index 28. "rayturi" or "all" Minimum value of the index (Ray and Turi 1999) 29. "pbm" or "all" Maximum value of the index (Bandyopadhyay,Pakhira and Maulik 2004) 30. "banfeld" or "all" Minimum value of the index (Banield and Raftery 1974) Value All.index.by.layer Values of indices for each layer Best.nc Best number of clusters proposed by each index and the corresponding index value. Best.partition Partition that corresponds to the best number of clusters Author(s) Sarra Chair and Malika Charrad References Charrad M., Ghazzali N., Boiteau V., Niknafs A. (2014). "NbClust: An R Package for Determining the Relevant Number of Clusters in a Data Set.", "Journal of Statistical Software, 61(6), 1-36.", "URL http://www.jstatsoft.org/v61/i06/". Khanchouch, I., Charrad, M., & Limam, M. (2014). A Comparative Study of Multi-SOM Algorithms for Determining the Optimal Number of Clusters. Journal of Statistical Software, 61(6), 1-36.

6 multisom.stochastic Examples ## A 4-dimensional example set.seed(1) data<-rbind(matrix(rnorm(100,sd=0.3),ncol=2), matrix(rnorm(100,mean=2,sd=0.3),ncol=2), matrix(rnorm(100,mean=4,sd=0.3),ncol=2), matrix(rnorm(100,mean=8,sd=0.3),ncol=2)) res<- multisom.batch(data,xheight= 8, xwidth= 8,"hexagonal", min.radius=0.00010,max.radius=0.002, maxit=1000,"random","gaussian","ch") res$all.index.by.layer res$best.nc res$best.partition multisom.stochastic Multisom for stochastic version Description Usage This function implements the stochastic version of MultiSOM algorithm. multisom.stochastic(data = NULL, xheight = 7, xwidth = 7, topo = c("rectangular", "hexagonal"), neighbouhood.fct =c("bubble","gaussian"), dist.fcts = NULL, rlen = 100,alpha = c(0.05, 0.01), radius = c(2, 1.5, 1.2, 1), index = "all") Arguments data xheight xwidth the data matrix of observations the x-dimension of the map the y-dimension of the map topo the topology used to build the grid.the following are permitted: "hexagonal" "rectangular" neighbouhood.fct the neighbouhood function type. The following are permitted: "gaussian" "bubble" dist.fcts The metric used to determine the distance function. Possible choices are: "sumofsquares" "euclidean" "manhattan" "tanimoto"

multisom.stochastic 7 rlen alpha radius index the maximum number of iterations to be done learning rate, a vector of two numbers indicating the amount of change. Default is to decline linearly from 0.05 to 0.01 over rlen updates. the radius of the neighbourhood, either given as a single number or a vector (start, stop). If it is given as a single number the radius will run from the given number to the negative value of that number; as soon as the neighbourhood gets smaller than one only the winning unit will be updated. vector of the index to be calculated. This should be one of : "db", "dunn", "silhouette", "ptbiserial", "ch", "cindex", "ratkowsky", "mcclain", "gamma", "gplus", "tau", "ccc", "scott", "marriot", "trcovw", "tracew", "friedman", "rubin", "ball", "sdbw", "dindex", "hubert", "sv", "xie-beni", "hartigan", "ssi", "xu", "rayturi", "pbm", "banfeld", "all" (all indices will be used) Value All.index.by.layer Values of indices for each layer. Best.nc Best number of clusters proposed by each index and the corresponding index value. Best.partition Partition that corresponds to the best number of clusters Author(s) Sarra Chair and Malika Charrad Examples ## A real data example data<-as.matrix(iris[,-c(5)]) res<-multisom.stochastic(data, xheight = 8, xwidth = 8,"hexagonal","gaussian", dist.fcts = NULL, rlen = 100,alpha = c(0.05, 0.01), radius = c(2, 1.5, 1.2, 1),c("db","ratkowsky","dunn")) res$all.index.by.layer res$best.nc

Index Topic Clustering BatchSOM, 2 Topic MultiSOM Topic Number of clusters Topic R packages Topic Validity Indices BatchSOM, 2 8