Support Vector Machines: Optimization of Decision Making. Christopher Katinas March 10, 2016



Overview
- Background of Support Vector Machines
- Segregation Functions/Problem Statement
- Methodology
- Training/Testing Results
- Conclusions

Support Vector Machines (SVMs)
- Goal: maximize the margin between two distinct groups via a segregation function.
- Certain engineering problems require high-certainty estimates of the equation separating two data sets (e.g., phase diagrams in thermodynamics).
- Distinct phases are separated by functions that may not be easily described in closed form.
- Can the liquid/vapor line be recreated from only select points, using an SVM to identify the function?
https://en.wikipedia.org/wiki/phase_diagram

Segregation Functions to Match
- Parabola: $y(x) = 0.01x^2 + 5$
- Line: $y(x) = 0.5x + 25$
- Antoine equation for the vapor pressure of water: $y(x) = \frac{101.325}{760} \cdot 10^{\,8.07131 - \frac{1730.63}{x + 233.42}}$
- Circle of radius 23 centered at (54, 50): $y(x) = \pm\sqrt{23^2 - (x - 54)^2} + 50$
- Circle of radius 23 centered at (54, 50) and ellipse centered at (84, 20): the circle is as above; the ellipse has the form $y(x) = \pm\sqrt{\cdots} + 20$, with terms in $6^2$ and $(x - 84)^2$ under the root that are not fully legible in the transcription.
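The legible target functions can be written down directly. The following is a minimal sketch, not the author's code: it reads x as temperature in °C and y as pressure in kPa for the Antoine curve (suggested by the 101.325/760 conversion factor), and the `label` rule (+1 above the curve, -1 otherwise) is a hypothetical convention chosen only for illustration. The ellipse is omitted because its equation is not legible in the transcription.

```python
import math


def parabola(x):
    return 0.01 * x ** 2 + 5.0


def line(x):
    return 0.5 * x + 25.0


def antoine_kpa(t_celsius):
    """Vapor pressure of water from the Antoine equation.

    The constants yield mmHg; the factor 101.325/760 converts to kPa.
    """
    mmhg = 10.0 ** (8.07131 - 1730.63 / (t_celsius + 233.42))
    return mmhg * 101.325 / 760.0


def inside_circle(x, y, cx=54.0, cy=50.0, r=23.0):
    """True when (x, y) lies strictly inside the circle of radius r."""
    return math.hypot(x - cx, y - cy) < r


def label(x, y, boundary):
    """Hypothetical labeling rule: +1 above the curve, -1 on or below it."""
    return 1 if y > boundary(x) else -1


# At 100 degrees C the curve should sit close to one atmosphere.
print(round(antoine_kpa(100.0), 1))
```

A quick sanity check on the Antoine constants is that the curve passes near 101.3 kPa at 100 °C, which is what the final line prints to one decimal place.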

Methodology
Solve the Lagrangian dual problem:

$$\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j)$$

such that $C \ge \alpha_i \ge 0$ and $\sum_{i=1}^{n} y_i \alpha_i = 0$.

Equivalently, as a minimization:

$$\min_{\alpha} \; -\sum_{i=1}^{n} \alpha_i + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j)$$

such that $C \ge \alpha_i \ge 0$ and $\sum_{i=1}^{n} y_i \alpha_i = 0$. Matlab quadprog can solve this!

Kernel function:

$$K(x_i, x_j) = \left(P + A\, x_i^T x_j\right)^d$$

Select A to prevent numerical overflow for a given d; P should be large to force the optimizer to solve for the correct weights:
- $A^{-1} = \max(x_i^T x_j)$ [normalize inputs]
- $P$: a large constant that depends on $d$ (transcribed as "1eeee 11/dd"; the exact expression is not recoverable)

C was set to 1.0 for all simulations performed in this study (hence the choice of A and P).
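To make the dual problem concrete, here is a minimal sketch in plain Python. It is not the Matlab quadprog route named on the slide: it swaps in a simple pairwise coordinate-ascent loop (in the style of SMO) so that it runs without a QP library. The function names, the toy values P = 1 and d = 2, and the number of sweeps are all assumptions made for illustration; the study itself reports an eighth-order kernel and a large P.

```python
def poly_kernel(x, z, P, A, d):
    """K(x, z) = (P + A * x^T z)^d, the kernel form given on the slide."""
    return (P + A * sum(a * b for a, b in zip(x, z))) ** d


def train_dual(X, y, C=1.0, P=1.0, d=2, sweeps=200):
    """Maximize the dual with pairwise updates that keep sum(y_i * alpha_i) = 0."""
    n = len(X)
    # A^-1 = max(x_i^T x_j), so A * x_i^T x_j <= 1 on the training set.
    A = 1.0 / max(sum(a * b for a, b in zip(X[i], X[j]))
                  for i in range(n) for j in range(n))
    K = [[poly_kernel(X[i], X[j], P, A, d) for j in range(n)] for i in range(n)]
    alpha = [0.0] * n  # alpha = 0 satisfies both constraints

    def g(k):
        """Decision value at training point k, without the bias term."""
        return sum(alpha[m] * y[m] * K[m][k] for m in range(n))

    for _ in range(sweeps):
        for i in range(n):
            for j in range(i + 1, n):
                eta = K[i][i] + K[j][j] - 2.0 * K[i][j]
                if eta <= 1e-12:
                    continue
                # Limits on alpha_j along the line y_i*a_i + y_j*a_j = const.
                if y[i] != y[j]:
                    lo = max(0.0, alpha[j] - alpha[i])
                    hi = min(C, C + alpha[j] - alpha[i])
                else:
                    lo = max(0.0, alpha[i] + alpha[j] - C)
                    hi = min(C, alpha[i] + alpha[j])
                # The bias cancels in the error difference E_i - E_j.
                step = y[j] * ((g(i) - y[i]) - (g(j) - y[j])) / eta
                new_j = min(hi, max(lo, alpha[j] + step))
                alpha[i] += y[i] * y[j] * (alpha[j] - new_j)
                alpha[j] = new_j

    # Bias from multipliers strictly inside the box (crude fallback: all points).
    free = [k for k in range(n) if 1e-9 < alpha[k] < C - 1e-9] or list(range(n))
    b = sum(y[k] - g(k) for k in free) / len(free)

    def predict(x):
        f = b + sum(alpha[m] * y[m] * poly_kernel(X[m], x, P, A, d)
                    for m in range(n))
        return 1 if f >= 0 else -1

    return alpha, b, predict


# Two-point toy problem with a closed-form answer: alpha = [2/3, 2/3], b = -1.
alpha, b, predict = train_dual([[0.0], [2.0]], [-1, 1])
```

For the two-point toy problem the equality constraint forces both multipliers to be equal, and the dual reduces to maximizing 2α − (3/2)α², which gives α = 2/3. To use quadprog instead, the same problem is passed in its standard form with H_ij = y_i y_j K(x_i, x_j), f = −1, Aeq = yᵀ, beq = 0, lb = 0, and ub = C.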

Training Method
- Use Delaunay triangulation to identify the most critical points and query the function close to the boundary.
- Red line segments denote where the segregation function must reside.
- Specify the maximum number of refinements.
- Keep only the points that bound the function, for faster optimization.
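One plausible reading of this refinement step is sketched below. It assumes the triangulation is already available as index triples (for example from `scipy.spatial.Delaunay(points).simplices`), treats an edge as "critical" when its endpoints carry different labels, and queries the true function at the midpoint of each such edge. The midpoint rule and all function names are assumptions for illustration; the slides do not state exactly where new points are placed.

```python
def mixed_edges(triangles, labels):
    """Triangulation edges whose two endpoints carry different labels."""
    edges = set()
    for a, b, c in triangles:
        for i, j in ((a, b), (b, c), (a, c)):
            if labels[i] != labels[j]:
                edges.add((min(i, j), max(i, j)))
    return sorted(edges)


def refine_once(points, labels, triangles, oracle):
    """Query the oracle at the midpoint of every mixed edge (assumed rule)."""
    new_points, new_labels = [], []
    for i, j in mixed_edges(triangles, labels):
        mid = tuple((p + q) / 2.0 for p, q in zip(points[i], points[j]))
        new_points.append(mid)
        new_labels.append(oracle(mid))
    return new_points, new_labels


def bounding_points(triangles, labels):
    """Indices of points on at least one mixed edge: the ones worth keeping."""
    return sorted({k for edge in mixed_edges(triangles, labels) for k in edge})


# Unit square split into two triangles; the true boundary is y = 0.4.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
triangles = [(0, 1, 2), (0, 2, 3)]


def oracle(p):
    return 1 if p[1] > 0.4 else -1


labels = [oracle(p) for p in points]
edges = mixed_edges(triangles, labels)
new_points, new_labels = refine_once(points, labels, triangles, oracle)
```

Repeating `refine_once` after re-triangulating, up to the chosen maximum number of refinements, concentrates the queried points near the boundary, and `bounding_points` gives the subset to pass to the optimizer.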

Training/Testing Results
Legend, first plot:
- Magenta X = Group 1 test points
- Green X = Group 2 test points
- Blue lines = segregation function anti-gate
- Red lines = segregation function gate
- Black circles = desired new points
Legend, second plot:
- Magenta X = Group 1 test points
- Blue area = testing Group 1
- Green X = Group 2 test points
- Maroon area = testing Group 2
- Cyan circles = support vectors
- Yellow line = actual boundary
All results are shown for 8 refinements and five random seeding points on each side of the function. An eighth-order polynomial kernel was used. All training points were kept!

Training/Testing Results
- Parabola: 0.68% error
- Line: 0.54% error
- Antoine: 0.94% error
- Circle: 0.13% error
- Circle/Ellipse: 0.60% error

Training/Testing Results
- The error for the Antoine equation was due to having no test points at the boundaries.
- One point was created at each corner of the domain [pre-seeding].
- Antoine, no pre-seeding: 0.94% error
- Antoine, pre-seeded boundary points only: 0.26% error
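Pre-seeding can be sketched as adding the four corners of the domain, each labeled by the true function. The domain limits below (0 to 100 in x, 0 to 120 in y) and the labeling rule (+1 above the curve) are hypothetical; the slides do not give the domain used for the Antoine case.

```python
def antoine_kpa(t_celsius):
    """Antoine equation for water, converted from mmHg to kPa."""
    return 10.0 ** (8.07131 - 1730.63 / (t_celsius + 233.42)) * 101.325 / 760.0


def corner_seeds(x_min, x_max, y_min, y_max, boundary):
    """One labeled point at each corner of the rectangular domain."""
    corners = [(x_min, y_min), (x_max, y_min), (x_min, y_max), (x_max, y_max)]
    labels = [1 if y > boundary(x) else -1 for x, y in corners]
    return corners, labels


# Hypothetical domain: 0-100 on the x axis, 0-120 on the y axis.
corners, corner_labels = corner_seeds(0.0, 100.0, 0.0, 120.0, antoine_kpa)
```

With this domain the two lower corners fall below the curve and the two upper corners fall above it, so both groups are anchored at the edges of the domain before any refinement takes place.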

Training/Testing Results with Noise
- Slack variables are automatically included by the methodology shown earlier.
- More support vectors result than in the no-noise case because the segregation function is harder to fit.
- Antoine, zero noise: 0.26% error
- Antoine, with noise: 0.70% error (5 units of uniform random noise prescribed in each input variable)
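The noise experiment can be imitated by perturbing each input coordinate before training. The slides do not say whether "5 units" means a range of ±5 or a total width of 5, so this sketch assumes ±5. It also assumes labels are assigned from the clean points before the noise is added. The function name and the seed are illustrative.

```python
import random


def add_uniform_noise(points, amplitude=5.0, seed=0):
    """Shift every coordinate by an independent draw from U(-amplitude, amplitude)."""
    rng = random.Random(seed)
    return [tuple(c + rng.uniform(-amplitude, amplitude) for c in p)
            for p in points]


clean = [(10.0, 20.0), (50.0, 60.0), (90.0, 30.0)]
noisy = add_uniform_noise(clean)
```

Because noisy points near the boundary can land on the wrong side of it, some multipliers reach the upper bound C. This is how the box constraint in the dual plays the role of slack variables, and it is consistent with the larger number of support vectors reported for the noisy case.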

Conclusions
- SVMs are extremely versatile in allowing for quantifiable decision-making strategies.
- The capability of support vector machines was successfully demonstrated via five examples.
- Care must be taken in selecting the parameters and training points.
- A poor choice of the number of training points can lead to an improper bounding function and ultimately to higher error.
- Delaunay triangulation is a new method for acquiring more desirable training points than random sampling of the domain space.
- The modified kernel function constants were chosen for optimization versatility and general convergence.
- Noise can be included, and the SVM is still capable of creating a reasonable segregation function.