Support Vector Machines: Optimization of Decision Making
Christopher Katinas
March 10, 2016
Overview
- Background of Support Vector Machines
- Segregation Functions / Problem Statement
- Methodology
- Training/Testing Results
- Conclusions
Support Vector Machines (SVMs)
- Goal: maximize the margin between two distinct groups via a segregation function.
- Certain engineering problems require high-certainty estimates of the equation separating two data sets (e.g., phase diagrams in thermodynamics).
- Distinct phases are separated by functions that may not be easily described in closed form.
- Can the liquid/vapor line be recreated using only select points, with an SVM identifying the function?
https://en.wikipedia.org/wiki/phase_diagram
Segregation Functions to Match
- Parabola: y(x) = 0.01x^2 + 5
- Line: y(x) = 0.5x + 25
- Antoine equation for vapor pressure of water: y(x) = (101.325/760) * 10^(8.07131 - 1730.63/(x + 233.42))
- Circle of radius 23 centered at (54, 50): y(x) = +/- sqrt(23^2 - (x - 54)^2) + 50
- Circle of radius 23 centered at (54, 50) together with an ellipse centered at (84, 20): y(x) = +/- sqrt(23^2 - (x - 54)^2) + 50 and y(x) = +/- 2 sqrt(1 - (x - 84)^2 / 6^2) + 20
Methodology
Solve the Lagrangian dual problem:

  max over alpha:  sum_{i=1..n} alpha_i - (1/2) sum_{i=1..n} sum_{j=1..n} alpha_i alpha_j y_i y_j K(x_i, x_j)
  such that 0 <= alpha_i <= C and sum_{i=1..n} y_i alpha_i = 0

or, equivalently, as the minimization a QP solver expects:

  min over alpha:  -sum_{i=1..n} alpha_i + (1/2) sum_{i=1..n} sum_{j=1..n} alpha_i alpha_j y_i y_j K(x_i, x_j)
  such that 0 <= alpha_i <= C and sum_{i=1..n} y_i alpha_i = 0

Kernel function: K(x_i, x_j) = (P + A x_i^T x_j)^d
- Select A to prevent numerical overflow for a given d; P should be large to force the optimizer to solve for the correct weights:
  - A^(-1) = max(x_i^T x_j)  [normalize inputs]
  - P = (1e11)^(1/d)
- MATLAB's quadprog can solve this!
- C was set to 1.0 for all simulations performed in this study (hence the choice of A and P).
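As a sketch, the kernel scaling rule and dual objective above can be written out in Python (a stand-in for the MATLAB quadprog workflow; the toy dataset, the A and P rules as reconstructed here, and all function names are illustrative, not the study's code):

```python
import math

def dot(u, v):
    # Inner product x_i^T x_j for plain tuples/lists.
    return sum(a * b for a, b in zip(u, v))

def kernel_params(X, d):
    # Slide's scaling rule (as reconstructed): A^(-1) = max(x_i^T x_j)
    # normalizes inputs; P = (1e11)^(1/d) makes P^d large.
    A = 1.0 / max(dot(xi, xj) for xi in X for xj in X)
    P = (1e11) ** (1.0 / d)
    return A, P

def kernel(xi, xj, A, P, d):
    # Polynomial kernel K(x_i, x_j) = (P + A x_i^T x_j)^d
    return (P + A * dot(xi, xj)) ** d

def dual_objective(alpha, X, y, A, P, d):
    # W(alpha) = sum_i alpha_i - 1/2 sum_ij alpha_i alpha_j y_i y_j K(x_i, x_j)
    n = len(X)
    lin = sum(alpha)
    quad = sum(alpha[i] * alpha[j] * y[i] * y[j] * kernel(X[i], X[j], A, P, d)
               for i in range(n) for j in range(n))
    return lin - 0.5 * quad

def feasible(alpha, y, C=1.0, tol=1e-9):
    # Box constraint 0 <= alpha_i <= C and equality sum_i y_i alpha_i = 0.
    box = all(-tol <= a <= C + tol for a in alpha)
    eq = abs(sum(ai * yi for ai, yi in zip(alpha, y))) <= tol
    return box and eq
```

A QP solver maximizes `dual_objective` over the set accepted by `feasible`; support vectors are the points with nonzero alpha.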
Training Method
- Use Delaunay triangulation to identify the most critical points and query the function close to the boundary.
- Red line segments denote where the segregation function must reside.
- Specify a maximum number of refinements.
- Keep only the points which bound the function, for faster optimization.
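A minimal Python sketch of the refinement idea: segments joining opposite-class points bracket the segregation function, so their midpoints are good candidates to query next. Here the shortest opposite-class segments stand in for the Delaunay boundary edges, and the oracle `below_boundary`, the point sets, and the refinement counts are hypothetical:

```python
import math

def boundary_midpoints(pts1, pts2, k=3):
    # Return midpoints of the k shortest segments joining the two groups.
    edges = []
    for p in pts1:
        for q in pts2:
            dist = math.dist(p, q)
            mid = tuple((a + b) / 2.0 for a, b in zip(p, q))
            edges.append((dist, mid))
    edges.sort(key=lambda e: e[0])
    return [mid for _, mid in edges[:k]]

def refine(pts1, pts2, below_boundary, rounds=3):
    # Query the true function at each candidate midpoint and add the
    # labeled point to the matching group, shrinking the bracket.
    for _ in range(rounds):
        for mid in boundary_midpoints(pts1, pts2):
            (pts1 if below_boundary(mid) else pts2).append(mid)
    return pts1, pts2
```

Each round tightens the band of red "must reside here" segments around the true boundary, which is where the slide's Delaunay edge list would come from.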
Training/Testing Results
Legend:
- Magenta X = Group 1 test points
- Green X = Group 2 test points
- Blue lines = segregation function anti-gate
- Red lines = segregation function gate
- Black circles = desired new points
- Blue area = testing Group 1
- Maroon area = testing Group 2
- Cyan circles = support vectors
- Yellow line = actual boundary
All results shown for 8 refinements and five random seeding points on each side of the function. An eighth-order polynomial kernel was used; all training points were kept.
Training/Testing Results
- Parabola: 0.68% error
- Line: 0.54% error
- Antoine: 0.94% error
- Circle: 0.13% error
- Circle/Ellipse: 0.60% error
Training/Testing Results
- Error in the Antoine equation case was due to having no test points at the boundaries.
- Fix: create one point at each corner of the domain [pre-seeding].
- Antoine: 0.94% error with no pre-seeding; 0.26% error with pre-seeded boundary points only.
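The pre-seeding fix can be sketched as follows; the rectangular domain bounds and the labeling oracle are hypothetical stand-ins for the study's setup:

```python
from itertools import product

def corner_seeds(xlim, ylim, below_boundary):
    # Place one labeled point at each corner of the rectangular domain so
    # the SVM sees the boundary's behavior at the edges of the domain.
    seeds = []
    for x, y in product(xlim, ylim):
        label = 1 if below_boundary((x, y)) else -1
        seeds.append(((x, y), label))
    return seeds
```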
Training/Testing Results with Noise
- Slack variables are automatically included based on the methodology shown earlier.
- More support vectors than in the no-noise case, due to the higher difficulty of fitting the segregation function.
- Antoine: 0.26% error with zero noise; 0.70% error with 5 units of uniform random noise prescribed in each input variable.
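The noise prescription (a uniform random perturbation of each input variable, with amplitude 5 as on the slide) might be sketched as; the function name and seeding are illustrative:

```python
import random

def add_noise(points, amplitude=5.0, seed=0):
    # Perturb every coordinate of every point by uniform noise in
    # [-amplitude, +amplitude]; a fixed seed keeps runs reproducible.
    rng = random.Random(seed)
    return [tuple(c + rng.uniform(-amplitude, amplitude) for c in p)
            for p in points]
```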
Conclusions
- SVMs are extremely versatile in allowing for quantifiable decision-making strategies.
- The capability of support vector machines was successfully demonstrated via five examples.
- Care must be taken in selecting the parameters and training points: a poor choice of training points can lead to an improper bounding function and ultimately higher error.
- Delaunay triangulation is a new method to acquire more desirable training points than random sampling of the domain.
- The modified kernel function constants were based on optimization versatility and general convergence.
- Noise can be included, and the SVM is still capable of creating a reasonable segregation function.