Special Topics: Data Science
1 Special Topics: Data Science. Linear Methods for Prediction.
Dr. Vidhyasaharan Sethu, School of Electrical Engineering & Telecommunications, University of New South Wales, Sydney, Australia. V. Sethu 1
2 Topics
1. Linear Regression
2. Regularisation
3. Bayesian View of Linear Regression
4. Classification Systems
5. Discriminant Functions
6. Logistic Regression
References:
Friedman, J., Hastie, T., & Tibshirani, R. (2001). The Elements of Statistical Learning. New York, NY: Springer Series in Statistics.
Duda, R. O., Hart, P. E., & Stork, D. G. (2012). Pattern Classification. John Wiley & Sons.
Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer.
3 Linear Regression
Input data $\mathbf{x} = (x_1, \dots, x_p)^\top$ is mapped by the machine $h_\theta$ to a prediction $\hat{y} \in \mathbb{R}$ about some quantity of interest:
$\hat{y} = \beta_0 + \sum_{i=1}^{p} \beta_i x_i$
Parameters of the model: $\boldsymbol{\beta} = (\beta_0, \beta_1, \dots, \beta_p)^\top$
Residual sum of squares: $\mathrm{RSS}(\boldsymbol{\beta}) = \sum_j (y_j - \hat{y}_j)^2$
The least squares estimate is the $\boldsymbol{\beta}$ that corresponds to the minimum RSS. (Friedman et al., 2001)
4 Least Squares Linear Regression Model
Given a dataset $D = \{(\mathbf{x}_j, y_j)\}_{j=1}^{N}$, the residual sum of squares is
$\mathrm{RSS}(\boldsymbol{\beta}) = \sum_{j=1}^{N} (y_j - \hat{y}_j)^2 = \sum_{j=1}^{N} \left( y_j - \beta_0 - \sum_{i=1}^{p} x_{ji}\beta_i \right)^2$
In matrix notation: $\mathrm{RSS}(\boldsymbol{\beta}) = (\mathbf{y} - X\boldsymbol{\beta})^\top (\mathbf{y} - X\boldsymbol{\beta})$
where $\boldsymbol{\beta} = (\beta_0, \beta_1, \dots, \beta_p)^\top$, $\mathbf{y} = (y_1, \dots, y_N)^\top$, and $X$ is the $N \times (p+1)$ matrix whose $j$-th row is $(1, x_{j1}, \dots, x_{jp})$.
5 Least Squares Linear Regression Model
For the least squares solution, setting $\frac{\partial \mathrm{RSS}}{\partial \boldsymbol{\beta}} = -2X^\top(\mathbf{y} - X\boldsymbol{\beta}) = \mathbf{0}$ gives
$\hat{\boldsymbol{\beta}} = (X^\top X)^{-1} X^\top \mathbf{y}$
Noting that $\frac{\partial^2 \mathrm{RSS}}{\partial \boldsymbol{\beta}\,\partial \boldsymbol{\beta}^\top} = 2X^\top X$ is positive definite if $X$ is of full rank, this stationary point is a minimum.
Hat matrix: $\hat{\mathbf{y}} = X\hat{\boldsymbol{\beta}} = X(X^\top X)^{-1}X^\top \mathbf{y} = H\mathbf{y}$; geometrically, $H$ projects $\mathbf{y}$ onto the space spanned by the columns of $X$.
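The closed-form least squares solution and the hat matrix can be sketched in NumPy on synthetic data (the true coefficients and noise level below are illustrative choices, not from the lecture):

```python
import numpy as np

# Synthetic regression data: N points, p features, plus an intercept column.
rng = np.random.default_rng(0)
N, p = 200, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, p))])  # first column of 1s for beta_0
beta_true = np.array([1.0, 2.0, -0.5, 0.3])                 # made-up ground truth
y = X @ beta_true + 0.01 * rng.normal(size=N)

# Least squares estimate beta_hat = (X^T X)^{-1} X^T y.
# lstsq is numerically safer than explicitly inverting X^T X.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Hat matrix H = X (X^T X)^{-1} X^T maps y onto the fitted values y_hat = H y.
H = X @ np.linalg.inv(X.T @ X) @ X.T
y_fit = H @ y
```

Since $H$ is the projection onto the column space of $X$, `H @ y` and `X @ beta_hat` agree up to numerical precision.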
6 Sequential (On-line) Learning
The least squares solution $\hat{\boldsymbol{\beta}} = (X^\top X)^{-1} X^\top \mathbf{y}$ may be hard to compute in a big data setting:
- Very large $N$: computationally expensive.
- The data may not be available all at once (e.g., real-time applications where data is continuously streaming).
Consider $\mathrm{RSS}(\boldsymbol{\beta}) = \sum_j E_j$, where $E_j = \left( y_j - \beta_0 - \sum_{i=1}^{p} x_{ji}\beta_i \right)^2 = (y_j - \boldsymbol{\beta}^\top \mathbf{x}_j)^2$.
Iterative estimation of $\boldsymbol{\beta}$ can then be carried out one data point at a time.
Stochastic gradient descent: $\boldsymbol{\beta}^{(\tau+1)} = \boldsymbol{\beta}^{(\tau)} - \eta \frac{\partial E_j}{\partial \boldsymbol{\beta}}$
For the least squares case this is the LMS (Least Mean Squares) algorithm: $\boldsymbol{\beta}^{(\tau+1)} = \boldsymbol{\beta}^{(\tau)} + \eta \left( y_j - \boldsymbol{\beta}^{(\tau)\top} \mathbf{x}_j \right) \mathbf{x}_j$ (the factor of 2 from the gradient is absorbed into the step size $\eta$).
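A minimal sketch of the LMS update, sweeping over synthetic data one point at a time (step size, epoch count, and the data itself are illustrative assumptions):

```python
import numpy as np

# Synthetic data for a linear model with intercept.
rng = np.random.default_rng(1)
N, p = 500, 2
X = np.column_stack([np.ones(N), rng.normal(size=(N, p))])
beta_true = np.array([0.5, 1.5, -1.0])                     # made-up ground truth
y = X @ beta_true + 0.01 * rng.normal(size=N)

# LMS: beta <- beta + eta * (y_j - beta^T x_j) * x_j, one point per update.
beta = np.zeros(p + 1)
eta = 0.01
for epoch in range(50):
    for j in range(N):
        beta = beta + eta * (y[j] - beta @ X[j]) * X[j]
```

With a small fixed step size the iterate fluctuates in a small neighbourhood of the least squares solution rather than converging exactly; shrinking $\eta$ over time removes this residual jitter.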
7 Note on Gradient Descent
The negative of the gradient points in the direction of steepest descent on the surface of the cost function. Stochastic gradient descent uses noisy gradient estimates computed from single data points (or small batches of data). Noisy gradients may be beneficial when negotiating complex cost surfaces.
8 Regularisation
Potential issues with the least squares estimate $\hat{\boldsymbol{\beta}} = (X^\top X)^{-1} X^\top \mathbf{y}$, $\hat{\mathbf{y}} = X\hat{\boldsymbol{\beta}}$:
- Low bias but may have high variance (prediction accuracy may suffer).
- It may be desirable to determine a subset of features that exhibit the strongest effects.
Ridge regression:
$\hat{\boldsymbol{\beta}}^{\mathrm{ridge}} = \arg\min_{\boldsymbol{\beta}} \sum_{j=1}^{N} \left( y_j - \beta_0 - \sum_{i=1}^{p} x_{ji}\beta_i \right)^2 + \lambda \sum_{i=1}^{p} \beta_i^2$
The second term is the regularisation term ($L_2$ regularisation). Note: $\beta_0$ is not included in the penalty.
Equivalent constrained form:
$\hat{\boldsymbol{\beta}}^{\mathrm{ridge}} = \arg\min_{\boldsymbol{\beta}} \sum_{j=1}^{N} \left( y_j - \beta_0 - \sum_{i=1}^{p} x_{ji}\beta_i \right)^2$ subject to $\sum_{i=1}^{p} \beta_i^2 \le t$
There is a one-to-one correspondence between $\lambda$ and $t$.
9 Regularisation
$\hat{\boldsymbol{\beta}}^{\mathrm{ridge}} = \arg\min_{\boldsymbol{\beta}} \sum_{j=1}^{N} \left( y_j - \beta_0 - \sum_{i=1}^{p} x_{ji}\beta_i \right)^2 + \lambda \sum_{i=1}^{p} \beta_i^2$
- Solutions are not equivariant under scaling of the inputs.
- Inclusion of $\beta_0$ in the penalty would make the solution depend on the origin chosen for $y$; i.e., adding a constant $c$ to all $y_j$ would not simply result in the predictions being offset by the same amount $c$.
The solution can be separated into two parts after centring the inputs, $x_{ji} \leftarrow x_{ji} - \bar{x}_i$:
- Estimate $\beta_0 = \bar{y} = \frac{1}{N}\sum_j y_j$.
- Estimate $\beta_1, \dots, \beta_p$ by ridge regression without an intercept:
$\hat{\boldsymbol{\beta}}^{\mathrm{ridge}} = \arg\min_{\boldsymbol{\beta}} (\mathbf{y} - X\boldsymbol{\beta})^\top (\mathbf{y} - X\boldsymbol{\beta}) + \lambda \boldsymbol{\beta}^\top \boldsymbol{\beta}$
where now $\boldsymbol{\beta} = (\beta_1, \dots, \beta_p)^\top$ and $X$ is the $N \times p$ matrix of centred inputs.
10 Ridge Regression
$\hat{\boldsymbol{\beta}}^{\mathrm{ridge}} = \arg\min_{\boldsymbol{\beta}} (\mathbf{y} - X\boldsymbol{\beta})^\top (\mathbf{y} - X\boldsymbol{\beta}) + \lambda \boldsymbol{\beta}^\top \boldsymbol{\beta}$
$\hat{\boldsymbol{\beta}}^{\mathrm{ridge}} = (X^\top X + \lambda I)^{-1} X^\top \mathbf{y}$
Adding $\lambda I$ makes the problem non-singular even if $X^\top X$ is not of full rank.
Comparing least squares ($\hat{\boldsymbol{\beta}}^{\mathrm{ls}}$) and ridge regression ($\hat{\boldsymbol{\beta}}^{\mathrm{ridge}}$) using the singular value decomposition $X = UDV^\top$:
$X\hat{\boldsymbol{\beta}}^{\mathrm{ls}} = X(X^\top X)^{-1}X^\top \mathbf{y} = UU^\top \mathbf{y}$
$X\hat{\boldsymbol{\beta}}^{\mathrm{ridge}} = X(X^\top X + \lambda I)^{-1}X^\top \mathbf{y} = UD(D^2 + \lambda I)^{-1}DU^\top \mathbf{y} = \sum_{i=1}^{p} \mathbf{u}_i \frac{d_i^2}{d_i^2 + \lambda} \mathbf{u}_i^\top \mathbf{y}$
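The closed-form ridge estimate on centred data can be sketched as follows ($\lambda$ and the data are illustrative assumptions):

```python
import numpy as np

# Synthetic centred data (ridge is applied after centring, with no intercept).
rng = np.random.default_rng(2)
N, p = 100, 4
X = rng.normal(size=(N, p))
X = X - X.mean(axis=0)                    # centre the inputs
beta_true = np.array([2.0, 0.0, -1.0, 0.5])
y = X @ beta_true + 0.1 * rng.normal(size=N)
y = y - y.mean()                          # centring: beta_0 = y-bar, here 0

# Ridge: beta_ridge = (X^T X + lambda I)^{-1} X^T y, vs. ordinary least squares.
lam = 5.0
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
beta_ls = np.linalg.solve(X.T @ X, X.T @ y)
```

Via the SVD $X = UDV^\top$, the ridge fit equals $\sum_i \mathbf{u}_i \frac{d_i^2}{d_i^2+\lambda} \mathbf{u}_i^\top \mathbf{y}$, and every component is scaled by a factor less than one, so the ridge coefficient vector is strictly shorter than the least squares one.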
11 Ridge Regression
$X\hat{\boldsymbol{\beta}}^{\mathrm{ridge}} = \sum_{i=1}^{p} \mathbf{u}_i \frac{d_i^2}{d_i^2 + \lambda} \mathbf{u}_i^\top \mathbf{y}$
Ridge shrinks the directions of least variance the most.
Note: $X^\top X = VD^2V^\top$ (from the singular value decomposition $X = UDV^\top$), and the sample covariance is $S = \frac{1}{N} X^\top X$. Therefore the $\mathbf{v}_i$ (columns of $V$) are its eigenvectors. Also, $X\mathbf{v}_i = \mathbf{u}_i d_i$ (from $X = UDV^\top$). (Friedman et al., 2001)
12 Tuning the Hyperparameter $\lambda$
Effective degrees of freedom: $\mathrm{df}(\lambda) = \sum_{i=1}^{p} \frac{d_i^2}{d_i^2 + \lambda}$ (Friedman et al., 2001)
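The effective degrees of freedom follow directly from the singular values of $X$; a small sketch on made-up data:

```python
import numpy as np

# df(lambda) = sum_i d_i^2 / (d_i^2 + lambda), from the singular values of X.
rng = np.random.default_rng(3)
X = rng.normal(size=(50, 5))
X = X - X.mean(axis=0)
d = np.linalg.svd(X, compute_uv=False)    # singular values only

def df(lam):
    return np.sum(d**2 / (d**2 + lam))

# df(0) = p (ordinary least squares); df(lambda) -> 0 as lambda -> infinity,
# so df gives an interpretable scale on which to tune lambda.
```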
13 $L_1$ Regularisation (Lasso)
$\hat{\boldsymbol{\beta}}^{\mathrm{lasso}} = \arg\min_{\boldsymbol{\beta}} \sum_{j=1}^{N} \left( y_j - \beta_0 - \sum_{i=1}^{p} x_{ji}\beta_i \right)^2 + \lambda \sum_{i=1}^{p} |\beta_i|$
The second term is the $L_1$ regularisation term.
Equivalent constrained form: $\hat{\boldsymbol{\beta}}^{\mathrm{lasso}} = \arg\min_{\boldsymbol{\beta}} \sum_{j=1}^{N} \left( y_j - \beta_0 - \sum_{i=1}^{p} x_{ji}\beta_i \right)^2$ subject to $\sum_{i=1}^{p} |\beta_i| \le t$
- Encourages sparsity: some coefficients will be exactly zero.
- Makes the solution non-linear in $\mathbf{y}$: no closed-form solution.
- It is a quadratic programming problem, but efficient algorithms exist.
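One of the simple efficient algorithms is coordinate descent with soft-thresholding. A minimal sketch, assuming centred data and the objective $\frac{1}{2}\lVert \mathbf{y} - X\boldsymbol{\beta}\rVert^2 + \lambda \lVert\boldsymbol{\beta}\rVert_1$ (note the $\frac{1}{2}$, so $\lambda$ here is scaled differently from the slide's objective); the data and $\lambda$ are illustrative:

```python
import numpy as np

def soft_threshold(z, t):
    # S(z, t) = sign(z) * max(|z| - t, 0): the closed-form 1-D lasso solution.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    # Cyclic coordinate descent: optimise one beta_k at a time, others fixed.
    N, p = X.shape
    beta = np.zeros(p)
    col_sq = (X**2).sum(axis=0)
    for _ in range(n_iter):
        for k in range(p):
            r_k = y - X @ beta + X[:, k] * beta[k]   # partial residual excluding k
            beta[k] = soft_threshold(X[:, k] @ r_k, lam) / col_sq[k]
    return beta

# Synthetic sparse problem: only coefficients 0 and 3 are truly nonzero.
rng = np.random.default_rng(4)
X = rng.normal(size=(100, 5))
X = X - X.mean(axis=0)
beta_true = np.array([3.0, 0.0, 0.0, -2.0, 0.0])
y = X @ beta_true + 0.1 * rng.normal(size=100)
y = y - y.mean()
beta_lasso = lasso_cd(X, y, lam=50.0)
```

With a sufficiently large $\lambda$, the irrelevant coefficients come out exactly zero while the active ones are shrunk towards zero, which is the feature selection behaviour the slide describes.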
14 Feature Selection with Lasso
Figure: lasso coefficient profiles plotted against the shrinkage factor $s = t / \sum_{i=1}^{p} |\hat{\beta}_i|$ (Friedman et al., 2001).
15 $L_1$ and $L_2$ Regularisation (Sparsity)
Figure: contours of the residual sum of squares together with the constraint regions $|\beta_1| + |\beta_2| \le t$ (lasso) and $\beta_1^2 + \beta_2^2 \le t^2$ (ridge); the corners of the $L_1$ region make solutions with some $\beta_i = 0$ likely (Friedman et al., 2001).
16 Bayesian View
Assuming the error model $y = \hat{y} + \varepsilon = \boldsymbol{\beta}^\top \mathbf{x} + \varepsilon$, where $\varepsilon \sim \mathcal{N}(0, \sigma^2)$, we can write
$P(y \mid \mathbf{x}, \boldsymbol{\beta}, \sigma^2) = \mathcal{N}(y \mid \boldsymbol{\beta}^\top \mathbf{x}, \sigma^2)$
In the Bayesian framework the parameters $\boldsymbol{\beta}$ are treated as random variables instead of fixed but unknown parameters.
In terms of the dataset $D = \{(\mathbf{x}_j, y_j)\}_{j=1}^{N}$, the likelihood function is
$\ell(\boldsymbol{\beta}) = P(\mathbf{y} \mid X, \boldsymbol{\beta}, \sigma^2) = \prod_{j=1}^{N} \mathcal{N}(y_j \mid \boldsymbol{\beta}^\top \mathbf{x}_j, \sigma^2)$
Taking the logarithm,
$\ln P(\mathbf{y} \mid X, \boldsymbol{\beta}, \sigma^2) = -\frac{N}{2}\ln \sigma^2 - \frac{N}{2}\ln 2\pi - \frac{1}{2\sigma^2}\sum_{j=1}^{N} \left( y_j - \boldsymbol{\beta}^\top \mathbf{x}_j \right)^2$
The maximum likelihood (ML) solution is therefore equivalent to the least squares solution.
17 Bayesian View (of $L_2$ Regularisation)
Bayes' theorem states (Posterior = Likelihood $\times$ Prior / Evidence):
$P(\boldsymbol{\beta} \mid X, \mathbf{y}, \sigma^2) = \frac{P(\mathbf{y} \mid \boldsymbol{\beta}, X, \sigma^2)\, P(\boldsymbol{\beta})}{P(\mathbf{y} \mid X, \sigma^2)}$
If we assume a Gaussian prior for the parameters $\boldsymbol{\beta}$, conditional on some hyperparameter $\alpha$,
$P(\boldsymbol{\beta}) = P(\boldsymbol{\beta} \mid \alpha) = \mathcal{N}(\boldsymbol{\beta} \mid \mathbf{0}, \alpha^{-1} I)$
we obtain $P(\boldsymbol{\beta} \mid X, \mathbf{y}, \sigma^2) = \mathcal{N}(\boldsymbol{\beta} \mid \mathbf{m}, S)$, where
$S^{-1} = \alpha I + \frac{1}{\sigma^2} X^\top X$ and $\mathbf{m} = \frac{1}{\sigma^2} S X^\top \mathbf{y}$
Consequently the log posterior is given by
$\ln P(\boldsymbol{\beta} \mid X, \mathbf{y}, \sigma^2) = -\frac{1}{2\sigma^2} \sum_{j=1}^{N} \left( y_j - \boldsymbol{\beta}^\top \mathbf{x}_j \right)^2 - \frac{\alpha}{2} \boldsymbol{\beta}^\top \boldsymbol{\beta} + \text{const}$
The maximum a posteriori (MAP) estimate is therefore equivalent to ridge regression.
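The posterior mean and covariance can be computed directly from these formulas; a sketch with assumed-known $\alpha$ and $\sigma$ (all values illustrative):

```python
import numpy as np

# Synthetic data for the error model y = beta^T x + eps, eps ~ N(0, sigma^2).
rng = np.random.default_rng(5)
N, p = 200, 3
X = rng.normal(size=(N, p))
beta_true = np.array([1.0, -2.0, 0.5])
sigma = 0.1
y = X @ beta_true + sigma * rng.normal(size=N)

# Posterior N(beta | m, S): S^{-1} = alpha I + X^T X / sigma^2, m = S X^T y / sigma^2.
alpha = 2.0
S_inv = alpha * np.eye(p) + (X.T @ X) / sigma**2
S = np.linalg.inv(S_inv)
m = S @ (X.T @ y) / sigma**2

# The MAP estimate (= posterior mean here) matches ridge with lambda = alpha * sigma^2.
ridge = np.linalg.solve(X.T @ X + alpha * sigma**2 * np.eye(p), X.T @ y)
```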
18 Visualising Bayesian Learning
Figure: sequential Bayesian learning of the parameters of a linear model, showing how the posterior over $\boldsymbol{\beta}$ sharpens as successive data points are observed (Bishop, 2006).
19 Classification
Input data $\mathbf{x} = (x_1, \dots, x_p)^\top \in \mathbb{R}^p$ is mapped by the machine $h_\theta$ to a prediction $y \in \{\omega_1, \dots, \omega_c\}$ about some quantity of interest (the class).
The Bayesian decision theory picture:
$P(\omega_j \mid \mathbf{x}) = \frac{P(\mathbf{x} \mid \omega_j)\, P(\omega_j)}{P(\mathbf{x})}$
Defining a loss function $\xi$ such that $\xi(\alpha_i \mid \omega_j)$ describes the loss incurred by estimating the class as $\omega_i$ when the true class was $\omega_j$ (note: here $\alpha_i$ is used to denote $y = \omega_i$) gives us the expected loss associated with $\alpha_i$ as
$R(\alpha_i \mid \mathbf{x}) = \sum_{j=1}^{c} \xi(\alpha_i \mid \omega_j)\, P(\omega_j \mid \mathbf{x})$
20 Bayesian Decision Theory
For any general decision rule $\alpha(\mathbf{x})$, where $\alpha(\mathbf{x})$ assumes one of the values $\alpha_1, \dots, \alpha_c$, the overall risk is given by
$R = \int R(\alpha(\mathbf{x}) \mid \mathbf{x})\, P(\mathbf{x})\, d\mathbf{x}$
If $\alpha(\mathbf{x})$ is chosen such that $R(\alpha(\mathbf{x}) \mid \mathbf{x})$ is as small as possible for every $\mathbf{x}$, then the overall risk $R$ will be minimised.
Bayes decision rule: $\alpha^*(\mathbf{x}) = \arg\min_{\alpha_i} R(\alpha_i \mid \mathbf{x})$
This leads to the minimum overall risk, called the Bayes risk and denoted $R^*$.
21 Minimum Error Rate Classification
In many classification tasks, only the number of errors/mistakes (the error rate) is of interest. This leads to the so-called symmetric or zero-one loss function:
$\xi(\alpha_i \mid \omega_j) = \begin{cases} 0 & i = j \\ 1 & i \ne j \end{cases}$
The corresponding conditional risk is then
$R(\alpha_i \mid \mathbf{x}) = \sum_{j \ne i} P(\omega_j \mid \mathbf{x}) = 1 - P(\omega_i \mid \mathbf{x})$
Therefore, for minimum error rate:
$h_\theta(\mathbf{x}) = \omega_i \text{ if } P(\omega_i \mid \mathbf{x}) > P(\omega_j \mid \mathbf{x}) \;\forall j \ne i$
22 Discriminant Functions
There are many ways to represent classifiers. One of the most useful is in terms of a set of discriminant functions $\{g_i(\mathbf{x}) : i = 1, \dots, c\}$ such that
$h_\theta(\mathbf{x}) = \omega_i \text{ if } g_i(\mathbf{x}) > g_j(\mathbf{x}) \;\forall j \ne i$
You can replace every $g_i(\mathbf{x})$ by $f(g_i(\mathbf{x}))$, where $f$ is monotonically increasing, and the classification is unchanged. (Duda et al., 2012)
Bayes classifier: $g_i(\mathbf{x}) = -R(\alpha_i \mid \mathbf{x})$
Minimum error rate: $g_i(\mathbf{x}) = P(\omega_i \mid \mathbf{x}) = \frac{P(\mathbf{x} \mid \omega_i)\, P(\omega_i)}{\sum_{k=1}^{c} P(\mathbf{x} \mid \omega_k)\, P(\omega_k)}$
Equivalent choices: $g_i(\mathbf{x}) = P(\mathbf{x} \mid \omega_i)\, P(\omega_i)$ or $g_i(\mathbf{x}) = \ln P(\mathbf{x} \mid \omega_i) + \ln P(\omega_i)$
23 Decision Boundaries/Surfaces
Every decision rule divides the feature space into $c$ decision regions, separated by decision boundaries. The decision regions need not be simply connected. (Duda et al., 2012)
24 Gaussian Data Distributions: Linear Discriminants
Minimum error rate classification can be achieved by the discriminant functions $g_i(\mathbf{x}) = \ln P(\mathbf{x} \mid \omega_i) + \ln P(\omega_i)$.
In the case of a multivariate normal data distribution within each class, $P(\mathbf{x} \mid \omega_i) = \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_i, \Sigma_i)$, the discriminant functions can be readily evaluated as
$g_i(\mathbf{x}) = -\tfrac{1}{2}(\mathbf{x} - \boldsymbol{\mu}_i)^\top \Sigma_i^{-1}(\mathbf{x} - \boldsymbol{\mu}_i) - \tfrac{d}{2}\ln 2\pi - \tfrac{1}{2}\ln|\Sigma_i| + \ln P(\omega_i)$
The first term is the squared Mahalanobis distance; the $\tfrac{d}{2}\ln 2\pi$ term is independent of the data and class distributions and can be dropped.
Consider the case where the data distributions in all classes are normal with identical covariance matrices, i.e. $\Sigma_i = \Sigma \;\forall i$. This corresponds to a situation where data from all classes fall into hyperellipsoidal clusters of the same shape and size but different locations in the feature space:
$g_i(\mathbf{x}) = -\tfrac{1}{2}(\mathbf{x} - \boldsymbol{\mu}_i)^\top \Sigma^{-1}(\mathbf{x} - \boldsymbol{\mu}_i) + \ln P(\omega_i)$
(the $\ln|\Sigma|$ term is dropped since it is identical for all classes). Expanding the Mahalanobis distance and dropping the quadratic term $\mathbf{x}^\top \Sigma^{-1} \mathbf{x}$, which is identical for all classes, makes the discriminant functions linear:
$g_i(\mathbf{x}) = \boldsymbol{\beta}_i^\top \mathbf{x} + \beta_{i0}$, where $\boldsymbol{\beta}_i = \Sigma^{-1}\boldsymbol{\mu}_i$ and $\beta_{i0} = -\tfrac{1}{2}\boldsymbol{\mu}_i^\top \Sigma^{-1} \boldsymbol{\mu}_i + \ln P(\omega_i)$
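These linear discriminants can be evaluated directly once the class means, shared covariance, and priors are known; a sketch with made-up two-class parameters:

```python
import numpy as np

# Two Gaussian classes with a shared covariance (all parameters illustrative).
mu = np.array([[0.0, 0.0], [2.0, 2.0]])         # row i = class mean mu_i
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])      # shared covariance
priors = np.array([0.5, 0.5])

# beta_i = Sigma^{-1} mu_i ; beta_i0 = -1/2 mu_i^T Sigma^{-1} mu_i + ln P(omega_i)
Sigma_inv = np.linalg.inv(Sigma)
betas = mu @ Sigma_inv                          # row i = beta_i^T (Sigma_inv symmetric)
beta0 = -0.5 * np.einsum('ij,jk,ik->i', mu, Sigma_inv, mu) + np.log(priors)

def classify(x):
    # Pick the class with the largest linear discriminant g_i(x) = beta_i^T x + beta_i0.
    return int(np.argmax(betas @ x + beta0))
```

Points near each mean are assigned to that class; with equal priors the boundary is the hyperplane halfway between the means described on the next slide.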
25 Gaussian Data Distributions: Decision Surfaces
The decision surface between decision regions $R_i$ and $R_j$ is given by $g_i(\mathbf{x}) = g_j(\mathbf{x})$. In the case of the linear discriminant functions arising from normally distributed data with shared covariance, they are hyperplanes given by
$(\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)^\top \Sigma^{-1} (\mathbf{x} - \mathbf{x}_0) = 0$
where
$\mathbf{x}_0 = \tfrac{1}{2}(\boldsymbol{\mu}_i + \boldsymbol{\mu}_j) - \frac{\ln\left[ P(\omega_i)/P(\omega_j) \right]}{(\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)^\top \Sigma^{-1} (\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)} (\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)$
The hyperplane passes through $\mathbf{x}_0$. (Duda et al., 2012)
26 Arbitrary Gaussian Data Distributions
When the data in every class are normally distributed but with arbitrary covariance matrices, the decision regions need not be simply connected (even for one-dimensional data). The decision surfaces can be hyperplanes, pairs of hyperplanes, hyperspheres, hyperellipsoids, hyperparaboloids, or hyperhyperboloids (broadly referred to as hyperquadrics). (Duda et al., 2012)
27 Quantifying Error
In a 2-class problem, the classifier separates the feature space into two decision regions, $R_1$ and $R_2$. Errors can arise in two ways: points from $\omega_1$ falling within $R_2$, and vice versa.
$P(\text{error}) = P(\mathbf{x} \in R_2, \omega_1) + P(\mathbf{x} \in R_1, \omega_2)$
$= P(\mathbf{x} \in R_2 \mid \omega_1)\, P(\omega_1) + P(\mathbf{x} \in R_1 \mid \omega_2)\, P(\omega_2)$
$= \int_{R_2} P(\mathbf{x} \mid \omega_1)\, P(\omega_1)\, d\mathbf{x} + \int_{R_1} P(\mathbf{x} \mid \omega_2)\, P(\omega_2)\, d\mathbf{x}$
(Duda et al., 2012)
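The two error integrals can be checked numerically for a simple 1-D case: two equal-variance Gaussians with equal priors, where the minimum-error boundary sits halfway between the means (all numbers below are illustrative):

```python
import numpy as np

# Two 1-D Gaussian classes, equal priors; boundary x0 halfway between the means.
mu1, mu2, sigma = 0.0, 2.0, 1.0
P1 = P2 = 0.5
x0 = 0.5 * (mu1 + mu2)                 # R1 = (-inf, x0), R2 = (x0, inf)

def gauss(x, mu, s):
    return np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2 * np.pi))

# P(error) = int_{R2} p(x|w1) P(w1) dx + int_{R1} p(x|w2) P(w2) dx,
# approximated by a fine Riemann sum over a wide interval.
xs = np.linspace(-10.0, 12.0, 200001)
dx = xs[1] - xs[0]
p_err = (np.sum(gauss(xs[xs > x0], mu1, sigma)) * P1
         + np.sum(gauss(xs[xs < x0], mu2, sigma)) * P2) * dx
```

For this configuration the exact answer is $\Phi(-d'/2)$ with $d' = |\mu_2 - \mu_1|/\sigma = 2$, i.e. about 0.1587, which ties in with the discriminability measure on the next slide.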
28 Receiver Operating Characteristic (ROC) Curve
Each point on the curve corresponds to a different operating point (decision threshold) of the classifier.
Discriminability $d'$ is some measure of distance between the class distributions, e.g. $d' = \frac{|\mu_2 - \mu_1|}{\sigma}$ for the 1-D case with equal variance. (Duda et al., 2012)
29 Logistic Regression
Arises from the desire to:
1. Model the posterior probabilities of the classes as linear functions (and treat it as a regression problem).
2. Ensure that the sum of the class posteriors is one.
The model has the form (with $\omega_c$ as the reference class):
$\ln \frac{P(\omega_1 \mid \mathbf{x})}{P(\omega_c \mid \mathbf{x})} = \beta_{10} + \boldsymbol{\beta}_1^\top \mathbf{x}$
$\ln \frac{P(\omega_2 \mid \mathbf{x})}{P(\omega_c \mid \mathbf{x})} = \beta_{20} + \boldsymbol{\beta}_2^\top \mathbf{x}$
$\;\vdots$
$\ln \frac{P(\omega_{c-1} \mid \mathbf{x})}{P(\omega_c \mid \mathbf{x})} = \beta_{(c-1)0} + \boldsymbol{\beta}_{c-1}^\top \mathbf{x}$
Equivalently,
$P(\omega_j \mid \mathbf{x}) = \frac{e^{\beta_{j0} + \boldsymbol{\beta}_j^\top \mathbf{x}}}{1 + \sum_{i=1}^{c-1} e^{\beta_{i0} + \boldsymbol{\beta}_i^\top \mathbf{x}}}, \quad j = 1, \dots, c-1$
$P(\omega_c \mid \mathbf{x}) = \frac{1}{1 + \sum_{i=1}^{c-1} e^{\beta_{i0} + \boldsymbol{\beta}_i^\top \mathbf{x}}}$
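The posterior formulas can be evaluated directly; a small sketch for $c = 3$ classes with made-up parameters, showing that the posteriors are positive and sum to one by construction:

```python
import numpy as np

def posteriors(x, B0, B):
    """Class posteriors of the logistic model with reference class c.

    B0: (c-1,) intercepts beta_j0; B: (c-1, d) weight rows beta_j^T.
    Returns all c posteriors, the reference class last.
    """
    a = np.exp(B0 + B @ x)             # unnormalised odds vs. the reference class
    denom = 1.0 + a.sum()
    return np.append(a / denom, 1.0 / denom)

# Illustrative parameters for c = 3 classes in d = 2 dimensions.
B0 = np.array([0.5, -1.0])
B = np.array([[1.0, 0.0],
              [0.0, 1.0]])
p = posteriors(np.array([0.2, -0.3]), B0, B)
```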
30 2-Class Logistic Regression
Given data $D = \{(\mathbf{x}_j, y_j)\}_{j=1}^{N}$, where
$y_j = \begin{cases} 1 & \text{class is } \omega_1 \\ 0 & \text{class is } \omega_2 \end{cases}$
the log-likelihood of the logistic regression model is
$\ell(\boldsymbol{\beta}) = \ln P(D \mid \boldsymbol{\beta}) = \sum_{j=1}^{N} \left[ y_j \ln P(\omega_1 \mid \mathbf{x}_j, \boldsymbol{\beta}) + (1 - y_j) \ln\left( 1 - P(\omega_1 \mid \mathbf{x}_j, \boldsymbol{\beta}) \right) \right]$
This reduces to
$\ell(\boldsymbol{\beta}) = \sum_{j=1}^{N} \left[ y_j \boldsymbol{\beta}^\top \mathbf{x}_j - \ln\left( 1 + e^{\boldsymbol{\beta}^\top \mathbf{x}_j} \right) \right]$
To obtain the maximum likelihood solution we can set the derivatives to zero:
$\frac{\partial \ell}{\partial \boldsymbol{\beta}} = \sum_{j=1}^{N} \mathbf{x}_j \left( y_j - P(\omega_1 \mid \mathbf{x}_j, \boldsymbol{\beta}) \right) = \mathbf{0}$
These are nonlinear equations, so an iterative solution is used (the Newton-Raphson algorithm).
31 2-Class Logistic Regression
Newton update:
$\boldsymbol{\beta}^{(\tau+1)} = \boldsymbol{\beta}^{(\tau)} - \left( \frac{\partial^2 \ell}{\partial \boldsymbol{\beta}\, \partial \boldsymbol{\beta}^\top} \right)^{-1} \frac{\partial \ell}{\partial \boldsymbol{\beta}}$
In matrix notation,
$\frac{\partial \ell}{\partial \boldsymbol{\beta}} = X^\top (\mathbf{y} - \mathbf{p})$, and the Hessian is $\frac{\partial^2 \ell}{\partial \boldsymbol{\beta}\, \partial \boldsymbol{\beta}^\top} = -X^\top W X$
where $W = \mathrm{diag}(w_1, \dots, w_N)$ with $w_j = P(\omega_1 \mid \mathbf{x}_j, \boldsymbol{\beta}^{(\tau)}) \left( 1 - P(\omega_1 \mid \mathbf{x}_j, \boldsymbol{\beta}^{(\tau)}) \right)$, $X$ is the matrix with rows $\mathbf{x}_j^\top$, $\mathbf{y} = (y_1, \dots, y_N)^\top$, and $\mathbf{p} = \left( P(\omega_1 \mid \mathbf{x}_1, \boldsymbol{\beta}^{(\tau)}), \dots, P(\omega_1 \mid \mathbf{x}_N, \boldsymbol{\beta}^{(\tau)}) \right)^\top$.
Newton update for logistic regression:
$\boldsymbol{\beta}^{(\tau+1)} = \boldsymbol{\beta}^{(\tau)} + (X^\top W X)^{-1} X^\top (\mathbf{y} - \mathbf{p})$
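The Newton update above (iteratively reweighted least squares) can be sketched on synthetic 2-class data; the true parameters, sample size, and iteration count are illustrative choices:

```python
import numpy as np

# Synthetic labelled data: x augmented with a leading 1 for the intercept.
rng = np.random.default_rng(6)
N = 400
X = np.column_stack([np.ones(N), rng.normal(size=N)])
beta_true = np.array([-1.0, 2.0])                     # made-up ground truth
prob = 1.0 / (1.0 + np.exp(-X @ beta_true))
y = (rng.random(N) < prob).astype(float)

# Newton-Raphson: beta <- beta + (X^T W X)^{-1} X^T (y - p), W = diag(p_j (1 - p_j)).
beta = np.zeros(2)
for _ in range(20):
    p = 1.0 / (1.0 + np.exp(-X @ beta))               # current posteriors P(w1 | x_j)
    W = p * (1.0 - p)                                 # diagonal of W
    H = X.T @ (W[:, None] * X)                        # X^T W X
    beta = beta + np.linalg.solve(H, X.T @ (y - p))
```

At convergence the gradient $X^\top(\mathbf{y} - \mathbf{p})$ is essentially zero; note that Newton's method can diverge on perfectly separable data, where the ML solution does not exist.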
More informationMath 4. Unit 1: Conic Sections Lesson 1.1: What Is a Conic Section?
Unit 1: Conic Sections Lesson 1.1: What Is a Conic Section? 1.1.1: Study - What is a Conic Section? Duration: 50 min 1.1.2: Quiz - What is a Conic Section? Duration: 25 min / 18 Lesson 1.2: Geometry of
More informationOn the convergence of fitting algorithms in computer vision
On the convergence of fitting algorithms in computer vision N. Chernov Department of Mathematics University of Alabama at Birmingham Birmingham, AL 35294 chernov@math.uab.edu Abstract We investigate several
More informationSan Francisco State University ECON 560 Summer Midterm Exam 2. Monday, July hour 15 minutes
San Francisco State University Michael Bar ECON 560 Summer 2018 Midterm Exam 2 Monday, July 30 1 hour 15 minutes Name: Instructions 1. This is closed book, closed notes exam. 2. No calculators or electronic
More informationCT4510: Computer Graphics. Transformation BOCHANG MOON
CT4510: Computer Graphics Transformation BOCHANG MOON 2D Translation Transformations such as rotation and scale can be represented using a matrix M ee. gg., MM = SSSS xx = mm 11 xx + mm 12 yy yy = mm 21
More informationA point-based Bayesian hierarchical model to predict the outcome of tennis matches
A point-based Bayesian hierarchical model to predict the outcome of tennis matches Martin Ingram, Silverpond September 21, 2017 Introduction Predicting tennis matches is of interest for a number of applications:
More informationA COURSE OUTLINE (September 2001)
189-265A COURSE OUTLINE (September 2001) 1 Topic I. Line integrals: 2 1 2 weeks 1.1 Parametric curves Review of parametrization for lines and circles. Paths and curves. Differentiation and integration
More informationB. AA228/CS238 Component
Abstract Two supervised learning methods, one employing logistic classification and another employing an artificial neural network, are used to predict the outcome of baseball postseason series, given
More informationDiscussion: Illusions of Sparsity by Giorgio Primiceri
Discussion: Illusions of Sparsity by Giorgio Primiceri Pablo Guerron-Quintana Boston College June, 2018 Pablo Guerron-Quintana Discussion 1 / 17 Executive Summary Motivation: Is regular coke (dense model)
More informationDNS Study on Three Vortex Identification Methods
Γ DNS Study on Three Vortex Identification Methods Yinlin Dong Yong Yang Chaoqun Liu Technical Report 2016-07 http://www.uta.edu/math/preprint/ DNS Study on Three Vortex Identification Methods Yinlin Dong
More informationCS 7641 A (Machine Learning) Sethuraman K, Parameswaran Raman, Vijay Ramakrishnan
CS 7641 A (Machine Learning) Sethuraman K, Parameswaran Raman, Vijay Ramakrishnan Scenario 1: Team 1 scored 200 runs from their 50 overs, and then Team 2 reaches 146 for the loss of two wickets from their
More informationBayesian model averaging with change points to assess the impact of vaccination and public health interventions
Bayesian model averaging with change points to assess the impact of vaccination and public health interventions SUPPLEMENTARY METHODS Data sources U.S. hospitalization data were obtained from the Healthcare
More informationChapter 10 Aggregate Demand I: Building the IS LM Model
Chapter 10 Aggregate Demand I: Building the IS LM Model Zhengyu Cai Ph.D. Institute of Development Southwestern University of Finance and Economics All rights reserved http://www.escience.cn/people/zhengyucai/index.html
More informationDo New Bike Share Stations Increase Member Use?: A Quasi-Experimental Study
Do New Bike Share Stations Increase Member Use?: A Quasi-Experimental Study Jueyu Wang & Greg Lindsey Humphrey School of Public Affairs University of Minnesota Acknowledgement: NSF Sustainable Research
More informationWhat is Restrained and Unrestrained Pipes and what is the Strength Criteria
What is Restrained and Unrestrained Pipes and what is the Strength Criteria Alex Matveev, September 11, 2018 About author: Alex Matveev is one of the authors of pipe stress analysis codes GOST 32388-2013
More informationLesson 14: Modeling Relationships with a Line
Exploratory Activity: Line of Best Fit Revisited 1. Use the link http://illuminations.nctm.org/activity.aspx?id=4186 to explore how the line of best fit changes depending on your data set. A. Enter any
More informationFinding your feet: modelling the batting abilities of cricketers using Gaussian processes
Finding your feet: modelling the batting abilities of cricketers using Gaussian processes Oliver Stevenson & Brendon Brewer PhD candidate, Department of Statistics, University of Auckland o.stevenson@auckland.ac.nz
More informationTSP at isolated intersections: Some advances under simulation environment
TSP at isolated intersections: Some advances under simulation environment Zhengyao Yu Vikash V. Gayah Eleni Christofa TESC 2018 December 5, 2018 Overview Motivation Problem introduction Assumptions Formation
More informationSEPARATING A GAS MIXTURE INTO ITS CONSTITUENT ANALYTES USING FICA
SEPARATING A GAS MIXTURE INTO ITS CONSTITUENT ANALYTES USING FICA Aparna Mahadevan Thesis submitted to the faculty of the Virginia Polytechnic Institute and State University in partial fulfillment of the
More informationA Mechanics-Based Approach for Putt Distance Optimization
University of Central Florida HIM 1990-2015 Open Access A Mechanics-Based Approach for Putt Distance Optimization 2015 Pascual Santiago-Martinez University of Central Florida Find similar works at: http://stars.library.ucf.edu/honorstheses1990-2015
More informationHIGH RESOLUTION DEPTH IMAGE RECOVERY ALGORITHM USING GRAYSCALE IMAGE.
HIGH RESOLUTION DEPTH IMAGE RECOVERY ALGORITHM USING GRAYSCALE IMAGE Kazunori Uruma 1, Katsumi Konishi 2, Tomohiro Takahashi 1 and Toshihiro Furukawa 1 1 Graduate School of Engineering, Tokyo University
More informationA statistical model for classifying ambient noise inthesea*
A statistical model for classifying ambient noise inthesea* OCEANOLOGIA, 39(3), 1997. pp.227 235. 1997, by Institute of Oceanology PAS. KEYWORDS Ambient noise Sea-state classification Wiesław Kiciński
More informationReport for Experiment #11 Testing Newton s Second Law On the Moon
Report for Experiment #11 Testing Newton s Second Law On the Moon Neil Armstrong Lab partner: Buzz Aldrin TA: Michael Collins July 20th, 1969 Abstract In this experiment, we tested Newton s second law
More informationNonlife Actuarial Models. Chapter 7 Bühlmann Credibility
Nonlife Actuarial Models Chapter 7 Bühlmann Credibility Learning Objectives 1. Basic framework of Bühlmann credibility 2. Variance decomposition 3. Expected value of the process variance 4. Variance of
More informationPredicting Tennis Match Outcomes Through Classification Shuyang Fang CS074 - Dartmouth College
Predicting Tennis Match Outcomes Through Classification Shuyang Fang CS074 - Dartmouth College Introduction The governing body of men s professional tennis is the Association of Tennis Professionals or
More informationDevelopment of Decision Support Tools to Assess Pedestrian and Bicycle Safety: Development of Safety Performance Function
Development of Decision Support Tools to Assess Pedestrian and Bicycle Safety: Development of Safety Performance Function Valerian Kwigizile, Jun Oh, Ron Van Houten, & Keneth Kwayu INTRODUCTION 2 OVERVIEW
More informationTie Breaking Procedure
Ohio Youth Basketball Tie Breaking Procedure The higher seeded team when two teams have the same record after completion of pool play will be determined by the winner of their head to head competition.
More informationHighly Cited Psychometrika Articles:
Highly Cited Psychometrika Articles: 1936 2001 1 2016 1 List compiled by Willem Heiser and Larry Hubert. Citation counts are from Google Scholar as of April, 1 Greater than 3000 citations: Akaike, H. (1987).
More informationModelling and Simulation of Environmental Disturbances
Modelling and Simulation of Environmental Disturbances (Module 5) Dr Tristan Perez Centre for Complex Dynamic Systems and Control (CDSC) Prof. Thor I Fossen Department of Engineering Cybernetics 18/09/2007
More informationSports Predictive Analytics: NFL Prediction Model
Sports Predictive Analytics: NFL Prediction Model By Dr. Ash Pahwa IEEE Computer Society San Diego Chapter January 17, 2017 Copyright 2017 Dr. Ash Pahwa 1 Outline Case Studies of Sports Analytics Sports
More information