Logistic Regression. Hongning Wang
1 Logistic Regression Hongning Wang
2 Today's lecture
- Logistic regression model: a discriminative classification model
- Two different perspectives to derive the model
- Parameter estimation
3 Review: Bayes risk minimization
- Risk: assigning an instance to the wrong class
- y* = argmax_y P(y|X)
- (Figure: class densities p(X, y=0) and p(X, y=1) over X; the optimal Bayes decision boundary lies where P(X|y=0)P(y=0) = P(X|y=1)P(y=1), and the density on the wrong side of it gives the false negative and false positive errors.)
- We have learned multiple ways to estimate these distributions
4 Instance-based solution: k nearest neighbors
- Approximate the Bayes decision rule using a subset of data around the testing point
5 Instance-based solution: k nearest neighbors
- Approximate the Bayes decision rule using a subset of data around the testing point
- Let V be the volume of the m-dimensional ball around x that contains its k nearest neighbors. Then p(x) V = k / N, so p(x) = k / (N V), where N is the total number of instances.
- Similarly, p(x|y=1) = k_1 / (N_1 V), where N_1 is the total number of instances in class 1 and k_1 counts the nearest neighbors that come from class 1; and p(y=1) = N_1 / N.
- With Bayes' rule: p(y=1|x) = p(x|y=1) p(y=1) / p(x) = [k_1 / (N_1 V)] [N_1 / N] / [k / (N V)] = k_1 / k
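The estimate above reduces to counting labels among the neighbors. A minimal numpy sketch, assuming Euclidean distance; the arrays X_train, y_train and the query point are illustrative toy data:

```python
# Minimal sketch: estimate p(y=1|x) as k_1 / k from the k nearest neighbors.
import numpy as np

def knn_posterior(x, X_train, y_train, k=5):
    """Fraction of the k nearest training points (Euclidean distance) labeled 1."""
    dists = np.linalg.norm(X_train - x, axis=1)   # distance to every training point
    nearest = np.argsort(dists)[:k]               # indices of the k nearest neighbors
    return np.mean(y_train[nearest] == 1)         # k_1 / k

# toy usage
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.1], [0.9, 1.0]])
y_train = np.array([0, 0, 1, 1])
print(knn_posterior(np.array([0.95, 1.05]), X_train, y_train, k=3))
```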
6 Generative solution: Naïve Bayes classifier
- y* = argmax_y P(y|X)
     = argmax_y P(X|y) P(y)                    (by Bayes' rule)
     = argmax_y P(y) ∏_{i=1}^d P(x_i|y)        (by the conditional independence assumption)
- Graphical model: y -> x_1, x_2, x_3, ..., x_v
7 Estimating parameters
- Maximum likelihood estimator:
  P(x_i|y) = Σ_d Σ_j δ(x_j^d = x_i, y^d = y) / Σ_d δ(y^d = y)
  P(y) = Σ_d δ(y^d = y) / Σ_d 1
- (Example table: counts of terms such as "text", "information", "identify", "mining", "mined", "is", "useful", "to", "from", "apple", "delicious" across documents D with labels Y.)
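A small sketch of this kind of maximum-likelihood counting, instantiated here as a multinomial model over words (the toy corpus and labels, and the choice to normalize by the total word count of each class, are assumptions for illustration; no smoothing is applied):

```python
# Sketch: MLE by counting for a multinomial Naïve Bayes text model.
from collections import Counter, defaultdict

docs = [["text", "mining", "is", "useful"],
        ["apple", "is", "delicious"],
        ["identify", "information", "from", "text"]]
labels = [1, 0, 1]   # toy labels: 1 = about text mining, 0 = not

# P(y) = (# documents with label y) / (# documents)
label_counts = Counter(labels)
p_y = {y: c / len(labels) for y, c in label_counts.items()}

# P(x_i|y) = (count of word x_i in class-y documents) / (total word count in class y)
word_counts = defaultdict(Counter)
for doc, y in zip(docs, labels):
    word_counts[y].update(doc)
p_x_given_y = {y: {w: c / sum(cnt.values()) for w, c in cnt.items()}
               for y, cnt in word_counts.items()}

print(p_y)                          # {1: 0.667, 0: 0.333}
print(p_x_given_y[1].get("text"))   # 2 / 8 = 0.25
```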
8 Discriminative vs. generative models
- Generative: all instances are considered for probability density estimation
- Discriminative: directly model y = f(x); more attention is put on the boundary points
9 Parametric form of decision boundary
- For the binary case in Naïve Bayes:
  f(X) = sgn(log P(y=1|X) - log P(y=0|X))
       = sgn(log [P(y=1)/P(y=0)] + Σ_{i=1}^V c(x_i, d) log [P(x_i|y=1)/P(x_i|y=0)])
       = sgn(w^T X)
  where w = (log P(y=1)/P(y=0), log P(x_1|y=1)/P(x_1|y=0), ..., log P(x_V|y=1)/P(x_V|y=0))
  and X = (1, c(x_1, d), ..., c(x_V, d))
- Linear regression?
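A sketch of this reduction from Naïve Bayes parameters to a linear score sgn(w^T X); the vocabulary, probabilities, and word counts below are made-up toy values:

```python
# Sketch: build w from (toy) Naïve Bayes probabilities and classify by sign(w^T X).
import numpy as np

vocab = ["good", "bad", "movie"]
p_w_pos = np.array([0.5, 0.1, 0.4])    # toy P(x_i | y=1)
p_w_neg = np.array([0.1, 0.6, 0.3])    # toy P(x_i | y=0)
prior_pos, prior_neg = 0.5, 0.5

w = np.concatenate(([np.log(prior_pos / prior_neg)],
                    np.log(p_w_pos / p_w_neg)))   # bias + per-word log-odds

counts = np.array([2, 0, 1])                      # c(x_i, d): word counts in a document
X = np.concatenate(([1.0], counts))               # prepend the constant feature
y_hat = int(w @ X > 0)                            # sgn(w^T X)
print(w, y_hat)
```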
10 Regression for classification?
- Linear regression: y ≈ w^T X
- Models the relationship between a scalar dependent variable y and one or more explanatory variables
11 Regression for classification?
- Linear regression: y ≈ w^T X
- Models the relationship between a scalar dependent variable y and one or more explanatory variables
- But y is discrete in a classification problem! E.g., predict ŷ = 1 iff w^T X > 0.5
- What if we have an outlier? (Figure: the optimal regression fit, and hence the 0.5 threshold, shifts when an outlier is added.)
12 Regression for classification?
- Logistic regression: p(y|x) = σ(w^T X) = 1 / (1 + exp(-w^T X))
- Directly models the class posterior
- Sigmoid function (Figure: P(y|x) as an S-shaped curve in x)
- What if we have an outlier?
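A minimal sketch of this posterior, assuming the first feature column is the constant 1 (bias) and the weights are toy values:

```python
# Sketch: the logistic (sigmoid) posterior p(y=1|X) = 1 / (1 + exp(-w^T X)).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(w, X):
    """p(y=1|X) for a feature vector, or row-wise for a matrix of instances."""
    return sigmoid(X @ w)

w = np.array([0.5, -1.0, 2.0])           # toy weights
X = np.array([[1.0, 0.3, 0.8],           # rows are instances; first column is the bias feature
              [1.0, 2.0, 0.1]])
print(predict_proba(w, X))
```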
13 Logistic regression for classification
- Why the sigmoid function?
  P(y=1|X) = P(X|y=1) P(y=1) / [P(X|y=1) P(y=1) + P(X|y=0) P(y=0)]
           = 1 / [1 + P(X|y=0) P(y=0) / (P(X|y=1) P(y=1))]
- Assumptions: y is Binomial with P(y=1) = α; P(X|y=0) = N(μ_0, σ²) and P(X|y=1) = N(μ_1, σ²), i.e., Normal with identical variance
- (Figure: the resulting posterior P(y|x) is an S-shaped curve in x.)
14 Logistic regression for classification
- Why the sigmoid function?
  P(y=1|X) = P(X|y=1) P(y=1) / [P(X|y=1) P(y=1) + P(X|y=0) P(y=0)]
           = 1 / [1 + P(X|y=0) P(y=0) / (P(X|y=1) P(y=1))]
           = 1 / [1 + exp(-ln [P(X|y=1) P(y=1) / (P(X|y=0) P(y=0))])]
15 Logistic regression for classification
- Why the sigmoid function?
  ln [P(X|y=1) P(y=1) / (P(X|y=0) P(y=0))]
    = ln [P(y=1)/P(y=0)] + Σ_{i=1}^V ln [P(x_i|y=1)/P(x_i|y=0)]
  With P(x_i|y) = (1 / (σ_i √(2π))) exp(-(x_i - μ_{yi})² / (2σ_i²)):
    = ln [α/(1-α)] + Σ_{i=1}^V [ (μ_{1i} - μ_{0i}) x_i / σ_i² + (μ_{0i}² - μ_{1i}²) / (2σ_i²) ]
    = w_0 + Σ_{i=1}^V [(μ_{1i} - μ_{0i}) / σ_i²] x_i
    = w^T X
- Origin of the name: the logit function
16 Logistic regression for classification
- Why the sigmoid function?
  P(y=1|X) = P(X|y=1) P(y=1) / [P(X|y=1) P(y=1) + P(X|y=0) P(y=0)]
           = 1 / [1 + exp(-ln [P(X|y=1) P(y=1) / (P(X|y=0) P(y=0))])]
           = 1 / [1 + exp(-w^T X)]
- A Generalized Linear Model
- Note: it is still a linear relation among the features!
17 Logistic regression for classification
- For multi-class categorization:
  P(y=k|X) = exp(w_k^T X) / Σ_{j=1}^K exp(w_j^T X), i.e., P(y=k|X) ∝ exp(w_k^T X)
- Warning: redundancy in the model parameters, since Σ_{j=1}^K P(y=j|X) = 1
- When K = 2:
  P(y=1|X) = exp(w_1^T X) / [exp(w_1^T X) + exp(w_0^T X)] = 1 / [1 + exp(-(w_1 - w_0)^T X)]
  so only w = w_1 - w_0 matters
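A sketch of the multi-class posterior; the weight matrix and feature vector are toy values, and the max-subtraction is a standard numerical-stability trick rather than something from the slides:

```python
# Sketch: p(y=k|X) = exp(w_k^T X) / sum_j exp(w_j^T X)  (softmax over class scores).
import numpy as np

def softmax_posterior(W, x):
    """W: K x d weight matrix (one row w_k per class); x: d-dim feature vector."""
    scores = W @ x
    scores -= scores.max()        # subtract the max score for numerical stability
    e = np.exp(scores)
    return e / e.sum()

W = np.array([[0.2, 1.0, -0.5],
              [1.5, -0.3, 0.0],
              [-1.0, 0.4, 0.9]])  # toy weights for K = 3 classes
x = np.array([1.0, 0.5, 2.0])
print(softmax_posterior(W, x))    # sums to 1
```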
18 Logistic regression for classification
- Decision boundary for the binary case:
  ŷ = 1 if p(y=1|X) > 0.5, and 0 otherwise
  p(y=1|X) > 0.5  iff  exp(-w^T X) < 1  iff  w^T X > 0
  so ŷ = 1 iff w^T X > 0, and 0 otherwise
- A linear model!
19 Logistic regression for classification
- Decision boundary in general:
  ŷ = argmax_y p(y|X) = argmax_y exp(w_y^T X) = argmax_y w_y^T X
- A linear model!
20 Logistic regression for classification
- Summary:
  P(y=1|X) = P(X|y=1) P(y=1) / [P(X|y=1) P(y=1) + P(X|y=0) P(y=0)]
           = 1 / [1 + P(X|y=0) P(y=0) / (P(X|y=1) P(y=1))]
- Assumptions: y is Binomial with P(y=1) = α; P(X|y=0) = N(μ_0, σ²) and P(X|y=1) = N(μ_1, σ²), Normal with identical variance
- (Figure: the resulting posterior P(y|x) as a sigmoid in x.)
21 A different perspective
- Imagine we have the following observation: documents containing {happy, good, purchase, item, indeed} have positive sentiment, i.e.,
  p(x="happy", y=1) + p(x="good", y=1) + p(x="purchase", y=1) + p(x="item", y=1) + p(x="indeed", y=1) = 1
- Question: find a distribution p(x, y) that satisfies this observation.
- Answer 1: p(x="item", y=1) = 1, and all the others 0
- Answer 2: p(x="indeed", y=1) = 0.5, p(x="good", y=1) = 0.5, and all the others 0
- We have too little information to favor either one of them.
22 Occam's razor
- A problem-solving principle: among competing hypotheses that predict equally well, the one with the fewest assumptions should be selected. (William of Ockham)
- Principle of Insufficient Reason: "when one has no information to distinguish between the probability of two events, the best strategy is to consider them equally likely." (Pierre-Simon Laplace)
23 A different perspective
- Same observation as before:
  p(x="happy", y=1) + p(x="good", y=1) + p(x="purchase", y=1) + p(x="item", y=1) + p(x="indeed", y=1) = 1
- Question: find a distribution p(x, y) that satisfies this observation.
- As a result, a safer choice would be: p(x=·, y=1) = 0.2 for each of the five words
- Equally favor every possibility
24 A different perspective
- Now two observations: documents with {happy, good, purchase, item, indeed} are positive, and 30% of the time {good, item} are positive:
  p(x="happy", y=1) + p(x="good", y=1) + p(x="purchase", y=1) + p(x="item", y=1) + p(x="indeed", y=1) = 1
  p(x="good", y=1) + p(x="item", y=1) = 0.3
- Question: find a distribution p(x, y) that satisfies these observations.
- Again, a safer choice would be: p(x="good", y=1) = p(x="item", y=1) = 0.15, and each of the others 7/30
- Equally favor every possibility
25 A different perspective
- Now three observations: {happy, good, purchase, item, indeed} are positive; 30% of the time {good, item} are positive; 50% of the time {good, happy} are positive:
  p(x="happy", y=1) + p(x="good", y=1) + p(x="purchase", y=1) + p(x="item", y=1) + p(x="indeed", y=1) = 1
  p(x="good", y=1) + p(x="item", y=1) = 0.3
  p(x="good", y=1) + p(x="happy", y=1) = 0.5
- Question: find a distribution p(x, y) that satisfies these observations.
- Time to think about: 1) what do we mean by equally/uniformly favoring the models? 2) given all these constraints, how could we find the most preferred model?
26 Maximum entropy modeling
- A measure of the uncertainty of random events: H(X) = E[I(X)] = -Σ_{x∈X} P(x) log P(x)
- Maximized when P(X) is the uniform distribution
- Question 1 is answered; then how about question 2?
27 Represent the constraints
- Indicator function, e.g., to express the observation that the word "good" occurs in a positive document:
  f(x, y) = 1 if y = 1 and x = "good", and 0 otherwise
- Usually referred to as a feature function
28 Represent the constraints
- Empirical expectation of a feature function over a corpus:
  E~[f] = Σ_{x,y} p~(x, y) f(x, y), where p~(x, y) = c(f(x, y)) / N,
  i.e., the frequency of observing f(x, y) in the given collection
- Expectation of the feature function under a given statistical model:
  E_p[f] = Σ_{x,y} p~(x) p(y|x) f(x, y),
  where p~(x) is the empirical distribution of x in the same collection and p(y|x) is the model's estimate of the conditional distribution
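A sketch of both expectations for the single feature f(x, y) = 1 iff x = "good" and y = 1; the tiny (x, y) corpus and the conditional model p(y=1|x) below are made-up values for illustration:

```python
# Sketch: empirical vs. model expectation of one feature function.
from collections import Counter

pairs = [("good", 1), ("good", 1), ("item", 1), ("bad", 0), ("good", 0)]   # toy (x, y) corpus
N = len(pairs)

def f(x, y):
    return 1.0 if x == "good" and y == 1 else 0.0

# Empirical expectation: sum over (x, y) of p~(x, y) f(x, y)
emp = sum(f(x, y) for x, y in pairs) / N

# Model expectation: sum over x of p~(x) * sum over y of p(y|x) f(x, y),
# using a toy conditional model p(y=1|x).
p_y1_given_x = {"good": 0.7, "item": 0.6, "bad": 0.2}
p_x = {x: c / N for x, c in Counter(x for x, _ in pairs).items()}
model = sum(p_x[x] * (p_y1_given_x[x] * f(x, 1) + (1 - p_y1_given_x[x]) * f(x, 0))
            for x in p_x)

# The maxent constraint would require these two to match; this toy model is not yet fitted.
print(emp, model)
```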
29 Represent the constraints
- When a feature is important, we require our preferred statistical model to accord with it:
  C = { p ∈ P | E_p[f_i] = E~[f_i], i ∈ {1, 2, ..., n} }
  E_p[f_i] = E~[f_i]  ⟺  Σ_{x,y} p~(x) p(y|x) f_i(x, y) = Σ_{x,y} p~(x, y) f_i(x, y)
- We only need to specify p(y|x) in our preferred model!
- Is question 2 answered?
30 Represent the constraints
- Let's visualize this (figure with four panels): (a) no constraint, (b) under-constrained, (c) feasible constraint, (d) over-constrained
- How to deal with these situations?
31 Maximum entropy principle
- To select a model from a set C of allowed probability distributions, choose the model p* ∈ C with maximum entropy H(p):
  p* = argmax_{p ∈ C} H(p(y|x))
- Both questions are answered!
32 Maximum entropy principle
- Let's solve this constrained optimization problem with Lagrange multipliers: a strategy for finding the local maxima and minima of a function subject to equality constraints
- Primal: p* = argmax_{p ∈ C} H(p)
- Lagrangian: L(p, λ) = H(p) + Σ_i λ_i (E_p[f_i] - E~[f_i])
33 Maximum entropy principle
- Let's solve this constrained optimization problem with Lagrange multipliers
- Lagrangian: L(p, λ) = H(p) + Σ_i λ_i (E_p[f_i] - E~[f_i])
- Dual: p_λ(y|x) = (1/Z_λ(x)) exp(Σ_i λ_i f_i(x, y))
  Ψ(λ) = -Σ_x p~(x) log Z_λ(x) + Σ_i λ_i E~[f_i]
34 Maximum entropy principle
- Let's solve this constrained optimization problem with Lagrange multipliers
- Dual: Ψ(λ) = -Σ_x p~(x) log Z_λ(x) + Σ_i λ_i E~[f_i]
  where Z_λ(x) = Σ_y exp(Σ_i λ_i f_i(x, y))
35 Maximum entropy principle
- Let's take a close look at the dual function:
  Ψ(λ) = -Σ_x p~(x) log Z_λ(x) + Σ_i λ_i E~[f_i]
  where Z_λ(x) = Σ_y exp(Σ_i λ_i f_i(x, y))
36 Maximum entropy principle
- Let's take a close look at the dual function:
  Ψ(λ) = -Σ_x p~(x) log Z_λ(x) + Σ_i λ_i E~[f_i]
       = Σ_{x,y} p~(x, y) [ Σ_i λ_i f_i(x, y) - log Z_λ(x) ]
       = Σ_{x,y} p~(x, y) log [ exp(Σ_i λ_i f_i(x, y)) / Z_λ(x) ]
       = Σ_{x,y} p~(x, y) log p_λ(y|x)
- The maximum likelihood estimator (the conditional log-likelihood of the training data)!
37 Maximum entropy principle
- Primal: maximum entropy, p* = argmax_{p ∈ C} H(p)
- Dual: logistic regression, p_λ(y|x) = (1/Z_λ(x)) exp(Σ_i λ_i f_i(x, y)),
  where Z_λ(x) = Σ_y exp(Σ_i λ_i f_i(x, y))
- λ is determined by maximizing Ψ(λ)
38 Questions that haven't been answered
- Class-conditional density: why should it be Gaussian with equal variance?
- Model parameters: what is the relationship between w and λ? How to estimate them?
39 Maximum entropy principle
- The maximum entropy model subject to the constraints C has a parametric solution p_λ(y|x), where the parameters λ can be determined by maximizing the likelihood of p_λ(y|x) over a training set.
- For a given variance, differential entropy is maximized by a Gaussian distribution.
- Features follow Gaussian distributions -> maximum entropy model -> logistic regression
40 Parameter estimation
- Maximum likelihood estimation:
  L(w) = Σ_{d∈D} [ y_d log p(y_d=1|X_d) + (1 - y_d) log p(y_d=0|X_d) ]
- Take the gradient of L(w) with respect to w:
  ∂L(w)/∂w = Σ_{d∈D} ∂/∂w [ y_d log p(y_d=1|X_d) + (1 - y_d) log p(y_d=0|X_d) ]
41 Parameter estimation
- Maximum likelihood estimation:
  ∂ log p(y_d=1|X_d) / ∂w = -∂ log(1 + exp(-w^T X_d)) / ∂w
                          = [exp(-w^T X_d) / (1 + exp(-w^T X_d))] X_d
                          = (1 - p(y_d=1|X_d)) X_d
  ∂ log p(y_d=0|X_d) / ∂w = (0 - p(y_d=1|X_d)) X_d
42 Parameter estimation
- Maximum likelihood estimation:
  L(w) = Σ_{d∈D} [ y_d log p(y_d=1|X_d) + (1 - y_d) log p(y_d=0|X_d) ]
- Take the gradient of L(w) with respect to w:
  ∂L(w)/∂w = Σ_{d∈D} [ y_d (1 - p(y_d=1|X_d)) X_d + (1 - y_d)(0 - p(y_d=1|X_d)) X_d ]
           = Σ_{d∈D} (y_d - p(y=1|X_d)) X_d
- L(w) is a concave function of w
- Good news: a neat form that can be easily generalized to the multi-class case
- Bad news: no closed-form solution
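A sketch of the log-likelihood and its gradient Σ_d (y_d - p(y=1|X_d)) X_d in numpy; the design matrix, labels, and starting point are toy values:

```python
# Sketch: log-likelihood L(w) and its gradient for binary logistic regression.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(w, X, y):
    p = sigmoid(X @ w)
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def gradient(w, X, y):
    """X: N x d design matrix, y: length-N 0/1 labels; returns sum_d (y_d - p_d) X_d."""
    return X.T @ (y - sigmoid(X @ w))

# toy data: first column is the bias feature
X = np.array([[1.0, 0.2], [1.0, 1.5], [1.0, -0.7]])
y = np.array([0, 1, 0])
w = np.zeros(2)
print(log_likelihood(w, X, y), gradient(w, X, y))
```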
43 Gradient-based optimization
- Gradient:
  ∇_w L(w) = [ ∂L(w)/∂w_0, ∂L(w)/∂w_1, ..., ∂L(w)/∂w_V ]
- Iterative updating (ascent on the log-likelihood L(w), i.e., gradient descent on -L(w)):
  w^(t+1) = w^(t) + η^(t) ∇_w L(w^(t))
- The step size η^(t) affects convergence
44 Parameter estimation
- Stochastic gradient descent:
  while not converged:
    randomly choose d ∈ D
    ∇_w L_d(w) = [ ∂L_d(w)/∂w_0, ∂L_d(w)/∂w_1, ..., ∂L_d(w)/∂w_V ]
    w^(t+1) = w^(t) + η^(t) ∇_w L_d(w)
    η^(t+1) = a η^(t)   (gradually shrink the step size)
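A sketch of this loop as stochastic gradient ascent on L(w), one instance per step with a geometrically shrinking step size; the data, step size, shrink factor, and step count are illustrative choices:

```python
# Sketch: stochastic gradient updates for binary logistic regression.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sgd_logistic(X, y, eta=0.1, shrink=0.999, n_steps=5000, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(n_steps):
        d = rng.integers(len(y))                    # randomly choose one training instance
        grad_d = (y[d] - sigmoid(X[d] @ w)) * X[d]  # per-instance gradient of L_d(w)
        w += eta * grad_d                           # ascent step (we maximize L(w))
        eta *= shrink                               # gradually shrink the step size
    return w

X = np.array([[1.0, 0.2], [1.0, 1.5], [1.0, -0.7], [1.0, 2.2]])
y = np.array([0, 1, 0, 1])
print(sgd_logistic(X, y))
```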
45 Parameter estimation
- Batch gradient descent:
  while not converged:
    compute the gradient w.r.t. all training instances:
      ∇_w L_D(w) = [ ∂L_D(w)/∂w_0, ∂L_D(w)/∂w_1, ..., ∂L_D(w)/∂w_V ]
    compute the step size η^(t)   (line search is required to ensure sufficient descent)
    w^(t+1) = w^(t) + η^(t) ∇_w L_D(w)
- This is a first-order method; second-order methods, e.g., quasi-Newton methods and conjugate gradient, provide faster convergence
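Where the slide points to quasi-Newton methods, here is a sketch that hands the negative log-likelihood and its gradient to scipy's L-BFGS solver; the toy data are chosen non-separable so the maximum likelihood estimate is finite:

```python
# Sketch: full-batch fitting with a quasi-Newton solver (L-BFGS via scipy),
# minimizing the negative log-likelihood (equivalent to maximizing L(w)).
import numpy as np
from scipy.optimize import minimize

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neg_log_likelihood(w, X, y):
    p = sigmoid(X @ w)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def neg_gradient(w, X, y):
    return -X.T @ (y - sigmoid(X @ w))

X = np.array([[1.0, 0.2], [1.0, 1.5], [1.0, 0.3], [1.0, 2.2]])
y = np.array([0.0, 1.0, 1.0, 0.0])       # not linearly separable, so the MLE is finite
res = minimize(neg_log_likelihood, np.zeros(2), args=(X, y),
               jac=neg_gradient, method="L-BFGS-B")
print(res.x)
```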
46 Model regularization
- Avoid over-fitting: we may not have enough samples to estimate the model parameters of logistic regression well
- Regularization: impose additional constraints over the model parameters
- E.g., a sparsity constraint enforces the model to have more zero parameters
47 Model regularization
- L2-regularized logistic regression
- Assume the model parameter w is drawn from a Gaussian prior: w ~ N(0, σ²)
  p(y_d, w | X_d) ∝ p(y_d | X_d, w) p(w)
  L(w) = Σ_{d∈D} [ y_d log p(y_d=1|X_d) + (1 - y_d) log p(y_d=0|X_d) ] - w^T w / (2σ²)
- The penalty term is the (squared) L2-norm of w
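A sketch of the L2-regularized (MAP) objective and its gradient, which simply adds a -w/σ² term to the earlier gradient; the data and σ² are toy values:

```python
# Sketch: L2-regularized log-likelihood and its gradient.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def l2_objective(w, X, y, sigma2=1.0):
    p = sigmoid(X @ w)
    # log-likelihood minus the Gaussian-prior penalty w^T w / (2 sigma^2)
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)) - w @ w / (2 * sigma2)

def l2_gradient(w, X, y, sigma2=1.0):
    # gradient of the penalized objective: sum_d (y_d - p_d) X_d  -  w / sigma^2
    return X.T @ (y - sigmoid(X @ w)) - w / sigma2

X = np.array([[1.0, 0.2], [1.0, 1.5], [1.0, -0.7], [1.0, 2.2]])
y = np.array([0, 1, 0, 1])
w = np.ones(2)
print(l2_objective(w, X, y), l2_gradient(w, X, y))
```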
48 Generative vs. discriminative models
- Generative: specifies the joint distribution; a full probabilistic specification of all the random variables; dependence assumptions have to be specified for p(X|y) and p(y); flexible, can also be used in unsupervised learning
- Discriminative: specifies the conditional distribution; only explains the target variable; arbitrary features can be incorporated for modeling p(y|X); needs labeled data, so only suitable for (semi-)supervised learning
49 Naïve Bayes vs. logistic regression
- Naïve Bayes: conditional independence assumption, p(X|y) = ∏_i p(x_i|y); a distributional assumption on each p(x_i|y); k(V+1) parameters; model estimation by closed-form MLE; asymptotic convergence ε_{NB,n} ≤ ε_{NB,∞} + O(√(log V / n))
- Logistic regression: no independence assumption; a functional-form assumption, p(y|X) ∝ exp(w_y^T X); (k-1)(V+1) parameters; model estimation by gradient-based MLE; asymptotic convergence ε_{LR,n} ≤ ε_{LR,∞} + O(√(V / n)), i.e., it needs more training data
50 Naïve Bayes vs. logistic regression
- (Figure: learning curves of LR and NB on UCI data sets, from "On discriminative vs. generative classifiers: A comparison of logistic regression and naive Bayes", Ng & Jordan, NIPS 2002.)
51 What you should know
- Two different derivations of logistic regression:
  - Functional form from Naïve Bayes assumptions: p(x|y) follows an equal-variance Gaussian, giving the sigmoid function
  - Maximum entropy principle: primal/dual optimization
- Generalization to the multi-class case
- Parameter estimation: gradient-based optimization; regularization
- Comparison with Naïve Bayes
52 Today's reading
- Speech and Language Processing, Chapter 6: Hidden Markov and Maximum Entropy Models
  - 6.6 Maximum entropy models: background
  - 6.7 Maximum entropy modeling