Bayesian Learning. CS 5751 Machine Learning. Chapter 6: Bayesian Learning


Bayesian Learning
- Bayes Theorem
- MAP, ML hypotheses
- MAP learners
- Minimum description length principle
- Bayes optimal classifier
- Naïve Bayes learner
- Bayesian belief networks

Two Roles for Bayesian Methods
Provide practical learning algorithms:
- Naïve Bayes learning
- Bayesian belief network learning
- Combine prior knowledge (prior probabilities) with observed data
- Requires prior probabilities
Provide a useful conceptual framework:
- Provides a "gold standard" for evaluating other learning algorithms
- Additional insight into Occam's razor

Bayes Theorem
$P(h|D) = \frac{P(D|h) P(h)}{P(D)}$
- $P(h)$ = prior probability of hypothesis $h$
- $P(D)$ = prior probability of training data $D$
- $P(h|D)$ = probability of $h$ given $D$
- $P(D|h)$ = probability of $D$ given $h$

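Choosing Hypotheses
Generally we want the most probable hypothesis given the training data, the maximum a posteriori hypothesis $h_{MAP}$:
$h_{MAP} = \arg\max_{h \in H} P(h|D)$
$= \arg\max_{h \in H} \frac{P(D|h) P(h)}{P(D)}$
$= \arg\max_{h \in H} P(D|h) P(h)$
If we assume $P(h_i) = P(h_j)$, then we can simplify further and choose the maximum likelihood (ML) hypothesis:
$h_{ML} = \arg\max_{h_i \in H} P(D|h_i)$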

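Bayes Theorem
Does the patient have cancer or not?
A patient takes a lab test and the result comes back positive. The test returns a correct positive result in only 98% of the cases in which the disease is actually present, and a correct negative result in only 97% of the cases in which the disease is not present. Furthermore, 0.8% of the entire population have this cancer.
Quantities given by the statement:
- $P(cancer) = 0.008$, $P(\neg cancer) = 0.992$
- $P(+|cancer) = 0.98$, $P(-|cancer) = 0.02$
- $P(+|\neg cancer) = 0.03$, $P(-|\neg cancer) = 0.97$
Quantities to compute:
- $P(cancer|+)$ and $P(\neg cancer|+)$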
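The two posteriors asked for in this example follow from Bayes theorem and the theorem of total probability. The following is a minimal sketch of that arithmetic; the variable names are illustrative and not part of the slides.

```python
# Quantities stated in the example.
p_cancer = 0.008             # prior: 0.8% of the population has the disease
p_pos_given_cancer = 0.98    # correct positive rate when the disease is present
p_neg_given_healthy = 0.97   # correct negative rate when the disease is absent

# Complements implied by the statement.
p_healthy = 1.0 - p_cancer
p_pos_given_healthy = 1.0 - p_neg_given_healthy

# Unnormalized posteriors P(+|h) P(h) for the two hypotheses.
joint_cancer = p_pos_given_cancer * p_cancer
joint_healthy = p_pos_given_healthy * p_healthy

# The theorem of total probability gives P(+); dividing by it normalizes.
p_pos = joint_cancer + joint_healthy
posterior_cancer = joint_cancer / p_pos
posterior_healthy = joint_healthy / p_pos

h_map = "cancer" if joint_cancer > joint_healthy else "no cancer"
print(f"P(cancer | +)    = {posterior_cancer:.3f}")
print(f"P(no cancer | +) = {posterior_healthy:.3f}")
print("MAP hypothesis:", h_map)
```

Even after a positive test the posterior probability of cancer stays near 0.21, because the prior is so small; the MAP hypothesis is still "no cancer".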

Some Formulas for Probabilities
- Product rule: probability $P(A \wedge B)$ of a conjunction of two events A and B:
  $P(A \wedge B) = P(A|B) P(B) = P(B|A) P(A)$
- Sum rule: probability of a disjunction of two events A and B:
  $P(A \vee B) = P(A) + P(B) - P(A \wedge B)$
- Theorem of total probability: if events $A_1, \ldots, A_n$ are mutually exclusive with $\sum_{i=1}^{n} P(A_i) = 1$, then
  $P(B) = \sum_{i=1}^{n} P(B|A_i) P(A_i)$

Brute Force MAP Hypothesis Learner
1. For each hypothesis $h$ in $H$, calculate the posterior probability
   $P(h|D) = \frac{P(D|h) P(h)}{P(D)}$
2. Output the hypothesis $h_{MAP}$ with the highest posterior probability
   $h_{MAP} = \arg\max_{h \in H} P(h|D)$
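The two steps of the brute force learner can be sketched directly for a small, finite hypothesis space. The coin-bias hypotheses, the toss string and the function names below are hypothetical and chosen only to make the sketch runnable.

```python
def brute_force_map(hypotheses, prior, likelihood, data):
    """Return (h_MAP, posterior) by scoring every hypothesis in H."""
    # Step 1: P(D|h) P(h) for each h, then normalize by P(D).
    unnormalized = {h: likelihood(data, h) * prior(h) for h in hypotheses}
    evidence = sum(unnormalized.values())  # P(D), by total probability
    if evidence == 0:
        raise ValueError("no hypothesis assigns the data non-zero probability")
    posterior = {h: u / evidence for h, u in unnormalized.items()}
    # Step 2: output the hypothesis with the highest posterior.
    h_map = max(posterior, key=posterior.get)
    return h_map, posterior


# Hypothetical toy problem: each hypothesis is a coin's probability of heads.
coin_biases = [0.1, 0.5, 0.9]
tosses = "HHTH"


def uniform_prior(h):
    return 1.0 / len(coin_biases)


def coin_likelihood(data, h):
    heads = data.count("H")
    tails = len(data) - heads
    return h ** heads * (1.0 - h) ** tails


h_map, posterior = brute_force_map(coin_biases, uniform_prior, coin_likelihood, tosses)
print("h_MAP =", h_map)
print({h: round(p, 3) for h, p in posterior.items()})
```

Because the prior here is uniform, the MAP hypothesis coincides with the maximum likelihood hypothesis. The cost is linear in the size of H, which is why this learner is practical only for small hypothesis spaces.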

Relation to Concept Learning
Consider our usual concept learning task:
- instance space $X$, hypothesis space $H$, training examples $D$
- consider the FindS learning algorithm (outputs the most specific hypothesis from the version space $VS_{H,D}$)
What would Bayes rule produce as the MAP hypothesis?
Does FindS output a MAP hypothesis?

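Relation to Concept Learning
Assume a fixed set of instances $\langle x_1, \ldots, x_m \rangle$.
Assume $D$ is the set of classifications $D = \langle c(x_1), \ldots, c(x_m) \rangle$.
Choose $P(D|h)$:
- $P(D|h) = 1$ if $h$ is consistent with $D$
- $P(D|h) = 0$ otherwise
Choose $P(h)$ to be the uniform distribution:
- $P(h) = \frac{1}{|H|}$ for all $h$ in $H$
Then
- $P(h|D) = \frac{1}{|VS_{H,D}|}$ if $h$ is consistent with $D$
- $P(h|D) = 0$ otherwise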

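Learning a Real Valued Function
[Figure: target function f, noisy training values (noise e) and the maximum likelihood hypothesis h_ML, plotted as y against x.]
Consider any real-valued target function $f$.
Training examples $\langle x_i, d_i \rangle$, where $d_i$ is a noisy training value:
- $d_i = f(x_i) + e_i$
- $e_i$ is a random variable (noise) drawn independently for each $x_i$ according to some Gaussian distribution with mean 0
Then the maximum likelihood hypothesis $h_{ML}$ is the one that minimizes the sum of squared errors:
$h_{ML} = \arg\min_{h \in H} \sum_{i=1}^{m} (d_i - h(x_i))^2$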

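Learning a Real Valued Function
$h_{ML} = \arg\max_{h \in H} p(D|h)$
$= \arg\max_{h \in H} \prod_{i=1}^{m} p(d_i|h)$
$= \arg\max_{h \in H} \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{1}{2} \left( \frac{d_i - h(x_i)}{\sigma} \right)^2}$
Maximize the natural log of this instead:
$h_{ML} = \arg\max_{h \in H} \sum_{i=1}^{m} \ln \frac{1}{\sqrt{2\pi\sigma^2}} - \frac{1}{2} \left( \frac{d_i - h(x_i)}{\sigma} \right)^2$
$= \arg\max_{h \in H} \sum_{i=1}^{m} -\frac{1}{2} \left( \frac{d_i - h(x_i)}{\sigma} \right)^2$
$= \arg\max_{h \in H} \sum_{i=1}^{m} -(d_i - h(x_i))^2$
$= \arg\min_{h \in H} \sum_{i=1}^{m} (d_i - h(x_i))^2$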
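The equivalence derived here can be checked numerically on a small example. The sketch below assumes a hypothetical hypothesis space of lines through the origin, h(x) = w x, a made-up data set and an assumed noise standard deviation; it compares the hypothesis that maximizes the Gaussian log-likelihood with the one that minimizes the sum of squared errors.

```python
import math

# Hypothetical noisy observations of a target close to f(x) = 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ds = [0.1, 1.9, 4.2, 5.8, 8.1]
sigma = 0.5  # assumed noise standard deviation

# Hypothetical hypothesis space: h(x) = w * x for slopes w on a grid 1.0 .. 3.0.
candidate_slopes = [1.0 + 0.1 * k for k in range(21)]


def sum_squared_errors(w):
    return sum((d - w * x) ** 2 for x, d in zip(xs, ds))


def log_likelihood(w):
    # Sum over examples of ln N(d_i; h(x_i), sigma^2).
    const = -0.5 * math.log(2.0 * math.pi * sigma ** 2)
    return sum(const - 0.5 * ((d - w * x) / sigma) ** 2 for x, d in zip(xs, ds))


w_least_squares = min(candidate_slopes, key=sum_squared_errors)
w_max_likelihood = max(candidate_slopes, key=log_likelihood)

print("least-squares slope:     ", w_least_squares)
print("maximum-likelihood slope:", w_max_likelihood)
```

Changing `sigma` rescales the log-likelihood but does not change which slope wins, which mirrors the step in the derivation where the constant terms are dropped.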

Minimum Description Length Principle
Occam's razor: prefer the shortest hypothesis.
MDL: prefer the hypothesis $h$ that minimizes
$h_{MDL} = \arg\min_{h \in H} L_{C_1}(h) + L_{C_2}(D|h)$
where $L_C(x)$ is the description length of $x$ under encoding $C$.
Example: $H$ = decision trees, $D$ = training data labels
- $L_{C_1}(h)$ is the number of bits to describe tree $h$
- $L_{C_2}(D|h)$ is the number of bits to describe $D$ given $h$
- Note $L_{C_2}(D|h) = 0$ if the examples are classified perfectly by $h$; need only describe exceptions
- Hence $h_{MDL}$ trades off tree size for training errors

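Minimum Description Length Principle
$h_{MAP} = \arg\max_{h \in H} P(D|h) P(h)$
$= \arg\max_{h \in H} \log_2 P(D|h) + \log_2 P(h)$
$= \arg\min_{h \in H} -\log_2 P(D|h) - \log_2 P(h)$   (1)
Interesting fact from information theory: the optimal (shortest expected length) code for an event with probability $p$ is $-\log_2 p$ bits.
So interpret (1):
- $-\log_2 P(h)$ is the length of $h$ under the optimal code
- $-\log_2 P(D|h)$ is the length of $D$ given $h$ under the optimal code
Prefer the hypothesis that minimizes length(h) + length(misclassifications).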
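The correspondence between the MAP hypothesis and the shortest total code length can be illustrated with a few numbers. The priors and likelihoods below are hypothetical values picked for the sketch, not taken from the slides.

```python
import math

# Hypothetical (prior P(h), likelihood P(D|h)) pairs for three decision trees.
hypotheses = {
    "small tree": (0.5, 0.01),
    "medium tree": (0.3, 0.05),
    "large tree": (0.2, 0.06),
}


def description_length_bits(prior, likelihood):
    # -log2 P(h) bits for the hypothesis plus -log2 P(D|h) bits for the data given it.
    return -math.log2(prior) - math.log2(likelihood)


lengths = {name: description_length_bits(p, l) for name, (p, l) in hypotheses.items()}

h_mdl = min(lengths, key=lengths.get)
h_map = max(hypotheses, key=lambda name: hypotheses[name][0] * hypotheses[name][1])

for name, bits in lengths.items():
    print(f"{name}: {bits:.2f} bits")
print("shortest description:", h_mdl)
print("MAP hypothesis:      ", h_map)
```

In this sketch the medium tree wins under both criteria: it is less probable a priori than the small tree, but it explains the data well enough to pay for the extra bits.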

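Bayes Optimal Classifier
Bayes optimal classification:
$\arg\max_{v_j \in V} \sum_{h_i \in H} P(v_j|h_i) P(h_i|D)$
Example:
- $P(h_1|D) = .4$, $P(-|h_1) = 0$, $P(+|h_1) = 1$
- $P(h_2|D) = .3$, $P(-|h_2) = 1$, $P(+|h_2) = 0$
- $P(h_3|D) = .3$, $P(-|h_3) = 1$, $P(+|h_3) = 0$
therefore
- $\sum_{h_i \in H} P(+|h_i) P(h_i|D) = .4$
- $\sum_{h_i \in H} P(-|h_i) P(h_i|D) = .6$
and
$\arg\max_{v_j \in V} \sum_{h_i \in H} P(v_j|h_i) P(h_i|D) = -$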
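The example on this slide can be reproduced with a few lines of code. This is a minimal sketch; the dictionary layout and the function name are illustrative choices.

```python
# Posterior over hypotheses and each hypothesis's prediction, as in the example.
posterior = {"h1": 0.4, "h2": 0.3, "h3": 0.3}
prediction = {
    "h1": {"+": 1.0, "-": 0.0},
    "h2": {"+": 0.0, "-": 1.0},
    "h3": {"+": 0.0, "-": 1.0},
}


def bayes_optimal(values, posterior, prediction):
    """Weight every hypothesis's vote P(v|h) by its posterior P(h|D)."""
    scores = {
        v: sum(prediction[h][v] * posterior[h] for h in posterior)
        for v in values
    }
    return max(scores, key=scores.get), scores


label, scores = bayes_optimal(["+", "-"], posterior, prediction)
h_map = max(posterior, key=posterior.get)

print("weighted votes:", scores)
print("Bayes optimal classification:", label)
print("MAP hypothesis:", h_map, "predicts", max(prediction[h_map], key=prediction[h_map].get))
```

The single most probable hypothesis, h1, predicts "+", yet the posterior-weighted vote over all hypotheses gives "-". The Bayes optimal classification therefore need not agree with the MAP hypothesis.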

Gibbs Classifier
The Bayes optimal classifier provides the best result, but can be expensive if there are many hypotheses.
Gibbs algorithm:
1. Choose one hypothesis at random, according to $P(h|D)$
2. Use this to classify the new instance
Surprising fact: assume target concepts are drawn at random from $H$ according to the priors on $H$. Then:
$E[error_{Gibbs}] \le 2 E[error_{BayesOptimal}]$
Suppose a correct, uniform prior distribution over $H$; then
- Pick any hypothesis from VS, with uniform probability
- Its expected error is no worse than twice Bayes optimal

Naïve Bayes Classifier
Along with decision trees, neural networks and nearest neighbor, one of the most practical learning methods.
When to use:
- Moderate or large training set available
- Attributes that describe instances are conditionally independent given the classification
Successful applications:
- Diagnosis
- Classifying text documents

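Naïve Bayes Classifier
Assume target function $f: X \to V$, where each instance $x$ is described by attributes $\langle a_1, a_2, \ldots, a_n \rangle$. The most probable value of $f(x)$ is:
$v_{MAP} = \arg\max_{v_j \in V} P(v_j | a_1, a_2, \ldots, a_n)$
$= \arg\max_{v_j \in V} \frac{P(a_1, a_2, \ldots, a_n | v_j) P(v_j)}{P(a_1, a_2, \ldots, a_n)}$
$= \arg\max_{v_j \in V} P(a_1, a_2, \ldots, a_n | v_j) P(v_j)$
Naïve Bayes assumption:
$P(a_1, a_2, \ldots, a_n | v_j) = \prod_i P(a_i | v_j)$
which gives the Naïve Bayes classifier:
$v_{NB} = \arg\max_{v_j \in V} P(v_j) \prod_i P(a_i | v_j)$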

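Naïve Bayes Algorithm
Naive_Bayes_Learn(examples)
  For each target value $v_j$
    $\hat{P}(v_j) \leftarrow$ estimate $P(v_j)$
    For each attribute value $a_i$ of each attribute $a$
      $\hat{P}(a_i|v_j) \leftarrow$ estimate $P(a_i|v_j)$
Classify_New_Instance(x)
  $v_{NB} = \arg\max_{v_j \in V} \hat{P}(v_j) \prod_{a_i \in x} \hat{P}(a_i|v_j)$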
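The learning and classification procedures above can be sketched for discrete attributes using plain relative frequencies as the estimates. The data set, attribute values and function names are hypothetical; this is one straightforward reading of the pseudocode, not the course's reference implementation.

```python
from collections import Counter, defaultdict


def naive_bayes_learn(examples):
    """examples: list of (attribute_tuple, target_value) pairs."""
    class_counts = Counter(v for _, v in examples)
    total = len(examples)
    # Estimate P(v_j) by relative frequency.
    p_v = {v: count / total for v, count in class_counts.items()}

    # Count attribute values separately per (attribute position, target value).
    cond_counts = defaultdict(Counter)
    for attrs, v in examples:
        for i, a in enumerate(attrs):
            cond_counts[(i, v)][a] += 1

    def p_a_given_v(i, a, v):
        # Estimate P(a_i | v_j) by relative frequency (0 if never seen).
        return cond_counts[(i, v)][a] / class_counts[v]

    return p_v, p_a_given_v


def classify_new_instance(x, p_v, p_a_given_v):
    scores = {}
    for v, prior in p_v.items():
        score = prior
        for i, a in enumerate(x):
            score *= p_a_given_v(i, a, v)
        scores[v] = score
    return max(scores, key=scores.get), scores


# Hypothetical training set: (outlook, wind) -> play.
examples = [
    (("sunny", "weak"), "yes"),
    (("sunny", "strong"), "no"),
    (("rainy", "weak"), "yes"),
    (("rainy", "strong"), "no"),
    (("sunny", "weak"), "yes"),
    (("rainy", "strong"), "yes"),
]

p_v, p_a_given_v = naive_bayes_learn(examples)
label, scores = classify_new_instance(("sunny", "strong"), p_v, p_a_given_v)
print("scores:", scores)
print("v_NB =", label)
```

With raw frequencies, an attribute value never seen with some target value drives that target's whole product to zero, so in practice the frequency estimate is usually replaced by a smoothed one.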

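Naïve Bayes Example
Consider CoolCar again and the new instance (Color=Blue, Type=SUV, Doors=2, Tires=WhiteW).
Want to compute
$v_{NB} = \arg\max_{v_j \in V} P(v_j) \prod_i P(a_i | v_j)$
P(+) * P(Blue|+) * P(SUV|+) * P(2|+) * P(WhiteW|+) = 5/14 * 1/5 * 2/5 * 4/5 * 3/5 = 0.0137
P(-) * P(Blue|-) * P(SUV|-) * P(2|-) * P(WhiteW|-) = 9/14 * 3/9 * 4/9 * 3/9 * 3/9 = 0.0106
The positive class has the larger product, so $v_{NB} = +$.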
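The two products in this example can be verified with exact rational arithmetic. The sketch below assumes the Doors factor for the positive class is 2/5, which is the value consistent with the stated product 0.0137.

```python
from fractions import Fraction as F
from math import prod

# Factors P(v) * P(Blue|v) * P(SUV|v) * P(2|v) * P(WhiteW|v) for each class.
factors = {
    "+": [F(5, 14), F(1, 5), F(2, 5), F(4, 5), F(3, 5)],
    "-": [F(9, 14), F(3, 9), F(4, 9), F(3, 9), F(3, 9)],
}

scores = {v: prod(fs) for v, fs in factors.items()}
v_nb = max(scores, key=scores.get)

# Normalizing the two scores gives the (naive) posterior estimates.
total = sum(scores.values())
normalized = {v: s / total for v, s in scores.items()}

for v in scores:
    print(f"{v}: score = {float(scores[v]):.4f}, normalized = {float(normalized[v]):.3f}")
print("v_NB =", v_nb)
```

The scores themselves are not probabilities; dividing each by their sum turns them into estimates of the class posteriors for this instance.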

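Naïve Bayes Subtleties
1. The conditional independence assumption is often violated:
   $P(a_1, a_2, \ldots, a_n | v_j) = \prod_i P(a_i | v_j)$
   ... but it works surprisingly well anyway. Note that you do not need the estimated posteriors $\hat{P}(v_j|x)$ to be correct; you need only that
   $\arg\max_{v_j \in V} \hat{P}(v_j) \prod_i \hat{P}(a_i|v_j) = \arg\max_{v_j \in V} P(v_j) P(a_1, \ldots, a_n | v_j)$
   - see Domingos & Pazzani (1996) for analysis
   - Naïve Bayes posteriors are often unrealistically close to 1 or 0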

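Naïve Bayes Subtleties
2. What if none of the training instances with target value $v_j$ have attribute value $a_i$? Then
   $\hat{P}(a_i|v_j) = 0$, and ...
   $\hat{P}(v_j) \prod_i \hat{P}(a_i|v_j) = 0$
   The typical solution is a Bayesian estimate for $\hat{P}(a_i|v_j)$:
   $\hat{P}(a_i|v_j) \leftarrow \frac{n_c + m p}{n + m}$
   - $n$ is the number of training examples for which $v = v_j$
   - $n_c$ is the number of examples for which $v = v_j$ and $a = a_i$
   - $p$ is the prior estimate for $\hat{P}(a_i|v_j)$
   - $m$ is the weight given to the prior (i.e., the number of "virtual" examples)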
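The Bayesian estimate above is a one-line function. In the sketch below the counts and the choice p = 0.5 (a uniform prior over two attribute values) are hypothetical.

```python
def m_estimate(n_c, n, p, m):
    """Bayesian (m-)estimate of P(a_i | v_j): (n_c + m*p) / (n + m)."""
    if n + m == 0:
        raise ValueError("need at least one real or virtual example")
    return (n_c + m * p) / (n + m)


# Hypothetical case: the attribute value never occurs among 5 examples of v_j.
raw = m_estimate(n_c=0, n=5, p=0.5, m=0)       # m = 0 is the plain frequency n_c / n
smoothed = m_estimate(n_c=0, n=5, p=0.5, m=2)  # two virtual examples split by the prior

print("raw frequency estimate:", raw)
print("m-estimate with m = 2: ", round(smoothed, 4))
```

With m = 0 the estimate collapses to the raw frequency, and as m grows it moves toward the prior p. Any m > 0 with p > 0 keeps the estimate away from zero, so a single unseen attribute value no longer zeroes out the whole product.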

Bayesian Belief Networks
Interesting because:
- The Naïve Bayes assumption of conditional independence is too restrictive
- But learning is intractable without some such assumptions
- Bayesian belief networks describe conditional independence among subsets of variables
- This allows combining prior knowledge about (in)dependence among variables with observed training data
- (also called Bayes Nets)

Conditional Independence
Definition: $X$ is conditionally independent of $Y$ given $Z$ if the probability distribution governing $X$ is independent of the value of $Y$ given the value of $Z$; that is, if
$(\forall x_i, y_j, z_k) \; P(X = x_i | Y = y_j, Z = z_k) = P(X = x_i | Z = z_k)$
More compactly we write
$P(X | Y, Z) = P(X | Z)$
Example: Thunder is conditionally independent of Rain given Lightning:
$P(Thunder | Rain, Lightning) = P(Thunder | Lightning)$
Naïve Bayes uses conditional independence to justify
$P(X, Y | Z) = P(X | Y, Z) P(Y | Z) = P(X | Z) P(Y | Z)$

Bayesian Belief Network
[Figure: network over the nodes Storm, BusTourGroup, Lightning, Campfire, Thunder and ForestFire, shown with the conditional probability table for Campfire (C) given Storm (S) and BusTourGroup (B).]
        S,B    S,¬B   ¬S,B   ¬S,¬B
 C      0.4    0.1    0.8    0.2
¬C      0.6    0.9    0.2    0.8
- The network represents a set of conditional independence assumptions
- Each node is asserted to be conditionally independent of its nondescendants, given its immediate predecessors
- Directed acyclic graph

Bayesian Belief Network
Represents the joint probability distribution over all variables
- e.g., P(Storm, BusTourGroup, ..., ForestFire)
- in general,
  $P(y_1, \ldots, y_n) = \prod_{i=1}^{n} P(y_i | Parents(Y_i))$
  where $Parents(Y_i)$ denotes the immediate predecessors of $Y_i$ in the graph
- so, the joint distribution is fully defined by the graph, plus the $P(y_i | Parents(Y_i))$
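The factorization can be illustrated on the three-node fragment Storm, BusTourGroup, Campfire. The sketch uses the Campfire table from the network slide; the priors for Storm and BusTourGroup are hypothetical, since the slides do not give them, and the sketch assumes those two nodes have no parents.

```python
from itertools import product

# Hypothetical priors for the two parentless nodes (not given in the slides).
p_storm = 0.2
p_bus_tour_group = 0.5

# P(Campfire = True | Storm, BusTourGroup), from the Campfire table.
p_campfire_true = {
    (True, True): 0.4,
    (True, False): 0.1,
    (False, True): 0.8,
    (False, False): 0.2,
}


def bernoulli(p_true, value):
    """Probability of a Boolean value given the probability of True."""
    return p_true if value else 1.0 - p_true


def joint(storm, bus, campfire):
    # Product of P(y_i | Parents(Y_i)) over the three nodes.
    return (
        bernoulli(p_storm, storm)
        * bernoulli(p_bus_tour_group, bus)
        * bernoulli(p_campfire_true[(storm, bus)], campfire)
    )


booleans = [True, False]
total = sum(joint(s, b, c) for s, b, c in product(booleans, repeat=3))
p_campfire = sum(joint(s, b, True) for s, b in product(booleans, repeat=2))

print("sum over all assignments:", round(total, 12))
print("P(Campfire = True):      ", round(p_campfire, 4))
```

Summing the joint over the unobserved variables, as done for `p_campfire`, is inference by enumeration. It is exact but its cost grows exponentially with the number of variables.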

Inference in Bayesian Networks
How can one infer the (probabilities of) values of one or more network variables, given observed values of others?
- The Bayes net contains all the information needed
- If only one variable has an unknown value, it is easy to infer it
- In the general case, the problem is NP-hard
In practice, one can succeed in many cases:
- Exact inference methods work well for some network structures
- Monte Carlo methods "simulate" the network randomly to calculate approximate solutions

Learning of Bayesian Networks
Several variants of this learning task:
- Network structure might be known or unknown
- Training examples might provide values of all network variables, or just some
If the structure is known and we observe all variables:
- Then it is as easy as training a Naïve Bayes classifier

Learning Bayes Net
Suppose the structure is known and the variables are partially observable
- e.g., observe ForestFire, Storm, BusTourGroup, Thunder, but not Lightning, Campfire, ...
- Similar to training a neural network with hidden units
- In fact, can learn the network's conditional probability tables using gradient ascent!
- Converge to a network $h$ that (locally) maximizes $P(D|h)$

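Gradient Ascent for Bayes Nets
Let $w_{ijk}$ denote one entry in the conditional probability table for variable $Y_i$ in the network:
$w_{ijk} = P(Y_i = y_{ij} | Parents(Y_i) = \text{the list } u_{ik} \text{ of values})$
e.g., if $Y_i$ = Campfire, then $u_{ik}$ might be (Storm=T, BusTourGroup=F)
Perform gradient ascent by repeatedly:
1. Update all $w_{ijk}$ using training data $D$:
   $w_{ijk} \leftarrow w_{ijk} + \eta \sum_{d \in D} \frac{P_h(y_{ij}, u_{ik} | d)}{w_{ijk}}$
2. Then renormalize the $w_{ijk}$ to assure
   $\sum_j w_{ijk} = 1$ and $0 \le w_{ijk} \le 1$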

Summary of Bayes Belief Networks
- Combine prior knowledge with observed data
- The impact of prior knowledge (when correct!) is to lower the sample complexity
- Active research area:
  - Extend from Boolean to real-valued variables
  - Parameterized distributions instead of tables
  - Extend to first-order instead of propositional systems
  - More effective inference methods