Bayesian Learning
- Bayes Theorem
- MAP, ML hypotheses
- MAP learners
- Minimum description length principle
- Bayes optimal classifier
- Naïve Bayes learner
- Bayesian belief networks
CS 5751 Machine Learning, Chapter 6: Bayesian Learning

Two Roles for Bayesian Methods
Provides practical learning algorithms:
- Naïve Bayes learning
- Bayesian belief network learning
- Combine prior knowledge (prior probabilities) with observed data
- Requires prior probabilities
Provides useful conceptual framework:
- Provides "gold standard" for evaluating other learning algorithms
- Additional insight into Occam's razor
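
Bayes Theorem
$$P(h \mid D) = \frac{P(D \mid h)\,P(h)}{P(D)}$$
- $P(h)$ = prior probability of hypothesis $h$
- $P(D)$ = prior probability of training data $D$
- $P(h \mid D)$ = probability of $h$ given $D$
- $P(D \mid h)$ = probability of $D$ given $h$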
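
Choosing Hypotheses
Generally want the most probable hypothesis given the training data.
Maximum a posteriori hypothesis $h_{MAP}$:
$$h_{MAP} = \arg\max_{h \in H} P(h \mid D) = \arg\max_{h \in H} \frac{P(D \mid h)\,P(h)}{P(D)} = \arg\max_{h \in H} P(D \mid h)\,P(h)$$
If we assume $P(h_i) = P(h_j)$, then can further simplify, and choose the Maximum likelihood (ML) hypothesis:
$$h_{ML} = \arg\max_{h_i \in H} P(D \mid h_i)$$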

Bayes Theorem
Does patient have cancer or not?
A patient takes a lab test and the result comes back positive. The test returns a correct positive result in only 98% of the cases in which the disease is actually present, and a correct negative result in only 97% of the cases in which the disease is not present. Furthermore, 0.8% of the entire population have this cancer.
- $P(cancer) = 0.008$, $P(\neg cancer) = 0.992$
- $P(+ \mid cancer) = 0.98$, $P(- \mid cancer) = 0.02$
- $P(+ \mid \neg cancer) = 0.03$, $P(- \mid \neg cancer) = 0.97$
- To determine: $P(cancer \mid +)$ and $P(\neg cancer \mid +)$
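
The two posteriors asked for on this slide follow from Bayes theorem plus the theorem of total probability. The snippet below is a minimal sketch of that arithmetic; the function name `posterior` and its parameter names are illustrative choices, not part of the slides.

```python
def posterior(prior, p_pos_given_h, p_pos_given_not_h):
    """Return P(h | +) by Bayes theorem, with P(+) from total probability."""
    joint_h = p_pos_given_h * prior                  # P(+ | h) P(h)
    joint_not_h = p_pos_given_not_h * (1.0 - prior)  # P(+ | not h) P(not h)
    return joint_h / (joint_h + joint_not_h)

# Numbers from the slide: 0.8% prevalence, 98% true-positive rate,
# 3% false-positive rate (100% - 97%).
p_cancer_given_pos = posterior(prior=0.008, p_pos_given_h=0.98, p_pos_given_not_h=0.03)
print(round(p_cancer_given_pos, 3))      # 0.209
print(round(1 - p_cancer_given_pos, 3))  # 0.791
```

Even after a positive test, the MAP hypothesis is still "no cancer", because the small prior outweighs the test's accuracy.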
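
Some Formulas for Probabilities
- Product rule: probability $P(A \wedge B)$ of conjunction of two events A and B:
  $$P(A \wedge B) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A)$$
- Sum rule: probability of disjunction of two events A and B:
  $$P(A \vee B) = P(A) + P(B) - P(A \wedge B)$$
- Theorem of total probability: if events $A_1, \ldots, A_n$ are mutually exclusive with $\sum_{i=1}^{n} P(A_i) = 1$, then
  $$P(B) = \sum_{i=1}^{n} P(B \mid A_i)\,P(A_i)$$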

Brute Force MAP Hypothesis Learner
1. For each hypothesis $h$ in $H$, calculate the posterior probability
   $$P(h \mid D) = \frac{P(D \mid h)\,P(h)}{P(D)}$$
2. Output the hypothesis $h_{MAP}$ with the highest posterior probability
   $$h_{MAP} = \arg\max_{h \in H} P(h \mid D)$$
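
The two steps of the brute-force learner can be sketched directly for a small, finite hypothesis space. This is only an illustration: the coin-bias hypotheses, the uniform prior, and the names `brute_force_map` and `coin_likelihood` are assumptions made for the example and do not come from the slides.

```python
def brute_force_map(hypotheses, prior, likelihood, data):
    """Enumerate H, compute P(h|D) for every h, and return the argmax."""
    unnormalized = {h: likelihood(data, h) * prior(h) for h in hypotheses}
    evidence = sum(unnormalized.values())  # P(D), by total probability
    if evidence == 0:
        raise ValueError("no hypothesis assigns nonzero probability to the data")
    posteriors = {h: u / evidence for h, u in unnormalized.items()}
    h_map = max(posteriors, key=posteriors.get)
    return h_map, posteriors

def coin_likelihood(data, p_heads):
    """P(D|h) for independent flips, where h is the probability of heads."""
    result = 1.0
    for flip in data:
        result *= p_heads if flip == "H" else 1.0 - p_heads
    return result

hypotheses = [0.2, 0.5, 0.8]        # hypothetical candidate biases
uniform = lambda h: 1.0 / len(hypotheses)
h_map, posteriors = brute_force_map(hypotheses, uniform, coin_likelihood, "HHTH")
print(h_map)  # 0.8
```

With a uniform prior the MAP hypothesis coincides with the ML hypothesis, as the slide on choosing hypotheses notes.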

Relation to Concept Learning
Consider our usual concept learning task:
- instance space $X$, hypothesis space $H$, training examples $D$
- consider the FindS learning algorithm (outputs most specific hypothesis from the version space $VS_{H,D}$)
What would Bayes rule produce as the MAP hypothesis?
Does FindS output a MAP hypothesis?
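
Relation to Concept Learning
- Assume fixed set of instances $\langle x_1, \ldots, x_m \rangle$
- Assume $D$ is the set of classifications $D = \langle c(x_1), \ldots, c(x_m) \rangle$
- Choose $P(D \mid h)$:
  $$P(D \mid h) = \begin{cases} 1 & \text{if } h \text{ consistent with } D \\ 0 & \text{otherwise} \end{cases}$$
- Choose $P(h)$ to be uniform distribution: $P(h) = \frac{1}{|H|}$ for all $h$ in $H$
- Then
  $$P(h \mid D) = \begin{cases} \dfrac{1}{|VS_{H,D}|} & \text{if } h \text{ is consistent with } D \\ 0 & \text{otherwise} \end{cases}$$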
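
Learning a Real Valued Function
[Figure: plot of y against x showing the target function f, the noise e, and the hypothesis h_ML]
Consider any real-valued target function $f$.
Training examples $\langle x_i, d_i \rangle$, where $d_i$ is noisy training value:
- $d_i = f(x_i) + e_i$
- $e_i$ is random variable (noise) drawn independently for each $x_i$ according to some Gaussian distribution with mean 0
Then the maximum likelihood hypothesis $h_{ML}$ is the one that minimizes the sum of squared errors:
$$h_{ML} = \arg\min_{h \in H} \sum_{i=1}^{m} \left(d_i - h(x_i)\right)^2$$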
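
Learning a Real Valued Function
$$\begin{aligned}
h_{ML} &= \arg\max_{h \in H} p(D \mid h) \\
&= \arg\max_{h \in H} \prod_{i=1}^{m} p(d_i \mid h) \\
&= \arg\max_{h \in H} \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2}\left(\frac{d_i - h(x_i)}{\sigma}\right)^2}
\end{aligned}$$
Maximize natural log of this instead...
$$\begin{aligned}
h_{ML} &= \arg\max_{h \in H} \sum_{i=1}^{m} \ln\frac{1}{\sqrt{2\pi\sigma^2}} - \frac{1}{2}\left(\frac{d_i - h(x_i)}{\sigma}\right)^2 \\
&= \arg\max_{h \in H} \sum_{i=1}^{m} -\frac{1}{2}\left(\frac{d_i - h(x_i)}{\sigma}\right)^2 \\
&= \arg\max_{h \in H} \sum_{i=1}^{m} -\left(d_i - h(x_i)\right)^2 \\
&= \arg\min_{h \in H} \sum_{i=1}^{m} \left(d_i - h(x_i)\right)^2
\end{aligned}$$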
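
The equivalence derived above can be checked numerically on a toy problem. The sketch below assumes a hypothetical hypothesis space of lines through the origin, h(x) = w·x, with w restricted to a small grid, and a made-up data set; it compares the least-squares choice with the choice that maximizes the Gaussian log-likelihood.

```python
import math

def sum_squared_errors(w, data):
    """Sum over examples of (d_i - h(x_i))^2 for h(x) = w * x."""
    return sum((d - w * x) ** 2 for x, d in data)

def gaussian_log_likelihood(w, data, sigma=1.0):
    """ln p(D|h) under zero-mean Gaussian noise with standard deviation sigma."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma ** 2) - 0.5 * ((d - w * x) / sigma) ** 2
        for x, d in data
    )

# Hypothetical noisy samples of a target close to f(x) = 2x.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]
candidates = [i / 10 for i in range(0, 41)]  # w = 0.0, 0.1, ..., 4.0

w_least_squares = min(candidates, key=lambda w: sum_squared_errors(w, data))
w_max_likelihood = max(candidates, key=lambda w: gaussian_log_likelihood(w, data))
print(w_least_squares, w_max_likelihood)  # 2.0 2.0
```

The log-likelihood is a constant minus a positive multiple of the squared error, so the two criteria always rank hypotheses in opposite order, whatever value of sigma is assumed.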

Minimum Description Length Principle
Occam's razor: prefer the shortest hypothesis.
MDL: prefer the hypothesis $h$ that minimizes
$$h_{MDL} = \arg\min_{h \in H} L_{C_1}(h) + L_{C_2}(D \mid h)$$
where $L_C(x)$ is the description length of $x$ under encoding $C$.
Example: $H$ = decision trees, $D$ = training data labels
- $L_{C_1}(h)$ is # bits to describe tree $h$
- $L_{C_2}(D \mid h)$ is # bits to describe $D$ given $h$
  - Note $L_{C_2}(D \mid h) = 0$ if examples classified perfectly by $h$. Need only describe exceptions.
- Hence $h_{MDL}$ trades off tree size for training errors
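
Minimum Description Length Principle
$$\begin{aligned}
h_{MAP} &= \arg\max_{h \in H} P(D \mid h)\,P(h) \\
&= \arg\max_{h \in H} \log_2 P(D \mid h) + \log_2 P(h) \\
&= \arg\min_{h \in H} -\log_2 P(D \mid h) - \log_2 P(h) \qquad (1)
\end{aligned}$$
Interesting fact from information theory: the optimal (shortest expected length) code for an event with probability $p$ is $-\log_2 p$ bits.
So interpret (1):
- $-\log_2 P(h)$ is the length of $h$ under optimal code
- $-\log_2 P(D \mid h)$ is length of $D$ given $h$ in optimal code
- prefer the hypothesis that minimizes length($h$) + length(misclassifications)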
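
Interpretation (1) says that maximizing the posterior and minimizing the total optimal code length pick the same hypothesis. A minimal sketch of that correspondence follows; the two hypotheses and their probabilities are made-up numbers for illustration only.

```python
import math

def code_length_bits(p):
    """Length in bits of the optimal code for an event of probability p (p > 0)."""
    return -math.log2(p)

# Hypothetical hypotheses: name -> (P(h), P(D|h)).
hypotheses = {
    "small_tree": (0.5, 0.01),   # short to describe, but fits the data poorly
    "large_tree": (0.01, 0.25),  # long to describe, fits the data better
}

h_map = max(hypotheses, key=lambda h: hypotheses[h][0] * hypotheses[h][1])
h_mdl = min(
    hypotheses,
    key=lambda h: code_length_bits(hypotheses[h][0]) + code_length_bits(hypotheses[h][1]),
)
print(h_map, h_mdl)            # small_tree small_tree
print(code_length_bits(0.25))  # 2.0
```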
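
Bayes Optimal Classifier
Bayes optimal classification:
$$\arg\max_{v_j \in V} \sum_{h_i \in H} P(v_j \mid h_i)\,P(h_i \mid D)$$
Example:
- $P(h_1 \mid D) = .4$, $P(- \mid h_1) = 0$, $P(+ \mid h_1) = 1$
- $P(h_2 \mid D) = .3$, $P(- \mid h_2) = 1$, $P(+ \mid h_2) = 0$
- $P(h_3 \mid D) = .3$, $P(- \mid h_3) = 1$, $P(+ \mid h_3) = 0$
therefore
$$\sum_{h_i \in H} P(+ \mid h_i)\,P(h_i \mid D) = .4 \qquad \sum_{h_i \in H} P(- \mid h_i)\,P(h_i \mid D) = .6$$
and
$$\arg\max_{v_j \in V} \sum_{h_i \in H} P(v_j \mid h_i)\,P(h_i \mid D) = -$$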
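
The weighted vote in this example can be reproduced in a few lines. The sketch below uses the three posteriors and predictions from the slide; the function name `bayes_optimal` and the dictionary layout are illustrative assumptions.

```python
def bayes_optimal(posteriors, predictions, values):
    """Return the value v maximizing sum_h P(v|h) P(h|D), plus all scores."""
    scores = {
        v: sum(predictions[h][v] * posteriors[h] for h in posteriors)
        for v in values
    }
    return max(scores, key=scores.get), scores

posteriors = {"h1": 0.4, "h2": 0.3, "h3": 0.3}  # P(h_i | D)
predictions = {                                  # P(v | h_i)
    "h1": {"+": 1.0, "-": 0.0},
    "h2": {"+": 0.0, "-": 1.0},
    "h3": {"+": 0.0, "-": 1.0},
}

label, scores = bayes_optimal(posteriors, predictions, ["+", "-"])
print(label)  # -
```

Note that the single MAP hypothesis, h1, predicts "+", while the Bayes optimal classification is "-".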

Gibbs Classifier
Bayes optimal classifier provides best result, but can be expensive if many hypotheses.
Gibbs algorithm:
1. Choose one hypothesis at random, according to $P(h \mid D)$
2. Use this to classify new instance
Surprising fact: assume target concepts are drawn at random from $H$ according to priors on $H$. Then:
$$E[error_{Gibbs}] \le 2\,E[error_{BayesOptimal}]$$
Suppose correct, uniform prior distribution over $H$, then
- Pick any hypothesis from VS, with uniform probability
- Its expected error no worse than twice Bayes optimal
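
The Gibbs algorithm replaces the weighted vote with a single random draw from the posterior. A minimal sketch, reusing the three-hypothesis example from the Bayes optimal classifier slide; the function name `gibbs_classify` and the constant-valued hypotheses are assumptions made for illustration.

```python
import random

def gibbs_classify(posteriors, predict, x, rng=random):
    """Draw one hypothesis according to P(h|D), then use it to classify x."""
    names = list(posteriors)
    weights = [posteriors[h] for h in names]
    chosen = rng.choices(names, weights=weights, k=1)[0]
    return predict[chosen](x)

posteriors = {"h1": 0.4, "h2": 0.3, "h3": 0.3}
# Each hypothetical hypothesis ignores x and always returns its own label.
predict = {"h1": lambda x: "+", "h2": lambda x: "-", "h3": lambda x: "-"}

rng = random.Random(0)  # fixed seed only to make the sketch repeatable
labels = [gibbs_classify(posteriors, predict, None, rng) for _ in range(10000)]
fraction_positive = labels.count("+") / len(labels)
print(fraction_positive)  # should approach P(h1 | D) = 0.4 as draws increase
```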

Naïve Bayes Classifier
Along with decision trees, neural networks, nearest neighbor, one of the most practical learning methods.
When to use:
- Moderate or large training set available
- Attributes that describe instances are conditionally independent given classification
Successful applications:
- Diagnosis
- Classifying text documents
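
Naïve Bayes Classifier
Assume target function $f: X \rightarrow V$, where each instance $x$ described by attributes $\langle a_1, a_2, \ldots, a_n \rangle$.
Most probable value of $f(x)$ is:
$$\begin{aligned}
v_{MAP} &= \arg\max_{v_j \in V} P(v_j \mid a_1, a_2, \ldots, a_n) \\
&= \arg\max_{v_j \in V} \frac{P(a_1, a_2, \ldots, a_n \mid v_j)\,P(v_j)}{P(a_1, a_2, \ldots, a_n)} \\
&= \arg\max_{v_j \in V} P(a_1, a_2, \ldots, a_n \mid v_j)\,P(v_j)
\end{aligned}$$
Naïve Bayes assumption:
$$P(a_1, a_2, \ldots, a_n \mid v_j) = \prod_i P(a_i \mid v_j)$$
which gives Naïve Bayes classifier:
$$v_{NB} = \arg\max_{v_j \in V} P(v_j) \prod_i P(a_i \mid v_j)$$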
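
Naïve Bayes Algorithm
Naive_Bayes_Learn(examples)
  For each target value $v_j$
    $\hat{P}(v_j) \leftarrow$ estimate $P(v_j)$
    For each attribute value $a_i$ of each attribute $a$
      $\hat{P}(a_i \mid v_j) \leftarrow$ estimate $P(a_i \mid v_j)$
Classify_New_Instance(x)
  $$v_{NB} = \arg\max_{v_j \in V} \hat{P}(v_j) \prod_{a_i \in x} \hat{P}(a_i \mid v_j)$$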
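
The two procedures translate almost line for line into code when the estimates are plain relative frequencies. The sketch below is one possible rendering under that assumption; the tiny data set and the attribute names are hypothetical and are not the CoolCar data used in the lecture.

```python
from collections import Counter, defaultdict

def naive_bayes_learn(examples):
    """examples: list of (attribute_dict, target_value) pairs."""
    class_counts = Counter(v for _, v in examples)
    p_v = {v: c / len(examples) for v, c in class_counts.items()}  # estimate of P(v_j)
    value_counts = defaultdict(Counter)  # (attribute, v_j) -> counts of attribute values
    for attrs, v in examples:
        for a, val in attrs.items():
            value_counts[(a, v)][val] += 1

    def p_a_given_v(a, val, v):
        """Relative-frequency estimate of P(a_i | v_j)."""
        return value_counts[(a, v)][val] / class_counts[v]

    return p_v, p_a_given_v

def classify_new_instance(x, p_v, p_a_given_v):
    """Return argmax_v P(v) * prod_i P(a_i | v), plus the per-class scores."""
    scores = {}
    for v, prior in p_v.items():
        score = prior
        for a, val in x.items():
            score *= p_a_given_v(a, val, v)
        scores[v] = score
    return max(scores, key=scores.get), scores

examples = [  # hypothetical training data
    ({"Color": "Red", "Type": "SUV"}, "+"),
    ({"Color": "Blue", "Type": "SUV"}, "+"),
    ({"Color": "Blue", "Type": "Sedan"}, "-"),
    ({"Color": "Red", "Type": "Sedan"}, "-"),
    ({"Color": "Blue", "Type": "SUV"}, "-"),
]
p_v, p_a_given_v = naive_bayes_learn(examples)
label, scores = classify_new_instance({"Color": "Red", "Type": "SUV"}, p_v, p_a_given_v)
print(label)  # +
```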

Naïve Bayes Example
Consider CoolCar again and new instance (Color=Blue, Type=SUV, Doors=2, Tires=WhiteW).
Want to compute
$$v_{NB} = \arg\max_{v_j \in V} P(v_j) \prod_i P(a_i \mid v_j)$$
P(+) * P(Blue|+) * P(SUV|+) * P(2|+) * P(WhiteW|+) = 5/14 * 1/5 * 2/5 * 4/5 * 3/5 = 0.0137
P(-) * P(Blue|-) * P(SUV|-) * P(2|-) * P(WhiteW|-) = 9/14 * 3/9 * 4/9 * 3/9 * 3/9 = 0.0106
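
The two products on this slide can be checked with exact rational arithmetic. This is just a verification sketch; the variable names are illustrative.

```python
from fractions import Fraction as F
from math import prod

# P(+) * P(Blue|+) * P(SUV|+) * P(2|+) * P(WhiteW|+)
pos = prod([F(5, 14), F(1, 5), F(2, 5), F(4, 5), F(3, 5)])
# P(-) * P(Blue|-) * P(SUV|-) * P(2|-) * P(WhiteW|-)
neg = prod([F(9, 14), F(3, 9), F(4, 9), F(3, 9), F(3, 9)])

print(round(float(pos), 4))  # 0.0137
print(round(float(neg), 4))  # 0.0106
print("+" if pos > neg else "-")  # +
print(round(float(pos / (pos + neg)), 3))  # 0.564, the normalized score for "+"
```

Since the first product is larger, the naïve Bayes classification for the new instance is "+".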
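
Naïve Bayes Subtleties
1. Conditional independence assumption is often violated
   $$P(a_1, a_2, \ldots, a_n \mid v_j) = \prod_i P(a_i \mid v_j)$$
   - ...but it works surprisingly well anyway. Note that you do not need estimated posteriors $\hat{P}(v_j \mid x)$ to be correct; need only that
     $$\arg\max_{v_j \in V} \hat{P}(v_j) \prod_i \hat{P}(a_i \mid v_j) = \arg\max_{v_j \in V} P(v_j)\,P(a_1, \ldots, a_n \mid v_j)$$
   - see Domingos & Pazzani (1996) for analysis
   - Naïve Bayes posteriors often unrealistically close to 1 or 0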
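
Naïve Bayes Subtleties
2. What if none of the training instances with target value $v_j$ have attribute value $a_i$? Then
   $$\hat{P}(a_i \mid v_j) = 0, \text{ and } \hat{P}(v_j) \prod_i \hat{P}(a_i \mid v_j) = 0$$
   Typical solution is Bayesian estimate for $\hat{P}(a_i \mid v_j)$:
   $$\hat{P}(a_i \mid v_j) \leftarrow \frac{n_c + m p}{n + m}$$
   - $n$ is number of training examples for which $v = v_j$
   - $n_c$ is number of examples for which $v = v_j$ and $a = a_i$
   - $p$ is prior estimate for $\hat{P}(a_i \mid v_j)$
   - $m$ is weight given to prior (i.e., number of "virtual" examples)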
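
The Bayesian estimate above is a one-line formula. The sketch below shows how it keeps an unseen attribute value from zeroing out the whole product; the counts and the prior p = 0.5 are made-up values for illustration.

```python
def m_estimate(n_c, n, p, m):
    """Smoothed estimate (n_c + m*p) / (n + m) of P(a_i | v_j)."""
    return (n_c + m * p) / (n + m)

# Attribute value never seen with this target value: n_c = 0 out of n = 5.
print(m_estimate(n_c=0, n=5, p=0.5, m=0))  # 0.0, the raw relative frequency
smoothed = m_estimate(n_c=0, n=5, p=0.5, m=2)
print(round(smoothed, 4))  # 0.1429, i.e. 1/7 with two "virtual" examples
```

With m = 0 the estimate reduces to the raw frequency n_c / n; as m grows, the estimate moves toward the prior p.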

Bayesian Belief Networks
Interesting because:
- Naïve Bayes assumption of conditional independence is too restrictive
- But it is intractable without some such assumptions
- Bayesian belief networks describe conditional independence among subsets of variables
- allows combining prior knowledge about (in)dependence among variables with observed training data
- (also called Bayes Nets)
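
Conditional Independence
Definition: $X$ is conditionally independent of $Y$ given $Z$ if the probability distribution governing $X$ is independent of the value of $Y$ given the value of $Z$; that is, if
$$(\forall x_i, y_j, z_k)\quad P(X = x_i \mid Y = y_j, Z = z_k) = P(X = x_i \mid Z = z_k)$$
more compactly we write
$$P(X \mid Y, Z) = P(X \mid Z)$$
Example: Thunder is conditionally independent of Rain given Lightning
$$P(Thunder \mid Rain, Lightning) = P(Thunder \mid Lightning)$$
Naïve Bayes uses conditional independence to justify
$$P(X, Y \mid Z) = P(X \mid Y, Z)\,P(Y \mid Z) = P(X \mid Z)\,P(Y \mid Z)$$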

Bayesian Belief Network
[Figure: network over the nodes Storm, BusTourGroup, Lightning, Campfire, Thunder, ForestFire]
Conditional probability table for Campfire (C), given Storm (S) and BusTourGroup (B):
        S,B    S,¬B   ¬S,B   ¬S,¬B
  C     0.4    0.1    0.8    0.2
  ¬C    0.6    0.9    0.2    0.8
Network represents a set of conditional independence assumptions:
- Each node is asserted to be conditionally independent of its nondescendants, given its immediate predecessors
- Directed acyclic graph

Bayesian Belief Network
Represents joint probability distribution over all variables
- e.g., $P(Storm, BusTourGroup, \ldots, ForestFire)$
- in general,
  $$P(y_1, \ldots, y_n) = \prod_{i=1}^{n} P(y_i \mid Parents(Y_i))$$
  where $Parents(Y_i)$ denotes immediate predecessors of $Y_i$ in graph
- so, joint distribution is fully defined by graph, plus the $P(y_i \mid Parents(Y_i))$
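
The product formula can be exercised on a three-node fragment of the example network. In the sketch below the Campfire table is the one given in the lecture, while the priors for Storm and BusTourGroup are hypothetical placeholders (the slides do not give them), and the dictionary layout is an illustrative choice.

```python
from itertools import product

parents = {"Storm": (), "BusTourGroup": (), "Campfire": ("Storm", "BusTourGroup")}
# cpt[var][parent values] = P(var = True | parents)
cpt = {
    "Storm": {(): 0.3},         # hypothetical prior, not from the slides
    "BusTourGroup": {(): 0.5},  # hypothetical prior, not from the slides
    "Campfire": {
        (True, True): 0.4, (True, False): 0.1,
        (False, True): 0.8, (False, False): 0.2,
    },
}

def joint_probability(assignment):
    """P(y_1, ..., y_n) as the product of P(y_i | Parents(Y_i))."""
    p = 1.0
    for var, value in assignment.items():
        parent_values = tuple(assignment[q] for q in parents[var])
        p_true = cpt[var][parent_values]
        p *= p_true if value else 1.0 - p_true
    return p

example = {"Storm": True, "BusTourGroup": False, "Campfire": True}
print(round(joint_probability(example), 4))  # 0.015 = 0.3 * 0.5 * 0.1

# Sanity check: the joint distribution sums to 1 over all assignments.
total = sum(
    joint_probability(dict(zip(parents, values)))
    for values in product([True, False], repeat=len(parents))
)
print(round(total, 10))  # 1.0
```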

Inference in Bayesian Networks
How can one infer the (probabilities of) values of one or more network variables, given observed values of others?
- Bayes net contains all information needed
- If only one variable with unknown value, easy to infer it
- In general case, problem is NP hard
In practice, can succeed in many cases:
- Exact inference methods work well for some network structures
- Monte Carlo methods "simulate" the network randomly to calculate approximate solutions

Learning of Bayesian Networks
Several variants of this learning task:
- Network structure might be known or unknown
- Training examples might provide values of all network variables, or just some
If structure known and observe all variables:
- Then it is as easy as training a Naïve Bayes classifier

Learning Bayes Net
Suppose structure known, variables partially observable
- e.g., observe ForestFire, Storm, BusTourGroup, Thunder, but not Lightning, Campfire, ...
- Similar to training neural network with hidden units
- In fact, can learn network conditional probability tables using gradient ascent!
- Converge to network $h$ that (locally) maximizes $P(D \mid h)$
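
Gradient Ascent for Bayes Nets
Let $w_{ijk}$ denote one entry in the conditional probability table for variable $Y_i$ in the network:
$$w_{ijk} = P(Y_i = y_{ij} \mid Parents(Y_i) = \text{the list } u_{ik} \text{ of values})$$
e.g., if $Y_i = Campfire$, then $u_{ik}$ might be $\langle Storm = T, BusTourGroup = F \rangle$
Perform gradient ascent by repeatedly:
1. Update all $w_{ijk}$ using training data $D$
   $$w_{ijk} \leftarrow w_{ijk} + \eta \sum_{d \in D} \frac{P_h(y_{ij}, u_{ik} \mid d)}{w_{ijk}}$$
2. Then renormalize the $w_{ijk}$ to assure
   $$\sum_j w_{ijk} = 1, \qquad 0 \le w_{ijk} \le 1$$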
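
The update and renormalization steps can be sketched for the table of a single variable. This is a partial illustration under a strong assumption: the sums over the training data of P_h(y_ij, u_ik | d) are taken as given inputs, since computing them requires network inference that is outside this sketch. The function name, the dictionary layout, and the numbers are all hypothetical.

```python
def gradient_ascent_step(w, posterior_sums, eta=0.01):
    """One update of the conditional probability table of a single variable Y_i.

    w[(j, k)]              : current w_ijk = P(Y_i = y_ij | Parents(Y_i) = u_ik), must be > 0
    posterior_sums[(j, k)] : sum over d in D of P_h(y_ij, u_ik | d), supplied by the caller
    """
    # Step 1: w_ijk <- w_ijk + eta * sum_d P_h(y_ij, u_ik | d) / w_ijk
    updated = {jk: w_jk + eta * posterior_sums[jk] / w_jk for jk, w_jk in w.items()}
    # Step 2: renormalize so that, for each parent configuration k, sum_j w_ijk = 1
    totals = {}
    for (j, k), value in updated.items():
        totals[k] = totals.get(k, 0.0) + value
    return {(j, k): value / totals[k] for (j, k), value in updated.items()}

# One parent configuration "k0" of a Boolean variable, starting from a flat table.
w = {("T", "k0"): 0.5, ("F", "k0"): 0.5}
posterior_sums = {("T", "k0"): 8.0, ("F", "k0"): 2.0}  # hypothetical inference results
new_w = gradient_ascent_step(w, posterior_sums, eta=0.01)
print(round(new_w[("T", "k0")], 4), round(new_w[("F", "k0")], 4))  # 0.55 0.45
```

Because the increments are nonnegative and each column is divided by its own total, the renormalized entries stay within [0, 1] and sum to 1 for every parent configuration.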

Summary of Bayes Belief Networks
- Combine prior knowledge with observed data
- Impact of prior knowledge (when correct!) is to lower the sample complexity
- Active research area
  - Extend from Boolean to real-valued variables
  - Parameterized distributions instead of tables
  - Extend to first-order instead of propositional systems
  - More effective inference methods