Discriminative Feature Selection for Uncertain Graph Classification


Xiangnan Kong, University of Illinois at Chicago. Joint work with Philip S. Yu (University of Illinois at Chicago) and Xue Wang & Ann B. Ragin (Northwestern University).

Brain: A Complex Machine. How does it work? When something is wrong: Alzheimer's Disease, ADHD.

Neuroimaging (fMRI): a video of brain activity. But can you tell which brain is normal?

Brain as a Network: brain activities, functional connections

Brain Region Connectivity (healthy brain): good family

Brain Region Connectivity (ADHD brain): connectivity problem

Brain Activities as an Uncertain Graph. We are not sure exactly what the network looks like, so each edge carries the probability that the connection exists in practice.

Uncertain Graph. [Figure: an uncertain graph over nodes A, B, C with edge probabilities 0.4, 0.9 and 0.7, shown next to its possible worlds. Possible-world probabilities as printed on the slide: 0.02, 0.01, 0.04, 0.02, 0.38, 0.11, 0.03, 0.252.]
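The possible-world view of an uncertain graph can be illustrated with a minimal sketch. It uses the three edge probabilities from the slide (0.4, 0.9, 0.7). Two things are assumptions made here for illustration: which node pair each probability belongs to, and that edges exist independently of one another. The function name `possible_worlds` is hypothetical.

```python
from itertools import product

# Edge probabilities from the slide; the assignment of each probability
# to a node pair is an assumption made for illustration.
edge_probs = {("A", "B"): 0.4, ("A", "C"): 0.9, ("B", "C"): 0.7}


def possible_worlds(edge_probs):
    """Yield (edges_present, probability) for every possible world.

    Assumes edges exist independently, so a world's probability is the
    product of p for each present edge and (1 - p) for each absent edge.
    """
    edges = list(edge_probs)
    for mask in product([False, True], repeat=len(edges)):
        prob = 1.0
        for edge, present in zip(edges, mask):
            p = edge_probs[edge]
            prob *= p if present else 1.0 - p
        present_edges = frozenset(e for e, keep in zip(edges, mask) if keep)
        yield present_edges, prob


worlds = dict(possible_worlds(edge_probs))
for present_edges, prob in sorted(worlds.items(), key=lambda kv: kv[1]):
    print(sorted(present_edges), round(prob, 3))
```

With three uncertain edges there are 2^3 = 8 possible worlds, and their probabilities sum to 1. The world containing all three edges has probability 0.4 × 0.9 × 0.7 = 0.252, which matches the largest-looking value printed on the slide.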

Uncertain Graph Classification Problem. [Figure: uncertain graphs with edge probabilities and class labels (ADHD? +/-) are mapped, through discriminative subgraph features, to feature vectors x with labels.]

How to tell if a subgraph is discriminative? [Figure: four uncertain graphs G1 to G4 over nodes A, B, C, with class labels and edge probabilities of 0.1, 0.8 and 0.9, and three subgraph features g1, g2, g3. The features are annotated as: frequent in uncertain graphs; discriminative in certain graphs; discriminative in uncertain graphs.]

Discriminative Scores of a Subgraph (certain graphs). For certain graphs with class labels, the utility score of a subgraph feature (confidence, frequency ratio, G-test score, HSIC) is a certain value.

Discriminative Scores of a Subgraph (uncertain graphs). For uncertain graphs with class labels, the utility score of a subgraph feature is no longer a single value but a distribution: each possible score has a probability.

How to get the distribution? It depends on the utility function: confidence, frequency ratio, G-test score, or HSIC.

Generalized Utility Function for Certain Graphs

Table 2: Summary of Discrimination Score Functions.

Name: $f(n_+^g, n_-^g, n_+, n_-)$
confidence: $\frac{n_+^g}{n_+^g + n_-^g}$
frequency ratio: $\log \frac{n_+^g \cdot n_-}{n_-^g \cdot n_+}$
G-test: $2 n_+^g \ln \frac{n_+^g \cdot n_-}{n_-^g \cdot n_+} + 2 (n_+ - n_+^g) \ln \frac{n_- \cdot (n_+ - n_+^g)}{n_+ \cdot (n_- - n_-^g)}$
HSIC (linear): $\frac{(n_+^g n_- - n_-^g n_+)^2}{(n_+ + n_- - 1)^2 (n_+ + n_-)^2}$
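The four score functions of Table 2 can be sketched in code. This is a minimal sketch, not the authors' implementation. It assumes that n_+^g and n_-^g are the numbers of positive and negative graphs containing subgraph g, and that n_+ and n_- are the class sizes. The formulas are as read from the damaged table on the slide, so their exact form should be checked against the paper. The function names are hypothetical, and no smoothing is applied, so all counts must be strictly between zero and the class size.

```python
import math


def confidence(ng_pos, ng_neg, n_pos, n_neg):
    # Share of the graphs containing g that are positive.
    return ng_pos / (ng_pos + ng_neg)


def frequency_ratio(ng_pos, ng_neg, n_pos, n_neg):
    # Log ratio of the relative frequencies of g in the two classes.
    return math.log((ng_pos * n_neg) / (ng_neg * n_pos))


def g_test(ng_pos, ng_neg, n_pos, n_neg):
    # First term: positive graphs that contain g.
    contains = 2 * ng_pos * math.log((ng_pos * n_neg) / (ng_neg * n_pos))
    # Second term: positive graphs that do not contain g.
    missing_pos = n_pos - ng_pos
    missing_neg = n_neg - ng_neg
    lacks = 2 * missing_pos * math.log(
        (n_neg * missing_pos) / (n_pos * missing_neg)
    )
    return contains + lacks


def hsic_linear(ng_pos, ng_neg, n_pos, n_neg):
    n = n_pos + n_neg
    return (ng_pos * n_neg - ng_neg * n_pos) ** 2 / ((n - 1) ** 2 * n ** 2)


# Hypothetical counts: g occurs in 8 of 10 positive and 2 of 10 negative graphs.
counts = (8, 2, 10, 10)
for score in (confidence, frequency_ratio, g_test, hsic_linear):
    print(score.__name__, round(score(*counts), 4))
```

All four functions take the same four counts, which is what makes a generalized treatment possible: once the distribution of the counts is known, the distribution of any of these scores follows.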

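Dynamic Programming

[Figure: dynamic-programming table of $\Pr[n_+^g = i, D_+(k)]$ for $0 \le k \le n_+$ and $0 \le i \le n_+$. The final column holds $\Pr[n_+^g = 0, D_+]$, $\Pr[n_+^g = 1, D_+]$, ..., $\Pr[n_+^g = n_+, D_+]$.]

$$
\Pr[i, D(k)] =
\begin{cases}
\left(1 - \Pr[g \subseteq G_k]\right) \Pr[i, D(k-1)] + \Pr[g \subseteq G_k] \, \Pr[i-1, D(k-1)] & \text{if } 0 \le i \le k,\ k \ge 1 \\
1 & \text{if } i = k = 0 \\
0 & \text{if } i > k \text{ or } i < 0
\end{cases}
$$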
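The recurrence on this slide can be sketched as a short dynamic program. The sketch assumes that the probability of each graph containing g is already known and that the graphs are independent of each other; computing those containment probabilities is a separate problem not covered here. The function name `count_distribution` and the example probabilities are hypothetical.

```python
def count_distribution(containment_probs):
    """Return dist where dist[i] = Pr[exactly i of the graphs contain g].

    containment_probs[k-1] plays the role of Pr[g contained in G_k];
    graphs are assumed independent.
    """
    dist = [1.0]  # base case k = 0: Pr[0, D(0)] = 1
    for p in containment_probs:
        nxt = [0.0] * (len(dist) + 1)
        for i, prob in enumerate(dist):
            nxt[i] += (1.0 - p) * prob  # G_k does not contain g
            nxt[i + 1] += p * prob      # G_k contains g
        dist = nxt
    return dist


# Hypothetical containment probabilities for three positive graphs.
dist = count_distribution([0.8, 0.9, 0.1])
for i, prob in enumerate(dist):
    print(i, round(prob, 3))
```

The table is filled one graph at a time, so the cost is quadratic in the number of graphs rather than exponential in the number of possible worlds.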

How to Measure? [Figure: probability distributions of the frequency ratio for Subgraph A and Subgraph B, each annotated with its mean, median, mode and phi-prob.]

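Subgraph Statistical Measures

Mean: $\mathrm{Exp}(F(g, D)) = \sum_{s=-\infty}^{+\infty} s \cdot \Pr[F(g, D) = s]$
Median: $\mathrm{Median}(F(g, D)) = \arg\max_{S} \left\{ \sum_{s=-\infty}^{S} \Pr[F(g, D) = s] \le \frac{1}{2} \right\}$
Mode: $\mathrm{Mode}(F(g, D)) = \arg\max_{s} \Pr[F(g, D) = s]$
phi-probability: $\varphi\text{-pr}(F(g, D)) = \sum_{s=\varphi}^{+\infty} \Pr[F(g, D) = s]$

More details in the paper. [Figure: pattern search tree with levels for 0-edge, 1-edge and 2-edge subgraphs.]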
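The four statistical measures can be sketched over a discrete score distribution given as a mapping from score value to probability. This is an illustrative sketch with hypothetical function names and a made-up distribution. The median follows the definition on the slide (the largest S whose cumulative probability is at most 1/2); the fallback to the smallest score when no S qualifies is a choice made here, not something stated in the source.

```python
def mean(dist):
    # Expected score: sum of s * Pr[F = s].
    return sum(s * p for s, p in dist.items())


def mode(dist):
    # Score value with the highest probability.
    return max(dist, key=dist.get)


def median(dist):
    # Largest S with Pr[F <= S] <= 1/2, per the slide's definition.
    # Falls back to the smallest score if no S qualifies (assumption).
    best = min(dist)
    cumulative = 0.0
    for s in sorted(dist):
        cumulative += dist[s]
        if cumulative > 0.5:
            break
        best = s
    return best


def phi_probability(dist, phi):
    # Probability that the score is at least phi.
    return sum(p for s, p in dist.items() if s >= phi)


# Hypothetical distribution of a utility score over possible worlds.
scores = {0.0: 0.2, 0.5: 0.25, 1.0: 0.55}
print("mean", mean(scores))
print("median", median(scores))
print("mode", mode(scores))
print("phi-prob (phi = 0.5)", phi_probability(scores, 0.5))
```

Each measure collapses the score distribution into one number, so subgraph features can be ranked the same way as in the certain-graph setting.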

Data Sets. Graphs: brain images (fMRI). Class label: brain diseases (for example Alzheimer's Disease, ADHD).

Table 3: Summary of experimental datasets.

Dataset  |D|  |D+|  |D-|  |V|  avg. |E|  avg. edge prob.
ADHD     200  100   100   116  484.7     0.55
ADNI     36   18    18    90   2019.8    0.59
HIV      50   25    25    90   480.48    0.88

Compared Methods
Baselines: certain graph methods; frequent subgraphs in uncertain graphs.
Utility functions: Confidence, Frequency Ratio, HSIC, G-test Score.
Statistical measures: Mean, Mode, Median, Phi-probability.

Uncertain Graph Helps

Error rate for each method and each value of t. The number in parentheses appears to be the method's rank within the column, and * marks rank 1. On the slide the rows are grouped into "Uncertain Graph Methods" and "Certain Graph Methods".

Methods    t=100       t=200       t=300       t=400       t=500
Exp-HSIC   0.400 (9)   0.367 (8)   0.367 (10)  0.317 (4)   0.333 (9)
Med-HSIC   0.433 (14)  0.350 (5)   0.333 (6)   0.350 (8)   0.317 (7)
Mod-HSIC   0.367 (6)   0.333 (3)   0.300 (1)*  0.317 (4)   0.300 (2)
ϕpr-HSIC   0.283 (1)*  0.283 (1)*  0.333 (6)   0.333 (7)   0.300 (2)
HSIC       0.450 (16)  0.467 (19)  0.467 (17)  0.500 (18)  0.500 (18)
Exp-Ratio  0.433 (14)  0.383 (10)  0.317 (4)   0.300 (2)   0.300 (2)
Med-Ratio  0.450 (16)  0.417 (15)  0.450 (16)  0.383 (11)  0.383 (11)
Mod-Ratio  0.317 (3)   0.350 (5)   0.433 (15)  0.417 (13)  0.467 (15)
ϕpr-Ratio  0.400 (9)   0.317 (2)   0.300 (1)*  0.300 (2)   0.267 (1)*
Ratio      0.500 (19)  0.483 (20)  0.533 (22)  0.567 (22)  0.533 (20)
Exp-Gtest  0.300 (2)   0.367 (8)   0.317 (4)   0.350 (8)   0.383 (11)
Med-Gtest  0.517 (21)  0.450 (18)  0.400 (11)  0.500 (18)  0.483 (17)
Mod-Gtest  0.517 (21)  0.550 (22)  0.500 (21)  0.500 (18)  0.517 (19)
ϕpr-Gtest  0.450 (16)  0.417 (15)  0.417 (13)  0.383 (11)  0.300 (2)
Gtest      0.500 (19)  0.500 (21)  0.467 (17)  0.433 (14)  0.550 (21)
Exp-Conf   0.367 (7)   0.333 (3)   0.300 (1)*  0.283 (1)*  0.300 (2)
Med-Conf   0.333 (4)   0.350 (5)   0.350 (8)   0.350 (8)   0.317 (7)
Mod-Conf   0.417 (12)  0.383 (10)  0.350 (8)   0.317 (4)   0.333 (9)
ϕpr-Conf   0.400 (9)   0.417 (15)  0.467 (17)  0.467 (16)  0.433 (13)
Conf       0.400 (9)   0.400 (13)  0.417 (13)  0.450 (15)  0.467 (15)
Exp-Freq   0.383 (8)   0.383 (10)  0.400 (11)  0.467 (16)  0.433 (13)
Freq       0.350 (5)   0.400 (13)  0.483 (20)  0.550 (21)  0.550 (21)

Discriminative Function Helps. The slide repeats the error-rate table shown under "Uncertain Graph Helps", this time with the rows grouped into "Discriminative" and "Frequent" methods.

Statistical Measure Helps. The slide repeats the error-rate table shown under "Uncertain Graph Helps", this time highlighting the statistical measures: mean (Exp-), median (Med-), mode (Mod-) and phi-prob (ϕpr-).

Summary: mining discriminative subgraph features for uncertain graph classification.
Model brains as uncertain graphs.
Mine discriminative subgraph features using different statistical measures.

Q&A