Sentiment Analysis Symposium - Disambiguate Opinion Word Sense via Contextonymy

Similar documents
Twitter Analysis of IPL cricket match using GICA method

EVALITA 2011 The News People Search Task: Evaluating Cross-document Coreference Resolution of Named Person Entities in Italian News

A Novel Approach to Predicting the Results of NBA Matches

Using Perceptual Context to Ground Language

knn & Naïve Bayes Hongning Wang

BALLGAME: A Corpus for Computational Semantics

Introduction to Pattern Recognition

Spreading Activation in Soar: An Update

1 Konan University Yuki Hattori Akiyo Nadamoto

Re-evaluating automatic metrics for image captioning

Commonsense Knowledge Acquisition and Applications

Safety Assessment of Installing Traffic Signals at High-Speed Expressway Intersections

Transformer fault diagnosis using Dissolved Gas Analysis technology and Bayesian networks

INTRODUCTION TO PATTERN RECOGNITION

67. Sectional normalization and recognization on the PV-Diagram of reciprocating compressor

Cricket umpire assistance and ball tracking system using a single smartphone camera

Evaluating and Classifying NBA Free Agents

Introduction to Pattern Recognition

Approximate Grammar for Information Extraction

CS472 Foundations of Artificial Intelligence. Final Exam December 19, :30pm

Open Research Online The Open University s repository of research publications and other research outputs

BASKETBALL PREDICTION ANALYSIS OF MARCH MADNESS GAMES CHRIS TSENG YIBO WANG

Uncovering Anomalous Usage of Medical Records via Social Network Analysis

Title: 4-Way-Stop Wait-Time Prediction Group members (1): David Held

Smart-Walk: An Intelligent Physiological Monitoring System for Smart Families

Player Availability Rating (PAR) - A Tool for Quantifying Skater Performance for NHL General Managers

An Artificial Neural Network-based Prediction Model for Underdog Teams in NBA Matches

Human Performance Evaluation

Performance of Fully Automated 3D Cracking Survey with Pixel Accuracy based on Deep Learning

Syntax and Parsing II

CS 221 PROJECT FINAL

PREDICTING the outcomes of sporting events

Using Spatio-Temporal Data To Create A Shot Probability Model

Ivan Suarez Robles, Joseph Wu 1

CS 4649/7649 Robot Intelligence: Planning

Machine Learning an American Pastime

Thursday, Nov 20, pm 2pm EB 3546 DIET MONITORING THROUGH BREATHING SIGNAL ANALYSIS USING WEARABLE SENSORS. Bo Dong. Advisor: Dr.

The Cooperative Cleaners Case Study: Modelling and Analysis in Real-Time ABS

Country report. Swedish bathing water quality in Sweden. May Photo: Peter Kristensen/EEA

Petacat: Applying ideas from Copycat to image understanding

Title: Modeling Crossing Behavior of Drivers and Pedestrians at Uncontrolled Intersections and Mid-block Crossings

JPEG-Compatibility Steganalysis Using Block-Histogram of Recompression Artifacts

A Big Data approach to understand Central Banks Big Data Spain 2018

VDOT Crash Analysis Procedures for Roadway Safety Assessments

Pedestrian Protection System for ADAS using ARM 9

CS 528 Mobile and Ubiquitous Computing Lecture 7a: Applications of Activity Recognition + Machine Learning for Ubiquitous Computing.

Review of transportation mode detection approaches based on smartphone data

Application of Bayesian Networks to Shopping Assistance

In memory of Dr. Kevin P. Granata, my graduate advisor, who was killed protecting others on the morning of April 16, 2007.

Meta-Knowledge as an engine in Classifier Combination

PREDICTING THE NCAA BASKETBALL TOURNAMENT WITH MACHINE LEARNING. The Ringer/Getty Images

Opleiding Informatica

Statistical Machine Translation

Object Recognition. Selim Aksoy. Bilkent University

Basketball field goal percentage prediction model research and application based on BP neural network

CC-Log: Drastically Reducing Storage Requirements for Robots Using Classification and Compression

Paul Burkhardt. May 19, 2016

Supplementary Material for Bayes Merging of Multiple Vocabularies for Scalable Image Retrieval

IDENTIFICATION OF WIND SEA AND SWELL EVENTS AND SWELL EVENTS PARAMETERIZATION OFF WEST AFRICA. K. Agbéko KPOGO-NUWOKLO

Lecture 10: Generation

A Proof-Producing CSP Solver 1

Software Reliability 1

Geometric moments for gait description

Neural Networks II. Chen Gao. Virginia Tech Spring 2019 ECE-5424G / CS-5824

A Machine Learning Approach to Predicting Winning Patterns in Track Cycling Omnium

Detecting Aged Person s Sliding Feet from Time Series Data of Foot Pressure

Decision Trees. Nicholas Ruozzi University of Texas at Dallas. Based on the slides of Vibhav Gogate and David Sontag

Visual Traffic Jam Analysis Based on Trajectory Data

Game, Shot and Match: Event-based Indexing of Tennis

Deconstructing Data Science

Design of a Pedestrian Detection System Based on OpenCV. Ning Xu and Yong Ren*

Country report. Italian bathing water quality in Italy. May Photo: Peter Kristensen

Author s Name Name of the Paper Session. Positioning Committee. Marine Technology Society. DYNAMIC POSITIONING CONFERENCE September 18-19, 2001

Hierarchical writing genre classification with neural networks

Dialect Recogni-on Using a Phone GMM Supervector Based SVM Kernel

News English.com Ready-to-use ESL / EFL Lessons

Optimizing Cyclist Parking in a Closed System

2 When Some or All Labels are Missing: The EM Algorithm

EEC 686/785 Modeling & Performance Evaluation of Computer Systems. Lecture 6. Wenbing Zhao. Department of Electrical and Computer Engineering

Wearable Trick Classification in Freestyle Snowboarding

Adaptability and Fault Tolerance

Make a Marigram. Overview: Targeted Alaska Grade Level Expectations: Objectives: Materials: Whole Picture: Grades 9-12

Provably Secure Camouflaging Strategy for IC Protection

Predicting Tennis Match Outcomes Through Classification Shuyang Fang CS074 - Dartmouth College

Dynamic Cluster-Based Over-Demand Prediction in Bike Sharing Systems

Outline. Terminology. EEC 686/785 Modeling & Performance Evaluation of Computer Systems. Lecture 6. Steps in Capacity Planning and Management

Advanced PMA Capabilities for MCM

Country report. Swedish bathing water quality in Sweden. May Photo: Peter Kristensen

Active Pedestrian Protection System, Project Review

Real-time identification using gait pattern analysis on a standalone wearable accelerometer

Application of Dijkstra s Algorithm in the Evacuation System Utilizing Exit Signs

CS145: INTRODUCTION TO DATA MINING

Walking up Scenic Hills: Towards a GIS Based Typology of Crowd Sourced Walking Routes

DATA MINING ON CRICKET DATA SET FOR PREDICTING THE RESULTS. Sushant Murdeshwar

This file is part of the following reference:

The latest from the World Health Organization meeting in October 2011 ICF updates

Course 495: Advanced Statistical Machine Learning/Pattern Recognition

Bicycle Safety Map System Based on Smartphone Aided Sensor Network

Which Aspects are able to Influence the Decision in Case of the Bids for the Olympic Games 2024?

News English.com Ready-to-use ESL / EFL Lessons Zidane - "Italy player called me a terrorist"

Transcription:

Sentiment Analysis Symposium - Disambiguate Opinion Word Sense via Contextonymy Guillaume Gadek, Josefin Betsholtz, Alexandre Pauchet, Stéphan Brunessaux, Nicolas Malandain and Laurent Vercouter Airbus, LITIS 28 June 2017 G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 1 / 34

G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 2 / 34

Tweets can be very difficult to analyze... #hashtag #verylongandexplicithashtag2 poorlywritten pseudo-english by @user http://spam.url!ponctuation;signs #hashtag3 G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 3 / 34

Tweets can be very difficult to analyze... #hashtag #verylongandexplicithashtag2 poorlywritten pseudo-english by @user http://spam.url!ponctuation;signs #hashtag3 What is the sense of!ponctuation;signs? Synonyms from the Oxford dictionary? G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 3 / 34

Tweets can be very difficult to analyze... #hashtag #verylongandexplicithashtag2 poorlywritten pseudo-english by @user http://spam.url!ponctuation;signs #hashtag3 What is the sense of!ponctuation;signs? Synonyms from the Oxford dictionary? No. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 3 / 34

Outline Introduction State of the art Experiment Results Conclusion G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 4 / 34

State of the art G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 5 / 34

Opinion as a sentiment analysis task [Wilson et al., 2005] [Pennebaker et al., 2001] [Baccianella et al., 2010] G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 6 / 34

Opinion as a classification task Stance Detection [Andreevskaia and Bergler, 2008, Hasan and Ng, 2013] G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 7 / 34

Contextualization Semantic relatedness of words use WordNet [Miller, 1995] G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 8 / 34

Contextualization Semantic relatedness of words use WordNet [Miller, 1995] use Wikipedia [Zesch et al., 2008] G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 8 / 34

Contextualization Semantic relatedness of words use WordNet [Miller, 1995] use Wikipedia [Zesch et al., 2008] poor results on Twitter content: different sentence structure, new vocabulary [Feng et al., 2015] G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 8 / 34

One lead amongst others: contextonyms G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 9 / 34

One lead amongst others: contextonyms Two words are contextonyms if they occur in the same context. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 9 / 34

One lead amongst others: contextonyms Two words are contextonyms if they occur in the same context. Contextonyms represent the minimal meanings of words. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 9 / 34

One lead amongst others: contextonyms Two words are contextonyms if they occur in the same context. Contextonyms represent the minimal meanings of words. Example tweet: What a great match! G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 9 / 34

One lead amongst others: contextonyms Two words are contextonyms if they occur in the same context. Contextonyms represent the minimal meanings of words. Example tweet: What a great match! Some minimal meanings of the word match match, light, fire, wood match, couple, date, love match, football, sports, game G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 9 / 34

If we know something about the context(s) of the word(s) in a tweet, we might be able to disambiguate the meaning of that tweet. introduced by [Hyungsuk et al., 2003] G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 10 / 34

If we know something about the context(s) of the word(s) in a tweet, we might be able to disambiguate the meaning of that tweet. introduced by [Hyungsuk et al., 2003] used for sentiment: [Serban et al., 2012] G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 10 / 34

If we know something about the context(s) of the word(s) in a tweet, we might be able to disambiguate the meaning of that tweet. introduced by [Hyungsuk et al., 2003] used for sentiment: [Serban et al., 2012] used for machine translation: [Ploux and Ji, 2003, Wang et al., 2016] G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 10 / 34

Contextonyms Example: Contextonyms for the word support Contextosets (support, continued, foolery), (climate, support, advocacy, preventing, change), (support, bae, naten, kanta), (support, tennessee, thank, trump2016) G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 11 / 34

Contextonyms Example: Contextonyms for the word support Contextosets (support, continued, foolery), (climate, support, advocacy, preventing, change), (support, bae, naten, kanta), (support, tennessee, thank, trump2016) Contextoset: a set of words representing the context for the target word. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 11 / 34

Contextonyms Example: Contextonyms for the word support Contextosets (support, continued, foolery), (climate, support, advocacy, preventing, change), (support, bae, naten, kanta), (support, tennessee, thank, trump2016) Contextoset: a set of words representing the context for the target word. Contextonyms: two words are contextonyms if they appear in the same context. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 11 / 34

Word Embeddings, contextosets and WordNet synsets for the nearest words of support Method: Word Embeddings supporting, supported, supports, respect, vote, encourage, voting, voted, organize, helping Method: Synsets (extract, total:14) (documentation, support) (support, keep, livelihood, living, bread and butter, sustenance) (support, supporting) (accompaniment, musical accompaniment, backup, support) Method: Contextosets (support, continued, foolery), (climate, support, advocacy, preventing, change), (support, bae, naten, kanta), (support, tennessee, thank, trump2016) Word2Vec: [Mikolov et al., 2013] G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 12 / 34

Experiment Aim: to improve stance detection on tweets using contextonyms G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 13 / 34

Experiment overview Extraction of Contextonyms Tweets SemEval Stance Detection Task Training Tweets Contextonyms Stance Classifier Baseline Ctxts Performance Results G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 14 / 34

Extraction-Theory What do we need to extract contextonyms? Source corpus huge (millions of tweets? more?) not annotated same language same topic, if possible G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 15 / 34

Contextonyms Extraction Algorithm 1. preprocess tweets 2. words co-occurrence graph 3. filter: 3.1 α (filter nodes) 3.2 β (filter edges) 4. k-cliques are our contextonyms. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 16 / 34

Contextonyms Extraction Algorithm 1. preprocess tweets 2. words co-occurrence graph 3. filter: 3.1 α (filter nodes) 3.2 β (filter edges) 4. k-cliques are our contextonyms. Tweet: Hillary is the best candidate #hillary2016 G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 16 / 34

Contextonyms Extraction Algorithm 1. preprocess tweets 2. words co-occurrence graph 3. filter: 3.1 α (filter nodes) 3.2 β (filter edges) 4. k-cliques are our contextonyms. Tweet: Hillary is the best candidate #hillary2016 Processed Tweet: hillary best candidate #hillary2016 G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 16 / 34

Contextonyms Extraction Algorithm 1. preprocess tweets 2. words co-occurrence graph 3. filter: 3.1 α (filter nodes) 3.2 β (filter edges) 4. k-cliques are our contextonyms. Tweet: Hillary is the best candidate #hillary2016 Processed Tweet: hillary best candidate #hillary2016 hillary best candidate #hillary2016 G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 16 / 34

Communities of words: k-cliques G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 17 / 34

Stance in tweets G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 18 / 34

Stance in tweets Target: Hillary Clinton. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 18 / 34

Stance in tweets Target: Hillary Clinton. Sample tweet FAVOR I m proud to announce I support #HillaryClinton!!!! G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 18 / 34

Stance in tweets Target: Hillary Clinton. Sample tweet FAVOR I m proud to announce I support #HillaryClinton!!!! Sample tweet AGAINST #WhyImNotVotingForHillary <<<<<<SHE IS A CRIMINAL G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 18 / 34

Stance in tweets Target: Hillary Clinton. Sample tweet FAVOR I m proud to announce I support #HillaryClinton!!!! Sample tweet AGAINST #WhyImNotVotingForHillary <<<<<<SHE IS A CRIMINAL Sample tweet NONE Ding Dong G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 18 / 34

G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 19 / 34

Details about the SemEval task Title: SemEval2016-task6 subtask-a 5 independent targets: Atheism Climate Change is a Real Concern Feminist Movement Hillary Clinton Legalization of Abortion G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 20 / 34

Details about the SemEval task Title: SemEval2016-task6 subtask-a 5 independent targets: Atheism Climate Change is a Real Concern Feminist Movement Hillary Clinton Legalization of Abortion 2 datasets: training: 2914 texts of tweets, unbalanced test: 1250 texts of tweets, quite well balanced G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 20 / 34

Details about the SemEval task Title: SemEval2016-task6 subtask-a 5 independent targets: Atheism Climate Change is a Real Concern Feminist Movement Hillary Clinton Legalization of Abortion 2 datasets: training: 2914 texts of tweets, unbalanced test: 1250 texts of tweets, quite well balanced and some critics: training set not large enough to train a classifier man-made annotation: need to perfectly know the topic to understand the stances G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 20 / 34

Baselines To benchmark our classifier, we created two baselines using two standard stance detection approaches: 1. Sentiment based: SENT-BASE 2. Learning based: SVM-UNIG G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 21 / 34

Baselines 1. Sentiment based: SENT-BASE Idea Each word is associated with a positivity score, a negativity score, and an objectivity score. Use SentiWordNet 3.0 [Baccianella et al., 2010]. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 22 / 34

Baselines 1. Sentiment based: SENT-BASE Idea Each word is associated with a positivity score, a negativity score, and an objectivity score. Use SentiWordNet 3.0 [Baccianella et al., 2010]. Valence A weighted sum of the scores of all the words in a tweet. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 22 / 34

Baselines 1. Sentiment based: SENT-BASE Idea Each word is associated with a positivity score, a negativity score, and an objectivity score. Use SentiWordNet 3.0 [Baccianella et al., 2010]. Valence A weighted sum of the scores of all the words in a tweet. The valence indicates the overall sentiment of the tweet: a positive valence means that the tweet is favorable, etc. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 22 / 34

Baselines 2. Learning based: SVM-UNIG Idea 1. Select the 10,000 unigrams (words) that are most indicative of stance from training corpus. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 23 / 34

Baselines 2. Learning based: SVM-UNIG Idea 1. Select the 10,000 unigrams (words) that are most indicative of stance from training corpus. 2. Construct a feature vector of boolean indicators of unigram presence in each tweet. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 23 / 34

Baselines 2. Learning based: SVM-UNIG Idea 1. Select the 10,000 unigrams (words) that are most indicative of stance from training corpus. 2. Construct a feature vector of boolean indicators of unigram presence in each tweet. 3. Train SVM classifier on annotated training corpus. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 23 / 34

Improving baselines with contextonyms Sentiment based approach: SENT-CTXT Idea We can improve the sentiment analysis of a tweet by looking at the contextonyms associated with that tweet. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 24 / 34

Improving baselines with contextonyms Sentiment based approach: SENT-CTXT Idea We can improve the sentiment analysis of a tweet by looking at the contextonyms associated with that tweet. 1. Associate each tweet with contextonyms. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 24 / 34

Improving baselines with contextonyms Sentiment based approach: SENT-CTXT Idea We can improve the sentiment analysis of a tweet by looking at the contextonyms associated with that tweet. 1. Associate each tweet with contextonyms. 2. Compute the valence of those contextonyms. This indicates the sentiment of the tweet. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 24 / 34

Improving baselines with contextonyms Learning based approach #1: SVM-CTXT Idea Contextonyms can improve the SVM classifier because we get more information about the context of the tweet. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 25 / 34

Improving baselines with contextonyms Learning based approach #1: SVM-CTXT Idea Contextonyms can improve the SVM classifier because we get more information about the context of the tweet. Same classifier as SVM-UNIG but feature vector is a boolean indicator of contextonym presence. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 25 / 34

Improving baselines with contextonyms Learning based approach #2: SVM-EXP Idea The fact that tweets are short make them difficult to analyze and contextonyms are adding information about context. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 26 / 34

Improving baselines with contextonyms Learning based approach #2: SVM-EXP Idea The fact that tweets are short make them difficult to analyze and contextonyms are adding information about context. Expand the tweets with the associated contextonyms and then train a SVM on the best unigrams. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 26 / 34

Results G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 27 / 34

Extraction-Implementation Source corpus used to extract contextonyms huge: 7,773,089 tweets not annotated: this part is easy same language: English-written tweets same topics: Clinton, Trump, the abortion debate, religion, and miscellaneous US politics gathered between November 20th and December 1st, 2015 using the free Twitter Stream API. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 28 / 34

Tools used stopwords list regexp to remove URLs NetworkX library not used: POS taggers and lemmatisers Parameters 1. α threshold = 10; consequence: vocabulary size at 50,000 2. β threshold = 0.06; number of edges at 300,000 3. k c = 4 the size of smallest clique; 6278 contextonyms ( contexto-sets ) G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 29 / 34

Measure the performance of Results Official Score for benchmarking purposes: P s = Precision (1) R s = Recall (2) F 1 (s) = 2 P sr s P s + R s (3) Score = 1 2 (F 1(F ) + F 1 (A)) (4) G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 30 / 34

Comparison of classifiers 0.7 0.6 F1 OfficialScore Scores by algorithm 0.5 F1 and OfficialScore 0.4 0.3 0.2 0.1 0.0 SENT-BASE SENT-CTXT SVM-UNIG SVM-CTXT SVM-EXP G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 31 / 34

Comparison of competitors 0.70 0.65 Competitors results SVM-UNIG SVM-EXP 0.60 OfficialScore 0.55 0.50 0.45 0.40 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 32 / 34

Summary Conclusion Challenging task: shortness of tweets, innovative spelling and specific usage of words. Lexical relatedness: more information to understand the tweets Contextonyms: a tool to be adapted to one s needs and resources G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 33 / 34

Summary Conclusion Challenging task: shortness of tweets, innovative spelling and specific usage of words. Lexical relatedness: more information to understand the tweets Contextonyms: a tool to be adapted to one s needs and resources Future Works Disambiguate ambiguous tweets only Focus on user s opinion Look for groups of users G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 33 / 34

Questions Thank you for your attention! Remarks, questions? G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 34 / 34

References I Andreevskaia, A. and Bergler, S. (2008). When specialists and generalists work together: Overcoming domain dependence in sentiment tagging. In ACL, pages 290 298. Baccianella, S., Esuli, A., and Sebastiani, F. (2010). Sentiwordnet 3.0: An enhanced lexical resource for sentiment analysis and opinion mining. In LREC, volume 10, pages 2200 2204. Feng, Y., Fani, H., Bagheri, E., and Jovanovic, J. (2015). Lexical semantic relatedness for twitter analytics. In Tools with Artificial Intelligence (ICTAI), 2015 IEEE 27th International Conference on, pages 202 209. IEEE. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 35 / 34

References II Hasan, K. S. and Ng, V. (2013). Extra-linguistic constraints on stance recognition in ideological debates. In ACL (2), pages 816 821. Hyungsuk, J., Ploux, S., and Wehrli, E. (2003). Lexical knowledge representation with contexonyms. In 9th MT summit Machine Translation, pages 194 201. Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., and Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems, pages 3111 3119. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 36 / 34

References III Miller, G. A. (1995). Wordnet: a lexical database for english. Communications of the ACM, 38(11):39 41. Pennebaker, J. W., Francis, M. E., and Booth, R. J. (2001). Linguistic inquiry and word count: Liwc 2001. Mahway: Lawrence Erlbaum Associates, 71:2001. Ploux, S. and Ji, H. (2003). A model for matching semantic maps between languages (french/english, english/french). Computational linguistics, 29(2):155 178. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 37 / 34

References IV Serban, O., Pauchet, A., Rogozan, A., Pécuchet, J.-P., and LITIS, I. (2012). Semantic propagation on contextonyms using sentiwordnet. In WACAI 2012 Workshop Affect, Compagnon Artificiel, Interaction, page 86. Wang, R., Zhao, H., Ploux, S., Lu, B.-L., and Utiyama, M. (2016). A bilingual graph-based semantic model for statistical machine translation. In International Joint Conference on Artificial Intelligence. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 38 / 34

References V Wilson, T., Wiebe, J., and Hoffmann, P. (2005). Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of the conference on human language technology and empirical methods in natural language processing, pages 347 354. Association for Computational Linguistics. Zesch, T., Müller, C., and Gurevych, I. (2008). Using wiktionary for computing semantic relatedness. In AAAI, volume 8, pages 861 866. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 39 / 34

Sentiment Analyser Compute a valence Let S(n) be the set of i synsets s i containing the word n. Each synset has a positive and a negative valence s + i, s i.. Let S t be the set of all the N synsets taken into account for the whole tweet. We therefore define the valence v(t): v(t) = 1 s + i + s i (5) N s i S t If v(t) is positive (negative), we assume the tweet is supportive (opposed), thus having a stance FAVOR (AGAINST ). G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 40 / 34

Stance classifier SVM-UNIG, using a SVM on word unigrams: Comparison: different algorithms: SVM-linear, SVM-RBF, NN, Bayes,... parameter settings: C = 100.0 γ= 0.01 The feature vector is composed of the boolean indicators of the unigrams presence. Vocabulary size is fixed at 10,000, which limits the feature vector length. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 41 / 34

Graph filtering parameters - α g t = (V t, E t ) the co-occurrence graph for a single tweet t. Average degree φ of a word n, due to its position: φ(n) = 1 K K d(n) gj (6) α(n), the ratio of degree in G to average degree position for word n: α(n) = d(n) G (7) φ(n) A large score implies that word n occurs in a great variety of contexts. A word n would then be removed if α(n) < α threshold. j=1 G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 42 / 34

Graph filtering parameters - β β consists of two weight-node count ratios. β(e) = w e c n1,e w e is the weight of edge e = (n 1, n 2 ), + w e c n2,e (8) c n1,e and c n2,e are the word counts for the two words n 1 and n 2 connected by e. A value approaching 2 implies the association is very important for both words. Filter away the edges whenever β e < β threshold, to get rid of unimportant associations. G.Gadek SAS 17 - Contextonyms for stance detection 28 June 2017 43 / 34