Coupling distributed and symbolic execution for natural language queries. Lili Mou, Zhengdong Lu, Hang Li, Zhi Jin


Coupling distributed and symbolic execution for natural language queries Lili Mou, Zhengdong Lu, Hang Li, Zhi Jin doublepower.mou@gmail.com

Outline: Introduction to neural enquirers; Coupled approach of neural and symbolic execution; Experimental results; Conclusion and discussion

Semantic Parsing: select Duration where area = max(area)

Approaches: Traditional semantic parsing; seq2seq models; Neural execution; Fully distributed model; Symbolic execution

Why Execution is Necessary?
Q: How long is the game with the largest host country size?

Year  City            Area  Duration
2000  Sydney          200   30
2004  Athens          250   20
2008  Beijing         350   25
2012  London          300   35
2016  Rio de Janeiro  200   40
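To make the point concrete, here is a minimal sketch of what answering this question involves: the logical form select Duration where area = max(area) has to be executed against the table, not merely produced. The function name select_where_max and the dictionary layout are illustrative assumptions, not part of the original slides.

```python
# Table copied from the slide; the dict-per-row layout is an assumption.
TABLE = [
    {"Year": 2000, "City": "Sydney", "Area": 200, "Duration": 30},
    {"Year": 2004, "City": "Athens", "Area": 250, "Duration": 20},
    {"Year": 2008, "City": "Beijing", "Area": 350, "Duration": 25},
    {"Year": 2012, "City": "London", "Area": 300, "Duration": 35},
    {"Year": 2016, "City": "Rio de Janeiro", "Area": 200, "Duration": 40},
]


def select_where_max(table, answer_field, max_field):
    """Return answer_field from the row whose max_field is largest.

    Hypothetical helper mirroring: select answer_field where max_field = max(max_field).
    """
    best_row = max(table, key=lambda row: row[max_field])
    return best_row[answer_field]


answer = select_where_max(TABLE, "Duration", "Area")
print(answer)  # prints 25 (Beijing has the largest Area, 350)
```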

Why Execution is Necessary? Consider a more complicated example: "How long is the last game which has smaller country size than the game whose host country GDP is 250?" Such compositional queries require multiple steps of execution.

Neural Enquirer (Yin et al., 2016): Everything is an embedding, and everything is done by neural information processing. Differentiable => high learning efficiency. Low execution efficiency because of neural information processing. Low interpretability.

Neural Symbolic Machine (Liang et al., 2016): Discrete operators; differentiable controller; REINFORCE algorithm ("trial and error")

Comparison Neuralized Symbolic Wanted (Our approach) Learning efficiency Execution efficiency Interpretability Performance High Low Very low (Comparatively) High High High Low Low High Low High High

Outline: Introduction to neural enquirers; Coupled approach of neural and symbolic execution (Distributed enquirer; Symbolic executor; A unified view); Experimental results; Conclusion and discussion

Overview

Distributed Enquirer (Yin et al., 2016): Query encoder: bidirectional RNN. Table encoder: concatenation of cell and field embeddings, further processed by a multi-layer perceptron. Executor: column attention (soft selection) and row annotation (distributed selection).

Executor: The result of one step of execution is a softmax attention over columns and a distributed row annotation.

Details: Last step's execution information; Current step: column attention, row annotation
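The soft column selection described on these slides can be illustrated with a small sketch. It covers only the softmax attention over columns and the resulting attention-weighted read of one row; the row annotation network is omitted. The scores, the 2-dimensional cell embeddings, and the helper names softmax and soft_read are hypothetical values chosen for illustration, not taken from the original model.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_read(cell_embeddings, attention):
    """Attention-weighted sum of one row's cell embeddings (soft column selection)."""
    dim = len(cell_embeddings[0])
    return [
        sum(weight * cell[d] for weight, cell in zip(attention, cell_embeddings))
        for d in range(dim)
    ]


# Hypothetical relevance scores for the columns Year, City, Area, Duration.
column_scores = [0.2, 0.1, 3.0, 0.5]
attention = softmax(column_scores)

# Hypothetical 2-dimensional cell embeddings for a single row.
row_cells = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
read_vector = soft_read(row_cells, attention)

print(attention)
print(read_vector)
```

Because every column contributes to the read vector, the selection stays differentiable, which is what makes the distributed enquirer end-to-end learnable but also slower to execute than a hard, symbolic selection.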

Symbolic Execution: Intuition: a more natural way to do semantic parsing is symbolic execution, e.g., max(.), less_than(.). Methodology: primitive operators; controller (operator/argument predictor).

Primitive Operators

Example
Q: How long is the game with the largest host country size?

Year  City            Area  Duration
2000  Sydney          200   30
2004  Athens          250   20
2008  Beijing         350   25
2012  London          300   35
2016  Rio de Janeiro  200   40

Step 1: operator = argmax, field = Area
Step 2: operator = select_value, field = Duration
Step 3: operator = EOE
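The operator sequence in this example can be traced with a toy interpreter. This is a sketch under assumptions: the operator semantics (argmax keeps the rows with the largest field value, select_value reads a field from the selected rows, EOE is read as a stop marker) are inferred from the slide, and the execute function and program encoding are hypothetical.

```python
# Table copied from the slide; the dict-per-row layout is an assumption.
TABLE = [
    {"Year": 2000, "City": "Sydney", "Area": 200, "Duration": 30},
    {"Year": 2004, "City": "Athens", "Area": 250, "Duration": 20},
    {"Year": 2008, "City": "Beijing", "Area": 350, "Duration": 25},
    {"Year": 2012, "City": "London", "Area": 300, "Duration": 35},
    {"Year": 2016, "City": "Rio de Janeiro", "Area": 200, "Duration": 40},
]

# One (operator, field) pair per execution step, as predicted by a controller.
PROGRAM = [("argmax", "Area"), ("select_value", "Duration"), ("EOE", None)]


def execute(program, table):
    """Run a sequence of primitive operators; semantics are assumed, see lead-in."""
    rows = list(table)  # start with every row selected
    answer = None
    for operator, field in program:
        if operator == "EOE":
            break  # assumed to mark the end of execution
        if operator == "argmax":
            best = max(row[field] for row in rows)
            rows = [row for row in rows if row[field] == best]
        elif operator == "select_value":
            answer = [row[field] for row in rows]
        else:
            raise ValueError(f"unknown operator: {operator}")
    return answer


print(execute(PROGRAM, TABLE))  # prints [25]
```

Each step is a discrete, inspectable action, which is where the interpretability and execution efficiency of the symbolic executor come from.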

A More Complicated Example Q: How long is the last game which has smaller country size than the game whose host country GDP is 250?

Controller: Operator/Argument Predictor. Jordan-type RNNs: operator predictor; field predictor

The Problems: Non-differentiable; no step-by-step supervision

A Unified View: Two worlds of execution: the fully neuralized enquirer (end-to-end learnable) and the symbolic enquirer (high execution efficiency, high interpretability). We propose to take advantage of both worlds, plus high performance.

Intuition: The fully neuralized enquirer also exhibits some (imperfect) interpretability. The field attention generally aligns with column selection.

Solution: Use the neural network's (imperfect) intermediate results to pretrain the symbolic executor's policy in a step-by-step fashion, then improve the policy by reinforcement learning.

Overview

Pretraining: Let m be the number of actions to pretrain. Imperfect supervision signal; step-by-step supervision.
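A minimal sketch of step-by-step pretraining follows. It assumes that the supervision target at each step is the column receiving the highest attention from the distributed enquirer and that the loss is the cross-entropy summed over the first m actions; neither detail is stated on the slide, and the function name and probability values are hypothetical.

```python
import math


def pretraining_loss(neural_attention, symbolic_field_probs, m):
    """Cross-entropy of the symbolic field predictor against attention-derived labels.

    neural_attention[t]      -- column attention of the distributed enquirer at step t
    symbolic_field_probs[t]  -- field distribution of the symbolic controller at step t
    m                        -- number of actions (steps) to pretrain
    """
    loss = 0.0
    for step in range(m):
        attention = neural_attention[step]
        # Imperfect label (assumption): the most attended column.
        target = max(range(len(attention)), key=attention.__getitem__)
        loss -= math.log(symbolic_field_probs[step][target])
    return loss


# Hypothetical distributions over the columns Year, City, Area, Duration.
neural_attention = [[0.1, 0.1, 0.7, 0.1], [0.05, 0.05, 0.1, 0.8]]
symbolic_field_probs = [[0.25, 0.25, 0.25, 0.25], [0.1, 0.1, 0.1, 0.7]]

print(pretraining_loss(neural_attention, symbolic_field_probs, m=2))
```

Minimizing such a loss gives the symbolic controller a supervised signal at every step, which plain reinforcement learning from the final answer does not provide.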

REINFORCE Policy Improvement: Gradient. Reward R: 1 = correct result, 0 = incorrect result. Tricks: exploring with a small probability (0.1); subtracting the mean (reinforcement comparison); truncating negative reward (reward-inaction).

REINFORCE in a nutshell: Sample from your current policy distribution; obtain the reward; update according to the gradient of the sampled actions. Extremely difficult to get started. Poor local optima (also sensitive to the initial policy). Fortunately, the distributed world makes life much easier.
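The sample, reward, update loop and the three tricks can be sketched on a toy problem. This is an illustration under assumptions, not the authors' training code: the policy is a single softmax over four hypothetical actions with one correct action, the baseline is a running mean of past rewards, and the learning rate and step count are arbitrary.

```python
import math
import random


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(probs, rng, epsilon=0.1):
    """Sample from the policy, exploring uniformly with small probability epsilon."""
    if rng.random() < epsilon:
        return rng.randrange(len(probs))
    return rng.choices(range(len(probs)), weights=probs)[0]


def reinforce_update(logits, action, reward, baseline, lr=0.5):
    """One REINFORCE step with mean subtraction and truncated negative reward."""
    advantage = max(reward - baseline, 0.0)  # reward-inaction: never push away
    probs = softmax(logits)
    # d log p(action) / d logit_k = 1[k == action] - p_k
    return [
        logit + lr * advantage * ((1.0 if k == action else 0.0) - p)
        for k, (logit, p) in enumerate(zip(logits, probs))
    ]


def train(correct_action=2, n_actions=4, steps=500, seed=0):
    rng = random.Random(seed)
    logits = [0.0] * n_actions
    baseline = 0.0
    for t in range(1, steps + 1):
        action = sample_action(softmax(logits), rng)
        reward = 1.0 if action == correct_action else 0.0  # R: 1 = correct, 0 = incorrect
        logits = reinforce_update(logits, action, reward, baseline)
        baseline += (reward - baseline) / t  # running mean of rewards
    return softmax(logits)


final_probs = train()
print(final_probs)
```

With a uniform initial policy and a single step, a correct action is sampled quickly. In the real setting a whole sequence of operators and fields must be right before any reward arrives, which is why a pretrained initial policy matters so much.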

Experimental Settings: Dataset: Yin et al. (2016); synthesized data; 25k samples (different queries and tables). Hyperparameters: mostly derived from previous work; 40 epochs of pretraining before REINFORCE.

Experiments: Performance

Experiments: Interpretability

Experiments: Learning efficiency

Experiments: Execution efficiency

Feeding back/co-training

Conclusion: We propose to couple distributed and symbolic execution for natural language queries. The distributed enquirer exhibits some (imperfect) interpretability. We use the distributed model's step-by-step signal to pretrain the symbolic one so that it acquires a fairly meaningful initial policy, then improve the policy by REINFORCE. The coupled model achieves high learning efficiency, high execution efficiency, high interpretability, as well as high performance.

Future Work: Couple more actions. Make better use of the information: use the full distribution information? Sample from the distribution predicted by the neural enquirer? Induce operators and pretrain the operation predictors.

Discussion: Previous work on incorporating neural networks with external (somewhat) symbolic systems. Hu et al. (2016) harness knowledge of a rule-based system by inducing a probability distribution from it as the training objective. Lei et al. (2016) propose a sparse, hard gating approach to force a neural network to focus on relevant information. Mi et al. (2016) use alignment heuristics to train the attention signal of neural machine translation in a supervised manner.

The Uniqueness of Our Work: First train a neural network in an end-to-end fashion; then guide a symbolic system to achieve a meaningful initial policy.

Attention as Alignment (Bahdanau et al., ICLR 2015)

Q & A? Thank you for listening!