Syntax and Parsing II


Syntax and Parsing II: Dependency Parsing. Slav Petrov, Google. Thanks to: Dan Klein, Ryan McDonald, Alexander Rush, Joakim Nivre, Greg Durrett, David Weiss. Lisbon Machine Learning School 2015

Notes for 2016: can add 10 minutes of material.

Dependency Parsing: a dependency tree over the POS-tagged sentence "They solved the problem with statistics" (PRON VERB DET NOUN ADP NOUN), with arcs labeled ROOT, nsubj, dobj, det, prep, pobj.

(Non-)Projectivity: crossing arcs are needed to account for non-projective constructions. These are fairly rare in English but can be common in other languages (e.g. Czech).

Formal Conditions

Styles of Dependency Parsing

Transition-based (tr): fast, greedy, linear-time inference algorithms; trained for greedy search; can be extended with beam search. [Nivre et al. 03-11]

Graph-based (gr): slower, exhaustive, dynamic-programming inference algorithms; allow higher-order factorizations. [McDonald et al. 05-06]

Accuracy/time trade-off: greedy tr O(n), k-best tr O(kn), 1st-order gr O(n^3), 2nd-order gr O(n^3), 3rd-order gr O(n^4), with accuracy increasing in that order.

Arc-Factored Models

Dependency Representation: "* As McGwire neared , fans went wild". Every word modifies exactly one head, with the artificial token * as the root; arcs connect heads to their modifiers.

Arc-factored Projective Parsing

Eisner Algorithm

Eisner First-Order Rules: an incomplete span from h to m (carrying the arc h → m) is built by joining a complete span from h to r with a complete span from r+1 to m; a complete span from h to e is built by joining an incomplete span from h to m with a complete span from m to e. In practice there are also mirror-image left-arc versions of both rules.

Eisner First-Order Parsing: worked example on "* As McGwire neared , fans went wild", combining complete and incomplete spans bottom-up until a single complete span covers the whole sentence.

Eisner Algorithm Pseudo Code
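As a concrete reference, here is a minimal first-order Eisner decoder in Python. It is a sketch, not the lecture's exact pseudocode: unlabeled, arcs scored independently, and the chart layout and function name are illustrative.

```python
import numpy as np

def eisner(score):
    """First-order Eisner algorithm (unlabeled arc-factored decoding).

    score[h][m] is the score of an arc from head h to modifier m;
    word 0 is the artificial ROOT.  Returns head[m] for each word.
    Runs in O(n^3) time and only produces projective trees.
    """
    n = score.shape[0]
    NEG = float("-inf")
    # C[i][j][d][c]: d=0 left arc (head j), d=1 right arc (head i);
    # c=0 incomplete span, c=1 complete span
    C = np.full((n, n, 2, 2), NEG)
    bp = np.zeros((n, n, 2, 2), dtype=int)  # backpointers (split points)
    C[range(n), range(n), :, :] = 0.0

    for width in range(1, n):
        for i in range(n - width):
            j = i + width
            # incomplete spans: add an arc between i and j
            for r in range(i, j):
                s = C[i, r, 1, 1] + C[r + 1, j, 0, 1]
                if s + score[j, i] > C[i, j, 0, 0]:
                    C[i, j, 0, 0] = s + score[j, i]; bp[i, j, 0, 0] = r
                if s + score[i, j] > C[i, j, 1, 0]:
                    C[i, j, 1, 0] = s + score[i, j]; bp[i, j, 1, 0] = r
            # complete spans: extend an incomplete span
            for r in range(i, j):
                s = C[i, r, 0, 1] + C[r, j, 0, 0]
                if s > C[i, j, 0, 1]:
                    C[i, j, 0, 1] = s; bp[i, j, 0, 1] = r
            for r in range(i + 1, j + 1):
                s = C[i, r, 1, 0] + C[r, j, 1, 1]
                if s > C[i, j, 1, 1]:
                    C[i, j, 1, 1] = s; bp[i, j, 1, 1] = r

    heads = [0] * n
    def backtrack(i, j, d, c):
        if i == j:
            return
        r = bp[i, j, d, c]
        if c == 0:  # incomplete span: this is where the arc lives
            heads[i if d == 0 else j] = j if d == 0 else i
            backtrack(i, r, 1, 1); backtrack(r + 1, j, 0, 1)
        elif d == 0:
            backtrack(i, r, 0, 1); backtrack(r, j, 0, 0)
        else:
            backtrack(i, r, 1, 0); backtrack(r, j, 1, 1)
    backtrack(0, n - 1, 1, 1)
    return heads
```

With arc scores favoring ROOT → w1 and w1 → w2, w1 → w3, the decoder recovers exactly that projective tree.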

Maximum Spanning Trees (MSTs): MST algorithms can be used for non-projective parsing!

Chu-Liu-Edmonds

Find Cycle and Contract

Recalculate Edge Weights

Theorem

Final MST

Chu-Liu-Edmonds Pseudo Code
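The steps above (greedy head selection, find cycle and contract, recalculate edge weights, recurse, expand) can be sketched in Python as follows; it is a minimal recursive version, with illustrative names, not an optimized implementation.

```python
def chu_liu_edmonds(score):
    """Chu-Liu-Edmonds maximum spanning arborescence rooted at node 0.

    score[h][m] is the weight of the arc h -> m (node 0 = ROOT).
    Returns head[m] for every node, with head[0] = 0.
    """
    n = len(score)
    NEG = float("-inf")
    # greedily pick the best incoming arc for every non-root node
    head = [0] * n
    for m in range(1, n):
        head[m] = max((h for h in range(n) if h != m), key=lambda h: score[h][m])

    # find a cycle in the greedy assignment, if any
    cycle = None
    for start in range(1, n):
        seen, v = set(), start
        while v != 0 and v not in seen:
            seen.add(v)
            v = head[v]
        if v != 0:  # the walk re-entered a node: v lies on a cycle
            cycle = [v]
            u = head[v]
            while u != v:
                cycle.append(u)
                u = head[u]
            break
    if cycle is None:
        return head  # the greedy choice is already a tree

    # contract the cycle into one node and recalculate edge weights
    C = set(cycle)
    cycle_score = sum(score[head[v]][v] for v in cycle)
    outside = [v for v in range(n) if v not in C]
    idx = {v: i for i, v in enumerate(outside)}
    c = len(outside)  # index of the contracted cycle node
    new_score = [[NEG] * (c + 1) for _ in range(c + 1)]
    enter, leave = {}, {}
    for u in outside:
        for v in outside:
            if u != v:
                new_score[idx[u]][idx[v]] = score[u][v]
        for v in C:  # arc u -> cycle: add u->v, drop the old arc into v
            w = cycle_score + score[u][v] - score[head[v]][v]
            if w > new_score[idx[u]][c]:
                new_score[idx[u]][c] = w
                enter[u] = v
    for v in outside:
        for u in C:  # arc cycle -> v: pick the best cycle-internal source
            if score[u][v] > new_score[c][idx[v]]:
                new_score[c][idx[v]] = score[u][v]
                leave[v] = u

    sub = chu_liu_edmonds(new_score)  # recurse on the contracted graph

    # expand the contracted solution back to the original nodes
    result = [0] * n
    for m in range(1, c):
        v = outside[m]
        result[v] = outside[sub[m]] if sub[m] < c else leave[v]
    for v in cycle:
        result[v] = head[v]  # keep the cycle arcs...
    h0 = outside[sub[c]]
    result[enter[h0]] = h0   # ...except the one replaced by the entering arc
    return result
```

On a 3-node example where the two words score each other highest (forming a cycle), the contraction correctly breaks the cycle with the better root attachment.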

Arc Weights

Arc Feature Ideas for f(i, j, k): identities of the words w_i and w_j and the label l_k; part-of-speech tags of the words w_i and w_j and the label l_k; part-of-speech tags of the words surrounding and between w_i and w_j; the number of words between w_i and w_j, and their orientation; combinations of the above.
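These feature ideas can be sketched as a small extraction function; the template names are illustrative, not the lecture's exact feature set.

```python
def arc_features(words, tags, i, j, label):
    """Feature tuples for the arc w_i -> w_j with the given label
    (a minimal sketch of the arc feature ideas above)."""
    direction = "left" if j < i else "right"
    distance = abs(i - j)
    between = tuple(tags[min(i, j) + 1:max(i, j)])  # POS tags between the words
    return [
        ("h-word+label", words[i], label),
        ("m-word+label", words[j], label),
        ("h-pos+label", tags[i], label),
        ("m-pos+label", tags[j], label),
        ("h-m-words", words[i], words[j]),
        ("h-m-pos", tags[i], tags[j]),
        ("between-pos", tags[i], between, tags[j]),
        ("dir-dist", direction, distance),
    ]
```

Applied to the arc went → As in the running example, this fires word, tag, context, and direction/distance features for that single arc.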

First-Order Feature Computation: example for the arc went → As in "* As McGwire neared , fans went wild" (direction left, distance 5). Templates fire over the head and modifier words, their fine and coarse POS tags (VBD/VERB for went, ADP/IN for As), the tags surrounding and between them, and conjunctions with direction and distance, e.g. [went, VBD], [As, ADP], [went, As], [VBD, ADP], [NNS, VBD, ADP], [went, VBD, As, ADP], [went, left, 5], [VBD, ADP, left, 5], and so on. A single arc can easily fire dozens of features.

(Structured) Perceptron
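A structured perceptron trains the arc weights directly against the decoder. A minimal sketch, assuming generic `decode` and `feats` callables supplied by the user (e.g. Eisner or Chu-Liu-Edmonds over arc scores, and sparse tree features):

```python
from collections import defaultdict

def structured_perceptron(sentences, gold_trees, feats, decode, epochs=10):
    """Structured perceptron training loop (a minimal sketch).

    decode(sentence, w) returns the highest-scoring tree under weights w;
    feats(sentence, tree) returns a sparse feature dict for a full tree.
    On each mistake, the weights move toward the gold tree's features
    and away from the predicted tree's features.
    """
    w = defaultdict(float)
    for _ in range(epochs):
        for sent, gold in zip(sentences, gold_trees):
            pred = decode(sent, w)
            if pred != gold:
                for f, v in feats(sent, gold).items():
                    w[f] += v
                for f, v in feats(sent, pred).items():
                    w[f] -= v
    return dict(w)
```

In practice one would also average the weight vectors over updates (parameter averaging), which substantially improves generalization.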

Transition-Based Dependency Parsing: process the sentence left to right; different transition strategies are available; decisions can be delayed by pushing words onto a stack.

Arc-Standard Transition Strategy [Nivre 03]
Initial configuration: ([], [0, ..., n], [])
Terminal configuration: ([0], [], A)
shift: (σ, [i|β], A) → ([σ|i], β, A)
left-arc(l): ([σ|i|j], B, A) → ([σ|j], B, A ∪ {(j, l, i)})
right-arc(l): ([σ|i|j], B, A) → ([σ|i], B, A ∪ {(i, l, j)})

Arc-Standard Example: parsing "I booked a flight to Lisbon" (stack | buffer after each action):

SHIFT: [I] | [booked a flight to Lisbon]
SHIFT: [I booked] | [a flight to Lisbon]
LEFT-ARC(nsubj): [booked] | [a flight to Lisbon], adding arc nsubj(booked, I)
SHIFT: [booked a] | [flight to Lisbon]
SHIFT: [booked a flight] | [to Lisbon]
LEFT-ARC(det): [booked flight] | [to Lisbon], adding arc det(flight, a)
SHIFT: [booked flight to] | [Lisbon]
SHIFT: [booked flight to Lisbon] | []
RIGHT-ARC(pobj): [booked flight to] | [], adding arc pobj(to, Lisbon)
RIGHT-ARC(prep): [booked flight] | [], adding arc prep(flight, to)
RIGHT-ARC(dobj): [booked] | [], adding arc dobj(booked, flight)

Final tree: nsubj(booked, I), det(flight, a), dobj(booked, flight), prep(flight, to), pobj(to, Lisbon).
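The transition sequence above can be replayed mechanically. A minimal sketch (the `arc_standard` helper is illustrative); a final right-arc from ROOT is appended to reach the terminal configuration ([0], [], A):

```python
def arc_standard(n_words, actions):
    """Replay a sequence of arc-standard transitions (a minimal sketch).

    Words are 1-indexed and 0 is ROOT.  Each action is (name, label).
    Returns the arc set A as (head, label, dependent) triples.
    """
    stack = [0]
    buffer = list(range(1, n_words + 1))
    arcs = []
    for name, label in actions:
        if name == "SHIFT":        # (sigma, [i|beta], A) -> ([sigma|i], beta, A)
            stack.append(buffer.pop(0))
        elif name == "LEFT-ARC":   # ([sigma|i|j], B, A) -> ([sigma|j], B, A+{(j,l,i)})
            j, i = stack.pop(), stack.pop()
            arcs.append((j, label, i))
            stack.append(j)
        elif name == "RIGHT-ARC":  # ([sigma|i|j], B, A) -> ([sigma|i], B, A+{(i,l,j)})
            j, i = stack.pop(), stack.pop()
            arcs.append((i, label, j))
            stack.append(i)
    return arcs

# the example derivation for "I booked a flight to Lisbon"
actions = [("SHIFT", None), ("SHIFT", None), ("LEFT-ARC", "nsubj"),
           ("SHIFT", None), ("SHIFT", None), ("LEFT-ARC", "det"),
           ("SHIFT", None), ("SHIFT", None), ("RIGHT-ARC", "pobj"),
           ("RIGHT-ARC", "prep"), ("RIGHT-ARC", "dobj"), ("RIGHT-ARC", "root")]
arcs = arc_standard(6, actions)
```

Each sentence of length n is parsed in exactly 2n transitions (n shifts and n arc actions, counting the root attachment), which is the source of the O(n) runtime.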

Features for predicting the next action (SHIFT? LEFT-ARC? RIGHT-ARC?) from the current configuration (stack ending in flight, buffer starting with to): stack-top word = flight; stack-top POS tag = NOUN; buffer-front word = to; child of stack-top word = a; ...

SVM / Structured Perceptron Hyperparameters: regularization; loss function; hand-crafted features.

Features: ZPar Parser

# from single words
pair { stack.tag stack.word }
stack { word tag }
pair { input.tag input.word }
input { word tag }
pair { input(1).tag input(1).word }
input(1) { word tag }
pair { input(2).tag input(2).word }
input(2) { word tag }

# from word pairs
quad { stack.tag stack.word input.tag input.word }
triple { stack.tag stack.word input.word }
triple { stack.word input.tag input.word }
triple { stack.tag stack.word input.tag }
triple { stack.tag input.tag input.word }
pair { stack.word input.word }
pair { stack.tag input.tag }
pair { input.tag input(1).tag }

# from word triples
triple { input.tag input(1).tag input(2).tag }
triple { stack.tag input.tag input(1).tag }
triple { stack.head(1).tag stack.tag input.tag }
triple { stack.tag stack.child(-1).tag input.tag }
triple { stack.tag stack.child(1).tag input.tag }
triple { stack.tag input.tag input.child(-1).tag }

# distance
pair { stack.distance stack.word }
pair { stack.distance stack.tag }
pair { stack.distance input.word }
pair { stack.distance input.tag }
triple { stack.distance stack.word input.word }
triple { stack.distance stack.tag input.tag }

# valency
pair { stack.word stack.valence(-1) }
pair { stack.word stack.valence(1) }
pair { stack.tag stack.valence(-1) }
pair { stack.tag stack.valence(1) }
pair { input.word input.valence(-1) }
pair { input.tag input.valence(-1) }

# unigrams
stack.head(1) { word tag }
stack.label
stack.child(-1) { word tag label }
stack.child(1) { word tag label }
input.child(-1) { word tag label }

# third order
stack.head(1).head(1) { word tag }
stack.head(1).label
stack.child(-1).sibling(1) { word tag label }
stack.child(1).sibling(-1) { word tag label }
input.child(-1).sibling(1) { word tag label }
triple { stack.tag stack.child(-1).tag stack.child(-1).sibling(1).tag }
triple { stack.tag stack.child(1).tag stack.child(1).sibling(-1).tag }
triple { stack.tag stack.head(1).tag stack.head(1).head(1).tag }
triple { input.tag input.child(-1).tag input.child(-1).sibling(1).tag }

# label set
pair { stack.tag stack.child(-1).label }
triple { stack.tag stack.child(-1).label stack.child(-1).sibling(1).lab
quad { stack.tag stack.child(-1).label stack.child(-1).sibling(1).label
pair { stack.tag stack.child(1).label }
triple { stack.tag stack.child(1).label stack.child(1).sibling(-1).labe
quad { stack.tag stack.child(1).label stack.child(1).sibling(-1).label
pair { input.tag input.child(-1).label }
triple { input.tag input.child(-1).label input.child(-1).sibling(1).lab
quad { input.tag input.child(-1).label input.child(-1).sibling(1).label

Neural Network Transition-Based Parser [Chen & Manning 14] and [Weiss et al. 15]

Atomic inputs are indicator features of the current configuration, e.g. f0 = 1[buffer0-word = "to"], f1 = 1[buffer1-word = "Bilbao"], f2 = 1[buffer0-pos = IN], f3 = 1[stack0-label = pobj]. These feed an embedding layer (separate embeddings for words, POS tags, and arc labels), followed by one or two hidden layers and a softmax over transitions. Weiss et al. (2015) additionally train a structured perceptron on top of the network's activations, scoring f · w.
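The forward pass of such a network can be sketched in a few lines of numpy; the function name and all shapes are assumptions for illustration, and a ReLU stands in for Chen & Manning's cube activation.

```python
import numpy as np

def nn_transition_scores(feat_ids, E, W1, b1, W2, b2):
    """Forward pass of a Chen & Manning-style parser (a minimal sketch).

    feat_ids index rows of the embedding matrix E (word, POS, and label
    features of the current configuration).  The embeddings are
    concatenated, passed through one hidden layer, and a softmax gives
    a distribution over transitions.
    """
    h_in = E[feat_ids].reshape(-1)       # concatenated embedding layer
    h = np.maximum(0.0, W1 @ h_in + b1)  # hidden layer (ReLU here; Chen &
                                         # Manning use a cube activation)
    logits = W2 @ h + b2                 # one logit per transition
    p = np.exp(logits - logits.max())    # numerically stable softmax
    return p / p.sum()
```

At parse time, the argmax transition is applied to the configuration and the pass is repeated; a beam keeps the B best action sequences instead.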

Regularization, loss function, and NN hyperparameters: dimensions; activation function; initialization; Adagrad; dropout; mini-batch size; initial learning rate; learning rate schedule; momentum; stopping time; parameter averaging.

NN Hyperparameters: optimization matters! Use random restarts and grid search, and pick the best model using holdout data. Tune: WSJ section 24; Dev: WSJ section 22; Test: WSJ section 23.

Random Restarts: How Much Variance? Scatter plot of per-restart UAS on the WSJ tune set (roughly 91.2-92.0%) against the WSJ dev set (roughly 92.0-92.7%) for the 200 and 200x200 networks, with and without pretraining. A second hidden layer plus pretraining increases the tune/dev correlation.

Effect of Embedding Dimensions: word-embedding tuning on WSJ (tune set, D_pos = D_labels = 32). UAS (roughly 89.5-92%) as a function of word embedding dimension D_words from 1 to 128, for the 200 and 200x200 networks with and without pretraining.

Effect of Embedding Dimensions: POS/label tuning on WSJ (tune set, D_words = 64). UAS (roughly 90.5-92%) as a function of the POS/label embedding dimension (D_pos, D_labels) from 1 to 32, for the same four configurations.

Tri-Training [Zhou et al. 05, Li et al. 14]
ZPar Parser: UAS 89.96, LAS 87.26
Berkeley Parser: UAS 89.84, LAS 87.21
On the ~40% of sentences where the two parsers agree: UAS 96.35, LAS 95.02

Tri-Training Impact on WSJ section 22 (dev): UAS (roughly 92.5-94.5%) for Linear (ZN2011), NN (B=1), and NN (B=8), each trained on WSJ alone, WSJ + up-training, and WSJ + tri-training; the Berkeley (converted) parser is shown as a reference line. The NN model benefits more from the additional data; the linear ZN model does not improve even when using an alternative hypergraph model for tri-training.

English Results (WSJ section 23)

Method                                    UAS    LAS    Beam
Graph-based: 3rd-order (ZM2014)           93.22  91.02  -
Transition-based: Linear (ZN2011)         93.00  90.95  32
NN Baseline (Chen & Manning, 2014)        91.80  89.60  1
NN Better SGD (Weiss et al., 2015)        92.58  90.54  1
NN Deeper Network (Weiss et al., 2015)    93.19  91.18  1
NN Perceptron (Weiss et al., 2015)        93.99  92.05  8
NN Semi-supervised (Weiss et al., 2015)   94.26  92.41  8
S-LSTM (Dyer et al., 2015)                93.20  90.90  1
Contrastive NN (Zhou et al., 2015)        92.83  -      100

English Out-of-Domain Results: train on WSJ + Web Treebank + QuestionBank, evaluate on Web text. UAS (roughly 87-89.25%) for 3rd-order graph (ZM2014), transition-based linear (ZN2011, B=32), transition-based NN (B=1), and transition-based NN (B=8), in both supervised and semi-supervised settings.

Multilingual Results: LAS (roughly 75-93%) on Catalan, Chinese, Czech, English, German, Japanese, and Spanish for 3rd-order graph (ZM2014), tensor-based graph (Lei et al., 2014), transition-based linear (ZN2011), and transition-based NN (B=32). Chart annotations: no tri-training data used here; with morph features. [Alberti et al., in submission]

Summary
Constituency Parsing: CKY algorithm; lexicalized grammars; latent variable grammars; conditional random field parsing; neural network representations.
Dependency Parsing: Eisner algorithm; maximum spanning tree algorithm; transition-based parsing; neural network representations.