Accurate CCG Parsing with Approximate Language Intersection and Task-specific Optimization

If you have a question about this talk, please contact Thomas Lippincott.

Combinatory Categorial Grammar (CCG) parsing is a longstanding problem in computational linguistics, due to the complexities associated with its mild context-sensitivity. Via an oracle experiment, we show that the upper bound on accuracy of a CCG parser is significantly lowered when its search space is pruned using a supertagger, though the supertagger also prunes many bad parses.

Inspired by this analysis, we design a single model with both supertagging and parsing features, rather than separating them into distinct models chained together in a pipeline. To overcome the resulting complexity, we experiment with two approximation algorithms for language intersection: loopy belief propagation and dual decomposition.
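As a rough illustration of the dual decomposition idea mentioned above, the following is a minimal sketch and not the method presented in the talk. It assumes a toy setting: two hypothetical scorers (`f_scores` and `g_scores`, standing in for the two sub-models) each score every tag at every position independently, and Lagrange multipliers `u` are adjusted by subgradient steps until both scorers choose the same tags. All names and numbers are illustrative assumptions.

```python
def argmax_tags(scores):
    """Pick the highest-scoring tag index independently at each position."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


def dual_decompose(f_scores, g_scores, iterations=50):
    """Toy dual decomposition: make two scorers agree on one tag sequence.

    f_scores[i][t] and g_scores[i][t] are hypothetical scores that each
    sub-model assigns to tag t at position i. Returns (tags, agreed).
    """
    n, k = len(f_scores), len(f_scores[0])
    u = [[0.0] * k for _ in range(n)]  # Lagrange multipliers, one per (position, tag)
    y = []
    for step in range(1, iterations + 1):
        # Each sub-model is solved separately, with the multipliers added to
        # one model's scores and subtracted from the other's.
        y = argmax_tags([[f_scores[i][t] + u[i][t] for t in range(k)] for i in range(n)])
        z = argmax_tags([[g_scores[i][t] - u[i][t] for t in range(k)] for i in range(n)])
        if y == z:
            return y, True  # agreement: the shared solution is optimal for the sum
        rate = 1.0 / step  # decaying step size
        for i in range(n):
            if y[i] != z[i]:
                u[i][y[i]] -= rate  # discourage the tag only the first model chose
                u[i][z[i]] += rate  # encourage the tag only the second model chose
    return y, False  # no agreement within the iteration budget


tags, agreed = dual_decompose([[2.5, 0.0], [0.0, 3.0]], [[0.0, 1.0], [1.0, 0.0]])
print(tags, agreed)
```

In this toy case the two scorers disagree at both positions on the first iteration and agree on the second. In a real integrated model each sub-problem would be solved by its own exact inference algorithm rather than a per-position argmax.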

The second part of this talk deals with task-specific optimisation of parsing models. We adopt the softmax-margin training objective, which minimises a bound on expected risk for a given loss function but requires the loss to decompose over the predicted structure; this is not true of F-measure. We present a novel dynamic programming algorithm that allows softmax-margin to be used with F-measure, leading to substantial gains in accuracy on CCGbank.
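To make the softmax-margin objective concrete, here is a minimal sketch that assumes the candidate parses can be enumerated explicitly, which is not the case in a real parser. The loss is the negative gold score plus the log of the summed exponentiated cost-augmented scores. The function names, scores, and the set-based dependency representation are illustrative assumptions, and the talk's dynamic programming algorithm is not reproduced here.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def softmax_margin_loss(scores, costs, gold_index):
    """Softmax-margin loss over an explicitly enumerated candidate set.

    scores[i]: model score of candidate i (hypothetical values).
    costs[i]:  task loss of candidate i against the gold candidate
               (0 for the gold candidate itself).
    """
    augmented = [s + c for s, c in zip(scores, costs)]
    return -scores[gold_index] + log_sum_exp(augmented)


def f1_cost(gold_deps, predicted_deps):
    """1 - F1 between two sets of dependencies."""
    if not gold_deps and not predicted_deps:
        return 0.0
    correct = len(gold_deps & predicted_deps)
    if correct == 0:
        return 1.0
    precision = correct / len(predicted_deps)
    recall = correct / len(gold_deps)
    return 1.0 - 2 * precision * recall / (precision + recall)


# With all costs at zero the loss reduces to ordinary conditional log loss;
# a positive cost on the wrong candidate increases the loss.
plain = softmax_margin_loss([2.0, 1.0], [0.0, 0.0], gold_index=0)
augmented = softmax_margin_loss([2.0, 1.0], [0.0, 1.0], gold_index=0)
print(plain < augmented)
```

Note that `f1_cost` depends on the total numbers of correct and predicted dependencies in the whole structure, so it cannot be written as a sum of independent per-edge costs. This is the decomposition problem the paragraph above refers to.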

Each of the presented methods improves over the state of the art. Moreover, the improvements are additive, and combining them yields the best reported results on this task. Our algorithms are general, and we expect them to apply to other parsing problems, including lexicalized tree adjoining grammar and context-free grammar.

This talk is part of the NLIP Seminar Series.

