BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Unsupervised Morphological Segmentation with Log-Linear Models - D
 iarmuid Ó Séaghdha (University of Cambridge)
DTSTART:20100308T123000Z
DTEND:20100308T133000Z
UID:TALK23651@talks.cam.ac.uk
CONTACT:Diarmuid Ó Séaghdha
DESCRIPTION:At this session of the NLIP Reading Group we’ll be discussin
 g the following paper:\n\nHoifung Poon\, Colin Cherry and Kristina Toutano
 va. 2009. "Unsupervised Morphological Segmentation with Log-Linear Models"
 :http://aclweb.org/anthology-new/N/N09/N09-1024.pdf. In Proceedings of NAA
 CL-09. \n\n*Abstract:*\nMorphological segmentation breaks words into morph
 emes (the basic semantic units). It is a key component for natural languag
 e processing systems. Unsupervised morphological segmentation is attractiv
 e\, because in every language there are virtually unlimited supplies of te
 xt\, but very few labeled resources. However\, most existing model-based s
 ystems for unsupervised morphological segmentation use directed generative
  models\, making it difficult to leverage arbitrary overlapping features t
 hat are potentially helpful to learning. In this paper\, we present the fi
 rst log-linear model for unsupervised morphological segmentation. Our mode
 l uses overlapping features such as morphemes and their contexts\, and inc
 orporates exponential priors inspired by the minimum description length (M
 DL) principle. We present efficient algorithms for learning and inference 
 by combining contrastive estimation with sampling. Our system\, based on m
 onolingual features only\, outperforms a state-of-the-art system by a larg
 e margin\, even when the latter uses bilingual information such as phrasal
  alignment and phonetic correspondence. On the Arabic Penn Treebank\, our 
 system reduces F1 error by 11% compared to Morfessor.\n
LOCATION:GS15\, Computer Laboratory
END:VEVENT
END:VCALENDAR
