BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Semi-supervised Training of a Statistical Parser from Unlabeled Pa
 rtially-bracketed Data - John Carroll - Department of Informatics\, Univer
 sity of Sussex
DTSTART:20070615T140000Z
DTEND:20070615T150000Z
UID:TALK7470@talks.cam.ac.uk
CONTACT:NLIP Seminars
DESCRIPTION:We compare the accuracy of a statistical parse ranking model t
 rained from a fully-annotated portion of the Susanne treebank with one tra
 ined from unlabeled partially-bracketed sentences derived from this treeba
 nk and from the Penn Treebank. We demonstrate that confidence-based semi-s
 upervised techniques similar to self-training outperform expectation maxim
 ization when both are constrained by partial bracketing. Both methods base
 d on partially-bracketed training data outperform the fully supervised tec
 hnique\, and both can\, in principle\, be applied to any statistical parse
 r whose output is consistent with such partial bracketing. We also explore
  tuning the model to a different domain and the effect of in-domain data i
 n the semi-supervised training processes.\n\n(This is joint work with Rebe
 cca Watson and Ted Briscoe.)
LOCATION:SW01 Computer Laboratory
END:VEVENT
END:VCALENDAR
