BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//talks.cam.ac.uk//v3//EN
BEGIN:VTIMEZONE
TZID:Europe/London
BEGIN:DAYLIGHT
TZOFFSETFROM:+0000
TZOFFSETTO:+0100
TZNAME:BST
DTSTART:19700329T010000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0100
TZOFFSETTO:+0000
TZNAME:GMT
DTSTART:19701025T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
CATEGORIES:NLIP Seminar Series
SUMMARY:NAACL practice talks - Simon Baker (LTL) & Marek Rei (NLIP)\, University of Cambridge
DTSTART;TZID=Europe/London:20180525T120000
DTEND;TZID=Europe/London:20180525T130000
UID:TALK104650@talks.cam.ac.uk
URL:http://talks.cam.ac.uk/talk/index/104650
DESCRIPTION:*Zero-shot Sequence Labeling: Transferring Knowledge from Sentences to Tokens*\n\nMarek Rei & Anders Søgaard\n\nCan attention- or gradient-based visualization techniques be used to infer token-level labels for binary sequence tagging problems\, using networks trained only on sentence-level labels?\nWe construct a neural network architecture based on soft attention\, train it as a binary sentence classifier and evaluate against token-level annotation on four different datasets. Inferring token labels from a network provides a method for quantitatively evaluating what the model is learning\, along with generating useful feedback in assistance systems.\nOur results indicate that attention-based methods are able to predict token-level labels more accurately than gradient-based methods\, sometimes even rivaling the supervised oracle network.\n\n*Variable Typing: Assigning Meaning to Variables in Mathematical Text*\n\nYiannos A. Stathopoulos\, Simon Baker\, Marek Rei & Simone Teufel\n\nInformation about the meaning of mathematical variables in text is useful in NLP/IR tasks such as symbol disambiguation\, topic modeling and mathematical information retrieval (MIR). We introduce variable typing\, the task of assigning one mathematical type (a multi-word technical term referring to a mathematical concept) to each variable in a sentence of mathematical text. As part of this work\, we also introduce a new annotated data set composed of 33\,524 data points extracted from scientific documents published on arXiv. Our intrinsic evaluation demonstrates that our data set is sufficient to successfully train and evaluate current classifiers from three different model architectures. The best performing model is evaluated on an extrinsic task: MIR\, by producing a typed formula index. Our results show that the best performing MIR models make use of our typed index rather than a formula index containing only raw symbols\, thereby demonstrating the usefulness of variable typing.
LOCATION:FW26\, Computer Laboratory
CONTACT:Andrew Caines
END:VEVENT
END:VCALENDAR