
Annotating Genericity: How Do Humans Decide? A Case Study in Ontology Extraction


  • Aurelie Herbelot, Computer Laboratory, University of Cambridge
  • Friday 25 January 2008, 12:00-13:00
  • SW01, Computer Laboratory

If you have a question about this talk, please contact Johanna Geiss.

This talk deals with the identification of kind versus non-kind entities in natural language text for ontology extraction. The following two sentences illustrate why genericity annotations matter when building ontologies:

  • The whale is a mammal.
  • The whale rescued the scuba diver.

Given this input, an ontology extraction system would typically output the relations ‘whale - is_a - mammal’ and ‘whale - rescue - scuba diver’. Inserted as such into a real-world ontology, these relations may give the user the false impression that rescuing scuba divers is a general feature of whales. To prevent this reading, the first ‘whale’ must be tagged with a generic label and the second with a specific label.
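As a rough illustration of why the labels matter, the sketch below stores each extracted relation together with a genericity tag on its subject and keeps only the kind-level relations for the ontology. The `Relation` class, the `subject_genericity` field and the label strings "generic" and "specific" are hypothetical names chosen for this example; they are not taken from the annotation scheme presented in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """One extracted triple plus a genericity tag on its subject (illustrative only)."""
    subject: str
    predicate: str
    obj: str
    subject_genericity: str  # assumed label set: "generic" or "specific"


def kind_level_relations(relations):
    """Keep only relations whose subject was tagged as generic (kind-referring)."""
    return [r for r in relations if r.subject_genericity == "generic"]


# The two example sentences, as an extraction system might represent them.
extracted = [
    Relation("whale", "is_a", "mammal", "generic"),
    Relation("whale", "rescue", "scuba diver", "specific"),
]

for r in kind_level_relations(extracted):
    print(f"{r.subject} - {r.predicate} - {r.obj}")
```

With the tags in place, only the is_a relation would be asserted as a property of whales in general; the rescue event stays attached to one particular whale.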

The task of genericity annotation using machine learning relies on a training corpus. Available corpora, however, are limited in the genres they cover and, more importantly, in the range of labels they use to describe the genericity phenomenon. The public annotation schemes linked to those corpora are also often simplified and/or domain-specific. With a view to producing our own training corpus, we propose here an annotation scheme that covers the kind versus object distinction, the specificity phenomenon and reference resolution. The scheme is not domain-specific and, on a small test set from the British National Corpus, yielded an inter-annotator agreement of Kappa = 0.74.
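For readers unfamiliar with the agreement figure, the sketch below computes Cohen's kappa for two annotators, which corrects observed agreement for the agreement expected by chance. The abstract does not say which kappa variant was used or how many annotators took part, so the two-annotator Cohen's formulation is an assumption, and the label sequences are invented toy data, not the British National Corpus test set.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item set")
    n = len(labels_a)
    # Proportion of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's own label distribution.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Invented toy annotations, for illustration only.
annotator_a = ["generic", "generic", "specific", "specific", "generic", "specific"]
annotator_b = ["generic", "specific", "specific", "specific", "generic", "specific"]

kappa = cohen_kappa(annotator_a, annotator_b)
print(round(kappa, 3))
```

On this toy data the annotators agree on five of six items, and chance agreement is 0.5, giving a kappa of about 0.67.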

We will discuss the scheme, our choice of labels, and the various problems attached to the manual annotation of genericity. In particular, we will show the importance of reference resolution for accurate annotation.

This talk is part of the NLIP Seminar Series.

