Disambiguation of Biomedical Text

If you have a question about this talk, please contact Johanna Geiss.

Like text in other domains, biomedical documents contain many terms with more than one possible meaning. These ambiguities are a significant obstacle to the automatic processing of such texts. Previous approaches to resolving them have drawn on a variety of knowledge sources, including the context in which the ambiguous term is used and domain-specific resources such as the UMLS (Unified Medical Language System). We compare a range of knowledge sources that have been used previously and introduce a novel one: MeSH (Medical Subject Headings) terms. The best performance is obtained by combining linguistic features with MeSH terms, and it exceeds previously reported results on a standard test set.
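To make the idea of combining knowledge sources concrete, here is a minimal sketch, not the method presented in the talk. It assumes a simple Naive Bayes classifier with add-one smoothing, and the names `extract_features` and `NaiveBayesWSD`, the sense labels, and the toy training examples are all invented for illustration. It only shows how context words and document-level MeSH terms might sit side by side in one feature set.

```python
import math
from collections import Counter, defaultdict


def extract_features(context_words, mesh_terms):
    """Merge local context words and document-level MeSH terms into one feature list."""
    feats = [f"word={w.lower()}" for w in context_words]
    feats += [f"mesh={m}" for m in mesh_terms]
    return feats


class NaiveBayesWSD:
    """Toy multinomial Naive Bayes with add-one smoothing (an assumed classifier)."""

    def __init__(self):
        self.sense_counts = Counter()            # number of training examples per sense
        self.feat_counts = defaultdict(Counter)  # feature counts per sense
        self.vocab = set()                       # all features seen in training

    def train(self, examples):
        for feats, sense in examples:
            self.sense_counts[sense] += 1
            for f in feats:
                self.feat_counts[sense][f] += 1
                self.vocab.add(f)

    def predict(self, feats):
        total = sum(self.sense_counts.values())
        best_sense, best_score = None, float("-inf")
        for sense, n in self.sense_counts.items():
            denom = sum(self.feat_counts[sense].values()) + len(self.vocab)
            score = math.log(n / total)  # log prior of the sense
            for f in feats:
                score += math.log((self.feat_counts[sense][f] + 1) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


# Invented toy examples for the ambiguous term "cold".
training = [
    (extract_features(["patient", "virus", "symptoms"],
                      ["Common Cold", "Rhinovirus"]), "common_cold"),
    (extract_features(["storage", "temperature", "samples"],
                      ["Cold Temperature", "Cryopreservation"]), "cold_temperature"),
]

clf = NaiveBayesWSD()
clf.train(training)
predicted = clf.predict(extract_features(["samples", "stored"], ["Cold Temperature"]))
print(predicted)
```

Prefixing each feature with its source (`word=` or `mesh=`) keeps the two knowledge sources distinct, so either one can be dropped to compare their contributions.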

Our approach is supervised and therefore relies on annotated training examples. A novel approach to automatically acquiring additional training data, based on the relevance feedback technique from Information Retrieval, is presented. Applying this method to generate additional training examples is shown to lead to a further increase in performance.
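The abstract does not say which relevance feedback formulation is used. As a rough illustration only, the sketch below assumes the classic Rocchio formulation: it builds a query vector from examples already labelled with a sense, subtracts a fraction of the examples labelled with other senses, and ranks unlabelled text by cosine similarity to pick new training candidates. The function names, the weights `beta` and `gamma`, and the toy token lists are hypothetical.

```python
import math
from collections import Counter


def vectorize(tokens):
    """Bag-of-words term-frequency vector."""
    return Counter(tokens)


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rocchio_query(relevant, non_relevant, beta=0.75, gamma=0.15):
    """Rocchio-style query: weighted centroid of relevant docs minus non-relevant ones."""
    query = Counter()
    for doc in relevant:
        for term, value in doc.items():
            query[term] += beta * value / len(relevant)
    for doc in non_relevant:
        for term, value in doc.items():
            query[term] -= gamma * value / len(non_relevant)
    # Keep only positively weighted terms, as is common practice.
    return Counter({t: w for t, w in query.items() if w > 0})


def acquire_examples(query, unlabelled, sense, top_k=1):
    """Label the top_k unlabelled contexts most similar to the query with the given sense."""
    ranked = sorted(unlabelled,
                    key=lambda tokens: cosine(query, vectorize(tokens)),
                    reverse=True)
    return [(tokens, sense) for tokens in ranked[:top_k]]


# Invented toy data: contexts of "cold" already labelled with each sense.
relevant = [vectorize(["cold", "virus", "infection"]),
            vectorize(["cold", "symptoms", "virus"])]
non_relevant = [vectorize(["cold", "temperature", "storage"])]
unlabelled = [["cold", "storage", "freezer"], ["cold", "virus", "nasal"]]

query = rocchio_query(relevant, non_relevant)
new_examples = acquire_examples(query, unlabelled, "common_cold")
print(new_examples)
```

In a fuller pipeline the retrieved contexts would be added to the training set for the supervised classifier; how many to add and how to filter noisy matches are design choices this sketch leaves open.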

This talk is part of the NLIP Seminar Series.
