A Standard Document Score for Information Retrieval

If you have a question about this talk, please contact Tamara Polajnar.

Ranking functions are a crucial component of information retrieval systems. These functions assign a score to each document in a collection with respect to a static query. Regularising these document scores across different queries is a difficult challenge, and a solution would have many potential applications.

In this talk I will outline a number of different retrieval tasks for which score regularisation is important. I propose and outline a standard document retrieval score based on term frequencies. I will show that the standardisation approach adopted automatically creates a measure of term specificity. An analysis shows that this measure is highly correlated with the traditional idf measure and, furthermore, suggests a novel interpretation of idf-like measures.
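The abstract does not give the formula for the standard score, so the following is only an illustrative sketch under a stated assumption: that "standardisation" means a z-score of each term's raw frequency against its mean and standard deviation over the collection. The function names (`term_stats`, `z_score`, `standard_score`) and the toy collection are hypothetical and are not taken from the talk.

```python
import math
from collections import Counter

def term_stats(docs):
    """Return ({term: (mean, std)}, per-document Counters) for raw term frequencies."""
    counts = [Counter(doc) for doc in docs]
    n = len(counts)
    vocab = set().union(*counts)
    stats = {}
    for term in vocab:
        tfs = [c[term] for c in counts]  # a Counter returns 0 for absent terms
        mean = sum(tfs) / n
        var = sum((tf - mean) ** 2 for tf in tfs) / n  # population variance
        stats[term] = (mean, math.sqrt(var))
    return stats, counts

def z_score(term, doc_counts, stats):
    """Assumed standardisation: (tf - mean) / std; 0.0 if the term has no spread."""
    mean, std = stats.get(term, (0.0, 0.0))
    if std == 0.0:
        return 0.0
    return (doc_counts[term] - mean) / std

def standard_score(query, doc_counts, stats):
    """Sum the standardised term frequencies over the query terms."""
    return sum(z_score(term, doc_counts, stats) for term in query)

# Hypothetical toy collection: "the" is common, "retrieval" is rare.
docs = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "the", "retrieval"],
    ["the", "bird"],
]
stats, counts = term_stats(docs)
rare = z_score("retrieval", counts[2], stats)   # one occurrence of a rare term
common = z_score("the", counts[0], stats)       # one occurrence of a common term
```

In this sketch a single occurrence of the rare term receives a higher standardised value than a single occurrence of the common term, without any explicit idf weight. For a term that occurs at most once per document, the z-score of an occurrence works out to sqrt((N - df) / df), where N is the collection size and df the document frequency, which decreases with df in the way idf-like measures do. Whether this matches the analysis presented in the talk is not something the abstract confirms.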

Finally, I will present an evaluation on a number of different datasets showing that the standard document score is comparable with the BM25 retrieval function in terms of effectiveness. An advantage of the standard document score, however, is that under certain conditions the document scores output by the model are comparable across different queries and collections.

This talk is part of the NLIP Seminar Series.
