University of Cambridge > > Computer Laboratory Systems Research Group Seminar > ROAR: Increasing the Flexibility and Performance of Distributed Search

ROAR: Increasing the Flexibility and Performance of Distributed Search

Add to your list(s) Download to your calendar using vCal

If you have a question about this talk, please contact Eiko Yoneki.

To be able to search the web, modern search engines partition the web index over many machines, each of which is consulted when answering a query. To increase query throughput, replicas are added for each of these machines.

The key parameter of web search algorithms is the trade-off between replication and partitioning: while increasing the partitioning level (which reduces the replication level) improves query completion time since more servers handle the query, increasing it indefinitely incurs non-negligible startup costs for each sub-query. Further, the optimal point changes if we factor in costs related to storing and updating the data set, or if the query rate changes. Finding the right operating point and adapting to it is crucial, yet current algorithms assume a fixed trade-off, with reconfiguration being slow and manual.

In this work, we introduce Rendezvous On a Ring (ROAR), a novel distributed algorithm that enables on-the-fly re-configuring of the partitioning level while still servicing queries. In addition, ROAR can add and remove servers without stopping the system, cope with temporary and permanent server failures, and provide very good load-balancing even in the face of servers having heterogeneous hardware capabilities.

To support these claims, we present results from a 43-server testbed deployment of ROAR .

Bio: I am finishing my PhD at UCL on “Making Privacy Preserving Search Practical”, under the supervision of Mark Handley and David Rosenblum. My thesis examines how we can execute encrypted queries on encrypted data, and shows how to paralelize this search to make it applicable in practice. The latter technique is more general and has broader implications in the design of distributed, data-center based, web search algorithms.

I am also working with Mark Handley and Damon Wischik of UCL (along with others in the Trilogy EU project) on designing a multipath transport protocol and an appropriate congestion control algorithm that achieves resource pooling.

Generally, I am interested in distributed systems and networking, as well as security, and I have worked on a few topics in these fields. More info on my webpage at:

This talk is part of the Computer Laboratory Systems Research Group Seminar series.

Tell a friend about this talk:

This talk is included in these lists:

Note that ex-directory lists are not shown.


© 2006-2022, University of Cambridge. Contact Us | Help and Documentation | Privacy and Publicity