Understanding and Improving the Efficiency of Failure Resilience for Big Data Frameworks
If you have a question about this talk, please contact Microsoft Research Cambridge Talks Admins.
This event may be recorded and made available internally or externally via http://research.microsoft.com. Microsoft will own the copyright of any recordings made. If you do not wish to have your image or voice recorded, please consider this before attending.
Big data processing frameworks (MapReduce, Hadoop, Dryad) are hugely popular today. A strong selling point is their ability to provide failure resilience guarantees: they can run computations to completion despite occasional failures in the system. However, the efficiency of the failure resilience they provide has been overlooked. The vision of this work is that big data frameworks should not only finish computations under failures but also minimize the impact of those failures on computation time.
The first part of the talk presents the first in-depth analysis of the efficiency of the failure resilience provided by the popular Hadoop framework at the level of a single job. The results show that compute node failures can lead to variable and unpredictable job running times. The causes behind these results are detailed in the talk. The second part of the talk focuses on providing failure resilience at the level of multi-job computations. It presents the design, implementation and evaluation of RCMP, a MapReduce system based on the fundamental insight that using replication as the main failure resilience strategy often leads to significant and unnecessary increases in computation running time. In contrast, RCMP is designed to use job re-computation as a first-order failure resilience strategy. RCMP enables re-computations that perform the minimum amount of work, and it maximizes the efficiency of the re-computation work that still needs to be performed.
This talk is part of the Microsoft Research Cambridge, public talks series.