Next-generation data-parallel dataflow systems

If you have a question about this talk, please contact Eiko Yoneki.

The Naiad project at Microsoft Research introduced a new model of dataflow computation, timely dataflow, which was designed to support low-latency computation in data-parallel dataflow graphs containing structured cycles. This model substantially enlarged the space of data-parallel computations that can be reasonably expressed, as compared to other modern “big data” systems. Naiad achieved excellent performance in its intended application domains, largely by providing the dataflow operators with meaningful and low-overhead coordination primitives, but otherwise staying out of their way.

In this talk we will discuss performance issues with existing systems, review timely dataflow, and present a new data-parallel design that coordinates less frequently yet more accurately. The design is largely implemented in 100% safe Rust, is available at https://github.com/frankmcsherry/timely-dataflow, and currently outperforms several popular distributed systems even when run on the speaker’s laptop.

This talk reflects work done jointly with Derek Murray, Rebecca Isaacs, Michael Isard, Paul Barham, and Martin Abadi. The photo credit is due to Mihai Budiu.

Bio: Frank McSherry is an independent researcher formerly affiliated with Microsoft Research, Silicon Valley. While there he led the Naiad project, which introduced both differential and timely dataflow, and remains one of the top-performing big data platforms. He also works with differential privacy, due in part to its interesting relationship to data-parallel computation. Frank currently enjoys spending his time in places other than Silicon Valley.

This talk is part of the Computer Laboratory Systems Research Group Seminar series.
