Building modern dataflow systems

If you have a question about this talk, please contact Marco Caballero.

Abstract: I’ll talk through the design and implementation of “timely dataflow in Rust”, an open-source project that extends and enriches the “timely dataflow” computational model first presented by the Naiad system, and the differential dataflow framework built on top of it. The project’s goal is to provide a near-zero overhead framework for data-parallel dataflow computation, and to this end it simplifies and unifies several of Naiad’s concepts through lossless abstractions that largely compile away. Our experience has been that timely dataflow programs give best-in-class performance while still providing the experience of a medium-to-high level programming language. To support this, I’ll walk through the example of differential dataflow, an incremental re-computation framework that seems to outperform the current crop of specialized data processing systems, in part due to its ability to provide general computation abstractions that compile down to sequential scans over carefully managed resources.

https://github.com/frankmcsherry/timely-dataflow
https://github.com/frankmcsherry/differential-dataflow

These projects reflect joint work with a great many people, including what was once the Naiad team at MSR-SV, the Systems Group at ETH Zürich, and many other collaborators.

Bio: Frank McSherry received his PhD from the University of Washington, working with Anna Karlin on spectral analysis of data. He then spent twelve years at Microsoft Research’s Silicon Valley research center, working on topics ranging from differential privacy to data-parallel computation. He currently works at ETH Zürich’s Systems Group on scalable stream processing and related topics.

This talk is part of the Computer Laboratory Systems Research Group Seminar series.

© 2006-2024 Talks.cam, University of Cambridge.