University of Cambridge > Talks.cam > Computer Laboratory Systems Research Group Seminar > GraphChi and GraphChi-DB: large-scale graph computation on just a PC

GraphChi and GraphChi-DB: large-scale graph computation on just a PC

Add to your list(s) Download to your calendar using vCal

If you have a question about this talk, please contact Eiko Yoneki.

Current systems for graph computation require a distributed computing cluster to handle very large real-world problems, such as analysis on social networks or the web graph. While distributed computational resources have become more accessible, developing distributed graph algorithms still remains challenging, especially to non-experts.

We present GraphChi (OSDI¹12), a disk-based system for computing efficiently on graphs with billions of edges. By using a well-known method to break large graphs into small parts, and a novel Parallel Sliding Windows algorithm, GraphChi is able to execute several advanced data mining, graph mining, and machine learning algorithms on very large graphs, using just a single consumer-level computer. We show, through experiments and theoretical analysis, that GraphChi performs well on both SSDs and rotational hard drives.

We build on the basis of Parallel Sliding Windows to propose a new data structure, the Partitioned Adjacency Lists, which we use to design an online graph database, GraphChi-DB. We demonstrate that, on a single PC, GraphChi-DB can process over one hundred thousand graph updates per second, while simultaneously performing computation. GraphChi-DB compares favorable to existing graph databases, particularly on data that is much larger than the available memory.

We will then look at our research in hindsight and discuss what is GraphChi good for, and what it is not optimal for, and also discuss competing systems that have been proposed since. I will conclude with some remarks on open research problems.

Bio: Aapo Kyrola (Ph.D. Carnegie Mellon University, 2014) is a computer scientist, software engineer and startup entrepreneur. The main contribution of his Ph.D. Thesis is GraphChi (OSDI ¹12), a disk-based graph computation system that can handle as large graphs as many distributed systems, but on just a single PC or laptop. Mr. Kyrola also worked in the GraphLab project (UAI 2010, VLDB 2012 ) and proposed a parallel algorithm for L1-regularized minimization problems (ICML 2011). In CMU , he was advised by professors Carlos Guestrin and Guy Blelloch.

Prior to his Ph.D. Studies, in 2000 Mr. Kyrola co-founded Habbo Hotel (habbo.com), a pioneering web-based virtual world that at its peak attracted over 15 million monthly users and 50 million euros of revenue. During his Ph.D. studies, he co-founded another startup that developed a novel automatic activity tracker app Moves for iOS and Android. Moves was chosen as one of the best apps by both Apple and Google in 2013. Facebook acquired Moves in 2014, and now Mr. Kyrola works as a software engineer in Facebook’s London office.

This talk is part of the Computer Laboratory Systems Research Group Seminar series.

Tell a friend about this talk:

This talk is included in these lists:

Note that ex-directory lists are not shown.

 

© 2006-2020 Talks.cam, University of Cambridge. Contact Us | Help and Documentation | Privacy and Publicity