Pandia: comprehensive contention-sensitive thread placement

If you have a question about this talk, please contact Liang Wang.

In this talk we introduce Pandia, a system for modelling the performance of in-memory parallel workloads. Pandia generates a description of a workload from a series of profiling runs, and combines this with a description of the machine’s hardware to model the workload’s performance over different thread counts and different placements of those threads.

The approach is “comprehensive” in that it accounts for contention at multiple resources such as processor functional units and memory channels. The points of contention for a workload can shift between resources as the degree of parallelism and thread placement change. Pandia accounts for these changes and provides a close correspondence between predicted performance and actual performance. Testing a set of 22 benchmarks on 2-socket Intel machines fitted with chips ranging from Sandy Bridge to Haswell, we see median differences of 1.05% to 0% between the fastest predicted placement and the fastest measured placement, and median errors of 8% to 4% across all placements.

Pandia can be used to optimize the performance of a given workload, for instance by identifying whether or not multiple processor sockets should be used, and whether or not the workload benefits from using multiple threads per core. In addition, Pandia can be used to identify opportunities for reducing resource consumption where additional resources are not matched by additional performance, for instance by limiting a workload to a small number of cores when its scaling is poor.

Bio: Daniel Goodman is a researcher at Oracle Labs Cambridge, where he works on runtime systems. Prior to this he worked as an RA at Manchester University and Oxford University, looking at dataflow programming and abstractions that make HPC more accessible.

This talk is part of the Computer Laboratory Systems Research Group Seminar series.
