BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:OptiWISE: Combining Sampling and Instrumentation for Granular CPI 
 Analysis - Yuxin Guo\, University of Cambridge
DTSTART:20240227T120000Z
DTEND:20240227T130000Z
UID:TALK212716@talks.cam.ac.uk
CONTACT:Timothy Jones
DESCRIPTION:This is a practice talk for CGO on the 5th March 2024.\n\nDesp
 ite decades of improvement in compiler technology\, it remains necessary t
 o profile applications to improve performance. Existing profiling tools ty
 pically either sample hardware performance counters or instrument the prog
 ram with extra instructions to analyze its execution. Both techniques are 
 valuable with different strengths and weaknesses\, but do not always corre
 ctly identify optimization opportunities.\n\nWe present OptiWISE\, a profi
 ling tool that runs the program twice\, once with low-overhead sampling to
  accurately measure performance\, and once with instrumentation to accurat
 ely capture control flow and execution counts. OptiWISE then combines this
  information to give a highly detailed per-instruction CPI metric by compu
 ting the ratio of samples to execution counts\, as well as aggregated info
 rmation such as costs per loop\, source-code line\, or function.\n\nWe eva
 luate OptiWISE to show it has an overhead of 8.1× geomean\, and 57× wors
 t case on SPEC CPU2017 benchmarks. Using OptiWISE\, we present case stud
 ies of optimizing selected SPEC benchmarks on a modern x86 server processo
 r. The per-instruction CPI metrics quickly reveal problems such as costly 
 mispredicted branches and cache misses\, which we use to manually optimize
  for effective performance improvements.
LOCATION:SC04\, Computer Laboratory\, William Gates Building
END:VEVENT
END:VCALENDAR
