BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Dirichlet Process Mixtures of Multivariate Skew t-distributions fo
 r Unsupervised Clustering of Cell Populations from Flow-Cytometry Data - D
 r Boris Hejblum\, University of Bordeaux
DTSTART:20170912T133000Z
DTEND:20170912T143000Z
UID:TALK76722@talks.cam.ac.uk
CONTACT:Alison Quenault
DESCRIPTION:Flow cytometry is a high-throughput technology used to quantif
 y multiple surface and intracellular markers at the level of a single cell
 .  This makes it possible to identify cell sub-types and to determine
  their relative proportions. Improvements in this technology now allow
  millions of individual cells from a blood sample to be described using
  multiple markers.  This res
 ults in large datasets\, whose manual analysis is highly time-consuming an
 d poorly reproducible.  While several methods have been developed to perfo
 rm automatic recognition of cell populations\, most of them treat and anal
 yze each sample independently.  However\, in practice\, individual samples
  are rarely independent (e.g. longitudinal studies).  Here\, we propose to
  use a Bayesian nonparametric approach with Dirichlet process mixture (DPM
 ) of multivariate skew t-distributions to perform unsupervised model-based
  clustering of flow-cytometry data. DPM models directly estimate the numbe
 r of cell populations from the data\, avoiding model selection issues\, an
 d skew t-distributions provide robustness to outliers and to
  non-elliptical shapes of cell populations.  To accommodate repeated
  measurements\, we prop
 ose a sequential strategy relying on a parametric approximation of the pos
 terior.  We illustrate the good performance of our method on simulated dat
 a\, on an experimental benchmark dataset\, and on new longitudinal data fr
 om the DALIA-1 trial which evaluates a therapeutic vaccine against HIV.  O
 n the benchmark dataset\, the sequential strategy outperforms all other
  methods evaluated\, and it similarly leads to improved performance on
  the DAL
 IA-1 data.  We have implemented an efficient partially collapsed Gibbs sam
 pler with a Metropolis-Hastings step using slice-sampling to estimate the 
 posterior partition of the data\, available from CRAN in the R package NPf
 low.
LOCATION:Large Seminar Room\, 1st Floor\, Institute of Public Health\, Un
 iversity Forvie Site\, Robinson Way\, Cambridge
END:VEVENT
END:VCALENDAR