Training Random Forests with Ambiguously Labeled Data
- Speaker: Christian Leistner
- Date & Time: Thursday 07 April 2011, 10:30 - 11:30
- Venue: Small lecture theatre, Microsoft Research Ltd, 7 J J Thomson Avenue (Off Madingley Road), Cambridge
Abstract
A growing number of computer vision applications rely on powerful machine learning algorithms. Learning is usually done with supervised algorithms, which demand large amounts of hand-labeled samples to yield accurate results. Although the number of digital images is exploding, collecting large amounts of labeled data can still be tedious, and when labels exist they can be noisy or provided in a format that the learning method cannot exploit optimally, for example bounding box annotations in images. This motivates the development and use of learning algorithms that can exploit both small amounts of labeled data and large amounts of unlabeled data, which are usually easy to obtain, and that additionally allow for a certain amount of flexibility in the labeling.
In this talk, I will show how to use Random Forests (RFs) to tackle these challenges. RFs deliver state-of-the-art results in various applications. They are fast in both training and evaluation, are inherently multi-class, run on parallel architectures, and are robust to label noise. This makes them perfect candidates for exploiting large amounts of unlabeled or ambiguously labeled samples. On the other hand, they demand large amounts of data to reach their full potential, which in turn motivates incorporating unlabeled samples into their training. In particular, I will present extensions of RFs to semi-supervised and multiple-instance learning as well as to online learning, which is needed in many applications. Finally, I will present a new method that can benefit from unlabeled data even when the samples come from different distributions or are only weakly related to the actual task.
Series
This talk is part of the Microsoft Research Cambridge, public talks series.