From Pose Estimation to Fine Grained Activity Recognition
- Speaker: Micha Andriluka, Max Planck Institute for Informatics
- Date & Time: Thursday 06 September 2012, 15:00 - 16:00
- Venue: Small lecture theatre, Microsoft Research Ltd, 7 J J Thomson Avenue (Off Madingley Road), Cambridge
Abstract
Human pose estimation and activity recognition in monocular images are challenging problems, especially when these tasks must be solved in unconstrained environments such as street scenes. The major sources of complexity are cluttered and dynamically changing backgrounds and the presence of multiple people that often partially or fully occlude each other.
While previous work has largely neglected interactions between people, we show that modeling them is crucial for good performance. In the first part of the talk I will demonstrate this for the detection of people in crowded street scenes and for monocular 3D pose estimation. For people detection we propose a new occlusion-aware detector that exploits the patterns emerging from person-person occlusions, and quantify its performance on several publicly available benchmarks, improving over the state of the art. For human pose estimation we propose to incorporate interactions at two levels. The 2D poses of people are inferred with a multi-person pictorial structures model that captures interactions between subjects. The 3D poses are then recovered by lifting the 2D poses to 3D, relying on a learned joint prior model of human poses and motion. We demonstrate that including interactions between subjects both in 2D and in 3D improves pose estimation results.
In the second part of the talk I will focus on the challenge of fine grained activity recognition, where the goal is to recognize a large number of visually similar activities such as those performed during a complex medical procedure, device maintenance or cooking. I will use cooking activities as a working example and describe our recently introduced dataset, containing over 65 cooking activities and about 9 hours of video footage. I will present initial results on the dataset and discuss open questions related to the use of pose estimation for fine grained activity recognition.
Series This talk is part of the Microsoft Research Cambridge, public talks series.
Included in Lists
- All Talks (aka the CURE list)
- bld31
- Cambridge Centre for Data-Driven Discovery (C2D3)
- Cambridge talks
- Chris Davis' list
- Guy Emerson's list
- Interested Talks
- Microsoft Research Cambridge, public talks
- ndk22's list
- ob366-ai4er
- Optics for the Cloud
- personal list
- PMRFPS's
- rp587
- School of Technology
- Small lecture theatre, Microsoft Research Ltd, 7 J J Thomson Avenue (Off Madingley Road), Cambridge
- Trust & Technology Initiative - interesting events
- yk449
Note: Ex-directory lists are not shown.