On Data (In-)Dependent Hashing
- Speaker: Novi Quadrianto (University of Cambridge)
- Date & Time: Thursday 31 May 2012, 14:00 - 15:30
- Venue: Engineering Department, CBL Room 438
Abstract
I will provide an overview of techniques for approximate nearest neighbor (ANN) search in massive datasets. ANN search has wide-ranging applications, including information retrieval (finding near-duplicate pages), computer graphics (scene completion), and collaborative filtering. The most widely used approach, which is particularly suitable for high-dimensional data, is to build similarity-preserving hash functions that map similar data points to nearby codes. These hashing methods can be divided into two main categories: data-independent and data-dependent methods. I will cover locality-sensitive hashing (LSH) as a representative of the data-independent approach, and show how to build LSH schemes that preserve Hamming distance, cosine similarity, and the Jaccard index. I will briefly mention some recent machine-learning-based data-dependent approaches, such as spectral hashing and other loss-based hashing methods. To bring things closer to home, I will also try to show some of the potential of hashing for Gaussian process regression.
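As a rough illustration of the data-independent approach mentioned above (not material taken from the talk itself), the following is a minimal sketch of random-hyperplane LSH for cosine similarity. Each bit of the code records on which side of a random hyperplane a vector falls, so vectors separated by a small angle tend to agree on most bits. The function names (`make_hyperplanes`, `hash_code`, `hamming`, `estimated_angle`) and the parameters `n_bits`, `dim`, and `seed` are hypothetical, chosen only for this example.

```python
import math
import random


def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random hyperplane normals with i.i.d. Gaussian entries."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(x, planes):
    """Return one bit per hyperplane: 1 if x lies on its non-negative side."""
    return tuple(
        1 if sum(w * v for w, v in zip(plane, x)) >= 0 else 0
        for plane in planes
    )


def hamming(a, b):
    """Count the positions at which two equal-length codes differ."""
    return sum(p != q for p, q in zip(a, b))


def estimated_angle(a, b):
    """Estimate the angle (radians) between the original vectors.

    For a single random hyperplane, two vectors at angle theta receive
    different bits with probability theta / pi, so the fraction of
    differing bits, scaled by pi, estimates theta.
    """
    return math.pi * hamming(a, b) / len(a)


# Usage: hash two vectors with the same hyperplanes and compare their codes.
planes = make_hyperplanes(n_bits=64, dim=3, seed=1)
code_u = hash_code([1.0, 0.2, 0.0], planes)
code_v = hash_code([0.9, 0.3, 0.1], planes)
approx_angle = estimated_angle(code_u, code_v)
```

The design choice here is that the hash depends only on the random draws and not on the dataset, which is what makes the scheme data independent. Longer codes reduce the variance of the angle estimate at the cost of more storage per point.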
Series
This talk is part of the Machine Learning Reading Group @ CUED series.