ICS 2011

Welcome to ICS2011
Innovations in Computer Science  ICS 2011, Tsinghua University, Beijing, China, January 79, 2011. Proceedings, 275283, 9787302245179
Tsinghua University Press
We study lower bounds for Locality Sensitive Hashing (LSH) in the strongest setting: point sets in {0, 1}d under the Hamming distance. Recall that H is said to be an (r, cr, p, q)sensitive hash family if all pairs x, y ∈ {0, 1}d with dist(x, y) ≤ r have probability at least p of collision under a randomly chosen h ∈ H, whereas all pairs x, y ∈ {0, 1}d with dist(x, y) ≥ cr have probability at most q of collision. Typically, one considers d → ∞, with c > 1 fixed and q bounded away from 0. For its applications to approximate nearest neighbor search in high dimensions, the quality of an LSH family H is governed by how small its "ρ parameter" ρ = ln(1/p)/ ln(1/q) is as a function of the parameter c. The seminal paper of Indyk and Motwani showed that for each c ≥ 1, the extremely simple family H = {x → xi : i ∈ [d]} achieves ρ ≤ 1/c. The only known lower bound, due to Motwani, Naor, and Panigrahy, is that ρ must be at least (e^{1/c} − 1)/(e^{1/c} + 1) ≥ .46/c (minus od(1)). The contribution of our paper is twofold: 1. We show the "optimal" lower bound for ρ: it must be at least 1/c (minus od(1)). Our proof is very simple, following almost immediately from the observation that the noise stability of a boolean function at time t is a logconvex function of t. 2. We raise and discuss the following issue: neither the application of LSH to nearest neighbor search, nor the known LSH lower bounds, hold as stated if the q parameter is tiny. Here "tiny" means q = 2^{−Ɵ(d)}, a parameter range we believe is natural. Preview:

Copyright 20102011, Institute for Computer Science, Tsinghua University, All Rights Reserved.