Locality sensitive indexing for efficient high-dimensional query answering in the presence of excluded regions

Description
Similarity search in high-dimensional spaces is popular for applications like image processing, time series, and genome data. In higher dimensions, the curse of dimensionality undermines the effectiveness of most index structures, giving way to approximate methods like Locality Sensitive Hashing (LSH) to answer similarity searches. In addition to range searches and k-nearest neighbor searches, there is a need to answer negative queries formed by excluded regions in high-dimensional data. Though there have been many variants of LSH to improve efficiency, reduce storage, and provide better accuracy, none of these techniques is capable of answering queries in the presence of excluded regions.

This thesis provides a novel approach to handle such negative queries. This is achieved by creating a prefix-based hierarchical index structure. First, the high-dimensional space is projected to a lower-dimensional space. Then, a one-dimensional ordering is developed that retains the hierarchical traits. The algorithm prunes irrelevant candidates while answering queries in the presence of excluded regions. While naive LSH would need to filter out the negative query results from the main results, the new algorithm minimizes the need to fetch the redundant results in the first place. Experimental results show that this reduces post-processing cost, thereby reducing query processing time.
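
The abstract does not spell out the construction, so the following is only an illustrative sketch of the general idea, not the thesis's algorithm. It assumes random Gaussian projections for the dimension reduction and a bit-interleaved (Z-order) string as the one-dimensional ordering, so that every key prefix names a coarser cell of the projected space. All names (make_projections, hash_key, query_pruned, query_naive) and parameters (bits, lo, hi) are hypothetical. An excluded region is modeled as a set of key prefixes; the pruned query skips the matching ranges of the sorted index, whereas the naive query fetches everything and filters afterwards.

```python
import bisect
import random


def make_projections(dim, m, seed=0):
    """Draw m random Gaussian directions in a dim-dimensional space (assumed scheme)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(m)]


def hash_key(vec, projections, bits=4, lo=-4.0, hi=4.0):
    """Project vec, quantize each coordinate to `bits` bits, interleave the bits.

    The result is a string of '0'/'1' characters; a shorter prefix of it
    identifies a coarser cell, which gives the hierarchy used for pruning.
    """
    levels = 1 << bits
    cells = []
    for a in projections:
        p = sum(x * y for x, y in zip(a, vec))
        c = int((p - lo) / (hi - lo) * levels)
        cells.append(min(max(c, 0), levels - 1))  # clamp to the grid
    out = []
    for b in range(bits - 1, -1, -1):  # most significant bit first
        for c in cells:
            out.append(str((c >> b) & 1))
    return "".join(out)


def prefix_range(keys, prefix):
    """Half-open slice [start, end) of sorted keys that begin with prefix."""
    start = bisect.bisect_left(keys, prefix)
    end = bisect.bisect_left(keys, prefix + "2")  # '2' sorts after '0' and '1'
    return start, end


def query_pruned(index, include_prefix, exclude_prefixes):
    """Scan the included range but never touch entries under excluded prefixes.

    index is a sorted list of (key, item_id) pairs. A real index would keep
    the keys in a persistent ordered structure instead of rebuilding this list.
    """
    keys = [k for k, _ in index]
    start, end = prefix_range(keys, include_prefix)
    skips = sorted(prefix_range(keys, p) for p in exclude_prefixes)
    result, pos = [], start
    for s_lo, s_hi in skips:
        s_lo, s_hi = max(s_lo, start), min(s_hi, end)  # clip to included range
        if s_lo >= s_hi:
            continue  # excluded cell does not overlap the query
        if s_lo > pos:
            result.extend(index[pos:s_lo])
        pos = max(pos, s_hi)
    result.extend(index[pos:end])
    return [item_id for _, item_id in result]


def query_naive(index, include_prefix, exclude_prefixes):
    """Baseline: fetch every included entry, then filter out excluded ones."""
    fetched = [(k, i) for k, i in index if k.startswith(include_prefix)]
    return [i for k, i in fetched
            if not any(k.startswith(p) for p in exclude_prefixes)]
```

Both functions return the same item ids; the difference is that the pruned version locates each excluded prefix with two binary searches and skips its range, so the excluded entries are never fetched. How the thesis maps an excluded region to prefixes, and how it handles regions that only partly overlap a cell, is not stated in the abstract and is left out here.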
Date Created
2016
Agent