Description
In the presence of big data analysis, large volume of data needs to be systematically indexed to support analytical tasks, such as feature engineering, pattern recognition, data mining, and query processing. The volume, variety, and velocity of these data necessitate

In the presence of big data analysis, large volume of data needs to be systematically indexed to support analytical tasks, such as feature engineering, pattern recognition, data mining, and query processing. The volume, variety, and velocity of these data necessitate sophisticated systems to help researchers understand, analyze, and dis- cover insights from heterogeneous, multidimensional data sources. Many analytical frameworks have been proposed in the literature in recent years, but challenges to accuracy, speed, and effectiveness remain hence a systematic approach to perform data signature computation and query processing in multi-dimensional space is in people’s interest. In particular, real-time and near real-time queries pose significant challenges when working with large data sets.

To address these challenges, I develop an innovative robust multi-variate fea- ture extraction algorithm over multi-dimensional temporal datasets, which is able to help understand and analyze various real-world applications. Furthermore, to an- swer queries over these features, I develop a novel resource-aware indexing framework to approximately solve top-k queries by leveraging onion-layer indexing in conjunc- tion with locality sensitive hashing. The proposed indexing scheme allows people to answer top-k queries by only accessing a bounded amount of data, which optimizes big data small for queries.
Downloads
PDF (5 MB)

Details

Title
  • Feature Extraction from Multi-variate Time Series and Resource-Aware Indexing
Contributors
Date Created
2020
Subjects
Resource Type
  • Text
  • Collections this item is in
    Note
    • Doctoral Dissertation Computer Science 2020

    Machine-readable links