Full metadata
Title
Learning from the Data Heterogeneity for Data Imputation
Description
Data mining, also known as big data analysis, has been identified as a critical and challenging process for a variety of real-world applications. Numerous datasets are collected and generated every day to store information. The growth in data volume and in the number of data modalities has increased the demand for data mining methods and strategies that find anomalies, patterns, and correlations within large data sets in order to predict outcomes. Effective machine learning methods are widely adopted to build data mining pipelines for purposes such as business understanding, data understanding, data preparation, modeling, evaluation, and deployment. The major challenges for effectively and efficiently mining big data include (1) data heterogeneity and (2) missing data. Heterogeneity is a natural characteristic of big data, as the data is typically collected from different sources with diverse formats. Missing values are the most common issue in heterogeneous data analysis and result from a variety of factors, including the data collection process, user initiatives, and erroneous data entries. In response to these challenges, this thesis investigates three main research directions with application scenarios: (1) mining and formulating heterogeneous data, (2) missing value imputation strategies for various application scenarios in both offline and online settings, and (3) missing value imputation for multi-modality data. Multiple strategies with theoretical analysis are presented, and the effectiveness of the proposed algorithms is evaluated against state-of-the-art methods.
Date Created
2021
Contributors
- Liu, Xu (Author)
- He, Jingrui (Thesis advisor)
- Xue, Guoliang (Thesis advisor)
- Li, Baoxin (Committee member)
- Tong, Hanghang (Committee member)
- Arizona State University (Publisher)
Extent
96 pages
Language
eng
Copyright Statement
In Copyright
Peer-reviewed
No
Open Access
No
Handle
https://hdl.handle.net/2286/R.2.N.162017
Level of coding
minimal
Note
Partial requirement for: Ph.D., Arizona State University, 2021
Field of study: Computer Engineering
System Created
- 2021-11-16 05:58:01
System Modified
- 2021-11-30 12:51:28