Full metadata
Title
Low to High Dimensional Modality Reconstruction Using Aggregated Fields of View
Description
Autonomous systems deployed in the real world today handle a slew of data modalities to perform effectively in tasks ranging from navigation in complex maneuverable robots to identity verification in simpler static systems. The performance of such a system depends heavily on a continuous supply of data from all of its modalities, and the system faces drastically increased risk when one or more modalities are lost to an adverse event such as hardware malfunction or hostile environmental conditions. This thesis investigates modality hallucination and its efficacy in mitigating the risks posed to the autonomous system. Modality hallucination is proposed as an effective way to ensure consistent modality availability, thereby reducing unfavorable consequences. While there has been significant research effort in high-to-low dimensional modality hallucination, such as RGB to depth, there is considerably less interest in the other direction (low-to-high dimensional modality prediction). This thesis demonstrates the effectiveness of low-to-high modality hallucination in reducing the uncertainty of the affected system while also ensuring that the method remains task agnostic.
A deep neural network based encoder-decoder architecture that aggregates multiple fields of view in its encoder blocks to recover the lost information of the affected modality from the extant modality is presented, with evidence of its efficacy. The hallucination process is implemented by capturing a non-linear mapping between the data modalities, and the learned mapping is used to aid the extant modality in mitigating the risk posed to the system in adverse scenarios involving modality loss. The results are compared with a well-known generative model built for the task of image translation, as well as an off-the-shelf semantic segmentation architecture re-purposed for hallucination. To validate the practicality of the hallucinated modality, extensive classification and segmentation experiments are conducted on the University of Washington depth image database (UWRGBD) and the New York University database (NYUD), demonstrating that hallucination indeed lessens the negative effects of modality loss.
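The thesis text itself is not part of this record, but the abstract's core idea of aggregating multiple fields of view in an encoder block can be illustrated with a minimal sketch. The version below assumes an ASPP-style design in which parallel dilated convolutions provide the different fields of view; the module name, channel sizes, and dilation rates are illustrative assumptions, not details taken from the thesis.

```python
import torch
import torch.nn as nn

class MultiFOVEncoderBlock(nn.Module):
    """Hypothetical encoder block: parallel 3x3 convolutions with
    different dilation rates see different fields of view over the
    input, and a 1x1 convolution fuses the aggregated features."""

    def __init__(self, in_ch, out_ch, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                # padding == dilation keeps the spatial size unchanged
                nn.Conv2d(in_ch, out_ch, kernel_size=3,
                          padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        )
        # Fuse the concatenated branch outputs back to out_ch channels.
        self.fuse = nn.Conv2d(len(dilations) * out_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

# Usage: map a 3-channel RGB tensor to 64 aggregated-FOV feature channels,
# as a first encoder stage might in an RGB-to-depth hallucination network.
block = MultiFOVEncoderBlock(in_ch=3, out_ch=64)
features = block(torch.randn(1, 3, 224, 224))  # -> (1, 64, 224, 224)
```

Stacking several such blocks with downsampling between them, followed by a decoder, would yield one plausible encoder-decoder realization of the described architecture.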
Date Created
2019
Contributors
- Gunasekar, Kausic (Author)
- Yang, Yezhou (Thesis advisor)
- Qiu, Qiang (Committee member)
- Amor, Heni Ben (Committee member)
- Arizona State University (Publisher)
Topical Subject
Resource Type
Extent
73 pages
Language
eng
Copyright Statement
In Copyright
Primary Member of
Peer-reviewed
No
Open Access
No
Handle
https://hdl.handle.net/2286/R.I.54924
Level of coding
minimal
Note
Masters Thesis Computer Engineering 2019
System Created
- 2019-11-06 03:40:03
System Modified
- 2021-08-26 09:47:01