Full metadata
Title
Monocular depth estimation with edge-based constraints and active learning
Description
The ubiquity of single-camera systems has made monocular depth estimation a topic of increasing interest in the broader computer vision community. Inspired by recent work in sparse-to-dense depth estimation, this thesis studies sparse patterns generated by feature-detection algorithms rather than the regular-grid sparse patterns used in previous work. These feature-based sparse patterns are used to generate additional depth information by interpolating the regions between clusters of samples that lie in close proximity to one another, and the resulting interpolated sparse depths enforce additional constraints on the network's predictions. Beyond the improvement over pure RGB-based methods that comes from incorporating the sparse sample information into the network, the experiments show that actively retraining the network on a small number of samples that deviate most from the interpolated sparse depths leads to better depth prediction overall.
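The two steps described above can be illustrated with a minimal sketch; this is not the thesis's own code, and the function names and the `max_gap_px` and `k` parameters are assumptions made for this example. It assumes NumPy and SciPy are available.

```python
# Illustrative sketch, not the thesis implementation. Assumes NumPy and
# SciPy; `max_gap_px` and `k` are hypothetical parameters.
import numpy as np
from scipy.interpolate import griddata
from scipy.ndimage import distance_transform_edt

def interpolate_sparse_depth(coords, depths, shape, max_gap_px=10):
    """Densify feature-based sparse samples: coords is an (N, 2) integer
    array of (row, col) locations, depths is (N,), shape is (H, W)."""
    H, W = shape
    rows, cols = np.mgrid[0:H, 0:W]
    # Linear interpolation over the convex hull of the sparse samples.
    dense = griddata(coords, depths, (rows, cols), method="linear")
    # Keep only pixels within max_gap_px of a real sample, so depth is
    # filled between nearby clusters rather than far from any sample.
    no_sample = np.ones(shape, dtype=bool)
    no_sample[coords[:, 0], coords[:, 1]] = False
    dist_to_sample = distance_transform_edt(no_sample)
    dense[np.isnan(dense) | (dist_to_sample > max_gap_px)] = 0.0  # 0 marks invalid
    return dense

def select_for_retraining(preds, interp_depths, k=50):
    """Active-learning step: rank frames by mean absolute deviation of the
    prediction from the interpolated sparse depths (where valid) and
    return the indices of the k most-deviating frames."""
    errors = []
    for pred, interp in zip(preds, interp_depths):
        valid = interp > 0.0
        errors.append(np.abs(pred - interp)[valid].mean())
    return np.argsort(errors)[::-1][:k]
```

The interpolated map then serves as an extra supervision signal on the network's predictions, and the most-deviating frames returned by the second function are the candidates for retraining.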
This thesis also introduces a new metric, titled Edge, to quantify model performance in the regions of an image that show the highest change in ground-truth depth along either the x-axis or the y-axis. Existing depth estimation metrics such as Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) quantify performance across the entire image and do not focus on the specific regions that are hardest to predict; the proposed Edge metric targets exactly these hard-to-predict regions. The experiments further show that adding a small Edge-based term to existing loss functions such as the L1 loss in current state-of-the-art methods leads to substantially improved performance in these hard-to-predict regions, while also improving performance across the board on every other metric.
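A minimal sketch of the Edge idea follows; the thesis's exact formulation may differ, and the `top_percent` threshold for selecting high-change pixels is a hypothetical parameter chosen for this example.

```python
# Illustrative sketch of the Edge idea; the thesis's exact formulation
# may differ. `top_percent` is a hypothetical parameter.
import numpy as np

def edge_metric(pred, gt, top_percent=5.0):
    """Mean absolute error restricted to pixels where the ground-truth
    depth changes most along the x- or y-axis."""
    # Per-axis absolute change in ground-truth depth.
    dy = np.abs(np.diff(gt, axis=0, prepend=gt[:1, :]))
    dx = np.abs(np.diff(gt, axis=1, prepend=gt[:, :1]))
    change = np.maximum(dx, dy)
    # The top `top_percent` of pixels by depth change are treated as
    # the hard-to-predict edge regions.
    threshold = np.percentile(change, 100.0 - top_percent)
    edge_mask = change >= threshold
    return np.abs(pred - gt)[edge_mask].mean()
```

In training, the same masked error can serve as a small additive term alongside an L1 loss, matching the abstract's description of using Edge to improve performance in these regions.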
Date Created
2019
Contributors
- Rai, Anshul (Author)
- Yang, Yezhou (Thesis advisor)
- Zhang, Wenlong (Committee member)
- Liang, Jianming (Committee member)
- Arizona State University (Publisher)
Topical Subject
Resource Type
Extent
viii, 65 pages : color illustrations
Language
eng
Copyright Statement
In Copyright
Primary Member of
Peer-reviewed
No
Open Access
No
Handle
https://hdl.handle.net/2286/R.I.54881
Statement of Responsibility
by Anshul Rai
Description Source
Viewed on August 25, 2020
Level of coding
full
Note
thesis
Partial requirement for: M.S., Arizona State University, 2019
bibliography
Includes bibliographical references
Field of study: Computer science
System Created
- 2019-11-06 03:38:48
System Modified
- 2021-08-26 09:47:01