Stereo based visual odometry
Document
Description
The exponential rise in unmanned aerial vehicles has necessitated the need for accurate pose estimation under any extreme conditions. Visual Odometry (VO) is the estimation of position and orientation of a vehicle based on analysis of a sequence of images captured from a camera mounted on it. VO offers a cheap and relatively accurate alternative to conventional odometry techniques like wheel odometry, inertial measurement systems and global positioning system (GPS). This thesis implements and analyzes the performance of a two camera based VO called Stereo based visual odometry (SVO) in presence of various deterrent factors like shadows, extremely bright outdoors, wet conditions etc... To allow the implementation of VO on any generic vehicle, a discussion on porting of the VO algorithm to android handsets is presented too. The SVO is implemented in three steps. In the first step, a dense disparity map for a scene is computed. To achieve this we utilize sum of absolute differences technique for stereo matching on rectified and pre-filtered stereo frames. Epipolar geometry is used to simplify the matching problem. The second step involves feature detection and temporal matching. Feature detection is carried out by Harris corner detector. These features are matched between two consecutive frames using the Lucas-Kanade feature tracker. The 3D co-ordinates of these matched set of features are computed from the disparity map obtained from the first step and are mapped into each other by a translation and a rotation. The rotation and translation is computed using least squares minimization with the aid of Singular Value Decomposition. Random Sample Consensus (RANSAC) is used for outlier detection. This comprises the third step. The accuracy of the algorithm is quantified based on the final position error, which is the difference between the final position computed by the SVO algorithm and the final ground truth position as obtained from the GPS. The SVO showed an error of around 1% under normal conditions for a path length of 60 m and around 3% in bright conditions for a path length of 130 m. The algorithm suffered in presence of shadows and vibrations, with errors of around 15% and path lengths of 20 m and 100 m respectively.