Instant Visual Odometry Initialization for Mobile AR
International Symposium on Mixed and Augmented Reality (ISMAR)
July 29, 2021
By: Alejo Concha, Michael Burri, Jesus Briales, Christian Forster, Luc Oth
Mobile AR applications benefit from instant initialization to display world-locked effects promptly. However, standard visual odometry or SLAM algorithms require motion parallax to initialize (see Figure 1) and therefore suffer from delayed initialization. In this paper, we present a 6-DoF monocular visual odometry system that initializes instantly and without motion parallax. Our main contribution is a pose estimator that decouples the estimation of the 5-DoF relative rotation and translation direction from that of the 1-DoF translation magnitude. While scale is not observable in a monocular vision-only setting, it is still paramount to estimate a consistent scale over the whole trajectory (even if not physically accurate) to avoid AR effects moving erroneously along depth. In our approach, we leverage the fact that depth errors are not perceivable to the user during rotation-only motion. However, as the user starts translating the device, depth becomes perceivable, and at the same time it becomes possible to estimate a consistent scale. Our proposed algorithm naturally transitions between these two modes. Our second contribution is a novel residual in the relative pose problem to further improve the results. The residual combines the Jacobians of the functional and the functional itself and is minimized using a Levenberg–Marquardt optimizer on the 5-DoF manifold. We perform extensive validations of our contributions with both a publicly available dataset and synthetic data. We show that the proposed pose estimator outperforms the classical approaches for 6-DoF pose estimation used in the literature in low-parallax configurations. Likewise, we show that our relative pose estimator outperforms state-of-the-art approaches in an odometry pipeline configuration where we can leverage initial guesses. We release a dataset for the relative pose problem, built from real data, to facilitate comparison with future solutions.
The proposed odometry is currently used as a pre-SLAM initialization module in world-locked AR effects in Instagram and Facebook.
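As a rough illustration of why the 5-DoF part of the pose (rotation and translation direction) can be treated separately from the 1-DoF translation magnitude, the sketch below uses the standard algebraic epipolar constraint on synthetic data. It is not the estimator or the residual proposed in the paper. All function and variable names are hypothetical, and the convention p2 = R p1 + t (mapping points from the first camera frame to the second) is an assumption made for this example only.

```python
import numpy as np

def skew(v):
    """Return the 3x3 cross-product matrix [v]_x."""
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def rotation_from_axis_angle(axis, angle):
    """Rodrigues' formula for a rotation of `angle` radians about `axis`."""
    axis = axis / np.linalg.norm(axis)
    K = skew(axis)
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def bearings(points):
    """Normalize 3D points (one per row) to unit bearing vectors."""
    return points / np.linalg.norm(points, axis=1, keepdims=True)

def epipolar_residuals(R, t_dir, f1, f2):
    """Algebraic epipolar residuals f2^T [t_dir]_x R f1, one per correspondence.

    Only the direction of the translation enters: scaling t_dir scales
    every residual by the same factor and leaves the zero set unchanged.
    """
    E = skew(t_dir) @ R
    return np.einsum("ni,ij,nj->n", f2, E, f1)

# Synthetic scene in front of camera 1 (assumed values, for illustration only).
rng = np.random.default_rng(0)
points_cam1 = rng.uniform([-1.0, -1.0, 2.0], [1.0, 1.0, 6.0], size=(50, 3))
R = rotation_from_axis_angle(np.array([0.1, 1.0, 0.2]), 0.05)
t_dir = np.array([1.0, 0.0, 0.0])  # unit translation direction
magnitude = 0.1                    # translation magnitude (the 1-DoF part)

f1 = bearings(points_cam1)
f2 = bearings(points_cam1 @ R.T + magnitude * t_dir)

# Scale the whole scene and the translation by the same factor:
# the bearing observations are identical, so scale is not observable.
scale = 7.0
f1_scaled = bearings(scale * points_cam1)
f2_scaled = bearings((scale * points_cam1) @ R.T + scale * magnitude * t_dir)

residuals = epipolar_residuals(R, t_dir, f1, f2)
print("max |epipolar residual|:", np.max(np.abs(residuals)))
print("bearings unchanged by global scale:",
      np.allclose(f1, f1_scaled) and np.allclose(f2, f2_scaled))
```

The residual is evaluated with the unit direction only, and it vanishes regardless of the true translation magnitude. This is the sense in which two-view bearing observations constrain five degrees of freedom, while the magnitude has to be kept consistent by other means.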