Exploring Representation Learning With CNNs for Frame-to-Frame Ego-Motion Estimation

Cited by: 157
Authors
Costante, Gabriele [1 ]
Mancini, Michele [1 ]
Valigi, Paolo [1 ]
Ciarfuglia, Thomas A. [1 ]
Affiliations
[1] Univ Perugia, Dept Engn, I-06125 Perugia, Italy
Source
IEEE ROBOTICS AND AUTOMATION LETTERS | 2016, Vol. 1, No. 01
Keywords
Visual Learning; Visual-Based Navigation;
DOI
10.1109/LRA.2015.2505717
Chinese Library Classification (CLC)
TP24 [Robotics]
Discipline Codes
080202; 1405
Abstract
Visual ego-motion estimation, or briefly visual odometry (VO), is one of the key building blocks of modern SLAM systems. In the last decade, impressive results have been demonstrated in the context of visual navigation, reaching very high localization performance. However, all ego-motion estimation systems require careful parameter tuning procedures for the specific environment they have to work in. Furthermore, even in ideal scenarios, most state-of-the-art approaches fail to handle image anomalies and imperfections, which results in less robust estimates. VO systems that rely on geometrical approaches extract sparse or dense features and match them to perform frame-to-frame (F2F) motion estimation. However, images contain much more information that can be used to further improve the F2F estimation. To learn new feature representations, a very successful approach is to use deep convolutional neural networks. Inspired by recent advances in deep networks and by previous work on learning methods applied to VO, we explore the use of convolutional neural networks to learn both the best visual features and the best estimator for the task of visual ego-motion estimation. With experiments on publicly available datasets, we show that our approach is robust with respect to blur, luminance, and contrast anomalies and outperforms most state-of-the-art approaches even in nominal conditions.
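The general idea described in the abstract — feed two consecutive frames to a convolutional network and regress the frame-to-frame motion directly — can be sketched as below. This is purely an illustrative toy (NumPy, random untrained weights), not the architecture from the paper: the layer sizes, the two-conv-layer structure, and the 6-DoF output parameterization are assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d_relu(x, kernels, stride=2):
    """Valid 2-D convolution followed by ReLU.
    x: (C_in, H, W); kernels: (C_out, C_in, k, k)."""
    c_out, c_in, k, _ = kernels.shape
    oh = (x.shape[1] - k) // stride + 1
    ow = (x.shape[2] - k) // stride + 1
    out = np.zeros((c_out, oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = x[:, i*stride:i*stride+k, j*stride:j*stride+k]
            # Contract (C_in, k, k) of each kernel against the patch.
            out[:, i, j] = np.tensordot(kernels, patch, axes=3)
    return np.maximum(out, 0.0)

def estimate_motion(frame_t, frame_t1, params):
    """Stack two RGB frames on the channel axis (6 channels) and
    regress a 6-DoF motion vector (translation + rotation)."""
    x = np.concatenate([frame_t, frame_t1], axis=0)   # (6, H, W)
    x = conv2d_relu(x, params["conv1"])
    x = conv2d_relu(x, params["conv2"])
    feat = x.mean(axis=(1, 2))                        # global average pooling
    return params["fc"] @ feat                        # (tx, ty, tz, roll, pitch, yaw)

# Random (untrained) weights, just to show the data flow.
params = {
    "conv1": rng.normal(0, 0.1, (8, 6, 5, 5)),
    "conv2": rng.normal(0, 0.1, (16, 8, 3, 3)),
    "fc":    rng.normal(0, 0.1, (6, 16)),
}
frame_t  = rng.random((3, 32, 32))   # RGB frame at time t
frame_t1 = rng.random((3, 32, 32))   # RGB frame at time t+1
motion = estimate_motion(frame_t, frame_t1, params)
print(motion.shape)  # (6,)
```

In a trained system, `params` would be learned by minimizing the regression error against ground-truth odometry; the point here is only the F2F input/output structure.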
Pages: 18-25 (8 pages)
References
28 in total
[1] Alcantarilla P. F., 2012, IEEE Int. Conf. Robotics and Automation, p. 1290. DOI: 10.1109/ICRA.2012.6224690
[2] [Anonymous], 2014, IEEE Conf. Computer Vision and Pattern Recognition, p. 580. DOI: 10.1109/CVPR.2014.81
[3] Bengio Y., Courville A., Vincent P. Representation Learning: A Review and New Perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(8): 1798-1828.
[4] Bengio Y. Learning Deep Architectures for AI. Foundations and Trends in Machine Learning, 2009, 2(1): 1-127.
[5] Brox T., Bruhn A., Papenberg N., Weickert J. High Accuracy Optical Flow Estimation Based on a Theory for Warping. Computer Vision - ECCV 2004, Pt. 4, 2004, 3024: 25-36.
[6] Ciarfuglia T. A., Costante G., Valigi P., Ricci E. Evaluation of Non-Geometric Methods for Visual Odometry. Robotics and Autonomous Systems, 2014, 62(12): 1717-1730.
[7] Ciarfuglia T. A., 2012, IEEE/RSJ Int. Conf. Intelligent Robots and Systems, p. 3837. DOI: 10.1109/IROS.2012.6385654
[8] Costante G., 2015, Extra materials thi…
[9] Davison A. J., Reid I. D., Molton N. D., Stasse O. MonoSLAM: Real-Time Single Camera SLAM. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, 29(6): 1052-1067.
[10] Eade E., 2006, Proc. IEEE Computer Society Conf., Vol. 1, p. 469. DOI: 10.1109/CVPR.2006.263