Structural-RNN: Deep Learning on Spatio-Temporal Graphs

Cited by: 614

Authors:

Jain, Ashesh [1,2]

Zamir, Amir R. [2]

Savarese, Silvio [2]

Saxena, Ashutosh [3]

Affiliations:

[1] Cornell Univ, Ithaca, NY 14853 USA

[2] Stanford Univ, Stanford, CA 94305 USA

[3] Brain Things Inc, Redwood City, CA USA

Source:

2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2016

Funding:

National Science Foundation (USA)

DOI:

10.1109/CVPR.2016.573

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification code:

140502 [Artificial intelligence]

Abstract:

Deep Recurrent Neural Network architectures, though remarkably capable at modeling sequences, lack an intuitive high-level spatio-temporal structure. Yet many problems in computer vision inherently have an underlying high-level structure and can benefit from it. Spatio-temporal graphs are a popular tool for imposing such high-level intuitions in the formulation of real-world problems. In this paper, we propose an approach for combining the power of high-level spatio-temporal graphs with the sequence-learning success of Recurrent Neural Networks (RNNs). We develop a scalable method for casting an arbitrary spatio-temporal graph as a rich RNN mixture that is feedforward, fully differentiable, and jointly trainable. The proposed method is generic and principled, as it can transform any spatio-temporal graph by applying a set of well-defined steps. Evaluations of the proposed approach on a diverse set of problems, ranging from modeling human motion to object interactions, show improvement over the state of the art by a large margin. We expect this method to empower new approaches to problem formulation through high-level spatio-temporal graphs and Recurrent Neural Networks.

Pages: 5308-5317

Page count: 10
