Shortcut learning in deep neural networks

Cited by: 1005
Authors
Geirhos, Robert [1, 2]
Jacobsen, Joern-Henrik [3]
Michaelis, Claudio [1, 2]
Zemel, Richard [3]
Brendel, Wieland [1]
Bethge, Matthias [1]
Wichmann, Felix A. [1]
Affiliations
[1] Univ Tubingen, Tubingen, Germany
[2] Int Max Planck Res Sch Intelligent Syst, Tubingen, Germany
[3] Univ Toronto, Vector Inst, Toronto, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
CONTEXT;
DOI
10.1038/s42256-020-00257-z
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep learning has triggered the current rise of artificial intelligence and is the workhorse of today's machine intelligence. Numerous success stories have rapidly spread all over science, industry and society, but its limitations have only recently come into focus. In this Perspective we seek to distil how many of deep learning's failures can be seen as different symptoms of the same underlying problem: shortcut learning. Shortcuts are decision rules that perform well on standard benchmarks but fail to transfer to more challenging testing conditions, such as real-world scenarios. Related issues are known in comparative psychology, education and linguistics, suggesting that shortcut learning may be a common characteristic of learning systems, biological and artificial alike. Based on these observations, we develop a set of recommendations for model interpretation and benchmarking, highlighting recent advances in machine learning to improve robustness and transferability from the lab to real-world applications.

Deep learning has resulted in impressive achievements, but under what circumstances does it fail, and why? The authors propose that its failures are a consequence of shortcut learning, a common characteristic across biological and artificial systems in which strategies that appear to have solved a problem fail unexpectedly under different circumstances.
Pages: 665-673
Page count: 9