Cooperating with machines

Cited by: 139
Authors
Crandall, Jacob W. [1]
Oudah, Mayada [2]
Tennom [3]
Ishowo-Oloko, Fatimah [2]
Abdallah, Sherief [4, 5]
Bonnefon, Jean-Francois [6]
Cebrian, Manuel [7]
Shariff, Azim [8]
Goodrich, Michael A. [1]
Rahwan, Iyad [7, 9]
Affiliations
[1] Brigham Young Univ, Comp Sci Dept, 3361 TMCB, Provo, UT 84602 USA
[2] Khalifa Univ Sci & Technol, Masdar Inst, POB 54224, Abu Dhabi, U Arab Emirates
[3] Univ Virginia, UVA Digital Himalaya Project, Charlottesville, VA 22904 USA
[4] British Univ Dubai, Dubai, U Arab Emirates
[5] Univ Edinburgh, Sch Informat, Edinburgh EH8 9AB, Midlothian, Scotland
[6] Univ Toulouse Capitole, CNRS, Toulouse Sch Econ TSM Res, F-31015 Toulouse, France
[7] MIT, Media Lab, Cambridge, MA 02139 USA
[8] Univ Calif Irvine, Dept Psychol & Social Behav, Irvine, CA 92697 USA
[9] MIT, Inst Data Syst & Soc, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Source
NATURE COMMUNICATIONS | 2018 / Volume 9
Keywords
ITERATED PRISONERS-DILEMMA; REPEATED GAMES; SOCIAL DILEMMAS; ROBOTS; COMMUNICATION; STRATEGIES; EVOLUTION; NETWORKS; POKER;
DOI
10.1038/s41467-017-02597-8
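Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code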
07; 0710; 09;
Abstract
Since Alan Turing envisioned artificial intelligence, technical progress has often been measured by the ability to defeat humans in zero-sum encounters (e.g., Chess, Poker, or Go). Less attention has been given to scenarios in which human-machine cooperation is beneficial but non-trivial, such as scenarios in which human and machine preferences are neither fully aligned nor fully in conflict. Cooperation does not require sheer computational power, but instead is facilitated by intuition, cultural norms, emotions, signals, and pre-evolved dispositions. Here, we develop an algorithm that combines a state-of-the-art reinforcement learning algorithm with mechanisms for signaling. We show that this algorithm can cooperate with people and other algorithms at levels that rival human cooperation in a variety of two-player repeated stochastic games. These results indicate that general human-machine cooperation is achievable using a non-trivial, but ultimately simple, set of algorithmic mechanisms.
Pages: 12