Do as AI say: susceptibility in deployment of clinical decision-aids

Cited by: 198
Authors
Gaube, Susanne [1, 2]
Suresh, Harini [3]
Raue, Martina [2]
Merritt, Alexander [4]
Berkowitz, Seth J. [5]
Lermer, Eva [6, 7]
Coughlin, Joseph F. [2]
Guttag, John V. [3]
Colak, Errol [8, 9]
Ghassemi, Marzyeh [10, 11, 12]
Affiliations
[1] Univ Regensburg, Dept Psychol, Regensburg, Germany
[2] MIT, MIT AgeLab, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[3] MIT, MIT Comp Sci & Artificial Intelligence Lab, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[4] Boston Med Ctr, Boston, MA USA
[5] Beth Israel Deaconess Med Ctr, Dept Radiol, 330 Brookline Ave, Boston, MA 02215 USA
[6] Ludwig Maximilians Univ Munchen, LMU Ctr Leadership & People Management, Munich, Germany
[7] FOM Univ Appl Sci Econ & Management, Munich, Germany
[8] St Michaels Hosp, Li Ka Shing Knowledge Inst, Toronto, ON, Canada
[9] Univ Toronto, Dept Med Imaging, Toronto, ON, Canada
[10] Univ Toronto, Dept Comp Sci, Toronto, ON, Canada
[11] Univ Toronto, Dept Med, Toronto, ON, Canada
[12] Vector Inst, Toronto, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
CLASSIFICATION; ALGORITHM; INTELLIGENCE; PERFORMANCE; TRUST;
DOI
10.1038/s41746-021-00385-9
Chinese Library Classification (CLC) number
R19 [Health care organizations and services (health services administration)];
Abstract
Artificial intelligence (AI) models for decision support have been developed for clinical settings such as radiology, but little work evaluates the potential impact of such systems. In this study, physicians received chest X-rays and diagnostic advice, some of which was inaccurate, and were asked to evaluate advice quality and make diagnoses. All advice was generated by human experts, but some was labeled as coming from an AI system. As a group, radiologists rated advice as lower quality when it appeared to come from an AI system; physicians with less task-expertise did not. Diagnostic accuracy was significantly worse when participants received inaccurate advice, regardless of the purported source. This work raises important considerations for how advice, AI and non-AI, should be deployed in clinical environments.
Pages: 8