Gender Bias in Artificial Intelligence: Severity Prediction at an Early Stage of COVID-19

Cited by: 12
Authors
Chung, Heewon [1]
Park, Chul [2]
Kang, Wu Seong [3]
Lee, Jinseok [1]
Affiliations
[1] Kyung Hee Univ, Dept Biomed Engn, Coll Elect & Informat, Yongin, South Korea
[2] Wonkwang Univ, Dept Internal Med, Sch Med, Iksan, South Korea
[3] Cheju Halla Gen Hosp, Dept Trauma Surg, Jeju Si, South Korea
Keywords
COVID-19; severity prediction; artificial intelligence bias; gender-dependent bias; feature importance
DOI
10.3389/fphys.2021.778720
Chinese Library Classification
Q4 [Physiology]
Discipline code
071003 [Physiology]
Abstract
Artificial intelligence (AI) technologies have been applied in various medical domains to predict patient outcomes with high accuracy. As AI becomes more widely adopted, the problem of model bias is increasingly apparent. In this study, we investigate the model bias that can occur when a model is trained on data from only one gender, and aim to present new insights into the bias issue. For the investigation, we considered an AI model that predicts severity at an early stage based on the medical records of coronavirus disease (COVID-19) patients. For 5,601 confirmed COVID-19 patients, we used 37 medical-record features, namely basic patient information, physical indices, initial examination findings, clinical findings, comorbidities, and general blood test results at an early stage. To investigate gender-based AI model bias, we trained and evaluated two separate models: one trained using only the male group, and the other using only the female group. When the model trained on the male-group data was applied to the female testing data, overall performance decreased: sensitivity from 0.93 to 0.86, specificity from 0.92 to 0.86, accuracy from 0.92 to 0.86, balanced accuracy from 0.93 to 0.86, and area under the curve (AUC) from 0.97 to 0.94. Similarly, when the model trained on the female-group data was applied to the male testing data, overall performance again decreased: sensitivity from 0.97 to 0.90, specificity from 0.96 to 0.91, accuracy from 0.96 to 0.91, balanced accuracy from 0.96 to 0.90, and AUC from 0.97 to 0.95. Furthermore, when we evaluated each gender-dependent model on test data from the same gender used for training, the resulting accuracy was still lower than that of the unbiased model.
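The cross-gender evaluation protocol described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the labels and predictions are hypothetical, and the metric definitions are the standard ones for binary severity classification (sensitivity, specificity, accuracy, balanced accuracy).

```python
# Sketch of the cross-group metric computation: a model trained on one
# gender group is scored on the other group's test set, and the same
# threshold metrics reported in the abstract are computed.
# All data below are hypothetical.

def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary severity labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def severity_metrics(y_true, y_pred):
    """Sensitivity, specificity, accuracy, and balanced accuracy."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "balanced_accuracy": 0.5 * (sensitivity + specificity),
    }

# Hypothetical severe (1) / non-severe (0) labels for a female test set,
# scored once by a same-gender model and once by a cross-gender model.
y_true            = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
pred_same_gender  = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
pred_cross_gender = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]

print(severity_metrics(y_true, pred_same_gender))   # all metrics 1.0
print(severity_metrics(y_true, pred_cross_gender))  # all metrics 0.8
```

The drop from the same-gender to the cross-gender predictions mirrors the pattern the study reports; the paper additionally computes AUC, which requires continuous model scores rather than the thresholded predictions used here.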
Pages: 9