Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models

Cited by: 1843
Authors
Kung, Tiffany H. [1 ,2 ]
Cheatham, Morgan [3 ]
Medenilla, Arielle [1 ]
Sillos, Czarina [1 ]
De Leon, Lorie [1 ]
Elepano, Camille
Madriaga, Maria [1 ]
Aggabao, Rimel [1 ]
Diaz-Candido, Giezel [1 ]
Maningo, James [1 ]
Tseng, Victor [1 ,4 ]
Affiliations
[1] AnsibleHealth Inc, Mountain View, CA 94043 USA
[2] Harvard Sch Med, Massachusetts Gen Hosp, Dept Anesthesiol, Boston, MA USA
[3] Brown Univ, Warren Alpert Med Sch, Providence, RI USA
[4] UWorld LLC, Dept Med Educ, Dallas, TX 75019 USA
Source
PLOS DIGITAL HEALTH | 2023 / Vol. 2 / Issue 02
Keywords
STUDENT;
DOI
10.1371/journal.pdig.0000198
Chinese Library Classification number
R19 [Health care organization and services (health services administration)];
Discipline classification code
100404 [Child and adolescent health and maternal and child health care];
Abstract
We evaluated the performance of a large language model called ChatGPT on the United States Medical Licensing Exam (USMLE), which consists of three exams: Step 1, Step 2CK, and Step 3. ChatGPT performed at or near the passing threshold for all three exams without any specialized training or reinforcement. Additionally, ChatGPT demonstrated a high level of concordance and insight in its explanations. These results suggest that large language models may have the potential to assist with medical education, and potentially, clinical decision-making.
Pages: 12