Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models

Cited by: 1843

Authors:

Kung, Tiffany H. ^{[1,2]}

Cheatham, Morgan ^{[3]}

Medenilla, Arielle ^{[1]}

Sillos, Czarina ^{[1]}

De Leon, Lorie ^{[1]}

Elepano, Camille

Madriaga, Maria ^{[1]}

Aggabao, Rimel ^{[1]}

Diaz-Candido, Giezel ^{[1]}

Maningo, James ^{[1]}

Tseng, Victor ^{[1,4]}

Affiliations:

[1] AnsibleHealth Inc, Mountain View, CA 94043 USA

[2] Harvard Sch Med, Massachusetts Gen Hosp, Dept Anesthesiol, Boston, MA USA

[3] Brown Univ, Warren Alpert Med Sch, Providence, RI USA

[4] UWorld LLC, Dept Med Educ, Dallas, TX 75019 USA

Source:

PLOS DIGITAL HEALTH | 2023 / Volume 2 / Issue 02

Keywords:

STUDENT;

DOI:

10.1371/journal.pdig.0000198

Chinese Library Classification (CLC) number:

R19 [Health care organization and services (health services administration)];

Discipline classification code:

100404 [Child and Adolescent Health and Maternal and Child Health];

Abstract:

We evaluated the performance of a large language model called ChatGPT on the United States Medical Licensing Exam (USMLE), which consists of three exams: Step 1, Step 2CK, and Step 3. ChatGPT performed at or near the passing threshold for all three exams without any specialized training or reinforcement. Additionally, ChatGPT demonstrated a high level of concordance and insight in its explanations. These results suggest that large language models may have the potential to assist with medical education, and potentially, clinical decision-making.

Pages: 12
