Training, testing and benchmarking medical AI models using Clinical AIBench

Cited by: 1
Authors
Huang Y. [1, 6]
Miao X. [1]
Zhang R. [1]
Ma L. [1, 6]
Liu W. [1]
Zhang F. [2]
Guan X. [1]
Liang X. [1]
Lu X. [1]
Tang S. [5]
Zhang Z. [4]
Affiliations
[1] Guangxi Key Lab of Multi-Source Information Mining & Security, School of Computer Science and Engineering & School of Software, Guangxi Normal University, Guilin
[2] State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing
[3] Guilin Medical University, Guilin
[4] Department of Physiology and Pathophysiology, Capital Medical University, Beijing
[5] Faculty of Education, Guangxi Normal University, Guilin
[6] International Open Benchmark Council, Beijing
Source
BenchCouncil Transactions on Benchmarks, Standards and Evaluations | 2022, Vol. 2, No. 1
Funding
National Natural Science Foundation of China
Keywords
Alzheimer's disease; Benchmark; Clinical setting; Configurable clinical setting; COVID-19; Dental
DOI
10.1016/j.tbench.2022.100037
Abstract
AI technology has been applied in many areas of clinical research, but most AI techniques are difficult to deploy in real-world clinical settings. In most current clinical AI research, the diagnosis task is to distinguish among a given set of diseases. Diagnosis in real-world settings, however, requires dynamically developing inspection strategies based on the resources available at a medical institution and identifying diseases out of many possibilities. To promote the development of clinical AI technologies and their adoption in clinical applications, we propose a benchmark named Clinical AIBench for developing, verifying, and evaluating clinical AI technologies in real-world clinical settings. Specifically, Clinical AIBench can be used for: (1) Model training and testing: researchers can use the data to train and test their models. (2) Model evaluation: researchers can use Clinical AIBench to evaluate models from different researchers objectively, fairly, and comparably. (3) Clinical value evaluation: researchers can use the clinical indicators provided by Clinical AIBench to evaluate the clinical value of models intended for real-world clinical settings. For convenience, Clinical AIBench provides three levels of clinical settings: the restricted clinical setting (named the closed clinical setting), the data island clinical setting, and the real-world clinical setting (called the open clinical setting). In addition, Clinical AIBench covers three diseases: Alzheimer's disease, COVID-19, and dental disease. Clinical AIBench provides Python APIs to researchers. The data and source code are publicly available from the project website https://www.benchcouncil.org/clinical_aibench/. © 2022 The Authors