Preparing a collection of radiology examinations for distribution and retrieval

Cited by: 553
Authors
Demner-Fushman, Dina [1]
Kohli, Marc D. [2]
Rosenman, Marc B. [3]
Shooshan, Sonya E. [1]
Rodriguez, Laritza [1]
Antani, Sameer [4]
Thoma, George R. [4]
McDonald, Clement J. [1]
Affiliations
[1] NIH, Lister Hill Natl Ctr Biomed Commun, Natl Lib Med, Bldg 38A,Room 10S-1022,Rockville Pike MSC 3824, Bethesda, MD 20894 USA
[2] Indiana Univ Sch Med, Dept Radiol & Imaging Sci, Informat, Indianapolis, IN 46202 USA
[3] Indiana Univ Sch Med, Dept Pediat, Childrens Hlth Serv Res, Indianapolis, IN 46202 USA
[4] NIH, Commun Engn Branch, Lister Hill Natl Ctr Biomed Commun, Natl Lib Med, Bldg 38A,Room 10S-1022,Rockville Pike MSC 3824, Bethesda, MD 20894 USA
Funding
National Institutes of Health (USA)
Keywords
information storage and retrieval; abstracting and indexing; radiography; medical records; biometric identification; negation
DOI
10.1093/jamia/ocv080
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Objective: Clinical documents made available for secondary use play an increasingly important role in discovery of clinical knowledge, development of research methods, and education. An important step in facilitating secondary use of clinical document collections is easy access to descriptions and samples that represent the content of the collections. This paper presents an approach to developing a collection of radiology examinations, including both the images and radiologist narrative reports, and making them publicly available in a searchable database.
Materials and Methods: The authors collected 3996 radiology reports from the Indiana Network for Patient Care and 8121 associated images from the hospitals' picture archiving systems. The images and reports were de-identified automatically and then the automatic de-identification was manually verified. The authors coded the key findings of the reports and empirically assessed the benefits of manual coding on retrieval.
Results: The automatic de-identification of the narrative was aggressive and achieved 100% precision at the cost of rendering a few findings uninterpretable. Automatic de-identification of images was not quite as perfect. Images for two of 3996 patients (0.05%) showed protected health information. Manual encoding of findings improved retrieval precision.
Conclusion: Stringent de-identification methods can remove all identifiers from text radiology reports. DICOM de-identification of images does not remove all identifying information and needs special attention to images scanned from film. Adding manual coding to the radiologist narrative reports significantly improved relevancy of the retrieved clinical documents. The de-identified Indiana chest X-ray collection is available for searching and downloading from the National Library of Medicine (http://openi.nlm.nih.gov/).
Pages: 304-310
Page count: 7