Robust Statistical Label Fusion Through Consensus Level, Labeler Accuracy, and Truth Estimation (COLLATE)

Cited by: 64

Authors:

Asman, Andrew J. [1]

Landman, Bennett A. [1]

Affiliations:

[1] Vanderbilt Univ, Dept Elect Engn, Nashville, TN 37235 USA

Source:

IEEE TRANSACTIONS ON MEDICAL IMAGING | 2011 / Vol. 30 / No. 10

Funding:

National Institutes of Health (USA);

Keywords:

Consensus level; labeler accuracy and truth estimation (COLLATE); data fusion; delineation; labeling; parcellation; simultaneous truth and performance level estimation (STAPLE); statistical analysis; IMAGE SEGMENTATION; CLASSIFIER COMBINATION; EM ALGORITHM; BRAIN; VALIDATION;

DOI:

10.1109/TMI.2011.2147795

Chinese Library Classification number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

Segmentation and delineation of structures of interest in medical images is paramount to quantifying and characterizing structural, morphological, and functional correlations with clinically relevant conditions. The established gold standard for performing segmentation has been manual voxel-by-voxel labeling by a neuroanatomist expert. This process can be extremely time consuming, resource intensive, and fraught with high inter-observer variability. Hence, studies involving characterizations of novel structures or appearances have been limited in scope (numbers of subjects), scale (extent of regions assessed), and statistical power. Statistical methods to fuse data sets from several different sources (e.g., multiple human observers) have been proposed to simultaneously estimate both rater performance and the ground truth labels. However, with empirical datasets, statistical fusion has been observed to result in visually inconsistent findings. So, despite the ease and elegance of a statistical approach, single observers and/or direct voting are often used in practice. Hence, rater performance is not systematically quantified and exploited during label estimation. To date, statistical fusion methods have relied on characterizations of rater performance that do not intrinsically include spatially varying models of rater performance. Herein, we present a novel, robust statistical label fusion algorithm to estimate and account for spatially varying performance. This algorithm, COnsensus Level, Labeler Accuracy and Truth Estimation (COLLATE), is based on the simple idea that some regions of an image are difficult to label (e.g., confusion regions: boundaries or low contrast areas) while other regions are intrinsically obvious (e.g., consensus regions: centers of large regions or high contrast edges). Unlike its predecessors, COLLATE estimates the consensus level of each voxel and estimates differing models of observer behavior in each region. We show that COLLATE provides significant improvement in label accuracy and rater assessment over previous fusion methods in both simulated and empirical datasets.
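To make the "direct voting" baseline and the consensus/confusion distinction concrete, here is a minimal Python sketch. It is not the COLLATE algorithm, which the abstract describes only at a high level; it shows plain majority voting and a per-voxel agreement fraction as a rough stand-in for consensus level. The function name `majority_vote`, the tie-breaking rule, and the toy ratings are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(ratings):
    """Fuse labels by direct voting (illustrative baseline, not COLLATE).

    ratings: list of per-rater label lists, one entry per voxel.
    Returns (fused, agreement), where agreement[i] is the fraction of
    raters that chose the winning label at voxel i.
    """
    if not ratings:
        raise ValueError("need at least one rater")
    n_raters = len(ratings)
    n_voxels = len(ratings[0])
    if any(len(r) != n_voxels for r in ratings):
        raise ValueError("all raters must label the same voxels")

    fused, agreement = [], []
    for votes in zip(*ratings):
        counts = Counter(votes)
        # Most votes wins; ties go to the smallest label (an assumed rule)
        # so that the result is deterministic.
        label, count = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        fused.append(label)
        agreement.append(count / n_raters)
    return fused, agreement


# Three hypothetical raters labeling a 1-D strip of six voxels
# (0 = background, 1 = structure).
raters = [
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
]
fused, agreement = majority_vote(raters)

# Unanimous voxels loosely correspond to "consensus" regions;
# voxels with disagreement loosely correspond to "confusion" regions.
consensus_mask = [a == 1.0 for a in agreement]
```

In this toy input the raters disagree only at the two voxels near the boundary of the structure, which matches the abstract's intuition that boundaries are harder to label than the interiors of large regions. Voting weights every rater equally everywhere, whereas the statistical fusion methods named in the abstract also estimate rater performance.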


Pages: 1779-1794

Page count: 16

