Threshold Average Precision (TAP-k): a measure of retrieval designed for bioinformatics

Cited by: 23
Authors
Carroll, Hyrum D. [1]
Kann, Maricel G. [2]
Sheetlin, Sergey L. [1]
Spouge, John L. [1]
Affiliations
[1] Natl Lib Med, Natl Ctr Biotechnol Informat, Bethesda, MD 20894 USA
[2] Univ Maryland, Baltimore, MD 21250 USA
Funding
National Institutes of Health (USA);
Keywords
SEQUENCE COMPARISON METHODS; PROTEIN-SEQUENCE; BLAST; ACCURACY; DATABASE; AREA;
DOI
10.1093/bioinformatics/btq270
Chinese Library Classification code
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Motivation: Since database retrieval is a fundamental operation, the measurement of retrieval efficacy is critical to progress in bioinformatics. This article points out some issues with current methods of measuring retrieval efficacy and suggests some improvements. In particular, many studies have used the pooled receiver operating characteristic for n irrelevant records (ROCn) score, the area under the ROC curve (AUC) of a 'pooled' ROC curve, truncated at n irrelevant records. Unfortunately, the pooled ROCn score does not faithfully reflect actual usage of retrieval algorithms. Additionally, a pooled ROCn score can be very sensitive to retrieval results from as little as a single query.
Methods: To replace the pooled ROCn score, we propose the Threshold Average Precision (TAP-k), a measure closely related to the well-known average precision in information retrieval, but reflecting the usage of E-values in bioinformatics. Furthermore, in addition to conditions previously given in the literature, we introduce three new criteria that an ideal measure of retrieval efficacy should satisfy.
Results: PSI-BLAST, GLOBAL, HMMER and RPS-BLAST provided examples of using the TAP-k and pooled ROCn scores to evaluate sequence retrieval algorithms. In particular, compelling examples using real data highlight the drawbacks of the pooled ROCn score, showing that it can produce evaluations skewing far from intuitive expectations. In contrast, the TAP-k satisfies most of the criteria desired in an ideal measure of retrieval efficacy.
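The ROCn score described in the abstract (the area under a ROC curve truncated at n irrelevant records) can be illustrated with a minimal Python sketch. This is not the authors' code. The function name `roc_n`, its parameters, and the convention of padding a list that ends before n irrelevant records are reached are all assumptions made for illustration. A pooled ROCn score would apply the same computation to a single list that merges the results of all queries.

```python
from typing import Sequence


def roc_n(ranked_relevance: Sequence[bool], total_relevant: int, n: int) -> float:
    """Truncated ROC area for one ranked list, normalised to the range [0, 1].

    ranked_relevance: True for a relevant record, False for an irrelevant one,
                      ordered from best-scoring to worst-scoring (hypothetical input format).
    total_relevant:   number of relevant records that could have been retrieved.
    n:                number of irrelevant records at which the curve is truncated.
    """
    if n <= 0 or total_relevant <= 0:
        raise ValueError("n and total_relevant must both be positive")

    true_positives = 0
    false_positives = 0
    area = 0
    for is_relevant in ranked_relevance:
        if is_relevant:
            true_positives += 1
        else:
            # Each irrelevant record adds one column to the ROC curve whose
            # height is the number of relevant records ranked above it.
            false_positives += 1
            area += true_positives
            if false_positives == n:
                break

    # Assumption: if the list ends before n irrelevant records are seen,
    # the remaining columns keep the final true-positive count.
    area += (n - false_positives) * true_positives
    return area / (n * total_relevant)


if __name__ == "__main__":
    # One relevant record ranked first, one ranked below an irrelevant record.
    print(roc_n([True, False, True, False], total_relevant=2, n=2))
```

Because the score depends only on the order of records up to the n-th irrelevant one, a single query that contributes many high-scoring irrelevant records to a merged list can dominate the result, which is the sensitivity the abstract refers to.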
Pages: 1708-1713
Page count: 6
Related papers
27 in total
[1]   [Anonymous], 2006, 23 INT C MACH LEARN, DOI 10.1145/1143844.1143874
[2]   AREA ABOVE ORDINAL DOMINANCE GRAPH AND AREA BELOW RECEIVER OPERATING CHARACTERISTIC GRAPH [J].
BAMBER, D .
JOURNAL OF MATHEMATICAL PSYCHOLOGY, 1975, 12 (04) :387-415
[3]   The worldwide Protein Data Bank (wwPDB): ensuring a single, uniform archive of PDB data [J].
Berman, Helen ;
Henrick, Kim ;
Nakamura, Haruki ;
Markley, John L. .
NUCLEIC ACIDS RESEARCH, 2007, 35 :D301-D303
[4]   Assessing sequence comparison methods with reliable structurally identified distant evolutionary relationships [J].
Brenner, SE ;
Chothia, C ;
Hubbard, TJP .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1998, 95 (11) :6073-6078
[5]   Assessing sequence comparison methods with the average precision criterion [J].
Chen, ZR .
BIOINFORMATICS, 2003, 19 (18) :2456-2460
[6]   Profile hidden Markov models [J].
Eddy, SR .
BIOINFORMATICS, 1998, 14 (09) :755-763
[7]   An introduction to ROC analysis [J].
Fawcett, Tom .
PATTERN RECOGNITION LETTERS, 2006, 27 (08) :861-874
[8]   The Pfam protein families database [J].
Finn, Robert D. ;
Tate, John ;
Mistry, Jaina ;
Coggill, Penny C. ;
Sammut, Stephen John ;
Hotz, Hans-Rudolf ;
Ceric, Goran ;
Forslund, Kristoffer ;
Eddy, Sean R. ;
Sonnhammer, Erik L. L. ;
Bateman, Alex .
NUCLEIC ACIDS RESEARCH, 2008, 36 :D281-D288
[9]   Homologous over-extension: a challenge for iterative similarity searches [J].
Gonzalez, Mileidy W. ;
Pearson, William R. .
NUCLEIC ACIDS RESEARCH, 2010, 38 (07) :2177-2189
[10]   Bootstrapping and normalization for enhanced evaluations of pairwise sequence comparison [J].
Green, RE ;
Brenner, SE .
PROCEEDINGS OF THE IEEE, 2002, 90 (12) :1834-1847