On filtering false positive transmembrane protein predictions

Cited by: 118
Authors
Cserzö, M
Eisenhaber, F
Eisenhaber, B
Simon, I
Affiliations
[1] IMP Bioinformat, A-1030 Vienna, Austria
[2] Univ Birmingham, Sch Biosci, Birmingham B15 2TT, W Midlands, England
[3] Hungarian Acad Sci, Biol Res Ctr, Inst Enzymol, H-1518 Budapest, Hungary
Source
PROTEIN ENGINEERING | 2002 / Vol. 15 / No. 09
Keywords
automated sequence database screening; DAS-TMfilter; genome sequence annotation; transmembrane region prediction;
DOI
10.1093/protein/15.9.745
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification code
071010 ; 081704 ;
Abstract
While helical transmembrane (TM) region prediction tools achieve high (>90%) success rates for real integral membrane proteins, they produce a considerable number of false positive hits in sequences of known non-transmembrane queries. We propose a modification of the dense alignment surface (DAS) method that achieves a substantial decrease in the false positive error rate. Essentially, a sequence that includes possible transmembrane regions is compared in a second step with TM segments in a sequence library of documented transmembrane proteins. If the performance of the query sequence against the library of documented TM segment-containing sequences in this test is lower than an empirical threshold, it is classified as a non-transmembrane protein. The probability of false positive prediction for trusted TM region hits is expressed in terms of E-values. The modified DAS method, the DAS-TMfilter algorithm, has an unchanged high sensitivity for TM segments (~95% detected in a learning set of 128 documented transmembrane proteins). At the same time, the selectivity measured over a non-redundant set of 526 soluble proteins with known 3D structure is ~99%, mainly because a large number of falsely predicted single membrane-pass proteins are eliminated by the DAS-TMfilter algorithm.
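The two-step logic described in the abstract (predict candidate TM regions, then compare against a library of documented TM segments and apply a threshold) might be sketched roughly as follows. This is only an illustration of the control flow, not the DAS-TMfilter algorithm itself: the hydrophobic-window predictor, the identity-based library score, the threshold value, and all function names are hypothetical stand-ins, and E-values are not modelled.

```python
from typing import Callable, List, Sequence, Tuple

Segment = Tuple[int, int]  # half-open [start, end) indices into the query

# Toy residue set standing in for a real hydrophobicity profile (assumption).
HYDROPHOBIC = frozenset("AILMFVWC")


def toy_predict_segments(seq: str, window: int = 20,
                         min_hydrophobic: int = 16) -> List[Segment]:
    """Step 1 stand-in: report non-overlapping windows rich in hydrophobic residues."""
    segments: List[Segment] = []
    i = 0
    while i + window <= len(seq):
        count = sum(1 for r in seq[i:i + window] if r in HYDROPHOBIC)
        if count >= min_hydrophobic:
            segments.append((i, i + window))
            i += window  # skip past the accepted window
        else:
            i += 1
    return segments


def toy_library_score(seq: str, segments: Sequence[Segment],
                      library: Sequence[str]) -> float:
    """Step 2 stand-in: best ungapped identity fraction of any candidate
    segment against any documented TM segment in the library."""
    best = 0.0
    for start, end in segments:
        candidate = seq[start:end]
        for known in library:
            matches = sum(1 for a, b in zip(candidate, known) if a == b)
            best = max(best, matches / max(len(candidate), len(known)))
    return best


def filter_tm_prediction(
    seq: str,
    library: Sequence[str],
    threshold: float,
    predict: Callable[[str], List[Segment]] = toy_predict_segments,
    score: Callable[[str, Sequence[Segment], Sequence[str]], float] = toy_library_score,
) -> Tuple[bool, List[Segment]]:
    """Return (is_transmembrane, trusted_segments) for one query sequence."""
    candidates = predict(seq)
    if not candidates:
        return False, []  # nothing predicted in the first step
    if score(seq, candidates, library) < threshold:
        return False, []  # below the empirical threshold: treated as non-TM
    return True, candidates


library = ["L" * 20]  # hypothetical one-entry library of documented TM segments
tm_like = "DDDDD" + "L" * 20 + "KKKKK"
false_hit = "DDDDD" + "A" * 20 + "KKKKK"
print(filter_tm_prediction(tm_like, library, threshold=0.5))
print(filter_tm_prediction(false_hit, library, threshold=0.5))
```

In this toy setting both queries pass the first step, but only the one resembling a library segment survives the second. That second-step rejection is the kind of false positive elimination the abstract attributes to the filter.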
Pages: 745-752
Page count: 8