Probability-based protein identification by searching sequence databases using mass spectrometry data

Cited by: 82
Authors
Perkins, DN
Pappin, DJC
Creasy, DM
Cottrell, JS
Affiliations
[1] Imperial Canc Res Fund, Prot Sequencing Lab, London WC2A 3PX, England
[2] Matrix Sci Ltd, London, England
Keywords
protein identification; mass spectrometry; bioinformatics;
DOI
10.1002/(SICI)1522-2683(19991201)20:18<3551::AID-ELPS3551>3.0.CO;2-2
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Several algorithms have been described in the literature for protein identification by searching a sequence database using mass spectrometry data. In some approaches, the experimental data are peptide molecular weights from the digestion of a protein by an enzyme. Other approaches use tandem mass spectrometry (MS/MS) data from one or more peptides. Still others combine mass data with amino acid sequence data. We present results from a new computer program, Mascot, which integrates all three types of search. The scoring algorithm is probability based, which has a number of advantages: (i) A simple rule can be used to judge whether a result is significant or not. This is particularly useful in guarding against false positives. (ii) Scores can be compared with those from other types of search, such as sequence homology. (iii) Search parameters can be readily optimised by iteration. The strengths and limitations of probability-based scoring are discussed, particularly in the context of high throughput, fully automated protein identification.
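The "simple rule" for significance mentioned in advantage (i) can be illustrated with a short sketch. It assumes the convention commonly documented for Mascot, which the abstract itself does not spell out: a score is reported as -10*log10(P), where P is the probability that the observed match is a random event, and a match is called significant when random matches at that score would be expected fewer than alpha times across all database entries tested. The function names and the default alpha of 0.05 are hypothetical choices for illustration, not the program's actual interface.

```python
import math


def score_from_probability(p):
    """Convert the probability that a match is random into a score, -10*log10(p).

    Assumed convention; smaller probabilities give larger scores.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the interval (0, 1]")
    return -10.0 * math.log10(p)


def significance_threshold(num_entries, alpha=0.05):
    """Score above which fewer than `alpha` random matches are expected
    when `num_entries` database entries are tested (assumed rule)."""
    if num_entries < 1:
        raise ValueError("num_entries must be at least 1")
    return -10.0 * math.log10(alpha / num_entries)


def is_significant(score, num_entries, alpha=0.05):
    """Apply the rule: a score is significant if it exceeds the threshold."""
    return score > significance_threshold(num_entries, alpha)


if __name__ == "__main__":
    # Hypothetical database size, for illustration only.
    entries = 500_000
    print(round(significance_threshold(entries), 1))
    print(is_significant(75.0, entries), is_significant(60.0, entries))
```

Under these assumptions the threshold grows with the logarithm of the database size, which is why the same raw score can be significant against a small database and not against a large one.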
Pages: 3551-3567
Number of pages: 17