Popitam: Towards new heuristic strategies to improve protein identification from tandem mass spectrometry data

Cited by: 71
Authors
Hernandez, P
Gras, R
Frey, J
Appel, RD
Affiliations
[1] CMU, Swiss Inst Bioinformat, CH-1211 Geneva 4, Switzerland
[2] Univ Geneva, Geneva, Switzerland
[3] Univ Hosp Geneva, Geneva, Switzerland
Keywords
ant colony optimization; mutated or modified peptides; protein automated identification; spectrum graph; tandem mass spectrometry;
DOI
10.1002/pmic.200300402
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
In recent years, proteomics research has gained importance due to increasingly powerful techniques in protein purification, mass spectrometry and identification, and due to the development of extensive protein and DNA databases from various organisms. Nevertheless, current identification methods from spectrometric data have difficulties in handling modifications or mutations in the source peptide. Moreover, they have low performance when run on large databases (such as genomic databases), or with low quality data, for example due to bad calibration or low fragmentation of the source peptide. We present a new algorithm dedicated to automated protein identification from tandem mass spectrometry (MS/MS) data by searching a peptide sequence database. Our identification approach shows promising properties for solving the specific difficulties enumerated above. It consists of matching theoretical peptide sequences issued from a database with a structured representation of the source MS/MS spectrum. The representation is similar to the spectrum graphs commonly used by de novo sequencing software. The identification process involves the parsing of the graph in order to emphasize relevant sections for each theoretical sequence, and leads to a list of peptides ranked by a correlation score. The parsing of the graph, which can be a highly combinatorial task, is performed by a bio-inspired algorithm called the Ant Colony Optimization algorithm.
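The idea of matching database peptides against a spectrum graph can be illustrated with a small Python sketch. It is not Popitam's implementation: the abstract does not specify the graph construction, the correlation score, or the ant colony search, so everything here is an assumption made for illustration. Nodes are idealised prefix masses starting at 0.0 Da, edges link nodes whose mass difference equals one residue mass within a tolerance `tol`, and the hypothetical helpers `longest_tag_match` and `rank_candidates` use a greedy walk in place of Ant Colony Optimization and a matched-residue count in place of a correlation score.

```python
# Monoisotopic residue masses (Da) for a few amino acids (standard values).
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "V": 99.06841,
    "L": 113.08406,
}


def build_spectrum_graph(masses, tol=0.02):
    """Link two nodes when their mass difference matches one residue mass."""
    nodes = sorted(masses)
    edges = {}
    for i, low in enumerate(nodes):
        for high in nodes[i + 1:]:
            for aa, mass in RESIDUE_MASS.items():
                if abs((high - low) - mass) <= tol:
                    edges.setdefault(low, []).append((high, aa))
    return edges


def longest_tag_match(edges, start, peptide):
    """Count the leading residues of `peptide` that can be walked from `start`.

    Greedy stand-in for the graph parsing step: it follows the first edge
    carrying the expected residue label and stops at the first gap.
    """
    node, matched = start, 0
    for aa in peptide:
        targets = [high for high, label in edges.get(node, []) if label == aa]
        if not targets:
            break
        node = targets[0]
        matched += 1
    return matched


def rank_candidates(edges, start, candidates):
    """Rank candidate peptides by matched residues, best first."""
    scored = [(longest_tag_match(edges, start, p), p) for p in candidates]
    return sorted(scored, key=lambda item: (-item[0], item[1]))


# Idealised prefix masses of the peptide "GAS" (illustrative data only).
peaks = [0.0, 57.02146, 128.05857, 215.09060]
graph = build_spectrum_graph(peaks)
ranking = rank_candidates(graph, 0.0, ["GAS", "GAV", "SAG"])
```

Here `ranking` places "GAS" first with three matched residues, "GAV" second with two, and "SAG" last with none. A greedy walk of this kind cannot bridge missing peaks, mutations or modifications, which is the kind of combinatorial parsing the abstract assigns to the ant colony search.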
Pages: 870-878
Page count: 9
Related papers
30 in total
  • [1] Allauzen C, 1999, LECT NOTES COMPUT SC, V1725, P295
  • [2] [Anonymous], 2001, Bioinformatics
  • [3] The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000
    Bairoch, A
    Apweiler, R
    [J]. NUCLEIC ACIDS RESEARCH, 2000, 28 (01) : 45 - 48
  • [4] Bonabeau E, 1999, SWARM INTELLIGENCE N
  • [5] A dynamic programming approach to de novo peptide sequencing via tandem mass spectrometry
    Chen, T
    Kao, MY
    Tepel, M
    Rush, J
    Church, GM
    [J]. JOURNAL OF COMPUTATIONAL BIOLOGY, 2001, 8 (03) : 325 - 337
  • [6] COLORNI A, 1992, FROM ANIM ANIMAT, P134
  • [7] De novo peptide sequencing via tandem mass spectrometry
    Dancík, V
    Addona, TA
    Clauser, KR
    Vath, JE
    Pevzner, PA
    [J]. JOURNAL OF COMPUTATIONAL BIOLOGY, 1999, 6 (3-4) : 327 - 342
  • [8] Ant algorithms for discrete optimization
    Dorigo, M
    Di Caro, G
    Gambardella, LM
    [J]. ARTIFICIAL LIFE, 1999, 5 (02) : 137 - 172
  • [9] AN APPROACH TO CORRELATE TANDEM MASS-SPECTRAL DATA OF PEPTIDES WITH AMINO-ACID-SEQUENCES IN A PROTEIN DATABASE
    ENG, JK
    MCCORMACK, AL
    YATES, JR
    [J]. JOURNAL OF THE AMERICAN SOCIETY FOR MASS SPECTROMETRY, 1994, 5 (11) : 976 - 989
  • [10] A statistical basis for testing the significance of mass spectrometric protein identification results
    Eriksson, J
    Chait, BT
    Fenyö, D
    [J]. ANALYTICAL CHEMISTRY, 2000, 72 (05) : 999 - 1005