Reliable prediction of T-cell epitopes using neural networks with novel sequence representations

Cited by: 836
Authors
Nielsen, M
Lundegaard, C
Worning, P
Lauemoller, SL
Lamberth, K
Buus, S
Brunak, S
Lund, O
Affiliations
[1] Tech Univ Denmark, Bioctr DTU, Ctr Biol Sequence Anal, DK-2800 Lyngby, Denmark
[2] Univ Copenhagen, Inst Med Microbiol & Immunol, Dept Expt Immunol, DK-2200 Copenhagen, Denmark
Keywords
T-cell class I epitope; HLA-A2; artificial neural network; hidden Markov model; sequence encoding; mutual information;
DOI
10.1110/ps.0239403
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
In this paper we describe an improved neural network method to predict T-cell class I epitopes. A novel input representation has been developed consisting of a combination of sparse encoding, Blosum encoding, and input derived from hidden Markov models. We demonstrate that the combination of several neural networks derived using different sequence-encoding schemes has a performance superior to neural networks derived using a single sequence-encoding scheme. The new method is shown to have a performance that is substantially higher than that of other methods. By use of mutual information calculations we show that peptides that bind to the HLA A*0204 complex display signal of higher order sequence correlations. Neural networks are ideally suited to integrate such higher order correlations when predicting the binding affinity. It is this feature combined with the use of several neural networks derived from different and novel sequence-encoding schemes and the ability of the neural network to be trained on data consisting of continuous binding affinities that gives the new method an improved performance. The difference in predictive performance between the neural network methods and that of the matrix-driven methods is found to be most significant for peptides that bind strongly to the HLA molecule, confirming that the signal of higher order sequence correlation is most strongly present in high-binding peptides. Finally, we use the method to predict T-cell epitopes for the genome of hepatitis C virus and discuss possible applications of the prediction method to guide the process of rational vaccine design.
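Two ideas named in the abstract, sparse encoding and mutual information between peptide positions, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the alphabetical residue ordering, and the toy peptides in the usage note are assumptions made for illustration, and the mutual information is a plain frequency-based estimate with none of the corrections (for example for small sample size) that a real analysis would need.

```python
import math
from collections import Counter

# The 20 standard amino acids; the ordering is an arbitrary choice for this sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sparse_encode(peptide):
    """One-hot ("sparse") encode a peptide: 20 inputs per residue, exactly one set to 1."""
    vector = []
    for residue in peptide:
        row = [0] * len(AMINO_ACIDS)
        row[AMINO_ACIDS.index(residue)] = 1
        vector.extend(row)
    return vector


def mutual_information(peptides, i, j):
    """Estimate mutual information (in bits) between 0-based positions i and j.

    Assumes all peptides have the same length and uses raw observed frequencies.
    """
    n = len(peptides)
    pair_counts = Counter((p[i], p[j]) for p in peptides)
    i_counts = Counter(p[i] for p in peptides)
    j_counts = Counter(p[j] for p in peptides)
    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = i_counts[a] / n
        p_b = j_counts[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi
```

As a usage note, `sparse_encode` turns a 9-residue peptide into a list of 180 values, and `mutual_information(["AA", "AA", "CC", "CC"], 0, 1)` on these made-up two-residue strings gives 1 bit because the two positions always agree. A position-specific scoring matrix treats positions independently, so correlations of this kind are the signal the abstract says neural networks can exploit.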
Pages: 1007-1017
Page count: 11