Profiling the malaria genome: a gene survey of three species of malaria parasite with comparison to other apicomplexan species

Cited by: 28
Authors
Carlton, JMR
Muller, R
Yowell, CA
Fluegge, MR
Sturrock, KA
Pritt, JR
Vargas-Serrato, E
Galinski, MR
Barnwell, JW
Mulder, N
Kanapin, A
Cawley, SE
Hide, WA
Dame, JB
Affiliations
[1] NIH, Computat Biol Branch, Natl Ctr Biotechnol Informat, Natl Lib Med, Bethesda, MD 20892 USA
[2] Univ Florida, Coll Vet Med, Dept Pathobiol, Gainesville, FL 32608 USA
[3] Univ Western Cape, S African Natl Bioinformat Inst, ZA-7535 Bellville, South Africa
[4] Emory Univ, Sch Med, Yerkes Reg Primate Res Ctr, Emory Vaccine Res Ctr, Atlanta, GA 30329 USA
[5] Ctr Dis Control & Prevent, Div Parasit Dis, Atlanta, GA 30341 USA
[6] European Bioinformat Inst, EMBL Outstn, Hinxton CB10 1SD, Cambs, England
[7] Affymetrix, Emeryville, CA 94608 USA
Keywords
malaria; apicomplexa; comparative genomics; proteome;
DOI
10.1016/S0166-6851(01)00371-1
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification codes
071010 ; 081704 ;
Abstract
We have undertaken the first comparative pilot gene discovery analysis of approximately 25 000 random genomic and expressed sequence tags (ESTs) from three species of Plasmodium, the infectious agent that causes malaria. A total of 5482 genome survey sequences (GSSs) and 5582 ESTs were generated from mung bean nuclease (MBN) and cDNA libraries, respectively, of the ANKA line of the rodent malaria parasite Plasmodium berghei, and 10 874 GSSs generated from MBN libraries of the Salvador I and Belem lines of Plasmodium vivax, the most geographically wide-spread human malaria pathogen. These tags, together with 2438 Plasmodium falciparum sequences present in GenBank, were used to perform first-pass assembly and transcript reconstruction, and non-redundant consensus sequence datasets created. The datasets were compared against public protein databases and more than 1000 putative new Plasmodium proteins identified based on sequence similarity. Homologs of previously characterized Plasmodium genes were also identified, increasing the number of P. vivax and P. berghei sequences in public databases at least 10-fold. Comparative studies with other species of Apicomplexa identified interesting homologs of possible therapeutic or diagnostic value. A gene prediction program, Phat, was used to predict probable open reading frames for proteins in all three datasets. Predicted and non-redundant BLAST-matched proteins were submitted to InterPro, an integrated database of protein domains, signatures and families, for functional classification. Thus a partial predicted proteome was created for each species. This first comparative analysis of Plasmodium protein coding sequences represents a valuable resource for further studies on the biology of this important pathogen. © 2001 Elsevier Science B.V. All rights reserved.
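The sequence counts quoted in the abstract can be tallied to see how they relate to the "approximately 25 000" figure. The short Python sketch below uses only the numbers stated in the abstract; the dictionary keys and variable names are illustrative labels, not identifiers from the paper.

```python
# Sequence counts exactly as stated in the abstract.
# The key names are illustrative labels chosen for this sketch.
counts = {
    "P. berghei GSS": 5482,
    "P. berghei EST": 5582,
    "P. vivax GSS": 10874,
    "P. falciparum (GenBank)": 2438,
}

# Tags newly generated in the study (everything except the GenBank set).
new_tags = sum(n for name, n in counts.items() if "GenBank" not in name)

# All tags analysed, including the P. falciparum sequences from GenBank.
total = sum(counts.values())

print(f"newly generated: {new_tags}, total analysed: {total}")
```

The stated counts sum to just under the rounded figure given in the abstract's first sentence.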
Pages: 201-210
Number of pages: 10