Data mining parasite genomes

Cited by: 5
Author
Berriman, M [1]
Affiliation
[1] Wellcome Trust Sanger Inst, Hinxton CB10 1SA, England
Keywords
annotation; genome; gene ontology; gene prediction
DOI
10.1017/S0031182004006857
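Chinese Library Classification number
R38 [Medical parasitology]; Q [Biological sciences]
Discipline classification code
07 [Science]; 0710 [Biology]; 09 [Agriculture]; 100103 [Pathogen biology]
Abstract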
The term 'data mining' can be used to describe any process where useful information is extracted from data with a large background of 'noise'. In the context of a genome project, several stages involve data mining. Amongst the sequence data, 'signals' need to be detected that indicate the presence of interesting features. Often this involves differentiating between transcribed and non-transcribed bases to predict coding regions. After detection, defining the roles of these sequences involves sifting through multiple lines of evidence. If these roles are accurately reflected in genome annotation, they can be used by researchers to frame queries and interrogate the data further.
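As a rough illustration of what detecting a 'signal' against background sequence can mean, the sketch below scans the forward strand for open reading frames (an ATG followed in-frame by a stop codon). This is not the method used in the article or by any gene finder it cites; the function name `find_orfs`, the `min_codons` threshold, and the use of the standard stop codons are assumptions made for the example. Real gene prediction combines many more lines of evidence.

```python
# Standard stop codons; an assumption for this toy example.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    An ORF here is an ATG followed in-frame by a stop codon, with at
    least `min_codons` codons before the stop. Coordinates are 0-based,
    and `end` is exclusive and includes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close the ORF and keep scanning
    return orfs


if __name__ == "__main__":
    for start, end, frame in find_orfs("CCATGAAATTTGGGTAACC"):
        print(start, end, frame)
```

Raising `min_codons` is the simplest way to trade sensitivity for fewer spurious hits, which mirrors the signal-versus-noise framing of the abstract.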
Pages: S23-S31
Number of pages: 9