New strategy for the representation and the integration of biomolecular knowledge at a cellular scale

Cited by: 18
Authors
Barriot, R
Poix, J
Groppi, A
Barré, A
Goffard, N
Sherman, D
Dutour, I
de Daruvar, A
Affiliations
[1] Univ Bordeaux 2, Ctr Bioinformat Bordeaux, F-33076 Bordeaux, France
[2] CNRS, UMR 5800, Lab Bordelais Rech & Informat, F-33405 Talence, France
[3] Univ Bordeaux 2, Lab Stat Math & Applicat, F-33076 Bordeaux, France
DOI
10.1093/nar/gkh681
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
The combination of sequencing and post-sequencing experimental approaches produces huge collections of data that are highly heterogeneous in both structure and semantics. We propose a new strategy for integrating such data. The strategy uses structured sets of sequences as a unified representation of biological information and defines a probabilistic measure of similarity between sets. A set can be composed of sequences that are known to have a biological relationship (e.g. proteins involved in a complex or a pathway) or that share similar values for a particular attribute (e.g. expression profile). We have developed a software tool, BlastSets, which implements this strategy. It relies on a database in which sets derived from diverse biological information can be deposited using a standard XML format. For a given query set, BlastSets returns the target sets in the database whose similarity to the query is statistically significant. The tool allowed us to automatically identify verified relationships between correlated expression profiles and biological pathways using publicly available data for Saccharomyces cerevisiae. It was also used to retrieve the members of a complex (the ribosome) by mining expression profiles. These first results support the relevance of the strategy and illustrate the potential of BlastSets.
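The abstract does not state which probabilistic measure BlastSets uses. As an illustration only, the sketch below assumes a hypergeometric tail probability, a common choice for scoring the overlap of two sets drawn from a shared universe of sequences. The function name `overlap_p_value`, its parameters, and the identifiers in the usage note are hypothetical and do not come from the paper or the tool.

```python
from math import comb


def overlap_p_value(query, target, universe_size):
    """Probability of an overlap at least as large as the one observed.

    Assumption (not from the source): the query is treated as a random
    draw of len(query) sequences from a universe of `universe_size`
    sequences, of which len(target) belong to the target set
    (hypergeometric model).
    """
    query, target = set(query), set(target)
    n, m = len(query), len(target)
    if n > universe_size or m > universe_size:
        raise ValueError("sets cannot be larger than the universe")

    k = len(query & target)          # observed overlap
    total = comb(universe_size, n)   # number of possible query sets
    # Sum the upper tail P(X >= k); math.comb returns 0 for impossible terms.
    tail = sum(
        comb(m, i) * comb(universe_size - m, n - i)
        for i in range(k, min(n, m) + 1)
    )
    return tail / total
```

Under this assumption, a query set would be compared with each target set in the database and only targets whose p-value falls below a chosen threshold would be reported. For example, `overlap_p_value({"g1", "g2"}, {"g1", "g2"}, 4)` evaluates one of the six possible two-element draws, giving 1/6.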
Pages: 3581-3589
Page count: 9