Flexible informatics for linking experimental data to mathematical models via DataRail

Cited by: 54
Authors
Saez-Rodriguez, Julio [1, 2]
Goldsipe, Arthur [1, 3]
Muhlich, Jeremy [1, 2]
Alexopoulos, Leonidas G. [1, 2]
Millard, Bjorn [1, 2]
Lauffenburger, Douglas A. [1, 3]
Sorger, Peter K. [1, 2, 3]
Affiliations
[1] Harvard Univ, Sch Med, Ctr Cell Decis Proc, Boston, MA 02115 USA
[2] Harvard Univ, Sch Med, Dept Syst Biol, Boston, MA 02115 USA
[3] MIT, Dept Biol Engn, Cambridge, MA 02139 USA
DOI
10.1093/bioinformatics/btn018
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: Linking experimental data to mathematical models in biology is impeded by the lack of suitable software to manage and transform data. Model calibration would be facilitated and models would increase in value were it possible to preserve links to training data along with a record of all normalization, scaling, and fusion routines used to assemble the training data from primary results. Results: We describe the implementation of DataRail, an open-source MATLAB-based toolbox that stores experimental data in flexible multi-dimensional arrays, transforms arrays so as to maximize information content, and then constructs models using internal or external tools. Data integrity is maintained via a containment hierarchy for arrays, imposition of a metadata standard based on a newly proposed MIDAS format, assignment of semantically typed universal identifiers, and implementation of a procedure for storing the history of all transformations with the array. We illustrate the utility of DataRail by processing a newly collected set of ~22 000 measurements of protein activities obtained from cytokine-stimulated primary and transformed human liver cells.
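Illustrative sketch (not part of the original record): the abstract's idea of storing the history of all transformations with the array can be pictured roughly as below. This is a minimal Python sketch, not DataRail's MATLAB API. The names `TrackedArray`, `apply`, and `scale_to_max`, the dimension names, and the toy values are all hypothetical, and the scaling step assumes strictly positive measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackedArray:
    """Hypothetical container: labelled measurements plus their processing history."""
    dims: tuple          # names of the dimensions, e.g. ("cell_type", "time_min", "signal")
    values: dict         # maps one coordinate tuple per measurement to a float
    history: tuple = ()  # ordered records of the transformations applied so far

    def apply(self, step, func, **params):
        """Return a new array holding func's output and an extended history."""
        record = {"step": step, "params": params}
        return TrackedArray(self.dims, func(self, **params), self.history + (record,))


def scale_to_max(array, over):
    """Divide each value by the largest value that shares its other coordinates.

    Assumes strictly positive values, so the peak is never zero.
    """
    axis = array.dims.index(over)
    peaks = {}
    for coords, value in array.values.items():
        group = coords[:axis] + coords[axis + 1:]
        peaks[group] = max(peaks.get(group, value), value)
    return {
        coords: value / peaks[coords[:axis] + coords[axis + 1:]]
        for coords, value in array.values.items()
    }


# Toy, made-up measurements (not data from the paper).
raw = TrackedArray(
    dims=("cell_type", "time_min", "signal"),
    values={
        ("primary", 0, "signal_A"): 1.0,
        ("primary", 30, "signal_A"): 2.0,
        ("transformed", 0, "signal_A"): 5.0,
        ("transformed", 30, "signal_A"): 10.0,
    },
)

# The scaled array carries a record of how it was derived; raw is left unchanged.
scaled = raw.apply("scale_to_max", scale_to_max, over="time_min")
```

Because each transformation returns a new object, the primary data stay untouched and any derived array can report the steps that produced it via its `history` field.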
Pages: 840-847
Page count: 8