Observ-OM and Observ-TAB: Universal Syntax Solutions for the Integration, Search, and Exchange of Phenotype and Genotype Information

Cited by: 14
Authors
Adamusiak, Tomasz [3, 4]
Parkinson, Helen [3, 4]
Muilu, Juha [3, 5, 6, 7, 8]
Roos, Erik [2, 8]
van der Velde, Kasper Joeri [2, 9]
Thorisson, Gudmundur A. [3, 10]
Byrne, Myles [3, 5]
Pang, Chao [2, 8]
Gollapudi, Sirisha [3, 10]
Ferretti, Vincent [8, 11]
Hillege, Hans [8, 12]
Brookes, Anthony J. [3, 8, 10]
Swertz, Morris A. [1, 2, 3, 4, 6, 7, 8, 9, 13]
Affiliations
[1] Univ Med Ctr Groningen, Genom Coordinat Ctr, Dept Genet, HPC CB50, NL-9700 RB Groningen, Netherlands
[2] Univ Groningen, Groningen Bioinformat Ctr, NL-9700 RB Groningen, Netherlands
[3] EU GEN2PHEN, Hinxton, S Cambs, England
[4] European Bioinformat Inst, European Mol Biol Lab, Hinxton, S Cambs, England
[5] Inst Mol Med Finland, Helsinki, Finland
[6] BBMRI Netherlands, BBMRI Europe, Rotterdam, Netherlands
[7] BBMRI Finland, Rotterdam, Netherlands
[8] BioSHaRE EU, Groningen, Netherlands
[9] EU PANACEA, Leicester, Leics, England
[10] Univ Leicester, Dept Genet, Leicester LE1 7RH, Leics, England
[11] MaRS Ctr, Ontario Inst Canc Res, Toronto, ON, Canada
[12] Univ Med Ctr Groningen, Dept Cardiol & Epidemiol, NL-9700 RB Groningen, Netherlands
[13] Netherlands Bioinformat Ctr, Biobank TaskForce, Nijmegen, Netherlands
Funding
Academy of Finland
Keywords
bioinformatics; data model; databases; phenotype; genotype; GENOME-WIDE ASSOCIATION; MODEL; EUROPHENOME; FRAMEWORK; RESOURCE; DATABASE;
DOI
10.1002/humu.22070
Chinese Library Classification number
Q3 [Genetics]
Subject classification codes
071007; 090102
Abstract
Genetic and epidemiological research increasingly employs large collections of phenotypic and molecular observation data from high-quality human and model organism samples. Standardization efforts have produced a few simple formats for exchanging these various data, but a lightweight and convenient data representation scheme for all data modalities does not exist. This gap hinders successful data integration, such as the assignment of mouse models to orphan diseases and phenotypic clustering for pathways. We report a unified system to integrate and compare observation data across experimental projects, disease databases, and clinical biobanks. The core object model (Observ-OM) comprises only four basic concepts to represent any kind of observation: Targets, Features, Protocols (and their Applications), and Values. An easy-to-use file format (Observ-TAB) employs Excel to represent individual and aggregate data in straightforward spreadsheets. The systems have been tested successfully on human biobank, genome-wide association study, quantitative trait loci, model organism, and patient registry data, using the MOLGENIS platform to quickly set up custom data portals. Our system will dramatically lower the barrier for future data sharing and facilitate integrated search across panels and species. All models, formats, documentation, and software are freely available as open source (LGPLv3) at http://www.observ-om.org. Hum Mutat 33: 867-873, 2012. © 2012 Wiley Periodicals, Inc.
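The four concepts named in the abstract can be illustrated with a minimal Python sketch. This is not the published Observ-OM schema: the class names follow the abstract, but every field name, the `values_for` helper, and the sample data are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Target:
    """The thing being observed, e.g. an individual or a panel (field name is hypothetical)."""
    name: str


@dataclass(frozen=True)
class Feature:
    """What is observed, e.g. height or a genotype call (fields are hypothetical)."""
    name: str
    unit: str = ""


@dataclass
class Protocol:
    """A procedure that measures one or more features."""
    name: str
    features: List[Feature] = field(default_factory=list)


@dataclass
class ProtocolApplication:
    """One concrete use of a protocol, here tagged with an ISO date string."""
    protocol: Protocol
    performed_on: str


@dataclass
class ObservedValue:
    """A value linking a target, a feature, and the application that produced it."""
    target: Target
    feature: Feature
    application: ProtocolApplication
    value: str


def values_for(target: Target, observations: List[ObservedValue]) -> Dict[str, str]:
    """Collect the observed values of one target, keyed by feature name."""
    return {o.feature.name: o.value for o in observations if o.target == target}


# Illustrative data only; none of it comes from the paper.
height = Feature("height", "cm")
genotype = Feature("example_marker_genotype")
intake = Protocol("intake_visit", [height, genotype])
visit = ProtocolApplication(intake, "2012-01-15")
patient = Target("patient_1")
other = Target("patient_2")
observations = [
    ObservedValue(patient, height, visit, "172"),
    ObservedValue(patient, genotype, visit, "A/G"),
    ObservedValue(other, height, visit, "165"),
]
print(values_for(patient, observations))
```

Storing every observation as a (target, feature, value) triple tied to a protocol application is what lets one structure hold phenotypic and molecular data alike; a spreadsheet layout with one sheet per concept would be a natural tabular counterpart, though the exact Observ-TAB column names are not given in this record.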
Pages: 867-873
Page count: 7