Rebooting the human mitochondrial phylogeny: an automated and scalable methodology with expert knowledge

Cited by: 12
Authors
Blanco, Roberto [1, 2]
Mayordomo, Elvira [1, 2]
Montoya, Julio [3, 4]
Ruiz-Pesini, Eduardo [3, 4, 5]
Affiliations
[1] Univ Zaragoza, Dept Informat & Ingn Sistemas, Zaragoza 50018, Spain
[2] Univ Zaragoza, Inst Invest Ingn Aragon, Zaragoza 50018, Spain
[3] Univ Zaragoza, Dept Bioquim & Biol Mol & Celular, E-50013 Zaragoza, Spain
[4] Ctr Invest Biomed Red Enfermedades Raras, Zaragoza 50013, Spain
[5] Agencia Aragonesa Invest & Desarrollo, Zaragoza 50013, Spain
Keywords
MULTIPLE SEQUENCE ALIGNMENT; CONTROL REGION; GENOME; DNA; EVOLUTIONARY; MODEL; COMPLEXITY; CONFIDENCE; DIVERSITY; DELETION;
DOI
10.1186/1471-2105-12-174
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
070307 [Chemical Biology];
Abstract
Background: Mitochondrial DNA is an ideal source of information to conduct evolutionary and phylogenetic studies due to its extraordinary properties and abundance. Many insights can be gained from these, including but not limited to screening genetic variation to identify potentially deleterious mutations. However, such advances require efficient solutions to very difficult computational problems, a need that is hampered by the very plenty of data that confers strength to the analysis.
Results: We develop a systematic, automated methodology to overcome these difficulties, building from readily available, public sequence databases to high-quality alignments and phylogenetic trees. Within each stage in an autonomous workflow, outputs are carefully evaluated and outlier detection rules defined to integrate expert knowledge and automated curation, hence avoiding the manual bottleneck found in past approaches to the problem. Using these techniques, we have performed exhaustive updates to the human mitochondrial phylogeny, illustrating the power and computational scalability of our approach, and we have conducted some initial analyses on the resulting phylogenies.
Conclusions: The problem at hand demands careful definition of inputs and adequate algorithmic treatment for its solutions to be realistic and useful. It is possible to define formal rules to address the former requirement by refining inputs directly and through their combination as outputs, and the latter are also of help to ascertain the performance of chosen algorithms. Rules can exploit known or inferred properties of datasets to simplify inputs through partitioning, therefore cutting computational costs and affording work on rapidly growing, otherwise intractable datasets. Although expert guidance may be necessary to assist the learning process, low-risk results can be fully automated and have proved themselves convenient and valuable.
Pages: 13
Related papers
49 in total
[1]
Mitochondrial DNA structure in the Arabian Peninsula [J].
Abu-Amero, Khaled K. ;
Larruga, Jose M. ;
Cabrera, Vicente M. ;
Gonzalez, Ana M. .
BMC EVOLUTIONARY BIOLOGY, 2008, 8 (1)
[2]
SEQUENCE AND ORGANIZATION OF THE HUMAN MITOCHONDRIAL GENOME [J].
ANDERSON, S ;
BANKIER, AT ;
BARRELL, BG ;
DEBRUIJN, MHL ;
COULSON, AR ;
DROUIN, J ;
EPERON, IC ;
NIERLICH, DP ;
ROE, BA ;
SANGER, F ;
SCHREIER, PH ;
SMITH, AJH ;
STADEN, R ;
YOUNG, IG .
NATURE, 1981, 290 (5806) :457-465
[3]
Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA [J].
Andrews, RM ;
Kubacka, I ;
Chinnery, PF ;
Lightowlers, RN ;
Turnbull, DM ;
Howell, N .
NATURE GENETICS, 1999, 23 (02) :147-147
[4]
[Anonymous], SIMPLIFIED MTDNA LIN
[5]
Mitogenomic analyses of caniform relationships [J].
Arnason, Ulfur ;
Gullberg, Anette ;
Janke, Axel ;
Kullberg, Morgan .
MOLECULAR PHYLOGENETICS AND EVOLUTION, 2007, 45 (03) :863-874
[6]
HmtDB, a Human Mitochondrial Genomic resource based on variability studies supporting population genetics and biomedical research [J].
Attimonelli, M ;
Accetturo, M ;
Santamaria, M ;
Lascaro, D ;
Scioscia, G ;
Pappadà, G ;
Russo, L ;
Zanchetta, L ;
Tommaseo-Ponzetta, M .
BMC BIOINFORMATICS, 2005, 6 (Suppl 4)
[7]
A Novel 154-bp Deletion in the Human Mitochondrial DNA Control Region in Healthy Individuals [J].
Behar, Doron M. ;
Blue-Smith, Jason ;
Soria-Hernanz, David E. ;
Tzur, Shay ;
Hadid, Yarin ;
Bormans, Concetta ;
Moen, Alexander ;
Tyler-Smith, Chris ;
Quintana-Murci, Lluis ;
Wells, R. Spencer .
HUMAN MUTATION, 2008, 29 (12) :1387-1391
[8]
Benson DA, 2013, NUCLEIC ACIDS RES, V41, pD36, DOI [10.1093/nar/gkn723, 10.1093/nar/gkp1024, 10.1093/nar/gkw1070, 10.1093/nar/gkr1202, 10.1093/nar/gkx1094, 10.1093/nar/gkl986, 10.1093/nar/gkq1079, 10.1093/nar/gks1195, 10.1093/nar/gkg057]
[9]
The Acquisition of an Inheritable 50-bp Deletion in the Human mtDNA Control Region Does Not Affect the mtDNA Copy Number in Peripheral Blood Cells [J].
Bi, Rui ;
Zhang, A-Mei ;
Zhang, Wen ;
Kong, Qing-Peng ;
Wu, Bei-Ling ;
Yang, Xiao-Hong ;
Wang, Dong ;
Zou, Yang ;
Zhang, Ya-Ping ;
Yao, Yong-Gang .
HUMAN MUTATION, 2010, 31 (05) :538-543
[10]
Blanco R., 2010, Proceedings 2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2010), P57, DOI 10.1109/BIBM.2010.5706536