Whole-Genome Alignment and Comparative Annotation

Cited by: 55
Authors
Armstrong, Joel [1]
Fiddes, Ian T. [1, 2]
Diekhans, Mark [1]
Paten, Benedict [1]
Affiliations
[1] Univ Calif Santa Cruz, UC Santa Cruz Genom Inst, Santa Cruz, CA 95064 USA
[2] 10x Genom, Pleasanton, CA 94566 USA
Source
ANNUAL REVIEW OF ANIMAL BIOSCIENCES, VOL 7 | 2019 / Volume 7
Keywords
genome alignment; genome annotation; comparative genomics; MULTIPLE ALIGNMENT; SEQUENCE ALIGNMENT; GENE PREDICTION; MOUSE; DUPLICATION; IMPROVE; SEARCH; REARRANGEMENT; RECOGNITION; EVOLUTION
DOI
10.1146/annurev-animal-020518-115005
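Chinese Library Classification number
S8 [Animal husbandry, veterinary medicine, hunting, sericulture, apiculture]
Discipline classification code
090502 [Animal nutrition and feed science]
Abstract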
Rapidly improving sequencing technology, coupled with computational developments in sequence assembly, is making reference-quality genome assembly economical. Hundreds of vertebrate genome assemblies are now publicly available, and projects are being proposed to sequence thousands of additional species in the next few years. Such dense sampling of the tree of life should give an unprecedented new understanding of evolution and allow a detailed determination of the events that led to the wealth of biodiversity around us. To gain this knowledge, these new genomes must be compared through genome alignment (at the sequence level) and comparative annotation (at the gene level). However, different alignment and annotation methods have different characteristics; before starting a comparative genomics analysis, it is important to understand the nature of, and biases and limitations inherent in, the chosen methods. This review is intended to act as a technical but high-level overview of the field that should provide this understanding. We briefly survey the state of the genome alignment and comparative annotation fields and potential future directions for these fields in a new, large-scale era of comparative genomics.
Pages: 41-64
Page count: 24