Pervasive Indels and Their Evolutionary Dynamics after the Fish-Specific Genome Duplication

Cited by: 33
Authors
Guo, Baocheng [1,2]
Zou, Ming [3]
Wagner, Andreas [1,2]
Affiliations
[1] Univ Zurich, Inst Evolutionary Biol & Environm Studies, Zurich, Switzerland
[2] Swiss Inst Bioinformat, Lausanne, Switzerland
[3] Chinese Acad Sci, Inst Hydrobiol, Key Lab Aquat Biodivers & Conservat, Wuhan, Peoples R China
Funding
Swiss National Science Foundation
Keywords
indel; gene duplication; teleost; fish-specific genome duplication; multiple sequence alignment; mutation rate; functional divergence; secondary structure; protein structure; nucleotide substitution; solvent accessibility; gene loss; rates; patterns
DOI
10.1093/molbev/mss108
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Insertions and deletions (indels) in protein-coding genes are important sources of genetic variation. Their role in creating new proteins may be especially important after gene duplication. However, little is known about how indels affect the divergence of duplicate genes. We here study thousands of duplicate genes in five fish (teleost) species with completely sequenced genomes. The ancestor of these species has been subject to a fish-specific genome duplication (FSGD) event that occurred approximately 350 Ma. We find that duplicate genes contain at least 25% more indels than single-copy genes. These indels accumulated preferentially in the first 40 my after the FSGD. A lack of widespread asymmetric indel accumulation indicates that both members of a duplicate gene pair typically experience relaxed selection. Strikingly, we observe a 30-80% excess of deletions over insertions that is consistent for indels of various lengths and across the five genomes. We also find that indels preferentially accumulate inside loop regions of protein secondary structure and in regions where amino acids are exposed to solvent. We show that duplicate genes with high indel density also show high DNA sequence divergence. Indel density, but not amino acid divergence, can explain a large proportion of the tertiary structure divergence between proteins encoded by duplicate genes. Our observations are consistent across all five fish species. Taken together, they suggest a general pattern of duplicate gene evolution in which indels are important driving forces of evolutionary change.
Pages: 3005-3022
Page count: 18