Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence

Times cited: 163
Authors
Cheung, J
Estivill, X
Khaja, R
MacDonald, JR
Lau, K
Tsui, LC
Scherer, SW
Affiliations
[1] Hosp Sick Children, Res Inst, Program Genet & Genom Biol, Toronto, ON M5G 1X8, Canada
[2] Univ Pompeu Fabra, Genom Regulat Ctr, Genes & Dis Program, E-08003 Barcelona, Spain
[3] Univ Pompeu Fabra, Fac Ciencias Salut & Vida, E-08003 Barcelona, Spain
[4] Univ Toronto, Dept Mol & Med Genet, Toronto, ON M5G 1X8, Canada
Keywords
DOI
10.1186/gb-2003-4-4-r25
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Previous studies have suggested that recent segmental duplications, which are often involved in the chromosome rearrangements underlying genomic disease, account for some 5% of the human genome. We have developed rapid computational heuristics based on BLAST analysis to detect segmental duplications, as well as regions containing potential sequence misassignments, in the human genome assemblies. Results: Our analysis of the June 2002 public human genome assembly revealed that 107.4 of 3,043.1 megabases (Mb) (3.53%) of sequence contained segmental duplications, each at least 5 kb in size and with at least 90% sequence identity. We also found that 38.9 Mb (1.28%) of sequence within this assembly is likely to be involved in sequence misassignment errors. Furthermore, we have identified a significant subset (199,965 of 2,327,473, or 8.6%) of single-nucleotide polymorphisms (SNPs) in the public databases that are not true SNPs but are potential paralogous sequence variants. Conclusion: Using two distinct computational approaches, we have identified most of the sequences in the human genome that have undergone recent segmental duplications. Near-identical segmental duplications present a major challenge to the completion of the human genome sequence. The potential sequence misassignments detected in this study will require additional effort to resolve.
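The thresholds and percentages quoted in the abstract can be illustrated with a short sketch. It is not the authors' pipeline, which the abstract does not describe in detail: the `Hit` record, the helper names, and the three sample alignments are hypothetical and exist only to show how a size cutoff of 5 kb and an identity cutoff of 90% would be applied, and how the reported fractions follow from the stated totals.

```python
from typing import NamedTuple

# Thresholds taken from the abstract; everything else here is illustrative.
MIN_LENGTH_BP = 5_000       # at least 5 kb
MIN_IDENTITY_PCT = 90.0     # at least 90% identity


class Hit(NamedTuple):
    """Hypothetical pairwise alignment record (not a real BLAST parser)."""
    query: str
    subject: str
    identity_pct: float
    length_bp: int


def is_candidate_duplication(hit: Hit) -> bool:
    """Apply only the size and identity cutoffs stated in the abstract."""
    return hit.length_bp >= MIN_LENGTH_BP and hit.identity_pct >= MIN_IDENTITY_PCT


def percent(part: float, whole: float) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Made-up example alignments, used only to exercise the filter.
hits = [
    Hit("chr7_a", "chr7_b", 97.3, 12_400),    # passes both cutoffs
    Hit("chr1_a", "chr15_a", 88.9, 20_000),   # identity below 90%
    Hit("chr22_a", "chr22_b", 99.1, 3_200),   # shorter than 5 kb
]
candidates = [h for h in hits if is_candidate_duplication(h)]

# Figures reported in the abstract.
dup_pct = percent(107.4, 3043.1)            # duplicated sequence, Mb
misassigned_pct = percent(38.9, 3043.1)     # potentially misassigned, Mb
psv_pct = percent(199_965, 2_327_473)       # potential paralogous variants

print(len(candidates), dup_pct, misassigned_pct, round(psv_pct, 1))
```

A real analysis would also need to exclude self-alignments and merge overlapping hits, which this sketch leaves out.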
Pages: 10
Related papers
26 items in total
  • [1] Basic local alignment search tool
    Altschul, SF
    Gish, W
    Miller, W
    Myers, EW
    Lipman, DJ
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) : 403 - 410
  • [2] Recent segmental duplications in the human genome
    Bailey, JA
    Gu, ZP
    Clark, RA
    Reinert, K
    Samonte, RV
    Schwartz, S
    Adams, MD
    Myers, EW
    Li, PW
    Eichler, EE
    [J]. SCIENCE, 2002, 297 (5583) : 1003 - 1007
  • [3] Segmental duplications: Organization and impact within the current Human Genome Project assembly
    Bailey, JA
    Yavor, AM
    Massa, HF
    Trask, BJ
    Eichler, EE
    [J]. GENOME RESEARCH, 2001, 11 (06) : 1005 - 1017
  • [4] Segmental duplications: An 'expanding' role in genomic instability and disease
    Emanuel, BS
    Shaikh, TH
    [J]. NATURE REVIEWS GENETICS, 2001, 2 (10) : 791 - 800
  • [5] Chromosomal regions containing high-density and ambiguously mapped putative single nucleotide polymorphisms (SNPs) correlate with segmental duplications in the human genome
    Estivill, X
    Cheung, J
    Pujana, MA
    Nakabayashi, K
    Scherer, SW
    Tsui, LC
    [J]. HUMAN MOLECULAR GENETICS, 2002, 11 (17) : 1987 - 1995
  • [6] Heterozygous submicroscopic inversions involving olfactory receptor-gene clusters mediate the recurrent t(4;8)(p16;p23) translocation
    Giglio, S
    Calvari, V
    Gregato, G
    Gimelli, G
    Camanini, S
    Giorda, R
    Ragusa, A
    Guerneri, S
    Selicorni, A
    Stumm, M
    Tonnies, H
    Ventura, M
    Zollino, M
    Neri, G
    Barber, J
    Wieczorek, D
    Rocchi, M
    Zuffardi, O
    [J]. AMERICAN JOURNAL OF HUMAN GENETICS, 2002, 71 (02) : 276 - 285
  • [7] Olfactory receptor-gene clusters, genomic-inversion polymorphisms, and common chromosome rearrangements
    Giglio, S
    Broman, KW
    Matsumoto, N
    Calvari, V
    Gimelli, G
    Neumann, T
    Ohashi, H
    Voullaire, L
    Larizza, D
    Giorda, R
    Weber, JL
    Ledbetter, DH
    Zuffardi, O
    [J]. AMERICAN JOURNAL OF HUMAN GENETICS, 2001, 68 (04) : 874 - 883
  • [8] A polymorphic genomic duplication on human chromosome 15 is a susceptibility factor for panic and phobic disorders
    Gratacòs, M
    Nadal, M
    Martín-Santos, R
    Pujana, MA
    Gago, J
    Peral, B
    Armengol, L
    Ponsa, I
    Miró, R
    Bulbena, A
    Estivill, X
    [J]. CELL, 2001, 106 (03) : 367 - 379
  • [9] Positive selection of a gene family during the emergence of humans and African apes
    Johnson, ME
    Viggiano, L
    Bailey, JA
    Abdul-Rauf, M
    Goodwin, G
    Rocchi, M
    Eichler, EE
    [J]. NATURE, 2001, 413 (6855) : 514 - 519
  • [10] The AZFc region of the Y chromosome features massive palindromes and uniform recurrent deletions in infertile men
    Kawaguchi, TK
    Skaletsky, H
    Brown, LG
    Minx, PJ
    Cordum, HS
    Waterston, RH
    Wilson, RK
    Silber, S
    Oates, R
    Rozen, S
    Page, DC
    [J]. NATURE GENETICS, 2001, 29 (03) : 279 - 286