An intermediate grade of finished genomic sequence suitable for comparative analyses

Cited by: 68
Authors
Blakesley, RW
Hansen, NF
Mullikin, JC
Thomas, PJ
McDowell, JC
Maskeri, B
Young, AC
Benjamin, B
Brooks, SY
Coleman, BI
Gupta, J
Ho, SL
Karlins, EM
Maduro, QL
Stantripop, S
Tsurgeon, C
Vogt, JL
Walker, MA
Masiello, CA
Guan, XB
Bouffard, GG
Green, ED [1]
Affiliations
[1] NHGRI, NIH Intramural Sequencing Ctr, NIH, Bethesda, MD 20892 USA
[2] NHGRI, Genome Technol Branch, NIH, Bethesda, MD 20892 USA
DOI
10.1101/gr.2648404
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
Although the cost of generating draft-quality genomic sequence continues to decline, refining that sequence by the process of "sequence finishing" remains expensive. Near-perfect finished sequence is an appropriate goal for the human genome and a small set of reference genomes; however, such a high-quality product cannot be cost-justified for large numbers of additional genomes, at least for the foreseeable future. Here we describe the generation and quality of an intermediate grade of finished genomic sequence (termed comparative-grade finished sequence), which is tailored for use in multispecies sequence comparisons. Our analyses indicate that this sequence is very high quality (with the residual gaps and errors mostly falling within repetitive elements) and reflects 99% of the total sequence. Importantly, comparative-grade sequence finishing requires ~40-fold less reagents and ~10-fold less personnel effort compared to the generation of near-perfect finished sequence, such as that produced for the human genome. Although applied here to finishing sequence derived from individual bacterial artificial chromosome (BAC) clones, one could envision establishing routines for refining sequences emanating from whole-genome shotgun sequencing projects to a similar quality level. Our experience to date demonstrates that comparative-grade sequence finishing represents a practical and affordable option for sequence refinement en route to comparative analyses.
Pages: 2235-2244
Page count: 10