Sequencing a genome by walking with clone-end sequences: A mathematical analysis

Cited by: 17
Authors
Batzoglou, S
Berger, B
Mesirov, J
Lander, ES
Affiliations
[1] MIT, Dept Biol, Cambridge, MA 02139 USA
[2] MIT, Comp Sci Lab, Cambridge, MA 02139 USA
[3] MIT, Dept Math, Cambridge, MA 02139 USA
[4] Whitehead Inst Biomed Res, Cambridge Ctr 9, Cambridge, MA 02142 USA
DOI
10.1101/gr.9.12.1163
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
One approach to sequencing a large genome is (1) to sequence a collection of nonoverlapping "seeds" chosen from a genomic library of large-insert clones [such as bacterial artificial chromosomes (BACs)] and then (2) to take successive "walking" steps by selecting and sequencing minimally overlapping clones, using information such as clone-end sequences to identify the overlaps. In this paper we analyze the strategic issues involved in using this approach. We derive formulas showing how two key factors, the initial density of seed clones and the depth of the genomic library used for walking, affect the cost and time of a sequencing project, that is, the amount of redundant sequencing and the number of steps needed to cover the vast majority of the genome. We also discuss a variant strategy in which a second genomic library, with clones of a somewhat smaller insert size, is used to close gaps. This approach can dramatically decrease the amount of redundant sequencing without affecting the rate at which the genome is covered.
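
The seed-and-walk strategy described in the abstract can be explored with a small Monte Carlo sketch. The code below is not the paper's analysis or its formulas; it is an illustrative simulation under simplifying assumptions that are ours: a linear genome, fixed-length clones with uniformly random start positions, perfect overlap detection, and bidirectional walking from every contig end in each step. The function name `simulate_walk` and all of its parameters are hypothetical.

```python
import bisect
import random


def simulate_walk(genome_len=1_000_000, clone_len=1_000, depth=10.0,
                  n_seeds=100, target=0.95, max_steps=1000, rng_seed=0):
    """Toy seed-and-walk simulation (illustrative assumptions only).

    Returns (steps, coverage, redundancy), where redundancy is the
    total sequenced length divided by the length of genome covered.
    """
    rng = random.Random(rng_seed)
    n_clones = int(depth * genome_len / clone_len)
    # Library: sorted clone start positions, uniformly random.
    starts = sorted(rng.randrange(genome_len - clone_len + 1)
                    for _ in range(n_clones))

    # (1) Seeds: scan the library in random order, keep nonoverlapping clones.
    contigs = []
    for s in rng.sample(starts, len(starts)):
        if len(contigs) == n_seeds:
            break
        if all(s + clone_len <= a or s >= b for a, b in contigs):
            contigs.append((s, s + clone_len))
    contigs.sort()
    sequenced = len(contigs) * clone_len

    # (2) Walking: each step extends every contig end by the library clone
    # that overlaps the end the least (reaches furthest into the gap).
    steps = 0
    while steps < max_steps:
        covered = sum(b - a for a, b in contigs)
        if covered >= target * genome_len:
            break
        grown, progressed = [], False
        for a, b in contigs:
            new_a, new_b = a, b
            i = bisect.bisect_left(starts, b) - 1  # last clone starting before b
            if i >= 0 and starts[i] > b - clone_len:
                new_b = starts[i] + clone_len
                sequenced += clone_len
                progressed = True
            j = bisect.bisect_right(starts, a - clone_len)  # first clone ending after a
            if j < len(starts) and starts[j] < a:
                new_a = starts[j]
                sequenced += clone_len
                progressed = True
            grown.append((new_a, new_b))
        if not progressed:
            break  # every contig end sits at a gap in the library
        # Merge contigs that now touch or overlap.
        grown.sort()
        merged = [grown[0]]
        for a, b in grown[1:]:
            if a <= merged[-1][1]:
                merged[-1] = (merged[-1][0], max(merged[-1][1], b))
            else:
                merged.append((a, b))
        contigs = merged
        steps += 1

    covered = sum(b - a for a, b in contigs)
    return steps, covered / genome_len, sequenced / covered


if __name__ == "__main__":
    for seeds in (50, 100, 200):
        print(seeds, simulate_walk(n_seeds=seeds))
```

Varying `n_seeds` and `depth` in this sketch gives a rough feel for the trade-off the abstract names: the number of walking steps versus the amount of redundant sequencing. The numbers it prints depend on the toy assumptions above and should not be read as the paper's results.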
Pages: 1163-1174
Page count: 12