Sequence-tagged connectors: A sequence approach to mapping and scanning the human genome

Cited by: 49
Authors
Mahairas, GG
Wallace, JC
Smith, K
Swartzell, S
Holzman, T
Keller, A
Shaker, R
Furlong, J
Young, J
Zhao, SY
Adams, MD
Hood, L
Affiliations
[1] Univ Washington, Dept Mol Biotechnol, Seattle, WA 98195 USA
[2] Univ Washington, High Throughput Sequencing Ctr, Seattle, WA 98109 USA
[3] Inst Genom Res, TIGR, Rockville, MD 20850 USA
[4] Celera Genom, Rockville, MD 20850 USA
DOI
10.1073/pnas.96.17.9739
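Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes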
07 ; 0710 ; 09 ;
Abstract
The sequence-tagged connector (STC) strategy proposes to generate sequence tags densely scattered (every 3.3 kilobases) across the human genome by arraying 450,000 bacterial artificial chromosomes (BACs) with randomly cleaved inserts, sequencing both ends of each, and preparing a restriction enzyme fingerprint of each. The STC resource, containing end sequences, fingerprints, and arrayed BACs, creates a map where the interrelationships of the individual BAC clones are resolved through their STCs as overlapping BAC clones are sequenced. Once a seed or initiation BAC clone is sequenced, the minimum overlapping 5' and 3' BAC clones can be identified computationally and sequenced. By reiterating this "sequence-then-map by computer analysis against the STC database" strategy, a minimum tiling path of clones can be sequenced at a rate that is primarily limited by the sequencing throughput of individual genome centers. As of February 1999, we had deposited, together with The Institute for Genomic Research (TIGR), into GenBank 314,000 STCs (approximately 135 megabases), or 4.5% of human genomic DNA. This genome survey reveals numerous genes, genome-wide repeats, simple sequence repeats (potential genetic markers), and CpG islands (potential gene initiation sites). It also illustrates the power of the STC strategy for creating minimum tiling paths of BAC clones for large-scale genomic sequencing. Because the STC resource permits the easy integration of genetic, physical, gene, and sequence maps for chromosomes, it will be a powerful tool for the initial analysis of the human genome and other complex genomes.
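The figures in the abstract can be checked with a little arithmetic, and the "sequence-then-map" step can be illustrated with a toy lookup. The sketch below is only an illustration under stated assumptions: the genome size of about 3,000 megabases is not given in the abstract, the function names are hypothetical, and the clone selection uses exact forward-strand substring matching in place of the alignment search and fingerprint confirmation a real STC database query would need.

```python
# Assumed haploid human genome size (~3,000 Mb); not stated in the abstract.
GENOME_BP = 3_000_000_000
BAC_CLONES = 450_000       # from the abstract
ENDS_PER_CLONE = 2         # both ends of each BAC are sequenced


def tag_spacing_bp(genome_bp, clones, ends_per_clone=ENDS_PER_CLONE):
    """Average distance between end-sequence tags if they are spread evenly."""
    return genome_bp / (clones * ends_per_clone)


def fraction_covered(tag_bp, genome_bp):
    """Fraction of the genome represented by the deposited tag sequence."""
    return tag_bp / genome_bp


def pick_minimal_overlap(seed, stc_ends):
    """Toy 'sequence-then-map' step (hypothetical helper).

    seed     -- finished sequence of the seed BAC, as a string
    stc_ends -- dict mapping clone id -> one end sequence of that clone

    Returns (clone_id, overlap_bp) for the clone whose end sequence lies
    closest to the 3' end of the seed, i.e. the clone that extends the
    path with the least redundant sequencing, or None if nothing matches.
    """
    best = None
    for clone_id, end_seq in stc_ends.items():
        pos = seed.find(end_seq)       # exact match only; a simplification
        if pos < 0:
            continue                   # this clone does not overlap the seed
        overlap = len(seed) - pos      # bases shared with the seed
        if best is None or overlap < best[1]:
            best = (clone_id, overlap)
    return best


if __name__ == "__main__":
    print(round(tag_spacing_bp(GENOME_BP, BAC_CLONES)))    # average spacing in bp
    print(fraction_covered(135_000_000, GENOME_BP))        # share of genome in STCs
    seed = "ACGTTGCAAGGCTTAACCGGTATC"
    ends = {"bacA": "TTGCAA", "bacB": "AACCGG", "bacC": "GGGGGG"}
    print(pick_minimal_overlap(seed, ends))
```

With these assumptions, 900,000 end sequences over 3,000 megabases give one tag roughly every 3.3 kilobases, and 135 megabases of STC sequence corresponds to 4.5% of the genome, which agrees with the abstract. Repeating the selection step on each newly sequenced clone is what walks out a minimum tiling path.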
Pages: 9739-9744
Number of pages: 6