Generation and analysis of 280,000 human expressed sequence tags

Cited by: 376
Authors
Hillier, L
Lennon, G
Becker, M
Bonaldo, MF
Chiapelli, B
Chissoe, S
Dietrich, N
DuBuque, T
Favello, A
Gish, W
Hawkins, M
Hultman, M
Kucaba, T
Lacy, M
Le, M
Le, N
Mardis, E
Moore, B
Morris, M
Parsons, J
Prange, C
Rifkin, L
Rohlfing, T
Schellenberg, K
Soares, MB
Tan, F
ThierryMeg, J
Trevaskis, E
Underwood, K
Wohldman, P
Waterston, R
Wilson, R
Marra, M
Affiliations
[1] LAWRENCE LIVERMORE NATL LAB, CTR HUMAN GENOME, LIVERMORE, CA 94550 USA
[2] COLUMBIA UNIV, COLL PHYS & SURG, DEPT PSYCHIAT, NEW YORK, NY 10032 USA
[3] NEW YORK STATE PSYCHIAT INST & HOSP, NEW YORK, NY 10032 USA
DOI
10.1101/gr.6.9.807
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
We report the generation of 319,311 single-pass sequencing reactions (known as expressed sequence tags, or ESTs) obtained from the 5' and 3' ends of 194,031 human cDNA clones. Our goal has been to obtain tag sequences from many different genes and to deposit these in the publicly accessible Data Base for Expressed Sequence Tags. Highly efficient automatic screening of the data allows deposition of the annotated sequences without delay. Sequences have been generated from 26 oligo(dT) primed directionally cloned libraries, of which 18 were normalized. The libraries were constructed using mRNA isolated from 17 different tissues representing three developmental states. Comparisons of a subset of our data with nonredundant human mRNA and protein data bases show that the ESTs represent many known sequences and contain many that are novel. Analysis of protein families using Hidden Markov Models confirms this observation and supports the contention that although normalization significantly reduces the relative abundance of redundant cDNA clones, it does not result in the complete removal of members of gene families.
Pages: 807-828
Page count: 22