Generation and analysis of 280,000 human expressed sequence tags

Cited by: 376
Authors
Hillier, L
Lennon, G
Becker, M
Bonaldo, MF
Chiapelli, B
Chissoe, S
Dietrich, N
DuBuque, T
Favello, A
Gish, W
Hawkins, M
Hultman, M
Kucaba, T
Lacy, M
Le, M
Le, N
Mardis, E
Moore, B
Morris, M
Parsons, J
Prange, C
Rifkin, L
Rohlfing, T
Schellenberg, K
Soares, MB
Tan, F
ThierryMeg, J
Trevaskis, E
Underwood, K
Wohldman, P
Waterston, R
Wilson, R
Marra, M
Affiliations
[1] LAWRENCE LIVERMORE NATL LAB, CTR HUMAN GENOME, LIVERMORE, CA 94550 USA
[2] COLUMBIA UNIV, COLL PHYS & SURG, DEPT PSYCHIAT, NEW YORK, NY 10032 USA
[3] NEW YORK STATE PSYCHIAT INST & HOSP, NEW YORK, NY 10032 USA
Keywords
DOI
10.1101/gr.6.9.807
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
We report the generation of 319,311 single-pass sequencing reactions (known as expressed sequence tags, or ESTs) obtained from the 5' and 3' ends of 194,031 human cDNA clones. Our goal has been to obtain tag sequences from many different genes and to deposit these in the publicly accessible Database for Expressed Sequence Tags. Highly efficient automatic screening of the data allows deposition of the annotated sequences without delay. Sequences have been generated from 26 oligo(dT) primed directionally cloned libraries, of which 18 were normalized. The libraries were constructed using mRNA isolated from 17 different tissues representing three developmental states. Comparisons of a subset of our data with nonredundant human mRNA and protein databases show that the ESTs represent many known sequences and contain many that are novel. Analysis of protein families using Hidden Markov Models confirms this observation and supports the contention that although normalization reduces significantly the relative abundance of redundant cDNA clones, it does not result in the complete removal of members of gene families.
Pages: 807-828
Page count: 22
Related papers
38 in total
  • [1] ADAMS MD, 1995, NATURE, V377, P3
  • [2] COMPLEMENTARY-DNA SEQUENCING - EXPRESSED SEQUENCE TAGS AND HUMAN GENOME PROJECT
    ADAMS, MD
    KELLEY, JM
    GOCAYNE, JD
    DUBNICK, M
    POLYMEROPOULOS, MH
    XIAO, H
    MERRIL, CR
    WU, A
    OLDE, B
    MORENO, RF
    KERLAVAGE, AR
    MCCOMBIE, WR
    VENTER, JC
    [J]. SCIENCE, 1991, 252 (5013) : 1651 - 1656
  • [3] ALTSCHUL SF, 1990, J MOL BIOL, V215, P403, DOI 10.1006/jmbi.1990.9999
  • [4] BAIROCH A, 1994, NUCLEIC ACIDS RES, V22, P3578
  • [5] Yeast genes and human disease
    Bassett, DE
    Boguski, MS
    Hieter, P
    [J]. NATURE, 1996, 379 (6566) : 589 - 590
  • [6] ESTABLISHING A HUMAN TRANSCRIPT MAP
    BOGUSKI, MS
    SCHULER, GD
    [J]. NATURE GENETICS, 1995, 10 (04) : 369 - 371
  • [7] A SEQUENCE ASSEMBLY AND EDITING PROGRAM FOR EFFICIENT MANAGEMENT OF LARGE PROJECTS
    DEAR, S
    STADEN, R
    [J]. NUCLEIC ACIDS RESEARCH, 1991, 19 (14) : 3907 - 3911
  • [8] The yeast genome project: What did we learn?
    Dujon, B
    [J]. TRENDS IN GENETICS, 1996, 12 (07) : 263 - 270
  • [9] Eddy S R, 1995, J Comput Biol, V2, P9, DOI 10.1089/cmb.1995.2.9
  • [10] Eddy S R, 1995, Proc Int Conf Intell Syst Mol Biol, V3, P114