The structural genomics experimental pipeline: Insights from global target lists

Cited by: 24
Authors
O'Toole, N
Grabowski, M
Otwinowski, Z
Minor, W
Cygler, M
Affiliations
[1] Natl Res Council Canada, Biotechnol Res Inst, Montreal, PQ N4P 2R2, Canada
[2] McGill Univ, Dept Biochem, Montreal, PQ H3G 1Y6, Canada
[3] Montreal Joint Ctr Struct Biol, Montreal, PQ, Canada
[4] Univ Virginia, Dept Mol Physiol & Biol Phys, Charlottesville, VA 22903 USA
[5] Univ Texas, SW Med Ctr, Dept Biochem, Dallas, TX 75235 USA
Keywords
structural proteomics; X-ray diffraction; NMR spectroscopy
DOI
10.1002/prot.20060
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Structural genomics (SG) initiatives are currently attempting to achieve the high-throughput determination of protein structures on a genome-wide scale. Here we analyze the SG target data that have been publicly released over a period of 16 months to assess the potential of the SG initiatives. We use statistical techniques most commonly applied in epidemiology to describe the dynamics of targets through the experimental SG pipeline. There is no clear bottleneck among the key stages of cloning, expression, purification and crystallization. An SG target will progress through each of these steps with a probability of approximately 45%. Around 80% of targets with diffraction data will yield a crystal structure, and 20% of targets with HSQC spectra will yield an NMR structure. We also find substantial overlap among SG targets: 61% of SG protein sequences share at least 30% sequence identity with one or more other SG targets. There is no significant difference in average structure quality between SG structures and other structures in the PDB determined by "traditional" methods, but on average SG structures are deposited to the PDB twice as quickly after X-ray data collection. © 2004 Wiley-Liss, Inc.
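To illustrate what a per-stage success rate of about 45% implies, the sketch below compounds that figure across the four stages named in the abstract. It assumes the 45% applies to each step separately and that the steps are independent. Both are simplifications made here for illustration, not the paper's own method, and the names `STAGE_PROBABILITIES` and `cumulative_success` are hypothetical.

```python
# Approximate per-stage success probabilities quoted in the abstract.
# Assumption: each stage succeeds independently of the others.
STAGE_PROBABILITIES = {
    "cloning": 0.45,
    "expression": 0.45,
    "purification": 0.45,
    "crystallization": 0.45,
}


def cumulative_success(probabilities):
    """Return the running probability of having cleared each stage in order."""
    running = 1.0
    result = {}
    for stage, p in probabilities.items():
        running *= p
        result[stage] = running
    return result


cumulative = cumulative_success(STAGE_PROBABILITIES)
for stage, p in cumulative.items():
    print(f"{stage}: {p:.4f}")
```

Under these assumptions, roughly 4% of targets entering cloning would be expected to reach the crystallized state, even though no single stage acts as a clear bottleneck.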
Pages: 201-210
Page count: 10