Leveraging the mouse genome for gene prediction in human: From whole-genome shotgun reads to a global synteny map

Cited by: 72
Authors
Flicek, P
Keibler, E
Hu, P
Korf, I
Brent, MR [1]
Affiliations
[1] Washington Univ, Dept Comp Sci & Engn, St Louis, MO 63130 USA
[2] Washington Univ, Dept Biomed Engn, St Louis, MO 63130 USA
Keywords
DOI
10.1101/gr.830003
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
The availability of draft sequences for both the mouse and human genomes makes it possible, for the first time, to annotate whole mammalian genomes using comparative methods. TWINSCAN is a gene-prediction system that combines the methods of single-genome predictors like GENSCAN with information derived from genome comparison, thereby improving accuracy. Because TWINSCAN uses genomic sequence only, it is less biased toward highly and/or ubiquitously expressed genes than GENEWISE, GENOMESCAN, and other methods based on evidence derived from transcripts. We show that TWINSCAN improves gene prediction in human using intermediate products from various stages of the sequencing and analysis of the mouse genome, from low-redundancy, whole-genome shotgun reads to the draft assembly and the synteny map. TWINSCAN improves on the prior state of the art even when alignments from only 1X coverage of the mouse genome are available. Gene prediction accuracy improves steadily from 1X through 3X, more slowly from 3X to 4X, and relatively little thereafter. The assembly and the synteny map greatly speed the computations, however. Our human annotation using the mouse assembly is conservative, predicting only 25,622 genes, and appears to be one of the best de novo annotations of the human genome to date.
Pages: 46-54
Page count: 9
Related papers
29 in total
  • [1] Ansari-Lari MA, 1998, GENOME RES, V8, P29
  • [2] ASH RB, 1965, INFORMATION THEORY
  • [3] Bafna V, 2000, Proc Int Conf Intell Syst Mol Biol, V8, P3
  • [4] Human and mouse gene structure: Comparative analysis and application to exon prediction
    Batzoglou, S
    Pachter, L
    Mesirov, JP
    Berger, B
    Lander, ES
    [J]. GENOME RESEARCH, 2000, 10 (07) : 950 - 958
  • [5] Using GeneWise in the Drosophila annotation experiment
    Birney, E
    Durbin, R
    [J]. GENOME RESEARCH, 2000, 10 (04) : 547 - 548
  • [6] Shotgun sample sequence comparisons between mouse and human genomes
    Bouck, JB
    Metzker, ML
    Gibbs, RA
    [J]. NATURE GENETICS, 2000, 25 (01) : 31 - 33
  • [7] Prediction of complete gene structures in human genomic DNA
    Burge, C
    Karlin, S
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1997, 268 (01) : 78 - 94
  • [8] PREDICTION OF GENE STRUCTURE
    GUIGO, R
    KNUDSEN, S
    DRAKE, N
    SMITH, T
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1992, 226 (01) : 141 - 157
  • [9] GUIGO R, 2003, IN PRESS P NATL ACAD
  • [10] The Ensembl genome database project
    Hubbard, T
    Barker, D
    Birney, E
    Cameron, G
    Chen, Y
    Clark, L
    Cox, T
    Cuff, J
    Curwen, V
    Down, T
    Durbin, R
    Eyras, E
    Gilbert, J
    Hammond, M
    Huminiecki, L
    Kasprzyk, A
    Lehvaslaiho, H
    Lijnzaad, P
    Melsopp, C
    Mongin, E
    Pettett, R
    Pocock, M
    Potter, S
    Rust, A
    Schmidt, E
    Searle, S
    Slater, G
    Smith, J
    Spooner, W
    Stabenau, A
    Stalker, J
    Stupka, E
    Ureta-Vidal, A
    Vastrik, I
    Clamp, M
    [J]. NUCLEIC ACIDS RESEARCH, 2002, 30 (01) : 38 - 41