Transcribed processed pseudogenes in the human genome: an intermediate form of expressed retrosequence lacking protein-coding ability

Cited by: 149
Authors
Harrison, PM
Zheng, DY
Zhang, ZL
Carriero, N
Gerstein, M
Affiliations
[1] McGill Univ, Dept Biol, Montreal, PQ H3A 1B1, Canada
[2] Yale Univ, Dept Mol Biophys & Biochem, New Haven, CT USA
[3] Yale Univ, Dept Comp Sci, New Haven, CT USA
[4] Univ Toronto, Banting & Best Dept Med Res, Toronto, ON, Canada
Funding
National Institutes of Health (USA)
Keywords
DOI
10.1093/nar/gki531
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Pseudogenes, in the case of protein-coding genes, are gene copies that have lost the ability to code for a protein; they are typically identified through annotation of disabled, decayed or incomplete protein-coding sequences. Processed pseudogenes (PΨgs) are made through mRNA retrotransposition. There is overwhelming genomic evidence for thousands of human PΨgs and also dozens of human processed genes that comprise complete retrotransposed copies of other genes. Here, we survey for an intermediate entity, the transcribed processed pseudogene (TPΨg), which is disabled but nonetheless transcribed. TPΨgs may affect expression of paralogous genes, as observed in the case of the mouse makorin1-p1 TPΨg. To elucidate their role, we identified human TPΨgs by mapping expressed sequences onto PΨgs and, reciprocally, extracting TPΨgs from known mRNAs. We consider only those PΨgs that are homologous to either non-mammalian eukaryotic proteins or protein domains of known structure, and require detection of identical coding-sequence disablements in both the expressed and genomic sequences. Oligonucleotide microarray data provide further expression verification. Overall, we find 166-233 TPΨgs (~4-6% of PΨgs). Proteins/transcripts with the highest numbers of homologous TPΨgs generally have many homologous PΨgs and are abundantly expressed. TPΨgs are significantly over-represented near both the 5' and 3' ends of genes; this suggests that TPΨgs can be formed through gene-promoter co-option, or intrusion into untranslated regions. However, roughly half of the TPΨgs are located away from genes in the intergenic DNA and thus may be co-opting cryptic promoters of undesignated origin.
Furthermore, TPΨgs are unlike other PΨgs and processed genes in the following ways: (i) they do not show a significant tendency to either deposit on or originate from the X chromosome; (ii) only 5% of human TPΨgs have potential orthologs in mouse. This latter finding indicates that the vast majority of TPΨgs is lineage specific. This is likely linked to well-documented extensive lineage-specific SINE/LINE activity. The list of TPΨgs is available at: http://www.biology.mcgill.ca/faculty/harrison/tppg/bppg.tov (or) http://pseudogene.org.
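The abstract's core criterion, that a transcribed processed pseudogene must show identical coding-sequence disablements in both the expressed and the genomic sequence, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names are hypothetical, only premature stop codons are treated as disablements (frameshifts are ignored), and the two sequences are assumed to be already aligned, gap-free and in the same reading frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def premature_stops(seq, frame=0):
    """Return codon indices of in-frame stop codons, excluding the final codon.

    Hypothetical helper: only premature stops count as disablements here.
    """
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    return {i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS}


def shares_disablements(genomic, expressed, frame=0):
    """True if the genomic copy is disabled and the expressed sequence
    carries exactly the same premature stops (assumes a gap-free alignment)."""
    genomic_stops = premature_stops(genomic, frame)
    expressed_stops = premature_stops(expressed, frame)
    return bool(genomic_stops) and genomic_stops == expressed_stops


if __name__ == "__main__":
    pseudogene = "ATGTAAGGGCCC"   # toy sequence with a premature TAA at codon 1
    matching_est = "ATGTAAGGGCCC"  # expressed sequence with the same stop
    parent_mrna = "ATGCAAGGGCCC"   # expressed sequence without the stop
    print(shares_disablements(pseudogene, matching_est))
    print(shares_disablements(pseudogene, parent_mrna))
```

Requiring the disablement in both sequences is what separates a transcript that derives from the pseudogene itself from one that derives from an intact parent gene and merely aligns to the pseudogene locus.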
Pages: 2374-2383
Number of pages: 10