Impact of human pathogenic micro-insertions and micro-deletions on post-transcriptional regulation

Cited by: 23
Authors
Zhang, Xinjun [1, 2]
Lin, Hai [2, 4]
Zhao, Huiying [5]
Hao, Yangyang [2, 3]
Mort, Matthew [6]
Cooper, David N. [6]
Zhou, Yaoqi [7, 8]
Liu, Yunlong [2, 3, 9]
Affiliations
[1] Indiana Univ, Sch Informat & Comp, Bloomington, IN 47408 USA
[2] Indiana Univ Purdue Univ, Ctr Computat Biol & Bioinformat, Indianapolis, IN 46202 USA
[3] Indiana Univ Purdue Univ, Dept Med & Mol Genet, Indianapolis, IN 46202 USA
[4] Indiana Univ Purdue Univ, Sch Informat & Comp, Indianapolis, IN 46202 USA
[5] Queensland Inst Med Res, Brisbane, Qld 4006, Australia
[6] Cardiff Univ, Inst Med Genet, Cardiff CF14 4XN, S Glam, Wales
[7] Griffith Univ, Inst Glyc, Southport, Qld 4215, Australia
[8] Griffith Univ, Sch Informat & Commun Technol, Southport, Qld 4215, Australia
[9] Indiana Univ, Sch Med, Ctr Med Genom, Indianapolis, IN 46202 USA
Funding
National Institutes of Health (USA);
Keywords
INSERTION/DELETION POLYMORPHISM; SECONDARY STRUCTURE; RNA INTERACTIONS; GC CONTENT; BINDING; DISEASE; GENE; PROTEIN; CANDIDATE; EVOLUTION;
DOI
10.1093/hmg/ddu019
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Small insertions/deletions (INDELs) of ≤21 bp comprise 18% of all recorded mutations causing human inherited disease and are evident in 24% of documented Mendelian diseases. INDELs affect gene function in multiple ways: for example, by introducing premature stop codons that either lead to the production of truncated proteins or affect transcriptional efficiency. However, the means by which they impact post-transcriptional regulation, including alternative splicing, have not been fully evaluated. In this study, we collate disease-causing INDELs from the Human Gene Mutation Database (HGMD) and neutral INDELs from the 1000 Genomes Project. The potential of these two types of INDELs to affect binding-site affinity of RNA-binding proteins (RBPs) was then evaluated. We identified several sequence features that can distinguish disease-causing INDELs from neutral INDELs. Moreover, we built a machine-learning predictor called PinPor (predicting pathogenic small insertions and deletions affecting post-transcriptional regulation) to ascertain which newly observed INDELs are likely to be pathogenic. Our results show that disease-causing INDELs are more likely to ablate RBP-binding sites and tend to affect more RBP-binding sites than neutral INDELs. Additionally, disease-causing INDELs give rise to greater deviations in binding affinity than neutral INDELs. We also demonstrated that disease-causing INDELs may be distinguished from neutral INDELs by several sequence features, such as their proximity to splice sites and their potential effects on RNA secondary structure. The PinPor predictor showed satisfactory performance in identifying numerous pathogenic INDELs, with a Matthews correlation coefficient (MCC) value of 0.51 and an accuracy of 0.75.
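The abstract reports predictor performance as a Matthews correlation coefficient (MCC) and an accuracy. As a reminder of how these two metrics are defined for a binary classifier (pathogenic = 1, neutral = 0), the following is a minimal sketch. The function names and the toy labels are assumptions made for illustration only; they are not the authors' data or code, and the printed values are properties of the toy example, not results from the paper.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 1 (positive) and 0 (negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical toy labels (NOT from the paper): 4 pathogenic, 4 neutral INDELs.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 1]

counts = confusion_counts(toy_true, toy_pred)
print("accuracy:", accuracy(*counts))  # 0.75 for this toy example
print("MCC:", mcc(*counts))            # 0.5 for this toy example
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it remains informative when the pathogenic and neutral classes are of unequal size.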
Pages: 3024-3034
Number of pages: 11