Rapid detection of expanded short tandem repeats in personal genomics using hybrid sequencing

Cited by: 51
Authors
Doi, Koichiro [1]
Monjo, Taku [1,2]
Hoang, Pham H. [1,2]
Yoshimura, Jun [1]
Yurino, Hideaki [1]
Mitsui, Jun [3]
Ishiura, Hiroyuki [3]
Takahashi, Yuji [3]
Ichikawa, Yaeko [3]
Goto, Jun [3]
Tsuji, Shoji [3]
Morishita, Shinichi [1]
Affiliations
[1] Univ Tokyo, Grad Sch Frontier Sci, Dept Computat Biol, Chiba 2778562, Japan
[2] Univ Tokyo, Dept Informat & Commun Engn, Fac Engn, Tokyo 1138655, Japan
[3] Univ Tokyo, Grad Sch Med, Dept Neurol, Tokyo 1138655, Japan
Keywords
FRAGILE-X; HEXANUCLEOTIDE REPEAT; EXPANSION; REGION; IDENTIFICATION; MUTATIONS; EFFICIENT; C9ORF72; FTD
DOI
10.1093/bioinformatics/btt647
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Long expansions of short tandem repeats (STRs), i.e. DNA repeats of 2-6 nt, are associated with some genetic diseases. Cost-efficient high-throughput sequencing can quickly produce billions of short reads that would be useful for uncovering disease-associated STRs. However, enumerating STRs in short reads remains largely unexplored because of the difficulty in elucidating STRs much longer than 100 bp, the typical length of short reads.
Results: We propose ab initio procedures for sensing and locating long STRs promptly by using the frequency distribution of all STRs and paired-end read information. We validated the reproducibility of this method using biological replicates and used it to locate an STR associated with a brain disease (SCA31). Subsequently, we sequenced this STR site in 11 SCA31 samples using SMRT™ sequencing (Pacific Biosciences), determined 2.3-3.1 kb sequences at nucleotide resolution and revealed that (TGGAA)- and (TAAAATAGAA)-repeat expansions determined the instability of the repeat expansions associated with SCA31. Our method could also identify common STRs, (AAAG)- and (AAAAG)-repeat expansions, which are remarkably expanded at four positions in an SCA31 sample. This is the first proposed method for rapidly finding disease-associated long STRs in personal genomes using hybrid sequencing of short and long reads.
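The "sensing" step described in the abstract, which uses the frequency distribution of STRs across short reads, can be illustrated with a rough sketch. This is not the authors' algorithm: the function names, the 90% periodicity threshold, and the rule of merging rotations and reverse complements into one canonical unit are all assumptions made here for illustration. The idea is that a read lying entirely inside a long expansion looks almost perfectly periodic, so counting such reads per repeat unit gives a crude frequency distribution that could be compared between samples.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_unit(unit: str) -> str:
    """Return the lexicographically smallest rotation of the unit or of its
    reverse complement, so both strands and all phases share one key."""
    rc = unit.translate(_COMPLEMENT)[::-1]
    rotations = [u[i:] + u[:i] for u in (unit, rc) for i in range(len(unit))]
    return min(rotations)


def dominant_str(read: str, min_unit: int = 2, max_unit: int = 6,
                 min_fraction: float = 0.9):
    """Return the canonical repeat unit if the read is almost entirely one STR
    (hypothetical threshold: 90% of positions match the base one period back),
    otherwise None. Periods are tried from shortest to longest."""
    n = len(read)
    for period in range(min_unit, max_unit + 1):
        if n < 2 * period:
            continue
        matches = sum(read[i] == read[i - period] for i in range(period, n))
        if matches / (n - period) >= min_fraction:
            unit = read[:period]  # simplification: trusts the first copy
            if len(set(unit)) == 1:
                continue  # homopolymer runs are not STRs of 2-6 nt
            return canonical_unit(unit)
    return None


def str_read_counts(reads):
    """Tally reads dominated by each canonical STR unit."""
    counts = Counter()
    for read in reads:
        unit = dominant_str(read.upper())
        if unit is not None:
            counts[unit] += 1
    return counts


example_reads = [
    "TGGAA" * 20,                     # read inside a (TGGAA)n expansion
    "TTCCA" * 20,                     # same repeat seen on the opposite strand
    "ACGTTGCATGCCGATAGCTAGCTAGGCTA",  # non-repetitive read
    "A" * 100,                        # homopolymer, ignored
]
example_counts = str_read_counts(example_reads)
```

An unusually high count for one unit in a sample relative to controls would only flag a candidate expansion. Locating it in the genome would still need the paired-end information the abstract mentions, which this sketch does not attempt.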
Pages: 815-822
Page count: 8