An exhaustive DNA micro-satellite map of the human genome using high performance computing

Cited by: 44
Authors
Collins, JR
Stephens, RM
Gold, B
Long, B
Dean, M
Burt, SK
Affiliations
[1] NCI, Lab Genom Divers, Frederick, MD 21702 USA
[2] NCI, Adv Biomed Comp Ctr, Frederick, MD 21701 USA
[3] Cray Inc, Mendota Hts, MN USA
DOI
10.1016/S0888-7543(03)00076-4
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
The current pace of the generation of sequence data requires the development of software tools that can rapidly provide full annotation of the data. We have developed a new method for rapid sequence comparison using the exact match algorithm without repeat masking. As a demonstration, we have identified all perfect simple tandem repeats (STR) within the draft sequence of the human genome. The STR elements (chromosome, position, length and repeat subunit) have been placed into a relational database. Repeat flanking sequence is also publicly accessible at http://grid.abcc.ncifcrf.gov. To illustrate the utility of this complete set of STR elements, we documented the increased density of potentially polymorphic markers throughout the genome. The new STR markers may be useful in disease association studies because so many STR elements manifest multiallelic polymorphism. Also, because triplet repeat expansions are important for human disease etiology, we identified trinucleotide repeats that exist within exons of known genes. This resulted in a list that includes all 14 genes known to undergo polynucleotide expansion, and 48 additional candidates. Several of these are non-polyglutamine triplet repeats. Other examinations of the STR database demonstrated repeats spanning splice junctions and identified SNPs within repeat elements. © 2003 Elsevier Science (USA). All rights reserved.
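As a rough illustration of what the abstract describes (locating perfect STRs by exact matching and recording chromosome, position, length and repeat subunit in a relational table), the sketch below may help. It is not the authors' method, which this record does not detail. The names `Repeat`, `find_perfect_strs`, `store_repeats`, the table `str_elements`, the 0-based positions, and the thresholds (`max_unit=6`, `min_copies=3`) are all assumptions made for the example.

```python
import sqlite3
from typing import Iterable, Iterator, NamedTuple


class Repeat(NamedTuple):
    """One perfect STR; field names mirror the abstract (hypothetical layout)."""
    chromosome: str
    position: int  # 0-based start offset (assumed convention)
    length: int    # total repeat length in bases
    subunit: str   # repeated unit, e.g. "CA" or "CAG"


def find_perfect_strs(chromosome: str, sequence: str,
                      max_unit: int = 6, min_copies: int = 3) -> Iterator[Repeat]:
    """Yield perfect tandem repeats using exact string comparison only.

    At each start position the smallest qualifying subunit wins, and the
    scan resumes after the reported repeat. Thresholds are illustrative.
    """
    seq = sequence.upper()
    n = len(seq)
    i = 0
    while i < n:
        hit = None
        for k in range(1, max_unit + 1):
            unit = seq[i:i + k]
            # Stop if we ran off the end or hit an ambiguous base such as N.
            if len(unit) < k or set(unit) - set("ACGT"):
                break
            copies = 1
            while seq[i + copies * k:i + (copies + 1) * k] == unit:
                copies += 1
            if copies >= min_copies:
                hit = Repeat(chromosome, i, copies * k, unit)
                break
        if hit:
            yield hit
            i += hit.length
        else:
            i += 1


def store_repeats(conn: sqlite3.Connection, repeats: Iterable[Repeat]) -> None:
    """Insert repeats into a hypothetical `str_elements` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS str_elements "
        "(chromosome TEXT, position INTEGER, length INTEGER, subunit TEXT)"
    )
    conn.executemany("INSERT INTO str_elements VALUES (?, ?, ?, ?)", repeats)
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    store_repeats(conn, find_perfect_strs("chr1", "GGCACACACATT"))
    # Query trinucleotide repeats only, as in the exon screen the abstract mentions.
    print(conn.execute(
        "SELECT * FROM str_elements WHERE length(subunit) = 3").fetchall())
```

A genome-scale run would need a far more efficient search than this quadratic-in-the-worst-case scan; the point here is only the shape of the output record and how it maps onto a table that can be queried by subunit length.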
Pages: 10-19
Page count: 10