Snipping polymorphisms from large EST collections in barley (Hordeum vulgare L.)

Cited by: 80
Authors
Kota, R
Rudd, S
Facius, A
Kolesov, G
Thiel, T
Zhang, H
Stein, N
Mayer, K
Graner, A
Affiliations
[1] Inst Plant Genet & Crop Plant Res IPK, D-06466 Gatersleben, Germany
[2] Natl Res Ctr Environm & Hlth GSF, MIPS Inst Bioinformat, D-85764 Neuherberg, Germany
Keywords
single-nucleotide polymorphisms (SNPs); expressed sequence tags (ESTs); denaturing high-performance liquid chromatography (DHPLC); data mining; bioinformatics
DOI
10.1007/s00438-003-0891-6
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
The public EST (expressed sequence tag) databases represent an enormous but heterogeneous repository of sequences, including many from a broad selection of plant species and a wide range of distinct varieties. The significant redundancy within large EST collections makes them an attractive resource for rapid pre-selection of candidate sequence polymorphisms. Here we present a strategy that allows rapid identification of candidate SNPs in barley (Hordeum vulgare L.) using publicly available EST databases. Analysis of 271,630 EST sequences from different cDNA libraries, representing 23 different barley varieties, resulted in the generation of 56,302 tentative consensus sequences. In all, 8171 of these unigene sequences are members of clusters with six or more ESTs. By applying a novel SNP detection algorithm (SNiPpER) to these sequences, we identified 3069 candidate inter-varietal SNPs. To verify these candidate SNPs, we selected a subset of 63 SNPs present in 36 ESTs. Of the 63 SNPs selected, we were able to validate 54 (86%) using a direct sequencing approach. For further verification, 28 ESTs were mapped to distinct loci within the barley genome. The polymorphism information content (PIC) and nucleotide diversity (π) values of the SNPs identified by the SNiPpER algorithm are significantly higher than those obtained by random sequencing. This demonstrates the efficiency of our strategy for SNP identification and the cost-efficient development of EST-based SNP markers.
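The two summary statistics named in the abstract, PIC and nucleotide diversity (π), can be illustrated with a minimal Python sketch. The abstract does not say which PIC formula or π estimator the authors used, so the definitions below (the PIC of Botstein et al. and π as the average pairwise difference per site) are assumptions, and the function names `pic` and `nucleotide_diversity` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pic(alleles):
    """PIC of one marker from a list of observed alleles, one per variety.

    Assumed formula (Botstein et al.):
    PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    """
    n = len(alleles)
    freqs = [count / n for count in Counter(alleles).values()]
    homozygosity = sum(p * p for p in freqs)
    correction = sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return 1 - homozygosity - correction


def nucleotide_diversity(sequences):
    """Average number of pairwise differences per site.

    Assumes equal-length, already aligned sequences without gaps.
    """
    length = len(sequences[0])
    pairs = list(combinations(sequences, 2))
    differences = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in pairs
    )
    return differences / (len(pairs) * length)


# Toy data, not taken from the paper.
print(pic(list("AATT")))                               # biallelic SNP, both alleles at 0.5
print(nucleotide_diversity(["ACGT", "ACGA", "ACGT"]))  # one variable site out of four
```

For a biallelic SNP the PIC under this formula peaks at 0.375 when both alleles are equally frequent, which is why SNPs pre-selected for inter-varietal variation would be expected to score higher than randomly sequenced sites.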
Pages: 24-33
Number of pages: 10