Mining for single nucleotide polymorphisms and insertions/deletions in maize expressed sequence tag data

Cited by: 181
Authors
Batley, J
Barker, G
O'Sullivan, H
Edwards, KJ
Edwards, D [1]
Affiliations
[1] La Trobe Univ, Ctr Plant Biotechnol, Agr Victoria, Bundoora, Vic 3086, Australia
[2] Univ Bristol, Sch Biol Sci, Bristol BS8 1UG, Avon, England
DOI
10.1104/pp.102.019422
Chinese Library Classification
Q94 [Botany];
Discipline classification code
071001;
Abstract
We have developed a computer-based method to identify candidate single nucleotide polymorphisms (SNPs) and small insertions/deletions from expressed sequence tag data. Using a redundancy-based approach, valid SNPs are distinguished from erroneous sequence by their representation multiple times in an alignment of sequence reads. A second measure of validity was also calculated based on the cosegregation of the SNP pattern between multiple SNP loci in an alignment. The utility of this method was demonstrated by applying it to 102,551 maize (Zea mays) expressed sequence tag sequences. A total of 14,832 candidate polymorphisms were identified with an SNP redundancy score of two or greater. Segregation of these SNPs with haplotype indicates that candidate SNPs with high redundancy and cosegregation confidence scores are likely to represent true SNPs. This was confirmed by validation of 264 candidate SNPs from 27 loci, with a range of redundancy and cosegregation scores, in four inbred maize lines. The SNP transition/transversion ratio and insertion/deletion size frequencies correspond to those observed by direct sequencing methods of SNP discovery and suggest that the majority of predicted SNPs and insertions/deletions identified using this approach represent true genetic variation in maize.
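The redundancy and cosegregation ideas in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`candidate_snps`, `redundancy_score`, `cosegregates`), the toy reads, and the exact scoring rules are assumptions made for illustration. The sketch assumes the reads are already aligned to equal length, treats the redundancy score as the number of reads supporting the second most common base in a column, and handles base substitutions only (gap characters are ignored, so insertions/deletions are not scored).

```python
from collections import Counter

BASES = "ACGT"  # only unambiguous bases are counted; gaps and N are skipped


def candidate_snps(aligned_reads, min_redundancy=2):
    """Return {column: {base: count}} for columns where at least two
    different bases are each seen in min_redundancy or more reads."""
    candidates = {}
    for col in range(len(aligned_reads[0])):
        counts = Counter(r[col] for r in aligned_reads if r[col] in BASES)
        supported = {b: n for b, n in counts.items() if n >= min_redundancy}
        if len(supported) >= 2:
            candidates[col] = supported
    return candidates


def redundancy_score(allele_counts):
    """Assumed definition: read count of the second most common base."""
    return sorted(allele_counts.values(), reverse=True)[1]


def cosegregates(aligned_reads, col_a, col_b):
    """True if the bases at two columns pair up one-to-one across reads,
    i.e. the two loci split the reads into the same groups."""
    pairs = {
        (r[col_a], r[col_b])
        for r in aligned_reads
        if r[col_a] in BASES and r[col_b] in BASES
    }
    alleles_a = {a for a, _ in pairs}
    alleles_b = {b for _, b in pairs}
    return len(pairs) >= 2 and len(pairs) == len(alleles_a) == len(alleles_b)


# Toy alignment (hypothetical data): columns 2 and 6 vary in two reads each,
# column 4 varies in a single read and is treated as a likely sequencing error.
reads = [
    "ACGTACGT",
    "ACGTACGT",
    "ACTTACAT",
    "ACTTACAT",
    "ACGTCCGT",
]
snps = candidate_snps(reads)
```

With these toy reads, columns 2 and 6 are reported as candidates and column 4 is not, because its variant base appears only once. Columns 2 and 6 also cosegregate, which under the abstract's reasoning would raise confidence that both are true polymorphisms.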
Pages: 84-91
Number of pages: 8