Individual sequences in large sets of gene sequences may be distinguished efficiently by combinations of shared sub-sequences

Cited by: 6
Authors
Gibbs, MJ [1]
Armstrong, JS [1]
Gibbs, AJ [1]
Affiliations
[1] Australian Natl Univ, Sch Bot & Zool, Canberra, ACT 0200, Australia
DOI
10.1186/1471-2105-6-90
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: Most current DNA diagnostic tests for identifying organisms use specific oligonucleotide probes that are complementary in sequence to, and hence only hybridise with, the DNA of one target species. By contrast, in traditional taxonomy, specimens are usually identified by 'dichotomous keys' that use combinations of characters shared by different members of the target set. Using one specific character for each target is the least efficient strategy for identification. Using combinations of shared, bisectionally distributed characters is much more efficient, and this strategy is most efficient when the characters separate the targets in a progressively binary way.
Results: We have developed a practical method for finding minimal sets of sub-sequences that identify individual sequences and that could be targeted by combinations of probes, so that the efficient strategy of traditional taxonomic identification could be used in DNA diagnosis. The sizes of minimal sub-sequence sets depended mostly on sequence diversity, sub-sequence length, and the interactions between these parameters. We found that 201 distinct cytochrome oxidase subunit-1 (CO1) genes from moths (Lepidoptera) were distinguished using only 15 sub-sequences 20 nucleotides long, whereas only 8-10 sub-sequences 6-10 nucleotides long were required to distinguish the CO1 genes of 92 species from the 9 largest orders of insects.
Conclusion: The presence/absence of sub-sequences in a set of gene sequences can be used like the questions in a traditional dichotomous taxonomic key; hybridisation probes complementary to such sub-sequences should provide a very efficient means for identifying individual species, subtypes or genotypes. Sequence diversity and sub-sequence length are the major factors that determine the numbers of distinguishing sub-sequences in any set of sequences.
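The idea of using sub-sequence presence/absence like the questions of a dichotomous key can be illustrated with a minimal sketch. It is not the authors' published method: it assumes a simple greedy heuristic, treats hybridisation as exact substring matching, and uses hypothetical function names (`min_probes`, `kmers`, `greedy_distinguishing_kmers`, `signature`). It also shows the information-theoretic floor: n targets need at least ceil(log2 n) yes/no characters, which is 8 for 201 sequences and 7 for 92.

```python
from itertools import combinations


def min_probes(n):
    """Lower bound on yes/no characters needed to separate n targets: ceil(log2 n)."""
    return (n - 1).bit_length() if n > 0 else 0


def kmers(seq, k):
    """Return the set of all length-k sub-sequences of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def greedy_distinguishing_kmers(seqs, k):
    """Greedily choose k-mers whose presence/absence separates every pair of seqs.

    Assumption: a greedy heuristic, so the result is small but not guaranteed minimal.
    Raises ValueError if some pair shares exactly the same k-mer content.
    """
    present = [kmers(s, k) for s in seqs]
    candidates = set().union(*present)
    # Pairs of sequences that no chosen k-mer has told apart yet.
    unresolved = set(combinations(range(len(seqs)), 2))
    chosen = []
    while unresolved:
        best, best_split = None, set()
        for km in sorted(candidates):  # sorted only to make the choice deterministic
            split = {(i, j) for i, j in unresolved
                     if (km in present[i]) != (km in present[j])}
            if len(split) > len(best_split):
                best, best_split = km, split
        if best is None:
            raise ValueError("some sequences cannot be distinguished at this k")
        chosen.append(best)
        candidates.discard(best)
        unresolved -= best_split
    return chosen


def signature(seq, probes):
    """Presence/absence pattern of the probes in seq (exact match stands in for hybridisation)."""
    return tuple(p in seq for p in probes)
```

On a toy set such as `["ACGTAC", "ACGTTT", "GGGTAC", "GGGTTT"]` with `k=3`, each chosen sub-sequence is shared by half of the remaining candidates, so two probes suffice for four sequences; real gene sets, as the abstract reports, need more than the log2 floor because ideal bisecting sub-sequences are not always available.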
Pages: 11
Related papers
27 in total
  • [1] Effect of secondary structure on single nucleotide polymorphism detection with a porous microarray matrix; Implications for probe selection
    Anthony, RM
    Schuitema, ARJ
    Chan, AB
    Boender, PJ
    Klatser, PR
    Oskam, L
    [J]. BIOTECHNIQUES, 2003, 34 (05) : 1082 - +
  • [2] Borneman J, 2001, Bioinformatics, V17 Suppl 1, pS39
  • [3] PREDICTING DNA DUPLEX STABILITY FROM THE BASE SEQUENCE
    BRESLAUER, KJ
    FRANK, R
    BLOCKER, H
    MARKY, LA
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1986, 83 (11) : 3746 - 3750
  • [4] Improved allelic differentiation using sequence-specific oligonucleotide hybridization incorporating an additional base-analogue mismatch
    Burgner, D
    D'Amato, M
    Kwiatkowski, DP
    Loakes, D
    [J]. NUCLEOSIDES NUCLEOTIDES & NUCLEIC ACIDS, 2004, 23 (05) : 755 - 765
  • [5] Introduction of an automated service for the laboratory confirmation of meningococcal disease in Scotland
    Clarke, SC
    Diggle, MA
    Reid, JA
    Thom, L
    Edwards, GFS
    [J]. JOURNAL OF CLINICAL PATHOLOGY, 2001, 54 (07) : 556 - 557
  • [6] Multiplex PCR: Optimization and application in diagnostic virology
    Elnifro, EM
    Ashshi, AM
    Cooper, RJ
    Klapper, PE
    [J]. CLINICAL MICROBIOLOGY REVIEWS, 2000, 13 (04) : 559 - +
  • [7] Foldes-Papp Zeno, 2004, Mol Diagn, V8, P1
  • [8] Hebert P.D., 2003, PROC ROY SOC LOND SE, V270, P313
  • [9] Barcoding animal life: cytochrome c oxidase subunit 1 divergences among closely related species
    Hebert, PDN
    Ratnasingham, S
    deWaard, JR
    [J]. PROCEEDINGS OF THE ROYAL SOCIETY B-BIOLOGICAL SCIENCES, 2003, 270 : S96 - S99
  • [10] Information theoretical probe selection for hybridisation experiments
    Herwig, R
    Schmitt, AO
    Steinfath, M
    O'Brien, J
    Seidel, H
    Meier-Ewert, S
    Lehrach, H
    Radelof, U
    [J]. BIOINFORMATICS, 2000, 16 (10) : 890 - 898