Total evidence requires exclusion of phylogenetically misleading data

Cited by: 70
Authors
Lecointre, G
Deleporte, P
Affiliations
[1] Museum Natl Hist Nat, CNRS, UMR 7138, Dept Systemat & Evolut, F-75231 Paris 05, France
[2] CNRS, UMR 6552, F-35380 Paimpont, France
[3] Univ Rennes 1, Stn Biol Paimpont, F-35380 Paimpont, France
DOI
10.1111/j.1463-6409.2005.00168.x
Chinese Library Classification number
Q [Biological sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Treating all available characters simultaneously in a single data matrix (i.e. combined or simultaneous analysis) is frequently called the 'total evidence' (TE) approach, following Kluge's introduction of the term in 1989, quoting Carnap (1950). However, the general principle and one of the possible procedures involved in its application are often confused. The principle, first enunciated within the context of inductive logic by Carnap in 1950, did not refer to a particular procedure, and TE meant using all relevant knowledge, rather than a combined analysis of all available data. Under TE, all relevant knowledge should be taken into account, including the fact that some data are probably misleading as indicators of species phylogeny and should be discarded. Based on the assumption that molecular partitions have some biological significance (process partitions obtained from nonrandom homoplasy or from 'processes of discord'), we suggest that separate analyses constitute an important exploratory investigation, while the phylogenetic tree itself should be produced by a final combined analysis of all relevant data. Given that the concept of process partitions is justified and that reliability cannot be evaluated using any robustness measure from a single combined analysis, the analysis of multiple data sets involves five steps: (1) perform separate analyses without consensus trees in order to assess reliability of clades through their recurrence and improve the detection of artifacts; (2) test significance of character incongruence, using, for example, pairwise ILD tests in order to identify the sets responsible for incongruence; (3) replace likely misleading data with question marks in the combined data matrix; (4) perform simultaneous analysis of this matrix without the misleading data; (5) assess the reliability of clades found by the combined analysis by computing their recurrence within the previous separate analyses, giving priority to repeatability.
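Steps (3) and (5) of the procedure can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names `mask_partition` and `clade_recurrence`, the representation of the data matrix as a dict of aligned strings, and the representation of each tree as a set of clades (each clade a frozenset of taxon names). No tree inference or ILD test is performed; the sketch only shows the bookkeeping.

```python
from typing import Dict, FrozenSet, Iterable, List, Set

# Hypothetical type alias: a clade is the set of taxon names it contains.
Clade = FrozenSet[str]


def mask_partition(
    matrix: Dict[str, str], start: int, end: int, taxa: Set[str]
) -> Dict[str, str]:
    """Step (3): replace columns [start, end) with '?' for the given taxa.

    Assumes every sequence in `matrix` is aligned and at least `end` long.
    """
    masked = {}
    for taxon, seq in matrix.items():
        if taxon in taxa:
            masked[taxon] = seq[:start] + "?" * (end - start) + seq[end:]
        else:
            masked[taxon] = seq
    return masked


def clade_recurrence(
    combined_clades: Iterable[Clade], separate_trees: List[Set[Clade]]
) -> Dict[Clade, int]:
    """Step (5): count in how many separate-analysis trees each clade recurs."""
    return {
        clade: sum(1 for tree in separate_trees if clade in tree)
        for clade in combined_clades
    }


# Toy data, invented purely for the example.
matrix = {"A": "ACGTAC", "B": "ACGTTC"}
masked = mask_partition(matrix, 2, 4, {"B"})

ab = frozenset({"A", "B"})
cd = frozenset({"C", "D"})
separate = [{ab}, {ab, cd}, {frozenset({"A", "C"})}]
recurrence = clade_recurrence([ab, cd], separate)
```

In this toy case, the clade {A, B} recurs in two of the three separate trees and {C, D} in one, so under a repeatability criterion {A, B} would be treated as the more reliable of the two.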
Pages: 101-117
Page count: 17