Why do we still use stepwise modelling in ecology and behaviour?

Cited by: 1168
Authors
Whittingham, Mark J.
Stephens, Philip A.
Bradbury, Richard B.
Freckleton, Robert P.
Affiliations
[1] Newcastle Univ, Div Biol, Sch Biol & Psychol, Newcastle Upon Tyne NE1 7RU, Tyne & Wear, England
[2] Univ Bristol, Dept Math, Bristol BS8 1TW, Avon, England
[3] Royal Soc Protect Birds, Sandy SG19 2DL, Beds, England
[4] Univ Sheffield, Dept Anim & Plant Sci, Sheffield S10 2TN, S Yorkshire, England
Funding
Biotechnology and Biological Sciences Research Council (BBSRC), UK
Keywords
ecological modelling; habitat selection; minimum adequate model; multivariate statistical analysis; statistical bias;
DOI
10.1111/j.1365-2656.2006.01141.x
Chinese Library Classification number
Q14 [Ecology (bioecology)];
Subject classification code
071012 ; 0713 ;
Abstract
1. The biases and shortcomings of stepwise multiple regression are well established within the statistical literature. However, an examination of papers published in 2004 by three leading ecological and behavioural journals suggested that the use of this technique remains widespread: of 65 papers in which a multiple regression approach was used, 57% of studies used a stepwise procedure.
2. The principal drawbacks of stepwise multiple regression include bias in parameter estimation, inconsistencies among model selection algorithms, an inherent (but often overlooked) problem of multiple hypothesis testing, and an inappropriate focus or reliance on a single best model. We discuss each of these issues with examples.
3. We use a worked example of data on yellowhammer distribution collected over 4 years to highlight the pitfalls of stepwise regression. We show that stepwise regression allows models containing significant predictors to be obtained from each year's data. In spite of the significance of the selected models, they vary substantially between years and suggest patterns that are at odds with those determined by analysing the full, 4-year data set.
4. An information theoretic (IT) analysis of the yellowhammer data set illustrates why the varying outcomes of stepwise analyses arise. In particular, the IT approach identifies large numbers of competing models that could describe the data equally well, showing that no one model should be relied upon for inference.
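Point 4 of the abstract says that many competing models can describe the data about equally well. One common way to express this in an IT analysis is through Akaike weights, which rescale AIC differences into relative support for each candidate model. The sketch below is illustrative only: the model names and AIC values are hypothetical, not taken from the yellowhammer data, and it is not necessarily the authors' exact procedure.

```python
import math


def akaike_weights(aic_values):
    """Convert a list of AIC scores into Akaike weights that sum to 1."""
    best = min(aic_values)
    # Delta AIC: difference between each model and the best-scoring model.
    deltas = [aic - best for aic in aic_values]
    # Relative likelihood of each model given the data.
    rel_lik = [math.exp(-0.5 * d) for d in deltas]
    total = sum(rel_lik)
    return [r / total for r in rel_lik]


# Hypothetical AIC scores for four candidate models (assumed values).
candidate_aic = {
    "model_A": 100.0,
    "model_B": 100.4,
    "model_C": 101.1,
    "model_D": 106.0,
}

weights = dict(zip(candidate_aic, akaike_weights(list(candidate_aic.values()))))

for name, w in weights.items():
    print(f"{name}: weight = {w:.3f}")
```

With these assumed scores, the best model holds less than half of the total weight and two others are close behind. In a situation like that, reporting only the single best model would overstate the certainty of the inference.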
Pages: 1182-1189
Page count: 8