Multiple changepoint fitting via quasilikelihood, with application to DNA sequence segmentation

Cited by: 106
Authors
Braun, JV
Braun, RK
Müller, HG
Affiliations
[1] Kings Mt Res, Woodside, CA 94062 USA
[2] Stottler Henke Associates Inc, Seattle, WA 98105 USA
[3] Univ Calif Davis, Div Stat, Davis, CA 95616 USA
Keywords
bacteriophage lambda; deviance; generalised linear model; model selection; Schwarz criterion; step function
DOI
10.1093/biomet/87.2.301
Chinese Library Classification
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
We consider situations where a step function with a variable number of steps provides an adequate model for a regression relationship, while the variance of the observations depends on their mean. This model provides for discontinuous jumps at changepoints and for constant means and error variances in between changepoints. The basic statistical problem consists of identification of the number of changepoints, their locations and the levels the function assumes in between. We embed this problem into a quasilikelihood formulation and utilise the minimum deviance criterion to fit the model; for the choice of the number of changepoints, we discuss a modified Schwarz criterion. A dynamic programming algorithm makes the segmentation feasible for sequences of moderate length. The performance of the segmentation method is demonstrated in an application to the segmentation of the Bacteriophage lambda sequence.
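The fitting procedure described in the abstract (minimum deviance over step functions, computed by dynamic programming, followed by a penalised choice of the number of changepoints) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names `segment_cost`, `fit_changepoints` and `select_num_changes` are hypothetical, the deviance is the Gaussian one (residual sum of squares, i.e. a constant variance function) rather than a general quasilikelihood deviance, and the penalty is a generic Schwarz-type term, not the authors' modified criterion.

```python
import math


def segment_cost(prefix, prefix_sq, i, j):
    """Gaussian deviance (residual sum of squares) of y[i:j] about its mean.

    Assumption: constant variance function. A different quasilikelihood
    deviance could be substituted here without changing the recursion.
    """
    n = j - i
    s = prefix[j] - prefix[i]
    sq = prefix_sq[j] - prefix_sq[i]
    return max(sq - s * s / n, 0.0)


def fit_changepoints(y, max_changes):
    """Return {k: (deviance, changepoints)} for k = 0..max_changes.

    A changepoint t means a new segment starts at index t.
    """
    n = len(y)
    prefix, prefix_sq = [0.0], [0.0]
    for v in y:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)

    # best[k][j]: minimum deviance of y[:j] using k changepoints
    best = [[math.inf] * (n + 1) for _ in range(max_changes + 1)]
    back = [[0] * (n + 1) for _ in range(max_changes + 1)]
    for j in range(1, n + 1):
        best[0][j] = segment_cost(prefix, prefix_sq, 0, j)
    for k in range(1, max_changes + 1):
        for j in range(k + 1, n + 1):
            for t in range(k, j):  # t = start of the last segment
                c = best[k - 1][t] + segment_cost(prefix, prefix_sq, t, j)
                if c < best[k][j]:
                    best[k][j] = c
                    back[k][j] = t

    results = {}
    for k in range(max_changes + 1):
        if best[k][n] == math.inf:
            continue  # more changepoints than the data can support
        cps, j = [], n
        for kk in range(k, 0, -1):  # trace the optimal split backwards
            j = back[kk][j]
            cps.append(j)
        results[k] = (best[k][n], sorted(cps))
    return results


def select_num_changes(results, n, penalty=2.0):
    """Pick k by a generic Schwarz-type score (hypothetical penalty weight)."""
    def score(k):
        dev = max(results[k][0], 1e-12)  # guard against log(0)
        return n * math.log(dev / n) + penalty * k * math.log(n)
    return min(results, key=score)


y = [1.0, 1.2, 0.8, 1.1, 0.9,
     5.0, 5.2, 4.8, 5.1, 4.9,
     1.0, 1.1, 0.9, 1.2, 0.8]
results = fit_changepoints(y, max_changes=4)
k_hat = select_num_changes(results, len(y))
print(k_hat, results[k_hat][1])
```

The triple loop costs on the order of K n^2 segment evaluations for n observations and at most K changepoints, which is consistent with the abstract's remark that dynamic programming makes segmentation feasible for sequences of moderate length.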
Pages: 301-314
Number of pages: 14