Genotyping of single nucleotide polymorphism using model-based clustering

Cited by: 11
Authors
Fujisawa, H [1]
Eguchi, S
Ushijima, M
Miyata, S
Miki, Y
Muto, T
Matsuura, M
Affiliations
[1] Inst Stat Math, Tokyo 1068569, Japan
[2] Japanese Fdn Canc Res, Genome Ctr, Tokyo 1708455, Japan
DOI
10.1093/bioinformatics/btg475
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Single nucleotide polymorphisms have been investigated as biological markers, and the representative high-throughput genotyping method combines the Invader assay with a statistical clustering method. A typical clustering method is k-means, but it often fails because it lacks flexibility. A fast and reliable alternative is therefore desirable.
Results: This paper proposes a model-based clustering method using a normal mixture model and a well-conceived penalized likelihood. The proposed method can flag unclear genotype calls for re-examination and also works well even when the number of clusters is unknown. Illustrative results show satisfactory genotyping. Even where the conventional maximum likelihood method and the typical k-means clustering method failed, the proposed method succeeded.
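To make the idea in the abstract concrete, the following is a minimal sketch of clustering with a normal mixture model and flagging unclear calls for re-examination. It is not the paper's method: the abstract does not specify the penalized likelihood, so the sketch uses plain EM with a simple variance floor as a stand-in. The one-dimensional signal, the function names (`fit_mixture`, `call_genotypes`), the starting values, and the 0.95 posterior threshold are all assumptions made for illustration.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a 1-D normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def posteriors(xs, weights, means, variances):
    """Posterior probability of each mixture component for every observation."""
    result = []
    for x in xs:
        dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(dens)
        if total == 0.0:
            # Far from every component: treat as maximally ambiguous.
            result.append([1.0 / len(dens)] * len(dens))
        else:
            result.append([d / total for d in dens])
    return result


def fit_mixture(xs, init_means, n_iter=100, init_var=0.05, min_var=1e-4):
    """Plain EM for a 1-D normal mixture (hypothetical helper, not the paper's estimator).

    min_var is a crude variance floor standing in for a proper penalty term.
    """
    k = len(init_means)
    weights = [1.0 / k] * k
    means = list(init_means)
    variances = [init_var] * k
    for _ in range(n_iter):
        post = posteriors(xs, weights, means, variances)  # E-step
        for j in range(k):  # M-step
            nj = sum(p[j] for p in post)
            if nj < 1e-8:
                continue  # leave an effectively empty component unchanged
            weights[j] = nj / len(xs)
            means[j] = sum(p[j] * x for p, x in zip(post, xs)) / nj
            var = sum(p[j] * (x - means[j]) ** 2 for p, x in zip(post, xs)) / nj
            variances[j] = max(var, min_var)
    return weights, means, variances


def call_genotypes(xs, weights, means, variances, threshold=0.95):
    """Return a cluster index per sample, or None when the call is unclear."""
    calls = []
    for p in posteriors(xs, weights, means, variances):
        best = max(range(len(p)), key=lambda j: p[j])
        calls.append(best if p[best] >= threshold else None)
    return calls


if __name__ == "__main__":
    # Made-up signal values forming three well-separated groups.
    signal = [c + d for c in (0.2, 0.8, 1.4) for d in (-0.05, -0.02, 0.0, 0.02, 0.05)]
    w, m, v = fit_mixture(signal, init_means=(0.3, 0.8, 1.3))
    print(call_genotypes([0.21, 0.5, 1.39], w, m, v))
```

Unlike k-means, which assigns every sample to its nearest centre, a mixture model yields a posterior probability per cluster, so a sample lying between two groups can be returned as `None` and sent for re-examination rather than forced into a genotype.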
Pages: 718 / U489
Page count: 34