Bi-level clustering of mixed categorical and numerical biomedical data

Cited by: 19
Authors
Andreopoulos, Bill [1]
An, Aijun
Wang, Xiaogang
Affiliations
[1] York Univ, Dept Comp Sci & Engn, Toronto, ON M3J 1P3, Canada
[2] York Univ, Dept Math & Stat, Toronto, ON M3J 1P3, Canada
Keywords
clustering; categorical; numerical; nominal; ordinal; biomedical; bi-level; Bayesian
DOI
10.1504/IJDMB.2006.009920
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Biomedical data sets often have mixed categorical and numerical types, where the former represent semantic information on the objects and the latter represent experimental results. We present the BILCOM algorithm for 'Bi-Level Clustering of Mixed categorical and numerical data types'. BILCOM performs a pseudo-Bayesian process, where the prior is categorical clustering. BILCOM partitions biomedical data sets of mixed types, such as hepatitis, thyroid disease and yeast gene expression data with Gene Ontology annotations, more accurately than if using one type alone.
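The abstract gives only the outline of the method: a categorical clustering serves as the prior, and the numerical attributes then refine it. The sketch below illustrates that general two-level idea on toy data. It is not the published BILCOM procedure. The function names (`categorical_prior`, `bilevel_cluster`), the seed-based simple-matching prior, and the `1 / (1 + distance)` likelihood are all assumptions made for illustration.

```python
from math import dist


def categorical_prior(cat_rows, seeds):
    """Level 1 (assumed form): soft cluster memberships from categorical data.

    Each object's similarity to a seed object is the number of matching
    categorical attributes plus 1 (smoothing), normalised to sum to 1.
    """
    priors = []
    for row in cat_rows:
        sims = [1 + sum(a == b for a, b in zip(row, cat_rows[s])) for s in seeds]
        total = sum(sims)
        priors.append([s / total for s in sims])
    return priors


def bilevel_cluster(cat_rows, num_rows, seeds, iters=10):
    """Level 2 (assumed form): refine the categorical prior with numerical data.

    Score of cluster c for an object = prior[c] * likelihood, where the
    likelihood is taken to be 1 / (1 + Euclidean distance to the centroid).
    """
    k = len(seeds)
    priors = categorical_prior(cat_rows, seeds)
    # Start from the labels suggested by the categorical prior alone.
    labels = [max(range(k), key=lambda c: p[c]) for p in priors]
    for _ in range(iters):
        # Recompute numerical centroids from the current assignment.
        centroids = []
        for c in range(k):
            members = [num_rows[i] for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep an empty cluster anchored at its seed
                members = [num_rows[seeds[c]]]
            centroids.append([sum(col) / len(members) for col in zip(*members)])
        # Reassign each object to the cluster with the highest prior * likelihood.
        new_labels = []
        for p, x in zip(priors, num_rows):
            scores = [p[c] / (1.0 + dist(x, centroids[c])) for c in range(k)]
            new_labels.append(max(range(k), key=lambda c: scores[c]))
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Toy data (hypothetical): two categorical and two numerical attributes per object.
cat_rows = [("a", "x"), ("a", "x"), ("a", "y"), ("b", "y"), ("b", "y"), ("b", "x")]
num_rows = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
print(bilevel_cluster(cat_rows, num_rows, seeds=[0, 3]))
```

In this toy data the third and sixth objects match each seed on exactly one categorical attribute, so the categorical prior alone cannot separate them; the numerical attributes decide their final assignment.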
Pages: 19-56
Number of pages: 38