Contingency matrix theory: Statistical dependence in a contingency table

Cited by: 20
Author
Tsumoto, Shusaku [1]
Affiliation
[1] Shimane Univ, Fac Med, Dept Med Informat, Izumo, Shimane 6938501, Japan
Keywords
Simpson's paradox; Statistical independence; Contingency matrix; Linear algebra; Rough sets; Rules
DOI
10.1016/j.ins.2008.11.023
Chinese Library Classification
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Chance discovery aims at understanding the meaning of functional dependency from the viewpoint of unexpected relations. One of the most important observations is that such a chance is hidden under a huge number of co-occurrences extracted from given data. On the other hand, conventional data-mining methods depend strongly on frequencies and statistics rather than on interestingness or unexpectedness. This paper discusses some limitations of the idea of statistical dependence, focusing especially on the formal characteristics of Simpson's paradox from the viewpoint of linear algebra. Theoretical results show that Simpson's paradox can be observed when a given contingency table, viewed as a matrix, is not regular; in other words, when the rank of the contingency matrix is not full. Thus, data-ordered evidence has some limitations, which should be compensated for by human-oriented reasoning. (C) 2008 Elsevier Inc. All rights reserved.
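The link between rank and statistical independence mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names `matrix_rank` and `is_independent` and the two example tables are assumptions made here for illustration. It relies only on the standard fact that a contingency table with no empty rows or columns has rank 1 exactly when every cell equals (row total × column total) / grand total, that is, when the two attributes are independent in the empirical distribution.

```python
from fractions import Fraction


def matrix_rank(table):
    """Rank of a numeric matrix, computed by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in table]
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry here.
        pivot = next(
            (r for r in range(rank, len(rows)) if rows[r][col] != 0), None
        )
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Eliminate this column from every row below the pivot row.
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def is_independent(table):
    """True when each cell equals row_total * col_total / grand_total."""
    total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return all(
        cell * total == row_totals[i] * col_totals[j]
        for i, row in enumerate(table)
        for j, cell in enumerate(row)
    )


# Hypothetical 2x2 tables of counts, chosen only for illustration.
independent = [[10, 20], [30, 60]]  # second row is 3 times the first
dependent = [[10, 20], [30, 10]]    # rows are not proportional

for name, table in [("independent", independent), ("dependent", dependent)]:
    print(name, "rank =", matrix_rank(table),
          "independent =", is_independent(table))
```

Exact rational arithmetic is used so that the rank test is not affected by floating-point rounding. The sketch only covers the rank and independence correspondence; it does not reproduce the paper's analysis of Simpson's paradox.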
Pages: 1615-1627
Number of pages: 13