Likelihood-based data squashing: A modeling approach to instance construction

Cited by: 40
Authors
Madigan, D [1]
Raghavan, N
Dumouchel, W
Nason, M
Posse, C
Ridgeway, G
Affiliations
[1] Rutgers State Univ, Piscataway, NJ 08855 USA
[2] AT&T Labs Res, Shannon Lab, Florham Pk, NJ USA
[3] Univ Washington, Dept Biostat, Seattle, WA 98195 USA
Keywords
instance construction; data compression
DOI
10.1023/A:1014095614948
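Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 [Pattern recognition and intelligent systems]; 0812 [Computer science and technology]; 0835 [Software engineering]; 1405 [Intelligent science and technology]
Abstract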
Squashing is a lossy data compression technique that preserves statistical information. Specifically, squashing compresses a massive dataset to a much smaller one so that outputs from statistical analyses carried out on the smaller (squashed) dataset reproduce outputs from the same statistical analyses carried out on the original dataset. Likelihood-based data squashing (LDS) differs from a previously published squashing algorithm insofar as it uses a statistical model to squash the data. The results show that LDS provides excellent squashing performance even when the target statistical analysis departs from the model used to squash the data.
Pages: 173-190
Page count: 18
Related papers
15 in total
[1]
INSTANCE-BASED LEARNING ALGORITHMS [J].
AHA, DW ;
KIBLER, D ;
ALBERT, MK .
MACHINE LEARNING, 1991, 6 (01) :37-66
[2]
Box CE, 1978, STAT EXPT INTRO DESI
[3]
Box G, 1987, EMPIRICAL MODEL BUIL
[4]
Bradley P. S., 1998, Proceedings Fourth International Conference on Knowledge Discovery and Data Mining, P9
[5]
BREIMAN L, 1984, STAT SIGNAL PROCESSI, P191
[6]
CATLETT J, 1991, P 8 INT WORKSH MACH, P596
[7]
DuMouchel W., 1999, Proceedings of the Fifth ACM Conference on Knowledge Discovery and Data Mining, V15, P6
[8]
REGRESSIONS BY LEAPS AND BOUNDS [J].
FURNIVAL, GM ;
WILSON, RW .
TECHNOMETRICS, 1974, 16 (04) :499-511
[9]
Strategic directions in storage I/O issues in large-scale computing [J].
Gibson, GA ;
Vitter, JS ;
Wilkes, J .
ACM COMPUTING SURVEYS, 1996, 28 (04) :779-793
[10]
EFFICIENT SCREENING OF NONNORMAL REGRESSION-MODELS [J].
LAWLESS, JF ;
SINGHAL, K .
BIOMETRICS, 1978, 34 (02) :318-327