Managerial decision support with knowledge of accuracy and completeness of the relational aggregate functions

Cited by: 29
Authors
Parssian, Amir [1]
Affiliations
[1] Univ Illinois, Springfield, IL 62703 USA
Keywords
information quality; relational aggregate functions; sampling strategies
DOI
10.1016/j.dss.2005.12.005
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Aggregate data produced by decision support systems is utilized by managers in their decision making process to run or improve their firm's operations. Often, data residing in corporate databases and data warehouses are far from being perfect, and their imperfections have an impact on decision quality and outcome. Therefore, having knowledge about the effect of data errors on aggregate data could lead to more informed decisions, reduced risks, and competitive advantage. In this paper, we present a methodology to estimate the effects of data accuracy and completeness, as two important data quality dimensions, on the relational aggregate functions Count, Sum, Average, Max, and Min. Our methodology defines a set of attribute value types and deploys sampling strategies to determine the maximum likelihood estimates of each value type. We show the effect of data error rates on the scalar values returned by the aggregate functions and demonstrate the efficiency of our estimates by Monte Carlo simulations. (c) 2006 Elsevier B.V. All rights reserved.
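The sampling idea in the abstract can be illustrated with a small sketch. It is not the paper's procedure: the three value-type labels, the synthetic population, and the function names below are all assumptions made for illustration. The sketch draws audited samples with replacement, takes the sample proportion of each value type (the maximum likelihood estimate under a multinomial model), and uses a Monte Carlo loop to check that the estimates centre on the true shares.

```python
import random
from collections import Counter

# Hypothetical value-type labels; the paper defines its own set.
VALUE_TYPES = ("accurate", "inaccurate", "missing")


def estimate_type_shares(labels, sample_size, rng):
    """Sample with replacement and return the share of each value type.

    Under a multinomial model the sample proportion is the maximum
    likelihood estimate of each type's population share.
    """
    sample = rng.choices(labels, k=sample_size)
    counts = Counter(sample)
    return {t: counts[t] / sample_size for t in VALUE_TYPES}


def monte_carlo_mean_shares(labels, sample_size, trials, seed=0):
    """Average the estimated shares over many independent samples."""
    rng = random.Random(seed)
    totals = {t: 0.0 for t in VALUE_TYPES}
    for _ in range(trials):
        shares = estimate_type_shares(labels, sample_size, rng)
        for t in VALUE_TYPES:
            totals[t] += shares[t]
    return {t: totals[t] / trials for t in VALUE_TYPES}


# Synthetic audited column (assumed data): 80% accurate, 15% inaccurate, 5% missing.
population = ["accurate"] * 800 + ["inaccurate"] * 150 + ["missing"] * 50
true_shares = {t: population.count(t) / len(population) for t in VALUE_TYPES}
mean_shares = monte_carlo_mean_shares(population, sample_size=100, trials=2000)

# A Count over the column could then be scaled by the estimated accurate share.
estimated_accurate_count = mean_shares["accurate"] * len(population)

for t in VALUE_TYPES:
    print(f"{t}: true={true_shares[t]:.3f} mean estimate={mean_shares[t]:.3f}")
```

In practice the labels would come from manually auditing the sampled rows against a trusted source; the synthetic list here only stands in for that audit.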
Pages: 1494-1502
Page count: 9