On Detecting Cherry-picked Trendlines

Cited by: 16
Authors
Asudeh, Abolfazl [1]
Jagadish, H. V. [2]
Wu, You [3]
Yu, Cong [3]
Affiliations
[1] Univ Illinois, Chicago, IL 60637 USA
[2] Univ Michigan, Ann Arbor, MI 48109 USA
[3] Google Res, New York, NY USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2020 / Vol. 13 / No. 06
Funding
National Science Foundation (USA);
Keywords
FACT;
DOI
10.14778/3380750.3380762
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Poorly supported stories can be told based on data by cherry-picking the data points included. While such stories may be technically accurate, they are misleading. In this paper, we build a system for detecting cherry-picking, with a focus on trendlines extracted from temporal data. We define a support metric for detecting such trendlines. Given a dataset and a statement made based on a trendline, we compute a support score that indicates how cherry-picked it is. Studying different types of trendlines and formalizing terms, we propose efficient and effective algorithms for computing the support measure. We also study the problem of discovering the most supported statements. Besides theoretical analysis, we conduct extensive experiments on real-world data that demonstrate the validity of our proposed techniques.
Pages: 939 - 952
Page count: 14