Are Call Detail Records Biased for Sampling Human Mobility?

Cited by: 75
Authors
Ranjan, Gyan [1 ]
Zang, Hui [2 ]
Zhang, Zhi-Li [1 ]
Bolot, Jean [3 ]
Affiliations
[1] Univ Minnesota, Dept Comp Sci & Engn, Minneapolis, MN 55455 USA
[2] Sprint, Minneapolis, CA USA
[3] Technicolor, Minneapolis, CA USA
Funding
National Science Foundation (USA);
Keywords
DOI
10.1145/2412096.2412101
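Chinese Library Classification (CLC) number
TN [Electronic technology, communication technology];
Discipline classification code
0809;
Abstract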
Call detail records (CDRs) have recently been used in studying different aspects of human mobility. While CDRs provide a means of sampling user locations at large population scales, they may not sample all locations in proportion to a user's visitation frequency, owing to the sparsity of voice calls in time and space, thereby introducing a bias. Also, as the rate of sampling is inherently dependent on the calling frequency of an individual, users with high voice-call activity are often chosen for conducting a meaningful study. Such a selection process can inadvertently lead to a biased view, as high-frequency callers may not always be representative of an entire population. With the advent of 3G technology and the wide adoption of smartphones, cellular devices have become versatile end-hosts. As the data accessed on these devices does not always require human initiation, it affords us an unprecedented opportunity to validate the utility of CDRs for studying human mobility. In this work, we investigate various metrics for human mobility studied in the literature for over a million cellular users in the San Francisco Bay Area, for over a month. Our findings reveal that although the voice-call process does well at sampling significant locations, such as home and work, it may in some cases incur biases in capturing the overall spatio-temporal characteristics of individual human mobility. Additionally, we motivate an "artificially" imposed sampling process, vis-a-vis the voice-call process with the same average intensity. We observe that in many cases such an imposed sampling process yields better performance results based on the usual metrics, like entropies and marginal distributions, often used in the literature.
Pages: 33-44
Page count: 12
Related papers
18 in total
[1] Becker R., 2011, P NETM 2011
[2] Brockmann, D; Hufnagel, L; Geisel, T. The scaling laws of human travel [J]. NATURE, 2006, 439 (7075): 462-465
[3] Chaintreau A., 2006, P IEEE INF 06 BARC S
[4] Couronne T., 2011, P NETM 2011
[5] Cover T. M., 2006, ELEMENTS INFORM THEO, DOI [DOI 10.1002/047174882X, DOI 10.1002/047174882X.CH5]
[6] Gonzalez, Marta C.; Hidalgo, Cesar A.; Barabasi, Albert-Laszlo. Understanding individual human mobility patterns [J]. NATURE, 2008, 453 (7196): 779-782
[7] Hartigan J., 1975, CLUSTERING ALGORITHM
[8] Isaacman S., 2011, 9 INT C PERV COMP PE
[9] KIM M, 2007, J PERSONAL UBIQUITOU, V11
[10] Kotz D., 2006, P IEEE INF 06