Private traits and attributes are predictable from digital records of human behavior

Cited by: 1305
Authors
Kosinski, Michal [1]
Stillwell, David [1]
Graepel, Thore [2]
Affiliations
[1] Univ Cambridge, Psychometr Ctr, Cambridge CB2 3RQ, England
[2] Microsoft Res, Cambridge CB1 2FB, England
Keywords
social networks; computational social science; machine learning; big data; data mining; psychological assessment; PERSONALITY; LIFE; SATISFACTION
DOI
10.1073/pnas.1218772110
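Chinese Library Classification (CLC) code
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract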
We show that easily accessible digital records of behavior, Facebook Likes, can be used to automatically and accurately predict a range of highly sensitive personal attributes including: sexual orientation, ethnicity, religious and political views, personality traits, intelligence, happiness, use of addictive substances, parental separation, age, and gender. The analysis presented is based on a dataset of over 58,000 volunteers who provided their Facebook Likes, detailed demographic profiles, and the results of several psychometric tests. The proposed model uses dimensionality reduction for preprocessing the Likes data, which are then entered into logistic/linear regression to predict individual psychodemographic profiles from Likes. The model correctly discriminates between homosexual and heterosexual men in 88% of cases, African Americans and Caucasian Americans in 95% of cases, and between Democrat and Republican in 85% of cases. For the personality trait "Openness," prediction accuracy is close to the test-retest accuracy of a standard personality test. We give examples of associations between attributes and Likes and discuss implications for online personalization and privacy.
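The pipeline described in the abstract (dimensionality reduction of the user-Like matrix, followed by logistic regression) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' code or data: the synthetic Likes matrix, the choice of a plain SVD as the reduction step, the number of components `k`, the gradient-descent fit, and the use of area under the ROC curve as the reading of "correctly discriminates in N% of cases".

```python
import numpy as np

def fit_svd(X, k):
    """Return the top-k right singular vectors of X (shape: n_likes x k)."""
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[:k].T

def fit_logistic(Z, y, lr=0.1, n_iter=2000):
    """Plain gradient-descent logistic regression; returns (weights, bias)."""
    w = np.zeros(Z.shape[1])
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))
        w -= lr * (Z.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

def auc(scores, y):
    """Probability that a random positive outscores a random negative."""
    pos, neg = scores[y == 1], scores[y == 0]
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return greater + 0.5 * ties

def run_demo(seed=0, n_users=600, n_likes=200, k=10):
    """Synthetic stand-in for the Likes data (hypothetical, not the real dataset)."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n_users)          # binary attribute to predict
    prob = np.full((n_users, n_likes), 0.05)      # baseline chance of any Like
    prob[y == 1, :20] = 0.25                      # Likes favored by class 1
    prob[y == 0, 20:40] = 0.25                    # Likes favored by class 0
    X = (rng.random((n_users, n_likes)) < prob).astype(float)

    split = n_users * 2 // 3                      # simple train/test split
    X_tr, X_te, y_tr, y_te = X[:split], X[split:], y[:split], y[split:]

    V = fit_svd(X_tr, k)                          # reduction fitted on train only
    Z_tr, Z_te = X_tr @ V, X_te @ V
    mu, sd = Z_tr.mean(axis=0), Z_tr.std(axis=0)
    Z_tr, Z_te = (Z_tr - mu) / sd, (Z_te - mu) / sd

    w, b = fit_logistic(Z_tr, y_tr)
    return auc(Z_te @ w + b, y_te)

if __name__ == "__main__":
    print(f"held-out AUC on synthetic data: {run_demo():.3f}")
```

Fitting the reduction on the training users only and projecting the held-out users afterwards keeps the evaluation free of leakage; for a numeric target such as age, the logistic step would be swapped for linear regression.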
Pages: 5802-5805
Page count: 4
Related papers
30 in total
  • [11] Golbeck J., 2011, CHI 11 HUM FACT COMP, P253, DOI 10.1145/1979742.1979614
  • [12] The international personality item pool and the future of public-domain personality measures
    Goldberg, LR
    Johnson, JA
    Eber, HW
    Hogan, R
    Ashton, MC
    Cloninger, CR
    Gough, HG
    [J]. JOURNAL OF RESEARCH IN PERSONALITY, 2006, 40 (01) : 84 - 96
  • [13] Golub G., 1965, Journal of the Society for Industrial and Applied Mathematics, Series B: Numerical Analysis, V2, P205, DOI 10.1137/0702016
  • [14] A room with a cue: Personality judgments based on offices and bedrooms
    Gosling, SD
    Ko, SJ
    Mannarelli, T
    Morris, ME
    [J]. JOURNAL OF PERSONALITY AND SOCIAL PSYCHOLOGY, 2002, 82 (03) : 379 - 398
  • [15] Hu J., 2007, DEMOGRAPHIC PREDICTI, P151, DOI 10.1145/1242572
  • [16] Customary Killings in Turkey and Turkish Modernization
    Ince, Hilal Onur
    Yarali, Aysun
    Ozsel, Dogancan
    [J]. MIDDLE EASTERN STUDIES, 2009, 45 (04) : 537 - 551
  • [17] Jernigan C., 2009, 1 MONDAY
  • [18] MATRIX FACTORIZATION TECHNIQUES FOR RECOMMENDER SYSTEMS
    Koren, Yehuda
    Bell, Robert
    Volinsky, Chris
    [J]. COMPUTER, 2009, 42 (08) : 30 - 37
  • [19] Kosinski M, 2012, ACM WEB SCI C, P251
  • [20] SOCIAL SCIENCE Computational Social Science
    Lazer, David
    Pentland, Alex
    Adamic, Lada
    Aral, Sinan
    Barabasi, Albert-Laszlo
    Brewer, Devon
    Christakis, Nicholas
    Contractor, Noshir
    Fowler, James
    Gutmann, Myron
    Jebara, Tony
    King, Gary
    Macy, Michael
    Roy, Deb
    Van Alstyne, Marshall
    [J]. SCIENCE, 2009, 323 (5915) : 721 - 723