An n-Gram Analysis of Communications 2000-2010

Cited by: 13
Authors
Soper, Daniel S. [1]
Turel, Ofir [1]
Affiliations
[1] Calif State Univ Fullerton, Mihaylo Coll Business & Econ, Informat Syst & Decis Sci Dept, Fullerton, CA 92634 USA
Keywords
IDENTITY CRISIS;
DOI
10.1145/2160718.2160737
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
A culturomic analysis of Communications is presented, showing how natural language processing can be used to quantitatively explore the identity and culture of an institution over time. The work is inspired by the n-gram project released in 2010 by Google Labs. To trace how the identity of Communications has evolved, a corpus containing the complete text of every article published from 2000 to 2010 was constructed. Standardized frequency values indicate how often a particular n-gram appeared in Communications during a particular year relative to the total quantity of text it published that year. Several of the terms showing the most growth were related to science and technology, while several of the declining terms were related to business and management. The n-gram analysis also revealed changes in Communications' use of gender-related terms from 2000 to 2010.
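The standardized frequency described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the choice to normalize by the total number of n-grams in a year, and the names `ngrams`, `standardized_frequencies`, and `toy_corpus` are all assumptions made for illustration, and the toy text is invented rather than taken from Communications.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams in a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def standardized_frequencies(corpus_by_year, n):
    """Map each year to {n-gram: count / total n-grams published that year}.

    Assumption: "total quantity of text" is approximated by the number of
    n-grams of the same order found in that year's text.
    """
    result = {}
    for year, text in corpus_by_year.items():
        tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
        grams = ngrams(tokens, n)
        total = len(grams)
        counts = Counter(grams)
        result[year] = {g: c / total for g, c in counts.items()} if total else {}
    return result


# Hypothetical toy corpus, one string of text per publication year.
toy_corpus = {
    2000: "the web the web",
    2010: "the cloud",
}

unigram_freqs = standardized_frequencies(toy_corpus, 1)
bigram_freqs = standardized_frequencies(toy_corpus, 2)
```

Because each value is a share of that year's text rather than a raw count, frequencies from years with different publication volumes can be compared directly, which is what makes growth and decline of a term over the decade measurable.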
Pages: 81-87
Number of pages: 7