Commercial Internet filters: Perils and opportunities

Cited by: 14
Authors
Chou, Chen-Huei [2]
Sinha, Atish P. [1]
Zhao, Huimin [1]
Affiliations
[1] Univ Wisconsin, Sheldon B Lubar Sch Business, Milwaukee, WI 53201 USA
[2] Coll Charleston, Sch Business, Charleston, SC 29424 USA
Keywords
Internet abuse; Internet filters; Text mining; Text classification; LEARNING APPROACH; WEB USAGE; CLASSIFICATION; WORKPLACE; ABUSE; SYSTEM
DOI
10.1016/j.dss.2009.11.002
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Organizations are becoming increasingly aware of Internet abuse in the workplace. Such abuse results in loss of worker productivity, network congestion, security risks, and legal liabilities. To address this problem, organizations have started to adopt Internet usage policies, management training, and filtering software, and several commercial Internet filters are seeing a growing number of organizational adoptions. These products mainly rely on black lists, white lists, and keyword/profile matching to filter out undesired web pages. In this paper, we describe three top-ranked commercial Internet filters (CYBERSitter, Net Nanny, and CyberPatrol) and evaluate their performance in the context of an Internet abuse problem. We then propose a text mining approach to address the problem and evaluate its performance using six different classification algorithms: naive Bayes, multinomial naive Bayes, support vector machine, decision tree, k-nearest neighbor, and neural network. The evaluation results point to the perils of using commercial Internet filters on the one hand, and to the prospects of using text mining on the other. The proposed text mining approach outperforms the commercial filters. We discuss the possible reasons for the relatively poor performance of the filters and the steps that could be taken to improve their performance. (C) 2009 Elsevier B.V. All rights reserved.
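The contrast drawn in the abstract, between keyword matching and a learned text classifier, can be illustrated with a minimal sketch. The record does not give the paper's features, preprocessing, training data, or filter rules, so everything below is an assumption for illustration only: the toy documents, the blocked-keyword set, and the names `keyword_filter` and `MultinomialNB` are hypothetical, and the classifier is a plain multinomial naive Bayes with add-one smoothing, one of the six algorithm families the abstract lists.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_filter(text, blocked_keywords):
    """Keyword matching: block a page if any token is on the blocked list."""
    return any(tok in blocked_keywords for tok in tokenize(text))


class MultinomialNB:
    """Toy multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokenize(text):
                if tok in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training pages; not data from the paper.
train_docs = [
    "poker casino jackpot bet online",
    "live sports scores betting odds",
    "quarterly sales report and budget forecast",
    "project meeting schedule and budget review",
]
train_labels = ["abuse", "abuse", "work", "work"]

model = MultinomialNB().fit(train_docs, train_labels)
blocked = {"poker", "casino"}  # hypothetical keyword list

page = "office poker night fundraiser budget meeting schedule"
blocked_by_keywords = keyword_filter(page, blocked)  # True: one keyword hit
predicted_label = model.predict(page)                # "work"
print(blocked_by_keywords, predicted_label)
```

In this toy case the keyword filter blocks the page because of a single word, while the classifier weighs all the words and labels it as work-related. That is only one plausible failure mode of keyword matching; it is not a reproduction of the paper's evaluation.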
Pages: 521-530
Number of pages: 10