Mining e-mail content for author identification forensics

Cited by: 9
Authors
de Vel, O
Anderson, A
Corney, M
Mohay, G
Affiliations
[1] Def Sci & Technol Org, Div Informat Technol, Salisbury, SA 5108, Australia
[2] Queensland Univ Technol, Sch Informat Syst, Fac Informat Technol, Brisbane, Qld 4001, Australia
Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
We describe an investigation into e-mail content mining for author identification, or authorship attribution, for the purpose of forensic investigation. We focus on the ability to discriminate between authors both when e-mail topics are aggregated and across different e-mail topics. An extended set of e-mail document features, including structural characteristics and linguistic patterns, was derived and used together with a Support Vector Machine learning algorithm to mine the e-mail content. Experiments using a number of e-mail documents generated by different authors on a set of topics gave promising results for both aggregated and multi-topic author categorisation.
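As a rough illustration of what "structural characteristics and linguistic patterns" might look like in practice, the sketch below turns an e-mail body into a small dictionary of numeric features. It is only a sketch under stated assumptions: the function name `email_style_features`, the short function-word list, and the specific features chosen are hypothetical and are not the feature set used in the paper. A vector like this could then be passed to a Support Vector Machine classifier, which is not shown here.

```python
import re

# Hypothetical function-word list for illustration only; the paper's
# actual feature set is not given in this record.
FUNCTION_WORDS = ("the", "of", "and", "to", "in", "that")


def email_style_features(body: str) -> dict:
    """Return a few structural and linguistic features of an e-mail body."""
    lines = body.splitlines()
    words = re.findall(r"[A-Za-z']+", body)
    lowered = [w.lower() for w in words]
    n_words = len(words) or 1  # avoid division by zero on empty messages
    n_lines = len(lines) or 1

    features = {
        # Structural characteristics (assumed examples)
        "n_lines": len(lines),
        "blank_line_ratio": sum(1 for ln in lines if not ln.strip()) / n_lines,
        "has_greeting": bool(re.match(r"\s*(hi|hello|dear)\b", body, re.IGNORECASE)),
        # Linguistic patterns (assumed examples)
        "avg_word_length": sum(len(w) for w in words) / n_words,
    }
    # Relative frequency of each function word
    for fw in FUNCTION_WORDS:
        features["fw_" + fw] = lowered.count(fw) / n_words
    return features
```

Function-word frequencies are used here because they tend to depend less on topic than content words do, which fits the abstract's interest in discriminating authors across different topics; whether the authors relied on them is not stated in this record.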
Pages: 55-64
Page count: 10