Can analysis of word frequency distinguish between writings of different authors?

Cited by: 10
Author
Vilensky, B
Affiliation
[1] Department of Physics, Bar-Ilan University
Keywords
DOI
10.1016/0378-4371(96)00109-4
Chinese Library Classification number
O4 [Physics];
Discipline classification code
0702;
Abstract
Various literary works are compared using the "rank distance", d, between two word-frequency Zipf plots, introduced by S. Havlin (Physica A 216 (1995) 148). We studied 22 books written by six authors. For this ensemble of books we find that the mean distance between books written by the same author (⟨d⟩ = 15.2 ± 2.6) is considerably smaller than that between books written by different authors (⟨d⟩ = 21.8 ± 3.2), in good agreement with earlier results on a smaller sample of books. Our results suggest that the distribution of the rank difference of the same words in different books decays exponentially.
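The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not state the formula for d, so the code below assumes a root-mean-square difference between the frequency ranks of the words two texts share; the exact definition in Havlin's paper may differ. The function names `rank_table` and `rank_distance`, the `top_n` cutoff, and the tokenization rule are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def rank_table(text, top_n=None):
    """Map each word to its frequency rank (1 = most frequent)."""
    # Assumed tokenization: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    if top_n is not None:
        ordered = ordered[:top_n]
    return {word: rank for rank, word in enumerate(ordered, start=1)}


def rank_distance(text_a, text_b, top_n=None):
    """Assumed definition: RMS rank difference over words common to both texts."""
    ranks_a = rank_table(text_a, top_n)
    ranks_b = rank_table(text_b, top_n)
    shared = ranks_a.keys() & ranks_b.keys()
    if not shared:
        raise ValueError("texts share no words")
    total = sum((ranks_a[w] - ranks_b[w]) ** 2 for w in shared)
    return math.sqrt(total / len(shared))
```

Under this assumed definition, two texts with identical word rankings have distance 0, and the distance grows as shared words occupy more widely separated ranks in the two texts.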
Pages: 705-711
Page count: 7
Related papers
16 in total
[1] Amit, M.; Shmerler, Y.; Eisenberg, E.; Abraham, M.; Shnerb, N. Language and codification dependence of long-range correlations in texts [J]. Fractals-Complex Geometry Patterns and Scaling in Nature and Society, 1994, 2 (01): 7-13
[2] Bouchaud JP, 1995, LECT NOTES PHYS, V450, P239
[3] Cohen A. M., PREPRINT
[4] Czirok, A.; Mantegna, R. N.; Havlin, S.; Stanley, H. E. Correlations in binary sequences and a generalized Zipf analysis [J]. Physical Review E, 1995, 52 (01): 446-452
[5] GELLMANN M, 1993, QUARK JAGUAR
[6] Havlin, S. The distance between Zipf plots [J]. Physica A-Statistical Mechanics and its Applications, 1995, 216 (1-2): 148-150
[7] Kanter, I.; Kessler, D. A. Markov processes: linguistics and Zipf's law [J]. Physical Review Letters, 1995, 74 (22): 4559-4562
[8] LI W, 1992, IEEE T INFORM THEORY, V38
[9] Mandelbrot B., 1966, Information theory and psycholinguistics: A theory of word frequencies
[10] MANDELBROT B, 1965, MATH EXPLORATIONS BE